Optimal ρ acceleration parameter for the ADI iteration
for the real three-dimensional Helmholtz equation with
nonnegative ω

Sangback Ma

J. KSIAM Vol.3, No.2, 1-4, 1999
Abstract

The Helmholtz equation is very important in physics and engineering. However, the solution of the Helmholtz equation is in general known to be very difficult: if ω is negative, the FDM-discretized linear system becomes indefinite, and its solution by an iterative method requires a very clever preconditioner. In this paper we assume that ω is nonnegative, and determine the optimal ρ parameter for the three-dimensional ADI iteration for the Helmholtz equation. The ADI (Alternating Direction Implicit) method is also attracting new attention because it is very well suited to vector/parallel computers, for example as a preconditioner for the Krylov subspace methods. However, classical ADI was developed for two dimensions, and in three dimensions its convergence behaviour is known to be quite different from that in two dimensions. So far, in three dimensions the so-called Douglas-Rachford form of ADI has been developed. It is known to converge for a relatively wide range of ρ values, but its convergence is very slow. In this paper we determine the necessary conditions on the ρ parameter for convergence, and the optimal ρ, for the three-dimensional ADI iteration of the Peaceman-Rachford form for the real Helmholtz equation with nonnegative ω. We also conducted experiments which are in close agreement with our theory. This straightforward extension of Peaceman-Rachford ADI to three dimensions will be useful as an iterative solver itself or as a preconditioner for the Krylov subspace methods, such as the CG (Conjugate Gradient) method or GMRES(m).
1 Three-Dimensional Extension into the Helmholtz Equation
For three-dimensional Poisson problems Douglas [1] proposed a variant of the classical ADI which has smoother convergence behavior.

Algorithm 1.1 DO3-ADI (Douglas ADI)

(H + ρ_i I) u^{i+1/3} = −(H + 2V + 2W − ρ_i I) u^i + 2b    (1)
(V + ρ_i I) u^{i+2/3} = −(H + V + 2W − ρ_i I) u^i − H u^{i+1/3} + 2b
(W + ρ_i I) u^{i+1} = −(H + V + W − ρ_i I) u^i − H u^{i+1/3} − V u^{i+2/3} + 2b
This work was supported by the project for supporting leading trial schools in IT from the Ministry of Information and Communication.
Douglas [1] has proven that, in the case where the matrices H, V, and W all commute, the above iteration is convergent for fixed ρ > 0. Experiments demonstrate that DO3-ADI converges for a wide range of values of ρ, but that its convergence rate is very slow. Due to the slow convergence rate it has rarely been used as an iterative method.
So, writing

A = (H̃ + (ω/3)I + ρ_i I) + (A − H̃ − (ω/3)I − ρ_i I)
  = (Ṽ + (ω/3)I + ρ_i I) + (A − Ṽ − (ω/3)I − ρ_i I)
  = (W̃ + (ω/3)I + ρ_i I) + (A − W̃ − (ω/3)I − ρ_i I),

we obtain the following iteration.
Algorithm 1.2 Peaceman-Rachford ADI in three dimensions (PR3-ADI)

(H + ρ_i I) u^{i+1/3} = −(V + W − ρ_i I) u^i + b    (2)
(V + ρ_i I) u^{i+2/3} = −(H + W − ρ_i I) u^{i+1/3} + b
(W + ρ_i I) u^{i+1} = −(H + V − ρ_i I) u^{i+2/3} + b

where H = H̃ + (ω/3)I, V = Ṽ + (ω/3)I, and W = W̃ + (ω/3)I. The convergence behavior of this algorithm is quite different from that of PR2-ADI in two dimensions. Assume that H, V, and W are pairwise commutative, and that

a ≤ σ(H), σ(V), σ(W) ≤ b,

where σ(M) denotes the spectrum of the matrix M.
Then, a and b are known to be

a = 4 sin²(π/(2(n+1))) + ω/3,    b = 4 sin²(nπ/(2(n+1))) + ω/3,

where N = n³.
Let T_ρ be the operator associated with PR3-ADI. Then

T_ρ = (W + ρI)^{-1}(H + V − ρI)(V + ρI)^{-1}(H + W − ρI)(H + ρI)^{-1}(V + W − ρI).    (3)

Since the given equation is separable, HV = VH, HW = WH, and VW = WV, and H, V, and W share a common set of eigenvectors. Let v be any such eigenvector, with

Hv = λv,  Vv = μv,  Wv = νv.

Then

T_ρ v = [(λ + μ − ρ)(μ + ν − ρ)(λ + ν − ρ)] / [(λ + ρ)(μ + ρ)(ν + ρ)] v,    (4)

and the spectral radius of T_ρ is given by

Sp(T_ρ) = max_{a≤λ,μ,ν≤b} |(λ + μ − ρ)(μ + ν − ρ)(λ + ν − ρ)| / [(λ + ρ)(μ + ρ)(ν + ρ)].    (5)
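The eigenvalue formula above can be checked numerically. The sketch below is a hypothetical setup, assuming H, V, and W are built from the standard 1-D finite-difference Laplacian tridiag(−1, 2, −1) via Kronecker products, each absorbing ω/3; it compares the spectral radius of T_ρ computed directly with the maximum of the scalar expression in (5) over the known 1-D eigenvalues:

```python
import numpy as np

def laplacian_1d(n):
    # tridiag(-1, 2, -1); its eigenvalues are 4*sin^2(k*pi/(2*(n+1))), k = 1..n
    return (np.diag(2.0 * np.ones(n))
            - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1))

n, omega, rho = 4, 1.0, 3.0          # illustrative sizes; N = n^3 = 64
I1, N = np.eye(n), n**3
H = np.kron(np.kron(laplacian_1d(n), I1), I1) + (omega / 3.0) * np.eye(N)
V = np.kron(np.kron(I1, laplacian_1d(n)), I1) + (omega / 3.0) * np.eye(N)
W = np.kron(np.kron(I1, I1), laplacian_1d(n)) + (omega / 3.0) * np.eye(N)

E = np.eye(N)
T_rho = (np.linalg.inv(W + rho * E) @ (H + V - rho * E)
         @ np.linalg.inv(V + rho * E) @ (H + W - rho * E)
         @ np.linalg.inv(H + rho * E) @ (V + W - rho * E))

# scalar prediction (5), maximized over the known 1-D eigenvalues
eigs = 4.0 * np.sin(np.arange(1, n + 1) * np.pi / (2 * (n + 1)))**2 + omega / 3.0
predicted = max(abs((l + m - rho) * (m + nu - rho) * (l + nu - rho)
                    / ((l + rho) * (m + rho) * (nu + rho)))
                for l in eigs for m in eigs for nu in eigs)
actual = max(abs(np.linalg.eigvals(T_rho)))
print(predicted, actual)   # the two agree, and both are < 1 since rho > b/2
```

Because H, V, and W here are simultaneously diagonalizable, every eigenvalue of T_ρ is one of the scalar ratios, which is exactly what the printout confirms.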
Now we look for ρ such that (5) becomes smaller than 1. To this end we introduce several functions. Let

φ_1(ρ) = max_{a≤λ,μ,ν≤b} |(μ + ν − ρ)/(λ + ρ)|,    (6)
φ_2(ρ) = max_{a≤λ,μ,ν≤b} |(λ + ν − ρ)/(μ + ρ)|,    (7)
φ_3(ρ) = max_{a≤λ,μ,ν≤b} |(λ + μ − ρ)/(ν + ρ)|,    (8)

and

ψ_1(ρ) = max_{a≤λ≤b} |(2λ − ρ)/(λ + ρ)|,    (9)
ψ_2(ρ) = max_{a≤μ≤b} |(2μ − ρ)/(μ + ρ)|,    (10)
ψ_3(ρ) = max_{a≤ν≤b} |(2ν − ρ)/(ν + ρ)|.    (11)
Theorem 1.1 Assume that ρ > b/2. Then

φ_1(ρ) ≤ ψ_1(ρ).

Corollary 1.1 With the same hypotheses as in Theorem 1.1, the necessary and sufficient condition for the PR3-ADI iteration to be convergent is ρ > b/2.

Proof. Sp(T_ρ) ≤ φ_1(ρ)φ_2(ρ)φ_3(ρ) = φ_1(ρ)³ ≤ ψ_1(ρ)³, and ψ_1(ρ) < 1 exactly when ρ > b/2; hence if ρ > b/2 then Sp(T_ρ) < 1. □
Theorem 1.2 The ρ minimizing Sp(T_ρ) is given by ρ = ρ*, where

ρ* = [a + b + √((a + b)² + 32ab)] / 4.

Proof. Note that ψ_1(ρ) = max( (2b − ρ)/(b + ρ), (ρ − 2a)/(a + ρ) ). We have

∂ψ_1/∂ρ = −3b/(b + ρ)² < 0,  ρ < ρ*,
∂ψ_1/∂ρ = 3a/(a + ρ)² > 0,  ρ > ρ*,

since ρ* is exactly the point where the two branches balance. So the minimum is obtained when ρ = ρ*. □
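As a quick numerical check of Theorem 1.2, the sketch below computes ρ* from the formula, with a and b taken from the model-problem expressions above (n and ω here are illustrative choices), and compares it against a brute-force minimization of ψ_1 over a grid:

```python
import numpy as np

def psi1(rho, a, b):
    # psi_1(rho) = max over lam in [a, b] of |2*lam - rho| / (lam + rho);
    # the ratio is V-shaped in lam, so the maximum sits at an endpoint
    return max(abs(2 * a - rho) / (a + rho), abs(2 * b - rho) / (b + rho))

n, omega = 48, 1.0                      # illustrative model-problem sizes
a = 4 * np.sin(np.pi / (2 * (n + 1)))**2 + omega / 3
b = 4 * np.sin(n * np.pi / (2 * (n + 1)))**2 + omega / 3

rho_star = (a + b + np.sqrt((a + b)**2 + 32 * a * b)) / 4.0

# brute-force check: a grid search over (b/2, 2b) lands on the same minimizer
grid = np.linspace(b / 2 + 1e-6, 2 * b, 4001)
rho_grid = grid[np.argmin([psi1(r, a, b) for r in grid])]
print(rho_star, rho_grid)
```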
2 Experiments
Table 1 shows that Douglas-Rachford ADI is indeed always convergent for any positive ρ, while its convergence is very slow. Also, the minimum number of iterations for PR3-ADI seems to occur around ρ = 2.0, which is close to our theoretical ρ*.
ρ      0.6  0.8  1.0  1.2  1.4  1.6  1.8  2.0  2.2  2.4  2.6  2.8  3.0
PR3     SL   SL   SL   SL   SL   SL   SL   87   58   62   63   80   61
DO3    145  140  133  134  134  128  122  154  170  174  176  183  195

Table 1: Poisson problem with N = 48 x 48 x 48; iterations of CG-ADI with constant ρ until ||r_k||/||r_0|| ≤ 10^{-6}.
3 Conclusion

In three dimensions, the optimal ρ parameter for the stationary ADI iteration was determined for the real Helmholtz equation with nonnegative ω. We believe that, for this specific Helmholtz equation, the straightforward extension of Peaceman-Rachford ADI with a properly predetermined ρ might converge faster than the Douglas-Rachford ADI for three dimensions. We also believe that, as a preconditioner for the Krylov subspace methods, such as CG (Conjugate Gradient) or GMRES(m), our result might turn out to be useful.
References
[1] J. Douglas, "Alternating direction methods for three space variables", Numerische Mathematik, Vol. 4, pp. 41-63, 1962.
Hanyang University,
Computer Science Department,
Kyungki-Do, Korea
A PROJECTION ALGORITHM FOR SYMMETRIC
EIGENVALUE PROBLEMS
PIL SEONG PARK
J. KSIAM Vol.3, No.2, 5-16, 1999
Abstract

We introduce a new projector for accelerating the convergence of a symmetric eigenvalue problem Ax = x, and devise a power/Lanczos hybrid algorithm. Acceleration can be achieved by removing the hard-to-annihilate nonsolution eigencomponents corresponding to the widespread eigenvalues with modulus close to 1, estimating them accurately using the Lanczos method. The additional Lanczos results can be obtained without expensive matrix-vector multiplications, at only a very small amount of extra work, by utilizing the simple power-Lanczos interconversion algorithms suggested. Numerical experiments are given at the end.
1. Introduction
Numerical models often yield eigenvalue problems Ax = λx for finding the dominant eigenvector x corresponding to the eigenvalue λ with the largest modulus. In many cases, the dominant eigenvalue is known in advance. One such example is the queuing problem Qx = 0 described in [2], which can be converted to an eigenvalue problem Ax = x for finding the dominant eigenvector corresponding to the eigenvalue 1.

It is well known that, if the moduli of some eigenvalues of A are nearly equal to that of the dominant one we look for, usual algorithms such as the power method or its variants like the Chebyshev iteration do not work well, since the convergence depends on the modulus ratio of the second-largest eigenvalue to the dominant one. To improve the convergence in such cases, an orthogonal projector was proposed in [3], under the assumption that these unwanted eigenvalues are clustered closely together. However, that projector sometimes may not work well if the unwanted major eigenvalues are well separated.

In this paper, we introduce a better orthogonal projector to deal with such cases, and devise a new power/Lanczos hybrid algorithm for symmetric eigenvalue problems. Numerical results of the algorithm in various cases are given at the end.

Throughout this paper, we deal with an n × n real symmetric eigenvalue problem Ax = x, i.e., we look for the eigenvector corresponding to the dominant eigenvalue 1. If the dominant eigenvalue, theoretically known in advance, is different from 1, we can scale the problem to satisfy this condition.
Key words: projector, Lanczos method, the power method, power-Lanczos interconversion, symmetric eigenvalue problems, Krylov subspace
2. Some background

We first review the method introduced in [3] and look at the Lanczos method for further development.

Definition 1 Let (λ_j, z_j) be the jth eigenpair of a given matrix A, numbered in decreasing order of eigenvalue modulus, and let z_1 be the eigenvector we look for, corresponding to λ_1 = 1.

Definition 2 For any vector x, we define the residual of x by r = (A − I)x.

The "unnormalized" power method x_{i+1} := Ax_i is one of the main driving forces. The residual is then just the difference between two consecutive power iterates, and can be computed without an extra matrix-vector multiplication. The solution component z_1 in the power iterates does not change, and the residuals contain nonsolution components only, but not z_1. Hence the convergence of an iterate can be estimated by normalizing it and then computing the norm of its residual.
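A minimal sketch of this residual-based monitoring, on a hypothetical symmetric test matrix constructed to have dominant eigenvalue 1 (the matrix, sizes, and seed here are illustrative, not the paper's test problems):

```python
import numpy as np

# hypothetical symmetric test matrix with known spectrum: dominant
# eigenvalue 1 (eigenvector z1), all other eigenvalues well below 1 in modulus
n = 50
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
lams = np.concatenate(([1.0], rng.uniform(-0.8, 0.8, n - 1)))
A = Q @ np.diag(lams) @ Q.T
z1 = Q[:, 0]

x = np.ones(n)
for _ in range(200):
    x_next = A @ x            # unnormalized power iterate
    r = x_next - x            # residual r = (A - I)x: a free by-product
    x = x_next

# convergence estimate: normalize the iterate, then measure its residual
xn = x / np.linalg.norm(x)
res = np.linalg.norm(A @ xn - xn)
print(res)
```

The residual `r` costs no extra matrix-vector product, exactly as noted above, and its norm shrinks at the rate of the second-largest eigenvalue modulus.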
Let x = Σ_{j=1}^n α_j z_j be the initial vector for the unnormalized power iteration. After enough (say, e) power iterations, let x_1 be the first power iterate of our concern and r_1 be its residual. Then

x_1 = A^e x = Σ_{j=1}^n α_j A^e z_j = Σ_{j=1}^n α_j λ_j^e z_j.

Assume that the nonsolution components in x_1 mainly consist of k major components (say, major nonsolution components) z_2, z_3, …, z_{k+1}, and that the others (say, minor nonsolution components) are negligibly small, i.e., 1 ≥ |α_j λ_j^e| ≫ |α_i λ_i^e|, j = 2, …, k+1, i = k+2, …, n. If we could remove most of these k major components, we would get very fast convergence.
Note that, after enough power iterations, the residuals will be rich in a few major nonsolution components, depending on the eigenstructure of A. Hence the residuals are used to approximate the subspace spanned by these major nonsolution components, and an orthogonal projector was designed to reduce them in the power iterates, as described below.

We apply k+1 more power iterations to obtain x_{i+1} = A^i x_1, i = 1, 2, …, k+1, and compute the residuals r_i ≡ x_{i+1} − x_i, i = 1, 2, …, k+1. By applying the Gram-Schmidt process to the k residuals r_1, …, r_k, we form the matrix V ∈ R^{n×k} whose columns are orthonormal and R(V) = span{r_1, …, r_k}, which may be close to span{z_2, …, z_{k+1}}. By subtracting the nonsolution components of r_{k+1} projected onto R(V), most of the major nonsolution components in the iterate x_{k+1} can be removed effectively by the projection step

x_new = x_{k+1} + 1/(1 − σ) V Vᵀ r_{k+1},    (1)

where σ is an approximation to the eigenvalues corresponding to the major eigencomponents z_2, …, z_{k+1} we try to remove. Note that x_{k+2} is needed only for the computation of r_{k+1} and is not used any further.
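The projection step (1) can be sketched as follows, on a hypothetical matrix with a cluster of k unwanted eigenvalues near a representative value σ = 0.95 (the spectrum, sizes, and iteration counts are illustrative assumptions); QR factorization stands in for classical Gram-Schmidt:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 40, 3
Qm, _ = np.linalg.qr(rng.standard_normal((n, n)))
# hypothetical spectrum: dominant eigenvalue 1, a cluster of k unwanted
# eigenvalues near 0.95, and the rest small in modulus
lams = np.concatenate(([1.0], 0.95 + 0.001 * rng.standard_normal(k),
                       rng.uniform(-0.3, 0.3, n - k - 1)))
A = Qm @ np.diag(lams) @ Qm.T

x = np.ones(n)
for _ in range(31):                 # enough power iterations that residuals
    x = A @ x                       # are rich in the clustered components

xs = [x]                            # x_1, then x_2, ..., x_{k+2}
for _ in range(k + 1):
    xs.append(A @ xs[-1])
rs = [xs[i + 1] - xs[i] for i in range(k + 1)]   # r_1, ..., r_{k+1}

V, _ = np.linalg.qr(np.column_stack(rs[:k]))     # orthonormal, spans r_1..r_k
sigma = 0.95                                      # representative eigenvalue
x_old = xs[k]                                     # this is x_{k+1}
x_new = x_old + V @ (V.T @ rs[k]) / (1 - sigma)  # projection step (1)

before = np.linalg.norm(A @ x_old - x_old) / np.linalg.norm(x_old)
after = np.linalg.norm(A @ x_new - x_new) / np.linalg.norm(x_new)
print(before, after)
```

Each clustered component ends up multiplied by (λ_j − σ)/(1 − σ), so the relative residual drops sharply when the cluster really does sit near σ.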
In an ideal case, when R(V) = span{z_2, z_3, …, z_{k+1}}, we have

Theorem 1 Let V ∈ R^{n×k} be the matrix with orthonormal columns such that

R(V) = span{r_1, r_2, …, r_k} = span{z_2, z_3, …, z_{k+1}}.

Let the current iterate be x_{k+1} = Σ_{j=1}^n α_j z_j, where |α_1| = O(1), |α_i| ≤ ε_1 for i = 2, …, k+1, |α_j| ≤ ε_2 for j = k+2, …, n, and ε_2 ≪ ε_1 ≪ 1. If σ is chosen so that |(λ_j − σ)/(1 − σ)| ≤ δ for some 0 < δ ≪ 1, j = 2, …, k+1, then the remaining major nonsolution components z_2, …, z_{k+1} after the projection step (1) are at most O(ε_k), where ε_k = max(kδε_1, tε_2) and t = (n − k − 1)(1 + 2/(1 − σ)).
The Lanczos method

Given a matrix A ∈ R^{n×n} and a set {q_1, …, q_k} of k linearly independent vectors (k ≤ n), the projection method on span{q_1, …, q_k} tries to approximate an eigenpair (λ, z) of the matrix A by a pair (λ^(k), z^(k)) satisfying [5]

z^(k) ∈ span{q_1, …, q_k},
(A − λ^(k) I) z^(k) ⊥ q_j,  j = 1, 2, …, k.

The solutions λ^(k) are called Ritz values on the subspace span{q_1, …, q_k}, and to each Ritz value is associated a Ritz vector z^(k) [7].

Let Q_k = [q_1, …, q_k]. Writing z^(k) = Q_k s^(k), we see that (λ^(k), s^(k)) are eigenpairs of the problem

(T_k − λ^(k) B_k) s^(k) = 0,

or equivalently

(B_k^{-1} T_k − λ^(k) I) s^(k) = 0,

where T_k = Q_kᵀ A Q_k and B_k = Q_kᵀ Q_k.

In usual applications, we choose an orthonormal system Q_k, so that B_k reduces to the identity matrix. One such process is the symmetric Lanczos method. It uses the orthonormal system obtained by orthogonalization of the Krylov vectors q_1, Aq_1, …, A^{k−1} q_1, where q_1 is a starting vector. Then the matrix T_k becomes tridiagonal, with diagonal entries α_1, …, α_k and sub/superdiagonal entries β_1, …, β_{k−1}, and its entries are easily obtainable from the three-term recurrence relation [7]

A q_j = β_{j−1} q_{j−1} + α_j q_j + β_j q_{j+1},  j = 1, …, n−1,
β_0 q_0 ≡ 0.
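A minimal sketch of the symmetric Lanczos process described above, with full reorthogonalization added for numerical robustness (the exact-arithmetic recurrence does not need it); matrix and sizes are illustrative:

```python
import numpy as np

def lanczos(A, q1, k):
    """Symmetric Lanczos: returns Q (n x k, orthonormal columns) and the
    tridiagonal T_k built from the three-term recurrence
    A q_j = beta_{j-1} q_{j-1} + alpha_j q_j + beta_j q_{j+1}."""
    n = len(q1)
    Q = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(k - 1)
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = w @ Q[:, j]
        # full reorthogonalization against all previous Lanczos vectors
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    return Q, np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

rng = np.random.default_rng(2)
B = rng.standard_normal((60, 60))
A = (B + B.T) / 2
Q, Tk = lanczos(A, rng.standard_normal(60), 8)
# extreme eigenvalues of T_k (Ritz values) approximate the extremes of A
print(np.linalg.eigvalsh(Tk)[[0, -1]], np.linalg.eigvalsh(A)[[0, -1]])
```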
3. A new projector
The reduction/magnification factor of the jth eigencomponent z_j under the projection step (1) of the previous method is |λ_j − σ|/|1 − σ|. One major drawback of the previous algorithm is that nonsolution components can grow if this ratio is larger than 1, especially if the eigenvalue λ_j is far away from the point (σ, 0) in the complex plane.

The parameter σ in the old projector (1) is a representative value for the k eigenvalues that correspond to the major nonsolution components, which are different in general. Moreover, if these eigenvalues are well separated, a projection step using one parameter can be a disaster.
To remedy such phenomena, we modify the old projector into a multi-parametered one. That is, we replace the old orthogonal projector (1) by

x_new = x_{k+1} + V Λ Vᵀ r_{k+1},    (2)

where the columns of V ∈ R^{n×k} are orthonormal, R(V) = span{r_2, r_3, …, r_{k+1}}, and

Λ = diag( 1/(1 − θ_2), …, 1/(1 − θ_{k+1}) ),

where the θ_j's are hopefully close to the λ_j's.
As before, after enough power iterations, assume that the current residual r_{k+1} mainly consists of the k major nonsolution components z_2, z_3, …, z_{k+1}, and that the rest are much smaller. Now consider the new projection step (2). In an ideal case, when the columns of V are exactly z_2, …, z_{k+1} and θ_j = λ_j, j = 2, …, k+1, we have

Theorem 2 Let V ∈ R^{n×k} be the matrix with orthonormal columns such that V = [z_2, z_3, …, z_{k+1}], and let θ_j = λ_j, j = 2, 3, …, k+1. After enough power iterations, assume that the current iterate can be written as x_{k+1} = Σ_{j=1}^n α_j z_j, where |α_1| = O(1), |α_i| ≤ ε_1 for i = 2, …, k+1, |α_j| ≤ ε_2 for j = k+2, …, n, and ε_2 ≪ ε_1 ≪ 1. Then, after applying the projection step (2), the sum of the remaining nonsolution components is at most of O(nε_2).
Proof. Let V = [z_2, …, z_{k+1}] and W = [z_{k+2}, …, z_n]. Then both V and W have orthonormal columns since A is symmetric. We can write the current iterate x_{k+1} and its residual r_{k+1} in matrix form as x_{k+1} = α_1 z_1 + Vα_2 + Wα_3 and r_{k+1} = (A − I)x_{k+1} = Σ_{j=2}^n α_j(λ_j − 1)z_j = Vβ_2 + Wβ_3, where α_2 = [α_2, …, α_{k+1}]ᵀ, α_3 = [α_{k+2}, …, α_n]ᵀ, β_2 = [α_2(λ_2 − 1), …, α_{k+1}(λ_{k+1} − 1)]ᵀ, and β_3 = [α_{k+2}(λ_{k+2} − 1), …, α_n(λ_n − 1)]ᵀ. Hence after the projection step, we obtain

x_new = x_{k+1} + V Λ Vᵀ r_{k+1}
      = α_1 z_1 + Vα_2 + Wα_3 + V Λ Vᵀ (Vβ_2 + Wβ_3)
      = α_1 z_1 + Vα_2 + Wα_3 + V Λ β_2
      = α_1 z_1 + Wα_3,

since VᵀV = I, VᵀWβ_3 = 0, and Λβ_2 = −α_2. Hence the size of the sum of the nonsolution components in x_new is

||x_new − α_1 z_1|| = || Σ_{j=k+2}^n α_j z_j || ≤ nε_2.
Note that we need to compute the exact eigenpairs (λ_j, z_j), j = 2, …, k+1, for better convergence. The Lanczos method, which is known to compute a few extreme eigenpairs very accurately, is a good candidate for this purpose.

Hence, in our new algorithm, we make use of the Ritz pairs (e.g., see [1]). That is, by applying the Lanczos method to the residual r_1, we compute k Lanczos vectors q_1, …, q_k ∈ Rⁿ and a k × k tridiagonal matrix T_k. Through eigenanalysis of T_k, we compute its eigenpairs (θ_j, s_j), j = 2, …, k+1 (we intentionally number the eigenvalues in this way to conform to later development). It is well known that a few of the θ_j's are good approximations to the extreme eigenvalues λ_2, λ_3, … of A (since the Lanczos method is applied to the residual r_1), and some of the Q_k s_j's, where Q_k = [q_1, …, q_k], are good approximations to the eigenvectors z_2, z_3, … of A.

Hence we select only c eigenpairs (θ_j, s_j), j = 2, …, c+1 (c < k, those that are believed to be accurate) out of them, and obtain Λ ∈ R^{c×c} and c approximations y_j = Q_k s_j, j = 2, …, c+1, to the eigenvectors z_2, …, z_{c+1} of A. Then take V = [y_2, …, y_{c+1}] ∈ R^{n×c}.
4. Interconversion between the power iterates and the Lanczos results
It might seem, according to the previous explanation, that we need to apply the Lanczos method independently of the power method, which would require a lot of extra costly matrix-vector multiplications. However, we can avoid this by carefully considering the relation among the power iterates, their residuals, and the Lanczos vectors computed from the residuals.

Such interconversion is possible because the power iterates, their residuals, and the Lanczos method all make use of Krylov subspaces, and the residual has been defined so that the interconversion works.

Lemma 1 Consider the power iteration x_{i+1} := Ax_i for an eigenvalue problem Ax = x with an initial vector x_1. Then not only the power iterates x_1, x_2, … but also their residuals r_1, r_2, … form Krylov subspaces.

Proof. The facts are clear from the definition of the Krylov subspace and Definition 2.

This means that the residual sequence r_1, r_2, … can be thought of as another sequence of power iterates, generated by r_{i+1} := Ar_i. Power iterates can be constructed from the residual sequence and the initial iterate x_1 by

Lemma 2 Let x_1, x_2, … be power iterates. Then x_{j+1} = x_1 + Σ_{i=1}^j r_i.
Getting the Lanczos tridiagonal matrix from power iterates

Since we apply the Lanczos method to the residual r_1, we have

Theorem 3 The Lanczos tridiagonal matrix T_k can be obtained from the residual iterates r_1, r_2, …, r_k with one additional matrix-vector multiplication and some inner products.

Proof. It is just a symmetric version of Theorem 2 in [4], with power iterates replaced by the residual iterates.

Corollary 1 The Lanczos tridiagonal matrix T_k can be constructed from the power iterates x_1, x_2, …, x_{k+2}.
Getting power iterates from the Lanczos process

Conversely, we can obtain power iterates from the Lanczos results.

Theorem 4 The residual sequence r_2, …, r_k can be computed from the Lanczos results by

r_j = A^{j−1} r_1 = ||r_1|| V_k T_k^{j−2} t_1,

where V_k ∈ R^{n×k}, T_k ∈ R^{k×k}, and t_1 is the first column of the Lanczos tridiagonal matrix T_k.

Proof. Consider the full Lanczos relation

A V = V T,    (3)

where A, V, T ∈ R^{n×n}. Let V = [v_1, v_2, …, v_n]. The first column of AV can be written as

A v_1 = V t_1,    (4)

where t_1 is the first column of T. Premultiplying (4) by A and using (3),

A² v_1 = A V t_1 = V T t_1.

In general, we obtain

A^j v_1 = V T^{j−1} t_1.

We only need the first two columns of T^{j−1} to compute A^j v_1, since T is tridiagonal and only the first two entries of t_1 are nonzero. Hence we need only the first k columns of V and the k × k principal submatrix of T, which is T_k. Since we apply the Lanczos method to the residual r_1 (i.e., the starting vector is v_1 = r_1/||r_1||), the result is immediate.
Corollary 2 The power iterates x_2, …, x_{k+1} can be obtained from the Lanczos matrix T_k and the Lanczos vectors.
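Theorem 4 is easy to verify numerically. The sketch below uses a small Lanczos helper (a hypothetical implementation with full reorthogonalization; matrix and sizes are illustrative) and checks that A^{j−1} r_1 agrees with ||r_1|| Q_k T_k^{j−2} t_1 for j = 2, …, k:

```python
import numpy as np

def lanczos(A, q1, k):
    # symmetric Lanczos with full reorthogonalization (hypothetical helper)
    n = len(q1)
    Q = np.zeros((n, k)); alpha = np.zeros(k); beta = np.zeros(k - 1)
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = w @ Q[:, j]
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    return Q, np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

rng = np.random.default_rng(3)
n, k = 60, 10
B = rng.standard_normal((n, n))
A = (B + B.T) / 2
A /= np.max(np.abs(np.linalg.eigvalsh(A)))   # scale so |eigenvalues| <= 1

r1 = rng.standard_normal(n)
Qk, Tk = lanczos(A, r1, k)
t1 = Tk[:, 0]

# r_j = A^{j-1} r_1 should equal ||r_1|| * Q_k T_k^{j-2} t_1 for j = 2..k
ok = True
for j in range(2, k + 1):
    direct = np.linalg.matrix_power(A, j - 1) @ r1
    recon = np.linalg.norm(r1) * (Qk @ (np.linalg.matrix_power(Tk, j - 2) @ t1))
    ok = ok and np.allclose(direct, recon)
print(ok)
```

The truncation to k Lanczos vectors is exact here because T_k^{j−2} t_1 never picks up entries beyond position k for j ≤ k.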
5. A power/Lanczos hybrid algorithm
Our algorithm consists of one or more hybrid steps, each of which consists of some power iterations (so that the residuals mainly contain some major nonsolution components only), Lanczos steps (to estimate the major nonsolution eigenpairs), and a projection step (to remove those major nonsolution components).

More precisely, at the beginning of each hybrid iteration, we apply some (say, s) power iterations to the most recent iterate x_new and obtain x_1. Then we compute the residual r_1 = x_2 − x_1, where x_2 = Ax_1. At this point, we can use either of the following two methods:
• Continue applying power iteration to generate x_3, …, x_{k+2} by x_{i+1} := Ax_i, i = 2, 3, …, k+1, then convert them to create the Lanczos matrix and vectors by Theorem 3 and Corollary 1; or

• Stop power iteration, and apply the Lanczos method to r_1 to get the Lanczos matrix T_k and the Lanczos vectors. Then the residual r_k (and hence the power iterates x_k and x_{k+1} too) can be obtained by Theorem 4 and Corollary 2.
In any case, the Lanczos matrix T_k and the Lanczos vectors are used to approximate the major nonsolution eigencomponents (λ_j, z_j), j = 2, …, k+1, as explained previously. However, since only some (say, c, where c < k) of them are accurate, we choose only those c eigenpairs, project the most recent residual r_{k+1} onto the subspace spanned by the c eigenpairs, and subtract the components from x_{k+1}.

Based on the latter method, which looks simpler, we suggest the following power/Lanczos hybrid algorithm (m, s, k, c), where s is the number of power iterations to be applied before the Lanczos step, k is the size of the Lanczos tridiagonal matrix to be constructed, and c is the number of dimensions onto which a projection step is applied (i.e., the number of eigenpairs that are assumed to be accurate). Note that we may allow m extra power iterations on the initial guess x_init before the first hybrid iteration begins (outside of the hybrid loop).
Algorithm 1: Power/Lanczos hybrid algorithm (m, s, k, c)

Given a matrix A ∈ R^{n×n}, assume c < k ≤ n.

1. Take an initial guess x_init.
2. Apply m power iterations to x_init.
3. For i = 1, 2, …, until convergence, do
   1) Apply s power iterations to the most recent iterate to obtain x_1. Then compute r_1 = x_2 − x_1, where x_2 = Ax_1.
   2) Apply the Lanczos method to r_1 to obtain a tridiagonal matrix T_k and k Lanczos vectors q_1, …, q_k.
   3) Compute the eigenpairs (θ_j, s_j), j = 2, …, k+1, of T_k, and select the major c pairs out of them.
   4) Form Λ ∈ R^{c×c} and V = [y_2, …, y_{c+1}] ∈ R^{n×c}, where y_j = Q_k s_j, j = 2, …, c+1, and Q_k = [q_1, …, q_k].
   5) Use the Lanczos results to form x_{k+1} and r_{k+1}.
   6) Perform a projection step to obtain a new iterate x_new = x_{k+1} + V Λ Vᵀ r_{k+1}, and normalize x_new.
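A compact sketch of Algorithm 1 follows. It uses the simple variant that forms x_{k+1} and r_{k+1} by k extra power steps rather than by Theorem 4; the test matrix, spectrum, seed, and parameter values are illustrative assumptions, not the paper's queuing problems:

```python
import numpy as np

def lanczos(A, r1, k):
    # symmetric Lanczos with full reorthogonalization
    n = len(r1)
    Q = np.zeros((n, k)); alpha = np.zeros(k); beta = np.zeros(k - 1)
    Q[:, 0] = r1 / np.linalg.norm(r1)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = w @ Q[:, j]
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    return Q, np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

def hybrid(A, x, m=10, s=5, k=5, c=3, iters=6):
    # power/Lanczos hybrid(m, s, k, c), following Algorithm 1
    for _ in range(m):                    # step 2: initial power iterations
        x = A @ x
    for _ in range(iters):                # step 3
        for _ in range(s):                # 1) s power iterations -> x_1
            x = A @ x
        r1 = A @ x - x                    #    r_1 = x_2 - x_1
        Q, Tk = lanczos(A, r1, k)         # 2) Lanczos applied to r_1
        theta, S = np.linalg.eigh(Tk)     # 3) eigenpairs of T_k ...
        idx = np.argsort(-np.abs(theta))[:c]
        theta, S = theta[idx], S[:, idx]  #    ... keep the c dominant pairs
        V = Q @ S                         # 4) Ritz vectors approximating z_j
        for _ in range(k):                # 5) form x_{k+1}, r_{k+1} by extra
            x = A @ x                     #    power steps (simple variant)
        r = A @ x - x
        x = x + V @ ((V.T @ r) / (1.0 - theta))   # 6) projection step (2)
        x = x / np.linalg.norm(x)
    return x

# illustrative symmetric test problem: dominant eigenvalue 1 plus a few
# well-separated eigenvalues of large modulus, loosely mimicking Problem 1
rng = np.random.default_rng(4)
n = 100
Q0, _ = np.linalg.qr(rng.standard_normal((n, n)))
lams = np.concatenate(([1.0, -0.98, 0.97, -0.956],
                       rng.uniform(-0.8, 0.8, n - 4)))
A = Q0 @ np.diag(lams) @ Q0.T

x = hybrid(A, np.ones(n))
print(np.linalg.norm(A @ x - x))   # small residual after a few hybrid steps
```

Note the elementwise division by (1 − θ_j), which is exactly the multi-parameter Λ of the new projector.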
6. Numerical experiments and discussion
We created the two relatively hard sample problems Ax = x as follows: for each problem, we created a two-queue overflow queuing problem Qx = 0 with parameter quadruples (s_j, w_j, i_j, o_j), where s_j is the number of servers, w_j is the number of wait spaces, i_j is the mean arrival rate, and o_j is the mean departure rate in the jth queue [2]. We convert this into the corresponding eigenvalue problem Gx = x by Jacobi splitting (e.g., see [1]). Since G is 2-cyclic (hence the power iteration does not converge as is; e.g., see [6]), we slightly shift it by B = (G + 0.01I)/1.01 so that the resulting matrix is acyclic. From this, we get an eigenvalue problem Ax = x, where A is obtained by symmetrizing B as (B + Bᵀ)/2 and scaling the resulting matrix by its spectral radius so that λ_1 = 1 is a simple eigenvalue. The eigenvalues, other than 1, of each of the matrices are well separated, and many of them have relatively large modulus close to 1.
Problem 1 The parameter quadruples used are (5, 3, 10, 6) and (5, 3, 3, 1) in each queue (j = 1, 2), respectively. Hence the matrix size is 81. The eigenvalues other than 1 are, in decreasing order of magnitude, -0.98144, 0.97465, -0.95610, 0.94286, -0.92430, 0.89565, -0.87709, 0.86261, 0.84435, -0.84405, 0.83755, -0.82580, -0.81900, 0.80051, and the rest are between -0.8 and 0.8.
Problem 2 The parameter quadruples used are (7, 12, 33, 7) and (4, 15, 22, 5) in each queue (j = 1, 2), respectively. Hence the matrix size is 400. The eigenvalues with modulus greater than 0.95 are 0.99608, -0.98048, 0.98004, 0.97843, -0.97655, 0.97326, -0.96051, -0.95891, 0.95794, -0.95374, and 0.95273. There are 14 eigenvalues whose moduli are between 0.90 and 0.95, and 14 more between 0.85 and 0.90, the others being between -0.85 and 0.85.
As an initial guess x_init, we use a vector of all ones scaled by its 2-norm. One projection step needs one matrix-vector multiplication and some extra calculation for the eigenanalysis of the Lanczos tridiagonal matrix T_k of size k, etc. However, since k ≪ n, we will simply count the work for one projection step as one matrix-vector multiplication, ignoring any extra calculation (according to actual timing results, this assumption seems to be acceptable).
Table 1 shows the number of matrix-vector multiplications used to reduce the norm of the residual to certain sizes (10^{-3}, 10^{-7}, and 10^{-10}) for Problem 1, by various methods: the pure power method, the shifted power method with best shift 0.006223529 (computed by eigenanalysis of A), the old projection method (denoted by "Old") in [3], and the new hybrid algorithm with various parameters.
In general, regardless of the choice of parameters, the new hybrid algorithm worked well, reducing the amount of work to nearly 1/5 of that of the underlying power method and less than 1/2 of that of the old method. It has also been observed that, for fixed values of m, s, k, different choices of c do not make much difference. However, taking c = k may sometimes give a slightly worse result, since not all of the Lanczos pairs are accurate.
Fig. 1 shows a typical residual reduction pattern of the hybrid algorithm, together with those of the pure power method and the old projection method. We denote a specific algorithm by "algorithm(m, s, k, c)", where "algorithm" is either "old" (for the old projection method) or "hybrid" (for the new hybrid method), and m, s, k, c are the parameters as described previously. Note that the rapid residual reduction after each projection is clearly visible. Comparing hybrid(10,10,10,5) and hybrid(0,10,10,5), we may conclude that it makes little difference whether we start a projection step slightly earlier or not, because the Lanczos method gives a very good approximation.

Table 1: The number of matrix-vector multiplications required to reduce the residual to certain sizes by various methods.

Algorithm                         m   s   k   c   10^-3  10^-7  10^-10
Normal power iteration            -   -   -   -     89    555     924
Power iteration with best shift   -   -   -   -     82    441     731
Old                              10   5   5   -     54    185     312
Hybrid                           10   5   5   2     38    109     160
Hybrid                           10   5   5   3     36    109     160
Hybrid                           10   5   5   4     32    113     165
Hybrid                           10   5   5   5     32    111     154
Old                              50   5   5   -     81    241     359
Hybrid                           50   5   5   2     61    131     187
Hybrid                           50   5   5   3     61    116     168
Hybrid                           50   5   5   4     61    111     172
Hybrid                           50   5   5   5     61    116     163
Old                              10  10  10   -     82    241     368
Hybrid                           10  10  10   3     31    149     200
Hybrid                           10  10  10   5     31     94     139
Hybrid                           10  10  10   7     31     94     137
Hybrid                           10  10  10   9     31     94     137
Old                               0  10  10   -     82    332     441
Hybrid                            0  10  10   3     33    149     218
Hybrid                            0  10  10   5     25     86     140
Hybrid                            0  10  10   7     25     84     127
Hybrid                            0  10  10   9     25     84     127
Old                              20   1   4   -     68    326     413
Hybrid                           20   1   4   2     35    136     222
Hybrid                           20   1   4   3     34    137     232

Figure 1: A typical residual reduction pattern of the hybrid algorithm applied to Problem 1.
However, for the result of old(10,10,10,-), where the same values of m, s, k are used, a rapid drop is seldom seen (in fact, we did apply a projection step at every 20 power iterations, but most of them were found to be of no use). The reason seems to be that the estimation of the major nonsolution components in the old algorithm is not as accurate as that of the Lanczos method used in the hybrid algorithm.

Indeed, even though old(10,10,10,-) performs a projection step at every 21 matrix-vector multiplications (i.e., 10 for power iterations and 11 for the Lanczos method applied to the residual), rapid improvement is seldom seen. In the figure, rapid improvement is seen at the 6th, 11th, and 15th projections only. Thus the old algorithm must have failed to accelerate convergence in most hybrid steps, unless more power iterations reduce the number of major nonsolution components so that the condition becomes favorable for the old projector.
When we apply projection steps too often (see the cases for m = 20, s = 1, k = 4 at the end of Table 1 and Figure 2), convergence may be worse, because the iteration is not ready for another projection yet, in the sense that the span of the residual sequence may not be a good approximation to the span of the dominant nonsolution components. However, even in such cases, the new hybrid algorithm still works far better than the old algorithm.

Figure 2: Effect of frequent projections (Problem 1).

Figure 3 shows that the new algorithm still works well even though the matrix of Problem 2 has many more (compared to the size of the dimension, c = 5) well-separated eigenvalues with modulus close to 1.
In general, any choice of the parameters (m, s, k, c) for the new hybrid algorithm seems to work well, but determination of their optimal values needs further research.
Figure 3: Residual reduction by the hybrid algorithm applied to Problem 2.

References

1. G. H. Golub and C. F. van Loan, 1996. Matrix Computations, 3rd Ed., Johns Hopkins Univ. Press, Baltimore, U.S.A.
2. L. Kaufman, 1983. Matrix methods for queuing problems, SIAM J. Sci. Comput., 4:525-552.
3. P. S. Park, 1996. Use of an orthogonal projector for accelerating a queuing problem solver, Korean J. Com. & Appl. Math. 3(2):193-204.
4. P. S. Park, 1997. Interconversion between the power and Arnoldi's methods. Comm. Korean Math. Soc. 12(1):145-155.
5. Y. Saad, 1980. Variations on Arnoldi's method for computing eigenelements of large unsymmetric matrices, Lin. Alg. App., 34:269-295.
6. E. Seneta, 1981. Non-negative Matrices and Markov Chains, 2nd Ed., Springer-Verlag, New York, U.S.A.
7. L. N. Trefethen and D. Bau, III, 1997. Numerical Linear Algebra, SIAM, Philadelphia, U.S.A.
Department of Computer Science
University of Suwon
Kyungki-Do 445-743, Korea
J. KSIAM Vol.3, No.2, 17-28, 1999
THE BOUNDARY ELEMENT METHOD FOR
POTENTIAL PROBLEMS WITH SINGULARITIES
BEONG IN YUN
Abstract. A new procedure of the boundary element method (BEM), called the singular BEM, for potential problems with singularities is presented. To obtain a numerical solution whose asymptotic behavior near the singularities is close to that of the analytic solution, we use particular elements on the boundary segments containing the singularities. The Motz problem and the crack problem are taken as typical examples, and numerical results for these cases show the efficiency of the present method.
1. Introduction. The general potential boundary value problem in the plane can be written as

Δu = 0  in Ω,
u = ū  on Γ_u,    (1.1)
q = q̄  on Γ_q,

where Γ = Γ_u ∪ Γ_q is the piecewise smooth boundary of the domain Ω, and q = ∂u/∂n is the normal derivative of the potential u with respect to the outward unit normal vector n.

There are many numerical methods for solving boundary value problems, such as the finite element method (FEM), the finite difference method (FDM), the Ritz-Galerkin method, and the boundary element method (BEM). Each of these methods has its own advantages and disadvantages. When the solutions have singularities, due to complicated boundary conditions or the geometries of the boundaries, the traditional numerical schemes are not useful, and some special manipulation is needed to overcome this difficulty [1-4].

The BEM has recently prevailed in many engineering disciplines because of the reduction in the dimensionality of the problem, which results in a much smaller system of algebraic equations to be solved numerically. The present work is concerned with a simple and effective numerical implementation of the traditional BEM [5,6] for solving potential problems with singularities.
It is well known that, for interior points P in Ω, the solution of problem (1.1) satisfies

u(P) + ∫_Γ q*(P,Q) u(Q) dΓ(Q) = ∫_Γ u*(P,Q) q(Q) dΓ(Q),    (1.2)

Keywords: boundary element method, singular BEM, Motz problem, crack problem
1991 Mathematics Subject Classification: 65N38
in which u^*(P,Q) and q^*(P,Q) are fundamental solutions of the Laplace equation, namely

u^*(P,Q) = \frac{1}{2\pi} \log \frac{1}{r}   (1.3)

and

q^*(P,Q) = \frac{\partial u^*}{\partial n_Q}(P,Q) = -\frac{1}{2\pi} \frac{1}{r^2} \left( r_1 n_1 + r_2 n_2 \right).   (1.4)

In the formulae (1.3) and (1.4), for the points P = (p_1, p_2) and Q = (q_1, q_2),

r = |Q - P| = \sqrt{r_1^2 + r_2^2},
r_1 = q_1 - p_1, \quad r_2 = q_2 - p_2,
n_Q = (n_1, n_2).   (1.5)
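As a quick sanity check on (1.3)-(1.4) (an illustrative aside, not part of the paper's algorithm), the sketch below evaluates the fundamental solution and verifies numerically that it is harmonic away from P and that q^* is indeed its normal derivative at Q:

```python
import math

def u_star(P, Q):
    # Fundamental solution (1.3): u*(P,Q) = (1/(2*pi)) * log(1/r), r = |Q - P|
    r = math.hypot(Q[0] - P[0], Q[1] - P[1])
    return math.log(1.0 / r) / (2.0 * math.pi)

def q_star(P, Q, nQ):
    # Normal derivative (1.4): q* = -(1/(2*pi)) (r1*n1 + r2*n2) / r^2
    r1, r2 = Q[0] - P[0], Q[1] - P[1]
    rsq = r1 * r1 + r2 * r2
    return -(r1 * nQ[0] + r2 * nQ[1]) / (2.0 * math.pi * rsq)

# u* is harmonic in Q away from P: check with a 5-point finite-difference Laplacian.
P, Q, h = (0.0, 0.0), (0.7, 0.4), 1e-4
lap = (u_star(P, (Q[0] + h, Q[1])) + u_star(P, (Q[0] - h, Q[1]))
       + u_star(P, (Q[0], Q[1] + h)) + u_star(P, (Q[0], Q[1] - h))
       - 4.0 * u_star(P, Q)) / h**2

# q* should match the directional derivative of u* along nQ (here nQ = (0, 1)).
nQ = (0.0, 1.0)
dd = (u_star(P, (Q[0], Q[1] + 1e-6)) - u_star(P, (Q[0], Q[1] - 1e-6))) / 2e-6
```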
The limiting process of equation (1.2) to a boundary point induces the boundary
integral equation

\frac{1}{2} u(P) + \int_\Gamma q^*(P,Q)\, u(Q)\, d\Gamma(Q) = \int_\Gamma u^*(P,Q)\, q(Q)\, d\Gamma(Q), \quad P \in \Gamma.   (1.6)

First, the numerical scheme for the traditional BEM is given in Section 2. A
simple modification of the boundary elements near the singularities is proposed in
Section 3, and applications of the present method to the Motz problem and the crack
problem are studied in the last section.
2. Traditional BEM with Constant Elements and Linear Discretization.
In this section we review the complete algorithm of the traditional BEM, for simplicity
based on constant elements with a linear discretization of the boundary.
2.1. Discretization
Let the boundary \Gamma be discretized by the line segments \Gamma_j (j = 1, 2, \ldots, n), whose
end points are (x_j, y_j) and (x_{j+1}, y_{j+1}). Then every point Q = (x, y) \in \Gamma_j can be
written as

x = \phi_j(t) = \frac{1}{2}\left[ (x_{j+1} - x_j)\, t + (x_{j+1} + x_j) \right],
y = \psi_j(t) = \frac{1}{2}\left[ (y_{j+1} - y_j)\, t + (y_{j+1} + y_j) \right], \quad -1 \le t \le 1,   (2.1)

with

d\Gamma_j(Q) = \sqrt{\phi_j'(t)^2 + \psi_j'(t)^2}\, dt = \frac{1}{2}\sqrt{(x_{j+1} - x_j)^2 + (y_{j+1} - y_j)^2}\, dt \equiv \frac{1}{2} L_j\, dt,   (2.2)
POTENTIAL PROBLEMS WITH SINGULARITIES 19
and the outward unit normal vector is

n_Q = (n_1, n_2) = (y_{j+1} - y_j,\; x_j - x_{j+1}) / L_j.   (2.3)

On the other hand, we take the node point on \Gamma_i as

P_i = (\bar{x}_i, \bar{y}_i), \quad \bar{x}_i = (x_i + x_{i+1})/2, \quad \bar{y}_i = (y_i + y_{i+1})/2.   (2.4)

Then the distance between P_i and Q \in \Gamma_j is

r(t) = |Q - P_i| = \sqrt{r_1(t)^2 + r_2(t)^2},   (2.5)

with

r_1(t) = \phi_j(t) - \bar{x}_i \quad \text{and} \quad r_2(t) = \psi_j(t) - \bar{y}_i.   (2.6)

On each boundary segment \Gamma_j, we take constant values for the potential and the
flux, say,

u(Q) = u(P_j) = u_j, \quad q(Q) = q(P_j) = q_j, \quad \text{for all } Q \in \Gamma_j.   (2.7)
Then, for every node point P = P_i \in \Gamma_i, the boundary integral equation (1.6) results
in

\frac{1}{2} u_i + \sum_{j=1}^{n} \left[ \int_{\Gamma_j} q^*(P_i, Q)\, d\Gamma_j(Q) \right] u_j = \sum_{j=1}^{n} \left[ \int_{\Gamma_j} u^*(P_i, Q)\, d\Gamma_j(Q) \right] q_j, \quad i = 1, 2, \ldots, n.   (2.8)

The equation (2.8) is rewritten as

\sum_{j=1}^{n} H_{ij} u_j = \sum_{j=1}^{n} G_{ij} q_j, \quad i = 1, 2, \ldots, n,   (2.9)
in which the integrals G_{ij} and H_{ij} can be evaluated numerically by the Gauss quadrature
rule. That is, referring to the formulae (2.1)-(2.6),

G_{ij} = \int_{\Gamma_j} u^*(P_i, Q)\, d\Gamma_j(Q) = -\frac{1}{2\pi} \int_{-1}^{1} \log r(t) \left( \frac{1}{2} L_j \right) dt \approx -\frac{L_j}{4\pi} \sum_{m=1}^{M} \omega_m \log r(t_m),   (2.10)

where \omega_m and t_m are the weights and nodes of the Gauss quadrature rule on the interval
-1 \le t \le 1. In particular, when i = j,

G_{ii} = -\frac{L_i}{4\pi} \int_{-1}^{1} \log \left| \frac{L_i}{2}\, t \right| dt = -\frac{L_i}{2\pi} \left[ \log \left( \frac{L_i}{2} \right) - 1 \right].   (2.11)
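As an illustration of (2.10) (function names and test geometry are ours, not the paper's), the influence coefficient for a regular pair i \ne j needs only a few Gauss points: for a node well separated from the segment, the paper's choice M = 4 already agrees closely with a much finer rule.

```python
import numpy as np

def G_ij(Pi, A, B, M=4):
    """Approximate G_ij = -(L_j/(4*pi)) * sum_m w_m log r(t_m), as in (2.10),
    for the segment with end points A, B and collocation node Pi."""
    t, w = np.polynomial.legendre.leggauss(M)   # Gauss nodes/weights on [-1, 1]
    L = np.hypot(B[0] - A[0], B[1] - A[1])      # segment length L_j
    # linear parametrization (2.1): Q(t) = midpoint + t * half-chord
    qx = 0.5 * ((B[0] - A[0]) * t + (B[0] + A[0]))
    qy = 0.5 * ((B[1] - A[1]) * t + (B[1] + A[1]))
    r = np.hypot(qx - Pi[0], qy - Pi[1])        # r(t) of (2.5)
    return -(L / (4.0 * np.pi)) * np.sum(w * np.log(r))

# Node well away from a short segment: integrand log r(t) is smooth, so the
# 4-point rule is already near the 64-point reference.
g4 = G_ij((0.9, 0.8), (0.0, 0.0), (0.2, 0.0), M=4)
g64 = G_ij((0.9, 0.8), (0.0, 0.0), (0.2, 0.0), M=64)
```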
In the formula (2.9), H_{ij} = \frac{1}{2}\delta_{ij} + \bar{H}_{ij}, and \bar{H}_{ij} is approximated by

\bar{H}_{ij} = \int_{\Gamma_j} q^*(P_i, Q)\, d\Gamma_j(Q) = -\frac{L_j}{4\pi} \int_{-1}^{1} \frac{1}{r(t)^2} \left[ r_1(t) n_1 + r_2(t) n_2 \right] dt \approx -\frac{L_j}{4\pi} \sum_{m=1}^{M} \frac{\omega_m}{r(t_m)^2} \left[ r_1(t_m) n_1 + r_2(t_m) n_2 \right].   (2.12)

When i = j, r_1(t) n_1 + r_2(t) n_2 = 0, so that

\bar{H}_{ii} = 0.   (2.13)
2.2. Solving the system of boundary integral equations
If we define the matrices

H = [H_{ij}]_{n \times n}, \quad G = [G_{ij}]_{n \times n}

and the vectors

u = \{u_j\}_{n \times 1}, \quad q = \{q_j\}_{n \times 1},

the equation (2.9) can be written as

Hu = Gq.   (2.14)

Assume that the input data is given as

EP = \{x_j, y_j\}_{n \times 2},
f = \{f_j\}_{n \times 1},
T = \{T[j]\}_{n \times 1} \quad (T[j] = 0 \text{ or } 1),   (2.15)

in which EP is the set of extreme points of the boundary segments, and f is the set of
boundary conditions at the node points. T indicates the type of boundary condition
at each node: T[j] = 0 means that the value of the potential is known at node j,
that is, u_j = f_j while the flux q_j is unknown; T[j] = 1 means that q_j = f_j with the
potential u_j unknown.
To find the unknown values of u and q by substituting the boundary conditions into
(2.14), one has to rearrange the system by moving columns of H and G from one side
to the other. If all the unknowns are passed to the left hand side, then the system
(2.14) is translated into

Ax = y, \quad A = [a_{ij}]_{n \times n}, \quad x = \{x_j\}_{n \times 1}, \quad y = \{y_j\}_{n \times 1},   (2.16)
where x is the vector of unknowns for the u's and q's, and y is found by multiplying the
corresponding columns of the translated matrix of G by the known values of the u's and q's.
Based on the statement given above, we introduce an algorithm to translate the
system (2.14) into (2.16) as follows:

DO j = 1, 2, ..., n
  DO i = 1, 2, ..., n
    a_{ij} = (T[j] - 1) G_{ij} + T[j] H_{ij}
    B_{ij} = T[j] G_{ij} + (T[j] - 1) H_{ij}   (2.17)
  CONTINUE
CONTINUE
DO i = 1, 2, ..., n
  y_i = \sum_{k=1}^{n} B_{ik} f_k   (2.18)
CONTINUE

Once the vector of unknowns x = \{x_j\}_{n \times 1} is obtained by solving the equation
(2.16), the values of the potential and the flux at each node are given by

u_j = T[j]\, x_j + (1 - T[j])\, f_j,
q_j = (1 - T[j])\, x_j + T[j]\, f_j, \quad j = 1, 2, \ldots, n.   (2.19)
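The loops (2.17)-(2.18) and the recovery step (2.19) translate directly into code. The sketch below is our vectorized illustration (array names are ours); the demo at the end fabricates data consistent with Hu = Gq and checks that the translated system Ax = y holds.

```python
import numpy as np

def translate_system(H, G, f, T):
    """Build A x = y of (2.16) from H u = G q of (2.14), following (2.17)-(2.18).
    T[j] = 0 means u_j is known (q_j unknown); T[j] = 1 means q_j is known."""
    T = np.asarray(T, dtype=float)
    # Broadcasting over the last axis applies T[j] column-wise, as in (2.17):
    A = (T - 1.0) * G + T * H      # a_ij = (T[j]-1) G_ij + T[j] H_ij
    B = T * G + (T - 1.0) * H      # B_ij = T[j] G_ij + (T[j]-1) H_ij
    y = B @ f                      # (2.18)
    return A, y

def recover_uq(x, f, T):
    # (2.19): split the solved unknowns back into potential and flux values.
    T = np.asarray(T, dtype=float)
    u = T * x + (1.0 - T) * f
    q = (1.0 - T) * x + T * f
    return u, q

# Demo: fabricate consistent data H u = G q, then check A x_true = y.
rng = np.random.default_rng(0)
n = 4
H = rng.standard_normal((n, n))
R = rng.standard_normal((n, n))
u = rng.standard_normal(n)
q = rng.standard_normal(n)
G = R + np.outer(H @ u - R @ q, q) / (q @ q)   # rank-1 fix forces H u = G q
T = np.array([0, 1, 0, 1])                     # alternate known u / known q
f = np.where(T == 0, u, q)                     # known boundary data
x_true = np.where(T == 0, q, u)                # the corresponding unknowns
A, y = translate_system(H, G, f, T)
```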
2.3. Evaluation at the internal points
After the unknown coefficients \{u_j\} and \{q_j\} are obtained, the value of the potential
at an interior point can be evaluated as

u(P) = \sum_{j=1}^{n} G_j(P)\, q_j - \sum_{j=1}^{n} \bar{H}_j(P)\, u_j, \quad P = (x, y) \in \Omega.   (2.20)

In this formula G_j(P) and \bar{H}_j(P) are the same as G_{ij} and \bar{H}_{ij} given in (2.10) and (2.12),
respectively, if we replace r_i(t) (i = 1, 2) in (2.6) by

r_1(t) = \phi_j(t) - x \quad \text{and} \quad r_2(t) = \psi_j(t) - y.   (2.21)
In order to evaluate the derivatives of the potential at the internal points, we consider
the following equations resulting from (1.2):

\frac{\partial u}{\partial x}(P) = \int_\Gamma \frac{\partial}{\partial x} u^*(P,Q)\, q(Q)\, d\Gamma(Q) - \int_\Gamma \frac{\partial}{\partial x} q^*(P,Q)\, u(Q)\, d\Gamma(Q),

\frac{\partial u}{\partial y}(P) = \int_\Gamma \frac{\partial}{\partial y} u^*(P,Q)\, q(Q)\, d\Gamma(Q) - \int_\Gamma \frac{\partial}{\partial y} q^*(P,Q)\, u(Q)\, d\Gamma(Q),   (2.22)
for P = (x, y) \in \Omega. In these formulae the kernels are

\frac{\partial}{\partial x} u^*(P,Q) = \frac{1}{2\pi} \frac{r_1}{r^2}, \quad \frac{\partial}{\partial y} u^*(P,Q) = \frac{1}{2\pi} \frac{r_2}{r^2},

\frac{\partial}{\partial x} q^*(P,Q) = -\frac{1}{2\pi} \frac{1}{r^2} \left[ \frac{2}{r^2}\, r_1 (r_1 n_1 + r_2 n_2) - n_1 \right],

\frac{\partial}{\partial y} q^*(P,Q) = -\frac{1}{2\pi} \frac{1}{r^2} \left[ \frac{2}{r^2}\, r_2 (r_1 n_1 + r_2 n_2) - n_2 \right].   (2.23)
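The kernels (2.23) can be spot-checked against finite differences of the fundamental solutions with respect to the field point P (an illustrative aside; the point values and the normal below are our test data):

```python
import math

def u_star(P, Q):
    # (1.3): u*(P,Q) = (1/(2*pi)) log(1/r)
    r = math.hypot(Q[0] - P[0], Q[1] - P[1])
    return math.log(1.0 / r) / (2.0 * math.pi)

def q_star(P, Q, n):
    # (1.4): q*(P,Q) = -(1/(2*pi)) (r1 n1 + r2 n2) / r^2
    r1, r2 = Q[0] - P[0], Q[1] - P[1]
    rsq = r1 * r1 + r2 * r2
    return -(r1 * n[0] + r2 * n[1]) / (2.0 * math.pi * rsq)

def du_star_dx(P, Q):
    # First kernel of (2.23): d u*/dx = (1/(2*pi)) r1 / r^2
    r1, r2 = Q[0] - P[0], Q[1] - P[1]
    return r1 / (2.0 * math.pi * (r1 * r1 + r2 * r2))

def dq_star_dx(P, Q, n):
    # Third kernel of (2.23): -(1/(2*pi))(1/r^2)[(2/r^2) r1 (r.n) - n1]
    r1, r2 = Q[0] - P[0], Q[1] - P[1]
    rsq = r1 * r1 + r2 * r2
    rn = r1 * n[0] + r2 * n[1]
    return -((2.0 / rsq) * r1 * rn - n[0]) / (2.0 * math.pi * rsq)

P, Q, n, h = (0.1, -0.2), (0.8, 0.5), (0.0, 1.0), 1e-6
fd_u = (u_star((P[0] + h, P[1]), Q) - u_star((P[0] - h, P[1]), Q)) / (2.0 * h)
fd_q = (q_star((P[0] + h, P[1]), Q, n) - q_star((P[0] - h, P[1]), Q, n)) / (2.0 * h)
```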
Discretization of the equation (2.22) gives

\frac{\partial u}{\partial x}(P) = \sum_{j=1}^{n} \left[ G_x^j(P)\, q_j - \bar{H}_x^j(P)\, u_j \right], \quad \frac{\partial u}{\partial y}(P) = \sum_{j=1}^{n} \left[ G_y^j(P)\, q_j - \bar{H}_y^j(P)\, u_j \right],   (2.24)

where

G_x^j(P) = \int_{\Gamma_j} \frac{\partial u^*}{\partial x}\, d\Gamma_j, \quad G_y^j(P) = \int_{\Gamma_j} \frac{\partial u^*}{\partial y}\, d\Gamma_j,

\bar{H}_x^j(P) = \int_{\Gamma_j} \frac{\partial q^*}{\partial x}\, d\Gamma_j, \quad \bar{H}_y^j(P) = \int_{\Gamma_j} \frac{\partial q^*}{\partial y}\, d\Gamma_j.   (2.25)

Using the formulae (2.23)-(2.25) one can obtain the derivatives of the potential at
every internal point.
3. Boundary Elements near the Singular Points.
Assume that \Gamma is the boundary of a bounded region and that the behavior of the solutions
for the potential and the flux at an internal point P near P^* is like

u(P) = O(|P^* - P|^\alpha), \quad q(P) = O(|P^* - P|^{\alpha - 1}),   (3.1)

with 0 < \alpha < 1. That is, P^* is a singular point for the flux. For some integer k, one may
take the sequential boundary segments \Gamma_k and \Gamma_{k+1} whose common extreme point
is P^*. In this case, instead of the constant elements, we present particular boundary
elements on these segments \Gamma_k and \Gamma_{k+1} such as

u_k(t) = u_k (1 - t)^\alpha, \quad q_k(t) = g_k (1 - t)^{\alpha - 1} \quad \text{on } \Gamma_k,
u_{k+1}(t) = u_{k+1} (1 + t)^\alpha, \quad q_{k+1}(t) = g_{k+1} (1 + t)^{\alpha - 1} \quad \text{on } \Gamma_{k+1},   (3.2)

for -1 \le t \le 1. It should be noted that u_k(t), u_{k+1}(t), q_k(t) and q_{k+1}(t) satisfy the
conditions in (3.1) near the singular point P^*.
Then, for the singular boundary segments \Gamma_k and \Gamma_{k+1}, the integrals in (2.10) and
(2.12) are replaced by

G_{ik} \approx -\frac{L_k}{4\pi} \sum_{m=1}^{M} \omega_m (1 - t_m)^{\alpha - 1} \log r(t_m),
\bar{H}_{ik} \approx -\frac{L_k}{4\pi} \sum_{m=1}^{M} (1 - t_m)^{\alpha}\, \frac{\omega_m}{r(t_m)^2} \left[ r_1(t_m) n_1 + r_2(t_m) n_2 \right],   (3.3)

and

G_{i(k+1)} \approx -\frac{L_{k+1}}{4\pi} \sum_{m=1}^{M} \omega_m (1 + t_m)^{\alpha - 1} \log r(t_m),
\bar{H}_{i(k+1)} \approx -\frac{L_{k+1}}{4\pi} \sum_{m=1}^{M} (1 + t_m)^{\alpha}\, \frac{\omega_m}{r(t_m)^2} \left[ r_1(t_m) n_1 + r_2(t_m) n_2 \right].   (3.4)
When \Gamma is an open arc, say a crack, and thus the singularities occur at the two
crack tips, one may take \Gamma_1 and \Gamma_n so that they contain the singular points on the left
and right hand sides, respectively. In this case the formulae (3.3) and (3.4)
hold with k and k + 1 replaced by n and 1, respectively.
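The modified sums (3.3)-(3.4) simply carry the singular factor (1 \mp t)^{\alpha-1} into the Gauss sum, i.e. a plain Gauss-Legendre rule is applied to a singular integrand. This converges, though slowly in M; the sketch below (our aside, with g \equiv 1 and \alpha = 1/2) illustrates this on the model integral \int_{-1}^{1} (1-t)^{-1/2} dt = 2\sqrt{2}:

```python
import numpy as np

# Exact value of the model singular integral with alpha = 1/2:
exact = 2.0 * np.sqrt(2.0)

def gauss_singular(M, alpha=0.5):
    # Plain Gauss-Legendre applied to the singular factor, as in (3.3)
    t, w = np.polynomial.legendre.leggauss(M)
    return np.sum(w * (1.0 - t) ** (alpha - 1.0))

err4 = abs(gauss_singular(4) - exact)    # the paper's M = 4
err64 = abs(gauss_singular(64) - exact)  # a much finer rule
```

The error decreases as M grows, but only algebraically because of the endpoint singularity; the point of the construction in Section 3 is that the singular behavior is carried by the element shape functions rather than left to the quadrature of the full solution.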
4. Applications of the Singular BEM.
We introduce two typical examples, the Motz problem and a crack problem, in sequence,
to show the efficiency of the singular BEM introduced in this article. The
number of nodes of the Gauss quadrature rule in the formulae (2.10) and (2.12) is
taken as M = 4.
4.1. The Motz problem.
As a typical singularity problem we consider the Motz problem, as shown in Figure 1,
in the rectangular domain \Omega = \{(x, y) \mid -1 < x < 1,\; 0 < y < 1\} with the boundary
conditions

u|_{y=0,\,x<0} = 0, \quad u|_{x=1} = 500, \quad q|_{y=1} = q|_{y=0,\,x>0} = q|_{x=-1} = 0.   (4.1)

Figure 1 is located here.
It is known that the solution of (4.1) has a singularity at the origin. In fact, the
exact solution can be expressed in a series as [3]

u(r, \theta) = \sum_{j=0}^{\infty} b_j\, r^{j + \frac{1}{2}} \cos\left( j + \frac{1}{2} \right)\theta,   (4.2)

where (r, \theta) are polar coordinates.
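Each term of (4.2) is a harmonic function r^{j+1/2} \cos((j + 1/2)\theta); the coefficients b_j are problem-dependent and not reproduced here. A small numerical check of the harmonicity of a single term (illustrative only):

```python
import math

def motz_term(x, y, j):
    # Single term r^(j + 1/2) * cos((j + 1/2) * theta) of the expansion (4.2);
    # the true coefficients b_j of the Motz problem are not reproduced here.
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    a = j + 0.5
    return r ** a * math.cos(a * theta)

# Each term is harmonic away from the origin: check with a 5-point Laplacian.
x, y, h = 0.4, 0.3, 1e-4
lap = (motz_term(x + h, y, 1) + motz_term(x - h, y, 1)
       + motz_term(x, y + h, 1) + motz_term(x, y - h, 1)
       - 4.0 * motz_term(x, y, 1)) / h**2
```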
Applying the singular BEM to this problem, we obtain a good approximation to the
exact solution near the singular point. Referring to the fact that the derivatives of the
exact solution have an O(r^{-1/2}) singularity in the vicinity of the origin, one takes the
singular boundary elements in (3.2) as

u_k(t) = u_k (1 - t)^{1/2}, \quad q_k(t) = g_k (1 - t)^{-1/2},
u_{k+1}(t) = u_{k+1} (1 + t)^{1/2}, \quad q_{k+1}(t) = g_{k+1} (1 + t)^{-1/2}.   (4.3)

Figure 2 shows the behavior of the traditional BEM solution u_n^T and the singular
BEM solution u_n^S near the singular point, respectively. The subscript n indicates
the number of boundary elements. A comparison of the relative errors of these two
results is given in Figure 3. We have taken u as the exact solution evaluated
from the truncated series (4.2) with a sufficiently large number of terms.

Figure 2 is located here.
Figure 3 is located here.
In Table 1 the relative errors of the traditional and the singular BEM solutions with
n = 30 for the potential and its normal derivatives are given. The selected points are

P_1 = \left( -\frac{1}{2}, \frac{1}{4} \right), \quad P_2 = \left( \frac{1}{2}, \frac{1}{2} \right), \quad P_3 = \left( \frac{1}{2}, \frac{3}{4} \right).

The relative errors are defined as

E^T = \left| \frac{u - u_n^T}{u} \right|, \quad E_1^T = \left| \frac{\partial}{\partial x}\left( u - u_n^T \right) \Big/ \frac{\partial u}{\partial x} \right|, \quad E_2^T = \left| \frac{\partial}{\partial y}\left( u - u_n^T \right) \Big/ \frac{\partial u}{\partial y} \right|,

and E^S, E_1^S, E_2^S are defined similarly. Table 1 shows that the singular BEM solution
is satisfactory at interior points on the whole, as well as near the singular points.

Table 1 is located here.
4.2. The crack problem.
In this case we consider the Dirichlet problem on a crack \Gamma such as

\Delta u(P) = 0, \quad P \in \mathbb{R}^2 \setminus \Gamma,
u(P) = f(P), \quad P \in \Gamma,
\sup_{P \in \mathbb{R}^2} |u(P)| < \infty.   (4.4)
Recently, several numerical methods for this type of problem have been studied via the
indirect BEM [7,8,9]. Even though these works give complete approximation schemes
and convergence analyses, numerical results for the singular fields near the crack tips
are not presented.
Above all it should be noted that, for crack problems as given above, the double
layer potential \int_\Gamma q^*(P,Q)\, u(Q)\, d\Gamma(Q) in (1.2) is canceled by the opposite signs of the
fundamental solution q^* on the upper and lower crack faces. Thus the integral equation
(1.2) should be replaced by

u(P) = \int_\Gamma u^*(P,Q)\, q(Q)\, d\Gamma(Q) + \gamma,   (4.5)

where \gamma is an unknown constant. As mentioned in the literature [8], the addition of
the unknown constant \gamma is due to the constraint

\int_\Gamma q(Q)\, d\Gamma(Q) = 0,   (4.6)

which results from the boundedness condition in the problem (4.4). In fact, the constant
\gamma is the limit value of the potential u(P) as |P| \to \infty.
It is known that the flux also has an O(r^{-1/2}) singularity near the crack tips. Taking
the first boundary segment \Gamma_1 and the last one \Gamma_n so that they contain the left and right
hand side crack tips, respectively, we take the singular boundary elements

q_1(t) = q_1 (1 + t)^{-1/2} \quad \text{on } \Gamma_1, \quad q_n(t) = q_n (1 - t)^{-1/2} \quad \text{on } \Gamma_n.   (4.7)

Then the discretization scheme given in Section 2 and the condition (4.6) imply that

\sum_{j=1}^{n} \frac{L_j}{2} \int_{-1}^{1} q_j(t)\, dt = \sqrt{2}\, (L_1 q_1 + L_n q_n) + \sum_{j=2}^{n-1} L_j q_j = 0.   (4.8)
If one chooses the boundary segments so that L_1 = L_2 = \cdots = L_n, equations (4.5)
and (4.8) result in the system

\begin{pmatrix}
G_{11} & G_{12} & \cdots & G_{1n} & 1 \\
G_{21} & G_{22} & \cdots & G_{2n} & 1 \\
\vdots & \vdots &        & \vdots & \vdots \\
G_{n1} & G_{n2} & \cdots & G_{nn} & 1 \\
\sqrt{2} & 1 & \cdots & \sqrt{2} & 0
\end{pmatrix}
\begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \\ \gamma \end{pmatrix}
=
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \\ 0 \end{pmatrix},   (4.9)

where the last row is (\sqrt{2}, 1, \ldots, 1, \sqrt{2}, 0).
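The bordered system (4.9) is easy to assemble; the sketch below is our illustration with a random well-conditioned matrix standing in for the actual influence coefficients of (2.10)/(3.3), so the numbers are not those of the paper. Solving it yields flux coefficients that satisfy the discrete constraint (4.8) exactly.

```python
import numpy as np

def crack_system(G, u):
    """Assemble the bordered system (4.9) for equal segment lengths:
    unknowns are the flux coefficients q_1..q_n and the constant gamma."""
    n = G.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = G                           # single-layer influence coefficients
    A[:n, n] = 1.0                          # column multiplying gamma in (4.5)
    A[n, :n] = 1.0                          # constraint row from (4.8) ...
    A[n, 0] = A[n, n - 1] = np.sqrt(2.0)    # ... with sqrt(2) at the crack tips
    rhs = np.concatenate([u, [0.0]])
    return A, rhs

# Illustrative data only: a random nonsingular G stands in for the quadrature
# values; u holds the Dirichlet data at the nodes.
rng = np.random.default_rng(1)
n = 6
G = rng.standard_normal((n, n)) + n * np.eye(n)   # keep it well conditioned
u = rng.standard_normal(n)
A, rhs = crack_system(G, u)
sol = np.linalg.solve(A, rhs)
q, gamma = sol[:n], sol[n]
```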
As an example we take \Gamma as the line segment on the x-axis, that is, \Gamma = [-1, 1],
and suppose the boundary condition is given by

f(x) = e^{-x} \cos \sqrt{1 - x^2}, \quad x \in \Gamma.   (4.10)

The exact solution of (4.4) is known to be [7]

u(x_1, x_2) = \mathrm{Re}\left[ e^{\sqrt{z^2 - 1} - z} \right], \quad z = x_1 + i x_2.   (4.11)
Figure 4 shows the behavior of the indirect BEM solution u_n^I of ref. [8] and
the singular BEM solution u_n^S near the right crack tip, respectively. Figure 5 compares
the relative errors of these solutions with respect to the exact solution (4.11).
Figures 4 and 5 show that the present method is very effective near the singularities.

Figure 4 is located here.

Figure 5 is located here.
References
1. R.W. Thatcher, The use of infinite grid refinement at singularities in the solution of Laplace's equation, Numer. Math. 25 (1976), 163-178.
2. N. Papamichael, Numerical conformal mapping onto a rectangle with applications to the solution of Laplacian problems, J. Comput. Appl. Math. 28 (1990), 63-83.
3. Z.C. Li, Numerical Methods for Elliptic Problems with Singularities, World Scientific, Singapore, 1990.
4. A. Portela, M.H. Aliabadi and D.P. Rooke, The dual boundary element method: Effective implementation for crack problems, International J. Numer. Meth. Engng. 33 (1992), 1269-1287.
5. C.A. Brebbia, J.C.F. Telles and L.C. Wrobel, Boundary Element Techniques, Springer-Verlag, New York, 1984.
6. P.K. Banerjee, Boundary Element Methods in Engineering, McGraw-Hill, London, 1994.
7. K. Atkinson and I.H. Sloan, The numerical solution of first-kind logarithmic-kernel integral equations on smooth open arcs, Math. Comp. 56(193) (1991), 119-139.
8. B.I. Yun, S. Lee and U.J. Choi, A modified boundary integral method on open arcs in the plane, Computers and Math. Applic. 31(11) (1996), 37-43.
9. B.I. Yun and S. Lee, Double layer potential scheme for Dirichlet problems on smooth open arcs, Computers and Math. Applic. 37(7) (1999), 31-40.
Department of Informatics and Statistics
Kunsan National University
573-701 Kunsan, Korea
AN ALGORITHM FOR SYMMETRIC INDEFINITE
SYSTEMS OF LINEAR EQUATIONS
SUCHEOL YI
J. KSIAM Vol.3, No.2, 29-36, 1999
Abstract
It is shown that a new Krylov subspace method for solving symmetric indefinite
systems of linear equations can be obtained. We call the method the projection
method in this paper. The residual vector of the projection method is maintained
at each iteration, which may be useful in some applications.
1. Introduction. The kth Krylov subspace K_k(r_0, A) generated by an initial residual
vector r_0 = b - Ax_0 and A is defined by

K_k(r_0, A) \equiv \mathrm{span}\{r_0, A r_0, \ldots, A^{k-1} r_0\}.   (1)

Iterative methods that choose corrections from the space K_k(r_0, A) at each iteration
are called Krylov subspace methods. The GMRES method [7] is a Krylov subspace
method for solving systems of linear equations

Ax = b, \quad \text{where } A \in \mathbb{R}^{n \times n} \text{ is nonsingular}.   (2)

The kth iterate of GMRES can be characterized as x_k = x_0 + z_k for a given initial
guess x_0 \in \mathbb{R}^n, where the correction z_k is chosen to minimize the norm of the residual
vector r(z) = r_0 - Az over the kth Krylov subspace K_k(r_0, A) at each iteration, i.e.,

\|r_0 - A z_k\|_2 = \min_{z \in K_k(r_0, A)} \|r_0 - A z\|_2.   (3)
If the Arnoldi process is applied with v_1 = A r_0 / \|A r_0\|_2 to generate a basis for the
Krylov subspace K_k(r_0, A), the simpler GMRES implementations of Walker and Zhou [8]
are obtained. The Arnoldi process is summarized as follows:

Algorithm 1.1 Arnoldi process
Initialize: Choose an initial guess v_1 with \|v_1\|_2 = 1.
Iterate: For k = 1, 2, \ldots, do:
  h_{i,k} = v_i^T A v_k, \quad i = 1, 2, \ldots, k;
  \tilde{v}_{k+1} = A v_k - \sum_{i=1}^{k} h_{i,k} v_i.
  Set h_{k+1,k} = \|\tilde{v}_{k+1}\|_2.
  If h_{k+1,k} = 0, stop; otherwise,
  v_{k+1} = \tilde{v}_{k+1} / h_{k+1,k}.

Key words: GMRES, MINRES, SYMMLQ, symmetric QMR, Krylov subspace method.
AMS subject classification: 65F10
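Algorithm 1.1 can be sketched in a few lines (our illustration; the breakdown test returns the basis built so far). The demo checks the two defining properties: the columns of V are orthonormal, and A V_k = V_{k+1} \bar{H}_k holds for the upper Hessenberg matrix of coefficients.

```python
import numpy as np

def arnoldi(A, v1, steps):
    """Arnoldi process of Algorithm 1.1: returns V = [v_1 ... v_{k+1}] with
    orthonormal columns and the (k+1) x k upper Hessenberg matrix H."""
    n = A.shape[0]
    V = np.zeros((n, steps + 1))
    H = np.zeros((steps + 1, steps))
    V[:, 0] = v1 / np.linalg.norm(v1)
    for k in range(steps):
        w = A @ V[:, k]
        for i in range(k + 1):
            H[i, k] = V[:, i] @ w       # h_{i,k} (modified Gram-Schmidt variant)
            w = w - H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] == 0.0:
            return V[:, :k + 1], H[:k + 1, :k]   # breakdown: invariant subspace
        V[:, k + 1] = w / H[k + 1, k]
    return V, H

# Demo on a random matrix (illustrative data only).
rng = np.random.default_rng(2)
A = rng.standard_normal((8, 8))
v1 = np.eye(8)[0]
V, H = arnoldi(A, v1, 5)
```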
30 SuCheol Yi
Without loss of generality we may assume the initial residual vector is nonzero. The
initial Arnoldi vector v_1 = A r_0 / \|A r_0\|_2 is then well-defined, since A is a nonsingular
matrix. Setting \beta_{1,1} = \|A r_0\|_2 gives the equation

A r_0 = \beta_{1,1} v_1,   (4)

and the following equation is satisfied by the Arnoldi process:

A v_{k-1} = \sum_{i=1}^{k} \beta_{i,k} v_i \quad \text{for unique } \beta_{i,k}\text{'s with } \beta_{k,k} > 0, \text{ for } k > 1.   (5)

From the equations (4) and (5) we have the relation

A U_k = V_k R_k,   (6)

where U_k = (r_0, v_1, \ldots, v_{k-1}), V_k = (v_1, \ldots, v_k), and

R_k = \begin{pmatrix} \beta_{1,1} & \cdots & \beta_{1,k} \\ & \ddots & \vdots \\ & & \beta_{k,k} \end{pmatrix}.

Then the relation (6) reduces the least-squares problem (3) directly to an upper triangular
least-squares problem by decomposing the initial residual vector r_0 as r_0 = \Pi_k^\perp r_0 + V_k V_k^T r_0
for each k, where \Pi_k^\perp is the orthogonal projection onto the orthogonal
complement of the space K_k(v_1, A).
We introduce another approach to Krylov subspace methods for solving symmetric
indefinite linear systems, called the projection method in this paper. The projection
method is closely related to the simpler GMRES method in that the projection
and simpler GMRES methods use the same initial basis vector v_1 = A r_0 / \|A r_0\|_2 in
applying the symmetric Lanczos and Arnoldi processes, respectively; in the symmetric
case, the projection method can be derived from the simpler GMRES method
by finding a search direction p_k such that A p_k = v_k for each k. Both simpler GMRES
and the projection method maintain orthonormal bases of the space A K_k(r_0, A), which
permit residual minimization through projection of the residual onto A K_k(r_0, A)^\perp.
With simpler GMRES, the kth approximate solution is obtained by solving a k \times k
upper triangular system. This is also done with the projection method, but only implicitly.
Because the projection method is based on the short-recurrence symmetric
Lanczos process, the triangular system is tridiagonal and, therefore, one can update
the approximate solution using a three-term short recurrence formula. In contrast to
simpler GMRES, the usual GMRES implementation maintains an orthonormal basis
of K_k(r_0, A) through the Arnoldi process and, consequently, achieves residual minimization
through the solution of an upper Hessenberg least-squares problem. MINRES
[6] can be viewed as a specialization of the usual GMRES approach to the symmetric
case, in which the short-recurrence symmetric Lanczos process is used to generate an
orthonormal basis of K_k(r_0, A). The upper Hessenberg system is tridiagonal, and so
the solution of the upper Hessenberg least-squares problem is done implicitly in MINRES
by implementing a three-term short recurrence formula for updating the approximate
solution. In the symmetric indefinite case without preconditioning, symmetric QMR
[2] is obtained using the same approach as MINRES. However, in solving the
preconditioned system

A' x' = b', \quad \text{where } A' = M_1^{-1} A M_2^{-1}, \quad x' = M_2 x, \quad \text{and } b' = M_1^{-1} b,   (7)

symmetric QMR is implemented by solving a quasi-minimization problem. Thus the
approach of the projection method is similar to that of simpler GMRES, while standard
GMRES, MINRES, and symmetric QMR follow an alternative approach. In Section 2
we give a derivation of the projection method, and in Section 3 we present the results
of numerical experiments.
2. A derivation of the projection method. By applying the Arnoldi process
starting with v_1 = A r_0 / \|A r_0\|_2, we have a set \{v_1, \ldots, v_k\} of orthonormal basis
vectors of the space K_k(v_1, A). Suppose we have a vector p_k such that A p_k = v_k for
each k. Then the kth residual vector r_k in the simpler GMRES method is

r_k = r_{k-1} - (r_{k-1}^T v_k) v_k = r_0 - A z_{k-1} - (r_{k-1}^T v_k) A p_k = r_0 - A \left[ z_{k-1} + (r_{k-1}^T v_k) p_k \right].   (8)

By the last expression in equation (8) it is natural to define the kth iterate x_k of the
projection method as x_k = x_{k-1} + (r_{k-1}^T v_k) p_k. Setting P_k = (p_1, \ldots, p_k) and V_k =
(v_1, \ldots, v_k), we need A P_k = V_k by the requirement A p_k = v_k for each k. By
the relation A U_k = V_k R_k in (6), the equation A P_k = V_k is equivalent to

U_k = P_k R_k.   (9)

The search direction p_k is then defined as

p_k = \begin{cases} r_0 / \beta_{1,1} & \text{if } k = 1, \\ \dfrac{1}{\beta_{k,k}} \left( v_{k-1} - \beta_{1,k} p_1 - \cdots - \beta_{k-1,k} p_{k-1} \right) & \text{if } k > 1. \end{cases}

In general this is a long recursion formula for generating p_k.
If A is symmetric, then an orthonormal basis \{v_1, \ldots, v_k\} of the space K_k(v_1, A)
can be generated by the symmetric Lanczos process. Then the upper triangular matrix
R_k in (6) can be reduced to the banded form

R_k = \begin{pmatrix}
\beta_{1,1} & \beta_{1,2} & \beta_{1,3} & 0 & \cdots & 0 \\
0 & \beta_{2,2} & \beta_{2,3} & \beta_{2,4} & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & \ddots & \ddots & \beta_{k-2,k} \\
\vdots & & & \ddots & \ddots & \beta_{k-1,k} \\
0 & \cdots & \cdots & \cdots & 0 & \beta_{k,k}
\end{pmatrix}.
Therefore, by (9) we have a short recursion formula for p_k, i.e.,

p_k = \frac{1}{\beta_{k,k}} \left( v_{k-1} - \beta_{k-1,k} p_{k-1} - \beta_{k-2,k} p_{k-2} \right) \quad \text{for } k > 1,

where \beta_{k-2,k} = v_{k-2}^T A v_{k-1}, \beta_{k-1,k} = v_{k-1}^T A v_{k-1}, \beta_{k,k} = \|\tilde{v}_k\|_2, and

\tilde{v}_k = A v_{k-1} - \beta_{k-1,k} v_{k-1} - \beta_{k-2,k} v_{k-2}.
Note that one might wish to apply the projection method to nonsymmetric
linear systems using the nonsymmetric Lanczos process to get a short recursion formula
for the search direction p_k. However, we found that the projection method with the
nonsymmetric Lanczos process is very unstable for solving nonsymmetric linear systems.
Therefore, we consider only symmetric indefinite systems in this paper. It is known
that there exists a symmetric positive definite matrix S such that M = S^2 for a
given symmetric positive definite matrix M. Therefore, the MINRES, SYMMLQ, and
projection methods can be applied to the system

\tilde{A} \tilde{x} = \tilde{b}, \quad \text{where } \tilde{A} = S^{-1} A S^{-1}, \quad \tilde{x} = S x, \quad \text{and } \tilde{b} = S^{-1} b.   (10)

With a symmetric positive definite preconditioner M, the projection method for a symmetric
matrix A can be summarized as follows:

Algorithm 2.1 Projection method (symmetric A)
Initialize: Choose x_0 and set r_0 = b - A x_0,
  z = M^{-1} r_0, \quad u_1 = A z, \quad w_1 = M^{-1} u_1, \quad \beta_1 = \sqrt{u_1^T w_1}.
  Update u_1 \leftarrow u_1 / \beta_1 and w_1 \leftarrow w_1 / \beta_1.
  Compute \alpha_1 = r_0^T w_1.
  Set r_1 = r_0 - \alpha_1 u_1, \quad p_1 = z / \beta_1, \quad x_1 = x_0 + \alpha_1 p_1.
Iterate: For k = 2, 3, \ldots, do:
  Set u_k = A w_{k-1}.
  For i = \max\{k-2, 1\}, \ldots, k-1, do:
    Set \bar{\beta}_i = u_k^T w_i.
    Update u_k \leftarrow u_k - \bar{\beta}_i u_i.
  Set w_k = M^{-1} u_k and \beta_k = \sqrt{u_k^T w_k}.
  Update u_k \leftarrow u_k / \beta_k and w_k \leftarrow w_k / \beta_k.
  Compute \alpha_k = r_{k-1}^T w_k and set r_k = r_{k-1} - \alpha_k u_k.
  Set p_k = \frac{1}{\beta_k} \left( w_{k-1} - \sum_{i=\max\{k-2,1\}}^{k-1} \bar{\beta}_i p_i \right) and set x_k = x_{k-1} + \alpha_k p_k.
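Algorithm 2.1 can be sketched directly from the recurrences. The version below is our illustration, not the authors' code: the preconditioner solve is a user-supplied callback (identity by default, i.e. M = I), and the demo uses a small diagonal symmetric indefinite system, for which the method should reach the exact solution in at most n steps in exact arithmetic.

```python
import numpy as np

def projection_method(A, b, x0, iters, Msolve=lambda r: r):
    """Sketch of Algorithm 2.1; Msolve(r) computes M^{-1} r.
    Returns the iterate x_k and the true residual norm history."""
    x = x0.astype(float).copy()
    r = b - A @ x
    z = Msolve(r)
    u = A @ z
    w = Msolve(u)
    beta = np.sqrt(u @ w)
    u, w = u / beta, w / beta
    alpha = r @ w
    r = r - alpha * u
    p = z / beta
    x = x + alpha * p
    us, ws, ps = [u], [w], [p]
    hist = [np.linalg.norm(b - A @ x)]
    for k in range(2, iters + 1):
        w_prev = ws[-1]
        u = A @ w_prev
        coeffs = []
        for i in range(max(k - 2, 1), k):   # 1-based indices of Algorithm 2.1
            bbar = u @ ws[i - 1]            # beta-bar_i = u_k^T w_i
            u = u - bbar * us[i - 1]
            coeffs.append((i, bbar))
        w = Msolve(u)
        beta = np.sqrt(u @ w)
        u, w = u / beta, w / beta
        alpha = r @ w
        r = r - alpha * u
        p = (w_prev - sum(b_ * ps[i - 1] for i, b_ in coeffs)) / beta
        x = x + alpha * p
        us.append(u); ws.append(w); ps.append(p)
        hist.append(np.linalg.norm(b - A @ x))
    return x, hist

# Small symmetric indefinite demo with M = I (illustrative data only).
A = np.diag([3.0, -2.0, 1.0, -1.0, 4.0])
b = np.ones(5)
x, hist = projection_method(A, b, np.zeros(5), 5)
```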
3. Numerical Experiments. We present numerical experiments that show the
performance of the Krylov subspace methods for symmetric indefinite systems discussed
in the previous sections. In our experiments we also include the SYMMLQ method
[6] for solving symmetric indefinite linear systems. Basically, the kth iterate x_k of
SYMMLQ is obtained by orthogonalizing the residual vector r(z) = r_0 - Az against
K_k(r_0, A), whereas that of MINRES is obtained by minimizing the residual vector
over the space K_k(r_0, A) for each k. For a symmetric positive definite preconditioner
M, it can be shown that algorithms for SYMMLQ, MINRES, symmetric QMR,
and the projection method can be implemented with only one matrix-vector multiplication
with A and one preconditioner solve with M at each iteration, provided M_1^T = M_2
in implementing symmetric QMR. However, in implementing a preconditioner
solve of the form Mw = r, by first factorizing the preconditioner M, i.e., M = M_1 M_2,
we may save floating-point operations by performing the two preconditioning solves
M_1 u = r and M_2 w = u instead of a single preconditioner solve with M. Besides the
matrix-vector multiplication with A and the two M_1 and M_2 preconditioning
solves (or one preconditioner solve with M), the algorithms for symmetric QMR,
MINRES, SYMMLQ, and the projection method use approximately 7n, 10n, 11n, and
12n multiplications and divisions per iteration, respectively.
We use a discretization of

\Delta u + c u = f \quad \text{in } D, \qquad u = 0 \quad \text{on } \partial D,

as a test problem involving a symmetric linear system, where D = [0,1] \times [0,1] and
c is a constant. The usual centered difference approximations were used in the discretization.
We set f \equiv x(1-x) + y(1-y) and used m = 64, where m is the number
of equally spaced interior points on each side of D, so that the resulting system has
dimension 4096. For a preconditioner we used -M + I, which is symmetric positive
definite, where M is the discretized Laplacian matrix. In the experiments with SYMMLQ,
MINRES, symmetric QMR, and the projection method, we used the Cholesky decomposition
of the preconditioner. Also, we used the vector (1, 1, \ldots, 1)^T \in \mathbb{R}^n as the initial guess
and used double precision on Sun Microsystems workstations in all experiments. The
true residual norms \|b - A x_k\|_2 are monitored in assessing the comparative performance.
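A minimal sketch of this test matrix (our construction, using a small m so a dense eigenvalue check stays cheap; the paper uses m = 64) confirms the two properties the experiment relies on: the discretization of \Delta u + cu with c = 100 is symmetric indefinite, while the preconditioner -M + I is symmetric positive definite.

```python
import numpy as np

def helmholtz_matrix(m, c):
    """Centered-difference discretization of Delta u + c u on the unit square
    with u = 0 on the boundary and m interior points per side."""
    h = 1.0 / (m + 1)
    I = np.eye(m)
    T = -2.0 * np.eye(m) + np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)
    lap = (np.kron(I, T) + np.kron(T, I)) / h**2   # discretized Laplacian M
    return lap + c * np.eye(m * m), lap

A, lap = helmholtz_matrix(8, 100.0)                # 64 x 64 sketch of the 4096 case
eigs = np.linalg.eigvalsh(A)                       # A is symmetric
precond_eigs = np.linalg.eigvalsh(-lap + np.eye(64))
```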
In Figure 1, the true residual norm curves generated by the MINRES, SYMMLQ,
symmetric QMR, and projection methods are shown, using the values c = 100 and
m = 64. As shown in Figure 1, there were some differences in the limits of reduction
of the true residual norms. We regard these differences as insignificant, since they are
small relative to a satisfactory limit of residual norm reduction. The projection method
is as numerically sound as MINRES, SYMMLQ, and symmetric QMR in all our experiments.
In Figure 2, we plot the true residual norm reduction versus floating-point
operation counts for the MINRES, SYMMLQ, symmetric QMR, and projection methods.
We ran the algorithms for 80 iterations. Figure 2 shows that the symmetric QMR,
MINRES, and projection methods need about the same number of operations to reach
around the 10^{-10} level of residual norm reduction, although symmetric QMR needs
slightly fewer operations than the MINRES and projection methods do. Figure 2 also
shows that SYMMLQ requires approximately 10% more operations than the other
three methods for the 10^{-10} level of residual norm reduction.
Figure 1: Log10 of the true residual norms vs. the number of iterations; c = 100 with
preconditioner -M + I, where M is the discretized Laplacian matrix. Solid curve:
MINRES; dashdot curve: symmetric QMR; dotted curve: Algorithm 2.1; dashed curve:
SYMMLQ; m = 64.
Figure 2: Log10 of the true residual norms vs. the number of floating-point operations;
c = 100 with preconditioner -M + I, where M is the discretized Laplacian matrix.
Solid curve: MINRES; dashdot curve: symmetric QMR; dotted curve: Algorithm 2.1;
dashed curve: SYMMLQ; m = 64.
4. Conclusion. In this paper we have considered Krylov subspace methods for
solving large symmetric indefinite linear systems and have introduced a new approach
for solving them, called the projection method in this paper. Our numerical
experiments showed that the projection method is as numerically sound as the MINRES,
SYMMLQ, and symmetric QMR methods. Furthermore, these methods require
roughly similar effort to achieve comparable residual norm reduction, although symmetric
QMR is the most efficient and SYMMLQ the most costly, by a slight margin. However,
only the symmetric QMR method allows the use of arbitrary nonsingular symmetric
indefinite preconditioners, which is an advantage of this method over the others.
The symmetric QMR and projection methods also have an advantage over MINRES
and SYMMLQ in being easier to program.
REFERENCES
[1] R. W. Freund and N. M. Nachtigal, QMR: a quasi-minimal residual method for non-Hermitian linear systems, Numer. Math., 60 (1991), pp. 315-339.
[2] R. W. Freund and N. M. Nachtigal, A new Krylov subspace method for symmetric indefinite linear systems, ORNL/TM-12754 (1994).
[3] R. W. Freund and T. Szeto, A quasi-minimal residual squared algorithm for non-Hermitian linear systems, Tech. Rep. 91-26, Research Institute for Advanced Computer Science, NASA Ames Research Center (1991).
[4] R. W. Freund and H. Zha, Simplifications of the nonsymmetric Lanczos process and a new algorithm for Hermitian indefinite linear systems, Bell Labs, Murray Hill (1994).
[5] G. H. Golub and C. F. Van Loan, Matrix Computations, 2nd ed., The Johns Hopkins University Press, Baltimore, MD (1989).
[6] C. C. Paige and M. A. Saunders, Solution of sparse indefinite systems of linear equations, SIAM J. Numer. Anal., 12 (1975), pp. 617-629.
[7] Y. Saad and M. H. Schultz, GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7 (1986), pp. 856-869.
[8] H. F. Walker and L. Zhou, A simpler GMRES, Numer. Lin. Alg. Appl., 1 (1994), pp. 571-581.
Department of Applied Mathematics
Changwon National University
9 Sarim-dong, Changwon,
Kyongnam, 641-773, Korea.
E-mail: [email protected]
NUMERICAL SOLUTION OF A CONSTRICTED STEPPED
CHANNEL PROBLEM USING A FOURTH ORDER METHOD
Paulo F. de A. Mancera
Roland Hunt
J. KSIAM Vol.3, No.2, 51-67, 1999
Abstract
The numerical solution of the Navier-Stokes equations in a constricted stepped
channel problem has been obtained using a fourth order numerical method. Transformations
are made to obtain a fine grid near the sharp corner and a long channel
downstream. The derivatives in the Navier-Stokes equations are replaced by fourth
order central differences, which results in a 29-point computational stencil. A procedure
is used to avoid extra numerical boundary conditions near the solid walls.
Results have been obtained for Reynolds numbers up to 1000.
1 Introduction
We apply a wide fourth order numerical method for solving the Navier-Stokes equations
to a constricted stepped channel problem. The constricted stepped channel problem
considered consists of a sudden contraction (a forward-facing stepped channel, see
Figure 1) and contains a re-entrant corner. Because of the difficulties associated with
that corner, this channel problem has been much studied.
Moffat [13] has studied the Stokesian flow near a re-entrant corner and has shown
that the vorticity is singular. Bramley and Dennis [1] compare the Moffat expansion
with their numerical solution near the corner for a branching channel problem, and Hunt
[9] compares the Moffat expansion along with other techniques for a constricted stepped
channel problem. Dennis and Smith [4] solve a constricted stepped channel problem
using diagonal grids near the corner. Holstein and Paddon [7] present a method
based on using Moffat's expansion to produce finite difference stencils which take into
account the nature of the singularity at the corner. Ma and Ruth [12] compare some
of the techniques referenced above, and others, with their vorticity-circulation method.
There are many other calculations of this problem which use either the streamfunction-vorticity
or the streamfunction formulation of the Navier-Stokes equations, and use
various methods to solve the system of non-linear equations. We will cite a few.
Keywords: Navier-Stokes equations, fourth order numerical method, streamfunction formulation.
AMS Subject Classification: 65N06

Figure 1: Stepped channel.

Dennis and Smith [4] use the streamfunction-vorticity formulation of the Navier-Stokes
equations, discretising the streamfunction equation by second order central differences
and the vorticity equation by second order central differences which incorporate the
Dennis-Hudson artificial viscosity, and the resulting system of equations is solved using an
SOR iteration. Huang and Seymour [8] use the interior constraint method for solving
the streamfunction-vorticity formulation and again an SOR iteration is used to solve the
system of equations. Hunt [9] solves the streamfunction formulation using second order
central differences, and Newton's method is used to solve the resulting system of equations.
Karageorghis and Phillips [11] solve the streamfunction formulation using the
Chebyshev spectral element method, and the resulting system of equations is solved by
Newton's method. Finally, we observe that the computational domain for the constricted
stepped channel problem is an L-shaped region. This causes considerable difficulties in
applying a 29-point computational stencil near the re-entrant corner. However, the main
difficulty with the constricted stepped channel problem is that the flow at the re-entrant
corner is singular; that is, the second and higher derivatives of the streamfunction are
singular.
2 Forward-facing stepped channel
Let us consider a channel problem with walls at y = \pm 1 for x < 0, y = \pm\frac{1}{2} for x > 0,
and \frac{1}{2} \le |y| \le 1 for x = 0. Due to symmetry the problem is solved for y \ge 0 (see Figure
1). The governing equations for this channel problem are given by the Navier-Stokes
equations

\frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} = -\zeta,   (1)

\frac{\partial^2 \zeta}{\partial x^2} + \frac{\partial^2 \zeta}{\partial y^2} = Re \left( \frac{\partial \psi}{\partial y} \frac{\partial \zeta}{\partial x} - \frac{\partial \psi}{\partial x} \frac{\partial \zeta}{\partial y} \right),   (2)

where Re is the Reynolds number, and the boundary conditions are

\psi = 1, \quad \frac{\partial \psi}{\partial y} = 0 \quad \text{on } y = 1,\; x \le 0 \text{ and } y = \frac{1}{2},\; x \ge 0,
\psi = 1, \quad \frac{\partial \psi}{\partial x} = 0 \quad \text{on } x = 0,\; \frac{1}{2} \le y \le 1,
\psi = 0, \quad \frac{\partial^2 \psi}{\partial y^2} = 0 \quad \text{on } y = 0,   (3)
\psi \to \frac{3}{2} y - \frac{1}{2} y^3, \quad \zeta \to 3y \quad \text{as } x \to -\infty,
\psi \to 3y - 4y^3, \quad \zeta \to 24y \quad \text{as } x \to +\infty,

where Poiseuille flow has been assumed far upstream and far downstream.
Substituting equation (1) into equation (2) gives

\frac{\partial^4\psi}{\partial x^4}+2\frac{\partial^4\psi}{\partial x^2\partial y^2}+\frac{\partial^4\psi}{\partial y^4}=Re\left[\frac{\partial\psi}{\partial y}\left(\frac{\partial^3\psi}{\partial x^3}+\frac{\partial^3\psi}{\partial x\partial y^2}\right)-\frac{\partial\psi}{\partial x}\left(\frac{\partial^3\psi}{\partial y\partial x^2}+\frac{\partial^3\psi}{\partial y^3}\right)\right] \qquad (4)

which is called the streamfunction formulation of the steady incompressible Navier-Stokes equations.
Because of the need for a finer grid near the sharp corner we consider transformations given by

x=f(\xi),\qquad y=g(\eta) \qquad (5)

and hence the governing equations become

D\psi=-\zeta \qquad (6)

D\zeta=\frac{Re}{f'g'}\left(\frac{\partial\psi}{\partial\eta}\frac{\partial\zeta}{\partial\xi}-\frac{\partial\psi}{\partial\xi}\frac{\partial\zeta}{\partial\eta}\right) \qquad (7)

where

D\equiv\frac{1}{f'^2}\frac{\partial^2}{\partial\xi^2}-\frac{f''}{f'^3}\frac{\partial}{\partial\xi}+\frac{1}{g'^2}\frac{\partial^2}{\partial\eta^2}-\frac{g''}{g'^3}\frac{\partial}{\partial\eta} \qquad (8)
We have chosen the same transformations given by Hunt [9], that is

x=f(\xi)=\frac{\Delta x_0}{k}\sinh(k\xi) \qquad (9)

y=g(\eta)=\eta+\frac{1}{2\pi}(1-\Delta y_0)\sin(2\pi\eta) \qquad (10)

where $h\,\Delta x_0$ and $h\,\Delta y_0$ are the dimensions of a cell in the $x$–$y$ plane near the corner, $k$ is a parameter determined by the position of the upstream boundary and $h$ is the grid size. Figure 2 shows an example of a non-uniform grid placed on the channel.
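As a quick check of the stretching functions (9)-(10), the following sketch (illustrative Python; the names f and g follow the text, and the parameter values are taken from the numerical experiments of Section 6) evaluates the cell dimensions next to the corner at $(\xi,\eta)=(0,\tfrac12)$, which should be approximately $h\,\Delta x_0$ by $h\,\Delta y_0$:

```python
import math

def f(xi, dx0, k):
    # Streamwise stretching, eq. (9): x = (dx0/k) * sinh(k*xi)
    return (dx0 / k) * math.sinh(k * xi)

def g(eta, dy0):
    # Cross-stream stretching, eq. (10): y = eta + (1-dy0)/(2*pi) * sin(2*pi*eta)
    return eta + (1.0 - dy0) / (2.0 * math.pi) * math.sin(2.0 * math.pi * eta)

# Cell dimensions near the corner: with grid size h, the first cell widths
# f(h)-f(0) and g(1/2+h)-g(1/2) should be close to h*dx0 and h*dy0.
h, dx0, dy0, k = 0.01, 0.025, 0.025, 2.9973
print(f(h, dx0, k) - f(0.0, dx0, k))   # ≈ h*dx0 = 2.5e-4
print(g(0.5 + h, dy0) - g(0.5, dy0))   # ≈ h*dy0 = 2.5e-4
```

The check confirms why these transformations concentrate grid points at the re-entrant corner while keeping a coarse grid far up- and downstream.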
Figure 2: Forward-facing stepped channel: grid mesh.
3 THE STREAMFUNCTION FORMULATION ON A NON-UNIFORM GRID
Mancera [2] and Mancera and Hunt [3] have used a procedure to deal with the streamfunction formulation of the Navier-Stokes equations on a non-uniform grid, which consists of:

1. Discretise equations (6) and (7) using fourth order central differences.

2. Eliminate $\zeta_{i,j}$ from these equations.

3. Obtain a computational stencil with 29 points.

We will obtain the full expression for the streamfunction formulation of the Navier-Stokes equations to analyse this constricted stepped problem, since after discretising the equation we will have a 29-point computational stencil, instead of the 33-point computational stencil resulting from the procedure cited above (step 2). Writing equation (6) as

\zeta=-\frac{1}{f'^2}\frac{\partial^2\psi}{\partial\xi^2}+\frac{f''}{f'^3}\frac{\partial\psi}{\partial\xi}-\frac{1}{g'^2}\frac{\partial^2\psi}{\partial\eta^2}+\frac{g''}{g'^3}\frac{\partial\psi}{\partial\eta} \qquad (11)

and then calculating $\partial\zeta/\partial\xi$, $\partial\zeta/\partial\eta$, $\partial^2\zeta/\partial\xi^2$ and $\partial^2\zeta/\partial\eta^2$, we obtain, after substituting these derivatives in equation (7),
-\frac{1}{f'^4}\frac{\partial^4\psi}{\partial\xi^4}-\frac{1}{g'^4}\frac{\partial^4\psi}{\partial\eta^4}-\frac{2}{f'^2g'^2}\frac{\partial^4\psi}{\partial\xi^2\partial\eta^2}+\frac{6f''}{f'^5}\frac{\partial^3\psi}{\partial\xi^3}+\frac{6g''}{g'^5}\frac{\partial^3\psi}{\partial\eta^3}+\frac{2g''}{f'^2g'^3}\frac{\partial^3\psi}{\partial\xi^2\partial\eta}

+\frac{2f''}{g'^2f'^3}\frac{\partial^3\psi}{\partial\xi\partial\eta^2}+\frac{4f'''f'-15f''^2}{f'^6}\frac{\partial^2\psi}{\partial\xi^2}+\frac{4g'''g'-15g''^2}{g'^6}\frac{\partial^2\psi}{\partial\eta^2}-\frac{2g''f''}{(f'g')^3}\frac{\partial^2\psi}{\partial\xi\partial\eta}

+\frac{f''''f'^2-10f'f''f'''+15f''^3}{f'^7}\frac{\partial\psi}{\partial\xi}+\frac{g''''g'^2-10g'g''g'''+15g''^3}{g'^7}\frac{\partial\psi}{\partial\eta}

-\frac{Re}{f'g'}\left[\frac{\partial\psi}{\partial\eta}\left(\frac{3f''}{f'^3}\frac{\partial^2\psi}{\partial\xi^2}-\frac{1}{f'^2}\frac{\partial^3\psi}{\partial\xi^3}+\frac{f'''f'-3f''^2}{f'^4}\frac{\partial\psi}{\partial\xi}-\frac{1}{g'^2}\frac{\partial^3\psi}{\partial\xi\partial\eta^2}+\frac{g''}{g'^3}\frac{\partial^2\psi}{\partial\xi\partial\eta}\right)\right.

\left.-\frac{\partial\psi}{\partial\xi}\left(-\frac{1}{f'^2}\frac{\partial^3\psi}{\partial\xi^2\partial\eta}+\frac{f''}{f'^3}\frac{\partial^2\psi}{\partial\xi\partial\eta}+\frac{3g''}{g'^3}\frac{\partial^2\psi}{\partial\eta^2}-\frac{1}{g'^2}\frac{\partial^3\psi}{\partial\eta^3}+\frac{g'''g'-3g''^2}{g'^4}\frac{\partial\psi}{\partial\eta}\right)\right]=0 \qquad (12)
Equation (12) is the streamfunction formulation of the Navier-Stokes equations on a non-uniform grid under the transformations given in (5). Finally, we observe that if $f'=g'=1$ in (12), so that all higher derivatives of $f$ and $g$ vanish, then expression (4) is obtained.
4 DISCRETISATION OF THE EQUATION

We set a uniform grid on the computational domain with grid size $h$ in both the $\xi$ and $\eta$ directions. If $\psi_{i,j}$ denotes an approximation to $\psi$ at position $(i,j)$, then the derivatives in the $\xi$-direction are approximated by fourth order central differences, that is,

\frac{\partial\psi}{\partial\xi}=\frac{1}{12h}\left(-\psi_{i+2,j}+8\psi_{i+1,j}-8\psi_{i-1,j}+\psi_{i-2,j}\right)+O(h^4)

\frac{\partial^2\psi}{\partial\xi^2}=\frac{1}{12h^2}\left(-\psi_{i+2,j}+16\psi_{i+1,j}-30\psi_{i,j}+16\psi_{i-1,j}-\psi_{i-2,j}\right)+O(h^4)

\frac{\partial^3\psi}{\partial\xi^3}=\frac{1}{8h^3}\left(-\psi_{i+3,j}+8\psi_{i+2,j}-13\psi_{i+1,j}+13\psi_{i-1,j}-8\psi_{i-2,j}+\psi_{i-3,j}\right)+O(h^4)

\frac{\partial^4\psi}{\partial\xi^4}=\frac{1}{6h^4}\left(-\psi_{i+3,j}+12\psi_{i+2,j}-39\psi_{i+1,j}+56\psi_{i,j}-39\psi_{i-1,j}+12\psi_{i-2,j}-\psi_{i-3,j}\right)+O(h^4) \qquad (13)
The mixed derivatives are evaluated in the programme using a do loop. For example, $\partial^3\psi/\partial\xi\partial\eta^2$ is approximated by

a_l=\frac{1}{12h}\left(-\psi_{i+2,j+l}+8\psi_{i+1,j+l}-8\psi_{i-1,j+l}+\psi_{i-2,j+l}\right),\quad l=-2,-1,0,1,2

\frac{\partial^3\psi}{\partial\xi\partial\eta^2}\simeq\frac{1}{12h^2}\left(-a_2+16a_1-30a_0+16a_{-1}-a_{-2}\right) \qquad (14)

which gives a $5\times 5$ computational stencil. Equation (12) is discretised by formulas (13) and (14) to give the 29-point computational stencil shown in Figure 3.
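The difference formulas above can be sketched as follows (illustrative Python; the helper names d1–d4 and d_mixed, and the test function sin, are ours, not from the paper). The mixed derivative follows the two-pass idea of (14): first differentiate in one index, then apply the second-derivative weights to the results:

```python
import math

def d1(u, i, h):
    # First derivative, eq. (13), O(h^4)
    return (-u[i+2] + 8*u[i+1] - 8*u[i-1] + u[i-2]) / (12*h)

def d2(u, i, h):
    # Second derivative, eq. (13), O(h^4)
    return (-u[i+2] + 16*u[i+1] - 30*u[i] + 16*u[i-1] - u[i-2]) / (12*h**2)

def d3(u, i, h):
    # Third derivative, eq. (13), O(h^4)
    return (-u[i+3] + 8*u[i+2] - 13*u[i+1] + 13*u[i-1] - 8*u[i-2] + u[i-3]) / (8*h**3)

def d4(u, i, h):
    # Fourth derivative, eq. (13), O(h^4)
    return (-u[i+3] + 12*u[i+2] - 39*u[i+1] + 56*u[i]
            - 39*u[i-1] + 12*u[i-2] - u[i-3]) / (6*h**4)

def d_mixed(u2, i, j, h):
    # psi_{xi,eta,eta} via eq. (14): a_l holds psi_xi at rows j+l,
    # then the second-derivative weights act on the a_l
    a = {}
    for l in (-2, -1, 0, 1, 2):
        col = [u2[ii][j + l] for ii in range(len(u2))]
        a[l] = d1(col, i, h)
    return (-a[2] + 16*a[1] - 30*a[0] + 16*a[-1] - a[-2]) / (12*h**2)

# Check against psi = sin(x) (and sin(x)*sin(y) for the mixed case)
h, x0, i = 0.01, 0.7, 3
u = [math.sin(x0 + (m - 3)*h) for m in range(7)]      # nodes i-3 .. i+3
xs = [x0 + (m - 3)*h for m in range(7)]
ys = [0.4 + (m - 3)*h for m in range(7)]
u2d = [[math.sin(x)*math.sin(y) for y in ys] for x in xs]
print(d1(u, i, h) - math.cos(x0))     # truncation error, O(h^4)
print(d4(u, i, h) - math.sin(x0))     # fourth derivative of sin is sin
print(d_mixed(u2d, 3, 3, h))          # ≈ -cos(x0)*sin(0.4)
```

Applied with a do loop over l, the two-pass construction is exactly what produces the 5×5 block of the mixed-derivative stencil.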
The application of the 29-point computational stencil to this constricted stepped channel problem is not straightforward because of the difficulty of dealing with fictitious points near the sharp corner (a singular point). To understand the difficulty, consider the two calculations near the sharp corner illustrated in Figure 4. Applying the 29-point computational stencil at the marked positions, we note that both calculations use common fictitious points, but the behaviour of the flow before the corner differs from the behaviour after it. Hence each of the four fictitious nodes near the corner has two values of the streamfunction, depending on whether the centre of the computational stencil lies before or after the corner. To overcome this difficulty we have not used fictitious points at the solid walls in the calculations.

Figure 3: Computational stencil with 29 points.

If we apply the 13-point computational stencil* at the interior points next to the boundary and the 29-point computational stencil at all other interior points, then we only require the value of $\psi$ at a single point, denoted by $\star$, outside the boundary. This value can be eliminated using the derivative boundary condition at the wall,

\frac{\partial\psi}{\partial n}=0 \qquad (15)

where $\partial\psi/\partial n$ is the normal derivative. Using a fourth order discretisation we can approximate this by

\frac{\partial\psi}{\partial n}\simeq\frac{1}{12h}\left(3\psi_\star+10\psi_0-18\psi_1+6\psi_2-\psi_3\right)=0 \qquad (16)

where the subscript 0 denotes a point on the boundary, the subscripts $j=1,2,3$ denote the $j$-th internal grid points along the inward normal from 0, and $\star$ denotes a point outside the boundary. From (16) we obtain

\psi_\star=\frac{1}{3}\left(-10\psi_0+18\psi_1-6\psi_2+\psi_3\right) \qquad (17)

which can be used to remove the fictitious point from the 13- and 29-point computational stencils, so that no fictitious points are used in the calculations. Our practice of applying the 13-point computational stencil next to the boundary differs slightly from Henshaw's procedure (see Henshaw [5] and Henshaw et al. [6]), where that computational stencil is applied on the boundary, but the method can be shown to be still fourth order accurate (Hunt, private communication).
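The elimination step (16)-(17) can be sketched as follows (illustrative Python; psi_star is a hypothetical helper name). By construction, the one-sided derivative (16) built from the returned value vanishes; and for a smooth profile with zero normal derivative at the wall, the formula reproduces the outside value to $O(h^5)$:

```python
import math

def psi_star(psi0, psi1, psi2, psi3):
    # eq. (17): value of psi at the fictitious point outside the wall
    return (-10.0*psi0 + 18.0*psi1 - 6.0*psi2 + psi3) / 3.0

h = 0.1

# Consistency check: the derivative (16) built from psi_star is zero
star = psi_star(1.0, 1.2, 1.5, 1.9)
deriv = (3.0*star + 10.0*1.0 - 18.0*1.2 + 6.0*1.5 - 1.9) / (12.0*h)
print(deriv)  # 0 by construction (up to rounding)

# Accuracy check: cos(s) has zero derivative at the wall s = 0, so (17)
# should reproduce the reflected value cos(-h) = cos(h) to O(h^5)
star2 = psi_star(math.cos(0.0), math.cos(h), math.cos(2*h), math.cos(3*h))
print(star2 - math.cos(h))
```

This is why the substitution preserves the fourth order accuracy of the interior scheme.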
*This computational stencil is obtained by discretising the governing equation by second order central differences.

Figure 4: Computational stencils near the re-entrant corner.

Using these ideas we have set up a procedure (see Figure 5 for the positions of the calculations near the re-entrant corner) to discretise the governing equation in the computational domain. The procedure is as follows. Let us consider an $N_1\times M_1$ grid before the sharp corner and an $N_2\times M_2$ grid, where $M_2=M_1/2$, after it, where the grid vertices are $(i,j)$, $i=-N_1,-N_1+1,\ldots,-1,0,1,\ldots,N_2$, $j=0,1,\ldots,M$, with $M=M_1$ for $i\le 0$ and $M=M_2$ for $i>0$. Then:
1. At points $j=M_1-1$ for $-N_1+1\le i\le -2$ and $j=M_2-1$ for $1\le i\le N_2-1$ we apply a computational stencil with 13 points (second order accurate) with all fictitious points replaced by (17), that is, $\psi_\star=\frac{1}{3}(-10\psi_{i,M_1}+18\psi_{i,M_1-1}-6\psi_{i,M_1-2}+\psi_{i,M_1-3})$ at $j=M_1-1$ and $\psi_\star=\frac{1}{3}(-10\psi_{i,M_2}+18\psi_{i,M_2-1}-6\psi_{i,M_2-2}+\psi_{i,M_2-3})$ at $j=M_2-1$. In Figure 5 these points are indicated by 1.

2. At $i=-1$ for $M_2+1\le j\le M_1-2$ we apply a 13-point computational stencil with fictitious points substituted by $\psi_\star=\frac{1}{3}(-10\psi_{0,j}+18\psi_{-1,j}-6\psi_{-2,j}+\psi_{-3,j})$ (see 2 in Figure 5).

3. At grid position $(-1,M_1-1)$ we apply a computational stencil with 13 points, where, to eliminate fictitious points in both axis directions, we apply expressions for $\psi_\star$ similar to those given in the two items above (see 3 in Figure 5).

4. At grid positions $(-1,M_2)$, $(-1,M_2-1)$ and $(0,M_2-1)$ we apply a computational stencil with 13 points (see 4 in Figure 5).

5. At positions $j=M_1-2$ for $-N_1+1\le i\le -3$ and $j=M_2-2$ for $1\le i\le N_2-1$ we apply a computational stencil with 29 points (fourth order accurate), where $\psi_\star=\frac{1}{3}(-10\psi_{i,k}+18\psi_{i,k-1}-6\psi_{i,k-2}+\psi_{i,k-3})$, with $k$ either $M_1$ or $M_2$, is used to eliminate fictitious points (see 5 in Figure 5).
6. At $i=-2$ for $M_2+1\le j\le M_1-3$ the 29-point computational stencil is applied with all fictitious points substituted by the same expression given in item 2 (see 6 in Figure 5).

7. At position $(-2,M_1-2)$ we apply a computational stencil with 29 points with all fictitious points in both directions eliminated using the values of $\psi_\star$ given in the two preceding items (see 7 in Figure 5).

8. At all other interior points we apply a computational stencil with 29 points. In Figure 5 they are indicated by 8.

9. On solid walls we have $\psi_{i,j}=1$, and along the line of symmetry $\psi_{i,0}=0$, $\psi_{i,-1}=-\psi_{i,1}$ and $\psi_{i,-2}=-\psi_{i,2}$.

10. At the ends of the channel we have set $\psi_{-N_1,j}=\frac{jh}{2}\left(3-(jh)^2\right)$, $\psi_{N_2,j}=3jh-4(jh)^3$, $-\psi_{-N_1+2,j}+16\psi_{-N_1+1,j}-30\psi_{-N_1,j}+16\psi_{-N_1-1,j}-\psi_{-N_1-2,j}=0$ and $-\psi_{N_2+2,j}+16\psi_{N_2+1,j}-30\psi_{N_2,j}+16\psi_{N_2-1,j}-\psi_{N_2-2,j}=0$.
Figure 5: Stepped channel: positions of the calculations near the solid walls.
5 NUMERICAL SOLUTION AND ACCURACY

The system of algebraic equations resulting from the discretisation is solved by Newton's method, which is described in Hunt [9, 10] and Mancera and Hunt [3]. The numerical solution is obtained on an $N\times M$ grid, and in order to estimate the error in these results we obtain a second solution on an $N/2\times M/2$ grid for comparison. Suppose that, at a common location, the numerical solution is $\phi_F$ on the original fine grid and $\phi_M$ on the coarser grid and, if further $\phi$ is the exact solution at this point, then since the methods are fourth order we have

\phi-\phi_F\simeq Kh^4,\qquad \phi-\phi_M\simeq K(2h)^4 \qquad (18)

for some constant $K$. Eliminating $\phi$ we obtain an estimate for the error $E_F$ on the fine grid ($\simeq Kh^4$) as

E_F\simeq\frac{\phi_F-\phi_M}{15} \qquad (19)

The errors are estimated as follows:

\text{RMS:}\quad \|\psi_F-\psi_M\|_2=\left[\frac{1}{N}\sum\left(\psi^F_{ij}-\psi^M_{ij}\right)^2\right]^{1/2} \qquad (20)

\text{maximum:}\quad \|\psi_F-\psi_M\|_\infty=\max_{ij}\left|\psi^F_{ij}-\psi^M_{ij}\right| \qquad (21)

where $N$ is the number of points in the computational space and $\psi^F_{ij}$ and $\psi^M_{ij}$ are, respectively, the numerical solutions at $(i,j)$ on the fine and coarser grids.
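The error-estimation recipe of (18)-(21) amounts to a Richardson-type extrapolation; a sketch with a synthetic fourth order quantity follows (the values of phi_exact, K and h are made up for illustration, and error_estimate is our name):

```python
import math

def error_estimate(phi_fine, phi_coarse):
    # eq. (19): fourth order Richardson estimate of the fine-grid error
    return (phi_fine - phi_coarse) / 15.0

# Model a quantity whose discretisation error is exactly K*h^4, eq. (18)
phi_exact, K, h = 2.0, 3.0, 0.05
phi_F = phi_exact - K * h**4           # fine grid:   phi - phi_F = K h^4
phi_M = phi_exact - K * (2.0*h)**4     # coarse grid: phi - phi_M = 16 K h^4
E_F = error_estimate(phi_F, phi_M)
print(E_F - (phi_exact - phi_F))       # ≈ 0: exact for a pure h^4 error

# RMS and maximum norms of eqs. (20)-(21) on a toy difference field
diffs = [1e-4, -2e-4, 3e-4]
rms = math.sqrt(sum(d*d for d in diffs) / len(diffs))
mx = max(abs(d) for d in diffs)
print(rms, mx)
```

In the tables below these two norms are the quantities reported for each Reynolds number.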
6 Results for the stepped channel

We numerically solve the flow in the forward-facing stepped channel problem using the 29-point computational stencil together with boundary data in which all fictitious points are eliminated. We consider the elimination of the fictitious points to be the best approach to this problem. Let us explain the process of discretisation step by step. First, the Navier-Stokes equations (1) and (2) are transformed into equations (6) and (7), where the coordinate transformations are given by equations (9) and (10). Second, equations (6) and (7) are written in the streamfunction formulation alone (equation (12)) and then discretised by fourth order central differences to obtain a 29-point computational stencil. Third, equation (12) is also discretised using second order central differences to obtain a 13-point computational stencil, which is applied adjacent to the walls. Fourth, we apply a procedure to eliminate all fictitious points at the solid walls. The results are presented both for a uniform grid and for a non-uniform grid.
For the uniform grid we have set the upstream boundary at x = −2 and the downstream boundary at x = 2, where the number of points on the fine grid is 96 in the y-direction before the corner. The maximum and RMS errors are shown in Tables 1 and 2, respectively. (The notation a(−b) means a × 10⁻ᵇ.)

Table 1: Maximum errors on a uniform grid.

   Re    Fourth order errors    Second order errors    Ratio
    0         1.50(-4)               8.00(-4)           5.33
    1         1.55(-4)               8.32(-4)           5.37
   10         1.86(-4)               1.04(-3)           5.59
   50         2.24(-4)               1.53(-3)           6.83
  100         2.32(-4)               1.84(-3)           7.93
  250         4.35(-4)               2.26(-3)           5.20
  500         1.97(-3)               2.30(-3)           1.17

Table 2: RMS errors on a uniform grid.

   Re    Fourth order errors    Second order errors    Ratio
    0         2.86(-5)               1.41(-4)           4.93
    1         2.81(-5)               1.42(-4)           5.05
   10         2.68(-5)               1.54(-4)           5.75
   50         2.91(-5)               1.70(-4)           5.84
  100         3.14(-5)               1.75(-4)           5.57
  250         5.45(-5)               2.55(-4)           4.68
  500         3.56(-4)               4.46(-4)           1.25

We note from both tables that the results given by the fourth order method are not much more accurate than their second order counterparts: for all Re the errors for the fourth order method are less than 8 times smaller than their second order equivalents.
Now we analyse the results where fictitious points are not used on the solid walls. The upstream position is at x = −2 and the downstream position is either at x ≈ 100 or at x ≈ 1000. In the numerical simulations we have chosen Δx₀ = Δy₀ = 0.025, or Δx₀ = 0.01 and Δy₀ = 0.025, on the coarser grid.

In Tables 3 and 4 we present results for the fourth and second order methods on a non-uniform grid where the upstream position is at x = −2, the downstream position is at x ≈ 109, the value of the parameter k in equation (9) is 2.9973 and Δx₀ = Δy₀ = 0.025. The number of points in the ξ-direction is 80 and in the η-direction 24 on the coarser grid. Comparing the results given in Tables 3 and 4 with those given in Tables 1 and 2, we have obtained results up to Re = 1000, even though for the fourth order method Newton's method failed to converge after 10 iterations. For the fourth order numerical method the errors on a non-uniform grid are smaller than the errors on a uniform grid for all Reynolds numbers. The ratio between the errors is less than 11 for the maximum errors and less than 13 for the RMS errors.

Table 3: Maximum errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 109 and Δx₀ = Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         8.81(-5)               9.60(-4)          10.90
    1         9.11(-5)               9.13(-4)          10.02
   10         1.10(-4)               6.28(-4)           5.71
   50         1.43(-4)               7.02(-4)           4.91
  100         1.57(-4)               9.53(-4)           6.07
  125         1.62(-4)               1.04(-3)           6.42
  250         1.89(-4)               1.34(-3)           7.09
  500         3.99(-4)               1.64(-3)           4.11
  750         9.16(-4)               2.84(-3)           3.10
 1000            —                   4.79(-3)             —
The results for the situation with the upstream position at x = −2 and the downstream position at x ≈ 1033 are given in Tables 5 and 6, where the number of points in the ξ-direction is 98 and in the η-direction 24 on the coarser grid. We have obtained results for Reynolds numbers up to 1000 for both the fourth and second order numerical methods. Comparing the maximum and RMS errors given in Tables 5 and 6 with their counterparts given in Tables 3 and 4, we observe the same results for the maximum errors, while the RMS errors are smaller for the downstream position x ≈ 1033, since the flow changes very slowly after the sharp corner. Again the ratios between the errors from the second and fourth order methods are, respectively, less than 12 and less than 14 for the maximum and RMS errors.
We have also analysed the constricted stepped channel problem for the same upstream and downstream positions but considering Δx₀ = 0.01 and Δy₀ = 0.025. We have chosen these values after many numerical experiments, since the Newton method employed did not converge for some values of Δx₀ and Δy₀. For the upstream position at x = −2 and the downstream position at x ≈ 142, the number of points in the ξ-direction is 72 and in the η-direction 24 on the coarser grid, and k = 4.2638. The maximum errors (see Table 7) range from 4.78 × 10⁻⁵ to 3.12 × 10⁻⁴ for the fourth order method, and the ratios between the errors from the second and fourth order methods are less than 25 for all Reynolds numbers. For the fourth order method the RMS errors given in Table 8 range from 1.28 × 10⁻⁵ to 6.02 × 10⁻⁵, and again the ratios between the errors are less than 25. The fourth order method did not converge after 10 iterations for Re = 1000 on the coarser grid. Comparing the errors for the fourth order method given in Tables 3 and 4 with those given in Tables 7 and 8, we observe that both the maximum and RMS errors are smaller for Δx₀ = 0.01 and Δy₀ = 0.025.
Table 4: RMS errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 109 and Δx₀ = Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         2.03(-5)               2.59(-4)          12.76
    1         2.04(-5)               2.49(-4)          12.21
   10         2.10(-5)               1.89(-4)           9.00
   50         2.33(-5)               1.34(-4)           5.75
  100         2.60(-5)               1.35(-4)           5.19
  125         2.67(-5)               1.37(-4)           5.13
  250         3.15(-5)               1.49(-4)           4.73
  500         6.85(-5)               1.70(-4)           2.48
  750         1.71(-4)               5.14(-4)           3.01
 1000            —                   6.52(-4)             —
Table 5: Maximum errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 1033 and Δx₀ = Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         8.81(-5)               9.60(-4)          10.90
    1         9.11(-5)               9.13(-4)          10.02
   10         1.10(-4)               6.28(-4)           5.71
   50         1.43(-4)               7.02(-4)           4.91
  100         1.57(-4)               9.53(-4)           6.07
  125         1.62(-4)               1.04(-3)           6.42
  250         1.89(-4)               1.34(-3)           7.09
  500         3.99(-4)               1.64(-3)           4.11
  750         9.26(-4)               1.72(-3)           1.86
 1000         1.33(-3)               2.00(-3)           1.50
Table 6: RMS errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 1033 and Δx₀ = Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         1.84(-5)               2.44(-4)          13.26
    1         1.85(-5)               2.36(-4)          12.76
   10         1.90(-5)               1.85(-4)           9.74
   50         2.13(-5)               1.40(-4)           6.57
  100         2.35(-5)               1.41(-4)           6.00
  125         2.42(-5)               1.42(-4)           5.87
  250         2.85(-5)               1.52(-4)           5.33
  500         6.19(-5)               1.70(-4)           2.75
  750         1.60(-4)               2.23(-4)           1.39
 1000         2.28(-4)               3.01(-4)           1.32
Table 7: Maximum errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 142, Δx₀ = 0.01 and Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         4.78(-5)               1.16(-3)          24.27
    1         4.94(-5)               1.10(-3)          22.27
   10         6.05(-5)               7.49(-4)          12.38
   50         8.33(-5)               6.82(-4)           8.19
  100         9.72(-5)               5.10(-4)           5.25
  125         1.00(-4)               5.37(-4)           5.37
  250         1.03(-4)               6.82(-4)           6.62
  500         1.38(-4)               1.03(-3)           7.46
  750         3.12(-4)               6.98(-3)          22.37
 1000            —                   5.37(-2)             —
Table 8: RMS errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 142, Δx₀ = 0.01 and Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         1.28(-5)               3.05(-4)          23.83
    1         1.31(-5)               2.96(-4)          22.60
   10         1.50(-5)               2.18(-4)          14.53
   50         1.82(-5)               1.23(-4)           6.76
  100         2.05(-5)               1.18(-4)           5.76
  125         2.10(-5)               1.18(-4)           5.62
  250         1.95(-5)               1.23(-4)           6.31
  500         2.63(-5)               2.05(-4)           7.80
  750         6.02(-5)               1.25(-3)          20.76
 1000            —                   9.74(-3)             —
In Tables 9 and 10 we present errors for the downstream position at x ≈ 1199, with the number of points in the ξ-direction equal to 84. We have obtained results up to Re = 1000 for the fourth order method and up to Re = 750 for the second order method. Again the ratios between the errors are less than 25 for both the maximum and RMS errors, and the RMS errors are slightly smaller than those for the channel with the downstream position at x ≈ 142.

Comparing both situations of channel length and grid refinement, we observe that the upstream position x = −2, downstream position x ≈ 1199, Δx₀ = 0.01 and Δy₀ = 0.025 has given the best results for the fourth order numerical method, although the ratio between the errors for all situations analysed indicates that the fourth order numerical method is not much more accurate than the second order method for this channel problem.
7 Conclusions

We have analysed a fourth order numerical method for solving the Navier-Stokes equations in the constricted stepped channel. We have set up a transformation which gives a fine grid near the sharp corner and a long channel downstream. For most situations we have obtained results for Reynolds numbers up to 1000. Because of the difficulty of dealing with the solution near the sharp corner, we have used a procedure which requires no fictitious points at the solid walls. For this channel problem the fourth order numerical method is not much more accurate than the second order method, but we must note that ψ has singular derivatives at the sharp corner which influence the solution, as can be observed in Mancera and Hunt [3], where a channel problem with a gradual and smooth constriction is solved.
Table 9: Maximum errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 1199, Δx₀ = 0.01 and Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         4.78(-5)               1.16(-3)          24.27
    1         4.94(-5)               1.10(-3)          22.27
   10         6.05(-5)               7.49(-4)          12.38
   50         8.33(-5)               3.95(-4)           4.74
  100         9.72(-5)               5.10(-4)           5.25
  125         1.00(-4)               5.37(-4)           5.37
  250         1.03(-4)               6.82(-4)           6.62
  500         1.38(-4)               1.03(-3)           7.46
  750         3.11(-4)               1.03(-3)           3.31
 1000         5.64(-4)                  —                 —
Table 10: RMS errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 1199, Δx₀ = 0.01 and Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         1.19(-5)               2.89(-4)          24.29
    1         1.22(-5)               2.81(-4)          23.03
   10         1.39(-5)               2.11(-4)          15.18
   50         1.69(-5)               1.30(-4)           7.69
  100         1.90(-5)               1.25(-4)           6.58
  125         1.95(-5)               1.25(-4)           6.41
  250         1.80(-5)               1.30(-4)           7.22
  500         2.43(-5)               1.99(-4)           8.19
  750         5.59(-5)               3.94(-4)           4.23
 1000         9.31(-5)                  —                 —
ACKNOWLEDGEMENT

The first author was supported by FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo) under grant 1996-9530-5.
References

[1] J. S. Bramley and S. C. R. Dennis, The numerical solution of two-dimensional flow in a branching channel, Comput. Fluids, 12-4 (1984), pp. 339-355.

[2] P. F. de A. Mancera, Fourth Order Numerical Methods for Solving the Navier-Stokes Equations in Two Dimensions, PhD thesis, University of Strathclyde, 1996.

[3] P. F. de A. Mancera and R. Hunt, Fourth order method for solving the Navier-Stokes equations in a constricting channel, Int. J. Numer. Methods Fluids, 25 (1997), pp. 1119-1135.

[4] S. C. R. Dennis and F. T. Smith, Steady flow through a channel with a symmetrical constriction in the form of a step, Proc. R. Soc. Lond., A 372 (1980), pp. 393-414.

[5] W. D. Henshaw, A fourth-order accurate method for the incompressible Navier-Stokes equations on overlapping grids, J. Comput. Phys., 113 (1994), pp. 13-25.

[6] W. D. Henshaw, H.-O. Kreiss, and L. G. M. Reyna, A fourth-order-accurate difference approximation for the incompressible Navier-Stokes equations, Comput. Fluids, 23-4 (1994), pp. 575-593.

[7] H. Holstein and D. J. Paddon, A singular finite difference treatment of re-entrant corner flow, J. Non-Newtonian Fluid Mech., 8 (1981), pp. 81-93.

[8] H. Huang and B. R. Seymour, A finite difference method for flow in a constricted channel, Technical Report 93-167, University of British Columbia, 1993.

[9] R. Hunt, The numerical solution of the laminar flow in a constricted channel at moderately high Reynolds number using Newton iteration, Int. J. Numer. Methods Fluids, 11 (1990), pp. 247-259.

[10] R. Hunt, The numerical solution of the flow in a general bifurcating channel at moderately high Reynolds number using boundary-fitted co-ordinates, primitive variables and Newton iteration, Int. J. Numer. Methods Fluids, 17 (1993), pp. 711-729.

[11] A. Karageorghis and T. N. Phillips, Chebyshev spectral collocation methods for laminar flow through a channel contraction, J. Comput. Phys., 84 (1989), pp. 114-133.

[12] H. Ma and D. W. Ruth, A new scheme for vorticity computations near a sharp corner, Comput. Fluids, 23-1 (1994), pp. 23-38.

[13] H. K. Moffatt, Viscous and resistive eddies near a sharp corner, J. Fluid Mech., 18 (1964), pp. 1-18.
Departamento de Bioestatística
Instituto de Biociências - UNESP
CP 510
18618-000 Botucatu, Brazil
e-mail: [email protected]

Department of Mathematics
University of Strathclyde
26 Richmond Street
G1 1XH Glasgow, Scotland
e-mail: [email protected]
J. KSIAM Vol.3, No.2, 69-80, 1999

AN ANALYSIS OF THE MMPP/D₁,D₂/1/B QUEUE FOR TRAFFIC SHAPING OF VOICE IN ATM NETWORKS

Doo Il Choi

Abstract. Recently in telecommunications, the BISDN (Broadband Integrated Services Digital Network) has received considerable attention for its capability of providing a common interface for future communication needs, including voice, data and video. Since all information in the BISDN is statistically multiplexed and transported at high speed by means of discrete units of 53-octet ATM (Asynchronous Transfer Mode) cells, appropriate traffic control is needed. For traffic shaping of voice, the output cell discarding scheme has been proposed. We analyze the scheme as an MMPP/D₁,D₂/1/B queueing system to obtain performance measures such as the loss probability and the waiting time distribution.
1. Introduction

The Asynchronous Transfer Mode (ATM) has been selected as the mode of transmission and switching in the BISDN (Broadband Integrated Services Digital Network) because of its efficiency and flexibility. The ATM is based on asynchronous time division multiplexing and fast packet switching technology. In ATM networks, all information is transmitted in a fixed-size packet called a cell, which has a 48-octet information field and a 5-octet header. The header contains various information required to transfer the information field across the network.

ATM networks support diverse services, such as voice, data and video, which require different Qualities of Service (QoS). Since user terminals in the BISDN generate cells only when they have information to transmit, and these cells are statistically multiplexed, the traffic stream fluctuates unpredictably. Therefore, traffic such as voice and video has the properties of time-correlation and burstiness. These characteristics of the traffic may cause network congestion, so appropriate traffic control is needed.

Voice traffic is delay-sensitive but loss-insensitive. An effective method to support voice traffic in ATM networks is the use of the output cell discarding (CD) scheme. The output CD scheme operates as follows. Voice information is stored in pairs of cells that separate the more significant and less significant bits. The cell containing the more significant bits is identified as a high priority cell (i.e. nondiscardable in the network) and the cell containing the less significant bits is identified as a low priority cell (i.e. discardable in the network). The low-priority cells may be discarded during network congestion. This output CD scheme results in significant transmission bandwidth savings and resiliency of the network during congestion. Therefore, the spare bandwidth obtained by the CD scheme can be used to support different traffic such as data and video. Also, this smoothing effect on voice helps in avoiding buffer overflow [2,3].

Key words and phrases: queueing analysis, traffic shaping.
To model the bursty voice traffic, we use a Markov-modulated Poisson process (MMPP) with arrivals in pairs of cells. We put a threshold on the buffer in consideration of network congestion. If the buffer occupancy at a transmission epoch is less than or equal to the threshold, the service time is D₁ (the transmission time of a cell pair). Otherwise, the service time is D₂ (= D₁/2, because the low-priority cell is discarded). We assume a finite capacity (B) queue for practical applications. Then, the output CD scheme is modeled by the queueing system MMPP/D₁,D₂/1/B with one threshold. In the following sections, we analyze the queueing model by using the embedded Markov chain and the supplementary variable method.
2. Description of model and MMPP
A Markov-modulated Poisson process (MMPP) has been used to model video and packetized voice traffic. The MMPP can be constructed as a Poisson process with a rate that varies according to an $N$-state irreducible continuous-time Markov process $\{J(t),\ t\ge 0\}$ (called the underlying Markov process). When the underlying Markov process is in state $i$ at time $t$, arrivals occur according to a Poisson process of rate $\lambda_i$. The sojourn time in state $i$ follows an exponential distribution with mean $1/\sigma_i$, where $\sigma_i=\sum_{j\ne i}\sigma_{ij}$. Then the MMPP is characterized by the Markov process $\{J(t),\ t\ge 0\}$ with the transition rate matrix $Q$ and the arrival rate matrix $\Lambda\triangleq\mathrm{diag}(\lambda_1,\lambda_2,\ldots,\lambda_N)$. The transition rate matrix $Q$ is as follows:

Q=\begin{pmatrix}
-\sigma_1 & \sigma_{12} & \cdots & \sigma_{1N}\\
\sigma_{21} & -\sigma_2 & \cdots & \sigma_{2N}\\
\vdots & \vdots & \ddots & \vdots\\
\sigma_{N1} & \sigma_{N2} & \cdots & -\sigma_N
\end{pmatrix}.

The steady-state probability vector $\pi$ of the underlying Markov process $\{J(t),\ t\ge 0\}$ is obtained by solving the equations

\pi Q=0,\qquad \pi e=1,\qquad e=(1,1,\ldots,1)^T.
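For a concrete illustration of the steady-state equations, the following sketch computes $\pi$ for a two-state MMPP (the generator entries and arrival rates are invented for the example, not taken from the paper):

```python
import numpy as np

# Two-state MMPP: generator Q of the underlying Markov process and the
# per-state Poisson rates (Lambda = diag(lam)); values are illustrative.
Q = np.array([[-0.2,  0.2],
              [ 0.5, -0.5]])
lam = np.array([1.0, 4.0])

# Solve pi Q = 0 with pi e = 1 by replacing one (redundant) balance
# equation with the normalisation condition.
n = len(Q)
A = np.vstack([Q.T[:-1], np.ones(n)])
b = np.zeros(n); b[-1] = 1.0
pi = np.linalg.solve(A, b)
print(pi)        # stationary distribution of the underlying process
print(pi @ lam)  # long-run mean arrival rate of the MMPP
```

One balance equation is dropped because the rows of $Q^T$ are linearly dependent ($Qe=0$); the normalisation makes the system uniquely solvable.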
The arriving cell pairs are first queued in a buffer of finite capacity $B$, counted in units of cell pairs. Cells arriving when the buffer is full are lost, and cell pairs in the buffer are served on a first-come first-served basis.

Introduce the notation:

$M(t)$ = the number of cell pairs arriving during the interval $(0,t]$;
$J(t)$ = the state of the underlying Markov process at time $t$.

Now we define the conditional probabilities

p_{i,j}(n,t)=P\{M(t)=n,\ J(t)=j\mid M(0)=0,\ J(0)=i\},\quad n\ge 0,\ 1\le j\le N.

Then it is easily shown that the $N\times N$ matrix of probabilities $P(n,t)=(p_{i,j}(n,t))_{1\le i,j\le N}$ has the probability generating function

\hat P(z,t)=\sum_{n=0}^{\infty}P(n,t)z^n=e^{R(z)t},\qquad |z|\le 1,

where $R(z)=Q+(z-1)\Lambda$.
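The counting probabilities $P(n,t)$ can be computed numerically from the generating function relation: matching powers of $z$ in $\partial_t\hat P(z,t)=\hat P(z,t)R(z)$ gives the forward equations $P'(n,t)=P(n,t)(Q-\Lambda)+P(n-1,t)\Lambda$. The sketch below integrates these with explicit Euler stepping (all numerical values are illustrative, and the truncation level nmax is an assumption of the example):

```python
import numpy as np

# Illustrative two-state MMPP parameters (not from the paper)
Q = np.array([[-0.2,  0.2],
              [ 0.5, -0.5]])
Lam = np.diag([1.0, 4.0])

def counting_probs(t, nmax, steps=4000):
    """Return [P(0,t), ..., P(nmax,t)] by Euler stepping of the
    forward equations dP(n)/dt = P(n)(Q - Lam) + P(n-1) Lam."""
    dt = t / steps
    P = [np.eye(2)] + [np.zeros((2, 2)) for _ in range(nmax)]
    for _ in range(steps):
        new = []
        for n in range(nmax + 1):
            dP = P[n] @ (Q - Lam)
            if n > 0:
                dP = dP + P[n - 1] @ Lam
            new.append(P[n] + dt * dP)
        P = new
    return P

P = counting_probs(t=0.5, nmax=30)
total = sum(P)             # sum_n P(n,t) should be the stochastic matrix exp(Qt)
print(total @ np.ones(2))  # row sums ≈ 1 (up to the truncation at nmax)
```

The row-sum check works because setting $z=1$ gives $\hat P(1,t)=e^{Qt}$, which is stochastic.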
3. Analysis of queue length distribution

3.1 The queue length distribution at transmission epochs

Introduce the notation:

$\tau_n$ = the $n$-th service completion epoch, $n\ge 1$, with $\tau_0\triangleq 0$;
$N_n$ = the queue length at time $\tau_n+$;
$J_n$ = the state of the underlying Markov process at time $\tau_n+$.

Then the process $\{(N_n,J_n),\ n\ge 0\}$ forms a Markov chain with finite state space $\{0,1,\ldots,B-1\}\times\{1,2,\ldots,N\}$. Define the limiting probabilities $x_{k,i}$ and the probability vectors

x_{k,i}\triangleq\lim_{n\to\infty}P\{N_n=k,\ J_n=i\},

x\triangleq(x_0,x_1,\ldots,x_{B-1})\ \text{ with }\ x_k\triangleq(x_{k,1},x_{k,2},\ldots,x_{k,N}).
The transition probability matrix $Q_1$ of the Markov chain $\{(N_n,J_n),\ n\ge 0\}$ is given by

Q_1=\begin{pmatrix}
A_0^0 & A_1^0 & A_2^0 & \cdots & A_{L_1-1}^0 & A_{L_1}^0 & A_{L_1+1}^0 & \cdots & A_{B-2}^0 & \bar A_{B-1}^0\\
A_0 & A_1 & A_2 & \cdots & A_{L_1-1} & A_{L_1} & A_{L_1+1} & \cdots & A_{B-2} & \bar A_{B-1}\\
0 & A_0 & A_1 & \cdots & A_{L_1-2} & A_{L_1-1} & A_{L_1} & \cdots & A_{B-3} & \bar A_{B-2}\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & A_1 & A_2 & A_3 & \cdots & A_{B-L_1} & \bar A_{B-L_1+1}\\
0 & 0 & 0 & \cdots & A_0 & A_1 & A_2 & \cdots & A_{B-L_1-1} & \bar A_{B-L_1}\\
0 & 0 & 0 & \cdots & 0 & B_0 & B_1 & \cdots & B_{B-L_1-2} & \bar B_{B-L_1-1}\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & B_1 & \bar B_2\\
0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & B_0 & \bar B_1
\end{pmatrix}

where the blocks $A_k$, $B_k$, $A_k^0$, $\bar A_k$, $\bar B_k$ and $\bar A_k^0$ are as follows:

A_k=P(k,D_1),\quad B_k=P(k,D_2),\quad \bar A_k=\sum_{n=k}^{\infty}A_n,\quad \bar B_k=\sum_{n=k}^{\infty}B_n,

A_k^0=\int_0^{\infty}P(0,t)\Lambda\,dt\,A_k=(\Lambda-Q)^{-1}\Lambda A_k,\quad \bar A_k^0=\sum_{n=k}^{\infty}A_n^0.
The steady-state probability vector $x$ of the Markov chain $\{(N_n,J_n),\ n\ge 0\}$ is obtained from the equations

x\,Q_1=x,\qquad x\,e=1.
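The embedded-chain equations $xQ_1=x$, $xe=1$ are a standard stationary-vector computation; the sketch below solves them for a small illustrative transition matrix (not the actual $Q_1$, whose blocks $A_k$, $B_k$ would first have to be assembled):

```python
import numpy as np

# Small illustrative row-stochastic transition matrix standing in for Q1
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 0.4, 0.6]])

# Solve x P = x, x e = 1: replace one (redundant) equation of
# (P^T - I) x^T = 0 with the normalisation row.
n = P.shape[0]
A = np.vstack([(P.T - np.eye(n))[:-1], np.ones(n)])
b = np.zeros(n); b[-1] = 1.0
x = np.linalg.solve(A, b)
print(x)          # stationary distribution
print(x @ P - x)  # residual of the balance equations, ≈ 0
```

The same linear-solve pattern applies to the block matrix $Q_1$, with the scalar entries replaced by the $N\times N$ blocks.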
3.2 The queue length distribution at an arbitrary time
In this subsection we derive the queue length distribution at an arbitrary time. Let
N(t) be the queue length ( including the cell in service ) at time t.
R(t) =
�1 if the service time of the cell is by D1 at time t;
2 if the service time of the cell is by D2 at time t:
and
� =
�0 if the server is idle ;
1 if the server is busy:
De�ne the limiting probabilities
y0 = limt!1
PfN(t) = 0; � = 0g;
yn = limt!1
PfN(t) = n; � = 1g; n � 1:
First we compute the vector $y_0$, the probability that the system is idle. Analogously to Choi [1], we have
$$
(1)\qquad y_0 = \frac{1}{C_1}\,x_0(\Lambda-Q)^{-1},
$$
where $C_1 = x_0(\Lambda-Q)^{-1}e + D_2 + (D_1-D_2)\sum_{n=0}^{L_1}x_n e$. Let $T$ and $\tilde T$ be the remaining and the elapsed service time, respectively, of the cell in service. In order to obtain the queue length distribution $y_n$ $(n \ge 1)$ at an arbitrary time, we define the joint probability distribution of the queue length and the remaining service time at an arbitrary time $\tau$,
$$
\pi_r(n,j,t)\,dt = P\{N(\tau)=n,\ J(\tau)=j,\ R(\tau)=r,\ t < T \le t+dt,\ \delta=1\},
$$
and its Laplace transform and the vectors
$$
\pi^*_r(n,j,s) = \int_0^{\infty}e^{-st}\pi_r(n,j,t)\,dt,
$$
$$
\pi^*_r(n,s) = (\pi^*_r(n,1,s),\cdots,\pi^*_r(n,N,s)), \quad r = 1,2, \qquad \pi^*(n,s) = \pi^*_1(n,s) + \pi^*_2(n,s).
$$
We furthermore define the conditional probability $\theta_r(n,j_1,j_2,t)\,dt$ $(r = 1,2)$ and its Laplace transform
$$
\theta_r(n,j_1,j_2,t)\,dt = P\{\nu(\tilde T)=n,\ J(\bar\tau+\tilde T)=j_2,\ R(\bar\tau+\tilde T)=r,\ t < T \le t+dt,\ \delta=1 \mid J_{\bar\tau}=j_1\},
$$
where $\bar\tau$ denotes the beginning epoch of the current service, and
$$
\theta^*_r(n,j_1,j_2,s) = \int_0^{\infty}e^{-st}\theta_r(n,j_1,j_2,t)\,dt, \qquad \Theta^*_r(n,s) = \big(\theta^*_r(n,j_1,j_2,s)\big)_{1\le j_1,j_2\le N},
$$
AN ANALYSIS OF MMPP/D1,D2/1/B QUEUE 73
where $\nu(T)$ is the number of cells arriving during the time $T$. Then the vectors $\pi^*_r(n,s)$ can be represented by the following equations:
$$
(2)\qquad \pi^*_1(n,s) = \frac{D_1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\,\Theta^*_1(n-1,s) + \sum_{k=1}^{\min(n,L_1)}x_k\,\Theta^*_1(n-k,s)\Big],
$$
$$
(3)\qquad \pi^*_1(B,s) = \frac{D_1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\Big\{\sum_{m=B-1}^{\infty}\Theta^*_1(m,s)\Big\} + \sum_{k=1}^{L_1}x_k\Big\{\sum_{m=B-k}^{\infty}\Theta^*_1(m,s)\Big\}\Big],
$$
$$
(4)\qquad \pi^*_2(n,s) = 0, \quad 1 \le n \le L_1,
$$
$$
(5)\qquad \pi^*_2(n,s) = \frac{D_2}{C_1}\sum_{k=L_1+1}^{n}x_k\,\Theta^*_2(n-k,s), \quad L_1+1 \le n \le B-1,
$$
$$
(6)\qquad \pi^*_2(B,s) = \frac{D_2}{C_1}\sum_{k=L_1+1}^{B-1}x_k\Big[\sum_{m=B-k}^{\infty}\Theta^*_2(m,s)\Big].
$$
We finally obtain
$$
(7)\qquad \pi^*(n,s) = \pi^*_1(n,s) + \pi^*_2(n,s)
= \begin{cases}
\dfrac{D_1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\,\Theta^*_1(n-1,s) + \displaystyle\sum_{k=1}^{n}x_k\,\Theta^*_1(n-k,s)\Big], & 1 \le n \le L_1,\\[2mm]
\dfrac{D_1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\,\Theta^*_1(n-1,s) + \displaystyle\sum_{k=1}^{L_1}x_k\,\Theta^*_1(n-k,s)\Big]\\
\qquad + \dfrac{D_2}{C_1}\displaystyle\sum_{k=L_1+1}^{n}x_k\,\Theta^*_2(n-k,s), & L_1+1 \le n \le B-1,\\[2mm]
\dfrac{D_1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\Big\{\displaystyle\sum_{m=B-1}^{\infty}\Theta^*_1(m,s)\Big\} + \displaystyle\sum_{k=1}^{L_1}x_k\Big\{\displaystyle\sum_{m=B-k}^{\infty}\Theta^*_1(m,s)\Big\}\Big]\\
\qquad + \dfrac{D_2}{C_1}\displaystyle\sum_{k=L_1+1}^{B-1}x_k\Big\{\displaystyle\sum_{m=B-k}^{\infty}\Theta^*_2(m,s)\Big\}, & n = B.
\end{cases}
$$
In order to obtain $\Theta^*_r(n,s)$ $(r = 1,2)$, we consider the following equation:
$$
(8)\qquad \sum_{n=0}^{\infty}\Theta^*_1(n,s)z^n = E[e^{-sT}e^{R(z)\tilde T}] = e^{R(z)D_1}E[e^{-(sI+R(z))T}],
$$
where $D_1 = \tilde T + T$. Since $E[e^{-sT}] = \int_0^{D_1}e^{-st}\frac{1}{D_1}\,dt = \frac{1-e^{-sD_1}}{sD_1}$,
$$
(9)\qquad \sum_{n=0}^{\infty}\Theta^*_1(n,s)z^n = e^{R(z)D_1}\big[I - e^{-(sI+R(z))D_1}\big]\big[(sI+R(z))D_1\big]^{-1}
= \frac{1}{D_1}\big[e^{R(z)D_1} - e^{-sD_1}I\big](sI+R(z))^{-1}.
$$
It is known that
$$
\sum_{n=0}^{\infty}A_n z^n = \sum_{n=0}^{\infty}P(n,D_1)z^n = e^{R(z)D_1}.
$$
Substituting the above equation into (9), we obtain
$$
\sum_{n=0}^{\infty}\Theta^*_1(n,s)z^n = \frac{1}{D_1}\Big[\sum_{n=0}^{\infty}A_n z^n - e^{-sD_1}I\Big](sI+R(z))^{-1}
= \frac{1}{D_1}\Big[\sum_{n=0}^{\infty}A_n z^n - e^{-sD_1}I\Big]\Big[\sum_{n=0}^{\infty}R_n(s)z^n\Big]
= \frac{1}{D_1}\sum_{n=0}^{\infty}\Big[\sum_{k=0}^{n}A_k R_{n-k}(s) - e^{-sD_1}R_n(s)\Big]z^n,
$$
where $R_n(s) = (sI-\Lambda+Q)^{-1}\big[\Lambda(\Lambda-sI-Q)^{-1}\big]^n$. Thus, $\Theta^*_1(n,s)$ is given by
$$
\Theta^*_1(n,s) = \frac{1}{D_1}\Big[\sum_{m=0}^{n}A_m R_{n-m}(s) - e^{-sD_1}R_n(s)\Big].
$$
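The matrix-geometric expansion $(sI+R(z))^{-1} = \sum_n R_n(s)z^n$ used above can be checked numerically. The sketch below uses a made-up 2-phase MMPP (generator $Q$, arrival-rate matrix $\Lambda$), comparing the directly inverted matrix with a truncated power series:

```python
# Numerical check of the expansion (sI + Q - L + zL)^{-1} = sum_n R_n(s) z^n,
# with R_n(s) = (sI - L + Q)^{-1} [L (L - sI - Q)^{-1}]^n.
# Q and Lam below are a made-up 2-phase MMPP, not data from the paper.

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_add(A, B, ca=1.0, cb=1.0):
    """Return ca*A + cb*B."""
    return [[ca * A[i][j] + cb * B[i][j] for j in range(2)] for i in range(2)]

def mat_inv(A):
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

I = [[1.0, 0.0], [0.0, 1.0]]
Q = [[-1.0, 1.0], [2.0, -2.0]]     # phase generator (rows sum to 0)
Lam = [[3.0, 0.0], [0.0, 0.5]]     # arrival-rate matrix Lambda
s, z = 0.7, 0.4                    # |z| < 1 so the series converges here

sIQ = mat_add(I, Q, s, 1.0)                               # sI + Q
lhs = mat_inv(mat_add(sIQ, Lam, 1.0, z - 1.0))            # (sI + Q + (z-1)Lambda)^{-1}

base = mat_inv(mat_add(sIQ, Lam, 1.0, -1.0))              # (sI - Lambda + Q)^{-1}
step = mat_mul(Lam, mat_inv(mat_add(Lam, sIQ, 1.0, -1.0)))  # Lambda (Lambda - sI - Q)^{-1}

rhs = [[0.0, 0.0], [0.0, 0.0]]
power, zn = I, 1.0
for _ in range(200):                                      # truncated series in z
    rhs = mat_add(rhs, mat_mul(base, power), 1.0, zn)
    power = mat_mul(power, step)
    zn *= z
```

For these values the spectral radius of $z\,\Lambda(\Lambda-sI-Q)^{-1}$ is below $1$, so 200 terms reproduce the inverse to floating-point accuracy.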
Similarly, we can obtain $\Theta^*_2(n,s)$ as follows:
$$
\Theta^*_2(n,s) = \frac{1}{D_2}\Big[\sum_{m=0}^{n}B_m R_{n-m}(s) - e^{-sD_2}R_n(s)\Big].
$$
Substituting $\Theta^*_1(n,s)$ and $\Theta^*_2(n,s)$ into $\pi^*(n,s)$, we obtain
$$
(10)\qquad \pi^*(n,s) = \begin{cases}
\dfrac{1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\displaystyle\sum_{m=0}^{n-1}A_m R_{n-1-m}(s) + \displaystyle\sum_{k=1}^{n}x_k\displaystyle\sum_{m=0}^{n-k}A_m R_{n-k-m}(s)\\
\qquad - e^{-sD_1}\Big\{x_0(\Lambda-Q)^{-1}\Lambda R_{n-1}(s) + \displaystyle\sum_{k=1}^{n}x_k R_{n-k}(s)\Big\}\Big], \hfill 1 \le n \le L_1,\\[2mm]
\dfrac{1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\displaystyle\sum_{m=0}^{n-1}A_m R_{n-1-m}(s) + \displaystyle\sum_{k=1}^{L_1}x_k\displaystyle\sum_{m=0}^{n-k}A_m R_{n-k-m}(s)\\
\qquad - e^{-sD_1}\Big\{x_0(\Lambda-Q)^{-1}\Lambda R_{n-1}(s) + \displaystyle\sum_{k=1}^{L_1}x_k R_{n-k}(s)\Big\}\\
\qquad + \displaystyle\sum_{k=L_1+1}^{n}x_k\displaystyle\sum_{m=0}^{n-k}B_m R_{n-k-m}(s) - e^{-sD_2}\displaystyle\sum_{k=L_1+1}^{n}x_k R_{n-k}(s)\Big], \hfill L_1+1 \le n \le B-1.
\end{cases}
$$
Finally, we obtain the queue length probabilities $y_n$ $(n \ge 1)$ at an arbitrary time. For $1 \le n \le L_1$,
$$
(11)\qquad
\begin{aligned}
y_n = \pi^*(n,0)
&= \frac{1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\sum_{m=0}^{n-1}A_m(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-1-m}\\
&\qquad + \sum_{k=1}^{n}x_k\sum_{m=0}^{n-k}A_m(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-k-m}
- \sum_{k=0}^{n}x_k(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-k}\Big].
\end{aligned}
$$
For $L_1+1 \le n \le B-1$,
$$
\begin{aligned}
y_n &= \frac{1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\sum_{m=0}^{n-1}A_m(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-1-m}
+ \sum_{k=1}^{L_1}x_k\sum_{m=0}^{n-k}A_m(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-k-m}\\
&\qquad - \sum_{k=0}^{L_1}x_k(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-k}
+ \sum_{k=L_1+1}^{n}x_k\sum_{m=0}^{n-k}B_m(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-k-m}\\
&\qquad - \sum_{k=L_1+1}^{n}x_k(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-k}\Big],
\end{aligned}
$$
and
$$
y_B = \pi - \sum_{k=0}^{B-1}y_k,
$$
where $\pi$ is the stationary probability vector of the underlying MMPP phase process. Using the probabilities $y_n$ $(n \ge 0)$ obtained above, we obtain performance measures such as the loss probability ($P_{loss}$) and the mean queue length ($M_q$):
$$
P_{loss} = \frac{y_B\Lambda e}{\sum_{i=0}^{B}y_i\Lambda e} = \frac{y_B\Lambda e}{\pi\Lambda e}, \qquad M_q = \sum_{i=0}^{B}i\,y_i e.
$$
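Once the probabilities $y_0,\ldots,y_B$ are in hand, both performance measures are simple weighted sums. A sketch with made-up illustrative numbers (2 phases, $B = 3$; the vectors are not output of the model above):

```python
# Performance measures from the queue length probabilities y_0, ..., y_B:
#   P_loss = y_B Lambda e / (pi Lambda e)  and  M_q = sum_i i * (y_i e).
# The vectors below are made-up illustrative numbers (2 phases, B = 3).

lam = [3.0, 0.5]                      # diagonal of Lambda (phase arrival rates)
y = [[0.30, 0.20],                    # y_0
     [0.10, 0.10],                    # y_1
     [0.08, 0.07],                    # y_2
     [0.09, 0.06]]                    # y_3 = y_B; all entries sum to 1

# Phase marginals: pi_j = sum_n y_n(j), the stationary MMPP phase vector.
pi = [sum(y[n][j] for n in range(len(y))) for j in range(2)]

def lam_e(v):
    """v Lambda e for a row vector v (Lambda diagonal)."""
    return sum(v[j] * lam[j] for j in range(2))

p_loss = lam_e(y[-1]) / lam_e(pi)     # fraction of arrivals seeing a full buffer
m_q = sum(n * sum(y[n]) for n in range(len(y)))   # mean queue length
```

The ratio weights each state by its arrival rate, which is why $\Lambda e$ (and not plain $e$) appears in the loss formula.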
4. Analysis of waiting time distribution

In order to derive the waiting time distribution of an arbitrary cell pair, let us tag a cell pair arriving at time $\tau$. Suppose that there are $i$ $(1 \le i \le B-1)$ cell pairs in the system at time $\tau$. Since the service time may change according to the buffer occupancy at service completion epochs, we need to know the time $U_{i-1}$ required to complete the transmission of the $(i-1)$ cells present at the service completion epoch of the cell under service at time $\tau$. We first define the hitting time of a level greater than the threshold $L_1$ from a level less than or equal to $L_1$ at service completion epochs, and the hitting time of the threshold $L_1$ from a level greater than $L_1$:
$$
Y_{k,m}(j_1,j_2) \triangleq \inf\{n \ge 1;\ (N_n,J_n)=(m,j_2),\ N_n \in A \mid (N_0,J_0)=(k,j_1)\},
$$
$$
k = 1,\cdots,L_1, \quad m = L_1+1,\cdots,B-1,
$$
$$
Z_{k,L_1}(j_1,j_2) \triangleq \inf\{n \ge 1;\ (N_n,J_n)=(L_1,j_2) \mid (N_0,J_0)=(k,j_1)\},
$$
$$
k = L_1+1,\cdots,B-1, \quad 1 \le j_1,j_2 \le N,
$$
where $A = \{L_1+1,\cdots,B-1\}$. We introduce the matrices $P_1$, $P_1^0$, $\bar P_1$, and $\bar P_1^0$ of order $BN$ to obtain the distributions of $Y_{k,m}(j_1,j_2)$ and $Z_{k,L_1}(j_1,j_2)$:
$$
P_1 = \begin{pmatrix}
A'_0 & A'_1 & A'_2 & \cdots & A'_{L_1-1} & A'_{L_1} & A'_{L_1+1} & \cdots & A'_{B-2} & \bar A'_{B-1}\\
A_0 & A_1 & A_2 & \cdots & A_{L_1-1} & A_{L_1} & A_{L_1+1} & \cdots & A_{B-2} & \bar A_{B-1}\\
0 & A_0 & A_1 & \cdots & A_{L_1-2} & A_{L_1-1} & A_{L_1} & \cdots & A_{B-3} & \bar A_{B-2}\\
\vdots & & & \ddots & & & & \ddots & & \vdots\\
0 & 0 & 0 & \cdots & A_1 & A_2 & A_3 & \cdots & A_{B-L_1} & \bar A_{B-L_1+1}\\
0 & 0 & 0 & \cdots & A_0 & A_1 & A_2 & \cdots & A_{B-L_1-1} & \bar A_{B-L_1}\\
0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0\\
\vdots & & & \ddots & & & & \ddots & & \vdots\\
0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0
\end{pmatrix}
$$
and the matrix $P_1^0$ is the same as the matrix $P_1$ except that all rows and columns of levels greater than $L_1$ are block $0$.
$$
\bar P_1 = \begin{pmatrix}
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0\\
\vdots & \ddots & & & & & \ddots & \vdots\\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0\\
0 & \cdots & B_0 & B_1 & B_2 & \cdots & 0 & 0\\
0 & \cdots & 0 & B_0 & B_1 & \cdots & 0 & 0\\
\vdots & \ddots & & & & & \ddots & \vdots\\
0 & \cdots & 0 & 0 & 0 & \cdots & B_1 & B_2\\
0 & \cdots & 0 & 0 & 0 & \cdots & B_0 & B_1
\end{pmatrix}
$$
and the matrix $\bar P_1^0$ is the same as the matrix $\bar P_1$ except that the $(L_1+1,L_1)$-block $B_0$ in $\bar P_1$ is replaced by block $0$.
For $k = 1,\cdots,L_1$ and $m = L_1+1,\cdots,B-1$, the event $\{Y_{k,m}(j_1,j_2) = l\}$ means that the Markov chain $\{(N_n,J_n);\ n \ge 0\}$ starting at the state $(k,j_1)$ stays at levels less than $L_1+1$ during $l-1$ transitions and at the $l$-th transition hits the state $(m,j_2)$. Therefore, we have
$$
P\{Y_{k,m}(j_1,j_2)=l\} = \big[(P_1^0)^{l-1}P_1\big](k,j_1;m,j_2) \triangleq f^l_{k,m}(j_1,j_2),
$$
where $[X](k,j_1;m,j_2)$ is the $(j_1,j_2)$-element of the $(k,m)$-block of the matrix $X$. Similarly, we obtain the distribution of the random variable $Z_{k,L_1}(j_1,j_2)$:
$$
P\{Z_{k,L_1}(j_1,j_2)=l\} = \big[(\bar P_1^0)^{l-1}\bar P_1\big](k,j_1;L_1,j_2) \triangleq g^l_{k,L_1}(j_1,j_2), \quad k = L_1+1,\cdots,B-1.
$$
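The hitting-time distribution above has the familiar taboo-probability form: $l-1$ powers of the restricted matrix followed by one full step. A toy sketch with a scalar phase ($N = 1$) and made-up transition probabilities, accumulating the pmf of the first passage from level $0$ to level $2$:

```python
# Hitting-time distribution P{Y = l} = [(P0)^{l-1} P](k, m) via powers of the
# taboo matrix P0 (transitions restricted to the pre-hit levels).  Scalar-phase
# (N = 1) toy chain over levels {0, 1, 2, 3} with made-up probabilities.

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

P = [[0.6, 0.4, 0.0, 0.0],
     [0.3, 0.4, 0.3, 0.0],
     [0.0, 0.3, 0.4, 0.3],
     [0.0, 0.0, 0.5, 0.5]]

# Taboo matrix P0: rows/columns of the "hit" levels {2, 3} zeroed out,
# so (P0)^{l-1} only moves through levels {0, 1}.
P0 = [[P[i][j] if i < 2 and j < 2 else 0.0 for j in range(4)] for i in range(4)]

def hit_pmf(k, m, lmax):
    """P{first visit to level m happens at step l}, l = 1..lmax, starting at k."""
    pmf, power = [], [[float(i == j) for j in range(4)] for i in range(4)]  # P0^0 = I
    for _ in range(lmax):
        pmf.append(mat_mul(power, P)[k][m])   # [(P0)^{l-1} P](k, m)
        power = mat_mul(power, P0)
    return pmf

pmf = hit_pmf(0, 2, 400)
```

For this chain every exit from levels $\{0,1\}$ enters at level $2$, so the pmf sums to $1$ once enough terms are taken.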
Then the Laplace transform of the time $U_{i-1}$ required to complete the service of $(i-1)$ cell pairs is given as follows. For $1 \le i,k \le L_1$,
$$
\begin{aligned}
E[e^{-sU_{i-1}} &\mid (N_n,J_n)=(k,j)]\\
&= \sum_{m_0=L_1+1}^{B-1}\sum_{j_0}\Big[\sum_{a_0=1}^{i-1}E[e^{-sU_{i-1}} \mid (N_n,J_n)=(k,j),\ Y_{k,m_0}(j,j_0)=a_0]\,P\{Y_{k,m_0}(j,j_0)=a_0\}\\
&\qquad + \sum_{a_0=i}^{\infty}E[e^{-sU_{i-1}} \mid (N_n,J_n)=(k,j),\ Y_{k,m_0}(j,j_0)=a_0]\,P\{Y_{k,m_0}(j,j_0)=a_0\}\Big]\\
&= \sum_{m_0=L_1+1}^{B-1}\sum_{j_0}\Big(\sum_{a_0=1}^{i-1}E[e^{-sU_{i-1}} \mid (N_n,J_n)=(k,j),\ Y_{k,m_0}(j,j_0)=a_0]\,f^{a_0}_{k,m_0}(j,j_0)
+ e^{-s(i-1)D_1}\sum_{a_0=i}^{\infty}f^{a_0}_{k,m_0}(j,j_0)\Big).
\end{aligned}
$$
Conditioning on $Z_{m,L_1}(j,j')$ and $Y_{L_1,m}(j,j')$ in $E[e^{-sU_{i-1}} \mid (N_n,J_n)=(k,j),\ Y_{k,m_0}(j,j_0)=a_0]$, the summation below is finite:
$$
E[e^{-sU_{i-1}} \mid (N_n,J_n)=(k,j)] = \sum_{l=0}\big(A^1_l(k,j) + B^1_l(k,j)\big) \triangleq W^{i-1}_{k,j}(s), \quad k = 1,2,\cdots,L_1,\ 1 \le j \le N,
$$
where
$$
A^1_0(k,j) = e^{-s(i-1)D_1}\sum_{m_0=L_1+1}^{B-1}\Big[e^T - \sum_{a_0=1}^{i-1}f^{a_0}_{k,m_0}e\Big]_j, \qquad B^1_0(k,j) \triangleq 0,
$$
$$
A^1_l(k,j) = \sum_{m_0}\cdots\sum_{m_l}\sum_{a_0=1}^{i-1}\sum_{b_1=1}^{i-1-a_0}\cdots\sum_{b_l=1}^{i-1-\sum_0^{l-1}a_n-\sum_1^{l-1}b_n}
e^{-s\left((i-1)D_1+(D_2-D_1)\sum_1^{l}b_n\right)}
\cdot f^{a_0}_{k,m_0}\prod_{r=1}^{l-1}\big\{g^{b_r}_{m_{r-1},L_1}f^{a_r}_{L_1,m_r}\big\}\Big\{e^T - \sum_{a_l=1}^{i-1-\sum_0^{l-1}a_n-\sum_1^{l}b_n}f^{a_l}_{L_1,m_l}e\Big\}_j,
$$
$$
B^1_l(k,j) = \sum_{m_0}\cdots\sum_{m_{l-1}}\sum_{a_0=1}^{i-1}\sum_{b_1=1}^{i-1-a_0}\cdots\sum_{a_{l-1}=1}^{i-1-\sum_0^{l-2}a_n-\sum_1^{l-1}b_n}
e^{-s\left((i-1)D_2+(D_1-D_2)\sum_0^{l-1}a_n\right)}
\cdot f^{a_0}_{k,m_0}\prod_{r=1}^{l-1}\big\{g^{b_r}_{m_{r-1},L_1}f^{a_r}_{L_1,m_r}\big\}\Big\{e^T - \sum_{b_l=1}^{i-1-\sum_0^{l-1}a_n-\sum_1^{l-1}b_n}g^{b_l}_{m_{l-1},L_1}e\Big\}_j,
$$
and $[X]_j$ denotes the $j$-th component of the row vector $X$.

For $L_1+1 \le k \le B-1$,
$$
E[e^{-sU_{i-1}} \mid (N_n,J_n)=(k,j)] = \sum_{l=0}\big(\bar A^1_l(k,j) + \bar B^1_l(k,j)\big) \triangleq W^{i-1}_{k,j}(s), \quad 1 \le j \le N,
$$
where
$$
\bar A^1_0(k,j) \triangleq 0, \qquad \bar B^1_0(k,j) = e^{-s(i-1)D_2}\Big[e^T - \sum_{b_0=1}^{i-1}g^{b_0}_{k,L_1}e\Big]_j,
$$
$$
\bar A^1_l(k,j) = \sum_{m_1}\cdots\sum_{m_l}\sum_{b_0=1}^{i-1}\sum_{a_1=1}^{i-1-b_0}\cdots\sum_{b_{l-1}=1}^{i-1-\sum_1^{l-1}a_n-\sum_0^{l-2}b_n}
e^{-s\left((i-1)D_1+(D_2-D_1)\sum_0^{l-1}b_n\right)}
\cdot g^{b_0}_{k,L_1}\prod_{r=1}^{l-1}\big\{f^{a_r}_{L_1,m_r}g^{b_r}_{m_r,L_1}\big\}\Big\{e^T - \sum_{a_l=1}^{i-1-\sum_1^{l-1}a_n-\sum_0^{l-1}b_n}f^{a_l}_{L_1,m_l}e\Big\}_j,
$$
$$
\bar B^1_l(k,j) = \sum_{m_1}\cdots\sum_{m_l}\sum_{b_0=1}^{i-1}\sum_{a_1=1}^{i-1-b_0}\cdots\sum_{a_l=1}^{i-1-\sum_1^{l-1}a_n-\sum_0^{l-1}b_n}
e^{-s\left((i-1)D_2+(D_1-D_2)\sum_1^{l}a_n\right)}
\cdot g^{b_0}_{k,L_1}\prod_{r=1}^{l-1}\big\{f^{a_r}_{L_1,m_r}g^{b_r}_{m_r,L_1}\big\}f^{a_l}_{L_1,m_l}\Big\{e^T - \sum_{b_l=1}^{i-1-\sum_1^{l}a_n-\sum_0^{l-1}b_n}g^{b_l}_{m_l,L_1}e\Big\}_j.
$$
Since the service time may change according to the buffer occupancy, we must know the number of cell pairs arriving from an arbitrary time $\tau$ to the service completion epoch. Consider the following joint probabilities:
$$
P\{N(\tau)=0,\ \text{an arrival occurs in}\ (\tau,\tau+d\tau)\} = y_0\Lambda e\,d\tau.
$$
For $1 \le n \le B-1$, $n+l < B-1$,
$$
P\{N(\tau)=n,\ N_{k+1}=n+l,\ J_{k+1}=j,\ R(\tau)=1,\ \text{an arrival occurs in}\ (\tau,\tau+d\tau),\ t < T \le t+dt,\ \delta=1\}
$$
$$
= \frac{1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda P(n-1,D_1-t)\Lambda P(l,t)\,d\tau\,dt
+ \sum_{i=1}^{\min(n,L_1)}x_i P(n-i,D_1-t)\Lambda P(l,t)\,d\tau\,dt\Big]_j.
$$
For $1 \le n \le B-1$,
$$
P\{N(\tau)=n,\ N_{k+1}=B-1,\ J_{k+1}=j,\ R(\tau)=1,\ \text{an arrival occurs in}\ (\tau,\tau+d\tau),\ t < T \le t+dt,\ \delta=1\}
$$
$$
= \frac{1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda P(n-1,D_1-t)\Lambda\bar P(B-n-1,t)\,d\tau\,dt
+ \sum_{i=1}^{\min(n,L_1)}x_i P(n-i,D_1-t)\Lambda\bar P(B-n-1,t)\,d\tau\,dt\Big]_j,
$$
where $\bar P(k,t) = \sum_{l=k}^{\infty}P(l,t)$.
For $L_1+1 \le n \le B-1$, $n+l < B-1$,
$$
P\{N(\tau)=n,\ N_{k+1}=n+l,\ J_{k+1}=j,\ R(\tau)=2,\ \text{an arrival occurs in}\ (\tau,\tau+d\tau),\ t < T \le t+dt,\ \delta=1\}
= \frac{1}{C_1}\Big[\sum_{i=L_1+1}^{n}x_i P(n-i,D_2-t)\Lambda P(l,t)\,d\tau\,dt\Big]_j.
$$
For $L_1+1 \le n \le B-1$,
$$
P\{N(\tau)=n,\ N_{k+1}=B-1,\ J_{k+1}=j,\ R(\tau)=2,\ \text{an arrival occurs in}\ (\tau,\tau+d\tau),\ t < T \le t+dt,\ \delta=1\}
= \frac{1}{C_1}\Big[\sum_{i=L_1+1}^{n}x_i P(n-i,D_2-t)\Lambda\bar P(B-n-1,t)\,d\tau\,dt\Big]_j.
$$
By combining the above results, we obtain the Laplace transform for the waiting time of a cell pair:
$$
\begin{aligned}
E[e^{-sW}] = \frac{1}{(1-P_{loss})\pi\Lambda e}\Big[y_0\Lambda e
&+ \frac{1}{C_1}\Big\{\sum_{n=1}^{B-2}\sum_{l=0}^{B-n-2}x_0(\Lambda-Q)^{-1}\Lambda\int_0^{D_1}e^{-st}P(n-1,D_1-t)\Lambda P(l,t)\,dt\,W^{n-1}_{n+l}(s)\\
&+ \sum_{n=1}^{B-1}x_0(\Lambda-Q)^{-1}\Lambda\int_0^{D_1}e^{-st}P(n-1,D_1-t)\Lambda\bar P(B-n-1,t)\,dt\,W^{n-1}_{B-1}(s)\\
&+ \sum_{i=1}^{L_1}\sum_{n=i}^{B-2}\sum_{l=0}^{B-n-2}\int_0^{D_1}e^{-st}x_i P(n-i,D_1-t)\Lambda P(l,t)\,dt\,W^{n-1}_{n+l}(s)\\
&+ \sum_{i=1}^{L_1}\sum_{n=i}^{B-1}\int_0^{D_1}e^{-st}x_i P(n-i,D_1-t)\Lambda\bar P(B-n-1,t)\,dt\,W^{n-1}_{B-1}(s)\\
&+ \sum_{i=L_1+1}^{B-2}\sum_{n=i}^{B-2}\sum_{l=0}^{B-n-2}\int_0^{D_2}e^{-st}x_i P(n-i,D_2-t)\Lambda P(l,t)\,dt\,W^{n-1}_{n+l}(s)\\
&+ \sum_{i=L_1+1}^{B-1}\sum_{n=i}^{B-1}\int_0^{D_2}e^{-st}x_i P(n-i,D_2-t)\Lambda\bar P(B-n-1,t)\,dt\,W^{n-1}_{B-1}(s)\Big\}\Big].
\end{aligned}
$$
References

[1] B. D. Choi and D. I. Choi, The queueing system with queue length dependent service times and its application to cell discarding scheme in ATM networks, IEE Proc. Commun., vol. 143, no. 1, pp. 5-11, 1996.
[2] B. D. Choi, D. I. Choi, Young Chul Kim and Dan Keun Sung, An analysis of M, MMPP/G/1 queues with QLT scheduling policy and Bernoulli schedule, IEICE Transactions on Communications, vol. E81-B, no. 1, pp. 13-22, 1998.
[3] D. I. Choi, C. Knessl and C. Tier, A queueing system with queue length dependent service times, with applications to cell discarding in ATM networks, J. Applied Mathematics and Stochastic Analysis, vol. 20, no. 1, pp. 35-62, 1999.

Department of Mathematics,
Halla Institute of Technology
220-840 Wonju-shi, Kangwon-do, Korea
J. KSIAM Vol.3, No.2, 81-97, 1999
TIME OPTIMAL CONTROL PROBLEM OF
RETARDED SEMILINEAR SYSTEMS WITH
UNBOUNDED OPERATORS IN HILBERT SPACES
Jong-Yeoul Park, Jin-Mun Jeong, Yong-Han Kang

Abstract. This paper deals with the time optimal control problem for a retarded semilinear system, using the construction of the fundamental solution in the case where the principal operators are unbounded.
1. Introduction
Let $H$ and $V$ be complex Hilbert spaces such that the embedding $V \subset H$ is continuous. In this paper we deal with the time optimal control problem governed by the following semilinear parabolic type equation in the Hilbert space $H$:
$$
\text{(RSE)}\qquad
\begin{cases}
\dfrac{d}{dt}x(t) = A_0x(t) + A_1x(t-h) + \displaystyle\int_{-h}^{0}a(s)A_2x(t+s)\,ds + f(t,x(t)) + k(t),\\
x(0) = \phi^0, \qquad x(s) = \phi^1(s), \quad -h \le s < 0.
\end{cases}
$$
Let $A_0$ be the operator associated with a bounded sesquilinear form defined on $V \times V$ and satisfying Gårding's inequality. Then $A_0$ generates an analytic semigroup $S(t)$ in both $H$ and $V^*$, and so the equation (RSE) may be considered as an equation in both $H$ and $V^*$.

Let $(\phi^0,\phi^1) \in H \times L^2(0,T;V)$ and let $x(T;\phi,f,u)$ be a solution of the system (RSE) associated with the nonlinear term $f$ and the control $u$ at time $T$.

We now define the fundamental solution $W(t)$ of (RSE) by
$$
W(t) = \begin{cases}x(t;(\phi^0,0),0,0), & t \ge 0,\\ 0, & t < 0.\end{cases}
$$

1991 Mathematics Subject Classification. Primary 35B37; Secondary 93C20.
Key words and phrases. semilinear evolution equation, regularity, optimal control, compact imbedding.
According to the above definition, $W(t)$ is the unique solution of
$$
W(t) = S(t) + \int_0^t S(t-s)\Big\{A_1W(s-h) + \int_{-h}^{0}a(\tau)A_2W(s+\tau)\,d\tau\Big\}ds
$$
for $t \ge 0$ (cf. Nakagiri [5]). Under the conditions that $a(\cdot) \in L^2(-h,0;\mathbf R)$ and $A_i$ $(i=1,2)$ are bounded linear operators on $H$ into itself, S. Nakagiri [5] treated the standard optimal control problems and the time optimal control problem for the linear retarded system (RSE) with $f \equiv 0$ in a Banach space. If $A_i$ $(i=0,1,2): D(A_0) \subset H \to H$ are unbounded operators, G. Di Blasio, K. Kunisch and E. Sinestrari [2] obtained global existence and uniqueness of the strict solution for the linear retarded system in Hilbert spaces. With the more general Lipschitz continuity of the nonlinear operator $f$ from $\mathbf R \times V$ to $H$, the authors of [4] established existence and uniqueness of solutions of the given system. But we cannot immediately obtain the time optimal control result as in [5, Section 8] without the boundedness of the fundamental solution $W(t)$. Since the integrand $A_0S(t-s)$ has a singularity at $t = s$, we cannot solve the integral equation for $W(t)$ directly. In [6], H. Tanabe investigated the fundamental solution $W(t)$ by constructing the resolvent operators for integrodifferential equations of Volterra type (see (3.14), (3.21) of [6]) under the condition that $a(\cdot)$ is real valued and Hölder continuous on $[-h,0]$.

This paper deals with the time optimal control problem by using the construction of the fundamental solution, obtaining the same results as [5], in the case where the principal operators $A_i$ $(i=0,1,2)$ are unbounded.
2. Retarded semilinear equations

The inner product and norm in $H$ are denoted by $(\cdot,\cdot)$ and $|\cdot|$. The notations $\|\cdot\|$ and $\|\cdot\|_*$ denote the norms of $V$ and $V^*$ as usual, respectively. Hence we may regard that
$$
(2.1)\qquad \|u\|_* \le |u| \le \|u\|, \quad u \in V.
$$
Let $a(\cdot,\cdot)$ be a bounded sesquilinear form defined on $V \times V$ satisfying Gårding's inequality
$$
(2.2)\qquad \operatorname{Re}a(u,u) \ge c_0\|u\|^2 - c_1|u|^2, \quad c_0 > 0,\ c_1 \ge 0.
$$
Let $A_0$ be the operator associated with the sesquilinear form $-a(\cdot,\cdot)$:
$$
(A_0u,v) = -a(u,v), \quad u,v \in V.
$$
It follows from (2.2) that for every $u \in V$,
$$
\operatorname{Re}((c_1-A_0)u,u) \ge c_0\|u\|^2.
$$
Then $A_0$ is a bounded linear operator from $V$ to $V^*$, and its realization in $H$, which is the restriction of $A_0$ to
$$
D(A_0) = \{u \in V;\ A_0u \in H\},
$$
is also denoted by $A_0$. Then $A_0$ generates an analytic semigroup in both $H$ and $V^*$. Hence we may assume that there exists a constant $C_0$ such that
$$
(2.3)\qquad \|u\| \le C_0\|u\|_{D(A_0)}^{1/2}|u|^{1/2}
$$
for every $u \in D(A_0)$, where
$$
\|u\|_{D(A_0)} = (|A_0u|^2 + |u|^2)^{1/2}
$$
is the graph norm of $D(A_0)$.

First, we introduce the following linear retarded functional differential equation:
$$
\text{(RE)}\qquad
\begin{cases}
\dfrac{d}{dt}x(t) = A_0x(t) + A_1x(t-h) + \displaystyle\int_{-h}^{0}a(s)A_2x(t+s)\,ds + k(t),\\
x(0) = \phi^0, \qquad x(s) = \phi^1(s), \quad -h \le s < 0.
\end{cases}
$$
Here, the operators $A_1$ and $A_2$ are bounded linear operators from $V$ to $V^*$ such that their restrictions to $D(A_0)$ are bounded linear operators from $D(A_0)$ to $H$. The function $a(\cdot)$ is assumed to be real valued and Hölder continuous on the interval $[-h,0]$. Let $W(\cdot)$ be the fundamental solution of the linear equation associated with (RE), which is the operator-valued function satisfying
$$
(2.4)\qquad
\begin{aligned}
W(t) &= S(t) + \int_0^t S(t-s)\Big\{A_1W(s-h) + \int_{-h}^{0}a(\tau)A_2W(s+\tau)\,d\tau\Big\}ds, \quad t > 0,\\
W(0) &= I, \qquad W(s) = 0, \quad -h \le s < 0,
\end{aligned}
$$
where $S(\cdot)$ is the semigroup generated by $A_0$. Then
$$
(2.5)\qquad
\begin{aligned}
x(t) &= W(t)\phi^0 + \int_{-h}^{0}U_t(s)\phi^1(s)\,ds + \int_0^t W(t-s)k(s)\,ds,\\
U_t(s) &= W(t-s-h)A_1 + \int_{-h}^{s}W(t-s+\tau)a(\tau)A_2\,d\tau.
\end{aligned}
$$
Recalling the formulation of mild solutions, we know that the mild solution of (RE) is also represented by
$$
x(t) = \begin{cases}
S(t)\phi^0 + \displaystyle\int_0^t S(t-s)\Big\{A_1x(s-h) + \int_{-h}^{0}a(\tau)A_2x(s+\tau)\,d\tau + k(s)\Big\}ds, & t > 0,\\
\phi^1(s), & -h \le s < 0.
\end{cases}
$$
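For a scalar stand-in for (RE) with no distributed delay ($A_2 = 0$), the mild-solution formula can be evaluated directly: on $[0,h]$ the history term is known, so $x(t) = S(t)\phi^0 + \int_0^t S(t-s)A_1\phi^1(s-h)\,ds$. The sketch below (all coefficients made up) compares this closed form with a quadrature evaluation of the variation-of-constants integral:

```python
import math

# Mild-solution formula for a scalar toy case of (RE):
#   x'(t) = a0 x(t) + a1 x(t - h),  x(0) = 1,  x(s) = 1 for -h <= s < 0,
# with a(.) = 0 (no distributed delay), so S(t) = exp(a0 t) and, on [0, h],
#   x(t) = S(t) x(0) + int_0^t S(t - s) a1 x(s - h) ds
#        = e^{a0 t} + (a1 / a0)(e^{a0 t} - 1).
# The coefficients are made-up illustrative numbers.

a0, a1, h = -1.0, 0.5, 1.0

def x_closed(t):
    """Closed form of the mild solution on [0, h]."""
    return math.exp(a0 * t) + (a1 / a0) * (math.exp(a0 * t) - 1.0)

def x_quadrature(t, n=20000):
    """Evaluate the variation-of-constants integral by the midpoint rule."""
    ds = t / n
    integral = sum(math.exp(a0 * (t - (i + 0.5) * ds)) * a1 * 1.0
                   for i in range(n)) * ds    # history value x(s - h) = 1 on [0, t]
    return math.exp(a0 * t) + integral

x1, x2 = x_closed(0.8), x_quadrature(0.8)
```

Beyond $t = h$ the same formula is applied interval by interval (the method of steps), with the just-computed segment serving as the new history.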
From Theorem 1 in [6] we obtain the following results.
Proposition 2.1. The fundamental solution $W(t)$ of (RE) exists uniquely. The functions $A_0W(t)$ and $dW(t)/dt$ are strongly continuous except at $t = nh$, $n = 0,1,2,\ldots$, and the following inequalities hold for $i = 0,1,2$ and $n = 0,1,2,\ldots$:
$$
(2.6)\qquad |A_iW(t)| \le C_n/(t-nh),
$$
$$
(2.7)\qquad |dW(t)/dt| \le C_n/(t-nh),
$$
$$
(2.8)\qquad |A_iW(t)A_0^{-1}| \le C_n
$$
in $(nh,(n+1)h)$, and
$$
(2.9)\qquad \Big|\int_t^{t'}A_iW(\tau)\,d\tau\Big| \le C_n
$$
for $nh \le t < t' \le (n+1)h$. Let $\rho$ be the order of Hölder continuity of $a(\cdot)$. Then for $nh \le t < t' \le (n+1)h$ and $0 < \gamma < \rho$,
$$
(2.10)\qquad |W(t')-W(t)| \le C_{n,\gamma}(t'-t)^{\gamma}(t-nh)^{-\gamma},
$$
$$
(2.11)\qquad |A_i(W(t')-W(t))| \le C_{n,\gamma}(t'-t)^{\gamma}(t-nh)^{-\gamma-1},
$$
$$
(2.12)\qquad |A_i(W(t')-W(t))A_0^{-1}| \le C_{n,\gamma}(t'-t)^{\gamma}(t-nh)^{-\gamma},
$$
where $C_n$ and $C_{n,\gamma}$ are constants depending on $n$ and on $(n,\gamma)$, respectively, but not on $t$ and $t'$.
Considering (RE) as an equation in $V^*$, we also obtain the same norm estimates (2.6)-(2.12) in the space $V^*$. By virtue of Theorem 3.3 of [2] we have the following result on the linear equation (RE).

Proposition 2.2. 1) Let $F = (D(A_0),H)_{1/2,2}$, where $(D(A_0),H)_{1/2,2}$ denotes the real interpolation space between $D(A_0)$ and $H$. For $(\phi^0,\phi^1) \in F \times L^2(-h,0;D(A_0))$ and $k \in L^2(0,T;H)$, $T > 0$, there exists a unique solution $x$ of (RE) belonging to
$$
L^2(-h,T;D(A_0)) \cap W^{1,2}(0,T;H) \subset C([0,T];F)
$$
and satisfying
$$
(2.13)\qquad \|x\|_{L^2(-h,T;D(A_0))\cap W^{1,2}(0,T;H)} \le C_1'\big(\|\phi^0\|_F + \|\phi^1\|_{L^2(-h,0;D(A_0))} + \|k\|_{L^2(0,T;H)}\big),
$$
where $C_1'$ is a constant depending on $T$.
2) Let $(\phi^0,\phi^1) \in H \times L^2(-h,0;V)$ and $k \in L^2(0,T;V^*)$, $T > 0$. Then there exists a unique solution $x$ of (RE) belonging to
$$
L^2(-h,T;V) \cap W^{1,2}(0,T;V^*) \subset C([0,T];H)
$$
and satisfying
$$
(2.14)\qquad \|x\|_{L^2(-h,T;V)\cap W^{1,2}(0,T;V^*)} \le C_1'\big(|\phi^0| + \|\phi^1\|_{L^2(-h,0;V)} + \|k\|_{L^2(0,T;V^*)}\big).
$$
In what follows we assume that
$$
\|W(t)\| \le M, \quad t > 0,
$$
for the sake of simplicity.
Proposition 2.3. Let $k \in L^2(0,T;H)$ and $x(t) = \int_0^t W(t-s)k(s)\,ds$. Then there exists a constant $C_1'$ such that for $T > 0$,
$$
(2.15)\qquad \|x\|_{L^2(0,T;D(A_0))} \le C_1'\|k\|_{L^2(0,T;H)},
$$
$$
(2.16)\qquad \|x\|_{L^2(0,T;H)} \le MT\|k\|_{L^2(0,T;H)},
$$
and
$$
(2.17)\qquad \|x\|_{L^2(0,T;V)} \le (C_1'MT)^{1/2}\|k\|_{L^2(0,T;H)}.
$$
Proof. The assertion (2.15) is immediately obtained from Proposition 2.2 for the equation (RE) with $(\phi^0,\phi^1) = (0,0)$. Since
$$
\|x\|_{L^2(0,T;H)}^2 = \int_0^T\Big|\int_0^t W(t-s)k(s)\,ds\Big|^2dt
\le M^2\int_0^T\Big(\int_0^t|k(s)|\,ds\Big)^2dt
\le M^2\int_0^T t\int_0^t|k(s)|^2\,ds\,dt
\le \frac{M^2T^2}{2}\int_0^T|k(s)|^2\,ds,
$$
it follows that $\|x\|_{L^2(0,T;H)} \le MT\|k\|_{L^2(0,T;H)}$. From (2.3), (2.15), and (2.16) it holds that
$$
\|x\|_{L^2(0,T;V)} \le (C_1'MT)^{1/2}\|k\|_{L^2(0,T;H)}. \qquad \square
$$
Let $f$ be a nonlinear mapping from $\mathbf R \times V$ into $H$. We assume that there exists a constant $L > 0$ such that for any $x_1, x_2 \in V$,
$$
\text{(F1)}\qquad |f(t,x_1) - f(t,x_2)| \le L\|x_1 - x_2\|,
$$
$$
\text{(F2)}\qquad f(t,0) = 0.
$$
The following result on (RSE) is obtained from Theorem 2.1 in [4].

Proposition 2.4. Suppose that the assumptions (F1), (F2) are satisfied. Then for any $(\phi^0,\phi^1) \in H \times L^2(-h,0;V)$ and $k \in L^2(0,T;V^*)$, $T > 0$, the solution $x$ of (RSE) exists and is unique in $L^2(-h,T;V) \cap W^{1,2}(0,T;V^*)$, and there exists a constant $C_2'$ depending on $T$ such that
$$
(2.18)\qquad \|x\|_{L^2(-h,T;V)\cap W^{1,2}(0,T;V^*)} \le C_2'\big(1 + |\phi^0| + \|\phi^1\|_{L^2(-h,0;V)} + \|k\|_{L^2(0,T;V^*)}\big).
$$
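The convolution bound (2.16) is easy to check numerically in a scalar setting: take any kernel with $|W(t)| \le M$ and any forcing $k$, and compare the discretized $L^2$ norms. The kernel and forcing below are made-up stand-ins:

```python
import math

# Sanity check of the bound (2.16): for x(t) = int_0^t W(t-s) k(s) ds with
# |W(t)| <= M one has ||x||_{L2(0,T)} <= M T ||k||_{L2(0,T)}.
# Scalar stand-ins: W(t) = M e^{-t} (so |W| <= M) and k(s) = cos(3 s).

M, T, n = 2.0, 1.5, 800
dt = T / n
t_grid = [(i + 0.5) * dt for i in range(n)]

def W(t): return M * math.exp(-t)
def k(s): return math.cos(3.0 * s)

def x(t):
    """Midpoint-rule evaluation of the convolution int_0^t W(t-s) k(s) ds."""
    m = max(1, int(t / dt))
    ds = t / m
    return sum(W(t - (j + 0.5) * ds) * k((j + 0.5) * ds) for j in range(m)) * ds

norm_x = math.sqrt(sum(x(t) ** 2 for t in t_grid) * dt)
norm_k = math.sqrt(sum(k(t) ** 2 for t in t_grid) * dt)
```

The bound is far from tight here (the oscillating forcing averages out under the convolution), which is consistent with (2.16) being a crude Cauchy-Schwarz estimate.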
3. Lemmas for fundamental solutions

For the sake of simplicity we assume that $S(t)$ is uniformly bounded. Then
$$
(3.1)\qquad |S(t)| \le M_0\ (t \ge 0), \qquad |A_0S(t)| \le M_0/t\ (t > 0), \qquad |A_0^2S(t)| \le K/t^2\ (t > 0)
$$
for some constant $M_0$ (e.g., [6]). We also assume that $a(\cdot)$ is Hölder continuous of order $\rho$:
$$
(3.2)\qquad |a(\tau)| \le H_0, \qquad |a(s)-a(\tau)| \le H_1(s-\tau)^{\rho}
$$
for some constants $H_0, H_1$.

Lemma 3.1. For $0 < s < t$ and $0 < \gamma < 1$,
$$
(3.3)\qquad |S(t)-S(s)| \le \frac{M_0}{\gamma}\Big(\frac{t-s}{s}\Big)^{\gamma},
$$
$$
(3.4)\qquad |A_0S(t)-A_0S(s)| \le M_0(t-s)^{\gamma}s^{-\gamma-1}.
$$
Proof. From (3.1), for $0 < s < t$,
$$
(3.5)\qquad |S(t)-S(s)| = \Big|\int_s^t A_0S(\tau)\,d\tau\Big| \le M_0\log\frac{t}{s}.
$$
It is easily seen that for any $t > 0$ and $0 < \gamma < 1$,
$$
(3.6)\qquad \log(1+t) \le t^{\gamma}/\gamma.
$$
Combining (3.6) with (3.5) we get (3.3). For $0 < s < t$,
$$
(3.7)\qquad |A_0S(t)-A_0S(s)| = \Big|\int_s^t A_0^2S(\tau)\,d\tau\Big| \le M_0(t-s)/(ts).
$$
Noting that $(t-s)/s \le ((t-s)/s)^{\gamma}$ for $0 < \gamma < 1$, we obtain (3.4) from (3.7). $\square$
According to Tanabe [6] we set
$$
(3.8)\qquad V(t) = \begin{cases}
A_0\big(W(t)-S(t)\big), & t \in (0,h],\\
A_0\Big(W(t)-\displaystyle\int_{nh}^{t}S(t-s)A_1W(s-h)\,ds\Big), & t \in (nh,(n+1)h]\ (n = 1,2,\ldots).
\end{cases}
$$
For $0 < t \le h$,
$$
W(t) = S(t) + A_0^{-1}V(t),
$$
and from (2.4) we have
$$
W(t) = S(t) + \int_0^t\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2W(\sigma)\,d\sigma.
$$
Hence,
$$
V(t) = V_0(t) + \int_0^t A_0\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2A_0^{-1}V(\sigma)\,d\sigma,
$$
where
$$
V_0(t) = \int_0^t A_0\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2S(\sigma)\,d\sigma.
$$
For $nh \le t \le (n+1)h$ $(n = 0,1,2,\ldots)$ the fundamental solution $W(t)$ is represented by
$$
\begin{aligned}
W(t) = S(t) &+ \int_{nh}^{t}S(t-s)A_1W(s-h)\,ds
+ \int_0^{t-h}\int_{\sigma}^{\sigma+h}S(t-s)a(\sigma-s)\,ds\,A_2W(\sigma)\,d\sigma\\
&+ \int_{t-h}^{nh}\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2W(\sigma)\,d\sigma
+ \int_{nh}^{t}\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2W(\sigma)\,d\sigma.
\end{aligned}
$$
The integral equation satisfied by (3.8) is
$$
V(t) = V_0(t) + \int_{nh}^{t}A_0\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2A_0^{-1}V(\sigma)\,d\sigma,
$$
where
$$
\begin{aligned}
V_0(t) = A_0S(t) &+ A_0\int_h^{nh}S(t-s)A_1W(s-h)\,ds
+ \int_0^{t-h}A_0\int_{\sigma}^{\sigma+h}S(t-s)a(\sigma-s)\,ds\,A_2W(\sigma)\,d\sigma\\
&+ \int_{t-h}^{nh}A_0\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2W(\sigma)\,d\sigma\\
&+ \int_{nh}^{t}A_0\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\,d\sigma.
\end{aligned}
$$
Thus, the integral equation (3.8) can be solved by successive approximation, and $V(t)$ is uniformly bounded on $[nh,(n+1)h]$ (e.g., (3.16) and the part preceding (3.40) in [6]). It is not difficult to show that for $n > 1$,
$$
V(nh+0) \ne V(nh-0) \quad\text{and}\quad W(nh+0) = W(nh-0).
$$
Moreover, we obtain the following result.
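The successive-approximation argument above is the standard Picard iteration for a Volterra equation of the second kind. A scalar sketch (with made-up $V_0$ and kernel $K$): for $V_0 \equiv 1$ and $K \equiv -1$ the exact solution of $V(t) = 1 - \int_0^t V(s)\,ds$ is $V(t) = e^{-t}$, which the iteration reproduces on a grid:

```python
# Successive approximation for a scalar Volterra equation of the second kind,
#   V(t) = V0(t) + int_0^t K(t, s) V(s) ds,
# mirroring how the operator equation for V(t) is solved above.
# V0 and K are made-up smooth stand-ins.

def solve_volterra(V0, K, T, n=200, iters=60):
    """Fixed-point iteration V_{m+1} = V0 + int K V_m on a uniform grid."""
    dt = T / n
    t = [i * dt for i in range(n + 1)]
    V = [V0(ti) for ti in t]                                  # initial guess
    for _ in range(iters):
        Vn = []
        for i, ti in enumerate(t):
            integral = sum(K(ti, t[j]) * V[j] for j in range(i)) * dt  # left rule
            Vn.append(V0(ti) + integral)
        V = Vn
    return t, V

t, V = solve_volterra(lambda t: 1.0, lambda t, s: -1.0, 1.0)
```

The Picard iterates converge for any bounded kernel because the $m$-th remainder is of order $T^m/m!$; the only error left after a few dozen sweeps is the quadrature error of the grid.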
Lemma 3.2. There exists a constant $C_n' > 0$ such that
$$
(3.9)\qquad \Big|\int_{nh}^{t}a(\sigma-s)A_iW(\sigma)\,d\sigma\Big| \le C_n', \quad i = 1,2,
$$
for $n = 0,1,2,\ldots$, $t \in [nh,(n+1)h]$ and $t \le s \le t+h$.

Proof. For $t \in [0,h]$ (i.e., $n = 0$), from (3.8) it follows that
$$
\begin{aligned}
\int_0^t a(\sigma-s)A_iW(\sigma)\,d\sigma
&= \int_0^t a(\sigma-s)A_iA_0^{-1}\big(A_0S(\sigma)+V(\sigma)\big)\,d\sigma\\
&= \int_0^t\big(a(\sigma-s)-a(-s)\big)A_iA_0^{-1}A_0S(\sigma)\,d\sigma + a(-s)A_iA_0^{-1}\big(S(t)-I\big)
+ \int_0^t a(\sigma-s)A_iA_0^{-1}V(\sigma)\,d\sigma.
\end{aligned}
$$
Noting that
$$
\Big|\int_0^t\big(a(\sigma-s)-a(-s)\big)A_iA_0^{-1}A_0S(\sigma)\,d\sigma\Big| \le M_0H_1|A_iA_0^{-1}|\int_0^t\sigma^{\rho-1}\,d\sigma,
$$
we have
$$
\Big|\int_0^t a(\sigma-s)A_iW(\sigma)\,d\sigma\Big| \le |A_iA_0^{-1}|\Big\{h^{\rho}M_0H_1 + H_0(M_0+1) + hH_0\sup_{0\le t\le h}|V(t)|\Big\}.
$$
Thus the assertion (3.9) holds on $[0,h]$. For $t \in [nh,(n+1)h]$, $n \ge 1$,
$$
(3.10)\qquad \int_{nh}^{t}a(\sigma-s)A_iW(\sigma)\,d\sigma = \int_{nh}^{t}a(\sigma-s)A_iA_0^{-1}V(\sigma)\,d\sigma
+ \int_{nh}^{t}a(\sigma-s)A_i\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\,d\sigma.
$$
The first term on the right of (3.10) is estimated as
$$
\Big|\int_{nh}^{t}a(\sigma-s)A_iA_0^{-1}V(\sigma)\,d\sigma\Big| \le hH_0|A_iA_0^{-1}|\sup_{nh\le t\le(n+1)h}|V(t)|.
$$
Let $\mu = (\sigma+nh)/2$ for $nh < \sigma < (n+1)h$. Then
$$
(3.11)\qquad
\begin{aligned}
\Big|A_0&\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\Big|\\
&\le \Big|\int_{\mu}^{\sigma}A_0S(\sigma-\eta)\big(A_1W(\eta-h)-A_1W(\sigma-h)\big)\,d\eta
+ \big(S((\sigma-nh)/2)-I\big)A_1W(\sigma-h)\\
&\qquad + \int_{nh}^{\mu}\big(A_0S(\sigma-\eta)-A_0S(\sigma-nh)\big)A_1W(\eta-h)\,d\eta
+ A_0S(\sigma-nh)\int_{nh}^{\mu}A_1W(\eta-h)\,d\eta\Big|\\
&\le \int_{\mu}^{\sigma}\frac{M_0}{\sigma-\eta}C_{n-1,\gamma}(\sigma-\eta)^{\gamma}(\eta-nh)^{-\gamma-1}\,d\eta
+ (M_0+1)\frac{C_{n-1}}{\sigma-nh}
+ \int_{nh}^{\mu}\frac{M_0(\eta-nh)}{(\sigma-\eta)(\sigma-nh)}\cdot\frac{C_{n-1}}{\eta-nh}\,d\eta
+ \frac{M_0C_{n-1}}{\sigma-nh}\\
&\le M_0C_{n-1,\gamma}\int_{nh}^{\sigma}(\sigma-\eta)^{\gamma-1}(\eta-nh)^{-\gamma}\,d\eta\,\frac{2}{\sigma-nh}
+ (2M_0+1)\frac{C_{n-1}}{\sigma-nh}
+ \frac{M_0C_{n-1}}{\sigma-nh}\log 2\\
&= \big\{2M_0C_{n-1,\gamma}B(\gamma,1-\gamma) + (2M_0+1+M_0\log 2)C_{n-1}\big\}/(\sigma-nh)
\le C_{n,\gamma}^0/(\sigma-nh),
\end{aligned}
$$
where $B(\cdot,\cdot)$ is the Beta function. Noting that
$$
\frac{d}{d\sigma}\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta = A_1W(\sigma-h) + A_0\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta,
$$
and integrating this equality over $[nh,t]$,
$$
(3.12)\qquad \int_{nh}^{t}A_0\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\,d\sigma
= \int_{nh}^{t}S(t-\eta)A_1W(\eta-h)\,d\eta - \int_{nh}^{t}A_1W(\eta-h)\,d\eta.
$$
By Lemma 3.1 and the induction hypothesis, the first term on the right of (3.12) is estimated as
$$
(3.13)\qquad
\begin{aligned}
\Big|\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\Big|
&= \Big|\int_{nh}^{\sigma}\big(S(\sigma-\eta)-S(\sigma-nh)\big)A_1W(\eta-h)\,d\eta
+ S(\sigma-nh)\int_{nh}^{\sigma}A_1W(\eta-h)\,d\eta\Big|\\
&\le \int_{nh}^{\sigma}M_0\log\frac{\sigma-nh}{\sigma-\eta}\cdot\frac{C_{n-1}}{\eta-nh}\,d\eta + M_0C_{n-1}
\le M_0C_{n-1}c_0 + M_0C_{n-1},
\end{aligned}
$$
where
$$
c_0 = \int_0^1\log\frac{1}{1-\xi}\,\frac{d\xi}{\xi}.
$$
Thus, combining the above inequality with (2.9), we get
$$
(3.14)\qquad \Big|\int_{nh}^{t}A_0\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\,d\sigma\Big| \le (M_0c_0+M_0+1)C_{n-1}.
$$
Therefore, from (3.11) and (3.14) the second term on the right of (3.10) is estimated as
$$
\begin{aligned}
\Big|\int_{nh}^{t}&a(\sigma-s)A_i\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\,d\sigma\Big|\\
&= \Big|\int_{nh}^{t}\big(a(\sigma-s)-a(nh-s)\big)A_i\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\,d\sigma
+ a(nh-s)\int_{nh}^{t}A_i\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\,d\sigma\Big|\\
&\le \int_{nh}^{t}H_1(\sigma-nh)^{\rho}|A_iA_0^{-1}|\,C_{n,\gamma}^0(\sigma-nh)^{-1}\,d\sigma
+ |a(nh-s)|\,|A_iA_0^{-1}|(M_0c_0+M_0+1)C_{n-1}\\
&\le H_1C_{n,\gamma}^0|A_iA_0^{-1}|(t-nh)^{\rho} + H_0|A_iA_0^{-1}|(M_0c_0+M_0+1)C_{n-1}.
\end{aligned}
$$
Hence, we get the assertion (3.9). $\square$

We define the operator $K_1(t',t): H \to H$ (or $V^* \to V^*$) by
$$
(3.15)\qquad K_1(t',t) = \int_t^{t'}S(t'-s)A_1W(s-h)\,ds
$$
for $nh \le t < t' < (n+1)h$. In terms of (3.13), $K_1(t',t)$ is uniformly bounded on $(nh,(n+1)h]$. And we remark that $K_1(t',t)$ converges to $0$ as $t' \to t$ at any element of $D(A_0)$ in view of (2.8). We introduce another operator $K_2(t',t): H \to H$ (or $V^* \to V^*$) by
$$
(3.16)\qquad K_2(t',t) = \int_t^{t'}S(t'-s)\int_{-h}^{0}a(\tau)A_2W(s+\tau)\,d\tau\,ds
$$
for $nh \le t < t' < (n+1)h$.

Lemma 3.3. Let $nh \le t < t' < (n+1)h$. Then there exists a constant $C_n'$ such that
$$
(3.17)\qquad |K_2(t',t)| \le 3M_0C_n'(t'-t).
$$
Proof. On $[0,h]$, we transform $K_2(t',t)$ by a suitable change of variables and Fubini's theorem:
$$
\begin{aligned}
K_2(t',t) &= \int_t^{t'}S(t'-s)\int_0^s a(\sigma-s)A_2W(\sigma)\,d\sigma\,ds\\
&= \int_0^t\int_t^{t'}S(t'-s)a(\sigma-s)A_2W(\sigma)\,ds\,d\sigma
+ \int_t^{t'}\int_{\sigma}^{t'}S(t'-s)a(\sigma-s)A_2W(\sigma)\,ds\,d\sigma\\
&= \int_t^{t'}S(t'-s)\int_0^t a(\sigma-s)A_2W(\sigma)\,d\sigma\,ds
+ \int_t^{t'}S(t'-s)\int_t^s a(\sigma-s)A_2W(\sigma)\,d\sigma\,ds.
\end{aligned}
$$
Thus from Lemma 3.2 we have
$$
|K_2(t',t)| \le 2M_0C_n'(t'-t).
$$
On $[nh,(n+1)h)$, in a similar way we get
$$
\begin{aligned}
K_2(t',t) &= \int_t^{t'}S(t'-s)\int_{-h}^{0}a(\tau)A_2W(\tau+s)\,d\tau\,ds
= \int_t^{t'}S(t'-s)\int_{s-h}^{s}a(\sigma-s)A_2W(\sigma)\,d\sigma\,ds\\
&= \int_{t-h}^{t'-h}\int_t^{\sigma+h}S(t'-s)a(\sigma-s)A_2W(\sigma)\,ds\,d\sigma
+ \int_{t'-h}^{t}\int_t^{t'}S(t'-s)a(\sigma-s)A_2W(\sigma)\,ds\,d\sigma\\
&\qquad + \int_t^{t'}\int_{\sigma}^{t'}S(t'-s)a(\sigma-s)A_2W(\sigma)\,ds\,d\sigma\\
&= \int_t^{t'}S(t'-s)\int_{s-h}^{t'-h}a(\sigma-s)A_2W(\sigma)\,d\sigma\,ds
+ \int_t^{t'}S(t'-s)\int_{t'-h}^{t}a(\sigma-s)A_2W(\sigma)\,d\sigma\,ds\\
&\qquad + \int_t^{t'}S(t'-s)\int_t^s a(\sigma-s)A_2W(\sigma)\,d\sigma\,ds.
\end{aligned}
$$
Therefore, by Lemma 3.2, (3.17) holds. $\square$
4. Time optimal control

Let $Y$ be a real Banach space. In what follows the admissible set $U_{ad}$ is a weakly compact subset of $L^2(0,T;Y)$. Consider the following hereditary controlled system:
$$
\text{(RSC)}\qquad
\begin{cases}
\dfrac{d}{dt}x(t) = A_0x(t) + A_1x(t-h) + \displaystyle\int_{-h}^{0}a(s)A_2x(t+s)\,ds + f(t,x(t)) + Bu(t),\\
x(0) = \phi^0, \qquad x(s) = \phi^1(s), \quad -h \le s < 0,\\
u \in U_{ad}.
\end{cases}
$$
Here the controller $B$ is a bounded linear operator from $Y$ to $H$. We denote the solution $x(t)$ of (RSC) by $x_u(t)$ to express the dependence on $u \in U_{ad}$; that is, $x_u$ is the trajectory corresponding to the control $u$. Suppose the target set $W$ is weakly compact in $H$ and define
$$
U_0 = \{u \in U_{ad} : x_u(t) \in W\ \text{for some}\ t \in [0,T]\}
$$
for $T > 0$, and suppose that $U_0 \ne \emptyset$. The optimal time is defined as the infimum $t_0$ of the times $t$ such that $x_u(t) \in W$ for some admissible control $u$. For each $u \in U_0$ we can define the first time $\tilde t(u)$ such that $x_u(\tilde t) \in W$. Our problem is to find a control $\bar u \in U_0$ such that
$$
\tilde t(\bar u) \le \tilde t(u)\quad\text{for all}\ u \in U_0
$$
subject to the constraint (RSC). Since $x_u \in C([0,T];H)$, the transition time $\tilde t(u)$ is well defined for each $u \in U_{ad}$.
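The transition time $\tilde t(u)$ is simply the first entry time of the trajectory into the target set. A scalar toy illustration (dynamics, control, and target set all made up) that records this first entry along an Euler-discretized trajectory:

```python
# Illustration of the transition time t~(u): simulate a scalar stand-in for the
# controlled system, x'(t) = -x(t) + u(t) with x(0) = 0, and record the first
# time the trajectory enters the target set W = [0.5, 0.6].  The dynamics,
# control, and target are all made up; the point is only the definition
#   t~(u) = inf{ t in [0, T] : x_u(t) in W }.

def first_hit(u, T=5.0, n=5000, target=(0.5, 0.6)):
    dt = T / n
    x = 0.0
    for i in range(n):
        t = i * dt
        if target[0] <= x <= target[1]:
            return t                      # first entry into the target set
        x += dt * (-x + u(t))             # explicit Euler step
    return None                           # target never reached on [0, T]

t_hit = first_hit(lambda t: 1.0)          # constant control u = 1
```

With $u \equiv 1$ the exact trajectory is $x(t) = 1 - e^{-t}$, so the first hit of $0.5$ occurs near $t = \ln 2$, which the discretization recovers up to the step size.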
Theorem 4.1. 1) Let $F = (D(A_0),H)_{1/2,2}$. If $(\phi^0,\phi^1) \in F \times L^2(-h,0;D(A_0))$ and $k \in L^2(0,T;H)$, then the solution $x$ of the equation (RSE) belongs to $L^2(-h,T;D(A_0)) \cap W^{1,2}(0,T;H)$, and the mapping $F \times L^2(-h,0;D(A_0)) \times L^2(0,T;H) \ni (\phi^0,\phi^1,k) \mapsto x \in L^2(-h,T;D(A_0)) \cap W^{1,2}(0,T;H)$ is continuous.
2) If $(\phi^0,\phi^1) \in H \times L^2(-h,0;V)$ and $k \in L^2(0,T;V^*)$, then the solution $x$ of the equation (RSE) belongs to $L^2(-h,T;V) \cap W^{1,2}(0,T;V^*)$, and the mapping $H \times L^2(-h,0;V) \times L^2(0,T;V^*) \ni (\phi^0,\phi^1,k) \mapsto x \in L^2(-h,T;V) \cap W^{1,2}(0,T;V^*)$ is continuous.

Proof. 1) We know that $x$ belongs to $L^2(0,T;D(A_0)) \cap W^{1,2}(0,T;H)$ from Proposition 2.2. Let $(\phi_i^0,\phi_i^1,k_i) \in F \times L^2(-h,0;D(A_0)) \times L^2(0,T;H)$, and let $x_i$ be the solution of (RSE) with $(\phi_i^0,\phi_i^1,k_i)$ in place of $(\phi^0,\phi^1,k)$ for $i = 1,2$. Then in view of Proposition 2.2 we have
$$
(4.1)\qquad
\begin{aligned}
\|x_1-x_2\|_{L^2(-h,T;D(A_0))\cap W^{1,2}(0,T;H)}
&\le C_1'\big\{\|\phi_1^0-\phi_2^0\|_F + \|\phi_1^1-\phi_2^1\|_{L^2(-h,0;D(A_0))} + \|f(\cdot,x_1)-f(\cdot,x_2)\|_{L^2(0,T;H)} + \|k_1-k_2\|_{L^2(0,T;H)}\big\}\\
&\le C_1'\big\{\|\phi_1^0-\phi_2^0\|_F + \|\phi_1^1-\phi_2^1\|_{L^2(-h,0;D(A_0))} + \|k_1-k_2\|_{L^2(0,T;H)} + L\|x_1-x_2\|_{L^2(0,T;V)}\big\}.
\end{aligned}
$$
Since
$$
x_1(t)-x_2(t) = \phi_1^0-\phi_2^0 + \int_0^t\big(\dot x_1(s)-\dot x_2(s)\big)\,ds,
$$
we get
$$
\|x_1-x_2\|_{L^2(0,T;H)} \le \sqrt T\,|\phi_1^0-\phi_2^0| + \frac{T}{\sqrt 2}\|x_1-x_2\|_{W^{1,2}(0,T;H)}.
$$
Hence, arguing as in (2.3), we get
$$
(4.2)\qquad
\begin{aligned}
\|x_1-x_2\|_{L^2(0,T;V)}
&\le C_0\|x_1-x_2\|_{L^2(0,T;D(A_0))}^{1/2}\|x_1-x_2\|_{L^2(0,T;H)}^{1/2}\\
&\le C_0\|x_1-x_2\|_{L^2(0,T;D(A_0))}^{1/2}\Big\{T^{1/4}|\phi_1^0-\phi_2^0|^{1/2} + \Big(\frac{T}{\sqrt 2}\Big)^{1/2}\|x_1-x_2\|_{W^{1,2}(0,T;H)}^{1/2}\Big\}\\
&\le C_0T^{1/4}|\phi_1^0-\phi_2^0|^{1/2}\|x_1-x_2\|_{L^2(0,T;D(A_0))}^{1/2} + C_0\Big(\frac{T}{\sqrt 2}\Big)^{1/2}\|x_1-x_2\|_{L^2(0,T;D(A_0))\cap W^{1,2}(0,T;H)}\\
&\le 2^{-7/4}C_0|\phi_1^0-\phi_2^0| + 2C_0\Big(\frac{T}{\sqrt 2}\Big)^{1/2}\|x_1-x_2\|_{L^2(0,T;D(A_0))\cap W^{1,2}(0,T;H)}.
\end{aligned}
$$
Combining (4.1) and (4.2) we obtain
$$
(4.3)\qquad
\begin{aligned}
\|x_1-x_2\|_{L^2(-h,T;D(A_0))\cap W^{1,2}(0,T;H)}
&\le C_1'\big\{\|\phi_1^0-\phi_2^0\|_F + \|\phi_1^1-\phi_2^1\|_{L^2(-h,0;D(A_0))} + \|k_1-k_2\|_{L^2(0,T;H)}\\
&\qquad + 2^{-7/4}C_0L|\phi_1^0-\phi_2^0| + 2C_0(T/\sqrt 2)^{1/2}L\|x_1-x_2\|_{L^2(0,T;D(A_0))\cap W^{1,2}(0,T;H)}\big\}.
\end{aligned}
$$
Suppose that $(\phi_n^0,\phi_n^1,k_n) \to (\phi^0,\phi^1,k)$ in $F \times L^2(-h,0;D(A_0)) \times L^2(0,T;H)$, and let $x_n$ and $x$ be the solutions of (RSE) with $(\phi_n^0,\phi_n^1,k_n)$ and $(\phi^0,\phi^1,k)$, respectively. Let $0 < T_1 \le T$ be such that
$$
2C_0C_1'(T_1/\sqrt 2)^{1/2}L < 1.
$$
Then by virtue of (4.3) with $T$ replaced by $T_1$ we see that $x_n \to x$ in $L^2(-h,T_1;D(A_0)) \cap W^{1,2}(0,T_1;H)$. This implies that $(x_n(T_1),(x_n)_{T_1}) \to (x(T_1),x_{T_1})$ in $F \times L^2(-h,0;D(A_0))$. Hence the same argument shows that $x_n \to x$ in
$$
L^2(T_1,\min\{2T_1,T\};D(A_0)) \cap W^{1,2}(T_1,\min\{2T_1,T\};H).
$$
Repeating this process, we conclude that $x_n \to x$ in $L^2(-h,T;D(A_0)) \cap W^{1,2}(0,T;H)$.
2) From Proposition 2.2 or 2.4 we have
$$
\begin{aligned}
\|x_1-x_2\|_{L^2(-h,T;V)\cap W^{1,2}(0,T;V^*)}
&\le C_1'\big\{|\phi_1^0-\phi_2^0| + \|\phi_1^1-\phi_2^1\|_{L^2(-h,0;V)} + \|f(\cdot,x_1)-f(\cdot,x_2)\|_{L^2(0,T;V^*)} + \|k_1-k_2\|_{L^2(0,T;V^*)}\big\}\\
&\le C_1'\big\{|\phi_1^0-\phi_2^0| + \|\phi_1^1-\phi_2^1\|_{L^2(-h,0;V)} + \|k_1-k_2\|_{L^2(0,T;V^*)} + L\|x_1-x_2\|_{L^2(0,T;V)}\big\}.
\end{aligned}
$$
Hence, in virtue of (4.2), and since the embedding $L^2(-h,T;D(A_0)) \cap W^{1,2}(0,T;H) \subset L^2(-h,T;V) \cap W^{1,2}(0,T;V^*)$ is continuous, we can obtain the result of 2) in a similar way to 1). $\square$
Theorem 4.2. Assume that $U_0 \ne \emptyset$. Then there exists a time optimal control.

Proof. Let $t_n \to t_0+0$, let $u_n$ be admissible controls, and suppose that the trajectory $x_n$ corresponding to $u_n$ reaches $W$. Let $F$ and $B$ be the Nemytskii operators corresponding to the maps $f$ and $B$, which are defined by
$$
(Fu)(\cdot) = f(\cdot,x_u), \qquad (Bu)(\cdot) = Bu(\cdot),
$$
respectively. Then
$$
(4.4)\qquad x_n(t_n) = x(t_n;\phi,0) + \int_0^{t_0}W(t_n-s)\big((F+B)u_n\big)(s)\,ds + \int_{t_0}^{t_n}W(t_n-s)\big((F+B)u\big)(s)\,ds,
$$
where
$$
x(t_n;\phi,0) = W(t_n)\phi^0 + \int_{-h}^{0}U_{t_n}(s)\phi^1(s)\,ds.
$$
From Proposition 2.4 it follows that
$$
(4.5)\qquad x(t_n;\phi,0) \to x(t_0;\phi,0)\quad\text{strongly in}\ H.
$$
The third term in (4.4) tends to zero as $t_n \to t_0+0$ from the fact that
$$
(4.6)\qquad \Big|\int_{t_0}^{t_n}W(t_n-s)\big((F+B)u\big)(s)\,ds\Big|
\le \Big(\sup_{t\in[0,T]}\|W(t)\|\Big)\big\{LC_2'\big(|\phi^0| + \|\phi^1\|_{L^2(0,T;V)} + \|u\|_{L^2(0,T;Y)}\big) + |f(0)| + \|B\|\,\|u\|_{L^2(0,T;Y)}\big\}(t_n-t_0)^{1/2}.
$$
By the definition of the fundamental solution $W(t)$ it holds that
$$
\begin{aligned}
W(t+\epsilon) - S(\epsilon)W(t)
&= S(t+\epsilon) + \int_0^{t+\epsilon}S(t+\epsilon-s)\Big\{A_1W(s-h) + \int_{-h}^{0}a(\tau)A_2W(s+\tau)\,d\tau\Big\}ds\\
&\qquad - S(\epsilon)\Big\{S(t) + \int_0^t S(t-s)\Big\{A_1W(s-h) + \int_{-h}^{0}a(\tau)A_2W(s+\tau)\,d\tau\Big\}ds\Big\}\\
&= \int_t^{t+\epsilon}S(t+\epsilon-s)\Big\{A_1W(s-h) + \int_{-h}^{0}a(\tau)A_2W(s+\tau)\,d\tau\Big\}ds\\
&= K_1(t+\epsilon,t) + K_2(t+\epsilon,t).
\end{aligned}
$$
Hence, since
$$
W(t_n-s) = S(t_n-t_0)W(t_0-s) + K_1(t_n-s,t_0-s) + K_2(t_n-s,t_0-s),
$$
the second term of (4.4) is represented as
$$
(4.7)\qquad \int_0^{t_0}S(t_n-t_0)W(t_0-s)\big((F+B)u_n\big)(s)\,ds
+ \int_0^{t_0}\big(K_1(t_n-s,t_0-s) + K_2(t_n-s,t_0-s)\big)\big((F+B)u_n\big)(s)\,ds.
$$
The second term of (4.7) tends to zero as $t_n \to t_0$ in terms of Lemma 3.3.

We denote $x_n(t_n)$ by $w_n$. Since $W$ and $U_{ad}$ are weakly compact, there exist $u_0 \in U_{ad}$ and $w_0 \in W$ such that we may assume $w\text{-}\lim u_n = u_0$ in $U_{ad}$ and $w\text{-}\lim w_n = w_0$ in $L^2 \cap W^{1,2}$.

Let $p \in H$. Then $S^*(t_n-t_0)p \to p$ strongly in $H$, and by (F1) and Theorem 4.1,
$$
(4.8)\qquad W(t_0-\cdot)\big((F+B)u_n\big)(\cdot) \to W(t_0-\cdot)\big((F+B)u_0\big)(\cdot)
$$
weakly in $L^2(0,T;V)$. Hence from (4.5)-(4.8) it follows that
$$
(w_0,p) = (x(t_0;\phi,0),p) + \int_0^{t_0}\big(W(t_0-s)((F+B)u_0)(s),p\big)\,ds
$$
by letting $n \to \infty$. Since $p$ is arbitrary, we have
$$
w_0 = x(t_0;\phi,0) + \int_0^{t_0}W(t_0-s)\big((F+B)u_0\big)(s)\,ds \in W
$$
and hence $w_0$ is the trajectory corresponding to $u_0$, i.e., $u_0 \in U_0$. $\square$

Now we consider the case where the target set $W$ is a singleton. Suppose that $W = \{w_0\}$ with $\phi^0 \ne w_0$ and $\phi^1(s) \ne w_0$ for some $s \in [-h,0)$. Then we can choose a decreasing sequence $\{W_n\}$ of weakly compact sets with nonempty interior such that
$$
(4.9)\qquad w_0 \in \bigcap_{n=1}^{\infty}W_n \quad\text{and}\quad \operatorname{dist}(w_0,W_n) = \sup_{x\in W_n}|x-w_0| \to 0\ (n\to\infty).
$$
Define
$$
U_0^n = \{u \in U_{ad} : x_u(t) \in W_n\ \text{for some}\ t \in [0,T]\}.
$$
Then we may assume that $u_n$ is the time optimal control with the optimal time $t_n$ for the target set $W_n$, $n = 1,2,\ldots$.

Theorem 4.3. Let $\{W_n\}$ be a sequence of closed convex sets in $X$ satisfying the condition (4.9), and let $U_0^n \ne \emptyset$. Then there exists a time optimal control $u_0$, with the optimal time $t_0 = \sup_{n\ge 1}\{t_n\}$, for the point target set $\{w_0\}$, which is given by the weak limit of some subsequence of $\{u_n\}$ in $L^2(0,t_0;Y)$.

Proof. Since (4.9) is satisfied and $U_{ad}$ is weakly compact, there exist $w_n = x_n(t_n) \in W_n$ with $w_n \to w_0$ strongly in $H$. Since $U_{ad}$ is weakly compact, there exists $u_0 \in U_{ad}$ such that $u_n \to u_0$ weakly in $L^2(0,t_0;Y)$. Thus, by the argument used in the proof of Theorem 4.2, we can easily prove that $u_0$ is the time optimal control and $t_0$ is the optimal time for the target $\{w_0\}$. $\square$
Remark 1. Let $x_u$ be the solution of (RSC) corresponding to $u$. Then the mapping $u \mapsto x_u$ is compact from $L^2(0,T;Y)$ to $L^2(0,T;H)$. We define the solution mapping $S$ from $L^2(0,T;Y)$ to $L^2(0,T;H)$ by
$$(Su)(t) = x_u(t), \quad u \in L^2(0,T;Y).$$
In virtue of Proposition 2.4,
$$\|Su\|_{L^2(0,T;V)\cap W^{1,2}(0,T;V^*)} = \|x_u\| \le C_2'\big\{|x_0| + \|Bu\|_{L^2(0,T;H)}\big\}.$$
Hence if $u$ is bounded in $L^2(0,T;Y)$, then so is $x_u$ in $L^2(0,T;V) \cap W^{1,2}(0,T;V^*)$. Since $V$ is compactly imbedded in $H$ by assumption, the imbedding $L^2(0,T;V) \cap W^{1,2}(0,T;V^*) \subset L^2(0,T;H)$ is also compact in view of Theorem 2 of J. P. Aubin [1]. Hence, the mapping $u \mapsto Su = x_u$ is compact from $L^2(0,T;Y)$ to $L^2(0,T;H)$.
Since $\{x_n\}$ is bounded in $L^2 \cap W^{1,2}$ and the imbedding $L^2 \cap W^{1,2} \subset L^2(0,T;H)$ is compact, it holds that $x_n \to x$ strongly in $L^2(0,T;H)$. From (F1) and Lemma 3.1 we see that $F$ is a compact operator from $L^2(0,T;Y)$ to $L^2(0,T;H)$, and hence $Fu_n \to Fu_0$ strongly in $L^2(0,T;V^*)$. Therefore $(Fu_n, x^*) \to (Fu_0, x^*)$.
TIME OPTIMAL CONTROL PROBLEM 97
References
1. J. P. Aubin, Un théorème de compacité, C. R. Acad. Sci. 256 (1963), 5042-5044.
2. G. Di Blasio, K. Kunisch and E. Sinestrari, $L^2$-regularity for parabolic partial integrodifferential equations with delay in the highest-order derivatives, J. Math. Anal. Appl. 102 (1984), 38-57.
3. J. M. Jeong, Retarded functional differential equations with $L^1$-valued controller, Funkcialaj Ekvacioj 36 (1993), 71-93.
4. J. Y. Park, J. M. Jeong and Y. C. Kwun, Regularity and controllability for semilinear control system, Indian J. Pure Appl. Math. 29(3) (1998), 239-252.
5. S. Nakagiri, Optimal control of linear retarded systems in Banach spaces, J. Math. Anal. Appl. 120(1) (1986), 169-210.
6. H. Tanabe, Fundamental solutions for linear retarded functional differential equations in Banach space, Funkcialaj Ekvacioj 35(1) (1992), 149-177.
Department of Mathematics,
Pusan National University,
Pusan 609-739, Korea
Division of Mathematical Sciences,
Pukyong National University,
Pusan 608-737, Korea
Department of Mathematics,
Pusan National University,
Pusan 609-739, Korea
SPLINE HAZARD RATE ESTIMATION
USING CENSORED DATA
Myung Hwan Na
J. KSIAM Vol.3, No.2, 99-106, 1999
Abstract
In this paper, a spline hazard rate model for randomly censored data is introduced. The unknown hazard rate function is expressed as a linear combination of B-splines which is constrained to be linear (or constant) in the tails. We determine the coefficients of the linear combination by maximizing the likelihood function. The number of knots is determined by the Bayesian Information Criterion. Examples using simulated data are used to illustrate the performance of this method in the presence of random censoring.
1 Introduction
Reliability engineers, biostatisticians, and actuaries are all interested in lifetimes. In particular, they are interested in five lifetime distribution representations: the hazard rate function $h(t)$, the cumulative hazard rate function $H(t)$, the reliability function $R(t)$, the probability density function $f(t)$, and the mean residual life function $m(t)$. Perhaps the hazard rate function is the most popular of the five representations for lifetime modelling. The hazard rate function is defined as
$$h(t) = \frac{f(t)}{R(t)}, \quad t \ge 0.$$
Thus, the hazard rate function is the ratio of the probability density function to the reliability function. Throughout this paper we assume that the hazard rate function satisfies two conditions:
$$\int_0^{\infty} h(t)\,dt = \infty, \qquad h(t) > 0 \ \text{ for all } t > 0. \tag{1}$$
Smooth estimation of the hazard rate function is a very important topic in both theoretical and applied statistics. Anderson and Senthilselvan (1980) used quadratic splines with a discontinuity in the slope at the times of death. O'Sullivan (1988) used smoothing splines for the log-hazard function. Senthilselvan (1987) used a hyperbolic spline function which is continuous, with its first derivative discontinuous only at a finite number of points. Kooperberg et al. (1995) used cubic splines and two additional log terms for the log-hazard function. The discussion section of Abrahamowicz et al. (1992) contains a good review of many of the papers on the use of splines to estimate a density in the presence of censored data.
Key Words: Hazard Rate, Spline, Censoring, Simulation
In this paper we introduce a spline hazard rate model for randomly censored data. The unknown hazard rate function is expressed as a function from a space of cubic splines constrained to be linear (or constant) in the tails. The coefficients of the linear combination are determined by maximizing the likelihood function. The number of knots is determined by the Bayesian Information Criterion (BIC). Examples using simulated data are used to illustrate the performance of this method in the presence of random censoring.
Section 2 is devoted to an introduction to the spline model for the hazard rate function. A maximum likelihood estimation procedure is discussed in Section 3. Section 4 contains the knot deletion procedure. Section 5 contains examples using simulated data.
2 SPLINE HAZARD RATE MODEL
Let $K$ denote a nonnegative integer. When $K \ge 1$, let $\xi_1, \cdots, \xi_K$ be a (simple) knot sequence in $[0,\infty)$ with $0 < \xi_1 < \cdots < \xi_K < \infty$. Let $S_0$ denote the collection of twice continuously differentiable functions $s$ on $[0,\infty)$ such that the restriction of $s$ to each of the intervals $[0,\xi_1], [\xi_1,\xi_2], \cdots, [\xi_K,\infty)$ is a cubic polynomial, i.e., $s$ is a polynomial of order 4 (or less) on each of the intervals. Then $S_0$ is the $(K+4)$-dimensional vector space of cubic splines corresponding to the knot positions $\xi_1, \cdots, \xi_K$. Let $S$ denote the subspace of $S_0$ consisting of the natural cubic splines with knots at $\xi_1, \cdots, \xi_K$, i.e., the functions in $S_0$ that are linear (or constant) on $[0,\xi_1]$ and $[\xi_K,\infty)$. This linear vector space is $K$-dimensional; let $B_1, \cdots, B_K$ be a basis of $S$. When $K = 0$, there are no basis functions depending on $t$. For an exhaustive treatment of splines, the reader should consult Greville (1969), de Boor (1978), and Schumaker (1981). For statistical applications we refer to Smith (1979), and Wegman and Wright (1983).
Let $\Theta$ denote the collection of all column vectors $\theta = (\theta_1, \cdots, \theta_K)^t \in \mathbb{R}^K$ such that $\sum_{j=1}^K \theta_j B_j(t) > 0$ for all $t > 0$. Given $\theta \in \Theta$, consider the model
$$h(t|\theta) = \sum_{j=1}^K \theta_j B_j(t), \quad t > 0$$
for the hazard rate function. For this spline model, the corresponding cumulative hazard rate function, reliability function, and probability density function are respectively given by
$$H(t|\theta) = \sum_{j=1}^K \theta_j C_j(t),$$
$$R(t|\theta) = \exp\Big(-\sum_{j=1}^K \theta_j C_j(t)\Big),$$
$$f(t|\theta) = \Big(\sum_{j=1}^K \theta_j B_j(t)\Big)\exp\Big(-\sum_{j=1}^K \theta_j C_j(t)\Big),$$
where $C_j(t) = \int_0^t B_j(u)\,du$.
In particular, when $K = 0$, this spline model reduces exactly to the exponential distribution. When $K = 1$, it gives exactly the hazard rate function of the Rayleigh distribution. The MLE $\hat\theta$ is obtained by maximizing the likelihood function. We refer to $\hat h(\cdot) = h(\cdot|\hat\theta)$ as the spline hazard rate estimate.
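As a quick numerical illustration of these formulas (a sketch only, not the estimation code of this paper), the model can be evaluated for a hypothetical two-function basis $B_1(t) = 1$, $B_2(t) = t$, for which $C_1(t) = t$ and $C_2(t) = t^2/2$ have closed forms:

```python
import math

# Sketch of the spline hazard model with a hypothetical basis B1(t)=1, B2(t)=t,
# so h(t|theta) = theta1 + theta2*t; C_j(t) = int_0^t B_j(u) du as in the text.
def hazard(t, theta):
    return theta[0] + theta[1] * t              # h(t|theta) = sum_j theta_j B_j(t)

def cum_hazard(t, theta):
    return theta[0] * t + theta[1] * t * t / 2  # C1(t) = t, C2(t) = t^2/2

def reliability(t, theta):
    return math.exp(-cum_hazard(t, theta))      # R(t|theta) = exp(-sum_j theta_j C_j(t))

def density(t, theta):
    return hazard(t, theta) * reliability(t, theta)

theta = (0.5, 1.0)  # must keep h > 0 on (0, infinity), i.e. theta in Theta
```

Setting $\theta_2 = 0$ recovers the exponential hazard and $\theta_1 = 0$ the Rayleigh hazard, matching the special cases noted in the text.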
3 MAXIMUM LIKELIHOOD ESTIMATION
Let $T_1, T_2, \cdots, T_n$ be independent identically distributed (i.i.d.) with a life distribution function (d.f.) $F$, and let $C_1, C_2, \cdots, C_n$ be i.i.d. with d.f. $G$, where $C_i$ is the censoring time associated with $T_i$. In the random censoring case we can only observe $(Y_1, \delta_1), \cdots, (Y_n, \delta_n)$, where $Y_i = \min(T_i, C_i)$ and $\delta_i = I(T_i \le C_i)$, $1 \le i \le n$. It is assumed that $T_i$ and $C_i$ are independent. The random variable $Y_i$ is said to be uncensored or censored according as $\delta_i = 1$ or $\delta_i = 0$. Note that the partial likelihood corresponding to the datum $(y_i, \delta_i)$ equals $[f(y_i)]^{\delta_i}[1 - F(y_i)]^{1-\delta_i}$ (see Miller, 1981), so the log-likelihood for the datum $(y_i, \delta_i)$ equals
$$\ell(y_i, \delta_i) = \delta_i \log h(y_i) - H(y_i).$$
Thus the log-likelihood function corresponding to the spline model is given by
$$l(\theta) = \sum_{i=1}^n \ell(y_i, \delta_i) = \sum_{i=1}^n \delta_i \log h(y_i) - \sum_{i=1}^n H(y_i).$$
Moreover,
$$\frac{\partial}{\partial \theta_j} l(\theta) = \sum_{i=1}^n \frac{\delta_i B_j(y_i)}{h(y_i)} - \sum_{i=1}^n C_j(y_i), \quad 1 \le j \le K,$$
and
$$\frac{\partial^2}{\partial \theta_j \partial \theta_k} l(\theta) = -\sum_{i=1}^n \frac{\delta_i B_j(y_i) B_k(y_i)}{(h(y_i))^2}, \quad 1 \le j, k \le K.$$
It follows from the last result that $l(\theta)$ is a concave function. Thus the MLE is unique if it exists.
Let $S(\theta)$ denote the score function of $l(\theta)$, that is, the $K$-dimensional column vector with entries $\partial l(\theta)/\partial \theta_j$, and let $H(\theta)$ denote the Hessian of $l(\theta)$, the $K \times K$ matrix with entries $\partial^2 l(\theta)/\partial \theta_j \partial \theta_k$. The maximum likelihood equation for $\theta$ is $S(\theta) = 0$. We use the Newton-Raphson method with step-halving for computing $\hat\theta$: starting from an initial guess $\theta^{(0)}$, we iteratively determine $\theta^{(m+1)}$ by the formula
$$\theta^{(m+1)} = \theta^{(m)} + \frac{1}{2^M} I^{-1}(\theta^{(m)}) S(\theta^{(m)}),$$
where $I(\theta) = -H(\theta)$ and $M$ is the smallest nonnegative integer such that
$$l\Big(\theta^{(m)} + \frac{1}{2^M} I^{-1}(\theta^{(m)}) S(\theta^{(m)})\Big) \ge l\Big(\theta^{(m)} + \frac{1}{2^{M+1}} I^{-1}(\theta^{(m)}) S(\theta^{(m)})\Big).$$
We stop the iterations when $l(\theta^{(m+1)}) - l(\theta^{(m)}) \le 10^{-6}$.
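The iteration can be sketched for the simplest case $K = 1$ with $B_1(t) = 1$, so that $h = \theta$, $H(t) = \theta t$, and the MLE has the closed form $\sum_i \delta_i / \sum_i y_i$. The data below are hypothetical and the basis choice is only for illustration:

```python
import math

# Newton-Raphson with step-halving for the censored log-likelihood, sketched
# for K = 1 (exponential hazard h = theta). Hypothetical toy data:
y     = [0.5, 1.2, 0.3, 2.0, 0.9, 1.5]   # observed times Y_i
delta = [1,   0,   1,   1,   0,   1  ]   # 1 = uncensored, 0 = censored

def loglik(th): return sum(d * math.log(th) - th * t for t, d in zip(y, delta))
def score(th):  return sum(d / th - t for t, d in zip(y, delta))
def info(th):   return sum(d / th ** 2 for d in delta)   # I(theta) = -Hessian

th = 1.0
for _ in range(100):
    step = score(th) / info(th)
    M = 0
    # halve the step until the likelihood stops improving, as in the text
    while loglik(th + step / 2 ** M) < loglik(th + step / 2 ** (M + 1)) and M < 50:
        M += 1
    new = th + step / 2 ** M
    if loglik(new) - loglik(th) <= 1e-6:
        th = new
        break
    th = new
```

For this case the iteration should converge to the closed-form MLE $\sum\delta_i/\sum y_i$.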
4 KNOT DELETION PROCEDURE
In this section we determine rules for selecting the number and location of the knots. To determine them, we can directly apply the stepwise knot deletion method of Smith (1982): first place enough initial knots appropriately and then delete unnecessary knots. According to Stone (1991), for twice continuously differentiable $h(t)$, an optimal rate of convergence $n^{-2/5}$ can be achieved if the number of knots is increased proportionally to $n^{1/5}$. So we use the integer $K$ closest to $4n^{1/5}$ as the number of initial knots. The initial knot placement rule is as follows: place two knots at the first and the last order statistics, and the remaining knots as closely as possible to the equi-spaced percentiles. For example, if the number of initial knots is five, they are placed at the 0, 25, 50, 75, and 100 percentiles.
First, consider the problem that the estimate of the hazard rate function may take negative values. The estimate may take negative values for large values of $K$, or on the intervals $[0,\xi_1]$ and $[\xi_K,\infty)$. So we use the following method to satisfy conditions (1) and (2).
(i) If $\theta_1(\xi_1 + \xi_2 + \xi_3) + \theta_2 < 0$, we set $\theta_1 = 0$, i.e., the function is constant on $[0,\xi_1]$.
(ii) If $\theta_K < 0$, we set $\theta_K = 0$, i.e., the function is constant on $[\xi_K,\infty)$.
(iii) If the minimum value of $\hat h(x_i)$, $i = 1, \cdots, n$, is negative, we delete the knot closest to $x_{j^*}$, where $x_{j^*}$ is the argument of the minimum value of $\hat h(x_i)$.
Now consider the problem of deleting unnecessary knots. Following Smith (1982), the absence of a knot $\xi$ from a spline $\sum_{j=1}^K \theta_j B_j(t)$ means that
$$\sum_{j=1}^K \theta_j \delta_j(\xi) = 0,$$
where $\delta_j(\xi) = B_j^{(3)}(\xi-) - B_j^{(3)}(\xi+)$; here $B_j^{(3)}(\xi-)$ and $B_j^{(3)}(\xi+)$ are, respectively, the left- and right-hand limits of $\partial^3 B_j(t)/\partial t^3$ at $\xi$.
At any step we compute
$$\lambda_k = \frac{|\gamma_k|}{SE(\gamma_k)}, \quad k = 1, \cdots, K,$$
where $\gamma_k = \sum_{j=1}^K \hat\theta_j \delta_j(\xi_k)$ and $SE(\gamma_k) = \big\{\delta^t(\xi_k)\,(I(\hat\theta))^{-1}\,\delta(\xi_k)\big\}^{1/2}$, and we delete the knot having the smallest value of $\lambda_k$. In this manner, we arrive at a sequence of models
indexed by $J$, which ranges from 0 to $K$. Let $IL = 1$ when the estimated function is constant on $[0,\xi_1]$, and $IL = 0$ otherwise. Let $IR = 1$ when the estimated function is constant on $[\xi_K,\infty)$, and $IR = 0$ otherwise. Let $l_J$ denote the log-likelihood function for the $J$th model evaluated at the MLE for that model. Let $BIC = -2l_J + \log(n)(K - J - IL - IR)$ be the Bayesian Information Criterion (Schwarz, 1978) for the $J$th model. We choose the model corresponding to the value $\hat J$ of $J$ that minimizes $BIC$. This model has $K - \hat J$ knots and $K - \hat J - IL - IR$ free parameters.
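The selection step can be sketched as follows, with hypothetical maximized log-likelihoods $l_J$ standing in for the values produced by the knot-deletion sequence:

```python
import math

# BIC selection sketch over the nested models J = 0, ..., K produced by knot
# deletion. The maximized log-likelihoods below are hypothetical numbers,
# standing in for values computed by the Newton-Raphson fit of Section 3.
n, K, IL, IR = 200, 8, 0, 0
l_J = [-310.2, -310.5, -311.0, -312.4, -315.5, -321.7, -330.0, -352.3, -401.8]
# BIC = -2*l_J + log(n)*(K - J - IL - IR) for the Jth model
bic = [-2 * l + math.log(n) * (K - J - IL - IR) for J, l in enumerate(l_J)]
J_hat = min(range(len(bic)), key=bic.__getitem__)
# The chosen model has K - J_hat knots and K - J_hat - IL - IR free parameters.
```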
5 EXAMPLES
The spline hazard rate estimation procedure described in Section 2 is applied to the Weibull, Gamma, and Dhillon distributions. The density functions are respectively given by
$$f(t) = \frac{\beta}{\theta}\Big(\frac{t}{\theta}\Big)^{\beta-1}\exp\Big(-\Big(\frac{t}{\theta}\Big)^{\beta}\Big),$$
$$f(t) = \frac{1}{\Gamma(\alpha)}\,\frac{t^{\alpha-1}}{\theta^{\alpha}}\exp\Big(-\frac{t}{\theta}\Big),$$
$$f(t) = \lambda\beta(\lambda t)^{\beta-1}\exp\big((\lambda t)^{\beta}\big)\exp\big(1 - \exp((\lambda t)^{\beta})\big).$$
The simulations are performed using IMSL subroutines in FORTRAN.
In Figure 1, we show the true hazard rate function (solid) corresponding to the Weibull distribution with parameters $\beta = 0.8$ and $\theta = 1$. The dotted line corresponds to the estimated hazard rate function based on a sample of size 200. Figure 2 is similar to Figure 1, but the underlying distribution is the Gamma distribution: the data for Figure 2 are from the Gamma distribution with parameters $\alpha = 2$ and $\theta = 1$, based on a sample of size 200. In the figure we show the true hazard rate function corresponding to this Gamma distribution together with the estimate of the hazard rate function based on the spline model. In Figure 3, we show the result of a similar calculation based on a sample of size 200 from the Dhillon distribution with parameters $\lambda = 1$ and $\beta = 1$, i.e., the extreme value distribution. In the figure we show the true hazard rate function corresponding to this Dhillon distribution together with the estimate of the hazard rate function based on the spline model.
From these examples, we have found that the spline hazard rate estimate yields a reasonable estimate of the hazard rate function.
Figure 1. Spline hazard rate estimate for the Weibull distribution with $\beta = 0.8$ and $\theta = 1$, based on a sample of size 200.
Figure 2. Spline hazard rate estimate for the Gamma distribution with $\alpha = 2$ and $\theta = 1$, based on a sample of size 200.
Figure 3. Spline hazard rate estimate for the Dhillon distribution with $\lambda = 1$ and $\beta = 1$, based on a sample of size 200.
REFERENCES
1. Abrahamowicz, M., Ciampi, A. and Ramsay, J. O. (1992): "Nonparametric Density Estimation for Censored Survival Data: Regression-Spline Approach", The Canadian Journal of Statistics, Vol. 20, 171-185.
2. Anderson, J. A. and Senthilselvan, A. (1980): "Smooth Estimates for the Hazard Function", Journal of the Royal Statistical Society, Ser. B, Vol. 42, 322-327.
3. de Boor, C. (1978): A Practical Guide to Splines, Springer-Verlag, New York.
4. Greville, T. N. E. (1969): Theory and Application of Spline Functions, Academic Press, New York.
5. Kooperberg, C., Stone, C. J. and Truong, Y. K. (1995): "Hazard Regression", Journal of the American Statistical Association, Vol. 90, 78-94.
6. Miller, R. (1981): Survival Analysis, John Wiley & Sons, New York.
7. O'Sullivan, F. (1988): "Fast Computation of Fully Automated Log-Density and Log-Hazard Estimates", SIAM Journal on Scientific and Statistical Computing, Vol. 9, 363-379.
8. Schumaker, L. L. (1981): Spline Functions: Basic Theory, Wiley, New York.
9. Schwarz, G. (1978): "Estimating the Dimension of a Model", Annals of Statistics, Vol. 6, 461-464.
10. Senthilselvan, A. (1987): "Penalized Likelihood Estimation of Hazard and Intensity Functions", Journal of the Royal Statistical Society, Ser. B, Vol. 49, 170-174.
11. Smith, P. L. (1979): "Splines as a Useful and Convenient Statistical Tool", The American Statistician, Vol. 33, pp. 57-62.
12. Smith, P. L. (1982): "Curve Fitting and Modeling with Splines Using Statistical Variable Selection Methods", NASA, Langley Research Center, Hampton, VA, NASA Report 166034.
13. Stone, C. J. (1991): Generalized Multivariate Regression Splines, Technical Report No. 318, Dept. of Statistics, Univ. of California, Berkeley.
14. Wegman, E. J. and Wright, I. W. (1983): "Splines in Statistics", Journal of the American Statistical Association, Vol. 78, 351-366.
Department of Statistics, Seoul National
University, Seoul 151-742, Korea
e-mail: [email protected]
An Ostrowski Type Inequality for Weighted
Mappings with Bounded Second Derivatives
J. Roumeliotis, P. Cerone, S.S. Dragomir
J. KSIAM Vol.3, No.2, 107-119, 1999
Abstract
A weighted integral inequality of Ostrowski type for mappings whose second derivatives are bounded is proved. The inequality is extended to account for applications in numerical integration.
1 Introduction
In 1938, Ostrowski (see for example Mitrinović et al. (1994, p. 468)) proved the following inequality.
THEOREM 1.1. Let $f : I \subseteq \mathbb{R} \to \mathbb{R}$ be a differentiable mapping in $I^o$ ($I^o$ is the interior of $I$), and let $a, b \in I^o$ with $a < b$. If $f' : (a,b) \to \mathbb{R}$ is bounded on $(a,b)$, i.e., $\|f'\|_\infty := \sup_{t\in(a,b)} |f'(t)| < \infty$, then we have the inequality
$$\bigg| \frac{1}{b-a}\int_a^b f(t)\,dt - f(x) \bigg| \le \bigg[\frac{1}{4} + \frac{\big(x - \frac{a+b}{2}\big)^2}{(b-a)^2}\bigg](b-a)\|f'\|_\infty \tag{1.1}$$
for all $x \in (a,b)$.
The constant $\frac{1}{4}$ is sharp in the sense that it cannot be replaced by a smaller one.
A similar result for twice differentiable mappings (Cerone et al. 1998) is given below.
THEOREM 1.2. Let $f : [a,b] \to \mathbb{R}$ be a twice differentiable mapping such that $f'' : (a,b) \to \mathbb{R}$ is bounded on $(a,b)$, i.e., $\|f''\|_\infty := \sup_{t\in(a,b)} |f''(t)| < \infty$. Then we have the inequality
$$\bigg| \frac{1}{b-a}\int_a^b f(t)\,dt - f(x) + \Big(x - \frac{a+b}{2}\Big)f'(x) \bigg| \le \bigg[\frac{1}{24} + \frac{\big(x - \frac{a+b}{2}\big)^2}{2(b-a)^2}\bigg](b-a)^2\|f''\|_\infty \tag{1.2}$$
for all $x \in [a,b]$.
Key Words and Phrases: Ostrowski's inequality, weighted integrals, numerical integration
In this paper, we extend the above result and develop an Ostrowski-type inequality for weighted integrals. Applications to special weight functions and numerical integration are investigated.
2 Preliminaries
In the next section weighted (or product) integral inequalities are constructed. The
weight function (or density) is assumed to be non-negative and integrable over its
entire domain. The following generic quantitative measures of the weight are de�ned.
Definition 2.1. Let $w : (a,b) \to [0,\infty)$ be an integrable function, i.e., $\int_a^b w(t)\,dt < \infty$. Define
$$m_i(a,b) = \int_a^b t^i w(t)\,dt, \quad i = 0, 1, \ldots \tag{2.1}$$
as the $i$th moment of $w$.
Definition 2.2. Define the mean of the interval $[a,b]$ with respect to the density $w$ as
$$\mu(a,b) = \frac{m_1(a,b)}{m_0(a,b)} \tag{2.2}$$
and the variance by
$$\sigma^2(a,b) = \frac{m_2(a,b)}{m_0(a,b)} - \mu^2(a,b). \tag{2.3}$$
3 The Results
3.1 1-point inequality
THEOREM 3.1. Let $f, w : (a,b) \to \mathbb{R}$ be two mappings on $(a,b)$ with the following properties:
1. $\sup_{t\in(a,b)} |f''(t)| < \infty$,
2. $w(t) \ge 0$ for all $t \in (a,b)$,
3. $\int_a^b w(t)\,dt < \infty$.
Then the following inequalities hold:
$$\bigg| \frac{1}{m_0(a,b)}\int_a^b w(t)f(t)\,dt - f(x) + \big(x - \mu(a,b)\big)f'(x) \bigg| \le \frac{\|f''\|_\infty}{2}\Big[\big(x - \mu(a,b)\big)^2 + \sigma^2(a,b)\Big] \tag{3.1}$$
$$\le \frac{\|f''\|_\infty}{2}\bigg(\Big|x - \frac{a+b}{2}\Big| + \frac{b-a}{2}\bigg)^2 \tag{3.2}$$
for all $x \in [a,b]$.
Proof. Define the mapping $K(\cdot,\cdot) : [a,b]^2 \to \mathbb{R}$ by
$$K(x,t) := \begin{cases} \int_a^t (t-u)w(u)\,du, & a \le t \le x, \\ \int_b^t (t-u)w(u)\,du, & x < t \le b. \end{cases}$$
Integrating by parts gives
$$\int_a^b K(x,t)f''(t)\,dt = \int_a^x\!\!\int_a^t (t-u)w(u)f''(t)\,du\,dt + \int_x^b\!\!\int_b^t (t-u)w(u)f''(t)\,du\,dt$$
$$= f'(x)\int_a^b (x-u)w(u)\,du - \int_a^x\!\!\int_a^t w(u)f'(t)\,du\,dt - \int_x^b\!\!\int_b^t w(u)f'(t)\,du\,dt$$
$$= \int_a^b w(t)f(t)\,dt + f'(x)\int_a^b (x-u)w(u)\,du - f(x)\int_a^b w(u)\,du,$$
providing the identity
$$\int_a^b K(x,t)f''(t)\,dt = \int_a^b w(t)f(t)\,dt - m_0(a,b)\,f(x) + m_0(a,b)\big(x - \mu(a,b)\big)f'(x) \tag{3.3}$$
that is valid for all $x \in [a,b]$.
Now taking the modulus of (3.3) we have
$$\bigg| \int_a^b w(t)f(t)\,dt - m_0(a,b)\,f(x) + m_0(a,b)\big(x - \mu(a,b)\big)f'(x) \bigg| = \bigg| \int_a^b K(x,t)f''(t)\,dt \bigg| \le \|f''\|_\infty \int_a^b |K(x,t)|\,dt$$
$$= \|f''\|_\infty \bigg( \int_a^x\!\!\int_a^t (t-u)w(u)\,du\,dt + \int_x^b\!\!\int_b^t (t-u)w(u)\,du\,dt \bigg) = \frac{\|f''\|_\infty}{2}\int_a^b (x-t)^2 w(t)\,dt. \tag{3.4}$$
The last line is computed by reversing the order of integration and evaluating the inner integrals. To obtain the desired result (3.1), observe that
$$\int_a^b (x-t)^2 w(t)\,dt = m_0(a,b)\Big[\big(x - \mu(a,b)\big)^2 + \sigma^2(a,b)\Big].$$
To obtain (3.2) note that
$$\int_a^b (x-t)^2 w(t)\,dt \le \sup_{t\in[a,b]} (x-t)^2\, m_0(a,b) = \max\{(x-a)^2, (x-b)^2\}\, m_0(a,b)$$
$$= \frac{1}{2}\Big[(x-a)^2 + (x-b)^2 + \big|(x-a)^2 - (x-b)^2\big|\Big] m_0(a,b) = \bigg(\Big|x - \frac{a+b}{2}\Big| + \frac{b-a}{2}\bigg)^2 m_0(a,b),$$
which upon substitution into (3.4) furnishes the result. $\Box$
Note also that the inequality (3.1) is valid even for unbounded $w$ or an unbounded interval $[a,b]$. This is not the case with (1.2).
COROLLARY 3.2. The inequality (3.1) is minimized at $x = \mu(a,b)$, producing the generalized "mid-point" inequality
$$\bigg| \frac{1}{m_0(a,b)}\int_a^b w(t)f(t)\,dt - f(\mu(a,b)) \bigg| \le \frac{\|f''\|_\infty\,\sigma^2(a,b)}{2}. \tag{3.5}$$
Proof. Substituting $\mu(a,b)$ for $x$ in (3.1) produces the desired result. Note that $x = \mu(a,b)$ not only minimizes the bound of the inequality (3.1), but also causes the derivative term to vanish. $\Box$
The optimal point (2.2) can be interpreted in many ways. In a physical context, $\mu(a,b)$ represents the centre of mass of a one-dimensional rod with mass density $w$. Equivalently, this point can be viewed as the one which minimizes the error variance for the probability density $w$ (see Barnett et al. (1995) for an application). Finally, (2.2) is also the Gauss node point for a one-point rule (Stroud and Secrest 1966). The bound in (3.5) is directly proportional to the variance of the density $w$: the tightest bound is achieved by sampling at the mean point of the interval $(a,b)$, and its value is governed by the variance.
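A quick numerical check of (3.5) can be made under the simplest assumptions: the uniform weight $w \equiv 1$ on $(0,1)$, for which $m_0 = 1$, $\mu = 1/2$, $\sigma^2 = 1/12$, and $f(t) = \cos t$, for which $\|f''\|_\infty = 1$ and the integral is $\sin 1$:

```python
import math

# Check of (3.5) with w(t) = 1 on (0, 1): m0 = 1, mu = 1/2, sigma^2 = 1/12.
# For f(t) = cos(t), ||f''||_inf = 1 and the exact weighted integral is sin(1).
m0, mu, var = 1.0, 0.5, 1.0 / 12
lhs = abs(math.sin(1.0) / m0 - math.cos(mu))  # |(1/m0) int w f dt - f(mu)|
rhs = var / 2.0                               # ||f''||_inf * sigma^2 / 2
```

Here the left side is about 0.0361 and the bound is $1/24 \approx 0.0417$, so the inequality holds with room to spare.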
3.2 2-point inequality
Here a two-point analogue of (3.1) is developed, where the result is extended to create an inequality with two independent parameters $x_1$ and $x_2$. This is mainly used (Section 5) to find an optimal grid for composite weighted-quadrature rules.
THEOREM 3.3. Let the conditions of Theorem 3.1 hold. Then the following 2-point inequality is obtained:
$$\bigg| \int_a^b w(t)f(t)\,dt - m_0(a,\xi)f(x_1) + m_0(a,\xi)\big(x_1 - \mu(a,\xi)\big)f'(x_1) - m_0(\xi,b)f(x_2) + m_0(\xi,b)\big(x_2 - \mu(\xi,b)\big)f'(x_2) \bigg|$$
$$\le \frac{\|f''\|_\infty}{2}\Big\{ m_0(a,\xi)\Big[\big(x_1 - \mu(a,\xi)\big)^2 + \sigma^2(a,\xi)\Big] + m_0(\xi,b)\Big[\big(x_2 - \mu(\xi,b)\big)^2 + \sigma^2(\xi,b)\Big] \Big\} \tag{3.6}$$
for all $a \le x_1 < \xi < x_2 \le b$.
Proof. Define the mapping $K(\cdot,\cdot,\cdot,\cdot) : [a,b]^4 \to \mathbb{R}$ by
$$K(x_1,x_2,\xi,t) := \begin{cases} \int_a^t (t-u)w(u)\,du, & a \le t \le x_1, \\ \int_\xi^t (t-u)w(u)\,du, & x_1 < t \le x_2, \\ \int_b^t (t-u)w(u)\,du, & x_2 < t \le b. \end{cases}$$
With this kernel, the proof is almost identical to that of Theorem 3.1. Integrating by parts produces the integral identity
$$\int_a^b K(x_1,x_2,\xi,t)f''(t)\,dt = \int_a^b w(t)f(t)\,dt - m_0(a,\xi)f(x_1) + m_0(a,\xi)\big(x_1 - \mu(a,\xi)\big)f'(x_1)$$
$$- m_0(\xi,b)f(x_2) + m_0(\xi,b)\big(x_2 - \mu(\xi,b)\big)f'(x_2). \tag{3.7}$$
Re-arranging and taking bounds produces the result (3.6). $\Box$
COROLLARY 3.4. The optimal locations of the points $x_1, x_2$ and $\xi$ satisfy
$$x_1 = \mu(a,\xi), \quad x_2 = \mu(\xi,b), \quad \xi = \frac{\mu(a,\xi) + \mu(\xi,b)}{2}. \tag{3.8}$$
Proof. By inspection of the right hand side of (3.6), it is obvious that choosing
$$x_1 = \mu(a,\xi) \quad \text{and} \quad x_2 = \mu(\xi,b) \tag{3.9}$$
minimizes this quantity. To find the optimal value for $\xi$, write the expression in braces in (3.6) as
$$2\int_a^b |K(x_1,x_2,\xi,t)|\,dt = m_0(a,\xi)\Big[\big(x_1 - \mu(a,\xi)\big)^2 + \sigma^2(a,\xi)\Big] + m_0(\xi,b)\Big[\big(x_2 - \mu(\xi,b)\big)^2 + \sigma^2(\xi,b)\Big]$$
$$= \int_a^\xi (x_1 - t)^2 w(t)\,dt + \int_\xi^b (x_2 - t)^2 w(t)\,dt. \tag{3.10}$$
Substituting (3.9) into the right hand side of (3.10) and differentiating with respect to $\xi$ gives
$$\frac{d}{d\xi}\int_a^b |K(\mu(a,\xi),\mu(\xi,b),\xi,t)|\,dt = \big(\mu(\xi,b) - \mu(a,\xi)\big)\Big(\xi - \frac{\mu(a,\xi) + \mu(\xi,b)}{2}\Big)w(\xi).$$
Assuming $w(\xi) \ne 0$, this equation possesses only one root. A minimum exists at this root since (3.10) is convex, and so the corollary is proved. $\Box$
Equation (3.8) shows not only where sampling should occur within each subinterval (i.e. $x_1$ and $x_2$), but how the domain should be divided to make up these subintervals ($\xi$).
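For the uniform weight, where $\mu(a,b) = (a+b)/2$, the system (3.8) can be solved by a simple fixed-point sweep. The sketch below uses $(a,b) = (0,1)$, for which the known solution is $\xi = 1/2$, $x_1 = 1/4$, $x_2 = 3/4$:

```python
# Fixed-point solve of (3.8) for the uniform weight on (a, b) = (0, 1), where
# mu(p, q) = (p + q)/2. Each sweep halves the deviation from the fixed point.
a, b = 0.0, 1.0
mu = lambda p, q: (p + q) / 2.0
xi = 0.9  # arbitrary starting guess
for _ in range(200):
    xi = (mu(a, xi) + mu(xi, b)) / 2.0
x1, x2 = mu(a, xi), mu(xi, b)
```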
4 Some Weighted Integral Inequalities
Integration with weight functions are used in countless mathematical problems. Two
main areas are: (i) approximation theory and spectral analysis and (ii) statistical anal-
ysis and the theory of distributions.
In this section (3.1) is evaluated for the more popular weight functions. In each
case (1.2) cannot be used since the weight w(t) or the interval (b � a) is unbounded.
The optimal point (2.2) is easily identi�ed.
4.1 Uniform (Legendre)
Substituting $w(t) = 1$ into (2.2) and (2.3) gives
$$\mu(a,b) = \frac{\int_a^b t\,dt}{\int_a^b dt} = \frac{a+b}{2} \tag{4.1}$$
and
$$\sigma^2(a,b) = \frac{\int_a^b t^2\,dt}{\int_a^b dt} - \Big(\frac{a+b}{2}\Big)^2 = \frac{(b-a)^2}{12}$$
respectively. Substituting into (3.1) produces (1.2). Note that the interval mean is simply the midpoint (4.1).
4.2 Logarithm
This weight is present in many physical problems, the main body of which exhibits some axial symmetry. Special logarithmic rules are used extensively in the Boundary Element Method popularized by Brebbia (see for example Brebbia and Dominguez (1989)). Some applications include bubble cavitation (Blake and Gibson 1987) and viscous drop deformation (Rallison and Acrivos (1978), and more recently Roumeliotis et al. (1997)).
With $w(t) = \ln(1/t)$, $a = 0$, $b = 1$, (2.2) and (2.3) are
$$\mu(0,1) = \frac{\int_0^1 t\ln(1/t)\,dt}{\int_0^1 \ln(1/t)\,dt} = \frac{1}{4}$$
and
$$\sigma^2(0,1) = \frac{\int_0^1 t^2\ln(1/t)\,dt}{\int_0^1 \ln(1/t)\,dt} - \Big(\frac{1}{4}\Big)^2 = \frac{7}{144}$$
respectively. Substituting into (3.1) gives
$$\bigg| \int_0^1 \ln(1/t)f(t)\,dt - f(x) + \Big(x - \frac{1}{4}\Big)f'(x) \bigg| \le \frac{\|f''\|_\infty}{2}\bigg(\frac{7}{144} + \Big(x - \frac{1}{4}\Big)^2\bigg).$$
The optimal point
$$x = \mu(0,1) = \frac{1}{4}$$
is closer to the origin than the midpoint (4.1), reflecting the strength of the log singularity.
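These moments are easy to confirm numerically; the sketch below uses a plain composite mid-point sum, which copes with the integrable singularity at the origin:

```python
import math

# Mid-point-sum check of m0 = 1, mu = 1/4, sigma^2 = 7/144 for w(t) = ln(1/t)
# on (0, 1). The integrand is singular but integrable at t = 0, and mid-point
# nodes never touch the endpoint.
n = 200000
h = 1.0 / n
m0 = m1 = m2 = 0.0
for i in range(n):
    t = (i + 0.5) * h
    w = math.log(1.0 / t)
    m0 += w * h
    m1 += t * w * h
    m2 += t * t * w * h
mu = m1 / m0
var = m2 / m0 - mu * mu
```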
4.3 Jacobi
Substituting $w(t) = 1/\sqrt{t}$, $a = 0$, $b = 1$ into (2.2) and (2.3) gives
$$\mu(0,1) = \frac{\int_0^1 \sqrt{t}\,dt}{\int_0^1 1/\sqrt{t}\,dt} = \frac{1}{3}$$
and
$$\sigma^2(0,1) = \frac{\int_0^1 t\sqrt{t}\,dt}{\int_0^1 1/\sqrt{t}\,dt} - \Big(\frac{1}{3}\Big)^2 = \frac{4}{45}$$
respectively. Hence, the inequality for the Jacobi weight is
$$\bigg| \frac{1}{2}\int_0^1 \frac{f(t)}{\sqrt{t}}\,dt - f(x) + \Big(x - \frac{1}{3}\Big)f'(x) \bigg| \le \frac{\|f''\|_\infty}{2}\bigg(\frac{4}{45} + \Big(x - \frac{1}{3}\Big)^2\bigg).$$
The optimal point
$$x = \mu(0,1) = \frac{1}{3}$$
is again shifted to the left of the mid-point due to the $t^{-1/2}$ singularity at the origin.
4.4 Chebyshev
The mean and variance for the Chebyshev weight $w(t) = 1/\sqrt{1-t^2}$, $a = -1$, $b = 1$ are
$$\mu(-1,1) = \frac{\int_{-1}^1 t/\sqrt{1-t^2}\,dt}{\int_{-1}^1 1/\sqrt{1-t^2}\,dt} = 0$$
and
$$\sigma^2(-1,1) = \frac{\int_{-1}^1 t^2/\sqrt{1-t^2}\,dt}{\int_{-1}^1 1/\sqrt{1-t^2}\,dt} - 0^2 = \frac{1}{2}$$
respectively. Hence, the inequality corresponding to the Chebyshev weight is
$$\bigg| \frac{1}{\pi}\int_{-1}^1 \frac{f(t)}{\sqrt{1-t^2}}\,dt - f(x) + x f'(x) \bigg| \le \frac{\|f''\|_\infty}{2}\Big(\frac{1}{2} + x^2\Big).$$
The optimal point
$$x = \mu(-1,1) = 0$$
is at the mid-point of the interval, reflecting the symmetry of the Chebyshev weight over its interval.
4.5 Laguerre
The conditions in Theorem 3.1 are not violated if the integration domain is infinite. The Laguerre weight $w(t) = e^{-t}$ is defined for positive values, $t \in [0,\infty)$. The mean and variance of the Laguerre weight are
$$\mu(0,\infty) = \frac{\int_0^\infty t e^{-t}\,dt}{\int_0^\infty e^{-t}\,dt} = 1$$
and
$$\sigma^2(0,\infty) = \frac{\int_0^\infty t^2 e^{-t}\,dt}{\int_0^\infty e^{-t}\,dt} - 1^2 = 1$$
respectively. The appropriate inequality is
$$\bigg| \int_0^\infty e^{-t}f(t)\,dt - f(x) + (x-1)f'(x) \bigg| \le \frac{\|f''\|_\infty}{2}\big(1 + (x-1)^2\big),$$
from which the optimal sample point $x = 1$ may be deduced.
4.6 Hermite
Finally, the Hermite weight is $w(t) = e^{-t^2}$, defined over the entire real line. The mean and variance for this weight are
$$\mu(-\infty,\infty) = \frac{\int_{-\infty}^\infty t e^{-t^2}\,dt}{\int_{-\infty}^\infty e^{-t^2}\,dt} = 0$$
and
$$\sigma^2(-\infty,\infty) = \frac{\int_{-\infty}^\infty t^2 e^{-t^2}\,dt}{\int_{-\infty}^\infty e^{-t^2}\,dt} - 0^2 = \frac{1}{2}$$
respectively. The inequality from Theorem 3.1 with the Hermite weight function is thus
$$\bigg| \frac{1}{\sqrt{\pi}}\int_{-\infty}^\infty e^{-t^2}f(t)\,dt - f(x) + x f'(x) \bigg| \le \frac{\|f''\|_\infty}{2}\Big(\frac{1}{2} + x^2\Big),$$
which results in an optimal sampling point of $x = 0$.
5 Application in Numerical Integration
Define a grid $I_n : a = \xi_0 < \xi_1 < \cdots < \xi_{n-1} < \xi_n = b$ on the interval $[a,b]$, with $x_i \in [\xi_i, \xi_{i+1}]$ for $i = 0, 1, \ldots, n-1$. The following quadrature formulae for weighted integrals are obtained.
THEOREM 5.1. Let the conditions in Theorem 3.1 hold. The following weighted quadrature rule holds:
$$\int_a^b w(t)f(t)\,dt = A(f,\xi,x) + R(f,\xi,x) \tag{5.1}$$
where
$$A(f,\xi,x) = \sum_{i=0}^{n-1}\big[h_i f(x_i) - h_i(x_i - \mu_i)f'(x_i)\big]$$
and
$$|R(f,\xi,x)| \le \frac{\|f''\|_\infty}{2}\sum_{i=0}^{n-1}\big[(x_i - \mu_i)^2 + \sigma_i^2\big]h_i. \tag{5.2}$$
The parameters $h_i$, $\mu_i$ and $\sigma_i^2$ are given by
$$h_i = m_0(\xi_i,\xi_{i+1}), \quad \mu_i = \mu(\xi_i,\xi_{i+1}), \quad \text{and} \quad \sigma_i^2 = \sigma^2(\xi_i,\xi_{i+1})$$
respectively.
Proof. Apply Theorem 3.1 over the interval $[\xi_i,\xi_{i+1}]$ with $x = x_i$ to obtain
$$\bigg| \int_{\xi_i}^{\xi_{i+1}} w(t)f(t)\,dt - h_i f(x_i) + h_i(x_i - \mu_i)f'(x_i) \bigg| \le \frac{\|f''\|_\infty}{2} h_i \big[(x_i - \mu_i)^2 + \sigma_i^2\big].$$
Summing over $i$ from 0 to $n-1$ and using the triangle inequality produces the desired result. $\Box$
COROLLARY 5.2. The optimal locations of the points $x_i$, $i = 0, 1, \ldots, n-1$, and the grid distribution $I_n$ satisfy
$$x_i = \mu_i, \quad i = 0, 1, \ldots, n-1, \tag{5.3}$$
$$\xi_i = \frac{\mu_{i-1} + \mu_i}{2}, \quad i = 1, 2, \ldots, n-1, \tag{5.4}$$
producing the composite generalized mid-point rule for weighted integrals
$$\int_a^b w(t)f(t)\,dt = \sum_{i=0}^{n-1} h_i f(x_i) + R(f,\xi,n) \tag{5.5}$$
where the remainder is bounded by
$$|R(f,\xi,n)| \le \frac{\|f''\|_\infty}{2}\sum_{i=0}^{n-1} h_i\sigma_i^2. \tag{5.6}$$
Proof. The proof follows that of Corollary 3.4, where it is observed that the minimum of the bound (5.2) will occur at $x_i = \mu_i$. Differentiating the right hand side of (5.2) gives
$$\frac{d}{d\xi_i}\sum_{j=0}^{n-1}\big[(x_j - \mu_j)^2 + \sigma_j^2\big]h_j = 2w(\xi_i)(x_i - x_{i-1})\Big(\xi_i - \frac{x_{i-1} + x_i}{2}\Big).$$
Inspection of the second derivative at the root reveals that the stationary point is a minimum, and hence the result is proved. $\Box$
6 Numerical Results
In this section, for illustration, the quadrature rule of Section 5 is applied to the integral
$$\int_0^1 100\,t\ln(1/t)\cos(4\pi t)\,dt = -1.972189325199166. \tag{6.1}$$
This is evaluated using the following three rules:
(1) the composite mid-point rule, where the grid has a uniform step-size and the node
is simply the mid-point of each sub-interval,
(2) the composite generalized mid-point rule (5.1). The grid, In, is uniform and the
nodes are the mean point of each sub-interval (5.3),
(3) equation (5.5) where the grid is distributed according to (5.4) and the nodes are
the sub-interval means (5.3).
Table 1 shows the numerical error of each method for an increasing number of sample points. For a uniform grid, it can be seen that changing the location of the sampling point from the midpoint [method (1)] to the mean point [method (2)] roughly doubles the accuracy. Changing the grid distribution as well as the node point [method (3)] increases the accuracy over the composite mid-point rule [method (1)] by approximately an order of magnitude. It is important to note that the nodes and weights for method (3) can be easily calculated numerically using an iterative scheme. For example, on a Pentium-90 personal computer with $n = 64$, calculating (5.3) and (5.4) took close to 37 seconds.
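Method (3) can be sketched as follows. This is an illustrative reimplementation, not the authors' FORTRAN code: the fixed-point sweep over (5.3)-(5.4) is one plausible form of the iterative scheme, and the closed-form moment antiderivatives are for the log weight of (6.1):

```python
import math

# Composite generalized mid-point rule (5.5) for w(t) = ln(1/t) on (0, 1),
# with nodes and grid from a fixed-point iteration of (5.3)-(5.4).
# F0, F1 are antiderivatives of t^i * ln(1/t); the value at t = 0 is the limit 0.
def F0(t): return 0.0 if t == 0.0 else t - t * math.log(t)
def F1(t): return 0.0 if t == 0.0 else t * t / 4 - (t * t / 2) * math.log(t)

def cell(a, b):
    """Return (m0, mu) = (zeroth moment, mean) of the log weight on [a, b]."""
    m0 = F0(b) - F0(a)
    return m0, (F1(b) - F1(a)) / m0

def method3(f, n, sweeps=200):
    xi = [i / n for i in range(n + 1)]   # start from a uniform grid
    for _ in range(sweeps):              # sweep (5.4): xi_i = (mu_{i-1} + mu_i)/2
        mus = [cell(xi[i], xi[i + 1])[1] for i in range(n)]
        xi = [0.0] + [(mus[i - 1] + mus[i]) / 2 for i in range(1, n)] + [1.0]
    return sum(m0 * f(mu) for m0, mu in (cell(xi[i], xi[i + 1]) for i in range(n)))

f = lambda t: 100 * t * math.cos(4 * math.pi * t)
Q = method3(f, 64)   # approximates (6.1) = -1.972189325199166
```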
Note that equations (5.3) and (5.4) are quite general in nature and only rely on the weight insofar as knowledge of the first two moments is required. This contrasts with Gaussian quadrature, where for an $n$-point rule the first $n+1$ moments are needed (or equivalently the $2n+1$ coefficients of the continued fraction expansion (Rutishauser 1962b; Rutishauser 1962a)) to construct the appropriate orthogonal polynomial, after which a root-finding procedure is called to find the abscissae (Atkinson 1989). This procedure, of course, can be greatly simplified for the more well known weight functions (Gautschi 1994).

n    | Error (1) | Error (2) | Error (3) | Error ratio (3) | Bound ratio (3)
4    | 1.97(0)   | 2.38(0)   | 2.48(0)   | -               | -
8    | 3.41(-1)  | 2.93(-1)  | 2.35(-1)  | 10.56           | 3.90
16   | 8.63(-2)  | 5.68(-2)  | 2.62(-2)  | 8.97            | 3.95
32   | 2.37(-2)  | 1.31(-2)  | 4.34(-3)  | 6.04            | 3.97
64   | 6.58(-3)  | 3.20(-3)  | 9.34(-4)  | 4.65            | 3.99
128  | 1.82(-3)  | 7.94(-4)  | 2.23(-4)  | 4.18            | 3.99
256  | 4.98(-4)  | 1.98(-4)  | 5.51(-5)  | 4.05            | 4.00

Table 1: The error in evaluating (6.1) under different quadrature rules. The parameter $n$ is the number of sample points.
The second last column of Table 1 shows the ratio of the numerical errors for method (3), and the last column the ratio of the theoretical error bound (5.6):
$$\text{Bound ratio (3)} = \frac{|R(f,\xi,n/2)|}{|R(f,\xi,n)|}. \tag{6.2}$$
As $n$ increases, the numerical ratio approaches the theoretical one. The theoretical ratio is consistently close to 4. This value suggests an asymptotic form of the error bound
$$|R(f,\xi,n)| \sim O\Big(\frac{1}{n^2}\Big) \tag{6.3}$$
for the log weight. Similar results have been obtained for the other weights of Section 4. This is consistent with mid-point type rules, and it is anticipated that developing other product rules, for example a generalized trapezoidal or Simpson's rule, will yield more accurate results.
REFERENCES
Atkinson, K. E. (1989). An Introduction to Numerical Analysis. John Wiley.
Barnett, N. S., I. S. Gomm, and L. Armour (1995). Location of the optimal sampling
point for the quality assessment of continuous streams. Austral. J. Statist. 37(2),
145-152.
Blake, J. R. and D. C. Gibson (1987). Cavitation bubbles near boundaries. Ann.
Rev. Fluid Mech. 19, 99-123.
Brebbia, C. H. and J. Dominguez (1989). Boundary elements: an introductory course.
Southampton: Computational Mechanics.
Cerone, P., S. S. Dragomir, and J. Roumeliotis (1998). An inequality of Ostrowski
type for mappings whose derivatives are bounded and applications. Submitted to
East Asian Mathematical Journal.
Gautschi, W. (1994). Algorithm 726: ORTHPOL - a package of routines for gen-
erating orthogonal polynomials and Gauss-type quadrature rules. ACM Trans.
Math. Software 20, 21-62.
Mitrinović, D. S., J. E. Pečarić, and A. M. Fink (1994). Inequalities for functions
and their integrals and derivatives. Dordrecht: Kluwer Academic.
Rallison, J. M. and A. Acrivos (1978). A numerical study of the deformation and
burst of a viscous drop in an extensional flow. J. Fluid Mech. 89, 191-200.
Roumeliotis, J., G. R. Fulford, and A. Kucera (1997). Boundary integral equation
applied to free surface creeping flow. In B. J. Noye, M. D. Teubner, and A. W.
Gill (Eds.), Computational Techniques and Applications: CTAC97, Singapore,
pp. 599-607. World Scientific.
Ostrowski Type Inequality 119
Rutishauser, H. (1962a). Algorithm 125: WEIGHTCOEFF. CACM 5(10), 510-511.
Rutishauser, H. (1962b). On a modification of the QD-algorithm with Graeffe-type
convergence. In Proceedings of the IFIPS Congress, Munich.
Stroud, A. H. and D. Secrest (1966). Gaussian quadrature formulas. Prentice Hall.
School of Communications and Informatics,
Victoria University of Technology,
PO Box 14428,
MCMC, Melbourne,
Victoria, 8001,
Australia
A NOTE ON SOME HIGHER ORDER CUMULANTS IN
k PARAMETER NATURAL EXPONENTIAL FAMILY
HYUN CHUL KIM
J. KSIAM Vol.3, No.2, 157-160, 1999
Abstract. We derive the cumulants of a minimal sufficient statistic in the k parameter
natural exponential family from the parameter function and the partial parameter function.
We find that these cumulants combine merits of central moments and of general
cumulants: the first three cumulants are the central moments themselves, and the fourth
cumulant has a form related to the kurtosis.
1. Introduction
In this paper, we present some interesting results about the higher order cumulants in
the k parameter natural exponential family. We follow the notation of Bar-Lev [1].
Let T = (T1, ..., Tk), k ≥ 2, be a minimal sufficient statistic for an exponential model
that constitutes a k parameter natural exponential family. Consider a partition of T
into (T1, T2), where T1 = (T1, ..., Tr) and T2 = (Tr+1, ..., Tk), 1 ≤ r ≤ k−1. We
present some higher order cumulants of T, and conditional cumulants of T1 given
T2 = t2.
Assume that the model from which the sample observations X1, ..., Xn are taken has
the form of a k parameter exponential family. Then the joint pdf of X = (X1, ..., Xn)
may be represented as

    f_X(x; θ) = { ∏_{i=1}^n h(x_i) I_S(x_i) } exp{ ∑_{i=1}^k θ_i ∑_{j=1}^n u_i(x_j) − n l(θ) },   (1.1)

where S is the common support of the Xi's, I_S(·) is the indicator function of the
set S, and θ = (θ1, ..., θk) (∈ Θ) is the vector of natural parameters. Define
T_i = ∑_{j=1}^n u_i(x_j), i = 1, ..., k, and let (T1, T2) be a minimal sufficient statistic for θ. In
addition, consider a partition of θ into (θ1, θ2), where θ1 = (θ1, ..., θr) and θ2 =
(θ_{r+1}, ..., θk).
The derivation of moments or conditional moments from the pdf (1.1) is cumbersome
and difficult to carry out. Most authors convert the pdf into natural exponential family
form through reparametrization (see Bickel and Doksum [2, p.70]; Bar-Lev [1]; Lehmann
[4, p.57]). The pdf of the k parameter natural exponential family has the form

    f_T(t; θ) = g(t) exp{ θ · t − n l(θ) } I_{S_T}(t)   (1.2)

AMS Mathematics Subject Classification: 62B05, 62F10
Key words and phrases: cumulant, conditional cumulant, moment, natural exponential family, sufficient
statistics
for some measurable function g. In the case of (1.2) we obtain the moment generating
function of T easily.
For θ ∈ Θ, the moment generating function of T is

    E[ exp{ ∑_{i=1}^k s_i T_i } ] = exp{ n l(θ1 + s1, ..., θk + sk) − n l(θ1, ..., θk) },   (1.3)

from which moments of T can be obtained. Cumulants of T can be derived by taking
logarithms in eq. (1.3), differentiating with respect to the s_i's, and substituting
s_i = 0, i = 1, ..., k. But we can also calculate the cumulants easily by differentiating
l(θ). We refer to l(θ) as the parameter function.
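The route through l(θ) can be checked on a concrete one-parameter family (a sketch with assumed details, not from the paper: Poisson(λ) with n = 1, for which l(θ) = exp(θ), θ = log λ, and every cumulant of T equals λ):

```python
import math

# For Poisson(λ) in natural exponential form, the parameter function is
# l(θ) = exp(θ) with θ = log λ (n = 1 observation); then ∂^k l/∂θ^k = λ
# for every k, matching the known Poisson cumulants.
def l(theta):
    return math.exp(theta)

def deriv(f, x, k, h=1e-2):
    # k-th order central finite difference approximation of f^(k)(x)
    return sum((-1)**i * math.comb(k, i) * f(x + (k/2 - i)*h)
               for i in range(k + 1)) / h**k

lam = 2.0
theta = math.log(lam)
cumulants = [deriv(l, theta, k) for k in (1, 2, 3, 4)]
print(cumulants)  # each entry ≈ λ = 2.0
```

The first four numerical derivatives all reproduce λ, as the Results below predict for this family.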
Bar-Lev [1] defines another parameter function from which conditional cumulants of T1,
given T2 = t2, can be obtained, much as from a moment generating function. We refer
to it as the partial parameter function:

    b(θ1 : t2) ≡ f_{T2}(t2 : θ) exp{ n l(θ) − θ2 · t2 }.   (1.4)

Conditional cumulants are calculated by differentiating log b(θ1 : t2).
He shows that the first two cumulants equal the corresponding moments. The results
agree with the cumulants in Kendall and Stuart [3, p.73]. We show that the higher order
cumulants derived from the parameter function also agree with the usual cumulants, and
discuss their usefulness.
2. Results
We can get l(θ) by integrating eq. (1.2), since f_T(t; θ) is a pdf. The parameter
function is

    l(θ) = (1/n) log ∫ g(t) exp(θ · t) dt.   (2.1)

Now, differentiating (2.1) with respect to θ_i, i = 1, ..., k, we get the following results.

Results 1: Cumulants of the minimal sufficient statistic

    n ∂l(θ)/∂θ_i = E(T_i) = μ_i
    n ∂²l(θ)/∂θ_i² = E((T_i − μ_i)²) = σ_i²
    n ∂³l(θ)/∂θ_i³ = E((T_i − μ_i)³)
    n ∂⁴l(θ)/∂θ_i⁴ = E((T_i − μ_i)⁴) − 3σ_i⁴

The first cumulant is the first moment μ_i, the same as Kendall and
Stuart's first cumulant. And we can find the kurtosis of T_i, i = 1, ..., k, from the
results above:

    k_i = [n ∂⁴l(θ)/∂θ_i⁴] / [n ∂²l(θ)/∂θ_i²]² = E((T_i − μ_i)⁴)/σ_i⁴ − 3.   (2.2)
This is also interesting. Excess kurtosis is usually defined by subtracting 3 from the
ordinary kurtosis, because the kurtosis of the normal distribution is 3. The kurtosis
calculated from l(θ) is already lessened by 3, so the normal distribution gives an excess
kurtosis of 0.
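Equation (2.2) can be exercised the same way (the same hypothetical Poisson setup is assumed, not from the paper); for Poisson(λ) the excess kurtosis is 1/λ, with the 3 already subtracted:

```python
import math

# l(θ) = exp(θ) for Poisson(λ), θ = log λ, n = 1; eq. (2.2) then gives the
# excess kurtosis l''''(θ)/l''(θ)^2 = λ/λ^2 = 1/λ directly, no extra "-3".
def l(theta):
    return math.exp(theta)

def deriv(f, x, k, h=1e-2):
    # k-th order central finite difference approximation of f^(k)(x)
    return sum((-1)**i * math.comb(k, i) * f(x + (k/2 - i)*h)
               for i in range(k + 1)) / h**k

lam = 2.0
theta = math.log(lam)
k_excess = deriv(l, theta, 4) / deriv(l, theta, 2)**2
print(k_excess)  # ≈ 1/λ = 0.5
```

As λ → ∞ the Poisson approaches the normal, and the formula tends to 0, the normal excess kurtosis.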
Next, we get conditional cumulants of T1, given T2 = t2, by differentiating
log b(θ1 : t2) with respect to θ_i, i = 1, ..., r.

Results 2: Conditional cumulants of the minimal sufficient statistic

    ∂ log b(θ1 : t2)/∂θ_i = E(T_i | T2 = t2) = μ*_i
    ∂² log b(θ1 : t2)/∂θ_i² = E((T_i − μ*_i)² | T2 = t2) = σ*_i²
    ∂³ log b(θ1 : t2)/∂θ_i³ = E((T_i − μ*_i)³ | T2 = t2)
    ∂⁴ log b(θ1 : t2)/∂θ_i⁴ = E((T_i − μ*_i)⁴ | T2 = t2) − 3σ*_i⁴

These results are consistent with Results 1 above.
3. Concluding Remarks
In this section, we compare some types of moments with our results derived from the
parameter function l(θ) and the partial parameter function b(θ1 : t2). Table 1 shows
that our results are more convenient than the others.
In general, what we want when calculating moments are summary statistics (mean,
variance, skewness, kurtosis, etc.). In this view, no type of moment is always superior
to the other usual types of moment. But our results show that the cumulants from the
parameter function are preferable to the other types of moments, especially at fourth
order for calculating kurtosis. And the cumulants are very easy to calculate.
References
1. S. K. Bar-Lev, A Derivation of Conditional Cumulants in Exponential Models, The American Statis-
tician, 48(2), 1994, 126-129.
2. P. J. Bickel and K. A. Doksum, Mathematical Statistics, Holden-Day Inc., 1977.
3. M. Kendall and A. Stuart, The Advanced Theory of Statistics Vol.1 Distribution Theory, Charles
Griffin & Co., 1977.
4. E. L. Lehmann, Testing Statistical Hypotheses, second ed., Wiley, 1986.
Dept. of Informatics and Statistics
Kunsan National Univ. Kunsan, 573-701, Korea [email protected]
Table 1. Comparison with other types of moments

  type \ order                        first   second             third                      fourth
  μ'_k = E(T_i^k)                     μ_i     σ_i² + μ_i²        E(T_i³)                    E(T_i⁴)
  μ_k = E((T_i − μ_i)^k)              0       σ_i²               E((T_i − μ_i)³)            E((T_i − μ_i)⁴)
  μ_(k) = E(T_i(T_i−1)···(T_i−k+1))   μ_i     σ_i² + μ_i² − μ_i  E(T_i(T_i−1)(T_i−2))       E(T_i(T_i−1)(T_i−2)(T_i−3))
  n ∂^k l(θ)/∂θ_i^k                   μ_i     σ_i²               E((T_i − μ_i)³)            E((T_i − μ_i)⁴) − 3σ_i⁴
  ∂^k log b(θ1:t2)/∂θ_i^k             μ*_i    σ*_i²              E((T_i − μ*_i)³ | T2=t2)   E((T_i − μ*_i)⁴ | T2=t2) − 3σ*_i⁴
J. KSIAM Vol. 3, No.2, 161-171, 1999
AERODYNAMIC SENSITIVITY ANALYSIS FOR NAVIER-STOKES EQUATIONS
Hyoung-Jin Kim, Chongam Kim, Oh-Hyun Rho, and Ki Dong Lee
Abstract
Aerodynamic sensitivity analysis codes are developed via hand differentiation, using a direct differentiation method and an adjoint method respectively, from the discrete two-dimensional compressible Navier-Stokes equations. Unlike previous studies, the Baldwin-Lomax algebraic turbulence model is also differentiated by hand to obtain design sensitivities with respect to design variables of interest in turbulent flows. The discrete direct sensitivity equations and adjoint equations are efficiently solved by the same time integration scheme adopted in the flow solver routine. The required memory for the adjoint sensitivity code is greatly reduced, at the cost of computational time, by leaving the large banded flux Jacobian matrix unassembled. Direct sensitivity code results are found to coincide exactly with sensitivity derivatives obtained by finite differences. Adjoint code results for a turbulent flow case show slight deviations from the exact results due to the limitation of the algebraic turbulence model in implementing the adjoint formulation. However, the current adjoint sensitivity code yields much more accurate sensitivity derivatives than an adjoint code with the turbulence eddy viscosity kept constant, which is a usual assumption in prior research.
1. Introduction
With the advances in computational fluid dynamics, design optimization methods are more important than ever in aerodynamic design. In the application of gradient-based optimization methods to aerodynamic design problems, one of the major concerns is an accurate and efficient calculation of the sensitivity derivatives of the system responses of interest, usually aerodynamic coefficients or surface pressure distributions, with respect to design variables.
The finite difference approximation approach is the easiest to use, since it does not require any development of a sensitivity code. However, the accuracy of the finite difference approach depends critically on the perturbation size of the design variables and on the flow initialization.[1]
A robust way of computing sensitivity derivatives is to build a sensitivity analysis code. A sensitivity analysis code can be developed by direct differentiation methods[2-5] or adjoint variable methods[6-8]. Direct differentiation methods are more economical than adjoint variable methods when the numbers of objectives and constraints are larger than the number of design variables.
Key Words: Sensitivity derivative, direct differentiation, adjoint variable
Adjoint variable methods are preferable in the opposite case. Both methods can be handled with either a discrete or a continuous approach. In the discrete approach, the discretized flow equations are differentiated, while in the continuous approach the flow equations are differentiated before they are discretized. The discrete approach can be advantageous in the sense that the derivatives obtained are consistent with finite-difference derivatives regardless of the computational grid size. On the other hand, the continuous approach provides a clear insight into the nature of the sensitivity solution.
Previous work on sensitivity analysis by hand differentiation, in both the direct differentiation and adjoint variable methods, has shown difficulties in differentiating the viscosity terms, which reflect the variation of laminar and/or turbulent viscosities with respect to the variation of design variables.[5,6] Automatic differentiation tools such as ADIFOR[3,4] and Odyssée[7,8] have been successfully used to generate sensitivity codes from Navier-Stokes codes including turbulence models. However, the sensitivity code generated by automatic differentiation is much less efficient than a hand-differentiated one.[3,8]
Another problem with adjoint codes generated by automatic differentiation in reverse mode[7,8] or by hand[2] is memory. They require much more memory than the original flow solver and are prohibitive for large two-dimensional problems and all three-dimensional problems.
In this study, a Navier-Stokes solver with the Baldwin-Lomax algebraic turbulence model is directly differentiated by hand, and a corresponding adjoint code is developed from the direct-differentiated sensitivity code. The required memory for the adjoint sensitivity code is greatly reduced, at the cost of computational time, by leaving the large banded flux Jacobian matrix unassembled. Sensitivity derivatives obtained by the sensitivity codes developed herein are compared with those calculated using the finite difference approximation.
The rest of this paper presents a brief review of the flow solver used in this study, and the basic theory of the direct differentiation method and the adjoint variable method in the discrete approach. Computational results are then given for example problems, including subsonic and transonic laminar and turbulent flows around the NACA0012 airfoil.
2. Flow Analysis
A two-dimensional Navier-Stokes solver developed and validated in Refs. [9,10] was used for the flow analysis. The Reynolds-averaged two-dimensional compressible Navier-Stokes equations in generalized coordinates are used in conservation form, based on a cell-centered finite volume approach, given as

    ∂/∂t (Q/J) + R = 0,   (1)

where R is the residual vector and Q is the four-element vector of conserved flow variables. The Navier-Stokes equations are discretized in time using the Euler implicit method and linearized by employing the flux Jacobian. This results in a large system of linear equations in delta form at each time step:

    [ I/(JΔt) + ∂R/∂Q ] ΔQⁿ = −Rⁿ.   (2)
Roe's Flux Difference Splitting (FDS) scheme was adopted for the space discretization of the inviscid flux terms of the residual vector on the right-hand side; a MUSCL approach with the Koren limiter is employed to obtain third order accuracy. The central difference method is used for the viscous flux terms of the residual vector. In the implicit part, Beam and Warming's Alternating Direction Implicit (ADI) method is used, and van Leer's Flux Vector Splitting (FVS) is employed with first order accuracy for the flux Jacobian. The flux Jacobian for the viscous part is neglected in the implicit part, since it does not influence the solution accuracy. Turbulence effects were considered using the Baldwin-Lomax algebraic model with a relaxation technique. All boundary conditions were specified explicitly. One-dimensional characteristic conditions were used for the inflow and outflow boundaries. The no-slip condition and adiabatic wall condition were specified on the solid wall, and local time stepping was used.
A C-type grid system around the airfoil was generated by a conformal mapping technique, giving 135 points in the chordwise direction, 41 points in the normal direction and 95 points on the airfoil surface.
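The delta-form update (2) can be mimicked on a scalar model problem (a sketch with an assumed residual R(q) = q² − 2, not the flow equations): the linearized implicit step tends to a Newton step as Δt grows, driving R to zero:

```python
# Delta-form implicit iteration for a scalar model residual R(q) = q**2 - 2:
# [1/dt + dR/dq] * dq = -R(q), then q <- q + dq.  Steady state: q = sqrt(2).
def steady_state(q=1.0, dt=10.0, iters=100):
    for _ in range(iters):
        dq = -(q**2 - 2.0) / (1.0/dt + 2.0*q)
        q += dq
    return q

print(steady_state())  # ≈ 1.41421356... = sqrt(2)
```

The 1/Δt term damps the update for small time steps; as Δt → ∞ the left-hand side reduces to the pure flux-Jacobian linearization.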
3. Sensitivity Analysis
Direct Differentiation Method
The discrete residual vector of the nonlinear aerodynamic analysis for steady problems can be written symbolically as

    R[Q(β), X(β), β] = 0,   (3)

where X is the grid position vector and β is the vector of design variables. Boundary conditions are also included in the residual vector R.
Eq. (3) is directly differentiated with respect to β_k to yield the following equation:

    dR/dβ_k = [∂R/∂Q]{dQ/dβ_k} + [∂R/∂X]{dX/dβ_k} + {∂R/∂β_k} = 0.   (4)
The grid sensitivity vector {dX/dβ_k} can be calculated by differentiating the grid generation code or simply by applying the finite difference approximation. However, it can be obtained analytically if the grid points are analytically modified during the design process.
In order to find the solution {dQ/dβ_k} of Eq. (4), a pseudo-time term is added and the same time integration scheme as in the flow solver is adopted. Applying the Euler implicit method, followed by linearization with the first-order-accurate van Leer flux Jacobian, gives the following system of linear algebraic equations:

    [ I/(JΔt) + ∂R/∂Q ] Δ{dQ/dβ_k} = −{dR/dβ_k}ⁿ.   (5)
The above system of equations is solved with the same ADI scheme used for the flow solver. By comparing Eqs. (2) and (5), it can be noted that one can obtain a direct sensitivity code by directly differentiating the right-hand side of the discretized flow equations. All of the derivative terms in Eq. (4) are differentiated by hand except the grid sensitivity vector {dX/dβ_k}, which is calculated from a grid generation code.
The Jacobian matrices [∂R/∂Q] and [∂R/∂X] in Eq. (4) include Roe's FDS flux Jacobian and the viscous flux Jacobian, and are very large banded matrices, since the inviscid and viscous fluxes are third- and second-order accurate, respectively. In order to avoid this problem, the terms [∂R/∂Q]{dQ/dβ_k} and [∂R/∂X]{dX/dβ_k} of Eq. (4) are calculated without explicit formulation of the very large Jacobian matrices [∂R/∂Q] and [∂R/∂X]. Van Leer's flux Jacobian in the LHS is frozen at the steady-state value, which increases the required memory but reduces the computational time. When the flow variable sensitivity vector {dQ/dβ_k} is obtained, the total derivative of the system response of interest, C_j, can be calculated. C_j is a function of the flow variables Q, the grid position X, and the design variables β; i.e.,

    C_j = C_j[Q(β), X(β), β].   (6)
The sensitivity derivative of the aerodynamic coefficient C_j with respect to the k-th design variable β_k is given by

    dC_j/dβ_k = {∂C_j/∂Q}ᵀ{dQ/dβ_k} + {∂C_j/∂X}ᵀ{dX/dβ_k} + ∂C_j/∂β_k.   (7)
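A scalar sketch of the direct-differentiation route (3)-(5) (hypothetical residual and solver, not the authors' code): solve R(Q, β) = 0, differentiate R once to get dQ/dβ, and compare with a finite-difference perturbation of the converged solution:

```python
# Toy residual R(Q, b) = Q**3 + b*Q - 1 = 0; differentiating gives
# (dR/dQ) dQ/db + dR/db = 0, so dQ/db = -(dR/db)/(dR/dQ) at the solution.
def solve_R(b, q=1.0):
    # Newton iteration for R(Q, b) = 0
    for _ in range(50):
        q -= (q**3 + b*q - 1.0) / (3*q**2 + b)
    return q

b = 2.0
Q = solve_R(b)
dQ_db_direct = -Q / (3*Q**2 + b)          # direct differentiation of R
h = 1e-6
dQ_db_fd = (solve_R(b + h) - solve_R(b)) / h  # finite-difference check
print(dQ_db_direct, dQ_db_fd)  # the two agree to several digits
```

This mirrors the validation strategy of the paper: the direct derivative should coincide with the finite-difference one when both are fully converged.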
Adjoint Variable Method
As the total derivative of the residual vector, {dR/dβ_k}, is null in the steady state, we can introduce adjoint variables and combine Eqs. (4) and (7) to obtain

    dC_j/dβ_k = {∂C_j/∂Q}ᵀ{dQ/dβ_k} + {∂C_j/∂X}ᵀ{dX/dβ_k} + ∂C_j/∂β_k
                + {λ_j}ᵀ( [∂R/∂Q]{dQ/dβ_k} + [∂R/∂X]{dX/dβ_k} + ∂R/∂β_k ).   (8)
If we find {λ_j} satisfying the following adjoint equation,

    [∂R/∂Q]ᵀ{λ_j} + {∂C_j/∂Q} = 0,   (9)
then we can obtain the sensitivity derivative of C_j with respect to β_k from the following equation:

    dC_j/dβ_k = {∂C_j/∂X}ᵀ{dX/dβ_k} + ∂C_j/∂β_k + {λ_j}ᵀ( [∂R/∂X]{dX/dβ_k} + ∂R/∂β_k ).   (10)
The adjoint equation (9) is also converted to the following system of linear algebraic equations and is solved by the ADI scheme:

    [ I/(JΔt) + ∂R/∂Q ]ᵀ Δ{λ_j} = −( [∂R/∂Q]ᵀ{λ_j} + {∂C_j/∂Q} )ⁿ.   (11)
The transposed flux Jacobian [∂R/∂Q]ᵀ in the LHS of Eq. (11) is van Leer's FVS flux Jacobian, frozen at the steady-state value. The transposed flux Jacobian [∂R/∂Q]ᵀ in the RHS includes Roe's FDS flux Jacobian and the viscous flux Jacobian and is a very large banded matrix. Unlike the flux Jacobian [∂R/∂Q] of the direct differentiation method, all the elements of [∂R/∂Q]ᵀ must be explicitly calculated. In prior research on discrete adjoint variable methods[2,7], all the elements of the Jacobian matrix [∂R/∂Q]ᵀ were calculated and assembled, at the cost of a very large memory requirement. Although the computational time can be decreased because the elements of the Jacobian matrix are calculated only once, the memory requirement is prohibitive for large two-dimensional problems and all three-dimensional problems.
In the present study, the elements of [∂R/∂Q]ᵀ are calculated and multiplied by the corresponding elements of the adjoint vector {λ_j}, and thus the large matrix [∂R/∂Q]ᵀ need not be assembled. This increases the computational time, as the elements of the flux Jacobian matrix [∂R/∂Q]ᵀ have to be calculated every iteration. However, the required memory can be remarkably reduced, to the order of the memory required by the flow solver.
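The direct/adjoint duality in (4) and (7)-(10) can be seen on a small linear toy problem (the 2×2 system and all names are illustrative assumptions, not the flow equations): the adjoint route costs one extra solve no matter how many design variables there are:

```python
# For a linear "flow" residual R(q, b) = A q - f(b) = 0 and response C = g·q,
# one adjoint solve A^T λ = -g yields dC/db_k = λ·(∂R/∂b_k) for every design
# variable b_k, while the direct route needs one solve of A per variable.
def solve2(M, r):
    # Cramer's rule for a 2x2 system M x = r
    det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
    return [(r[0]*M[1][1] - M[0][1]*r[1]) / det,
            (M[0][0]*r[1] - r[0]*M[1][0]) / det]

A  = [[4.0, 1.0], [1.0, 3.0]]
At = [[4.0, 1.0], [1.0, 3.0]]        # A is symmetric here, so A^T = A
g  = [1.0, 2.0]
dR_db = [[-1.0, 0.0], [0.0, -2.0]]   # ∂R/∂b_k for two design variables

# Direct: one linear solve per design variable
direct = []
for k in range(2):
    dq = solve2(A, [-dR_db[k][0], -dR_db[k][1]])
    direct.append(g[0]*dq[0] + g[1]*dq[1])

# Adjoint: a single solve, then cheap dot products
lam = solve2(At, [-g[0], -g[1]])
adjoint = [lam[0]*dR_db[k][0] + lam[1]*dR_db[k][1] for k in range(2)]
print(direct, adjoint)  # the two routes give identical derivatives
```

This is the same bookkeeping as Eqs. (9)-(10): the adjoint vector replaces one flow-like solve per design variable by dot products against ∂R/∂β_k.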
In Ref. [3] it was reported that a hand-differentiated sensitivity code may take six man-months to two man-years, or even longer, to generate, whereas it takes about one man-week to generate a direct-differentiation sensitivity code by automatic differentiation. According to the authors' experience, however, less than two man-weeks were required to build a hand-differentiated direct differentiation sensitivity code, and one man-month to build an adjoint sensitivity code, for the two-dimensional Navier-Stokes equations. This indicates that hand differentiation is a viable approach compared to automatic differentiation in terms of the time required for sensitivity code generation.
4. Results & Discussion
In order to evaluate the ability of the direct and adjoint sensitivity codes to accurately calculate sensitivity derivatives, sensitivity analyses are conducted for the NACA0012 airfoil in laminar/turbulent flow fields in the subsonic and transonic regimes.
The results are compared with the derivatives computed by the following finite difference approximation:

    dC_j/dβ_k ≅ [ C_j(β_k + Δβ_k) − C_j(β_k) ] / Δβ_k.   (12)

The residual of the flow solver is reduced to 10⁻¹¹ of the freestream value for the finite difference calculation. The step size Δβ_k is 10⁻⁶ or 10⁻⁷, depending on the design variable and the flow condition.
The residuals of the sensitivity codes are reduced to 10⁻⁷ of the initial value of the residual. The initial values of the sensitivity derivatives {dQ/dβ_k} and the adjoint variables {λ_j} are set to zero.
In order to consider the sensitivity derivative with respect to a geometry change, one of the Hicks-Henne functions is adopted as a shape function:

    F(x) = sin³( π x^(ln 0.5 / ln 0.6) ).   (13)

The upper surface of the NACA0012 airfoil is modified as

    Y_new = Y_old + β F(x),   (14)

where β is a design variable. The incidence angle α is the other design variable in this study.
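A sketch of the bump function (13)-(14) (the helper names are illustrative): the exponent ln 0.5 / ln 0.6 places the peak of the sine cube at x = 0.6, and the perturbation vanishes at both ends of the chord:

```python
import math

# Hicks-Henne bump (13): at x = 0.6, x**(ln0.5/ln0.6) = 0.5, so sin^3 peaks at 1.
def F(x):
    return math.sin(math.pi * x ** (math.log(0.5) / math.log(0.6))) ** 3

def y_new(y_old, x, beta):
    # eq. (14): perturb the upper surface by beta * F(x)
    return y_old + beta * F(x)

print(F(0.6))          # ≈ 1.0 -- the bump peaks at x = 0.6
print(F(0.0), F(1.0))  # ≈ 0 at both ends, so the endpoints stay fixed
```

Because F vanishes at the leading and trailing edges, the design variable β reshapes only the interior of the upper surface.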
Direct Differentiation Approach
In order to validate the sensitivity code based on the direct differentiation method, sensitivity derivatives obtained by the direct sensitivity code are compared to those from the finite difference method. Computations are conducted for both subsonic and transonic turbulent flows.
The first example is a subsonic turbulent flow case. The flow condition is M∞ = 0.6, α = 2° and Reynolds number = 6,500,000. Fig. 1 shows pressure derivative (p′ = dp/dα) contours. Streamlines of the velocity sensitivity derivatives (u′, v′) in Fig. 2 show the effect of an incidence angle increment. Table 1 shows the computed sensitivity derivatives of the aerodynamic coefficients with respect to the incidence angle α and the coefficient β of the shape function. The sensitivity derivatives of the aerodynamic coefficients by the direct differentiation and finite difference methods agree with each other to four or five significant digits.
Table 1 Validation of direct sensitivity code: subsonic turbulence case (M∞ = 0.60, α = 2°, Re = 6,500,000)

              dCl/dα     dCd/dα       dCm/dα
FD (Δα=10⁻⁶)  7.379933   0.05013069   0.1343019
DD            7.379935   0.05013063   0.1343012

              dCl/dβ     dCd/dβ       dCm/dβ
FD (Δβ=10⁻⁶)  3.65374    0.0217594    -1.26809
DD            3.65373    0.0217605    -1.26808

Table 2 Validation of direct sensitivity code: transonic turbulence case (M∞ = 0.75, α = 2°, Re = 6,500,000)

              dCl/dα     dCd/dα       dCm/dα
FD (Δα=10⁻⁶)  8.37948    0.486983     0.222065
DD            8.37972    0.486994     0.222014

              dCl/dβ     dCd/dβ       dCm/dβ
FD (Δβ=10⁻⁷)  7.13337    -0.195777    -1.85788
DD            7.13346    -0.195774    -1.85790
The second example is a transonic turbulent flow case. The flow condition is M∞ = 0.75, α = 2° and Reynolds number = 6,500,000. A strong shock wave is formed on the upper surface of the airfoil. Fig. 3 shows pressure sensitivity contours with respect to the incidence angle α, which show a drastic variation of the pressure around the shock wave. Streamlines of the velocity derivative are similar to the subsonic case, but they approach flow separation downstream of the shock wave, as can be seen in Fig. 4. Table 2 presents the sensitivity derivatives of the aerodynamic coefficients with respect to the two design variables. As in the subsonic case, the sensitivity derivatives of the aerodynamic coefficients by the direct differentiation method and the finite difference method agree with each other to four or five significant digits.
Adjoint Variable Method
In order to validate the sensitivity code based on the adjoint variable method, the sensitivity derivatives calculated using the adjoint sensitivity code are compared to those from the direct sensitivity code. Computations are conducted for both subsonic laminar and transonic turbulent flows.
The first example is a subsonic laminar case. The flow condition is M∞ = 0.6, α = 1° and Reynolds number = 5,000. Fig. 5 shows λm1 contours and streamlines of (λm2, λm3). The subscript m means that the system response of interest C_j in the adjoint equations was Cm. The integer subscripts 1, 2 and 3 denote elements of the adjoint variable vector {λ_j} (= {λ_j1, λ_j2, λ_j3, λ_j4}ᵀ) corresponding to the conservative flow variable vector Q (= {ρ, ρu, ρv, e}ᵀ). The λm1 has a large gradient in the boundary layer region on the airfoil surface and around the stagnation streamline upstream of the airfoil leading edge. Streamlines of the vector (λm2, λm3) show a discontinuity of vector direction around the flow stagnation streamline upstream of the airfoil leading edge. Streamlines of the vector (λm2, λm3) resemble circulation lines with a negative lift, or the opposite flow direction.
In Table 3, we can note that the total derivatives of the aerodynamic coefficients with respect to the geometric perturbation by the direct and adjoint sensitivity codes agree with each other to 6 significant digits.
The second example is a transonic turbulent flow case. The flow condition is M∞ = 0.75, α = 2° and Reynolds number = 6,500,000. The adjoint sensitivity code was run in two modes, one with the turbulence eddy viscosity (µt) terms differentiated, and the other with constant turbulence eddy viscosity (µ′t = 0). λl1 contours around the NACA0012 airfoil are plotted in Fig. 6. The λl1 has a large gradient around the stagnation streamline and in the boundary layer on the airfoil surface.
Table 3 Validation of adjoint sensitivity code: subsonic laminar case (M∞ = 0.60, α = 1°, Re = 5,000)

      dCl/dβ        dCd/dβ        dCm/dβ
AV    -1.2065570    0.08362676    -0.12162388
FD    -1.2065569    0.08362679    -0.12162394
The discontinuity of the adjoint variables around the stagnation streamline in Figs. 5 and 6 is due to the existence of a singularity crossing the incoming stagnation streamline upstream of the airfoil leading edge.[11]
Table 4 Validation of adjoint sensitivity code: transonic turbulence case (M∞ = 0.75, α = 2°, Re = 6,500,000)

             dCl/dα            dCd/dα              dCm/dα
DD           8.37972           0.486994            0.222014
AV           8.28911 (0.9892)  0.486983 (0.9948)   0.242023 (1.0901)
AV (µ′t=0)   7.73075 (0.9226)  0.483649 (0.9931)   0.342835 (1.544)

             dCl/dβ            dCd/dβ              dCm/dβ
DD           7.13346           -0.195774           -1.85790
AV           7.17228 (0.9946)  -0.190391 (0.9725)  -1.86124 (1.0018)
AV (µ′t=0)   9.37707 (1.3145)  -0.123950 (0.6331)  -2.233926 (1.2024)
Table 4 compares the sensitivity derivatives of the aerodynamic coefficients with respect to the two design variables. The values in parentheses are sensitivity derivative ratios: the sensitivity derivatives from the adjoint sensitivity code normalized by the respective sensitivity derivatives from the direct differentiation code. The sensitivity derivatives of the adjoint sensitivity code with differentiated eddy viscosity terms show about 0.18~9% deviation from those of the direct differentiation code. On the other hand, the adjoint code results with µ′t = 0 show much larger deviations (0.5~54%).
Required Computational Time and Memory
Table 5 compares the computational time per iteration and the memory of the sensitivity codes developed herein with those of codes generated by automatic differentiation tools. The presented values are normalized by the time and memory required for the original flow solver. The sensitivity codes developed in this study require less computational time and memory than those produced by automatic differentiation. Although the computational time per iteration for an Odyssée-generated adjoint code was not available, it is significantly slower than a hand-differentiated code, by a factor of 5 in CPU time.[8]
The direct and adjoint sensitivity codes have convergence ratios similar to the flow solver, as they all employ van Leer's flux Jacobian in the implicit part. The direct sensitivity code requires much less computational time than finite difference approximations, as its convergence tolerance for obtaining accurate sensitivity derivatives is larger than that of the flow solver by approximately four orders of magnitude.
Table 5 Comparison of computational time and memory

                     Flow solver  Present DD  Present AV  AD DDᵃ  AD AVᵇ
Time per iteration   1            1.1         2.4         2~3     NA
Memory               1            1.8         1.8         2       10

a: by ADIFOR[4]  b: by Odyssée[7]  NA: Not Available
The adjoint sensitivity code requires more than twice the computational time of the direct sensitivity code, because the elements of the residual vector R are differentiated by the four elements of the flow variable vector Q instead of by the design variable β_k as in the direct code. Thus, the adjoint code is more economical than the direct sensitivity code for calculating sensitivity derivatives if the number of design variables is larger than twice the number of objectives and constraints.
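The break-even point in the last sentence can be sketched with a back-of-envelope cost model built from Table 5's normalized timings (illustrative assumption: cost scales linearly, about 1.1 units per direct solve and 2.4 per adjoint solve, overheads ignored):

```python
# Rough cost model: each design variable needs one direct solve (≈1.1
# flow-solver units), each objective/constraint needs one adjoint solve
# (≈2.4 units).  Values taken from Table 5; fixed overheads are ignored.
def direct_cost(n_design):
    return 1.1 * n_design

def adjoint_cost(n_objectives):
    return 2.4 * n_objectives

# With one objective, the adjoint route wins once the number of design
# variables exceeds 2.4/1.1 ≈ 2.2, i.e. roughly "more than twice the
# number of objectives and constraints".
for n in (1, 2, 3, 10):
    print(n, direct_cost(n) > adjoint_cost(1))
```

The crossover near two design variables per objective matches the rule of thumb stated above.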
5. Concluding Remarks
The direct differentiation approach and the adjoint variable approach are applied, respectively, to the discrete flow equations to develop aerodynamic sensitivity analysis codes for turbulent flows. A Navier-Stokes solver with the Baldwin-Lomax turbulence model is differentiated by hand to efficiently obtain design sensitivities with respect to the design variables of interest. The direct sensitivity equations and adjoint equations are efficiently solved by the same time integration scheme as the flow solver. The required memory for the adjoint sensitivity code is greatly reduced, at the cost of computational time, by leaving the large banded flux Jacobian matrix unassembled.
Sensitivity derivatives computed by the sensitivity codes almost exactly coincide with those of the finite difference method. Although the adjoint code results for the turbulent flow case show slight deviations from the exact results, the adjoint sensitivity code gives much more accurate sensitivity derivatives than an adjoint code with the turbulence eddy viscosity kept constant, which is a usual assumption in prior research.
The strategy adopted in this research shows promise for extension to three-dimensional problems, as it is efficient, accurate, and requires much less memory than prior approaches.
6. References
[1] Eyi, S. and Lee, K. D., "Effect of Sensitivity Calculation on Navier-Stokes Design Optimization," AIAA 94-0060, Jan. 1994.
[2] Eleshaky, M. E. and Baysal, O., "Aerodynamic Shape Optimization Using Sensitivity Analysis on Viscous Flow Equations," J. of Fluids Engineering, Vol. 115, No. 3, 1993, pp. 75-84.
[3] Sherman, L. L., Taylor III, A. C., Green, L. L., Newman, P. A., Hou, G. J., and Korivi, V. M., "First- and Second-Order Aerodynamic Sensitivity Derivatives via Automatic Differentiation with Incremental Iterative Methods," AIAA-94-4262-CP, Sep. 1994.
[4] Taylor III, A. C. and Oloso, A., "Aerodynamic Design Sensitivities By Automatic Differentiation," AIAA 98-2536, June 1998.
[5] Ajmani, K. and Taylor III, A. C., "Discrete Sensitivity Derivatives of the Navier-Stokes Equations with a Parallel Krylov Solver," AIAA 94-0091, Jan. 1994.
[6] Jameson, A., Pierce, N. A., and Martinelli, L., "Optimum Aerodynamic Design using the Navier-Stokes Equations," AIAA 97-0101, Jan. 1997.
[7] Mohammadi, B., "Optimal Shape Design, Reverse Mode of Automatic Differentiation and Turbulence," AIAA 97-0099, Jan. 1997.
[8] Malé, J. M., Mohammadi, N., and Schmidt, R., "Direct and Reverse Modes of Automatic Differentiation of Programs for Inverse Problems: Application to Optimum Shape Design," Proc. 2nd Int. SIAM Workshop on Computational Differentiation, Santa Fe, 1996.
[9] Hwang, S. W., "Numerical Analysis of Unsteady Supersonic Flow over Double Cavity," Ph.D. Thesis, Seoul National Univ., Seoul, Korea, 1996.
[10] Kim, H. J. and Rho, O. H., "Dual-Point Design of Transonic Airfoils using the Hybrid Inverse Optimization Method," J. of Aircraft, Vol. 34, No. 5, 1997, pp. 612-618.
[11] Giles, M. B. and Pierce, N. A., "Adjoint Equations in CFD: Duality, Boundary Conditions and Solution Behavior," AIAA 97-1850, June 1997.
Fig. 1 Direct differentiation sensitivity analysis: dp/dα contours, subsonic turbulence case (M∞ = 0.60, α = 2°, Re = 6,500,000)
Fig. 2 Direct differentiation sensitivity analysis: velocity sensitivity streamlines for α, subsonic turbulence case (M∞ = 0.60, α = 2°, Re = 6,500,000)
Fig. 3 Direct differentiation sensitivity analysis: pressure sensitivity contours for α, transonic turbulence case (M∞ = 0.75, α = 2°, Re = 6,500,000)
Fig. 4 Direct differentiation sensitivity analysis: velocity sensitivity streamlines for α, transonic turbulence case (M∞ = 0.75, α = 2°, Re = 6,500,000)
Fig. 5 Adjoint sensitivity analysis: λm1 contours and (λm2, λm3) streamlines, subsonic laminar case (M∞ = 0.60, α = 1°, Re = 5,000)
Fig. 6 Adjoint sensitivity analysis: λl1 contours, transonic turbulence case (M∞ = 0.75, α = 2°, Re = 6,500,000)