Optimal ρ acceleration parameter for the ADI iteration
for the real three-dimensional Helmholtz equation with
nonnegative ω

Sangback Ma

J. KSIAM Vol.3, No.2, 1-4, 1999
Abstract

The Helmholtz equation is very important in physics and engineering. However, the solution of the Helmholtz equation is in general known to be very difficult: if ω is negative, the FDM-discretized linear system becomes indefinite, and its solution by an iterative method requires a very clever preconditioner. In this paper we assume that ω is nonnegative, and determine the optimal ρ parameter for the three-dimensional ADI iteration for the Helmholtz equation. The ADI (Alternating Direction Implicit) method is also attracting new attention because it is very well suited to vector/parallel computers, for example as a preconditioner for the Krylov subspace methods. However, classical ADI was developed for two dimensions, and in three dimensions its convergence behaviour is known to be quite different from that in two dimensions. So far, in three dimensions the so-called Douglas-Rachford form of ADI has been developed. It is known to converge for a relatively wide range of ρ values, but its convergence is very slow. In this paper we determine the necessary conditions on the ρ parameter for convergence, and the optimal ρ, for the three-dimensional ADI iteration of the Peaceman-Rachford form for the real Helmholtz equation with nonnegative ω. We also conducted experiments which are in close agreement with our theory. This straightforward extension of Peaceman-Rachford ADI to three dimensions will be useful as an iterative solver itself or as a preconditioner for the Krylov subspace methods, such as the CG (Conjugate Gradient) method or GMRES(m).
1 Three-Dimensional Extension into the Helmholtz Equation
For three-dimensional Poisson problems Douglas [1] proposed a variant of the classical ADI which has smoother convergence behavior.

Algorithm 1.1 DO3-ADI (Douglas ADI)

(H + ρ_i I) u^{i+1/3} = −(H + 2V + 2W − ρ_i I) u^i + 2b    (1)
(V + ρ_i I) u^{i+2/3} = −(H + V + 2W − ρ_i I) u^i − H u^{i+1/3} + 2b
(W + ρ_i I) u^{i+1} = −(H + V + W − ρ_i I) u^i − H u^{i+1/3} − V u^{i+2/3} + 2b
This work was supported by the project for supporting leading trial schools in IT from the Ministry of Information and Communication.
Douglas [1] has proven that, in the case where the matrices H, V, and W all commute, the above iteration is convergent for fixed ρ > 0. Experiments demonstrate that DO3-ADI converges for a wide range of values of ρ, but that its convergence rate is very slow. Due to the slow convergence rate it has rarely been used as an iterative method.
So, writing

A = (H̃ + (ω/3)I + ρ_i I) + (A − H̃ − (ω/3)I − ρ_i I)
  = (Ṽ + (ω/3)I + ρ_i I) + (A − Ṽ − (ω/3)I − ρ_i I)
  = (W̃ + (ω/3)I + ρ_i I) + (A − W̃ − (ω/3)I − ρ_i I),

we obtain the following iteration.
Algorithm 1.2 Peaceman-Rachford ADI in three dimensions (PR3-ADI)

(H + ρ_i I) u^{i+1/3} = −(V + W − ρ_i I) u^i + b    (2)
(V + ρ_i I) u^{i+2/3} = −(H + W − ρ_i I) u^{i+1/3} + b
(W + ρ_i I) u^{i+1} = −(H + V − ρ_i I) u^{i+2/3} + b

where H = H̃ + (ω/3)I, V = Ṽ + (ω/3)I, and W = W̃ + (ω/3)I. The convergence behavior of this algorithm is quite different from that of PR2-ADI in two dimensions. Assume that H, V, and W are pairwise commutative, and that

a ≤ σ(H), σ(V), σ(W) ≤ b,

where σ(M) denotes the spectrum of the matrix M.
Then, a and b are known to be

a = 4 sin²(π/(2(n+1))) + ω/3,    b = 4 sin²(nπ/(2(n+1))) + ω/3,

where N = n³.
Let T_ρ be the operator associated with PR3-ADI. Then

T_ρ = (W + ρI)^{-1}(H + V − ρI)(V + ρI)^{-1}(H + W − ρI)(H + ρI)^{-1}(V + W − ρI).    (3)

Since the given equation is separable, HV = VH, HW = WH, and VW = WV, and H, V, and W share a common set of eigenvectors. Let v be any such eigenvector, with

Hv = λv,  Vv = μv,  Wv = νv.

Then

T_ρ v = [(λ + μ − ρ)(μ + ν − ρ)(λ + ν − ρ)] / [(λ + ρ)(μ + ρ)(ν + ρ)] v,    (4)

and the spectral radius of T_ρ is given by

Sp(T_ρ) = max_{a≤λ,μ,ν≤b} |(λ + μ − ρ)(μ + ν − ρ)(λ + ν − ρ)| / [(λ + ρ)(μ + ρ)(ν + ρ)].    (5)
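The eigenvalue formula above can be checked numerically. The sketch below is a hypothetical setup, assuming H, V, and W are built from the standard 1-D finite-difference Laplacian tridiag(−1, 2, −1) via Kronecker products, each absorbing ω/3; it compares the spectral radius of T_ρ computed directly with the maximum of the scalar expression in (5) over the known 1-D eigenvalues:

```python
import numpy as np

def laplacian_1d(n):
    # tridiag(-1, 2, -1); its eigenvalues are 4*sin^2(k*pi/(2*(n+1))), k = 1..n
    return (np.diag(2.0 * np.ones(n))
            - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1))

n, omega, rho = 4, 1.0, 3.0          # illustrative sizes; N = n^3 = 64
I1, N = np.eye(n), n**3
H = np.kron(np.kron(laplacian_1d(n), I1), I1) + (omega / 3.0) * np.eye(N)
V = np.kron(np.kron(I1, laplacian_1d(n)), I1) + (omega / 3.0) * np.eye(N)
W = np.kron(np.kron(I1, I1), laplacian_1d(n)) + (omega / 3.0) * np.eye(N)

E = np.eye(N)
T_rho = (np.linalg.inv(W + rho * E) @ (H + V - rho * E)
         @ np.linalg.inv(V + rho * E) @ (H + W - rho * E)
         @ np.linalg.inv(H + rho * E) @ (V + W - rho * E))

# scalar prediction (5), maximized over the known 1-D eigenvalues
eigs = 4.0 * np.sin(np.arange(1, n + 1) * np.pi / (2 * (n + 1)))**2 + omega / 3.0
predicted = max(abs((l + m - rho) * (m + nu - rho) * (l + nu - rho)
                    / ((l + rho) * (m + rho) * (nu + rho)))
                for l in eigs for m in eigs for nu in eigs)
actual = max(abs(np.linalg.eigvals(T_rho)))
print(predicted, actual)   # the two agree, and both are < 1 since rho > b/2
```

Because H, V, and W here are simultaneously diagonalizable, every eigenvalue of T_ρ is one of the scalar ratios, which is exactly what the printout confirms.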
Now we look for ρ such that (5) becomes smaller than 1. To this end we introduce several functions. Let

φ_1(ρ) = max_{a≤λ,μ,ν≤b} |(μ + ν − ρ)/(λ + ρ)|,    (6)
φ_2(ρ) = max_{a≤λ,μ,ν≤b} |(λ + ν − ρ)/(μ + ρ)|,    (7)
φ_3(ρ) = max_{a≤λ,μ,ν≤b} |(λ + μ − ρ)/(ν + ρ)|,    (8)

and

ψ_1(ρ) = max_{a≤λ≤b} |(2λ − ρ)/(λ + ρ)|,    (9)
ψ_2(ρ) = max_{a≤μ≤b} |(2μ − ρ)/(μ + ρ)|,    (10)
ψ_3(ρ) = max_{a≤ν≤b} |(2ν − ρ)/(ν + ρ)|.    (11)
Theorem 1.1 Assume that ρ > b/2. Then

φ_1(ρ) ≤ ψ_1(ρ).

Corollary 1.1 With the same hypotheses as in Theorem 1.1, the necessary and sufficient condition for the PR3-ADI iteration to be convergent is ρ > b/2.

Proof. Sp(T_ρ) ≤ φ_1(ρ)φ_2(ρ)φ_3(ρ) = φ_1(ρ)³ ≤ ψ_1(ρ)³, and ψ_1(ρ) < 1 exactly when ρ > b/2; hence if ρ > b/2 then Sp(T_ρ) < 1. □
Theorem 1.2 The ρ minimizing Sp(T_ρ) is given by ρ = ρ*, where

ρ* = [a + b + √((a + b)² + 32ab)] / 4.

Proof. Note that ψ_1(ρ) = max( (2b − ρ)/(b + ρ), (ρ − 2a)/(a + ρ) ). We have

∂ψ_1/∂ρ = −3b/(b + ρ)² < 0,  ρ < ρ*,
∂ψ_1/∂ρ = 3a/(a + ρ)² > 0,  ρ > ρ*,

since ρ* is exactly the point where the two branches balance. So the minimum is obtained when ρ = ρ*. □
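As a quick numerical check of Theorem 1.2, the sketch below computes ρ* from the formula, with a and b taken from the model-problem expressions above (n and ω here are illustrative choices), and compares it against a brute-force minimization of ψ_1 over a grid:

```python
import numpy as np

def psi1(rho, a, b):
    # psi_1(rho) = max over lam in [a, b] of |2*lam - rho| / (lam + rho);
    # the ratio is V-shaped in lam, so the maximum sits at an endpoint
    return max(abs(2 * a - rho) / (a + rho), abs(2 * b - rho) / (b + rho))

n, omega = 48, 1.0                      # illustrative model-problem sizes
a = 4 * np.sin(np.pi / (2 * (n + 1)))**2 + omega / 3
b = 4 * np.sin(n * np.pi / (2 * (n + 1)))**2 + omega / 3

rho_star = (a + b + np.sqrt((a + b)**2 + 32 * a * b)) / 4.0

# brute-force check: a grid search over (b/2, 2b) lands on the same minimizer
grid = np.linspace(b / 2 + 1e-6, 2 * b, 4001)
rho_grid = grid[np.argmin([psi1(r, a, b) for r in grid])]
print(rho_star, rho_grid)
```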
2 Experiments
Table 1 shows that Douglas-Rachford ADI is indeed always convergent for any positive ρ, while its convergence is very slow. Also, the minimum number of iterations for PR3-ADI seems to occur around ρ = 2.0, which is close to our theoretical ρ*.
ρ      0.6  0.8  1.0  1.2  1.4  1.6  1.8  2.0  2.2  2.4  2.6  2.8  3.0
PR3     SL   SL   SL   SL   SL   SL   SL   87   58   62   63   80   61
DO3    145  140  133  134  134  128  122  154  170  174  176  183  195

Table 1: Poisson problem with N = 48 x 48 x 48; iterations of CG-ADI with constant ρ until ||r_k||/||r_0|| ≤ 10^{-6}.
3 Conclusion

In three dimensions, the optimal ρ parameter for the stationary ADI iteration was determined for the real Helmholtz equation with nonnegative ω. We believe that, for this specific Helmholtz equation, the straightforward extension of Peaceman-Rachford ADI with a properly predetermined ρ might converge faster than the Douglas-Rachford ADI for three dimensions. We also believe that, as a preconditioner for the Krylov subspace methods, such as CG (Conjugate Gradient) or GMRES(m), our result might turn out to be useful.
References
[1] J. Douglas, "Alternating direction methods for three space variables", Numerische Mathematik, Vol. 4, pp. 41-63, 1962.
Hanyang University,
Computer Science Department,
Kyungki-Do, Korea
A PROJECTION ALGORITHM FOR SYMMETRIC
EIGENVALUE PROBLEMS
PIL SEONG PARK
J. KSIAM Vol.3, No.2, 5-16, 1999
Abstract

We introduce a new projector for accelerating the convergence of a symmetric eigenvalue problem Ax = x, and devise a power/Lanczos hybrid algorithm. Acceleration can be achieved by removing the hard-to-annihilate nonsolution eigencomponents corresponding to the widespread eigenvalues with modulus close to 1, estimating them accurately using the Lanczos method. The additional Lanczos results can be obtained without expensive matrix-vector multiplications, at only a very small amount of extra work, by utilizing the simple power-Lanczos interconversion algorithms suggested. Numerical experiments are given at the end.
1. Introduction
Numerical models often yield eigenvalue problems Ax = λx for finding the dominant eigenvector x corresponding to the eigenvalue λ with the largest modulus. In many cases, the dominant eigenvalue is known in advance. One such example is the queuing problem Qx = 0 described in [2], which can be converted to an eigenvalue problem Ax = x for finding the dominant eigenvector corresponding to the eigenvalue 1.

It is well known that, if the moduli of some eigenvalues of A are nearly equal to that of the dominant one we look for, usual algorithms such as the power method or its variants like the Chebyshev iteration do not work well, since the convergence depends on the modulus ratio of the second-largest eigenvalue to the dominant one. To improve the convergence in such cases, an orthogonal projector was proposed in [3], under the assumption that these unwanted eigenvalues are clustered closely together. However, that projector sometimes may not work well if the unwanted major eigenvalues are well separated.

In this paper, we introduce a better orthogonal projector to deal with such cases, and devise a new power/Lanczos hybrid algorithm for symmetric eigenvalue problems. Numerical results of the algorithm in various cases are given at the end.

Throughout this paper, we deal with an n × n real symmetric eigenvalue problem Ax = x, i.e., we look for the eigenvector corresponding to the dominant eigenvalue 1. If the dominant eigenvalue, theoretically known in advance, is different from 1, we can scale the problem to satisfy this condition.
Key words: projector, Lanczos method, the power method, power-Lanczos interconversion, symmetric eigenvalue problems, Krylov subspace
2. Some background

We first review the method introduced in [3] and look at the Lanczos method for further development.

Definition 1 Let (λ_j, z_j) be the jth eigenpair of a given matrix A, numbered in decreasing order of eigenvalue modulus, and let z_1 be the eigenvector we look for, corresponding to λ_1 = 1.

Definition 2 For any vector x, we define the residual of x by r = (A − I)x.

The "unnormalized" power method x_{i+1} := Ax_i is one of the main driving forces. The residual is then just the difference between two consecutive power iterates, and can be computed without an extra matrix-vector multiplication. The solution component z_1 in the power iterates does not change, and the residuals contain nonsolution components only, but not z_1. Hence the convergence of an iterate can be estimated by normalizing it and then computing the norm of its residual.
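A minimal sketch of this residual-based monitoring, on a hypothetical symmetric test matrix constructed to have dominant eigenvalue 1 (the matrix, sizes, and seed here are illustrative, not the paper's test problems):

```python
import numpy as np

# hypothetical symmetric test matrix with known spectrum: dominant
# eigenvalue 1 (eigenvector z1), all other eigenvalues well below 1 in modulus
n = 50
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
lams = np.concatenate(([1.0], rng.uniform(-0.8, 0.8, n - 1)))
A = Q @ np.diag(lams) @ Q.T
z1 = Q[:, 0]

x = np.ones(n)
for _ in range(200):
    x_next = A @ x            # unnormalized power iterate
    r = x_next - x            # residual r = (A - I)x: a free by-product
    x = x_next

# convergence estimate: normalize the iterate, then measure its residual
xn = x / np.linalg.norm(x)
res = np.linalg.norm(A @ xn - xn)
print(res)
```

The residual `r` costs no extra matrix-vector product, exactly as noted above, and its norm shrinks at the rate of the second-largest eigenvalue modulus.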
Let x = Σ_{j=1}^n α_j z_j be the initial vector for the unnormalized power iteration. After enough (say, e) power iterations, let x_1 be the first power iterate of our concern and r_1 be its residual. Then

x_1 = A^e x = Σ_{j=1}^n α_j A^e z_j = Σ_{j=1}^n α_j λ_j^e z_j.

Assume that the nonsolution components in x_1 mainly consist of k major components (say, major nonsolution components) z_2, z_3, …, z_{k+1}, and that the others (say, minor nonsolution components) are negligibly small, i.e., 1 ≥ |α_j λ_j^e| ≫ |α_i λ_i^e|, j = 2, …, k+1, i = k+2, …, n. If we could remove most of these k major components, we would get very fast convergence.
Note that, after enough power iterations, the residuals will be rich in a few major nonsolution components, depending on the eigenstructure of A. Hence the residuals are used to approximate the subspace spanned by these major nonsolution components, and an orthogonal projector was designed to reduce them in the power iterates, as described below.

We apply k+1 more power iterations to obtain x_{i+1} = A^i x_1, i = 1, 2, …, k+1, and compute the residuals r_i ≡ x_{i+1} − x_i, i = 1, 2, …, k+1. By applying the Gram-Schmidt process to the k residuals r_1, …, r_k, we form the matrix V ∈ R^{n×k} whose columns are orthonormal and R(V) = span{r_1, …, r_k}, which may be close to span{z_2, …, z_{k+1}}. By subtracting the nonsolution components of r_{k+1} projected onto R(V), most of the major nonsolution components in the iterate x_{k+1} can be removed effectively by the projection step

x_new = x_{k+1} + 1/(1 − σ) V Vᵀ r_{k+1},    (1)

where σ is an approximation to the eigenvalues corresponding to the major eigencomponents z_2, …, z_{k+1} we try to remove. Note that x_{k+2} is needed only for the computation of r_{k+1} and is not used any further.
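The projection step (1) can be sketched as follows, on a hypothetical matrix with a cluster of k unwanted eigenvalues near a representative value σ = 0.95 (the spectrum, sizes, and iteration counts are illustrative assumptions); QR factorization stands in for classical Gram-Schmidt:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 40, 3
Qm, _ = np.linalg.qr(rng.standard_normal((n, n)))
# hypothetical spectrum: dominant eigenvalue 1, a cluster of k unwanted
# eigenvalues near 0.95, and the rest small in modulus
lams = np.concatenate(([1.0], 0.95 + 0.001 * rng.standard_normal(k),
                       rng.uniform(-0.3, 0.3, n - k - 1)))
A = Qm @ np.diag(lams) @ Qm.T

x = np.ones(n)
for _ in range(31):                 # enough power iterations that residuals
    x = A @ x                       # are rich in the clustered components

xs = [x]                            # x_1, then x_2, ..., x_{k+2}
for _ in range(k + 1):
    xs.append(A @ xs[-1])
rs = [xs[i + 1] - xs[i] for i in range(k + 1)]   # r_1, ..., r_{k+1}

V, _ = np.linalg.qr(np.column_stack(rs[:k]))     # orthonormal, spans r_1..r_k
sigma = 0.95                                      # representative eigenvalue
x_old = xs[k]                                     # this is x_{k+1}
x_new = x_old + V @ (V.T @ rs[k]) / (1 - sigma)  # projection step (1)

before = np.linalg.norm(A @ x_old - x_old) / np.linalg.norm(x_old)
after = np.linalg.norm(A @ x_new - x_new) / np.linalg.norm(x_new)
print(before, after)
```

Each clustered component ends up multiplied by (λ_j − σ)/(1 − σ), so the relative residual drops sharply when the cluster really does sit near σ.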
In an ideal case, when R(V) = span{z_2, z_3, …, z_{k+1}}, we have

Theorem 1 Let V ∈ R^{n×k} be the matrix with orthonormal columns such that

R(V) = span{r_1, r_2, …, r_k} = span{z_2, z_3, …, z_{k+1}}.

Let the current iterate be x_{k+1} = Σ_{j=1}^n α_j z_j, where |α_1| = O(1), |α_i| ≤ ε_1 for i = 2, …, k+1, |α_j| ≤ ε_2 for j = k+2, …, n, and ε_2 ≪ ε_1 ≪ 1. If σ is chosen so that |(λ_j − σ)/(1 − σ)| ≤ δ for some 0 < δ ≪ 1, j = 2, …, k+1, then the remaining major nonsolution components z_2, …, z_{k+1} after the projection step (1) are at most O(ε_k), where ε_k = max(kδε_1, tε_2) and t = (n − k − 1)(1 + 2/(1 − σ)).
The Lanczos method

Given a matrix A ∈ R^{n×n} and a set {q_1, …, q_k} of k linearly independent vectors (k ≤ n), the projection method on span{q_1, …, q_k} tries to approximate an eigenpair (λ, z) of the matrix A by a pair (λ^(k), z^(k)) satisfying [5]

z^(k) ∈ span{q_1, …, q_k},
(A − λ^(k) I) z^(k) ⊥ q_j,  j = 1, 2, …, k.

The solutions λ^(k) are called Ritz values on the subspace span{q_1, …, q_k}, and to each Ritz value is associated a Ritz vector z^(k) [7].

Let Q_k = [q_1, …, q_k]. Writing z^(k) = Q_k s^(k), we see that (λ^(k), s^(k)) are eigenpairs of the problem

(T_k − λ^(k) B_k) s^(k) = 0,

or equivalently

(B_k^{-1} T_k − λ^(k) I) s^(k) = 0,

where T_k = Q_kᵀ A Q_k and B_k = Q_kᵀ Q_k.

In usual applications, we choose an orthonormal system Q_k, so that B_k reduces to the identity matrix. One such process is the symmetric Lanczos method. It uses the orthonormal system obtained by orthogonalization of the Krylov vectors q_1, Aq_1, …, A^{k−1} q_1, where q_1 is a starting vector. Then the matrix T_k becomes tridiagonal, with diagonal entries α_1, …, α_k and sub/superdiagonal entries β_1, …, β_{k−1}, and its entries are easily obtainable from the three-term recurrence relation [7]

A q_j = β_{j−1} q_{j−1} + α_j q_j + β_j q_{j+1},  j = 1, …, n−1,
β_0 q_0 ≡ 0.
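A minimal sketch of the symmetric Lanczos process described above, with full reorthogonalization added for numerical robustness (the exact-arithmetic recurrence does not need it); matrix and sizes are illustrative:

```python
import numpy as np

def lanczos(A, q1, k):
    """Symmetric Lanczos: returns Q (n x k, orthonormal columns) and the
    tridiagonal T_k built from the three-term recurrence
    A q_j = beta_{j-1} q_{j-1} + alpha_j q_j + beta_j q_{j+1}."""
    n = len(q1)
    Q = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(k - 1)
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = w @ Q[:, j]
        # full reorthogonalization against all previous Lanczos vectors
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    return Q, np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

rng = np.random.default_rng(2)
B = rng.standard_normal((60, 60))
A = (B + B.T) / 2
Q, Tk = lanczos(A, rng.standard_normal(60), 8)
# extreme eigenvalues of T_k (Ritz values) approximate the extremes of A
print(np.linalg.eigvalsh(Tk)[[0, -1]], np.linalg.eigvalsh(A)[[0, -1]])
```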
3. A new projector
The reduction/magnification factor of the jth eigencomponent z_j under the projection step (1) of the previous method is |λ_j − σ|/|1 − σ|. One major drawback of the previous algorithm is that nonsolution components can grow if this ratio is larger than 1, especially if the eigenvalue λ_j is far away from the point (σ, 0) in the complex plane.

The parameter σ in the old projector (1) is a representative value for the k eigenvalues that correspond to the major nonsolution components, which are different in general. Moreover, if these eigenvalues are well separated, a projection step using one parameter can be a disaster.
To remedy such phenomena, we modify the old projector into a multi-parametered one. That is, we replace the old orthogonal projector (1) by

x_new = x_{k+1} + V Λ Vᵀ r_{k+1},    (2)

where the columns of V ∈ R^{n×k} are orthonormal, R(V) = span{r_2, r_3, …, r_{k+1}}, and

Λ = diag( 1/(1 − θ_2), …, 1/(1 − θ_{k+1}) ),

where the θ_j's are hopefully close to the λ_j's.
As before, after enough power iterations, assume that the current residual r_{k+1} mainly consists of the k major nonsolution components z_2, z_3, …, z_{k+1}, and that the rest are much smaller. Now consider the new projection step (2). In an ideal case, when the columns of V are exactly z_2, …, z_{k+1} and θ_j = λ_j, j = 2, …, k+1, we have

Theorem 2 Let V ∈ R^{n×k} be the matrix with orthonormal columns such that V = [z_2, z_3, …, z_{k+1}], and let θ_j = λ_j, j = 2, 3, …, k+1. After enough power iterations, assume that the current iterate can be written as x_{k+1} = Σ_{j=1}^n α_j z_j, where |α_1| = O(1), |α_i| ≤ ε_1 for i = 2, …, k+1, |α_j| ≤ ε_2 for j = k+2, …, n, and ε_2 ≪ ε_1 ≪ 1. Then, after applying the projection step (2), the sum of the remaining nonsolution components is at most of O(nε_2).
Proof. Let V = [z_2, …, z_{k+1}] and W = [z_{k+2}, …, z_n]. Then both V and W have orthonormal columns since A is symmetric. We can write the current iterate x_{k+1} and its residual r_{k+1} in matrix form as x_{k+1} = α_1 z_1 + Vα_2 + Wα_3 and r_{k+1} = (A − I)x_{k+1} = Σ_{j=2}^n α_j(λ_j − 1)z_j = Vβ_2 + Wβ_3, where α_2 = [α_2, …, α_{k+1}]ᵀ, α_3 = [α_{k+2}, …, α_n]ᵀ, β_2 = [α_2(λ_2 − 1), …, α_{k+1}(λ_{k+1} − 1)]ᵀ, and β_3 = [α_{k+2}(λ_{k+2} − 1), …, α_n(λ_n − 1)]ᵀ. Hence after the projection step, we obtain

x_new = x_{k+1} + V Λ Vᵀ r_{k+1}
      = α_1 z_1 + Vα_2 + Wα_3 + V Λ Vᵀ (Vβ_2 + Wβ_3)
      = α_1 z_1 + Vα_2 + Wα_3 + V Λ β_2
      = α_1 z_1 + Wα_3,

since VᵀV = I, VᵀWβ_3 = 0, and Λβ_2 = −α_2. Hence the size of the sum of the nonsolution components in x_new is

||x_new − α_1 z_1|| = || Σ_{j=k+2}^n α_j z_j || ≤ nε_2.
Note that we need to compute the exact eigenpairs (λ_j, z_j), j = 2, …, k+1, for better convergence. The Lanczos method, which is known to compute a few extreme eigenpairs very accurately, is a good candidate for this purpose.

Hence, in our new algorithm, we make use of the Ritz pairs (e.g., see [1]). That is, by applying the Lanczos method to the residual r_1, we compute k Lanczos vectors q_1, …, q_k ∈ Rⁿ and a k × k tridiagonal matrix T_k. Through eigenanalysis of T_k, we compute its eigenpairs (θ_j, s_j), j = 2, …, k+1 (we intentionally number the eigenvalues in this way to conform to later development). It is well known that a few of the θ_j's are good approximations to the extreme eigenvalues λ_2, λ_3, … of A (since the Lanczos method is applied to the residual r_1), and some of the Q_k s_j's, where Q_k = [q_1, …, q_k], are good approximations to the eigenvectors z_2, z_3, … of A.

Hence we select only c eigenpairs (θ_j, s_j), j = 2, …, c+1 (c < k, those that are believed to be accurate) out of them, and obtain Λ ∈ R^{c×c} and c approximations y_j = Q_k s_j, j = 2, …, c+1, to the eigenvectors z_2, …, z_{c+1} of A. Then take V = [y_2, …, y_{c+1}] ∈ R^{n×c}.
4. Interconversion between the power iterates and the Lanczos results
It might seem, according to the previous explanation, that we need to apply the Lanczos method independently of the power method, which would require a lot of extra costly matrix-vector multiplications. However, we can avoid this by carefully considering the relation among the power iterates, their residuals, and the Lanczos vectors computed from the residuals.

Such interconversion is possible because the power iterates, their residuals, and the Lanczos method all make use of Krylov subspaces, and the residual has been defined so that the interconversion works.

Lemma 1 Consider the power iteration x_{i+1} := Ax_i for an eigenvalue problem Ax = x with an initial vector x_1. Then not only the power iterates x_1, x_2, … but also their residuals r_1, r_2, … form Krylov subspaces.

Proof. The facts are clear from the definition of the Krylov subspace and Definition 2.

This means that the residual sequence r_1, r_2, … can be thought of as another sequence of power iterates, generated by r_{i+1} := Ar_i. Power iterates can be constructed from the residual sequence and the initial iterate x_1 by

Lemma 2 Let x_1, x_2, … be power iterates. Then x_{j+1} = x_1 + Σ_{i=1}^j r_i.
Getting the Lanczos tridiagonal matrix from power iterates

Since we apply the Lanczos method to the residual r_1, we have

Theorem 3 The Lanczos tridiagonal matrix T_k can be obtained from the residual iterates r_1, r_2, …, r_k with one additional matrix-vector multiplication and some inner products.

Proof. It is just a symmetric version of Theorem 2 in [4], with power iterates replaced by the residual iterates.

Corollary 1 The Lanczos tridiagonal matrix T_k can be constructed from the power iterates x_1, x_2, …, x_{k+2}.
Getting power iterates from the Lanczos process

Conversely, we can obtain power iterates from the Lanczos results.

Theorem 4 The residual sequence r_2, …, r_k can be computed from the Lanczos results by

r_j = A^{j−1} r_1 = ||r_1|| V_k T_k^{j−2} t_1,

where V_k ∈ R^{n×k}, T_k ∈ R^{k×k}, and t_1 is the first column of the Lanczos tridiagonal matrix T_k.

Proof. Consider the full Lanczos relation

A V = V T,    (3)

where A, V, T ∈ R^{n×n}. Let V = [v_1, v_2, …, v_n]. The first column of AV can be written as

A v_1 = V t_1,    (4)

where t_1 is the first column of T. Premultiplying (4) by A and using (3),

A² v_1 = A V t_1 = V T t_1.

In general, we obtain

A^j v_1 = V T^{j−1} t_1.

We only need the first two columns of T^{j−1} to compute A^j v_1, since T is tridiagonal and only the first two entries of t_1 are nonzero. Hence we need only the first k columns of V and the k × k principal submatrix of T, which is T_k. Since we apply the Lanczos method to the residual r_1 (i.e., the starting vector is v_1 = r_1/||r_1||), the result is immediate.
Corollary 2 The power iterates x_2, …, x_{k+1} can be obtained from the Lanczos matrix T_k and the Lanczos vectors.
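Theorem 4 is easy to verify numerically. The sketch below uses a small Lanczos helper (a hypothetical implementation with full reorthogonalization; matrix and sizes are illustrative) and checks that A^{j−1} r_1 agrees with ||r_1|| Q_k T_k^{j−2} t_1 for j = 2, …, k:

```python
import numpy as np

def lanczos(A, q1, k):
    # symmetric Lanczos with full reorthogonalization (hypothetical helper)
    n = len(q1)
    Q = np.zeros((n, k)); alpha = np.zeros(k); beta = np.zeros(k - 1)
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = w @ Q[:, j]
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    return Q, np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

rng = np.random.default_rng(3)
n, k = 60, 10
B = rng.standard_normal((n, n))
A = (B + B.T) / 2
A /= np.max(np.abs(np.linalg.eigvalsh(A)))   # scale so |eigenvalues| <= 1

r1 = rng.standard_normal(n)
Qk, Tk = lanczos(A, r1, k)
t1 = Tk[:, 0]

# r_j = A^{j-1} r_1 should equal ||r_1|| * Q_k T_k^{j-2} t_1 for j = 2..k
ok = True
for j in range(2, k + 1):
    direct = np.linalg.matrix_power(A, j - 1) @ r1
    recon = np.linalg.norm(r1) * (Qk @ (np.linalg.matrix_power(Tk, j - 2) @ t1))
    ok = ok and np.allclose(direct, recon)
print(ok)
```

The truncation to k Lanczos vectors is exact here because T_k^{j−2} t_1 never picks up entries beyond position k for j ≤ k.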
5. A power/Lanczos hybrid algorithm
Our algorithm consists of one or more hybrid steps, each of which consists of some power iterations (so that the residuals mainly contain some major nonsolution components only), Lanczos steps (to estimate the major nonsolution eigenpairs), and a projection step (to remove those major nonsolution components).

More precisely, at the beginning of each hybrid iteration, we apply some (say, s) power iterations to the most recent iterate x_new and obtain x_1. Then we compute the residual r_1 = x_2 − x_1, where x_2 = Ax_1. At this point, we can use either of the following two methods:
• Continue applying power iteration to generate x_3, …, x_{k+2} by x_{i+1} := Ax_i, i = 2, 3, …, k+1, then convert them to create the Lanczos matrix and vectors by Theorem 3 and Corollary 1; or

• Stop power iteration, and apply the Lanczos method to r_1 to get the Lanczos matrix T_k and the Lanczos vectors. Then the residual r_k (and hence the power iterates x_k and x_{k+1} too) can be obtained by Theorem 4 and Corollary 2.
In any case, the Lanczos matrix T_k and the Lanczos vectors are used to approximate the major nonsolution eigencomponents (λ_j, z_j), j = 2, …, k+1, as explained previously. However, since only some (say, c, where c < k) of them are accurate, we choose only those c eigenpairs, project the most recent residual r_{k+1} onto the subspace spanned by the c eigenpairs, and subtract the components from x_{k+1}.

Based on the latter method, which looks simpler, we suggest the following power/Lanczos hybrid algorithm (m, s, k, c), where s is the number of power iterations to be applied before the Lanczos step, k is the size of the Lanczos tridiagonal matrix to be constructed, and c is the number of dimensions onto which a projection step is applied (i.e., the number of eigenpairs that are assumed to be accurate). Note that we may allow m extra power iterations on the initial guess x_init before the first hybrid iteration begins (outside of the hybrid loop).
Algorithm 1: Power/Lanczos hybrid algorithm (m, s, k, c)

Given a matrix A ∈ R^{n×n}, assume c < k ≤ n.

1. Take an initial guess x_init.
2. Apply m power iterations to x_init.
3. For i = 1, 2, …, until convergence, do
   1) Apply s power iterations to the most recent iterate to obtain x_1. Then compute r_1 = x_2 − x_1, where x_2 = Ax_1.
   2) Apply the Lanczos method to r_1 to obtain a tridiagonal matrix T_k and k Lanczos vectors q_1, …, q_k.
   3) Compute the eigenpairs (θ_j, s_j), j = 2, …, k+1, of T_k, and select the major c pairs out of them.
   4) Form Λ ∈ R^{c×c} and V = [y_2, …, y_{c+1}] ∈ R^{n×c}, where y_j = Q_k s_j, j = 2, …, c+1, and Q_k = [q_1, …, q_k].
   5) Use the Lanczos results to form x_{k+1} and r_{k+1}.
   6) Perform a projection step to obtain a new iterate x_new = x_{k+1} + V Λ Vᵀ r_{k+1}, and normalize x_new.
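A compact sketch of Algorithm 1 follows. It uses the simple variant that forms x_{k+1} and r_{k+1} by k extra power steps rather than by Theorem 4; the test matrix, spectrum, seed, and parameter values are illustrative assumptions, not the paper's queuing problems:

```python
import numpy as np

def lanczos(A, r1, k):
    # symmetric Lanczos with full reorthogonalization
    n = len(r1)
    Q = np.zeros((n, k)); alpha = np.zeros(k); beta = np.zeros(k - 1)
    Q[:, 0] = r1 / np.linalg.norm(r1)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = w @ Q[:, j]
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    return Q, np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

def hybrid(A, x, m=10, s=5, k=5, c=3, iters=6):
    # power/Lanczos hybrid(m, s, k, c), following Algorithm 1
    for _ in range(m):                    # step 2: initial power iterations
        x = A @ x
    for _ in range(iters):                # step 3
        for _ in range(s):                # 1) s power iterations -> x_1
            x = A @ x
        r1 = A @ x - x                    #    r_1 = x_2 - x_1
        Q, Tk = lanczos(A, r1, k)         # 2) Lanczos applied to r_1
        theta, S = np.linalg.eigh(Tk)     # 3) eigenpairs of T_k ...
        idx = np.argsort(-np.abs(theta))[:c]
        theta, S = theta[idx], S[:, idx]  #    ... keep the c dominant pairs
        V = Q @ S                         # 4) Ritz vectors approximating z_j
        for _ in range(k):                # 5) form x_{k+1}, r_{k+1} by extra
            x = A @ x                     #    power steps (simple variant)
        r = A @ x - x
        x = x + V @ ((V.T @ r) / (1.0 - theta))   # 6) projection step (2)
        x = x / np.linalg.norm(x)
    return x

# illustrative symmetric test problem: dominant eigenvalue 1 plus a few
# well-separated eigenvalues of large modulus, loosely mimicking Problem 1
rng = np.random.default_rng(4)
n = 100
Q0, _ = np.linalg.qr(rng.standard_normal((n, n)))
lams = np.concatenate(([1.0, -0.98, 0.97, -0.956],
                       rng.uniform(-0.8, 0.8, n - 4)))
A = Q0 @ np.diag(lams) @ Q0.T

x = hybrid(A, np.ones(n))
print(np.linalg.norm(A @ x - x))   # small residual after a few hybrid steps
```

Note the elementwise division by (1 − θ_j), which is exactly the multi-parameter Λ of the new projector.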
6. Numerical experiments and discussion
We created the two relatively hard sample problems Ax = x as follows: for each problem, we created a two-queue overflow queuing problem Qx = 0 with parameter quadruples (s_j, w_j, i_j, o_j), where s_j is the number of servers, w_j is the number of wait spaces, i_j is the mean arrival rate, and o_j is the mean departure rate in the jth queue [2]. We convert this into the corresponding eigenvalue problem Gx = x by Jacobi splitting (e.g., see [1]). Since G is 2-cyclic (hence the power iteration does not converge as is; e.g., see [6]), we slightly shift it by B = (G + 0.01I)/1.01 so that the resulting matrix is acyclic. From this, we get an eigenvalue problem Ax = x, where A is obtained by symmetrizing B as (B + Bᵀ)/2 and scaling the resulting matrix by its spectral radius so that λ_1 = 1 is a simple eigenvalue. The eigenvalues, other than 1, of each of the matrices are well separated, and many of them have relatively large modulus close to 1.
Problem 1 The parameter quadruples used are (5, 3, 10, 6) and (5, 3, 3, 1) in each queue (j = 1, 2), respectively. Hence the matrix size is 81. The eigenvalues other than 1 are, in decreasing order of magnitude, -0.98144, 0.97465, -0.95610, 0.94286, -0.92430, 0.89565, -0.87709, 0.86261, 0.84435, -0.84405, 0.83755, -0.82580, -0.81900, 0.80051, and the rest are between -0.8 and 0.8.
Problem 2 The parameter quadruples used are (7, 12, 33, 7) and (4, 15, 22, 5) in each queue (j = 1, 2), respectively. Hence the matrix size is 400. The eigenvalues with modulus greater than 0.95 are 0.99608, -0.98048, 0.98004, 0.97843, -0.97655, 0.97326, -0.96051, -0.95891, 0.95794, -0.95374, and 0.95273. There are 14 eigenvalues whose moduli are between 0.90 and 0.95, and 14 more between 0.85 and 0.90, the others being between -0.85 and 0.85.
As an initial guess x_init, we use a vector of all ones scaled by its 2-norm. One projection step needs one matrix-vector multiplication and some extra calculation for the eigenanalysis of the Lanczos tridiagonal matrix T_k of size k, etc. However, since k ≪ n, we will simply count the work for one projection step as one matrix-vector multiplication, ignoring any extra calculation (according to actual timing results, this assumption seems to be acceptable).
Table 1 shows the number of matrix-vector multiplications used to reduce the norm of the residual to certain sizes (10^{-3}, 10^{-7}, and 10^{-10}) for Problem 1, by various methods: the pure power method, the shifted power method with best shift 0.006223529 (computed by eigenanalysis of A), the old projection method (denoted by "Old") in [3], and the new hybrid algorithm with various parameters.
In general, regardless of the choice of parameters, the new hybrid algorithm worked well, reducing the amount of work to nearly 1/5 of that of the underlying power method and less than 1/2 of that of the old method. It has also been observed that, for fixed values of m, s, k, different choices of c do not make much difference. However, taking c = k may sometimes give a slightly worse result, since not all of the Lanczos pairs are accurate.
Fig. 1 shows a typical residual reduction pattern of the hybrid algorithm, together with those of the pure power method and the old projection method. We denote a specific algorithm by "algorithm(m, s, k, c)", where "algorithm" is either "old" (for the old projection method) or "hybrid" (for the new hybrid method), and m, s, k, c are the parameters as described previously. Note that the rapid residual reduction after each projection is clearly visible. Comparing hybrid(10,10,10,5) and hybrid(0,10,10,5), we may conclude that it makes little difference whether we start a projection step slightly earlier or not, because the Lanczos method gives a very good approximation.

Table 1: The number of matrix-vector multiplications required to reduce the residual to certain sizes by various methods.

Algorithm                         m   s   k   c   10^-3  10^-7  10^-10
Normal power iteration            -   -   -   -     89    555     924
Power iteration with best shift   -   -   -   -     82    441     731
Old                              10   5   5   -     54    185     312
Hybrid                           10   5   5   2     38    109     160
Hybrid                           10   5   5   3     36    109     160
Hybrid                           10   5   5   4     32    113     165
Hybrid                           10   5   5   5     32    111     154
Old                              50   5   5   -     81    241     359
Hybrid                           50   5   5   2     61    131     187
Hybrid                           50   5   5   3     61    116     168
Hybrid                           50   5   5   4     61    111     172
Hybrid                           50   5   5   5     61    116     163
Old                              10  10  10   -     82    241     368
Hybrid                           10  10  10   3     31    149     200
Hybrid                           10  10  10   5     31     94     139
Hybrid                           10  10  10   7     31     94     137
Hybrid                           10  10  10   9     31     94     137
Old                               0  10  10   -     82    332     441
Hybrid                            0  10  10   3     33    149     218
Hybrid                            0  10  10   5     25     86     140
Hybrid                            0  10  10   7     25     84     127
Hybrid                            0  10  10   9     25     84     127
Old                              20   1   4   -     68    326     413
Hybrid                           20   1   4   2     35    136     222
Hybrid                           20   1   4   3     34    137     232

Figure 1: A typical residual reduction pattern of the hybrid algorithm applied to Problem 1.
However, for the result of old(10,10,10,-), where the same values of m, s, k are used, a rapid drop is seldom seen (in fact, we did apply a projection step at every 20 power iterations, but most of them were found to be of no use). The reason seems to be that the estimation of the major nonsolution components in the old algorithm is not as accurate as that of the Lanczos method used in the hybrid algorithm.

Indeed, even though old(10,10,10,-) performs a projection step at every 21 matrix-vector multiplications (i.e., 10 for power iterations and 11 for the Lanczos method applied to the residual), rapid improvement is seldom seen. In the figure, rapid improvement is seen at the 6th, 11th, and 15th projections only. Thus the old algorithm must have failed to accelerate convergence in most hybrid steps, unless more power iterations reduce the number of major nonsolution components so that the condition becomes favorable for the old projector.
When we apply projection steps too often (see the cases for m = 20, s = 1, k = 4 at the end of Table 1 and Figure 2), convergence may be worse, because the iteration is not ready for another projection yet, in the sense that the span of the residual sequence may not be a good approximation to the span of the dominant nonsolution components. However, even in such cases, the new hybrid algorithm still works far better than the old algorithm.

Figure 2: Effect of frequent projections (Problem 1).

Figure 3 shows that the new algorithm still works well even though the matrix of Problem 2 has many more (compared to the size of the dimension, c = 5) well-separated eigenvalues with modulus close to 1.
In general, any choice of the parameters (m, s, k, c) for the new hybrid algorithm seems to work well, but determination of their optimal values needs further research.
Figure 3: Residual reduction by the hybrid algorithm applied to Problem 2.

References

1. G. H. Golub and C. F. van Loan, 1996. Matrix Computations, 3rd Ed., Johns Hopkins Univ. Press, Baltimore, U.S.A.
2. L. Kaufman, 1983. Matrix methods for queuing problems, SIAM J. Sci. Comput., 4:525-552.
3. P. S. Park, 1996. Use of an orthogonal projector for accelerating a queuing problem solver, Korean J. Com. & Appl. Math. 3(2):193-204.
4. P. S. Park, 1997. Interconversion between the power and Arnoldi's methods. Comm. Korean Math. Soc. 12(1):145-155.
5. Y. Saad, 1980. Variations on Arnoldi's method for computing eigenelements of large unsymmetric matrices, Lin. Alg. App., 34:269-295.
6. E. Seneta, 1981. Non-negative Matrices and Markov Chains, 2nd Ed., Springer-Verlag, New York, U.S.A.
7. L. N. Trefethen and D. Bau, III, 1997. Numerical Linear Algebra, SIAM, Philadelphia, U.S.A.
Department of Computer Science
University of Suwon
Kyungki-Do 445-743, Korea
J. KSIAM Vol.3, No.2, 17-28, 1999
THE BOUNDARY ELEMENT METHOD FOR
POTENTIAL PROBLEMS WITH SINGULARITIES
BEONG IN YUN
Abstract. A new procedure of the boundary element method (BEM), called the singular BEM, for potential problems with singularities is presented. To obtain a numerical solution whose asymptotic behavior near the singularities is close to that of the analytic solution, we use particular elements on the boundary segments containing the singularities. The Motz problem and the crack problem are taken as typical examples, and numerical results for these cases show the efficiency of the present method.
1. Introduction. The general potential boundary value problem in the plane can be written as

Δu = 0  in Ω,
u = ū  on Γ_u,    (1.1)
q = q̄  on Γ_q,

where Γ = Γ_u ∪ Γ_q is the piecewise smooth boundary of the domain Ω, and q = ∂u/∂n is the normal derivative of the potential u with respect to the outward unit normal vector n.

There are many numerical methods for solving boundary value problems, such as the finite element method (FEM), the finite difference method (FDM), the Ritz-Galerkin method, and the boundary element method (BEM). Each of these methods has its own advantages and disadvantages. When the solutions have singularities, due to complicated boundary conditions or the geometries of the boundaries, the traditional numerical schemes are not useful, and some special manipulation is needed to overcome this difficulty [1-4].

The BEM has recently prevailed in many engineering disciplines because of the reduction in the dimensionality of the problem, which results in a much smaller system of algebraic equations to be solved numerically. The present work is concerned with a simple and effective numerical implementation of the traditional BEM [5,6] for solving potential problems with singularities.
It is well known that, for interior points P in Ω, the solution of problem (1.1) satisfies

u(P) + ∫_Γ q*(P,Q) u(Q) dΓ(Q) = ∫_Γ u*(P,Q) q(Q) dΓ(Q),    (1.2)

Keywords: boundary element method, singular BEM, Motz problem, crack problem
1991 Mathematics Subject Classification: 65N38
in which u^*(P,Q) and q^*(P,Q) are fundamental solutions of the Laplace equation, namely

u^*(P,Q) = \frac{1}{2\pi} \log \frac{1}{r}   (1.3)

and

q^*(P,Q) = \frac{\partial u^*}{\partial n_Q}(P,Q) = -\frac{1}{2\pi} \frac{1}{r^2} \left( r_1 n_1 + r_2 n_2 \right).   (1.4)

In the formulae (1.3) and (1.4), for the points P = (p_1, p_2) and Q = (q_1, q_2),

r = |Q - P| = \sqrt{r_1^2 + r_2^2},
r_1 = q_1 - p_1, \quad r_2 = q_2 - p_2,
n_Q = (n_1, n_2).   (1.5)
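As a quick sanity check on (1.3)-(1.4) (an illustrative aside, not part of the paper's algorithm), the sketch below evaluates the fundamental solution and verifies numerically that it is harmonic away from P and that q^* is indeed its normal derivative at Q:

```python
import math

def u_star(P, Q):
    # Fundamental solution (1.3): u*(P,Q) = (1/(2*pi)) * log(1/r), r = |Q - P|
    r = math.hypot(Q[0] - P[0], Q[1] - P[1])
    return math.log(1.0 / r) / (2.0 * math.pi)

def q_star(P, Q, nQ):
    # Normal derivative (1.4): q* = -(1/(2*pi)) (r1*n1 + r2*n2) / r^2
    r1, r2 = Q[0] - P[0], Q[1] - P[1]
    rsq = r1 * r1 + r2 * r2
    return -(r1 * nQ[0] + r2 * nQ[1]) / (2.0 * math.pi * rsq)

# u* is harmonic in Q away from P: check with a 5-point finite-difference Laplacian.
P, Q, h = (0.0, 0.0), (0.7, 0.4), 1e-4
lap = (u_star(P, (Q[0] + h, Q[1])) + u_star(P, (Q[0] - h, Q[1]))
       + u_star(P, (Q[0], Q[1] + h)) + u_star(P, (Q[0], Q[1] - h))
       - 4.0 * u_star(P, Q)) / h**2

# q* should match the directional derivative of u* along nQ (here nQ = (0, 1)).
nQ = (0.0, 1.0)
dd = (u_star(P, (Q[0], Q[1] + 1e-6)) - u_star(P, (Q[0], Q[1] - 1e-6))) / 2e-6
```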
The limiting process of equation (1.2) to a boundary point induces the boundary
integral equation

\frac{1}{2} u(P) + \int_\Gamma q^*(P,Q)\, u(Q)\, d\Gamma(Q) = \int_\Gamma u^*(P,Q)\, q(Q)\, d\Gamma(Q), \quad P \in \Gamma.   (1.6)

First, the numerical scheme for the traditional BEM is given in Section 2. A
simple modification of the boundary elements near the singularities is proposed in
Section 3, and applications of the present method to the Motz problem and the crack
problem are studied in the last section.
2. Traditional BEM with Constant Elements and Linear Discretization.
In this section we review the complete algorithm of the traditional BEM, for simplicity
based on constant elements with a linear discretization of the boundary.
2.1. Discretization
Let the boundary \Gamma be discretized by the line segments \Gamma_j (j = 1, 2, \ldots, n), whose
end points are (x_j, y_j) and (x_{j+1}, y_{j+1}). Then every point Q = (x, y) \in \Gamma_j can be
written as

x = \phi_j(t) = \frac{1}{2}\left[ (x_{j+1} - x_j)\, t + (x_{j+1} + x_j) \right],
y = \psi_j(t) = \frac{1}{2}\left[ (y_{j+1} - y_j)\, t + (y_{j+1} + y_j) \right], \quad -1 \le t \le 1,   (2.1)

with

d\Gamma_j(Q) = \sqrt{\phi_j'(t)^2 + \psi_j'(t)^2}\, dt = \frac{1}{2}\sqrt{(x_{j+1} - x_j)^2 + (y_{j+1} - y_j)^2}\, dt \equiv \frac{1}{2} L_j\, dt,   (2.2)
POTENTIAL PROBLEMS WITH SINGULARITIES 19
and the outward unit normal vector is

n_Q = (n_1, n_2) = (y_{j+1} - y_j,\; x_j - x_{j+1}) / L_j.   (2.3)

On the other hand, we take the node point on \Gamma_i as

P_i = (\bar{x}_i, \bar{y}_i), \quad \bar{x}_i = (x_i + x_{i+1})/2, \quad \bar{y}_i = (y_i + y_{i+1})/2.   (2.4)

Then the distance between P_i and Q \in \Gamma_j is

r(t) = |Q - P_i| = \sqrt{r_1(t)^2 + r_2(t)^2},   (2.5)

with

r_1(t) = \phi_j(t) - \bar{x}_i \quad \text{and} \quad r_2(t) = \psi_j(t) - \bar{y}_i.   (2.6)

On each boundary segment \Gamma_j, we take constant values for the potential and the
flux, say,

u(Q) = u(P_j) = u_j, \quad q(Q) = q(P_j) = q_j, \quad \text{for all } Q \in \Gamma_j.   (2.7)
Then, for every node point P = P_i \in \Gamma_i, the boundary integral equation (1.6) results
in

\frac{1}{2} u_i + \sum_{j=1}^{n} \left[ \int_{\Gamma_j} q^*(P_i, Q)\, d\Gamma_j(Q) \right] u_j = \sum_{j=1}^{n} \left[ \int_{\Gamma_j} u^*(P_i, Q)\, d\Gamma_j(Q) \right] q_j, \quad i = 1, 2, \ldots, n.   (2.8)

The equation (2.8) is rewritten as

\sum_{j=1}^{n} H_{ij} u_j = \sum_{j=1}^{n} G_{ij} q_j, \quad i = 1, 2, \ldots, n,   (2.9)
in which the integrals G_{ij} and H_{ij} can be evaluated numerically by the Gauss quadrature
rule. That is, referring to the formulae (2.1)-(2.6),

G_{ij} = \int_{\Gamma_j} u^*(P_i, Q)\, d\Gamma_j(Q) = -\frac{1}{2\pi} \int_{-1}^{1} \log r(t) \left( \frac{1}{2} L_j \right) dt \approx -\frac{L_j}{4\pi} \sum_{m=1}^{M} \omega_m \log r(t_m),   (2.10)

where \omega_m and t_m are the weights and nodes of the Gauss quadrature rule on the interval
-1 \le t \le 1. In particular, when i = j,

G_{ii} = -\frac{L_i}{4\pi} \int_{-1}^{1} \log \left| \frac{L_i}{2}\, t \right| dt = -\frac{L_i}{2\pi} \left[ \log \left( \frac{L_i}{2} \right) - 1 \right].   (2.11)
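As an illustration of (2.10) (function names and test geometry are ours, not the paper's), the influence coefficient for a regular pair i \ne j needs only a few Gauss points: for a node well separated from the segment, the paper's choice M = 4 already agrees closely with a much finer rule.

```python
import numpy as np

def G_ij(Pi, A, B, M=4):
    """Approximate G_ij = -(L_j/(4*pi)) * sum_m w_m log r(t_m), as in (2.10),
    for the segment with end points A, B and collocation node Pi."""
    t, w = np.polynomial.legendre.leggauss(M)   # Gauss nodes/weights on [-1, 1]
    L = np.hypot(B[0] - A[0], B[1] - A[1])      # segment length L_j
    # linear parametrization (2.1): Q(t) = midpoint + t * half-chord
    qx = 0.5 * ((B[0] - A[0]) * t + (B[0] + A[0]))
    qy = 0.5 * ((B[1] - A[1]) * t + (B[1] + A[1]))
    r = np.hypot(qx - Pi[0], qy - Pi[1])        # r(t) of (2.5)
    return -(L / (4.0 * np.pi)) * np.sum(w * np.log(r))

# Node well away from a short segment: integrand log r(t) is smooth, so the
# 4-point rule is already near the 64-point reference.
g4 = G_ij((0.9, 0.8), (0.0, 0.0), (0.2, 0.0), M=4)
g64 = G_ij((0.9, 0.8), (0.0, 0.0), (0.2, 0.0), M=64)
```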
In the formula (2.9), H_{ij} = \frac{1}{2}\delta_{ij} + \bar{H}_{ij}, and \bar{H}_{ij} is approximated by

\bar{H}_{ij} = \int_{\Gamma_j} q^*(P_i, Q)\, d\Gamma_j(Q) = -\frac{L_j}{4\pi} \int_{-1}^{1} \frac{1}{r(t)^2} \left[ r_1(t) n_1 + r_2(t) n_2 \right] dt \approx -\frac{L_j}{4\pi} \sum_{m=1}^{M} \frac{\omega_m}{r(t_m)^2} \left[ r_1(t_m) n_1 + r_2(t_m) n_2 \right].   (2.12)

When i = j, r_1(t) n_1 + r_2(t) n_2 = 0, so that

\bar{H}_{ii} = 0.   (2.13)
2.2. Solving the system of boundary integral equations
If we define the matrices

H = [H_{ij}]_{n \times n}, \quad G = [G_{ij}]_{n \times n}

and the vectors

u = \{u_j\}_{n \times 1}, \quad q = \{q_j\}_{n \times 1},

the equation (2.9) can be written as

Hu = Gq.   (2.14)

Assume that the input data is given as

EP = \{x_j, y_j\}_{n \times 2},
f = \{f_j\}_{n \times 1},
T = \{T[j]\}_{n \times 1} \quad (T[j] = 0 \text{ or } 1),   (2.15)

in which EP is the set of extreme points of the boundary segments, and f is the set of
boundary conditions at the node points. T indicates the type of boundary condition
at each node: T[j] = 0 means that the value of the potential is known at node j,
that is, u_j = f_j while the flux q_j is unknown; T[j] = 1 means that q_j = f_j with the
potential u_j unknown.
To find the unknown values of u and q by substituting the boundary conditions into
(2.14), one has to rearrange the system by moving columns of H and G from one side
to the other. If all the unknowns are passed to the left hand side, then the system
(2.14) is translated into

Ax = y, \quad A = [a_{ij}]_{n \times n}, \quad x = \{x_j\}_{n \times 1}, \quad y = \{y_j\}_{n \times 1},   (2.16)
where x is the vector of unknowns for the u's and q's, and y is found by multiplying the
corresponding columns of the translated matrix of G by the known values of the u's and q's.
Based on the statement given above, we introduce an algorithm to translate the
system (2.14) into (2.16) as follows:

DO j = 1, 2, ..., n
  DO i = 1, 2, ..., n
    a_{ij} = (T[j] - 1) G_{ij} + T[j] H_{ij}
    B_{ij} = T[j] G_{ij} + (T[j] - 1) H_{ij}   (2.17)
  CONTINUE
CONTINUE
DO i = 1, 2, ..., n
  y_i = \sum_{k=1}^{n} B_{ik} f_k   (2.18)
CONTINUE

Once the vector of unknowns x = \{x_j\}_{n \times 1} is obtained by solving the equation
(2.16), the values of the potential and the flux at each node are given by

u_j = T[j]\, x_j + (1 - T[j])\, f_j,
q_j = (1 - T[j])\, x_j + T[j]\, f_j, \quad j = 1, 2, \ldots, n.   (2.19)
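The loops (2.17)-(2.18) and the recovery step (2.19) translate directly into code. The sketch below is our vectorized illustration (array names are ours); the demo at the end fabricates data consistent with Hu = Gq and checks that the translated system Ax = y holds.

```python
import numpy as np

def translate_system(H, G, f, T):
    """Build A x = y of (2.16) from H u = G q of (2.14), following (2.17)-(2.18).
    T[j] = 0 means u_j is known (q_j unknown); T[j] = 1 means q_j is known."""
    T = np.asarray(T, dtype=float)
    # Broadcasting over the last axis applies T[j] column-wise, as in (2.17):
    A = (T - 1.0) * G + T * H      # a_ij = (T[j]-1) G_ij + T[j] H_ij
    B = T * G + (T - 1.0) * H      # B_ij = T[j] G_ij + (T[j]-1) H_ij
    y = B @ f                      # (2.18)
    return A, y

def recover_uq(x, f, T):
    # (2.19): split the solved unknowns back into potential and flux values.
    T = np.asarray(T, dtype=float)
    u = T * x + (1.0 - T) * f
    q = (1.0 - T) * x + T * f
    return u, q

# Demo: fabricate consistent data H u = G q, then check A x_true = y.
rng = np.random.default_rng(0)
n = 4
H = rng.standard_normal((n, n))
R = rng.standard_normal((n, n))
u = rng.standard_normal(n)
q = rng.standard_normal(n)
G = R + np.outer(H @ u - R @ q, q) / (q @ q)   # rank-1 fix forces H u = G q
T = np.array([0, 1, 0, 1])                     # alternate known u / known q
f = np.where(T == 0, u, q)                     # known boundary data
x_true = np.where(T == 0, q, u)                # the corresponding unknowns
A, y = translate_system(H, G, f, T)
```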
2.3. Evaluation at the internal points
After the unknown coefficients \{u_j\} and \{q_j\} are obtained, the value of the potential
at an interior point can be evaluated as

u(P) = \sum_{j=1}^{n} G_j(P)\, q_j - \sum_{j=1}^{n} \bar{H}_j(P)\, u_j, \quad P = (x, y) \in \Omega.   (2.20)

In this formula G_j(P) and \bar{H}_j(P) are the same as G_{ij} and \bar{H}_{ij} given in (2.10) and (2.12),
respectively, if we replace r_i(t) (i = 1, 2) in (2.6) by

r_1(t) = \phi_j(t) - x \quad \text{and} \quad r_2(t) = \psi_j(t) - y.   (2.21)
In order to evaluate the derivatives of the potential at the internal points, we consider
the following equations resulting from (1.2):

\frac{\partial u}{\partial x}(P) = \int_\Gamma \frac{\partial}{\partial x} u^*(P,Q)\, q(Q)\, d\Gamma(Q) - \int_\Gamma \frac{\partial}{\partial x} q^*(P,Q)\, u(Q)\, d\Gamma(Q),

\frac{\partial u}{\partial y}(P) = \int_\Gamma \frac{\partial}{\partial y} u^*(P,Q)\, q(Q)\, d\Gamma(Q) - \int_\Gamma \frac{\partial}{\partial y} q^*(P,Q)\, u(Q)\, d\Gamma(Q),   (2.22)
for P = (x, y) \in \Omega. In these formulae the kernels are

\frac{\partial}{\partial x} u^*(P,Q) = \frac{1}{2\pi} \frac{r_1}{r^2}, \quad \frac{\partial}{\partial y} u^*(P,Q) = \frac{1}{2\pi} \frac{r_2}{r^2},

\frac{\partial}{\partial x} q^*(P,Q) = -\frac{1}{2\pi} \frac{1}{r^2} \left[ \frac{2}{r^2}\, r_1 (r_1 n_1 + r_2 n_2) - n_1 \right],

\frac{\partial}{\partial y} q^*(P,Q) = -\frac{1}{2\pi} \frac{1}{r^2} \left[ \frac{2}{r^2}\, r_2 (r_1 n_1 + r_2 n_2) - n_2 \right].   (2.23)
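The kernels (2.23) can be spot-checked against finite differences of the fundamental solutions with respect to the field point P (an illustrative aside; the point values and the normal below are our test data):

```python
import math

def u_star(P, Q):
    # (1.3): u*(P,Q) = (1/(2*pi)) log(1/r)
    r = math.hypot(Q[0] - P[0], Q[1] - P[1])
    return math.log(1.0 / r) / (2.0 * math.pi)

def q_star(P, Q, n):
    # (1.4): q*(P,Q) = -(1/(2*pi)) (r1 n1 + r2 n2) / r^2
    r1, r2 = Q[0] - P[0], Q[1] - P[1]
    rsq = r1 * r1 + r2 * r2
    return -(r1 * n[0] + r2 * n[1]) / (2.0 * math.pi * rsq)

def du_star_dx(P, Q):
    # First kernel of (2.23): d u*/dx = (1/(2*pi)) r1 / r^2
    r1, r2 = Q[0] - P[0], Q[1] - P[1]
    return r1 / (2.0 * math.pi * (r1 * r1 + r2 * r2))

def dq_star_dx(P, Q, n):
    # Third kernel of (2.23): -(1/(2*pi))(1/r^2)[(2/r^2) r1 (r.n) - n1]
    r1, r2 = Q[0] - P[0], Q[1] - P[1]
    rsq = r1 * r1 + r2 * r2
    rn = r1 * n[0] + r2 * n[1]
    return -((2.0 / rsq) * r1 * rn - n[0]) / (2.0 * math.pi * rsq)

P, Q, n, h = (0.1, -0.2), (0.8, 0.5), (0.0, 1.0), 1e-6
fd_u = (u_star((P[0] + h, P[1]), Q) - u_star((P[0] - h, P[1]), Q)) / (2.0 * h)
fd_q = (q_star((P[0] + h, P[1]), Q, n) - q_star((P[0] - h, P[1]), Q, n)) / (2.0 * h)
```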
Discretization of the equation (2.22) gives

\frac{\partial u}{\partial x}(P) = \sum_{j=1}^{n} \left[ G_x^j(P)\, q_j - \bar{H}_x^j(P)\, u_j \right], \quad \frac{\partial u}{\partial y}(P) = \sum_{j=1}^{n} \left[ G_y^j(P)\, q_j - \bar{H}_y^j(P)\, u_j \right],   (2.24)

where

G_x^j(P) = \int_{\Gamma_j} \frac{\partial u^*}{\partial x}\, d\Gamma_j, \quad G_y^j(P) = \int_{\Gamma_j} \frac{\partial u^*}{\partial y}\, d\Gamma_j,

\bar{H}_x^j(P) = \int_{\Gamma_j} \frac{\partial q^*}{\partial x}\, d\Gamma_j, \quad \bar{H}_y^j(P) = \int_{\Gamma_j} \frac{\partial q^*}{\partial y}\, d\Gamma_j.   (2.25)

Using the formulae (2.23)-(2.25) one can obtain the derivatives of the potential at
every internal point.
3. Boundary Elements near the Singular Points.
Assume that \Gamma is the boundary of a bounded region and that the behavior of the solutions
for the potential and the flux at an internal point P near P^* is like

u(P) = O(|P^* - P|^\alpha), \quad q(P) = O(|P^* - P|^{\alpha - 1}),   (3.1)

with 0 < \alpha < 1. That is, P^* is a singular point for the flux. For some integer k, one may
take the sequential boundary segments \Gamma_k and \Gamma_{k+1} whose common extreme point
is P^*. In this case, instead of the constant elements, we present particular boundary
elements on these segments \Gamma_k and \Gamma_{k+1} such as

u_k(t) = u_k (1 - t)^\alpha, \quad q_k(t) = g_k (1 - t)^{\alpha - 1} \quad \text{on } \Gamma_k,
u_{k+1}(t) = u_{k+1} (1 + t)^\alpha, \quad q_{k+1}(t) = g_{k+1} (1 + t)^{\alpha - 1} \quad \text{on } \Gamma_{k+1},   (3.2)

for -1 \le t \le 1. It should be noted that u_k(t), u_{k+1}(t), q_k(t) and q_{k+1}(t) satisfy the
conditions in (3.1) near the singular point P^*.
Then, for the singular boundary segments \Gamma_k and \Gamma_{k+1}, the integrals in (2.10) and
(2.12) are replaced by

G_{ik} \approx -\frac{L_k}{4\pi} \sum_{m=1}^{M} \omega_m (1 - t_m)^{\alpha - 1} \log r(t_m),
\bar{H}_{ik} \approx -\frac{L_k}{4\pi} \sum_{m=1}^{M} (1 - t_m)^{\alpha}\, \frac{\omega_m}{r(t_m)^2} \left[ r_1(t_m) n_1 + r_2(t_m) n_2 \right],   (3.3)

and

G_{i(k+1)} \approx -\frac{L_{k+1}}{4\pi} \sum_{m=1}^{M} \omega_m (1 + t_m)^{\alpha - 1} \log r(t_m),
\bar{H}_{i(k+1)} \approx -\frac{L_{k+1}}{4\pi} \sum_{m=1}^{M} (1 + t_m)^{\alpha}\, \frac{\omega_m}{r(t_m)^2} \left[ r_1(t_m) n_1 + r_2(t_m) n_2 \right].   (3.4)
When \Gamma is an open arc, say a crack, and thus the singularities occur at the two
crack tips, one may take \Gamma_1 and \Gamma_n so that they contain the singular points on the left
and right hand sides, respectively. In this case the formulae (3.3) and (3.4)
hold with k and k + 1 replaced by n and 1, respectively.
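The modified sums (3.3)-(3.4) simply carry the singular factor (1 \mp t)^{\alpha-1} into the Gauss sum, i.e. a plain Gauss-Legendre rule is applied to a singular integrand. This converges, though slowly in M; the sketch below (our aside, with g \equiv 1 and \alpha = 1/2) illustrates this on the model integral \int_{-1}^{1} (1-t)^{-1/2} dt = 2\sqrt{2}:

```python
import numpy as np

# Exact value of the model singular integral with alpha = 1/2:
exact = 2.0 * np.sqrt(2.0)

def gauss_singular(M, alpha=0.5):
    # Plain Gauss-Legendre applied to the singular factor, as in (3.3)
    t, w = np.polynomial.legendre.leggauss(M)
    return np.sum(w * (1.0 - t) ** (alpha - 1.0))

err4 = abs(gauss_singular(4) - exact)    # the paper's M = 4
err64 = abs(gauss_singular(64) - exact)  # a much finer rule
```

The error decreases as M grows, but only algebraically because of the endpoint singularity; the point of the construction in Section 3 is that the singular behavior is carried by the element shape functions rather than left to the quadrature of the full solution.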
4. Applications of the Singular BEM.
We introduce two typical examples, the Motz problem and a crack problem, in sequence,
to show the efficiency of the singular BEM introduced in this article. The
number of nodes of the Gauss quadrature rule in the formulae (2.10) and (2.12) is
taken as M = 4.
4.1. The Motz problem.
As a typical singularity problem we consider the Motz problem, as shown in Figure 1,
in the rectangular domain \Omega = \{(x, y) \mid -1 < x < 1,\; 0 < y < 1\} with the boundary
conditions

u|_{y=0,\,x<0} = 0, \quad u|_{x=1} = 500, \quad q|_{y=1} = q|_{y=0,\,x>0} = q|_{x=-1} = 0.   (4.1)

Figure 1 is located here.
It is known that the solution of (4.1) has a singularity at the origin. In fact, the
exact solution can be expressed in a series as [3]

u(r, \theta) = \sum_{j=0}^{\infty} b_j\, r^{j + \frac{1}{2}} \cos\left( j + \frac{1}{2} \right)\theta,   (4.2)

where (r, \theta) are polar coordinates.
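Each term of (4.2) is a harmonic function r^{j+1/2} \cos((j + 1/2)\theta); the coefficients b_j are problem-dependent and not reproduced here. A small numerical check of the harmonicity of a single term (illustrative only):

```python
import math

def motz_term(x, y, j):
    # Single term r^(j + 1/2) * cos((j + 1/2) * theta) of the expansion (4.2);
    # the true coefficients b_j of the Motz problem are not reproduced here.
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    a = j + 0.5
    return r ** a * math.cos(a * theta)

# Each term is harmonic away from the origin: check with a 5-point Laplacian.
x, y, h = 0.4, 0.3, 1e-4
lap = (motz_term(x + h, y, 1) + motz_term(x - h, y, 1)
       + motz_term(x, y + h, 1) + motz_term(x, y - h, 1)
       - 4.0 * motz_term(x, y, 1)) / h**2
```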
Applying the singular BEM to this problem, we obtain a good approximation to the
exact solution near the singular point. Referring to the fact that the derivatives of the
exact solution have an O(r^{-1/2}) singularity in the vicinity of the origin, one takes the
singular boundary elements in (3.2) as

u_k(t) = u_k (1 - t)^{1/2}, \quad q_k(t) = g_k (1 - t)^{-1/2},
u_{k+1}(t) = u_{k+1} (1 + t)^{1/2}, \quad q_{k+1}(t) = g_{k+1} (1 + t)^{-1/2}.   (4.3)

Figure 2 shows the behavior of the traditional BEM solution u_n^T and the singular
BEM solution u_n^S near the singular point, respectively. The subscript n indicates
the number of boundary elements. A comparison of the relative errors of these two
results is given in Figure 3. We have taken u as the exact solution evaluated
from the truncated series (4.2) with a sufficiently large number of terms.

Figure 2 is located here.
Figure 3 is located here.
In Table 1 the relative errors of the traditional and the singular BEM solutions with
n = 30 for the potential and its normal derivatives are given. The selected points are

P_1 = \left( -\frac{1}{2}, \frac{1}{4} \right), \quad P_2 = \left( \frac{1}{2}, \frac{1}{2} \right), \quad P_3 = \left( \frac{1}{2}, \frac{3}{4} \right).

The relative errors are defined as

E^T = \left| \frac{u - u_n^T}{u} \right|, \quad E_1^T = \left| \frac{\partial}{\partial x}\left( u - u_n^T \right) \Big/ \frac{\partial u}{\partial x} \right|, \quad E_2^T = \left| \frac{\partial}{\partial y}\left( u - u_n^T \right) \Big/ \frac{\partial u}{\partial y} \right|,

and E^S, E_1^S, E_2^S are defined similarly. Table 1 shows that the singular BEM solution
is satisfactory at interior points on the whole, as well as near the singular points.

Table 1 is located here.
4.2. The crack problem.
In this case we consider the Dirichlet problem on a crack \Gamma such as

\Delta u(P) = 0, \quad P \in \mathbb{R}^2 \setminus \Gamma,
u(P) = f(P), \quad P \in \Gamma,
\sup_{P \in \mathbb{R}^2} |u(P)| < \infty.   (4.4)
Recently, several numerical methods for this type of problem have been studied via the
indirect BEM [7,8,9]. Even though these works give complete approximation schemes
and convergence analyses, numerical results for the singular fields near the crack tips
are not presented.
Above all it should be noted that, for crack problems as given above, the double
layer potential \int_\Gamma q^*(P,Q)\, u(Q)\, d\Gamma(Q) in (1.2) is canceled by the opposite signs of the
fundamental solution q^* on the upper and lower crack faces. Thus the integral equation
(1.2) should be replaced by

u(P) = \int_\Gamma u^*(P,Q)\, q(Q)\, d\Gamma(Q) + \gamma,   (4.5)

where \gamma is an unknown constant. As mentioned in the literature [8], the addition of
the unknown constant \gamma is due to the constraint

\int_\Gamma q(Q)\, d\Gamma(Q) = 0,   (4.6)

which results from the boundedness condition in the problem (4.4). In fact, the constant
\gamma is the limit value of the potential u(P) as |P| \to \infty.
It is known that the flux also has an O(r^{-1/2}) singularity near the crack tips. Taking
the first boundary segment \Gamma_1 and the last one \Gamma_n so that they contain the left and right
hand side crack tips, respectively, we take the singular boundary elements

q_1(t) = q_1 (1 + t)^{-1/2} \quad \text{on } \Gamma_1, \quad q_n(t) = q_n (1 - t)^{-1/2} \quad \text{on } \Gamma_n.   (4.7)

Then the discretization scheme given in Section 2 and the condition (4.6) imply that

\sum_{j=1}^{n} \frac{L_j}{2} \int_{-1}^{1} q_j(t)\, dt = \sqrt{2}\, (L_1 q_1 + L_n q_n) + \sum_{j=2}^{n-1} L_j q_j = 0.   (4.8)
If one chooses the boundary segments so that L_1 = L_2 = \cdots = L_n, equations (4.5)
and (4.8) result in the system

\begin{pmatrix}
G_{11} & G_{12} & \cdots & G_{1n} & 1 \\
G_{21} & G_{22} & \cdots & G_{2n} & 1 \\
\vdots & \vdots &        & \vdots & \vdots \\
G_{n1} & G_{n2} & \cdots & G_{nn} & 1 \\
\sqrt{2} & 1 & \cdots & \sqrt{2} & 0
\end{pmatrix}
\begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \\ \gamma \end{pmatrix}
=
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \\ 0 \end{pmatrix},   (4.9)

where the last row is (\sqrt{2}, 1, \ldots, 1, \sqrt{2}, 0).
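The bordered system (4.9) is easy to assemble; the sketch below is our illustration with a random well-conditioned matrix standing in for the actual influence coefficients of (2.10)/(3.3), so the numbers are not those of the paper. Solving it yields flux coefficients that satisfy the discrete constraint (4.8) exactly.

```python
import numpy as np

def crack_system(G, u):
    """Assemble the bordered system (4.9) for equal segment lengths:
    unknowns are the flux coefficients q_1..q_n and the constant gamma."""
    n = G.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = G                           # single-layer influence coefficients
    A[:n, n] = 1.0                          # column multiplying gamma in (4.5)
    A[n, :n] = 1.0                          # constraint row from (4.8) ...
    A[n, 0] = A[n, n - 1] = np.sqrt(2.0)    # ... with sqrt(2) at the crack tips
    rhs = np.concatenate([u, [0.0]])
    return A, rhs

# Illustrative data only: a random nonsingular G stands in for the quadrature
# values; u holds the Dirichlet data at the nodes.
rng = np.random.default_rng(1)
n = 6
G = rng.standard_normal((n, n)) + n * np.eye(n)   # keep it well conditioned
u = rng.standard_normal(n)
A, rhs = crack_system(G, u)
sol = np.linalg.solve(A, rhs)
q, gamma = sol[:n], sol[n]
```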
As an example we take \Gamma as the line segment on the x-axis, that is, \Gamma = [-1, 1],
and suppose the boundary condition is given by

f(x) = e^{-x} \cos \sqrt{1 - x^2}, \quad x \in \Gamma.   (4.10)

The exact solution of (4.4) is known to be [7]

u(x_1, x_2) = \mathrm{Re}\left[ e^{\sqrt{z^2 - 1} - z} \right], \quad z = x_1 + i x_2.   (4.11)
Figure 4 shows the behavior of the indirect BEM solution u_n^I of ref. [8] and
the singular BEM solution u_n^S near the right crack tip, respectively. Figure 5 compares
the relative errors of these solutions with respect to the exact solution (4.11).
Figures 4 and 5 show that the present method is very effective near the singularities.

Figure 4 is located here.

Figure 5 is located here.
References
1. R.W. Thatcher, The use of infinite grid refinement at singularities in the solution of Laplace's equation, Numer. Math. 25 (1976), 163-178.
2. N. Papamichael, Numerical conformal mapping onto a rectangle with applications to the solution of Laplacian problems, J. Comput. Appl. Math. 28 (1990), 63-83.
3. Z.C. Li, Numerical Methods for Elliptic Problems with Singularities, World Scientific, Singapore, 1990.
4. A. Portela, M.H. Aliabadi and D.P. Rooke, The dual boundary element method: Effective implementation for crack problems, International J. Numer. Meth. Engng. 33 (1992), 1269-1287.
5. C.A. Brebbia, J.C.F. Telles and L.C. Wrobel, Boundary Element Techniques, Springer-Verlag, New York, 1984.
6. P.K. Banerjee, Boundary Element Methods in Engineering, McGraw-Hill, London, 1994.
7. K. Atkinson and I.H. Sloan, The numerical solution of first-kind logarithmic-kernel integral equations on smooth open arcs, Math. Comp. 56(193) (1991), 119-139.
8. B.I. Yun, S. Lee and U.J. Choi, A modified boundary integral method on open arcs in the plane, Computers and Math. Applic. 31(11) (1996), 37-43.
9. B.I. Yun and S. Lee, Double layer potential scheme for Dirichlet problems on smooth open arcs, Computers and Math. Applic. 37(7) (1999), 31-40.
Department of Informatics and Statistics
Kunsan National University
573-701 Kunsan, Korea
AN ALGORITHM FOR SYMMETRIC INDEFINITE
SYSTEMS OF LINEAR EQUATIONS
SUCHEOL YI
J. KSIAM Vol.3, No.2, 29-36, 1999
Abstract
It is shown that a new Krylov subspace method for solving symmetric indefinite
systems of linear equations can be obtained. We call the method the projection
method in this paper. The residual vector of the projection method is maintained
at each iteration, which may be useful in some applications.
1. Introduction. The kth Krylov subspace K_k(r_0, A) generated by an initial residual
vector r_0 = b - Ax_0 and A is defined by

K_k(r_0, A) \equiv \mathrm{span}\{r_0, A r_0, \ldots, A^{k-1} r_0\}.   (1)

Iterative methods that choose corrections from the space K_k(r_0, A) at each iteration
are called Krylov subspace methods. The GMRES method [7] is a Krylov subspace
method for solving systems of linear equations

Ax = b, \quad \text{where } A \in \mathbb{R}^{n \times n} \text{ is nonsingular}.   (2)

The kth iterate of GMRES can be characterized as x_k = x_0 + z_k for a given initial
guess x_0 \in \mathbb{R}^n, where the correction z_k is chosen to minimize the norm of the residual
vector r(z) = r_0 - Az over the kth Krylov subspace K_k(r_0, A) at each iteration, i.e.,

\|r_0 - A z_k\|_2 = \min_{z \in K_k(r_0, A)} \|r_0 - A z\|_2.   (3)
If the Arnoldi process is applied with v_1 = A r_0 / \|A r_0\|_2 to generate a basis for the
Krylov subspace K_k(r_0, A), the simpler GMRES implementations of Walker and Zhou [8]
are obtained. The Arnoldi process is summarized as follows:

Algorithm 1.1 Arnoldi process
Initialize: Choose an initial guess v_1 with \|v_1\|_2 = 1.
Iterate: For k = 1, 2, \ldots, do:
  h_{i,k} = v_i^T A v_k, \quad i = 1, 2, \ldots, k;
  \tilde{v}_{k+1} = A v_k - \sum_{i=1}^{k} h_{i,k} v_i.
  Set h_{k+1,k} = \|\tilde{v}_{k+1}\|_2.
  If h_{k+1,k} = 0, stop; otherwise,
  v_{k+1} = \tilde{v}_{k+1} / h_{k+1,k}.

Key words: GMRES, MINRES, SYMMLQ, symmetric QMR, Krylov subspace method.
AMS subject classification: 65F10
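Algorithm 1.1 can be sketched in a few lines (our illustration; the breakdown test returns the basis built so far). The demo checks the two defining properties: the columns of V are orthonormal, and A V_k = V_{k+1} \bar{H}_k holds for the upper Hessenberg matrix of coefficients.

```python
import numpy as np

def arnoldi(A, v1, steps):
    """Arnoldi process of Algorithm 1.1: returns V = [v_1 ... v_{k+1}] with
    orthonormal columns and the (k+1) x k upper Hessenberg matrix H."""
    n = A.shape[0]
    V = np.zeros((n, steps + 1))
    H = np.zeros((steps + 1, steps))
    V[:, 0] = v1 / np.linalg.norm(v1)
    for k in range(steps):
        w = A @ V[:, k]
        for i in range(k + 1):
            H[i, k] = V[:, i] @ w       # h_{i,k} (modified Gram-Schmidt variant)
            w = w - H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] == 0.0:
            return V[:, :k + 1], H[:k + 1, :k]   # breakdown: invariant subspace
        V[:, k + 1] = w / H[k + 1, k]
    return V, H

# Demo on a random matrix (illustrative data only).
rng = np.random.default_rng(2)
A = rng.standard_normal((8, 8))
v1 = np.eye(8)[0]
V, H = arnoldi(A, v1, 5)
```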
30 SuCheol Yi
Without loss of generality we may assume the initial residual vector is nonzero. The
initial Arnoldi vector v_1 = A r_0 / \|A r_0\|_2 is then well-defined, since A is a nonsingular
matrix. Setting \beta_{1,1} = \|A r_0\|_2 gives the equation

A r_0 = \beta_{1,1} v_1,   (4)

and the following equation is satisfied by the Arnoldi process:

A v_{k-1} = \sum_{i=1}^{k} \beta_{i,k} v_i \quad \text{for unique } \beta_{i,k}\text{'s with } \beta_{k,k} > 0, \text{ for } k > 1.   (5)

From the equations (4) and (5) we have the relation

A U_k = V_k R_k,   (6)

where U_k = (r_0, v_1, \ldots, v_{k-1}), V_k = (v_1, \ldots, v_k), and

R_k = \begin{pmatrix} \beta_{1,1} & \cdots & \beta_{1,k} \\ & \ddots & \vdots \\ & & \beta_{k,k} \end{pmatrix}.

Then the relation (6) reduces the least-squares problem (3) directly to an upper triangular
least-squares problem by decomposing the initial residual vector r_0 as r_0 = \Pi_k^\perp r_0 + V_k V_k^T r_0
for each k, where \Pi_k^\perp is the orthogonal projection onto the orthogonal
complement of the space K_k(v_1, A).
We introduce another approach to Krylov subspace methods for solving symmetric
indefinite linear systems, called the projection method in this paper. The projection
method is closely related to the simpler GMRES method in that the projection
and simpler GMRES methods use the same initial basis vector v_1 = A r_0 / \|A r_0\|_2 in
applying the symmetric Lanczos and Arnoldi processes, respectively; in the symmetric
case, the projection method can be derived from the simpler GMRES method
by finding a search direction p_k such that A p_k = v_k for each k. Both simpler GMRES
and the projection method maintain orthonormal bases of the space A K_k(r_0, A), which
permit residual minimization through projection of the residual onto A K_k(r_0, A)^\perp.
With simpler GMRES, the kth approximate solution is obtained by solving a k \times k
upper triangular system. This is also done with the projection method, but only implicitly.
Because the projection method is based on the short-recurrence symmetric
Lanczos process, the triangular system is tridiagonal and, therefore, one can update
the approximate solution using a three-term short recurrence formula. In contrast to
simpler GMRES, the usual GMRES implementation maintains an orthonormal basis
of K_k(r_0, A) through the Arnoldi process and, consequently, achieves residual minimization
through the solution of an upper Hessenberg least-squares problem. MINRES
[6] can be viewed as a specialization of the usual GMRES approach to the symmetric
case, in which the short-recurrence symmetric Lanczos process is used to generate an
orthonormal basis of K_k(r_0, A). The upper Hessenberg system is tridiagonal, and so
the solution of the upper Hessenberg least-squares problem is done implicitly in MINRES
by implementing a three-term short recurrence formula for updating the approximate
solution. In the symmetric indefinite case without preconditioning, symmetric QMR
[2] is obtained using the same approach as MINRES. However, in solving the
preconditioned system

A' x' = b', \quad \text{where } A' = M_1^{-1} A M_2^{-1}, \quad x' = M_2 x, \quad \text{and } b' = M_1^{-1} b,   (7)

symmetric QMR is implemented by solving a quasi-minimization problem. Thus the
approach of the projection method is similar to that of simpler GMRES, while standard
GMRES, MINRES, and symmetric QMR follow an alternative approach. In Section 2
we give a derivation of the projection method, and in Section 3 we present the results
of numerical experiments.
2. A derivation of the projection method. By applying the Arnoldi process
starting with v_1 = A r_0 / \|A r_0\|_2, we have a set \{v_1, \ldots, v_k\} of orthonormal basis
vectors of the space K_k(v_1, A). Suppose we have a vector p_k such that A p_k = v_k for
each k. Then the kth residual vector r_k in the simpler GMRES method is

r_k = r_{k-1} - (r_{k-1}^T v_k) v_k = r_0 - A z_{k-1} - (r_{k-1}^T v_k) A p_k = r_0 - A \left[ z_{k-1} + (r_{k-1}^T v_k) p_k \right].   (8)

By the last expression in equation (8) it is natural to define the kth iterate x_k of the
projection method as x_k = x_{k-1} + (r_{k-1}^T v_k) p_k. Setting P_k = (p_1, \ldots, p_k) and V_k =
(v_1, \ldots, v_k), we need A P_k = V_k by the requirement A p_k = v_k for each k. By
the relation A U_k = V_k R_k in (6), the equation A P_k = V_k is equivalent to

U_k = P_k R_k.   (9)

The search direction p_k is then defined as

p_k = \begin{cases} r_0 / \beta_{1,1} & \text{if } k = 1, \\ \dfrac{1}{\beta_{k,k}} \left( v_{k-1} - \beta_{1,k} p_1 - \cdots - \beta_{k-1,k} p_{k-1} \right) & \text{if } k > 1. \end{cases}

In general this is a long recursion formula for generating p_k.
If A is symmetric, then an orthonormal basis \{v_1, \ldots, v_k\} of the space K_k(v_1, A)
can be generated by the symmetric Lanczos process. Then the upper triangular matrix
R_k in (6) can be reduced to the banded form

R_k = \begin{pmatrix}
\beta_{1,1} & \beta_{1,2} & \beta_{1,3} & 0 & \cdots & 0 \\
0 & \beta_{2,2} & \beta_{2,3} & \beta_{2,4} & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & \ddots & \ddots & \beta_{k-2,k} \\
\vdots & & & \ddots & \ddots & \beta_{k-1,k} \\
0 & \cdots & \cdots & \cdots & 0 & \beta_{k,k}
\end{pmatrix}.
Therefore, by (9) we have a short recursion formula for p_k, i.e.,

p_k = \frac{1}{\beta_{k,k}} \left( v_{k-1} - \beta_{k-1,k} p_{k-1} - \beta_{k-2,k} p_{k-2} \right) \quad \text{for } k > 1,

where \beta_{k-2,k} = v_{k-2}^T A v_{k-1}, \beta_{k-1,k} = v_{k-1}^T A v_{k-1}, \beta_{k,k} = \|\tilde{v}_k\|_2, and

\tilde{v}_k = A v_{k-1} - \beta_{k-1,k} v_{k-1} - \beta_{k-2,k} v_{k-2}.
Note that one might wish to apply the projection method to nonsymmetric
linear systems using the nonsymmetric Lanczos process to get a short recursion formula
for the search direction p_k. However, we found that the projection method with the
nonsymmetric Lanczos process is very unstable for solving nonsymmetric linear systems.
Therefore, we consider only symmetric indefinite systems in this paper. It is known
that there exists a symmetric positive definite matrix S such that M = S^2 for a
given symmetric positive definite matrix M. Therefore, the MINRES, SYMMLQ, and
projection methods can be applied to the system

\tilde{A} \tilde{x} = \tilde{b}, \quad \text{where } \tilde{A} = S^{-1} A S^{-1}, \quad \tilde{x} = S x, \quad \text{and } \tilde{b} = S^{-1} b.   (10)

With a symmetric positive definite preconditioner M, the projection method for a symmetric
matrix A can be summarized as follows:

Algorithm 2.1 Projection method (symmetric A)
Initialize: Choose x_0 and set r_0 = b - A x_0,
  z = M^{-1} r_0, \quad u_1 = A z, \quad w_1 = M^{-1} u_1, \quad \beta_1 = \sqrt{u_1^T w_1}.
  Update u_1 \leftarrow u_1 / \beta_1 and w_1 \leftarrow w_1 / \beta_1.
  Compute \alpha_1 = r_0^T w_1.
  Set r_1 = r_0 - \alpha_1 u_1, \quad p_1 = z / \beta_1, \quad x_1 = x_0 + \alpha_1 p_1.
Iterate: For k = 2, 3, \ldots, do:
  Set u_k = A w_{k-1}.
  For i = \max\{k-2, 1\}, \ldots, k-1, do:
    Set \bar{\beta}_i = u_k^T w_i.
    Update u_k \leftarrow u_k - \bar{\beta}_i u_i.
  Set w_k = M^{-1} u_k and \beta_k = \sqrt{u_k^T w_k}.
  Update u_k \leftarrow u_k / \beta_k and w_k \leftarrow w_k / \beta_k.
  Compute \alpha_k = r_{k-1}^T w_k and set r_k = r_{k-1} - \alpha_k u_k.
  Set p_k = \frac{1}{\beta_k} \left( w_{k-1} - \sum_{i=\max\{k-2,1\}}^{k-1} \bar{\beta}_i p_i \right) and set x_k = x_{k-1} + \alpha_k p_k.
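Algorithm 2.1 can be sketched directly from the recurrences. The version below is our illustration, not the authors' code: the preconditioner solve is a user-supplied callback (identity by default, i.e. M = I), and the demo uses a small diagonal symmetric indefinite system, for which the method should reach the exact solution in at most n steps in exact arithmetic.

```python
import numpy as np

def projection_method(A, b, x0, iters, Msolve=lambda r: r):
    """Sketch of Algorithm 2.1; Msolve(r) computes M^{-1} r.
    Returns the iterate x_k and the true residual norm history."""
    x = x0.astype(float).copy()
    r = b - A @ x
    z = Msolve(r)
    u = A @ z
    w = Msolve(u)
    beta = np.sqrt(u @ w)
    u, w = u / beta, w / beta
    alpha = r @ w
    r = r - alpha * u
    p = z / beta
    x = x + alpha * p
    us, ws, ps = [u], [w], [p]
    hist = [np.linalg.norm(b - A @ x)]
    for k in range(2, iters + 1):
        w_prev = ws[-1]
        u = A @ w_prev
        coeffs = []
        for i in range(max(k - 2, 1), k):   # 1-based indices of Algorithm 2.1
            bbar = u @ ws[i - 1]            # beta-bar_i = u_k^T w_i
            u = u - bbar * us[i - 1]
            coeffs.append((i, bbar))
        w = Msolve(u)
        beta = np.sqrt(u @ w)
        u, w = u / beta, w / beta
        alpha = r @ w
        r = r - alpha * u
        p = (w_prev - sum(b_ * ps[i - 1] for i, b_ in coeffs)) / beta
        x = x + alpha * p
        us.append(u); ws.append(w); ps.append(p)
        hist.append(np.linalg.norm(b - A @ x))
    return x, hist

# Small symmetric indefinite demo with M = I (illustrative data only).
A = np.diag([3.0, -2.0, 1.0, -1.0, 4.0])
b = np.ones(5)
x, hist = projection_method(A, b, np.zeros(5), 5)
```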
3. Numerical Experiments. We present numerical experiments that show the
performance of the Krylov subspace methods for symmetric indefinite systems discussed
in the previous sections. In our experiments we also include the SYMMLQ method
[6] for solving symmetric indefinite linear systems. Basically, the kth iterate x_k of
SYMMLQ is obtained by orthogonalizing the residual vector r(z) = r_0 - Az against
K_k(r_0, A), whereas that of MINRES is obtained by minimizing the residual vector
over the space K_k(r_0, A) for each k. For a symmetric positive definite preconditioner
M, it can be shown that algorithms for SYMMLQ, MINRES, symmetric QMR,
and the projection method can be implemented with only one matrix-vector multiplication
with A and one preconditioner solve with M at each iteration, provided M_1^T = M_2
in implementing symmetric QMR. However, in implementing a preconditioner
solve of the form Mw = r, by first factorizing the preconditioner M, i.e., M = M_1 M_2,
we may save floating-point operations by performing the two preconditioning solves
M_1 u = r and M_2 w = u instead of a single preconditioner solve with M. Besides the
matrix-vector multiplication with A and the two M_1 and M_2 preconditioning
solves (or one preconditioner solve with M), the algorithms for symmetric QMR,
MINRES, SYMMLQ, and the projection method use approximately 7n, 10n, 11n, and
12n multiplications and divisions per iteration, respectively.
We use a discretization of

\Delta u + c u = f \quad \text{in } D, \qquad u = 0 \quad \text{on } \partial D,

as a test problem involving a symmetric linear system, where D = [0,1] \times [0,1] and
c is a constant. The usual centered difference approximations were used in the discretization.
We set f \equiv x(1-x) + y(1-y) and used m = 64, where m is the number
of equally spaced interior points on each side of D, so that the resulting system has
dimension 4096. For a preconditioner we used -M + I, which is symmetric positive
definite, where M is the discretized Laplacian matrix. In the experiments with SYMMLQ,
MINRES, symmetric QMR, and the projection method, we used the Cholesky decomposition
of the preconditioner. Also, we used the vector (1, 1, \ldots, 1)^T \in \mathbb{R}^n as the initial guess
and used double precision on Sun Microsystems workstations in all experiments. The
true residual norms \|b - A x_k\|_2 are monitored in assessing the comparative performance.
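A minimal sketch of this test matrix (our construction, using a small m so a dense eigenvalue check stays cheap; the paper uses m = 64) confirms the two properties the experiment relies on: the discretization of \Delta u + cu with c = 100 is symmetric indefinite, while the preconditioner -M + I is symmetric positive definite.

```python
import numpy as np

def helmholtz_matrix(m, c):
    """Centered-difference discretization of Delta u + c u on the unit square
    with u = 0 on the boundary and m interior points per side."""
    h = 1.0 / (m + 1)
    I = np.eye(m)
    T = -2.0 * np.eye(m) + np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)
    lap = (np.kron(I, T) + np.kron(T, I)) / h**2   # discretized Laplacian M
    return lap + c * np.eye(m * m), lap

A, lap = helmholtz_matrix(8, 100.0)                # 64 x 64 sketch of the 4096 case
eigs = np.linalg.eigvalsh(A)                       # A is symmetric
precond_eigs = np.linalg.eigvalsh(-lap + np.eye(64))
```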
In Figure 1, the true residual norm curves generated by the MINRES, SYMMLQ,
symmetric QMR, and projection methods are shown, using the values c = 100 and
m = 64. As shown in Figure 1, there were some differences in the limits of reduction
of the true residual norms. We regard these differences as insignificant, since they are
small relative to a satisfactory limit of residual norm reduction. The projection method
is as numerically sound as MINRES, SYMMLQ, and symmetric QMR in all our experiments.
In Figure 2, we plot the true residual norm reduction versus floating-point
operation counts for the MINRES, SYMMLQ, symmetric QMR, and projection methods.
We ran the algorithms for 80 iterations. Figure 2 shows that the symmetric QMR,
MINRES, and projection methods need about the same number of operations to reach
around the 10^{-10} level of residual norm reduction, although symmetric QMR needs
slightly fewer operations than the MINRES and projection methods do. Figure 2 also
shows that SYMMLQ requires approximately 10% more operations than the other
three methods for the 10^{-10} level of residual norm reduction.
Figure 1: Log10 of the true residual norms vs. the number of iterations; c = 100 with
preconditioner -M + I, where M is the discretized Laplacian matrix. Solid curve:
MINRES; dashdot curve: symmetric QMR; dotted curve: Algorithm 2.1; dashed curve:
SYMMLQ; m = 64.
Figure 2: Log10 of the true residual norms vs. the number of floating-point operations;
c = 100 with preconditioner -M + I, where M is the discretized Laplacian matrix.
Solid curve: MINRES; dashdot curve: symmetric QMR; dotted curve: Algorithm 2.1;
dashed curve: SYMMLQ; m = 64.
4. Conclusion. In this paper we have considered Krylov subspace methods for
solving large symmetric indefinite linear systems and have introduced a new approach
for solving them, called the projection method in this paper. Our numerical
experiments showed that the projection method is as numerically sound as the MINRES,
SYMMLQ, and symmetric QMR methods. Furthermore, these methods require
roughly similar effort to achieve comparable residual norm reduction, although symmetric
QMR is the most efficient and SYMMLQ the most costly, by a slight margin. However,
only the symmetric QMR method allows the use of arbitrary nonsingular symmetric
indefinite preconditioners, which is an advantage of this method over the others.
The symmetric QMR and projection methods also have an advantage over MINRES
and SYMMLQ in being easier to program.
REFERENCES
[1] R. W. Freund and N. M. Nachtigal, QMR: a quasi-minimal residual method for non-Hermitian linear systems, Numer. Math., 60 (1991), pp. 315-339.
[2] R. W. Freund and N. M. Nachtigal, A new Krylov subspace method for symmetric indefinite linear systems, ORNL/TM-12754 (1994).
[3] R. W. Freund and T. Szeto, A quasi-minimal residual squared algorithm for non-Hermitian linear systems, Tech. Rep. 91-26, Research Institute for Advanced Computer Science, NASA Ames Research Center (1991).
[4] R. W. Freund and H. Zha, Simplifications of the nonsymmetric Lanczos process and a new algorithm for Hermitian indefinite linear systems, Bell Labs, Murray Hill (1994).
[5] G. H. Golub and C. F. Van Loan, Matrix Computations, 2nd ed., The Johns Hopkins University Press, Baltimore, MD (1989).
[6] C. C. Paige and M. A. Saunders, Solution of sparse indefinite systems of linear equations, SIAM J. Numer. Anal., 12 (1975), pp. 617-629.
[7] Y. Saad and M. H. Schultz, GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7 (1986), pp. 856-869.
[8] H. F. Walker and L. Zhou, A simpler GMRES, Numer. Lin. Alg. Appl., 1 (1994), pp. 571-581.
Department of Applied Mathematics
Changwon National University
9 Sarim-dong, Changwon,
Kyongnam, 641-773, Korea.
E-mail: [email protected]
NUMERICAL SOLUTION OF A CONSTRICTED STEPPED
CHANNEL PROBLEM USING A FOURTH ORDER METHOD
Paulo F. de A. Mancera
Roland Hunt
J. KSIAM Vol.3, No.2, 51-67, 1999
Abstract
The numerical solution of the Navier-Stokes equations in a constricted stepped
channel problem has been obtained using a fourth order numerical method. Transformations
are made to obtain a fine grid near the sharp corner and a long channel
downstream. The derivatives in the Navier-Stokes equations are replaced by fourth
order central differences, which results in a 29-point computational stencil. A procedure
is used to avoid extra numerical boundary conditions near the solid walls.
Results have been obtained for Reynolds numbers up to 1000.
1 Introduction
We apply a wide fourth order numerical method for solving the Navier-Stokes equations
to a constricted stepped channel problem. The constricted stepped channel problem
considered consists of a sudden contraction (a forward-facing stepped channel, see
Figure 1) and contains a re-entrant corner. Because of the difficulties associated with
that corner, this channel problem has been much studied.
Moffat [13] has studied the Stokesian flow near a re-entrant corner and has shown
that the vorticity is singular. Bramley and Dennis [1] compare the Moffat expansion
with their numerical solution near the corner for a branching channel problem, and Hunt
[9] compares the Moffat expansion along with other techniques for a constricted stepped
channel problem. Dennis and Smith [4] solve a constricted stepped channel problem
using diagonal grids near the corner. Holstein and Paddon [7] present a method
based on using Moffat's expansion to produce finite difference stencils which take into
account the nature of the singularity at the corner. Ma and Ruth [12] compare some
of the techniques referenced above, and others, with their vorticity-circulation method.
There are many other calculations of this problem which use either the streamfunction-vorticity
or the streamfunction formulation of the Navier-Stokes equations, and use
various methods to solve the system of non-linear equations. We will cite a few.
Keywords: Navier-Stokes equations, fourth order numerical method, streamfunction formulation.
AMS Subject Classification: 65N06

Figure 1: Stepped channel.

Dennis and Smith [4] use the streamfunction-vorticity formulation of the Navier-Stokes
equations, discretising the streamfunction equation by second order central differences
and the vorticity equation by second order central differences which incorporate the
Dennis-Hudson artificial viscosity, and the resulting system of equations is solved using an
SOR iteration. Huang and Seymour [8] use the interior constraint method for solving
the streamfunction-vorticity formulation and again an SOR iteration is used to solve the
system of equations. Hunt [9] solves the streamfunction formulation using second order
central differences, and Newton's method is used to solve the resulting system of equations.
Karageorghis and Phillips [11] solve the streamfunction formulation using the
Chebyshev spectral element method, and the resulting system of equations is solved by
Newton's method. Finally, we observe that the computational domain for the constricted
stepped channel problem is an L-shaped region. This causes considerable difficulties in
applying a 29-point computational stencil near the re-entrant corner. However, the main
difficulty with the constricted stepped channel problem is that the flow at the re-entrant
corner is singular; that is, the second and higher derivatives of the streamfunction are
singular.
2 Forward-facing stepped channel
Let us consider a channel problem with walls at y = \pm 1 for x < 0, y = \pm\frac{1}{2} for x > 0,
and \frac{1}{2} \le |y| \le 1 for x = 0. Due to symmetry the problem is solved for y \ge 0 (see Figure
1). The governing equations for this channel problem are given by the Navier-Stokes
equations

\frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} = -\zeta,   (1)

\frac{\partial^2 \zeta}{\partial x^2} + \frac{\partial^2 \zeta}{\partial y^2} = Re \left( \frac{\partial \psi}{\partial y} \frac{\partial \zeta}{\partial x} - \frac{\partial \psi}{\partial x} \frac{\partial \zeta}{\partial y} \right),   (2)

where Re is the Reynolds number, and the boundary conditions are

\psi = 1, \quad \frac{\partial \psi}{\partial y} = 0 \quad \text{on } y = 1,\; x \le 0 \text{ and } y = \frac{1}{2},\; x \ge 0,
\psi = 1, \quad \frac{\partial \psi}{\partial x} = 0 \quad \text{on } x = 0,\; \frac{1}{2} \le y \le 1,
\psi = 0, \quad \frac{\partial^2 \psi}{\partial y^2} = 0 \quad \text{on } y = 0,   (3)
\psi \to \frac{3}{2} y - \frac{1}{2} y^3, \quad \zeta \to 3y \quad \text{as } x \to -\infty,
\psi \to 3y - 4y^3, \quad \zeta \to 24y \quad \text{as } x \to +\infty,

where Poiseuille flow has been assumed far upstream and far downstream.
Substituting equation (1) into equation (2) gives

\frac{\partial^4\psi}{\partial x^4}+2\frac{\partial^4\psi}{\partial x^2\partial y^2}+\frac{\partial^4\psi}{\partial y^4}=Re\left[\frac{\partial\psi}{\partial y}\left(\frac{\partial^3\psi}{\partial x^3}+\frac{\partial^3\psi}{\partial x\partial y^2}\right)-\frac{\partial\psi}{\partial x}\left(\frac{\partial^3\psi}{\partial y\partial x^2}+\frac{\partial^3\psi}{\partial y^3}\right)\right] \qquad (4)

which is called the streamfunction formulation of the steady incompressible Navier-Stokes equations.
Because of the need for a finer grid near the sharp corner we consider transformations given by

x=f(\xi),\qquad y=g(\eta) \qquad (5)

and hence the governing equations become

D\psi=-\zeta \qquad (6)

D\zeta=\frac{Re}{f'g'}\left(\frac{\partial\psi}{\partial\eta}\frac{\partial\zeta}{\partial\xi}-\frac{\partial\psi}{\partial\xi}\frac{\partial\zeta}{\partial\eta}\right) \qquad (7)

where

D\equiv\frac{1}{f'^2}\frac{\partial^2}{\partial\xi^2}-\frac{f''}{f'^3}\frac{\partial}{\partial\xi}+\frac{1}{g'^2}\frac{\partial^2}{\partial\eta^2}-\frac{g''}{g'^3}\frac{\partial}{\partial\eta} \qquad (8)
We have chosen the same transformations given by Hunt [9], that is

x=f(\xi)=\frac{\Delta x_0}{k}\sinh(k\xi) \qquad (9)

y=g(\eta)=\eta+\frac{1}{2\pi}(1-\Delta y_0)\sin(2\pi\eta) \qquad (10)

where $h\,\Delta x_0$ and $h\,\Delta y_0$ are the dimensions of a cell in the $x$–$y$ plane near the corner, $k$ is a parameter determined by the position of the upstream boundary and $h$ is the grid size. Figure 2 shows an example of a non-uniform grid placed on the channel.
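As a quick check of the stretching functions (9)-(10), the following sketch (illustrative Python; the names f and g follow the text, and the parameter values are taken from the numerical experiments of Section 6) evaluates the cell dimensions next to the corner at $(\xi,\eta)=(0,\tfrac12)$, which should be approximately $h\,\Delta x_0$ by $h\,\Delta y_0$:

```python
import math

def f(xi, dx0, k):
    # Streamwise stretching, eq. (9): x = (dx0/k) * sinh(k*xi)
    return (dx0 / k) * math.sinh(k * xi)

def g(eta, dy0):
    # Cross-stream stretching, eq. (10): y = eta + (1-dy0)/(2*pi) * sin(2*pi*eta)
    return eta + (1.0 - dy0) / (2.0 * math.pi) * math.sin(2.0 * math.pi * eta)

# Cell dimensions near the corner: with grid size h, the first cell widths
# f(h)-f(0) and g(1/2+h)-g(1/2) should be close to h*dx0 and h*dy0.
h, dx0, dy0, k = 0.01, 0.025, 0.025, 2.9973
print(f(h, dx0, k) - f(0.0, dx0, k))   # ≈ h*dx0 = 2.5e-4
print(g(0.5 + h, dy0) - g(0.5, dy0))   # ≈ h*dy0 = 2.5e-4
```

The check confirms why these transformations concentrate grid points at the re-entrant corner while keeping a coarse grid far up- and downstream.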
Figure 2: Forward-facing stepped channel: grid mesh.
3 THE STREAMFUNCTION FORMULATION ON A NON-UNIFORM GRID
Mancera [2] and Mancera and Hunt [3] have used a procedure to deal with the streamfunction formulation of the Navier-Stokes equations on a non-uniform grid, which consists of:

1. Discretise equations (6) and (7) using fourth order central differences.

2. Eliminate $\zeta_{i,j}$ from these equations.

3. Obtain a computational stencil with 29 points.

We will obtain the full expression for the streamfunction formulation of the Navier-Stokes equations to analyse this constricted stepped problem, since after discretising the equation we will have a 29-point computational stencil, instead of the 33-point computational stencil resulting from the procedure cited above (step 2). Writing equation (6) as

\zeta=-\frac{1}{f'^2}\frac{\partial^2\psi}{\partial\xi^2}+\frac{f''}{f'^3}\frac{\partial\psi}{\partial\xi}-\frac{1}{g'^2}\frac{\partial^2\psi}{\partial\eta^2}+\frac{g''}{g'^3}\frac{\partial\psi}{\partial\eta} \qquad (11)

and then calculating $\partial\zeta/\partial\xi$, $\partial\zeta/\partial\eta$, $\partial^2\zeta/\partial\xi^2$ and $\partial^2\zeta/\partial\eta^2$, we obtain, after substituting these derivatives in equation (7),
-\frac{1}{f'^4}\frac{\partial^4\psi}{\partial\xi^4}-\frac{1}{g'^4}\frac{\partial^4\psi}{\partial\eta^4}-\frac{2}{f'^2g'^2}\frac{\partial^4\psi}{\partial\xi^2\partial\eta^2}+\frac{6f''}{f'^5}\frac{\partial^3\psi}{\partial\xi^3}+\frac{6g''}{g'^5}\frac{\partial^3\psi}{\partial\eta^3}+\frac{2g''}{f'^2g'^3}\frac{\partial^3\psi}{\partial\xi^2\partial\eta}

+\frac{2f''}{g'^2f'^3}\frac{\partial^3\psi}{\partial\xi\partial\eta^2}+\frac{4f'''f'-15f''^2}{f'^6}\frac{\partial^2\psi}{\partial\xi^2}+\frac{4g'''g'-15g''^2}{g'^6}\frac{\partial^2\psi}{\partial\eta^2}-\frac{2g''f''}{(f'g')^3}\frac{\partial^2\psi}{\partial\xi\partial\eta}

+\frac{f''''f'^2-10f'f''f'''+15f''^3}{f'^7}\frac{\partial\psi}{\partial\xi}+\frac{g''''g'^2-10g'g''g'''+15g''^3}{g'^7}\frac{\partial\psi}{\partial\eta}

-\frac{Re}{f'g'}\left[\frac{\partial\psi}{\partial\eta}\left(\frac{3f''}{f'^3}\frac{\partial^2\psi}{\partial\xi^2}-\frac{1}{f'^2}\frac{\partial^3\psi}{\partial\xi^3}+\frac{f'''f'-3f''^2}{f'^4}\frac{\partial\psi}{\partial\xi}-\frac{1}{g'^2}\frac{\partial^3\psi}{\partial\xi\partial\eta^2}+\frac{g''}{g'^3}\frac{\partial^2\psi}{\partial\xi\partial\eta}\right)\right.

\left.-\frac{\partial\psi}{\partial\xi}\left(-\frac{1}{f'^2}\frac{\partial^3\psi}{\partial\xi^2\partial\eta}+\frac{f''}{f'^3}\frac{\partial^2\psi}{\partial\xi\partial\eta}+\frac{3g''}{g'^3}\frac{\partial^2\psi}{\partial\eta^2}-\frac{1}{g'^2}\frac{\partial^3\psi}{\partial\eta^3}+\frac{g'''g'-3g''^2}{g'^4}\frac{\partial\psi}{\partial\eta}\right)\right]=0 \qquad (12)
Equation (12) is the streamfunction formulation of the Navier-Stokes equations on a non-uniform grid under the transformations given in (5). Finally, we observe that if $f'=g'=1$ in (12), so that all higher derivatives of $f$ and $g$ vanish, then expression (4) is obtained.
4 DISCRETISATION OF THE EQUATION

We set a uniform grid on the computational domain with grid size $h$ in both the $\xi$ and $\eta$ directions. If $\psi_{i,j}$ denotes an approximation to $\psi$ at position $(i,j)$, then the derivatives in the $\xi$-direction are approximated by fourth order central differences, that is,

\frac{\partial\psi}{\partial\xi}=\frac{1}{12h}\left(-\psi_{i+2,j}+8\psi_{i+1,j}-8\psi_{i-1,j}+\psi_{i-2,j}\right)+O(h^4)

\frac{\partial^2\psi}{\partial\xi^2}=\frac{1}{12h^2}\left(-\psi_{i+2,j}+16\psi_{i+1,j}-30\psi_{i,j}+16\psi_{i-1,j}-\psi_{i-2,j}\right)+O(h^4)

\frac{\partial^3\psi}{\partial\xi^3}=\frac{1}{8h^3}\left(-\psi_{i+3,j}+8\psi_{i+2,j}-13\psi_{i+1,j}+13\psi_{i-1,j}-8\psi_{i-2,j}+\psi_{i-3,j}\right)+O(h^4)

\frac{\partial^4\psi}{\partial\xi^4}=\frac{1}{6h^4}\left(-\psi_{i+3,j}+12\psi_{i+2,j}-39\psi_{i+1,j}+56\psi_{i,j}-39\psi_{i-1,j}+12\psi_{i-2,j}-\psi_{i-3,j}\right)+O(h^4) \qquad (13)
The mixed derivatives are evaluated in the programme using a do loop. For example, $\partial^3\psi/\partial\xi\partial\eta^2$ is approximated by

a_l=\frac{1}{12h}\left(-\psi_{i+2,j+l}+8\psi_{i+1,j+l}-8\psi_{i-1,j+l}+\psi_{i-2,j+l}\right),\quad l=-2,-1,0,1,2

\frac{\partial^3\psi}{\partial\xi\partial\eta^2}\simeq\frac{1}{12h^2}\left(-a_2+16a_1-30a_0+16a_{-1}-a_{-2}\right) \qquad (14)

which gives a $5\times 5$ computational stencil. Equation (12) is discretised by formulas (13) and (14) to give the 29-point computational stencil shown in Figure 3.
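The difference formulas above can be sketched as follows (illustrative Python; the helper names d1–d4 and d_mixed, and the test function sin, are ours, not from the paper). The mixed derivative follows the two-pass idea of (14): first differentiate in one index, then apply the second-derivative weights to the results:

```python
import math

def d1(u, i, h):
    # First derivative, eq. (13), O(h^4)
    return (-u[i+2] + 8*u[i+1] - 8*u[i-1] + u[i-2]) / (12*h)

def d2(u, i, h):
    # Second derivative, eq. (13), O(h^4)
    return (-u[i+2] + 16*u[i+1] - 30*u[i] + 16*u[i-1] - u[i-2]) / (12*h**2)

def d3(u, i, h):
    # Third derivative, eq. (13), O(h^4)
    return (-u[i+3] + 8*u[i+2] - 13*u[i+1] + 13*u[i-1] - 8*u[i-2] + u[i-3]) / (8*h**3)

def d4(u, i, h):
    # Fourth derivative, eq. (13), O(h^4)
    return (-u[i+3] + 12*u[i+2] - 39*u[i+1] + 56*u[i]
            - 39*u[i-1] + 12*u[i-2] - u[i-3]) / (6*h**4)

def d_mixed(u2, i, j, h):
    # psi_{xi,eta,eta} via eq. (14): a_l holds psi_xi at rows j+l,
    # then the second-derivative weights act on the a_l
    a = {}
    for l in (-2, -1, 0, 1, 2):
        col = [u2[ii][j + l] for ii in range(len(u2))]
        a[l] = d1(col, i, h)
    return (-a[2] + 16*a[1] - 30*a[0] + 16*a[-1] - a[-2]) / (12*h**2)

# Check against psi = sin(x) (and sin(x)*sin(y) for the mixed case)
h, x0, i = 0.01, 0.7, 3
u = [math.sin(x0 + (m - 3)*h) for m in range(7)]      # nodes i-3 .. i+3
xs = [x0 + (m - 3)*h for m in range(7)]
ys = [0.4 + (m - 3)*h for m in range(7)]
u2d = [[math.sin(x)*math.sin(y) for y in ys] for x in xs]
print(d1(u, i, h) - math.cos(x0))     # truncation error, O(h^4)
print(d4(u, i, h) - math.sin(x0))     # fourth derivative of sin is sin
print(d_mixed(u2d, 3, 3, h))          # ≈ -cos(x0)*sin(0.4)
```

Applied with a do loop over l, the two-pass construction is exactly what produces the 5×5 block of the mixed-derivative stencil.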
The application of the 29-point computational stencil to this constricted stepped channel problem is not straightforward because of the difficulty of dealing with fictitious points near the sharp corner (a singular point). To understand the difficulty, consider the two calculations near the sharp corner illustrated in Figure 4. Applying the 29-point computational stencil at the marked positions, we note that both calculations use common fictitious points, but the behaviour of the flow before the corner differs from the behaviour after it. Hence each of the four fictitious nodes near the corner has two values of the streamfunction, depending on whether the centre of the computational stencil lies before or after the corner. To overcome this difficulty we have not used fictitious points at the solid walls in the calculations.

Figure 3: Computational stencil with 29 points.

If we apply the 13-point computational stencil* at the interior points next to the boundary and the 29-point computational stencil at all other interior points, then we only require the value of $\psi$ at a single point, denoted by $\star$, outside the boundary. This value can be eliminated using the derivative boundary condition at the wall,

\frac{\partial\psi}{\partial n}=0 \qquad (15)

where $\partial\psi/\partial n$ is the normal derivative. Using a fourth order discretisation we can approximate this by

\frac{\partial\psi}{\partial n}\simeq\frac{1}{12h}\left(3\psi_\star+10\psi_0-18\psi_1+6\psi_2-\psi_3\right)=0 \qquad (16)

where the subscript 0 denotes a point on the boundary, the subscripts $j=1,2,3$ denote the $j$-th internal grid points along the inward normal from 0, and $\star$ denotes a point outside the boundary. From (16) we obtain

\psi_\star=\frac{1}{3}\left(-10\psi_0+18\psi_1-6\psi_2+\psi_3\right) \qquad (17)

which can be used to remove the fictitious point from the 13- and 29-point computational stencils, so that no fictitious points are used in the calculations. Our practice of applying the 13-point computational stencil next to the boundary differs slightly from Henshaw's procedure (see Henshaw [5] and Henshaw et al. [6]), where that computational stencil is applied on the boundary, but the method can be shown to be still fourth order accurate (Hunt, private communication).
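The elimination step (16)-(17) can be sketched as follows (illustrative Python; psi_star is a hypothetical helper name). By construction, the one-sided derivative (16) built from the returned value vanishes; and for a smooth profile with zero normal derivative at the wall, the formula reproduces the outside value to $O(h^5)$:

```python
import math

def psi_star(psi0, psi1, psi2, psi3):
    # eq. (17): value of psi at the fictitious point outside the wall
    return (-10.0*psi0 + 18.0*psi1 - 6.0*psi2 + psi3) / 3.0

h = 0.1

# Consistency check: the derivative (16) built from psi_star is zero
star = psi_star(1.0, 1.2, 1.5, 1.9)
deriv = (3.0*star + 10.0*1.0 - 18.0*1.2 + 6.0*1.5 - 1.9) / (12.0*h)
print(deriv)  # 0 by construction (up to rounding)

# Accuracy check: cos(s) has zero derivative at the wall s = 0, so (17)
# should reproduce the reflected value cos(-h) = cos(h) to O(h^5)
star2 = psi_star(math.cos(0.0), math.cos(h), math.cos(2*h), math.cos(3*h))
print(star2 - math.cos(h))
```

This is why the substitution preserves the fourth order accuracy of the interior scheme.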
*This computational stencil is obtained by discretising the governing equation by second order central differences.

Figure 4: Computational stencils near the re-entrant corner.

Using these ideas we have set up a procedure (see Figure 5 for the positions of the calculations near the re-entrant corner) to discretise the governing equation in the computational domain. The procedure is as follows. Let us consider an $N_1\times M_1$ grid before the sharp corner and an $N_2\times M_2$ grid, where $M_2=M_1/2$, after it, where the grid vertices are $(i,j)$, $i=-N_1,-N_1+1,\ldots,-1,0,1,\ldots,N_2$, $j=0,1,\ldots,M$, with $M=M_1$ for $i\le 0$ and $M=M_2$ for $i>0$. Then:
1. At points $j=M_1-1$ for $-N_1+1\le i\le -2$ and $j=M_2-1$ for $1\le i\le N_2-1$ we apply a computational stencil with 13 points (second order accurate) with all fictitious points replaced by (17), that is, $\psi_\star=\frac{1}{3}(-10\psi_{i,M_1}+18\psi_{i,M_1-1}-6\psi_{i,M_1-2}+\psi_{i,M_1-3})$ at $j=M_1-1$ and $\psi_\star=\frac{1}{3}(-10\psi_{i,M_2}+18\psi_{i,M_2-1}-6\psi_{i,M_2-2}+\psi_{i,M_2-3})$ at $j=M_2-1$. In Figure 5 these points are indicated by 1.

2. At $i=-1$ for $M_2+1\le j\le M_1-2$ we apply a 13-point computational stencil with fictitious points substituted by $\psi_\star=\frac{1}{3}(-10\psi_{0,j}+18\psi_{-1,j}-6\psi_{-2,j}+\psi_{-3,j})$ (see 2 in Figure 5).

3. At grid position $(-1,M_1-1)$ we apply a computational stencil with 13 points, where, to eliminate fictitious points in both axis directions, we apply expressions for $\psi_\star$ similar to those given in the two items above (see 3 in Figure 5).

4. At grid positions $(-1,M_2)$, $(-1,M_2-1)$ and $(0,M_2-1)$ we apply a computational stencil with 13 points (see 4 in Figure 5).

5. At positions $j=M_1-2$ for $-N_1+1\le i\le -3$ and $j=M_2-2$ for $1\le i\le N_2-1$ we apply a computational stencil with 29 points (fourth order accurate), where $\psi_\star=\frac{1}{3}(-10\psi_{i,k}+18\psi_{i,k-1}-6\psi_{i,k-2}+\psi_{i,k-3})$, with $k$ either $M_1$ or $M_2$, is used to eliminate fictitious points (see 5 in Figure 5).
6. At $i=-2$ for $M_2+1\le j\le M_1-3$ the 29-point computational stencil is applied with all fictitious points substituted by the same expression given in item 2 (see 6 in Figure 5).

7. At position $(-2,M_1-2)$ we apply a computational stencil with 29 points with all fictitious points in both directions eliminated using the values of $\psi_\star$ given in the two preceding items (see 7 in Figure 5).

8. At all other interior points we apply a computational stencil with 29 points. In Figure 5 they are indicated by 8.

9. On solid walls we have $\psi_{i,j}=1$, and along the line of symmetry $\psi_{i,0}=0$, $\psi_{i,-1}=-\psi_{i,1}$ and $\psi_{i,-2}=-\psi_{i,2}$.

10. At the ends of the channel we have set $\psi_{-N_1,j}=\frac{jh}{2}\left(3-(jh)^2\right)$, $\psi_{N_2,j}=3jh-4(jh)^3$, $-\psi_{-N_1+2,j}+16\psi_{-N_1+1,j}-30\psi_{-N_1,j}+16\psi_{-N_1-1,j}-\psi_{-N_1-2,j}=0$ and $-\psi_{N_2+2,j}+16\psi_{N_2+1,j}-30\psi_{N_2,j}+16\psi_{N_2-1,j}-\psi_{N_2-2,j}=0$.
Figure 5: Stepped channel: positions of the calculations near the solid walls.
5 NUMERICAL SOLUTION AND ACCURACY

The system of algebraic equations resulting from the discretisation is solved by Newton's method, which is described in Hunt [9, 10] and Mancera and Hunt [3]. The numerical solution is obtained on an $N\times M$ grid, and in order to estimate the error in these results we obtain a second solution on an $N/2\times M/2$ grid for comparison. Suppose that, at a common location, the numerical solution is $\phi_F$ on the original fine grid and $\phi_M$ on the coarser grid and, if further $\phi$ is the exact solution at this point, then since the methods are fourth order we have

\phi-\phi_F\simeq Kh^4,\qquad \phi-\phi_M\simeq K(2h)^4 \qquad (18)

for some constant $K$. Eliminating $\phi$ we obtain an estimate for the error $E_F$ on the fine grid ($\simeq Kh^4$) as

E_F\simeq\frac{\phi_F-\phi_M}{15} \qquad (19)

The errors are estimated as follows:

\text{RMS:}\quad \|\psi_F-\psi_M\|_2=\left[\frac{1}{N}\sum\left(\psi^F_{ij}-\psi^M_{ij}\right)^2\right]^{1/2} \qquad (20)

\text{maximum:}\quad \|\psi_F-\psi_M\|_\infty=\max_{ij}\left|\psi^F_{ij}-\psi^M_{ij}\right| \qquad (21)

where $N$ is the number of points in the computational space and $\psi^F_{ij}$ and $\psi^M_{ij}$ are, respectively, the numerical solutions at $(i,j)$ on the fine and coarser grids.
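The error-estimation recipe of (18)-(21) amounts to a Richardson-type extrapolation; a sketch with a synthetic fourth order quantity follows (the values of phi_exact, K and h are made up for illustration, and error_estimate is our name):

```python
import math

def error_estimate(phi_fine, phi_coarse):
    # eq. (19): fourth order Richardson estimate of the fine-grid error
    return (phi_fine - phi_coarse) / 15.0

# Model a quantity whose discretisation error is exactly K*h^4, eq. (18)
phi_exact, K, h = 2.0, 3.0, 0.05
phi_F = phi_exact - K * h**4           # fine grid:   phi - phi_F = K h^4
phi_M = phi_exact - K * (2.0*h)**4     # coarse grid: phi - phi_M = 16 K h^4
E_F = error_estimate(phi_F, phi_M)
print(E_F - (phi_exact - phi_F))       # ≈ 0: exact for a pure h^4 error

# RMS and maximum norms of eqs. (20)-(21) on a toy difference field
diffs = [1e-4, -2e-4, 3e-4]
rms = math.sqrt(sum(d*d for d in diffs) / len(diffs))
mx = max(abs(d) for d in diffs)
print(rms, mx)
```

In the tables below these two norms are the quantities reported for each Reynolds number.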
6 Results for the stepped channel

We numerically solve the flow in the forward-facing stepped channel problem using the 29-point computational stencil together with boundary data in which all fictitious points are eliminated. We consider the elimination of the fictitious points to be the best approach to this problem. Let us explain the process of discretisation step by step. First, the Navier-Stokes equations (1) and (2) are transformed into equations (6) and (7), where the coordinate transformations are given by equations (9) and (10). Second, equations (6) and (7) are written in the streamfunction formulation alone (equation (12)) and then discretised by fourth order central differences to obtain a 29-point computational stencil. Third, equation (12) is also discretised using second order central differences to obtain a 13-point computational stencil, which is applied adjacent to the walls. Fourth, we apply a procedure to eliminate all fictitious points at the solid walls. The results are presented both for a uniform grid and for a non-uniform grid.
For the uniform grid we have set the upstream boundary at x = −2 and the downstream boundary at x = 2, where the number of points on the fine grid is 96 in the y-direction before the corner. The maximum and RMS errors are shown in Tables 1 and 2, respectively. (The notation a(−b) means a × 10⁻ᵇ.)

Table 1: Maximum errors on a uniform grid.

   Re    Fourth order errors    Second order errors    Ratio
    0         1.50(-4)               8.00(-4)           5.33
    1         1.55(-4)               8.32(-4)           5.37
   10         1.86(-4)               1.04(-3)           5.59
   50         2.24(-4)               1.53(-3)           6.83
  100         2.32(-4)               1.84(-3)           7.93
  250         4.35(-4)               2.26(-3)           5.20
  500         1.97(-3)               2.30(-3)           1.17

Table 2: RMS errors on a uniform grid.

   Re    Fourth order errors    Second order errors    Ratio
    0         2.86(-5)               1.41(-4)           4.93
    1         2.81(-5)               1.42(-4)           5.05
   10         2.68(-5)               1.54(-4)           5.75
   50         2.91(-5)               1.70(-4)           5.84
  100         3.14(-5)               1.75(-4)           5.57
  250         5.45(-5)               2.55(-4)           4.68
  500         3.56(-4)               4.46(-4)           1.25

We note from both tables that the results given by the fourth order method are not much more accurate than their second order counterparts: for all Re the errors for the fourth order method are less than 8 times smaller than their second order equivalents.
Now we analyse the results where fictitious points are not used on the solid walls. The upstream position is at x = −2 and the downstream position is either at x ≈ 100 or at x ≈ 1000. In the numerical simulations we have chosen Δx₀ = Δy₀ = 0.025, or Δx₀ = 0.01 and Δy₀ = 0.025, on the coarser grid.

In Tables 3 and 4 we present results for the fourth and second order methods on a non-uniform grid where the upstream position is at x = −2, the downstream position is at x ≈ 109, the value of the parameter k in equation (9) is 2.9973 and Δx₀ = Δy₀ = 0.025. The number of points in the ξ-direction is 80 and in the η-direction 24 on the coarser grid. Comparing the results given in Tables 3 and 4 with those given in Tables 1 and 2, we have obtained results up to Re = 1000, even though for the fourth order method Newton's method failed to converge after 10 iterations. For the fourth order numerical method the errors on a non-uniform grid are smaller than the errors on a uniform grid for all Reynolds numbers. The ratio between the errors is less than 11 for the maximum errors and less than 13 for the RMS errors.

Table 3: Maximum errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 109 and Δx₀ = Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         8.81(-5)               9.60(-4)          10.90
    1         9.11(-5)               9.13(-4)          10.02
   10         1.10(-4)               6.28(-4)           5.71
   50         1.43(-4)               7.02(-4)           4.91
  100         1.57(-4)               9.53(-4)           6.07
  125         1.62(-4)               1.04(-3)           6.42
  250         1.89(-4)               1.34(-3)           7.09
  500         3.99(-4)               1.64(-3)           4.11
  750         9.16(-4)               2.84(-3)           3.10
 1000            —                   4.79(-3)             —
The results for the situation with the upstream position at x = −2 and the downstream position at x ≈ 1033 are given in Tables 5 and 6, where the number of points in the ξ-direction is 98 and in the η-direction 24 on the coarser grid. We have obtained results for Reynolds numbers up to 1000 for both the fourth and second order numerical methods. Comparing the maximum and RMS errors given in Tables 5 and 6 with their counterparts given in Tables 3 and 4, we observe the same results for the maximum errors, while the RMS errors are smaller for the downstream position x ≈ 1033, since the flow changes very slowly after the sharp corner. Again the ratios between the errors from the second and fourth order methods are, respectively, less than 12 and less than 14 for the maximum and RMS errors.
We have also analysed the constricted stepped channel problem for the same upstream and downstream positions but considering Δx₀ = 0.01 and Δy₀ = 0.025. We have chosen these values after many numerical experiments, since the Newton method employed did not converge for some values of Δx₀ and Δy₀. For the upstream position at x = −2 and the downstream position at x ≈ 142, the number of points in the ξ-direction is 72 and in the η-direction 24 on the coarser grid, and k = 4.2638. The maximum errors (see Table 7) range from 4.78 × 10⁻⁵ to 3.12 × 10⁻⁴ for the fourth order method, and the ratios between the errors from the second and fourth order methods are less than 25 for all Reynolds numbers. For the fourth order method the RMS errors given in Table 8 range from 1.28 × 10⁻⁵ to 6.02 × 10⁻⁵, and again the ratios between the errors are less than 25. The fourth order method did not converge after 10 iterations for Re = 1000 on the coarser grid. Comparing the errors for the fourth order method given in Tables 3 and 4 with those given in Tables 7 and 8, we observe that both the maximum and RMS errors are smaller for Δx₀ = 0.01 and Δy₀ = 0.025.
Table 4: RMS errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 109 and Δx₀ = Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         2.03(-5)               2.59(-4)          12.76
    1         2.04(-5)               2.49(-4)          12.21
   10         2.10(-5)               1.89(-4)           9.00
   50         2.33(-5)               1.34(-4)           5.75
  100         2.60(-5)               1.35(-4)           5.19
  125         2.67(-5)               1.37(-4)           5.13
  250         3.15(-5)               1.49(-4)           4.73
  500         6.85(-5)               1.70(-4)           2.48
  750         1.71(-4)               5.14(-4)           3.01
 1000            —                   6.52(-4)             —
Table 5: Maximum errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 1033 and Δx₀ = Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         8.81(-5)               9.60(-4)          10.90
    1         9.11(-5)               9.13(-4)          10.02
   10         1.10(-4)               6.28(-4)           5.71
   50         1.43(-4)               7.02(-4)           4.91
  100         1.57(-4)               9.53(-4)           6.07
  125         1.62(-4)               1.04(-3)           6.42
  250         1.89(-4)               1.34(-3)           7.09
  500         3.99(-4)               1.64(-3)           4.11
  750         9.26(-4)               1.72(-3)           1.86
 1000         1.33(-3)               2.00(-3)           1.50
Table 6: RMS errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 1033 and Δx₀ = Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         1.84(-5)               2.44(-4)          13.26
    1         1.85(-5)               2.36(-4)          12.76
   10         1.90(-5)               1.85(-4)           9.74
   50         2.13(-5)               1.40(-4)           6.57
  100         2.35(-5)               1.41(-4)           6.00
  125         2.42(-5)               1.42(-4)           5.87
  250         2.85(-5)               1.52(-4)           5.33
  500         6.19(-5)               1.70(-4)           2.75
  750         1.60(-4)               2.23(-4)           1.39
 1000         2.28(-4)               3.01(-4)           1.32
Table 7: Maximum errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 142, Δx₀ = 0.01 and Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         4.78(-5)               1.16(-3)          24.27
    1         4.94(-5)               1.10(-3)          22.27
   10         6.05(-5)               7.49(-4)          12.38
   50         8.33(-5)               6.82(-4)           8.19
  100         9.72(-5)               5.10(-4)           5.25
  125         1.00(-4)               5.37(-4)           5.37
  250         1.03(-4)               6.82(-4)           6.62
  500         1.38(-4)               1.03(-3)           7.46
  750         3.12(-4)               6.98(-3)          22.37
 1000            —                   5.37(-2)             —
Table 8: RMS errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 142, Δx₀ = 0.01 and Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         1.28(-5)               3.05(-4)          23.83
    1         1.31(-5)               2.96(-4)          22.60
   10         1.50(-5)               2.18(-4)          14.53
   50         1.82(-5)               1.23(-4)           6.76
  100         2.05(-5)               1.18(-4)           5.76
  125         2.10(-5)               1.18(-4)           5.62
  250         1.95(-5)               1.23(-4)           6.31
  500         2.63(-5)               2.05(-4)           7.80
  750         6.02(-5)               1.25(-3)          20.76
 1000            —                   9.74(-3)             —
In Tables 9 and 10 we present errors for the downstream position at x ≈ 1199, with the number of points in the ξ-direction equal to 84. We have obtained results up to Re = 1000 for the fourth order method and up to Re = 750 for the second order method. Again the ratios between the errors are less than 25 for both the maximum and RMS errors, and the RMS errors are slightly smaller than those for the channel with the downstream position at x ≈ 142.

Comparing both situations of channel length and grid refinement, we observe that the upstream position x = −2, downstream position x ≈ 1199, Δx₀ = 0.01 and Δy₀ = 0.025 has given the best results for the fourth order numerical method, although the ratio between the errors for all situations analysed indicates that the fourth order numerical method is not much more accurate than the second order method for this channel problem.
7 Conclusions

We have analysed a fourth order numerical method for solving the Navier-Stokes equations in the constricted stepped channel. We have set up a transformation which gives a fine grid near the sharp corner and a long channel downstream. For most situations we have obtained results for Reynolds numbers up to 1000. Because of the difficulty of dealing with the solution near the sharp corner, we have used a procedure which requires no fictitious points at the solid walls. For this channel problem the fourth order numerical method is not much more accurate than the second order method, but we must note that ψ has singular derivatives at the sharp corner which influence the solution, as can be observed in Mancera and Hunt [3], where a channel problem with a gradual and smooth constriction is solved.
Table 9: Maximum errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 1199, Δx₀ = 0.01 and Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         4.78(-5)               1.16(-3)          24.27
    1         4.94(-5)               1.10(-3)          22.27
   10         6.05(-5)               7.49(-4)          12.38
   50         8.33(-5)               3.95(-4)           4.74
  100         9.72(-5)               5.10(-4)           5.25
  125         1.00(-4)               5.37(-4)           5.37
  250         1.03(-4)               6.82(-4)           6.62
  500         1.38(-4)               1.03(-3)           7.46
  750         3.11(-4)               1.03(-3)           3.31
 1000         5.64(-4)                  —                 —
Table 10: RMS errors and ratio between errors for the upstream position at x = −2, downstream position at x ≈ 1199, Δx₀ = 0.01 and Δy₀ = 0.025.

   Re    Fourth order errors    Second order errors    Ratio
    0         1.19(-5)               2.89(-4)          24.29
    1         1.22(-5)               2.81(-4)          23.03
   10         1.39(-5)               2.11(-4)          15.18
   50         1.69(-5)               1.30(-4)           7.69
  100         1.90(-5)               1.25(-4)           6.58
  125         1.95(-5)               1.25(-4)           6.41
  250         1.80(-5)               1.30(-4)           7.22
  500         2.43(-5)               1.99(-4)           8.19
  750         5.59(-5)               3.94(-4)           4.23
 1000         9.31(-5)                  —                 —
ACKNOWLEDGEMENT

The first author was supported by FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo) under grant 1996-9530-5.
References

[1] J. S. Bramley and S. C. R. Dennis, The numerical solution of two-dimensional flow in a branching channel, Comput. Fluids, 12-4 (1984), pp. 339-355.

[2] P. F. de A. Mancera, Fourth Order Numerical Methods for Solving the Navier-Stokes Equations in Two Dimensions, PhD thesis, University of Strathclyde, 1996.

[3] P. F. de A. Mancera and R. Hunt, Fourth order method for solving the Navier-Stokes equations in a constricting channel, Int. J. Numer. Methods Fluids, 25 (1997), pp. 1119-1135.

[4] S. C. R. Dennis and F. T. Smith, Steady flow through a channel with a symmetrical constriction in the form of a step, Proc. R. Soc. Lond., A 372 (1980), pp. 393-414.

[5] W. D. Henshaw, A fourth-order accurate method for the incompressible Navier-Stokes equations on overlapping grids, J. Comput. Phys., 113 (1994), pp. 13-25.

[6] W. D. Henshaw, H.-O. Kreiss, and L. G. M. Reyna, A fourth-order-accurate difference approximation for the incompressible Navier-Stokes equations, Comput. Fluids, 23-4 (1994), pp. 575-593.

[7] H. Holstein and D. J. Paddon, A singular finite difference treatment of re-entrant corner flow, J. Non-Newtonian Fluid Mech., 8 (1981), pp. 81-93.

[8] H. Huang and B. R. Seymour, A finite difference method for flow in a constricted channel, Technical Report 93-167, University of British Columbia, 1993.

[9] R. Hunt, The numerical solution of the laminar flow in a constricted channel at moderately high Reynolds number using Newton iteration, Int. J. Numer. Methods Fluids, 11 (1990), pp. 247-259.

[10] R. Hunt, The numerical solution of the flow in a general bifurcating channel at moderately high Reynolds number using boundary-fitted co-ordinates, primitive variables and Newton iteration, Int. J. Numer. Methods Fluids, 17 (1993), pp. 711-729.

[11] A. Karageorghis and T. N. Phillips, Chebyshev spectral collocation methods for laminar flow through a channel contraction, J. Comput. Phys., 84 (1989), pp. 114-133.

[12] H. Ma and D. W. Ruth, A new scheme for vorticity computations near a sharp corner, Comput. Fluids, 23-1 (1994), pp. 23-38.

[13] H. K. Moffatt, Viscous and resistive eddies near a sharp corner, J. Fluid Mech., 18 (1964), pp. 1-18.
Departamento de Bioestatística
Instituto de Biociências - UNESP
CP 510
18618-000 Botucatu, Brazil
e-mail: [email protected]

Department of Mathematics
University of Strathclyde
26 Richmond Street
G1 1XH Glasgow, Scotland
e-mail: [email protected]
J. KSIAM Vol.3, No.2, 69-80, 1999

AN ANALYSIS OF THE MMPP/D₁,D₂/1/B QUEUE FOR TRAFFIC SHAPING OF VOICE IN ATM NETWORKS

Doo Il Choi

Abstract. Recently in telecommunications, the BISDN (Broadband Integrated Services Digital Network) has received considerable attention for its capability of providing a common interface for future communication needs, including voice, data and video. Since all information in the BISDN is statistically multiplexed and transported at high speed by means of discrete units of 53-octet ATM (Asynchronous Transfer Mode) cells, appropriate traffic control is needed. For traffic shaping of voice, the output cell discarding scheme has been proposed. We analyze the scheme as an MMPP/D₁,D₂/1/B queueing system to obtain performance measures such as the loss probability and the waiting time distribution.
1. Introduction

The Asynchronous Transfer Mode (ATM) has been selected as the mode of transmission and switching in the BISDN (Broadband Integrated Services Digital Network) because of its efficiency and flexibility. The ATM is based on asynchronous time division multiplexing and fast packet switching technology. In ATM networks, all information is transmitted in a fixed-size packet called a cell, which has a 48-octet information field and a 5-octet header. The header contains various information required to transfer the information field across the network.

ATM networks support diverse services, such as voice, data and video, which require different Qualities of Service (QoS). Since user terminals in the BISDN generate cells only when they have information to transmit, and these cells are statistically multiplexed, the traffic stream fluctuates unpredictably. Therefore, traffic such as voice and video has the properties of time-correlation and burstiness. These characteristics of the traffic may cause network congestion, so appropriate traffic control is needed.

Voice traffic is delay-sensitive but loss-insensitive. An effective method to support voice traffic in ATM networks is the use of the output cell discarding (CD) scheme. The output CD scheme operates as follows. Voice information is stored in pairs of cells that separate the more significant and less significant bits. The cell containing the more significant bits is identified as a high priority cell (i.e. nondiscardable in the network) and the cell containing the less significant bits is identified as a low priority cell (i.e. discardable in the network). The low-priority cells may be discarded during network congestion. This output CD scheme results in significant transmission bandwidth savings and resiliency of the network during congestion. Therefore, the spare bandwidth obtained by the CD scheme can be used to support different traffic such as data and video. Also, this smoothing effect on voice helps in avoiding buffer overflow [2,3].

Key words and phrases: queueing analysis, traffic shaping.
To model the bursty voice traffic, we use a Markov-modulated Poisson process (MMPP) with arrivals in pairs of cells. We put a threshold on the buffer in consideration of network congestion. If the buffer occupancy at a transmission epoch is less than or equal to the threshold, the service time is D₁ (the transmission time of a cell pair). Otherwise, the service time is D₂ (= D₁/2, because the low-priority cell is discarded). We assume a finite capacity (B) queue for practical applications. Then, the output CD scheme is modeled by the queueing system MMPP/D₁,D₂/1/B with one threshold. In the following sections, we analyze the queueing model by using the embedded Markov chain and the supplementary variable method.
2. Description of model and MMPP
A Markov-modulated Poisson process (MMPP) has been used to model video and packetized voice traffic. The MMPP can be constructed as a Poisson process with a rate that varies according to an $N$-state irreducible continuous-time Markov process $\{J(t),\ t\ge 0\}$ (called the underlying Markov process). When the underlying Markov process is in state $i$ at time $t$, arrivals occur according to a Poisson process of rate $\lambda_i$. The sojourn time in state $i$ follows an exponential distribution with mean $1/\sigma_i$, where $\sigma_i=\sum_{j\ne i}\sigma_{ij}$. Then the MMPP is characterized by the Markov process $\{J(t),\ t\ge 0\}$ with the transition rate matrix $Q$ and the arrival rate matrix $\Lambda\triangleq\mathrm{diag}(\lambda_1,\lambda_2,\ldots,\lambda_N)$. The transition rate matrix $Q$ is as follows:

Q=\begin{pmatrix}
-\sigma_1 & \sigma_{12} & \cdots & \sigma_{1N}\\
\sigma_{21} & -\sigma_2 & \cdots & \sigma_{2N}\\
\vdots & \vdots & \ddots & \vdots\\
\sigma_{N1} & \sigma_{N2} & \cdots & -\sigma_N
\end{pmatrix}.

The steady-state probability vector $\pi$ of the underlying Markov process $\{J(t),\ t\ge 0\}$ is obtained by solving the equations

\pi Q=0,\qquad \pi e=1,\qquad e=(1,1,\ldots,1)^T.
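For a concrete illustration of the steady-state equations, the following sketch computes $\pi$ for a two-state MMPP (the generator entries and arrival rates are invented for the example, not taken from the paper):

```python
import numpy as np

# Two-state MMPP: generator Q of the underlying Markov process and the
# per-state Poisson rates (Lambda = diag(lam)); values are illustrative.
Q = np.array([[-0.2,  0.2],
              [ 0.5, -0.5]])
lam = np.array([1.0, 4.0])

# Solve pi Q = 0 with pi e = 1 by replacing one (redundant) balance
# equation with the normalisation condition.
n = len(Q)
A = np.vstack([Q.T[:-1], np.ones(n)])
b = np.zeros(n); b[-1] = 1.0
pi = np.linalg.solve(A, b)
print(pi)        # stationary distribution of the underlying process
print(pi @ lam)  # long-run mean arrival rate of the MMPP
```

One balance equation is dropped because the rows of $Q^T$ are linearly dependent ($Qe=0$); the normalisation makes the system uniquely solvable.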
The arriving cell pairs are first queued in a buffer of finite capacity $B$, counted in units of cell pairs. Cells arriving when the buffer is full are lost, and cell pairs in the buffer are served on a first-come first-served basis.

Introduce the notation:

$M(t)$ = the number of cell pairs arriving during the interval $(0,t]$;
$J(t)$ = the state of the underlying Markov process at time $t$.

Now we define the conditional probabilities

p_{i,j}(n,t)=P\{M(t)=n,\ J(t)=j\mid M(0)=0,\ J(0)=i\},\quad n\ge 0,\ 1\le j\le N.

Then it is easily shown that the $N\times N$ matrix of probabilities $P(n,t)=(p_{i,j}(n,t))_{1\le i,j\le N}$ has the probability generating function

\hat P(z,t)=\sum_{n=0}^{\infty}P(n,t)z^n=e^{R(z)t},\qquad |z|\le 1,

where $R(z)=Q+(z-1)\Lambda$.
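The counting probabilities $P(n,t)$ can be computed numerically from the generating function relation: matching powers of $z$ in $\partial_t\hat P(z,t)=\hat P(z,t)R(z)$ gives the forward equations $P'(n,t)=P(n,t)(Q-\Lambda)+P(n-1,t)\Lambda$. The sketch below integrates these with explicit Euler stepping (all numerical values are illustrative, and the truncation level nmax is an assumption of the example):

```python
import numpy as np

# Illustrative two-state MMPP parameters (not from the paper)
Q = np.array([[-0.2,  0.2],
              [ 0.5, -0.5]])
Lam = np.diag([1.0, 4.0])

def counting_probs(t, nmax, steps=4000):
    """Return [P(0,t), ..., P(nmax,t)] by Euler stepping of the
    forward equations dP(n)/dt = P(n)(Q - Lam) + P(n-1) Lam."""
    dt = t / steps
    P = [np.eye(2)] + [np.zeros((2, 2)) for _ in range(nmax)]
    for _ in range(steps):
        new = []
        for n in range(nmax + 1):
            dP = P[n] @ (Q - Lam)
            if n > 0:
                dP = dP + P[n - 1] @ Lam
            new.append(P[n] + dt * dP)
        P = new
    return P

P = counting_probs(t=0.5, nmax=30)
total = sum(P)             # sum_n P(n,t) should be the stochastic matrix exp(Qt)
print(total @ np.ones(2))  # row sums ≈ 1 (up to the truncation at nmax)
```

The row-sum check works because setting $z=1$ gives $\hat P(1,t)=e^{Qt}$, which is stochastic.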
3. Analysis of queue length distribution

3.1 The queue length distribution at transmission epochs

Introduce the notation:

$\tau_n$ = the $n$-th service completion epoch, $n\ge 1$, with $\tau_0\triangleq 0$;
$N_n$ = the queue length at time $\tau_n+$;
$J_n$ = the state of the underlying Markov process at time $\tau_n+$.

Then the process $\{(N_n,J_n),\ n\ge 0\}$ forms a Markov chain with finite state space $\{0,1,\ldots,B-1\}\times\{1,2,\ldots,N\}$. Define the limiting probabilities $x_{k,i}$ and the probability vectors

x_{k,i}\triangleq\lim_{n\to\infty}P\{N_n=k,\ J_n=i\},

x\triangleq(x_0,x_1,\ldots,x_{B-1})\ \text{ with }\ x_k\triangleq(x_{k,1},x_{k,2},\ldots,x_{k,N}).
The transition probability matrix $Q_1$ of the Markov chain $\{(N_n,J_n),\ n\ge 0\}$ is given by

Q_1=\begin{pmatrix}
A_0^0 & A_1^0 & A_2^0 & \cdots & A_{L_1-1}^0 & A_{L_1}^0 & A_{L_1+1}^0 & \cdots & A_{B-2}^0 & \bar A_{B-1}^0\\
A_0 & A_1 & A_2 & \cdots & A_{L_1-1} & A_{L_1} & A_{L_1+1} & \cdots & A_{B-2} & \bar A_{B-1}\\
0 & A_0 & A_1 & \cdots & A_{L_1-2} & A_{L_1-1} & A_{L_1} & \cdots & A_{B-3} & \bar A_{B-2}\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & A_1 & A_2 & A_3 & \cdots & A_{B-L_1} & \bar A_{B-L_1+1}\\
0 & 0 & 0 & \cdots & A_0 & A_1 & A_2 & \cdots & A_{B-L_1-1} & \bar A_{B-L_1}\\
0 & 0 & 0 & \cdots & 0 & B_0 & B_1 & \cdots & B_{B-L_1-2} & \bar B_{B-L_1-1}\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & B_1 & \bar B_2\\
0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & B_0 & \bar B_1
\end{pmatrix}

where the blocks $A_k$, $B_k$, $A_k^0$, $\bar A_k$, $\bar B_k$ and $\bar A_k^0$ are as follows:

A_k=P(k,D_1),\quad B_k=P(k,D_2),\quad \bar A_k=\sum_{n=k}^{\infty}A_n,\quad \bar B_k=\sum_{n=k}^{\infty}B_n,

A_k^0=\int_0^{\infty}P(0,t)\Lambda\,dt\,A_k=(\Lambda-Q)^{-1}\Lambda A_k,\quad \bar A_k^0=\sum_{n=k}^{\infty}A_n^0.
The steady-state probability vector $x$ of the Markov chain $\{(N_n,J_n),\ n\ge 0\}$ is obtained from the equations

x\,Q_1=x,\qquad x\,e=1.
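The embedded-chain equations $xQ_1=x$, $xe=1$ are a standard stationary-vector computation; the sketch below solves them for a small illustrative transition matrix (not the actual $Q_1$, whose blocks $A_k$, $B_k$ would first have to be assembled):

```python
import numpy as np

# Small illustrative row-stochastic transition matrix standing in for Q1
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 0.4, 0.6]])

# Solve x P = x, x e = 1: replace one (redundant) equation of
# (P^T - I) x^T = 0 with the normalisation row.
n = P.shape[0]
A = np.vstack([(P.T - np.eye(n))[:-1], np.ones(n)])
b = np.zeros(n); b[-1] = 1.0
x = np.linalg.solve(A, b)
print(x)          # stationary distribution
print(x @ P - x)  # residual of the balance equations, ≈ 0
```

The same linear-solve pattern applies to the block matrix $Q_1$, with the scalar entries replaced by the $N\times N$ blocks.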
3.2 The queue length distribution at an arbitrary time
In this subsection we derive the queue length distribution at an arbitrary time. Let
N(t) be the queue length ( including the cell in service ) at time t.
R(t) =
�1 if the service time of the cell is by D1 at time t;
2 if the service time of the cell is by D2 at time t:
and
� =
�0 if the server is idle ;
1 if the server is busy:
De�ne the limiting probabilities
y0 = limt!1
PfN(t) = 0; � = 0g;
yn = limt!1
PfN(t) = n; � = 1g; n � 1:
First we compute the vector $y_0$, the probability that the system is idle. Analogously to Choi [1], we have
$$
(1)\qquad y_0 = \frac{1}{C_1}\,x_0(\Lambda-Q)^{-1},
$$
where $C_1 = x_0(\Lambda-Q)^{-1}e + D_2 + (D_1-D_2)\sum_{n=0}^{L_1}x_n e$. Let $T$ and $\tilde T$ be the remaining and the elapsed service time, respectively, of the cell in service. In order to obtain the queue length distribution $y_n$ $(n \ge 1)$ at an arbitrary time, we define the joint probability distribution of the queue length and the remaining service time at an arbitrary time $\tau$,
$$
\pi_r(n,j,t)\,dt = P\{N(\tau)=n,\ J(\tau)=j,\ R(\tau)=r,\ t < T \le t+dt,\ \delta=1\},
$$
and its Laplace transform and the vectors
$$
\pi^*_r(n,j,s) = \int_0^{\infty}e^{-st}\pi_r(n,j,t)\,dt,
$$
$$
\pi^*_r(n,s) = (\pi^*_r(n,1,s),\cdots,\pi^*_r(n,N,s)), \quad r = 1,2, \qquad \pi^*(n,s) = \pi^*_1(n,s) + \pi^*_2(n,s).
$$
We furthermore define the conditional probability $\theta_r(n,j_1,j_2,t)\,dt$ $(r = 1,2)$ and its Laplace transform
$$
\theta_r(n,j_1,j_2,t)\,dt = P\{\nu(\tilde T)=n,\ J(\bar\tau+\tilde T)=j_2,\ R(\bar\tau+\tilde T)=r,\ t < T \le t+dt,\ \delta=1 \mid J_{\bar\tau}=j_1\},
$$
where $\bar\tau$ denotes the beginning epoch of the current service, and
$$
\theta^*_r(n,j_1,j_2,s) = \int_0^{\infty}e^{-st}\theta_r(n,j_1,j_2,t)\,dt, \qquad \Theta^*_r(n,s) = \big(\theta^*_r(n,j_1,j_2,s)\big)_{1\le j_1,j_2\le N},
$$
AN ANALYSIS OF MMPP/D1,D2/1/B QUEUE 73
where $\nu(T)$ is the number of cells arriving during the time $T$. Then the vectors $\pi^*_r(n,s)$ can be represented by the following equations:
$$
(2)\qquad \pi^*_1(n,s) = \frac{D_1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\,\Theta^*_1(n-1,s) + \sum_{k=1}^{\min(n,L_1)}x_k\,\Theta^*_1(n-k,s)\Big],
$$
$$
(3)\qquad \pi^*_1(B,s) = \frac{D_1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\Big\{\sum_{m=B-1}^{\infty}\Theta^*_1(m,s)\Big\} + \sum_{k=1}^{L_1}x_k\Big\{\sum_{m=B-k}^{\infty}\Theta^*_1(m,s)\Big\}\Big],
$$
$$
(4)\qquad \pi^*_2(n,s) = 0, \quad 1 \le n \le L_1,
$$
$$
(5)\qquad \pi^*_2(n,s) = \frac{D_2}{C_1}\sum_{k=L_1+1}^{n}x_k\,\Theta^*_2(n-k,s), \quad L_1+1 \le n \le B-1,
$$
$$
(6)\qquad \pi^*_2(B,s) = \frac{D_2}{C_1}\sum_{k=L_1+1}^{B-1}x_k\Big[\sum_{m=B-k}^{\infty}\Theta^*_2(m,s)\Big].
$$
We finally obtain
$$
(7)\qquad \pi^*(n,s) = \pi^*_1(n,s) + \pi^*_2(n,s)
= \begin{cases}
\dfrac{D_1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\,\Theta^*_1(n-1,s) + \displaystyle\sum_{k=1}^{n}x_k\,\Theta^*_1(n-k,s)\Big], & 1 \le n \le L_1,\\[2mm]
\dfrac{D_1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\,\Theta^*_1(n-1,s) + \displaystyle\sum_{k=1}^{L_1}x_k\,\Theta^*_1(n-k,s)\Big]\\
\qquad + \dfrac{D_2}{C_1}\displaystyle\sum_{k=L_1+1}^{n}x_k\,\Theta^*_2(n-k,s), & L_1+1 \le n \le B-1,\\[2mm]
\dfrac{D_1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\Big\{\displaystyle\sum_{m=B-1}^{\infty}\Theta^*_1(m,s)\Big\} + \displaystyle\sum_{k=1}^{L_1}x_k\Big\{\displaystyle\sum_{m=B-k}^{\infty}\Theta^*_1(m,s)\Big\}\Big]\\
\qquad + \dfrac{D_2}{C_1}\displaystyle\sum_{k=L_1+1}^{B-1}x_k\Big\{\displaystyle\sum_{m=B-k}^{\infty}\Theta^*_2(m,s)\Big\}, & n = B.
\end{cases}
$$
In order to obtain $\Theta^*_r(n,s)$ $(r = 1,2)$, we consider the following equation:
$$
(8)\qquad \sum_{n=0}^{\infty}\Theta^*_1(n,s)z^n = E[e^{-sT}e^{R(z)\tilde T}] = e^{R(z)D_1}E[e^{-(sI+R(z))T}],
$$
where $D_1 = \tilde T + T$. Since $E[e^{-sT}] = \int_0^{D_1}e^{-st}\frac{1}{D_1}\,dt = \frac{1-e^{-sD_1}}{sD_1}$,
$$
(9)\qquad \sum_{n=0}^{\infty}\Theta^*_1(n,s)z^n = e^{R(z)D_1}\big[I - e^{-(sI+R(z))D_1}\big]\big[(sI+R(z))D_1\big]^{-1}
= \frac{1}{D_1}\big[e^{R(z)D_1} - e^{-sD_1}I\big](sI+R(z))^{-1}.
$$
It is known that
$$
\sum_{n=0}^{\infty}A_n z^n = \sum_{n=0}^{\infty}P(n,D_1)z^n = e^{R(z)D_1}.
$$
Substituting the above equation into (9), we obtain
$$
\sum_{n=0}^{\infty}\Theta^*_1(n,s)z^n = \frac{1}{D_1}\Big[\sum_{n=0}^{\infty}A_n z^n - e^{-sD_1}I\Big](sI+R(z))^{-1}
= \frac{1}{D_1}\Big[\sum_{n=0}^{\infty}A_n z^n - e^{-sD_1}I\Big]\Big[\sum_{n=0}^{\infty}R_n(s)z^n\Big]
= \frac{1}{D_1}\sum_{n=0}^{\infty}\Big[\sum_{k=0}^{n}A_k R_{n-k}(s) - e^{-sD_1}R_n(s)\Big]z^n,
$$
where $R_n(s) = (sI-\Lambda+Q)^{-1}\big[\Lambda(\Lambda-sI-Q)^{-1}\big]^n$. Thus, $\Theta^*_1(n,s)$ is given by
$$
\Theta^*_1(n,s) = \frac{1}{D_1}\Big[\sum_{m=0}^{n}A_m R_{n-m}(s) - e^{-sD_1}R_n(s)\Big].
$$
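The matrix-geometric expansion $(sI+R(z))^{-1} = \sum_n R_n(s)z^n$ used above can be checked numerically. The sketch below uses a made-up 2-phase MMPP (generator $Q$, arrival-rate matrix $\Lambda$), comparing the directly inverted matrix with a truncated power series:

```python
# Numerical check of the expansion (sI + Q - L + zL)^{-1} = sum_n R_n(s) z^n,
# with R_n(s) = (sI - L + Q)^{-1} [L (L - sI - Q)^{-1}]^n.
# Q and Lam below are a made-up 2-phase MMPP, not data from the paper.

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_add(A, B, ca=1.0, cb=1.0):
    """Return ca*A + cb*B."""
    return [[ca * A[i][j] + cb * B[i][j] for j in range(2)] for i in range(2)]

def mat_inv(A):
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

I = [[1.0, 0.0], [0.0, 1.0]]
Q = [[-1.0, 1.0], [2.0, -2.0]]     # phase generator (rows sum to 0)
Lam = [[3.0, 0.0], [0.0, 0.5]]     # arrival-rate matrix Lambda
s, z = 0.7, 0.4                    # |z| < 1 so the series converges here

sIQ = mat_add(I, Q, s, 1.0)                               # sI + Q
lhs = mat_inv(mat_add(sIQ, Lam, 1.0, z - 1.0))            # (sI + Q + (z-1)Lambda)^{-1}

base = mat_inv(mat_add(sIQ, Lam, 1.0, -1.0))              # (sI - Lambda + Q)^{-1}
step = mat_mul(Lam, mat_inv(mat_add(Lam, sIQ, 1.0, -1.0)))  # Lambda (Lambda - sI - Q)^{-1}

rhs = [[0.0, 0.0], [0.0, 0.0]]
power, zn = I, 1.0
for _ in range(200):                                      # truncated series in z
    rhs = mat_add(rhs, mat_mul(base, power), 1.0, zn)
    power = mat_mul(power, step)
    zn *= z
```

For these values the spectral radius of $z\,\Lambda(\Lambda-sI-Q)^{-1}$ is below $1$, so 200 terms reproduce the inverse to floating-point accuracy.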
Similarly, we can obtain $\Theta^*_2(n,s)$ as follows:
$$
\Theta^*_2(n,s) = \frac{1}{D_2}\Big[\sum_{m=0}^{n}B_m R_{n-m}(s) - e^{-sD_2}R_n(s)\Big].
$$
Substituting $\Theta^*_1(n,s)$ and $\Theta^*_2(n,s)$ into $\pi^*(n,s)$, we obtain
$$
(10)\qquad \pi^*(n,s) = \begin{cases}
\dfrac{1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\displaystyle\sum_{m=0}^{n-1}A_m R_{n-1-m}(s) + \displaystyle\sum_{k=1}^{n}x_k\displaystyle\sum_{m=0}^{n-k}A_m R_{n-k-m}(s)\\
\qquad - e^{-sD_1}\Big\{x_0(\Lambda-Q)^{-1}\Lambda R_{n-1}(s) + \displaystyle\sum_{k=1}^{n}x_k R_{n-k}(s)\Big\}\Big], \hfill 1 \le n \le L_1,\\[2mm]
\dfrac{1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\displaystyle\sum_{m=0}^{n-1}A_m R_{n-1-m}(s) + \displaystyle\sum_{k=1}^{L_1}x_k\displaystyle\sum_{m=0}^{n-k}A_m R_{n-k-m}(s)\\
\qquad - e^{-sD_1}\Big\{x_0(\Lambda-Q)^{-1}\Lambda R_{n-1}(s) + \displaystyle\sum_{k=1}^{L_1}x_k R_{n-k}(s)\Big\}\\
\qquad + \displaystyle\sum_{k=L_1+1}^{n}x_k\displaystyle\sum_{m=0}^{n-k}B_m R_{n-k-m}(s) - e^{-sD_2}\displaystyle\sum_{k=L_1+1}^{n}x_k R_{n-k}(s)\Big], \hfill L_1+1 \le n \le B-1.
\end{cases}
$$
Finally, we obtain the queue length probabilities $y_n$ $(n \ge 1)$ at an arbitrary time. For $1 \le n \le L_1$,
$$
(11)\qquad
\begin{aligned}
y_n = \pi^*(n,0)
&= \frac{1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\sum_{m=0}^{n-1}A_m(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-1-m}\\
&\qquad + \sum_{k=1}^{n}x_k\sum_{m=0}^{n-k}A_m(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-k-m}
- \sum_{k=0}^{n}x_k(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-k}\Big].
\end{aligned}
$$
For $L_1+1 \le n \le B-1$,
$$
\begin{aligned}
y_n &= \frac{1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda\sum_{m=0}^{n-1}A_m(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-1-m}
+ \sum_{k=1}^{L_1}x_k\sum_{m=0}^{n-k}A_m(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-k-m}\\
&\qquad - \sum_{k=0}^{L_1}x_k(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-k}
+ \sum_{k=L_1+1}^{n}x_k\sum_{m=0}^{n-k}B_m(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-k-m}\\
&\qquad - \sum_{k=L_1+1}^{n}x_k(Q-\Lambda)^{-1}\{\Lambda(\Lambda-Q)^{-1}\}^{n-k}\Big],
\end{aligned}
$$
and
$$
y_B = \pi - \sum_{k=0}^{B-1}y_k,
$$
where $\pi$ is the stationary probability vector of the underlying MMPP phase process. Using the probabilities $y_n$ $(n \ge 0)$ obtained above, we obtain performance measures such as the loss probability ($P_{loss}$) and the mean queue length ($M_q$):
$$
P_{loss} = \frac{y_B\Lambda e}{\sum_{i=0}^{B}y_i\Lambda e} = \frac{y_B\Lambda e}{\pi\Lambda e}, \qquad M_q = \sum_{i=0}^{B}i\,y_i e.
$$
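Once the probabilities $y_0,\ldots,y_B$ are in hand, both performance measures are simple weighted sums. A sketch with made-up illustrative numbers (2 phases, $B = 3$; the vectors are not output of the model above):

```python
# Performance measures from the queue length probabilities y_0, ..., y_B:
#   P_loss = y_B Lambda e / (pi Lambda e)  and  M_q = sum_i i * (y_i e).
# The vectors below are made-up illustrative numbers (2 phases, B = 3).

lam = [3.0, 0.5]                      # diagonal of Lambda (phase arrival rates)
y = [[0.30, 0.20],                    # y_0
     [0.10, 0.10],                    # y_1
     [0.08, 0.07],                    # y_2
     [0.09, 0.06]]                    # y_3 = y_B; all entries sum to 1

# Phase marginals: pi_j = sum_n y_n(j), the stationary MMPP phase vector.
pi = [sum(y[n][j] for n in range(len(y))) for j in range(2)]

def lam_e(v):
    """v Lambda e for a row vector v (Lambda diagonal)."""
    return sum(v[j] * lam[j] for j in range(2))

p_loss = lam_e(y[-1]) / lam_e(pi)     # fraction of arrivals seeing a full buffer
m_q = sum(n * sum(y[n]) for n in range(len(y)))   # mean queue length
```

The ratio weights each state by its arrival rate, which is why $\Lambda e$ (and not plain $e$) appears in the loss formula.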
4. Analysis of waiting time distribution

In order to derive the waiting time distribution of an arbitrary cell pair, let us tag a cell pair arriving at time $\tau$. Suppose that there are $i$ $(1 \le i \le B-1)$ cell pairs in the system at time $\tau$. Since the service time may change according to the buffer occupancy at service completion epochs, we need to know the time $U_{i-1}$ required to complete the transmission of the $(i-1)$ cells present at the service completion epoch of the cell under service at time $\tau$. We first define the hitting time of a level greater than the threshold $L_1$ from a level less than or equal to $L_1$ at service completion epochs, and the hitting time of the threshold $L_1$ from a level greater than $L_1$:
$$
Y_{k,m}(j_1,j_2) \triangleq \inf\{n \ge 1;\ (N_n,J_n)=(m,j_2),\ N_n \in A \mid (N_0,J_0)=(k,j_1)\},
$$
$$
k = 1,\cdots,L_1, \quad m = L_1+1,\cdots,B-1,
$$
$$
Z_{k,L_1}(j_1,j_2) \triangleq \inf\{n \ge 1;\ (N_n,J_n)=(L_1,j_2) \mid (N_0,J_0)=(k,j_1)\},
$$
$$
k = L_1+1,\cdots,B-1, \quad 1 \le j_1,j_2 \le N,
$$
where $A = \{L_1+1,\cdots,B-1\}$. We introduce the matrices $P_1$, $P_1^0$, $\bar P_1$, and $\bar P_1^0$ of order $BN$ to obtain the distributions of $Y_{k,m}(j_1,j_2)$ and $Z_{k,L_1}(j_1,j_2)$:
$$
P_1 = \begin{pmatrix}
A'_0 & A'_1 & A'_2 & \cdots & A'_{L_1-1} & A'_{L_1} & A'_{L_1+1} & \cdots & A'_{B-2} & \bar A'_{B-1}\\
A_0 & A_1 & A_2 & \cdots & A_{L_1-1} & A_{L_1} & A_{L_1+1} & \cdots & A_{B-2} & \bar A_{B-1}\\
0 & A_0 & A_1 & \cdots & A_{L_1-2} & A_{L_1-1} & A_{L_1} & \cdots & A_{B-3} & \bar A_{B-2}\\
\vdots & & & \ddots & & & & \ddots & & \vdots\\
0 & 0 & 0 & \cdots & A_1 & A_2 & A_3 & \cdots & A_{B-L_1} & \bar A_{B-L_1+1}\\
0 & 0 & 0 & \cdots & A_0 & A_1 & A_2 & \cdots & A_{B-L_1-1} & \bar A_{B-L_1}\\
0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0\\
\vdots & & & \ddots & & & & \ddots & & \vdots\\
0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0
\end{pmatrix}
$$
and the matrix $P_1^0$ is the same as the matrix $P_1$ except that all rows and columns of levels greater than $L_1$ are block $0$.
$$
\bar P_1 = \begin{pmatrix}
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0\\
\vdots & \ddots & & & & & \ddots & \vdots\\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0\\
0 & \cdots & B_0 & B_1 & B_2 & \cdots & 0 & 0\\
0 & \cdots & 0 & B_0 & B_1 & \cdots & 0 & 0\\
\vdots & \ddots & & & & & \ddots & \vdots\\
0 & \cdots & 0 & 0 & 0 & \cdots & B_1 & B_2\\
0 & \cdots & 0 & 0 & 0 & \cdots & B_0 & B_1
\end{pmatrix}
$$
and the matrix $\bar P_1^0$ is the same as the matrix $\bar P_1$ except that the $(L_1+1,L_1)$-block $B_0$ in $\bar P_1$ is replaced by block $0$.
For $k = 1,\cdots,L_1$ and $m = L_1+1,\cdots,B-1$, the event $\{Y_{k,m}(j_1,j_2) = l\}$ means that the Markov chain $\{(N_n,J_n);\ n \ge 0\}$ starting at the state $(k,j_1)$ stays at levels less than $L_1+1$ during $l-1$ transitions and at the $l$-th transition hits the state $(m,j_2)$. Therefore, we have
$$
P\{Y_{k,m}(j_1,j_2)=l\} = \big[(P_1^0)^{l-1}P_1\big](k,j_1;m,j_2) \triangleq f^l_{k,m}(j_1,j_2),
$$
where $[X](k,j_1;m,j_2)$ is the $(j_1,j_2)$-element of the $(k,m)$-block of the matrix $X$. Similarly, we obtain the distribution of the random variable $Z_{k,L_1}(j_1,j_2)$:
$$
P\{Z_{k,L_1}(j_1,j_2)=l\} = \big[(\bar P_1^0)^{l-1}\bar P_1\big](k,j_1;L_1,j_2) \triangleq g^l_{k,L_1}(j_1,j_2), \quad k = L_1+1,\cdots,B-1.
$$
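The hitting-time distribution above has the familiar taboo-probability form: $l-1$ powers of the restricted matrix followed by one full step. A toy sketch with a scalar phase ($N = 1$) and made-up transition probabilities, accumulating the pmf of the first passage from level $0$ to level $2$:

```python
# Hitting-time distribution P{Y = l} = [(P0)^{l-1} P](k, m) via powers of the
# taboo matrix P0 (transitions restricted to the pre-hit levels).  Scalar-phase
# (N = 1) toy chain over levels {0, 1, 2, 3} with made-up probabilities.

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

P = [[0.6, 0.4, 0.0, 0.0],
     [0.3, 0.4, 0.3, 0.0],
     [0.0, 0.3, 0.4, 0.3],
     [0.0, 0.0, 0.5, 0.5]]

# Taboo matrix P0: rows/columns of the "hit" levels {2, 3} zeroed out,
# so (P0)^{l-1} only moves through levels {0, 1}.
P0 = [[P[i][j] if i < 2 and j < 2 else 0.0 for j in range(4)] for i in range(4)]

def hit_pmf(k, m, lmax):
    """P{first visit to level m happens at step l}, l = 1..lmax, starting at k."""
    pmf, power = [], [[float(i == j) for j in range(4)] for i in range(4)]  # P0^0 = I
    for _ in range(lmax):
        pmf.append(mat_mul(power, P)[k][m])   # [(P0)^{l-1} P](k, m)
        power = mat_mul(power, P0)
    return pmf

pmf = hit_pmf(0, 2, 400)
```

For this chain every exit from levels $\{0,1\}$ enters at level $2$, so the pmf sums to $1$ once enough terms are taken.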
Then the Laplace transform of the time $U_{i-1}$ required to complete the service of $(i-1)$ cell pairs is given as follows. For $1 \le i,k \le L_1$,
$$
\begin{aligned}
E[e^{-sU_{i-1}} &\mid (N_n,J_n)=(k,j)]\\
&= \sum_{m_0=L_1+1}^{B-1}\sum_{j_0}\Big[\sum_{a_0=1}^{i-1}E[e^{-sU_{i-1}} \mid (N_n,J_n)=(k,j),\ Y_{k,m_0}(j,j_0)=a_0]\,P\{Y_{k,m_0}(j,j_0)=a_0\}\\
&\qquad + \sum_{a_0=i}^{\infty}E[e^{-sU_{i-1}} \mid (N_n,J_n)=(k,j),\ Y_{k,m_0}(j,j_0)=a_0]\,P\{Y_{k,m_0}(j,j_0)=a_0\}\Big]\\
&= \sum_{m_0=L_1+1}^{B-1}\sum_{j_0}\Big(\sum_{a_0=1}^{i-1}E[e^{-sU_{i-1}} \mid (N_n,J_n)=(k,j),\ Y_{k,m_0}(j,j_0)=a_0]\,f^{a_0}_{k,m_0}(j,j_0)
+ e^{-s(i-1)D_1}\sum_{a_0=i}^{\infty}f^{a_0}_{k,m_0}(j,j_0)\Big).
\end{aligned}
$$
Conditioning on $Z_{m,L_1}(j,j')$ and $Y_{L_1,m}(j,j')$ in $E[e^{-sU_{i-1}} \mid (N_n,J_n)=(k,j),\ Y_{k,m_0}(j,j_0)=a_0]$, the summation below is finite:
$$
E[e^{-sU_{i-1}} \mid (N_n,J_n)=(k,j)] = \sum_{l=0}\big(A^1_l(k,j) + B^1_l(k,j)\big) \triangleq W^{i-1}_{k,j}(s), \quad k = 1,2,\cdots,L_1,\ 1 \le j \le N,
$$
where
$$
A^1_0(k,j) = e^{-s(i-1)D_1}\sum_{m_0=L_1+1}^{B-1}\Big[e^T - \sum_{a_0=1}^{i-1}f^{a_0}_{k,m_0}e\Big]_j, \qquad B^1_0(k,j) \triangleq 0,
$$
$$
A^1_l(k,j) = \sum_{m_0}\cdots\sum_{m_l}\sum_{a_0=1}^{i-1}\sum_{b_1=1}^{i-1-a_0}\cdots\sum_{b_l=1}^{i-1-\sum_0^{l-1}a_n-\sum_1^{l-1}b_n}
e^{-s\left((i-1)D_1+(D_2-D_1)\sum_1^{l}b_n\right)}
\cdot f^{a_0}_{k,m_0}\prod_{r=1}^{l-1}\big\{g^{b_r}_{m_{r-1},L_1}f^{a_r}_{L_1,m_r}\big\}\Big\{e^T - \sum_{a_l=1}^{i-1-\sum_0^{l-1}a_n-\sum_1^{l}b_n}f^{a_l}_{L_1,m_l}e\Big\}_j,
$$
$$
B^1_l(k,j) = \sum_{m_0}\cdots\sum_{m_{l-1}}\sum_{a_0=1}^{i-1}\sum_{b_1=1}^{i-1-a_0}\cdots\sum_{a_{l-1}=1}^{i-1-\sum_0^{l-2}a_n-\sum_1^{l-1}b_n}
e^{-s\left((i-1)D_2+(D_1-D_2)\sum_0^{l-1}a_n\right)}
\cdot f^{a_0}_{k,m_0}\prod_{r=1}^{l-1}\big\{g^{b_r}_{m_{r-1},L_1}f^{a_r}_{L_1,m_r}\big\}\Big\{e^T - \sum_{b_l=1}^{i-1-\sum_0^{l-1}a_n-\sum_1^{l-1}b_n}g^{b_l}_{m_{l-1},L_1}e\Big\}_j,
$$
and $[X]_j$ denotes the $j$-th component of the row vector $X$.

For $L_1+1 \le k \le B-1$,
$$
E[e^{-sU_{i-1}} \mid (N_n,J_n)=(k,j)] = \sum_{l=0}\big(\bar A^1_l(k,j) + \bar B^1_l(k,j)\big) \triangleq W^{i-1}_{k,j}(s), \quad 1 \le j \le N,
$$
where
$$
\bar A^1_0(k,j) \triangleq 0, \qquad \bar B^1_0(k,j) = e^{-s(i-1)D_2}\Big[e^T - \sum_{b_0=1}^{i-1}g^{b_0}_{k,L_1}e\Big]_j,
$$
$$
\bar A^1_l(k,j) = \sum_{m_1}\cdots\sum_{m_l}\sum_{b_0=1}^{i-1}\sum_{a_1=1}^{i-1-b_0}\cdots\sum_{b_{l-1}=1}^{i-1-\sum_1^{l-1}a_n-\sum_0^{l-2}b_n}
e^{-s\left((i-1)D_1+(D_2-D_1)\sum_0^{l-1}b_n\right)}
\cdot g^{b_0}_{k,L_1}\prod_{r=1}^{l-1}\big\{f^{a_r}_{L_1,m_r}g^{b_r}_{m_r,L_1}\big\}\Big\{e^T - \sum_{a_l=1}^{i-1-\sum_1^{l-1}a_n-\sum_0^{l-1}b_n}f^{a_l}_{L_1,m_l}e\Big\}_j,
$$
$$
\bar B^1_l(k,j) = \sum_{m_1}\cdots\sum_{m_l}\sum_{b_0=1}^{i-1}\sum_{a_1=1}^{i-1-b_0}\cdots\sum_{a_l=1}^{i-1-\sum_1^{l-1}a_n-\sum_0^{l-1}b_n}
e^{-s\left((i-1)D_2+(D_1-D_2)\sum_1^{l}a_n\right)}
\cdot g^{b_0}_{k,L_1}\prod_{r=1}^{l-1}\big\{f^{a_r}_{L_1,m_r}g^{b_r}_{m_r,L_1}\big\}f^{a_l}_{L_1,m_l}\Big\{e^T - \sum_{b_l=1}^{i-1-\sum_1^{l}a_n-\sum_0^{l-1}b_n}g^{b_l}_{m_l,L_1}e\Big\}_j.
$$
Since the service time may change according to the buffer occupancy, we must know the number of cell pairs arriving from an arbitrary time $\tau$ to the service completion epoch. Consider the following joint probabilities:
$$
P\{N(\tau)=0,\ \text{an arrival occurs in}\ (\tau,\tau+d\tau)\} = y_0\Lambda e\,d\tau.
$$
For $1 \le n \le B-1$, $n+l < B-1$,
$$
P\{N(\tau)=n,\ N_{k+1}=n+l,\ J_{k+1}=j,\ R(\tau)=1,\ \text{an arrival occurs in}\ (\tau,\tau+d\tau),\ t < T \le t+dt,\ \delta=1\}
$$
$$
= \frac{1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda P(n-1,D_1-t)\Lambda P(l,t)\,d\tau\,dt
+ \sum_{i=1}^{\min(n,L_1)}x_i P(n-i,D_1-t)\Lambda P(l,t)\,d\tau\,dt\Big]_j.
$$
For $1 \le n \le B-1$,
$$
P\{N(\tau)=n,\ N_{k+1}=B-1,\ J_{k+1}=j,\ R(\tau)=1,\ \text{an arrival occurs in}\ (\tau,\tau+d\tau),\ t < T \le t+dt,\ \delta=1\}
$$
$$
= \frac{1}{C_1}\Big[x_0(\Lambda-Q)^{-1}\Lambda P(n-1,D_1-t)\Lambda\bar P(B-n-1,t)\,d\tau\,dt
+ \sum_{i=1}^{\min(n,L_1)}x_i P(n-i,D_1-t)\Lambda\bar P(B-n-1,t)\,d\tau\,dt\Big]_j,
$$
where $\bar P(k,t) = \sum_{l=k}^{\infty}P(l,t)$.
For $L_1+1 \le n \le B-1$, $n+l < B-1$,
$$
P\{N(\tau)=n,\ N_{k+1}=n+l,\ J_{k+1}=j,\ R(\tau)=2,\ \text{an arrival occurs in}\ (\tau,\tau+d\tau),\ t < T \le t+dt,\ \delta=1\}
= \frac{1}{C_1}\Big[\sum_{i=L_1+1}^{n}x_i P(n-i,D_2-t)\Lambda P(l,t)\,d\tau\,dt\Big]_j.
$$
For $L_1+1 \le n \le B-1$,
$$
P\{N(\tau)=n,\ N_{k+1}=B-1,\ J_{k+1}=j,\ R(\tau)=2,\ \text{an arrival occurs in}\ (\tau,\tau+d\tau),\ t < T \le t+dt,\ \delta=1\}
= \frac{1}{C_1}\Big[\sum_{i=L_1+1}^{n}x_i P(n-i,D_2-t)\Lambda\bar P(B-n-1,t)\,d\tau\,dt\Big]_j.
$$
By combining the above results, we obtain the Laplace transform for the waiting time of a cell pair:
$$
\begin{aligned}
E[e^{-sW}] = \frac{1}{(1-P_{loss})\pi\Lambda e}\Big[y_0\Lambda e
&+ \frac{1}{C_1}\Big\{\sum_{n=1}^{B-2}\sum_{l=0}^{B-n-2}x_0(\Lambda-Q)^{-1}\Lambda\int_0^{D_1}e^{-st}P(n-1,D_1-t)\Lambda P(l,t)\,dt\,W^{n-1}_{n+l}(s)\\
&+ \sum_{n=1}^{B-1}x_0(\Lambda-Q)^{-1}\Lambda\int_0^{D_1}e^{-st}P(n-1,D_1-t)\Lambda\bar P(B-n-1,t)\,dt\,W^{n-1}_{B-1}(s)\\
&+ \sum_{i=1}^{L_1}\sum_{n=i}^{B-2}\sum_{l=0}^{B-n-2}\int_0^{D_1}e^{-st}x_i P(n-i,D_1-t)\Lambda P(l,t)\,dt\,W^{n-1}_{n+l}(s)\\
&+ \sum_{i=1}^{L_1}\sum_{n=i}^{B-1}\int_0^{D_1}e^{-st}x_i P(n-i,D_1-t)\Lambda\bar P(B-n-1,t)\,dt\,W^{n-1}_{B-1}(s)\\
&+ \sum_{i=L_1+1}^{B-2}\sum_{n=i}^{B-2}\sum_{l=0}^{B-n-2}\int_0^{D_2}e^{-st}x_i P(n-i,D_2-t)\Lambda P(l,t)\,dt\,W^{n-1}_{n+l}(s)\\
&+ \sum_{i=L_1+1}^{B-1}\sum_{n=i}^{B-1}\int_0^{D_2}e^{-st}x_i P(n-i,D_2-t)\Lambda\bar P(B-n-1,t)\,dt\,W^{n-1}_{B-1}(s)\Big\}\Big].
\end{aligned}
$$
References

[1] B. D. Choi and D. I. Choi, The queueing system with queue length dependent service times and its application to cell discarding scheme in ATM networks, IEE Proc. Commun., vol. 143, no. 1, pp. 5-11, 1996.
[2] B. D. Choi, D. I. Choi, Young Chul Kim and Dan Keun Sung, An analysis of M, MMPP/G/1 queues with QLT scheduling policy and Bernoulli schedule, IEICE Transactions on Communications, vol. E81-B, no. 1, pp. 13-22, 1998.
[3] D. I. Choi, C. Knessl and C. Tier, A queueing system with queue length dependent service times, with applications to cell discarding in ATM networks, J. Applied Mathematics and Stochastic Analysis, vol. 20, no. 1, pp. 35-62, 1999.

Department of Mathematics,
Halla Institute of Technology
220-840 Wonju-shi, Kangwon-do, Korea
J. KSIAM Vol.3, No.2, 81-97, 1999
TIME OPTIMAL CONTROL PROBLEM OF
RETARDED SEMILINEAR SYSTEMS WITH
UNBOUNDED OPERATORS IN HILBERT SPACES
Jong-Yeoul Park, Jin-Mun Jeong, Yong-Han Kang

Abstract. This paper deals with the time optimal control problem for a retarded semilinear system, using the construction of the fundamental solution in the case where the principal operators are unbounded.
1. Introduction
Let $H$ and $V$ be complex Hilbert spaces such that the embedding $V \subset H$ is continuous. In this paper we deal with the time optimal control problem governed by the following semilinear parabolic type equation in the Hilbert space $H$:
$$
\text{(RSE)}\qquad
\begin{cases}
\dfrac{d}{dt}x(t) = A_0x(t) + A_1x(t-h) + \displaystyle\int_{-h}^{0}a(s)A_2x(t+s)\,ds + f(t,x(t)) + k(t),\\
x(0) = \phi^0, \qquad x(s) = \phi^1(s), \quad -h \le s < 0.
\end{cases}
$$
Let $A_0$ be the operator associated with a bounded sesquilinear form defined on $V \times V$ and satisfying Gårding's inequality. Then $A_0$ generates an analytic semigroup $S(t)$ in both $H$ and $V^*$, and so the equation (RSE) may be considered as an equation in both $H$ and $V^*$.

Let $(\phi^0,\phi^1) \in H \times L^2(0,T;V)$ and let $x(T;\phi,f,u)$ be a solution of the system (RSE) associated with the nonlinear term $f$ and the control $u$ at time $T$.

We now define the fundamental solution $W(t)$ of (RSE) by
$$
W(t) = \begin{cases}x(t;(\phi^0,0),0,0), & t \ge 0,\\ 0, & t < 0.\end{cases}
$$

1991 Mathematics Subject Classification. Primary 35B37; Secondary 93C20.
Key words and phrases. semilinear evolution equation, regularity, optimal control, compact imbedding.
According to the above definition, $W(t)$ is the unique solution of
$$
W(t) = S(t) + \int_0^t S(t-s)\Big\{A_1W(s-h) + \int_{-h}^{0}a(\tau)A_2W(s+\tau)\,d\tau\Big\}ds
$$
for $t \ge 0$ (cf. Nakagiri [5]). Under the conditions that $a(\cdot) \in L^2(-h,0;\mathbf R)$ and $A_i$ $(i=1,2)$ are bounded linear operators on $H$ into itself, S. Nakagiri [5] treated the standard optimal control problems and the time optimal control problem for the linear retarded system (RSE) with $f \equiv 0$ in a Banach space. If $A_i$ $(i=0,1,2): D(A_0) \subset H \to H$ are unbounded operators, G. Di Blasio, K. Kunisch and E. Sinestrari [2] obtained global existence and uniqueness of the strict solution for the linear retarded system in Hilbert spaces. With the more general Lipschitz continuity of the nonlinear operator $f$ from $\mathbf R \times V$ to $H$, the authors of [4] established existence and uniqueness of solutions of the given system. But we cannot immediately obtain the time optimal control result as in [5, Section 8] without the boundedness of the fundamental solution $W(t)$. Since the integrand $A_0S(t-s)$ has a singularity at $t = s$, we cannot solve the integral equation for $W(t)$ directly. In [6], H. Tanabe investigated the fundamental solution $W(t)$ by constructing the resolvent operators for integrodifferential equations of Volterra type (see (3.14), (3.21) of [6]) under the condition that $a(\cdot)$ is real valued and Hölder continuous on $[-h,0]$.

This paper deals with the time optimal control problem by using the construction of the fundamental solution, obtaining the same results as [5], in the case where the principal operators $A_i$ $(i=0,1,2)$ are unbounded.
2. Retarded semilinear equations

The inner product and norm in $H$ are denoted by $(\cdot,\cdot)$ and $|\cdot|$. The notations $\|\cdot\|$ and $\|\cdot\|_*$ denote the norms of $V$ and $V^*$ as usual, respectively. Hence we may regard that
$$
(2.1)\qquad \|u\|_* \le |u| \le \|u\|, \quad u \in V.
$$
Let $a(\cdot,\cdot)$ be a bounded sesquilinear form defined on $V \times V$ satisfying Gårding's inequality
$$
(2.2)\qquad \operatorname{Re}a(u,u) \ge c_0\|u\|^2 - c_1|u|^2, \quad c_0 > 0,\ c_1 \ge 0.
$$
Let $A_0$ be the operator associated with the sesquilinear form $-a(\cdot,\cdot)$:
$$
(A_0u,v) = -a(u,v), \quad u,v \in V.
$$
It follows from (2.2) that for every $u \in V$,
$$
\operatorname{Re}((c_1-A_0)u,u) \ge c_0\|u\|^2.
$$
Then $A_0$ is a bounded linear operator from $V$ to $V^*$, and its realization in $H$, which is the restriction of $A_0$ to
$$
D(A_0) = \{u \in V;\ A_0u \in H\},
$$
is also denoted by $A_0$. Then $A_0$ generates an analytic semigroup in both $H$ and $V^*$. Hence we may assume that there exists a constant $C_0$ such that
$$
(2.3)\qquad \|u\| \le C_0\|u\|_{D(A_0)}^{1/2}|u|^{1/2}
$$
for every $u \in D(A_0)$, where
$$
\|u\|_{D(A_0)} = (|A_0u|^2 + |u|^2)^{1/2}
$$
is the graph norm of $D(A_0)$.

First, we introduce the following linear retarded functional differential equation:
$$
\text{(RE)}\qquad
\begin{cases}
\dfrac{d}{dt}x(t) = A_0x(t) + A_1x(t-h) + \displaystyle\int_{-h}^{0}a(s)A_2x(t+s)\,ds + k(t),\\
x(0) = \phi^0, \qquad x(s) = \phi^1(s), \quad -h \le s < 0.
\end{cases}
$$
Here, the operators $A_1$ and $A_2$ are bounded linear operators from $V$ to $V^*$ such that their restrictions to $D(A_0)$ are bounded linear operators from $D(A_0)$ to $H$. The function $a(\cdot)$ is assumed to be real valued and Hölder continuous on the interval $[-h,0]$. Let $W(\cdot)$ be the fundamental solution of the linear equation associated with (RE), which is the operator-valued function satisfying
$$
(2.4)\qquad
\begin{aligned}
W(t) &= S(t) + \int_0^t S(t-s)\Big\{A_1W(s-h) + \int_{-h}^{0}a(\tau)A_2W(s+\tau)\,d\tau\Big\}ds, \quad t > 0,\\
W(0) &= I, \qquad W(s) = 0, \quad -h \le s < 0,
\end{aligned}
$$
where $S(\cdot)$ is the semigroup generated by $A_0$. Then
$$
(2.5)\qquad
\begin{aligned}
x(t) &= W(t)\phi^0 + \int_{-h}^{0}U_t(s)\phi^1(s)\,ds + \int_0^t W(t-s)k(s)\,ds,\\
U_t(s) &= W(t-s-h)A_1 + \int_{-h}^{s}W(t-s+\tau)a(\tau)A_2\,d\tau.
\end{aligned}
$$
Recalling the formulation of mild solutions, we know that the mild solution of (RE) is also represented by
$$
x(t) = \begin{cases}
S(t)\phi^0 + \displaystyle\int_0^t S(t-s)\Big\{A_1x(s-h) + \int_{-h}^{0}a(\tau)A_2x(s+\tau)\,d\tau + k(s)\Big\}ds, & t > 0,\\
\phi^1(s), & -h \le s < 0.
\end{cases}
$$
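For a scalar stand-in for (RE) with no distributed delay ($A_2 = 0$), the mild-solution formula can be evaluated directly: on $[0,h]$ the history term is known, so $x(t) = S(t)\phi^0 + \int_0^t S(t-s)A_1\phi^1(s-h)\,ds$. The sketch below (all coefficients made up) compares this closed form with a quadrature evaluation of the variation-of-constants integral:

```python
import math

# Mild-solution formula for a scalar toy case of (RE):
#   x'(t) = a0 x(t) + a1 x(t - h),  x(0) = 1,  x(s) = 1 for -h <= s < 0,
# with a(.) = 0 (no distributed delay), so S(t) = exp(a0 t) and, on [0, h],
#   x(t) = S(t) x(0) + int_0^t S(t - s) a1 x(s - h) ds
#        = e^{a0 t} + (a1 / a0)(e^{a0 t} - 1).
# The coefficients are made-up illustrative numbers.

a0, a1, h = -1.0, 0.5, 1.0

def x_closed(t):
    """Closed form of the mild solution on [0, h]."""
    return math.exp(a0 * t) + (a1 / a0) * (math.exp(a0 * t) - 1.0)

def x_quadrature(t, n=20000):
    """Evaluate the variation-of-constants integral by the midpoint rule."""
    ds = t / n
    integral = sum(math.exp(a0 * (t - (i + 0.5) * ds)) * a1 * 1.0
                   for i in range(n)) * ds    # history value x(s - h) = 1 on [0, t]
    return math.exp(a0 * t) + integral

x1, x2 = x_closed(0.8), x_quadrature(0.8)
```

Beyond $t = h$ the same formula is applied interval by interval (the method of steps), with the just-computed segment serving as the new history.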
From Theorem 1 in [6] we obtain the following results.
Proposition 2.1. The fundamental solution $W(t)$ of (RE) exists uniquely. The functions $A_0W(t)$ and $dW(t)/dt$ are strongly continuous except at $t = nh$, $n = 0,1,2,\ldots$, and the following inequalities hold for $i = 0,1,2$ and $n = 0,1,2,\ldots$:
$$
(2.6)\qquad |A_iW(t)| \le C_n/(t-nh),
$$
$$
(2.7)\qquad |dW(t)/dt| \le C_n/(t-nh),
$$
$$
(2.8)\qquad |A_iW(t)A_0^{-1}| \le C_n
$$
in $(nh,(n+1)h)$, and
$$
(2.9)\qquad \Big|\int_t^{t'}A_iW(\tau)\,d\tau\Big| \le C_n
$$
for $nh \le t < t' \le (n+1)h$. Let $\rho$ be the order of Hölder continuity of $a(\cdot)$. Then for $nh \le t < t' \le (n+1)h$ and $0 < \gamma < \rho$,
$$
(2.10)\qquad |W(t')-W(t)| \le C_{n,\gamma}(t'-t)^{\gamma}(t-nh)^{-\gamma},
$$
$$
(2.11)\qquad |A_i(W(t')-W(t))| \le C_{n,\gamma}(t'-t)^{\gamma}(t-nh)^{-\gamma-1},
$$
$$
(2.12)\qquad |A_i(W(t')-W(t))A_0^{-1}| \le C_{n,\gamma}(t'-t)^{\gamma}(t-nh)^{-\gamma},
$$
where $C_n$ and $C_{n,\gamma}$ are constants depending on $n$ and on $(n,\gamma)$, respectively, but not on $t$ and $t'$.
Considering (RE) as an equation in $V^*$, we also obtain the same norm estimates (2.6)-(2.12) in the space $V^*$. By virtue of Theorem 3.3 of [2] we have the following result on the linear equation (RE).

Proposition 2.2. 1) Let $F = (D(A_0),H)_{1/2,2}$, where $(D(A_0),H)_{1/2,2}$ denotes the real interpolation space between $D(A_0)$ and $H$. For $(\phi^0,\phi^1) \in F \times L^2(-h,0;D(A_0))$ and $k \in L^2(0,T;H)$, $T > 0$, there exists a unique solution $x$ of (RE) belonging to
$$
L^2(-h,T;D(A_0)) \cap W^{1,2}(0,T;H) \subset C([0,T];F)
$$
and satisfying
$$
(2.13)\qquad \|x\|_{L^2(-h,T;D(A_0))\cap W^{1,2}(0,T;H)} \le C_1'\big(\|\phi^0\|_F + \|\phi^1\|_{L^2(-h,0;D(A_0))} + \|k\|_{L^2(0,T;H)}\big),
$$
where $C_1'$ is a constant depending on $T$.
2) Let $(\phi^0,\phi^1) \in H \times L^2(-h,0;V)$ and $k \in L^2(0,T;V^*)$, $T > 0$. Then there exists a unique solution $x$ of (RE) belonging to
$$
L^2(-h,T;V) \cap W^{1,2}(0,T;V^*) \subset C([0,T];H)
$$
and satisfying
$$
(2.14)\qquad \|x\|_{L^2(-h,T;V)\cap W^{1,2}(0,T;V^*)} \le C_1'\big(|\phi^0| + \|\phi^1\|_{L^2(-h,0;V)} + \|k\|_{L^2(0,T;V^*)}\big).
$$
In what follows we assume that
$$
\|W(t)\| \le M, \quad t > 0,
$$
for the sake of simplicity.
Proposition 2.3. Let $k \in L^2(0,T;H)$ and $x(t) = \int_0^t W(t-s)k(s)\,ds$. Then there exists a constant $C_1'$ such that for $T > 0$,
$$
(2.15)\qquad \|x\|_{L^2(0,T;D(A_0))} \le C_1'\|k\|_{L^2(0,T;H)},
$$
$$
(2.16)\qquad \|x\|_{L^2(0,T;H)} \le MT\|k\|_{L^2(0,T;H)},
$$
and
$$
(2.17)\qquad \|x\|_{L^2(0,T;V)} \le (C_1'MT)^{1/2}\|k\|_{L^2(0,T;H)}.
$$
Proof. The assertion (2.15) is immediately obtained from Proposition 2.2 for the equation (RE) with $(\phi^0,\phi^1) = (0,0)$. Since
$$
\|x\|_{L^2(0,T;H)}^2 = \int_0^T\Big|\int_0^t W(t-s)k(s)\,ds\Big|^2dt
\le M^2\int_0^T\Big(\int_0^t|k(s)|\,ds\Big)^2dt
\le M^2\int_0^T t\int_0^t|k(s)|^2\,ds\,dt
\le \frac{M^2T^2}{2}\int_0^T|k(s)|^2\,ds,
$$
it follows that $\|x\|_{L^2(0,T;H)} \le MT\|k\|_{L^2(0,T;H)}$. From (2.3), (2.15), and (2.16) it holds that
$$
\|x\|_{L^2(0,T;V)} \le (C_1'MT)^{1/2}\|k\|_{L^2(0,T;H)}. \qquad \square
$$
Let $f$ be a nonlinear mapping from $\mathbf R \times V$ into $H$. We assume that there exists a constant $L > 0$ such that for any $x_1, x_2 \in V$,
$$
\text{(F1)}\qquad |f(t,x_1) - f(t,x_2)| \le L\|x_1 - x_2\|,
$$
$$
\text{(F2)}\qquad f(t,0) = 0.
$$
The following result on (RSE) is obtained from Theorem 2.1 in [4].

Proposition 2.4. Suppose that the assumptions (F1), (F2) are satisfied. Then for any $(\phi^0,\phi^1) \in H \times L^2(-h,0;V)$ and $k \in L^2(0,T;V^*)$, $T > 0$, the solution $x$ of (RSE) exists and is unique in $L^2(-h,T;V) \cap W^{1,2}(0,T;V^*)$, and there exists a constant $C_2'$ depending on $T$ such that
$$
(2.18)\qquad \|x\|_{L^2(-h,T;V)\cap W^{1,2}(0,T;V^*)} \le C_2'\big(1 + |\phi^0| + \|\phi^1\|_{L^2(-h,0;V)} + \|k\|_{L^2(0,T;V^*)}\big).
$$
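The convolution bound (2.16) is easy to check numerically in a scalar setting: take any kernel with $|W(t)| \le M$ and any forcing $k$, and compare the discretized $L^2$ norms. The kernel and forcing below are made-up stand-ins:

```python
import math

# Sanity check of the bound (2.16): for x(t) = int_0^t W(t-s) k(s) ds with
# |W(t)| <= M one has ||x||_{L2(0,T)} <= M T ||k||_{L2(0,T)}.
# Scalar stand-ins: W(t) = M e^{-t} (so |W| <= M) and k(s) = cos(3 s).

M, T, n = 2.0, 1.5, 800
dt = T / n
t_grid = [(i + 0.5) * dt for i in range(n)]

def W(t): return M * math.exp(-t)
def k(s): return math.cos(3.0 * s)

def x(t):
    """Midpoint-rule evaluation of the convolution int_0^t W(t-s) k(s) ds."""
    m = max(1, int(t / dt))
    ds = t / m
    return sum(W(t - (j + 0.5) * ds) * k((j + 0.5) * ds) for j in range(m)) * ds

norm_x = math.sqrt(sum(x(t) ** 2 for t in t_grid) * dt)
norm_k = math.sqrt(sum(k(t) ** 2 for t in t_grid) * dt)
```

The bound is far from tight here (the oscillating forcing averages out under the convolution), which is consistent with (2.16) being a crude Cauchy-Schwarz estimate.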
3. Lemmas for fundamental solutions

For the sake of simplicity we assume that $S(t)$ is uniformly bounded. Then
$$
(3.1)\qquad |S(t)| \le M_0\ (t \ge 0), \qquad |A_0S(t)| \le M_0/t\ (t > 0), \qquad |A_0^2S(t)| \le K/t^2\ (t > 0)
$$
for some constant $M_0$ (e.g., [6]). We also assume that $a(\cdot)$ is Hölder continuous of order $\rho$:
$$
(3.2)\qquad |a(\tau)| \le H_0, \qquad |a(s)-a(\tau)| \le H_1(s-\tau)^{\rho}
$$
for some constants $H_0, H_1$.

Lemma 3.1. For $0 < s < t$ and $0 < \gamma < 1$,
$$
(3.3)\qquad |S(t)-S(s)| \le \frac{M_0}{\gamma}\Big(\frac{t-s}{s}\Big)^{\gamma},
$$
$$
(3.4)\qquad |A_0S(t)-A_0S(s)| \le M_0(t-s)^{\gamma}s^{-\gamma-1}.
$$
Proof. From (3.1), for $0 < s < t$,
$$
(3.5)\qquad |S(t)-S(s)| = \Big|\int_s^t A_0S(\tau)\,d\tau\Big| \le M_0\log\frac{t}{s}.
$$
It is easily seen that for any $t > 0$ and $0 < \gamma < 1$,
$$
(3.6)\qquad \log(1+t) \le t^{\gamma}/\gamma.
$$
Combining (3.6) with (3.5) we get (3.3). For $0 < s < t$,
$$
(3.7)\qquad |A_0S(t)-A_0S(s)| = \Big|\int_s^t A_0^2S(\tau)\,d\tau\Big| \le M_0(t-s)/(ts).
$$
Noting that $(t-s)/s \le ((t-s)/s)^{\gamma}$ for $0 < \gamma < 1$, we obtain (3.4) from (3.7). $\square$
According to Tanabe [6] we set
$$
(3.8)\qquad V(t) = \begin{cases}
A_0\big(W(t)-S(t)\big), & t \in (0,h],\\
A_0\Big(W(t)-\displaystyle\int_{nh}^{t}S(t-s)A_1W(s-h)\,ds\Big), & t \in (nh,(n+1)h]\ (n = 1,2,\ldots).
\end{cases}
$$
For $0 < t \le h$,
$$
W(t) = S(t) + A_0^{-1}V(t),
$$
and from (2.4) we have
$$
W(t) = S(t) + \int_0^t\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2W(\sigma)\,d\sigma.
$$
Hence,
$$
V(t) = V_0(t) + \int_0^t A_0\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2A_0^{-1}V(\sigma)\,d\sigma,
$$
where
$$
V_0(t) = \int_0^t A_0\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2S(\sigma)\,d\sigma.
$$
For $nh \le t \le (n+1)h$ $(n = 0,1,2,\ldots)$ the fundamental solution $W(t)$ is represented by
$$
\begin{aligned}
W(t) = S(t) &+ \int_{nh}^{t}S(t-s)A_1W(s-h)\,ds
+ \int_0^{t-h}\int_{\sigma}^{\sigma+h}S(t-s)a(\sigma-s)\,ds\,A_2W(\sigma)\,d\sigma\\
&+ \int_{t-h}^{nh}\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2W(\sigma)\,d\sigma
+ \int_{nh}^{t}\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2W(\sigma)\,d\sigma.
\end{aligned}
$$
The integral equation satisfied by (3.8) is
$$
V(t) = V_0(t) + \int_{nh}^{t}A_0\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2A_0^{-1}V(\sigma)\,d\sigma,
$$
where
$$
\begin{aligned}
V_0(t) = A_0S(t) &+ A_0\int_h^{nh}S(t-s)A_1W(s-h)\,ds
+ \int_0^{t-h}A_0\int_{\sigma}^{\sigma+h}S(t-s)a(\sigma-s)\,ds\,A_2W(\sigma)\,d\sigma\\
&+ \int_{t-h}^{nh}A_0\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2W(\sigma)\,d\sigma\\
&+ \int_{nh}^{t}A_0\int_{\sigma}^{t}S(t-s)a(\sigma-s)\,ds\,A_2\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\,d\sigma.
\end{aligned}
$$
Thus, the integral equation (3.8) can be solved by successive approximation, and $V(t)$ is uniformly bounded on $[nh,(n+1)h]$ (e.g., (3.16) and the part preceding (3.40) in [6]). It is not difficult to show that for $n > 1$,
$$
V(nh+0) \ne V(nh-0) \quad\text{and}\quad W(nh+0) = W(nh-0).
$$
Moreover, we obtain the following result.
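The successive-approximation argument above is the standard Picard iteration for a Volterra equation of the second kind. A scalar sketch (with made-up $V_0$ and kernel $K$): for $V_0 \equiv 1$ and $K \equiv -1$ the exact solution of $V(t) = 1 - \int_0^t V(s)\,ds$ is $V(t) = e^{-t}$, which the iteration reproduces on a grid:

```python
# Successive approximation for a scalar Volterra equation of the second kind,
#   V(t) = V0(t) + int_0^t K(t, s) V(s) ds,
# mirroring how the operator equation for V(t) is solved above.
# V0 and K are made-up smooth stand-ins.

def solve_volterra(V0, K, T, n=200, iters=60):
    """Fixed-point iteration V_{m+1} = V0 + int K V_m on a uniform grid."""
    dt = T / n
    t = [i * dt for i in range(n + 1)]
    V = [V0(ti) for ti in t]                                  # initial guess
    for _ in range(iters):
        Vn = []
        for i, ti in enumerate(t):
            integral = sum(K(ti, t[j]) * V[j] for j in range(i)) * dt  # left rule
            Vn.append(V0(ti) + integral)
        V = Vn
    return t, V

t, V = solve_volterra(lambda t: 1.0, lambda t, s: -1.0, 1.0)
```

The Picard iterates converge for any bounded kernel because the $m$-th remainder is of order $T^m/m!$; the only error left after a few dozen sweeps is the quadrature error of the grid.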
Lemma 3.2. There exists a constant $C_n' > 0$ such that
$$
(3.9)\qquad \Big|\int_{nh}^{t}a(\sigma-s)A_iW(\sigma)\,d\sigma\Big| \le C_n', \quad i = 1,2,
$$
for $n = 0,1,2,\ldots$, $t \in [nh,(n+1)h]$ and $t \le s \le t+h$.

Proof. For $t \in [0,h]$ (i.e., $n = 0$), from (3.8) it follows that
$$
\begin{aligned}
\int_0^t a(\sigma-s)A_iW(\sigma)\,d\sigma
&= \int_0^t a(\sigma-s)A_iA_0^{-1}\big(A_0S(\sigma)+V(\sigma)\big)\,d\sigma\\
&= \int_0^t\big(a(\sigma-s)-a(-s)\big)A_iA_0^{-1}A_0S(\sigma)\,d\sigma + a(-s)A_iA_0^{-1}\big(S(t)-I\big)
+ \int_0^t a(\sigma-s)A_iA_0^{-1}V(\sigma)\,d\sigma.
\end{aligned}
$$
Noting that
$$
\Big|\int_0^t\big(a(\sigma-s)-a(-s)\big)A_iA_0^{-1}A_0S(\sigma)\,d\sigma\Big| \le M_0H_1|A_iA_0^{-1}|\int_0^t\sigma^{\rho-1}\,d\sigma,
$$
we have
$$
\Big|\int_0^t a(\sigma-s)A_iW(\sigma)\,d\sigma\Big| \le |A_iA_0^{-1}|\Big\{h^{\rho}M_0H_1 + H_0(M_0+1) + hH_0\sup_{0\le t\le h}|V(t)|\Big\}.
$$
Thus the assertion (3.9) holds on $[0,h]$. For $t \in [nh,(n+1)h]$, $n \ge 1$,
$$
(3.10)\qquad \int_{nh}^{t}a(\sigma-s)A_iW(\sigma)\,d\sigma = \int_{nh}^{t}a(\sigma-s)A_iA_0^{-1}V(\sigma)\,d\sigma
+ \int_{nh}^{t}a(\sigma-s)A_i\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\,d\sigma.
$$
The first term on the right of (3.10) is estimated as
$$
\Big|\int_{nh}^{t}a(\sigma-s)A_iA_0^{-1}V(\sigma)\,d\sigma\Big| \le hH_0|A_iA_0^{-1}|\sup_{nh\le t\le(n+1)h}|V(t)|.
$$
Let $\mu = (\sigma+nh)/2$ for $nh < \sigma < (n+1)h$. Then
$$
(3.11)\qquad
\begin{aligned}
\Big|A_0&\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\Big|\\
&\le \Big|\int_{\mu}^{\sigma}A_0S(\sigma-\eta)\big(A_1W(\eta-h)-A_1W(\sigma-h)\big)\,d\eta
+ \big(S((\sigma-nh)/2)-I\big)A_1W(\sigma-h)\\
&\qquad + \int_{nh}^{\mu}\big(A_0S(\sigma-\eta)-A_0S(\sigma-nh)\big)A_1W(\eta-h)\,d\eta
+ A_0S(\sigma-nh)\int_{nh}^{\mu}A_1W(\eta-h)\,d\eta\Big|\\
&\le \int_{\mu}^{\sigma}\frac{M_0}{\sigma-\eta}C_{n-1,\gamma}(\sigma-\eta)^{\gamma}(\eta-nh)^{-\gamma-1}\,d\eta
+ (M_0+1)\frac{C_{n-1}}{\sigma-nh}
+ \int_{nh}^{\mu}\frac{M_0(\eta-nh)}{(\sigma-\eta)(\sigma-nh)}\cdot\frac{C_{n-1}}{\eta-nh}\,d\eta
+ \frac{M_0C_{n-1}}{\sigma-nh}\\
&\le M_0C_{n-1,\gamma}\int_{nh}^{\sigma}(\sigma-\eta)^{\gamma-1}(\eta-nh)^{-\gamma}\,d\eta\,\frac{2}{\sigma-nh}
+ (2M_0+1)\frac{C_{n-1}}{\sigma-nh}
+ \frac{M_0C_{n-1}}{\sigma-nh}\log 2\\
&= \big\{2M_0C_{n-1,\gamma}B(\gamma,1-\gamma) + (2M_0+1+M_0\log 2)C_{n-1}\big\}/(\sigma-nh)
\le C_{n,\gamma}^0/(\sigma-nh),
\end{aligned}
$$
where $B(\cdot,\cdot)$ is the Beta function. Noting that
$$
\frac{d}{d\sigma}\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta = A_1W(\sigma-h) + A_0\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta,
$$
and integrating this equality over $[nh,t]$,
$$
(3.12)\qquad \int_{nh}^{t}A_0\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\,d\sigma
= \int_{nh}^{t}S(t-\eta)A_1W(\eta-h)\,d\eta - \int_{nh}^{t}A_1W(\eta-h)\,d\eta.
$$
By Lemma 3.1 and the induction hypothesis, the first term on the right of (3.12) is estimated as
$$
(3.13)\qquad
\begin{aligned}
\Big|\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\Big|
&= \Big|\int_{nh}^{\sigma}\big(S(\sigma-\eta)-S(\sigma-nh)\big)A_1W(\eta-h)\,d\eta
+ S(\sigma-nh)\int_{nh}^{\sigma}A_1W(\eta-h)\,d\eta\Big|\\
&\le \int_{nh}^{\sigma}M_0\log\frac{\sigma-nh}{\sigma-\eta}\cdot\frac{C_{n-1}}{\eta-nh}\,d\eta + M_0C_{n-1}
\le M_0C_{n-1}c_0 + M_0C_{n-1},
\end{aligned}
$$
where
$$
c_0 = \int_0^1\log\frac{1}{1-\xi}\,\frac{d\xi}{\xi}.
$$
Thus, combining the above inequality with (2.9), we get
$$
(3.14)\qquad \Big|\int_{nh}^{t}A_0\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\,d\sigma\Big| \le (M_0c_0+M_0+1)C_{n-1}.
$$
Therefore, from (3.11) and (3.14) the second term on the right of (3.10) is estimated as
$$
\begin{aligned}
\Big|\int_{nh}^{t}&a(\sigma-s)A_i\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\,d\sigma\Big|\\
&= \Big|\int_{nh}^{t}\big(a(\sigma-s)-a(nh-s)\big)A_i\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\,d\sigma
+ a(nh-s)\int_{nh}^{t}A_i\int_{nh}^{\sigma}S(\sigma-\eta)A_1W(\eta-h)\,d\eta\,d\sigma\Big|\\
&\le \int_{nh}^{t}H_1(\sigma-nh)^{\rho}|A_iA_0^{-1}|\,C_{n,\gamma}^0(\sigma-nh)^{-1}\,d\sigma
+ |a(nh-s)|\,|A_iA_0^{-1}|(M_0c_0+M_0+1)C_{n-1}\\
&\le H_1C_{n,\gamma}^0|A_iA_0^{-1}|(t-nh)^{\rho} + H_0|A_iA_0^{-1}|(M_0c_0+M_0+1)C_{n-1}.
\end{aligned}
$$
Hence, we get the assertion (3.9). $\square$

We define the operator $K_1(t',t): H \to H$ (or $V^* \to V^*$) by
$$
(3.15)\qquad K_1(t',t) = \int_t^{t'}S(t'-s)A_1W(s-h)\,ds
$$
for $nh \le t < t' < (n+1)h$. In terms of (3.13), $K_1(t',t)$ is uniformly bounded on $(nh,(n+1)h]$. And we remark that $K_1(t',t)$ converges to $0$ as $t' \to t$ at any element of $D(A_0)$ in view of (2.8). We introduce another operator $K_2(t',t): H \to H$ (or $V^* \to V^*$) by
$$
(3.16)\qquad K_2(t',t) = \int_t^{t'}S(t'-s)\int_{-h}^{0}a(\tau)A_2W(s+\tau)\,d\tau\,ds
$$
for $nh \le t < t' < (n+1)h$.

Lemma 3.3. Let $nh \le t < t' < (n+1)h$. Then there exists a constant $C_n'$ such that
$$
(3.17)\qquad |K_2(t',t)| \le 3M_0C_n'(t'-t).
$$
Proof. On $[0,h]$, we transform $K_2(t',t)$ by a suitable change of variables and Fubini's theorem:
$$
\begin{aligned}
K_2(t',t) &= \int_t^{t'}S(t'-s)\int_0^s a(\sigma-s)A_2W(\sigma)\,d\sigma\,ds\\
&= \int_0^t\int_t^{t'}S(t'-s)a(\sigma-s)A_2W(\sigma)\,ds\,d\sigma
+ \int_t^{t'}\int_{\sigma}^{t'}S(t'-s)a(\sigma-s)A_2W(\sigma)\,ds\,d\sigma\\
&= \int_t^{t'}S(t'-s)\int_0^t a(\sigma-s)A_2W(\sigma)\,d\sigma\,ds
+ \int_t^{t'}S(t'-s)\int_t^s a(\sigma-s)A_2W(\sigma)\,d\sigma\,ds.
\end{aligned}
$$
Thus from Lemma 3.2 we have
$$
|K_2(t',t)| \le 2M_0C_n'(t'-t).
$$
On $[nh,(n+1)h)$, in a similar way we get
$$
\begin{aligned}
K_2(t',t) &= \int_t^{t'}S(t'-s)\int_{-h}^{0}a(\tau)A_2W(\tau+s)\,d\tau\,ds
= \int_t^{t'}S(t'-s)\int_{s-h}^{s}a(\sigma-s)A_2W(\sigma)\,d\sigma\,ds\\
&= \int_{t-h}^{t'-h}\int_t^{\sigma+h}S(t'-s)a(\sigma-s)A_2W(\sigma)\,ds\,d\sigma
+ \int_{t'-h}^{t}\int_t^{t'}S(t'-s)a(\sigma-s)A_2W(\sigma)\,ds\,d\sigma\\
&\qquad + \int_t^{t'}\int_{\sigma}^{t'}S(t'-s)a(\sigma-s)A_2W(\sigma)\,ds\,d\sigma\\
&= \int_t^{t'}S(t'-s)\int_{s-h}^{t'-h}a(\sigma-s)A_2W(\sigma)\,d\sigma\,ds
+ \int_t^{t'}S(t'-s)\int_{t'-h}^{t}a(\sigma-s)A_2W(\sigma)\,d\sigma\,ds\\
&\qquad + \int_t^{t'}S(t'-s)\int_t^s a(\sigma-s)A_2W(\sigma)\,d\sigma\,ds.
\end{aligned}
$$
Therefore, by Lemma 3.2, (3.17) holds. $\square$
4. Time optimal control

Let $Y$ be a real Banach space. In what follows the admissible set $U_{ad}$ is a weakly compact subset of $L^2(0,T;Y)$. Consider the following hereditary controlled system:
$$
\text{(RSC)}\qquad
\begin{cases}
\dfrac{d}{dt}x(t) = A_0x(t) + A_1x(t-h) + \displaystyle\int_{-h}^{0}a(s)A_2x(t+s)\,ds + f(t,x(t)) + Bu(t),\\
x(0) = \phi^0, \qquad x(s) = \phi^1(s), \quad -h \le s < 0,\\
u \in U_{ad}.
\end{cases}
$$
Here the controller $B$ is a bounded linear operator from $Y$ to $H$. We denote the solution $x(t)$ of (RSC) by $x_u(t)$ to express the dependence on $u \in U_{ad}$; that is, $x_u$ is the trajectory corresponding to the control $u$. Suppose the target set $W$ is weakly compact in $H$ and define
$$
U_0 = \{u \in U_{ad} : x_u(t) \in W\ \text{for some}\ t \in [0,T]\}
$$
for $T > 0$, and suppose that $U_0 \ne \emptyset$. The optimal time is defined as the infimum $t_0$ of the times $t$ such that $x_u(t) \in W$ for some admissible control $u$. For each $u \in U_0$ we can define the first time $\tilde t(u)$ such that $x_u(\tilde t) \in W$. Our problem is to find a control $\bar u \in U_0$ such that
$$
\tilde t(\bar u) \le \tilde t(u)\quad\text{for all}\ u \in U_0
$$
subject to the constraint (RSC). Since $x_u \in C([0,T];H)$, the transition time $\tilde t(u)$ is well defined for each $u \in U_{ad}$.
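The transition time $\tilde t(u)$ is simply the first entry time of the trajectory into the target set. A scalar toy illustration (dynamics, control, and target set all made up) that records this first entry along an Euler-discretized trajectory:

```python
# Illustration of the transition time t~(u): simulate a scalar stand-in for the
# controlled system, x'(t) = -x(t) + u(t) with x(0) = 0, and record the first
# time the trajectory enters the target set W = [0.5, 0.6].  The dynamics,
# control, and target are all made up; the point is only the definition
#   t~(u) = inf{ t in [0, T] : x_u(t) in W }.

def first_hit(u, T=5.0, n=5000, target=(0.5, 0.6)):
    dt = T / n
    x = 0.0
    for i in range(n):
        t = i * dt
        if target[0] <= x <= target[1]:
            return t                      # first entry into the target set
        x += dt * (-x + u(t))             # explicit Euler step
    return None                           # target never reached on [0, T]

t_hit = first_hit(lambda t: 1.0)          # constant control u = 1
```

With $u \equiv 1$ the exact trajectory is $x(t) = 1 - e^{-t}$, so the first hit of $0.5$ occurs near $t = \ln 2$, which the discretization recovers up to the step size.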
Theorem 4.1. 1) Let $F = (D(A_0),H)_{1/2,2}$. If $(\phi^0,\phi^1) \in F \times L^2(-h,0;D(A_0))$ and $k \in L^2(0,T;H)$, then the solution $x$ of the equation (RSE) belongs to $L^2(-h,T;D(A_0)) \cap W^{1,2}(0,T;H)$, and the mapping $F \times L^2(-h,0;D(A_0)) \times L^2(0,T;H) \ni (\phi^0,\phi^1,k) \mapsto x \in L^2(-h,T;D(A_0)) \cap W^{1,2}(0,T;H)$ is continuous.
2) If $(\phi^0,\phi^1) \in H \times L^2(-h,0;V)$ and $k \in L^2(0,T;V^*)$, then the solution $x$ of the equation (RSE) belongs to $L^2(-h,T;V) \cap W^{1,2}(0,T;V^*)$, and the mapping $H \times L^2(-h,0;V) \times L^2(0,T;V^*) \ni (\phi^0,\phi^1,k) \mapsto x \in L^2(-h,T;V) \cap W^{1,2}(0,T;V^*)$ is continuous.

Proof. 1) We know that $x$ belongs to $L^2(0,T;D(A_0)) \cap W^{1,2}(0,T;H)$ from Proposition 2.2. Let $(\phi_i^0,\phi_i^1,k_i) \in F \times L^2(-h,0;D(A_0)) \times L^2(0,T;H)$, and let $x_i$ be the solution of (RSE) with $(\phi_i^0,\phi_i^1,k_i)$ in place of $(\phi^0,\phi^1,k)$ for $i = 1,2$. Then in view of Proposition 2.2 we have
$$
(4.1)\qquad
\begin{aligned}
\|x_1-x_2\|_{L^2(-h,T;D(A_0))\cap W^{1,2}(0,T;H)}
&\le C_1'\big\{\|\phi_1^0-\phi_2^0\|_F + \|\phi_1^1-\phi_2^1\|_{L^2(-h,0;D(A_0))} + \|f(\cdot,x_1)-f(\cdot,x_2)\|_{L^2(0,T;H)} + \|k_1-k_2\|_{L^2(0,T;H)}\big\}\\
&\le C_1'\big\{\|\phi_1^0-\phi_2^0\|_F + \|\phi_1^1-\phi_2^1\|_{L^2(-h,0;D(A_0))} + \|k_1-k_2\|_{L^2(0,T;H)} + L\|x_1-x_2\|_{L^2(0,T;V)}\big\}.
\end{aligned}
$$
Since
$$
x_1(t)-x_2(t) = \phi_1^0-\phi_2^0 + \int_0^t\big(\dot x_1(s)-\dot x_2(s)\big)\,ds,
$$
we get
$$
\|x_1-x_2\|_{L^2(0,T;H)} \le \sqrt T\,|\phi_1^0-\phi_2^0| + \frac{T}{\sqrt 2}\|x_1-x_2\|_{W^{1,2}(0,T;H)}.
$$
Hence, arguing as in (2.3), we get
$$
(4.2)\qquad
\begin{aligned}
\|x_1-x_2\|_{L^2(0,T;V)}
&\le C_0\|x_1-x_2\|_{L^2(0,T;D(A_0))}^{1/2}\|x_1-x_2\|_{L^2(0,T;H)}^{1/2}\\
&\le C_0\|x_1-x_2\|_{L^2(0,T;D(A_0))}^{1/2}\Big\{T^{1/4}|\phi_1^0-\phi_2^0|^{1/2} + \Big(\frac{T}{\sqrt 2}\Big)^{1/2}\|x_1-x_2\|_{W^{1,2}(0,T;H)}^{1/2}\Big\}\\
&\le C_0T^{1/4}|\phi_1^0-\phi_2^0|^{1/2}\|x_1-x_2\|_{L^2(0,T;D(A_0))}^{1/2} + C_0\Big(\frac{T}{\sqrt 2}\Big)^{1/2}\|x_1-x_2\|_{L^2(0,T;D(A_0))\cap W^{1,2}(0,T;H)}\\
&\le 2^{-7/4}C_0|\phi_1^0-\phi_2^0| + 2C_0\Big(\frac{T}{\sqrt 2}\Big)^{1/2}\|x_1-x_2\|_{L^2(0,T;D(A_0))\cap W^{1,2}(0,T;H)}.
\end{aligned}
$$
Combining (4.1) and (4.2) we obtain
$$
(4.3)\qquad
\begin{aligned}
\|x_1-x_2\|_{L^2(-h,T;D(A_0))\cap W^{1,2}(0,T;H)}
&\le C_1'\big\{\|\phi_1^0-\phi_2^0\|_F + \|\phi_1^1-\phi_2^1\|_{L^2(-h,0;D(A_0))} + \|k_1-k_2\|_{L^2(0,T;H)}\\
&\qquad + 2^{-7/4}C_0L|\phi_1^0-\phi_2^0| + 2C_0(T/\sqrt 2)^{1/2}L\|x_1-x_2\|_{L^2(0,T;D(A_0))\cap W^{1,2}(0,T;H)}\big\}.
\end{aligned}
$$
Suppose that $(\phi_n^0,\phi_n^1,k_n) \to (\phi^0,\phi^1,k)$ in $F \times L^2(-h,0;D(A_0)) \times L^2(0,T;H)$, and let $x_n$ and $x$ be the solutions of (RSE) with $(\phi_n^0,\phi_n^1,k_n)$ and $(\phi^0,\phi^1,k)$, respectively. Let $0 < T_1 \le T$ be such that
$$
2C_0C_1'(T_1/\sqrt 2)^{1/2}L < 1.
$$
Then by virtue of (4.3) with $T$ replaced by $T_1$ we see that $x_n \to x$ in $L^2(-h,T_1;D(A_0)) \cap W^{1,2}(0,T_1;H)$. This implies that $(x_n(T_1),(x_n)_{T_1}) \to (x(T_1),x_{T_1})$ in $F \times L^2(-h,0;D(A_0))$. Hence the same argument shows that $x_n \to x$ in
$$
L^2(T_1,\min\{2T_1,T\};D(A_0)) \cap W^{1,2}(T_1,\min\{2T_1,T\};H).
$$
Repeating this process, we conclude that $x_n \to x$ in $L^2(-h,T;D(A_0)) \cap W^{1,2}(0,T;H)$.
2) From Proposition 2.2 or 2.4 we have
$$
\begin{aligned}
\|x_1-x_2\|_{L^2(-h,T;V)\cap W^{1,2}(0,T;V^*)}
&\le C_1'\big\{|\phi_1^0-\phi_2^0| + \|\phi_1^1-\phi_2^1\|_{L^2(-h,0;V)} + \|f(\cdot,x_1)-f(\cdot,x_2)\|_{L^2(0,T;V^*)} + \|k_1-k_2\|_{L^2(0,T;V^*)}\big\}\\
&\le C_1'\big\{|\phi_1^0-\phi_2^0| + \|\phi_1^1-\phi_2^1\|_{L^2(-h,0;V)} + \|k_1-k_2\|_{L^2(0,T;V^*)} + L\|x_1-x_2\|_{L^2(0,T;V)}\big\}.
\end{aligned}
$$
Hence, in virtue of (4.2), and since the embedding $L^2(-h,T;D(A_0)) \cap W^{1,2}(0,T;H) \subset L^2(-h,T;V) \cap W^{1,2}(0,T;V^*)$ is continuous, we can obtain the result of 2) in a similar way to 1). $\square$
Theorem 4.2. Assume that $U_0 \ne \emptyset$. Then there exists a time optimal control.

Proof. Let $t_n \to t_0+0$, let $u_n$ be admissible controls, and suppose that the trajectory $x_n$ corresponding to $u_n$ reaches $W$. Let $F$ and $B$ be the Nemytskii operators corresponding to the maps $f$ and $B$, which are defined by
$$
(Fu)(\cdot) = f(\cdot,x_u), \qquad (Bu)(\cdot) = Bu(\cdot),
$$
respectively. Then
$$
(4.4)\qquad x_n(t_n) = x(t_n;\phi,0) + \int_0^{t_0}W(t_n-s)\big((F+B)u_n\big)(s)\,ds + \int_{t_0}^{t_n}W(t_n-s)\big((F+B)u\big)(s)\,ds,
$$
where
$$
x(t_n;\phi,0) = W(t_n)\phi^0 + \int_{-h}^{0}U_{t_n}(s)\phi^1(s)\,ds.
$$
From Proposition 2.4 it follows that
$$
(4.5)\qquad x(t_n;\phi,0) \to x(t_0;\phi,0)\quad\text{strongly in}\ H.
$$
The third term in (4.4) tends to zero as $t_n \to t_0+0$ from the fact that
$$
(4.6)\qquad \Big|\int_{t_0}^{t_n}W(t_n-s)\big((F+B)u\big)(s)\,ds\Big|
\le \Big(\sup_{t\in[0,T]}\|W(t)\|\Big)\big\{LC_2'\big(|\phi^0| + \|\phi^1\|_{L^2(0,T;V)} + \|u\|_{L^2(0,T;Y)}\big) + |f(0)| + \|B\|\,\|u\|_{L^2(0,T;Y)}\big\}(t_n-t_0)^{1/2}.
$$
By the definition of the fundamental solution $W(t)$ it holds that
$$
\begin{aligned}
W(t+\epsilon) - S(\epsilon)W(t)
&= S(t+\epsilon) + \int_0^{t+\epsilon}S(t+\epsilon-s)\Big\{A_1W(s-h) + \int_{-h}^{0}a(\tau)A_2W(s+\tau)\,d\tau\Big\}ds\\
&\qquad - S(\epsilon)\Big\{S(t) + \int_0^t S(t-s)\Big\{A_1W(s-h) + \int_{-h}^{0}a(\tau)A_2W(s+\tau)\,d\tau\Big\}ds\Big\}\\
&= \int_t^{t+\epsilon}S(t+\epsilon-s)\Big\{A_1W(s-h) + \int_{-h}^{0}a(\tau)A_2W(s+\tau)\,d\tau\Big\}ds\\
&= K_1(t+\epsilon,t) + K_2(t+\epsilon,t).
\end{aligned}
$$
Hence, since
$$
W(t_n-s) = S(t_n-t_0)W(t_0-s) + K_1(t_n-s,t_0-s) + K_2(t_n-s,t_0-s),
$$
the second term of (4.4) is represented as
$$
(4.7)\qquad \int_0^{t_0}S(t_n-t_0)W(t_0-s)\big((F+B)u_n\big)(s)\,ds
+ \int_0^{t_0}\big(K_1(t_n-s,t_0-s) + K_2(t_n-s,t_0-s)\big)\big((F+B)u_n\big)(s)\,ds.
$$
The second term of (4.7) tends to zero as $t_n \to t_0$ in terms of Lemma 3.3.

We denote $x_n(t_n)$ by $w_n$. Since $W$ and $U_{ad}$ are weakly compact, there exist $u_0 \in U_{ad}$ and $w_0 \in W$ such that we may assume $w\text{-}\lim u_n = u_0$ in $U_{ad}$ and $w\text{-}\lim w_n = w_0$ in $L^2 \cap W^{1,2}$.

Let $p \in H$. Then $S^*(t_n-t_0)p \to p$ strongly in $H$, and by (F1) and Theorem 4.1,
$$
(4.8)\qquad W(t_0-\cdot)\big((F+B)u_n\big)(\cdot) \to W(t_0-\cdot)\big((F+B)u_0\big)(\cdot)
$$
weakly in $L^2(0,T;V)$. Hence from (4.5)-(4.8) it follows that
$$
(w_0,p) = (x(t_0;\phi,0),p) + \int_0^{t_0}\big(W(t_0-s)((F+B)u_0)(s),p\big)\,ds
$$
by letting $n \to \infty$. Since $p$ is arbitrary, we have
$$
w_0 = x(t_0;\phi,0) + \int_0^{t_0}W(t_0-s)\big((F+B)u_0\big)(s)\,ds \in W
$$
and hence $w_0$ is the trajectory corresponding to $u_0$, i.e., $u_0 \in U_0$. $\square$

Now we consider the case where the target set $W$ is a singleton. Suppose that $W = \{w_0\}$ with $\phi^0 \ne w_0$ and $\phi^1(s) \ne w_0$ for some $s \in [-h,0)$. Then we can choose a decreasing sequence $\{W_n\}$ of weakly compact sets with nonempty interior such that
$$
(4.9)\qquad w_0 \in \bigcap_{n=1}^{\infty}W_n \quad\text{and}\quad \operatorname{dist}(w_0,W_n) = \sup_{x\in W_n}|x-w_0| \to 0\ (n\to\infty).
$$
Define
$$
U_0^n = \{u \in U_{ad} : x_u(t) \in W_n\ \text{for some}\ t \in [0,T]\}.
$$
Then we may assume that $u_n$ is the time optimal control with the optimal time $t_n$ for the target set $W_n$, $n = 1,2,\ldots$.

Theorem 4.3. Let $\{W_n\}$ be a sequence of closed convex sets in $X$ satisfying the condition (4.9), and let $U_0^n \ne \emptyset$. Then there exists a time optimal control $u_0$, with the optimal time $t_0 = \sup_{n\ge 1}\{t_n\}$, for the point target set $\{w_0\}$, which is given by the weak limit of some subsequence of $\{u_n\}$ in $L^2(0,t_0;Y)$.

Proof. Since (4.9) is satisfied and $U_{ad}$ is weakly compact, there exist $w_n = x_n(t_n) \in W_n$ with $w_n \to w_0$ strongly in $H$. Since $U_{ad}$ is weakly compact, there exists $u_0 \in U_{ad}$ such that $u_n \to u_0$ weakly in $L^2(0,t_0;Y)$. Thus, by the argument used in the proof of Theorem 4.2, we can easily prove that $u_0$ is the time optimal control and $t_0$ is the optimal time for the target $\{w_0\}$. $\square$
Remark 1. Let $x_u$ be the solution of (RSC) corresponding to $u$. Then the mapping $u \mapsto x_u$ is compact from $L^2(0,T;Y)$ to $L^2(0,T;H)$. We define the solution mapping $S$ from $L^2(0,T;Y)$ to $L^2(0,T;H)$ by
$$(Su)(t) = x_u(t), \quad u \in L^2(0,T;Y).$$
In virtue of Proposition 2.4,
$$\|Su\|_{L^2(0,T;V)\cap W^{1,2}(0,T;V^*)} = \|x_u\| \le C_2'\big\{|x_0| + \|Bu\|_{L^2(0,T;H)}\big\}.$$
Hence if $u$ is bounded in $L^2(0,T;Y)$, then so is $x_u$ in $L^2(0,T;V) \cap W^{1,2}(0,T;V^*)$. Since $V$ is compactly imbedded in $H$ by assumption, the imbedding $L^2(0,T;V) \cap W^{1,2}(0,T;V^*) \subset L^2(0,T;H)$ is also compact in view of Theorem 2 of J. P. Aubin [1]. Hence, the mapping $u \mapsto Su = x_u$ is compact from $L^2(0,T;Y)$ to $L^2(0,T;H)$.
Since $\{x_n\}$ is bounded in $L^2 \cap W^{1,2}$ and the imbedding $L^2 \cap W^{1,2} \subset L^2(0,T;H)$ is compact, it holds that $x_n \to x$ strongly in $L^2(0,T;H)$. From (F1) and Lemma 3.1 we see that $F$ is a compact operator from $L^2(0,T;Y)$ to $L^2(0,T;H)$, and hence $Fu_n \to Fu_0$ strongly in $L^2(0,T;V^*)$. Therefore $(Fu_n, x^*) \to (Fu_0, x^*)$.
TIME OPTIMAL CONTROL PROBLEM 97
References
1. J. P. Aubin, Un théorème de compacité, C. R. Acad. Sci. 256 (1963), 5042-5044.
2. G. Di Blasio, K. Kunisch and E. Sinestrari, $L^2$-regularity for parabolic partial integrodifferential equations with delay in the highest-order derivatives, J. Math. Anal. Appl. 102 (1984), 38-57.
3. J. M. Jeong, Retarded functional differential equations with $L^1$-valued controller, Funkcialaj Ekvacioj 36 (1993), 71-93.
4. J. Y. Park, J. M. Jeong and Y. C. Kwun, Regularity and controllability for semilinear control system, Indian J. Pure Appl. Math. 29(3) (1998), 239-252.
5. S. Nakagiri, Optimal control of linear retarded systems in Banach spaces, J. Math. Anal. Appl. 120(1) (1986), 169-210.
6. H. Tanabe, Fundamental solutions for linear retarded functional differential equations in Banach space, Funkcialaj Ekvacioj 35(1) (1992), 149-177.
Department of Mathematics,
Pusan National University,
Pusan 609-739, Korea
Division of Mathematical Sciences,
Pukyong National University,
Pusan 608-737, Korea
Department of Mathematics,
Pusan National University,
Pusan 609-739, Korea
SPLINE HAZARD RATE ESTIMATION
USING CENSORED DATA
Myung Hwan Na
J. KSIAM Vol.3, No.2, 99-106, 1999
Abstract
In this paper, a spline hazard rate model for randomly censored data is introduced. The unknown hazard rate function is expressed as a linear combination of B-splines which is constrained to be linear (or constant) in the tails. We determine the coefficients of the linear combination by maximizing the likelihood function. The number of knots is determined by the Bayesian Information Criterion. Examples using simulated data are used to illustrate the performance of this method in the presence of random censoring.
1 Introduction
Reliability engineers, biostatisticians, and actuaries are all interested in lifetimes. In particular, they are interested in five lifetime distribution representations: the hazard rate function $h(t)$, the cumulative hazard rate function $H(t)$, the reliability function $R(t)$, the probability density function $f(t)$, and the mean residual life function $m(t)$. Perhaps the hazard rate function is the most popular of the five representations for lifetime modelling. The hazard rate function is defined as
$$h(t) = \frac{f(t)}{R(t)}, \quad t \ge 0.$$
Thus, the hazard rate function is the ratio of the probability density function to the reliability function. Throughout this paper we assume that the hazard rate function satisfies two conditions:
$$\int_0^{\infty} h(t)\,dt = \infty, \qquad h(t) > 0 \ \text{ for all } t > 0. \tag{1}$$
Smooth estimation of the hazard rate function is a very important topic in both theoretical and applied statistics. Anderson and Senthilselvan (1980) used quadratic splines with a discontinuity in the slope at the times of death. O'Sullivan (1988) used smoothing splines for the log-hazard function. Senthilselvan (1987) used a hyperbolic spline function which is continuous, with its first derivative discontinuous only at a finite number of points. Kooperberg et al. (1995) used cubic splines and two additional log terms for the log-hazard function. The discussion section of Abrahamowicz et al. (1992) contains a good review of many of the papers on the use of splines to estimate a density in the presence of censored data.
Key Words: Hazard Rate, Spline, Censoring, Simulation
In this paper we introduce a spline hazard rate model for randomly censored data. The unknown hazard rate function is expressed as a function from a space of cubic splines constrained to be linear (or constant) in the tails. The coefficients of the linear combination are determined by maximizing the likelihood function. The number of knots is determined by the Bayesian Information Criterion (BIC). Examples using simulated data are used to illustrate the performance of this method in the presence of random censoring.
Section 2 is devoted to an introduction to the spline model for the hazard rate function. A maximum likelihood estimation procedure is discussed in Section 3. Section 4 contains the knot deletion procedure. Section 5 contains examples using simulated data.
2 SPLINE HAZARD RATE MODEL
Let $K$ denote a nonnegative integer. When $K \ge 1$, let $\xi_1, \cdots, \xi_K$ be a (simple) knot sequence in $[0,\infty)$ with $0 < \xi_1 < \cdots < \xi_K < \infty$. Let $S_0$ denote the collection of twice continuously differentiable functions $s$ on $[0,\infty)$ such that the restriction of $s$ to each of the intervals $[0,\xi_1], [\xi_1,\xi_2], \cdots, [\xi_K,\infty)$ is a cubic polynomial, i.e., $s$ is a polynomial of order 4 (or less) on each of the intervals. Then $S_0$ is the $(K+4)$-dimensional vector space of cubic splines corresponding to the knot positions $\xi_1, \cdots, \xi_K$. Let $S$ denote the subspace of $S_0$ consisting of the natural cubic splines with knots at $\xi_1, \cdots, \xi_K$, i.e., the functions in $S_0$ that are linear (or constant) on $[0,\xi_1]$ and $[\xi_K,\infty)$. This linear vector space is $K$-dimensional; let $B_1, \cdots, B_K$ be a basis of $S$. When $K = 0$, there are no basis functions depending on $t$. For an exhaustive treatment of splines, the reader should consult Greville (1969), de Boor (1978), and Schumaker (1981). For statistical applications we refer to Smith (1979), and Wegman and Wright (1983).
Let $\Theta$ denote the collection of all column vectors $\theta = (\theta_1, \cdots, \theta_K)^t \in \mathbb{R}^K$ such that $\sum_{j=1}^K \theta_j B_j(t) > 0$ for all $t > 0$. Given $\theta \in \Theta$, consider the model
$$h(t|\theta) = \sum_{j=1}^K \theta_j B_j(t), \quad t > 0$$
for the hazard rate function. For this spline model, the corresponding cumulative hazard rate function, reliability function, and probability density function are respectively given by
$$H(t|\theta) = \sum_{j=1}^K \theta_j C_j(t),$$
$$R(t|\theta) = \exp\Big(-\sum_{j=1}^K \theta_j C_j(t)\Big),$$
$$f(t|\theta) = \Big(\sum_{j=1}^K \theta_j B_j(t)\Big)\exp\Big(-\sum_{j=1}^K \theta_j C_j(t)\Big),$$
where $C_j(t) = \int_0^t B_j(u)\,du$.
In particular, when $K = 0$, this spline model reduces exactly to the exponential distribution. When $K = 1$, it gives exactly the hazard rate function of the Rayleigh distribution. The MLE $\hat\theta$ is obtained by maximizing the likelihood function. We refer to $\hat h(\cdot) = h(\cdot|\hat\theta)$ as the spline hazard rate estimate.
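As a quick numerical illustration of these formulas (a sketch only, not the estimation code of this paper), the model can be evaluated for a hypothetical two-function basis $B_1(t) = 1$, $B_2(t) = t$, for which $C_1(t) = t$ and $C_2(t) = t^2/2$ have closed forms:

```python
import math

# Sketch of the spline hazard model with a hypothetical basis B1(t)=1, B2(t)=t,
# so h(t|theta) = theta1 + theta2*t; C_j(t) = int_0^t B_j(u) du as in the text.
def hazard(t, theta):
    return theta[0] + theta[1] * t              # h(t|theta) = sum_j theta_j B_j(t)

def cum_hazard(t, theta):
    return theta[0] * t + theta[1] * t * t / 2  # C1(t) = t, C2(t) = t^2/2

def reliability(t, theta):
    return math.exp(-cum_hazard(t, theta))      # R(t|theta) = exp(-sum_j theta_j C_j(t))

def density(t, theta):
    return hazard(t, theta) * reliability(t, theta)

theta = (0.5, 1.0)  # must keep h > 0 on (0, infinity), i.e. theta in Theta
```

Setting $\theta_2 = 0$ recovers the exponential hazard and $\theta_1 = 0$ the Rayleigh hazard, matching the special cases noted in the text.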
3 MAXIMUM LIKELIHOOD ESTIMATION
Let $T_1, T_2, \cdots, T_n$ be independent identically distributed (i.i.d.) with a life distribution function (d.f.) $F$, and let $C_1, C_2, \cdots, C_n$ be i.i.d. with d.f. $G$, where $C_i$ is the censoring time associated with $T_i$. In the random censoring case we can only observe $(Y_1, \delta_1), \cdots, (Y_n, \delta_n)$, where $Y_i = \min(T_i, C_i)$ and $\delta_i = I(T_i \le C_i)$, $1 \le i \le n$. It is assumed that $T_i$ and $C_i$ are independent. The random variable $Y_i$ is said to be uncensored or censored according as $\delta_i = 1$ or $\delta_i = 0$. Note that the partial likelihood corresponding to the datum $(y_i, \delta_i)$ equals $[f(y_i)]^{\delta_i}[1 - F(y_i)]^{1-\delta_i}$ (see Miller, 1981), so the log-likelihood for the datum $(y_i, \delta_i)$ equals
$$\ell(y_i, \delta_i) = \delta_i \log h(y_i) - H(y_i).$$
Thus the log-likelihood function corresponding to the spline model is given by
$$l(\theta) = \sum_{i=1}^n \ell(y_i, \delta_i) = \sum_{i=1}^n \delta_i \log h(y_i) - \sum_{i=1}^n H(y_i).$$
Moreover,
$$\frac{\partial}{\partial \theta_j} l(\theta) = \sum_{i=1}^n \frac{\delta_i B_j(y_i)}{h(y_i)} - \sum_{i=1}^n C_j(y_i), \quad 1 \le j \le K,$$
and
$$\frac{\partial^2}{\partial \theta_j \partial \theta_k} l(\theta) = -\sum_{i=1}^n \frac{\delta_i B_j(y_i) B_k(y_i)}{(h(y_i))^2}, \quad 1 \le j, k \le K.$$
It follows from the last result that $l(\theta)$ is a concave function. Thus the MLE is unique if it exists.
Let $S(\theta)$ denote the score function of $l(\theta)$, that is, the $K$-dimensional column vector with entries $\partial l(\theta)/\partial \theta_j$, and let $H(\theta)$ denote the Hessian of $l(\theta)$, the $K \times K$ matrix with entries $\partial^2 l(\theta)/\partial \theta_j \partial \theta_k$. The maximum likelihood equation for $\theta$ is $S(\theta) = 0$. We use the Newton-Raphson method with step-halving for computing $\hat\theta$: starting from an initial guess $\theta^{(0)}$, we iteratively determine $\theta^{(m+1)}$ by the formula
$$\theta^{(m+1)} = \theta^{(m)} + \frac{1}{2^M} I^{-1}(\theta^{(m)}) S(\theta^{(m)}),$$
where $I(\theta) = -H(\theta)$ and $M$ is the smallest nonnegative integer such that
$$l\Big(\theta^{(m)} + \frac{1}{2^M} I^{-1}(\theta^{(m)}) S(\theta^{(m)})\Big) \ge l\Big(\theta^{(m)} + \frac{1}{2^{M+1}} I^{-1}(\theta^{(m)}) S(\theta^{(m)})\Big).$$
We stop the iterations when $l(\theta^{(m+1)}) - l(\theta^{(m)}) \le 10^{-6}$.
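The iteration can be sketched for the simplest case $K = 1$ with $B_1(t) = 1$, so that $h = \theta$, $H(t) = \theta t$, and the MLE has the closed form $\sum_i \delta_i / \sum_i y_i$. The data below are hypothetical and the basis choice is only for illustration:

```python
import math

# Newton-Raphson with step-halving for the censored log-likelihood, sketched
# for K = 1 (exponential hazard h = theta). Hypothetical toy data:
y     = [0.5, 1.2, 0.3, 2.0, 0.9, 1.5]   # observed times Y_i
delta = [1,   0,   1,   1,   0,   1  ]   # 1 = uncensored, 0 = censored

def loglik(th): return sum(d * math.log(th) - th * t for t, d in zip(y, delta))
def score(th):  return sum(d / th - t for t, d in zip(y, delta))
def info(th):   return sum(d / th ** 2 for d in delta)   # I(theta) = -Hessian

th = 1.0
for _ in range(100):
    step = score(th) / info(th)
    M = 0
    # halve the step until the likelihood stops improving, as in the text
    while loglik(th + step / 2 ** M) < loglik(th + step / 2 ** (M + 1)) and M < 50:
        M += 1
    new = th + step / 2 ** M
    if loglik(new) - loglik(th) <= 1e-6:
        th = new
        break
    th = new
```

For this case the iteration should converge to the closed-form MLE $\sum\delta_i/\sum y_i$.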
4 KNOT DELETION PROCEDURE
In this section we determine rules for selecting the number and location of the knots. To determine them, we can directly apply the stepwise knot deletion method of Smith (1982): first place enough initial knots appropriately and then delete unnecessary knots. According to Stone (1991), for twice continuously differentiable $h(t)$, an optimal rate of convergence $n^{-2/5}$ can be achieved if the number of knots is increased proportionally to $n^{1/5}$. So we use the integer $K$ closest to $4n^{1/5}$ as the number of initial knots. The initial knot placement rule is as follows: place two knots at the first and the last order statistics, and the remaining knots as closely as possible to the equi-spaced percentiles. For example, if the number of initial knots is five, they are placed at the 0, 25, 50, 75, and 100 percentiles.
First, consider the problem that the estimate of the hazard rate function may take negative values. The estimate may take negative values for large values of $K$, or on the intervals $[0,\xi_1]$ and $[\xi_K,\infty)$. So we use the following method to satisfy conditions (1) and (2).
(i) If $\theta_1(\xi_1 + \xi_2 + \xi_3) + \theta_2 < 0$, we set $\theta_1 = 0$, i.e., the function is constant on $[0,\xi_1]$.
(ii) If $\theta_K < 0$, we set $\theta_K = 0$, i.e., the function is constant on $[\xi_K,\infty)$.
(iii) If the minimum value of $\hat h(x_i)$, $i = 1, \cdots, n$, is negative, we delete the knot closest to $x_{j^*}$, where $x_{j^*}$ is the argument of the minimum value of $\hat h(x_i)$.
Now consider the problem of deleting unnecessary knots. Following Smith (1982), the absence of a knot $\xi$ from a spline $\sum_{j=1}^K \theta_j B_j(t)$ means that
$$\sum_{j=1}^K \theta_j \delta_j(\xi) = 0,$$
where $\delta_j(\xi) = B_j^{(3)}(\xi-) - B_j^{(3)}(\xi+)$; here $B_j^{(3)}(\xi-)$ and $B_j^{(3)}(\xi+)$ are, respectively, the left- and right-hand limits of $\partial^3 B_j(t)/\partial t^3$ at $\xi$.
At any step we compute
$$\lambda_k = \frac{|\gamma_k|}{SE(\gamma_k)}, \quad k = 1, \cdots, K,$$
where $\gamma_k = \sum_{j=1}^K \hat\theta_j \delta_j(\xi_k)$ and $SE(\gamma_k) = \big\{\delta^t(\xi_k)\,(I(\hat\theta))^{-1}\,\delta(\xi_k)\big\}^{1/2}$, and we delete the knot having the smallest value of $\lambda_k$. In this manner, we arrive at a sequence of models
indexed by $J$, which ranges from 0 to $K$. Let $IL = 1$ when the estimated function is constant on $[0,\xi_1]$, and $IL = 0$ otherwise. Let $IR = 1$ when the estimated function is constant on $[\xi_K,\infty)$, and $IR = 0$ otherwise. Let $l_J$ denote the log-likelihood function for the $J$th model evaluated at the MLE for that model. Let $BIC = -2l_J + \log(n)(K - J - IL - IR)$ be the Bayesian Information Criterion (Schwarz, 1978) for the $J$th model. We choose the model corresponding to the value $\hat J$ of $J$ that minimizes $BIC$. This model has $K - \hat J$ knots and $K - \hat J - IL - IR$ free parameters.
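The selection step can be sketched as follows, with hypothetical maximized log-likelihoods $l_J$ standing in for the values produced by the knot-deletion sequence:

```python
import math

# BIC selection sketch over the nested models J = 0, ..., K produced by knot
# deletion. The maximized log-likelihoods below are hypothetical numbers,
# standing in for values computed by the Newton-Raphson fit of Section 3.
n, K, IL, IR = 200, 8, 0, 0
l_J = [-310.2, -310.5, -311.0, -312.4, -315.5, -321.7, -330.0, -352.3, -401.8]
# BIC = -2*l_J + log(n)*(K - J - IL - IR) for the Jth model
bic = [-2 * l + math.log(n) * (K - J - IL - IR) for J, l in enumerate(l_J)]
J_hat = min(range(len(bic)), key=bic.__getitem__)
# The chosen model has K - J_hat knots and K - J_hat - IL - IR free parameters.
```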
5 EXAMPLES
The spline hazard rate estimation procedure described in Section 2 is applied to the Weibull, Gamma, and Dhillon distributions. The density functions are respectively given by
$$f(t) = \frac{\beta}{\theta}\Big(\frac{t}{\theta}\Big)^{\beta-1}\exp\Big(-\Big(\frac{t}{\theta}\Big)^{\beta}\Big),$$
$$f(t) = \frac{1}{\Gamma(\alpha)}\,\frac{t^{\alpha-1}}{\theta^{\alpha}}\exp\Big(-\frac{t}{\theta}\Big),$$
$$f(t) = \lambda\beta(\lambda t)^{\beta-1}\exp\big((\lambda t)^{\beta}\big)\exp\big(1 - \exp((\lambda t)^{\beta})\big).$$
The simulations are performed using IMSL subroutines in FORTRAN.
In Figure 1, we show the true hazard rate function (solid) corresponding to the Weibull distribution with parameters $\beta = 0.8$ and $\theta = 1$. The dotted line corresponds to the estimated hazard rate function based on a sample of size 200. Figure 2 is similar to Figure 1, but the underlying distribution is the Gamma distribution: the data for Figure 2 are from the Gamma distribution with parameters $\alpha = 2$ and $\theta = 1$, based on a sample of size 200. In the figure we show the true hazard rate function corresponding to this Gamma distribution together with the estimate of the hazard rate function based on the spline model. In Figure 3, we show the result of a similar calculation based on a sample of size 200 from the Dhillon distribution with parameters $\lambda = 1$ and $\beta = 1$, i.e., the extreme value distribution. In the figure we show the true hazard rate function corresponding to this Dhillon distribution together with the estimate of the hazard rate function based on the spline model.
From these examples, we have found that the spline hazard rate estimate yields a reasonable estimate of the hazard rate function.
Figure 1. Spline hazard rate estimate for the Weibull distribution with $\beta = 0.8$ and $\theta = 1$, based on a sample of size 200.
Figure 2. Spline hazard rate estimate for the Gamma distribution with $\alpha = 2$ and $\theta = 1$, based on a sample of size 200.
Figure 3. Spline hazard rate estimate for the Dhillon distribution with $\lambda = 1$ and $\beta = 1$, based on a sample of size 200.
REFERENCES
1. Abrahamowicz, M., Ciampi, A. and Ramsay, J. O. (1992): "Nonparametric Density Estimation for Censored Survival Data: Regression-Spline Approach", The Canadian Journal of Statistics, Vol. 20, 171-185.
2. Anderson, J. A. and Senthilselvan, A. (1980): "Smooth Estimates for the Hazard Function", Journal of the Royal Statistical Society, Ser. B, Vol. 42, 322-327.
3. de Boor, C. (1978): A Practical Guide to Splines, Springer-Verlag, New York.
4. Greville, T. N. E. (1969): Theory and Application of Spline Functions, Academic Press, New York.
5. Kooperberg, C., Stone, C. J. and Truong, Y. K. (1995): "Hazard Regression", Journal of the American Statistical Association, Vol. 90, 78-94.
6. Miller, R. (1981): Survival Analysis, John Wiley & Sons, New York.
7. O'Sullivan, F. (1988): "Fast Computation of Fully Automated Log-Density and Log-Hazard Estimates", SIAM Journal on Scientific and Statistical Computing, Vol. 9, 363-379.
8. Schumaker, L. L. (1981): Spline Functions: Basic Theory, Wiley, New York.
9. Schwarz, G. (1978): "Estimating the Dimension of a Model", Annals of Statistics, Vol. 6, 461-464.
10. Senthilselvan, A. (1987): "Penalized Likelihood Estimation of Hazard and Intensity Functions", Journal of the Royal Statistical Society, Ser. B, Vol. 49, 170-174.
11. Smith, P. L. (1979): "Splines as a Useful and Convenient Statistical Tool", The American Statistician, Vol. 33, pp. 57-62.
12. Smith, P. L. (1982): "Curve Fitting and Modeling with Splines Using Statistical Variable Selection Methods", NASA, Langley Research Center, Hampton, VA, NASA Report 166034.
13. Stone, C. J. (1991): Generalized Multivariate Regression Splines, Technical Report No. 318, Dept. of Statistics, Univ. of California, Berkeley.
14. Wegman, E. J. and Wright, I. W. (1983): "Splines in Statistics", Journal of the American Statistical Association, Vol. 78, 351-366.
Department of Statistics, Seoul National
University, Seoul 151-742, Korea
e-mail: [email protected]
An Ostrowski Type Inequality for Weighted
Mappings with Bounded Second Derivatives
J. Roumeliotis, P. Cerone, S.S. Dragomir
J. KSIAM Vol.3, No.2, 107-119, 1999
Abstract
A weighted integral inequality of Ostrowski type for mappings whose second derivatives are bounded is proved. The inequality is extended to account for applications in numerical integration.
1 Introduction
In 1938, Ostrowski (see for example Mitrinović et al. (1994, p. 468)) proved the following inequality.
THEOREM 1.1. Let $f : I \subseteq \mathbb{R} \to \mathbb{R}$ be a differentiable mapping in $I^o$ ($I^o$ is the interior of $I$), and let $a, b \in I^o$ with $a < b$. If $f' : (a,b) \to \mathbb{R}$ is bounded on $(a,b)$, i.e., $\|f'\|_\infty := \sup_{t\in(a,b)} |f'(t)| < \infty$, then we have the inequality
$$\bigg| \frac{1}{b-a}\int_a^b f(t)\,dt - f(x) \bigg| \le \bigg[\frac{1}{4} + \frac{\big(x - \frac{a+b}{2}\big)^2}{(b-a)^2}\bigg](b-a)\|f'\|_\infty \tag{1.1}$$
for all $x \in (a,b)$.
The constant $\frac{1}{4}$ is sharp in the sense that it cannot be replaced by a smaller one.
A similar result for twice differentiable mappings (Cerone et al. 1998) is given below.
THEOREM 1.2. Let $f : [a,b] \to \mathbb{R}$ be a twice differentiable mapping such that $f'' : (a,b) \to \mathbb{R}$ is bounded on $(a,b)$, i.e., $\|f''\|_\infty := \sup_{t\in(a,b)} |f''(t)| < \infty$. Then we have the inequality
$$\bigg| \frac{1}{b-a}\int_a^b f(t)\,dt - f(x) + \Big(x - \frac{a+b}{2}\Big)f'(x) \bigg| \le \bigg[\frac{1}{24} + \frac{\big(x - \frac{a+b}{2}\big)^2}{2(b-a)^2}\bigg](b-a)^2\|f''\|_\infty \tag{1.2}$$
for all $x \in [a,b]$.
Key Words and Phrases: Ostrowski's inequality, weighted integrals, numerical integration
In this paper, we extend the above result and develop an Ostrowski-type inequality for weighted integrals. Applications to special weight functions and numerical integration are investigated.
2 Preliminaries
In the next section weighted (or product) integral inequalities are constructed. The
weight function (or density) is assumed to be non-negative and integrable over its
entire domain. The following generic quantitative measures of the weight are de�ned.
Definition 2.1. Let $w : (a,b) \to [0,\infty)$ be an integrable function, i.e., $\int_a^b w(t)\,dt < \infty$. Define
$$m_i(a,b) = \int_a^b t^i w(t)\,dt, \quad i = 0, 1, \ldots \tag{2.1}$$
as the $i$th moment of $w$.
Definition 2.2. Define the mean of the interval $[a,b]$ with respect to the density $w$ as
$$\mu(a,b) = \frac{m_1(a,b)}{m_0(a,b)} \tag{2.2}$$
and the variance by
$$\sigma^2(a,b) = \frac{m_2(a,b)}{m_0(a,b)} - \mu^2(a,b). \tag{2.3}$$
3 The Results
3.1 1-point inequality
THEOREM 3.1. Let $f, w : (a,b) \to \mathbb{R}$ be two mappings on $(a,b)$ with the following properties:
1. $\sup_{t\in(a,b)} |f''(t)| < \infty$,
2. $w(t) \ge 0$ for all $t \in (a,b)$,
3. $\int_a^b w(t)\,dt < \infty$.
Then the following inequalities hold:
$$\bigg| \frac{1}{m_0(a,b)}\int_a^b w(t)f(t)\,dt - f(x) + \big(x - \mu(a,b)\big)f'(x) \bigg| \le \frac{\|f''\|_\infty}{2}\Big[\big(x - \mu(a,b)\big)^2 + \sigma^2(a,b)\Big] \tag{3.1}$$
$$\le \frac{\|f''\|_\infty}{2}\bigg(\Big|x - \frac{a+b}{2}\Big| + \frac{b-a}{2}\bigg)^2 \tag{3.2}$$
for all $x \in [a,b]$.
Proof. Define the mapping $K(\cdot,\cdot) : [a,b]^2 \to \mathbb{R}$ by
$$K(x,t) := \begin{cases} \int_a^t (t-u)w(u)\,du, & a \le t \le x, \\ \int_b^t (t-u)w(u)\,du, & x < t \le b. \end{cases}$$
Integrating by parts gives
$$\int_a^b K(x,t)f''(t)\,dt = \int_a^x\!\!\int_a^t (t-u)w(u)f''(t)\,du\,dt + \int_x^b\!\!\int_b^t (t-u)w(u)f''(t)\,du\,dt$$
$$= f'(x)\int_a^b (x-u)w(u)\,du - \int_a^x\!\!\int_a^t w(u)f'(t)\,du\,dt - \int_x^b\!\!\int_b^t w(u)f'(t)\,du\,dt$$
$$= \int_a^b w(t)f(t)\,dt + f'(x)\int_a^b (x-u)w(u)\,du - f(x)\int_a^b w(u)\,du,$$
providing the identity
$$\int_a^b K(x,t)f''(t)\,dt = \int_a^b w(t)f(t)\,dt - m_0(a,b)\,f(x) + m_0(a,b)\big(x - \mu(a,b)\big)f'(x) \tag{3.3}$$
that is valid for all $x \in [a,b]$.
Now taking the modulus of (3.3) we have
$$\bigg| \int_a^b w(t)f(t)\,dt - m_0(a,b)\,f(x) + m_0(a,b)\big(x - \mu(a,b)\big)f'(x) \bigg| = \bigg| \int_a^b K(x,t)f''(t)\,dt \bigg| \le \|f''\|_\infty \int_a^b |K(x,t)|\,dt$$
$$= \|f''\|_\infty \bigg( \int_a^x\!\!\int_a^t (t-u)w(u)\,du\,dt + \int_x^b\!\!\int_b^t (t-u)w(u)\,du\,dt \bigg) = \frac{\|f''\|_\infty}{2}\int_a^b (x-t)^2 w(t)\,dt. \tag{3.4}$$
The last line is computed by reversing the order of integration and evaluating the inner integrals. To obtain the desired result (3.1), observe that
$$\int_a^b (x-t)^2 w(t)\,dt = m_0(a,b)\Big[\big(x - \mu(a,b)\big)^2 + \sigma^2(a,b)\Big].$$
To obtain (3.2) note that
$$\int_a^b (x-t)^2 w(t)\,dt \le \sup_{t\in[a,b]} (x-t)^2\, m_0(a,b) = \max\{(x-a)^2, (x-b)^2\}\, m_0(a,b)$$
$$= \frac{1}{2}\Big[(x-a)^2 + (x-b)^2 + \big|(x-a)^2 - (x-b)^2\big|\Big] m_0(a,b) = \bigg(\Big|x - \frac{a+b}{2}\Big| + \frac{b-a}{2}\bigg)^2 m_0(a,b),$$
which upon substitution into (3.4) furnishes the result. $\Box$
Note also that the inequality (3.1) is valid even for unbounded $w$ or an unbounded interval $[a,b]$. This is not the case with (1.2).
COROLLARY 3.2. The inequality (3.1) is minimized at $x = \mu(a,b)$, producing the generalized "mid-point" inequality
$$\bigg| \frac{1}{m_0(a,b)}\int_a^b w(t)f(t)\,dt - f(\mu(a,b)) \bigg| \le \frac{\|f''\|_\infty\,\sigma^2(a,b)}{2}. \tag{3.5}$$
Proof. Substituting $\mu(a,b)$ for $x$ in (3.1) produces the desired result. Note that $x = \mu(a,b)$ not only minimizes the bound of the inequality (3.1), but also causes the derivative term to vanish. $\Box$
The optimal point (2.2) can be interpreted in many ways. In a physical context, $\mu(a,b)$ represents the centre of mass of a one-dimensional rod with mass density $w$. Equivalently, this point can be viewed as the one which minimizes the error variance for the probability density $w$ (see Barnett et al. (1995) for an application). Finally, (2.2) is also the Gauss node point for a one-point rule (Stroud and Secrest 1966). The bound in (3.5) is directly proportional to the variance of the density $w$: the tightest bound is achieved by sampling at the mean point of the interval $(a,b)$, and its value is governed by the variance.
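A quick numerical check of (3.5) can be made under the simplest assumptions: the uniform weight $w \equiv 1$ on $(0,1)$, for which $m_0 = 1$, $\mu = 1/2$, $\sigma^2 = 1/12$, and $f(t) = \cos t$, for which $\|f''\|_\infty = 1$ and the integral is $\sin 1$:

```python
import math

# Check of (3.5) with w(t) = 1 on (0, 1): m0 = 1, mu = 1/2, sigma^2 = 1/12.
# For f(t) = cos(t), ||f''||_inf = 1 and the exact weighted integral is sin(1).
m0, mu, var = 1.0, 0.5, 1.0 / 12
lhs = abs(math.sin(1.0) / m0 - math.cos(mu))  # |(1/m0) int w f dt - f(mu)|
rhs = var / 2.0                               # ||f''||_inf * sigma^2 / 2
```

Here the left side is about 0.0361 and the bound is $1/24 \approx 0.0417$, so the inequality holds with room to spare.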
3.2 2-point inequality
Here a two-point analogue of (3.1) is developed, where the result is extended to create an inequality with two independent parameters $x_1$ and $x_2$. This is mainly used (Section 5) to find an optimal grid for composite weighted-quadrature rules.
THEOREM 3.3. Let the conditions of Theorem 3.1 hold. Then the following 2-point inequality is obtained:
$$\bigg| \int_a^b w(t)f(t)\,dt - m_0(a,\xi)f(x_1) + m_0(a,\xi)\big(x_1 - \mu(a,\xi)\big)f'(x_1) - m_0(\xi,b)f(x_2) + m_0(\xi,b)\big(x_2 - \mu(\xi,b)\big)f'(x_2) \bigg|$$
$$\le \frac{\|f''\|_\infty}{2}\Big\{ m_0(a,\xi)\Big[\big(x_1 - \mu(a,\xi)\big)^2 + \sigma^2(a,\xi)\Big] + m_0(\xi,b)\Big[\big(x_2 - \mu(\xi,b)\big)^2 + \sigma^2(\xi,b)\Big] \Big\} \tag{3.6}$$
for all $a \le x_1 < \xi < x_2 \le b$.
Proof. Define the mapping $K(\cdot,\cdot,\cdot,\cdot) : [a,b]^4 \to \mathbb{R}$ by
$$K(x_1,x_2,\xi,t) := \begin{cases} \int_a^t (t-u)w(u)\,du, & a \le t \le x_1, \\ \int_\xi^t (t-u)w(u)\,du, & x_1 < t \le x_2, \\ \int_b^t (t-u)w(u)\,du, & x_2 < t \le b. \end{cases}$$
With this kernel, the proof is almost identical to that of Theorem 3.1. Integrating by parts produces the integral identity
$$\int_a^b K(x_1,x_2,\xi,t)f''(t)\,dt = \int_a^b w(t)f(t)\,dt - m_0(a,\xi)f(x_1) + m_0(a,\xi)\big(x_1 - \mu(a,\xi)\big)f'(x_1)$$
$$- m_0(\xi,b)f(x_2) + m_0(\xi,b)\big(x_2 - \mu(\xi,b)\big)f'(x_2). \tag{3.7}$$
Re-arranging and taking bounds produces the result (3.6). $\Box$
COROLLARY 3.4. The optimal locations of the points $x_1, x_2$ and $\xi$ satisfy
$$x_1 = \mu(a,\xi), \quad x_2 = \mu(\xi,b), \quad \xi = \frac{\mu(a,\xi) + \mu(\xi,b)}{2}. \tag{3.8}$$
Proof. By inspection of the right hand side of (3.6), it is obvious that choosing
$$x_1 = \mu(a,\xi) \quad \text{and} \quad x_2 = \mu(\xi,b) \tag{3.9}$$
minimizes this quantity. To find the optimal value for $\xi$, write the expression in braces in (3.6) as
$$2\int_a^b |K(x_1,x_2,\xi,t)|\,dt = m_0(a,\xi)\Big[\big(x_1 - \mu(a,\xi)\big)^2 + \sigma^2(a,\xi)\Big] + m_0(\xi,b)\Big[\big(x_2 - \mu(\xi,b)\big)^2 + \sigma^2(\xi,b)\Big]$$
$$= \int_a^\xi (x_1 - t)^2 w(t)\,dt + \int_\xi^b (x_2 - t)^2 w(t)\,dt. \tag{3.10}$$
Substituting (3.9) into the right hand side of (3.10) and differentiating with respect to $\xi$ gives
$$\frac{d}{d\xi}\int_a^b |K(\mu(a,\xi),\mu(\xi,b),\xi,t)|\,dt = \big(\mu(\xi,b) - \mu(a,\xi)\big)\Big(\xi - \frac{\mu(a,\xi) + \mu(\xi,b)}{2}\Big)w(\xi).$$
Assuming $w(\xi) \ne 0$, this equation possesses only one root. A minimum exists at this root since (3.10) is convex, and so the corollary is proved. $\Box$
Equation (3.8) shows not only where sampling should occur within each subinterval (i.e. $x_1$ and $x_2$), but how the domain should be divided to make up these subintervals ($\xi$).
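For the uniform weight, where $\mu(a,b) = (a+b)/2$, the system (3.8) can be solved by a simple fixed-point sweep. The sketch below uses $(a,b) = (0,1)$, for which the known solution is $\xi = 1/2$, $x_1 = 1/4$, $x_2 = 3/4$:

```python
# Fixed-point solve of (3.8) for the uniform weight on (a, b) = (0, 1), where
# mu(p, q) = (p + q)/2. Each sweep halves the deviation from the fixed point.
a, b = 0.0, 1.0
mu = lambda p, q: (p + q) / 2.0
xi = 0.9  # arbitrary starting guess
for _ in range(200):
    xi = (mu(a, xi) + mu(xi, b)) / 2.0
x1, x2 = mu(a, xi), mu(xi, b)
```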
4 Some Weighted Integral Inequalities
Integration with weight functions are used in countless mathematical problems. Two
main areas are: (i) approximation theory and spectral analysis and (ii) statistical anal-
ysis and the theory of distributions.
In this section (3.1) is evaluated for the more popular weight functions. In each
case (1.2) cannot be used since the weight w(t) or the interval (b � a) is unbounded.
The optimal point (2.2) is easily identi�ed.
4.1 Uniform (Legendre)
Substituting $w(t) = 1$ into (2.2) and (2.3) gives
$$\mu(a,b) = \frac{\int_a^b t\,dt}{\int_a^b dt} = \frac{a+b}{2} \tag{4.1}$$
and
$$\sigma^2(a,b) = \frac{\int_a^b t^2\,dt}{\int_a^b dt} - \Big(\frac{a+b}{2}\Big)^2 = \frac{(b-a)^2}{12}$$
respectively. Substituting into (3.1) produces (1.2). Note that the interval mean is simply the midpoint (4.1).
4.2 Logarithm
This weight is present in many physical problems, the main body of which exhibits some axial symmetry. Special logarithmic rules are used extensively in the Boundary Element Method popularized by Brebbia (see for example Brebbia and Dominguez (1989)). Some applications include bubble cavitation (Blake and Gibson 1987) and viscous drop deformation (Rallison and Acrivos (1978), and more recently Roumeliotis et al. (1997)).
With $w(t) = \ln(1/t)$, $a = 0$, $b = 1$, (2.2) and (2.3) are
$$\mu(0,1) = \frac{\int_0^1 t\ln(1/t)\,dt}{\int_0^1 \ln(1/t)\,dt} = \frac{1}{4}$$
and
$$\sigma^2(0,1) = \frac{\int_0^1 t^2\ln(1/t)\,dt}{\int_0^1 \ln(1/t)\,dt} - \Big(\frac{1}{4}\Big)^2 = \frac{7}{144}$$
respectively. Substituting into (3.1) gives
$$\bigg| \int_0^1 \ln(1/t)f(t)\,dt - f(x) + \Big(x - \frac{1}{4}\Big)f'(x) \bigg| \le \frac{\|f''\|_\infty}{2}\bigg(\frac{7}{144} + \Big(x - \frac{1}{4}\Big)^2\bigg).$$
The optimal point
$$x = \mu(0,1) = \frac{1}{4}$$
is closer to the origin than the midpoint (4.1), reflecting the strength of the log singularity.
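These moments are easy to confirm numerically; the sketch below uses a plain composite mid-point sum, which copes with the integrable singularity at the origin:

```python
import math

# Mid-point-sum check of m0 = 1, mu = 1/4, sigma^2 = 7/144 for w(t) = ln(1/t)
# on (0, 1). The integrand is singular but integrable at t = 0, and mid-point
# nodes never touch the endpoint.
n = 200000
h = 1.0 / n
m0 = m1 = m2 = 0.0
for i in range(n):
    t = (i + 0.5) * h
    w = math.log(1.0 / t)
    m0 += w * h
    m1 += t * w * h
    m2 += t * t * w * h
mu = m1 / m0
var = m2 / m0 - mu * mu
```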
4.3 Jacobi
Substituting $w(t) = 1/\sqrt{t}$, $a = 0$, $b = 1$ into (2.2) and (2.3) gives
$$\mu(0,1) = \frac{\int_0^1 \sqrt{t}\,dt}{\int_0^1 1/\sqrt{t}\,dt} = \frac{1}{3}$$
and
$$\sigma^2(0,1) = \frac{\int_0^1 t\sqrt{t}\,dt}{\int_0^1 1/\sqrt{t}\,dt} - \Big(\frac{1}{3}\Big)^2 = \frac{4}{45}$$
respectively. Hence, the inequality for the Jacobi weight is
$$\bigg| \frac{1}{2}\int_0^1 \frac{f(t)}{\sqrt{t}}\,dt - f(x) + \Big(x - \frac{1}{3}\Big)f'(x) \bigg| \le \frac{\|f''\|_\infty}{2}\bigg(\frac{4}{45} + \Big(x - \frac{1}{3}\Big)^2\bigg).$$
The optimal point
$$x = \mu(0,1) = \frac{1}{3}$$
is again shifted to the left of the mid-point due to the $t^{-1/2}$ singularity at the origin.
4.4 Chebyshev
The mean and variance for the Chebyshev weight $w(t) = 1/\sqrt{1-t^2}$, $a = -1$, $b = 1$ are
$$\mu(-1,1) = \frac{\int_{-1}^1 t/\sqrt{1-t^2}\,dt}{\int_{-1}^1 1/\sqrt{1-t^2}\,dt} = 0$$
and
$$\sigma^2(-1,1) = \frac{\int_{-1}^1 t^2/\sqrt{1-t^2}\,dt}{\int_{-1}^1 1/\sqrt{1-t^2}\,dt} - 0^2 = \frac{1}{2}$$
respectively. Hence, the inequality corresponding to the Chebyshev weight is
$$\bigg| \frac{1}{\pi}\int_{-1}^1 \frac{f(t)}{\sqrt{1-t^2}}\,dt - f(x) + x f'(x) \bigg| \le \frac{\|f''\|_\infty}{2}\Big(\frac{1}{2} + x^2\Big).$$
The optimal point
$$x = \mu(-1,1) = 0$$
is at the mid-point of the interval, reflecting the symmetry of the Chebyshev weight over its interval.
4.5 Laguerre
The conditions in Theorem 3.1 are not violated if the integration domain is infinite. The Laguerre weight $w(t) = e^{-t}$ is defined for positive values, $t \in [0,\infty)$. The mean and variance of the Laguerre weight are
$$\mu(0,\infty) = \frac{\int_0^\infty t e^{-t}\,dt}{\int_0^\infty e^{-t}\,dt} = 1$$
and
$$\sigma^2(0,\infty) = \frac{\int_0^\infty t^2 e^{-t}\,dt}{\int_0^\infty e^{-t}\,dt} - 1^2 = 1$$
respectively. The appropriate inequality is
$$\bigg| \int_0^\infty e^{-t}f(t)\,dt - f(x) + (x-1)f'(x) \bigg| \le \frac{\|f''\|_\infty}{2}\big(1 + (x-1)^2\big),$$
from which the optimal sample point $x = 1$ may be deduced.
4.6 Hermite
Finally, the Hermite weight is $w(t) = e^{-t^2}$, defined over the entire real line. The mean and variance for this weight are
$$\mu(-\infty,\infty) = \frac{\int_{-\infty}^\infty t e^{-t^2}\,dt}{\int_{-\infty}^\infty e^{-t^2}\,dt} = 0$$
and
$$\sigma^2(-\infty,\infty) = \frac{\int_{-\infty}^\infty t^2 e^{-t^2}\,dt}{\int_{-\infty}^\infty e^{-t^2}\,dt} - 0^2 = \frac{1}{2}$$
respectively. The inequality from Theorem 3.1 with the Hermite weight function is thus
$$\bigg| \frac{1}{\sqrt{\pi}}\int_{-\infty}^\infty e^{-t^2}f(t)\,dt - f(x) + x f'(x) \bigg| \le \frac{\|f''\|_\infty}{2}\Big(\frac{1}{2} + x^2\Big),$$
which results in an optimal sampling point of $x = 0$.
5 Application in Numerical Integration
Define a grid $I_n : a = \xi_0 < \xi_1 < \cdots < \xi_{n-1} < \xi_n = b$ on the interval $[a,b]$, with $x_i \in [\xi_i, \xi_{i+1}]$ for $i = 0, 1, \ldots, n-1$. The following quadrature formulae for weighted integrals are obtained.
THEOREM 5.1. Let the conditions in Theorem 3.1 hold. The following weighted quadrature rule holds:
$$\int_a^b w(t)f(t)\,dt = A(f,\xi,x) + R(f,\xi,x) \tag{5.1}$$
where
$$A(f,\xi,x) = \sum_{i=0}^{n-1}\big[h_i f(x_i) - h_i(x_i - \mu_i)f'(x_i)\big]$$
and
$$|R(f,\xi,x)| \le \frac{\|f''\|_\infty}{2}\sum_{i=0}^{n-1}\big[(x_i - \mu_i)^2 + \sigma_i^2\big]h_i. \tag{5.2}$$
The parameters $h_i$, $\mu_i$ and $\sigma_i^2$ are given by
$$h_i = m_0(\xi_i,\xi_{i+1}), \quad \mu_i = \mu(\xi_i,\xi_{i+1}), \quad \text{and} \quad \sigma_i^2 = \sigma^2(\xi_i,\xi_{i+1})$$
respectively.
Proof. Apply Theorem 3.1 over the interval $[\xi_i,\xi_{i+1}]$ with $x = x_i$ to obtain
$$\bigg| \int_{\xi_i}^{\xi_{i+1}} w(t)f(t)\,dt - h_i f(x_i) + h_i(x_i - \mu_i)f'(x_i) \bigg| \le \frac{\|f''\|_\infty}{2} h_i \big[(x_i - \mu_i)^2 + \sigma_i^2\big].$$
Summing over $i$ from 0 to $n-1$ and using the triangle inequality produces the desired result. $\Box$
COROLLARY 5.2. The optimal locations of the points $x_i$, $i = 0, 1, \ldots, n-1$, and the grid distribution $I_n$ satisfy
$$x_i = \mu_i, \quad i = 0, 1, \ldots, n-1, \tag{5.3}$$
$$\xi_i = \frac{\mu_{i-1} + \mu_i}{2}, \quad i = 1, 2, \ldots, n-1, \tag{5.4}$$
producing the composite generalized mid-point rule for weighted integrals
$$\int_a^b w(t)f(t)\,dt = \sum_{i=0}^{n-1} h_i f(x_i) + R(f,\xi,n) \tag{5.5}$$
where the remainder is bounded by
$$|R(f,\xi,n)| \le \frac{\|f''\|_\infty}{2}\sum_{i=0}^{n-1} h_i\sigma_i^2. \tag{5.6}$$
Proof. The proof follows that of Corollary 3.4, where it is observed that the minimum of the bound (5.2) will occur at $x_i = \mu_i$. Differentiating the right hand side of (5.2) gives
$$\frac{d}{d\xi_i}\sum_{j=0}^{n-1}\big[(x_j - \mu_j)^2 + \sigma_j^2\big]h_j = 2w(\xi_i)(x_i - x_{i-1})\Big(\xi_i - \frac{x_{i-1} + x_i}{2}\Big).$$
Inspection of the second derivative at the root reveals that the stationary point is a minimum, and hence the result is proved. $\Box$
6 Numerical Results
In this section, for illustration, the quadrature rule of Section 5 is applied to the integral
$$\int_0^1 100\,t\ln(1/t)\cos(4\pi t)\,dt = -1.972189325199166. \tag{6.1}$$
This is evaluated using the following three rules:
(1) the composite mid-point rule, where the grid has a uniform step-size and the node
is simply the mid-point of each sub-interval,
(2) the composite generalized mid-point rule (5.1). The grid, In, is uniform and the
nodes are the mean point of each sub-interval (5.3),
(3) equation (5.5) where the grid is distributed according to (5.4) and the nodes are
the sub-interval means (5.3).
Table 1 shows the numerical error of each method for an increasing number of sample points. For a uniform grid, it can be seen that changing the location of the sampling point from the midpoint [method (1)] to the mean point [method (2)] roughly doubles the accuracy. Changing the grid distribution as well as the node point [method (3)] increases the accuracy over the composite mid-point rule [method (1)] by approximately an order of magnitude. It is important to note that the nodes and weights for method (3) can be easily calculated numerically using an iterative scheme. For example, on a Pentium-90 personal computer with $n = 64$, calculating (5.3) and (5.4) took close to 37 seconds.
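Method (3) can be sketched as follows. This is an illustrative reimplementation, not the authors' FORTRAN code: the fixed-point sweep over (5.3)-(5.4) is one plausible form of the iterative scheme, and the closed-form moment antiderivatives are for the log weight of (6.1):

```python
import math

# Composite generalized mid-point rule (5.5) for w(t) = ln(1/t) on (0, 1),
# with nodes and grid from a fixed-point iteration of (5.3)-(5.4).
# F0, F1 are antiderivatives of t^i * ln(1/t); the value at t = 0 is the limit 0.
def F0(t): return 0.0 if t == 0.0 else t - t * math.log(t)
def F1(t): return 0.0 if t == 0.0 else t * t / 4 - (t * t / 2) * math.log(t)

def cell(a, b):
    """Return (m0, mu) = (zeroth moment, mean) of the log weight on [a, b]."""
    m0 = F0(b) - F0(a)
    return m0, (F1(b) - F1(a)) / m0

def method3(f, n, sweeps=200):
    xi = [i / n for i in range(n + 1)]   # start from a uniform grid
    for _ in range(sweeps):              # sweep (5.4): xi_i = (mu_{i-1} + mu_i)/2
        mus = [cell(xi[i], xi[i + 1])[1] for i in range(n)]
        xi = [0.0] + [(mus[i - 1] + mus[i]) / 2 for i in range(1, n)] + [1.0]
    return sum(m0 * f(mu) for m0, mu in (cell(xi[i], xi[i + 1]) for i in range(n)))

f = lambda t: 100 * t * math.cos(4 * math.pi * t)
Q = method3(f, 64)   # approximates (6.1) = -1.972189325199166
```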
Note that equations (5.3) and (5.4) are quite general in nature and only rely on the weight insofar as knowledge of the first two moments is required. This contrasts with Gaussian quadrature, where for an $n$-point rule the first $n+1$ moments are needed (or equivalently the $2n+1$ coefficients of the continued fraction expansion (Rutishauser 1962b; Rutishauser 1962a)) to construct the appropriate orthogonal polynomial, after which a root-finding procedure is called to find the abscissae (Atkinson 1989). This procedure, of course, can be greatly simplified for the more well known weight functions (Gautschi 1994).

n    | Error (1) | Error (2) | Error (3) | Error ratio (3) | Bound ratio (3)
4    | 1.97(0)   | 2.38(0)   | 2.48(0)   | -               | -
8    | 3.41(-1)  | 2.93(-1)  | 2.35(-1)  | 10.56           | 3.90
16   | 8.63(-2)  | 5.68(-2)  | 2.62(-2)  | 8.97            | 3.95
32   | 2.37(-2)  | 1.31(-2)  | 4.34(-3)  | 6.04            | 3.97
64   | 6.58(-3)  | 3.20(-3)  | 9.34(-4)  | 4.65            | 3.99
128  | 1.82(-3)  | 7.94(-4)  | 2.23(-4)  | 4.18            | 3.99
256  | 4.98(-4)  | 1.98(-4)  | 5.51(-5)  | 4.05            | 4.00

Table 1: The error in evaluating (6.1) under different quadrature rules. The parameter $n$ is the number of sample points.
The second last column of Table 1 shows the ratio of the numerical errors for method (3), and the last column the ratio of the theoretical error bound (5.6):
$$\text{Bound ratio (3)} = \frac{|R(f,\xi,n/2)|}{|R(f,\xi,n)|}. \tag{6.2}$$
As $n$ increases, the numerical ratio approaches the theoretical one. The theoretical ratio is consistently close to 4. This value suggests an asymptotic form of the error bound
$$|R(f,\xi,n)| \sim O\Big(\frac{1}{n^2}\Big) \tag{6.3}$$
for the log weight. Similar results have been obtained for the other weights of Section 4. This is consistent with mid-point type rules, and it is anticipated that developing other product rules, for example a generalized trapezoidal or Simpson's rule, will yield more accurate results.
REFERENCES
Atkinson, K. E. (1989). An Introduction to Numerical Analysis. John Wiley.
Barnett, N. S., I. S. Gomm, and L. Armour (1995). Location of the optimal sampling
point for the quality assessment of continuous streams. Austral. J. Statist. 37(2),
145-152.
Blake, J. R. and D. C. Gibson (1987). Cavitation bubbles near boundaries. Ann.
Rev. Fluid Mech. 19, 99-123.
Brebbia, C. H. and J. Dominguez (1989). Boundary elements: an introductory course.
Southampton: Computational Mechanics.
Cerone, P., S. S. Dragomir, and J. Roumeliotis (1998). An inequality of Ostrowski
type for mappings whose derivatives are bounded and applications. Submitted to
East Asian Mathematical Journal.
Gautschi, W. (1994). Algorithm 726: ORTHPOL - a package of routines for gen-
erating orthogonal polynomials and Gauss-type quadrature rules. ACM Trans.
Math. Software 20, 21-62.
Mitrinović, D. S., J. E. Pečarić, and A. M. Fink (1994). Inequalities for functions
and their integrals and derivatives. Dordrecht: Kluwer Academic.
Rallison, J. M. and A. Acrivos (1978). A numerical study of the deformation and
burst of a viscous drop in an extensional flow. J. Fluid Mech. 89, 191-200.
Roumeliotis, J., G. R. Fulford, and A. Kucera (1997). Boundary integral equation
applied to free surface creeping flow. In B. J. Noye, M. D. Teubner, and A. W.
Gill (Eds.), Computational Techniques and Applications: CTAC97, Singapore,
pp. 599-607. World Scientific.
Ostrowski Type Inequality 119
Rutishauser, H. (1962a). Algorithm 125: WEIGHTCOEFF. CACM 5(10), 510-511.
Rutishauser, H. (1962b). On a modification of the QD-algorithm with Graeffe-type
convergence. In Proceedings of the IFIPS Congress, Munich.
Stroud, A. H. and D. Secrest (1966). Gaussian quadrature formulas. Prentice Hall.
School of Communications and Informatics,
Victoria University of Technology,
PO Box 14428,
MCMC, Melbourne,
Victoria, 8001,
Australia
A NOTE ON SOME HIGHER ORDER CUMULANTS IN
k PARAMETER NATURAL EXPONENTIAL FAMILY
HYUN CHUL KIM
J. KSIAM Vol.3, No.2, 157-160, 1999
Abstract. We derive the cumulants of a minimal sufficient statistic in the k parameter
natural exponential family from the parameter function and the partial parameter function.
We find that these cumulants combine merits of central moments and of general
cumulants: the first three cumulants are the central moments themselves, and the fourth
cumulant has a form related to the kurtosis.
1. Introduction
In this paper, we present some interesting results about the higher order cumulants in
the k parameter natural exponential family. We follow the notation of Bar-Lev [1].
Let T = (T1, ..., Tk), k ≥ 2, be a minimal sufficient statistic for an exponential model
that constitutes a k parameter natural exponential family. Consider a partition of T
into (T1, T2), where T1 = (T1, ..., Tr) and T2 = (Tr+1, ..., Tk), 1 ≤ r ≤ k−1. We
present some higher order cumulants of T, and conditional cumulants of T1 given
T2 = t2.
Assume that the model from which the sample observations X1, ..., Xn are taken has
the form of a k parameter exponential family. Then the joint pdf of X = (X1, ..., Xn)
may be represented as

    f_X(x; θ) = { ∏_{i=1}^n h(x_i) I_S(x_i) } exp{ ∑_{i=1}^k θ_i ∑_{j=1}^n u_i(x_j) − n l(θ) },   (1.1)

where S is the common support of the Xi's, I_S(·) is the indicator function of the
set S, and θ = (θ1, ..., θk) (∈ Θ) is the vector of natural parameters. Define
T_i = ∑_{j=1}^n u_i(x_j), i = 1, ..., k, and let (T1, T2) be a minimal sufficient statistic for θ. In
addition, consider a partition of θ into (θ1, θ2), where θ1 = (θ1, ..., θr) and θ2 =
(θ_{r+1}, ..., θk).
The derivation of moments or conditional moments from the pdf (1.1) is cumbersome
and difficult to carry out. Most authors convert the pdf into natural exponential family
form through reparametrization (see Bickel and Doksum [2, p.70]; Bar-Lev [1]; Lehmann
[4, p.57]). The pdf of the k parameter natural exponential family has the form

    f_T(t; θ) = g(t) exp{ θ · t − n l(θ) } I_{S_T}(t)   (1.2)

AMS Mathematics Subject Classification: 62B05, 62F10
Key words and phrases: cumulant, conditional cumulant, moment, natural exponential family, sufficient
statistics
for some measurable function g. In the case of (1.2) we obtain the moment generating
function of T easily.
For θ ∈ Θ, the moment generating function of T is

    E[ exp{ ∑_{i=1}^k s_i T_i } ] = exp{ n l(θ1 + s1, ..., θk + sk) − n l(θ1, ..., θk) },   (1.3)

from which moments of T can be obtained. Cumulants of T can be derived by taking
logarithms in eq. (1.3), differentiating with respect to the s_i's, and substituting
s_i = 0, i = 1, ..., k. But we can also calculate the cumulants easily by differentiating
l(θ). We refer to l(θ) as the parameter function.
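The route through l(θ) can be checked on a concrete one-parameter family (a sketch with assumed details, not from the paper: Poisson(λ) with n = 1, for which l(θ) = exp(θ), θ = log λ, and every cumulant of T equals λ):

```python
import math

# For Poisson(λ) in natural exponential form, the parameter function is
# l(θ) = exp(θ) with θ = log λ (n = 1 observation); then ∂^k l/∂θ^k = λ
# for every k, matching the known Poisson cumulants.
def l(theta):
    return math.exp(theta)

def deriv(f, x, k, h=1e-2):
    # k-th order central finite difference approximation of f^(k)(x)
    return sum((-1)**i * math.comb(k, i) * f(x + (k/2 - i)*h)
               for i in range(k + 1)) / h**k

lam = 2.0
theta = math.log(lam)
cumulants = [deriv(l, theta, k) for k in (1, 2, 3, 4)]
print(cumulants)  # each entry ≈ λ = 2.0
```

The first four numerical derivatives all reproduce λ, as the Results below predict for this family.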
Bar-Lev [1] defines another parameter function from which conditional cumulants of T1,
given T2 = t2, can be obtained, much as from a moment generating function. We refer
to it as the partial parameter function:

    b(θ1 : t2) ≡ f_{T2}(t2 : θ) exp{ n l(θ) − θ2 · t2 }.   (1.4)

Conditional cumulants are calculated by differentiating log b(θ1 : t2).
He shows that the first two cumulants equal the corresponding moments. The results
agree with the cumulants in Kendall and Stuart [3, p.73]. We show that the higher order
cumulants derived from the parameter function also agree with the usual cumulants, and
discuss their usefulness.
2. Results
We can get l(θ) by integrating eq. (1.2), since f_T(t; θ) is a pdf. The parameter
function is

    l(θ) = (1/n) log ∫ g(t) exp(θ · t) dt.   (2.1)

Now, differentiating (2.1) with respect to θ_i, i = 1, ..., k, we get the following results.

Results 1: Cumulants of the minimal sufficient statistic

    n ∂l(θ)/∂θ_i = E(T_i) = μ_i
    n ∂²l(θ)/∂θ_i² = E((T_i − μ_i)²) = σ_i²
    n ∂³l(θ)/∂θ_i³ = E((T_i − μ_i)³)
    n ∂⁴l(θ)/∂θ_i⁴ = E((T_i − μ_i)⁴) − 3σ_i⁴

The first cumulant is the first moment μ_i, the same as Kendall and
Stuart's first cumulant. And we can find the kurtosis of T_i, i = 1, ..., k, from the
results above:

    k_i = [n ∂⁴l(θ)/∂θ_i⁴] / [n ∂²l(θ)/∂θ_i²]² = E((T_i − μ_i)⁴)/σ_i⁴ − 3.   (2.2)
This is also interesting. Excess kurtosis is usually defined by subtracting 3 from the
ordinary kurtosis, because the kurtosis of the normal distribution is 3. The kurtosis
calculated from l(θ) is already lessened by 3, so the normal distribution gives an excess
kurtosis of 0.
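Equation (2.2) can be exercised the same way (the same hypothetical Poisson setup is assumed, not from the paper); for Poisson(λ) the excess kurtosis is 1/λ, with the 3 already subtracted:

```python
import math

# l(θ) = exp(θ) for Poisson(λ), θ = log λ, n = 1; eq. (2.2) then gives the
# excess kurtosis l''''(θ)/l''(θ)^2 = λ/λ^2 = 1/λ directly, no extra "-3".
def l(theta):
    return math.exp(theta)

def deriv(f, x, k, h=1e-2):
    # k-th order central finite difference approximation of f^(k)(x)
    return sum((-1)**i * math.comb(k, i) * f(x + (k/2 - i)*h)
               for i in range(k + 1)) / h**k

lam = 2.0
theta = math.log(lam)
k_excess = deriv(l, theta, 4) / deriv(l, theta, 2)**2
print(k_excess)  # ≈ 1/λ = 0.5
```

As λ → ∞ the Poisson approaches the normal, and the formula tends to 0, the normal excess kurtosis.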
Next, we get conditional cumulants of T1, given T2 = t2, by differentiating
log b(θ1 : t2) with respect to θ_i, i = 1, ..., r.

Results 2: Conditional cumulants of the minimal sufficient statistic

    ∂ log b(θ1 : t2)/∂θ_i = E(T_i | T2 = t2) = μ*_i
    ∂² log b(θ1 : t2)/∂θ_i² = E((T_i − μ*_i)² | T2 = t2) = σ*_i²
    ∂³ log b(θ1 : t2)/∂θ_i³ = E((T_i − μ*_i)³ | T2 = t2)
    ∂⁴ log b(θ1 : t2)/∂θ_i⁴ = E((T_i − μ*_i)⁴ | T2 = t2) − 3σ*_i⁴

These results are consistent with Results 1 above.
3. Concluding Remarks
In this section, we compare some types of moments with our results derived from the
parameter function l(θ) and the partial parameter function b(θ1 : t2). Table 1 shows
that our results are more convenient than the others.
In general, what we want when calculating moments are summary statistics (mean,
variance, skewness, kurtosis, etc.). In this view, no type of moment is always superior
to the other usual types of moment. But our results show that the cumulants from the
parameter function are preferable to the other types of moments, especially at fourth
order for calculating kurtosis. And the cumulants are very easy to calculate.
References
1. S. K. Bar-Lev, A Derivation of Conditional Cumulants in Exponential Models, The American Statis-
tician, 48(2), 1994, 126-129.
2. P. J. Bickel and K. A. Doksum, Mathematical Statistics, Holden-Day Inc., 1977.
3. M. Kendall and A. Stuart, The Advanced Theory of Statistics Vol.1 Distribution Theory, Charles
Griffin & Co., 1977.
4. E. L. Lehmann, Testing Statistical Hypotheses, second ed., Wiley, 1986.
Dept. of Informatics and Statistics
Kunsan National Univ. Kunsan, 573-701, Korea [email protected]
Table 1. Comparison with other types of moments

  type \ order                        first   second             third                      fourth
  μ'_k = E(T_i^k)                     μ_i     σ_i² + μ_i²        E(T_i³)                    E(T_i⁴)
  μ_k = E((T_i − μ_i)^k)              0       σ_i²               E((T_i − μ_i)³)            E((T_i − μ_i)⁴)
  μ_(k) = E(T_i(T_i−1)···(T_i−k+1))   μ_i     σ_i² + μ_i² − μ_i  E(T_i(T_i−1)(T_i−2))       E(T_i(T_i−1)(T_i−2)(T_i−3))
  n ∂^k l(θ)/∂θ_i^k                   μ_i     σ_i²               E((T_i − μ_i)³)            E((T_i − μ_i)⁴) − 3σ_i⁴
  ∂^k log b(θ1:t2)/∂θ_i^k             μ*_i    σ*_i²              E((T_i − μ*_i)³ | T2=t2)   E((T_i − μ*_i)⁴ | T2=t2) − 3σ*_i⁴
J. KSIAM Vol. 3, No.2, 161-171, 1999
AERODYNAMIC SENSITIVITY ANALYSIS FOR NAVIER-STOKES EQUATIONS
Hyoung-Jin Kim, Chongam Kim, Oh-Hyun Rho, and Ki Dong Lee
Abstract
Aerodynamic sensitivity analysis codes are developed via hand differentiation, using a direct differentiation method and an adjoint method respectively, from the discrete two-dimensional compressible Navier-Stokes equations. Unlike previous studies, the Baldwin-Lomax algebraic turbulence model is also differentiated by hand to obtain design sensitivities with respect to design variables of interest in turbulent flows. The discrete direct sensitivity equations and adjoint equations are efficiently solved by the same time integration scheme adopted in the flow solver routine. The required memory for the adjoint sensitivity code is greatly reduced, at the cost of computational time, by leaving the large banded flux Jacobian matrix unassembled. Direct sensitivity code results are found to coincide exactly with sensitivity derivatives obtained by finite differences. Adjoint code results for a turbulent flow case show slight deviations from the exact results due to the limitation of the algebraic turbulence model in implementing the adjoint formulation. However, the current adjoint sensitivity code yields much more accurate sensitivity derivatives than an adjoint code with the turbulence eddy viscosity kept constant, which is a usual assumption in prior research.
1. Introduction
With the advances in computational fluid dynamics, design optimization methods are more important than ever in aerodynamic design. In the application of gradient-based optimization methods to aerodynamic design problems, one of the major concerns is an accurate and efficient calculation of the sensitivity derivatives of the system responses of interest, usually aerodynamic coefficients or surface pressure distributions, with respect to design variables.
The finite difference approximation approach is the easiest to use, since it does not require any development of a sensitivity code. However, the accuracy of the finite difference approach depends critically on the perturbation size of the design variables and on the flow initialization.[1]
A robust way of computing sensitivity derivatives is to build a sensitivity analysis code. A sensitivity analysis code can be developed by direct differentiation methods[2-5] or adjoint variable methods[6-8]. Direct differentiation methods are more economical than adjoint variable methods when the numbers of objectives and constraints are larger than the number of design variables.
Key Words: Sensitivity derivative, direct differentiation, adjoint variable
Adjoint variable methods are preferable in the opposite case. Both methods can be handled with either a discrete or a continuous approach. In the discrete approach, the discretized flow equations are differentiated, while in the continuous approach the flow equations are differentiated before they are discretized. The discrete approach can be advantageous in the sense that the derivatives obtained are consistent with finite-difference derivatives regardless of the computational grid size. On the other hand, the continuous approach provides a clear insight into the nature of the sensitivity solution.
Previous work on sensitivity analysis by hand differentiation, in both the direct differentiation and adjoint variable methods, has shown difficulties in differentiating the viscosity terms, which reflect the variation of laminar and/or turbulent viscosities with respect to the variation of design variables.[5,6] Automatic differentiation tools such as ADIFOR[3,4] and Odyssée[7,8] have been successfully used to generate sensitivity codes from Navier-Stokes codes including turbulence models. However, the sensitivity code generated by automatic differentiation is much less efficient than a hand-differentiated one.[3,8]
Another problem with adjoint codes generated by automatic differentiation in reverse mode[7,8] or by hand[2] is memory. They require much more memory than the original flow solver and are prohibitive for large two-dimensional problems and all three-dimensional problems.
In this study, a Navier-Stokes solver with the Baldwin-Lomax algebraic turbulence model is directly differentiated by hand, and a corresponding adjoint code is developed from the direct-differentiated sensitivity code. The required memory for the adjoint sensitivity code is greatly reduced, at the cost of computational time, by leaving the large banded flux Jacobian matrix unassembled. Sensitivity derivatives obtained by the sensitivity codes developed herein are compared with those calculated using the finite difference approximation.
The rest of this paper presents a brief review of the flow solver used in this study, and the basic theory of the direct differentiation method and the adjoint variable method in the discrete approach. Computational results are then given for example problems, including subsonic and transonic laminar and turbulent flows around the NACA0012 airfoil.
2. Flow Analysis
A two-dimensional Navier-Stokes solver developed and validated in Refs. [9,10] was used for the flow analysis. The Reynolds-averaged two-dimensional compressible Navier-Stokes equations in generalized coordinates are used in conservation form, based on a cell-centered finite volume approach, given as

    ∂/∂t (Q/J) + R = 0,   (1)

where R is the residual vector and Q is the four-element vector of conserved flow variables. The Navier-Stokes equations are discretized in time using the Euler implicit method and linearized by employing the flux Jacobian. This results in a large system of linear equations in delta form at each time step:

    [ I/(JΔt) + ∂R/∂Q ] ΔQⁿ = −Rⁿ.   (2)
Roe's Flux Difference Splitting (FDS) scheme was adopted for the space discretization of the inviscid flux terms of the residual vector on the right-hand side; a MUSCL approach with the Koren limiter is employed to obtain third order accuracy. The central difference method is used for the viscous flux terms of the residual vector. In the implicit part, Beam and Warming's Alternating Direction Implicit (ADI) method is used, and van Leer's Flux Vector Splitting (FVS) is employed with first order accuracy for the flux Jacobian. The flux Jacobian for the viscous part is neglected in the implicit part, since it does not influence the solution accuracy. Turbulence effects were considered using the Baldwin-Lomax algebraic model with a relaxation technique. All boundary conditions were specified explicitly. One-dimensional characteristic conditions were used for the inflow and outflow boundaries. The no-slip condition and adiabatic wall condition were specified on the solid wall, and local time stepping was used.
A C-type grid system around the airfoil was generated by a conformal mapping technique, giving 135 points in the chordwise direction, 41 points in the normal direction and 95 points on the airfoil surface.
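The delta-form update (2) can be mimicked on a scalar model problem (a sketch with an assumed residual R(q) = q² − 2, not the flow equations): the linearized implicit step tends to a Newton step as Δt grows, driving R to zero:

```python
# Delta-form implicit iteration for a scalar model residual R(q) = q**2 - 2:
# [1/dt + dR/dq] * dq = -R(q), then q <- q + dq.  Steady state: q = sqrt(2).
def steady_state(q=1.0, dt=10.0, iters=100):
    for _ in range(iters):
        dq = -(q**2 - 2.0) / (1.0/dt + 2.0*q)
        q += dq
    return q

print(steady_state())  # ≈ 1.41421356... = sqrt(2)
```

The 1/Δt term damps the update for small time steps; as Δt → ∞ the left-hand side reduces to the pure flux-Jacobian linearization.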
3. Sensitivity Analysis
Direct Differentiation Method
The discrete residual vector of the nonlinear aerodynamic analysis for steady problems can be written symbolically as

    R[Q(β), X(β), β] = 0,   (3)

where X is the grid position vector and β is the vector of design variables. Boundary conditions are also included in the residual vector R.
Eq. (3) is directly differentiated with respect to β_k to yield the following equation:

    dR/dβ_k = [∂R/∂Q]{dQ/dβ_k} + [∂R/∂X]{dX/dβ_k} + {∂R/∂β_k} = 0.   (4)
The grid sensitivity vector {dX/dβ_k} can be calculated by differentiating the grid generation code or simply by applying the finite difference approximation. However, it can be obtained analytically if the grid points are analytically modified during the design process.
In order to find the solution {dQ/dβ_k} of Eq. (4), a pseudo-time term is added and the same time integration scheme as in the flow solver is adopted. Applying the Euler implicit method, followed by linearization with the first-order-accurate van Leer flux Jacobian, gives the following system of linear algebraic equations:

    [ I/(JΔt) + ∂R/∂Q ] Δ{dQ/dβ_k} = −{dR/dβ_k}ⁿ.   (5)
The above system of equations is solved with the same ADI scheme used for the flow solver. By comparing Eqs. (2) and (5), it can be noted that one can obtain a direct sensitivity code by directly differentiating the right-hand side of the discretized flow equations. All of the derivative terms in Eq. (4) are differentiated by hand except the grid sensitivity vector {dX/dβ_k}, which is calculated from a grid generation code.
The Jacobian matrices [∂R/∂Q] and [∂R/∂X] in Eq. (4) include Roe's FDS flux Jacobian and the viscous flux Jacobian, and are very large banded matrices, since the inviscid and viscous fluxes are third- and second-order accurate, respectively. In order to avoid this problem, the terms [∂R/∂Q]{dQ/dβ_k} and [∂R/∂X]{dX/dβ_k} of Eq. (4) are calculated without explicit formulation of the very large Jacobian matrices [∂R/∂Q] and [∂R/∂X]. Van Leer's flux Jacobian in the LHS is frozen at the steady-state value, which increases the required memory but reduces the computational time. When the flow variable sensitivity vector {dQ/dβ_k} is obtained, the total derivative of the system response of interest, C_j, can be calculated. C_j is a function of the flow variables Q, the grid position X, and the design variables β; i.e.,

    C_j = C_j[Q(β), X(β), β].   (6)
The sensitivity derivative of the aerodynamic coefficient C_j with respect to the k-th design variable β_k is given by

    dC_j/dβ_k = {∂C_j/∂Q}ᵀ{dQ/dβ_k} + {∂C_j/∂X}ᵀ{dX/dβ_k} + ∂C_j/∂β_k.   (7)
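A scalar sketch of the direct-differentiation route (3)-(5) (hypothetical residual and solver, not the authors' code): solve R(Q, β) = 0, differentiate R once to get dQ/dβ, and compare with a finite-difference perturbation of the converged solution:

```python
# Toy residual R(Q, b) = Q**3 + b*Q - 1 = 0; differentiating gives
# (dR/dQ) dQ/db + dR/db = 0, so dQ/db = -(dR/db)/(dR/dQ) at the solution.
def solve_R(b, q=1.0):
    # Newton iteration for R(Q, b) = 0
    for _ in range(50):
        q -= (q**3 + b*q - 1.0) / (3*q**2 + b)
    return q

b = 2.0
Q = solve_R(b)
dQ_db_direct = -Q / (3*Q**2 + b)          # direct differentiation of R
h = 1e-6
dQ_db_fd = (solve_R(b + h) - solve_R(b)) / h  # finite-difference check
print(dQ_db_direct, dQ_db_fd)  # the two agree to several digits
```

This mirrors the validation strategy of the paper: the direct derivative should coincide with the finite-difference one when both are fully converged.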
Adjoint Variable Method
As the total derivative of the residual vector, {dR/dβ_k}, is null in the steady state, we can introduce adjoint variables and combine Eqs. (4) and (7) to obtain

    dC_j/dβ_k = {∂C_j/∂Q}ᵀ{dQ/dβ_k} + {∂C_j/∂X}ᵀ{dX/dβ_k} + ∂C_j/∂β_k
                + {λ_j}ᵀ( [∂R/∂Q]{dQ/dβ_k} + [∂R/∂X]{dX/dβ_k} + ∂R/∂β_k ).   (8)
If we find {λ_j} satisfying the following adjoint equation,

    [∂R/∂Q]ᵀ{λ_j} + {∂C_j/∂Q} = 0,   (9)
then we can obtain the sensitivity derivative of C_j with respect to β_k from the following equation:

    dC_j/dβ_k = {∂C_j/∂X}ᵀ{dX/dβ_k} + ∂C_j/∂β_k + {λ_j}ᵀ( [∂R/∂X]{dX/dβ_k} + ∂R/∂β_k ).   (10)
The adjoint equation (9) is also converted to the following system of linear algebraic equations and is solved by the ADI scheme:

    [ I/(JΔt) + ∂R/∂Q ]ᵀ Δ{λ_j} = −( [∂R/∂Q]ᵀ{λ_j} + {∂C_j/∂Q} )ⁿ.   (11)
The transposed flux Jacobian [∂R/∂Q]ᵀ in the LHS of Eq. (11) is van Leer's FVS flux Jacobian, frozen at the steady-state value. The transposed flux Jacobian [∂R/∂Q]ᵀ in the RHS includes Roe's FDS flux Jacobian and the viscous flux Jacobian and is a very large banded matrix. Unlike the flux Jacobian [∂R/∂Q] of the direct differentiation method, all the elements of [∂R/∂Q]ᵀ must be explicitly calculated. In prior research on discrete adjoint variable methods[2,7], all the elements of the Jacobian matrix [∂R/∂Q]ᵀ were calculated and assembled, at the cost of a very large memory requirement. Although the computational time can be decreased because the elements of the Jacobian matrix are calculated only once, the memory requirement is prohibitive for large two-dimensional problems and all three-dimensional problems.
In the present study, the elements of [∂R/∂Q]ᵀ are calculated and multiplied by the corresponding elements of the adjoint vector {λ_j}, and thus the large matrix [∂R/∂Q]ᵀ need not be assembled. This increases the computational time, as the elements of the flux Jacobian matrix [∂R/∂Q]ᵀ have to be calculated every iteration. However, the required memory can be remarkably reduced, to the order of the memory required by the flow solver.
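The direct/adjoint duality in (4) and (7)-(10) can be seen on a small linear toy problem (the 2×2 system and all names are illustrative assumptions, not the flow equations): the adjoint route costs one extra solve no matter how many design variables there are:

```python
# For a linear "flow" residual R(q, b) = A q - f(b) = 0 and response C = g·q,
# one adjoint solve A^T λ = -g yields dC/db_k = λ·(∂R/∂b_k) for every design
# variable b_k, while the direct route needs one solve of A per variable.
def solve2(M, r):
    # Cramer's rule for a 2x2 system M x = r
    det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
    return [(r[0]*M[1][1] - M[0][1]*r[1]) / det,
            (M[0][0]*r[1] - r[0]*M[1][0]) / det]

A  = [[4.0, 1.0], [1.0, 3.0]]
At = [[4.0, 1.0], [1.0, 3.0]]        # A is symmetric here, so A^T = A
g  = [1.0, 2.0]
dR_db = [[-1.0, 0.0], [0.0, -2.0]]   # ∂R/∂b_k for two design variables

# Direct: one linear solve per design variable
direct = []
for k in range(2):
    dq = solve2(A, [-dR_db[k][0], -dR_db[k][1]])
    direct.append(g[0]*dq[0] + g[1]*dq[1])

# Adjoint: a single solve, then cheap dot products
lam = solve2(At, [-g[0], -g[1]])
adjoint = [lam[0]*dR_db[k][0] + lam[1]*dR_db[k][1] for k in range(2)]
print(direct, adjoint)  # the two routes give identical derivatives
```

This is the same bookkeeping as Eqs. (9)-(10): the adjoint vector replaces one flow-like solve per design variable by dot products against ∂R/∂β_k.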
In Ref. [3] it was reported that a hand-differentiated sensitivity code may take six man-months to two man-years, or even longer, to generate, whereas it takes about one man-week to generate a direct-differentiation sensitivity code by automatic differentiation. According to the authors' experience, however, less than two man-weeks were required to build a hand-differentiated direct differentiation sensitivity code, and one man-month to build an adjoint sensitivity code, for the two-dimensional Navier-Stokes equations. This indicates that hand differentiation is a viable approach compared to automatic differentiation in terms of the time required for sensitivity code generation.
4. Results & Discussion
In order to evaluate the ability of the direct and adjoint sensitivity codes to accurately calculate sensitivity derivatives, sensitivity analyses are conducted for the NACA0012 airfoil in laminar/turbulent flow fields in the subsonic and transonic regimes.
The results are compared with the derivatives computed by the following finite difference approximation:

    dC_j/dβ_k ≅ [ C_j(β_k + Δβ_k) − C_j(β_k) ] / Δβ_k.   (12)

The residual of the flow solver is reduced to 10⁻¹¹ of the freestream value for the finite difference calculation. The step size Δβ_k is 10⁻⁶ or 10⁻⁷, depending on the design variable and the flow condition.
The residuals of the sensitivity codes are reduced to 10⁻⁷ of the initial value of the residual. The initial values of the sensitivity derivatives {dQ/dβ_k} and the adjoint variables {λ_j} are set to zero.
In order to consider the sensitivity derivative with respect to a geometry change, one of the Hicks-Henne functions is adopted as a shape function:

    F(x) = sin³( π x^(ln 0.5 / ln 0.6) ).   (13)

The upper surface of the NACA0012 airfoil is modified as

    Y_new = Y_old + β F(x),   (14)

where β is a design variable. The incidence angle α is the other design variable in this study.
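A sketch of the bump function (13)-(14) (the helper names are illustrative): the exponent ln 0.5 / ln 0.6 places the peak of the sine cube at x = 0.6, and the perturbation vanishes at both ends of the chord:

```python
import math

# Hicks-Henne bump (13): at x = 0.6, x**(ln0.5/ln0.6) = 0.5, so sin^3 peaks at 1.
def F(x):
    return math.sin(math.pi * x ** (math.log(0.5) / math.log(0.6))) ** 3

def y_new(y_old, x, beta):
    # eq. (14): perturb the upper surface by beta * F(x)
    return y_old + beta * F(x)

print(F(0.6))          # ≈ 1.0 -- the bump peaks at x = 0.6
print(F(0.0), F(1.0))  # ≈ 0 at both ends, so the endpoints stay fixed
```

Because F vanishes at the leading and trailing edges, the design variable β reshapes only the interior of the upper surface.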
Direct Differentiation Approach
In order to validate the sensitivity code based on the direct differentiation method, sensitivity derivatives obtained by the direct sensitivity code are compared to those from the finite difference method. Computations are conducted for both subsonic and transonic turbulent flows.
The first example is a subsonic turbulent flow case. The flow condition is M∞ = 0.6, α = 2° and Reynolds number = 6,500,000. Fig. 1 shows pressure derivative (p′ = dp/dα) contours. Streamlines of the velocity sensitivity derivatives (u′, v′) in Fig. 2 show the effect of an incidence angle increment. Table 1 shows the computed sensitivity derivatives of the aerodynamic coefficients with respect to the incidence angle α and the coefficient β of the shape function. The sensitivity derivatives of the aerodynamic coefficients by the direct differentiation and finite difference methods agree with each other to four or five significant digits.
Table 1 Validation of direct sensitivity code: subsonic turbulence case (M∞ = 0.60, α = 2°, Re = 6,500,000)

              dCl/dα     dCd/dα       dCm/dα
FD (Δα=10⁻⁶)  7.379933   0.05013069   0.1343019
DD            7.379935   0.05013063   0.1343012

              dCl/dβ     dCd/dβ       dCm/dβ
FD (Δβ=10⁻⁶)  3.65374    0.0217594    -1.26809
DD            3.65373    0.0217605    -1.26808

Table 2 Validation of direct sensitivity code: transonic turbulence case (M∞ = 0.75, α = 2°, Re = 6,500,000)

              dCl/dα     dCd/dα       dCm/dα
FD (Δα=10⁻⁶)  8.37948    0.486983     0.222065
DD            8.37972    0.486994     0.222014

              dCl/dβ     dCd/dβ       dCm/dβ
FD (Δβ=10⁻⁷)  7.13337    -0.195777    -1.85788
DD            7.13346    -0.195774    -1.85790
The second example is a transonic turbulent flow case. The flow condition is M∞ = 0.75, α = 2° and Reynolds number = 6,500,000. A strong shock wave is formed on the upper surface of the airfoil. Fig. 3 shows pressure sensitivity contours with respect to the incidence angle α, which show a drastic variation of the pressure around the shock wave. Streamlines of the velocity derivative are similar to the subsonic case, but they approach flow separation downstream of the shock wave, as can be seen in Fig. 4. Table 2 presents the sensitivity derivatives of the aerodynamic coefficients with respect to the two design variables. As in the subsonic case, the sensitivity derivatives of the aerodynamic coefficients by the direct differentiation method and the finite difference method agree with each other to four or five significant digits.
Adjoint Variable Method
In order to validate the sensitivity code based on the adjoint variable method, the sensitivity derivatives calculated using the adjoint sensitivity code are compared to those from the direct sensitivity code. Computations are conducted for both subsonic laminar and transonic turbulent flows.
The first example is a subsonic laminar case. The flow condition is M∞ = 0.6, α = 1° and Reynolds number = 5,000. Fig. 5 shows λm1 contours and streamlines of (λm2, λm3). The subscript m means that the system response of interest C_j in the adjoint equations was Cm. The integer subscripts 1, 2 and 3 denote elements of the adjoint variable vector {λ_j} (= {λ_j1, λ_j2, λ_j3, λ_j4}ᵀ) corresponding to the conservative flow variable vector Q (= {ρ, ρu, ρv, e}ᵀ). The λm1 has a large gradient in the boundary layer region on the airfoil surface and around the stagnation streamline upstream of the airfoil leading edge. Streamlines of the vector (λm2, λm3) show a discontinuity of vector direction around the flow stagnation streamline upstream of the airfoil leading edge. Streamlines of the vector (λm2, λm3) resemble circulation lines with a negative lift, or the opposite flow direction.
In Table 3, we can note that the total derivatives of the aerodynamic coefficients with respect to the geometric perturbation by the direct and adjoint sensitivity codes agree with each other to 6 significant digits.
The second example is a transonic turbulent flow case. The flow condition is M∞ = 0.75, α = 2° and Reynolds number = 6,500,000. The adjoint sensitivity code was run in two modes, one with the turbulence eddy viscosity (µt) terms differentiated, and the other with constant turbulence eddy viscosity (µ′t = 0). λl1 contours around the NACA0012 airfoil are plotted in Fig. 6. The λl1 has a large gradient around the stagnation streamline and in the boundary layer on the airfoil surface.
Table 3 Validation of adjoint sensitivity code: subsonic laminar case (M∞ = 0.60, α = 1°, Re = 5,000)

      dCl/dβ        dCd/dβ        dCm/dβ
AV    -1.2065570    0.08362676    -0.12162388
FD    -1.2065569    0.08362679    -0.12162394
The discontinuity of the adjoint variables around the stagnation streamline in Figs. 5 and 6 is due to the existence of a singularity crossing the incoming stagnation streamline upstream of the airfoil leading edge.[11]
Table 4 Validation of adjoint sensitivity code: transonic turbulence case (M∞ = 0.75, α = 2°, Re = 6,500,000)

             dCl/dα            dCd/dα              dCm/dα
DD           8.37972           0.486994            0.222014
AV           8.28911 (0.9892)  0.486983 (0.9948)   0.242023 (1.0901)
AV (µ′t=0)   7.73075 (0.9226)  0.483649 (0.9931)   0.342835 (1.544)

             dCl/dβ            dCd/dβ              dCm/dβ
DD           7.13346           -0.195774           -1.85790
AV           7.17228 (0.9946)  -0.190391 (0.9725)  -1.86124 (1.0018)
AV (µ′t=0)   9.37707 (1.3145)  -0.123950 (0.6331)  -2.233926 (1.2024)
Table 4 compares the sensitivity derivatives of the aerodynamic coefficients with respect to the two design variables. The values in parentheses are sensitivity derivative ratios: the sensitivity derivatives from the adjoint sensitivity code normalized by the respective sensitivity derivatives from the direct differentiation code. The sensitivity derivatives of the adjoint sensitivity code with differentiated eddy viscosity terms show about 0.18~9% deviation from those of the direct differentiation code. On the other hand, the adjoint code results with µ′t = 0 show much larger deviations (0.5~54%).
Required Computational Time and Memory
Table 5 compares the computational time per iteration and the memory of the sensitivity codes developed herein with those of codes generated by automatic differentiation tools. The presented values are normalized by the time and memory required for the original flow solver. The sensitivity codes developed in this study require less computational time and memory than those produced by automatic differentiation. Although the computational time per iteration for an Odyssée-generated adjoint code was not available, it is significantly slower than a hand-differentiated code, by a factor of 5 in CPU time.[8]
The direct and adjoint sensitivity codes have convergence ratios similar to the flow solver, as they all employ van Leer's flux Jacobian in the implicit part. The direct sensitivity code requires much less computational time than finite difference approximations, as its convergence tolerance for obtaining accurate sensitivity derivatives is larger than that of the flow solver by approximately four orders of magnitude.
Table 5 Comparison of computational time and memory

                     Flow solver  Present DD  Present AV  AD DDᵃ  AD AVᵇ
Time per iteration   1            1.1         2.4         2~3     NA
Memory               1            1.8         1.8         2       10

a: by ADIFOR[4]  b: by Odyssée[7]  NA: Not Available
The adjoint sensitivity code requires more than twice the computational time of the direct sensitivity code, because the elements of the residual vector R are differentiated by the four elements of the flow variable vector Q instead of by the design variable β_k as in the direct code. Thus, the adjoint code is more economical than the direct sensitivity code for calculating sensitivity derivatives if the number of design variables is larger than twice the number of objectives and constraints.
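The break-even point in the last sentence can be sketched with a back-of-envelope cost model built from Table 5's normalized timings (illustrative assumption: cost scales linearly, about 1.1 units per direct solve and 2.4 per adjoint solve, overheads ignored):

```python
# Rough cost model: each design variable needs one direct solve (≈1.1
# flow-solver units), each objective/constraint needs one adjoint solve
# (≈2.4 units).  Values taken from Table 5; fixed overheads are ignored.
def direct_cost(n_design):
    return 1.1 * n_design

def adjoint_cost(n_objectives):
    return 2.4 * n_objectives

# With one objective, the adjoint route wins once the number of design
# variables exceeds 2.4/1.1 ≈ 2.2, i.e. roughly "more than twice the
# number of objectives and constraints".
for n in (1, 2, 3, 10):
    print(n, direct_cost(n) > adjoint_cost(1))
```

The crossover near two design variables per objective matches the rule of thumb stated above.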
5. Concluding Remarks
The direct differentiation approach and the adjoint variable approach are applied, respectively, to the discrete flow equations to develop aerodynamic sensitivity analysis codes for turbulent flows. A Navier-Stokes solver with the Baldwin-Lomax turbulence model is differentiated by hand to efficiently obtain design sensitivities with respect to the design variables of interest. The direct sensitivity equations and adjoint equations are efficiently solved by the same time integration scheme as the flow solver. The required memory for the adjoint sensitivity code is greatly reduced, at the cost of computational time, by leaving the large banded flux Jacobian matrix unassembled.
Sensitivity derivatives computed by the sensitivity codes almost exactly coincide with those of the finite difference method. Although the adjoint code results for the turbulent flow case show slight deviations from the exact results, the adjoint sensitivity code gives much more accurate sensitivity derivatives than an adjoint code with the turbulence eddy viscosity kept constant, which is a usual assumption in prior research.
The strategy adopted in this research shows promise for extension to three-dimensional problems, as it is efficient, accurate, and requires much less memory than prior approaches.
6. References
[1] Eyi, S. and Lee, K. D., "Effect of Sensitivity Calculation on Navier-Stokes Design Optimization," AIAA 94-0060, Jan. 1994.
[2] Eleshaky, M. E. and Baysal, O., "Aerodynamic Shape Optimization Using Sensitivity Analysis on Viscous Flow Equations," J. of Fluids Engineering, Vol. 115, No. 3, 1993, pp. 75-84.
[3] Sherman, L. L., Taylor III, A. C., Green, L. L., Newman, P. A., Hou, G. J., and Korivi, V. M., "First- and Second-Order Aerodynamic Sensitivity Derivatives via Automatic Differentiation with Incremental Iterative Methods," AIAA-94-4262-CP, Sep. 1994.
[4] Taylor III, A. C. and Oloso, A., "Aerodynamic Design Sensitivities By Automatic Differentiation," AIAA 98-2536, June 1998.
[5] Ajmani, K. and Taylor III, A. C., "Discrete Sensitivity Derivatives of the Navier-Stokes Equations with a Parallel Krylov Solver," AIAA 94-0091, Jan. 1994.
[6] Jameson, A., Pierce, N. A., and Martinelli, L., "Optimum Aerodynamic Design using the Navier-Stokes Equations," AIAA 97-0101, Jan. 1997.
[7] Mohammadi, B., "Optimal Shape Design, Reverse Mode of Automatic Differentiation and Turbulence," AIAA 97-0099, Jan. 1997.
[8] Malé, J. M., Mohammadi, N., and Schmidt, R., "Direct and Reverse Modes of Automatic Differentiation of Programs for Inverse Problems: Application to Optimum Shape Design," Proc. 2nd Int. SIAM Workshop on Computational Differentiation, Santa Fe, 1996.
[9] Hwang, S. W., "Numerical Analysis of Unsteady Supersonic Flow over Double Cavity," Ph.D. Thesis, Seoul National Univ., Seoul, Korea, 1996.
[10] Kim, H. J. and Rho, O. H., "Dual-Point Design of Transonic Airfoils using the Hybrid Inverse Optimization Method," J. of Aircraft, Vol. 34, No. 5, 1997, pp. 612-618.
[11] Giles, M. B. and Pierce, N. A., "Adjoint Equations in CFD: Duality, Boundary Conditions and Solution Behavior," AIAA 97-1850, June 1997.
Fig. 1 Direct differentiation sensitivity analysis: dp/dα contours, subsonic turbulence case (M∞ = 0.60, α = 2°, Re = 6,500,000)
Fig. 2 Direct differentiation sensitivity analysis: velocity sensitivity streamlines for α, subsonic turbulence case (M∞ = 0.60, α = 2°, Re = 6,500,000)
Fig. 3 Direct differentiation sensitivity analysis: pressure sensitivity contours for α, transonic turbulence case (M∞ = 0.75, α = 2°, Re = 6,500,000)
Fig. 4 Direct differentiation sensitivity analysis: velocity sensitivity streamlines for α, transonic turbulence case (M∞ = 0.75, α = 2°, Re = 6,500,000)
Fig. 5 Adjoint sensitivity analysis: λm1 contours and (λm2, λm3) streamlines, subsonic laminar case (M∞ = 0.60, α = 1°, Re = 5,000)
Fig. 6 Adjoint sensitivity analysis: λl1 contours, transonic turbulence case (M∞ = 0.75, α = 2°, Re = 6,500,000)