BIT 37:3 (1997), 706-719.

NUMERICAL BEHAVIOUR OF THE MODIFIED GRAM-SCHMIDT GMRES IMPLEMENTATION*

A. GREENBAUM 1, M. ROZLOŽNÍK 2 and Z. STRAKOŠ 2 †

1 Courant Institute of Mathematical Sciences, 251 Mercer Street, New York, NY 10012, U.S.A. email: [email protected]

2 Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod vodárenskou věží 2, 182 07 Praha 8, Czech Republic

email: [email protected], [email protected]

Abstract.

In [6] the Generalized Minimal Residual Method (GMRES), which constructs the Arnoldi basis and then solves the transformed least squares problem, was studied. It was proved that GMRES with the Householder orthogonalization-based implementation of the Arnoldi process (HHA), see [9], is backward stable. In practical computations, however, the Householder orthogonalization is too expensive, and it is usually replaced by the modified Gram-Schmidt process (MGSA). Unlike the HHA case, in the MGSA implementation the orthogonality of the Arnoldi basis vectors is not preserved near the level of machine precision. Despite this, the MGSA-GMRES performs surprisingly well, and its convergence behaviour and the ultimately attainable accuracy do not differ significantly from those of the HHA-GMRES. As was observed, but not explained, in [6], it is the linear independence of the Arnoldi basis, not the orthogonality near machine precision, that is important. Until the linear independence of the basis vectors is nearly lost, the norms of the residuals in the MGSA implementation of GMRES match those of the HHA implementation despite the more significant loss of orthogonality.

In this paper we study the MGSA implementation. It is proved that the Arnoldi basis vectors begin to lose their linear independence only after the GMRES residual norm has been reduced to almost its final level of accuracy, which is proportional to κ(A)ε, where κ(A) is the condition number of A and ε is the machine precision. Consequently, unless the system matrix is very ill-conditioned, the use of the modified Gram-Schmidt GMRES is theoretically well-justified.

AMS subject classification: 65F05.

Key words: linear algebraic systems, numerical stability, GMRES method, modified Gram-Schmidt Arnoldi process.

*Received August 1996. Revised January 1997.
†This work was supported by the GA AS CR under grants 230401 and A2030706, by the NSF grant Int 921824, and by the DOE grant DEFG0288ER25053. Part of the work was performed while the third author visited the Department of Mathematics and Computer Science, Emory University, Atlanta, USA.


1 Introduction.

Consider the linear algebraic system

(1.1) Ax = b,

where A is a real nonsingular N by N matrix and b is a real vector. Starting with an initial guess x_0, the GMRES method generates approximate solutions x_n ∈ x_0 + K_n(A, r_0), n = 1, 2, ...; K_n(A, r_0) = span{r_0, Ar_0, ..., A^{n-1} r_0}, such that

(1.2) ||b − Ax_n|| = min_{u ∈ x_0 + K_n(A, r_0)} ||b − Au||,

where r_0 = b − Ax_0 is the initial residual. One possible way to compute x_n involves constructing an orthonormal basis {v_1, ..., v_n} for the Krylov space K_n(A, r_0) using the Arnoldi process. The Arnoldi recurrence is described in matrix form by

(1.3) AV_n = V_{n+1} H_{n+1,n},

where the columns of V_n are the n orthonormal basis vectors, the last column of V_{n+1} is the next orthonormal basis vector for the space K_{n+1}(A, r_0), and H_{n+1,n} is an n+1 by n upper Hessenberg matrix. To obtain the approximation x_n satisfying (1.2), one sets x_n = x_0 + V_n y_n, where

(1.4) ||ϱe_1 − H_{n+1,n} y_n|| = min_y ||ϱe_1 − H_{n+1,n} y||,   ϱ = ||r_0||.

In [6] the Householder orthogonalization was considered for building up the Arnoldi basis. It was proved that the resulting (so-called HHA) implementation of GMRES (proposed originally by Walker, see [9]) is backward stable. The proof relied upon the fact that in the HHA process the loss of orthogonality of the computed Arnoldi vectors is proportional to the machine precision (independent of the condition number of the matrix A). This is not true, however, for the cheaper and therefore more frequently used MGSA implementation, in which the Arnoldi basis is computed using the algorithm

MGS-Arnoldi algorithm:

ϱ = ||r_0||; v_1 = r_0/ϱ;
for i = 1, 2, ..., n
    w = Av_i;
    for k = 1, 2, ..., i
        h_{k,i} = v_k^T w;
        w = w − h_{k,i} v_k;
    end
    h_{i+1,i} = ||w||; v_{i+1} = w/h_{i+1,i};
end.
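The recurrence above translates directly into floating point code. The following NumPy sketch (our illustration, not from the paper) implements the MGS-Arnoldi process and returns V_{n+1} and H_{n+1,n} with A V_n = V_{n+1} H_{n+1,n} up to rounding errors:

```python
import numpy as np

def mgs_arnoldi(A, r0, n):
    """Modified Gram-Schmidt Arnoldi: build V (N x (n+1)) and H ((n+1) x n)
    so that A @ V[:, :n] equals V @ H up to rounding errors."""
    N = r0.shape[0]
    V = np.zeros((N, n + 1))
    H = np.zeros((n + 1, n))
    rho = np.linalg.norm(r0)
    V[:, 0] = r0 / rho
    for i in range(n):
        w = A @ V[:, i]
        for k in range(i + 1):          # orthogonalize w against v_1, ..., v_{i+1}
            H[k, i] = V[:, k] @ w       # h_{k,i} = v_k^T w
            w = w - H[k, i] * V[:, k]
        H[i + 1, i] = np.linalg.norm(w)
        V[:, i + 1] = w / H[i + 1, i]
    return V, H, rho
```

In exact arithmetic the columns of V would be orthonormal; in finite precision this orthogonality deteriorates gradually, which is precisely the subject of this paper.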


Here, the upper Hessenberg matrix H_{n+1,n} is still computed in a backward stable way, i.e., in finite precision arithmetic we have (see [2, 4, 1, 6])

(1.5) AV_n = V_{n+1} H_{n+1,n} + F_n,   AV_n = V̂_{n+1} H_{n+1,n} + F̂_n,

where

||F_n|| ≤ ξ_1 η(n,l,N) ε ||A||,   ||F̂_n|| ≤ ξ̂_1 η(n,l,N) ε ||A||,

V̂_{n+1}^T V̂_{n+1} = I_{n+1},   η(n,l,N) = n^{3/2} N + n^{1/2} l N^{1/2},

V̂_{n+1} is the closest orthonormal matrix to V_{n+1} in any unitarily invariant norm, ε is the machine precision, and l is the maximal number of nonzero entries per row of the matrix A (as in [6], by r_0, ϱ, V_n, H_{n+1,n}, y_n and x_n we denote from now on the computed quantities). The orthogonality of V_{n+1} may gradually deteriorate, but assuming n N^{3/2} ε κ([v_1, AV_n]) ≪ 1, it can be shown, see [2, 4, 6], that

(1.6) ||I − V_{n+1}^T V_{n+1}|| ≤ ξ_2 η(n,l,N) ε κ([v_1, AV_n]),

and also

(1.7) ||V̂_{n+1} − V_{n+1}|| ≤ ξ_2 η(n,l,N) ε κ([v_1, AV_n]),

for a moderate size constant ξ_2. The constant factors ξ_1, ξ_2 (as the other constants ξ_j, j = 3, ..., introduced in a number of places) are independent of the problem parameters (e.g. N, κ(A), etc.) and the machine precision ε, but dependent on the details of the machine arithmetic. In a finite precision MGS-Arnoldi computation, the modified Gram-Schmidt process is applied to the matrix [v_1, fl(Av_1), ..., fl(Av_n)], where fl(·) denotes the floating point result of the operation (·). In developing the bounds, however, it is convenient to work with the matrix [v_1, AV_n]. We have therefore included the effects of rounding errors in the matrix-vector multiplications in the bounds (1.6), (1.7), and all the following results will be related to [v_1, AV_n]. Throughout this paper we will assume that n N^{3/2} ε ≪ 1, so that (1.6) holds for n + 1 ≤ N if κ([v_1, AV_n]) ≪ N^{−5/2} ε^{−1}.
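The dependence of the orthogonality loss on κ([v_1, AV_n]) expressed by (1.6) can be observed directly. The sketch below is our own illustration on a synthetic matrix (not the paper's test problems), with a deliberately generous constant in the final check: it runs the MGS-Arnoldi orthogonalization on an ill-conditioned A with ||A|| = 1 and verifies that ||I − V^T V|| stays below a moderate multiple of n N^{3/2} ε κ([v_1, AV_n]).

```python
import numpy as np

# Synthetic test matrix with ||A|| = 1 and kappa(A) = 1e8 (assumed setup).
rng = np.random.default_rng(1)
N, n = 100, 30
U, _ = np.linalg.qr(rng.standard_normal((N, N)))
W, _ = np.linalg.qr(rng.standard_normal((N, N)))
A = U @ np.diag(np.logspace(0, -8, N)) @ W.T

v = rng.standard_normal(N)
V = (v / np.linalg.norm(v))[:, None]
for i in range(n):                       # MGS-Arnoldi, orthogonalization only
    w = A @ V[:, i]
    for k in range(i + 1):
        w = w - (V[:, k] @ w) * V[:, k]
    V = np.column_stack([V, w / np.linalg.norm(w)])

eps = np.finfo(float).eps
loss_orth = np.linalg.norm(np.eye(n + 1) - V.T @ V)
cond_M = np.linalg.cond(np.column_stack([V[:, :1], A @ V[:, :n]]))
# the loss of orthogonality is governed by eps * kappa([v1, A V_n]), cf. (1.6)
assert loss_orth <= 100 * n * N**1.5 * eps * cond_M
```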

It was observed in [6] that although orthogonality of the MGSA vectors is not maintained near the machine precision, as it is for the HHA implementation, the norms of the computed residuals of the MGSA-GMRES are almost identical to those of the HHA-GMRES until the smallest singular value of the matrix V_n begins to depart from the value one. At that point the MGSA-GMRES residual norm begins to stagnate close to its final precision level. This observation is demonstrated in numerical examples for the matrices STEAM1 (N = 240, a symmetric positive definite matrix used in oil recovery simulations) and IMPCOLE (N = 225, an unsymmetric matrix from modelling of the hydrocarbon separation problem) taken from the Harwell-Boeing collection (see Figures 1.1-1.4).

In both experiments x = (1, ..., 1)^T, b = Ax and x_0 = 0. Experiments were performed on an SGI Crimson workstation, ε = 2.2·10^{−16}, using MATLAB. The solid line represents the norm of the relative true residual ||r_n||/ϱ = ||b − Ax_n||/ϱ, the dashed line the norm of the relative Arnoldi residual ||t_n||/ϱ = ||ϱe_1 − H_{n+1,n} y_n||/ϱ, the dash-dotted line the smallest singular value of the matrix V_n, and the dotted line the loss of orthogonality ||I − V_n^T V_n||_F measured in the Frobenius norm.

[Figure 1.1: Householder implementation for STEAM1.]

[Figure 1.2: Modified Gram-Schmidt implementation for STEAM1.]

[Figure 1.3: Householder implementation for IMPCOLE.]

[Figure 1.4: Modified Gram-Schmidt implementation for IMPCOLE.]

Figures 1.1 and 1.2 describe the results for STEAM1 (the HHA and MGSA implementations, respectively). Similarly, Figures 1.3 and 1.4 correspond to IMPCOLE. We emphasize that the behaviour illustrated in these figures represents typical behaviour of the MGSA and HHA-GMRES. The condition number of the system matrix was κ(A) = 2.855 × 10^7 for STEAM1 and κ(A) = 7.102 × 10^6 for IMPCOLE.

In Sections 2 and 3 these observations are justified theoretically. It is proved that the condition number of the matrix [v_1, AV_n] in (1.6) is approximately bounded by the condition number of A divided by the relative norm of the MGSA-GMRES residual at step n. It follows that until this residual norm reaches the level κ(A)ε, which is its final theoretical level of accuracy, no serious loss of orthogonality occurs in the modified Gram-Schmidt process, and the behaviour of the MGSA-GMRES is approximately the same as that of the HHA-GMRES.

2 Condition of the Arnoldi vectors.

The MGS-Arnoldi algorithm given in Section 1 applies the modified Gram-Schmidt procedure to the independent vectors [v_1, Av_1, Av_2, ..., Av_n], where v_1 is given and v_2, ..., v_n are computed vectors, each having norm approximately one. In exact arithmetic, the vectors v_1, ..., v_n are orthonormal, but in finite precision arithmetic this may not be the case.

The success of the modified Gram-Schmidt procedure in constructing an orthonormal basis depends on the condition number of the matrix [v_1, AV_n], as shown by inequality (1.6). It is easy to see, in addition, that the orthogonality of the computed modified Gram-Schmidt vectors is independent of the norms of the original vectors. Thus, the expression κ([v_1, AV_n]) in (1.6) can be replaced by the condition number of the best conditioned matrix of the form [v_1, AV_n]D, where D is any diagonal matrix. In the rest of the paper we will assume that A has been multiplied by a constant so that ||A|| = 1 and, for convenience, we will assume also that the norms of the columns of V_n are exactly 1; that is, D = diag(1/||v_1||, 1/(||A|| ||v_1||), ..., 1/(||A|| ||v_n||)). In this section we will show that the condition number κ([v_1, AV_n]) is less than or equal to a moderate size multiple of the condition number of A divided by the norm of the optimal approximation to v_1 by a linear combination of the columns of AV_n, min_y ||v_1 − AV_n y|| = ||r̂_n||/ϱ. In exact arithmetic, this would be the norm of the GMRES residual at step n divided by the norm of the initial residual. In finite precision arithmetic these two quantities may differ, but it will be shown in Section 3 that, until the relative GMRES residual drops close to the level κ(A)ε, they are approximately the same.

We begin with the following general result relating the smallest singular value of an m by n matrix of rank n, m > n, to the smallest singular value of the matrix with an extra column appended.


THEOREM 2.1. Let G be an m by n matrix of rank n, m > n, and let h be an m-vector. Let σ_n(G) denote the smallest singular value of G, and let σ_{n+1}([G, h]) denote the smallest singular value of the m by n+1 matrix consisting of G appended with the last column h. Then

(2.1) σ_n(G) ≥ σ_{n+1}([G, h]) ≥ τ σ_n(G) / (σ_n^2(G) + τ^2 + ||h||^2)^{1/2},

where τ = min_y ||h − Gy||.

PROOF. If σ_n(G) = σ_{n+1}([G, h]), the statement is trivial. Assume σ_n(G) > σ_{n+1}([G, h]). Then the smallest eigenvalue λ of the matrix [G, h][G, h]^T = GG^T + hh^T (which equals σ_{n+1}^2([G, h])) satisfies the secular equation

(2.2) 1 + ||h||^2 Σ_{j=1}^{n} w_j^2/(σ_j^2 − λ) − τ^2/λ = 0.

Here σ_j, j = 1, 2, ..., n, are the singular values of the matrix G, w = (w_1, ..., w_n)^T = U_n^T h/||h||, and U_n is the m by n matrix whose columns are the left singular vectors of G (see, for example, [5]). Considering

Σ_{j=1}^{n} w_j^2/(σ_j^2 − λ) ≤ 1/(σ_n^2 − λ),

we have

1 + ||h||^2/(σ_n^2 − λ) − τ^2/λ ≥ 0,

and, after multiplication by λ(σ_n^2 − λ) > 0,

λ^2 − (σ_n^2 + τ^2 + ||h||^2)λ + τ^2 σ_n^2 ≤ 0.

Solving this quadratic for λ, we find

λ ≥ [σ_n^2 + τ^2 + ||h||^2 − ((σ_n^2 + τ^2 + ||h||^2)^2 − 4τ^2 σ_n^2)^{1/2}]/2 ≥ τ^2 σ_n^2/(σ_n^2 + τ^2 + ||h||^2),

and the result (2.1) follows upon taking square roots. □
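Inequality (2.1) is easy to check numerically. The following is our own quick sanity check on random data, computing τ via a least squares solve:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 50, 10
G = rng.standard_normal((m, n))
h = rng.standard_normal(m)

sigma_n = np.linalg.svd(G, compute_uv=False)[-1]            # sigma_n(G)
sigma_np1 = np.linalg.svd(np.column_stack([G, h]),
                          compute_uv=False)[-1]             # sigma_{n+1}([G, h])
y, *_ = np.linalg.lstsq(G, h, rcond=None)
tau = np.linalg.norm(h - G @ y)                             # tau = min_y ||h - G y||

lower = tau * sigma_n / np.sqrt(sigma_n**2 + tau**2 + np.linalg.norm(h)**2)
assert lower <= sigma_np1 <= sigma_n                        # inequality (2.1)
```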

COROLLARY 2.2. The smallest singular value of [v_1, AV_n] satisfies

(2.3) σ_{n+1}([v_1, AV_n]) ≥ (||r̂_n||/ϱ) σ_n(AV_n) / (2 + σ_n^2(AV_n))^{1/2},

where ||r̂_n|| is defined as ||r̂_n|| = ϱ min_y ||v_1 − AV_n y||.

PROOF. If AV_n does not have rank n, then the result is trivial, so assume rank(AV_n) = n. The quantity ||r̂_n||/ϱ plays the role of τ in Theorem 2.1. Since ||v_1|| = 1 and ||r̂_n||/ϱ ≤ 1, the denominator in (2.1) is less than or equal to (2 + σ_n^2(AV_n))^{1/2}. □

With (2.3) and the assumption ||A|| = σ_1(A) = 1, we can now bound the condition number of the matrix [v_1, AV_n].


THEOREM 2.3. Let ||r̂_n|| = ϱ min_y ||v_1 − AV_n y||, and let V_n be the matrix of basis vectors computed in a finite precision MGS-Arnoldi computation. Assume that ||A|| = 1, that the columns of V_n have norm 1, that inequality (1.6) holds, and that ξ_2 N^{3/2} ε ≤ 1/2, where ξ_2 is the constant in (1.6). Assume also that

(2.4) ||r̂_n||/ϱ ≥ 2√10 ξ_2 n N^{3/2} κ(A) ε.

Then

(2.5) κ([v_1, AV_n]) ≤ √10 ϱ κ(A)/||r̂_n||.

PROOF. The smallest singular value of AV_n satisfies

σ_n(AV_n) ≥ σ_N(A) σ_n(V_n).

Let κ_n denote the condition number κ([v_1, AV_{n−1}]). It follows from (1.6) that

(2.6) σ_n(V_n) ≥ (1 − ξ_2 n N^{3/2} κ_n ε)^{1/2}.

Since the function σ/(2 + σ^2)^{1/2} in (2.3) is an increasing function of σ, it follows from (2.6) that (2.3) can be replaced by

(2.7) σ_{n+1}([v_1, AV_n]) ≥ (||r̂_n||/ϱ) σ_N(A)(1 − ξ_2 n N^{3/2} κ_n ε)^{1/2} / (2 + σ_N^2(A)(1 − ξ_2 n N^{3/2} κ_n ε))^{1/2}.

The largest singular value of [v_1, AV_n] is bounded above by the Frobenius norm of the matrix, and so we have

σ_1([v_1, AV_n]) ≤ (||v_1||^2 + ||A||^2 ||V_n||^2)^{1/2} ≤ (1 + σ_1^2(V_n))^{1/2}.

It also follows from (1.6) that

σ_1^2(V_n) ≤ 1 + ξ_2 n N^{3/2} κ_n ε,

and so we can write

(2.8) σ_1([v_1, AV_n]) ≤ (2 + ξ_2 n N^{3/2} κ_n ε)^{1/2}.

Combining (2.7) and (2.8) gives

(2.9) κ([v_1, AV_n]) ≤ (ϱ/||r̂_n||) (1/σ_N(A)) [ (2 + ξ_2 n N^{3/2} κ_n ε)(2 + σ_N^2(A)(1 − ξ_2 n N^{3/2} κ_n ε)) / (1 − ξ_2 n N^{3/2} κ_n ε) ]^{1/2}.

First, assume that 1 − ξ_2 n N^{3/2} κ_n ε ≥ 1/2. Using this and the bound σ_N^2(A) ≤ 1 in (2.9) gives the result (2.5). Now, for n = 1, since κ_1 = 1, we have by assumption that 1 − ξ_2 N^{3/2} κ_1 ε ≥ 1/2, and hence (2.5) holds for n = 1. With the assumption (2.4), which must also hold at steps j ≤ n, since ||r̂_j|| decreases with j, we can now conclude that 1 − 2ξ_2 N^{3/2} κ_2 ε ≥ 1/2, and proceeding in this way, we see, by induction, that the necessary assumption holds and (2.5) is proved. □

We end this section with the following summary of the bounds for the loss of orthogonality among the Arnoldi basis vectors and bounds for the singular values of V_n. All of them are immediate consequences of Theorem 2.3.

COROLLARY 2.4. Denote

δ_n = √10 ξ_2 n N^{3/2} ε ϱ κ(A)/||r̂_n||.

Then, under the assumptions of Theorem 2.3, δ_n ≤ 1/2,

||I − V_{n+1}^T V_{n+1}|| ≤ δ_n,   ||V̂_{n+1} − V_{n+1}|| ≤ δ_n,

(1 − δ_{n−1})^{1/2} ≤ σ_n(V_n) ≤ σ_1(V_n) ≤ (1 + δ_{n−1})^{1/2},

where the matrix V̂_{n+1} is the closest orthonormal matrix to V_{n+1} in any unitarily invariant norm.
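Bound (2.5) can be monitored during an actual MGS-Arnoldi run. The sketch below is our own illustration on synthetic data (not the paper's test matrices), using the generous constant 10 in place of √10 to leave room for rounding terms: it runs MGS-Arnoldi with ||A|| = 1 and checks κ([v_1, AV_n]) ≤ 10 ϱ κ(A)/||r̂_n|| at every step.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 80
Q1, _ = np.linalg.qr(rng.standard_normal((N, N)))
Q2, _ = np.linalg.qr(rng.standard_normal((N, N)))
A = Q1 @ np.diag(np.logspace(0, -6, N)) @ Q2.T       # ||A|| = 1, kappa(A) = 1e6
kappa_A = 1e6
b = rng.standard_normal(N)

V = (b / np.linalg.norm(b))[:, None]
for i in range(40):
    w = A @ V[:, i]
    for k in range(i + 1):                            # MGS orthogonalization
        w = w - (V[:, k] @ w) * V[:, k]
    V = np.column_stack([V, w / np.linalg.norm(w)])
    AV = A @ V[:, : i + 1]
    y, *_ = np.linalg.lstsq(AV, V[:, 0], rcond=None)
    r_rel = np.linalg.norm(V[:, 0] - AV @ y)          # ||r_hat_n|| / rho
    cond_M = np.linalg.cond(np.column_stack([V[:, :1], AV]))
    assert cond_M <= 10 * kappa_A / r_rel             # cf. (2.5)
```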

3 Comparison of the residuals.

The preceding results relate the conditioning of the Arnoldi vectors to the size of ||r̂_n||. In this section we show that until this quantity reaches a level proportional to κ(A)ε, the norms of the true residual, ||r_n|| = ||b − Ax_n||, and the Arnoldi residual, ||t_n|| = ||ϱe_1 − H_{n+1,n} y_n||, are approximately the same as ||r̂_n||.

In the following bounds we present only those terms which are linear in ε and do not account for the terms proportional to higher powers of ε.

THEOREM 3.1. Under the assumptions of Theorem 2.3, and assuming moreover that η(n,l,N) ε κ(A) ≪ 1/√2, the norm of the Arnoldi residual is bounded by

(1/(1 + δ_n)^{1/2}) ||r̂_n|| − ϱχ_3 ≤ ||t_n|| ≤ (1/(1 − δ_n)^{1/2}) ||r̂_n|| + ϱχ_3,

where

(3.1) χ_3 = ξ_11 (1 + δ_{n−1})^{1/2} η(n,l,N) ε κ(A) / [ (1 − δ_{n−1})^{1/2} − ξ_10 (1 + δ_{n−1})^{1/2} η(n,l,N) ε κ(A) ].

PROOF. It is easy to see that under the assumptions of Theorem 2.3,

σ_1(AV_n) ≤ (1 + δ_{n−1})^{1/2} σ_1(A),   σ_n(AV_n) ≥ (1 − δ_{n−1})^{1/2} σ_N(A);

the matrices V_{n+1} and AV_n = V_{n+1} H_{n+1,n} + F_n have full column rank. From AV_n = V_{n+1} H_{n+1,n} + F_n it follows

σ_1(H_{n+1,n}) ≤ σ_1(AV_n) + ||F_n|| ≤ (1 + δ_{n−1})^{1/2} ||A|| + ξ_1 η(n,l,N) ε ||A||,

σ_n(H_{n+1,n}) ≥ σ_n(AV_n) − ||F_n|| ≥ (1 − δ_{n−1})^{1/2} σ_N(A) − ξ_1 η(n,l,N) ε ||A||.

Similarly, from AV_n = V_{n+1} H_{n+1,n} + F_n,

σ_n(V_{n+1} H_{n+1,n}) = σ_n(AV_n − F_n) ≥ (1 − δ_{n−1})^{1/2} σ_N(A) − ξ_1 η(n,l,N) ε ||A||.

Assuming η(n,l,N) ε κ(A) ≪ 1/√2, it follows that the matrices H_{n+1,n} and V_{n+1} H_{n+1,n} also have full column rank. We will use the perturbation theory for the least squares problem. Using (1.5),

||r̂_n|| = ϱ min_y ||v_1 − (V_{n+1} H_{n+1,n} + F_n) y||,

which can be considered as a perturbation of the following least squares problem

(3.2) ||r̃_n|| = ϱ min_y ||v_1 − V_{n+1} H_{n+1,n} y||.

Applying the perturbation theorem proved by Wedin [10], see also [3], to the problem (3.2), we obtain

||r̂_n − r̃_n|| ≤ ϱ ||F_n|| { ||ỹ_n|| + κ(V_{n+1} H_{n+1,n}) (||r̃_n||/ϱ) / ||V_{n+1} H_{n+1,n}|| },

which, considering ||ỹ_n|| ≤ 1/σ_n(V_{n+1} H_{n+1,n}) and ||r̃_n||/ϱ ≤ 1, gives

||r̂_n − r̃_n|| ≤ ϱ ξ_3 ||F_n||/σ_n(V_{n+1} H_{n+1,n}),   1 ≤ ξ_3 ≤ 2,

or

(3.3) ||r̂_n − r̃_n|| ≤ ϱ ξ_3 ξ_1 η(n,l,N) ε κ(A) / [ (1 − δ_{n−1})^{1/2} − ξ_1 η(n,l,N) ε κ(A) ].

Moreover, from the definition of r̃_n, we have

(3.4) ||r̃_n||/(1 + δ_n)^{1/2} ≤ ϱ min_y ||e_1 − H_{n+1,n} y|| ≤ ||r̃_n||/(1 − δ_n)^{1/2}.

The transformed least squares problem (1.4) is solved by using the QR decomposition of the matrix H_{n+1,n} via Givens rotations [8]. Following the results by Lawson and Hanson [7], the computed solution y_n represents the exact solution of a perturbed problem

||ϱe_1 + Δs_n − (H_{n+1,n} + ΔH_n) y_n|| = min_y ||ϱe_1 + Δs_n − (H_{n+1,n} + ΔH_n) y||,

where

||ΔH_n||/||H_{n+1,n}|| ≤ ξ_4 n^{5/2} ε,   ||Δs_n||/ϱ ≤ ξ_4 n^{5/2} ε.

Using the results of the rounding error analysis of the Givens rotations (similarly to [6], rel. (3.5)-(3.11)), we have the estimate for the norm of the vector y_n

(3.5) ||y_n|| ≤ ϱ (1 + ξ_5 n ε) / [ σ_n(H_{n+1,n}) − 2ξ_6 n^{3/2} ε ||H_{n+1,n}|| ].

We assume here that η(n,l,N) ε κ(A) is small enough, and thus, after substituting the bounds for the singular values of H_{n+1,n}, the norm ||y_n|| is reasonably bounded. Then, for the Arnoldi residual we may write

(3.6) | ||ϱe_1 − H_{n+1,n} y_n|| − min_y ||ϱe_1 − H_{n+1,n} y|| |
    ≤ | min_y ||ϱe_1 + Δs_n − (H_{n+1,n} + ΔH_n) y|| − min_y ||ϱe_1 − H_{n+1,n} y|| | + ||ΔH_n|| ||y_n|| + ||Δs_n||.

Applying again the perturbation theorem for the least squares problem, we obtain the bound

(3.7) | min_y ||ϱe_1 + Δs_n − (H_{n+1,n} + ΔH_n) y|| − min_y ||ϱe_1 − H_{n+1,n} y|| |
    ≤ ϱ ξ_4 n^{5/2} ε { ||H_{n+1,n}||/σ_n(H_{n+1,n}) + κ(H_{n+1,n}) min_y ||e_1 − H_{n+1,n} y|| }
    ≤ ϱ ξ_7 n^{5/2} ε κ(H_{n+1,n}).

Substituting (3.7) and (3.5) into (3.6), we get after a simple manipulation

(3.8) | ||ϱe_1 − H_{n+1,n} y_n|| − min_y ||ϱe_1 − H_{n+1,n} y|| | ≤ ϱ ξ_8 n^{5/2} ε ||H_{n+1,n}|| / [ σ_n(H_{n+1,n}) − 2ξ_6 n^{3/2} ε ||H_{n+1,n}|| ].

Considering the assumption η(n,l,N) ε κ(A) ≪ 1, and the bounds for the singular values of the matrix H_{n+1,n}, it follows

(3.9) ξ_8 n^{5/2} ε ||H_{n+1,n}|| ≤ ξ_8 n^{5/2} ε ((1 + δ_{n−1})^{1/2} + ξ_1 η(n,l,N) ε) ||A||
    ≤ ξ_8 (1 + δ_{n−1})^{1/2} n^{5/2} ε ||A|| + O(ε^2)
    ≤ ξ_9 (1 + δ_{n−1})^{1/2} n^{5/2} ε ||A||

and

(3.10) σ_n(H_{n+1,n}) − 2ξ_6 n^{3/2} ε ||H_{n+1,n}|| ≥ σ_N(A)(1 − δ_{n−1})^{1/2} − ξ_1 η(n,l,N) ε ||A||
    − 2ξ_6 n^{3/2} ε ((1 + δ_{n−1})^{1/2} + ξ_1 η(n,l,N) ε) ||A||
    ≥ σ_N(A) [ (1 − δ_{n−1})^{1/2} − ξ_10 (1 + δ_{n−1})^{1/2} η(n,l,N) ε κ(A) ]

for some positive constants ξ_9 and ξ_10. Consequently,

(3.11) | ||ϱe_1 − H_{n+1,n} y_n|| − min_y ||ϱe_1 − H_{n+1,n} y|| | ≤ ϱχ_1,

where

(3.12) χ_1 = ξ_9 (1 + δ_{n−1})^{1/2} n^{5/2} ε κ(A) / [ (1 − δ_{n−1})^{1/2} − ξ_10 (1 + δ_{n−1})^{1/2} η(n,l,N) ε κ(A) ].

Combining (3.11) with (3.4) and (3.3), and considering that

χ_1 + χ_2/(1 + δ_n)^{1/2} ≤ χ_3,   χ_1 + χ_2/(1 − δ_n)^{1/2} ≤ χ_3,

where χ_2 is defined by

χ_2 = ξ_3 ξ_1 η(n,l,N) ε κ(A) / [ (1 − δ_{n−1})^{1/2} − ξ_1 η(n,l,N) ε κ(A) ],

we obtain the statement of the theorem. □

It remains to prove that the norms of the Arnoldi residual t_n and the true residual r_n are also close. This is done by the following theorem (for details we refer to [6]).

THEOREM 3.2. Under the assumptions of Theorem 2.3, and assuming moreover that η(n,l,N) ε κ(A) ≪ 1/√2, the norm of the true GMRES residual is bounded by

(1 − δ_n)^{1/2} ||ϱe_1 − H_{n+1,n} y_n|| − ϱχ_4 − ν(A, b, x_0)
    ≤ ||b − Ax_n|| ≤
(1 + δ_n)^{1/2} ||ϱe_1 − H_{n+1,n} y_n|| + ϱχ_4 + ν(A, b, x_0),

where

(3.13) χ_4 = ξ_14 (1 + δ_{n−1})^{1/2} η(n,l,N) ε κ(A) / [ (1 − δ_{n−1})^{1/2} − ξ_10 (1 + δ_{n−1})^{1/2} η(n,l,N) ε κ(A) ]

and ν(A, b, x_0) = O(lN^{1/2} + N) ε ||A|| ||x_0|| + O(N) ε ||b||.

PROOF. The computed approximation x_n can be written in the form

(3.14) x_n = x_0 + V_n y_n + d_n,

where d_n represents the local rounding errors in the computation of the vector x_n and is bounded by

||d_n|| ≤ O(n^{3/2}) ε ||y_n|| + ε ||x_0||.

Then using (3.14), (1.5) it follows

(3.15) r_n = b − Ax_n = r_0 − AV_n y_n − Ad_n
    = (b − Ax_0 − ϱv_1) + V_{n+1}(ϱe_1 − H_{n+1,n} y_n) − F_n y_n − Ad_n.

The first term in (3.15) represents the rounding errors in the computation of the initial vector v_1 from the initial residual r_0 and can be bounded as

||b − Ax_0 − ϱv_1|| ≤ O(lN^{1/2} + N) ε ||A|| ||x_0|| + O(N) ε ||b||.

Consequently (cf. [6], relation (3.4)),

||b − Ax_n − V_{n+1}(ϱe_1 − H_{n+1,n} y_n)|| ≤ ξ_13 η(n,l,N) ε ||A|| ||y_n|| + ν(A, b, x_0).

Using (3.5) and (3.10), we get

||b − Ax_n − V_{n+1}(ϱe_1 − H_{n+1,n} y_n)|| ≤ ϱχ_4 + ν(A, b, x_0),

where χ_4 is defined by (3.13). Considering

| ||b − Ax_n|| − ||V_{n+1}(ϱe_1 − H_{n+1,n} y_n)|| | ≤ ||b − Ax_n − V_{n+1}(ϱe_1 − H_{n+1,n} y_n)||,

together with the bounds for the singular values of V_{n+1} from Corollary 2.4, this gives the assertion of the theorem. □

Note that, under the given assumptions, the statements of Theorems 3.1, 3.2 can be written in the form

(3.16) (1/(1 + δ_n)^{1/2}) ||r̂_n|| − ϱω_1 ≤ ||t_n|| ≤ (1/(1 − δ_n)^{1/2}) ||r̂_n|| + ϱω_2,

(3.17) (1 − δ_n)^{1/2} ||t_n|| − ϱω_3 ≤ ||r_n|| ≤ (1 + δ_n)^{1/2} ||t_n|| + ϱω_3,

where ω_j, j = 1, 2, 3 are of order n N^{3/2} ε κ(A). Clearly, until ||r̂_n||/ϱ is of the same order, the relative difference of ||r̂_n||, ||r_n|| and ||t_n|| is small. Note that the quantity n N^{3/2} ε κ(A) which appears in the bounds accounts for the worst possible case and is usually a large overestimate, especially in the "average" case.
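The agreement predicted by (3.16)-(3.17) can be seen on a small example. The sketch below is our own illustration on a synthetic symmetric positive definite system (not the paper's test matrices): it runs MGSA-GMRES via the transformed problem (1.4) and checks that the relative true and Arnoldi residual norms agree far more tightly than the worst-case n N^{3/2} ε κ(A) term would require.

```python
import numpy as np

rng = np.random.default_rng(4)
N, steps = 60, 40
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
A = Q @ np.diag(np.logspace(0, -4, N)) @ Q.T         # SPD, kappa(A) = 1e4
b = A @ np.ones(N)                                   # x_true = (1, ..., 1)^T, x0 = 0
rho = np.linalg.norm(b)

V = (b / rho)[:, None]
H = np.zeros((steps + 1, steps))
for i in range(steps):
    w = A @ V[:, i]
    for k in range(i + 1):                           # MGS-Arnoldi step
        H[k, i] = V[:, k] @ w
        w = w - H[k, i] * V[:, k]
    H[i + 1, i] = np.linalg.norm(w)
    V = np.column_stack([V, w / H[i + 1, i]])
    # transformed least squares problem (1.4)
    e1 = np.zeros(i + 2)
    e1[0] = rho
    y, *_ = np.linalg.lstsq(H[: i + 2, : i + 1], e1, rcond=None)
    x = V[:, : i + 1] @ y                            # x_n = x_0 + V_n y_n
    arnoldi_res = np.linalg.norm(e1 - H[: i + 2, : i + 1] @ y) / rho
    true_res = np.linalg.norm(b - A @ x) / rho
    # the two residual norms agree until close to the final accuracy level
    assert abs(true_res - arnoldi_res) <= 1e-8 + 0.01 * arnoldi_res
```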

In most cases (for most right-hand sides) the attained accuracy of the HHA or MGSA-GMRES computation, measured by the norm of the relative residual ||r_n||/ϱ, is much below the level κ(A)ε. The linear independence of the computed Arnoldi vectors is well preserved until this final accuracy is reached (see Figures 1.1-1.4). It is known, but as yet unexplained, that sometimes the norm of the Arnoldi residual converges to zero (for the HHA-GMRES this represents typical behaviour, see Figures 1.1, 1.3) even after the true residual norm stagnates. For the MGSA-GMRES, however, the norms of the true and Arnoldi residuals obviously stagnate at about the same level.

4 Concluding remarks.

It was proved that the loss of orthogonality among the Arnoldi vectors computed via the modified Gram-Schmidt orthogonalization is related to the size of the residual of the corresponding GMRES run. This relation explains the reliable numerical performance of the modified Gram-Schmidt implementation of GMRES which has been known for a long time. The detailed comparison with the Householder GMRES still needs further work, which should include quantitative bounds for the distance between the corresponding Krylov subspaces.

It can be shown numerically that an analogous relation between the loss of orthogonality among the Arnoldi vectors and the decrease of the GMRES residual holds true even for the classical Gram-Schmidt GMRES implementation (the final level of accuracy is, however, much worse in the latter case). What makes this observation attractive is the fact that the classical Gram-Schmidt process is easily parallelizable. We will work in these directions and report the results elsewhere.


REFERENCES

1. M. Arioli and C. Fassino, Roundoff error analysis of algorithms based on Krylov subspace methods, BIT, 36 (1996), pp. 189-206.

2. Å. Björck, Solving linear least squares problems by Gram-Schmidt orthogonalization, BIT, 7 (1967), pp. 1-21.

3. Å. Björck, Stability analysis of the method of seminormal equations for linear least squares problems, Linear Algebra Appl., 88/89 (1987), pp. 31-48.

4. Å. Björck and C. C. Paige, Loss and recapture of orthogonality in the modified Gram-Schmidt algorithm, SIAM J. Matrix Anal. Appl., 13 (1992), pp. 176-190.

5. J. R. Bunch and Ch. P. Nielsen, Updating the singular value decomposition, Numer. Math., 31 (1978), pp. 111-129.

6. J. Drkošová, A. Greenbaum, M. Rozložník, and Z. Strakoš, Numerical stability of the GMRES method, BIT, 35 (1995), pp. 309-330.

7. C. L. Lawson and R. J. Hanson, Solving Least Squares Problems, Prentice-Hall, Englewood Cliffs, NJ, 1974.

8. Y. Saad and M. H. Schultz, GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7 (1986), pp. 856-869.

9. H. F. Walker, Implementation of the GMRES method using Householder transfor- mations, SIAM J. Sci. Stat. Comput., 9 (1988), pp. 152-163.

10. P. Å. Wedin, Perturbation theory for pseudoinverses, BIT, 13 (1973), pp. 217-232.