
SIAM J. NUMER. ANAL.
Vol. 22, No. 5, October 1985
© 1985 Society for Industrial and Applied Mathematics

RESIDUAL INVERSE ITERATION FOR THE NONLINEAR EIGENVALUE PROBLEM*

A. NEUMAIER

Abstract. For the nonlinear eigenvalue problem $A(\lambda)x = 0$, where $A(\cdot)$ is a matrix-valued operator, residual inverse iteration with shift $\sigma$ is defined by

$x^{(l+1)} := \mathrm{const}\cdot\bigl(x^{(l)} - A(\sigma)^{-1}A(\lambda_{l+1})\,x^{(l)}\bigr),$


where the new approximation $\lambda_{l+1}$ for the eigenvalue now has to be determined beforehand (we shall use a generalized Rayleigh quotient). It turns out that in this formulation $\lambda_l$ may be replaced by a constant "shift" $\sigma$ without destroying convergence to the wanted eigenpair.

The new algorithm, called residual inverse iteration, thus computes the new approximation $x^{(l+1)}$ by applying to $x^{(l)}$ a correction term computed from the residual $A(\lambda)x^{(l)}$ for a suitable $\lambda$. Hence, in the presence of rounding errors, residual inverse iteration with double precision accumulation of the residuals gives about the same limit accuracy as one would get with ordinary inverse iteration only when the complete iteration is performed in double precision. In particular, for linear problems, residual inverse iteration can be profitably used to refine eigenvalue approximations obtained from the QR or QZ algorithm (see e.g. Stewart [8]), since the single precision partial factorization available from the QR or QZ algorithm can be reused to save factorization time. Thus residual inverse iteration provides a simple alternative to some refinement procedures proposed in the literature ([1], [2, §62], [9], [12]), and has the advantage of preserving the structure of $A$ and not requiring an initial eigenvector approximation.

In case that $A(\lambda) = A - \lambda I$, residual inverse iteration is again theoretically equivalent to ordinary inverse iteration. But in the nonlinear case, residual inverse iteration is no longer strictly equivalent to (1), and can be used either with a fixed shift $\sigma$ or with variable shift. For the fixed shift, local convergence is at least linear, with a convergence factor proportional to the distance of $\sigma$ to the nearest eigenvalue $\hat\lambda$ (provided that $\hat\lambda$ is simple and isolated). Double precision computation of the residuals again leads (in well-conditioned cases) to results which are correct to almost double precision.

The paper is organized as follows. In §2, residual inverse iteration is defined for fixed shift. Section 3 gives the local convergence proof, with some remarks on the convergence behaviour in case of variable shifts. In §4, we comment on the practical realization and demonstrate the behaviour of the algorithm with three examples: the Frank matrix of order 11 and two definite quadratic eigenvalue problems of Scott and Ward [7]. We use the notation $\mathbb{C}^{n\times n}$ for the set of complex square $n\times n$ matrices, denote conjugate transposition by an asterisk *, and use $\|\cdot\|$ for an arbitrary vector norm.

2. The algorithm. We consider the finite-dimensional nonlinear eigenvalue problem

(3) $A(\hat\lambda)\hat x = 0$, $\quad \hat\lambda \in D \subseteq \mathbb{C}$, $\quad \hat x \in \mathbb{C}^n \setminus \{0\}$,

where $A : D \to \mathbb{C}^{n\times n}$ is a continuous matrix-valued map. We suppose that an approximation $\sigma \in D$ to $\hat\lambda$ is known, that $A(\sigma)$ is nonsingular, and that $e$ is a normalization vector such that

(4) $e^*\hat x = 1$;

usually $e$ will be the unit vector with a 1 in the position of the largest entry of $\hat x$. We suggest the following iteration for the approximation of a solution of (3).

Residual inverse iteration.
Step 1. Put $l := 0$, and compute an initial approximation $x^{(0)}$ to $\hat x$ as the normalized solution of the equation

(5) $A(\sigma)\tilde x^{(0)} = b$, $\quad x^{(0)} := \tilde x^{(0)} / e^*\tilde x^{(0)}$;

the vector $b \neq 0$ has to be chosen suitably (see §4).


Step 2. Compute an improved approximation $\lambda_{l+1}$ to $\hat\lambda$ by solving one of the equations

(6a) $x^{(l)*} A(\lambda_{l+1})\,x^{(l)} = 0$, or
(6b) $e^* A(\sigma)^{-1} A(\lambda_{l+1})\,x^{(l)} = 0$.

Formula (6a) is appropriate only when $A(\lambda)$ is Hermitian and $\hat\lambda$ is real; otherwise, (6b) has to be used. The root closest to $\lambda_l$ is accepted as $\lambda_{l+1}$.
Step 3. Compute the residual

(7) $r^{(l)} := A(\lambda_{l+1})\,x^{(l)}$.

Step 4. Compute an improved approximation $x^{(l+1)}$ to $\hat x$ by solving the equation

(8) $A(\sigma)\,dx^{(l)} = r^{(l)}$

and normalizing the vector

(9) $\tilde x^{(l+1)} := x^{(l)} - dx^{(l)}$, $\quad x^{(l+1)} := \tilde x^{(l+1)} / e^*\tilde x^{(l+1)}$.

Step 5. Increase $l$ by one and return to Step 2.

In the special case $A(\lambda) = A - \lambda I$, residual inverse iteration is equivalent to ordinary inverse iteration with shift $\sigma$ (in the absence of rounding errors); indeed we then have

$(A - \sigma I)\tilde x^{(l+1)} = (A - \sigma I)x^{(l)} - (A - \sigma I)\,dx^{(l)}$
$\qquad = (A - \sigma I)x^{(l)} - (A - \lambda_{l+1} I)x^{(l)} = (\lambda_{l+1} - \sigma)\,x^{(l)},$

so that $\tilde x^{(l+1)}$ is, up to a scalar factor, the ordinary inverse iteration step $(A - \sigma I)^{-1} x^{(l)}$.
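To make the algorithm concrete, the following is a minimal Python sketch (not from the paper): it assumes a callable A(lam) returning the matrix, computes SciPy's LU factorization of $A(\sigma)$ once, and replaces the root computation of Step 2 by a simple secant step on (6b); all names and tolerances are illustrative.

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    def residual_inverse_iteration(A, sigma, n, tol=1e-12, maxit=50):
        """Residual inverse iteration with fixed shift sigma for A(lam) x = 0.

        A is a callable returning the n x n matrix A(lam); A(sigma) is
        factored once and the factorization is reused in every step.
        """
        lu = lu_factor(A(sigma))              # factor A(sigma) once, cf. (21a)
        x = lu_solve(lu, np.ones(n))          # Step 1: crude substitute for (5a)
        k = int(np.argmax(np.abs(x)))         # e = unit vector at the largest entry
        x = x / x[k]
        lam = sigma                           # lambda_0 := sigma
        for _ in range(maxit):
            # Step 2: secant step on f(mu) = e* A(sigma)^{-1} A(mu) x, cf. (6b);
            # the paper takes one Newton or Euler step instead.
            f = lambda mu: lu_solve(lu, A(mu) @ x)[k]
            h = 1e-7 * (1.0 + abs(lam))
            lam = lam - f(lam) * h / (f(lam + h) - f(lam))
            r = A(lam) @ x                    # Step 3: residual (7)
            dx = lu_solve(lu, r)              # Step 4: solve A(sigma) dx = r, cf. (8)
            x = x - dx
            x = x / x[k]                      # normalization (9)
            if np.linalg.norm(dx) < tol * np.linalg.norm(x):
                break
        return lam, x

The single factorization is reused both for the scalar function of Step 2 and for the correction equation (8); this is the main economy of the fixed-shift method.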


since some $(n-1)\times(n-1)$ minor of $A(\hat\lambda)$ is nonzero, $C \neq 0$, whence $\hat y \neq 0$. Now by Gröbner [4, eq. (4.76)],

$d(\lambda) := \det A(\lambda) = \det\bigl(A(\hat\lambda) + (\lambda-\hat\lambda)A[\lambda,\hat\lambda]\bigr)$
$\quad = \det A(\hat\lambda) + (\lambda-\hat\lambda)\,\mathrm{tr}\bigl(\mathrm{Adj}\,A(\hat\lambda)\cdot A[\lambda,\hat\lambda]\bigr) + O(\lambda-\hat\lambda)^2$
$\quad = (\lambda-\hat\lambda)\,\mathrm{tr}\bigl(C\,A[\lambda,\hat\lambda]\bigr) + O(\lambda-\hat\lambda)^2$
$\quad = (\lambda-\hat\lambda)\,\mathrm{tr}\bigl(\hat x\hat y^* A'(\hat\lambda)\bigr) + O(\lambda-\hat\lambda)^2$
$\quad = (\lambda-\hat\lambda)\,\hat y^* A'(\hat\lambda)\hat x + O(\lambda-\hat\lambda)^2.$

Hence in this case, (i) and (ii) are equivalent. Suppose now that $A(\hat\lambda)$ has corank $s \neq 1$. If $s = 0$ then $d(\hat\lambda) \neq 0$ and neither (i) nor (ii) holds. And if $s \ge 2$ then all $(n-1)\times(n-1)$ minors of $A(\hat\lambda)$ are zero, whence $C = 0$ and, as above, $d(\lambda) = O(\lambda-\hat\lambda)^2$. Again neither (i) nor (ii) holds. This proves the proposition.

We shall call $\hat\lambda$ a simple isolated eigenvalue of the matrix function $A(\lambda)$ if $A(\lambda)$ is twice continuously differentiable in some neighbourhood of $\hat\lambda$ and the conditions (i) and (ii) of Proposition 1 are satisfied.

PROPOSITION 2. Let $\hat\lambda$ be a simple isolated eigenvalue of $A(\lambda)$, and let $\hat x$ be a corresponding right eigenvector normalized such that $e^*\hat x = 1$. Then the matrix

(11) $B := A(\hat\lambda) + A'(\hat\lambda)\hat x e^*$

is nonsingular.

Proof. Assume that $Bx = 0$. Then, with a left eigenvector $\hat y$, we have $0 = \hat y^* Bx = \hat y^* A(\hat\lambda)x + \hat y^* A'(\hat\lambda)\hat x\, e^* x = \hat y^* A'(\hat\lambda)\hat x \cdot e^* x$, and by (10) then $e^* x = 0$. Therefore $A(\hat\lambda)x = Bx - A'(\hat\lambda)\hat x\, e^* x = 0$, and $x = t\hat x$ for suitable $t$ since $A(\hat\lambda)$ has corank 1. Now $t = t e^*\hat x = e^* x = 0$ implies $x = 0$. Since $x$ was arbitrary, $B$ is nonsingular. □

PROPOSITION 3. With the assumptions of Proposition 2, suppose that for sufficiently small $\bar\varepsilon \ge \varepsilon > 0$ we have $0 < |\sigma - \hat\lambda| \le \bar\varepsilon$.


and since $B$ is nonsingular by Proposition 2, $S$ is nonsingular, and

(16) $S^{-1} = B^{-1} + O(\bar\varepsilon) = O(1)$.

Moreover,

$S\hat x = (A(\sigma) - A(\hat\lambda))\hat x + (1 - \sigma + \hat\lambda)A[\sigma,\hat\lambda]\hat x\, e^*\hat x$
$\quad = (\sigma - \hat\lambda)A[\sigma,\hat\lambda]\hat x + (1 - \sigma + \hat\lambda)A[\sigma,\hat\lambda]\hat x,$

so that

(17) $A[\sigma,\hat\lambda]\hat x = S\hat x$.

Since $e^*(\tilde x - e^*\tilde x\cdot\hat x) = e^*\tilde x - e^*\tilde x\cdot e^*\hat x = 0$, (15), (12), and (17) imply

$z := S(\tilde x - e^*\tilde x\cdot\hat x) = A(\sigma)(\tilde x - e^*\tilde x\cdot\hat x)$
$\quad = A(\sigma)\tilde x - e^*\tilde x\cdot A(\sigma)\hat x$
$\quad = (A(\sigma) - A(\lambda))x - e^*\tilde x\,(A(\sigma) - A(\hat\lambda))\hat x$
$\quad = (\sigma - \lambda)A[\sigma,\lambda]x - e^*\tilde x\,(\sigma - \hat\lambda)A[\sigma,\hat\lambda]\hat x$
$\quad = (\sigma - \lambda)\bigl(A[\sigma,\hat\lambda]\hat x + O(\varepsilon)\bigr) - e^*\tilde x\,(\sigma - \hat\lambda)A[\sigma,\hat\lambda]\hat x$
$\quad = (\sigma - \lambda)\bigl(S\hat x + O(\varepsilon)\bigr) - e^*\tilde x\,(\sigma - \hat\lambda)S\hat x.$

By (16), this implies

(18) $\tilde x - e^*\tilde x\cdot\hat x = S^{-1}z = (\sigma - \lambda)(\hat x + O(\varepsilon)) - e^*\tilde x\,(\sigma - \hat\lambda)\hat x$.

Multiplication with $e^*$ gives

$0 = (\sigma - \lambda)(1 + O(\varepsilon)) - e^*\tilde x\,(\sigma - \hat\lambda),$

which implies (13) and $\sigma - \lambda = e^*\tilde x\,(\sigma - \hat\lambda)(1 + O(\varepsilon))$. Insertion into (18) and division by $e^*\tilde x$ finally gives $\tilde x/e^*\tilde x - \hat x = (\sigma - \hat\lambda)\,O(\varepsilon)$, which implies (14). □

PROPOSITION 4. Under the assumptions of Proposition 2, suppose that for sufficiently small $\bar\varepsilon \ge \varepsilon > 0$ we have $0 < |\sigma - \hat\lambda| \le \bar\varepsilon$.

Proof. As $\bar\varepsilon \to 0$, $A(\sigma)$ approaches $A(\hat\lambda)$, whence $y$ approaches a left null vector $\hat y$ of $A(\hat\lambda)$, i.e. a left eigenvector corresponding to $\hat\lambda$, and $f(\lambda)$ approaches the function $\hat f(\lambda) := \hat y^* A(\lambda)\hat x$, which has a simple zero at $\hat\lambda$. Therefore, if $\bar\varepsilon$ is sufficiently small, $f(\lambda)$ has a simple zero $\lambda_{l+1}$ close to $\hat\lambda$. Now $0 = f(\lambda_{l+1}) = f(\hat\lambda) + (\lambda_{l+1} - \hat\lambda)f'(\tilde\lambda)$ for some $\tilde\lambda$ near $\hat\lambda$, whence

$\lambda_{l+1} = \hat\lambda - f(\hat\lambda)/f'(\tilde\lambda) = \hat\lambda + O(f(\hat\lambda))$

since $f'(\tilde\lambda)$ is bounded away from zero (near $\hat\lambda$). But

$f(\hat\lambda) = y^* A(\hat\lambda)x = y^* A(\hat\lambda)(x - \hat x) = O(\varepsilon),$


so that (19) holds. (20) is proved in the same way with $y = x^{(l)}$, observing that in the Hermitian case $\hat y := \hat x$ is a left eigenvector, and

$f(\hat\lambda) = x^{(l)*} A(\hat\lambda)\,x^{(l)} = (x^{(l)} - \hat x)^* A(\hat\lambda)\,(x^{(l)} - \hat x) = O(\varepsilon^2)$. □

With the observation that for $\sigma \to \hat\lambda$ the solution $x^{(0)}$ of (5) converges to $\hat x$, Propositions 3 and 4 lead to the following local convergence theorem.

THEOREM. Let $\hat\lambda$ be a simple isolated eigenvalue of $A(\lambda)$, and suppose that $\hat x$ is a corresponding eigenvector normalized such that $e^*\hat x = 1$. Then the residual inverse iteration converges for all $\sigma$ sufficiently close to $\hat\lambda$, and we have

$\frac{\|x^{(l+1)} - \hat x\|}{\|x^{(l)} - \hat x\|} = O(|\sigma - \hat\lambda|), \qquad |\lambda_{l+1} - \hat\lambda| = O(\|x^{(l)} - \hat x\|^p),$

where $p = 1$ if (6b) is used, and $p = 2$ if $A(\lambda)$ is Hermitian, $\hat\lambda$ is real, and (6a) is used. □

The theorem implies local linear convergence with a convergence factor proportional to $|\sigma - \hat\lambda|$. In particular, this suggests that the convergence is accelerated by updating the shift $\sigma$ in each iteration step (or in some iteration steps only, if the extra work to refactor $A(\sigma)$ is considered as being too much). It follows easily from the theorem that we have quadratic convergence (and in the Hermitian case with real $\hat\lambda$ even cubic convergence) if in each iteration step $\sigma$ is replaced by the most recent value of $\lambda_l$.
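As an illustration of this acceleration, a variable-shift variant of the sketch in §2 might refactor $A(\sigma)$ at every step; the following is again an assumed interface, not code from the paper, and reuses the imports of the earlier sketch.

    def residual_inverse_iteration_varshift(A, sigma, n, tol=1e-14, maxit=20):
        """Variant of the fixed-shift sketch: the shift is replaced by the
        most recent lambda_l, so A(shift) is refactored in every step."""
        shift = sigma
        lu = lu_factor(A(shift))
        x = lu_solve(lu, np.ones(n))
        k = int(np.argmax(np.abs(x)))
        x = x / x[k]
        lam = shift
        for _ in range(maxit):
            f = lambda mu: lu_solve(lu, A(mu) @ x)[k]   # (6b) with sigma = shift
            h = 1e-7 * (1.0 + abs(lam))
            lam = lam - f(lam) * h / (f(lam + h) - f(lam))
            dx = lu_solve(lu, A(lam) @ x)               # Steps 3 and 4 combined
            x = x - dx
            x = x / x[k]                                # normalization (9)
            if np.linalg.norm(dx) < tol * np.linalg.norm(x):
                break
            shift = lam                                 # update shift and refactor
            lu = lu_factor(A(shift))
        return lam, x

The refactorization buys the faster convergence order at the price of one new factorization per step, which is exactly the trade-off discussed above.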

4. Numerical examples. For the actual computation on a computer, several remarks are in order. Equation (5) is usually solved by using a factorization

(21a) $A(\sigma) = SR$,

where $R$ is upper triangular and $S$ is a permuted lower unit triangular or orthogonal matrix. An appropriate choice of $b$ is then the vector $b = Sj$, $j = (1,\dots,1)^*$, so that we actually solve

(5a) $R\tilde x^{(0)} = (1,\dots,1)^*$, $\quad x^{(0)} = \tilde x^{(0)}/e^*\tilde x^{(0)}$

in place of (5). This choice is motivated in Wilkinson [11] for ordinary inverse iteration and works well in the present algorithm.
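With SciPy's packed LU factors, for example, (5a) can be realized directly, since the upper triangle of the packed factor is $R$; a sketch, where Amat, sigma and n are assumed to be given:

    import numpy as np
    from scipy.linalg import lu_factor, solve_triangular

    lu, piv = lu_factor(Amat(sigma))      # packed factors of A(sigma) = SR
    xt = solve_triangular(lu, np.ones(n), lower=False)  # R xt = (1,...,1)*, cf. (5a)
    x0 = xt / xt[np.argmax(np.abs(xt))]   # normalize so that e* x0 = 1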

In the special case that $A(\lambda)$ is linear in $\lambda$, and the QR (or QZ) algorithm has been used to compute the eigenvalues (cf. Wilkinson [11], Parlett [5], Stewart [8]), a factorization

(21b) $A(\sigma) = Q_1 B(\sigma) Q_2$

with orthogonal $Q_1$, $Q_2$ and Hessenberg (or tridiagonal) $B(\sigma)$ is already available, and it may be more economical to factor $B(\sigma)$ instead:

(21c) $B(\sigma) = SR$,

and solve

(5b) $R\bar x^{(0)} = (1,\dots,1)^*$, $\quad \tilde x^{(0)} = Q_2^*\bar x^{(0)}$, $\quad x^{(0)} = \tilde x^{(0)}/e^*\tilde x^{(0)}$

in place of (5). It is a useful fact that the factorization can be reused to find the vector $e^* A(\sigma)^{-1}$ required in (6b) as

(22a) $e^* A(\sigma)^{-1} = e^* R^{-1} S^{-1}$ using (21a), or
(22b) $e^* A(\sigma)^{-1} = e^* Q_2^* R^{-1} S^{-1} Q_1^*$ using (21b, c).
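In SciPy terms, for the LU case (22a) amounts to a single transposed solve with the packed factors from the snippet above (again a sketch; lu, piv, n and the index k of the normalization entry are assumed):

    from scipy.linalg import lu_solve

    e = np.zeros(n); e[k] = 1.0
    y = lu_solve((lu, piv), e, trans=2)   # solves A(sigma)* y = e, so y* = e* A(sigma)^{-1}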


Similarly, the factorization yields the correction $dx^{(l)}$ in (8) as

(8a) $dx^{(l)} = R^{-1} S^{-1} r^{(l)}$ using (21a), or
(8b) $dx^{(l)} = Q_2^* R^{-1} S^{-1} Q_1^* r^{(l)}$ using (21b, c).

Equations (8) and (9) suggest that the limit accuracy with which $\hat x$, $\hat\lambda$ can be approximated by $x^{(l)}$, $\lambda_l$ is mainly determined by the accuracy with which the residual $A(\lambda_{l+1})x^{(l)}$ is computed. Therefore it is sensible to store $x^{(l)}$ and $\lambda_l$ in double precision. Then $\lambda_{l+1}$ and $r^{(l)}$ should be computed in double precision, but $r^{(l)}$ can be rounded to single precision before it is stored. The factorizations (21a–c) and the solution of the equations (5a, b), (8a, b) can be performed in single precision, as well as the computation of $e^* A(\sigma)^{-1}$ by (22a, b). Finally, the correction (9) should be done in double precision again. The resulting limit accuracy can then be expected to be about the same as with the use of double precision throughout, and this is confirmed by the numerical examples shown below.

Finally, the equations (6a) resp. (6b) need not be solved to full accuracy; it is sufficient to take for $\lambda_{l+1}$ one Newton step (linear interpolation) or Euler step (quadratic interpolation) from $\lambda_l$ (starting with $\lambda_0 := \sigma$) towards the solution.

To demonstrate the behaviour of residual inverse iteration we report here some of the numerical experiments which we have done on the UNIVAC 1100/82 of the University of Freiburg (mantissa length: 27 bits for single precision, 60 bits for double precision). The linear equations were solved using single precision Gauss elimination with column pivoting, and the vector $e$ was chosen as the unit vector with a 1 in the position of the absolutely largest entry of the most recent $x^{(l)}$. This position was found to be independent of $l$ except sometimes for $l = 1$ or $2$.

For a fixed shift $\sigma$ we generally observed global, monotonic and linear convergence of $\lambda_l$ to one of the eigenvalues nearest to $\sigma$. In almost all examples tried, the observed convergence factor for the eigenvector was

$\|x^{(l+1)} - \hat x\| \approx C\,\frac{|\sigma - \hat\lambda|}{\inf|\sigma - \lambda|}\,\|x^{(l)} - \hat x\|,$

where the infimum extends over all eigenvalues $\lambda$ of $A(\lambda)$ distinct from $\hat\lambda$, and $C$ varied between 0.5 and 3. The classical analysis of inverse iteration guarantees such a behaviour in the linear, nondefective case.

Although our convergence analysis applies only to simple, isolated eigenvalues, multiple nondefective eigenvalues were in fact computed with the same speed and accuracy as simple ones. We did not try residual inverse iteration on defective problems. We now consider specific examples.

1. Our first example is a standard eigenvalue problem $A_0 x = \lambda x$, corresponding to $A(\lambda) = A_0 - \lambda I$. The matrix $A_0$ is the Frank matrix of order 11 (see [3]):

$A_0 = \begin{pmatrix}
11 & 10 & 9 & \cdots & 1 \\
10 & 10 & 9 & \cdots & 1 \\
   &  9 & 9 & \cdots & 1 \\
   &    &   & \ddots & \vdots \\
   &    &   & 1      & 1
\end{pmatrix}.$
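For instance, this matrix and the computation reported below can be set up as follows; a sketch using the residual_inverse_iteration function from the end of §2:

    import numpy as np

    n = 11
    F = np.array([[float(n + 1 - max(i, j)) if j >= i - 1 else 0.0
                   for j in range(1, n + 1)]
                  for i in range(1, n + 1)])        # Frank matrix of order 11
    lam, x = residual_inverse_iteration(lambda mu: F - mu * np.eye(n),
                                        sigma=1.0001, n=n)
    print(lam)  # approaches the eigenvalue 1 discussed below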

All eigenvalues are simple. With strategy (6b) and constant shifts accurate to 10% and

    3.RedistributionsubjecttoSIAMlicenseorcopyright;seeh

    ttp://www.siam.org/journals/o

    jsa.php

  • 7/28/2019 32 Residual inverse iteration for the nonlinear eigenvalue problem.pdf

    8/10

0.1%, respectively, all eigenvalues were found very accurately. We give details for the eigenvalue $\hat\lambda = 1$. The associated normalized eigenvector is

$\hat x = \bigl(-\tfrac{1}{3840},\, 0,\, \tfrac{1}{384},\, 0,\, -\tfrac{1}{48},\, 0,\, \tfrac{1}{8},\, 0,\, -\tfrac{1}{2},\, 0,\, 1\bigr)^*.$

With constant shift $\sigma = 1.0001$ the sixth iterate $x = x^{(6)}$ was accurate to 15 decimals, e.g.

$\lambda = 1.000\,000\,000\,000\,000\,1$,
$x_{11} = 1$ (normalized),
$x_{10} = 0.000\,000\,000\,000\,000\,2$,
$x_1 = -0.000\,260\,416\,666\,666\,7$.

This confirms the claim that a single precision factorization coupled with double precision residuals suffices to produce results comparable with the use of double precision throughout. We remark that if (6a) is used in place of (6b) to compute $\lambda_{l+1}$, then the iteration fails to converge to the small eigenvalues, since the corresponding left and right eigenvectors are almost orthogonal.
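The mixed-precision scheme of §4 can be imitated, e.g., with NumPy's float32/float64 types; a sketch of one step, not the UNIVAC arithmetic of the paper, with Amat, sigma, lam, x and k assumed as in the earlier sketches:

    lu32 = lu_factor(Amat(sigma).astype(np.float32))  # factor in single precision
    r = Amat(lam) @ x                                 # residual (7) in double precision
    dx = lu_solve(lu32, r.astype(np.float32))         # correction (8) in single precision
    x = x - dx.astype(np.float64)                     # update (9) in double precision
    x = x / x[k]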

2. Our second example is a symmetric, definite quadratic eigenvalue problem taken from Scott and Ward [7]:

$A(\lambda) = \begin{pmatrix}
-10\lambda^2+\lambda+10 & & & & \mathrm{sym.} \\
2\lambda^2+2\lambda+2 & -11\lambda^2+\lambda+9 & & & \\
-\lambda^2+\lambda-1 & 2\lambda^2+2\lambda+3 & \lambda^2+2\lambda+2 & & \\
2\lambda^2+\lambda-1 & 3\lambda^2+\lambda-2 & \lambda^2+3\lambda-2 & -12\lambda^2+10 & \\
-\lambda^2-2\lambda+2 & -10\lambda^2+2\lambda+1 & 2\lambda^2-2\lambda & 2\lambda^2+3\lambda+\dots & -11\lambda^2+3\lambda+10
\end{pmatrix}$

Its eigenvalues (to three decimals) are

$-1.27,\; -1.08,\; -1.0048,\; -.779,\; -.512,\; .502,\; .880,\; .937,\; 1.47.$

Selected results are given in Table 1; listed are

σ: the (constant) shift,
l: the number of iterations (max. 20, marked by an asterisk),
Δx: max. norm of the final eigenvector correction,
q: average quotient of consecutive corrections,
q*: the relative distance $|\sigma-\hat\lambda|/\inf|\sigma-\lambda|$ of the shift from the eigenvalues,
λ: the computed eigenvalue.

TABLE 1

 σ      l     Δx        1/q    1/q*    λ
 −1     4     8·10⁻⁸    16.5   15.9    −1.004 838 220 309 025
 0      20*   4·10⁻⁷    2.15   1.52    −.511 761 939 586 031
 .5     8     2·10⁻⁸    208    157     .502 415 273 308 102
 .9     20*   7·10⁻⁷    1.88   1.82    .879 927 281 097 871
 .94    16    6·10⁻⁸    16.9   17.4    .936 550 668 659 857

It is seen that, as described above, the convergence rate $q$ strongly correlates with the relative distance $q^*$ of the shift from the eigenvalues; in particular, this explains the slow convergence when the shift is near the average of two consecutive eigenvalues.
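Quadratic problems like this one enter the sketch of §2 through a callable; for instance, with hypothetical coefficient matrices M, C, K read off from $A(\lambda) = \lambda^2 M + \lambda C + K$:

    def quadratic(M, C, K):
        """A(lam) = lam^2 M + lam C + K as a callable for the iteration."""
        return lambda lam: (lam * lam) * M + lam * C + K

    # e.g. lam, x = residual_inverse_iteration(quadratic(M, C, K),
    #                                          sigma=0.5, n=M.shape[0])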


3. Our third example is another symmetric definite problem of Scott and Ward [7], this time with multiple eigenvalues:

$A(\lambda) = \begin{pmatrix}
\lambda^2-3\lambda+1 & & & \mathrm{sym.} \\
\lambda^2-1 & -2\lambda^2-3\lambda+5 & & \\
-\lambda^2-3\lambda+1 & \lambda-1 & -2\lambda^2-5\lambda+2 & \\
-2\lambda^2-6\lambda+2 & 2\lambda^2-2 & -4\lambda & -9\lambda^2-19\lambda+14
\end{pmatrix}$

This matrix function has the double eigenvalues 1 and $-2$, and the simple eigenvalues $-4\pm\sqrt{19}$ ($\approx -8.36$, $.359$) and $-4\pm\sqrt{18}$ ($\approx -8.24$, $.243$).

Sample results are given in Table 2.

TABLE 2

 σ       l     Δx         1/q    1/q*    λ
 .3      20*   –          1.05   1.03    .247
 .25     14    2·10⁻¹⁰    15.3   14.8    .242 640 687 119 285
 1.01    9     5·10⁻¹⁰    57     65.1    1
 −2.01   7     7·10⁻¹⁰    413    623     −2

The multiple eigenvalues are found as efficiently as the others.

4. Finally, we demonstrate the effect of convergence acceleration with the matrix function of Example 2. Starting with $\sigma = -2$, the shift was updated in each iteration, replacing $\sigma$ by the single precision truncation of the most recent $\lambda_l$. After 8 steps the approximate eigenpair agrees to 16 decimal places with that computed by the algorithm with constant shift $\sigma = -1$, but the $\lambda_l$ are no longer monotonic, and the limit eigenvalue $-1.0048\cdots$ is no longer nearest to the initial shift. The convergence behaviour can be seen from Table 3, which lists the maximal element of the residuals and the eigenvector corrections.

TABLE 3

 Step   Residual     Correction
 1      4.77         1.85
 2      4.63         1.13
 3      1.07         3.37·10⁻¹
 4      2.14·10⁻²    9.77·10⁻³
 5      6.82·10⁻⁵    3.70·10⁻⁵
 6      1.12·10⁻⁹    7.75·10⁻¹⁰
 7      1.19·10⁻¹⁷   6.64·10⁻¹⁸
 8      5.51·10⁻¹⁸   5.64·10⁻¹⁸

REFERENCES

[1] J. J. DONGARRA, C. B. MOLER AND J. H. WILKINSON, Improving the accuracy of computed eigenvalues and eigenvectors, this Journal, 20 (1983), pp. 23–45.
[2] D. K. FADDEEV AND V. N. FADDEEVA, Computational Methods of Linear Algebra, Freeman, San Francisco, 1963.
[3] R. T. GREGORY AND D. L. KARNEY, A Collection of Matrices for Testing Computational Algorithms, Wiley-Interscience, New York–London, 1969.
[4] W. GRÖBNER, Matrizenrechnung, Bibliogr. Inst., Mannheim–Wien–Zürich, 1966.
[5] B. N. PARLETT, The Symmetric Eigenvalue Problem, Prentice-Hall, Englewood Cliffs, NJ, 1980.
[6] A. RUHE, Algorithms for the nonlinear eigenvalue problem, this Journal, 10 (1973), pp. 674–689.
[7] D. S. SCOTT AND R. C. WARD, Solving symmetric-definite quadratic problems without factorization, SIAM J. Sci. Stat. Comput., 3 (1982), pp. 58–67.
[8] G. W. STEWART, Introduction to Matrix Computations, Academic Press, New York–San Francisco–London, 1973.
[9] H. J. SYMM AND J. H. WILKINSON, Realistic error bounds for a simple eigenvalue and its associated eigenvector, Numer. Math., 35 (1980), pp. 113–126.
[10] H. UNGER, Nichtlineare Behandlung von Eigenwertaufgaben, Z. Angew. Math. Mech., 30 (1950), pp. 281–282.
[11] J. H. WILKINSON, The Algebraic Eigenvalue Problem, Oxford Univ. Press, London, 1965.
[12] T. YAMAMOTO, Error bounds for computed eigenvalues and eigenvectors, Numer. Math., 34 (1980), pp. 189–199.