METHODS FOR ACCELERATION OF CONVERGENCE (EXTRAPOLATION) OF VECTOR SEQUENCES
INTRODUCTION
An important problem that arises in different areas of science and engineering is that of computing limits of sequences of vectors {x_m}, where x_m ∈ C^N, and the dimension N is very large in many applications. Such vector sequences arise, for example, in the numerical solution of very large systems of linear or nonlinear equations by fixed-point iterative methods, and lim_{m→∞} x_m are simply the required solutions to these systems. One common source of such systems is the finite-difference or finite-element discretization of continuum problems.

In most cases, however, the sequences {x_m} converge to their limits extremely slowly. That is, s = lim_{m→∞} x_m can be approximated with a prescribed level of accuracy by x_m only for very large m. Clearly, this way of approximating s via the x_m becomes very expensive computationally. One practical way of tackling this problem effectively is by applying to the sequence {x_m} a suitable convergence acceleration method
(or extrapolation method).

More specifically, let us consider the (linear or nonlinear) system of equations

ψ(x) = 0,  ψ : C^N → C^N,  (1)
whose solution we denote s. What is meant by x, s, and ψ(x) are, respectively,

x = [x^(1), ..., x^(N)]^T,  s = [s^(1), ..., s^(N)]^T,  x^(i), s^(i) scalars,

and

ψ(x) = [ψ_1(x), ..., ψ_N(x)]^T,  ψ_i(x) = ψ_i(x^(1), ..., x^(N)) scalar functions.^1
Then, starting with a suitable vector x_0, an initial approximation to s, the sequence {x_m} of approximations can be generated by some fixed-point iterative method as in

x_{m+1} = f(x_m),  m = 0, 1, ...,  f : C^N → C^N,  (2)

with

f(x) = [f_1(x), ..., f_N(x)]^T,  f_i(x) = f_i(x^(1), ..., x^(N)) scalar functions.
Here x − f(x) = 0 is a possibly "preconditioned" form of Equation (1); hence, it has the same solution s [that is, ψ(s) = 0 and also s = f(s)], and, in case of convergence, lim_{m→∞} x_m = s. One possible form of f(x) would be

f(x) = x + A(x)ψ(x),

where A(x) is an N × N matrix such that A(s) is nonsingular.
Now, the rate of convergence of the sequence {x_m} in Equation (2) to s is determined by ρ(F(s)), where F(x) is the Jacobian matrix^2 of the vector-valued function f(x) and ρ(K) denotes the spectral radius^3 of the matrix K. It is known that ρ(F(s)) < 1 must hold for convergence to take place and that, the closer ρ(F(s)) is to zero, the faster the convergence. The rate of convergence deteriorates as ρ(F(s)) becomes closer to 1, however.

^1 Throughout, we use lowercase boldface letters to denote vectors and vector-valued functions, and we use uppercase boldface letters to denote matrices.
^2 F(x) is an N × N matrix whose (i, j) entry is ∂f_i/∂x^(j) evaluated at x.
^3 ρ(K) = max_i |μ_i|, where the μ_i are the eigenvalues of K. Of course, K is a square matrix.

Wiley Encyclopedia of Computer Science and Engineering, edited by Benjamin Wah. Copyright © 2008 John Wiley & Sons, Inc.
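This dependence on the spectral radius can be seen in a small numerical sketch. The 2 × 2 matrix T and vector b below are hypothetical, chosen only so that ρ(T) = 0.6; successive error norms of the iteration x_{m+1} = Tx_m + b then shrink by a factor approaching ρ(T).

```python
import numpy as np

# Hypothetical 2x2 iteration x_{m+1} = T x_m + b with fixed point
# s = (I - T)^{-1} b.  Since e_m = T^m e_0, the error shrinks roughly
# like rho(T)^m, where rho(T) is the spectral radius.
T = np.array([[0.5, 0.2],
              [0.1, 0.4]])
b = np.array([1.0, 1.0])
s = np.linalg.solve(np.eye(2) - T, b)      # exact fixed point
rho = max(abs(np.linalg.eigvals(T)))       # spectral radius = 0.6 here

x = np.zeros(2)
errors = []
for m in range(30):
    errors.append(np.linalg.norm(x - s))
    x = T @ x + b

# The ratio of successive error norms approaches rho(T)
print(rho, errors[-1] / errors[-2])
```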
As an example, let us consider the cases in which Equation (1) and Equation (2) arise from finite-difference or finite-element discretizations of continuum problems. For s [the solution to Equation (1) and Equation (2)] to be a reasonable approximation to the solution of the continuum problem, the mesh size of the discretization should be small enough. However, a small mesh size means a large N. In addition, as the mesh size tends to zero, hence N → ∞, generally ρ(F(s)) tends to 1, as can be shown rigorously in some cases. All this means that, when the mesh size decreases, not only does the dimension of the problem increase, but the convergence of the fixed-point method in Equation (2) deteriorates as well. As mentioned above, this problem of slow convergence can be treated efficiently via vector extrapolation methods.
Before going on, we mention that, given a vector sequence {x_m} that converges, a vector extrapolation method computes, directly or indirectly, a "weighted average" of a certain number of the vectors x_m as an approximation to lim_{m→∞} x_m. This approximation is of the form

Σ_{i=0}^{k} γ_i^{(n,k)} x_{n+i},

where n and k are integers chosen by the user and the scalars γ_i^{(n,k)}, which depend on the x_m and can even be complex, satisfy

Σ_{i=0}^{k} γ_i^{(n,k)} = 1.^4
In general, the methods differ in the way they determine the γ_i^{(n,k)}. A nice feature of vector extrapolation methods that is also of practical importance is that they take only the vectors x_m as their input. The way these vectors are generated is irrelevant; they need no other input.
An additional useful property of vector extrapolation methods is that they can even transform a divergent sequence of vectors {x_m} into a convergent one, with the right "limit." For example, if s is the solution to x − f(x) = 0 and the sequence {x_m} obtained from Equation (2) diverges, vector extrapolation methods can produce sequences of approximations that converge to s. This can be proved rigorously when f(x) is linear; it seems to be true also when f(x) is nonlinear. In case of divergence, the solution s is called the antilimit of {x_m}.

A detailed review of vector extrapolation methods, which contains the developments up to the early 1980s, can be found in the work of Smith et al. (1). This work discusses (1) two polynomial-type methods, namely, the minimal polynomial extrapolation (MPE) of Cabay and Jackson (2) and the reduced rank extrapolation (RRE) of Eddy (3) and Mesina (4), and (2) three epsilon algorithms, namely, the scalar and vector epsilon algorithms of Wynn (5, 6) and the topological epsilon algorithm of Brezinski (7). When applied to very large systems of equations, polynomial-type methods have been observed generally to be more economical than the epsilon algorithms as far as computation time and storage requirements are concerned. For this reason, we concentrate on MPE and RRE in the present work.
In the next section, we present some preliminaries that justify the development of MPE and RRE via a study of linear systems. Following these preliminaries, in the succeeding six sections, we present an up-to-date review of MPE and RRE, which are the two polynomial-type methods. This review contains the developments that have taken place in the study of MPE and RRE since the publication of Ref. 1. In the third section, we give a detailed discussion of the derivation of MPE and RRE that is easy to understand even by those with limited background. In the fourth section, we provide some elegant determinant representations for them. In the fifth section, we discuss their close connection with the method of Arnoldi (8) and with GMRES of Saad and Schultz (9), two well-known Krylov subspace methods for solving linear systems. In the sixth section, we describe the most accurate and stable algorithms for their implementation in finite-precision (that is, floating-point) arithmetic; these algorithms are also the most economical as far as their computational cost and storage requirements are concerned. We provide all the details of these algorithms to enable the reader to program them easily. In the seventh section, we provide the analytic theory of MPE and RRE concerning their convergence, acceleration of convergence, error, and numerical stability behavior. The results of this section are important; they are used in the eighth section, where we discuss a mode of usage for MPE and RRE that is known as cycling and that is very suitable for solving nonlinear systems.
In the ninth section, we turn to some additional applications of vector extrapolation methods. When applied to the sequence of partial sums of the vector-valued power series of a vector-valued function, MPE serves as a tool for analytic continuation: It produces vector-valued rational approximations to the function in question that approximate the latter even outside the circle of convergence of its power series. These rational approximations are closely related to the Arnoldi method for eigenvalue approximations. This topic, which we discuss very briefly here, is treated at length in Sidi (10, 11).
Another application of vector extrapolation methods, especially of MPE and RRE, is to the computation of an eigenvector of an arbitrary large and sparse matrix that corresponds to its largest eigenvalue when this eigenvalue is known. In addition to being of importance in its own right, this problem has attracted much attention recently because it also arises in the computation of the PageRank of the Google Web matrix. Vector extrapolation methods, especially MPE and RRE, can be used effectively in the solution of this problem and hence in PageRank computations. The contents of this part of the paper on computing eigenvectors in general and the PageRank in particular are new and originally appeared in the author's 2004 technical report (12).

^4 In case x_m = s for all m, it is natural and desirable to have Σ_{i=0}^{k} γ_i^{(n,k)} x_{n+i} = s, and this implies that Σ_{i=0}^{k} γ_i^{(n,k)} = 1 must hold.
For completeness, in the section entitled "Brief Summary of MMPE," we discuss briefly a third polynomial method, known as the modified minimal polynomial extrapolation (MMPE), which was proposed independently by Brezinski (7), Pugachev (13), and Sidi et al. (14).
Again, for completeness, in the last section, we give a brief discussion of the three epsilon algorithms mentioned above.
Before we end this section, we would like to point out a nice feature of vector extrapolation methods in general, and of MPE and RRE in particular: These methods can be defined and used in the setting of general infinite-dimensional inner product spaces, as well as in C^N with finite N. The algorithms and the convergence theory remain the same for all practical purposes.
Finally, we mention that MPE and RRE have been used as effective accelerators for solving large and sparse nonlinear systems of equations that arise in diverse areas of science and engineering, such as computational fluid dynamics, structures, materials, semiconductor research, computerized tomography, and computer vision, to name a few. We refer the reader to the relevant literature for these applications.
Note that, throughout this work, (x, y) will stand for the Euclidean inner product of the vectors x, y ∈ C^s, that is, (x, y) = x*y. Similarly, ‖z‖ will stand for the standard vector l_2-norm of z ∈ C^s, namely, ‖z‖ = √(z*z). Finally, ‖A‖ will stand for the matrix norm of A ∈ C^{s×s} induced by this vector norm.^5 Here, s is any positive integer.
PRELIMINARIES AND MOTIVATION
Because one of the main areas of application of vector extrapolation methods is the numerical solution of large and sparse (linear and nonlinear) systems of equations by fixed-point methods, we will motivate their derivation in this context. We start by discussing the nature of the vectors x_m that arise from the iterative method of Equation (2), the function f(x) there being nonlinear in general. Assuming that lim_{m→∞} x_m exists, hence that x_m ≈ s for all large m [recall that s is the solution to the system ψ(x) = 0 and hence to the system x = f(x)], we expand f(x_m) in Equation (2) about s, thus obtaining

x_{m+1} = f(s) + F(s)(x_m − s) + O(‖x_m − s‖^2)  as m → ∞.  (3)
Recalling that s = f(s), and rewriting Equation (3) in the form

x_{m+1} − s = F(s)(x_m − s) + O(‖x_m − s‖^2)  as m → ∞,  (4)
we realize that, for all large m, the vectors x_m and x_{m+1} satisfy the approximate equation x_{m+1} ≈ F(s)x_m + [I − F(s)]s. That is, the sequence {x_m} behaves as if it were being generated by an N-dimensional linear system of the form (I − T)x = b through

x_{m+1} = Tx_m + b,  m = 0, 1, ...,  (5)

where T = F(s) and b = [I − F(s)]s. This suggests that we should study those sequences {x_m} that arise from linear systems of equations to derive and study vector extrapolation methods. We undertake this task in the next section.
DERIVATION OF MPE AND RRE
Linear Systems and Iterative Methods
Let s be the unique solution to the N-dimensional linear system

x = Tx + b.  (6)
Writing this system in the form (I − T)x = b, it becomes clear that the uniqueness of the solution is guaranteed when the matrix I − T is nonsingular or, equivalently, when T does not have 1 as an eigenvalue. Let the vector sequence {x_m} be generated via the iterative scheme

x_{m+1} = Tx_m + b,  m = 0, 1, ... .  (7)
Provided ρ(T) < 1, lim_{m→∞} x_m exists and equals s.

Given the sequence x_0, x_1, ..., generated as in Equation (7), let

u_m = Δx_m,  w_m = Δu_m = Δ^2 x_m,  m = 0, 1, ...,  (8)

where Δx_m = x_{m+1} − x_m and Δ^2 x_m = Δ(Δx_m) = x_{m+2} − 2x_{m+1} + x_m, and define the error vectors e_m as in

e_m = x_m − s,  m = 0, 1, ... .  (9)
Using Equation (7), it is easy to show that

u_m = Tu_{m−1},  w_m = Tw_{m−1},  m = 1, 2, ... .  (10)

From this, we have by induction that

u_m = T^m u_0,  w_m = T^m w_0,  m = 0, 1, ... .  (11)
^5 Actually, we can employ any arbitrary inner product (x, y) and the l_2-norm induced by it, namely, ‖z‖ = √((z, z)). [Recall that any inner product in C^N is of the form (x, y) = x*My, where M is a hermitian positive definite matrix.] This requires a slight modification of the algorithms in the section entitled "Efficient Implementation of MPE and RRE." The rest remains essentially unchanged.
Similarly, by Equation (7) and the fact that s = Ts + b, one can relate the error in x_m to the error in x_{m−1} via

e_m = Te_{m−1},  (12)

which, by induction, gives

e_m = T^m e_0,  m = 0, 1, ... .  (13)

Of course, T^0 = I in Equations (11) and (13) and throughout. In addition, e_m, u_m, and w_m can be related via

u_m = (T − I)e_m,  w_m = (T − I)u_m,  m = 0, 1, ... .  (14)
As mentioned in the preceding section, we propose to approximate s by a "weighted average" of the k + 1 consecutive vectors x_n, x_{n+1}, ..., x_{n+k}, for some integers n and k, as in

s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i},  Σ_{i=0}^{k} γ_i = 1.^6  (15)
(As already mentioned in the Introduction, we do not restrict the scalars γ_i to be real and/or positive; they can be complex in general.) Substituting Equation (9) in Equation (15), and making use of the fact that Σ_{i=0}^{k} γ_i = 1, we obtain

s_{n,k} = Σ_{i=0}^{k} γ_i (s + e_{n+i}) = s + Σ_{i=0}^{k} γ_i e_{n+i}.  (16)
From this, we observe that, for s_{n,k} to be a good approximation to s, the scalars γ_i should be chosen such that the vector Σ_{i=0}^{k} γ_i e_{n+i}, the "weighted average" of the individual error vectors e_n, e_{n+1}, ..., e_{n+k}, is small in some sense. Now, by Equation (13),

Σ_{i=0}^{k} γ_i e_{n+i} = (Σ_{i=0}^{k} γ_i T^i) e_n = G(T)e_n,  G(z) = Σ_{i=0}^{k} γ_i z^i.  (17)
Therefore, we should choose the polynomial G(z) [satisfying G(1) = 1] to ensure that the vector G(T)e_n is small. As we will see shortly, we can actually make G(T)e_n vanish by choosing k and the γ_i appropriately, thus obtaining s = Σ_{i=0}^{k} γ_i x_{n+i}, with Σ_{i=0}^{k} γ_i = 1. As a preliminary to this, we recall the following definitions and theorems; see, for example, Householder (15, chapter 1, p. 18):
Definition 3.1. Given a matrix K ∈ C^{N×N}, the monic polynomial^7 Q(z) is said to be a minimal polynomial of K if Q(K) = O and Q(z) has smallest degree. (O stands for the zero N × N matrix.)
Theorem 3.2. Given a matrix K ∈ C^{N×N}, the minimal polynomial of K exists, is unique, and divides the characteristic polynomial of K. [Thus, the degree of Q(z) is at most N, and all its zeros are eigenvalues of K.]
Definition 3.3. Given a matrix K ∈ C^{N×N} and a nonzero vector u ∈ C^N, the monic polynomial P(z) is said to be a minimal polynomial of K with respect to u if P(K)u = 0 and P(z) has smallest degree. (0 stands for the zero N-vector.)
Theorem 3.4. (1) Given a matrix K ∈ C^{N×N} and a nonzero vector u ∈ C^N, the minimal polynomial of K with respect to u exists, is unique, and divides the minimal polynomial of K, which divides the characteristic polynomial of K. [Thus, the degree of P(z) is at most N, and its zeros are some or all of the eigenvalues of K.] (2) In addition, if P̃(z) is any other polynomial for which P̃(K)u = 0, then P(z) divides P̃(z).
Going back to Equation (17), we can make the vector G(T)e_n there vanish if we choose the polynomial G(z) to be a constant multiple of P(z), the minimal polynomial of T with respect to e_n [which exists and is unique by part (1) of Theorem 3.4], because P(T)e_n = 0. Thus, choosing G(z) = P(z)/P(1) so that G(1) = 1 as required, we can obtain from Equations (15)–(17) that

s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i} = s,  with  Σ_{i=0}^{k} γ_i = 1.

Here, division by P(1) is allowed because P(1) ≠ 0, since 1 is not an eigenvalue of the matrix T.
We can obtain the same result in an indirect way also by working backward from P(T)e_n = 0. Let

P(z) = Σ_{i=0}^{k} c_i z^i,  c_k = 1.  (18)
Starting with P(T)e_n = 0 and invoking Equation (13), we have

Σ_{i=0}^{k} c_i T^i e_n = Σ_{i=0}^{k} c_i e_{n+i} = 0,  (19)
which, by Equation (9), can be rewritten as

Σ_{i=0}^{k} c_i (x_{n+i} − s) = 0.

^6 It should be understood that, because they depend on n and k, the scalars γ_i actually stand for γ_i^{(n,k)}, as mentioned in the Introduction. When necessary, we will use the notation γ_i^{(n,k)} in the sequel. When no confusion develops, our notation will be γ_i, for short.
^7 The polynomial A(z) = Σ_{i=0}^{m} a_i z^i is monic of degree m if a_m = 1.
Solving this for s, we have

s = (Σ_{i=0}^{k} c_i x_{n+i}) / (Σ_{i=0}^{k} c_i).

Here, division by Σ_{i=0}^{k} c_i is allowed because

Σ_{i=0}^{k} c_i = P(1) ≠ 0,
because 1 is not an eigenvalue of the matrix T. Thus, we have shown once more that

s = Σ_{i=0}^{k} γ_i x_{n+i},  with  Σ_{i=0}^{k} γ_i = 1.

Also, the γ_i of the preceding paragraph are now identified as γ_i = c_i / Σ_{j=0}^{k} c_j, i = 0, 1, ..., k, the c_i being the coefficients of the minimal polynomial of T with respect to e_n = x_n − s.
What remains is to find the c_i. Invoking c_k = 1, Equation (19) can be written as

Σ_{i=0}^{k−1} c_i e_{n+i} = −e_{n+k}.
This is a set of N linear equations in the k unknowns c_0, c_1, ..., c_{k−1}, with c_k = 1. Also, because k ≤ N, this set is in general overdetermined. Nevertheless, it is consistent and has a unique solution because P(z) exists and is unique. However, to determine the c_i from Σ_{i=0}^{k−1} c_i e_{n+i} = −e_{n+k}, it seems that we need to know the vectors e_m = x_m − s, m = n, n+1, ..., n+k, hence the solution s. Fortunately, this is not the case, and we can obtain the c_i solely from our knowledge of the vectors x_m. This we achieve as follows: Multiplying Equation (19) by T, and invoking Equation (12), we have
0 = Σ_{i=0}^{k} c_i Te_{n+i} = Σ_{i=0}^{k} c_i e_{n+i+1}.  (20)
Subtracting Equation (19) from Equation (20), we obtain

0 = Σ_{i=0}^{k} c_i (e_{n+i+1} − e_{n+i}) = Σ_{i=0}^{k} c_i (x_{n+i+1} − x_{n+i}),

hence the linear system

Σ_{i=0}^{k} c_i u_{n+i} = 0.  (21)
Clearly, the vectors u_m are known since the x_m are. Consequently, the c_i can be obtained from Equation (21) without any additional input.
Invoking Equation (11), Equation (21) can also be expressed as

0 = Σ_{i=0}^{k} c_i T^i u_n = P(T)u_n.
Actually, using P(T)u_n = 0 and the fact that (I − T) is nonsingular, we can apply part (2) of Theorem 3.4 to prove the following stronger result:

Theorem 3.5. If u_n = x_{n+1} − x_n ≠ 0, then P(z), the minimal polynomial of T with respect to e_n, is also the minimal polynomial of T with respect to u_n.
By Theorem 3.5, even though it is in general overdetermined, the system in Equation (21) is consistent and has a unique solution for the c_i. As already mentioned, its matrix is obtained solely from the vectors x_m, which are available. Setting c_k = 1, we solve this system for c_0, c_1, ..., c_{k−1}. Following that, with c_k = 1, we set

γ_i = c_i / Σ_{j=0}^{k} c_j,  i = 0, 1, ..., k.
Again, this is allowed because

Σ_{i=0}^{k} c_i = P(1) ≠ 0,

by the fact that I − T is nonsingular, and hence T does not have 1 as an eigenvalue.
Summing up, we have shown that if k is the degree of the minimal polynomial of T with respect to e_n, then there exist scalars γ_0, γ_1, ..., γ_k, satisfying Σ_{i=0}^{k} γ_i = 1, such that

Σ_{i=0}^{k} γ_i x_{n+i} = s.

The γ_i are given by

γ_i = c_i / Σ_{j=0}^{k} c_j,  i = 0, 1, ..., k,

where the c_i are the unique solution to the linear system

Σ_{i=0}^{k} c_i u_{n+i} = 0,  with c_k = 1.

At this point, we note that s is the solution to (I − T)x = b, whether ρ(T) < 1 or not. Thus, with the γ_i as determined above, s = Σ_{i=0}^{k} γ_i x_{n+i}, whether lim_{m→∞} x_m exists or not. (Recall our discussion of antilimits in the Introduction.)
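As a numerical sketch of this construction (hypothetical data, not from the original exposition), take k equal to the degree of the minimal polynomial of T with respect to e_n, which for a random 3 × 3 matrix T is generically k = N = 3. Solving Equation (21) by least squares then recovers s exactly from the iterates alone, even though s itself never enters the computation.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, k = 3, 0, 3                       # generically, k = N here
T = 0.5 * rng.random((N, N))
b = rng.random(N)
s = np.linalg.solve(np.eye(N) - T, b)   # used only to check the answer

# Generate x_0, ..., x_{n+k+1} and the differences u_m = x_{m+1} - x_m
x = [np.zeros(N)]
for _ in range(n + k + 1):
    x.append(T @ x[-1] + b)
u = [x[m + 1] - x[m] for m in range(len(x) - 1)]

# Solve sum_{i<k} c_i u_{n+i} = -u_{n+k} (consistent here), set c_k = 1
U = np.column_stack(u[n:n + k])
c, *_ = np.linalg.lstsq(U, -u[n + k], rcond=None)
c = np.append(c, 1.0)
gamma = c / c.sum()                     # gamma_i = c_i / sum_j c_j

s_nk = sum(g * x[n + i] for i, g in enumerate(gamma))
print(np.linalg.norm(s_nk - s))         # ~ machine precision
```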
1832 METHODS FOR ACCELERATION OF CONVERGENCE (EXTRAPOLATION) OF VECTOR SEQUENCES
In the sequel, we shall use the notation

U_s^{(j)} = [u_j | u_{j+1} | ··· | u_{j+s}].  (22)

Thus, U_s^{(j)} is an N × (s + 1) matrix, u_j, u_{j+1}, ..., u_{j+s} being its columns. In this notation, Equation (21) reads

U_k^{(n)} c = 0,  c = [c_0, c_1, ..., c_k]^T.  (23)
Of course, dividing Equation (23) by Σ_{i=0}^{k} c_i, and defining the γ_i as above, we also have

U_k^{(n)} γ = 0,  γ = [γ_0, γ_1, ..., γ_k]^T.  (24)
So far, we have observed that s can be determined via a sum of the form Σ_{i=0}^{k} γ_i x_{n+i} with Σ_{i=0}^{k} γ_i = 1 once the minimal polynomial of T with respect to e_n has been determined, and we have seen how this polynomial can be determined. However, the degree of the minimal polynomial of T with respect to e_n can be as large as N by part (1) of Theorem 3.4. Because N can be very large in general, determining s in the way we have described here becomes prohibitively expensive as far as computation time and storage requirements are concerned. [Note that we need to store the vectors u_n, u_{n+1}, ..., u_{n+k} and solve the N × k linear system in Equation (21).] Thus, we conclude that attempting to determine s via a combination of the iteration vectors x_m cannot be a suitable way after all. Nevertheless, with some twist, we can use the framework developed thus far to approximate s effectively. To do this, we replace the degree of the minimal polynomial of T with respect to e_n by an arbitrary small integer k (in practice, we take k ≪ N). This subject is described in the next two subsections.
Derivation of MPE
Let us choose k to be an arbitrary positive integer that is normally (much) smaller than the degree of the minimal polynomial of T with respect to e_n. Clearly, the linear system in Equation (23) (with c_k = 1) is now inconsistent and hence has no solution for c_0, c_1, ..., c_{k−1} in the ordinary sense. To get around this problem, we solve this system in the least-squares sense, because such a solution always exists. After that, we compute γ_0, γ_1, ..., γ_k precisely as described following Equation (21), and then compute the vector

s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i}

as our approximation to s. The resulting method is known as MPE. Clearly, MPE takes as its input only the integers k and n and the vectors x_n, x_{n+1}, ..., x_{n+k+1}; hence, it can be employed whether these vectors are generated by a linear or nonlinear iterative process.
We can summarize the definition of MPE for an arbitrary sequence of vectors {x_m} through the following steps:

1. Choose the integers k and n, and input the vectors x_n, x_{n+1}, ..., x_{n+k+1}. (Of course, k ≤ N, but k ≪ N normally.)

2. Compute the vectors u_n, u_{n+1}, ..., u_{n+k}, and form the N × k matrix U_{k−1}^{(n)}. (Recall that u_m = x_{m+1} − x_m.)

3. Solve the overdetermined linear system U_{k−1}^{(n)} c′ = −u_{n+k} in the least-squares sense for c′ = [c_0, c_1, ..., c_{k−1}]^T. Set c_k = 1, and compute

γ_i = c_i / Σ_{j=0}^{k} c_j,  i = 0, 1, ..., k,

provided Σ_{j=0}^{k} c_j ≠ 0.

4. Compute the vector

s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i}

as the approximation to lim_{m→∞} x_m = s.
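The four steps above can be sketched in a few lines. This is a minimal illustration only, not the numerically stable QR-based implementation described later in this article; `np.linalg.lstsq` stands in for the least-squares solve of Step 3, and the test matrix T and vector b are hypothetical.

```python
import numpy as np

def mpe(x, n, k):
    """Minimal sketch of MPE.  x[m] holds the vector x_m; uses
    x_n, ..., x_{n+k+1} and returns the approximation s_{n,k}."""
    u = [x[m + 1] - x[m] for m in range(n, n + k + 1)]   # u_m = delta x_m
    # Step 3: least-squares solution of U_{k-1}^{(n)} c' = -u_{n+k}
    U = np.column_stack(u[:k])
    c, *_ = np.linalg.lstsq(U, -u[k], rcond=None)
    c = np.append(c, 1.0)                                # c_k = 1
    gamma = c / c.sum()                # sum gamma_i = 1 (assumes sum c_j != 0)
    # Step 4: s_{n,k} = sum_i gamma_i x_{n+i}
    return sum(g * x[n + i] for i, g in enumerate(gamma))

# Hypothetical linear iteration x_{m+1} = T x_m + b for illustration
T = np.array([[0.9, 0.05],
              [0.05, 0.8]])
b = np.ones(2)
s = np.linalg.solve(np.eye(2) - T, b)
xs = [np.zeros(2)]
for _ in range(4):
    xs.append(T @ xs[-1] + b)
print(np.linalg.norm(mpe(xs, n=0, k=2) - s))   # exact here, since k = N = 2
```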
Derivation of RRE
Again, let us choose k to be an arbitrary positive integer that is normally (much) smaller than the degree of the minimal polynomial of T with respect to e_n. With this k, the linear system in Equation (24), subject to Σ_{i=0}^{k} γ_i = 1, is not consistent; hence, it has no solution for γ_0, γ_1, ..., γ_k in the ordinary sense. Therefore, we solve this system in the least-squares sense, subject to the constraint Σ_{i=0}^{k} γ_i = 1. After that, we compute the vector

s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i}

as our approximation to s. The resulting method is known as RRE. Clearly, RRE, just as MPE, takes as its input only the integers k and n and the vectors x_n, x_{n+1}, ..., x_{n+k+1}; hence, it can be employed whether these vectors are generated by a linear or nonlinear iterative process.
We can summarize the definition of RRE for an arbitrary sequence of vectors {x_m} through the following steps:

1. Choose the integers k and n, and input the vectors x_n, x_{n+1}, ..., x_{n+k+1}. (Of course, k ≤ N, but k ≪ N normally.)

2. Compute the vectors u_n, u_{n+1}, ..., u_{n+k}, and form the N × (k + 1) matrix U_k^{(n)}. (Recall that u_m = x_{m+1} − x_m.)

3. Solve the overdetermined linear system U_k^{(n)} γ = 0 in the least-squares sense, subject to the constraint Σ_{i=0}^{k} γ_i = 1. Here, γ = [γ_0, γ_1, ..., γ_k]^T.

4. Compute the vector

s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i}

as the approximation to lim_{m→∞} x_m = s.
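The constrained least-squares problem of Step 3 can also be sketched directly. The closed form γ ∝ (U*U)^{-1}e used below is one standard way to impose Σ γ_i = 1; forming the normal equations this way is adequate only for well-conditioned toy problems, not the stable QR-based algorithms given later. The diagonal matrix T and vector b are hypothetical.

```python
import numpy as np

def rre(x, n, k):
    """Minimal sketch of RRE: minimize ||U_k^{(n)} g|| subject to
    sum(g) = 1, via the normal equations (toy problems only)."""
    u = [x[m + 1] - x[m] for m in range(n, n + k + 1)]
    U = np.column_stack(u)                  # N x (k+1), full rank assumed
    M = U.conj().T @ U
    d = np.linalg.solve(M, np.ones(k + 1))  # minimizer is proportional to M^{-1} e
    gamma = d / d.sum()                     # enforce sum gamma_i = 1
    return sum(g * x[n + i] for i, g in enumerate(gamma))

# Hypothetical diagonal iteration with eigenvalues 0.9, 0.5, 0.3
T = np.diag([0.9, 0.5, 0.3])
b = np.array([0.1, 0.5, 0.7])               # chosen so that s = (1, 1, 1)
s = np.linalg.solve(np.eye(3) - T, b)
xs = [np.zeros(3)]
for _ in range(3):
    xs.append(T @ xs[-1] + b)
err_rre = np.linalg.norm(rre(xs, n=0, k=2) - s)
err_x = np.linalg.norm(xs[2] - s)
print(err_rre, err_x)                       # extrapolation beats plain iteration
```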
A method that is essentially the same as RRE was proposed by Kaniel and Stein (16). In this method, the γ_i are computed as in Step 3 of RRE, but lim_{m→∞} x_m = s is now approximated by

s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i+1}.
An Exactness Property of MPE and RRE
From Equations (19) and (20), we realize that P(z) = Σ_{i=0}^{k} c_i z^i, the minimal polynomial of T with respect to e_n = x_n − s, satisfies

Σ_{i=0}^{k} c_i (x_{m+i} − s) = 0,  m = n, n+1, ... .  (25)

Consequently, it also satisfies

Σ_{i=0}^{k} c_i u_{m+i} = 0,  m = n, n+1, ...,

from which the c_i can be determined uniquely because k is the smallest possible, and MPE and RRE produce s exactly.

We can now ignore the fact that the vectors x_m in the preceding subsections were generated by a linear fixed-point iterative method, concentrate only on vectors x_m that satisfy the recursion relation in Equation (25) with minimal k, and state the following exactness result:

Theorem 3.6. Let the vectors x_m satisfy Equation (25), with c_0 c_k ≠ 0, Σ_{i=0}^{k} c_i ≠ 0, and minimal k. Let e_m = x_m − s for all m, and assume that the vectors e_n, e_{n+1}, ..., e_{n+k−1} are linearly independent. Then the following hold: (1) With u_m = x_{m+1} − x_m for all m, all the sets {u_m, u_{m+1}, ..., u_{m+k−1}}, m = n, n+1, ..., are linearly independent. (2) The s_{m,k} obtained by applying MPE and RRE to the sequence {x_m} exist uniquely and satisfy s_{m,k} = s for all m = n, n+1, ..., both for MPE and for RRE.
This result can be viewed as the MPE/RRE analogue of what is known as McLeod's theorem, which concerns the vector epsilon algorithm. We will discuss McLeod's theorem briefly in the last section.
DETERMINANT REPRESENTATIONS
Despite the rather complex procedures by which s_{n,k} for MPE and RRE have been defined, elegant analytical expressions for them can be given. The determinant representations in the next theorem were given by the author in Ref. 17.
Theorem 4.1. With u_m and w_m as in Equation (8), define the scalars u_{i,j} by

u_{i,j} = (u_{n+i}, u_{n+j}) = (Δx_{n+i}, Δx_{n+j})   for MPE,
u_{i,j} = (w_{n+i}, u_{n+j}) = (Δ^2 x_{n+i}, Δx_{n+j})   for RRE.   (26)

1. The γ_j for MPE and RRE are the solution to the linear systems

(u_{n+i}, U_k^{(n)} γ) = 0,  i = 0, 1, ..., k−1;  Σ_{j=0}^{k} γ_j = 1,   for MPE,
(w_{n+i}, U_k^{(n)} γ) = 0,  i = 0, 1, ..., k−1;  Σ_{j=0}^{k} γ_j = 1,   for RRE.   (27)

2. We have the following determinant representations for s_{n,k} from MPE and RRE:

s_{n,k} = D(x_n, x_{n+1}, ..., x_{n+k}) / D(1, 1, ..., 1),   (28)

where D(v_0, v_1, ..., v_k) is a (k + 1) × (k + 1) determinant defined as in

D(v_0, v_1, ..., v_k) = det | v_0        v_1        ···  v_k       |
                            | u_{0,0}    u_{0,1}    ···  u_{0,k}   |
                            | u_{1,0}    u_{1,1}    ···  u_{1,k}   |
                            | ⋮          ⋮               ⋮         |
                            | u_{k−1,0}  u_{k−1,1}  ···  u_{k−1,k} |.   (29)
Note that the determinant D(x_n, x_{n+1}, ..., x_{n+k}) in Equation (28) is vector-valued and is defined via its expansion with respect to its first row. Thus, if C_i is the cofactor of v_i in D(v_0, v_1, ..., v_k) defined in Equation (29), then

s_{n,k} = Σ_{i=0}^{k} C_i x_{n+i} / Σ_{i=0}^{k} C_i.
The determinant representations above have been very useful in analyzing MPE and RRE. They have also been used in Ref. 18 in deriving recursion relations among the various s_{n,k}.
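Equations (28) and (29) can be checked numerically against the least-squares definition of MPE. The sketch below uses hypothetical data; both routes must agree in exact arithmetic, since the normal equations of the least-squares problem are exactly the system in Equation (27), whose Cramer's-rule solution produces the cofactors C_i.

```python
import numpy as np

# Determinant representation of MPE versus its least-squares definition.
rng = np.random.default_rng(1)
N, n, k = 5, 0, 2
T = 0.3 * rng.random((N, N))
b = rng.random(N)
x = [np.zeros(N)]
for _ in range(n + k + 1):
    x.append(T @ x[-1] + b)
u = [x[m + 1] - x[m] for m in range(len(x) - 1)]

# Lower rows of D: the k x (k+1) array of inner products u_{i,j} (MPE case)
B = np.array([[u[n + i] @ u[n + j] for j in range(k + 1)]
              for i in range(k)])
C = [(-1) ** i * np.linalg.det(np.delete(B, i, axis=1))
     for i in range(k + 1)]                        # first-row cofactors
s_det = sum(Ci * x[n + i] for i, Ci in enumerate(C)) / sum(C)

# Same quantity via the definition of MPE (least squares for the c_i)
c, *_ = np.linalg.lstsq(np.column_stack(u[n:n + k]), -u[n + k], rcond=None)
c = np.append(c, 1.0)
s_mpe = sum(g * x[n + i] for i, g in enumerate(c / c.sum()))
print(np.linalg.norm(s_det - s_mpe))               # ~ machine precision
```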
MPE and RRE in Infinite Dimensional Inner Product Spaces
Before we continue our discussion of MPE and RRE, it is important to note that these two methods, and other vector extrapolation methods as well, can be applied to vector sequences in infinite dimensional inner product spaces, such as Hilbert spaces, once the inner product in C^N has been replaced with that pertinent to the space in which we are working. This is clear from Equations (26)–(29). All the developments that we discuss in the next sections, including the convergence theory, remain unchanged in these spaces, subject to some minor modifications. This has been noted in Refs. 14 and 17.
MPE AND RRE AND KRYLOV SUBSPACE METHODS
When applied to the solution of the linear system (I − T)x = b in conjunction with the linear fixed-point iteration scheme x_{m+1} = Tx_m + b, m = 0, 1, ..., MPE and RRE become mathematically (but not algorithmically) equivalent to two well-known Krylov subspace methods, namely, the method of Arnoldi (8) and GMRES (9). The details follow:
Given a matrix S and a vector u, let us define the Krylov subspace K_m(S, u) via

K_m(S, u) = span{S^i u : i = 0, 1, ..., m−1}.  (30)
In the method of Arnoldi and in GMRES, one starts with the vector x_0 and seeks an approximation s_k to s, the solution to Ax = b, of the form s_k = x_0 + z, with z in the Krylov subspace K_k(A, r_0), where r_0 = r(x_0) = b − Ax_0, such that

(r(s_k), a) = 0 for all a ∈ K_k(A, r_0)  (for the Arnoldi method)  (31)

and

‖r(s_k)‖ = min_{z ∈ K_k(A, r_0)} ‖r(x_0 + z)‖  (for GMRES).  (32)

Here, (x, y) = x*y and ‖x‖ = √(x*x), as before. The following theorem was proved in Ref. 19:
Theorem 5.1. Consider the linear system Ax = b, where A = I − T, and let x_0 be an arbitrary initial vector. Let the vectors x_m in MPE and RRE be generated via x_{m+1} = Tx_m + b, and let s_{n,k}^{MPE} and s_{n,k}^{RRE} be the resulting approximations to the solution s of Ax = b. Similarly, let s_k^{Arnoldi} and s_k^{GMRES} be the approximations to s from the method of Arnoldi and from GMRES, respectively, starting with the same initial vector x_0, as explained above. Then there hold

s_{0,k}^{MPE} = s_k^{Arnoldi}  and  s_{0,k}^{RRE} = s_k^{GMRES}.
Recall also that the Arnoldi method reduces (mathematically) to the method of conjugate gradients when the matrix A is hermitian positive definite.
EFFICIENT IMPLEMENTATION OF MPE AND RRE
General Considerations
We now turn to the numerical implementation of MPE and RRE in floating-point (that is, finite-precision) arithmetic. Because we are performing computations with vectors and matrices, this topic must be handled with care. In this section, we provide the details of some efficient algorithms for implementing MPE and RRE. In addition to being stable numerically, these algorithms are also timewise economical and have minimal core memory requirements. They were originally developed in Ref. 20, where a FORTRAN 77 code is also included. By following the instructions of this section, the user can program these algorithms without having to invoke commercial software packages.
Our algorithms are based on the definitions of MPE and RRE that we presented in the section entitled "Derivation of MPE and RRE." One feature common to both MPE and RRE is that the matrices U_k^{(n)} enter their definitions. In addition, by the fact that Σ_{i=0}^{k} γ_i = 1, the vector s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i}, for both MPE and RRE, can be expressed as in

s_{n,k} = x_n + Σ_{i=0}^{k−1} ξ_i u_{n+i} = x_n + U_{k−1}^{(n)} ξ,  ξ = [ξ_0, ξ_1, ..., ξ_{k−1}]^T,  (33)

where

ξ_0 = 1 − γ_0;  ξ_j = ξ_{j−1} − γ_j,  j = 1, ..., k−1.  (34)
Taking this point into account together with the definitions of MPE and RRE, we realize first that the vectors x_{n+1}, ..., x_{n+k+1} are not needed directly for computing s_{n,k}. Actually, they can be discarded as soon as the u_{n+i} are computed, which saves storage. This is done by overwriting the vector x_{n+i} with u_{n+i} = Δx_{n+i} as soon as the latter is computed, for i = 1, ..., k. We save only x_n.
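The identity behind Equation (33) is easy to verify numerically. The sketch below uses arbitrary hypothetical data for the vectors and the γ_i, since the identity is purely algebraic and holds for any weights summing to 1.

```python
import numpy as np

# Check of Equation (33): with xi_0 = 1 - gamma_0 and
# xi_j = xi_{j-1} - gamma_j, the weighted average sum_i gamma_i x_{n+i}
# equals x_n + sum_{i<k} xi_i u_{n+i}.
rng = np.random.default_rng(2)
k, N = 3, 4
x = rng.random((k + 1, N))                       # stands for x_n, ..., x_{n+k}
u = np.diff(x, axis=0)                           # u_{n+i} = x_{n+i+1} - x_{n+i}
gamma = rng.random(k + 1)
gamma /= gamma.sum()                             # sum gamma_i = 1

xi = np.empty(k)
xi[0] = 1 - gamma[0]
for j in range(1, k):
    xi[j] = xi[j - 1] - gamma[j]

lhs = gamma @ x                                  # sum_i gamma_i x_{n+i}
rhs = x[0] + xi @ u                              # x_n + sum_i xi_i u_{n+i}
print(np.linalg.norm(lhs - rhs))                 # ~ 0
```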
In the sequel, we assume that the matrix U_k^{(n)} has full rank, that is, that rank(U_k^{(n)}) = k + 1.
One very important aspect of our algorithms is their accurate solution of the relevant least-squares problems, via QR factorizations of the matrices U_k^{(n)}, as in

U_k^{(n)} = Q_k R_k.  (35)

Here, Q_k is an N × (k + 1) unitary matrix satisfying Q_k* Q_k = I_{(k+1)×(k+1)}. Thus, Q_k has the columnwise partition

Q_k = [q_0 | q_1 | ··· | q_k],  (36)

such that the columns q_i form an orthonormal set of N-dimensional vectors, that is, q_i* q_j = δ_{ij}. The matrix R_k is a (k + 1) × (k + 1) upper triangular matrix with positive diagonal elements. Thus,

      | r_00  r_01  ···  r_0k |
R_k = |       r_11  ···  r_1k |
      |             ⋱    ⋮    |
      |                  r_kk | ,   r_ii > 0,  i = 0, 1, ..., k.  (37)
This factorization can be carried out easily and accurately using the modified Gram–Schmidt orthogonalization process (MGS). See, for example, Refs. 20 and 21. For completeness, we give here the steps of MGS as applied to the matrix U_k^{(n)}:
1. Compute r_00 = ||u_n|| and q_0 = u_n / r_00.
2. For i = 1, ..., k do
     Set u_i^{(0)} = u_{n+i}
     For j = 0, ..., i-1 do
       r_ji = (q_j, u_i^{(j)}) and u_i^{(j+1)} = u_i^{(j)} - r_ji q_j
     end
     Compute r_ii = ||u_i^{(i)}|| and q_i = u_i^{(i)} / r_ii
   end
Again, (x, y) = x*y and ||x|| = √(x*x). In addition, q_0 overwrites u_n in Step 1, while in Step 2, u_i^{(0)} overwrites u_{n+i}, u_i^{(j+1)} overwrites u_i^{(j)}, and q_i overwrites u_i^{(i)}, as soon as they are computed. Thus, for each i = 1, ..., k, the vectors u_{n+i}, u_i^{(j)}, and q_i all occupy the same storage location, both at the time of computation and after the computation has been performed.
Summarizing, we have that at all stages of the computation of Q_k and R_k, we are keeping only k + 2 vectors in memory. Ultimately, we save the vector x_n and the matrix Q_k, that is, the vectors q_0, q_1, ..., q_k.
Note that Q_k is obtained from Q_{k-1} by appending to the latter the vector q_k as the (k+1)st column. Similarly, R_k is obtained from R_{k-1} by appending to the latter the zero vector as the (k+1)st row and then the vector [r_0k, r_1k, ..., r_kk]^T as the (k+1)st column.
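As a concrete illustration, the MGS steps above can be sketched as follows; this is a minimal real-valued implementation for illustration (use np.vdot for complex data), not the FORTRAN 77 code of Ref. 20. The columns of the work array are overwritten in place, mirroring the storage scheme just described.

```python
import numpy as np

def mgs_qr(U):
    """Modified Gram-Schmidt QR factorization of an N x (k+1) matrix U,
    following Steps 1-2 of the text (real case for brevity).
    The work array's columns are overwritten by q_0, ..., q_k in place."""
    Q = np.array(U, dtype=float)          # copy of U; columns become the q_i
    kp1 = Q.shape[1]
    R = np.zeros((kp1, kp1))
    R[0, 0] = np.linalg.norm(Q[:, 0])
    Q[:, 0] /= R[0, 0]                    # q_0 = u_n / r_00
    for i in range(1, kp1):
        for j in range(i):                # subtract projections one at a time
            R[j, i] = Q[:, j] @ Q[:, i]   # r_ji = (q_j, u_i^{(j)})
            Q[:, i] -= R[j, i] * Q[:, j]  # u_i^{(j+1)} = u_i^{(j)} - r_ji q_j
        R[i, i] = np.linalg.norm(Q[:, i])
        Q[:, i] /= R[i, i]                # q_i = u_i^{(i)} / r_ii
    return Q, R
```

Because the projections are subtracted one at a time from the running vector u_i^{(j)}, rather than all at once from u_{n+i}, MGS loses far less orthogonality in finite precision than classical Gram–Schmidt.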
Algorithms for MPE and RRE
With the QR factorization of U_k^{(n)} (hence of U_{k-1}^{(n)}) available, we can give algorithms for MPE and RRE within a unified framework. The algorithms are almost identical; they differ only in the way they compute the γ_i. Here are their steps:
1. Input: k and n and the vectors x_n, x_{n+1}, ..., x_{n+k+1}.

2. Compute the vectors u_{n+i} = Δx_{n+i}, i = 0, 1, ..., k, and form the N × (k+1) matrix

   U_k^{(n)} = [u_n | u_{n+1} | ... | u_{n+k}],

   and form its QR factorization, namely, U_k^{(n)} = Q_k R_k, with Q_k and R_k as in Equations (36) and (37), using MGS.

3. Determine the γ_i:

   For MPE: With ρ_k = [r_0k, r_1k, ..., r_{k-1,k}]^T, solve the k × k upper triangular system

     R_{k-1} c' = -ρ_k,  c' = [c_0, c_1, ..., c_{k-1}]^T.

   Set c_k = 1, and γ_i = c_i / Σ_{i=0}^k c_i, i = 0, 1, ..., k, provided Σ_{i=0}^k c_i ≠ 0.

   For RRE: With e = [1, 1, ..., 1]^T, solve the (k+1) × (k+1) linear system

     R_k* R_k d = e,  d = [d_0, d_1, ..., d_k]^T.

   (This amounts to solving two triangular systems: first R_k* a = e for a = [a_0, a_1, ..., a_k]^T, and following that, R_k d = a for d.) Next, compute

     λ = 1 / Σ_{i=0}^k d_i.

   (Note that λ is always real and positive because U_k^{(n)} has full rank.) Next, set γ = λd, that is, γ_i = λd_i, i = 0, 1, ..., k.

4. With the γ_i determined, compute ξ = [ξ_0, ξ_1, ..., ξ_{k-1}]^T via

     ξ_0 = 1 - γ_0;  ξ_j = ξ_{j-1} - γ_j,  j = 1, ..., k-1.

5. Compute

     η = [η_0, η_1, ..., η_{k-1}]^T = R_{k-1} ξ.

   Then compute

     s_{n,k} = x_n + Q_{k-1} η = x_n + Σ_{i=0}^{k-1} η_i q_i.
Note that the linear systems that we solve in Step 3 are very small in size compared with the dimension N of the vectors x_m, hence their cost is negligible.
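To make the five steps concrete, here is a sketch of the unified algorithm in Python with NumPy. This is a hypothetical translation for illustration, not the code of Ref. 20, and it calls np.linalg.qr (Householder-based) where the text uses MGS; since only the factorization itself is needed, the remaining steps are unchanged.

```python
import numpy as np

def extrapolate(X, method="MPE"):
    """Compute s_{n,k} from the k+2 vectors X = [x_n | x_{n+1} | ... | x_{n+k+1}]
    (given as columns), following Steps 1-5 of the text."""
    U = np.diff(X, axis=1)                       # Step 2: U_k^{(n)} = [u_n | ... | u_{n+k}]
    Q, R = np.linalg.qr(U)                       # QR factorization of U_k^{(n)}
    k = U.shape[1] - 1
    if method == "MPE":                          # Step 3, MPE branch
        rho = R[:k, k]                           # rho_k = [r_0k, ..., r_{k-1,k}]^T
        c = np.append(np.linalg.solve(R[:k, :k], -rho), 1.0)  # R_{k-1} c' = -rho_k; c_k = 1
        gamma = c / c.sum()                      # assumes sum of the c_i is nonzero
    else:                                        # Step 3, RRE branch: R_k* R_k d = e
        a = np.linalg.solve(R.conj().T, np.ones(k + 1))
        d = np.linalg.solve(R, a)
        gamma = d / d.sum()                      # gamma = lambda*d, lambda = 1/sum(d_i)
    xi = 1.0 - np.cumsum(gamma[:k])              # Step 4: xi_j = 1 - (gamma_0 + ... + gamma_j)
    eta = R[:k, :k] @ xi                         # Step 5: eta = R_{k-1} xi
    return X[:, 0] + Q[:, :k] @ eta              # s_{n,k} = x_n + Q_{k-1} eta
```

On a sequence with exactly p decaying terms as in Equation (38), taking k = p reproduces s exactly, which is the finite termination property discussed later in Theorem 7.1.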
Error Estimation via Algorithms
We now turn to the assessment of the quality of s_{n,k} as an approximation to s, in the case where the sequence {x_m} is obtained from the fixed-point iteration x_{m+1} = f(x_m) for the system x = f(x). All we know how to compute is f(x) for a given x, and we must use this knowledge for our assessment.
One common way of assessing the quality of a given vector x as an approximation to s is by looking at the l2-norm of the residual vector r(x) = f(x) - x, because r(s) = 0. Thus, the quality of s_{n,k} will be assessed by computing the l2-norm of r(s_{n,k}). What is of interest is the size of ||r(s_{n,k})|| / ||r(x_0)||, namely, the order of magnitude of ||r(s_{n,k})|| relative to that of ||r(x_0)||. This number shows by how many orders of magnitude the l2-norm of the initial residual r(x_0) (or of the initial error e_0 = x_0 - s) has been reduced in the course of extrapolation.

We now show how this can be done by using only some of the quantities already produced by the algorithms given above, at no additional cost.
First, note that, by Equation (2),

r(x_0) = f(x_0) - x_0 = x_1 - x_0 = u_0;  hence ||r(x_0)|| = ||u_0||.
For linear sequences: When the iteration vectors x_m are generated linearly as in Equation (7), we have r(x) = (Tx + b) - x. Thus,

r(x_m) = x_{m+1} - x_m = u_m.

Invoking also Σ_{i=0}^k γ_i = 1, where the γ_i are as obtained when applying MPE or RRE, we therefore have

r(s_{n,k}) = Σ_{i=0}^k γ_i u_{n+i} = U_k^{(n)} γ;  hence ||r(s_{n,k})|| = ||U_k^{(n)} γ||.

Note also that, in this case,

s_{n,k} - s = -(I - T)^{-1} r(s_{n,k});  hence ||s_{n,k} - s|| ≤ ||(I - T)^{-1}|| ||r(s_{n,k})||,

and this provides additional justification for looking at ||r(s_{n,k})||.

For nonlinear sequences: When the x_m are generated nonlinearly as in Equation (2), with s_{n,k} close to s, we have

r(s_{n,k}) = f(s_{n,k}) - s_{n,k} ≈ U_k^{(n)} γ;  hence ||r(s_{n,k})|| ≈ ||U_k^{(n)} γ||.
Whether the vectors x_m are generated linearly or nonlinearly, ||U_k^{(n)} γ|| can be determined in terms of already computed quantities and at no cost, without actually having to compute r(s_{n,k}). Indeed, we have

||U_k^{(n)} γ|| = r_kk |γ_k|  for MPE;  ||U_k^{(n)} γ|| = √λ  for RRE.

Here, r_kk is the last diagonal element of the matrix R_k in Equation (37), and λ is the parameter computed in Step 3 of the algorithms in the preceding subsection. Clearly, this computation of ||U_k^{(n)} γ|| can be made a part of the algorithms. For details, see Ref. 20.
ERROR ANALYSIS FOR MPE AND RRE
The error and convergence analysis of MPE and RRE has been carried out within the framework of linearly generated vector sequences {x_m} in Refs. 17 and 22–25. This analysis sheds considerable light on what vector extrapolation methods can achieve when they are applied to vector sequences obtained from fixed-point iterative methods on nonlinear systems as well as linear ones.
Convergence Acceleration Properties of s_{n,k} as n → ∞

We first consider the convergence properties of MPE and RRE as n → ∞ while k is held fixed. The following result, which shows that these methods are bona fide convergence acceleration methods, was given in Ref. 17. The technique of the proof is based on that developed by Sidi et al. (14).
Theorem 7.1. Assume that the vectors x_m satisfy

x_m = s + Σ_{i=1}^p v_i λ_i^m,  p ≤ N,  (38)

where the vectors v_i are linearly independent, and the scalars λ_i are distinct and nonzero with λ_i ≠ 1, ordered as in

|λ_1| ≥ |λ_2| ≥ ... ≥ |λ_p|.  (39)

If also

|λ_k| > |λ_{k+1}|,  (40)

then the following hold:

1. s_{n,k}^{MPE} exists for all large n, and s_{n,k}^{RRE} exists unconditionally for all n.

2. Both for MPE and RRE, there holds

   s_{n,k} - s = [C_{n,k} + o(1)] λ_{k+1}^n = O(λ_{k+1}^n) as n → ∞,  (41)

   where the vectors C_{n,k} are bounded in n, that is, sup_n ||C_{n,k}|| < ∞, and are the same for MPE and RRE. Thus, the errors s_{n,k} - s for MPE and RRE are asymptotically equal as n → ∞ as well.

3. In addition, the limits lim_{n→∞} γ_i^{(n,k)}, where γ_i^{(n,k)} stands for γ_i, all exist, and

   lim_{n→∞} Σ_{i=0}^k γ_i^{(n,k)} λ^i = Π_{i=1}^k (λ - λ_i)/(1 - λ_i).  (42)

4. Finally, for k = p, we have the so-called ''finite termination property,'' that is,

   s_{n,p} = s  and  Σ_{i=0}^p γ_i^{(n,p)} λ^i = Π_{i=1}^p (λ - λ_i)/(1 - λ_i).  (43)
It can be shown that the finite termination property in part (4) of Theorem 7.1 is nothing but the exactness result we stated in Theorem 3.6.
Note that when the x_m are generated by the iterative procedure in Equation (7), where I - T is nonsingular and T is diagonalizable, they are precisely as described in this theorem, with the λ_i being some or all of the distinct nonzero eigenvalues of T and the v_i being eigenvectors corresponding to these eigenvalues. If (μ_i, z_i), i = 1, ..., N, are the eigenpairs of T, that is, T z_i = μ_i z_i for every i, then x_0 - s = Σ_{i=1}^N a_i z_i for some scalars a_i, since the z_i are linearly independent and span C^N. Therefore, by Equations (9) and (13),

x_m - s = T^m (x_0 - s) = Σ_{i=1}^N a_i μ_i^m z_i.
Then, for m ≥ 1, zero eigenvalues of T have zero contribution to this summation. In addition, the contributions of equal nonzero eigenvalues can be lumped together to form one single term that is again an eigenvector of T. Renaming the distinct eigenvalues that contribute to this sum λ_i, we obtain precisely the structure of x_m in the theorem. In addition, note that p is the degree of the minimal polynomial of T with respect to x_m - s for all m ≥ 1 in this case. This is also borne out by Equation (43).
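A small numerical experiment (hypothetical, with a definition-based MPE solved by least squares) illustrates the error behavior of Theorem 7.1: for a diagonal T with eigenvalues 0.9, 0.8, 0.6, 0.4, 0.2 and k = 2, the error in s_{n,2} should decay roughly like λ_3^n = 0.6^n as n grows.

```python
import numpy as np

T = np.diag([0.9, 0.8, 0.6, 0.4, 0.2])        # distinct eigenvalues lambda_i
b = np.array([1.0, -1.0, 2.0, 0.5, -0.5])
s = np.linalg.solve(np.eye(5) - T, b)          # solution of (I - T)x = b

def mpe(X):
    """Definition-based MPE from columns X = [x_n, ..., x_{n+k+1}]."""
    U = np.diff(X, axis=1)
    k = U.shape[1] - 1
    c, *_ = np.linalg.lstsq(U[:, :k], -U[:, k], rcond=None)  # least squares for c'
    c = np.append(c, 1.0)                      # c_k = 1
    return X[:, : k + 1] @ (c / c.sum())       # s_{n,k} = sum of gamma_i x_{n+i}

xs, x = [np.zeros(5)], np.zeros(5)
for _ in range(16):                            # x_{m+1} = T x_m + b
    x = T @ x + b
    xs.append(x)
X = np.column_stack(xs)
err = lambda n: np.linalg.norm(mpe(X[:, n:n + 4]) - s)   # k = 2 uses 4 vectors
# err(n + 4) / err(n) should be roughly 0.6**4, i.e., about 0.13, for moderate n
```

The observed ratio approaches 0.6^4 as n grows, while the unextrapolated errors shrink only like 0.9^n, in line with parts (1) and (2) of the theorem.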
It is also clear from Theorem 7.1 that the sequence {x_m} there need not result from a nonsingular linear system (I - T)x = b; it can result from some other type of problem as well.
In case the condition |λ_k| > |λ_{k+1}| in Theorem 7.1 does not hold [then |λ_k| = |λ_{k+1}| necessarily, since |λ_k| ≥ |λ_{k+1}| by the ordering of the λ_i in Equation (39)], we need to modify the statement of this theorem. This has been done in Ref. 23, and we state a special case of it next.
Theorem 7.2. Assume that the vectors x_m have been generated by the iterative procedure x_{m+1} = T x_m + b, T being diagonalizable and I - T being nonsingular. Denote the solution of (I - T)x = b by s. Let us order the nonzero distinct eigenvalues λ_i of T that contribute to x_m - s via Equation (38) as in Equation (39). If |λ_k| = |λ_{k+1}|, then s_{n,k} exist and are unique for all n and k, and there holds

s_{n,k} - s = O(λ_k^n) = O(λ_{k+1}^n) as n → ∞,  (44)

(1) for RRE unconditionally and (2) for MPE provided the hermitian part⁸ of the matrix αA, where α is some appropriate nonzero scalar in C and A = I - T, is positive definite.
One conclusion that can be drawn from Theorem 7.2 is that if

|λ_r| > |λ_{r+1}| = ... = |λ_{r+ν}| > |λ_{r+ν+1}| ≥ ...,

then there holds

s_{n,k} - s = O(λ_{r+1}^n) as n → ∞, for r + 1 ≤ k ≤ r + ν - 1.⁹

Thus, if

|λ_1| = ... = |λ_ν| > |λ_{ν+1}| ≥ ...,

then

s_{n,k} - s = O(λ_1^n) as n → ∞, for 1 ≤ k ≤ ν - 1,

while

s_{n,ν} - s = O(λ_{ν+1}^n) as n → ∞.
We would like to emphasize here that the results of Theorems 7.1 and 7.2 are not the same. For example, in Theorem 7.1, we have that s_{n,k}^{MPE} - s and s_{n,k}^{RRE} - s are asymptotically equal as n → ∞, whereas no such result holds in Theorem 7.2.
Concerning the results of Theorems 7.1 and 7.2, we can offer the following interpretation: Starting from Equation (38), we first observe that the contribution of the small λ_i is diminished relative to that of the large ones by letting n grow. The extrapolation procedure then takes care of the contribution from the k largest λ_i. What remains is the contribution from the intermediate λ_i, starting with λ_{k+1}.
In connection with Theorems 7.1 and 7.2, we note that we have not assumed that lim_{m→∞} x_m exists. Clearly, lim_{m→∞} x_m exists and is equal to s provided |λ_1| < 1. In case lim_{m→∞} x_m does not exist, we have that |λ_1| ≥ 1 necessarily; in this case, s is the antilimit of {x_m}. Now, if we use the x_m, as they are, to approximate s, we have the error

e_m = x_m - s = O(λ_1^m) as m → ∞.

Thus, provided |λ_{k+1}| < 1, we have that lim_{n→∞} s_{n,k} = s, whether or not lim_{m→∞} x_m exists. Furthermore, when the sequence {x_m} converges (to s), s_{n,k} converges (to s) faster, provided |λ_{k+1}| < |λ_1|, which clearly holds in Theorem 7.1. This means that MPE and RRE accelerate the convergence of the sequence {x_m}.
For more general versions of Theorems 7.1 and 7.2 involving nondiagonalizable matrices T, see Refs. 22 and 23.
Error Analysis for s_{n,k} with Fixed n and k
So far, we have reviewed the convergence properties of the approximations s_{n,k} for n → ∞ while k is held fixed. Of course, n cannot be increased indefinitely in practice. Similarly, k cannot be increased too much because of storage limitations [recall from the algorithms of the sixth section that we need to store the N × (k+1) matrix Q_k]. For these reasons, MPE and RRE are applied in the so-called cycling mode (to be discussed later, in the eighth section), where n and k can be chosen moderately large but are both kept fixed. Bounds on the error in s_{n,k}^{MPE} and s_{n,k}^{RRE}, as functions of n and k, have been presented by Sidi and Shapira (24, 25).

The next result from Refs. 24 and 25 provides an error bound on s_{n,k}^{RRE} for the case in which both n and k are kept fixed. This theorem actually gives the justification for cycling with s_{n,k} with even moderate positive n rather than n = 0. An essentially identical result for s_{n,k}^{MPE}, under the condition that the matrix I - T has a positive definite hermitian part (cf. Theorem 7.2), is given in Ref. 24.
⁸The hermitian part of a matrix K ∈ C^{s×s} is the matrix K_H = (1/2)(K + K*).

⁹Recall that s_{n,r} - s = O(λ_{r+1}^n) as n → ∞, and that s_{n,r+ν} - s = O(λ_{r+ν+1}^n) as n → ∞, already as described in Theorem 7.1.

Theorem 7.3. Let s be the solution to the linear system (I - T)x = b, where I - T is nonsingular. Assume that T is diagonalizable, that is,

T = RΛR^{-1},  Λ = diag(μ_1, μ_2, ..., μ_N),  (45)

and let the vector sequence {x_m} be generated via x_{m+1} = T x_m + b, m = 0, 1, .... Define the residual vector associated with the vector x by r(x) = b - (I - T)x. Let also P_k = {p(z) : p ∈ Π_k, p(1) = 1}, Π_k being the set of polynomials of degree at most k. Then

||r(s_{n,k}^{RRE})|| = min_{p∈P_k} ||T^n p(T) r(x_0)||
  ≤ (min_{p∈P_k} ||T^n p(T)||) ||r(x_0)||
  ≤ ||T^n p(T)|| ||r(x_0)||  for every p ∈ P_k
  ≤ κ(R) Γ_{n,k}^{spect(T)} ||r(x_0)||,  (46)

where ||x|| = √(x*x) as before, κ(R) = ||R|| ||R^{-1}|| stands for the condition number of R, spect(T) stands for the spectrum of T, and, for any set D in the complex plane,

Γ_{n,k}^D = min_{p∈P_k} max_{z∈D} |z^n p(z)|.  (47)
Remark. The paper (25) actually treats GMRES(k), the restarted GMRES, with n initial iterations of Richardson type, which is denoted there GMRES(n,k). Because RRE and GMRES are equivalent in the sense described in Theorem 5.1 in the fifth section, the results of Ref. 25 can be expressed as in Theorem 7.3 without any changes.
Now, in case spect(T) ⊆ D, we have that Γ_{n,k}^{spect(T)} ≤ Γ_{n,k}^D. This fact has been used in Ref. 24 to derive upper and lower bounds on Γ_{n,k}^{spect(T)} that can be expressed analytically in terms of suitable sets of orthogonal polynomials. For some special types of spectra, both bounds can be expressed analytically in terms of the Jacobi polynomials P_k^{(α,β)}(z) and can easily be computed numerically. In addition, these bounds are tight. (See the tables in Refs. 24 and 25.) Below, by [u, v] we mean the straight line segment between the (complex) numbers u and v in the z-plane. The Jacobi polynomials P_k^{(α,β)}(z) are normalized such that P_k^{(α,β)}(1) equals the binomial coefficient (k+α choose k); see Abramowitz and Stegun (26), for example.
1. If spect(T) is contained in [0, b] for some possibly complex b (1 is not in [0, b]), then

   Γ_{n,k}^{spect(T)} ≤ |b|^n / |P_k^{(0,2n)}(2/b - 1)| = |b|^{n+k} / |Σ_{j=0}^k (k choose j)(2n+k choose j)(1 - b)^j|.

   This bound is a decreasing function of both n and k when b ≠ 0 is real and b < 1. In particular, its decrease as a function of k becomes faster with increasing n. Clearly, this bound decreases very quickly in case the spectrum of T is real nonpositive, that is, b < 0.
2. If spect(T) is contained in [-b, b] for some possibly complex b (1 is not in [-b, b]), then

   Γ_{n,2ν}^{spect(T)} ≤ |b|^n / |P_ν^{(0,n)}(2/b² - 1)|,
   Γ_{n,2ν+1}^{spect(T)} ≤ |b|^{n+1} / |P_ν^{(0,n+1)}(2/b² - 1)|.

   Both for even and odd values of k, these upper bounds can be unified to read

   Γ_{n,k}^{spect(T)} ≤ |b|^{n+k} / |Σ_{j=0}^ν (ν choose j)(n+μ choose j)(1 - b²)^j|,

   where

   ν = ⌊k/2⌋,  μ = ⌊(k+1)/2⌋.

   This unified bound is a decreasing function of both n and k when b is real and |b| < 1. The same holds in case b is purely imaginary and |b| < 1, which happens when T is a skew-hermitian matrix, for example. In this case, the upper bound on Γ_{n,k}^{spect(T)} tends to zero monotonically as a function of k, its rate of decrease becoming larger with increasing n, at the best possible rate, because now

   Γ_{n,k}^{spect(T)} ≤ |b|^{n+k} / Σ_{j=0}^ν (ν choose j)(n+μ choose j)(1 + |b|²)^j.
When comparing two approximations s_{n,k}^{RRE} and s_{n',k}^{RRE}, what seems to determine which one is more accurate is the corresponding upper bounds for Γ_{n,k}^{spect(T)} and Γ_{n',k}^{spect(T)}. The examples we have given here suggest that s_{n,k}^{RRE} is the better one if n > n'. In particular, s_{n,k}^{RRE} with n > 0 is likely to be better than s_{0,k}^{RRE} [equivalently, GMRES(n,k) with n > 0 is better than GMRES(k)]. We can make use of this observation in applying RRE (and MPE as well) in the cycling mode; this will be discussed in the eighth section.
Numerical Stability of MPE and RRE
An important issue related to the application of extrapolation methods in finite-precision arithmetic is that of numerical stability. This issue concerns the propagation of input errors into the output. In our case, this means the propagation of errors in the x_m into s_{n,k}. A practical measure of this is the quantity

Γ_{n,k} = Σ_{i=0}^k |γ_i^{(n,k)}|.  (48)

Clearly, Γ_{n,k} ≥ 1 because Σ_{i=0}^k γ_i^{(n,k)} = 1.
This assertion can be justified heuristically as follows: Let us assume that η_m is the error committed in computing x_m (which can be the result of roundoff, for example). Thus, x̄_m = x_m + η_m is the computed x_m. Assuming that the γ_i^{(n,k)} are not affected by these errors, s̄_{n,k}, the ''computed'' s_{n,k}, is given by

s̄_{n,k} = Σ_{i=0}^k γ_i^{(n,k)} x̄_{n+i} = Σ_{i=0}^k γ_i^{(n,k)} (x_{n+i} + η_{n+i}) = s_{n,k} + Σ_{i=0}^k γ_i^{(n,k)} η_{n+i}.  (49)

Therefore,

||s̄_{n,k} - s_{n,k}|| = ||Σ_{i=0}^k γ_i^{(n,k)} η_{n+i}|| ≤ Σ_{i=0}^k |γ_i^{(n,k)}| ||η_{n+i}|| ≤ Γ_{n,k} (max_{0≤i≤k} ||η_{n+i}||).  (50)

Let us now assume that the relative errors in the computed x_m are bounded by some fixed positive scalar β; that is, ||η_m|| ≤ β||x_m||. Let us assume, in addition, that {x_m} converges, so that max_{0≤i≤k} ||x_{n+i}|| is bounded. Then

||s̄_{n,k} - s_{n,k}|| ≤ βΓ_{n,k} (max_{0≤i≤k} ||x_{n+i}||).  (51)

Now, β and max_{0≤i≤k} ||x_{n+i}|| are independent of the extrapolation method, hence are fixed. Because {x_m} converges, max_{0≤i≤k} ||x_{n+i}|| is of the order of ||s|| and ||s_{n,k}||. Therefore, for all practical purposes, βΓ_{n,k} is the relative error in s̄_{n,k}. Suppose that the x_m have been computed to machine accuracy. This implies that β is the rounding unit of the floating-point arithmetic being used; let β be of order 10^{-p} for some p > 0. If now Γ_{n,k} is of order 10^q for some q > 0, then s̄_{n,k} agrees with s_{n,k} up to p - q decimal digits. In other words, q decimal digits have been lost in the course of computation.
In view of this discussion, we say that MPE or RRE is stable if

sup_n Γ_{n,k} < ∞.  (52)
Let us now assume that the sequence {x_m} is as in Theorem 7.1. By part (3) of this theorem, the γ_i^{(n,k)} satisfy Equation (42), and hence lim_{n→∞} Γ_{n,k} exists. In addition, by Theorem 1.4.3 in Sidi (27), Γ_{n,k} also satisfies

lim_{n→∞} Γ_{n,k} ≤ Π_{i=1}^k (1 + |λ_i|)/|1 - λ_i| < ∞.  (53)
Thus both MPE and RRE are stable under the conditions of Theorem 7.1. Equality holds in (53) when λ_1, ..., λ_k are all positive or all negative; in addition, when they are all negative, lim_{n→∞} Γ_{n,k} = 1, which is the most ideal case. When one or more of these λ_i are very close to 1 in the complex plane, Γ_{n,k} becomes very large, and this limits the accuracy that can be reached by s_{n,k}. This suggests that we should aim at iterative methods for which the largest eigenvalues are not too close to 1. In the absence of such an iterative scheme, we can apply MPE and RRE to the sequence {x_{rm}} with some integer r > 1. By Equation (38),

x_{rm} = s + Σ_{i=1}^p v_i (λ_i^r)^m.  (54)

Note that, if |λ_i| < 1 but λ_i is close to 1, then |1 - λ_i^r| > |1 - λ_i| for r > 1. Thus MPE and RRE are likely to produce better accuracy when applied to {x_{rm}} with some integer r > 1 than when applied to {x_m}. For example, the approximations s_{n,k} produced by applying MPE and RRE to the sequence {x_{rm}}, under the conditions of Theorem 7.1 and provided the λ_i^r are distinct, are such that
s_{n,k} - s = O(λ_{k+1}^{rn}) as n → ∞  (55)

and

lim_{n→∞} Σ_{i=0}^k γ_i^{(n,k)} σ^i = Π_{i=1}^k (σ - σ_i)/(1 - σ_i),  lim_{n→∞} Γ_{n,k} ≤ Π_{i=1}^k (1 + |σ_i|)/|1 - σ_i|,  σ_i = λ_i^r, i = 1, 2, ....  (56)
From Equations (55) and (56), it is clear that better stability and accuracy can be obtained using the same storage (the same k) with r > 1. We can also maintain a given level of accuracy by increasing r and decreasing k simultaneously.
MPE AND RRE VIA CYCLING
As we have already discussed, one important issue that we confront when applying vector extrapolation methods to large-scale problems is that of storage. In the algorithms presented in the sixth section, for example, we need to store the matrix Q_k, which forms the bulk of the stored quantities. Using the same storage, we can obtain approximations of better accuracy by applying the extrapolation methods to the sequence {x_{rm}} instead of {x_m}, as was pointed out at the end of the preceding section.
In this section, we discuss the strategy known as cycling or restarting, which we alluded to above, when MPE and RRE are applied to the solution of the system x = f(x) in conjunction with the iterative scheme x_{m+1} = f(x_m), m = 0, 1, .... In this strategy, n and k are held fixed. Here are the steps of cycling:
C0. Choose integers n ≥ 0, k ≥ 1, and r ≥ 1, and an initial vector x_0.

C1. Compute the vectors x_1, x_2, ..., x_{r(n+k+1)} [via x_{m+1} = f(x_m)], and save

    y_n, y_{n+1}, ..., y_{n+k}, y_{n+k+1};  y_i = x_{ri}, i = 0, 1, ....

C2. Apply MPE or RRE to the vectors y_n, y_{n+1}, ..., y_{n+k+1} precisely as in the sixth section, with end result s_{n,k}.

C3. If s_{n,k} satisfies the accuracy test, stop. Otherwise, set x_0 = s_{n,k}, and go to Step C1.
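Steps C0–C3 can be sketched as follows. This is an illustration with a hypothetical contraction f chosen only for the example, and the inner extrapolation is a definition-based MPE solved by least squares rather than the QR algorithm of the sixth section.

```python
import numpy as np

def f(x):
    """Hypothetical fixed-point map for illustration: x = 0.5 cos(x) componentwise."""
    return 0.5 * np.cos(x)

def mpe(X):
    """Definition-based MPE from columns X = [y_n, ..., y_{n+k+1}]."""
    U = np.diff(X, axis=1)
    k = U.shape[1] - 1
    c, *_ = np.linalg.lstsq(U[:, :k], -U[:, k], rcond=None)
    c = np.append(c, 1.0)
    return X[:, : k + 1] @ (c / c.sum())

def cycle(x0, n=1, k=3, r=1, tol=1e-10, max_cycles=20):
    x0 = np.asarray(x0, dtype=float)
    for _ in range(max_cycles):
        if np.linalg.norm(f(x0) - x0) < tol:  # Step C3: accuracy test
            return x0
        xs, x = [x0], x0
        for _ in range(r * (n + k + 1)):      # Step C1: generate the iterates
            x = f(x)
            xs.append(x)
        Y = np.column_stack(xs[::r])          # keep y_i = x_{ri}
        x0 = mpe(Y[:, n : n + k + 2])         # Step C2: extrapolate, restart from s_{n,k}
    return x0
```

Each pass through the loop is one cycle; restarting from s_{n,k} rather than from the last iterate is what gives cycling its fast linear convergence in practice.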
We will call each application of Steps C1–C3 a cycle. We will also denote the s_{n,k} that is computed in the ith cycle s_{n,k}^{(i)}.
The analysis of MPE and RRE, as these are being applied to nonlinear systems x = f(x) in the cycling mode with n = 0 and r = 1, has been considered in Skelboe (28). As has been shown in Ref. 1, the results of this work are only heuristic, however. What these say is that, when k in the ith cycle is chosen to be the degree k_i of the minimal polynomial of F(s_{0,k_{i-1}}^{(i-1)}) [recall that F(x) is the Jacobian matrix of f(x) at x] with respect to s_{0,k_{i-1}}^{(i-1)} - s, the sequence {s_{n,k_i}^{(i)}} converges to s quadratically. However, we must recall that, since k_i can be as large as N and is not known exactly, this usage of cycling is not useful practically for the large-scale problems we are interested in solving. In other words, trying to achieve quadratic convergence from MPE and RRE via cycling is not realistic. We may achieve linear but fast convergence by choosing even moderate values for n and k with cycling.
APPLICATIONS
Solution of Large-Scale Nonlinear Systems
As we mentioned earlier, one of the main uses of vector extrapolation methods has been in the solution of large-scale systems of nonlinear equations c(x) = 0 with solution s, especially those that arise from the discretization of nonlinear continuum problems, via fixed-point iterative procedures of the form x_{m+1} = f(x_m). In this connection, we would like to emphasize that vector extrapolation methods, in the cycling mode, are effective when the iteration procedure x_{m+1} = f(x_m) chosen is convergent and reasonably good, in the sense that only a few of the eigenvalues of F(s), the Jacobian matrix of f(x) at x = s, are near 1 in the complex plane. Vector extrapolation methods do not perform that well when most eigenvalues of F(s) cluster near 1 in the complex plane. Thus, the iterative method should be chosen such that the eigenvalues of F(s) are scattered in the interior of the unit disk, with very few of them near 1.
As discussed in the preceding section, the simplest way of achieving this goal is by applying the extrapolation methods to the subsequence {x_{rm}} with some integer r ≥ 2. Another simple way is by defining the iteration vectors via

x_{m+1} = x_m + ω[f(x_m) - x_m] = (1 - ω)x_m + ω f(x_m),  m = 0, 1, ...,

for some suitable scalar ω. See, for example, Ref. 20 (Section 7).
In applying MPE and RRE to finite-difference solutions of elliptic PDEs, such as the two-dimensional Poisson equation ∂²u/∂x² + ∂²u/∂y² = f(x, y), a good iterative method would be the ADI (alternating direction implicit) method. For this method, see Stoer and Bulirsch (29), Morton and Mayers (30), or Ames (31), for example.
Vector-Valued Rational Approximations
Given the vector-valued power series Σ_{i=0}^∞ u_i z^i, where z is a complex variable and the u_i are constant vectors in C^N, representing a vector-valued function u(z) about z = 0, we can use vector extrapolation methods to approximate u(z) via a vector-valued rational approximation obtained from the power series coefficients u_i.
For this, we apply the extrapolation methods to the sequence {x_m(z)}, where

x_m(z) = Σ_{i=0}^m u_i z^i,  m = 0, 1, ....
The approximations obtained this way are rational functions, whose numerators are vector-valued polynomials and whose denominators are scalar-valued polynomials.

This topic is dealt with in detail in the paper Sidi (10). We give a brief informal description of the subject, in the context of MPE, next.
When MPE is used for this purpose, we obtain the approximations s_{n,k}(z) (now called SMPE approximations) that are given as in

s_{n,k}(z) = D(z^k x_n(z), z^{k-1} x_{n+1}(z), ..., z^0 x_{n+k}(z)) / D(z^k, z^{k-1}, ..., z^0),  (57)

where D(v_0, v_1, ..., v_k) is a (k+1) × (k+1) determinant defined as in

D(v_0, v_1, ..., v_k) = det [ v_0        v_1        ...  v_k
                              u_{0,0}    u_{0,1}    ...  u_{0,k}
                              u_{1,0}    u_{1,1}    ...  u_{1,k}
                              ...        ...             ...
                              u_{k-1,0}  u_{k-1,1}  ...  u_{k-1,k} ],

u_{i,j} = (u_{n+i}, u_{n+j}).  (58)
Thus, s_{n,k}(z) is also of the form

s_{n,k}(z) = Σ_{i=0}^k c_i z^{k-i} x_{n+i}(z) / Σ_{i=0}^k c_i z^{k-i},  (59)

with appropriate scalar constants c_i. Note that the numerator polynomial has degree at most n + k, whereas the denominator polynomial has degree k when the cofactor of v_0 in D(v_0, v_1, ..., v_k) is nonzero.
The sequence {s_{n,k}(z)}_{n=0}^∞ (with fixed k) has a very nice convergence property, which we discuss briefly. Suppose that u(z) is analytic in an open disc D_r = {z ∈ C : |z| < r} and meromorphic¹⁰ in a larger open disc D_R = {z ∈ C : |z| < R}. This implies that the series Σ_{i=0}^∞ u_i z^i converges to u(z) only for |z| < r, and diverges for |z| ≥ r. Under some additional condition that has to do with the Laurent expansions of u(z) about its poles, the rational approximations s_{n,k}(z) obtained from the series Σ_{i=0}^∞ u_i z^i have the property that, if k is equal to the number of the poles in D_R, then {s_{n,k}(z)}_{n=0}^∞ converges to u(z) uniformly in every compact subset of D_R excluding the poles of u(z). In addition, the poles of s_{n,k}(z) tend to the poles of u(z) as n → ∞.

¹⁰A function u(z) is said to be meromorphic in a domain D of the complex z-plane if its only singularities in D are poles.
As an example, let us consider the function u(z) = (I - zA)^{-1} b, where A ∈ C^{N×N} is a constant matrix and b ∈ C^N is a constant vector. This function has the series representation

u(z) = Σ_{i=0}^∞ u_i z^i,  u_i = A^i b,  i = 0, 1, ....

This series converges for |z| < 1/ρ(A), where, we recall, ρ(A) is the spectral radius of A. In this case, u(z) is a vector-valued meromorphic function (actually, a rational function) with poles equal to the reciprocals of the eigenvalues of A, the residues being related to corresponding eigenvectors and principal vectors. The poles of s_{n,k}(z) turn out to be the so-called Ritz values that result from the method of Arnoldi for eigenvalues. For details on precise convergence properties and rates of convergence, see Sidi (11).
One of the uses of these rational approximations has been the summation of a perturbation series resulting from ODEs describing some nonlinear oscillations. See, for example, Wu and Zhong (32). In that paper, the space we are working in is infinite dimensional, and the definitions of MPE and RRE remain unchanged, as mentioned in the fourth section on determinant representations of MPE and RRE.
Computation of Eigenvectors of Known Eigenvalues
An immediate application of vector extrapolation in general, and of MPE and RRE in particular, is to the computation of the eigenvector that corresponds to the largest eigenvalue of a large and sparse matrix when this eigenvalue is nondefective, that is, it has only corresponding eigenvectors but no corresponding principal vectors. Let A ∈ C^{N×N} have eigenvalues μ_1, ..., μ_N, which we order as in

|μ_1| > |μ_2| ≥ |μ_3| ≥ ... ≥ |μ_N|.
Assume that μ_1 is known and a corresponding (unknown) eigenvector is required. To solve this problem, we choose an initial vector x_0 and perform the power iterations

x_{m+1} = μ_1^{-1} A x_m,  m = 0, 1, ....
It is easy to show [see, for example, Sidi (12, 33)] that, provided x_0 has a contribution from the eigenvalue μ_1, the vectors x_m are precisely as in Theorem 7.1, with s there being an eigenvector of A that corresponds to μ_1 and the λ_i there being some or all of the ratios μ_{i+1}/μ_1. (Clearly, |λ_i| < 1 for all i, hence {x_m} converges to the required eigenvector.) Thus, MPE and RRE can be applied to the sequence of power iterations {x_m} to accelerate its convergence. Indeed, both methods apply exactly as described in Theorem 7.1. In addition, they can be applied in the cycling mode, as described in the previous section. Vector extrapolation methods are attractive for this application because their computational cost is low: the sparseness of the matrix involved enables us to compute the sequence {x_m} via x_{m+1} = μ_1^{-1} A x_m inexpensively.

One recent application of this has been to the computation of the PageRank of the Google matrix. The PageRank is the (unique) eigenvector of the Google matrix that corresponds to the largest eigenvalue μ_1 = 1, which is also known to be simple. It is positive and is normalized such that the sum of its components is 1. [See Brin and Page (34).] This problem was treated by Kamvar et al. (35) using a method the authors denoted quadratic extrapolation. This method has been generalized in (12, 33); quadratic extrapolation turns out to be the k = 2 case of this generalization. In addition, this generalization turns out to be closely related to MPE, in the sense that if s_{n,k} and s̃_{n,k} are the vectors obtained by applying MPE and the new method, respectively, to the sequence of power iterations obtained from the Google matrix A, then s̃_{n,k} = A s_{n,k}. In this application too, the computation of the sequence {x_m} via x_{m+1} = A x_m can be achieved inexpensively because of the sparseness of the Google matrix. The use of vector extrapolation methods in general, and MPE and RRE in particular, in the cycling mode with arbitrary n, k, and r for computing the PageRank was first suggested in Ref. 12. This approach is illustrated with numerical examples in Ref. 33. The results obtained in Ref. 33 show clearly that this approach to PageRank computation is very effective.
BRIEF SUMMARY OF MMPE
We now present a very brief discussion of the modified minimal polynomial extrapolation (MMPE), which is based on Sidi et al. (14). Without further explanation, we will be using the notation of the third section throughout.
The MMPE approximation s_{n,k} from {x_m}, namely,

s_{n,k}^{MMPE} = Σ_{i=0}^k γ_i x_{n+i},

is essentially defined by requiring that the γ_j be the solution to the linear system

(q_i, U_k^{(n)} γ) = 0,  i = 0, 1, ..., k-1;  Σ_{j=0}^k γ_j = 1,  (60)

where q_0, q_1, ..., q_{k-1} are k fixed linearly independent vectors in C^N.
A determinant representation for s_{n,k}^{MMPE} has been given in Ref. 14, this representation being exactly of the form shown in Equations (28) and (29) of Theorem 4.1, with u_{i,j} = (q_i, u_{n+j}) in Equation (26).
The convergence theory of the sequences {s_{n,k}^{MMPE}}_{n=0}^∞ under the conditions of Theorem 7.1 has been given in Ref. 14. This theory shows that, for all large n, s_{n,k}^{MMPE} exists uniquely [part (1) of Theorem 7.1], and that the rest of the results of Theorem 7.1 [parts (2)–(4)] are valid for MMPE without any changes, provided

det [ (q_0, v_1)      (q_0, v_2)      ...  (q_0, v_k)
      (q_1, v_1)      (q_1, v_2)      ...  (q_1, v_k)
      ...             ...                  ...
      (q_{k-1}, v_1)  (q_{k-1}, v_2)  ...  (q_{k-1}, v_k) ] ≠ 0.  (61)
A more complete theory, under conditions on {x_m} more general than those in Theorem 7.1, has been presented in Sidi and Bridger (22).
As we have observed, the vectors q_i that enter the construction of MMPE are fixed. It is possible to replace these vectors by vectors q_i^{(n)} that depend on n. We then get what the author has called generalized MPE (GMPE) in Ref. 19. Obviously, GMPE contains MPE, RRE, and MMPE, with q_i^{(n)} = u_{n+i}, q_i^{(n)} = w_{n+i}, and q_i^{(n)} = q_i, respectively. Then,

s_{n,k}^{GMPE} = Σ_{i=0}^k γ_i x_{n+i},

with the γ_j defined as the solution to

(q_i^{(n)}, U_k^{(n)} γ) = 0,  i = 0, 1, ..., k-1;  Σ_{j=0}^k γ_j = 1.

Of course, a determinant representation for s_{n,k}^{GMPE} exists and is given as in Equations (28) and (29) of Theorem 4.1, with u_{i,j} = (q_i^{(n)}, u_{n+j}) in Equation (26).
BRIEF SUMMARY OF EPSILON ALGORITHMS
Scalar Epsilon Algorithm
Let the scalar sequence {x_m} be such that

$$x_m \sim s + \sum_{i=1}^{\infty} a_i \lambda_i^m \quad \text{as } m \to \infty, \qquad (62)$$

where the a_i are nonzero, the λ_i are distinct, nonzero, and λ_i ≠ 1, such that

$$|\lambda_1| \geq |\lambda_2| \geq \cdots; \qquad \lim_{i \to \infty} \lambda_i = 0. \qquad (63)$$
Here s is lim_{m→∞} x_m when the latter exists; it is the antilimit of {x_m} otherwise. Of course, the sequence converges provided |λ_i| < 1. To accelerate the convergence of the sequence {x_m}, Shanks (36) developed a nonlinear transformation, by which we define

$$e_k(x_n) = \frac{E(x_n, x_{n+1}, \ldots, x_{n+k})}{E(1, 1, \ldots, 1)}, \qquad (64)$$
where

$$E(v_0, v_1, \ldots, v_k) =
\begin{vmatrix}
v_0 & v_1 & \cdots & v_k \\
u_n & u_{n+1} & \cdots & u_{n+k} \\
u_{n+1} & u_{n+2} & \cdots & u_{n+k+1} \\
\vdots & \vdots & & \vdots \\
u_{n+k-1} & u_{n+k} & \cdots & u_{n+2k-1}
\end{vmatrix}, \qquad u_i = x_{i+1} - x_i. \qquad (65)$$
Here, the e_k(x_n) are approximations to s. In case Equation (62) assumes the form

$$x_m = s + \sum_{i=1}^{k} a_i \lambda_i^m, \quad m = 0, 1, \ldots,$$

for some finite k, we have e_k(x_n) = s for all n. This is the exactness result for the Shanks transformation.
This transformation was proposed earlier by Schmidt (37) for constructing the solution of linear systems from fixed-point iterations; compare Equation (62) with Equation (38) satisfied by fixed-point iteration vectors generated linearly.
A very elegant and fast algorithm for constructing the e_k(x_n) was developed by Wynn (5), and it reads

$$\epsilon^{(n)}_{-1} = 0, \quad \epsilon^{(n)}_{0} = x_n, \quad n = 0, 1, \ldots,$$
$$\epsilon^{(n)}_{k+1} = \epsilon^{(n+1)}_{k-1} + \frac{1}{\epsilon^{(n+1)}_{k} - \epsilon^{(n)}_{k}}, \quad n, k = 0, 1, \ldots. \qquad (66)$$

Here, ε^{(n)}_{2k} = e_k(x_n) and ε^{(n)}_{2k+1} = 1/e_k(Δx_n). Of course, Δx_i = x_{i+1} − x_i.

A different algorithm, denoted the FS/qd algorithm, that is as efficient as the epsilon algorithm, was recently proposed by Sidi in chapter 21 of Ref. 27. In this algorithm, we first compute two sets of quantities, {e^{(n)}_k} and {q^{(n)}_k}, via the qd algorithm:

$$e^{(n)}_{0} = 0, \quad q^{(n)}_{1} = \frac{u_{n+1}}{u_n}, \quad n = 0, 1, \ldots, \qquad (u_i = x_{i+1} - x_i)$$
$$e^{(n)}_{k} = q^{(n+1)}_{k} - q^{(n)}_{k} + e^{(n+1)}_{k-1}, \qquad q^{(n)}_{k+1} = \frac{e^{(n+1)}_{k}}{e^{(n)}_{k}}\, q^{(n+1)}_{k},$$
$$n = 0, 1, \ldots; \quad k = 1, 2, \ldots. \qquad (67)$$
Once the e^{(n)}_k and q^{(n)}_k have been computed, we compute the e_k(x_n) as follows:

$$M^{(n)}_{0} = \frac{x_n}{u_n}, \quad N^{(n)}_{0} = \frac{1}{u_n}, \quad n = 0, 1, \ldots, \qquad (u_i = x_{i+1} - x_i)$$
$$M^{(n)}_{k} = \frac{M^{(n+1)}_{k-1} - M^{(n)}_{k-1}}{e^{(n)}_{k}}, \qquad N^{(n)}_{k} = \frac{N^{(n+1)}_{k-1} - N^{(n)}_{k-1}}{e^{(n)}_{k}},$$
$$e_k(x_n) = \frac{M^{(n)}_{k}}{N^{(n)}_{k}}, \quad n, k = 0, 1, \ldots. \qquad (68)$$
The convergence analysis of the sequence {e_k(x_n)}_{n=0}^∞ with fixed k, for sequences {x_m} whose members behave as in Equations (62) and (63), with the λ_i all positive or all negative, was given by Wynn (38). Wynn's results were later extended by Sidi (39) to the case of general complex λ_i, and also to sequences {x_m} whose members behave in a more general fashion than that given in Equations (62) and (63); see also chapter 16 in Ref. 27. One of the results of Ref. 39, which is a generalization of Wynn's results, reads as follows:
Theorem 11.1. Provided |λ_k| > |λ_{k+1}| in Equations (62) and (63), there holds

$$e_k(x_n) - s = O(\lambda_{k+1}^n) \quad \text{as } n \to \infty.$$
For a detailed study of the Shanks transformation, seechapter 16 in Ref. 27.
Theorem 11.1 shows that the Shanks transformation is an effective convergence acceleration method when applied to scalar sequences {x_m} that behave as in Equations (62) and (63). Now, the vector sequences {x_m} in Theorems 7.1 and 7.2 behave exactly as in Equations (62) and (63) componentwise. This suggests that the Shanks transformation can be applied to these sequences componentwise to accelerate their convergence. When applied in the context of vector sequences in this way, the Shanks transformation is known as the scalar epsilon algorithm (SEA). Note that SEA does not couple the different components of the vectors x_m.
Vector Epsilon Algorithm
Another interesting vector method to accelerate the convergence of {x_m} that is based on the epsilon algorithm itself was proposed by Wynn (6). This method is known as the vector epsilon algorithm (VEA), and is defined as follows:

$$\epsilon^{(n)}_{-1} = 0, \quad \epsilon^{(n)}_{0} = x_n, \quad n = 0, 1, \ldots,$$
$$\epsilon^{(n)}_{k+1} = \epsilon^{(n+1)}_{k-1} + \left(\epsilon^{(n+1)}_{k} - \epsilon^{(n)}_{k}\right)^{-1}, \quad n, k = 0, 1, \ldots. \qquad (69)$$

Here, x^{−1} is the Samelson inverse of the vector x, given by x^{−1} = x̄/(x̄ · x) = x̄/‖x‖², where x̄ is simply the complex conjugate of x. With VEA, we take the ε^{(n)}_{2k} to be approximations to the limit or antilimit of {x_m}. Clearly, unlike SEA, VEA couples all the components of the vectors x_m.
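A compact implementation of Equation (69) might look as follows (Python with NumPy). The linear test iteration is invented illustration data, chosen so that the exactness result for linearly generated sequences applies and ε_8^{(0)} reproduces the fixed point:

```python
import numpy as np

def samelson_inverse(x):
    """Samelson inverse: x^{-1} = conj(x) / (conj(x) . x) = conj(x) / ||x||^2."""
    xc = np.conj(x)
    return xc / np.dot(xc, x)

def vea(X):
    """Vector epsilon algorithm, Equation (69); returns eps_{2k}^{(0)}."""
    prev = [np.zeros_like(X[0]) for _ in X]        # eps_{-1}^{(n)} = 0
    curr = [np.array(v, dtype=float) for v in X]   # eps_0^{(n)}  = x_n
    best, k = curr[0], 0
    while len(curr) > 1:
        nxt = [prev[n + 1] + samelson_inverse(curr[n + 1] - curr[n])
               for n in range(len(curr) - 1)]
        prev, curr = curr, nxt
        k += 1
        if k % 2 == 0:                             # even columns approximate s
            best = curr[0]
    return best

# Linear iteration x_{m+1} = T x_m + b (illustrative data); fixed point s.
T = np.array([[0.7, 0.1, 0.0, 0.0],
              [0.0, 0.5, 0.1, 0.0],
              [0.0, 0.0, -0.4, 0.1],
              [0.0, 0.0, 0.0, 0.2]])
b = np.ones(4)
s = np.linalg.solve(np.eye(4) - T, b)

x = np.zeros(4)
X = [x]
for _ in range(8):                 # 2k + 1 = 9 vectors give eps_8^{(0)}, k = 4
    x = T @ x + b
    X.append(x)

s_vea = vea(X)                     # equals s up to roundoff (McLeod's theorem)
```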
The properties of VEA have been studied extensively via the so-called vector-valued Padé approximants: Graves-Morris (40, 41) has shown that, under the conditions of Theorem 3.6, the VEA approximations from the sequence {x_m} satisfy ε^{(m)}_{2k} = s, m = n, n+1, …. This result is known as McLeod's theorem. It was conjectured by Wynn (42), and was first proved by McLeod (43) with the c_i restricted to be real numbers; see Ref. 1, page 211.
A determinantal formula for the ε^{(n)}_{2k} has been given by Graves-Morris and Jenkins (44). This formula shows that ε^{(n)}_{2k} is a "weighted average" of the x_m exactly as described in the Introduction, even though this is not clear from Equation (69).
The convergence of the sequence {ε^{(n)}_{2k}}_{n=0}^∞ has been analyzed by Graves-Morris and Saff (45, 46); under the conditions of Theorem 7.1, the convergence rate of ε^{(n)}_{2k} to s as n → ∞ is practically the same as those of s_{n,k} for MPE and RRE.
For application of VEA in the cycling mode to the solution of nonlinear systems of equations, see Brezinski (47, 48), Brezinski and Rieu (49), Gekeler (50), and Skelboe (28). All these works consider the application of VEA such that order 2 convergence would be obtained, but their results seem to be heuristic. Actually, the remarks we make concerning MPE and RRE at the end of the eighth section are valid for VEA as well. See also the remarks in Section 6 of Ref. 1.
Topological Epsilon Algorithm
A third method to accelerate the convergence of vector sequences {x_m} that is also based on the epsilon algorithm is the topological epsilon algorithm (TEA) of Brezinski (7). This method is defined as follows:

$$\epsilon^{(n)}_{-1} = 0, \quad \epsilon^{(n)}_{0} = x_n, \quad n = 0, 1, \ldots,$$
$$\epsilon^{(n)}_{2k+1} = \epsilon^{(n+1)}_{2k-1} + \frac{w}{w \cdot \left(\epsilon^{(n+1)}_{2k} - \epsilon^{(n)}_{2k}\right)}, \quad n, k = 0, 1, \ldots,$$
$$\epsilon^{(n)}_{2k+2} = \epsilon^{(n+1)}_{2k} + \frac{\epsilon^{(n+1)}_{2k} - \epsilon^{(n)}_{2k}}{\left(\epsilon^{(n+1)}_{2k+1} - \epsilon^{(n)}_{2k+1}\right) \cdot \left(\epsilon^{(n+1)}_{2k} - \epsilon^{(n)}_{2k}\right)}, \quad n, k = 0, 1, \ldots. \qquad (70)$$
Here, w is an arbitrary fixed vector in C^N and x · y = x^T y. It is clear that, just as VEA, TEA couples all the components of the vectors x_m.
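The two coupled recursions in Equation (70) can be sketched as follows (Python with NumPy). The choice w = (1, …, 1)^T and the linear test iteration are illustrative assumptions; for a linearly generated sequence with k matching the dimension of the error space, TEA reproduces the fixed point exactly in exact arithmetic:

```python
import numpy as np

def tea(X, w):
    """Topological epsilon algorithm, Equation (70); returns eps_{2k}^{(0)}."""
    even = [np.array(v, dtype=float) for v in X]       # eps_0 column
    odd_prev = [np.zeros_like(even[0]) for _ in X]     # eps_{-1} column
    while len(even) > 2:
        # Odd step: eps_{2k+1}^{(n)} = eps_{2k-1}^{(n+1)} + w / (w . diff)
        odd = [odd_prev[n + 1] + w / np.dot(w, even[n + 1] - even[n])
               for n in range(len(even) - 1)]
        # Even step: eps_{2k+2}^{(n)} = eps_{2k}^{(n+1)} + diff / (odd_diff . diff)
        even = [even[n + 1] + (even[n + 1] - even[n])
                / np.dot(odd[n + 1] - odd[n], even[n + 1] - even[n])
                for n in range(len(odd) - 1)]
        odd_prev = odd
    return even[0]

# Linear iteration x_{m+1} = T x_m + b (illustrative data); fixed point s.
T = np.array([[0.7, 0.1, 0.0, 0.0],
              [0.0, 0.5, 0.1, 0.0],
              [0.0, 0.0, -0.4, 0.1],
              [0.0, 0.0, 0.0, 0.2]])
b = np.ones(4)
s = np.linalg.solve(np.eye(4) - T, b)

x = np.zeros(4)
X = [x]
for _ in range(8):                  # 2k + 1 = 9 vectors give eps_8^{(0)}, k = 4
    x = T @ x + b
    X.append(x)

w = np.ones(4)                      # satisfies w . v_i != 0 for this example
s_tea = tea(X, w)                   # reproduces s up to roundoff
```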
Again, the ε^{(n)}_{2k} are the required approximants, and a determinantal representation for them exists. This representation is of the form given in Equations (28) and (29) of Theorem 4.1, with u_{i,j} = w · u_{n+i+j} in Equation (29). Thus, ε^{(n)}_{2k} is also a "weighted average" of the x_m exactly as described in the Introduction, even though this is not clear from Equation (70).
As shown by Brezinski, when applied to a sequence {x_m} that arises from the fixed-point iterative procedure x_{m+1} = T x_m + b, the vector ε^{(0)}_{2k} produced by TEA is the same as the vector generated at the kth stage of the method of Lanczos applied to the linear system (I − T)x = b, starting with the vector x_0.
The convergence of the sequences {ε^{(n)}_{2k}}_{n=0}^∞ has been analyzed by Sidi, Ford, and Smith (14) and Sidi and Bridger (22). It is proved in Ref. 14 that, under the conditions of Theorem 7.1, the rate of convergence of ε^{(n)}_{2k} to s as n → ∞ is the same as those of s_{n,k} from MPE and RRE. Specifically, for all large n, ε^{(n)}_{2k} exists uniquely [part (1) of Theorem 7.1], and the rest of the results of Theorem 7.1 [parts (2)–(4)] are valid for TEA without any changes, provided

$$w \cdot v_i \neq 0, \quad i = 1, \ldots, k. \qquad (71)$$
Remarks on Application of Epsilon Algorithms
Clearly, all three epsilon algorithms can be applied in the cycling mode, just as MPE and RRE. It has been observed in various applications that, when used in the cycling mode, SEA is rather unstable; hence it is not recommended for vector sequences. For this reason, we comment only on VEA and TEA in the sequel.
From the various convergence analyses mentioned above, and from Theorem 3.6 and McLeod's theorem, we conclude that the sequences {ε^{(n)}_{2k}}_{n=0}^∞ from VEA and TEA have convergence rates that are the same as those of {s_{n,k}}_{n=0}^∞ from MPE and RRE. It is clear from the recursion relations in Equations (69) and (70) that to determine {ε^{(n)}_{2k}} we need the vectors x_n, x_{n+1}, …, x_{n+2k}, a total of 2k + 1 vectors. To compute s_{n,k}, we need x_n, x_{n+1}, …, x_{n+k+1}, a total of k + 2 vectors. Thus, VEA and TEA require practically twice as many vectors x_m to achieve the same kind of accuracy as MPE and RRE. This can already be of concern when the cost of computing each x_m is high. Also, we need to
store 2k + 1 vectors to obtain ε^{(n)}_{2k}, whereas k + 2 vectors need to be stored to obtain s_{n,k}. When N, the dimension of the vectors x_m, is very large, storage problems can occur with VEA more than with MPE and RRE. In addition, with the vectors x_m already available, the cost of computing ε^{(n)}_{2k} is higher than the cost of computing s_{n,k} when the latter are computed via the algorithms of the sixth section; for the exact operation counts, see Sidi (20). For all these reasons, it seems reasonable to prefer MPE and RRE over the epsilon algorithms in general.
BIBLIOGRAPHY
1. D. A. Smith, W. F. Ford, and A. Sidi, Extrapolation methods for vector sequences, SIAM Rev., 29: 199–233, 1987.

2. S. Cabay and L. W. Jackson, A polynomial extrapolation method for finding limits and antilimits of vector sequences, SIAM J. Numer. Anal., 13: 734–752, 1976.

3. R. P. Eddy, Extrapolating to the limit of a vector sequence, in P. C. C. Wang (ed.), Information Linkage Between Applied Mathematics and Industry, New York: Academic Press, 1979.

4. M. Mesina, Convergence acceleration for the iterative solution of the equations X = AX + f, Comput. Methods Appl. Mech. Engrg., 10: 165–173, 1977.

5. P. Wynn, On a device for computing the e_m(S_n) transformation, Mathematical Tables and Other Aids to Computation, 10: 91–96, 1956.

6. P. Wynn, Acceleration techniques for iterated vector and matrix problems, Math. Comp., 16: 301–322, 1962.

7. C. Brezinski, Généralisations de la transformation de Shanks, de la table de Padé, et de l'ε-algorithme, Calcolo, 11: 317–360, 1975.

8. W. E. Arnoldi, The principle of minimized iterations in the solution of the matrix eigenvalue problem, Quart. Appl. Math., 9: 17–29, 1951.

9. Y. Saad and M. H. Schultz, GMRES: A generalized minimal residual method for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput., 7: 856–869, 1986.

10. A. Sidi, Rational approximations from power series of vector-valued meromorphic functions, J. Approx. Theory, 77: 89–111, 1994.

11. A. Sidi, Application of vector-valued rational approximation to the matrix eigenvalue problem and connections with Krylov subspace methods, SIAM J. Matrix Anal. Appl., 16: 1341–1369, 1995.

12. A. Sidi, Approximation of largest eigenpairs of matrices and applications to PageRank computation, Technical Report CS-2004-16, Computer Science Dept., Technion–Israel Institute of Technology, 2004.

13. B. P. Pugachev, Acceleration of the convergence of iterative processes and a method of solving systems of nonlinear equations, U.S.S.R. Comput. Math. Math. Phys., 17: 199–207, 1978.

14. A. Sidi, W. F. Ford, and D. A. Smith, Acceleration of convergence of vector sequences, SIAM J. Numer. Anal., 23: 178–196, 1986.

15. A. S. Householder, The Theory of Matrices in Numerical Analysis. New York: Blaisdell, 1964.

16. S. Kaniel and J. Stein, Least-square acceleration of iterative methods for linear equations, J. Optimization Theory Appl., 14: 431–437, 1974.

17. A. Sidi, Convergence and stability properties of minimal polynomial and reduced rank extrapolation algorithms, SIAM J. Numer. Anal., 23: 197–209, 1986.

18. W. F. Ford and A. Sidi, Recursive algorithms for vector extrapolation methods, Appl. Numer. Math., 4: 477–489, 1988.

19. A. Sidi, Extrapolation vs. projection methods for linear systems of equations, J. Comp. Appl. Math., 22: 71–88, 1988.

20. A. Sidi, Efficient implementation of minimal polynomial and reduced rank extrapolation methods, J. Comp. Appl. Math., 36: 305–337, 1991.

21. G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed., Baltimore, MD: Johns Hopkins University Press, 1996.

22. A. Sidi and J. Bridger, Convergence and stability analyses for some vector extrapolation methods in the presence of defective iteration matrices, J. Comp. Appl. Math., 22: 35–61, 1988.

23. A. Sidi, Convergence of intermediate rows of minimal polynomial and reduced rank extrapolation tables, Numer. Algorithms, 6: 229–244, 1994.

24. A. Sidi and Y. Shapira, Upper bounds for convergence rates of vector extrapolation methods on linear systems with initial iterations, Technical Report 701, Technion–Israel Institute of Technology, 1991. http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19920024498-1992024498.pdf.

25. A. Sidi and Y. Shapira, Upper bounds for convergence rates of acceleration methods with initial iterations, Numer. Algorithms, 18: 113–132, 1998.

26. M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Washington, D.C.: US Government Printing Office, 1964.

27. A. Sidi, Practical Extrapolation Methods: Theory and Applications, Cambridge Monographs on Applied and Computational Mathematics, Cambridge, UK: Cambridge University Press, 2003.

28. S. Skelboe, Computation of the periodic steady-state response of nonlinear networks by extrapolation methods, IEEE Trans. Circuits and Systems, 27: 161–175, 1980.

29. J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, 3rd ed., New York: Springer-Verlag, 2002.

30. K. W. Morton and D. F. Mayers, Numerical Solution of Partial Differential Equations, Cambridge, UK: Cambridge University Press, 1994.

31. W. F. Ames, Numerical Methods for Partial Differential Equations, 2nd ed., New York: Academic Press, 1977.

32. Baisheng Wu and Huixiang Zhong, Application of vector-valued rational approximations to a class of non-linear oscillations, Internat. J. Non-Linear Mech., 38: 249–254, 2003.

33. A. Sidi, Vector extrapolation methods with applications to solution of large systems of equations and to PageRank computations, Computers Math. Appl., 56: 1–24, 2008.

34. S. Brin and L. Page, The anatomy of a large-scale hypertextual web search engine, Comput. Networks ISDN Syst., 30: 107–117, 1998.

35. S. D. Kamvar, T. H. Haveliwala, C. D. Manning, and G. H. Golub, Extrapolation methods for accelerating PageRank computations, Proc. of the Twelfth International World Wide Web Conference, 2003, pp. 261–270.

36. D. Shanks, Nonlinear transformations of divergent and slowly convergent sequences, J. Math. and Phys., 34: 1–42, 1955.
37. J. R. Schmidt, On the numerical solution of linear simultaneous equations by an iterative method, Phil. Mag., 7: 369–383, 1941.

38. P. Wynn, On the convergence and stability of the epsilon algorithm, SIAM J. Numer. Anal., 3: 91–122, 1966.

39. A. Sidi, Extension and completion of Wynn's theory on convergence of columns of the epsilon table, J. Approx. Theory, 86: 21–40, 1996.

40. P. R. Graves-Morris, Vector valued rational interpolants I, Numer. Math., 42: 331–348, 1983.

41. P. R. Graves-Morris, Extrapolation methods for vector sequences, Numer. Math., 61: 475–487, 1992.

42. P. Wynn, Continued fractions whose coefficients obey a non-commutative law of multiplication, Arch. Rat. Mech. Anal., 12: 273–312, 1963.

43. J. B. McLeod, A note on the ε-algorithm, Computing, 7: 17–24, 1971.

44. P. R. Graves-Morris and C. D. Jenkins, Vector valued rational interpolants III, Constr. Approx., 2: 263–289, 1986.

45. P. R. Graves-Morris and E. B. Saff, Row convergence theorems for generalised inverse vector-valued Padé approximants, J. Comp. Appl. Math., 23: 63–85, 1988.

46. P. R. Graves-Morris and E. B. Saff, An extension of a row convergence theorem for vector Padé approximants, J. Comp. Appl. Math., 34: 315–324, 1991.

47. C. Brezinski, Application de l'ε-algorithme à la résolution des systèmes non linéaires, C. R. Acad. Sci. Paris, 271(A): 1174–1177, 1970.

48. C. Brezinski, Sur un algorithme de résolution des systèmes non linéaires, C. R. Acad. Sci. Paris, 272(A): 145–148, 1971.

49. C. Brezinski and A. C. Rieu, The solution of systems of equations using the ε-algorithm, and an application to boundary-value problems, Math. Comp., 28: 731–741, 1974.

50. E. Gekeler, On the solution of systems of equations by the epsilon algorithm of Wynn, Math. Comp., 26: 427–436, 1972.
AVRAM SIDI
Technion—Israel Institute of Technology
Haifa, Israel