METHODS FOR ACCELERATION OF CONVERGENCE (EXTRAPOLATION) OF VECTOR SEQUENCES
INTRODUCTION
An important problem that arises in different areas of science and engineering is that of computing limits of sequences of vectors {x_m}, where x_m ∈ C^N, and the dimension N is very large in many applications. Such vector sequences arise, for example, in the numerical solution of very large systems of linear or nonlinear equations by fixed-point iterative methods, and lim_{m→∞} x_m are simply the required solutions to these systems. One common source of such systems is the finite-difference or finite-element discretization of continuum problems.

In most cases, however, the sequences {x_m} converge to their limits extremely slowly. That is, s = lim_{m→∞} x_m can be approximated with a prescribed level of accuracy by x_m only for very large m. Clearly, this way of approximating s via the x_m becomes very expensive computationally. One practical way of tackling this problem effectively is by applying to the sequence {x_m} a suitable convergence acceleration method
(or extrapolation method).

More specifically, let us consider the (linear or nonlinear) system of equations

ψ(x) = 0,  ψ : C^N → C^N,  (1)
whose solution we denote s. What is meant by x, s, and ψ(x) are, respectively,

x = [x^(1), ..., x^(N)]^T,  s = [s^(1), ..., s^(N)]^T,  x^(i), s^(i) scalars,

and

ψ(x) = [ψ_1(x), ..., ψ_N(x)]^T,  ψ_i(x) = ψ_i(x^(1), ..., x^(N)) scalar functions.^1
Then, starting with a suitable vector x_0, an initial approximation to s, the sequence {x_m} of approximations can be generated by some fixed-point iterative method as in

x_{m+1} = f(x_m),  m = 0, 1, ...,  f : C^N → C^N,  (2)

with

f(x) = [f_1(x), ..., f_N(x)]^T,  f_i(x) = f_i(x^(1), ..., x^(N)) scalar functions.
Here x − f(x) = 0 is a possibly "preconditioned" form of Equation (1); hence, it has the same solution s [that is, ψ(s) = 0 and also s = f(s)], and, in case of convergence, lim_{m→∞} x_m = s. One possible form of f(x) would be

f(x) = x + A(x)ψ(x),

where A(x) is an N × N matrix such that A(s) is nonsingular.
Now, the rate of convergence of the sequence {x_m} in Equation (2) to s is determined by ρ(F(s)), where F(x) is the Jacobian matrix^2 of the vector-valued function f(x) and ρ(K) denotes the spectral radius^3 of the matrix K. It is known that ρ(F(s)) < 1 must hold for convergence to take place and that, the closer ρ(F(s)) is to zero, the faster the convergence. The rate of convergence deteriorates as ρ(F(s)) becomes closer to 1, however.

^1 Throughout, we use lowercase boldface letters to denote vectors and vector-valued functions, and we use uppercase boldface letters to denote matrices.
^2 F(x) is an N × N matrix whose (i, j) entry is ∂f_i/∂x^(j) evaluated at x.
^3 ρ(K) = max_i |μ_i|, where the μ_i are the eigenvalues of K. Of course, K is a square matrix.

Wiley Encyclopedia of Computer Science and Engineering, edited by Benjamin Wah. Copyright © 2008 John Wiley & Sons, Inc.
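This dependence on the spectral radius can be seen in a small numerical sketch. The 2 × 2 matrix T and vector b below are hypothetical, chosen only so that ρ(T) = 0.6; successive error norms of the iteration x_{m+1} = Tx_m + b then shrink by a factor approaching ρ(T).

```python
import numpy as np

# Hypothetical 2x2 iteration x_{m+1} = T x_m + b with fixed point
# s = (I - T)^{-1} b.  Since e_m = T^m e_0, the error shrinks roughly
# like rho(T)^m, where rho(T) is the spectral radius.
T = np.array([[0.5, 0.2],
              [0.1, 0.4]])
b = np.array([1.0, 1.0])
s = np.linalg.solve(np.eye(2) - T, b)      # exact fixed point
rho = max(abs(np.linalg.eigvals(T)))       # spectral radius = 0.6 here

x = np.zeros(2)
errors = []
for m in range(30):
    errors.append(np.linalg.norm(x - s))
    x = T @ x + b

# The ratio of successive error norms approaches rho(T)
print(rho, errors[-1] / errors[-2])
```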
As an example, let us consider the cases in which Equation (1) and Equation (2) arise from finite-difference or finite-element discretizations of continuum problems. For s [the solution to Equation (1) and Equation (2)] to be a reasonable approximation to the solution of the continuum problem, the mesh size of the discretization should be small enough. However, a small mesh size means a large N. In addition, as the mesh size tends to zero, hence N → ∞, generally ρ(F(s)) tends to 1, as can be shown rigorously in some cases. All this means that, when the mesh size decreases, not only does the dimension of the problem increase, but the convergence of the fixed-point method in Equation (2) deteriorates as well. As mentioned above, this problem of slow convergence can be treated efficiently via vector extrapolation methods.
Before going on, we mention that, given a vector sequence {x_m} that converges, a vector extrapolation method computes, directly or indirectly, a "weighted average" of a certain number of the vectors x_m as an approximation to lim_{m→∞} x_m. This approximation is of the form

Σ_{i=0}^{k} γ_i^{(n,k)} x_{n+i},

where n and k are integers chosen by the user and the scalars γ_i^{(n,k)}, which depend on the x_m and can even be complex, satisfy

Σ_{i=0}^{k} γ_i^{(n,k)} = 1.^4
In general, the methods differ in the way they determine the γ_i^{(n,k)}. A nice feature of vector extrapolation methods that is also of practical importance is that they take only the vectors x_m as their input. The way these vectors are generated is irrelevant; they need no other input.
An additional useful property of vector extrapolation methods is that they can even transform a divergent sequence of vectors {x_m} into a convergent one, with the right "limit." For example, if s is the solution to x − f(x) = 0 and the sequence {x_m} obtained from Equation (2) diverges, vector extrapolation methods can produce sequences of approximations that converge to s. This can be proved rigorously when f(x) is linear; it seems to be true also when f(x) is nonlinear. In case of divergence, the solution s is called the antilimit of {x_m}.

A detailed review of vector extrapolation methods, which contains the developments up to the early 1980s, can be found in the work of Smith et al. (1). This work discusses (1) two polynomial-type methods, namely, the minimal polynomial extrapolation (MPE) of Cabay and Jackson (2) and the reduced rank extrapolation (RRE) of Eddy (3) and Mesina (4), and (2) three epsilon algorithms, namely, the scalar and vector epsilon algorithms of Wynn (5, 6) and the topological epsilon algorithm of Brezinski (7). When applied to very large systems of equations, polynomial-type methods have been observed generally to be more economical than the epsilon algorithms as far as computation time and storage requirements are concerned. For this reason, we concentrate on MPE and RRE in the present work.
In the next section, we present some preliminaries that justify the development of MPE and RRE via a study of linear systems. Following these preliminaries, in the succeeding six sections, we present an up-to-date review of MPE and RRE, which are the two polynomial-type methods. This review contains the developments that have taken place in the study of MPE and RRE since the publication of Ref. 1. In the third section, we give a detailed discussion of the derivation of MPE and RRE that is easy to understand even by those with limited background. In the fourth section, we provide some elegant determinant representations for them. In the fifth section, we discuss their close connection with the method of Arnoldi (8) and with GMRES of Saad and Schultz (9), two well-known Krylov subspace methods for solving linear systems. In the sixth section, we describe the most accurate and stable algorithms for their implementation in finite-precision (that is, floating-point) arithmetic; these algorithms are also the most economical as far as their computational cost and storage requirements are concerned. We provide all the details of these algorithms to enable the reader to program them easily. In the seventh section, we provide the analytic theory of MPE and RRE concerning their convergence, acceleration of convergence, error, and numerical stability behavior. The results of this section are important; they are used in the eighth section, where we discuss a mode of usage for MPE and RRE that is known as cycling and that is very suitable for solving nonlinear systems.
In the ninth section, we turn to some additional applications of vector extrapolation methods. When applied to the sequence of partial sums of the vector-valued power series of a vector-valued function, MPE serves as a tool for analytic continuation: It produces vector-valued rational approximations to the function in question that approximate the latter even outside the circle of convergence of its power series. These rational approximations are closely related to the Arnoldi method for eigenvalue approximations. This topic, which we discuss very briefly here, is treated at length in Sidi (10, 11).
Another application of vector extrapolation methods, especially of MPE and RRE, is to the computation of an eigenvector of an arbitrary large and sparse matrix that corresponds to its largest eigenvalue when this eigenvalue is known. In addition to being of importance in its own right, this problem has attracted much attention recently because it also arises in the computation of the PageRank of the Google Web matrix. Vector extrapolation methods, especially MPE and RRE, can be used effectively in the solution of this problem and hence in PageRank computations. The contents of this part of the paper on computing eigenvectors in general and the PageRank in particular are new and originally appeared in the author's 2004 technical report (12).

^4 In case x_m = s for all m, it is natural and desirable to have Σ_{i=0}^{k} γ_i^{(n,k)} x_{n+i} = s, and this implies that Σ_{i=0}^{k} γ_i^{(n,k)} = 1 must hold.
For completeness, in the section entitled "Brief Summary of MMPE," we discuss briefly a third polynomial method, known as the modified minimal polynomial extrapolation (MMPE), which was proposed independently by Brezinski (7), Pugachev (13), and Sidi et al. (14).
Again, for completeness, in the last section, we give a brief discussion of the three epsilon algorithms mentioned above.
Before we end this section, we would like to point out a nice feature of vector extrapolation methods in general, and of MPE and RRE in particular: These methods can be defined and used in the setting of general infinite-dimensional inner product spaces, as well as in C^N with finite N. The algorithms and the convergence theory remain the same for all practical purposes.
Finally, we mention that MPE and RRE have been used as effective accelerators for solving large and sparse nonlinear systems of equations that arise in diverse areas of science and engineering, such as computational fluid dynamics, structures, materials, semiconductor research, computerized tomography, and computer vision, to name a few. We refer the reader to the relevant literature for these applications.
Note that, throughout this work, (x, y) will stand for the Euclidean inner product of the vectors x, y ∈ C^s, that is, (x, y) = x*y. Similarly, ‖z‖ will stand for the standard vector l_2-norm of z ∈ C^s, namely, ‖z‖ = √(z*z). Finally, ‖A‖ will stand for the matrix norm of A ∈ C^{s×s} induced by this vector norm.^5 Here, s is any positive integer.
PRELIMINARIES AND MOTIVATION
Because one of the main areas of application of vector extrapolation methods is the numerical solution of large and sparse (linear and nonlinear) systems of equations by fixed-point methods, we will motivate their derivation in this context. We start by discussing the nature of the vectors x_m that arise from the iterative method of Equation (2), the function f(x) there being nonlinear in general. Assuming that lim_{m→∞} x_m exists, hence that x_m ≈ s for all large m [recall that s is the solution to the system ψ(x) = 0 and hence to the system x = f(x)], we expand f(x_m) in Equation (2) about s, thus obtaining

x_{m+1} = f(s) + F(s)(x_m − s) + O(‖x_m − s‖^2)  as m → ∞.  (3)
Recalling that s = f(s), and rewriting Equation (3) in the form

x_{m+1} − s = F(s)(x_m − s) + O(‖x_m − s‖^2)  as m → ∞,  (4)
we realize that, for all large m, the vectors x_m and x_{m+1} satisfy the approximate equation x_{m+1} ≈ F(s)x_m + [I − F(s)]s. That is, the sequence {x_m} behaves as if it were being generated by an N-dimensional linear system of the form (I − T)x = b through

x_{m+1} = Tx_m + b,  m = 0, 1, ...,  (5)

where T = F(s) and b = [I − F(s)]s. This suggests that we should study those sequences {x_m} that arise from linear systems of equations to derive and study vector extrapolation methods. We undertake this task in the next section.
DERIVATION OF MPE AND RRE
Linear Systems and Iterative Methods
Let s be the unique solution to the N-dimensional linear system

x = Tx + b.  (6)
Writing this system in the form (I − T)x = b, it becomes clear that the uniqueness of the solution is guaranteed when the matrix I − T is nonsingular or, equivalently, when T does not have 1 as an eigenvalue. Let the vector sequence {x_m} be generated via the iterative scheme

x_{m+1} = Tx_m + b,  m = 0, 1, ... .  (7)
Provided ρ(T) < 1, lim_{m→∞} x_m exists and equals s.

Given the sequence x_0, x_1, ..., generated as in Equation (7), let

u_m = Δx_m,  w_m = Δu_m = Δ^2 x_m,  m = 0, 1, ...,  (8)

where Δx_m = x_{m+1} − x_m and Δ^2 x_m = Δ(Δx_m) = x_{m+2} − 2x_{m+1} + x_m, and define the error vectors e_m as in

e_m = x_m − s,  m = 0, 1, ... .  (9)
Using Equation (7), it is easy to show that

u_m = Tu_{m−1},  w_m = Tw_{m−1},  m = 1, 2, ... .  (10)

From this, we have by induction that

u_m = T^m u_0,  w_m = T^m w_0,  m = 0, 1, ... .  (11)
^5 Actually, we can employ any arbitrary inner product (x, y) and the l_2-norm induced by it, namely, ‖z‖ = √((z, z)). [Recall that any inner product in C^N is of the form (x, y) = x*My, where M is a hermitian positive definite matrix.] This requires a slight modification of the algorithms in the section entitled "Efficient Implementation of MPE and RRE." The rest remains essentially unchanged.
Similarly, by Equation (7) and the fact that s = Ts + b, one can relate the error in x_m to the error in x_{m−1} via

e_m = Te_{m−1},  (12)

which, by induction, gives

e_m = T^m e_0,  m = 0, 1, ... .  (13)

Of course, T^0 = I in Equations (11) and (13) and throughout. In addition, e_m, u_m, and w_m can be related via

u_m = (T − I)e_m,  w_m = (T − I)u_m,  m = 0, 1, ... .  (14)
As mentioned in the preceding section, we propose to approximate s by a "weighted average" of the k + 1 consecutive vectors x_n, x_{n+1}, ..., x_{n+k}, for some integers n and k, as in

s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i},  Σ_{i=0}^{k} γ_i = 1.^6  (15)
(As already mentioned in the Introduction, we do not restrict the scalars γ_i to be real and/or positive; they can be complex in general.) Substituting Equation (9) in Equation (15), and making use of the fact that Σ_{i=0}^{k} γ_i = 1, we obtain

s_{n,k} = Σ_{i=0}^{k} γ_i (s + e_{n+i}) = s + Σ_{i=0}^{k} γ_i e_{n+i}.  (16)
From this, we observe that, for s_{n,k} to be a good approximation to s, the scalars γ_i should be chosen such that the vector Σ_{i=0}^{k} γ_i e_{n+i}, the "weighted average" of the individual error vectors e_n, e_{n+1}, ..., e_{n+k}, is small in some sense. Now, by Equation (13),

Σ_{i=0}^{k} γ_i e_{n+i} = (Σ_{i=0}^{k} γ_i T^i) e_n = G(T)e_n,  G(z) = Σ_{i=0}^{k} γ_i z^i.  (17)
Therefore, we should choose the polynomial G(z) [satisfying G(1) = 1] to ensure that the vector G(T)e_n is small. As we will see shortly, we can actually make G(T)e_n vanish by choosing k and the γ_i appropriately, thus obtaining s = Σ_{i=0}^{k} γ_i x_{n+i}, with Σ_{i=0}^{k} γ_i = 1. As a preliminary to this, we recall the following definitions and theorems; see, for example, Householder (15, chapter 1, p. 18):
Definition 3.1. Given a matrix K ∈ C^{N×N}, the monic polynomial^7 Q(z) is said to be a minimal polynomial of K if Q(K) = O and Q(z) has smallest degree. (O stands for the zero N × N matrix.)
Theorem 3.2. Given a matrix K ∈ C^{N×N}, the minimal polynomial of K exists, is unique, and divides the characteristic polynomial of K. [Thus, the degree of Q(z) is at most N, and all its zeros are eigenvalues of K.]
Definition 3.3. Given a matrix K ∈ C^{N×N} and a nonzero vector u ∈ C^N, the monic polynomial P(z) is said to be a minimal polynomial of K with respect to u if P(K)u = 0 and P(z) has smallest degree. (0 stands for the zero N-vector.)
Theorem 3.4. (1) Given a matrix K ∈ C^{N×N} and a nonzero vector u ∈ C^N, the minimal polynomial of K with respect to u exists, is unique, and divides the minimal polynomial of K, which divides the characteristic polynomial of K. [Thus, the degree of P(z) is at most N, and its zeros are some or all of the eigenvalues of K.] (2) In addition, if P̃(z) is any other polynomial for which P̃(K)u = 0, then P(z) divides P̃(z).
Going back to Equation (17), we can make the vector G(T)e_n there vanish if we choose the polynomial G(z) to be a constant multiple of P(z), the minimal polynomial of T with respect to e_n [which exists and is unique by part (1) of Theorem 3.4], because P(T)e_n = 0. Thus, choosing G(z) = P(z)/P(1) so that G(1) = 1 as required, we can obtain from Equations (15)–(17) that

s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i} = s,  with  Σ_{i=0}^{k} γ_i = 1.

Here, division by P(1) is allowed because P(1) ≠ 0, since 1 is not an eigenvalue of the matrix T.
We can obtain the same result in an indirect way also by working backward from P(T)e_n = 0. Let

P(z) = Σ_{i=0}^{k} c_i z^i,  c_k = 1.  (18)
Starting with P(T)e_n = 0 and invoking Equation (13), we have

Σ_{i=0}^{k} c_i T^i e_n = Σ_{i=0}^{k} c_i e_{n+i} = 0,  (19)
which, by Equation (9), can be rewritten as

Σ_{i=0}^{k} c_i (x_{n+i} − s) = 0.

^6 It should be understood that, because they depend on n and k, the scalars γ_i actually stand for γ_i^{(n,k)}, as mentioned in the Introduction. When necessary, we will use the notation γ_i^{(n,k)} in the sequel. When no confusion develops, our notation will be γ_i, for short.
^7 The polynomial A(z) = Σ_{i=0}^{m} a_i z^i is monic of degree m if a_m = 1.
Solving this for s, we have

s = (Σ_{i=0}^{k} c_i x_{n+i}) / (Σ_{i=0}^{k} c_i).

Here, division by Σ_{i=0}^{k} c_i is allowed because

Σ_{i=0}^{k} c_i = P(1) ≠ 0,
because 1 is not an eigenvalue of the matrix T. Thus, we have shown once more that

s = Σ_{i=0}^{k} γ_i x_{n+i},  with  Σ_{i=0}^{k} γ_i = 1.

Also, the γ_i of the preceding paragraph are now identified as γ_i = c_i / Σ_{j=0}^{k} c_j, i = 0, 1, ..., k, the c_i being the coefficients of the minimal polynomial of T with respect to e_n = x_n − s.
What remains is to find the c_i. Invoking c_k = 1, Equation (19) can be written as

Σ_{i=0}^{k−1} c_i e_{n+i} = −e_{n+k}.
This is a set of N linear equations in the k unknowns c_0, c_1, ..., c_{k−1}, with c_k = 1. Also, because k ≤ N, this set is in general overdetermined. Nevertheless, it is consistent and has a unique solution because P(z) exists and is unique. However, to determine the c_i from Σ_{i=0}^{k−1} c_i e_{n+i} = −e_{n+k}, it seems that we need to know the vectors e_m = x_m − s, m = n, n+1, ..., n+k, hence the solution s. Fortunately, this is not the case, and we can obtain the c_i solely from our knowledge of the vectors x_m. This we achieve as follows: Multiplying Equation (19) by T, and invoking Equation (12), we have
0 = Σ_{i=0}^{k} c_i Te_{n+i} = Σ_{i=0}^{k} c_i e_{n+i+1}.  (20)
Subtracting Equation (19) from Equation (20), we obtain

0 = Σ_{i=0}^{k} c_i (e_{n+i+1} − e_{n+i}) = Σ_{i=0}^{k} c_i (x_{n+i+1} − x_{n+i}),

hence the linear system

Σ_{i=0}^{k} c_i u_{n+i} = 0.  (21)
Clearly, the vectors u_m are known since the x_m are. Consequently, the c_i can be obtained from Equation (21) without any additional input.
Invoking Equation (11), Equation (21) can also be expressed as

0 = Σ_{i=0}^{k} c_i T^i u_n = P(T)u_n.
Actually, using P(T)u_n = 0 and the fact that (I − T) is nonsingular, we can apply part (2) of Theorem 3.4 to prove the following stronger result:

Theorem 3.5. If u_n = x_{n+1} − x_n ≠ 0, then P(z), the minimal polynomial of T with respect to e_n, is also the minimal polynomial of T with respect to u_n.
By Theorem 3.5, even though it is in general overdetermined, the system in Equation (21) is consistent and has a unique solution for the c_i. As already mentioned, its matrix is obtained solely from the vectors x_m, which are available. Setting c_k = 1, we solve this system for c_0, c_1, ..., c_{k−1}. Following that, with c_k = 1, we set

γ_i = c_i / Σ_{j=0}^{k} c_j,  i = 0, 1, ..., k.
Again, this is allowed because

Σ_{i=0}^{k} c_i = P(1) ≠ 0,

by the fact that I − T is nonsingular, and hence T does not have 1 as an eigenvalue.
Summing up, we have shown that if k is the degree of the minimal polynomial of T with respect to e_n, then there exist scalars γ_0, γ_1, ..., γ_k, satisfying Σ_{i=0}^{k} γ_i = 1, such that

Σ_{i=0}^{k} γ_i x_{n+i} = s.

The γ_i are given by

γ_i = c_i / Σ_{j=0}^{k} c_j,  i = 0, 1, ..., k,

where the c_i are the unique solution to the linear system

Σ_{i=0}^{k} c_i u_{n+i} = 0,  with c_k = 1.

At this point, we note that s is the solution to (I − T)x = b, whether ρ(T) < 1 or not. Thus, with the γ_i as determined above, s = Σ_{i=0}^{k} γ_i x_{n+i}, whether lim_{m→∞} x_m exists or not. (Recall our discussion of antilimits in the Introduction.)
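As a numerical sketch of this construction (hypothetical data, not from the original exposition), take k equal to the degree of the minimal polynomial of T with respect to e_n, which for a random 3 × 3 matrix T is generically k = N = 3. Solving Equation (21) by least squares then recovers s exactly from the iterates alone, even though s itself never enters the computation.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, k = 3, 0, 3                       # generically, k = N here
T = 0.5 * rng.random((N, N))
b = rng.random(N)
s = np.linalg.solve(np.eye(N) - T, b)   # used only to check the answer

# Generate x_0, ..., x_{n+k+1} and the differences u_m = x_{m+1} - x_m
x = [np.zeros(N)]
for _ in range(n + k + 1):
    x.append(T @ x[-1] + b)
u = [x[m + 1] - x[m] for m in range(len(x) - 1)]

# Solve sum_{i<k} c_i u_{n+i} = -u_{n+k} (consistent here), set c_k = 1
U = np.column_stack(u[n:n + k])
c, *_ = np.linalg.lstsq(U, -u[n + k], rcond=None)
c = np.append(c, 1.0)
gamma = c / c.sum()                     # gamma_i = c_i / sum_j c_j

s_nk = sum(g * x[n + i] for i, g in enumerate(gamma))
print(np.linalg.norm(s_nk - s))         # ~ machine precision
```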
1832 METHODS FOR ACCELERATION OF CONVERGENCE (EXTRAPOLATION) OF VECTOR SEQUENCES
In the sequel, we shall use the notation

U_s^{(j)} = [u_j | u_{j+1} | ··· | u_{j+s}].  (22)

Thus, U_s^{(j)} is an N × (s + 1) matrix, u_j, u_{j+1}, ..., u_{j+s} being its columns. In this notation, Equation (21) reads

U_k^{(n)} c = 0,  c = [c_0, c_1, ..., c_k]^T.  (23)
Of course, dividing Equation (23) by Σ_{i=0}^{k} c_i, and defining the γ_i as above, we also have

U_k^{(n)} γ = 0,  γ = [γ_0, γ_1, ..., γ_k]^T.  (24)
So far, we have observed that s can be determined via a sum of the form Σ_{i=0}^{k} γ_i x_{n+i} with Σ_{i=0}^{k} γ_i = 1 once the minimal polynomial of T with respect to e_n has been determined, and we have seen how this polynomial can be determined. However, the degree of the minimal polynomial of T with respect to e_n can be as large as N by part (1) of Theorem 3.4. Because N can be very large in general, determining s in the way we have described here becomes prohibitively expensive as far as computation time and storage requirements are concerned. [Note that we need to store the vectors u_n, u_{n+1}, ..., u_{n+k} and solve the N × k linear system in Equation (21).] Thus, we conclude that attempting to determine s via a combination of the iteration vectors x_m cannot be a suitable way after all. Nevertheless, with some twist, we can use the framework developed thus far to approximate s effectively. To do this, we replace the degree of the minimal polynomial of T with respect to e_n by an arbitrary small integer k (in practice, we take k ≪ N). This subject is described in the next two subsections.
Derivation of MPE
Let us choose k to be an arbitrary positive integer that is normally (much) smaller than the degree of the minimal polynomial of T with respect to e_n. Clearly, the linear system in Equation (23) (with c_k = 1) is now inconsistent and hence has no solution for c_0, c_1, ..., c_{k−1} in the ordinary sense. To get around this problem, we solve this system in the least-squares sense, because such a solution always exists. After that, we compute γ_0, γ_1, ..., γ_k precisely as described following Equation (21), and then compute the vector

s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i}

as our approximation to s. The resulting method is known as MPE. Clearly, MPE takes as its input only the integers k and n and the vectors x_n, x_{n+1}, ..., x_{n+k+1}; hence, it can be employed whether these vectors are generated by a linear or nonlinear iterative process.
We can summarize the definition of MPE for an arbitrary sequence of vectors {x_m} through the following steps:

1. Choose the integers k and n, and input the vectors x_n, x_{n+1}, ..., x_{n+k+1}. (Of course, k ≤ N, but k ≪ N normally.)

2. Compute the vectors u_n, u_{n+1}, ..., u_{n+k}, and form the N × k matrix U_{k−1}^{(n)}. (Recall that u_m = x_{m+1} − x_m.)

3. Solve the overdetermined linear system U_{k−1}^{(n)} c′ = −u_{n+k} in the least-squares sense for c′ = [c_0, c_1, ..., c_{k−1}]^T. Set c_k = 1, and compute

γ_i = c_i / Σ_{j=0}^{k} c_j,  i = 0, 1, ..., k,

provided Σ_{j=0}^{k} c_j ≠ 0.

4. Compute the vector

s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i}

as the approximation to lim_{m→∞} x_m = s.
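The four steps above can be sketched in a few lines. This is a minimal illustration only, not the numerically stable QR-based implementation described later in this article; `np.linalg.lstsq` stands in for the least-squares solve of Step 3, and the test matrix T and vector b are hypothetical.

```python
import numpy as np

def mpe(x, n, k):
    """Minimal sketch of MPE.  x[m] holds the vector x_m; uses
    x_n, ..., x_{n+k+1} and returns the approximation s_{n,k}."""
    u = [x[m + 1] - x[m] for m in range(n, n + k + 1)]   # u_m = delta x_m
    # Step 3: least-squares solution of U_{k-1}^{(n)} c' = -u_{n+k}
    U = np.column_stack(u[:k])
    c, *_ = np.linalg.lstsq(U, -u[k], rcond=None)
    c = np.append(c, 1.0)                                # c_k = 1
    gamma = c / c.sum()                # sum gamma_i = 1 (assumes sum c_j != 0)
    # Step 4: s_{n,k} = sum_i gamma_i x_{n+i}
    return sum(g * x[n + i] for i, g in enumerate(gamma))

# Hypothetical linear iteration x_{m+1} = T x_m + b for illustration
T = np.array([[0.9, 0.05],
              [0.05, 0.8]])
b = np.ones(2)
s = np.linalg.solve(np.eye(2) - T, b)
xs = [np.zeros(2)]
for _ in range(4):
    xs.append(T @ xs[-1] + b)
print(np.linalg.norm(mpe(xs, n=0, k=2) - s))   # exact here, since k = N = 2
```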
Derivation of RRE
Again, let us choose k to be an arbitrary positive integer that is normally (much) smaller than the degree of the minimal polynomial of T with respect to e_n. With this k, the linear system in Equation (24), subject to Σ_{i=0}^{k} γ_i = 1, is not consistent; hence, it has no solution for γ_0, γ_1, ..., γ_k in the ordinary sense. Therefore, we solve this system in the least-squares sense, subject to the constraint Σ_{i=0}^{k} γ_i = 1. After that, we compute the vector

s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i}

as our approximation to s. The resulting method is known as RRE. Clearly, RRE, just as MPE, takes as its input only the integers k and n and the vectors x_n, x_{n+1}, ..., x_{n+k+1}; hence, it can be employed whether these vectors are generated by a linear or nonlinear iterative process.
We can summarize the definition of RRE for an arbitrary sequence of vectors {x_m} through the following steps:

1. Choose the integers k and n, and input the vectors x_n, x_{n+1}, ..., x_{n+k+1}. (Of course, k ≤ N, but k ≪ N normally.)

2. Compute the vectors u_n, u_{n+1}, ..., u_{n+k}, and form the N × (k + 1) matrix U_k^{(n)}. (Recall that u_m = x_{m+1} − x_m.)

3. Solve the overdetermined linear system U_k^{(n)} γ = 0 in the least-squares sense, subject to the constraint Σ_{i=0}^{k} γ_i = 1. Here, γ = [γ_0, γ_1, ..., γ_k]^T.

4. Compute the vector

s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i}

as the approximation to lim_{m→∞} x_m = s.
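The constrained least-squares problem of Step 3 can also be sketched directly. The closed form γ ∝ (U*U)^{-1}e used below is one standard way to impose Σ γ_i = 1; forming the normal equations this way is adequate only for well-conditioned toy problems, not the stable QR-based algorithms given later. The diagonal matrix T and vector b are hypothetical.

```python
import numpy as np

def rre(x, n, k):
    """Minimal sketch of RRE: minimize ||U_k^{(n)} g|| subject to
    sum(g) = 1, via the normal equations (toy problems only)."""
    u = [x[m + 1] - x[m] for m in range(n, n + k + 1)]
    U = np.column_stack(u)                  # N x (k+1), full rank assumed
    M = U.conj().T @ U
    d = np.linalg.solve(M, np.ones(k + 1))  # minimizer is proportional to M^{-1} e
    gamma = d / d.sum()                     # enforce sum gamma_i = 1
    return sum(g * x[n + i] for i, g in enumerate(gamma))

# Hypothetical diagonal iteration with eigenvalues 0.9, 0.5, 0.3
T = np.diag([0.9, 0.5, 0.3])
b = np.array([0.1, 0.5, 0.7])               # chosen so that s = (1, 1, 1)
s = np.linalg.solve(np.eye(3) - T, b)
xs = [np.zeros(3)]
for _ in range(3):
    xs.append(T @ xs[-1] + b)
err_rre = np.linalg.norm(rre(xs, n=0, k=2) - s)
err_x = np.linalg.norm(xs[2] - s)
print(err_rre, err_x)                       # extrapolation beats plain iteration
```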
A method that is essentially the same as RRE was proposed by Kaniel and Stein (16). In this method, the γ_i are computed as in Step 3 of RRE, but lim_{m→∞} x_m = s is now approximated by

s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i+1}.
An Exactness Property of MPE and RRE
From Equations (19) and (20), we realize that P(z) = Σ_{i=0}^{k} c_i z^i, the minimal polynomial of T with respect to e_n = x_n − s, satisfies

Σ_{i=0}^{k} c_i (x_{m+i} − s) = 0,  m = n, n+1, ... .  (25)

Consequently, it also satisfies

Σ_{i=0}^{k} c_i u_{m+i} = 0,  m = n, n+1, ...,

from which the c_i can be determined uniquely because k is the smallest possible, and MPE and RRE produce s exactly.

We can now ignore the fact that the vectors x_m in the preceding subsections were generated by a linear fixed-point iterative method, concentrate only on vectors x_m that satisfy the recursion relation in Equation (25) with minimal k, and state the following exactness result:

Theorem 3.6. Let the vectors x_m satisfy Equation (25), with c_0 c_k ≠ 0, Σ_{i=0}^{k} c_i ≠ 0, and minimal k. Let e_m = x_m − s for all m, and assume that the vectors e_n, e_{n+1}, ..., e_{n+k−1} are linearly independent. Then the following hold: (1) With u_m = x_{m+1} − x_m for all m, all the sets {u_m, u_{m+1}, ..., u_{m+k−1}}, m = n, n+1, ..., are linearly independent. (2) The s_{m,k} obtained by applying MPE and RRE to the sequence {x_m} exist uniquely and satisfy s_{m,k} = s for all m = n, n+1, ..., both for MPE and for RRE.
This result can be viewed as the MPE/RRE analogue of what is known as McLeod's theorem, which concerns the vector epsilon algorithm. We will discuss McLeod's theorem briefly in the last section.
DETERMINANT REPRESENTATIONS
Despite the rather complex procedures by which s_{n,k} for MPE and RRE have been defined, elegant analytical expressions for them can be given. The determinant representations in the next theorem were given by the author in Ref. 17.
Theorem 4.1. With u_m and w_m as in Equation (8), define the scalars u_{i,j} by

u_{i,j} = (u_{n+i}, u_{n+j}) = (Δx_{n+i}, Δx_{n+j})   for MPE,
u_{i,j} = (w_{n+i}, u_{n+j}) = (Δ^2 x_{n+i}, Δx_{n+j})   for RRE.   (26)

1. The γ_j for MPE and RRE are the solution to the linear systems

(u_{n+i}, U_k^{(n)} γ) = 0,  i = 0, 1, ..., k−1;  Σ_{j=0}^{k} γ_j = 1,   for MPE,
(w_{n+i}, U_k^{(n)} γ) = 0,  i = 0, 1, ..., k−1;  Σ_{j=0}^{k} γ_j = 1,   for RRE.   (27)

2. We have the following determinant representations for s_{n,k} from MPE and RRE:

s_{n,k} = D(x_n, x_{n+1}, ..., x_{n+k}) / D(1, 1, ..., 1),   (28)

where D(v_0, v_1, ..., v_k) is a (k + 1) × (k + 1) determinant defined as in

D(v_0, v_1, ..., v_k) = det | v_0        v_1        ···  v_k       |
                            | u_{0,0}    u_{0,1}    ···  u_{0,k}   |
                            | u_{1,0}    u_{1,1}    ···  u_{1,k}   |
                            | ⋮          ⋮               ⋮         |
                            | u_{k−1,0}  u_{k−1,1}  ···  u_{k−1,k} |.   (29)
Note that the determinant D(x_n, x_{n+1}, ..., x_{n+k}) in Equation (28) is vector-valued and is defined via its expansion with respect to its first row. Thus, if C_i is the cofactor of v_i in D(v_0, v_1, ..., v_k) defined in Equation (29), then

s_{n,k} = Σ_{i=0}^{k} C_i x_{n+i} / Σ_{i=0}^{k} C_i.
The determinant representations above have been very useful in analyzing MPE and RRE. They have also been used in Ref. 18 in deriving recursion relations among the various s_{n,k}.
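Equations (28) and (29) can be checked numerically against the least-squares definition of MPE. The sketch below uses hypothetical data; both routes must agree in exact arithmetic, since the normal equations of the least-squares problem are exactly the system in Equation (27), whose Cramer's-rule solution produces the cofactors C_i.

```python
import numpy as np

# Determinant representation of MPE versus its least-squares definition.
rng = np.random.default_rng(1)
N, n, k = 5, 0, 2
T = 0.3 * rng.random((N, N))
b = rng.random(N)
x = [np.zeros(N)]
for _ in range(n + k + 1):
    x.append(T @ x[-1] + b)
u = [x[m + 1] - x[m] for m in range(len(x) - 1)]

# Lower rows of D: the k x (k+1) array of inner products u_{i,j} (MPE case)
B = np.array([[u[n + i] @ u[n + j] for j in range(k + 1)]
              for i in range(k)])
C = [(-1) ** i * np.linalg.det(np.delete(B, i, axis=1))
     for i in range(k + 1)]                        # first-row cofactors
s_det = sum(Ci * x[n + i] for i, Ci in enumerate(C)) / sum(C)

# Same quantity via the definition of MPE (least squares for the c_i)
c, *_ = np.linalg.lstsq(np.column_stack(u[n:n + k]), -u[n + k], rcond=None)
c = np.append(c, 1.0)
s_mpe = sum(g * x[n + i] for i, g in enumerate(c / c.sum()))
print(np.linalg.norm(s_det - s_mpe))               # ~ machine precision
```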
MPE and RRE in Infinite Dimensional Inner Product Spaces
Before we continue our discussion of MPE and RRE, it is important to note that these two methods, and other vector extrapolation methods as well, can be applied to vector sequences in infinite dimensional inner product spaces, such as Hilbert spaces, once the inner product in C^N has been replaced with that pertinent to the space in which we are working. This is clear from Equations (26)–(29). All the developments that we discuss in the next sections, including the convergence theory, remain unchanged in these spaces, subject to some minor modifications. This has been noted in Refs. 14 and 17.
MPE AND RRE AND KRYLOV SUBSPACE METHODS
When applied to the solution of the linear system (I − T)x = b in conjunction with the linear fixed-point iteration scheme x_{m+1} = Tx_m + b, m = 0, 1, ..., MPE and RRE become mathematically (but not algorithmically) equivalent to two well-known Krylov subspace methods, namely, the method of Arnoldi (8) and GMRES (9). The details follow:
Given a matrix S and a vector u, let us define the Krylov subspace K_m(S, u) via

K_m(S, u) = span{S^i u : i = 0, 1, ..., m−1}.  (30)
In the method of Arnoldi and in GMRES, one starts with the vector x_0 and seeks an approximation s_k to s, the solution to Ax = b, of the form s_k = x_0 + z, with z in the Krylov subspace K_k(A, r_0), where r_0 = r(x_0) = b − Ax_0, such that

(r(s_k), a) = 0 for all a ∈ K_k(A, r_0)  (for the Arnoldi method)  (31)

and

‖r(s_k)‖ = min_{z ∈ K_k(A, r_0)} ‖r(x_0 + z)‖  (for GMRES).  (32)

Here, (x, y) = x*y and ‖x‖ = √(x*x), as before. The following theorem was proved in Ref. 19:
Theorem 5.1. Consider the linear system Ax = b, where A = I − T, and let x_0 be an arbitrary initial vector. Let the vectors x_m in MPE and RRE be generated via x_{m+1} = Tx_m + b, and let s_{n,k}^{MPE} and s_{n,k}^{RRE} be the resulting approximations to the solution s of Ax = b. Similarly, let s_k^{Arnoldi} and s_k^{GMRES} be the approximations to s from the method of Arnoldi and from GMRES, respectively, starting with the same initial vector x_0, as explained above. Then there hold

s_{0,k}^{MPE} = s_k^{Arnoldi}  and  s_{0,k}^{RRE} = s_k^{GMRES}.
Recall also that the Arnoldi method reduces (mathematically) to the method of conjugate gradients when the matrix A is hermitian positive definite.
EFFICIENT IMPLEMENTATION OF MPE AND RRE
General Considerations
We now turn to the numerical implementation of MPE and RRE in floating-point (that is, finite-precision) arithmetic. Because we are performing computations with vectors and matrices, this topic must be handled with care. In this section, we provide the details of some efficient algorithms for implementing MPE and RRE. In addition to being stable numerically, these algorithms are also timewise economical and have minimal core memory requirements. They were originally developed in Ref. 20, where a FORTRAN 77 code is also included. By following the instructions of this section, the user can program these algorithms without having to invoke commercial software packages.
Our algorithms are based on the definitions of MPE and RRE that we presented in the section entitled "Derivation of MPE and RRE." One feature common to both MPE and RRE is that the matrices U_k^{(n)} enter their definitions. In addition, by the fact that Σ_{i=0}^{k} γ_i = 1, the vector s_{n,k} = Σ_{i=0}^{k} γ_i x_{n+i}, for both MPE and RRE, can be expressed as in

s_{n,k} = x_n + Σ_{i=0}^{k−1} ξ_i u_{n+i} = x_n + U_{k−1}^{(n)} ξ,  ξ = [ξ_0, ξ_1, ..., ξ_{k−1}]^T,  (33)

where

ξ_0 = 1 − γ_0;  ξ_j = ξ_{j−1} − γ_j,  j = 1, ..., k−1.  (34)
Taking this point into account together with the definitions of MPE and RRE, we realize first that the vectors x_{n+1}, ..., x_{n+k+1} are not needed directly for computing s_{n,k}. Actually, they can be discarded as soon as the u_{n+i} are computed, which saves storage. This is done by overwriting the vector x_{n+i} with u_{n+i} = Δx_{n+i} as soon as the latter is computed, for i = 1, ..., k. We save only x_n.
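The identity behind Equation (33) is easy to verify numerically. The sketch below uses arbitrary hypothetical data for the vectors and the γ_i, since the identity is purely algebraic and holds for any weights summing to 1.

```python
import numpy as np

# Check of Equation (33): with xi_0 = 1 - gamma_0 and
# xi_j = xi_{j-1} - gamma_j, the weighted average sum_i gamma_i x_{n+i}
# equals x_n + sum_{i<k} xi_i u_{n+i}.
rng = np.random.default_rng(2)
k, N = 3, 4
x = rng.random((k + 1, N))                       # stands for x_n, ..., x_{n+k}
u = np.diff(x, axis=0)                           # u_{n+i} = x_{n+i+1} - x_{n+i}
gamma = rng.random(k + 1)
gamma /= gamma.sum()                             # sum gamma_i = 1

xi = np.empty(k)
xi[0] = 1 - gamma[0]
for j in range(1, k):
    xi[j] = xi[j - 1] - gamma[j]

lhs = gamma @ x                                  # sum_i gamma_i x_{n+i}
rhs = x[0] + xi @ u                              # x_n + sum_i xi_i u_{n+i}
print(np.linalg.norm(lhs - rhs))                 # ~ 0
```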
In the sequel, we assume that the matrix U_k^{(n)} has full rank, that is, that rank(U_k^{(n)}) = k + 1.
One very important aspect of our algorithms is their accurate solution of the relevant least-squares problems, via QR factorizations of the matrices U_k^{(n)}, as in

U_k^{(n)} = Q_k R_k.  (35)

Here, Q_k is an N × (k + 1) unitary matrix satisfying Q_k* Q_k = I_{(k+1)×(k+1)}. Thus, Q_k has the columnwise partition

Q_k = [q_0 | q_1 | ··· | q_k],  (36)

such that the columns q_i form an orthonormal set of N-dimensional vectors, that is, q_i* q_j = δ_{ij}. The matrix R_k is a (k + 1) × (k + 1) upper triangular matrix with positive diagonal elements. Thus,

      | r_00  r_01  ···  r_0k |
R_k = |       r_11  ···  r_1k |
      |             ⋱    ⋮    |
      |                  r_kk | ,   r_ii > 0,  i = 0, 1, ..., k.  (37)
This factorization can be carried out easily and accurately using the modified Gram–Schmidt orthogonalization process (MGS). See, for example, Refs. 20 and 21. For completeness, we give here the steps of MGS as applied to the matrix U_k^{(n)}:
1. Compute r_00 = ||u_n|| and q_0 = u_n / r_00.
2. For i = 1, ..., k do
     Set u_i^{(0)} = u_{n+i}
     For j = 0, ..., i-1 do
       r_ji = (q_j, u_i^{(j)}) and u_i^{(j+1)} = u_i^{(j)} - r_ji q_j
     end
     Compute r_ii = ||u_i^{(i)}|| and q_i = u_i^{(i)} / r_ii
   end
Again, (x, y) = x*y and ||x|| = √(x*x). In addition, q_0 overwrites u_n in Step 1, while in Step 2, u_i^{(0)} overwrites u_{n+i}, u_i^{(j+1)} overwrites u_i^{(j)}, and q_i overwrites u_i^{(i)}, as soon as they are computed. Thus, for each i = 1, ..., k, the vectors u_{n+i}, u_i^{(j)}, and q_i all occupy the same storage location, both at the time of computation and after the computation has been performed.
Summarizing, we have that at all stages of the computation of Q_k and R_k, we are keeping only k + 2 vectors in memory. Ultimately, we save the vector x_n and the matrix Q_k, that is, the vectors q_0, q_1, ..., q_k.
Note that Q_k is obtained from Q_{k-1} by appending to the latter the vector q_k as the (k+1)st column. Similarly, R_k is obtained from R_{k-1} by appending to the latter the zero vector as the (k+1)st row and then the vector [r_0k, r_1k, ..., r_kk]^T as the (k+1)st column.
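As a concrete illustration, the MGS steps above can be sketched as follows; this is a minimal real-valued implementation for illustration (use np.vdot for complex data), not the FORTRAN 77 code of Ref. 20. The columns of the work array are overwritten in place, mirroring the storage scheme just described.

```python
import numpy as np

def mgs_qr(U):
    """Modified Gram-Schmidt QR factorization of an N x (k+1) matrix U,
    following Steps 1-2 of the text (real case for brevity).
    The work array's columns are overwritten by q_0, ..., q_k in place."""
    Q = np.array(U, dtype=float)          # copy of U; columns become the q_i
    kp1 = Q.shape[1]
    R = np.zeros((kp1, kp1))
    R[0, 0] = np.linalg.norm(Q[:, 0])
    Q[:, 0] /= R[0, 0]                    # q_0 = u_n / r_00
    for i in range(1, kp1):
        for j in range(i):                # subtract projections one at a time
            R[j, i] = Q[:, j] @ Q[:, i]   # r_ji = (q_j, u_i^{(j)})
            Q[:, i] -= R[j, i] * Q[:, j]  # u_i^{(j+1)} = u_i^{(j)} - r_ji q_j
        R[i, i] = np.linalg.norm(Q[:, i])
        Q[:, i] /= R[i, i]                # q_i = u_i^{(i)} / r_ii
    return Q, R
```

Because the projections are subtracted one at a time from the running vector u_i^{(j)}, rather than all at once from u_{n+i}, MGS loses far less orthogonality in finite precision than classical Gram–Schmidt.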
Algorithms for MPE and RRE
With the QR factorization of U_k^{(n)} (hence of U_{k-1}^{(n)}) available, we can give algorithms for MPE and RRE within a unified framework. The algorithms are almost identical; they differ only in the way they compute the γ_i. Here are their steps:
1. Input: k and n and the vectors x_n, x_{n+1}, ..., x_{n+k+1}.

2. Compute the vectors u_{n+i} = Δx_{n+i}, i = 0, 1, ..., k, and form the N × (k+1) matrix

   U_k^{(n)} = [u_n | u_{n+1} | ... | u_{n+k}],

   and form its QR factorization, namely, U_k^{(n)} = Q_k R_k, with Q_k and R_k as in Equations (36) and (37), using MGS.

3. Determine the γ_i:

   For MPE: With ρ_k = [r_0k, r_1k, ..., r_{k-1,k}]^T, solve the k × k upper triangular system

     R_{k-1} c' = -ρ_k,  c' = [c_0, c_1, ..., c_{k-1}]^T.

   Set c_k = 1, and γ_i = c_i / Σ_{i=0}^k c_i, i = 0, 1, ..., k, provided Σ_{i=0}^k c_i ≠ 0.

   For RRE: With e = [1, 1, ..., 1]^T, solve the (k+1) × (k+1) linear system

     R_k* R_k d = e,  d = [d_0, d_1, ..., d_k]^T.

   (This amounts to solving two triangular systems: first R_k* a = e for a = [a_0, a_1, ..., a_k]^T, and following that, R_k d = a for d.) Next, compute

     λ = 1 / Σ_{i=0}^k d_i.

   (Note that λ is always real and positive because U_k^{(n)} has full rank.) Next, set γ = λd, that is, γ_i = λd_i, i = 0, 1, ..., k.

4. With the γ_i determined, compute ξ = [ξ_0, ξ_1, ..., ξ_{k-1}]^T via

     ξ_0 = 1 - γ_0;  ξ_j = ξ_{j-1} - γ_j,  j = 1, ..., k-1.

5. Compute

     η = [η_0, η_1, ..., η_{k-1}]^T = R_{k-1} ξ.

   Then compute

     s_{n,k} = x_n + Q_{k-1} η = x_n + Σ_{i=0}^{k-1} η_i q_i.
Note that the linear systems that we solve in Step 3 are very small in size compared with the dimension N of the vectors x_m, hence their cost is negligible.
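To make the five steps concrete, here is a sketch of the unified algorithm in Python with NumPy. This is a hypothetical translation for illustration, not the code of Ref. 20, and it calls np.linalg.qr (Householder-based) where the text uses MGS; since only the factorization itself is needed, the remaining steps are unchanged.

```python
import numpy as np

def extrapolate(X, method="MPE"):
    """Compute s_{n,k} from the k+2 vectors X = [x_n | x_{n+1} | ... | x_{n+k+1}]
    (given as columns), following Steps 1-5 of the text."""
    U = np.diff(X, axis=1)                       # Step 2: U_k^{(n)} = [u_n | ... | u_{n+k}]
    Q, R = np.linalg.qr(U)                       # QR factorization of U_k^{(n)}
    k = U.shape[1] - 1
    if method == "MPE":                          # Step 3, MPE branch
        rho = R[:k, k]                           # rho_k = [r_0k, ..., r_{k-1,k}]^T
        c = np.append(np.linalg.solve(R[:k, :k], -rho), 1.0)  # R_{k-1} c' = -rho_k; c_k = 1
        gamma = c / c.sum()                      # assumes sum of the c_i is nonzero
    else:                                        # Step 3, RRE branch: R_k* R_k d = e
        a = np.linalg.solve(R.conj().T, np.ones(k + 1))
        d = np.linalg.solve(R, a)
        gamma = d / d.sum()                      # gamma = lambda*d, lambda = 1/sum(d_i)
    xi = 1.0 - np.cumsum(gamma[:k])              # Step 4: xi_j = 1 - (gamma_0 + ... + gamma_j)
    eta = R[:k, :k] @ xi                         # Step 5: eta = R_{k-1} xi
    return X[:, 0] + Q[:, :k] @ eta              # s_{n,k} = x_n + Q_{k-1} eta
```

On a sequence with exactly p decaying terms as in Equation (38), taking k = p reproduces s exactly, which is the finite termination property discussed later in Theorem 7.1.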
Error Estimation via Algorithms
We now turn to the assessment of the quality of s_{n,k} as an approximation to s, in the case where the sequence {x_m} is obtained from the fixed-point iteration x_{m+1} = f(x_m) for the system x = f(x). All we know how to compute is f(x) for a given x, and we must use this knowledge for our assessment.
One common way of assessing the quality of a given vector x as an approximation to s is by looking at the l2-norm of the residual vector r(x) = f(x) - x, because r(s) = 0. Thus, the quality of s_{n,k} will be assessed by computing the l2-norm of r(s_{n,k}). What is of interest is the size of ||r(s_{n,k})|| / ||r(x_0)||, namely, the order of magnitude of ||r(s_{n,k})|| relative to that of ||r(x_0)||. This number shows by how many orders of magnitude the l2-norm of the initial residual r(x_0) (or of the initial error e_0 = x_0 - s) has been reduced in the course of extrapolation.

We now show how this can be done by using only some of the quantities already produced by the algorithms given above, at no additional cost.
First, note that, by Equation (2),

r(x_0) = f(x_0) - x_0 = x_1 - x_0 = u_0;  hence ||r(x_0)|| = ||u_0||.
For linear sequences: When the iteration vectors x_m are generated linearly as in Equation (7), we have r(x) = (Tx + b) - x. Thus,

r(x_m) = x_{m+1} - x_m = u_m.

Invoking also Σ_{i=0}^k γ_i = 1, where the γ_i are as obtained when applying MPE or RRE, we therefore have

r(s_{n,k}) = Σ_{i=0}^k γ_i u_{n+i} = U_k^{(n)} γ;  hence ||r(s_{n,k})|| = ||U_k^{(n)} γ||.

Note also that, in this case,

s_{n,k} - s = -(I - T)^{-1} r(s_{n,k});  hence ||s_{n,k} - s|| ≤ ||(I - T)^{-1}|| ||r(s_{n,k})||,

and this provides additional justification for looking at ||r(s_{n,k})||.

For nonlinear sequences: When the x_m are generated nonlinearly as in Equation (2), with s_{n,k} close to s, we have

r(s_{n,k}) = f(s_{n,k}) - s_{n,k} ≈ U_k^{(n)} γ;  hence ||r(s_{n,k})|| ≈ ||U_k^{(n)} γ||.
Whether the vectors x_m are generated linearly or nonlinearly, ||U_k^{(n)} γ|| can be determined in terms of already computed quantities and at no cost, without actually having to compute r(s_{n,k}). Indeed, we have

||U_k^{(n)} γ|| = r_kk |γ_k|  for MPE;  ||U_k^{(n)} γ|| = √λ  for RRE.

Here, r_kk is the last diagonal element of the matrix R_k in Equation (37), and λ is the parameter computed in Step 3 of the algorithms in the preceding subsection. Clearly, this computation of ||U_k^{(n)} γ|| can be made a part of the algorithms. For details, see Ref. 20.
ERROR ANALYSIS FOR MPE AND RRE
The error and convergence analysis of MPE and RRE has been carried out within the framework of linearly generated vector sequences {x_m} in Refs. 17 and 22–25. This analysis sheds considerable light on what vector extrapolation methods can achieve when they are applied to vector sequences obtained from fixed-point iterative methods on nonlinear systems as well as linear ones.
Convergence Acceleration Properties of s_{n,k} as n → ∞

We first consider the convergence properties of MPE and RRE as n → ∞ while k is held fixed. The following result, which shows that these methods are bona fide convergence acceleration methods, was given in Ref. 17. The technique of the proof is based on that developed by Sidi et al. (14).
Theorem 7.1. Assume that the vectors x_m satisfy

x_m = s + Σ_{i=1}^p v_i λ_i^m,  p ≤ N,  (38)

where the vectors v_i are linearly independent, and the scalars λ_i are distinct and nonzero with λ_i ≠ 1, ordered as in

|λ_1| ≥ |λ_2| ≥ ... ≥ |λ_p|.  (39)

If also

|λ_k| > |λ_{k+1}|,  (40)

then the following hold:

1. s_{n,k}^{MPE} exists for all large n, and s_{n,k}^{RRE} exists unconditionally for all n.

2. Both for MPE and RRE, there holds

   s_{n,k} - s = [C_{n,k} + o(1)] λ_{k+1}^n = O(λ_{k+1}^n) as n → ∞,  (41)

   where the vectors C_{n,k} are bounded in n, that is, sup_n ||C_{n,k}|| < ∞, and are the same for MPE and RRE. Thus, the errors s_{n,k} - s for MPE and RRE are asymptotically equal as n → ∞ as well.

3. In addition, the limits lim_{n→∞} γ_i^{(n,k)}, where γ_i^{(n,k)} stands for γ_i, all exist, and

   lim_{n→∞} Σ_{i=0}^k γ_i^{(n,k)} λ^i = Π_{i=1}^k (λ - λ_i)/(1 - λ_i).  (42)

4. Finally, for k = p, we have the so-called ''finite termination property,'' that is,

   s_{n,p} = s  and  Σ_{i=0}^p γ_i^{(n,p)} λ^i = Π_{i=1}^p (λ - λ_i)/(1 - λ_i).  (43)
It can be shown that the finite termination property in part (4) of Theorem 7.1 is nothing but the exactness result we stated in Theorem 3.6.
Note that when the x_m are generated by the iterative procedure in Equation (7), where I - T is nonsingular and T is diagonalizable, they are precisely as described in this theorem, with the λ_i being some or all of the distinct nonzero eigenvalues of T and the v_i being eigenvectors corresponding to these eigenvalues. If (μ_i, z_i), i = 1, ..., N, are the eigenpairs of T, that is, T z_i = μ_i z_i for every i, then x_0 - s = Σ_{i=1}^N a_i z_i for some scalars a_i, since the z_i are linearly independent and span C^N. Therefore, by Equations (9) and (13),

x_m - s = T^m (x_0 - s) = Σ_{i=1}^N a_i μ_i^m z_i.
Then, for m ≥ 1, zero eigenvalues of T have zero contribution to this summation. In addition, the contributions of equal nonzero eigenvalues can be lumped together to form one single term that is again an eigenvector of T. Renaming the distinct eigenvalues that contribute to this sum λ_i, we obtain precisely the structure of x_m in the theorem. In addition, note that p is the degree of the minimal polynomial of T with respect to x_m - s for all m ≥ 1 in this case. This is also borne out by Equation (43).
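A small numerical experiment (hypothetical, with a definition-based MPE solved by least squares) illustrates the error behavior of Theorem 7.1: for a diagonal T with eigenvalues 0.9, 0.8, 0.6, 0.4, 0.2 and k = 2, the error in s_{n,2} should decay roughly like λ_3^n = 0.6^n as n grows.

```python
import numpy as np

T = np.diag([0.9, 0.8, 0.6, 0.4, 0.2])        # distinct eigenvalues lambda_i
b = np.array([1.0, -1.0, 2.0, 0.5, -0.5])
s = np.linalg.solve(np.eye(5) - T, b)          # solution of (I - T)x = b

def mpe(X):
    """Definition-based MPE from columns X = [x_n, ..., x_{n+k+1}]."""
    U = np.diff(X, axis=1)
    k = U.shape[1] - 1
    c, *_ = np.linalg.lstsq(U[:, :k], -U[:, k], rcond=None)  # least squares for c'
    c = np.append(c, 1.0)                      # c_k = 1
    return X[:, : k + 1] @ (c / c.sum())       # s_{n,k} = sum of gamma_i x_{n+i}

xs, x = [np.zeros(5)], np.zeros(5)
for _ in range(16):                            # x_{m+1} = T x_m + b
    x = T @ x + b
    xs.append(x)
X = np.column_stack(xs)
err = lambda n: np.linalg.norm(mpe(X[:, n:n + 4]) - s)   # k = 2 uses 4 vectors
# err(n + 4) / err(n) should be roughly 0.6**4, i.e., about 0.13, for moderate n
```

The observed ratio approaches 0.6^4 as n grows, while the unextrapolated errors shrink only like 0.9^n, in line with parts (1) and (2) of the theorem.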
It is also clear from Theorem 7.1 that the sequence {x_m} there need not result from a nonsingular linear system (I - T)x = b; it can result from some other type of problem as well.
In case the condition |λ_k| > |λ_{k+1}| in Theorem 7.1 does not hold [then |λ_k| = |λ_{k+1}| necessarily, since |λ_k| ≥ |λ_{k+1}| by the ordering of the λ_i in Equation (39)], we need to modify the statement of this theorem. This has been done in Ref. 23, and we state a special case of it next.
Theorem 7.2. Assume that the vectors x_m have been generated by the iterative procedure x_{m+1} = T x_m + b, T being diagonalizable and I - T being nonsingular. Denote the solution of (I - T)x = b by s. Let us order the nonzero distinct eigenvalues λ_i of T that contribute to x_m - s via Equation (38) as in Equation (39). If |λ_k| = |λ_{k+1}|, then s_{n,k} exist and are unique for all n and k, and there holds

s_{n,k} - s = O(λ_k^n) = O(λ_{k+1}^n) as n → ∞,  (44)

(1) for RRE unconditionally and (2) for MPE provided the hermitian part⁸ of the matrix αA, where α is some appropriate nonzero scalar in C and A = I - T, is positive definite.
One conclusion that can be drawn from Theorem 7.2 is that if

|λ_r| > |λ_{r+1}| = ... = |λ_{r+ν}| > |λ_{r+ν+1}| ≥ ...,

then there holds

s_{n,k} - s = O(λ_{r+1}^n) as n → ∞, for r + 1 ≤ k ≤ r + ν - 1.⁹

Thus, if

|λ_1| = ... = |λ_ν| > |λ_{ν+1}| ≥ ...,

then

s_{n,k} - s = O(λ_1^n) as n → ∞, for 1 ≤ k ≤ ν - 1,

while

s_{n,ν} - s = O(λ_{ν+1}^n) as n → ∞.
We would like to emphasize here that the results of Theorems 7.1 and 7.2 are not the same. For example, in Theorem 7.1, we have that s_{n,k}^{MPE} - s and s_{n,k}^{RRE} - s are asymptotically equal as n → ∞, whereas no such result holds in Theorem 7.2.
Concerning the results of Theorems 7.1 and 7.2, we can offer the following interpretation: Starting from Equation (38), we first observe that the contribution of the small λ_i is diminished relative to that of the large ones by letting n grow. The extrapolation procedure then takes care of the contribution from the k largest λ_i. What remains is the contribution from the intermediate λ_i, starting with λ_{k+1}.
In connection with Theorems 7.1 and 7.2, we note that we have not assumed that lim_{m→∞} x_m exists. Clearly, lim_{m→∞} x_m exists and is equal to s provided |λ_1| < 1. In case lim_{m→∞} x_m does not exist, we have that |λ_1| ≥ 1 necessarily; in this case, s is the antilimit of {x_m}. Now, if we use the x_m, as they are, to approximate s, we have the error

e_m = x_m - s = O(λ_1^m) as m → ∞.

Thus, provided |λ_{k+1}| < 1, we have that lim_{n→∞} s_{n,k} = s, whether or not lim_{m→∞} x_m exists. Furthermore, when the sequence {x_m} converges (to s), s_{n,k} converges (to s) faster, provided |λ_{k+1}| < |λ_1|, which clearly holds in Theorem 7.1. This means that MPE and RRE accelerate the convergence of the sequence {x_m}.
For more general versions of Theorems 7.1 and 7.2 involving nondiagonalizable matrices T, see Refs. 22 and 23.
Error Analysis for s_{n,k} with Fixed n and k
So far, we have reviewed the convergence properties of the approximations s_{n,k} for n → ∞ while k is held fixed. Of course, n cannot be increased indefinitely in practice. Similarly, k cannot be increased too much because of storage limitations [recall from the algorithms of the sixth section that we need to store the N × (k+1) matrix Q_k]. For these reasons, MPE and RRE are applied in the so-called cycling mode (to be discussed later, in the eighth section), where n and k can be chosen moderately large but are both kept fixed. Bounds on the error in s_{n,k}^{MPE} and s_{n,k}^{RRE}, as functions of n and k, have been presented by Sidi and Shapira (24, 25).

The next result from Refs. 24 and 25 provides an error bound on s_{n,k}^{RRE} for the case in which both n and k are kept fixed. This theorem actually gives the justification for cycling with s_{n,k} with even moderate positive n rather than n = 0. An essentially identical result for s_{n,k}^{MPE}, under the condition that the matrix I - T has a positive definite hermitian part (cf. Theorem 7.2), is given in Ref. 24.
⁸The hermitian part of a matrix K ∈ C^{s×s} is the matrix K_H = (1/2)(K + K*).

⁹Recall that s_{n,r} - s = O(λ_{r+1}^n) as n → ∞, and that s_{n,r+ν} - s = O(λ_{r+ν+1}^n) as n → ∞, already as described in Theorem 7.1.

Theorem 7.3. Let s be the solution to the linear system (I - T)x = b, where I - T is nonsingular. Assume that T is diagonalizable, that is,

T = RΛR^{-1},  Λ = diag(μ_1, μ_2, ..., μ_N),  (45)

and let the vector sequence {x_m} be generated via x_{m+1} = T x_m + b, m = 0, 1, .... Define the residual vector associated with the vector x by r(x) = b - (I - T)x. Let also P_k = {p(z) : p ∈ Π_k, p(1) = 1}, Π_k being the set of polynomials of degree at most k. Then

||r(s_{n,k}^{RRE})|| = min_{p∈P_k} ||T^n p(T) r(x_0)||
  ≤ (min_{p∈P_k} ||T^n p(T)||) ||r(x_0)||
  ≤ ||T^n p(T)|| ||r(x_0)||  for every p ∈ P_k
  ≤ κ(R) Γ_{n,k}^{spect(T)} ||r(x_0)||,  (46)

where ||x|| = √(x*x) as before, κ(R) = ||R|| ||R^{-1}|| stands for the condition number of R, spect(T) stands for the spectrum of T, and, for any set D in the complex plane,

Γ_{n,k}^D = min_{p∈P_k} max_{z∈D} |z^n p(z)|.  (47)
Remark. The paper (25) actually treats GMRES(k), the restarted GMRES, with n initial iterations of Richardson type, which is denoted there GMRES(n,k). Because RRE and GMRES are equivalent in the sense described in Theorem 5.1 in the fifth section, the results of Ref. 25 can be expressed as in Theorem 7.3 without any changes.
Now, in case spect(T) ⊆ D, we have that Γ_{n,k}^{spect(T)} ≤ Γ_{n,k}^D. This fact has been used in Ref. 24 to derive upper and lower bounds on Γ_{n,k}^{spect(T)} that can be expressed analytically in terms of suitable sets of orthogonal polynomials. For some special types of spectra, both bounds can be expressed analytically in terms of the Jacobi polynomials P_k^{(α,β)}(z) and can easily be computed numerically. In addition, these bounds are tight. (See the tables in Refs. 24 and 25.) Below, by [u, v] we mean the straight line segment between the (complex) numbers u and v in the z-plane. The Jacobi polynomials P_k^{(α,β)}(z) are normalized such that P_k^{(α,β)}(1) equals the binomial coefficient (k+α choose k); see Abramowitz and Stegun (26), for example.
1. If spect(T) is contained in [0, b] for some possibly complex b (1 is not in [0, b]), then

   Γ_{n,k}^{spect(T)} ≤ |b|^n / |P_k^{(0,2n)}(2/b - 1)| = |b|^{n+k} / |Σ_{j=0}^k (k choose j)(2n+k choose j)(1 - b)^j|.

   This bound is a decreasing function of both n and k when b ≠ 0 is real and b < 1. In particular, its decrease as a function of k becomes faster with increasing n. Clearly, this bound decreases very quickly in case the spectrum of T is real nonpositive, that is, b < 0.
2. If spect(T) is contained in [-b, b] for some possibly complex b (1 is not in [-b, b]), then

   Γ_{n,2ν}^{spect(T)} ≤ |b|^n / |P_ν^{(0,n)}(2/b² - 1)|,
   Γ_{n,2ν+1}^{spect(T)} ≤ |b|^{n+1} / |P_ν^{(0,n+1)}(2/b² - 1)|.

   Both for even and odd values of k, these upper bounds can be unified to read

   Γ_{n,k}^{spect(T)} ≤ |b|^{n+k} / |Σ_{j=0}^ν (ν choose j)(n+μ choose j)(1 - b²)^j|,

   where

   ν = ⌊k/2⌋,  μ = ⌊(k+1)/2⌋.

   This unified bound is a decreasing function of both n and k when b is real and |b| < 1. The same holds in case b is purely imaginary and |b| < 1, which happens when T is a skew-hermitian matrix, for example. In this case, the upper bound on Γ_{n,k}^{spect(T)} tends to zero monotonically as a function of k, its rate of decrease becoming larger with increasing n, at the best possible rate, because now

   Γ_{n,k}^{spect(T)} ≤ |b|^{n+k} / Σ_{j=0}^ν (ν choose j)(n+μ choose j)(1 + |b|²)^j.
When comparing two approximations s_{n,k}^{RRE} and s_{n',k}^{RRE}, what seems to determine which one is more accurate is the corresponding upper bounds for Γ_{n,k}^{spect(T)} and Γ_{n',k}^{spect(T)}. The examples we have given here suggest that s_{n,k}^{RRE} is the better one if n > n'. In particular, s_{n,k}^{RRE} with n > 0 is likely to be better than s_{0,k}^{RRE} [equivalently, GMRES(n,k) with n > 0 is better than GMRES(k)]. We can make use of this observation in applying RRE (and MPE as well) in the cycling mode; this will be discussed in the eighth section.
Numerical Stability of MPE and RRE
An important issue related to the application of extrapolation methods in finite-precision arithmetic is that of numerical stability. This issue concerns the propagation of input errors into the output. In our case, this means the propagation of errors in the x_m into s_{n,k}. A practical measure of this is the quantity

Γ_{n,k} = Σ_{i=0}^k |γ_i^{(n,k)}|.  (48)

Clearly, Γ_{n,k} ≥ 1 because Σ_{i=0}^k γ_i^{(n,k)} = 1.
This assertion can be justified heuristically as follows: Let us assume that η_m is the error committed in computing x_m (which can be the result of roundoff, for example). Thus, x̄_m = x_m + η_m is the computed x_m. Assuming that the γ_i^{(n,k)} are not affected by these errors, s̄_{n,k}, the ''computed'' s_{n,k}, is given by

s̄_{n,k} = Σ_{i=0}^k γ_i^{(n,k)} x̄_{n+i} = Σ_{i=0}^k γ_i^{(n,k)} (x_{n+i} + η_{n+i}) = s_{n,k} + Σ_{i=0}^k γ_i^{(n,k)} η_{n+i}.  (49)

Therefore,

||s̄_{n,k} - s_{n,k}|| = ||Σ_{i=0}^k γ_i^{(n,k)} η_{n+i}|| ≤ Σ_{i=0}^k |γ_i^{(n,k)}| ||η_{n+i}|| ≤ Γ_{n,k} (max_{0≤i≤k} ||η_{n+i}||).  (50)

Let us now assume that the relative errors in the computed x_m are bounded by some fixed positive scalar β; that is, ||η_m|| ≤ β||x_m||. Let us assume, in addition, that {x_m} converges, so that max_{0≤i≤k} ||x_{n+i}|| is bounded. Then

||s̄_{n,k} - s_{n,k}|| ≤ βΓ_{n,k} (max_{0≤i≤k} ||x_{n+i}||).  (51)

Now, β and max_{0≤i≤k} ||x_{n+i}|| are independent of the extrapolation method, hence are fixed. Because {x_m} converges, max_{0≤i≤k} ||x_{n+i}|| is of the order of ||s|| and ||s_{n,k}||. Therefore, for all practical purposes, βΓ_{n,k} is the relative error in s̄_{n,k}. Suppose that the x_m have been computed to machine accuracy. This implies that β is the rounding unit of the floating-point arithmetic being used; let β be of order 10^{-p} for some p > 0. If now Γ_{n,k} is of order 10^q for some q > 0, then s̄_{n,k} agrees with s_{n,k} up to p - q decimal digits. In other words, q decimal digits have been lost in the course of computation.
In view of this discussion, we say that MPE or RRE is stable if

sup_n Γ_{n,k} < ∞.  (52)
Let us now assume that the sequence {x_m} is as in Theorem 7.1. By part (3) of this theorem, the γ_i^{(n,k)} satisfy Equation (42), and hence lim_{n→∞} Γ_{n,k} exists. In addition, by Theorem 1.4.3 in Sidi (27), Γ_{n,k} also satisfies

lim_{n→∞} Γ_{n,k} ≤ Π_{i=1}^k (1 + |λ_i|)/|1 - λ_i| < ∞.  (53)
Thus both MPE and RRE are stable under the conditions of Theorem 7.1. Equality holds in (53) when λ_1, ..., λ_k are all positive or all negative; in addition, when they are all negative, lim_{n→∞} Γ_{n,k} = 1, which is the most ideal case. When one or more of these λ_i are very close to 1 in the complex plane, Γ_{n,k} becomes very large, and this limits the accuracy that can be reached by s_{n,k}. This suggests that we should aim at iterative methods for which the largest eigenvalues are not too close to 1. In the absence of such an iterative scheme, we can apply MPE and RRE to the sequence {x_{rm}} with some integer r > 1. By Equation (38),

x_{rm} = s + Σ_{i=1}^p v_i (λ_i^r)^m.  (54)

Note that, if |λ_i| < 1 but λ_i is close to 1, then |1 - λ_i^r| > |1 - λ_i| for r > 1. Thus MPE and RRE are likely to produce better accuracy when applied to {x_{rm}} with some integer r > 1 than when applied to {x_m}. For example, the approximations s_{n,k} produced by applying MPE and RRE to the sequence {x_{rm}}, under the conditions of Theorem 7.1 and provided the λ_i^r are distinct, are such that
s_{n,k} - s = O(λ_{k+1}^{rn}) as n → ∞  (55)

and

lim_{n→∞} Σ_{i=0}^k γ_i^{(n,k)} σ^i = Π_{i=1}^k (σ - σ_i)/(1 - σ_i),  lim_{n→∞} Γ_{n,k} ≤ Π_{i=1}^k (1 + |σ_i|)/|1 - σ_i|,  σ_i = λ_i^r, i = 1, 2, ....  (56)
From Equations (55) and (56), it is clear that better stability and accuracy can be obtained using the same storage (the same k) with r > 1. We can also maintain a given level of accuracy by increasing r and decreasing k simultaneously.
MPE AND RRE VIA CYCLING
As we have already discussed, one important issue that we confront when applying vector extrapolation methods to large-scale problems is that of storage. In the algorithms presented in the sixth section, for example, we need to store the matrix Q_k, which forms the bulk of the stored quantities. Using the same storage, we can obtain approximations of better accuracy by applying the extrapolation methods to the sequence {x_{rm}} instead of {x_m}, as was pointed out at the end of the preceding section.
In this section, we discuss the strategy known as cycling or restarting, which we alluded to above, when MPE and RRE are applied to the solution of the system x = f(x) in conjunction with the iterative scheme x_{m+1} = f(x_m), m = 0, 1, .... In this strategy, n and k are held fixed. Here are the steps of cycling:
C0. Choose integers n ≥ 0, k ≥ 1, and r ≥ 1, and an initial vector x_0.

C1. Compute the vectors x_1, x_2, ..., x_{r(n+k+1)} [via x_{m+1} = f(x_m)], and save

    y_n, y_{n+1}, ..., y_{n+k}, y_{n+k+1};  y_i = x_{ri}, i = 0, 1, ....

C2. Apply MPE or RRE to the vectors y_n, y_{n+1}, ..., y_{n+k+1} precisely as in the sixth section, with end result s_{n,k}.

C3. If s_{n,k} satisfies the accuracy test, stop. Otherwise, set x_0 = s_{n,k}, and go to Step C1.
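Steps C0–C3 can be sketched as follows. This is an illustration with a hypothetical contraction f chosen only for the example, and the inner extrapolation is a definition-based MPE solved by least squares rather than the QR algorithm of the sixth section.

```python
import numpy as np

def f(x):
    """Hypothetical fixed-point map for illustration: x = 0.5 cos(x) componentwise."""
    return 0.5 * np.cos(x)

def mpe(X):
    """Definition-based MPE from columns X = [y_n, ..., y_{n+k+1}]."""
    U = np.diff(X, axis=1)
    k = U.shape[1] - 1
    c, *_ = np.linalg.lstsq(U[:, :k], -U[:, k], rcond=None)
    c = np.append(c, 1.0)
    return X[:, : k + 1] @ (c / c.sum())

def cycle(x0, n=1, k=3, r=1, tol=1e-10, max_cycles=20):
    x0 = np.asarray(x0, dtype=float)
    for _ in range(max_cycles):
        if np.linalg.norm(f(x0) - x0) < tol:  # Step C3: accuracy test
            return x0
        xs, x = [x0], x0
        for _ in range(r * (n + k + 1)):      # Step C1: generate the iterates
            x = f(x)
            xs.append(x)
        Y = np.column_stack(xs[::r])          # keep y_i = x_{ri}
        x0 = mpe(Y[:, n : n + k + 2])         # Step C2: extrapolate, restart from s_{n,k}
    return x0
```

Each pass through the loop is one cycle; restarting from s_{n,k} rather than from the last iterate is what gives cycling its fast linear convergence in practice.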
We will call each application of Steps C1–C3 a cycle. We will also denote the s_{n,k} that is computed in the ith cycle s_{n,k}^{(i)}.
The analysis of MPE and RRE, as these are being applied to nonlinear systems x = f(x) in the cycling mode with n = 0 and r = 1, has been considered in Skelboe (28). As has been shown in Ref. 1, the results of this work are only heuristic, however. What these say is that, when k in the ith cycle is chosen to be the degree k_i of the minimal polynomial of F(s_{0,k_{i-1}}^{(i-1)}) [recall that F(x) is the Jacobian matrix of f(x) at x] with respect to s_{0,k_{i-1}}^{(i-1)} - s, the sequence {s_{n,k_i}^{(i)}} converges to s quadratically. However, we must recall that, since k_i can be as large as N and is not known exactly, this usage of cycling is not useful practically for the large-scale problems we are interested in solving. In other words, trying to achieve quadratic convergence from MPE and RRE via cycling is not realistic. We may achieve linear but fast convergence by choosing even moderate values for n and k with cycling.
APPLICATIONS
Solution of Large-Scale Nonlinear Systems
As we mentioned earlier, one of the main uses of vector extrapolation methods has been in the solution of large-scale systems of nonlinear equations c(x) = 0 with solution s, especially those that arise from the discretization of nonlinear continuum problems, via fixed-point iterative procedures of the form x_{m+1} = f(x_m). In this connection, we would like to emphasize that vector extrapolation methods, in the cycling mode, are effective when the iteration procedure x_{m+1} = f(x_m) chosen is convergent and reasonably good, in the sense that only a few of the eigenvalues of F(s), the Jacobian matrix of f(x) at x = s, are near 1 in the complex plane. Vector extrapolation methods do not perform that well when most eigenvalues of F(s) cluster near 1 in the complex plane. Thus, the iterative method should be chosen such that the eigenvalues of F(s) are scattered in the interior of the unit disk, with very few of them near 1.
As discussed in the preceding section, the simplest way of achieving this goal is by applying the extrapolation methods to the subsequence {x_{rm}} with some integer r ≥ 2. Another simple way is by defining the iteration vectors via

x_{m+1} = x_m + ω[f(x_m) - x_m] = (1 - ω)x_m + ω f(x_m),  m = 0, 1, ...,

for some suitable scalar ω. See, for example, Ref. 20 (Section 7).
In applying MPE and RRE to finite-difference solutions of elliptic PDEs, such as the two-dimensional Poisson equation ∂²u/∂x² + ∂²u/∂y² = f(x, y), a good iterative method would be the ADI (alternating direction implicit) method. For this method, see Stoer and Bulirsch (29), Morton and Mayers (30), or Ames (31), for example.
Vector-Valued Rational Approximations
Given the vector-valued power series Σ_{i=0}^∞ u_i z^i, where z is a complex variable and the u_i are constant vectors in C^N, representing a vector-valued function u(z) about z = 0, we can use vector extrapolation methods to approximate u(z) via a vector-valued rational approximation obtained from the power series coefficients u_i.
For this, we apply the extrapolation methods to the sequence {x_m(z)}, where

x_m(z) = Σ_{i=0}^m u_i z^i,  m = 0, 1, ....
The approximations obtained this way are rational functions, whose numerators are vector-valued polynomials and whose denominators are scalar-valued polynomials.

This topic is dealt with in detail in the paper Sidi (10). We give a brief informal description of the subject, in the context of MPE, next.
When MPE is used for this purpose, we obtain the approximations s_{n,k}(z) (now called SMPE approximations) that are given as in

s_{n,k}(z) = D(z^k x_n(z), z^{k-1} x_{n+1}(z), ..., z^0 x_{n+k}(z)) / D(z^k, z^{k-1}, ..., z^0),  (57)

where D(v_0, v_1, ..., v_k) is a (k+1) × (k+1) determinant defined as in

D(v_0, v_1, ..., v_k) = det [ v_0        v_1        ...  v_k
                              u_{0,0}    u_{0,1}    ...  u_{0,k}
                              u_{1,0}    u_{1,1}    ...  u_{1,k}
                              ...        ...             ...
                              u_{k-1,0}  u_{k-1,1}  ...  u_{k-1,k} ],

u_{i,j} = (u_{n+i}, u_{n+j}).  (58)
Thus, s_{n,k}(z) is also of the form

s_{n,k}(z) = Σ_{i=0}^k c_i z^{k-i} x_{n+i}(z) / Σ_{i=0}^k c_i z^{k-i},  (59)

with appropriate scalar constants c_i. Note that the numerator polynomial has degree at most n + k, whereas the denominator polynomial has degree k when the cofactor of v_0 in D(v_0, v_1, ..., v_k) is nonzero.
The sequence {s_{n,k}(z)}_{n=0}^∞ (with fixed k) has a very nice convergence property, which we discuss briefly. Suppose that u(z) is analytic in an open disc D_r = {z ∈ C : |z| < r} and meromorphic¹⁰ in a larger open disc D_R = {z ∈ C : |z| < R}. This implies that the series Σ_{i=0}^∞ u_i z^i converges to u(z) only for |z| < r, and diverges for |z| ≥ r. Under some additional condition that has to do with the Laurent expansions of u(z) about its poles, the rational approximations s_{n,k}(z) obtained from the series Σ_{i=0}^∞ u_i z^i have the property that, if k is equal to the number of the poles in D_R, then {s_{n,k}(z)}_{n=0}^∞ converges to u(z) uniformly in every compact subset of D_R excluding the poles of u(z). In addition, the poles of s_{n,k}(z) tend to the poles of u(z) as n → ∞.

¹⁰A function u(z) is said to be meromorphic in a domain D of the complex z-plane if its only singularities in D are poles.
As an example, let us consider the function u(z) = (I - zA)^{-1} b, where A ∈ C^{N×N} is a constant matrix and b ∈ C^N is a constant vector. This function has the series representation

u(z) = Σ_{i=0}^∞ u_i z^i,  u_i = A^i b,  i = 0, 1, ....

This series converges for |z| < 1/ρ(A), where, we recall, ρ(A) is the spectral radius of A. In this case, u(z) is a vector-valued meromorphic function (actually, a rational function) with poles equal to the reciprocals of the eigenvalues of A, the residues being related to corresponding eigenvectors and principal vectors. The poles of s_{n,k}(z) turn out to be the so-called Ritz values that result from the method of Arnoldi for eigenvalues. For details on precise convergence properties and rates of convergence, see Sidi (11).
One of the uses of these rational approximations has been the summation of a perturbation series resulting from ODEs describing some nonlinear oscillations. See, for example, Wu and Zhong (32). In that paper, the space we are working in is infinite dimensional, and the definitions of MPE and RRE remain unchanged, as mentioned in the fourth section on determinant representations of MPE and RRE.
Computation of Eigenvectors of Known Eigenvalues
An immediate application of vector extrapolation in general, and of MPE and RRE in particular, is to the computation of the eigenvector that corresponds to the largest eigenvalue of a large and sparse matrix when this eigenvalue is nondefective, that is, it has only corresponding eigenvectors but no corresponding principal vectors. Let A ∈ C^{N×N} have eigenvalues μ_1, ..., μ_N, which we order as in

|μ_1| > |μ_2| ≥ |μ_3| ≥ ... ≥ |μ_N|.
Assume that μ_1 is known and a corresponding (unknown) eigenvector is required. To solve this problem, we choose an initial vector x_0 and perform the power iterations

x_{m+1} = μ_1^{-1} A x_m,  m = 0, 1, ....
It is easy to show [see, for example, Sidi (12, 33)] that, provided x_0 has a contribution from the eigenvalue μ_1, the vectors x_m are precisely as in Theorem 7.1, with s there being an eigenvector of A that corresponds to μ_1 and the λ_i there being some or all of the ratios μ_{i+1}/μ_1. (Clearly, |λ_i| < 1 for all i, hence {x_m} converges to the required eigenvector.) Thus, MPE and RRE can be applied to the sequence of power iterations {x_m} to accelerate its convergence. Indeed, both methods apply exactly as described in Theorem 7.1. In addition, they can be applied in the cycling mode, as described in the previous section. Vector extrapolation methods are attractive for this application because their computational cost is low: the sparseness of the matrix involved enables us to compute the sequence {x_m} via x_{m+1} = μ_1^{-1} A x_m inexpensively.

One recent application of this has been to the computation of the PageRank of the Google matrix. The PageRank is the (unique) eigenvector of the Google matrix that corresponds to the largest eigenvalue μ_1 = 1, which is also known to be simple. It is positive and is normalized such that the sum of its components is 1. [See Brin and Page (34).] This problem was treated by Kamvar et al. (35) using a method the authors denoted quadratic extrapolation. This method has been generalized in (12, 33); quadratic extrapolation turns out to be the k = 2 case of this generalization. In addition, this generalization turns out to be closely related to MPE, in the sense that if s_{n,k} and s̃_{n,k} are the vectors obtained by applying MPE and the new method, respectively, to the sequence of power iterations obtained from the Google matrix A, then s̃_{n,k} = A s_{n,k}. In this application too, the computation of the sequence {x_m} via x_{m+1} = A x_m can be achieved inexpensively because of the sparseness of the Google matrix. The use of vector extrapolation methods in general, and MPE and RRE in particular, in the cycling mode with arbitrary n, k, and r for computing the PageRank was first suggested in Ref. 12. This approach is illustrated with numerical examples in Ref. 33. The results obtained in Ref. 33 show clearly that this approach to PageRank computation is very effective.
BRIEF SUMMARY OF MMPE
We now present a very brief discussion of the modified minimal polynomial extrapolation (MMPE), which is based on Sidi et al. (14). Without further explanation, we will be using the notation of the third section throughout.
The MMPE approximation s_{n,k} from {x_m}, namely,

s_{n,k}^{MMPE} = Σ_{i=0}^k γ_i x_{n+i},

is essentially defined by requiring that the γ_j be the solution to the linear system

(q_i, U_k^{(n)} γ) = 0,  i = 0, 1, ..., k-1;  Σ_{j=0}^k γ_j = 1,  (60)

where q_0, q_1, ..., q_{k-1} are k fixed linearly independent vectors in C^N.
A determinant representation for s_{n,k}^{MMPE} has been given in Ref. 14, this representation being exactly of the form shown in Equations (28) and (29) of Theorem 4.1, with u_{i,j} = (q_i, u_{n+j}) in Equation (26).
The convergence theory of the sequences {s_{n,k}^{MMPE}}_{n=0}^∞ under the conditions of Theorem 7.1 has been given in Ref. 14. This theory shows that, for all large n, s_{n,k}^{MMPE} exists uniquely [part (1) of Theorem 7.1], and that the rest of the results of Theorem 7.1 [parts (2)–(4)] are valid for MMPE without any changes, provided

det [ (q_0, v_1)      (q_0, v_2)      ...  (q_0, v_k)
      (q_1, v_1)      (q_1, v_2)      ...  (q_1, v_k)
      ...             ...                  ...
      (q_{k-1}, v_1)  (q_{k-1}, v_2)  ...  (q_{k-1}, v_k) ] ≠ 0.  (61)
A more complete theory, under conditions on {x_m} more general than those in Theorem 7.1, has been presented in Sidi and Bridger (22).
As we have observed, the vectors q_i that enter the construction of MMPE are fixed. It is possible to replace these vectors by vectors q_i^{(n)} that depend on n. We then get what the author has called generalized MPE (GMPE) in Ref. 19. Obviously, GMPE contains MPE, RRE, and MMPE, with q_i^{(n)} = u_{n+i}, q_i^{(n)} = w_{n+i}, and q_i^{(n)} = q_i, respectively. Then,

s_{n,k}^{GMPE} = Σ_{i=0}^k γ_i x_{n+i},

with the γ_j defined as the solution to

(q_i^{(n)}, U_k^{(n)} γ) = 0,  i = 0, 1, ..., k-1;  Σ_{j=0}^k γ_j = 1.

Of course, a determinant representation for s_{n,k}^{GMPE} exists and is given as in Equations (28) and (29) of Theorem 4.1, with u_{i,j} = (q_i^{(n)}, u_{n+j}) in Equation (26).
BRIEF SUMMARY OF EPSILON ALGORITHMS
Scalar Epsilon Algorithm
Let the scalar sequence {x_m} be such that

$$x_m \sim s + \sum_{i=1}^{\infty} a_i \lambda_i^m \quad \text{as } m \to \infty, \qquad (62)$$

where the a_i are nonzero, the λ_i are distinct, nonzero, and λ_i ≠ 1, such that

$$|\lambda_1| \geq |\lambda_2| \geq \cdots; \qquad \lim_{i \to \infty} \lambda_i = 0. \qquad (63)$$
Here s is lim_{m→∞} x_m when the latter exists; it is the antilimit of {x_m} otherwise. Of course, the sequence converges provided |λ_i| < 1. To accelerate the convergence of the sequence {x_m}, Shanks (36) developed a nonlinear transformation, by which we define

$$e_k(x_n) = \frac{E(x_n, x_{n+1}, \ldots, x_{n+k})}{E(1, 1, \ldots, 1)}, \qquad (64)$$
where

$$E(v_0, v_1, \ldots, v_k) =
\begin{vmatrix}
v_0 & v_1 & \cdots & v_k \\
u_n & u_{n+1} & \cdots & u_{n+k} \\
u_{n+1} & u_{n+2} & \cdots & u_{n+k+1} \\
\vdots & \vdots & & \vdots \\
u_{n+k-1} & u_{n+k} & \cdots & u_{n+2k-1}
\end{vmatrix}, \qquad u_i = x_{i+1} - x_i. \qquad (65)$$
Here, the e_k(x_n) are approximations to s. In case Equation (62) assumes the form

$$x_m = s + \sum_{i=1}^{k} a_i \lambda_i^m, \quad m = 0, 1, \ldots,$$

for some finite k, we have e_k(x_n) = s for all n. This is the exactness result for the Shanks transformation.
This transformation was proposed earlier by Schmidt (37) for constructing the solution of linear systems from fixed-point iterations; compare Equation (62) with Equation (38) satisfied by fixed-point iteration vectors generated linearly.
A very elegant and fast algorithm for constructing the e_k(x_n) was developed by Wynn (5), and it reads

$$\epsilon^{(n)}_{-1} = 0, \quad \epsilon^{(n)}_{0} = x_n, \quad n = 0, 1, \ldots,$$
$$\epsilon^{(n)}_{k+1} = \epsilon^{(n+1)}_{k-1} + \frac{1}{\epsilon^{(n+1)}_{k} - \epsilon^{(n)}_{k}}, \quad n, k = 0, 1, \ldots. \qquad (66)$$

Here, ε^{(n)}_{2k} = e_k(x_n) and ε^{(n)}_{2k+1} = 1/e_k(Δx_n). Of course, Δx_i = x_{i+1} − x_i.

A different algorithm, denoted the FS/qd algorithm, that is as efficient as the epsilon algorithm, was recently proposed by Sidi in chapter 21 of Ref. 27. In this algorithm, we first compute two sets of quantities, {e^{(n)}_k} and {q^{(n)}_k}, via the qd algorithm:

$$e^{(n)}_{0} = 0, \quad q^{(n)}_{1} = \frac{u_{n+1}}{u_n}, \quad n = 0, 1, \ldots, \qquad (u_i = x_{i+1} - x_i)$$
$$e^{(n)}_{k} = q^{(n+1)}_{k} - q^{(n)}_{k} + e^{(n+1)}_{k-1}, \qquad q^{(n)}_{k+1} = \frac{e^{(n+1)}_{k}}{e^{(n)}_{k}}\, q^{(n+1)}_{k},$$
$$n = 0, 1, \ldots; \quad k = 1, 2, \ldots. \qquad (67)$$
Once the e^{(n)}_k and q^{(n)}_k have been computed, we compute the e_k(x_n) as follows:

$$M^{(n)}_{0} = \frac{x_n}{u_n}, \quad N^{(n)}_{0} = \frac{1}{u_n}, \quad n = 0, 1, \ldots, \qquad (u_i = x_{i+1} - x_i)$$
$$M^{(n)}_{k} = \frac{M^{(n+1)}_{k-1} - M^{(n)}_{k-1}}{e^{(n)}_{k}}, \qquad N^{(n)}_{k} = \frac{N^{(n+1)}_{k-1} - N^{(n)}_{k-1}}{e^{(n)}_{k}},$$
$$e_k(x_n) = \frac{M^{(n)}_{k}}{N^{(n)}_{k}}, \quad n, k = 0, 1, \ldots. \qquad (68)$$
The convergence analysis of the sequence {e_k(x_n)}_{n=0}^∞ with fixed k, for sequences {x_m} whose members behave as in Equations (62) and (63), with the λ_i all positive or all negative, was given by Wynn (38). Wynn's results were later extended by Sidi (39) to the case of general complex λ_i, and also to sequences {x_m} whose members behave in a more general fashion than that given in Equations (62) and (63); see also chapter 16 in Ref. 27. One of the results of Ref. 39, which is a generalization of Wynn's results, reads as follows:
Theorem 11.1. Provided |λ_k| > |λ_{k+1}| in Equations (62) and (63), there holds

$$e_k(x_n) - s = O(\lambda_{k+1}^n) \quad \text{as } n \to \infty.$$
For a detailed study of the Shanks transformation, seechapter 16 in Ref. 27.
Theorem 11.1 shows that the Shanks transformation is an effective convergence acceleration method when applied to scalar sequences {x_m} that behave as in Equations (62) and (63). Now, the vector sequences {x_m} in Theorems 7.1 and 7.2 behave exactly as in Equations (62) and (63) componentwise. This suggests that the Shanks transformation can be applied to these sequences componentwise to accelerate their convergence. When applied in the context of vector sequences in this way, the Shanks transformation is known as the scalar epsilon algorithm (SEA). Note that SEA does not couple the different components of the vectors x_m.
Vector Epsilon Algorithm
Another interesting vector method to accelerate the convergence of {x_m} that is based on the epsilon algorithm itself was proposed by Wynn (6). This method is known as the vector epsilon algorithm (VEA), and is defined as follows:

$$\epsilon^{(n)}_{-1} = 0, \quad \epsilon^{(n)}_{0} = x_n, \quad n = 0, 1, \ldots,$$
$$\epsilon^{(n)}_{k+1} = \epsilon^{(n+1)}_{k-1} + \left(\epsilon^{(n+1)}_{k} - \epsilon^{(n)}_{k}\right)^{-1}, \quad n, k = 0, 1, \ldots. \qquad (69)$$

Here, x^{−1} is the Samelson inverse of the vector x, given by x^{−1} = x̄/(x̄ · x) = x̄/‖x‖², where x̄ is simply the complex conjugate of x. With VEA, we take the ε^{(n)}_{2k} to be approximations to the limit or antilimit of {x_m}. Clearly, unlike SEA, VEA couples all the components of the vectors x_m.
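A compact implementation of Equation (69) might look as follows (Python with NumPy). The linear test iteration is invented illustration data, chosen so that the exactness result for linearly generated sequences applies and ε_8^{(0)} reproduces the fixed point:

```python
import numpy as np

def samelson_inverse(x):
    """Samelson inverse: x^{-1} = conj(x) / (conj(x) . x) = conj(x) / ||x||^2."""
    xc = np.conj(x)
    return xc / np.dot(xc, x)

def vea(X):
    """Vector epsilon algorithm, Equation (69); returns eps_{2k}^{(0)}."""
    prev = [np.zeros_like(X[0]) for _ in X]        # eps_{-1}^{(n)} = 0
    curr = [np.array(v, dtype=float) for v in X]   # eps_0^{(n)}  = x_n
    best, k = curr[0], 0
    while len(curr) > 1:
        nxt = [prev[n + 1] + samelson_inverse(curr[n + 1] - curr[n])
               for n in range(len(curr) - 1)]
        prev, curr = curr, nxt
        k += 1
        if k % 2 == 0:                             # even columns approximate s
            best = curr[0]
    return best

# Linear iteration x_{m+1} = T x_m + b (illustrative data); fixed point s.
T = np.array([[0.7, 0.1, 0.0, 0.0],
              [0.0, 0.5, 0.1, 0.0],
              [0.0, 0.0, -0.4, 0.1],
              [0.0, 0.0, 0.0, 0.2]])
b = np.ones(4)
s = np.linalg.solve(np.eye(4) - T, b)

x = np.zeros(4)
X = [x]
for _ in range(8):                 # 2k + 1 = 9 vectors give eps_8^{(0)}, k = 4
    x = T @ x + b
    X.append(x)

s_vea = vea(X)                     # equals s up to roundoff (McLeod's theorem)
```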
The properties of VEA have been studied extensively via the so-called vector-valued Padé approximants: Graves-Morris (40, 41) has shown that, under the conditions of Theorem 3.6, the VEA approximations from the sequence {x_m} satisfy ε^{(m)}_{2k} = s, m = n, n+1, …. This result is known as McLeod's theorem. It was conjectured by Wynn (42), and was first proved by McLeod (43) with the c_i restricted to be real numbers; see Ref. 1, page 211.
A determinantal formula for the ε^{(n)}_{2k} has been given by Graves-Morris and Jenkins (44). This formula shows that ε^{(n)}_{2k} is a "weighted average" of the x_m exactly as described in the Introduction, even though this is not clear from Equation (69).
The convergence of the sequence {ε^{(n)}_{2k}}_{n=0}^∞ has been analyzed by Graves-Morris and Saff (45, 46); under the conditions of Theorem 7.1, the convergence rate of ε^{(n)}_{2k} to s as n → ∞ is practically the same as those of s_{n,k} for MPE and RRE.
For application of VEA in the cycling mode to the solution of nonlinear systems of equations, see Brezinski (47, 48), Brezinski and Rieu (49), Gekeler (50), and Skelboe (28). All these works consider the application of VEA such that order 2 convergence would be obtained, but their results seem to be heuristic. Actually, the remarks we make concerning MPE and RRE at the end of the eighth section are valid for VEA as well. See also the remarks in Section 6 of Ref. 1.
Topological Epsilon Algorithm
A third method to accelerate the convergence of vector sequences {x_m} that is also based on the epsilon algorithm is the topological epsilon algorithm (TEA) of Brezinski (7). This method is defined as follows:

$$\epsilon^{(n)}_{-1} = 0, \quad \epsilon^{(n)}_{0} = x_n, \quad n = 0, 1, \ldots,$$
$$\epsilon^{(n)}_{2k+1} = \epsilon^{(n+1)}_{2k-1} + \frac{w}{w \cdot \left(\epsilon^{(n+1)}_{2k} - \epsilon^{(n)}_{2k}\right)}, \quad n, k = 0, 1, \ldots,$$
$$\epsilon^{(n)}_{2k+2} = \epsilon^{(n+1)}_{2k} + \frac{\epsilon^{(n+1)}_{2k} - \epsilon^{(n)}_{2k}}{\left(\epsilon^{(n+1)}_{2k+1} - \epsilon^{(n)}_{2k+1}\right) \cdot \left(\epsilon^{(n+1)}_{2k} - \epsilon^{(n)}_{2k}\right)}, \quad n, k = 0, 1, \ldots. \qquad (70)$$
Here, w is an arbitrary fixed vector in C^N and x · y = x^T y. It is clear that, just as VEA, TEA couples all the components of the vectors x_m.
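The two coupled recursions in Equation (70) can be sketched as follows (Python with NumPy). The choice w = (1, …, 1)^T and the linear test iteration are illustrative assumptions; for a linearly generated sequence with k matching the dimension of the error space, TEA reproduces the fixed point exactly in exact arithmetic:

```python
import numpy as np

def tea(X, w):
    """Topological epsilon algorithm, Equation (70); returns eps_{2k}^{(0)}."""
    even = [np.array(v, dtype=float) for v in X]       # eps_0 column
    odd_prev = [np.zeros_like(even[0]) for _ in X]     # eps_{-1} column
    while len(even) > 2:
        # Odd step: eps_{2k+1}^{(n)} = eps_{2k-1}^{(n+1)} + w / (w . diff)
        odd = [odd_prev[n + 1] + w / np.dot(w, even[n + 1] - even[n])
               for n in range(len(even) - 1)]
        # Even step: eps_{2k+2}^{(n)} = eps_{2k}^{(n+1)} + diff / (odd_diff . diff)
        even = [even[n + 1] + (even[n + 1] - even[n])
                / np.dot(odd[n + 1] - odd[n], even[n + 1] - even[n])
                for n in range(len(odd) - 1)]
        odd_prev = odd
    return even[0]

# Linear iteration x_{m+1} = T x_m + b (illustrative data); fixed point s.
T = np.array([[0.7, 0.1, 0.0, 0.0],
              [0.0, 0.5, 0.1, 0.0],
              [0.0, 0.0, -0.4, 0.1],
              [0.0, 0.0, 0.0, 0.2]])
b = np.ones(4)
s = np.linalg.solve(np.eye(4) - T, b)

x = np.zeros(4)
X = [x]
for _ in range(8):                  # 2k + 1 = 9 vectors give eps_8^{(0)}, k = 4
    x = T @ x + b
    X.append(x)

w = np.ones(4)                      # satisfies w . v_i != 0 for this example
s_tea = tea(X, w)                   # reproduces s up to roundoff
```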
Again, the ε^{(n)}_{2k} are the required approximants, and a determinantal representation for them exists. This representation is of the form given in Equations (28) and (29) of Theorem 4.1, with u_{i,j} = w · u_{n+i+j} in Equation (29). Thus, ε^{(n)}_{2k} is also a "weighted average" of the x_m exactly as described in the Introduction, even though this is not clear from Equation (70).
As shown by Brezinski, when applied to a sequence {x_m} that arises from the fixed-point iterative procedure x_{m+1} = T x_m + b, the vector ε^{(0)}_{2k} produced by TEA is the same as the vector generated at the kth stage of the method of Lanczos applied to the linear system (I − T)x = b, starting with the vector x_0.
The convergence of the sequences {ε^{(n)}_{2k}}_{n=0}^∞ has been analyzed by Sidi, Ford, and Smith (14) and Sidi and Bridger (22). It is proved in Ref. 14 that, under the conditions of Theorem 7.1, the rate of convergence of ε^{(n)}_{2k} to s as n → ∞ is the same as those of s_{n,k} from MPE and RRE. Specifically, for all large n, ε^{(n)}_{2k} exists uniquely [part (1) of Theorem 7.1], and the rest of the results of Theorem 7.1 [parts (2)–(4)] are valid for TEA without any changes, provided

$$w \cdot v_i \neq 0, \quad i = 1, \ldots, k. \qquad (71)$$
Remarks on Application of Epsilon Algorithms
Clearly, all three epsilon algorithms can be applied in the cycling mode, just as MPE and RRE. It has been observed in various applications that, when used in the cycling mode, SEA is rather unstable; hence it is not recommended for vector sequences. For this reason, we comment only on VEA and TEA in the sequel.
From the various convergence analyses mentioned above, and from Theorem 3.6 and McLeod's theorem, we conclude that the sequences {ε^{(n)}_{2k}}_{n=0}^∞ from VEA and TEA have convergence rates that are the same as those of {s_{n,k}}_{n=0}^∞ from MPE and RRE. It is clear from the recursion relations in Equations (69) and (70) that to determine {ε^{(n)}_{2k}} we need the vectors x_n, x_{n+1}, …, x_{n+2k}, a total of 2k + 1 vectors. To compute s_{n,k}, we need x_n, x_{n+1}, …, x_{n+k+1}, a total of k + 2 vectors. Thus, VEA and TEA require practically twice as many vectors x_m to achieve the same kind of accuracy as MPE and RRE. This can already be of concern when the cost of computing each x_m is high. Also, we need to
store 2k + 1 vectors to obtain ε^{(n)}_{2k}, whereas k + 2 vectors need to be stored to obtain s_{n,k}. When N, the dimension of the vectors x_m, is very large, storage problems can occur with VEA more than with MPE and RRE. In addition, with the vectors x_m already available, the cost of computing ε^{(n)}_{2k} is higher than the cost of computing s_{n,k} when the latter are computed via the algorithms of the sixth section; for the exact operation counts, see Sidi (20). For all these reasons, it seems reasonable to prefer MPE and RRE over the epsilon algorithms in general.
BIBLIOGRAPHY
1. D. A. Smith, W. F. Ford, and A. Sidi, Extrapolation methods for vector sequences, SIAM Rev., 29: 199–233, 1987.

2. S. Cabay and L. W. Jackson, A polynomial extrapolation method for finding limits and antilimits of vector sequences, SIAM J. Numer. Anal., 13: 734–752, 1976.

3. R. P. Eddy, Extrapolating to the limit of a vector sequence, in P. C. C. Wang (ed.), Information Linkage Between Applied Mathematics and Industry, New York: Academic Press, 1979.

4. M. Mesina, Convergence acceleration for the iterative solution of the equations X = AX + f, Comput. Methods Appl. Mech. Engrg., 10: 165–173, 1977.

5. P. Wynn, On a device for computing the e_m(S_n) transformation, Mathematical Tables and Other Aids to Computation, 10: 91–96, 1956.

6. P. Wynn, Acceleration techniques for iterated vector and matrix problems, Math. Comp., 16: 301–322, 1962.

7. C. Brezinski, Généralisations de la transformation de Shanks, de la table de Padé, et de l'ε-algorithme, Calcolo, 11: 317–360, 1975.

8. W. E. Arnoldi, The principle of minimized iterations in the solution of the matrix eigenvalue problem, Quart. Appl. Math., 9: 17–29, 1951.

9. Y. Saad and M. H. Schultz, GMRES: A generalized minimal residual method for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput., 7: 856–869, 1986.

10. A. Sidi, Rational approximations from power series of vector-valued meromorphic functions, J. Approx. Theory, 77: 89–111, 1994.

11. A. Sidi, Application of vector-valued rational approximation to the matrix eigenvalue problem and connections with Krylov subspace methods, SIAM J. Matrix Anal. Appl., 16: 1341–1369, 1995.

12. A. Sidi, Approximation of largest eigenpairs of matrices and applications to PageRank computation, Technical Report CS-2004-16, Computer Science Dept., Technion–Israel Institute of Technology, 2004.

13. B. P. Pugachev, Acceleration of the convergence of iterative processes and a method of solving systems of nonlinear equations, U.S.S.R. Comput. Math. Math. Phys., 17: 199–207, 1978.

14. A. Sidi, W. F. Ford, and D. A. Smith, Acceleration of convergence of vector sequences, SIAM J. Numer. Anal., 23: 178–196, 1986.

15. A. S. Householder, The Theory of Matrices in Numerical Analysis. New York: Blaisdell, 1964.

16. S. Kaniel and J. Stein, Least-square acceleration of iterative methods for linear equations, J. Optimization Theory Appl., 14: 431–437, 1974.

17. A. Sidi, Convergence and stability properties of minimal polynomial and reduced rank extrapolation algorithms, SIAM J. Numer. Anal., 23: 197–209, 1986.

18. W. F. Ford and A. Sidi, Recursive algorithms for vector extrapolation methods, Appl. Numer. Math., 4: 477–489, 1988.

19. A. Sidi, Extrapolation vs. projection methods for linear systems of equations, J. Comp. Appl. Math., 22: 71–88, 1988.

20. A. Sidi, Efficient implementation of minimal polynomial and reduced rank extrapolation methods, J. Comp. Appl. Math., 36: 305–337, 1991.

21. G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed., Baltimore, MD: Johns Hopkins University Press, 1996.

22. A. Sidi and J. Bridger, Convergence and stability analyses for some vector extrapolation methods in the presence of defective iteration matrices, J. Comp. Appl. Math., 22: 35–61, 1988.

23. A. Sidi, Convergence of intermediate rows of minimal polynomial and reduced rank extrapolation tables, Numer. Algorithms, 6: 229–244, 1994.

24. A. Sidi and Y. Shapira, Upper bounds for convergence rates of vector extrapolation methods on linear systems with initial iterations, Technical Report 701, Technion–Israel Institute of Technology, 1991. http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19920024498-1992024498.pdf.

25. A. Sidi and Y. Shapira, Upper bounds for convergence rates of acceleration methods with initial iterations, Numer. Algorithms, 18: 113–132, 1998.

26. M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Washington, D.C.: US Government Printing Office, 1964.

27. A. Sidi, Practical Extrapolation Methods: Theory and Applications, Cambridge Monographs on Applied and Computational Mathematics, Cambridge, UK: Cambridge University Press, 2003.

28. S. Skelboe, Computation of the periodic steady-state response of nonlinear networks by extrapolation methods, IEEE Trans. Circuits and Systems, 27: 161–175, 1980.

29. J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, 3rd ed., New York: Springer-Verlag, 2002.

30. K. W. Morton and D. F. Mayers, Numerical Solution of Partial Differential Equations, Cambridge, UK: Cambridge University Press, 1994.

31. W. F. Ames, Numerical Methods for Partial Differential Equations, 2nd ed., New York: Academic Press, 1977.

32. Baisheng Wu and Huixiang Zhong, Application of vector-valued rational approximations to a class of non-linear oscillations, Internat. J. Non-Linear Mech., 38: 249–254, 2003.

33. A. Sidi, Vector extrapolation methods with applications to solution of large systems of equations and to PageRank computations, Computers Math. Appl., 56: 1–24, 2008.

34. S. Brin and L. Page, The anatomy of a large-scale hypertextual web search engine, Comput. Networks ISDN Syst., 30: 107–117, 1998.

35. S. D. Kamvar, T. H. Haveliwala, C. D. Manning, and G. H. Golub, Extrapolation methods for accelerating PageRank computations, Proc. of the Twelfth International World Wide Web Conference, 2003, pp. 261–270.

36. D. Shanks, Nonlinear transformations of divergent and slowly convergent sequences, J. Math. and Phys., 34: 1–42, 1955.
37. J. R. Schmidt, On the numerical solution of linear simultaneous equations by an iterative method, Phil. Mag., 7: 369–383, 1941.

38. P. Wynn, On the convergence and stability of the epsilon algorithm, SIAM J. Numer. Anal., 3: 91–122, 1966.

39. A. Sidi, Extension and completion of Wynn's theory on convergence of columns of the epsilon table, J. Approx. Theory, 86: 21–40, 1996.

40. P. R. Graves-Morris, Vector valued rational interpolants I, Numer. Math., 42: 331–348, 1983.

41. P. R. Graves-Morris, Extrapolation methods for vector sequences, Numer. Math., 61: 475–487, 1992.

42. P. Wynn, Continued fractions whose coefficients obey a non-commutative law of multiplication, Arch. Rat. Mech. Anal., 12: 273–312, 1963.

43. J. B. McLeod, A note on the ε-algorithm, Computing, 7: 17–24, 1971.

44. P. R. Graves-Morris and C. D. Jenkins, Vector valued rational interpolants III, Constr. Approx., 2: 263–289, 1986.

45. P. R. Graves-Morris and E. B. Saff, Row convergence theorems for generalised inverse vector-valued Padé approximants, J. Comp. Appl. Math., 23: 63–85, 1988.

46. P. R. Graves-Morris and E. B. Saff, An extension of a row convergence theorem for vector Padé approximants, J. Comp. Appl. Math., 34: 315–324, 1991.

47. C. Brezinski, Application de l'ε-algorithme à la résolution des systèmes non linéaires, C. R. Acad. Sci. Paris, 271(A): 1174–1177, 1970.

48. C. Brezinski, Sur un algorithme de résolution des systèmes non linéaires, C. R. Acad. Sci. Paris, 272(A): 145–148, 1971.

49. C. Brezinski and A. C. Rieu, The solution of systems of equations using the ε-algorithm, and an application to boundary-value problems, Math. Comp., 28: 731–741, 1974.

50. E. Gekeler, On the solution of systems of equations by the epsilon algorithm of Wynn, Math. Comp., 26: 427–436, 1972.
AVRAM SIDI
Technion—Israel Institute of Technology
Haifa, Israel