A Crash Course on Matrices, Moments and Quadrature

James V. Lambers
Department of Mathematics, University of Southern Mississippi

School of Computing Seminar Series, January 29, 2010



Outline

From bilinear forms to integrals
Estimation of quadratic forms
Estimation of bilinear forms
Conclusions


What This Talk Is Really About

It's a plug for a book, even though it's not my book.

The book: Matrices, Moments and Quadrature, with Applications, by Gene Golub and Gérard Meurant.

In the works since 2005, it was finally published in December 2009 by Princeton University Press.

If you find the material in this talk interesting, you need this book!

Available online for only $65.

Ordered for the USM library.

Elements of Functions of Matrices

In their 1994 paper "Matrices, Moments and Quadrature", Golub and Meurant described a method for computing quantities of the form

    u^T f(A) v,

where u and v are N-vectors, A is an N × N symmetric positive definite matrix, and f is a smooth function.

Using the Spectral Decomposition

The basic idea is as follows: since the matrix A is symmetric positive definite, it has real eigenvalues

    b = λ_1 ≥ λ_2 ≥ ⋯ ≥ λ_N = a > 0,

and corresponding orthonormal eigenvectors q_j, j = 1, …, N.

Therefore, the quantity u^T f(A) v can be rewritten as

    u^T f(A) v = Σ_{j=1}^N f(λ_j) u^T q_j q_j^T v.
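This identity is easy to sanity-check numerically. The sketch below (with an illustrative random matrix, random vectors, and f = exp, none of which come from the talk) compares the eigenvector expansion against a direct evaluation of f(A):

```python
import numpy as np
from scipy.linalg import expm

# Check: u^T f(A) v equals sum_j f(lambda_j) (u^T q_j)(q_j^T v)
# (A, u, v, and f = exp are illustrative choices)
rng = np.random.default_rng(0)
N = 6
B = rng.standard_normal((N, N))
A = B @ B.T + np.eye(N)                  # symmetric positive definite
u, v = rng.standard_normal(N), rng.standard_normal(N)

lam, Q = np.linalg.eigh(A)               # eigenvalues, orthonormal eigenvectors
direct = u @ expm(A) @ v                 # u^T f(A) v evaluated directly
expansion = sum(np.exp(lam[j]) * (u @ Q[:, j]) * (Q[:, j] @ v)
                for j in range(N))
```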

From Bilinear Forms to Integrals

We let a = λ_N be the smallest eigenvalue, b = λ_1 the largest eigenvalue, and define the measure α(λ) by

    α(λ) = 0,                  if λ < a,
           Σ_{j=i}^N α_j β_j,  if λ_i ≤ λ < λ_{i−1},
           Σ_{j=1}^N α_j β_j,  if λ ≥ b,

where α_j = u^T q_j and β_j = q_j^T v.

Then the quantity u^T f(A) v can be viewed as a Riemann–Stieltjes integral

    u^T f(A) v = I[f] = ∫_a^b f(λ) dα(λ).

Not Your Everyday Integral

A Riemann–Stieltjes integral of this form cannot be reduced to a Riemann integral, because α is constant everywhere except for jumps at the eigenvalues of A.

However, because each of the moments

    μ_i = ∫_a^b λ^i dα(λ), i = 0, 1, 2, …

is finite, the integral of any polynomial exists.

We define the inner product

    ⟨p, q⟩ = ∫_a^b p(λ) q(λ) dα(λ) = u^T p(A) q(A) v,

where p and q are real-valued polynomials.

Approximation Strategy

The integral I[f] can be approximated using either Gauss, Gauss–Radau, or Gauss–Lobatto quadrature rules, all of which yield an approximation of the form

    I[f] = Σ_{j=1}^K w_j f(t_j) + R[f],

where the nodes t_j, j = 1, …, K, as well as the weights w_j, j = 1, …, K, can be obtained from orthogonal polynomials with respect to the measure α(λ).

The Case u = v

When u = v, the measure α(λ) is positive and increasing.

Therefore, we seek a quadrature rule with positive weights.

But before we can compute the weights, we need the nodes.

For Gaussian quadrature, the nodes are the roots of a polynomial that is orthogonal to all polynomials of lesser degree.

How can we compute polynomials that are orthogonal with respect to α(λ)?

Orthogonal Polynomials

A sequence of orthogonal polynomials satisfies a 3-term recurrence relation

    γ_j p_j(λ) = (λ − α_j) p_{j−1}(λ) − γ_{j−1} p_{j−2}(λ), p_{−1}(λ) ≡ 0.

The recursion coefficients α_j, γ_j are determined by the orthogonality of the p_j.

The initial polynomial p_0(λ) is chosen to be a constant.

We require each polynomial to have unit norm, i.e., ⟨p_j, p_j⟩ = 1, where ⟨·, ·⟩ is the appropriate inner product.

What is the Inner Product?

In this case, the inner product is

    ⟨f, g⟩ = ∫_a^b f(λ) g(λ) dα(λ) = u^T f(A) g(A) u.

The recursion coefficients are given by

    α_j = u^T A p_{j−1}(A)² u,   γ_j² = u^T q_{j−1}(A)² u,

where

    q_{j−1}(λ) = (λ − α_j) p_{j−1}(λ) − γ_{j−1} p_{j−2}(λ).

To begin the sequence, we set p_0(λ) ≡ 1/‖u‖.

Computing Coefficients Efficiently

If we define x_j = p_{j−1}(A)u and r_j = q_j(A)u, then

    α_j = x_j^T A x_j,   γ_j² = ‖r_{j−1}‖_2².

The resulting algorithm for the recursion coefficients is

    r_0 = u, γ_0 = ‖r_0‖_2, x_1 = r_0/γ_0
    for j = 1, 2, …
        α_j = x_j^T A x_j
        r_j = (A − α_j I) x_j − γ_{j−1} x_{j−1}
        γ_j = ‖r_j‖_2
        x_{j+1} = r_j/γ_j
    end

Look familiar? It's the Lanczos algorithm!
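A direct numpy transcription of the loop on this slide may make the connection concrete (the matrix and starting vector below are illustrative):

```python
import numpy as np

def lanczos_jacobi(A, u, K):
    """K steps of the recursion on the slide; returns the K x K Jacobi
    matrix of recursion coefficients (alpha_j diagonal, gamma_j off-diagonal)."""
    alpha, gamma = [], []
    r = u.astype(float)
    g = np.linalg.norm(r)                  # gamma_0 = ||r_0||_2
    x_prev, x = np.zeros_like(r), r / g    # x_1 = r_0 / gamma_0
    for _ in range(K):
        a = x @ (A @ x)                    # alpha_j = x_j^T A x_j
        r = A @ x - a * x - g * x_prev     # r_j = (A - alpha_j I) x_j - gamma_{j-1} x_{j-1}
        alpha.append(a)
        g = np.linalg.norm(r)              # gamma_j = ||r_j||_2
        gamma.append(g)
        x_prev, x = x, r / g               # x_{j+1} = r_j / gamma_j
    return np.diag(alpha) + np.diag(gamma[:-1], 1) + np.diag(gamma[:-1], -1)

rng = np.random.default_rng(1)
N = 20
B = rng.standard_normal((N, N))
A = B @ B.T + N * np.eye(N)                # symmetric positive definite
T = lanczos_jacobi(A, rng.standard_normal(N), 5)
```

The returned matrix is exactly the Jacobi matrix T_j used on the following slides; its eigenvalues lie inside the spectrum of A.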

From Recursion Coefficients to Quadrature Nodes...

For a j-node Gaussian quadrature rule, we need the roots of p_j(λ).

We could compute the orthogonal polynomials p_0, p_1, …, p_j from the recursion coefficients α_j, γ_j, but this is unnecessary.

From the 3-term recurrence relation,

    λ v_j(λ) = T_j v_j(λ) + γ_j p_j(λ) e_j,

where

    v_j(λ) = [ p_0(λ)  p_1(λ)  ⋯  p_{j−1}(λ) ]^T

and T_j is a j × j tridiagonal symmetric positive definite matrix, called a Jacobi matrix.

...By Way of Eigenvalues!

Specifically,

    T_j = [ α_1  γ_1
            γ_1  α_2  γ_2
                 ⋱        ⋱         ⋱
                 γ_{j−2}  α_{j−1}   γ_{j−1}
                          γ_{j−1}   α_j     ].

Now, suppose t is a root of p_j. Then we have

    t v_j(t) = T_j v_j(t).

That is, t is an eigenvalue of T_j!

But What About the Weights?

The weights are given by

    w_k = ∫_a^b L_k(λ) dα(λ),

where the L_k, k = 1, …, j, are the Lagrange polynomials for the nodes.

However, from the Christoffel–Darboux identity

    (x − y) v_j^T(x) v_j(y) = γ_j [ p_{j−1}(y) p_j(x) − p_{j−1}(x) p_j(y) ],

it can be shown that w_k is the square of the first component of the eigenvector corresponding to the node t_k.
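In code, the nodes-and-weights recipe of the last two slides takes a few lines; the 3 × 3 Jacobi matrix below is illustrative:

```python
import numpy as np

# Nodes = eigenvalues of T_j; weights = squared first components of the
# normalized eigenvectors (for a measure normalized to unit mass, mu_0 = 1)
T = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 4.0]])
nodes, U = np.linalg.eigh(T)   # eigenvalues and orthonormal eigenvectors
weights = U[0, :] ** 2         # squared first components
```

The resulting rule has positive weights that sum to μ_0 and, by construction, integrates low-degree polynomials of the underlying measure exactly.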

It's Just Functions of Small Matrices!

If T_j = U_j Λ_j U_j^T is the Schur decomposition of T_j, then we have

    u^T f(A) u ≈ Σ_{k=1}^j f(t_k) w_k = ‖u‖_2² e_1^T U_j f(Λ_j) U_j^T e_1 = ‖u‖_2² [f(T_j)]_{11}.

Therefore, once we compute T_j, we can perform Gaussian quadrature using any technique for computing the (1,1) element of f(T_j), without necessarily having to compute the nodes and weights explicitly.
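Putting the pieces together: the sketch below estimates u^T exp(A) u as ‖u‖² [f(T_K)]_{11} after a few Lanczos steps (the sizes, the matrix, and f = exp are illustrative choices):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
N, K = 50, 6
B = rng.standard_normal((N, N))
A = np.eye(N) + B @ B.T / (4 * N)        # SPD with modest spectral width
u = rng.standard_normal(N)

# K steps of Lanczos with starting vector u (as on the earlier slide)
alpha, gamma = [], []
r = u.copy()
g = np.linalg.norm(r)
x_prev, x = np.zeros(N), r / g
for _ in range(K):
    a = x @ (A @ x)
    r = A @ x - a * x - g * x_prev
    alpha.append(a)
    g = np.linalg.norm(r)
    gamma.append(g)
    x_prev, x = x, r / g

T = np.diag(alpha) + np.diag(gamma[:-1], 1) + np.diag(gamma[:-1], -1)
estimate = (u @ u) * expm(T)[0, 0]       # ||u||^2 [f(T_K)]_{11}
exact = u @ expm(A) @ u                  # reference value
```

Only a K × K matrix function is evaluated, never f(A) itself; that is the practical point of the slide.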

Diagonal Elements of f(A)

By setting u = v = e_j, we can approximate diagonal elements of a function of a symmetric matrix A.

Example: f(λ) = λ^{−1} for the inverse.

The error in K-node Gaussian quadrature has the form

    R[f] = I[f] − ‖u‖_2² e_1^T f(T_K) e_1 = ‖u‖_2² f^{(2K)}(η)/(2K)! ∫_a^b ∏_{k=1}^K (λ − t_k)² dα(λ),

where η ∈ (a, b).

Therefore, if f^{(2K)} > 0, Gaussian quadrature yields a lower bound.
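For f(λ) = 1/λ every even-order derivative is positive, so the Gauss estimate of a diagonal element of A^{-1} should land below the true value. The sketch below checks this with K = 2 (the matrix is illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
N, K, j = 30, 2, 0
B = rng.standard_normal((N, N))
A = B @ B.T + N * np.eye(N)              # symmetric positive definite
u = np.zeros(N)
u[j] = 1.0                               # u = v = e_j

# K steps of Lanczos starting from e_j
alpha, gamma = [], []
r = u.copy()
g = np.linalg.norm(r)
x_prev, x = np.zeros(N), r / g
for _ in range(K):
    a = x @ (A @ x)
    r = A @ x - a * x - g * x_prev
    alpha.append(a)
    g = np.linalg.norm(r)
    gamma.append(g)
    x_prev, x = x, r / g

T = np.diag(alpha) + np.diag(gamma[:-1], 1) + np.diag(gamma[:-1], -1)
lower_bound = np.linalg.inv(T)[0, 0]     # ||e_j||^2 [f(T_K)]_{11}, f(x) = 1/x
exact = np.linalg.inv(A)[j, j]           # true diagonal element of A^{-1}
```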

Prescribed Nodes

It is sometimes advantageous to prescribe selected quadrature nodes.

In Gauss–Radau rules, one node is specified, normally a or b (or an estimate).

In Gauss–Lobatto rules, two nodes are specified (e.g., both a and b).

Lanczos is still used, but the trailing recursion coefficients are chosen so that the prescribed nodes are eigenvalues of the Jacobi matrix.

The quadrature error includes linear factors (λ − t_k) corresponding to the prescribed nodes, where λ ∈ (a, b), so by prescribing t_k = a or t_k = b, upper or lower bounds may be obtained.

Estimating tr(A^{−1}) and det(A) (Bai and Golub, 1997)

Let A be symmetric positive definite, and define

    μ_r = Σ_{i=1}^n λ_i^r = ∫_a^b λ^r dα(λ),

where α(λ) is an unknown measure.

Then μ_{−1} = tr(A^{−1}), and μ_0, μ_1, and μ_2 can easily be computed.

If we use a two-node quadrature rule, approximations of μ_r satisfy a 3-term recurrence relation, which can be recovered.

Known values of μ_r can then be used to compute the weights.

Using Gauss–Radau rules with prescribed node a or b, we can obtain upper and lower bounds on tr(A^{−1}).

Estimating det(A)

We have

    ln(det(A)) = ln( ∏_{i=1}^n λ_i ) = Σ_{i=1}^n ln λ_i = tr(ln(A)).

It follows that we can use the same quadrature approach as for tr(A^{−1}), with the integrand f(λ) = ln λ, to obtain upper and lower bounds for det(A).

Estimates of det(A) and tr(A^{−1}) have applications in the study of fractals, lattice quantum chromodynamics (QCD), and crystals.
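The identity ln det(A) = tr(ln(A)) is easy to confirm on a small SPD matrix; the practical point of the slide, of course, is to estimate the trace by quadrature rather than by computing eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((5, 5))
A = B @ B.T + 5 * np.eye(5)              # symmetric positive definite

lam = np.linalg.eigvalsh(A)
trace_log = np.sum(np.log(lam))          # tr(ln(A)) via the eigenvalues
sign, logdet = np.linalg.slogdet(A)      # ln(det(A)) computed directly
```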

Estimating Error in Conjugate Gradient

The kth iteration of the CG algorithm corresponds to a Jacobi matrix J_k, which can be computed from the coefficients of CG.

The Jacobi matrices relate to the A-norm of the error as follows:

    ‖ε_k‖_A² = ‖r_0‖_2² [ e_1^T J_n^{−1} e_1 − e_1^T J_k^{−1} e_1 ].

Let d be a delay integer. An estimate is

    ‖ε_{k−d}‖_A² ≈ ‖r_0‖_2² [ e_1^T J_k^{−1} e_1 − e_1^T J_{k−d}^{−1} e_1 ].

Because the Jacobi matrices are tridiagonal, the (1,1) element of the inverse can easily be computed using recurrence relations.

Jacobi matrices can be modified to obtain estimates from Gauss–Radau or Gauss–Lobatto quadrature rules (Meurant, 2006).

Regularization

Consider Tikhonov regularization

    min_x { ‖c − Ax‖_2² + μ‖x‖_2² }

for solving the ill-posed problem Ax = c.

How does the regularized solution x_μ depend on μ?

This can be understood using the L-curve (‖x_μ‖_2, ‖c − Ax_μ‖_2).

If we let K = A^T A and d = A^T c, then

    ‖x_μ‖_2² = d^T (K + μI)^{−2} d,
    ‖c − Ax_μ‖_2² = c^T c + d^T K (K + μI)^{−2} d − 2 d^T (K + μI)^{−1} d.

These quadratic forms can be estimated using Gaussian quadrature, with Golub–Kahan (Lanczos) bidiagonalization used to compute Jacobi matrices corresponding to A^T A (Calvetti, Golub and Reichel, 1999).
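Both quadratic-form identities can be verified directly on a small problem (the matrix, right-hand side, and μ below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 8, 5
A = rng.standard_normal((m, n))
c = rng.standard_normal(m)
mu = 0.1

K = A.T @ A
d = A.T @ c
x_mu = np.linalg.solve(K + mu * np.eye(n), d)    # regularized solution

Kinv = np.linalg.inv(K + mu * np.eye(n))         # (K + mu I)^{-1}
norm_sq = d @ Kinv @ Kinv @ d                    # d^T (K + mu I)^{-2} d
resid_sq = c @ c + d @ Kinv @ K @ Kinv @ d - 2 * (d @ Kinv @ d)
```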


Least Squares Error

Consider the least squares problem

    min_x ‖c − Ax‖_2.

The backward error is defined by

    μ(x̃) = min ‖[ ΔA   θΔc ]‖_F,

where x̃ is the computed solution, (A^T A + ΔA) x̃ = A^T c + Δc, and θ is a real parameter.

Least Squares Error Estimation

μ(x̃) is not practical to compute directly, so we estimate it with

    ν(x̃) = ‖ ( ‖x̃‖_2² A^T A + ‖r‖_2² I )^{−1/2} A^T r ‖_2,

where r = c − Ax̃. Then, if we define ν̃(x̃) = ‖x̃‖_2² [ν(x̃)]², we have

    ν̃(x̃) = y^T (A^T A + η² I)^{−1} y,

where y = A^T r and η = ‖r‖_2/‖x̃‖_2. This is a quadratic form that can be estimated using Gaussian quadrature in conjunction with Lanczos bidiagonalization (Su, 2005).

The u ≠ v Case

For general u and v, the bilinear form u^T f(A) v can be expressed as the difference quotient

    u^T f(A) v = (1/δ)[ u^T f(A)(u + δv) − u^T f(A) u ],

where δ is a small constant.

This yields Riemann–Stieltjes integrals with positive, increasing measures when u and v are real, provided that δ is sufficiently small.
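The difference quotient is in fact exact, not merely approximate, when f(A) is applied exactly, since u^T f(A)(u + δv) = u^T f(A)u + δ u^T f(A)v. A quick check with f = exp (all data illustrative):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(6)
N = 6
B = rng.standard_normal((N, N))
A = B @ B.T + np.eye(N)                  # symmetric positive definite
u, v = rng.standard_normal(N), rng.standard_normal(N)
fA = expm(A)

delta = 1e-3
bilinear = u @ fA @ v                                    # u^T f(A) v
quotient = (u @ fA @ (u + delta * v) - u @ fA @ u) / delta
```

In the quadrature setting the two quadratic forms are only approximated, which is why δ must be chosen with care; that is the subject of the next slides.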

Really, We're Computing Derivatives... (JL, 2008)

Let T_δ be the output of the Lanczos algorithm with starting vectors u and u + δv, and let K be the number of quadrature nodes.

Then, in computing u^T f(A) v, we are approximating the derivative

    d/dδ [ u^T (u + δv) [f(T_δ)]_{11} ]_{δ=0}.

But

    [f(T_δ)]_{11} = Σ_{k=1}^K w_k f(t_k),

where the t_k, w_k are the nodes and weights used to approximate u^T f(A)(u + δv), as functions of δ.

...So Let's Differentiate!

We let δ → 0, thus computing this derivative exactly. Then, we obtain

    u^T f(A) v ≈ u^T v Σ_{k=1}^K w_k f(t_k) + u^T u Σ_{k=1}^K [ w_k′ f(t_k) + w_k f′(t_k) t_k′ ],

where t_k′, w_k′ are the derivatives of the nodes and weights with respect to δ, evaluated at δ = 0.

Derivatives of the Nodes

There exists a unitary matrix Q_0 such that

    T_0 = Q_0 Λ_0 Q_0^H,

and the nodes are on the diagonal of Λ_0. Also,

    T_δ = Q_δ Λ_δ Q_δ^{−1},

for sufficiently small δ.

Differentiating with respect to δ and evaluating at δ = 0 yields

    diag(Λ_0′) = diag(Q_0^H T_0′ Q_0),

since all other terms arising from differentiation vanish on the diagonal.

Derivatives of the Weights

To compute the derivatives of the weights, consider

    (T_δ − t_{j,δ} I) w_{j,δ} = 0, j = 1, …, K,

where w_{j,δ} is a normalized eigenvector of T_δ with eigenvalue t_{j,δ}.

Differentiate with respect to δ, and evaluate at δ = 0.

Delete the last row and column, using the normalization.

We now have a (K − 1) × (K − 1) system, where the matrix is tridiagonal plus a rank-one update, and independent of v.

Solve this system, and a similar one for the left eigenvector.

We can obtain w_k′ from the first components of the two solutions.

Derivatives of the Recursion Coefficients

From the expressions for the entries of T_δ in terms of those of T_0, the derivatives of the recursion coefficients can be obtained by letting q_0 = v and letting the r_j be the unnormalized Lanczos vectors.

By differentiating the recurrence relations with respect to δ and evaluating at δ = 0, we obtain the following algorithm that computes these derivatives:

    [γ_0²]′ = r_0^H q_0
    s_0 = 1/γ_0
    t_0 = [γ_0²]′/γ_0²
    d_0 = 0
    for j = 1, …, K
        α_j′ = s_{j−1} r_j^H q_{j−1} + d_{j−1} γ_{j−2}
        d_j = (d_{j−1} γ_{j−2} − α_j′)/γ_{j−1}
        q_j = (A − α_j I) q_{j−1} − γ_{j−1}² q_{j−2}
        [γ_j²]′ = t_{j−1} γ_j² + s_{j−1} r_j^H q_j
        s_j = s_{j−1}/γ_j
        t_j = t_{j−1} + [γ_j²]′/γ_j²
    end

Which Approach to Use?

The preceding approach of computing derivatives of quadrature rules is particularly useful when u varies over a set of size N and v is a fixed vector, since then fewer Krylov subspaces need to be generated (N + 1 instead of 2N).

Another approach to bilinear forms is to write

    u^T f(A) v = (1/4)[ (u + v)^T f(A)(u + v) − (u − v)^T f(A)(u − v) ],

since then only the symmetric Lanczos algorithm is needed.
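The polarization identity in the second bullet holds exactly for any symmetric f(A); a quick numerical confirmation with f = exp (all data illustrative):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(7)
N = 6
B = rng.standard_normal((N, N))
A = B @ B.T + np.eye(N)                  # symmetric positive definite
u, v = rng.standard_normal(N), rng.standard_normal(N)
fA = expm(A)

bilinear = u @ fA @ v
polarized = 0.25 * ((u + v) @ fA @ (u + v) - (u - v) @ fA @ (u - v))
```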

Off-Diagonal Elements

If we let u = e_i and v = e_j, with i ≠ j, then we can compute off-diagonal elements of matrix functions, such as

    [A^{−1}]_{ij} = (1/δ)[ e_i^T A^{−1}(e_i + δe_j) − e_i^T A^{−1} e_i ].

This avoids the immediate serious breakdown of the unsymmetric Lanczos algorithm that occurs when approximating u^T f(A) v directly with orthogonal u and v.

Note that computing derivatives of quadrature rules with respect to δ can circumvent the numerical instability arising from the choice of a very small δ.

The Scattering Amplitude

Consider the system Ax = c and the adjoint system A^T y = d.

In electromagnetics, the scattering amplitude can be described by an expression of the form d^T x, where d represents an antenna that receives a field x from a signal c.

The bilinear form d^T A^{−1} c can be approximated using the unsymmetric Lanczos algorithm to construct a Gaussian quadrature rule with integrand f(λ) = λ^{−1}, but since A is not necessarily symmetric, this can require Gaussian quadrature in the complex plane.

Alternatively, we can transform the problem into an integral involving a function of a symmetric positive definite matrix by rewriting the scattering amplitude as d^T (A^T A)^{−1} p, where p = A^T c.

An Alternative Approach to the Scattering Amplitude

Let W be a symmetric positive definite matrix, and define

    M = [ A^T W A   A^T
          −A        0  ],

    c̃ = [ A^T W c + d
          −c          ],

    p̃ = [ d
          0 ].

Then the scattering amplitude can be written as

    p̃^T f(M) c̃,

where f(λ) = λ^{−1}.

But why is this sensible, considering M is not symmetric?

CG for Unsymmetric Matrices?

M may not be symmetric, but it is positive definite. Furthermore, it is symmetric with respect to the bilinear form (u, v)_G ≡ v^T G u, where

    G = [ I   0
          0  −I ].

Therefore, there exists a well-defined conjugate gradient method for solving systems with M! (Liesen and Parlett, 2008)

The idea to use M for the scattering amplitude was described to me by Gene Golub on November 6, 2007.


Model Variable-Coefficient Problem

Consider the following initial-boundary value problem in one space dimension:

    u_t + Lu = 0 on (0, 2π) × (0, ∞),
    u(x, 0) = f(x), 0 < x < 2π,
    u(0, t) = u(2π, t), t > 0.

The operator L is a second-order differential operator of the form

    Lu = −(p(x)u_x)_x + q(x)u,

where p(x) is a positive smooth function and q(x) is a nonnegative (but nonzero) smooth function. It follows that L is self-adjoint and positive definite.

Krylov Subspace Spectral Methods

Krylov subspace spectral (KSS) methods (JL, 2005) use this approach to compute the Fourier coefficients of u^{n+1} from u^n:

    for ω = −N/2 + 1, …, N/2
        Choose a scaling constant δ_ω
        Compute u_1 ≈ ê_ω^H exp[−L_N Δt] ê_ω using the symmetric Lanczos algorithm
        Compute u_2 ≈ ê_ω^H exp[−L_N Δt](ê_ω + δ_ω u^n) using the unsymmetric Lanczos algorithm
        [û^{n+1}]_ω = (u_2 − u_1)/δ_ω
    end

Advantages: high-order accuracy in time, stable.

Disadvantages: performance is sensitive to the choice of basis, limiting applicability to other spatial discretizations.

Block Gaussian Quadrature

As an alternative, we consider computing

    [u v]^T f(A) [u v],

which results in the 2 × 2 matrix integral

    ∫_a^b f(λ) dμ(λ) = [ u^T f(A) u   u^T f(A) v
                         v^T f(A) u   v^T f(A) v ] = Σ_{j=1}^{2K} f(λ_j) u_j u_j^T + error,

where each λ_j is a scalar and each u_j is a 2-vector.

Block Lanczos Iteration

To obtain the scalar nodes λ_j and the associated vectors u_j, we use the block Lanczos algorithm (Golub and Underwood).

Let X_1 = [u v] be a given N × 2 matrix such that X_1^T X_1 = I_2, and let X_0 = 0 be an N × 2 matrix. Then, for j = 1, 2, …, we compute

    M_j = X_j^T A X_j,
    R_j = A X_j − X_j M_j − X_{j−1} B_{j−1}^T,
    X_{j+1} B_j = R_j.

The last step of the algorithm is the QR factorization of R_j, so that X_{j+1} is N × 2 with X_{j+1}^T X_{j+1} = I_2. The matrix B_j is 2 × 2 and upper triangular; the other coefficient matrix, M_j, is 2 × 2 and symmetric.
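A compact numpy sketch of the iteration above, for block size 2 (the matrix and starting block are illustrative):

```python
import numpy as np

def block_lanczos(A, X1, K):
    """K block Lanczos steps; returns the 2K x 2K block tridiagonal T_K."""
    N = A.shape[0]
    Ms, Bs = [], []
    X_prev, X = np.zeros((N, 2)), X1
    B_prev = np.zeros((2, 2))
    for _ in range(K):
        M = X.T @ A @ X                        # M_j = X_j^T A X_j
        R = A @ X - X @ M - X_prev @ B_prev.T  # R_j
        Q, Bj = np.linalg.qr(R)                # X_{j+1} B_j = R_j
        Ms.append(M)
        Bs.append(Bj)
        X_prev, X, B_prev = X, Q, Bj
    T = np.zeros((2 * K, 2 * K))
    for j in range(K):
        T[2*j:2*j+2, 2*j:2*j+2] = Ms[j]        # diagonal block M_{j+1}
    for j in range(K - 1):
        T[2*j+2:2*j+4, 2*j:2*j+2] = Bs[j]      # subdiagonal block B_{j+1}
        T[2*j:2*j+2, 2*j+2:2*j+4] = Bs[j].T    # superdiagonal block B_{j+1}^T
    return T

rng = np.random.default_rng(8)
N = 30
C = rng.standard_normal((N, N))
A = C @ C.T + N * np.eye(N)                    # symmetric positive definite
X1, _ = np.linalg.qr(rng.standard_normal((N, 2)))  # X_1^T X_1 = I_2
T = block_lanczos(A, X1, 4)
```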

Computation of Block Gaussian Quadrature Rules

Block Lanczos yields the block tridiagonal matrix

    T_K = [ M_1      B_1^T
            B_1      M_2      B_2^T
                     ⋱        ⋱          ⋱
                     B_{K−2}  M_{K−1}    B_{K−1}^T
                              B_{K−1}    M_K       ].

We then define the quadrature rule for [u v]^T f(A) [u v] as

    ∫_a^b f(λ) dμ(λ) ≈ Σ_{j=1}^{2K} f(λ_j) u_j u_j^T = [f(T_K)]_{1:2,1:2},

where 2K is the order of the matrix T_K, λ_j is an eigenvalue of T_K, and u_j contains the first two elements of the corresponding normalized eigenvector.
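With orthonormal starting columns, the leading 2 × 2 block of f(T_K) matches the matrix integral directly; the sketch below checks this for f = exp (all data illustrative):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(9)
N, K = 40, 8
C = rng.standard_normal((N, N))
A = np.eye(N) + C @ C.T / (4 * N)        # SPD with modest spectral width
X1, _ = np.linalg.qr(rng.standard_normal((N, 2)))   # orthonormal [u v]

# block Lanczos, block size 2
Ms, Bs = [], []
X_prev, X = np.zeros((N, 2)), X1
B_prev = np.zeros((2, 2))
for _ in range(K):
    M = X.T @ A @ X
    R = A @ X - X @ M - X_prev @ B_prev.T
    Q, Bj = np.linalg.qr(R)
    Ms.append(M)
    Bs.append(Bj)
    X_prev, X, B_prev = X, Q, Bj

T = np.zeros((2 * K, 2 * K))
for j in range(K):
    T[2*j:2*j+2, 2*j:2*j+2] = Ms[j]
for j in range(K - 1):
    T[2*j+2:2*j+4, 2*j:2*j+2] = Bs[j]
    T[2*j:2*j+2, 2*j+2:2*j+4] = Bs[j].T

approx = expm(T)[:2, :2]                 # [f(T_K)]_{1:2,1:2}
exact = X1.T @ expm(A) @ X1              # [u v]^T f(A) [u v]
```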

Block KSS Methods (JL, 2008)

For each wave number ω = −N/2 + 1, …, N/2, we define

    R_0(ω) = [ ê_ω   u^n ]

and then compute the QR factorization

    R_0(ω) = X_1(ω) B_0(ω),

which yields

    X_1(ω) = [ ê_ω   u^n_ω/‖u^n_ω‖_2 ],   B_0(ω) = [ 1   ê_ω^H u^n
                                                     0   ‖u^n_ω‖_2 ],

where

    u^n_ω = u^n − ê_ω ê_ω^H u^n.

Block KSS Methods, cont'd

Then, we can express each Fourier coefficient of the approximate solution at the next time step as

    [û^{n+1}]_ω = [ B_0^H E_{12}^H exp[−T_K(ω)Δt] E_{12} B_0 ]_{12},

where

    E_{12} = [ e_1   e_2 ].

The computation of E_{12}^H exp[−T_K(ω)Δt] E_{12} consists of computing the eigenvalues and eigenvectors of T_K(ω) in order to obtain the nodes and weights for Gaussian quadrature, as before.

By computing recursion coefficients as functions of ω, we can compute all quadrature rules simultaneously, in O(N log N) time overall.

Consistency

The error in a K-node block Gaussian quadrature rule is

    R(f) = f^{(2K)}(η)/(2K)! ∫_a^b ∏_{j=1}^{2K} (λ − λ_j) dμ(λ).

It follows that the rule is exact for polynomials of degree up to 2K − 1.

A block KSS method that uses a K-node block Gaussian rule to compute each Fourier coefficient [û^1]_ω, for ω = −N/2 + 1, …, N/2, of the solution satisfies

    [û^1]_ω − û(ω, Δt) = O(Δt^{2K}), ω = −N/2 + 1, …, N/2,

where û(ω, Δt) is the corresponding Fourier coefficient of the exact solution at time Δt.

Stability

Let q(x) be bandlimited. Then the block KSS method with K = 1 is unconditionally stable. That is, given T > 0, there exists a constant C_T, independent of N and Δt, such that

    ‖[S_N(Δt)]^n‖ ≤ C_T,

for 0 ≤ nΔt ≤ T, where S_N(Δt) is the approximate solution operator.

Note: it has been demonstrated that KSS methods exhibit similar stability on more general problems, and for K > 1, even when the leading coefficient is not constant.

The Wave Equation

We now apply these ideas to the second-order wave equation

    u_tt + Lu = 0 on (0, 2π) × (0, ∞),
    u(x, 0) = f(x), u_t(x, 0) = g(x), 0 < x < 2π,

with periodic boundary conditions

    u(0, t) = u(2π, t), t > 0.

The operator L is as defined previously,

    Lu = −(p(x)u_x)_x + q(x)u.

  • IntroductionFrom Bilinear Forms to Integrals

    Quadratic FormsBilinear Forms

    Conclusions

    Perturbations of Quadratic FormsApplicationsBlock Gaussian QuadratureApplications

    Application to the Wave Equation

    A spectral representation of the operator L allows us to obtain a representation of the solution operator, the propagator. First, we introduce

    R₁(t) = L^(−1/2) sin(t L^(1/2)) = Σ_{n=1}^∞ [sin(t √λₙ)/√λₙ] ⟨·, φₙ⟩ φₙ,

    R₀(t) = cos(t L^(1/2)) = Σ_{n=1}^∞ cos(t √λₙ) ⟨·, φₙ⟩ φₙ,

    where λ₁, λ₂, … are the (positive) eigenvalues of L, and φ₁, φ₂, … are the corresponding eigenfunctions
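    In a finite-dimensional setting, R₀(t) and R₁(t) can be formed from the eigendecomposition of a symmetric positive definite matrix playing the role of L. A minimal NumPy sketch (the matrix L below is an arbitrary SPD stand-in, not the actual discretized operator), verifying that u(t) = R₀(t)f + R₁(t)g satisfies u_tt + Lu = 0:

```python
import numpy as np

# Arbitrary symmetric positive definite stand-in for the operator L
rng = np.random.default_rng(0)
n = 8
B = rng.standard_normal((n, n))
L = B @ B.T + n * np.eye(n)

lam, V = np.linalg.eigh(L)              # positive eigenvalues, eigenvectors

def R0(t):
    # cos(t sqrt(L)) via the spectral decomposition
    return (V * np.cos(t * np.sqrt(lam))) @ V.T

def R1(t):
    # L^(-1/2) sin(t sqrt(L))
    return (V * (np.sin(t * np.sqrt(lam)) / np.sqrt(lam))) @ V.T

# u(t) = R0(t) f + R1(t) g solves u_tt + L u = 0, u(0) = f, u_t(0) = g
f0 = rng.standard_normal(n)
g0 = rng.standard_normal(n)
u = lambda s: R0(s) @ f0 + R1(s) @ g0

t, h = 0.7, 1e-4
utt = (u(t + h) - 2 * u(t) + u(t - h)) / h**2   # second difference in time
assert np.allclose(utt, -L @ u(t), atol=1e-3)
```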


  • The Propagator

    Then the propagator can be written as

    P(t) = [  R₀(t)      R₁(t)
             −L R₁(t)    R₀(t) ]

    The entries of this matrix, as functions of L, indicate which functions are the integrands in the Riemann–Stieltjes integrals used to compute the Fourier components of the solution

    The block Lanczos process is applied exactly as in the parabolic case, butto the solution and its time derivative
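    As a sanity check on this block form, for a symmetric positive definite matrix the propagator satisfies P(0) = I and the group property P(s)P(t) = P(s + t). A small NumPy sketch (the matrix L is again an arbitrary SPD stand-in):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
B = rng.standard_normal((n, n))
L = B @ B.T + n * np.eye(n)             # arbitrary SPD stand-in for L
lam, V = np.linalg.eigh(L)

def R0(t):
    return (V * np.cos(t * np.sqrt(lam))) @ V.T

def R1(t):
    return (V * (np.sin(t * np.sqrt(lam)) / np.sqrt(lam))) @ V.T

def P(t):
    # the 2x2 block propagator for u_tt + L u = 0
    return np.block([[R0(t), R1(t)],
                     [-L @ R1(t), R0(t)]])

assert np.allclose(P(0.0), np.eye(2 * n))
assert np.allclose(P(0.3) @ P(0.5), P(0.8))   # group property
```

    The group property follows from the angle-addition identities applied to the commuting matrix functions cos(t√L) and sin(t√L).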


  • Consistency and Stability

    A block KSS method that uses a K-node block Gaussian rule to compute each Fourier coefficient of the solution and its time derivative has temporal accuracy O(Δt^(4K−2))

    Assume p(x) is constant and q(x) is bandlimited. Then the block KSS method with K = 1, which is second-order accurate in time, is unconditionally stable

    Bottom Line

    Thus, KSS methods represent a best-of-both-worlds compromise betweenthe efficiency of explicit methods and the stability of implicit methods


  • Application to Other PDE

    Time-dependent Schrödinger equation, with f(λ) = e^(−iλt) (JL, 2009)

    Elliptic problems (JL, 2009): uses f(λ) = λ^(−1), with iterative refinement. Ongoing work: Helmholtz equation

    Nonlinear diffusion (Guidotti and JL, 2008)

    u_t − (1 + [(−D²)^0.8 u]²)^(−1) u_xx = 0

    Useful for removing noise from signals or images

    Block KSS methods are able to handle the nonlinearity without modification

    Non-self-adjoint, coupled systems such as Maxwell's equations (JL, 2009)
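    The fractional operator (−D²)^0.8 in the nonlinear diffusion model above can be applied spectrally on a periodic grid: multiply each Fourier coefficient by |k|^1.6. A minimal sketch (the grid size and test function are arbitrary choices), using the fact that sin(3x) is an eigenfunction of −D² with eigenvalue 9:

```python
import numpy as np

N = 64
x = 2 * np.pi * np.arange(N) / N          # periodic grid on [0, 2*pi)
u = np.sin(3 * x)

k = np.fft.fftfreq(N, d=1.0 / N)          # integer wavenumbers
u_hat = np.fft.fft(u)
frac = np.fft.ifft(np.abs(k) ** 1.6 * u_hat).real   # (-D^2)^0.8 u

# For u = sin(3x): (-D^2) u = 9 sin(3x), so (-D^2)^0.8 u = 9^0.8 sin(3x)
assert np.allclose(frac, 9 ** 0.8 * np.sin(3 * x))
```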


    [Video: soln3.avi]

  • Other Things I Could Have Talked About

    Application to total least squares

    Solving secular equations

    Modified weight functions: computing quadrature rules for an integral

    I[f] = ∫_a^b f(λ) w(λ) dα(λ)

    for some weight function w(λ) such as λ^p or (λ − μ)^(−1) (see work by Golub, Elhay, Kautsky, Gautschi, et al.)

    Gauss-Kronrod rules (Calvetti, Golub, Gragg and Reichel, 2000)

    Anti-Gauss rules (Laurie, 1996)


  • Summary

    Gaussian quadrature is effective for computing estimates and boundsof quadratic and bilinear forms involving functions of matrices

    For large-scale problems, it is not necessary to evaluate the entire matrix function; relatively low-dimensional Krylov subspaces are sufficient

    Block Gaussian quadrature is particularly effective for estimation ofbilinear forms

    These techniques have many applications throughout numericallinear algebra, as well as other areas of computational mathematics

    What other applications can we find? Be on the lookout!
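    To make the first two points concrete: k steps of the Lanczos process on A, started from u, yield a k×k Jacobi matrix T whose eigenvalues θᵢ are the Gaussian quadrature nodes and whose eigenvectors' squared first components sᵢ² are the weights, so uᵀf(A)u ≈ ‖u‖² Σᵢ sᵢ² f(θᵢ) without ever forming f(A). A minimal NumPy sketch (the matrix, f(λ) = e^(−λ), and k = 8 are arbitrary illustrative choices):

```python
import numpy as np

def lanczos(A, v, k):
    """k steps of the symmetric Lanczos process with full
    reorthogonalization; returns the k-by-k Jacobi matrix T."""
    n = v.size
    Q = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(max(k - 1, 0))
    Q[:, 0] = v / np.linalg.norm(v)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = Q[:, j] @ w
        w = w - Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)  # orthogonalize
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    T = np.diag(alpha)
    if k > 1:
        T += np.diag(beta, 1) + np.diag(beta, -1)
    return T

rng = np.random.default_rng(2)
n, k = 200, 8
B = rng.standard_normal((n, n))
A = B @ B.T / n + np.eye(n)           # SPD, spectrum roughly in [1, 5]
u = rng.standard_normal(n)

T = lanczos(A, u, k)
theta, S = np.linalg.eigh(T)          # quadrature nodes = Ritz values
weights = S[0, :] ** 2                # squared first components
approx = (u @ u) * np.sum(weights * np.exp(-theta))

# Compare against the exact quadratic form u^T exp(-A) u
lam, V = np.linalg.eigh(A)
exact = u @ (V @ (np.exp(-lam) * (V.T @ u)))
assert abs(approx - exact) <= 1e-6 * abs(exact)
```

    Here the full eigendecomposition of A is computed only to check the answer; the estimate itself touches A solely through k matrix-vector products.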


  • For More Information...

    James Lambers
    Department of Mathematics
    University of Southern Mississippi

    [email protected]

    http://www.math.usm.edu/lambers

