
A Jacobi Algorithm for Simultaneous Diagonalization of Several Symmetric Matrices

by

Mercy Maleko

Master's Thesis in Scientific Computing
Department of Numerical Analysis and Computer Science

Royal Institute of Technology
Stockholm 2003

Presented on: 20th January 2003.


Abstract

A Jacobi algorithm that diagonalizes more than one dense symmetric matrix simultaneously is developed in this thesis. The development is based on the original Jacobi algorithm, which diagonalizes one matrix at a time by performing Jacobi rotations until the matrix is diagonal to machine precision. The algorithm is then modified so as to solve the singular value decomposition for several dense matrices simultaneously.

First, we derive the algorithm that diagonalizes the matrices, and test the algorithm with different types of matrices. Second, we modify the algorithm to solve singular value problems. We test it with matrices obtained from experiments performed to determine the chemical composition of a sample mixture. The mixture contains some species, and the number of nonzero singular values gives the number of species contained in the mixture.

Sammanfattning

In this thesis a Jacobi algorithm is constructed that diagonalizes several dense symmetric matrices simultaneously. It builds on the Jacobi algorithm for diagonalization of one matrix by Jacobi rotations until the matrix is diagonal to machine precision. The algorithm is modified to solve the singular value decomposition of several matrices simultaneously.

First, we construct the algorithm for simultaneous diagonalization of several matrices, followed by tests on different types of matrices. Then we modify the algorithm to solve singular value problems. Tests are performed on matrices from experiments carried out to determine the chemical composition of a solution. The solution contains a number of species, and the number of species is given by the number of nonzero singular values.


Acknowledgements

I first wish to thank my supervisor, Prof. Axel Ruhe, for his continuous support and enlightening guidance during the completion and writing of my thesis.

I am also grateful to Dr. Lennart Edsberg (Associate Prof.) for being an excellent coordinator of the International Masters program at Nada, and for the fruitful contacts we made when I was applying to this program.

I owe many thanks to my sponsor, the Swedish Institute (SI), and in particular to Ms Karin Diff.

I would like to thank my office mates, Per-Olov, Reynir Gudmundsson and Henrik Olsson, for their advice and help. Thanks also to my colleagues for the day-to-day activities we shared during course work.

My heartfelt thanks go to my family and relatives for their support.

Last but not least, I want to specially thank Dr. H. M. Twaakyondo for his endless encouragement during the pursuit of my Masters degree and the composition of the thesis.


Contents

1 Introduction

2 Jacobi Algorithm for one Symmetric Matrix
  2.1 Formulation
  2.2 Convergence and Stopping criteria
      2.2.1 Stopping criteria
  2.3 How to choose an element
      2.3.1 Classical Jacobi
      2.3.2 Cyclic Jacobi

3 Jacobi Algorithm on several Symmetric Matrices
  3.1 Formulation
  3.2 Monitoring stopping criteria
  3.3 How to choose an element

4 Jacobi Algorithm to compute SVD
  4.1 Formulation
  4.2 Convergence and Stopping criteria

5 Numerical Tests on the Algorithm
  5.1 Convergence test with one symmetric matrix
      5.1.1 Testing with Hilbert Matrices
      5.1.2 Testing with Square membrane Matrices
  5.2 Testing with several Matrices
      5.2.1 With several commuting Matrices
      5.2.2 With several perturbed commuting Matrices
  5.3 Testing with Test Matrices for SVD
      5.3.1 With single Test Matrix for SVD
      5.3.2 With several Test Matrices for SVD
  5.4 Comparison Tests

6 Conclusion


To Dr. Hashim M. Twaakyondo


Chapter 1

Introduction

The method of Jacobi dates back to 1846, before the advent of the QR algorithm. The method solves the eigenvalue problem for real, dense symmetric matrices. The Jacobi algorithm reduces the dense matrix $A$ to a diagonal matrix $A'$ whose diagonal elements are the eigenvalues of the original matrix $A$. It employs a sequence of orthogonal similarity transformations. Each transformation is a plane rotation designed to annihilate one of the off-diagonal matrix elements. In each successive transformation the off-diagonal elements change and get smaller and smaller, until the matrix is diagonal to machine precision. It is a backward stable method and thus computes large eigenvalues (those near $\|A\|_2$ in magnitude) with high relative accuracy.

The Jacobi algorithm diagonalizes a symmetric matrix by performing basic Jacobi plane rotations $R(\theta)$, given as

$$R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \qquad (1.1)$$

The matrix $A$ can be $2\times 2$ or larger; the Jacobi method solves a sequence of $2\times 2$ subproblems. Given the $2\times 2$ subproblem $(i,j)$,

$$A = \begin{pmatrix} a_{ii} & a_{ij} \\ a_{ji} & a_{jj} \end{pmatrix},$$

the transformed matrix $A'$ will be

$$A' = R(i,j,\theta)^T A\, R(i,j,\theta) \qquad (1.2)$$


with

$$
R(i,j,\theta) =
\begin{pmatrix}
1 & & & & & & \\
& \ddots & & & & & \\
& & \cos\theta & & -\sin\theta & & \\
& & & \ddots & & & \\
& & \sin\theta & & \cos\theta & & \\
& & & & & \ddots & \\
& & & & & & 1
\end{pmatrix}
\qquad (1.3)
$$

where the $\cos\theta$ and $\sin\theta$ entries lie in rows and columns $i$ and $j$.

The angle $\theta$ is chosen so that the off-diagonal elements $a_{ij} = a_{ji}$ are reduced to zero. Note that only the entries in columns and rows $i$ and $j$ of $A$ will change. Since the matrix is symmetric, only the upper triangular part of $A$ needs to be computed. Jacobi's method is usually slower than the Householder reduction followed by QR that is used in Matlab, but it remains of interest because:

• It can sometimes compute tiny eigenvalues and their eigenvectors with high relative accuracy.

• It is an efficient method for computing eigenvalues of a nearly diagonal matrix.

• It can be used to diagonalize several symmetric matrices simultaneously, or to compute an approximate diagonalization when the matrices do not commute exactly. This is the main focus of this project.

In the second chapter of this thesis we explain the original Jacobi method, the Jacobi algorithm for one symmetric matrix, which we call algorithm J1. In the same chapter we discuss some conditions for the convergence of this algorithm. Chapter 3 deals with the diagonalization of several matrices simultaneously: there we derive the algorithm for simultaneous diagonalization of several symmetric matrices, and we also look at what conditions are suitable for convergence. Chapter 4 discusses the application of the Jacobi algorithm to the SVD, i.e. how the algorithm can be applied to singular value decomposition problems. Finally, in chapter 5 we show some numerical tests we have performed on the algorithm.


Chapter 2

Jacobi Algorithm for one Symmetric Matrix

As highlighted in chapter 1, the Jacobi method consists of a sequence of orthogonal similarity transformations. In this chapter we discuss the Jacobi algorithm as applied to a single matrix; this algorithm can also be found in various advanced linear algebra books [1] [5] [7].

2.1 Formulation

To construct the Jacobi transformation $R(i,j,\theta)$ on a matrix $A$ we take

$$A' = R(i,j,\theta)^T A\, R(i,j,\theta) \qquad (2.1)$$

$$= \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} a_{ii} & a_{ij} \\ a_{ji} & a_{jj} \end{pmatrix}
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

$$= \begin{pmatrix} \cos\theta\, a_{ii} + \sin\theta\, a_{ji} & \cos\theta\, a_{ij} + \sin\theta\, a_{jj} \\ -\sin\theta\, a_{ii} + \cos\theta\, a_{ji} & -\sin\theta\, a_{ij} + \cos\theta\, a_{jj} \end{pmatrix}
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

$$= \begin{pmatrix} c^2 a_{ii} + s^2 a_{jj} + sc\, a_{ij} + sc\, a_{ji} & -cs\, a_{ii} + cs\, a_{jj} + c^2 a_{ij} - s^2 a_{ji} \\ -sc\, a_{ii} + c^2 a_{ji} - s^2 a_{ij} + cs\, a_{jj} & s^2 a_{ii} - cs\, a_{ji} - cs\, a_{ij} + c^2 a_{jj} \end{pmatrix}$$

Since A is real and symmetric, aij = aji.


then

$$A' = \begin{pmatrix} c^2 a_{ii} + s^2 a_{jj} + 2sc\, a_{ij} & cs\,(a_{jj} - a_{ii}) + (c^2 - s^2)\, a_{ij} \\ cs\,(a_{jj} - a_{ii}) + (c^2 - s^2)\, a_{ij} & s^2 a_{ii} - 2cs\, a_{ij} + c^2 a_{jj} \end{pmatrix} \qquad (2.2)$$

where $c = \cos\theta$ and $s = \sin\theta$.

Expressing $A'$ as

$$A' = \begin{pmatrix} a'_{ii} & a'_{ij} \\ a'_{ji} & a'_{jj} \end{pmatrix}$$

we have

$$a'_{ij} = (\cos^2\theta - \sin^2\theta)\, a_{ij} + \cos\theta \sin\theta\, (a_{jj} - a_{ii}) \qquad (2.3)$$

$$a'_{ij} = \cos 2\theta\, a_{ij} + \tfrac{1}{2}\sin 2\theta\, (a_{jj} - a_{ii}) \qquad (2.4)$$

The $2 \times 2$ matrix will be diagonal if $a'_{ij} = 0$, that is, if

$$\tan 2\theta = \frac{-2\, a_{ij}}{a_{jj} - a_{ii}}. \qquad (2.5)$$

The transformed diagonal elements $a'_{ii}$ and $a'_{jj}$ are as follows:

$$a'_{ii} = \cos^2\theta\, a_{ii} + \sin^2\theta\, a_{jj} + 2\sin\theta \cos\theta\, a_{ij} \qquad (2.6)$$

$$a'_{jj} = \sin^2\theta\, a_{ii} - 2\cos\theta \sin\theta\, a_{ij} + \cos^2\theta\, a_{jj} \qquad (2.7)$$

From equation (2.5), we compute $\sin\theta$ and $\cos\theta$ from $\tan 2\theta$ by using the quadratic equation below,

$$t^2 + 2\tau t - 1 = 0, \qquad \tau = \frac{1}{\tan 2\theta}, \qquad (2.8)$$

and so

$$t = \frac{\mathrm{sign}(\tau)}{|\tau| + \sqrt{1 + \tau^2}}, \qquad (2.9)$$

then

$$c = 1/\sqrt{1 + t^2}, \qquad s = ct,$$

where $t = \tan\theta$, $s = \sin\theta$ and $c = \cos\theta$.

This choice of $\theta$ can be shown to minimize the difference

$$\|A' - A\|_F,$$

and in particular it prevents the exchange of the two diagonal elements $a_{ii}$ and $a_{jj}$ when $a_{ij}$ is small, which is critical for the convergence of the Jacobi method.

Hence the Jacobi Algorithm for a single matrix is given as follows

ALGORITHM J1

Given a matrix $A$ and $n$, the size of the matrix. Set $V = I$.
for i = 1:n-1
    for j = i+1:n
        Find the angle $\theta$ that transforms $a(i,j)$ to zero.
        Get the rotation matrix $R(i,j,\theta)$.
        Rotate the matrix:
            $A' = R(i,j,\theta)^T A R(i,j,\theta)$
        Accumulate the transformations:
            $V' = V R(i,j,\theta)$
    endfor
endfor
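To make the algorithm concrete, the following is a minimal Matlab sketch of Algorithm J1. The function name jacobi1 and the full-matrix update A = R'*A*R are our illustrative choices, not the thesis code; an efficient implementation would update only rows and columns i and j.

function [A, V] = jacobi1(A, tol)
% Minimal sketch of Algorithm J1: cyclic Jacobi sweeps on a real
% symmetric matrix A until the off-diagonal part is below tol.
n = size(A, 1);
V = eye(n);
while norm(A - diag(diag(A)), 'fro') > tol*norm(A, 'fro')
    for i = 1:n-1
        for j = i+1:n
            if A(i,j) ~= 0
                tau = (A(i,i) - A(j,j)) / (2*A(i,j));  % tau = cot(2*theta), eq. (2.8)
                t = sign(tau) / (abs(tau) + sqrt(1 + tau^2));   % eq. (2.9)
                if tau == 0, t = 1; end    % a_ii = a_jj: take theta = pi/4
                c = 1/sqrt(1 + t^2);  s = c*t;
                R = eye(n);  R([i j], [i j]) = [c -s; s c];
                A = R'*A*R;    % similarity transformation, eq. (2.1)
                V = V*R;       % accumulate the eigenvectors
            end
        end
    end
end
end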

2.2 Convergence and Stopping criteria

The convergence of the Jacobi method can be analysed by considering the sum of squares of the off-diagonal elements,


$$S = \sum_{i \neq j} |a_{ij}|^2.$$

Equations (2.3), (2.6) and (2.7) imply that

$$S' = S - 2\,|a_{ij}|^2,$$

i.e. the sum of squares of off-diagonal elements decreases by $2|a_{ij}|^2$ at each rotation. The matrix $A'$ is, as said, diagonal to machine precision when $S \leq \mathrm{eps}^2\,\|A\|_F^2$. The diagonal elements are then the eigenvalues of $A$, since

$$A' = V^T A V \qquad (2.10)$$

where

$$V = R(i,j,\theta)_1\, R(i,j,\theta)_2\, R(i,j,\theta)_3 \dots$$

with $R(i,j,\theta)_k$ being the successive Jacobi rotation matrices, and the columns of $V$ are the eigenvectors, since

$$V A' = A V \qquad (2.11)$$

2.2.1 Stopping criteria

The stopping criterion for the Jacobi algorithm is a small off-diagonal quantity, such that

$$|a_{ij}| < \delta\, \big(\max(|a_{ii}|, |a_{jj}|)\big)$$

2.3 How to choose an element

At each step of the Jacobi transformation an element is annihilated. There are several strategies for choosing which element to annihilate.

2.3.1 Classical Jacobi

This strategy searches the whole upper triangle at each stage and sets the largest off-diagonal element to zero. It is a reasonable strategy for hand calculation, but it is prohibitive on a computer, since the search alone makes each Jacobi rotation a process of order $N^2$ instead of $N$. Here $N = n(n-1)/2$, where $n$ is the matrix dimension.


2.3.2 Cyclic Jacobi

This strategy annihilates the elements in a predetermined order regardless of their size. Each element is rotated exactly once in any sequence of $N$ rotations, called a sweep. The convergence is generally quadratic. In cyclic Jacobi the ordering can be a row-wise, column-wise or diagonal-wise scheme. The most popular cyclic ordering is the row-wise scheme.

In the row-wise ordering scheme the rotations are performed in the following order:

$$\begin{matrix}
(1,2) & (1,3) & (1,4) & \dots & (1,n) \\
      & (2,3) & (2,4) & \dots & (2,n) \\
      &       & \dots & \dots &       \\
      &       &       &       & (n-1,n)
\end{matrix} \qquad (2.12)$$

which is repeated cyclically.


Chapter 3

Jacobi Algorithm on several Symmetric Matrices

As we have seen in chapter 2, the Jacobi algorithm finds the eigenvalues and eigenvectors of a dense symmetric matrix by driving all the off-diagonal elements to zero, leaving a diagonal matrix whose diagonal elements are the eigenvalues of the matrix. We also saw how orthogonal eigenvectors can be obtained by accumulating the transformations (Algorithm J1).

When dealing with more than one dense matrix, it is difficult to annihilate all the off-diagonal elements; instead, the best we can do is to minimize the sum of squares of their off-diagonal elements. To diagonalize the matrices simultaneously, we first have to find the angle $\theta$ from the diagonal and off-diagonal elements of all the matrices involved. This angle $\theta$ is then used to transform all the matrices toward diagonal form.

3.1 Formulation

Given, for example, three real $2 \times 2$ symmetric matrices $A$, $B$, $C$, then according to equation (2.1),

$$A' = R(i,j,\theta)^T A\, R(i,j,\theta)$$

$$B' = R(i,j,\theta)^T B\, R(i,j,\theta)$$

$$C' = R(i,j,\theta)^T C\, R(i,j,\theta)$$

by equation (2.2)


$$A' = \begin{pmatrix} c^2 a_{ii} + s^2 a_{jj} + 2sc\, a_{ij} & cs\,(a_{jj} - a_{ii}) + (c^2 - s^2)\, a_{ij} \\ cs\,(a_{jj} - a_{ii}) + (c^2 - s^2)\, a_{ij} & s^2 a_{ii} - 2cs\, a_{ij} + c^2 a_{jj} \end{pmatrix} \qquad (3.1)$$

$$B' = \begin{pmatrix} c^2 b_{ii} + s^2 b_{jj} + 2sc\, b_{ij} & cs\,(b_{jj} - b_{ii}) + (c^2 - s^2)\, b_{ij} \\ cs\,(b_{jj} - b_{ii}) + (c^2 - s^2)\, b_{ij} & s^2 b_{ii} - 2cs\, b_{ij} + c^2 b_{jj} \end{pmatrix} \qquad (3.2)$$

$$C' = \begin{pmatrix} c^2 c_{ii} + s^2 c_{jj} + 2cs\, c_{ij} & cs\,(c_{jj} - c_{ii}) + (c^2 - s^2)\, c_{ij} \\ cs\,(c_{jj} - c_{ii}) + (c^2 - s^2)\, c_{ij} & s^2 c_{ii} - 2cs\, c_{ij} + c^2 c_{jj} \end{pmatrix} \qquad (3.3)$$

The off-diagonal elements can be expressed as

$$a'_{ij} = (c^2 - s^2)\, a_{ij} + cs\,(a_{jj} - a_{ii}) \qquad (3.4)$$

$$b'_{ij} = (c^2 - s^2)\, b_{ij} + cs\,(b_{jj} - b_{ii}) \qquad (3.5)$$

$$c'_{ij} = (c^2 - s^2)\, c_{ij} + cs\,(c_{jj} - c_{ii}) \qquad (3.6)$$

Generally, the sum of squares of off-diagonal elements is given by

$$SS = \sum_{k=1}^{p} \sum_{i \neq j}^{n} \left(a'^{\,2}_{ij}\right)_k, \qquad k = 1, 2, 3, \dots, p \text{ the number of matrices involved.} \qquad (3.7)$$

So the sum of squares for $A$, $B$, $C$ will be

$$SS = a'^{\,2}_{ij} + b'^{\,2}_{ij} + c'^{\,2}_{ij} \qquad (3.8)$$

Substituting equations (3.4)-(3.6) into (3.8), we have

$$SS = \cos^2 2\theta\, D + \tfrac{1}{4}\sin^2 2\theta\, E + \sin 2\theta \cos 2\theta\, F \qquad (3.9)$$

where

$$D = a_{ij}^2 + b_{ij}^2 + c_{ij}^2 \qquad (3.10)$$

$$E = (a_{jj} - a_{ii})^2 + (b_{jj} - b_{ii})^2 + (c_{jj} - c_{ii})^2 \qquad (3.11)$$

$$F = a_{ij}(a_{jj} - a_{ii}) + b_{ij}(b_{jj} - b_{ii}) + c_{ij}(c_{jj} - c_{ii}) \qquad (3.12)$$


Hence equation (3.9) gives the sum of squares of off-diagonal elements of the three matrices. Assuming we now have matrices $A, B, C, \dots, P$, we define

$$
v_{ij} = \begin{pmatrix} a_{ij} \\ b_{ij} \\ c_{ij} \\ \vdots \\ p_{ij} \end{pmatrix}, \qquad
v_{ii} = \begin{pmatrix} a_{ii} \\ b_{ii} \\ c_{ii} \\ \vdots \\ p_{ii} \end{pmatrix}, \qquad
v_{jj} = \begin{pmatrix} a_{jj} \\ b_{jj} \\ c_{jj} \\ \vdots \\ p_{jj} \end{pmatrix} \qquad (3.13)
$$

where $v_{ij}$ is a vector containing the $(i,j)$ off-diagonal elements of each matrix, $v_{ii}$ is a vector with the $(i,i)$ diagonal elements of each matrix and $v_{jj}$ is a vector with the $(j,j)$ diagonal elements. Then from equations (3.10)-(3.12) and (3.13) we have

$$D = v_{ij}^T v_{ij} = \|v_{ij}\|^2, \qquad E = \|v_{jj} - v_{ii}\|^2, \qquad F = v_{ij}^T (v_{jj} - v_{ii})$$

Using the facts that

$$\cos 4\theta = \cos^2 2\theta - \sin^2 2\theta \qquad \text{and} \qquad \sin 4\theta = 2\cos 2\theta \sin 2\theta,$$

the sum of squares $SS$ from equation (3.9) can now be expressed as

$$SS = \cos 4\theta \left(\frac{D - E/4}{2}\right) + \left(\frac{D + E/4}{2}\right) + \frac{F}{2}\,\sin 4\theta$$

To find the angle $\theta$ that transforms each matrix closest to diagonal form simultaneously we let

$$s = \frac{D - E/4}{2}, \qquad g = \frac{F}{2} \qquad \text{and} \qquad k = \frac{D + E/4}{2}.$$

This implies that

$$SS = s\,\cos 4\theta + g\,\sin 4\theta + k$$


In amplitude-phase angle form, $SS$ is expressed as

$$SS = r\,\sin(4\theta + \phi) + k$$

with $r = \sqrt{s^2 + g^2}$. Here $\phi$ is the phase angle, given by

$$\phi = \tan^{-1}\!\left(\frac{s}{g}\right) \ \ \text{if } g > 0, \qquad \phi = \pi + \tan^{-1}\!\left(\frac{s}{g}\right) \ \ \text{if } g < 0.$$

With the sine function, $SS$ is minimal when either $\theta = \tfrac{1}{4}(3\pi/2 - \phi)$ or $\theta = \tfrac{1}{4}(-\pi/2 - \phi)$.

The Jacobi rotation matrix $R(\theta)$ is obtained from $\theta$, and the matrices are transformed toward diagonal form; see Algorithm J1. The convergence is quadratic, measured in the Frobenius norm,

$$\|A' - A\|_F = \Big(\sum_{i \neq j} SS_{ij}\Big)^{1/2}, \qquad SS = a'^{\,2}_{ij} + b'^{\,2}_{ij} + c'^{\,2}_{ij} + \dots + p'^{\,2}_{ij}$$

Therefore, in a nutshell, the following is the algorithm for the diagonalization of several matrices simultaneously.

ALGORITHM J2.

Given $A_1, A_2, \dots, A_p$.
Get $n$, the size of the matrices $A_k$ (all matrices must be of the same size). Set $V = I$.
for i = 1:n-1
    for j = i+1:n
        Find the angle $\theta$ that minimizes the sum of squares of the $(i,j)$ off-diagonal elements of all $p$ matrices.
        Get the rotation matrix $R(i,j,\theta)$.
        Rotate the matrices:
            $A'_k = R(i,j,\theta)^T A_k R(i,j,\theta)$
        Accumulate the transformations:
            $V' = V R(i,j,\theta)$
    endfor
endfor
Get the sum of squares of off-diagonal elements for all matrices.
Check the stopping criterion.
Go on to the next sweep if the stopping criterion is not yet fulfilled.
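As an illustration, here is a minimal Matlab sketch of Algorithm J2. The function name jacobiJ2, the cell-array interface and the angle-based stopping test are our illustrative assumptions, not the thesis code; note that atan2 implements the two branch cases of the phase angle $\phi$ at once.

function [As, V] = jacobiJ2(As, maxsweeps, tol)
% Minimal sketch of Algorithm J2: simultaneous diagonalization of the
% p symmetric n-by-n matrices in the cell array As = {A1, A2, ...}.
n = size(As{1}, 1);  p = numel(As);
V = eye(n);
for sweep = 1:maxsweeps
    maxtheta = 0;
    for i = 1:n-1
        for j = i+1:n
            vij = cellfun(@(A) A(i,j), As(:));   % vectors of eq. (3.13)
            vii = cellfun(@(A) A(i,i), As(:));
            vjj = cellfun(@(A) A(j,j), As(:));
            D = vij'*vij;                        % eq. (3.10)
            E = norm(vjj - vii)^2;               % eq. (3.11)
            F = vij'*(vjj - vii);                % eq. (3.12)
            if D == 0 && E == 0, continue; end   % nothing to rotate here
            s = (D - E/4)/2;  g = F/2;
            phi = atan2(s, g);                   % phase angle of SS
            theta = (-pi/2 - phi)/4;             % minimizes r*sin(4*theta+phi)+k
            c = cos(theta);  st = sin(theta);
            R = eye(n);  R([i j], [i j]) = [c -st; st c];
            for k = 1:p, As{k} = R'*As{k}*R; end
            V = V*R;
            maxtheta = max(maxtheta, abs(theta));
        end
    end
    if maxtheta < tol, break; end   % angles vanish: SS is invariant (section 3.2)
end
end

For p = 1 this reduces to the classical case of chapter 2: the minimizing angle then zeroes the off-diagonal element exactly.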

3.2 Monitoring stopping criteria

The stopping criterion for the Jacobi algorithm on several matrices is based on the sum of squares of off-diagonal elements. Since in each rotation the sum of squares of off-diagonal elements decreases, eventually the calculated angle $\theta$ goes to zero. At this point there are no more transformations, which means that the sum of squares of off-diagonal elements has become invariant under the transformation. We stop the process when this stage is reached. When the algorithm stops, $F = v_{ij}^T(v_{jj} - v_{ii})$ is small, which means that the vector $v_{ij}$ of off-diagonal elements is orthogonal to the vector $(v_{jj} - v_{ii})$ of differences of diagonal elements.

3.3 How to choose an element

The choice of which element to annihilate first is based on cyclic Jacobi. Here we use the most popular ordering scheme, the row-wise scheme.


Chapter 4

Jacobi Algorithm to compute SVD

Using the Jacobi algorithm on several matrices from chapter 3, whose purpose is to diagonalize several dense symmetric matrices simultaneously, we can also compute the singular values and singular vectors of several dense matrices.

4.1 Formulation

From the singular value decomposition (SVD) point of view, the SVD of a real nonsymmetric matrix $A$ is a factorization

$$A = U \Sigma V^T$$

where $U \in \mathbb{R}^{m \times m}$ is orthogonal, $V \in \mathbb{R}^{n \times n}$ is orthogonal, and $\Sigma \in \mathbb{R}^{m \times n}$ is diagonal.

It is assumed that the diagonal entries $\sigma_j$ (the singular values) of $\Sigma$ are nonnegative and in non-increasing order, that is, $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_r \geq 0$, where $r = \min(m,n)$. The columns of $U$ and $V$ are the left and right singular vectors of $A$ respectively.

The following algorithm is the general Jacobi algorithm used to find the singular values and singular vectors of $p$ real $m \times n$ matrices $X_1, X_2, X_3, \dots, X_p$, with $m > n$. Here we assume that the difference between $m$ and $n$ is not very large.

ALGORITHM J3

Given the matrices $X_1, X_2, X_3, \dots, X_p$, with $m > n$:

Get $A_k = X_k^T X_k$. Set $V = I$.
Find the angle $\theta$ and rotate the matrices with the Jacobi rotation $R(i,j,\theta)$;
the $X_k$ are transformed from the right:
$$V = V R(\theta)$$

Get $C_k = X_k X_k^T$. Set $U = I$.
Find the angle $\theta$ and rotate the matrices with the Jacobi rotation $R(i,j,\theta)$;
the $X_k$ are transformed from the left:
$$U = U R(\theta)$$

The approximate SVD of $X_k$ is obtained as

$$\Sigma_k = U^T X_k V \qquad (4.1)$$

The columns of $V$ and $U$ are the right and left singular vectors respectively.

Now, to use the Jacobi algorithm on $p$ real $m \times n$ matrices $X_1, X_2, X_3, \dots, X_p$ with $m \gg n$, we use the same method as above but with slight modifications, because $m$ is often very much larger than $n$.

ALGORITHM J4

Get $A_k = X_k^T X_k$. Set $V = I$.
Apply Jacobi transformations $R(\theta)$ in coordinates $i,j$ of the $A_k$ to get the right orthogonal columns of the $X_k$, i.e. $V$;
the $X_k$ are transformed from the right:
$$V = V R(\theta)$$

Get $G = [X_1, X_2, X_3, \dots, X_p]$.
Get $Q$ from the QR factorization of $G$, that is

[Q, r] = qr(G, 0)

Let $Y_k = Q^T X_k$; this reduces the dimension of the matrices. Get $C_k = Y_k Y_k^T$. Set $U = I$.
Apply Jacobi transformations $R(\theta)$ in coordinates $i,j$ of the symmetric matrices $C_k$ to get the left orthogonal columns of the $X_k$, i.e. $U$;
the $X_k$ are transformed from the left:
$$U = U R(\theta)$$

The approximate SVD of $X_k$ is obtained by

$$\Sigma_k = U^T Q^T X_k V \qquad (4.2)$$

$V$ contains the right singular vectors and $U$ the left singular vectors.

The two algorithms above are almost the same, since they both find the singular values and singular vectors of nonsymmetric dense matrices. The only difference is that the latter is able to handle the situation when the difference between $m$ and $n$ is large, by first performing a QR factorization. A Matlab sketch of Algorithm J4 is given below.
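The sketch reuses the jacobiJ2 sketch from chapter 3; the function name svdJ4 and the cell-array interface are again our illustrative assumptions.

function [Sigmas, U, V] = svdJ4(Xs, maxsweeps, tol)
% Minimal sketch of Algorithm J4 for the m-by-n data matrices in the
% cell array Xs = {X1, X2, ...}, with m >> n. Requires jacobiJ2.
As = cellfun(@(X) X'*X, Xs, 'UniformOutput', false);   % A_k = X_k' X_k
[~, V] = jacobiJ2(As, maxsweeps, tol);                 % right singular vectors
G = horzcat(Xs{:});                                    % G = [X1, X2, ..., Xp]
[Q, ~] = qr(G, 0);                                     % economy-size QR of G
Ys = cellfun(@(X) Q'*X, Xs, 'UniformOutput', false);   % Y_k = Q' X_k
Cs = cellfun(@(Y) Y*Y', Ys, 'UniformOutput', false);   % C_k = Y_k Y_k'
[~, U] = jacobiJ2(Cs, maxsweeps, tol);                 % left singular vectors
Sigmas = cellfun(@(X) U'*(Q'*X)*V, Xs, ...             % eq. (4.2)
                 'UniformOutput', false);
end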

4.2 Convergence and Stopping criteria

To study the convergence, we proceed as for the Jacobi algorithm on several matrices: we look at the sum of squares of off-diagonal elements for each sweep. We find that the convergence is somewhat different from before. Here the sum of squares of off-diagonal elements decreases very slowly in the beginning, and then it remains constant. This means that the angle $\theta$ that transforms the matrices is so small that it cannot reduce the elements any more. When this point is reached we stop: the process terminates as soon as there are no more changes in the sum of squares of off-diagonal elements, that is, when the absolute value of the angle $\theta$ is smaller than a given tolerance.


Chapter 5

Numerical Tests on the Algorithm

Using the Jacobi algorithm on one matrix, on several matrices and for the SVD, we perform several convergence tests based on the Frobenius norm. We first test with standard matrices and then with matrices obtained from the determination of the chemical composition of a sample mixture, whereby light of varying wavelength is shone onto the sample and the wavelength of the reflected light is recorded. In this way the spectrum of the sample is obtained. The results of this experiment are stored in a matrix with $m$ rows and $n$ columns, where $n$ is the number of spectra (recorded at different locations and angles) and $m$ is the number of points at which each spectrum is sampled. Usually $m \gg n$.

5.1 Convergence test with one symmetric matrix

Here we test the Jacobi algorithm using matrices such as Hilbert matrices and square membrane matrices; the convergence is quadratic and is reached at about the fifth sweep.

5.1.1 Testing with Hilbert Matrices

Hilbert matrices are symmetric matrices with elements $H_{ij} = (i+j-1)^{-1}$ for $i,j = 1, 2, \dots, n$. Testing with these matrices in their original form, that is without perturbing them, the convergence is fast, as in figure 5.1.
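A hypothetical test run of this kind, using the jacobi1 sketch from chapter 2 together with Matlab's built-in hilb and eig:

n = 8;
H = hilb(n);                    % Hilbert matrix, H(i,j) = 1/(i+j-1)
[D, V] = jacobi1(H, 1e-15);     % Jacobi sweeps until nearly diagonal
disp(sort(diag(D)) - eig(H))    % compare with Matlab's eigenvalues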

5.1.2 Testing with Square membrane Matrices

Square membrane matrices are large matrices with some double eigenvalues. We tested the algorithm using one matrix of this class, and the convergence was as expected; see figure 5.2.


[Figure: semilogarithmic plot, sweeps against the square root of the sum of squares of off-diagonal elements.]

Figure 5.1: Convergence plot using a Hilbert matrix. The transformations were stopped when the sum of squares of off-diagonal elements became very small; as seen, the convergence is fast.

[Figure: semilogarithmic plot, sweeps against the square root of the sum of squares of off-diagonal elements.]

Figure 5.2: Convergence plot with one square membrane matrix. The sum of squares of off-diagonal elements decreases reasonably in each sweep. We stop the transformations when they get down to $10^{-16}$.


[Figure: semilogarithmic plot, sweeps against the square root of the sum of squares of off-diagonal elements.]

Figure 5.3: Convergence plot using several commuting matrices. The sum of squares of off-diagonal elements decreases very fast from the beginning, because of the commuting property of the matrices. We stopped the transformations when the sum of squares of off-diagonal elements was reasonably small.

5.2 Testing with several Matrices

We test the Jacobi algorithm using several matrices that commute with each other, that is, they have the same eigenvectors. Then we perturb them and study the convergence.

5.2.1 With several commuting Matrices

Here Algorithm J2 is tested using commuting matrices created in the following way:

1. Choose $n$, the size of the matrices you want to have.

2. Create an $n \times n$ matrix of normally distributed random numbers, e.g. L = randn(n).

3. Do the QR factorization of L, that is [Q, r] = qr(L).

4. Depending on the number of matrices you want to have, choose real eigenvalues in diagonal matrices $D_k$, $k = 1, 2, 3, \dots, p$.

5. Create the matrices as $A_k = Q D_k Q^T$.

After generating our matrices as above, we run Algorithm J2 (a Matlab sketch of the construction follows below). We noticed that the convergence was very fast, as seen in figure 5.3.
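The construction sketched in Matlab; the eigenvalue choice randn(n,1) and the call to the jacobiJ2 sketch of chapter 3 are our illustrative assumptions:

n = 8;  p = 5;
L = randn(n);                       % step 2: random matrix
[Q, r] = qr(L);                     % step 3: orthogonal factor Q
As = cell(1, p);
for k = 1:p
    Dk = diag(randn(n, 1));         % step 4: real eigenvalues for matrix k
    As{k} = Q*Dk*Q';                % step 5: A_k = Q D_k Q^T, so all A_k commute
end
[As, V] = jacobiJ2(As, 30, 1e-12);  % simultaneous diagonalization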


[Figure: semilogarithmic plot, sweeps against the square root of the sum of squares of off-diagonal elements.]

Figure 5.4: Convergence using several perturbed commuting matrices. The matrices are nearly commuting. The size of the perturbation is around 1e-05, and as seen the sum of squares goes down reasonably fast in the beginning.

5.2.2 With several perturbed commuting Matrices

We perturb the commuting matrices symmetrically and randomly with different values of $\delta$, for instance $\delta = 10^{-15}, 10^{-5}, 10^{-3}, \dots$, and then transform them (run Algorithm J2 on them). Our intention was to see what happens if commutativity is disturbed a bit: can the matrices still be diagonalized? What are the effects of the perturbations on the convergence speed?

The steps involved in perturbing the matrices were as follows (a Matlab sketch follows the list).

1. Having $A_k = Q D_k Q^T$.

2. With different values of $\delta$, generate normally distributed random numbers $E_k = \delta \cdot \mathrm{randn}(n)$.

3. Then perturb the matrices as $A_k = Q D_k Q^T + E_k$.

4. With the $A_k$, call Algorithm J2.
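In Matlab, continuing the construction of section 5.2.1 (we symmetrize $E_k$ so that the matrices stay symmetric; that symmetrization is our reading of "perturb symmetrically"):

delta = 1e-5;                       % perturbation size, e.g. 1e-15, 1e-5, 1e-3
As = cell(1, p);
for k = 1:p
    Dk = diag(randn(n, 1));
    Ek = delta*randn(n);            % step 2: random perturbation
    Ek = (Ek + Ek')/2;              % keep A_k symmetric (our assumption)
    As{k} = Q*Dk*Q' + Ek;           % step 3: nearly commuting matrices
end
[As, V] = jacobiJ2(As, 30, 1e-12);  % step 4: run Algorithm J2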

The results in figure 5.4 show that the speed of convergence decreases as the perturbation size increases. When the perturbations are large we cannot diagonalize the matrices at all; this can be seen in figure 5.5.


[Figure: semilogarithmic plot, sweeps against the square root of the sum of squares of off-diagonal elements.]

Figure 5.5: Convergence plot using several perturbed commuting matrices. The size of the perturbation is around 1e-02. The effect of large perturbations is seen very early, and the off-diagonal elements did not decrease, since the matrices are not nearly commuting.

[Figure: plot, sweeps against the square root of the sum of squares of off-diagonal elements, with curves for delta = 1e-16, 1e-08 and 1e-05.]

Figure 5.6: Convergence plot showing the variation of convergence speed with $\delta$. As seen, the convergence speed decreases with increasing $\delta$.


5.3 Testing with Test Matrices for SVD

The data matrices described at the beginning of this chapter were tested as follows.

5.3.1 With single Test Matrix for SVD

First, we test with only one matrix at a time. Here we are interested in the singular value decomposition of $X$, so we form symmetric matrices from it and do the diagonalization based on the following steps (a Matlab sketch follows the list).

1. Input the matrix $X$; $X$ is $m \times n$, $m \gg n$.

2. Form $A = X^T X$, an $n \times n$ symmetric matrix.

3. With $A$, call Algorithm J2, with $V$ as right singular vectors of $X$.

Here $X$ is transformed from the right. To transform $X$ from the left:

4. Input the matrix $X$; $X$ is $m \times n$, $m \gg n$.

5. Make the QR factorization, that is [Q, r] = qr(X, 0).

6. Then $X = Q^T X$.

7. Form $C = X X^T$, an $n \times n$ symmetric matrix.

8. With $C$, call Algorithm J2, with $U$ as left singular vectors of $X$.

The approximate SVD of $X$ is obtained by

$$\Sigma = U^T Q^T X V \qquad (5.1)$$

Convergence was as expected for a single matrix, and is shown in figure 5.7.
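A hypothetical check of these steps against Matlab's built-in svd, using the jacobiJ2 sketch with p = 1:

X = randn(500, 8);                     % a random m-by-n test matrix, m >> n
[A, V] = jacobiJ2({X'*X}, 30, 1e-12);  % steps 2-3: V from X'X
[Q, r] = qr(X, 0);  Y = Q'*X;          % steps 5-6: reduce X to n-by-n
[~, U] = jacobiJ2({Y*Y'}, 30, 1e-12);  % steps 7-8: U from YY'
Sigma = U'*Q'*X*V;                     % eq. (5.1): approximate SVD of X
disp(sqrt(sort(diag(A{1}))) - sort(svd(X)))  % sqrt of eigenvalues of X'X
                                             % vs Matlab's singular values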

5.3.2 With several Test Matrices for SVD

The tests of our algorithm continue with more than one matrix; the following steps were used. Note that for the SVD case the matrices have to be transformed from both directions, so as to get both left and right singular vectors. Given the matrices $X_1, X_2, X_3, \dots, X_p$, with $m \gg n$:


[Figure: semilogarithmic plot, sweeps against the square root of the sum of squares of off-diagonal elements.]

Figure 5.7: Convergence plot using a single data matrix. The sum of squares of off-diagonal elements converges slowly in the beginning and later very fast. We stop the transformations when the difference between successive sums of squares becomes reasonably small.

1. Get $A_k = X_k^T X_k$.

2. Having $A_1, A_2, A_3, \dots, A_p$, call Algorithm J2; the columns of $V$ are the right singular vectors of the $X_k$.

Here $X_1, X_2, X_3, \dots, X_p$ are transformed from the right. To transform them from the left:

3. Get $G = [X_1, X_2, X_3, \dots, X_p]$.

4. Get $Q$ from the QR factorization of $G$, that is [Q, r] = qr(G, 0).

5. Let $Y_k = Q^T X_k$.

6. Get $C_k = Y_k Y_k^T$.

7. Having the $C_k$, $k = 1, 2, 3, \dots, p$, call Algorithm J2; the columns of $U$ are the left singular vectors of the $X_k$.

The matrices $X_1, X_2, X_3, \dots, X_p$ are transformed from the left. The approximate SVD of $X_k$ is obtained by

$$\Sigma_k = U^T Q^T X_k V \qquad (5.2)$$


The results were nearly diagonal matrices $D_k$ containing the squared singular values of the $X_k$.

5.4 Comparison Tests

We perform some tests to compare our algorithm with the Matlab functions that compute eigenvalues and singular values of matrices, though in Matlab these functions handle one matrix at a time.

To start with, we simultaneously diagonalize 5 matrices using our algorithm; then, using Matlab functions, we compute the eigenvalues of these 5 matrices separately. We then plot the eigenvalues obtained both ways in the same plot for each matrix. The results are shown in figure 5.8.

To compare the singular values from our algorithm and from the Matlab functions, we compute the singular values of 5 matrices simultaneously and plot them in the same plot as those computed by Matlab. Figure 5.9 shows the plots.

We also looked at the singular vectors computed by the Matlab functions and by our code. The matrix below is the product of the matrix of singular vectors $V_m$ from Matlab and the matrix of singular vectors $V_c$ from our code, for the first four columns. As can be seen, the columns of these matrices seem to be nearly orthogonal, as their product is nearly an identity matrix. However, since the matrix $V_c$ is a common matrix (the same for all 5 matrices), this product gives the angles between the singular vectors of one of the matrices computed by Matlab and the common singular vectors estimated by our code. We see that the direction of the leading singular vector is very close to the leading common vector. The following two vectors are also rather close, but there is no reason to assume that the matrices have more than 3 singular directions in common. This indicates that our algorithm gives the expected results for matrices which are not commuting.

$$
V_m^T V_c =
\begin{pmatrix}
-1.0000\mathrm{e}{+00} & +1.8992\mathrm{e}{-05} & +1.1744\mathrm{e}{-06} & +2.2473\mathrm{e}{-05} \\
-2.2212\mathrm{e}{-05} & -9.9581\mathrm{e}{-01} & -2.6732\mathrm{e}{-02} & -8.1958\mathrm{e}{-02} \\
-9.7989\mathrm{e}{-07} & +5.3022\mathrm{e}{-02} & -9.0622\mathrm{e}{-01} & -3.8857\mathrm{e}{-01} \\
-2.6918\mathrm{e}{-04} & -5.8599\mathrm{e}{-02} & -2.0445\mathrm{e}{-01} & +6.4174\mathrm{e}{-01} \\
+7.5146\mathrm{e}{-04} & -3.2616\mathrm{e}{-02} & -9.1385\mathrm{e}{-02} & +3.3494\mathrm{e}{-01} \\
+3.4483\mathrm{e}{-05} & -9.7780\mathrm{e}{-03} & -3.4060\mathrm{e}{-01} & +4.6728\mathrm{e}{-01} \\
+2.3545\mathrm{e}{-04} & +2.3448\mathrm{e}{-02} & +1.0283\mathrm{e}{-01} & -3.1501\mathrm{e}{-01} \\
-2.3818\mathrm{e}{-05} & -3.6273\mathrm{e}{-02} & +2.6329\mathrm{e}{-02} & -4.6572\mathrm{e}{-01}
\end{pmatrix}
\qquad (5.3)
$$


[Figure: five semilogarithmic panels, (a) First matrix through (e) Fifth matrix; each plots eigenvalue number against the eigenvalues, comparing "Eigenvalues from the code" with "Eigenvalues from Matlab".]

Figure 5.8: The plots of the eigenvalues from Matlab and from the code. The eigenvalues are in increasing order.


[Figure: five semilogarithmic panels, (a) First matrix through (e) Fifth matrix; each plots singular value number against the singular values, comparing "singular values from the code" with "singular values from Matlab".]

Figure 5.9: The plots of the singular values from Matlab and from the code. The singular values are in decreasing order.


Chapter 6

Conclusion

In this thesis we have created an algorithm that diagonalizes several symmetric matrices simultaneously. For convergence, we concentrated mainly on whether the sum of squares of off-diagonal elements of all the matrices is minimized or not. We learned that the degree of convergence depends very much on the commutativity of the matrices. We found that when the matrices are commuting or nearly commuting, we get full diagonalization and hence minimization of the sum of squares of off-diagonal elements. In this case the eigenvalues obtained from the algorithm were the same as those computed with Matlab functions. Figure 6.1 plots the eigenvalues obtained when this property is fulfilled. The convergence was reasonably fast. With this property, we also plot the singular values obtained from the algorithm and from Matlab functions; the singular values were the same in both, as in figure 6.2. Convergence was fast.

When the matrices are different, the minimization is only slight, which means that the diagonalization is not full (partial diagonalization). In this case it is not possible to get all eigenvalues or singular values exactly. We got the first eigenvalue and singular value of each matrix exactly as with the Matlab functions; the rest were a bit different, as seen in figures 5.8 and 5.9.

The minimization of the sum of squares of off-diagonal elements depends on the angle $\theta$, which is calculated using both off-diagonal and diagonal elements. Eventually, no matter how many rotations are performed, there is no further decrease in the off-diagonal elements. This means that the angle $\theta$ is zero or very small and that $A'$ is diagonal or cannot be diagonalized any further. At this stage we invoke the stopping criterion and consider the matrix diagonal or nearly diagonal. It is diagonal if all off-diagonal elements are very small compared to those on the diagonal; the diagonal elements of $A'$ are then the eigenvalues of $A$, or the diagonal elements of $\Sigma$ are the singular values of $X$.

So, from the results of the tests with matrices for the SVD, and especially from the matrix (5.3), where the first three columns give diagonal elements 1, 0.9958 and 0.90622, we are able to say that the sample mixture contained three species, that is, three nonzero singular values.


[Figure: semilogarithmic plot, eigenvalue number against the eigenvalues, comparing "Eigenvalues from the code" with "Eigenvalues from Matlab".]

Figure 6.1: The figure shows eigenvalues of commuting matrices computed by Matlab and by the code. The eigenvalues are in increasing order.

[Figure: semilogarithmic plot, singular value number against the singular values, comparing "singular values from the code" with "singular values from Matlab".]

Figure 6.2: The figure shows singular values of commuting matrices computed by Matlab and by the code. The singular values are in increasing order.


Bibliography

[1] Beresford N. Parlett, The Symmetric Eigenvalue Problem, Prentice-Hall Series in Computational Mathematics, 1980, chapters 6 and 9.

[2] Germund Dahlquist and Åke Björck, Numerical Mathematics, Volume 2, manuscript (August 1998).

[3] Lloyd N. Trefethen and David Bau, III, Numerical Linear Algebra, Society for Industrial and Applied Mathematics (SIAM), 1997.

[4] G. H. Golub and C. F. Van Loan, Matrix Computations, Baltimore, MD: The Johns Hopkins Univ. Press, 1989.

[5] James W. Demmel, Applied Numerical Linear Algebra, Society for Industrial and Applied Mathematics (SIAM), 1997.

[6] G. W. Stewart, Matrix Algorithms, Volume 1: Basic Decompositions, Society for Industrial and Applied Mathematics (SIAM), 1998.

[7] Michael T. Heath, Scientific Computing: An Introductory Survey, McGraw-Hill Series in Computer Science, 1997.
