
Page 1:

Linear Algebra and Complexity

Chris Dickson
CAS 746 - Advanced Topics in Combinatorial Optimization

McMaster University, January 23, 2006

Page 2:

Introduction

• Review of linear algebra
• Complexity of the determinant, inverse, systems of equations, etc.
• Gaussian elimination method for solving linear systems
  • complexity
  • solvability
  • numerical issues
• Iterative methods for solving linear systems
  • Jacobi method
  • Gauss-Seidel method
  • successive over-relaxation methods

Page 3:

Review

A subset L of Rn (respectively Qn) is called a linear hyperplane if L = { x | ax = 0 } for some nonzero row vector a in Rn (Qn).

A set U of vectors in Rn (respectively Qn) is called a linear subspace of Rn if the following properties hold:
• the zero vector is in U
• if x ∈ U then λx ∈ U for any scalar λ
• if x, y ∈ U then x + y ∈ U

Important subspaces:
• null space: null(A) = { x | Ax = 0 }
• image: im(A) = { b | Ax = b for some x }

A basis for a subspace L is a set of linearly independent vectors that spans L

Page 4:

Review

Theorem 3.1. Each linear subspace L of Rn is generated by finitely many vectors and is also the intersection of finitely many linear hyperplanes

Proof: If L is a subspace, then a maximal set of linearly independent vectors from L forms a basis. Also, L is the intersection of the hyperplanes { x | aix = 0 }, where the ai are linearly independent vectors generating L* := { z | zx = 0 for all x in L }.

Page 5:

Review

Corollary 3.1a. For each matrix A there exist vectors x1, …, xt such that: Ax = 0 iff x = λ1x1 + … + λtxt for certain rationals λ1, …, λt.

If x1, …, xt are linearly independent, they form a fundamental system of solutions to Ax = 0.

Corollary 3.1b. The system Ax = b has a solution iff yb = 0 for each vector y with yA = 0. [Fundamental theorem of linear algebra, Gauss 1809]

Page 6:

Review

Proof: (⇒) If Ax = b, then for any y with yA = 0 we have yb = yAx = 0.
(⇐) Let im(A) = { z | Ax = z for some x }. By Theorem 3.1 there exists a matrix C such that im(A) = { z | Cz = 0 }. Since CAx = 0 for every x, the matrix CA is all-zero. Thus:

(yA = 0 ⇒ yb = 0) ⇒ Cb = 0 ⇒ b ∈ im(A) ⇒ there is a solution to Ax = b
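The characterization lends itself to a direct numerical check. Below is a small sketch (not from the slides): a basis of { y | yA = 0 } is read off from the SVD of A, and solvability is tested by checking yb = 0 against that basis. The matrix and right-hand sides are made up for illustration.

```python
import numpy as np

def solvable(A, b, tol=1e-10):
    """Corollary 3.1b, numerically: Ax = b is solvable iff yb = 0
    for every y with yA = 0, i.e. b is orthogonal to the left null
    space of A."""
    U, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    Y = U[:, rank:].T            # rows span { y | yA = 0 }
    return bool(np.allclose(Y @ b, 0, atol=tol))

A = np.array([[1., 2.], [2., 4.], [0., 1.]])     # rank 2
print(solvable(A, np.array([1., 2., 3.])))  # True
print(solvable(A, np.array([1., 1., 0.])))  # False: y = (-2, 1, 0) gives yb = -1
```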

Page 7:

Review

Corollary 3.1c. If x0 is a solution to Ax = b, then there are vectors x1, …, xt such that Ax = b iff x = x0 + λ1x1 + … + λtxt for certain rationals λ1, …, λt.

Proof: Ax = b iff A(x − x0) = 0. Corollary 3.1a implies:

x − x0 = λ1x1 + … + λtxt

x = x0 + λ1x1 + … + λtxt

Page 8:

Review

Cramer’s Rule: If A is an invertible (n x n)-matrix, the solution to Ax = b is given by:

x1 = det A1 / det A
…
xn = det An / det A

Ai is defined by replacing the ith column of A by the column vector b.

Page 9:

Review

Example:

5x1 + x2 − x3 = 4
9x1 + x2 − x3 = 1
 x1 − x2 + 5x3 = 2

Here, x2 = det(A2) / det(A) = 166 / 16 = 10.375
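As a sanity check, here is a short sketch of Cramer's rule applied to the system above (numpy's floating-point det stands in for exact determinant computation):

```python
import numpy as np

A = np.array([[5., 1., -1.],
              [9., 1., -1.],
              [1., -1., 5.]])
b = np.array([4., 1., 2.])

def cramer(A, b):
    """Solve Ax = b by Cramer's rule: x_i = det(A_i) / det(A),
    where A_i is A with its ith column replaced by b."""
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b
        x[i] = np.linalg.det(Ai) / d
    return x

print(cramer(A, b))    # x2 = 10.375, matching the slide
```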

Page 10:

Sizes and Characterizations

Rational number: r = p / q (p ∈ Z, q ∈ N, p and q relatively prime)
size(r) = 1 + ⌈log2(|p| + 1)⌉ + ⌈log2(q + 1)⌉

Rational vector: c = (γ1, γ2, …, γn)
size(c) = n + size(γ1) + … + size(γn)

Rational matrix: for an m x n rational matrix A,
size(A) = mn + the sum of the sizes of its entries

We only consider rational entries for vectors and matrices.

Page 11:

Sizes and Characterizations

Theorem 3.2. Let A be a square rational matrix of size σ. Then the size of det(A) is less than 2σ.

Proof: Let A = (pij/qij) for i, j = 1..n, where each pij, qij pair (and p, q below) is relatively prime, and let det A = p/q. Since q divides the product of all the qij:

q ≤ ∏i,j qij < 2^σ

From the definition of the determinant (the expansion over all permutations):

|det A| ≤ ∏i,j (|pij| + 1) < 2^σ

Page 12:

Sizes and Characterizations

Noting that |p| = |det A| · q, we combine the two formulas to bound p and q:

|p| = |det A| · q < 2^σ · 2^σ = 2^(2σ),   q < 2^σ

Plugging the bounds for p and q into the definition of the size of a rational number (and using the fact that σ already counts n² terms beyond the bit counts of the entries) gives size(det A) < 2σ.
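The bookkeeping can be made concrete in a few lines, assuming the size conventions stated earlier; the example matrix is made up, and the determinant is computed exactly with Python's Fraction type:

```python
from fractions import Fraction
from math import ceil, log2

def size_rat(r):
    """size(p/q) = 1 + ceil(log2(|p|+1)) + ceil(log2(q+1))."""
    return 1 + ceil(log2(abs(r.numerator) + 1)) + ceil(log2(r.denominator + 1))

def size_mat(A):
    """Size of an m x n rational matrix: mn plus the entry sizes."""
    return len(A) * len(A[0]) + sum(size_rat(x) for row in A for x in row)

def det(A):
    """Exact determinant by cofactor expansion (fine for small n)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([r[:j] + r[j+1:] for r in A[1:]])
               for j in range(len(A)))

A = [[Fraction(1, 2), Fraction(-3, 4), Fraction(2)],
     [Fraction(5, 3), Fraction(1, 7),  Fraction(0)],
     [Fraction(-1, 2), Fraction(4, 5), Fraction(1, 3)]]
sigma = size_mat(A)
print(size_rat(det(A)), "<", 2 * sigma)    # Theorem 3.2 bound: 8 < 102
```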

Page 13:

Sizes and Characterizations

Several important corollaries follow from the theorem.

Corollary 3.2a. The inverse A-1 of a nonsingular (rational) matrix A has size polynomially bounded by the size of A.
Proof: The entries of A-1 are quotients of subdeterminants of A; apply the theorem.

Corollary 3.2b. If the system Ax = b has a solution, it has one of size polynomially bounded by the sizes of A and b.
Proof: Since A-1 has polynomially bounded size (apply Corollary 3.2a to a maximal nonsingular submatrix of A), so does A-1b.

This tells us that Corollary 3.1b provides a good characterization.

Page 14:

Sizes and Characterizations

What if there isn't a solution to Ax = b? Is this problem still polynomially bounded? Yes!

Corollary 3.2c says so: if there is no solution, we can specify a vector y with yA = 0 and yb = 1. By Corollary 3.2b, such a vector exists and has size polynomially bounded by the sizes of A and b.

Gaussian elimination provides a polynomial-time method for finding a solution, or for certifying that none exists.

One thing is left to show: do we know the solution description itself is polynomially bounded in size?

Page 15:

Sizes and Characterizations

Corollary 3.2d. Let A be a rational m x n matrix and let b be a rational column vector such that each row of the matrix [A b] has size at most φ. If Ax = b has a solution, then:

{ x | Ax = b } = { x0 + λ1x1 + … + λtxt | λi ∈ R }   (*)

for certain rational vectors x0, …, xt of size at most 4n²φ.

Proof: By Cramer's rule, there exist x0, …, xt satisfying (*) whose components are quotients of subdeterminants of [A b] of order at most n. By Theorem 3.2, each such determinant has size at most 2nφ, so each component of xi has size less than 4nφ. Since each xi has n components, the size of xi is at most 4n²φ.

Page 16:

Gaussian Elimination

Given a system of linear equations:

a11x1 + a12x2 + … + a1nxn = b1
a21x1 + a22x2 + … + a2nxn = b2
  ⋮
am1x1 + am2x2 + … + amnxn = bm

We subtract appropriate multiples of the first equation from the other equations to get:

a11x1 + a12x2 + … + a1nxn = b1
       a'22x2 + … + a'2nxn = b'2
         ⋮
       a'm2x2 + … + a'mnxn = b'm

We then recursively solve the system of the last m − 1 equations.
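The recursion above fits in a few lines; the sketch below keeps all arithmetic exact with Python's Fraction type and assumes A is square and nonsingular (zero pivots are handled by a row swap):

```python
from fractions import Fraction

def gauss_solve(A, b):
    """Solve Ax = b for square nonsingular A: forward elimination,
    then back substitution, exactly over the rationals."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    b = [Fraction(x) for x in b]
    for k in range(n):                       # forward elimination
        if A[k][k] == 0:                     # pivot by row swap
            r = next(i for i in range(k + 1, n) if A[i][k] != 0)
            A[k], A[r] = A[r], A[k]
            b[k], b[r] = b[r], b[k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    x = [Fraction(0)] * n                    # back substitution
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

# Cramer's rule example from earlier: x = (-3/4, 83/8, 21/8), so x2 = 10.375
print(gauss_solve([[5, 1, -1], [9, 1, -1], [1, -1, 5]], [4, 1, 2]))
```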

Page 17:

Gaussian Elimination

Operations allowed on matrices:
• adding a multiple of one row to another row
• permuting rows or columns

Forward elimination: given a matrix Ak of the form

Ak = [ B  C ]
     [ 0  D ]

(B is nonsingular, upper triangular, of order k),

we choose a nonzero component of D, called the pivot element. Without loss of generality, assume this pivot element is D11. Next, add rational multiples of the first row of D to the other rows of D so that D11 becomes the only nonzero element in the first column of D.

Page 18:

Gaussian Elimination

Back substitution: after the FE stage, we wish to transform the matrix into the form

[ Δ  0 ]
[ 0  0 ]

where Δ is nonsingular and diagonal.

We accomplish this by adding multiples of the kth row to rows 1, …, k − 1, so that Akk becomes the only nonzero element in the kth row and column, for k = r, r − 1, …, 1 (r = rank(A)).

Page 19:

Gaussian Elimination

Theorem 3.3. For rational data, the Gaussian elimination method is a polynomial-time method.

• The proof is found in the book; it is very long-winded.
• Since the operations allowed in GE are polynomially bounded, it only remains to show that the numbers do not grow exponentially in size.

A corollary of the theorem is that other operations on matrices are polynomially solvable, as they reduce to solving a system of linear equations:
• calculating the determinant of a rational matrix
• determining the rank of a rational matrix
• calculating the inverse of a rational matrix
• testing a set of rational vectors for linear independence
• solving a system of rational linear equations

Page 20:

Gaussian Elimination

Numerical considerations:

• A naive pivot strategy always uses D11 as the pivot. However, this can lead to numerical problems when we do not have exact arithmetic (i.e., floating-point numbers).
• Partial pivoting always selects the largest element (in absolute value) in the first column of D as the pivot (swap rows to make this element D11).
• Scaled partial pivoting is like partial pivoting, but it scales the pivot column and row so that D11 = 1.
• Full pivoting uses the largest value in the entire submatrix D as the pivot. We then swap rows and columns to put this value in D11.
• Studies show that full pivoting usually requires too much overhead to be beneficial; see the sketch of partial pivoting below.
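A floating-point sketch of forward elimination with partial pivoting (illustrative only): at each step the row with the largest entry in the pivot column is swapped up before eliminating.

```python
import numpy as np

def forward_eliminate_partial(A, b):
    """Forward elimination with partial pivoting: at step k, swap up
    the row whose entry in column k is largest in absolute value."""
    A, b = A.astype(float).copy(), b.astype(float).copy()
    n = len(b)
    for k in range(n):
        p = k + int(np.argmax(np.abs(A[k:, k])))     # pivot row
        A[[k, p]], b[[k, p]] = A[[p, k]], b[[p, k]]  # swap rows k and p
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    return A, b    # upper triangular system, ready for back substitution

# A tiny pivot in position (1,1) is harmless once rows are swapped:
U, c = forward_eliminate_partial(np.array([[1e-12, 1.], [1., 1.]]),
                                 np.array([1., 2.]))
```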

Page 21:

Gaussian Elimination

Other methods:

• Gauss-Jordan method: combines the FE and BS stages of Gaussian elimination. It is generally inferior to ordinary Gaussian elimination.
• LU decomposition: uses GE to decompose the original matrix A into the product of a lower and an upper triangular matrix. This is the method of choice when we must solve many systems with the same matrix A but different b's: we only need to perform FE once, then do a back substitution for each right-hand side, as sketched below.
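The "factor once, solve many" pattern is what library LU routines provide; a sketch using scipy's lu_factor/lu_solve on the earlier example matrix, with two made-up right-hand sides:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[5., 1., -1.], [9., 1., -1.], [1., -1., 5.]])
lu, piv = lu_factor(A)        # the O(n^3) factorization, done once
for b in ([4., 1., 2.], [1., 0., 0.]):
    x = lu_solve((lu, piv), np.array(b))   # O(n^2) per right-hand side
    print(x)
```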

• Iterative methods: solve Ax = b by determining a sequence of vectors x0, x1, …, which eventually converges to a solution x*.

Page 22:

Iterative Methods

Given Ax = b and x0, the goal is to find a sequence of iterates x1, x2, …, converging to x* such that Ax* = b.

For most methods, the goal is to split A into A = L + D + U where L is strictly lower triangular, D is diagonal, and U is strictly upper triangular.

Example: (the slide shows a matrix written as the sum of its strictly lower triangular part L, its diagonal part D, and its strictly upper triangular part U)

Page 23:

Iterative Methods

Now we can define how the iteration proceeds. In general, split A = M − N and iterate:

Mxk+1 = Nxk + b,   i.e.,   xk+1 = M-1(Nxk + b)

Jacobi iteration: MJ = D, NJ = −(L + U). Alternatively, the book presents the iteration:

xk+1 = (I − A)xk + b

These two are equivalent because the book assumes aii = 1, which means D = D-1 = I (the identity).
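A sketch of the Jacobi splitting in code; the test system at the end is made up (strictly diagonally dominant, so the iteration converges):

```python
import numpy as np

def jacobi(A, b, x0, iters=50):
    """Jacobi iteration: with A = L + D + U, take M = D and
    N = -(L + U), i.e. x_{k+1} = D^{-1} (b - (L + U) x_k)."""
    D = np.diag(A)              # diagonal entries of A
    R = A - np.diag(D)          # L + U: everything off the diagonal
    x = x0.astype(float).copy()
    for _ in range(iters):
        x = (b - R @ x) / D     # componentwise division by a_ii
    return x

A = np.array([[4., 1., 1.], [1., 5., 2.], [0., 1., 3.]])
b = A @ np.array([1., 1., 2.])
print(jacobi(A, b, x0=b.copy()))    # converges to [1, 1, 2]
```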

Page 24:

Iterative Methods

Example: solve for x in Ax = b using Jacobi's method (the slide shows a specific A and b).

Using the iteration on the previous slide and x0 = b, we obtain:

x1 = [ -5 0.667 1.5 ]T
x2 = [ 0.833 -1.0 -1.0 ]T
…
x10 = [ 0.9999 0.9985 1.9977 ]T   ( x* = [ 1 1 2 ]T )

Page 25:

Iterative Methods

Example 2: solve for x in Ax = b using Jacobi's method (the slide shows a different A and b).

Using x0 = b, we obtain:

x1 = [ -44.5 -49.5 -13.4 ]T
x2 = [ 112.7 235.7 67.7 ]T
…
x10 = [ 2.82 4.11 1.39 ]T × 10^6   ( x* = [ 1 1 1 ]T )

Page 26:

Iterative Methods

• What happened? The sequence diverged!
• The matrix in the second example was NOT diagonally dominant.
• Strict diagonal row dominance is sufficient for convergence: the absolute value of each diagonal element is greater than the sum of the absolute values of all other elements in the same row.
  • It is not necessary, though: example 1 did not have strict diagonal dominance, yet it converged.
• Alternatively, the book says that the Jacobi method will converge if all eigenvalues of (I − A) are less than 1 in absolute value.
  • Again, this is a sufficient condition. It can be checked that the eigenvalues of the matrix in the first example do not satisfy this condition.

Both checks are sketched below.
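Both conditions can be checked in a few lines. Note that the spectral-radius test below is stated for the general iteration matrix D⁻¹(D − A), which reduces to I − A under the book's normalization aii = 1:

```python
import numpy as np

def strictly_diagonally_dominant(A):
    """Strict diagonal row dominance: |a_ii| > sum_{j != i} |a_ij|."""
    d = np.abs(np.diag(A))
    off = np.abs(A).sum(axis=1) - d
    return bool(np.all(d > off))

def jacobi_spectral_radius(A):
    """Spectral radius of the Jacobi iteration matrix D^{-1}(D - A)
    (= I - A when all a_ii = 1). The iteration converges from every
    starting point iff this value is < 1."""
    D = np.diag(np.diag(A))
    M = np.linalg.inv(D) @ (D - A)
    return float(max(abs(np.linalg.eigvals(M))))
```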

Page 27:

Iterative Methods

Gauss-Seidel method: computes the components of xk+1 one at a time, using the components already updated in the current sweep; i.e.,

xik+1 = ( bi − Σj<i aij xjk+1 − Σj>i aij xjk ) / aii

Page 28:

Iterative Methods

Gauss-Seidel method: alternatively, if we split A = L + D + U, then we obtain the iteration

MGxk+1 = NGxk + b,   where MG = D + L and NG = −U.

Again, it can be shown that the book's method is equivalent to this one; a sketch of the sweep form appears below.
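A sketch of the component-by-component sweep; each updated component is used immediately within the same sweep, which is exactly the (D + L)xk+1 = b − Uxk splitting:

```python
import numpy as np

def gauss_seidel(A, b, x0, iters=50):
    """Gauss-Seidel: sweep through the components, using the values
    already updated in this sweep (x[:i]) and last sweep's values
    for the rest (x[i+1:])."""
    n = len(b)
    x = x0.astype(float).copy()
    for _ in range(iters):
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            x[i] = (b[i] - s) / A[i, i]
    return x
```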

Page 29:

Iterative Methods

Example: Gauss-Seidel method (the slide shows a specific A and b).

Starting with x0 = b, we obtain the sequence:

x1 = [ -4.666 -5.116 3.366 ]T
x2 = [ 1.477 -2.788 1.662 ]T
x3 = [ 1.821 -0.404 1.116 ]T
…
x12 = [ 1.000 1.000 1.000 ]T

Page 30:

Iterative Methods

It should also be noted that in both the Gauss-Seidel and Jacobi methods, the choice of M is important. Since each iteration involves inverting M, we should choose M so that this is easy.

• In the Jacobi method, M is a diagonal matrix: easy to invert!
• In the Gauss-Seidel method, M is lower triangular: slightly harder. Although the inverse is still lower triangular, we must perform a little more work; however, we only have to invert it once. Alternatively, use the book's strategy of computing the iterates component by component.

Page 31:

Iterative Methods

Successive over-relaxation method (SOR): write matrices Lω and Uω (Lω lower triangular, including the diagonal) such that

A = Lω + Uω,   Lω = (1/ω)D + L,   Uω = U + (1 − 1/ω)D

for some appropriately chosen ω > 0 (D is the diagonal of A).

A related class of iterative relaxation methods for approximating a solution to Ax = b can now be defined.

Page 32:

Iterative Methods

Algorithm: while some equation aix = bi is violated, pick such an i and set

xk+1 = xk + ω (bi − aixk) / ‖ai‖² · aiT

for a certain ω with 0 < ω ≤ 2. The method is convergent if 0 < ω < 2 and Ax = b has a solution.
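A sketch of one relaxation scheme consistent with the step above; the "worst violated equation" selection rule and the residual-based stopping test are illustrative choices (the next slide discusses exactly these kinds of choices):

```python
import numpy as np

def relaxation(A, b, x0, omega=1.0, tol=1e-8, max_iters=10000):
    """While some equation a_i x = b_i is violated, move x toward its
    hyperplane: x <- x + omega * (b_i - a_i x) / ||a_i||^2 * a_i.
    omega = 1 projects onto the hyperplane, omega = 2 reflects in it;
    convergent for 0 < omega < 2 when Ax = b has a solution."""
    x = x0.astype(float).copy()
    for _ in range(max_iters):
        r = b - A @ x                     # residuals of all equations
        i = int(np.argmax(np.abs(r)))     # pick the 'worst' equation
        if abs(r[i]) < tol:               # stopping criterion
            return x
        x += omega * r[i] / (A[i] @ A[i]) * A[i]
    return x
```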

Page 33:

Iterative Methods

There are many different choices to be made in the relaxation methods.

• Choice of ω:
  • ω = 2: xk+1 is the reflection of xk in the hyperplane aix = bi.
  • ω = 1: xk+1 is the projection of xk onto the hyperplane aix = bi.
• How to choose the violated equation: is it best to just pick any violated equation, or should we have a strategy (first, best, worst, etc.)?
• Stopping criteria: how close is close enough?

Page 34:

Iterative Methods

Why use iterative methods?
• Sometimes they are more numerically stable, particularly when the matrix A is positive semidefinite and diagonally dominant (as in the linear least-squares problem).
• An exact solution may not be necessary; we only need to be 'close enough'.

Which one to use?
• Compared with Jacobi iteration, the Gauss-Seidel method converges faster, usually by a factor of 2.

Page 35:

References

Schrijver, A. Theory of Linear and Integer Programming. John Wiley & Sons, 1998.

Qiao, S. Linear Systems. CAS 708 lecture slides, 2005.

Page 36:

THE END