
Matrix Norms

Tom Lyche

University of Oslo

Norway

Matrix Norms – p. 1/27

Matrix Norms

We consider matrix norms on $(\mathbb{C}^{m,n}, \mathbb{C})$. All results hold for $(\mathbb{R}^{m,n}, \mathbb{R})$.

Definition 1 (Matrix Norms). A function $\|\cdot\| : \mathbb{C}^{m,n} \to \mathbb{R}$ is called a matrix norm on $\mathbb{C}^{m,n}$ if for all $A, B \in \mathbb{C}^{m,n}$ and all $\alpha \in \mathbb{C}$

1. $\|A\| \ge 0$ with equality if and only if $A = 0$. (positivity)

2. $\|\alpha A\| = |\alpha|\,\|A\|$. (homogeneity)

3. $\|A + B\| \le \|A\| + \|B\|$. (subadditivity)

A matrix norm is simply a vector norm on the finite-dimensional vector space $(\mathbb{C}^{m,n}, \mathbb{C})$ of $m \times n$ matrices.


Equivalent norms

Adapting some general results on vector norms to matrix norms gives

Theorem 2.

1. All matrix norms are equivalent. Thus, if $\|\cdot\|$ and $\|\cdot\|'$ are two matrix norms on $\mathbb{C}^{m,n}$ then there are positive constants $\mu$ and $M$ such that $\mu\|A\| \le \|A\|' \le M\|A\|$ holds for all $A \in \mathbb{C}^{m,n}$.

2. A matrix norm is a continuous function $\|\cdot\| : \mathbb{C}^{m,n} \to \mathbb{R}$.


Submultiplicativity

For matrix norms we usually require that the norm of a product is bounded by the product of the norms. Thus for square matrices $A, B \in \mathbb{C}^{n,n}$ and a matrix norm we most often have the additional property

4. $\|AB\| \le \|A\|\,\|B\|$ (submultiplicativity).

For a square matrix $A$ and a submultiplicative matrix norm $\|\cdot\|$ we have

$$\|A^k\| \le \|A\|^k \quad \text{for } k \in \mathbb{N}. \tag{1}$$
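Property 4 does not follow from properties 1 to 3. The following plain-Python sketch illustrates this with the entrywise maximum norm $\max_{i,j}|a_{ij}|$, which is not part of these slides and is used here only as a counterexample. It also checks (1) with $k = 2$ for the Frobenius norm, which is defined further on in these slides. The helper names `matmul`, `max_abs` and `frobenius` are our own.

```python
import math

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def max_abs(A):
    """Entrywise maximum norm max |a_ij| (satisfies properties 1-3 only)."""
    return max(abs(a) for row in A for a in row)

def frobenius(A):
    """Frobenius norm: square root of the sum of |a_ij|^2."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

A = [[1, 1], [1, 1]]
AA = matmul(A, A)                      # A^2 = [[2, 2], [2, 2]]

# The maximum norm violates ||A^2|| <= ||A||^2 for this matrix.
print(max_abs(AA), max_abs(A) ** 2)    # 2 1

# The Frobenius norm satisfies it (here with equality up to rounding).
print(frobenius(AA) <= frobenius(A) ** 2 + 1e-12)
```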


Consistent Matrix norms

When $m$ and $n$ vary we have a family of norms which are formally different for each $m$ and $n$ since they are defined in different spaces. However, the most common matrix norms are defined by the same formula for all $m, n$ and we consider mainly such norms.

Definition 3 (Consistent Matrix Norms). A submultiplicative matrix norm which is defined for all $m, n \in \mathbb{N}$ is said to be a consistent matrix norm.


The Frobenius Matrix Norm

For $A \in \mathbb{C}^{m,n}$ we define the Frobenius norm by

$$\|A\|_F := \Big(\sum_{i=1}^{m}\sum_{j=1}^{n} |a_{ij}|^2\Big)^{1/2}.$$

$\|A\|_F = \sqrt{\sigma_1^2 + \cdots + \sigma_n^2}$, where $\sigma_1, \dots, \sigma_n$ are the singular values of $A$.

The Frobenius norm is a consistent matrix norm which is subordinate to the Euclidean vector norm.


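Subordinate Matrix Norm

A matrix norm $\|\cdot\|$ on $\mathbb{C}^{m,n}$ is subordinate to the vector norms $\|\cdot\|_\alpha$ on $\mathbb{C}^n$ and $\|\cdot\|_\beta$ on $\mathbb{C}^m$ if

$$\|Ax\|_\beta \le \|A\|\,\|x\|_\alpha \quad \text{for all } A \in \mathbb{C}^{m,n} \text{ and } x \in \mathbb{C}^n.$$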


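Operator Norm

Definition 4. Suppose $m, n \in \mathbb{N}$ are given and let $\|\cdot\|_\alpha$ be a vector norm on $\mathbb{C}^n$ and $\|\cdot\|_\beta$ a vector norm on $\mathbb{C}^m$. For $A \in \mathbb{C}^{m,n}$ we define

$$\|A\| := \|A\|_{\alpha,\beta} := \max_{x \ne 0} \frac{\|Ax\|_\beta}{\|x\|_\alpha}. \tag{2}$$

We call this the $(\alpha, \beta)$ operator norm, the $(\alpha, \beta)$-norm, or simply the $\alpha$-norm if $\alpha = \beta$.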


Operator norm properties

The operator norm has the following properties:

It is a matrix norm.

It is subordinate to the vector norms $\|\cdot\|_\alpha$ and $\|\cdot\|_\beta$.

It is consistent if the vector norms $\|\cdot\|_\alpha = \|\cdot\|_\beta$ and they are defined for all $m, n$.

There is some $x^* \in \mathbb{C}^n$ with $\|x^*\|_\alpha = 1$ such that

$$\|A\| = \max_{\|x\|_\alpha = 1} \|Ax\|_\beta = \|Ax^*\|_\beta.$$
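To make the maximum over unit vectors concrete, here is a brute-force sketch for the case $\alpha = \beta = 2$ and a real $2 \times 2$ matrix. It samples real unit vectors $x = (\cos t, \sin t)$ and keeps the largest $\|Ax\|_2$. Restricting to real vectors is an assumption that is harmless for a real matrix and the 2-norm. The function name `estimate_operator_2norm` and the sample count are our own choices, and this is an illustration of definition (2), not a practical algorithm.

```python
import math

def two_norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(abs(t) ** 2 for t in v))

def matvec(A, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def estimate_operator_2norm(A, samples=10000):
    """Approximate max ||Ax||_2 over real unit vectors x in R^2."""
    best = 0.0
    for i in range(samples):
        # Half a circle is enough because ||A(-x)|| = ||Ax||.
        t = math.pi * i / samples
        x = [math.cos(t), math.sin(t)]
        best = max(best, two_norm(matvec(A, x)))
    return best

A = [[1, 2], [3, 4]]
estimate = estimate_operator_2norm(A)
print(round(estimate, 3))   # 5.465
```

The sampled maximum can only underestimate the true norm, since every sample is an admissible unit vector.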


The p matrix norm

The operator norms $\|\cdot\|_p$ defined from the p-vector norms are of special interest.

We define

$$\|A\|_p := \max_{x \ne 0} \frac{\|Ax\|_p}{\|x\|_p} = \max_{\|y\|_p = 1} \|Ay\|_p. \tag{3}$$

The p-norms are consistent matrix norms which are subordinate to the p-vector norm.


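Explicit expressions

For $A \in \mathbb{C}^{m,n}$ we have:

$\|A\|_1 = \max_{1 \le j \le n} \sum_{k=1}^{m} |a_{k,j}|$ (maximum absolute column sum)

$\|A\|_2 = \sigma_1$, the largest singular value of $A$

$\|A\|_\infty = \max_{1 \le k \le m} \sum_{j=1}^{n} |a_{k,j}|$ (maximum absolute row sum)

If $A \in \mathbb{C}^{n,n}$ is nonsingular then $\|A^{-1}\|_2 = 1/\sigma_n$, where $\sigma_n$ is the smallest singular value of $A$.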

Proof:


Unitary Transformations

An important property of the 2-norm is that it is invariant with respect to unitary transformations.

Let $k, m, n \in \mathbb{N}$, $V \in \mathbb{C}^{k,m}$, $U \in \mathbb{C}^{n,n}$, $A \in \mathbb{C}^{m,n}$, $V^H V = I$ and $U^H U = I$. Then

1. $\|VA\|_2 = \|A\|_2$ and $\|V\|_2 = 1$,

2. $\|AU\|_2 = \|A\|_2$.

Proof:


Example

$$A := \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$

$\|A\|_1 = 6$

$\|A\|_2 = 5.465$

$\|A\|_\infty = 7$

$\|A\|_F = 5.4772$
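These four values can be reproduced with a few lines of plain Python. The sketch below implements the column-sum, row-sum and Frobenius formulas directly. For the 2-norm it uses the closed-form largest eigenvalue of $A^T A$, which only works for real $2 \times 2$ matrices. All function names are our own.

```python
import math

def norm_1(A):
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def norm_inf(A):
    """Maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def norm_fro(A):
    """Square root of the sum of squared absolute entries."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

def norm_2_real_2x2(A):
    """Largest singular value of a real 2x2 matrix via eigenvalues of A^T A."""
    (a, b), (c, d) = A
    p = a * a + c * c          # (A^T A)_11
    q = a * b + c * d          # (A^T A)_12 = (A^T A)_21
    r = b * b + d * d          # (A^T A)_22
    lam_max = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)
    return math.sqrt(lam_max)

A = [[1, 2], [3, 4]]
print(norm_1(A))                        # 6
print(round(norm_2_real_2x2(A), 3))     # 5.465
print(norm_inf(A))                      # 7
print(round(norm_fro(A), 4))            # 5.4772
```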


Perturbation of linear systems

Consider the system of two linear equations

$$\begin{aligned} x_1 + x_2 &= 20 \\ x_1 + 0.999\,x_2 &= 19.99 \end{aligned}$$

The exact solution is $x_1 = x_2 = 10$.

Suppose we replace the second equation by

$$x_1 + 1.001\,x_2 = 19.99.$$

The exact solution then changes to $x_1 = 30$, $x_2 = -10$.

A small change in one of the coefficients, from 0.999 to 1.001, changed the exact solution by a large amount.
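Both solutions can be checked exactly. The sketch below solves the two systems with Cramer's rule in rational arithmetic (Python's `fractions` module), so that rounding error plays no role. The helper `solve_2x2` is our own name.

```python
from fractions import Fraction as F

def solve_2x2(A, b):
    """Solve a 2x2 linear system exactly by Cramer's rule."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("matrix is singular")
    x1 = (b[0] * a22 - a12 * b[1]) / det
    x2 = (a11 * b[1] - b[0] * a21) / det
    return [x1, x2]

b = [F(20), F("19.99")]
x = solve_2x2([[F(1), F(1)], [F(1), F("0.999")]], b)   # original system
y = solve_2x2([[F(1), F(1)], [F(1), F("1.001")]], b)   # perturbed coefficient

print([float(t) for t in x])   # [10.0, 10.0]
print([float(t) for t in y])   # [30.0, -10.0]
```

The determinant is $-0.001$ for the original system and $+0.001$ for the perturbed one, which is why the quotients in Cramer's rule react so strongly.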


Ill Conditioning

A mathematical problem in which the solution is very sensitive to changes in the data is called ill-conditioned or sometimes ill-posed.

Such problems are difficult to solve on a computer.

If at all possible, the mathematical model should be changed to obtain a more well-conditioned or properly-posed problem.


Perturbations

We consider what effect a small change (perturbation) in the data $A$, $b$ has on the solution $x$ of a linear system $Ax = b$.

Suppose $y$ solves $(A + E)y = b + e$ where $E$ is a (small) $n \times n$ matrix and $e$ a (small) vector.

How large can $y - x$ be?

To measure this we use vector and matrix norms.


Conditions on the norms

$\|\cdot\|$ will denote a vector norm on $\mathbb{C}^n$ and also a submultiplicative matrix norm on $\mathbb{C}^{n,n}$ which in addition is subordinate to the vector norm.

Thus for any $A, B \in \mathbb{C}^{n,n}$ and any $x \in \mathbb{C}^n$ we have

$$\|AB\| \le \|A\|\,\|B\| \quad \text{and} \quad \|Ax\| \le \|A\|\,\|x\|.$$

This is satisfied if the matrix norm is the operator norm corresponding to the given vector norm, or if the vector norm is the Euclidean norm and the matrix norm is the Frobenius norm.


Absolute and relative error

The difference $\|y - x\|$ measures the absolute error in $y$ as an approximation to $x$.

$\|y - x\|/\|x\|$ or $\|y - x\|/\|y\|$ is a measure for the relative error.


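Perturbation in the right hand side

Theorem 5. Suppose $A \in \mathbb{C}^{n,n}$ is invertible, $b, e \in \mathbb{C}^n$, $b \ne 0$ and $Ax = b$, $Ay = b + e$. Then

$$\frac{1}{K(A)} \frac{\|e\|}{\|b\|} \le \frac{\|y - x\|}{\|x\|} \le K(A) \frac{\|e\|}{\|b\|}, \qquad K(A) = \|A\|\,\|A^{-1}\|. \tag{4}$$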

Proof:

Consider (4). The quotient $\|e\|/\|b\|$ is a measure for the size of the perturbation $e$ relative to the size of $b$. The relative error $\|y - x\|/\|x\|$ can in the worst case be $K(A) = \|A\|\,\|A^{-1}\|$ times as large as $\|e\|/\|b\|$.
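As a numerical check of (4), the following sketch uses the $2 \times 2$ matrix of the earlier two-equation example with the $\infty$-norm. The perturbation $e = (0, 0.01)$ is a hypothetical choice made for illustration. Exact rational arithmetic is used so the inequalities are tested without rounding error, and all helper names are our own.

```python
from fractions import Fraction as F

def norm_inf_vec(v):
    """Infinity norm of a vector: largest absolute entry."""
    return max(abs(t) for t in v)

def norm_inf_mat(A):
    """Infinity norm of a matrix: maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def inverse_2x2(A):
    """Exact inverse of a nonsingular 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(A, x):
    return [sum(a * t for a, t in zip(row, x)) for row in A]

A = [[F(1), F(1)], [F(1), F("0.999")]]
b = [F(20), F("19.99")]
e = [F(0), F("0.01")]                 # hypothetical perturbation of b

A_inv = inverse_2x2(A)
x = matvec(A_inv, b)                                    # solves Ax = b
y = matvec(A_inv, [bi + ei for bi, ei in zip(b, e)])    # solves Ay = b + e

K = norm_inf_mat(A) * norm_inf_mat(A_inv)               # condition number
rel_pert = norm_inf_vec(e) / norm_inf_vec(b)
rel_err = norm_inf_vec([yi - xi for yi, xi in zip(y, x)]) / norm_inf_vec(x)

print(K, float(rel_pert), float(rel_err))   # 4000 0.0005 1.0
print(rel_pert / K <= rel_err <= K * rel_pert)
```

Here a relative perturbation of 0.05 percent in $b$ produces a relative error of 100 percent in the solution, which is within the upper bound $K(A)\,\|e\|/\|b\| = 2$.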


Condition number

$K(A)$ is called the condition number with respect to inversion of a matrix, or just the condition number, if it is clear from the context that we are talking about solving linear systems.

The condition number depends on the matrix $A$ and on the norm used. If $K(A)$ is large, $A$ is called ill-conditioned (with respect to inversion).

If $K(A)$ is small, $A$ is called well-conditioned (with respect to inversion).


Condition number properties

Since $\|A\|\,\|A^{-1}\| \ge \|AA^{-1}\| = \|I\| \ge 1$ we always have $K(A) \ge 1$.

Since all matrix norms are equivalent, the dependence of $K(A)$ on the norm chosen is less important than the dependence on $A$.

Usually one chooses the spectral norm when discussing properties of the condition number, and the $l_1$ and $l_\infty$ norm when one wishes to compute it or estimate it.


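The 2-norm

Suppose $A$ has singular values $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n > 0$ and eigenvalues $|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_n|$ if $A$ is square.

$$K_2(A) = \|A\|_2 \|A^{-1}\|_2 = \frac{\sigma_1}{\sigma_n}$$

$$K_2(A) = \|A\|_2 \|A^{-1}\|_2 = \frac{|\lambda_1|}{|\lambda_n|}, \quad A \text{ normal}.$$

It follows that $A$ is ill-conditioned with respect to inversion if and only if $\sigma_1/\sigma_n$ is large, or $|\lambda_1|/|\lambda_n|$ is large when $A$ is normal.

$$K_2(A) = \|A\|_2 \|A^{-1}\|_2 = \frac{\lambda_1}{\lambda_n}, \quad A \text{ positive definite}.$$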
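For a real $2 \times 2$ matrix the ratio $\sigma_1/\sigma_n$ can be computed in closed form from the eigenvalues of $A^T A$. The sketch below does this in plain Python. The restriction to real $2 \times 2$ matrices and the function names are our own simplifications, and the second test matrix is a hypothetical positive definite example for which the ratio equals $\lambda_1/\lambda_n$.

```python
import math

def singular_values_real_2x2(A):
    """Singular values (largest first) of a real 2x2 matrix."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    mean = (p + r) / 2
    radius = math.sqrt(((p - r) / 2) ** 2 + q * q)
    # Eigenvalues of A^T A are mean +/- radius; clamp tiny negative rounding.
    return math.sqrt(mean + radius), math.sqrt(max(mean - radius, 0.0))

def cond_2(A):
    """Condition number K_2(A) = sigma_1 / sigma_n."""
    s1, s2 = singular_values_real_2x2(A)
    if s2 == 0:
        raise ValueError("matrix is singular")
    return s1 / s2

print(round(cond_2([[1, 2], [3, 4]]), 2))   # 14.93
print(cond_2([[2, 0], [0, 0.5]]))           # 4.0, equal to lambda_1 / lambda_n
```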


The residual

Suppose we have computed an approximate solution $y$ to $Ax = b$. The vector $r(y) := Ay - b$ is called the residual vector, or just the residual. We can bound $x - y$ in terms of $r(y)$.

Theorem 6. Suppose $A \in \mathbb{C}^{n,n}$, $b \in \mathbb{C}^n$, $A$ is nonsingular and $b \ne 0$. Let $r(y) = Ay - b$ for each $y \in \mathbb{C}^n$. If $Ax = b$ then

$$\frac{1}{K(A)} \frac{\|r(y)\|}{\|b\|} \le \frac{\|y - x\|}{\|x\|} \le K(A) \frac{\|r(y)\|}{\|b\|}. \tag{5}$$


Discussion

If $A$ is well-conditioned, (5) says that $\|y - x\|/\|x\| \approx \|r(y)\|/\|b\|$.

In other words, the accuracy in $y$ is about the same order of magnitude as the residual as long as $\|b\| \approx 1$.

If $A$ is ill-conditioned, anything can happen.

The solution can be inaccurate even if the residual is small.

We can have an accurate solution even if the residual is large.
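The case of a small residual with a large error can be seen on the matrix of the two-equation example, for which $K_\infty(A) = \|A\|_\infty \|A^{-1}\|_\infty = 2 \cdot 2000 = 4000$. In the sketch below the candidate $y = (30, -10)$ is our choice of a poor approximation to the exact solution $x = (10, 10)$. The arithmetic is exact and the helper names are our own.

```python
from fractions import Fraction as F

def norm_inf_vec(v):
    """Infinity norm of a vector: largest absolute entry."""
    return max(abs(t) for t in v)

def matvec(A, x):
    return [sum(a * t for a, t in zip(row, x)) for row in A]

A = [[F(1), F(1)], [F(1), F("0.999")]]
b = [F(20), F("19.99")]
x = [F(10), F(10)]        # exact solution of Ax = b
y = [F(30), F(-10)]       # candidate solution, far from x

# Residual r(y) = Ay - b.
r = [ayi - bi for ayi, bi in zip(matvec(A, y), b)]

rel_residual = norm_inf_vec(r) / norm_inf_vec(b)
rel_error = norm_inf_vec([yi - xi for yi, xi in zip(y, x)]) / norm_inf_vec(x)

print(float(rel_residual), float(rel_error))   # 0.001 2.0
```

The relative residual is 0.1 percent while the relative error is 200 percent, which is still consistent with the upper bound $K(A)\,\|r(y)\|/\|b\| = 4$ in (5).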


Perturbation in A

We consider next a perturbation in $A$.

Theorem 7. Suppose $A, E \in \mathbb{C}^{n,n}$, $b \in \mathbb{C}^n$ with $A$ invertible and $b \ne 0$. If $\|A^{-1}E\| < 1$ for some operator norm then $A + E$ is invertible. If $Ax = b$ and $(A + E)y = b$ then

$$\frac{\|y - x\|}{\|x\|} \le \frac{\|A^{-1}E\|}{1 - \|A^{-1}E\|} \le \frac{K(A)}{1 - \|A^{-1}E\|} \frac{\|E\|}{\|A\|}. \tag{6}$$

The quotient $\|E\|/\|A\|$ is a measure of the size of the perturbation $E$ in $A$ relative to the size of $A$.

The condition number again plays a crucial role.
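A small exact check of (6) in the $\infty$-norm, again on the matrix of the two-equation example. The perturbation $E$, which changes the coefficient 0.999 to 0.9992, is a hypothetical choice made so that $\|A^{-1}E\|_\infty < 1$. All helper names are our own.

```python
from fractions import Fraction as F

def norm_inf_vec(v):
    return max(abs(t) for t in v)

def norm_inf_mat(A):
    """Maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, x):
    return [sum(a * t for a, t in zip(row, x)) for row in A]

A = [[F(1), F(1)], [F(1), F("0.999")]]
E = [[F(0), F(0)], [F(0), F("0.0002")]]    # hypothetical perturbation of A
b = [F(20), F("19.99")]

A_inv = inverse_2x2(A)
A_plus_E = [[a + e for a, e in zip(row_a, row_e)] for row_a, row_e in zip(A, E)]
x = matvec(A_inv, b)                       # solves Ax = b
y = matvec(inverse_2x2(A_plus_E), b)       # solves (A + E)y = b

t = norm_inf_mat(matmul(A_inv, E))         # ||A^{-1} E||, must be < 1
K = norm_inf_mat(A) * norm_inf_mat(A_inv)
rel_err = norm_inf_vec([yi - xi for yi, xi in zip(y, x)]) / norm_inf_vec(x)
bound1 = t / (1 - t)
bound2 = K / (1 - t) * norm_inf_mat(E) / norm_inf_mat(A)

print(float(t), float(rel_err), float(bound1), float(bound2))   # 0.2 0.25 0.25 0.5
```

For this choice the first inequality in (6) holds with equality. With the larger change from 0.999 to 1.001 one gets $\|A^{-1}E\|_\infty = 2$, so the hypothesis of the theorem is not met in that norm.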


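The Spectral Radius

We define the spectral radius of a matrix $A \in \mathbb{C}^{n,n}$ as the maximum absolute value of its eigenvalues,

$$\rho(A) = \max_{\lambda \in \sigma(A)} |\lambda|, \tag{7}$$

where $\sigma(A)$ denotes the set of eigenvalues of $A$.

For any submultiplicative matrix norm $\|\cdot\|$ on $\mathbb{C}^{n,n}$ and any $A \in \mathbb{C}^{n,n}$ we have $\rho(A) \le \|A\|$.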

Proof:

Let $A \in \mathbb{C}^{n,n}$ and $\epsilon > 0$ be given. There is a submultiplicative matrix norm $\|\cdot\|'$ on $\mathbb{C}^{n,n}$ such that $\rho(A) \le \|A\|' \le \rho(A) + \epsilon$.

Proof:


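Limits

For any $A \in \mathbb{C}^{n,n}$ we have

$$\lim_{k \to \infty} A^k = 0 \iff \rho(A) < 1.$$

Convergence can be slow:

$$A = \begin{bmatrix} 0.99 & 1 & 0 \\ 0 & 0.99 & 1 \\ 0 & 0 & 0.99 \end{bmatrix}, \qquad A^{100} = \begin{bmatrix} 0.4 & 37 & 1849 \\ 0 & 0.4 & 37 \\ 0 & 0 & 0.4 \end{bmatrix},$$

$$A^{2000} = \begin{bmatrix} 10^{-9} & \epsilon & 0.004 \\ 0 & 10^{-9} & \epsilon \\ 0 & 0 & 10^{-9} \end{bmatrix}$$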
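The slow decay can be reproduced numerically. The sketch below computes $A^{100}$ and $A^{2000}$ in floating point by repeated squaring. The helper names `matmul` and `matpow` are our own. Since $A$ is upper triangular, its eigenvalues are the diagonal entries, so $\rho(A) = 0.99 < 1$.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(A, k):
    """A^k for a square matrix A and an integer k >= 0, by repeated squaring."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = [row[:] for row in A]
    while k > 0:
        if k & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        k >>= 1
    return result

A = [[0.99, 1.0, 0.0],
     [0.0, 0.99, 1.0],
     [0.0, 0.0, 0.99]]

P100 = matpow(A, 100)
P2000 = matpow(A, 2000)

for row in P100:
    print([round(v, 2) for v in row])
for row in P2000:
    print(["%.1e" % v for v in row])
```

The entries of $A^k$ are $0.99^k$ on the diagonal, $k \cdot 0.99^{k-1}$ on the first superdiagonal and $\binom{k}{2} 0.99^{k-2}$ in the corner. For $k = 100$ the superdiagonal entries are close to 37 and the corner entry is close to 1849, even though the diagonal has already dropped below 0.4.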
