AMS526: Numerical Analysis I(Numerical Linear Algebra)
Lecture 1: Course Overview; Matrix Multiplication
Xiangmin Jiao
Stony Brook University
Xiangmin Jiao Numerical Analysis I 1 / 21
Outline
1 Course Overview
2 Matrix-Vector Multiplication
3 Matrix-Matrix Multiplication
Course Description
What is numerical linear algebra?
  - Solving linear algebra problems using efficient algorithms on computers
Topics: Direct and iterative methods for solving simultaneous linear equations. Matrix factorization, conditioning, stability, sparsity, and efficiency. Computation of eigenvalues and eigenvectors. Singular value decomposition.
Required Textbook (adopted in Fall 2012)
  - David S. Watkins, Fundamentals of Matrix Computations, 3rd edition, Wiley, 2010, ISBN 978-0-470-52833-4.
Prerequisite
Prerequisite/Co-requisite:
  - AMS 510 (linear algebra portion) or an equivalent undergraduate-level linear algebra course. Familiarity with the following concepts is assumed: vector spaces, Gaussian elimination, Gram-Schmidt orthogonalization, and eigenvalues/eigenvectors
  - AMS 595 (co-requisite for students without programming experience in C)
  - This MUST NOT be your first course in linear algebra, or you will get lost
To review the fundamental concepts of linear algebra, see a textbook such as
  - Gilbert Strang, Linear Algebra and Its Applications, 4th Edition, Brooks Cole, 2006.
Why Learn Numerical Linear Algebra?
Numerical linear algebra is the foundation of scientific computation
Many problems ultimately reduce to linear algebra concepts or algorithms, either analytical or computational
Examples: finite-element analysis, data fitting, PageRank (Google)
Focus of this course: fundamental concepts, algorithms, and programming
Course Outline
Gaussian elimination and its variants (2.5 weeks)
Sensitivity of linear systems (2.5 weeks)
Linear least squares problems (1.5 weeks)
Singular value decomposition (1.5 weeks)
Eigenvalues and eigenvectors (3 weeks)
Iterative methods for linear systems (2 weeks)
Course webpage: http://www.ams.sunysb.edu/~jiao/teaching/ams526_fall12
Note: The course schedule online is tentative and subject to change.
Some Notes on Textbook and Notation
We will cover all chapters of the third edition of "Watkins, Fundamentals of Matrix Computations"
Previously this course used "Trefethen and Bau, Numerical Linear Algebra, SIAM, 1997", which is a very insightful book
Why change to Fundamentals of Matrix Computations (FMC)?
  - FMC takes a more progressive approach, so it is easier for students
    - Starts with linear systems instead of the SVD; the section on iterative methods is easier to follow
    - Starts with real numbers and switches to complex numbers later
    - Has more examples
  - FMC has more coverage of sparse matrices, which are important
FMC sometimes uses non-standard names, so we will sometimes use terminology different from the textbook's
Some explanations in lectures will still come from Trefethen and Bau
Course Policy
Assignments (written or programming)
  - Assignments are due in class one to two weeks after they are assigned
  - You can discuss course materials and homework problems with others, but you must write your answers completely independently
  - Do NOT copy solutions from any source. Do NOT share your solutions with others
Exams and tests
  - All exams are closed-book
  - However, a one-page cheat sheet is allowed
Grading
  - Assignments: 30%
  - Two tests: 40%
  - Final exam: 30%
Outline
1 Course Overview
2 Matrix-Vector Multiplication
3 Matrix-Matrix Multiplication
Definition
Suppose A is n × m, x is of size m, and b is of size n:

A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}, \quad
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}, \quad
b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}

All entries belong to R, the field of real numbers. The space of m-vectors is R^m, and the space of n × m matrices is R^{n×m}
The matrix-vector product b = Ax is defined by

b_i = \sum_{j=1}^{m} a_{ij} x_j

The map x ↦ Ax is linear, which means that for any x, y ∈ R^m and any α ∈ R,

A(x + y) = Ax + Ay, \quad A(αx) = αAx.
Linear Combination
Let a_j denote the jth column of matrix A
Alternatively, the matrix-vector product can be viewed as

b = Ax = \sum_{j=1}^{m} x_j a_j = x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{n2} \end{bmatrix} + \cdots + x_m \begin{bmatrix} a_{1m} \\ a_{2m} \\ \vdots \\ a_{nm} \end{bmatrix}

i.e., b is a linear combination of the column vectors of A
Two different views of matrix-vector products:
  1. b_i = \sum_{j=1}^{m} a_{ij} x_j: A acts on x to produce b; scalar operations
  2. b = \sum_{j=1}^{m} x_j a_j: x acts on A to produce b; vector operations
If A is n × m, Ax can be viewed as a mapping from R^m to R^n
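The two views give the same result, which can be checked numerically. The following sketch (an illustration not from the slides, using NumPy) computes b both ways for a small example matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])   # a 3-by-2 matrix
x = np.array([2.0, -1.0])

# View 1: each b_i is the inner product of row i of A with x
b_rows = np.array([A[i, :] @ x for i in range(A.shape[0])])

# View 2: b is a linear combination of the columns of A
b_cols = x[0] * A[:, 0] + x[1] * A[:, 1]

assert np.allclose(b_rows, b_cols)
assert np.allclose(b_rows, A @ x)
print(b_rows)  # [0. 2. 4.]
```

The column-oriented view is often the more useful one conceptually: it makes clear that the range of A is the span of its columns.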
Pseudo-Code for Matrix-Vector Multiplication
Pseudo-code for b = Ax:
for i = 1 to n
    b_i ← 0
    for j = 1 to m
        b_i ← b_i + a_ij x_j
Pseudo-code is not real code, in that it does not run on any computer. It is human readable, but it should be straightforward to convert into real computer code in any programming language (e.g., C, FORTRAN, MATLAB, etc.)
There is no "standard" convention for pseudo-code
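As one possible translation into real code (a minimal sketch, not from the slides), the pseudo-code maps directly onto two nested Python loops over lists:

```python
def matvec(A, x):
    """Compute b = A x for an n-by-m matrix A (list of rows) and an m-vector x."""
    n, m = len(A), len(x)
    b = [0.0] * n
    for i in range(n):           # for i = 1 to n
        for j in range(m):       # for j = 1 to m
            b[i] += A[i][j] * x[j]   # b_i <- b_i + a_ij x_j
    return b

# Example: a 2-by-3 matrix times a 3-vector
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x = [1.0, 0.0, -1.0]
print(matvec(A, x))  # [-2.0, -2.0]
```

The function name `matvec` is of course our own choice; in practice one would call an optimized routine rather than writing the loops by hand.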
Flop Count
It is important to assess the efficiency of algorithms. But how?
  - We could implement different algorithms and do head-to-head comparisons, but implementation details can affect true performance
  - We could estimate the cost of all operations, but that is very tedious
  - A relatively simple and effective approach is to estimate the number of floating-point operations, or "flops", and focus on asymptotic analysis as the sizes of the matrices approach infinity
Idealization
  - Count each operation +, −, ∗, /, and √ as one flop, and make no distinction between real and complex numbers
  - This estimate is very crude, since it omits data movement in memory, which is non-negligible on modern computer architectures
The matrix-vector product requires about 2mn flops, because b_i ← b_i + a_ij x_j requires two flops
If m = n, it takes quadratic time in n (i.e., O(n^2))
Outline
1 Course Overview
2 Matrix-Vector Multiplication
3 Matrix-Matrix Multiplication
Matrix-Matrix Multiplication
If A is n × m and X is m × p, then B = AX is n × p, with entries defined by

b_{ij} = \sum_{k=1}^{m} a_{ik} x_{kj}.

Written in columns, we have

b_j = A x_j = \sum_{k=1}^{m} x_{kj} a_k.

In other words, the jth column of B is A times the jth column of X
Pseudo-Code for Matrix-Matrix Multiplication
Pseudo-code for B = AX:
for i = 1 to n
    for j = 1 to p
        b_ij ← 0
        for k = 1 to m
            b_ij ← b_ij + a_ik x_kj
Matrix-matrix multiplication requires about 2mnp flops, because b_ij ← b_ij + a_ik x_kj requires two flops
If m = n = p, it takes cubic time in n (i.e., O(n^3))
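The triple loop translates directly into code. Here is a minimal Python sketch (an illustration, not from the slides) on plain lists of rows:

```python
def matmat(A, X):
    """Compute B = A X for an n-by-m matrix A and an m-by-p matrix X."""
    n, m, p = len(A), len(X), len(X[0])
    B = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):   # inner product of row i of A with column j of X
                B[i][j] += A[i][k] * X[k][j]
    return B

A = [[1.0, 2.0],
     [3.0, 4.0]]
X = [[5.0, 6.0],
     [7.0, 8.0]]
print(matmat(A, X))  # [[19.0, 22.0], [43.0, 50.0]]
```

Counting flops in the code confirms the estimate: the innermost statement does one multiply and one add, and it executes n·p·m times, for 2mnp flops in total.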
Special Cases of Matrix-Matrix Multiplication
A vector is a special case of a matrix
Matrix-vector multiplication is a special case of matrix-matrix multiplication
Other special cases are the vector-vector products:
  - Inner product: u^T v (where u ∈ R^n and v ∈ R^n)
  - Outer product: u v^T (where u ∈ R^n and v ∈ R^m)
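To make the distinction concrete (a small sketch, not from the slides): the inner product collapses two vectors to a scalar, while the outer product expands them into a rank-one matrix.

```python
def inner(u, v):
    """u^T v: a scalar; requires len(u) == len(v)."""
    return sum(ui * vi for ui, vi in zip(u, v))

def outer(u, v):
    """u v^T: a len(u)-by-len(v) matrix; u and v may differ in length."""
    return [[ui * vj for vj in v] for ui in u]

u = [1.0, 2.0]
v = [3.0, 4.0, 5.0]
print(inner(u, [3.0, 4.0]))  # 11.0
print(outer(u, v))           # [[3.0, 4.0, 5.0], [6.0, 8.0, 10.0]]
```

Note the flop counts: the inner product costs about 2n flops, while the outer product costs nm multiplications and produces nm entries.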
Block Matrices
A matrix can be partitioned into blocks.
For example, A ∈ R^{n×m} and X ∈ R^{m×p} can be partitioned into

A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \quad
X = \begin{bmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{bmatrix},

where block A_{ij} is n_i × m_j and block X_{ij} is m_i × p_j (with n_1 + n_2 = n, m_1 + m_2 = m, p_1 + p_2 = p). Then B = AX can be written as

\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} =
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\begin{bmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{bmatrix},

where block B_{ij} is n_i × p_j and

B_{ij} = A_{i1} X_{1j} + A_{i2} X_{2j}, \quad i, j = 1, 2.

More generally, a matrix can be decomposed into more than two block rows/columns
Block Matrix Operators
Suppose A has s1 block rows and s2 block columns, and X has s2 block rows and s3 block columns. Matrix-matrix multiplication can be rewritten using block matrix operations:

Pseudo-code for B = AX using block-matrix operators:
for i = 1 to s1
    for j = 1 to s3
        B_ij ← 0
        for k = 1 to s2
            B_ij ← B_ij + A_ik X_kj

Block matrix operations do not reduce the flop count, but they can improve performance on modern computer architectures through better cache utilization and reduced data movement, by fitting matrix blocks into cache all at once
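A minimal blocked multiplication can be sketched as follows (an illustration, not from the slides; the block size `bs` is an assumed tuning parameter, and NumPy slicing handles the ragged edge blocks automatically):

```python
import numpy as np

def block_matmul(A, X, bs=64):
    """Compute B = A X block by block; bs is the block size."""
    n, m = A.shape
    m2, p = X.shape
    assert m == m2
    B = np.zeros((n, p))
    for i in range(0, n, bs):
        for j in range(0, p, bs):
            for k in range(0, m, bs):
                # B_ij <- B_ij + A_ik X_kj on sub-blocks
                B[i:i+bs, j:j+bs] += A[i:i+bs, k:k+bs] @ X[k:k+bs, j:j+bs]
    return B

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 150))
X = rng.standard_normal((150, 120))
assert np.allclose(block_matmul(A, X, bs=64), A @ X)
```

In a tuned implementation (e.g., inside an optimized BLAS), the block size is chosen so that the three sub-blocks fit in cache simultaneously; the Python version above only illustrates the loop structure.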
Fast Matrix-Matrix Multiplication
Multiplying two n × n matrices requires ∼ 2n^3 flops using the inner-product-based algorithm
Is this optimal?
Strassen's method requires O(n^s) flops, where s = log_2 7 ≈ 2.807
  - It is a recursive algorithm applied to matrices of size 2^k × 2^k
  - For matrices whose sizes are not of the form 2^k × 2^k, fill the missing rows and columns with zeros
An asymptotically faster algorithm, due to D. Coppersmith and S. Winograd (1990), requires O(n^{2.3755}) flops
The asymptotically fastest algorithm currently known, due to V. Williams (2011), requires O(n^{2.3727}) flops
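Strassen's recursion can be sketched as follows (an illustration, not from the slides, assuming n is a power of 2; the `cutoff` below which it falls back to ordinary multiplication is an assumed tuning parameter). Each level replaces eight half-size products with seven, giving the log_2 7 exponent:

```python
import numpy as np

def strassen(A, X, cutoff=64):
    """Strassen multiplication for n-by-n matrices, n a power of 2."""
    n = A.shape[0]
    if n <= cutoff:
        return A @ X            # fall back to the ordinary algorithm
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    X11, X12, X21, X22 = X[:h, :h], X[:h, h:], X[h:, :h], X[h:, h:]
    # Seven recursive products instead of eight
    P1 = strassen(A11 + A22, X11 + X22, cutoff)
    P2 = strassen(A21 + A22, X11, cutoff)
    P3 = strassen(A11, X12 - X22, cutoff)
    P4 = strassen(A22, X21 - X11, cutoff)
    P5 = strassen(A11 + A12, X22, cutoff)
    P6 = strassen(A21 - A11, X11 + X12, cutoff)
    P7 = strassen(A12 - A22, X21 + X22, cutoff)
    B11 = P1 + P4 - P5 + P7
    B12 = P3 + P5
    B21 = P2 + P4
    B22 = P1 - P2 + P3 + P6
    return np.block([[B11, B12], [B21, B22]])

rng = np.random.default_rng(1)
A = rng.standard_normal((128, 128))
X = rng.standard_normal((128, 128))
assert np.allclose(strassen(A, X, cutoff=32), A @ X)
```

The extra additions at each level are O(n^2), so the recursion T(n) = 7 T(n/2) + O(n^2) solves to O(n^{log_2 7}).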
Matrix-Matrix Multiplication in Practice
In practice, the inner-product-based algorithm is almost always used
  - Earlier authors estimated that Strassen's algorithm is faster for matrices of width n ≳ 100 with optimized implementations
  - Compared to a highly optimized traditional multiplication on current architectures, Strassen's algorithm is not faster unless n ≳ 1000, and the benefit is marginal even for matrix sizes of several thousand
  - Coppersmith and Winograd's algorithm has a huge constant C in front of it, so it is never fast enough in practice for realistic problem sizes
  - The asymptotically "faster" methods are also less accurate than standard multiplication when using floating-point numbers (more prone to rounding and cancellation errors)
Lower-complexity algorithms are sometimes used to prove theoretical time bounds of other algorithms
For sparse matrices, efficient matrix-matrix multiplication should take advantage of sparsity (see SMMP for further reading)