AMS526: Numerical Analysis I(Numerical Linear Algebra)
Lecture 1: Course Overview; Matrix Multiplication
Xiangmin Jiao
Stony Brook University
Xiangmin Jiao Numerical Analysis I 1 / 21
Outline
1 Course Overview
2 Matrix-Vector Multiplication
3 Matrix-Matrix Multiplication
Course Description
What is numerical linear algebra?
  - Solving linear algebra problems using efficient algorithms on computers
Topics: Direct and iterative methods for solving simultaneous linear equations. Matrix factorization, conditioning, stability, sparsity, and efficiency. Computation of eigenvalues and eigenvectors. Singular value decomposition.
Required Textbook (adopted in Fall 2012)
  - David S. Watkins, Fundamentals of Matrix Computations, 3rd edition, Wiley, 2010, ISBN 978-0-470-52833-4.
Prerequisite
Prerequisite/Co-requisite:
  - AMS 510 (linear algebra portion) or an equivalent undergraduate-level linear algebra course. Familiarity with the following concepts is assumed: vector spaces, Gaussian elimination, Gram-Schmidt orthogonalization, and eigenvalues/eigenvectors
  - AMS 595 (co-requisite for students without programming experience in C)
  - This MUST NOT be your first course in linear algebra, or you will get lost
To review the fundamental concepts of linear algebra, see a textbook such as
  - Gilbert Strang, Linear Algebra and Its Applications, 4th Edition, Brooks Cole, 2006.
Why Learn Numerical Linear Algebra?
Numerical linear algebra is the foundation of scientific computation
Many problems ultimately reduce to linear algebra concepts or algorithms, either analytical or computational
Examples: finite-element analysis, data fitting, PageRank (Google)
Focus of this course: fundamental concepts, algorithms, and programming
Course Outline
Gaussian elimination and its variants (2.5 weeks)
Sensitivity of linear systems (2.5 weeks)
Linear least squares problems (1.5 weeks)
Singular value decomposition (1.5 weeks)
Eigenvalues and eigenvectors (3 weeks)
Iterative methods for linear systems (2 weeks)
Course webpage: http://www.ams.sunysb.edu/~jiao/teaching/ams526_fall12
Note: The course schedule online is tentative and subject to change.
Some Notes on Textbook and Notation
We will cover all chapters of the third edition of "Watkins, Fundamentals of Matrix Computations"
Previously this course used "Trefethen and Bau, Numerical Linear Algebra, SIAM, 1997", which is a very insightful book
Why change to Fundamentals of Matrix Computations (FMC)?
  - FMC takes a more progressive approach, so it is easier for students
    - Starts with linear systems instead of the SVD; the section on iterative methods is easier to follow
    - Starts with real numbers and switches to complex numbers later
    - Has more examples
  - FMC has more coverage of sparse matrices, which are important
FMC sometimes uses non-standard names, so we will sometimes use terminology different from the textbook's
Some explanations in lectures will still come from Trefethen and Bau
Course Policy
Assignments (written or programming)
  - Assignments are due in class one to two weeks after they are assigned
  - You can discuss course materials and homework problems with others, but you must write your answers completely independently
  - Do NOT copy solutions from any source. Do NOT share your solutions with others
Exams and tests
  - All exams are closed-book
  - However, a one-page cheat sheet is allowed
Grading
  - Assignments: 30%
  - Two tests: 40%
  - Final exam: 30%
Outline
1 Course Overview
2 Matrix-Vector Multiplication
3 Matrix-Matrix Multiplication
Definition
Suppose A is n × m, x is of size m, and b is of size n:

A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}, \quad
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}, \quad
b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}

All entries belong to R, the field of real numbers. The space of m-vectors is R^m, and the space of n × m matrices is R^{n×m}
The matrix-vector product b = Ax is defined by

b_i = \sum_{j=1}^{m} a_{ij} x_j

The map x ↦ Ax is linear, which means that for any x, y ∈ R^m and any α ∈ R,

A(x + y) = Ax + Ay, \quad A(αx) = αAx.
Linear Combination
Let a_j denote the jth column of matrix A
Alternatively, the matrix-vector product can be viewed as

b = Ax = \sum_{j=1}^{m} x_j a_j = x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{n2} \end{bmatrix} + \cdots + x_m \begin{bmatrix} a_{1m} \\ a_{2m} \\ \vdots \\ a_{nm} \end{bmatrix}

i.e., b is a linear combination of the column vectors of A
Two different views of matrix-vector products:
  1. b_i = \sum_{j=1}^{m} a_{ij} x_j: A acts on x to produce b; scalar operations
  2. b = \sum_{j=1}^{m} x_j a_j: x acts on A to produce b; vector operations
If A is n × m, Ax can be viewed as a mapping from R^m to R^n
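The two views give the same result, which can be checked numerically. The following sketch (an illustration not from the slides, using NumPy) computes b both ways for a small example matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])   # a 3-by-2 matrix
x = np.array([2.0, -1.0])

# View 1: each b_i is the inner product of row i of A with x
b_rows = np.array([A[i, :] @ x for i in range(A.shape[0])])

# View 2: b is a linear combination of the columns of A
b_cols = x[0] * A[:, 0] + x[1] * A[:, 1]

assert np.allclose(b_rows, b_cols)
assert np.allclose(b_rows, A @ x)
print(b_rows)  # [0. 2. 4.]
```

The column-oriented view is often the more useful one conceptually: it makes clear that the range of A is the span of its columns.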
Pseudo-Code for Matrix-Vector Multiplication
Pseudo-code for b = Ax:
for i = 1 to n
    b_i ← 0
    for j = 1 to m
        b_i ← b_i + a_ij x_j
Pseudo-code is not real code, in that it does not run on any computer. It is human readable, but it should be straightforward to convert into real computer code in any programming language (e.g., C, FORTRAN, MATLAB, etc.)
There is no "standard" convention for pseudo-code
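As one possible translation into real code (a minimal sketch, not from the slides), the pseudo-code maps directly onto two nested Python loops over lists:

```python
def matvec(A, x):
    """Compute b = A x for an n-by-m matrix A (list of rows) and an m-vector x."""
    n, m = len(A), len(x)
    b = [0.0] * n
    for i in range(n):           # for i = 1 to n
        for j in range(m):       # for j = 1 to m
            b[i] += A[i][j] * x[j]   # b_i <- b_i + a_ij x_j
    return b

# Example: a 2-by-3 matrix times a 3-vector
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x = [1.0, 0.0, -1.0]
print(matvec(A, x))  # [-2.0, -2.0]
```

The function name `matvec` is of course our own choice; in practice one would call an optimized routine rather than writing the loops by hand.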
Flop Count
It is important to assess the efficiency of algorithms. But how?
  - We could implement different algorithms and do head-to-head comparisons, but implementation details can affect true performance
  - We could estimate the cost of all operations, but that is very tedious
  - A relatively simple and effective approach is to estimate the number of floating-point operations, or "flops", and focus on asymptotic analysis as the sizes of the matrices approach infinity
Idealization
  - Count each operation +, −, ∗, /, and √ as one flop, and make no distinction between real and complex numbers
  - This estimate is very crude, since it omits data movement in memory, which is non-negligible on modern computer architectures
The matrix-vector product requires about 2mn flops, because b_i ← b_i + a_ij x_j requires two flops
If m = n, it takes quadratic time in n (i.e., O(n^2))
Outline
1 Course Overview
2 Matrix-Vector Multiplication
3 Matrix-Matrix Multiplication
Matrix-Matrix Multiplication
If A is n × m and X is m × p, then B = AX is n × p, with entries defined by

b_{ij} = \sum_{k=1}^{m} a_{ik} x_{kj}.

Written in columns, we have

b_j = A x_j = \sum_{k=1}^{m} x_{kj} a_k.

In other words, the jth column of B is A times the jth column of X
Pseudo-Code for Matrix-Matrix Multiplication
Pseudo-code for B = AX:
for i = 1 to n
    for j = 1 to p
        b_ij ← 0
        for k = 1 to m
            b_ij ← b_ij + a_ik x_kj
Matrix-matrix multiplication requires about 2mnp flops, because b_ij ← b_ij + a_ik x_kj requires two flops
If m = n = p, it takes cubic time in n (i.e., O(n^3))
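The triple loop translates directly into code. Here is a minimal Python sketch (an illustration, not from the slides) on plain lists of rows:

```python
def matmat(A, X):
    """Compute B = A X for an n-by-m matrix A and an m-by-p matrix X."""
    n, m, p = len(A), len(X), len(X[0])
    B = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):   # inner product of row i of A with column j of X
                B[i][j] += A[i][k] * X[k][j]
    return B

A = [[1.0, 2.0],
     [3.0, 4.0]]
X = [[5.0, 6.0],
     [7.0, 8.0]]
print(matmat(A, X))  # [[19.0, 22.0], [43.0, 50.0]]
```

Counting flops in the code confirms the estimate: the innermost statement does one multiply and one add, and it executes n·p·m times, for 2mnp flops in total.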
Special Cases of Matrix-Matrix Multiplication
A vector is a special case of a matrix
Matrix-vector multiplication is a special case of matrix-matrix multiplication
Other special cases are the vector-vector products:
  - Inner product: u^T v (where u ∈ R^n and v ∈ R^n)
  - Outer product: u v^T (where u ∈ R^n and v ∈ R^m)
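To make the distinction concrete (a small sketch, not from the slides): the inner product collapses two vectors to a scalar, while the outer product expands them into a rank-one matrix.

```python
def inner(u, v):
    """u^T v: a scalar; requires len(u) == len(v)."""
    return sum(ui * vi for ui, vi in zip(u, v))

def outer(u, v):
    """u v^T: a len(u)-by-len(v) matrix; u and v may differ in length."""
    return [[ui * vj for vj in v] for ui in u]

u = [1.0, 2.0]
v = [3.0, 4.0, 5.0]
print(inner(u, [3.0, 4.0]))  # 11.0
print(outer(u, v))           # [[3.0, 4.0, 5.0], [6.0, 8.0, 10.0]]
```

Note the flop counts: the inner product costs about 2n flops, while the outer product costs nm multiplications and produces nm entries.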
Block Matrices
A matrix can be partitioned into blocks.
For example, A ∈ R^{n×m} and X ∈ R^{m×p} can be partitioned into

A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \quad
X = \begin{bmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{bmatrix},

where block A_{ij} is n_i × m_j and block X_{ij} is m_i × p_j (with n_1 + n_2 = n, m_1 + m_2 = m, p_1 + p_2 = p). Then B = AX can be written as

\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} =
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\begin{bmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{bmatrix},

where block B_{ij} is n_i × p_j and

B_{ij} = A_{i1} X_{1j} + A_{i2} X_{2j}, \quad i, j = 1, 2.

More generally, a matrix can be decomposed into more than two block rows/columns
Block Matrix Operators
Suppose A has s1 block rows and s2 block columns, and X has s2 block rows and s3 block columns. Matrix-matrix multiplication can be rewritten using block matrix operations:

Pseudo-code for B = AX using block-matrix operators:
for i = 1 to s1
    for j = 1 to s3
        B_ij ← 0
        for k = 1 to s2
            B_ij ← B_ij + A_ik X_kj

Block matrix operations do not reduce the flop count, but they can improve performance on modern computer architectures through better cache utilization and reduced data movement, by fitting matrix blocks into cache all at once
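A minimal blocked multiplication can be sketched as follows (an illustration, not from the slides; the block size `bs` is an assumed tuning parameter, and NumPy slicing handles the ragged edge blocks automatically):

```python
import numpy as np

def block_matmul(A, X, bs=64):
    """Compute B = A X block by block; bs is the block size."""
    n, m = A.shape
    m2, p = X.shape
    assert m == m2
    B = np.zeros((n, p))
    for i in range(0, n, bs):
        for j in range(0, p, bs):
            for k in range(0, m, bs):
                # B_ij <- B_ij + A_ik X_kj on sub-blocks
                B[i:i+bs, j:j+bs] += A[i:i+bs, k:k+bs] @ X[k:k+bs, j:j+bs]
    return B

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 150))
X = rng.standard_normal((150, 120))
assert np.allclose(block_matmul(A, X, bs=64), A @ X)
```

In a tuned implementation (e.g., inside an optimized BLAS), the block size is chosen so that the three sub-blocks fit in cache simultaneously; the Python version above only illustrates the loop structure.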
Fast Matrix-Matrix Multiplication
Multiplying two n × n matrices requires ∼ 2n^3 flops using the inner-product-based algorithm
Is this optimal?
Strassen's method requires O(n^s) flops, where s = log_2 7 ≈ 2.807
  - It is a recursive algorithm applied to matrices of size 2^k × 2^k
  - For matrices whose sizes are not of the form 2^k × 2^k, fill the missing rows and columns with zeros
An asymptotically faster algorithm, due to D. Coppersmith and S. Winograd (1990), requires O(n^{2.3755}) flops
The asymptotically fastest algorithm currently known, due to V. Williams (2011), requires O(n^{2.3727}) flops
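Strassen's recursion can be sketched as follows (an illustration, not from the slides, assuming n is a power of 2; the `cutoff` below which it falls back to ordinary multiplication is an assumed tuning parameter). Each level replaces eight half-size products with seven, giving the log_2 7 exponent:

```python
import numpy as np

def strassen(A, X, cutoff=64):
    """Strassen multiplication for n-by-n matrices, n a power of 2."""
    n = A.shape[0]
    if n <= cutoff:
        return A @ X            # fall back to the ordinary algorithm
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    X11, X12, X21, X22 = X[:h, :h], X[:h, h:], X[h:, :h], X[h:, h:]
    # Seven recursive products instead of eight
    P1 = strassen(A11 + A22, X11 + X22, cutoff)
    P2 = strassen(A21 + A22, X11, cutoff)
    P3 = strassen(A11, X12 - X22, cutoff)
    P4 = strassen(A22, X21 - X11, cutoff)
    P5 = strassen(A11 + A12, X22, cutoff)
    P6 = strassen(A21 - A11, X11 + X12, cutoff)
    P7 = strassen(A12 - A22, X21 + X22, cutoff)
    B11 = P1 + P4 - P5 + P7
    B12 = P3 + P5
    B21 = P2 + P4
    B22 = P1 - P2 + P3 + P6
    return np.block([[B11, B12], [B21, B22]])

rng = np.random.default_rng(1)
A = rng.standard_normal((128, 128))
X = rng.standard_normal((128, 128))
assert np.allclose(strassen(A, X, cutoff=32), A @ X)
```

The extra additions at each level are O(n^2), so the recursion T(n) = 7 T(n/2) + O(n^2) solves to O(n^{log_2 7}).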
Matrix-Matrix Multiplication in Practice
In practice, the inner-product-based algorithm is almost always used
  - Earlier authors estimated that Strassen's algorithm is faster for matrices of width n ≳ 100 with optimized implementations
  - Compared to a highly optimized traditional multiplication on current architectures, Strassen's algorithm is not faster unless n ≳ 1000, and the benefit is marginal even for matrix sizes of several thousand
  - Coppersmith and Winograd's algorithm has a huge constant C in front of it, so it is never fast enough in practice for realistic problem sizes
  - The asymptotically "faster" methods are also less accurate than standard multiplication when using floating-point numbers (more prone to rounding and cancellation errors)
Lower-complexity algorithms are sometimes used to prove theoretical time bounds of other algorithms
For sparse matrices, efficient matrix-matrix multiplication should take advantage of sparsity (see SMMP for further reading)