Upload
stkegg
View
226
Download
0
Embed Size (px)
Citation preview
8/3/2019 Fastpoissonslides Using FFT
1/25
Fast Poisson Solvers and FFT
Tom Lyche
University of Oslo
Norway
Fast Poisson Solvers and FFT p. 1/
http://www.ifi.uio.no/http://www.ifi.uio.no/8/3/2019 Fastpoissonslides Using FFT
2/25
Contents of lecture
Recall the discrete Poisson problem
Recall methods used so far
Exact solution by diagonalization
Exact solution by Fast Sine Transform
The sine transform
The Fourier transform
The Fast Fourier Transform
The Fast Sine Transform
Algorithm
Fast Poisson Solvers and FFT p. 2/
8/3/2019 Fastpoissonslides Using FFT
3/25
Recall 5 point-scheme for the Poisson Problem
(u +u ) =f u=0
u=0
u=0
u=0yyxx
h = 1/(m + 1) gives n := m2 interior grid points
Discrete approximation V= [vj,k] Rm,m with vj,k u(jh,kh).Matrix equation TV+ V T = h2F with T = tridiag(
1, 2,
1)
Rm,m
and F = [f(jh,kh)] Rm,m
Linear system Ax = b with A = T V+ V T, x = vec(V) Rnand b = h2vec(F) Rn
Fast Poisson Solvers and FFT p. 3/
8/3/2019 Fastpoissonslides Using FFT
4/25
Compare #flops for different methods
System Ax = b of order n = m2 and bandwidth m = n.full Cholesky: O(n3)
band Cholesky, block tridiagonal: O(n2)
If m = 103 then the computing time is
full Cholesky: years
band Cholesky and block tridiagonal , : hours
Derive a method which takes : seconds
Fast Poisson Solvers and FFT p. 4/
8/3/2019 Fastpoissonslides Using FFT
5/25
New exact fast method
1. Diagonalization of T
2. Fast Sine transform
restricted to rectangular domains
can be extended to 9 point scheme and biharmonic problem
can be extended to 3D
Fast Poisson Solvers and FFT p. 5/
8/3/2019 Fastpoissonslides Using FFT
6/25
8/3/2019 Fastpoissonslides Using FFT
7/25
8/3/2019 Fastpoissonslides Using FFT
8/25
Find Vusing diagonalization
We define a matrix X by V= SXS, where V is thesolution of TV+ V T = h2F
TV+ V T = h2F
V=SXS TSXS+ SXST= h2FS( )S
STSXS2 + S2XSTS= h2SFS
TS=SD S2DXS2 + S2XS2D = h2SFSS2=I/(2h)
DX+XD = 4h4SFS.
Fast Poisson Solvers and FFT p. 8/
8/3/2019 Fastpoissonslides Using FFT
9/25
X is easy to find
An equation of the form DX+ XD = B for some B iseasy to solve.
Since D = diag(j) we obtain for each entry
jxjk + xjkk = bjk
so xjk = bjk/(j + k) for all j, k.
Fast Poisson Solvers and FFT p. 9/
8/3/2019 Fastpoissonslides Using FFT
10/25
Algorithm
Algorithm 1 (A Simple Fast Poisson Solver).
1. h = 1/(m + 1);F =
f(jh,kh)
m
j,k=1;
S=
sin(jkh)mj,k=1; =
sin
2
((jh)/2)mj=1
2. G = (gj,k) = SFS;
3. X= (xj,k)mj,k=1, wherexj,k = h
4gj,k/(j + k);
4. V= SXS;
(1)
Output is the exact solution of the discrete Poisson equation on a
square computed in O(n3/2) operations.
Only a couple of mm matrices are required for storage.Next: Use FFT to reduce the complexity to O(n log2 n)
Fast Poisson Solvers and FFT p. 10/
8/3/2019 Fastpoissonslides Using FFT
11/25
8/3/2019 Fastpoissonslides Using FFT
12/25
The product B = AS
The product B = AScan also be interpreted as a DST.
Indeed, since S is symmetric we have B = (SAT)T
which means that B is the transpose of the DST of the
rows of A.It follows that we can compute the unknowns V inAlgorithm 1 by carrying out Discrete Sine Transforms on
4 m-by-m matrices in addition to the computation of X.
Fast Poisson Solvers and FFT p. 12/
8/3/2019 Fastpoissonslides Using FFT
13/25
The Euler formula
ei = cos + i sin , R, i =1
ei = cos
i sin
cos =ei + ei
2, sin =
ei ei2i
Consider
N := e2i/N = cos
2
N i sin
2
N, NN = 1
Note that
jk2m+2 = e2jki/(2m+2) = ejkhi = cos(jkh) i sin(jkh).
Fast Poisson Solvers and FFT p. 13/
8/3/2019 Fastpoissonslides Using FFT
14/25
Discrete Fourier Transform (DFT)
If
zj =N
k=1
(j1)(k1)N yk, j = 1, . . . , N
then z = [z1, . . . , zN]T is the DFT of y = [y1, . . . , yN]
T.
z = FNy, where FN := (j1)(k1)N Nj,k=1, RN,N
FN is called the Fourier Matrix
Fast Poisson Solvers and FFT p. 14/
8/3/2019 Fastpoissonslides Using FFT
15/25
Example
4 = exp2i/4 = cos(2 ) i sin(2 ) = i
F4 =
1 1 1 1
1 4 24
34
1 24 44
64
1 34 64
94
=
1 1 1 1
1 i 1 i1 1 1 11 i 1 i
Fast Poisson Solvers and FFT p. 15/
8/3/2019 Fastpoissonslides Using FFT
16/25
Connection DST and DFT
DST of order m can be computed from DFT of orderN = 2m + 2 as follows:
Lemma 1. Given a positive integerm and a vectorx Rm.Componentk ofS
mx is equal toi/2 times componentk + 1 of
F2m+2z where
z = (0, x1, . . . , xm, 0,xm,xm1, . . . ,x1)T R2m+2.
In symbols
(Smx)k =i
2
(F2m+2z)k+1 , k = 1, . . . , m .
Fast Poisson Solvers and FFT p. 16/
8/3/2019 Fastpoissonslides Using FFT
17/25
Fast Fourier Transform (FFT)
Suppose N is even. Express FN in terms of FN/2. Reorder the columnsin FN so that the odd columns appear before the even ones.
PN := (e1, e3, . . . , eN1, e2, e4, . . . , eN)
F4 =
1 1 1 1
1 i 1 i1 1 1 1
1 i 1 i
F4P4 =
1 1 1 1
1 1 i i1 1 1 1
1 1 i i
.
F2 = 1 1
1 2 =
1 1
1 1 , D2 = diag(1, 4) =
1 0
0 i .
F4P4 =
F2 D2F2
F2D2F2
Fast Poisson Solvers and FFT p. 17/
8/3/2019 Fastpoissonslides Using FFT
18/25
F2m, P2m, Fm and Dm
Theorem 2. IfN = 2m is even then
F2mP2m =
Fm DmFm
FmDmFm
, (2)
where
Dm = diag(1, N, 2N, . . . ,
m1N ). (3)
Fast Poisson Solvers and FFT p. 18/
8/3/2019 Fastpoissonslides Using FFT
19/25
Proof
Fix integers j, k with 0 j,k m 1 and set p = j + 1 and q = k + 1.Since mm = 1,
2N = m, and
mN = 1 we find by considering elements in
the four sub-blocks in turn
(F2mP2m)p,q = j(2k)N = jkm = (Fm)p,q,
(F2mP2m)p+m,q = (j+m)(2k)N =
(j+m)km = (Fm)p,q,
(F2mP2m)p,q+m = j(2k+1)N =
jN
jkm = (DmFm)p,q,
(F2mP2m)p+m,q+m = (j+m)(2k+1)
N = j(2k+1)
N = (DmFm)p,q
It follows that the four m-by-m blocks of F2mP2m have the required
structure.
Fast Poisson Solvers and FFT p. 19/
8/3/2019 Fastpoissonslides Using FFT
20/25
The basic step
Using Theorem 2 we can carry out the DFT as a block multiplication.
Let y R2m and set w = PT2my = (wT1 ,wT2 )T, where w1,w2 Rm.Then F2my = F2mP2mP
T2my = F2mP2mw and
F2mP2mw =
Fm DmFmFm DmFm
w1w2
= q1 + q2q1 q2
, whereq1 = Fmw1 and q2 = Dm(Fmw2).
In order to compute F2my we need to compute Fmw1 and Fmw2.
Note that wT1 = [y1, y3, . . . , yN1], while wT2 = [y2, y4, . . . , yN].
This follows since wT = [wT1 ,wT2 ] = y
TP2m and post multiplying a
vector by P2m moves odd indexed components to the left of all theeven indexed components.
Fast Poisson Solvers and FFT p. 20/
8/3/2019 Fastpoissonslides Using FFT
21/25
Recursive FFT
Recursive Matlab function when N = 2k
Algorithm 3. (Recursive FFT )
function z=fftrec(y)
n=length(y);
if n==1 z=y;
else
q1=fftrec(y(1:2:n-1));
q2=exp(-2*pi*i/n). (0:n/2-1).*fftrec(y(2:2:n));z=[q1+q2 q1-q2];
end
Such a recursive version of FFT is useful for testing purposes, but is much
too slow for large problems. A challenge for FFT code writers is to develop
nonrecursive versions and also to handle efficiently the case where N is
not a power of two.
Fast Poisson Solvers and FFT p. 21/
FFT C l it
8/3/2019 Fastpoissonslides Using FFT
22/25
FFT Complexity
Let xk be the complexity (the number of flops) when N = 2k.
Since we need two FFTs of order N/2 = 2k1 and a multiplication
with the diagonal matrix DN/2 it is reasonable to assume that
xk
= 2xk
1+ 2k for some constant independent of k.
Since x0 = 0 we obtain by induction on k that
xk = k2k = Nlog2 N.
This also holds when N is not a power of 2.
Reasonable implementations of FFT typically have 5
Fast Poisson Solvers and FFT p. 22/
I t i FFT
8/3/2019 Fastpoissonslides Using FFT
23/25
Improvement using FFT
The efficiency improvement using the FFT to compute the DFT isimpressive for large N.
The direct multiplication FNy requires O(8n2) flops since complex
arithmetic is involved.
Assuming that the FFT uses 5Nlog2 N flops we find for
N = 220 106 the ratio
8N2
5Nlog2 N 84000.
Thus if the FFT takes one second of computing time and if the
computing time is proportional to the number of flops then the direct
multiplication would take something like 84000 seconds or 23 hours.
Fast Poisson Solvers and FFT p. 23/
P i l b d FFT
8/3/2019 Fastpoissonslides Using FFT
24/25
Poisson solver based on FFT
requires O(n log2 n) flops, where n = m2
is the size of the linearsystem Ax = b. Why?
4m sine transforms: G1 = SF, G = G1S, V1 = SX, V= V1S.
A total of 4m FFTs of order 2m + 2 are needed.Since one FFT requires O((2m + 2) log2(2m + 2) flops the 4m FFTs
amounts to 8m(m + 1) log2(2m + 2) 8m2 log2 m = 4n log2 n,
This should be compared to the O(8n3/2
) flops needed for 4straightforward matrix multiplications with S.
What is faster will depend on the programming of the FFT and the
size of the problem.
Fast Poisson Solvers and FFT p. 24/
C t th d
8/3/2019 Fastpoissonslides Using FFT
25/25
Compare exact methods
System Ax = b of order n = m2
.Assume m = 1000 so that n = 106.
Method # flops Storage T = 109flops
Full Cholesky 13n3 = 1310
18 n2 = 1012 10 years
Band Cholesky 2n2 = 2 1012 n3/2 = 109 1/2 hourBlock Tridiagonal 2n2 = 2 1012 n3/2 = 109 1/2 hourDiagonalization 8n3/2 = 8 109 n = 106 8 secFFT 20n log2 n = 4 108 n = 106 1/2 sec
Fast Poisson Solvers and FFT p. 25/