FFT Circuit Design
2
Outline
Applications of FFT in Communications
Fundamental FFT Algorithms
FFT Circuit Design Architectures
Conclusions
3
DAB Receiver
[Block diagram: Tuner, OFDM Demodulator (256/512/1024/2048-point FFT), Channel Decoder, Packet Demux, MPEG-2 Audio Decoder, Controller, Control Panel]
4
WLAN OFDM System
MAC Layer: 6 Mbps ~ 54 Mbps
TRANSMITTER: FEC Coder, S/P, 64-pt IFFT, Guard Interval Insertion, D/A, LPF, Up Converter
RECEIVER: Down Converter, LPF, A/D, Guard Interval Removal, 64-pt FFT, P/S, FEC Decoder
5
ADSL (Discrete Multi-tone) System
TRANSMITTER: Data In, S/P, QAM encoders, 512-pt IFFT, add cyclic prefix, P/S, D/A + transmit filter
channel
RECEIVER: receive filter + A/D, TEQ, remove cyclic prefix, S/P, 512-pt FFT, FEQ, QAM decoders, P/S, Data Out
6
Applications of FFT in Communications
Comm. System | WLAN | DAB | DVB | ADSL | VDSL
FFT Size | 64 | 256/512/1024/2048 | 2048/8192 | 512 | 512/1024/2048/4096
Modulation | OFDM | OFDM | OFDM | DMT | DMT
7
Outline
Applications of FFT in Communications
Fundamental FFT Algorithms
FFT Circuit Design Architectures
Conclusions
8
Fundamental FFT Algorithms
Discrete Fourier Transform Pair
Radix-2 FFT (N = 2^ν)
  Decimation-in-time (DIT)
  Decimation-in-frequency (DIF)
FFT for composite N (N = N1 N2)
  Cooley-Tukey Algorithms
  Radix-r FFT
9
Discrete Fourier Transform Pair
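Let
\[ x[n] \;\overset{\mathrm{DFT}}{\longleftrightarrow}\; X[k] \]
denote a DFT pair. We have
\[ X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn}, \quad k = 0, 1, \ldots, N-1, \]
\[ x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, W_N^{-kn}, \quad n = 0, 1, \ldots, N-1, \]
where
\[ W_N = e^{-j(2\pi/N)}. \]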
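As an illustration of the DFT pair, here is a minimal Python sketch that evaluates both sums directly. The function names `dft` and `idft` are illustrative and not from the slides; this is the O(N^2) direct calculation, not an FFT.

```python
import cmath

def dft(x):
    """Direct evaluation of X[k] = sum_n x[n] * W_N^(k*n), W_N = exp(-j*2*pi/N)."""
    N = len(x)
    W = cmath.exp(-2j * cmath.pi / N)
    return [sum(x[n] * W ** (k * n) for n in range(N)) for k in range(N)]

def idft(X):
    """Direct evaluation of x[n] = (1/N) * sum_k X[k] * W_N^(-k*n)."""
    N = len(X)
    W = cmath.exp(-2j * cmath.pi / N)
    return [sum(X[k] * W ** (-k * n) for k in range(N)) / N for n in range(N)]

# Round trip: idft(dft(x)) recovers x up to floating-point error.
x = [1, 2, 3, 4]
y = idft(dft(x))
```

The nested loops make the N^2 cost of the direct calculation explicit, which is what the FFT algorithms on the following slides reduce.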
10
Observations
W_N^k is N-periodic in k.
W_N^k is conjugate symmetric: W_N^{N-k} = (W_N^k)^*.
Both x[n] and X[k] are N-periodic.
If x[n] is real, then X[k] is conjugate symmetric, and vice versa.
11
Observations
A direct calculation requires approximately N^2 complex multiplications and additions. FFT algorithms reduce the computational complexity to the order of N log N.
Algorithms developed for the FFT also work for the IFFT with only minor modifications.
12
Example: Zero-Padding (WLAN)
WLAN 52 sub-carriers: use 64-point FFT.
IFFT inputs (0..63), by sub-carrier:
  input 0: Null
  inputs 1..26: sub-carriers #1..#26
  inputs 27..37: Null
  inputs 38..63: sub-carriers #-26..#-1
IFFT outputs (0..63): time-domain outputs
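The mapping of the 52 sub-carriers onto the 64 IFFT inputs can be sketched as follows. The function name `map_subcarriers` and the dictionary input format are assumptions made for this example; only the bin assignment itself comes from the slide.

```python
def map_subcarriers(symbols):
    """Place sub-carriers #-26..#-1 and #1..#26 into a 64-entry IFFT input vector.

    `symbols` maps a sub-carrier number to its complex value (hypothetical
    input format). Unused inputs (0 and 27..37) stay zero (Null).
    """
    bins = [0j] * 64
    for k, value in symbols.items():
        if k == 0 or not -26 <= k <= 26:
            raise ValueError(f"sub-carrier {k} is not one of the 52 used carriers")
        # Negative sub-carriers wrap to the upper inputs: -26 -> 38, -1 -> 63.
        bins[k % 64] = value
    return bins

# Example: tag every sub-carrier with its own number to see where it lands.
carriers = list(range(-26, 0)) + list(range(1, 27))
bins = map_subcarriers({k: complex(k, 0) for k in carriers})
```

Because the DFT is N-periodic, negative sub-carrier numbers are placed modulo 64, which leaves the DC input and the eleven inputs 27..37 as nulls.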
13
Decimation-in-Time Radix-2 FFT
Assume N is an even number.
\[
\begin{aligned}
X[k] &= \sum_{n=0}^{N-1} x[n]\, W_N^{kn} \\
     &= \sum_{r=0}^{N/2-1} x[2r]\, W_{N/2}^{kr} + W_N^{k} \sum_{r=0}^{N/2-1} x[2r+1]\, W_{N/2}^{kr} \\
     &= G[k] + W_N^{k} H[k], \quad k = 0, \ldots, N-1.
\end{aligned}
\]
14
Observations
G[k] is DFT of even samples of x[n].
H[k] is DFT of odd samples of x[n].
G[k] and H[k] are N/2-periodic.
W_N^{k+N/2} = -W_N^k.
15
DIT Radix-2 FFT
\[ X[r] = G[r] + W_N^{r} H[r], \]
\[ X[r+N/2] = G[r] + W_N^{r+N/2} H[r] = G[r] - W_N^{r} H[r], \quad 0 \le r < N/2. \]
[Butterfly: inputs G[r] and H[r]; outputs X[r] and X[r+N/2]; branch weights W_N^r and -W_N^r]
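The two butterfly equations translate directly into a recursive routine. The following is a minimal sketch, assuming the input length is a power of two; the name `fft_dit` is illustrative.

```python
import cmath

def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT (length must be a power of two)."""
    N = len(x)
    if N == 1:
        return list(x)
    if N % 2:
        raise ValueError("length must be a power of two")
    G = fft_dit(x[0::2])  # DFT of the even samples
    H = fft_dit(x[1::2])  # DFT of the odd samples
    X = [0j] * N
    for r in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * r / N) * H[r]  # W_N^r * H[r]
        X[r] = G[r] + t           # X[r]       = G[r] + W_N^r H[r]
        X[r + N // 2] = G[r] - t  # X[r + N/2] = G[r] - W_N^r H[r]
    return X

X = fft_dit([1, 0, 0, 0, 0, 0, 0, 0])  # unit impulse: every X[k] equals 1
```

Each butterfly needs only one complex multiplication, because the product W_N^r H[r] is shared by both outputs.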
16
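Decimation-in-Time Radix-2 FFT
Butterfly for Radix-2 DIT FFT
[Butterfly between the (M-1)th stage and the Mth stage, drawn two ways: with branch weights W_N^r and -W_N^r, and equivalently with a single multiplication by W_N^r followed by a -1 branch]
In-place Computation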
17
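Decimation-in-Time Radix-2 FFT
First layer decimation
[Signal flow graph, N = 8: the even samples x[0], x[2], x[4], x[6] feed one N/2-point DFT producing G[0] to G[3]; the odd samples x[1], x[3], x[5], x[7] feed a second N/2-point DFT producing H[0] to H[3]; the outputs X[0] to X[7] are formed by butterflies with twiddle factors W_N^0 to W_N^3 and -1 branches]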
18
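Decimation-in-Time Radix-2 FFT
[Signal flow graph, N = 8, three butterfly stages: inputs in bit-reversed order x[0], x[4], x[2], x[6], x[1], x[5], x[3], x[7]; outputs X[0] to X[7] in natural order; twiddle factors W_N^0 in the first stage, W_N^0 and W_N^2 in the second stage, W_N^0 to W_N^3 in the third stage]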
19
Bit Reversal
[Tree diagram: the samples x[n2 n1 n0] are split on bit n0 first, then n1, then n2, giving the input order x[0 0 0], x[1 0 0], x[0 1 0], x[1 1 0], x[0 0 1], x[1 0 1], x[0 1 1], x[1 1 1]]
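The bit-reversed input order can be generated with a few lines of Python. This is a small sketch; `bit_reverse` and `bit_reversed_order` are illustrative names, and N is assumed to be a power of two.

```python
def bit_reverse(n, bits):
    """Reverse the lowest `bits` bits of n, e.g. (n2 n1 n0) -> (n0 n1 n2)."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (n & 1)  # shift the lowest bit of n into r
        n >>= 1
    return r

def bit_reversed_order(N):
    """Input sample order for an N-point radix-2 DIT FFT (N a power of two)."""
    bits = N.bit_length() - 1
    return [bit_reverse(n, bits) for n in range(N)]

print(bit_reversed_order(8))  # [0, 4, 2, 6, 1, 5, 3, 7]
```

The permutation is its own inverse, so the same routine reorders the bit-reversed outputs of a DIF flow graph back into natural order.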
20
Decimation-in-frequency Radix-2 FFT
Assume N is an even number.
\[ X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn}, \quad k = 0, \ldots, N-1 \]
\[ X[2r] = \sum_{n=0}^{(N/2)-1} \left( x[n] + x[n+N/2] \right) W_{N/2}^{rn} \]
\[ X[2r+1] = \sum_{n=0}^{(N/2)-1} \left( x[n] - x[n+N/2] \right) W_N^{n}\, W_{N/2}^{rn}, \quad r = 0, \ldots, N/2-1 \]
21
Decimation-in-frequency Radix-2 FFT
\[ X[2r] = \sum_{n=0}^{(N/2)-1} g[n]\, W_{N/2}^{rn}, \]
\[ X[2r+1] = \sum_{n=0}^{(N/2)-1} h[n]\, W_N^{n}\, W_{N/2}^{rn}, \]
where
\[ g[n] = x[n] + x[n+N/2], \quad h[n] = x[n] - x[n+N/2], \quad r = 0, \ldots, N/2-1. \]
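A recursive sketch of the decimation-in-frequency split follows, assuming a power-of-two length. The name `fft_dif` is illustrative. Unlike the signal flow graph, this version writes the results back in natural order, so no separate bit-reversal pass is needed.

```python
import cmath

def fft_dif(x):
    """Recursive radix-2 decimation-in-frequency FFT (length must be a power of two)."""
    N = len(x)
    if N == 1:
        return list(x)
    if N % 2:
        raise ValueError("length must be a power of two")
    half = N // 2
    # g[n] = x[n] + x[n + N/2]
    g = [x[n] + x[n + half] for n in range(half)]
    # h[n] = x[n] - x[n + N/2], pre-multiplied by the twiddle factor W_N^n
    h = [(x[n] - x[n + half]) * cmath.exp(-2j * cmath.pi * n / N) for n in range(half)]
    X = [0j] * N
    X[0::2] = fft_dif(g)  # even-indexed outputs X[2r]
    X[1::2] = fft_dif(h)  # odd-indexed outputs X[2r+1]
    return X
```

Here the twiddle multiplication follows the subtraction, which is the structural difference from the DIT butterfly.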
22
Decimation-in-frequency Radix-2 FFT
Butterfly for Radix-2 DIF FFT
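[Butterfly between the (M-1)th stage and the Mth stage: the upper output is the sum; the lower output is the difference (-1 branch) multiplied by W_N^n]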
In-place Computation
23
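Decimation-in-frequency Radix-2 FFT
First layer decimation
[Signal flow graph, N = 8: inputs x[0] to x[7]; the first butterfly stage forms g[0] to g[3] and, via -1 branches and twiddle factors W_N^0 to W_N^3, h[0] to h[3]; one N/2-point DFT of g[n] gives X[0], X[2], X[4], X[6]; a second N/2-point DFT of the twiddled h[n] gives X[1], X[3], X[5], X[7]]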
24
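Decimation-in-frequency Radix-2 FFT
[Signal flow graph, N = 8, three butterfly stages: inputs x[0] to x[7] in natural order; outputs in bit-reversed order X[0], X[4], X[2], X[6], X[1], X[5], X[3], X[7]; twiddle factors W_N^0 to W_N^3 in the first stage, W_N^0 and W_N^2 in the second stage, W_N^0 in the third stage]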
25
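Butterfly Comparison
Butterfly (decimation-in-frequency)
[(M-1)th stage to Mth stage: add and subtract first, then multiply the difference by W_N^n]
Butterfly (decimation-in-time)
[(M-1)th stage to Mth stage: multiply the lower input by W_N^r first, then add and subtract]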
26
Cooley-Tukey Algorithm
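\[ N = N_1 \cdot N_2 \]
\[ n = N_2 n_1 + n_2, \quad \begin{cases} 0 \le n_1 \le N_1 - 1 \\ 0 \le n_2 \le N_2 - 1 \end{cases} \]
\[ k = k_1 + N_1 k_2, \quad \begin{cases} 0 \le k_1 \le N_1 - 1 \\ 0 \le k_2 \le N_2 - 1 \end{cases} \]
2D point re-arrangement:
\[ x[n] = x[N_2 n_1 + n_2], \quad X[k] = X[k_1 + N_1 k_2]. \]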
27
Cooley-Tukey Algorithms
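\[ X[k_1 + N_1 k_2] = \sum_{n_2=0}^{N_2-1} \left[ \left( \sum_{n_1=0}^{N_1-1} x[N_2 n_1 + n_2]\, W_{N_1}^{n_1 k_1} \right) W_N^{n_2 k_1} \right] W_{N_2}^{n_2 k_2} \]
Inner sum (N1-point DFTs): G[n_2, k_1]
Twiddle factor: W_N^{n_2 k_1}
After the twiddle multiplication: \tilde{G}[n_2, k_1] = G[n_2, k_1] W_N^{n_2 k_1}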
28
Observations
N1 = 2, N2 = N/2 -> 1st stage of the decimation-in-frequency radix-2 FFT.
N1 = N/2, N2 = 2 -> 1st stage of the decimation-in-time radix-2 FFT.
In general, N = N1 N2 … Nn.
If N = r^n -> Radix-r.
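One level of the Cooley-Tukey decomposition, with the index maps n = N2 n1 + n2 and k = k1 + N1 k2, can be sketched as below. The function name `cooley_tukey` and the helper `W` are illustrative. The inner and outer DFTs are evaluated directly here for clarity, whereas a real design would apply the decomposition recursively.

```python
import cmath

def W(M, e):
    """W_M^e = exp(-j*2*pi*e/M)."""
    return cmath.exp(-2j * cmath.pi * e / M)

def cooley_tukey(x, N1, N2):
    """One Cooley-Tukey step for N = N1 * N2."""
    N = N1 * N2
    if len(x) != N:
        raise ValueError("len(x) must equal N1 * N2")
    # Inner N1-point DFTs: G[n2][k1] = sum_{n1} x[N2*n1 + n2] * W_{N1}^(n1*k1)
    G = [[sum(x[N2 * n1 + n2] * W(N1, n1 * k1) for n1 in range(N1))
          for k1 in range(N1)] for n2 in range(N2)]
    # Twiddle multiplication: Gt[n2][k1] = G[n2][k1] * W_N^(n2*k1)
    Gt = [[G[n2][k1] * W(N, n2 * k1) for k1 in range(N1)] for n2 in range(N2)]
    # Outer N2-point DFTs: X[k1 + N1*k2] = sum_{n2} Gt[n2][k1] * W_{N2}^(n2*k2)
    X = [0j] * N
    for k1 in range(N1):
        for k2 in range(N2):
            X[k1 + N1 * k2] = sum(Gt[n2][k1] * W(N2, n2 * k2) for n2 in range(N2))
    return X

X = cooley_tukey(list(range(12)), 3, 4)  # a 12-point DFT as 3-point and 4-point DFTs
```

Calling it with N1 = 2 reproduces the first DIF radix-2 stage, and calling it with N2 = 2 reproduces the first DIT radix-2 stage.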
29
Radix-3 FFT (DIF)
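Assume N is a multiple of 3.
\[
\begin{aligned}
X[k] &= \sum_{n=0}^{N-1} x[n]\, W_N^{kn} \\
X[3r] &= \sum_{n=0}^{(N/3)-1} \left( x[n] + x[n+N/3] + x[n+2N/3] \right) W_{N/3}^{rn} \\
X[3r+1] &= \sum_{n=0}^{(N/3)-1} \left( x[n] + x[n+N/3]\, e^{-j\frac{2\pi}{3}} + x[n+2N/3]\, e^{j\frac{2\pi}{3}} \right) W_N^{n}\, W_{N/3}^{rn} \\
X[3r+2] &= \sum_{n=0}^{(N/3)-1} \left( x[n] + x[n+N/3]\, e^{j\frac{2\pi}{3}} + x[n+2N/3]\, e^{-j\frac{2\pi}{3}} \right) W_N^{2n}\, W_{N/3}^{rn}
\end{aligned}
\]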
30
Radix-3 FFT (DIF)
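Butterfly for Radix-3 DIF FFT
[Butterfly between the (M-1)th stage and the Mth stage: three inputs combined with weights 1, e^{j2π/3} and e^{-j2π/3}; the second and third outputs are multiplied by W_N^n and W_N^{2n}]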
31
Radix-4 FFT (DIF)
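Assume N is a multiple of 4.
\[
\begin{aligned}
X[4r]   &= \sum_{n=0}^{(N/4)-1} \left( x[n] + x[n+N/4] + x[n+2N/4] + x[n+3N/4] \right) W_{N/4}^{rn} \\
X[4r+1] &= \sum_{n=0}^{(N/4)-1} \left( x[n] + (-j)\, x[n+N/4] + (-1)\, x[n+2N/4] + (j)\, x[n+3N/4] \right) W_N^{n}\, W_{N/4}^{rn} \\
X[4r+2] &= \sum_{n=0}^{(N/4)-1} \left( x[n] + (-1)\, x[n+N/4] + x[n+2N/4] + (-1)\, x[n+3N/4] \right) W_N^{2n}\, W_{N/4}^{rn} \\
X[4r+3] &= \sum_{n=0}^{(N/4)-1} \left( x[n] + (j)\, x[n+N/4] + (-1)\, x[n+2N/4] + (-j)\, x[n+3N/4] \right) W_N^{3n}\, W_{N/4}^{rn}
\end{aligned}
\]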
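The four radix-4 DIF equations can be sketched recursively as follows, assuming the length is a power of 4. The name `fft_radix4_dif` and the table `COEFFS` are illustrative. The butterfly weights 1, -j, -1, j require no real multiplier, only sign changes and swaps of real and imaginary parts.

```python
import cmath

# Row m holds the weights applied to x[n], x[n+N/4], x[n+2N/4], x[n+3N/4]
# when forming the outputs X[4r + m].
COEFFS = [
    (1, 1, 1, 1),
    (1, -1j, -1, 1j),
    (1, -1, 1, -1),
    (1, 1j, -1, -1j),
]

def fft_radix4_dif(x):
    """Recursive radix-4 decimation-in-frequency FFT (length must be a power of 4)."""
    N = len(x)
    if N == 1:
        return list(x)
    if N % 4:
        raise ValueError("length must be a power of 4")
    q = N // 4
    X = [0j] * N
    for m, (c0, c1, c2, c3) in enumerate(COEFFS):
        # Radix-4 butterfly followed by the twiddle factor W_N^(m*n).
        y = [(c0 * x[n] + c1 * x[n + q] + c2 * x[n + 2 * q] + c3 * x[n + 3 * q])
             * cmath.exp(-2j * cmath.pi * m * n / N) for n in range(q)]
        X[m::4] = fft_radix4_dif(y)  # N/4-point DFT gives X[4r + m]
    return X
```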
32
Radix-4 FFT (DIF)
Butterfly for Radix-4 DIF FFT
(M-1)th stage Mth stage
33
Split Radix FFT
Mixes the radix-2 and radix-4 decompositions.
Computes the even-indexed transform coefficients with the radix-2 strategy and the odd-indexed coefficients with the radix-4 strategy.
Can perform the FFT for N = 2^ν.
34
Simplify Butterfly Representations
Radix-2
Radix-4
35
Split-Radix FFT
36
Computational Complexity
Method | # of Complex Multiplications | # of Complex Additions
DFT | N^2 | N(N-1)
Radix-2 | (N/2) log2 N | N log2 N
Radix-4 | (3N/8) log2 N | (3N/2) log2 N
The above numbers do not tell the whole story! Architecture is the key to trading off performance, cost, hardware complexity, etc.
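To get a feel for the table, the formulas can be evaluated for a concrete size. This is a small sketch; the function name `complex_op_counts` is illustrative, and N is restricted to a power of 4 so that all three rows apply.

```python
def complex_op_counts(N):
    """Return {method: (complex multiplications, complex additions)} per the table.

    N is assumed to be a power of 4 so that the radix-4 row is meaningful.
    """
    log2N = N.bit_length() - 1
    if N != 1 << log2N or log2N % 2:
        raise ValueError("N must be a power of 4")
    return {
        "DFT": (N * N, N * (N - 1)),
        "Radix-2": (N * log2N // 2, N * log2N),
        "Radix-4": (3 * N * log2N // 8, 3 * N * log2N // 2),
    }

print(complex_op_counts(64))
# {'DFT': (4096, 4032), 'Radix-2': (192, 384), 'Radix-4': (144, 576)}
```

For the 64-point WLAN case the multiplication count drops from 4096 to 192 or 144, while the table's radix-4 addition count is higher than the radix-2 one.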
37
Outline
Applications of FFT in Communications
Fundamental FFT Algorithms
FFT Circuit Design Architectures
Conclusions
38
FFT Architecture Design Considerations
Trade-off among accuracy, speed, hardware complexity, and power consumption: the best-fit architecture is application dependent.
Main architecture differences:
  Degree of parallelism: number and complexity of processing elements.
  Control scheme: hardware utilization and data flow control.
39
Degree of Parallelism
One simple processing unit or multiple simple processing units
[Signal flow graph of the 8-point radix-2 FFT: inputs x[0] to x[7], outputs X[0] to X[7], three stages of butterflies with -1 branches and twiddle factors W_N^0 to W_N^3]
40
Degree of Parallelism
Simple processing units versus complex processing units
41
Memory-based FFT architecture
Single butterfly or processing element.
Required memory size = N.
A control unit ensures the right data flow to compute the FFT.
Firmware-like.
Low complexity.
Low speed.
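A software analogue of the memory-based architecture is an in-place FFT in which a single butterfly is reused for every operation and the loop nest plays the role of the control unit. The sketch below assumes a radix-2 DIT schedule and a power-of-two length; `fft_in_place` and `ram` are illustrative names, not taken from the slides.

```python
import cmath

def fft_in_place(ram):
    """In-place radix-2 DIT FFT over an N-word list standing in for the RAM."""
    N = len(ram)
    bits = N.bit_length() - 1
    if N != 1 << bits:
        raise ValueError("length must be a power of two")
    # Address generation, part 1: bit-reversal reordering of the input.
    for n in range(N):
        r, m = 0, n
        for _ in range(bits):
            r = (r << 1) | (m & 1)
            m >>= 1
        if r > n:
            ram[n], ram[r] = ram[r], ram[n]
    # Address generation, part 2: log2(N) passes, one butterfly at a time.
    span = 1
    while span < N:
        for start in range(0, N, 2 * span):
            for r in range(span):
                w = cmath.exp(-2j * cmath.pi * r / (2 * span))  # coefficient ROM lookup
                a = ram[start + r]
                b = w * ram[start + r + span]
                # Results overwrite the operands, so only N words are needed.
                ram[start + r], ram[start + r + span] = a + b, a - b
        span *= 2
    return ram
```

The butterfly is executed (N/2) log2 N times in sequence, which illustrates the low-complexity, low-speed trade-off.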
42
Memory-based FFT Block Diagram
[Block diagram: Data In, Input Buffer, RAM, Butterfly or Processing Element, Coefficients ROM or Generator, Control Unit (control signals to the other blocks), Data Out]
43
Pipeline Architectures
FFT Signal Flow Graph
Multiple path delay commutator
Single path delay commutator
Single path delay feedback
44
Radix-2 Signal Flow Graph (DIT)
[8-point radix-2 DIT signal flow graph (inputs in bit-reversed order, outputs X[0] to X[7]) mapped onto a pipeline of three BF2 butterfly stages, each with a twiddle-factor ROM, separated by buffers]
45
Radix-2 Signal Flow Graph (DIF)
[8-point radix-2 DIF signal flow graph (inputs x[0] to x[7], outputs in bit-reversed order) mapped onto a pipeline of three BF2 butterfly stages, each with a twiddle-factor ROM, separated by buffers]
46
Multi-Path Delay Commutator
Commutator(switch)
Delay
Delay
Butterfly
Delay
Delay
47
Radix-2 Multi-Path Delay Commutator
[8-point radix-2 DIF signal flow graph mapped onto a pipeline of three butterflies, separated by switches (commutators) and delay elements. Input stream: samples 0 to 7. Sample indices on the two paths: 0 1 2 3 and 4 5 6 7 into the first butterfly; 0 1 4 5 and 2 3 6 7 into the second; 0 2 4 6 and 1 3 5 7 into the third]
48
Radix-2 Multi-Path Delay Commutator
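[Pipeline, N = 16: four BF2 butterfly stages with C2 commutators; delay lengths 8 at the input, then 4 and 4, 2 and 2, 1 and 1 between successive stages]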
49
Radix-4 Multi-Path Delay Commutator
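[Pipeline, N = 256: four BF4 butterfly stages with C4 commutators; delay lengths 192, 128, 64 at the input, then 48, 32, 16 (twice), 12, 8, 4 (twice), and 3, 2, 1 (twice) around the following commutators]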
50
Single Path Delay Commutator
DelayCommutator Butterfly
51
Radix-2 Single Path Delay Commutator
DC2 BF2 DC2 BF2 DC2 BF2 DC2 BF2
N=16
52
Radix-4 Single Path Delay Commutator
DC4 BF4 DC4 BF4 DC4 BF4 DC4 BF4
N=256
53
Single Path Delay Feedback
Butterfly
Delay
54
Radix-2 Single Path Delay Feedback
[Pipeline, N = 16: four BF2 butterfly stages, each with a feedback delay line; delay lengths 8, 4, 2, 1]
55
Radix-4 Single Path Delay Feedback
[Pipeline, N = 256: four BF4 butterfly stages, each with three feedback delay lines; delay lengths 64x3, 16x3, 4x3, 1x3]
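The feedback delay lengths in the single path delay feedback pipelines follow a simple pattern: each radix-r stage has r-1 delay lines, and the line length shrinks by a factor of r from one stage to the next. The sketch below reproduces the numbers for the two examples; `sdf_delay_lines` and `total_memory` are illustrative names.

```python
def sdf_delay_lines(N, radix):
    """Return [(lines_per_stage, line_length), ...] for a radix-r SDF pipeline."""
    if radix < 2:
        raise ValueError("radix must be at least 2")
    stages = []
    size = N
    while size > 1:
        if size % radix:
            raise ValueError("N must be a power of the radix")
        size //= radix
        stages.append((radix - 1, size))  # r-1 feedback lines of `size` words
    return stages

def total_memory(stages):
    """Total number of delay words over all stages."""
    return sum(lines * length for lines, length in stages)

print(sdf_delay_lines(16, 2))   # [(1, 8), (1, 4), (1, 2), (1, 1)]
print(sdf_delay_lines(256, 4))  # [(3, 64), (3, 16), (3, 4), (3, 1)]
```

In both cases the total comes to N-1 words (15 and 255).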
56
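R2²SDF
[Pipeline, N = 256: eight butterfly stages alternating BF2I and BF2II, each with a feedback delay line; delay lengths 128 (BF2I), 64 (BF2II), 32 (BF2I), 16 (BF2II), 8 (BF2I), 4 (BF2II), 2 (BF2I), 1 (BF2II)]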
57
Hardware Comparison
Architecture | Multiplier # | Adder # | Memory Size | Control
R2MDC | 2(log4 N - 1) | 4 log4 N | 3N/2 - 2 | simple
R2SDF | 2(log4 N - 1) | 4 log4 N | N - 1 | simple
R4MDC | 3(log4 N - 1) | 8 log4 N | 5N/2 - 4 | simple
R4SDF | log4 N - 1 | 8 log4 N | N - 1 | medium
R4SDC | log4 N - 1 | 3 log4 N | 2N - 2 | complex
R2²SDF | log4 N - 1 | 4 log4 N | N - 1 | simple
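The comparison is easier to read with numbers plugged in. The sketch below evaluates the table's formulas for a given N, assumed to be a power of 4 so that log4 N is an integer. The dictionary `ARCHITECTURES` and the function `hardware_cost` are illustrative names, and the formulas are simply those of the table.

```python
# Per architecture: multipliers(s), adders(s), memory(N), control; s = log4 N.
ARCHITECTURES = {
    "R2MDC":  (lambda s: 2 * (s - 1), lambda s: 4 * s, lambda N: 3 * N // 2 - 2, "simple"),
    "R2SDF":  (lambda s: 2 * (s - 1), lambda s: 4 * s, lambda N: N - 1,          "simple"),
    "R4MDC":  (lambda s: 3 * (s - 1), lambda s: 8 * s, lambda N: 5 * N // 2 - 4, "simple"),
    "R4SDF":  (lambda s: s - 1,       lambda s: 8 * s, lambda N: N - 1,          "medium"),
    "R4SDC":  (lambda s: s - 1,       lambda s: 3 * s, lambda N: 2 * N - 2,      "complex"),
    "R22SDF": (lambda s: s - 1,       lambda s: 4 * s, lambda N: N - 1,          "simple"),
}

def hardware_cost(arch, N):
    """Return (multipliers, adders, memory words, control) for an N-point pipeline."""
    bits = N.bit_length() - 1
    if N != 1 << bits or bits % 2:
        raise ValueError("N must be a power of 4")
    s = bits // 2  # log4 N
    mult, add, mem, control = ARCHITECTURES[arch]
    return mult(s), add(s), mem(N), control

for name in ARCHITECTURES:
    print(name, hardware_cost(name, 256))
```

For N = 256, for instance, R2MDC evaluates to 6 multipliers, 16 adders and 382 memory words, while R22SDF (the R2²SDF row) evaluates to 3 multipliers, 16 adders and 255 memory words.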
58
Conclusions
Efficient FFT computation is essential to many communication applications that use OFDM or DMT techniques.
A pipelined FFT architecture is used where high real-time performance is required. A memory-based FFT architecture can be adopted when cost matters more than speed.
The best-fit FFT architecture depends on application-specific requirements, trading off accuracy, speed, chip size, power consumption, etc.