Fast Fourier Transform: VLSI Architectures


  • Fast Fourier Transform: VLSI Architectures

    Lecture 10

    6.973 Communication System Design Spring 2006

    Massachusetts Institute of Technology

    Vladimir Stojanović

    Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.

    Downloaded on [DD Month YYYY].

  • Pipelined FFT architectures

    Examples

    Radix-2: multi-path delay commutator (R2MDC); single-path delay feedback (R2SDF)

    Radix-4: single-path delay feedback (R4SDF); multi-path delay commutator (R4MDC); single-path delay commutator (R4SDC)


    [Figure: block diagrams of the five pipelined architectures, built from commutators (C2, C4), butterflies (BF2, BF4), and delay elements: (1) R2MDC (N=16), (2) R2SDF (N=16), (3) R4SDF (N=256), (4) R4MDC (N=256), (5) R4SDC (N=256). Figure by MIT OpenCourseWare.]

  • Radix-2 Multi-path Delay Commutator

    The most classical approach to pipelined implementation of the radix-2 FFT

    The input sequence is broken into two parallel data streams flowing forward; proper delays schedule the data elements so that they enter the butterfly with the correct distance between them

    Both butterflies and multipliers are at 50% utilization
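    The 50% figure can be illustrated with a minimal scheduling sketch of the first R2MDC stage. It assumes one input sample per clock cycle and an N/2-deep delay on the upper path, so that x[n] meets x[n + N/2] at the butterfly; the function name and the list-based delay line are illustrative choices, not taken from the lecture.

```python
def r2mdc_first_stage_schedule(n_points=16):
    """Return (pairs, utilization) for the first BF2 of an R2MDC pipeline.

    Assumption: one sample arrives per cycle; the first N/2 samples are
    steered into a delay line of depth N/2, the second N/2 go straight
    to the butterfly.
    """
    half = n_points // 2
    delay_line = []   # models the N/2-deep delay on the upper path
    pairs = []        # (cycle, upper_index, lower_index) seen by the butterfly
    for cycle in range(n_points):
        if cycle < half:
            # Commutator steers the sample into the delay path; BF2 is idle.
            delay_line.append(cycle)
        else:
            # Delayed sample x[cycle - N/2] meets the live sample x[cycle].
            upper = delay_line.pop(0)
            pairs.append((cycle, upper, cycle))
    utilization = len(pairs) / n_points
    return pairs, utilization


pairs, utilization = r2mdc_first_stage_schedule(16)
print(pairs[0], pairs[-1], utilization)
```

    In this model the butterfly receives a valid pair in only N/2 of every N cycles, which is where the 50% utilization comes from.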


    [Figure: R2MDC pipeline for N=16: four commutator (C2) and butterfly (BF2) stages with delay elements of length 8, 4, 4, 2, 2, 1, 1. Figure by MIT OpenCourseWare.]

  • Radix-2 Single-path Delay Feedback

    Uses registers more efficiently: the same registers hold both the inputs and the outputs of the butterfly

    A single data stream goes through the multiplier at every stage

    Multiplier utilization is also 50%


    [Figure: R2SDF pipeline for N=16: four BF2 stages with feedback delays of 8, 4, 2, and 1. Figure by MIT OpenCourseWare.]

    [Wold & Despain 84]
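    A behavioural sketch of the single-path delay feedback idea follows. It is a functional model under stated assumptions, not the lecture's hardware: decimation-in-frequency ordering, twiddle W = exp(-j2π/N), one sample per cycle, and a free-running counter as the only control. The class and function names are hypothetical.

```python
import cmath
from collections import deque


class SdfStage:
    """One radix-2 SDF stage: a BF2 plus a `delay`-deep feedback FIFO."""

    def __init__(self, delay):
        self.delay = delay
        self.fifo = deque([0j] * delay)
        self.count = 0  # free-running cycle counter, the only control signal

    def step(self, x):
        d = self.delay
        phase = self.count % (2 * d)
        self.count += 1
        stored = self.fifo.popleft()
        if phase < d:
            # First half of a block: buffer the input, and drain the
            # differences left by the previous block through the multiplier.
            self.fifo.append(x)
            return stored * cmath.exp(-2j * cmath.pi * phase / (2 * d))
        # Second half: butterfly. The sum leaves the stage; the difference
        # is written back into the same FIFO that held the input.
        self.fifo.append(stored - x)
        return stored + x


def bit_reverse(index, bits):
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def r2sdf_fft(x):
    """Stream x through log2(N) SDF stages; return the DFT in natural order."""
    n = len(x)
    bits = n.bit_length() - 1
    assert n == 1 << bits, "length must be a power of two"
    stages = [SdfStage(n >> (s + 1)) for s in range(bits)]
    outputs = []
    # N data samples followed by N-1 zeros to flush the pipeline.
    for sample in list(x) + [0j] * (n - 1):
        for stage in stages:
            sample = stage.step(sample)
        outputs.append(sample)
    # Pipeline latency is N-1 cycles; results emerge in bit-reversed order.
    result = [0j] * n
    for pos, value in enumerate(outputs[n - 1:]):
        result[bit_reverse(pos, bits)] = value
    return result


def dft(x):
    """Direct O(N^2) DFT, used only as a reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]


x = [complex(i % 5, (3 * i) % 7) for i in range(16)]
error = max(abs(a - b) for a, b in zip(r2sdf_fft(x), dft(x)))
print(error < 1e-9)
```

    In this model the FIFO depths for N=16 are 8, 4, 2, and 1, N-1 words in total, and each stage's multiplier sees the stored differences only during the first half of every block, which matches the 50% multiplier utilization stated above.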

  • Radix-4 Single-path Delay Feedback

    [Despain74]

    Multiplier utilization rises to 75% by storing 3 of the BF4 outputs

    Radix-4 butterfly utilization is only 25%

    The butterfly is fairly complicated: at least 8 complex adders
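    The adder count can be checked with a short sketch of a radix-4 butterfly (a 4-point DFT). It assumes the forward convention W = exp(-j2π/N); the function name `bf4` mirrors the figure label but the code itself is illustrative. Multiplication by -j is a swap of real and imaginary parts with a sign change, so it needs no multiplier.

```python
def bf4(x0, x1, x2, x3):
    """4-point DFT with 8 complex additions and no general multiplications."""
    # First rank: 4 complex adders.
    a = x0 + x2
    b = x0 - x2
    c = x1 + x3
    d = (x1 - x3) * -1j  # trivial rotation by -j, not a real multiplier
    # Second rank: 4 complex adders.
    return a + c, b + d, a - c, b - d


# A unit impulse at index 1 should give the samples of exp(-j*2*pi*k/4).
print(bf4(0, 1, 0, 0))
```

    In a single-path pipeline only one new sample arrives per cycle, so a butterfly that consumes four inputs at once can do useful work in one cycle out of four, which is consistent with the 25% utilization above.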


    [Figures: R4SDF pipeline of BF4 stages with 3×4 and 3×1 feedback delays; radix-4 butterfly signal-flow graph with inputs x(n), x(n+N/4), x(n+N/2), x(n+3N/4), outputs y(n), y(n+N/4), y(n+N/2), y(n+3N/4), and twiddle factors W_N^0, W_N^n, W_N^{2n}, W_N^{3n}; 16-point FFT decomposed into DFT4 blocks mapping x0..x15 to X0..X15. Figures by MIT OpenCourseWare.]

  • Radix-4 Multi-path Delay Commutator

    [Swartzlander84]

    What is the utilization of Butterflies? Multipliers?


    [Figures: R4MDC pipeline of commutators (C4) and butterflies (BF4) with multi-path delay elements; radix-4 butterfly signal-flow graph; 16-point FFT decomposed into DFT4 blocks. Figures by MIT OpenCourseWare.]

  • Radix-4 Single-path Delay Commutator

    [Bi & Jones 89]

    Modified radix-4 algorithm

    Programmable radix-4 butterfly (BF), 75% utilization

    Used to build one of the largest single-chip FFTs (8192 points) [Bidet95]


    [Figures: R4SDC pipeline of two stages, each a commutator followed by a butterfly element and a coefficient multiplier, driven by control signals c1 to c6; radix-4 butterfly signal-flow graph; 16-point FFT decomposed into DFT4 blocks. Figures by MIT OpenCourseWare.]

  • R4SDC commutator and butterfly details


    [Figures: R4SDC commutator built from delay elements and 2:1 multiplexers under control signals c1 to c3, with a timing diagram of the input x(n) and the commutator outputs at stage 1; butterfly element built from add/sub units (0 = addition, 1 = subtraction) operating on real and imaginary parts under control signals c4 to c6. Figures by MIT OpenCourseWare.]

  • Some conclusions

    Delay-feedback approaches are always more efficient than the corresponding delay-commutator approaches in terms of memory utilization, since the butterfly outputs share the same storage as its inputs

    Pipeline architectures require FFT algorithms to be formulated in a hardware-oriented form, where spatial regularity is preserved in the signal-flow graph (SFG), so that arithmetic operations can be tightly scheduled for efficient hardware utilization
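    The memory claim can be checked by summing the delay-line lengths labelled in the architecture overview figure. The lists below are an assumed transcription of those labels, and the closed-form expressions are commonly quoted memory sizes added here for comparison; neither is stated in the slide text.

```python
# Delay-line lengths per architecture (assumed reading of the figure labels).
figure_delays = {
    "R2MDC, N=16": [8, 4, 4, 2, 2, 1, 1],
    "R2SDF, N=16": [8, 4, 2, 1],
    "R4SDF, N=256": [3 * 64, 3 * 16, 3 * 4, 3 * 1],
    "R4MDC, N=256": [192, 128, 64, 16, 32, 48, 48, 32, 16,
                     4, 8, 12, 12, 8, 4, 1, 2, 3, 3, 2, 1],
    "R4SDC, N=256": [6 * 64, 6 * 16, 6 * 4, 6 * 1],
}

# Closed forms assumed for comparison (not taken from the slides).
closed_forms = {
    "R2MDC, N=16": lambda n: 3 * n // 2 - 2,
    "R2SDF, N=16": lambda n: n - 1,
    "R4SDF, N=256": lambda n: n - 1,
    "R4MDC, N=256": lambda n: 5 * n // 2 - 4,
    "R4SDC, N=256": lambda n: 2 * n - 2,
}

memory_words = {name: sum(delays) for name, delays in figure_delays.items()}

for name, words in memory_words.items():
    n = int(name.split("N=")[1])
    print(f"{name}: {words} words (closed form: {closed_forms[name](n)})")
```

    Under this reading, both delay-feedback designs need N-1 words, while every delay-commutator design needs more for the same N.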


  • Decomposition: a review

    The twiddle factor is the Nth primitive root of unity, with its exponent evaluated modulo N

    Most fast algorithms share the same general strategy: map the one-dimensional transform into a two- or multi-dimensional representation, then exploit the congruence property of the coefficients to simplify computation

    Unlike the traditional step-by-step decomposition of twiddle factors, cascading the twiddle factor decomposition leads to new forms of FFT with high spatial regularity
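    The general strategy can be sketched for N = N1·N2 with the index map n = N2·n1 + n2, k = k1 + N1·k2. This is a minimal illustration, assuming W_N = exp(-j2π/N); the function names and the particular index map are choices made here, not quoted from the lecture. Because W_N^N = 1, the cross term W_N^(N1·N2·n1·k2) drops out, which is the congruence property being exploited.

```python
import cmath


def twiddle(n_points, exponent):
    """W_N^e, with the exponent reduced modulo N since W_N^N = 1."""
    return cmath.exp(-2j * cmath.pi * (exponent % n_points) / n_points)


def dft(x):
    """Direct O(N^2) DFT, used only as a reference."""
    n = len(x)
    return [sum(x[m] * twiddle(n, k * m) for m in range(n)) for k in range(n)]


def dft_two_dimensional(x, n1_size, n2_size):
    """DFT of length N1*N2 computed as short DFTs over a 2-D index map."""
    n = n1_size * n2_size
    assert len(x) == n
    # Step 1: for each n2, a length-N1 DFT over n1, then the inter-stage
    # twiddle factor W_N^(n2*k1).
    inner = [[sum(x[n2_size * n1 + n2] * twiddle(n1_size, n1 * k1)
                  for n1 in range(n1_size)) * twiddle(n, n2 * k1)
              for n2 in range(n2_size)]
             for k1 in range(n1_size)]
    # Step 2: for each k1, a length-N2 DFT over n2.
    out = [0j] * n
    for k1 in range(n1_size):
        for k2 in range(n2_size):
            out[k1 + n1_size * k2] = sum(
                inner[k1][n2] * twiddle(n2_size, n2 * k2)
                for n2 in range(n2_size))
    return out


x = [complex(i % 3, i % 5) for i in range(16)]
error = max(abs(a - b) for a, b in zip(dft_two_dimensional(x, 4, 4), dft(x)))
print(error < 1e-9)
```

    Choosing N1 = 4 repeatedly gives a radix-4 structure and N1 = 2 a radix-2 one; the sketch does not attempt the cascaded twiddle decomposition mentioned above.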
