031211 Dean Seminar1

  • 8/2/2019 031211 Dean Seminar1

    1/19

    An Introduction to

    Reconfigurable Computing

    Mitch Sukalski and Craig Ulmer

    Dean R&D Seminar, 11 December 2003

  • 8/2/2019 031211 Dean Seminar1

    2/19

    Reconfigurable Computing

    is computation on a platform with reconfigurable (i.e., modifiable at run-time) hardware capable of implementing application-specific algorithms and functionality on demand.

  • 8/2/2019 031211 Dean Seminar1

    3/19

    Computing Spectrum

    Software: General-Purpose CPU

      Easily reprogrammed

      Low cost

      Fundamental bottlenecks

      [Figure: CPU pipeline diagram with Fetch, Decode, Registers, Execute (+, x, /, xor), Memory, and Writeback]

    [Figure: fixed circuit diagram with inputs A, B, C, D feeding x, +, xor, and z-1 blocks to produce a result]

    Hardware: Application-Specific Integrated Circuit (ASIC)

      Not modifiable

      High cost

      Extremely fast

    Soft-Hardware: Field Programmable Gate Arrays (FPGAs)

      Reconfigurable hardware

      Medium cost

      Speedup potential

  • 8/2/2019 031211 Dean Seminar1

    4/19

    History

    1945: Eckert, Mauchly, von Neumann: ENIAC

    1945: von Neumann architecture

    1960: Estrin: Fixed+Variable Structure Computer

    1970s: Simple PLDs

    1985: Xilinx introduces first FPGA

    1990s: Custom Computing Machines (CCMs)

    1999: FPGAs exceed one million logic gates

    2002: FPGAs include complex cores

    Figure captions:

      ENIAC: Connecting computational blocks for an algorithm

      Fixed+Variable CPU: Users can attach new computational circuits to a fixed ALU

      The Teramac CCM: Multi-Chip Module of FPGAs

      Xilinx Virtex FPGA

      Xilinx Virtex II Pro (image courtesy of rapidio.org)

  • 8/2/2019 031211 Dean Seminar1

    5/19

    Reconfigurable Computing in Modern HPC

    Stand-alone platforms

      OctigaBay 12K

      SRC-6

      Starbridge Hypercomputer

    Accelerator cards

      TimeLogic's DeCypher

      Nallatech's BenNUEY

      Annapolis Micro Systems' WILDSTAR II

  • 8/2/2019 031211 Dean Seminar1

    6/19

    Example: Computational Fluid Dynamics

    William Smith & Austars Schnore at GE Global Research

    From: "Towards an RCC-based Accelerator for Computational Fluid Dynamics," ERSA 2003

  • 8/2/2019 031211 Dean Seminar1

    7/19

    And now for some details

    Field Programmable Gate Arrays (FPGAs)

    Common RC design techniques

    Reported examples

  • 8/2/2019 031211 Dean Seminar1

    8/19

    Field-Programmable Gate Arrays (FPGAs)

    FPGAs emulate digital logic circuitry

    Large array of configurable logic blocks

    Internal routing through programmable interconnection network

    FPGAs hold hardware configuration in SRAM

    Change the digital circuitry by loading a new configuration

    Design approach:

    User designs in hardware description language

    Synthesis tools translate to logic gates

    Mapping tools target specific FPGA

  • 8/2/2019 031211 Dean Seminar1

    9/19

    [Figure: simplified logic block containing two LUTs and two 1-bit registers]

    Simplified Logic Block

    Emulates logic function

    Thousands per chip

    Lookup Table (LUT)

    Holds truth table

    Inputs produce outputs

    1-bit registers

    Hold data between cycles

    Note: Greatly simplified

  • 8/2/2019 031211 Dean Seminar1

    10/19

    LUT Example: 1-bit Adder

    Truth Table

    A  B  Cin  |  Cout  Sum
    0  0  0    |  0     0
    0  0  1    |  0     1
    0  1  0    |  0     1
    0  1  1    |  1     0
    1  0  0    |  0     1
    1  0  1    |  1     0
    1  1  0    |  1     0
    1  1  1    |  1     1

    [Figure: two LUTs, each with inputs A, B, C, 0 and a register on its output; one produces Cout, the other Sum]
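
The truth table above can be modeled in software as a small memory indexed by the input bits, which is roughly what a LUT does in hardware. The following is a minimal sketch, not a description of any particular FPGA; the helper names `make_lut`, `lut_read`, and `full_adder` are hypothetical and chosen only for illustration.

```python
# Illustrative model only: a LUT is treated as a list of output bits
# indexed by the concatenated input bits (first input is the MSB).

def make_lut(func, n_inputs):
    """Precompute the truth table of an n-input boolean function."""
    table = []
    for index in range(2 ** n_inputs):
        bits = [(index >> (n_inputs - 1 - i)) & 1 for i in range(n_inputs)]
        table.append(func(*bits) & 1)
    return table

def lut_read(table, *bits):
    """Look up the output for the given input bits."""
    index = 0
    for b in bits:
        index = (index << 1) | (b & 1)
    return table[index]

# "Configuring" the two LUTs: one holds the Sum column, one holds Cout.
SUM_LUT = make_lut(lambda a, b, cin: a ^ b ^ cin, 3)
COUT_LUT = make_lut(lambda a, b, cin: (a & b) | (a & cin) | (b & cin), 3)

def full_adder(a, b, cin):
    """Return (Cout, Sum) by reading both LUTs with the same inputs."""
    return lut_read(COUT_LUT, a, b, cin), lut_read(SUM_LUT, a, b, cin)
```

Loading different lists into `SUM_LUT` and `COUT_LUT` changes the function without changing the lookup mechanism, which is the sense in which the block "emulates" logic.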

  • 8/2/2019 031211 Dean Seminar1

    12/19

    Reconfiguration

    Modern FPGAs are SRAM-based

    Can be loaded with new circuitry

    Full reconfiguration

    Few megabytes of configuration

    Milliseconds

    Partial reconfiguration

    Reprogram only a portion of the chip

    Reduces configuration time

    Non-trivial, poorly supported

    [Figure: FPGA loaded with either a full configuration image or a partial configuration image]

  • 8/2/2019 031211 Dean Seminar1

    13/19

    Design Techniques

    Digital logic design techniques for

    exploiting FPGAs

  • 8/2/2019 031211 Dean Seminar1

    14/19

    FPGAs as Computational Accelerators

    Use FPGAs as soft-hardware

    Port algorithm to hardware

    Run inside FPGA

    Reuse hardware

    Techniques

    Concurrency, memory, partial evaluation

  • 8/2/2019 031211 Dean Seminar1

    15/19

    1. Concurrency

    Load FPGA with multiple computational circuits

    Hardware state machines are like threads, but...

      All tasks are always running

    Raw parallelism: units run in parallel

      Example: Key breaking

    Pipelining: chain units together in series

      Example: Streaming computations, data-flow
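
To make the pipelining idea concrete, here is a small software sketch of a clocked pipeline in which every stage fires on every cycle. It is only a model under simplifying assumptions (one register per stage, `None` used as an empty slot, at least one stage); the function name `run_pipeline` is hypothetical.

```python
def run_pipeline(stages, inputs):
    """Clock a linear pipeline of functions over a stream of inputs.

    Returns (outputs, cycles). None marks an empty pipeline slot, so
    this sketch assumes no stage legitimately produces None.
    """
    regs = [None] * len(stages)          # one output register per stage
    outputs, cycles = [], 0
    # After the last input, feed empty slots so the pipeline drains.
    feed = list(inputs) + [None] * (len(stages) - 1)
    for item in feed:
        # All stages read the previous cycle's registers at the same time.
        sources = [item] + regs[:-1]
        regs = [None if src is None else stage(src)
                for stage, src in zip(stages, sources)]
        cycles += 1
        if regs[-1] is not None:
            outputs.append(regs[-1])
    return outputs, cycles
```

With three stages and three inputs, the model finishes in 3 + 3 - 1 = 5 cycles rather than 9, because a new input enters the first stage while earlier inputs are still moving through the later ones.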

  • 8/2/2019 031211 Dean Seminar1

    16/19

    2. Custom Memory Interactions

    Most FPGA cards have multiple memory banks

    Fetch/store multiple data values at same time

    Predictable performance (as opposed to caches)

    Hide address generation

    [Figure: FPGA connected to five independent SRAM banks, Bank 0 through Bank 4]
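
One way to picture the benefit of multiple banks is to interleave an array across them and read one word from every bank per cycle. The sketch below is a software model under assumed conditions (low-order interleaving, one access per bank per cycle); the class `BankedMemory` and function `sum_all` are hypothetical names, not part of any FPGA card's API.

```python
class BankedMemory:
    """Toy model of several independent memory banks."""

    def __init__(self, data, n_banks):
        self.n_banks = n_banks
        # Word i is stored in bank i % n_banks at offset i // n_banks.
        self.banks = [list(data[b::n_banks]) for b in range(n_banks)]

    def fetch_cycle(self, row):
        """Model one cycle: each bank returns its word at offset `row`."""
        return [bank[row] for bank in self.banks if row < len(bank)]

def sum_all(mem):
    """Sum every stored word, counting one cycle per row of banks."""
    total, cycles = 0, 0
    rows = max(len(bank) for bank in mem.banks)
    for row in range(rows):          # a simple counter generates the addresses
        total += sum(mem.fetch_cycle(row))
        cycles += 1
    return total, cycles
```

In this model, 16 words spread over 4 banks are read in 4 cycles, versus 16 cycles with a single bank, and the cycle count is fixed by the layout rather than by cache behavior.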

  • 8/2/2019 031211 Dean Seminar1

    17/19

    3. Partial Evaluation

    Know data constants at design time

    Apply to circuits and reduce hardware

    Synthesis tools perform automatically

    Note: FPGAs are unique because we can easily generate new, optimized hardware configurations for each set of constants.

    Example: 4-bit Ripple-Carry Adder
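
The ripple-carry adder example can be sketched in software by specializing a general adder for a constant known at design time. This is an illustration of the idea only, not what a synthesis tool actually emits; `ripple_add` and `specialize_add` are hypothetical names, and the gate counts are a rough proxy that assumes 5 gates per full adder.

```python
def ripple_add(a, b, width=4):
    """General adder: one full adder (2 XOR, 2 AND, 1 OR) per bit."""
    carry, result = 0, 0
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        result |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (carry & (ai ^ bi))
    return result, carry

def specialize_add(k, width=4):
    """Partially evaluate ripple_add for constant k.

    Returns (adder, gates): a one-argument adder and an estimated gate count.
    """
    cells, gates = [], 0
    known_carry = 0                    # constant carry-in; None once data-dependent
    for i in range(width):
        ki = (k >> i) & 1
        if known_carry is not None:
            flip = ki ^ known_carry    # sum bit is a wire (0) or an inverter (1)
            gates += flip
            cells.append(("known", flip, known_carry, ki))
            # Carry-out stays constant if ki == carry, otherwise it equals a's bit.
            known_carry = ki if ki == known_carry else None
        else:
            gates += 2                 # XOR/XNOR for sum, AND/OR for carry
            cells.append(("var", ki))

    def adder(a):
        result, carry = 0, 0
        for i, cell in enumerate(cells):
            ai = (a >> i) & 1
            if cell[0] == "known":
                _, flip, c, ki = cell
                s = ai ^ flip
                carry = ki if ki == c else ai
            else:
                ki = cell[1]
                s = ai ^ carry ^ ki
                carry = (ai | carry) if ki else (ai & carry)
            result |= s << i
        return result, carry

    return adder, gates
```

Under these assumptions, the general 4-bit adder uses about 20 gates, while `specialize_add(1)` needs about 7 and `specialize_add(0)` reduces to plain wires.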

  • 8/2/2019 031211 Dean Seminar1

    18/19

    RC Performance Examples

    CFD: 23 GFLOPS sustained

      "Towards an RCC-based Accelerator for Computational Fluid Dynamics," Smith & Schnore, 2003

    Adaptive beamforming: 20 GFLOPS

      Parallel systolic array architecture

      "20 GFLOPS QR processor on a Xilinx Virtex-E FPGA," Walke et al., 2000

    Real-time holographic video display at 30 fps

      "Using field programmable gate arrays to scale up the speed of holographic video computation," Nwodoh

  • 8/2/2019 031211 Dean Seminar1

    19/19

    In Summary

    Reconfigurable computing uses FPGAs to emulate application-specific hardware

      Achieve performance gains with dedicated hardware

    It is possible to implement just about any kind of digital hardware in the FPGA

      Limited by capacity and effort

    Resurrect application-specific hardware architectures

      SIMD, MIMD, Systolic Processor Arrays, Data-Flow