Student: Andrey Kuyel. Supervised by Mony Orbach. Spring 2011 Final Presentation. High speed digital systems laboratory. High-Throughput FFT. Technion - Israel Institute of Technology, Department of Electrical Engineering


Student: Andrey Kuyel
Supervised by Mony Orbach

Spring 2011
Final Presentation

High speed digital systems laboratory

High-Throughput FFT

Technion - Israel Institute of Technology
Department of Electrical Engineering


Presentation overview

•Project motivation and goals
•Theory study
•FFT 16/32 core definitions
•Encountered problems
•Selecting optimal algorithm
•FFT core design and development
•Validation and verification
•Xilinx development board demo


Project goals

•The project goal is to design and implement, on an FPGA device, an FFT capable of high-rate data processing (rates up to 10 MSamp/sec*).

•The design will be written in VHDL and tested on a Xilinx development board.

•The project combines aspects of signal processing, logic design, and high-rate data processing.

*5 MSamp/sec for each of the I and Q components.


FFT - Theoretical overview

The DFT of a length-N vector is defined as:

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \dots, N-1$$

The time complexity of the DFT is $O(N^2)$.

The FFT algorithm (first published by J. W. Cooley and John Tukey in 1965) reduces the time complexity of the DFT to $O(N \log N)$.

This algorithm is called the "Cooley–Tukey radix-2 FFT algorithm". It is one of the most common FFT algorithms.
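To make the complexity comparison concrete, here is a minimal Python sketch of the direct DFT next to a recursive radix-2 decimation-in-time FFT. It is an illustration only, not the project's VHDL design; the function names `dft` and `fft_radix2` are chosen for this example.

```python
import cmath

def dft(x):
    """Direct evaluation of the DFT definition: N outputs, N terms each."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT (length must be a power of two)."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2(x[0::2])   # DFT of the even-indexed samples
    odd = fft_radix2(x[1::2])    # DFT of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor W_N^k applied to the odd half.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

Both functions return the same spectrum up to rounding; the recursive version does about log2(N) passes of N/2 butterflies instead of N^2 multiply-accumulates.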


Radix 4 algorithm


Studying and examining different parallel FFT algorithms

Figures:
•The FFT (N=16) radix-2 data flow
•The FFT (N=8) radix-2 data flow
•Sixteen-point radix-4 decimation-in-time algorithm
•Length-16, decimation-in-frequency, in-order input, radix-4 FFT


FFT core features

FFT core will have the following features:
•Real and imaginary inputs: 8 bits wide each.
•Real and imaginary outputs: 20 bits wide each, with the 12 MSBs for the integer part and the 8 LSBs for the fractional part.
•Drop-in module for Virtex-6 (xc6vlx240T)
•Forward complex FFT
•Transform sizes N = 16/32
•Arithmetic type: fixed-point
•Truncation after the butterfly
•Natural input/output order
•Input data rate: 10 Ms/sec (total rate of the real and imaginary parts of the data)
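The 20-bit output format with 8 fractional bits can be illustrated with a small Python sketch. It assumes a signed two's-complement word (the slides do not say how the sign is handled) and models truncation as rounding toward minus infinity; `to_fixed` and `to_float` are hypothetical helper names.

```python
import math

TOTAL_BITS = 20   # output word width from the feature list
FRAC_BITS = 8     # fractional bits from the feature list

def to_fixed(value):
    """Quantize a real number to the 20-bit, 8-fractional-bit format by truncation.

    Assumption: signed two's complement, so truncation equals floor().
    """
    raw = math.floor(value * (1 << FRAC_BITS))
    lo = -(1 << (TOTAL_BITS - 1))
    hi = (1 << (TOTAL_BITS - 1)) - 1
    if not lo <= raw <= hi:
        raise OverflowError("value does not fit in the 20-bit word")
    return raw

def to_float(raw):
    """Convert the stored integer back to its real value."""
    return raw / (1 << FRAC_BITS)
```

Under these assumptions the resolution is 1/256 and truncation never rounds up, so the quantization error of a single value lies in [0, 1/256).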


FFT core general schematics

Block diagram: 16-point complex parallel FFT core

•Control inputs: Clock, Start, rst
•Real part data input [7:0], x16: x0_re ... x15_re
•Imaginary part data input [7:0], x16: y0_im ... y15_im
•FFT real data out (20q8), x16: fx0_re ... fx15_re
•FFT imag data out (20q8), x16: fy0_im ... fy15_im
•Other signals: Done, Edone


Selected FFT 16/32 core algorithm (Minimal DSP slices utilization)

Sixteen-point radix-4 decimation-in-time algorithm

Basic butterfly computation in a radix-4 FFT algorithm
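A floating-point Python sketch of the radix-4 butterfly and of a sixteen-point radix-4 decimation-in-time FFT built from it. This is only a behavioral illustration with natural-order input and output; it does not reproduce the pipelining or fixed-point arithmetic of the core, and the names `butterfly4` and `fft16_radix4` are chosen for this example.

```python
import cmath

def butterfly4(a, b, c, d):
    """Radix-4 butterfly: a 4-point DFT that needs no general multiplier.

    Multiplying by -j only swaps real and imaginary parts with a sign change.
    """
    s0, s1 = a + c, a - c
    s2, s3 = b + d, (b - d) * -1j
    return s0 + s2, s1 + s3, s0 - s2, s1 - s3

def fft16_radix4(x):
    """Sixteen-point radix-4 decimation-in-time FFT, natural-order in and out."""
    if len(x) != 16:
        raise ValueError("expected 16 samples")
    # Stage 1: four 4-point DFTs over the samples x[r], x[r+4], x[r+8], x[r+12].
    f = [butterfly4(*x[r::4]) for r in range(4)]
    out = [0j] * 16
    for k in range(4):
        # Twiddle factors W_16^(r*k); they are trivial (equal to 1) when r or k is 0.
        t = [f[r][k] * cmath.exp(-2j * cmath.pi * r * k / 16) for r in range(4)]
        # Stage 2: one butterfly per k produces X[k], X[k+4], X[k+8], X[k+12].
        y = butterfly4(*t)
        for m in range(4):
            out[k + 4 * m] = y[m]
    return out
```

In this decomposition only the nine twiddles with r and k both nonzero require a general complex multiplication, which is why radix-4 is attractive when multipliers are the scarce resource.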


XC6VLX240T FPGA utilization

FFT size     Maximal frequency           DSP slices utilization
16 points    383 MHz (12 GSamp/sec)      27
32 points    335 MHz (21 GSamp/sec)      102 = 27*2 + 16*3
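The throughput figures in the table can be reproduced with a short calculation. The counting rule below is an assumption, not something the slides state: a fully parallel core accepts all N complex points on every clock, and each complex point counts as two samples (I and Q), in line with the per-component footnote on the project goals slide.

```python
def throughput_gsamp_per_sec(f_clk_mhz, n_points):
    """Aggregate sample rate in GSamp/sec.

    Assumption: n_points complex inputs per clock, two samples (I and Q) each.
    """
    samples_per_clock = 2 * n_points
    return f_clk_mhz * 1e6 * samples_per_clock / 1e9

def dsp_slices_32_point():
    """DSP slice count for the 32-point core, using the table's own expression."""
    return 27 * 2 + 16 * 3

rate_16 = throughput_gsamp_per_sec(383, 16)  # compare with the 12 GSamp/sec entry
rate_32 = throughput_gsamp_per_sec(335, 32)  # compare with the 21 GSamp/sec entry
```

Under this rule the 16-point and 32-point rates come out slightly above 12 and 21 GSamp/sec, matching the table entries after rounding down.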


Debugging and verification

•RTL Matlab model of the FFT core, with signal values at each pipeline stage
•Xilinx simulator
•Xilinx development board verification using ChipScope
•Quantization error estimation against the Matlab double-precision FFT
•Maximal operating frequency validation
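The quantization error estimation step can be sketched as follows: run an FFT model that truncates after every butterfly and compare it with a double-precision reference. This sketch uses Python instead of Matlab and a radix-2 structure instead of the core's radix-4 one, and it assumes 8 fractional bits are kept after each butterfly; all names are hypothetical.

```python
import cmath
import math

FRAC_BITS = 8  # assumed number of fractional bits kept after each butterfly

def trunc(z):
    """Truncate real and imaginary parts toward minus infinity."""
    scale = 1 << FRAC_BITS
    return complex(math.floor(z.real * scale) / scale,
                   math.floor(z.imag * scale) / scale)

def fft_truncating(x):
    """Radix-2 FFT model with truncation after each butterfly output."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft_truncating(x[0::2])
    odd = fft_truncating(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = trunc(even[k] + t)
        out[k + n // 2] = trunc(even[k] - t)
    return out

def dft_reference(x):
    """Double-precision reference computed from the DFT definition."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def max_quantization_error(x):
    """Largest absolute difference between the truncating model and the reference."""
    ref = dft_reference(x)
    got = fft_truncating(x)
    return max(abs(a - b) for a, b in zip(ref, got))
```

Calling `max_quantization_error` on a set of 8-bit test vectors gives a per-vector worst-case error that can be plotted against the reference, in the same spirit as the comparison described in the list above.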


Xilinx development board design validation

Block diagram (labels recovered from the figure):

•Stimulus ROM
•Input data control logic and data path
•FFT 16 points
•FFT results memory
•Output data control logic and data path
•PLL frequency multiplier: input clock in, increased clock out to all modules
•ChipScope to PC
•Matlab results comparison


FFT 16/32 core design validation and error estimation

Results verification between the Matlab fft function and the 32-point FFT core running at 320 MHz on the Xilinx development board

Plots: imaginary part of the Matlab fft vs. the FFT core output; quantization error estimation


FFT 16/32 core Xilinx development board demo

Block diagram (labels recovered from the figure):

•4 different signals, bank A
•4 different signals, bank B
•FFT 32/16 core: real data and imag data in, transform real data and transform imag data out
•Wrap around
•Error estimation
•PLL: input clock in, operational FFT clock out