Circuits Syst Signal Process
DOI 10.1007/s00034-011-9332-7
Design and Comparison of FFT VLSI Architectures for SoC Telecom Applications with Different Flexibility, Speed and Complexity Trade-Offs
Sergio Saponara · Massimo Rovini · Luca Fanucci · Athanasios Karachalios · George Lentaris · Dionysios Reisis
Received: 5 July 2010 / Revised: 16 June 2011 / © Springer Science+Business Media, LLC 2011
S. Saponara (✉) · M. Rovini · L. Fanucci
Department of Information Engineering, University of Pisa, Via G. Caruso 16, 56122 Pisa, Italy
e-mail: firstname.lastname@example.org
M. Rovini
e-mail: email@example.com
L. Fanucci
e-mail: firstname.lastname@example.org
A. Karachalios · G. Lentaris · D. Reisis
Department of Physics, University of Athens, Panepistimiopolis, Zografou, 15784 Athens, Greece
A. Karachalios
e-mail: email@example.com
G. Lentaris
e-mail: firstname.lastname@example.org
D. Reisis
e-mail: email@example.com
Abstract The design of Fast Fourier Transform (FFT) integrated architectures for System-on-Chip (SoC) telecom applications is addressed in this paper. After reviewing the FFT processing requirements of wireless and wired Orthogonal Frequency Division Multiplexing (OFDM) standards, including the emerging Multiple Input Multiple Output (MIMO) and OFDM Access (OFDMA) schemes, three FFT architectures are proposed: a fully parallel, a pipelined cascade and an in-place variable-size architecture, which offer different trade-offs among flexibility, processing speed and complexity. Silicon implementation results and comparisons with the state of the art show that each macrocell outperforms known works for its target application. The fully parallel macrocell is optimized for throughput requirements up to several GSamples/s, enabling Ultra-wideband (UWB) communications using all channels foreseen in the standard. The pipelined cascade macrocell minimizes complexity for large-size FFTs sustaining throughput up to 100 MSamples/s. The in-place variable-size FFT macrocell stands out for its flexibility, allowing the run-time reconfigurability required in OFDMA schemes while attaining the throughput required to support MIMO communications. The three architectures are also compared using common case studies and a common target technology.
Keywords VLSI design · Fast Fourier Transform · System-on-Chip · OFDM telecom systems
1 Introduction
In evolving telecommunication applications, dedicated FFT/IFFT architectures are required for baseband processing. A plethora of such applications (see [1, 6, 11, 12, 17, 18, 20, 22, 38, 41]) calls for configurable FFT architectures capable of achieving high throughput while keeping gate complexity and power consumption relatively low. To accommodate these applications, this paper proposes different VLSI (Very Large Scale Integration) FFT/IFFT architectures targeting different trade-offs among the above performance metrics. In particular, the design aspects allowing for optimized FPGA (Field Programmable Gate Array) implementations are considered. FPGAs provide an attractive implementation platform for telecom applications because they can be reconfigured at compilation and/or run time and hence support different wireless standards. Moreover, today's FPGA designs extend their application range from prototyping platforms to user products and from fixed to mobile terminals: indeed, FPGA families are available at a cost of a few dollars for large-volume markets, while embedded FPGAs can be integrated as reconfigurable logic in Systems-on-Chip (SoCs).
The specifications of advanced OFDM-based standards for telecom systems lead to a wide configuration space that the FFT engine must cover. The throughput may vary from a few MSamples/s in xDSL (Digital Subscriber Line) modems for residential Internet connections (see [12, 41]) up to GSamples/s in UWB terminals for short-range communication of multimedia content. The I/O data width may vary from 4 or 5 bits in UWB up to 16 bits in VDSL (Very high-speed DSL) or BPL (Broadband on Power Lines) applications. Similarly, the FFT size (i.e. the FFT length) varies from 64 complex points in Wireless Local Area Network (WLAN) (see [20, 38]) or ADSL to 8192 in DVB (Digital Video Broadcasting). Moreover, the FFT engine should be conceived as a parametric IP (Intellectual Property) macrocell and, once integrated, should still be configurable at run time to support standards with multi-mode adaptive behavior. Examples of such standards are the Worldwide Interoperability for Microwave Access (WiMAX) and the 3rd Generation Partnership Project Long Term Evolution (3GPP LTE), with FFT lengths ranging from 128 to 2048.
To achieve a unified view of the variety of possible design solutions meeting the above requirements, this paper proposes different architectural approaches with respect to the degree of parallelism, memory access strategy and machine arithmetic style. Furthermore, it shows their implementations and analyzes their advantages and disadvantages in terms of performance, complexity and flexibility, considering FPGA devices as the target implementation technology. Exploiting different FFT processing schemes, each of the proposed architectures introduces design features allowing efficient support of a specific group of the aforementioned standards.
The paper is organized as follows. Section 2 reviews the OFDM communication standards and the requirements for the FFT processing core. Section 3 proposes a massively parallel FFT architecture suitable for high-throughput applications (up to GSamples/s) such as UWB. Section 4 presents a configurable cascade FFT core which ensures an optimal trade-off between complexity and performance for applications requiring large FFT sizes (1024 complex points), such as DVB, and large data widths, but with throughput requirements below one hundred MSamples/s. Section 5 describes an in-place variable-length FFT core with parallel butterfly processors, optimizing run-time reconfigurability while still supporting high-throughput applications. Such an architecture is suitable for emerging WiMAX terminals needing run-time FFT length configuration and a computational throughput of up to hundreds of MSamples/s to support Multi-Input Multi-Output (MIMO) communications. Implementation results of the above architectures on the same target technology and comparisons between them are presented in Sect. 6. Results are also compared with the state of the art of FFT VLSI designs for OFDM telecom applications. Conclusions are drawn in Sect. 7.
2 Overview on OFDM-Based Communication Standards
2.1 OFDM and MIMO-OFDM Architectures
The multi-carrier OFDM scheme has fostered the rise of several wireless and wired communication standards, including: Digital Broadcasting of Audio and Video content in Terrestrial and Handheld scenarios (DAB, DVB-T/H) (see [6, 22]); 802.16d/e Wireless Metropolitan Area Network (WMAN), known respectively as fixed and mobile WiMAX, for fast wireless Internet access in metropolitan scenarios; xDSL [7, 12, 31, 41] and BPL [1, 3, 11] modems for fast Internet access through wired channels, the telephone line and the power line respectively; 802.11a/n WLAN for medium-range indoor networking [20, 26, 38]; and UWB radio [18, 32, 36] for high-data-rate personal area network connectivity. The connectivity range spans short range using UWB radio, mid range based on WLAN, BPL and VDSL, and wide range through DVB-T/H, DAB, WMAN and xDSL standards.
Compared with single-carrier modulation, OFDM-based systems offer enhanced robustness against cross-talk, fading channels and multi-path distortion. In OFDM systems, channel equalization is simplified because the transmitted data are spread across orthogonal sub-carriers; hence OFDM can be viewed as the combination of many narrow-band signals rather than a single rapidly modulated wideband signal. In 802.16e, OFDM is also deployed as a multi-user access technology (OFDMA), where carriers are clustered in subsets dynamically assigned to each user. The channel capacity is therefore shared among multiple users.
All the aforementioned standards exploit a similar baseband processing scheme whose core is an FFT processor, in charge of multi-carrier symbol demodulation at the receiver (rx), plus an IFFT processor in charge of symbol modulation at the transmitter (tx). FFT and IFFT account for roughly half of the total circuit complexity of the baseband processing in OFDM systems (see [6, 29]). During modulation (IFFT), the symbol is cyclically extended to insert a guard interval that absorbs time spreading and eliminates inter-symbol interference. The cyclic prefix is removed at the receiver side (FFT).
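The modulation and demodulation chain described above can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the 8-point symbol size, the 2-sample cyclic prefix and the two-tap channel are all hypothetical choices, and a direct O(N^2) DFT stands in for the FFT/IFFT processor to keep the example short. The receiver is assumed to know the channel taps exactly.

```python
import cmath

def dft(x, inverse=False):
    """Direct O(N^2) DFT/IDFT, used here as a stand-in for the FFT/IFFT core."""
    n = len(x)
    sign = 1 if inverse else -1
    scale = 1 / n if inverse else 1
    return [scale * sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                        for m in range(n))
            for k in range(n)]

def ofdm_modulate(symbols, cp_len):
    """Tx side: IFFT, then prepend the last cp_len time samples (cyclic prefix)."""
    time = dft(symbols, inverse=True)
    return time[len(time) - cp_len:] + time

def multipath_channel(samples, taps):
    """Hypothetical channel: convolution with the taps, truncated to input length."""
    return [sum(taps[j] * samples[i - j] for j in range(len(taps)) if i - j >= 0)
            for i in range(len(samples))]

def ofdm_demodulate(samples, n, cp_len, taps):
    """Rx side: drop the cyclic prefix, FFT, one complex division per sub-carrier."""
    freq = dft(samples[cp_len:cp_len + n])
    h = dft(list(taps) + [0.0] * (n - len(taps)))  # channel frequency response
    return [y / hk for y, hk in zip(freq, h)]

N, CP = 8, 2                       # assumed symbol size and guard length
tx_symbols = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1-1j, 1-1j, -1+1j]
taps = [1.0, 0.5]                  # assumed two-path channel, shorter than CP
rx_samples = multipath_channel(ofdm_modulate(tx_symbols, CP), taps)
rx_symbols = ofdm_demodulate(rx_samples, N, CP, taps)
max_err = max(abs(a - b) for a, b in zip(tx_symbols, rx_symbols))
print("largest symbol error:", max_err)
```

Because the channel delay spread is shorter than the cyclic prefix, the channel acts as a circular convolution on the useful part of the symbol, so a single division per sub-carrier is enough to recover the transmitted data.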
Note that the FFT and IFFT operations can be merged in a single FFT/IFFT processor when the communication is based on a time division duplexing (TDD) scheme, since the transceiver works either in rx mode (demodulation by FFT) or in tx mode (modulation by IFFT). In full-duplex transceivers adopting frequency division duplexing (FDD), with concurrent tx and rx, FFT and IFFT have to be implemented through different dedicated processors.
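One well-known way to share a single transform core between the two directions is to swap the real and imaginary parts of the data before and after a forward FFT and scale by 1/N. The sketch below illustrates that identity only; it is an assumption for illustration, not necessarily the mechanism used in the architectures of this paper, and the names `transform` and `swap_re_im` are hypothetical.

```python
import cmath

def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor times odd term
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def swap_re_im(v):
    """Exchange real and imaginary parts (pure wiring in hardware)."""
    return complex(v.imag, v.real)

def transform(x, inverse=False):
    """Single core for both modes: rx uses the FFT directly, tx reuses it as an IFFT."""
    if not inverse:
        return fft(x)
    n = len(x)
    y = fft([swap_re_im(complex(v)) for v in x])
    return [swap_re_im(v) / n for v in y]

data = [1+2j, -1+0.5j, 0.25-1j, 2+0j, -0.5-0.5j, 1j, 3-2j, -1-1j]
round_trip = transform(transform(data), inverse=True)
round_trip_err = max(abs(a - b) for a, b in zip(data, round_trip))
print("round-trip error:", round_trip_err)
```

In a TDD transceiver the `inverse` flag would simply follow the tx/rx mode, whereas an FDD transceiver needs both directions at the same time and so cannot time-share one core in this way.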
OFDM can be used in conjunction with MIMO techniques to increase the system capacity and/or the diversity gain (see [21, 25, 29]). The MIMO scheme, adopted in emerging standards such as 802.16 WMAN and 802.11n WLAN, uses multiple antennas at both the receiver and the transmitter side to exploit spatial diversity and/or spatial multiplexing.
Spatial multiplexing increases the cap