[IEEE 2011 18th International Conference on Telecommunications (ICT) - Ayia Napa, Cyprus (2011.05.8-2011.05.11)] 2011 18th International Conference on Telecommunications - Flexible

Flexible Hardware Architecture of SEFDM Transmitters with Real-Time Non-Orthogonal Adjustment

Marcus R Perrett, Izzat Darwazeh

Department of Electrical and Electronic Engineering, University College London
Email: [email protected]

Abstract: Field Programmable Gate Arrays (FPGA) offer a unique combination of software abstraction and hardware performance, enabled by hardware description languages such as VHDL or Verilog. Inherent in this capability is a multitude of different design possibilities for a single implementation problem. A system can be designed which allows real-world evaluation, at real-time speeds, of algorithms ordinarily restricted to simulation environments. Presented here is an FPGA implementation of a method of generating non-orthogonal Frequency Division Multiplexed signals, where the spacing between sub-carriers can be controlled externally without the need for re-synthesis. The internal data-paths and associated algorithms are constructed so as to react to changes in the inputs which dictate this spacing, and as such the design represents a dynamic transmission platform for research purposes.

I. INTRODUCTION

This paper details the implementation, in a Field Programmable Gate Array (FPGA) using VHDL, of a Spectrally Efficient Frequency Division Multiplexed (SEFDM) signal generation method [1] [2]. The purpose of this work is to implement an SEFDM generation platform on an FPGA, and to evaluate the generated signal against analytical and simulation models. It is of interest to be able to modify the sub-carrier separation, to achieve functionality similar to that of current Matlab simulations. SEFDM signal sources are required in order to evaluate FPGA implementations of non-orthogonal detection techniques which are specific to SEFDM [3] [4]. It is also desirable to evaluate other FPGA implementations of non-orthogonal signal detection methods which may, in the future, be tailored for SEFDM detection [5] [6]. The implementation is purposely targeted at a general case where, within available resource limits, the values from which the non-orthogonality is derived are not bound. This requires methods that derive the SEFDM signal at an algorithmic level and re-configure the signal generation when appropriate input conditions are presented, rather than the use of pre-determined states.

Section II provides a review of the SEFDM signal fundamentals and how its generation is determined. Section III provides a short review of similar methods which have previously been implemented on FPGAs, such as Faster-Than-Nyquist (FTN) [7], and Discrete Fractional Fourier Transforms (DFrFT) [8]. Section IV describes the SEFDM system implementation on an FPGA, including a comparison of two implementations of the modulo arithmetic specific to SEFDM, and an alternative post-processing signal generation method. Section V looks at the effect of internal architecture changes on the implementation. In addition, preliminary Matlab plots of the FPGA code, evaluated using Modelsim, are presented.

II. SEFDM BACKGROUND

Fig. 1: SEFDM based transmitter

SEFDM signals may be defined as a general class of multi-carrier signals, where the sub-carrier frequency separation results in a non-orthogonal set of modulated carriers. Such signals have recently been of interest as a means to create communication systems offering bandwidth saving [9]. Mathematically, these signals may be generated by modifying typical Orthogonal Frequency Division Multiplexed (OFDM) signal generation methods, which use the Inverse Discrete Fourier Transform (IDFT), in structures similar to the one shown in Fig. 1. The output signal samples, X[k], are functions of two parameters: the number of sub-carriers, N, and the relative frequency separation of the sub-carriers, α ≤ 1, which also defines the bandwidth saving relative to OFDM. For implementation purposes, α is taken as the ratio of two integers b and c. As such, X[k] may be represented as

$$X[k] = \frac{1}{\sqrt{N}} \sum_{i=0}^{c-1} e^{j 2\pi i b / (cN)} \sum_{l=0}^{N-1} S_{i+lc}\, e^{j 2\pi l k / N} \qquad (1)$$

978-1-4577-0024-8/11/$26.00 ©2011 IEEE


Fig. 2: Block diagram of SEFDM transmitter operation

where S_0, ..., S_{N−1} are the input symbols. Fig. 1 shows a diagrammatic representation of (1). The resulting system consists of c IDFT blocks, each of length N. The N source data symbols are repositioned and padded with zeros, as a function of a modulo operation in base b, to form a new matrix of size cN. The implementation of an IDFT in an FPGA is resource intensive, and it is standard practice to use Inverse Fast Fourier Transforms (IFFT) where possible. N is therefore restricted to powers of two, N = 2^L, where L is an integer greater than one.

III. NON-ORTHOGONAL IMPLEMENTATION ALTERNATIVES

Methods for the creation of non-orthogonal sub-carriers which have been implemented in FPGAs already exist. The work of Dasalukunte et al. [10] details such an architecture based on an FTN mapper, which has subsequently been implemented in an FPGA [11]. A fast combinatorial implementation, or a slower and more efficient Random Access Memory (RAM), holds pre-determined FTN coefficients. While the frequency spacing remains that of OFDM signals, the time period of each symbol is reduced and hence the sub-carrier spacings in the frequency domain are compressed. The FTN mapped dataset is then passed to an Isotropic Orthogonal Transform Algorithm (IOTA) block, which is based on an IFFT with post filtering. This work has subsequently been implemented as an end-to-end system [12]. Another architecture is an implementation by Prasad et al. [8], which details a 4-bit DFrFT that uses a COordinate Rotation DIgital Computer (CORDIC) to generate Eigenvalues, simplifying the DFrFT implementation considerably over previous designs. A DFrFT could be used in place of standard OFDM IFFT blocks to enable non-orthogonal sub-carrier generation. Estimations are presented for a 1024-point DFrFT which would be realisable in current FPGA devices. However, to the authors' knowledge, this work has not yet been implemented in an FPGA.

IV. SEFDM IMPLEMENTATION

Recalling from (1), the spacing of the sub-carriers is dependent on two values, b and c, where α is defined as b/c. OFDM is defined as the case when b = c and hence α = 1. Any value where b < c results in α < 1 and a subsequent reduction in the sub-carrier spacings. The stages required to create an SEFDM signal based on (1) are:

• The data symbols, S_n, are reordered by a modulo arithmetic function. The data symbols are placed in positions that are integer multiples of b, with zeros placed in all other positions, to form a zero-padded matrix of size cN.

• A Column-Major-Reorder process is applied to the cN matrix to form a matrix of size c × N, with each N-length row assigned to an IFFT block, where there are c blocks in total.

• Post-IFFT phase rotation is applied by performing multiplication with shifting values derived from b and c.

This process is shown diagrammatically in Fig. 2.
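The three stages above can be sketched in software as follows. This is a minimal illustration rather than the VHDL design: the function names are hypothetical, a plain unnormalised IDFT stands in for the IFFT cores, and the rotation applied to row i is assumed to be exp(j2πik/(cN)), which is the factor that makes the c interleaved N-point transforms sum to the cN-point IDFT of the zero-padded vector. The direct sum used as a reference, with carriers spaced at α = b/c of the OFDM spacing, is likewise an assumption of the sketch and not a transcription of (1).

```python
import cmath


def idft(x):
    """Unnormalised inverse DFT: y[k] = sum_l x[l] * exp(j*2*pi*l*k/len(x))."""
    n = len(x)
    return [
        sum(x[l] * cmath.exp(2j * cmath.pi * l * k / n) for l in range(n))
        for k in range(n)
    ]


def sefdm_ifft_bank(symbols, b, c):
    """Generate N SEFDM samples from N symbols using c N-point IDFTs."""
    n = len(symbols)
    if not 1 <= b <= c:
        raise ValueError("this sketch assumes 1 <= b <= c")

    # Stage 1: place symbol m at position m*b of a zero vector of length c*N.
    padded = [0j] * (c * n)
    for m, s in enumerate(symbols):
        padded[m * b] = s

    # Stage 2: column-major reorder into c rows of length N
    # (row i holds elements i, i+c, i+2c, ...).
    rows = [padded[i::c] for i in range(c)]

    # Stage 3: one IDFT per row, then rotate row i and sum the rows.
    out = [0j] * n
    for i, row in enumerate(rows):
        transformed = idft(row)
        for k in range(n):
            # Assumed rotation term for row i, sample k.
            rotation = cmath.exp(2j * cmath.pi * i * k / (c * n))
            out[k] += rotation * transformed[k]
    return [v / n ** 0.5 for v in out]


def sefdm_direct(symbols, b, c):
    """Reference: direct sum of N carriers spaced at b/c of the OFDM spacing."""
    n = len(symbols)
    return [
        sum(s * cmath.exp(2j * cmath.pi * m * k * b / (c * n))
            for m, s in enumerate(symbols)) / n ** 0.5
        for k in range(n)
    ]


# Arbitrary test symbols; N = 16, b = 2, c = 3 as used for Table I.
symbols = [1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1]
bank = sefdm_ifft_bank(symbols, b=2, c=3)
direct = sefdm_direct(symbols, b=2, c=3)
max_error = max(abs(p - q) for p, q in zip(bank, direct))
```

Under these assumptions the two routes agree to within floating-point rounding, and setting b = c reduces the bank to a single non-zero row, i.e. ordinary OFDM.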

A. Modulo Arithmetic Implementations

Fig. 3: Logic diagram of Modulo divider circuit

In order to perform the required re-ordering of the source symbols, the SEFDM system requires only a test of whether the value of a naturally ordered counting integer is a multiple of the modulo value b. A naive implementation of this is achieved by the division of the counting integer by b. The quotient of the divider is used as the index position of the source symbol. The index is used only if the division remainder is equal to zero; otherwise a zero value is applied. This process is shown logically in Fig. 3. An equivalent operation is to copy source symbols, indexed by a natural counting integer, to positions indexed by a second integer which increases in steps of b. All other positions of the destination matrix are set to zero. For example, if data symbols, S_n, are subjected to reordering based on b = 2, the resulting matrix, identified as φ, will be populated as S_0 → φ_0, S_1 → φ_2, and so on. This is shown logically in Fig. 4.
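The equivalence of the two reordering methods can be illustrated with a short software sketch. The function names are hypothetical, and the bound check on the quotient in the division variant is an assumption added so that positions beyond the last source symbol are filled with zeros.

```python
def reorder_by_division(symbols, b, c):
    """Divider method: test every destination position with a division by b."""
    n = len(symbols)
    out = []
    for position in range(c * n):  # natural-order counter
        quotient, remainder = divmod(position, b)
        # Use the quotient as the source index only when the remainder is zero.
        # The quotient < n check is an assumption of this sketch.
        if remainder == 0 and quotient < n:
            out.append(symbols[quotient])
        else:
            out.append(0)
    return out


def reorder_by_index(symbols, b, c):
    """Index method: copy each symbol to a destination index that steps by b."""
    n = len(symbols)
    if not 1 <= b <= c:
        raise ValueError("this sketch assumes 1 <= b <= c")
    out = [0] * (c * n)
    destination = 0  # accumulator that advances by b per source symbol
    for s in symbols:  # natural-order source counter
        out[destination] = s
        destination += b
    return out


symbols = list(range(1, 17))  # stand-ins for S_0 ... S_15
phi_div = reorder_by_division(symbols, b=2, c=3)
phi_idx = reorder_by_index(symbols, b=2, c=3)
```

Both functions return the same length-cN vector; the index variant simply never performs a division, which is the property exploited in hardware.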

B. Division versus Index implementation comparison

Table I shows the corresponding complexity and latency of the two methods described above. It should be noted that LUT refers to a Xilinx Look-Up-Table and not a generic table of values. For the results contained in the table, N = 16, b = 2 and c = 3, giving α ≈ 0.67. The use of fixed-point datapaths and quantised coefficients leads to a finite signal-to-quantisation-noise ratio (SQNR) at the output. Therefore, datapath precision must be dimensioned to achieve sufficient SQNR. A data width of 16 bits is used, which satisfies an SQNR target of over 60 dB. The target device is a Xilinx Virtex-series XC5VSX50T.

TABLE I: Divisor versus index method for modulo matrix augmentation

| Metric | Divisor | Index | % difference |
| --- | --- | --- | --- |
| State Machine | 3 | 2 | −33% |
| RAM Blocks | 2 | 2 | 0% |
| Multiplier | 1 | 1 | 0% |
| Subtracters | 1 | 1 | 0% |
| Adders | 2 | 1 | −50% |
| Counters | 1 | 3 | 300% |
| Accumulators | 0 | 1 | 100% |
| Comparators (≥) | 2 | 2 | 0% |
| Comparators (>) | 8 | 8 | 0% |
| Comparators (=) | 0 | 1 | 100% |
| Comparators (<) | 3 | 3 | 0% |
| Comparators (≠) | 0 | 1 | 100% |
| Multiplexors | 1 | 1 | 0% |
| Total Flip-Flops | 9697 | 9111 | −6% |
| Total LUTs | 19229 | 17224 | −10% |
| Occupied Slices | 6179 | 5811 | −6% |
| Block RAM | 18 | 16 | −11% |
| DSP48E blocks | 60 | 51 | −15% |
| Max. estimated clock period | 17.3 ns | 12.8 ns | −26% |
| Latency | N × c × data width | N × c | −93.7% |

Fig. 4: Logic diagram of improved, index based Modulo circuit

It can be seen that a modest reduction in resources results in a dramatically decreased clock period. This is due to the removal of the divider circuit's load on the clock. The reduction in latency is due to the methodology itself: division in FPGAs is implemented as long division, which requires w clock cycles per division, where w is defined as the number of bits used to represent data symbols. The index method has no such divider, and therefore offers the advantage of a reduction in clock cycles of over 90% (93.7% for the 16-bit data width of Table I).
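As a worked check of the latency row of Table I, assume the divisor method spends w clock cycles on each of the cN destination positions while the index method spends one. The symbols L_div and L_idx are introduced here for illustration only. For N = 16, c = 3 and w = 16:

$$L_{\mathrm{div}} = N c w = 16 \times 3 \times 16 = 768, \qquad L_{\mathrm{idx}} = N c = 16 \times 3 = 48$$

$$\frac{L_{\mathrm{idx}} - L_{\mathrm{div}}}{L_{\mathrm{div}}} = \frac{1}{w} - 1 = -93.75\%$$

This agrees with the latency figure listed in Table I. Under this assumption the relative saving depends only on the data width w, not on N or c.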

C. Phase shift of IFFT output vectors

It can be seen from (1) that the post-IFFT symbols are subjected to a phase shift based on values derived from b and c. The required values are complex exponential terms which, in hardware, depend on Sine and Cosine values. Generating them directly is costly in terms of resources, due to the need for Sine and Cosine coefficient generation methods such as CORDIC. An improved SEFDM architecture is realised by using an IFFT to generate the coefficient values whenever N, b or c is changed. Hence, the need for Sine and Cosine coefficient generation methods is removed, and generation is now via an IFFT block with a circularly shifting input vector of length cN, for c iterations. This method introduces a latency of cN × c clock cycles whenever b, c or N is changed. Despite potential drawbacks, this methodology is valid for the purpose of creating a dynamic system for the evaluation of SEFDM signals and systems. A final implementation would remove the IFFT generation block and replace it with pre-determined values, at the expense of flexibility of α. Fig. 5 shows the implementation logic diagram. The shaded area would be removed where pre-determined values are used.
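One way to read the coefficient generation described above is sketched below. It assumes that the circularly shifting input vector is a unit impulse, that the transform is an unnormalised cN-point IDFT, and that only the first N output samples of each iteration are kept; none of these details is stated in the text, and the function names are hypothetical.

```python
import cmath


def idft(x):
    """Unnormalised inverse DFT of a sequence of length len(x)."""
    m = len(x)
    return [
        sum(x[p] * cmath.exp(2j * cmath.pi * p * k / m) for p in range(m))
        for k in range(m)
    ]


def rotation_table(n, c):
    """Build c rows of N rotation values from c transforms of a shifted impulse."""
    m = c * n
    impulse = [1] + [0] * (m - 1)  # assumed input: unit impulse at position 0
    table = []
    for _ in range(c):
        # An impulse at position i transforms to exp(j*2*pi*i*k/(c*N)).
        table.append(idft(impulse)[:n])
        # Circular shift by one position for the next iteration.
        impulse = impulse[-1:] + impulse[:-1]
    return table


def update_latency_cycles(n, c):
    """Latency quoted in the text: c iterations over a vector of length c*N."""
    return c * n * c


table = rotation_table(16, 3)
cycles = update_latency_cycles(16, 3)  # 144 for N = 16, c = 3
```

No Sine or Cosine generator appears anywhere in the sketch: every coefficient is produced by the transform itself, at the cost of the update latency.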

V. THE EFFECT OF α ON FPGA RESOURCES

For any FPGA design, the size of registers, memory or DSP blocks must be stated at design time. The resulting logic may lie dormant until required, and may constitute a considerable portion of the total available resources. There is a trade-off between resource utilisation and design functionality. An increase in the maximum value of a design parameter can offer functional advantages, but at the expense of resource utilisation. For the SEFDM system, an increase in c leads to a corresponding increase in the number of N-length IFFTs, complex multipliers, and adder chains. From Fig. 6, the maximum possible size of c (here defined as cMax), for N = 16, is 12. Values over c = 12 require more than 100% of the device resources. It can also be concluded that if the value of cMax is doubled, the resource required increases by a factor of approximately 2.5. The value of 8 is chosen for cMax to illustrate the design, with a corresponding device utilisation of around 50%. For designs with a greater resource utilisation, the time taken to synthesise increases dramatically, hindering development. A high resource utilisation also has the effect of drastically reducing the maximum clock rates achievable, due to increased clock loading and cross-device routing. The design used to gather the data in Fig. 6 reflects the lowest possible latency implementation, via the use of parallel IFFT blocks. Conversely, this design also has the highest resource utilisation. A single IFFT block could be used in place of the parallel IFFT blocks, whereby resource is traded for system latency. The IFFT architecture with the lowest latency is a pipelined type, which requires N clock cycles per transform. As a consequence, a sequential IFFT method would reduce the number of IFFT blocks by c − 1, with an increase of N × (c − 1) clock cycles in system latency. The saving is approximate, as an additional cN : 1 multiplexer would be required.

Fig. 5: Logic diagram of the multiplication of shift values generated by an IFFT to produce an SEFDM output signal. Shaded area denotes logic replaced with RAM tables if pre-determined values of sub-carrier spacing are employed

Fig. 6: Chart showing the resource utilisation increase with respect to cMax

Fig. 7 shows preliminary results for three system configurations, with values α = 1 (OFDM), α = 0.5 (Fast-OFDM) and α = 2/3 (SEFDM). A Linear Feedback Shift Register (LFSR) is used to generate a Pseudo-Random Binary Sequence (PRBS), which is then applied to the system as serial data input bits. With N = 16 and a zero padding of 4 (the 4 adjacent sub-carriers at either end are set to zero), 8 data bits from the LFSR are required per SEFDM symbol. A total of 1024 bits is used for each α plot. These results are obtained by mathematical digital-to-analogue conversion of the signals at the output of the FPGA (which is the output point of Fig. 5). The spectral plots clearly show the design technique generating SEFDM signals. Further studies are required to test the flexibility of the proposed designs and their practical implementation.
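The test stimulus described above can be sketched as follows. The LFSR polynomial is not given in the text, so the common PRBS7 polynomial x^7 + x^6 + 1 is assumed here. Mapping each bit to a ±1 symbol is a further assumption made only to give the frames concrete values, and the function names are hypothetical.

```python
from itertools import islice


def prbs7(seed=0x7F):
    """Yield bits from a 7-bit LFSR with feedback polynomial x^7 + x^6 + 1."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("an all-zero LFSR state never changes")
    while True:
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        yield new_bit


def build_frames(bits, n=16, guard=4):
    """Group bits into N-entry frames with `guard` zeroed carriers at each end."""
    per_frame = n - 2 * guard  # 8 data bits per symbol for N = 16, guard = 4
    frames = []
    for start in range(0, len(bits) - per_frame + 1, per_frame):
        # Assumed mapping: bit 0 -> +1, bit 1 -> -1.
        data = [1 - 2 * bit for bit in bits[start:start + per_frame]]
        frames.append([0] * guard + data + [0] * guard)
    return frames


bits = list(islice(prbs7(), 1024))  # 1024 bits per plot, as in the text
frames = build_frames(bits)  # 1024 / 8 = 128 frames of length 16
```

Each frame is one set of N input symbols for the transmitter, so 1024 bits correspond to 128 SEFDM symbols per value of α.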

VI. CONCLUSION

This paper demonstrates, for the first time, a flexible implementation of an FPGA-based SEFDM transmitter, where the values of two integers, b and c, control the spacing between the output sub-carriers. The values are limited only by the resources available in the target FPGA device. The implications of different modulo arithmetic functions in VHDL are discussed, and an optimum solution for the SEFDM purpose is presented. Also detailed is an alternative to previous SEFDM work, whereby an additional IFFT block is used to generate the required values. Results are provided of an end-to-end SEFDM system for different values of α. The VHDL code that describes the system is ready to be implemented in hardware and subjected to real-world testing. The design can easily be scaled up in terms of the maximum values of c or N as new FPGA devices become available. Future work will be based on hardware detection implementations, used in conjunction with signals generated using the SEFDM method, to create an SEFDM transceiver hardware environment.

Fig. 7: Resulting frequency spectrum plots for SEFDM, against OFDM and Fast-OFDM values of α. [Plot: magnitude (AU) versus frequency (Hz, axis scaled by 10^7); traces: α = 1, b = 3, c = 3 (OFDM); α = 0.67, b = 2, c = 3 (SEFDM); α = 0.5, b = 3, c = 6 (FOFDM).]

VII. ACKNOWLEDGEMENTS

The authors would like to thank the UK Engineering and Physical Sciences Research Council (EPSRC) for funding Marcus Perrett through the Communications EngD program at UCL. We are grateful to UCL PhD student Safa Isam for her valuable input and comments on this work.

REFERENCES

[1] S. Isam and I. Darwazeh, “Simple DSP-IDFT Techniques for Generating Spectrally Efficient FDM System Signals,” 7th International Symposium on Communication Systems Networks and Digital Signal Processing (CSNDSP), 2010.

[2] I. Kanaras, A. Chorti, M. Rodrigues, and I. Darwazeh, “Investigation of a Semidefinite Programming detection for a spectrally efficient FDM system,” 20th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, pp. 2827–2832, Sep. 2009. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5449891

[3] M. Rodrigues and I. Darwazeh, “A spectrally efficient frequency division multiplexing based communications system,” in Proceedings of 2nd International Symposium on Broadband Communications (ISBC), pp. 48–49, 2006.

[4] I. Kanaras, A. Chorti, M. Rodrigues, and I. Darwazeh, “A new quasi-optimal detection algorithm for a non orthogonal Spectrally Efficient FDM,” 9th International Symposium on Communications and Information Technology, pp. 460–465, Sep. 2009. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5341206

[5] M. Joham, L. Barbero, T. Lang, W. Utschick, J. Thompson, and T. Ratnarajah, “FPGA implementation of MMSE metric based efficient near-ML detection,” in IEEE International ITG Workshop on Smart Antennas (WSA), 2008, pp. 139–146. [Online]. Available: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4475549

[6] X. Huang, C. Liang, and J. Ma, “System Architecture and Implementation of MIMO Sphere Decoders on FPGA,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 16, no. 2, pp. 188–197, Feb. 2008. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4408629

[7] F. Rusek and J. B. Anderson, “Multistream Faster than Nyquist Signaling,” IEEE Transactions on Communications, vol. 57, no. 5, pp. 1329–1340, May 2009. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4939227

[8] M. Prasad, K. Ray, and A. Dhar, “FPGA implementation of Discrete Fractional Fourier Transform,” International Conference on Signal Processing and Communications (SPCOM), 2010.

[9] M. Perrett and I. Darwazeh, “FPGA Implementation of Quad Output Generator for Spectrally Efficient Wireless FDM Evaluation,” 7th International Symposium of Broadband Communications (ISBC), 2010.

[10] D. Dasalukunte, F. Rusek, J. B. Anderson, and V. Owall, “Transmitter architecture for faster-than-Nyquist signaling systems,” IEEE International Symposium on Circuits and Systems, no. 1, pp. 1028–1031, May 2009. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5117934

[11] D. Dasalukunte, F. Rusek, V. Owall, K. Ananthanarayanan, and M. Kandasamy, “Hardware Implementation of Mapper for Faster-Than-Nyquist Signaling Transmitter,” IEEE Norchip Conference, pp. 1–5, Nov. 2009. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5397801

[12] D. Dasalukunte, F. Rusek, and V. Owall, “Multicarrier Faster-Than-Nyquist Transceivers Hardware Architecture and Performance Analysis,” IEEE Transactions on Circuits and Systems I: Regular Papers, pp. 1–12, 2010.
