3GPP LTE Digital Front End Reference Design

XAPP1123 (v1.0) October 29, 2008 www.xilinx.com 1

© 2008 Xilinx, Inc. All rights reserved. XILINX, the Xilinx logo, and other designated brands included herein are trademarks of Xilinx, Inc. All other trademarks are the property of their respective owners.

Summary

This application note provides designers with an optimized solution for Digital Up Conversion (DUC), Digital Down Conversion (DDC), and Crest Factor Reduction (CFR) required in a typical 3rd Generation Partnership Project (3GPP) Long Term Evolution (LTE) radio.

The design is configurable to support seven single and multi-carrier scenarios, while offering an optimized solution for each chosen configuration that allows designers to select their requirement without paying a penalty on design area, and therefore cost and power. Developed in Xilinx® System Generator for DSP™, the design allows customization to meet the needs of radio designs for the 3GPP LTE specification.

Accompanying this application note are design files, test vectors, and scripts that allow designers to quickly evaluate the performance of the reference design within MATLAB®. Additionally, instructions on how to integrate the reference design into a larger system design are included. Design files are available for Virtex®-5 device architectures.

Introduction

The wireless industry is aggressively reducing Capital Expenditure (CapEx) and Operating Expenditure (OpEx). It is estimated that up to 60 percent of the overall CapEx is incurred within the radio elements of a typical base station. Additionally, because the radio also contains the power amplifiers, the radio portion of the design is responsible for much of the OpEx incurred during the lifetime of the site.

Reducing CapEx can be achieved through the use of less costly non-linear power amplifiers and highly integrated digital radio transceivers. Integration, low cost, high reliability, and low power are key elements of the Xilinx radio solution. To meet industry cost needs, designs must be realized in Xilinx devices in the most efficient manner possible. This application note demonstrates that high clock rates and efficient design techniques for DUC, DDC, and CFR processing in Xilinx devices enable designers to meet the needs of their radio designs with very low cost and low power, while benefiting from smaller PCB area and greater reliability.

Additionally, OpEx can be reduced through the use of advanced algorithms. OpEx is directly related to the power amplifier efficiency in the base station. Currently, a very small proportion of the DC power consumed by the base station is converted to radiated energy. The efficiency at which a power amplifier can be operated is a function of the transmitted signal. LTE signals have a high Peak-to-Average Power Ratio (PAPR) or Crest Factor. This imposes significant operating restrictions on the power amplifier. To handle the peaks, the amplifier is heavily backed off from its most efficient operating point. To increase efficiency, CFR algorithms can be used to decrease the PAPR of the transmitted signal prior to it entering the power amplifier. By doing so, the power amplifier can operate with less back off and thus increased efficiency.
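The PAPR figure discussed above can be computed directly from complex baseband samples. The following is a minimal Python sketch and is not part of the reference design; the function name `papr_db` and the two sample vectors are illustrative assumptions.

```python
import math


def papr_db(samples):
    """Return the peak-to-average power ratio of a sample sequence in dB."""
    powers = [abs(s) ** 2 for s in samples]          # instantaneous power
    mean_power = sum(powers) / len(powers)           # average power
    return 10.0 * math.log10(max(powers) / mean_power)


# Constant-envelope tone: every sample has the same power, so PAPR is about 0 dB.
flat = [complex(math.cos(0.1 * n), math.sin(0.1 * n)) for n in range(64)]

# Hypothetical signal with one large peak: peak power 9, mean power 3.
peaky = [1, 1, 1, 3]

print("flat  PAPR (dB):", papr_db(flat))
print("peaky PAPR (dB):", papr_db(peaky))
```

A signal with occasional large peaks reports a higher value than a constant-envelope signal of the same average power, which is why the amplifier has to be backed off further for it.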

Another method of improving the efficiency of power amplifiers is to use Digital Pre-Distortion (DPD). Rather than using digital signal processing to reduce the dynamic range of the transmitted signal, as CFR does, DPD linearizes the power amplifier itself. DPD is outside the scope of this document; it is mentioned here because it is a widely used method of improving amplifier efficiency.

Application Note: Virtex-5 FPGA

XAPP1123 (v1.0) October 29, 2008

3GPP LTE Digital Front End Reference Design

Authors: Helen Tarn, Ed Hemphill, and David Hawke


http://www.xilinx.com


Acronyms and Abbreviations

3GPP 3rd Generation Partnership Project

AGC Automatic Gain Control

Block RAM Block Random Access Memory (Xilinx device resource)

BS Base Station

BTS Base Transceiver Station

CapEx Capital Expenditure

CFR Crest Factor Reduction

dB Decibels

DDC Digital Down Converter

DDS Direct Digital Synthesizer

DFE Digital Front End

DPD Digital Pre-Distortion

DSP Digital Signal Processing/Processor

DUC Digital Up Converter

EDGE Enhanced Data rates for GSM Evolution

EDGE2 or e-EDGE Evolved EDGE

FPGA Field Programmable Gate Array

FIR Finite Impulse Response

GSM Global System for Mobile (Communications), originating from Groupe Spécial Mobile

GUI Graphical User Interface

HDL Hardware Description Language

IF Intermediate Frequency

LSB Least Significant Bit(s)

LUT Look-Up Table

MAC Multiply-Accumulate

MSB Most Significant Bit(s)

Msps Mega-samples per second (1,000,000 samples per second)

OpEx Operating Expenditure

PAPR Peak-to-Average Power Ratio

PAR Place and Route

PSD Power Spectral Density

RMS Root Mean Square

SFDR Spurious-Free Dynamic Range

SNR Signal-to-Noise Ratio

TDM Time Division Multiplex

XST Xilinx Synthesis Technology


Contents

Summary
Introduction
  Acronyms and Abbreviations
Contents
Figures
Tables
System-Level Overview
  Transmit Downlink
  Receive Uplink
Transmit Downlink Design & Implementation
  Performance Requirements
  Digital Up Converter Architecture
  Crest Factor Reduction Architecture
  System Performance
  Implementation
Receive Uplink Design & Implementation
  System Performance
  Implementation
Resource Utilization Summary
  Resource Utilization for Downlink
  Resource Utilization for Uplink
Power Consumption
  Power Consumption for Downlink
  Power Consumption for Uplink
Interface Requirements
  Downlink Interface Description
  Downlink Interface Timing
  Uplink Interface Description
  Uplink Interface Timing
Latency
Software Requirements
Hardware Verification
System Integration
Conclusion
References
Revision History
Notice of Disclaimer

Figures

Figure 1. Digital Front-End Architecture for Transmit Downlink
Figure 2. Digital Front-End Architecture for Receive Uplink
Figure 3. Reference Point for EVM Measurement on an LTE System
Figure 4. Digital Up Converter Architecture
Figure 5. PSD of the Baseband LTE Signal for 20-MHz Bandwidth
Figure 6. Magnitude Response of Single-Rate Channel Filter
Figure 7. Interpolation Filter Structure for 1x5 MHz Configuration
Figure 8. Interpolation Filter Structure for 1x10 MHz Configuration
Figure 9. Interpolation Filter Structure for 1x15 MHz Configuration
Figure 10. Interpolation Filter Structure for 1x20 MHz Configuration
Figure 11. Interpolation Filter Structure for 2x5 MHz Configuration
Figure 12. PSD after Multi-Carrier Mixing @ 15.36 MHz for 2x5 MHz Configuration
Figure 13. Interpolation Filter Structure for 2x10 MHz Configuration
Figure 14. PSD after Multi-Carrier Mixing @ 30.72 MHz for 2x10 MHz Configuration
Figure 15. Interpolation Filter Structure for 4x5 MHz Configuration
Figure 16. PSD after Multi-Carrier Mixing @ 30.72 MHz for 4x5 MHz Configuration
Figure 17. 4-Channel Mixing and Combining Module
Figure 18. Time Domain View of Peak Cancellation
Figure 19. Block Diagram of PC-CFR Method (One Iteration)
Figure 20. CCDF of CFR Input and Output with Two Iterations of PC-CFR (1x10 MHz)
Figure 21. PSD of CFR Input and Output with Two Iterations of PC-CFR (1x10 MHz)
Figure 22. Constellation Plot for 64 QAM (1x10 MHz) – 4% EVM
Figure 23. CCDF of CFR Input and Output with Two Iterations of PC-CFR (2x10 MHz)
Figure 24. PSD of CFR Input and Output with Two Iterations of PC-CFR (2x10 MHz)
Figure 25. CCDF of CFR Input and Output with [1 1 1 1] Carrier Configuration, 4% EVM
Figure 26. Non-ideal CCDF Curve with [1 1 1 1] Carrier Configuration
Figure 27. PSD of CFR Input and Output with [1 0 0 1] Carrier Config, 4% EVM
Figure 28. System Generator Block Diagram of Transmit Downlink for 4x5 MHz Configuration
Figure 29. System Generator DUC Configuration Subsystem
Figure 30. GUI for Single Carrier DUC Configuration
Figure 31. System Generator Block Diagram of DUC for 1x15 MHz Configuration
Figure 32. System Generator Block Diagram of DUC for 1x15 MHz Configuration
Figure 33. System Generator Block Diagram of Mixer & Combiner for 4x5 MHz Configuration
Figure 34. System Generator Block Diagram of Rasterized DDS (Half Wave Storage)
Figure 35. System Generator Diagram of Step Terminal Count Block (4x5 MHz Configuration)
Figure 36. System Generator Block Diagram of Rasterized DDS (Full Wave Storage)
Figure 37. System Generator Diagram of Step Terminal Count Block (2x5 MHz Configuration)
Figure 38. System Generator Block Diagram of Rate and Data Format Conversion Block
Figure 39. System Generator Block Diagram of PC-CFR Design
Figure 40. System Generator Diagram of Cancellation Pulses (c_pulses) Module, 6 CPGs version
Figure 41. Decimation Filter Structure for 1x10 MHz Configuration
Figure 42. Decimation Filter Structure for 1x15 MHz Configuration
Figure 43. Decimation Filter Structure for 1x20 MHz Configuration
Figure 44. Decimation Filter Structure for 2x5 MHz Configuration
Figure 45. Decimation Filter Structure for 2x10 MHz Configuration
Figure 46. Decimation Filter Structure for 4x5 MHz Configuration
Figure 47. Magnitude Response of Channel Filter
Figure 48. Example DDS Output Spectrum
Figure 49. Block Diagram of Receive Uplink Simulation Software
Figure 50. PSD of DDC Input with No Noise and No Interference
Figure 51. Transmitted and Received Constellation with No Noise and No Interference
Figure 52. PSD of DDC Input for Noise Only Case
Figure 53. Transmitted and Received Constellation for Noise Only Case
Figure 54. PSD of DDC Input for Wideband ACS Test
Figure 55. PSD of DDC Input for Narrowband ACS Test
Figure 56. PSD of DDC Input for Wideband Intermod Test
Figure 57. PSD of DDC Input for Narrowband Intermod Test
Figure 58. System Generator Block Diagram of DDC Top Level
Figure 59. System Generator Block Diagram of DDC for 1x5 MHz Configuration
Figure 60. System Generator Block Diagram of DDC for 4x5 MHz Configuration
Figure 61. System Generator Block Diagram of Fs_4_Mixer
Figure 62. System Generator Block Diagram of HB4 Module
Figure 63. Screen Shots of FIR Compiler 4.0 GUI
Figure 64. System Generator Block Diagram of Four-Carrier Mixer
Figure 65. System Generator Block Diagram of Four-Carrier Mixer
Figure 66. System Generator Block Diagram of TDM Circuit for Carrier Frequencies
Figure 67. System Generator Block Diagram of Frequency-to-Phase Accumulator
Figure 68. System Generator Block Diagram of Sine/Cosine Lookup Table
Figure 69. Dynamic Power versus Frequency of LTE DUC/CFR
Figure 70. Dynamic Power versus Frequency of LTE DDC
Figure 71. Downlink Top-Level Component
Figure 72. Timing Diagram of Input Data Interface for Downlink
Figure 73. Timing Diagram of Input Control Interface for Downlink
Figure 74. Timing Diagram of CFR Configuration Interface
Figure 75. Timing Diagram of Output Data Interface for Downlink
Figure 76. DDC Top-Level Component
Figure 77. Timing Diagram of Data Interface for DDC

Tables

Table 1. Performance Summary for Transmit Downlink
Table 2. Performance Summary for Receive Uplink
Table 3. General LTE Emission Limits
Table 4. Additional LTE Emission Limit
Table 5. Spectral Mask Requirements for LTE (5/10/15/20 MHz BW)
Table 6. Properties for Different Carrier Bandwidth Settings
Table 7. Filter Parameters for the Channel Filter
Table 8. Filter Parameters for Reference Design Configurations
Table 9. Prototype Filter for Designing Cancellation Pulse Coefficients
Table 10. PC-CFR Performance for Single-Carrier Configurations
Table 11. PC-CFR Performance for Dual-Carrier Configurations
Table 12. PC-CFR Performance for 4x5 MHz Configuration
Table 13. FIR Compiler Settings for 1x5 MHz Configuration
Table 14. FIR Compiler Settings for 1x10 MHz Configuration
Table 15. FIR Compiler Settings for 1x15 MHz Configuration
Table 16. FIR Compiler Settings for 1x20 MHz Configuration
Table 17. FIR Compiler Settings for 2x5 MHz Configuration
Table 18. FIR Compiler Settings for 2x10 MHz Configuration
Table 19. FIR Compiler Settings for 4x5 MHz Configuration
Table 20. Resource Utilization Summary for PC-CFR Design
Table 21. E-UTRA BS Reference Sensitivity Level (From 3GPP TR 36.104, Table 7.2-1)
Table 22. ACS Requirement with Wideband Interferer (from Table 7.5-3 of [Ref 2])
Table 23. ACS Requirement with Narrowband Interferer (from Tables 7.5-1,2 of [Ref 2])
Table 24. In-Band Blocking Requirements (from Tables 7.6-1,2 of [Ref 2])
Table 25. Wideband Intermodulation Requirements (from Tables 7.8-1,2 of [Ref 2])
Table 26. Narrowband Intermodulation Requirements (from Table 7.8-3 of [Ref 2])
Table 27. Summary of ACS, Blocking, and Intermodulation Requirements
Table 28. Filter Parameters for Halfband Decimator that Follows Fs/4 Mixer
Table 29. Filter Parameters for 1x5 MHz Configuration
Table 30. Filter Parameters for 1x15 MHz Configuration
Table 31. Filter Parameters for 1x20 MHz Configuration
Table 32. Filter Parameters for 1x10 MHz Configuration
Table 33. Filter Parameters for 2x5 MHz Configuration
Table 34. Filter Parameters for 2x10 MHz Configuration
Table 35. Filter Parameters for 4x5 MHz Configuration
Table 36. Filter Parameters for the Channel Filter
Table 37. Decimation Filter Characteristics after 18-Bit Quantization
Table 38. SC-FDMA Parameters for Uplink
Table 39. SC-FDMA Parameters for Uplink
Table 40. Performance Data for Noise Only Case
Table 41. Performance Data for Wideband ACS Test
Table 42. Performance Data for Narrowband ACS Test
Table 43. Performance Data for In-Band Blocking Test
Table 44. Performance Data for Wideband Intermod Test
Table 45. Performance Data for Narrowband Intermod Test
Table 46. FIR Compiler Settings for 1x5 MHz Configuration
Table 47. FIR Compiler Settings for 1x10 MHz Configuration
Table 48. FIR Compiler Settings for 1x15 MHz Configuration
Table 49. FIR Compiler Settings for 1x20 MHz Configuration
Table 50. FIR Compiler Settings for 2x5 MHz Configuration
Table 51. FIR Compiler Settings for 2x10 MHz Configuration
Table 52. FIR Compiler Settings for 4x5 MHz Configuration
Table 53. Resource Utilization Summary for Downlink Design (DUC+CFR)
Table 54. Resource Utilization Summary for DDC
Table 55. Dynamic Power of LTE DUC/CFR
Table 56. Dynamic Power of LTE DDC
Table 57. Port Definitions for Downlink Design
Table 58. Period for Input Valid Signal (vin) in Downlink Configurations
Table 59. Port Definitions for DDC
Table 60. Latency and Total Delay of DUC+CFR


System-Level Overview




Transmit Downlink

Figure 1 shows the top-level block diagram for the transmit downlink portion of the digital front-end reference design. The downlink portion includes a DUC, which upsamples the baseband signals to 122.88 Mega-samples per second (Msps), and a peak cancellation crest factor reduction (PC-CFR) module to control the peak-to-average power ratio (PAPR) of the signal. The reference design supports seven different configurations:

• Single-carrier 5-MHz, 10-MHz, 15-MHz, and 20-MHz bandwidths

• Dual-carrier 5-MHz and 10-MHz bandwidths

• Four-carrier 5-MHz bandwidth

The common zero intermediate frequency (IF) architecture is assumed. Therefore, for single-carrier configurations, the DUC consists of several stages of interpolation filtering and no mixer stage is required. For multi-carrier configurations, a mixing and combining stage is necessary to allocate an individual carrier to its relative position in a 10-MHz or 20-MHz bandwidth before further processing.
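As a worked illustration of how much interpolation the DUC must provide, the sketch below derives the overall interpolation ratio and the number of FPGA clock cycles available per input sample from the sample rates and clock rate listed in Table 1. It is not part of the reference design; the names `duc_ratios`, `INPUT_RATE_KSPS`, `OUTPUT_RATE_KSPS`, and `FPGA_CLOCK_KHZ` are assumptions made for this example, and the split of the overall ratio into individual filter stages is not modeled.

```python
from fractions import Fraction

# Rates taken from Table 1, expressed in kHz / ksps so the arithmetic stays exact.
OUTPUT_RATE_KSPS = 122_880
FPGA_CLOCK_KHZ = 368_640
INPUT_RATE_KSPS = {5: 7_680, 10: 15_360, 15: 23_040, 20: 30_720}  # keyed by BW in MHz


def duc_ratios(bw_mhz):
    """Return (overall interpolation ratio, FPGA clocks per input sample)."""
    input_rate = INPUT_RATE_KSPS[bw_mhz]
    interpolation = Fraction(OUTPUT_RATE_KSPS, input_rate)
    clocks_per_input = Fraction(FPGA_CLOCK_KHZ, input_rate)
    return interpolation, clocks_per_input


for bw in sorted(INPUT_RATE_KSPS):
    interpolation, clocks = duc_ratios(bw)
    print(f"{bw:>2} MHz: interpolate by {interpolation}, "
          f"{clocks} clock cycles per input sample")
```

The 15-MHz case yields a non-integer overall ratio (16/3), which suggests why a single fixed filter chain cannot serve every bandwidth and each configuration needs its own structure.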

Performance Summary

Table 1 summarizes the performance of the transmit downlink portion of the digital front-end design.


Figure 1: Digital Front-End Architecture for Transmit Downlink

Table 1: Performance Summary for Transmit Downlink

| Parameter | Value | Comments |
|---|---|---|
| Channel Bandwidths (BW) | 5, 10, 15, 20 MHz | Also supports 2x5 MHz, 2x10 MHz, and 4x5 MHz configurations. |
| Input Sample Rates | 7.68, 15.36, 23.04, 30.72 Msps | For channel bandwidths of 5, 10, 15, 20 MHz, respectively. |
| Output Sample Rate | 122.88 Msps | |
| FPGA Clock Rate | 368.64 MHz | 3×122.88 MHz |
| Spectral Mask Requirements | For frequency offsets within 0 ~ 1 MHz: 55 dB; within 1 ~ 10 MHz: 65 dB; over 10 MHz: 67 dB | Derived from Tables 6.6.2.2-6, -7, -12, and -13 of [Ref 1] plus 15 dB margin (assuming maximum base station power = 46 dBm). |
| ACLR | ≥ 60 dB | |
| PAPR Target | 6~8 dB @ 0.01% clipping probability | |

[Figure 1 diagram labels: Digital Up Converter (Channel Filter; Interpolation Filtering and Multi-carrier Mixing), PC-CFR; sample rates 7.68/15.36/23.04/30.72 Msps at the input and 122.88 Msps after up-conversion.]





Receive Uplink

As illustrated in Figure 2, the receive uplink portion of the digital front-end reference design consists of a first stage mixer, decimation filters, a second stage mixer (for multi-carrier configurations), and a channel filter.

The first stage mixer is based on an IF centered at one fourth the sample rate (Fs/4). This allows for a very efficient hardware implementation of the mixing process and initial halfband filtering. If the IF is centered at a frequency other than Fs/4, an alternate structure must be used. For example, a generic DDS can be generated using the Xilinx DDS Compiler and this can be combined with a DSP48-based complex multiplier to perform the mixing. The output of this more generic mixing process would feed a traditional halfband decimator that can be designed using the Xilinx Finite Impulse Response (FIR) Compiler.
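To illustrate why an IF at Fs/4 leads to such an efficient mixer, note that the mixing sequence exp(-j·π·n/2) takes only the values 1, -j, -1, and j, so down-conversion reduces to sign changes and routing samples between the I and Q paths. The sketch below is a behavioral Python model only, not the hardware structure used in the reference design; the function name `fs4_downmix` and the test vector are assumptions for this example.

```python
import cmath
import math


def fs4_downmix(real_samples):
    """Mix a real signal centered at Fs/4 down to baseband without multipliers.

    Equivalent to multiplying x[n] by exp(-j*pi*n/2), whose values cycle
    through 1, -j, -1, j.
    """
    mixed = []
    for n, x in enumerate(real_samples):
        phase = n % 4
        if phase == 0:
            mixed.append(complex(x, 0))     # x * 1
        elif phase == 1:
            mixed.append(complex(0, -x))    # x * (-j)
        elif phase == 2:
            mixed.append(complex(-x, 0))    # x * (-1)
        else:
            mixed.append(complex(0, x))     # x * j
    return mixed


# Compare against an explicit complex multiplication on arbitrary samples.
samples = [0.3, -1.2, 2.0, 0.7, 1.1, -0.4]
reference = [x * cmath.exp(-1j * math.pi * n / 2) for n, x in enumerate(samples)]
worst_error = max(abs(a - b) for a, b in zip(fs4_downmix(samples), reference))
print("largest deviation from explicit mixing:", worst_error)
```

Because half of the I samples and half of the Q samples are always zero, the halfband filter that follows can also skip the corresponding multiplications, which is consistent with the efficiency claim above.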

Multiple decimation filtering architectures are required to support the various bandwidths and carrier configuration options in a hardware efficient manner. For multi-carrier configurations, the decimation filtering module includes a second stage of mixing that is used prior to final decimation and channel filtering.
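Tables 1 and 2 state that the multi-carrier mixers use a DDS restricted to a 100 kHz raster. One common way to exploit such a raster, shown here purely as an assumption and not as the circuit used in the reference design, is to store exactly one period of the raster frequency in a lookup table and step through it with an integer increment, so the generated tone has no phase-truncation error. The names `rasterized_dds` and `dds_samples`, the 30.72 Msps mixing rate, and the 2.5 MHz carrier offset are all illustrative choices.

```python
import cmath
import math


def rasterized_dds(freq_hz, sample_rate_hz, raster_hz=100_000):
    """Return (lut, step) for a tone at freq_hz, which must lie on the raster."""
    if freq_hz % raster_hz != 0:
        raise ValueError("frequency must be a multiple of the raster")
    common = math.gcd(sample_rate_hz, raster_hz)
    lut_len = sample_rate_hz // common        # samples after which any raster tone repeats
    step = (freq_hz // common) % lut_len      # LUT entries advanced per output sample
    lut = [cmath.exp(2j * math.pi * k / lut_len) for k in range(lut_len)]
    return lut, step


def dds_samples(freq_hz, sample_rate_hz, count, raster_hz=100_000):
    """Generate `count` complex samples by stepping through the LUT modulo its length."""
    lut, step = rasterized_dds(freq_hz, sample_rate_hz, raster_hz)
    index = 0
    out = []
    for _ in range(count):
        out.append(lut[index])
        index = (index + step) % len(lut)
    return out


FS = 30_720_000          # assumed mixing rate for this example
lut, step = rasterized_dds(2_500_000, FS)
print("LUT length:", len(lut), "step:", step)
```

Negative offsets work as well because the step is reduced modulo the table length. A hardware implementation would typically shrink the table further by exploiting waveform symmetry, which this sketch does not attempt.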

Performance Summary

Table 2 summarizes the performance of the receive uplink portion of the digital front-end design.

The target filter parameters are derived from the adjacent channel selectivity (ACS), blocking, and intermodulation requirements of the LTE BS receiver described in [Ref 1]. The ACS, blocking, and intermodulation requirements are summarized in the following subsections along with some assumptions that were used in deriving the filter requirements.

Table 1: Performance Summary for Transmit Downlink (Cont'd)

| Parameter | Value | Comments |
|---|---|---|
| EVM for Digital Portion | For single carrier: 2% at 8 dB PAPR, 4% at 7 dB PAPR, 7% at 6 dB PAPR. For multi-carrier: see "Crest Factor Reduction Architecture" for detail. | 64 QAM Modulation |
| Input Signal Quantization | 16 bits I/Q | Complex input |
| Output Signal Quantization | 16 bits I/Q | Complex output |
| Mixer Properties | Tunability: Fixed; Resolution: 100 kHz; Spurious-Free Dynamic Range (SFDR): Ideal | Assumes zero IF; used only in multi-carrier configurations. Uses rasterized DDS based on 100 kHz raster. |

Figure 2: Digital Front-End Architecture for Receive Uplink

[Figure 2 diagram labels: From A/D Converter (122.88 Msps); Fs/4 Mixer; Halfband 2 (61.44 Msps); Decimation Filtering and Multi-carrier Mixing; Channel Filter (7.68/15.36/23.04/30.72 Msps); To Baseband Processing. All modules run at 368.64 MHz. Sample rates are per single complex channel.]

Table 2: Performance Summary for Receive Uplink

| Parameter | Value | Comments |
|---|---|---|
| Channel Bandwidths (BW) | 5, 10, 15, 20 MHz | Also supports 2x5 MHz, 2x10 MHz, and 4x5 MHz configurations. |
| Input Sample Rate | 122.88 Msps | 16×7.68 Msps. |
| Output Sample Rates | 7.68, 15.36, 23.04, 30.72 Msps | For channel bandwidths of 5, 10, 15, 20 MHz, respectively. |
| FPGA Clock Rate | 368.64 MHz | 3×122.88 MHz. |
| EVM for DDC | ≤ 1% | In absence of noise and interferers. |
| Receive Filter Requirement | Fpass = 0.9×BW/2; Fstop = BW/2; Apass ≤ 0.1 dB; Astop ≥ 80 dB | Fpass is the passband frequency. Fstop is the stopband frequency. Apass is the peak-to-peak ripple. Astop is the stopband attenuation. |
| Input Signal Quantization | 14 bits | Real input. |
| Output Signal Quantization | 18 bits I, 18 bits Q | No Automatic Gain Control (AGC). |
| First Stage Mixer Properties | Tunability: None | Fs/4 Mixer. |
| Second Stage Mixer Properties (used only in multi-carrier configurations) | Tunability: Fixed; Resolution: 100 kHz; SFDR: Ideal | Uses simplified DDS based on 100 kHz raster. |


Transmit Downlink Design & Implementation




Performance Requirements

Spectral Mask Requirements

The spectral mask requirements used in the digital front-end design are derived from the LTE (also called E-UTRA) emission limits defined in [Ref 1].

The general LTE emission limits are listed in Table 3; they are summarized from Table 6.6.2.2-6 and Table 6.6.2.2-12 of [Ref 1].

Table 4 shows the additional LTE emission limits generated from Table 6.6.2.2-7 and Table 6.6.2.2-13 of the same volume [Ref 1].

The maximum BS power is 43 dBm for a 5-MHz carrier and 46 dBm for 10-MHz, 15-MHz, and 20-MHz carriers, as given in Table 4.6 of [Ref 3].

The spectral mask requirements for the 5-MHz, 10-MHz, 15-MHz, and 20-MHz bandwidths can be calculated by combining the general LTE emission limits with the additional LTE emission limits shown in these tables. The most stringent of the four requirements, plus a 15 dB margin, is selected as the spectral emission mask, so that sufficient margin is guaranteed and the designs for the various bandwidth configurations can share the same channel filter. The spectral mask requirements are listed in Table 5.

Adjacent Channel Leakage Ratio

Adjacent Channel Leakage Ratio (ACLR) is defined as the ratio of the in-band power to the power in adjacent LTE carriers.

Table 3: General LTE Emission Limits

| Frequency Offset to Measurement Filter Edge (-3 dB cutoff) | Maximum Power in Measurement Bandwidth | Measurement Bandwidth |
|---|---|---|
| 0 MHz to 5 MHz | -7 dBm → -14 dBm | 100 kHz |
| 5 MHz to 10 MHz | -14 dBm | 100 kHz |
| > 10 MHz | -15 dBm | 1 MHz |

Table 4: Additional LTE Emission Limit

| Frequency Offset to Measurement Filter Edge (-3 dB cutoff) | Maximum Power in Measurement Bandwidth | Measurement Bandwidth |
|---|---|---|
| 0 MHz to 1 MHz | BW = 5 MHz: -15 dBm | 30 kHz |
| 0 MHz to 1 MHz | BW = 10 MHz: -13 dBm | 100 kHz |
| > 1 MHz | -13 dBm | 1 MHz |

Table 5: Spectral Mask Requirements for LTE (5/10/15/20 MHz BW)

| For \|Δf\| Within the Range | Minimum Attenuation (dB) | Attenuation with 15 dB Margin (dB) |
|---|---|---|
| 0 MHz to 1 MHz | 40 | 55 |
| 1 MHz to 10 MHz | 50 | 65 |
| > 10 MHz | 52 | 67 |





The power in an adjacent channel is measured with a rectangular filter with a bandwidth equal to 180 kHz × transmission bandwidth configuration (NRB), centered on the first or second adjacent channel (corresponding to ACLR1 and ACLR2).

For 5-MHz, 10-MHz, 15-MHz, and 20-MHz LTE carriers, NRB is equal to 25, 50, 75, and 100, respectively. Therefore, the corresponding rectangular filter bandwidth is 4.5 MHz, 9 MHz, 13.5 MHz, and 18 MHz, respectively. The minimum requirement is listed in section 6.6.2.3.1 of 3GPP TR 36.804 [Ref 1].

The limits for ACLR1 and ACLR2 are both 45 dB. In the reference design, 60 dB is specified as the minimum ACLR requirement to provide 15 dB margin.
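The measurement bandwidths and the margin quoted above follow from simple arithmetic. The sketch below reproduces them; the function and constant names are illustrative and not part of the reference design.

```python
# Illustrative arithmetic only; names are hypothetical, values are from the text.
RB_BANDWIDTH_KHZ = 180  # bandwidth of one resource block

# Transmission bandwidth configuration (NRB) per channel bandwidth in MHz
N_RB = {5: 25, 10: 50, 15: 75, 20: 100}


def aclr_filter_bw_mhz(channel_bw_mhz):
    """Bandwidth of the rectangular ACLR measurement filter, in MHz."""
    return N_RB[channel_bw_mhz] * RB_BANDWIDTH_KHZ / 1000.0


def aclr_target_db(limit_db=45.0, margin_db=15.0):
    """ACLR target used by the design: the limit plus the design margin."""
    return limit_db + margin_db


if __name__ == "__main__":
    for bw in sorted(N_RB):
        print(f"{bw} MHz carrier: {aclr_filter_bw_mhz(bw)} MHz measurement filter")
    print(f"ACLR design target: {aclr_target_db()} dB")
```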

Error Vector Magnitude

For an LTE system, the reference point in the receiver for Error Vector Magnitude (EVM) measurement is at the point after the Cyclic Prefix (CP) removal, FFT, and subcarrier amplitude and phase correction, as shown in Figure 3.

The basic unit of EVM measurement is defined over one subframe in the time domain (1 ms) and one resource block which contains 12 data subcarriers (180 kHz) in the frequency domain for frame structure type 1. To test against the EVM requirements, the Root Mean Square (RMS) average of individual EVM is calculated over 10 consecutive downlink subframes (10 ms) and all allocated resource blocks in the frequency domain for both FDD and TDD frame structure type 1. This can be expressed in Equation 1 as:

Equation 1:

$$\mathrm{EVM} = \sqrt{\frac{1}{\sum_{i=1}^{10} N_i} \sum_{i=1}^{10} \sum_{j=1}^{N_i} \mathrm{EVM}_{i,j}^{2}}$$

Where:

$N_i$ is the number of resource blocks in the ith subframe

$\mathrm{EVM}_{i,j}$ is the EVM for the ith subframe and jth resource block. For a more detailed definition, refer to section 6.8.1 of the 3GPP TR 36.804 v0.7.1 (2007-10) [Ref 1].

Figure 3: Reference Point for EVM Measurement on an LTE System
[Block diagram: Base Station Under Test, Pre/Post FFT Time/Frequency Sync, Cyclic Prefix Removal, FFT, Per Sub-carrier Amplitude/Phase Correction, Reference Point for EVM Measurement, Symbol Detection/Decoding.]

Separate EVM requirements are specified for different modulation schemes:

• For 64 QAM modulation, a range of 7 ~ 8% is the proposed EVM requirement.

• For 16 QAM and QPSK, 12.5% and 17.5% are the proposed minimum performance requirements.

Note: These figures are for the total system EVM, which includes the digital and analog portions. For the digital portion in single-carrier configurations, this application note uses the following minimum requirements: EVM of 2% at 8 dB PAPR, 4% at 7 dB PAPR, and 7% at 6 dB PAPR.
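The RMS averaging of Equation 1 can be sketched as follows. This is a minimal illustration that assumes the per-resource-block EVM values are already available as fractions (0.02 for 2%); the function name and the data layout are hypothetical.

```python
import math


def rms_evm(evm_per_subframe):
    """RMS average of per-resource-block EVM values.

    evm_per_subframe[i][j] is the EVM (as a fraction) for subframe i and
    resource block j. Subframes may hold different numbers of blocks.
    """
    total_blocks = sum(len(blocks) for blocks in evm_per_subframe)
    sum_of_squares = sum(e * e for blocks in evm_per_subframe for e in blocks)
    return math.sqrt(sum_of_squares / total_blocks)


if __name__ == "__main__":
    # Ten subframes with 25 allocated resource blocks each (5 MHz carrier),
    # every block at 2% EVM: the RMS average is also 2%.
    measurement = [[0.02] * 25 for _ in range(10)]
    print(f"RMS EVM = {100 * rms_evm(measurement):.2f}%")
```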

Digital Up Converter Architecture

This section describes the detailed architecture of major modules in the digital up-converter. Figure 4 shows the overview block diagram of the DUC.

Single Rate Channel Filter

The baseband data first has to pass the channel filter so that the out-of-band power is attenuated to meet the spectral mask requirements. Because the LTE baseband signal is OFDM-based, the power spectral density (PSD) of the input signal to the channel filter already has a natural attenuation starting from the edge of the occupied bandwidth (i.e., 90% of the total channel bandwidth).

As shown in Figure 5, for the 20-MHz bandwidth configuration, the PSD at the 10-MHz frequency point is more than 30 dB below that at the active bins. Similar PSD characteristics are observed for the 5-MHz, 10-MHz, and 15-MHz bandwidth configurations. To ensure that the signal at the output of the channel filter is down by up to 67 dB (as summarized in Table 5) outside the desired bandwidth, the channel filter must supply the remaining attenuation, approximately 40 dB.


Figure 4: Digital Up Converter Architecture
[Block diagram: input at 7.68/15.36/23.04/30.72 Msps, Single-Rate Channel Filter, Interpolation Filtering 1, Mixing and Combining (multi-carrier only), Interpolation Filtering 2, Fractional Resampling (1x15 MHz only), output at 122.88 Msps.]


Figure 5: PSD of the Baseband LTE Signal for 20-MHz Bandwidth





Table 6 shows some useful properties for all four bandwidth settings, such as the number of usable subcarriers, subcarrier spacing, occupied bandwidth, and the ratio between occupied and total bandwidth. The DC subcarrier is unused.

The following numbers are important when it comes to deciding the specification for the channel filter:

• The passband for the channel filter (Fpass) must be at least BWoccupied/2.

• The stopband (Fstop) should be BWtotal/2.

Because the normalized passband and stopband frequencies, i.e., ωpass = Fpass/(Fs/2) and ωstop = Fstop/(Fs/2), respectively, are identical for the four bandwidths, the same channel filter can be shared across configurations.
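A quick numerical check makes the sharing argument concrete. The sketch below uses the nominal Fpass = 0.9 × BWtotal/2 and Fstop = BWtotal/2 together with the sample rates from Table 6; the helper name is illustrative.

```python
# Sample rate (Msps) for each total channel bandwidth (MHz), from Table 6
SAMPLE_RATE_MSPS = {5: 7.68, 10: 15.36, 15: 23.04, 20: 30.72}


def normalized_edges(bw_total_mhz, fs_msps, occupied_ratio=0.9):
    """Return (w_pass, w_stop) normalized to the Nyquist frequency Fs/2."""
    f_pass = occupied_ratio * bw_total_mhz / 2  # BWoccupied / 2
    f_stop = bw_total_mhz / 2                   # BWtotal / 2
    nyquist = fs_msps / 2
    return f_pass / nyquist, f_stop / nyquist


if __name__ == "__main__":
    for bw, fs in sorted(SAMPLE_RATE_MSPS.items()):
        w_pass, w_stop = normalized_edges(bw, fs)
        print(f"{bw:2d} MHz: w_pass = {w_pass:.4f}, w_stop = {w_stop:.4f}")
```

All four configurations give the same pair of normalized edges, which is why one set of coefficients serves every bandwidth.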

Table 7 summarizes the channel filter design parameters. It is a single-rate filter with a total of 81 taps. A channel filter with a rate change of 2 can be designed; however, the filter order can double (~160 taps) and does not necessarily result in an efficient implementation. The magnitude response of the channel filter is shown in Figure 6, where the filter coefficients have been quantized to 18 bits.

The Fpass used in the ωpass calculation is 9.015 MHz instead of 9 MHz. This makes the filter requirements more stringent (one more tap than with 9 MHz), but it ensures that the signal carried in the last active data bin passes through. It also yields an odd number of taps, which is preferable for an efficient filter implementation in terms of speed and area.

Table 6: Properties for Different Carrier Bandwidth Settings

| Total BW (BWtotal) (MHz) | Fs (Msps) | Usable Subcarriers | Subcarrier Spacing (kHz) | Occupied BW (BWoccupied) (MHz) | BWoccupied/BWtotal × 100% |
|---|---|---|---|---|---|
| 5 | 7.68 | 300 | 15 | 4.5 | 90% |
| 10 | 15.36 | 600 | 15 | 9 | 90% |
| 15 | 23.04 | 900 | 15 | 13.5 | 90% |
| 20 | 30.72 | 1200 | 15 | 18 | 90% |

Table 7: Filter Parameters for the Channel Filter

| BW (MHz) | ωpass | ωstop | Apass (dB) | Astop (dB) | # Taps |
|---|---|---|---|---|---|
| 5, 10, 15, 20 | 0.587 (= 9.015/(30.72/2)) | 0.651 (= 10.0/(30.72/2)) | 0.035 | 45 | 81 |

Interpolation Filtering

After the channel filter, a cascade of interpolation filters follows to remove the aliasing effect produced by up-sampling.

The majority of the interpolation filtering chain is implemented using a cascade of halfband filters. A halfband filter is a type of FIR filter whose transition region is centered at one quarter of the sampling rate, Fs/4. The end of its passband and the beginning of its stopband are equally spaced on either side of Fs/4.

When implementing an interpolation filter with a rate of two, the halfband filter is often the structure of choice because it requires much less computation (and thus less hardware). This results from the fact that every odd-indexed coefficient in the time domain is zero except the center tap, and the even-indexed coefficients are symmetric.
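The coefficient pattern can be seen in a small example. The sketch below builds a halfband filter with a Hamming-windowed sinc, which is a stand-in chosen for brevity and is not the equiripple design used in the reference design. Tap indices are counted from zero at the start of the filter, and the gain of 2 (center tap equal to 1) follows the interpolator convention of this application note.

```python
import math


def windowed_sinc_halfband(num_taps, gain=2.0):
    """Halfband lowpass taps from a Hamming-windowed sinc (illustrative only).

    num_taps must have the form 4k + 3 (7, 11, 23, ...), so that the center
    tap sits at an odd index when counting from zero.
    """
    if num_taps % 4 != 3:
        raise ValueError("num_taps must be of the form 4k + 3")
    center = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        m = n - center
        # Ideal lowpass with cutoff at Fs/4
        ideal = 0.5 if m == 0 else math.sin(math.pi * m / 2) / (math.pi * m)
        window = 0.54 + 0.46 * math.cos(math.pi * m / center)
        taps.append(gain * ideal * window)
    return taps


def count_nonzero(taps, tol=1e-12):
    """Number of taps that actually need a multiplier."""
    return sum(1 for t in taps if abs(t) > tol)


if __name__ == "__main__":
    taps = windowed_sinc_halfband(11)
    for n, t in enumerate(taps):
        print(f"h[{n:2d}] = {t:+.6f}")
    print("nonzero taps:", count_nonzero(taps), "of", len(taps))
```

Because the nonzero taps are also symmetric, only about half of them need distinct multipliers.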

In the 1x5, 1x10, and 1x20 MHz BW configurations, the desired sampling rate (122.88 Msps) can be achieved by using a cascade of halfband filters. For 1x15 MHz, a rate 4/3 fractional resampler follows the halfband filters to convert the sampling rate from 92.16 Msps to 122.88 Msps.
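The rate plan for each single-carrier configuration can be derived mechanically. The sketch below doubles the input rate for as long as the result does not exceed 122.88 Msps and reports any residual ratio left for a fractional resampler; the function name is illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

TARGET_RATE = Fraction("122.88")  # Msps at the DUC output


def interpolation_plan(input_rate_msps):
    """Return (number of halfband stages, residual rate change).

    input_rate_msps is given as a string such as "7.68" so that it converts
    to an exact fraction.
    """
    rate = Fraction(input_rate_msps)
    stages = 0
    while rate * 2 <= TARGET_RATE:
        rate *= 2
        stages += 1
    return stages, TARGET_RATE / rate


if __name__ == "__main__":
    for bw, fs in (("1x5", "7.68"), ("1x10", "15.36"),
                   ("1x15", "23.04"), ("1x20", "30.72")):
        stages, residual = interpolation_plan(fs)
        print(f"{bw} MHz: {stages} halfband stages, residual rate {residual}")
```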

The main difference between the single carrier and multi-carrier configurations is that the interpolation filtering process for the multi-carrier configurations includes a mixing and carrier combining stage. For the 2x10 and 4x5 MHz bandwidth, this stage frequency shifts individual carriers to their relative positions, [-5, 5] and [-7.5, -2.5, 2.5, 7.5] MHz, respectively, in a 20-MHz bandwidth centered at 0 MHz. The composite signal is then processed with further halfband filters. For the 2x5 MHz, two carriers are first shifted to [-2.5, 2.5] MHz in a 10-MHz bandwidth centered at 0 MHz before the later halfband filtering.

All filters were designed using the MATLAB Filter Design and Analysis Tool (FDATool). The seven reference designs share a similar filtering structure.

The halfband interpolation filter parameters are identical. Table 8 summarizes the filter parameters for easy referencing and comparison.


Figure 6: Magnitude Response of Single-Rate Channel Filter
[Plot: "Magnitude Response of Channel Filter"; x-axis: Frequency (MHz), 0 to 7; y-axis: dB, -100 to 0.]

1x5 MHz Configuration

The interpolation filtering chain for the 1x5 MHz configuration is shown in Figure 7.

Each filter is a halfband interpolation filter with parameters as listed in Table 8. The equiripple halfband lowpass input parameters (in FDATool) are Fs, Fpass, and Apass. The passband cutoff frequency for all four halfband filters was set to 2.5 MHz to protect the entire 5 MHz band and remove the aliasing component completely. The number of taps is the length of the smallest filter that satisfies the requirements specified by the input parameters.


1x10 MHz Configuration

The interpolation filtering chain for the 1x10 MHz configuration is shown in Figure 8, and the filter parameters are listed in Table 8. Because the passband cutoff frequencies (after normalizing to the filter sample rates) and the passband ripple for HB#1, HB#2, and HB#3 are identical to the ones for the 1x5 MHz configuration, the resultant filter coefficients are identical. The throughput requirement for each filter is doubled in implementation because the sample rates at each stage are doubled (see the “DUC Implementation” section for further details).


1x15 MHz Configuration

The interpolation filtering chain for the 1x15 MHz configuration is shown in Figure 9, and the filter parameters are listed in Table 8.

A fractional resampler with a rate of 4/3 is required to convert from the 92.16 Msps rate to the desired 122.88 Msps rate. To design the resampler, the standard polyphase filter design technique is used:

• The sampling frequency Fs is set to 92.16×4 = 368.64 Msps.

• The passband frequency is set to 7.5 MHz to protect the entire 15 MHz band.

• The stopband frequency is 92.16−7.5 = 84.66 MHz to remove the aliasing image.

Table 8: Filter Parameters for Reference Design Configurations

| Filter | Fs (Msps) | Fpass (MHz) | Fstop (MHz) | Apass (dB) | Astop (dB) | # Taps |
|---|---|---|---|---|---|---|
| HB #1 | 15.36 | 2.5 | | 0.01 | | 23 |
| HB #2 | 30.72 | 2.5 | | 0.01 | | 11 |
| HB #3 | 61.44 | 2.5 | | 0.01 | | 7 |
| HB #4 | 122.88 | 2.5 | | 0.01 | | 7 |
| Fractional Resampler | 368.64 | 7.5 | 84.66 | 0.01 | 80 | 19 |

Figure 7: Interpolation Filter Structure for 1x5 MHz Configuration
[Signal chain: 7.68 Msps, HB #1 ↑2 (23 taps), 15.36 Msps, HB #2 ↑2 (11 taps), 30.72 Msps, HB #3 ↑2 (7 taps), 61.44 Msps, HB #4 ↑2 (7 taps), 122.88 Msps.]

Figure 8: Interpolation Filter Structure for 1x10 MHz Configuration
[Signal chain: 15.36 Msps, HB #1 ↑2 (23 taps), 30.72 Msps, HB #2 ↑2 (11 taps), 61.44 Msps, HB #3 ↑2 (7 taps), 122.88 Msps.]
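A polyphase implementation of the 4/3 rate change never computes the zero-stuffed samples at 368.64 Msps. The sketch below compares a polyphase form with the direct "zero-stuff, filter, decimate" form; it is an illustration of the standard technique, not the reference design's implementation, and the coefficients used in the example are placeholders rather than the 19-tap design from Table 8.

```python
def resample_direct(x, h, up=4, down=3):
    """Reference form: insert zeros, filter at the high rate, then decimate."""
    stuffed = []
    for sample in x:
        stuffed.append(sample)
        stuffed.extend([0.0] * (up - 1))
    filtered = []
    for n in range(len(stuffed)):
        acc = 0.0
        for k, coeff in enumerate(h):
            if n - k >= 0:
                acc += coeff * stuffed[n - k]
        filtered.append(acc)
    return filtered[::down]


def resample_polyphase(x, h, up=4, down=3):
    """Polyphase form: each output uses only one sub-filter (phase) of h."""
    num_out = (len(x) * up + down - 1) // down
    out = []
    for m in range(num_out):
        t = m * down      # position in the (virtual) up-sampled stream
        phase = t % up    # which sub-filter to use
        base = t // up    # newest input sample that contributes
        acc = 0.0
        for j, coeff in enumerate(h[phase::up]):
            if base - j >= 0:
                acc += coeff * x[base - j]
        out.append(acc)
    return out


if __name__ == "__main__":
    # Placeholder triangular coefficients, 19 taps long as in Table 8
    h = [min(i + 1, 19 - i) / 100 for i in range(19)]
    x = [1.0, 2.0, -1.0, 0.5, 3.0, 0.0]
    print(resample_polyphase(x, h))
```

With 19 taps split across four phases, each output sample needs at most five multiplications.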


1x20 MHz Configuration

Figure 10 shows the interpolation filtering chain for the 1x20 MHz configuration, and the filter parameters are listed in Table 8.

After normalizing to the sample rate, the filter coefficients are identical to those in the corresponding filter for the 1x10 MHz and 1x5 MHz configurations.


2x5 MHz Configuration

Figure 11 shows the interpolation filtering chain for the 2x5 MHz configuration.

One major difference here from all the single carrier configurations is that signals from two complex channels need to be combined together to a total allocated bandwidth of 10 MHz.

Because a zero IF architecture was assumed, two individual carriers are shifted to fixed frequencies at -2.5 and 2.5 MHz before combining. This simplifies the mixer design as the whole 10-MHz BW is centered at zero so that the multi-carrier mixer can be placed at the lowest possible sample rate. This helps to achieve a significant hardware resource saving.

A non-zero IF architecture requires a mixer at the end of the DUC chain (at the highest sample rate) to support shifting to an arbitrary frequency.

After channel filtering, signals from two complex channels enter the first halfband filter and are up-sampled to 15.36 Msps. This is the earliest point at which the two carriers can be combined without aliasing, because (Fs – BWindividual_carrier) is greater than BWindividual_carrier. Here the sampling rate Fs is 15.36 Msps and BWindividual_carrier is 5 MHz.
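The choice of combining point can be checked for all three multi-carrier configurations. The sketch below rests on one interpretation of the condition: the composite signal extends to ±(composite BW)/2 around 0 MHz, and combining is alias-free once Fs minus that edge exceeds the edge itself. For two 5-MHz carriers the edge equals the individual carrier bandwidth, which matches the expression above. The function name is illustrative.

```python
def earliest_combining_rate(channel_rate_msps, composite_bw_mhz,
                            final_rate_msps=122.88):
    """Lowest rate in the doubling chain at which carriers can be combined.

    Returns None if no rate up to final_rate_msps satisfies the condition.
    """
    edge = composite_bw_mhz / 2  # composite signal spans -edge .. +edge
    rate = channel_rate_msps
    while rate <= final_rate_msps + 1e-9:
        if rate - edge > edge:
            return rate
        rate *= 2
    return None


if __name__ == "__main__":
    cases = {"2x5": (7.68, 10), "2x10": (15.36, 20), "4x5": (7.68, 20)}
    for name, (fs, bw) in cases.items():
        print(f"{name} MHz: combine at {earliest_combining_rate(fs, bw)} Msps")
```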


Figure 9: Interpolation Filter Structure for 1x15 MHz Configuration
[Signal chain: 23.04 Msps, HB #1 ↑2, 46.08 Msps, HB #2 ↑2, 92.16 Msps, Fractional Resampler P/Q = 4/3, 122.88 Msps.]

Figure 10: Interpolation Filter Structure for 1x20 MHz Configuration
[Signal chain: 30.72 Msps, HB #1 ↑2 (23 taps), 61.44 Msps, HB #2 ↑2 (11 taps), 122.88 Msps.]

Figure 11: Interpolation Filter Structure for 2x5 MHz Configuration
[Signal chain: 7.68 Msps (2 complex channels), HB #1 ↑2 (23 taps), 15.36 Msps, 2-Carrier Mixing & Combining to 10 MHz BW (2 carriers centered at [-2.5, 2.5] MHz), 15.36 Msps, HB #1 ↑2 (23 taps), 30.72 Msps, HB #2 ↑2 (11 taps), 61.44 Msps, HB #3 ↑2 (7 taps), 122.88 Msps.]

After the mixing and combining process, the composite signal is treated as data from one carrier so that the rest of the halfband filters only require supporting a single complex channel.

The PSD for 2x5 MHz LTE data after multi-carrier mixing is shown in Figure 12.

The resource trade-off is an important consideration when designing multi-carrier filter structures. Pushing the mixer and combiner to the earlier stage in the DUC chain will increase the complexity of the polyphase filters that are to follow.

In this example, instead of the 11-tap halfband interpolator used in the 1x5 MHz configuration in Figure 7, this architecture requires a 23-tap halfband filter immediately after the mixer to support the (doubled) 10 MHz total BW. This halfband filter has the same specification as the first filter immediately following the channel filter (both are HB #1), as their normalized passband frequencies are equivalent. The passband of the rest of the halfband filters is doubled. The advantage is that the filtering process only needs to be applied to one complex channel instead of the two that would be required if mixing and combining took place at a later stage. The filter parameters are listed in Table 8.

The mixer and combiner designs are detailed in “Multi-Carrier Mixing and Combining,” page 14.


2x10 MHz Configuration

Figure 13 shows the interpolation filtering chain for the 2x10 MHz configuration and the filter parameters are listed in Table 8. Similar to the 2x5 MHz configuration, signals from two complex channels are shifted to fixed frequencies at -5 MHz and 5 MHz before combining, and this is done at the lowest possible sample rate, 30.72 Msps.

Again, instead of using the 11-tap halfband interpolator, this architecture requires a 23-tap halfband filter after the mixer to support 20 MHz total BW. This halfband filter has the same specification as the first filter immediately following the channel filter (both are HB #1).


Figure 12: PSD after Multi-Carrier Mixing @ 15.36 MHz for 2x5 MHz Configuration
[Plot: "Power Spectral Density for LTE Data @ 15.36MHz"; x-axis: Frequency (MHz), -6 to 6; y-axis: Watts/Hz (dB), -100 to 0.]

Figure 14 shows the PSD for the 2x10 MHz LTE data after multi-carrier mixing.


4x5 MHz Configuration

Figure 15 shows the interpolation filtering chain for the 4x5 MHz configuration and the filter parameters are listed in Table 8, page 14.

The multi-carrier output signal has a total allocated bandwidth of 20 MHz. After channel filtering, signals from four complex channels first enter two stages of halfband filtering and are up-sampled to 30.72 Msps. Signals are shifted to fixed frequencies at -7.5 MHz, -2.5 MHz, 2.5 MHz, and 7.5 MHz before combining.

Figure 16 shows the PSD. This is the earliest point at which four carriers can be combined without any aliasing effect because 30.72 − 10 > 10. After the mixing and combining process, the signals are treated as data from one carrier and the rest of the halfband filters operate on one complex channel.

Figure 13: Interpolation Filter Structure for 2x10 MHz Configuration
[Signal chain: 15.36 Msps (2 complex channels), HB #1 ↑2, 30.72 Msps, 2-Carrier Mixing & Combining (carriers centered at [-5, 5] MHz), 30.72 Msps, HB #1 ↑2 (23 taps), 61.44 Msps, HB #2 ↑2 (11 taps), 122.88 Msps.]

Figure 14: PSD after Multi-Carrier Mixing for 2x10 MHz Configuration
[Plot: x-axis: Frequency (MHz), -15 to 15; y-axis: Watts/Hz (dB), -100 to 0.]

Figure 15: Interpolation Filter Structure for 4x5 MHz Configuration
[Signal chain: 7.68 Msps (4 complex channels), HB #1 ↑2, 15.36 Msps, HB #2 ↑2 (11 taps), 30.72 Msps, 4-Carrier Mixing & Combining to 20 MHz BW (carriers centered at [-7.5, -2.5, 2.5, 7.5] MHz), 30.72 Msps, HB #1 ↑2 (23 taps), 61.44 Msps, HB #2 ↑2 (11 taps), 122.88 Msps.]

DUC Filtering Overall Gain

The single-rate channel filter was designed to have unity gain (the sum of all the coefficients is 1). The halfband interpolators were designed with a filter gain of 2 (center tap equal to 1), so that with the zero insertion between samples in the up-sampling process, the average output signal power is the same as that of the input signal. Therefore, for the 1x5, 1x10, and 1x20 MHz configurations, the DUC filtering process has an overall gain of 0 dB. For the 1x15 MHz configuration, the fractional resampling process generates a gain of 1.104. This results in an overall gain of 0.86 dB.

For multi-carrier configurations, the up-sampling process for each carrier also preserves a 0 dB gain. With N active carriers, the composite output average amplitude is sqrt(N) times the individual input signal average amplitude. For example, in the four-carrier [1 1 0 1] configuration, the output signal average amplitude is expected to be sqrt(3) ≅ 1.7321 times the average amplitude of the input signal from each carrier.
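Both gain figures can be reproduced with a few lines of arithmetic. In the sketch below the helper names are illustrative, and the sqrt(N) relationship is taken from the text above.

```python
import math


def amplitude_gain_db(linear_gain):
    """Convert a linear amplitude gain to decibels."""
    return 20 * math.log10(linear_gain)


def composite_amplitude_ratio(carrier_enable):
    """Composite-to-single-carrier average amplitude ratio.

    carrier_enable is a list such as [1, 1, 0, 1]; the ratio is sqrt(N) for
    N active carriers, as stated in the text.
    """
    return math.sqrt(sum(carrier_enable))


if __name__ == "__main__":
    print(f"1x15 MHz resampler gain: {amplitude_gain_db(1.104):.2f} dB")
    print(f"[1 1 0 1] amplitude ratio: {composite_amplitude_ratio([1, 1, 0, 1]):.4f}")
```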

Multi-Carrier Mixing and Combining

This section describes the architecture of the multi-carrier mixing and combining that is embedded in the interpolation filter chains for the 2x5, 2x10, and 4x5 MHz configurations. In this zero-IF DUC architecture, this operation frequency shifts individual carriers to their relative positions in a given bandwidth centered at 0 MHz. The architecture of the mixing and combining module used in the 4x5 MHz configuration is described as an example.

Figure 17 illustrates the 4-channel mixing and combining module.



Figure 16: PSD after Multi-Carrier Mixing for 4x5 MHz Configuration
[Plot: x-axis: Frequency (MHz), -15 to 15; y-axis: Watts/Hz (dB), -100 to 0.]

The input to the mixing and combining module is four complex signals sampled at 30.72 Msps. The mixer multiplies the four input signals x0(n), x1(n), x2(n), and x3(n) by exp(jω0n), exp(jω1n), exp(jω2n), and exp(jω3n), respectively, and then sums the products to generate a complex composite output signal, where the ωk represent the carrier center frequencies.
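The multiply-and-sum operation can be sketched with real arithmetic on the I and Q components. The lookup-table oscillator below is only one plausible reading of a "rasterized DDS": because 100 kHz / 30.72 MHz = 5/1536, every carrier on the 100 kHz raster repeats exactly every 1536 samples, so a 1536-entry table suffices. The table length, names, and structure are assumptions made for this illustration.

```python
import cmath
import math

FS_HZ = 30.72e6      # sample rate at the mixer input
RASTER_HZ = 100e3    # carrier frequency raster
TABLE_LEN = 1536     # 100 kHz / 30.72 MHz = 5 / 1536

# One period of a unit phasor; entry i holds exp(j*2*pi*i/TABLE_LEN)
PHASOR_TABLE = [cmath.exp(2j * math.pi * i / TABLE_LEN) for i in range(TABLE_LEN)]


def table_step(fc_hz):
    """Table index increment per sample for a carrier on the raster."""
    return round(fc_hz / RASTER_HZ) * 5


def mix_and_combine(channels, centers_hz):
    """Shift each complex channel to its center frequency and sum them."""
    out = []
    for n in range(len(channels[0])):
        y_i = 0.0
        y_q = 0.0
        for x, fc in zip(channels, centers_hz):
            phasor = PHASOR_TABLE[(table_step(fc) * n) % TABLE_LEN]
            c, s = phasor.real, phasor.imag
            y_i += x[n].real * c - x[n].imag * s  # real part of x * exp(jwn)
            y_q += x[n].imag * c + x[n].real * s  # imaginary part
        out.append(complex(y_i, y_q))
    return out


if __name__ == "__main__":
    centers = [-7.5e6, -2.5e6, 2.5e6, 7.5e6]
    channels = [[complex(1, 0)] * 4 for _ in centers]
    print(mix_and_combine(channels, centers))
```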

The real (I channel) and imaginary (Q channel) components of the output signals are given by Equation 2 and Equation 3:

Equation 2:

$$y_I(n) = \sum_{k=0}^{3} \left( x_{I_k}(n)\cos(\omega_k n) - x_{Q_k}(n)\sin(\omega_k n) \right)$$

Equation 3:

$$y_Q(n) = \sum_{k=0}^{3} \left( x_{Q_k}(n)\cos(\omega_k n) + x_{I_k}(n)\sin(\omega_k n) \right)$$

Here $x_{I_k}(n)$ and $x_{Q_k}(n)$ are the real and imaginary parts of the kth input signal.

Similar to the DDC, the rasterized DDS is used to generate the sine and cosine waveforms used in the above equations. The detailed discussion on the rasterized DDS concept can be found in “Multi-Carrier Mixing and Combining.”

Figure 17: 4-Channel Mixing and Combining Module
[Block diagram: four complex channel inputs (Channel 0 to Channel 3) are multiplied by c0 to c3, respectively, and summed into a 4-channel composite complex output.]

Crest Factor Reduction Architecture

The LTE DFE downlink design uses a peak cancellation crest factor reduction (PC-CFR) method to reduce the high PAPR signal at the output of the DUC. This section gives an overview of the PC-CFR algorithm and architecture. For more detailed information on the PC-CFR, refer to [Ref 4].

Algorithm Overview

The peak cancellation method of CFR reduces the peak to average power ratio (PAPR) of a signal by subtracting spectrally shaped pulses from signal peaks that exceed a specified threshold. The cancellation pulses are designed to have a spectrum that matches that of the CFR input signal and, consequently, introduce negligible out-of-band interference. Each cancellation pulse is rotated to match the phase of the corresponding signal peak.

The magnitude of a given cancellation pulse is set equal to the difference between the corresponding signal peak magnitude and the desired clipping threshold. This reduces the signal peak magnitudes to the threshold value while preserving the signal phase.

Figure 18 illustrates the peak cancellation process in the time domain.

Note: The blue curve shows a section of the input signal magnitude before the CFR iteration. The cyan horizontal line overlaid on the plot indicates the clipping threshold. Any peak that exceeds this threshold is a candidate for cancellation. The magenta curve shows the magnitude of the output signal after subtracting the cancellation pulse from the input signal.
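The scaling rule for a single peak can be illustrated as follows. The sketch assumes a cancellation pulse whose center tap is normalized to 1 and handles one peak only; the function name and the short pulse in the example are hypothetical.

```python
import cmath


def cancel_peak(signal, peak_index, threshold, pulse):
    """Subtract one scaled, phase-rotated cancellation pulse from a peak.

    The pulse is assumed to have an odd length with a center tap of 1.0.
    Returns a new list; the input signal is left unchanged.
    """
    peak = signal[peak_index]
    excess = abs(peak) - threshold
    if excess <= 0:
        return list(signal)  # nothing to cancel
    # Complex weight: magnitude difference combined with the peak phase
    weight = excess * cmath.exp(1j * cmath.phase(peak))
    half = len(pulse) // 2
    out = list(signal)
    for k, tap in enumerate(pulse):
        idx = peak_index + k - half
        if 0 <= idx < len(out):
            out[idx] -= weight * tap
    return out


if __name__ == "__main__":
    signal = [0.1 + 0.1j, 0.5j, 3 + 4j, -0.2 + 0j, 0.3 + 0j]
    pulse = [0.25, 0.5, 1.0, 0.5, 0.25]
    result = cancel_peak(signal, peak_index=2, threshold=4.0, pulse=pulse)
    print(abs(signal[2]), "->", abs(result[2]))
```

The peak sample ends up on the threshold with its phase unchanged, while neighboring samples are modified by the pulse skirts.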

Architecture

Figure 19 shows a block diagram of the PC-CFR algorithm. Peaks in the input signal are detected and cancelled to produce a reduced PAPR signal. The peak detect block works on the signal magnitudes to produce a peak location indicator along with magnitude and phase information for each peak. The difference between the peak magnitudes and the clipping threshold is generated by the peak scaling block. The magnitude difference is combined with the phase information to produce the complex weighting that is used to scale the cancellation pulse coefficients. It is this scaling that replaces the more computationally intense convolution that is used in the noise shaping method.

Each cancellation pulse generator (CPG) outputs an unscaled version of the cancellation pulse waveform aligned with a peak location. Each CPG can cancel only one peak at a time. The length of the cancellation pulse combined with the number of CPGs determines the rate at which signal peaks can be cancelled.

The allocator block controls the distribution of CPGs to incoming peaks. When a new peak is detected, the allocator assigns an available CPG to the cancellation of that peak. If all CPGs are busy when a new peak is detected, that peak is not cancelled. Multiple iterations of the algorithm are necessary to eliminate the peaks that were not cancelled during an earlier pass of the algorithm. The final algorithm step is to subtract the summation of the CPG outputs from a delayed version of the input signal.

Figure 18: Time Domain View of Peak Cancellation
[Plot: "Signal Magnitude Before and After CFR Iteration"; x-axis: Time (Samples), 4.45×10^4 to 4.48×10^4; y-axis: Signal Magnitude, 0 to 4×10^4; curves: Input, Output.]

In the 3GPP LTE reference design, two iterations of the PC-CFR algorithm were used:

• For single-carrier configurations, each iteration uses three CPGs to achieve acceptable performance, which results in slightly lower area than the multi-carrier configurations.

• For multi-carrier configurations, six CPGs were used in the first iteration and another three CPGs in the second iteration to eliminate most of the peaks and achieve a satisfactory output PAPR.
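The interaction between the number of CPGs, the pulse length, and missed peaks can be explored with a small model. The sketch below makes a simplifying assumption that is not taken from the design: a CPG is busy for exactly one pulse length starting at the sample where the peak is detected. Names are illustrative.

```python
def allocate_peaks(peak_times, num_cpgs, pulse_length):
    """Assign detected peaks to cancellation pulse generators (CPGs).

    peak_times: sample indices at which peaks are detected.
    Returns (cancelled, missed): cancelled is a list of (time, cpg_index)
    pairs and missed lists the peaks that found every CPG busy.
    """
    free_at = [0] * num_cpgs  # sample index at which each CPG becomes free
    cancelled = []
    missed = []
    for t in sorted(peak_times):
        for cpg in range(num_cpgs):
            if free_at[cpg] <= t:
                free_at[cpg] = t + pulse_length
                cancelled.append((t, cpg))
                break
        else:
            missed.append(t)  # all CPGs busy: left for a later iteration
    return cancelled, missed


if __name__ == "__main__":
    peaks = [0, 10, 20, 30, 300]
    for n in (3, 6):
        cancelled, missed = allocate_peaks(peaks, num_cpgs=n, pulse_length=255)
        print(f"{n} CPGs: cancelled {len(cancelled)}, missed {missed}")
```

In this model, a burst of four closely spaced peaks overruns three CPGs but not six, which is in line with the larger first-iteration CPG count used for multi-carrier signals.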

Designing Cancellation Pulse

The cancellation pulse coefficients can be obtained using any preferred filter design methodology and are computed offline before being written to the PC-CFR. Memory that is external to the design can be used to store multiple sets of cancellation pulse coefficients corresponding to predetermined carrier configurations. Transferring a selected set of coefficients into the PC-CFR memory can be handled with some simple multiplexing circuitry.

For multi-carrier configurations, it is useful to first design a prototype filter that is matched to the spectrum of a single carrier. Frequency shifted replicas of the prototype filter are then placed at each carrier center frequency before being summed to create a composite multiband filter. In the 3GPP LTE reference design, the prototype filter was obtained using the firls function in MATLAB. The order of the filter is fixed at 254 (255 taps).

Table 9 lists the passband and stopband frequency of the prototype filter.

The command prototype_filter_coeff = firls(N, [0 Fpass Fstop Fs/2]/(Fs/2), [1 1 0 0], [Wpass Wstop]) generates the prototype filter coefficients.


Figure 19: Block Diagram of PC-CFR Method (One Iteration)
[Block diagram: High PAPR Signal, Delay, Peak Detect (peak locations, magnitude, phase), Peak Scaling, Allocator, CPG #1 to CPG #N, Sum, Reduced PAPR Signal.]

Table 9: Prototype Filter for Designing Cancellation Pulse Coefficients

| Bandwidth (MHz) | Fs (Msps) | Fpass (MHz) | Fstop (MHz) | Wpass (dB) | Wstop (dB) | Filter Order (N) |
|---|---|---|---|---|---|---|
| 5 | 7.68 | 2 | 2.5 | 1 | 50 | 254 |
| 10 | 15.36 | 4.5 | 5 | 1 | 50 | 254 |





To generate the cancellation pulse coefficients, the prototype filter coefficients are first rotated to the respective carrier frequencies, that is, multiplied by exp(j*2*pi*Fc*k/Fs), where:

• Fc = 0 for single-carrier configurations

• Fc = [-2.5, 2.5] MHz in the 2x5 MHz case

• Fc = [-5, 5] MHz in the 2x10 MHz case

• Fc = [-7.5, -2.5, 2.5, 7.5] MHz in the 4x5 MHz case

• k is an integer tap index running from -(# of taps)/2 to (# of taps)/2.

Then, all the rotated coefficients are summed to obtain the cancellation pulse (CP) coefficients.
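The rotate-and-sum step can be sketched as shown below. The prototype here is a Hamming-windowed sinc used purely as a stand-in for the firls design, and the 122.88 Msps sample rate in the example is an assumed value for illustration; neither is taken from the design files.

```python
import cmath
import math


def stand_in_prototype(num_taps=255, cutoff=0.02):
    """Placeholder lowpass prototype (windowed sinc, cutoff in cycles/sample)."""
    half = num_taps // 2
    taps = []
    for idx in range(num_taps):
        k = idx - half
        if k == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * k) / (math.pi * k)
        window = 0.54 + 0.46 * math.cos(math.pi * k / half)
        taps.append(ideal * window)
    return taps


def composite_pulse(prototype, centers_mhz, fs_msps):
    """Rotate the prototype to each carrier center frequency and sum."""
    half = len(prototype) // 2
    composite = []
    for idx, h in enumerate(prototype):
        k = idx - half  # tap index centered on zero
        rotation = sum(cmath.exp(2j * math.pi * fc * k / fs_msps)
                       for fc in centers_mhz)
        composite.append(h * rotation)
    return composite


if __name__ == "__main__":
    proto = stand_in_prototype()
    pulse = composite_pulse(proto, [-7.5, -2.5, 2.5, 7.5], fs_msps=122.88)
    print(len(pulse), pulse[len(pulse) // 2])
```

For carrier sets that are symmetric about 0 MHz, the rotations pair up into cosines, so the composite coefficients are real.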

System Performance

This section summarizes the performance of the LTE transmit downlink using the PC-CFR method. In all cases, the length of the cancellation pulse is 255, and the cancellation pulse was designed using the least squares filter design method (for example, the firls function in MATLAB) with Fpass = 0.9*BW/2 for 10, 15, and 20 MHz BW, Fpass = 0.8*BW/2 for 5 MHz BW, and Fstop = BW/2. The detailed parameters are described in “Designing Cancellation Pulse.” Results are presented for two iterations of the algorithm, because no improvement was observed using three or more iterations.

Single-Carrier Configuration

Table 10 shows the output PAPR and PAPR reduction (dPAPR) versus EVM performance of the PC-CFR algorithm for single carrier configurations.

• The number of cancellation pulse generators is three for both iterations.

• All PAPR results are referenced at the 0.01% probability of clip point.

• With 8 dB PAPR target, less than 2% EVM can be achieved.

• With 7 dB PAPR target, the EVM is below 4%.

• The EVM is below 7% when the output PAPR is 6 dB.

• ACLR1 and ACLR2 meet the 60 dB requirement.

Compared with other CFR methods, such as Peak Windowing CFR (PW-CFR) and Noise Shaping CFR (NS-CFR), the PC-CFR performance is slightly better than that of the NS-CFR algorithm and significantly better than that of the PW-CFR algorithm.

Figure 20 shows the performance comparison based on 1x10 MHz configuration.

Table 9: Prototype Filter for Designing Cancellation Pulse Coefficients (Cont’d)

| Bandwidth (MHz) | Fs (Msps) | Fpass (MHz) | Fstop (MHz) | Wpass (dB) | Wstop (dB) | Filter Order (N) |
|---|---|---|---|---|---|---|
| 15 | 23.04 | 6.75 | 7.5 | 1 | 50 | 254 |
| 20 | 30.72 | 9 | 10 | 1 | 1 | 254 |

Table 10: PC-CFR Performance for Single-Carrier Configurations

| Configuration | Input PAPR (dB) | Output PAPR (dB) | dPAPR (dB) | EVM (%) | ACLR1/ACLR2 (dB) |
|---|---|---|---|---|---|
| 2x10 | 9.55 | 8 | 1.55 | 1.74 | 74/82 |
| | | 7 | 2.55 | 3.82 | |
| | | 6 | 3.55 | 7.11 | |
| 2x5 | 9.5 | 8 | 1.5 | 1.85 | 68/76 |
| | | 7 | 2.5 | 3.94 | |
| | | 6 | 3.5 | 7.43 | |





The PW-CFR method is described in [Ref 5]. The NS-CFR method is described in [Ref 6]. A link to this document can be found in “References,” page 87.

Two iterations are used for both the NS-CFR and PC-CFR methods, while PW-CFR used a 401 tap window. Given the same EVM, the PC-CFR method can provide the highest PAPR reduction of the three.

The following three figures are based on 1x10 MHz configuration:

• Figure 21 shows a plot of the CCDF for the CFR input and output at the 4% EVM operating point for 1x10 MHz configuration. The curve demonstrates that the PAPR reduction is around 2.72 dB at the 0.01% (1E-4) probability of clip point.

• Figure 22 shows the power spectral density (PSD) estimates of the CFR input and output for the single carrier at the 4% EVM operating point.

• Figure 23 shows the constellation plot for 64-QAM modulation.


Figure 20: Performance Comparison among Several CFR Methods (1x10 MHz)
[Plot: "CFR Performance Comparison for 10 MHz BW"; x-axis: EVM (%), 1 to 10; y-axis: Output PAPR (dB), 5.5 to 8.5; curves: PW-CFR, NS-CFR (2 iterations), PC-CFR (2 iterations).]


Figure 21: CCDF of CFR Input and Output with Two Iterations of PC-CFR (1x10 MHz)
[Plot: "CCDF"; x-axis: PAPR (dB), 0 to 12; y-axis: Clipping Probability, 10^-6 to 10^0; curves: Before Clipping: PAPR (@1E-4) = 9.67 dB, After Clipping: PAPR (@1E-4) = 6.95 dB.]

Figure 22: PSD of CFR Input and Output with Two Iterations of PC-CFR (1x10 MHz)
[Plot: "Power Spectral Density for LTE Data @ IF"; x-axis: Frequency (MHz), -40 to 40; y-axis: Watts/Hz (dB), -120 to 0; curves: before CFR, after CFR, SEM.]

Dual-Carrier Configuration

Table 11 shows the output PAPR and dPAPR versus EVM performance of the PC-CFR algorithm when using two iterations for the dual-carrier configurations (2x5 and 2x10).

The PAPR of the CFR input signal is 9.5 dB and 9.55 dB at the 0.01% probability of clip point for 2x5 and 2x10 BW options, respectively.

The number of cancellation pulse generators is six for the first iteration, and three for the second iteration. Both ACLR1 and ACLR2 meet the 60 dB requirement.

With an 8 dB output PAPR, less than 2% EVM can be achieved. With a 7 dB PAPR target, the EVM is within 4%.

When the output PAPR is 6 dB, the EVM is over 7%, which is higher than in the single-carrier case due to the destructive effect generated by the adjacent carrier.


Figure 23: Constellation Plot for 64 QAM (1x10 MHz) – 4% EVM

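Table 11: PC-CFR Performance for Dual-Carrier Configurations

| Configuration | Input PAPR (dB) | Output PAPR (dB) | dPAPR (dB) | EVM (%) | ACLR1/ACLR2 (dB) |
|---|---|---|---|---|---|
| 2x10 | 9.55 | 8 | 1.55 | 1.74 | 74/82 |
| | | 7 | 2.55 | 3.82 | |
| | | 6 | 3.55 | 7.11 | |
| 2x5 | 9.5 | 8 | 1.5 | 1.85 | 68/76 |
| | | 7 | 2.5 | 3.94 | |
| | | 6 | 3.5 | 7.43 | |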

Figure 24 shows a plot of the CCDF for the CFR input and output when the output PAPR is 7 dB (at the 0.01% clipping probability point) for the 2x10 MHz configuration. The EVM at this operating point is 3.82%.

Figure 25 shows the PSD estimates of the CFR input and output for the 2x10 MHz configuration.


Figure 24: CCDF of CFR Input and Output with Two Iterations of PC-CFR (2x10 MHz)


Figure 25: PSD of CFR Input and Output with Two Iterations of PC-CFR (2x10 MHz)

[Figure 24 plot. Title: CCDF. X-axis: PAPR (dB), 0 to 12. Y-axis: Clipping Probability, 10^-6 to 10^0.]

[Figure 25 plot. X-axis: Frequency (MHz), -60 to 60. Y-axis: Watts/Hz (dB), -120 to 0. Curves: before CFR, after CFR, SEM.]

Four-Carrier Configuration

Table 12 shows the output PAPR and dPAPR versus EVM performance of the PC-CFR algorithm when using two iterations for the 4x5 MHz configuration. Because the reference design offers the option to turn each carrier on and off, this configuration includes possible scenarios such as three active carriers with one non-adjacent carrier ([1 1 0 1] or [1 0 1 1]), two non-adjacent carriers ([1 0 0 1]), and the typical four active carriers ([1 1 1 1]).

Similar to the dual carrier configuration, the number of cancellation pulse generators is six for the first iteration, and three for the second iteration.

In the 4x5 MHz configuration, the output PAPR versus EVM performance is worse than in all other configurations, although ACLR1 and ACLR2 still meet the 60 dB requirement.

With four active carriers, less than 2% EVM can still be achieved at 8 dB output PAPR, and 4% EVM at 7 dB output PAPR. At 6 dB output PAPR, however, the CCDF curve bends outward toward the high-PAPR region at the 1e-5 and 1e-6 clipping probabilities, as shown in Figure 27, instead of falling in a straight line like the red curve in Figure 26. This is non-ideal because it causes a problem for processing in the digital pre-distortion (DPD) block that follows. Consequently, the minimum achievable (stable) output PAPR is approximately 6.5 dB.

Table 12: PC-CFR Performance for 4x5 MHz Configuration

| Configuration | Allocation Spacing | Input PAPR (dB) | Output PAPR (dB) | dPAPR (dB) | EVM (%) | ACLR1/ACLR2 (dB) |
|---|---|---|---|---|---|---|
| Four Active Carriers, for example [1 1 1 1] | None or [10, 5] | 9.7 | 8 | 1.7 | 1.91 | 66/75 |
| | | | 7 | 2.7 | 4.00 | |
| | | | 6.5 | 3.2 | 5.5 | |
| Three Active Carriers with One Non-adjacent Carrier, for example [1 1 0 1] or [1 0 1 1] | None | 9.64 | 8 | 1.64 | 2.45 | 63/76 |
| | | | 7 | 2.64 | 5.10 | |
| | | | 6.7 | 2.94 | 6.42 | |
| | [10, 5] | | 8 | 1.64 | 2.36 | |
| | | | 7 | 2.64 | 4.81 | |
| | | | 6.7 | 2.94 | 6.00 | |
| Two Non-adjacent Carriers, for example [1 0 0 1] | None | 9.6 | 8 | 1.6 | 4.11 | 66/66 |
| | | | 7 | 2.6 | 9.57 | |
| | | | 6.7 | 2.9 | 11.80 | |
| | [10, 5] | | 8 | 1.6 | 2.65 | |
| | | | 7 | 2.6 | 5.67 | |
| | | | 6.7 | 2.9 | 6.88 | |

In any non-adjacent carrier setting, the allocation spacing technique must be used to reduce the EVM for a given output PAPR. For example, in a [1 1 0 1] or [1 0 1 1] case, 5.1% EVM was seen at 7 dB of output PAPR, but with the help of allocation spacing, the EVM can be reduced to 4.81%, an improvement of almost 0.3 percentage points.

The worst-case scenario is the setting with two active carriers and the two middle carriers inactive, such as [1 0 0 1]. In this case, achieving 7 dB output PAPR causes almost 10% EVM. This is much improved with the allocation spacing technique: the EVM at the 7 dB output PAPR operating point drops from 9.57% to 5.67%, an improvement of nearly 4 percentage points. The optimal allocation spacing is [10, 5], meaning that a spacing of 10 is used for the first iteration and 5 for the second iteration.


Figure 26: CCDF of CFR Input and Output with [1 1 1 1] Carrier Configuration, 4% EVM


Figure 27: Non-ideal CCDF Curve with [1 1 1 1] Carrier Configuration

[Figure 26 plot. Title: CCDF. X-axis: PAPR (dB), 0 to 12. Y-axis: Clipping Probability, 10^-6 to 10^0.]

[Figure 27 plot. Title: CCDF. X-axis: PAPR (dB), 0 to 12. Y-axis: Clipping Probability, 10^-6 to 10^0.]

Figure 28 illustrates the input and output PSD for [1 0 0 1] configuration at a fixed 4% of EVM.

Implementation

The transmit downlink was implemented using Xilinx System Generator version 10.1.2. The top-level GUI lets you select from one of seven carrier configurations as shown in Figure 29. The DUC and CFR module settings under the mask change accordingly based on different carrier configurations. The Power Meter block is an optional module and can be turned on and off from the top-level GUI.


Figure 28: PSD of CFR Input and Output with [1 0 0 1] Carrier Config, 4% EVM


Figure 29: System Generator Top-Level GUI of Transmit Downlink Design

[Figure 28 plot. X-axis: Frequency (MHz), -40 to 40. Y-axis: Watts/Hz (dB), -120 to 0. Curves: before CFR, after CFR, SEM.]

Under the mask is the block diagram of transmit downlink design. Figure 30 shows the diagram of the 4x5 MHz configuration as an example. There are three major blocks in the design: a DUC configuration subsystem, a rate and data format conversion block, and a PC-CFR module (pc_cfr_3x), which are described in the following subsections.

DUC Implementation

The DUC was implemented with a heavy reliance on the FIR Compiler v4.0. The top-level DUC architecture uses a configurable subsystem to select from one of seven unique architectures that are stored in a Simulink library. See the Xilinx reference, System Generator for DSP User Guide, Release 10.1.2 [Ref 12], for a description of using configurable subsystems in System Generator designs.

Because the carrier and BW configuration are determined by the choice in the top-level GUI as shown in Figure 29, the Block Choice menu item changes automatically and the user does not need to (and is not able to) make a selection on this level.


Figure 30: System Generator Block Diagram of Transmit Downlink for 4x5 MHz Configuration

[Figure 30 block diagram. Blocks: DUC Configurable Subsystem (DUC_4x5) with inputs din, vin, reset, gain, gain_we, gain_ch and outputs dout_i, dout_q, vout, power, rst_out; rate and data format conversion with inputs din_i, din_q, vin, rst_1 and outputs dout_i, dout_q, vout, rst_3; pc_cfr_3x with inputs data_i_in, data_q_in, data_sync, threshold, alloc_spacing1, alloc_spacing2, filter_numtaps, filter_ram_addr, filter_ram_data, filter_ram_we, reset and outputs data_i_out, data_q_out, data_valid, ce_3_out. Top-level outputs: duc_i, duc_q, duc_valid, cfr_i, cfr_q, cfr_valid, power, ce_3_out.]

Figure 31 shows the DUC configuration subsystem.

Figure 32 shows another GUI that allows you to add or remove the gain block on the DUC configuration subsystem level if you choose any of the single carrier configurations (1x5, 1x10, 1x15 or 1x20 MHz) on the top-level. The default setting is off.

For multi-carrier cases, the gain block always exists and cannot be removed. This is because the gain block also can be used to turn on and off certain carriers.

The seven architectures are based on the structures described in the “Digital Up Converter Architecture,” page 11. The following diagrams provide block diagrams of two configurations:

• Figure 33, page 32 shows the System Generator block diagram for the 1x15 MHz configuration.

• Figure 34, page 32 shows the System Generator block diagram for the 4x5 MHz configuration.

In the single-carrier configurations, the data flow from one module to the next is handled in a Time Division Multiplexing (TDM) fashion except for the last module. The valid in (vin) and valid out (vout) signals are paired with the TDM data input (din) and TDM data output (dout), respectively, to indicate signal availability. Because the output of the DUC feeds into the CFR module, the last module uses time division demultiplexing logic to output the I and Q data separately.

There is a higher level of design complexity in the multi-carrier configurations due to the existence of the mixer. For all the modules before the mixer, the data input and output stream is handled in a TDM manner. However, the efficient mixer design and the final few FIR compilers sometimes generate non-uniform output data, so some data scheduling and FIFO buffering are required in these special situations.

Figure 31: System Generator DUC Configuration Subsystem


Figure 32: GUI for Single Carrier DUC Configuration


Figure 33: System Generator Block Diagram of DUC for 1x15 MHz Configuration


Figure 34: System Generator Block Diagram of DUC for 4x5 MHz Configuration

[Figure 33 block diagram (1x15 MHz). Blocks: input reg, Gain Stage SC (no_gain_stage), h_channel, h1_2ch, h2_2ch, h_frac_2ch, power_meter. Inputs: din, vin, reset, gain, gain_we, gain_ch. Outputs: dout_i, dout_q, vout, power, rst_out.]

[Figure 34 block diagram (4x5 MHz). Blocks: input reg, gain_ctrl, h_channel, h1_8ch, h2_8ch, mixer, h1_2ch, h2_2ch, power_meter. Inputs: din, vin, reset, gain, gain_we, gain_ch. Outputs: dout_i, dout_q, vout, power, rst_out.]

Channel Filtering and Interpolation

This section summarizes the FIR Compiler v4.0 settings for each filter used in each of the seven configurations in the transmit downlink.

• In all cases, the number of coefficient sets is equal to one and the reloadable coefficient option is disabled.

• All filter architectures are based on the systolic multiply accumulate structure with 18-bit signed coefficients.

• The setting for the coefficient structure is generally set as Inferred except in the 2x10 and 4x5 MHz cases, where it is set to Non-Symmetric for better resource balancing and speed advantage.

• The optimization goal is always left at the default setting of Area.

• The data buffer type and coefficient buffer type are left at the default setting of Automatic.

• All filters in the design use the rst and nd control options.

Table 13 through Table 19 list the FIR Compiler settings for the 1x5, 1x10, 1x15, 1x20, 2x5, 2x10, and 4x5 MHz configurations.
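The Effective Input Sample Period values in these tables appear to follow from the clock rate, the input sample rate of each filter, and the number of TDM channels. The sketch below is an illustration under that assumption: the relationship (clock rate divided by sample rate times channel count) is inferred from the listed values, and the 7.68, 15.36, 23.04, and 30.72 Msps input rates are the usual LTE baseband rates for 5, 10, 15, and 20 MHz carriers. The function name is hypothetical.

```python
from fractions import Fraction

CLOCK_MHZ = "368.64"  # FPGA clock rate used throughout the design


def effective_input_sample_period(sample_rate_msps, num_channels,
                                  clock_mhz=CLOCK_MHZ):
    """Clock cycles available per input sample on a TDM bus (assumed formula).

    Rates are passed as decimal strings so Fraction keeps them exact.
    """
    period = Fraction(clock_mhz) / (Fraction(sample_rate_msps) * num_channels)
    return period


# Channel filter inputs: two TDM channels (I and Q) for one carrier.
for rate in ("7.68", "15.36", "23.04", "30.72"):
    print(rate, "Msps ->", effective_input_sample_period(rate, 2), "clocks")
```

Under this assumption the single-carrier channel filters see 24, 12, 8, and 6 clocks per input sample, which matches the first column of the single-carrier tables.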

Table 13: FIR Compiler Settings for 1x5 MHz Configuration

| Parameter | Channel Filter | 1st Halfband | 2nd Halfband | 3rd Halfband | 4th Halfband |
|---|---|---|---|---|---|
| Coefficients | h_chan | h1_duc | h2_duc | h3_duc | h3_duc |
| Filter Type | Single_Rate | Interpolation | Interpolation | Interpolation | Interpolation |
| Rate Change Type | Integer | Integer | Integer | Integer | Integer |
| Interpolation Rate Value | 1 | 2 | 2 | 2 | 2 |
| Number of Channels | 2 | 2 | 2 | 2 | 1 |
| Effective Input Sample Period | 24 | 24 | 12 | 6 | 6 |
| Coefficient Structure | Inferred | Inferred | Inferred | Inferred | Inferred |
| Number of Paths | 1 | 1 | 1 | 1 | 2 |
| Output Width | 18 | 17 | 17 | 17 | 17 |
| Optimization Goal | Area | Area | Area | Area | Area |


Table 14: FIR Compiler Settings for 1x10 MHz Configuration

| Parameter | Channel Filter | 1st Halfband | 2nd Halfband | 3rd Halfband |
|---|---|---|---|---|
| Coefficients | h_chan | h1_duc | h2_duc | h3_duc |
| Filter Type | Single_Rate | Interpolation | Interpolation | Interpolation |
| Rate Change Type | Integer | Integer | Integer | Integer |
| Interpolation Rate Value | 1 | 2 | 2 | 2 |
| Number of Channels | 2 | 2 | 2 | 1 |
| Effective Input Sample Period | 12 | 12 | 6 | 6 |
| Coefficient Structure | Inferred | Inferred | Inferred | Inferred |
| Number of Paths | 1 | 1 | 1 | 2 |
| Output Width | 18 | 17 | 17 | 17 |
| Optimization Goal | Area | Area | Area | Area |

Table 15: FIR Compiler Settings for 1x15 MHz Configuration

| Parameter | Channel Filter | 1st Halfband | 2nd Halfband | Resampler |
|---|---|---|---|---|
| Coefficients | h_chan | h1_duc | h2_duc | h_frac |
| Filter Type | Single_Rate | Interpolation | Interpolation | Fixed_Fractional |
| Interpolation Rate Value | 1 | 2 | 2 | 4 |
| Decimation Rate Value | 1 | 1 | 1 | 3 |
| Effective Input Sample Period | 8 | 8 | 4 | 4 |
| Coefficient Structure | Inferred | Inferred | Inferred | Inferred |
| Number of Paths | 1 | 1 | 1 | 2 |

Table 16: FIR Compiler Settings for 1x20 MHz Configuration

| Parameter | Channel Filter | 1st Halfband | 2nd Halfband |
|---|---|---|---|
| Coefficients | h_chan | h1_duc | h2_duc |
| Rate Change Type | Integer | Integer | Integer |
| Interpolation Rate Value | 1 | 2 | 2 |
| Number of Channels | 2 | 2 | 1 |
| Effective Input Sample Period | 6 | 6 | 6 |
| Coefficient Structure | Inferred | Inferred | Inferred |
| Number of Paths | 1 | 1 | 2 |
| Output Width | 18 | 17 | 17 |
| Optimization Goal | Area | Area | Area |


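Table 17: FIR Compiler Settings for 2x5 MHz Configuration

| Parameter | Channel Filter | 1st Halfband | 2nd Halfband | 3rd Halfband | 4th Halfband |
|---|---|---|---|---|---|
| Filter Type | Single_Rate | Interpolation | Interpolation | Interpolation | Interpolation |
| Rate Change Type | Integer | Integer | Integer | Integer | Integer |
| Interpolation Rate Value | 1 | 2 | 2 | 2 | 2 |
| Number of Channels | 4 | 4 | 2 | 2 | 2 |
| Effective Input Sample Period | 12 | 12 | 12 | 6 | 3 |
| Coefficient Structure | Inferred | Inferred | Inferred | Inferred | Inferred |
| Optimization Goal | Area | Area | Area | Area | Area |

| Parameter | Channel Filter | 1st Halfband | 2nd Halfband | 3rd Halfband | 4th Halfband |
|---|---|---|---|---|---|
| Filter Type | | | | Interpolation | Interpolation |
| Rate Change Type | Integer | Integer | Integer | Integer | Integer |
| Interpolation Rate Value | 1 | 2 | 2 | 2 | 2 |
| Number of Channels | 4 | 4 | 2 | 2 | 2 |
| Effective Input Sample Period | 12 | 12 | 12 | 6 | 3 |
| Coefficient Structure | Inferred | Inferred | Inferred | Inferred | Inferred |
| Optimization Goal | Area | Area | Area | Area | Area |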

Multi-Carrier Mixing and Combining

Four-Carrier Mixer

Figure 35 shows the System Generator block diagram of the four-carrier mixer and combiner.

• The input data sampling rate is 30.72 Msps, and eight data channels (I and Q from four carriers) come in from the last FIR compiler.

• With a 368.64 MHz FPGA clock and the input data throughput of 30.72×8 = 245.76 Msps, data enters the mixer module in a bursty mode.

• 8 valid TDM samples come in during 12 clock cycles.

• The data is stored in a FIFO and is read out and mixed with the cosine and sine waveform generated from the raster_dds_4ch block.

• It requires two DSP48Es for this mixer and combining operation, one for the in-phase output and another for the quadrature output.

• Because there are still four empty cycles left, one cycle is used for the symmetric round operation by the same DSP48E and no extra logic is required.

• The final TDM block assembles the resultant I and Q channel data in a TDM format to feed into the next FIR compiler.
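The cycle budget described in the list above can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `tdm_cycle_budget` is illustrative, and the rates are passed as decimal strings so that the division is exact.

```python
from fractions import Fraction


def tdm_cycle_budget(clock_mhz, sample_rate_msps, num_channels):
    """Return (clocks per sample period, busy cycles, idle cycles).

    One TDM sample per channel arrives in every sample period, so the
    number of busy cycles equals the number of channels.
    """
    clocks = Fraction(clock_mhz) / Fraction(sample_rate_msps)
    if clocks.denominator != 1:
        raise ValueError("clock rate is not an integer multiple of the sample rate")
    clocks = int(clocks)
    if num_channels > clocks:
        raise ValueError("more channels than clock cycles per sample period")
    return clocks, num_channels, clocks - num_channels


clocks, busy, idle = tdm_cycle_budget("368.64", "30.72", 8)
print(f"{busy} samples in {clocks} clocks, {idle} idle cycles")
```

For the four-carrier mixer this gives 8 valid samples in every 12 clock cycles with 4 idle cycles, one of which the design uses for the symmetric rounding operation.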


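Table 19: FIR Compiler Settings for 4x5 MHz Configuration

| Parameter | Channel Filter | 1st Halfband | 2nd Halfband | 3rd Halfband | 4th Halfband |
|---|---|---|---|---|---|
| Filter Type | Single_Rate | Interpolation | Interpolation | Interpolation | Interpolation |
| Rate Change Type | Integer | Integer | Integer | Integer | Integer |
| Interpolation Rate Value | 1 | 2 | 2 | 2 | 2 |
| Number of Channels | 8 | 8 | 8 | 2 | 1 |
| Effective Input Sample Period | 6 | 6 | 3 | 6 | 6 |
| Coefficient Structure | Non-Symmetric | Inferred | Inferred | Inferred | Inferred |
| Optimization Goal | Area | Area | Area | Area | Area |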

The top-level System Generator block diagram of the four-carrier DDS is shown in Figure 36.

• The center frequencies for four carriers are [-7.5, -2.5, 2.5, 7.5] MHz and sampling frequency (Fs) is 30.72 MHz.

• Because gcd(2.5M, 30.72M) = 20k, a raster of 20 kHz is required and the total number of elements stored to form a whole sinusoidal wave is Fs/20k = 1536.

• Because 1536 is a multiple of 4, a quarter cosine wave is the minimum needed to generate the remaining samples on the unit circle using trigonometric relationships. However, because the minimum granularity of the block memory is 1024x18, a half cosine wave can be stored in a single block memory with 18-bit output precision. This also results in less logic usage to generate the full waveform.

• The sine wave can also be generated from mapping and inverting the corresponding cosine wave samples.

• The dual port RAM in Figure 36 was initialized by the vector cos_lut1, which is generated from the following script (which is located in the mask of the DUC_4x5 module in the duc_lib.mdl file):

% gcd(2.5M, 30.72M) = 20k - need raster of 20 kHz = 0.02 MHz
f = [-7.5, -2.5, 2.5, 7.5];
Fs = 30.72;
num_el = Fs/20*10^3; % Total number of element to form a whole sinu wave
n = [0: num_el-1];
offset = 0.01; % offset by 10 kHz to avoid +/-1
cos_lut = cos(2*pi*(offset+n*0.02)/Fs);
step = mod(f/20*10^3, num_el);
% Only need to store quarter wave,
% but since each BRAM is 1024x18, we store half wave (w/o extra resource)
% cos(theta) = cos(-theta), sin(theta) = -sin(-theta)
% if between 768~1535 -> mapped to 767~0 accordingly
n1 = [0: num_el/2-1];
cos_lut1 = cos(2*pi*(offset+n1*0.02)/Fs);

Note: Another applied technique is that +1 and -1 are avoided in this Look-Up Table (LUT). This is done by offsetting the starting sampling point by half of the raster; for example, 10 kHz. This way, samples 768 to 1535 are symmetric to samples 767 to 0 in absolute magnitude, and a full 18 bit precision can be utilized (Fix 18.17 format instead of Fix 18.16 format which dedicates one bit to represent +1).
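The half-wave storage and the half-raster offset can be checked numerically. The sketch below mirrors the LUT script in Python using integer kHz values; the helper names (`build_cos_lut`, `cos_from_half`, `sin_from_half`) are illustrative. The sine lookup shown here uses a quarter-wave address shift, which is one mathematically valid mapping and is an assumption; the actual sin_addr_map block in the design may map and invert samples differently.

```python
import math

RASTER_KHZ = 20
FS_KHZ = 30720
NUM_EL = FS_KHZ // RASTER_KHZ  # elements in one full sinusoid period


def build_cos_lut(num_el):
    """Full-wave cosine table sampled at a half-raster offset (n + 0.5)."""
    return [math.cos(2 * math.pi * (n + 0.5) / num_el) for n in range(num_el)]


def cos_from_half(half_lut, addr, num_el):
    """Read a full-wave cosine sample from the stored first half.

    Addresses in the upper half mirror onto the lower half.
    """
    if addr >= num_el // 2:
        addr = num_el - 1 - addr
    return half_lut[addr]


def sin_from_half(half_lut, addr, num_el):
    """sin(theta) = cos(theta - pi/2): shift the address back a quarter wave."""
    return cos_from_half(half_lut, (addr - num_el // 4) % num_el, num_el)


full = build_cos_lut(NUM_EL)
half = full[: NUM_EL // 2]
# Step sizes for the carrier centers at -7.5, -2.5, 2.5, and 7.5 MHz.
steps = [(f_khz // RASTER_KHZ) % NUM_EL for f_khz in (-7500, -2500, 2500, 7500)]
print(NUM_EL, steps, max(abs(v) for v in full) < 1.0)
```

Because of the half-raster offset, no table entry reaches +1 or -1, which is what allows the Fix 18.17 format described in the note.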


Figure 35: System Generator Block Diagram of Mixer & Combiner for 4x5 MHz Configuration

[Figure 35 block diagram. Blocks: reset_gen (reset, reset_24); Counter (count from 0~11); Data FIFO (din, we, re, rst, dout, empty, %full, full), read when x<8 & ~empty; raster_dds_4ch (inputs vin, cnt, reset; outputs cos_sin, sin_cos, vout_early); MAC_I and MAC_Q (data, coeff, op_addr, rst; outputs dout_i, dout_q, vout); TDM (inputs din_i, din_q, vin, cnt, rst; outputs dout, vout); Registers 1 to 4. Inputs: din, vin, reset. Outputs: dout, vout.]

Most of the other logic in the rasterized DDS, such as the step_terminal_count block (shown in Figure 37), generates the baseline address used for the sine and cosine LUT mapping.

The step size (generated from the fixed frequencies [-7.5, -2.5, 2.5, 7.5]) is stored in the ROM, and an accumulating process is performed based on the step size. When the count exceeds 1536 (that is, 30.72M/20k), the output is wrapped back to the start, which is equivalent to a modulo (count, 1536) operation.
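The accumulate-and-wrap behavior can be sketched as a simple phase accumulator. This is an illustration only: the generator name is hypothetical, the wrap test is written as `>=` so that the result equals a true modulo, and the per-channel interleaving and pipeline latency of the hardware block are not modeled.

```python
def step_terminal_count(step, num_el, count):
    """Yield `count` successive base addresses of a wrapping phase accumulator.

    step: per-sample address increment read from the step ROM (0 <= step < num_el).
    num_el: number of raster points in one sinusoid period.
    """
    addr = 0
    for _ in range(count):
        yield addr
        addr += step
        if addr >= num_el:
            # A single subtraction is enough because step < num_el.
            addr -= num_el


# Carrier at -7.5 MHz on a 20 kHz raster: step = mod(-375, 1536) = 1161.
print(list(step_terminal_count(1161, 1536, 4)))
```

Since the accumulator is a modulo-1536 counter, every carrier returns to address 0 after at most 1536 samples, so the generated waveform is exactly periodic with no phase truncation error.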

Two-Carrier Mixer

Figure 38 shows the System Generator block diagram of the dual carrier mixer and combiner. The implementations for the 2x5 and 2x10 MHz configurations are similar, so only the 2x5 MHz implementation is explained.

For the 2x5 MHz configuration, the input data sampling rate is 15.36 Msps and there are 4-channel data (I and Q from two carriers) coming in from the last FIR compiler. With a 368.64 MHz FPGA clock, every six clocks an input data sample enters the mixer module.


Figure 36: System Generator Block Diagram of Rasterized DDS (Half Wave Storage)

[Figure 36 block diagram. Blocks: step_terminal_count (inputs reset, cnt, en; outputs base_addr, valid), cos_addr_map, sin_addr_map, Dual Port RAM (half wave storage, mapped from 0~2pi to 0~pi), Mux 1, Mux 5, Counter. Inputs: vin, cnt, reset. Outputs: cos_sin, sin_cos, vout_early. Annotations: create vout early signal to trigger the FIFO read operation (sync up the timing between cos/sin and data); step terminal counter to calculate corresponding phase for each channel (between 0~2pi).]

Figure 37: System Generator Diagram of Step Terminal Count Block (4x5 MHz Configuration)

[Figure 37 block diagram. Blocks: ROM, AddSub 1 (a + b), data_wrap, registers and delays. Inputs: reset, cnt, en. Outputs: base_addr, valid. Annotations: enable once every 2 clks; latency for this module = 4; this loop takes 12 clks overall; pick the right step size for each channel, each step constant stays for 2 clks; once the count is over 1536 (= 30.72M/20k), wrap the counter result back to the start.]

Four valid TDM data inputs are presented in 24 clock cycles and the data pattern is uniform. The sine and cosine waveforms are generated and TDM’ed from the raster_dds_2ch block to mix with the input samples. Because it takes four operations/clock cycles to mix and combine data from each carrier for I or Q channel, only a single Multiply-Accumulate (MAC) is required for 24 clock cycles total.

No extra logic is needed for symmetric rounding. The output from the mixing and combining module is in the TDM format.
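The operation count behind the single-MAC claim can be illustrated with a floating-point model of the mix-and-combine step. This is a sketch under stated assumptions: it models a complex multiply of each carrier by its DDS sample followed by a sum over carriers, and it counts real multiply-accumulate operations; fixed-point scaling, rounding, and the actual data scheduling of the design are not represented. The function name is illustrative.

```python
import math


def mix_and_combine(carriers, phases):
    """Mix each (i, q) carrier sample with its DDS phase and sum the results.

    Returns (i_out, q_out, mac_ops), where mac_ops counts the real
    multiply-accumulate operations a single shared MAC would perform.
    """
    acc_i = 0.0
    acc_q = 0.0
    mac_ops = 0
    # Pass 1: in-phase output, two MAC operations per carrier.
    for (i, q), phase in zip(carriers, phases):
        acc_i += i * math.cos(phase)
        acc_i -= q * math.sin(phase)
        mac_ops += 2
    # Pass 2: quadrature output, reusing the same data and coefficients.
    for (i, q), phase in zip(carriers, phases):
        acc_q += i * math.sin(phase)
        acc_q += q * math.cos(phase)
        mac_ops += 2
    return acc_i, acc_q, mac_ops


i_out, q_out, ops = mix_and_combine([(0.5, -0.25), (0.1, 0.3)], [0.3, -1.1])
print(ops, "MAC operations per output sample pair")
```

With two carriers, four operations produce the I output and four produce the Q output, so the eight operations fit comfortably within the 24 clock cycles available per sample period.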

Figure 39 shows the top-level System Generator block diagram of the 2x5 MHz DDS.

The center frequencies for the two carriers are [-2.5, 2.5] MHz and the sampling frequency (Fs) is 15.36 MHz. Because gcd(2.5M, 15.36M) = 20k, a raster of 20 kHz is required and the total number of elements stored to form a whole sinusoidal wave is Fs/20k = 768. In this case, a full cosine wave can be stored in a single block memory with 18-bit output precision.

The sine wave is generated by mapping and inverting the corresponding cosine wave samples. A technique similar to the one used in the 4x5 MHz configuration is applied to avoid storing +1 and -1 in the LUT. This is done by offsetting the starting sampling point by half of the raster, that is, 10 kHz. As a result, the full 18-bit precision can be used instead of dedicating one bit to represent +1.

The dual port RAM in Figure 39 was initialized by the vector cos_lut, which is generated from the following script (the script can be found in the mask of the DUC_2x5 module in the duc_lib.mdl file).

% gcd(2.5M, 15.36M) = 20k - need raster of 20 kHz = 0.02 MHz
f = [-2.5, 2.5];
Fs = 15.36;
num_el = Fs/20*10^3; % Total number of element to form a whole sinu wave
% store full wave since it consists less than 1024 elements
n = [0: num_el-1];
offset = 0.01; % offset by 10 kHz to avoid +/-1
cos_lut = cos(2*pi*(offset+n*0.02)/Fs);
step = mod(f/20*10^3, num_el);


Figure 38: System Generator Block Diagram of Mixer & Combiner for 2x5 MHz Configuration

[Figure 38 block diagram. Blocks: reset_gen (reset, reset_24); raster_dds_2ch (inputs cnt, vin, reset; outputs cos_sin, sin_cos, vout); addr (inputs valid, ch_index, rst; outputs addr, enable, mux_sel, op_cnt); Data RAM; Mux1, Mux5; MAC (inputs data, coeff, op_addr, reset; outputs dout, vout); Delays 1 to 6. Inputs: din, vin, chin, reset. Outputs: dout, vout. Annotation: data scheduling - ensure data and coeff are repeated twice in 24 clks, since real and imag are calculated using the same MAC.]

Figure 40 shows the System Generator diagram of the step_terminal_count block for the 2x5 MHz configuration, which is very similar to the one described in the “Four-Carrier Mixer” section. It is used to generate the baseline address for the sine and cosine LUT mapping.

In the 2x5 MHz case, the step size generated from the fixed frequencies [-2.5, 2.5] is stored in the ROM, and an accumulating process is performed based on the step size. Whenever the count exceeds 768 (that is, 15.36M/20k), the output is wrapped back to the start. This is equivalent to a modulo (count, 768) operation. The 2x10 MHz case is identical to the four carrier configuration with the exception of time delay elements.


Figure 39: System Generator Block Diagram of Rasterized DDS (Full Wave Storage)


Figure 40: System Generator Diagram of Step Terminal Count Block (2x5 MHz Configuration)

[Figure 39 block diagram. Blocks: step_terminal_count (inputs reset, cnt, en; outputs base_addr, valid), sin_addr_map, Dual Port RAM (full wave storage), Mux 1, Mux 5, Counter, Delay 1. Inputs: cnt, vin, reset. Outputs: cos_sin, sin_cos, vout. Annotation: step terminal counter to calculate corresponding phase for each channel (between 0~2pi).]

[Figure 40 block diagram. Blocks: ROM, AddSub 1 (a + b), data_wrap, registers and delays. Inputs: reset, cnt, en. Outputs: base_addr, valid. Annotations: latency for this module = 4; this loop takes 24 clks overall; enable once every 2 clks; pick the right step size for each channel, each step constant stays for 2 clks; once the count is over 768 (= 15.36M/20k), wrap the counter result back to the start.]

Rate and Data Format Conversion

For rate and data format conversion:

• Ports of the DUC module operate on the 368.64 MHz system clock rate domain.

• Ports of the CFR module (except for ce_3_out) operate on the 122.88 Msps sample rate.

• The DUC output data format is Fix 16.15 (or Fix 16.14), which is different from the Fix 16.0 format that the CFR module requires.

Figure 41 shows how a rate and data format conversion block is used to integrate these two modules.
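The two conversions can be modeled in a few lines. This is a behavioral sketch, assuming that the rate conversion keeps one of every three clock-rate samples (368.64 MHz / 3 = 122.88 Msps) and that the format conversion is a pure reinterpretation of the same 16 bits with the binary point moved; the function names and the `phase` parameter are illustrative and do not correspond to block names in the design.

```python
def reinterpret_fix16(value, frac_bits_in):
    """Return the raw 16-bit two's-complement integer behind a fixed-point value.

    value: real number represented in Fix 16.<frac_bits_in>.
    The returned integer is the same bit pattern viewed as Fix 16.0.
    """
    raw = int(round(value * (1 << frac_bits_in)))
    if not -32768 <= raw <= 32767:
        raise ValueError("value does not fit in a signed 16-bit word")
    return raw


def downsample_by_3(clock_rate_samples, phase=0):
    """Keep one of every three clock-rate samples (clock domain to sample domain)."""
    return clock_rate_samples[phase::3]


print(reinterpret_fix16(0.5, 15))         # Fix 16.15 value 0.5 viewed as Fix 16.0
print(downsample_by_3(list(range(9))))    # indices kept at the sample rate
```

Reinterpretation changes no bits, so the conversion costs no logic; only the scaling assumed by the downstream CFR threshold changes.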

CFR Implementation

The PC-CFR module used in this reference design is a slightly modified version of what is described in [Ref 4]. In [Ref 4], the system clock rate must be set to four times the sample rate. For example, for a sample rate of 76.8 Msps, the clock rate is 307.2 MHz.

In this reference design, the system clock rate is 368.64 MHz, three times the sample rate of 122.88 Msps. Some modification is required to generate the three clocks per sample PC-CFR module. Because much of the original design operates at the sample rate, the modifications are needed for only a few sections of the design.

There is a GUI to select the PC-CFR from one of two settings:

• The single-carrier setting which uses three CPGs for both iterations.

• The multi-carrier setting which uses six CPGs for the first iteration and three CPGs for the second iteration.

The carrier and bandwidth selection on the top-level (Figure 29) dictates the architecture used in the PC-CFR. Figure 42 shows the CFR module.


Figure 41: System Generator Block Diagram of Rate and Data Format Conversion Block

[Figure 41 block diagram. Blocks: Reset Generator (rst, rst3, rdy), Register, Register1, Register2, three down-sample-by-3 blocks, two reinterpret blocks. Inputs: din_i, din_q, vin, rst_1. Outputs: dout_i, dout_q, vout, rst_3.]

Figure 43 shows the System Generator block diagram of the PC-CFR design, which contains two iterations of the PC-CFR algorithm. For further implementation details, refer to [Ref 4]. A link to the document is available in “References,” page 87. For single-carrier configurations (1x5, 1x10, 1x15, and 1x20), the two iterations are identical. The input and output data streams are quantized to 16 bits and are represented in two’s complement number format.

All interface signals operate at the sample rate (122.88 Msps) with the exception of the ce_3_out signal, which operates at the system clock rate. For multi-carrier configurations (2x5, 2x10, and 4x5), as described in “Crest Factor Reduction Architecture,” six CPGs are required in the first iteration. The major difference compared to the single-carrier case is in the cancellation pulses (c_pulses) module, where extra logic is required to process the pulse cancellation in parallel.

Table 20 lists the resource utilization for the 3 CPG and 6 CPG versions of the PC-CFR design.

Figure 44 and Figure 45 show the System Generator diagrams of the cancellation pulses (c_pulses) module for the 3 CPG and 6 CPG versions, respectively.


Figure 42: System Generator GUI of PC-CFR Module

Table 20: Resource Utilization Summary for PC-CFR Design

| Configuration | DSP48s | RAMB36k | RAMB18k | FFs | LUTs | Slices |
|---|---|---|---|---|---|---|
| Single carrier setting (3 CPGs for each iteration) | 10 | 6 | 2 | 1944 | 1832 | 824 |
| Multi-carrier setting (6 CPGs for 1st iteration; 3 CPGs for 2nd iteration) | 14 | 7 | 2 | 2328 | 2132 | 967 |





Figure 43: System Generator Block Diagram of PC-CFR Design


Figure 44: System Generator Diagram of Cancellation Pulses (c_pulses) Module, 3 CPGs version

[Figure 43 block diagram. Blocks: PC_CFR Iteration 1 (pc_cfr_6CPGs), PC_CFR Iteration 2 (pc_cfr_3CPGs), CEProbe. Each iteration has inputs data_i_in, data_q_in, data_sync, threshold, alloc_spacing, filter_numtaps, filter_ram_addr, filter_ram_data, filter_ram_we, reset and outputs data_i_out, data_q_out, data_valid. Top-level inputs: data_i_in, data_q_in, data_sync, threshold, alloc_spacing1, alloc_spacing2, filter_numtaps, filter_ram_addr, filter_ram_data, filter_ram_we, reset. Top-level outputs: data_i_out, data_q_out, data_valid, ce_3_out.]

[Figure 44 block diagram. Blocks: cpg_allocator (inputs peak_scale_i, peak_scale_q, peak_indicator, numtaps, alloc_spacing; outputs cpg_0..2_addr, cpg_0..2_peak_i, cpg_0..2_peak_q), cpg_multiplexing (outputs cpg_addr, cpg_peak_i, cpg_peak_q, cpg_sync), filter_ram (wr_addr, wr_data, wr_en, rd_addr; outputs coef_re, coef_im), cmplx_mac (inputs ar, ai, br, bi, sync; outputs sum_re, sum_im). Inputs: peak_scale_i, peak_scale_q, peak_indicator, alloc_spacing, filter_numtaps, filter_ram_addr, filter_ram_data, filter_ram_we. Outputs: pulse_sum_i, pulse_sum_q.]


Receive Uplink Design & Implementation




Performance Requirements

The following subsections describe the performance requirements for the Receive Uplink design and implementation.

Reference Sensitivity Level

Section 7.2 of the 3GPP TR 36.804 (and TS 36.104) [Ref 2] describes the reference sensitivity level requirements for the E-UTRA BS. The reference sensitivity level specifies the minimum mean power received at the antenna connector to meet some performance criteria.

For the LTE uplink, the reference sensitivity level is defined over a bandwidth corresponding to 25 resource blocks, where each resource block occupies 12 subcarriers that are spaced 15 kHz apart. Therefore, the measurement bandwidth for the reference sensitivity level is 25×12×15 kHz = 4.5 MHz.

Table 21 lists the reference sensitivity level for the E-UTRA BS. The numbers in the table are based on a “receiver noise figure of 5 dB, signal-to-noise ratio (SNR) operating point equal to 95% relative of nominal throughput with 2 dB implementation margin and 90% bandwidth efficiency” from [Ref 2].

The FRC A1-3 reference channel is based on 25 resource blocks of QPSK, Rate-1/3 turbo coded data with 12 DFT-OFDM symbols per subframe. For bandwidths greater than 5 MHz, multiple chunks of the 25 resource block FRC A1-3 are applied [Ref 1].


Figure 45: System Generator Diagram of Cancellation Pulses (c_pulses) Module, 6 CPGs version

[Figure 45 block diagram. Blocks: cpg_allocator (inputs peak_scale_i, peak_scale_q, peak_indicator, numtaps, alloc_spacing; outputs cpg_0..5_addr, cpg_0..5_peak_i, cpg_0..5_peak_q), cpg_multiplexing (outputs cpg_addr_012, cpg_addr_345, cpg_peak_i012, cpg_peak_i345, cpg_peak_q012, cpg_peak_q345, cpg_sync), filter_ram (wr_addr, wr_data, wr_en, rd_addr0, rd_addr1; outputs coef_re0, coef_im0, coef_re1, coef_im1), cmplx_mac (inputs ar0, ai0, br0, bi0, ar1, ai1, br1, bi1, sync; outputs sum_re, sum_im). Inputs: peak_scale_i, peak_scale_q, peak_indicator, alloc_spacing, filter_numtaps, filter_ram_addr, filter_ram_data, filter_ram_we. Outputs: pulse_sum_i, pulse_sum_q.]

Table 21: E-UTRA BS Reference Sensitivity Level (From 3GPP TR 36.104, Table 7.2-1)

| Channel Bandwidth (MHz) | Reference Measurement Channel | Reference Sensitivity Level (dBm) |
|---|---|---|
| 5 | FRC A1-3 in Annex A.1 | -101.6 |
| 10 | FRC A1-3 in Annex A.1 | -101.6 |
| 15 | FRC A1-3 in Annex A.1 | -101.6 |
| 20 | FRC A1-3 in Annex A.1 | -101.6 |

Adjacent Channel Selectivity

The following information is taken from section 7.4.1 of [Ref 1] (see also section 7.5 of [Ref 2]):

Adjacent channel selectivity (ACS) is “a measure of the receiver ability to receive a wanted signal at its assigned channel frequency in the presence of an adjacent channel signal at a given frequency offset …”. For UTRA, the ACS is defined by stating a required BER performance of 0.001 at a specified data rate, wanted signal mean power and interfering signal mean power.

From 3GPP TS 25.104 [Ref 7], an ACS value of 46 dB is used for the UTRA FDD BS, assuming a noise figure of 5 dB for the BS receiver.

For E-UTRA, two different types of interference signals are used to specify ACS:

• A wideband interferer consisting of a 5-MHz E-UTRA signal

• A narrowband interferer consisting of a single resource block from a 5 MHz E-UTRA signal.

Table 22 (from Table 7.5-3 of 3GPP TR 36.104 [Ref 2]) lists the ACS requirements for the wideband interferer. Table 23 lists the requirements for the narrowband interferer.

Note: The wanted signal power is set to a level that is 6 dB higher than the reference sensitivity level. As pointed out in TR 36.804, “this does not mean that 6 dB degradation is allowed. It is simply a selected test parameter to make the interference impact measurable.” As with the reference sensitivity level, the wanted signal mean power levels are normalized to a bandwidth of 25 resource blocks (4.5 MHz).

For all channel bandwidths (BW = 5, 10, 15, or 20 MHz), the wideband interference signal is a 5-MHz E-UTRA signal that is centered 2.5 MHz from the victim carrier's band edge. For example, if BW = 20 MHz, the center frequency of the interfering signal is 12.5 MHz (10 MHz + 2.5 MHz) from the center of the wanted carrier.

From Table 21, REFSENS = -101.6 dBm measured in a 4.5-MHz bandwidth. Therefore, the wanted signal power is equal to -95.6, -92.6, -90.8, or -89.6 dBm for channel bandwidths of 5, 10, 15, and 20 MHz, respectively.
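The wanted signal powers quoted above follow from scaling the REFSENS + 6 dB level by the ratio of the channel bandwidth to 5 MHz. The short Python sketch below reproduces that arithmetic; the function and constant names are illustrative and do not come from the design files.

```python
import math

REFSENS_DBM = -101.6     # Table 21, referenced to 25 resource blocks (4.5 MHz)
TEST_OFFSET_DB = 6.0     # wanted signal is set to REFSENS + 6 dB

def wanted_signal_power_dbm(channel_bw_mhz):
    """Scale the REFSENS + 6 dB level from a 5 MHz channel to a wider channel."""
    scale_db = 10.0 * math.log10(channel_bw_mhz / 5.0)
    return REFSENS_DBM + TEST_OFFSET_DB + scale_db

for bw in (5, 10, 15, 20):
    print(f"{bw:2d} MHz channel: {wanted_signal_power_dbm(bw):.1f} dBm")
```

Rounded to one decimal place, the four results match the values listed in the text.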

Table 22: ACS Requirement with Wideband Interferer (from Table 7.5-3 of [Ref 2])

| Channel Bandwidth (MHz) | Wanted Signal Mean Power (dBm) | Interfering Signal Mean Power (dBm) | Frequency Offset from Band Edge (MHz) |
|---|---|---|---|
| 5 | REFSENS + 6 | -52 | 2.5 |
| 10 | REFSENS + 6 | -52 | 2.5 |
| 15 | REFSENS + 6 | -52 | 2.5 |
| 20 | REFSENS + 6 | -52 | 2.5 |

Table 23: ACS Requirement with Narrowband Interferer (from Tables 7.5-1,2 of [Ref 2])

| Channel Bandwidth (MHz) | Wanted Signal Mean Power (dBm) | Interfering Signal Mean Power (dBm) | Frequency Offset from Band Edge (kHz) |
|---|---|---|---|
| 5 | REFSENS + 6 | -49 | 340 + m×180, m = 0, 1, 2, 3, 4, 9, 14, 19, 24 |
| 10 | REFSENS + 6 | -49 | 340 + m×180, m = 0, 1, 2, 3, 4, 9, 14, 19, 24 |
| 15 | REFSENS + 6 | -49 | 340 + m×180, m = 0, 1, 2, 3, 4, 9, 14, 19, 24 |
| 20 | REFSENS + 6 | -49 | 340 + m×180, m = 0, 1, 2, 3, 4, 9, 14, 19, 24 |





At the stated signal powers, the interfering signal mean power is -52 dBm for all cases. Using BW = 5 MHz, the interferer is -52 – (-95.6) = 43.6 dB higher than the wanted signal. Because the reference bandwidth is 4.5 MHz, the ACS for all channel bandwidths (5, 10, 15, 20 MHz) is the same (~44 dB).

For the narrowband interferer, the wanted signal mean power of -95.6 dBm per 25 resource blocks translates into -109.6 dBm per resource block.

The mean power of the narrowband interfering signal is -49 dBm and occupies a single resource block. This implies that the spectral level of the narrowband interferer is ~61 dB higher than the spectral level of the wanted signal. However, the total mean power of the narrowband interferer is only -49 – (-95.6) = 46.6 dB higher than the wanted signal. With the narrowband interference it is sufficient to base the filter requirements on the total power levels rather than the relative spectral levels.
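The relative levels used in the ACS discussion can be checked with a few lines of Python. This is only a restatement of the arithmetic in the text; the variable names are illustrative.

```python
import math

WANTED_DBM = -95.6                  # REFSENS + 6 dB over 25 resource blocks
WIDEBAND_INTERFERER_DBM = -52.0     # 5-MHz E-UTRA interferer (Table 22)
NARROWBAND_INTERFERER_DBM = -49.0   # single resource block interferer (Table 23)
NUM_RESOURCE_BLOCKS = 25

# Wideband ACS: ratio of total interferer power to total wanted power.
wideband_ratio_db = WIDEBAND_INTERFERER_DBM - WANTED_DBM

# Narrowband ACS: wanted power spread over 25 resource blocks.
wanted_per_rb_dbm = WANTED_DBM - 10.0 * math.log10(NUM_RESOURCE_BLOCKS)
narrowband_spectral_ratio_db = NARROWBAND_INTERFERER_DBM - wanted_per_rb_dbm
narrowband_total_ratio_db = NARROWBAND_INTERFERER_DBM - WANTED_DBM

print(f"wideband ratio:          {wideband_ratio_db:.1f} dB")
print(f"wanted power per RB:     {wanted_per_rb_dbm:.1f} dBm")
print(f"narrowband, spectral:    {narrowband_spectral_ratio_db:.1f} dB")
print(f"narrowband, total power: {narrowband_total_ratio_db:.1f} dB")
```

The spectral ratio is about 14 dB larger than the total power ratio because 10·log10(25) ≈ 14 dB.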

Blocking Characteristics

The blocking performance is specified as a measure of the receiver ability to receive a wanted signal in the presence of an unwanted interferer.

The interference signal is a 5-MHz E-UTRA signal for in-band blocking and a continuous wave (CW) signal for out-of-band blocking.

Table 24 lists the in-band blocking requirements for all operating bands.

For in-band blocking, the center frequency of the interfering signal is specified as being between (FUL_low-20) MHz and (FUL_high+20) MHz. For example, the E-UTRA operating band 1 has FUL_low = 1920 MHz and FUL_high = 1980 MHz. Therefore, the in-band interferer lies between 1900 MHz and 2000 MHz. Out-of-band interferers lie outside of the range defined for the in-band interferers.

The mean power of the CW interferer for the out-of-band blocking is -15 dBm.

There is a blocking requirement for co-location with GSM, UTRA, and E-UTRA operating in different frequency bands. The CW interference in this case is +16 dBm. In both cases, it is assumed that the out-of-band interference is sufficiently attenuated by analog filters in the receiver front-end to mitigate the impact on the digital filter requirements.

The in-band blocking requirements are similar to the wideband ACS requirements, except that the interfering signal is in the second adjacent channel (relative to a 5 MHz bandwidth) and its power is -43 dBm, which is 9 dB higher than the -52 dBm used for wideband ACS. Therefore, the spectral level of the 5-MHz E-UTRA interferer is ~53 dB higher than the spectral level of the wanted signal (-43 – (-95.6) = 52.6 dB).

Table 24: In-Band Blocking Requirements (from Tables 7.6-1,2 of 3GPP TR 36.104)

| Channel Bandwidth (MHz) | Wanted Signal Mean Power (dBm) | Interfering Signal Mean Power (dBm) | Frequency Offset from Band Edge (MHz) |
|---|---|---|---|
| 5 | REFSENS + 6 | -43 | 7.5 |
| 10 | REFSENS + 6 | -43 | 7.5 |
| 15 | REFSENS + 6 | -43 | 7.5 |
| 20 | REFSENS + 6 | -43 | 7.5 |

Intermodulation Characteristics

The following description of intermodulation characteristics is from section 7.6 of 3GPP TR 36.804 [Ref 1] (see also section 7.8 of 3GPP TS 36.104 [Ref 2]):

“The intermodulation performance requirement of the E-UTRA system is specified as a measure of the capability of the receiver to receive a wanted signal on its assigned channel frequency in the presence of two interfering signals which have a specific frequency relationship to the wanted signal.”

There are two types of intermodulation requirements: wideband and narrowband.

The wideband intermodulation pairs a 5-MHz E-UTRA interferer at some offset frequency with a CW interferer at a different offset frequency. Table 25 lists the wideband intermodulation requirements.

The narrowband intermodulation pairs 1 resource block of a 5-MHz E-UTRA signal at some offset frequency with a CW interferer at a different offset frequency. Table 26 lists the narrowband intermodulation requirements.

Summary of Requirements for ACS, Blocking, and Intermodulation

Table 27 summarizes the ACS, blocking and intermodulation requirements based on information presented in the previous subsections. The interference relative mean power is the ratio of the interference mean power to that of the wanted signal in dB.

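Table 25: Wideband Intermodulation Requirements (from Tables 7.8-1,2 of [Ref 2])

| Channel Bandwidth (MHz) | Wanted Signal Mean Power (dBm) | Interfering Signal Mean Power (dBm) | Frequency Offset from Band Edge (MHz) | Type of Interfering Signal |
|---|---|---|---|---|
| 5 | REFSENS + 6 | -52 | 7.5 | CW |
| | | -52 | 17.5 | 5 MHz E-UTRA |
| 10 | REFSENS + 6 | -52 | 7.5 | CW |
| | | -52 | 17.7 | 5 MHz E-UTRA |
| 15 | REFSENS + 6 | -52 | 7.5 | CW |
| | | -52 | 18.0 | 5 MHz E-UTRA |
| 20 | REFSENS + 6 | -52 | 7.5 | CW |
| | | -52 | 18.2 | 5 MHz E-UTRA |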

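Table 26: Narrowband Intermodulation Requirements (from Table 7.8-3 of [Ref 2])

| Channel Bandwidth (MHz) | Wanted Signal Mean Power (dBm) | Interfering Signal Mean Power (dBm) | Frequency Offset from Band Edge (kHz) | Type of Interfering Signal |
|---|---|---|---|---|
| 5 | REFSENS + 6 | -52 | 360 | CW |
| | | -52 | 1060 | 1 RB, E-UTRA |
| 10 | REFSENS + 6 | -52 | 415 | CW |
| | | -52 | 1420 | 1 RB, E-UTRA |
| 15 | REFSENS + 6 | -52 | 380 | CW |
| | | -52 | 1600 | 1 RB, E-UTRA |
| 20 | REFSENS + 6 | -52 | 345 | CW |
| | | -52 | 1780 | 1 RB, E-UTRA |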

Table 27: Summary of ACS, Blocking, and Intermodulation Requirements

| Interference Requirement | Interfering Signal | Interference Relative Mean Power (dB) | Minimum Frequency Offset from Band Edge |
|---|---|---|---|
| Wideband ACS | 5-MHz E-UTRA | 44 | 2.5 MHz |
| Narrowband ACS | 1 RB, E-UTRA | 47 | 340 kHz |
| In-Band Blocking | 5-MHz E-UTRA | 53 | 7.5 MHz |





Filter Requirements

This section derives the receiver digital filter requirements from the ACS, blocking, and intermodulation requirements.

From the reference sensitivity level section, the minimum mean power received at the antenna connector is -101.6 dBm per 4.5 MHz of bandwidth. Assuming a standard noise temperature of T0 = 290 K, the noise power at the receiver front end is given by N = κT0W watts, where W is the bandwidth in Hz and κ = Boltzmann’s constant = 1.38×10^-23 W/K-Hz.

The total noise power in a bandwidth of 4.5 MHz is -107.5 dBm. Therefore, the signal-to-noise ratio (SNR) within the 4.5 MHz bandwidth is equal to 5.9 dB. The ACS, blocking, and intermodulation measurements are defined at a level of REFSENS + 6 dB. Therefore, the in-channel SNR is equal to 11.9 dB.

At a minimum, it would be desirable to attenuate the interference signals to be below the thermal noise level. If the attenuated interference is at the same level as the thermal noise, it is expected that the SNR will be degraded by 3 dB. For comparison, a 3 dB degradation is allowed when testing adjacent channel rejection for WiMAX [Ref 8] [Ref 9]. This level of performance would require a stopband attenuation of 53 + 11.9 = ~65 dB. Adding 15 dB of margin to this number gives the required 80 dB of stopband attenuation that is listed in Table 2, page 8.
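The noise and attenuation figures in this section can be reproduced approximately with the sketch below. It uses the constants quoted in the text; the variable names are illustrative, and the computed values agree with the quoted ones only to within rounding (a few hundredths of a dB).

```python
import math

BOLTZMANN_W_PER_K_HZ = 1.38e-23   # value quoted in the text
T0_KELVIN = 290.0                 # standard noise temperature
BANDWIDTH_HZ = 4.5e6              # 25 resource blocks
REFSENS_DBM = -101.6              # Table 21
TEST_LEVEL_OFFSET_DB = 6.0        # tests are run at REFSENS + 6 dB
INBAND_BLOCKER_RATIO_DB = 53.0    # largest relative interferer level (Table 27)
MARGIN_DB = 15.0                  # design margin chosen in the text

# Thermal noise power N = k * T0 * W, converted from watts to dBm.
noise_watts = BOLTZMANN_W_PER_K_HZ * T0_KELVIN * BANDWIDTH_HZ
noise_dbm = 10.0 * math.log10(noise_watts / 1e-3)

snr_at_refsens_db = REFSENS_DBM - noise_dbm
in_channel_snr_db = snr_at_refsens_db + TEST_LEVEL_OFFSET_DB

# Attenuation needed to push the strongest interferer down to the noise level.
stopband_db = INBAND_BLOCKER_RATIO_DB + in_channel_snr_db
stopband_with_margin_db = stopband_db + MARGIN_DB

print(f"thermal noise:      {noise_dbm:.2f} dBm")
print(f"SNR at REFSENS:     {snr_at_refsens_db:.2f} dB")
print(f"in-channel SNR:     {in_channel_snr_db:.2f} dB")
print(f"stopband + margin:  {stopband_with_margin_db:.2f} dB")
```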

Digital Down-Converter Architecture

This section describes each of the modules shown in Figure 2, page 7.

Fs/4 Mixer + Halfband Decimator

The Fs/4 mixer is used to frequency translate the spectrum of the real input signal centered at one fourth the sample rate to 0 Hz.

The input signal is assumed to occupy a bandwidth of 5, 10, 15, or 20 MHz. If the input signal bandwidth contains multiple carriers, the Fs/4 mixer centers the entire bandwidth at 0 Hz regardless of the carrier configuration. For example, if there are four 5-MHz carriers within a 20-MHz bandwidth, the output of the Fs/4 mixer will have a spectrum with two carriers to the left of 0 Hz and two carriers to the right of 0 Hz.

The Fs/4 mixer output signal is complex. An Fs/4 mixer has the property that the exp(-jωt) term used to multiply the input signal reduces to exp(-jnπ/2) = (1, -j, -1, j, 1, -j, -1, j, …). Therefore, the frequency translation can be accomplished with some simple two’s complement and MUX circuitry.

If the input signal is represented by x(n):

• the I channel output takes the form x(n) ×[1, 0, -1, 0, 1, 0, -1, 0, …]

• the Q channel output takes the form x(n) ×[0, -1, 0, 1, 0, -1, 0, 1, …].
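The equivalence between multiplying by exp(-jnπ/2) and applying the two four-state sequences above can be illustrated in Python. This is a behavioral sketch only, not the System Generator implementation; the function names and the test input are hypothetical.

```python
import cmath
import math

def fs4_mix_reference(x):
    """Multiply each real sample by exp(-j*pi*n/2) directly."""
    return [xn * cmath.exp(-1j * math.pi * n / 2.0) for n, xn in enumerate(x)]

def fs4_mix_mux(x):
    """Same translation using only sign changes and a 4-state selection."""
    i_seq = (1, 0, -1, 0)    # real part of exp(-j*pi*n/2)
    q_seq = (0, -1, 0, 1)    # imaginary part of exp(-j*pi*n/2)
    i_out = [xn * i_seq[n % 4] for n, xn in enumerate(x)]
    q_out = [xn * q_seq[n % 4] for n, xn in enumerate(x)]
    return i_out, q_out

x = [3, 5, -2, 7, 1, -4, 6, 2]      # arbitrary real input samples
i_out, q_out = fs4_mix_mux(x)
reference = fs4_mix_reference(x)

print(i_out)   # odd-indexed samples are zero
print(q_out)   # even-indexed samples are zero
```

Every second sample on each channel is zero, which is the property the halfband decimator exploits.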

Table 27: Summary of ACS, Blocking, and Intermodulation Requirements (Cont’d)

| Interference Requirement | Interfering Signal | Interference Relative Mean Power (dB) | Minimum Frequency Offset from Band Edge |
|---|---|---|---|
| Wideband Intermodulation | CW | 44 | 7.5 MHz |
| | 5-MHz E-UTRA | 44 | 17.5 MHz |
| Narrowband Intermodulation | CW | 44 | [345, 360] kHz |
| | 1 RB, E-UTRA | 44 | [1780, 1060] kHz |

The fact that every other sample is zero on each channel can be exploited to reduce filtering complexity. In particular, for a decimate-by-2 filter, one polyphase arm can be eliminated from each channel. If a halfband decimator is used, only one non-trivial filter arm is needed to filter the entire complex input signal. The output of a polyphase decimate-by-2 filter can be written as

$$y(n) = y_0(n) + y_1(n)$$

where:

$$y_0(n) = \sum_{k=0}^{\lfloor L/2 \rfloor} h_0(k)\, x_0(n-k)$$ Equation 4

$$y_1(n) = \sum_{k=0}^{\lfloor L/2 \rfloor - 1} h_1(k)\, x_1(n-k)$$ Equation 5

• h0(k) = h(2k) are the even filter coefficients

• h1(k) = h(2k+1) are the odd coefficients

• x0(n) = x(2n) are the even data samples

• x1(n) = x(2n-1) are the odd data samples

• L is the filter length

For the Fs/4 mixer I channel output, the odd samples are zero and the even samples are the even samples of the Fs/4 mixer input modulated by a ±1 sequence.

The Q channel output has even samples that are zero and odd samples that are equal to the odd samples of the Fs/4 mixer input modulated by a –(±1) sequence. Because the odd arm h1(k) of a halfband filter contains a single nonzero coefficient (the center tap), the Q channel needs no multi-tap filtering. Therefore,

• The I channel filter output is equal to ±x0(n) filtered with h0(k)

• The Q channel filter output is equal to –(±x1(n)), scaled by the center tap and delayed by (floor(L/2)-1)/2 samples.

Table 28 lists the filter parameters for the halfband filter that follows the Fs/4 mixer. The filter is designed to accommodate the widest bandwidth of 20 MHz and is used for all configurations.

Table 28: Filter Parameters for Halfband Decimator that Follows Fs/4 Mixer

| Filter | Fs (Msps) | Fpass (MHz) | Fstop (MHz) | Apass (dB) | Astop (dB) | # Taps |
|---|---|---|---|---|---|---|
| HB #4 | 122.88 | 10.0 | 51.44 | 0.001 | 104 | 15 |

Decimation Filtering

The general structure of the decimation filtering mirrors the interpolation filtering used in the transmit downlink.

• The input sample rate to the decimation filtering module is 61.44 Msps (see Figure 2, page 7).

• The decimation filtering for the 5-MHz, 10-MHz, and 20-MHz bandwidth configurations is implemented using a cascade of halfband filters.

• For the 15-MHz configuration, a rate 3/4 fractional resampler is also needed.

• The multi-carrier configurations are handled in a similar manner, except a second stage of mixing is needed to frequency translate each carrier down to 0 Hz.

• Filters were designed using the Filter Design and Analysis Tool (FDATool) in MATLAB.
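The halfband decimators used throughout this chain can be evaluated in the polyphase form of Equation 4 and Equation 5. The Python sketch below uses a toy 7-tap halfband filter (illustrative coefficients, not taken from the design files) to show that the polyphase form matches direct filtering followed by decimation, and that after Fs/4 mixing one arm of each channel contributes nothing. All names are hypothetical.

```python
# Toy 7-tap halfband filter: every odd-indexed tap is zero except the center.
H = [-1 / 32, 0.0, 9 / 32, 16 / 32, 9 / 32, 0.0, -1 / 32]

def sample(v, n):
    """Return v[n], treating samples outside the list as zero."""
    return v[n] if 0 <= n < len(v) else 0.0

def decimate_direct(h, v):
    """Filter at the full rate, then keep every other output sample."""
    return [sum(h[j] * sample(v, 2 * m - j) for j in range(len(h)))
            for m in range(len(v) // 2)]

def decimate_polyphase(h, v):
    """Return the two arm outputs y0(n) and y1(n); the result is y0 + y1."""
    h0, h1 = h[0::2], h[1::2]             # h0(k) = h(2k), h1(k) = h(2k+1)
    y0 = [sum(h0[k] * sample(v, 2 * (m - k)) for k in range(len(h0)))
          for m in range(len(v) // 2)]    # uses x0(n) = x(2n)
    y1 = [sum(h1[k] * sample(v, 2 * (m - k) - 1) for k in range(len(h1)))
          for m in range(len(v) // 2)]    # uses x1(n) = x(2n-1)
    return y0, y1

x = [3, 5, -2, 7, 1, -4, 6, 2, -3, 8, 4, -1]      # arbitrary real input
i_in = [xn * (1, 0, -1, 0)[n % 4] for n, xn in enumerate(x)]   # Fs/4 mixer I
q_in = [xn * (0, -1, 0, 1)[n % 4] for n, xn in enumerate(x)]   # Fs/4 mixer Q

i_y0, i_y1 = decimate_polyphase(H, i_in)
q_y0, q_y1 = decimate_polyphase(H, q_in)

print("I odd arm :", i_y1)   # all zeros: only the h0 arm is needed
print("Q even arm:", q_y0)   # all zeros: only the center tap is needed
```

For this 7-tap example the Q channel output is the center tap (0.5) times the odd input samples delayed by (floor(7/2)-1)/2 = 1 output sample.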






Figure 46 shows the decimation filtering chain for the 1x5 MHz configuration. Each filter is a halfband decimation filter with parameters as listed in Table 29, page 50.

The equiripple halfband lowpass input parameters (in FDATool) are Fs, Fpass, and Apass. The parameter Fstop is implied by the parameters Fs and Fpass. Astop is measured from the filter magnitude response.

The number of taps is the length of the smallest filter that satisfies the requirements specified by the input parameters.
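The statement that Fstop is implied by Fs and Fpass follows from the symmetry of a halfband response about Fs/4, which gives Fstop = Fs/2 - Fpass. The sketch below checks this against the halfband entries in Table 28 and Table 29; the function name is illustrative.

```python
def halfband_fstop_mhz(fs_msps, fpass_mhz):
    """Stopband edge of a halfband filter, symmetric to Fpass about Fs/4."""
    return fs_msps / 2.0 - fpass_mhz

# (filter label, Fs in Msps, Fpass in MHz) taken from Table 28 and Table 29
halfband_filters = [
    ("HB #4", 122.88, 10.0),
    ("HB #3", 61.44, 2.50),
    ("HB #2", 30.72, 2.50),
    ("HB #1", 15.36, 2.50),
]

for name, fs, fpass in halfband_filters:
    print(f"{name}: Fstop = {halfband_fstop_mhz(fs, fpass):.2f} MHz")
```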

Although the occupied bandwidth of the channel is only 90% of the total channel bandwidth, the passband cutoff frequency was set to protect the entire 5-MHz band. This was done to improve the system-level performance of the receive uplink in the presence of interferers. If the passband cutoff frequency is set to 2.25 MHz instead of 2.50 MHz, it is possible for a potentially strong interferer to alias down into the transition band of the channel filter (the stopband of the channel filter is set to 2.50 MHz for the 1x5 MHz configuration).


Figure 47 shows the decimation filtering chain for the 1x10 MHz configuration. Table 30 lists the filter parameters.

After normalizing to the filter sample rates, the passband cutoff frequencies are identical to the frequencies listed in Table 29. Given this and the fact that the same passband ripple was specified, the filter coefficients are identical to the ones used in the corresponding filters for the 1x5 MHz configuration. The throughput requirement for each filter is doubled because the sample rates at each stage are doubled.


Figure 46: Decimation Filter Structure for 1x5 MHz Configuration

[Block diagram: 61.44 Msps → HB #3 (↓2) → 30.72 Msps → HB #2 (↓2) → 15.36 Msps → HB #1 (↓2) → 7.68 Msps]

Table 29: Filter Parameters for 1x5 MHz Configuration

| Filter | Fs (Msps) | Fpass (MHz) | Fstop (MHz) | Apass (dB) | Astop (dB) | # Taps |
|---|---|---|---|---|---|---|
| HB #3 | 61.44 | 2.50 | 28.22 | 0.001 | 117 | 11 |
| HB #2 | 30.72 | 2.50 | 12.86 | 0.001 | 104 | 15 |
| HB #1 | 15.36 | 2.50 | 5.18 | 0.002 | 83 | 27 |

Figure 47: Decimation Filter Structure for 1x10 MHz Configuration

[Block diagram: 61.44 Msps → HB #2 (↓2, 15 taps) → 30.72 Msps → HB #1 (↓2, 27 taps) → 15.36 Msps]






Figure 48 shows the decimation filtering chain for the 1x15 MHz configuration. The filter parameters are listed in Table 31.

A fractional resampler is required to convert from the 30.72 Msps rate to the 23.04 Msps rate. The filter was designed using the lowpass equiripple design methodology in FDATool. The filter sample rate is based on the interpolated rate of 3×30.72 = 92.16 Msps. Unlike the halfband filters, the resampler stopband cutoff frequency is explicitly specified.

The value of 15.54 MHz (23.04 MHz – 7.50 MHz) was chosen to protect the entire 15-MHz band from aliasing after the implicit decimation by 4.
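The rate arithmetic for the fractional resampler can be written out as follows. This sketch uses exact rational arithmetic and only restates numbers from the text and from the resampler's filter parameters; the variable names are illustrative.

```python
from fractions import Fraction

INPUT_RATE_MSPS = Fraction(3072, 100)   # 30.72 Msps into the resampler
P, Q = 3, 4                             # interpolate by 3, decimate by 4
BAND_EDGE_MHZ = Fraction(15, 2)         # half of the 15 MHz channel

filter_rate_msps = INPUT_RATE_MSPS * P          # rate at which the filter is designed
output_rate_msps = filter_rate_msps / Q         # rate after the implicit decimation
# Anything above (output rate - band edge) would alias into the protected band.
fstop_mhz = output_rate_msps - BAND_EDGE_MHZ

print(float(filter_rate_msps), float(output_rate_msps), float(fstop_mhz))
```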


Figure 49 shows the decimation filtering for the 1x20 MHz configuration. Table 32 lists the filter parameters.

After normalizing to the sample rate, the filter coefficients are identical to those in the corresponding filter for the 1x10 MHz and 1x5 MHz configurations.


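Table 30: Filter Parameters for 1x10 MHz Configuration

| Filter | Fs (Msps) | Fpass (MHz) | Fstop (MHz) | Apass (dB) | Astop (dB) | # Taps |
|---|---|---|---|---|---|---|
| HB #2 | 61.44 | 5.0 | 25.72 | 0.001 | 104 | 15 |
| HB #1 | 30.72 | 5.0 | 10.36 | 0.002 | 83 | 27 |

Table 31: Filter Parameters for 1x15 MHz Configuration

| Filter | Fs (Msps) | Fpass (MHz) | Fstop (MHz) | Apass (dB) | Astop (dB) | # Taps |
|---|---|---|---|---|---|---|
| HB #1 (1x15) | 61.44 | 7.50 | 23.22 | 0.001 | 90 | 19 |
| Resampler | 92.16 | 6.75 | 15.54 | 0.01 | 85 | 45 |

Table 32: Filter Parameters for 1x20 MHz Configuration

| Filter | Fs (Msps) | Fpass (MHz) | Fstop (MHz) | Apass (dB) | Astop (dB) | # Taps |
|---|---|---|---|---|---|---|
| HB #1 | 61.44 | 10.0 | 20.72 | 0.002 | 83 | 27 |

Figure 48: Decimation Filter Structure for 1x15 MHz Configuration

[Block diagram: 61.44 Msps → HB #1 (1x15) (↓2, 19 taps) → 30.72 Msps → Fractional Resampler, P/Q = 3/4 (45 taps) → 23.04 Msps]

Figure 49: Decimation Filter Structure for 1x20 MHz Configuration

[Block diagram: 61.44 Msps → HB #1 (↓2, 27 taps) → 30.72 Msps]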





The multi-carrier input signal has a total allocated bandwidth of 10 MHz and the initial filtering is based on this total bandwidth. The first two halfband filters are the same as those used in the 1x10 MHz case. The coefficients for the final halfband filter are the same as those used in the second halfband filter (both are labeled HB#1). The difference is that they operate at different sample rates and accommodate a different number of channels.

At some point in the digital down conversion chain, it becomes necessary to further split the two carriers by frequency translating each one to 0 Hz before applying further decimation filtering and eventually channel filtering. A tradeoff exists in determining the best placement of the second stage (multi-carrier) mixer:

• Placing the mixer close to the IF means that each subsequent halfband filter has lower complexity because the ratio of sample rate to passband frequency is high. However, each filter that follows the multi-carrier mixer must now handle multiple complex channels.

• Placing the multi-carrier mixer at the lowest possible sample rate has the advantage that fewer filters in the chain need to handle multiple complex channels. In this case, most of the filters operate on the composite multi-carrier signal (one complex channel). However, because the multi-carrier signal occupies a greater bandwidth than a single carrier, at some point the normalized transition band of the filter becomes so small that the filter complexity can increase dramatically.

Taking these factors into account results in a placement of the multi-carrier mixer as shown in Figure 50.






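Table 33: Filter Parameters for 2x5 MHz Configuration

| Filter | Fs (Msps) | Fpass (MHz) | Fstop (MHz) | Apass (dB) | Astop (dB) | # Taps |
|---|---|---|---|---|---|---|
| HB #2 | 61.44 | 5.00 | 25.72 | 0.001 | 104 | 15 |
| HB #1 | 30.72 | 5.00 | 10.36 | 0.002 | 83 | 27 |
| HB #1 | 15.36 | 2.50 | 5.18 | 0.002 | 83 | 27 |

Figure 50: Decimation Filter Structure for 2x5 MHz Configuration

[Block diagram: 61.44 Msps → HB #2 (↓2) → 30.72 Msps → HB #1 (↓2) → 15.36 Msps → 2 Carrier Mixer → 2 complex channels at 15.36 Msps → HB #1 (↓2) → 7.68 Msps]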





The multi-carrier input signal has a total allocated bandwidth of 20 MHz. The first halfband filter is the same as the one used in the 1x20 MHz case. The coefficients for the final halfband filter are the same as the ones used in the first halfband filter (both are labeled HB#1).



• The multi-carrier input signal has a total allocated bandwidth of 20 MHz.

• The first halfband filter is the same as the one used in the 1x20 MHz case.

• The coefficients for the final halfband filter are the same as the ones used in the first halfband filter (both are labeled HB#1) because the normalized parameters are equivalent.

Channel Filter

After decimation filtering, a single-rate channel filter is applied to remove any interference that might exist in the region between the allocated channel bandwidth and the Nyquist frequency. For the case of 3GPP LTE, the occupied channel bandwidth is 90% of the total channel bandwidth. This determines the passband cutoff frequency.




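Table 34: Filter Parameters for 2x10 MHz Configuration

| Filter | Fs (Msps) | Fpass (MHz) | Fstop (MHz) | Apass (dB) | Astop (dB) | # Taps |
|---|---|---|---|---|---|---|
| HB #1 | 61.44 | 10.0 | 20.72 | 0.002 | 83 | 27 |
| HB #1 | 30.72 | 5.00 | 10.36 | 0.002 | 83 | 27 |

Table 35: Filter Parameters for 4x5 MHz Configuration

| Filter | Fs (Msps) | Fpass (MHz) | Fstop (MHz) | Apass (dB) | Astop (dB) | # Taps |
|---|---|---|---|---|---|---|
| HB #1 | 61.44 | 10.0 | 20.72 | 0.002 | 83 | 27 |
| HB #2 | 30.72 | 2.50 | 12.86 | 0.001 | 104 | 15 |
| HB #1 | 15.36 | 2.50 | 5.18 | 0.002 | 83 | 27 |

Figure 51: Decimation Filter Structure for 2x10 MHz Configuration

[Block diagram: 61.44 Msps → HB #1 (↓2, 27 taps) → 30.72 Msps → 2 Carrier Mixer → 2 complex channels at 30.72 Msps → HB #1 (↓2, 27 taps) → 15.36 Msps]

Figure 52: Decimation Filter Structure for 4x5 MHz Configuration

[Block diagram: 61.44 Msps → HB #1 (↓2, 27 taps) → 30.72 Msps → 4 Carrier Mixer → 4 complex channels at 30.72 Msps → HB #2 (↓2, 15 taps) → 15.36 Msps → HB #1 (↓2, 27 taps) → 4 complex channels at 7.68 Msps]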





The stopband frequency is set to match the channel bandwidth (Fstop = BW/2).

The peak-to-peak passband ripple (Apass) and stopband attenuation (Astop) were set to meet the performance requirements summarized in Table 2, page 8.

Table 36 lists the filter parameters for the channel filter.

The filter coefficients are the same for each configuration because the normalized parameters are identical.

Figure 53 shows the magnitude response of the channel filter. After 18-bit quantization of the coefficients, the stopband rejection is 80 dB and the peak-to-peak passband ripple is 0.052 dB. Obtaining the stopband rejection of 80 dB requires that the coefficients be scaled such that the peak of the impulse response is equal to the maximum positive 18-bit signed number (131071/131072). This results in a filter passband gain of 1.637.

If unity passband gain is maintained, the impulse response peak is 0.6128 and the stopband rejection degrades to ~77 dB.

Table 36: Filter Parameters for the Channel Filter

| BW (MHz) | Fs (Msps) | Fpass (MHz) | Fstop (MHz) | Apass (dB) | Astop (dB) | # Taps |
|---|---|---|---|---|---|---|
| 5 | 7.68 | 2.25 | 2.50 | 0.05 | 83 | 113 |
| 10 | 15.36 | 4.50 | 5.00 | 0.05 | 83 | 113 |
| 15 | 23.04 | 6.75 | 7.50 | 0.05 | 83 | 113 |
| 20 | 30.72 | 9.00 | 10.0 | 0.05 | 83 | 113 |

Figure 53: Magnitude Response of Channel Filter

[Plot: magnitude (dB, -100 to 0) versus frequency (MHz, 0 to 3.5); curves: Double Precision and Quantized (18 bits)]

Filter Quantization

This section summarizes the characteristics of the filters after quantizing the coefficients to 18 bits.

Table 37 lists the decimation filter characteristics after 18-bit quantization. In the Virtex-5 FPGA family, each dedicated hardware multiplier (DSP48E) can perform a signed 18-bit by 25-bit multiply. For the DDC reference design, the quantization was limited to 18 bits by 18 bits to remain consistent with previous architectures. This leaves the user the flexibility to increase either the data path quantization or the coefficient quantization.

The “System Performance” section demonstrates that this level of quantization is sufficient to meet the adjacent channel, blocking, and intermodulation performance requirements.
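The coefficient scaling described for the channel filter, where the impulse response peak is mapped to the largest positive 18-bit code, can be sketched as follows. The coefficients below are a toy example rather than the reference design's filter, and the function name and rounding choice are assumptions made for illustration.

```python
def quantize_coefficients(coeffs, bits=18, scale_to_peak=True):
    """Quantize coefficients to signed fixed point with (bits - 1) fractional bits.

    With scale_to_peak=True the coefficients are first scaled so that the
    largest magnitude maps to the maximum positive code (131071 for 18 bits).
    Returns the integer codes and the gain that was applied.
    """
    full_scale = 2 ** (bits - 1)              # 131072 for 18 bits
    max_code = full_scale - 1                 # 131071
    peak = max(abs(c) for c in coeffs)
    gain = (max_code / full_scale) / peak if scale_to_peak else 1.0
    codes = [max(-full_scale, min(max_code, round(c * gain * full_scale)))
             for c in coeffs]
    return codes, gain

# Toy coefficients (not from the design files).
toy = [-1 / 32, 0.0, 9 / 32, 16 / 32, 9 / 32, 0.0, -1 / 32]

scaled_codes, scaled_gain = quantize_coefficients(toy)
unity_codes, unity_gain = quantize_coefficients(toy, scale_to_peak=False)

print(max(scaled_codes), scaled_gain)   # peak uses the full positive range
print(max(unity_codes), unity_gain)     # unity gain leaves headroom unused
```

Scaling to the peak uses the full coefficient word, which is the reason the text gives for the improved stopband rejection; the resulting passband gain then has to be accounted for elsewhere in the data path.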

Table 37: Decimation Filter Characteristics after 18-Bit Quantization

| Filter | Fs (Msps) | Fpass (MHz) | Fstop (MHz) | Apass (dB) | Astop (dB) | # Taps |
|---|---|---|---|---|---|---|
| HB #4 | 122.88 | 10.0 | 51.44 | <0.001 | 99 | 15 |
| HB #3 | 61.44 | 2.50 | 28.22 | <0.001 | 96 | 11 |
| HB #2 | 30.72 | 2.50 | 12.86 | <0.001 | 96 | 15 |
| HB #1 | 15.36 | 2.50 | 5.18 | <0.002 | 82 | 27 |
| HB #1 (1x15) | 61.44 | 7.50 | 23.22 | <0.001 | 86 | 19 |
| Resampler (1x15) | 92.16 | 6.75 | 15.54 | 0.011 | 83 | 45 |
| Channel | 7.68 | 2.25 | 2.50 | 0.052 | 80 | 113 |

Multi-Carrier Mixing

This section describes the architecture of the multi-carrier mixing that is embedded in the decimation filter chains for the 2x5, 2x10, and 4x5 MHz configurations.

The multi-carrier mixing is needed to frequency translate each carrier of interest down to 0 Hz before further decimation and channel filtering. The mixer used in the 4x5 MHz configuration is described first, followed by a description of the mixer architectures for the 2x5 and 2x10 MHz configurations. Hardware implementation details are covered in “Hardware Verification,” page 86.

4x5 Multi-Carrier Mixing

In the 4x5 MHz case, the input to the mixer is a single complex signal sampled at 30.72 Msps. The mixer multiplies the input signal by exp(-jω0t), exp(-jω1t), exp(-jω2t), and exp(-jω3t) to produce four complex output signals, where the ωk represent the carrier center frequencies. Letting x(n) = xI(n) + j·xQ(n) represent the input signal, the real (I channel) and imaginary (Q channel) components of the output signals are given by Equation 6 and Equation 7:

$$y_{Ik}(n) = x_I(n)\cos(\omega_k n) + x_Q(n)\sin(\omega_k n)$$ Equation 6

$$y_{Qk}(n) = x_Q(n)\cos(\omega_k n) - x_I(n)\sin(\omega_k n)$$ Equation 7

A Direct Digital Synthesizer (DDS) is used to generate the sine and cosine waveforms.

For 3GPP LTE, the carrier center frequencies are located on a 100 kHz raster. This feature can be exploited to reduce computational complexity while simultaneously providing better system level performance. In particular, the rasterized DDS is shown to have ideal SFDR and requires no further noise shaping methods such as dithering.

In a rasterized DDS, a finite set of discrete frequencies can be generated, where each frequency is a multiple of the raster size. For example, at a sample rate of 30.72 Msps, a 10 kHz raster can be supported by storing one full cycle of a cosine wave that is discretized to 3072 samples. If only 1536 samples are stored to cover one full cycle, then the raster size is 20 kHz.





The number of samples stored to cover a full cycle must be an integer. A 100 kHz raster would require 30.72 MHz / 100 kHz = 307.2 samples, which is not an integer. Given this constraint, and to minimize hardware, a 20 kHz raster (1536 samples) is used to cover the 100 kHz raster requirement. In particular, addressing the cosine ROM at every fifth location gives a 100 kHz cosine wave.

The output of the rasterized DDS has a noise floor that is governed by the quantization of the sine and cosine waves, but as previously mentioned, the SFDR is ideal. This is because the cosine samples that are stored and accessed line up exactly with the ideal sampling times for the rasterized frequencies.
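A behavioral sketch of the rasterized DDS and the mixing of Equation 6 and Equation 7 is given below. It stores one full cosine cycle of 1536 samples, steps through it by an integer address increment, and derives the sine from the same table with a quarter-cycle offset. The table here is kept in floating point, whereas the design stores an 18-bit quantized cosine; the function names and the quarter-cycle sine lookup are assumptions made for illustration.

```python
import math

FS_HZ = 30_720_000                 # mixer sample rate for the 4x5 MHz case
TABLE_LEN = 1536                   # samples in one stored cosine cycle
RASTER_HZ = FS_HZ // TABLE_LEN     # 20 kHz raster

COS_ROM = [math.cos(2.0 * math.pi * a / TABLE_LEN) for a in range(TABLE_LEN)]

def dds(freq_hz, n):
    """Return (cos, sin) of the carrier at sample n for a raster frequency."""
    if freq_hz % RASTER_HZ != 0:
        raise ValueError("frequency is not on the raster")
    step = freq_hz // RASTER_HZ                # ROM address increment per sample
    addr = (step * n) % TABLE_LEN
    cos_val = COS_ROM[addr]
    sin_val = COS_ROM[(addr - TABLE_LEN // 4) % TABLE_LEN]   # sin(t) = cos(t - pi/2)
    return cos_val, sin_val

def mix(x_i, x_q, freq_hz, n):
    """Multiply x(n) = x_i + j*x_q by exp(-j*w*n) per Equation 6 and Equation 7."""
    c, s = dds(freq_hz, n)
    return x_i * c + x_q * s, x_q * c - x_i * s

# A complex tone at +2.5 MHz is translated to 0 Hz (a constant 1 + j0).
TONE_HZ = 2_500_000
mixed = []
for n in range(64):
    phase = 2.0 * math.pi * TONE_HZ * n / FS_HZ
    mixed.append(mix(math.cos(phase), math.sin(phase), TONE_HZ, n))

print(dds(100_000, 1)[0] == COS_ROM[5])   # a 100 kHz carrier steps by 5 addresses
```

Because every raster frequency maps to an integer address step, the stored samples coincide with the ideal sample instants, which is the property behind the ideal SFDR noted in the text.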

Figure 55 gives an example plot of the DDS output spectrum.

The cosine wave is quantized to 18 bits, but the effective number of bits is only 17, because the cosine wave is scaled to give a maximum output of 0.5.

The measured output signal-to-noise ratio is approximately 104 dB, which is consistent with the predicted value of 17 bits × 6 dB/bit = 102 dB.

2x5 Multi-Carrier Configuration

In the 2x5 MHz configuration, a similar multi-carrier mixer is used. In this case, the mixer input is a complex signal sampled at 15.36 Msps. Only 768 samples of one full cosine wave are needed to support a 20 kHz raster.

The mixer from the 2x5 MHz configuration was reused to implement the mixer for the 2x10 MHz configuration, where the mixer sample rate is 30.72 Msps. The main difference is that the 768 samples of one full cosine wave now provide a 40 kHz raster. While 100 kHz is not a multiple of 40 kHz, the default carrier center frequencies for this case are ±5.0 MHz, which is a multiple of 40 kHz. Therefore, the same mixer is used but operates at a different sample frequency.
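The raster sizes quoted for the three multi-carrier configurations follow directly from the sample rate divided by the number of stored cosine samples. A short check in Python, using illustrative names:

```python
def raster_hz(sample_rate_hz, table_len):
    """Frequency spacing of a DDS that stores one cosine cycle in table_len samples."""
    return sample_rate_hz // table_len

def on_raster(freq_hz, raster):
    """True if freq_hz can be generated exactly on the given raster."""
    return freq_hz % raster == 0

raster_4x5 = raster_hz(30_720_000, 1536)    # 4x5 MHz configuration
raster_2x5 = raster_hz(15_360_000, 768)     # 2x5 MHz configuration
raster_2x10 = raster_hz(30_720_000, 768)    # 2x10 MHz configuration (reused mixer)

print(raster_4x5, raster_2x5, raster_2x10)
print(on_raster(5_000_000, raster_2x10), on_raster(-5_000_000, raster_2x10))
print(on_raster(100_000, raster_2x10))      # an arbitrary 100 kHz step is not supported
```

With the reused 768-sample table at 30.72 Msps, only carrier frequencies that are multiples of 40 kHz, such as the default ±5.0 MHz, can be generated exactly.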


Figure 55: Example DDS Output Spectrum

[Plot: PSD of DDS Output Signal; dB (-200 to 0) versus frequency (MHz, -15 to 15)]





System Performance

This section describes the methodology used to test the receive uplink portion of the digital front end and provides some simulation results.

Methodology and Assumptions

The performance of the receive uplink is based on simulation results that were obtained from internally developed MATLAB code. All results are based on processing one 10 ms frame of Physical Uplink Shared Channel (PUSCH) data. In this reference design, frame structure type 1 is used. See section 4.1 of 3GPP TS 36.211 V8.0.0 (2007-09) [Ref 10] for further description of the frame structure.

Figure 56 shows a block diagram of the uplink simulation software. The code contains a simplified baseband signal generator that creates random modulation symbols from a QAM alphabet (4-QAM, 16-QAM, or 64-QAM). The random modulation symbols are then transform precoded using the equation listed in section 5.3.3 of 3GPP TS 36.211 V8.0.0 (2007-09) [Ref 10].

The transform precoding process performs an Msc-point DFT on contiguous blocks of modulation symbols, where Msc is the number of subcarriers per SC-FDMA symbol. The transform precoder is followed by the SC-FDMA signal generator. The SC-FDMA signal generator produces OFDM symbols based on the equation from section 5.6 of 3GPP TS 36.211 [Ref 10]. In particular, it performs the IFFT and generates the cyclic extension.

Table 38 lists the parameters used for the baseband signal generation.

After baseband signal generation, the data passes through a DUC model. The DUC model contains a channel filter followed by a cascade of interpolation filters to get to a sample rate of 122.88 Msps. For multi-carrier cases, the DUC model also performs frequency translation before combining the different carriers.

Figure 56: Block Diagram of Receive Uplink Simulation Software

[Block diagram: Random Symbol Generator → Transform Precoder → SC-FDMA Signal Generator → DUC Model → addition of Noise and Interference → DDC Model → SC-FDMA Signal Receiver → Inverse Transform Precoder → Performance Analysis]

Table 38: SC-FDMA Parameters for Uplink

| Bandwidth (MHz) | Sample Rate (Msps) | FFT Size | Number of Subcarriers |
|---|---|---|---|
| 5 | 7.68 | 512 | 300 |
| 10 | 15.36 | 1024 | 600 |
| 15 | 23.04 | 1536 | 900 |
| 20 | 30.72 | 2048 | 1200 |

After digital up-conversion, noise and interference are added to the signal to model the channel. The type of interference added is specified through a top-level parameter. The code supports five types of interference: wideband adjacent channel, narrowband adjacent channel, in-band blocker, wideband intermodulation, and narrowband intermodulation. When noise is turned on, the noise level is set to achieve an in-channel SNR of 11.9 dB as explained in “Filter Requirements,” page 48.

After adding interference and noise, the real part of the signal is taken and quantized to 14 bits to drive a bit-true model of the DDC. The DDC output signal drives an SC-FDMA signal receive module that performs the inverse operation of the SC-FDMA signal generator block. In particular, it removes the cyclic extension from the signal and then performs an FFT on each OFDM symbol. After the FFT, the SC-FDMA symbols are processed by the inverse transform precoder (inverse DFT) to recover the received data modulation symbols. The recovered data symbols are compared to the known data symbols at the transform precoder input to generate an error vector magnitude (EVM) or relative constellation error (RCE) metric. For the DDC performance results, EVM and RCE are calculated the same, except one is presented as a percentage and the other is presented in dB.

Two methods of EVM calculation were considered for the receive uplink. The simpler method is based on comparing the received modulation symbols to the transmitted modulation symbols over the entire frame. The second method is based on calculating EVM over each 12-subcarrier by 1-subframe block and then averaging. For the data presented here, the results between the two methods are nearly identical (around 0.01% difference).
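The two EVM calculations can be sketched in Python as shown below. The whole-frame version compares all received symbols with the known symbols at once. The block-wise version is an assumption about how the per-block values are combined (a plain arithmetic mean over fixed-length blocks of a flat symbol list), since the text does not give that detail. RCE is the same quantity expressed in dB.

```python
import math

def evm_percent(received, reference):
    """RMS error vector magnitude over all symbols, as a percentage."""
    error_power = sum(abs(r - t) ** 2 for r, t in zip(received, reference))
    reference_power = sum(abs(t) ** 2 for t in reference)
    return 100.0 * math.sqrt(error_power / reference_power)

def rce_db(received, reference):
    """Relative constellation error: the same ratio expressed in dB."""
    return 20.0 * math.log10(evm_percent(received, reference) / 100.0)

def evm_percent_blockwise(received, reference, block_len):
    """Average of per-block EVM values (assumed arithmetic mean)."""
    starts = range(0, len(reference), block_len)
    values = [evm_percent(received[s:s + block_len], reference[s:s + block_len])
              for s in starts]
    return sum(values) / len(values)

# Known QPSK symbols and a received copy with a 10% amplitude error.
qpsk = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j] * 6
rx = [1.1 * s for s in qpsk]

print(evm_percent(rx, qpsk), rce_db(rx, qpsk), evm_percent_blockwise(rx, qpsk, 12))
```

For this synthetic example both methods give the same result because the error is uniform across blocks.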

One method of measuring system-level performance is carried out by adding interference in the presence of AWGN at some SNR. The degradation in EVM that occurs after adding the interference indicates the degradation in the in-channel SNR. If the degradation is small, the interference level can be increased until the in-channel SNR degrades by 3 dB. This gives an indication of how much margin is provided over the requirements for a given level of degradation.

Simulation Results

The simulation results presented focus on the 4x5 MHz configuration because it represents the worst case scenario. In particular, the multiple 5-MHz carriers act as additional interference sources relative to each carrier of interest. Five performance tests are performed: Wideband ACS, Narrowband ACS, In-Band Blocking, Wideband Intermod, and Narrowband Intermod. In addition, the cases of no interference, with and without AWGN, are examined to establish the baseline performance.

No Noise and No Interference

This section examines the performance of the receive uplink in the absence of both interference and noise.

Figure 57 shows a plot of the power spectral density at the DDC input. The noise floor level of -80 dB is consistent with the input quantization of 14 bits. The one-sided spectrum is shown to highlight that the input signal is real. The plot on the negative side of the frequency domain is the reflection of the image on the positive side. The multi-carrier waveform is centered at 30.72 MHz, which is one fourth of the sample rate of 122.88 Msps.

Figure 58 shows a plot of the signal constellation at the transmit precoder input overlaid with the constellation at the inverse transform precoder output. In all cases, the test signal is QPSK as specified in [Ref 1] and [Ref 2].

Table 39 lists the EVM and RCE for each carrier. The EVM values are within the 1% requirement that is specified in Table 2, page 8.


Figure 57: PSD of DDC Input with No Noise and No Interference

Table 39: SC-FDMA Parameters for Uplink

Bandwidth (MHz)   Sample Rate (Msps)   FFT Size   Number of Subcarriers
5                 7.68                 512        300
10                15.36                1024       600
15                23.04                1536       900
20                30.72                2048       1200

[Figure 57 plot, titled "PSD of DDC Input Signal": x-axis Frequency (MHz), 0 to 60; y-axis dB, -100 to 0]

Noise Only

In this section, the performance of the receive uplink is examined in the presence of AWGN but in the absence of other interference.

Figure 59 shows a plot of the power spectral density at the DDC input. The noise level is set to achieve an in-channel SNR of 11.9 dB.


Figure 58: Transmitted and Received Constellation with No Noise and No Interference





Figure 60 shows a plot of the signal constellation at the transmit precoder input overlaid with the constellation at the inverse transform precoder output.


Figure 59: PSD of DDC Input for Noise Only Case


Figure 60: Transmitted and Received Constellation for Noise Only Case

[Figure 59 plot axes: x-axis Frequency (MHz); y-axis dB]

Table 40 lists the EVM and RCE for each carrier. The EVM values are consistent with the in-channel SNR setting of 11.9 dB.

Wideband ACS

The wideband ACS test measures the performance of the receive uplink in the presence of a wideband adjacent channel as described in “Adjacent Channel Selectivity,” page 45.

Figure 61 illustrates the input spectrum for this test. Table 41 lists the performance results.

The minimum interference level is 44 dB higher than each channel of interest. At this level, there is virtually no degradation in performance. The next step in the test is to increase the interference power level until some predetermined level of degradation (for example 3 dB) is seen. The second row in Table 41 shows that the DDC can handle an additional 26 dB of interference power at the cost of 2.6 dB in RCE degradation.

Table 40: Performance Data for Noise Only Case

In-Channel SNR (dB)   Interference Relative Power (dB)   EVM (%)                  RCE (dB)
11.9                  -Inf                               25.4, 25.3, 25.3, 25.5   -11.9, -11.9, -11.9, -11.9


Figure 61: PSD of DDC Input for Wideband ACS Test

Table 41: Performance Data for Wideband ACS Test

In-Channel SNR (dB)   Interference Relative Power (dB)   EVM (%)                  RCE (dB)
11.9                  44                                 25.4, 25.3, 25.4, 25.5   -11.9, -11.9, -11.9, -11.9
11.9                  44 + 26                            33.6, 33.4, 33.7, 34.1   -9.5, -9.5, -9.5, -9.3

[Figure 61 plot axes: x-axis Frequency (MHz); y-axis dB]

Narrowband ACS

The narrowband ACS test measures the performance of the receive uplink in the presence of a narrowband adjacent channel as described in “Adjacent Channel Selectivity,” page 45.


The minimum interference level is 47 dB higher than each channel of interest. At this level, there is virtually no degradation in performance. Less than 3 dB of degradation is seen at an interference level as high as 47 + 24 = 71 dB.

Figure 62: PSD of DDC Input for Narrowband ACS Test

Table 42: Performance Data for Narrowband ACS Test

In-Channel SNR (dB)   Interference Relative Power (dB)   EVM (%)                  RCE (dB)
11.9                  47                                 25.4, 25.3, 25.4, 25.5   -11.9, -11.9, -11.9, -11.9
11.9                  47 + 24                            35.3, 35.0, 35.5, 35.3   -9.1, -9.1, -9.0, -9.1

[Figure 62 plot axes: x-axis Frequency (MHz); y-axis dB]

In-Band Blocking

The in-band blocking test measures the performance of the receive uplink in the presence of a blocker as described in the Blocking Characteristics section.


The minimum interference level is 53 dB higher than each channel of interest. At this level, there is approximately 0.1 dB degradation in performance. Less than 3 dB of degradation is seen at an interference level as high as 53 + 18 = 71 dB.


Figure 63: PSD of DDC Input for In-Band Blocking Test

Table 43: Performance Data for In-Band Blocking Test

In-Channel SNR (dB)   Interference Relative Power (dB)   EVM (%)                  RCE (dB)
11.9                  53                                 25.6, 25.5, 25.5, 25.7   -11.9, -11.9, -11.9, -11.8
11.9                  53 + 18                            35.5, 35.1, 35.4, 35.6   -9.0, -9.1, -9.0, -9.0

[Figure 63 plot axes: x-axis Frequency (MHz); y-axis dB]

Wideband Intermod

The wideband intermod test measures the performance of the receive uplink in the presence of interference from wideband intermodulation as described in “Intermodulation Characteristics,” page 46.

The input spectrum for this test is illustrated in Figure 64 and the performance results are listed in Table 44.

The minimum interference level is 44 dB higher than each channel of interest. At this level, there is virtually no degradation in performance. Less than 3 dB of degradation is seen at an interference level as high as 44 + 22 = 66 dB.


Figure 64: PSD of DDC Input for Wideband Intermod Test

Table 44: Performance Data for Wideband Intermod Test

In-Channel SNR (dB)   Interference Relative Power (dB)   EVM (%)                  RCE (dB)
11.9                  44                                 25.5, 25.4, 25.4, 25.5   -11.9, -11.9, -11.9, -11.9
11.9                  44 + 22                            34.9, 35.3, 34.9, 34.9   -9.2, -9.1, -9.1, -9.2

[Figure 64 plot axes: x-axis Frequency (MHz); y-axis dB]

Narrowband Intermod

The narrowband intermod test measures the performance of the receive uplink in the presence of interference from narrowband intermodulation as described in the Intermodulation Characteristics section. The input spectrum for this test is illustrated in Figure 65 and the performance results are listed in Table 45.

The minimum interference level is 44 dB higher than each channel of interest. At this level, there is virtually no degradation in performance. Less than 3 dB of degradation is seen at an interference level as high as 44 + 22 = 66 dB.

Figure 65: PSD of DDC Input for Narrowband Intermod Test

Table 45: Performance Data for Narrowband Intermod Test

In-Channel SNR (dB)   Interference Relative Power (dB)   EVM (%)                  RCE (dB)
11.9                  44                                 25.5, 25.4, 25.4, 25.5   -11.9, -11.9, -11.9, -11.9
11.9                  44 + 22                            34.7, 34.7, 35.0, 34.7   -9.2, -9.2, -9.1, -9.2

[Figure 65 plot axes: x-axis Frequency (MHz); y-axis dB]

Implementation

The DDC was implemented using Xilinx® System Generator version 10.1.2 with a heavy reliance on the FIR Compiler v4.0 [Ref 11]. The top-level DDC architecture uses a configurable subsystem to select from one of seven unique architectures that are stored in a Simulink library (see reference [Ref 12] for a description of using configurable subsystems in System Generator designs). The configuration choice is selected by right clicking on the top-level component and scrolling down to the Block Choice menu item as illustrated in Figure 66.

The seven architectures are based on the structures described in the “Digital Up Converter Architecture” section. For example, the System Generator block diagram for the 1x5 MHz configuration is shown in Figure 67. As a second example, the System Generator block diagram for the 4x5 MHz configuration is shown in Figure 68.


Figure 66: System Generator Block Diagram of DDC Top Level


Figure 67: System Generator Block Diagram of DDC for 1x5 MHz Configuration

[Figure 67 block diagram: Fs_4_Mixer, HB3, HB2, HB1, and Channel_Filter blocks, each with din, nd, and rst inputs and dout and rdy outputs. Top-level inputs: din, nd, rst. Top-level outputs: dout, rdy, chan. Signal formats shown include Fix_14_13, Fix_17_16, and Fix_18_17.]

In most cases, the data flow from one module to the next is handled with a single TDM data signal along with an output data ready (rdy) that is used as an enable, or new data (nd), for the next downstream module. The main exception to this is in the 4x5 MHz configuration where two TDM streams are used at the output of the multi-carrier mixer in order to simplify the enabling of the subsequent filter stages. In particular, the output of the 4-carrier mixer is at a sample rate of 30.72 Msps per complex channel. Given there are four carriers, each with an I and Q channel, the sample rate of a single TDM signal would be 245.76 Msps, which would require a non-uniform enable pattern (high for two clocks, low for one clock). This leads to a bursty enable pattern at the output of the next filter stage. To keep the enables uniform, it is simpler in this case to use two TDM signals each at a sample rate of 122.88 Msps and then recombine into a single TDM signal after decimating by 2.
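The enable-pattern argument above is simple rate arithmetic, sketched here for illustration. The 368.64 MHz system clock is assumed from the expected clock rate quoted elsewhere in this document; the function names are hypothetical.

```python
from fractions import Fraction

# Assumed system clock: 368.64 MHz (the expected clock rate of the design).
CLOCK_MHZ = Fraction(36864, 100)

def clocks_per_sample(tdm_rate_msps):
    """Number of system clocks available per TDM sample."""
    return CLOCK_MHZ / tdm_rate_msps

def has_uniform_enable(tdm_rate_msps):
    """An enable can be periodic only if clocks per sample is an integer."""
    return clocks_per_sample(tdm_rate_msps).denominator == 1

per_channel = Fraction(3072, 100)   # 30.72 Msps per I or Q channel
single_tdm = per_channel * 4 * 2    # four carriers, I and Q: 245.76 Msps
dual_tdm = single_tdm / 2           # two TDM streams: 122.88 Msps each

print(clocks_per_sample(single_tdm))  # 3/2, so the enable must be high 2 of every 3 clocks
print(clocks_per_sample(dual_tdm))    # 3, so the enable is high once every 3 clocks
```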

The data path quantization at the output of each module is maintained at 17 bits except the final stage where it grows to 18 bits. Symmetric rounding is used in all cases with the exception of the fractional resampler in the 1x15 MHz configuration where a simple round is used to save some hardware (one DSP48E).

The remainder of this section describes the implementation of each module in more detail with an emphasis on the handcrafted portions of the design. Because FIR Compiler v4.0 was used, the description of the filter designs is reduced to a listing of GUI parameter settings.

Fs/4 Mixer + Halfband Decimator

The first stage mixer and halfband decimator are contained in the Fs_4_Mixer module. This module is common to all seven DDC configurations. The System Generator block diagram of the design is shown in Figure 69.


Figure 68: System Generator Block Diagram of DDC for 4x5 MHz Configuration

[Figure 68 block diagram: Fs_4_Mixer, HB1, 4C_Mixer, HB1_8CH, HB2_2x4CH, and Channel_Filter blocks. The 4C_Mixer has din, nd, chan_in, and rst inputs and dout_i, dout_q, and rdy outputs; HB2_2x4CH has separate din_i and din_q inputs. Top-level inputs: din, nd, rst. Top-level outputs: dout, rdy, chan. Signal formats shown include Fix_14_13, Fix_17_16, and Fix_18_17.]


Figure 69: System Generator Block Diagram of Fs_4_Mixer

[Figure 69 block diagram: Modulate, EvenOddDemux (outputs dout_odd and dout_even), HB4, Delay, and TDM (inputs din_i and din_q) blocks. Top-level inputs: din, nd, rst. Top-level outputs: dout, rdy. Signal formats shown include Fix_14_13, Fix_15_13, and Fix_17_16.]

The input to the Fs_4_Mixer design is a real signal with 14 bits of precision and a sample rate of 122.88 Msps. The combination of the Modulate and EvenOddDemux blocks produces the ±x0(n) and –(±x1(n)) sequences described in the “Digital Up Converter Architecture” section. The modulated even samples, ±x0(n), are filtered by the halfband decimation filter labeled HB4. The delayed modulated odd samples are time division multiplexed with the filtered modulated even samples into a single output stream at a sample rate of 122.88 Msps (3 clocks per sample).
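The reason the mixer needs no multipliers can be seen in a small behavioral sketch. It assumes the down-conversion uses the rotation exp(-j*pi*n/2), which cycles through 1, -j, -1, +j; the sign convention and the function names are assumptions for illustration and this is not the bit-true model.

```python
def fs4_mix(x):
    """Mix a real sequence by Fs/4 using only sign changes.

    Multiplying x[n] by exp(-j*pi*n/2) leaves the even samples purely real
    (alternating sign) and the odd samples purely imaginary (alternating
    sign), so I and Q each carry half of the input samples.
    """
    i_out = [x[n] if (n // 2) % 2 == 0 else -x[n] for n in range(0, len(x), 2)]
    q_out = [-x[n] if (n // 2) % 2 == 0 else x[n] for n in range(1, len(x), 2)]
    return i_out, q_out

def fs4_mix_reference(x):
    """Direct complex multiply, used only to check the shortcut above."""
    rotation = [1, -1j, -1, 1j]
    y = [x[n] * rotation[n % 4] for n in range(len(x))]
    return [y[n].real for n in range(0, len(x), 2)], [y[n].imag for n in range(1, len(x), 2)]

samples = [1, 2, 3, 4, 5, 6, 7, 8]
print(fs4_mix(samples))
```

In this sketch the I output is the sign-alternated even samples and the Q output is the negated, sign-alternated odd samples, which matches the form of the sequences described in the text.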

As with all filters in the design, the HB4 filter is implemented using the FIR Compiler 4.0. As shown in Figure 70, the HB4 module consists of a single FIR Compiler 4.0 instance followed by some conditioning of the output signal quantization. In particular, the signal is compressed by one MSB to remove unnecessary headroom. The quantization and scaling of the signal is such that a full scale input sine wave with 14 bits of precision is mapped to a half scale sine wave with 17 bits of precision. This initial gain of 0.5 provides protection from overflow in the downstream decimation filters that have unity passband gain and the channel filter that has a gain of 1.637.

Figure 71 shows screen shots of the FIR Compiler 4.0 GUI with settings for the HB4 module. For this particular filter, the coefficients are the even coefficients of the HB4 halfband filter and are contained in the vector h0_hb4. For all filters in the DDC design, the number of coefficient sets is set to 1 (constant coefficient filters). In this design, the filter is a single channel and single rate filter with 6 clocks per input sample. The filter architecture is specified to exploit coefficient symmetry and to use a systolic multiply accumulate structure. In all cases, the DDC filters use 18-bit signed coefficients that are specified in an integer format. Further descriptions of the GUI parameters are given in the FIR Compiler 4.0 documentation [Ref 11].


Figure 70: System Generator Block Diagram of HB4 Module

[Figure 70 block diagram: a FIR Compiler 4.0 block (ports din, nd, rst, dout, rfd, rdy) followed by reinterpret and cast blocks. Annotation: "Remove 1 MSB to reduce excessive head room." Signal formats shown: Fix_15_13, Fix_18_0, Fix_18_16, and Fix_17_16.]

Decimation and Channel Filtering

This section summarizes the FIR Compiler 4.0 settings for each filter used in each of the seven configurations. In all cases, the number of coefficient sets is equal to one and the reloadable coefficient option is disabled. Also, all filter architectures are based on the systolic multiply accumulate structure with 18-bit signed coefficients. In most cases, the optimization goal is left at the default setting of “Area”; the two exceptions are all filters in the 1x15 MHz configuration and the channel filter in the 4x5 MHz configuration. Also, the data buffer type and coefficient buffer type are left at the default setting of “Automatic” in all cases except for the 4x5 MHz channel filter, where they are set to “Block.” Finally, all filters in the design use the “rst” and “nd” control options.


Figure 71: Screen Shots of FIR Compiler 4.0 GUI


Parameter                 1st Halfband   2nd Halfband   3rd Halfband   Channel Filter
Coefficients              h_hb3          h_hb2          h_hb1          h_channel_filter
Filter Type               Decimation     Decimation     Decimation     Single_Rate
Decimation Rate Value     2              2              2              1
Number of Channels        2              2              2              2
                          3              6              12             24
Coefficient Structure     Half_Band      Half_Band      Half_Band      Symmetric
Data Buffer Type          Automatic      Automatic      Automatic      Automatic
Coefficient Buffer Type   Automatic      Automatic      Automatic      Automatic


Parameter 1st Halfband 2nd Halfband Channel Filter

Coefficients h_hb2 h_hb1 h_channel_filter

Filter Type Decimation Decimation Single_Rate


Decimation Rate Value 2 2 1



Coefficient Structure Half_Band Half_Band Symmetric



Data Buffer Type Automatic Automatic Automatic

Coefficient Buffer Type Automatic Automatic Automatic


Parameter Halfband Resampler Channel Filter

Coefficients h_hb1_1x15 h_resampler_1x15 h_channel_filter


Rate Change Type Integer Fixed_Fractional Integer

Interpolation Rate Value 1 3 1




Coefficient Structure Half_Band Inferred Symmetric


Optimization Goal Speed Speed Speed










Parameter Halfband Channel Filter

Coefficients h_hb1 h_channel_filter

Filter Type Decimation Single_Rate

Rate Change Type Integer Integer

Decimation Rate Value 2 1

Number of Channels 2 2


3 6

Coefficient Structure Half_Band Symmetric

Output Width 18 20

Optimization Goal Area Area

Data Buffer Type Automatic Automatic

Coefficient Buffer Type Automatic Automatic





Rate Change Type          Integer        Integer        Integer        Integer
Decimation Rate Value     2              2              2              1
Number of Channels        2              2              4              4
                          3              6              6              12
Coefficient Structure     Half_Band      Half_Band      Half_Band      Symmetric
Optimization Goal         Area           Area           Area           Area
Data Buffer Type          Automatic      Automatic      Automatic      Automatic
Coefficient Buffer Type   Automatic      Automatic      Automatic      Automatic


Parameter Halfband Resampler Channel Filter





Parameter                 1st Halfband   2nd Halfband   Channel Filter
Coefficients              h_hb1          h_hb1          h_channel_filter
Coefficient Structure     Half_Band      Half_Band      Symmetric

Parameter                 1st Halfband   2nd Halfband(1)   3rd Halfband   Channel Filter
Decimation Rate Value     2              2                 2              1
                          3              3                 3              6
Coefficient Structure     Half_Band      Half_Band         Half_Band      Symmetric
Optimization Goal         Area           Area              Area           Speed
Data Buffer Type          Automatic      Automatic         Automatic      Block
Coefficient Buffer Type   Automatic      Automatic         Automatic      Block

Notes:
1. The second halfband filter for this case is implemented using two 4-channel filters, one for the I channel and one for the Q channel. The parallel filters are followed by a TDM circuit to recombine the I and Q channels into a single stream.

Multi-Carrier Mixing

This section describes the implementation of the multi-carrier mixers that are used in the 4x5, 2x5, and 2x10 MHz configurations. An emphasis will be placed on the four-carrier mixer that is used in the 4x5 MHz configuration. The two-carrier mixer used in the 2x5 and 2x10 MHz configurations is a modification of the four-carrier mixer.

Four-Carrier Mixer

The System Generator block diagram of the four-carrier mixer is shown in Figure 72. The input data arrives in a TDM fashion at a sample rate of 61.44 Msps. The IQ_Demux module splits the I and Q channels into two parallel data streams to simplify the handling of the downstream complex multiply. The cosine and sine waveforms at each carrier frequency are generated in the four-carrier DDS module, DDS_4C.

The System Generator block diagram of the four-carrier DDS is shown in Figure 73. The new data (nd) pulse arrives at a sample rate of 30.72 Msps (high once every 12 clock cycles). The pulses module generates four output pulses for every input pulse, where the output pulses are high once per three clock cycles. This signal is then used as the enable for the freq_tdm module. The freq_tdm module contains a counter and a four-input mux to select between the four different carrier frequency values (see Figure 74). The frequency values of 1161, 1411, 125, and 375 correspond to center frequencies of -7.5, -2.5, 2.5, and 7.5 MHz, respectively.
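The four frequency values can be reproduced from the center frequencies under one assumption: that a phase step of 1536 corresponds to one full cycle at the 30.72 Msps mixer rate, which is consistent with the mod 1536 accumulator. The helper name is hypothetical.

```python
TABLE_LEN = 1536   # phase steps per cycle (assumed from the mod 1536 accumulator)
FS_MHZ = 30.72     # per-carrier sample rate at the mixer

def freq_word(f_mhz):
    """Phase increment per sample; negative frequencies wrap modulo 1536."""
    return round(f_mhz / FS_MHZ * TABLE_LEN) % TABLE_LEN

words = [freq_word(f) for f in (-7.5, -2.5, 2.5, 7.5)]
print(words)
```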

The time-division multiplexed frequency values drive the input to the frequency-to-phase accumulator, freq_accum. The frequency-to-phase accumulator contains four stages of enabled registers that follow a mod 1536 adder (see Figure 75). Each enabled register stores the accumulated phase for a given carrier. The output of the adder is registered to improve timing without breaking functionality; the result of the pipeline register is stable before the next register is enabled. The mod 1536 function is efficiently implemented using an 8 deep by 2 bit wide lookup table to map the three input MSBs into two output MSBs. The nine LSBs at the output are equal to the nine LSBs at the input. The 11-bit phase output is used to address the sine/cosine lookup table.
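Because 1536 = 3 x 512, reducing a 12-bit sum modulo 1536 only requires reducing its three MSBs modulo 3. The following behavioral sketch illustrates that idea together with a four-carrier accumulator; it is not the System Generator implementation, and the names are illustrative.

```python
# 8-deep, 2-bit-wide table: value of the three MSBs, reduced modulo 3.
MOD3_LUT = [0, 1, 2, 0, 1, 2, 0, 1]

def mod1536(x):
    """Reduce a 12-bit unsigned value modulo 1536 with one small lookup."""
    msb3 = (x >> 9) & 0x7          # three MSBs index the table
    lsb9 = x & 0x1FF               # nine LSBs pass through unchanged
    return (MOD3_LUT[msb3] << 9) | lsb9

def accumulate(freq_words, n_steps):
    """Advance one phase register per carrier for n_steps samples."""
    phases = [0] * len(freq_words)
    for _ in range(n_steps):
        for k, word in enumerate(freq_words):
            phases[k] = mod1536(phases[k] + word)  # sum stays below 4096
    return phases

print(accumulate([1161, 1411, 125, 375], 3))
```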


Figure 72: System Generator Block Diagram of Four-Carrier Mixer

[Figure 72 block diagram: IQ_Demux (inputs din, nd, chan_in, rst; outputs dout_i, dout_q, rdy), DDS_4C (inputs nd, rst; outputs cos_out, sin_out, rdy), two z^-9 delays, and ComplexMult (inputs a0, a1, b0, b1, nd; outputs dout_i, dout_q, rdy). Annotations: "latency of 7 clks" and "latency of 9 clks". Top-level inputs: din, nd, chan_in, rst. Top-level outputs: dout_i, dout_q, rdy.]


Figure 73: System Generator Block Diagram of Four-Carrier DDS

[Figure 73 block diagram: pulses, freq_tdm, freq_accum, and sin_cos_lut blocks. Top-level inputs: nd, rst. Top-level outputs: cos_out, sin_out, rdy. Annotations: "Note: rst must be held high for a minimum of 10 clock cycles to flush pipe delays" and "4 clk latency". Signal formats shown: UFix_11_0 and Fix_18_17.]

The System Generator block diagram of the sine/cosine lookup table is shown in Figure 76. The cosine address is mapped from 11 bits to 10 bits to exploit symmetry of the cosine wave and to reduce memory usage. The sine wave output is obtained from the stored cosine wave by offsetting the phase address by one fourth the length of the address space. The offset in the address results in a 90 degree phase shift in the cosine wave, which gives the desired sine wave.

The contents of the 1K deep block RAM are initialized to the first 1023 points of the vector given by round(2^16*cos(2*pi*(0.5 + (0:1535))/1536)), although only 768 of these points are used. Note that the phase of the cosine wave is shifted by half a quantile in order to obtain a waveform where half-wave symmetry is easily utilized. Because only one half of the cosine wave is stored, the entire sine/cosine lookup table requires only one 18K block RAM.
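The effect of the half-sample phase shift can be checked numerically. The sketch below stores only the first 768 points and recovers the rest by mirroring the address, because cos(2*pi*(0.5 + n)/1536) takes the same value at n and at 1535 - n. The mirroring rule is an assumption about how the address mappers could work; the text does not spell out their logic.

```python
import math

N = 1536            # points per full cosine cycle
HALF = N // 2       # 768 stored points
QUARTER = N // 4    # 384-point offset, a 90 degree phase shift

# Stored half wave, using the initialization expression quoted in the text.
COS_HALF = [round(2**16 * math.cos(2 * math.pi * (0.5 + n) / N)) for n in range(HALF)]

def cos_lookup(phase):
    """Cosine sample for any phase address, using only the stored half wave."""
    phase %= N
    addr = phase if phase < HALF else N - 1 - phase   # mirrored address (assumed mapper)
    return COS_HALF[addr]

def sin_lookup(phase):
    """Sine sample obtained by offsetting the cosine address by a quarter cycle."""
    return cos_lookup(phase - QUARTER)

print(cos_lookup(0), sin_lookup(384))
```

With the half-sample shift, the mirrored lookup needs no sign change, which is one way to read the remark that the symmetry is "easily utilized".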


Figure 74: System Generator Block Diagram of TDM Circuit for Carrier Frequencies

[Figure 74 block diagram: a Counter (inputs rst, en) drives the sel input of a four-input Mux (d0 to d3) that selects among the constants 125, 1411, 1161, and 375, followed by registers. Annotations: "order of mux inputs is rotated because the first mux output is d1 (counter is 1 after first en)" and "must have reset". Inputs: en, rst. Outputs: out, rdy. Signal formats shown: UFix_11_0 and UFix_2_0.]


Figure 75: System Generator Block Diagram of Frequency-to-Phase Accumulator

[Figure 75 block diagram: an AddSub block (a + b) feeds a mod 1536 block, followed by a pipeline register and four enabled registers; a z^-4 delay on the enable path is annotated "pipe delay that matches latency of sin_cos_lut". Inputs: freq, en, rst. Outputs: phase, rdy. Signal formats shown: UFix_11_0 and UFix_12_0.]



The sine and cosine outputs of the four-carrier DDS update once per three clock cycles. The complex multiply generates [xI(n)×cos(ωk(n)) + xQ(n)×sin(ωk(n))] for the I channel output and [xQ(n)×cos(ωk(n)) – xI(n)×sin(ωk(n))] for the Q channel output for each carrier (k = 0, 1, 2, 3). Including rounding, each calculation requires three clock cycles per carrier. Therefore, one DSP48E is needed for the I channel processing and one DSP48E is needed for the Q channel processing.
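The two bracketed expressions are the real and imaginary parts of the product of the complex input with a negative-frequency rotation, exp(-j*theta). The sketch below checks that identity in floating point; theta stands for the DDS phase of one carrier, and the function name is illustrative.

```python
import cmath
import math

def mix(x_i, x_q, theta):
    """Rotate (x_i + j*x_q) by exp(-j*theta) using four real multiplies."""
    c, s = math.cos(theta), math.sin(theta)
    out_i = x_i * c + x_q * s      # I channel output
    out_q = x_q * c - x_i * s      # Q channel output
    return out_i, out_q

out_i, out_q = mix(0.3, -0.7, 1.1)
expected = complex(0.3, -0.7) * cmath.exp(-1j * 1.1)
print(out_i, out_q, expected)
```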

Two-Carrier Mixer

The two-carrier mixer design is similar to the four-carrier mixer design with the following key differences:

1. The sine/cosine lookup table is based on 768 samples of a full cosine wave. In particular, the ROM contents are initialized to round(2^16*cos(2*pi*(0:767)/768)). This eliminates the need for the half-wave address mappers.

2. The number of clocks per sample is higher, which allows the complex multiply to be implemented using a single DSP48E.

3. The frequency-to-phase accumulator has only two stages of registers.

4. The modulo arithmetic is based on an address depth of 768 instead of 1536.

Resource Utilization Summary

This section lists the required resources and maximum clock frequency for each configuration. The results were obtained using the Xilinx ISE 10.1.02i software.

Resource Utilization for Downlink

The required resources and maximum clock frequency for the DUC+CFR are listed in Table 53.


Figure 76: System Generator Block Diagram of Sine/Cosine Lookup Table

[Figure 76 block diagram: the phase input drives two address mappers (mapper and mapper1), one of them through an Offset block, into a dual-port BRAM (addra, addrb, doa, dob) that produces cos_out and sin_out. Signal formats shown: UFix_11_0, UFix_10_0, and Fix_18_17.]

Table 53: Resource Utilization Summary for Downlink Design (DUC+CFR)

Configuration   DSP48s   RAMB36K   RAMB18K   FFs    LUTs   Slices   Fmax (MHz)
1x5             20       6         2         3286   3055   1016     419.6
1x10            21       6         2         3294   2952   1061     424.6
1x15            30       6         2         3422   3013   1144     418.6
1x20            26       6         2         3391   2993   1134     412.5
2x5             27       7         3         4141   3907   1266     415.8
2x10            40       7         3         4400   3812   1443     412.5
4x5             45       7         2         4434   4092   1559     407.5



For all cases, the gain control block and power meter option are checked, so the numbers reported in Table 53 are the maximum possible resources for that particular design. The resource numbers, especially the DSP48 count, are lower if the gain controller or power meter is not needed.

In the 1x5, 1x10, 1x15, and 2x10 configurations, a global clock period constraint of 2.4 ns was set. In the 1x20 case, a single clock period constraint of 2.45 ns was used. The 2x5 and 4x5 cases have a constraint of 2.48 ns. Since the CFR portion of the design is qualified by the ce_3 signal, a ce_3 group with a clock period constraint of 3 times longer than the global clock period was automatically generated by the tool. In addition, the Place and Route (PAR) effort level was set to high with an extra effort level of normal. The 2x10 case has a special XST fanout setting of 50 in order to achieve the required speed, while the XST default setting was used for all other cases. Finally, area groups were used to constrain the placement of the design to a specific region of the device. The design files include the user constraint files that contain the area group specifications for each configuration. Here is an example (for 1x5 configuration) of the area constraint in the user constraints file (UCF):

# Area Constraints
INST "downlink_design_x0/lte_dfe_downlink*/duc_configurable_subsystem*" AREA_GROUP = "AG_duc";
AREA_GROUP "AG_duc" RANGE = SLICE_X0Y40:SLICE_X35Y59,RAMB36_X0Y8:RAMB36_X2Y11,DSP48_X0Y16:DSP48_X3Y23;
INST "downlink_design_x0/lte_dfe_downlink*/pc_cfr*" AREA_GROUP = "AG_cfr";
AREA_GROUP "AG_cfr" RANGE = SLICE_X0Y60:SLICE_X35Y79,RAMB36_X0Y12:RAMB36_X2Y15,DSP48_X0Y24:DSP48_X3Y31;

Resource Utilization for Uplink

The required resources and maximum clock frequency for the DDC are listed in Table 54. In all cases, a single clock period constraint of 2.4 ns was set. Also, the XST option “equivalent_register_removal” was turned off. This was done to prevent fanout issues on the synchronous reset. In addition, the PAR effort level was set to high with an extra effort level of normal. Finally, area groups were used to constrain the placement of the design to a specific region of the device. The design files include the user constraint files that contain the area group specifications for each configuration.

Power Consumption

The dynamic power of each design was measured using a Virtex-5 FF1136 Prototype Board populated with an XC5VSX50T-1 FPGA. Static power was not measured because it is dependent on the density of the target device. Also, the I/O power was not measured because the design is not meant to be a stand-alone chip solution; the design inputs and outputs are expected to interface to other logic within the FPGA. All measurements were performed at room temperature.

Table 54: Resource Utilization Summary for DDC

Configuration DSP48s RAMB18K FFs LUTs Slices Fmax (MHz)

1x5 8 0 1543 1429 547 431.9

1x10 10 0 1442 1220 481 432.7

1x15 16 0 1932 1424 590 422.8

1x20 16 0 1614 1387 507 422.2

2x5 12 1 1988 1787 781 421.4

2x10 20 1 2226 1868 823 423.7

4x5 23 16 3124 2407 1105 421.0



Power Consumption for Downlink

For the DUC+CFR design, a hardware test bench was created to provide stimulus to the ports of the design. The data inputs were driven by an internal block RAM based pattern generator that contains 2048 points of representative LTE baseband data in a TDM format. An 11 bit counter was used to cyclically address the contents of the block RAMs. Also, some logic was used to create the input control signals. The power consumed by the internal pattern generator is assumed to be a small fraction of the power consumed by the design under test.

The dynamic power was measured as a function of frequency by stepping the external clock generator through a range of settings and measuring the current on the VCCINT supply at each step. Figure 77 shows a plot of the dynamic power versus frequency for each configuration. Table 55 lists the dynamic power requirements at the expected clock rate of 368.64 MHz.

Figure 77: Dynamic Power versus Frequency of LTE DUC/CFR

[Figure 77 plot, titled "Power Consumption of DUC/CFR": x-axis Frequency (MHz), 250 to 400; y-axis Dynamic Power (W), 0.2 to 1; one curve each for the 1x5, 1x10, 1x15, 1x20, 2x5, 2x10, and 4x5 MHz configurations]

Table 55: Dynamic Power of LTE DUC/CFR

Carrier Configuration   Dynamic Power @ 368.64 MHz (W)
1x5 MHz                 0.477
1x10 MHz                0.508
1x15 MHz                0.624
1x20 MHz                0.606
2x5 MHz                 0.665
2x10 MHz                0.830
4x5 MHz                 0.894

Power Consumption for Uplink

A hardware test bench was created to provide stimulus to the design using an internal pattern generator. A single 18K block RAM was used to store 1024 points of representative input data for each configuration. The data generated for each configuration contains the specified number of carriers at an SNR corresponding to the reference sensitivity level. A 10-bit counter is used to cyclically address the block RAM contents. The power consumed by the internal pattern generator is assumed to be a small fraction of the power consumed by the design under test.

Similar to the downlink, the dynamic power was measured as a function of frequency by stepping the external clock generator through a range of settings and measuring the current on the VCCINT supply at each step. Figure 78 shows a plot of the dynamic power versus frequency for each configuration. Table 56 lists the dynamic power requirements at the expected clock rate of 368.64 MHz.

Figure 78: Dynamic Power versus Frequency of LTE DDC

[Figure 78 plot, titled "Power Consumption of LTE DDC": x-axis Frequency (MHz), 200 to 400; y-axis Dynamic Power (W), 0.1 to 0.9; one curve each for the 1x5, 1x10, 1x15, 1x20, 2x5, 2x10, and 4x5 MHz configurations]

Table 56: Dynamic Power of LTE DDC

Carrier Configuration   Dynamic Power @ 368.64 MHz (W)
1x5 MHz                 0.221
1x10 MHz                0.252
1x15 MHz                0.371
1x20 MHz                0.360
2x5 MHz                 0.351
2x10 MHz                0.463
4x5 MHz                 0.758

Interface Requirements

This section describes the interface of the DUC+CFR and DDC designs along with timing diagrams of the interface signals. In the downlink design, some signals operate at the system clock rate and others operate at the 122.88 Msps sample rate. The timing to the DUC module is based on the 368.64 MHz system clock, while the timing to the CFR module and from both modules is based on the 122.88 MHz sampling rate. In both designs, the timing to and from the designs is based on a single clock domain. All interface signals are required to be synchronous to their respective domains.

Downlink Interface Description

The top-level interface of the downlink design is illustrated in Figure 79 and the port descriptions are listed in Table 57. The number formats shown in Figure 79 use the notation from System Generator. For example, Fix_16_15 represents a two’s complement signed number that is quantized to 16 bits (includes the sign bit) with 15 bits of fraction. As a second example, the format UFix_2_0 represents an unsigned number that is quantized to 2 bits with 0 bits of fraction. For ports that list two number formats, such as UFix_16_15 (sc) / UFix_16_14 (mc), the data is in UFix_16_15 format for single-carrier configurations and in UFix_16_14 format for multi-carrier configurations.
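As an illustration of this notation, the following sketch converts a raw port value to its real-number interpretation. The helper name and argument order are hypothetical and are not part of System Generator.

```python
def fix_to_float(raw, n_bits, n_frac, signed=True):
    """Interpret a raw integer as a Fix_<n_bits>_<n_frac> or UFix value."""
    raw &= (1 << n_bits) - 1                   # keep only the n_bits field
    if signed and raw >= 1 << (n_bits - 1):
        raw -= 1 << n_bits                     # two's complement sign extension
    return raw / (1 << n_frac)                 # place the binary point

print(fix_to_float(0x7FFF, 16, 15))                # largest Fix_16_15 value, just below 1.0
print(fix_to_float(0x8000, 16, 15))                # most negative Fix_16_15 value, -1.0
print(fix_to_float(0x8000, 16, 15, signed=False))  # the same bits read as UFix_16_15, 1.0
print(fix_to_float(3, 2, 0, signed=False))         # UFix_2_0 value, 3.0
```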

Figure 79: Downlink Top-Level Component

Table 57: Port Definitions for Downlink Design

Signal Operate On Direction Description

din[15:0] system clock Input Time division multiplexed input data.

vin system clock Input Active High signal indicating valid input data.

gain[15:0] system clock Input Gain applied to DUC.

gain_we system clock Input Write enable for gain port.

gain_ch[1:0] system clock Input Channel select for gain port.

threshold[15:0] sample rate Input Clipping threshold in units of linear magnitude.

alloc_spacing1[7:0] sample rate Input Specifies the minimum distance between peak allocations in samples for the first iteration.

[Figure 79 block diagram: DUC+CFR component. Inputs: din, vin, gain, gain_we, gain_ch, threshold, alloc_spacing1, alloc_spacing2, filter_numtaps, filter_ram_addr, filter_ram_data, filter_ram_we, reset. Outputs: duc_i, duc_q, duc_valid, power, cfr_i, cfr_q, cfr_valid, ce_3_out. Number formats shown include Fix_16_15, UFix_16_15, UFix_16_13, UFix_9_0, UFix_8_0, UFix_2_0, UFix_32_0, Bool, and the paired formats Fix_16_15 (sc) / Fix_16_14 (mc) and UFix_16_15 (sc) / UFix_16_14 (mc).]

Downlink Interface Timing

A timing diagram for the input data interface for the downlink design is shown in Figure 80. The input port, din, accepts sample rate input data in a time division multiplexed fashion. The number of clocks per input sample is a function of the channel bandwidth and the number of carriers, and is given by n = 368.64/(2*Fs*Ncarrier), where Fs is the baseband sample rate (7.68, 15.36, 23.04, or 30.72 Msps) and Ncarrier is the number of carriers. The factor of two in the denominator accounts for the fact that the complex input data is (I, Q) multiplexed. For single carrier configurations, data from the I channel of the carrier is input first, followed by that of the Q channel. For multi-carrier configurations, the ordering of the data is such that the I and Q channels of the first carrier are input first, followed by the I and Q channels of the second carrier and so on, as shown in Figure 80. The valid input pulse, vin, must be asserted High once every n clocks. Table 58 lists the value n in different configurations.
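The formula for n can be evaluated for each of the seven configurations, as sketched below. The values are computed directly from the formula and the baseband sample rates in Table 39; they are intended to be compared against Table 58 rather than to replace it. The dictionary and function names are illustrative.

```python
from fractions import Fraction

CLOCK_MHZ = Fraction(36864, 100)   # 368.64 MHz system clock

# Baseband sample rate (Msps) for each channel bandwidth (MHz), from Table 39.
FS_MSPS = {
    5: Fraction(768, 100),
    10: Fraction(1536, 100),
    15: Fraction(2304, 100),
    20: Fraction(3072, 100),
}

def clocks_per_input_sample(bandwidth_mhz, n_carriers):
    """n = 368.64 / (2 * Fs * Ncarrier); the 2 accounts for (I, Q) multiplexing."""
    return CLOCK_MHZ / (2 * FS_MSPS[bandwidth_mhz] * n_carriers)

configs = [(1, 5), (1, 10), (1, 15), (1, 20), (2, 5), (2, 10), (4, 5)]
for n_carriers, bw in configs:
    print(f"{n_carriers}x{bw} MHz: n = {clocks_per_input_sample(bw, n_carriers)}")
```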

Table 57: Port Definitions for Downlink Design (Cont’d)

| Signal | Operate On | Direction | Description |
|---|---|---|---|
| alloc_spacing2[7:0] | sample rate | Input | Specifies the minimum distance between peak allocations in samples for the second iteration. |
| filter_numtaps[8:0] | sample rate | Input | Specifies the cancellation pulse length. |
| filter_ram_addr[8:0] | sample rate | Input | Address for the filter RAM. |
| filter_ram_data[31:0] | sample rate | Input | Filter RAM data in (I,Q) pairs. The filter RAM stores the cancellation pulse coefficients. |
| filter_ram_we | sample rate | Input | Write Enable for the filter RAM. |
| reset | system clock | Input | Active High synchronous reset. |
| duc_i | sample rate | Output | I component of DUC output. |
| duc_q | sample rate | Output | Q component of DUC output. |
| duc_valid | sample rate | Output | Active High data valid signal for DUC output. |
| power | sample rate | Output | Power estimate of DUC output before CFR. |
| cfr_i | sample rate | Output | I component for CFR output. |
| cfr_q | sample rate | Output | Q component for CFR output. |
| cfr_valid | sample rate | Output | Active High data valid signal for CFR output. |
| ce_3_out | system clock | Output | Clock enable used for all sample rate signals. |


Figure 80: Timing Diagram of Input Data Interface for Downlink

[Timing diagram. Signals: clk, din, vin. The din bus carries I channel (carrier 0), Q channel (carrier 0), I channel (carrier 1), Q channel (carrier 1), I channel (carrier 2), and so on, with each value held for n clocks.]





Figure 81 illustrates the timing requirements for the input control interface for the downlink. The reset signal needs to be pulsed for only one clock cycle. The gain programming port is used to set a different gain for each carrier (default gain = 1) and can be driven by a microprocessor or connected to a bus. There are four possible addresses, 0-3, which correspond to the four complex carriers. To write a gain value, set the address gain_ch to the desired channel, present the desired data on the gain port, and assert the write enable gain_we. All writes are synchronous.

Note that the design may have a transient phase after reset that is different from the transient after configuration startup, because the data buffers in the filter modules are not cleared by reset. The length of the transient section is roughly 1500 samples for the 1x5, 2x5, and 4x5 MHz configurations, 750 samples for the 1x10 and 2x10 MHz configurations, 500 samples for the 1x15 MHz configuration, and 375 samples for the 1x20 MHz configuration.

Figure 82 shows a timing diagram for the configuration interface of the CFR module. All signals must be synchronous to the rising edge of the clock and operate at the sample rate. Circuitry driving the configuration inputs should use registers that are enabled with the ce_3_out signal to guarantee that timing to the design is met. When reloading the cancellation filter coefficients, it is also recommended to hold the reset signal High for the entire write process (that is, while the filter_ram_we signal is asserted).
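The coefficient reload sequence described above can be sketched as a list of per-cycle port values. This is an illustrative Python sketch only. The text states that filter_ram_data holds (I,Q) pairs in 32 bits but does not give the bit layout, so placing I in the upper 16 bits and Q in the lower 16 bits is an assumption, as is treating filter_numtaps as the coefficient count; the function names are hypothetical.

```python
# Sketch: generate the port values for reloading the CFR filter RAM.
# Assumption: I occupies bits [31:16] and Q occupies bits [15:0] of
# filter_ram_data. The source only states "(I,Q) pairs".
MAX_TAPS = 512  # filter_ram_addr is 9 bits wide


def pack_iq(i_val, q_val):
    """Pack two signed 16-bit integers into one 32-bit word."""
    for v in (i_val, q_val):
        if not -32768 <= v <= 32767:
            raise ValueError("coefficient does not fit in 16 bits")
    return ((i_val & 0xFFFF) << 16) | (q_val & 0xFFFF)


def filter_ram_writes(coeffs):
    """Return one dict of port values per sample rate cycle (ce_3_out High).

    coeffs is a list of (I, Q) integer pairs. Addresses run from 0 to
    numtaps-1, and reset is held High while filter_ram_we is asserted, as the
    text recommends.
    """
    numtaps = len(coeffs)
    if not 0 < numtaps <= MAX_TAPS:
        raise ValueError("number of taps must fit the 9-bit address range")
    return [
        {
            "reset": 1,
            "filter_numtaps": numtaps,  # assumption: equals coefficient count
            "filter_ram_addr": addr,
            "filter_ram_data": pack_iq(i_val, q_val),
            "filter_ram_we": 1,
        }
        for addr, (i_val, q_val) in enumerate(coeffs)
    ]


if __name__ == "__main__":
    for cycle in filter_ram_writes([(1, -1), (2, 3)]):
        print(cycle)
```

A testbench or driver would apply one entry per ce_3_out cycle and then deassert filter_ram_we and reset.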

Table 58: Period for Input Valid Signal (vin) in Downlink Configurations

| Configuration | n (clocks) |
|---|---|
| 1x5 | 24 |
| 1x10 | 12 |
| 1x15 | 8 |
| 1x20 | 6 |
| 2x5 | 12 |
| 2x10 | 6 |
| 4x5 | 6 |


Figure 81: Timing Diagram of Input Control Interface for Downlink

[Timing diagram. Signals: clk, reset, gain, gain_ch, gain_we. The gain values g0, g1, g2, and g3 are written to gain_ch addresses 0, 1, 2, and 3.]


Figure 82: Timing Diagram of CFR Configuration Interface

[Timing diagram. Signals: clk, ce_3_out, threshold, alloc_spacing1/2, filter_numtaps, filter_ram_addr, filter_ram_data, filter_ram_we. The filter_ram_addr bus steps through 0, 1, ..., numtaps-1.]





The timing diagram for the output data interface is shown in Figure 83. All the output signals (duc_i, duc_q, duc_valid, cfr_i, cfr_q, and cfr_valid) operate at the sample rate. The ce_3_out signal is the clock enable for all synchronous elements in the design that operate in the sample rate domain and is provided to facilitate handshaking to and from the design. The data inputs and outputs are registered on the rising clock edge when ce_3_out is High. The latency from the data_sync input to the data_valid output is illustrated assuming a cancellation pulse length of 255 samples.

Uplink Interface Description

The top-level interface of the DDC design is illustrated in Figure 84 and the port descriptions are listed in Table 59. The number formats shown in Figure 84 use the notation from System Generator. For example, Fix_14_13 represents a two’s complement signed number that is quantized to 14 bits (includes the sign bit) with 13 bits of fraction. As a second example, the format UFix_3_0 represents an unsigned number that is quantized to 3 bits with 0 bits of fraction.
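The number format notation can be made concrete with a short calculation. The sketch below is illustrative Python, not System Generator code. It derives the representable range and step size of a Fix_N_F or UFix_N_F format from the definition given above; the quantize helper uses round-to-nearest with saturation, which is an assumption chosen for illustration rather than the behavior of any particular block in the design.

```python
# Sketch: range and resolution of Fix_<bits>_<frac> / UFix_<bits>_<frac>.
def fix_range(total_bits, frac_bits, signed=True):
    """Return (minimum, maximum, step) for the given fixed-point format."""
    step = 2.0 ** -frac_bits
    if signed:
        # Two's complement: one bit is the sign bit.
        lo = -(2 ** (total_bits - 1)) * step
        hi = (2 ** (total_bits - 1) - 1) * step
    else:
        lo = 0.0
        hi = (2 ** total_bits - 1) * step
    return lo, hi, step


def quantize(value, total_bits, frac_bits, signed=True):
    """Round to the nearest step and saturate (illustrative choice only)."""
    lo, hi, step = fix_range(total_bits, frac_bits, signed)
    q = round(value / step) * step  # Python rounds exact ties to even
    return min(max(q, lo), hi)


if __name__ == "__main__":
    print("Fix_14_13:", fix_range(14, 13))
    print("UFix_3_0:", fix_range(3, 0, signed=False))
    print("1.5 as Fix_14_13:", quantize(1.5, 14, 13))
```

For Fix_14_13 the range works out to -1 up to 1 - 2^-13 in steps of 2^-13, and for UFix_3_0 it is the integers 0 to 7.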


Figure 83: Timing Diagram of Output Data Interface for Downlink

[Timing diagram. Signals: clk, ce_3_out, duc_i/q, duc_vout, cfr_i/q, cfr_vout.]


Figure 84: DDC Top-Level Component

[Block diagram of the DDC with port number formats. Inputs: din (Fix_14_13), nd (Bool), rst (Bool). Outputs: dout (Fix_18_17), rdy (Bool), chan (UFix_3_0).]

Table 59: Port Definitions for DDC

| Signal | Direction | Description |
|---|---|---|
| din[13:0] | Input | Real IF input signal. |
| nd | Input | Active High signal indicating valid input data. |
| rst | Input | Active High synchronous reset. |
| dout[17:0] | Output | Time division multiplexed output data. |
| rdy | Output | Active High signal indicating valid output data. |
| chan | Output | Channel number output. |





Uplink Interface Timing

A timing diagram for the data interface is shown in Figure 85. The input sample rate is 122.88 Msps, which corresponds to three clocks per sample. The new data pulse, nd, must be asserted High once every three clocks, as shown in the diagram. The latency from the first nd pulse to the first rdy pulse depends on the bandwidth configuration; the values are listed in the “Latency” section.

The sample rate of the time division multiplexed output data is a function of the channel bandwidth and the number of carriers. The number of clocks per output sample is given by M = 368.64/(2*Fs*Ncarrier), where Fs is the baseband sample rate (7.68, 15.36, 23.04, or 30.72 Msps) and Ncarrier is the number of carriers. The factor of two in the denominator accounts for the fact that the complex output data is (I, Q) multiplexed. For multi-carrier configurations, the ordering of the data is such that the I and Q channels of the first carrier are output first, followed by the I and Q channels of the second carrier, etc. For example, in the 4x5 MHz configuration, the channel 0, 2, 4, and 6 outputs are the I channels of the first, second, third, and fourth carriers. The channel 1, 3, 5, and 7 outputs are the Q channels of the first, second, third, and fourth carriers.
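The channel ordering described above maps directly onto the chan output. The following Python sketch is illustrative and not part of the reference design. It decodes a chan value into a zero-based carrier index and an I or Q component, and regroups a captured stream of (chan, dout) pairs into complex samples per carrier; the function names and the capture format are hypothetical.

```python
# Sketch: interpret the DDC chan output for time division multiplexed data.
def decode_channel(chan):
    """Map chan to (carrier index, component); even chan is I, odd is Q."""
    carrier, is_q = divmod(chan, 2)
    return carrier, "Q" if is_q else "I"


def demux(samples, num_carriers):
    """Group (chan, dout) pairs, captured while rdy is High, by carrier.

    Returns one list of complex samples per carrier. A Q value that arrives
    without a preceding I value for the same carrier is ignored.
    """
    carriers = [[] for _ in range(num_carriers)]
    pending_i = {}
    for chan, value in samples:
        carrier, component = decode_channel(chan)
        if component == "I":
            pending_i[carrier] = value
        elif carrier in pending_i:
            carriers[carrier].append(complex(pending_i.pop(carrier), value))
    return carriers


if __name__ == "__main__":
    # 4x5 MHz example: channels 0..7 carry I/Q for carriers 0..3.
    for chan in range(8):
        print(chan, decode_channel(chan))
```

In the 4x5 MHz configuration, chan values 0, 2, 4, and 6 decode to the I components of carriers 0 through 3, matching the ordering in the text.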

Figure 86 illustrates the timing requirements for the active High synchronous reset. The reset signal should be held High for a minimum of 20 clock cycles to ensure proper operation. Because the reset signal is registered inside the design, a delay of one clock is needed after reset is released before a new nd pulse takes effect. The latency from the first nd pulse after reset to the first rdy pulse after reset is the same as before reset.

Note that the DDC may have a transient section after reset that is different from the transient after configuration startup, because the data buffers in the filter modules are not cleared by reset. The length of the transient section is roughly equal to the group delay of the DDC.


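Figure 85: Timing Diagram of Data Interface for DDC

[Timing diagram. Signals: clk, din, nd, dout, rdy, chan. The first rdy pulse follows the first nd pulse by L clks, and successive outputs are spaced M clks apart.]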


Figure 86: Timing Diagram of Reset Signal for DDC

[Timing diagram. Signals: clk, din, nd, reset. The reset signal is held High for a minimum of 20 clks.]



Latency

This section summarizes the latency of each design, where latency is defined as the total delay from input to output, including both algorithmic delays (for example, filter group delay) and implementation delays (for example, pipeline delays). The latency and total delay of the downlink design are summarized in Table 60. The latency for DUC+CFR was measured using the impulse response method. An impulse located at a known input sample was used to stimulate the design. The peak locations of the DUC output and the CFR output were measured in clock cycles, and the difference between the input impulse and each output peak is the data latency. The latency of the DUC is dominated by the filter group delays, particularly that of the channel filter. For the CFR, the largest component of the delay is typically equal to the group delay of the cancellation pulse.

The latency, total delay, and group delay of the DDC are summarized in Table 61. For the DDC, the latency is defined as the delay from the first nd pulse at the input to the first rdy pulse at the output. The total delay is measured by sending an impulse into the DDC input and then plotting the DDC output. The delay in clocks from the input impulse to the center of the output impulse response gives the total delay. The group delay is measured using a similar method, except that the plot is made for a single real output channel and the result is reported in output samples. The group delay is a measure of the algorithmic delays and does not include the delay from the implementation latency.
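The impulse response method can be illustrated on a toy system. The Python sketch below is not the reference design or its test bench. It models a stand-in system as a symmetric FIR filter followed by a fixed pipeline delay (both hypothetical), drives it with a single impulse, and reports the distance from the impulse to the output peak. Delays here are counted in samples of the toy model rather than in system clocks.

```python
# Sketch: latency measurement by the impulse response method on a toy system.
def fir(x, taps):
    """Direct-form FIR filter: y[n] = sum over k of taps[k] * x[n - k]."""
    return [
        sum(t * x[n - k] for k, t in enumerate(taps) if n - k >= 0)
        for n in range(len(x))
    ]


def delay(x, d):
    """Model d cycles of pipeline delay, keeping the sequence length."""
    return [0.0] * d + x[: len(x) - d]


def measure_latency(system, impulse_index, length):
    """Apply an impulse and return the offset of the output peak from it."""
    x = [0.0] * length
    x[impulse_index] = 1.0
    y = system(x)
    peak_index = max(range(length), key=lambda n: abs(y[n]))
    return peak_index - impulse_index


if __name__ == "__main__":
    taps = [1, 2, 3, 4, 3, 2, 1]  # hypothetical symmetric filter, group delay 3
    toy_system = lambda x: delay(fir(x, taps), 5)  # plus 5 pipeline stages
    print("measured latency:", measure_latency(toy_system, 10, 64))
```

The measured value is the sum of the filter group delay and the pipeline delay, which mirrors the split between algorithmic and implementation delays described above.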

Software Requirements

The reference designs described in this application note were developed and tested using Xilinx® System Generator for DSP version 10.1.2, MATLAB version 7.5 (R2007b), Simulink version 7.0, and Xilinx® ISE® version 10.1 (K.37 build) with IP Update 2 (K.10 build). The MATLAB Signal Processing Toolbox and Simulink Signal Processing Blockset are required to successfully run the designs and supporting software. In addition, the MATLAB Filter Design Toolbox is required to run some of the filter design functions.

Table 60: Latency and Total Delay of DUC+CFR

| Configuration | DUC Latency (clocks) | CFR Latency (clocks) | DUC+CFR Latency (clocks) | Total Delay (μs) |
|---|---|---|---|---|
| 1x5 | 2484 | 1098 | 3582 | 9.72 |
| 1x10 | 1311 | 1098 | 2409 | 6.53 |
| 1x15 | 870 | 1098 | 1968 | 5.34 |
| 1x20 | 651 | 1098 | 1749 | 4.74 |
| 2x5 | 2712 | 1101 | 3813 | 10.34 |
| 2x10 | 1374 | 1101 | 2475 | 6.71 |
| 4x5 | 2535 | 1101 | 3636 | 9.86 |
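The total delay column of Table 60 appears to follow from the combined clock count. The Python sketch below is a consistency check, not part of the reference design. It assumes a system clock of 368.64 MHz, inferred from the numerator of the clocks-per-sample formula, and converts DUC plus CFR latency in clocks to microseconds; the names are hypothetical.

```python
# Sketch: convert DUC+CFR latency in clocks to total delay in microseconds.
# Assumption: the system clock is 368.64 MHz.
SYSTEM_CLOCK_MHZ = 368.64

# (DUC latency, CFR latency, listed total delay in us), copied from Table 60.
TABLE_60 = {
    "1x5": (2484, 1098, 9.72),
    "1x10": (1311, 1098, 6.53),
    "1x15": (870, 1098, 5.34),
    "1x20": (651, 1098, 4.74),
    "2x5": (2712, 1101, 10.34),
    "2x10": (1374, 1101, 6.71),
    "4x5": (2535, 1101, 9.86),
}


def total_delay_us(duc_clocks, cfr_clocks):
    """Clocks divided by MHz gives microseconds."""
    return (duc_clocks + cfr_clocks) / SYSTEM_CLOCK_MHZ


if __name__ == "__main__":
    for name, (duc, cfr, listed) in TABLE_60.items():
        print(f"{name}: {duc + cfr} clocks, {total_delay_us(duc, cfr):.2f} us "
              f"(listed {listed})")
```

Under this assumption the computed values round to the listed total delays for all seven configurations.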

Table 61: Latency, Total Delay, and Group Delay of DDC

| Configuration | Latency (clocks) | Total Delay (clocks) | Group Delay (samples) |
|---|---|---|---|
| 1x5 | 199 | 3271 | 64 |
| 1x10 | 121 | 1729 | 67 |
| 1x15 | 147 | 1299 | 72 |
| 1x20 | 78 | 894 | 68 |
| 2x5 | 218 | 3530 | 69 |
| 2x10 | 143 | 1823 | 70 |
| 4x5 | 212 | 3572 | 70 |



Hardware Verification

The functionality of the reference designs was verified through hardware co-simulation on an ML506 board populated with an XC5VSX50T-1 device.

System Integration

This section briefly describes how to integrate a System Generator design into a larger system using an NGC netlist and a VHDL black-box flow. For a more complete description of this process, see the Xilinx® System Generator for DSP documentation [Ref 12]. The documentation is also available from the MATLAB help menu under “MATLAB Full Product Family Help.”

There are several compilation targets that can be selected from the System Generator GUI (for example, HDL Netlist, NGC Netlist, Bitstream, and so on). For the design flow described here, the NGC Netlist option should be selected. An NGC netlist is a binary file recognized by the Xilinx® ISE tools that contains the design netlist along with any constraint information. This format allows the design to be pulled into a larger design as a black box by NGDBuild (Project Navigator Translation). When using the default options, the NGC netlist is stored in a file named <design_name>_cw.ngc.

In addition to creating the NGC netlist, System Generator software typically creates two VHDL files for the design. The first file is named <design_name>.vhd and the second file is named <design_name>_cw.vhd. The _cw.vhd file is a top-level wrapper containing the System Generator design as a component along with a clock-driver module. The clock driver generates any inferred clock enables that are present in the design, along with a uniquely named clock signal for each clock enable. Even though the clock signals have unique names, they are driven from the same clock in a single-clock design.

If the design is to be used as a black box, the appropriate synthesis attribute must be set in the code that instantiates it. For Synplify the user must attach the syn_black_box attribute to the component and set it to true. For XST, the box_type attribute must be attached to the component and set to black_box.

When implementing the top-level design, the user must copy the NGC netlist from the System Generator subdirectory and place it in the project directory of the top-level design, allowing the ISE tools to pull the module into the top-level design during NGDBuild. However, the VHDL files should not be copied to this directory because they can interfere with the NGC netlist black-box flow. For simulation purposes, compile the VHDL files into a simulation library while leaving the source code in a separate directory. Although the VHDL files should remain in a different directory, any Memory Initialization Files (.mif) and Coefficient (.coe) files should be copied to the top-level project directory.
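The file handling described above can be scripted. The following Python sketch is illustrative and not part of the delivered design files. It copies NGC, MIF, and COE files from a System Generator output directory into a top-level project directory and deliberately leaves VHDL files behind; the function name and directory arguments are hypothetical.

```python
# Sketch: stage black-box files for the top-level ISE project.
import shutil
from pathlib import Path

# File types to copy: the netlist plus memory initialization and coefficient
# files. VHDL sources are intentionally excluded.
COPY_SUFFIXES = {".ngc", ".mif", ".coe"}


def stage_black_box(sysgen_dir, project_dir):
    """Copy the required files and return the names that were copied."""
    src, dst = Path(sysgen_dir), Path(project_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        if path.is_file() and path.suffix.lower() in COPY_SUFFIXES:
            shutil.copy2(path, dst / path.name)
            copied.append(path.name)
    return copied


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 3:
        for name in stage_black_box(sys.argv[1], sys.argv[2]):
            print("copied", name)
    else:
        print("usage: stage_black_box.py <sysgen_dir> <project_dir>")
```

Keeping the VHDL sources out of the project directory follows the caution in the text that they can interfere with the NGC netlist black-box flow.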

Conclusion

This application note describes an efficient, high-performance method for implementing the cost-effective DUC, DDC, and CFR signal processing required in a typical 3GPP LTE radio. System Generator source code is made available along with test vectors and scripts to allow designers to rapidly evaluate the algorithms and verify necessary performance targets.

When Xilinx FPGAs are used in radio designs, their rich mix of logic and DSP processing capabilities offers high performance, flexibility, low power, and low cost. This application note demonstrates what can be achieved in Xilinx FPGAs while retaining that flexibility, which allows designers to differentiate their products and provide network operators with a more competitive solution.



References

1. 3GPP TR 36.804 v0.7.1 (2007-10), 3rd Generation Partnership Project; Technical Specification Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); Base Station (BS) radio transmission and reception; (Release 8).

2. 3GPP TS 36.104 v8.0.0 (2007-12), 3rd Generation Partnership Project; Technical Specification Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); Base Station (BS) radio transmission and reception; (Release 8).

3. 3GPP TR 36.942 v1.2.0 (2007-06), 3rd Generation Partnership Project; Technical Specification Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); Radio Frequency (RF) system scenarios; (Release 8).

4. XAPP1033, Peak Cancellation Crest Factor Reduction Reference Design.

5. Väänänen, Olli, Jouko Vankka, and Kari Halonen. 2002. Effect of Clipping in Wideband CDMA System and Simple Algorithm For Peak Windowing, World Wireless Congress, San Francisco, pp. 614-619.

6. XAPP921c, High Density WCDMA Digital Front End Reference Design.

7. 3GPP TS 25.104, Base Station (BS) Radio Transmission and Reception (FDD).

8. IEEE Std 802.16-2004, IEEE Standard for Local and metropolitan area networks: Part 16: Air Interface for Fixed Broadband Wireless Access Systems.

9. IEEE Std 802.16e-2005 and IEEE Std 802.16-2004/Cor1-2005, IEEE Standard for Local and metropolitan area networks: Part 16: Air Interface for Fixed Mobile Broadband Wireless Access Systems, Amendment 2: Physical and Medium Access Control Layers for Combined Fixed and Mobile Operation in Licensed Bands and Corrigendum 1.

10. 3GPP TS 36.211 V8.0.0 (2007-09), 3rd Generation Partnership Project; Technical Specification Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); Physical channels and modulation; (Release 8).

11. Xilinx DS534, FIR Compiler v4.0 Product Specification. http://www.xilinx.com/support/documentation/ip_documentation/fir_compiler_ds534.pdf

12. System Generator for DSP User Guide, Release 10.1.2. http://www.xilinx.com/support/sw_manuals/sysgen_user.pdf

Revision History

| Date | Version | Description of Revisions |
|---|---|---|
| 10/29/08 | 1.0 | Initial Xilinx release. |

Notice of Disclaimer

Xilinx is disclosing this Application Note to you “AS-IS” with no warranty of any kind. This Application Note is one possible implementation of this feature, application, or standard, and is subject to change without further notice from Xilinx. You are responsible for obtaining any rights you may require in connection with your use or implementation of this Application Note. XILINX MAKES NO REPRESENTATIONS OR WARRANTIES, WHETHER EXPRESS OR IMPLIED, STATUTORY OR OTHERWISE, INCLUDING, WITHOUT LIMITATION, IMPLIED WARRANTIES OF MERCHANTABILITY, NONINFRINGEMENT, OR FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT WILL XILINX BE LIABLE FOR ANY LOSS OF DATA, LOST PROFITS, OR FOR ANY SPECIAL, INCIDENTAL, CONSEQUENTIAL, OR INDIRECT DAMAGES ARISING FROM YOUR USE OF THIS APPLICATION NOTE.
