Overcoming LTE PHY Design Challenges Using ESL Design Methodologies
By: Louie Valeña, Field Applications Engineer, CoWare K.K.
The 3rd Generation Partnership Project (3GPP) announced the functional freeze¹ of the LTE specs
(Release 8) in Dec 2008 [1], but even before that, Nokia Siemens Networks had already announced the
availability of LTE base stations [2] and LG had announced LTE baseband chips for the handset [3].
These and other initial implementations will need to be refined over time to optimize cost/performance
and upgraded to comply with the latest version of the specification. With NTT docomo [4] and Verizon
Wireless [5] announcing LTE service availability in 2010, the design and development time for hardware,
software and systems is painfully short. A comprehensive design and verification methodology is required
to meet the tight development schedule while meeting or exceeding performance criteria.
Electronic System Level (ESL) design aims to be this comprehensive design and verification
methodology. This is achieved by using simulation models with a high level of abstraction to act as an
“executable specification” for all the design teams involved. A paper spec is subject to misunderstanding
and misinterpretation. An executable spec works around these limitations by embedding the designer’s
“intent” into the spec. The executable spec acts as the “golden testbench” for all the design teams
involved, thereby removing the need for each team to create their own testbenches.
This article aims to introduce some of the design challenges that may be encountered in designing the
physical layer (PHY) for LTE and how ESL tools can help create executable specs to overcome them.
Design Challenge #1: Reading and understanding the specs
The bulk of the LTE PHY specs are in two documents: TS 36.211 [6] which describes the physical
channels and their modulation; and TS 36.212 [7] which describes multiplexing and channel coding
performed on data from the MAC. Although the specs include some block diagrams to facilitate
comprehension, it’s still quite a feat to visualize how data moves and is transformed while moving from
one block to the next just from perusing the specs. CoWare provides an LTE library which can be used as
a reference guide to facilitate comprehension of the spec. Figure 1 shows the detail view of the
hierarchical LTE encoder block. It fills in the details outlined in TS 36.212 v.8.5.0 Sec. 5.3.2. Note that
probes can be attached to block outputs to monitor how signals change during processing.
1 A “freeze” implies that no additional functionality will be included in the spec; it does not imply completeness. (cf. http://www.3gpp.org/releases)
Figure 1: Detail view of the hierarchical LTE encoder block showing the processing performed on the
downlink shared channel as specified in TS 36.212 v.8.5.0 Sec. 5.3.2. Note that parameters can depend on
other parameters in higher hierarchies and can be passed down to lower hierarchies, as well. Probes can
be attached to block outputs to show how the signals change during simulation.
Design Challenge #2: Creating an executable spec to investigate system performance and act as a golden
testbench for all design teams
Many companies participating in the standardization process have algorithm development teams
dedicated to writing C programs to create and evaluate various proposals. Unfortunately, the simulation
programs are seldom usable outside the algorithm development teams due to non-uniform coding styles
and lack of suitable documentation. They are seldom used as executable specs because they are difficult
to read, maintain and interface to. An executable spec for PHY layer design should have the following
characteristics:
Dataflow model of computation – Simulation programs are usually classified according to the
model of computation used. Some commonly used models of computation are: continuous time (e.g.
SPICE, Verilog-A), discrete event (e.g. Verilog, VHDL) and dataflow. In continuous time and
discrete event models, the order in which blocks/functions are executed needs to be determined
during runtime, which requires significant overhead. For static dataflow, the execution schedule can
be determined before runtime, allowing faster simulations. When multiple sampling rates are
involved (multi-rate), non-dataflow simulators which use a discrete time fixed-step solver model of
computation would need to execute all blocks on a single clock that is a common multiple of all the sampling rates. For static
dataflow, the runtime schedule would appear as nested “for” loops, thereby allowing each block to
execute at its “designated” sampling frequency. In Fig. 2, the AFE can be modeled in complex
baseband with a bandwidth wide enough to evaluate the effects of interfering signals on system
performance. Assuming an LTE bandwidth of 20MHz, the frequency of the ADC clock would be
fADC = 30.72MHz (2048 point DFT with 15kHz subcarrier spacing). Intermodulation response
rejection tests described in [8] specify an unmodulated carrier at ±17.5MHz offset and a 5MHz
modulated interfering signal located ±35MHz away from the desired channel. This implies that the
AFE portion should be sampled (for simulation purposes) at >3fADC = 92.16MHz to satisfy Nyquist’s
sampling criterion. A discrete time fixed-step simulator would need to process everything at
>92.16MHz, while a static dataflow simulator would process each block at the required rate (e.g.
>92.16MHz for the AFE and 30.72MHz for the ADC). This is the primary reason why dataflow is
considered to be the best model of computation for signal processing applications.
Figure 2: Simplified block diagram of a UE receiver with 2 antennas (AFE: analog front-end; BPF:
band pass filter; LNA: low noise amplifier; VGA: variable gain amplifier; LPF: low pass filter;
ADC: analog-to-digital converter; DAC: digital-to-analog converter; AGC: automatic gain control;
NCO: numerically controlled oscillator; CP: cyclic prefix; DFT: discrete Fourier transform).
Baseband processing blocks are in orange. The LTE specs describe how data is to be transmitted
but not how it is to be recovered.
Hierarchical block diagram editor – Viewing a block diagram to trace signal flow is a lot easier than
going through several pages of C code. A hierarchical block diagram editor allows the user to
quickly grasp the signal flow and manage complexity. (See Fig. 1).
Source code available for all blocks/models – This allows implementers to examine the details of
the “executable spec” and use it as a starting point for their own implementation; or modify it to suit
their purposes. The source code for all primitive blocks should be available for viewing and editing.
Rapid simulation – The execution should finish as quickly as possible to allow sweeping over
several parameters and getting results quickly for various channel scenarios and usage profiles.
Running more simulations in less time results in fewer surprises during field testing and fewer “back to the
drawing board” moments, and thus provides huge cost savings to the project. Using a C++
infrastructure and running compiled simulations over a distributed network should be supported.
Multi-core support further takes advantage of dual-core CPUs by automatically subdividing the
design into independent threads which can run on separate cores, resulting in a 1.7x to 1.9x speedup
compared to a single core.
Single-process co-simulation framework – This simplifies bottom-up verification. Implementation
of digital blocks would be done using Verilog/VHDL. Initial implementation of analog blocks
would be done using Verilog-A or Verilog/VHDL-AMS. Note that the commonly used IPC (inter-
process communication) method for co-simulation has high overhead and can be very slow. During
co-simulation with Verilog/VHDL-AMS, the algorithm portion of the design is directly linked with
the RTL/AMS simulator using PLI/VPI, incurring no IPC overhead.
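The multi-rate argument made under the dataflow model of computation can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not the scheduler of any particular tool: the function names and the two-block AFE/ADC chain are illustrative assumptions. It picks the smallest integer multiple of fADC whose complex-baseband span covers the 5MHz interferer at ±35MHz, then compares how often each block fires under a static nested-loop schedule and under a single fixed-step clock.

```python
import math

F_ADC_HZ = 2048 * 15_000  # 2048-point DFT x 15 kHz subcarrier spacing = 30.72 MHz


def afe_oversampling(offset_hz, interferer_bw_hz, f_adc_hz=F_ADC_HZ):
    """Smallest integer k such that k * f_adc covers the interferer's far edge.

    A complex-baseband signal sampled at fs spans -fs/2 .. +fs/2, so fs must
    exceed twice the highest frequency of interest.
    """
    far_edge_hz = offset_hz + interferer_bw_hz / 2
    min_rate_hz = 2 * far_edge_hz
    return math.floor(min_rate_hz / f_adc_hz) + 1


def static_dataflow_firings(n_adc_samples, ratio):
    """Static schedule written as nested loops: each block runs at its own rate."""
    firings = {"afe": 0, "adc": 0}
    for _ in range(n_adc_samples):   # outer loop at the ADC rate
        for _ in range(ratio):       # inner loop at the AFE rate
            firings["afe"] += 1
        firings["adc"] += 1          # one ADC firing per `ratio` AFE samples
    return firings


def fixed_step_firings(n_adc_samples, ratio):
    """Fixed-step solver: every block is evaluated on the fastest clock."""
    steps = n_adc_samples * ratio
    return {"afe": steps, "adc": steps}


ratio = afe_oversampling(offset_hz=35e6, interferer_bw_hz=5e6)
afe_rate_hz = ratio * F_ADC_HZ
dataflow = static_dataflow_firings(n_adc_samples=1000, ratio=ratio)
fixed = fixed_step_firings(n_adc_samples=1000, ratio=ratio)
print(ratio, afe_rate_hz / 1e6, dataflow, fixed)
```

With only two blocks the saving is modest; in a full receiver, where most baseband blocks run at or below the ADC rate, the fixed-step penalty applies to every one of them.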
CoWare’s LTE library includes:
Downlink reference system with ideal receiver
Downlink reference system with practical channel estimator
Uplink reference system with ideal receiver
Cell Search reference system
MIMO channel model supporting EPA, EVA, ETU, SCM-A/B/C/D as well as user-defined
scenarios with M transmit and N receive antennas (no limitations on M nor N)
MIMO receiver supporting spatial multiplexing using zero forcing (ZF), minimum mean squared
error (MMSE) or maximum likelihood (ML); and transmit diversity using maximal ratio combining
(MRC)
The library models the test cases and simulation scenarios published by the LTE working group. Users of
the LTE library therefore have a higher probability of being compliant with the spec since the LTE library
essentially becomes a shared database. Designers can insert their own implementations into the “LTE
executable spec” and check performance against the test cases published by the LTE working group.
Figure 3 shows the reference throughput performance obtained compared with results from other 3GPP
LTE participants. The LTE library may be used as a starting point for algorithm developers to explore
particular implementations and evaluate their performance from a known, good reference point.
Figure 3: LTE downlink (PDSCH) reference system throughput simulation results compared with
other 3GPP participants’ results for FDD dual-stream MIMO, 10MHz, 50RB, 2 codewords, 2
layers, 2 Tx antennas, MMSE, no feedback, precoding #0, 2 x 16QAM, coding rate = 1/2, EVA5,
RVseq = 0,1,2,3.
Design Challenge #3: Exploring and evaluating analog front-end architectures which will meet performance
requirements while minimizing power and cost.
The architectures to be considered include:
Super-heterodyne with analog quadrature modulation/demodulation
Super-heterodyne with digital quadrature modulation/demodulation
Direct conversion
Super-heterodyne architectures involve multiple frequency translations. They provide the best sensitivity
and selectivity at the expense of a bigger parts count, bigger BOM and larger area. Analog quadrature
modulation/demodulation requires mixers, a phase shifter and a combiner. Balancing the gains of the I
and Q arms and achieving an exact 90° phase shift is impossible in an analog implementation. Analog
quadrature modulation/demodulation circuits suffer from gain/phase imbalance and carrier leakage.
Typical discrete devices have a minimum sideband suppression of -28dBc at high RF output frequencies
(e.g. 1.9GHz). This would correspond to roughly 4° of phase imbalance and 0.1dB of gain imbalance [9].
Such imperfections degrade EVM and adversely affect total system performance [10]. LTE uses 64QAM
to achieve higher throughput and requires an EVM of less than 8%. Figure 4 shows the constellation
diagram for 64QAM with quadrature modulator imperfections.
Figure 4: Constellation diagram for 64QAM. The top diagram is the ideal constellation. The bottom
diagram shows the constellation with 4° of phase imbalance and 0.1dB of gain imbalance resulting
in an EVM of 3%. Note that even though this value is less than the required 8%, the constellation is
visibly skewed and will reduce the overall performance of the system. The signal points have been
enlarged for easier viewing.
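The link between gain/phase imbalance, sideband suppression and EVM can be sketched numerically. The Python below is a minimal illustration under stated assumptions: it uses the common two-path model y = mu·x + nu·conj(x), places the whole gain and phase error on the Q arm, ignores carrier leakage, and measures EVM after removing the common complex gain. The function names are hypothetical, and the way the EVM in Figure 4 was measured is not stated in the text, so the result should land close to, but not necessarily exactly on, the quoted values.

```python
import cmath
import math


def imbalance_coefficients(gain_db, phase_deg):
    """Return (mu, nu) for the modulator model y = mu * x + nu * conj(x)."""
    g = 10 ** (gain_db / 20)
    rot = g * cmath.exp(1j * math.radians(phase_deg))
    return (1 + rot) / 2, (1 - rot) / 2


def sideband_suppression_db(gain_db, phase_deg):
    """Power of the unwanted sideband relative to the wanted one, in dB."""
    mu, nu = imbalance_coefficients(gain_db, phase_deg)
    return 20 * math.log10(abs(nu) / abs(mu))


def qam64_points():
    """All 64 points of an unnormalized square 64QAM constellation."""
    levels = (-7, -5, -3, -1, 1, 3, 5, 7)
    return [complex(i, q) for i in levels for q in levels]


def evm_after_gain_correction(gain_db, phase_deg):
    """RMS EVM of 64QAM through the imbalanced modulator, as a fraction."""
    g = 10 ** (gain_db / 20)
    rot = g * cmath.exp(1j * math.radians(phase_deg))
    ref = qam64_points()
    # The I arm is ideal; the Q arm carries the gain and phase error.
    tx = [x.real + 1j * rot * x.imag for x in ref]
    ref_power = sum(abs(x) ** 2 for x in ref)
    # Least-squares estimate of the common complex gain, then remove it.
    common_gain = sum(y * x.conjugate() for x, y in zip(ref, tx)) / ref_power
    err_power = sum(abs(y / common_gain - x) ** 2 for x, y in zip(ref, tx))
    return math.sqrt(err_power / ref_power)


suppression_db = sideband_suppression_db(gain_db=0.1, phase_deg=4.0)
evm = evm_after_gain_correction(gain_db=0.1, phase_deg=4.0)
print(f"sideband suppression: {suppression_db:.1f} dBc, EVM: {100 * evm:.1f} %")
```

For a symmetric constellation such as 64QAM, the EVM computed this way equals the amplitude ratio of the two sidebands, which is why a sideband suppression figure from a data sheet translates directly into an EVM floor.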
A digital quadrature modulator/demodulator requires 2 multipliers and an adder. In its simplest form, the
local oscillator can be generated as the sequence +1, 0, -1, 0, +1, …, that is, a sine wave with 4x
oversampling. The multiplier then becomes a simple switch selecting between the original I/Q signal, an
inverted version and zero. A digital quadrature modulator/demodulator doesn’t suffer from gain/phase
imbalance and carrier leakage. However, it requires upsampling (zero insertion and filtering) to match the
sampling frequency of the local oscillator.
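The multiplier-free structure described above can be sketched as follows. This is a minimal Python illustration, assuming the I/Q samples have already been upsampled to the local oscillator's sample rate and that the carrier sits at one quarter of that rate; the function names are hypothetical. The second function is the textbook I·cos − Q·sin form, included only to check the switch-based version against it.

```python
import math


def fs4_modulate(i_samples, q_samples):
    """Quadrature modulation with the LO at one quarter of the sample rate.

    cos(pi*n/2) cycles through +1, 0, -1, 0 and sin(pi*n/2) through
    0, +1, 0, -1, so I*cos - Q*sin reduces to picking +I, -Q, -I or +Q.
    """
    out = []
    for n, (i, q) in enumerate(zip(i_samples, q_samples)):
        phase = n % 4
        if phase == 0:
            out.append(i)
        elif phase == 1:
            out.append(-q)
        elif phase == 2:
            out.append(-i)
        else:
            out.append(q)
    return out


def reference_modulate(i_samples, q_samples):
    """Same modulator written with explicit multiplications, for comparison."""
    return [
        i * math.cos(math.pi * n / 2) - q * math.sin(math.pi * n / 2)
        for n, (i, q) in enumerate(zip(i_samples, q_samples))
    ]


i_in = [0.5, -1.0, 0.25, 0.75, -0.5, 1.0, 0.0, -0.25]
q_in = [1.0, 0.5, -0.75, 0.25, 0.0, -1.0, 0.5, 0.75]
switched = fs4_modulate(i_in, q_in)
multiplied = reference_modulate(i_in, q_in)
print(switched)
```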
A direct conversion architecture promises the lowest parts count, BOM, area and power dissipation.
However, the use of an analog quadrature modulator/demodulator is unavoidable. Specifying the
parameters too “tightly” in the analog quadrature modulator/demodulator would lead to low yield and low
volumes since “champion” samples would have to be selected. It would be better to specify the
component “roughly” and compensate digitally. Fig. 5 shows how quadrature modulator compensation
and power amplifier linearization in an LTE eNB transmitter may be modeled with the front-end modules
in Verilog-AMS.
Fig. 5: Block diagram of quadrature modulator compensation and power amplifier linearization.
The front-end modules (green) are modeled in complex baseband with Verilog-AMS to provide an
“executable specification” for the analog design team. The digital baseband portion of the design
can be exported as a SystemC block for use within analog/RF simulators that co-simulate with
SystemC. Note that direct conversion is used in transmission but not for the linearizer feedback. In
a highly integrated system, the LO signal of the quadrature modulator would be pulled by the
power amplifier, making it unusable for direct downconversion.
LTE supports multiple bandwidths: 1.4 MHz, 3 MHz, 5 MHz, 10 MHz, 15 MHz and 20 MHz. This allows
carriers to gradually migrate users from GSM/EDGE to LTE. Device developers would likely not know
which part of a carrier’s available bandwidth would be assigned for LTE, so it would be more practical to
cope with the multiple bandwidth issue digitally, that is, filtering in the digital domain. This implies that
the baseband I/Q analog filters would have a passband of 20 MHz (-10 MHz to +10 MHz in the complex
domain) to cover all possible cases. This places stringent requirements on the analog-to-digital
converter’s dynamic range since there may be strong GSM/EDGE signals right beside the desired and
possibly weak LTE signals. Dynamic simulations would need to be performed to determine the optimum
number of bits for the analog-to-digital converters in the presence of AGC and analog compensation
circuits.
The transmit power amplifier consumes most of the power available in a handset. Using a highly-efficient
but non-linear power amplifier with digital adaptive predistortion allows longer battery life while coping
with poor antenna VSWR and exceeding LTE requirements [11]. Exploring and evaluating various
predistortion algorithms and architectures requires dynamic simulations to select the optimum solution.
Using complex baseband representation for RF signals is sufficient when selecting the optimum front-end
architecture. Complex baseband involves “moving” the RF carrier to zero Hertz and selecting a
bandwidth (sampling frequency) wide enough to cover all signals of interest in blocking, interfering and
intermodulation scenarios. Complex baseband representation allows the system designer to determine the
characteristics of filters (e.g. passband, stopband, passband ripple) and amplifiers (e.g. gain, saturation)
required to meet or exceed the specifications. Complex baseband dataflow modeling for front-end
architecture exploration is more efficient (executes faster) than using AMS languages. Models for phase
noise, non-linear amplifiers, quadrature modulator/demodulator errors, filters and others should be available to help designers
quickly model analog front-end architectures.
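As an illustration of a complex-baseband amplifier model characterized by gain and saturation, the sketch below uses the Rapp model, a widely used memoryless AM/AM description of solid-state amplifiers. This is an assumption chosen for illustration: the text does not say which amplifier model is used, and the parameter names and default values are hypothetical.

```python
import cmath


def rapp_amplifier(x, gain=10.0, sat_level=1.0, smoothness=2.0):
    """Apply Rapp AM/AM compression to one complex-baseband sample.

    Small inputs see the linear gain; large inputs are limited to
    `sat_level`. The phase is left untouched (no AM/PM conversion).
    """
    linear = gain * x
    magnitude = abs(linear)
    if magnitude == 0.0:
        return linear
    compression = (1 + (magnitude / sat_level) ** (2 * smoothness)) ** (
        1 / (2 * smoothness)
    )
    return linear / compression


small_in = cmath.rect(0.001, 0.7)   # well inside the linear region
large_in = cmath.rect(100.0, 0.7)   # deep in saturation
small_out = rapp_amplifier(small_in)
large_out = rapp_amplifier(large_in)
print(abs(small_out), abs(large_out))
```

Because the model works on complex envelope samples rather than on the RF waveform, it can run at the complex-baseband sampling rate chosen for the blocking and intermodulation scenarios.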
Design Challenge #4: Evaluate algorithms which will meet performance requirements while minimizing area,
power and cost.
Some key algorithms include:
Signal acquisition and start of frame detection
Coarse frequency synchronization
Channel estimation and equalization
Fine frequency/phase synchronization
Symbol timing synchronization
MIMO receiver
PMI, CQI, RI calculation and reporting
FFT/IFFT processing
Turbo/convolutional decoder
Transmitter processing is explicitly defined in the standard, but receiver processing is not. Designing and
evaluating receiver algorithms requires transmitted signals to work with. The LTE library includes
uplink/downlink transmitters and receivers to accelerate algorithm development.
There are many algorithms which may be used to realize the above tasks [12]. The “executable spec”
should act as a golden reference against which all algorithms may be compared. The golden reference
indicates the ideal performance of the system. It is created by providing the receiver with perfect
knowledge of all impairments (multipath channel characteristics, carrier frequency/phase error, etc.). Any
practical implementation of an algorithm will fall short of the ideal performance; the shortfall constitutes the
implementation loss. More complex algorithms will have a smaller implementation loss at the cost of
higher power dissipation, larger area or longer latency.
Simulations are required to evaluate the performance of various algorithms over different scenarios and
select the “best”. The “best” algorithm would have the least complexity (least number of operations and
least amount of memory used) and least latency (computation delay) while meeting or exceeding
performance requirements.
An LTE library should offer the above algorithms as well as testbenches to check performance as a
starting point for developers to evaluate their own algorithms.
Design Challenge #5: Convert floating-point algorithms into fixed-point for optimum performance.
Floating-point allows values to be represented with a large dynamic range and high precision but requires
more hardware resources (area and power) compared to a fixed-point representation. Floating-point is
used in initial algorithm evaluation to obtain an upper bound on performance but is seldom used in a
hardware implementation. Fixed-point incurs a quantization loss and offers a limited but “good enough”
dynamic range. C programs developed during the standardization process are always written in floating-
point and creating fixed-point versions of critical functions is not a trivial task [13]. It is important to take
advantage of C++ polymorphism to build models whose datatype can be set to floating-point, fixed-point,
complex, scalar, vector, matrix or image with a parameter change. Analysis utilities such as
statistics (e.g. min, max) and histograms should also be available to facilitate fixed-point conversion.
Parameter sweeps (e.g. of bit width) run as distributed simulations using Grid Engine [14]
help achieve the desired design space exploration productivity.
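The conversion flow described above (collect min/max statistics, choose a format, sweep the bit width) can be sketched in a few lines. This is a minimal Python illustration with hypothetical function names and a sine wave standing in for a real signal; it is not the datatype mechanism or analysis utility of any particular tool, and a real sweep would measure link-level metrics such as throughput rather than the signal-to-quantization-noise ratio (SQNR) used here.

```python
import math


def to_fixed(x, word_bits, frac_bits):
    """Quantize x to signed fixed-point with rounding and saturation."""
    scale = 1 << frac_bits
    lowest = -(1 << (word_bits - 1))
    highest = (1 << (word_bits - 1)) - 1
    code = min(highest, max(lowest, round(x * scale)))
    return code / scale


def integer_bits_needed(samples):
    """Integer bits (sign included) that hold the observed min/max."""
    peak = max(abs(min(samples)), abs(max(samples)))
    if peak == 0:
        return 1
    return 1 + max(0, math.floor(math.log2(peak)) + 1)


def sqnr_db(samples, word_bits, int_bits):
    """Signal-to-quantization-noise ratio for a given word length."""
    frac_bits = word_bits - int_bits
    signal = sum(s * s for s in samples)
    noise = sum((s - to_fixed(s, word_bits, frac_bits)) ** 2 for s in samples)
    return 10 * math.log10(signal / noise)


# Stand-in test signal: a sine wave that stays inside (-1, 1).
samples = [0.9 * math.sin(2 * math.pi * 0.1234 * n) for n in range(1000)]
int_bits = integer_bits_needed(samples)
sweep = {bits: sqnr_db(samples, bits, int_bits) for bits in range(6, 17, 2)}
for bits, value in sweep.items():
    print(f"{bits:2d} bits: {value:5.1f} dB")
```

Each point of such a sweep is independent of the others, which is what makes it a natural fit for distribution over a compute grid.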
Design Challenge #6: Partition the baseband design for implementation on dedicated hardware,
programmable accelerators, or software.
The criteria for selecting between a dedicated hardware and a pure software implementation of an
algorithm are fairly straightforward: if it needs to run really fast with low probability of being changed,
it’s a good candidate for a dedicated hardware implementation; if the processing is fairly complex with a
lot of parameters but low throughput, it’s a good candidate for implementation in software.
Programmable accelerators provide a good compromise between dedicated hardware (high energy
efficiency with no flexibility) and software (low energy efficiency with maximum flexibility) by allowing
developers to create a custom processor with an instruction set and register file tailored for the
application.
The transform precoding required in the uplink is a good candidate for a programmable accelerator
implementation [16]. Transform precoding is essentially an N-point DFT, with 72 ≤ N ≤ 1320 (in
multiples of 12) depending on the size of the data to be transmitted/received.
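Reference [16] treats these transforms as mixed-radix DFTs. Assuming the Release 8 rule in TS 36.211 [6] that the number of resource blocks allocated to a transmission contains only the prime factors 2, 3 and 5, the DFT sizes an accelerator would have to support within the range above can be enumerated with the minimal Python sketch below. The function names and the range limits passed as defaults are illustrative assumptions.

```python
def has_only_factors_2_3_5(n):
    """True if n is a product of powers of 2, 3 and 5 only."""
    for prime in (2, 3, 5):
        while n % prime == 0:
            n //= prime
    return n == 1


def radix_factors(n):
    """Prime factorization of n as a list, e.g. 72 -> [2, 2, 2, 3, 3]."""
    factors = []
    prime = 2
    while n > 1:
        while n % prime == 0:
            factors.append(prime)
            n //= prime
        prime += 1
    return factors


def transform_precoding_sizes(n_min=72, n_max=1320, subcarriers_per_rb=12):
    """DFT sizes that are a whole number of resource blocks with 2/3/5 factors."""
    sizes = []
    for n in range(n_min, n_max + 1):
        if n % subcarriers_per_rb != 0:
            continue
        if has_only_factors_2_3_5(n // subcarriers_per_rb):
            sizes.append(n)
    return sizes


sizes = transform_precoding_sizes()
print(len(sizes), "sizes, from", sizes[0], "to", sizes[-1])
print("radices for N = 1200:", radix_factors(1200))
```

Under that assumption every supported size decomposes into radix-2, radix-3 and radix-5 stages, so a small set of butterfly instructions can cover all of them.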
Programmable accelerators are designed by describing the processor’s instruction set using the LISA
language. Tools can then automatically generate all the required processor development tools
(assembler, compiler, linker, debugger, documentation, instruction set simulator) and RTL from the LISA
description. The generated instruction set simulator can be imported into the LTE simulation to verify the
operation and performance of the programmable accelerator across several scenarios.
Summary
The design of the LTE physical layer implementation poses significant development challenges that
require a range of ESL design solutions in order to reach the market on time and with the right
performance and flexibility. The CoWare solutions for DSP algorithm design (CoWare Signal Processing
Designer) and programmable accelerator design (CoWare Processor Designer) are able to assist design
teams in the creation of highly differentiated solutions which are standards compliant. For more
information, visit www.coware.com.
Glossary:
3GPP 3rd Generation Partnership Project
3GPP2 3rd Generation Partnership Project 2
ADC Analog-to-Digital Converter
AFE Analog Front-end
AMS Analog Mixed Signal
ASIP Application Specific Instruction Set Processor
BOM Bill of Materials
CMOS Complementary Metal Oxide Semiconductor
CPU Central Processing Unit
CQI Channel Quality Indicator
DFT Discrete Fourier Transform
DSP Digital Signal Processor
eNB evolved Node B
ESL Electronic System Level
EVA Extended Vehicular A
EVM Error Vector Magnitude
FDD Frequency Division Duplex
HARQ Hybrid Automatic Repeat Request
HSDPA High Speed Downlink Packet Access
IDFT Inverse Discrete Fourier Transform
IF Intermediate Frequency
IPC Inter-Process Communication
ISS Instruction Set Simulator
LO Local Oscillator
LTE Long Term Evolution
MAC Medium Access Control
MIMO Multiple Input Multiple Output
MMSE Minimum Mean Squared Error
OFDM Orthogonal Frequency Division Multiplexing
OFDMA Orthogonal Frequency Division Multiple Access
PAPR Peak to Average Power Ratio
PDSCH Physical Downlink Shared Channel
PHY Physical
QAM Quadrature Amplitude Modulation
QoS Quality of Service
RB Resource Blocks
RF Radio Frequency
RTL Register Transfer Level
RVseq Redundancy Version sequence
SC-FDMA Single Carrier Frequency Division Multiple Access
SNR Signal to Noise Ratio
SPICE Simulation Program with Integrated Circuit Emphasis
TS Technical Specification
UE User Equipment
UI User Interface
VHDL VHSIC (Very High-Speed Integrated Circuits) Hardware Description Language
References:
[1] UMTS Forum, “LTE freeze completed,” 11 Dec 2008.
http://www.umts-forum.org/content/view/2616/109/
[2] Mobile Europe, “Nokia Siemens Networks ships LTE base station hardware,” 15 Oct 2008.
http://www.mobileeurope.co.uk/news_wire/114202/Nokia_Siemens_Networks_ships_LTE_base_station_hardware.html
[3] Information Week, “LG Says It Has the World’s First LTE Chip for Phones,” 9 Dec 2008.
http://www.informationweek.com/blog/main/archives/2008/12/lg_says_it_has.html
[4] PC World, “NTT DoCoMo to Launch LTE Mobile Broadband in 2010,” 17 Nov 2008.
http://www.pcworld.com/businesscenter/article/154069/ntt_docomo_to_launch_lte_mobile_broadband_in_2010.html
[5] Daily Wireless, “Verizon: LTE in 25 to 30 Markets By 2010,” 18 Feb 2009.
http://www.dailywireless.org/2009/02/18/verizon-lte-in-25-to-30-markets-by-2010/
[6] 3GPP TS 36.211 V8.5.0 (2008-12), “3rd Generation Partnership Project; Technical Specification
Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); Physical Channels
and Modulation (Release 8),” Dec 2008.
[7] 3GPP TS 36.212 V8.5.0 (2008-12), “3rd Generation Partnership Project; Technical Specification
Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); Multiplexing and
channel coding (Release 8),” Dec 2008.
[8] 3GPP TS 36.101 V8.4.0 (2008-12), “3rd Generation Partnership Project; Technical Specification
Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); User Equipment
(UE) radio transmission and reception (Release 8),” Dec 2008.
[9] Esa Tiiliharju, “Integration of Broadband Direct-Conversion Quadrature Modulators,” Doctoral
Dissertation, Espoo 2006, http://lib.tkk.fi/Diss/2006/isbn9512285223/isbn9512285223.pdf
[10] Tzi-Dar Chiueh and Pei-Yun Tsai, “OFDM Baseband Receiver Design for Wireless
Communications,” John Wiley & Sons, 2007.
[11] George Norris, et al., “Application of Digital Adaptive Pre-distortion to Mobile Wireless Devices,”
2007 IEEE Radio Frequency Integrated Circuits (RFIC) Symposium, pp. 247 –250, 3-5 June 2007.
[12] Tzi-Dar Chiueh and Pei-Yun Tsai, “OFDM Baseband Receiver Design for Wireless
Communications,” John Wiley & Sons, 2007.
[13] Tomas Andersson, “LTE Testbed: A Prototype System for Evolved Mobile Broadband,”
www.s3.kth.se/signal/edu/s3_seminar/2008/talks/sem15.pdf
[14] http://gridengine.sunsource.net/
[15] Andreas Hoffman and Achim Nohl, “The Dusk of ASIC, the Dawn of ASIP,”
http://www.coware.com/PDF/ESC2005.PDF
[16] Jay Mundarath, “LTE: Mixed Radix DFTs for LTE Uplink,”
www.freescale.com/files/ftf_2008/presentations/Americas/3/AM108_LTEMixedRadixDFTsforLTEUplink.pdf
[17] Tim Kogel and Matthew Braun, “Virtual Prototyping of Embedded Platforms for Wireless and
Multimedia,” Mar 2006.