Overcoming LTE PHY Design Challenges Using ESL Design Methodologies

By: Louie Valeña, Field Applications Engineer, CoWare K.K.

The 3rd Generation Partnership Project (3GPP) announced the functional freeze¹ of the LTE specs

(Release 8) in Dec 2008 [1], but even before that, Nokia Siemens Networks had already announced the

availability of LTE base stations [2] and LG had announced LTE baseband chips for the handset [3].

These and other initial implementations will need to be refined over time to optimize cost/performance

and upgraded to comply with the latest version of the specification. With NTT docomo [4] and Verizon

Wireless [5] announcing LTE service availability in 2010, the design and development time for hardware,

software and systems is painfully short. A comprehensive design and verification methodology is required

to meet the tight development schedule while meeting or exceeding performance criteria.

Electronic System Level (ESL) design aims to be this comprehensive design and verification

methodology. This is achieved by using simulation models with a high level of abstraction to act as an

“executable specification” for all the design teams involved. A paper spec is subject to misunderstanding

and misinterpretation. An executable spec works around these limitations by embedding the designer’s

“intent” into the spec. The executable spec acts as the “golden testbench” for all the design teams

involved, thereby removing the need for each team to create their own testbenches.

This article aims to introduce some of the design challenges that may be encountered in designing the

physical layer (PHY) for LTE and how ESL tools can help create executable specs to overcome them.

Design Challenge #1: Reading and understanding the specs

The bulk of the LTE PHY specs are in two documents: TS 36.211 [6] which describes the physical

channels and their modulation; and TS 36.212 [7] which describes multiplexing and channel coding

performed on data from the MAC. Although the specs include some block diagrams to facilitate

comprehension, it’s still quite a feat to visualize how data moves and is transformed while moving from

one block to the next just from perusing the specs. CoWare provides an LTE library which can be used as

a reference guide to facilitate comprehension of the spec. Figure 1 shows the detail view of the

hierarchical LTE encoder block. It fills in the details outlined in TS 36.212 v.8.5.0 Sec. 5.3.2. Note that

probes can be attached to block outputs to monitor how signals change during processing.

¹ A “freeze” implies that no additional functionality will be included in the spec; it does not imply completeness (cf. http://www.3gpp.org/releases).

Figure 1: Detail view of the hierarchical LTE encoder block showing the processing performed on the

downlink shared channel as specified in TS 36.212 v.8.5.0 Sec. 5.3.2. Note that parameters can depend on

other parameters in higher hierarchies and can be passed down to lower hierarchies, as well. Probes can

be attached to block outputs to show how the signals change during simulation.

Design Challenge #2: Creating an executable spec to investigate system performance and act as a golden

testbench for all design teams

Many companies participating in the standardization process have algorithm development teams

dedicated to writing C programs to create and evaluate various proposals. Unfortunately, the simulation

programs are seldom usable outside the algorithm development teams due to non-uniform coding styles

and lack of suitable documentation. They are seldom used as executable specs because they are difficult

to read, maintain and interface to. An executable spec for PHY layer design should have the following

characteristics:

Dataflow model of computation – Simulation programs are usually classified according to the

model of computation used. Some commonly used models of computation are: continuous time (e.g.

SPICE, Verilog-A), discrete event (e.g. Verilog, VHDL) and dataflow. In continuous time and

discrete event models, the order in which blocks/functions are executed needs to be determined

during runtime, which requires significant overhead. For static dataflow, the execution schedule can

be determined before runtime, allowing faster simulations. When multiple sampling rates are

involved (multi-rate), non-dataflow simulators which use a discrete time fixed-step solver model of

computation would need to execute all blocks at a single clock fast enough for the highest-rate block. For static

dataflow, the runtime schedule would appear as nested “for” loops, thereby allowing each block to

execute at its “designated” sampling frequency. In Fig. 2, the AFE can be modeled in complex

baseband with a bandwidth wide enough to evaluate the effects of interfering signals on system

performance. Assuming an LTE bandwidth of 20MHz, the frequency of the ADC clock would be

fADC = 30.72MHz (2048 point DFT with 15kHz subcarrier spacing). Intermodulation response

rejection tests described in [8] specify an unmodulated carrier at ±17.5MHz offset and a 5MHz

modulated interfering signal located ±35MHz away from the desired channel. This implies that the

AFE portion should be sampled (for simulation purposes) at >3fADC = 92.16MHz to satisfy Nyquist’s

sampling criterion. A discrete time fixed-step simulator would need to process everything at

>92.16MHz, while a static dataflow simulator would process each block at the required rate (e.g.

>92.16MHz for the AFE and 30.72MHz for the ADC). This is the primary reason why dataflow is

considered to be the best model of computation for signal processing applications.

Figure 2: Simplified block diagram of a UE receiver with 2 antennas (AFE: analog front-end; BPF:

band pass filter; LNA: low noise amplifier; VGA: variable gain amplifier; LPF: low pass filter;

ADC: analog-to-digital converter; DAC: digital-to-analog converter; AGC: automatic gain control;

NCO: numerically controlled oscillator; CP: cyclic prefix; DFT: discrete Fourier transform).

Baseband processing blocks are in orange. The LTE specs describe how data is to be transmitted

but not how it is to be recovered.

Hierarchical block diagram editor – Viewing a block diagram to trace signal flow is a lot easier than

going through several pages of C code. A hierarchical block diagram editor allows the user to

quickly grasp the signal flow and manage complexity. (See Fig. 1).

Source code available for all blocks/models – This allows implementers to examine the details of

the “executable spec” and use it as a starting point for their own implementation; or modify it to suit

their purposes. The source code for all primitive blocks should be available for viewing and editing.

Rapid simulation – The execution should finish as quickly as possible to allow sweeping over

several parameters and getting results quickly for various channel scenarios and usage profiles.

Running more simulations in less time results in fewer surprises during field testing and fewer “back to the

drawing board” moments, and thus provides a huge cost savings to the project. Compiled simulation using a C++

infrastructure and execution over a distributed network should be supported.

Multi-core support further takes advantage of dual-core CPUs by automatically subdividing the

design into independent threads which can run on separate cores, resulting in a 1.7x to 1.9x speedup

compared to a single core.

Single-process co-simulation framework – This simplifies bottom-up verification. Implementation

of digital blocks would be done using Verilog/VHDL. Initial implementation of analog blocks

would be done using Verilog-A or Verilog/VHDL-AMS. Note that the commonly used IPC

(inter-process communication) method for co-simulation has high overhead and can be very slow. During

co-simulation with Verilog/VHDL-AMS, the algorithm portion of the design is directly linked with

the RTL/AMS simulator using PLI/VPI, incurring no IPC overhead.
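
The static dataflow scheduling described in the first item above can be illustrated with a minimal Python sketch. It assumes a hypothetical four-block chain (an AFE model at 92.16MHz, a 3:1 rate-change block, an ADC model at 30.72MHz and a 2048-point DFT). The block names and the helper `repetition_vector` are illustrative only and are not taken from any CoWare tool. The sketch solves the rate balance equations once, before runtime, to find how often each block fires per schedule period.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def repetition_vector(edges):
    """Return the smallest integer firing counts for a chain of blocks.

    edges[i] = (tokens produced by block i per firing,
                tokens consumed by block i+1 per firing).
    """
    rates = [Fraction(1)]                      # firings relative to block 0
    for produced, consumed in edges:
        rates.append(rates[-1] * produced / consumed)
    # Scale the fractional rates up to the smallest set of integers.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in rates), 1)
    counts = [int(r * scale) for r in rates]
    common = reduce(gcd, counts)
    return [c // common for c in counts]

# Hypothetical chain: AFE -> 3:1 decimator -> ADC -> 2048-point DFT.
blocks = ["AFE", "DECIM3", "ADC", "DFT2048"]
edges = [(1, 3), (1, 1), (1, 2048)]
reps = repetition_vector(edges)

# Total block firings per schedule period under static dataflow.
dataflow_firings = sum(reps)
# A fixed-step simulator would evaluate every block on every fast tick.
fixed_step_evals = len(blocks) * reps[0]

for name, count in zip(blocks, reps):
    print(f"{name}: {count} firings per period")
print("dataflow firings:", dataflow_firings)
print("fixed-step evaluations:", fixed_step_evals)
```

The 3:1 ratio between the first two counts mirrors 92.16MHz against 30.72MHz. The resulting schedule is the nested loop "fire the AFE three times, then the decimator and ADC once, repeated 2048 times, then fire the DFT once".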

CoWare’s LTE library includes:

Downlink reference system with ideal receiver

Downlink reference system with practical channel estimator

Uplink reference system with ideal receiver

Cell Search reference system

MIMO channel model supporting EPA, EVA, ETU, SCM-A/B/C/D as well as user-defined

scenarios with M transmit and N receive antennas (no limitations on M nor N)

MIMO receiver supporting spatial multiplexing using zero forcing (ZF), minimum mean squared

error (MMSE) or maximum likelihood (ML); and transmit diversity using maximal ratio combining

(MRC)
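
As a rough illustration of the linear MIMO detectors named in the last item above, the following Python sketch builds a 2x2 zero forcing or MMSE detector from a known channel matrix. It assumes unit-energy transmit symbols and a perfectly known channel; the helper names and the example channel values are hypothetical and do not come from the LTE library. Maximum likelihood detection is not covered.

```python
def hermitian(m):
    """Conjugate transpose of a 2x2 matrix stored as nested lists."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2x2 matrix via the adjugate."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def matvec(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def linear_detector(h, noise_var=0.0):
    """W = (H^H H + noise_var * I)^-1 H^H.

    noise_var = 0 gives the zero forcing detector; noise_var > 0 gives
    the MMSE detector for unit-energy symbols.
    """
    hh = hermitian(h)
    gram = matmul(hh, h)
    gram[0][0] += noise_var
    gram[1][1] += noise_var
    return matmul(inverse(gram), hh)

# Hypothetical 2x2 channel and two unit-energy QPSK symbols.
h = [[1.0 + 0.5j, 0.3 - 0.2j],
     [0.2 + 0.1j, 0.9 - 0.4j]]
x = [(1 + 1j) / 2 ** 0.5, (-1 + 1j) / 2 ** 0.5]
y = matvec(h, x)                       # noiseless received vector

x_zf = matvec(linear_detector(h), y)
x_mmse = matvec(linear_detector(h, noise_var=0.1), y)
print("ZF  :", x_zf)
print("MMSE:", x_mmse)
```

Without noise, zero forcing recovers the transmitted symbols exactly, while the MMSE estimate is slightly shrunk toward zero because it trades residual interference against noise enhancement.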

The library models the test cases and simulation scenarios published by the LTE working group. Users of

the LTE library therefore have a higher probability of being compliant to the spec since the LTE library

essentially becomes a shared database. Designers can insert their own implementations into the “LTE

executable spec” and check performance against the test cases published by the LTE working group.

Figure 3 shows the reference throughput performance obtained compared with results from other 3GPP

LTE participants. The LTE library may be used as a starting point for algorithm developers to explore

particular implementations and evaluate their performance from a known, good reference point.

Figure 3: LTE downlink (PDSCH) reference system throughput simulation results compared with

other 3GPP participants’ results for FDD dual-stream MIMO, 10MHz, 50RB, 2 codewords, 2

layers, 2 Tx antennas, MMSE, no feedback, precoding #0, 2 x 16QAM, coding rate = 1/2, EVA5,

RVseq = 0,1,2,3.

Design Challenge #3: Exploring and evaluating analog front-end architectures which will meet performance

requirements while minimizing power and cost.

The architectures to be considered include:

Super-heterodyne with analog quadrature modulation/demodulation

Super-heterodyne with digital quadrature modulation/demodulation

Direct conversion

Super-heterodyne architectures involve multiple frequency translations. They provide the best sensitivity

and selectivity at the expense of a bigger parts count, bigger BOM and larger area. Analog quadrature

modulation/demodulation requires mixers, a phase shifter and a combiner. Balancing the gains of the I

and Q arms and achieving an exact 90° phase shift is impossible in an analog implementation. Analog

quadrature modulation/demodulation circuits suffer from gain/phase imbalance and carrier leakage.

Typical discrete devices have a minimum sideband suppression of -28dBc at high RF output frequencies

(e.g. 1.9GHz). This would correspond to roughly 4° of phase imbalance and 0.1dB of gain imbalance [9].

Such imperfections degrade EVM and adversely affect total system performance [10]. LTE uses 64QAM

to achieve higher throughput and requires an EVM of less than 8%. Figure 4 shows the constellation

diagram for 64QAM with quadrature modulator imperfections.

Figure 4: Constellation diagram for 64QAM. The top diagram is the ideal constellation. The bottom

diagram shows the constellation with 4° of phase imbalance and 0.1dB of gain imbalance resulting

in an EVM of 3%. Note that even though this value is less than the required 8%, the constellation is

visibly skewed and will reduce the overall performance of the system. The signal points have been

enlarged for easier viewing.
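
The relation between gain/phase imbalance, sideband suppression and EVM quoted above can be checked with a short Python sketch. It uses the standard first-order image rejection formula for a quadrature modulator and treats the image as the only error source, so carrier leakage, noise and non-linearity are ignored. The function names are illustrative.

```python
import math

def sideband_suppression_db(gain_imbalance_db, phase_error_deg):
    """Unwanted-to-wanted sideband power ratio in dB for an I/Q modulator."""
    g = 10 ** (gain_imbalance_db / 20)          # linear amplitude ratio Q/I
    phi = math.radians(phase_error_deg)
    image = 1 + g * g - 2 * g * math.cos(phi)   # unwanted sideband power
    wanted = 1 + g * g + 2 * g * math.cos(phi)  # wanted sideband power
    return 10 * math.log10(image / wanted)

def evm_percent(gain_imbalance_db, phase_error_deg):
    """RMS EVM in percent when the image is the only error term."""
    ratio_db = sideband_suppression_db(gain_imbalance_db, phase_error_deg)
    return 100 * 10 ** (ratio_db / 20)

suppression = sideband_suppression_db(0.1, 4.0)
evm = evm_percent(0.1, 4.0)
print(f"sideband suppression: {suppression:.1f} dBc")
print(f"EVM: {evm:.1f} %")
```

For 4° and 0.1dB this simple model gives a suppression close to -29dBc and an EVM between 3% and 4%, which is in line with the roughly -28dBc and 3% quoted above.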

A digital quadrature modulator/demodulator requires 2 multipliers and an adder. In its simplest form, the

local oscillator can be generated as a sequence +1, 0, -1, 0, +1, …, that is, a sinusoid with 4x

oversampling. The multiplier then becomes a simple switch selecting between the original I/Q signal, an

inverted version and zero. A digital quadrature modulator/demodulator doesn’t suffer from gain/phase

imbalance and carrier leakage. However, it requires upsampling (zero insertion and filtering) to match the

sampling frequency of the local oscillator.
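
A minimal Python sketch of the switch-based modulator described above follows. It assumes the I/Q samples have already been upsampled to the local oscillator rate, and that the in-phase LO is +1, 0, -1, 0 while the quadrature LO is the same sequence delayed by one sample (0, +1, 0, -1). The function names are illustrative.

```python
import math

def fs4_quadrature_modulate(i_samples, q_samples):
    """out[n] = I[n]*cos(pi*n/2) - Q[n]*sin(pi*n/2), using only selection
    and sign inversion instead of multipliers."""
    out = []
    for n, (i, q) in enumerate(zip(i_samples, q_samples)):
        phase = n % 4
        if phase == 0:
            out.append(i)      # cos = +1, sin = 0
        elif phase == 1:
            out.append(-q)     # cos = 0,  sin = +1
        elif phase == 2:
            out.append(-i)     # cos = -1, sin = 0
        else:
            out.append(q)      # cos = 0,  sin = -1
    return out

def reference_modulate(i_samples, q_samples):
    """Same operation written with explicit multipliers, for comparison."""
    return [i * math.cos(math.pi * n / 2) - q * math.sin(math.pi * n / 2)
            for n, (i, q) in enumerate(zip(i_samples, q_samples))]

i_in = [1, 2, 3, 4]
q_in = [5, 6, 7, 8]
fast = fs4_quadrature_modulate(i_in, q_in)
slow = reference_modulate(i_in, q_in)
print(fast)
```

Because the LO values are exact, the two arms are matched by construction, which is why this structure avoids the gain/phase imbalance of an analog implementation.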

A direct conversion architecture promises the lowest parts count, BOM, area and power dissipation.

However, the use of an analog quadrature modulator/demodulator is unavoidable. Specifying the

parameters too “tightly” in the analog quadrature modulator/demodulator would lead to low yield and low

volumes since “champion” samples would have to be selected. It would be better to specify the

component “roughly” and compensate digitally. Fig. 5 shows how quadrature modulator compensation

and power amplifier linearization in an LTE eNB transmitter may be modeled with the front-end modules

in Verilog-AMS.

Fig. 5: Block diagram of quadrature modulator compensation and power amplifier linearization.

The front-end modules (green) are modeled in complex baseband with Verilog-AMS to provide an

“executable specification” for the analog design team. The digital baseband portion of the design

can be exported as a SystemC block for use within analog/RF simulators that co-simulate with

SystemC. Note that direct conversion is used in transmission but not for the linearizer feedback. In

a highly integrated system, the LO signal of the quadrature modulator would be pulled by the

power amplifier, making it unusable for direct downconversion.

LTE supports multiple bandwidths: 1.4 MHz, 3 MHz, 5 MHz, 10 MHz, 15 MHz and 20 MHz. This allows

carriers to gradually migrate users from GSM/EDGE to LTE. Device developers would likely not know

which part of a carrier’s available bandwidth would be assigned for LTE, so it would be more practical to

cope with the multiple bandwidth issue digitally, that is, filtering in the digital domain. This implies that

the baseband I/Q analog filters would have a passband of 20 MHz (-10 MHz to +10 MHz in the complex

domain) to cover all possible cases. This places stringent requirements on the analog-to-digital

converter’s dynamic range since there may be strong GSM/EDGE signals right beside the desired and

possibly weak LTE signals. Dynamic simulations would need to be performed to determine the optimum

number of bits for the analog-to-digital converters in the presence of AGC and analog compensation

circuits.
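
Before running such dynamic simulations, a first-order estimate of the converter resolution can serve as a starting point. The Python sketch below uses the textbook signal-to-quantization-noise ratio of an ideal converter driven by a full-scale sine wave (6.02N + 1.76 dB). It ignores AGC behaviour, oversampling gain and all other noise sources, and the example figures (blocker level, required SNR, headroom) are hypothetical, not taken from the LTE specs.

```python
import math

def adc_bits_needed(blocker_over_signal_db, required_snr_db, headroom_db=0.0):
    """Smallest ideal ADC resolution whose quantization noise leaves the
    required SNR for a weak signal sitting below a strong blocker."""
    dynamic_range_db = blocker_over_signal_db + required_snr_db + headroom_db
    return math.ceil((dynamic_range_db - 1.76) / 6.02)

# Hypothetical scenario: blocker 40 dB above the wanted signal,
# 20 dB SNR needed for demodulation, 10 dB of headroom for peaks.
bits = adc_bits_needed(40.0, 20.0, headroom_db=10.0)
print("estimated ADC resolution:", bits, "bits")
```

The estimate only bounds the search; the optimum word length still has to come from simulation with the actual AGC and compensation loops in place.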

The transmit power amplifier consumes most of the power available in a handset. Using a highly-efficient

but non-linear power amplifier with digital adaptive predistortion allows longer battery life while coping

with poor antenna VSWR and exceeding LTE requirements [11]. Exploring and evaluating various

predistortion algorithms and architectures requires dynamic simulations to select the optimum solution.

Using complex baseband representation for RF signals is sufficient when selecting the optimum front-end

architecture. Complex baseband involves “moving” the RF carrier to zero Hertz and selecting a

bandwidth (sampling frequency) wide enough to cover all signals of interest in blocking, interfering and

intermodulation scenarios. Complex baseband representation allows the system designer to determine the

characteristics of filters (e.g. passband, stopband, passband ripple) and amplifiers (e.g. gain, saturation)

required to meet or exceed the specifications. Complex baseband dataflow modeling for front-end

architecture exploration is more efficient (executes faster) than using AMS languages. Models for phase

noise, non-linear amplifiers, quadrature modulator/demodulator errors, filters and others help designers

quickly model analog front-end architectures.
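
To make the complex baseband idea concrete, the Python sketch below models an amplifier with the two parameters mentioned above (gain and saturation) as a memoryless soft limiter acting on complex baseband samples. This is a deliberately simple assumption: the envelope is clipped and the phase is left untouched, so AM/PM conversion and memory effects are not represented. The function name is illustrative.

```python
import cmath

def soft_limiter(sample, gain, saturation):
    """Apply linear gain, then clip the envelope at the saturation level
    while preserving the phase of the complex baseband sample."""
    amplified = gain * sample
    if abs(amplified) <= saturation:
        return amplified
    return cmath.rect(saturation, cmath.phase(amplified))

small = soft_limiter(0.1 + 0.1j, gain=2.0, saturation=1.0)   # linear region
large = soft_limiter(3.0 + 4.0j, gain=1.0, saturation=1.0)   # clipped
print(small, large)
```

Since the carrier has been moved to zero Hertz, each sample carries only envelope and phase, and the model runs at the baseband sampling rate instead of a multiple of the RF carrier frequency.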

Design Challenge #4: Evaluate algorithms which will meet performance requirements while minimizing area,

power and cost.

Some key algorithms include:

Signal acquisition and start of frame detection

Coarse frequency synchronization

Channel estimation and equalization

Fine frequency/phase synchronization

Symbol timing synchronization

MIMO receiver

PMI, CQI, RI calculation and reporting

FFT/IFFT processing

Turbo/convolutional decoder

Transmitter processing is explicitly defined in the standard, but receiver processing is not. Designing and

evaluating receiver algorithms requires transmitted signals to work with. The LTE library includes

uplink/downlink transmitters and receivers to accelerate algorithm development.

There are many algorithms which may be used to realize the above tasks [12]. The “executable spec”

should act as a golden reference against which all algorithms may be compared. The golden reference

indicates the ideal performance of the system. It is created by providing the receiver with perfect

knowledge of all impairments (multipath channel characteristics, carrier frequency/phase error, etc.). Any

practical implementation of an algorithm will fall short of the ideal performance and constitutes an

implementation loss. More complex algorithms will have a smaller implementation loss at the cost of

higher power dissipation or larger area or longer latency.

Simulations are required to evaluate the performance of various algorithms over different scenarios and

select the “best”. The “best” algorithm would have the least complexity (least number of operations and

least amount of memory used) and least latency (computation delay) while meeting or exceeding

performance requirements.

An LTE library should offer the above algorithms as well as testbenches to check performance as a

starting point for developers to evaluate their own algorithms.

Design Challenge #5: Convert floating-point algorithms into fixed-point for optimum performance.

Floating-point allows values to be represented with a large dynamic range and high precision but requires

more hardware resources (area and power) compared to a fixed-point representation. Floating-point is

used in initial algorithm evaluation to obtain an upper bound on performance but is seldom used in a

hardware implementation. Fixed-point incurs a quantization loss and a limited but “good enough”

dynamic range. C programs developed during the standardization process are always written in

floating-point, and creating fixed-point versions of critical functions is not a trivial task [13]. It is important to take

advantage of C++ polymorphism to build models whose datatype can be set to floating-point, fixed-point,

complex, scalar, vector, matrix or image with a parameter change. Analysis utilities such as statistics

(e.g. min, max) and histograms should also be available to facilitate fixed-point conversion.

Parameter sweeps (e.g. for bit width) run as distributed simulations using Grid Engine [14]

achieve the desired design space exploration productivity.
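
The conversion flow described above can be sketched in a few lines of Python. The example assumes a signed two's-complement format with rounding and saturation, and uses a synthetic sine wave in place of a real signal from the design; the function names and the test signal are illustrative only. It reports the min/max statistics used to choose the integer range and then sweeps the word length.

```python
import math

def quantize(x, word_bits, frac_bits):
    """Round x to a signed fixed-point grid and saturate on overflow."""
    scale = 1 << frac_bits
    lowest = -(1 << (word_bits - 1))
    highest = (1 << (word_bits - 1)) - 1
    code = math.floor(x * scale + 0.5)          # round half up
    code = max(lowest, min(highest, code))      # saturate
    return code / scale

def sqnr_db(signal, word_bits, frac_bits):
    """Signal-to-quantization-noise ratio of the quantized signal in dB."""
    noise = sum((s - quantize(s, word_bits, frac_bits)) ** 2 for s in signal)
    power = sum(s * s for s in signal)
    return 10 * math.log10(power / noise)

# Synthetic stand-in for a floating-point signal captured from simulation.
signal = [0.9 * math.sin(2 * math.pi * 0.0123 * n) for n in range(1000)]
print("min:", min(signal), "max:", max(signal))

# Bit-width sweep: one sign bit, the rest fractional bits.
for word_bits in (6, 8, 10, 12):
    result = sqnr_db(signal, word_bits, word_bits - 1)
    print(f"{word_bits} bits: {result:.1f} dB")
```

In a real flow, each point of the sweep would be a full link-level simulation, which is why distributing the runs matters.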

Design Challenge #6: Partition the baseband design for implementation on dedicated hardware,

programmable accelerators, or software.

The criteria for selecting between a dedicated hardware and a pure software implementation of an

algorithm are fairly straightforward: if it needs to run really fast with low probability of being changed,

it’s a good candidate for a dedicated hardware implementation; if the processing is fairly complex with a

lot of parameters but low throughput, it’s a good candidate for implementation in software.

Programmable accelerators provide a good compromise between dedicated hardware (high energy

efficiency with no flexibility) and software (low energy efficiency with maximum flexibility) by allowing

developers to create a custom processor with an instruction set and register file tailored for the

application.

The transform precoding required in the uplink is a good candidate for a programmable accelerator

implementation [16]. Transform precoding is essentially an N-point DFT, with 72 ≤ N ≤ 1320 (in

multiples of 12) depending on the size of the data to be transmitted/received.
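
The set of DFT sizes an accelerator has to support can be enumerated with a short Python sketch. It takes the range and the multiple-of-12 rule from the text above and additionally assumes the TS 36.211 constraint (Sec. 5.3.3) that the number of allocated resource blocks has only 2, 3 and 5 as prime factors, which is what makes a mixed-radix implementation [16] attractive. The function name is illustrative.

```python
def valid_dft_sizes(n_min=72, n_max=1320):
    """List DFT sizes N = 12 * 2^a * 3^b * 5^c within [n_min, n_max]."""
    sizes = []
    for n in range(n_min, n_max + 1):
        if n % 12:
            continue                    # must be a whole number of RBs
        resource_blocks = n // 12
        for prime in (2, 3, 5):
            while resource_blocks % prime == 0:
                resource_blocks //= prime
        if resource_blocks == 1:        # no prime factors other than 2, 3, 5
            sizes.append(n)
    return sizes

sizes = valid_dft_sizes()
print(len(sizes), "sizes, from", sizes[0], "to", sizes[-1])
```

Under this assumption a size such as 84 (7 resource blocks) never occurs, so the accelerator only needs radix-2, radix-3 and radix-5 butterflies on top of the fixed factor of 12.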

Programmable accelerators are designed by describing the processor’s instruction set using the LISA

language. From there tools can automatically generate all the required processor development tools

(assembler, compiler, linker, debugger, documentation, instruction set simulator) and RTL from the LISA

description. The instruction set simulator generated can be imported into LTE simulation to verify the

operation and performance of the programmable accelerator across several scenarios.

Summary

The design of the LTE physical layer implementation poses significant development challenges that

require a range of ESL design solutions in order to reach the market on time and with the right

performance and flexibility. The CoWare solutions for DSP algorithm design (CoWare Signal Processing

Designer) and programmable accelerator design (CoWare Processor Designer) are able to assist design

teams in the creation of highly differentiated solutions which are standards compliant. For more

information, visit www.coware.com.

Glossary:

3GPP 3rd Generation Partnership Project

3GPP2 3rd Generation Partnership Project 2

ADC Analog-to-Digital Converter

AFE Analog Front-end

AMS Analog Mixed Signal

ASIP Application Specific Instruction Set Processor

BOM Bill of Materials

CMOS Complementary Metal Oxide Semiconductor

CPU Central Processing Unit

CQI Channel Quality Indicator

DFT Discrete Fourier Transform

DSP Digital Signal Processor

eNB evolved Node B

ESL Electronic System Level

EVA Extended Vehicular A

EVM Error Vector Magnitude

FDD Frequency Division Duplex

HARQ Hybrid Automatic Repeat Request

HSDPA High Speed Downlink Packet Access

IDFT Inverse Discrete Fourier Transform

IF Intermediate Frequency

IPC Inter-Process Communication

ISS Instruction Set Simulator

LO Local Oscillator

LTE Long Term Evolution

MAC Medium Access Control

MIMO Multiple Input Multiple Output

MMSE Minimum Mean Squared Error

OFDM Orthogonal Frequency Division Multiplex

OFDMA Orthogonal Frequency Division Multiple Access

PAPR Peak to Average Power Ratio

PDSCH Physical Downlink Shared Channel

PHY Physical Layer

QAM Quadrature Amplitude Modulation

QoS Quality of Service

RB Resource Blocks

RF Radio Frequency

RTL Register Transfer Level

RVseq Redundancy Version sequence

SC-FDMA Single Carrier Frequency Division Multiple Access

SNR Signal to Noise Ratio

SPICE Simulation Program with Integrated Circuit Emphasis

TS Technical Specification

UE User Equipment

UI User Interface

VHDL VHSIC (Very High-Speed Integrated Circuits) Hardware Description Language

References:

[1] UMTS Forum, “LTE freeze completed,” 11 Dec 2008.

http://www.umts-forum.org/content/view/2616/109/

[2] Mobile Europe, “Nokia Siemens Networks ships LTE base station hardware,” 15 Oct 2008.

http://www.mobileeurope.co.uk/news_wire/114202/Nokia_Siemens_Networks_ships_LTE_base_station_hardware.html

[3] Information Week, “LG Says It Has the World’s First LTE Chip for Phones,” 9 Dec 2008.

http://www.informationweek.com/blog/main/archives/2008/12/lg_says_it_has.html

[4] PC World, “NTT DoCoMo to Launch LTE Mobile Broadband in 2010,” 17 Nov 2008.

http://www.pcworld.com/businesscenter/article/154069/ntt_docomo_to_launch_lte_mobile_broadband_in_2010.html

[5] Daily Wireless, “Verizon: LTE in 25 to 30 Markets By 2010,” 18 Feb 2009.

http://www.dailywireless.org/2009/02/18/verizon-lte-in-25-to-30-markets-by-2010/

[6] 3GPP TS 36.211 V8.5.0 (2008-12), “3rd Generation Partnership Project; Technical Specification

Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); Physical Channels

and Modulation (Release 8),” Dec 2008.

[7] 3GPP TS 36.212 V8.5.0 (2008-12), “3rd Generation Partnership Project; Technical Specification

Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); Multiplexing and

channel coding (Release 8),” Dec 2008.

[8] 3GPP TS 36.101 V8.4.0 (2008-12), “3rd Generation Partnership Project; Technical Specification

Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); User Equipment

(UE) radio transmission and reception (Release 8),” Dec 2008.

[9] Esa Tiiliharju, “Integration of Broadband Direct-Conversion Quadrature Modulators,” Doctoral

Dissertation, Espoo 2006, http://lib.tkk.fi/Diss/2006/isbn9512285223/isbn9512285223.pdf

[10] Tzi-Dar Chiueh and Pei-Yun Tsai, “OFDM Baseband Receiver Design for Wireless

Communications,” John Wiley & Sons, 2007.

[11] George Norris et al., “Application of Digital Adaptive Pre-distortion to Mobile Wireless Devices,”

2007 IEEE Radio Frequency Integrated Circuits (RFIC) Symposium, pp. 247–250, 3-5 June 2007.

[12] Tzi-Dar Chiueh and Pei-Yun Tsai, “OFDM Baseband Receiver Design for Wireless

Communications,” John Wiley & Sons, 2007.

[13] Tomas Andersson, “LTE Testbed: A Prototype System for Evolved Mobile Broadband,”

www.s3.kth.se/signal/edu/s3_seminar/2008/talks/sem15.pdf

[14] http://gridengine.sunsource.net/

[15] Andreas Hoffman and Achim Nohl, “The Dusk of ASIC, the Dawn of ASIP,”

http://www.coware.com/PDF/ESC2005.PDF

[16] Jay Mundarath, “LTE: Mixed Radix DFTs for LTE Uplink,”

www.freescale.com/files/ftf_2008/presentations/Americas/3/AM108_LTEMixedRadixDFTsforLTEUplink.pdf

[17] Tim Kogel and Matthew Braun, “Virtual Prototyping of Embedded Platforms for Wireless and

Multimedia,” Mar 2006.
