29
MIMO OFDM Transceiver for a Many-Core Computing Fabric A Nucleus based Implementation Institute for Communication Technologies and Embedded Systems T. Kempf, D. Günther, A. Ishaque, G. Ascheid ISS (Chair of Integrated Signal Processing Systems)

MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

  • Upload
    hatu

  • View
    228

  • Download
    4

Embed Size (px)

Citation preview

Page 1: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

MIMO OFDM Transceiver for a Many-Core Computing Fabric

–A Nucleus based Implementation

Institute for Communication Technologies and Embedded Systems

T. Kempf, D. Günther, A. Ishaque, G. Ascheid

ISS (Chair of Integrated Signal Processing Systems)

Page 2: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Outline

Introduction

Nucleus Methodology

MIMO OFDM Transceiver Implementation

Application Analysis - Nuclei Identification Efficient Nuclei Implementations on HW Platform (Flavor) Algorithmic Performance Evaluation Application-to-Architecture Mapping

Summary & Outlook

2

Page 3: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

FlexibleSDR

Software Defined Radio Vision

e.g UMTS

Future SDR Mobile Phone

Sou

rce:

Infin

eon

Tec

hnol

ogie

s

Free Area:Cost Savings

ornew

Functionality

Today‘s Mobile Phone

e.g.GSM

Sou

rce:

Infin

eon

Tec

hnol

ogie

s

Page 4: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

FlexibleSDR

Software Defined Radio Vision

e.g Bluetooth

The three key properties: Portability

Software is portable onto different platformsStandard.exe → Device_1, ..., Device_n

Interoperability Different devices configured for the same standard interoperate

Standard_1/Device_1 ↔ Standard_1/Device_2

Future SDR Mobile Phone

Sou

rce:

Infin

eon

Tec

hnol

ogie

s

Free Area:Cost Savings

ornew

Functionality

Today‘s Mobile Phone

e.g.GSM

Sou

rce:

Infin

eon

Tec

hnol

ogie

s

Standard_1/Device_1 ↔ Standard_1/Device_2

Loadability Platform is capable of running different standards

Device ← Standard_1.exe, ..., Standard_n.exe

But we must not forget: Efficiency

Power consumption of flexible SDR must be close to power consumption of dedicated device (battery driven!)

Page 5: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

FlexibleSDR

Software Defined Radios Vision

e.g Bluetooth

GSM.exeUMTS.exe LTE.exe

On-the-fly Configuration

The three key properties: Portability

Software is portable onto different platformsStandard.exe → Device_1, ..., Device_n

Interoperability Different devices configured for the same standard interoperate

Standard_1/Device_1 Standard_1/Device_2

Contradicting Requirements !

Flexibility (programmability) vs.

Future SDR Mobile Phone

Sou

rce:

Infin

eon

Tec

hnol

ogie

s

Free Area:Cost Savings

ornew

Functionality

Today‘s Mobile Phone

e.g.GSM

Sou

rce:

Infin

eon

Tec

hnol

ogie

s

Standard_1/Device_1 Standard_1/Device_2

Loadability Platform is capable of running different standards

Device ← Standard_1.exe, ..., Standard_n.exe

But we must not forget: Efficiency

Power consumption of flexible SDR must be close to power consumption of dedicated device (battery driven!)

Flexibility (programmability) vs.Energy Efficiency

Page 6: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Outline

Introduction

Nucleus Methodology

MIMO OFDM Transceiver Implementation

Application Analysis - Nuclei Identification Efficient Nuclei Implementations on HW Platform (Flavor) Algorithmic Performance Evaluation Application-to-Architecture Mapping

Summary & Outlook

6

Page 7: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Nucleus Methodology

Nuclei N 1 N 2 N 7 NNN 5

Transceiver Description

Transceiver DescriptionN 1

N 7 N 5N 2Non N Tasks

NucleusLibrary

Nucleus

7

PE 2(rASIP)

PE 3(DSP)

MEM

Comm. Arch.

PE 5(FPGA)

PE1(ASIP)

PE 4(GPP)

HWPlatform

Nucleus• Critical, demanding, algorithmic kernel • Kernel is common among different waveforms• Not waveform nor hardware specific

Page 8: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Nuclei N 1 N 2 N 7 NNN 5

Transceiver Description

Transceiver DescriptionN 1

N 7 N 5N 2Non N Tasks

NucleusLibrary

Nucleus Methodology

8

NI

PE 2(rASIP)

PE 3(DSP)

MEM

Comm. Arch.

PE 5(FPGA)

PE1(ASIP)

PE 4(GPP)

HWPlatform

PEs PE 2 PE 3 PE 4 PE 5PE 1

NI

Flavor

Page 9: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Nuclei N 1 N 2 N 7 NNN 5

Transceiver Description

Transceiver DescriptionN 1

N 7 N 5N 2Non N Tasks

Mapping&

EvaluationCompile

NucleusLibrary

Nucleus Methodology

9

PE 2(rASIP)

PE 3(DSP)

MEM

Comm. Arch.

PE 5(FPGA)

PE1(ASIP)

PE 4(GPP)

HWPlatform

PEs

BoardSupportPackage

PE 2PE 1 PE 3 PE 4 PE 5

NINI

NI

NI NNIFlavors

NINI

Page 10: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Outline

Introduction

Nucleus Methodology

MIMO OFDM Transceiver Implementation

Application Analysis - Nuclei Identification Efficient Nuclei Implementations on HW Platform (Flavor) Algorithmic Performance Evaluation Application-to-Architecture Mapping

Summary & Outlook

10

Page 11: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Nuclei Identification: Transceiver Structure

Outer Modem Channel (De-)coding (De-)Interleaving

IEEE 802.11n

11

(De-)Interleaving Inner Modem (RX)

RX OFDM Processing Channel Estimation Spatial Equalizing: Mitigate channel impact on payload Soft Demapping: Calculate soft bits (LLRs)

BPSK, 4QAM, 16QAM

OFDM Slot

Page 12: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Nuclei Identification: Kernel Identification

12

Analyze different algorithmic choices within RX bloc ks Identify computational kernels

Recurring tasks Operate on data with certain alignment

Build application as composition of kernels

Page 13: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Nuclei Identification: Kernel Identification (Example)

LMMSE MIMO Equalizer with QRD Basic transmission equation

Linear MMSE equalization

Regularized QRD

RQ

QI

HH a

=

= σ

ˆ

nHxy +=

( ) HH HIHHGyGx ˆˆˆ,ˆ12 −

+==s

n

13

Rewrite G using Qa and Qb

RQI

Hb

=

=s

n

E

σ

G = E s

σ nQ b Q a

H

Computational Kernels Regularized QR decomposition Matrix-matrix multiplication Matrix-vector multiplication

Page 14: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Nuclei Identification: Kernel Overview

14

Application variants consist of a few kernels only!

Page 15: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Outline

Introduction

Nucleus Methodology

MIMO OFDM Transceiver Implementation

Application Analysis - Nuclei Identification Efficient Nuclei Implementations on HW Platform (Flavor) Algorithmic Performance Evaluation Application-to-Architecture Mapping

Summary & Outlook

15

Page 16: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Application Implementation: P2012 Platform (ST Microelectronics)

SoC platform with maximum of 32 clusters One cluster provides

Max. 16 RISC cores (STxP70) @ 600MHz VECx vector extension (SIMD)

128 bit vector registers 8x16 bit or 4x32 bit operations

Hardware synchronizer for inter-core signaling Interface for hardware accelerators (ASICs)

16

Page 17: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Application Implementation: Kernel Overview

For 2x2 and 4x4 MIMO use case Cycles for execution on

single STxP70 processor core including VECX unit

Corresponding time for 600MHz clock frequency

17

In the range of … Competing solutions IEEE 802.11n real time

(4µs per OFDM slot)

Page 18: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Outline

Introduction

Nucleus Methodology

MIMO OFDM Transceiver Implementation

Application Analysis - Nuclei Identification Efficient Nuclei Implementations on HW Platform (Flavor) Algorithmic Performance Evaluation Application-to-Architecture Mapping

Summary & Outlook

18

Page 19: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Algorithm Performance Evaluation: Investigated Algorithmic Choices

Wide variety of algorithms is implemented Channel Estimation, Spatial Equalizer, Channel Coding

19

Channel Estimation, Spatial Equalizer, Channel Coding Determine superior choice by error correction perfo rmance Channel simulation

Fading: i.i.d. Rayleigh Fading Power delay profile: Exponential 20dB drop along 150ns Noise: AWGN 4x4 MIMO system

Page 20: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Algorithm Performance Evaluation: ZF vs. MMSE MIMO Equalization

MMSE equalizerBetter performance at

little computational cost

4x4 MIMO, r=1/2, g1=(133)8, g2=(171)8, nconv=6144 bit, nldpc = 1944 bit)

= H

(C)–

H(C

|Ω)

20

4QAM region 16QAM region

I(C

,Ω)

= H

(C)

Page 21: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Frame Error Rate of 4x4 MIMO System (Short Frames)

21

Fix-point issues at low FERs when using MMSE-QRD, SIC-MMSE

Page 22: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Frame Error Rate of 4x4 MIMO System (Short Frames)

For the investigated algorithms MMSE-DS-QRD is a viable trade -off between

22

is a viable trade -off between algorithmic performance and implementation complexi ty

Page 23: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Frame Error Rate of 4x4 MIMO System for different Frame Sizes

Algorithmic performance comparable to results found in literature

23

to results found in literature

Page 24: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Outline

Introduction

Nucleus Methodology

MIMO OFDM Transceiver Implementation

Application Analysis - Nuclei Identification Efficient Nuclei Implementations on HW Platform (Flavor) Algorithmic Performance Evaluation Application-to-Architecture Mapping

Summary & Outlook

24

Page 25: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Application-to-Platform Mapping: Identify Parallelism

Parallelizable dimensions of OFDM receiver applicat ion Space (RX antennas) Frequency (subcarriers) Time (OFDM slots)

P = PreambleD= Data payload

25

D= Data payload

Page 26: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Application-to-Platform Mapping: Assign Cores to PGs

Given :Single core timing requirements Goal : Assign cores to match real time constraints (4µs per slot)

Task time (us) #cores

Preprocessing (per OFDM frame)

LS Channel Estimation 17.47

Equalizer Preprocessing 215.31

Actual Processing (per OFDM slot)

44

26

Actual Processing (per OFDM slot)

OFDM Demodulation (mem. realign) 6.83

Equalizer (Actual Detection) 6.08

Soft Demapping (16 QAM) 2.84

242

Page 27: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Application-to-Platform Mapping: Assign Cores

Final mapping Partitioning of components into processing groups Number of cores per group

8 cores enable real time

PG 2PP&EQ4 PEs

PG 3

27

PG 1Modulation

2 PEs

PG 3Demapping

2 PEs

Page 28: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

2PARMA: Occupation Graph

Implementation on P2012 platform using 8 cores Minimum latency for 27 or more OFDM slots of data payload Latency = 8.2µs IEEE 802.11n allows 16µs (including MAC layer)

28

Page 29: MIMO OFDM Transceiver for a Many-Core Computing · PDF fileMany-Core Computing Fabric – A Nucleus based Implementation ... MIMO OFDM Transceiver Implementation ... ZF vs. MMSE MIMO

Thank you for your attention !

Any questions?

[email protected]