27
A Complete Real-Time 802.11a Baseband Receiver Implemented on an Array of Programmable Processors Anh Tran, Dean Truong and Bevan Baas VLSI Computation Lab, ECE Department, University of California - Davis ACSSC 2008 – Pacific Grove, CA

A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

  • Upload
    others

  • View
    12

  • Download
    0

Embed Size (px)

Citation preview

Page 1: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

A Complete Real-Time 802.11a

Baseband Receiver Implemented

on an Array of Programmable

Processors

Anh Tran, Dean Truong and Bevan Baas

VLSI Computation Lab, ECE Department, University of California - Davis

ACSSC 2008 – Pacific Grove, CA

Page 2: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Outline

� Architecture of a 802.11a Digital Baseband

Receiver

� The Target Many-core Computational

Platform

� Implementation of the Receiver

� Results and Analysis

� Conclusion

Page 3: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Outline

� Architecture of a 802.11a Digital Baseband

Receiver

� The Target Many-core Computational

Platform

� Implementation of the Receiver

� Results and Analysis

� Conclusion

Page 4: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Architecture of a Complete 802.11a

Digital Baseband Receiver

� Three important features required for a practical receiver:� Frame detection and timing synchronization

� Carrier frequency offset (CFO) estimation and correction

� Channel estimation and equalization

Page 5: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Frame Detection and Timing Synchronization

� Timing metric (*):

S S S S S S S S S S GI2 L L GI SIGNAL . . .

10 short-training symbols 2 long-training symbolswith GI2

Many OFDM data symbols

Data

SIGNAL symbol

8 µs 8 µs 4 µs N x 4 µs

GI

2

2

| ( ) |( )

( )

P nM n

Q n=

15*

0

( ) ( 16). ( )k

P n r n k r n k=

= + + +∑

152

0

( ) | ( ) |k

Q n r n k=

= +∑

(*)T.M. Schmidl and D.C. Cox, ”Robust frequency and timing synchronization for OFDM,” IEEE Transactions on

Communications, pp. 1613-1621, Dec. 1997

SNR = 20 dB

where:

Page 6: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Frame Detection and Timing Synchronization

S S S S S S S S S S GI2 L L GI SIGNAL . . .

10 short-training symbols 2 long-training symbolswith GI2

Many OFDM data symbols

Data

SIGNAL symbol

8 µs 8 µs 4 µs N x 4 µs

GI

det( )M n Th> SNR = 20 dB

or:2 2

det| ( ) | ( )P n Th Q n> ⋅

( ) synM n Th<

or:

2 2| ( ) | ( )synP n Th Q n< ⋅

� Frame detection:

�Timing synchronization:

Page 7: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

CFO Estimation and Compensation

� Offset angle (*):

� CFO compensation: using CORDIC Rotation algorithm

S S S S S S S S S S GI2 L L GI SIGNAL . . .

10 short-training symbols 2 long-training symbolswith GI2

Many OFDM data symbols

Data

SIGNAL symbol

8 µs 8 µs 4 µs N x 4 µs

GI

63*

2 1

0

1( ). ( )

64 k

L k L kα

=

= ∠∑

(*) E. Sourour et al., “Frequency offset estimation and correction in the IEEE 802.11a WLAN,” IEEE Vehicular

Technology Conference, pp. 4923-4927, Sep. 2004.

CFO = 10 ppm at 5 GHz

Before

CompensatedAfter

compensated

Page 8: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Channel Estimation and Equalization

� Channel coefficients:

� Channel equalization:

S S S S S S S S S S GI2 L L GI SIGNAL . . .

10 short-training symbols 2 long-training symbolswith GI2

Many OFDM data symbols

Data

SIGNAL symbol

8 µs 8 µs 4 µs N x 4 µs

GI

� �

1 2( ) ( )1

( )2 ( )

L k L kH k

L k

+= ⋅

P. Hung et al., “Fast division algorithm with a small lookup table,” IEEE Asilomar CSSC, pp. 1465-1468, Oct. 1999.

��( )

( )( )

mm

S kS k

H k=

�( ) ( )mS k C k= ⋅

where: �

� �1 2

1 2 ( )( )

( ) ( ) ( )

L kC k

H k L k L k= =

+

Page 9: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Outline

� Architecture of the 802.11a Digital Baseband

Receiver

� The Target Many-core Computational

Platform

� Implementation of the Receiver

� Results and Analysis

� Conclusion

Page 10: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

The Target Computational Platform

Tile

Core

Osc

Comm Viterbi Decoder

FFT

16 KB SharedMemories

MotionEstimation

� Key features (*):

� 164 fine-grained processors

� 3 configurable accelerators:

� FFT, Viterbi and Motion Estimation

� 3 big shared memory modules

� Circuit-switched network

� Max. frequency of 1.2 GHz at 1.3 V

� Fabricated in ST 65 nm process

(*) D. Truong, et at., ” A 167-processor 65 nm Computational Platform with Per-Processor Dynamic Supply Voltage

and Dynamic Clock Frequency Scaling},” VLSI Circuits Symposium, Jun. 2008.

Page 11: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Outline

� Architecture of the 802.11a Digital Baseband

Receiver

� The Target Many-core Computational

Platform

� Implementation of the Receiver

� Results and Analysis

� Conclusion

Page 12: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Implementation of the Receiver

� Implement whole system using Matlab

� Program each function on one/many processors using the AsAPassembly language

� Map whole system on the AsAP platform

� Compare results with Matlab

Page 13: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

The Receiver Operates Obeying a FSM

� Compute P(n) and Q(n)

� Frame is detected if

Frame Detection

Timing Synchronization

CFO Estimation

Channel Estimation

OFDM Symbols Processing

Begin

2 2

det| ( ) | ( )P n Th Q n> ⋅

for 48 consecutive samples

Page 14: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

The Receiver Operates Obeying a FSM

Frame Detection

Timing Synchronization

CFO Estimation

Channel Estimation

OFDM Symbols Processing

Begin

� Compute P(n) and Q(n)

� After frame is detected

� Timing is synchronized at first sample that satisfies

2 2| ( ) | ( )

synP n Th Q n< ⋅

Page 15: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

The Receiver Operates Obeying a FSM

Frame Detection

Timing Synchronization

CFO Estimation

Channel Estimation

OFDM Symbols Processing

Begin

� Compute offset vectorusing two long-training symbols

� Compute offset angle αusing CORDIC Angle algorithm

Page 16: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

The Receiver Operates Obeying a FSM

Frame Detection

Timing Synchronization

CFO Estimation

Channel Estimation

OFDM Symbols Processing

Begin

� Compute C(n) from two long-training symbols in the frequency domain (after FFT)

Page 17: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

The Receiver Operates Obeying a FSM

Frame Detection

Timing Synchronization

CFO Estimation

Channel Estimation

OFDM Symbols Processing

Begin

� Includes all processors on the critical data path

� The OFDM SIGNAL symbol is used to decide the modulation scheme and code rate for all DATA symbols

Page 18: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Outline

� Architecture of the 802.11a Digital Baseband

Receiver

� The Target Many-core Computational

Platform

� Implementation of the Receiver

� Results and Analysis

� Conclusion

Page 19: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Throughput Evaluation

� Processors on the critical data path determines the receiver’s throughput

� Each processor operates as one stage of a pipeline

� The CORDIC Rotation processor is system bottleneck

� One OFDM symbol is processed by each processor in 15120 cycles

� To achieve 54 Mbps throughput, all processors must run at 3.78 GHz

Page 20: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Throughput Improvement

τ τ τ

� Using 15 processors to pipeline the CORDIC algorithm:

� Using many CORDIC

processors in parallel:

� Method 1: 2 N processors to support N CORDIC processors

� Method 2: Only N processors

Page 21: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Throughput Improvement

� When using 7 CORDIC processors in parallel, the Viterbi Decoder becomes bottleneck

� No further improvement is possible by software

� Now, each processor processes one OFDM symbol in 2376 cycles

� The receiver obtains 54 Mbps throughput at 590 MHz

Page 22: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Comparison

110110√√√120065AsAP2this work

74.754√-√28090SDRAkabane

66.424√-√400180SODALin

33.212√--260180CoPro.Yung

23.24.3-√√130350Strong

ARM

Bakker

7236√√√600130TI 64xSereni

4.71.7√--200180TI 62xTariq

Work by Platform Tech.

(nm)

Max

Freq.

(MHz)

Fram.

Det. &

Syn.

Scaled

to 65

nm

Throug

hput

(Mbps)

Chan.

Est. &

Eq.

CFO

Est. &

Comp.

� Our receiver sustains 110 Mbps throughput at max frequency of 1.2 GHz

� It is a complete one and 1.5x – 23x faster than others

Page 23: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Outline

� Architecture of the 802.11a Digital Baseband

Receiver

� The Target Many-core Computational

Platform

� Implementation of the Receiver

� Results and Analysis

� Conclusion

Page 24: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Summary

� Fine-grained many-core platform

� Task-level parallelism

� Highly flexible and scalable

� Many ways to speedup an application

� A complete 802.11a baseband receiver

� Supports all necessary features of a real receiver

� Sustain real-time 54 Mbps throughput at 590 MHz

� Can sustain up to 110 Mbps if running at maximum frequency

� Many times faster than other related works

� Future work

� Improve accelerators

� Upgrade the platform for mapping more wireless applications

Page 25: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Acknowledgments

� Intellasys Inc.

� a VEF fellowship

� SRC GRC Grant 1598 and CSR Grant 1659

� ST Microelectronics

� UC Micro

� NSF Grant 0430090 and CAREER Award 0546907

� Intel

� S Machines

Page 26: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

The End

THANK YOU !

Page 27: A Complete Real-Time 802.11a BasebandReceiver Implemented …vcl.ece.ucdavis.edu/.../AnhTran_Asilomar08_Presentation.pdf · 2008-11-12 · A Complete Real-Time 802.11a BasebandReceiver

Compute Bit Rate and Frame Length

Frame Detection

Timing Synchroniza

-tion

Carr. Freq. Offset

Estimation

Complex Rotation

Guard Removal

64-pt FFT

Subcarrier Reordering

Channel Estimation

Channel Equalizer

Constellation De-mapping

De-interleaving Step 1

De-interleaving Step 2

De-puncturing

Viterbi Decoder

De-scrambling

Pad Removal

from Analog to Digital Converter

to Media Access Control layer