Embedded Signal Processing Laboratory at UT Austin
Prof. Brian L. Evans
Dept. of Electrical and Computer Eng.
The University of Texas at Austin
http://www.ece.utexas.edu/~bevans
2
Got to Texas as Fast as I Could…
BSEE/CS Rose-Hulman 1987
MSEE Georgia Tech 1988
PhDEE Georgia Tech 1993
Post-Doc UC Berkeley 1993-1996
Very happy to land in Austin in Fall 1996
at …
3
Summary of Previous NI Interaction
NI Support Prior to Fall 2002
  Funding for Babar Ahmed (undergraduate)
Real-Time DSP Lab alumni who went to NI
  Prethi Gopinath, Newton Petersen, Junichi Suguira
NI employees in Embedded Software Systems
  Hugo Andrade, Scott Kovner, Sadia Malik, Kurt Nee, Newton Petersen, Ram Rajagopal, Michael Schaeffer
4
Outline
Real-Time Digital Signal Processing Lab
  Programmable Digital Signal Processors
  Future Uses of LabVIEW
Embedded Software Systems graduate course
  Electronic Design Automation Tools
  Interaction with National Instruments
Research Group (Embedded Signal Proc. Lab)
  Common Themes
  ADSL Transceiver Design
5
Real-Time DSP Lab
Introduced Fall 1997: 384 served
  Digital signal processing theory/algorithms
  Digital communication systems
  Digital signal processor architecture
Deliverable: Voiceband modem
  Design of sinusoidal generators, filters, etc.
  Implementation in C/assembly on TI floating-point TMS320C6700 DSP using Code Composer Studio
  Test implementation with spectrum analyzers, etc.
6
Digital Signal Processors (DSPs)
For real time (guaranteed delivery)
Fixed-point DSPs for high-volume products
  Battery-powered: cell phones, dial-up modems, portable MP3 players, digital still cameras, and digital video (e.g. TI C5000)
  Wall-powered: ADSL modems, VDSL modems, cell phone basestations, modem banks, laser printers, video conferencing systems (e.g. TI C6200, C6400)
Floating-point DSPs for low-volume products and feasibility analysis on fixed-point DSPs
TI 45%, Agere 25%, Motorola 10%, Analog Devices 8%
7
Digital Signal Processor Architecture
Harvard architecture: program and data memory are separate and can be accessed in the same cycle
Word size: 16, 20, 24, or 32 bits
Programmer must manage memory
  32-128 kwords data/program on chip
  On-chip data cache rare (TI C6000)
  No support for virtual memory
Predictable input/output: deterministic interrupt service routine latency (e.g. 11 cycles on TI C6000)
8
Digital Signal Processor Architecture
Deterministic, no-overhead looping
Single instruction cycle multiply unit(s)
No-overhead addressing modes in hardware
  Modulo addressing for circular buffers, e.g. filters
  Bit-reversed addressing, e.g. fast Fourier transforms (not available on TI C6000)
Native number formats
  Integer: binary point on far right of bit pattern
  Fractional: binary point just right of sign bit
  Floating-point: could emulate on fixed-point DSPs
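The role of modulo addressing in a filter can be illustrated in software. The following is a minimal sketch in Python, not DSP code: the class name CircularFIR and the coefficients are hypothetical. On a DSP the wrap-around of the index is done by the address generation hardware at no cost, whereas here it is written out with the % operator.

```python
class CircularFIR:
    """FIR filter whose delay line is a circular buffer indexed modulo its length."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.buf = [0.0] * len(self.coeffs)  # delay line, initially all zeros
        self.newest = 0                      # index of the most recent sample

    def step(self, x):
        n = len(self.buf)
        # Overwrite the oldest sample instead of shifting the whole delay line.
        self.newest = (self.newest + 1) % n
        self.buf[self.newest] = x
        acc = 0.0
        for k, h in enumerate(self.coeffs):
            # Read the sample that is k steps old; the index wraps modulo n.
            acc += h * self.buf[(self.newest - k) % n]
        return acc


fir = CircularFIR([0.5, 0.25, 0.25])  # hypothetical coefficients
impulse_response = [fir.step(x) for x in [1.0, 0.0, 0.0, 0.0]]
print(impulse_response)
```

Feeding a unit impulse returns the coefficients in order, which is a quick check that the modulo indexing reads the delay line correctly.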
9
Drawbacks to Programming DSPs
General drawbacks
  Limited on-chip memory
  Poor C compiler performance
Fixed-point issues
  Non-standard C extensions for fractional data
  Converting floating-point programs to fixed-point
  Manual tracking of binary point prone to error
Conventional DSPs
  No byte addressing (needed for image/video)
  Limited addressable memory on fixed-point DSPs
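To make the binary-point bookkeeping concrete, here is a minimal sketch in Python of the 16-bit fractional format (often called Q15), in which the binary point sits just right of the sign bit. The helper names are hypothetical, and the rounding and saturation choices are assumptions of this sketch rather than the behavior of any particular DSP.

```python
Q15_ONE = 1 << 15  # scale factor: 15 fractional bits after the sign bit


def saturate16(value):
    """Clamp an integer to the range of a signed 16-bit word."""
    return max(-32768, min(32767, value))


def float_to_q15(x):
    """Convert a float in [-1, 1) to Q15, saturating out-of-range inputs."""
    return saturate16(int(round(x * Q15_ONE)))


def q15_to_float(q):
    return q / Q15_ONE


def q15_mul(a, b):
    """Multiply two Q15 numbers.

    The 16x16-bit product carries 30 fractional bits, so it is rounded
    and shifted right by 15 to move the binary point back to Q15.
    """
    return saturate16((a * b + (1 << 14)) >> 15)


half = float_to_q15(0.5)
quarter = q15_mul(half, half)
print(half, quarter, q15_to_float(quarter))
```

Forgetting the shift in q15_mul is the kind of binary-point error the slide refers to: the result would be scaled by 2^15 and would overflow a 16-bit word.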
10
LabVIEW for Real-Time DSP Lab
Students use LabVIEW in a pre-requisite
Fall 2003: System-level representation
  In the first lab, students are given a LabVIEW simulation of a voiceband modem running on a PC
  In each subsequent lab, students substitute the subsystem implemented on the DSP into the LabVIEW simulation to test the design
Future: Synthesis vs. handcoding
  In each lab, students use the LabVIEW modem simulation to synthesize the subsystem being designed and compare their handwritten code with it
11
Embedded Software Systems
Introduced in Spring 1997: 87 served
Modern methods for specifying, simulating, and synthesizing embedded systems
  Programming languages
  Concurrency
  Dataflow models
  Process network
  Scheduling
  Software synthesis
  Discrete-event models
  Cosimulation
Students evaluate/build system designs in
  Ptolemy from UC Berkeley
  Advanced Design System from Agilent
12
Dataflow Models
Examples in modern design automation tools
EDA Tool | Dataflow Models | Example Application
Agilent Advanced Design System | Synchronous Dataflow, Timed Synchronous Dataflow | Mixed analog, digital, and RF communication systems (data transmission subsystem)
Co-Centric System Design Studio | Cyclostatic Dataflow | Periodic digital systems, e.g. data converters, MP3 decoder, digital baseband communications
Cadence Signal Processing Worksystem | Synchronous Dataflow, Dynamic Dataflow | Periodic digital systems
UC Berkeley Ptolemy | Synchronous Dataflow, Boolean Dataflow, Dynamic Dataflow | Periodic and aperiodic digital systems
13
Synchronous Dataflow [Lee 1986]
Arcs: one-way first-in first-out queues
A block is enabled for execution when enough tokens are available on all inputs
  Source blocks are always enabled
When a block executes, it always produces and consumes the same fixed number of tokens
  Consumed data is dequeued from the arc
  Flow of data through the graph may not depend on the values of the data
Delay is a property of an arc
  Delay of n samples means that n tokens are initially in the queue of that arc
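The enabling rule and the meaning of a delay can be sketched in a few lines of Python. This is an illustrative toy, not the implementation used by any of the tools named in this deck; the names Arc, enabled and fire are hypothetical. The example is an accumulator with a feedback arc: the single initial token on that arc (a delay of one sample) is what allows the block to fire at all.

```python
from collections import deque


class Arc:
    """One-way FIFO queue; `delay` tokens are placed in it initially."""

    def __init__(self, produce, consume, delay=0):
        self.produce = produce            # tokens written per firing of the source
        self.consume = consume            # tokens read per firing of the sink
        self.queue = deque([0] * delay)   # initial tokens model the delay


def enabled(input_arcs):
    """A block is enabled when every input arc holds enough tokens."""
    return all(len(a.queue) >= a.consume for a in input_arcs)


def fire(input_arcs, output_arcs, func):
    """Dequeue a fixed number of tokens, compute, enqueue a fixed number."""
    consumed = [[a.queue.popleft() for _ in range(a.consume)] for a in input_arcs]
    produced = func(consumed)
    for arc, tokens in zip(output_arcs, produced):
        assert len(tokens) == arc.produce  # rates are fixed, not data-dependent
        arc.queue.extend(tokens)


x = Arc(produce=1, consume=1)
x.queue.extend([1, 2, 3])                      # tokens already emitted by a source
feedback = Arc(produce=1, consume=1, delay=1)  # one initial token breaks the cycle

outputs = []
while enabled([x, feedback]):
    fire([x, feedback], [feedback], lambda ins: [[ins[0][0] + ins[1][0]]])
    outputs.append(feedback.queue[-1])
print(outputs)
```

With delay=0 on the feedback arc the loop body never runs, because the block waits on a token that only it can produce.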
14
Synchronous Dataflow
Systems are determinate
  History of tokens produced on communication channels does not depend on the execution order
  May be executed sequentially or in parallel with the same outcome
Scheduling
  Load balancing to make sure that all tokens produced can be consumed: linear complexity
  Find a periodic schedule
    List scheduling: worst-case is exponential complexity
    Heuristics to minimize buffer size: cubic complexity
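The load-balancing step amounts to solving the balance equations: for every arc, the number of firings of the source times its production rate must equal the number of firings of the sink times its consumption rate. The sketch below, in Python, finds the smallest integer solution for a connected graph. The function name and the three-block example are hypothetical, and this simple propagation loop is written for clarity rather than for the linear complexity quoted above.

```python
from fractions import Fraction
from math import gcd


def repetition_vector(blocks, arcs):
    """Solve q[src] * produced == q[dst] * consumed for every arc.

    arcs is a list of (src, produced, dst, consumed) tuples.
    Assumes the graph is connected. Raises ValueError if the rates
    are inconsistent, in which case no periodic schedule exists.
    """
    rate = {blocks[0]: Fraction(1)}
    changed = True
    while changed:  # propagate relative firing rates along the arcs
        changed = False
        for src, p, dst, c in arcs:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * p / c
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * c / p
                changed = True
    for src, p, dst, c in arcs:
        if rate[src] * p != rate[dst] * c:
            raise ValueError("inconsistent sample rates")
    # Scale the rational rates to the smallest positive integers.
    common = 1
    for r in rate.values():
        common = common * r.denominator // gcd(common, r.denominator)
    counts = {b: int(r * common) for b, r in rate.items()}
    divisor = 0
    for v in counts.values():
        divisor = gcd(divisor, v)
    return {b: v // divisor for b, v in counts.items()}


# Hypothetical multirate chain: A produces 2, B consumes 3; B produces 1, C consumes 2.
q = repetition_vector(["A", "B", "C"], [("A", 2, "B", 3), ("B", 1, "C", 2)])
print(q)
```

One period of a schedule for this example fires A three times, B twice and C once, after which every queue returns to its starting length.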
15
Synchronous Dataflow Modeling
Signal Processing
  Finite impulse response filters
  Infinite impulse response filters
  Fast Fourier transform
  Multirate systems and filter banks
Communication Systems
  Sinusoidal modulation and demodulation
  Pulse shapers
  Transmission subsystem
Inappropriate for data-dependent graphs, e.g. baud rate negotiation at modem startup
16
Process Network [Kahn 1974]
A set of concurrent processes that communicate through a network of one-way, infinite first-in first-out (FIFO) queues
Reads from queues are blocking
  If the queue is empty, the process will suspend until there is enough data in the queue
  When a process blocks, the scheduler will not run the process until enough data becomes available
Writes to the queues are non-blocking
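These semantics map closely onto threads connected by unbounded queues. The following Python sketch uses the standard library queue.Queue, whose get() blocks on an empty queue and whose put() never blocks when no maximum size is set. The three process functions and the None end-of-stream marker are conventions of this sketch only; they are not taken from Ptolemy or from the UT Austin framework.

```python
import queue
import threading


def producer(out_q, count):
    for i in range(count):
        out_q.put(i)          # unbounded queue: the write never blocks
    out_q.put(None)           # end-of-stream marker (assumption of this sketch)


def scale(in_q, out_q, gain):
    while True:
        x = in_q.get()        # blocking read: suspends until a token arrives
        if x is None:
            out_q.put(None)
            return
        out_q.put(gain * x)


def collect(in_q, result):
    while True:
        x = in_q.get()
        if x is None:
            return
        result.append(x)


a, b = queue.Queue(), queue.Queue()   # no maxsize, so effectively infinite FIFOs
result = []
threads = [
    threading.Thread(target=collect, args=(b, result)),
    threading.Thread(target=scale, args=(a, b, 3)),
    threading.Thread(target=producer, args=(a, 5)),
]
for t in threads:             # start order is deliberately consumer-first
    t.start()
for t in threads:
    t.join()
print(result)
```

Because each process reads with a blocking get() on one queue at a time, the sequence of tokens collected is the same whatever order the threads happen to run in.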
17
Process Network
A process is either enabled or blocked waiting for data on only one of its input channels
Systems are determinate
  History of tokens produced on communication channels does not depend on the execution order
  May be executed sequentially or in parallel with the same outcome
Supports recurrence and recursion
Formal mathematical representation: processes are functions that map streams into streams
18
Process Network
Turing complete: questions of termination and bounded buffering are undecidable
Undecidable (in finite time) if process network
  Terminates
  Requires bounded memory
Signal processing: run for infinite time
  Scheduler can find a bounded memory solution using infinite time [Parks 1995]
Ptolemy Process Network domain
UT Austin Computational Process Network framework in C++
  http://www.ece.utexas.edu/~allen/PNSourceCode/
19
NIers in Embedded Software Systems
Hugo Andrade and Scott Kovner, 1998, “Software Synthesis from Dataflow Models for Embedded Software Design in the G Programming Language and the LabVIEW Development Environment”
Kurt Nee (with Chad Roesle), 1999, “Feasibility of Implementing an H.263+ Decoder on a TMS320C6x Digital Signal Processor”
20
NIers in Embedded Software Systems
Michael Schaeffer, 1999, “An Extension to the Foundation Fieldbus Model for Specifying Process Control Strategies”
Sadia Malik and Ram Rajagopal, 2000, “LabVIEW Based Embedded Design”
Newton Petersen (with Martin Wojcik), 2000, “Node Prefetch Prediction in Dataflow Graphs”
21
Image Analysis
  Ph.D. graduates: Dong Wei (SBC Research), K. Clint Slatton (UT Center for Space Research), Wade C. Schwartzkopf
Real-Time Imaging
  Ph.D. graduates: Thomas D. Kite (Audio Precision), Niranjan Damera-Venkata (HP Labs)
  Ph.D. students: Gregory E. Allen (UT Applied Research Labs), Serene Banerjee
  MS graduates: Young Cho (UCLA)
  MS students: Vishal Monga
ADSL/VDSL Transceiver Design
  Ph.D. graduates: Güner Arslan (Cicada), Biao Lu (Schlumberger)
  Ph.D. students: Dogu Arifler, Ming Ding, Milos Milosevic (Schlumberger)
Wireless Communications
  Ph.D. graduates: Murat Torlak (UT Dallas)
  Ph.D. student: Kyungtae Han
  MS graduates: Srikanth K. Gummadi (TI), Amey A. Deosthali (TI)
  MS students: Zukang Shen, Ian Wong
http://signal.ece.utexas.edu
Wireless Networking and Comm. Group: http://www.wncg.org
Center for Perceptual Systems: http://www.cps.utexas.edu
Prof. Brian L. Evans
22
Common Themes
Find or derive optimal algorithm
Develop low-complexity algorithms (bottom-up design)
  Keep in mind that these algorithms will ultimately be realized in real time on a fixed-point DSP
  Algorithms should be statically scheduled
  Evaluate performance-implementation tradeoff
System-level design (top-down design)
  Dataflow modeling for synthesis
  Simulate system to validate algorithm
Software releases
23
ADSL Transceiver Design
Asymmetric Digital Subscriber Line modem
  Line driver (single chip)
  Transceiver: analog front end + digital baseband
Sampling rate: 2.208 MHz (real time)
Bit error rate: 10^-7 (Reed-Solomon codes)
Symbol rate: 4,000 symbols/s
Frame is symbol plus redundant information
Single frame transmission (low delay)
Proper equalizer design can double bit rate
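The sampling rate and the symbol rate quoted above can be cross-checked with a short calculation. The sketch below, in Python, uses the downstream G.DMT values given elsewhere in this deck (N = 512, cyclic prefix of 32 samples). The superframe structure of 68 data frames plus one synchronization frame is an assumption taken from the G.DMT standard, not from these slides.

```python
from fractions import Fraction

SAMPLING_RATE_HZ = 2_208_000        # downstream sampling rate
FFT_SIZE = 512                      # N, samples per symbol
CYCLIC_PREFIX = 32                  # samples prepended to each symbol
DATA_FRAMES_PER_SUPERFRAME = 68     # assumption: G.DMT superframe layout
FRAMES_PER_SUPERFRAME = 69          # 68 data frames + 1 sync frame

# Spacing between adjacent subcarriers of the N-point FFT.
subcarrier_spacing_hz = Fraction(SAMPLING_RATE_HZ, FFT_SIZE)

# Frames actually sent on the line each second (symbol plus cyclic prefix).
line_frame_rate = Fraction(SAMPLING_RATE_HZ, FFT_SIZE + CYCLIC_PREFIX)

# Frames per second that carry user data.
data_symbol_rate = line_frame_rate * Fraction(
    DATA_FRAMES_PER_SUPERFRAME, FRAMES_PER_SUPERFRAME
)

print(float(subcarrier_spacing_hz), float(line_frame_rate), float(data_symbol_rate))
```

Under these assumptions the subcarrier spacing comes out at about 4.3 kHz and the data symbol rate at 4,000 symbols/s, consistent with the figures on the slides.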
24
Digital Subscriber Line (DSL) Broadband Access
[Figure: DSL broadband access between Customer Premises and Central Office. Labels: downstream, upstream, DSL modem (two), LPF (two), DSLAM, Voice Switch, Telephone Network, Internet]
DSLAM: DSL Access Multiplexer
LPF: Low Pass Filter
25
Discrete Multitone (DMT) Standards
ADSL – Asymmetric DSL (G.DMT Standard)
  Echo cancelled: no longer deployed in central office
  Frequency division multiplexing: max. data rates 13.38 Mbps downstream, 1.56 Mbps upstream
  ADSL:cable modem – 1:2 in US & 5:1 non-US
DMT VDSL – Very High Rate DSL (Proposed)
  Faster G.DMT ADSL
  Freq. division multiplex
  2^m subcarriers, m ∈ [8, 12]
Parameter | G.DMT ADSL | Asymmetric DMT VDSL
Data band | 25 kHz – 1.1 MHz | 1 MHz – 12 MHz
Upstream subcarriers | 32 | 256
Downstream subcarriers | 256 | 2048/4096
Target upstream rate | 1 Mbps | 3 Mbps
Target downstream rate | 8 Mbps | 13/22 Mbps
26
Multicarrier Modulation
Divide channel into narrowband subchannels
  No inter-symbol interference (ISI) if constant gain in every subchannel and ideal sampling
Discrete multitone modulation
  Based on fast Fourier transform (FFT)
[Figure: channel magnitude versus frequency, divided into subchannels, each with its own carrier. Inset: the inverse DTFT of a frequency-domain pulse with cutoff ω_c is the sinc sin(ω_c k) / (π k)]
Subchannels are 4.3 kHz wide in ADSL and DMT VDSL
27
Discrete Multitone Modulation Symbol
Subsymbols are complex-valued
  ADSL training uses 4-level Quadrature Amplitude Modulation (QAM)
  ADSL uses QAM of 2^2, 2^3, 2^4, …, 2^15 levels during data transmission
[Figure: a QAM constellation (in-phase and quadrature axes) gives subsymbol X_i. The N/2 subsymbols (one subsymbol per carrier) X_0, X_1, X_2, …, X_{N/2-1}, X_{N/2} and the mirrored conjugates X_{N/2-1}*, …, X_2*, X_1* feed an N-point inverse FFT, which outputs one symbol of N real-valued samples x_1, x_2, x_3, …, x_N]
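The reason for mirroring the subsymbols is that a conjugate-symmetric spectrum has a real-valued inverse transform. A minimal Python sketch follows; it uses a direct O(N^2) inverse DFT from the standard library rather than an FFT, the function names are hypothetical, and setting X_0 and X_{N/2} to zero is an assumption of this sketch.

```python
import cmath


def inverse_dft(spectrum):
    """Direct inverse DFT; an FFT would be used in practice."""
    n_points = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * n / n_points)
            for k in range(n_points)) / n_points
        for n in range(n_points)
    ]


def dmt_symbol(subsymbols):
    """Build one DMT symbol from complex subsymbols X_1 .. X_{N/2-1}.

    X_0 and X_{N/2} are set to zero here (assumption of this sketch).
    The upper half of the spectrum is the mirrored complex conjugate,
    X_{N-k} = conj(X_k), which makes the time-domain samples real.
    """
    subsymbols = list(subsymbols)
    lower_half = [0j] + subsymbols + [0j]
    mirrored = [x.conjugate() for x in reversed(subsymbols)]
    return inverse_dft(lower_half + mirrored)


# Three 4-QAM subsymbols give an N = 8 point symbol.
samples = dmt_symbol([1 + 1j, -1 + 1j, 1 - 1j])
largest_imaginary_part = max(abs(s.imag) for s in samples)
print(len(samples), largest_imaginary_part)
```

The imaginary parts are zero up to floating-point rounding, so only the real parts need to be sent to the D/A converter.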
28
Discrete Multitone Modulation Frame
Frame is sent through D/A converter and transmitted
  Frame is the symbol with cyclic prefix prepended
  Cyclic prefix (CP) is the last ν samples of the symbol
Linear convolution of frame with channel impulse response
  Is circular convolution if channel length is CP length plus one or shorter
  If circular, frequency equalization in FFT domain
[Figure: each frame is a cyclic prefix (CP) of ν samples, copied from the end of the symbol, followed by the N samples of the symbol; the frames for symbol i and symbol i+1 are shown back to back]
ADSL G.DMT Values | Downstream | Upstream
ν | 32 | 4
N | 512 | 64
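The claim that the cyclic prefix turns linear convolution into circular convolution can be checked numerically. The Python sketch below uses a toy symbol of 8 samples, a prefix of 2 samples and a 3-tap channel (prefix length plus one); all values and function names are hypothetical.

```python
def linear_convolve(x, h):
    """Ordinary (linear) convolution, as performed by the channel."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def circular_convolve(x, h):
    """Circular convolution of period len(x)."""
    n_points = len(x)
    return [
        sum(h[j] * x[(n - j) % n_points] for j in range(len(h)))
        for n in range(n_points)
    ]


def add_cyclic_prefix(symbol, prefix_length):
    """Prepend a copy of the last prefix_length samples of the symbol."""
    return symbol[len(symbol) - prefix_length:] + symbol


symbol = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
prefix_length = 2
channel = [1.0, 0.5, 0.25]           # length is prefix_length + 1

frame = add_cyclic_prefix(symbol, prefix_length)
received = linear_convolve(frame, channel)

# The receiver discards the prefix and keeps the next N samples.
kept = received[prefix_length:prefix_length + len(symbol)]
matches_circular = all(
    abs(a - b) < 1e-12 for a, b in zip(kept, circular_convolve(symbol, channel))
)
print(matches_circular)
```

If the channel is made one tap longer than prefix_length + 1, the comparison fails for the first kept sample, which is the inter-symbol interference that the time domain equalizer is meant to remove.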
29
Eliminating Inter-Symbol Interference
Time domain equalizer (TEQ)
  Finite impulse response (FIR) filter
  Effective channel impulse response: convolution of TEQ impulse response with channel impulse response
Frequency domain equalizer (FEQ)
  Compensates magnitude and phase distortion of channel + TEQ by dividing each FFT coefficient by a complex number
ADSL G.DMT equalizer training
  Reverb: same symbol sent 1,024 to 1,536 times
  Medley: aperiodic sequence of 16,384 symbols
  At 0.25 s after medley, receiver returns number of bits on each subcarrier that can be supported
[Figure: channel impulse response and effective channel impulse response. Δ: transmission delay; ν: cyclic prefix length]
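The frequency domain equalizer reduces to one complex division per subchannel once the effective channel acts as a circular convolution. The Python sketch below illustrates this with a direct DFT from the standard library; the symbol values, the 3-tap effective channel and the function names are hypothetical, and the effective channel is assumed to be known exactly and to have no spectral nulls.

```python
import cmath


def dft(x, n_points):
    """Direct DFT of x zero-padded to n_points; an FFT would be used in practice."""
    padded = list(x) + [0.0] * (n_points - len(x))
    return [
        sum(padded[n] * cmath.exp(-2j * cmath.pi * k * n / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


def circular_convolve(x, h):
    n_points = len(x)
    return [
        sum(h[j] * x[(n - j) % n_points] for j in range(len(h)))
        for n in range(n_points)
    ]


def frequency_domain_equalize(received, effective_channel):
    """Divide each FFT coefficient by the channel's frequency response."""
    n_points = len(received)
    spectrum = dft(received, n_points)
    response = dft(effective_channel, n_points)
    return [y / h for y, h in zip(spectrum, response)]


symbol = [1.0, -2.0, 3.0, 0.5, -1.0, 2.0, 0.0, 4.0]
effective_channel = [1.0, 0.5, 0.25]   # channel convolved with TEQ (assumed known)

received = circular_convolve(symbol, effective_channel)
recovered = frequency_domain_equalize(received, effective_channel)
transmitted = dft(symbol, len(symbol))
max_error = max(abs(r - t) for r, t in zip(recovered, transmitted))
print(max_error)
```

In a real modem the frequency response is estimated during training rather than known in advance, which is the purpose of the Reverb and Medley sequences listed above.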
30
ADSL Transceiver: Data Transmission
[Figure: conventional ADSL equalizer structure]
TRANSMITTER: Bits (00110) → S/P → quadrature amplitude modulation (QAM) encoder → mirror data and N-IFFT → add cyclic prefix → P/S → D/A + transmit filter → channel
  N/2 subchannels into the N-IFFT, N real samples out
RECEIVER: channel → receive filter + A/D → time domain equalizer (FIR filter) → remove cyclic prefix → S/P → N-FFT and remove mirrored data → frequency domain equalizer (invert channel) → QAM demod decoder → P/S
  N real samples into the N-FFT, N/2 subchannels out
31
Simulation Results for 17-Tap TEQ
Achievable percentage of upper bound on bit rate
ADSL CSA Loop | Minimum MSE | Maximum Geometric SNR | Maximum Shortening SNR | Minimum ISI | Maximum Bit Rate | Upper Bound (Mbps)
1 | 43% | 84% | 62% | 99% | 99% | 9.059
2 | 70% | 73% | 75% | 98% | 99% | 10.344
3 | 64% | 94% | 82% | 99% | 99% | 8.698
4 | 70% | 68% | 61% | 98% | 99% | 8.695
5 | 61% | 84% | 72% | 98% | 99% | 9.184
6 | 62% | 93% | 80% | 99% | 99% | 8.407
7 | 57% | 78% | 74% | 99% | 99% | 8.362
8 | 66% | 90% | 71% | 99% | 100% | 7.394
Cyclic prefix length: 32
FFT size (N): 512
Coding gain: 4.2 dB
Margin: 6 dB
Input power: 23 dBm
Noise power: -140 dBm/Hz
Crosstalk noise: 8 ADSL disturbers
POTS splitter: 5th order Chebyshev
32
Simulation Results for 3-Tap TEQ
Achievable percentage of matched filter bound on bit rate
ADSL CSA Loop | Minimum MSE | Maximum Geometric SNR | Maximum Shortening SNR | Minimum ISI | Maximum Bit Rate | Upper Bound (Mbps)
1 | 54% | 70% | 96% | 97% | 98% | 9.059
2 | 47% | 71% | 96% | 96% | 97% | 10.344
3 | 57% | 69% | 92% | 98% | 99% | 8.698
4 | 46% | 66% | 97% | 97% | 98% | 8.695
5 | 52% | 65% | 96% | 97% | 98% | 9.184
6 | 60% | 71% | 95% | 98% | 99% | 8.407
7 | 46% | 63% | 93% | 96% | 97% | 8.362
8 | 55% | 61% | 94% | 98% | 99% | 7.394
Cyclic prefix length: 32
FFT size (N): 512
Coding gain: 4.2 dB
Margin: 6 dB
Input power: 23 dBm
Noise power: -140 dBm/Hz
Crosstalk noise: 8 ADSL disturbers
POTS splitter: 5th order Chebyshev
33
Contributions by Research Group
New time-domain equalizer design methods
  Maximum Bit Rate method maximizes bit rate (upper bound)
  Minimum Inter-Symbol Interference method (real-time, fixed-point)
Minimum Inter-Symbol Interference TEQ design method
  Reduces number of TEQ taps by a factor of ten over Minimum Mean Squared Error method for the same bit rate in discretized simulation
  Implemented in real-time on Motorola 56000, TI TMS320C6200 and TI TMS320C5000 DSPs: http://www.ece.utexas.edu/~bevans/projects/adsl
34
Matlab DMT TEQ Design Toolbox 3.1
FIR, dual-path, per-tone & filter bank equalizers: http://www.ece.utexas.edu/~bevans/projects/adsl/dmtteq/
[Figure: toolbox screenshot showing various performance measures, default parameters from the G.DMT ADSL standard (fields show -140 and 23), and different graphical views]
35
Future Interaction with NI
Integrate LabVIEW into Real-Time DSP Lab to reinforce modem system being designed
Add lecture on LabVIEW computational model in Embedded Software Systems course
Discuss ideas for extensions to LabVIEW for synthesis onto programmable DSPs
  Evaluate restrictions and extensions to the G language for synthesis
  Investigate methods for conversion of floating-point source code to fixed-point