Upload
dangxuyen
View
251
Download
0
Embed Size (px)
Citation preview
Multithreaded Processor for
Software Defined Radio
1 North Lexington Ave, 10th FloorWhite Plains, New York 10601
914-287-8500
John Glossner, Ph.D., Founder, CTO & EVP
Erdem Hokenek, Ph.D., Founder, Chief H/W Architect
Mayan Moudgill, Ph.D., Founder, Chief S/W [email protected]
http://www.SandbridgeTech.com
Agenda
Motivation for SDR
Multithreaded SDR Processor
Software Development ToolsCompilerSimulatorIDE
Communications System Implementation2Mbps WCDMA802.11b
http://www.SandbridgeTech.com
The Challenges of an Industry
Cost3G is >10x more complex than 2G
but cost should be same, or even lessConvergence phone 2x complexity
WLAN, 2G, and 2.5G integration Traditionally implemented in HW
Moore’s law reduces cost 50% every 18 months6 years until the wireless multimedia is a real consumer market
Time-to-marketGPRS terminals were late
OEMs had to wait until bug free SoCs were available3G terminals will be late
OEMs have to wait until bug free SoCs with ‘reasonable’ power consumption are available
http://www.SandbridgeTech.com
A Whole Industry’s approach failed …
DSP
Fixed
ARM
WLAN802.11
GSM
Bluetooth
GPRS
CDMA2k
CDMAone
JavaWCDMAFDD
IS-136
MPEG-4
MP3
EDGE
PDC
USB
DSP
ARM
VGA
… on advanced wireless systems …
WCDMATD-SCDMA
WCDMATDD
GPS
… with multimedia
http://www.SandbridgeTech.com
Agenda
Motivation for SDR
Multithreaded SDR Processor
Software Development ToolsCompilerSimulatorIDE
Communications System Implementation2Mbps WCDMA802.11b
http://www.SandbridgeTech.com
SandblasterTM Architecture Performs
CompilableDSP
Java ProcessorControl
Processor
SystemProductivityAdvantage
9-12+ months
C programmedLatency hiding architecture
3G Applications Standard 3G, xDSL, 802.11Control Stacks
http://www.SandbridgeTech.com
Processor Execution Models
addi r0,r2,8
muli r8,r0,4
IF RD EX WBIF RD EX WB
IF RD EX WBIF RD EX WB
Pipelined Processor Stalls
addi r0,r2,8
muli r8,r3,4
IF RD EX WBIF RD EX WB
IF RD EX WBIF RD EX WB
Monolithic Processor
addi r0,r2,8
muli r8,r3,4
IF RD EX WB
IF RD EX WB
Pipelined Processor
addi r0,r2,8
muli r8,r3,4
IF RD EX WBIF RD EX WB
IF RD EX WBIF RD EX WB
Pipelined Processor
addi r0,r2,8 IF RD EX WB
IF RD EX WBIF RD EX WB
Multithreaded Processor
muli r8,r0,4 IF RD EX WB
addi r0,r2,8 IF RD EX WBIF RD EX WB
IF RD EX WBIF RD EX WBIF RD EX WBIF RD EX WB
Multithreaded Processor
muli r8,r0,4 IF RD EX WBIF RD EX WB
http://www.SandbridgeTech.com
Multithreaded Architecture
Thread 0# Monitoring FCCH&SCHlu r1,M(r2) lu r1,M(r2) add r2,r3,r4cmp cf2,r5,r3jc cf2,(Synch_off)
Thread 7# Monitoring CPICHlu r1,M(r2)lu r1,M(r2) call (De_scrambler)
stu r3,M(r2)
Thread 6# Handover Code
ctsr r1,sr2 lu r1,M(r2).jc cf0,(UMTS_Mode)
Thread 5# FIR Filter
lvu vr1,M(r4)vmacs
vr3,vr1,vr2,wr0loop lcr0,label(4x)
Thread 4# Pulse shaping Code
lvuvr1,M(r4)vmacs vr3,vr1,vr2,wr0loop lcr0,label(4x)
Thread 3# WCDMA Handover Code
ctsr r1,sr2 lu r1,M(r2).jc cf0,(WCDMA_Mode)
Thread 2# Exit Code ..Start Thread33..Start Thread24
)
Thread 1# WCDMA Main
Start Thread0Start Thread1Start Thread2
Start Thread13Start Thread14loop lc1,0xf0
)
I –C
ache
64K
B64
B L
ines
4W (2
-act
ive)
I- DecodeI- Decode
JumpQ PCCRJTRLCR
Data Buffer
MPY
VRABC
VPR0
MPY
VRABC
VPR1
PABCMPY
VPR2
MPY
VPR3
SAT
VRABCVRABC
PABCPABCPABC
ACC ACC ACC ACCADD ADD ADD ADD
AGEN
(16) 32-bit GPR
LS IQ
LRA LRB
Address
DirLRU Replace
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
8-Banks / 2Active
INT IQ
IRA IRB
ALU
WB
Bus/MemoryInterface
Interrupt SIMDIQ
Sea of Threads
FastCross Thread
Interrupts
Hardware Scheduled
FullyInterlocked
HighlyParallel
CProgrammed
Code&Data Sharing Across Threads
Key to Low Power Implementation
http://www.SandbridgeTech.com
Multithreaded Baseband Processor
High ParallelismVector / SIMD data parallelismMultiple instruction issueThread-level parallelism
I –C
ache
64K
B64
B L
ines
4W (2
-act
ive)
I- DecodeI- Decode
JumpQ PCCR
JTR
LCR
Data Buffer
MPY
VRABC
VPR0
MPY
VRABC
VPR1
PABC
MPY
VPR2
MPY
VPR3
SAT
VRABC VRABC
PABCPABCPABC
ACC ACC ACC ACCADD ADD ADD ADD
AGEN
(16) 32-bit GPR
LS IQ
LRA LRB
Address
DirLRU Replace
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
8-Banks
INT IQ
IRA IRB
ALU
WB
Bus/MemoryInterface
Interrupt SIMDIQI –
Cac
he64
KB
64B
Lin
es4W
(2-a
ctiv
e)
I –C
ache
64K
B64
B L
ines
4W (2
-act
ive)
I- DecodeI- Decode
JumpQ PCPCCRCR
JTRJTR
LCRLCR
Data Buffer
MPY
VRABC
VPR0
MPY
VRABC
VPR1
PABC
MPY
VPR2
MPY
VPR3
SAT
VRABC VRABC
PABCPABCPABC
ACC ACC ACC ACCADD ADD ADD ADD
AGEN
(16) 32-bit GPR
LS IQ
LRA LRB
Address
DirLRU Replace
DirLRU Replace
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
8-Banks
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
Data Memory64KB
8-Banks
INT IQ
IRA IRB
ALU
IRA IRB
ALU
WB
Bus/MemoryInterface
Interrupt SIMDIQ
http://www.SandbridgeTech.com
Performance
Peak3 compound operations/cycle>20 RISC-ops/cycle4 MACS/cycle (MAC-SAT-ADD-SAT)
ExampleL0: lvu %vr0,%r3,8|| vmulreds %ac0,%vr0,%vr0,%ac0|| loop %lc0,L0>20 operations packed into a 64-bit compound ISAVLIW machines may require 512+ bits
20 tap FIR3.63 taps/cycle sustained
http://www.SandbridgeTech.com
Agenda
Motivation for SDR
Multithreaded SDR Processor
Software Development ToolsCompilerSimulatorIDE
Communications System Implementation2Mbps WCDMA802.11b
http://www.SandbridgeTech.com
Compiler Productivity
SANDBRIDGE
Compile
DesignAlgorithms
Write DSPSpecific C
Write DSPAssembly
10x every 10 years
Signal Processing Applications Complexity
1E+3
10E+3
100E+3
1985 1995 2005
Line
s of
C C
ode
Map toFixed Point C
Hand ScheduleOperations on DSP
Final Product Final Product
6-9 Months! 6-9 Months!
SandblasterTM Provides Dramatic Improvement
http://www.SandbridgeTech.com
Compiler
Programmed in C or JavaSuper-computer class compiler
VectorizationDSP instruction generation
Standard LibraryPrintf();
POSIX pthreads or Java threads50k+ testcases used for validation
Industry standards: Plum-Hall, perennial, nullstone
AMR Encoder(out-of-the-box C code)
0
100
200
300
400
500
600
700
SB TI C64x TI C62x SC140 ADI BlackFin
DSP's
Mhz
10 MHz
http://www.SandbridgeTech.com
Simulation Software
Compiled Simulator JIT “Flash” compilation Up to 100 MHz on high end x86Multi-threaded supported
Up to 4 orders of magnitude fasterDramatic development time reductionSignificant productivity improvement
Simulation Speed(1GHz Laptop)
0.114 0.106
0.0020.013
24.639
0.001
0.010
0.100
1.000
10.000
100.000
Mill
ions
of I
nstru
ctio
ns P
er S
econ
d
SB 24.639
TI C64x (Code Composer) 0.114
TI C62x(Code Composer) 0.106
SC140(Metrow erks) 0.002
ADI Blackfin (Visual DSP) 0.013
http://www.SandbridgeTech.com
Context sensitive IDE
IDE based on open source netbeansCommon Java/C programming environmentIntegrated debuggingTransparent HW / Simulation environmentWorks in multiple languages
http://www.SandbridgeTech.com
Agenda
Motivation for SDR
Multithreaded SDR Processor
Software Development ToolsCompilerSimulatorIDE
Communications System Implementation2Mbps WCDMA802.11b
http://www.SandbridgeTech.com
Integration
IF
RF
MMI
APPLICATIONTASKS
PROTOCOLSTACK
DATA I/OLCD, KPD …
L1 CONTROL
L1 BASEBAND SW
http://www.SandbridgeTech.com
Real-time Baseband Performance
Physical channel Segmentation
Radio frame segmentation
2nd interleaving
Physical channel mapping
Channel coding
Rate matching
TrBk concatenation /Code block segmentation
CRC attachment
Radio frameequalization
1 st interleaving
TrCHMultiplexing
Spreading/Scrambling
Filter
Rate matching
Physical channel Segmentation
Radio frame segmentation
2nd interleaving
Physical channel mapping
Channel coding
Rate matching
TrBk concatenation /Code block segmentation
CRC attachment
Radio frameequalization
1 st interleaving
TrCHMultiplexing
Spreading/Scrambling
Filter
Rate matchingFILTER
RAKE Searcher
PN BT#1 PN BT#2 PN BT#3
De-Scrambler
Path Table Building
Timing Management
De-SpreadDe-SpreadDe-SpreadChannelEst/Derot
Path 2Path 3Path 4
Path 1DSCH
Path 1S-CCPCH
De-ScrambleDe-ScrambleDe-Scramble
DPCH
De-SpreadDe-SpreadDe-SpreadDe-Spread
Path 2Path 3Path 4
MRCMeasurements:
SIR
RSCP
ISCP
Ec/Io
Multi Channel Code De-Mux2nd Deinterleaver
1nd Deinterleaver Channel DecodingFurther Processing
FILTER
RAKE Searcher
PN BT#1 PN BT#2 PN BT#3
De-Scrambler
Path Table Building
Timing Management
De-SpreadDe-SpreadDe-SpreadChannelEst/Derot
Path 2Path 3Path 4
De-SpreadDe-SpreadDe-SpreadChannelEst/Derot
Path 2Path 3Path 4
Path 1DSCH
Path 1S-CCPCH
De-ScrambleDe-ScrambleDe-Scramble
DPCHS-CCPCH
De-ScrambleDe-ScrambleDe-Scramble
DPCH
De-SpreadDe-SpreadDe-SpreadDe-Spread
Path 2Path 3Path 4
De-SpreadDe-SpreadDe-SpreadDe-Spread
Path 2Path 3Path 4
MRCMeasurements:
SIR
RSCP
ISCP
Ec/Io
Multi Channel Code De-Mux2nd Deinterleaver
1nd Deinterleaver Channel DecodingFurther Processing
0
10
20
30
40
50
60
70
80
90
100
802.11b GPRS WCDMA1/2/5.5/11Mbps Class 10/12 64/384/2k Kbps
% S
B96
00 U
tiliz
atio
n
Real-time chip, bit, and symbol rate processing1 SB9600 chip for 2Mbps Rx concurrently with 768kbps Tx<75% utilization for 384kbps Rx / 384kbps Tx
Includes functions traditionally implemented in H/WTurbo DecoderRake ReceiverTx/Rx Filters
Concurrent performance on 802.11B and GPRS
http://www.SandbridgeTech.com
Summary
Multithreaded baseband processormulti-threadedhigh-performance and low-power
Sophisticated compiler technologyautomatically generates DSP operationsnear-assembly language performance
Reconfigurable Communications ProtocolsWCDMAGSM, GPRS802.11BluetoothGPS