The World Leader in High-Performance Signal Processing Solutions Blackfin Presentation SHARC Users Group March 2003 by Joerg Hauber (FAE)

Section 4 Register Files and Arithmetic Units



Target Applications: Internet Era of Signal Processing

PDAs, Digital Still Cameras, Video Cameras, Digital Printing, Internet Audio, Video Conferencing, Internet Appliances, Mobile Phones, Cable Modems, Telephony, Integrated Access Devices, Modem, PABX, Voice over Network, Recognition, Handwriting Recognition, Text to Speech, Speech to Text, Echo Cancellation, Imaging, Portable Medical, MP3 Audio, DVD playback, MPEG2, Video Conferencing, Surround Sound Audio, 3G Data Terminals, Speakerphones, Car Infotainment, Consumer Audio, Set top Box, Home Networking, Modems, GPS, VoIP Phone Solutions, ADSL Modems, Car Digital Radios, Navigation systems, Automotive security, ADSL Modems, Digital Car Radios, RAS modems, Web Pads, Wireless Modems, GSM Phones

User Benefits

High Performance: BLACKfin DSP offers 600 MMAC performance today, with a roadmap to 2 GMAC.

Low Power Consumption: BLACKfin DSP enables significant power savings by dynamically varying both operating frequency and voltage.

Easy to Use: BLACKfin DSP combines attributes of both high-performance DSPs and microcontrollers in a single RISC device.

Controls Voltage & Frequency

Dynamic Power Management using RTOS or Firmware

Profiling tools audit MIPS requirements by function

Func    MHz   Vdd (V)
F0(x)   225   1.3
F1(y)   300   1.5
Fn(z)   100   0.9

Optional power management IC (ADP3053) regulates voltage

Multiple power-down modes

Functional & peripheral blocks can be clocked only when used

Blackfin DSPs Optimize Power Consumption

Dynamic Power Management

[Figure: power consumption and Vdd plotted against time t. DSP operation moves between the operating points 1.5V/300MHz, 1.0V/100MHz and 1.3V/225MHz, with a regulator transition and PLL settling between operating points. Two curves compare "Just vary the frequency" with "Vary the voltage and frequency".]
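
One rough way to see why varying the voltage as well as the frequency matters is the first-order CMOS model, in which dynamic power scales with Vdd² · f. The sketch below is illustrative only: the model, the function names and the choice of 1.5V/300MHz as the reference point are assumptions, and regulator transition and PLL settling costs are ignored.

```python
# First-order CMOS model (an assumption, not taken from the slides):
# dynamic power is proportional to Vdd^2 * f. Leakage is ignored.

REF_MHZ, REF_VDD = 300.0, 1.5   # reference operating point from the slide


def relative_power(mhz, vdd):
    """Dynamic power relative to the 1.5V / 300MHz operating point."""
    return (mhz / REF_MHZ) * (vdd / REF_VDD) ** 2


def relative_energy_per_task(mhz, vdd):
    """Energy for a fixed number of cycles; run time grows as 1/f."""
    run_time = REF_MHZ / mhz
    return relative_power(mhz, vdd) * run_time


# Compare the two strategies at the operating points shown on the slide.
for mhz, vdd in [(300, 1.5), (225, 1.3), (100, 1.0)]:
    print(mhz, "MHz:",
          "frequency only", round(relative_power(mhz, REF_VDD), 3),
          "| voltage and frequency", round(relative_power(mhz, vdd), 3))
```

Under this model, lowering only the frequency reduces power but leaves the energy per task unchanged, because the task runs longer; lowering the voltage as well reduces both.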

Integrated Blackfin Features Typically Found in a Micro-Controller

Data Movement: LD, ST (8, 16, 32 bits); unsigned, sign-extend; register moves (P-D-DAG); Push, Pop, Push/Pop mult; CC2 dreg, etc.

Addressing Modes: auto incr, auto decr; pre-decr store on SP; indirect; indexed w/ immed offset; post-incr w/ nonunity stride; byte addressable

Program Control: BRCC, UJUMP, Call, Rets, Loop Setup

Arithmetic: +, -, *, /, >>>, Negate; 2 and 3 operand instructs

Logical: AND, OR, XOR, NOT; BIT tst, set, tgl, clr; CC ops; <<, >>

Video: SAA; Byteops: residual calc, spatial interpolation, spatial filter

Cache Control: Prefetch, Flush

A RISC Instruction Set

Supervisor/user modes

Memory management

Wide range of peripherals

Event control

and …

There is not a separate Micro-Controller mode!

Operating Modes

Supervisor mode: System, Code and Event Handlers

User mode: Application Code

Emulator/Debug mode

[Figure: mode transition diagram with User, Supervisor and Emulation states. An interrupt or exception moves User to Supervisor; RTI, RTX return from Supervisor to User; an emulation event enters Emulation from either User or Supervisor; RTE leaves Emulation.]

Micro Signal Architecture Core

Blackfin DSP Core based on the Micro Signal Architecture, Jointly Developed With Intel Corporation

[Figure: core block diagram. Data Arithmetic Unit: registers R0 to R7, two 16-bit multipliers, four 8-bit video ALUs, barrel shifter, 40-bit accumulators Acc0 and Acc1. Address Arithmetic Unit: DAG0 and DAG1, registers I0 to I3, L0 to L3, B0 to B3, M0 to M3, and P0 to P5, FP, SP. Sequencer.]

• Two 16-bit Multipliers
• Two 32/40-bit ALUs
• Four 8-bit Video ALUs
• Barrel Shifter
• Sixteen 16-bit Math Registers / Eight 32-bit Math Registers

• Two DAGs, byte addressing
• Eight 32-bit pointer registers
• Four Sets of 32-bit Index, Modify, Length, Base

• 16-bit Instructions
• 32-bit Instructions
• Multi-Issue, 64-bit Instructions

Register View of Math

Dual ALU / MAC functions are “Vector” functions

Two pairs of operands are available

[Figure: register file with 32-bit paths to MAC 0 / ALU 0 and MAC 1 / ALU 1. Registers R2 and R3 are drawn split into 16-bit halves (bits 31..16 and 15..0); R6, R7 and the accumulators A0, A1 also appear.]

Arithmetic Logic Unit (ALU)

Two ALUs operate on 16-bit, 32-bit, and 40-bit input operands and output 16-bit, 32-bit, and 40-bit results.

Functions

Fixed-point addition and subtraction

Addition and subtraction of immediate values

Accumulation and subtraction of multiplier results

Logical AND, OR, NOT, XOR, bitwise XOR, Negate

Functions: ABS, MAX, MIN, Round, division primitives

Features

Supports conditional instructions

8-bit video ALU operations

Multiply-Accumulators (MAC)

Two identical MACs

Each can perform fixed-point multiplication and multiply-and-accumulate operations on 16-bit fixed-point input data, and outputs 32-bit or 40-bit results depending on the destination.

Functions

Multiplication

Multiply-and-accumulate with addition (optional rounding)

Multiply-and-accumulate with subtraction (optional rounding)

Dual versions of the above

Features

Saturation of accumulator results

Optional rounding of multiplier results
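
The dual multiply-and-accumulate with accumulator saturation can be pictured with a small integer model. This is a minimal sketch under stated assumptions: the names `dual_mac`, `halves` and `saturate40` are hypothetical, the pairing of register halves to MACs (low halves to A0, high halves to A1) is chosen only for illustration, and rounding and fractional modes are left out.

```python
# Sketch of one dual-MAC step on packed 16-bit operands.
# Which register halves feed which MAC is chosen here for illustration only.

ACC_MAX = (1 << 39) - 1      # largest signed 40-bit value
ACC_MIN = -(1 << 39)         # smallest signed 40-bit value


def signed16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def halves(reg32):
    """Split a 32-bit register image into signed (high, low) halves."""
    return signed16(reg32 >> 16), signed16(reg32)


def saturate40(value):
    """Clamp to the signed 40-bit accumulator range."""
    return max(ACC_MIN, min(ACC_MAX, value))


def dual_mac(a0, a1, reg_x, reg_y, subtract=False):
    """Return updated (a0, a1): a0 uses the low halves, a1 the high halves."""
    x_hi, x_lo = halves(reg_x)
    y_hi, y_lo = halves(reg_y)
    sign = -1 if subtract else 1
    return (saturate40(a0 + sign * x_lo * y_lo),
            saturate40(a1 + sign * x_hi * y_hi))
```

Calling `dual_mac` once per pair of packed samples accumulates two running sums in parallel, which is the sense in which the dual MAC functions behave as vector functions.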

Barrel-Shifter (Shifter)

The shifter performs bitwise shifting for 16-bit, 32-bit, or 40-bit inputs and yields 16-bit, 32-bit, or 40-bit outputs.

Functions

Arithmetic Shift: The Arithmetic Shift instruction shifts the number held in a register by a specified distance and direction while preserving the sign of the original number. On an arithmetic right shift, the sign bit back-fills the vacated left-most bit positions.

Logical Shift: The Logical Shift instruction shifts the number held in a register by a specified distance and direction. Logical shifts discard any bits shifted out of the register and back-fill vacated bits with zeros.

Rotate: The Rotate instruction rotates the number held in a register through the CC bit by a specified distance and direction.

Bit Operations

Field Extract and Deposit
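
The difference between the three shift types is easiest to see on a concrete 32-bit value. The following is a sketch, not the instruction definitions: the function names are hypothetical, only 32-bit operands are modelled, and the rotate treats CC as a 33rd bit sitting above bit 31.

```python
MASK32 = 0xFFFFFFFF


def logical_shift_right(value, distance):
    """Vacated high bits are filled with zeros."""
    return (value & MASK32) >> distance


def arithmetic_shift_right(value, distance):
    """Vacated high bits are filled with copies of the sign bit."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32                  # reinterpret as a negative integer
    return (value >> distance) & MASK32   # Python's >> sign-extends negatives


def rotate_left_through_cc(value, cc, distance):
    """Rotate the 33-bit quantity (CC, value) left; returns (value, cc)."""
    wide = ((cc & 1) << 32) | (value & MASK32)
    distance %= 33
    wide = ((wide << distance) | (wide >> (33 - distance))) & ((1 << 33) - 1)
    return wide & MASK32, wide >> 32
```

For the input 0x80000000 shifted right by 4, the logical shift gives 0x08000000 while the arithmetic shift gives 0xF8000000; rotating the same value left by one place moves bit 31 into CC.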

Data Types

8-bit bytes: signed or unsigned integers

16-bit half-words (little endian): signed or unsigned integers; signed fractional (1.15)

32-bit words (little endian): signed or unsigned integers; signed fractional (1.31)
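
The signed fractional 1.15 format stores a value in the range [-1, 1) as a 16-bit integer scaled by 2^15. A minimal sketch of conversion and multiplication follows; the helper names are hypothetical and the rounding shown (add half an LSB, then shift) is one common choice rather than a statement of what the hardware does.

```python
Q15_MIN, Q15_MAX = -32768, 32767


def float_to_q15(x):
    """Quantise to signed 1.15, saturating at the representable range."""
    scaled = int(round(x * 32768))
    return max(Q15_MIN, min(Q15_MAX, scaled))


def q15_to_float(q):
    """Convert a signed 1.15 integer back to a float."""
    return q / 32768.0


def q15_multiply(a, b):
    """1.15 * 1.15 -> 1.15 with rounding; only -1 * -1 needs saturation."""
    product = (a * b + (1 << 14)) >> 15
    return max(Q15_MIN, min(Q15_MAX, product))
```

Note that +1.0 is not representable: `float_to_q15(1.0)` saturates to 32767, and the product of -1 and -1 saturates in the same way.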

8-bit Video ALU

Video pixel operations

ALIGN8, ALIGN16, ALIGN24 (align data in src_reg)

BYTEPACK (Quad 8-bit Pack)

BYTEOP16P (Quad 8-bit Add)

BYTEOP16M (Quad 8-bit Subtract)

BYTEOP1P (Quad 8-bit Average Byte, ADD/DIV2)

BYTEOP2P (Quad 8-bit Average Half-Word, ADD/DIV4)

BYTEOP3P (Dual 16-bit Add word and byte / Clip to byte)

BYTEUNPACK (Quad 8-bit Unpack)

BYTEOP16P (Quad 8-bit Add)

Adds eight unsigned bytes in pairs, producing four 16-bit words

General Form

(dest_reg_1, dest_reg_0) = BYTEOP16P(src_reg_0, src_reg_1)

Source data chosen by I0 from register pairs R3:2 and R1:0

Example

(r1, r2) = BYTEOP16P(r3:2, r1:0);

src_reg_0     y3      y2      y1      y0
src_reg_1     z3      z2      z1      z0
dest_reg_0                    y1+z1   y0+z0
dest_reg_1                    y3+z3   y2+z2
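
The byte-to-half-word widening of the quad 8-bit add can be modelled in a few lines. This sketch assumes the two source words are already aligned, so the byte selection by I0 is not modelled, and the function names are hypothetical.

```python
def unpack_bytes(word32):
    """Return [b0, b1, b2, b3], least significant byte first."""
    return [(word32 >> (8 * i)) & 0xFF for i in range(4)]


def byteop16p(src0, src1):
    """Add four unsigned byte pairs; return (dest_reg_1, dest_reg_0)."""
    y = unpack_bytes(src0)
    z = unpack_bytes(src1)
    sums = [a + b for a, b in zip(y, z)]   # each sum fits in 9 bits
    dest0 = (sums[1] << 16) | sums[0]      # y1+z1 : y0+z0
    dest1 = (sums[3] << 16) | sums[2]      # y3+z3 : y2+z2
    return dest1, dest0
```

Because each result is held in a 16-bit field, the sum of two bytes (at most 0x1FE) cannot overflow.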

Quad-Byte Averaging (1)

BYTEOP1P (Quad 8-bit Average – Byte)

Averages four unsigned byte pairs to produce four 8-bit results

General Form

dest_reg = BYTEOP1P(src_reg_0, src_reg_1)

Source data chosen by I0 from register pairs R3:2 and R1:0

Example

r3 = BYTEOP1P(r1:0, r3:2);

src_reg_0   y3           y2           y1           y0
src_reg_1   z3           z2           z1           z0
dest_reg    avg(y3,z3)   avg(y2,z2)   avg(y1,z1)   avg(y0,z0)

Quad-Byte Averaging (2)

BYTEOP2P (Quad 8-bit Average – Half-Word)

Averages two unsigned byte quadruples to produce two 8-bit results

General Form

dest_reg = BYTEOP2P(src_reg_0, src_reg_1)

Source data chosen by I0 from register pairs R3:2 and R1:0

Example

r3 = BYTEOP2P(r1:0, r3:2);

src_reg_0   y3     y2                 y1     y0
src_reg_1   z3     z2                 z1     z0
dest_reg    0..0   avg(y3,y2,z3,z2)   0..0   avg(y1,y0,z1,z0)

4-Neighborhood Average

The value of the center pixel is defined like this:
x = (xN+xS+xE+xW)/4

A better description is:
x = average(xN, xS, xE, xW)

BYTEOP2P can perform this kind of average on two pixels in 1 cycle

       xN
xW     x      xE
       xS
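
As a plain reference for what the 4-neighborhood average computes, the sketch below applies it to every interior pixel of a small image. It is not a model of BYTEOP2P itself: the function name is hypothetical, border pixels are simply copied, and rounding to nearest is an assumption.

```python
def neighbourhood_average(image):
    """Replace each interior pixel by the mean of its N, S, E, W neighbours."""
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]           # borders are left unchanged
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            total = (image[r - 1][c] + image[r + 1][c]
                     + image[r][c + 1] + image[r][c - 1])
            result[r][c] = (total + 2) >> 2      # divide by 4, round to nearest
    return result


sample = [[0, 10, 0],
          [20, 99, 40],
          [0, 30, 0]]
smoothed = neighbourhood_average(sample)
```

In the sample, the center pixel becomes (10 + 30 + 40 + 20) / 4 = 25; the original center value does not enter the average.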

Quad-Byte-Sum Absolute Difference (1)

SAA (Quad 8-bit Subtract-Absolute-Accumulate)

Subtracts four pairs of bytes, takes the absolute value of each difference, and accumulates each result into a 16-bit accumulator half

$$\mathrm{SAD} = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| a(i,j) - b(i,j) \right|$$

N is typically 8 or 16 (corresponding to blocks of 8x8 and 16x16 pixels, respectively)

Useful for block-based video motion estimation
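
The sum of absolute differences over an N x N block, and its use in block matching, can be written as a short reference model. The function names are hypothetical and the code makes no attempt to mirror the four-bytes-per-instruction grouping of SAA.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def best_match(reference, candidates):
    """Index of the candidate block with the smallest SAD."""
    scores = [sad(reference, candidate) for candidate in candidates]
    return scores.index(min(scores))
```

In motion estimation the candidates would be blocks taken from a search window in the previous frame; the displacement of the best match is the motion vector.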

Configurable Memory System

Supports a Cache Memory Model and an SRAM Memory Model

Sustained Dual Data Accesses for DSP Applications

Supports accesses of 8, 16, 32 bit Data

Separate Multi-ported L1 Instruction and Data Memories

[Figure: block diagram with Processor Core, L1 Instruction SRAM & Cache, L1 Data SRAM & Cache, Scratchpad SRAM, DMA, and L2 Instruction & Data SRAM.]

21535 Memory Levels

Internal L1 memory

Closest to the Processor

Can be configured as cache or SRAM

Smallest Memory Capacity (16KB Instruction, 36KB Data)

Single Cycle Access

Internal L2 memory

Further from the Processor

Larger Memory Capacity (256KB Total for Instruction/Data)

Multiple Cycle Access

External L2 memory

Off Chip

Largest Memory Capacity (Synchronous and Asynchronous)

Slowest access time

Direct Memory Access

The ADSP-21535 DMA controller allows data transfer operations without processor core intervention

Types of data transfers

Memory ↔ Memory (MemDMA)

Memory ↔ Serial Peripheral Interface (SPI)

Memory ↔ Serial Port

Memory ↔ UART Port

Memory ↔ USB Device

Instruction Set Optimized for Rich Media Applications

Video Applications

Up to four 8-bit math operations in a single cycle

DCT / iDCT Support (less than 300 cycles for an 8*8 DCT): Dual MAC with IEEE 1180 Rounding

Motion Estimation: Quad-Byte Operations (e.g. Sum Abs. Differences)

Huffman Coding: Sophisticated Field Deposit / Extract Capability

2G and 3G Communications Protocol Standards

Voice Codecs: On-The-Fly Saturation Arithmetic

Channel Codecs: Instruction Set Support for Complex Math, Bit Interleaving, Population Count, Viterbi Dual Add-Compare-Select, and CRC

Performance

DSP code is often a C-language program with interspersed assembly code kernels. The measure of performance is threefold:

The Compiled Program Code Size - measure of Cost and Power Consumption

The Compiled Code Performance - measure of Power Consumption and Software Engineer Work remaining (TTM)

The Assembly Level Kernel Performance - measure of Power Consumption

Optimized DSP Software Libraries Currently In Development

Image Processing Libraries

Generic Pixel Interpolation Algorithms

Auto Focus, Auto Exposure Control

Auto White Balance

Color Space Conversion: RGB / YCrCb transformations; 4.4.4 to 4.2.2, 4.2.2 to 4.2.0, 4.4.4 to 4.2.0

Image Processing Application Development

Bilinear Interpolation Image Processing

Linear Laplace Interpolation Image Processing

High Quality Image Processing

Video Image Processing

Image/Video/Audio Processing CODECs

Still - JPEG, JPEG2000

Audio - MP3, AAC, MPEG1 layer 2 audio

Video - MJPEG, MPEG2, MPEG-4

www.analog.com -> Digital Signal Processing -> Blackfin -> Code Examples


ts - OS 3rd Party Partners

Nucleus and Nucleus uITRON from Accelerated Technology ( http://www.acceleratedtechnology.com )

Embedded Linux from Embedix ( http://www.embedix.com )

CMX from CMX (http://www.cmx.com )

Real Time Architect from LiveDevices ( http://www.livedevices.com )

ThreadX from Express Logic ( http://www.expresslogic.com )

DSP OS from DSP OS ( http://www.dspos.com )

VspWorks from WindRiver ( http://www.windriver.com )

uC Linux from Lineo ( www.lineo.com )

LiveDevices Limited ( www.livedevices.com )

KwikNet TCP/IP Stack Kadak (www.kadak.com)


Video Technology Third Parties

Algorithmix www.algorithmix.com

Algo Vision Systems www.algovision.de

Emuzed www.emuzed.com

Epigon Audiocare Pvt Ltd www.epigonaudio.com

Fastcom-Technology www.fastcom.ch

SignalWorks www.signalworks.com

Sunfield Group www.sunfieldgroup.com

ADSP-21535 Targets Video-Enabled Internet Appliances

300 MHz 16-bit Fixed-Point Core

308 Kbytes On-Chip SRAM

Interfaces To External FLASH And SDRAM

Dynamic Power Management Varies Frequency And Voltage

2.4 Gbyte Per Second I/O Band-Width

Part Number        ADSP-21535
Performance        600 MMACs, 306 Dhrystone MIPs
Memory Subsystem   260 Kbytes On-Chip SRAM; 48 Kbytes Instruction / Data Cache; 768 Mbytes Address Range
Peripherals        PCI, USB Device, 2 SPORTS, 2 UARTS, 2 SPI, 3 32-bit Timers
Voltage            0.9V to 1.5V
Power              640mW, 300MHz @ 1.5V; 100mW, 100MHz @ 0.9V
Package            260 PBGA

ADSP-21532 Targets Cost-Sensitive Consumer Applications

300 MHz 16-bit Fixed-Point Core

84 Kbytes On-Chip SRAM

Interfaces To External FLASH And SDRAM

Dynamic Power Management Varies Frequency And Voltage

2.4 Gbyte Per Second Memory Band-Width

Part Number        ADSP-21532
Performance        300MHz, 600 MMACs
Memory Subsystem   32 Kbytes On-Chip ROM; 84 Kbytes On-Chip RAM; 132 Mbytes Address Range
Peripherals        2 SPORTS, UART, SPI, 3 Timers, Parallel Peripheral Interface/GPIO
Voltage            2.25V to 3.6V
Package            160 Mini-BGA

ADSP-21535 Availability

                              Availability Date
Development Tools
- Visual DSP++ IDE            Today
- EZ-ICE and EZ-KIT           Today
Documentation
- Data Sheet                  Today
- HW/SW Reference Guides      Today
Silicon
- Samples                     Today
- Production                  Q1 2003
- Pricing
  - ADSP-21535PKB-300         $25 @ 10K
  - ADSP-21535PKB-200         $22 @ 10K

Please visit http://www.analog.com/blackfin-dsp for additional information!!!

ADSP-21532 Release Plan

                              Availability Date
Tools
VDSP Upgrade                  Now
EZ-KIT, ICE                   CY 1Q 03
Documentation
Data Sheet                    Now
Hardware Reference            Now
Silicon
Samples                       Now
Production                    CY 4Q 03
Pricing
- ADSP-21532SBBC-300          $9.95 @ 10K

Ordering Guide and Specifications

Part Number          SRAM         ROM         Temp (Case)    Speed     Packages
ADSP-21535PKB-200    308K Bytes   Boot        -40C to 105C   200 MHz   260-PBGA
ADSP-21535PKB-300    308K Bytes   Boot        0C to 85C      300 MHz   260-PBGA
ADSP-21532SBBC-300   84K Bytes    32K Bytes   -40C to 105C   300 MHz   160 miniBGA


ADSP-21532 Differences

ADSP-21532 Consumer & Multimedia Enhancements

[Figure: ADSP-21532 block diagram. Blocks shown: Processor Core 300MHz; 48KB Instruction SRAM/Cache; 32KB Data SRAM/Cache; 32KB Instruction ROM; 4KB Scratchpad RAM; Memory DMA; System Interface Unit. System Control Blocks: Emulator & Test Control, Voltage Regulation, Event Controller, Clock (PLL). High Speed I/O: External Memory Interface. Peripheral Blocks: UART, SPORT0, SPORT1, Timers 0/1/2, Parallel Peripheral Interface / GPIO, Watchdog Timer, SPI, Real Time Clock.]

Highly integrated peripheral set reduces BOM costs

PPI supports CCIR-656 video converter interface

Enhanced serial ports support up to 8 stereo I2S channels

2-D DMA supports data transfer with programmable count & stride values

Parallel Peripheral Interface

Bidirectional, half-duplex interface

Supports CCIR-656 Video Converter Interface

PPI provides general fast ADC / DAC interface at up to 65MSPS

[Figure: the PPI exchanges up to 16-bit parallel data with appliances, with CLK and SYNC signals and an external clock.]


PPI - Features

Can optionally ignore Field 2 (don’t DMA)

Works hand-in-hand with ADSP-21532 2D DMA Engine

Can skip even or odd data elements

Supports 16-bit data packing mode

Supports 32-bit DMA mode (2 bursts of 16-bit DMA)

4 control signal polarity choices (H,V,CLK)

PPI I/O Modes

[Figure: two connection diagrams. CCIR-656 mode: a ‘656-compatible video source drives PPIx with 8- or 10-bit data with embedded control, and CLK drives PPI_CLK. GP mode: a video source drives PPIx with 8-16 bits of data, CLK drives PPI_CLK, and HSYNC, VSYNC and FIELD connect to PPI_FS1, PPI_FS2 and PPI_FS3.]

Possible Data Transfer Scenarios

[Figure: transfer paths linking video data and control, PPI, DMA, L1 memory, SDRAM, SPORT, compressed video and an external processor.]

2-D Direct Memory Access

Linear Data Capture & Storage: A B C D E F G H I J K L M N O P ....

2-D DMA to L1 Memory: A, B, I, J

Programmable X & Y Count & Stride Values

2-D DMA significantly accelerates video processing
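
The effect of programmable X and Y count and stride values can be sketched as an address generator. This is a simplified model under stated assumptions: the function and parameter names are hypothetical, `y_stride` is taken to be the distance between the starts of consecutive rows, and the linear capture A to P is assumed to be two rows of eight elements.

```python
def dma_2d_addresses(start, x_count, x_stride, y_count, y_stride):
    """Yield source offsets: y_count rows of x_count elements each."""
    for row in range(y_count):
        row_start = start + row * y_stride
        for col in range(x_count):
            yield row_start + col * x_stride


def dma_2d_gather(memory, start, x_count, x_stride, y_count, y_stride):
    """Collect the elements a 2-D transfer would move, in transfer order."""
    return [memory[offset] for offset in
            dma_2d_addresses(start, x_count, x_stride, y_count, y_stride)]


frame = list("ABCDEFGHIJKLMNOP")   # linear capture, assumed 2 rows of 8
block = dma_2d_gather(frame, 0, x_count=2, x_stride=1, y_count=2, y_stride=8)
```

With these settings `block` holds A, B, I, J, the same sub-block the slide shows arriving in L1 memory, without the core touching the intervening elements.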

Serial Ports

[Figure: serial port signals. Transmit side: Primary TX, Secondary TX, Tx Clock, Tx Sync. Receive side: Primary RX, Secondary RX, Rx Clock, Rx Sync.]

Two Dual-Channel Synchronous Ports supporting 8 Stereo I2S Channels

Supports 3- to 32-bit data widths

100MHz operation from external clock

SCLK/2 operation from internal clock (up to 66MHz)

ADSP-21532 in Prosumer Audio

32-Bit Math

31 x 31 Audio Multiply in 2 cycle loop: 150MHz (effective)

32 x 32 Multiply in 3 cycle loop: 100MHz (effective)

Serial Ports

Eight Stereo I2S Channels

Programmable L/R channel

Up to 100MHz operation

Application Areas

Digital Mixers, Home Theatre, Car Audio
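
The effective rates follow from dividing the 300MHz core clock by the loop length, and a 32-bit multiply on a 16-bit multiplier core is built from 16 x 16 partial products. The sketch below shows both ideas; it is a generic decomposition with hypothetical function names, not the instruction sequence the slide refers to.

```python
CORE_CLOCK_MHZ = 300


def effective_rate_mhz(cycles_per_multiply, core_clock_mhz=CORE_CLOCK_MHZ):
    """Multiplies per microsecond when one multiply takes the given cycles."""
    return core_clock_mhz / cycles_per_multiply


def split(value32):
    """Signed 32-bit integer -> (signed high half, unsigned low half)."""
    low = value32 & 0xFFFF
    high = (value32 - low) >> 16
    return high, low


def multiply_32x32(a, b):
    """Exact 64-bit product assembled from four 16 x 16 partial products."""
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    return (((a_hi * b_hi) << 32)
            + ((a_hi * b_lo + a_lo * b_hi) << 16)
            + (a_lo * b_lo))
```

`effective_rate_mhz(2)` and `effective_rate_mhz(3)` give the 150MHz and 100MHz figures quoted for the two loops.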

Memory & Interface

[Figure: memory map. Internal: 32K Bytes Instruction ROM, 32K Bytes Instruction SRAM, 16K Bytes Instruction SRAM/Cache, 32K Bytes Data SRAM/Cache, 4K Bytes Scratchpad SRAM. Through the 16-bit External Memory Interface: four 1M Byte asynchronous banks and one 16M Byte – 128M Byte synchronous bank.]

Power Management--Variable Voltage

On-chip Voltage Regulation

Generates core voltage from external 2.25V to 3.6V input

Core voltage programmable in 50mV increments

Optional bypass

Minimal external components required

[Figure: regulator circuit. The DSP internal circuit uses VREF and drives VDDCTRL; the external components, supplied from VDDEXT (2.25V -> 3.6V), are a 10µH inductor, a zener diode (Uz=4V), a 10µF tantalum or electrolytic capacitor and a 0.1µF ceramic capacitor, and produce VDDINT.]

Dynamic Power Management - Variable Frequency

[Figure: CLKIN feeds a PLL (1x - 31x). The PLL output is divided by 1, 2, 4 or 8 to give CCLK and by 1 to 15 to give SCLK. The dividers allow dynamic modification on the fly; dynamic modification of the PLL requires PLL sequencing.]

SCLK <= CCLK

SCLK <= 133MHz
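
The clock relationships in the diagram can be checked with a small calculator. This is a sketch under stated assumptions: the function name is hypothetical, both dividers are taken to operate on the PLL output, and the 30MHz CLKIN used in the example is an arbitrary illustrative value.

```python
CCLK_DIVIDERS = (1, 2, 4, 8)
SCLK_DIVIDERS = range(1, 16)
SCLK_LIMIT_HZ = 133e6


def clock_tree(clkin_hz, multiplier, cclk_divider, sclk_divider):
    """Return (cclk_hz, sclk_hz) or raise ValueError for an invalid setting."""
    if not 1 <= multiplier <= 31:
        raise ValueError("PLL multiplier must be 1x to 31x")
    if cclk_divider not in CCLK_DIVIDERS:
        raise ValueError("CCLK divider must be 1, 2, 4 or 8")
    if sclk_divider not in SCLK_DIVIDERS:
        raise ValueError("SCLK divider must be 1 to 15")
    pll_hz = clkin_hz * multiplier
    cclk_hz = pll_hz / cclk_divider
    sclk_hz = pll_hz / sclk_divider
    if sclk_hz > cclk_hz:
        raise ValueError("SCLK must not exceed CCLK")
    if sclk_hz > SCLK_LIMIT_HZ:
        raise ValueError("SCLK must not exceed 133MHz")
    return cclk_hz, sclk_hz


# Example: an assumed 30MHz CLKIN multiplied by 10 gives a 300MHz core clock.
cclk, sclk = clock_tree(30e6, multiplier=10, cclk_divider=1, sclk_divider=3)
```

Choosing an SCLK divider of 2 in the same example would give 150MHz and is rejected, because it breaks the 133MHz limit.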


New Blackfin DSP

NDA information

BLACKfin DSP NDA Roadmap

[Figure: roadmap chart, Performance/Price against year (< 2001, 2002, 2003, 2004). Products shown: 21535 300MHz; 21532 300MHz/$10; 21533 600MHz; Blackfin DSP 500MHz/Nx; Blackfin DSP Teton (dual Core); further Blackfin DSP entries without part numbers.]


BLACKfin DSP New ADSP-21533 Brief Summary

ADSP-21533 Blackfin DSP will be sampling in 1Q03 at the highest clock rate of any commercially available GP DSP Product.

Blackfin DSPs will demonstrate 3x speed of the available competitive architectures in its class.

ADSP-21533 Blackfin DSP’s high speed and Dynamic Power Management are ideal for high performance (video) portable applications.

Blackfin DSPs are the platform on which many new ADI system level products are being built, including Vehicle Telematics and Wireless Communications.

BLACKfin DSP ADSP-21533 : Architecture Overview

[Figure: ADSP-21533 block diagram. Blocks shown: Processor Core 600MHz; 64KB Instruction SRAM; 16KB Instruction Cache; 48KB Instruction SRAM/Cache; 64KB Data SRAM/Cache; 4KB Scratchpad RAM; Memory DMA; System Interface Unit. System Control Blocks: Emulator & Test Control, Voltage Regulation, Event Controller, Clock (PLL). External Memory Interface: FLASH, SRAM, SDRAM. Peripheral Blocks: UART, SPORT0, SPORT1, Timers 0/1/2, ITU-R 656 Video Interface (PPI), Watchdog Timer, SPI, Real Time Clock.]

BLACKfin DSP ADSP-21532 and ADSP-21533 Memory Maps

[Figure: side-by-side memory maps for ADSP-21532, ADSP-21533 and ADSP-21535, divided into L1 Memory, L2 Memory and External L2 Memory. L1 blocks shown include 32KB Inst ROM, 32KB Inst SRAM, 16KB Inst Cache, 32KB Data SRAM, 32KB Data Cache and 4KB Scratch SRAM; the L2 block is 256K Bytes Inst / Data SRAM. The External Memory Interfaces are labelled 16-bit 133MHz (twice) and 32-bit 133MHz.]

BLACKfin DSP ADSP-21533 and ADSP-21532

ADSP-21533 is a larger-memory, higher-performance, pin-to-pin compatible part to the ADSP-21532

                 ADSP-21532            ADSP-21533             ADSP-21533
On Chip ROM      32KBytes              -                      -
On Chip RAM      84KBytes              148KBytes              148KBytes
Address Range    132MBytes             132MBytes              132MBytes
Performance      300 MHz, 600 MMACs    500 MHz, 1000 MMACs    600 MHz, 1200 MMACs
Price at 10Ku    $9.95                 $17                    $23

BLACKfin DSP ADSP-21533 Power Consumption

New Blackfin targets derived from ADSP-21535 factual data

ADSP-21535 consumes 640mW at 300MHz at 1.5V

Scaling to 1.0V and applying 30% further reduction expected from geometry (0.18u to 0.13u), reduced gate capacitance and custom implementation methodology

ADSP-21533 Power Targets

Less than 350mW at 500MHz at 1.0V

Less than 100mW at 200MHz at 0.7V
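
The targets can be cross-checked with a back-of-envelope calculation. The sketch assumes that power scales linearly with frequency and with the square of the core voltage, and applies the 30% reduction as a factor of 0.7; the scaling law and the function name are assumptions, not something the slide states.

```python
def scaled_power_mw(base_mw, base_mhz, base_v, target_mhz, target_v,
                    process_factor):
    """Scale a measured power figure by f, by V^2 and by a process factor."""
    return (base_mw
            * (target_mhz / base_mhz)
            * (target_v / base_v) ** 2
            * process_factor)


# ADSP-21535 baseline from the slide: 640mW at 300MHz and 1.5V.
estimate_500mhz = scaled_power_mw(640, 300, 1.5, 500, 1.0, 0.7)
estimate_200mhz = scaled_power_mw(640, 300, 1.5, 200, 0.7, 0.7)
```

Under these assumptions the estimates come out at roughly 332mW and 65mW, both inside the stated targets of less than 350mW and less than 100mW.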

BLACKfin DSP for Portable Applications

Dynamic Power Management – Benefits

Blackfin DSPs offer the highest DSP performance for applications that require it

MP3PRO Encoding

CIF Video Conferencing

AND the same device delivers the lowest power consumption when the demands reduce

MP3 Decoding

Speech Recognition

Text to Speech

[Figure: "Speed and Power" chart, power in mW (0 to 400) against clock speed in MHz (0 to 600).]

BLACKfin DSP - ADSP-21533 Example : Video Display System

Single chip Audio and Video Decoder for Entertainment System

Video transport over local bus, connected through SPORT0 and SPI control

SPORT1 connects to stereo speakers and microphones

MPEG2 Video Decoding

MPEG2 Audio Decoding

Speech Recognition command and control with noise canceling array microphone input

Reduced existing BOM by >50%

[Figure: ADSP-21533 with FLASH and SDRAM, a CIF Display on the PPI, and SPORT0, SPORT1 and SPI connections.]

BLACKfin DSP - ADSP-21532 Example : Low Cost Video Surveillance

Single chip Video Encoder for surveillance systems

Video capture – example Omnivision integrated lens / sensor

SPORT1 connects microphone

MPEG4 CIF Video Encoding

MPEG4 Audio Encoding

Optional Video transport over Ethernet or SPORT

Reduced existing BOM by >60%

[Figure: ADSP-21532 with FLASH and SDRAM, a Capture Sensor on the PPI, SPORT0 and SPORT1 connections, and a LAN link.]

BLACKfin DSP - ADSP-21533 Example : Low Cost Video Display

Single chip Video Decoder for Portable Display systems

Video Display – through PPI

SPORT1 connects Speakers

MPEG4 CIF Video Decoding

MPEG4 Audio Decoding

Optional Content transport over WLAN

Reduced existing BOM by >50%

[Figure: ADSP-21533 with FLASH and SDRAM, a CIF Display on the PPI, a SPORT1 connection, and an 802.11 link.]

BLACKfin DSP ADSP21532 & ADSP21533 Release Plan

Silicon
ADSP-21532 Samples           1Q03
ADSP-21533 Samples           1Q03
Production                   CY Q403
Pricing:
- ADSP-21532SBCA-300         $9.95 @ 10K
- ADSP-21533SBCA-300
- ADSP-21533SKCA-500         $17 @ 10K

Tools
VDSP Upgrade                 Now
EZ-KIT, ICE                  CY 1Q ’03

Sales Collateral
Preliminary Data Sheet       Now, on Web
Hardware Reference Manual    CY 4Q ‘02

Blackfin DSP ADSP-21533 - MPEG4 Processing Capability

Hardware Facilitators

Parallel Peripheral Interface

2-D DMA

CCIR-656 SOL/SOF support

Application Areas

Video-Enabled Information Appliance, Security Systems, Videophone

MPEG4 Simple Profile, 384kbps

                        15 Frames/s         30 Frames/s         60 Frames/s
CIF Frame (352x288)     Encode AND Decode   Encode AND Decode   ---------
QCIF Frame (176x144)    Encode AND Decode   Encode AND Decode   Encode AND Decode

70% core loading

Blackfin DSP ADSP-21533 - DVD Decoding : MPEG2

MPEG2 Main Profile/Main Level

                        300MHz       500MHz       1000 MHz
VGA Frame (640x480)                  95% Loaded   50% Loaded
CIF Frame (352x288)     52% Loaded   31% Loaded
QCIF Frame (176x144)    13% Loaded


Blackfin DSP Image Decode Performance

JPEG2000 140 cycles/pixel 4:2:2 decode 2Mpel requires 280 MHz

JPEG 39 cycles/pixel 4:2:0 Encode 2Mpel requires 80 MHz
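
The MHz figures are the cycles-per-pixel cost multiplied by the pixel throughput. The sketch below assumes one 2Mpel image per second, which is what makes the JPEG2000 numbers agree; the function name and the `images_per_second` parameter are hypothetical.

```python
def required_mhz(cycles_per_pixel, pixels, images_per_second=1.0):
    """Core clock in MHz needed to sustain the given pixel throughput."""
    return cycles_per_pixel * pixels * images_per_second / 1e6


jpeg2000_decode_mhz = required_mhz(140, 2e6)   # 4:2:2 decode
jpeg_encode_mhz = required_mhz(39, 2e6)        # 4:2:0 encode
```

The JPEG2000 case gives 280MHz exactly; the JPEG case gives 78MHz, which the slide rounds to 80MHz.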

BLACKfin DSP Roadmap

[Figure: roadmap chart, Performance in MHz (300, 500, 1000, 1400) against year (2001, 2002, 2003, 2004), with the tracks Portable Image / Video and Internet / Network. Products shown: ADSP-21535 (PCI / USB); ADSP-21532 (Reg / CCIR656); ADSP-21533 (Reg / CCIR656); Blackfin DSP (Video), shown more than once; Blackfin DSP (USB / Ethernet); Blackfin DSP (PCI / Ethernet).]

BLACKfin DSP Long Term Roadmap

Technology Advances

Further the availability of interfaces on products – USB 2.0, Ethernet, Bluetooth

Further availability of memory derivatives and higher speeds

Provide commercial, industrial and extended temp range products

Add software partners that help Blackfin to absorb simple microprocessor tasks

Digital Imaging Products

ADSP-21535 - Sampling NOW

First implementation of MSA (Blackfin) core

300MHz, 600MMACs performance

CIF MPEG2 Encoder or Decoder : MPEG4 simple profile

ADSP-21532, ADSP-21533 - Sampling 1Q03

300MHz and 500MHz performance

Low Power – designed for portable consumer applications

Video peripherals: Video Interface supporting input/output of CCIR656 signals etc.

CIF MPEG2 Encoder and Decoder : MPEG 4 Advanced Simple Profile Encode and Decode, JPEG

Dual-core - Under Design : Sampling 2H03

Dual 500MHz processor cores, dual Video Interfaces

Storage (ATA Interface), LCD controller, (Ethernet MAC), USB, Additional Serial Comms

VGA MPEG2 Encoder & Decoder : MPEG4, JPEG

Next Generation Digital Video/Imaging Products (To be defined)

1 GHz roadmap, multicore roadmap

Next Generation Video Processor (Teton lite)

Targets High-Performance Video Applications – Security/Surveillance, Broadband Home Gateways

Dual 500 MHz Cores, 1000MHz/2000 MMACs

200KBytes L1 Memory, 128KBytes L2 Memory

TSMC 0.13um, 5LM

[Figure: block diagram. Two 500MHz Blackfin Processor Cores, each with 32KByte, 64KByte and 4KByte memory blocks, plus a 128KByte L2. Interfaces: SDRAM, FLASH/SRAM. System Peripherals: RTC, Watchdog (2), JTAG. User Peripherals: SPI 1, UART 2, Timers 12, GPIO 32, Upgraded SPORTs 2, 2 Video I/O and DMA. Also shown: PLL, Dynamic Power Management, Switching Regulator.]

BlackfinDSP plus Network Connectivity

Targets ‘simple’ connected DSP applications

Single 500 MHz Core, 500MHz/1000 MMACs

Integrated Ethernet and USB connectivity

TSMC 0.13um, 5LM

[Figure: block diagram. Processor Core 500MHz; L1 memory: Instruction SRAM/Cache 16KB, Instruction SRAM 64KB, Data SRAM/Cache 32KB, ScratchPad 4KB; System Control Blocks: Test Control, Emulation Control, Event Controller, Watchdog Timer, Memory DMA, PLL; System Interface Unit; Memory Interface, 32-bit OR 16-bit + 16 GPIO; Peripheral Blocks: USB v1.1 Device (with PHY), GPIO, I2C, UARTs, SPI, Timer0/1/2, SPORT0, SPORT1, Ethernet MAC (MII).]

BLACKfin DSP Portable Video DSP Roadmap

[Figure: block diagram. Blocks shown: Blackfin Processor Core with 80kByte Instruction and 68kByte Data memory; 128kByte Image Buffer; Emulator & Test; Dynamic Power Management; FLASH, SRAM, SDRAM Memory Interface; UART; I2S Audio Converter Interface; SPORT1; Timers; GP IO; Video Capture; Video Display; LCD Controller; ATA Interface; SPI; I2C; Real Time Clock.]