A Single Chip Terabit Switch

1

Speaker: Fred Heaton

Authors: Bill Dally, Wayne Dettloff, John Eyles, Trey Greer, John Poulton, Teva Stone, Steve Tell

2

Architecture Overview

[Block diagram: 140 receivers (998 Mb/s - 3.125 Gb/s); 140 x 140 3.125 Gb/s switch core; 140 transmitters (998 Mb/s - 3.125 Gb/s); 8-bit uP interface.]

• Re-usable, Low Power Tx/Rx Module
• Rx Clock/Data recovery
• Adaptive termination
• Lane-by-lane selectable data rates
    • 3 selectable Reference clock inputs
    • 5 selectable multiplier settings
• Asynchronous switch operation
• User selectable Transmitter Pre-emphasis
• Scalable architecture

3

Applications

• 3/5 Stage Clos Network configurations
    • 9800 port SNB 3 stage Clos network (30+ Tbps)
• Protection switching
• Video routing
• DWDM OEO switching applications

[Figure: two WAN SONET rings and an OEO switch, annotated "will be replaced by".]
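The 9800-port figure can be reproduced from Clos's strict non-blocking condition. The sketch below is a worked calculation, not the authors' sizing method: it assumes each stage is built from 140 x 140 crosspoints, that every lane runs at 3.125 Gb/s, and that the edge switches are sized to the largest n satisfying m = 2n - 1 <= 140. The function name `snb_clos` and its fields are hypothetical.

```python
def snb_clos(ports_per_chip: int, lane_gbps: float) -> dict:
    """Size a 3-stage strictly non-blocking (SNB) Clos network from k x k chips."""
    k = ports_per_chip
    # Clos's SNB condition: m >= 2n - 1, where each edge switch has
    # n external ports and m links into the middle stage.
    n = (k + 1) // 2      # largest n such that 2n - 1 <= k
    m = 2 * n - 1         # number of middle-stage switches
    r = k                 # edge switches per side, capped by middle-switch size
    ports = n * r
    return {
        "n": n,
        "m": m,
        "r": r,
        "ports": ports,
        "chips": 2 * r + m,                       # ingress + egress + middle
        "aggregate_tbps": ports * lane_gbps / 1000.0,
    }


cfg = snb_clos(140, 3.125)
print(cfg)
```

Under these assumptions n = 70 and m = 139, which gives 70 x 140 = 9800 ports and about 30.6 Tb/s, consistent with the "30+ Tbps" on the slide.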

4

Receiver

[Block diagram: receiver assemblies 0 through N. Labels: RX0 ± input, Rx, CDR, Clock Gen, RxDat (bus marked "8"), RxToggle, "To Switch"; reference clocks RFCA ±, RFCB ±, RFCC ±; Host Interface, Controller, and Global logic with 8 bit uP Interface and MDIO/MDC; 140 x 140 3.125 Gb/s Switch Core.]

5

Switch

[Block diagram: 140 x 140 3.125 Gb/s Switch Core, "From Rx" on the input side and "To Tx" on the output side.]

Crosspoint switch fabric, 140 x 140 x 9

• Full reconfiguration and multicast support
• Asynchronous operation
• Differential data path design
• Byte wide datapath + 1 "clock"
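Reconfiguration and multicast in a crosspoint can be pictured as a per-output input selector. The following is a behavioral sketch only, with a hypothetical `Crosspoint` class. It says nothing about the chip's actual register map or host commands, which the slides do not describe.

```python
class Crosspoint:
    """Behavioral model: every output independently selects one input."""

    def __init__(self, n_in: int = 140, n_out: int = 140):
        self.n_in = n_in
        self.select = [None] * n_out   # select[out] = input index, or None

    def connect(self, inp: int, out: int) -> None:
        if not (0 <= inp < self.n_in and 0 <= out < len(self.select)):
            raise IndexError("port out of range")
        self.select[out] = inp

    def multicast(self, inp: int, outs) -> None:
        # Multicast is just several outputs selecting the same input.
        for out in outs:
            self.connect(inp, out)

    def route(self, inputs):
        """Return what each output carries for the given input values."""
        return [None if s is None else inputs[s] for s in self.select]


xp = Crosspoint(4, 4)           # small instance for illustration
xp.connect(0, 3)
xp.multicast(2, [0, 1])
routed = xp.route(["a", "b", "c", "d"])
print(routed)
```

Because each output holds its own selector, any input-to-output mapping (including one-to-many) is reachable without disturbing the other outputs.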

6

Transmitter

[Block diagram: transmitter assemblies 1 through N. Labels: "From Switch" (bus marked "8"), TxDat, TxToggle, CDR, Clock Gen, Tx, TX0 ± output; switch control, switch select, host control buses; reference clocks RFCA ±, RFCB ±, RFCC ±; Host Interface, Controller, and Global logic with 8 bit uP Interface and MDIO/MDC; 140 x 140 3.125 Gb/s Switch Core.]

• Differential, CML signaling

• Transmitter Pre-emphasis

• On-chip adaptive termination

7

Technology

• 0.18 micron, 7 Layer Metal
• Flip-chip die attach
• Ceramic BGA package
• 1 mm ball pitch
    • 36 x 36 full array (1296 pins)
    • 168 Vdd pins (1.8 volt nominal)
    • 13 Ivdd pins (2.5/3.3 volt nominal)
    • 473 Ground pins

8

Electrical/Performance Specs

• Rx Input Sensitivity - 35 mV
• Jitter Tolerance - 0.65 UI
• Jitter Generation - 0.25 UI
• Plesiochronous tracking of 200 ppm difference between reference and embedded data clock
• Switch Reconfiguration time ~30 µs
• Clock Lock time ~9 µs
• Selectable output levels (675 mV max)
• Power Dissipation
    • 18 Watts @ 2.5 Gbps (1.8 V operation)
    • 22 Watts @ 3.125 Gbps (1.8 V operation), 130 mW/Tx-Rx pair
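To make the UI and ppm figures concrete, the short calculation below converts them into picoseconds and bits. It is plain arithmetic on the numbers quoted above; the helper names are hypothetical, and reading the per-pair power as 140 pairs x 130 mW is an assumption about how that figure relates to the 22 W total.

```python
def unit_interval_ps(rate_gbps: float) -> float:
    """Length of one bit period (UI) in picoseconds."""
    return 1000.0 / rate_gbps


def bits_per_ui_slip(ppm: float) -> float:
    """Bits over which a frequency offset of `ppm` accumulates one UI of phase."""
    return 1_000_000 / ppm


JITTER_TOLERANCE_UI = 0.65
JITTER_GENERATION_UI = 0.25

for rate in (2.5, 3.125):
    ui = unit_interval_ps(rate)
    print(f"{rate} Gb/s: UI = {ui:.0f} ps, "
          f"tolerance = {JITTER_TOLERANCE_UI * ui:.0f} ps, "
          f"generation = {JITTER_GENERATION_UI * ui:.0f} ps")

print(f"200 ppm offset: one UI of drift every {bits_per_ui_slip(200):.0f} bits")
# Assumption: the 130 mW figure applies to each of the 140 Tx-Rx pairs.
print(f"140 pairs x 130 mW = {140 * 0.130:.1f} W of the 22 W total")
```

The 200 ppm line is why the receiver needs a phase interpolator that can rotate continuously: the recovered clock phase must keep advancing relative to the reference rather than settle at a fixed offset.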

9

Receiver

[Block diagram: receiver front end. Labels: RX+, RX- inputs; ESD; Programmable Termination; Data/Edge Samplers (D0, E0, D1, E1) clocked by quadrature clocks dclkP, dclkN, eclkP, eclkN; Early/Late Logic with early and late outputs; LPF; 720° Phase Interpolator; Clock Multiplier (RFC A, B, C; Multiplier settings); 8/10; RXDAT[9:0] and RXTOGGLE to switch; 2^10-1 PRBS Checker.]

10

Early Late Logic

[Timing diagram: data samples (D0, D1) and edge samples (E0, E1) taken on dclkP, dclkN, eclkP, eclkN relative to the logic threshold; adjacent samples are compared (= or !=) and the resulting cases are marked "early" or "late".]
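The early/late decision can be sketched as a conventional bang-bang (Alexander-style) phase detector. This is an assumption about the general technique, not a transcription of the slide's logic: the polarity chosen here (edge sample equal to the earlier data sample means the clock is early) and the function names are hypothetical.

```python
def early_late(d0: int, e0: int, d1: int) -> int:
    """Classify one data/edge/data sample triplet.

    d0, d1 are consecutive data samples; e0 is the edge sample taken
    between them. Returns -1 (clock early), +1 (clock late), or 0 when
    there is no data transition and therefore no timing information.
    """
    if d0 == d1:
        return 0
    # With a transition present, the edge sample shows on which side of
    # the crossing the edge clock landed.
    return -1 if e0 == d0 else +1


def phase_vote(triplets) -> int:
    """Sum early/late decisions; a crude stand-in for the loop filter (LPF)."""
    return sum(early_late(d0, e0, d1) for d0, e0, d1 in triplets)


samples = [(0, 0, 1), (1, 1, 0), (1, 1, 1), (0, 1, 1)]
vote = phase_vote(samples)
print(vote)   # negative: more "early" than "late" decisions
```

In a loop like the one on the receiver slide, the filtered vote would steer the phase interpolator until early and late decisions balance.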

11

Switch Design

• Asynchronous, low-swing differential design
• Pre-emphasis technique used to drive heavily loaded row/column lines
• Weak driver and "resistors" used to maintain differential voltage at DC
• 3 Watts worst case power dissipation at 3.125 Gbps, 2 volt operation

[Circuit diagram. Labels: In+, In-; Row driver with Edge Detectors, Strong driver (NFET H bridge driver), Weak driver, resistor; row; Diff. amp; Column driver with Edge Detectors; Column; Diff. amp; Out+/-.]

12

Nominal Spice simulation, 2.5 Gbps operation

[Waveform plot. Traces: In, Row near end, Row far end, Xpt amp, Col near end, Out. Annotations: 3.2 ns, 260 mV, 780 mV.]

13

Transmitter

[Block diagram: transmitter back end. Labels: TXDAT and TXTOGGLE from switch; Tx CDR; 8/10; 2^10-1 PRBS Generator; Clock Multiplier (RFC A, B, C; Multiplier settings); 720° Phase Interpolator; quadrature clocks dclkP, dclkN, eclkP, eclkN; Driver; Programmable Termination; ESD; TX+, TX- outputs.]

14

Transmitter Current Steering Element

• 4 identical sets of current steering elements
• Separate staggered clocks provided to each for slew rate control
• Programmable bias generator (5 settings)
• Equalization tap (5 settings)

[Circuit diagram: current steering elements. Labels: di-bit data d0, d1; dclk, dclk[0]; s0, s1; enables en0N, en1N, en2N, en3N; bias Vbn from Bias Gen.; EQ; outputs TX+, TX-.]

15

Interconnect Performance @ Gigabit rates

• Skin effect attenuation - √freq relationship
• Dielectric loss - to first order, linear attenuation with frequency
• Serial transmitters use pre-emphasis to compensate for signal distortion effects of FR4

[Plot: Amplitude (dB), 0.0 to -9.0, vs. Frequency (GHz), 1.0 to 3.0. Curves: FR4 dielectric loss, Skin effect loss (trace geometry), Total trace loss.]
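The two loss mechanisms can be combined into a simple first-order model: skin effect loss growing with the square root of frequency and dielectric loss growing linearly. The coefficients `k_skin` and `k_diel` below are placeholders chosen only for illustration. They are not taken from the plot and would have to be fitted to a real trace.

```python
import math


def trace_loss_db(f_ghz: float, k_skin: float = 1.5, k_diel: float = 1.5):
    """First-order trace loss in dB at f_ghz.

    k_skin (dB per sqrt(GHz)) and k_diel (dB per GHz) are hypothetical
    coefficients that depend on trace geometry, length, and laminate.
    """
    skin = -k_skin * math.sqrt(f_ghz)    # skin effect: proportional to sqrt(f)
    dielectric = -k_diel * f_ghz         # dielectric loss: proportional to f
    return skin, dielectric, skin + dielectric


for f in (0.5, 1.0, 2.0, 3.0):
    skin, diel, total = trace_loss_db(f)
    print(f"{f:.1f} GHz: skin {skin:6.2f} dB, dielectric {diel:6.2f} dB, "
          f"total {total:6.2f} dB")
```

Whatever the coefficients, the linear term overtakes the square-root term as frequency rises, so dielectric loss dominates at the high end of the range.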

16

Example Waveform (No Pre-emphasis)

[Plot: Amplitude (V), 0.0 to 0.75, vs. Time (ns), 0.0 to 1.4. Curves: Transmitted pulse, Received pulse.]

17

Example Waveform (Pre-emphasis)

[Plot: Amplitude (V), -0.25 to 1.00, horizontal axis 0.0 to 1.4. Curves: Pre-Emphasis transmitted pulse (levels a and b marked), Pre-Emphasis pulse response, Response without pre-emphasis.]

• Pre-Emphasis 1-tap equation:

    Vpr(n) = a * Vi(n) - b * Vi(n-1)

• Narrows the pulse, which opens the transmit eye at the receiver
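The 1-tap equation is a two-coefficient FIR filter applied to the symbol stream. The sketch below applies it to a single isolated pulse; the values a = 1.0 and b = 0.25 and the zero initial condition for Vi(-1) are assumptions made for illustration, since the chip's selectable settings are not listed.

```python
def pre_emphasize(vi, a: float = 1.0, b: float = 0.25):
    """Apply Vpr(n) = a * Vi(n) - b * Vi(n-1) to a sequence of symbol levels.

    Assumes Vi(-1) = 0. The defaults for a and b are illustrative only.
    """
    out = []
    prev = 0.0
    for v in vi:
        out.append(a * v - b * prev)
        prev = v
    return out


pulse = [0.0, 1.0, 0.0, 0.0]
shaped = pre_emphasize(pulse)
print(shaped)

run = [1.0, 1.0, 1.0, 1.0]
print(pre_emphasize(run))   # first symbol full height, repeats reduced to a - b
```

The negative lobe after the pulse partly cancels the tail the channel adds, and repeated symbols are sent at reduced amplitude, which is the pulse narrowing the slide refers to.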

18

Pre-Emphasis at 2.5 Gb/s, PRBS data, TX to RX over 40" FR4:

[Eye diagrams: (1) At Transmitter: No Pre-Emphasis, with the Signal at Receiver; (2) At Transmitter: Pre-Emphasis Enabled, with the Signal at Receiver.]

19

On-chip Processor

• 8 bit Microprocessor
    • 4K x 14 bit instruction ROM
    • 4K x 14 bit instruction RAM
    • 512 x 8 bit data RAM
    • External EEPROM interface
• On reset processor performs
    • On-chip register initialization
    • Default switch configuration
    • Receiver offset trim
    • Termination resistor calibration
• Ongoing polling loop
    • Register updates, termination resistor updates

20

2^10-1 PRBS, 3.577 Gbps, 4" FR4

[Eye diagram; annotation: 650 mV.]

Duty cycle distortion present due to internal clock load imbalance

Eye reduction in both the time domain and amplitude
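A 2^10-1 PRBS like the one used for this measurement can be generated with a 10-bit linear feedback shift register. The slides do not give the feedback polynomial, so the taps below (the commonly used x^10 + x^7 + 1 family) are an assumption, as are the function names and the simple aligned checker, which does not model how the on-chip checker synchronizes.

```python
def prbs10(n_bits: int, seed: int = 0x3FF):
    """Generate n_bits of a 2^10-1 PRBS from a 10-bit Fibonacci LFSR.

    Assumed recurrence: s[n] = s[n-10] XOR s[n-7].
    """
    state = seed & 0x3FF
    if state == 0:
        raise ValueError("all-zero seed locks up the LFSR")
    bits = []
    for _ in range(n_bits):
        fb = ((state >> 9) ^ (state >> 6)) & 1   # taps at stages 10 and 7
        state = ((state << 1) | fb) & 0x3FF
        bits.append(fb)
    return bits


def count_errors(received, seed: int = 0x3FF) -> int:
    """Compare received bits against a locally generated, aligned PRBS."""
    expected = prbs10(len(received), seed)
    return sum(r != e for r, e in zip(received, expected))


seq = prbs10(2046)
period_ok = seq[:1023] == seq[1023:]
corrupted = list(seq[:1023])
corrupted[100] ^= 1                     # inject a single bit error
print(period_ok, sum(seq[:1023]), count_errors(corrupted))
```

The sequence repeats every 1023 bits and contains every 10-bit pattern except all zeros, which makes it a compact stress pattern for the CDR and equalization.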

21

Terabit Switch Building Blocks

• Crosspoint switch demonstrates two key building blocks
    • High-speed I/Os
    • High-performance switch core
• Different switches can be realized by adding application-specific logic between these
    • Grooming switch: framers and time-slot interchangers
    • Cell switch: scheduler and queueing

[Block diagram. Labels: 140 3.125 Gb/s Receivers; Application-Specific Logic; 140 x 140 3.125 Gb/s Switch Core; 140 3.125 Gb/s Transmitters.]

22

Other Products

• VC2002 SONET/SDH Grooming Switch
    • 72 Integrated 2.5 Gb/s I/O port pairs
    • SONET Input and Output Processing
    • STS-192(c) Support
    • STS-1 level switching (3456 x 3456 STS-1 Switch)
    • Multi-stage scalable architecture
• VC1001/2/3 Octal SERDES
    • Full-duplex, 1 Gbps - 3.125 Gbps, 1.5 - 2.1 Watts
    • Quad version also available
