High-Speed and Low-Power On-Chip Global Link Using Continuous-Time Linear Equalizer

Yulei Zhang (1), James F. Buckwalter (1), and Chung-Kuan Cheng (2)
(1) Dept. of ECE, (2) Dept. of CSE, UC San Diego, La Jolla, CA

19th Conference on Electrical Performance of Electronic Packaging and Systems, Oct 25, 2010, Austin, USA




Outline
- Introduction
- Equalized On-Chip Global Link
  - Overall structure
  - Basic working principle
- Driver Design for On-Chip Transmission-Line
  - Guideline for tapered CML driver
  - Driver design example
- Continuous-Time Linear Equalizer (CTLE) Design
  - CTLE modeling
  - CTLE design example
- Driver-Receiver Co-Design for Low Energy per Bit
  - Methodology
  - Overall link design example
- Conclusion


Research Motivation
- Global interconnect planning becomes a challenge in ultra-deep sub-micron (UDSM) processes
  - Performance gap between global wires and logic gates
  - Conventional buffer insertion adds a large power overhead
- Uninterrupted wire configurations are used to tackle on-chip global communication issues
  - On-chip T-lines to reduce interconnect power
  - Equalization to improve the bandwidth
  - State of the art [Kim2009]: 2Gb/s/um, < 1pJ/b, signaling over 10mm global wire in 90nm


Our Contributions
- Contributions
  - Build a novel equalized on-chip T-line structure for global communication: tapered CML driver + CTLE receiver
  - Accurate small-signal modeling of the CTLE receiver to improve the optimization quality
  - A design methodology for driver-wire-receiver co-optimization to reduce the total energy per bit
- Results of our design
  - 20Gbps signaling over a 10mm, 2.2um-pitch on-chip T-line
  - 11ps/mm latency and 0.2pJ/b energy per bit in 45nm
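As a quick sanity check on these figures, the sketch below derives the bandwidth density, end-to-end latency, and implied link power from the quoted numbers. It assumes that bandwidth density is simply the data rate divided by the 2.2um pitch and that the 0.2pJ/b figure covers the whole link; the slide does not spell out either definition.

```python
# Figures quoted on the slide.
data_rate_gbps = 20.0        # Gb/s
pitch_um = 2.2               # T-line pitch
length_mm = 10.0             # link length
latency_ps_per_mm = 11.0
energy_pj_per_bit = 0.2

# Assumption: bandwidth density = data rate / pitch.
bandwidth_density = data_rate_gbps / pitch_um          # Gb/s per um
total_latency_ps = latency_ps_per_mm * length_mm       # ps over the full wire
# pJ/b times Gb/s gives mW (1e-12 J/b * 1e9 b/s = 1e-3 W).
link_power_mw = energy_pj_per_bit * data_rate_gbps

print(f"bandwidth density: {bandwidth_density:.2f} Gb/s/um")
print(f"total latency:     {total_latency_ps:.0f} ps")
print(f"implied power:     {link_power_mw:.1f} mW")
```

Under these assumptions the link delivers roughly 9Gb/s/um, compared with the 2Gb/s/um quoted for [Kim2009] in 90nm.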


Equalized On-Chip Global Link
- Overall structure
  - Tapered current-mode logic (CML) drivers
  - Terminated differential on-chip T-line
  - Continuous-time linear equalizer (CTLE) receiver
  - Sense-amplifier based latch


Basic Working Principle
- Tapered CML driver
  - Provides low-swing differential signals to drive the T-line
  - Tapering factor u, number of stages N, fan-out X, final-stage current ISS, driver resistance RS
- T-line
  - Differential wire with P/G shielding
  - Geometries (width, pitch) and termination resistance RT
- CTLE receiver
  - Recovers the signal and improves eye quality
  - Load resistance RL, source degeneration resistance RD and capacitance CD, over-drive voltage Vod
- Sense-amplifier based latch
  - Synchronizes and converts the signal back to digital levels


Tapered CML Driver Design
- Output swing constraint
- Design guideline [Tsuchiya2006, Heydari2004]
  - Begin from the final stage
    - For a given VSW, output resistance RS is optimized with RT to increase eye-opening
    - Transistor size
  - Tapering factor u = 2.7 for delay reduction
  - Number of stages
  - Each previous stage is designed backward by scaling with the factor u
- Need to design:
  1) Output resistance RS
  2) Tail current ISS
  3) Size of transistors W
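The backward, stage-by-stage sizing can be sketched as follows. This is a minimal illustration rather than the authors' procedure: it assumes every stage keeps the same output swing ISS x RS, so tail current and transistor width shrink by u per stage while the load resistance grows by u. The function name and the final-stage values are hypothetical.

```python
def taper_cml_chain(iss_final_ma, rs_final_ohm, w_final_um, n_stages, u=2.7):
    """Size a tapered CML chain backward from the final stage.

    Assumption (not stated on the slides): each earlier stage carries 1/u of
    the tail current and width, and u times the load resistance, of the stage
    it drives, so the swing ISS * RS is the same in every stage.
    Returns a list of (iss_ma, rs_ohm, w_um) tuples, first stage first.
    """
    stages = []
    for k in range(n_stages):              # k = 0 is the final stage
        scale = u ** k
        stages.append((iss_final_ma / scale,
                       rs_final_ohm * scale,
                       w_final_um / scale))
    stages.reverse()
    return stages


# Hypothetical final-stage values, for illustration only.
chain = taper_cml_chain(iss_final_ma=2.0, rs_final_ohm=50.0,
                        w_final_um=20.0, n_stages=3)
for i, (iss, rs, w) in enumerate(chain, start=1):
    # mA * ohm = mV
    print(f"stage {i}: ISS = {iss:.3f} mA, RS = {rs:.1f} ohm, "
          f"W = {w:.2f} um, swing = {iss * rs:.1f} mV")
```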


CML Driver Study w/ Loaded T-line
- Assume 45nm 1P11M CMOS
- T-line built on M9 with M1 as reference
- T = 1.2um, H = 3.5um (fixed)
- Optimize W and S for eye-opening

[Figure: change of the eye-opening with width for a fixed 2um pitch]

[Figure: change of the eye-opening with pitch for equal width/spacing]


CML Driver Design Example
- Experimental observations
  - The optimal eye occurs when width = spacing
  - Eye-opening improves with larger pitch
- Design methodology
  - Choose the minimum pitch that satisfies the wire-end eye-opening requirement
- Design example
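The pitch-selection rule amounts to a simple search over a pitch sweep, sketched below. The sweep values are placeholders invented for illustration, not results from the slides, and the function name is hypothetical; in practice each entry would come from a simulation with width = spacing = pitch/2.

```python
def choose_min_pitch(eye_by_pitch_mv, required_eye_mv):
    """Return the smallest pitch (um) whose wire-end eye meets the
    requirement, or None if no swept pitch is sufficient."""
    for pitch in sorted(eye_by_pitch_mv):
        if eye_by_pitch_mv[pitch] >= required_eye_mv:
            return pitch
    return None


# Placeholder sweep (pitch in um -> wire-end eye in mV); NOT data from the slides.
sweep = {1.0: 4.0, 1.5: 9.0, 2.0: 14.0, 2.5: 19.0, 3.0: 23.0}
print(choose_min_pitch(sweep, required_eye_mv=15.0))
```

Because the eye-opening is observed to improve with larger pitch, stopping at the first pitch that meets the requirement also yields the densest wiring that does so.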


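Accurate CTLE Modeling

[Figure: small-signal circuit used to derive H(s): input vin at the gate G, gm*vgs and rds between drain D and source S, RD and CD at the source, RL and CL at the output vout]

Small-signal transfer function:

\[
H(s) = \mathrm{Gain}_{DC}\,\frac{1 + s R_D C_D}{1 + a s + b s^2}
\]

\[
\mathrm{Gain}_{DC} = \frac{g_m r_{ds} R_L}{(g_m r_{ds} + 1) R_D + r_{ds} + R_L}
\]

\[
a = \frac{r_{ds}\,(R_L C_L + R_D C_D) + (g_m r_{ds} + 1)\, R_D R_L C_L + R_L R_D C_D}{(g_m r_{ds} + 1) R_D + r_{ds} + R_L}
\]

\[
b = \frac{r_{ds} R_D C_D R_L C_L}{(g_m r_{ds} + 1) R_D + r_{ds} + R_L}
\]

\[
\omega_z = \frac{1}{R_D C_D}, \qquad \omega_{p1} \approx \frac{1}{a}, \qquad \omega_{p2} \approx \frac{a}{b}
\]

Parasitic capacitances:

\[
C_S^{para} = 1.5\,\mathrm{fF/\mu m} \cdot W, \qquad C_D^{para} = 1.5\,\mathrm{fF/\mu m} \cdot W
\]

\[
C_D = C_D^{ex} + C_S^{para}, \qquad C_L = C_L^{ex} + C_D^{para}
\]

[Bias equations relating gm, rds, IBias and W/L to Vod, Vdd, Vic, RL, L and K, with an exponent of 1.2: not recoverable from the extracted text]

Design variables: RL, RD, CD, Vod (size)

[Hanumolu2005]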
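The second-order model H(s) = Gain_DC (1 + s RD CD) / (1 + a s + b s^2) from this slide can be evaluated numerically as sketched below. RL, RD and CD are taken from the Methodology B column of the performance summary and treated as half-circuit values, which is an assumption; gm, rds and CL are hypothetical placeholders, since the slides do not list them. The function names are illustrative.

```python
import math


def ctle_coefficients(gm, rds, RL, RD, CD, CL):
    """Return (gain_dc, a, b) of the source-degenerated CTLE model."""
    den = (gm * rds + 1.0) * RD + rds + RL
    gain_dc = gm * rds * RL / den
    a = (rds * (RL * CL + RD * CD)
         + (gm * rds + 1.0) * RD * RL * CL
         + RL * RD * CD) / den
    b = rds * RD * CD * RL * CL / den
    return gain_dc, a, b


def ctle_gain(f_hz, gm, rds, RL, RD, CD, CL):
    """Magnitude of H(j*2*pi*f)."""
    gain_dc, a, b = ctle_coefficients(gm, rds, RL, RD, CD, CL)
    s = 2j * math.pi * f_hz
    return abs(gain_dc * (1.0 + s * RD * CD) / (1.0 + a * s + b * s * s))


params = dict(
    RL=890.0, RD=1430.0, CD=150e-15,   # from the Methodology B column
    gm=2e-3, rds=20e3, CL=10e-15,      # hypothetical placeholders
)
gain_dc, a, b = ctle_coefficients(**params)
f_zero = 1.0 / (2 * math.pi * params["RD"] * params["CD"])
f_p1 = 1.0 / (2 * math.pi * a)         # dominant-pole approximation
f_p2 = a / (2 * math.pi * b)

dc = ctle_gain(0.0, **params)
nyquist = ctle_gain(10e9, **params)    # 10 GHz Nyquist for 20 Gb/s
print(f"DC gain {dc:.3f}, zero {f_zero / 1e9:.2f} GHz, "
      f"poles {f_p1 / 1e9:.2f} / {f_p2 / 1e9:.2f} GHz")
print(f"peaking at Nyquist: {20 * math.log10(nyquist / dc):.1f} dB")
```

The zero set by RD and CD sits below the first pole, so the response rises toward the Nyquist frequency; that high-frequency boost is what compensates the T-line loss.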


CTLE Modeling Validation
- Test case: 10mm, 16mV eye at wire-end
- Blue lines: simple model, not considering rds and parasitics
- Red line: only considers rds
- Black line: the proposed accurate model
- <10% correlation error
- >20% eye-opening increase


CTLE Design Example
- Observations of CTLE study
  - Eye-opening improves with relaxed power constraints but tends to saturate
- Design example
  - Based on the pre-optimized CML driver + T-line design
  - Eye-opening improved by 4X after CTLE


Driver-Receiver Co-Design Methodology
- Optimize driver-wire-receiver together by setting Veye/Power as the cost function
- Choose pre-designed CML/T-line/CTLE as the initial solution
- Optimization flow
  - Driver-to-receiver step-response generation based on SPICE simulation and CTLE modeling
  - Eye-opening estimation based on the step response
  - SQP-based non-linear optimization
  - Variables: [ISS, RT, RL, RD, CD, Vod]
- Performance comparison
  - Option A: driver/receiver independent design
  - Option B: low-power driver/receiver co-design
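One common way to turn a step response into an eye-opening estimate is peak-distortion analysis: form the single-bit pulse response and subtract the worst-case inter-symbol interference from the cursor. The slides do not say which estimator is used, so the sketch below is an assumption; the function names and the first-order RC test channel are illustrative only.

```python
import math


def pulse_response(step, n_ui):
    """Single-bit response step(t) - step(t - UI) of a sampled step response.

    The step response is assumed to be zero before index 0."""
    return [s - (step[i - n_ui] if i >= n_ui else 0.0)
            for i, s in enumerate(step)]


def worst_case_eye(step, n_ui):
    """Peak-distortion eye height per unit step amplitude.

    For every candidate sampling index t0, the eye is the cursor value minus
    the sum of |pulse| at all other samples spaced whole UIs away from t0.
    The best sampling phase is returned."""
    pulse = pulse_response(step, n_ui)
    best = float("-inf")
    for t0 in range(len(pulse)):
        isi = sum(abs(pulse[t])
                  for t in range(t0 % n_ui, len(pulse), n_ui) if t != t0)
        best = max(best, pulse[t0] - isi)
    return best


# Illustrative channel: first-order RC with time constant equal to one UI.
n_ui = 50                                   # samples per unit interval
step = [1.0 - math.exp(-i / n_ui) for i in range(30 * n_ui)]
eye = worst_case_eye(step, n_ui)
print(f"worst-case eye: {eye:.4f} of the step amplitude")
```

For this test channel the estimate matches the closed-form value 1 - 2/e, about 0.26 of the step amplitude. The record must be long enough for the step response to settle, otherwise the ISI tail is underestimated.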


Low Energy-per-Bit Optimization Flow
1. Initial solution for driver-receiver co-design: pre-designed CML driver and pre-designed CTLE receiver
2. Co-design cost function estimation
   - SPICE-generated T-line step response
   - Receiver step response using CTLE modeling
   - Step-response based eye estimation
   - Cost function: Veye/Power
3. Internal SQP (Sequential Quadratic Programming) routine generates the best solution by changing the variables [ISS, RT, RL, RD, CD, Vod]
4. Output: best set of design variables in terms of overall energy per bit


Simulated Eye Diagrams

[Figure: eye diagram for Methodology A, driver/receiver separate design]

[Figure: eye diagram for Methodology B, driver/receiver co-design for low power]


Summary of Performance Comparison

Parameter                  Methodology A        Methodology B
                           (separate design)    (co-design for low power)
RS (ohm)                   47                   148
RT (ohm)                   94                   1100
RL (ohm)                   440                  890
RD (ohm)                   110                  1430
CD (fF)                    680                  150
Vod (mV)                   60                   58
Eye-opening at CTLE (mV)   91                   113
Power consumption (mW)     8.1                  3.8

Note: the co-design methodology uses much larger driver and termination resistances to reduce power, which closes the eye at the driver output and at the wire end. The final eye is recovered by fully utilizing the CTLE.
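The comparison can be restated in terms of energy per bit and the Veye/Power cost function, as sketched below. The eye-opening and power values are those reported for the two methodologies; applying the 20Gb/s data rate to both designs, and treating the reported power as the whole-link power, are assumptions.

```python
DATA_RATE_GBPS = 20.0   # assumed to apply to both designs

# Eye-opening at the CTLE output and power consumption, as reported.
designs = {
    "A (separate design)": {"eye_mv": 91.0, "power_mw": 8.1},
    "B (co-design)":       {"eye_mv": 113.0, "power_mw": 3.8},
}


def energy_pj_per_bit(power_mw, rate_gbps):
    """mW divided by Gb/s gives pJ/b (1e-3 W / 1e9 b/s = 1e-12 J/b)."""
    return power_mw / rate_gbps


results = {}
for name, d in designs.items():
    results[name] = {
        "energy_pj_per_bit": energy_pj_per_bit(d["power_mw"], DATA_RATE_GBPS),
        "eye_per_power": d["eye_mv"] / d["power_mw"],   # cost function, mV/mW
    }
    print(f"{name}: {results[name]['energy_pj_per_bit']:.3f} pJ/b, "
          f"Veye/Power = {results[name]['eye_per_power']:.1f} mV/mW")
```

With these inputs Methodology B comes out at 0.19pJ/b, consistent with the 0.2pJ/b quoted in the results, and at less than half the energy per bit of Methodology A.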


Conclusion
- We propose a novel equalized on-chip global link using a CML driver and a CTLE receiver
- Accurate CTLE modeling achieves <10% correlation error and improves eye-opening optimization quality
- Our design achieves
  - 20Gbps signaling over a 10mm, 2.2um-pitch on-chip T-line
  - 11ps/mm latency and 0.2pJ/b energy


Thank You!
Q & A
