On-chip power distribution in deep submicron technologies Aida Todri Electrical and Computer Engineering Department University of California Santa Barbara


Aida Todri

Electrical and Computer Engineering Department

University of California Santa Barbara

Outline

• Introduction
• Problem Statement and Formulation
• Electromigration (EM) Phenomena in Power Gated Networks: EM Analysis and Grid Optimization
• Decoupling Capacitor Efficiency in Power Networks: Metrics and Placement
• Power Supply Noise Reduction in Multi-core System: Power vs Performance Trade-offs
• Conclusions

Technology Scaling

Advantages:
• Increasing device count
• Higher transistor density
• Increasing logic switching speed
• Increasing clock frequencies

Disadvantages:
• Increasing internal capacitance
• Increasing leakage current, hence higher standby power
• Increasing dynamic power, hence larger transient currents

On-Chip Power Delivery Network

Hierarchical mesh structure on several metal layers:
• Global grid occupies the top two layers of the chip
• Local (block) grid occupies lower metal layers

Must satisfy reliability constraints.

In DC (steady state) conditions:
• Voltage drop (IR) must be within margins
• Current density in power tracks must not exceed the allowed current density

In AC (transient) conditions:
• Power supply noise must be within margins
• Decaps may be inserted to suppress power supply noise and to lower the impedance of power tracks

Low-Power Strategy

Idle blocks can be disconnected from the grid, which eliminates their static power.

A sleep transistor controls the wake-up or sleep mode of the gated block.

[Figure: Power gating technique. A header switch driven by the Sleep signal sits between Vdd and the logic block; the logic block connects to Gnd.]

Power Gating Technique

[Figure: top-layer grid with Vdd connections, sleep transistors, and blocks B1, B2, B3 (two gated blocks and one ungated block).]

• The top layer is the global grid, designed to satisfy reliability constraints (EM and IR) when all circuits are switching
• Each block has its local power mesh
• Many power gating configurations exist

Research Topics of Interest - 1

Designing Power Grid for Power-Gated Chips

• Power grids are typically designed at the early stages of the design process
• They are mostly over-designed, causing a large overhead in chip power consumption
• Power gating is not considered during the design of power grids

[Figure: block current vs. time, with current levels Ipeak and Ileakage and time marks to, tp, tf, T.]

On-chip Power Delivery for Power Gated Chips

Objective: Deliver power to the circuit blocks while satisfying reliability constraints in the power grid when power gating is applied.

[Figure: global power grid with Vdd connections, sleep transistors S1, S2, S3, and the local power grids of one ungated block and two gated blocks, linked by global, intermediate, and local vias.]

• Power tracks are not ideal and have finite resistance

• Many possible configurations of operating blocks

Electromigration Mechanisms

• Transport of metal atoms under the force of an electron flux (high current density stress)
• Depletion and accumulation of metal material from atomic flow can form hillocks and voids in metal lines, which lead to short-circuit and open-circuit faults

[Figure: metal line showing voids, hillocks, and grain boundaries. Photo courtesy of University of Notre Dame.]

Electromigration on Power Gated Grids

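[Figure: Macro I and Macro II draw currents IA and IB from a VDD grid with branch resistances R1, R2, R3. The panels show the branch currents I1, I2, I3, the base currents of each macro in every branch (IA1, IA2, IA3 and IB1, IB2, IB3), and the branch currents I'1, I'2, I'3.]

EM violations may occur only on those branches where base currents flow in opposite directions.

Before power gating, each branch current is the superposition of the signed base currents:

$I_k = I_k^A + I_k^B, \quad k = 1, 2, 3$

After power gating, only the base currents of B remain:

$\tilde{I}_k = I_k^B, \quad k = 1, 2, 3$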
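As a small numerical illustration of the statement about opposing base currents, the sketch below applies superposition to three branches. The base-current values are hypothetical, the sign of each value encodes its direction, and macro A is assumed to be the gated one.

```python
# Hypothetical signed base currents (mA) in branches R1, R2, R3.
base_a = [2.0, -1.0, -1.0]   # contribution of macro A alone
base_b = [1.0, 1.0, -2.0]    # contribution of macro B alone

# Both macros on: each branch current is the sum of the base currents.
before = [a + b for a, b in zip(base_a, base_b)]
# Macro A gated off (assumption): only B's base currents remain.
after = list(base_b)

# Branches whose base currents oppose each other (negative product).
opposing = [k for k, (a, b) in enumerate(zip(base_a, base_b), start=1) if a * b < 0]
# Branches whose current magnitude grows once A is gated.
increased = [k for k, (x, y) in enumerate(zip(before, after), start=1) if abs(y) > abs(x)]

print("opposing base currents on branches:", opposing)
print("current magnitude increases on branches:", increased)
```

With these values only the branch with opposing base currents carries more current after gating; on the other branches the magnitude drops.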

IR Drop Analysis for Power Gating

Theorem 1: The grid node voltages can only increase when a current source is turned off.

Corollary: The IR drop can only decrease when power gating is applied, because gating turns current sources off.

Theorem 2: Uniform track resizing of a resistive grid does not change the current flow.

Corollary: Uniform upsizing does not change currents on a grid, so we can always upsize tracks to meet EM and IR constraints.

Uniform upsizing by $J_{VB} / J_{max}$ guarantees that all EM and IR constraints are satisfied for all power gating configurations.
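The two theorems can be checked numerically on a toy network. The sketch below is only an illustration under stated assumptions: it models a single rail fed by a Vdd pad at each end with two loads, not a full mesh, and `solve_rail`, `worst_density`, `UNIT_RES`, `HEIGHT`, and `J_MAX` are hypothetical names and values chosen for the example.

```python
from itertools import product

UNIT_RES = 0.01  # ohms of a unit-width segment (hypothetical value)
HEIGHT = 1.0     # track thickness, arbitrary units (hypothetical)
J_MAX = 2.0      # allowed current density (hypothetical)

def solve_rail(widths, loads, vdd=1.0):
    """Rail with a Vdd pad at each end; returns (segment currents, node voltages)."""
    res = [UNIT_RES / w for w in widths]   # len(loads) + 1 segments
    drawn = [0.0]                          # load current tapped left of each segment
    for load in loads:
        drawn.append(drawn[-1] + load)
    # Both pads sit at Vdd, so the IR drops along the rail must sum to zero.
    i_left = sum(r * d for r, d in zip(res, drawn)) / sum(res)
    seg = [i_left - d for d in drawn]      # left-to-right segment currents
    volts, v = [], vdd
    for r, i in zip(res[:-1], seg[:-1]):
        v -= r * i
        volts.append(v)
    return seg, volts

def worst_density(widths, loads):
    """Largest current density over all on/off (gating) configurations."""
    worst = 0.0
    for mask in product([0, 1], repeat=len(loads)):
        seg, _ = solve_rail(widths, [m * i for m, i in zip(mask, loads)])
        worst = max(worst, max(abs(i) / (w * HEIGHT) for i, w in zip(seg, widths)))
    return worst

widths, loads = [1.0, 1.0, 1.0], [3.0, 3.0]
seg_on, v_on = solve_rail(widths, loads)
_, v_gated = solve_rail(widths, [0.0, 3.0])   # first load gated off

# Theorem 1: no node voltage drops when a source is turned off.
theorem1 = all(g >= o for g, o in zip(v_gated, v_on))

# Theorem 2 and its corollary: upsize every track by J_VB / J_max.
factor = worst_density(widths, loads) / J_MAX
upsized = [w * factor for w in widths]
seg_up, _ = solve_rail(upsized, loads)
theorem2 = all(abs(a - b) < 1e-9 for a, b in zip(seg_on, seg_up))

print(theorem1, theorem2, factor, worst_density(upsized, loads))
```

Because every resistance scales by the same factor, the current split is unchanged, while each current density and each IR drop shrinks by that factor.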

Power-Gating Aware Optimization

We reduce the complexity of the optimization problem by coarsening the grid granularity with the multi-grid technique.

Our optimization scheme has three main steps:
1. Reduce grid size by folding tracks
2. Optimize the reduced grid
3. Unfold the grid to its original granularity

1. Grid Folding

Identify a few neighbor tracks around a violation that remain unfolded.

[Figure: four panels (a) to (d) of a VDD grid illustrating track folding.]

2. Reduced Grid Optimization

A three-step iterative process (3-step LP):
1. Derive current and voltage sensitivities to grid sizing
2. Uniformly upsize the grid by fine-scale upsizing steps {ψ1, ψ2, …, ψr}
3. Shrink the selected tracks

The process is repeated until no violations exist.

[Figure: original grid, then upsizing by ψi from {ψ1, ψ2, …, ψr}, then shrinking of selected tracks.]

LP Problem

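Minimize the total resizing of the grid, expressed over the track resizing coefficients $t_1, t_2, \ldots, t_q$:

[Equation: objective, a max(·) expression over $t_1, t_2, \ldots, t_q$]

subject to the three constraints:

• Current density: [Equation: bound on the branch current density, written in terms of the branch current $I_{b_i}$, the track dimensions $w$ and $h$, the resizing coefficient $t$, and $J_{VB}$]
• Voltage drop: [Equation: bound relating the node voltage $V_i$ to $V_{DD}$]
• Resizing coefficients: [Equation: bound on $t_i$]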

3-step Iterative LP Algorithm

[Flowchart: Start from the initial optimized grid for all sources on. From the power gating configurations, compute the EM violation $J_{VB}$, the IR violation $V_{node}$, and the upsizing coefficient ψ, then derive the finer-scale coefficients ψi. Upsize the grid by ψi and shrink the grid. If $J_{VB} > J_{max}$ or $V_{node} < 0.9 V_{DD}$ (Y), repeat; otherwise (N), the grid is feasible.]
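The upsize-then-shrink loop can be sketched on a toy network. This is a minimal sketch under loud assumptions: the grid is a single rail with a Vdd pad at each end, the shrink step is a greedy search that stands in for the LP, and every name and number (`solve_rail`, `feasible`, `optimize`, `J_MAX`, the ψ steps) is hypothetical.

```python
from itertools import product

UNIT_RES, HEIGHT, VDD = 0.01, 1.0, 1.0   # hypothetical electrical parameters
J_MAX, V_MIN = 2.0, 0.9 * VDD            # hypothetical EM limit, 0.9 Vdd IR limit

def solve_rail(widths, loads):
    """Rail with a Vdd pad at each end; returns (segment currents, node voltages)."""
    res = [UNIT_RES / w for w in widths]
    drawn = [0.0]
    for load in loads:
        drawn.append(drawn[-1] + load)
    i_left = sum(r * d for r, d in zip(res, drawn)) / sum(res)
    seg = [i_left - d for d in drawn]
    volts, v = [], VDD
    for r, i in zip(res[:-1], seg[:-1]):
        v -= r * i
        volts.append(v)
    return seg, volts

def feasible(widths, loads):
    """True when no gating configuration violates the EM or IR limit."""
    for mask in product([0, 1], repeat=len(loads)):
        seg, volts = solve_rail(widths, [m * i for m, i in zip(mask, loads)])
        if any(abs(i) / (w * HEIGHT) > J_MAX * (1 + 1e-9) for i, w in zip(seg, widths)):
            return False
        if any(v < V_MIN for v in volts):
            return False
    return True

def optimize(widths, loads, psis=(1.1, 1.2, 1.5, 2.0), shrink=0.95, min_width=0.1):
    # Uniform upsizing: take the smallest fine-scale step that clears all violations.
    for psi in psis:
        trial = [w * psi for w in widths]
        if feasible(trial, loads):
            widths = trial
            break
    else:
        raise ValueError("no step in psis removes every violation")
    # Shrinking (greedy stand-in for the LP): narrow one track at a time
    # for as long as the grid stays feasible.
    changed = True
    while changed:
        changed = False
        for k in range(len(widths)):
            trial = list(widths)
            trial[k] = max(min_width, trial[k] * shrink)
            if trial[k] < widths[k] and feasible(trial, loads):
                widths, changed = trial, True
    return widths

start = [1.0, 1.0, 1.0]
final = optimize(start, [3.0, 3.0])
print(final, sum(final))
```

In this toy case the outer segments keep their upsized width while the lightly loaded middle segment shrinks, so the total width ends below that of uniform upsizing alone.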

3. Grid Unfolding

Because we considered only worst-case violations on the grid, minor violations are possible after optimization and unfolding.

These violations are minuscule and can be fixed by greedily upsizing the track with the violation.

Experiments - Floorplans

[Figure: floorplans of blocks labeled by current density (H, M, L), each shown with its power gating configurations. One floorplan has high density blocks located in the center of the grid; the other has low/medium density blocks located in the center of the grid. Legend: low/medium current density blocks, high current density blocks, gated blocks.]

Results

Experiments to observe:
• Various current density blocks (high, medium, low)
• Various power grid granularities: 20x20, 30x30, 50x50, 100x100
• All vs. some power gating configurations

Area savings compared to uniform upsizing:
• Up to 48% area savings (100x100 granularity grid with high density blocks placed at the center of the grid)

Decoupling Capacitor vs. PSN

• Inserted decoupling capacitors (decaps) can provide charge to the switching circuit to reduce power supply noise (PSN).

• Decaps consume power due to switching

• PSN suppression depends on decap efficiency

[Figure: global grid with Vdd connections.]

Research Topics of Interest - 2

How to Use Decoupling Capacitors Most Efficiently?

A decoupling capacitor is a reservoir of charge, used to reduce the voltage drop at the switching current load.

The amount of charge supplied depends on:
• Parasitic conductance between decap and current load
• Parasitic conductance between decap and power supply
• Switching frequency of the current load

[Figure: capacitor delivering charge through the interconnect to the current load.]

Decoupling Capacitance Effectiveness

• Decoupling capacitors suppress power supply noise
• Decaps reduce the impedance of the power delivery system operating at high frequencies

Efficacy of decoupling capacitors depends upon:
• Impedance of the conductors connecting the capacitor to current loads and power sources
• Charge-back ability after a transition is completed

[Figure: two circuit models of the power delivery path from the Vdd source through the package (Rpkg, Lpkg, Cpkg) and the grid to the switching circuit (Iswitching_circuit, Ccircuit), with Cdecap attached. One model splits the grid into Rgrid1, Lgrid1 and Rgrid2, Lgrid2 around nodes 1 and 2; the other uses Rgrid, Lgrid and Rgrid2, Lgrid2.]

Decap Effectiveness in Mesh Grids

[Figure: 4x4 mesh with nodes 1 to 16. (a) Original mesh. (b) Mesh A circuit. (c) Mesh B circuit. (d) Mesh C circuit.]

Decap Effectiveness on Mesh Grids

Detrimental decoupling capacitance.

Decap Effectiveness in Mesh Grids

[Figure: 4x4 mesh with nodes 1 to 16, panel (c), mesh B circuit.]

Ineffective decoupling capacitance.

Decap Effectiveness in Mesh Grids

[Figure: 4x4 mesh with nodes 1 to 16, panel (d), mesh C circuit.]

Effective decoupling capacitance.

Mesh Analysis

Decap effectiveness depends upon:
• Zd: this impedance affects how fast Cdecap will be recharged
• Zs: this impedance affects how much voltage drop will be at the switching circuit
• Zsd: this impedance affects how much current (charge) Cdecap can provide to the switching circuit
• tr, tf, Ipeak: switching frequency and current magnitude
• Cdecap: decap size

[Figure: circuit model with Vdd, Zs, Zd, Zsd, Cdecap, and Iswitching_circuit.]

Decap’s effectiveness metrics

[Figure: region of effectiveness for decap insertion, with distances a, b, and u.]

• a: effective distance between decap and Vdd pin
• b: effective distance between current source and decap
• u: minimum distance between decap and Vdd pin to avoid spurious switching

Decap Effectiveness Model

[Figure: circuit model with Vdd, Zs, Zd, Zsd, Cdecap, and Iswitching_circuit.]

[Figure: circuit current profile (current vs. t, peak Ipeak), split into the amount of charge provided from the Vdd supply and non-switching circuit decap, and the amount of charge that should be provided from inserted decaps.]

$Q_{switching\_circuit} = Q_{nsw} + \sum_{i=1}^{nrofVddsupplies} Q_{supply_i} + \sum_{i=1}^{nrofdecaps} Q_{decap_i}$

[Figure: region of effectiveness for decap insertion, with distances a, b, and u.]
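A short worked example may help relate the charge amounts named on this slide. The sketch assumes a triangular current pulse, so its charge is the area under the profile, and sizes the inserted decap with C = Q / ΔV for an allowed droop ΔV. The function names and all numbers are hypothetical.

```python
def triangular_charge(i_peak, t_rise, t_fall):
    """Charge (C) of a triangular current pulse: area under the profile."""
    return 0.5 * i_peak * (t_rise + t_fall)

def decap_budget(q_switching, q_supplies, q_nsw, dv_allowed):
    """Charge the inserted decaps must cover, and the capacitance that holds it."""
    q_decaps = max(0.0, q_switching - sum(q_supplies) - q_nsw)
    return q_decaps, q_decaps / dv_allowed   # C = Q / dV

# Hypothetical switching event: 100 mA peak, 50 ps rise, 50 ps fall.
q_sw = triangular_charge(i_peak=0.1, t_rise=50e-12, t_fall=50e-12)
# Hypothetical charge delivered by two Vdd supplies and by non-switching circuits.
q_dec, c_dec = decap_budget(q_sw, q_supplies=[1.5e-12, 1.0e-12],
                            q_nsw=0.5e-12, dv_allowed=0.1)

print(f"{q_sw * 1e12:.2f} pC switched, {q_dec * 1e12:.2f} pC from decaps, "
      f"{c_dec * 1e12:.1f} pF needed")
```

The remainder after subtracting the supply and non-switching contributions is what the inserted decaps must provide.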

Decap Budget: Optimization Function

LP optimization problem

Subject to :

1) Voltage drop margin

2) Charge transfer balance

3) Allowed cap constraint

4) Efficiency metrics constraints

Sequence of Linear Programs

Cdecapi depends on the node voltage Vi; both Cdecapi and Vi are variables.

Sequence of linear programs:

1. Perform an initial transient analysis with the existing decaps and solve for the Vi's.

2. Determine the decap budgets Cdecapi from the LP formulation, using the node voltages determined in step 1.

3. Re-run the transient analysis with Cdecapi to check the node voltages. Update the node voltages Vi.

4. Check whether Vi > Vthresh:

   a. If Vi > Vthresh+σ, run the decap budget to reduce decaps (step 2).

   b. If Vi < Vthresh-σ, run the decap budget to allocate more decaps (step 2).
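The control flow of these four steps can be sketched as a loop. Everything below is a toy stand-in and an assumption: `simulate` replaces the transient analysis with a two-node charge-sharing formula, `budget` replaces the LP with a simple rescaling rule, and the constants are invented for illustration.

```python
VDD, V_THRESH, SIGMA = 1.0, 0.9, 0.002   # volts (hypothetical values)
CHARGE = [5.0, 3.0]                      # pC drawn at each of two nodes (hypothetical)
C_BASE = [10.0, 10.0]                    # pF of existing decap per node (hypothetical)
COUPLING = 0.2                           # share of the other node's decap seen by a node

def simulate(decaps):
    """Toy stand-in for transient analysis: node voltages for given decaps (pF)."""
    volts = []
    for i, q in enumerate(CHARGE):
        other = decaps[1 - i]
        volts.append(VDD - q / (C_BASE[i] + decaps[i] + COUPLING * other))
    return volts

def budget(decaps, volts):
    """Toy stand-in for the LP: rescale local capacitance so the drop meets the margin."""
    margin = VDD - V_THRESH
    return [max(0.0, (VDD - v) * (C_BASE[i] + d) / margin - C_BASE[i])
            for i, (d, v) in enumerate(zip(decaps, volts))]

def allocate(max_iter=20):
    decaps = [0.0, 0.0]
    volts = simulate(decaps)                                  # step 1
    for iteration in range(max_iter):
        if all(abs(v - V_THRESH) <= SIGMA for v in volts):    # step 4
            return decaps, volts, iteration
        decaps = budget(decaps, volts)                        # step 2
        volts = simulate(decaps)                              # step 3
    raise RuntimeError("did not settle inside the sigma band")

decaps, volts, iterations = allocate()
print(decaps, volts, iterations)
```

The σ band acts as hysteresis: the budget is re-run both when a node is too noisy and when it has more decap than it needs, and the loop stops once every node sits inside the band.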

Case Study

Courtesy of STMicroelectronics

Experiments

Total decap reduction:
• Total amount of decap reduced on chip: 297pF
• Percentage: 5.56%
• Number of filler cells reduction (placed decaps): 297pF out of 623pF => 52%

Correlations

Case Study            Max IR Drop (mV)   Power (W)
Apache’s Redhawk      51.8               0.645
Our method (before)   43.1               0.660
Our method (after)    43.7               0.660

Multi-Core System

• Several cores integrated on a chip
• Chips with several cores have been produced
• Tens to hundreds of cores per chip are envisioned

Physical design problems:
• Thermal management
• Power management
• Power delivery
• Noise control
• …

Research Topics of Interest - 3

How to Suppress Power Supply Noise?

Sources:
• Fast transient currents of switching blocks
• Turn on/off of power gated blocks
• Parasitic impedance of power tracks (package)

Detrimental effects:
• Circuit delay increase
• Logical faults due to increased delay

[Figure: logic gate (In, Out, Cload) powered from Vdd; supply voltage vs. time with Vdd and 90% Vdd levels; drain current Id vs. drain-source voltage Vds.]

Multi-Core Systems

• Shared global grid
• Uniform controlled collapse chip connection (C4) distribution

[Figure: multi-core chip with cores, macros, and C4 bumps supplying Vdd.]

Objective: Assign tasks to cores such that minimum power supply noise is generated.

PSN vs. Workload Assignments

[Table: columns Assignment, Cores Assigned, Workloads, Power Supply Noise (V*ps), for a 3x3 array of cores numbered 1 to 9. Assignment 1 uses cores 1-2-4 and assignment 2 uses cores 1-5-9, each with workloads W3-W3-W3, W2-W2-W2, and W1-W2-W3. PSN values as listed: 2.56, 0.06, 1.98, 0.06, 1.82, 1.83.]

• PSN vs. proximity between working cores

• PSN vs. available decap

• PSN vs. operating frequencies

Grid Models

[Figure: grid models: the core grid (Core1 to Core9 with Vdd C4 bumps), the global grid, and the base grid with Vdd connections.]

Circuit Reduction

[Figure: (a) base grid with nodes 1 to 9 and Vdd connections; (b) simplified model with Vdd, r, R, L, C, Ceff, and load current Iload.]

• Reducing the base grid (a) to a simplified model (b)
• The circuit voltage response is maintained for the worst-case voltage drop
• Assumption: the worst-case voltage drop is on node 5

Power Supply Noise Aware Assignment

We apply a simulated annealing (SA) based algorithm to minimize PSN.

A workload can be assigned to any core.

Task assignments on cores will vary due to:
• Location: same task at different locations
• Frequency: same location but varying workloads
• Location and frequency

[Figure: workloads wh, wm, and wl.]
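A minimal simulated annealing sketch for workload-to-core assignment is given below. The noise model is an assumption made only for illustration: `psn_proxy` charges each pair of workloads by the product of their currents divided by the Manhattan distance between their cores, which is not the grid simulation used in the talk. The cooling schedule, the move set, and all names are likewise hypothetical.

```python
import math
import random
from itertools import combinations

def psn_proxy(assign, currents, n):
    """Hypothetical noise proxy: nearby cores with large currents interact more."""
    total = 0.0
    for a, b in combinations(range(len(assign)), 2):
        ra, ca = divmod(assign[a], n)
        rb, cb = divmod(assign[b], n)
        total += currents[a] * currents[b] / (abs(ra - rb) + abs(ca - cb))
    return total

def anneal(currents, n, steps=5000, t0=1.0, alpha=0.999, seed=0):
    """Assign workload k to core assign[k] on an n x n array (row-major core ids)."""
    rng = random.Random(seed)
    cores = list(range(n * n))
    assign = cores[:len(currents)]          # initial assignment: first cores
    cost = psn_proxy(assign, currents, n)
    best, best_cost = assign[:], cost
    temp = t0
    for _ in range(steps):
        cand = assign[:]
        k = rng.randrange(len(cand))
        new_core = rng.choice(cores)
        if new_core in cand:                # occupied: swap the two workloads
            j = cand.index(new_core)
            cand[j], cand[k] = cand[k], cand[j]
        else:                               # idle core: move the workload there
            cand[k] = new_core
        c = psn_proxy(cand, currents, n)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if c <= cost or rng.random() < math.exp((cost - c) / temp):
            assign, cost = cand, c
            if cost < best_cost:
                best, best_cost = assign[:], cost
        temp *= alpha                       # geometric cooling
    return best, best_cost

currents = [3.0, 2.0, 1.0]                  # hypothetical workload currents
best, best_cost = anneal(currents, n=3)
print(best, best_cost)
```

Swapping two workloads changes which frequency runs at a location, and moving a workload to an idle core changes the location, so the two move types cover the location and frequency variations listed above.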

Assignment Heuristics

Current Demand-Based Assignment (CDA): workloads are assigned to cores that are farther away from large current workloads, to minimize noise propagation.

[Figure: workloads W1 and W2 placed relative to a large current workload.]
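One way to read the CDA heuristic is as a greedy placement, sketched below. This is an interpretation, not the author's exact rule: workloads are placed in order of decreasing current, and each one goes to the idle core with the lowest "exposure", defined here (hypothetically) as the sum of already placed currents divided by their Manhattan distance to that core.

```python
def cda_assign(currents, n):
    """Greedy placement on an n x n core array; returns {workload index: core index}."""
    order = sorted(range(len(currents)), key=lambda w: -currents[w])
    placed = {}
    for w in order:
        free = [c for c in range(n * n) if c not in placed.values()]

        def exposure(core):
            # Large currents close to this core raise its exposure.
            r, c = divmod(core, n)
            return sum(currents[p] / (abs(r - q // n) + abs(c - q % n))
                       for p, q in placed.items())

        # Ties go to the lowest core index (the first workload lands on core 0).
        placed[w] = min(free, key=exposure)
    return placed

assignment = cda_assign([1.0, 3.0, 2.0], n=3)
print(assignment)
```

On a 3x3 array this places the two largest workloads on opposite corners and the smallest one on a core equidistant from both.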

Experiments

Experiments to observe:
• Various core granularities: 3x3, 5x5, 7x7, 10x10
• Various operating frequencies
• Various core sizes
• Impact of initial task assignment on the multicore system

Results:
• No initial assignment: up to 30% less PSN compared to the CDA method
• With initial assignment: up to 37% less PSN compared to the CDA method

Conclusions

On-chip power distribution for low-power applications:

• Power gating induced electromigration issues in the power networks: analysis and optimization of the power network
• Analysis of decoupling capacitance efficiency in power grids: decoupling capacitance placement in power networks
• Low power supply noise task assignment for multicore systems: analysis of the multicore system power network and task assignment optimization for low power noise