

4/8/2012


Energy and Delay Models

EE216B: VLSI Signal Processing

Prof. Dejan Marković

Lecture Overview

Goal: tie parameters of the underlying implementation technology to algorithm-level specifications

Strategy

– Technology characterization (energy, delay)

– Circuit-level tuning (gate size, supply voltage)

– Tradeoff analysis (E-D space, logic depth, activity)

Remember

– We will go all the way down to these low-level results to match algorithm specs with technology characteristics


Power and Energy Figures of Merit

Power consumption in Watts

– Determines battery life in hours

Peak power

– Determines power and ground wiring design

– Sets packaging limits

– Impacts signal noise margin and reliability analysis

Energy efficiency in Joules

– Energy is power integrated over time (power is the rate at which energy is consumed)

Energy = Power * Delay

– Joules = Watts * seconds

– Lower energy number means less power to perform a computation at the same frequency


Power versus Energy

[Figure: two plots of power (Watts) versus time, each comparing Approach 1 and Approach 2]

Power is the height of the waveform

– Lower power design could simply be slower

Energy is the area under the waveform

– Two approaches require the same energy
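The height-versus-area distinction can be checked numerically. The sketch below is illustrative only: the two sampled waveforms, the time step, and the function name are assumptions, not data from the slide.

```python
def energy_from_waveform(power_samples_w, dt_s):
    """Approximate energy (J) as the area under a sampled power waveform."""
    return sum(power_samples_w) * dt_s

DT = 0.1  # assumed sampling step in seconds

# Assumed waveforms: Approach 1 burns 2 W for 0.5 s and then idles,
# Approach 2 burns 1 W for the full 1 s.
approach_1 = [2.0] * 5 + [0.0] * 5
approach_2 = [1.0] * 10

energy_1 = energy_from_waveform(approach_1, DT)
energy_2 = energy_from_waveform(approach_2, DT)

print("peak power:", max(approach_1), "W vs", max(approach_2), "W")
print("energy:", energy_1, "J vs", energy_2, "J")
```

Approach 2 has half the peak power but takes twice as long, so both waveforms enclose the same area.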


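Review: Energy and Power Equations

$$E = \alpha_{0 \to 1} \cdot C_L \cdot V_{DD}^2 + \alpha_{0 \to 1} \cdot t_{sc} \cdot V_{DD} \cdot I_{peak} + V_{DD} \cdot I_{leakage} / f_{clock}$$

$$P = f_{0 \to 1} \cdot C_L \cdot V_{DD}^2 + f_{0 \to 1} \cdot t_{sc} \cdot V_{DD} \cdot I_{peak} + V_{DD} \cdot I_{leakage}$$

– First term: dynamic (~75% today, decreasing)

– Second term: short-circuit (~5% today, decreasing)

– Third term: leakage (~20% today, slowly increasing)

$$f_{0 \to 1} = \alpha_{0 \to 1} \cdot f_{clk}$$

Energy = Power / fclk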
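A minimal sketch of the two equations, to show that the energy per cycle is the power divided by the clock frequency term by term. All numeric values in the usage lines are hypothetical and chosen only for illustration; the function names are not from the lecture.

```python
def power_components(alpha, f_clk, c_load, v_dd, t_sc, i_peak, i_leak):
    """Return (dynamic, short-circuit, leakage) power in watts."""
    f01 = alpha * f_clk                    # rate of 0->1 transitions
    p_dynamic = f01 * c_load * v_dd ** 2
    p_short_circuit = f01 * t_sc * v_dd * i_peak
    p_leakage = v_dd * i_leak
    return p_dynamic, p_short_circuit, p_leakage

def energy_components(alpha, f_clk, c_load, v_dd, t_sc, i_peak, i_leak):
    """Return the same three components as energy per clock cycle (J)."""
    powers = power_components(alpha, f_clk, c_load, v_dd, t_sc, i_peak, i_leak)
    return tuple(p / f_clk for p in powers)

# Hypothetical operating point (not taken from the slides).
params = dict(alpha=0.1, f_clk=1e9, c_load=10e-15, v_dd=1.0,
              t_sc=10e-12, i_peak=50e-6, i_leak=0.3e-6)

powers = power_components(**params)
energies = energy_components(**params)
total_power = sum(powers)
for name, p in zip(("dynamic", "short-circuit", "leakage"), powers):
    print(f"{name}: {p:.3e} W ({100 * p / total_power:.1f}% of total)")
```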

Dominant Energy Components

Dramatic increase in Leakage Energy

[Figure: normalized energy (0 to 5) versus technology generation (0.25 µm, 0.18 µm, 0.13 µm, 90 nm, 65 nm), split into leakage and switching components]

Switching: charges the load capacitance

Leakage: parasitic component


Switching Energy

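On every 0→1 transition at the output, an amount of energy is drawn from the supply (energy source)

[Figure: CMOS gate with input Vin and output Vout, powered from VDD and loaded by CL]

$$E_{0 \to 1} = \int_{V_{OL}}^{V_{OH}} C_L(V_{out}) \cdot V_{DD} \cdot dV_{out}$$

$$E_{0 \to 1} = C_L \cdot V_{DD}^2$$

(for a constant $C_L$ and a full output swing, $V_{OL} = 0$ and $V_{OH} = V_{DD}$)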

Energy Balance

One half of the energy from supply is consumed in the pull-up network and one half is stored on CL

Charge from CL is discharged to Gnd during the 1→0 transition

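[Figure: CMOS gate with inputs A1 … AN, a PMOS pull-up network and an NMOS pull-down network, driving CL at Vout from VDD]

0→1 transition (energy from supply)

– $E_{0 \to 1} = C_L \cdot V_{DD}^2$

– $E_R = 0.5 \cdot E_{0 \to 1}$ (heat in the PMOS network)

– $E_C = 0.5 \cdot E_{0 \to 1}$ (stored on $C_L$)

1→0 transition

– $E_{1 \to 0} = 0.5 \cdot C_L \cdot V_{DD}^2$

– $E_R = E_{1 \to 0}$ (heat in the NMOS network)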
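The half-and-half split can be reproduced with a short time-stepping sketch. It assumes the pull-up network behaves like a fixed resistor `r_on` (a simplification that is not in the slides) and uses hypothetical component values; the point is that the split does not depend on the resistance.

```python
def charge_capacitor(c_load, v_dd, r_on, steps_per_tau=1000, n_tau=20):
    """Forward-Euler simulation of charging c_load from 0 V through r_on.

    Returns (energy drawn from supply, energy dissipated in r_on,
    energy stored on c_load), all in joules.
    """
    tau = r_on * c_load
    dt = tau / steps_per_tau
    v_out = 0.0
    e_supply = 0.0
    e_resistor = 0.0
    for _ in range(steps_per_tau * n_tau):
        current = (v_dd - v_out) / r_on
        e_supply += v_dd * current * dt          # energy leaving the supply
        e_resistor += current ** 2 * r_on * dt   # heat in the pull-up
        v_out += current * dt / c_load           # charge added to c_load
    e_capacitor = 0.5 * c_load * v_out ** 2
    return e_supply, e_resistor, e_capacitor

# Hypothetical values: 10 fF load, 1 V supply, two different resistances.
C_LOAD, V_DD = 10e-15, 1.0
for r_on in (1e3, 10e3):
    e_sup, e_res, e_cap = charge_capacitor(C_LOAD, V_DD, r_on)
    print(r_on, e_sup / (C_LOAD * V_DD ** 2), e_res / e_sup, e_cap / e_sup)
```

The printed ratios are approximate because of the finite time step.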


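Node Transition Activity and Energy

Consider switching a CMOS gate for N clock cycles

– $E_N$: the energy consumed for N clock cycles

– $n(N)$: the number of 0→1 transitions in N clock cycles

$$E_N = C_L \cdot V_{DD}^2 \cdot n(N)$$

$$E_{avg} = \lim_{N \to \infty} \frac{E_N}{N} = \left( \lim_{N \to \infty} \frac{n(N)}{N} \right) \cdot C_L \cdot V_{DD}^2$$

$$\alpha_{0 \to 1} = \lim_{N \to \infty} \frac{n(N)}{N}$$

$$E_{avg} = \alpha_{0 \to 1} \cdot C_L \cdot V_{DD}^2$$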
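For a finite trace, the activity factor can be estimated by counting 0→1 transitions. The sketch below assumes one node value per clock cycle, so a trace of length L covers N = L - 1 cycles; the trace and the electrical values are made up for illustration.

```python
def count_rising_transitions(bits):
    """Number of 0->1 transitions, n(N), in a sequence of node values."""
    return sum(1 for prev, cur in zip(bits, bits[1:]) if prev == 0 and cur == 1)

def activity_factor(bits):
    """Estimate alpha_0->1 as n(N) / N over the N = len(bits) - 1 cycles."""
    n_cycles = len(bits) - 1
    return count_rising_transitions(bits) / n_cycles

def average_switching_energy(bits, c_load, v_dd):
    """E_avg = alpha_0->1 * C_L * V_DD^2, in joules per cycle."""
    return activity_factor(bits) * c_load * v_dd ** 2

trace = [0, 1, 1, 0, 1, 0, 0, 1]          # assumed node values, one per cycle
print(count_rising_transitions(trace))     # 0->1 events in the trace
print(activity_factor(trace))
print(average_switching_energy(trace, c_load=10e-15, v_dd=1.0))
```

For independent, equiprobable random bits the estimate approaches 0.25 as the trace grows, since a 0 must be followed by a 1.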

Lowering Switching Energy

$$E_{sw} = \alpha_{0 \to 1} \cdot C_L \cdot V_{DD}^2$$

Capacitance: Function of fan-out, wire length, transistor sizes

Supply Voltage: Has been dropping* with CMOS scaling

Activity factor: How often, on average, do nodes switch?


Switched Capacitance

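[Figure: gate i (supply VDD,i) driving gate i+1 (supply VDD,i+1) through a wire, with capacitances Cparasitic,i, Cwire, and Cgate,i+1]

$$C_L = C_{sw} = C_{par} + C_{out}$$

For large fanouts, we may neglect the parasitic component

$$C_{sw} \approx C_{out} = C_{wire} + C_{gate,i+1}$$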

MOS Capacitances

Gate-Channel Capacitance

– CGC = Cox·W·Leff (Off, Linear)

– CGC = (2/3)·Cox·W·Leff (Saturation)

Gate Overlap Capacitance

– CGSO = CGDO = CO·W (Always)

Junction/Diffusion Capacitance

– Cdiff = Cj·LS·W + Cjsw·(2LS + W) (Always)

Circuit design: Cgate and Cparasitic

Simple linear models (C ∝ W)

– Designers typically use C / unit width (fF/µm)

– 90 nm gpdk: 2.5 fF/µm

γ = Cpar / Cgate (typically γ < 1)

– 90 nm gpdk: γ = 0.61
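The linear capacitance model can be turned into a small load-capacitance calculator. This is a sketch under two assumptions: that the 90 nm per-width figure is a gate capacitance in fF/µm, and that the load is the driver's own parasitic plus wire plus fanout gate capacitance. The widths and wire capacitance in the usage lines are hypothetical.

```python
GATE_CAP_PER_UM = 2.5e-15   # F per um of gate width (assumed meaning of 2.5 fF/um)
GAMMA = 0.61                # C_par / C_gate, 90 nm gpdk value from the slide

def gate_capacitance(width_um):
    """C_gate of a gate with the given total input width."""
    return GATE_CAP_PER_UM * width_um

def parasitic_capacitance(width_um):
    """C_par = gamma * C_gate for a gate of the given width."""
    return GAMMA * gate_capacitance(width_um)

def load_capacitance(driver_width_um, fanout_width_um, c_wire):
    """C_L = C_par(driver) + C_wire + C_gate(fanout)."""
    return (parasitic_capacitance(driver_width_um)
            + c_wire
            + gate_capacitance(fanout_width_um))

# Hypothetical stage: 1 um driver, 4 um of fanout gate width, 2 fF of wire.
c_load = load_capacitance(1.0, 4.0, 2e-15)
c_out_only = 2e-15 + gate_capacitance(4.0)
print(f"C_L = {c_load * 1e15:.3f} fF")
print(f"parasitic share = {100 * (1 - c_out_only / c_load):.1f}%")
```

The parasitic share shrinks as the fanout width grows, which is the reason it may be neglected for large fanouts.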


Leakage Energy

When the gate is idle (holding its state), an amount of energy is still drawn from the supply (energy source)

[Figure: CMOS gate (VDD, Vin, Vout, CL) shown for input states Sin = 1 and Sin = 0]

The sub-threshold leakage current is the dominant component

$$E_{Leak} = I_{Leak}(S_{in}) \cdot V_{DD} / f_{clock}$$

Sub-Threshold ID vs. VGS

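Physical model

$$I_{DS} = I_0 \cdot e^{V_{GS}/S} \cdot \left(1 - e^{-V_{DS}/(k \cdot T/q)}\right)$$

$$I_0 = \mu \cdot \frac{W}{L} \cdot \left(\frac{k \cdot T}{q}\right)^2 \cdot e^{-V_T/(n \cdot k \cdot T/q)}$$

Empirical model

$$I_{DS} = I_0 \cdot \frac{W}{W_0} \cdot 10^{(V_{GS} - V_T + \gamma \cdot V_{DS})/S}$$

– The $\gamma \cdot V_{DS}$ term models DIBL

$$S = n \cdot \frac{kT}{q} \cdot \ln(10) \quad \text{[mV/dec]}$$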


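Sub-Threshold ID vs. VGS

[Figure: ID (A, log scale, 10^-12 to 10^-4) versus VGS (0 to 1 V) for VDS from 0 to 0.5 V. The sub-threshold slope is 90 mV/dec (10x current change per 90 mV), and the current increases exponentially for lower VT]

$$I_{DS} = I_0 \cdot \frac{W}{W_0} \cdot 10^{(V_{GS} - V_T + \gamma \cdot V_{DS})/S}$$

$$S = n \cdot \frac{kT}{q} \cdot \ln(10)$$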
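The empirical model is easy to evaluate directly. The sketch below computes the slope S from n and temperature, and the leakage change when VT is lowered by one S. The default values for `i0`, the width ratio, and the DIBL coefficient are placeholders, not fitted parameters.

```python
import math

K_BOLTZMANN = 1.380649e-23      # J/K
Q_ELECTRON = 1.602176634e-19    # C

def thermal_voltage(temperature_k=300.0):
    """kT/q in volts."""
    return K_BOLTZMANN * temperature_k / Q_ELECTRON

def subthreshold_slope(n, temperature_k=300.0):
    """S = n * (kT/q) * ln(10), in volts per decade of current."""
    return n * thermal_voltage(temperature_k) * math.log(10)

def subthreshold_current(v_gs, v_ds, v_t, slope, gamma_dibl=0.0,
                         i0=1.0, w_over_w0=1.0):
    """Empirical model: I0 * (W/W0) * 10^((V_GS - V_T + gamma*V_DS) / S)."""
    return i0 * w_over_w0 * 10 ** ((v_gs - v_t + gamma_dibl * v_ds) / slope)

# Slope factor n that would give the 90 mV/dec seen in the plot at 300 K.
n_for_90mv = 0.090 / subthreshold_slope(1.0)
print(f"S(n=1) = {1e3 * subthreshold_slope(1.0):.1f} mV/dec, n = {n_for_90mv:.2f}")

# Off-current (V_GS = 0) ratio when V_T drops by exactly one S.
slope = 0.090
ratio = (subthreshold_current(0.0, 0.0, 0.30, slope)
         / subthreshold_current(0.0, 0.0, 0.39, slope))
print(f"leakage ratio for a 90 mV lower V_T: {ratio:.2f}x")
```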

Balancing Switching and Leakage Energy

Switching energy drops quadratically with VDD

Leakage energy reaches a minimum, then increases as VDD is reduced further

– This is because fclock drops exponentially at low VDD

$$E_{sw} = \alpha_{0 \to 1} \cdot C_L \cdot V_{DD}^2$$

$$E_{lk} = I_{lk}(S_{in}) \cdot V_{DD} / f_{clock}$$

[Figure: Energy-VDD. Normalized energy (log scale, 0.001 to 1) versus VDD (0 to 1.2 V) for the switching and leakage components]


Total Energy has a Minimum

Total energy is limited by sub-threshold conduction

– Current doesn’t decrease, but delay increases rapidly

[Figure: Energy-VDD. Normalized energy (log scale, 0.001 to 1) versus VDD (0 to 1.2 V) for total, switching, and leakage energy. The total-energy minimum is marked at 0.3 V, a 12x reduction]

Interesting result: only an order of magnitude in energy reduction is possible by VDD scaling!

Simulation parameters: 65 nm CMOS, activity = 0.1, logic depth = 10
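The existence of a minimum-energy point can be illustrated with a toy sweep. This is not the simulation behind the plot: the current model is a generic smooth interpolation between exponential and quadratic behavior, and the threshold voltage, slope factor, and sweep range are assumptions. Only the activity (0.1) and logic depth (10) come from the slide.

```python
import math

PHI_T = 0.02585     # thermal voltage kT/q near 300 K, in volts
N_SLOPE = 1.5       # assumed sub-threshold slope factor n
V_T = 0.3           # assumed threshold voltage, in volts
ALPHA = 0.1         # activity factor, as in the slide's simulation setup
LOGIC_DEPTH = 10    # gates per clock period, as in the slide's setup

def drive_current(v_gs):
    """Normalized drain current: exponential below V_T, quadratic above."""
    x = (v_gs - V_T) / (2 * N_SLOPE * PHI_T)
    return math.log1p(math.exp(x)) ** 2

def energy_per_cycle(v_dd):
    """Return (switching, leakage, total) energy, normalized to C_L = 1."""
    e_switching = ALPHA * v_dd ** 2
    gate_delay = v_dd / drive_current(v_dd)     # t_d ~ C_L * V_DD / I_on
    t_clock = LOGIC_DEPTH * gate_delay          # 1 / f_clock
    e_leakage = drive_current(0.0) * v_dd * t_clock   # I_off * V_DD / f_clock
    return e_switching, e_leakage, e_switching + e_leakage

def min_energy_point(v_lo=0.10, v_hi=1.00, steps=181):
    """Grid search for the V_DD with the lowest total energy."""
    grid = [v_lo + i * (v_hi - v_lo) / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda v: energy_per_cycle(v)[2])

v_min = min_energy_point()
e_nominal = energy_per_cycle(1.0)[2]
e_minimum = energy_per_cycle(v_min)[2]
print(f"minimum-energy V_DD = {v_min:.3f} V")
print(f"energy reduction vs. 1.0 V = {e_nominal / e_minimum:.1f}x")
```

Below the minimum, the leakage term grows because the clock period stretches exponentially while the off-current stays roughly constant.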

Alpha-Power Model of the Drain Current

Basis for delay calculation, also useful for hand analysis [1]

Empirical model

$$I_{DS} = \frac{1}{2} \cdot \mu \cdot C_{ox} \cdot \frac{W}{L} \cdot (V_{GS} - V_{TH})^{\alpha}$$

– Curve fitting (MMSE)

– α is between 1 and 2

– In 90 nm, it is ~1.4 (it depends on VTH)

  ● Can fit to α = 1, but with what VTH?

[1] T. Sakurai and R. Newton, “Alpha-Power Law MOSFET Model and its Applications to CMOS Inverter Delay and Other Formulas,” IEEE J. Solid-State Circuits, vol. 25, no. 2, pp. 584-594, Apr. 1990.

[Figure: normalized ID (0 to 6) versus VDS / VDD (0 to 1) for several VGS values, comparing simulation with the model]


Alpha-Power-Based Delay Model

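Fitting parameters [2]: Von, αd, Kd

$$Delay = \frac{K_d \cdot V_{DD}}{(V_{DD} - V_{on} - \Delta V_{TH})^{\alpha_d}} \cdot \left( \frac{W_{out}}{W_{in}} + \frac{W_{par}}{W_{in}} \right)$$

[Figure: Delay / Delayref (0 to 3.5) versus VDD / VDDref (0.5 to 1) for an inverter and a NAND2, with the model overlaid]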

[2] V. Stojanović et al., “Energy-Delay Tradeoffs in Combinational Logic using Gate Sizing and Supply Voltage Optimization,” in Proc. Eur. Solid-State Circuits Conf., Sept. 2002, pp. 211-214.


Alpha-Power-Based Delay Model

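Fitting parameters: Von, αd, Kd

Effective fanout, heff

$$h_{eff} = g \cdot (V_{DD}, \Delta V_{TH}) \cdot h$$

$$Delay = \frac{K_d \cdot V_{DD}}{(V_{DD} - V_{on} - \Delta V_{TH})^{\alpha_d}} \cdot \left( \frac{W_{out}}{W_{in}} + \frac{W_{par}}{W_{in}} \right)$$

[Figure: Delay / Delayref (0 to 3.5) versus VDD / VDDref (0.5 to 1) for an inverter and a NAND2, with the model overlaid]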
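A minimal sketch of the delay model as a function, useful for producing a delay-versus-supply curve like the one in the figure. The default values of `k_d`, `v_on`, and `alpha_d` are placeholders rather than fitted parameters from [2], and the width ratios in the usage lines are hypothetical.

```python
def gate_delay(v_dd, w_out_over_w_in, w_par_over_w_in,
               k_d=1.0, v_on=0.3, alpha_d=1.4, delta_v_th=0.0):
    """Alpha-power-based delay with fitting parameters K_d, V_on, alpha_d.

    Delay = K_d * V_DD / (V_DD - V_on - dV_TH)^alpha_d
            * (W_out / W_in + W_par / W_in)
    """
    overdrive = v_dd - v_on - delta_v_th
    if overdrive <= 0:
        raise ValueError("model is only valid for V_DD > V_on + delta_V_TH")
    electrical_effort = w_out_over_w_in + w_par_over_w_in
    return k_d * v_dd / overdrive ** alpha_d * electrical_effort

# Normalized delay versus normalized supply for a fanout-of-4 stage.
V_DD_REF = 1.0
delay_ref = gate_delay(V_DD_REF, 4.0, 0.61)
for v_dd in (1.0, 0.9, 0.8, 0.7, 0.6, 0.5):
    ratio = gate_delay(v_dd, 4.0, 0.61) / delay_ref
    print(f"V_DD/V_DDref = {v_dd:.1f}: Delay/Delayref = {ratio:.2f}")
```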


Gate Delay as a Function of VDD

Delay increases exponentially in sub-threshold

[Figure: Delay-VDD. Normalized delay (log scale, 1 to 100,000) versus VDD (0 to 1.2 V)]

Energy-Delay Tradeoff

Assumptions: 65 nm technology, datapath activity = 0.1, logic depth = 10

Energy ↓ 10% 25% 2x 3x 5x 10x

Delay ↑ 7% 27% 2x 4x 10x 130x

[Figure: Energy-delay. Normalized energy (log scale, 0.001 to 10) versus normalized delay (log scale, 1 to 10^5) for total, switching, and leakage energy. The marked range spans 12x in energy and 1000x in delay]

Hardly a tradeoff: a 1000x delay increase for a 12x energy reduction

Which operating point to choose?


PDP and EDP

Power-delay product (PDP) = $P_{avg} \cdot t_p = (C_L \cdot V_{DD}^2)/2$

– PDP is the average energy consumed per switching event (Watts * sec = Joule)

– Lower power design could simply be a slower design

Energy-delay product (EDP)

– EDP = $PDP \cdot t_p = P_{avg} \cdot t_p^2$

– EDP = average energy * the computation time required

– One can trade increased delay for lower E/op (e.g. via VDD scaling)

[Figure: Energy (PDP), Delay, and Energy*Delay (EDP), each normalized (0 to 1.5), versus normalized VDD (up to 1)]

Choosing Optimal VDD

Optimal VDD depends on the optimization goal

– VDD increases as we put more emphasis on delay

VDD|minE < VDD|minEDP < … < VDD|minD

[Figure: Energy (PDP), Delay, and Energy*Delay (EDP), each normalized (0 to 1.5), versus normalized VDD (up to 1)]
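The ordering of the optimal supply voltages can be reproduced with a toy sweep over metrics of the form E · D^k. The sketch assumes switching energy only (E ∝ VDD²), an alpha-power delay with α = 1.4, a hypothetical threshold of 0.3 V, and an arbitrary sweep range, so the absolute voltages are illustrative.

```python
ALPHA_D = 1.4   # velocity-saturation index (~1.4 quoted for 90 nm)
V_TH = 0.3      # assumed threshold voltage, in volts

def energy(v_dd):
    """Normalized switching energy, E ~ V_DD^2 (leakage ignored here)."""
    return v_dd ** 2

def delay(v_dd):
    """Normalized alpha-power delay, D ~ V_DD / (V_DD - V_TH)^alpha."""
    return v_dd / (v_dd - V_TH) ** ALPHA_D

def optimal_vdd(delay_exponent, v_lo=0.35, v_hi=1.20, steps=341):
    """Grid search for the V_DD that minimizes E * D^delay_exponent."""
    grid = [v_lo + i * (v_hi - v_lo) / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda v: energy(v) * delay(v) ** delay_exponent)

# k = 0 is minimum energy, k = 1 is EDP, larger k puts more weight on delay.
optima = {k: optimal_vdd(k) for k in (0, 1, 2, 3)}
for k, v in optima.items():
    print(f"metric E*D^{k}: optimal V_DD = {v:.4f} V")
```

In this model the minimum-energy and delay-heavy optima sit at the edges of the sweep range, while the EDP optimum is interior, at VDD = 3·VTH / (3 − α).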


Energy-Delay Tradeoff

Unified description of a wide range of E and D targets

– Choose the operating point that best meets E-D constraints

[Figure: energy versus delay curve obtained by VDD scaling, spanning (Dmin, Emax) to (Dmax, Emin). Lines mark the metrics E·D, E·D², E·D³, …, E·Dⁿ (labeled 1, 2, 3, …, n) and E²·D, E³·D, …, Eⁿ·D (labeled 1/2, 1/3, …, 1/n)]

Slope of the line indicates the emphasis on E or D

Energy-Delay Optimization

Equivalent formulations

– Achieve the lowest energy under delay constraint

– Achieve the best performance under energy constraint

[Figure: energy versus delay, between (Dmin, Emax) at fclk max and (Dmax, Emin) at fclk min. Starting from an unoptimized design, curves show the effect of tuning sizing, VDD, sizing & VDD, and sizing & VDD & VTH]


Circuit-Level Optimization

Objective: minimize E, where E = E(VDD, VTH, W)

Constraint: Delay, where D = D(VDD, VTH, W)

Tuning variables: VDD, VTH, W

Constraints

– VDDmin < VDD < VDDmax

– VTHmin < VTH < VTHmax

– Wmin < W

[Figure: circuit optimization flow. Inputs: tuning variables (VDD, VTH, W), circuit topology (A or B), number of bits, and delay. Output: energy-delay curves for topologies A and B]
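The formulation can be sketched as a brute-force search over the three tuning variables. Everything below is a toy: the current model, the single-stage energy and delay expressions, the fixed external load, and the variable bounds are assumptions chosen to keep the example self-contained, and a real flow would use calibrated models and a proper optimizer.

```python
import itertools
import math

PHI_T = 0.02585   # thermal voltage near 300 K, in volts
N_SLOPE = 1.5     # assumed sub-threshold slope factor
ALPHA = 0.1       # assumed activity factor
GAMMA = 0.61      # C_par / C_gate
C_LOAD = 4.0      # assumed fixed external load, in unit gate capacitances

def current(v_gs, v_th):
    """Normalized current per unit width (smooth sub/super-threshold model)."""
    x = (v_gs - v_th) / (2 * N_SLOPE * PHI_T)
    return math.log1p(math.exp(x)) ** 2

def delay(v_dd, v_th, width):
    """D(V_DD, V_TH, W): output capacitance times V_DD over drive current."""
    c_out = GAMMA * width + C_LOAD
    return c_out * v_dd / (width * current(v_dd, v_th))

def energy(v_dd, v_th, width):
    """E(V_DD, V_TH, W): switching plus leakage over one gate delay."""
    c_switched = width + GAMMA * width + C_LOAD
    e_switching = ALPHA * c_switched * v_dd ** 2
    e_leakage = width * current(0.0, v_th) * v_dd * delay(v_dd, v_th, width)
    return e_switching + e_leakage

def frange(lo, hi, count):
    return [lo + i * (hi - lo) / (count - 1) for i in range(count)]

def minimize_energy(d_max):
    """Lowest-energy (E, V_DD, V_TH, W) on the grid with delay <= d_max."""
    best = None
    grid = itertools.product(frange(0.3, 1.2, 19),   # V_DD bounds
                             frange(0.2, 0.5, 7),    # V_TH bounds
                             frange(1.0, 8.0, 15))   # W bounds
    for v_dd, v_th, width in grid:
        if delay(v_dd, v_th, width) > d_max:
            continue
        e = energy(v_dd, v_th, width)
        if best is None or e < best[0]:
            best = (e, v_dd, v_th, width)
    return best

d_fastest = delay(1.2, 0.2, 8.0)
for slack in (2, 10):
    print(slack, minimize_energy(slack * d_fastest))
```

Relaxing the delay constraint enlarges the feasible set, so the minimum energy found can only stay the same or decrease.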

Summary

The goal in algorithm design is to minimize the number of operations required to perform a task

– Once the number of operations is minimized, circuit-level implementation can further reduce energy by lowering supply voltage, switching activity, or gate capacitance

– There exists a well-defined minimum-energy point in CMOS technology due to parasitic leakage currents

– Considering energy alone is insufficient; the energy-performance tradeoff reveals how much energy reduction is possible under a given performance constraint

– Energy and performance models with respect to gate size, supply voltage, and threshold voltage provide the basis for circuit optimization (finding the best energy-delay tradeoff)
