Variability & Low Voltage Circuit Design
ARC Seminar
Prof. David Money Harris
30 December 2010



Motivation

Major challenges of nanometer CMOS design
– Power consumption
– Variability
– Complexity

CMOS VLSI Design 4th Ed.

Power Consumption

Power consumption limits chip performance today
– Chips can do more computation than we can cool
– Portable device battery life & standby time
Steady drive toward lower voltage to save power
– Moreover, nanometer devices can’t withstand high VDD
New applications open at very low voltage and power
– Implantable medical devices
– Energy scavenging sensors

Variability

Nanometer transistors face physical limits
– Discrete number of dopant atoms in channel
– Atomic-scale roughness of polysilicon gate
Variability tends to increase as devices shrink
Larger numbers of devices are more susceptible to improbable events at the tail of the variation
Using worst-case design is no longer tenable

Objectives

At the end of this class, you will be able to…
– Make back-of-the-envelope predictions of energy in CMOS circuits
– Make informed design choices to reduce power subject to design constraints
– Describe the major sources of variation in circuits
– Make statistical estimates of the impact of variation on energy, delay, and yield
– Analyze and improve noise margins in SRAM
– Apply timing error detection registers to reduce the margins caused by variation

Outline

Motivation
Device Models
– Ideal I-V Characteristics
– Gate and Diffusion Capacitance
– High Field Effects
– Threshold Voltage Effects
– Leakage
Energy & Delay
Variation
Low-Voltage Circuit Design with Variability

Introduction

Transistors can be viewed as imperfect switches
An ON transistor passes a finite amount of current
– Depends on terminal voltages
– Derive current-voltage (I-V) relationships
Transistor gate, source, drain all have capacitance
– I = C (ΔV/Δt) -> Δt = (C/I) ΔV
– Capacitance and current determine speed

MOS Capacitor

Gate and body form MOS capacitor
Operating modes
– Accumulation
– Depletion
– Inversion

[Figure: (a) MOS capacitor cross-section with Vg < 0: polysilicon gate, silicon dioxide insulator, p-type body]

Terminal Voltages

Mode of operation depends on Vg, Vd, Vs
– Vgs = Vg – Vs
– Vgd = Vg – Vd
– Vds = Vd – Vs = Vgs – Vgd
Source and drain are symmetric diffusion terminals
– By convention, source is terminal at lower voltage
– Hence Vds ≥ 0
nMOS body is grounded. First assume source is 0 too.
Three regions of operation
– Cutoff
– Linear
– Saturation

[Figure: transistor symbol with terminal voltages Vg, Vs, Vd and the differences Vgs, Vgd, Vds marked]

nMOS Cutoff

No channel
Ids ≈ 0

[Figure: nMOS cross-section in cutoff (Vgs = 0): gate g, n+ source s and drain d, p-type body b]

nMOS Linear

Channel forms
Current flows from d to s
– e- from s to d
Ids increases with Vds
Similar to linear resistor

[Figure: nMOS cross-sections in the linear region: (top) Vgs > Vt, Vgd = Vgs, Vds = 0; (bottom) Vgs > Vgd > Vt, 0 < Vds < Vgs – Vt, with current Ids flowing]

nMOS Saturation

Channel pinches off
Ids independent of Vds
We say current saturates
Similar to current source

[Figure: nMOS cross-section in saturation: Vgs > Vt, Vgd < Vt, Vds > Vgs – Vt, with current Ids flowing]

I-V Characteristics

In Linear region, Ids depends on
– How much charge is in the channel?
– How fast is the charge moving?

Channel Charge

MOS structure looks like parallel plate capacitor while operating in inversion
– Gate – oxide – channel
Qchannel = CV
C = Cg = εoxWL/tox = CoxWL, where Cox = εox / tox
V = Vgc – Vt = (Vgs – Vds/2) – Vt

[Figure: nMOS transistor geometry (W, L, tox) showing polysilicon gate, SiO2 gate oxide (good insulator, εox = 3.9ε0), n+ source and drain, p-type body, and gate-to-channel capacitance Cg]

Carrier velocity

Charge is carried by e-
Electrons are propelled by the lateral electric field between source and drain
– E = Vds/L
Carrier velocity v proportional to lateral E-field
– v = μE (μ called mobility)
Time for carrier to cross channel:
– t = L / v

nMOS Linear I-V

Now we know
– How much charge Qchannel is in the channel
– How much time t each carrier takes to cross

$$
\begin{aligned}
I_{ds} &= \frac{Q_{\text{channel}}}{t} \\
&= \mu C_{ox} \frac{W}{L} \left( V_{gs} - V_t - \frac{V_{ds}}{2} \right) V_{ds} \\
&= \beta \left( V_{gs} - V_t - \frac{V_{ds}}{2} \right) V_{ds}
\end{aligned}
$$

where $\beta = \mu C_{ox} \frac{W}{L}$

nMOS Saturation I-V

If Vgd < Vt, channel pinches off near drain
– When Vds > Vdsat = Vgs – Vt
Now drain voltage no longer increases current

$$
I_{ds} = \beta \left( V_{gs} - V_t - \frac{V_{dsat}}{2} \right) V_{dsat} = \frac{\beta}{2} \left( V_{gs} - V_t \right)^2
$$

nMOS I-V Summary

Shockley 1st order transistor models

$$
I_{ds} =
\begin{cases}
0 & V_{gs} < V_t \quad \text{(cutoff)} \\
\beta \left( V_{gs} - V_t - \frac{V_{ds}}{2} \right) V_{ds} & V_{ds} < V_{dsat} \quad \text{(linear)} \\
\frac{\beta}{2} \left( V_{gs} - V_t \right)^2 & V_{ds} > V_{dsat} \quad \text{(saturation)}
\end{cases}
$$
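The piecewise Shockley model can be sketched as a short Python function. This is only an illustration: the name `shockley_ids` and its argument names are assumptions made here, not something defined in the slides, and the function assumes an nMOS device with Vds ≥ 0.

```python
def shockley_ids(vgs, vds, vt, beta):
    """First-order (Shockley) nMOS drain current in amperes.

    vgs, vds and vt are in volts; beta = mu * Cox * W / L is in A/V^2.
    Assumes vds >= 0 (source is the lower-voltage terminal).
    """
    if vgs < vt:
        return 0.0  # cutoff: no channel
    vdsat = vgs - vt
    if vds < vdsat:
        # linear region: current rises with vds
        return beta * (vgs - vt - vds / 2.0) * vds
    # saturation: current is independent of vds
    return 0.5 * beta * (vgs - vt) ** 2
```

The linear and saturation expressions agree at Vds = Vdsat, so the function is continuous across the region boundary.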

Example

We will be using a 65 nm process in these examples
– From IBM
– tox = 10.5 Å
– μ = 80 cm²/V·s
– Vt = 0.3 V
Plot Ids vs. Vds
– Vgs = 0, 0.2, 0.4, 0.6, 0.8, 1.0
– Use W/L = 0.1 / 0.05 μm
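As a rough check of the numbers in this example, the sketch below computes Cox and β from the stated process parameters and evaluates the first-order Shockley model at Vds = 1.0 V. The constant names and the helper `ids` are illustrative assumptions; the permittivity of free space is taken as 8.85 × 10⁻¹⁴ F/cm. Under these assumptions β comes out near 526 μA/V².

```python
EPS0 = 8.85e-14          # permittivity of free space, F/cm
EPS_OX = 3.9 * EPS0      # SiO2 permittivity, F/cm
TOX = 10.5e-8            # oxide thickness: 10.5 angstroms expressed in cm
MU_N = 80.0              # electron mobility, cm^2/(V*s)
VT = 0.3                 # threshold voltage, V
W_OVER_L = 0.1 / 0.05    # W/L from the example

cox = EPS_OX / TOX              # gate capacitance per unit area, F/cm^2
beta = MU_N * cox * W_OVER_L    # A/V^2


def ids(vgs, vds):
    """Shockley drain current (A) for the example device."""
    if vgs < VT:
        return 0.0                                   # cutoff
    if vds < vgs - VT:
        return beta * (vgs - VT - vds / 2.0) * vds   # linear
    return 0.5 * beta * (vgs - VT) ** 2              # saturation


for vgs in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    # Report the current at Vds = 1.0 V in microamps
    print(f"Vgs = {vgs:.1f} V: Ids = {ids(vgs, 1.0) * 1e6:.1f} uA")
```

Sweeping `vds` from 0 to 1.0 V for each `vgs` value gives the family of curves the slide asks for.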

pMOS I-V

All dopings and voltages are inverted for pMOS
– Source is the more positive terminal
Mobility μp is determined by holes
– Typically 2-3x lower than that of electrons μn
– 40 cm²/V·s in 65 nm process
Thus pMOS must be wider to provide same current

Capacitance

Any two conductors separated by an insulator have capacitance
Gate to channel capacitor is very important
– Creates channel charge necessary for operation
Source and drain have capacitance to body
– Across reverse-biased diodes
– Called diffusion capacitance because it is associated with source/drain diffusion

Gate Capacitance

Approximate channel as connected to source
Cgs = εoxWL/tox = CoxWL = CpermicronW
Cpermicron is typically about 2 fF/μm

[Figure: gate geometry (W, L, tox) showing polysilicon gate, SiO2 gate oxide (good insulator, εox = 3.9ε0), n+ diffusions, p-type body]

Diffusion Capacitance

Csb, Cdb
Undesirable, called parasitic capacitance
Capacitance depends on area and perimeter
– Use small diffusion nodes
– Comparable to Cg for contacted diff
– ½ Cg for uncontacted
– Varies with process

Nonideal Transistors

High Field Effects
– Mobility Degradation
– Velocity Saturation
Threshold Voltage Effects
– Body Effect
– Drain-Induced Barrier Lowering
– Short Channel Effect
Leakage
– Subthreshold Leakage
– Gate Leakage
– Junction Leakage

Ideal vs. Simulated nMOS I-V Plot

65 nm IBM process, VDD = 1.0 V

ON and OFF Current

Ion = Ids @ Vgs = Vds = VDD
– Saturation
Ioff = Ids @ Vgs = 0, Vds = VDD
– Cutoff

Electric Fields Effects

Vertical electric field: Evert = Vgs / tox
– Attracts carriers into channel
– Long channel: Qchannel ∝ Evert
Lateral electric field: Elat = Vds / L
– Accelerates carriers from source to drain
– Long channel: v = μElat

Coffee Cart Analogy

Tired student runs from VLSI lab to coffee cart
Freshmen are pouring out of the physics lecture hall
Vds is how long you have been up
– Your velocity = fatigue × mobility
Vgs is a wind blowing you against the glass (SiO2) wall
At high Vgs, you are buffeted against the wall
– Mobility degradation
At high Vds, you scatter off freshmen, fall down, get up
– Velocity saturation
  • Don’t confuse this with the saturation region

Mobility Degradation

High Evert effectively reduces mobility
– Collisions with oxide interface

Velocity Saturation

At high Elat, carrier velocity rolls off
– Carriers scatter off atoms in silicon lattice
– Velocity reaches vsat
  • Electrons: 10^7 cm/s
  • Holes: 8 × 10^6 cm/s
– Better model

Vel Sat I-V Effects

Ideal transistor ON current increases with VDD²

$$
I_{ds} = \mu C_{ox} \frac{W}{L} \frac{\left( V_{gs} - V_t \right)^2}{2} = \frac{\beta}{2} \left( V_{gs} - V_t \right)^2
$$

Velocity-saturated ON current increases with VDD

$$
I_{ds} = C_{ox} W \left( V_{gs} - V_t \right) v_{\max}
$$

Real transistors are partially velocity saturated
– Approximate with α-power law model
– Ids ∝ VDD^α
– 1 < α < 2 determined empirically (≈ 1.3 for 65 nm)

α-Power Model

$$
I_{ds} =
\begin{cases}
0 & V_{gs} < V_t \quad \text{(cutoff)} \\
I_{dsat} \dfrac{V_{ds}}{V_{dsat}} & V_{ds} < V_{dsat} \quad \text{(linear)} \\
I_{dsat} & V_{ds} > V_{dsat} \quad \text{(saturation)}
\end{cases}
$$

$$
I_{dsat} = P_c \frac{\beta}{2} \left( V_{gs} - V_t \right)^{\alpha}
$$

$$
V_{dsat} = P_v \left( V_{gs} - V_t \right)^{\alpha / 2}
$$
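A minimal Python sketch of the α-power model follows. The function name `alpha_power_ids` and its argument names are assumptions for illustration; Pc, Pv and α are empirical fitting parameters whose values must come from the process, so none are supplied here.

```python
def alpha_power_ids(vgs, vds, vt, beta, alpha, pc, pv):
    """Alpha-power law nMOS drain current in amperes.

    pc, pv and alpha are empirical fitting parameters; beta is in A/V^2.
    Assumes vds >= 0.
    """
    if vgs < vt:
        return 0.0  # cutoff
    idsat = pc * (beta / 2.0) * (vgs - vt) ** alpha
    vdsat = pv * (vgs - vt) ** (alpha / 2.0)
    if vds < vdsat:
        # linear region: straight line from the origin up to (vdsat, idsat)
        return idsat * vds / vdsat
    return idsat  # saturation
```

With α = 2 and Pc = Pv = 1 the saturation current reduces to the Shockley expression (β/2)(Vgs – Vt)², which is a convenient sanity check.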

Channel Length Modulation

Reverse-biased p-n junctions form a depletion region
– Region between n and p with no carriers
– Width of depletion region Ld grows with reverse bias
– Leff = L – Ld
Shorter Leff gives more current
– Ids increases with Vds
– Even in saturation

[Figure: nMOS cross-section with source at GND, gate and drain at VDD, p bulk Si at GND, showing the depletion region of width Ld, drawn length L, and effective length Leff]

Chan Length Mod I-V

$$
I_{ds} = \frac{\beta}{2} \left( V_{gs} - V_t \right)^2 \left( 1 + \lambda V_{ds} \right)
$$

λ = channel length modulation coefficient
– not feature size
– Empirically fit to I-V characteristics

Threshold Voltage Effects

Vt is Vgs for which the channel starts to invert
Ideal models assumed Vt is constant
Really depends (weakly) on almost everything else:
– Body voltage: Body Effect
– Drain voltage: Drain-Induced Barrier Lowering
– Channel length: Short Channel Effect

Body Effect

Body is a fourth transistor terminal
Vsb affects the charge required to invert the channel
– Increasing Vs or decreasing Vb increases Vt

$$
V_t = V_{t0} + \gamma \left( \sqrt{\phi_s + V_{sb}} - \sqrt{\phi_s} \right)
$$

φs = surface potential at threshold

$$
\phi_s = 2 v_T \ln \frac{N_A}{n_i}
$$

– Depends on doping level NA
– And intrinsic carrier concentration ni
γ = body effect coefficient

$$
\gamma = \frac{t_{ox}}{\varepsilon_{ox}} \sqrt{2 q \varepsilon_{si} N_A} = \frac{\sqrt{2 q \varepsilon_{si} N_A}}{C_{ox}}
$$
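The body-effect equations can be evaluated numerically as in the sketch below. The doping level NA = 8 × 10¹⁷ cm⁻³, the source-to-body voltage of 0.6 V, and the physical constants (vT = 26 mV, ni = 1.45 × 10¹⁰ cm⁻³, εsi = 11.7ε0) are assumptions chosen for illustration and are not stated in the slides; tox and Vt0 reuse the 65 nm example values. All function and constant names are hypothetical.

```python
import math

Q = 1.6e-19              # electron charge, C
EPS0 = 8.85e-14          # permittivity of free space, F/cm
EPS_OX = 3.9 * EPS0      # SiO2 permittivity, F/cm
EPS_SI = 11.7 * EPS0     # silicon permittivity, F/cm (assumed value)
V_T = 0.026              # thermal voltage near room temperature, V (assumed)
N_I = 1.45e10            # intrinsic carrier concentration, cm^-3 (assumed)


def surface_potential(n_a):
    """phi_s = 2 * vT * ln(NA / ni), in volts."""
    return 2.0 * V_T * math.log(n_a / N_I)


def body_effect_coefficient(n_a, tox_cm):
    """gamma = (tox / eps_ox) * sqrt(2 * q * eps_si * NA), in V^0.5."""
    return (tox_cm / EPS_OX) * math.sqrt(2.0 * Q * EPS_SI * n_a)


def threshold_voltage(vt0, vsb, n_a, tox_cm):
    """Vt = Vt0 + gamma * (sqrt(phi_s + Vsb) - sqrt(phi_s)), in volts."""
    phi_s = surface_potential(n_a)
    gamma = body_effect_coefficient(n_a, tox_cm)
    return vt0 + gamma * (math.sqrt(phi_s + vsb) - math.sqrt(phi_s))


N_A = 8e17       # assumed doping level, cm^-3
TOX = 10.5e-8    # 10.5 angstroms in cm

print(f"phi_s = {surface_potential(N_A):.3f} V")
print(f"gamma = {body_effect_coefficient(N_A, TOX):.3f} V^0.5")
print(f"Vt at Vsb = 0.6 V: {threshold_voltage(0.3, 0.6, N_A, TOX):.3f} V")
```

Under these assumed values φs is a little over 0.9 V and γ is roughly 0.16 V^0.5, so a few hundred millivolts of source-to-body bias raises Vt by tens of millivolts.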

Body Effect Cont.

For small source-to-body voltage, treat as linear

DIBL

Electric field from drain affects channel
More pronounced in small transistors where the drain is closer to the channel
Drain-Induced Barrier Lowering
– Drain voltage also affects Vt

$$
V_t' = V_t - \eta V_{ds}
$$

High drain voltage causes current to increase.

Short Channel Effect

In small transistors, source/drain depletion regions extend into the channel
– Impacts the amount of charge required to invert the channel
– And thus makes Vt a function of channel length
Short channel effect: Vt increases with L
– Some processes exhibit a reverse short channel effect in which Vt decreases with L

Leakage

What about current in cutoff?
Simulated results
What differs?
– Current doesn’t go to 0 in cutoff

Leakage Sources

Subthreshold conduction
– Transistors can’t abruptly turn ON or OFF
– Dominant source in contemporary transistors
Gate leakage
– Tunneling through ultrathin gate dielectric
Junction leakage
– Reverse-biased PN junction diode current

Subthreshold Leakage

Subthreshold leakage exponential with Vgs

$$
I_{ds} = I_{ds0} \, e^{\frac{V_{gs} - V_{t0} + \eta V_{ds} - k_\gamma V_{sb}}{n v_T}} \left( 1 - e^{\frac{-V_{ds}}{v_T}} \right)
$$

n is process dependent
– typically 1.3-1.7
Rewrite relative to Ioff on log scale
S ≈ 100 mV/decade @ room temperature
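The exponential dependence on Vgs implies a subthreshold slope S = n·vT·ln 10, the gate voltage change needed for a tenfold change in current. The sketch below evaluates the leakage expression and S; the function names are hypothetical, vT = 26 mV is assumed for room temperature, and the parameter values used in the usage lines are placeholders rather than data for any real process.

```python
import math


def subthreshold_ids(vgs, vds, vsb, ids0, vt0, eta, k_gamma, n, v_t=0.026):
    """Subthreshold drain current (A) from the exponential model."""
    exponent = (vgs - vt0 + eta * vds - k_gamma * vsb) / (n * v_t)
    return ids0 * math.exp(exponent) * (1.0 - math.exp(-vds / v_t))


def subthreshold_slope(n, v_t=0.026):
    """Gate voltage change (V) per decade of current: S = n * vT * ln(10)."""
    return n * v_t * math.log(10.0)


# Placeholder parameters, chosen only to exercise the functions
params = dict(vds=1.0, vsb=0.0, ids0=1e-6, vt0=0.3, eta=0.1, k_gamma=0.1, n=1.5)
s = subthreshold_slope(params["n"])
ratio = subthreshold_ids(s, **params) / subthreshold_ids(0.0, **params)
print(f"S = {s * 1e3:.1f} mV/decade, current ratio over one S = {ratio:.2f}")
```

For n between 1.3 and 1.7 this gives S of roughly 78 to 102 mV/decade at vT = 26 mV, in line with the approximate 100 mV/decade figure quoted on the slide.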

BSIM Ion Simulation

ON-Current Models

Saturation: α-Power Law Model [Sakurai90]

$$
I_{on} = k_2 W V_{DT}^{\alpha}, \qquad V_{DT} = V_{DD} - V_t
$$

Subthreshold: Exponential Model [Sheu87]

$$
I_{on} = I_0 W e^{\frac{V_{DT}}{n v_T}}, \qquad v_T = \frac{kT}{q}
$$

Near Threshold: EKV [Markovic10]

$$
I_{on} = \frac{2 n \mu C_{ox}}{k_{fit}} \frac{W}{L} v_T^2 \left[ \ln \left( 1 + e^{\frac{(1 + \eta) V_{DD} - V_t}{2 n v_T}} \right) \right]^2
$$

Transregional Ion Model

$$
I_{on} = I_0 W e^{\frac{V_{DT} - a V_{DT}^2}{n v_T}}
$$

• nMOS device
• 65 nm commercial process
• Transregional model fits well from 60 – 700 mV
• Average error: 2.4%
• Maximum error: 6.1%

[Figure: plot comparing the Exponential Model and the Transregional Model]

Off Current Model

Sensitivity to VDD through drain-induced barrier lowering

$$
I_{off} = I_1 W e^{\frac{\eta V_{DT}}{n v_T}}
$$

Good fit for VDD of 200 – 700 mV

Junction Leakage

Reverse-biased p-n junctions have some leakage
– Ordinary diode leakage
– Band-to-band tunneling (BTBT)
– Gate-induced drain leakage (GIDL)

Gate Leakage

Carriers tunnel through very thin gate oxides
Exponentially sensitive to tox and VDD
– A and B are tech constants
– Greater for electrons
  • So nMOS gates leak more
Negligible for older processes (tox > 20 Å)
Critically important at 65 nm and below (tox ≈ 10.5 Å)

From [Song01]

Diode Leakage

Reverse-biased p-n junctions have some leakage

$$
I_D = I_S \left( e^{\frac{V_D}{v_T}} - 1 \right)
$$

At any significant negative diode voltage, ID = -IS
IS depends on doping levels
– And area and perimeter of diffusion regions
– Typically < 1 fA/μm² (negligible)
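The claim that the reverse current settles at -IS can be checked with a few lines of Python. The function name `diode_current` is hypothetical, vT = 26 mV is assumed, and the 1 fA saturation current is a placeholder value.

```python
import math


def diode_current(v_d, i_s, v_t=0.026):
    """Ideal diode current (A): I_D = I_S * (exp(V_D / vT) - 1)."""
    return i_s * (math.exp(v_d / v_t) - 1.0)


I_S = 1e-15  # placeholder saturation current, A
for v_d in (-0.05, -0.1, -0.5, -1.0):
    # The ratio I_D / I_S approaches -1 as the reverse bias grows
    print(f"V_D = {v_d:+.2f} V: I_D / I_S = {diode_current(v_d, I_S) / I_S:.4f}")
```

Because the exponential term decays by a factor of e for every 26 mV of reverse bias, the current is within a few percent of -IS once the reverse bias reaches about 0.1 V.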

Band-to-Band Tunneling

Tunneling across heavily doped p-n junctions
– Especially sidewall between drain & channel when halo doping is used to increase Vt
Increases junction leakage to significant levels
– Xj: sidewall junction depth
– Eg: bandgap voltage
– A, B: tech constants

Gate-Induced Drain Leakage

Occurs at overlap between gate and drain
– Most pronounced when drain is at VDD, gate is at a negative voltage
– Thwarts efforts to reduce subthreshold leakage using a negative gate voltage

Temperature Sensitivity

Increasing temperature
– Reduces mobility
– Reduces Vt
ION decreases with temperature
IOFF increases with temperature

[Figure: Ids vs. Vgs curves for increasing temperature]

So What?

So what if transistors are not ideal?
– They still behave like switches.
But these effects matter for…
– Supply voltage choice
– Logical effort
– Quiescent power consumption
– Pass transistors
– Temperature of operation

Outline

Motivation
Device Models
Energy & Delay
– Power and Energy
– Dynamic Power
– Static Power
– Delay
– Energy-Delay Optimization
Variation
Low-Voltage Circuit Design with Variability

Power and Energy

Power is drawn from a voltage source attached to the VDD pin(s) of a chip.

Instantaneous Power:

$$
P(t) = I(t) V(t)
$$

Energy:

$$
E = \int_0^T P(t) \, dt
$$

Average Power:

$$
P_{avg} = \frac{E}{T} = \frac{1}{T} \int_0^T P(t) \, dt
$$

Power in Circuit Elements

$$
P_{VDD}(t) = I_{DD}(t) V_{DD}
$$

$$
P_R(t) = \frac{V_R^2(t)}{R} = I_R^2(t) R
$$

$$
E_C = \int_0^\infty I(t) V(t) \, dt = \int_0^\infty C \frac{dV}{dt} V(t) \, dt = C \int_0^{V_C} V(t) \, dV = \frac{1}{2} C V_C^2
$$

Charging a Capacitor

When the gate output rises
– Energy stored in capacitor is

$$
E_C = \frac{1}{2} C_L V_{DD}^2
$$

– But energy drawn from the supply is

$$
E_{VDD} = \int_0^\infty I(t) V_{DD} \, dt = \int_0^\infty C_L \frac{dV}{dt} V_{DD} \, dt = C_L V_{DD} \int_0^{V_{DD}} dV = C_L V_{DD}^2
$$

– Half the energy from VDD is dissipated in the pMOS transistor as heat, other half stored in capacitor
When the gate output falls
– Energy in capacitor is dumped to GND
– Dissipated as heat in the nMOS transistor
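The half-stored, half-dissipated split can be checked numerically. The sketch below is an illustration under a simplifying assumption: the pMOS pull-up is modeled as a fixed resistor `r_on` (a real transistor is nonlinear, but the energy split does not depend on the resistance). The function name, the 1 kΩ value, and the step count are hypothetical choices.

```python
def charge_capacitor(c_load, vdd, r_on, steps=100_000, n_tau=20.0):
    """Integrate an RC charging transient with forward Euler.

    Returns (energy from supply, energy stored in capacitor,
    energy dissipated in the resistor), all in joules.
    The pull-up device is approximated as a linear resistor r_on.
    """
    dt = n_tau * r_on * c_load / steps   # simulate n_tau time constants
    v = 0.0                              # output starts discharged
    e_supply = 0.0
    e_resistor = 0.0
    for _ in range(steps):
        i = (vdd - v) / r_on             # current through the pull-up
        e_supply += i * vdd * dt         # energy delivered by VDD
        e_resistor += i * i * r_on * dt  # heat in the pull-up
        v += i * dt / c_load             # capacitor charges
    e_cap = 0.5 * c_load * v * v
    return e_supply, e_cap, e_resistor


e_supply, e_cap, e_res = charge_capacitor(150e-15, 1.0, 1e3)
print(f"supply: {e_supply * 1e15:.1f} fJ, stored: {e_cap * 1e15:.1f} fJ, "
      f"dissipated: {e_res * 1e15:.1f} fJ")
```

For CL = 150 fF and VDD = 1.0 V the supply delivers about CL·VDD² = 150 fJ, of which about half ends up on the capacitor and half is lost as heat, whatever value is chosen for `r_on`.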

Switching Waveforms

Example: VDD = 1.0 V, CL = 150 fF, f = 1 GHz

Switching Power

$$
\begin{aligned}
P_{\text{switching}} &= \frac{1}{T} \int_0^T i_{DD}(t) V_{DD} \, dt \\
&= \frac{V_{DD}}{T} \int_0^T i_{DD}(t) \, dt \\
&= \frac{V_{DD}}{T} \left[ T f_{sw} C V_{DD} \right] \\
&= C V_{DD}^2 f_{sw}
\end{aligned}
$$

[Figure: supply VDD delivering current iDD(t) to a load capacitance C switching at frequency fsw]

Activity Factor

Suppose the system clock frequency = f
Let fsw = αf, where α = activity factor
– If the signal is a clock, α = 1
– If the signal switches once per cycle, α = ½
Dynamic power:

$$
P_{\text{switching}} = \alpha C V_{DD}^2 f
$$
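The switching power formula is simple enough to evaluate directly. The sketch below uses the values VDD = 1.0 V, C = 150 fF and f = 1 GHz as sample inputs; the function name `switching_power` is a hypothetical choice.

```python
def switching_power(c_load, vdd, f_clk, alpha=1.0):
    """Average switching power (W): P = alpha * C * VDD^2 * f."""
    return alpha * c_load * vdd ** 2 * f_clk


# A 150 fF node on a 1.0 V supply with a 1 GHz clock
p_clock = switching_power(150e-15, 1.0, 1e9, alpha=1.0)   # clock-like signal
p_data = switching_power(150e-15, 1.0, 1e9, alpha=0.5)    # switches once per cycle
print(f"alpha = 1.0: {p_clock * 1e6:.0f} uW, alpha = 0.5: {p_data * 1e6:.0f} uW")
```

A node that behaves like a clock dissipates 150 μW under these values, and one that switches once per cycle dissipates half of that.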

Short Circuit Current

When transistors switch, both nMOS and pMOS networks may be momentarily ON at once
Leads to a blip of “short circuit” current.
< 10% of dynamic power if rise/fall times are comparable for input and output
We will generally ignore this component

Power Dissipation Sources

Ptotal = Pdynamic + Pstatic
Dynamic power: Pdynamic = Pswitching + Pshortcircuit
– Switching load capacitances
– Short-circuit current
Static power: Pstatic = (Isub + Igate + Ijunct + Icontention)VDD
– Subthreshold leakage
– Gate leakage
– Junction leakage
– Contention current

Dynamic Power Example

1 billion transistor chip
– 50M logic transistors
  • Average width: 12 λ
  • Activity factor = 0.1
– 950M memory transistors
  • Average width: 4 λ
  • Activity factor = 0.02
– 1.0 V 65 nm process
– C = 1 fF/μm (gate) + 0.8 fF/μm (diffusion)
Estimate dynamic power consumption @ 1 GHz. Neglect wire capacitance and short-circuit current.

Page 64

Solution
$C_{logic} = (50 \times 10^6)(12\lambda)(0.025\ \mu\text{m}/\lambda)(1.8\ \text{fF}/\mu\text{m}) = 27\ \text{nF}$
$C_{mem} = (950 \times 10^6)(4\lambda)(0.025\ \mu\text{m}/\lambda)(1.8\ \text{fF}/\mu\text{m}) = 171\ \text{nF}$
$P_{dynamic} = [0.1\,C_{logic} + 0.02\,C_{mem}](1.0)^2(1.0\ \text{GHz}) = 6.1\ \text{W}$
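The arithmetic in this solution can be checked with a short script. This is a minimal sketch using only the numbers given in the example; the function and variable names are illustrative, not from the slides.

```python
# Sketch of the dynamic power estimate; all values come from the example slide.
LAMBDA_UM = 0.025          # um per lambda
C_PER_UM_FF = 1.0 + 0.8    # fF/um: gate + diffusion capacitance
VDD = 1.0                  # supply voltage in volts
FREQ_HZ = 1.0e9            # 1 GHz clock


def total_capacitance_f(n_transistors, width_lambda):
    """Total capacitance in farads for n transistors of the given average width."""
    width_um = n_transistors * width_lambda * LAMBDA_UM
    return width_um * C_PER_UM_FF * 1e-15


c_logic = total_capacitance_f(50e6, 12)    # logic transistors
c_mem = total_capacitance_f(950e6, 4)      # memory transistors

# P = alpha * C * VDD^2 * f, summed over logic (alpha = 0.1) and memory (alpha = 0.02)
p_dynamic = (0.1 * c_logic + 0.02 * c_mem) * VDD**2 * FREQ_HZ

print(f"C_logic   = {c_logic * 1e9:.0f} nF")
print(f"C_mem     = {c_mem * 1e9:.0f} nF")
print(f"P_dynamic = {p_dynamic:.1f} W")
```

Memory holds 95% of the transistors but contributes only slightly more than half of the dynamic power here, because its activity factor is five times lower.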

Page 65

Dynamic Power Reduction
$P_{switching} = \alpha C V_{DD}^2 f$
Try to minimize:
– Activity factor
– Capacitance
– Supply voltage
– Frequency

Page 66

Activity Factor Estimation
Let Pi = Prob(node i = 1)
– $\bar{P}_i = 1 - P_i$
$\alpha_i = \bar{P}_i \cdot P_i$
Completely random data has P = 0.5 and α = 0.25
Data is often not completely random
– e.g. upper bits of 64-bit words representing bank account balances are usually 0
Data propagating through ANDs and ORs has lower activity factor
– Depends on design, but typically α ≈ 0.1

Page 67

Switching Probability

Page 68

Example
A 4-input AND is built out of two levels of gates
Estimate the activity factor at each node if the inputs have P = 0.5
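The estimate can be sketched in code using α = P(1 − P) from the Activity Factor Estimation slide. The gate-level structure is not recoverable from this text, so the sketch assumes two NAND2 gates feeding a NOR2 (one common way to build a 4-input AND) and assumes independent inputs. The node names n1, n2 and Y are hypothetical.

```python
def activity(p_one):
    """Activity factor alpha = P * (1 - P) for a node with P = Prob(node = 1)."""
    return p_one * (1.0 - p_one)


def nand2(pa, pb):
    """Probability that a 2-input NAND output is 1, assuming independent inputs."""
    return 1.0 - pa * pb


def nor2(pa, pb):
    """Probability that a 2-input NOR output is 1, assuming independent inputs."""
    return (1.0 - pa) * (1.0 - pb)


p_in = 0.5                      # each primary input is 1 half the time
p_n1 = nand2(p_in, p_in)        # first-level NAND on inputs A, B (assumed structure)
p_n2 = nand2(p_in, p_in)        # first-level NAND on inputs C, D (assumed structure)
p_y = nor2(p_n1, p_n2)          # second-level NOR gives Y = A & B & C & D

for name, p in [("inputs", p_in), ("n1", p_n1), ("n2", p_n2), ("Y", p_y)]:
    print(f"{name}: P = {p:.4f}, alpha = {activity(p):.4f}")
```

Under these assumptions the activity factor falls at each level of logic, which matches the earlier remark that data propagating through ANDs and ORs has a lower activity factor.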

Page 69

Clock Gating
The best way to reduce the activity is to turn off the clock to registers in unused blocks
– Saves clock activity (α = 1)
– Eliminates all switching activity in the block
– Requires determining if block will be used

Page 70

Capacitance
Gate capacitance
– Fewer stages of logic
– Small gate sizes
Wire capacitance
– Good floorplanning to keep communicating blocks close to each other
– Drive long wires with inverters or buffers rather than complex gates

Page 71

Voltage / Frequency
Run each block at the lowest possible voltage and frequency that meets performance requirements
Voltage Domains
– Provide separate supplies to different blocks
– Level converters required when crossing from low to high VDD domains
Dynamic Voltage Scaling (DVS)
– Adjust VDD and f according to workload

Page 72

Dynamic Voltage Scaling
Continuously adjustable supply voltages are costly
Most of the benefit can be gained by dithering between 2 or 3 supply voltages

Page 73

Static Power
Static power is consumed even when chip is quiescent.
– Leakage draws power from nominally OFF devices
– Ratioed circuits burn power in fight between ON transistors

Page 74

Static Power Example
Revisit power estimation for 1 billion transistor chip
Estimate static power consumption
– Subthreshold leakage
  • Normal Vt: 100 nA/μm
  • High Vt: 10 nA/μm
  • High Vt used in all memories and in 95% of logic gates
– Gate leakage 5 nA/μm
– Junction leakage negligible

Page 75

Solution
$W_{normal\text{-}V_t} = (50 \times 10^6)(12\lambda)(0.025\ \mu\text{m}/\lambda)(0.05) = 0.75 \times 10^6\ \mu\text{m}$
$W_{high\text{-}V_t} = [(50 \times 10^6)(12\lambda)(0.95) + (950 \times 10^6)(4\lambda)](0.025\ \mu\text{m}/\lambda) = 109.25 \times 10^6\ \mu\text{m}$
$I_{sub} = [W_{normal\text{-}V_t} \times 100\ \text{nA}/\mu\text{m} + W_{high\text{-}V_t} \times 10\ \text{nA}/\mu\text{m}]/2 = 584\ \text{mA}$
$I_{gate} = [(W_{normal\text{-}V_t} + W_{high\text{-}V_t}) \times 5\ \text{nA}/\mu\text{m}]/2 = 275\ \text{mA}$
$P_{static} = (584\ \text{mA} + 275\ \text{mA})(1.0\ \text{V}) = 859\ \text{mW}$
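The static power estimate can be reproduced step by step. This is a minimal sketch using the figures from the example. The division by 2 follows the solution as written; reading it as "about half the transistors are OFF at any time" is an interpretation, not something stated on the slide. Variable names are illustrative.

```python
LAMBDA_UM = 0.025   # um per lambda
VDD = 1.0           # supply voltage in volts

# Total transistor widths in um
logic_width_um = 50e6 * 12 * LAMBDA_UM     # 50M logic transistors, 12 lambda wide
mem_width_um = 950e6 * 4 * LAMBDA_UM       # 950M memory transistors, 4 lambda wide

w_normal_vt = 0.05 * logic_width_um                    # 5% of logic uses normal Vt
w_high_vt = 0.95 * logic_width_um + mem_width_um       # the rest uses high Vt

# Leakage currents in amperes; the factor of 1/2 follows the slide's solution
# (interpreted here as roughly half the transistors being OFF at once).
i_sub = (w_normal_vt * 100e-9 + w_high_vt * 10e-9) / 2
i_gate = (w_normal_vt + w_high_vt) * 5e-9 / 2

p_static = (i_sub + i_gate) * VDD

print(f"W_normal-Vt = {w_normal_vt / 1e6:.2f} x 10^6 um")
print(f"W_high-Vt   = {w_high_vt / 1e6:.2f} x 10^6 um")
print(f"I_sub       = {i_sub * 1e3:.0f} mA")
print(f"I_gate      = {i_gate * 1e3:.0f} mA")
print(f"P_static    = {p_static * 1e3:.0f} mW")
```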

Page 76

Subthreshold Leakage
For Vds > 50 mV
$I_{sub} \approx I_{off}\, 10^{\frac{V_{gs} + \eta (V_{ds} - V_{DD}) - k_\gamma V_{sb}}{S}}$
Ioff = leakage at Vgs = 0, Vds = VDD

Typical values in 65 nm
Ioff = 100 nA/μm @ Vt = 0.3 V
Ioff = 10 nA/μm @ Vt = 0.4 V
Ioff = 1 nA/μm @ Vt = 0.5 V
η = 0.1
kγ = 0.1
S = 100 mV/decade

Page 77

Stack Effect
Series OFF transistors have less leakage
– Vx > 0, so N2 has negative Vgs
$I_{sub} = \underbrace{I_{off}\, 10^{\frac{\eta (V_x - V_{DD})}{S}}}_{N1} = \underbrace{I_{off}\, 10^{\frac{-V_x + \eta ((V_{DD} - V_x) - V_{DD}) - k_\gamma V_x}{S}}}_{N2}$
$V_x = \frac{\eta V_{DD}}{1 + 2\eta + k_\gamma}$
$I_{sub} = I_{off}\, 10^{\frac{-\eta V_{DD} \left( \frac{1 + \eta + k_\gamma}{1 + 2\eta + k_\gamma} \right)}{S}} \approx I_{off}\, 10^{\frac{-\eta V_{DD}}{S}}$
– Leakage through 2-stack reduces ~10x
– Leakage through 3-stack reduces further
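The stack-effect result can be checked numerically with the subthreshold leakage model and the typical 65 nm values quoted on the Subthreshold Leakage slide (η = 0.1, kγ = 0.1, S = 100 mV/decade). VDD = 1.0 V is an assumption taken from the earlier 65 nm examples. The sketch solves for the intermediate node voltage Vx in closed form and confirms that both transistors then carry the same current; the function name is illustrative.

```python
ETA = 0.1        # DIBL coefficient
K_GAMMA = 0.1    # body effect coefficient
S = 0.100        # subthreshold slope in V/decade
VDD = 1.0        # assumed supply voltage in volts


def isub_over_ioff(vgs, vds, vsb):
    """Subthreshold current normalized to Ioff (the leakage at Vgs = 0, Vds = VDD)."""
    return 10 ** ((vgs + ETA * (vds - VDD) - K_GAMMA * vsb) / S)


# Closed-form intermediate node voltage of a 2-stack of OFF nMOS transistors
vx = ETA * VDD / (1 + 2 * ETA + K_GAMMA)

# N1 (bottom): Vgs = 0, Vds = Vx, Vsb = 0
i_n1 = isub_over_ioff(0.0, vx, 0.0)
# N2 (top): source at Vx, so Vgs = -Vx, Vds = VDD - Vx, Vsb = Vx
i_n2 = isub_over_ioff(-vx, VDD - vx, vx)

reduction = 1.0 / i_n1

print(f"Vx = {vx * 1e3:.1f} mV")
print(f"I(N1)/Ioff = {i_n1:.4f}, I(N2)/Ioff = {i_n2:.4f}")
print(f"2-stack leakage reduction = {reduction:.1f}x")
```

With these parameters the exact reduction comes out somewhat below the rounded ~10x figure, which corresponds to the simplified exponent −ηVDD/S.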

Page 78

Leakage Control
Leakage and delay trade off
– Aim for low leakage in sleep and low delay in active mode
To reduce leakage:
– Increase Vt: multiple Vt
  • Use low Vt only in critical circuits
– Increase Vs: stack effect
  • Input vector control in sleep
– Decrease Vb
  • Reverse body bias in sleep
  • Or forward body bias in active mode

Page 79

Gate Leakage
Extremely strong function of tox and Vgs
– Negligible for older processes
– Approaches subthreshold leakage at 65 nm and below in some processes
An order of magnitude less for pMOS than nMOS
Control leakage in the process using tox > 10.5 Å
– High-k gate dielectrics help
– Some processes provide multiple tox
  • e.g. thicker oxide for 3.3 V I/O transistors
Control leakage in circuits by limiting VDD

Page 80

NAND3 Leakage Example
100 nm process
Ign = 6.3 nA, Igp = 0
Ioffn = 5.63 nA, Ioffp = 9.3 nA
Data from [Lee03]

Page 81

Junction Leakage
From reverse-biased p-n junctions
– Between diffusion and substrate or well
Ordinary diode leakage is negligible
Band-to-band tunneling (BTBT) can be significant
– Especially in high-Vt transistors where other leakage is small
– Worst at Vdb = VDD
Gate-induced drain leakage (GIDL) exacerbates junction leakage
– Worst for Vgd = -VDD (or more negative)

Page 82

Power Gating
Turn OFF power to blocks when they are idle to save leakage
– Use virtual VDD (VDDV)
– Gate outputs to prevent invalid logic levels to next block
Voltage drop across sleep transistor degrades performance during normal operation
– Size the transistor wide enough to minimize impact
Switching wide sleep transistor costs dynamic power
– Only justified when circuit sleeps long enough

Page 83

Delay Modeling
$t_{pd} = k \frac{C_{load} V_{DD}}{I_{on}}$

Page 84

Voltage Sensitivity
α-power law (saturation)
$t_{pd} = k \frac{C_{load} V_{DD}}{W V_{DT}^{\alpha}}$
Transregional (near threshold)
$t_{pd} = k \frac{C_{load}}{W} V_{DD}\, e^{-\frac{V_{DT} - a V_{DT}^2}{n v_T}}$

Page 85

Fanout-of-4 Inverter Delay
$t_{pd} = k \frac{C_{load}}{W} V_{DD}\, e^{-\frac{V_{DT} - a V_{DT}^2}{n v_T}}$
• Transregional model fits well from 140 – 700 mV
• Average error: 2.8%
• Maximum error: 8.7%

Page 86

Delay Tracking
Ex: 8-bit ripple adder delay tracks inverter delay well
– 8% variation from 200 – 700 mV
– Transregional delay model can predict how a circuit delay scales with VDD and Vt

Page 87

Energy-Delay Optimization
What is the best choice of VDD and Vt?
Possible Objectives
– Minimum Energy (Power-Delay Product)
– Minimum Energy-Delay Product
– Minimum Energy under a Delay Constraint

Page 88

Ideal Minimum Energy
Idealized minimum energy for an inverter
– Assume n = 1, ignore leakage
– To get nonzero noise margin, transfer function must be steeper than -1 at Vinv
– Gives Vmin = 2vT ln 2 = 36 mV @ 300 K
– If transistor has only one electron,
  • E = qVmin/2 = kT ln 2 = 2.9 x 10^-21 J
– Compare inverters in
  • 0.5 μm 5 V: 1.5 x 10^-13 J
  • 65 nm 1 V: 3 x 10^-16 J
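The two idealized limits follow directly from physical constants. A minimal sketch, assuming T = 300 K as on the slide and using the exact SI values of the Boltzmann constant and the elementary charge:

```python
import math

K_BOLTZMANN = 1.380649e-23      # J/K (exact SI value)
Q_ELECTRON = 1.602176634e-19    # C (exact SI value)
TEMP_K = 300.0                  # temperature from the slide

v_thermal = K_BOLTZMANN * TEMP_K / Q_ELECTRON   # thermal voltage vT = kT/q
v_min = 2 * v_thermal * math.log(2)             # Vmin = 2 vT ln 2
e_min = Q_ELECTRON * v_min / 2                  # E = q Vmin / 2 = kT ln 2

print(f"vT   = {v_thermal * 1e3:.2f} mV")
print(f"Vmin = {v_min * 1e3:.1f} mV")
print(f"Emin = {e_min:.2e} J")

# Ratio of the 65 nm, 1 V inverter energy quoted on the slide to this limit
print(f"65 nm inverter / ideal limit = {3e-16 / e_min:.1e}")
```

By the slide's own figures, the 65 nm inverter still switches roughly five orders of magnitude above this single-electron limit.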

Page 89

Practical Min Energy
Balance leakage and dynamic energy
– Low VDD reduces dynamic energy
– But increases cycle time and total leakage
Even though inverters can operate at 100 mV, minimum energy occurs at a higher voltage
Calhoun05

Page 90

Min Energy Ckt Design
Also called subthreshold or near-threshold circuit design.
Use static CMOS gates
Use minimum width transistors (both P and N)
– Reduce switching capacitance
Keep wires short
– Cell height as small as possible (~8 tracks)
Avoid complex gates (> ~2 stack)
– Stack effect degrades ON current & speed
– Longer cycle time leads to more leakage
– Complex gates have worse noise margins
Synthesize to commercial min-sized library with complex cells removed

Page 91

Minimum Energy Model
$E_{tot} = E_{dyn} + E_{leak}$
$E_{dyn} = C_{dyn} V_{DD}^2$
$E_{leak} = I_{off} V_{DD} T_c$
Cdyn: Effective switching capacitance:
• glitching
• activity factor
• short circuit current

Page 92

Off Current Model
Sensitivity to VDD through drain-induced barrier lowering
$I_{off} = I_1 W_{eff}\, e^{\frac{\eta V_{DT}}{n v_T}}$
Good fit for VDD of 200 – 700 mV

Page 93

Leakage Energy
$E_{leak} = I_{off} V_{DD} T_c$
$T_c = t_{pd} L_{dp}$
$I_{off} = I_1 W_{eff}\, e^{\frac{\eta V_{DT}}{n v_T}}$
$E_{leak} = \underbrace{I_1 L_{dp}\, k\, C_{load} \frac{W_{eff}}{W}}_{C_{leak}} V_{DD}^2\, e^{-\frac{(1 - \eta) V_{DT} - a V_{DT}^2}{n v_T}}$

Page 94

Total Energy
$E_{tot} = V_{DD}^2\, C_{dyn} \left( 1 + R\, e^{-\frac{(1 - \eta) V_{DT} - a V_{DT}^2}{n v_T}} \right)$
$R = \frac{C_{leak}}{C_{dyn}}$

Page 95

Application: Inv. Chains
Model circuit with logic depth N and activity factor 1/M
What is the minimum energy point?
Model Parameters
– 65 nm process
– W = 0.1 μm (minimum)
– Cg = 1.0 fF/μm
– Cinv = 0.2 fF
– Cdyn = 0.8N fF
– Weff = 2MNW

Page 96

Model Parameters

Page 97

Results
For N = 12 stage ring oscillators:
Transregional model matches HSPICE to < 15 mV
Subthreshold model underestimates best supply voltage by up to 80 mV at low activity factor
Minimum-energy operating point is above threshold

Page 98

Best Supply Voltage
Best VDD is a logarithmic function of the ratio of leakage to dynamic energy
$V_{DDopt} = (1.37 \ln R + 6.96)\, n v_T$

Page 99

Energy Contour Plots
Normalized energy in 180 nm process
– Best VDD increases as α goes down
[Figure: energy contour plots for α = 1 and α = 0.1, from Wang02]

Page 100

Energy-Delay Product
Assume VDD > Vt, leakage negligible
– $E = C_{eff} V_{DD}^2$
– $D = k C_{eff} V_{DD} / (V_{DD} - V_t)^{\alpha}$
Differentiate wrt. VDD, set result to 0 to minimize EDP
VDD ~ 2Vt for min EDP
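The differentiation step can be written out explicitly. This is a sketch of the derivation using only the E and D expressions above; the specific value α = 1.5 is an assumption chosen to show where the 2Vt figure can come from, since the slide does not state which α it uses.

```latex
\begin{aligned}
EDP &= E \cdot D = \frac{k\, C_{eff}^2\, V_{DD}^3}{(V_{DD} - V_t)^{\alpha}} \\
\frac{d \ln(EDP)}{d V_{DD}} &= \frac{3}{V_{DD}} - \frac{\alpha}{V_{DD} - V_t} = 0 \\
V_{DD} &= \frac{3}{3 - \alpha}\, V_t
\end{aligned}
```

For α = 1.5 this gives VDD = 2Vt, and for α = 2 it gives VDD = 3Vt, so the optimum sits at a small multiple of Vt over the usual range of α.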

Page 101

EDP Considering Leakage
Previous model calls for VDD = Vt = 0!
Considering leakage, results are messy
Graphical: [Figure from Gonzalez97]

Page 102

Energy with Delay Constraint
This is the problem most designers face
No closed form solution
Pick point where delay and energy contours are tangent
At this point, leakage is about half of dynamic power
– [Markovic04]
– But the curve is fairly flat
– May choose lower leakage to save power during sleep mode

Page 103

Outline
Motivation
Device Models
Energy & Delay
Variation
– Sources
– Process Corners
– Statistical Analysis
– Impact Estimation
Low-Voltage Circuit Design with Variability

Page 104

Process Variation
Threshold Voltage
– Depends on placement of dopants in channel
– Standard deviation inversely proportional to the square root of channel area
Channel Length
– Systematic across-chip linewidth variation (ACLV)
– Random line edge roughness (LER)
Interconnect
– Etching variations affect w, s, h
[Bernstein06]
Figures courtesy Texas Instruments and Larry Pileggi

Page 105

Vt Variation
Avt = 1.0 – 2.5 mV * μm
– Might reduce to 0.4 mV * μm with device design
σvt = 26 mV for min-sized transistor in IBM 90 nm
– Gets worse with device scaling

Page 106

Spatial Distribution
Variations show spatial correlation
– Lot-to-lot (L2L)
– Wafer-to-wafer (W2W)
– Die-to-die (D2D) / inter-die
– Within-die (WID) / intra-die
Closer transistors match better
Courtesy M. Pelgrom

Page 107

Environmental Variation
Voltage
– VDD is usually designed +/- 10%
– Regulator error
– On-chip droop from switching activity
Temperature
– Ambient temperature ranges
– On-die temperature elevated by chip power consumption
Courtesy IBM
[Harris01b]

Page 108

Aging
Transistors change over time as they wear out
– Hot carriers
– Negative bias temperature instability
– Time-dependent dielectric breakdown
Causes threshold voltage changes
More on this later…

Page 109

Parameter Variation
Transistors have uncertainty in parameters
– Process: Leff, Vt, tox of nMOS and pMOS
– Vary around typical (T) values
Fast (F)
– Leff: short
– Vt: low
– tox: thin
Slow (S): opposite
Not all parameters are independent for nMOS and pMOS
[Figure: corner diagram with nMOS (slow to fast) on one axis and pMOS (slow to fast) on the other, marking the TT, FF, SS, SF and FS corners]

Page 110

Environmental Variation
VDD and T also vary in time and space
Fast:
– VDD: high
– T: low

Corner   Voltage (V)   Temperature (C)
F        1.1           0
T        1.0           70
S        0.9           125

Page 111

Process Corners
Process corners describe worst case variations
– If a design works in all corners, it will probably work for any variation.
Describe corner with four letters (T, F, S)
– nMOS speed
– pMOS speed
– Voltage
– Temperature

Page 112

Important Corners
Some critical simulation corners include

Purpose                nMOS   pMOS   VDD   Temp
Cycle time             S      S      S     S
Power                  F      F      F     F
Subthreshold leakage   F      F      F     S

Page 113

Monte Carlo Simulation
As process variation increases, the worst-case corners become too pessimistic for practical design
Monte Carlo: repeated simulations with parameters randomly varied each time
Look at scatter plot of results to predict yield
Ex: impact of Vt variation
– ON-current
– leakage
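The leakage case can be illustrated with a toy Monte Carlo run. This is a sketch, not a circuit simulation: it assumes Vt is normally distributed with mean 0.3 V and standard deviation 25 mV (the figures used in the zero-mean example later in the deck), and that Ioff changes by 10x per 100 mV of Vt, as in the typical 65 nm values on the Subthreshold Leakage slide. The sample count and seed are arbitrary.

```python
import random
import statistics

VT_MEAN = 0.300       # V, assumed mean threshold voltage
VT_SIGMA = 0.025      # V, assumed standard deviation
S = 0.100             # V/decade: Ioff falls 10x per 100 mV of Vt
IOFF_NOMINAL = 100.0  # nA/um at Vt = 0.3 V


def ioff_na_per_um(vt):
    """Leakage in nA/um for a given threshold voltage (exponential in Vt)."""
    return IOFF_NOMINAL * 10 ** (-(vt - VT_MEAN) / S)


rng = random.Random(1)   # fixed seed so the run is repeatable
samples = [ioff_na_per_um(rng.gauss(VT_MEAN, VT_SIGMA)) for _ in range(100_000)]

mean_ioff = sum(samples) / len(samples)
median_ioff = statistics.median(samples)

print(f"nominal Ioff = {IOFF_NOMINAL:.1f} nA/um")
print(f"median Ioff  = {median_ioff:.1f} nA/um")
print(f"mean Ioff    = {mean_ioff:.1f} nA/um")
```

Because leakage is exponential in Vt, a symmetric Vt distribution produces a skewed (lognormal) leakage distribution: the median stays near the nominal value while the mean is pulled upward by low-Vt samples.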

Page 114

Reliability
Hard Errors
– Oxide wearout
– Interconnect wearout
– Overvoltage failure
– Latchup
Soft Errors
Characterizing reliability
– Mean time between failures (MTBF)
  • # of devices x hours of operation / number of failures
– Failures in time (FIT)
  • # of failures / thousand hours / million devices
[Figure: failure rate vs. time, with Infant Mortality, Useful Operating Life and Wear Out regions]
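The two reliability metrics are related through device-hours. A minimal sketch of both definitions as given above; the test population (10,000 devices run for 1,000 hours with 2 failures) is a made-up illustration, not data from the slides.

```python
def mtbf_hours(n_devices, hours, n_failures):
    """Mean time between failures: device-hours of operation per failure."""
    return n_devices * hours / n_failures


def fit_rate(n_devices, hours, n_failures):
    """Failures in time: failures per thousand hours per million devices."""
    return n_failures / (hours / 1e3) / (n_devices / 1e6)


# Hypothetical test population, for illustration only
devices, test_hours, failures = 10_000, 1_000, 2

mtbf = mtbf_hours(devices, test_hours, failures)
fit = fit_rate(devices, test_hours, failures)

print(f"MTBF = {mtbf:.3g} hours")
print(f"FIT  = {fit:.3g}")
print(f"1e9 / MTBF = {1e9 / mtbf:.3g}")   # equals FIT, since 1 FIT = 1 failure per 1e9 device-hours
```

A thousand hours times a million devices is 10^9 device-hours, so FIT is simply 10^9 divided by the MTBF in hours.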

Page 115

Accelerated Lifetime Testing
Expected reliability typically exceeds 10 years
But products come to market in 1-2 years
Accelerated lifetime testing required to predict adequate long-term reliability
[Arnaud08]

Page 116

Hot Carriers
Electric fields across channel impart high energies to some carriers
– These “hot” carriers may be blasted into the gate oxide where they become trapped
– Accumulation of charge in oxide causes shift in Vt over time
– Eventually Vt shifts too far for devices to operate correctly
Choose VDD to achieve reasonable product lifetime
– Worst problems for inverters and NORs with slow input risetime and long propagation delays

Page 117

NBTI
Negative bias temperature instability
Electric field applied across oxide forms dangling bonds called traps at Si-SiO2 interface
Accumulation of traps causes Vt shift
Most pronounced for pMOS transistors with strong negative bias (Vg = 0, Vs = VDD) at high temperature

Page 118

TDDB
Time-dependent dielectric breakdown
– Gradual increase in gate leakage when an electric field is applied across an oxide
– a.k.a. stress-induced leakage current
For 10-year life at 125 C, keep Eox below ~0.7 V/nm

Page 119

Soft Errors
In 1970’s, DRAMs were observed to randomly flip bits
– Ultimately linked to alpha particles and cosmic ray neutrons
Collisions with atoms create electron-hole pairs in substrate
– These carriers are collected on p-n junctions, disturbing the voltage
[Baumann05]

Page 120

Radiation Hardening
Radiation hardening reduces soft errors
– Increase node capacitance to minimize impact of collected charge
– Or use redundancy
– E.g. dual-interlocked cell
Error-correcting codes
– Correct for soft errors that do occur

Page 121

Statistical Analysis
Probability Density Function (PDF): f(x)
– Probability that random variable X is in a range
Cumulative Distribution Function (CDF): F(x)
– Probability that X < x

Page 122

Mean and Standard Dev
Mean: average value of X
Standard deviation: how far X varies from the mean

Page 123

Uniform Random Variable

Page 124

Normal Random Variable
Gaussian

Page 125

Lognormal RV
Exponential of a normal variable
If Y is normal with μ and σ, X = e^Y is lognormal

Page 126

Normal CDF

Page 127

Zero Mean Random Var
Look at variations from the mean
Xv is a zero-mean random variable
– We’ll focus on these
Ex:
– If Vt has a mean of 0.3 V and standard dev of 0.025 V, it can be written as
– Vt = 0.3 + 0.025 X, where X is a standard normal variable (zero mean, unit standard deviation)

Page 128

Independent / Dependent
Independent Random Variables
– Vt variation from random dopant fluctuations (RDF)
Dependent Random Variables
– ACLV for nearby devices

Sum of Random Vars
Assuming independent random variables
– Mean is sum of means
– Variance is sum of variances
Central Limit Theorem
– The sum of a large number of independent RVs approaches a normal RV

Maximum of RVs
Maximum of N independent standard normal RVs:
– Not normal, but can be found in a table

Implication for Critical Paths
Longest paths form a wall with a tighter distribution

Example
A large chip has 100 paths that are nearly all critical. Each path has 25 gates. Each gate has a normally distributed delay with a mean of 16 ps and a standard deviation of 4 ps. What is the mean clock period and the standard deviation of this period? What period should be set so 97.7% of chips meet timing?
Path Mean Delay: 25 * 16 = 400 ps
Path Std Deviation: sqrt(25) * 4 = 20 ps
Max of 100 standard normal RVs has
– Mean: 2.50
– Sigma: 0.43
Clock period:
– Mean: 400 + 2.50 * 20 = 450 ps
– Standard Deviation: 0.43 * 20 = 8.6, or about 9 ps
Tc: 450 + 2 * 9 = 468 ps
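The tabulated mean and sigma for the maximum of N standard normals can be cross-checked numerically. The sketch below is not from the slides: it integrates the CDF of the maximum, $\Phi(x)^N$, on a fixed grid (the function names, grid limits, and step count are illustrative choices) and then repeats the clock-period arithmetic of the example.

```python
import math

def normal_cdf(x):
    """CDF of a standard normal RV."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def max_normal_stats(n, lo=-8.0, hi=8.0, steps=16000):
    """Mean and std dev of the max of n independent standard normals.

    The CDF of the max is normal_cdf(x) ** n; probability mass is
    accumulated cell by cell using the midpoint of each grid cell.
    """
    dx = (hi - lo) / steps
    mean = 0.0
    second_moment = 0.0
    prev_cdf = normal_cdf(lo) ** n
    for i in range(1, steps + 1):
        x = lo + i * dx
        cdf = normal_cdf(x) ** n
        mass = cdf - prev_cdf          # probability of landing in this cell
        mid = x - 0.5 * dx
        mean += mid * mass
        second_moment += mid * mid * mass
        prev_cdf = cdf
    return mean, math.sqrt(second_moment - mean * mean)

def clock_period(n_paths, gates, gate_mean, gate_sigma, k_sigma=2.0):
    """Mean, sigma, and k-sigma clock period for n_paths critical paths."""
    path_mean = gates * gate_mean
    path_sigma = math.sqrt(gates) * gate_sigma   # independent gate delays
    m, s = max_normal_stats(n_paths)
    tc_mean = path_mean + m * path_sigma
    tc_sigma = s * path_sigma
    return tc_mean, tc_sigma, tc_mean + k_sigma * tc_sigma

if __name__ == "__main__":
    print(max_normal_stats(100))
    print(clock_period(100, 25, 16.0, 4.0))
```

The results should land close to the tabulated 2.50 and 0.43, and the clock period close to the 468 ps above; the small difference comes from rounding sigma to 9 ps in the hand calculation.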

Yield
Y: Yield, the fraction of chips that work
X: Failure probability = 1 – Y
If a system is built with N components, each with yield $Y_c$
– System Yield: $Y_s = Y_c^N$

Defect Density
D: Defects / unit area
M components per unit area
If defects are randomly distributed and independent
– $X_c = D/M$
System with area A has yield
– $Y_s = (1 - D/M)^{MA}$
In the limit that M approaches infinity: Poisson
– $Y_s = e^{-DA}$
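A quick numerical check of the Poisson limit, as a sketch. The function names and the sample numbers (D = 0.5 defects per unit area, A = 2 unit areas) are hypothetical and chosen only for illustration.

```python
import math

def system_yield(component_yield, n_components):
    """Ys = Yc^N for N independent components."""
    return component_yield ** n_components

def area_yield(defect_density, area, comps_per_area):
    """Yield of a system of the given area with M components per unit area."""
    xc = defect_density / comps_per_area      # Xc = D / M
    n = comps_per_area * area                 # N = M * A
    return system_yield(1.0 - xc, n)

def poisson_yield(defect_density, area):
    """Limit of area_yield as M approaches infinity."""
    return math.exp(-defect_density * area)

if __name__ == "__main__":
    for m in (10, 1000, 1000000):
        print(m, area_yield(0.5, 2.0, m))
    print("limit", poisson_yield(0.5, 2.0))
```

As M grows, the finite-M yield approaches the exponential from below.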

Variation Sensitivity
ON and OFF Current model
Differentiate wrt. L and Vt to find sensitivity to variation

Examples
10% change in Le causes Ion and Ioff to change by
– 10%
If α = 1.3, n = 1.6, VDD = 1.0, Vt = 0.3, a 10 mV change in Vt causes
– Ion changes by 1.8%
– Ioff changes by 23%
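The current model itself did not survive in this transcript, so the sketch below assumes a common form: an alpha-power law $I_{on} \propto (V_{DD} - V_t)^\alpha / L_e$ and a subthreshold law $I_{off} \propto e^{-V_t/(n v_T)} / L_e$, with a thermal voltage of 26 mV. Those forms, the 26 mV value, and the function names are assumptions, not the slide's equations.

```python
def ion_sensitivity_to_vt(alpha, vdd, vt, delta_vt):
    """First-order fractional change in Ion for a small Vt shift.

    Assumes Ion is proportional to (VDD - Vt)^alpha.
    """
    return alpha * delta_vt / (vdd - vt)

def ioff_sensitivity_to_vt(n, delta_vt, thermal_voltage=0.026):
    """First-order fractional change in Ioff for a small Vt shift.

    Assumes Ioff is proportional to exp(-Vt / (n * vT)).
    """
    return delta_vt / (n * thermal_voltage)

if __name__ == "__main__":
    print(ion_sensitivity_to_vt(1.3, 1.0, 0.3, 0.010))
    print(ioff_sensitivity_to_vt(1.6, 0.010))
```

Under these assumptions the ON-current sensitivity comes out just under 2%, in line with the figure above, and the OFF-current sensitivity is roughly a quarter, near the quoted 23%; the exact OFF-current number depends on the thermal voltage (temperature) used.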

Monte Carlo Example
L has σ/μ = 0.04
Vt has σ = 25 mV
OFF current changes by 6x while ON current changes by 40%
Some correlation

Delay Variation
Gate delay varies inversely with ON current
Path delay depends on correlations in gate delay
– ACLV is strongly correlated among nearby gates
– RDF is uncorrelated
– Variance reduces for uncorrelated paths

Delay Example
A path contains 16 2-input gates, each with a 20 ps nominal delay. Suppose Le has a 2% standard deviation from ACLV and Vt has a 25 mV standard deviation from RDF. Estimate the standard deviation in path delay.
Nominal path delay: 16 x 20 = 320 ps
Transistors on the critical path: 24
Le causes 2% correlated delay: 6.4 ps
Vt causes 4.6% std. dev in Ion per gate
– But only 4.6% / sqrt(24) = 0.95% = 3.0 ps in path
Total std. dev = sqrt(6.4^2 + 3.0^2) = 7.1 ps = 2.2%
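The arithmetic of this example can be written out as a small sketch. The function name and argument names are illustrative; the 4.6% per-gate figure is taken from the example as given rather than derived.

```python
import math

def path_delay_sigma(n_gates, gate_delay_ps, le_sigma_frac,
                     ion_sigma_frac, n_transistors):
    """Combine correlated (ACLV) and uncorrelated (RDF) delay variation.

    Correlated variation scales the whole path; uncorrelated variation
    averages down by sqrt(number of transistors on the path).
    """
    nominal = n_gates * gate_delay_ps
    correlated = le_sigma_frac * nominal
    uncorrelated = ion_sigma_frac / math.sqrt(n_transistors) * nominal
    total = math.hypot(correlated, uncorrelated)   # root-sum-square
    return nominal, correlated, uncorrelated, total

if __name__ == "__main__":
    nominal, corr, uncorr, total = path_delay_sigma(16, 20.0, 0.02, 0.046, 24)
    print(nominal, corr, uncorr, total, total / nominal)
```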

Delay Observations
Critical paths tend to form a wall 2-3 standard deviations above the mean
Short pipeline stages suffer because of less averaging

Example
A microprocessor has D2D variation of 9% and WID variation of 3% on several critical paths. If the nominal clock period is T without considering variation and the chip has nearly 1000 critical paths, what clock period should be selected to ensure a parametric yield of 97.7%? Neglect clock skew.
Max over 1000 paths of WID variation:
– Mean = 3% x 3.24 = 9.7% above nominal
– Std Dev = 3% x 0.35 = 1.05%
Total Stdev = RMS(9%, 1.05%) = 9.06%
For 97.7% yield, allow 2 std devs: T x (1 + 9.7% + 2 x 9.06%) = 1.28T
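The same calculation as a sketch, using the tabulated mean (3.24) and sigma (0.35) for the maximum of 1000 standard normals quoted in the example. The function name is illustrative.

```python
import math

def clock_period_factor(d2d_sigma, wid_sigma, max_mean, max_sigma, k_sigma=2.0):
    """Clock period as a multiple of the nominal period T.

    WID variation over many critical paths forms a wall whose mean and
    sigma follow the max-of-normals table; D2D variation shifts all paths
    on a die together, so it adds in root-sum-square with the wall sigma.
    """
    wall_mean = wid_sigma * max_mean
    wall_sigma = wid_sigma * max_sigma
    total_sigma = math.hypot(d2d_sigma, wall_sigma)
    return 1.0 + wall_mean + k_sigma * total_sigma

if __name__ == "__main__":
    print(clock_period_factor(0.09, 0.03, 3.24, 0.35))
```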

Energy
Variation has little effect on dynamic energy
– Systematic variation is relatively small
– Random variation averages out across many paths
Strong impact on leakage
– Exponential sensitivity to Vt
Shifts minimum E, EDP to higher VDD, Vt
– Increases energy and delay

Systematic Leakage
For 3-sigma yield, accept 3 sigma of systematic Vt variation

Random Leakage
Random dopant fluctuations are uncorrelated but likely to have a greater standard deviation
Average across many gates
Depends on the mean of the log-normal leakage distribution
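Why the mean of the lognormal matters can be seen from a standard identity. Using the notation of the Lognormal RV slide (Y normal with μ and σ, X = e^Y), and assuming leakage depends exponentially on a normally distributed Vt, the mean and median of X are:

```latex
X = e^{Y},\quad Y \sim \mathcal{N}(\mu, \sigma^2)
\;\Rightarrow\;
\mathrm{E}[X] = e^{\mu + \sigma^2/2},
\qquad
\mathrm{median}(X) = e^{\mu}
```

Averaged over many gates, leakage therefore tends toward the mean, which exceeds the median by a factor of $e^{\sigma^2/2}$, so larger random variation raises total leakage even though it averages out.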

Best EDP
Effect of temperature and Vt variation
– Best EDP when leakage is 1/3 of total energy
– If leakage increases, move to higher VDD
[Figure panels: No Var, Var]

Sensitive Circuits
SRAM
Matched circuits (e.g. sense amplifiers)
Circuits with races or matched delays
Ratioed circuits (e.g. pseudo-nMOS)
Keepers
Subthreshold circuits

Sense Amp Example
The sense amp offset voltage is normally distributed with a standard deviation of 10 mV. If a memory contains 4096 sense amps and the chip should have 99.9% parametric yield, how much offset voltage must it tolerate?
Ys = 0.999
N = 4096
$Y_c = Y_s^{1/N} = 0.99999976$
This is about 5 standard deviations
– Tolerate 50 mV offset
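The step from component yield to "about 5 standard deviations" can be sketched with the standard library's normal distribution. Whether the failure tail is counted on one side or both sides of the distribution is an assumption here (offset is signed, so the two-sided case is the default); the function names are illustrative.

```python
from statistics import NormalDist

def component_yield(system_yield, n):
    """Yc such that Yc^N = Ys."""
    return system_yield ** (1.0 / n)

def required_sigmas(system_yield, n, two_sided=True):
    """Number of standard deviations each component must tolerate."""
    xc = 1.0 - component_yield(system_yield, n)   # component failure probability
    tail = xc / 2.0 if two_sided else xc
    return -NormalDist().inv_cdf(tail)

if __name__ == "__main__":
    k = required_sigmas(0.999, 4096)
    print(component_yield(0.999, 4096), k, k * 10.0, "mV")
```

Either way the answer is a little over 5 sigma, consistent with the 50 mV estimate above.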

Caveats
We've made many assumptions
– Independence or dependence of RVs
– Normal distribution
These assumptions are seldom quite true
– Especially when examining long tails
Nevertheless useful
– Qualitative understanding of system behavior
– Back of the envelope estimates
– Understand key parameters
Confirm estimates through simulation, or just build it!

Variation Tolerance
Adaptive Control
– Adaptive body bias
  • Compensate for systematic D2D Vt variation
  • Reduce spread in leakage, increase speed
– Adaptive voltage scaling
  • Reduces speed/power spread from corners
– Temperature Sensing
Fault Tolerance
– Spares
– Error detection and correction

Spares
Provide spare parts (e.g. extra cores or cache)
Probability that a system with N components has r defective components is
– $P_r = \binom{N}{r} X_c^r Y_c^{N-r}$
If up to r defects can be repaired with spares
– $Y_s = \sum_{i=0}^{r} P_i$
If N is large, consider defects / unit area D
– $Y_s = \sum_{i=0}^{r} \frac{(DA)^i}{i!} e^{-DA}$

Example
If each core in a 16-core processor has a yield of 90%, what is the system yield? How would it improve if 2 spares were available?
Without spares: $Y_s = (0.9)^{16} = 18.5\%$
With spares: $Y_s = (0.9)^{18} + 18 \times (0.9)^{17} \times (0.1) + \frac{18 \times 17}{2} \times (0.9)^{16} \times (0.1)^2 = 73.4\%$
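The spares example is a binomial sum, which can be sketched in a few lines. The function name and parameter names are illustrative.

```python
from math import comb

def yield_with_spares(n_needed, n_spares, component_yield):
    """System yield when up to n_spares defective components can be replaced.

    Sums the binomial probability of 0, 1, ..., n_spares defects among
    n_needed + n_spares independent components.
    """
    n_total = n_needed + n_spares
    x = 1.0 - component_yield          # component failure probability
    return sum(
        comb(n_total, r) * x**r * component_yield**(n_total - r)
        for r in range(n_spares + 1)
    )

if __name__ == "__main__":
    print(yield_with_spares(16, 0, 0.9))   # no spares
    print(yield_with_spares(16, 2, 0.9))   # two spare cores
```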

Outline
Motivation
Device Models
Energy & Delay
Variation
Low-Voltage Circuit Design with Variability
– SRAM
– Sequencing Elements

Array Architecture
$2^n$ words of $2^m$ bits each
If n >> m, fold by $2^k$ into fewer rows of more columns
Good regularity – easy to design
Very high density if good cells are used

6T SRAM Cell
Cell size accounts for most of array size
– Reduce cell size at expense of complexity
6T SRAM Cell
– Used in most commercial chips
– Data stored in cross-coupled inverters
Read:
– Precharge bit, bit_b
– Raise wordline
Write:
– Drive data onto bit, bit_b
– Raise wordline
[Figure: 6T SRAM cell schematic with word, bit, bit_b]

SRAM Read
Precharge both bitlines high
Then turn on wordline
One of the two bitlines will be pulled down by the cell
Ex: A = 0, A_b = 1
– bit discharges, bit_b stays high
– But A bumps up slightly
Read stability
– A must not flip
– N1 >> N2
[Figure: 6T cell schematic with transistors N1–N4, P1, P2; nodes A, A_b; lines bit, bit_b, word]

SRAM Write
Drive one bitline high, the other low
Then turn on wordline
Bitlines overpower cell with new value
Ex: A = 0, A_b = 1, bit = 1, bit_b = 0
– Force A_b low, then A rises high
Writability
– Must overpower feedback inverter
– N2 >> P1
[Figure: 6T cell schematic with transistors N1–N4, P1, P2; nodes A, A_b; lines bit, bit_b, word]

SRAM Sizing
High bitlines must not overpower inverters during reads
But low bitlines must write new value into cell
[Figure: 6T cell with transistors labeled weak, med, and strong; nodes A, A_b; lines bit, bit_b, word]

SRAM Column Example
[Figure: Read and Write column circuits. Each shows bitline conditioning clocked by φ2, an SRAM cell selected by word_q1, and more cells on the bitlines (bit_v1f, bit_b_v1f for read; bit_v1, bit_b_v1 for write). The read column produces out_v1r and out_b_v1r; the write column is driven by data_s1 and write_q1. Waveforms: φ1, φ2, word_q1, bit_v1f, out_v1r]

SRAM Layout
Cell size is critical: 26 x 45 λ (even smaller in industry)
Tile cells sharing VDD, GND, bitline contacts
[Figure: cell layout showing VDD, GND, BIT, BIT_B, WORD, and the cell boundary]

Thin Cell
In nanometer CMOS
– Avoid bends in polysilicon and diffusion
– Orient all transistors in one direction
Lithographically friendly or thin cell layout fixes this
– Also reduces length and capacitance of bitlines

Commercial SRAMs
Five generations of Intel SRAM cell micrographs
– Transition to thin cell at 65 nm
– Steady scaling of cell area

Cell Stability
Cell constraints
– Hold (at lower standby voltage)
– Readability
– Writability

Hold Margin
Cell must hold value while idle
– Even if VDD is low for standby
How much noise could be added before the cell flips state?
Butterfly Diagrams
– Plot V1 vs. V2 and V2 vs. V1
– Symmetric
– Square inscribed in diagram indicates hold margin

Read Margin
Avoid disturb during read
Bitlines are initially at VDD
Read margin is smaller than hold margin

Write Margin
Write should flip the state of the cell
If curves overlap, cell may be unwritable

Variability
Variability breaks symmetry of butterfly diagrams
– Reduces noise margins
– If margins become negative, cell is definitely unusable
– Even cells with small positive margin might be unreliable due to noise

Example
Suppose cells in a 64 Mb SRAM have normally distributed read margins with a 15 mV standard deviation. What must the mean read margin be to achieve 90% parametric yield?
Cell failure probability:
– $X_c = 1 - Y^{1/N} = 1 - 0.9^{1/2^{26}} = 1.6 \times 10^{-9}$
Requires 6σ reliability
Thus read margin should be > 90 mV
Caveats:
– independence, normality, chip yield, point defects
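A sketch of this calculation, assuming a cell fails when its read margin falls below zero (a one-sided tail, which is an assumption of this sketch) and that cells are independent and normal, as the caveats note. Function names are illustrative.

```python
import math
from statistics import NormalDist

def cell_failure_probability(chip_yield, n_cells):
    """Xc = 1 - Y^(1/N), computed with expm1 to avoid cancellation."""
    return -math.expm1(math.log(chip_yield) / n_cells)

def required_mean_margin(chip_yield, n_cells, sigma_mv):
    """Return (Xc, number of sigmas, mean margin in mV) for the target yield."""
    xc = cell_failure_probability(chip_yield, n_cells)
    k = -NormalDist().inv_cdf(xc)      # one-sided: margin must stay above 0
    return xc, k, k * sigma_mv

if __name__ == "__main__":
    print(required_mean_margin(0.9, 2**26, 15.0))
```

The result should come out slightly under 6 sigma, which the example rounds up to 6σ and 90 mV.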

Monte Carlo Simulation
Simulating 5-7 σ reliability is very time consuming
– HSPICE results may be dubious anyway
Run smaller simulations to find the distribution
– Fit curve to tail
– Be conservative in case tail is not normal
Or use simulation techniques to explore the tail
– Mixture Importance Sampling [Kanj06]
– Statistical Blockade [Singhee09, Wang10]

Tail Fit
Tails of a wide class of distributions look exponential
– Fit a quasiempirical exponential tail to measured or Monte Carlo data [Keller10]
Order data points X1 … Xn to get empirical CDF
Replace last k points with exponential
– Keep the same expected value
[Equations: exponential tail CDF F(x) beyond X_{n-k} and its fitted parameter in terms of X_{n-k} … X_n; garbled in extraction]

Low Power SRAMs
SRAM accounts for much of chip area and power
Power reduction techniques
– Overall: low VDD
  • Vmin limited by read and write margins
  • ~0.7 V for 6T in 90 nm process, scaling upward
– Dynamic: activate only necessary subarrays
– Static: sleep mode to reduce leakage
  • Limited by hold margin
Improving margins would allow more dynamic voltage scaling and lower voltage sleep

Density / Vmin tradeoff
Ratioed transistors reduce Vmin
– Npulldown > Naccess > Ppullup
Example: Intel 65 nm process
– High performance cell (> min size devices)
  • Vmin = 0.7 V
  • Vstandby = 0.6 V
– High density cell (minimum size devices)
  • Vmin = 1.1 V
  • Vstandby = 1.0 V
  • 44% greater density

Read Assist
Improve read margin
– Pulse wordline or bitline briefly to exploit dynamic noise margins greater than static margins [Khellah06]
– Lower wordline voltage [Ohbayashi07, Yabuuchi07]
– Raise VDD during reads

Write Assist
Improve write margin
– Drive bitline to a negative voltage
– Raise wordline voltage [Morita06]
– Float cell GND during writes [Yamaoka04]
– Float cell VDD during writes [Yamaoka06]
– Lower cell VDD during writes [Zhang06, Ohbayashi07]

Leakage Control
Minimize leakage during standby
– Reduce Vds, negative Vgs, or negative Vbs

Partial Power Gating
Use weaker bias device to allow partial VDDV collapse during sleep mode
~200 mV reduction cuts subthreshold leakage 2x
– Nearly eliminates gate & junction leakage
Turn power gate on ahead of expected memory access

8T SRAM Cell
Eliminates read stability and ratio issues
Operates at a lower voltage (~0.7 V in 45 nm)
Dual ported operation
30% cell area penalty on 45 nm Core processors

Subthreshold 10T SRAM
Improve low voltage read/write
Stack effect reduces leakage onto rbl when not reading
– Allows more cells/bitline
Float VDDV during write
– Improves write margin

Sequencing Elements
Sequencing
Max and Min-Delay
Time Borrowing
Clock Skew
Resilient Sequencing Elements

Sequencing
Combinational logic
– output depends on current inputs
Sequential logic
– output depends on current and previous inputs
– Requires separating previous, current, future
– Called state or tokens
– Ex: FSM, pipeline
[Figure: Finite State Machine and Pipeline, each built from combinational logic (CL) and elements clocked by clk]

Sequencing Cont.
If tokens moved through pipeline at constant speed, no sequencing elements would be necessary
Ex: fiber-optic cable
– Light pulses (tokens) are sent down cable
– Next pulse sent before first reaches end of cable
– No need for hardware to separate pulses
– But dispersion sets min time between pulses
This is called wave pipelining in circuits
In most circuits, dispersion is high
– Delay fast tokens so they don't catch slow ones.

Sequencing Overhead
Use flip-flops to delay fast tokens so they move through exactly one stage each cycle.
Inevitably adds some delay to the slow tokens
Makes circuit slower than just the logic delay
– Called sequencing overhead
Some people call this clocking overhead
– But it applies to asynchronous circuits too
– Inevitable side effect of maintaining sequence

Sequencing Elements
Latch: Level sensitive
– a.k.a. transparent latch, D latch
Flip-flop: edge triggered
– a.k.a. master-slave flip-flop, D flip-flop, D register
Timing Diagrams
– Transparent
– Opaque
– Edge-trigger
[Figure: latch and flop symbols (D, Q, clk) with timing diagram of clk, D, Q (latch), Q (flop)]

Sequencing Methods
Flip-flops
2-Phase Latches
Pulsed Latches
[Figure: the three methods, each with combinational logic between sequencing elements. Flip-flops: flops clocked by clk, period Tc. 2-phase transparent latches: latches clocked by φ1 and φ2, half-cycles of Tc/2 separated by tnonoverlap. Pulsed latches: latches clocked by φp with pulse width tpw]

Timing Diagrams
Contamination and Propagation Delays
tpd     Logic Prop. Delay
tcd     Logic Cont. Delay
tpcq    Latch/Flop Clk->Q Prop. Delay
tccq    Latch/Flop Clk->Q Cont. Delay
tpdq    Latch D->Q Prop. Delay
tcdq    Latch D->Q Cont. Delay
tsetup  Latch/Flop Setup Time
thold   Latch/Flop Hold Time
[Figure: timing diagrams for combinational logic (A to Y), a flop, and a latch, annotated with these delays]

Max-Delay: Flip-Flops
$t_{pd} \le T_c - (t_{setup} + t_{pcq})$
– $t_{setup} + t_{pcq}$ is the sequencing overhead
[Figure: flops F1 and F2 around combinational logic; waveforms of clk, Q1, D2 showing tpcq, tpd, tsetup within Tc]

Max-Delay: 2-Phase Latches
$t_{pd} = t_{pd1} + t_{pd2} \le T_c - 2 t_{pdq}$
– $2 t_{pdq}$ is the sequencing overhead
[Figure: latches L1 (φ1), L2 (φ2), L3 (φ1) around Combinational Logic 1 and 2; waveforms of φ1, φ2, D1, Q1, D2, Q2, D3 showing tpdq1, tpd1, tpdq2, tpd2 within Tc]

Max-Delay: Pulsed Latches
$t_{pd} \le T_c - \max(t_{pdq},\ t_{pcq} + t_{setup} - t_{pw})$
– the max term is the sequencing overhead
[Figure: latches L1 and L2 clocked by φp around combinational logic; waveforms for (a) tpw > tsetup and (b) tpw < tsetup]
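The three max-delay constraints can be compared side by side with a small sketch. The function names and the sample timing numbers (in ps) are hypothetical and not taken from the slides.

```python
def max_tpd_flops(tc, tsetup, tpcq):
    """Max logic propagation delay between flip-flops."""
    return tc - (tsetup + tpcq)

def max_tpd_two_phase(tc, tpdq):
    """Max total logic delay (tpd1 + tpd2) per cycle with 2-phase latches."""
    return tc - 2 * tpdq

def max_tpd_pulsed(tc, tpdq, tpcq, tsetup, tpw):
    """Max logic delay between pulsed latches.

    With a wide pulse the D->Q delay dominates the overhead; with a
    narrow pulse the latch behaves like a flop (tpcq + tsetup - tpw).
    """
    return tc - max(tpdq, tpcq + tsetup - tpw)

if __name__ == "__main__":
    tc, tsetup, tpcq, tpdq = 500, 65, 50, 40   # hypothetical values, ps
    print("flops:       ", max_tpd_flops(tc, tsetup, tpcq))
    print("2-phase:     ", max_tpd_two_phase(tc, tpdq))
    print("pulsed wide: ", max_tpd_pulsed(tc, tpdq, tpcq, tsetup, tpw=150))
    print("pulsed thin: ", max_tpd_pulsed(tc, tpdq, tpcq, tsetup, tpw=30))
```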

Min-Delay: Flip-Flops
$t_{cd} \ge t_{hold} - t_{ccq}$
[Figure: flops F1 and F2 around CL; waveforms of clk, Q1, D2 showing tccq, tcd, thold]

Min-Delay: 2-Phase Latches
$t_{cd1}, t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap}$
Hold time reduced by nonoverlap
Paradox: hold applies twice each cycle, vs. only once for flops.
But a flop is made of two latches!
[Figure: latches L1 (φ1) and L2 (φ2) around CL; waveforms of φ1, φ2, Q1, D2 showing tnonoverlap, tccq, tcd, thold]

Min-Delay: Pulsed Latches
$t_{cd} \ge t_{hold} - t_{ccq} + t_{pw}$
Hold time increased by pulse width
[Figure: latches L1 and L2 clocked by φp around CL; waveforms of φp, Q1, D2 showing tpw, tccq, tcd, thold]
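A matching sketch for the min-delay constraints. The function names and the sample numbers (in ps) are hypothetical. A negative result means the constraint is met even with no logic between the elements.

```python
def min_tcd_flops(thold, tccq):
    """Min logic contamination delay between flip-flops."""
    return thold - tccq

def min_tcd_two_phase(thold, tccq, tnonoverlap):
    """Min contamination delay per half-cycle with 2-phase latches."""
    return thold - tccq - tnonoverlap

def min_tcd_pulsed(thold, tccq, tpw):
    """Min contamination delay between pulsed latches."""
    return thold - tccq + tpw

if __name__ == "__main__":
    thold, tccq = 30, 35                       # hypothetical values, ps
    print("flops:  ", min_tcd_flops(thold, tccq))
    print("2-phase:", min_tcd_two_phase(thold, tccq, tnonoverlap=60))
    print("pulsed: ", min_tcd_pulsed(thold, tccq, tpw=150))
```

With these numbers only the pulsed-latch case needs padding logic, which reflects the "hold time increased by pulse width" observation above.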

Page 191: Variability & Low Voltage Ci it D iCircuit Design · 2018-06-25 · Power Consumption Power consumption limits chip performance today – Chips can do more computation than we can

Time Borrowing
In a flop-based system:
– Data launches on one rising edge
– Must set up before next rising edge
– If it arrives late, system fails
– If it arrives early, time is wasted
– Flops have hard edges
In a latch-based system:
– Data can pass through latch while transparent
– Long cycle of logic can borrow time into next
– As long as each loop completes in one cycle


Time Borrowing Example

[Figure: (a) three latches clocked by φ1, φ2, φ1 with combinational logic between them, showing borrowing time across a half-cycle boundary and across a pipeline stage boundary; (b) two latches clocked by φ1 and φ2 with combinational logic forming a loop]

Loops may borrow time internally but must complete within the cycle
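The borrowing behaviour can be sketched with a very idealized model: latch i opens at i half-cycles and closes one half-cycle later, and latch delay, setup time, nonoverlap, and skew are all assumed to be zero. The function name and the delays are hypothetical.

```python
def borrowed_per_latch(half_cycle, logic_delays):
    """Propagate data through an idealized 2-phase latch pipeline.

    Returns how much time each logic block borrows from the next half-cycle.
    Raises ValueError if data arrives after the receiving latch has closed.
    """
    t = 0  # data departs the first latch at time 0
    borrowed = []
    for i, delay in enumerate(logic_delays, start=1):
        t += delay                       # arrival time at latch i
        opens = i * half_cycle
        closes = opens + half_cycle
        if t > closes:
            raise ValueError(f"logic block {i} arrives after latch closes")
        borrowed.append(max(t - opens, 0))
        t = max(t, opens)                # early data waits, late data flows through
    return borrowed

# Half-cycle of 500 ps; the first block is 50 ps too long and borrows from the second.
result = borrowed_per_latch(500, [550, 400, 500])
print(result)
```

In this assumed example the 550 ps block borrows 50 ps, and the shorter 400 ps block that follows absorbs it, so later stages see no borrowing.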


How Much Borrowing?

2-Phase Latches

$$t_{borrow} \le \frac{T_c}{2} - \left(t_{setup} + t_{nonoverlap}\right)$$

Pulsed Latches

$$t_{borrow} \le t_{pw} - t_{setup}$$

[Figure: latches L1 (φ1) and L2 (φ2) around Combinational Logic 1, with D1, Q1, D2, Q2; waveforms marking Tc, Tc/2, the nominal half-cycle 1 delay, tborrow, tnonoverlap, and tsetup]
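A quick comparison of the two borrowing limits, as a sketch with assumed picosecond values; the function names are illustrative.

```python
def max_borrow_two_phase(t_c, t_setup, t_nonoverlap):
    """t_borrow <= T_c/2 - (t_setup + t_nonoverlap)."""
    return t_c / 2 - (t_setup + t_nonoverlap)

def max_borrow_pulsed(t_pw, t_setup):
    """t_borrow <= t_pw - t_setup."""
    return t_pw - t_setup

# Hypothetical values in picoseconds.
two_phase = max_borrow_two_phase(t_c=1000, t_setup=40, t_nonoverlap=20)
pulsed = max_borrow_pulsed(t_pw=80, t_setup=40)
print(two_phase, pulsed)
```

With these numbers a 2-phase latch can borrow nearly half a cycle, while a pulsed latch can borrow only what the pulse width leaves after setup.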


Clock Skew
We have assumed zero clock skew
Clocks really have uncertainty in arrival time
– Decreases maximum propagation delay
– Increases minimum contamination delay
– Decreases time borrowing


Skew: Flip-Flops

$$t_{pd} \le T_c - \underbrace{\left(t_{pcq} + t_{setup} + t_{skew}\right)}_{\text{sequencing overhead}}$$

$$t_{cd} \ge t_{hold} - t_{ccq} + t_{skew}$$

[Figure: flip-flops F1 and F2 clocked by clk with combinational logic between Q1 and D2; max-delay waveforms marking Tc, tskew, tpcq, tpd, and tsetup; min-delay waveforms marking tskew, tccq, tcd, and thold]
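The effect of skew on a flip-flop path can be sketched by evaluating both bounds with and without skew. The function name and the picosecond values are assumptions.

```python
def flop_bounds(t_c, t_pcq, t_setup, t_hold, t_ccq, t_skew):
    """Return (max t_pd, min t_cd) for a flip-flop path with clock skew."""
    max_t_pd = t_c - (t_pcq + t_setup + t_skew)
    min_t_cd = t_hold - t_ccq + t_skew
    return max_t_pd, min_t_cd

# Hypothetical values in picoseconds.
common = dict(t_c=1000, t_pcq=50, t_setup=40, t_hold=30, t_ccq=25)
no_skew = flop_bounds(t_skew=0, **common)
with_skew = flop_bounds(t_skew=30, **common)
print(no_skew, with_skew)
```

Skew hurts on both sides: it shrinks the window for logic and raises the contamination delay needed to avoid hold failures.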


Skew: Latches

2-Phase Latches

$$t_{pd} \le T_c - \underbrace{2 t_{pdq}}_{\text{sequencing overhead}}$$

$$t_{cd1},\ t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap} + t_{skew}$$

$$t_{borrow} \le \frac{T_c}{2} - \left(t_{setup} + t_{nonoverlap} + t_{skew}\right)$$

Pulsed Latches

$$t_{pd} \le T_c - \underbrace{\max\left(t_{pdq},\ t_{pcq} + t_{setup} - t_{pw} + t_{skew}\right)}_{\text{sequencing overhead}}$$

$$t_{cd} \ge t_{hold} + t_{pw} - t_{ccq} + t_{skew}$$

$$t_{borrow} \le t_{pw} - \left(t_{setup} + t_{skew}\right)$$

[Figure: latches L1 (φ1), L2 (φ2), and L3 (φ1) with Combinational Logic 1 and 2 between them; signals D1, Q1, D2, Q2, D3, Q3]
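The skew tolerance of latches shows up when the max-delay and borrowing bounds are evaluated for several skew values. This is a sketch with assumed picosecond values; the function names are illustrative.

```python
def two_phase_limits(t_c, t_pdq, t_setup, t_nonoverlap, t_skew):
    """Return (max t_pd, max t_borrow) for 2-phase transparent latches."""
    max_t_pd = t_c - 2 * t_pdq  # skew does not appear in the max-delay bound
    max_borrow = t_c / 2 - (t_setup + t_nonoverlap + t_skew)
    return max_t_pd, max_borrow

def pulsed_limits(t_c, t_pdq, t_pcq, t_setup, t_pw, t_skew):
    """Return (max t_pd, max t_borrow) for pulsed latches."""
    max_t_pd = t_c - max(t_pdq, t_pcq + t_setup - t_pw + t_skew)
    max_borrow = t_pw - (t_setup + t_skew)  # negative means no borrowing is possible
    return max_t_pd, max_borrow

# Hypothetical values in picoseconds.
for skew in (0, 30, 60):
    tp = two_phase_limits(1000, t_pdq=60, t_setup=40, t_nonoverlap=20, t_skew=skew)
    pl = pulsed_limits(1000, t_pdq=60, t_pcq=50, t_setup=40, t_pw=80, t_skew=skew)
    print(skew, tp, pl)
```

In this example the 2-phase max-delay bound never moves with skew, and the pulsed-latch bound stays fixed until the skew exceeds what the pulse width can hide.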


Summary
Flip-Flops:
– Very easy to use, supported by all tools
2-Phase Transparent Latches:
– Lots of skew tolerance and time borrowing
Pulsed Latches:
– Fast, some skew tolerance and time borrowing, hold time risk


Timing Error Detection Latches
Designers include timing margin
– Voltage
– Temperature
– Process variation
– Data dependency
– Tool inaccuracies
Alternative: run faster and check for near failures
– Increase frequency until at the verge of error
– Can reduce cycle time by ~30%
Leading flavors: DSTB, Razor II


DSTB
Double-Sampling with Time Borrowing

[Figure: DSTB schematic built from latches]

$$t_{detect} = t_{pw} + \left(t_{setupf} - t_{setupl}\right)$$

$$t_{pd} \le T_c - t_{pcql} - t_{setupf} - t_{skew}$$

$$t_{cd} \ge t_{pw} + t_{holdl} - t_{ccql} + t_{skew}$$

$$t_{borrow} = 0$$


DSTB Time Borrowing
Time borrowing if flip-flop clock is delayed by td

$$t_{borrow} = t_d - t_{setupf}$$

$$t_{detect} = t_{pw} + t_{setupf} - t_{setupl} - t_d = t_{pw} - t_{setupl} - t_{borrow}$$

$$t_{pd} \le T_c - t_{pcql} - \max\left(t_{setupf} - t_d,\ 0\right)$$
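The DSTB relations can be evaluated for a delayed and an undelayed flip-flop clock. In this sketch the borrow time is clamped at zero when td is smaller than tsetupf, which is an assumption made here so that the td = 0 case gives zero borrowing; the function name and the picosecond values are also hypothetical.

```python
def dstb_timing(t_c, t_pw, t_d, t_setupf, t_setupl, t_pcql):
    """Return (t_borrow, t_detect, max t_pd) for a DSTB stage, skew ignored."""
    t_borrow = max(t_d - t_setupf, 0)              # clamp is an assumption of this sketch
    t_detect = t_pw + t_setupf - t_setupl - t_d    # detection window
    max_t_pd = t_c - t_pcql - max(t_setupf - t_d, 0)
    return t_borrow, t_detect, max_t_pd

# Hypothetical values in picoseconds.
common = dict(t_c=1000, t_pw=200, t_setupf=40, t_setupl=30, t_pcql=50)
undelayed = dstb_timing(t_d=0, **common)
delayed = dstb_timing(t_d=100, **common)
print(undelayed, delayed)
```

Delaying the flip-flop clock trades detection window for borrowing: with these numbers, 100 ps of delay buys 60 ps of borrowing and shrinks the window by 100 ps.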


Razor II
Looks for input transition except when the detection clock is low
Same timing as DSTB

$$t_{borrow} = t_{dc} - t_{pcnl}$$

$$t_{detect} = t_{pw} - t_{setupl} - t_{borrow}$$

$$t_{pd} \le T_c - t_{pcql} - \max\left(t_{setupf} - t_d,\ 0\right)$$

$$t_{cd} \ge t_{pw} + t_{holdl} - t_{ccql} + t_{skew}$$


Trade-offs
Want a wide pulse width for
– Broad detection window
  • Accommodate much uncertainty
– Significant time borrowing
  • Hide impact of clock skew
  • Balance logic between pipeline stages
But wide pulses make hold times hard to satisfy
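The trade-off can be made concrete by sweeping the pulse width through the DSTB relations from the earlier slides, tdetect = tpw + tsetupf − tsetupl − td and tcd ≥ tpw + tholdl − tccql + tskew. The function name and the picosecond values are assumptions for illustration.

```python
def pulse_width_sweep(pulse_widths, t_setupf, t_setupl, t_d, t_holdl, t_ccql, t_skew):
    """Return (t_pw, t_detect, required t_cd) for each pulse width."""
    rows = []
    for t_pw in pulse_widths:
        t_detect = t_pw + t_setupf - t_setupl - t_d
        min_t_cd = t_pw + t_holdl - t_ccql + t_skew
        rows.append((t_pw, t_detect, min_t_cd))
    return rows

# Hypothetical values in picoseconds.
rows = pulse_width_sweep([100, 200, 300], t_setupf=40, t_setupl=30, t_d=0,
                         t_holdl=20, t_ccql=45, t_skew=25)
for row in rows:
    print(row)
```

Every picosecond added to the pulse widens the detection window by one picosecond and raises the required contamination delay by the same amount.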


Two-Phase Adaptive Latches

Break trade-off between detection, borrowing, hold times set by pulse
– At cost of 2nd latch in pipeline stage

[Figure: two latches clocked by phases 1 and 2 with Logic 1 and Logic 2 (D1, Q1, D2, Q2), a delayed phase 1d, and an ERR output to the next stage; waveforms marking Tc, Tc/2, tnonoverlap, tphase, td, tsetupf, tsetupl, tholdl, and tdetect]

$$t_{borrow} = t_d - t_{setupf}$$

$$t_{detect} = t_{phase} - t_{setupl} - t_{borrow}$$

$$t_{pd} \le T_c - t_{pdql1} - t_{pdql2}$$

$$t_{cd1},\ t_{cd2} \ge t_{holdl} - t_{ccql} + t_{skew} - t_{nonoverlap}$$
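A short sketch of why the two-phase adaptive scheme breaks the trade-off: the detection window grows with tphase while the hold requirement depends only on tnonoverlap and skew. The function name and the picosecond values are assumptions, and the relations are taken as tborrow = td − tsetupf, tdetect = tphase − tsetupl − tborrow, and tcd ≥ tholdl − tccql + tskew − tnonoverlap.

```python
def adaptive_latch_timing(t_phase, t_d, t_setupf, t_setupl,
                          t_holdl, t_ccql, t_skew, t_nonoverlap):
    """Return (t_borrow, t_detect, required t_cd) for a two-phase adaptive latch."""
    t_borrow = t_d - t_setupf
    t_detect = t_phase - t_setupl - t_borrow
    min_t_cd = t_holdl - t_ccql + t_skew - t_nonoverlap  # independent of t_phase
    return t_borrow, t_detect, min_t_cd

# Hypothetical values in picoseconds.
common = dict(t_d=100, t_setupf=40, t_setupl=30,
              t_holdl=20, t_ccql=45, t_skew=25, t_nonoverlap=30)
short_phase = adaptive_latch_timing(t_phase=300, **common)
long_phase = adaptive_latch_timing(t_phase=400, **common)
print(short_phase, long_phase)
```

Lengthening the phase by 100 ps adds 100 ps of detection window in this example while the required contamination delay stays the same.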


Summary
Now you should be able to…
– Make back-of-the-envelope predictions of energy in CMOS circuits
– Make informed design choices to reduce power subject to design constraints
– Describe the major sources of variation in circuits
– Make statistical estimates of the impact of variation on energy, delay, and yield
– Analyze and improve noise margins in SRAM
– Apply timing error detection registers to reduce the margins caused by variation