Variability & Low Voltage Circuit Design
ARC Seminar
Prof. David Money Harris
30 December 2010
Motivation
Major challenges of nanometer CMOS design
– Power consumption
– Variability
– Complexity
CMOS VLSI Design 4th Ed.
Power Consumption
Power consumption limits chip performance today
– Chips can do more computation than we can cool
– Portable device battery life & standby time
Steady drive toward lower voltage to save power
– Moreover, nanometer devices can’t withstand high VDD
New applications open at very low voltage and power
– Implantable medical devices
– Energy scavenging sensors
Variability
Nanometer transistors face physical limits
– Discrete number of dopant atoms in channel
– Atomic-scale roughness of polysilicon gate
Variability tends to increase as devices shrink
Larger numbers of devices are more susceptible to improbable events at the tail of the variation
Using worst-case design is no longer tenable
Objectives
At the end of this class, you will be able to…
– Make back-of-the-envelope predictions of energy in CMOS circuits
– Make informed design choices to reduce power subject to design constraints
– Describe the major sources of variation in circuits
– Make statistical estimates of the impact of variation on energy, delay, and yield
– Analyze and improve noise margins in SRAM
– Apply timing error detection registers to reduce the margins caused by variation
Outline
Motivation
Device Models
– Ideal I-V Characteristics
– Gate and Diffusion Capacitance
– High Field Effects
– Threshold Voltage Effects
– Leakage
Energy & Delay
Variation
Low-Voltage Circuit Design with Variability
Introduction
Transistors can be viewed as imperfect switches
An ON transistor passes a finite amount of current
– Depends on terminal voltages
– Derive current-voltage (I-V) relationships
Transistor gate, source, drain all have capacitance
– I = C (ΔV/Δt) -> Δt = (C/I) ΔV
– Capacitance and current determine speed
MOS Capacitor
Gate and body form MOS capacitor
Operating modes
– Accumulation
– Depletion
– Inversion
[Figure: MOS capacitor cross-section (polysilicon gate, silicon dioxide insulator, p-type body); panel (a) shows Vg < 0]
Terminal Voltages
Mode of operation depends on Vg, Vd, Vs
– Vgs = Vg – Vs
– Vgd = Vg – Vd
– Vds = Vd – Vs = Vgs – Vgd
Source and drain are symmetric diffusion terminals
– By convention, source is terminal at lower voltage
– Hence Vds ≥ 0
nMOS body is grounded. First assume source is 0 too.
Three regions of operation
– Cutoff
– Linear
– Saturation
[Figure: transistor symbol labeled with Vg, Vs, Vd, Vgs, Vgd, Vds]
nMOS Cutoff
No channel
Ids ≈ 0
[Figure: nMOS cross-section in cutoff with Vgs = 0; terminals g, s, d, b; n+ source and drain in p-type body]
nMOS Linear
Channel forms
Current flows from d to s
– e- from s to d
Ids increases with Vds
Similar to linear resistor
[Figure: nMOS cross-sections in the linear region. Top: Vgs > Vt, Vgd = Vgs, Vds = 0. Bottom: Vgs > Vgd > Vt, 0 < Vds < Vgs-Vt, with Ids flowing]
nMOS Saturation
Channel pinches off
Ids independent of Vds
We say current saturates
Similar to current source
[Figure: nMOS cross-section in saturation: Vgs > Vt, Vgd < Vt, Vds > Vgs-Vt]
I-V Characteristics
In Linear region, Ids depends on
– How much charge is in the channel?
– How fast is the charge moving?
Channel Charge
MOS structure looks like parallel plate capacitor while operating in inversion
– Gate – oxide – channel
Qchannel = CV
C = Cg = εoxWL/tox = CoxWL, where Cox = εox / tox
V = Vgc – Vt = (Vgs – Vds/2) – Vt
[Figure: nMOS cross-section showing gate, source, drain, channel, and Cg, with dimensions W, L, tox; SiO2 gate oxide (good insulator, εox = 3.9ε0)]
Carrier velocity
Charge is carried by e-
Electrons are propelled by the lateral electric field between source and drain
– E = Vds/L
Carrier velocity v proportional to lateral E-field
– v = μE, where μ is called mobility
Time for carrier to cross channel:
– t = L / v
nMOS Linear I-V
Now we know
– How much charge Qchannel is in the channel
– How much time t each carrier takes to cross
$I_{ds} = \frac{Q_{channel}}{t}$
$= \mu C_{ox} \frac{W}{L} \left( V_{gs} - V_t - \frac{V_{ds}}{2} \right) V_{ds}$
$= \beta \left( V_{gs} - V_t - \frac{V_{ds}}{2} \right) V_{ds}$, where $\beta = \mu C_{ox} \frac{W}{L}$
nMOS Saturation I-V
If Vgd < Vt, channel pinches off near drain
– When Vds > Vdsat = Vgs – Vt
Now drain voltage no longer increases current
$I_{ds} = \beta \left( V_{gs} - V_t - \frac{V_{dsat}}{2} \right) V_{dsat} = \frac{\beta}{2} \left( V_{gs} - V_t \right)^2$
nMOS I-V Summary
Shockley 1st order transistor models
$I_{ds} = \begin{cases} 0 & V_{gs} < V_t & \text{cutoff} \\ \beta \left( V_{gs} - V_t - \frac{V_{ds}}{2} \right) V_{ds} & V_{ds} < V_{dsat} & \text{linear} \\ \frac{\beta}{2} \left( V_{gs} - V_t \right)^2 & V_{ds} > V_{dsat} & \text{saturation} \end{cases}$
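The three-region Shockley model translates directly into a small function. The following is a minimal Python sketch; the function name `shockley_ids` and the value of β used in the loop are illustrative assumptions, not values from the slides.

```python
def shockley_ids(vgs, vds, beta, vt):
    """First-order (Shockley) nMOS drain current in amperes.

    Assumes source and body are grounded and vds >= 0.
    """
    if vgs < vt:
        return 0.0                                   # cutoff
    vdsat = vgs - vt
    if vds < vdsat:
        return beta * (vgs - vt - vds / 2.0) * vds   # linear
    return 0.5 * beta * (vgs - vt) ** 2              # saturation


BETA = 500e-6   # A/V^2, illustrative value (assumption)
VT = 0.3        # V

for vds in (0.1, 0.35, 0.7, 1.0):
    ids = shockley_ids(1.0, vds, BETA, VT)
    print(f"Vgs = 1.0 V, Vds = {vds:.2f} V -> Ids = {ids * 1e6:.1f} uA")
```

The linear and saturation expressions agree at Vds = Vdsat, so the model is continuous across the region boundary.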
Example
We will be using a 65 nm process in these examples
– From IBM
– tox = 10.5 Å
– μ = 80 cm2/V•s
– Vt = 0.3 V
Plot Ids vs. Vds
– Vgs = 0, .2, .4, .6, .8, 1.0
– Use W/L = 0.1 / 0.05 μm
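As a rough check on these parameters, β can be computed from tox, μ, and W/L. This sketch assumes ε0 = 8.854 × 10^-14 F/cm and εox = 3.9ε0 for SiO2; the helper `idsat` is a hypothetical name and uses only the ideal saturation expression, so it will overestimate the current of a real 65 nm device.

```python
EPS0 = 8.854e-14          # F/cm, permittivity of free space (assumed constant)
EPS_OX = 3.9 * EPS0       # F/cm, SiO2 gate oxide
TOX = 10.5e-8             # cm (10.5 angstroms)
MU_N = 80.0               # cm^2/(V*s)
VT = 0.3                  # V
W_UM, L_UM = 0.1, 0.05    # micrometers

cox = EPS_OX / TOX                    # F/cm^2
beta = MU_N * cox * (W_UM / L_UM)     # A/V^2


def idsat(vgs):
    """Ideal saturation current (A) from the Shockley model."""
    return 0.0 if vgs < VT else 0.5 * beta * (vgs - VT) ** 2


print(f"Cox  = {cox * 1e6:.2f} uF/cm^2")
print(f"beta = {beta * 1e6:.0f} uA/V^2")
for vgs in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    print(f"Vgs = {vgs:.1f} V -> Idsat = {idsat(vgs) * 1e6:.1f} uA")
```

With these inputs β comes out near 526 μA/V², i.e. roughly 263 μA/V² per unit of W/L.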
pMOS I-V
All dopings and voltages are inverted for pMOS
– Source is the more positive terminal
Mobility μp is determined by holes
– Typically 2-3x lower than that of electrons μn
– 40 cm2/V•s in 65 nm process
Thus pMOS must be wider to provide same current
Capacitance
Any two conductors separated by an insulator have capacitance
Gate to channel capacitor is very important
– Creates channel charge necessary for operation
Source and drain have capacitance to body
– Across reverse-biased diodes
– Called diffusion capacitance because it is associated with source/drain diffusion
Gate Capacitance
Approximate channel as connected to source
Cgs = εoxWL/tox = CoxWL = CpermicronW
Cpermicron is typically about 2 fF/μm
[Figure: polysilicon gate over SiO2 gate oxide (good insulator, εox = 3.9ε0) on p-type body, with dimensions W, L, tox]
Diffusion Capacitance
Csb, Cdb
Undesirable, called parasitic capacitance
Capacitance depends on area and perimeter
– Use small diffusion nodes
– Comparable to Cg for contacted diff
– ½ Cg for uncontacted
– Varies with process
Nonideal Transistors
High Field Effects
– Mobility Degradation
– Velocity Saturation
Threshold Voltage Effects
– Body Effect
– Drain-Induced Barrier Lowering
– Short Channel Effect
Leakage
– Subthreshold Leakage
– Gate Leakage
– Junction Leakage
Ideal vs. Simulated nMOS I-V Plot
65 nm IBM process, VDD = 1.0 V
ON and OFF Current
Ion = Ids @ Vgs = Vds = VDD
– Saturation
Ioff = Ids @ Vgs = 0, Vds = VDD
– Cutoff
Electric Fields Effects
Vertical electric field: Evert = Vgs / tox
– Attracts carriers into channel
– Long channel: Qchannel ∝ Evert
Lateral electric field: Elat = Vds / L
– Accelerates carriers from drain to source
– Long channel: v = μElat
Coffee Cart Analogy
Tired student runs from VLSI lab to coffee cart
Freshmen are pouring out of the physics lecture hall
Vds is how long you have been up
– Your velocity = fatigue × mobility
Vgs is a wind blowing you against the glass (SiO2) wall
At high Vgs, you are buffeted against the wall
– Mobility degradation
At high Vds, you scatter off freshmen, fall down, get up
– Velocity saturation
• Don’t confuse this with the saturation region
Mobility Degradation
High Evert effectively reduces mobility
– Collisions with oxide interface
Velocity Saturation
At high Elat, carrier velocity rolls off
– Carriers scatter off atoms in silicon lattice
– Velocity reaches vsat
• Electrons: 10^7 cm/s
• Holes: 8 x 10^6 cm/s
– Better model
Vel Sat I-V Effects
Ideal transistor ON current increases with VDD^2
$I_{ds} = \mu C_{ox} \frac{W}{L} \frac{\left( V_{gs} - V_t \right)^2}{2} = \frac{\beta}{2} \left( V_{gs} - V_t \right)^2$
Velocity-saturated ON current increases with VDD
$I_{ds} = C_{ox} W \left( V_{gs} - V_t \right) v_{max}$
Real transistors are partially velocity saturated
– Approximate with α-power law model
– Ids ∝ VDD^α
– 1 < α < 2 determined empirically (≈ 1.3 for 65 nm)
α-Power Model
$I_{ds} = \begin{cases} 0 & V_{gs} < V_t & \text{cutoff} \\ I_{dsat} \frac{V_{ds}}{V_{dsat}} & V_{ds} < V_{dsat} & \text{linear} \\ I_{dsat} & V_{ds} > V_{dsat} & \text{saturation} \end{cases}$
$I_{dsat} = P_c \frac{\beta}{2} \left( V_{gs} - V_t \right)^{\alpha}$
$V_{dsat} = P_v \left( V_{gs} - V_t \right)^{\alpha/2}$
Channel Length Modulation
Reverse-biased p-n junctions form a depletion region
– Region between n and p with no carriers
– Width Ld of depletion region grows with reverse bias
– Leff = L – Ld
Shorter Leff gives more current
– Ids increases with Vds
– Even in saturation
[Figure: nMOS cross-section with source at GND, gate and drain at VDD, p bulk Si at GND; depletion region of width Ld near the drain; L and Leff marked]
Chan Length Mod I-V
$I_{ds} = \frac{\beta}{2} \left( V_{gs} - V_t \right)^2 \left( 1 + \lambda V_{ds} \right)$
λ = channel length modulation coefficient
– not feature size
– Empirically fit to I-V characteristics
Threshold Voltage Effects
Vt is Vgs for which the channel starts to invert
Ideal models assumed Vt is constant
Really depends (weakly) on almost everything else:
– Body voltage: Body Effect
– Drain voltage: Drain-Induced Barrier Lowering
– Channel length: Short Channel Effect
Body Effect
Body is a fourth transistor terminal
Vsb affects the charge required to invert the channel
– Increasing Vs or decreasing Vb increases Vt
$V_t = V_{t0} + \gamma \left( \sqrt{\phi_s + V_{sb}} - \sqrt{\phi_s} \right)$
φs = surface potential at threshold
$\phi_s = 2 v_T \ln \frac{N_A}{n_i}$
– Depends on doping level NA
– And intrinsic carrier concentration ni
γ = body effect coefficient
$\gamma = \frac{t_{ox}}{\varepsilon_{ox}} \sqrt{2 q \varepsilon_{si} N_A} = \frac{\sqrt{2 q \varepsilon_{si} N_A}}{C_{ox}}$
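The body-effect equations can be evaluated numerically. This is a sketch only: the doping level NA = 8 × 10^17 cm^-3, ni = 1.45 × 10^10 cm^-3, εsi = 11.7ε0, and vT = 26 mV are assumptions chosen for illustration, and the function names are hypothetical; tox = 10.5 Å and Vt0 = 0.3 V follow the 65 nm example used elsewhere in the slides.

```python
import math

Q = 1.602e-19             # C, electron charge
EPS0 = 8.854e-14          # F/cm
EPS_SI = 11.7 * EPS0      # F/cm, silicon (assumption)
EPS_OX = 3.9 * EPS0       # F/cm, SiO2
NI = 1.45e10              # cm^-3, intrinsic carrier concentration (assumption)
V_THERMAL = 0.026         # V, thermal voltage near room temperature (assumption)


def surface_potential(na):
    """phi_s = 2 vT ln(NA / ni), in volts."""
    return 2.0 * V_THERMAL * math.log(na / NI)


def body_effect_coefficient(na, tox_cm):
    """gamma = (tox / eps_ox) * sqrt(2 q eps_si NA), in V^0.5."""
    return (tox_cm / EPS_OX) * math.sqrt(2.0 * Q * EPS_SI * na)


def threshold_voltage(vt0, vsb, na, tox_cm):
    """Vt = Vt0 + gamma (sqrt(phi_s + Vsb) - sqrt(phi_s))."""
    phi_s = surface_potential(na)
    gamma = body_effect_coefficient(na, tox_cm)
    return vt0 + gamma * (math.sqrt(phi_s + vsb) - math.sqrt(phi_s))


NA = 8e17        # cm^-3, assumed doping level
TOX = 10.5e-8    # cm

print(f"phi_s = {surface_potential(NA):.3f} V")
print(f"gamma = {body_effect_coefficient(NA, TOX):.3f} V^0.5")
for vsb in (0.0, 0.3, 0.6):
    print(f"Vsb = {vsb:.1f} V -> Vt = {threshold_voltage(0.3, vsb, NA, TOX):.3f} V")
```

Under these assumptions the threshold shifts by a few tens of millivolts for a source-to-body voltage of several hundred millivolts, which is why the dependence is described as weak.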
Body Effect Cont.
For small source-to-body voltage, treat as linear
DIBL
Electric field from drain affects channel
More pronounced in small transistors where the drain is closer to the channel
Drain-Induced Barrier Lowering
– Drain voltage also affects Vt
$V_t' = V_t - \eta V_{ds}$
High drain voltage causes current to increase.
Short Channel Effect
In small transistors, source/drain depletion regions extend into the channel
– Impacts the amount of charge required to invert the channel
– And thus makes Vt a function of channel length
Short channel effect: Vt increases with L
– Some processes exhibit a reverse short channel effect in which Vt decreases with L
Leakage
What about current in cutoff?
Simulated results
What differs?
– Current doesn’t go to 0 in cutoff
Leakage Sources
Subthreshold conduction
– Transistors can’t abruptly turn ON or OFF
– Dominant source in contemporary transistors
Gate leakage
– Tunneling through ultrathin gate dielectric
Junction leakage
– Reverse-biased PN junction diode current
Subthreshold Leakage
Subthreshold leakage exponential with Vgs
$I_{ds} = I_{ds0} \, e^{\frac{V_{gs} - V_{t0} + \eta V_{ds} - k_{\gamma} V_{sb}}{n v_T}} \left( 1 - e^{\frac{-V_{ds}}{v_T}} \right)$
n is process dependent
– typically 1.3-1.7
Rewrite relative to Ioff on log scale
S ≈ 100 mV/decade @ room temperature
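Because the subthreshold current varies as e^(Vgs / (n vT)), one decade of current corresponds to a gate-voltage change of S = n vT ln 10. The sketch below evaluates this for the quoted range of n; the function names are hypothetical and the physical constants are standard values rather than numbers from the slides.

```python
import math

K_B = 1.380649e-23       # J/K, Boltzmann constant
Q_E = 1.602176634e-19    # C, electron charge


def thermal_voltage(temp_k):
    """vT = kT/q in volts."""
    return K_B * temp_k / Q_E


def subthreshold_slope(n, temp_k=300.0):
    """S = n * vT * ln(10), in volts per decade of current."""
    return n * thermal_voltage(temp_k) * math.log(10.0)


for n in (1.3, 1.5, 1.7):
    print(f"n = {n:.1f} -> S = {subthreshold_slope(n) * 1e3:.0f} mV/decade at 300 K")
```

At 300 K the range n = 1.3 to 1.7 gives roughly 77 to 101 mV/decade, consistent with the S ≈ 100 mV/decade figure at the upper end.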
BSIM Ion Simulation
ON-Current Models
Saturation: α-Power Law Model [Sakurai90]
$I_{on} = k_2 W V_{DT}^{\alpha}$, where $V_{DT} = V_{DD} - V_t$
Subthreshold: Exponential Model [Sheu87]
$I_{on} = I_0 W e^{\frac{V_{DT}}{n v_T}}$, where $v_T = \frac{kT}{q}$
Near Threshold: EKV [Markovic10]
$I_{on} = \frac{2 n \mu C_{ox}}{k_{fit}} \frac{W}{L} v_T^2 \left[ \ln \left( 1 + e^{\frac{(1+\eta) V_{DD} - V_t}{2 n v_T}} \right) \right]^2$
Transregional Ion Model
$I_{on} = I_0 W e^{\frac{V_{DT} - a V_{DT}^2}{n v_T}}$
• nMOS device
• 65 nm commercial process
• Transregional model fits well from 60 – 700 mV
• Average error: 2.4%
• Maximum error: 6.1%
[Figure: Ion fit comparing the Exponential Model and the Transregional Model]
Off Current Model
Sensitivity to VDD through drain-induced barrier lowering
$I_{off} = I_1 W e^{\frac{\eta V_{DT}}{n v_T}}$
Good fit for VDD of 200 – 700 mV
Junction Leakage
Reverse-biased p-n junctions have some leakage
– Ordinary diode leakage
– Band-to-band tunneling (BTBT)
– Gate-induced drain leakage (GIDL)
Gate Leakage
Carriers tunnel through very thin gate oxides
Exponentially sensitive to tox and VDD
– A and B are tech constants
– Greater for electrons
• So nMOS gates leak more
Negligible for older processes (tox > 20 Å)
Critically important at 65 nm and below (tox ≈ 10.5 Å)
From [Song01]
Diode Leakage
Reverse-biased p-n junctions have some leakage
$I_D = I_S \left( e^{\frac{V_D}{v_T}} - 1 \right)$
At any significant negative diode voltage, ID = -Is
Is depends on doping levels
– And area and perimeter of diffusion regions
– Typically < 1 fA/μm2 (negligible)
Band-to-Band Tunneling
Tunneling across heavily doped p-n junctions
– Especially sidewall between drain & channel when halo doping is used to increase Vt
Increases junction leakage to significant levels
– Xj: sidewall junction depth
– Eg: bandgap voltage
– A, B: tech constants
Gate-Induced Drain Leakage
Occurs at overlap between gate and drain
– Most pronounced when drain is at VDD, gate is at a negative voltage
– Thwarts efforts to reduce subthreshold leakage using a negative gate voltage
Temperature Sensitivity
Increasing temperature
– Reduces mobility
– Reduces Vt
ION decreases with temperature
IOFF increases with temperature
[Figure: Ids vs. Vgs for increasing temperature]
So What?
So what if transistors are not ideal?
– They still behave like switches.
But these effects matter for…
– Supply voltage choice
– Logical effort
– Quiescent power consumption
– Pass transistors
– Temperature of operation
Outline
Motivation
Device Models
Energy & Delay
– Power and Energy
– Dynamic Power
– Static Power
– Delay
– Energy-Delay Optimization
Variation
Low-Voltage Circuit Design with Variability
Power and Energy
Power is drawn from a voltage source attached to the VDD pin(s) of a chip.
Instantaneous Power: $P(t) = I(t) V(t)$
Energy: $E = \int_0^T P(t) \, dt$
Average Power: $P_{avg} = \frac{E}{T} = \frac{1}{T} \int_0^T P(t) \, dt$
Power in Circuit Elements
$P_{VDD}(t) = I_{DD}(t) V_{DD}$
$P_R(t) = \frac{V_R^2(t)}{R} = I_R^2(t) R$
$E_C = \int_0^{\infty} I(t) V(t) \, dt = \int_0^{\infty} C \frac{dV}{dt} V(t) \, dt = C \int_0^{V_C} V \, dV = \frac{1}{2} C V_C^2$
Charging a Capacitor
When the gate output rises
– Energy stored in capacitor is
$E_C = \frac{1}{2} C_L V_{DD}^2$
– But energy drawn from the supply is
$E_{VDD} = \int_0^{\infty} I(t) V_{DD} \, dt = \int_0^{\infty} C_L \frac{dV}{dt} V_{DD} \, dt = C_L V_{DD} \int_0^{V_{DD}} dV = C_L V_{DD}^2$
– Half the energy from VDD is dissipated in the pMOS transistor as heat, other half stored in capacitor
When the gate output falls
– Energy in capacitor is dumped to GND
– Dissipated as heat in the nMOS transistor
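The half-and-half energy split can be checked numerically. The sketch below makes a simplifying assumption that is not in the slides: the pMOS pull-up is modeled as a fixed resistor `r_on`, so the charging current is (VDD/R) e^(-t/RC). The function name and the component values are illustrative.

```python
import math


def rc_charge_energies(c_load, vdd, r_on, steps=20000):
    """Integrate supply energy and resistor heat while charging C through R.

    Uses the midpoint rule over 10 time constants. Modeling the pMOS as a
    linear resistor is an assumption made for this sketch.
    """
    tau = r_on * c_load
    dt = 10.0 * tau / steps
    e_supply = 0.0
    e_resistor = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        i = (vdd / r_on) * math.exp(-t / tau)
        e_supply += i * vdd * dt          # energy delivered by VDD
        e_resistor += i * i * r_on * dt   # energy dissipated as heat
    return e_supply, e_resistor


C_LOAD = 150e-15   # F, illustrative load
VDD = 1.0          # V

for r_on in (1e3, 10e3):
    e_sup, e_res = rc_charge_energies(C_LOAD, VDD, r_on)
    print(f"R = {r_on:.0f} ohm: supply = {e_sup * 1e15:.1f} fJ, "
          f"heat = {e_res * 1e15:.1f} fJ, stored = {(e_sup - e_res) * 1e15:.1f} fJ")
```

The result is the same for both resistances: a slower pull-up spreads the dissipation over a longer time but does not change the energy lost.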
Switching Waveforms
Example: VDD = 1.0 V, CL = 150 fF, f = 1 GHz
Switching Power
$P_{switching} = \frac{1}{T} \int_0^T i_{DD}(t) V_{DD} \, dt = \frac{V_{DD}}{T} \int_0^T i_{DD}(t) \, dt = \frac{V_{DD}}{T} \left[ T f_{sw} C V_{DD} \right] = C V_{DD}^2 f_{sw}$
[Figure: supply VDD delivering iDD(t) to a load C that switches at fsw]
Activity Factor
Suppose the system clock frequency = f
Let fsw = αf, where α = activity factor
– If the signal is a clock, α = 1
– If the signal switches once per cycle, α = ½
Dynamic power:
$P_{switching} = \alpha C V_{DD}^2 f$
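A short sketch of the switching-power formula, using the VDD = 1.0 V, C = 150 fF, f = 1 GHz values from the switching waveform example. The function name `switching_power` and the choice of activity factors in the loop are illustrative.

```python
def switching_power(alpha, c_farads, vdd, f_hz):
    """P_switching = alpha * C * VDD^2 * f, in watts."""
    return alpha * c_farads * vdd ** 2 * f_hz


C_LOAD = 150e-15   # F
VDD = 1.0          # V
F_CLK = 1e9        # Hz

for alpha in (1.0, 0.5, 0.1):
    p = switching_power(alpha, C_LOAD, VDD, F_CLK)
    print(f"alpha = {alpha:.1f} -> P = {p * 1e6:.0f} uW")
```

A clock node (α = 1) on this load burns 150 μW, while a typical data node at α = 0.1 burns a tenth of that.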
Short Circuit Current
When transistors switch, both nMOS and pMOS networks may be momentarily ON at once
Leads to a blip of “short circuit” current.
< 10% of dynamic power if rise/fall times are comparable for input and output
We will generally ignore this component
Power Dissipation Sources
Ptotal = Pdynamic + Pstatic
Dynamic power: Pdynamic = Pswitching + Pshortcircuit
– Switching load capacitances
– Short-circuit current
Static power: Pstatic = (Isub + Igate + Ijunct + Icontention)VDD
– Subthreshold leakage
– Gate leakage
– Junction leakage
– Contention current
Dynamic Power Example
1 billion transistor chip
– 50M logic transistors
• Average width: 12 λ
• Activity factor = 0.1
– 950M memory transistors
• Average width: 4 λ
• Activity factor = 0.02
– 1.0 V 65 nm process
– C = 1 fF/μm (gate) + 0.8 fF/μm (diffusion)
Estimate dynamic power consumption @ 1 GHz. Neglect wire capacitance and short-circuit current.
Solution
$C_{logic} = \left( 50 \times 10^6 \right) \left( 12 \lambda \right) \left( 0.025 \, \mu m / \lambda \right) \left( 1.8 \, fF / \mu m \right) = 27 \, nF$
$C_{mem} = \left( 950 \times 10^6 \right) \left( 4 \lambda \right) \left( 0.025 \, \mu m / \lambda \right) \left( 1.8 \, fF / \mu m \right) = 171 \, nF$
$P_{dynamic} = \left[ 0.1 \, C_{logic} + 0.02 \, C_{mem} \right] \left( 1.0 \right)^2 \left( 1.0 \, GHz \right) = 6.1 \, W$
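The same arithmetic as a short script, so the inputs can be varied. The variable and function names are illustrative; all numeric inputs come from the example.

```python
LAMBDA_UM = 0.025                  # um per lambda in the 65 nm process
C_PER_UM = 1.0e-15 + 0.8e-15       # F/um: gate + diffusion capacitance
VDD = 1.0                          # V
F_CLK = 1.0e9                      # Hz


def total_capacitance(n_transistors, width_lambda):
    """Total switched capacitance (F) for n transistors of a given average width."""
    return n_transistors * width_lambda * LAMBDA_UM * C_PER_UM


c_logic = total_capacitance(50e6, 12)
c_mem = total_capacitance(950e6, 4)
p_dynamic = (0.1 * c_logic + 0.02 * c_mem) * VDD ** 2 * F_CLK

print(f"C_logic   = {c_logic * 1e9:.0f} nF")
print(f"C_mem     = {c_mem * 1e9:.0f} nF")
print(f"P_dynamic = {p_dynamic:.1f} W")
```

Although memory has over six times the capacitance of logic, its low activity factor means the two contribute comparable power (about 3.4 W versus 2.7 W).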
Dynamic Power Reduction
$P_{switching} = \alpha C V_{DD}^2 f$
Try to minimize:
– Activity factor
– Capacitance
– Supply voltage
– Frequency
Activity Factor Estimation
Let Pi = Prob(node i = 1)
– P̄i = 1 – Pi
αi = P̄i * Pi
Completely random data has P = 0.5 and α = 0.25
Data is often not completely random
– e.g. upper bits of 64-bit words representing bank account balances are usually 0
Data propagating through ANDs and ORs has lower activity factor
– Depends on design, but typically α ≈ 0.1
Switching Probability
Example
A 4-input AND is built out of two levels of gates
Estimate the activity factor at each node if the inputs have P = 0.5
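One way to work this example is sketched below. The gate-level structure is an assumption, since the schematic is not reproduced here: two NAND2 gates feeding a NOR2, which together compute a 4-input AND. Inputs are assumed statistically independent, and the function names are hypothetical.

```python
def p_nand2(pa, pb):
    """Probability that a NAND2 output is 1, for independent inputs."""
    return 1.0 - pa * pb


def p_nor2(pa, pb):
    """Probability that a NOR2 output is 1, for independent inputs."""
    return (1.0 - pa) * (1.0 - pb)


def activity(p):
    """Activity factor alpha = P * (1 - P)."""
    return p * (1.0 - p)


p_in = 0.5
p_n1 = p_nand2(p_in, p_in)    # first-level NAND2 output (assumed topology)
p_n2 = p_nand2(p_in, p_in)    # second first-level NAND2 output
p_y = p_nor2(p_n1, p_n2)      # NOR2 of the two NAND outputs = AND of all four inputs

print(f"inputs: P = {p_in}, alpha = {activity(p_in)}")
print(f"n1, n2: P = {p_n1}, alpha = {activity(p_n1)}")
print(f"Y:      P = {p_y}, alpha = {activity(p_y)}")
```

Under these assumptions the activity factor falls from 0.25 at the inputs to 3/16 at the internal nodes and 15/256 at the output, illustrating why data passing through AND and OR logic tends toward lower activity.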
Clock Gating
The best way to reduce the activity is to turn off the clock to registers in unused blocks
– Saves clock activity (α = 1)
– Eliminates all switching activity in the block
– Requires determining if block will be used
Capacitance
Gate capacitance
– Fewer stages of logic
– Small gate sizes
Wire capacitance
– Good floorplanning to keep communicating blocks close to each other
– Drive long wires with inverters or buffers rather than complex gates
Voltage / Frequency
Run each block at the lowest possible voltage and frequency that meets performance requirements
Voltage Domains
– Provide separate supplies to different blocks
– Level converters required when crossing from low to high VDD domains
Dynamic Voltage Scaling (DVS)
– Adjust VDD and f according to workload
Dynamic Voltage Scaling
Continuously adjustable supply voltages are costly
Most benefit can be gained by dithering between 2 or 3 supply voltages
Static Power
Static power is consumed even when chip is quiescent.
– Leakage draws power from nominally OFF devices
– Ratioed circuits burn power in fight between ON transistors
Static Power Example
Revisit power estimation for 1 billion transistor chip
Estimate static power consumption
– Subthreshold leakage
• Normal Vt: 100 nA/μm
• High Vt: 10 nA/μm
• High Vt used in all memories and in 95% of logic gates
– Gate leakage 5 nA/μm
– Junction leakage negligible
Solution
$W_{normal-V_t} = \left( 50 \times 10^6 \right) \left( 12 \lambda \right) \left( 0.025 \, \mu m / \lambda \right) \left( 0.05 \right) = 0.75 \times 10^6 \, \mu m$
$W_{high-V_t} = \left[ \left( 50 \times 10^6 \right) \left( 12 \lambda \right) \left( 0.95 \right) + \left( 950 \times 10^6 \right) \left( 4 \lambda \right) \right] \left( 0.025 \, \mu m / \lambda \right) = 109.25 \times 10^6 \, \mu m$
$I_{sub} = \left[ W_{normal-V_t} \times 100 \, nA / \mu m + W_{high-V_t} \times 10 \, nA / \mu m \right] / 2 = 584 \, mA$
$I_{gate} = \left[ \left( W_{normal-V_t} + W_{high-V_t} \right) \times 5 \, nA / \mu m \right] / 2 = 275 \, mA$
$P_{static} = \left( 584 \, mA + 275 \, mA \right) \left( 1.0 \, V \right) = 859 \, mW$
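The static power estimate can be reproduced with a few lines of Python. Variable names are illustrative. The division by 2 follows the solution as given; presumably it reflects that only about half the transistors are leaking in a given state, which is an interpretation rather than something stated on the slide.

```python
LAMBDA_UM = 0.025     # um per lambda
VDD = 1.0             # V

# Total transistor widths in micrometers
w_normal_vt = 50e6 * 12 * LAMBDA_UM * 0.05                    # 5% of logic
w_high_vt = (50e6 * 12 * 0.95 + 950e6 * 4) * LAMBDA_UM        # 95% of logic + memory

# Leakage currents in amperes (the /2 factor follows the slide's solution)
i_sub = (w_normal_vt * 100e-9 + w_high_vt * 10e-9) / 2
i_gate = (w_normal_vt + w_high_vt) * 5e-9 / 2

p_static = (i_sub + i_gate) * VDD

print(f"W_normal = {w_normal_vt / 1e6:.2f}e6 um, W_high = {w_high_vt / 1e6:.2f}e6 um")
print(f"I_sub = {i_sub * 1e3:.0f} mA, I_gate = {i_gate * 1e3:.0f} mA")
print(f"P_static = {p_static * 1e3:.0f} mW")
```

Even with high-Vt devices used almost everywhere, static power is a noticeable fraction of the 6.1 W dynamic estimate for the same chip.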
Subthreshold Leakage
For Vds > 50 mV
$I_{sub} \approx I_{off} \, 10^{\frac{V_{gs} + \eta \left( V_{ds} - V_{DD} \right) - k_{\gamma} V_{sb}}{S}}$
Ioff = leakage at Vgs = 0, Vds = VDD
Typical values in 65 nm
– Ioff = 100 nA/μm @ Vt = 0.3 V
– Ioff = 10 nA/μm @ Vt = 0.4 V
– Ioff = 1 nA/μm @ Vt = 0.5 V
– η = 0.1
– kγ = 0.1
– S = 100 mV/decade
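A small sketch of this leakage expression with the typical 65 nm values as defaults. The function name `subthreshold_current` is hypothetical, and the bias points in the example are chosen only to show the size of each effect.

```python
def subthreshold_current(vgs, vds, vsb, ioff, vdd=1.0, eta=0.1, k_gamma=0.1, s=0.1):
    """I_sub = Ioff * 10^((Vgs + eta (Vds - VDD) - k_gamma Vsb) / S).

    Valid for Vds above roughly 50 mV. Voltages in volts, S in V/decade;
    the result has the same units as ioff.
    """
    exponent = (vgs + eta * (vds - vdd) - k_gamma * vsb) / s
    return ioff * 10.0 ** exponent


IOFF = 100.0   # nA/um at Vt = 0.3 V

cases = {
    "nominal (Vgs=0, Vds=VDD)": (0.0, 1.0, 0.0),
    "negative gate, Vgs=-0.1 V": (-0.1, 1.0, 0.0),
    "reduced drain, Vds=0.5 V": (0.0, 0.5, 0.0),
    "reverse body bias, Vsb=0.5 V": (0.0, 1.0, 0.5),
}
for label, (vgs, vds, vsb) in cases.items():
    print(f"{label}: {subthreshold_current(vgs, vds, vsb, IOFF):.1f} nA/um")
```

With S = 100 mV/decade, each 100 mV of negative gate drive cuts leakage by 10x, while the drain and body terms are weakened by the factors η and kγ.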
Stack Effect
Series OFF transistors have less leakage
– Vx > 0, so N2 has negative Vgs
$I_{sub} = \underbrace{I_{off} \, 10^{\frac{\eta \left( V_x - V_{DD} \right)}{S}}}_{N1} = \underbrace{I_{off} \, 10^{\frac{-V_x + \eta \left( \left( V_{DD} - V_x \right) - V_{DD} \right) - k_{\gamma} V_x}{S}}}_{N2}$
$V_x = \frac{\eta V_{DD}}{1 + 2 \eta + k_{\gamma}}$
$I_{sub} = I_{off} \, 10^{\frac{-\eta V_{DD} \left( \frac{1 + \eta + k_{\gamma}}{1 + 2 \eta + k_{\gamma}} \right)}{S}} \approx I_{off} \, 10^{\frac{-\eta V_{DD}}{S}}$
– Leakage through 2-stack reduces ~10x
– Leakage through 3-stack reduces further
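The stack-effect result can be checked numerically with the typical 65 nm values (η = 0.1, kγ = 0.1, S = 100 mV/decade, VDD = 1.0 V). This sketch solves for the intermediate node voltage Vx and confirms that both transistors carry the same current; the function names are hypothetical.

```python
def stack_node_voltage(vdd, eta, k_gamma):
    """Intermediate node voltage Vx of two series OFF nMOS transistors."""
    return eta * vdd / (1.0 + 2.0 * eta + k_gamma)


def leakage_ratio_n1(vx, vdd, eta, s):
    """I_sub / Ioff for the bottom transistor (Vgs = 0, Vds = Vx, Vsb = 0)."""
    return 10.0 ** (eta * (vx - vdd) / s)


def leakage_ratio_n2(vx, vdd, eta, k_gamma, s):
    """I_sub / Ioff for the top transistor (Vgs = -Vx, Vds = VDD - Vx, Vsb = Vx)."""
    return 10.0 ** ((-vx + eta * ((vdd - vx) - vdd) - k_gamma * vx) / s)


VDD, ETA, K_GAMMA, S = 1.0, 0.1, 0.1, 0.1

vx = stack_node_voltage(VDD, ETA, K_GAMMA)
r1 = leakage_ratio_n1(vx, VDD, ETA, S)
r2 = leakage_ratio_n2(vx, VDD, ETA, K_GAMMA, S)

print(f"Vx = {vx * 1e3:.0f} mV")
print(f"I_sub / Ioff through N1 = {r1:.3f}, through N2 = {r2:.3f}")
print(f"reduction = {1.0 / r1:.1f}x")
```

With these values Vx settles near 77 mV and the stack leaks about 8x less than a single transistor, in line with the ~10x rule of thumb.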
Leakage Control
Leakage and delay trade off
– Aim for low leakage in sleep and low delay in active mode
To reduce leakage:
– Increase Vt: multiple Vt
• Use low Vt only in critical circuits
– Increase Vs: stack effect
• Input vector control in sleep
– Decrease Vb
• Reverse body bias in sleep
• Or forward body bias in active mode
Gate Leakage
Extremely strong function of tox and Vgs
– Negligible for older processes
– Approaches subthreshold leakage at 65 nm and below in some processes
An order of magnitude less for pMOS than nMOS
Control leakage in the process using tox > 10.5 Å
– High-k gate dielectrics help
– Some processes provide multiple tox
• e.g. thicker oxide for 3.3 V I/O transistors
Control leakage in circuits by limiting VDD
NAND3 Leakage Example
100 nm process
Ign = 6.3 nA, Igp = 0
Ioffn = 5.63 nA, Ioffp = 9.3 nA
Data from [Lee03]
Junction Leakage
From reverse-biased p-n junctions
– Between diffusion and substrate or well
Ordinary diode leakage is negligible
Band-to-band tunneling (BTBT) can be significant
– Especially in high-Vt transistors where other leakage is small
– Worst at Vdb = VDD
Gate-induced drain leakage (GIDL) exacerbates
– Worst for Vgd = -VDD (or more negative)
Power Gating
Turn OFF power to blocks when they are idle to save leakage
– Use virtual VDD (VDDV)
– Gate outputs to prevent invalid logic levels to next block
Voltage drop across sleep transistor degrades performance during normal operation
– Size the transistor wide enough to minimize impact
Switching wide sleep transistor costs dynamic power
– Only justified when circuit sleeps long enough
Delay Modeling
$t_{pd} = k_1 \frac{C_{load} V_{DD}}{I_{on}}$
Voltage Sensitivity
α-power law (saturation)
$t_{pd} = k \frac{C_{load} V_{DD}}{W V_{DT}^{\alpha}}$
Transregional (near threshold)
$t_{pd} = k \frac{C_{load}}{W} V_{DD} \, e^{-\frac{V_{DT} - a V_{DT}^2}{n v_T}}$
Fanout-of-4 Inverter Delay
$t_{pd} = k \frac{C_{load}}{W} V_{DD} \, e^{-\frac{V_{DT} - a V_{DT}^2}{n v_T}}$
• Transregional model fits well from 140 – 700 mV
• Average error: 2.8%
• Maximum error: 8.7%
Delay Tracking
Ex: 8-bit ripple adder delay tracks inverter delay well
– 8% variation from 200 – 700 mV
– Transregional delay model can predict how a circuit delay scales with VDD and Vt
Energy-Delay Optimization
What is the best choice of VDD and Vt?
Possible Objectives
– Minimum Energy (Power-Delay Product)
– Minimum Energy-Delay Product
– Minimum Energy under a Delay Constraint
Ideal Minimum Energy
Idealized minimum energy for an inverter
– Assume n = 1, ignore leakage
– To get nonzero noise margin, transfer function must be steeper than -1 at Vinv
– Gives Vmin = 2vT ln 2 = 36 mV @ 300 K
– If transistor has only one electron,
• E = qVmin/2 = kT ln 2 = 2.9 x 10^-21 J
– Compare inverters in
• 0.5 μm 5V: 1.5 x 10^-13 J
• 65 nm 1 V: 3 x 10^-16 J
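The idealized limits are easy to verify numerically. The sketch below uses standard values for the Boltzmann constant and electron charge; the function names are hypothetical.

```python
import math

K_B = 1.380649e-23       # J/K, Boltzmann constant
Q_E = 1.602176634e-19    # C, electron charge


def min_supply_voltage(temp_k):
    """Vmin = 2 vT ln 2, in volts (idealized inverter with n = 1)."""
    return 2.0 * (K_B * temp_k / Q_E) * math.log(2.0)


def min_switching_energy(temp_k):
    """E = q Vmin / 2 = kT ln 2, in joules (single-electron limit)."""
    return Q_E * min_supply_voltage(temp_k) / 2.0


T = 300.0
e_min = min_switching_energy(T)
print(f"Vmin = {min_supply_voltage(T) * 1e3:.0f} mV")
print(f"Emin = {e_min:.2e} J")
print(f"65 nm inverter at 1 V uses about {3e-16 / e_min:.0e} times Emin")
```

A 65 nm inverter at 1 V therefore sits roughly five orders of magnitude above the idealized limit.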
Practical Min Energy
Balance leakage and dynamic energy
– Low VDD reduces dynamic energy
– But increases cycle time and total leakage
Even though inverters can operate at 100 mV, minimum energy occurs at a higher voltage
[Calhoun05]
Min Energy Ckt Design
Also called subthreshold or near-threshold circuit design.
Use static CMOS gates
Use minimum width transistors (both P and N)
– Reduce switching capacitance
Keep wires short
– Cell height as small as possible (~8 tracks)
Avoid complex gates (> ~2 stack)
– Stack effect degrades ON current & speed
– Longer cycle time leads to more leakage
– Complex gates have worse noise margins
Synthesize to commercial min-sized library with complex cells removed
Minimum Energy Model
$E_{tot} = E_{dyn} + E_{leak}$
$E_{dyn} = C_{dyn} V_{DD}^2$
$E_{leak} = I_{off} V_{DD} T_c$
Cdyn: Effective switching capacitance, accounting for:
• glitching
• activity factor
• short circuit current
Off Current Model
Sensitivity to VDD through drain-induced barrier lowering
$I_{off} = I_1 W_{eff} \, e^{\frac{\eta V_{DT}}{n v_T}}$
Good fit for VDD of 200 – 700 mV
Leakage Energy
$E_{leak} = I_{off} V_{DD} T_c$
$T_c = t_{pd} L_{dp}$
$I_{off} = I_1 W_{eff} \, e^{\frac{\eta V_{DT}}{n v_T}}$
$E_{leak} = I_1 L_{dp} k C_{load} \frac{W_{eff}}{W} V_{DD}^2 \, e^{-\frac{(1 - \eta) V_{DT} - a V_{DT}^2}{n v_T}} = C_{leak} V_{DD}^2 \, e^{-\frac{(1 - \eta) V_{DT} - a V_{DT}^2}{n v_T}}$
where $C_{leak} = I_1 L_{dp} k C_{load} \frac{W_{eff}}{W}$
Total Energy
$E_{tot} = V_{DD}^2 C_{dyn} \left( 1 + R \, e^{-\frac{(1 - \eta) V_{DT} - a V_{DT}^2}{n v_T}} \right)$
$R = \frac{C_{leak}}{C_{dyn}}$
Application: Inv. Chains
Model circuit with logic depth N and activity factor 1/M
What is the minimum energy point?
Model Parameters
– 65 nm process
– W = 0.1 μm (minimum)
– Cg = 1.0 fF/μm
– Cinv = 0.2 fF
– Cdyn = 0.8N fF
– Weff = 2MNW
Model Parameters
Results
For N = 12 stage ring oscillators:
Transregional model matches HSPICE to < 15 mV
Subthreshold model underestimates best supply voltage by up to 80 mV at low activity factor
Minimum-energy operating point is above threshold
Best Supply Voltage
Best VDD is a logarithmic function of the ratio of leakage to dynamic energy
$V_{DDopt} = \left( 1.37 \ln R + 6.96 \right) n v_T$
Energy Contour Plots
Normalized energy in 180 nm process
– Best VDD increases as α goes down
[Figure: normalized energy contours for α = 1 and α = 0.1, from [Wang02]]
Energy-Delay Product
Assume VDD > Vt, leakage negligible
– E = Ceff VDD^2
– D = k Ceff VDD / (VDD – Vt)^α
Differentiate wrt. VDD, set result to 0 to minimize EDP
VDD ~ 2Vt for min EDP
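The differentiation step can be written out explicitly. The following derivation is a sketch that uses only the expressions for E and D given on this slide; the closed-form result is derived here rather than quoted from the slides.

```latex
\begin{align*}
EDP &= E \cdot D = k \, C_{eff}^2 \, \frac{V_{DD}^3}{\left( V_{DD} - V_t \right)^{\alpha}} \\
\frac{d}{dV_{DD}} \ln(EDP) &= \frac{3}{V_{DD}} - \frac{\alpha}{V_{DD} - V_t} = 0 \\
3 \left( V_{DD} - V_t \right) &= \alpha V_{DD} \\
V_{DD} &= \frac{3}{3 - \alpha} \, V_t
\end{align*}
```

For α between 1.3 and 2 this gives VDD between about 1.8 Vt and 3 Vt, and exactly 2 Vt at α = 1.5, which matches the rule of thumb VDD ~ 2Vt.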
EDP Considering Leakage
Previous model calls for VDD = Vt = 0!
Considering leakage, results are messy
Graphical:
[Gonzalez97]
Energy with Delay Constraint
This is the problem most designers face
No closed form solution
Pick point where delay and energy contours are tangent
At this point, leakage is about half of dynamic power
– [Markovic04]
– But the curve is fairly flat
– May choose lower leakage to save power during sleep mode
Outline
Motivation
Device Models
Energy & Delay
Variation
– Sources
– Process Corners
– Statistical Analysis
– Impact Estimation
Low-Voltage Circuit Design with Variability
Process Variation
Threshold Voltage
– Depends on placement of dopants in channel
– Standard deviation inversely proportional to the square root of channel area
Channel Length
– Systematic across-chip linewidth variation (ACLV)
– Random line edge roughness (LER)
Interconnect
– Etching variations affect w, s, h
[Bernstein06]
[Figures courtesy Texas Instruments and Larry Pileggi]

Vt Variation
Avt = 1.0 – 2.5 mV·μm
– Might reduce to 0.4 mV·μm with device design
σVt = 26 mV for min-sized transistor in IBM 90 nm
– Gets worse with device scaling

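The units of Avt (mV·μm) suggest the usual area-scaling form σVt = Avt / sqrt(W·L). The sketch below assumes that form; the device dimensions in the usage lines are hypothetical and are not the IBM 90 nm geometry quoted above.

```python
import math


def sigma_vt_mv(avt_mv_um, w_um, l_um):
    """Threshold-voltage standard deviation in mV.

    Assumes sigma = Avt / sqrt(W * L), with Avt in mV*um and W, L in um.
    """
    return avt_mv_um / math.sqrt(w_um * l_um)


# Hypothetical devices: same Avt, area increased 4x.
small = sigma_vt_mv(2.5, 0.12, 0.08)
large = sigma_vt_mv(2.5, 0.24, 0.16)
print(f"small device: {small:.1f} mV, 4x area: {large:.1f} mV")
```

Quadrupling the channel area halves the standard deviation, which is why minimum-sized devices in scaled processes suffer most.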
Spatial Distribution
Variations show spatial correlation
– Lot-to-lot (L2L)
– Wafer-to-wafer (W2W)
– Die-to-die (D2D) / inter-die
– Within-die (WID) / intra-die
Closer transistors match better
[Figure courtesy M. Pelgrom]

Environmental Variation
Voltage
– VDD is usually designed +/- 10%
– Regulator error
– On-chip droop from switching activity
Temperature
– Ambient temperature ranges
– On-die temperature elevated by chip power consumption
[Harris01b]
[Figure courtesy IBM]

Aging
Transistors change over time as they wear out
– Hot carriers
– Negative bias temperature instability
– Time-dependent dielectric breakdown
Causes threshold voltage changes
More on this later…

Parameter Variation
Transistors have uncertainty in parameters
– Process: Leff, Vt, tox of nMOS and pMOS
– Vary around typical (T) values
Fast (F)
– Leff: short
– Vt: low
– tox: thin
Slow (S): opposite
Not all parameters are independent for nMOS and pMOS
[Figure: process corner box with nMOS speed (slow to fast) and pMOS speed (slow to fast) axes; corners TT, FF, SS, SF, FS]

Environmental Variation
VDD and T also vary in time and space
Fast:
– VDD: high
– T: low

Corner  Voltage  Temperature
F       1.1      0 C
T       1.0      70 C
S       0.9      125 C

Process Corners
Process corners describe worst-case variations
– If a design works in all corners, it will probably work for any variation.
Describe corner with four letters (T, F, S)
– nMOS speed
– pMOS speed
– Voltage
– Temperature

Important Corners
Some critical simulation corners include

Purpose               nMOS  pMOS  VDD  Temp
Cycle time            S     S     S    S
Power                 F     F     F    F
Subthreshold leakage  F     F     F    S

Monte Carlo Simulation
As process variation increases, the worst-case corners become too pessimistic for practical design
Monte Carlo: repeated simulations with parameters randomly varied each time
Look at scatter plot of results to predict yield
Ex: impact of Vt variation
– ON-current
– leakage

Reliability
Hard Errors
– Oxide wearout
– Interconnect wearout
– Overvoltage failure
– Latchup
Soft Errors
Characterizing reliability
– Mean time between failures (MTBF)
  • # of devices x hours of operation / number of failures
– Failures in time (FIT)
  • # of failures / thousand hours / million devices
[Figure: bathtub curve of failure rate vs. time: infant mortality, useful operating life, wear out]

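The two reliability metrics above are simple ratios. The sketch below turns the definitions into code; the test population (10,000 devices, 1,000 hours, 5 failures) is a made-up example, and FIT is expressed as failures per 10^9 device-hours, which is the same as failures per thousand hours per million devices.

```python
def mtbf_hours(devices, hours, failures):
    """Mean time between failures: device-hours of operation per failure."""
    return devices * hours / failures


def fit_rate(devices, hours, failures):
    """Failures in time: failures per 1e9 device-hours
    (equivalently, per thousand hours per million devices)."""
    return failures / (devices * hours) * 1e9


# Hypothetical test population, for illustration only.
mtbf = mtbf_hours(devices=10_000, hours=1_000, failures=5)
fit = fit_rate(devices=10_000, hours=1_000, failures=5)
print(f"MTBF = {mtbf:.0f} hours, FIT = {fit:.0f}")
```

The two numbers are reciprocals up to the 10^9 scale factor: FIT = 10^9 / MTBF when MTBF is in hours.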
Accelerated Lifetime Testing
Expected reliability typically exceeds 10 years
But products come to market in 1-2 years
Accelerated lifetime testing required to predict adequate long-term reliability
[Arnaud08]

Hot Carriers
Electric fields across channel impart high energies to some carriers
– These "hot" carriers may be blasted into the gate oxide where they become trapped
– Accumulation of charge in oxide causes shift in Vt over time
– Eventually Vt shifts too far for devices to operate correctly
Choose VDD to achieve reasonable product lifetime
– Worst problems for inverters and NORs with slow input risetime and long propagation delays

NBTI
Negative bias temperature instability
Electric field applied across oxide forms dangling bonds called traps at Si-SiO2 interface
Accumulation of traps causes Vt shift
Most pronounced for pMOS transistors with strong negative bias (Vg = 0, Vs = VDD) at high temperature

TDDB
Time-dependent dielectric breakdown
– Gradual increase in gate leakage when an electric field is applied across an oxide
– a.k.a. stress-induced leakage current
For 10-year life at 125 C, keep Eox below ~0.7 V/nm

Soft Errors
In the 1970s, DRAMs were observed to randomly flip bits
– Ultimately linked to alpha particles and cosmic ray neutrons
Collisions with atoms create electron-hole pairs in substrate
– These carriers are collected on p-n junctions, disturbing the voltage
[Baumann05]

Radiation Hardening
Radiation hardening reduces soft errors
– Increase node capacitance to minimize impact of collected charge
– Or use redundancy
– E.g. dual-interlocked cell
Error-correcting codes
– Correct for soft errors that do occur

Statistical Analysis
Probability Density Function (PDF): f(x)
– Integrating f(x) over a range gives the probability that random variable X is in that range
Cumulative Distribution Function (CDF): F(x)
– Probability that X < x

Mean and Standard Dev
Mean: average value of X
Standard deviation: how far X varies from the mean

Uniform Random Variable

Normal Random Variable
Gaussian

Lognormal RV
Exponential of a normal variable
If Y is normal with μ and σ, X = e^Y is lognormal

Normal CDF

Zero Mean Random Var
Look at variations from the mean
Xv is a zero-mean random variable
– We'll focus on these
Ex:
– If Vt has a mean of 0.3 V and standard dev of 0.025 V, it can be written as
– Vt = 0.3 + 0.025 X, where X is normal, zero mean

Independent / Dependent
Independent Random Variables
– Vt variation from RDF
Dependent Random Variables
– ACLV for nearby devices

Sum of Random Vars
Assuming independent random variables
– Mean is sum of means
Central Limit Theorem
– The sum of a large number of independent RVs approaches a normal RV

Maximum of RVs
Maximum of N independent standard normal RVs:
– Not normal, but can be found in a table

Implication for Critical Paths
Longest paths form a wall with a tighter distribution

Example
A large chip has 100 paths that are nearly all critical. Each path has 25 gates. Each gate has a normally distributed delay with a mean of 16 ps and a standard deviation of 4 ps. What is the mean clock period and the standard deviation of this period? What period should be set so 97.7% of chips meet timing?
Path Mean Delay: 25 * 16 = 400 ps
Path Std Deviation: sqrt(25) * 4 = 20 ps
Max of 100 standard normal RVs has
– Mean: 2.50
– Sigma: 0.43
Clock period:
– Mean: 400 + 2.50 * 20 = 450 ps
– Standard Deviation: 0.43 * 20 = 8.6 ≈ 9 ps
Tc: 450 + 2 * 9 = 468 ps

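The table values for the maximum of 100 standard normal variables (mean 2.50, sigma 0.43) can be reproduced with a small Monte Carlo experiment. The sketch below uses only the Python standard library; the trial count and seed are arbitrary choices, and the final mean-plus-two-sigma target follows the example's approximation even though the maximum is not itself normally distributed.

```python
import random
import statistics


def max_of_normals_stats(n_paths, trials=5000, seed=1):
    """Estimate mean and std dev of the max of n_paths standard normal RVs."""
    rng = random.Random(seed)
    maxima = [
        max(rng.gauss(0.0, 1.0) for _ in range(n_paths))
        for _ in range(trials)
    ]
    return statistics.mean(maxima), statistics.stdev(maxima)


mean_max, sigma_max = max_of_normals_stats(100)

# Path statistics from the example: 25 gates, 16 ps mean, 4 ps sigma each.
path_mean = 25 * 16.0
path_sigma = (25 ** 0.5) * 4.0

tc_mean = path_mean + mean_max * path_sigma
tc_sigma = sigma_max * path_sigma
tc_target = tc_mean + 2 * tc_sigma   # mean + 2 sigma, as in the example
print(f"max of 100: mean {mean_max:.2f}, sigma {sigma_max:.2f}")
print(f"Tc mean {tc_mean:.0f} ps, sigma {tc_sigma:.1f} ps, target {tc_target:.0f} ps")
```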
Yield
Y: Yield, the fraction of chips that work
X: Failure probability = 1 – Y
If a system is built with N components, each with yield Yc
– System Yield: $Y_s = Y_c^N$

Defect Density
D: Defects / unit area
M components per unit area
If defects are randomly distributed and independent
– Xc = D/M
System with area A has N = MA components, so its yield is
– $Y_s = (1 - D/M)^{MA}$
In the limit that M approaches infinity: Poisson
– $Y_s = e^{-DA}$

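The Poisson limit can be seen numerically by holding D and A fixed and increasing the component density M. This is a minimal sketch; the defect density of 0.5 per cm² and area of 1 cm² are illustrative assumptions.

```python
import math


def system_yield(defect_density, area, components_per_area):
    """Ys = Yc^N with Yc = 1 - D/M and N = M*A."""
    component_fail = defect_density / components_per_area
    n_components = components_per_area * area
    return (1.0 - component_fail) ** n_components


def poisson_yield(defect_density, area):
    """Limit of the expression above as M grows without bound."""
    return math.exp(-defect_density * area)


D, A = 0.5, 1.0   # assumed: defects per cm^2 and chip area in cm^2
for M in (10, 1_000, 1_000_000):
    print(f"M = {M:>9}: Ys = {system_yield(D, A, M):.6f}")
print(f"Poisson limit: Ys = {poisson_yield(D, A):.6f}")
```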
Variation Sensitivity
ON and OFF Current model
Differentiate wrt. L and Vt to find sensitivity to variation

Examples
A 10% change in Le causes Ion and Ioff to change by
– 10%
If α = 1.3, n = 1.6, VDD = 1.0, Vt = 0.3
A 10 mV change in Vt causes
– Ion changes by 1.8%
– Ioff changes by 23%

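These sensitivities can be reproduced under assumed current models. The sketch below takes Ion proportional to (VDD − Vt)^α / L and Ioff proportional to exp(−Vt / (n·vT)) / L, with a thermal voltage vT of 26 mV; these model forms and the thermal voltage are assumptions for illustration, since the slide's own current equations are not reproduced in the text. The results land close to, but not exactly on, the 1.8% and 23% quoted above.

```python
# First-order (differential) sensitivities of ON and OFF current to Vt.
ALPHA = 1.3        # velocity-saturation index, from the example
N_SUB = 1.6        # subthreshold slope factor, from the example
VDD = 1.0          # supply voltage in volts, from the example
VT = 0.3           # threshold voltage in volts, from the example
V_THERMAL = 0.026  # assumed thermal voltage kT/q near room temperature


def ion_fractional_change(delta_vt):
    """|dIon/Ion| for Ion ~ (VDD - Vt)^alpha."""
    return ALPHA * delta_vt / (VDD - VT)


def ioff_fractional_change(delta_vt):
    """|dIoff/Ioff| for Ioff ~ exp(-Vt / (n * vT))."""
    return delta_vt / (N_SUB * V_THERMAL)


ion_change = ion_fractional_change(0.010)
ioff_change = ioff_fractional_change(0.010)
print(f"10 mV of Vt: Ion changes {ion_change:.1%}, Ioff changes {ioff_change:.1%}")
```

The order-of-magnitude gap between the two numbers is the key point: leakage is exponentially sensitive to Vt while drive current is only polynomially sensitive.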
Monte Carlo Example
L has σ/μ = 0.04
Vt has σ = 25 mV
Off current changes by 6x while ON changes 40%
Some correlation

Delay Variation
Gate delay varies directly with ON current
Path delay depends on correlations in gate delay
– ACLV is strongly correlated among nearby gates
– RDF is uncorrelated
– Variance reduces for uncorrelated paths

Delay Example
A path contains 16 2-input gates, each with a 20 ps nominal delay. Suppose Le has a 2% standard deviation from ACLV and Vt has a 25 mV standard deviation from RDF. Estimate the standard deviation in path delay.
Nominal path delay: 16 x 20 = 320 ps
Transistors on the critical path: 24
Le causes 2% correlated delay: 6.4 ps
Vt causes 4.6% std. dev in Ion per gate
– But only 4.6% / sqrt(24) = 0.95% = 3.0 ps in path
Total std. dev = sqrt(6.4^2 + 3.0^2) = 7.1 ps = 2.2%

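The arithmetic of this example is sketched below: the correlated ACLV term scales directly with path delay, the uncorrelated RDF term is averaged over the transistors on the path, and the two combine as a root sum of squares. All inputs are the numbers from the example; the variable names are just for illustration.

```python
import math

gates = 16
gate_delay_ps = 20.0
transistors_on_path = 24
sigma_le_fraction = 0.02       # correlated (ACLV) delay variation
sigma_ion_fraction = 0.046     # per-device Ion variation from 25 mV of Vt (RDF)

nominal_ps = gates * gate_delay_ps

# Correlated variation does not average out along the path.
sigma_correlated_ps = sigma_le_fraction * nominal_ps

# Uncorrelated variation averages as 1/sqrt(number of devices).
sigma_random_ps = sigma_ion_fraction / math.sqrt(transistors_on_path) * nominal_ps

# Independent contributions add in quadrature.
sigma_total_ps = math.hypot(sigma_correlated_ps, sigma_random_ps)

print(f"nominal {nominal_ps:.0f} ps, correlated {sigma_correlated_ps:.1f} ps, "
      f"random {sigma_random_ps:.1f} ps, total {sigma_total_ps:.1f} ps "
      f"({sigma_total_ps / nominal_ps:.1%})")
```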
Delay Observations
Critical paths tend to form a wall 2-3 standard deviations above the mean
Short pipeline stages suffer because of less averaging

Example
A microprocessor has D2D variation of 9% and WID variation of 3% on several critical paths. If the nominal clock period is T without considering variation and the chip has nearly 1000 critical paths, what clock period should be selected to ensure a parametric yield of 97.7%? Neglect clock skew.
Max over 1000 paths of WID variation:
– Mean = 3% x 3.24 = 9.7% above nominal
– Std Dev = 3% x 0.35 = 1.05% of nominal
Total Std Dev = sqrt((9%)^2 + (1.05%)^2) = 9.06%
For 97.7% yield, allow 2 std devs: Tc = (1 + 9.7% + 2 x 9.06%) T = 1.28T

Energy
Variation has little effect on dynamic energy
– Systematic variation is relatively small
– Random variation averages out across many paths
Strong impact on leakage
– Exponential sensitivity to Vt
Shifts minimum E, EDP to higher VDD, Vt
– Increases energy and delay

Systematic Leakage
For 3-sigma yield, accept 3 sigma of systematic Vt variation

Random Leakage
Random dopant fluctuations are uncorrelated but likely to have a greater standard deviation
Average across many gates
Depends on the mean of the log-normal leakage distribution

Best EDP
Effect of temperature and Vt variation
– Best EDP when leakage is 1/3 of total energy
– If leakage increases, move to higher VDD
[Figure: EDP contours with no variation and with variation]

Sensitive Circuits
SRAM
Matched circuits (e.g. sense amplifiers)
Circuits with races or matched delays
Ratioed circuits (e.g. pseudo-nMOS)
Keepers
Subthreshold circuits

Sense Amp Example
The sense amp offset voltage is normally distributed with a standard deviation of 10 mV. If a memory contains 4096 sense amps and the chip should have 99.9% parametric yield, how much offset voltage must it tolerate?
Ys = 0.999
N = 4096
Yc = Ys^(1/N) = 0.99999976
This is about 5 standard deviations
– Tolerate 50 mV offset

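The step from component yield to "about 5 standard deviations" can be checked with the inverse normal CDF in Python's standard library. The sketch below treats the offset as two-sided (a large positive or negative offset both fail), which is an assumption about the failure criterion; it gives a value slightly above 5, in line with the rounded answer.

```python
from statistics import NormalDist


def required_sigmas(system_yield, n_components, two_sided=True):
    """Return (component yield, number of sigmas each component must tolerate)."""
    component_yield = system_yield ** (1.0 / n_components)
    fail_prob = 1.0 - component_yield
    tail = fail_prob / 2.0 if two_sided else fail_prob
    # inv_cdf of a small tail probability is negative; negate to get +z.
    z = -NormalDist().inv_cdf(tail)
    return component_yield, z


yc, z = required_sigmas(0.999, 4096)
offset_mv = z * 10.0   # 10 mV standard deviation from the example
print(f"Yc = {yc:.8f}, z = {z:.2f} sigma, tolerate about {offset_mv:.0f} mV")
```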
Caveats
We've made many assumptions
– Independence or dependence of RVs
– Normal distribution
These assumptions are seldom quite true
– Especially when examining long tails
Nevertheless useful
– Qualitative understanding of system behavior
– Back of the envelope estimates
– Understand key parameters
Confirm estimates through simulation, or just build it!

Variation Tolerance
Adaptive Control
– Adaptive body bias
  • Compensate for systematic D2D Vt variation
  • Reduce spread in leakage, increase speed
– Adaptive voltage scaling
  • Reduces speed/power spread from corners
– Temperature Sensing
Fault Tolerance
– Spares
– Error detection and correction

Spares
Provide spare parts (e.g. extra cores or cache)
Probability that a system with N components has r defective components is
If up to r defects can be repaired with spares
If N is large, consider defects / unit area D

Example
If each core in a 16-core processor has a yield of 90%, what is the system yield? How would it improve if 2 spares were available?
Without spares: $Y_s = (0.9)^{16} = 18.5\%$
With spares: $Y_s = (0.9)^{18} + 18 \times (0.9)^{17} \times (0.1) + \frac{18 \times 17}{2} \times (0.9)^{16} \times (0.1)^2 = 73.4\%$

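The sum in the example is a binomial tail: the chip works if at most 2 of the 18 fabricated cores are defective. A minimal sketch, assuming independent core failures (the function name is illustrative):

```python
from math import comb


def yield_with_spares(n_needed, n_spares, component_yield):
    """Probability that at most n_spares of (n_needed + n_spares) parts fail."""
    total = n_needed + n_spares
    fail = 1.0 - component_yield
    return sum(
        comb(total, r) * component_yield ** (total - r) * fail ** r
        for r in range(n_spares + 1)
    )


no_spares = yield_with_spares(16, 0, 0.9)
two_spares = yield_with_spares(16, 2, 0.9)
print(f"without spares: {no_spares:.1%}, with 2 spares: {two_spares:.1%}")
```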
Outline
Motivation
Device Models
Energy & Delay
Variation
Low-Voltage Circuit Design with Variability
– SRAM
– Sequencing Elements

Array Architecture
2^n words of 2^m bits each
If n >> m, fold by 2^k into fewer rows of more columns
Good regularity: easy to design
Very high density if good cells are used

6T SRAM Cell
Cell size accounts for most of array size
– Reduce cell size at expense of complexity
6T SRAM Cell
– Used in most commercial chips
– Data stored in cross-coupled inverters
Read:
– Precharge bit, bit_b
– Raise wordline
Write:
– Drive data onto bit, bit_b
– Raise wordline
[Figure: 6T cell schematic with word, bit, bit_b]

SRAM Read
Precharge both bitlines high
Then turn on wordline
One of the two bitlines will be pulled down by the cell
Ex: A = 0, A_b = 1
– bit discharges, bit_b stays high
– But A bumps up slightly
Read stability
– A must not flip
– N1 >> N2
[Figure: 6T cell schematic with transistors N1–N4, P1, P2, nodes A and A_b, and signals word, bit, bit_b]

SRAM Write
Drive one bitline high, the other low
Then turn on wordline
Bitlines overpower cell with new value
Ex: A = 0, A_b = 1, bit = 1, bit_b = 0
– Force A_b low, then A rises high
Writability
– Must overpower feedback inverter
– N2 >> P1
[Figure: 6T cell schematic with transistors N1–N4, P1, P2, nodes A and A_b, and signals word, bit, bit_b]

SRAM Sizing
High bitlines must not overpower inverters during reads
But low bitlines must write new value into cell
[Figure: 6T cell schematic annotated with relative transistor strengths: weak, med, strong]

SRAM Column Example
[Figure: read and write column schematics, each with bitline conditioning, SRAM cell, and more cells; timing waveforms for φ1, φ2, word_q1, bit_v1f, bit_b_v1f, out_v1r, out_b_v1r, data_s1, write_q1]

SRAM Layout
Cell size is critical: 26 x 45 λ (even smaller in industry)
Tile cells sharing VDD, GND, bitline contacts
[Figure: cell layout showing VDD, GND, BIT, BIT_B, WORD, and cell boundary]

Thin Cell
In nanometer CMOS
– Avoid bends in polysilicon and diffusion
– Orient all transistors in one direction
Lithographically friendly or thin cell layout achieves this
– Also reduces length and capacitance of bitlines

Commercial SRAMs
Five generations of Intel SRAM cell micrographs
– Transition to thin cell at 65 nm
– Steady scaling of cell area

Cell Stability
Cell constraints
– Hold (at lower standby voltage)
– Readability
– Writability

Hold Margin
Cell must hold value while idle
– Even if VDD is low for standby
How much noise could be added before the cell flips state?
Butterfly Diagrams
– Plot V1 vs. V2 and V2 vs. V1
– Symmetric
– Square inscribed in diagram indicates hold margin

Read Margin
Avoid disturb during read
Bitlines are initially at VDD
Read margin is smaller than hold margin

Write Margin
Write should flip the state of the cell
If curves overlap, cell may be unwritable

Variability
Variability breaks symmetry of butterfly diagrams
– Reduces noise margins
– If margins become negative, cell is definitely unusable
– Even cells with small positive margin might be unreliable due to noise

Example
Suppose cells in a 64 Mb SRAM have normally distributed read margins with a 15 mV standard deviation. What must the mean read margin be to achieve 90% parametric yield?
Cell failure probability:
$X_c = 1 - Y^{1/N} = 1 - 0.9^{1/2^{26}} = 1.6 \times 10^{-9}$
Requires 6σ reliability
Thus read margin should be > 90 mV
Caveats:
– independence, normality, chip yield, point defects

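The cell failure budget and the resulting sigma count can be reproduced as follows. This sketch assumes a cell fails only when its read margin falls below zero (a one-sided tail) and that cells fail independently; `expm1` is used so that the tiny difference 1 − Y^(1/N) is computed without rounding loss.

```python
import math
from statistics import NormalDist


def cell_failure_budget(chip_yield, n_cells):
    """Xc = 1 - Y^(1/N), computed as -expm1(ln(Y)/N) for numerical accuracy."""
    return -math.expm1(math.log(chip_yield) / n_cells)


n_cells = 2 ** 26                      # 64 Mb
xc = cell_failure_budget(0.9, n_cells)

# One-sided tail: margin must stay above zero.
z = -NormalDist().inv_cdf(xc)
mean_margin_mv = z * 15.0              # 15 mV standard deviation from the example
print(f"Xc = {xc:.2e}, z = {z:.2f} sigma, mean read margin about {mean_margin_mv:.0f} mV")
```

The result is just under 6 sigma, which the example rounds up to 6σ and 90 mV.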
Monte Carlo Simulation
Simulating 5-7 σ reliability is very time consuming
– HSPICE results may be dubious anyway
Run smaller simulations to find the distribution
– Fit curve to tail
– Be conservative in case tail is not normal
Or use simulation techniques to explore the tail
– Mixture Importance Sampling [Kanj06]
– Statistical Blockade [Singhee09, Wang10]

Tail Fit
Tails of a wide class of distributions look exponential
– Fit a quasiempirical exponential tail to measured or Monte Carlo data [Keller10]
Order data points X1 … Xn to get empirical CDF
Replace last k points with exponential
– Keep the same expected value
$F(x) = 1 - \frac{k}{n} e^{-(x - X_{n-k})/\theta}$
[Equation: θ expressed as a sum over the last k points X_{n-k+1} … X_n relative to X_{n-k}, scaled by 1/k]

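The tail-fitting steps can be sketched as follows. The scale θ here is taken as the mean excess of the last k points over the threshold X_{n-k}, which is one simple way to keep the tail's expected value unchanged; this choice is an assumption and may differ in detail from the expression used in [Keller10]. The function name and the exponential test data are illustrative.

```python
import math
import random


def fit_exponential_tail(samples, k):
    """Replace the top k order statistics with an exponential tail.

    Returns (threshold, theta, cdf), where cdf(x) is valid for x >= threshold.
    """
    xs = sorted(samples)
    n = len(xs)
    threshold = xs[n - k - 1]            # X_{n-k} in 1-indexed notation
    tail = xs[n - k:]                    # X_{n-k+1} ... X_n
    # Assumed estimator: mean excess of the tail points over the threshold.
    theta = sum(x - threshold for x in tail) / k

    def cdf(x):
        return 1.0 - (k / n) * math.exp(-(x - threshold) / theta)

    return threshold, theta, cdf


# Illustrative data: unit-rate exponential samples, whose true tail scale is 1.
rng = random.Random(0)
data = [rng.expovariate(1.0) for _ in range(20000)]
threshold, theta, cdf = fit_exponential_tail(data, k=2000)
print(f"threshold = {threshold:.2f}, theta = {theta:.2f}")
print(f"estimated P(X > threshold + 5) = {1.0 - cdf(threshold + 5.0):.2e}")
```

The fitted CDF can then be extrapolated to probabilities far smaller than 1/n, which is the point of the method; the extrapolation is only as good as the exponential-tail assumption.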
Low Power SRAMs
SRAM accounts for much of chip area and power
Power reduction techniques
– Overall: low VDD
  • Vmin limited by read and write margins
  • ~0.7 V for 6T in 90 nm process, scaling upward
– Dynamic: activate only necessary subarrays
– Static: sleep mode to reduce leakage
  • Limited by hold margin
Improving margins would allow more dynamic voltage scaling and lower voltage sleep

Density / Vmin tradeoff
Ratioed transistors reduce Vmin
– Npulldown > Naccess > Ppullup
Example: Intel 65 nm process
– High performance cell (> min size devices)
  • Vmin = 0.7 V
  • Vstandby = 0.6 V
– High density cell (minimum size devices)
  • Vmin = 1.1 V
  • Vstandby = 1.0 V
  • 44% greater density

Read Assist
Improve read margin
– Pulse wordline or bitline briefly to exploit dynamic noise margins greater than static margins [Khellah06]
– Lower wordline voltage [Ohbayashi07, Yabuuchi07]
– Raise VDD during reads

Write Assist
Improve write margin
– Drive bitline to a negative voltage
– Raise wordline voltage [Morita06]
– Float cell GND during writes [Yamaoka04]
– Float cell VDD during writes [Yamaoka06]
– Lower cell VDD during writes [Zhang06, Ohbayashi07]

Leakage Control
Minimize leakage during standby
– Reduce Vds, negative Vgs, or negative Vbs

Partial Power Gating
Use weaker bias device to allow partial VDDV collapse during sleep mode
~200 mV reduction cuts subthreshold leakage 2x
– Nearly eliminates gate & junction leakage
Turn power gate on ahead of expected memory access

8T SRAM Cell
Eliminates read stability and ratio issues
Operates at a lower voltage (~0.7 V in 45 nm)
Dual ported operation
30% cell area penalty on 45 nm Core processors

Subthreshold 10T SRAM
Improve low voltage read/write
Stack effect reduces leakage onto rbl when not reading
– Allows more cells/bitline
Float VDDV during write
– Improves write margin

Sequencing Elements
Sequencing
Max and Min-Delay
Time Borrowing
Clock Skew
Resilient Sequencing Elements

Sequencing
Combinational logic
– output depends on current inputs
Sequential logic
– output depends on current and previous inputs
– Requires separating previous, current, future
– Called state or tokens
– Ex: FSM, pipeline
[Figure: finite state machine and pipeline, each built from combinational logic (CL) blocks and clocked (clk) registers between in and out]

Sequencing Cont.
If tokens moved through pipeline at constant speed, no sequencing elements would be necessary
Ex: fiber-optic cable
– Light pulses (tokens) are sent down cable
– Next pulse sent before first reaches end of cable
– No need for hardware to separate pulses
– But dispersion sets min time between pulses
This is called wave pipelining in circuits
In most circuits, dispersion is high
– Delay fast tokens so they don't catch slow ones.

Sequencing Overhead
Use flip-flops to delay fast tokens so they move through exactly one stage each cycle.
Inevitably adds some delay to the slow tokens
Makes circuit slower than just the logic delay
– Called sequencing overhead
Some people call this clocking overhead
– But it applies to asynchronous circuits too
– Inevitable side effect of maintaining sequence

Sequencing Elements
Latch: Level sensitive
– a.k.a. transparent latch, D latch
Flip-flop: edge triggered
– a.k.a. master-slave flip-flop, D flip-flop, D register
Timing Diagrams
– Transparent
– Opaque
– Edge-trigger
[Figure: latch and flop symbols with timing diagram of clk, D, Q (latch), Q (flop)]

Sequencing Methods
Flip-flops
2-Phase Latches
Pulsed Latches
[Figure: clock waveforms and pipelines for flip-flops (clk, period Tc), 2-phase transparent latches (φ1, φ2, Tc/2, tnonoverlap, half-cycles of combinational logic), and pulsed latches (φp, pulse width tpw)]

Timing Diagrams
Contamination and Propagation Delays

tpd     Logic Prop. Delay
tcd     Logic Cont. Delay
tpcq    Latch/Flop Clk->Q Prop. Delay
tccq    Latch/Flop Clk->Q Cont. Delay
tpdq    Latch D->Q Prop. Delay
tcdq    Latch D->Q Cont. Delay
tsetup  Latch/Flop Setup Time
thold   Latch/Flop Hold Time

[Figure: timing diagrams for combinational logic (A to Y), a flop, and a latch, annotated with the delays above]

Max-Delay: Flip-Flops
$t_{pd} \le T_c - (t_{setup} + t_{pcq})$
– The term in parentheses is the sequencing overhead
[Figure: flops F1 and F2 around combinational logic; timing of clk, Q1, D2 showing tpcq, tpd, tsetup within Tc]

Max Delay: 2-Phase Latches
$t_{pd} = t_{pd1} + t_{pd2} \le T_c - 2 t_{pdq}$
– The term $2 t_{pdq}$ is the sequencing overhead
[Figure: latches L1, L2, L3 clocked by φ1, φ2, φ1 around combinational logic 1 and 2; timing of D1, Q1, D2, Q2, D3 showing tpdq1, tpd1, tpdq2, tpd2 within Tc]

Max Delay: Pulsed Latches
$t_{pd} \le T_c - \max(t_{pdq},\; t_{pcq} + t_{setup} - t_{pw})$
– The max term is the sequencing overhead
[Figure: latches L1 and L2 clocked by φp around combinational logic; timing of D1, Q1, D2 for (a) tpw > tsetup and (b) tpw < tsetup]

Min-Delay: Flip-Flops
$t_{cd} \ge t_{hold} - t_{ccq}$
[Figure: flops F1 and F2 around CL; timing of clk, Q1, D2 showing tccq, tcd, thold]

Min-Delay: 2-Phase Latches
$t_{cd1}, t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap}$
Hold time reduced by nonoverlap
Paradox: hold applies twice each cycle, vs. only once for flops.
But a flop is made of two latches!
[Figure: latches L1 (φ1) and L2 (φ2) around CL; timing of φ1, φ2, Q1, D2 showing tnonoverlap, tccq, tcd, thold]

Min-Delay: Pulsed Latches
$t_{cd} \ge t_{hold} - t_{ccq} + t_{pw}$
Hold time increased by pulse width
[Figure: latches L1 and L2 clocked by φp around CL; timing of φp, Q1, D2 showing tpw, tccq, tcd, thold]

Time Borrowing
In a flop-based system:
– Data launches on one rising edge
– Must setup before next rising edge
– If it arrives late, system fails
– If it arrives early, time is wasted
– Flops have hard edges
In a latch-based system
– Data can pass through latch while transparent
– Long cycle of logic can borrow time into next
– As long as each loop completes in one cycle

Time Borrowing Example
[Figure: (a) latches clocked by φ1, φ2, φ1 with combinational logic borrowing time across a half-cycle boundary and across a pipeline stage boundary; (b) loop of φ1 and φ2 latches with combinational logic]
Loops may borrow time internally but must complete within the cycle

How Much Borrowing?
2-Phase Latches
$t_{borrow} \le \frac{T_c}{2} - (t_{setup} + t_{nonoverlap})$
Pulsed Latches
$t_{borrow} \le t_{pw} - t_{setup}$
[Figure: latches L1 (φ1) and L2 (φ2) around combinational logic 1; timing of φ1, φ2, D2 showing Tc, Tc/2, nominal half-cycle 1 delay, tborrow, tnonoverlap, tsetup]

Clock Skew
We have assumed zero clock skew
Clocks really have uncertainty in arrival time
– Decreases maximum propagation delay
– Increases minimum contamination delay
– Decreases time borrowing

Skew: Flip-Flops
$t_{pd} \le T_c - (t_{pcq} + t_{setup} + t_{skew})$
– The term in parentheses is the sequencing overhead
$t_{cd} \ge t_{hold} - t_{ccq} + t_{skew}$
[Figure: flops F1 and F2 around combinational logic; setup timing of clk, Q1, D2 showing tskew, tpcq, tpd, tsetup within Tc; hold timing showing tskew, tccq, tcd, thold]

Skew: Latches
2-Phase Latches
$t_{pd} \le T_c - 2 t_{pdq}$
– The term $2 t_{pdq}$ is the sequencing overhead
$t_{cd1}, t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap} + t_{skew}$
$t_{borrow} \le \frac{T_c}{2} - (t_{setup} + t_{nonoverlap} + t_{skew})$
Pulsed Latches
$t_{pd} \le T_c - \max(t_{pdq},\; t_{pcq} + t_{setup} - t_{pw} + t_{skew})$
– The max term is the sequencing overhead
$t_{cd} \ge t_{hold} + t_{pw} - t_{ccq} + t_{skew}$
$t_{borrow} \le t_{pw} - (t_{setup} + t_{skew})$
[Figure: latches L1, L2, L3 clocked by φ1, φ2, φ1 around combinational logic 1 and 2]

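The max-delay and min-delay constraints with skew for the three sequencing methods can be collected into two small helper functions. This is a sketch for comparing methods; the function names and all timing values in the usage lines (in ps) are hypothetical, not from any particular process.

```python
def max_logic_delay(tc, method, *, t_setup, t_pcq, t_pdq, t_pw=0.0, t_skew=0.0):
    """Largest allowed logic propagation delay t_pd per cycle, including skew."""
    if method == "flop":
        return tc - (t_pcq + t_setup + t_skew)
    if method == "two_phase":
        # Transparent latches tolerate skew; only two D->Q delays are lost.
        return tc - 2.0 * t_pdq
    if method == "pulsed":
        return tc - max(t_pdq, t_pcq + t_setup - t_pw + t_skew)
    raise ValueError(f"unknown method: {method}")


def min_logic_delay(method, *, t_hold, t_ccq, t_pw=0.0, t_nonoverlap=0.0, t_skew=0.0):
    """Smallest allowed logic contamination delay t_cd, including skew."""
    if method == "flop":
        return t_hold - t_ccq + t_skew
    if method == "two_phase":
        return t_hold - t_ccq - t_nonoverlap + t_skew
    if method == "pulsed":
        return t_hold + t_pw - t_ccq + t_skew
    raise ValueError(f"unknown method: {method}")


# Hypothetical timing parameters in ps.
setup_args = dict(t_setup=30.0, t_pcq=40.0, t_pdq=35.0, t_pw=60.0, t_skew=20.0)
hold_args = dict(t_hold=20.0, t_ccq=25.0, t_pw=60.0, t_nonoverlap=15.0, t_skew=20.0)
for m in ("flop", "two_phase", "pulsed"):
    print(f"{m:>9}: t_pd <= {max_logic_delay(500.0, m, **setup_args):.0f} ps, "
          f"t_cd >= {min_logic_delay(m, **hold_args):.0f} ps")
```

With these example numbers the pulsed latch leaves the most time for logic but also demands by far the largest contamination delay, which is the hold-time risk noted for pulsed latches.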
Summary
Flip-Flops:
– Very easy to use, supported by all tools
2-Phase Transparent Latches:
– Lots of skew tolerance and time borrowing
Pulsed Latches:
– Fast, some skew tolerance and time borrowing, hold time risk

Timing Error Detection Latches
Designers include timing margin
– Voltage
– Temperature
– Process variation
– Data dependency
– Tool inaccuracies
Alternative: run faster and check for near failures
– Increase frequency until at the verge of error
– Can reduce cycle time by ~30%
Leading flavors: DSTB, Razor II

DSTB
Double-Sampling with Time Borrowing
$t_{detect} = t_{pw} + t_{setupf} - t_{setupl}$
$t_{pd} \le T_c - t_{pcql} - t_{setupf} - t_{skew}$
$t_{cd} \ge t_{pw} + t_{holdl} - t_{ccql} + t_{skew}$
$t_{borrow} = 0$
[Figure: DSTB circuit with latches]

DSTB Time Borrowing
Time borrowing if flip-flop clock is delayed by td
$t_{borrow} = t_d - t_{setupf}$
$t_{detect} = t_{pw} + t_{setupf} - t_{setupl} - t_d = t_{pw} - t_{setupl} - t_{borrow}$
$t_{pd} \le T_c - t_{pcql} - \max(t_{setupf} - t_d,\; 0)$

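The DSTB relations with a delayed flip-flop clock can be evaluated together to see how the delay td trades detection window for borrowing. The sketch below implements the three expressions above directly; the function name and the timing values in ps are hypothetical.

```python
def dstb_budget(tc, t_pw, t_d, t_setupf, t_setupl, t_pcql):
    """Return (t_borrow, t_detect, max t_pd) for DSTB with flop clock delay t_d."""
    t_borrow = t_d - t_setupf
    t_detect = t_pw - t_setupl - t_borrow
    t_pd_max = tc - t_pcql - max(t_setupf - t_d, 0.0)
    return t_borrow, t_detect, t_pd_max


# Hypothetical values in ps.
t_borrow, t_detect, t_pd_max = dstb_budget(
    tc=500.0, t_pw=100.0, t_d=50.0, t_setupf=30.0, t_setupl=20.0, t_pcql=40.0
)
print(f"borrow {t_borrow:.0f} ps, detect {t_detect:.0f} ps, t_pd <= {t_pd_max:.0f} ps")
```

Increasing t_d raises the borrowing time one-for-one while shrinking the detection window by the same amount, so for a fixed pulse width their sum stays at t_pw − t_setupl.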
Razor II
Looks for input transition except when the detection clock is low
Same timing as DSTB
$t_{borrow} = t_{dc} - t_{pcnl}$
$t_{detect} = t_{pw} - t_{setupl} - t_{borrow}$
$t_{pd} \le T_c - t_{pcql} - \max(t_{setupf} - t_d,\; 0)$
$t_{cd} \ge t_{pw} + t_{holdl} - t_{ccql} + t_{skew}$

Trade-offs
Want a wide pulse width for
– Broad detection window
  • Accommodate much uncertainty
– Significant time borrowing
  • Hide impact of clock skew
  • Balance logic between pipeline stages
But wide pulses make hold times hard to satisfy

Two-Phase Adaptive Latches
Break trade-off between detection, borrowing, hold times set by pulse
At cost of 2nd latch in pipeline stage
$t_{borrow} = t_d - t_{setupf}$
$t_{detect} = t_{phase} - t_{setupl} - t_{borrow}$
$t_{pd} \le T_c - t_{pdql1} - t_{pdql2}$
$t_{cd1}, t_{cd2} \ge t_{holdl} - t_{ccql} + t_{skew} - t_{nonoverlap}$
[Figure: two-phase pipeline stage (D1, Q1, Logic 1, D2, Q2, Logic 2, ERR to next stage) with clock waveforms showing Tc, Tc/2, tphase, tnonoverlap, td, tsetupf, tsetupl, tholdl, tdetect]

Summary
Now you should be able to…
– Make back-of-the-envelope predictions of energy in CMOS circuits
– Make informed design choices to reduce power subject to design constraints
– Describe the major sources of variation in circuits
– Make statistical estimates of the impact of variation on energy, delay, and yield
– Analyze and improve noise margins in SRAM
– Apply timing error detection registers to reduce the margins caused by variation