ECE 260B – CSE 241A Power Consumption http://vlsicad.ucsd.edu
ECE 260B – CSE 241A, Winter 2005
Power Consumption
Website: http://vlsicad.ucsd.edu/courses/ece260b-w05
VLSI Design Metrics
Area / cost
Performance
Power consumption
Reliability
Figure courtesy, D. Singh
● Manufacturing yield
● Signal integrity (e.g., crosstalk, supply voltage drop, etc.)
● Logic correctness / acceptable performance variation under process, operating condition variations
● Expected lifetime (due to electromigration, soft errors, peak current, etc.)
Power Dissipation
[Figure: power (Watts, log scale from 0.1 to 100) versus year (1971 to 2000) for lead microprocessors: 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® processor, P6]
Lead Microprocessor’s power continues to increase
Courtesy, Intel
Power delivery and dissipation will be prohibitive(?)
Power Density
[Figure: power density (W/cm2, log scale from 1 to 10000) versus year (1970 to 2010) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® processor, and P6, with reference levels marked for a hot plate, a nuclear reactor, and a rocket nozzle]
Power density too high to keep junctions at low temp(?)
Courtesy, Intel
Low Power Design Drivers
Consumer products
● Affects expected battery lifetime
● Slow development of battery technology (90-110 Watt-hrs/kg)
● Low power means reducing energy consumption
High performance designs
● Increasingly expensive packaging and cooling strategies
- Size, weight, heat sinks
- Air, liquid cooling mechanisms
● Supply voltage drop
● Temperature: every 10 °C increase in operating temperature roughly doubles a component's failure rate
● Low power means reducing peak power consumption for fewer thermal effects, better signal integrity and reliability
- Signal integrity / logic correctness / acceptable performance variation / design lifetime
Low Power Design Metrics
Energy efficiency in Joules
● Energy = power * delay (Joules = Watts * seconds)
● Affects battery lifetime
Average power consumption in Watts
● Results in thermal effects
● Sets packaging limits (50 W/cm2? 120 W total?) ($1/Watt?)
Worst case supply current
● Simultaneous transistor switching
● Supply voltage drop → performance degradation
● Maximum device current → device lifetime
● Electromigration → wire lifetime
Power Versus Energy
[Figure: two plots of power (Watts) versus time, each comparing Approach 1 and Approach 2]
Power is the height of the curve
● A lower power design could simply be slower
Energy is the area under the curve
● The two approaches require the same energy
Slide courtesy of Mary Jane Irwin, PSU
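The power-versus-energy distinction can be illustrated with a minimal sketch. The two constant-power profiles below are hypothetical numbers chosen for illustration, not values from the slide; the function name `energy_joules` is also an assumption.

```python
def energy_joules(power_watts, duration_s):
    """Energy of a constant-power interval: the area of a rectangle under the power curve."""
    return power_watts * duration_s

# Hypothetical profiles: Approach 1 draws 2 W for 1 s, Approach 2 draws 1 W for 2 s.
approach_1 = energy_joules(power_watts=2.0, duration_s=1.0)
approach_2 = energy_joules(power_watts=1.0, duration_s=2.0)

print(f"Approach 1: {approach_1} J, Approach 2: {approach_2} J")
```

Approach 2 halves the power but takes twice as long, so the battery supplies the same energy in both cases.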
Low Power Design Objectives
Worst case supply current I
Average power P = I V
● Maximum cycle power
● Maximum N-cycle power
● Maximum sustainable power
Energy E = ∫ P dt
Energy-delay products
● Simultaneous power reduction and performance optimization
Usually the goal is to reduce average power under timing constraints
Outline
Problem statement
Power dissipation components
Power estimation
Optimization techniques
Static CMOS Gate Power
Power dissipation in a static CMOS gate: 3 components
Dynamic capacitive (switching, “useful”) power
● Still the dominant component in current technology
● Charging and discharging the capacitor
Crowbar current (short-circuit power)
● During a transition, current flows through both P and N transistors simultaneously for a SHORT period of time
● Slow transitions worsen short-circuit power
Leakage (“useless power”) current
● Even when a device is nominally OFF (VGS = 0), a small amount of current is still flowing
● With many devices, can add up to hundreds of mW
Slide courtesy of Mary Jane Irwin, PSU
Reducing Dynamic Capacitive (Switching) Power
$P_{dyn} = C_L V_{DD}^2 P_{0\to1} f$
Capacitance: function of fan-out, wire length, transistor sizes
Supply voltage: has been dropping with successive generations
Clock frequency: increasing…
Activity factor: how often, on average, do wires switch?
Slide courtesy of Mary Jane Irwin, PSU
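A minimal sketch of the switching-power formula follows. The load, voltage, activity, and frequency values are hypothetical and only serve to show the quadratic dependence on supply voltage; `dynamic_power` is an illustrative name.

```python
def dynamic_power(c_load, vdd, p_0_to_1, f_clock):
    """Pdyn = CL * VDD^2 * P(0->1) * f, in Watts for SI inputs."""
    return c_load * vdd**2 * p_0_to_1 * f_clock

# Hypothetical values: 1 nF total switched capacitance, 10% activity, 500 MHz clock.
base = dynamic_power(c_load=1e-9, vdd=2.5, p_0_to_1=0.1, f_clock=500e6)
scaled = dynamic_power(c_load=1e-9, vdd=1.25, p_0_to_1=0.1, f_clock=500e6)

print(f"VDD = 2.5 V:  {base:.4f} W")
print(f"VDD = 1.25 V: {scaled:.4f} W (ratio {base / scaled:.1f}x)")
```

Halving the supply voltage at fixed capacitance, activity, and frequency cuts the switching power by a factor of four.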
Crowbar (Short-Circuit) Current
Finite slope of the input signal causes a direct current path between VDD and GND for a short period of time during switching when both the NMOS and PMOS transistors are conducting
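When $V_{TN} < V_{IN} < V_{DD} + V_{TP}$
● Both transistors are ON
● Current flowing directly from VDD to GND is crowbar current
Usually not a problem, e.g.,
● P is ON strongly (in the linear region, LIN, but with small VDS if at all)
● N is barely ON
[Figure: input voltage V and crowbar current I versus time during a transition; inverter model with RP, RN, and load CL]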
Slide courtesy of Ken Yang, UCLA
Leakage (Inactive, “Useless”) Power
Three sources of leakage
The dominant one is the source-to-drain leakage current
● Even when VGS = 0, a small amount of charge is still present under the gate
● Exponentially related to the gate (and S/D) voltage
$I_D \propto \frac{W}{L} \exp\left( \frac{q (V_{GS} - V_T)}{n k T} \right)$
Source and drain are junctions under some amount of reverse bias, so a junction current IS is present
● Typically much smaller than S/D leakage
Gate tunneling leakage
● When tox is only 5-10 atoms thick, it is easy for tunneling current to flow
● More of an issue below 0.10-µm technology
Slide courtesy of Ken Yang, UCLA
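The exponential dependence of source-to-drain leakage on the threshold voltage can be sketched numerically. The ideality factor n = 1.5 and the temperature of 300 K are assumptions for illustration, and the function names are hypothetical; only the exp(q(VGS − VT)/nkT) form comes from the slide.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
Q = 1.602176634e-19   # elementary charge, C

def leakage_ratio(delta_vt, n=1.5, temp_k=300.0):
    """Factor by which leakage grows when VT is lowered by delta_vt volts at fixed VGS."""
    return math.exp(Q * delta_vt / (n * K_B * temp_k))

def subthreshold_swing_mv(n=1.5, temp_k=300.0):
    """Voltage change (mV) that changes the leakage current by one decade."""
    return 1000.0 * n * K_B * temp_k / Q * math.log(10)

swing = subthreshold_swing_mv()
print(f"Swing: {swing:.1f} mV/decade")
print(f"Lowering VT by 100 mV multiplies leakage by {leakage_ratio(0.1):.1f}")
```

Under these assumed parameters, each threshold reduction of roughly 90 mV costs about one decade of leakage current.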
2001 ITRS Projections of 1/τ and Isd,leak for HP, LP Logic
[Figure: 2001 to 2015 projections of 1/τ (GHz, log scale from 100 to 10000) and Isd,leak (µA/µm, log scale from 1.E-06 to 1.E+01), each plotted for high-performance and low-power logic]
Projections for Low Power Gate Leakage
• Need for high K driven by Low Power, not High Performance
[Figure: normalized Jgate (log scale from 0.0001 to 100000) and normalized Tox (0.00 to 1.00) versus year, 2001 to 2016; curves: simulated Igate for oxy-nitride, Igate spec. from ITRS, and Tox]
Oxy-nitride no longer adequate: high K needed
Summary: Power and Energy Equations
$E = C_L V_{DD}^2 P_{0\to1} + t_{sc} V_{DD} I_{peak} P_{0\to1} + V_{DD} I_{leakage}$
$P = C_L V_{DD}^2 f_{0\to1} + t_{sc} V_{DD} I_{peak} f_{0\to1} + V_{DD} I_{leakage}$
$f_{0\to1} = P_{0\to1} \cdot f_{clock}$
● First term: dynamic power (~90% today and decreasing relatively)
● Second term: short-circuit power (~8% today and decreasing absolutely)
● Third term: leakage power (~2% today and increasing relatively)
Slide courtesy of Mary Jane Irwin, PSU
• Designers need to comprehend issues of memory and logic power, and speed/power tradeoffs at the process (HiPerf vs. LowPower) level
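The three-term power equation can be evaluated directly, as in this rough sketch. All device values below are hypothetical, chosen only so that the terms land in the order the slide describes; `power_components` is an illustrative name.

```python
def power_components(c_load, vdd, f_0_to_1, t_sc, i_peak, i_leak):
    """Return (dynamic, short-circuit, leakage) power in Watts."""
    dynamic = c_load * vdd**2 * f_0_to_1
    short_circuit = t_sc * vdd * i_peak * f_0_to_1
    leakage = vdd * i_leak
    return dynamic, short_circuit, leakage

# Hypothetical gate: 50 fF load, 1.2 V supply, P(0->1) = 0.1 at a 1 GHz clock.
f_0_to_1 = 0.1 * 1e9
dynamic, short_circuit, leakage = power_components(
    c_load=50e-15, vdd=1.2, f_0_to_1=f_0_to_1,
    t_sc=20e-12, i_peak=100e-6, i_leak=100e-9,
)
total = dynamic + short_circuit + leakage
for name, value in [("dynamic", dynamic), ("short-circuit", short_circuit), ("leakage", leakage)]:
    print(f"{name}: {value:.3e} W ({100 * value / total:.1f}%)")
```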
Outline
Problem statement
Power dissipation components
Power estimation
Optimization techniques
Design Abstraction Levels
[Figure: design flow from HDL through Behavioral Synthesis, RTL Synthesis, Logic Optimization, Transistor Optimization, and Place & Route, with a Power Analysis step attached at four points in the flow]
Slide courtesy, Prof. J. Rabaey, UCB
Transistor Level Power Estimation
[Figure: design flow (HDL, Behavioral Synthesis, RTL Synthesis, Logic Optimization, Transistor Optimization, Place & Route) with the labels Power Analysis, Current Flows, and Circuit Simulation]
Slide courtesy, Prof. J. Rabaey, UCB
Power Estimation: Dynamic Analysis
Simulation
● Requires representative simulation vectors
- Derived by designer
- Automatic (Monte Carlo)
Transistor level (PowerMill)
● Very accurate
● Much faster than SPICE
Gate level (Powergate, DesignPower)
● Faster than transistor level
● Still very accurate due to good modeling of power dissipation at the cell level
Power Ingredients
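[Figure: CMOS inverter with supply VDD, input In, output Out, load capacitance CL, and short-circuit current ISC]
• Dynamic Dissipation: $P_{dyn} = C_L V_{DD} V_{sw} f_{0\to1}$
• Short-Circuit Currents: $P_{sc} = V_{DD} I_{sc}$
• Static Dissipation: $P_{stat} = V_{DD} I_{leak}$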
Slide courtesy, Prof. J. Rabaey, UCB
Transistor-Level Power Estimation
SPICE is the reference, but too slow
Commercial tools claim to be within 10% of SPICE accuracy and up to 1000X faster
[Figure: supply current I versus time t]
$P = \frac{1}{T} \int_0^T i(t)\, v(t)\, dt$
Slide courtesy, Prof. J. Rabaey, UCB
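Given sampled current and voltage waveforms from a simulator, the average-power integral can be approximated numerically. The sketch below uses the trapezoidal rule in plain Python; the triangular current pulse and the function name `average_power` are hypothetical.

```python
def average_power(times, currents, voltages):
    """Approximate P = (1/T) * integral of i(t) * v(t) dt with the trapezoidal rule."""
    energy = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        p_prev = currents[k - 1] * voltages[k - 1]
        p_curr = currents[k] * voltages[k]
        energy += 0.5 * (p_prev + p_curr) * dt
    return energy / (times[-1] - times[0])

# Hypothetical samples: a 2 mA triangular supply-current pulse at a constant 1 V supply.
times = [0.0, 1e-9, 2e-9, 3e-9, 4e-9]
currents = [0.0, 2e-3, 0.0, 0.0, 0.0]
voltages = [1.0] * len(times)

p_avg = average_power(times, currents, voltages)
print(f"Average power: {p_avg * 1e3:.3f} mW")
```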
Timing Simulation
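[Figure: circuit with input in, outputs out1, out2, out3, and supply current i(Vdd); waveforms of in, out1, out2, and out3, with a level marked Vdd-Vth]
• Uses simplified (table-lookup) transistor model
• Handles leakage, direct path, and reduced swing
• Up to 2 orders of magnitude faster than SPICE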
Slide courtesy, Prof. J. Rabaey, UCB
Switch-Level Simulation
[Figure: switch-level network with nodes A, B, X, F; plot of capacitance (fF/bit, 0 to 100) versus sample (0 to 60) comparing IRSIM and SPICE]
• Up to 3 orders of magnitude faster than circuit simulation
• Accurate for dynamic power
• Unreliable on leakage and direct path currents
Slide courtesy, Prof. J. Rabaey, UCB
Perspective on accuracy and speed
Comparison between circuit simulation (SPICE) and timing or switch analysis

          Adder               Shift Register
          % Error  Speedup    % Error  Speedup
Timing    6        15         7        3.7
Switch    27       60         4        22
Slide courtesy, Prof. J. Rabaey, UCB
Transistor Level Power Estimation Tools
PowerMill (Epic)
Star-ADM (Avant!)
LSIM Analyst (Mentor Graphics)
• Mixed analog/digital simulation
• Analytic closed-form model
• Mixed transistor/gate simulation
• Series-Parallel Switch algorithm
• Mixed transistor/gate simulation
• Piecewise linear model
Slide courtesy, Prof. J. Rabaey, UCB
Gate-Level Power Estimation
Dynamic Switching Power (Isw) [70-90%]
• Also referred to as capacitive power
Internal (Short-Circuit) Power (Iint) [10-30%]
• Also referred to as short-circuit power
Static Leakage Power (Ileak) [< 1%]
• Sub-threshold leakage dominates, with some due to substrate leakage
[Figure: gate with input IN, load capacitance C, and GND, showing the currents Isw, Iint, and Ileak during an input transition]
Complete power model provides infrastructure for analysis and optimization
Slide courtesy, Prof. J. Rabaey, UCB
Gate-Level Power Estimation
• state of the gate
• input slope
• output load
• temperature
• fabrication process
• toggle rate
Slide courtesy, Prof. J. Rabaey, UCB
Design Abstraction Levels
[Figure: design flow (HDL, Behavioral Synthesis, RTL Synthesis, Logic Optimization, Transistor Optimization, Place & Route) with the labels Toggle Rates, Probabilistic Analysis, Simulation, Power Analysis, and Simulation with integrated Power Analysis]
Slide courtesy, Prof. J. Rabaey, UCB
Simulation Based Power Estimation
Problems:
● The relationship of power versus primary input probabilities and activities is a complicated surface.
● The existing methods use discrete points to approximate such a surface.
- The effectiveness strongly depends on the density of the chosen points.
- The more points one chooses, the more accurate the results.
- More points directly translate to longer CPU time.
Slide courtesy, Z. Chen, K. Roy
Toggle Rate Estimation
Probabilistic Propagation
● no input vectors needed
● much faster than simulation
● less accurate than simulation
● glitches?
Simulation
● requires representative simulation vectors
- derived by designer
- automatic (Monte Carlo)
Slide courtesy, Prof. J. Rabaey, UCB
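The simulation-based route with automatically generated (Monte Carlo) vectors can be sketched for a single gate. This is an illustrative zero-delay model of a 2-input AND gate with independent random inputs; the function name, seed, and cycle count are assumptions, and no glitches are modeled.

```python
import random

def monte_carlo_and_gate(num_cycles, p_a=0.5, p_b=0.5, seed=0):
    """Estimate P(out = 1) and P(0 -> 1 transition) of a 2-input AND gate
    by applying one independent random input vector per clock cycle."""
    rng = random.Random(seed)
    ones = 0
    rising = 0
    previous = 0
    for cycle in range(num_cycles):
        a = rng.random() < p_a
        b = rng.random() < p_b
        out = int(a and b)
        ones += out
        if cycle > 0 and previous == 0 and out == 1:
            rising += 1
        previous = out
    return ones / num_cycles, rising / (num_cycles - 1)

sp_est, tp_est = monte_carlo_and_gate(20000)
print(f"Estimated signal probability: {sp_est:.4f}")
print(f"Estimated 0->1 transition probability: {tp_est:.4f}")
```

For independent inputs at probability 0.5 the estimates should approach 0.25 and 0.75 * 0.25 = 0.1875; the number of vectors trades accuracy against run time.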
Signal Probability and Activity
● Signal probability: probability of a signal being logic ONE
$P_i = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} i(t)\, dt$
● Signal activity (transition density): probability of signal switching
$A_i = \lim_{T \to \infty} \frac{n_i(T)}{T}$
where $n_i(T)$ is the number of switching events of $i(t)$ in $[-T/2, T/2]$
Slide courtesy, Z. Chen, K. Roy
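For a finite, uniformly sampled logic waveform the two limits reduce to simple counts. The sketch below is a finite-window approximation under that assumption; the sample values and function names are hypothetical.

```python
def signal_probability(samples):
    """Fraction of the observation window during which the signal is logic ONE."""
    return sum(samples) / len(samples)

def signal_activity(samples, sample_period):
    """Transitions per unit time over the observation window (transition density)."""
    transitions = sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)
    return transitions / (len(samples) * sample_period)

# Hypothetical waveform, one sample per unit time.
waveform = [0, 1, 1, 0, 1, 0, 0, 1]
prob = signal_probability(waveform)
act = signal_activity(waveform, sample_period=1.0)
print(f"P = {prob}, A = {act} transitions per unit time")
```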
Power Dissipation in Terms of Activity
● Normalized activity
$a_i = \frac{A_i}{f}$, where $f$ is the clock frequency
● Approximated power dissipation
$P_{avg} = \frac{1}{2} V_{dd}^2 \sum_{j \in \text{all nodes}} C_j A_j$
where $C_j$ is the node capacitance and $A_j$ is the node activity
● Normalized power dissipation measure
$\sum_{j \in \text{all nodes}} \mathrm{fanout}(j)\, a_j$
where fanout(j) is the fanout number at node j
Slide courtesy, Z. Chen, K. Roy
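Both activity-based measures are plain weighted sums, as this small sketch shows. The node capacitances, activities, and fanouts are hypothetical, and the function names are illustrative.

```python
def average_power(vdd, nodes):
    """P_avg = 0.5 * Vdd^2 * sum(C_j * A_j); nodes is a list of (capacitance F, activity 1/s)."""
    return 0.5 * vdd**2 * sum(c * a for c, a in nodes)

def normalized_measure(fanouts, normalized_activities):
    """Sum of fanout(j) * a_j, using fanout as a proxy for node capacitance."""
    return sum(fo * a for fo, a in zip(fanouts, normalized_activities))

# Hypothetical two-node circuit.
nodes = [(10e-15, 1e8), (20e-15, 5e7)]
p_avg = average_power(vdd=1.0, nodes=nodes)
measure = normalized_measure(fanouts=[2, 1], normalized_activities=[0.1, 0.05])
print(f"P_avg = {p_avg:.2e} W, normalized measure = {measure:.2f}")
```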
Probability Propagation
Let y = f(x1, …, xn) be a Boolean function with independent variables xi. The signal probability of f can be obtained in linear time as follows:
$P(y) = P(x_1)\, P(f_{x_1}) + P(\bar{x}_1)\, P(f_{\bar{x}_1})$
where
$f_{x_1} = f(1, x_2, \ldots, x_n), \quad f_{\bar{x}_1} = f(0, x_2, \ldots, x_n)$
are the cofactors of f with respect to x1.
Improve runtime by using a BDD
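The cofactor expansion can be applied recursively, one variable at a time. The sketch below does this on a plain Python function rather than a BDD, so it visits every input combination and runs in exponential time; sharing subgraphs in a BDD is what removes that cost. The name `signal_prob` is an assumption.

```python
def signal_prob(f, probs, assigned=()):
    """P(f = 1) by Shannon expansion on the next unassigned variable.
    f takes a tuple of n bits; probs[i] is P(x_i = 1), inputs assumed independent."""
    i = len(assigned)
    if i == len(probs):
        return float(f(assigned))
    p = probs[i]
    positive = signal_prob(f, probs, assigned + (1,))   # cofactor with x_i = 1
    negative = signal_prob(f, probs, assigned + (0,))   # cofactor with x_i = 0
    return p * positive + (1.0 - p) * negative

p_and = signal_prob(lambda x: x[0] & x[1], [0.5, 0.5])
p_or = signal_prob(lambda x: x[0] | x[1], [0.5, 0.5])
print(f"P(a AND b) = {p_and}, P(a OR b) = {p_or}")
```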
Activity Propagation
Let y = f(x1, …, xn) be a Boolean function with independent variables xi. The signal activity of f can be obtained in linear time as follows:
$A(y) = \sum_{i=1}^{n} P\left( \frac{\partial y}{\partial x_i} \right) A(x_i)$
where the Boolean difference is
$\frac{\partial y}{\partial x} = y|_{x=1} \oplus y|_{x=0}$
and $\oplus$ is the exclusive-or operation.
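A brute-force sketch of activity propagation follows. It evaluates the probability of each Boolean difference by enumerating the other inputs, which is exponential in the number of inputs and is meant only to make the formula concrete; the function names and the input activities are hypothetical.

```python
from itertools import product

def prob_boolean_difference(f, probs, i):
    """P(dy/dx_i = 1): probability that toggling x_i changes the output of f."""
    n = len(probs)
    others = [j for j in range(n) if j != i]
    total = 0.0
    for values in product((0, 1), repeat=n - 1):
        bits = [0] * n
        weight = 1.0
        for j, v in zip(others, values):
            bits[j] = v
            weight *= probs[j] if v else 1.0 - probs[j]
        bits[i] = 1
        y_high = f(bits)
        bits[i] = 0
        y_low = f(bits)
        if y_high != y_low:   # y|x=1 XOR y|x=0 is 1
            total += weight
    return total

def output_activity(f, probs, activities):
    """A(y) = sum over i of P(dy/dx_i) * A(x_i)."""
    return sum(prob_boolean_difference(f, probs, i) * activities[i]
               for i in range(len(probs)))

# 2-input AND gate: dy/dx1 = x2, so P(dy/dx1) = P(x2).
a_y = output_activity(lambda x: x[0] & x[1], probs=[0.5, 0.5], activities=[0.2, 0.4])
print(f"A(y) = {a_y:.3f}")
```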
Probability Propagation
AND gate
sp(1) = sp1 * sp2
tp(0→1) = sp * (1 - sp)
Example
sp = 0.5 * 0.5 = 0.25
tp = 0.25 * (1 - 0.25) = 0.1875
[Figure: propagating signal probabilities through a network: four inputs at 1/2, two intermediate nodes at 1/4, output at 7/16]
Slide courtesy, Prof. J. Rabaey, UCB
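The AND-gate rules and the network values can be reproduced in a few lines. The gate types in the lost figure are an assumption here: two AND gates feeding an OR gate is one structure consistent with the values 1/2, 1/4, and 7/16.

```python
def and_sp(sp1, sp2):
    """Signal probability at the output of an AND gate with independent inputs."""
    return sp1 * sp2

def or_sp(sp1, sp2):
    """Signal probability at the output of an OR gate with independent inputs."""
    return 1.0 - (1.0 - sp1) * (1.0 - sp2)

def tp_0_to_1(sp):
    """Probability of a 0 -> 1 transition, assuming successive cycles are uncorrelated."""
    return sp * (1.0 - sp)

n1 = and_sp(0.5, 0.5)
n2 = and_sp(0.5, 0.5)
out = or_sp(n1, n2)
print(f"AND outputs: {n1}, {n2}; tp = {tp_0_to_1(n1)}")
print(f"Assumed OR output: {out}")
```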
Probability Propagation for Basic Gates
Ignores Temporal and Spatial Correlations
Slide courtesy, Prof. J. Rabaey, UCB
Probability Propagation Problems
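[Figure: circuit with reconvergent fan-out; inputs at probability 0.5 and 0.5, internal node at 0.75, output estimated as 0.375 by gate-by-gate propagation but actually 0.5]
Problem: reconvergent fan-out
● Creates spatial correlation between signals
● Becomes complex and intractable very quickly
P(X) = P(B=1) · P(X=1 | B=1)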
Slide courtesy, Prof. J. Rabaey, UCB
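The 0.375 versus 0.5 discrepancy can be reproduced with a small sketch. The circuit X = (A OR B) AND B is an assumption, chosen because it matches the numbers 0.5, 0.5, 0.75, 0.375, and 0.5; the original figure did not survive.

```python
from itertools import product

def exact_probability(f, probs):
    """P(f = 1) by enumerating all input combinations of independent inputs."""
    total = 0.0
    for bits in product((0, 1), repeat=len(probs)):
        weight = 1.0
        for b, p in zip(bits, probs):
            weight *= p if b else 1.0 - p
        if f(bits):
            total += weight
    return total

def x_function(bits):
    """Assumed circuit: B fans out to an OR gate and to the final AND gate."""
    a, b = bits
    return (a | b) & b

p_a, p_b = 0.5, 0.5
p_or = 1.0 - (1.0 - p_a) * (1.0 - p_b)        # 0.75 at the internal node
naive = p_or * p_b                            # treats the OR output and B as independent
exact = exact_probability(x_function, [p_a, p_b])
print(f"Gate-by-gate estimate: {naive}, exact: {exact}")
```

The gate-by-gate estimate is wrong because the OR output is correlated with B; here X simplifies to B, so its probability is exactly P(B).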
Solution to Reconvergence
[Figure: circuit annotated with probabilities 0.5, 0.75, and 0.375, and its OBDD over variables b, c, a for Z = bc + abc, with node probabilities 1, 0.5, 0.25, 0.25, 0.125 and result 0.375]
Preferred technique: Ordered Binary Decision Diagrams (OBDDs)
● Statistics computed in linear time (but graph size could be exponential)
Other approaches:
● super-gates
● computation of correlation coefficients
Slide courtesy, Prof. J. Rabaey, UCB
How to introduce time?
And include glitching effects …
TOUGH! If one also wants to include spatial effects or be general
Example: Symbolic Simulation Approach (for unit delay)
Slide courtesy, Prof. J. Rabaey, UCB
Symbolic Network
Transition Counters
Value of d at time t= 0
Problem: Network can be huge and BDD cannot be created!
Slide courtesy, Prof. J. Rabaey, UCB
Probability Simulation
The user specifies typical signal behavior at the circuit inputs using probability waveforms. A probability waveform is a sequence of values indicating the probability that the signal is high during certain time intervals, and the probability that the signal makes a transition from low to high.
Propagation is very similar to event-driven logic simulation
[Figure: example probability waveform with levels 0.5, 0.25, 0.75, and 0.0 over intervals bounded by t1, t2, t3, and transition probabilities 0.2, 0.6, and 0.01]
How about sequential circuits?
[Figure: combinational logic block with inputs I0, It and present-state signals PS0, PSt, producing the Next State]
• Next State Logic introduces temporal correlations between subsequent samples
• Either assume that all states have equal probability, or use statistical Markov chains
Slide courtesy, Prof. J. Rabaey, UCB
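The Markov-chain alternative to assuming equally likely states can be sketched with a simple power iteration. The 2-state machine and its transition probabilities are hypothetical; in practice they would follow from the next-state logic and the input probabilities.

```python
def steady_state(transition, iterations=200):
    """Steady-state distribution of a Markov chain by repeated multiplication.
    transition[i][j] is the probability of moving from state i to state j."""
    n = len(transition)
    dist = [1.0 / n] * n          # start from the equal-probability assumption
    for _ in range(iterations):
        dist = [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return dist

# Hypothetical FSM: S0 -> S1 with probability 0.2, S1 -> S0 with probability 0.6.
transition = [
    [0.8, 0.2],
    [0.6, 0.4],
]
pi = steady_state(transition)
print(f"P(S0) = {pi[0]:.3f}, P(S1) = {pi[1]:.3f}")
```

In this example the machine spends three quarters of its time in S0, so assuming equal state probabilities would misweight the activity of the logic driven by the state bits.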
Gate-Level Power Estimation Tools
DesignPower Synopsys
PowerSim Systems Science
Power_tool Veritools
WattWatcher Gate
Sente
Viewlogic
Genashor Xpower
POET
• Probabilistic based
• Simulation based
• Asynchronous designs
• Simulation based
• Simulation based
• Simulation based
• Simulation based
Slide courtesy, Prof. J. Rabaey, UCB
Power Estimation
Simulation
● Monte-Carlo technique
● PowerMill at transistor level
● Verilog-XL at gate level
Hierarchical simulation
● Architectural/gate/transistor-level
● Parameterized power model for each module
Statistical estimation
● Signal probability propagation
Power Estimation Methodology
[Figure: power estimation flow. Library side: RTL library and synthesis conditions → Synthesis → P&R → post-layout netlist → Power Characterization → power macro-model database → power model library generator → Powerlib.vhd, Powerlib.v, Powerlib.c. Design side: RTL design → RTL planning/mapping → structure (macro) netlist → power model inference and estimation code generation → enhanced RTL → RTL simulation with testbench stimuli → power report and power waveform/profile]
Inaccuracies in Power Estimation
In increasing order:
The number of input stimuli did not cause any error above the 10% mark if we considered at least 10 input patterns
Using a gate-level simulator as opposed to a circuit simulator caused an error of about +/-15%
Repowering and physical design introduced inaccuracies below 20%
Glitch power varied between 7%-43%
Internal gate capacitances, which are a function of the target library, accounted for about half the power
Optimization and technology mapping may cause power estimates to be off by an order of magnitude
Power and Synthesis Flow
[Figure: accuracy of power estimation versus potential for power savings across the Behavioral, RTL, Gate, and Switch levels, with the values 20%, 400%, 50%, and 10%]
Slide courtesy, Prof. J. Rabaey, UCB
Expectations
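[Table: expected power savings by design level.
Levels: Algorithmic, Behavioral, RT Level, Tech. indep., Tech. dep., Layout.
Transformations: power management, algorithm selection, concurrency, memory, clock control, structural transformation, extraction/decomposition, technology mapping, gate sizing, placement.
Expected savings: orders of magnitude, several times, 10-90%, 10-15%, 15%, 20%, 20%, 20%]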
Slide courtesy, Prof. J. Rabaey, UCB
Power Estimation / Improving Guidelines
Before technology mapping, the accuracy levels are unacceptable
It is necessary to take into account internal gate capacitances as well as wire capacitances
Gate-level estimation implies >15% error
Simulation with as few as 10 patterns from typical inputs for a typical starting state is often sufficient to reach confidence levels matching those of gate-level simulation
Power improving transformations should be
● run in late design stages
● applied only if they can predict significant power improvement
● applied many times (hundreds) to maximize the confidence of positively impacting the design
Outline
Problem statement
Power dissipation components
Power estimation
Optimization techniques
Low Power Design Techniques
Reducing chip and package capacitance
Scaling the supply / threshold voltages
Using power management strategies
Employing better design techniques
Reducing Capacitance
Minimum area → minimum power consumption
Wirelength minimization with switching activities as weighting factors
● Placement / routing / partition / floorplanning
Clock gating
Sleep transistors
CMOS Device and Voltage Scaling
Dual transistor threshold
● Low Vth transistors optimize performance
● High Vth transistors reduce leakage power
● Transistors with the same Vth need to be grouped together
Dual supply voltage
● High Vdd transistors on critical paths
● Low Vdd transistors reduce power
● Level converters between signals of different voltage swings
● Routing cost of dual power supply
Extension of classical transistor sizing algorithm, e.g., TILOS
Power Management Strategies
Inactive hardware modules are automatically turned off to save power (for example, monitors, laptops, etc.)
Transistors on non-critical data paths are slowed down, e.g., by dynamically scaling down their supply voltages (for example, in Transmeta microprocessors)
● Sleep transistors
● Power gating (controllable power supply mechanism)
Transistor-Level Power Optimization
Optimizes up to 30,000 transistors at a time
Starts from three initial solutions: the initial sizes, all transistors sized up by a constant factor, and all transistors at identical size
Optimization modes:
● individual transistor sizing
● retain ratios between connected NMOS and PMOS devices
● pseudo-NMOS
Optimization goals
● Delay
● Power
● Slack
AMPS - Epic
Slide courtesy, Prof. J. Rabaey, UCB
Gate-Level Power Optimization
[Figure: Power Optimization, as part of Logic Optimization, takes a technology library, a logic or gate netlist, switching activity, constraints (timing, power, area), and parasitics (capacitance) as inputs and produces a power-optimized gate-level netlist]
Slide courtesy, Prof. J. Rabaey, UCB
Gate-Level Tradeoffs for Power
Factoring
Structuring
Buffer insertion/deletion
Don’t care optimization
Technology mapping
Sizing
Pin assignment
Slide courtesy, Prof. J. Rabaey, UCB
Factoring
Idea: Remove common expressions to reduce capacitance
Caveat: This may increase activity!
Pa = 0.1
Pb = 0.5
Pc = 0.5
Slide courtesy, Prof. J. Rabaey, UCB
Logic Restructuring
Logic restructuring to minimize spurious transitions
Buffer insertion for path balancing
Slide courtesy, Prof. J. Rabaey, UCB
Technology Mapping
[Figure: gate network with inputs a, b, c, d; slack = 1]
Smaller gates reduce capacitance, but are slower
Slide courtesy, Prof. J. Rabaey, UCB
Technology Mapping
Example: 6-input AND
Implemented using 6-input NAND, 3-input NAND, and 2-input NAND gates [Bellaouar, ElMasry]
Library 1: High-Speed
Library 2: Low-Area
Slide courtesy, Prof. J. Rabaey, UCB
Technology Mapping — Example
Mapping results for high-speed library

             6-input  3-input  2-input
Area         9        11       13
Delay (ns)   1.1      0.86     0.83
Energy (fF)  6.7      42.5     89.4

Energy comparison between libraries

             6-input  3-input  2-input
Library 1    6.7      42.5     89.4
Library 2    3.5      19.5     43.7
Slide courtesy, Prof. J. Rabaey, UCB
Sequential Logic Optimization
State encoding
● seems to be of minimal impact in general
Data encoding in data paths
● e.g. use of sign-magnitude, one-hot, or redundant representations
● mostly ad hoc
Retiming for low power
● registers can be strategically placed to reduce glitching, or to perform path balancing
Clock gating
Pre-computation
Slide courtesy, Prof. J. Rabaey, UCB
Clock gating
Requires careful skew control
Scary in the current logic synthesis world!
Slide courtesy, Prof. J. Rabaey, UCB
Pre-computation
Other options:
• guarded evaluation
• set output directly
Inputs xi … xn are not applied if pre-computing holds
Slide courtesy, Prof. J. Rabaey, UCB
Power Compiler
Results:
● design dependent
● library dependent
● average 15-20% push-button reduction in power
Slide courtesy, Prof. J. Rabaey, UCB
Low Power Synthesis
Introduce more concurrency for performance improvement
● Linear power consumption increase
Reduce power consumption by scaling down voltages
● Quadratic power consumption decrease
Concurrency increasing transformations
● Loop unrolling
● Control flow optimizations
Critical path reducing transformations
● Logic level minimization
● Retiming
● Pipelining
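The trade of concurrency for voltage can be made concrete with the P = C · V² · f relation. The sketch below compares a reference design with a two-way parallel version; the capacitance overhead factor (2.15) and the reduced supply (2.9 V from 5 V) are hypothetical values, and how far the supply can actually drop depends on a delay-versus-voltage model not shown here.

```python
def relative_power(cap_factor, vdd_factor, freq_factor):
    """Power relative to a reference design, from P = C * V^2 * f."""
    return cap_factor * vdd_factor**2 * freq_factor

# Hypothetical 2-way parallel datapath: about twice the capacitance plus routing
# overhead, each unit clocked at half the rate, supply lowered to recover the slack.
parallel = relative_power(cap_factor=2.15, vdd_factor=2.9 / 5.0, freq_factor=0.5)

# Same duplication without lowering the supply: no power benefit at equal throughput.
no_scaling = relative_power(cap_factor=2.15, vdd_factor=1.0, freq_factor=0.5)

print(f"Parallel with voltage scaling: {parallel:.2f}x reference power")
print(f"Parallel without voltage scaling: {no_scaling:.2f}x reference power")
```

The linear cost of the added hardware is outweighed by the quadratic gain from the lower supply, which is why concurrency-increasing transformations pay off only when paired with voltage scaling.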
Summary
Design Flow for Power well covered at circuit and gate level
Most emphasis on analysis — not much on optimization
Overall optimization results are mixed
Plenty of room at the physical end
● transistor sizing, circuit style selection, synthesis for pass-transistor networks, threshold selection
Slide courtesy, Prof. J. Rabaey, UCB