Lecture 10: Low Power Design
• Notes:
  – J. Rabaey, A. Chandrakasan, B. Nikolic, “Digital Integrated Circuits: A Design Perspective,” 2nd ed., Prentice Hall, 2003.
  – Shekhar Borkar, Intel Corporation, “Digital Design for Low Power Systems”
  – M. Horowitz, Stanford University, EE271 Lecture Notes
• Books
  – J. Rabaey, A. Chandrakasan, B. Nikolic, “Digital Integrated Circuits: A Design Perspective,” 2nd ed., Prentice Hall, 2003.
  – A. Chandrakasan, W. Bowhill, F. Fox (eds.), “Design of High-Performance Microprocessor Circuits,” IEEE Press, 2001
  – A. Chandrakasan and R. Brodersen, “Low Power CMOS Design,” IEEE Press, 1998
• Articles
  – A.P. Chandrakasan and R.W. Brodersen, “Minimizing power consumption in digital CMOS circuits,” Proc. of IEEE, no. 4, p. 498-523, April 1995
  – A.P. Chandrakasan, S. Sheng, R.W. Brodersen, “Low Power CMOS digital design,” IEEE Journal of Solid-State Circuits, vol. 27, no. 4, p. 473-84, April 1992
  – T. Kuroda, T. Sakurai, “Overview of low power ULSI circuit techniques,” IEICE Trans. on Electronics, vol. E78-C, no. 4, p. 334-344, April 1995
  – S. Borkar, “Design challenges of technology scaling,” IEEE Micro, vol. 19, no. 4, p. 23-29, July-Aug. 1999
Principles of Low Power Design
• α – switching probability
• CL – capacitive load
• Vswing – voltage swing
• f – frequency
• VDD – supply voltage
• ISC – mean value of switching current transient
• ΔtSC – short-circuit current time
• IDC – static current
• ILEAK – leakage current
$$P \sim \alpha \left( C_L V_{swing} + I_{SC}\,\Delta t_{SC} \right) V_{DD}\, f + \left( I_{DC} + I_{LEAK} \right) V_{DD}$$
Dominant term:
$$P \sim \alpha\, C_L V_{swing} V_{DD}\, f$$
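As a rough illustration, the power expression can be evaluated term by term to see why the switching term usually dominates. This is a minimal sketch: the function name and every numeric value are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def total_power(alpha, c_load, v_swing, v_dd, f,
                i_sc=0.0, dt_sc=0.0, i_dc=0.0, i_leak=0.0):
    """P ~ alpha*(C_L*V_swing + I_SC*dt_SC)*V_DD*f + (I_DC + I_LEAK)*V_DD."""
    switching = alpha * c_load * v_swing * v_dd * f      # dominant term
    short_circuit = alpha * i_sc * dt_sc * v_dd * f      # crowbar current during transitions
    static = (i_dc + i_leak) * v_dd                      # DC and leakage current
    return {
        "switching": switching,
        "short_circuit": short_circuit,
        "static": static,
        "total": switching + short_circuit + static,
    }

# Hypothetical operating point (not taken from the lecture).
p = total_power(alpha=0.1, c_load=1e-9, v_swing=1.0, v_dd=1.0, f=100e6,
                i_sc=1e-3, dt_sc=1e-10, i_leak=1e-4)
for name, watts in p.items():
    print(f"{name:>13}: {watts * 1e3:.4f} mW")
```

With these assumed numbers the switching term is about two orders of magnitude larger than the other two combined, which is the situation the dominant-term approximation describes.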
Principles of Power Reduction
• Reducing Load Capacitance (CL)
• Reducing Supply Voltage (VDD)
• Reducing Frequency ( f )
• Reducing Switching Activity (α)
• Reducing Leakage Current (ILEAK)
$$P \sim \alpha \left( C_L V_{swing} + I_{SC}\,\Delta t_{SC} \right) V_{DD}\, f + \left( I_{DC} + I_{LEAK} \right) V_{DD}$$
Low Power Design Metrics
• Good design has always meant careful trade-offs
  – Used to worry about performance, area, and design time
  – Need to consider energy too
• Need to understand design trade-offs
• Make low power a design objective
• Need some metrics for low power designs
  – Provides a way to compare designs
  – Which design / style is better
  – What should a designer expect
Metrics - Power
• Obvious choice
  – Sets battery life in hours, packaging limits
• Problem
  – Dynamic power is proportional to frequency (f)
• Often want to do more, not have it take longer
  – Comparing the power of two designs can be misleading
  – Lower power design could simply be slower
Metrics – Energy / Operation
• Rather than look at power, look at the total energy needed to complete some operation.
  – Fixes the obvious problem with the power metric, since changing the operating frequency does not change the answer
• The energy is the area under the power-versus-time curve.
  – However, one can decrease the energy/op by doing things that slow down the chip, like lowering the supply voltage or using smaller transistors.
Energy / Op = Power x Delay / Op
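A short numerical check of why energy/op is insensitive to clock frequency while power is not. This is a sketch under stated assumptions: one operation per clock cycle, full-swing switching so that P = C V^2 f, and hypothetical values for the capacitance and supply.

```python
def dynamic_power(c_eff, v_dd, f):
    """Dominant switching power, P = C * V^2 * f (full-swing case)."""
    return c_eff * v_dd ** 2 * f

def energy_per_op(power, delay_per_op):
    """Energy / Op = Power x Delay / Op."""
    return power * delay_per_op

C_EFF = 1e-9   # hypothetical switched capacitance per operation, in farads
V_DD = 1.0     # hypothetical supply voltage, in volts

results = {}
for f in (100e6, 50e6):
    p = dynamic_power(C_EFF, V_DD, f)
    # Assumption: one operation completes per clock cycle, so Delay/Op = 1/f.
    results[f] = (p, energy_per_op(p, 1.0 / f))

for f, (p, e) in results.items():
    print(f"f = {f / 1e6:.0f} MHz: power = {p * 1e3:.1f} mW, "
          f"energy/op = {e * 1e9:.2f} nJ")
```

Halving the clock halves the power, but the energy spent per operation stays the same, so the slower design is not actually more efficient.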
Power-Delay vs Delay
[Figure: log-log plot of power-delay product (pJ) versus delay (ns) for 8-bit adders in 2.0 µm, annotated with the direction of decreasing Vdd. Labeled curves: CPL, Optimized static, Conventional Static, Carry Select, Standard Cell, DCVSL dynamic.]
Metrics – Energy/Op and Delay/Op
• Since lower energy solutions might simply be lower performance, we need to have a metric that includes both energy and performance.
  – One solution is to think about a 2-D solution space:
• Clearly ‘b’ is lower power than ‘c’. But can we say anything about ‘b’ and ‘d’? While ‘d’ is slightly lower power, it is also much slower.
Metrics – Energy/Op x Delay/Op
• Constant energy-delay products are straight lines in the graph:
• Implies one can trade lower energy/op for increased delay
  – Most effective way to decrease Energy/Op is by supply scaling
Energy x Delay = Power x (Delay / Op)^2
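The energy-delay product gives a single number for ranking points in the 2-D energy/delay space. The sketch below uses two made-up design points, one slightly lower in energy but much slower than the other. The names and values are hypothetical and are not read off any figure in these notes.

```python
from dataclasses import dataclass

@dataclass
class Design:
    name: str
    energy_per_op: float   # joules per operation
    delay_per_op: float    # seconds per operation

    @property
    def edp(self) -> float:
        """Energy x Delay, equivalently Power x (Delay/Op)^2."""
        return self.energy_per_op * self.delay_per_op

# Hypothetical design points: "slow" saves 10% energy but takes twice as long.
designs = [
    Design("fast", energy_per_op=10e-12, delay_per_op=5e-9),
    Design("slow", energy_per_op=9e-12, delay_per_op=10e-9),
]

best = min(designs, key=lambda d: d.edp)
for d in designs:
    print(f"{d.name}: EDP = {d.edp:.2e} J*s")
print("lowest energy-delay product:", best.name)
```

Under this metric the small energy saving of the slower point does not make up for its doubled delay.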
Reducing Active Power - Voltage Scaling
Voltage Scaling
• Changing power supply voltage:
  – Has a large effect on power, since P = C V^2 f
  – Affects performance too
• Energy x Delay is:
  E = C V^2
  (α is between 1 and 2 and models velocity saturation)
Energy-Delay for Quadratic Devices
• For a quadratic device it is often drawn as:
  [Figure: unlabeled energy-delay curve]
• Looks like this is an important optimal point, but you should be cautious (no labels on the graph)
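The optimum in the energy-delay curve can be sketched analytically. The derivation below assumes the alpha-power delay model, t_d ≈ C V / (k (V − V_th)^α), which is a standard model but is an assumption here, since the delay expression does not survive in these notes. The drive constant k is likewise introduced only for illustration.

```latex
\begin{align*}
E &= C V^2, \qquad t_d \approx \frac{C V}{k\,(V - V_{th})^{\alpha}} \\
E \cdot t_d &\approx \frac{C^2 V^3}{k\,(V - V_{th})^{\alpha}} \\
\frac{d}{dV}\ln\!\left(E \cdot t_d\right) &= \frac{3}{V} - \frac{\alpha}{V - V_{th}} = 0
  \;\;\Rightarrow\;\; V_{opt} = \frac{3}{3 - \alpha}\,V_{th} \\
\alpha = 2: \quad V_{opt} &= 3\,V_{th}, \qquad
  \left(E \cdot t_d\right)_{min} = \frac{27}{4}\,\frac{C^2}{k}\,V_{th}
\end{align*}
```

Under this model the minimum energy-delay product for a quadratic device (α = 2) is proportional to V_th. For α = 1 (fully velocity saturated) the same formula gives V_opt = 1.5 V_th, so the location of the optimum depends strongly on the device model.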
Vth scaling
• Optimal Energy-Delay product = k Vth
• Scaling Vth reduces dynamic power
• Minimum Vth set by static leakage power
  – When you turn a transistor off, there is still some leakage
  – Leakage current falls off exponentially with (Vth - Vgs)
  – Since the minimum of Vgs is 0, need some Vth to turn devices off
$$I_{ds} = I_{ds0}\, e^{\frac{V_{gs} - V_t}{n v_T}} \left( 1 - e^{-\frac{V_{ds}}{v_T}} \right)$$
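To get a feel for the exponential dependence, the subthreshold current expression, I_ds = I_ds0 · exp((V_gs − V_t)/(n·v_T)) · (1 − exp(−V_ds/v_T)), can be evaluated for two threshold voltages. This is a minimal sketch: the values of I_ds0, n, v_T and both thresholds are hypothetical, picked only to show the trend.

```python
import math

def subthreshold_current(v_gs, v_ds, v_t, i_ds0=1e-7, n=1.5, v_thermal=0.026):
    """Subthreshold drain current for the given bias.

    i_ds0, n and v_thermal are illustrative defaults, not values from the lecture.
    """
    gate_term = math.exp((v_gs - v_t) / (n * v_thermal))
    drain_term = 1.0 - math.exp(-v_ds / v_thermal)
    return i_ds0 * gate_term * drain_term

# Off transistor (Vgs = 0) with full supply across it, for two thresholds.
leak_high_vt = subthreshold_current(v_gs=0.0, v_ds=1.0, v_t=0.4)
leak_low_vt = subthreshold_current(v_gs=0.0, v_ds=1.0, v_t=0.3)
ratio = leak_low_vt / leak_high_vt

print(f"leakage at Vt = 0.4 V: {leak_high_vt:.3e} A")
print(f"leakage at Vt = 0.3 V: {leak_low_vt:.3e} A")
print(f"increase from lowering Vt by 100 mV: {ratio:.1f}x")
```

With these assumed parameters, lowering V_t by 100 mV raises the off-state current by more than a factor of ten, which is why leakage sets the lower limit on V_th.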
Reducing Active Power – Transistor Sizing
• Use smaller transistors to reduce power
  – Power decreases since the load capacitance decreases
  – Delay increases since the driving resistance increases
• Simple Example
  Delay = R(Cg + CL) = R Cg (1 + CL/Cg)
  Energy = Cg V^2
[Figure: plot against Cg/CL]
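The simple example can be swept numerically. The sketch below assumes that the product R·Cg stays constant as the transistor is resized (R scales as 1/W and Cg as W), which is the usual first-order assumption but is not stated in the notes. All numeric values are hypothetical.

```python
R_CG = 10e-12   # hypothetical intrinsic R*Cg product, in seconds
C_L = 10e-15    # hypothetical fixed load capacitance, in farads
V = 1.0         # hypothetical supply voltage, in volts

def sizing_point(cg_over_cl):
    """Return (delay, energy) for a given gate-to-load capacitance ratio.

    Delay  = R*Cg*(1 + CL/Cg), with R*Cg held constant while resizing.
    Energy = Cg*V^2, as in the simple example.
    """
    c_g = cg_over_cl * C_L
    delay = R_CG * (1.0 + 1.0 / cg_over_cl)
    energy = c_g * V ** 2
    return delay, energy

sweep = [(ratio, *sizing_point(ratio)) for ratio in (0.25, 0.5, 1.0, 2.0, 4.0)]

print("Cg/CL   delay (ps)   energy (fJ)")
for ratio, delay, energy in sweep:
    print(f"{ratio:5.2f}   {delay * 1e12:10.1f}   {energy * 1e15:11.1f}")
```

In this model the delay improves quickly while Cg/CL is below about 1 and then flattens toward R·Cg, whereas the energy keeps growing linearly. Oversizing beyond that point buys little speed for a lot of energy.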
Reducing Active Power – Downsize non-critical paths
• Lowering the supply on critical paths will lower the operating frequency
  – Downsize non-critical paths
• No reward for finishing computation early in synchronous systems
  – Narrow down the path delay distribution
  – Impact of process variations
Low Power Design
• The major way to reduce power is to rethink the problem at a high level.
  – Like most things, this is where the leverage is.
  – There are two general techniques that will help with power:
• Reformulate the problem (example)
  – Assume operation needs 100 instructions
    • Energy = 100 Einst
    • Delay = 100 Tinst
  – If another algorithm only needs 50 instructions
    • Energy x delay = 1/4 of the old solution
    • Could use a slower, lower power processor
• Use parallelism and/or pipelining
  – Improve delay, and thus energy x delay
  – Use voltage scaling / transistor sizing to convert excess speed to low power
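The factor of 1/4 in the reformulation example follows directly from the fact that both energy and delay scale with the instruction count. A minimal check, with Einst and Tinst normalized to 1 (the function name is hypothetical):

```python
def energy_delay(n_instructions, e_inst=1.0, t_inst=1.0):
    """Energy x delay for a task that executes n_instructions sequentially."""
    energy = n_instructions * e_inst
    delay = n_instructions * t_inst
    return energy * delay

old = energy_delay(100)   # 100 Einst x 100 Tinst
new = energy_delay(50)    # 50 Einst x 50 Tinst
ratio = new / old

print(f"energy x delay, new / old = {ratio}")   # prints 0.25
```

Halving the instruction count halves both factors, so the product drops by four. The gain comes before any circuit-level technique is applied.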
Reducing the supply voltage
• Example: Reference datapath
• Critical path delay: Tadder + Tcomparator (~25 ns) → fref = 40 MHz
• Total capacitance being switched: Cref
• VDD = Vref = 5 V
• Power for reference datapath: Pref = Cref Vref^2 fref
• [Chandrakasan, 1992]
Parallel Datapath
• The clock rate can be reduced by half with the same throughput: fpar=fref/2
Pipelined Datapath
• Critical path delay is less → max [Tadder, Tcomparator]
A Simple Datapath Summary
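The comparison between the reference, parallel and pipelined datapaths can be sketched with P = C V^2 f, normalized to Pref. The capacitance ratios (2.15 and 1.15) and the scaled supply (2.9 V) are the figures commonly quoted for this example from [Chandrakasan, 1992]. They do not survive in the slide text here, so treat them as assumptions.

```python
V_REF = 5.0   # reference supply voltage from the example, in volts

def relative_power(c_ratio, v_dd, f_ratio, v_ref=V_REF):
    """P / Pref = (C / Cref) * (VDD / Vref)^2 * (f / fref)."""
    return c_ratio * (v_dd / v_ref) ** 2 * f_ratio

# Assumed values (see lead-in): extra capacitance from duplicated hardware
# or pipeline registers, supply lowered until throughput matches the reference.
datapaths = {
    "reference": relative_power(c_ratio=1.00, v_dd=5.0, f_ratio=1.0),
    "parallel":  relative_power(c_ratio=2.15, v_dd=2.9, f_ratio=0.5),
    "pipelined": relative_power(c_ratio=1.15, v_dd=2.9, f_ratio=1.0),
}

for name, p in datapaths.items():
    print(f"{name:>9}: P / Pref = {p:.2f}")
```

Under these assumptions both architectures land at a little over a third of the reference power at the same throughput. The quadratic saving from the lower supply outweighs the added capacitance.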
Multiple Voltage Domains
• Block level supply assignment
  – Higher throughput/lower latency functions are assigned higher VDD
  – Slower functions are implemented with lower VDD
  – Separate supply grids, level conversion performed at block boundaries
• Issues
  – Level conversion overhead
  – Physical design is challenging: multi-supply routing
Multiple Supplies in a Block
“Clustered Voltage Scaling” [M.Takahashi, ISSCC’98]
Can VDDH INV drive VDDL INV? Can VDDL INV drive VDDH INV?
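One first-order way to reason about the two questions is to check whether the PMOS pull-up in the receiving inverter is fully off when its input is driven to the sender's logic high. The sketch below is a simplification that ignores noise margins and speed. The supply and threshold values are hypothetical, and the function name is made up for illustration.

```python
def pmos_fully_off(v_in_high, v_dd_receiver, v_tp_abs):
    """True if the receiver's PMOS is off for a logic-high input.

    The PMOS source sits at the receiver supply, so it conducts whenever
    (v_dd_receiver - v_in_high) exceeds |Vtp|.
    """
    return (v_dd_receiver - v_in_high) < v_tp_abs

VDDH, VDDL, VTP_ABS = 2.5, 1.8, 0.4   # hypothetical values, in volts

# VDDH inverter driving a VDDL inverter: input high is above the receiver supply.
h_drives_l = pmos_fully_off(v_in_high=VDDH, v_dd_receiver=VDDL, v_tp_abs=VTP_ABS)

# VDDL inverter driving a VDDH inverter: input high is below the receiver supply.
l_drives_h = pmos_fully_off(v_in_high=VDDL, v_dd_receiver=VDDH, v_tp_abs=VTP_ABS)

print("VDDH INV -> VDDL INV, PMOS fully off:", h_drives_l)
print("VDDL INV -> VDDH INV, PMOS fully off:", l_drives_h)
```

With these assumed values the high-to-low crossing is safe, while the low-to-high crossing leaves the receiver's PMOS partly on and drawing static current. That is the reason a level converter is needed in that direction.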
Low Swing Bus and Level Converter
– INV1 and INV2 operated from lower supply
– Level conversion performed at 9:1 dynamic MUX
ALU Example
Distributed Multiple Supply Voltages
Summary
• Energy-delay metric is helpful in making these trade-offs
  – Reminds us of the trade-off between performance and power
• Low power design strategy
  – Trade excess speed for lower power (no reward for finishing computation early)
  – Supply scaling and transistor sizing whenever feasible
  – Parallelism very helpful
• Leverage is at the top and bottom
  – Better technologies have much better energy-delay products
• Key is to think about the system-level problem
  – Here one can make order-of-magnitude changes