ASIC Tutorial: Processor Core. Low Power Design for SoCs. © M.J. Irwin, PSU, 1999
Power Reduction Techniques in the Processor Core
Power Usage Stats
[Pie chart: power usage of a 1995 5V notebook PC, split among motherboard, hard disk, floppy disk, LCD/VGA, and power supply. Slice values as extracted: 52%, 12%, 2%, 18%, 16%.]
From Roy, 1997
Processor Power Budgets
[Figure: concentric ring charts of the power budget split among clock, datapath, memory, and I/O (pads)]
Inner circle: low end embedded microprocessor
Next circle: high end CPU with on-chip cache
Next circle: MPEG2 decoder ASIC
Outer circle: ATM switch ASIC
Basic Principles of Low Power Design
• Reduce switching (supply) voltage
  » quadratic effect -> dramatic savings
  » negative effect on performance
• Reduce capacitance
• Reduce switching frequency
• Reduce glitching
• Reduce leakage and static currents

$$P = C_L V_{dd}^2 f + \frac{t_r + t_f}{2} V_{dd} I_{peak} f + V_{dd} I_{leakage}$$
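As a rough illustration of the power equation above, the following Python sketch evaluates its three terms and shows the quadratic effect of the supply voltage on the switching term. All parameter values are hypothetical, chosen only for illustration, and the sketch holds the peak current fixed when Vdd changes, which a real circuit would not.

```python
def total_power(c_load, vdd, freq, t_rise, t_fall, i_peak, i_leak):
    """Return (switching, short_circuit, leakage) power in watts.

    Mirrors P = C_L*Vdd^2*f + (t_r + t_f)/2 * Vdd*I_peak*f + Vdd*I_leakage.
    """
    switching = c_load * vdd ** 2 * freq
    short_circuit = (t_rise + t_fall) / 2 * vdd * i_peak * freq
    leakage = vdd * i_leak
    return switching, short_circuit, leakage


# Hypothetical values (not from the tutorial).
params = dict(c_load=1e-9, freq=100e6, t_rise=0.2e-9, t_fall=0.2e-9,
              i_peak=1e-3, i_leak=1e-6)

sw_hi, sc_hi, lk_hi = total_power(vdd=3.3, **params)
sw_lo, sc_lo, lk_lo = total_power(vdd=1.65, **params)

# Halving Vdd cuts the switching term to one quarter (quadratic effect).
switching_ratio = sw_lo / sw_hi
print(f"switching power ratio at half Vdd: {switching_ratio:.2f}")
```

The short-circuit and leakage terms scale only linearly with Vdd in this simplified model, which is why the switching term is the main target of voltage scaling.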
Design Levels

Abstraction level    Power savings    Analysis resources    Analysis accuracy
Algorithm            Most             Least                 Worst
Software/system
Architecture
Functional unit
Gate
Circuit              Least            Most                  Best

(Values grade from the top row to the bottom row.)
Circuit and Logic Gate Techniques
Transistor Sizing for Dynamic Power Reduction
• Use the smallest transistors that satisfy the delay constraints
  » slack time - difference between required time and arrival time of a signal at a gate output
    – Positive slack - size down
    – Negative slack - size up
• Make gates that toggle more frequently smaller
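The slack rule above can be sketched in a few lines of Python. This is only an illustration: the gate names and timing values are hypothetical, and a real sizing tool would iterate because resizing one gate changes arrival times elsewhere.

```python
def sizing_action(required_time, arrival_time):
    """Suggest a sizing direction from the slack at a gate output."""
    slack = required_time - arrival_time
    if slack > 0:
        return slack, "size down"   # timing to spare: save power
    if slack < 0:
        return slack, "size up"     # timing violated: speed up
    return slack, "keep"


# Hypothetical gates: name -> (required time, arrival time) in ns.
gates = {"g1": (5.0, 3.5), "g2": (4.0, 4.5), "g3": (6.0, 6.0)}

actions = {name: sizing_action(req, arr)[1]
           for name, (req, arr) in gates.items()}
print(actions)
```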
Equivalent Pin Ordering
• Logically equivalent pins may not have identical delay/power characteristics
[Figure: gate with inputs A and B, output Out, and capacitances Ci and Cout]
• To conserve power (and improve speed), connect inputs so that the most active input is nearest the output
• Need to know signal statistics
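A minimal sketch of the ordering rule above, assuming per-signal switching activities are already known (the activity values below are hypothetical): sort the logically equivalent inputs so that the most active one is assigned to the pin nearest the output.

```python
def order_pins(activities):
    """Return signal names ordered from nearest-to-output to farthest.

    activities: dict mapping signal name -> estimated switching activity.
    The most active signal is placed nearest the gate output.
    """
    return sorted(activities, key=activities.get, reverse=True)


# Hypothetical switching activities (transitions per cycle).
activities = {"A": 0.1, "B": 0.45, "C": 0.25}

stack_order = order_pins(activities)
print(stack_order)  # position 0 is the pin nearest the output
```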
Gate Restructuring
• Logically equivalent gates may not have identical power/delay characteristics
Network Restructuring
• Logically equivalent gate networks may not have identical power/delay characteristics
[Figure: technology mapping of F = ABCD into alternative gate networks, compared on delay, area, and power]
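To make the point concrete, the sketch below compares two logically equivalent networks for F = ABCD, a chain of 2-input ANDs and a balanced tree, by summing the 0 -> 1 transition probabilities at the AND outputs. It assumes independent inputs, equal node capacitances, and zero gate delay (so no glitching); these are simplifying assumptions of this example, not claims from the slide.

```python
from fractions import Fraction


def and_prob(p_a, p_b):
    """P(output = 1) for a 2-input AND with independent inputs."""
    return p_a * p_b


def activity(p_one):
    """Probability of a 0 -> 1 transition at a node: P(0) * P(1)."""
    return (1 - p_one) * p_one


def chain_activity(p):
    """F = ((A.B).C).D built as a chain of 2-input ANDs."""
    o1 = and_prob(p[0], p[1])
    o2 = and_prob(o1, p[2])
    f = and_prob(o2, p[3])
    return activity(o1) + activity(o2) + activity(f)


def tree_activity(p):
    """F = (A.B).(C.D) built as a balanced tree of 2-input ANDs."""
    o1 = and_prob(p[0], p[1])
    o2 = and_prob(p[2], p[3])
    f = and_prob(o1, o2)
    return activity(o1) + activity(o2) + activity(f)


inputs = [Fraction(1, 2)] * 4  # assumed P(input = 1) for A, B, C, D
chain = chain_activity(inputs)
tree = tree_activity(inputs)
print(chain, tree)
```

Under these assumptions the chain has the lower total activity, but it is also the slower and less balanced structure, so once glitching and delay are taken into account the comparison can go the other way.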
Dual Supply Voltages
• Use two Vdd's (e.g., 2.5V and 1.5V)
  » use the higher supply for gates on the critical path
  » use the lower supply for gates off the critical path
• Reduces power without a performance loss
• Cons
  » slight area penalty
  » increased design time
  » need level converters to interconnect gates on different supplies (to avoid static currents)
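A back-of-the-envelope sketch of the potential saving, using the 2.5V and 1.5V example above and the quadratic dependence of switching power on Vdd. The share of switched capacitance that can move to the lower supply (60% here) is a hypothetical figure, and the power of the level converters is ignored.

```python
def dual_vdd_power_ratio(v_high, v_low, low_fraction):
    """Switching power relative to running every gate at v_high.

    low_fraction: share of the switched capacitance moved to v_low
    (gates off the critical path).
    """
    return (1 - low_fraction) + low_fraction * (v_low / v_high) ** 2


# Hypothetical: 60% of the switched capacitance is off the critical path.
ratio = dual_vdd_power_ratio(2.5, 1.5, 0.6)
per_gate_ratio = dual_vdd_power_ratio(2.5, 1.5, 1.0)
print(f"relative switching power: {ratio:.3f}")
print(f"per-gate ratio at the low supply: {per_gate_ratio:.2f}")
```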
Functional Unit Techniques
Latches and Flipflops
• Consume a lot of power because they are clocked every cycle
  » Clock energy (Ec)
    – energy dissipated when the ff is clocked with stable data
  » Data energy (Ed)
    – energy dissipated when the ff is clocked and the data has changed so that the ff changes state
  » Typically the data rate (fd) is much lower than the clock rate (fc)
• Also impacts clock power since a large portion of clock power is used to drive the sequential elements
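One simple way to combine the two energies above into an average power figure is sketched below. The model is an assumption of this example: in each clock cycle the flipflop dissipates Ec if its data is stable and Ed if its data changed, with the data activity taken as fd/fc. The energy and frequency values are hypothetical.

```python
def ff_power(f_clk, activity, e_clock, e_data):
    """Average flipflop power under a simple two-energy model.

    activity: fraction of clock cycles in which the data changes (fd/fc).
    e_clock:  energy per cycle when clocked with stable data (Ec).
    e_data:   energy per cycle when the flipflop changes state (Ed).
    """
    stable = (1 - activity) * e_clock
    changing = activity * e_data
    return f_clk * (stable + changing)


# Hypothetical numbers: 100 MHz clock, Ec = 50 fJ, Ed = 120 fJ, fd/fc = 0.1.
power = ff_power(100e6, 0.1, 50e-15, 120e-15)
stable_share = (0.9 * 50e-15) / (0.9 * 50e-15 + 0.1 * 120e-15)
print(f"average power: {power * 1e6:.2f} uW")
print(f"share spent in cycles with stable data: {stable_share:.0%}")
```

With a low data rate, most of the energy goes into cycles where nothing changes, which is the motivation for the gating techniques that follow.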
Power Consumption in Latches
[Plot: % of latch power (0 to 100) due to data and due to clock, against latch data activity factor (0 to 0.5); inset latch schematic with D, Q, CLK, CLKB]
From Tiwari, 1998
Some Typical CMOS FFs
[Schematics: static TG FF, dynamic C2MOS FF, dynamic precharged TSPC FF, dynamic non-precharged TSPC FF]
FF Power Comparison
[Plot: relative power consumption (0 to 30) against latch data activity factor (0.05 to 0.45) for TGFF, GFF, C2MOS, PTSPC, NPTSPC, and RSLATCH]
From Svenson, 1996
Some Low Power FFs
[Schematics: StrongArm SA110 FF and Power PC 603 FF]
PDP of Some Low Power FFs
[Bar chart: PDPtot (fJ) on a 0 to 80 scale for HLFF, SDFF, PowerPC, mC2MOS, SA110 FF, and K6 ETL, with high, low, and average values]
From Stojanovic, 1998
Self-Gating FF
• When ff input is equal to its output, suppress internal clocking to conserve power
  » gating function is derived within the FF
[Schematic: self-gating FF with D, Q, CLK, and internally gated clock Φ]
Strict rules on when D can change wrt CLK
Power of Self-Gated FF
[Plot: power dissipation against data switching rate fd/fc for the self-gated FF (SG FF) and a regular FF (Reg FF)]
From Reyes, 1996
Double Edge Triggered FF
[Schematic: double edge triggered FF with D, Q, CLK, and CLKB]
Loads data at both rising and falling clock edges
DETFF Pros and Cons
• Advantages
  » Clock frequency can be halved to achieve the same computational throughput: Pd = 0.84Ps
  » Also get a 2X power savings in the clock network
• Disadvantages
  » About 15% larger in transistor count
  » Lower maximum operating frequency
  » Strict requirements on clock skew
  » Requires a strict 50% duty cycle
  » Larger clock load
Arithmetic Components
• Many techniques for lowering power consumption of arithmetic components
  » adders, ALUs
  » barrel shifters, multipliers, MACs
• PDP of different architectures
• Delay balancing to reduce glitching
• Precomputation
• Common case computation
PDP of Different Adders
[Plot: PDP (0 to 100) for 8, 16, 32, 48, and 64 bit adders of types RCA, MCCA, CSkA, VSkA, CSlA, CLA, BKA, and ELMA]
From Nagendra, 1996
Array Multiplier
[Figure: 4x4 array multiplier with cells M00 to M33, operand bits A0 to A3 and B0 to B3, and product bits Y0 to Y7; annotation: longest delay path, 2i+j+1]
Multiplier Cell Structure
[Figure: multiplier cell built around a full adder, with inputs Ai, Bj, sum input, and carry in, and outputs sum output and carry out]
Add delay elements to minimize glitching
Precomputation Logic
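[Figure: precomputation architecture. The precomputed inputs are held in register R1 and feed the precomputation logic g(X); the gated inputs are held in register R2, whose load disable is driven by g(X). Combinational logic f(X) produces the outputs.]
• Identify logical conditions on some inputs under which the output does not depend on the remaining inputs
  » since those inputs don't affect the output, disable their transitions
  » trade area for power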
Binary Comparator Example
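[Figure: n-bit binary comparator computing A > B. The most significant bits An and Bn are the precomputed inputs (register R1); An-1..A1 and Bn-1..B1 are the gated inputs (register R2), with load disable derived from comparing An and Bn.]
Can achieve up to 75% power reduction with 3% area overhead and 1 to 5 additional gate delays in the worst-case path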
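The behavior of the comparator example can be sketched in Python as follows. When the most significant bits differ they decide A > B on their own, so the register holding the lower bits is not loaded and the logic behind it sees no new activity. The function name and the exhaustive input sweep are choices of this sketch; it measures how often the load is disabled for uniformly distributed inputs, not the power figure quoted above.

```python
from itertools import product


def compare_with_precomputation(a, b, n, held_low):
    """Return (a > b, low_bits_loaded, new_held_low) for n-bit a and b."""
    msb = n - 1
    a_msb, b_msb = (a >> msb) & 1, (b >> msb) & 1
    mask = (1 << msb) - 1
    if a_msb != b_msb:
        # MSBs decide the result; keep the old low bits (load disabled).
        return a_msb > b_msb, False, held_low
    # MSBs are equal: load the low bits and compare them.
    new_low = (a & mask, b & mask)
    return new_low[0] > new_low[1], True, new_low


n = 4
loads = 0
total = 0
held = (0, 0)
all_correct = True
for a, b in product(range(1 << n), repeat=2):
    result, loaded, held = compare_with_precomputation(a, b, n, held)
    all_correct &= (result == (a > b))
    loads += loaded
    total += 1

disabled_fraction = 1 - loads / total
print(all_correct, disabled_fraction)
```

For uniformly distributed inputs the low-bit register is left untouched in half of the cycles; real savings depend on the actual input statistics.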
Design Issues in Precomputation
• Design steps
  1. Select the precomputation architecture
  2. Determine the precomputed and gated inputs (R1 should be much smaller than R2)
  3. Find (a good implementation for) g(X)
  4. Evaluate potential power savings based on input statistics (if savings are not sufficient, go to step 2 or 3 and try again)
• Also works for multiple-output functions, where g(X) is the product of gj(X) over all j
Common Case Computation
[Block diagram: original circuit, CC detection circuit, CC execution circuit, and CCC controller between inputs and outputs. The controller drives sleep1, sleep2, and sleep3 and exchanges the signals "common case detected" and "common case completed".]
Activity of CCC Circuit Over Time
[Timing diagram: activity over time of the original circuit, the CC detection circuit, and the CC execution circuit, with time points tp, tc, and te]
• Several (possibly conflicting) factors are involved in choosing the CC circuit leading to maximal energy and/or time savings
• Dependent on input data statistics
CCC Performance
Circuit     Area % increase    Cycles % decrease    Power (mW) % decrease
Graphics    29.7               27.4                 12.4
Linegen     23.5               43.3                 39.7
Test1       21.9               42.5                 48.6
Poly        14.5               17.9                 58.2
GCD         29.0               76.6                 59.8

From Lakshminarayana, 1999
Control Unit Design
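[Figure: control unit built from combinational logic and state FFs, with inputs and outputs; example state diagram with states 00, 01, and 11 and edges labeled 0,1/1, 1/X, 1/X, 0/0, and 0/0]
n! different possible encodings (n states)
State Encoding
One of the most important factors determining area, speed, and power of the resulting control logic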
Power State Encoding Heuristic
• Area driven -> try to reduce the distance in Boolean n-space between related states
• Power driven -> try to minimize the number of bit transitions in the state register
  » fewer transitions in state register
  » fewer transitions propagated to combinational logic
[Figure: state transition graph over states 11, 00, and 01 with edges labeled 0.1, 0.1, 0.1, 0.4, and 0.3; each label is the probability that the transition will occur (sum of all edges equals unity)]
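The power-driven heuristic can be illustrated by computing the expected number of bit transitions in the state register for a candidate encoding: weight the Hamming distance between the codes at the two ends of each edge by that edge's probability. The state machine below is hypothetical (its topology is not taken from the figure); only the use of edge probabilities summing to one follows the slide.

```python
def hamming(x, y):
    """Number of bit positions in which two state codes differ."""
    return bin(x ^ y).count("1")


def expected_transitions(edge_probs, encoding):
    """Expected state-register bit flips per state transition."""
    return sum(p * hamming(encoding[src], encoding[dst])
               for (src, dst), p in edge_probs.items())


# Hypothetical 3-state machine; edge probabilities sum to 1.
edge_probs = {
    ("S0", "S1"): 0.4,
    ("S1", "S0"): 0.3,
    ("S1", "S2"): 0.1,
    ("S2", "S0"): 0.1,
    ("S0", "S0"): 0.1,
}

enc_a = {"S0": 0b00, "S1": 0b01, "S2": 0b11}  # hot edge S0<->S1 differs in 1 bit
enc_b = {"S0": 0b00, "S1": 0b11, "S2": 0b01}  # hot edge S0<->S1 differs in 2 bits

cost_a = expected_transitions(edge_probs, enc_a)
cost_b = expected_transitions(edge_probs, enc_b)
print(cost_a, cost_b)
```

Encoding `enc_a` places the two states joined by the high-probability edges at Hamming distance one, so it yields the lower expected transition count.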
Caveat
• Lowest E[M] (expected number of bit transitions in the state register) may not be lowest in power -> it could require more gates and/or signal transitions in the combinational logic
• Experiments show that the area and power dissipation of a state machine are correlated when the state encoding is varied
State Encoding Effects
[Scatter plot: power (500 to 750) against area (3300 to 4100) for different state encodings]
From Yeap, 1997
Practical Considerations
• Balance area-power by forced encoding of only a subset of states that span the high probability edges
  » leave assignment of remaining states to the logic synthesis system for area optimization
  » fortunately, in practice, most state machines have this characteristic
• Unlike area encoding, power encoding requires knowledge of probabilities of state transitions and input signals
Architecture Techniques
Glitch Reduction by Pipelining
• Glitches are dependent on the logic depth of the circuit
• Nodes logically deeper are more prone to glitching
  » arrival times of the gate inputs are more spread due to delay imbalances
  » usually affected by more primary input (PI) switching
• Reduce depth by adding pipeline registers
Typical RISC Datapath
• Five stage pipeline (originally for performance, but also helps with power)
[Figure: RISC datapath with stages Fetch, Decode, Execute, Memory, and WriteBack; blocks include PC, I$, Instruction register, MAR, MDR, and D$]
Pipelined Multiplier
[Figure: the 4x4 array multiplier (cells M00 to M33, operand bits A0 to A3 and B0 to B3, product bits Y0 to Y7) with pipeline registers clocked by CLK]
Signal Gating
• Mask unwanted switching activity from propagating
• Generation of control signals requires additional logic circuitry (more power)
[Figure: a source signal passes through a latch/FF, under a control signal that suppresses the source signal, to produce the gated signal]
Signal Gating, con’t
• Signal gating saves power if the relative enable/disable frequency of the control signal is much lower than the frequency of the source signal (so that much of the signal activity is blocked)
• Savings are even greater if a group of source signals can share a control signal
• Good candidates - clock signals, address or data buses, signals with high frequency or high glitching
Guarded Evaluation
• Reduce switching activity by adding latches at the inputs if outputs are not used
• Latch preserves previous value of inputs to suppress activity
    – could also use AND gates to mask one or both inputs to zero -> forced zero (good if zero-out condition changes infrequently compared to data rate)
[Figure: multiplier with inputs A and B whose output C is used only under a condition, shown without guarding and with a latch on the inputs controlled by the condition]
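The effect of the two guarding styles above can be sketched by counting bit flips seen at one multiplier operand. The operand stream and the enable pattern are hypothetical; `enable` is True in cycles where the multiplier output is actually used.

```python
def toggles(values):
    """Total number of bit flips between consecutive values."""
    return sum(bin(x ^ y).count("1") for x, y in zip(values, values[1:]))


def latch_guard(data, enable):
    """Transparent when enable is True; otherwise hold the last value."""
    out, held = [], 0
    for d, en in zip(data, enable):
        if en:
            held = d
        out.append(held)
    return out


def and_mask(data, enable):
    """Force the operand to zero when enable is False."""
    return [d if en else 0 for d, en in zip(data, enable)]


# Hypothetical 4-bit operand stream and usage pattern.
data = [0b1010, 0b0101, 0b1111, 0b0000, 0b1100, 0b0011]
enable = [True, False, False, False, True, True]

raw = toggles(data)
latched = toggles(latch_guard(data, enable))
masked = toggles(and_mask(data, enable))
print(raw, latched, masked)
```

In this example both styles block the activity of the unused cycles; the forced-zero version pays extra flips each time the condition changes, which is why it suits conditions that change infrequently.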
Sleep Modes
• Software power control - power management
  » DOZE - most functional units stopped except on-chip cache memory (cache coherency)
  » NAP - cache also turned off, time out or external interrupt to resume
  » SLEEP - clock off, external interrupt to resume
• Deeper sleep mode saves more power
• Deeper sleep mode requires more latency to resume
PowerPC Sleep Modes
Mode                   66 MHz    80 MHz
No power mgmt          2.18 W    2.54 W
Dynamic power mgmt     1.89 W    2.20 W
DOZE                   307 mW    366 mW
NAP                    113 mW    135 mW
SLEEP                  89 mW     105 mW
SLEEP without PLL      18 mW     19 mW
SLEEP without clock    2 mW      2 mW

10 cycles to wake up from SLEEP
100 us to wake up from SLEEP+
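Using the 66 MHz figures from the table, a rough average-power estimate for a workload that is idle most of the time can be sketched as below. The 10% active fraction is a hypothetical assumption, the active power is taken from the "Dynamic power mgmt" row, and wake-up latency and energy are ignored.

```python
# Power figures at 66 MHz from the table above, in watts.
ACTIVE_W = 1.89  # dynamic power management enabled
IDLE_MODES_W = {"DOZE": 0.307, "NAP": 0.113, "SLEEP": 0.089}


def average_power(active_fraction, idle_mode):
    """Time-weighted average of active power and one idle mode's power."""
    idle_w = IDLE_MODES_W[idle_mode]
    return active_fraction * ACTIVE_W + (1 - active_fraction) * idle_w


# Hypothetical workload: processor busy 10% of the time.
avg = {mode: average_power(0.1, mode) for mode in IDLE_MODES_W}
for mode, watts in avg.items():
    print(f"{mode}: {watts:.4f} W")
```

Deeper modes lower the average, but the gain from NAP to SLEEP is small at this duty cycle, so the longer resume latency may not be worth it.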
Keeper Circuits
• A floating node (not driven by any gates) can suffer charge decay resulting in short-circuit currents
• Keeper circuits can
  » slightly increase power dissipation
  » slightly increase delay
• Essential in circuits with sleep modes
[Figure: weak keeper on a node driven by a block that is powered down under a power down control signal]
A Low Power Processor Core Example
M•CORE Architecture
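[Block diagram: M•CORE datapath and control. Register files: GP reg file (32 bit x 16), Alt reg file (32 bit x 16), Control reg file (32 bit x 13). Datapath: X port, Y port, Immed, Scale, Sign ext, Barrel shift and FF1, ALU with priority encode and 0 detect. Instruction side: Instr pipeline, Instr decoder, Branch adder, PC increment. Buses: Address bus, Data bus, Writeback bus, H/W acc bus.]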
M•CORE Power Distribution
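[Pie chart 1: power split among datapath, clock, and control. Slice values as extracted: 36%, 36%, 28%.]
[Pie chart 2: legend Reg File, Addr/Data Bus, Inst Reg, Barrel Shifter, X MUX, Y MUX, Addr Gen, Other. Slice values as extracted: 42%, 14%, 9%, 8%, 7%, 6%, 5%, 9%.]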
Key References
Alidina, Precomputation-based sequential logic optimization for low power, IEEE Trans. on VLSI Systems, 2(4):426-436, 1994.
Hossain, Low power design using double edge triggered flipflop, IEEE Trans. on VLSI Systems, 2(2):261-265, 1994.
Lakshminarayana, et al., Common-Case Computation, Proc. of DAC, pp. 56-61, 1999.
Motorola, M•CORE Architecture microRISC Engine, MCORE 1/D, www.mot.com/SPS/MCORE/info_documentation.htm
Mutsunori, Low power design method using multiple supply voltages, Proc. of SLPED, pp. 36-41, 1997.
Rabaey, Digital Integrated Circuits, Prentice-Hall, 1996.
Reyes, Low Power FF Circuit and Method Thereof, Patent No 5,498,988, 1996.
Roy, Power analysis and design at the system level, Low Power Design in Deep Submicron Electronics, Nebel and Mermet, Ed., Kluwer, 1997.
Key References, con't
Sakuta, Delay balanced multipliers for low power, Proc. of SLPE, pp. 36-37, 1995.
Scott, Designing the Low-Power M•CORE Architecture, Proc. Inter. Symp. Computer Architecture Power Driven Microarchitecture Workshop, June 1998.
Stojanovic, A unified approach in the analysis of latches and FFs for low power systems, Proc. of ISLPED, pp. 227-232, 1998.
Tiwari, Reducing power in high-performance microprocessors, Proc. of DAC, pp. 732-737, 1998.
Tiwari, Guarded evaluation, Proc. ISLPD, pp. 221-226, 1995.
Yeap, CPU controller optimization for HDL logic synthesis, Proc. of CICC, pp. 127-130, 1997.
Yeap, Practical Low Power Digital VLSI Design, KAP, 1998.