
ASIC Tutorial Processor Core. Low Power Design for SoCs. ©M.J. Irwin, PSU, 1999

Power Reduction Techniques in the Processor Core


Power Usage Stats

[Pie chart: power usage of a 1995 5V notebook PC. Legend: Motherboard, Hard Disk, Floppy Disk, LCD/VGA, Power Supply. Shares: 52%, 12%, 2%, 18%, 16%. From Roy, 1997]


Processor Power Budgets

[Concentric pie charts: power split among Clock, Datapath, Memory, and I/O (pads)]
Inner circle: low end embedded microprocessor
Next circle: high end CPU with on-chip cache
Next circle: MPEG2 decoder ASIC
Outer circle: ATM switch ASIC


Basic Principles of Low Power Design

• Reduce switching (supply) voltage
  » quadratic effect -> dramatic savings
  » negative effect on performance
• Reduce capacitance
• Reduce switching frequency
• Reduce glitching
• Reduce leakage and static currents

$P = C_L V_{dd}^2 f + \frac{t_r + t_f}{2} V_{dd} I_{peak} f + V_{dd} I_{leakage}$
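
To make the quadratic supply-voltage effect concrete, the power equation can be evaluated term by term. The sketch below is only illustrative: the function name and every operating-point value are hypothetical, and holding Ipeak fixed while Vdd changes is a simplifying assumption.

```python
def power_terms(c_load, vdd, freq, t_rise, t_fall, i_peak, i_leak):
    """Return (dynamic, short_circuit, leakage) power in watts.

    Mirrors the three terms of the slide's equation:
    CL*Vdd^2*f, (tr + tf)/2 * Vdd * Ipeak * f, and Vdd * Ileakage.
    """
    dynamic = c_load * vdd ** 2 * freq
    short_circuit = (t_rise + t_fall) / 2 * vdd * i_peak * freq
    leakage = vdd * i_leak
    return dynamic, short_circuit, leakage


# Hypothetical operating point (values chosen only for illustration).
operating_point = dict(c_load=1e-12, freq=100e6, t_rise=0.2e-9,
                       t_fall=0.2e-9, i_peak=1e-4, i_leak=1e-9)

base = power_terms(vdd=2.5, **operating_point)
half = power_terms(vdd=1.25, **operating_point)

# Halving Vdd divides the dynamic term by four but the leakage term only by two.
dynamic_ratio = base[0] / half[0]
leakage_ratio = base[2] / half[2]
print(dynamic_ratio, leakage_ratio)
```

The ratio of the dynamic terms depends only on the two supply voltages, which is why voltage reduction heads the list of techniques.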


Design Levels
Abstraction level   Power savings   Analysis resources   Analysis accuracy
Algorithm           Most            Least                Worst
Software/system
Architecture
Functional unit
Gate
Circuit             Least           Most                 Best


Circuit and Logic Gate Techniques


Transistor Sizing for Dynamic Power Reduction

• Use the smallest transistors that satisfy the delay constraints
  » slack time - difference between required time and arrival time of a signal at a gate output
    – Positive slack - size down
    – Negative slack - size up
• Make gates that toggle more frequently smaller
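
The slack rule above can be sketched as a per-gate decision. This is a minimal illustration, not a sizing tool: the gate names and timing numbers are hypothetical, and a real flow would recompute arrival times after every resize.

```python
def sizing_action(required_time, arrival_time):
    """Apply the slack rule: slack = required time - arrival time."""
    slack = required_time - arrival_time
    if slack > 0:
        return "size down"   # timing margin available, trade it for lower capacitance
    if slack < 0:
        return "size up"     # timing violated, the gate must get faster
    return "keep"


# Hypothetical gates: name -> (required time, arrival time), in ns.
gates = {"g1": (5.0, 3.5), "g2": (5.0, 5.8), "g3": (4.0, 4.0)}
actions = {name: sizing_action(req, arr) for name, (req, arr) in gates.items()}
print(actions)
```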


Equivalent Pin Ordering

• Logically equivalent pins may not have identical delay/power characteristics
[Figure: gate schematic with inputs A and B, output Out, and capacitances Ci and Cout]
• To conserve power (and improve speed), connect inputs so that the most active input is nearest the output
• Need to know signal statistics


Gate Restructuring
• Logically equivalent gates may not have identical power/delay characteristics


Network Restructuring

• Logically equivalent gate networks may not have identical power/delay characteristics
[Figure: technology mapping of F = ABCD onto alternative gate networks; labels: delay, area, power]


Dual Supply Voltages
• Use two Vdd's (e.g., 2.5V and 1.5V)
  » use the higher supply for gates on the critical path
  » use the lower supply for gates off the critical path
• Reduces power without a performance loss
• Cons
  » slight area penalty
  » increased design time
  » need level converters to interconnect gates on different supplies (to avoid static currents)
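
One way to picture the assignment is a per-gate check of whether the extra delay at the lower supply fits in the gate's slack. The sketch below is a deliberate simplification under stated assumptions: the gate names, delays, and slacks are hypothetical, gates are treated independently (a real flow must account for gates sharing a path), and level-converter cost is ignored.

```python
V_HIGH, V_LOW = 2.5, 1.5  # the example supplies from the slide


def assign_supply(slack, delay_high, delay_low):
    """Pick the low supply only if the added delay fits within the slack."""
    extra_delay = delay_low - delay_high
    return V_LOW if slack >= extra_delay else V_HIGH


# Hypothetical gates with delays (ns) at each supply and available slack (ns).
gates = {
    "critical": dict(slack=0.0, delay_high=1.0, delay_low=1.6),
    "side_a": dict(slack=2.0, delay_high=1.0, delay_low=1.6),
    "side_b": dict(slack=0.3, delay_high=1.0, delay_low=1.6),
}
supplies = {name: assign_supply(**g) for name, g in gates.items()}

# Switching energy scales with Vdd^2, so a gate moved to the low supply
# uses (1.5 / 2.5)^2 of its original switching energy.
energy_ratio = (V_LOW / V_HIGH) ** 2
print(supplies, energy_ratio)
```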


Functional Unit Techniques


Latches and Flipflops
• Consume a lot of power because they are clocked every cycle
  » Clock energy (Ec)
    – energy dissipated when the ff is clocked with stable data
  » Data energy (Ed)
    – energy dissipated when the ff is clocked and the data has changed so that the ff changes state
  » Typically the data rate (fd) is much lower than the clock rate (fc)
• Also impacts clock power since a large portion of clock power is used to drive the sequential elements
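
The Ec/Ed split suggests a simple average-power estimate: each cycle costs Ed when the stored value changes and Ec otherwise. The sketch below assumes that reading, treats fd/fc as the fraction of cycles in which the state changes, and uses hypothetical energy values.

```python
def ff_power(e_clock, e_data, f_clock, f_data):
    """Return (average power in W, share of that power spent on stable-data cycles)."""
    activity = f_data / f_clock                      # fraction of cycles with a state change
    clock_part = (1 - activity) * e_clock            # cycles where data is stable
    data_part = activity * e_data                    # cycles where the ff changes state
    energy_per_cycle = clock_part + data_part
    return f_clock * energy_per_cycle, clock_part / energy_per_cycle


# Hypothetical numbers: Ec = 10 fJ, Ed = 30 fJ, fc = 100 MHz, fd = 10 MHz.
power, clock_share = ff_power(10e-15, 30e-15, 100e6, 10e6)
print(power, clock_share)
```

With fd much lower than fc, most of the energy goes to cycles in which nothing changes, which is the motivation for the gating techniques later in the deck.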


Power Consumption in Latches

[Plot: % Power (0 to 100) versus Latch Data AF (0 to 0.5), with Data and Clock components; inset latch schematic with D, Q, CLK, CLKB. From Tiwari, 1998]


Some Typical CMOS FFs
[Schematics: Static TG FF, Dynamic C2MOS FF, Dyn Precharged TSPC FF, Dyn Non-Precharged TSPC FF]


FF Power Comparison

[Plot: Relative Power Consumption (0 to 30) versus Latch Data AF (0.05 to 0.45) for TGFF, GFF, C2MOS, PTSPC, NPTSPC, RSLATCH. From Svenson, 1996]


Some Low Power FFs

[Schematics: StrongArm SA110 FF and Power PC 603 FF, with D, Q, CLK, CLKB, VDD, and GND labeled]


PDP of Some Low Power FFs

[Bar chart: PDPtot (fJ), 0 to 80, High/Low/Average bars for HLFF, SDFF, PowerPC, mC2MOS, SA110FF, K6ETL. From Stojanovic, 1998]


Self-Gating FF
• When ff input is equal to its output, suppress internal clocking to conserve power
  » gating function is derived within the FF
[Schematic: self-gating FF with D, Q, CLK, and internally gated clock phases Φ]
Strict rules on when D can change wrt CLK
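
A behavioral sketch can show how much internal clocking the comparison of D and Q removes. It is a cycle-level model only, with a hypothetical data stream, and it does not capture the circuit-level timing rules mentioned above.

```python
def count_internal_clocks(data_stream, self_gated, initial_q=0):
    """Count cycles in which the FF's internal clock fires; return (count, final Q)."""
    q = initial_q
    internal_clocks = 0
    for d in data_stream:
        # A regular FF is clocked every cycle; a self-gated FF only when D != Q.
        if not self_gated or d != q:
            internal_clocks += 1
            q = d
    return internal_clocks, q


# Hypothetical input: the data changes in only 3 of 10 cycles.
data = [0, 0, 1, 1, 1, 0, 0, 0, 0, 1]
regular = count_internal_clocks(data, self_gated=False)
gated = count_internal_clocks(data, self_gated=True)
print(regular, gated)
```

Both models end in the same state, so the gating changes only the number of internal clock events, not the stored value.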


Power of Self-Gated FF

[Plot: power dissipation versus data switching rate fd/fc for SG FF and Reg FF. From Reyes, 1996]


Double Edge Triggered FF

Loads data at both rising and falling clock edges
[Schematic: double edge triggered FF with D, Q, CLK, and CLKB]


DETFF Pros and Cons
• Advantages
  » Clock frequency can be halved to achieve the same computational throughput: Pd = 0.84Ps
  » Also get a 2X power savings in the clock network
• Disadvantages
  » About 15% larger in transistor count
  » Lower maximum operating frequency
  » Strict requirements on clock skew
  » Requires a strict 50% duty cycle
  » Larger clock load


Arithmetic Components

• Many techniques for lowering power consumption of arithmetic components
  » adders, ALUs
  » barrel shifters, multipliers, MACs
• PDP of different architectures
• Delay balancing to reduce glitching
• Precomputation
• Common case computation


PDP of Different Adders

[Chart: PDP (0 to 100) at 8, 16, 32, 48, and 64 bits for RCA, MCCA, CSkA, VSkA, CSlA, CLA, BKA, ELMA. From Nagendra, 1996]


Array Multiplier

[Figure: 4x4 array multiplier with cells M00 to M33, inputs A0 to A3 and B0 to B3, zero inputs on the array edges, and outputs Y0 to Y7; labels: "Longest delay path" and "2i+j+1"]


Multiplier Cell Structure

[Figure: multiplier cell built around a full adder; inputs Ai, Bj, sum input, carry in; outputs sum output, carry out]
Add delay elements to minimize glitching


Precomputation Logic

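[Block diagram: precomputed inputs enter register R1 and the precomputation logic g(X); gated inputs enter register R2, which has a load disable driven by g(X); the registers feed the combinational logic f(X), which produces the outputs]
• Identify logical conditions on the precomputed inputs under which the output does not depend on the gated inputs
  » since those inputs don't affect the output, disable their input transitions
  » trade area for power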


Binary Comparator Example

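[Figure: n-bit binary value comparator computing A > B. The most significant bits An and Bn go to R1; the remaining bits An-1 ... A1 and Bn-1 ... B1 go to R2, whose load disable is derived from comparing An with Bn (An = Bn)]
Can achieve up to 75% power reduction with 3% area overhead and 1 to 5 additional gate delays in worst case path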
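
The comparator lends itself to a small behavioral model: when the most significant bits differ they already decide A > B, so the register holding the lower bits need not load. The sketch below assumes that reading of the figure; the function name, the register modeling, and the exhaustive 4-bit input set are illustrative choices, not taken from the slides.

```python
def precomputed_compare(pairs, n_bits):
    """Evaluate A > B per pair; return (results, number of times R2 was loaded)."""
    msb = n_bits - 1
    low_mask = (1 << msb) - 1
    r2_a = r2_b = 0          # register R2: lower bits of A and B
    r2_loads = 0
    results = []
    for a, b in pairs:
        a_n, b_n = (a >> msb) & 1, (b >> msb) & 1    # register R1: always loaded
        if a_n == b_n:
            # MSBs are equal, so the lower bits decide: R2 must load.
            r2_a, r2_b = a & low_mask, b & low_mask
            r2_loads += 1
            results.append(r2_a > r2_b)
        else:
            # MSBs differ, so they decide alone: R2 keeps its old value.
            results.append(a_n == 1)
    return results, r2_loads


# Exhaustive check over all 4-bit operand pairs.
pairs = [(a, b) for a in range(16) for b in range(16)]
results, r2_loads = precomputed_compare(pairs, n_bits=4)
print(r2_loads, len(pairs))
```

For uniformly distributed operands the most significant bits differ in half of the cases, so R2 loads in only half of the cycles in this model.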


Design Issues in Precomputation
• Design steps
  1. Select precomputation architecture
  2. Determine the precomputed and gated inputs (R1 should be much smaller than R2)
  3. Find (good implementation for) g(X)
  4. Evaluate potential power savings based on input statistics (if savings not sufficient go to step 2 or 3 and try again)
• Also works for multiple output functions where g(X) is the product of gj(X) over all j


Common Case Computation

[Block diagram with blocks: Original circuit, CCC controller, CC detection circuit, CC execution circuit; signals: Inputs, Outputs, sleep1, sleep2, sleep3, common case detected, common case completed]


Activity of CCC Circuit Over Time

[Timing diagram: activity of the original circuit, CC detection circuit, and CC execution circuit over time, with tp, tc, and te marked]
• Several (possibly conflicting) factors involved in choosing the CC circuit leading to maximal energy and/or time savings
• Dependent on input data statistics


CCC Performance

Circuit    Area % increase   Cycles % decrease   Power (mW) % decrease
GCD        29.0              76.6                59.8
Poly       14.5              17.9                58.2
Test1      21.9              42.5                48.6
Linegen    23.5              43.3                39.7
Graphics   29.7              27.4                12.4
From Lakshminarayana, 1999


Control Unit Design

[Block diagram: combinational logic with Inputs and Outputs, and State FFs]
[State diagram: states 11, 00, 01 with edges labeled 0,1/1, 1/X, 1/X, 0/0, 0/0]
State Encoding
• One of the most important factors determining area, speed, and power of the resulting control logic
• n! different possible encodings (n states)


Power State Encoding Heuristic
• Area driven -> try to reduce the distance in Boolean n-space between related states
• Power driven -> try to minimize number of bit transitions in the state register
  » fewer transitions in state register
  » fewer transitions propagated to combinational logic
[State diagram: states 11, 00, 01 with edge weights 0.1, 0.1, 0.1, 0.4, 0.3 giving the probability that a transition will occur (sum of all edges equals unity)]
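
The power-driven heuristic can be sketched as a search for the encoding that minimizes the probability-weighted number of state-register bit flips. In the sketch below the state names and the assignment of probabilities to edges are hypothetical (the figure's edge-to-probability mapping is not recoverable here), and brute force is used only because the example is tiny.

```python
from itertools import permutations


def hamming(x, y):
    """Number of differing bits between two state codes."""
    return bin(x ^ y).count("1")


def expected_transitions(encoding, edge_probs):
    """Sum over edges of P(edge) * number of state bits that flip on that edge."""
    return sum(p * hamming(encoding[s], encoding[t])
               for (s, t), p in edge_probs.items())


# Hypothetical three-state machine; edge probabilities sum to 1.
states = ["S0", "S1", "S2"]
edge_probs = {("S0", "S1"): 0.4, ("S1", "S2"): 0.3, ("S2", "S0"): 0.1,
              ("S0", "S2"): 0.1, ("S1", "S0"): 0.1}

# Try every assignment of distinct 2-bit codes to the three states.
candidates = [dict(zip(states, codes)) for codes in permutations(range(4), 3)]
best = min(candidates, key=lambda enc: expected_transitions(enc, edge_probs))
best_cost = expected_transitions(best, edge_probs)
print({s: format(c, "02b") for s, c in best.items()}, best_cost)
```

In this example the search places the one unavoidable two-bit code distance on the least likely state pair (S0 and S2), which is the intuition behind the heuristic.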


Caveat
• Lowest E[M] (the expected number of state bit transitions) may not be lowest in power -> it could require more gates and/or signal transitions in the combinational logic
• Experiments show that the area and power dissipation of a state machine are correlated when the state encoding is varied


State Encoding Effects

[Scatter plot: Power (500 to 750) versus Area (3300 to 4100) for different state encodings. From Yeap, 1997]


Practical Considerations
• Balance area-power by forced encoding of only a subset of states that span the high probability edges
  » leave assignment of remaining states to the logic synthesis system for area optimization
  » fortunately, in practice, most state machines have this characteristic
• Unlike area encoding, power encoding requires knowledge of probabilities of state transitions and input signals


Architecture Techniques


Glitch Reduction by Pipelining

• Glitches are dependent on the logic depth of the circuit
• Nodes logically deeper are more prone to glitching
  » arrival times of the gate inputs are more spread due to delay imbalances
  » usually affected by more primary input (PI) switching
• Reduce depth by adding pipeline registers


Typical RISC Datapath

• Five stage pipeline (originally for performance, but also helps with power)
[Figure: pipeline stages Fetch, Decode, Execute, Memory, WriteBack, with PC, Instruction register, MAR, MDR, I$, and D$]


Pipelined Multiplier

[Figure: the 4x4 array multiplier (cells M00 to M33, inputs A0 to A3 and B0 to B3, outputs Y0 to Y7) with pipeline registers clocked by CLK]


Signal Gating
• Mask unwanted switching activity from propagating
• Generation of control signals requires additional logic circuitry (more power)
[Figure: a source signal passes through a gate or Latch/FF, driven by a control signal that suppresses the source signal, to produce the gated signal]


Signal Gating, cont.

• Signal gating saves power if the relative enable/disable frequency of the control signal is much lower than the frequency of the source signal (so many signal activities are blocked)
• Savings even greater if a group of source signals can share a control signal
• Good candidates - clock signals, address or data buses, signals with high frequency or high glitching


Guarded Evaluation
• Reduce switching activity by adding latches at the inputs of a unit, so that the inputs are held when its outputs are not used
• Latch preserves previous value of inputs to suppress activity
    – could also use AND gates to mask one or both inputs to zero -> forced zero (good if zero-out condition changes infrequently compared to data rate)
[Figure: unguarded multiplier with inputs A and B, output C, and a condition selecting whether C is used; guarded version with a latch on the multiplier inputs controlled by the same condition]
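
The effect of the guard latch can be sketched by counting the input bit toggles the multiplier sees with and without it. The model below is behavioral only; the sample operands are hypothetical, and `condition` is taken to mean "the multiplier output is used this cycle".

```python
def count_input_toggles(samples, guarded):
    """Count bit toggles at the multiplier inputs over a list of (a, b, condition)."""
    seen_a = seen_b = 0      # values currently presented to the multiplier
    toggles = 0
    for a, b, condition in samples:
        if guarded and not condition:
            continue         # latch holds the previous inputs: no switching
        toggles += bin(seen_a ^ a).count("1") + bin(seen_b ^ b).count("1")
        seen_a, seen_b = a, b
    return toggles


# Hypothetical trace: the result is only needed in the first and last cycle.
samples = [(3, 5, True), (12, 10, False), (7, 1, False), (3, 5, True)]
unguarded = count_input_toggles(samples, guarded=False)
guarded = count_input_toggles(samples, guarded=True)
print(unguarded, guarded)
```

The model omits the power of the latches and of the condition logic itself, which a real estimate would have to subtract from the savings.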


Sleep Modes
• Software power control - power management
  » DOZE - most functional units stopped except on-chip cache memory (cache coherency)
  » NAP - cache also turned off, time out or external interrupt to resume
  » SLEEP - clock off, external interrupt to resume
• Deeper sleep mode saves more power
• Deeper sleep mode requires more latency to resume


PowerPC Sleep Modes

Mode                  66MHz    80MHz
No power mgmt         2.18W    2.54W
Dynamic power mgmt    1.89W    2.20W
DOZE                  307mW    366mW
NAP                   113mW    135mW
SLEEP                 89mW     105mW
SLEEP without PLL     18mW     19mW
SLEEP without clock   2mW      2mW

10 cycles to wake up from SLEEP
100us to wake up from SLEEP+
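
As a worked example of what the table implies, the 66MHz column can be weighted by the fraction of time spent in each mode. The usage profile below is hypothetical, and the estimate ignores wake-up latency and any energy spent switching between modes.

```python
# 66MHz column of the table above, converted to watts.
POWER_66MHZ_W = {
    "No power mgmt": 2.18,
    "Dynamic power mgmt": 1.89,
    "DOZE": 0.307,
    "NAP": 0.113,
    "SLEEP": 0.089,
    "SLEEP without PLL": 0.018,
    "SLEEP without clock": 0.002,
}


def average_power(time_fractions, table=POWER_66MHZ_W):
    """Time-weighted average power for a mode profile whose fractions sum to 1."""
    if abs(sum(time_fractions.values()) - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    return sum(frac * table[mode] for mode, frac in time_fractions.items())


# Hypothetical profile: active 20% of the time, DOZE 30%, SLEEP 50%.
profile = {"Dynamic power mgmt": 0.2, "DOZE": 0.3, "SLEEP": 0.5}
avg_watts = average_power(profile)
always_on_watts = POWER_66MHZ_W["Dynamic power mgmt"]
print(avg_watts, always_on_watts)
```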


Keeper Circuits
• A floating node (not driven by any gates) can suffer charge decay resulting in short-circuit currents
• Keeper circuits can
  » slightly increase power dissipation
  » slightly increase delay
• Essential in circuits with sleep modes
[Figure: weak keeper holding a node driven from a powered down block, with a power down control]


A Low Power Processor Core Example


M•CORE Architecture

[Block diagram with blocks: GP reg file (32bit x 16); Alt reg file (32bit x 16); Control reg file (32bit x 13); X port, Y port, Immed; Scale; Barrel shift, FF1; ALU, priority encode, 0 detect; Sign ext; Instr pipeline; Instr decoder; Branch adder; PC increment; buses: Address bus, Data bus, Writeback bus, H/W acc bus]


M•CORE Power Distribution

[Pie chart 1. Legend: Datapath, Clock, Control. Shares: 36%, 36%, 28%]
[Pie chart 2. Legend: Reg File, Addr/Data Bus, Inst Reg, Barrel Shifter, X MUX, Y MUX, Addr Gen, Other. Shares: 42%, 14%, 9%, 8%, 7%, 6%, 5%, 9%]


Key References
Alidina, Precomputation-based sequential logic optimization for low power, IEEE Trans. on VLSI Systems, 2(4):426-436, 1994.
Hossain, Low power design using double edge triggered flipflop, IEEE Trans. on VLSI Systems, 2(2):261-265, 1994.
Lakshminarayana, et al., Common-Case Computation, Proc. of DAC, pp. 56-61, 1999.
Motorola, M•CORE Architecture microRISC Engine, MCORE 1/D, www.mot.com/SPS/MCORE/info_documentation.htm
Mutsunori, Low power design method using multiple supply voltages, Proc. of SLPED, pp. 36-41, 1997.
Rabaey, Digital Integrated Circuits, Prentice-Hall, 1996.
Reyes, Low Power FF Circuit and Method Thereof, Patent No 5,498,988, 1996.
Roy, Power analysis and design at the system level, Low Power Design in Deep Submicron Electronics, Nebel and Mermet, Ed., Kluwer, 1997.


Key References, cont.
Sakuta, Delay balanced multipliers for low power, Proc. of SLPE, pp. 36-37, 1995.
Scott, Designing the Low-Power M•CORE Architecture, Proc. Inter. Symp. Computer Architecture Power Driven Microarchitecture Workshop, June 1998.
Stojanovic, A unified approach in the analysis of latches and FFs for low power systems, Proc. of ISLPED, pp. 227-232, 1998.
Tiwari, Reducing power in high-performance microprocessors, Proc. of DAC, pp. 732-737, 1998.
Tiwari, Guarded evaluation, Proc. ISLPD, pp. 221-226, 1995.
Yeap, CPU controller optimization for HDL logic synthesis, Proc. of CICC, pp. 127-130, 1997.
Yeap, Practical Low Power Digital VLSI Design, KAP, 1998.