Leakage Energy Management in Cache Hierarchies L. Li, I. Kadayif, Y-F. Tsai, N. Vijaykrishnan, M. Kandemir, M. J. Irwin, and A. Sivasubramaniam Penn State University http://www.cse.psu.edu/~mdl PACT-2002 Charlottesville, Virginia September 22-25, 2002

Outline

Motivation
Related works
Circuit support for leakage control
Leakage optimization strategies
Integration with other strategies
Conclusion
Future works

Motivation

Leakage energy is projected to become the dominant portion of the chip power budget for 0.10 micron technology and below (A. Chandrakasan et al., Design of High-Performance Microprocessor Circuits).

Leakage energy is of particular concern in dense cache memories, which form a major portion of the transistor budget.

Related Works

M. D. Powell et al. An integrated circuit/architecture approach to reducing leakage in deep-submicron high-performance I-caches. (HPCA-7)

S. Kaxiras et al. Cache decay: exploiting generational behavior to reduce cache leakage power. (ISCA-28)

H. Zhou et al. Adaptive mode control: A static-power-efficient cache design. (PACT'01)

K. Flautner et al. Drowsy caches: Simple techniques for reducing leakage power. (ISCA-29)

Y-F. Tsai et al. A sizing model for SRAM data preserving sleep transistors. (ASIC'02)

Circuit Support for Leakage Control

State-destroying mechanism (Gated-Vdd): Introduce a power switch between ground and the circuit to reduce leakage. The switch is sized to maximize static power savings, but the data in the cells is lost.

State-preserving mechanism (Modified Gated-Vdd): Size the NMOS power switch so that it provides the minimum supply voltage required to maintain the state of a static memory cell.

State-preserving Leakage Control

Leakage Optimization Strategies

Employ state-destroying or state-preserving mechanisms in the cache. For a single block, the state-destroying mechanism saves more leakage energy than the state-preserving mechanism. Across the whole cache hierarchy, however, the state-destroying mechanism pays a higher miss penalty.

Exploit data duplication in the cache hierarchy. Data duplication: data in L2 subblocks also exists in L1 blocks.

Implement five leakage reduction strategies.

Leakage Optimization Strategies (II)

Strategy          When is L2 subblock turned off?    Mechanism in L2     When is L2 subblock reactivated?
Conservative      when L1 block becomes dirty        state-destroying    when accessed
Speculative-I     when L2 subblock is moved to L1    state-preserving    when accessed
Speculative-II    when L2 subblock is moved to L1    state-destroying    when accessed
Speculative-III   when L2 subblock is moved to L1    state-preserving    when L1 block is evicted
Speculative-IV    when L2 subblock is moved to L1    state-destroying    when L1 block is evicted
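The five strategies in the table can be read as a small state machine for one L2 subblock. The following is a minimal sketch of that reading, not the authors' simulator: the event names (`l1_fill`, `l1_write`, `l1_evict`, `l2_access`), the state names, and the `l2_states` helper are hypothetical, and the sketch tracks only the power mode, ignoring data movement, miss penalties, and settling time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    turn_off_on: str    # event that turns the L2 subblock off
    mechanism: str      # "state-destroying" or "state-preserving"
    reactivate_on: str  # event that turns the L2 subblock back on


# One entry per row of the strategy table. Event names are illustrative:
#   l1_fill   = L2 subblock is moved to L1
#   l1_write  = L1 block becomes dirty
#   l1_evict  = L1 block is evicted
#   l2_access = L2 subblock is accessed again
STRATEGIES = {
    "Conservative":    Strategy("l1_write", "state-destroying", "l2_access"),
    "Speculative-I":   Strategy("l1_fill", "state-preserving", "l2_access"),
    "Speculative-II":  Strategy("l1_fill", "state-destroying", "l2_access"),
    "Speculative-III": Strategy("l1_fill", "state-preserving", "l1_evict"),
    "Speculative-IV":  Strategy("l1_fill", "state-destroying", "l1_evict"),
}


def l2_states(strategy_name, events):
    """Return the power mode of one L2 subblock after each event."""
    strategy = STRATEGIES[strategy_name]
    state = "active"
    history = []
    for event in events:
        if state == "active" and event == strategy.turn_off_on:
            # Turned off: data is either lost or retained at low voltage.
            if strategy.mechanism == "state-destroying":
                state = "destroyed"
            else:
                state = "preserved"
        elif state != "active" and event == strategy.reactivate_on:
            state = "active"
        history.append(state)
    return history
```

For example, `l2_states("Speculative-III", ["l1_fill", "l1_write", "l1_evict", "l2_access"])` shows the subblock already active again by the time the later L2 access arrives, which is how that strategy hides the reactivation time.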

Conservative

[Figure: state diagram for the L1 block and the L2 subblock. States shown: Active, Destroying. Transitions shown: load, write.]

Only deactivate dead L2 subblocks. Before the L1 block is written, both copies of the data are in active mode.

Speculative-I

[Figure: state diagram for the L1 block and the L2 subblock. States shown: Active, Preserving. Transitions shown: load, re-access, evict.]

Put the L2 subblock in state-preserving mode when data is brought from L2 to L1.

No data is lost in L2, but time is needed to reactivate the L2 subblock on a re-access.

Speculative-II

[Figure: state diagram for the L1 block and the L2 subblock. States shown: Active, Destroying. Transitions shown: load, re-access, evict.]

Put the L2 subblock in state-destroying mode when data is brought from L2 to L1.

Data in L2 is lost, and a re-access takes longer because the data must be loaded from main memory.

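Speculative-III

[Figure: state diagram for the L1 block and the L2 subblock. States shown: Active, Preserving. Transitions shown: load, evict.]

Similar to Speculative-I, except that the L2 subblock is reactivated when the L1 block is replaced.

This hides the reactivation time.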

Speculative-IV

[Figure: state diagram for the L1 block and the L2 subblock. States shown: Active, Destroying. Transitions shown: load, evict and write back.]

Similar to Speculative-II, except that the L2 subblock is written back when the L1 block is replaced.

Experimental Configuration

Parameter                                                              Value
Technology                                                             0.07 micron
Supply Voltage                                                         1.0 V
Virtual Supply Settling Time                                           50 cycles
Dynamic Energy per L1 Access                                           0.565 nJ
Dynamic Energy per L2 Access                                           5.83 nJ
Leakage Energy per L1 Block per Active Cycle                           0.551 pJ
Leakage Energy per L2 Subblock per Standby Cycle (state-preserving)    0.055 pJ
Leakage Energy per L2 Subblock per Standby Cycle (state-destroying)    0 pJ
Control Energy                                                         0.055 nJ
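To get a feel for these numbers, one can estimate how many standby cycles a subblock must spend turned off before the leakage saved covers the control energy. The sketch below rests on two assumptions that the slides do not state: that an active L2 subblock leaks at the listed per-block active rate (0.551 pJ per cycle), and that the control energy (0.055 nJ) is charged once per turn-off. The function name `break_even_cycles` is hypothetical.

```python
import math

# Values from the configuration table, converted to picojoules.
ACTIVE_LEAK_PJ = 0.551      # per block per active cycle (assumed to apply to an L2 subblock)
PRESERVING_LEAK_PJ = 0.055  # per L2 subblock per standby cycle, state-preserving
DESTROYING_LEAK_PJ = 0.0    # per L2 subblock per standby cycle, state-destroying
CONTROL_ENERGY_PJ = 55.0    # 0.055 nJ, assumed to be paid once per turn-off


def break_even_cycles(standby_leak_pj,
                      active_leak_pj=ACTIVE_LEAK_PJ,
                      overhead_pj=CONTROL_ENERGY_PJ):
    """Smallest number of standby cycles whose leakage saving covers the overhead."""
    saved_per_cycle = active_leak_pj - standby_leak_pj
    if saved_per_cycle <= 0:
        raise ValueError("standby mode must leak less than active mode")
    return math.ceil(overhead_pj / saved_per_cycle)
```

Under these assumptions, `break_even_cycles(PRESERVING_LEAK_PJ)` gives 111 cycles and `break_even_cycles(DESTROYING_LEAK_PJ)` gives 100 cycles. The estimate leaves out the 50-cycle settling time and any extra miss penalty, both of which push the real break-even point further out.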

Result of Energy Saving

[Figure: normalized energy consumption (%), split into Leakage, Dynamic, and Control components, for the Conservative, Speculative-I, Speculative-II, Speculative-III, and Speculative-IV strategies on the benchmarks adpcm-rawcaudio, cjpeg, epic, g721-decode, and mesa-mipmap. Y-axis range: 0 to 120.]

Result of Energy-delay Saving

[Figure: normalized energy-delay products (%) for the Conservative, Speculative-I, Speculative-II, Speculative-III, and Speculative-IV strategies on the benchmarks adpcm-rawcaudio, cjpeg, epic, g721-decode, and mesa-mipmap. Y-axis range: 0 to 140.]

Average Saving of Five Strategies

[Figure: average saving (%) in Leakage, Total Cache Energy, and Energy-Delay for the five strategies. Y-axis range: -75 to 75; one off-scale value is labeled -756.44.]

Integration With Other Strategies

Cache decay: exploits generational behavior and uses a state-destroying mechanism to reduce cache leakage energy.

Implement four strategies:

Strategy               L1                               L2
Decay-I                cache decay, state-destroying    cache decay, state-destroying
Decay-II               cache decay, state-destroying    cache decay, state-preserving
Speculative-Decay-I    cache decay, state-destroying    speculative-I, state-preserving
Speculative-Decay-II   cache decay, state-destroying    cache decay + speculative-I, state-preserving
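The cache decay idea used in all four combinations can be illustrated with a per-line idle counter: a line that goes unaccessed for a full decay interval is turned off. This is a simplified sketch, not the counter design of Kaxiras et al. The class name `DecayLine` and the `decay_interval` parameter are hypothetical, and the sketch models the state-destroying variant, in which touching a decayed line counts as a miss.

```python
class DecayLine:
    """One cache line with a simplified decay counter (state-destroying)."""

    def __init__(self, decay_interval):
        self.decay_interval = decay_interval  # idle cycles before power-off (illustrative value)
        self.idle_cycles = 0
        self.powered = True

    def access(self):
        """Touch the line; return True if it had decayed (data must be refetched)."""
        decayed = not self.powered
        self.powered = True
        self.idle_cycles = 0
        return decayed

    def tick(self, cycles=1):
        """Advance time; power the line off once it has been idle long enough."""
        if self.powered:
            self.idle_cycles += cycles
            if self.idle_cycles >= self.decay_interval:
                self.powered = False
```

A line created with `DecayLine(4)` stays powered if it is accessed at least once every three cycles, and is turned off after four idle cycles. A state-preserving variant, as in the L2 column of Decay-II, would keep the data and pay only a reactivation delay instead of a refetch.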

Result of Energy Saving

[Figure: normalized energy consumption (%), split into Leakage, Dynamic, and Control components, for the Decay-I, Decay-II, Speculative-Decay-I, and Speculative-Decay-II strategies on the benchmarks adpcm-rawcaudio, cjpeg, epic, g721-decode, and mesa-mipmap. Y-axis range: 0 to 100.]

Result of Energy-delay Saving

[Figure: normalized energy-delay products (%) for the Decay-I, Decay-II, Speculative-Decay-I, and Speculative-Decay-II strategies on the benchmarks adpcm-rawcaudio, cjpeg, epic, g721-decode, and mesa-mipmap. Y-axis range: 0 to 140.]

Average Savings of Strategies

[Figure: average saving (%) in Leakage, Total Cache Energy, and Energy-Delay for the Speculative-I, Decay-I, Decay-II, Speculative-Decay-I, and Speculative-Decay-II strategies. Y-axis range: -75 to 75.]

Conclusion

Duplication of data at different levels of the memory hierarchy is costly from the leakage energy perspective.

Applying a state-preserving leakage control strategy to the L2 cache can reduce energy consumption significantly.

Our strategies can be combined with other techniques to provide additional energy gains.

Future Works

More powerful combined optimization strategies. Combining state-preserving and state-destroying strategies.

Software-based leakage optimization. Integrating hardware-based and software-based strategies.

Thanks!