
Page 1: But...  That’s three wishes in one!!!

Uppsala Architecture Research Team

Mommy, mommy! I want a hardware cache with few conflicts and low power consumption that is easy to implement!

But... That’s three wishes in one!!!

Page 2: But...  That’s three wishes in one!!!

Refinement and Evaluation of the Elbow Cache

or

The Little Cache that could

Mathias Spjuth

Page 3: But...  That’s three wishes in one!!!

[Figure: 2-way Set Associative Cache. Blocks A to H in the Address Space map onto the Sets of the Cache; the animation steps through the Memory References A-B-C-D-E-F-G-H.]

Page 4: But...  That’s three wishes in one!!!

Conflicts (cont.)

The traditional way of reducing conflicts is to use set associative caches.

+ Lower miss rate (than direct-mapped)

- Slower access

- More complexity (uses more chip area)

- Higher power consumption

Page 5: But...  That’s three wishes in one!!!

[Figure: 2-way Skewed Associative Cache. Blocks A to H in the Address Space map onto Cache Bank 1 and Cache Bank 2; the animation steps through the Memory References A-B-C-D-E-F-G-H.]

Page 6: But...  That’s three wishes in one!!!

[Figure: 2-way Skewed Associative Cache (Cache Bank 1, Cache Bank 2) after the Memory References A-B-C-D-E-F-G-H. Caption: No Conflicts!]

Page 7: But...  That’s three wishes in one!!!

Skewed associative caches

Uses different hashing (skewing) functions for indexing each cache bank

+ Lower miss rate (than set-assoc.)

+ More predictable

- Slightly slower (hashing)

- "Cannot" use LRU replacement

- "Cannot" use VI-PT
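As a rough illustration of indexing each bank with a different function, the sketch below uses plain low-order index bits for one bank and an XOR-based skew for the other. The function `bank_index`, the XOR skew, and the bank size are assumptions made for this example; they are not the hashing functions used in the talk.

```python
# Illustrative 2-way skewed indexing; the hash functions are assumptions.
BLOCK_BITS = 5   # 32-byte blocks
INDEX_BITS = 8   # assumed: 256 lines per bank
MASK = (1 << INDEX_BITS) - 1


def bank_index(addr: int, bank: int) -> int:
    """Return the line index of byte address `addr` in `bank` (0 or 1)."""
    block = addr >> BLOCK_BITS
    low = block & MASK
    high = (block >> INDEX_BITS) & MASK
    if bank == 0:
        return low        # bank 0: plain low-order index bits
    return low ^ high     # bank 1: assumed XOR skew with the next bits


# Two addresses that share a line in bank 0 but not in bank 1.
a, b = 0x0000, 0x2000
same_in_bank0 = bank_index(a, 0) == bank_index(b, 0)
same_in_bank1 = bank_index(a, 1) == bank_index(b, 1)
```

In this toy setup the two addresses compete for one line in bank 0 but land on different lines in bank 1, which is the effect the skewing is meant to achieve.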

Page 8: But...  That’s three wishes in one!!!

Elbow Cache

Improve the performance of a skewed associative cache by reallocating blocks within the cache.

By doing so we get a broader choice of which block to choose as the victim.

Use timestamps as replacement metric.

Page 9: But...  That’s three wishes in one!!!

Finding the victim

Two methods:

1. Look-ahead: Consider all possible placements before the first reallocation is made.

2. Feedback: Only consider the immediate placements, then iterate.

Page 10: But...  That’s three wishes in one!!!

[Figure: 2-way Elbow Lookahead Cache (Cache Bank 1, Cache Bank 2). Memory References: A-B-C-D-E-F-G-H-X. Replacement paths: F-B-A and E-D-H.]

Page 11: But...  That’s three wishes in one!!!

[Figure: 2-way Elbow Feedback Cache (Cache Bank 1, Cache Bank 2) with a Temp. Register. Memory References: A-B-C-D-E-F-G-H-X.]

Page 12: But...  That’s three wishes in one!!!

Finding the victim (cont.)

Look-ahead:

+ Most optimal

- Difficult to implement (>1 transformation)

Feedback:

+ Easy to implement (feed victim back to write buffer)

- Needs extra space in the write buffer

Page 13: But...  That’s three wishes in one!!!

Replacement Metrics

Enhanced-Not-Recently-Used (NRUE):

The best policy for skewed caches known so far.

Each block contains two extra bits, a recently-used and very-recently-used bit, that are set on access to the block.

These bits are regularly cleared. The very-recently-used bit is cleared more often.

First, try to find a victim with no bit set.

Then one with only the recently-used bit set.

Then use random replacement.
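The three-stage victim search described above can be sketched as follows. The `Block` class, the function names, and the tie-breaking (first match wins) are assumptions for illustration; the clearing schedule for the two bits is not modelled.

```python
import random
from dataclasses import dataclass


@dataclass
class Block:
    tag: int
    recently_used: bool = False
    very_recently_used: bool = False


def touch(block: Block) -> None:
    """Set both bits on an access to the block."""
    block.recently_used = True
    block.very_recently_used = True


def pick_victim_nrue(candidates, rng=random):
    """Pick a victim among the candidate blocks in NRUE order."""
    # 1. Prefer a block with no bit set.
    for b in candidates:
        if not b.recently_used and not b.very_recently_used:
            return b
    # 2. Then a block with only the recently-used bit set.
    for b in candidates:
        if b.recently_used and not b.very_recently_used:
            return b
    # 3. Otherwise fall back to random replacement.
    return rng.choice(candidates)


blocks = [Block(1), Block(2), Block(3)]
touch(blocks[0])                      # both bits set
blocks[1].recently_used = True        # only the recently-used bit set
first_choice = pick_victim_nrue(blocks)
second_choice = pick_victim_nrue(blocks[:2])
```

Here `first_choice` is the untouched block and `second_choice` is the block with only the recently-used bit set.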

Page 14: But...  That’s three wishes in one!!!

Timestamps

[Figure: each cache block stores its Data together with a Timestamp (block A with T_A, block B with T_B); a Counter holds the current time T_curr.]

Increase counter on every cache allocation

$$
\mathrm{Dist}(A) =
\begin{cases}
T_{max} - T_A + T_{curr} & \text{if } T_{curr} < T_A \\
T_{curr} - T_A & \text{if } T_{curr} \ge T_A
\end{cases}
$$
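A minimal sketch of the distance computation, reading Dist as the age of a block in counter ticks modulo the counter period, so the wrapped case is T_max - T_A + T_curr. The names `dist` and `oldest`, and the choice T_max = 32 for a 5-bit counter, are assumptions for this example.

```python
T_MAX = 32  # assumed wrap period of a 5-bit timestamp counter


def dist(t_block: int, t_curr: int, t_max: int = T_MAX) -> int:
    """Age of a block in ticks, allowing for one counter wrap-around."""
    if t_curr < t_block:
        return t_max - t_block + t_curr   # counter has wrapped since allocation
    return t_curr - t_block


def oldest(timestamps: dict, t_curr: int, t_max: int = T_MAX) -> str:
    """Return the name of the block with the largest distance."""
    return max(timestamps, key=lambda name: dist(timestamps[name], t_curr, t_max))


# Block A was stamped just before the counter wrapped, block B just after.
stamps = {"A": 30, "B": 1}
dist_a = dist(stamps["A"], t_curr=3)
dist_b = dist(stamps["B"], t_curr=3)
victim = oldest(stamps, t_curr=3)
```

With these made-up values A has the larger distance, so A is the older block even though its stored timestamp is numerically larger.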

Page 15: But...  That’s three wishes in one!!!

Timestamps

[Figure: Timestamp axis in ticks from 0 to T_max, marking T_A, T_B and T_curr.]

Dist(A) > Dist(B): A older than B

Dist(A) < Dist(B): B older than A

Page 16: But...  That’s three wishes in one!!!

Implementation

Lookahead:

At most one transformation (4 possible victims) each replacement.

Do the transformation and load the new data at the same time.
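One way to picture the four possible victims is sketched below. It assumes both banks are full, that each block's line index in each bank is given by a hypothetical lookup `index(name, bank)`, and that block ages are available in an `age` table (for instance the timestamp distance). It is an illustration of the idea, not the design presented in the talk.

```python
def lookahead_candidates(x, banks, index):
    """Return (victim, moved) pairs for incoming block x.

    `moved` is the block relocated to its other bank to make room,
    or None when the immediate resident is evicted directly.
    """
    out = []
    for b in (0, 1):
        first = banks[b][index(x, b)]
        out.append((first, None))                  # evict the immediate resident
        other = 1 - b
        second = banks[other][index(first, other)]
        out.append((second, first))                # move `first`, evict `second`
    return out


def choose_victim(candidates, age):
    """Pick the candidate whose victim is oldest."""
    return max(candidates, key=lambda c: age[c[0]])


# Hypothetical layout: per-block line index in (bank 0, bank 1).
idx = {"X": (0, 0), "A": (0, 1), "B": (1, 0), "C": (2, 1), "D": (1, 2)}
banks = [{0: "A", 1: "D"}, {0: "B", 1: "C"}]
age = {"A": 3, "B": 5, "C": 9, "D": 1}

cands = lookahead_candidates("X", banks, lambda name, b: idx[name][b])
victim, moved = choose_victim(cands, age)
```

In this example the oldest block is C, which is reachable only by first moving A into its alternative location, so the single transformation widens the choice beyond the two immediate residents.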

Page 17: But...  That’s three wishes in one!!!

Implementation

Feedback:

Up to 7 transformations (max. 8 possible victims) each replacement.

Temporary victims are moved to the write buffer, before reallocation.

Extra control field in write buffer.
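The iteration can be sketched roughly as follows, under stated assumptions: both banks are full, a temporary victim is re-inserted at its alternative location only if the resident there is older, and the loop stops at the first failed comparison or after `max_steps` moves. The function name, the stopping rule, and the `index` and `age` helpers are hypothetical; the write buffer is reduced to a single `temp` variable.

```python
def feedback_replace(x, banks, index, age, max_steps=7):
    """Insert x and return (final_victim, number_of_moves)."""
    def a(name):
        return age.get(name, 0)        # newly inserted blocks count as age 0

    # Immediate placements of x: displace the older of the two residents.
    bank = max((0, 1), key=lambda b: a(banks[b][index(x, b)]))
    slot = index(x, bank)
    temp = banks[bank][slot]           # temporary victim (held in the write buffer)
    banks[bank][slot] = x

    moves = 0
    while moves < max_steps:
        other = 1 - bank
        slot = index(temp, other)
        resident = banks[other][slot]
        if a(resident) <= a(temp):
            break                      # temp is at least as old: final victim
        banks[other][slot] = temp      # reallocate temp, displace older resident
        temp, bank = resident, other
        moves += 1
    return temp, moves


# Hypothetical layout: per-block line index in (bank 0, bank 1).
idx = {"X": (0, 0), "A": (0, 1), "B": (1, 0),
       "C": (2, 1), "D": (1, 2), "E": (2, 2)}
banks = [{0: "A", 1: "D"}, {0: "B", 1: "C", 2: "E"}]
age = {"A": 3, "B": 5, "C": 9, "D": 7, "E": 2}

victim, moves = feedback_replace("X", banks, lambda name, b: idx[name][b], age)
```

In this example X displaces B, B is fed back and displaces the older D, and D becomes the final victim after one move.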

Page 18: But...  That’s three wishes in one!!!

Feedback

[Figure: feedback datapath. Bank I and Bank II, each holding Data+Tag with a timestamp (TmSt), and a Write Buffer whose entries are labelled Xid1, Xid2b, Step, Data+Tag, TmSt; read and write paths to memory (readmem, writemem).]

Page 19: But...  That’s three wishes in one!!!

Test Configurations

Set associative: 2-way, 4-way, 8-way, 16-way

Fully associative cache

Skewed associative, LRU

Skewed associative, NRUE

Skewed associative, 5-bit timestamp

Elbow cache, 1-step lookahead, 5-bit timestamp

Elbow cache, 7-step feedback, 5-bit timestamp

Page 20: But...  That’s three wishes in one!!!

Test Configurations (2)

General configuration:

8 KB, 16 KB, 32 KB cache size

L1 data cache with 32 byte block size

Write Back – No Allocate on Write & infinite write buffer (all writes ignored)

Miss Rate Reduction (MRR):

$$
\mathrm{MRR} = \frac{MR_{ref} - MR}{MR_{ref}}
$$
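The miss rate reduction is a relative improvement over the reference miss rate, MRR = (MR_ref - MR) / MR_ref. A small sketch, where the function name and the sample miss rates are made up for illustration:

```python
def miss_rate_reduction(mr_ref: float, mr: float) -> float:
    """Relative reduction of miss rate `mr` against the reference `mr_ref`."""
    if mr_ref <= 0:
        raise ValueError("reference miss rate must be positive")
    return (mr_ref - mr) / mr_ref


# Hypothetical numbers: reference misses 10% of accesses, candidate misses 8%.
mrr = miss_rate_reduction(0.10, 0.08)
worse = miss_rate_reduction(0.10, 0.11)
```

A positive value means fewer misses than the reference; a negative value, as in `worse`, means the configuration misses more often than the reference.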

Page 21: But...  That’s three wishes in one!!!

[Figure: Average miss rate reduction. x-axis: Cache size (8KB, 16KB, 32KB); y-axis: Miss rate reduction (0% to 25%). Series: 2-w, 4-w, 8-w, 16-w, Fully Assoc., Skewed LRU, Skewed NRUE, Skewed TS 5-bit, Elbow LA 5-bit, Elbow FB 5-bit-7-step.]

Page 22: But...  That’s three wishes in one!!!

[Figure: 16 KB Cache size. x-axis: Benchmark (Red. SPEC 2000): AMMP, EQUAKE, MCF, PARSER, VCF_PLACE, VCF_ROUTE; y-axis: Miss Rate Reduction (-5% to 30%). Series: 2-w, 4-w, 8-w, 16-w, Fully Assoc., Skewed LRU, Skewed NRUE, Skewed TS 5-bit, Elbow LA 5-bit, Elbow FB 5-bit-7-step.]

Page 23: But...  That’s three wishes in one!!!

Conclusions

I. For a 2-way skewed cache, timestamp replacement gives almost the same performance as LRU.

II. Timestamps are useful.

III. A 2-way elbow cache has roughly the same performance as an 8-way set associative cache of the same size.

Page 24: But...  That’s three wishes in one!!!

Conclusions (2)

IV. The lookahead design is slightly better than the feedback.

V. There are drawbacks with all skewed caches (skewing delays, VI-PT).

VI. If the problems can be solved, the elbow cache is a good alternative to set associative caches.

Page 25: But...  That’s three wishes in one!!!

Future Work

Power awareness:

How does an elbow cache stand up against traditional set associative caches when power consumption is considered?

Page 26: But...  That’s three wishes in one!!!

Links

UART web:

www.it.uu.se/research/group/uart/

Page 27: But...  That’s three wishes in one!!!

?