Emulating Optimal Replacement with a Shepherd Cache

Kaushik Rajan¹ & R. Govindarajan
Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore
05-Dec-2007

¹ Author acknowledges financial support provided through the MSR PhD Fellowship.


Outline

1 Introduction
    Memory hierarchy
    Drawbacks of LRU

2 The Optimal Replacement Policy (OPT)
    Definitions
    Understanding OPT

3 Emulating Optimal Replacement with a Shepherd Cache
    Shepherd cache
    Illustrative example
    Cache organization

4 Performance Evaluation
    Bridging the performance gap
    Comparison with related work
    Impact on system performance

5 Conclusions

Introduction Memory hierarchy

Memory Hierarchy

L2 and lower caches

    Objective: reduce expensive memory accesses

    Design: large size, higher associativity, complex design

    Problem: they do not interact with the program directly and observe only filtered temporal locality

High associativity =⇒ replacement policy is crucial to performance

L1 cache services temporal accesses =⇒ lack of temporal accesses at L2 =⇒ LRU replacement is inefficient

Replacement decisions are taken off the processor critical path

Introduction Drawbacks of LRU

LRU vs OPT

[Figure: MPKI of LRU and OPT for the SPEC2000 suite. Benchmarks with MPKI < 5 are not plotted but count towards the average.]

Huge performance gap between LRU and OPT

OPT at half the size is preferable to LRU at double the size


The Optimal Replacement Policy (OPT) Definitions

The Optimal Replacement Policy

1 Replacement Candidates: On a miss, a replacement policy can either replace any one of the lines in the cache or choose not to place the miss-causing line in the cache at all.

2 Self Replacement: The latter choice is referred to as a self-replacement or a cache bypass.

Optimal Replacement Policy

On a miss, replace the candidate to which an access is least imminent [Belady1966, Mattson1970, McFarling-thesis].

3 Lookahead Window: The window of accesses between the miss-causing access and the access to the least imminent replacement candidate. Single-pass simulations of OPT make use of lookahead windows to identify replacement candidates and modify the current cache state [Sugumar-SIGMETRICS1993].
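The definitions above can be made concrete with a small sketch. It is an illustration only, not the authors' simulator: the function names `imminence_order` and `lookahead_window` are hypothetical, the window is assumed to start just after the miss-causing access and to end at the first access to the least imminent candidate, and the access stream is the one used in the worked example of this talk.

```python
def imminence_order(candidates, future):
    """Sort candidates from most imminent to least imminent.

    `future` lists the accesses that follow the miss. A candidate that is
    never accessed again is treated as infinitely far away.
    """
    def next_use(line):
        return future.index(line) if line in future else float("inf")

    return sorted(candidates, key=next_use)


def lookahead_window(candidates, future):
    """Accesses after the miss, up to and including the first access to the
    least imminent candidate (the whole future if it is never reused)."""
    victim = imminence_order(candidates, future)[-1]
    if victim not in future:
        return list(future)
    return future[: future.index(victim) + 1]


# Accesses that follow the miss on A5 in the talk's example stream.
future_after_a5 = ["A1", "A6", "A3", "A1", "A4", "A5", "A2", "A5", "A7", "A6", "A8"]
candidates = ["A1", "A2", "A3", "A4", "A5"]  # cached lines plus the missing line

print(imminence_order(candidates, future_after_a5))
# ['A1', 'A3', 'A4', 'A5', 'A2']  -> A2 is least imminent
print(lookahead_window(candidates, future_after_a5))
# ['A1', 'A6', 'A3', 'A1', 'A4', 'A5', 'A2']
```

Including the missing line among the candidates is what makes a bypass possible: if it sorts last, OPT simply does not cache it.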

The Optimal Replacement Policy (OPT) Understanding OPT

Understanding OPT

Access sequence: A5, A1, A6, A3, A1, A4, A5, A2, A5, A7, A6, A8

[Figure: table of the access sequence with the OPT (imminence) order, numbered 0 to 4, for the misses on A5 and A6; the lookahead windows are circled.]

Consider a 4-way associative cache with one set initially containing lines (A1, A2, A3, A4), and the access stream shown in the table.

Access A5 misses; the replacement decision proceeds as follows:

1 Identify replacement candidates: (A1, A2, A3, A4, A5)

2 Look ahead and gather the imminence order: shown in the table, with the lookahead window circled

3 Make the replacement decision: A5 replaces A2

A6 self-replaces; its lookahead window and imminence order are also shown in the table.
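The walkthrough above can be replayed with a short simulation of OPT with bypass for a single set. This is a minimal sketch under stated assumptions: `simulate_opt` is a hypothetical name, and ties between candidates that are never accessed again are broken in favour of bypassing the incoming line, a choice the slides do not specify.

```python
def simulate_opt(initial_lines, accesses):
    """Replay `accesses` on one cache set under OPT with bypass.

    Returns (misses, decisions), where each decision is a pair
    (missing_line, victim); victim == missing_line denotes a self-replacement.
    """
    cache = list(initial_lines)
    misses, decisions = 0, []
    for i, line in enumerate(accesses):
        if line in cache:
            continue  # hit: OPT keeps no per-line state to update
        misses += 1
        future = accesses[i + 1:]

        def next_use(candidate):
            return future.index(candidate) if candidate in future else float("inf")

        # The incoming line is listed first so that ties favour a bypass
        # (max returns the first of several equally distant candidates).
        victim = max([line] + cache, key=next_use)
        if victim != line:
            cache[cache.index(victim)] = line
        decisions.append((line, victim))
    return misses, decisions


stream = ["A5", "A1", "A6", "A3", "A1", "A4", "A5", "A2", "A5", "A7", "A6", "A8"]
misses, decisions = simulate_opt(["A1", "A2", "A3", "A4"], stream)
print(decisions[0])  # ('A5', 'A2'): A5 replaces A2
print(decisions[1])  # ('A6', 'A6'): A6 self-replaces
print(misses)        # 6
```

The first two decisions agree with the slide: A5 replaces A2, and A6 is bypassed because every cached line is accessed again before A6 is.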


Emulating Optimal Replacement with a Shepherd Cache Shepherd cache

Motivation

LRU-OPT gap

    A smaller OPT cache is better than a larger LRU cache =⇒ focus on matching OPT performance, even on a smaller cache

Lookahead

    OPT requires lookahead to find the least imminent line =⇒ use part of the cache to aid in emulating OPT for the remaining cache

Emulating OPT with a Shepherd Cache

Split the cache into two logical parts

    Main Cache (MC), for which optimal replacement is emulated

    Shepherd Cache (SC), used to provide a lookahead and guide replacements from MC towards OPT

Operation

1 Buffer lines temporarily in SC before moving them to MC; SC acts as a FIFO buffer

2 While a line is in SC, gather imminence information and emulate lookahead

3 When the line is forced out of SC, make an MC replacement based on the gathered imminence order

Emulating Optimal Replacement with a Shepherd Cache Illustrative example

Overview of Shepherd Caching

[Figure: one cache set with MC lines A1, A2, A3, A4, two SC ways (SC1, SC2), the counter matrix (CM) and the next value counters NVC1 and NVC2. Access sequence: A5, A1, A6, A3, A1, A4, A5, A2, A5, A7, A6, A8]

The example emulates an MC with 4 ways per set, using 2 SC ways per set

To gather the imminence order, add a counter matrix (CM)

CM has one column per SC way, to track the imminence order with respect to the line in that way

CM has one row per SC and MC line, as any of them can be a replacement candidate

Each column has one Next Value Counter (NVC) to track the next value to assign along that column

[Figure sequence (animation frames): step-by-step state of the MC lines, the SC ways SC1 and SC2, the counter matrix (CM) and the NVCs as the accesses A5, A1, A6, A3, A1, A4, A5, A2, A5, A7, A6, A8 are processed. Unset CM entries are marked "e". The first frame shows MC holding A1, A2, A3, A4 with the SC ways empty; the last frame shows A5 in MC in place of A2, with A7 and A8 in the SC ways.]
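The example can be replayed with a small single-set model of the mechanism. This is a sketch under assumptions, not the authors' hardware design or simulator: `simulate_shepherd_set` is a hypothetical name, the counter matrix is kept as one dictionary per SC line (a missing key stands for "e"), the baseline MC policy is assumed to be LRU, and a line that misses is assumed not to update the columns of older SC lines.

```python
from collections import deque


def simulate_shepherd_set(initial_mc, accesses, sc_ways=2):
    """Return (misses, mc_lines, sc_lines, log) for one cache set.

    log holds (owner, victim) pairs, one per line forced out of SC;
    victim == owner denotes a self-replacement.
    """
    mc = list(initial_mc)                  # Main Cache lines
    sc = deque()                           # Shepherd Cache lines, oldest first (FIFO)
    cm = {}                                # cm[owner][line] -> imminence count
    nvc = {}                               # nvc[owner] -> next value for owner's column
    last_use = {line: -1 for line in mc}   # timestamps for the assumed LRU baseline
    misses, log = 0, []

    for t, line in enumerate(accesses):
        if line not in mc and line not in sc:
            misses += 1
            if len(sc) == sc_ways:         # oldest SC line is forced out
                owner = sc.popleft()
                col = cm.pop(owner)
                nvc.pop(owner)
                unseen = [m for m in mc if m not in col]
                if unseen:                 # partial order: fall back to the baseline
                    victim = min(unseen, key=last_use.get)
                elif owner not in col:     # every MC line was reused, the owner was not
                    victim = owner
                else:                      # full order: highest count is least imminent
                    victim = max(mc + [owner], key=col.get)
                if victim != owner:
                    mc[mc.index(victim)] = owner
                log.append((owner, victim))
            sc.append(line)
            cm[line], nvc[line] = {}, 0    # fresh column, all entries "e"
        else:
            for owner in sc:               # first access since owner entered SC
                if line not in cm[owner]:
                    cm[owner][line] = nvc[owner]
                    nvc[owner] += 1
        last_use[line] = t
    return misses, mc, list(sc), log


stream = ["A5", "A1", "A6", "A3", "A1", "A4", "A5", "A2", "A5", "A7", "A6", "A8"]
misses, mc, sc, log = simulate_shepherd_set(["A1", "A2", "A3", "A4"], stream)
print(log)     # [('A5', 'A2'), ('A6', 'A6')]
print(mc, sc)  # ['A1', 'A5', 'A3', 'A4'] ['A7', 'A8']
print(misses)  # 4
```

Under these assumptions the model reaches the same decisions as OPT on the 4-way main cache: A5 replaces A2 and A6 self-replaces. The miss count is lower than for a 4-way OPT cache only because the set here holds six lines in total, so A2 and A6 are still resident when they are accessed again.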

Emulating Optimal Replacement with a Shepherd Cache Cache organization

Baseline MC replacement

The lookahead provided may not always be sufficient to identify the least imminent replacement candidate

The partial imminence order is used to restrict the replacement choice to the less imminent candidates

Baseline MC replacement is used to choose the victim among these

Baseline MC replacement can be any standard policy; we evaluate LRU, LFU and random

When a line moves from SC to MC, place it in the LRU/LFU position

[Figure: MC, SC ways, CM and NVCs in the initial state and before access A7, for the access sequence A5, A1, A6, A3, A5, A1, A7 with MC initially holding A1, A2, A3, A4. Before access A7 the CM entries of A2 and A4 are still "e".]

A5 replaces one of A2 or A4, based on baseline MC replacement
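The victim choice with a partial imminence order can be sketched as follows. The function name `choose_victim` and its argument layout are hypothetical, and the handling of an owner line that was itself never re-accessed is an assumption; the slides only state that the baseline policy picks among the less imminent candidates.

```python
def choose_victim(column, mc_lines, owner, baseline_order):
    """Pick the line to drop when `owner` is forced out of SC.

    column         : dict line -> imminence count in the owner's CM column;
                     a missing key stands for an "e" (unset) entry
    mc_lines       : lines currently in the Main Cache
    owner          : the SC line being forced out
    baseline_order : MC lines ordered from most to least preferred victim
                     by the baseline policy (for LRU: least recent first)
    Returning `owner` denotes a self-replacement.
    """
    # MC lines not accessed during the lookahead are the less imminent candidates.
    unseen = [line for line in baseline_order
              if line in mc_lines and line not in column]
    if unseen:
        return unseen[0]
    if owner not in column:
        return owner
    return max(list(mc_lines) + [owner], key=lambda line: column[line])


# Column of A5 before access A7 in the slide's example: A1, A3 and A5 were
# accessed while A5 sat in SC; A2 and A4 were not.
column_a5 = {"A1": 0, "A3": 1, "A5": 2}
victim = choose_victim(column_a5, ["A1", "A2", "A3", "A4"], "A5",
                       baseline_order=["A4", "A2", "A3", "A1"])
print(victim)  # A4 with this (made-up) baseline order; A2 if A2 were listed first
```

With a complete column the baseline order is ignored and the highest count wins, which reduces to the OPT decision over the candidates.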

Cache Organization

[Figure: Cache organization (SC-4, MC-12). Each set has tag and data entries for all ways, an SC-flag/SC-ptr field per way, a counter matrix with one column per SC way (s/0 to s/3) and the next value counters.]

Divide the cache into SC and MC such that some ways of each set are SC ways

Use the SC-flag to identify SC ways; use the SC-ptr to link an SC line to its CM column

No need to physically move lines from SC to MC; just adjust the SC-flag/SC-ptr

Larger SC associativity =⇒ more lookahead =⇒ OPT in a smaller MC
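The point about not moving lines can be illustrated with a small data-structure sketch. All names here (`Way`, `retire_shepherd_line`) are hypothetical, and reusing the freed CM column for the incoming line is an assumption made for the illustration, not a detail taken from the slides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Way:
    tag: Optional[str] = None
    sc_flag: bool = False         # True while the way logically belongs to SC
    sc_ptr: Optional[int] = None  # index of this line's CM column, if in SC


def retire_shepherd_line(ways: List[Way], owner_idx: int, victim_idx: int,
                         new_tag: str) -> None:
    """Force the SC line in `owner_idx` out of SC and install `new_tag`.

    No data is copied between ways: the owner's way simply stops being an
    SC way, and the victim's way becomes the SC way for the incoming line.
    victim_idx == owner_idx models a self-replacement.
    """
    column = ways[owner_idx].sc_ptr
    if victim_idx != owner_idx:
        ways[owner_idx].sc_flag = False   # owner now counts as an MC line
        ways[owner_idx].sc_ptr = None
    ways[victim_idx] = Way(tag=new_tag, sc_flag=True, sc_ptr=column)


ways = [Way("A1"), Way("A2"), Way("A3"), Way("A4"),
        Way("A5", True, 0), Way("A6", True, 1)]
retire_shepherd_line(ways, owner_idx=4, victim_idx=1, new_tag="A7")
print([(w.tag, w.sc_flag, w.sc_ptr) for w in ways])
```

After the call, A5 stays in its physical way but is treated as an MC line, while A7 occupies the way that held A2 and takes over column 0.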


Performance Evaluation Bridging the performance gap

Simulation Methodology

1 Memory hierarchy
    L2 cache: 512KB, 1MB, 2MB and 4MB; 16-way, 128B lines, 8 cycles
    L1 cache: both L1-I and L1-D are 16KB, 2-way, 32B lines, 2 cycles
    Memory: infinite size, 400 cycles
    Infinite MSHR and store buffer

2 SC+MC configuration
    Total associativity 16; performance evaluated for (SC-2, MC-14) to (SC-14, MC-2)
    Schemes are referred to by SC associativity: SC-4 =⇒ (SC-4, MC-12)

3 Simulation methodology
    Benchmark suite: SPEC 2000, all 26 benchmarks with ref inputs
    Instructions: fast-forward 2 billion, run 3 billion
    Methodology: trace-driven for MPKI, SimpleScalar for system performance
    Processor configuration: similar to recent studies
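For orientation, the number of sets implied by each configuration follows directly from size, associativity and line size. The helper name `num_sets` is hypothetical; the parameters are the ones listed on the slide.

```python
KB = 1024
MB = 1024 * KB


def num_sets(cache_bytes, ways, line_bytes):
    """Number of sets in a set-associative cache."""
    return cache_bytes // (ways * line_bytes)


# L2 configurations from the slide: 16-way, 128B lines.
for label, size in [("512KB", 512 * KB), ("1MB", 1 * MB), ("2MB", 2 * MB), ("4MB", 4 * MB)]:
    print(label, num_sets(size, ways=16, line_bytes=128))
# 512KB 256
# 1MB 512
# 2MB 1024
# 4MB 2048

# L1-I and L1-D: 16KB, 2-way, 32B lines.
print("L1", num_sets(16 * KB, ways=2, line_bytes=32))  # L1 256
```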

Trade-off between SC and MC

[Figure: average MPKI over the SPEC2000 suite]

SC associativity vs MC associativity

    Marginal gains when MC size is less than half the cache

    Peak gains: SC-6 to SC-4

    Quarter to half of the cache should be used for lookahead

    Size overhead 48B-92B, less than one cache block (128B)

Bridging the performance gap

[Figure: average MPKI over the SPEC2000 suite]

Bridging the LRU-OPT gap

    SC-4 bridges 32-52% of the gap

    SC moves closer to OPT as cache size increases
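The slides do not spell out how the bridged fraction is computed. A natural reading, stated here as an assumption, is the share of the MPKI difference between LRU and OPT that the shepherd cache recovers, with MPKI being misses per thousand instructions:

```latex
\[
\text{MPKI} = \frac{\text{misses}}{\text{instructions}} \times 1000,
\qquad
\text{gap bridged} =
\frac{\text{MPKI}_{\text{LRU}} - \text{MPKI}_{\text{SC}}}
     {\text{MPKI}_{\text{LRU}} - \text{MPKI}_{\text{OPT}}}
\]
```

Under this reading a value of 0 means no improvement over LRU and a value of 1 means the scheme matches OPT.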

Performance Evaluation Comparison with related work

Comparison with related work

Scheme                           Parameters
DIP [Qureshi-ISCA07]             16/32/64 dedicated sets, epsilon 1/32
V-way cache [Qureshi-ISCA05]     32 tags per set, tdr 2, global reuse distance replacement
Victim Buffer [Jouppi-ISCA90]    MC+VB, VB same assoc as SC
fa cache                         fully associative with LRU
OPT-16                           16-way associative, OPT replacement

Performance is compared with base LRU, DIP and victim buffer with one additional way per set, to compensate for the size overhead

Comparison with related schemes

[Figure: MPKI for a 512KB 16-way cache; selected benchmarks shown, all 26 included in the average]

On average, both SC-4 and SC-8 outperform LRU, DIP, V-way, fa and victim buffer by 4-10%

For most benchmarks SC-4 performs better than SC-8

Different SC associativities are best for different benchmarks

Performance Evaluation Impact on system performance

Speedup in CPI

[Figure: speedup in CPI; panels for 512KB and 2MB]

Average speedup of about 7% with SC-4 and 5% with SC-8


Conclusions

Conclusions and Future Work

Conclusion

    Using part of the cache for lookahead and emulating OPT in the remaining cache is effective at improving L2 performance

    A quarter to half of the cache space should be dedicated to SC

    Performs better than related proposals and bridges 32-52% of the LRU-OPT gap

Future Work

    Different SC associativities work best with different benchmarks; can the SC associativity be adapted at runtime?

    Study the benefits of dissociating SC and MC

    Tune the organization for a shared CMP cache