Paging for Multi-Core Shared Caches Alejandro López-Ortiz , Alejandro Salinger ITCS, January 8 th , 2012


Paging for Multi-Core Shared Caches

Alejandro López-Ortiz, Alejandro Salinger

ITCS, January 8th, 2012


Multi-Core challenges

• Access to data is a key factor
• Cache efficiency is a determining factor for:
– Algorithms
– Schedulers
– Paging strategies
• Extensively studied for the sequential case
• Almost no previous theory for the multi-core case


Sequential Paging

[Figure: slow memory connected to a cache of size K]

Page requests: … p6 p3 p2 p4 p4 p2 p10 p11 p5 p4 …

Is p_i in the cache?
– Yes: do nothing (hit)
– No: fetch p_i from slow memory and evict one page from the cache (fault)

Goal: minimize the number of faults
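The hit/fault rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming LRU as the eviction policy and integer page identifiers; the function name `lru_faults` is hypothetical and not taken from the talk.

```python
from collections import OrderedDict

def lru_faults(requests, k):
    """Count the faults of LRU on one request sequence with a cache of size k."""
    cache = OrderedDict()                  # pages ordered from least to most recently used
    faults = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)        # hit: only refresh recency
        else:
            faults += 1                    # fault: fetch the page from slow memory
            if len(cache) >= k:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return faults

# The request sequence shown on the slide, with an assumed cache size of 3.
requests = [6, 3, 2, 4, 4, 2, 10, 11, 5, 4]
print(lru_faults(requests, 3))  # 8
```

Only the two requests that repeat a page still held in the cache (p4 and p2) are hits here; every other request faults.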


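Sequential Paging

Common eviction policies:
– Least-Recently-Used (LRU)
– First-In-First-Out (FIFO)
– Flush-When-Full (FWF)
– Furthest-In-The-Future (FITF) (offline)

• An online algorithm A is c-competitive if, for all request sequences R, A(R) ≤ c · OPT(R) + β, where OPT is an optimal offline algorithm and β is a constant independent of R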


Multi-Core Paging

[Figure: four cores (Core 1 to Core 4) sharing an L2/L3 cache, backed by RAM]


Multi-Core Paging

• p request sequences
• shared cache of size K
• total length n (n ≫ K, p)
• hit = 1 unit of time
• fault = τ units of time

Requests as issued, without delays:

t: 1 2 3 4 5 6 7 8 9 10 11 12
R1: p2 p8 p1 p4 p3 p4 p10 p5 …
R2: p9 p1 p8 p2 p1 p1 p4 p7 …
R3: p3 p18 p17 p8 p2 p3 p2 p9 …

After a fault at t=2 on p1, the remaining requests of R2 are delayed:

R2: p9 p1 _ _ _ p8 p2 p1 p1 p4 p7 …
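The timing model can be sketched as a small simulator. This is a rough illustration under stated assumptions, not the authors' formal definition: the eviction policy is shared LRU, a fault occupies `tau` time units in total (so the next request of that core is issued `tau` steps later), simultaneous requests are served in core order, and a page counts as cached from the moment its fetch starts. The name `shared_lru` is hypothetical.

```python
from collections import OrderedDict

def shared_lru(sequences, k, tau):
    """Serve p sequences on one shared LRU cache of size k.

    Returns (faults per sequence, timeline per sequence); a "_" in a
    timeline marks a time unit spent waiting for a fetch.
    """
    cache = OrderedDict()              # least recently used page first
    p = len(sequences)
    pos = [0] * p                      # index of the next request of each core
    ready = [1] * p                    # earliest time that request can be issued
    faults = [0] * p
    rows = [[] for _ in range(p)]
    t = 0
    while any(pos[i] < len(sequences[i]) for i in range(p)):
        t += 1
        for i in range(p):             # assumed tie-break: lower core index first
            if pos[i] >= len(sequences[i]):
                continue               # this core has finished
            if ready[i] > t:
                rows[i].append("_")    # still fetching after a fault
                continue
            page = sequences[i][pos[i]]
            pos[i] += 1
            rows[i].append(page)
            if page in cache:
                cache.move_to_end(page)
                ready[i] = t + 1       # hit: 1 time unit
            else:
                faults[i] += 1
                if len(cache) >= k:
                    cache.popitem(last=False)   # evict least recently used
                cache[page] = True
                ready[i] = t + tau     # fault: tau time units in total
    return faults, rows

faults, rows = shared_lru([["a", "b", "a"], ["c", "a"]], k=2, tau=3)
print(faults)                          # [2, 2]
for row in rows:
    print(" ".join(row))               # a _ _ b _ _ a   and   c _ _ a
```

The second core faults on `a` even though the first core had loaded it earlier: the fault on `b` evicted it in the same time step, which shows how the relative alignment of the sequences determines the faults.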


Related Models

• Multiple applications or threads
• Multi-Core model [Hassidim, ICS'10]
– Makespan
– LRU is not competitive
– Scheduling
• Our model:
– No scheduling of requests
– Separates scheduling and paging
– Minimize faults


Natural Strategies

• Share the cache
– Eviction policy
• Partition the cache among cores
– Partition function (static, dynamic)
– Eviction policy
• Examples:
– Shared-LRU
– Optimal Static Partition with LRU
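As an illustration of the partition strategy, the sketch below finds a best static partition by brute force. It assumes that each core only ever uses its own part of the cache, so the faults of a sequence depend only on the size of its part and not on timing, and that the sequences request disjoint pages. The helper names are hypothetical, and the enumeration is exponential in the number of cores, so it is only meant for tiny examples.

```python
from collections import OrderedDict
from itertools import product

def lru_faults(requests, k):
    """Faults of LRU on one sequence with a private cache of size k."""
    cache, faults = OrderedDict(), 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)
        else:
            faults += 1
            if len(cache) >= k:
                cache.popitem(last=False)  # evict least recently used
            cache[page] = True
    return faults

def best_static_partition(sequences, K):
    """Try every split of K cells into parts of size >= 1, one part per core."""
    best = None
    for sizes in product(range(1, K + 1), repeat=len(sequences)):
        if sum(sizes) != K:
            continue                       # not a partition of the whole cache
        total = sum(lru_faults(s, k) for s, k in zip(sequences, sizes))
        if best is None or total < best[0]:
            best = (total, sizes)
    return best                            # (total faults, part sizes)

sequences = [[1, 2, 3, 1, 2, 3, 1, 2, 3], [7, 7, 7, 7]]
print(best_static_partition(sequences, 4))  # (4, (3, 1))
```

Giving three cells to the first core lets it keep its whole working set, while the second core needs only one cell for its single page.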


Partition vs. Shared

Opt Static Partition / Shared LRU = Ω(n)

Shared LRU / Opt Static Partition ≤ K

For any online dynamic partition that changes o(n) times:

Partition / Shared LRU = ω(1)

Partitions that don't change often enough are not competitive


Shared strategies

Theorem: Competitive ratio of (Shared) LRU = [expression missing], when the offline algorithm has a cache of size h ≈ K/2

The same applies to FIFO, CLOCK, FWF


Proof idea

[Figure: two groups of pages; only the labels "pages" survive]

Faults LRU ≥ n/2

Faults Offline ≤ Initial + αK per coloured phase = [expression missing]

Competitive Ratio LRU = [expression missing]

Obs: Furthest-In-The-Future is not optimal


The Offline Problem


PARTIAL-INDIVIDUAL-FAULTS (PIF):

Given a set of request sequences R, a time t and a vector of bounds b, can R be served such that at time t the number of faults on each R_i is at most b_i?

t: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
R1: p1 _ _ p2 p8 p1 p4 _ _ p10 p5 p1 p4 p2 p9 p9 p5 p2 p3 p7
R2: p2 p9 p1 _ _ p4 p8 _ _ p1 p4 p7 p2 _ _ p3 p4 _ _ p1
R3: p3 p4 _ _ p8 p2 p3 p2 p9 p5 p1 p4 p2 p9 p9 p1 _ _ p4 p2
R4: p2 _ _ p3 p8 p1 p1 p3 p9 _ _ p10 p5 p1 p8 _ _ p1 p4 p2

E.g. at t=18, is (f1, f2, f3, f4) ≤ (2, 3, 3, 4)?


PARTIAL-INDIVIDUAL-FAULTS (PIF):

Theorem: PIF is NP-complete

• Optimization version (MAX-PIF): given an instance of PIF, maximize the number of sequences whose faults stay within their given bound

Theorem: MAX-PIF is APX-hard

• Unless P=NP, there is no PTAS for MAX-PIF


PIF vs. Min Faults

• Partial-Individual-Faults remains NP-hard even when [condition missing]
• If [condition missing], minimizing faults can be solved by FITF
• Achieving a fair fault distribution is harder than minimizing the total number of faults


The Offline Problem

• An offline algorithm can align sequences properly by means of faults
• An algorithm could "force faults" for this purpose

Initial cache: p1 p2 p3 p5 p8 p9

Requests as issued:
R1: p1 p5 p4 p5 p1 p4 p6 p9 …
R2: p2 p3 p3 p2 p8 p8 p3 p10 p7 …

• Regular execution (the fault on p4 replaces p9):
R1: p1 p5 p4 _ _ _ p5 p1 p4 p6 p9 …
R2: p2 p3 p3 p2 p8 p8 p3 p10 p7 …
Cache afterwards: p1 p2 p3 p5 p8 p4

• Forcing a fault on p1 (p1 is in the cache but is served as a fault; the later fault on p4 replaces p2):
R1: p1 _ _ _ p5 p4 _ _ _ p5 p1 p4 p6 p9 …
R2: p2 p3 p3 p2 p8 p8 p3 p10 p7 …
Cache afterwards: p1 p4 p3 p5 p8 p9


The Offline Problem

• However, forcing faults gives no advantage over an honest offline algorithm

Theorem: Let A be an offline algorithm that forces faults. There exists an honest offline algorithm A' such that, for all disjoint R, A(R) = A'(R)


The Offline Problem

• For minimizing faults:

Theorem: There exists an optimal offline algorithm that, upon each fault, evicts a page whose next request time is maximal in R_j, for some j = 1..p

• Yields an [bound missing] time algorithm
• Can be improved to [bound missing] using dynamic programming (recall n ≫ p)
• This algorithm extends to Partial-Individual-Faults


Conclusions

• Multi-core paging is significantly different from sequential paging
• Traditional paging strategies are not competitive
• Serving a set of requests while limiting faults in each sequence is hard
• Multi-core paging is in P when the number of cores is constant


Open Problems

• What are good online strategies?
• What are good measures of performance?
– Fairness?
• What is the complexity of minimizing the number of faults?
• Can we obtain more efficient offline algorithms (exact or approximate)?

Thank you