Distributed Transactional Memory

Presented by Gala Yadgar

2

Model

  • A network of nodes
  • Transactions are immobile
  • Objects move from node to node

3

Model

  • Cache coherence protocol
      • Locate the current copy
      • Move and invalidate
  • Metric
      • Location aware

[Figure: network of nodes A, B, C and D]

4

Outline

  • Ballistic protocol
      • Hierarchical clustering
      • Operations
  • Requirements
      • Finite response time
  • Performance
      • Publish
      • Move
  • Additional results
  • Summary

5

Motivation

  • The contention manager guarantees atomicity
      • Should be obstruction free
  • Performance goals
      • Makespan
      • Competitive ratio = makespan / makespan of optimal

6

Transactional memory proxy

  • Local request:
      • Local object – return copy
      • Remote object – locate with Ballistic
  • Remote request:
      • Object not in use – invalidate copies and send
      • Object in use – abort or postpone response
  • Commit:
      • No invalidations – commit
      • Invalidations – abort
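
The three cases above can be sketched as a small class. This is only an illustration under assumptions: the names `Proxy`, `locate` and `abort_local` are hypothetical, `locate` stands in for the Ballistic protocol, and the `abort_local` flag stands in for whatever the contention manager decides.

```python
class Proxy:
    """Per-node transactional memory proxy (illustrative sketch only)."""

    def __init__(self, locate):
        self.locate = locate      # hypothetical stand-in for the Ballistic protocol
        self.cache = {}           # object id -> locally cached copy
        self.in_use = set()       # objects opened by the running local transaction
        self.invalidated = False  # set when an in-use object was taken away

    def local_request(self, oid):
        """A local transaction opens an object."""
        if oid not in self.cache:              # remote object: locate it
            self.cache[oid] = self.locate(oid)
        self.in_use.add(oid)
        return self.cache[oid]                 # local object: return the copy

    def remote_request(self, oid, abort_local):
        """Another node asks for the object; returns it, or None to postpone."""
        if oid in self.in_use:
            if not abort_local:                # decision: postpone the response
                return None
            self.invalidated = True            # decision: abort the local transaction
            self.in_use.discard(oid)
        return self.cache.pop(oid, None)       # invalidate the local copy and send it

    def commit(self):
        """Commit succeeds only if no in-use object was invalidated."""
        ok = not self.invalidated
        self.in_use.clear()
        self.invalidated = False
        return ok
```

For example, a proxy that has opened object "x" answers a remote request for "x" with `None` (postpone) unless it is told to abort, in which case its next `commit()` returns `False`.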

8

Hierarchical clustering

[Figure: hierarchy with levels 0, 1, 2 and L = 3]

9

Hierarchical clustering

[Figure: hierarchy with levels 0, 1, 2 and L = 3]

Level 0
  • Physical nodes are leaf nodes
  • x and y are connected iff d(x,y) < 2^1
  • Leader_0 is a maximal independent set

10

Hierarchical clustering

[Figure: hierarchy with levels 0, 1, 2 and L = 3]

Level l
  • Only nodes from leader_(l-1)
  • x and y are connected iff d(x,y) < 2^(l+1)
  • Leader_l is a maximal independent set

Level L
  • Root
  • L ≤ log2(Diam) + 1
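
A minimal sketch of the level-by-level construction, assuming nodes are points in the plane and distances are scaled so that the closest pair is about 1 apart. The greedy selection is just one simple way to obtain a maximal independent set; the function names are hypothetical and not taken from the slides.

```python
import math

def maximal_independent_set(nodes, radius):
    """Greedy maximal independent set of the graph in which
    x and y are connected iff d(x, y) < radius."""
    leaders = []
    for x in nodes:
        # keep x only if it is not connected to any leader chosen so far
        if all(math.dist(x, y) >= radius for y in leaders):
            leaders.append(x)
    return leaders

def build_hierarchy(points):
    """Return levels[0..L]: levels[0] holds all physical nodes,
    levels[l + 1] is leader_l, and levels[L] is the single root."""
    levels = [list(points)]
    while len(levels[-1]) > 1:
        l = len(levels) - 1
        levels.append(maximal_independent_set(levels[l], 2 ** (l + 1)))
    return levels

points = [(0, 0), (1, 0), (2, 0), (3, 0), (8, 0)]   # example input, Diam = 8
levels = build_hierarchy(points)
L = len(levels) - 1
```

On this input the levels shrink from five nodes to three, then two, and the root appears at L = 4, which matches the bound log2(8) + 1 = 4.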

11

Hierarchical clustering

[Figure: hierarchy with levels 0, 1, 2 and L = 3, with node x marked]

Level l, node x
  • Lookup parent set: level-(l+1) nodes within distance 10 * 2^(l+1) from x
  • Home parent: closest lookup parent
  • Move parent set: level-(l+1) nodes within distance 4 * 2^(l+1) from x
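
The three definitions translate directly into code. The sketch below assumes planar points and reads "within distance" as "less than or equal to"; the function names and the sample coordinates are hypothetical.

```python
import math

def lookup_parents(x, l, upper):
    """Level-(l+1) nodes within distance 10 * 2**(l+1) of the level-l node x."""
    return [y for y in upper if math.dist(x, y) <= 10 * 2 ** (l + 1)]

def move_parents(x, l, upper):
    """Level-(l+1) nodes within distance 4 * 2**(l+1) of the level-l node x."""
    return [y for y in upper if math.dist(x, y) <= 4 * 2 ** (l + 1)]

def home_parent(x, l, upper):
    """Closest lookup parent of x (raises ValueError if x has no lookup parent)."""
    return min(lookup_parents(x, l, upper), key=lambda y: math.dist(x, y))

x = (3, 0)                                    # a level-0 node
upper = [(0, 0), (2, 0), (12, 0), (30, 0)]    # made-up level-1 nodes
lookups = lookup_parents(x, 0, upper)         # radius 20
moves = move_parents(x, 0, upper)             # radius 8
home = home_parent(x, 0, upper)
```

Here the node at (12, 0) is a lookup parent but not a move parent, and (2, 0) is the home parent.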

13

Publish()

[Figure: hierarchy with levels 0, 1, 2 and L = 3]

  • Object at node p
  • Create a single directed path from root to p
      • home_i(p).link = home_(i-1)(p)

* We deal with a single object
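
A toy illustration of publish, assuming a binary hierarchy over leaves 0 to 7 in which the level-i home directory of leaf p is simply `(i, p >> i)`. That `home` function is a hypothetical stand-in for the real clustering, chosen only to keep the example short.

```python
L = 3          # root level, as in the figure
link = {}      # directory node -> child it points to (missing key = null link)

def home(i, p):
    """Stand-in for home_i(p): the level-i directory above leaf p."""
    return (i, p >> i)

def publish(p):
    """Create a single directed path from the root down to leaf p."""
    for i in range(1, L + 1):
        link[home(i, p)] = home(i - 1, p)   # home_i(p).link = home_(i-1)(p)

def follow_from_root():
    """Walk the links from the root to the leaf that holds the object."""
    node = home(L, 0)                       # all leaves share the same root
    path = [node]
    while node[0] > 0:
        node = link[node]
        path.append(node)
    return path

publish(5)
path = follow_from_root()
```

After `publish(5)`, following the links from the root visits one directory per level and ends at leaf 5.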

14

lookup()

[Figure: hierarchy with levels 0, 1, 2 and L = 3]

  • Request at node q
  • Up phase
      • home_(i-1)(q) initiates a search for a non-null link at lookupProbe_i(q)
  • Down phase
      • Follow links to a leaf
      • Obtain copy or wait with leaf

15

move()

[Figure: hierarchy with levels 0, 1, 2 and L = 3]

  • Request at node q
  • Up phase
      • home_(i-1)(q) initiates a search for a non-null link at moveProbe_i(q)
      • home_i(q).link = home_(i-1)(q)
      • Redirect if found
  • Down phase
      • Follow links to a leaf
      • Erase links
      • Wait in queue
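
The two phases of move can be sketched on a toy binary hierarchy. The assumptions here are strong and deliberate: `home(i, p) = (i, p >> i)` is a hypothetical stand-in for the clustering, the probe set at each level is reduced to the home directory alone, and requests run one at a time (sequential execution).

```python
L = 3
link = {}      # directory node -> child it points to (missing key = null link)
queue = []     # (predecessor leaf, requesting leaf) pairs, in arrival order

def home(i, p):
    """Stand-in for home_i(p) in a binary hierarchy over leaves 0..7."""
    return (i, p >> i)

def publish(p):
    for i in range(1, L + 1):
        link[home(i, p)] = home(i - 1, p)

def move(q):
    """Run the up phase and the down phase; return the leaf q is queued behind."""
    # Up phase: climb q's home directories, leaving a trail of links toward q.
    for i in range(1, L + 1):
        node = home(i, q)
        found = link.get(node)           # probe (simplified to the home node only)
        link[node] = home(i - 1, q)      # home_i(q).link = home_(i-1)(q)
        if found is not None:            # non-null link found: it is now redirected
            break
    else:
        raise RuntimeError("object was never published")
    # Down phase: follow the old links to a leaf, erasing them on the way.
    node = found
    while node[0] > 0:
        node = link.pop(node)
    queue.append((node[1], q))           # q waits in the queue behind that leaf
    return node[1]

publish(5)
order = [move(4), move(0), move(5)]      # predecessor found by each request
```

Each request ends up queued behind the previous holder, and after the last move the only remaining links form a single path from the root to leaf 5 again. A lookup would go through the same two phases without changing any link.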

17

Overtaking

[Figure: hierarchy with levels 0, 1, 2 and L = 3, showing requests a and b]

  • Object at node p
  • a – 1st request
  • b – 2nd request
  • b enqueued first.

19

Finite write response time

Every move request is satisfied within time n * TE + n * TO from when it is generated

  • TE – maximum enqueue delay
  • TO – maximum time to reach a successor
  • n – number of nodes

Equivalent – finite read response time

[Figure: requests a and b]

20

Proof

By time at most t + n * TE, either

  1. All successor links between r and its n predecessors r1, r2, …, rn have been established, or
  2. There is k ≤ n-1 such that rk is p, the publish request.

In case 1:

  • At least two requests ri, rj come from the same node
  • They are different (Lemma 1)
  • One was satisfied
  • The object reached a predecessor

21

Proof

Let x be the location of the object at time t + n * TE

r is at most n steps away from x by taking the successor links

r will have the object by time at most t + n * TE + n * TO
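
The two stages of the argument add up as follows. This only restates the bound from the slides; the labels t_located and t_done are introduced here for readability and are not part of the original notation.

```latex
\begin{aligned}
t_{\text{located}} &\le t + n\,T_E
  && \text{(the successor links between the object's location } x \text{ and } r \text{ exist)}\\
t_{\text{done}} &\le t_{\text{located}} + n\,T_O
  && \text{(at most } n \text{ successor links, each taking at most } T_O\text{)}\\
t_{\text{done}} - t &\le n\,T_E + n\,T_O = n\,(T_E + T_O)
\end{aligned}
```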

22

Bounded overtaking (corollary)

(Every move request is satisfied within time n * TE + n * TO from when it is generated)

  • Request r is generated at time t
  • All requests generated after time t + n * TE will be ordered after r
  • All requests generated prior to time t - n * TE will be ordered before r

23

Lemma 1

There is no finite set of requests R = {r1, r2, …, rf} whose successor links form a cycle

  • r’s arrow: a downward link added by r’s visit
  • Outside arrows: established by requests outside R

[Figure: nodes P and C]

24

Invariants

1. The root always has an arrow

2. Requests see an arrow at the peak level before the down phase

3. During the down phase, requests see an arrow until they reach a leaf

4. r’s arrow at level i points to C = home_(i-1)(r)

25

Invariants

5. r adds an arrow P→C at time t
  • At time t−, r added an arrow at C, pointing to a grandchild of P
  • At time t+, that arrow will be erased by r’
      • r’ reached C from P
      • r’ erased r’s arrow P→C
  • During [t−, t+], C always has an arrow
      • May be redirected from one grandchild to another

[Figure: nodes P and C]

26

Proof

H: the highest peak level reached by requests in R

The first request to reach H sees an outside arrow

We show:
  • At any level l < H, some request from R sees an outside arrow
  • That request is queued behind an outside request

27

Proof (by induction)

  • Base: at level H
  • Step:
      • At time t, r in R sees an outside arrow at level k in node P.
      • The arrow P→C was established by x not in R; C is x’s home directory
      • x also established the arrow at C, at level k-1, at time t−
      • At time t+, r reaches C. Either
          1. Another request from R sees the arrow at C during [t−, t+], or
          2. r sees the arrow at C at time t+

[Figure: nodes P and C]

29

Overtaking revisited

[Figure: hierarchy with levels 0, 1, 2 and L = 3, showing requests a and b]

  • Object at node p
  • a – 1st request
  • b – 2nd request
  • b enqueued first.

What if a’s priority is higher?

30

Intuitively…

  • Optimal schedule
      • Minimum cost Hamiltonian path
      • Visit each node once
  • Greedy schedule worst case
      • Tx aborted by all higher priority Txs
      • Each abort requires a move()
      • Node with timestamp k visited k times
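
Counting visits only, and taking the worst case above literally for n transactions with timestamps 1 to n (an assumption made here for illustration), the gap between the two schedules is:

```latex
\underbrace{\sum_{k=1}^{n} 1 = n}_{\text{optimal: each node visited once}}
\qquad\text{versus}\qquad
\underbrace{\sum_{k=1}^{n} k = \frac{n(n+1)}{2}}_{\text{greedy worst case: node with timestamp } k \text{ visited } k \text{ times}}
```

The number of object moves can therefore grow quadratically instead of linearly in n; the actual cost also depends on the distances travelled.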

31

Performance

  • Work: an operation’s communication overhead
  • Distance: the cost of communicating directly from the requesting node to its destination
  • Stretch: work/distance
  • Executions can be sequential or concurrent

32

Performance

  • Publish cost: the publish operation has work O(Diam)
  • Move cost: if an object has moved a combined distance d since its initial publication, the amortized move stretch is O(min{log2(d), L})

34

Performance

  • Publish cost: the publish operation has work O(Diam)

35

1. Bounded link property

The metric distance between a level-l child and its level-(l+1) parent is less than or equal to c_b * 2^l, for some constant c_b.

  • x and y are connected iff d(x,y) < 2^(l+1)

36

2. Constant expansion property

Any node has no more than a constant number (c_e) of lookup parents and lookup children

* This property requires a constant doubling metric

[Figure: hierarchy with levels 0, 1, 2 and L = 3]

37

Constant doubling dimension

  • Metric: distances between all pairs, non-negative, triangle inequality
  • Ball: B_u(r) = { v | d(u,v) ≤ r }
  • 2^α balls of radius r/2 cover a ball of radius r
  • Doubling dimension: α is constant

Based on “Ad Hoc Sensor Networks” by Roger Wattenhofer

38

3. Lookup property

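For any two leaves p, q
  • p_l: any of p’s level-l ancestors reached by following move parents only
  • If p_l is not in lookupProbe_l(q), then d(p,q) ≥ c_l * 2^l, for some constant c_l (the subscript of c_l stands for "lookup", not for the level)

[Figure: leaves p and q at level 0, ancestors p_l below level L, and the distance c_l * 2^l]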

39

4. Move property

For any two leaves p, q
  • p_l: p’s level-l home directory
  • If p_l is not in moveProbe_l(q), then d(p,q) ≥ c_m * 2^l, for some constant c_m

[Figure: leaves p and q at level 0, p_l below level L, and the distance c_m * 2^l]

40

Lemma 3

There exists a constant c_w such that for any operation that peaks at level l, the work it performs is at most c_w * 2^l

Proof
  • Bounded link: link cost c_b * 2^l
  • Constant expansion: bounded number of steps in each level
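
One way to see why these two properties give a bound proportional to 2^l is a geometric sum. The constants s (steps per level, up and down phases combined) and c (cost factor of one step at level i) are placeholders introduced for this sketch; they are not values from the slides.

```latex
\text{work} \;\le\; \sum_{i=0}^{l} s \cdot c\,2^{i}
\;=\; s\,c\,\bigl(2^{\,l+1} - 1\bigr)
\;<\; 2\,s\,c \cdot 2^{\,l}
\qquad\Longrightarrow\qquad c_w = 2\,s\,c \ \text{suffices.}
```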

41

Publish performance

The publish operation has work O(Diam)

  • Publish operations peak at level L
  • The work is ≤ c_w * 2^L (Lemma 3)
  • L ≤ log2(Diam) + 1
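
Chaining the two bounds gives the stated cost. The only step added here is the substitution of the bound on L into the exponent:

```latex
\text{work(publish)} \;\le\; c_w \cdot 2^{L}
\;\le\; c_w \cdot 2^{\log_2 \mathrm{Diam} + 1}
\;=\; 2\,c_w \cdot \mathrm{Diam}
\;=\; O(\mathrm{Diam})
```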

43

Performance

  • Move cost: if an object has moved a combined distance d since its initial publication, the amortized move stretch is O(min{log2(d), L})

44

Lemma 4

  • q: move request in a sequential execution
      • Discovers a non-null link at node P at level l
  • p: move/publish request that was last to visit P
  • d(p,q) ≥ c_m * 2^(l-1)

[Figure: leaves p and q at level 0, node P below level L, and the distance c_m * 2^(l-1)]

45

Proof

  • The non-null link at P points to home_(l-1)(p)
  • home_(l-1)(p).link is non-null at least until q removes its link (Invariant 5)
      • It was there during q’s up phase
  • q did not visit home_(l-1)(p) going up level l-1
  • Move property: d(p,q) ≥ c_m * 2^(l-1)

[Figure: leaves p and q at level 0, node P pointing to home_(l-1)(p), and the distance c_m * 2^(l-1)]

46

Lemma 5

  • Distance of a sequential execution: sum of distances for all move operations
  • In a sequential execution with distance d, the maximum level reached by any move request does not exceed min(log2(d) + c, L)
      • c is a constant

[Figure: total distance d and level log2(d)]

47

Proof

  • q0 – the initial publish request
  • l – highest level reached by a move request
  • q – the request that peaked at level l (first)
  • l ≤ L
  • q saw a non-null link at level l, established by q0
  • d(q,q0) ≥ c_m * 2^(l-1) (Lemma 4)
  • d ≥ d(q,q0)
  • l ≤ log2(d/c_m) + 1

[Figure: q0 and q at distance d, peak level l]
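
The last step of the proof is a rearrangement of the Lemma 4 bound. Written out, it also shows where the constant c of Lemma 5 can come from:

```latex
d \;\ge\; d(q, q_0) \;\ge\; c_m \cdot 2^{\,l-1}
\;\Longrightarrow\; 2^{\,l-1} \le \frac{d}{c_m}
\;\Longrightarrow\; l \;\le\; \log_2\!\frac{d}{c_m} + 1
\;=\; \log_2 d + \underbrace{\bigl(1 - \log_2 c_m\bigr)}_{c}
```

Combined with l ≤ L, this gives l ≤ min(log2(d) + c, L).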

48

Move performance

Lemma 6
  • For any sequential execution α: work(α) ≤ (c_w/c_m) * l(α) * distance(α)
      • l(α) – the maximum level reached by a move request of execution α

By Lemma 5
  • distance(α) = d
  • l(α) ≤ min(log2(d) + c, L)

The amortized move stretch is O(min{log2(d), L})

No proof here…
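
Taking Lemma 6 as given (its proof is omitted on the slides), the stretch bound follows by dividing by the distance and applying Lemma 5:

```latex
\text{stretch}(\alpha) \;=\; \frac{\text{work}(\alpha)}{\text{distance}(\alpha)}
\;\le\; \frac{c_w}{c_m}\, l(\alpha)
\;\le\; \frac{c_w}{c_m}\, \min(\log_2 d + c,\; L)
\;=\; O\bigl(\min\{\log_2 d,\; L\}\bigr)
```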

50

Additional results

  • Move cost: amortized move stretch is O(min{log2(d), L}) for concurrent executions as well
  • Idea
      • “Lock” critical section in each level
      • Prevent neighbors from “stealing” links

51

Additional results

  • Lookup cost: the stretch for a lookup operation is constant
  • Idea
      • An operation peaks at level h
      • The work is at most c_w * 2^h (Lemma 3)
      • d(p,q) ≥ c_l * 2^h (lookup property)
      • Redefine distance with overlapping move requests
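
Using the two bounds exactly as stated above, the factor 2^h cancels, which is why the lookup stretch does not depend on the peak level:

```latex
\text{stretch} \;=\; \frac{\text{work}}{d(p,q)}
\;\le\; \frac{c_w \cdot 2^{h}}{c_l \cdot 2^{h}}
\;=\; \frac{c_w}{c_l}
```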

52

Additional results

  • Multiple objects
      • Require multiple directories
      • Storage and request handling load on each node is O(log Diam).
  • Idea
      • Hash each node to multiple parallel directory nodes

54

Summary

  • Distributed transactional memory
  • Cache coherence protocol
  • Hierarchical clustering: L ≤ log2(Diam) + 1
  • Results
      • Finite write response time: n * TE + n * TO
      • Publish cost: O(Diam)
      • Move cost: O(min{log2(d), L})
      • Constant lookup cost
      • Multiple objects with O(log Diam) load

55

References

Maurice Herlihy, Ye Sun. Distributed transactional memory for metric-space networks. Distributed Computing 20(3): 195-208 (2007).

Ye Sun. The Ballistic Protocol: Location-Aware Distributed Cache Coherence in Metric-Space Networks. Doctoral Thesis, Brown University. 2006.

Bo Zhang, Binoy Ravindran. Location-Aware Cache-Coherence Protocols for Distributed Transactional Contention Management in Metric-Space Networks. 2009.