19
Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, S. Dwarkadas, A. Cox, and W. Zwaenepoel The Winter Usenix Conference 1994 Presented by 2008-20864 Honggyu Kim 1

Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Embed Size (px)

DESCRIPTION

Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems. P. Keleher , S. Dwarkadas , A. Cox, and W. Zwaenepoel The Winter Usenix Conference 1994 Presented by 2008-20864 Honggyu Kim. Treadmarks. Motivation No widely available DSM implementations - PowerPoint PPT Presentation

Citation preview

Page 1: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

1

Treadmarks: Distributed Shared Memory on Standard Workstations

and Operating Systems

P. Keleher, S. Dwarkadas, A. Cox, and W. Zwaenepoel The Winter Usenix Conference

1994

Presented by2008-20864

Honggyu Kim

Page 2: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

2

Treadmarks

• Motivation• No widely available DSM implementations• Performance problems

• Objectives• Determine efficiency of user-level DSM implementa-

tion• Reduce communication overhead

• Commercially available workstations and OS• Standard Unix system on DECstation

Page 3: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

3

Distributed Shared Memory

• Software-based distributed shared memory (DSM)• Provide an illusion of shared memory on a cluster

• A private address space in each node, but a single globally shared address space on the cluster

• Keep the main memories of the nodes coherent• Embedding a coherence protocol in the page fault

handlers

• High overheads of,• The protocol processing

• Implemented in software

• The large granularity (page) of coherence and com-munication• false sharing

Page 4: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

4

TreadMarks Motivation (1)

• Sequential Consistency• Every write visible “immediately”• Problems

• number of messages• latency

Page 5: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

5

TreadMarks Motivation (2)

• False sharing• Pieces of the same page updated by different pro-

cessors• Leads to “ping-pong” effect, increasing network

traffic enormously

Page 6: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

6

inv. p

Release Consistency (1)

• Eager Release Consistency (ERC) (Munin)• Write access information is de-

livered to all the shared copies at the release point.

• Release blocks until acknowl-edgments have been received from all others.

P2P1P0

r(x) r(y)acq(L)w(x)rel(L)

r(y)

r(x)x y

x yacq(L)

w(x)rel(L)

acq(L)

r(y)

rel(L)x y

x yVariables x and y are in the same page p

inv. p inv. p

grant L

grant L

inv. p

Page 7: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

7

Release Consistency (2)

• Lazy Release Consistency (LRC)• Write access information is de-

livered only to the next acquir-ing copy at the next acquire point.

• Fewer messages than ERC• TreadMarks uses LRC model

P2P1P0

r(x) r(y)acq(L)w(x)rel(L)

r(y)r(x)

x y

acq(L)

w(x)

rel(L)acq(L)

r(y)

rel(L)x y

x yVariables x and y are in the same page p

grant L+ inv. p

grant L + inv. p

Page 8: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

8

ERC vs. LRC

inv. p

P2P1P0

r(x) r(y)acq(L)w(x)rel(L)

r(y)

r(x)x y

x yacq(L)

w(x)rel(L)

acq(L)

r(y)

rel(L)x y

inv. p inv. p

grant L

grant L

inv. p

P2P1P0

r(x) r(y)acq(L)w(x)rel(L)

r(y)r(x)

x y

acq(L)

w(x)

rel(L)acq(L)

r(y)

rel(L)x y

grant L+ inv. p

grant L + inv. p

ERC LRC

Page 9: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

9

Multiple Writer Protocol

• Mitigates the effect of false sharing by making each processor’s local copy of a shared page

• Their modifications are merged at the next syn-chronization operation• at release points.

P1P0

r(x) r(y)

w(y)

acq(L0)

w(x)

rel(L0)

x y

x yVariables x and y are in the same page p

acq(L1)

rel(L1)

r(y) r(x)

x y x y

x y x y

x yx y

x y

Page 10: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

10

Twins and Diffs

P1P0

r(x) r(y)

w(y)

acq(L0)

w(x)

rel(L0)

x yVariables x and y are in the same page p

acq(L1)

rel(L1)

r(y)

r(x)

x y x y

x y x y

x y

x y

x y x y

x ydiff diff

y

x

x y y+ =

+= x y x

twin twin

Page 11: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

11

Lazy Diff Creation

P1P0

r(y)

acq(L)

w(x)

rel(L)

Variables x and y are in the same page p

acq(L)

make a twin

create diff

apply diff

grant L + inv. p

diff

Page 12: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

12

MWP vs. LDC

• Multiple Writer Protocol• A diff is created for each modified page and propa-

gated to all other copies of the page at each re-lease

• Lazy Diff Creation• allows diff creation to be postponed until the modi-

fications are requested.• TreadMarks uses Lazy Diff Creation model

Page 13: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Performance

• Experimental Environment• 8 DECstation-5000/240• connected to a 100-Mbps ATM LAN and a 10-Mbps

Ethernet

• Applications• Water – molecular dynamics simulation • Jacobi – Successive Over-Relaxation• TSP – branch & bound algorithm to solve the travel-

ing salesman problem• Quicksort – using bubblesort to sort subarray of

less than 1K element• ILINK – genetic linkage analysis

13

Page 14: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Result

14

Page 15: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Execution Time Breakdown (1/2)

15

Page 16: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Execution Time Breakdown (2/2)

16

Page 17: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Lazy vs. Eager Release Consistency(1/2)

17

Page 18: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Lazy vs. Eager Release Consistency (2/2)

18

Page 19: Treadmarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Conclusion

• Efficient user-level implementation

• Suggested novel techniques to reduce communication overhead• Lazy Release Consistency• Lazy Diff Creation

• Viable technique for parallel computation on clusters of workstations connected by suitable networking technology• Network speed has been improved

19