CSCS: A Concise Implementation of User-Level Distributed Shared Memory. Zhi Zhai, Feng Shen. Computer Science and Engineering, University of Notre Dame. Dec. 11, 2009. Final Presentation.


CSCS: A Concise Implementation of User-Level Distributed Shared Memory

Zhi Zhai, Feng Shen

Computer Science and Engineering
University of Notre Dame

Dec. 11, 2009

Final Presentation

DSM Overview

DSM Characteristics:
• Physically: distributed memory
• Logically: a single shared address space

Figure 1 DSM architecture

Related Work

Models and Main Features:
• IVY (Yale) - Divided space: shared & private space
• Mirage (UCLA) - Time interval d: avoid page thrashing
• TreadMarks (Rice) - Lazy release consistency: improve efficiency
• SAM (Stanford)

System Design

Figure 2 Server/Client mode

System Design

• Server

– Holder of metadata only

– Thread-based Connection

– Event-based Service

System Design

Figure 3 Server Process/Threads

System Design

• Client

– Physical memory owner

– UI/Work/Page Fetch Thread

– Fixed-home Protocol

– Not Aware of Peer Clients

System Design

Figure 4 Client process/thread

System Design

Figure 5 Sample Operation

Implementation

• Message Passing: TCP socket

Figure 6 Message Passing
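A minimal sketch of how page messages could be framed over a TCP socket. The slides only state that message passing uses TCP sockets; the header layout, the message type codes, and the function names below are assumptions made for illustration, not the CSCS wire format.

```python
import socket
import struct

# Hypothetical header: 1-byte message type, 4-byte page number,
# 4-byte payload length, all in network byte order.
HEADER = struct.Struct("!BII")
MSG_PAGE_REQUEST = 1   # hypothetical type codes
MSG_PAGE_REPLY = 2


def send_msg(sock, msg_type, page_no, payload=b""):
    """Send one framed message: header followed by the payload."""
    sock.sendall(HEADER.pack(msg_type, page_no, len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes; TCP may deliver one message in several pieces."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Receive one framed message and return (type, page number, payload)."""
    msg_type, page_no, length = HEADER.unpack(recv_exact(sock, HEADER.size))
    return msg_type, page_no, recv_exact(sock, length)
```

The length prefix is there because TCP is a byte stream with no message boundaries, so the receiver needs to know how many bytes belong to each page transfer.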

Implementation

• Server/Client Page Table

– Server holds the most up-to-date metadata

– Server manages the whole virtual memory space

– Server records the ids & addresses of all nodes

– Client owns the most up-to-date local memory segment

– Client caches referenced pages from peer nodes

Client ID   IP Address
0           129.74.155.107 (e.g.)
1           129.74.155.122
…           …

Figure 7 Connection Table

Page #   Frame #   Access Bits            Page Owner
0        57        PROT_READ              1
1        67        PROT_READ|PROT_WRITE   1
2        57        PROT_READ              3
…        …         …                      …

Figure 8 Server Page Table
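The two server-side tables could be held in memory roughly as follows. This is a sketch under assumptions: the names PageEntry, connection_table, server_page_table, and locate_owner are hypothetical, and only rows whose owner also appears in Figure 7 are filled in.

```python
from dataclasses import dataclass

# Numeric values as defined in <sys/mman.h> on Linux.
PROT_READ, PROT_WRITE = 0x1, 0x2


@dataclass
class PageEntry:
    frame: int    # frame number on the owning node
    access: int   # global access bits
    owner: int    # client id of the page owner


# Figure 7: client id -> IP address.
connection_table = {0: "129.74.155.107", 1: "129.74.155.122"}

# Figure 8: page number -> metadata (sample rows 0 and 1 only).
server_page_table = {
    0: PageEntry(57, PROT_READ, 1),
    1: PageEntry(67, PROT_READ | PROT_WRITE, 1),
}


def locate_owner(page_no):
    """Return (owner id, owner address) for a page, as the server might."""
    entry = server_page_table[page_no]
    return entry.owner, connection_table[entry.owner]
```

For example, locate_owner(0) joins the two tables to give the id and address a faulting client needs in order to reach the page owner.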

Implementation

Page #   Frame #   Access Bits            Page Owner   Ref Count
0        30        PROT_READ              1            0
1        31        PROT_READ              1            0
2        32        PROT_READ              1            4
3        60        PROT_READ|PROT_WRITE   1            1
4        200       PROT_READ              5            0
…        …         …                      …            …

Figure 9 Client Page Table

Implementation

• Page fault handler
– Client → Server
  • Check the access right
  • Fetch the page owner id/address
  • Update global access bits
– Client → Client
  • Connect to the page owner
  • Cache the referenced page
  • Update local access bits
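The two exchanges can be simulated in a single process. This is a sketch only: the Server and Client classes and their methods are hypothetical, object references stand in for IP addresses, and the access check is reduced to "the page is known to the server". Downgrading the global bits to read-only on a read fault is likewise an assumption.

```python
PROT_READ, PROT_WRITE = 1, 2


class Server:
    """Holds metadata only: who owns each page and its global access bits."""

    def __init__(self):
        self.page_table = {}   # page_no -> [owner_id, access_bits]
        self.clients = {}      # client_id -> Client (stand-in for an address)

    def handle_fault(self, page_no, wanted):
        owner_id, _ = self.page_table[page_no]   # KeyError if page unknown
        self.page_table[page_no][1] = wanted     # update global access bits
        return self.clients[owner_id]            # owner id/address


class Client:
    def __init__(self, cid, server):
        self.cid, self.server = cid, server
        self.memory = {}   # pages this node owns
        self.cache = {}    # pages cached from peer nodes
        self.access = {}   # local access bits
        server.clients[cid] = self

    def own(self, page_no, data):
        self.memory[page_no] = data
        self.access[page_no] = PROT_READ | PROT_WRITE
        self.server.page_table[page_no] = [self.cid, PROT_READ | PROT_WRITE]

    def read(self, page_no):
        if page_no in self.memory:
            return self.memory[page_no]
        if page_no not in self.cache:                              # simulated fault
            owner = self.server.handle_fault(page_no, PROT_READ)   # client to server
            self.cache[page_no] = owner.memory[page_no]            # client to client
            self.access[page_no] = PROT_READ                       # local access bits
        return self.cache[page_no]
```

Note that the faulting client learns about the owner only through the server, which matches the earlier point that clients are not aware of their peers.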

Implementation

• Page fault handler
– Page fault type
  • Read remote page
  • Write on a page
– Assumption
  • Reading happens more often than writing
  • Writing needs the most up-to-date copy more than reading does

Implementation

Assume reading remote page
→ dsm call: dsm_do_no_page()
→ Truly a remote reading fault?
   YES: continue
   NO: double page fault → dsm call: dsm_do_wrt_page()

Figure 10 Page fault handler workflow
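One possible reading of this workflow, sketched with a plain dictionary in place of real page protections. The names dsm_do_no_page and dsm_do_wrt_page come from the slide; their bodies, on_page_fault, write_page, and the interpretation of a "double page fault" as a second fault on a page that is already mapped read-only are all assumptions.

```python
PROT_READ, PROT_WRITE = 1, 2


def dsm_do_no_page(prot, page_no):
    """Page not present: fetch it and map it read-only (stubbed)."""
    prot[page_no] = PROT_READ


def dsm_do_wrt_page(prot, page_no):
    """Page present but not writable: upgrade it to read/write (stubbed)."""
    prot[page_no] = PROT_READ | PROT_WRITE


def on_page_fault(prot, page_no):
    """Treat a fault as a remote read first; a fault on a present page is a write."""
    if page_no not in prot:
        dsm_do_no_page(prot, page_no)
        return "read"
    dsm_do_wrt_page(prot, page_no)
    return "write"


def write_page(prot, page_no):
    """Simulate a store instruction that is retried until the page is writable."""
    faults = []
    while not prot.get(page_no, 0) & PROT_WRITE:
        faults.append(on_page_fault(prot, page_no))
    return faults
```

Under this reading, a write to a page that is not yet mapped faults twice, first as a read and then as a write, while a write to a cached read-only page faults once.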

Implementation

• Memory Consistency Model
– Assumption Revisit
  • Reading happens more often than writing
  • Writing needs the most up-to-date copy more than reading does
– Multi-Reader/Single Writer
  • Snapshot for reading
  • Every write triggers a page fault
– Locks on pages being referenced
  • Semaphore-like reference counts: if ref_count > 0, Waiting/Re-random
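A sketch of the semaphore-like reference count, assuming that a writer must wait while any reference to the page is outstanding. The class PageRefCount and its methods are hypothetical, and only the "Waiting" branch is modelled; the "Re-random" alternative is left out because the slide does not describe it.

```python
import threading


class PageRefCount:
    """Per-page reference count that blocks a writer while ref_count > 0."""

    def __init__(self):
        self._cond = threading.Condition()
        self.ref_count = 0

    def acquire_ref(self):
        """A peer starts referencing (reading) the page."""
        with self._cond:
            self.ref_count += 1

    def release_ref(self):
        """A peer is done with the page; wake any waiting writer at zero."""
        with self._cond:
            self.ref_count -= 1
            if self.ref_count == 0:
                self._cond.notify_all()

    def write(self, do_write, timeout=None):
        """Run do_write once ref_count is 0; return False if the wait timed out."""
        with self._cond:
            if not self._cond.wait_for(lambda: self.ref_count == 0, timeout):
                return False
            do_write()   # runs with the lock held, so no new reference can start
            return True
```

Holding the condition's lock during the write is a simple way to get the single-writer half of the model; a real implementation would more likely release it and mark the page as busy instead.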

DSM Evaluation

Figure 11 Parallel Computation on ASP Problem

DSM Evaluation

Figure 12 Execution time comparison

DSM Evaluation

Figure 13 Message Transmission Comparison

DSM Evaluation

Figure 14 Network Traffic Comparison

Future Work

• Enhance system robustness

• Evaluate scalability boundary

• Provide better programmability