Winter, 2004 CSS490 DFS 1
CSS490 Distributed File Systems
Textbook Ch7 (p421 - 440)
Instructor: Munehiro Fukuda
These slides were compiled from the course textbook and reference books.
DFS Services
Storage service
  Disk service: giving a transparent view of distributed disks.
  Block service: giving the same logical view of disk-accessing units.
True file service
  File-accessing mechanism: deciding where to manage remote files and which unit to transfer data in (at server or client? file, block or byte?)
  File-sharing semantics: providing file update semantics similar to, but weaker than, Unix
  File-caching mechanism: improving performance/scalability
  File-replication mechanism: improving performance/availability
Name service
  Mapping between text file names and references to files (i.e. file IDs)
  Directory service
DFS Desirable Features
  Transparency: should include structure, access, naming, and replication transparency.
  User mobility: should not force a user to work on a specific node.
  Performance: should be comparable to that of a centralized file system.
  Simplicity: should give the same semantics as a centralized file system.
  Scalability: should cope with the growth of nodes.
  Fault tolerance: should not stop on a failure and should maintain backup copies.
  Synchronization: should complete concurrent access requests consistently.
  Security: should protect files from network intruders.
  Heterogeneity: should allow a variety of nodes to share files on different storage media.
File Models
Unstructured and Structured Files
  An un-interpreted sequence of bytes: UNIX and MS-DOS
  Non-indexed records: IBM mainframe
  Indexed records such as B-tree: Research Storage System (RSS) and Oracle
Mutable and Immutable Files
  Mutable: a single stored sequence altered by each update (ex. Unix and MS-DOS)
  Immutable: a history of immutable versions, one created at every update (ex. Cedar File System)
File-Accessing Models
Accessing Remote Files
  File access | Merits | Demerits
  Remote service model (at a server) | A simple implementation | Communication overhead
  Data caching model (at a client that cached a file copy) | Reducing network traffic | Cache consistency problem
Unit of Data Transfer
  Transfer level | Merits | Demerits
  File | Simple, less communication overhead, and immune to server failures | A client required to have large storage space
  Block | A client not required to have large storage space | More network traffic/overhead
  Byte | Flexibility maximized | Difficult cache management to handle the variable-length data
  Record | Handling structured and indexed files | More network traffic; more overhead to re-construct a file
File-Sharing Semantics
Define when modifications of the file data made by a user are observable by other users
1. Unix semantics
2. Session semantics
3. Immutable shared-files semantics
4. Transaction-like semantics
File-Sharing Semantics: Unix Semantics
[Figure: Absolute ordering. Timeline t1 to t6; Client A and Client B issue Append(c), Append(d), and read operations while the file grows from "a b" to "a b c d e".]
[Figure: Network delays. Labels: Append(d), delayed, "a b c", "a b".]
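The effect of network delays on Unix semantics can be illustrated with a minimal sketch. It assumes the server applies operations in the order they arrive, not the order they were issued. The function name, the tuple layout, and the delay values are hypothetical and not taken from the slides.

```python
def run_in_arrival_order(initial, operations):
    """Apply operations at the server in order of arrival time.

    Each operation is (issue_time, delay, client, op, arg), where op is
    "append" or "read". Returns the final content and what each read saw.
    """
    data = list(initial)
    reads = {}
    # Arrival time = issue time + network delay (assumed model).
    for issue, delay, client, op, arg in sorted(
        operations, key=lambda o: o[0] + o[1]
    ):
        if op == "append":
            data.append(arg)
        else:
            reads[(client, issue)] = "".join(data)
    return "".join(data), reads


# Without delays, the read issued at t=2 sees the append issued at t=1.
on_time = [(1, 0, "B", "append", "c"), (2, 0, "A", "read", None)]
# With a delayed append, the same read returns the older content.
delayed = [(1, 1.5, "B", "append", "c"), (2, 0, "A", "read", None)]

print(run_in_arrival_order("ab", on_time)[1])
print(run_in_arrival_order("ab", delayed)[1])
```

With a single centralized clock the two runs would agree; the delayed run shows why absolute ordering is hard to guarantee across a network.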
File-Sharing Semantics: Session Semantics
[Figure: Server and Clients A, B, and C. Operations shown: Open(file), Append(c), Append(d), Append(e), Append(x), Append(y), Append(z), Append(m), Close(file). File states shown: "a b", "a b c", "a b c d", "a b c d e", "a b x", "a b c y", "a b c d z", "a b c d e m".]
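Session semantics can be sketched as follows: a client works on a private copy taken at open time, and its changes reach the server only at close. The class names are hypothetical, and the rule that a later close overwrites the server copy is an assumption of this sketch, not something the slide states.

```python
class FileServer:
    """Holds the single authoritative copy of the file."""

    def __init__(self, content):
        self.content = content


class Session:
    """One open-to-close session of a client on the file."""

    def __init__(self, server):
        # Open(file): take a private snapshot of the server copy.
        self.server = server
        self.local = server.content

    def append(self, text):
        # Visible only inside this session until close().
        self.local += text

    def close(self):
        # Close(file): publish the private copy (assumed: last close wins).
        self.server.content = self.local


server = FileServer("ab")
client_a = Session(server)
client_b = Session(server)
for item in "cde":
    client_a.append(item)
client_b.append("x")          # Client B still works on "ab"
client_a.close()              # server now holds "abcde"
client_c = Session(server)    # Client C opens after A's close
client_c.append("m")
client_c.close()
print(server.content, client_b.local)
```

A real system needs a policy for Client B's later close; overwriting, merging, and rejecting are all possible choices.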
File-Sharing Semantics: Transaction-Like Semantics (Concurrency Control)
Backward validation: compare reads with former writes.
Forward validation: compare writes with later reads. At Trans_end, abort itself or the conflicting active transactions.
[Figure: two panels, backward validation and forward validation. In each, Clients A to D run the transactions R1 R2 W3 R4 W5, R1 R2 W6 R4 W7, R1 R2 W9 R4 W8, and R1 R2 R6 R8 W8 between Trans_start and Trans_end, followed by validation and commitment; one transaction goes through Trans_abort and Trans_restart.]
Which validation is better?
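The two validation rules can be sketched with read and write sets, using the four transactions from the figure. Which transactions overlap in time cannot be recovered from the slide, so the overlaps used below are assumptions, and the function names are hypothetical.

```python
import re


def read_write_sets(ops):
    """Split a string such as "R1R2W3R4W5" into (read set, write set)."""
    reads, writes = set(), set()
    for kind, item in re.findall(r"([RW])(\d+)", ops):
        (reads if kind == "R" else writes).add(int(item))
    return reads, writes


def backward_validate(read_set, committed_write_sets):
    """Pass if nothing we read was written by an overlapping transaction
    that has already committed."""
    return all(read_set.isdisjoint(ws) for ws in committed_write_sets)


def forward_validate(write_set, active_read_sets):
    """Pass if nothing we write has been read by a still-active transaction."""
    return all(write_set.isdisjoint(rs) for rs in active_read_sets)


a_reads, a_writes = read_write_sets("R1R2W3R4W5")
b_reads, b_writes = read_write_sets("R1R2W6R4W7")
d_reads, d_writes = read_write_sets("R1R2R6R8W8")

# Assumed overlap: B committed while D was running, and D read item 6.
print(backward_validate(d_reads, [b_writes]))   # D fails and must restart
# Assumed overlap: D is still active when B validates.
print(forward_validate(b_writes, [d_reads]))    # conflict: abort B or D
print(forward_validate(a_writes, [d_reads]))    # A does not conflict with D
```

Backward validation can only abort the validating transaction, since the others have already committed. Forward validation can choose which side to abort.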
File-Sharing Semantics: Immutable Shared-Files Semantics
[Figure: Server, Client A, and Client B. Both clients create a tentative version based on Version 1.0. One becomes Version 1.1; the other causes a version conflict, which is handled by Ignore conflict (Version 1.2), Merge (Version 1.2), or Abort.]
The choice depends on each file system. Abortion is simple (later, client A can decide to overwrite it with its tentative version based on 1.0 by changing the corresponding directory).
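The three ways of handling a version conflict can be sketched as below. The class and its policy names are hypothetical. The merge rule assumes that a tentative version only appends to its base, which is a simplification of this sketch and not a rule from the slide.

```python
class VersionedFile:
    """A file kept as a history of immutable versions."""

    def __init__(self, content):
        self.versions = [content]

    def checkout(self):
        """Return (base version index, content) to build a tentative version on."""
        return len(self.versions) - 1, self.versions[-1]

    def commit(self, base, tentative, on_conflict="abort"):
        """Try to add a tentative version; return True if a version was created."""
        latest = len(self.versions) - 1
        if base != latest:  # someone else committed first: version conflict
            if on_conflict == "abort":
                return False
            if on_conflict == "merge":
                # Assumed append-only edits: replay our suffix on the latest version.
                suffix = tentative[len(self.versions[base]):]
                tentative = self.versions[latest] + suffix
            elif on_conflict != "ignore":
                raise ValueError("unknown policy: " + on_conflict)
        self.versions.append(tentative)
        return True


f = VersionedFile("ab")
base_a, _ = f.checkout()
base_b, _ = f.checkout()
f.commit(base_b, "abx")                    # Client B commits first
print(f.commit(base_a, "aby"))             # Client A aborts on conflict
print(f.commit(base_a, "aby", "merge"))    # or merges into a new version
print(f.versions)
```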
File-Caching Schemes: Cache Location
  Location | Merits | Demerits
  No caching | No modifications | Frequent disk access, busy network traffic
  In server’s main memory | One-time disk access, easy implementation, Unix-like file-sharing semantics | Busy network traffic
  In client’s disk | One-time network access, no size restriction | Cache consistency problem, file access semantics, frequent disk access, no diskless workstation
  In client’s main memory | Maximum performance, diskless workstation, scalability | Size restriction, cache consistency problem, file access semantics
[Figure: a client and a server separated by a node boundary, each with main memory and a disk; the file resides on a disk and copies of it are held at the other locations.]
File-Caching Schemes: Modification Propagation
Write-through scheme
  Pros: Unix-like semantics and high reliability
  Cons: Poor write performance
Delayed-write scheme
  Write on cache displacement
  Periodic write
  Write on close
  Pros:
    Write accesses complete quickly.
    Some writes may be omitted by the following writes.
    Gathering all writes mitigates network overhead.
  Cons:
    Delaying of write propagation results in fuzzier file-sharing semantics.
[Figure: Immediate write. Client 1 and Client 2 hold copies in main memory; a write (W) by Client 1 is propagated at once to the file on the server's disk, giving a new version.]
[Figure: Delayed write. The same setting, but the write (W) reaches the file on the server's disk only later.]
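The difference between the two propagation schemes can be sketched by counting how many writes reach the server. The class names are hypothetical, and the delayed-write variant shown here is write-on-close, one of the three options listed above.

```python
class Server:
    """Stores blocks and counts how many writes arrive over the network."""

    def __init__(self):
        self.blocks = {}
        self.writes_received = 0

    def write(self, key, value):
        self.blocks[key] = value
        self.writes_received += 1


class CachingClient:
    def __init__(self, server, write_through):
        self.server = server
        self.write_through = write_through
        self.cache = {}
        self.dirty = set()

    def write(self, key, value):
        self.cache[key] = value
        if self.write_through:
            self.server.write(key, value)   # propagate immediately
        else:
            self.dirty.add(key)             # remember for later

    def close(self):
        # Write on close: only the latest value of each dirty block is sent.
        for key in sorted(self.dirty):
            self.server.write(key, self.cache[key])
        self.dirty.clear()


for write_through in (True, False):
    server = Server()
    client = CachingClient(server, write_through)
    for value in ("v1", "v2", "v3"):
        client.write("block0", value)
    client.close()
    print(write_through, server.writes_received, server.blocks)
```

Three writes to the same block cost three server writes under write-through and one under write-on-close, at the price of the server copy being stale until the close.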
File-Caching Schemes: Cache Validation Schemes: Client-Initiated Approach
  Checking before every access (Unix-like semantics but too slow)
  Checking periodically (better performance but fuzzy file-sharing semantics)
  Checking on file open (simple, suitable for session semantics)
  Problem: High network traffic
[Figure: two panels, each with Client 1 and Client 2 holding copies in main memory and the file on the server's disk. First panel: check before every access, with write-through (delayed write?). Second panel: write-on-close with check-on-open (check-on-close?).]
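Check-on-open can be sketched with a version number per file: the client asks the server for the current version at every open and refetches only when its cached copy is stale. Using version numbers instead of timestamps, and all class and method names, are assumptions of this sketch.

```python
class FileServer:
    def __init__(self):
        self.files = {}   # name -> (version, data)

    def store(self, name, data):
        version = self.files.get(name, (0, ""))[0] + 1
        self.files[name] = (version, data)

    def version(self, name):
        return self.files[name][0]

    def fetch(self, name):
        return self.files[name]


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}    # name -> (version, data)
        self.fetches = 0   # number of whole-file transfers

    def open(self, name):
        cached = self.cache.get(name)
        # Client-initiated validation: one small version check per open.
        if cached is None or cached[0] != self.server.version(name):
            self.cache[name] = self.server.fetch(name)
            self.fetches += 1
        return self.cache[name][1]


server = FileServer()
server.store("f", "ab")
client = Client(server)
client.open("f")            # first open: fetch
client.open("f")            # still valid: no fetch
server.store("f", "abc")    # another client updated the file
print(client.open("f"), client.fetches)
```

Every open still costs a round trip for the version check, which is the network-traffic problem noted above.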
File-Caching Schemes: Cache Validation Schemes: Server-Initiated Approach
  Keeping track of clients having a copy
  Denying a new request, queuing it, and disabling caching
  Notifying all clients of any update on the original file
  Problem:
    Violating client-server model
    Stateful servers
    Check-on-open still needed for the 2nd file opening.
[Figure: Clients 1 to 4 and a server holding the file on disk. Clients holding a copy in main memory write (W) with write-through or delayed write (?); the server notifies (invalidates) other clients holding a copy and denies a new open from Client 4.]
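The notify-on-update variant can be sketched as a server that records which clients hold a copy and calls them back when the file changes. The class and method names are hypothetical, and write-through from the writer is assumed.

```python
class CallbackServer:
    def __init__(self):
        self.files = {}
        self.holders = {}   # name -> set of clients caching the file (server state)

    def fetch(self, name, client):
        self.holders.setdefault(name, set()).add(client)
        return self.files[name]

    def write(self, name, data, writer):
        self.files[name] = data
        # Server-initiated validation: invalidate every other cached copy.
        for client in self.holders.get(name, set()) - {writer}:
            client.invalidate(name)
        self.holders[name] = {writer}


class CallbackClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.fetch(name, self)
        return self.cache[name]

    def write(self, name, data):
        self.cache[name] = data
        self.server.write(name, data, self)   # assumed write-through

    def invalidate(self, name):
        self.cache.pop(name, None)


server = CallbackServer()
server.files["f"] = "ab"
client1, client2 = CallbackClient(server), CallbackClient(server)
client1.read("f")
client2.read("f")
client1.write("f", "abc")
print("f" in client2.cache, client2.read("f"))
```

The `holders` table is exactly the per-client state that makes the server stateful, and the callback reverses the usual client-to-server direction of requests.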
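Sun NFS: Structure
[Figure: Client A, Client B, and Server. On each node, user processes go through a VFS layer that sits above the local FS and an NFS client (on the clients) or NFS server (on the server), which communicate through RPC stubs. The server's tree (/, usr, bin, org) is exported; the clients' trees (/, usr, bin, shared) and (/, opt, bin, shared) import it.]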
Sun NFS: Installation
Server:
  Check if NFS is running: rpcinfo -p
  Start NFS: /etc/rc.d/init.d/nfs start
  Edit /etc/exports file: /dir/to/export client1(permissions), client2(…
  Export dirs in /etc/exports: exportfs -a
  Check exported directories: showmount -e
Client:
  Import a server’s directory: mount -o options server_name:/dir /my_dir
    bg: keep retrying the import in the background upon a failure
    intr: a process can be interrupted while its I/O request to the server dir is pending
    soft: allowing a client to time out the connection after a number of retries
    rw/ro: normal r/w or read only
Underlying Connections:
[Figure: connections between the client and the server's portmapper, mountd, and nfs. Labels: NFS mount service port, permission, 2049, rpc.]
Sun NFS: Overviews
Communication
  RPC: a compound procedure
    Lookup, Open, and Read
Server status
  Stateless: simple implementation in ver 3.
  Stateful: allowing clients to cache files in ver 4.
    RPC call back from a server to invalidate a client’s cache
Synchronization
  Session semantics
  File locking in ver 4: lock, lockt, locku, and renew
    Ex. Emacs: tests with lockt when modifying a buffer, locks the file with lock, and unlocks it with locku after writing the buffer contents to the file.
  Share reservation: specify how to share a file (with ro, wo, or r/w)
Sun NFS: Overviews (Cont’d)
Caching
  In client’s memory
  Session semantics
  Revalidation of client’s cache upon re-opening the same file
  Open delegation:
    A server delegates open decisions to a writing client, which can then handle open requests from other clients on the same machine.
    A server calls back the client when receiving an open request from another machine.
Fault Tolerance
  RPC failure: use a duplicate-request cache
  File locking failure: provide a grace period during which clients reclaim locks previously granted and the server rebuilds its previous state.
Sun NFS: Duplicate Request Cache
[Figure: three client-server exchanges in which a request with XID = 1234 is sent twice. First panel: the duplicate arrives before the transaction is completed ("Too soon, ignore"). Second panel: the duplicate arrives right after the reply ("Just replied, ignore"). Third panel: the transaction is completed and the server sends the reply again.]
Then, when does the server delete this cached result?
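A duplicate-request cache along these lines can be sketched as a table keyed by XID. The class, the `just_replied_window` parameter, and the explicit `now` arguments are assumptions made to keep the sketch deterministic; they are not NFS interfaces. The sketch never evicts entries, so the question above is left open.

```python
class DuplicateRequestCache:
    """Decides what to do with a request, based on its transaction ID (XID)."""

    def __init__(self, just_replied_window=1.0):
        self.window = just_replied_window
        self.entries = {}   # xid -> {"reply": ..., "done_at": ...}

    def receive(self, xid, now):
        """Return ("execute" | "ignore" | "resend", reply or None)."""
        entry = self.entries.get(xid)
        if entry is None:
            # First time this XID is seen: start the transaction.
            self.entries[xid] = {"reply": None, "done_at": None}
            return ("execute", None)
        if entry["done_at"] is None:
            return ("ignore", None)    # still in progress: too soon
        if now - entry["done_at"] < self.window:
            return ("ignore", None)    # just replied
        return ("resend", entry["reply"])  # reply was probably lost

    def complete(self, xid, reply, now):
        """Record the result once the transaction has completed."""
        self.entries[xid].update(reply=reply, done_at=now)
        return reply


drc = DuplicateRequestCache(just_replied_window=1.0)
print(drc.receive(1234, now=0.0))   # new request
print(drc.receive(1234, now=0.5))   # retransmission while in progress
drc.complete(1234, "ok", now=1.0)
print(drc.receive(1234, now=1.2))   # retransmission right after the reply
print(drc.receive(1234, now=5.0))   # late retransmission
```

Because the cached reply is resent instead of re-executing the request, a non-idempotent operation is not applied twice.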
DFS Example: Andrew File System
[Figure: two panels labeled Client A, each showing a Unix kernel (Unix FS) with a local tree (/, usr, tmp, bin), symbolic links, and a Venus process; the first panel also shows a user process and the cache.]
DFS Example: XFS
[Figure: clients, metadata managers, and storage servers connected by a LAN.]
Write:
  1: Write requests
  2: Log them in a segment
  3: Fragment the segment and send the fragments to a stripe group of servers
Read:
  1: Read request
  2: Query a manager
  3: Collaborative caching (read data from another client if possible)
DFS Example: Plan 9
[Figure: File server 1, File server 2, a computation server, and a network interface connected to the Internet, each exporting part of a directory tree; the client imports them into its own name space. Labels: union directory, remote execution, network access (net).]
Paper Review by Students
  Sun NFS
  Andrew File System
  XFS
  Plan 9
  LFS