Why Distributed Systems?
Aggregate resources!
– memory
– disk
– CPU cycles

Proximity to physical stuff
– things with sensors
– things that print
– things that go boom
– other people

Fault tolerance!
– Don’t want one tsunami to take everything down
Peer To Peer
– Lots of reasonable machines
• No one machine loaded more than others
• No one machine irreplaceable!
Peer-to-Peer (P2P)
Where do the machines come from?
– “found” resources
• SETI@home
• BOINC
– existing resources
• computing “clusters” (32, 64, …)
What good is a peer-to-peer system?
– all those things mentioned before, including
• Storage: files, MP3s, leaked documents, porn …
The lookup problem
[Figure: nodes N1–N6 connected through the Internet. A publisher stores (key=“title”, value=MP3 data…) at one node; a client elsewhere issues Lookup(“title”). Which node holds the key?]
Centralized lookup (Napster)
[Figure: nodes N1–N9 around a central database DB. Publisher N4 registers its content with SetLoc(“title”, N4); the client sends Lookup(“title”) to the DB and is directed to N4, which holds (key=“title”, value=MP3 data…).]

Simple, but O(N) state and a single point of failure.
Flooded queries (Gnutella)
[Figure: nodes N1–N9. The client floods Lookup(“title”) to its neighbors, which forward it until it reaches publisher N4, holding (key=“title”, value=MP3 data…).]

Robust, but worst case O(N) messages per lookup.
Routed queries (Freenet, Chord, etc.)
[Figure: nodes N1–N9. The client’s Lookup(“title”) is routed hop by hop toward publisher N4, which holds (key=“title”, value=MP3 data…).]

Bad load balance.
Routing challenges
Define a useful key nearness metric.
Keep the hop count small.
– O(log N)

Keep the routing tables small.
– O(log N)
Stay robust despite rapid changes.
Distributed Hash Tables to the Rescue!
Load Balance: Distributed hash function spreads keys evenly over the nodes (Consistent hashing).
Decentralization: Fully distributed (Robustness).
Scalability: Lookup grows as a log of number of nodes.
Availability: Automatically adjusts internal tables to reflect changes.
Flexible Naming: No constraints on key structure.
What’s a Hash?
Wikipedia: any well-defined procedure or mathematical function that converts a large, possibly variable-sized amount of data into a small datum, usually a single integer
Example: Assume N is a large prime, and ‘a’ denotes the ASCII code for the letter ‘a’ (97).
H(“pete”) = H(“pet”) × N + ‘e’
          = (H(“pe”) × N + ‘t’) × N + ‘e’
          = 451845518507

H(“pete”)  mod 1000 = 507
H(“peter”) mod 1000 = 131
H(“petf”)  mod 1000 = 986
It’s a deterministic random number generator!
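The recurrence above can be sketched in a few lines. This is a minimal polynomial rolling hash; the slide does not say which prime it used, so the prime below is an assumption and the concrete hash values will differ from the slide’s.

```python
N = 1000003  # a large prime (arbitrary choice; the slide's prime is unspecified)

def H(s: str) -> int:
    """Rolling hash: H(s + c) = H(s) * N + ord(c), starting from H("") = 0."""
    h = 0
    for c in s:
        h = h * N + ord(c)
    return h

# Deterministic: the same input always hashes to the same value...
assert H("pete") == H("pete")
# ...but small input changes scramble the low-order digits:
print(H("pete") % 1000, H("peter") % 1000, H("petf") % 1000)
```

Taking the result mod 1000 (or mod the table size) is what turns this “deterministic random number” into a bucket index.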
Chord (a DHT)
m-bit identifier space for both keys and nodes.
Key identifier = SHA-1(key).
Node identifier = SHA-1(IP address).
Both are uniformly distributed.
How to map key IDs to node IDs?
Consistent hashing [Karger 97]
[Figure: circular 7-bit ID space containing nodes N32, N90, N105 and keys K5, K20, K80. By the successor rule, K5 and K20 are stored at N32, and K80 at N90.]
A key is stored at its successor: node with next higher ID
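The successor rule is just a sorted lookup with wrap-around. A minimal sketch on the figure’s 7-bit ring (the node IDs are taken from the figure; `ident` shows how Chord would derive them from SHA-1):

```python
import bisect
import hashlib

M = 7  # bits in the identifier space (IDs 0..127)

def ident(name: str) -> int:
    # SHA-1 truncated to m bits, as Chord does for keys and IP addresses
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % (2 ** M)

def successor(node_ids, key_id):
    """The node responsible for key_id: the first node at or after it, clockwise."""
    nodes = sorted(node_ids)
    i = bisect.bisect_left(nodes, key_id)
    return nodes[i % len(nodes)]  # wrap around the ring

nodes = [32, 90, 105]            # the ring from the figure
print(successor(nodes, 5))       # -> 32
print(successor(nodes, 80))      # -> 90
print(successor(nodes, 110))     # -> 32 (wraps past the top of the ring)
```

Because a joining or leaving node only shifts keys to or from its immediate neighbor, consistent hashing moves O(K/N) keys per membership change.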
“Finger table” allows log(N)-time lookups
[Figure: node N80’s fingers reach ½, ¼, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the ring.]
Every node knows m other nodes in the ring
Finger i points to successor of n + 2^(i−1)
[Figure: the same finger structure for N80; the finger targeting ID 112 (= 80 + 32) points to N120, the successor of 112.]
Each node knows more about the portion of the circle close to it.
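Building a finger table is just m successor lookups. A sketch for N80 on a hypothetical ring containing N80 and N120 (the other node IDs here are made up for illustration):

```python
import bisect

M = 7  # 7-bit identifier space

def successor(node_ids, key_id):
    nodes = sorted(node_ids)
    i = bisect.bisect_left(nodes, key_id)
    return nodes[i % len(nodes)]

def finger_table(n, nodes):
    # Finger i (1-based) points to successor of n + 2^(i-1), mod 2^m
    return [successor(nodes, (n + 2 ** (i - 1)) % 2 ** M) for i in range(1, M + 1)]

nodes = [16, 32, 45, 80, 96, 120]   # hypothetical ring
print(finger_table(80, nodes))      # -> [96, 96, 96, 96, 96, 120, 16]
```

Note how the first several fingers collapse onto the nearby node N96, while the later ones jump halfway around the ring: that is exactly the “knows more about the portion close to it” property, and it is what makes lookups take O(log N) hops.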
Joining: linked list insert
[Figure: new node N36 joining between N25 and N40. Step 1: N36 issues Lookup(36) to find its successor, N40, which currently holds keys K30 and K38.]
1. Each node’s successor is correctly maintained.
2. For every key k, node successor(k) is responsible for k.
Join (2)
[Figure: Step 2: N36 sets its own successor pointer to N40, which still holds K30 and K38.]
Initialize the new node’s finger table.
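The two join steps really are a linked-list insert. A minimal sketch, assuming each node keeps only a successor pointer (finger tables and predecessor repair come later, via stabilization):

```python
class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self  # a lone node points to itself

def _between(a, x, b):
    """Is x in the half-open ring interval (a, b]?"""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps around 0

def lookup_successor(start, key_id):
    """Walk the ring until key_id falls between a node and its successor."""
    n = start
    while not _between(n.id, key_id, n.successor.id):
        n = n.successor
    return n.successor

def join(new, existing):
    # Step 1: Lookup(new.id) finds the successor. Step 2: set the pointer.
    new.successor = lookup_successor(existing, new.id)

# The figure's ring: N25 <-> N40, then N36 joins
n25, n40 = Node(25), Node(40)
n25.successor, n40.successor = n40, n25
n36 = Node(36)
join(n36, n25)
print(n36.successor.id)  # -> 40
```

N25’s own successor pointer still names N40 at this point; fixing that is exactly the job of the stabilization protocol below, so the invariants are only temporarily violated.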
Stabilization Protocol
To handle concurrent node joins/fails/leaves.
Keep successor pointers up to date, then verify and correct finger table entries.
Incorrect finger pointers may only increase latency, but incorrect successor pointers may cause lookup failure.
Nodes periodically run stabilization protocol.
Won’t correct a Chord system that has split into multiple disjoint cycles, or a single cycle that loops multiple times around the identifier space.
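The periodic repair loop can be sketched as a stabilize/notify pair. This is a simplified single-process sketch (real nodes invoke these as remote calls on a timer, and must also handle failed peers):

```python
class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def stabilize(self):
        # Ask our successor for its predecessor; if that node sits between
        # us and our successor on the ring, adopt it as our new successor.
        x = self.successor.predecessor
        if x is not None and _between(self.id, x.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, other):
        # `other` believes it is our predecessor; accept if it's closer.
        if self.predecessor is None or _between(self.predecessor.id, other.id, self.id):
            self.predecessor = other

def _between(a, x, b):
    """Is x in the open ring interval (a, b), wrapping around 0?"""
    if a < b:
        return a < x < b
    return x > a or x < b

# After N36 joins with successor N40, stabilization repairs the ring:
n25, n36, n40 = Node(25), Node(36), Node(40)
n25.successor, n40.successor = n40, n25
n40.predecessor = n25
n36.successor = n40   # the join from the previous slides
n36.stabilize()       # N40 learns that N36 is its predecessor
n25.stabilize()       # N25 sees N40.predecessor == N36 and adopts it
print(n25.successor.id)  # -> 36
```

Each round of stabilization tightens pointers toward the correct ring, which is why it converges after concurrent joins, but it cannot merge disjoint cycles, as noted above.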