1
Koorde: A Simple Degree Optimal DHT
Frans Kaashoek, David Karger
MIT
Brought to you by the IRIS project
2
DHT Routing
- Distributed hash tables
  - Implement hash table interface
  - Map any ID to the machine responsible for that ID (in a consistent fashion)
- Standard primitive for P2P
- Machines not all aware of each other
- Each tracks small set of “neighbors”
- Route to responsible node via sequence of “hops” to neighbors
3
Performance Measures
- Degree
  - How many neighbors nodes have
- Hop count
  - How long to reach any destination node
- Fault tolerance
  - How many nodes can fail
- Maintenance overhead
  - E.g., making sure neighbors are up
- Load balance
  - How evenly keys distribute among nodes
4
Tradeoffs
- With larger degree, hope to achieve
  - Smaller hop count
  - Better fault tolerance
- But higher degree implies
  - More routing table state per node
  - Higher maintenance overhead to keep routing tables up to date
- Load balance “orthogonal issue”
5
Current Systems
- Chord, Kademlia, Pastry, Tapestry
  - O(log n) degree
  - O(log n) hop count
  - O(log n) ratio load balance
- Chord: O(1) load balance with O(log n) “virtual nodes” per real node
  - Multiplies degree to O(log^2 n)
6
Outliers
- CAN
  - Degree d
  - O(d n^(1/d)) hops
- Viceroy
  - O(log n) hop count
  - Constant average degree
  - But some nodes have degree log n
7
Lower Bounds to Shoot For
- Theorem: if max degree is d, then hop count is at least log_d n
  - Proof: at most d^h nodes at distance h
  - Allows degree O(1) and O(log n) hops
  - Or degree O(log n) and O(log n / loglog n) hops
- Theorem: to tolerate half the nodes failing (e.g. net partition), need degree Ω(log n)
  - Proof: if less, some node loses all neighbors
  - Might as well take O(log n / loglog n) hops!
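One way to fill in the two proof sketches above is the following counting argument. It assumes d ≥ 2 and, for the fault-tolerance bound, that each node fails independently with probability 1/2; that failure model is an assumption made here for illustration and is not stated on the slide.

```latex
% Hop-count bound: count the nodes reachable within h hops of any start node.
\[
  n \;\le\; \sum_{j=0}^{h} d^{\,j} \;<\; d^{\,h+1}
  \quad\Longrightarrow\quad
  h \;>\; \log_d n - 1 ,
  \qquad\text{so } h = \Omega(\log_d n).
\]
% Fault-tolerance bound (heuristic, independent failures with probability 1/2):
\[
  \Pr[\text{a given node loses all } d \text{ neighbors}] = 2^{-d},
  \qquad
  \mathbb{E}[\#\text{such nodes}] \approx n\,2^{-d} \;\ge\; 1
  \quad\text{whenever } d \le \log_2 n .
\]
```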
8
Koorde
- New routing protocol
- Shares almost all aspects with Chord
- But meets (to within constant factor) all lower bounds just mentioned:
  - Degree 2 and O(log n) hops
  - Or degree log n and O(log n / loglog n) hops, and fault tolerant
- Like Chord, O(log n) load balance
  - or constant with O(log n) times degree
9
Chord Review
- Chord consists of
  - Consistent hashing to assign IDs to nodes
    - Good load balance
  - Efficient routing protocol to find right node
  - Fast join/leave protocol
    - Few data items shifted
  - Fault tolerance to half of nodes failing
  - Efficient maintenance over time
- Koorde: routing protocol to find right node
10
Consistent Hashing
- Assign ID to “successor” node on ring
- Assign doc with hash 49 to node 51
[Figure: ring with nodes 0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60; key 49 maps to node 51]
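The successor rule on this slide can be sketched in a few lines. The node IDs are read off the ring figure; the function name `successor` and the sorted-list representation are choices made here for illustration, not part of the talk.

```python
from bisect import bisect_left


def successor(nodes, key):
    """Return the first node ID at or after `key` on the ring, wrapping around.

    `nodes` must be a sorted list of distinct node IDs.
    """
    idx = bisect_left(nodes, key)
    return nodes[idx % len(nodes)]


# Node IDs taken from the ring in the figure.
NODES = [0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]

# The document with hash 49 is assigned to node 51.
print(successor(NODES, 49))
```

Keys past the largest node ID wrap around to the smallest one, which is what makes the ID space a ring.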
11
Chord Routing
- Each node keeps successor pointer
- Also keeps power-of-two “fingers”
  - neighbors providing shortcuts
  - So log n fingers
[Figure: ring with nodes 0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60, with finger pointers]
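A minimal sketch of the power-of-two fingers, assuming the ring in the figure is a 6-bit ring (IDs 0 to 63) and that finger i of node u is the first node at or after u + 2^i. Both the ring size and the helper names are assumptions made for this example.

```python
from bisect import bisect_left


def successor(nodes, key):
    """First node ID at or after `key` on the ring (sorted `nodes`), wrapping around."""
    return nodes[bisect_left(nodes, key) % len(nodes)]


def chord_fingers(nodes, u, b):
    """Finger i of node u points at the successor of u + 2**i (mod 2**b)."""
    return [successor(nodes, (u + (1 << i)) % (1 << b)) for i in range(b)]


# Node IDs from the figure; b = 6 is an assumed ring size.
NODES = [0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]

print(chord_fingers(NODES, 0, 6))  # [6, 6, 6, 13, 18, 36]
```

With b bits there are b fingers per node, which is the log n degree the slide refers to when IDs are dense.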
12
Chord Lookups
[Figure: ring with nodes 0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60, showing a lookup path]
13
Koorde Idea
- Chord acts like a hypercube
  - Fingers flip one bit
  - Degree log n (log n different flips)
  - Diameter log n
- Koorde uses a de Bruijn network
  - Fingers shift in one bit
  - Degree 2 (2 possible bits to shift in)
  - Diameter log n
14
De Bruijn Graph
- Nodes are b-bit integers (b = log n)
- Node u has 2 neighbors (bit shifts): 2u mod 2^b and 2u+1 mod 2^b
[Figure: 3-bit de Bruijn graph on nodes 000 through 111; each edge is labeled with the bit shifted in (0 or 1)]
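The neighbor rule can be checked directly on the 3-bit graph from the figure. This is a small sketch; the function name `de_bruijn_neighbors` is made up for the example.

```python
def de_bruijn_neighbors(u, b):
    """Out-neighbors of b-bit node u: shift left, drop the top bit, append 0 or 1."""
    mask = (1 << b) - 1
    return ((2 * u) & mask, (2 * u + 1) & mask)


# Print the edge list of the 3-bit de Bruijn graph.
for u in range(8):
    n0, n1 = de_bruijn_neighbors(u, 3)
    print(f"{u:03b} -> {n0:03b}, {n1:03b}")
```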
15
De Bruijn Routing
- Shift in destination bits one by one
- b hops complete route
- Route from 000 to 110:
[Figure: 3-bit de Bruijn graph with the route from 000 to 110 highlighted]
16
Routing Code
Procedure u.LOOKUP(k, toShift)
  /* u is machine, k is target key;
     toShift is target bits not yet shifted in */
  if k = u then
    return u  /* as owner for k */
  else  /* do de Bruijn hop */
    t = u ∘ topBit(toShift)  /* shift the top bit of toShift into u */
    return t.LOOKUP(k, toShift << 1)
Initially call self.LOOKUP(k, k)
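The procedure above can be written as a short iterative sketch for a complete de Bruijn graph (every b-bit ID is a live node). The names `top_bit` and `lookup` and the returned path list are choices made here; the logic follows the slide's pseudocode.

```python
def top_bit(x, b):
    """Most significant bit of the b-bit value x."""
    return (x >> (b - 1)) & 1


def lookup(u, k, b):
    """Route from node u to key k on a complete b-bit de Bruijn graph.

    Returns the list of nodes visited, starting with u and ending with k.
    """
    mask = (1 << b) - 1
    to_shift = k          # bits of k not yet shifted in
    path = [u]
    while u != k:
        u = ((u << 1) | top_bit(to_shift, b)) & mask   # de Bruijn hop
        to_shift = (to_shift << 1) & mask
        path.append(u)
    return path


# Route from 000 to 110, as on the routing slide.
print([f"{v:03b}" for v in lookup(0b000, 0b110, 3)])  # ['000', '001', '011', '110']
```

After j hops the low j bits of the current node equal the top j bits of k, so the loop ends after at most b hops.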
17
Summary
- Each node has 2 outgoing neighbors
  - Also two incoming
  - Can show good routing load balance
- Need b = log n bits for n distinct nodes
  - So log n hops to route
18
Problems to Solve
- Want b-bit ring, b >> log n, to avoid colliding identifiers as nodes join
  - Implies use b >> log n hops
  - Worse, most nodes not present to route!
- Solutions
  - Imaginary routing: present nodes simulate routing actions of absent nodes
  - Short cuts: use gaps to start route with most of destination bits already shifted in
19
Imaginary routing
- Node u holds two pointers
  - Successor on ring
  - One finger: predecessor of 2u (mod 2^b)
    - On sparse ring, is also predecessor of 2u+1
    - So handles both de Bruijn edges
- Node u “owns” all imaginary nodes between self and (real) successor
- Simulates de Bruijn routing from those imaginary nodes to others by forwarding to the others’ real owners
20
Code
Procedure u.LOOKUP(k, toShift, i)
  if k ∈ (u, u.successor] then
    return u.successor  /* as bucket for k */
  else if i ∈ (u, u.successor] then  /* i belongs to u; do de Bruijn hop */
    return u.finger.LOOKUP(k, toShift << 1, i ∘ topBit(toShift))
  else  /* i doesn’t belong to u; forward it */
    return u.successor.LOOKUP(k, toShift, i)
Initially call self.LOOKUP(k, k, self)
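A runnable sketch of imaginary routing on a sparse ring follows. Several things here are assumptions made for the example rather than details from the talk: the `Ring` class is a global view used only to simulate each node's two pointers; node u is treated as owning imaginary IDs in [u, successor), so that the initial imaginary node i = u belongs to u; the finger is the last real node at or before 2u mod 2^b; and an explicit `k == u` check is added so a key equal to a node ID resolves to that node.

```python
from bisect import bisect_left, bisect_right


class Ring:
    """Global view of a sparse ring, used only to simulate per-node pointers."""

    def __init__(self, nodes, b):
        self.b = b
        self.size = 1 << b
        self.nodes = sorted(nodes)

    def owner(self, key):
        """Real node responsible for key: first node at or after it."""
        return self.nodes[bisect_left(self.nodes, key) % len(self.nodes)]

    def successor(self, u):
        """Next real node after u on the ring."""
        return self.nodes[bisect_right(self.nodes, u) % len(self.nodes)]

    def finger(self, u):
        """Last real node at or before 2u mod 2^b (index -1 wraps to the last node)."""
        x = (2 * u) % self.size
        return self.nodes[bisect_right(self.nodes, x) - 1]


def in_left_closed(x, lo, hi):
    """True if x lies in [lo, hi) going clockwise; lo == hi means the whole ring."""
    return lo <= x < hi if lo < hi else (x >= lo or x < hi)


def in_right_closed(x, lo, hi):
    """True if x lies in (lo, hi] going clockwise; lo == hi means the whole ring."""
    return lo < x <= hi if lo < hi else (x > lo or x <= hi)


def lookup(ring, u, k):
    """Return (owner of k, number of hops taken) starting from real node u."""
    top = ring.b - 1
    to_shift, i, hops = k, u, 0
    while True:
        succ = ring.successor(u)
        if k == u:                              # added guard: u itself owns k
            return u, hops
        if in_right_closed(k, u, succ):         # successor is the bucket for k
            return succ, hops
        if in_left_closed(i, u, succ):          # i belongs to u: de Bruijn hop
            bit = (to_shift >> top) & 1
            i = ((i << 1) | bit) % ring.size
            to_shift = (to_shift << 1) % ring.size
            u = ring.finger(u)
        else:                                   # i not owned by u: forward it
            u = succ
        hops += 1


ring = Ring([0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60], b=6)
print(lookup(ring, 13, 49)[0])  # owner of key 49 is node 51
```

Once all b bits have been shifted in, i equals k, so the remaining successor steps end at the bucket for k; this mirrors the correctness argument on the later slide.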
21
True route tracks imaginary
[Figure: ring diagram; labels: start, target, successor, finger (< double), imaginary (double)]
22
Correctness
- Once b de Bruijn steps happen, done
  - At this point, i = k
  - Will follow successors to bucket for k
- Successor steps delay de Bruijn steps, but not forever
  - After finite number of successor steps, reach predecessor of i
- Conclude: all necessary de Bruijn steps happen in finite time. So correct.
23
How long?
- Only b de Bruijn steps
- Just bound (expected) number of successor steps per de Bruijn step
- Nodes randomly distributed on ring
  - So node expects to own size 1/n interval
  - So distance to imaginary node on de Bruijn step is 1/n
  - De Bruijn step doubles everything, makes distance 2/n
  - Expect 2 nodes in interval of that size
24
Few Successor Steps
[Figure: ring diagram; labels: start, target; an interval of size 1/n maps to an interval of size < 2/n]
25
Summary
- Each de Bruijn hop followed by 2 successor hops (in expectation)
- b de Bruijn hops
- Conclude 2b successor hops, so 3b hops in total
- Expectation argument extends to “with high probability” argument (same bounds)
- Remaining problem: b >> log n, too big
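The hop count on this slide can be written out as a short calculation. It treats the ring as having unit length with n nodes placed uniformly at random, as on the preceding slides.

```latex
% Expected successor hops after one de Bruijn hop:
% the owned gap of length about 1/n is doubled to about 2/n,
% and an interval of that length holds n * (2/n) nodes in expectation.
\[
  \mathbb{E}[\text{successor hops per de Bruijn hop}]
  \;\approx\; n \cdot \frac{2}{n} \;=\; 2 ,
\]
% Total over b de Bruijn hops:
\[
  \mathbb{E}[\text{total hops}]
  \;\approx\; \underbrace{b}_{\text{de Bruijn}} \;+\; \underbrace{2b}_{\text{successor}}
  \;=\; 3b .
\]
```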
26
Exploit Address Blocks
- Only n real nodes
- Each owns ~1/n “block” of keyspace
- Within that block, only top log n bits “significant”; low bits arbitrary
- So set low bits to high bits of target
- Then just have to shift out log n most significant bits
- So log n de Bruijn hops
- So O(log n) hops in total
27
Example
- Start at u = 001011011…
- Successor 001110101…
- u “owns” imaginary 00101******
- Target 1101011…
- Set imaginary start 001011101011…
- Only need to shift out 00101
- 5 hops, independent of b
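The example can be reproduced with a small sketch. The slide elides the low bits of every ID; purely for illustration, the values below are padded with zeros to 12 bits, and the number of kept prefix bits (`keep = 5`) is passed in by hand. The function names are hypothetical.

```python
def imaginary_start(u, k, b, keep):
    """Keep the top `keep` bits of u; fill the low b-keep bits with the top bits of k."""
    low = b - keep
    prefix = (u >> low) << low
    return prefix | (k >> keep)


def shift_in_rest(i, k, b, keep):
    """Apply the `keep` remaining de Bruijn shifts, feeding in the low `keep` bits of k."""
    mask = (1 << b) - 1
    for j in range(keep - 1, -1, -1):
        i = ((i << 1) | ((k >> j) & 1)) & mask
    return i


B = 12                      # assumed ID length for this illustration
U = 0b001011011000          # u = 001011011... (zero-padded, an assumption)
SUCC = 0b001110101000       # successor 001110101... (zero-padded)
K = 0b110101100000          # target 1101011... (zero-padded)

start = imaginary_start(U, K, B, keep=5)
print(f"{start:012b}")                       # 001011101011
print(U <= start < SUCC)                     # the start lies in u's block
print(shift_in_rest(start, K, B, 5) == K)    # 5 shifts reach the target
```

A real implementation would have to pick `keep` itself and confirm that the chosen start falls between u and its successor; the sketch only checks that for the slide's numbers.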
28
Summary
- Koorde uses 2 neighbors per node (one successor, one finger)
- And requires O(log n) routing hops with high probability
29
Variant: Koorde-K
- We used a binary de Bruijn network
- Generalizes to other base K:
[Figure: binary de Bruijn graph on nodes 000 through 111 next to a base-3 de Bruijn graph (visible labels include 002, 012, 020, 021, 022, 102, 112, 120, 121, 122)]
30
Analysis
- To represent n distinct node ids need log_K n base-K digits
- Suggests log_K n hops to route
- Same problem as Koorde: b >> log_K n
- Same solution: imaginary routing
  - Node u points at predecessor(Ku)
- Same analysis: K de Bruijn hops interspersed with successor hops
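The base-K generalization can be sketched on a complete graph, where every ID with the given number of base-K digits is a live node. The function name `route_base_k` and the example IDs are chosen here for illustration.

```python
def route_base_k(u, k, K, digits):
    """Route from u to k on a complete base-K de Bruijn graph.

    Each hop multiplies the current ID by K and shifts in the next digit of k,
    most significant digit first. Returns the list of nodes visited.
    """
    size = K ** digits
    path = [u]
    for j in range(digits - 1, -1, -1):
        digit = (k // K ** j) % K
        u = (K * u + digit) % size
        path.append(u)
    return path


# Base 3, three digits: route from 002 (decimal 2) to 121 (decimal 16).
print(route_base_k(2, 16, K=3, digits=3))  # [2, 7, 23, 16]
```

The path always has exactly `digits` hops, which is the log_K n figure the slide refers to when the graph is complete.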
31
Successor Hops
- Now de Bruijn hop multiplies ids by K
- So expect K nodes between finger and next imaginary node
- Implies K successor hops per de Bruijn hop
- Gives K log_K n hops: no good
- To avoid successor hops, u fingers predecessor(Ku) and following K nodes
  - Allows K successor hops by one finger
  - Gives O(log_K n) hops as desired
32
Summary
- Using K fingers per node, can achieve O(log_K n) = O(log n / log K) routing hops
- As discussed earlier, degree log n is necessary (and sufficient) for fault tolerance (and is degree of most previous systems)
- So, O(log n / log log n) hops
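The last step is a change of logarithm base with K = log n. The numeric case below (n = 2^16) is an illustration chosen here because it comes out exactly.

```latex
\[
  \log_K n \;=\; \frac{\log n}{\log K},
  \qquad
  K = \log n \;\Longrightarrow\; \log_K n \;=\; \frac{\log n}{\log\log n}.
\]
% Illustration with base-2 logarithms: n = 2^{16}, K = \log_2 n = 16.
\[
  \log_{16} 2^{16} \;=\; \frac{16}{\log_2 16} \;=\; \frac{16}{4} \;=\; 4
  \quad\text{digits, versus } 16 \text{ bits in the binary case.}
\]
```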
33
Summary: What do we Gain?
- Lower degree for same number of hops
  - Storage isn’t really an issue
  - But lower degree should translate into lower maintenance traffic
- Lower hop count for same degree
- And tunable
  - Other systems also have tunable hop count
  - But at low hop counts (high degree) their extra log factor in degree does matter
34
What do we lose?
- Chord is “self stabilizing”
  - From successors, can build entire routing system quickly by “pointer jumping” to find fingers
- Koorde is not
  - Given only successor pointers, no clear fast way to find fingers
  - Not a problem for joins, because joiner can use lookup to find its finger
  - But could be a problem if massive changes
35
More Info
http://www.pdos.lcs.mit.edu/chord/