Koorde: A Simple Degree Optimal DHT

Frans Kaashoek, David Karger
MIT

Brought to you by the IRIS project

DHT Routing

Distributed hash tables
  Implement hash table interface
  Map any ID to the machine responsible for that ID (in a consistent fashion)
  Standard primitive for P2P
Machines not all aware of each other
  Each tracks small set of “neighbors”
  Route to responsible node via sequence of “hops” to neighbors

Performance Measures

Degree
  How many neighbors nodes have
Hop count
  How long to reach any destination node
Fault tolerance
  How many nodes can fail
Maintenance overhead
  E.g., making sure neighbors are up
Load balance
  How evenly keys distribute among nodes

Tradeoffs

With larger degree, hope to achieve
  Smaller hop count
  Better fault tolerance
But higher degree implies
  More routing table state per node
  Higher maintenance overhead to keep routing tables up to date
Load balance “orthogonal issue”

Current Systems

Chord, Kademlia, Pastry, Tapestry
  O(log n) degree
  O(log n) hop count
  O(log n) ratio load balance
Chord: O(1) load balance with O(log n) “virtual nodes” per real node
  Multiplies degree to O(log^2 n)

Outliers

CAN
  Degree d
  O(d n^(1/d)) hops
Viceroy
  O(log n) hop count
  Constant average degree
  But some nodes have degree log n

Lower Bounds to Shoot For

Theorem: if max degree is d, then hop count is at least log_d n
  Proof: < d^h nodes at distance h
  Allows degree O(1) and O(log n) hops
  Or degree O(log n) and O(log n / log log n) hops
Theorem: to tolerate half of the nodes failing (e.g. net partition), need degree Ω(log n)
  Pf: if less, some node loses all neighbors
  Might as well take O(log n / log log n) hops!
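The counting argument behind the first theorem can be checked numerically. The following is only an illustrative sketch: the name `min_hops` is an assumption, not something from the talk. It returns the smallest h for which a network of maximum degree d could possibly reach n nodes within h hops.

```python
def min_hops(n, d):
    """Smallest h with 1 + d + d^2 + ... + d^h >= n.

    With maximum degree d, at most d^j nodes sit at distance exactly j,
    so no routing scheme can reach n nodes in fewer hops than this.
    """
    if n < 1 or d < 1:
        raise ValueError("n and d must be positive")
    reachable, frontier, h = 1, 1, 0
    while reachable < n:
        frontier *= d          # upper bound on nodes at distance h + 1
        reachable += frontier
        h += 1
    return h


print(min_hops(1024, 2))    # 10: constant degree forces about log n hops
print(min_hops(1024, 10))   # 3: degree near log n allows far fewer hops
```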

Koorde

New routing protocol
Shares almost all aspects with Chord
But meets (to within constant factor) all lower bounds just mentioned:
  Degree 2 and O(log n) hops
  Or degree log n and O(log n / log log n) hops and fault tolerant
Like Chord, O(log n) load balance
  or constant with O(log n) times degree

Chord Review

Chord consists of
  Consistent hashing to assign IDs to nodes
    Good load balance
  Efficient routing protocol to find right node
  Fast join/leave protocol
    Few data items shifted
  Fault tolerance to half of nodes failing
  Efficient maintenance over time

■ Koorde routing protocol to find right node

Consistent Hashing

[Figure: identifier ring with points labeled 0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60, and a document with hash 49]

Assign ID to “successor” node on ring
Assign doc with hash 49 to node 51
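The successor rule on this slide can be sketched in a few lines. This is a minimal illustration, assuming the labels on the ring figure are the node IDs; the names `NODES` and `successor` are made up for the example.

```python
from bisect import bisect_left

# Node IDs as read off the slide's ring figure (an assumption about the figure).
NODES = [0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]


def successor(nodes, key):
    """Return the first node ID >= key, wrapping past the largest ID."""
    ring = sorted(nodes)
    idx = bisect_left(ring, key)
    return ring[idx % len(ring)]


print(successor(NODES, 49))  # 51, matching the slide's example
```

A key that hashes past the largest node ID wraps around to the smallest one, which is what makes the ID space a ring.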

Chord Routing

Each node keeps successor pointer
Also keeps power-of-two “fingers”
  neighbors providing shortcuts
So log n fingers

[Figure: identifier ring with points labeled 0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]
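As a rough sketch of what the power-of-two fingers look like, the snippet below computes them for the ring in the figure. The node list and the 6-bit identifier length `M` are assumptions chosen so that every label in the figure fits on the ring; the function names are illustrative.

```python
from bisect import bisect_left

NODES = [0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]  # assumed from the figure
M = 6  # assumed identifier length: a 64-point ring holds every label above


def successor(nodes, key):
    """First node ID >= key, wrapping around the ring."""
    ring = sorted(nodes)
    return ring[bisect_left(ring, key) % len(ring)]


def fingers(nodes, u, m=M):
    """Finger j of node u is the successor of (u + 2^j) mod 2^m."""
    return [successor(nodes, (u + (1 << j)) % (1 << m)) for j in range(m)]


print(fingers(NODES, 0))   # [6, 6, 6, 13, 18, 36]
```

Several low-order fingers coincide with the plain successor, so the number of distinct neighbors is about log n rather than the full identifier length.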

Chord Lookups

[Figure: identifier ring with points labeled 0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]

Koorde Idea

Chord acts like a hypercube
  Fingers flip one bit
  Degree log n (log n different flips)
  Diameter log n
Koorde uses a de Bruijn network
  Fingers shift in one bit
  Degree 2 (2 possible bits to shift in)
  Diameter log n

De Bruijn Graph

Nodes are b-bit integers (b = log n)
Node u has 2 neighbors (bit shifts): 2u mod 2^b and 2u+1 mod 2^b

[Figure: binary de Bruijn graph on the eight 3-bit nodes 000 through 111; each edge is labeled with the bit (0 or 1) that is shifted in]
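The neighbor rule can be written out directly. The helper name `debruijn_neighbors` is an assumption for this sketch; it lists the two out-edges of every node in the 3-bit graph from the figure.

```python
def debruijn_neighbors(u, b):
    """Out-neighbors of b-bit node u: shift left, then append 0 or 1."""
    size = 1 << b
    return (2 * u) % size, (2 * u + 1) % size


for u in range(8):
    n0, n1 = debruijn_neighbors(u, 3)
    print(f"{u:03b} -> {n0:03b}, {n1:03b}")
```

Taking the result mod 2^b is what drops the old top bit, so each hop replaces one bit of the node's identifier.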

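De Bruijn Routing

Shift in destination bits one by one
b hops complete route
Route from 000 to 110: 000, 001, 011, 110 (shifting in 1, 1, 0)

[Figure: binary de Bruijn graph on the eight 3-bit nodes 000 through 111; each edge is labeled with the bit (0 or 1) that is shifted in]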

Routing Code

Procedure u.LOOKUP(k, toShift)
  /* u is machine, k is target key;
     toShift is target bits not yet shifted in */
  if k = u then
    return u                          /* as owner for k */
  else                                /* do de Bruijn hop */
    t = u ∘ topBit(toShift)
    return t.LOOKUP(k, toShift << 1)

Initially call self.LOOKUP(k, k)
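A runnable version of this procedure, for a complete de Bruijn graph in which every b-bit ID is a live node, might look like the sketch below. The iterative form and the name `debruijn_route` are assumptions made for illustration; the operation written `u ∘ topBit(toShift)` is read here as "shift u left by one bit and append the top bit of toShift".

```python
def debruijn_route(u, k, b):
    """Return the list of nodes visited when routing from u to k."""
    mask = (1 << b) - 1
    to_shift = k                      # target bits not yet shifted in
    path = [u]
    while u != k:                     # u is the owner once it equals k
        top = (to_shift >> (b - 1)) & 1
        u = ((u << 1) | top) & mask   # de Bruijn hop: shift in the top bit
        to_shift = (to_shift << 1) & mask
        path.append(u)
    return path


print([f"{v:03b}" for v in debruijn_route(0b000, 0b110, 3)])
# ['000', '001', '011', '110']
```

After b hops every original bit of u has been shifted out, so the loop always stops within b iterations.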

Summary

Each node has 2 outgoing neighbors
  Also two incoming
  Can show good routing load balance
Need b = log n bits for n distinct nodes
  So log n hops to route

Problems to Solve

Want b-bit ring, b >> log n, to avoid colliding identifiers as nodes join
  Implies use b >> log n hops
  Worse, most nodes not present to route!
Solutions
  Imaginary routing: present nodes simulate routing actions of absent nodes
  Short cuts: use gaps to start route with most of destination bits already shifted in

Imaginary routing

Node u holds two pointers
  Successor on ring
  One finger: predecessor of 2u (mod 2^b)
    On sparse ring, is also predecessor of 2u+1
    So handles both de Bruijn edges
Node u “owns” all imaginary nodes between self and (real) successor
Simulates de Bruijn routing from those imaginary nodes to others by forwarding to the others’ real owners

Code

Procedure u.LOOKUP(k, toShift, i)
  if k ∈ (u, u.successor] then
    return u.successor                /* as bucket for k */
  else if i ∈ (u, u.successor] then
    /* i belongs to u; do de Bruijn hop */
    return u.finger.LOOKUP(k, toShift << 1, i ∘ topBit(toShift))
  else
    /* i doesn’t belong to u; forward it */
    return u.successor.LOOKUP(k, toShift, i)

Initially call self.LOOKUP(k, k, self)
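The lookup can be simulated on a sparse ring to see imaginary routing at work. The sketch below is not the authors' implementation, and it makes three labeled assumptions: node u owns the imaginary nodes in [u, successor), so the initial call with i = self can hop immediately; the finger is the last real node at or before 2u mod 2^b; and a `key == u` base case is added. The class and helper names are invented for the example.

```python
from bisect import bisect_left, bisect_right


def in_open_closed(x, a, b, size):
    """True if x lies in the ring interval (a, b]; a == b covers the whole ring."""
    span = (b - a) % size or size
    return 0 < ((x - a) % size or size) <= span


def in_closed_open(x, a, b, size):
    """True if x lies in the ring interval [a, b); a == b covers the whole ring."""
    span = (b - a) % size or size
    return (x - a) % size < span


class KoordeRing:
    def __init__(self, node_ids, bits):
        self.bits = bits
        self.size = 1 << bits
        self.nodes = sorted(set(node_ids))

    def successor(self, u):
        """Next real node clockwise from node u."""
        return self.nodes[bisect_right(self.nodes, u) % len(self.nodes)]

    def finger(self, u):
        """Last real node at or before 2u mod 2^b (index -1 wraps to the last node)."""
        target = (2 * u) % self.size
        return self.nodes[bisect_right(self.nodes, target) - 1]

    def owner(self, key):
        """Ground truth for checking: first real node at or after key."""
        return self.nodes[bisect_left(self.nodes, key) % len(self.nodes)]

    def lookup(self, start, key):
        """Return (bucket for key, number of hops taken)."""
        u, to_shift, i, hops = start, key, start, 0
        mask = self.size - 1
        while True:
            succ = self.successor(u)
            if key == u:                          # assumption: added base case
                return u, hops
            if in_open_closed(key, u, succ, self.size):
                return succ, hops                 # succ is the bucket for key
            if in_closed_open(i, u, succ, self.size):
                # i belongs to u: de Bruijn hop through the finger
                top = (to_shift >> (self.bits - 1)) & 1
                i = ((i << 1) | top) & mask
                to_shift = (to_shift << 1) & mask
                u = self.finger(u)
            else:
                u = succ                          # i is not ours: forward it
            hops += 1


ring = KoordeRing([1, 4, 6], bits=3)
print(ring.lookup(1, 5))   # (6, 3): two de Bruijn hops, then one successor hop
```

The finger only affects how many successor steps are needed. The result is correct for any finger choice, because successor forwarding always finds the owner of the imaginary node.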

True route tracks imaginary

[Figure: true route and imaginary route between start and target; labels: successor, finger (< double), imaginary (double)]

Correctness

Once b de Bruijn steps happen, done
  At this point, i = k
  Will follow successors to bucket for k
Successor steps delay de Bruijn steps, but not forever
  After finite number of successor steps, reach predecessor of i
Conclude: all necessary de Bruijn steps happen in finite time. So correct.

How long?

Only b de Bruijn steps
Just bound (expected) number of successor steps per de Bruijn step
Nodes randomly distributed on ring
  So node expects to own size 1/n interval
  So distance to imaginary node on de Bruijn step is 1/n
  De Bruijn step doubles everything, makes distance 2/n
  Expect 2 nodes in interval of that size

Few Successor Steps

[Figure: ring segment between start and target; labeled distances 1/n and < 2/n]

Summary

Each de Bruijn hop followed by 2 successor hops (in expectation)
b de Bruijn hops
Conclude 2b successor hops so 3b hops in total
Expectation argument extends to “with high probability” argument (same bounds)
Remaining problem: b >> log n, too big

Exploit Address Blocks

Only n real nodes
Each owns ~1/n “block” of keyspace
Within that block, only top log n bits “significant”; low bits arbitrary
So set low bits to high bits of target
Then just have to shift out log n most significant bits
So log n de Bruijn hops
So O(log n) hops in total

Example

Start at u = 001011011…
Successor 001110101…
u “owns” imaginary 00101******
Target 1101011…
Set imaginary start 001011101011…
Only need to shift out 00101
  5 hops, independent of b
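The example can be replayed with concrete numbers. The slide elides the low-order bits, so the 12-bit values below are hypothetical completions chosen only to agree with the prefixes shown; the function names and the explicit prefix length `p` are likewise assumptions for this sketch.

```python
def imaginary_start(u, k, b, p):
    """Keep the top p bits of u; fill the low b - p bits with the top bits of k."""
    low = b - p
    return ((u >> low) << low) | (k >> p)


def shift_in_remaining(i, k, b, p):
    """Apply p de Bruijn shifts, feeding in the low p bits of k, top bit first."""
    mask = (1 << b) - 1
    to_shift = (k << (b - p)) & mask   # bits of k not yet present in i
    for _ in range(p):
        top = (to_shift >> (b - 1)) & 1
        i = ((i << 1) | top) & mask
        to_shift = (to_shift << 1) & mask
    return i


# Hypothetical 12-bit completions of the slide's truncated bit strings.
B, P = 12, 5
U = 0b001011011000
SUCC = 0b001110101000
K = 0b110101110110

start = imaginary_start(U, K, B, P)
print(f"{start:012b}")                           # 001011101011
print(U <= start < SUCC)                         # True: start lies in u's block
print(shift_in_remaining(start, K, B, P) == K)   # True after 5 shifts
```

Only the 5-bit prefix 00101 has to be shifted out, which is why the hop count depends on the prefix length rather than on b.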

Summary

Koorde uses 2 neighbors per node (one successor, one finger)
And requires O(log n) routing hops with high probability

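Variant: Koorde-K

We used a binary de Bruijn network
Generalizes to other base K:

[Figure: binary de Bruijn graph on nodes 000 through 111 (edge labels 0, 1) beside a base-3 de Bruijn graph with node labels including 002, 012, 020, 021, 022, 102, 112, 120, 121, 122 (edge label 2)]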
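The same shift-in routing works digit by digit in base K. The sketch below assumes a complete base-K de Bruijn graph with a fixed number of digits; `route_base_k` is an illustrative name, and node IDs are handled as integers whose base-K digits are the labels in the figure.

```python
def route_base_k(u, k, K, digits):
    """Route from u to k by shifting in the base-K digits of k, top digit first."""
    size = K ** digits
    top_weight = K ** (digits - 1)
    to_shift = k                        # digits of k not yet shifted in
    path = [u]
    while u != k:
        top = to_shift // top_weight    # most significant remaining digit
        u = (u * K + top) % size        # base-K de Bruijn hop
        to_shift = (to_shift * K) % size
        path.append(u)
    return path


# Base 3, three digits: route from 000 to 212 (decimal 23).
print(route_base_k(0, 23, 3, 3))   # [0, 2, 7, 23], i.e. 000, 002, 021, 212
```

With K = 2 this reduces to the binary shift used in the earlier slides.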

Analysis

To represent n distinct node ids need log_K n base-K digits
  Suggests log_K n hops to route
Same problem as Koorde: b >> log_K n
Same solution: imaginary routing
  Node u points at predecessor(Ku)
Same analysis: log_K n de Bruijn hops interspersed with successor hops

Successor Hops

Now de Bruijn hop multiplies ids by K
  So expect K nodes between finger and next imaginary node
  Implies K successor hops per de Bruijn hop
  Gives K log_K n hops, no good
To avoid successor hops, u fingers predecessor(Ku) and following K nodes
  Allows K successor hops by one finger
  Gives O(log_K n) hops as desired

Summary

Using K fingers per node, can achieve O(log_K n) = O(log n / log K) routing hops
As discussed earlier, degree log n is necessary (and sufficient) for fault tolerance (and is degree of most previous systems)
So, O(log n / log log n) hops
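To get a feel for the trade-off, the following sketch counts base-K de Bruijn hops for a few degrees. It is an illustration under the slides' analysis only: it counts de Bruijn hops, ignores constant factors and successor hops, and the name `debruijn_hops` is an assumption.

```python
def debruijn_hops(n, K):
    """Smallest h with K^h >= n, i.e. ceil(log_K n) without floating point."""
    if n < 1 or K < 2:
        raise ValueError("need n >= 1 and K >= 2")
    h, reach = 0, 1
    while reach < n:
        reach *= K
        h += 1
    return h


n = 2 ** 20
for K in (2, 4, 20):
    print(K, debruijn_hops(n, K))
# 2 20
# 4 10
# 20 5
```

For a million nodes, raising the degree from 2 to about log n = 20 cuts the de Bruijn hop count from 20 to 5.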

Summary: What do we Gain?

Lower degree for same number of hops
  Storage isn’t really an issue
  But lower degree should translate into lower maintenance traffic
Lower hop count for same degree
  And tunable
  Other systems also have tunable hop count
  But at low hop counts (high degree) their extra log factor in degree does matter

What do we lose?

Chord is “self stabilizing”
  From successors, can build entire routing system quickly by “pointer jumping” to find fingers
Koorde is not
  Given only successor pointers, no clear fast way to find fingers
  Not a problem for joins, because joiner can use lookup to find its finger
  But could be a problem if massive changes

More Info

http://www.pdos.lcs.mit.edu/chord/