Tapestry: Finding Nearby Objects in Peer-to-Peer Networks
Joint with: Ling Huang, Anthony Joseph, Robert Krauthgamer, John Kubiatowicz, Satish Rao, Sean Rhea, Jeremy Stribling, Ben Zhao
Why nearby? (DHT vs. DOLR)
Nearby = low stretch: the ratio of the distance traveled to find an object to the distance to the closest copy of that object
• Objects are services, so distance isn’t a one-time cost (see COMPASS)
• (Smart) publishers put objects at chosen locations in the network
  – Bob Miller places retreat schedule at node in Berkeley
• Wildly popular objects
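The stretch definition above can be written down as a one-line ratio. This is only an illustrative sketch: the function name `stretch` and the example distances are assumptions, not values from the talk.

```python
def stretch(distance_traveled, distance_to_closest_copy):
    """Ratio of the distance traveled to find an object to the
    distance to the closest copy of that object."""
    if distance_to_closest_copy <= 0:
        raise ValueError("distance to the closest copy must be positive")
    return distance_traveled / distance_to_closest_copy


# Hypothetical numbers: the query travels 30 units of network distance
# while the closest copy is 10 units away.
print(stretch(30.0, 10.0))  # 3.0
```

A stretch of 1 means the query went straight to the closest copy; a low-stretch scheme keeps this ratio small for every query.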
Outline
• Low stretch dynamic peer-to-peer network
• Tolerate failures in network
• Adapting to network variation
• Future work
Distributed Hash Tables
System          Neighbors  Motivating Structure  Hops
CAN, 2001       O(r)       grid                  O(r n^(1/r))
Chord, 2001     O(log n)   hypercube             O(log n)
Pastry, 2001    O(log n)   hypercube             O(log n)
Tapestry, 2001  O(log n)   hypercube             O(log n)
• These systems give
  – Guaranteed location
  – Join and leave algorithms
  – Load-balanced storage
• No stretch guarantees
Low Stretch Approaches
System                  Stretch  Space         Balanced  Metric
Awerbuch & Peleg, 1991  polylog  polylog       no        General
PRR, 1997               O(1)     O(log n)      yes       Special
Thorup-Zwick            O(k^2)   O(k n^(1/k))  yes       General
RRVV, 2001              polylog  polylog       yes       General
• Not dynamic
Tapestry is the first dynamic low-stretch scheme
Neighbor Table for “5471” (Octal)
Routing Level  1     2     3     4
Digit 0        0xxx  50xx  540x  5470
Digit 1        1xxx  51xx  541x  5471
Digit 2        2xxx  52xx  542x  5472
Digit 3        3xxx  53xx  543x  5473
Digit 4        4xxx  54xx  544x  Ø
Digit 5        5xxx  55xx  545x  5475
Digit 6        6xxx  56xx  546x  Ø
Digit 7        7xxx  57xx  547x  5477
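The neighbor table above can be sketched as a small prefix-routing structure. This is a minimal sketch under stated assumptions: the helper names (`shared_prefix_len`, `build_neighbor_table`, `next_hop`) and the list of known nodes are hypothetical, and each slot simply keeps the first matching node seen, whereas a low-stretch scheme would keep a nearby one. Holes (Ø on the slide) are returned as `None`; how the real system routes around them is not shown.

```python
BASE = 8     # octal digits, as on the slide
DIGITS = 4   # four-digit node IDs


def shared_prefix_len(a, b):
    """Number of leading digits that IDs a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def build_neighbor_table(me, known_nodes):
    """table[level][digit] is a node that shares `level` leading digits
    with `me` and has `digit` in the next position, or None for a hole.
    Levels are 0-indexed here; the slide numbers them 1 to 4."""
    table = [[None] * BASE for _ in range(DIGITS)]
    # A node fills the slots that match its own digits.
    for level in range(DIGITS):
        table[level][int(me[level], BASE)] = me
    for node in known_nodes:
        if node == me:
            continue
        level = shared_prefix_len(me, node)
        digit = int(node[level], BASE)
        if table[level][digit] is None:   # simplification: first seen wins
            table[level][digit] = node
    return table


def next_hop(table, me, dest):
    """Forward to a node that matches one more digit of `dest`."""
    if dest == me:
        return me
    level = shared_prefix_len(me, dest)
    return table[level][int(dest[level], BASE)]


# Hypothetical set of known nodes, chosen only for illustration.
known = ["5123", "5416", "5061", "5432", "5455", "5470", "1234"]
table = build_neighbor_table("5471", known)
print(next_hop(table, "5471", "5433"))  # 5432 (matches the prefix 543)
print(next_hop(table, "5471", "5474"))  # None (a hole in the table)
```

Each hop fixes at least one more digit of the destination ID, so a route takes at most as many hops as there are digits in an ID.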
Balancing Load
[Figure: nodes with NodeIDs 5123, 5471, 5416, 5061, 5432, 5455, and 5470, connected by links labeled 1 through 4]
Big Challenge: Joining Nodes
Theorem 1 [HKRZ02] When peer A is finished inserting, it knows about all relevant peers that have finished insertion.
Results
• Correctness, O(log n) insert & delete
  – Concurrent inserts in a lock-free fashion
• Neighbor-search routine
  – Required to keep low stretch
  – All low-stretch schemes do something like this
• Zhao, Huang, Stribling, Rhea, Joseph & Kubiatowicz (JSAC)
  – This works! Implemented algorithms
  – Measured performance
Neighbor Search
In growth-restricted networks (with no additional space!):
Theorem 2 [HKRZ02] Can find the nearest neighbor with high probability using O(log^2 n) messages
Theorem 3 [HKMR04] Can find the nearest neighbor, and the number of messages is O(log n) with high probability
Dealing with faults
• Multiple paths
  – Castro et al.
  – One failure along a path breaks that path
• Wide path
  – Paths must be faulty at the same place to break
• Exponential difference in the effect of width
• “Retrofit” Tapestry to do the latter in slightly malicious networks
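The contrast between the two approaches can be illustrated with a toy probability model. This sketch is not the analysis behind the talk's theorem: it assumes every node is bad independently with probability `p`, that a route has a fixed number of hops, and that both schemes use the same number of nodes per hop. The function names are hypothetical.

```python
def multiple_paths_failure(p, hops, k):
    """k independent single-node paths: the route fails only if
    every path contains at least one bad node."""
    path_ok = (1.0 - p) ** hops
    return (1.0 - path_ok) ** k


def wide_path_failure(p, hops, width):
    """One path with `width` nodes per hop: a hop fails only if all
    of its nodes are bad, and the route fails if any hop fails."""
    hop_ok = 1.0 - p ** width
    return 1.0 - hop_ok ** hops


# Compare the two schemes with 4 hops and 4 nodes per hop.
for p in (0.1, 0.2, 0.3):
    print(p, multiple_paths_failure(p, 4, 4), wide_path_failure(p, 4, 4))
```

In this model a wide path breaks only when all nodes at one position are bad, so its failure probability shrinks like `p ** width`, while each separate path breaks as soon as any single node on it is bad.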
[Figure: example routes, one labeled “Failed!” and one labeled “Still good…”]
Effective even for small overhead
Theorem 4 [Hildrum & Kubiatowicz, DISC02] In growth-restricted spaces, the probability of a failed route can be made less than 1/n^c for width O(c log n)
[Figure: % failed routes (0 to 100) versus fraction of bad nodes (0.1 to 0.5); six curves labeled 1 through 6]
Wide path vs. multiple paths
[Figure: failed paths (0 to 90) versus fraction of bad nodes (0 to 0.6); curves labeled “4” and “4 Single”]
Network not homogeneous
Previous schemes picked a digit size
• How do we find a good one?
• But what if there isn’t one?
[Figure: example locations San Francisco, Nebraska, and Paris]
New Result
• Pick digit size based on local measurements
• Don’t need to guess
• Vary digit size depending on location
  – No, it’s not obvious that this works, but it does!
Hildrum, Krauthgamer & Kubiatowicz [SPAA04]:
Dynamic, locally optimal low-stretch network
Conclusions and Future Work
Conclusion
– Low stretch object location is practical
  • System provably good [HKRZ02]
  • System built [ZHSJK]
Open Questions
– Do we need a DOLR?
  • Object placement schemes? Workload?
– Examples where low stretch, load balance, and low storage are not possible simultaneously
  • What is the tradeoff between degree, stretch, and load balance as a function of the graph?
  • Can we get the best possible? Trade off smoothly?