P2P Network Structured Networks: Distributed Hash Tables Pedro García López Universitat Rovira I Virgili [email protected]


P2P Network Structured Networks:

Distributed Hash Tables

Pedro García López

Universitat Rovira I Virgili

[email protected]


Index

• Description of CHORD’s Location and routing mechanisms

• Symphony: Distributed Hashing in a Small World


Description of CHORD’s Location and routing mechanisms

Vincent Matossian
October 12th 2001

ECE 579


Overview

Chord:

• Maps keys onto nodes in a 1D circular space
• Uses consistent hashing (D. Karger, E. Lehman)
• Aimed at large-scale peer-to-peer applications

Talk:

• Consistent hashing
• Algorithm for key location
• Algorithm for node joining
• Algorithm for stabilization
• Failures and replication


Consistent hashing

• Distributed caches to relieve hotspots on the web

• Node identifier hash = hash(IP address)
• Key identifier hash = hash(key)
• Designed to let nodes enter and leave the network with minimal disruption

In Chord, the hash function is SHA-1 (Secure Hash Algorithm 1)

A key is stored at its successor: the first node whose ID equals or follows the key’s ID on the circle
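The mapping above can be sketched in a few lines of Python. This is only an illustration: the names `chord_id` and `successor` and the 16-bit identifier space `M` are assumptions made here to keep the numbers small (Chord itself uses the full SHA-1 output).

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits; an assumption for this sketch, not Chord's real size


def chord_id(value: str, m: int = M) -> int:
    """Hash a string (an IP address or a key) onto the 2^m identifier circle."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(ident: int, node_ids: list) -> int:
    """Return the first node ID equal to or following ident, wrapping past zero."""
    ring = sorted(node_ids)
    i = bisect_left(ring, ident)   # index of the first node ID >= ident
    return ring[i % len(ring)]     # wrap to the smallest ID if none follows
```

With nodes 0, 1 and 3 on a 3-bit circle, `successor(2, [0, 1, 3])` gives 3 and `successor(6, [0, 1, 3])` wraps around to 0.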


Key Location

• Finger tables speed up key location by providing more routing information than the successor pointer alone

Notation              Definition
finger[k].start       (n + 2^(k-1)) mod 2^m, 1 <= k <= m
finger[k].interval    [finger[k].start, finger[k+1].start)
finger[k].node        first node >= n.finger[k].start
successor             the next node on the identifier circle; finger[1].node
predecessor           the previous node on the identifier circle

k is the finger table index
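As a rough illustration of these definitions, the sketch below builds the (start, node) pairs of a finger table from a global list of node IDs. The function name `finger_table` and the use of global knowledge are assumptions for the example; a real Chord node fills its table through lookups.

```python
def finger_table(n: int, node_ids: list, m: int) -> list:
    """Return [(finger[k].start, finger[k].node)] for k = 1..m at node n."""
    ring = sorted(node_ids)
    size = 2 ** m
    table = []
    for k in range(1, m + 1):
        start = (n + 2 ** (k - 1)) % size
        # First node at or after start; wrap to the smallest ID if none exists.
        node = next((x for x in ring if x >= start), ring[0])
        table.append((start, node))
    return table
```

For the three-node example with m = 3, `finger_table(1, [0, 1, 3], 3)` yields starts 2, 3, 5 pointing at nodes 3, 3, 0.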


Lookup(id)

Finger tables and key locations with nodes 0,1,3 and keys 1,2 and 6

Finger table for Node 1


Lookup PseudoCode

To find the successor of an id:

Chord returns the successor of the closest preceding finger to this id.

Finding successor of identifier 1
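The lookup can be sketched as a single-process simulation. The function names follow the usual Chord naming (find_successor, find_predecessor, closest_preceding_finger); the `Node` class, the `build` helper, the dictionary standing in for the network and the hop counter are assumptions added for illustration, and finger tables are filled from global knowledge rather than by the join protocol.

```python
M = 3            # identifier bits for the small example ring
SIZE = 2 ** M


def between(x, a, b, incl_right=False):
    """True if x is in the circular interval (a, b), or (a, b] if incl_right."""
    if incl_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b   # the interval wraps past zero


class Node:
    def __init__(self, ident, all_ids):
        self.id = ident
        ring = sorted(all_ids)
        self.fingers = []
        for k in range(1, M + 1):
            start = (ident + 2 ** (k - 1)) % SIZE
            self.fingers.append(next((x for x in ring if x >= start), ring[0]))
        self.successor = self.fingers[0]   # finger[1].node


def build(all_ids):
    """Hypothetical helper: a dict of nodes standing in for the network."""
    return {i: Node(i, all_ids) for i in all_ids}


def closest_preceding_finger(node, ident):
    """Scan fingers from farthest to nearest for one strictly between node and ident."""
    for f in reversed(node.fingers):
        if between(f, node.id, ident):
            return f
    return node.id


def find_predecessor(net, start_id, ident):
    n, hops = net[start_id], 0
    while not between(ident, n.id, n.successor, incl_right=True):
        n = net[closest_preceding_finger(n, ident)]
        hops += 1
    return n.id, hops


def find_successor(net, start_id, ident):
    pred, hops = find_predecessor(net, start_id, ident)
    return net[pred].successor, hops


net = build([0, 1, 3])
print(find_successor(net, 3, 1))   # (1, 1): node 1, reached after one hop
```

Each step either stops or jumps to the farthest finger that still precedes the target, which is what shrinks the remaining distance quickly.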


Lookup cost

• The finger pointers lie at repeatedly doubling distances around the circle, so each iteration of the loop in find_predecessor halves the distance to the target identifier. In an N-node network, the number of messages per lookup is O(log N) with high probability.


Node Join/Leave

Finger tables and key locations after Node 6 joins

Finger tables and key locations after Node 3 leaves

Changed values are in black, unchanged in gray


Join PseudoCode

Three steps:

1- Initialize finger and predecessor of new node n

2- Update finger and predecessor of existing nodes to reflect the addition of n

n becomes the ith finger of node p if:

• p precedes n by at least 2^(i-1)

• the ith finger of node p succeeds n

3- Transfer state associated with keys that node n is now responsible for

New node n only needs to contact the node that immediately follows it (its successor) to transfer responsibility for all relevant keys
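Step 3 can be sketched as follows. The function `transfer_keys`, the dictionary used as the successor's key store, and the extra key 7 in the usage note are hypothetical, chosen only to show which keys move when n joins between its predecessor and its successor.

```python
def in_interval(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b   # the interval wraps past zero


def transfer_keys(successor_store: dict, pred_id: int, new_id: int) -> dict:
    """Move the keys in (pred_id, new_id] from the successor's store to the new node."""
    moved = {k: v for k, v in successor_store.items()
             if in_interval(k, pred_id, new_id)}
    for k in moved:
        del successor_store[k]   # the successor gives up responsibility
    return moved
```

In the three-node example, node 6 joins between node 3 and node 0. If node 0 holds keys 6 and 7, `transfer_keys(store, 3, 6)` moves key 6 to the new node and leaves key 7 with node 0.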


Join/leave cost

Number of nodes that need to be updated when a node joins: O(log N)

Finding and updating those nodes takes O(log^2 N) messages


Stabilization

• If nodes join and stabilization has not completed, 3 cases are possible:
– finger tables are current --> lookup successful
– successors valid, fingers not --> lookup successful (because find_successor succeeds) but slower
– successors are invalid or data hasn’t migrated --> lookup fails


Stabilization cont’d

[Figure: node n joining the circle between np and ns]

Node n joins

n acquires ns as successor

np runs stabilize:

• asks ns for its predecessor (n)

• np acquires n as its successor

• np notifies n, which acquires np as predecessor

Predecessors and successors are correct
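A minimal in-memory sketch of this sequence is given below. The `stabilize` and `notify` names follow the Chord protocol; the `Node` class, direct object references in place of remote calls, and the IDs 1, 3 and 5 are assumptions for the example. The new node is handed its successor directly instead of finding it through a lookup.

```python
def between(x, a, b):
    """True if x lies in the open circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def join(self, known_successor):
        """n acquires ns as successor; the predecessor is learned later."""
        self.predecessor = None
        self.successor = known_successor

    def stabilize(self):
        """Ask the successor for its predecessor and adopt it if it sits in between."""
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        """candidate believes it might be this node's predecessor."""
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate


# Existing two-node ring: np (id 1) and ns (id 5).
n_p, n_s = Node(1), Node(5)
n_p.successor, n_p.predecessor = n_s, n_s
n_s.successor, n_s.predecessor = n_p, n_p

n = Node(3)
n.join(n_s)       # n acquires ns as successor
n.stabilize()     # ns learns that n is now its predecessor
n_p.stabilize()   # np asks ns for its predecessor, finds n, adopts and notifies it
```

After these calls, np points to n, n points to ns, and the predecessor pointers mirror them.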


Failures and replication

• The key step in failure recovery is maintaining correct successor pointers

• Each node maintains a successor-list of its r nearest successors

• Tracking its r successors lets a Chord node inform the higher-layer software when successors come and go, and therefore when it should propagate new replicas
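The successor-list idea can be illustrated with a short sketch. The helper names `successor_list` and `first_live_successor`, and the `is_alive` callback standing in for a liveness probe, are assumptions for this example.

```python
def successor_list(n: int, node_ids: list, r: int) -> list:
    """Return the r nearest successors of node n on the identifier circle."""
    ring = sorted(node_ids)
    i = ring.index(n)
    count = min(r, len(ring) - 1)   # a node is never its own backup
    return [ring[(i + j) % len(ring)] for j in range(1, count + 1)]


def first_live_successor(successors: list, is_alive):
    """Skip failed entries and return the first successor that still responds."""
    for node_id in successors:
        if is_alive(node_id):
            return node_id
    return None   # all r successors failed at once
```

With nodes 0, 1, 3, 6 and r = 2, node 1 keeps the list [3, 6]; if node 3 fails, node 1 falls back to node 6.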


Symphony: Distributed Hashing in a Small World

Gurmeet Singh Manku
Stanford University

with Mayank Bawa and Prabhakar Raghavan


DHTs: The Big Picture

Load Balance

“How do we splice the hash table evenly?”
Nodes choose their ID in the hash space uniformly at random.

Caching, Hotspots, Fault Tolerance, Replication, ...

Topology Establishment

“How do we route with small state per node?”

Deterministic <------> Randomized
(CAN/Chord)   (Pastry/Tapestry)   (Symphony)


Spectrum of DHT Protocols

Protocol    #links      latency           Topology
CAN         O(log n)    O(log n)          Deterministic
Chord       O(log n)    O(log n)          Deterministic
Viceroy     O(1)        O(log n)          Partly randomized
Tapestry    O(log n)    O(log n)          Partly randomized
Pastry      O(log n)    O(log n)          Partly randomized
Symphony    2k+2        O((log^2 n)/k)    Completely randomized


Symphony in a Nutshell

Nodes arranged in a unit circle (perimeter = 1)

Arrival --> Node chooses position along circle uniformly at random

Each node has 1 short link (next node on circle) and k long links

[Figure: a typical Symphony network; legend: node, long link, short link]

Fault Tolerance: No backups for long links! Only short links are fortified for fault tolerance.

Adaptation of Small World Idea: [Kleinberg00]
Long links chosen from a probability distribution function: p(x) = 1 / (x log n), where n = #nodes.

Simple greedy routing: “Forward along that link that minimizes the absolute distance to the destination.”

Average lookup latency = O((log^2 n) / k) hops
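The link distribution and the greedy rule can be sketched as below. The sampler inverts the CDF of p(x) = 1 / (x ln n) on [1/n, 1]; taking the logarithm as natural and restricting x to that range are assumptions of this sketch, as are the function names.

```python
import math
import random


def draw_long_link_distance(n: int, rng: random.Random) -> float:
    """Sample x in [1/n, 1) with density 1 / (x ln n) by inverting the CDF.

    The CDF is F(x) = 1 + ln(x) / ln(n), so x = n ** (u - 1) for uniform u.
    """
    return math.exp(math.log(n) * (rng.random() - 1.0))


def circ_dist(a: float, b: float) -> float:
    """Absolute distance between two positions on the unit circle."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def greedy_next_hop(neighbors: list, dest: float) -> float:
    """Forward along the link whose endpoint is closest to the destination."""
    return min(neighbors, key=lambda p: circ_dist(p, dest))


rng = random.Random(42)
n = 1024
# Hypothetical node at position 0.3 drawing k = 4 long links clockwise.
long_links = [(0.3 + draw_long_link_distance(n, rng)) % 1.0 for _ in range(4)]
```

Short distances are drawn far more often than long ones, yet every scale between 1/n and 1 gets the same share of probability, which is what lets greedy routing make steady progress.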


Network Size Estimation Protocol

Problem: What is the current value of n, the total number of nodes?

x = Length of arc
1/x = Estimate of n

(Idea from Viceroy)

- 3 arcs are enough.
- Re-linking Protocol not worthwhile.
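A small sketch of the estimate follows. The slide gives 1/x for a single arc; combining s arcs as s divided by their total length is an assumption made here to match the "3 arcs" remark, and `estimate_n` is a hypothetical name.

```python
def estimate_n(arc_lengths: list) -> float:
    """Estimate the number of nodes from the lengths of a few adjacent arcs.

    With one arc of length x this reduces to 1 / x.
    """
    return len(arc_lengths) / sum(arc_lengths)
```

If 8 nodes were spread evenly, each arc would have length 0.125 and `estimate_n([0.125, 0.125, 0.125])` would return 8.0. With random positions the estimate is noisy, and using a few arcs instead of one smooths it.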


Intuition Behind Symphony’s PDF

[Figure: panels plotting the probability distribution (y-axis) against the distance to the long distance neighbour (x-axis, ticks 0, ¼, ½, 1); the labelled panels are Chord and Symphony]


Step 0: Symphony

p(x) = 1 / (x log n)

Symphony: “Draw from the PDF log n times”

[Figure: probability distribution (y-axis) against distance to long distance neighbour (x-axis, ticks 0, ¼, ½, 1)]


Step 1: Step-Symphony

p(x) = 1 / (x log n)

Step-Symphony: “Draw from the discretized PDF log n times”

[Figure: discretized probability distribution (y-axis) against distance to long distance neighbour (x-axis, ticks 0, ¼, ½, 1)]


Step 2: Divide PDF into log n Equal Bins

Step-Partitioned-Symphony: “Draw exactly once from each of log n bins”

[Figure: probability distribution divided into log n bins (y-axis) against distance to long distance neighbour (x-axis, ticks 0, ¼, ½, 1)]


Step 3: Discrete PDF

Chord: “Draw exactly once from each of log n bins”
Each bin is essentially a point.

[Figure: discrete probability distribution (y-axis) against distance to long distance neighbour (x-axis, ticks 0, ¼, ½, 1)]
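The progression from Symphony to Chord can be checked numerically. The sketch assumes that the log n bins are bounded by powers of two, [2^-(i+1), 2^-i], and that n is a power of two; the slides do not state the bin edges, so this is one plausible reading. Under p(x) = 1 / (x ln n), each such bin carries the same probability mass, and Chord's finger distances are the bin edges.

```python
import math


def bin_mass(lo: float, hi: float, n: int) -> float:
    """Probability that a draw from p(x) = 1 / (x ln n) lands in [lo, hi]."""
    return (math.log(hi) - math.log(lo)) / math.log(n)


def power_of_two_bins(n: int) -> list:
    """Split [1/n, 1] into log2(n) bins; n is assumed to be a power of two."""
    count = int(round(math.log2(n)))
    return [(2.0 ** -(i + 1), 2.0 ** -i) for i in range(count)]


n = 1024
masses = [bin_mass(lo, hi, n) for lo, hi in power_of_two_bins(n)]
# Every bin holds 1 / log2(n) of the total mass, and the masses sum to 1.
```

Drawing once per bin gives one link per distance scale; shrinking each bin to its edge gives Chord's deterministic fingers at distances 1/2, 1/4, and so on.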


Two Optimizations

Bi-directional Routing

- Exploit both outgoing and incoming links!
- Route to the neighbor that minimizes absolute distance to destination
- Reduces avg latency by 25-30%

1-Lookahead

- List of neighbor’s neighbors
- Reduces avg latency by 40%
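One way to picture 1-lookahead is sketched below: each node knows its neighbors' neighbor lists and forwards to the neighbor through which the closest position can be reached. The `links` dictionary, the function names and the tie-breaking are assumptions of this example and may differ from the rule used in the paper.

```python
def circ_dist(a: float, b: float) -> float:
    """Absolute distance between two positions on the unit circle."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def lookahead_next_hop(current: float, links: dict, dest: float):
    """Pick the neighbor u whose own position or neighbor list gets closest to dest."""
    best_hop, best_dist = None, float("inf")
    for u in links[current]:
        # Positions reachable through u: u itself plus u's known neighbors.
        reachable = [u] + list(links.get(u, []))
        d = min(circ_dist(w, dest) for w in reachable)
        if d < best_dist:
            best_hop, best_dist = u, d
    return best_hop


# Hypothetical link lists keyed by node position.
links = {
    0.0: [0.1, 0.5],
    0.1: [0.2, 0.72],
    0.5: [0.6, 0.9],
}
```

For destination 0.7, plain greedy routing from 0.0 would pick 0.5, while the lookahead picks 0.1 because that neighbor has a link to 0.72.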


Latency vs State Maintenance

[Figure: average latency (y-axis, ticks 5, 10, 15) against # TCP connections (x-axis, 0 to 60) for a network of n = 2^15 nodes. Single points are plotted for Viceroy, CAN, Pastry, Chord and Tapestry; Symphony appears as curves, including the variants “+ Bidirectional Links” and “+ 1-Lookahead”.]

Many more graphs in the paper.


Why Symphony?

1. Low state maintenance
Low degree --> Fewer pings/keep-alives, less control traffic
Low degree --> Distributed locking and coordination overhead over smaller sets of nodes
Low degree --> Smaller bootstrapping time when a node joins, smaller recovery time when a node leaves

2. Fault tolerance
Only short links are bolstered. No backups for long links!

3. Smooth out-degree vs latency tradeoff
Only protocol that offers this tuning knob even at run time! Out-degree is not fixed at runtime, or as a function of network size.

4. Flexibility and support for heterogeneity
Different nodes can have different #links!