Transcript
Page 1: Peer-to-Peer Structured Overlay Networks Antonino Virgillito

Peer-to-Peer Structured Overlay Networks

Antonino Virgillito

Page 2:

Background

Peer-to-peer systems

• distribution

• symmetry (communication, node roles)

• decentralized control

• self-organization

• dynamicity

Page 3:

Data Lookup in P2P Systems

• Data items spread over a large number of nodes

• Which node stores which data item?

• A lookup mechanism is needed
  – Centralized directory -> bottleneck/single point of failure
  – Query flooding -> scalability concerns
  – Need more structure!

Page 4:

More Issues

• Organize and maintain the overlay network
  – node arrivals
  – node failures

• Resource allocation/load balancing

• Resource location

• Network proximity routing

Page 5:

What is a Distributed Hash Table?

• Exactly that

• A service, distributed over multiple machines, with hash table semantics
  – put(key, value), Value = get(key)

• Designed to work in a peer-to-peer (P2P) environment

• No central control

• Nodes under different administrative control

• But of course can operate in an “infrastructure” sense

Page 6:

What is a DHT?

• Hash table semantics: put(key, value), Value = get(key)

• Key is a single flat string

• Limited semantics compared to keyword search

• Put() causes value to be stored at one (or more) peer(s)

• Get() retrieves value from a peer

• Put() and Get() accomplished with unicast routed messages

• In other words, it scales

• Other API calls to support application, like notification when neighbors come and go
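The put/get semantics above can be illustrated with a minimal in-process sketch. Everything here is hypothetical: the `ToyDHT` class, the peer names, and in particular the naive "hash modulo peer count" placement, which is used only for brevity and is not how the DHTs in this deck place keys (they use structured routing instead).

```python
import hashlib


class ToyDHT:
    """In-process stand-in for a DHT: every 'peer' is just a local dict."""

    def __init__(self, peer_names):
        self.peers = {name: {} for name in peer_names}
        self._names = sorted(self.peers)

    def _responsible_peer(self, key: str) -> str:
        # Assumption: naive placement, hash(key) modulo the number of peers.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return self._names[int.from_bytes(digest, "big") % len(self._names)]

    def put(self, key: str, value) -> str:
        """Store value at the single peer responsible for key; return that peer."""
        peer = self._responsible_peer(key)
        self.peers[peer][key] = value
        return peer

    def get(self, key: str):
        """Fetch value from the peer responsible for key (None if absent)."""
        return self.peers[self._responsible_peer(key)].get(key)


dht = ToyDHT(["peer-a", "peer-b", "peer-c"])
stored_at = dht.put("song.mp3", "some bytes")
```

The key is a flat string and only exact-match lookup is possible, which is the "limited semantics compared to keyword search" noted above.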

Page 7:

Distributed Hash Tables (DHT)

[Figure: key/value pairs (k1,v1) … (k6,v6) stored across the nodes of a P2P overlay network; operations: put(k,v), get(k)]

• p2p overlay maps keys to nodes

• completely decentralized and self-organizing

• robust, scalable

Page 8:

Popular DHTs

• Tapestry (Berkeley)
  – Based on Plaxton trees, similar to hypercube routing
  – The first* DHT
  – Complex and hard to maintain (hard to understand too!)

• CAN (ACIRI), Chord (MIT), and Pastry (Rice/MSR Cambridge)
  – Second wave of DHTs (contemporary with and independent of each other)

Page 9:

DHTs Basics

• Node IDs can be mapped to the hash key space

• Given a hash key as a “destination address”, you can route through the network to a given node

• Always route to the same node no matter where you start from

• Requires no centralized control (completely distributed)

• Small per-node state that is constant or grows only logarithmically with the number of nodes in the system (scalable)

• Nodes can route around failures (fault-tolerant)

Page 10:

Things to look at

• What is the structure?

• How does routing work in the structure?

• How does it deal with node joins and departures (structure maintenance)?

• How does it scale?

• How does it deal with locality?

• What are the security issues?

Page 11:

The Chord Approach

• Consistent Hashing

• Logical Ring

• Finger Pointers

Page 12:

The Chord Protocol

• Provides:
  – A mapping successor: key -> node
  – To look up key K, go to node successor(K)

• successor defined using consistent hashing:
  – Key hash
  – Node hash
  – Both keys and nodes hash to the same (circular) identifier space
  – successor(K) = first node with hash ID equal to or greater than hash(K)

Page 13:

Example: The Logical Ring

Nodes 0, 1, 3

Keys 1, 2, 6
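The placement for this ring can be worked out with a short sketch of the successor rule. It assumes identifiers are already hashed to small integers on a 3-bit ring; the function name `successor` follows the slides, the rest is illustrative.

```python
from bisect import bisect_left


def successor(node_ids, key_id):
    """First node whose ID is equal to or greater than key_id, wrapping around the ring."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key_id)
    return ring[i % len(ring)]  # past the largest ID, wrap to the smallest node


nodes = [0, 1, 3]
placement = {key: successor(nodes, key) for key in (1, 2, 6)}
# key 1 is stored at node 1, key 2 at node 3, and key 6 wraps around to node 0
```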

Page 14:

Consistent Hashing [Karger et al. ‘97]

• Some nice properties:
  – Smoothness: minimal key movement on node join/leave
  – Load balancing: keys equitably distributed over nodes
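The smoothness property can be checked on a small ring. The sketch below is an illustration under assumed values (a 6-bit ring, hand-picked node IDs, one key per identifier): when a node joins, only the keys between its predecessor and itself change owner, and they all move to the new node.

```python
from bisect import bisect_left


def successor(ring, key_id):
    """First node ID >= key_id on a sorted ring, wrapping around."""
    i = bisect_left(ring, key_id)
    return ring[i % len(ring)]


def assign(node_ids, key_ids):
    """Map every key to the node responsible for it."""
    ring = sorted(node_ids)
    return {k: successor(ring, k) for k in key_ids}


M = 6                                   # assumed identifier length in bits
keys = range(2 ** M)
before = assign([8, 24, 40, 56], keys)
after = assign([8, 24, 32, 40, 56], keys)   # node 32 joins

moved = [k for k in keys if before[k] != after[k]]
# Only keys 25..32 move, all from node 40 (the new node's successor) to node 32.
```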

Page 15:

Mapping Details

• Range of hash function
  – Circular ID space modulo $2^m$

• Compute 160-bit SHA-1 hash, and truncate to m bits
  – Chance of collision rare if m is large enough

• Deterministic, but hard for an adversary to subvert
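A minimal sketch of this mapping, using Python's standard `hashlib`. The function name `chord_id` is hypothetical, and keeping the low-order m bits is an assumption: the slide only says "truncate", and keeping the high-order bits would serve equally well.

```python
import hashlib


def chord_id(name: str, m: int) -> int:
    """Map a node address or key to an identifier on the ring modulo 2**m."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()  # 20 bytes = 160 bits
    # Assumption: truncate by keeping the low-order m bits.
    return int.from_bytes(digest, "big") % (2 ** m)


node_id = chord_id("192.0.2.17:4000", m=16)   # example address, purely illustrative
key_id = chord_id("song.mp3", m=16)
```

The mapping is deterministic, so every node computes the same identifier for the same name without any coordination.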

Page 16:

Chord State

• Successor/Predecessor in the Ring

• Finger pointers
  – n.finger[i] = successor($n + 2^{i-1}$)
  – Each node knows more about the portion of the circle close to it!
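The finger definition can be evaluated directly. The sketch below assumes the 3-bit ring with nodes 0, 1 and 3 from the logical ring example and computes finger[1..m] for each node; helper names are illustrative.

```python
from bisect import bisect_left


def successor(ring, ident):
    """First node ID >= ident on a sorted ring, wrapping around."""
    i = bisect_left(ring, ident)
    return ring[i % len(ring)]


def finger_table(n, node_ids, m):
    """finger[i] = successor(n + 2**(i-1)) for i = 1..m, arithmetic modulo 2**m."""
    ring = sorted(node_ids)
    return [successor(ring, (n + 2 ** (i - 1)) % 2 ** m) for i in range(1, m + 1)]


tables = {n: finger_table(n, [0, 1, 3], m=3) for n in (0, 1, 3)}
# node 0: starts 1, 2, 4 -> fingers 1, 3, 0
# node 1: starts 2, 3, 5 -> fingers 3, 3, 0
# node 3: starts 4, 5, 7 -> fingers 0, 0, 0
```

The finger starts are spaced by powers of two, so half of a node's fingers cover the half of the ring nearest to it, which is why each node knows more about its own neighborhood.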

Page 17:

Example: Finger Tables

Page 18:

Chord: routing protocol

- A sequence of nodes progressively closer to id is contacted remotely

- Each node is queried for the node it knows that most closely precedes id

- The process stops at a node n whose successor is at or past id, i.e. id lies in (n, successor(n)]; that successor is responsible for id

Notation: n.foo( ) stands for a remote call to node n.
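The routing steps can be simulated locally. This is a sketch, not the protocol's reference code: remote calls are replaced by dictionary lookups, the ring is the assumed 3-bit example with nodes 0, 1 and 3, and all names (`Node`, `lookup`, `closest_preceding_finger`) are chosen for illustration.

```python
from bisect import bisect_left

M = 3  # assumed identifier length in bits


def in_open(x, a, b):
    """True if x lies in the circular open interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past zero (a == b: everything but a)


def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return x == b or in_open(x, a, b)


class Node:
    def __init__(self, ident, ring):
        self.id = ident
        starts = [(ident + 2 ** i) % 2 ** M for i in range(M)]
        self.fingers = [ring[bisect_left(ring, s) % len(ring)] for s in starts]
        self.successor = self.fingers[0]

    def closest_preceding_finger(self, key):
        """Farthest known node that still lies strictly between this node and key."""
        for f in reversed(self.fingers):
            if in_open(f, self.id, key):
                return f
        return self.id


def lookup(network, start, key):
    """Return (node responsible for key, list of nodes contacted on the way)."""
    n = network[start]
    path = [n.id]
    while not in_half_open(key, n.id, n.successor):
        n = network[n.closest_preceding_finger(key)]  # stands in for a remote call
        path.append(n.id)
    return n.successor, path


ring = [0, 1, 3]
network = {i: Node(i, ring) for i in ring}
owner, path = lookup(network, start=3, key=1)  # contacts 3, then 0; owner is node 1
```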

Page 19:

Example: Chord Routing

Finger Pointers for Node 1

Page 20:

Lookup Complexity

• With high probability: O(log(N))

• Proof intuition:
  – Let p be the successor of the target key; the distance to p is reduced by at least half in each step
  – So p would be reached in m steps
  – Stronger claim: in O(log(N)) steps, distance ≤ $2^m/N$. Thereafter even linear advance suffices to give O(log(N)) lookup complexity
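The stronger claim can be spelled out in a few lines. The symbols $d_t$ (remaining identifier distance after $t$ hops) and $d_0$ (initial distance) are introduced here for illustration and do not appear on the slide; node IDs are assumed to be uniformly distributed on the ring.

```latex
\begin{align*}
d_t &\le \frac{d_0}{2^{t}} < \frac{2^{m}}{2^{t}}
  && \text{(each hop at least halves the remaining distance)} \\
d_{\log_2 N} &< \frac{2^{m}}{N}
  && \text{(after } t = \log_2 N \text{ hops)} \\
\mathbb{E}\bigl[\#\text{nodes in an arc of length } 2^{m}/N\bigr]
  &= N \cdot \frac{2^{m}/N}{2^{m}} = 1
  && \text{(IDs uniform on the ring)}
\end{align*}
```

With high probability such an arc contains O(log N) nodes, so finishing the lookup one successor at a time adds only O(log N) further hops, and the total stays O(log N).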

Page 21:

Chord invariants

• Every key in the network can be located as long as the following invariants are preserved after joins and leaves:
  – Each node’s successor is correctly maintained
  – For every key k, node successor(k) is responsible for k

Page 22:

Chord: Node Joins

• New node B learns of at least one existing node A via external means

• B asks A to look up its finger-table information
  – Given that B’s hash ID is b, A does a lookup for B.finger[i] = successor($b + 2^{i-1}$) if the interval is not already covered by finger[i-1]
  – B stores all finger information and sets up pred/succ pointers

Page 23:

Node Joins (contd.)

• Update the finger tables of existing nodes p such that:

  1. p precedes b by at least $2^{i-1}$
  2. the i-th finger of node p succeeds b

  – Starts from p = predecessor($b - 2^{i-1}$) and proceeds in counter-clockwise direction while 2. is true

• Transferring keys:
  – Only from successor(b) to b
  – Must send notification to the application

Page 24:

Example: finger table update

Node 6 joins

Page 25:

Example: transferring keys

Node 1 leaves

Page 26:

Concurrent Joins/Leaves

• Need a stabilization protocol to guard against inconsistency

• Note:
  – Incorrect finger pointers may only increase latency, but incorrect successor pointers may cause lookup failure!

• Nodes periodically run the stabilization protocol
  – Find successor’s predecessor
  – Repair if this isn’t self

• This algorithm is also run at join
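A minimal sketch of the stabilization step, under stated assumptions: nodes are local objects rather than remote peers, fingers are left out, and the starting ring of nodes 20 and 30 is a guess made to match the join examples that follow (the slides' figures are not available here). Nodes 25 and 28 both join before anyone has stabilized.

```python
def between(x, a, b):
    """True if x lies in the circular open interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def join(self, successor):
        # Assumption: the caller has already looked up successor(self.id).
        self.predecessor = None
        self.successor = successor

    def stabilize(self):
        """Find the successor's predecessor; adopt it if it sits between us."""
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        """candidate believes it might be our predecessor."""
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate


n20, n25, n28, n30 = Node(20), Node(25), Node(28), Node(30)
n20.successor, n20.predecessor = n30, n30   # assumed initial two-node ring
n30.successor, n30.predecessor = n20, n20
n25.join(n30)
n28.join(n30)                               # joins before node 20 has stabilized

for _ in range(4):                          # a few periodic rounds
    for node in (n20, n25, n28, n30):
        node.stabilize()

ring_order, cur = [], n20
for _ in range(4):
    ring_order.append(cur.id)
    cur = cur.successor
```

In this run the successor pointers converge to 20, 25, 28, 30 within a few rounds, even though both joins initially pointed at node 30.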

Page 27:

Example: node 25 joins

Page 28:

Example: node 28 joins before 20 stabilizes (1)

Page 29:

Example: node 28 joins before 20 stabilizes (2)

Page 30:

CAN

• Virtual d-dimensional Cartesian coordinate system on a d-torus
  – Example: 2-d [0,1] x [0,1]

• Dynamically partitioned among all nodes

• Pair (K,V) is stored by mapping key K to a point P in the space using a uniform hash function and storing (K,V) at the node in the zone containing P

• Retrieve entry (K,V) by applying the same hash function to map K to P and retrieving the entry from the node in the zone containing P
  – If P is not contained in the zone of the requesting node or its neighboring zones, route the request to the neighbor node in the zone nearest P
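The key-to-point mapping and zone lookup can be sketched as follows. The details are assumptions made for illustration: SHA-1 stands in for the "uniform hash function", each coordinate is taken from a separate slice of the digest, and the three zones and node names are invented.

```python
import hashlib


def key_to_point(key: str, d: int = 2):
    """Hash a key to a point in the unit cube [0, 1)^d (assumes d <= 20)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()  # 20 bytes
    width = min(len(digest) // d, 6)  # at most 48 bits per axis, exact as a float
    return tuple(
        int.from_bytes(digest[i * width:(i + 1) * width], "big") / 2 ** (8 * width)
        for i in range(d)
    )


def owner(zones, point):
    """Return the node whose zone (lows, highs) contains the point."""
    for node, (lows, highs) in zones.items():
        if all(lo <= x < hi for lo, x, hi in zip(lows, point, highs)):
            return node
    raise ValueError("point not covered by any zone")


# Hypothetical partition of the 2-d space among three nodes.
zones = {
    "A": ((0.0, 0.0), (0.5, 1.0)),
    "B": ((0.5, 0.0), (1.0, 0.5)),
    "C": ((0.5, 0.5), (1.0, 1.0)),
}
p = key_to_point("song.mp3")
responsible = owner(zones, p)   # (K,V) would be stored at this node
```

Because put and get apply the same hash function, both arrive at the same point P and therefore at the same zone.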

Page 31:

Routing in a CAN

• Follow straight line path through the Cartesian space from source to destination coordinates

• Each node maintains a table of the IP address and virtual coordinate zone of each local neighbor

• Use greedy routing to neighbor closest to destination

• For a d-dimensional space partitioned into n equal zones, nodes maintain 2d neighbors
  – Average routing path length: $(d/4)\, n^{1/d}$
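The average path length can be checked numerically on a small torus. The sketch assumes n equal zones arranged as a k × … × k grid (so n = k^d) and a simplified greedy rule that corrects one coordinate at a time; every hop still moves to a neighbor strictly closer to the destination. Function names are illustrative.

```python
from itertools import product


def torus_delta(a, b, k):
    """Signed shortest step count from cell a to cell b on a ring of k cells."""
    fwd = (b - a) % k
    return fwd if fwd <= k - fwd else fwd - k


def greedy_route(src, dst, k):
    """Hop through neighboring zones, fixing one coordinate at a time."""
    path, cur = [src], list(src)
    for axis in range(len(src)):
        delta = torus_delta(cur[axis], dst[axis], k)
        step = 1 if delta > 0 else -1
        for _ in range(abs(delta)):
            cur[axis] = (cur[axis] + step) % k   # move to the adjacent zone
            path.append(tuple(cur))
    return path


def average_path_length(d, k):
    """Mean hop count over all (source, destination) zone pairs."""
    zones = list(product(range(k), repeat=d))
    total = sum(len(greedy_route(s, t, k)) - 1 for s in zones for t in zones)
    return total / len(zones) ** 2


d, k = 2, 8
n = k ** d                              # 64 zones
measured = average_path_length(d, k)    # 4.0 hops
predicted = (d / 4) * k                 # (d/4) * n**(1/d), written with k = n**(1/d)
```

For this grid the measured mean matches the formula: each of the d axes contributes an average wrap-around distance of k/4.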

Page 32:

CAN Construction

• Joining node locates a bootstrap node using the CAN DNS entry
  – Bootstrap node provides IP addresses of random member nodes

• Joining node sends JOIN request to random point P in the Cartesian space

• Node in zone containing P splits the zone and allocates “half” to joining node

• (K,V) pairs in the allocated “half” are transferred to the joining node

• Joining node learns its neighbor set from previous zone occupant
  – Previous zone occupant updates its neighbor set

Page 33:

Departure, Recovery and Maintenance

• Graceful departure: node hands over its zone and the (K,V) pairs to a neighbor

• Network failure: unreachable node(s) trigger an immediate takeover algorithm that allocates the failed node’s zone to a neighbor
  – Detect via lack of periodic refresh messages
  – Each neighbor node starts a takeover timer initialized in proportion to its zone volume
  – When its timer expires, a node sends a TAKEOVER message containing its zone volume to all of the failed node’s neighbors
  – If the received TAKEOVER volume is smaller, kill the timer; if not, reply with a TAKEOVER message
  – Nodes agree on the live neighbor with the smallest volume

Page 34:

Pastry

Generic p2p location and routing substrate

• Self-organizing overlay network

• Lookup/insert object in < $\log_{16} N$ routing steps (expected)

• O(log N) per-node state

• Network proximity routing

Page 35:

Pastry: Object distribution

Consistent hashing

128-bit circular id space

nodeIds (uniform random)

objIds (uniform random)

Invariant: node with numerically closest nodeId maintains object

[Figure: circular id space from 0 to $2^{128}-1$, with nodeIds and an objId marked on the ring]

Page 36:

Pastry: Object insertion/lookup

Msg with key X is routed to live node with nodeId closest to X

Problem: complete routing table not feasible

[Figure: Route(X) on the circular id space from 0 to $2^{128}-1$, delivered to the node closest to X]

Page 37:

Pastry: Routing table (# 65a1fc)

$\log_{16} N$ rows

Row 0: 0x 1x 2x 3x 4x 5x 7x 8x 9x ax bx cx dx ex fx

Row 1: 60x 61x 62x 63x 64x 66x 67x 68x 69x 6ax 6bx 6cx 6dx 6ex 6fx

Row 2: 650x 651x 652x 653x 654x 655x 656x 657x 658x 659x 65bx 65cx 65dx 65ex 65fx

Row 3: 65a0x 65a2x 65a3x 65a4x 65a5x 65a6x 65a7x 65a8x 65a9x 65aax 65abx 65acx 65adx 65aex 65afx
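The table layout follows from prefix matching: an entry in row r shares its first r digits with the local nodeId and differs in the next one, which selects the column. A small sketch, with hypothetical helper names and nodeIds written as equal-length hex strings:

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading hex digits that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def table_slot(local_id: str, other_id: str):
    """(row, column) where other_id belongs in local_id's routing table.

    Returns None when the two IDs are identical (a node has no slot for itself).
    """
    row = shared_prefix_len(local_id, other_id)
    if row == len(local_id):
        return None
    return row, int(other_id[row], 16)


slot_far = table_slot("65a1fc", "d13da3")    # no shared prefix: row 0, column d
slot_near = table_slot("65a1fc", "65a212")   # shares "65a": row 3, column 2
```

This also explains why each row above lists only 15 entries: the column matching the node's own next digit (6, 5, a, 1 for node 65a1fc) is not needed.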

Page 38:

Pastry: Leaf sets

Each node maintains IP addresses of the nodes with the L/2 numerically closest larger and smaller nodeIds, respectively.

• routing efficiency/robustness

• fault detection (keep-alive)

• application-specific local coordination

Page 39:

Pastry: Routing procedure

if (destination D is within range of our leaf set)
    forward to numerically closest member
else
    let l = length of shared prefix
    let d = value of l-th digit in D’s address
    if (R[l][d] exists)
        forward to R[l][d]
    else
        forward to a known node that
        (a) shares at least as long a prefix
        (b) is numerically closer than this node
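The procedure can be sketched as a single next-hop function. This is an illustration under assumptions, not Pastry's actual API: nodeIds are equal-length hex strings, the routing table is a dictionary keyed by (row, digit), numeric closeness ignores wrap-around of the circular id space, and the leaf-set and table contents in the example are invented.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading hex digits that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(local_id, key, leaf_set, routing_table, known_nodes):
    """Choose the node to forward a message for `key` to (local_id means deliver here)."""
    if key == local_id:
        return local_id

    def dist(node_id):
        # Simplification: plain numeric distance, no wrap-around.
        return abs(int(node_id, 16) - int(key, 16))

    # Case 1: key falls within the range covered by the leaf set.
    members = list(leaf_set) + [local_id]
    values = [int(n, 16) for n in members]
    if leaf_set and min(values) <= int(key, 16) <= max(values):
        return min(members, key=dist)

    # Case 2: routing table entry R[l][d] sharing one more digit with the key.
    l = shared_prefix_len(local_id, key)
    entry = routing_table.get((l, key[l]))
    if entry is not None:
        return entry

    # Case 3 (rare): any known node with an equally long prefix that is numerically closer.
    candidates = [n for n in known_nodes
                  if shared_prefix_len(n, key) >= l and dist(n) < dist(local_id)]
    return min(candidates, key=dist) if candidates else local_id


hop = next_hop(
    local_id="65a1fc",
    key="d46a1c",
    leaf_set=["65a1f0", "65a20a"],          # hypothetical leaf set
    routing_table={(0, "d"): "d13da3"},     # hypothetical row-0 entry
    known_nodes=[],
)
```

Each use of case 2 lengthens the shared prefix by at least one digit, which is where the logarithmic hop count comes from.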

Page 40:

Pastry: Routing

Properties

• $\log_{16} N$ steps

• O(log N) state

[Figure: Route(d46a1c) on the id ring; nodes shown: 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1]

Page 41:

Pastry: Performance

Integrity of overlay message delivery:

• guaranteed unless L/2 simultaneous failures of nodes with adjacent nodeIds

Number of routing hops:

• No failures: < $\log_{16} N$ expected, 128/b + 1 max

• During failure recovery:
  – O(N) worst case, average case much better
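A quick arithmetic check of the no-failure bounds. It assumes b = 4 bits per digit, which is consistent with the base-16 logarithm used on these slides; the network size of one million nodes is only an example.

```python
import math


def expected_hops_bound(n_nodes: int, b: int = 4) -> float:
    """Upper bound on the expected hop count: log base 2**b of N."""
    return math.log(n_nodes, 2 ** b)


def max_hops(b: int = 4, id_bits: int = 128) -> int:
    """Worst case without failures: one hop per digit of the id, plus one."""
    return id_bits // b + 1


bound = expected_hops_bound(10 ** 6)   # just under 5 hops for a million nodes
worst = max_hops()                     # 128/4 + 1 = 33
```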

Page 42:

Pastry Join

• X = new node, A = bootstrap, Z = nearest node

• A finds Z for X

• In the process, A, Z, and all nodes in the path send state tables to X

• X settles on its own table
  – Possibly after contacting other nodes

• X tells everyone who needs to know about itself

Page 43:

Pastry Leave

• Noticed by leaf set neighbors when leaving node doesn’t respond
  – Neighbors ask highest and lowest nodes in leaf set for new leaf set

• Noticed by routing neighbors when message forward fails
  – Immediately can route to another neighbor
  – Fix entry by asking another neighbor in the same “row” for its neighbor
  – If this fails, ask somebody a level up

