A Scalable Peer To Peer Lookup Service For Internet Applications

Chord: A Scalable Peer-to-Peer Lookup Service for Internet ApplicationsIon Stoica Robert Morris

David Liben-Nowell David R. KargerM. Frans Kaashoek Frank Dabek

Hari Balakrishnan

Presented By-

Manan Rawal, Bhaskar Gupta

Introduction

Efficient lookup of a node which stores data items for a particular search key.

Provides only one operation: given a key, it maps the key onto a node.

Example applications: Co-operative Mirroring Time-shared storage Distributed indexes Large-Scale combinatorial search

Problems addressed

Load Balance: Distributed hash function spreads keys evenly over the nodes

Decentralization: Fully distributed Scalability: Lookup grows as a log of number

of nodes Availability: Automatically adjusts internal

tables to reflect changes. Flexible Naming: No constraints on key

structure.

Chord Protocol

Assumes communication in underlying network is both symmetric and transitive.

Assigns keys to nodes with consistent hashing

Hash function balances the load When Nth node joins or leaves only O(1/N)

fraction of keys moved.

Chord protocol

Consistent hashing function assigns each node and key an m-bit identifier using SHA-1 base hash function.

Node’s IP address is hashed. Identifiers are ordered on a identifier circle

modulo 2m called a chord ring. succesor(k) = first node whose identifier is >=

identifier of k in identifier space.

Chord protocol

m = 610 nodes

Theorem

For any set of N nodes and K keys, with high probability:

1. Each node is responsible for at most (1+e)K/N keys.

2. When an (N+1)st node joins or leaves the network, responsibility for O(K/N) keys changes hands.

e = O(log N)

Simple Key Location Scheme

N1

N8

N14

N21N32

N38

N42

N48

K45

Scalable Lookup Scheme

N1

N8

N14

N21N32

N38

N42

N48

N51

N56

N8+1 N14

N8+2 N14

N8+4 N14

N8+8 N21

N8+16 N32

N8+32 N42

Finger Table for N8

finger 1,2,3

finger 4

finger 6

finger [k] = first node that succeeds (n+2k-1)mod2m

finger 5


// ask node n to find the successor of idn.find_successor(id)

if (id belongs to (n, successor])return successor;

elsen0 = closest preceding node(id);return n0.find_successor(id);

// search the local table for the highest predecessor of idn.closest_preceding_node(id)

for i = m downto 1if (finger[i] belongs to (n, id))

return finger[i];return n;

Lookup Using Finger Table

N1

N8

N14

N21N32

N38

N42

N51

N56

N48

lookup(54)


Each node forwards query at least halfway along distance remaining to the target

Theorem: With high probability, the number of nodes that must be contacted to find a successor in a N-node network is O(log N)

Dynamic Operations and FailuresNeed to deal with:

Node Joins and Stabilization Impact of Node Joins on Lookups Failure and Replication Voluntary Node Departures

Node Joins and Stabilization

Node’s successor pointer should be up to date For correctly executing lookups

Each node periodically runs a “Stabilization” Protocol Updates finger tables and successor pointers

Node Joins and Stabilization

Contains 6 functions: create() join() stabilize() notify() fix_fingers() check_predecessor()

Create()

Creates a new Chord ring

n.create()

predecessor = nil;

successor = n;

Join()

Asks m to find the immediate successor of n. Doesn’t make rest of the network aware of n.

n.join(m)

predecessor = nil;

successor = m.find_successor(n);

Stabilize()

Called periodically to learn about new nodes Asks n’s immediate successor about successor’s predecessor p

Checks whether p should be n’s successor instead Also notifies n’s successor about n’s existence, so that successor

may change its predecessor to n, if necessary

n.stabilize()x = successor.predecessor;if (x (n, successor))

successor = x;successor.notify(n);

Notify()

m thinks it might be n’s predecessor

n.notify(m)

if (predecessor is nil or m (predecessor, n))

predecessor = m;

Fix_fingers()

Periodically called to make sure that finger table entries are correct New nodes initialize their finger tables Existing nodes incorporate new nodes into their finger tables

n.fix_fingers()next = next + 1 ;if (next > m)

next = 1 ;finger[next] = find_successor(n + 2next-1);

Check_predecessor()

Periodically called to check whether predecessor has failed If yes, it clears the predecessor pointer, which can

then be modified by notify()

n.check_predecessor()

if (predecessor has failed)

predecessor = nil;

Theorem 3

If any sequence of join operations is executed interleaved with stabilizations, then at some time after the last join the successor pointers will form a cycle on all nodes in the network

Stabilization Protocol

Guarantees to add nodes in a fashion to preserve reach ability

By itself won’t correct a Chord system that has split into multiple disjoint cycles, or a single cycle that loops multiple times around the identifier space

Impact of Node Joins on Lookups Correctness

If finger table entries are reasonably current Lookup finds the correct successor in O(log N) steps

If successor pointers are correct but finger tables are incorrect Correct lookup but slower

If incorrect successor pointers Lookup may fail

Impact of Node Joins on Lookups Performance

If stabilization is complete Lookup can be done in O(log N) time

If stabilization is not complete Existing nodes finger tables may not reflect the new nodes

Doesn’t significantly affect lookup speed Newly joined nodes can affect the lookup speed, if the new

nodes ID’s are in between target and target’s predecessor Lookup will have to be forwarded through the intervening nodes,

one at a time

Theorem 4

If we take a stable network with N nodes with correct finger pointers, and another set of up to N nodes joins the network, and all successor pointers (but perhaps not all finger pointers) are correct, then lookups will still take O(log N) time with high probability

Failure and Replication

Correctness of the protocol relies on the fact of knowing correct successor

To improve robustness Each node maintains a successor list of ‘r’ nodes This can be handled using modified version of

stabilize procedure Also helps higher-layer software to replicate data

Theorem 5

If we use successor list of length r = O(log N) in a network that is initially stable, and then every node fails with probability ½, then with high probability find_successor returns the closest living successor to the query key

Theorem 6

In a network that is initially stable, if every node fails with probability ½, then the expected time to execute find_successor is O(log N)

Voluntary Node Departures

Can be treated as node failures Two possible enhancements

Leaving node may transfers all its keys to its successor

Leaving node may notify its predecessor and successor about each other so that they can update their links

Conclusion

Efficient location of the node that stores a desired data item is a fundamental problem in P2P networks

Chord protocol solves it in a efficient decentralized manner Routing information: O(log N) nodes Lookup: O(log N) nodes Update: O(log2 N) messages

It also adapts dynamically to the topology changes introduced during the run

Future Work

Using Chord to detect and heal partitions whose nodes know of each other. Every node should know of some same set of

initial nodes Nodes should maintain long-term memory of a

random set of nodes that they have visited earlier Malicious set of Chord participants could

present an incorrect view of the Chord ring Node n periodically asks other nodes to do a

lookup for n

THANK YOU

Technology

A Scalable Peer To Peer Lookup Service For Internet Applications