Vincent Matossian
September 21st 2001
ECE 579
An Overview of Decentralized Discovery mechanisms
Decentralized Discovery mechanisms
• Centralized indexes and repositories
• Flooding broadcast of queries
• Selective forwarding/routing of queries
• Decentralized hashing index systems
• Distributed indexes and repositories
Centralized indexes and repositories
Napster 1
Figure: the peer at 128.1.2.3 registers the entry (xyz.mp3, 128.1.2.3) with the central Napster server.
From Sylvia Ratnasamy, Berkeley
Napster 2
Figure: a client sends the query "xyz.mp3 ?" to the central Napster server, which answers with the address 128.1.2.3.
Napster 3
Figure: the client requests xyz.mp3 directly from the peer at 128.1.2.3; the central Napster server only resolved the name.
Drawbacks
• Single point of failure
• Scalability
• Cost increases with popularity
• Lawsuits
Advantages
• Performance
• Control of accesses
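The three Napster steps (register, query, direct fetch) can be sketched as a single in-memory table. This is only an illustration of the centralized-index idea, not Napster's actual protocol; the class name `CentralIndex` and its methods are hypothetical.

```python
class CentralIndex:
    """Toy central index: maps a file name to the peers that announced it."""

    def __init__(self):
        self._hosts = {}  # file name -> set of peer addresses

    def register(self, filename, address):
        # Step 1: a peer announces (filename, address) to the server.
        self._hosts.setdefault(filename, set()).add(address)

    def lookup(self, filename):
        # Step 2: a client asks the server who holds the file.
        return sorted(self._hosts.get(filename, ()))


index = CentralIndex()
index.register("xyz.mp3", "128.1.2.3")
peers = index.lookup("xyz.mp3")
# Step 3 (not shown): the client downloads directly from one of `peers`.
```

Every lookup goes through the one `index` object, which is why the server is both the performance advantage and the single point of failure.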
Flooding broadcast
Gnutella steps 1 to 4
Figure sequence: a peer issues the query "xyz.mp3 ?", which is broadcast from peer to peer until it reaches a peer holding xyz.mp3.
From Sylvia Ratnasamy, Berkeley
Drawbacks
• Message broadcasting becomes a problem as popularity increases due to bandwidth requirements
• Susceptible to malicious attacks
Advantages
• Simple
• Efficient
• Flexible query interpretation
• Reliable in small networks
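A minimal sketch of flooding, assuming a small hypothetical overlay (`graph`, `files`) and a hop limit (`ttl`) to bound the broadcast. It counts every message sent, which shows where the bandwidth drawback comes from. It is not the Gnutella wire protocol.

```python
from collections import deque

def flood_query(graph, files, origin, filename, ttl):
    """Flood a query breadth-first; return (peers holding the file, messages sent)."""
    seen = {origin}                 # peers that already handled this query
    frontier = deque([(origin, ttl)])
    hits, messages = [], 0
    while frontier:
        peer, hops_left = frontier.popleft()
        if filename in files.get(peer, ()):
            hits.append(peer)
        if hops_left == 0:
            continue                # hop limit reached, do not forward
        for neighbor in graph[peer]:
            messages += 1           # every forward costs one message
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops_left - 1))
    return hits, messages


# Hypothetical five-peer overlay; only E holds the file.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
         "D": ["B", "C", "E"], "E": ["D"]}
files = {"E": {"xyz.mp3"}}
```

With this overlay, a query from A needs a hop limit of 3 to reach E; a lower limit saves messages but misses the file.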
Clip2 Reflector (Gnutella)
CANCELLED
FastTrack (KaZaA, Morpheus)
Nodes become supernodes automatically if they have sufficient bandwidth and processing power.
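A rough sketch of the supernode idea, under loud assumptions: FastTrack is proprietary, so the thresholds `MIN_BANDWIDTH_KBPS` and `MIN_CPU_MHZ`, the `Peer` fields, and the round-robin attachment of ordinary peers are all invented for illustration.

```python
from dataclasses import dataclass, field

MIN_BANDWIDTH_KBPS = 512   # hypothetical promotion threshold
MIN_CPU_MHZ = 800          # hypothetical promotion threshold

@dataclass
class Peer:
    name: str
    bandwidth_kbps: int
    cpu_mhz: int
    files: set = field(default_factory=set)

    def is_supernode(self):
        # Promotion is automatic once both resources are sufficient.
        return (self.bandwidth_kbps >= MIN_BANDWIDTH_KBPS
                and self.cpu_mhz >= MIN_CPU_MHZ)

def build_indexes(peers):
    """Give each supernode an index of the files held by its attached peers."""
    supers = [p for p in peers if p.is_supernode()]
    if not supers:
        raise ValueError("no peer qualifies as a supernode")
    index = {s.name: {} for s in supers}
    leaves = [p for p in peers if not p.is_supernode()]
    for i, leaf in enumerate(leaves):
        home = supers[i % len(supers)]          # assumed round-robin attachment
        for f in leaf.files:
            index[home.name].setdefault(f, set()).add(leaf.name)
    return index

def search(index, filename):
    """Query only the supernode indexes, never the ordinary peers."""
    hits = set()
    for table in index.values():
        hits |= table.get(filename, set())
    return sorted(hits)
```

Queries touch only the supernodes, which is the source of both the performance gain and the heavy reliance on them.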
Drawbacks
• Susceptible to malicious activities
• Too much importance on Super Nodes
• Each peer must contain additional information used to route or direct queries received
Advantages
• Performance
• Scalability
• Fault-Tolerance
Selective forwarding of queries
• Chord: Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan; MIT
• Content-Addressable Networks: Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and Scott Shenker; UC Berkeley
• Pastry: Antony Rowstron (Microsoft Research) and Peter Druschel (Rice University)
• Tapestry: Ben Y. Zhao, John Kubiatowicz, and Anthony D. Joseph; UC Berkeley
Concept
Figure: an overlay of nodes N1 to N9 with a Publisher and a Client attached. The publisher stores the pair Key="title", Value=MP3 data…; the client issues Lookup("title").
From Robert Morris, MIT
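Chord 1
Figure: a circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80 (K5 means key 5, N105 means node 105).
A key is stored at its successor: the node with the next higher ID, wrapping around the circle. Here K5 and K20 are stored at N32, and K80 at N90.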
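The successor rule can be checked with a few lines, using the node IDs from the slide. The function name `successor` and the sorted-list representation are illustrative choices, not part of Chord itself.

```python
from bisect import bisect_left

ID_BITS = 7                 # 7-bit circular ID space, as on the slide
NODES = [32, 90, 105]       # node IDs from the slide, kept sorted

def successor(key, nodes=NODES):
    """Return the first node whose ID is >= key, wrapping past the top of the ring."""
    i = bisect_left(nodes, key % 2 ** ID_BITS)
    return nodes[i % len(nodes)]
```

A key above the highest node ID, such as 110, wraps around and lands on N32.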
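Chord 2
Figure: a ring with nodes N10, N32, N60, N90, N105, N120 and key K80. The query "Where is key 80?" is forwarded around the ring until the answer "N90 has K80" comes back.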
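A sketch of how such a lookup can be routed, assuming each node keeps a finger table (pointers to the successors of n + 2^i), the mechanism Chord uses to reach a key in a logarithmic number of hops. All nodes are computed from one global list here for brevity; a real node only knows its own fingers. Function names are illustrative.

```python
from bisect import bisect_left

ID_BITS = 7                              # 7-bit identifier space, as on the slide
RING = 2 ** ID_BITS
NODES = [10, 32, 60, 90, 105, 120]       # node IDs from the slide, sorted

def successor(ident):
    """First node whose ID is >= ident, wrapping around the ring."""
    i = bisect_left(NODES, ident % RING)
    return NODES[i % len(NODES)]

def between(x, a, b, inclusive_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] when inclusive_right."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b                # interval wraps past zero

def finger_table(n):
    """Finger i points at the successor of n + 2**i."""
    return [successor(n + 2 ** i) for i in range(ID_BITS)]

def lookup(start, key):
    """Return (node responsible for key, nodes visited on the way)."""
    n, path = start, [start]
    # Stop once the key falls between the current node and its successor.
    while not between(key, n, successor(n + 1), inclusive_right=True):
        # Jump to the farthest finger that still precedes the key.
        n = next(f for f in reversed(finger_table(n)) if between(f, n, key))
        path.append(n)
    return successor(n + 1), path
```

Starting at N32, `lookup(32, 80)` visits N32 and N60 and reports N90 as the holder of K80.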
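Content-Addressable Networks - CAN
Each key K is hashed to a point in a coordinate space: hash(K) = (a,b)
insert(K,V): store the pair (K,V) at the node whose zone contains (a,b)
retrieve(K): route the request to the node whose zone contains (a,b)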
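A small sketch of the insert/retrieve idea in two dimensions. The choice of SHA-1, the unit square, and the fixed four-zone layout are assumptions made for illustration; routing between zones is omitted and replaced by a direct scan.

```python
import hashlib

def point_for(key):
    """Hash a key to a point (a, b) in the unit square [0, 1) x [0, 1)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    a = int.from_bytes(digest[:4], "big") / 2 ** 32
    b = int.from_bytes(digest[4:8], "big") / 2 ** 32
    return a, b

class Zone:
    """A rectangular zone owned by one node, with that node's local store."""

    def __init__(self, x0, y0, x1, y1):
        self.bounds = (x0, y0, x1, y1)
        self.store = {}

    def contains(self, point):
        x0, y0, x1, y1 = self.bounds
        return x0 <= point[0] < x1 and y0 <= point[1] < y1

# Hypothetical layout: four nodes, one quadrant each.
ZONES = [Zone(0.0, 0.0, 0.5, 0.5), Zone(0.5, 0.0, 1.0, 0.5),
         Zone(0.0, 0.5, 0.5, 1.0), Zone(0.5, 0.5, 1.0, 1.0)]

def owner(point):
    # Stands in for routing: find the zone that covers the point.
    return next(z for z in ZONES if z.contains(point))

def insert(key, value):
    owner(point_for(key)).store[key] = value

def retrieve(key):
    return owner(point_for(key)).store.get(key)
```

Because insert and retrieve hash the key the same way, both end up at the same zone without any central index.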
CAN Node Insertion
1) Discover some node "I" already in CAN (via a bootstrap node)
2) Pick a random point (p,q) in space
3) I routes to (p,q), discovers node J
4) Split J's zone in half… the new node owns one half
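The four insertion steps can be sketched as follows. Routing from I to (p,q) is replaced by a direct scan over all nodes, the split always cuts the longer side, and the hand-over of stored keys is left out; these simplifications and the names `Node`, `split`, and `join` are assumptions for illustration.

```python
import random

class Node:
    def __init__(self, name, zone):
        self.name = name
        self.zone = zone            # (x0, y0, x1, y1)

    def owns(self, point):
        x0, y0, x1, y1 = self.zone
        return x0 <= point[0] < x1 and y0 <= point[1] < y1

def split(zone):
    """Cut a zone in half along its longer side; return both halves."""
    x0, y0, x1, y1 = zone
    if (x1 - x0) >= (y1 - y0):
        mid = (x0 + x1) / 2
        return (x0, y0, mid, y1), (mid, y0, x1, y1)
    mid = (y0 + y1) / 2
    return (x0, y0, x1, mid), (x0, mid, x1, y1)

def join(nodes, name, rng=random):
    """Add a node: pick a random point, find its owner J, take half of J's zone."""
    point = (rng.random(), rng.random())                  # step 2
    j = next(n for n in nodes if n.owns(point))           # step 3 (scan, not routing)
    j.zone, new_zone = split(j.zone)                      # step 4
    new_node = Node(name, new_zone)
    nodes.append(new_node)
    return new_node
```

After any number of joins the zones still tile the whole space, so every point has exactly one owner.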
Plaxton, Rajaraman, Richa
Example: octal digits, 2^18 namespace, routing 005712 → 627510
Route (each hop matches one more trailing digit of the destination):
005712 → 340880 → 943210 → 834510 → 387510 → 727510 → 627510
Neighbor map for "5712" (octal), one row per routing level, one entry per digit 0 to 7:
Level 1: xxx0 xxx1 5712 xxx3 xxx4 xxx5 xxx6 xxx7
Level 2: xx02 5712 xx22 xx32 xx42 xx52 xx62 xx72
Level 3: x012 x112 x212 x312 x412 x512 x612 5712
Level 4: 0712 1712 2712 3712 4712 5712 6712 7712
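A sketch of the suffix-matching rule using the node IDs shown on the slide. Each hop moves to a node that shares a longer trailing-digit match with the destination. Picking the next hop from one global list is a simplification; in the real scheme each node consults only its own neighbor map. Function names are illustrative.

```python
# Node IDs exactly as they appear on the slide.
NODES = ["005712", "340880", "943210", "834510", "387510", "727510", "627510"]

def shared_suffix(a, b):
    """Number of trailing digits that a and b have in common."""
    n = 0
    while n < len(a) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def next_hop(current, dest):
    """Pick a node that matches dest in more trailing digits, improving as little as possible."""
    level = shared_suffix(current, dest)
    candidates = [n for n in NODES if shared_suffix(n, dest) > level]
    return min(candidates, key=lambda n: shared_suffix(n, dest))

def route(src, dest):
    """Follow next_hop from src until dest is reached (dest must be in NODES)."""
    path = [src]
    while path[-1] != dest:
        path.append(next_hop(path[-1], dest))
    return path
```

Routing from 005712 to 627510 resolves one more digit per hop, so the path length is bounded by the number of digits in an ID.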
PASTRY and TAPESTRY
Both are based on the Plaxton, Rajaraman, Richa algorithm and add support for dynamic node insertion and deletion.
Node insertion: node N requests a new ID and contacts a gateway G. Neighbor map tables are updated along each hop.
The two differ in minor ways in object replication and in how routing distance is calculated.
Comparing Key Metrics
Property | Tapestry | Chord | CAN | Pastry
Parameter | Base b | None | Dimension d | Base b
Logical Path Length | Log_b N | Log_2 N | O(d*N^(1/d)) | Log_b N
Neighbor-state | b Log_b N | Log_2 N | O(d) | b Log_b N + O(b)
Routing Overhead (RDP) | O(1) | O(1) | O(1)? | O(1)?
Messages to insert | O(Log_b^2 N) | O(Log_2^2 N) | O(d*N^(1/d)) | O(Log_b N)
Mutability | App-dep. | App-dep. | Immut. | ???
Load-balancing | Good | Good | Good | Good
Designed as P2P Indices
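To get a feel for the path-length row, the expressions can be evaluated for a concrete network size. The defaults below (base b = 16, dimension d = 2) are arbitrary illustrative choices, and the asymptotic bounds hide constant factors, so the numbers only indicate growth, not measured hop counts.

```python
import math

def path_lengths(n, b=16, d=2):
    """Evaluate the logical path length expressions for n nodes."""
    return {
        "Tapestry/Pastry (log_b N)": math.log(n, b),
        "Chord (log_2 N)": math.log2(n),
        "CAN (d * N^(1/d))": d * n ** (1.0 / d),
    }

# Example: about a million nodes.
lengths = path_lengths(2 ** 20)
```

Raising d shrinks the CAN figure quickly, at the price of more neighbor state per node, which is the O(d) entry in the same table.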
Drawbacks
• No keyword search
• Susceptible to malicious activities
Advantages
• Scalable
• Fault tolerant
Common Applications:
• Storage systems
• Application-level multicast
• Event notification
Content Distribution Networks
Figure labels: Clients, Content Broker
Drawbacks
• Infrastructure difficult to set up
• Cost
• Cache coherence
• "Slashdot" effect
Advantages
• Low latency delivery of content
• Cuts ISP's bandwidth costs
• Load balancing
• QoS is possible
• Centrally managed, pre-installed network
Conclusion
• No one system fits all
• Every system is vulnerable to malicious activity
• Business-oriented and research discovery mechanisms will merge
Links:
•http://cubicmetercrystal.com/alpine/discovery.html
•http://www.caip.rutgers.edu/~vincentm/p2p.html