Peer to-peer

  • Published on
    02-Nov-2014

  • View
    821

  • Download
    0

Embed Size (px)

DESCRIPTION

 

Transcript

<ul><li> 1. An Overview of Peer-to-Peer Sami Rollins 11/14/02</li></ul> <p> 2. Outline P2P Overview What is a peer? Example applications Benefits of P2P P2P Content Sharing Challenges Group management/data placement approaches Measurement studies 3. What is Peer-to-Peer (P2P)? Napster? Gnutella? Most people think of P2P as music sharing 4. What is a peer? Contrasted withClient-Server model Servers are centrallymaintained andadministered Client has fewerresources than a server 5. What is a peer? A peers resources aresimilar to theresources of the otherparticipants P2P peerscommunicatingdirectly with otherpeers and sharingresources 6. Levels of P2P-ness P2P as a mindset Slashdot P2P as a model Gnutella P2P as an implementation choice Application-layer multicast P2P as an inherent property Ad-hoc networks 7. P2P Application TaxonomyP2P SystemsDistributed Computing File Sharing Collaboration Platforms SETI@homeGnutellaJabber JXTA 8. P2P Goals/Benefits Cost sharing Resource aggregation Improved scalability/reliability Increased autonomy Anonymity/privacy Dynamism Ad-hoc communication 9. P2P File Sharing Content exchange Gnutella File systems Oceanstore Filtering/mining Opencola 10. P2P File Sharing Benefits Cost sharing Resource aggregation Improved scalability/reliability Anonymity/privacy Dynamism 11. Research Areas Peer discovery and group management Data location and placement Reliable and efficient file exchange Security/privacy/anonymity/trust 12. Current Research Group management and data placement Chord, CAN, Tapestry, Pastry Anonymity Publius Performance studies Gnutella measurement study 13. Management/Placement Challenges Per-node state Bandwidth usage Search time Fault tolerance/resiliency 14. Approaches Centralized Flooding Document Routing 15. Centralized BobAlice Napster model Benefits: Efficient search Limited bandwidth usage No per-node state Drawbacks: Central point of failure JudyJane Limited scale 16. FloodingCarl Jane Gnutella model Benefits: No central point of failure Limited per-node state Drawbacks: Bob Slow searches Bandwidth intensive AliceJudy 17. Document Routing 001 012 FreeNet, Chord, CAN,Tapestry, Pastry model 212 ? 212 ? Benefits: 332 More efficient searching212 Limited per-node state 305 Drawbacks: Limited fault-tolerance vs redundancy 18. Document Routing CAN Associate to each node and item a unique id in and-dimensional space Goals Scales to hundreds of thousands of nodes Handles rapid arrival and failure of nodes Properties Routing table size O(d) Guarantees that a file is found in at most d*n1/d steps, where n is the total number of nodesSlide modified from another presentation 19. CAN Example: TwoDimensional Space Space divided between nodes7 All nodes cover the entire6space5 Each node covers either a4square or a rectangular area ofratios 1:2 or 2:1 3n1 Example:2 Node n1:(1, 2) first node that 1 joins cover the entire space 001 2 3 4 5 6 7Slide modified from another presentation 20. CAN Example: TwoDimensional Space Node n2:(4, 2) joins space7is divided between n1 and n26 5 4 3 n1 n2 2 10 01 2 3 45 6 7 Slide modified from another presentation 21. CAN Example: TwoDimensional Space Node n2:(4, 2) joins space7is divided between n1 and n26n3 5 4 3 n1n2 2 10 01 23 45 6 7 Slide modified from another presentation 22. CAN Example: TwoDimensional Space Nodes n4:(5, 5) and n5:(6,6)7join6n5n3n4 5 4 3 n1n2 2 10 01 23 4 56 7 Slide modified from another presentation 23. CAN Example: Two Dimensional Space Nodes: n1:(1, 2); n2:(4,2); n3:7(3, 5); n4:(5,5);n5:(6,6)6 n5 Items: f1:(2,3); f2:(5,1); f3:n3n45f4(2,1); f4:(7,5);4 f13n1 n22 f31 0 f20123 4 5 67Slide modified from another presentation 24. CAN Example: TwoDimensional Space Each item is stored by the 7node who owns its mappingin the space 6n3n4n55f44 f13n1 n22 f31 0 f20123 4 5 67Slide modified from another presentation 25. CAN: Query Example Each node knows itsneighbors in the d-space7 Forward query to the6 n5n4neighbor that is closest to the 5n3 f4query id4 Example: assume n1 queries3 f1f4n1 n22 Can route around somef31failures f20 some failures require local0123 4 5 67 floodingSlide modified from another presentation 26. CAN: Query Example Each node knows itsneighbors in the d-space7 Forward query to the6 n5n4neighbor that is closest to the 5n3 f4query id4 Example: assume n1 queries3 f1f4n1 n22 Can route around somef31failures f20 some failures require local0123 4 5 67 floodingSlide modified from another presentation 27. CAN: Query Example Each node knows itsneighbors in the d-space7 Forward query to the6 n5n4neighbor that is closest to the 5n3 f4query id4 Example: assume n1 queries3 f1f4n1 n22 Can route around somef31failures f20 some failures require local0123 4 5 67 floodingSlide modified from another presentation 28. CAN: Query Example Each node knows itsneighbors in the d-space7 Forward query to the6 n5n4neighbor that is closest to the 5n3 f4query id4 Example: assume n1 queries3 f1f4n1 n22 Can route around somef31failures f20 some failures require local0123 4 5 67 floodingSlide modified from another presentation 29. Node Failure Recovery Simple failures know your neighbors neighbors when a node fails, one of its neighbors takesover its zone More complex failure modes simultaneous failure of multiple adjacent nodes scoped flooding to discover neighbors hopefully, a rare event Slide modified from another presentation 30. Document Routing ChordN5N10 N110K19 MIT project N20 Uni-dimensional IDN99space N32 Keep track of log NnodesN80 Search through log Nnodes to find desired key N60 31. Doc Routing Tapestry/Pastry 43F 993 13FEEE Global mesh Suffix-based routing 73F F99 Uses underlying network E 0distance in constructing 04FEmesh 999 ABF0E 239E1290 32. Comparing Guarantees Model Search StateChord Uni- log Nlog N dimensional Multi-CANdN1/d2d dimensionalTapestry Global Mesh logbNb logbNPastryNeighbor logbNb logbN + bmap 33. Remaining Problems? Hard to handle highly dynamicenvironments Usable services Methods dont consider peer characteristics 34. Measurement Studies Free Riding on Gnutella Most studies focus on Gnutella Want to determine how users behave Recommendations for the best way todesign systems 35. Free Riding Results Who is sharing what? August 2000 The top Share As percent of whole 333 hosts (1%)1,142,645 37% 1,667 hosts (5%)2,182,087 70% 3,334 hosts (10%) 2,692,082 87% 5,000 hosts (15%) 2,928,905 94% 6,667 hosts (20%) 3,037,232 98% 8,333 hosts (25%) 3,082,572 99% 36. Saroiu et al Study How many peers are server-likeclient-like? Bandwidth, latency Connectivity Who is sharing what? 37. Saroiu et al Study May 2001 Napster crawl query index server and keep track of results query about returned peers dont capture users sharing unpopular content Gnutella crawl send out ping messages with large TTL 38. Results Overview Lots of heterogeneity between peers Systems should consider peer capabilities Peers lie Systems must be able to verify reported peercapabilities or measure true capabilities 39. Measured Bandwidth 40. Reported Bandwidth 41. Measured Latency 42. Measured Uptime 43. Number of Shared Files 44. Connectivity 45. Points of Discussion Is it all hype? Should P2P be a research area? Do P2P applications/systems have commonresearch questions? What are the killer apps for P2P systems? 46. Conclusion P2P is an interesting and useful model There are lots of technical challenges to besolved </p>