Computer Communications Peer to Peer networking Ack: Many of the slides are adaptations of slides by authors in the bibliography section.


p2p

Quickly grown in popularity
  • numerous sharing applications
  • many millions of people worldwide use P2P networks

But what is P2P in the Internet?
  • Searching or location?
  • Computers “Peering”?
  • Take advantage of resources at the edges of the network
      • End-host resources have increased dramatically
      • Broadband connectivity now common


Lecture outline

Evolution of p2p networking seen through file-sharing applications

Other applications

Multimedia


P2P Networks: file sharing

Common Primitives:
  • Join: how do I begin participating?
  • Publish: how do I advertise my file?
  • Search: how do I find a file/service?
  • Fetch: how do I retrieve a file/use a service?

First generation in p2p file sharing/lookup

  • Centralized Database (single directory): Napster
  • Query Flooding: Gnutella
  • Hierarchical Query Flooding: KaZaA
  • (Further unstructured Overlay Routing: Freenet, …)
  • Structured Overlays …

P2P: centralized directory

Original “Napster” design (1999, S. Fanning):

1) when a peer connects, it informs the central server of its IP address and content
2) Alice queries the directory server for “Boulevard of Broken Dreams”
3) Alice requests the file from Bob

[Figure: peers, including Alice and Bob, around a centralized directory server; arrows numbered 1, 2 and 3 correspond to the steps above]

Napster: Publish

[Figure: the peer at 123.2.21.23 announces “I have X, Y, and Z!” and publishes to the central server, which records insert(X,123.2.21.23), ...]

Napster: Search

[Figure: a client sends the Query “Where is file A?” to the central server; the Reply is search(A)-->123.2.0.18; the client then fetches the file directly from 123.2.0.18 (Fetch)]
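The publish and search steps of the centralized directory can be illustrated with a minimal in-memory sketch. The class name `DirectoryServer` and its methods are assumptions made for this illustration; they are not part of the Napster protocol, and the file transfer itself (which happens directly between peers) is not modelled.

```python
from collections import defaultdict


class DirectoryServer:
    """Central index in the Napster style: file name -> set of peer addresses."""

    def __init__(self):
        self._index = defaultdict(set)

    def publish(self, peer_addr, filenames):
        """A connecting peer reports its address and the files it shares."""
        for name in filenames:
            self._index[name].add(peer_addr)

    def unpublish(self, peer_addr):
        """Drop every entry of a peer that has left."""
        for peers in self._index.values():
            peers.discard(peer_addr)

    def search(self, filename):
        """Return the peers known to hold the file (sorted for stable output)."""
        return sorted(self._index.get(filename, ()))


server = DirectoryServer()
server.publish("123.2.21.23", ["X", "Y", "Z"])   # "I have X, Y, and Z!"
server.publish("123.2.0.18", ["A"])
hits = server.search("A")   # the fetch then goes directly to one of these peers
```

All state lives in one place, which is why a lookup is a single request and also why the server is a bottleneck and a single point of failure.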

First generation in p2p file sharing/lookup

  • Centralized Database: Napster
  • Query Flooding (no directory): Gnutella
  • Hierarchical Query Flooding: KaZaA
  • (Further unstructured Overlay Routing: Freenet)
  • Structured Overlays …

Gnutella: Overview

Query Flooding:
  • Join: on startup, the client contacts a few other nodes (learned from a bootstrap node); these become its “neighbors”
  • Publish: no need
  • Search: ask neighbors, who ask their neighbors, and so on... when/if found, reply to sender.
  • Fetch: get the file directly from the peer

Gnutella: Search

[Figure: a peer floods the Query “Where is file A?” through its neighbors; two peers answer “I have file A.” and send a Reply]

Gnutella: protocol

  • Query message sent over existing TCP connections
  • peers forward Query message
  • QueryHit sent over reverse path
  • File transfer: HTTP

Scalability: limited scope flooding

[Figure: Query messages forwarded from peer to peer, QueryHit messages returned along the same path]
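Limited-scope flooding can be sketched as a breadth-first traversal of the overlay with a hop budget. This is a simplified model, not the Gnutella wire protocol: the function name `flood_query`, the dictionaries describing the overlay, and the `ttl` parameter are assumptions for the example, and the sketch also counts Query messages sent to peers that have already seen the query.

```python
from collections import deque


def flood_query(neighbors, files, origin, wanted, ttl):
    """Breadth-first, TTL-limited flood over the overlay.

    neighbors: dict peer -> list of neighbor peers (the overlay edges)
    files:     dict peer -> set of file names held by that peer
    Returns (hits, messages): peers holding the file, and Query messages sent.
    """
    seen = {origin}              # peers that already handled this query
    queue = deque([(origin, ttl)])
    hits, messages = [], 0
    while queue:
        peer, hops_left = queue.popleft()
        if peer != origin and wanted in files.get(peer, set()):
            hits.append(peer)    # a QueryHit would travel back on the reverse path
        if hops_left == 0:
            continue             # scope exhausted: do not forward further
        for nxt in neighbors.get(peer, []):
            messages += 1        # one Query per overlay edge used
            if nxt not in seen:  # duplicates are dropped, not forwarded again
                seen.add(nxt)
                queue.append((nxt, hops_left - 1))
    return hits, messages


overlay = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"],
           "d": ["b", "c", "e"], "e": ["d"]}
held = {"d": {"A"}, "e": {"A"}}
near = flood_query(overlay, held, "a", "A", ttl=2)   # reaches d but not e
far = flood_query(overlay, held, "a", "A", ttl=3)    # reaches d and e
```

Raising the hop budget finds more copies but costs more messages, which is the trade-off behind limiting the scope.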

Napster vs Gnutella: Discussion (+, -?)

Gnutella
  Pros:
    • Simple
    • Fully decentralized
    • Search cost distributed
  Cons:
    • Search scope is O(N)
    • Search time is O(???)

Napster
  Pros:
    • Simple
    • Search scope is O(1)
  Cons:
    • Server maintains O(N) state
    • Server performance bottleneck
    • Single point of failure

Gnutella

Interesting concept in practice: overlay network
  • active Gnutella peers and edges form an overlay
  • A network on top of another network: an edge is not a physical link (what is it then?)

First generation in p2p file sharing/lookup

  • Centralized Database: Napster
  • Query Flooding: Gnutella
  • Hierarchical Query Flooding (some directories): KaZaA
  • Further unstructured Overlay Routing: Freenet

KaZaA: Overview

“Smart” Query Flooding:
  • Join: on startup, the client contacts a “supernode” ... may at some point become one itself
  • Publish: send list of files to supernode
  • Search: send query to supernode; supernodes flood the query amongst themselves.
  • Fetch: get the file directly from peer(s); can fetch simultaneously from multiple peers

KaZaA: File Insert

[Figure: the peer at 123.2.21.23 announces “I have X!” and publishes to its supernode, which records insert(X,123.2.21.23), ...; the supernodes are labeled “Super Nodes”]

KaZaA: File Search

[Figure: a client sends the Query “Where is file A?” to its supernode; the query is flooded among the “Super Nodes”; the Replies are search(A)-->123.2.0.18 and search(A)-->123.2.22.50]
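The two-tier search can be sketched with supernodes that each index their own attached peers and flood queries only among themselves. The class `SuperNode`, its attributes, and the three-supernode topology are assumptions for this illustration; KaZaA's actual protocol is proprietary and is not reproduced here.

```python
class SuperNode:
    """Holds the file index of its attached ordinary peers."""

    def __init__(self, name):
        self.name = name
        self.index = {}          # file name -> set of peer addresses
        self.links = []          # other supernodes this one knows

    def publish(self, peer_addr, filenames):
        """An ordinary peer sends its file list to its supernode."""
        for f in filenames:
            self.index.setdefault(f, set()).add(peer_addr)

    def search(self, filename):
        """Answer from the local index, then flood the other supernodes."""
        found, seen, frontier = set(), {self}, [self]
        while frontier:
            sn = frontier.pop()
            found |= sn.index.get(filename, set())
            for other in sn.links:
                if other not in seen:
                    seen.add(other)
                    frontier.append(other)
        return sorted(found)


sn1, sn2, sn3 = SuperNode("sn1"), SuperNode("sn2"), SuperNode("sn3")
sn1.links, sn2.links, sn3.links = [sn2], [sn1, sn3], [sn2]
sn2.publish("123.2.0.18", ["A"])
sn3.publish("123.2.22.50", ["A"])
replies = sn1.search("A")    # the client asks only its own supernode
```

Ordinary peers never see the flood; only the supernodes carry search traffic. This sketch floods without a hop limit, which a real deployment would not do.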

KaZaA: Discussion

Pros:
  • Tries to balance between search overhead and space needs
  • Tries to take into account node heterogeneity:
      • Bandwidth
      • Host computational resources
  • Rumored to take into account network locality

Cons:
  • Still no real guarantees on search scope or search time

P2P architecture used by Skype, Joost (communication, video distribution p2p systems)

First steps in p2p file sharing/lookup

  • Centralized Database: Napster
  • Query Flooding: Gnutella
  • Hierarchical Query Flooding: KaZaA
  • Further unstructured Overlay Routing: Freenet (some directory, cache-like, based on recently seen targets; see literature pointers for more)
  • Structured Overlay Organization and Routing: Distributed Hash Tables (combine database + distributed system expertise)

Problem from this perspective

How to find data in a distributed file sharing system? (Routing to the data)

Lookup is a key problem

[Figure: nodes N1 to N5 connected through the Internet; a Publisher holds Key=“LetItBe”, Value=MP3 data; a Client issues Lookup(“LetItBe”) and does not know which node to ask (“?”)]

Centralized Solution

Central server (Napster)
  • O(M) state at server, O(1) at client
  • O(1) search communication overhead
  • Single point of failure

[Figure: nodes N1 to N5, the Publisher (Key=“LetItBe”, Value=MP3 data) and the Client issuing Lookup(“LetItBe”), with a central server holding the DB]

Distributed Solution

Flooding (Gnutella, etc.)
  • O(1) state per node
  • Worst case O(E) messages per lookup

[Figure: nodes N1 to N5, the Publisher (Key=“LetItBe”, Value=MP3 data) and the Client issuing Lookup(“LetItBe”), with the lookup flooded from node to node]

Distributed Solution (some more structure? In-between the two?)

Balance the update/lookup complexity.

Abstraction: a distributed “hash-table” (DHT) data structure:
  • put(id, item);
  • item = get(id);

Implementation: nodes form a distributed data structure, e.g. Ring, Tree, Hypercube, SkipList, Butterfly.

Hash function maps entries to nodes; using the node structure, find the node responsible for the item; that one knows where the item is.

[Figure: nodes N1 to N5 connected through the Internet, the Publisher (Key=“LetItBe”, Value=MP3 data) and the Client issuing Lookup(“LetItBe”)]

Hash function maps entries to nodes; using the node structure, find the node responsible for the item; that one knows where the item is

Challenges:
  • Keep the hop count small
  • Keep the routing tables the “right size”
  • Stay robust despite rapid changes in membership

[Figure (source: Wikipedia): a node on the overlay says “I do not know DFCD3454 but should ask my right-hand neighbour”]
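A ring is the simplest of the structures listed, and the put/get abstraction can be sketched on top of it. The example below is a single-process simulation under stated assumptions: the names `Ring`, `ring_id` and `RING_BITS`, the choice of SHA-1 as hash function, and the rule "a key belongs to the first node id at or after it" are choices made for illustration, not a description of any particular DHT.

```python
import hashlib
from bisect import bisect_left

RING_BITS = 32  # size of the identifier space, chosen only for the example


def ring_id(text):
    """Hash a string onto the ring [0, 2**RING_BITS)."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** RING_BITS)


class Ring:
    def __init__(self, node_names):
        self.ids = sorted(ring_id(n) for n in node_names)
        self.store = {nid: {} for nid in self.ids}   # per-node key/value tables

    def responsible(self, key_id):
        """First node id >= key_id, wrapping around the ring."""
        i = bisect_left(self.ids, key_id)
        return self.ids[i % len(self.ids)]

    def route(self, start_id, key_id):
        """Walk clockwise from start_id, one right-hand neighbour per hop."""
        target = self.responsible(key_id)
        i = self.ids.index(start_id)
        hops = 0
        while self.ids[i] != target:
            i = (i + 1) % len(self.ids)
            hops += 1
        return target, hops

    def put(self, key, item):
        self.store[self.responsible(ring_id(key))][key] = item

    def get(self, key):
        return self.store[self.responsible(ring_id(key))].get(key)


ring = Ring([f"node-{i}" for i in range(8)])
ring.put("LetItBe", "128.2.1.3")          # publication: who has the file
owner, hops = ring.route(ring.ids[0], ring_id("LetItBe"))
```

Each node here needs to know only its right-hand neighbour, so the routing table is tiny but a lookup can take a number of hops linear in the number of nodes. Systems such as Chord add extra pointers per node to bring the hop count down to logarithmic, which is the hop-count versus table-size trade-off named in the challenges.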

DHT: Overview

Structured Overlay Routing:
  • Join: on startup, contact a “bootstrap” node and integrate yourself into the distributed data structure; get a node id
  • Publish: route publication for file id toward an appropriate node id along the data structure
      • Need to think of updates when a node leaves
  • Search: route a query for file id toward a close node id. Data structure guarantees that the query will meet the publication.
  • Fetch: two options:
      • Publication contains actual file => fetch from where query stops
      • Publication says “I have file X” => query tells you 128.2.1.3 has X; use http or similar (i.e. rely on IP routing) to get X from 128.2.1.3


DHT: Comments/observations?

think about structure maintenance/benefits


Next generation in p2p networking

Swarming: BitTorrent, Avalanche, …

BitTorrent: Next generation fetching

  • In 2002, B. Cohen debuted BitTorrent
  • Key motivation: popularity exhibits temporal locality (flash crowds)
  • Focused on efficient fetching, not searching:
      • Distribute the same file to groups of peers
      • Single publisher, multiple downloaders
  • Used by publishers to distribute software, other large files

http://vimeo.com/15228767

BitTorrent: Overview

Swarming:
  • Join: contact centralized “tracker” server, get a list of peers.
  • Publish: can run a tracker server.
  • Search: out-of-band. E.g., use Google, some DHT, etc. to find a tracker for the file you want. Get list of peers to contact for assembling the file in chunks.
  • Fetch: download chunks of the file from your peers. Upload chunks you have to them.

File distribution: BitTorrent

P2P file distribution
  • tracker: tracks peers participating in torrent
  • torrent: group of peers exchanging chunks of a file

[Figure: a peer obtains a list of peers from the tracker, then trades chunks with the peers in the torrent]

BitTorrent (1)

  • file divided into chunks.
  • peer joining torrent:
      • has no chunks, but will accumulate them over time
      • registers with tracker to get list of peers, connects to subset of peers (“neighbors”)
  • while downloading, peer uploads chunks to other peers.
  • peers may come and go
  • once peer has entire file, it may (selfishly) leave or (altruistically) remain

BitTorrent: Tit-for-tat

(1) Alice “optimistically unchokes” Bob
(2) Alice becomes one of Bob’s top-four providers; Bob reciprocates
(3) Bob becomes one of Alice’s top-four providers

With higher upload rate, can find better trading partners & get file faster!

BitTorrent (2)

Pulling Chunks
  • at any given time, different peers have different subsets of file chunks
  • periodically, a peer (Alice) asks each neighbor for the list of chunks that they have.
  • Alice sends requests for her missing chunks, rarest first

Sending Chunks: tit-for-tat
  • Alice sends chunks to the (4) neighbors currently sending her chunks at the highest rate
      • re-evaluate top 4 every 10 secs
  • every 30 secs: randomly select another peer, start sending it chunks
      • newly chosen peer may join top 4
      • “optimistically unchoke”
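The "rarest first" pulling rule can be sketched as a sort over how many neighbors report each missing chunk. The function name `rarest_first`, the dictionary layout of the neighbor reports, and the deterministic tie-breaking are assumptions for this example; a real client would typically break ties randomly and spread requests over several holders.

```python
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Order Alice's missing chunks from rarest to most common.

    my_chunks:       set of chunk indices Alice already has
    neighbor_chunks: dict neighbor -> set of chunk indices that neighbor reports
    Returns a list of (chunk, neighbor_to_ask) pairs.
    """
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks)
    missing = [c for c in counts if c not in my_chunks]
    # Fewest holders first; ties broken by chunk index so the result is stable.
    missing.sort(key=lambda c: (counts[c], c))
    plan = []
    for c in missing:
        holders = sorted(n for n, chunks in neighbor_chunks.items() if c in chunks)
        plan.append((c, holders[0]))
    return plan


reports = {"bob": {0, 1, 2}, "carol": {1, 2}, "dave": {2, 3}}
plan = rarest_first({2}, reports)   # Alice already has chunk 2
```

Chunks 0 and 3 each have a single holder, so they are requested before chunk 1, which two neighbors hold. Fetching rare chunks early keeps them from disappearing when their few holders leave.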

BitTorrent – joining a torrent

Peers divided into:
  • seeds: have the entire file
  • leechers: still downloading

1. obtain the metadata file
2. contact the tracker
3. obtain a peer list (contains seeds & leechers)
4. contact peers from that list for data

[Figure: a new leecher (1) gets the metadata file from a website, (2) joins via the tracker, (3) receives the peer list, and (4) sends a data request to a seed/leecher]

BitTorrent – exchanging data

  ● Verify pieces using hashes
  ● Download sub-pieces in parallel
  ● Advertise received pieces to the entire peer list
  ● Look for the rarest pieces

[Figure: a seed and leechers A, B and C exchanging pieces; a peer advertises a received piece with an “I have” message]
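Verifying pieces using hashes can be sketched with Python's standard `hashlib`. The original BitTorrent metadata format stores one SHA-1 digest per piece, which is why SHA-1 is used here; the helper names `piece_hashes` and `verify_piece` and the 4-byte piece size are assumptions chosen to keep the example readable.

```python
import hashlib

PIECE_SIZE = 4  # bytes; tiny on purpose so the example is readable


def piece_hashes(data, piece_size=PIECE_SIZE):
    """What a publisher would put in the metadata file: one digest per piece."""
    return [hashlib.sha1(data[i:i + piece_size]).digest()
            for i in range(0, len(data), piece_size)]


def verify_piece(index, piece, expected_hashes):
    """Accept a downloaded piece only if its digest matches the metadata."""
    return hashlib.sha1(piece).digest() == expected_hashes[index]


original = b"peer-to-peer"               # pieces: b"peer", b"-to-", b"peer"
expected = piece_hashes(original)
ok = verify_piece(1, b"-to-", expected)   # intact piece
bad = verify_piece(1, b"-tO-", expected)  # corrupted or forged piece
```

Because the digests come from the metadata file rather than from the sending peer, a leecher can accept data from untrusted peers and simply discard any piece that fails the check.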

BitTorrent - unchoking

  ● Periodically calculate data-receiving rates
  ● Upload to (unchoke) the fastest downloaders
  ● Optimistic unchoking
      ▪ periodically select a peer at random and upload to it
      ▪ continuously look for the fastest partners

[Figure: a seed and leechers A, B, C and D]
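One round of the unchoking decision can be sketched as "rank by measured rate, keep the top few, add one random extra". The function `choose_unchoked`, its parameters and the sample rates are assumptions for this illustration; the periodic timers (re-evaluating the top peers and rotating the optimistic slot) are not modelled.

```python
import random


def choose_unchoked(download_rates, regular_slots=4, rng=random):
    """Pick the peers to upload to in one round.

    download_rates: dict peer -> rate at which that peer is currently sending us data
    Returns (top, optimistic): the fastest `regular_slots` peers, plus one
    randomly chosen other peer (or None if nobody is left).
    """
    # Highest rate first; ties broken by name so the ranking is deterministic.
    ranked = sorted(download_rates, key=lambda p: (-download_rates[p], p))
    top = ranked[:regular_slots]
    rest = ranked[regular_slots:]
    optimistic = rng.choice(rest) if rest else None
    return top, optimistic


rates = {"bob": 50, "carol": 80, "dave": 10, "erin": 65, "frank": 30, "grace": 5}
top, optimistic = choose_unchoked(rates, rng=random.Random(0))
```

The random extra slot is what lets a newcomer with nothing to offer yet receive its first chunks, and lets the uploader discover partners faster than its current top set.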

BitTorrent: Discussion

  • Works reasonably well in practice
  • Gives peers incentive to share resources; tries to avoid freeloaders

Efficiency/scalability through peering (collaboration)

File Distribution: Server-Client vs P2P

Question: How much time to distribute a file from one server to N peers?

  • $F$: file size
  • $u_s$: server upload bandwidth
  • $u_i$: peer i upload bandwidth
  • $d_i$: peer i download bandwidth

[Figure: a server with upload link $u_s$ and N peers with upload links $u_1, u_2, \dots, u_N$ and download links $d_1, d_2, \dots, d_N$, connected through a network with abundant bandwidth]

File distribution time: server-client

  • server sequentially sends N copies: $N F / u_s$ time
  • client i takes $F / d_i$ time to download

Time to distribute F to N clients using client/server approach = $d_{cs}$, which depends on

$$\max\left\{ \frac{N F}{u_s},\ \frac{F}{\min_i d_i} \right\}$$

and increases linearly in N (for large N).

File distribution time: P2P

  • server must send one copy: $F / u_s$ time
  • client i takes $F / d_i$ time to download
  • $N F$ bits must be downloaded (aggregate)
  • fastest possible upload rate: $u_s + \sum_i u_i$

$d_{P2P}$ depends on

$$\max\left\{ \frac{F}{u_s},\ \frac{F}{\min_i d_i},\ \frac{N F}{u_s + \sum_i u_i} \right\}$$

Server-client vs. P2P: example

Client upload rate = $u$, $F/u$ = 1 hour, $u_s = 10u$, $d_{min} \ge u_s$

[Figure: Minimum Distribution Time (y-axis, 0 to 3.5) versus N (x-axis, 0 to 35), one curve for P2P and one for Client-Server]
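The two max expressions from the preceding slides can be evaluated directly for the example parameters. In this sketch they are read as lower bounds on the distribution time; the function names and the choice of units (u = 1, so that F/u is one hour) are assumptions made for the example.

```python
def d_client_server(F, u_s, d_min, N):
    """max{ N*F/u_s, F/d_min }: the server uploads all N copies itself."""
    return max(N * F / u_s, F / d_min)


def d_p2p(F, u_s, d_min, peer_uploads):
    """max{ F/u_s, F/d_min, N*F/(u_s + sum of u_i) }: peers help upload."""
    N = len(peer_uploads)
    return max(F / u_s, F / d_min, N * F / (u_s + sum(peer_uploads)))


# Example parameters from the slide, in units where u = 1 and F/u = 1 hour.
u, F = 1.0, 1.0
u_s = 10 * u
d_min = u_s          # the slide only requires d_min >= u_s
for N in (10, 20, 30):
    print(N, d_client_server(F, u_s, d_min, N), d_p2p(F, u_s, d_min, [u] * N))
```

With these numbers the client-server time grows linearly (1, 2 and 3 hours for N = 10, 20 and 30), while the P2P expression is N/(10 + N) hours and therefore stays below one hour for any N, because every added peer also adds upload capacity.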


Lecture outline

Evolution of p2p networking seen through file-sharing applications

Other applications


P2P – not only sharing files…

Overlay: a network implemented on top of a network. E.g. peer-to-peer networks, “backbones” in ad hoc networks, transportation network overlays, electricity grid overlays ...

• Content delivery, software publication

• Streaming media applications

• Distributed computations (volunteer computing)

• Portal systems

• Distributed search engines

• Collaborative platforms

• Communication networks

• Social applications

• Other overlay-related applications....


Router Overlays for e.g. protection/mitigation of flooding attacks


P2P Case study: Skype

  • inherently P2P: pairs of users communicate.
  • proprietary application-layer protocol (inferred via reverse engineering)
  • hierarchical overlay with supernodes (SNs)
  • index maps usernames to IP addresses; distributed over SNs

[Figure: Skype clients (SC) attached to supernodes (SN), plus the Skype login server]

Peers as relays

Problem when both Alice and Bob are behind “NATs”.
  • NAT prevents an outside peer from initiating a call to an inside peer

Solution:
  • Using Alice’s and Bob’s SNs, a relay is chosen
  • Each peer initiates a session with the relay.
  • Peers can now communicate through NATs via the relay


More examples: a story in progress...

New power grids: be adaptive!

  • Bidirectional power and information flow
      – Micro-producers or “prosumers” can share resources
      – Distributed energy resources
  • Communication + resource-administration (distributed system) layer
      – aka “smart” grid


New power grids

Natural overlays for microgrids

Interested in a project on the topic?

  • To express interest, send email to <ptrianta>@ chalmers.se
  • Incl. motivation and info on you (courses, interests, some own work)
  • Research course in preparation, ac. year 2012-2013

Bibliography

Collective sources
  • Kurose, Ross: Computer Networking, a top-down approach, p2p in applications chapter. Addison-Wesley, 2009.
  • Aberer’s course notes:
    http://lsirwww.epfl.ch/courses/dis/2007ws/lecture/week%208%20P2P%20systems-general.pdf
    http://lsirwww.epfl.ch/courses/dis/2007ws/lecture/week%209%20Structured%20Overlay%20Networks.pdf

Papers for Further Study
  • Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. ACM SIGCOMM 2001, San Diego, CA, August 2001, pp. 149-160.
  • Kademlia: A Peer-to-peer Information System Based on the XOR Metric. Petar Maymounkov and David Mazières. 1st International Workshop on Peer-to-peer Systems, 2002.
  • Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems. A. Rowstron and P. Druschel. IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), November 2001.

Bibliography (cont)
  • A Scalable Content-Addressable Network. S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker. SIGCOMM 2001, San Diego, CA, USA, August 2001.
  • Freenet: A Distributed Anonymous Information Storage and Retrieval System. Ian Clarke, Oskar Sandberg, Brandon Wiley, and Theodore W. Hong. Int’l Workshop on Design Issues in Anonymity and Unobservability. LNCS 2009. Springer Verlag, 2001.
  • Viceroy: A Scalable and Dynamic Emulation of the Butterfly. D. Malkhi, M. Naor and D. Ratajczak. Proceedings of the 21st ACM Symposium on Principles of Distributed Computing (PODC '02), August 2002.
  • Incentives Build Robustness in BitTorrent. Bram Cohen. Workshop on Economics of Peer-to-Peer Systems, 2003.
  • Do incentives build robustness in BitTorrent? Michael Piatek, Tomas Isdal, Thomas Anderson, Arvind Krishnamurthy and Arun Venkataramani. NSDI 2007.
  • Exploiting BitTorrent For Fun (But Not Profit). iptps06.cs.ucsb.edu/talks/Liogkas_BitTorrent.ppt
  • Optimal Scheduling of Peer-to-Peer File Dissemination. J. Mundinger, R. R. Weber and G. Weiss. Journal of Scheduling, Volume 11, Issue 2, 2008.
  • Network Coding for Large Scale Content Distribution. Christos Gkantsidis and Pablo Rodriguez. IEEE INFOCOM, March 2005 (avalanche swarming).