A Measurement Study of Peer-to-Peer File Sharing Systems



Stefan Saroiu, P. Krishna Gummadi, Steven D. Gribble

Presented by Zhengxiang Pan
March 18th, 2003

Introduction

• Napster & Gnutella
• Population of users
• Bottleneck bandwidth of hosts & latencies
• Length of time peers remain connected
• Number of files shared & downloaded

Methodology-architecture

• Napster’s architecture
– A cluster of central servers
– Each peer connects to one server
– Servers cooperate to process queries

• Gnutella’s architecture
– No centralized servers
– Peers form an overlay network
– A query is sent as a controlled flood
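A controlled flood can be pictured as a breadth-first traversal of the overlay bounded by a time-to-live (TTL) counter. The sketch below is illustrative only: the `overlay` adjacency dictionary and the `flood` function are hypothetical names, and the duplicate suppression that real Gnutella peers perform with message identifiers is modeled here by a simple visited set.

```python
from collections import deque

def flood(overlay, source, ttl):
    """Return the set of peers reached by a TTL-limited flood from `source`.

    overlay: dict mapping each peer to an iterable of its neighbors
             (a hypothetical in-memory model of the overlay network).
    ttl:     number of overlay hops the query is allowed to travel.
    """
    reached = {source}
    frontier = deque([(source, ttl)])
    while frontier:
        peer, remaining = frontier.popleft()
        if remaining == 0:
            continue  # TTL exhausted: this peer does not forward the query
        for neighbor in overlay.get(peer, ()):
            if neighbor not in reached:  # a peer handles each query only once
                reached.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return reached

# Toy overlay: a - b - d - e, plus a - c
overlay = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b", "e"], "e": ["d"]}
print(sorted(flood(overlay, "a", 2)))
```

With a TTL of 2 the query started at `a` stops at `d`, so `e` never sees it; raising the TTL widens the reach at the cost of more messages.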

Methodology-crawler

• Napster crawler
– Open a large number of connections to a single server
– Issue popular queries in parallel
– Captured 40%-60% of the local users

• Gnutella crawler
– Iteratively send ping messages with large TTLs
– Discover new hosts from the pong messages received
– Captured 25%-50% of the total population
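The iterative ping/pong discovery loop can be sketched as follows. This is a simplified model under stated assumptions, not the study's crawler: `send_ping` is a hypothetical callback standing in for the network exchange, assumed to return the peer addresses advertised in the pong replies to a ping sent to one host, and `max_hosts` is an illustrative cap.

```python
def crawl(seed_hosts, send_ping, max_hosts=10000):
    """Discover peers by repeatedly pinging every newly learned host.

    seed_hosts: iterable of initially known peer addresses.
    send_ping:  hypothetical callable; given a host, returns the addresses
                found in the pong messages that came back.
    """
    discovered = set(seed_hosts)
    pending = list(discovered)
    while pending:
        host = pending.pop()
        for peer in send_ping(host):
            if len(discovered) >= max_hosts:
                return discovered  # stop once the cap is reached
            if peer not in discovered:
                discovered.add(peer)
                pending.append(peer)  # ping this peer in a later round
    return discovered

# Toy stand-in for the network: which peers each host would advertise.
fake_pongs = {"p1": ["p2", "p3"], "p2": ["p1", "p4"], "p3": [], "p4": ["p2"]}
print(sorted(crawl(["p1"], lambda host: fake_pongs.get(host, []))))
```

A crawl of this kind only sees peers that answer within the crawl window, which is consistent with the slide's point that only part of the population is captured.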

Methodology-directly measured characteristics

• Latency
– Measure the round-trip time of a 40-byte TCP packet exchanged with the peer

• Lifetime
– Offline: does not respond to TCP SYN packets
– Inactive: responds with a TCP RST
– Active: accepts the connection

• Bottleneck bandwidth
– Used as an approximation of available bandwidth
– Upstream and downstream are actively measured using a few TCP packets
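The three lifetime states map naturally onto the outcomes of a TCP connection attempt. The sketch below is one way to approximate that classification with Python's standard `socket` module; the function name `probe` and the default timeout are assumptions. The handshake time it returns is only a rough proxy for latency, not the 40-byte packet measurement described above, and a firewall that silently drops SYNs will look the same as an offline host.

```python
import socket
import time

def probe(host, port, timeout=3.0):
    """Classify a peer as 'active', 'inactive', or 'offline'.

    Returns (state, seconds), where seconds is the time the connection
    attempt took (a rough latency proxy when the state is 'active').
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            state = "active"    # handshake completed: the peer accepts connections
    except ConnectionRefusedError:
        state = "inactive"      # the host answered the SYN with a RST
    except OSError:
        state = "offline"       # timeout or unreachable: no answer to the SYN
    return state, time.monotonic() - start
```

Calling `probe` on the same peer at regular intervals yields a time series of states from which uptime and session duration can be estimated.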

Results-bandwidth

Downstream & upstream bottleneck bandwidth
- 50% of users in Napster & 60% in Gnutella use broadband connections
- 25% of users in Napster & 8% in Gnutella use modems
- 20% of users in Napster & 30% in Gnutella have high bandwidth (>3 Mbps)

Result-reported bandwidth

22% of Napster users report their bandwidth as “unknown”

Result-latency

Latencies for Gnutella users
- In an unstructured, ad-hoc overlay, a substantial fraction of connections suffer from high latency
- Latencies differ noticeably for trans-oceanic peers

Result-availability

- Only 20% of peers had an IP-level uptime of 93% or more
- Median session duration: 60 minutes

Result-files

- 25% of peers in Gnutella do not share any files
- 40%-60% of peers share only 5%-20% of the shared files

Result-download & upload

The percentage of peers in each bandwidth class is roughly the same as the percentage of files shared by that bandwidth class.

Result-cooperation

- 30% of the users that report their bandwidth as 64 Kbps or less actually have a significantly greater bandwidth.
- 10% of the users reporting high bandwidth (3 Mbps or higher) in reality have significantly lower bandwidth.
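A comparison of reported and measured bandwidth can be sketched in a few lines. The slide does not say what counts as "significantly" different, so the `factor` threshold below is purely an assumption, as are the function name and the labels it returns.

```python
def classify_report(reported_kbps, measured_kbps, factor=2.0):
    """Compare a peer's self-reported bandwidth with a measured value.

    factor is an assumed threshold: a report is flagged only when the two
    values differ by at least this multiple.
    """
    if measured_kbps >= factor * reported_kbps:
        return "under-reported"   # peer claims less bandwidth than it has
    if measured_kbps * factor <= reported_kbps:
        return "over-reported"    # peer claims more bandwidth than it has
    return "consistent"

print(classify_report(64, 1500))    # modem claimed, broadband measured
print(classify_report(3000, 500))   # high bandwidth claimed, much less measured
```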

Result-resilience of Gnutella overlay

Although highly resilient in the face of random breakdowns, Gnutella is nevertheless highly vulnerable in the face of well-orchestrated, targeted attacks.
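The contrast between random breakdowns and targeted attacks can be illustrated with a small simulation. This is a toy model, not the study's analysis: it assumes a skewed degree distribution, generated here by a hypothetical `preferential_graph` helper that grows a tree in which new nodes prefer well-connected ones, which exaggerates the effect compared with a real overlay. All function names are illustrative.

```python
import random

def largest_component(adj, removed):
    """Size of the largest connected component once `removed` nodes are deleted."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first search over the surviving nodes
            node = stack.pop()
            size += 1
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best

def random_failures(adj, k, rng):
    """Pick k nodes uniformly at random."""
    return set(rng.sample(sorted(adj), k))

def targeted_attack(adj, k):
    """Pick the k nodes with the highest degree."""
    return set(sorted(adj, key=lambda n: len(adj[n]), reverse=True)[:k])

def preferential_graph(n, rng):
    """Grow a graph where each new node links to one existing node,
    chosen with probability proportional to that node's degree."""
    adj = {0: {1}, 1: {0}}
    endpoints = [0, 1]  # each node appears once per incident edge
    for new in range(2, n):
        target = rng.choice(endpoints)
        adj[new] = {target}
        adj[target].add(new)
        endpoints += [new, target]
    return adj

rng = random.Random(0)
graph = preferential_graph(500, rng)
k = 25  # remove 5% of the nodes in each scenario
print("random failures :", largest_component(graph, random_failures(graph, k, rng)))
print("targeted attack :", largest_component(graph, targeted_attack(graph, k)))
```

Comparing the two printed sizes shows how much more of the overlay stays connected when the removed nodes are chosen at random rather than by degree.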

Conclusion

• Heterogeneity of hosts
– Carefully delegate responsibilities

• Clear evidence of client-like and server-like behaviors

• Peers tend to misreport information if there is an incentive to do so
– Build in incentives for telling the truth
– Verify reported information
