Cost-effective broadcast for fully decentralized peer-to-peer networks

Embed Size (px)

Text of Cost-effective broadcast for fully decentralized peer-to-peer networks

  • Cost-effective broadcast for fully decentralized peer-to-peer networks

    Marius Portmanna,*, Aruna Seneviratneb

    aSchool of Electrical Engineering and Telecommunications, University of New South Wales, Sydney, AustraliabComputer and Networks Laboratory, Swiss Federal Institute of Technology ETH, CH-8092 Zurich, Switzerland

    Received 9 October 2002; accepted 9 October 2002

    Abstract

    Fully unstructured and decentralized peer-to-peer networks such as Gnutella are appealing for a variety of applications, among which file-

    sharing is the most prominent one. The decentralized nature of these systems provides a high degree of robustness and the ability to cope with

    a highly dynamic and transient network environment. However, the lack of centralized directory nodes makes the task of searching more

    expensive and difficult. In completely unstructured peer-to-peer networks, searching can only be realized via application-layer broadcast,

    where query messages are routed to every node in the network. Gnutella implements application-layer broadcast by using flooding as the

    underlying message routing mechanism. Flooding creates a large amount of traffic and can quickly exhaust the resources of nodes in a large

    network. In this paper, we explore Rumor mongering (also known as Gossip) as a more cost-effective and scalable alternative to flooding for

    implementing services such as searching in decentralized peer-to-peer networks. We further present a new variant of the Rumor mongering

    protocol, which exploits the power-law characteristics of typical peer-to-peer networks and achieves a significant further reduction in cost.

    q 2002 Elsevier Science B.V. All rights reserved.

    Keywords: Peer-to-peer networks; Broadcast; Rumor mongering

    1. Introduction

    Recently, there has been a lot of activity in the area of

    peer-to-peer networking, sparked by the popularity of file-

    sharing applications such as Napster,1 Freenet [12] and

    Gnutella.2 Although the exact meaning of the term peer-to-

    peer is very debatable, these systems typically depend on a

    number of voluntarily participating nodes contributing

    resources to create some form of infrastructure. The most

    well-known service that such a peer-to-peer infrastructure

    can provide is certainly file-sharing, but other systems with

    completely different applications such as SETI@home3 are

    also labeled as peer-to-peer.

    Even among the peer-to-peer systems aimed at file-

    sharing, there exist big differences in terms of architecture

    and implementation of key mechanisms such as searching.

    Napster, for instance, relies on a central indexing server to

    locate files that are shared. Gnutella implements searching

    in a fully decentralized and distributed manner, without the

    need for any special nodes with facilitating or adminis-

    trative roles. The advantages of a decentralized architecture

    are clear. It allows accommodating the highly dynamic and

    transient nature of typical peer-to-peer networks, where

    nodes frequently join and leave the network. Furthermore,

    the absence of a central node that represents a single point of

    failure, results in increased fault tolerance and robustness.

    Gnutella provides two types of services via its overlay

    network: Searching for files and peer discovery. These

    services are implemented through broadcasting messages

    across the network. This application-level broadcast is

    implemented with flooding as the underlying routing

    mechanism, where every node acts a router and forwarder

    for messages sent by other nodes. The use of flooding

    obviously raises the question of cost and scalability.

    Individual nodes can be overloaded with messages to

    forward, and might not have enough resources such as CPU

    cycles, memory or network bandwidth required for the task.

    This problem has also been observed in the real Gnutella

    system,4 where the fact that nodes could not keep up with

    0140-3664/03/$ - see front matter q 2002 Elsevier Science B.V. All rights reserved.

    PII: S0 14 0 -3 66 4 (0 2) 00 2 50 -5

    Computer Communications 26 (2003) 11591167

    www.elsevier.com/locate/comcom

    * Corresponding author.

    E-mail addresses: marius@ieee.org (M. Portmann), a.seneviratne@

    unsw.edu.au (A. Seneviratne).1 The Napster home page, http://www.napster.com2 The Gnutella home page, http://gnutella.wego.com3 The SETI@home home page, http://setiathome.ssl.berkeley.edu

    4 Distributed Search Solutions (DSS) group, Gnutella: To the Bandwidth

    Barrier and Beyond, http://www.clip2.com/gnutella.html

  • the rate of messages to be forwarded has lead to a

    fragmentation of the network. It is therefore important for

    the successful deployment of decentralized peer-to-peer

    networks that the problem of cost and scalability is

    addressed, and more cost-effective methods for appli-

    cation-level broadcast are explored.

    As an alternative to flooding, we study the use of Rumor

    mongering (also known as Gossip) for message routing.

    Rumor mongering trades off reliability and speed for a

    reduction in cost, by forwarding messages only to a randomly

    selected subset of neighbors. By means of simulation, we

    show that Rumor mongering can implement application-

    layer broadcast at a considerably lower cost than flooding.

    We further present a novel variant of the protocol, called

    Deterministic rumor mongering, which is an even more

    cost-effective method of broadcast message routing. Deter-

    ministic rumor mongering makes a more intelligent decision

    about message forwarding by making use of the fact that the

    topology of a typical peer-to-peer network displays a power-

    law distribution of the node degrees.5 In networks with a

    power-law characteristic [3], few nodes have a very high

    degree and many have a low degree, and the number of

    nodes with a certain degree decreases with the degree

    according to the following power law: fd / dk: The numberof nodes of a degree d, fd, is proportional to d to the power of

    a negative constant k.

    In Ref. [1] Adamic et al. addresses the problem of

    searching in power-law networks. They suggest a random-

    walk search strategy, in which the query messages are

    directed towards nodes of a high degree, for which the paper

    shows a superior scaling behavior in terms of cost.

    However, their work does not consider that by directing

    messages towards high-degree nodes, these nodes are very

    likely to be overloaded, especially in peer-to-peer networks

    where resources can be assumed to be distributed rather

    uniformly among the nodes. This is in contrast to the

    approach taken in this paper.

    The rest of the paper is structured as follows. In Section

    2, we introduce our simulation platform and we discuss

    relevant parameters such as the metric of cost. Section 3

    studies the cost of flooding-based broadcast, as it is

    implemented in Gnutella. In Sections 4 and 5, we evaluate

    two variants of Rumor mongering as an alternative to

    flooding. Finally, Section 6 concludes the paper.

    2. Simulationplatform and parameters

    2.1. Flooding-based broadcast

    We are going to study the cost of flooding-based

    application-layer broadcast as it is implemented in Gnutella.

    The corresponding rules for routing messages are as

    follows:

    In general, a message received by a node is forwarded toall its neighbors, except for the one, where the message

    was received from.

    Each message has a Time To Live (TTL) field which isdecremented by one at each visited node. If the TTL

    reaches zero, the message is no longer forwarded.

    Each message has an ID to uniquely identify it on thenetwork.Every nodekeeps a record of IDsofmessages that

    it has seen in the recent past. If a node receives a message

    with the same ID and type as one that it has received

    before,6 the message is ignored and not forwarded.

    2.2. Metric of cost

    In order to evaluate the cost of application-level broad-

    cast mechanisms, we first need to define a metric of cost. An

    obvious choice for this is the average number of messages

    that need to be forwarded per node reached in a single

    broadcast. The number of messages forwarded by a node

    relates directly to the amount of resources, such as

    bandwidth and CPU cycles, that are consumed. As our

    metric, we define the cost c as follows:

    c 1r

    XN

    i1mi

    where r is the number of nodes reached by a broadcast, mi,

    the number of messages forwarded by node i, N is the

    number of nodes in the network.

    To illustrate the cost metric defined above, Fig. 1 shows

    Gnutellas broadcast mechanism for two simple topologies

    which consist of three nodes. The first is a simple spanning

    tree with only two bidirectional links. The second has the

    same number of nodes but has an additional link between

    node 2 and 3.

    As the first step, the initiator of the broadcast (node 1)

    sends messages to its immediate neighbors (2 and 3). In the

    case of the first topology, the broadcast is completed. The

    total number of messages generated is 2, one per node

    reached. Therefore, according to our metric, the cost is 1. In

    the second topology, additional messages are sent from

    node 2 to 3 and vice versa, and the cost is 2, twice as high.

    The theoretical minimal cost for the type of broadcast

    considered here is one message per node to

Recommended

View more >