
[IEEE 2009 First International Conference on Ubiquitous and Future Networks (ICUFN) - Hong Kong, China (2009.06.7-2009.06.9)] 2009 First International Conference on Ubiquitous and


AN EFFICIENT GOSSIP BASED OVERLAY NETWORK FOR PEER-TO-PEER NETWORKS

Muhammad Hasan Islam Sanaa Waheed Izza Zubair

Center for Advanced Studies in Engineering, Islamabad, Pakistan

Fatima Jinnah Women University, Rawalpindi, Pakistan

Center for Advanced Research in Engineering, Islamabad, Pakistan

[email protected] [email protected] [email protected]

978-1-4244-4216-4/09/$25.00 © 2009 IEEE ICUFN 2009

Abstract— Overlay networks have emerged as a means to enhance end-to-end application performance and availability. Network topology plays an important role in utilizing resources in peer-to-peer (P2P) systems. Epidemic algorithms are potentially effective solutions for disseminating information in large-scale, dynamic systems. P2P networks are known for their dynamicity, and epidemic algorithms suit them well: they are easy to deploy, robust, and highly resilient to failures. They proactively fight random process and network failures and do not need any reconfiguration when failures occur. This characteristic is particularly useful in P2P systems deployed on the Internet or on ad-hoc networks.

In this paper we propose a new efficient gossip-based algorithm for intelligent node selection (INS) and for maintaining the local view of a node. We argue that random node selection leads to data duplication, which increases bandwidth utilization, whereas selecting nodes with INS reduces bandwidth utilization significantly. Using INS in an application-level gossip multicast protocol makes the routing path converge rapidly. For INS, a distinctive overlay network that closely matches the Internet topology is constructed by combining different topology-aware techniques. A P2P system based on this structure is not only highly efficient for routing but also keeps maintenance overhead very low, even in highly dynamic environments such as ad-hoc networks.

Keywords— Overlay Networks, Epidemic algorithms, Gossip Protocol, Intelligent node selection

I. INTRODUCTION

An overlay network is built on top of an existing network. It can be visualized as a virtual network of nodes and logical links. Recently, overlay networks have been used to improve end-to-end application performance and availability.

Communication bandwidth is often a scarce resource in dynamic networks, so information sharing should involve only small messages. In particular, any protocol that collects all local data at a single node will create communication bottlenecks. For large distributed environments, scalability, reliability, robustness, and fast data delivery are the main concerns. The traditional centralized approach to network and service management does not work well in large, complex, heterogeneous networks, and the need for an automated, distributed, and decentralized management approach is growing.

Recently, gossip-based protocols have been developed to provide highly reliable and scalable message delivery. Gossip protocols are widely used to reduce control message overhead [7]. They are scalable because they do not require as much synchronization as traditional reliable multicast protocols. In gossip-based protocols, each node contacts one or a few nodes in each round, usually chosen at random, and exchanges information with them. The dynamics of information spread stem from work in epidemiology and lead to high fault tolerance. Gossip-based protocols usually do not require error recovery mechanisms and thus enjoy a large advantage in simplicity, while often incurring only moderate overhead compared to optimal deterministic protocols.

However, gossip requires a longer time for each node to get the message. While reducing message dissemination overhead, we still want to maintain the speedy information delivery provided by multicast or broadcast.

Gossip protocols are resilient against common failures and are relatively simple. However, aimless gossiping does not guarantee good reliability. A pure gossip protocol [10, 11] proceeds in rounds; in each round, a node randomly chooses a participating process with which to share information. There are a number of variations on how the process or node is selected in each round. In “pull” gossip, information is forwarded when a node requests it, whereas in “push” gossip, information is forwarded without a request being originated. It has been shown that in a group of N processes (or nodes), it takes O(log N) rounds for every process to become infected with the information. These are called epidemic algorithms because their information flow resembles the spread of a disease in epidemiology.
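The O(log N) behaviour of push gossip can be observed with a small simulation. The following is a minimal sketch, not the simulation used in this paper: the function name `push_gossip_rounds`, the fixed seed, and the rule that every infected node pushes to one uniformly random node per round (possibly itself or an already infected node) are all assumptions made for illustration.

```python
import math
import random


def push_gossip_rounds(n, seed=0):
    """Simulate push gossip among n nodes; return the rounds until all are infected."""
    rng = random.Random(seed)      # fixed seed keeps a run repeatable
    infected = {0}                 # node 0 originates the message
    rounds = 0
    while len(infected) < n:
        # Every infected node pushes the message to one uniformly random node.
        targets = {rng.randrange(n) for _ in range(len(infected))}
        infected |= targets
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (64, 1024, 16384):
        rounds = push_gossip_rounds(n)
        print(n, "nodes:", rounds, "rounds; log2(n) =", round(math.log2(n), 1))
```

Because the infected set can at most double in a round, the round count never drops below log2(N); the random duplicates explain why it is somewhat larger in practice.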

In large-scale dynamic networks, broadcasting and flooding lead to data loss and congestion. Over the Internet, reliable communication is performed end-to-end, and potentially every P2P node can participate and communicate. Reliable gossip is based on a simple heuristic: messages are flooded on critical links, and gossiping is done over the other links. A possible variant is directional gossip [1], which is primarily aimed at reducing the communication overhead of traditional gossip protocols [6].

This paper focuses on improving the performance of overlay networks by developing a technique that blends attributes of gossip and overlays. It presents a new gossip protocol that can provide higher reliability than traditional gossip protocols while reducing the overhead involved. When a message m is received, instead of forwarding it to all members, a gossip node intelligently selects some nodes in such a way that, at the end of gossiping, all nodes have received the data. By employing this intelligent node selection (INS) scheme, the technique considerably reduces the average latency and jitter of reliable communication and avoids data duplication. Such an approach has to consider networking aspects such as network topology and round-trip time (RTT). The approach uses a modified directional gossip strategy. It is assumed that each node knows its immediate neighbors. A node selects some, not all, of its neighbors to send data to, so that the network experiences comparatively low overhead compared to existing gossip. In the rest of the paper, a P2P node is termed a gossip node.

II. LITERATURE REVIEW

Ralitsa Kostadinova and Constantin Adam [5] analyze epidemic algorithms, focusing on information dissemination. Epidemic algorithms offer effective solutions for disseminating information in large-scale, dynamic systems. They are easy to deploy, robust, and highly resilient to failures, which makes them particularly useful in P2P systems.

Yair Amir and Claudiu Danilov [3] discuss reliable point-to-point communication, which in overlay networks is usually achieved by applying TCP at the end nodes. They achieve reliability on a hop-by-hop basis instead, thus reducing latency and jitter considerably. Their approach is feasible and beneficial in overlay networks that do not have the scalability and interoperability requirements of the global Internet. Although their scheme adds overhead at the nodes, this overhead is small relative to the overall gain.

Junghee Han, David Watson, and Farnam Jahanian [4] present topology-aware overlay networks, which use the inherent redundancy of the Internet’s underlying routing infrastructure to redirect packets along an alternate path when the primary path is unavailable (because a link is down or congested). However, the efficiency of these overlay networks depends on the availability of an alternate path between the two hosts in terms of physical links, routing infrastructure, administrative control, and geographical distribution. Their analysis shows that a single-hop overlay path provides the same degree of path diversity as a multi-hop overlay path for more than 90% of source and destination pairs. Finally, they validate that their architecture provides a significant amount of resilience to real-world failures.

Meng-Jang Lin and Keith Marzullo present the concept of directional gossip [1], which is built around gossip servers. A gossip server is aware of the network and tracks the direct paths from a node to its neighbors. A gossip server s executes the following protocol. Initially, s knows only the direct paths connecting it to its neighbors. It initializes a weight vector, assigning an initial weight of one to each neighbor. The initial weights may be low, but as the gossip server learns more paths by sending new messages to all of its neighbors, it computes more accurate weights. Directional gossip uses a deterministic, topology-based node selection mechanism to optimize query processing in a WAN. A node identifies the critical directions in which it has to forward gossip messages, which results in low to moderate overhead. Directional gossip therefore reduces overhead over time while achieving good reliability.

David Andersen, Hari Balakrishnan, Frans Kaashoek, and Robert Morris worked on the Resilient Overlay Network (RON) [2], an architecture that allows distributed Internet applications to detect and recover from path outages and periods of degraded performance within several seconds. They found that forwarding packets via at most one intermediate RON node is sufficient to overcome faults and improve performance in most cases. These improvements, particularly in fault detection and recovery, demonstrate the benefits of moving some of the control over routing into the hands of end systems.

Communication bandwidth is often a scarce resource during attacks, so sharing attack information should involve only small messages. Recently, gossip-based protocols have been developed to reduce control message overhead while still providing high reliability and scalability of message delivery [9]. Gossip protocols are scalable because they do not require as much synchronization as traditional reliable multicast protocols. In gossip-based protocols, each node contacts one or a few nodes in each round (usually chosen at random) and exchanges information with them. The dynamics of information spread resemble the spread of an epidemic and lead to high fault tolerance. Gossip-based protocols usually do not require error recovery mechanisms and thus enjoy a large advantage in simplicity, while often incurring only moderate overhead compared to optimal deterministic protocols.

III. GOSSIP PROTOCOL

Gossip nodes communicate with each other by exchanging messages. Every node maintains a membership list of all its neighbors. When a gossip node receives data, or when it observes an event, it forwards the data packet to its member nodes. In contrast to multicast and broadcast, a gossip node selects one node per round to send data to, because gossiping is done in rounds.

Gossiping is an interesting approach because it can distribute the load among all nodes in the system. Ideally, each participant would select gossip targets at random from the entire system membership. However, this is not scalable: maintaining full membership information about all participating nodes has a high memory cost, and the update traffic required to keep that information consistent and up to date causes network congestion.

With a partial view, a node selects the peers to which it relays gossip messages from a small subset of the entire system membership. This resolves the scalability issue but makes the system more vulnerable to node failures, for example in the case of network partitioning. The issue can be resolved by using INS to construct the partial views; gossip protocols can then be used to implement highly scalable and resilient reliable broadcast primitives.

Gossip-based broadcast protocols are less efficient than approaches that rely on structured overlays to disseminate information: the intrinsic redundancy of gossip produces more network traffic, which can exhaust network capacity and make conditions unsuitable for broadcasting. Making the gossip mechanism more efficient therefore involves a trade-off between the construction complexity and repair time of structured overlays and the high traffic cost they avoid.

A. Gossip Mechanism

A generic gossip protocol running at process p has a structure something like the following:

    when (p receives a new message m)
        while (p believes that not enough of its neighbors have received m) {
            q = a neighbor process of p;
            send m to q;
        }

In the generic protocol, when node p receives message m, it simply forwards it to a neighbor process q, repeating until it believes that m has been propagated to enough of its neighbors. The process is simple, reliable, and scalable compared to multicast, broadcast, and flooding, which may stall when the number of nodes is large; propagation is limited by a TTL to avoid flooding the network. However, it has been observed that the overhead of gossip protocols can be reduced by taking the network topology into account.
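The generic loop can be made concrete as follows. This is a minimal sketch under stated assumptions, not the protocol evaluated in this paper: the function `gossip` and its parameters are hypothetical, and the condition "p believes that not enough of its neighbors have received m" is modelled, for illustration only, as "p has sent fewer than `fanout` copies".

```python
import random
from collections import deque


def gossip(neighbors, source, fanout=2, seed=0):
    """Spread one message from source; return (nodes reached, copies sent).

    neighbors maps each node to the list of its neighbor nodes.
    """
    rng = random.Random(seed)
    received = {source}
    pending = deque([source])          # nodes that hold m but have not forwarded it
    sent = 0
    while pending:
        p = pending.popleft()
        # Assumption: p is satisfied once it has sent `fanout` copies of m.
        for q in rng.sample(neighbors[p], min(fanout, len(neighbors[p]))):
            sent += 1                  # every copy costs bandwidth, duplicate or not
            if q not in received:
                received.add(q)
                pending.append(q)
    return received, sent


if __name__ == "__main__":
    ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
    reached, copies = gossip(ring, source=0)
    print(len(reached), "nodes reached with", copies, "copies")
```

The gap between the copies sent and the number of nodes reached is the duplication that topology-aware selection tries to reduce.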

IV. PROPOSED DESIGN

It is assumed that nodes in the network are connected by TCP in some desirable topology. On top of this, we propose an optimized data dissemination technique. A formal framework is required that is simple yet powerful enough to capture most of the interesting structures. Nodes maintain the addresses of other nodes through partial views, which are sets of node descriptors. In addition to an address, a node descriptor contains a profile membership list, which holds properties of the node such as its ID, hop count, and RTT.
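A partial view of node descriptors could be represented as in the sketch below. The class names `NodeDescriptor` and `PartialView`, the field names, and the "host:port" address format are assumptions introduced for illustration; the text only states that a descriptor carries an address together with properties such as ID, hop count, and RTT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeDescriptor:
    node_id: str
    address: str       # assumed "host:port" form of the TCP endpoint
    hop_count: int
    rtt_ms: float


class PartialView:
    """A set of node descriptors, keyed by node ID."""

    def __init__(self):
        self._entries = {}

    def upsert(self, descriptor):
        # A newer descriptor for a known node replaces the old one.
        self._entries[descriptor.node_id] = descriptor

    def remove(self, node_id):
        self._entries.pop(node_id, None)

    def descriptors(self):
        return list(self._entries.values())


if __name__ == "__main__":
    view = PartialView()
    view.upsert(NodeDescriptor("a1", "10.0.0.2:9000", hop_count=1, rtt_ms=4.0))
    view.upsert(NodeDescriptor("a1", "10.0.0.2:9000", hop_count=1, rtt_ms=3.5))
    print(view.descriptors())
```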

A. Network Topology

Our proposed overlay network topology consists of Capital Nodes (CN), Intermediate Nodes (IN), and Leaf Nodes (LN).

All CNs are directly connected to all other CNs and to their corresponding INs. In the same manner, all INs that belong to one CN are directly connected with each other and with their corresponding LNs. In the third layer of the hierarchy, all LNs that belong to one IN are directly connected with one another and with the corresponding IN.
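The three connection rules can be turned into an adjacency structure as sketched below. The function `build_overlay` and the nested-dictionary input format are hypothetical and chosen only for illustration; the node names in the usage example are made up.

```python
from itertools import combinations


def build_overlay(hierarchy):
    """Build the overlay links from {CN: {IN: [LN, ...]}}.

    Returns a dict mapping every node to the set of its direct neighbors.
    """
    links = {}

    def connect(a, b):
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)

    def full_mesh(nodes):
        for a, b in combinations(nodes, 2):
            connect(a, b)

    full_mesh(list(hierarchy))                 # all CNs are interconnected
    for cn, intermediates in hierarchy.items():
        links.setdefault(cn, set())
        full_mesh(list(intermediates))         # INs of one CN are interconnected
        for inode, leaves in intermediates.items():
            connect(cn, inode)                 # each IN links to its CN
            full_mesh(leaves)                  # LNs of one IN are interconnected
            for leaf in leaves:
                connect(inode, leaf)           # each LN links to its IN
    return links


if __name__ == "__main__":
    overlay = build_overlay({"A": {"a1": ["a11", "a12"], "a2": []},
                             "B": {"b1": ["b11"]}})
    for node in sorted(overlay):
        print(node, sorted(overlay[node]))
```

Each node's neighbor set in this structure is exactly its set of one-hop neighbors.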

FIGURE 1: Topology of Gossip Overlay

Figure 1 describes the proposed topology, where capital nodes (CN) are represented by thick solid circles, intermediate nodes by double-line circles, and leaf nodes by single-line circles. The capital nodes A, B, and C are connected to each other and to their corresponding intermediate nodes. There can be n INs (a1, a2, a3 … an) in the care of one CN. All of these INs have direct links with each other; that is, they are one-hop neighbors. This yields a hierarchical topology in which the network is partitioned on a geographical basis. The pattern continues in the lower layers.

C. Proposed Protocol

When a node receives data or wishes to spread information, it selects some member nodes to forward the data to, unlike the basic gossip mechanism.

D. Protocol Design

This overview of the gossip overlay protocol design describes the node discovery mechanism and the way nodes select peers to forward data to.

1. Membership

A gossip node establishes connections with its neighbor nodes through a synchronous mechanism: it issues a ping request carrying its node ID to its one-hop neighbors, and each receiver replies with its own node ID. At the end of the exchange, both nodes know each other's node IDs; node IDs are process names.

The gossip protocol maintains a membership table that contains the one-hop neighbors. Initially, a gossip node has an empty membership record. If a CN wants to discover its members, it broadcasts a ping with a TTL value of 1. The RTT is calculated by timing the interval between sending the ping and receiving the pong. Each gossip node keeps the node IDs and RTTs of its members in its membership list.
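The ping and pong exchange can be sketched without real sockets. In the following illustration the class `GossipNode`, the function `discover`, and the `link_delay_ms` table are hypothetical, and the timer is replaced by a simulated, symmetric one-way link delay, so that the RTT is simply twice that delay.

```python
from dataclasses import dataclass, field


@dataclass
class GossipNode:
    node_id: str
    membership: dict = field(default_factory=dict)   # node ID -> RTT in ms (None if unknown)

    def on_ping(self, sender_id):
        # The receiver learns the sender's ID and answers with its own ID (the pong).
        self.membership.setdefault(sender_id, None)
        return self.node_id


def discover(node, one_hop_neighbors, link_delay_ms):
    """Ping every one-hop neighbor (TTL = 1) and record its ID and RTT."""
    for peer in one_hop_neighbors:
        one_way = link_delay_ms[frozenset((node.node_id, peer.node_id))]
        peer_id = peer.on_ping(node.node_id)        # ping arrives after one_way ms
        node.membership[peer_id] = 2 * one_way      # pong returns after another one_way ms


if __name__ == "__main__":
    a, b, c = GossipNode("A"), GossipNode("B"), GossipNode("C")
    delays = {frozenset(("A", "B")): 10, frozenset(("A", "C")): 5}
    discover(a, [b, c], delays)
    print(a.membership, b.membership)
```

In a deployment the RTT would come from a clock reading taken when the ping is sent and again when the pong arrives, as the text describes.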

a) Membership List

Each node selects its members by considering two things: minimum hop count and minimum RTT. For maximum throughput, a node selects members that are one hop away, which provides speedy delivery. Hence, the membership list of a CN contains the other CNs and its corresponding INs, because only these are its immediate neighbors. The membership list of an IN contains the CN to which it belongs, the other INs of the same CN, and its corresponding LNs. Similarly, the membership list of an LN contains the IN to which it belongs and the other LNs of the same IN.
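The two selection criteria translate into a filter followed by a sort. The sketch below assumes, for illustration only, that candidates are given as (node ID, hop count, RTT in ms) tuples; the function name `build_membership_list` is hypothetical.

```python
def build_membership_list(candidates):
    """Keep the one-hop candidates and order them by increasing RTT.

    candidates is an iterable of (node_id, hop_count, rtt_ms) tuples.
    """
    one_hop = [c for c in candidates if c[1] == 1]    # minimum hop count
    one_hop.sort(key=lambda c: c[2])                  # then minimum RTT first
    return [node_id for node_id, _hops, _rtt in one_hop]


if __name__ == "__main__":
    seen = [("B", 1, 12.0), ("a1", 1, 4.0), ("a11", 2, 6.0), ("a2", 1, 9.0)]
    print(build_membership_list(seen))
```

Keeping the list ordered by RTT means that a later step can take the first few entries whenever it needs the members with minimum RTT.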

b) Node Selection Mechanism

When a node receives a message m, instead of forwarding it to all members it selects some nodes on a priority basis to avoid data duplication. INS chooses targets so that the data spreads through the minimum number of nodes in the minimum number of gossip rounds. The selection made by a CN, IN, or LN varies depending on the type of the sender.

c) Member Selection at Capital Node

There are two possibilities.

If the sending node is also a CN, some of its peer CNs have already received this message from the same sender. The CN forwards m to some CNs in its membership list, those with minimum RTT, which reduces the number of rounds. So the sender CN sends m to only three other CNs, and these three CNs are then supposed to spread m to all other CNs. The sender CN also sends the IDs of its targets to avoid duplication.

If the sending node is an IN, the peer CNs have not received this message from the same sender. The CN forwards m to any 3 corresponding CNs with minimum RTT, and these three CNs are then supposed to spread m to all other CNs.

d) Member Selection at Intermediate Node

There are three possibilities.

If the sending node is a CN, the peer INs have not received m, so the IN forwards m to 3 corresponding INs with minimum RTT.


Figure 2: Membership in Gossip Overlay

If the sending node is also an IN, some of the peer INs have received m, so the IN forwards m to 3 LNs and 3 INs in its membership list, again with minimum RTT. If the sending node is an LN, the IN simply forwards m to 3 corresponding INs with minimum RTT.

e) Member Selection at Leaf Node

There are two possibilities.

If the sending node is an IN, the LN forwards m to 3 LNs in its membership list; on receiving m, each of those LNs sends it on to its peer LNs with minimum RTT. If the sending node is also an LN, the LN forwards m to its peer LNs with minimum RTT.

Hence, when a node receives m from a neighbor in the same layer, it sends m to its neighbor with a node ID 3 times greater than that of the receiving node. For this purpose, either node IDs must encode the network topology, or the sending node must send the IDs of its targets concatenated with m.
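The selection cases for CN, IN, and LN can be collected into one lookup table keyed by the roles of the receiver and the sender. This is a hedged sketch rather than the authors' implementation: the names `RULES`, `FANOUT`, and `select_targets` are hypothetical, the membership list is assumed to hold (node ID, role, RTT in ms) tuples, and a fan-out of three is also assumed for the LN-to-LN case, where the text gives no number.

```python
FANOUT = 3   # number of targets per role, as in the CN and IN cases of the text

# (receiver role, sender role) -> roles of the members that m is forwarded to
RULES = {
    ("CN", "CN"): ["CN"],
    ("CN", "IN"): ["CN"],
    ("IN", "CN"): ["IN"],
    ("IN", "IN"): ["LN", "IN"],
    ("IN", "LN"): ["IN"],
    ("LN", "IN"): ["LN"],
    ("LN", "LN"): ["LN"],     # assumption: same fan-out as the other cases
}


def select_targets(receiver_role, sender_role, sender_id, membership,
                   already_targeted=()):
    """Pick the forwarding targets of a node that received m from sender_id.

    membership is a list of (node_id, role, rtt_ms) tuples; already_targeted
    holds the target IDs that the sender attached to m.
    """
    excluded = set(already_targeted) | {sender_id}
    targets = []
    for role in RULES[(receiver_role, sender_role)]:
        peers = sorted((rtt, node_id) for node_id, r, rtt in membership
                       if r == role and node_id not in excluded)
        targets += [node_id for _rtt, node_id in peers[:FANOUT]]   # minimum RTT first
    return targets


if __name__ == "__main__":
    members_of_a1 = [("A", "CN", 5), ("a2", "IN", 8), ("a3", "IN", 3),
                     ("a4", "IN", 6), ("a5", "IN", 9), ("l1", "LN", 2), ("l2", "LN", 4)]
    print(select_targets("IN", "CN", "A", members_of_a1))
    print(select_targets("IN", "IN", "a3", members_of_a1, already_targeted=["a4"]))
```

Passing the sender's target IDs along with m, as the text suggests, is what lets a receiver skip peers that are already covered.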

V. ANALYSIS AND RESULTS

The proposed architecture and the existing model are tested both with the same delay value on all links and with different delays on different links. The network is tested for the following performance parameters: throughput, duplication rate, and convergence time and end-to-end delay.

The network is tested with different delay values because different links mostly experience different delays. In the experiment, the number of nodes is 39 to address the scalability issue; for this large network, statistics are recorded at events #250 and #350 using OMNeT++. It is observed that about 200 events are required to discover the topology.

A. Results

The results are summarized and explained below.

[Chart "Result Summary": GossipOp versus Gossip; horizontal axis "Parameters" with Throughput (%age), Duplication (%age), and Convergence (sec); vertical axis "Values at 350 event", scale 0 to 80]

Figure 3: Summary of some results in Large Networks

a) Throughput:

In the gossip protocol, the throughput percentage is about 19% when 250 events are scheduled, whereas in optimized gossip it is about 16%. With the passage of time, it increases to 30% when 350 events are scheduled, while in optimized gossip the throughput percentage increases to 36%.

b) Duplication Rate:

In the gossip protocol, the duplication percentage is about 21% when 250 events are scheduled, whereas in optimized gossip it is about 15%. With the passage of time, it increases to 34% when 350 events are scheduled, while in optimized gossip the duplication percentage increases to 25%. In both gossip protocols the duplication rate increases with the number of messages, but it is lower in optimized gossip than in original gossip.


c) Convergence:

In the gossip protocol, the convergence time for the whole network is the same for all gossip nodes; it is about 63 seconds, whereas in optimized gossip it is about 64 seconds.
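The throughput and duplication figures quoted for 250 scheduled events can be compared both in percentage points and in relative terms. The short calculation below is only a reading aid: the helper `compare` is hypothetical, and the numbers are the 250-event values reported in this section.

```python
def compare(baseline, optimized):
    """Return (difference in percentage points, relative change in percent)."""
    diff = optimized - baseline
    return diff, 100.0 * diff / baseline


# (gossip, optimized gossip) percentages at 250 scheduled events
at_250_events = {
    "throughput": (19, 16),
    "duplication": (21, 15),
}

if __name__ == "__main__":
    for name, (gossip, optimized) in at_250_events.items():
        points, relative = compare(gossip, optimized)
        print(f"{name}: {points:+d} points, {relative:+.1f}% relative")
```

At 250 events the duplication rate is 6 percentage points lower in optimized gossip, which is a relative reduction of roughly 29%.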

B. ANALYSIS

Iterative software development techniques are used. The graphical results show that the proposed gossip model is successful in attaining high throughput and a low duplication rate in comparison to existing gossip. Convergence time, the time the network takes to disseminate a data packet to the entire network, is kept low with a small number of nodes in the proposed model; it is somewhat higher for larger networks but still lower than in existing gossip. The overhead of a node can be measured as the number of packets it sends in one second; deleting duplicate packets reduces node overhead and thus improves network efficiency. The throughput of the network improves because the data duplication rate is reduced in the proposed model by incorporating a hierarchical approach into the network topology.

VI. CONCLUSION

The first contribution of this paper is to explore the optimum data dissemination technique in P2P networks using overlays. The second is the selection of a desirable topology to speed up the gossiping mechanism. The third is the maintenance of a profile containing a membership list, which results in a partial view of node addresses, that is, a set of node descriptors. The fourth is the demonstration that intelligent node selection (INS) helps improve the efficiency of the gossiping approach. Performance analysis carried out after implementing the proposed model concludes that the efficiency of the network is increased by 6%, as the data duplication rate is reduced by the hierarchical model for gossiping. A significant advantage of the proposed model is that the reliability and scalability of the gossip model are not affected by the adaptations required to speed up transmission. In conclusion, the proposed adaptations to the existing gossip protocol ensure reliable, scalable, and speedy delivery of information on distributed networks.

VII. REFERENCES

[1] Meng-Jang Lin and Keith Marzullo, “Directional Gossip: Gossip in a Wide Area Network”, University of Texas at Austin, TX, and University of California, San Diego, CA, July 1999.

[2] Jun Li, Peter Reiher, and Gerald Popek, “Resilient Self-Organizing Overlay Networks for Security Update Delivery”, IEEE Journal on Selected Areas in Communications, special issue on Service Overlay Networks, vol. 22, no. 1, January 2004.

[3] Yair Amir and Claudiu Danilov, “Reliable Communication in Overlay Networks”, Johns Hopkins University, {yairamir,claudiu}@cs.jhu.edu

[4] Junghee Han, David Watson, and Farnam Jahanian, “Topology Aware Overlay Networks”, Department of Electrical Engineering and Computer Science, University of Michigan, USA, {arnam}@eecs.umich.edu

[5] Ralitsa Kostadinova and Constantin Adam, “Performance Analysis of the Epidemic Algorithms”.

[6] Guangsen Zhang and Manish Parashar, “Cooperative Mechanism Against DDoS Attacks”, The Applied Software Systems Laboratory, Department of Electrical and Computer Engineering, Rutgers University, {gszhang,parashar}@caip.rutgers.edu

[7] I. Gupta, K. P. Birman, and R. van Renesse, “Fighting fire with fire: using randomized gossip to combat stochastic scalability limits”, Special Issue Journal Quality and Reliability Engineering International: Secure, Reliable Computer and Network Systems (ed. Nong Ye), vol. 18, no. 3, pp. 165-184, May/June 2002.

[8] Angelos D. Keromytis, Vishal Misra, and Daniel Rubenstein, “Using Overlays to Improve Network Security”, Columbia University, New York, fangelos,misra,[email protected]

[9] I. Gupta, K. P. Birman, and R. van Renesse, “Fighting fire with fire: using randomized gossip to combat stochastic scalability limits”, Special Issue Journal Quality and Reliability Engineering International: Secure, Reliable Computer and Network Systems (ed. Nong Ye), vol. 18, no. 3, pp. 165-184, May/June 2002.

[10] N. T. J. Bailey, “The Mathematical Theory of Infectious Diseases and its Applications” (second edition), Hafner Press, 1975.

[11] B. Pittel, “On spreading a rumor”, SIAM Journal on Applied Mathematics, 1987.

[12] Kate Jenkins, Ken Hopkinson, and Ken Birman, “A Gossip Protocol for Subgroup Multicast”, Department of Computer Science, Cornell University, {katej,hopkik,ken}@cs.cornell.edu, Dec. 1993.