Data Comm. Assignment1

ASSIGNMENT: DATA COMMUNICATION (KEEW3202)

STUDENT: INDIRA KARIMOVA (KEW100701)

INTRODUCTION

[1] When attempting to access information on a device such as a PC, laptop, PDA, or cell phone, the data might not

be physically stored on that device. In this case, a request to access that information must be made to the device where

the data resides. The request for data can occur and be fulfilled using the client/server model, application layer services

and protocols, and peer-to-peer (P2P) networking and applications. The peer-to-peer (P2P) model involves two distinct

forms which are peer-to-peer network design and peer-to-peer applications. Both forms have similar features but work

differently. The current network scenario is dominated by the TCP/IP protocol suite, which naturally suits the P2P model.

However, there is also a need to provide the following services, on which P2P will pivot [2]:

a) Subscription service used by the current members to reject or accept new subscriptions to a group. Peers wishing to join

a peer group must first locate a current member, and then request to join.

b) Discovery service used by peer members to search for peer-group resources. Only the peers that are currently logged on

will be the ones that are searched.

c) Peer monitoring service to keep a close track of a peer's status. Such a service is useful when features such as reliability

and guaranteed service times are to be provided to the subscriber of a P2P network.

d) Access Service used to validate requests made by one peer to another. The peer requiring data from another peer

provides its credentials and particulars about the request being made. The access service has to determine if the access is

permitted and if the request is warranted.

P2P SYSTEMS

Peer-to-peer systems have been defined in many papers. Here are two definitions that cover the concepts of a

peer-to-peer network and a peer-to-peer system [3]:

A distributed network architecture may be called a peer-to-peer network if the participants share a part of their own hardware resources (processing power, storage capacity, network link capacity, printers). These shared resources are

necessary to provide the service and content offered by the network (e.g. file sharing or shared workspaces for

collaboration). They are accessible by other peers.

Peer-to-peer systems are distributed systems consisting of interconnected nodes able to self-organize into network topologies with the purpose of sharing resources such as content, CPU cycles, storage and bandwidth, capable of

adapting to failures and accommodating transient populations of nodes while maintaining acceptable connectivity and

performance, without requiring the intermediation or support of a global centralized server or authority.

[1] To better appreciate the P2P model, let's briefly review the client/server model. In the client/server model, the device requesting the information is called a client and the device responding to the request is called a server.

This model is considered to be in the application layer. A server is the place where resources are stored.

The client (a PC host) makes a request for a file to the server and the server responds by transferring the file to the client.

In a similar manner, the client can also transfer a file to the server for storage.

In a P2P network, a dedicated server is not required. Multiple computers can be connected through a network to

share resources such as printers and files without needing the assistance of a server. Each of the end devices connected is

called a peer and can function as either a server or a client on a per-request basis. One computer might also assume the

roles of both server and client simultaneously for several simultaneous transactions.
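
The dual role of a peer can be illustrated with a minimal in-memory sketch. The `Peer` class, its method names, and the file names below are hypothetical and chosen only for illustration; no real networking is involved.

```python
class Peer:
    """Hypothetical in-memory peer that can both serve and request files."""

    def __init__(self, name, files=None):
        self.name = name
        self.files = dict(files or {})  # file name -> contents

    def serve(self, filename):
        # Server role: answer a request with the file contents, or None if absent.
        return self.files.get(filename)

    def request(self, other, filename):
        # Client role: ask another peer for a file and keep a local copy.
        data = other.serve(filename)
        if data is not None:
            self.files[filename] = data
        return data


alice = Peer("alice", {"notes.txt": b"lecture notes"})
bob = Peer("bob", {"song.mp3": b"audio bytes"})

# Each peer acts as a client in one transaction and as a server in the other.
alice.request(bob, "song.mp3")
bob.request(alice, "notes.txt")
```

No dedicated server object appears anywhere in the sketch; which peer is the "server" is decided per request.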

Figure 1: Example of P2P networks



A P2P application, unlike a peer-to-peer network, allows a device to act as both a client and a server within the

same communication session. P2P applications can be used on peer-to-peer networks, in client/server networks, and

across the Internet. Figure 2 shows two phones belonging to the same network sending an instant message with the digital

traffic between the two phones shown on top. Both can initiate a communication and are considered equal in the

communication process. However, each end device needs to provide a user interface and run a background service. When

you launch a specific peer-to-peer application, it invokes the required user interface and background services. After that,

the devices can communicate directly.

Figure 2: Example of P2P applications

[7] A P2P program must be installed on each device; it creates a virtual network among the

community of P2P users. To the users it appears as if their device is in a P2P network, allowing them to share files with

other users and download files shared by other users. It is very similar to instant messaging services such as Yahoo, AOL, or

GTalk: even though the person we are communicating with is on a different network, a virtual network is

created in which it looks as if we are on the same network, and we can share files and chat.

CHARACTERISTICS OF MOST P2P SYSTEMS [3]

a) Resource sharing: each peer contributes system resources to the operation of the P2P system. Ideally this resource sharing is proportional to the peer's use of the P2P system, but many systems suffer from the free rider problem.

b) Networked: all nodes are interconnected with other nodes in the P2P system, and the full set of nodes forms a connected graph. When the graph is no longer connected, the overlay is said to be partitioned.

c) Decentralization: the behavior of the P2P system is determined by the collective actions of peer nodes, and there is no central control point.

d) Symmetry: nodes assume equal roles in the operation of the P2P system. In many designs this property is relaxed by the use of special peer roles such as super peers or relay peers.

e) Autonomy: participation of the peer in the P2P system is determined locally, and there is no single administrative context for the P2P system.

f) Self-organization: the organization of the P2P system increases over time using local knowledge and local operations at each peer, and no peer dominates the system.

g) Scalability: this is a prerequisite for operating P2P systems with millions of simultaneous nodes, and means that the resources used at each peer grow less than linearly as a function of overlay size. It also means that the response time does not grow more than linearly as a function of overlay size.

h) Stability: within a maximum churn rate, the P2P system should be stable, i.e., it should maintain its connected graph and be able to route deterministically within practical hop-count bounds.

P2P SYSTEMS ARCHITECTURES

[3] In a pure P2P network, there is no notion of clients/servers but only equal peer nodes that simultaneously

function as both "clients" and "servers" to the other nodes on the network. This differs from the client/server model where

communication is usually to and from a central server.

Modern peer-to-peer systems are often implemented using an abstract overlay network, built at the application

layer, on top of the native or physical network topology. Such overlays are used for indexing and peer discovery and

make the P2P system independent of the physical network topology. Content is typically exchanged directly over the

underlying Internet Protocol (IP) network. The two main P2P overlay architectures are the unstructured P2P

overlay architecture and the structured P2P overlay architecture. We give only an overview of the structured P2P

overlay and discuss the unstructured P2P overlay in more detail. An overlay network is defined as follows:



An overlay network is an application layer virtual or logical network in which end points are addressable and that provides connectivity, routing, and messaging between end points. Overlay networks are frequently used as a substrate

for deploying new network services, or for providing a routing topology not available from the underlying physical

network. Many peer-to-peer systems are overlay networks that run on top of the Internet.

a) Unstructured P2P overlay architecture [4]

An unstructured overlay is an overlay in which a node relies only on its adjacent nodes for delivery of messages to other nodes in the overlay. Example message propagation strategies are flooding and random walk. Our study will focus on the file sharing application, which is one of the most important applications for P2P networks. Unstructured overlays can be

further classified into centralized, distributed, hybrid and some other approaches for file sharing.

Figure 3: Search process in unstructured P2P networks. (a) Napster (b) Gnutella (c) Kazaa (d) BitTorrent

i. A Centralized Approach: Napster

The Napster file-sharing system consists of a central directory server and a set of registered users (or peers). The server

maintains information of all files in the system, including an index with metadata (such as file name and size) of all files

in the system, a list of all registered peers, and a list showing the files that each peer holds and shares. When a new peer

joins the system, it contacts the server and reports a list of files it maintains and shares. When a peer wants to search for a

file, it sends a request to the server. The server will return a list of peers that hold the matching file. The searching peer

then contacts the returned peers to download the file. Figure 3(a) shows the search process in Napster. When peer A wants

to search for some file, it contacts the central server. The server returns some peers that hold the file, say, peer B. Peer A

then starts to download the file from peer B.

Napster is easy to implement, since only a central server needs to be deployed and maintained. The system is also

highly adaptive to peers joining and leaving. However, it is not scalable, and the server needs substantial resources (such

as computational capability and bandwidth) to support a large number of peers. In addition, the server forms a single point

of failure. If the server is down, the whole system fails.
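
The centralized lookup described above can be sketched as a single index object. This is a simplified illustration, not Napster's actual protocol; the class `DirectoryServer` and its method names are assumptions made for the example.

```python
from collections import defaultdict


class DirectoryServer:
    """Hypothetical central index: maps file names to the peers that share them."""

    def __init__(self):
        self.index = defaultdict(set)  # file name -> set of peer names

    def register(self, peer, filenames):
        # Called when a peer joins and reports the files it shares.
        for name in filenames:
            self.index[name].add(peer)

    def unregister(self, peer):
        # Called when a peer leaves the system.
        for holders in self.index.values():
            holders.discard(peer)

    def search(self, filename):
        # Return the peers holding the file; the download itself is peer-to-peer.
        return sorted(self.index.get(filename, set()))


server = DirectoryServer()
server.register("B", ["song.mp3", "notes.txt"])
server.register("C", ["song.mp3"])
holders = server.search("song.mp3")  # peer A would now contact one of these peers
```

Every join, leave, and search goes through the one `server` object, which is why it is both the scalability bottleneck and the single point of failure.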

ii. A Distributed Approach: Gnutella

In the basic Gnutella protocol, when a new peer joins the system, it first connects to some public peers then sends

a PING message to any peer it is connected to, to announce the existence of the new peer. Upon receiving a PING

message, a Gnutella peer returns a PONG message and propagates the PING message to its neighbors. In a dynamic

network with frequent peer joining and leaving, a peer periodically sends PING messages to its neighbors. Search in

Gnutella is based on flooding, which is broadcasting in the overlay. To reduce the number of query messages in the

network, each query message contains a time-to-live (TTL) field, which is decremented at every hop; a query whose TTL reaches zero is not forwarded further.

Figure 3(b) shows the search process in Gnutella. Suppose peer A wants to search for some file. It floods its

search query to its neighbors, i.e., peers B and D in the figure. When peer B receives the query, it checks whether it

holds the matching file. If not, it forwards the query to its neighbors. As in the example, peer B forwards the query to its

neighbor C. Suppose C holds the file that A wants. C returns a response to the peer that sends it the query, which is B in

the figure. B then continues forwarding the response to the query sender A. Finally, A contacts C to download the file.

Gnutella is a dynamic, self-organized network unlike Napster. Each peer independently connects to and

communicates with a few other peers in the system. The system is highly robust to peer dynamics through the exchange of

PING and PONG messages. A limitation of Gnutella is its relatively low search efficiency.
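
A TTL-limited flood over the example overlay (A connected to B and D, B connected to C) can be sketched as follows. This is a simplified model under stated assumptions: the overlay is a plain adjacency dictionary, duplicate queries are suppressed with a shared `seen` set rather than per-peer message IDs, and the function name `flood_search` is hypothetical.

```python
from collections import deque


def flood_search(overlay, holders, start, ttl):
    """Return the peers holding the file that a TTL-limited flood from `start` reaches."""
    seen = {start}
    queue = deque([(start, ttl)])
    hits = []
    while queue:
        peer, remaining = queue.popleft()
        if peer != start and peer in holders:
            hits.append(peer)  # a holder answers instead of forwarding the query
            continue
        if remaining == 0:
            continue  # TTL exhausted: the query is dropped here
        for neighbor in overlay[peer]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits


# Overlay from the example: A floods to B and D, and B forwards to C.
overlay = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B"], "D": ["A"]}
found = flood_search(overlay, holders={"C"}, start="A", ttl=2)
```

With `ttl=1` the query never gets past B and D, so C is not found. This shows the trade-off behind the low search efficiency: a small TTL limits traffic but can miss files that exist in the overlay.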



iii. A Hybrid Approach: FastTrack/Kazaa

The hybrid approach combines the approaches used in purely centralized and purely distributed networks

to overcome their limitations. FastTrack is a typical example of a partially centralized P2P protocol. In FastTrack, peers with

the fastest Internet connections and the most powerful computers are automatically designated as supernodes. A

supernode maintains information about some resources and connections with other supernodes. A peer first searches for the

closest supernode, which returns immediate results if any and refers the search to other supernodes if needed. Two

applications based on FastTrack are Kazaa and Grokster; the latter closed its service in 2005 because of copyright issues.

Figure 3(c) shows the search process in Kazaa. When peer A wants to search for some file, it sends the search

query to the closest supernode. The supernode either returns some matching peers, or forwards the query to other

supernodes. Finally, A will obtain some matching peers from the supernode (say, peer B in the figure) and download the

file from these peers. Therefore, an ordinary peer (e.g., peer A in the figure) communicates with a supernode as if

communicating with the server in Napster. Then, a Gnutella-like search is performed in a highly pruned overlay network of

supernodes.

Kazaa achieves much lower search time compared to purely distributed networks like Gnutella. Search among

supernodes is much faster than search among all peers, because the number of supernodes is much smaller than the total

number of peers. The high bandwidth and large storage space of supernodes let them efficiently process a large number of

queries from ordinary peers. The system hence makes good use of peer heterogeneity. In addition, unlike Napster, it does

not have a single point of failure. If some supernodes go down, the peers connected to them can connect to other

supernodes.
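
The two-tier search can be sketched by giving each supernode a local index and letting it refer unanswered queries to other supernodes. This is an illustrative model only, not the FastTrack protocol (which is proprietary); the `Supernode` class and its methods are hypothetical.

```python
class Supernode:
    """Hypothetical supernode: indexes the files of its attached ordinary peers."""

    def __init__(self, name):
        self.name = name
        self.index = {}      # file name -> set of attached peers holding it
        self.neighbors = []  # other supernodes this one is connected to

    def attach(self, peer, filenames):
        # An ordinary peer reports its files, as it would to a Napster-style server.
        for filename in filenames:
            self.index.setdefault(filename, set()).add(peer)

    def search(self, filename, visited=None):
        # Answer from the local index if possible, otherwise ask other supernodes.
        visited = visited if visited is not None else set()
        visited.add(self.name)
        if filename in self.index:
            return sorted(self.index[filename])
        for supernode in self.neighbors:
            if supernode.name not in visited:
                result = supernode.search(filename, visited)
                if result:
                    return result
        return []


s1, s2 = Supernode("S1"), Supernode("S2")
s1.neighbors.append(s2)
s2.neighbors.append(s1)
s2.attach("B", ["song.mp3"])

# Peer A is attached to S1, so it sends its query there; S1 refers it to S2.
matches = s1.search("song.mp3")
```

Only supernodes take part in the forwarded search, so the flooded overlay is much smaller than the full set of peers.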

iv. Other Approach: BitTorrent

BitTorrent is a P2P system that does not belong to any of the above categories. BitTorrent uses a central location

to coordinate data upload and download among peers. To share a file f, a peer first creates a small torrent file, which

contains metadata about f, e.g., its length, name and hashing information. Usually, BitTorrent cuts a file into pieces of

fixed size, typically between 64 KB and 4 MB each. Each piece has a checksum from the SHA-1 hashing algorithm, which

is also recorded in the torrent file. Most importantly, the torrent file contains the URL of a tracker, which keeps track of

all the peers who have file f (either partially or completely) and the lookup peers. A peer that wants to download the file

first obtains the corresponding torrent file, and then connects to the specified tracker. The tracker responds with a random

list of peers which are downloading the same file. The requesting peer then connects to these peers for downloading.

Figure 3(d) shows the search process in BitTorrent. When peer A wants to search for some file, it first needs to

obtain the corresponding torrent for the file. From the torrent, A knows the address of the tracker and connects to the

tracker. The tracker then returns a list of peers who are downloading or sharing the file. A then exchanges data with these

peers.

The centralization of trackers in BitTorrent systems brings some limitations. If a tracker is down, peers will not be

able to start their sharing (by uploading their torrents to the tracker), and new incoming peers cannot start their

downloading. To overcome this, more recent BitTorrent clients (e.g.,

µTorrent, BitComet, KTorrent) implement a decentralized tracking mechanism. In this mechanism, every peer acts as a mini-tracker. Peers first join a DHT network, which is built into the BitTorrent client. A torrent is then stored at a certain peer according to the DHT

storage method. All peers in the DHT network can search for the torrent through DHT search. Therefore, this mechanism

eliminates central trackers from the system.

b) Structured P2P overlay architecture

[5] A structured P2P overlay is a network overlay that connects nodes using a particular data structure or protocol to ensure that node lookup or data discovery is deterministic. Early P2P systems mainly consisted of unstructured overlays that organize nodes into random data structures. These unstructured overlays use techniques such as

walking or flooding the nodes in the system for lookup, and are often optimized for some common lookup queries. But, in

general, these unstructured overlays are quite unpredictable for finding rare items and for some real-time applications such as voice and video sharing. To overcome these issues, structured overlays were developed to provide deterministic

bounds on the data discovery. Structured overlays provide scalable network overlays based on a distributed data structure

that supports deterministic behavior for data lookup. Structured P2P overlays impose restrictions on node placement in the

overlay and hence, improve the efficiency of data lookup. We categorize structured P2P systems in terms of the bound on



numbers of hops required for data lookup and present issues such as node lookup, finger table maintenance, and join/leave

properties of the overlays.

[3] Each peer has a local routing table which is used by the forwarding algorithm. The peer's routing table is initialized when the peer joins the overlay, using a specified bootstrap procedure. Peers periodically exchange routing

table changes as part of overlay maintenance. The majority of structured overlays use key-based routing in which a set of keys is associated with addresses in the address space such that the nearest peer to an address stores the values for the

associated keys, and the routing algorithm treats keys as addresses. A distributed hash table (DHT) is a structured overlay that uses key-based routing for put and get index operations and in which each peer is assigned to maintain a

portion of the DHT index. Because the address space is virtualized and peer addresses are typically randomly assigned,

peers which are neighbors in the overlay can be distant in the underlying network. While this improves the fault tolerance

of the overlay, it causes significant performance loss. Consequently, topology-aware overlays use measurements of

proximity of peers in the underlying network to create neighbor peers in the overlay.
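
Key-based placement in a DHT can be sketched with a toy model in which keys and peers are hashed into the same address space and each key is stored at the nearest peer. The sketch makes several assumptions that are not from the source: a 16-bit circular address space, SHA-1 as the hash, and a global view of all peers in place of hop-by-hop routing through per-peer routing tables. The names `ToyDHT`, `address`, and `ring_distance` are hypothetical.

```python
import hashlib

SPACE_BITS = 16  # assumption: a small 16-bit circular address space


def address(name):
    """Map a peer name or a key to an address in the virtual address space."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << SPACE_BITS)


def ring_distance(a, b):
    """Distance between two addresses when the space wraps around."""
    direct = abs(a - b)
    return min(direct, (1 << SPACE_BITS) - direct)


class ToyDHT:
    """Global-view toy: a real DHT reaches the responsible peer hop by hop."""

    def __init__(self, peer_names):
        self.peers = {address(name): name for name in peer_names}
        self.storage = {name: {} for name in peer_names}

    def responsible_peer(self, key):
        # The peer whose address is nearest to the key's address stores the key.
        target = address(key)
        nearest = min(self.peers, key=lambda a: (ring_distance(a, target), a))
        return self.peers[nearest]

    def put(self, key, value):
        self.storage[self.responsible_peer(key)][key] = value

    def get(self, key):
        return self.storage[self.responsible_peer(key)].get(key)


dht = ToyDHT(["peer-a", "peer-b", "peer-c"])
dht.put("song.mp3", "198.51.100.7")
```

Since peer addresses come from a hash, two peers that are adjacent in this address space may be far apart in the physical network, which is the performance issue that topology-aware overlays address.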

ADVANTAGES AND DISADVANTAGES OF P2P NETWORKS AND APPLICATIONS [2]

P2P networks have the advantage of increasing the availability of resources, whether computational resources or

content, through sharing between peers in the same network. They also provide enhanced load balancing. In a

situation where a piece of data is present only at a particular peer, that peer may be

overburdened with requests. P2P can circumvent this problem by providing multiple copies of the data. Also, using explicit

caching algorithms, intermediate peers cache frequently used data and help to distribute the content more evenly. Thus

query load is more evenly balanced. P2P networks also provide redundancy and fault tolerance. If a peer in

the network goes down, we can rely on other peers to perform the required task or to act as a source of the same data, because

data is duplicated quickly in the P2P model. Besides that, P2P networks also enable content-based addressing. In the present

Internet scene, there may be very little correspondence between the site name a person types and its contents. In P2P, the

exact address of a node storing a particular content remains transparent to the user. The user queries the network for the

content and P2P software translates the request into the specific nodes that hold the content. This procedure can lead to a

grouping of addresses based on the content the respective nodes store, which can lead to a more refined data repository.

P2P networks have the disadvantage of spurious content and poor connections. Because there is no central authority,

the quality of the content posted on the peer group is questionable. For example, the MP3 version of the same song

may be available as a copy with very good sound quality and as another copy with poor quality. But for the P2P search

both versions are part of the same search and indistinguishable until actually heard. Also, slow and error-prone dial-up

connections used by some of the peers may disrupt the normal functioning of the network. P2P networks also have

numerous security considerations that are discussed in the next section.

SECURITY CONSIDERATIONS OF P2P [8]

Security is an important issue when implementing a system. The first issue that needs to be considered is the

extent to which the nodes in the system can be trusted. If all the nodes in the system are fully trusted (all the nodes are

trusted to never act in a malicious way), P2P architecture can achieve a high level of security. However, if nodes are not

fully trusted and can be expected to behave in malicious ways, providing an acceptable level of security in a P2P

environment becomes significantly more challenging because of its distributed ownership and lack of centralized control.

P2P networks decentralize the resources on a network; thus, information can be located anywhere on

any connected device without the need of a dedicated server. However, this makes it more difficult to enforce security and

access policies in networks that have many computers. User accounts and access rights must be set individually on each

peer device.

P2P allows attackers to passively obtain valid IP addresses of potential victims without performing active scans

because a given peer is typically connected to multiple peers. This attack is much more efficient than performing scans

when the address space to be scanned is large and sparsely populated, such as the IPv6 address space. Additionally,

due to the high correlation between a particular application and a particular operating system, an attacker can launch

attacks that exploit known specific vulnerabilities of an operating system.

Central elements in centralized architectures become an obvious target for attacks. P2P systems minimize the

number of central elements and, thus, are more resilient against attacks targeted only at a few elements. Besides that, it is



also important to consider a number of threats that are specific to P2P systems which mainly focus on the data storage

functions and the routing of P2P systems.

In a P2P system, messages between two given peers generally traverse a set of intermediate peers that help route

messages between the two peers. Intermediate peers compromised by an attacker can attempt man-in-the-middle

attacks since they are on the path between the two given peers. The Sybil attack is an example of such an attack.

This type of attack can be mitigated by controlling how peers obtain their identifiers such as by having a central authority.

We can also encrypt message parts that are not required for routing to prevent this type of attack. Without the key to

decrypt the message, the attacker will not be able to view the actual message content. Attackers can also attempt to launch a set of attacks against the routing of the P2P system by modifying the routing of the system in order to be able to launch

on-path attacks. Attackers can use forged routing maintenance messages for this purpose. The Eclipse attack is an

example of such an attack. Enforcing structural constraints or enforcing node degree bounds can mitigate this type of

attack.

An attacker can create a message and claim that it was actually created by another peer. The attacker can even

take a legitimate message as a base and modify it to launch the attack. Peer and message authentication techniques can be

used to avoid this type of attack.
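
One common message authentication technique is a keyed hash (HMAC), sketched below with Python's standard `hmac` and `hashlib` modules. The sketch assumes the two peers already share a secret key, which the source does not specify; how that key is distributed in a P2P setting is outside the scope of the example, and the helper names `sign` and `verify` are hypothetical.

```python
import hashlib
import hmac


def sign(key, message):
    """Compute an authentication tag for the message under the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, tag):
    """Return True only if the tag matches the message under the shared key."""
    return hmac.compare_digest(sign(key, message), tag)


shared_key = b"example-shared-secret"  # assumption: agreed on beforehand
message = b"store object 42 at peer B"
tag = sign(shared_key, message)

# An on-path attacker modifies a legitimate message but cannot recompute the tag.
tampered = b"store object 42 at peer X"
```

A receiver that checks `verify(shared_key, message, tag)` rejects both forged messages and modified copies of legitimate ones, because producing a valid tag requires the key.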

In P2P-specific attacks against the data storage function of a P2P system, an attacker can refuse to store a

particular data object or claim that a particular data object does not exist even if another peer created it and stored it on the

attacker's node. These are called DoS (Denial-of-Service) attacks and can be mitigated by using data replication techniques and

performing multiple, typically parallel, searches. It is also possible to launch DoS attacks by modifying or dropping

routing maintenance messages or by creating forged ones but we can mitigate this by having nodes get routing tables from

multiple peers. By creating churn, attackers can also launch a DoS attack. By leaving and joining a P2P overlay rapidly

many times, a set of attackers can create large amounts of maintenance traffic and make the routing structure of the

overlay unstable. We can mitigate this by limiting the amount of churn per node.
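
Limiting churn per node can be sketched as a sliding-window counter of join attempts. This is one possible approach under stated assumptions, not a mechanism prescribed by the source: the class `ChurnLimiter`, its parameters, and the idea of rejecting joins above a fixed threshold are all illustrative choices.

```python
from collections import defaultdict, deque


class ChurnLimiter:
    """Hypothetical limiter: allow at most `max_joins` joins per node per time window."""

    def __init__(self, max_joins, window_seconds):
        self.max_joins = max_joins
        self.window = window_seconds
        self.joins = defaultdict(deque)  # node id -> timestamps of recent joins

    def allow_join(self, node_id, now):
        recent = self.joins[node_id]
        # Forget joins that have fallen out of the time window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_joins:
            return False  # too much churn from this node: refuse the join
        recent.append(now)
        return True


limiter = ChurnLimiter(max_joins=2, window_seconds=60)
decisions = [limiter.allow_join("node-1", t) for t in (0, 1, 2)]
```

Timestamps are passed in explicitly so the behaviour is easy to follow; a deployed system would also need node identifiers that an attacker cannot change at will, otherwise the limit can be evaded.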

CONCLUSION

P2P systems provide many new opportunities for communicating, sharing resources, and computing over the Internet.

Advances in software and hardware technology have eased the realization of P2P systems. Although there are still

numerous disadvantages and security considerations involved in P2P systems, many innovative ideas and much effort are

being devoted to enhancing P2P technology.

REFERENCES

[1] Mark, A. D., Rick, M., & Antoon, W. R. (2008). Application Layer Functionality and Protocols. Network

Fundamentals CCNA Exploration Companion Guide (pp. 63-98). Indianapolis, IN: Cisco Press

[2] Kini, U. A., & Shetty, S. M. (2001). Peer-to-Peer networking. Resonance, 6(12), 69-79

[3] Xuemin, S., Yu, H., Buford, J., & Akon, M. (2009). Introduction to Peer-to-Peer Networking. Handbook of

Peer-to-Peer Networking (pp. 44-154). New York, NY: Springer

[4] Xuemin, S., Yu, H., Buford, J., & Akon, M. (2009). Unstructured P2P Overlay Architectures. Handbook of


[5] Xuemin, S., Yu, H., Buford, J., & Akon, M. (2009). Structured P2P Overlay Architectures. Handbook of


[6] Wikipedia. Peer-to-peer. Retrieved 20, March, 2013 from http://en.wikipedia.org/wiki/Peer-to-peer

[7] Vikran, K. (November, 2009). What do P2P Applications do and How to block Peer to Peer Applications

(P2P) using Symantec Endpoint Protection? Retrieved 20, March, 2013 from

http://www.symantec.com/connect/articles/what-do-p2p-applications-do-and-how-block-peer-peer-applications-p2p-using-symantec-endpoin

[8] Internet Engineering Task Force (IETF). (November, 2009). RFC 5694 - Peer-to-Peer (P2P) Architecture:

Definition, Taxonomies, Examples, and Applicability. Retrieved 15, March, 2013 from

http://tools.ietf.org/html/rfc5694
