

Real Time Information Dissemination and

Management in Peer-to-Peer Networks

A Thesis Submitted in Fulfillment of the Requirement of the Degree of

Doctor of Philosophy

By

Shashi Bhushan

Reg. No.: 2K06NITK-PhD-1095

Under supervision of

Dr. Mayank Dave Associate Professor

Dept. of Computer Engineering NIT, Kurukshetra

Dr. R.B. Patel Associate Professor

Dept. of Computer Science & Engineering G. B. Pant Engineering College, Pauri Garhwal

(Uttarakhand)

Department of Computer Engineering

National Institute of Technology Kurukshetra-136119 India

May 2013


Department of Computer Engineering National Institute of Technology

Kurukshetra-136119, Haryana, India

Candidate’s Declaration

I hereby certify that the work which is being presented in the thesis entitled “Real Time

Information Dissemination and Management in Peer-to-Peer Networks” in

fulfillment of the requirement for the award of the degree of Doctor of Philosophy in

Computer Engineering and submitted in the Department of Computer Engineering of

National Institute of Technology, Kurukshetra, Haryana, India is an authentic record of

my own work, carried out during the period from May 2006 to April 2013, under the

supervision of Dr. Mayank Dave and Dr. R. B. Patel.

The matter presented in this thesis has not been submitted by me for the award of any

other degree in this Institute or any other Institute/University.

Shashi Bhushan

This is to certify that the above statement made by the candidate is correct to the best of

our knowledge.

Date:

Dr. Mayank Dave Associate Professor Dept. of Computer Engineering NIT, Kurukshetra

Dr. R.B. Patel Associate Professor

Dept. of Computer Science & Engineering, G. B. Pant Engineering College, Pauri Garhwal

(Uttarakhand)


Acknowledgement

Success in life is never attained single-handedly. First and foremost, I would like to express my sincere gratitude to my supervisor, Dr. Mayank Dave, for his continuous support, encouragement and enthusiasm. I thank him for all the energy and time he has spent on me, discussing everything from research to career choices, reading my papers and guiding my research through obstacles and setbacks. His professional yet caring approach towards people, his approach to work and his passion for living life to the fullest have truly inspired me.

It is extremely difficult for me to express in words my gratitude towards my co-supervisor, Dr. R. B. Patel, who stood by me throughout my research work and guided me towards becoming not only an able researcher but also a good human being. His constant motivation made me believe in myself throughout this research work. Without his persuasion and interest, it would not have been possible for me to gain the confidence that I have today.

My sincere thanks go to Dr. J. K. Chhabra, Head, Department of Computer Engineering, for his insightful comments and administrative help on various occasions. His hardworking attitude and high expectations of research have inspired me to mature into a better researcher. I would also like to thank my DRC members, Dr. A. Swarup, Dr. A. K. Singh and Dr. S. K. Jain, for their stimulating questions and valuable feedback. I also owe my thanks to the faculty members of the department for their valuable feedback.

I would be nowhere in life if I had not grown up in the most wonderful family one can imagine. I want to thank my parents and brother for their love and for giving me all the happiness and opportunities that most people can only dream of.

I am grateful to my better half, Dr. Anjoo Kamboj, for her great patience, constant encouragement and support. She was always with me in difficult times and encouraged me whenever I was down with frustration. Words are not sufficient to express my deepest love for my loving kids, Ashu and Abhi, for their cooperation and for the sacrifice of the childhood time that they might otherwise have enjoyed with their father. They always pray to God to make me successful in my work.


I consider this an opportunity to express my gratitude to all the dignitaries who have been involved, directly or indirectly, in the successful completion of this work.

Last but not least, I thank God, the Almighty, for giving me the strength, will and wisdom to carry out my work successfully. You have made my life more abundant. May your name be exalted, honored and glorified.

Shashi Bhushan


Abstract

In P2P networks, peers are rich in computing resources and services, viz., data files, cache storage, disk space, processing cycles, etc. These peers collectively contribute a huge amount of resources and collaboratively perform computing tasks using them. Peers can serve as both clients and servers, eliminating the need for a centralized node. A major drawback of P2P systems is that resources and nodes are only temporarily available: a network element may disappear from the network at any time and reappear at another locality with an unpredictable pattern. Under these circumstances, one of the most challenging problems is how to place and access real-time information over the network, because requesters must be able to locate resources within some bounded delay whenever they are needed. This requires managing information under time constraints and under the dynamism of the peers. Multiple challenges must be addressed to implement a Real Time Distributed Database System (RTDDBS) over dynamic P2P networks. To enable resource awareness in such a large-scale dynamic distributed environment, a specific management system is required, one that takes into account the following P2P characteristics: reduction in redundant network traffic, data distribution, load balancing, fault tolerance, replica placement, update, and assessment, data consistency, concurrency control, and the design and maintenance of logical structures for replicas. In this thesis, we have developed a solution for resource management that supports fault-tolerant operations, the shortest path length to requested resources, low overhead in network management operations, well-balanced load distribution between the peers, and a high probability of successful access from the defined quorums.
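The "high probability of successful access from the defined quorums" can be quantified with a standard availability model. The sketch below is illustrative only, not taken from the thesis; it assumes peers fail independently, each available with the same probability p, and computes the probability that at least q of n replicas are reachable, i.e. that a quorum of size q can be formed:

```python
from math import comb


def quorum_availability(n: int, q: int, p: float) -> float:
    """Probability that at least q of n replicas are up, when each
    replica is independently available with probability p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(q, n + 1))


# With seven replicas and 90% peer availability, a majority quorum of
# four is far more likely to form than a full quorum of seven.
print(round(quorum_availability(7, 4, 0.9), 3))  # → 0.997
print(round(quorum_availability(7, 7, 0.9), 3))  # → 0.478
```

Under these (assumed) independence conditions, a majority quorum of four out of seven replicas is available with probability about 0.997, versus about 0.478 for requiring all seven, which motivates quorum-based replica access in dynamic P2P environments.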

In this thesis, we have proposed a self-managed, fault-adaptive and load-adaptive middleware architecture called Statistics Manager and Action Planner (SMAP) for implementing a Real Time Distributed Database System (RTDDBS) over P2P networks. Various algorithms are also proposed to enhance the performance of the different modules of SMAP. A Matrix Assisted Technique (MAT) is proposed to partition the database for implementing the RTDDBS; this approach also provides basic security for the database over unreliable peers and easy access to information over P2P systems. A 3-Tier Execution Model (3-TEM) that integrates MAT for parallel execution is also proposed. 3-TEM enhances the throughput of the P2P system and balances the load among participating peers. A Timestamp based Secure Concurrency Control Algorithm (TSC2A) is also developed, which handles the concurrent execution of transactions in the dynamic environment of P2P networks and provides security for both arriving transactions and data items. An approach called Common Junction Methodology (CJM) is proposed to reduce redundant traffic and improve response time in P2P networks through common junctions in the paths. Quorum acquisition time is reduced through a novel fault-adaptive algorithm called the Logical Adaptive Replica Placement Algorithm (LARPA), which implements a logical structure for dynamic environments. The algorithm efficiently distributes replicas to sites one hop away, improving data availability in an RTDDBS over a P2P system. A self-organized Height Balanced Fault Adaptive Reshuffle (HBFAR) scheme is proposed for improving hierarchical quorums over P2P systems; it improves data availability through a logical arrangement of replicas. We finally conclude and compare the proposed middleware with some existing schemes.
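TSC2A itself is presented in Chapter 5; as background, the classical timestamp-ordering rule on which timestamp-based concurrency control builds can be sketched as follows. This is an illustrative textbook sketch, not the thesis's TSC2A algorithm, and the names (`DataItem`, `to_read`, `to_write`) are invented for the example:

```python
class DataItem:
    """Per-item state for classical timestamp ordering (TO)."""

    def __init__(self):
        self.read_ts = 0    # largest timestamp of any accepted read
        self.write_ts = 0   # largest timestamp of any accepted write
        self.value = None


def to_read(item: DataItem, ts: int) -> bool:
    """Reject a read if a younger transaction already wrote the item."""
    if ts < item.write_ts:
        return False        # too late: the reading transaction must restart
    item.read_ts = max(item.read_ts, ts)
    return True


def to_write(item: DataItem, ts: int, value) -> bool:
    """Reject a write if a younger transaction already read or wrote the item."""
    if ts < item.read_ts or ts < item.write_ts:
        return False        # too late: the writing transaction must restart
    item.write_ts = ts
    item.value = value
    return True


# A transaction with timestamp 3 reading after a write at timestamp 5
# violates timestamp order and is rejected.
item = DataItem()
to_write(item, 5, "x")
print(to_read(item, 3))  # → False
```

The same accept/restart decision structure carries over to distributed settings, where the timestamps are issued across peers and each replica applies the rule locally.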


Table of Contents

Candidate’s Declaration................................................................................................... ii

Acknowledgement .......................................................................................................... iii

Abstract ......................................................................................................................v

Table of Contents .......................................................................................................... vii

List of Figures ............................................................................................................... xii

List of Tables .................................................................................................................xv

List of Abbreviations .................................................................................................... xvi

Chapter 1: Introduction ........................................................................................... 1-10

1.1 What is a Peer-to-Peer Network? ...............................................................................1

1.2 Why P2P Networks? ................................................................................................2

1.3 Applications of P2P Systems ...................................................................................3

1.4 Motivation................................................................................................................3

1.5 Issues in P2P Systems..............................................................................................5

1.6 Research Problem ....................................................................................................5

1.7 Work Carried Out ....................................................................................................6

1.8 Organization of Thesis.............................................................................................9

1.9 Summary ...............................................................................................................10

Chapter 2: Literature Review................................................................................ 11-55

2.1 Peer-to-Peer (P2P) Networks.................................................................................11

2.2 Types of P2P Networks .........................................................................................13

2.2.1 Structured P2P Networks...........................................................................13

2.2.2 Unstructured P2P Networks.......................................................................14

2.3 File Sharing System ...............................................................................................15

2.4 Underlay and Overlay P2P Networks ....................................................................18

2.5 Challenges in P2P Systems....................................................................................20

2.5.1 Challenges in P2P Networks.........................................................................20

2.5.2 Challenges for Databases in P2P Networks..................................................25

2.6 Parallelism in Databases ........................................................................................27


2.6.1 Partitioning Methods..................................................................................28

2.7 Concurrency Control..............................................................................................30

2.8 Topology Mismatch Problem ................................................................................31

2.9 Replication for Availability ...................................................................................32

2.10 Quorum Consensus ................................................................................................33

2.11 Databases ...............................................................................................................36

2.11.1 Real Time Applications Framework..........................................................38

2.12 Some Middlewares.................................................................................................39

2.13 Analysis..................................................................................................................53

2.14 Summary ................................................................................................................54

Chapter 3: Statistics Manager and Action Planner (SMAP) for P2P Networks ....................................................................................................... 56-66

3.1 Introduction ...........................................................................................................56

3.2 System Architecture...............................................................................................57

3.2.1 Interface Layer (IL)....................................................................................58

3.2.2 Data Layer (DL).........................................................................................60

3.2.3 Replication Layer (RL) ..............................................................................62

3.2.4 Network Layer (NL) ..................................................................................63

3.2.5 Control Layer (CL) ....................................................................................64

3.3 Advantages of SMAP ............................................................................................64

3.4 Discussion ..............................................................................................................65

3.5 Summary ................................................................................................................65

Chapter 4: Load Adaptive Data Distribution over P2P Networks..................... 67-97

4.1 Introduction............................................................................................................68

4.2 System Model .......................................................................................................69

4.3 3-Tier Execution Model (3-TEM) .........................................................................70

4.3.1 Transaction Coordinator (TC)....................................................................71

4.3.2 Transaction Processing Peer (TPP)............................................................73

4.3.3 Result Coordinator (RC) ............................................................................74

4.3.4 Working of 3-TEM ...................................................................................75

4.4 Load Balancing ......................................................................................................76

4.5 Database Partitioning .............................................................................................76

4.5.1 Matrix Assisted Technique (MAT)............................................................77


4.5.2 Database Partitioning .................................................................................79

4.5.3 Algorithm to Access the Partitioned Database ..........................................80

4.5.4 Peer Selection Criterion .............................................................................83

4.6 Simulation and Performance Study .......................................................................84

4.6.1 Assumptions...............................................................................................84

4.6.2 Simulation Model.......................................................................................84

4.6.3 Performance Metrics .................................................................................87

4.6.4 Simulation Results .....................................................................................89

4.7 Advantages of 3-TEM............................................................................................94

4.8 Discussion..............................................................................................................94

4.9 Summary ................................................................................................................95

Chapter 5: Concurrency Control in Distributed Databases over P2P Networks ....................................................................................................... 96-108

5.1 Introduction............................................................................................................96

5.2 System Model ........................................................................................................97

5.3 Transaction Model .................................................................................................98

5.4 Serializability of Transactions ...............................................................................99

5.5 A Timestamp based Secure Concurrency Control Algorithm (TSC2A)..............100

5.5.1 Algorithm for Write Operation ................................................................100

5.5.2 Algorithm for Read Operation .................................................................101

5.6 Simulation and Performance Study .....................................................................102

5.6.1 Performance Metrics................................................................................102

5.6.2 Assumptions.............................................................................................102

5.6.3 Simulation Results ...................................................................................103

5.7 Discussion............................................................................................................107

5.8 Summary ..............................................................................................................108

Chapter 6: Topology Adaptive Traffic Controller for P2P Networks ................. 109-127

6.1 Introduction..........................................................................................................109

6.2 System Model ......................................................................................................112

6.3 System Architecture.............................................................................................113

6.4 Common Junction Methodology (CJM) ..............................................................114

6.4.1 Common Junction Methodology Algorithm............................................114

6.4.2 System Analysis.......................................................................................116


6.5 Simulation and Performance Study .....................................................................118

6.5.1 Simulation Model.....................................................................................119

6.5.2 Performance Metrics................................................................................120

6.5.3 Simulation Results ...................................................................................122

6.6 Advantages of Using CJM ...................................................................................126

6.7 Discussion............................................................................................................126

6.8 Summary .............................................................................................................127

Chapter 7: Fault Adaptive Replica Placement over P2P Networks .................... 128-147

7.1 Introduction..........................................................................................................128

7.2 System Model ......................................................................................................130

7.3 Logical Adaptive Replica Placement Algorithm (LARPA) ................................131

7.3.1 LARPA Topology....................................................................................131

7.3.2 Identification of Number of Replicas in the System................................132

7.3.3 LARPA Peer Selection Criterion .............................................................133

7.3.4 Algorithm 1: Selection of Best Suited Peers ...........................................133

7.3.5 Algorithm 2: Selection of Suitable Peers with Minimum Distance.........134

7.4 Implementation ....................................................................................................136

7.4.1 Replica Leaving from the System............................................................138

7.4.2 Replica Joining to the System ..................................................................138

7.5 Simulation and Performance Study .....................................................................139

7.5.1 Performance Metrics................................................................................139

7.5.2 Simulation Results ...................................................................................140

7.6 Discussion............................................................................................................145

7.7 Summary ..............................................................................................................146

Chapter 8: Height Balanced Fault Adaptive Reshuffle Logical Structure for P2P Networks ............................................................................................. 148-167

8.1 Introduction..........................................................................................................148

8.2 System Model ......................................................................................................150

8.3 System Architecture.............................................................................................151

8.4 Height Balanced Fault Adaptive Reshuffle (HBFAR) Scheme...........................153

8.4.1 Rule Set-I: Rules for Generation of Height Balanced Fault Adaptive Reshuffle (HBFAR) Structure .............................................................156

8.4.2 Rule Set-II: Rules for Replica Leaving from HBFAR .............................157


8.4.3 Rule Set-III: Rules for Replica Joining into the Replica Logical Structure................................................................................................................157

8.4.4 Rule Set-IV: Rules for Acquisition of Read/Write Quorum from HBFAR Logical Tree...........................................................................................158

8.4.5 Correctness Proof of the Algorithm.........................................................160

8.5 Simulation and Performance Study .....................................................................162

8.5.1 Performance Metrics................................................................................163

8.5.2 Simulation Results ...................................................................................163

8.6 Discussion............................................................................................................166

8.7 Summary ..............................................................................................................167

Chapter 9: Conclusion and Future Work ............................................................. 168-173

9.1 Contributions .......................................................................................................169

9.2 Future Scope ........................................................................................................172

List of Publications ............................................................................................. 174-175

Bibliography ........................................................................................................ 176-193


List of Figures

2.1 The Basic Architecture of P2P Network..........................................................12

2.2 The Basic Client/Server Architecture ..............................................................12

2.3 Distributed Hash Table (DHT) ........................................................................14

2.4 Information Retrieval from a Hybrid P2P Based System ...................................15

2.5 Classifications of P2P System Networks.........................................................16

2.6 Typical Overlay Network ................................................................................19

2.7 The Architecture of Napster.............................................................................40

2.8 The Architecture of Gnutella ...........................................................................42

2.9 The Freenet chain-mode file discovery mechanism. The query is forwarded from node to node using the routing table until it reaches the node that has the requested data. The reply is passed back to the original node along the reverse path. ..............................................................43

2.10 The path taken by a message originating from node 67493 destined for node 34567 in a Plaxton mesh using decimal digits of length 5 in Tapestry............................................................................................................46

2.11 Chord identifier circle consisting of the three nodes 0, 1 and 3. In this figure, key 1 is located at node 1, key 2 at node 3 and key 6 at node 0...........48

2.12 (a) Example 2-d [0,1]×[0,1] coordinate space partitioned between 5 CAN nodes. (b) Example 2-d space after node F joins .........................................49

2.13 JXTA Architecture...........................................................................................50

2.14 APPA Architecture ..........................................................................................51

3.1 Architecture of Statistics Manager and Action Planner (SMAP) ....................59

4.1 3-Tier Execution Model (3-TEM) for P2P Systems ........................................70

4.2 System Architecture of 3-Tier Execution Model (3-TEM) .............................73

4.3 Logical View of Database Partitioning with rdf = 10, cdf = 3 ............................78

4.4 Simulation Model for 3-TEM ..........................................................................86

4.5 Relationship between Peer Availability and Partition Availability .....................90

4.6 Relationship between Throughput and Mean Transaction Arrival Rate..............91

4.7 Relationship between Number of Partitions and Response Time ........................91

4.8 Relationship between Mean Transaction Arrival Rate and Query Completion Ratio .............................................................................................92

4.9 Relationship between Mean Transaction Arrival Rate and Miss Ratio ...............93

4.10 Relationship between Mean Transaction Arrival Rate and Restart Ratio............93

4.11 Relationship between Mean Transaction Arrival Rate and Abort Ratio..............94


5.1 Comparisons between Miss Ratio of Transactions and Mean Transaction Arrival Rate (MTAR).................................................................104

5.2 Comparison between Transaction Restart Ratio and MTAR ........................105

5.3 Comparison between Transaction Success Ratio and MTAR .......................106

5.4 Comparison between Transaction Abort Ratio and MTAR ..........................106

5.5 Comparison between Throughput and MTAR ..............................................107

6.1 Overlay and Underlay Networks Setup .........................................................111

6.2 3-Layer Traffic Management System (3-LTMS) for Overlay Networks ......113

6.3 Network Simulation Model for P2P Networks..............................................119

6.4 Average Number of Partitions vs. Underlay Cardinality...............................122

6.5 Average Path Lengths for Maximum Reachability vs. Underlay Cardinality......................................................................................................123

6.6 Average Path Cost vs. Overlay Cardinality ...................................................124

6.7 Average Path Cost vs. Underlay Cardinality .................................................124

6.8 Average Response Time vs. Overlay Hop Count ..........................................125

6.9 Average Percentage Reduction in Path Cost vs. Overlay Path (Hop Count) .....125

6.10 Average Percentage Reduction in Response Time vs. Overlay Hop Count........126

7.1 Peers Selection and Logical Connection for LARPA Structure ....................136

7.2 LARPA obtains Logical Structure from the Network shown in Figure 7.1...................................................................................................................136

7.3 LARPA Structure Representing the Replica p14 departing the Network...........138

7.4 LARPA Structure Representing the Replica p5 from the Centre departing the Network....................................................................................138

7.5 Relationship between session time and its availability of a peer in P2P Networks ........................................................................................................140

7.6 Variations in response time with quorum size...............................................141

7.7 Variations in restart ratio with system workload ...........................................142

7.8 Relationship of transaction success ratio with system workload...................142

7.9 Variation in throughput with system workload .............................................143

7.10 Relationship between average search time with quorum size .......................143

7.11 Variation in network traffic with quorum size...............................................144

7.12 Probability to Access Updated Data vs. Peer Availability ............................144

7.13 Response Time Comparison between LARPA1 and LARPA2 .................................145

7.14 Messages Overhead Comparison between LARPA1 and LARPA2 ......................145

8.1 7-Layers Transaction Management System (7-LTMS) .................................152

8.2 The arrangement of peers to make a Height Balanced Fault Adaptive Reshuffle Tree over the peers from the underlay topology of P2P networks. Here the dotted line connector shows the connection between the peers in the overlay topology. The dark line connector shows the connection between the peers in the replica topology in the tree. P14 is shown as an isolated peer in the network. ..........................................155

8.3 Replica arrangements in the HBFAR Scheme generated from Figure 8.2. The session time of P1 is greater than that of P2 and P3. The order of the replicas according to session time from the HBFAR Scheme is P1, P2, P3, P4, P5, P6, P7, and P8. ......................................................................155

8.4 Replica arrangements in an HBFAR logical structure. Peer 2, shown by dotted lines, is a peer leaving the network .....................................158

8.5 The HBFAR structure after Peer 2 leaves. Peer 4 takes the position of Peer 2, which has already left the network. All other replicas in the downlink are readjusted accordingly .............................................................158

8.6 Reachability of peers under availability in the network ................................164

8.7 Comparison in accessing stale data under availability of peers.....................164

8.8 The comparison of average search time to form the quorum from the networks.........................................................................................................165

8.9 Comparison of average response time ...........................................................165

8.10 Comparison of average message transfer to maintain the system .................166

Page 15: Real time information dissemination and management in peer-to-peer networks

List of Tables

2.1 A Comparison of Various P2P Middlewares................................................55

4.1 Performance Metrics-I ..................................................................................87

4.2 Performance Metrics-II.................................................................................88

4.3 Performance Parameters Setup .....................................................................89

7.1 Effect of Peer Availability over Data Availability in the System...............132

7.2 Performance Metrics-III..............................................................................140

9.1 Comparison of Few Existing Systems with SMAP ....................................173


List of Abbreviations

1-TEM 1-Tier Execution Model

3-LTMS 3-Layer Traffic Management System

3-TEM 3-Tier Execution Model

7-LTMS 7-Layers Transaction Management System

AM Authenticity Manager

APC Average Path Cost

APL Average Path Length

ART Average Response Time

CCM Concurrency Control Manager

CJM Common Junction Methodology

CL Control Layer

CPU Central Processing Unit

DA Data Administrator

DAT Data Access Tracker

DBA Database Administrator

DBMS Database Management System

DCE Distributed Computing Environment

DD Data Distributor

DL Data Layer

DM Data Manager

DS Data Scheduler

DSS Data Storage Space

GCM Group Communication Manager

HBFAR Height Balanced Fault Adaptive Reshuffle

HQC Hierarchical Quorum Consensus

IL Interface Layer

LA Load Analyzer

LARPA Logical Adaptive Replica Placement Algorithm

LD Local Database

MAT Matrix Assisted Technique

MTAR Mean Transaction Arrival Rate


NCM Network Connection Manager

NL Network Layer

NM Network Manager

P2P Peer-to-Peer

PAL Peer Allocator

PA Peer Analyzer

PC Path Cost

PCS Path Cost Saved

PL Path Length

PPQ Participating Peer Queue

QEE Query Execution Engine

QI Query Interface

QM Quorum Manager

QO Query Optimizer

QP Quorum Processor

RA Resource Allocator

RC Result Coordinator

RDA Result Data Administrator

RL Replication Layer

RSM Result Manager

RM Resource Manager

ROM Replica Overlay Manager

ROWA Read One Write All

RP Result Pool

RPB Resource Publisher

RSM Replica Search Manager

RT Response Time

RTDB Real Time Database

RTDBS Real Time Database System

RTDDBS Real Time Distributed Database System

RTM Replica Topology Manager

RTR Response Time Reduction

SC Security Checker

SI Sub Transaction Interface


SM Security Manager

SMAP Statistics Manager and Action Planner

SQSM Subquery Schedule Manager

SRTDDBS Secure Real Time Distributed Database System

SS Schema Scheduler

SSM Sub Transaction Manager

TAR Transaction Abort Ratio

TC Transaction Coordinator

TI Transaction Interface

TLO Traffic Load Optimizer

TM Transaction Manager

TMR Transaction Miss Ratio

TPP Transaction Processing Peer

TRR Transaction Restart Ratio

TSC2A Timestamp based Secure Concurrency Control Algorithm

TSR Transaction Success Ratio

TTL Time to Live

UM Update Manager


Chapter 1

Introduction

Peer-to-Peer (P2P) networks were developed in the early 90s and were used mostly in-house by companies and for limited applications of sharing information between cooperating researchers. When the Internet began to explode in the mid 90s, a new wave of ordinary people began to use the Internet as a way to exchange email, access web pages, and buy things, which was much different from the initial usage. As intelligent systems become more pervasive and homes become better connected, a new generation of applications is being deployed over the Internet [1]. In this scenario, P2P applications become very attractive because they improve scalability and enhance performance by enabling direct and real time communication among the peers.

The rest of the chapter is organized as follows. A P2P network is introduced in Section 1.1. Objectives of P2P networks are presented in Section 1.2. Applications of P2P systems are given in Section 1.3. Section 1.4 discusses the motivation behind this research. Section 1.5 presents challenges in P2P systems. Section 1.6 gives the statement of this research. Section 1.7 presents the work contribution of the thesis. The organization of the thesis is explored in Section 1.8. Finally, the chapter is summarized in Section 1.9.

1.1 What is Peer-to-Peer Network?

Peer-to-Peer (P2P) systems provide an environment where peers (nodes) collaboratively perform computing tasks and share resources. Moreover, a P2P system links the resources of all participating peers in the network and allows the resources to be shared in a manner that eliminates the need for a central host. These peers can serve as both clients and servers. P2P systems may also be referred to as P2P networks. P2P systems are computer networks or systems in which peers of equal roles and responsibilities, often with various capabilities, exchange information or share resources directly with each other. These systems may function without any central


administration and coordination instance. A P2P network differs from conventional client/server or multitier server networks by allowing direct communication between peers.

P2P architecture enables true distributed computing and creates network of

computing resources. It allows systems to have temporary associations with each

other for a short period of time, and then separate afterwards. Besides these, peers are

autonomous in the sense that they can: (i) join the system anytime, (ii) leave without

any prior warning, and (iii) take routing decisions locally in an ad hoc manner [2].

More precisely, a P2P network can be defined as a distributed system consisting

of interconnected nodes that are able to self organize into network topologies with the

purpose of sharing resources such as content, CPU cycles, storage and bandwidth,

capable of adapting to failures and accommodating transient populations of peers

while maintaining acceptable connectivity and performance, without requiring the

intermediation or support of a global centralized server or authority [3].
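The symmetric client/server role described above can be sketched in a few lines of code. This is a minimal illustrative model, not from the thesis; the names `Peer`, `share`, and `request` are assumptions made for the example:

```python
# A minimal sketch of a peer that acts as both client and server: it
# publishes its own resources and requests resources directly from
# neighboring peers, with no central host involved.

class Peer:
    def __init__(self, name):
        self.name = name
        self.resources = {}   # resources this peer serves to others
        self.neighbors = []   # directly connected peers (overlay links)

    def share(self, key, value):
        """Server role: publish a resource for other peers to fetch."""
        self.resources[key] = value

    def request(self, key):
        """Client role: look up a resource locally, then ask neighbors."""
        if key in self.resources:
            return self.resources[key]
        for peer in self.neighbors:
            if key in peer.resources:
                return peer.resources[key]
        return None

a, b = Peer("A"), Peer("B")
a.neighbors.append(b)
b.share("song.mp3", b"...bytes...")
print(a.request("song.mp3") is not None)  # True: served directly by peer B
```

Note that peer A obtains the file directly from peer B's resource table; no index server mediates the exchange, which is the defining property quoted above.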

1.2 Why P2P Networks?

In contrast to the conventional client/server model, P2P systems are characterized by

symmetric roles among the peers, where every peer in the network acts alike, and the

processing and communication are widely distributed among the peers. Unlike the

conventional centralized systems, P2P systems offer scalability [4] and fault tolerance

[5, 6]. They are a feasible approach to implement global scale systems such as the Grid [6].

An important goal in P2P networks is that all clients provide resources, including

bandwidth, storage space, and computing power. Thus, as peers arrive and demand on

the system increases, the total capacity of the system also increases. This is not true

for traditional client/server architecture having a fixed set of servers, in which adding

more clients could mean slower data transfer for all users. The distributed nature of

P2P networks also increases robustness in case of failures by replicating data over

multiple peers, and in pure P2P systems by enabling peers to find the data without

relying on a centralized index server [7]. In the latter case, there is no single point of

failure in the system.


1.3 Applications of P2P Systems

A growing application of P2P technology is harnessing the dormant processing power in desktop PCs [8]. Because P2P has the main design principle of being completely decentralized and self organized, the P2P concept paves the way for new types of applications, such as file swapping applications and collaboration tools over the Internet, that have recently attracted tremendous user interest. Using software like

Kazaa [9], Gnutella [10, 11] or the now defunct Napster [12], users access files on

other peers and download these files to their computer. These file swapping

communities are commonly used for sharing media files, and MP3 music files. Kazaa

and Gnutella based networks allowed users to continue to share music files at a rate

similar to Napster at its peak. P2P networks became popular with the development,

popularity, and attention given to Napster [8, 13].

Another application domain of P2P networks is the sharing and aggregation of

large scale geographically distributed processing and storage capacities of idle

computers around the globe to form a virtual supercomputer as the SETI@Home

project did [14]. The P2P technology also allows for peripheral sharing, in which one

peer can access scanners, printers, microphones and other devices that are connected

to another peer.

Medical consultation, agricultural consultation and awareness programmes may be provided to people in rural areas using P2P technology, which may play a great role in making India a developed country by 2020 (a vision of Dr. Abdul Kalam, Former President of India). P2P systems may be used to share and exchange information, which may help to provide education. P2P systems may also be used for the implementation of Enterprise Resource Planning (ERP) systems, which require a huge amount of data for processing. For these types of systems P2P may be a cheaper and good option. Any bus/railway/airways information system may also be implemented over P2P networks. These are a few of the P2P applications which may be useful for the community.

1.4 Motivation

In the traditional client/server model, one powerful machine acts as the server, i.e. the

service provider and all other attached machines are clients, the service consumers.

But over the last two decades this model has been facing new challenges due to increased demands in computing and data sharing. Capacity enhancement in the client/server model


is very expensive due to the requirement of dedicated, expensive and powerful

hardware. Other challenges are single point of failure, scalability, load balancing, and

bandwidth congestion near the server.

Evolution in computer and communication technologies has also played an

important role in “Expecting More”. Distributed collaborative applications are

becoming common as a result of research and development in distributed systems.

Examples of such applications are grid, P2P, cloud computing and mobile computing.

The development in communication technologies (3G/4G), availability of Internet

bandwidth at affordable rates, better connectivity and availability of Internet has also

contributed in the applications of distributed system technologies.

The number of computing devices in homes is increasing rapidly with growth in

technology and availability of compact computing devices, e.g., laptops, smartphones

apart from PCs. In this era of technology, available hardware is fast, efficient, reliable and has large memory for storage. This combination of better hardware and connectivity provides a favorable environment for distributed applications. However, the utilization of these available resources is still limited. Most of the time these computing devices, despite their powerful hardware, are idle, and huge amounts of computational power and storage remain underutilized or wasted. Here, a question arises: “Can we combine and utilize these underutilized but distributed resources for any useful work?” The answer is yes, through available distributed technologies. P2P systems provide methods to combine these geographically dispersed and otherwise wasted resources.

P2P systems are gaining popularity in various application domains, e.g., communication and collaboration, distributed computation, Internet services support, database systems, content distribution, etc. The second generation Wiki is an example of such applications; it works over a P2P network and supports users in the elaboration and maintenance of shared documents in a collaborative and asynchronous manner.

Motivated by these challenges, this thesis has aimed to utilize the geographically distributed resources freely available on the Internet and to provide a real time distributed database management system by placing data over P2P system resources.


1.5 Issues in P2P Systems

P2P systems are usually large scale dynamic systems where nodes are distributed on a

wide geographic area. In P2P systems, the resources or peers are restricted to

temporary availability only. A network element can disappear at a given time from the

network and reappear at another locality of the network with an unpredictable pattern.

Under these circumstances, one of the most challenging problems of P2Ps is to

manage the dynamic and distributed network so that resources can always

successfully be located by their requesters when needed. Another important issue is the partitioning of data to improve data availability and provide primary security. The distribution of data over various peers is a difficult task, and secure distribution of data is another issue [15, 16].

The participating peers in P2P systems may join or leave the network with or without informing other peers. For the implementation of databases over P2P systems,

issues related to P2P networks as well as related to the database systems must be

addressed. Other issues related to P2P networks that should also be addressed are

churn rate, session time, P2P network traffic, overlay and underlay topologies,

topology mismatch problems, etc. The issues related to databases are data availability,

replication handling, concurrency control and security, accessing updated data from

dynamic environment, etc.

1.6 Research Problem

There are multiple challenges to be addressed in implementing Real Time Distributed

Databases Systems (RTDDBS) over dynamic P2P networks. In order to enable

resource awareness in such a large scale dynamic distributed environment, a specific management system is required, which takes into account the following P2P

characteristics: reduction in redundant network traffic, data distribution, load

balancing, fault tolerance, replica placement/update/assessment, data consistency, concurrency control, design and maintenance of logical structures for replicas, controlling

network traffic of overlay and underlay networks, etc.

In order to enable resource awareness in such a large scale dynamic distributed

environment, a specific resource management strategy is required that takes into

account the main P2P characteristics.


In this thesis, we are looking for a self organized system that will address some of the

above mentioned issues. Thus, we are required to develop a suitable solution for

resource management which should support fault tolerant operations, shortest path

length for requested resources, low overhead in network management operations, well

balanced load distribution between the peers and high probability of successful access

from the defined quorums. The developed system must be decentralized in nature, managing the P2P applications and the system resources in an integrated way, monitoring the behavior of P2P applications transparently, obtaining accurate resource projections, managing the connections between the peers, and distributing the objects (data items/replicas) in response to user requests under dynamic processing and networking conditions. The developed system should also place/disseminate dynamic data intelligently at the appropriate, suitable peers. To achieve the desired data availability, data must be replicated over a group of suitable peers by the system.

Further, this system should manage the data consistency among replicas. This system

should be fault tolerant and capable of managing load at every peer in the system. It

should be adaptable to any joining and leaving of peers to/from networks and address

the database related issues.

1.7 Work Carried Out

To address a few of the above issues we have designed the Statistics Manager and Action

Planner (SMAP) system for P2P networks. It is a five layer system. Various

algorithms are also proposed to enhance the performance of various layers. The

following are the major contributions of this research work:

1. SMAP enables fast and cost efficient deployment of information over the P2P

network. It is a self managed P2P system, having a capability to deal with high

churn rate of the peers in the network. SMAP is fault adaptive and provides load

balancing among participating peers. It permits true distributed computing

environment for every peer to use the resources of all other peers participating in

the network. It provides data availability by managing replicas in efficient logical

structure. SMAP provides fast response time for transactions with time constraints.

It reduces redundant traffic from P2P networks by reducing conventional overlay


path. It also addresses most of the implementation issues of P2P networks for

RTDDBS.

2. A 3-Tier Execution Model (3-TEM) is developed to enhance the execution

performance of the system. A Matrix Assisted Technique (MAT) is developed to

partition real time database for the P2P networks. MAT is integrated in 3-TEM. It

provides a mechanism to store partitions and access dynamic data over P2P

networks under the dynamic environment. MAT also addresses the primary security concern for the stored data and simultaneously improves data availability in the system. 3-TEM splits its functioning into three parts, i.e.,

Transaction Coordinator (TC), Transaction Processing Peer (TPP) and Result

Coordinator (RC). These are designed to operate in parallel to improve throughput

of the system. TC receives and manages the execution of arrived transactions in

the system. It resolves transaction mapped with global schema into

subtransactions mapped with the local schemas available with the TPPs. TPPs are developed for receiving subtransactions from the coordinator, executing them in serializable form, and submitting partial results to the RC. The RC compiles the partial results, prepares them according to the global schema, and finally delivers them to the user.
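The TC/TPP/RC flow above can be sketched as a small pipeline. This is an illustrative sketch only: the splitting rule, partition layout, and function names are assumptions, not the thesis's actual implementation.

```python
# Sketch of the 3-TEM flow: the Transaction Coordinator (TC) resolves a
# global transaction into subtransactions against local partitions, the
# Transaction Processing Peers (TPPs) execute them, and the Result
# Coordinator (RC) compiles the partial results for the user.

def transaction_coordinator(transaction, partitions):
    """TC: split a global transaction into subtransactions, one per
    partition that holds the data items it touches."""
    subs = {}
    for item in transaction["items"]:
        for pid, held in partitions.items():
            if item in held:
                subs.setdefault(pid, []).append(item)
    return subs

def transaction_processing_peer(pid, items, database):
    """TPP: execute a subtransaction against the peer's local partition
    and return a partial result."""
    return {item: database[pid][item] for item in items}

def result_coordinator(partials):
    """RC: merge the partial results according to the global schema."""
    merged = {}
    for part in partials:
        merged.update(part)
    return merged

partitions = {"P1": {"x", "y"}, "P2": {"z"}}
database = {"P1": {"x": 1, "y": 2}, "P2": {"z": 3}}
txn = {"items": ["x", "z"]}

subs = transaction_coordinator(txn, partitions)
partials = [transaction_processing_peer(p, i, database) for p, i in subs.items()]
print(result_coordinator(partials))  # {'x': 1, 'z': 3}
```

In the real system the three stages run in parallel on different peers to raise throughput; here they are called sequentially only to keep the sketch short.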

3. A Timestamp based Secure Concurrency Control Algorithm (TSC2A) is

developed which handles the issues of concurrent execution of transactions in

dynamic environment of P2P network. It maintains security of data and time

bounded transactions along with controlled concurrency. TSC2A uses timestamps to resolve the conflicts that arise in the system. It uses three security levels to secure the

execution of transactions. This also avoids the covert channel problem in the

system. TSC2A provides serializability in the execution of transactions at global as

well as at local level. It is implemented in the Data Layer of SMAP.
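A minimal sketch of the basic timestamp-ordering rule that TSC2A builds on is shown below. The security levels and covert-channel handling of TSC2A are omitted; the field names and read/write checks here are standard timestamp-ordering, used as an illustrative assumption rather than the thesis's exact algorithm.

```python
# Basic timestamp ordering: each data item tracks the largest read and
# write timestamps seen so far; an operation from an "older" transaction
# that arrives after a conflicting "younger" one is rejected (restarted).

def try_read(item, txn_ts, state):
    """Allow a read only if no younger transaction has already written."""
    if txn_ts < state[item]["wts"]:
        return False                      # too late: abort/restart reader
    state[item]["rts"] = max(state[item]["rts"], txn_ts)
    return True

def try_write(item, txn_ts, state):
    """Allow a write only if no younger transaction has read or written."""
    if txn_ts < state[item]["rts"] or txn_ts < state[item]["wts"]:
        return False                      # conflict: abort/restart writer
    state[item]["wts"] = txn_ts
    return True

state = {"x": {"rts": 0, "wts": 0}}
print(try_write("x", 5, state))   # True: first writer on item x
print(try_read("x", 3, state))    # False: older reader after a write
print(try_read("x", 7, state))    # True: younger reader is admitted
```

Because every admitted operation respects timestamp order, the resulting schedule is serializable in the order of transaction timestamps, which is the property TSC2A enforces at both global and local level.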

4. A Common Junction Methodology (CJM) reduces the redundant traffic generated

by the topology mismatch problem in the P2P networks. CJM finds its own route to transfer the messages from one peer to another. A Common Junction among two paths is identified for redirecting the messages. The messages are usually forwarded from one peer to another in the overlay topology. A message traverses a multihop distance in the underlay to deliver the message in the overlay. These multihops in the underlay may


intersect at some point, and this point, referred to as a Common Junction, is utilized to reroute the messages. It also reduces the traffic in the underlay network. CJM

reduces the traffic without affecting the search scope in the P2P networks. It supports fast response times by reducing the path length at the overlay level. Thus, the cost to transfer a unit of data from one peer to another is also reduced by CJM. The correctness of CJM is analyzed through a mathematical model as well as

through simulation. It is implemented in the Network Layer of SMAP.
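The Common Junction idea can be illustrated as follows: two consecutive overlay hops map to underlay router paths, and if those paths intersect at an interior router, the message can be cut over at the junction instead of traversing the redundant segment. The path representation and node names below are assumptions for illustration, not the thesis's data structures.

```python
# Detect a Common Junction between the underlay paths of two overlay
# hops and reroute the message through it.

def common_junction(path_ab, path_bc):
    """First interior underlay node shared by the two paths (the overlay
    endpoints themselves are excluded)."""
    interior = set(path_ab[1:-1])
    for node in path_bc[1:-1]:
        if node in interior:
            return node
    return None

def reroute(path_ab, path_bc):
    """Cut both paths at the junction, skipping the redundant segment."""
    j = common_junction(path_ab, path_bc)
    if j is None:
        return path_ab + path_bc[1:]      # no junction: keep full route
    return path_ab[:path_ab.index(j) + 1] + path_bc[path_bc.index(j) + 1:]

# Overlay hop A->B uses the underlay path via r2; hop B->C passes r2 again,
# so the message never needs to detour through peer B at all.
path_ab = ["A", "r1", "r2", "B"]
path_bc = ["B", "r2", "r3", "C"]
print(reroute(path_ab, path_bc))  # ['A', 'r1', 'r2', 'r3', 'C']
```

The rerouted path is two underlay hops shorter than the naive concatenation, which is exactly the saving CJM reports as reduced path cost and response time.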

5. A novel Logical Adaptive Replica Placement Algorithm (LARPA) is developed

which implements a logical structure for dynamic environments. The algorithm is adaptive in nature and tolerates up to n − 1 faults. It efficiently distributes replicas

on the one hop distance sites to improve data availability in RTDDBS over P2P

system. LARPA uses minimum number of peers to place replicas in a system.

These peers are identified through peer selection criteria. All peers are placed at

one hop distance from the centre of LARPA, which is the place from where any search starts. Depending upon the selection of peers for the logical structure, LARPA is

classified as LARPA1 and LARPA2. LARPA1 uses the peers with highest

candidature value only, calculated through peer selection criteria. This candidature

value is traded off in LARPA2 against the distance of peers from the centre. The effect of peers leaving and joining the system is also presented. LARPA improves the

response time of the system, throughput, data availability and degree of

intersection between two consecutive quorums. The reconciliation of LARPA is

fast, because the system updates itself at a fast rate. It also reduces the network traffic in

P2P network due to its one hop distance logical structure formation with minimum

number of replicas. It is implemented in the Replica Management Layer of SMAP.
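LARPA-style peer selection can be sketched as ranking peers by a candidature value and placing replicas on the top-ranked ones. The actual candidature formula is not reproduced here; the weighting of session time and availability below is an assumed stand-in for illustration.

```python
# Rank peers by an (assumed) candidature value and pick the replica hosts.

def candidature(peer, w_session=0.6, w_avail=0.4):
    """Assumed scoring: longer session times and higher availability
    make a peer a better replica host."""
    return w_session * peer["session_time"] + w_avail * peer["availability"]

def select_replica_peers(peers, k):
    """LARPA1-style choice: the k peers with the highest candidature
    value, all kept at one hop from the centre of the structure."""
    return sorted(peers, key=candidature, reverse=True)[:k]

peers = [
    {"id": "P1", "session_time": 0.9, "availability": 0.8},
    {"id": "P2", "session_time": 0.2, "availability": 0.9},
    {"id": "P3", "session_time": 0.7, "availability": 0.6},
]
chosen = select_replica_peers(peers, 2)
print([p["id"] for p in chosen])  # ['P1', 'P3']
```

LARPA2 would additionally penalize candidates by their distance from the centre, trading candidature value against locality; that refinement is omitted from this sketch.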

6. A self organized Height Balanced Fault Adaptive Reshuffle (HBFAR) scheme is

developed for improving hierarchical quorums over P2P systems. It arranges all

replicas in a tree logical structure and adapts the joining and leaving of a peer in

the system. It places all updated replicas on the root side of the logical structure. To

access updated data items from this structure, this scheme uses a special access,

i.e., Top-to-Bottom and Left-to-Right. The HBFAR scheme always selects updated replicas for quorums from the logical structure. It provides short quorum acquisition

time with high quorum intersection degree among two consecutive quorums,


which maximizes the overlapped replicas for read/write quorums. HBFAR

improves the response time and search time of replicas for quorums.

7. HBFAR scheme provides high data availability and high probability to access

updated data from the dynamic P2P system. High fault tolerance and low network

traffic are reported by the HBFAR scheme under the churn of peers. Parallelism in

quorum accessing and structure maintenance keeps HBFAR scheme updated

without affecting the quorum accessing time. It is analyzed mathematically as well

as through a simulator. It provides read-one access in its best case. It is

implemented in the Replica Management Layer of SMAP.
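The Top-to-Bottom, Left-to-Right access order over an HBFAR-style replica tree amounts to a level-order (breadth-first) scan: because the freshest replicas sit toward the root, the scan fills a read quorum with updated replicas first. The tree layout, availability flags, and quorum size below are assumptions for illustration only.

```python
# Collect a quorum by scanning an HBFAR-style replica tree Top-to-Bottom
# and Left-to-Right, skipping replicas whose peers are currently offline.

from collections import deque

def quorum_top_to_bottom(root, q):
    """Return the ids of the first q available replicas in level order."""
    quorum, queue = [], deque([root])
    while queue and len(quorum) < q:
        node = queue.popleft()
        if node["available"]:
            quorum.append(node["id"])
        queue.extend(node.get("children", []))
    return quorum

# A small height-balanced replica tree; P2 has churned out of the network.
tree = {"id": "P1", "available": True, "children": [
    {"id": "P2", "available": False, "children": [
        {"id": "P4", "available": True}, {"id": "P5", "available": True}]},
    {"id": "P3", "available": True, "children": [
        {"id": "P6", "available": True}]},
]}
print(quorum_top_to_bottom(tree, 3))  # ['P1', 'P3', 'P4']
```

Because consecutive quorums are drawn from the same root-first order, they overlap heavily near the root, which is the high quorum intersection degree the scheme relies on; in the best case the root replica alone satisfies a read, giving the read-one behavior noted above.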

1.8 Organization of Thesis

Work presented in this thesis is divided into nine chapters. In subsequent chapters

various techniques pertaining to the requirements, design, and development of the

proposed system architecture for real time data placement and management are

presented.

Chapter 1 briefly defines what a P2P network is and what the limitations and applications of P2P systems are. A look is also given at the challenges in existing systems and in the development of new ones. The objective of this research, followed by the contribution made in this dissertation, is also presented. This chapter gives a roadmap of the dissertation and finally summarizes itself. Chapter 2 presents the literature review.

Chapter 3 presents a fast, cost efficient and self managed P2P system, having a capability to deal with a high churn rate of peers in the network. It gives the architectural view of the Statistics Manager and Action Planner (SMAP) system and the advantages behind the development of SMAP, followed by the summary of the chapter.

Chapter 4 explores the architecture of the 3-Tier Execution Model (3-TEM). 3-TEM

executes system events in parallel and provides high throughput in the system. It is

integrated with a Matrix Assisted Technique (MAT) for partitioning the real time

database for the P2P networks. This chapter also discusses the simulation study findings, followed by the summary of the chapter.


Chapter 5 gives the Timestamp based Secure Concurrency Control Algorithm

(TSC2A). It handles the issues of concurrent execution of transactions in dynamic

environment of P2P network. This chapter also presents system model and system

architecture for the implementation of the TSC2A. It also gives simulation results and

findings, followed by the summary of the chapter.

Chapter 6 highlights a Common Junction Methodology (CJM). It reduces

redundant network traffic generated by the topology mismatch problem in the P2P network. It also explores the correctness proof, implementation, simulation results and findings, followed by the summary of the chapter.

Chapter 7 explores the Logical Adaptive Replica Placement Algorithm (LARPA). It identifies the suitable number of replicas to maintain data availability in

an acceptable range. Two variants of LARPA are presented for maintaining the logical structure. The chapter also gives the implementation, simulation results and findings, followed by the summary of the chapter.

Chapter 8 discusses a self organized Height Balanced Fault Adaptive Reshuffle

(HBFAR) scheme. It is developed for improving hierarchical quorums over P2P

systems. It also gives the correctness proof of HBFAR, its implementation, simulation results and findings, followed by the summary of the chapter.

Finally, Chapter 9 concludes the work presented in this thesis, followed by the future

scope.

1.9 Summary

In this chapter we have briefly defined P2P networks, their limitations and applications. A look is also given at the challenges in existing systems. The motivation for doing this research is followed by the contributions made in this dissertation. This chapter gives a roadmap of the thesis.

In the next chapter we will present literature review.


Chapter 2

Literature Review

In recent years evolution of a new wave of innovative network architectures for P2P

networks has been witnessed [17]. In these networks all peers cooperate with each

other to perform a critical function in a decentralized manner. All peers, i.e., both users and resource providers (service providers), can access each other directly without intermediary agents. Compared with a centralized system, a P2P system provides an easy way to aggregate large amounts of resources residing on the edge of the Internet or in ad hoc networks with a low cost of system maintenance.

The rest of the chapter is organized as follows. P2P networks are explored in Section 2.1. Types of P2P networks are presented in Section 2.2. File sharing systems are given in Section 2.3. Section 2.4 discusses underlay and overlay networks. Section 2.5 presents challenges in P2P systems. Section 2.6 discusses parallelism in databases. Section 2.7 presents concurrency control. The topology mismatch problem is explored in Section 2.8. Replication for availability is given in Section 2.9. Section 2.10 explores quorum consensus. Section 2.11 presents databases, and some middlewares are presented in Section 2.12. An analysis of the reviewed work is presented in Section 2.13. Finally, the chapter is summarized in Section 2.14.

2.1 Peer-to-Peer (P2P) Networks

A P2P system is a distributed network architecture composed of participants that make a portion of their resources, such as processing power, disk storage or network bandwidth, directly available to other network participants, without the need for central coordination instances such as servers or stable hosts (see Figure 2.1). A P2P

system assumes equipotency of its participants, organized through the free

cooperation of equals for performing a common task [18].


Figure 2.1 The Basic Architecture of P2P Network

In P2P networks, all peers cooperate with each other to perform a critical function

in a decentralized manner. These peers, i.e., both users and resource providers (service providers), can access each other directly without intermediary agents.

Compared with a centralized system, a P2P system provides an easy way to aggregate

large amounts of resources residing on the edge of the Internet or in ad hoc networks

with a low cost of system maintenance. P2P systems attract increasing attention from

researchers. Such systems are characterized by direct access between peer systems,

rather than through a centralized server. More simply, a P2P network links the

resources of all the peers on a network and allows the resources to be shared in a

manner that eliminates the need for a central host. In P2P systems, peers of equal

roles and responsibilities, often with various capabilities, exchange information or

share resources directly with each other. Such types of systems function without any

central administration and coordination instance. A P2P network differs from

conventional client/server or multitiered server's networks. The peers are both

suppliers and consumers of resources, in contrast to the traditional client/server model

where only servers supply and clients consume (see Figure 2.2).

Figure 2.2 The Basic Client/Server Architecture


2.2 Types of P2P Networks

P2P systems can be of various types. File sharing is the dominant P2P application on the Internet, allowing users to easily contribute, search, and obtain content [19, 20, 21]. P2P file sharing architectures can be classified according to the extent to which they rely on one or more servers to facilitate the interaction between peers. P2P systems are categorized into centralized, decentralized structured, and decentralized unstructured.

An important achievement of P2P networks is that all clients provide resources, including bandwidth, storage space, and computing power. Thus, as peers arrive and demand on the system increases, the total capacity of the system also increases. This is not true for a client/server architecture with a fixed set of servers, in which adding more clients can mean slower data transfer for all users. Companies are using the processing capabilities of many smaller, less powerful computers to replace large and expensive supercomputers [5, 22]. Large computing tasks can thus be fulfilled using the processing power of existing in-house computers or by accessing computers through the Internet.

2.2.1 Structured P2P Networks

Structured P2P networks employ a globally consistent protocol to ensure that any peer can efficiently route a search to some peer that has the desired file, even if the file is extremely rare (see Figure 2.3). Such a guarantee necessitates a more structured pattern of overlay links [23]. By far the most common type of structured P2P network is the distributed hash table (DHT) [24, 52], in which a variant of consistent hashing is used to assign ownership of each file to a particular peer, in a way analogous to a traditional hash table's assignment of each key to a particular array slot.

DHTs are a class of decentralized distributed systems that provide a lookup

service similar to a hash table: (key, value) pairs are stored in the DHT, and any

participating peer can efficiently retrieve the value associated with a given key.

Responsibility for maintaining the mapping from keys to values is distributed among

the peers, in such a way that a change in the set of participants causes a minimal

amount of disruption. This allows DHTs to scale to extremely large numbers of peers

and to handle continual peer arrivals, departures, and failures.
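The (key, value) placement just described can be sketched with a toy consistent-hashing ring. This is a minimal illustration, not the protocol of any cited DHT; the class and peer names are hypothetical:

```python
import hashlib
from bisect import bisect_right

def ring_hash(name: str) -> int:
    """Map a peer id or data key onto a 2^16-slot hash ring."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % 2**16

class TinyDHT:
    """Each peer owns the arc of the ring ending at its own position, so a
    change in the peer set only remaps the keys on one arc."""

    def __init__(self, peers):
        # Sorted ring positions, each paired with its peer.
        self.ring = sorted((ring_hash(p), p) for p in peers)

    def owner(self, key: str) -> str:
        """The first peer clockwise from the key's ring position owns the key."""
        pos = ring_hash(key)
        idx = bisect_right([h for h, _ in self.ring], pos) % len(self.ring)
        return self.ring[idx][1]

dht = TinyDHT(["peer-a", "peer-b", "peer-c"])
print(dht.owner("song.mp3"))  # resolves to exactly one responsible peer
```

Because the mapping is a pure function of the hash ring, any peer that knows the ring can resolve a key locally, which is the property DHT routing schemes then approximate with partial routing tables.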


Figure 2.3 Distributed Hash Table (DHT): a hash function maps data keys to distributed keys owned by peers across the Internet

DHTs form an infrastructure that can be used to build P2P networks. DHT based networks have been widely utilized for accomplishing efficient resource discovery in grid computing systems, as they aid in resource management and the scheduling of applications. Resource discovery involves searching for the resource types that match the user's application requirements. Recent advances in the domain of decentralized resource discovery have been based on extending existing DHTs with the capability of multidimensional data organization and query routing.

2.2.2 Unstructured P2P Networks

An unstructured P2P network is formed when the overlay links [25] are established arbitrarily. Such networks can be easily constructed, as a new peer that wants to join the network can copy the existing links of another peer and then form its own links over time. In an unstructured P2P network, if a peer wants to find a desired piece of data, the query has to be flooded through the network to find as many peers as possible that share the data (see Figure 2.4). The main disadvantage of such networks is that queries may not always be resolved. Popular content is likely to be available at several peers, and any peer searching for it is likely to find it, but if a peer is looking for rare data shared by only a few other peers, the search is unlikely to succeed. Since there is no correlation between a peer and the content it manages, there is no guarantee that flooding will find a peer that has the desired data. Flooding also causes a high amount of signaling traffic in the network, and hence such networks typically have very poor search efficiency [26]. Many of the popular P2P networks are unstructured.
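The TTL-limited flooding just described can be sketched as follows. The five-peer topology and the placement of the object are purely illustrative, chosen to show both effects: a small TTL misses rare content, and every extra hop multiplies the message traffic:

```python
from collections import deque

overlay = {  # adjacency list of arbitrary overlay links (illustrative)
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
    "D": ["B", "C", "E"], "E": ["D"],
}
holders = {"E"}  # the only peer that stores the requested object

def flood(source: str, ttl: int):
    """Breadth-first flood from `source`; returns (responders, messages_sent)."""
    seen, hits, messages = {source}, set(), 0
    frontier = deque([(source, ttl)])
    while frontier:
        peer, t = frontier.popleft()
        if peer in holders:
            hits.add(peer)          # this peer would reply along the reverse path
        if t == 0:
            continue                # TTL exhausted: stop forwarding here
        for nxt in overlay[peer]:
            messages += 1           # every forwarded copy is signaling traffic
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, t - 1))
    return hits, messages

print(flood("A", ttl=2))  # too small a TTL: the rare object is not found
print(flood("A", ttl=3))  # one more hop finds it, at the cost of more messages
```

Note that the message count grows even when no new peers are reached, since flooding forwards to already-visited neighbors; this redundant traffic is exactly the inefficiency the section attributes to unstructured search.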


In pure P2P networks, peers act as equals, merging the roles of client and server. In such networks, there is no central server managing the network, nor is there a central router. Some examples of pure P2P application layer networks designed for file sharing are Gnutella and Freenet [27]. There also exist hybrid P2P systems, which divide their clients into two groups: client peers and overlay peers. Typically, each client is able to act according to the momentary needs of the network and can become part of the respective overlay network used to coordinate the P2P structure. This division between normal and better-provisioned peers is done in order to address the scaling problems of early pure P2P networks, for example Gnutella (version 2.2).

Figure 2.4 Information Retrieval from a Hybrid P2P Based System (peers register with a directory server and send it search requests; the request/response exchange then takes place directly between peers)

Another type of hybrid P2P network uses central server(s) or bootstrapping mechanisms on the one hand and P2P data transfers on the other. Such networks are in general called centralized networks because of their inability to work without their central server(s), e.g., the eDonkey network (eD2k) [28].

2.3 File Sharing System

File sharing is the dominant P2P application on the Internet, allowing users to easily contribute, search, and obtain content. These applications were popularized by file sharing systems like Napster [13]. P2P file sharing networks have inspired new structures and philosophies in other areas of human interaction. In such social contexts, P2P as an idea refers to the classless social networking that is currently emerging throughout society, in general enabled by Internet technologies.

15

Page 34: Real time information dissemination and management in peer-to-peer networks

The P2P file sharing architecture can be classified according to the extent to which it relies on one or more servers to facilitate the interaction between peers. P2P systems are categorized [29] into centralized, decentralized structured, and decentralized unstructured, as shown in Figure 2.5.

Centralized: In this type of system, there is central control over the peers. A server carries the information regarding the peers, data files, and other resources. Any peer that wants to communicate with or use the resources of another peer has to send a request to the server. The server then searches for the location of the peer/resource through its database/index. After getting this information, the peer communicates directly with the desired peer. This system is very similar to the client/server model; Napster, which is very popular for sharing music files, is an example. Security measures can be implemented thanks to the central server: at the time a request is sent, the authorization and authentication of the peer can be checked.
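The register/lookup interaction described above can be sketched as follows. This is an illustrative Napster-style directory, not Napster's actual protocol; the class and peer names are hypothetical. The server only resolves who holds a file; the transfer itself then happens peer to peer:

```python
class DirectoryServer:
    """Central index of a Napster-style system: file name -> holding peers."""

    def __init__(self):
        self.index = {}

    def register(self, peer: str, files):
        """A joining peer announces the files it shares."""
        for f in files:
            self.index.setdefault(f, set()).add(peer)

    def lookup(self, filename: str):
        """Resolve which peers hold the file; the download is then direct."""
        return sorted(self.index.get(filename, set()))

server = DirectoryServer()
server.register("peer-1", ["a.mp3", "b.mp3"])
server.register("peer-2", ["b.mp3"])
print(server.lookup("b.mp3"))  # → ['peer-1', 'peer-2']
```

The sketch also makes the single point of failure concrete: if `server` disappears, no lookup can complete, even though every peer still holds its files.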

Figure 2.5 Classification of P2P Systems: centralized (e.g., Napster) versus decentralized, the latter either structured (e.g., Chord, CAN) or unstructured (e.g., Gnutella, Freenet)

It is easy to locate and search for an object/peer due to the central server. These systems are easy to implement, as the structure is similar to the client/server model, i.e., complexity is low. However, these systems are not scalable due to limitations of computational capability, bandwidth, etc. They have poor fault tolerance due to the unavailability of object replication [30] and load balancing. They are not reliable, owing to the single point of failure, malicious attacks, and network congestion near the server, and they are the least secure. The overhead on


the performance of the system is also high. Distributed databases may be used in these types of systems.

In centralized P2P systems, resource discovery is done using the central server, which keeps all information regarding resources, e.g., Napster [13]. Multiple servers are used to enhance performance in centralized systems [31].

Decentralized Structured: Decentralized structured P2P networks (e.g., Chord [31], CAN [32, 33], Tapestry [34, 35], Pastry [34], and TRIAD [36]) use a logical structure to organize the peers of the network. These networks use a distributed hash table like mechanism to look up files and are efficient in locating objects quickly due to the logical structure (the search space is reduced exponentially). It is easy to locate and search for an object/peer thanks to this logical structure, and the message traffic in these networks is reduced. These systems are scalable due to dynamic routing protocols; they have good performance and are least affected by scale. They are reliable in nature and support failed-peer detection and replication of objects. However, because these systems impose tight control over the overlay topology, they are not robust to peer dynamics. Their performance is severely affected if the churn rate is high, and they are not suitable for ad hoc peers. Database searching is comparatively complex compared with centralized systems [37].

Decentralized Unstructured: These systems are the actual P2P systems, i.e., the closest to the definition of P2P systems [38, 39]. There is no central control, and every peer may act as a server (which provides services) as well as a client (which consumes services). A peer that wants to communicate with another peer has to broadcast/flood its request to all connected peers to search for the peer/data object. Only a peer having the data responds, sending the data object through the reverse path to the requesting peer. The flooding or broadcasting of requests creates unnecessary traffic on the network, which is the main drawback of these systems. A lot of work is going on to reduce this network traffic, and various techniques have been proposed, e.g., forwarding based, cache based, and overlay optimization [40].


These systems do not have tight control over the overlay topology, so they support peer dynamics, and their performance is not much affected by a high churn rate. They are distributed in nature, so there is no single point of failure. Their scalability, however, is poor due to the traffic overhead of discovering objects/peers; as the system grows beyond a limit, its performance keeps decreasing. It is very costly to search for a resource in an unstructured system. Flooding is used to search for a resource; to enhance performance, Random Walk [41] and location-aware topology matching [42] are used. For providing fault tolerance, self-maintenance and self-repairing techniques are used [17].

For providing security to information, these systems use PKI [18]. Alliatrust, a reputation management scheme [19], deals with threats such as free riders, polluted content, etc.

To cope with query loss and system overloading, a congestion aware search protocol may be used [17]. This includes Congestion Aware Forwarding (CAF), Random Early Stop (RES), and Emergency Signaling (ES). Location dependent queries use the Voronoi diagram [43]. Structured and unstructured P2P networks each have their own advantages and disadvantages, and the file sharing system of a P2P network depends upon the application deployed on the network. For implementing databases over P2P networks, structured file sharing systems have an advantage over unstructured ones, because of the multiple communications between peers and the reduced search time for data in the network.

2.4 Underlay and Overlay P2P Networks

The underlay network comprises the active/passive entities participating in transferring a message (physically) from source to destination using physical channels/links. An overlay network is a computer network (refer to Figure 2.6) that logically connects the peers and is built on top of the underlay network (IP) [44, 45, 46]. A link between peers in the overlay (logical structure) corresponds to a path between them through many physical links in the underlying network. For example, distributed systems such as cloud computing, P2P networks, and client/server applications are overlay networks because their peers run on top of the Internet. The Internet itself was built as an overlay upon the telephone network. Overlay networks have also been proposed as a way to improve Internet routing, such as through quality of service


(QoS) guarantees to achieve higher quality streaming media [47]. Earlier proposals such as IntServ, DiffServ, and IP Multicast have not seen wide acceptance, largely because they require modification of all routers in the network. An overlay network, on the other hand, may be incrementally deployed on end hosts running the overlay protocol software, without cooperation from ISPs. The overlay has no control over how packets are routed in the underlying network between two overlay peers, but it controls the sequence of overlay peers a message traverses before reaching its destination.

Figure 2.6 Typical Overlay Network (peers of the application-layer overlay topology are mapped onto nodes of the network-layer underlay topology; logical connections in the overlay correspond to paths of physical connections in the underlay)

Such an overlay might form a structured network following a specific topology or an unstructured network whose participating entities are connected in a random or pseudo-random fashion. Weakly structured P2P overlays, in which peers are linked depending on a proximity measure, provide more flexibility than structured overlays and better performance than fully unstructured ones. Proximity aware overlays connect participating entities to close neighbors according to a given proximity metric reflecting some degree of


affinity (computation, interest, etc.) between peers. Researchers need to use this

approach to provide algorithmic foundations of large scale dynamic systems.

2.5 Challenges in P2P Systems

The Internet started out as a fully symmetric P2P network of cooperating users. As it has grown to accommodate the millions of people flocking online, technologies have been put in place that have split the network into a system with relatively few servers and many clients. These phenomena pose challenges and obstacles to P2P applications: both the network and the applications have to be designed together to work in concert. Application authors must design robust applications that can function in the complex Internet environment, and network designers must build in capabilities to handle new P2P applications. Fortunately, many of these issues are familiar from the experience of the early Internet; researchers must learn its lessons and apply them in new system designs. A P2P system has to address challenges that are related to networks as well as application specific ones. The problem defined in this thesis is to build a P2P system for real time information dissemination and management. This problem is twofold, relating to dynamic P2P networks and to real time databases. Thus, the next sections present a discussion of challenges related to networks and to real time databases.

2.5.1 Challenges in P2P Networks

P2P systems are usually large scale dynamic systems whose peers are distributed over a wide geographic area. In order to enable resource awareness in such a large scale dynamic distributed environment, a specific resource management strategy is required that takes into account the P2P characteristics. Within the scope of this research, a suitable solution for real time data/resource management in P2P systems must fulfill the following requirements:

Fault Tolerance [8]: P2P systems are used in situations where a system has to function properly without any kind of centralized monitoring or management facility. Because of the dynamic behavior of peers, an appropriate resource management strategy for P2P systems must support fault tolerance in its operations [48]. Therefore, automatic self-recovery from failures without seriously affecting overall performance


becomes extremely important for P2P systems. The term fault tolerance means that a

system can provide its services even in the presence of faults that are caused either by

internal system errors or occur due to some influence of its environment.

Thus, scalability [49] and reliability are defined in traditional distributed system terms, such as bandwidth usage: how many systems can be reached from one peer, how many systems can be supported, how many users can be supported, and how much storage can be used. However, sometimes it is not possible to recover from a failure. It is then necessary that the system be capable of adequately providing its services in the presence of such partial failures. In case of a failure, a P2P system must be capable of providing continuous service while necessary repairs are being made. In other words, an operation such as routing between any two peers n1 and n2 must complete successfully even when some peers on the path from n1 to n2 fail unpredictably.

Reliability is related to system and network failures, disconnection, availability of resources, etc. With the lack of a strong central authority over autonomous peers, improving system scalability and reliability is an important goal. As a result, algorithmic innovation in the area of resource discovery and search has been a clear area of research, resulting in new algorithms for existing systems and the development of new P2P platforms.

Low Cost of Network Maintenance [5, 50, 51]: The management of a peer's insertion into or deletion from the network, as well as the dissemination and replication of resources, generates control messages in the network. Control messages are mainly used to keep the topology changing network up-to-date and in a consistent state. However, since the number of control messages can become very large, even larger than the number of data packets, the proportion of control messages to data packets must be kept as low as possible. The cost of resource management should not be higher than the cost of the network resource utilization itself.

Load Balancing [51]: The load distribution is measured by investigating how well the network management duties are distributed between the peers in the network. Parameters for assessing this are, for example, the routing table and the location table at each peer of the system. A suitable resource management strategy for P2P should


ensure a well balanced distribution of the management duties between the peers of the

system [3, 51, 53].

Peer Availability [54]: A peer's lifetime is the time between when it enters the overlay for the first time and when it leaves the overlay permanently. A peer's session time is the elapsed time between when it joins the overlay and when it subsequently leaves it. The sum of a peer's session times divided by its lifetime is defined as its uptime, also called availability [55, 56, 57, 58, 59]. The availability of a P2P management solution defines the probability that a resource is successfully located in the system. A resource management strategy is said to be highly available when it enables any existing resource of the system to be found, when requested, with a probability of almost 100%. This depends on the fault tolerant routing and the resource distribution strategies [2, 60].

Cost Sharing/Reduction: Centralized systems that serve many clients typically bear the majority of the cost of the system. When that cost becomes too large, a P2P architecture can help spread it over all the peers [1, 18]. For example, in the file sharing space, the developed system will enable cost sharing of file storage and will be able to maintain the index required for sharing. Much of the cost sharing is realized by the utilization and aggregation of otherwise unused resources, which results both in net marginal cost reductions and in a lower cost for the most costly system component. Because peers tend to be autonomous, it is important for costs to be shared reasonably.

Logical Structure: The structure in which replicas/peers are connected plays an important role in reducing the search time for replicas and the network traffic. The messages propagated through the structure to search for replicas generate huge network traffic because of the topology mismatch problem. Structures should therefore be selected to minimize the search time and network traffic.

Underlay/Overlay Paths [61, 62]: A message travels multiple hops in the underlay corresponding to a one-hop path in the overlay. Each forwarding of a message along an overlay path therefore adds heavy traffic to the physical network. The peers in the underlay are traversed multiple times by messages forwarded


along the overlay path. This causes redundant traffic in the network, and the network may slow down to the extent of choking. Unnecessary message forwarding, at least at the overlay level, should be minimized.

Resource Aggregation and Interoperability [7, 50]: A decentralized approach lends

itself naturally to aggregation of resources. Each peer in the P2P system brings with it

certain resources such as computing power or storage space. Applications that benefit

from huge amounts of these resources, such as compute intensive simulations or

distributed file systems, naturally lean toward a P2P structure to aggregate these

resources to solve the larger problem. Interoperability is also an important

requirement for the aggregation of diverse resources.

Dynamism [6, 51]: P2P systems assume that the computing environment is highly

dynamic. That is, resources, such as compute peers, will be entering and leaving the

system continuously. When an application is intended to support a highly dynamic

environment, the P2P approach is a natural fit. In communication applications, such

as Instant Messaging, so called “Buddy Lists” are used to inform users when persons

with whom they wish to communicate become available. Without this support, users

would be required to “poll” for chat partners by sending periodic messages to them.

Dynamic Service Relationships [63, 64]: Dynamic service relationships become an important issue in P2P systems because those systems are non-deterministic, dynamic, and self-organizing based on the immediately available resources. A P2P system is typically loosely coupled; moreover, it is capable of adapting to changes in the system structure and its environment, viz., the number of peers, their roles, and the infrastructure. In order to build a loosely coupled system that is capable of dynamic reconfiguration, several mechanisms must be in place.

Data/Peer Discovery: There must be a distributed search mechanism that allows for

finding services and service providers based on certain criteria. The challenge is to

find the right number of look up services that should be available in the system.

Another challenge is how to decide which peer will run a look up service in a fully

distributed environment. Again we need a decision making system or voting. Running


a lookup service requires additional resources such as power and memory from the peer, and therefore cannot always be requested from the peer free of charge.

Thus, the path length of the resource lookup operation is a benchmark for the effectiveness of the resource management. Any requested resource should be found within an optimal lookup path length that is as close as possible to the Moore bound D = log_{Δ-1}(N_max(Δ - 2) + 2) - log_{Δ-1} Δ [65, 66]. Here, D is the diameter of a Moore graph, i.e., the lowest possible maximum end-to-end distance between any two peers in a connected graph of N_max peers with degree Δ.
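As a numeric sanity check of the Moore bound D = log_{Δ-1}(N_max(Δ - 2) + 2) - log_{Δ-1} Δ, the sketch below evaluates it for a known Moore graph; the function name is illustrative:

```python
import math

def moore_diameter(n_max: int, delta: int) -> float:
    """Lower bound on the diameter of a connected graph with n_max
    vertices of degree delta (requires delta >= 3)."""
    base = delta - 1
    return math.log(n_max * (delta - 2) + 2, base) - math.log(delta, base)

# The Petersen graph is a Moore graph with degree 3 and diameter 2; its
# size follows from N = 1 + Δ((Δ-1)^D - 1)/(Δ-2) = 1 + 3*(2^2 - 1)/1 = 10,
# so the bound evaluated at N_max = 10, Δ = 3 should return D = 2 exactly.
print(round(moore_diameter(10, 3), 6))  # → 2.0
```

For a fixed degree the bound grows only logarithmically in N_max, which is why it serves as the benchmark for scalable lookup path lengths.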

Naming/Addressing [6]: In order to identify a resource (peer or service), a unique identification mechanism or naming concept needs to be introduced into a P2P system. How should a peer in the global network be addressed? Addresses that are normally used to access a peer in the network (such as an IP address in a TCP/IP network) do not help much, because a P2P system is heterogeneous; different addressing protocols can theoretically be used within one P2P network.

Security [67, 68, 69, 70]: P2P systems are subject to numerous security challenges, such as making sure a user of the system is really who it claims to be. In P2P systems, service and resource consumers might require proof of information about the provider; otherwise, authentication cannot be considered successful. Therefore, distributed trust establishment mechanisms are needed that decide the authentication of a user wishing to access the system. In centralized systems, the user rights are predefined, and the decision to allow access for a certain user is therefore taken based on these predefined rights [13]. In P2P systems, the requestor is not known a priori, which leads to a complex decision making process. This includes the challenges of making sure data cannot be accessed by unauthorized parties, making sure it was not modified "on the wire" without this being recognized, proving from whom the data came (for example, with cryptographic signatures), and making sure that actions that have been executed cannot be claimed never to have happened (non-repudiation). Thus, the system must be especially hardened against insider attacks, because people can easily become insiders.

State and Data Management: P2P systems are characterized by the fact that a single

failing peer must not bring down the system as a whole. Of course, specific services


(those that had lived on the dying peer) might not be available anymore, but the

system still fulfills a useful purpose. In many systems this requires facilities for some

kind of distributed data management [10]. As a consequence, we have to look at the

following challenges: replication [71, 72, 73], caching [63], consistency and

synchronization, and finding the nearest copy.

2.5.2 Challenges for Databases in P2P Networks

A conventional distributed database expects 100% availability of the servers/hosts on the network where the database is deployed. In P2P networks, however, all peers are prone to leave the network as and when they want, i.e., there is no control over the participating peers on which the database is to be placed. This presents a separate set of challenges to be addressed, a few of which are as follows.

Peer Selection for Storing the Database [74, 75, 76]: A large number of peers participate in P2P networks, depending upon their interest in the system. Peers are connected to the system with some bandwidth, and each peer has a session time for which it is connected. Each peer acts as a server as well as a client, depending upon the service/resource provided or consumed. It is very difficult to select a suitable peer from among the participating peers under a high churn rate. For selecting suitable peers that may contribute to the system performance, a variety of parameters affecting system performance have to be checked.

Peer Discovery: The network uses distributed discovery algorithms to find suitable peers having sufficient resources for the system. The selection of peers for the various roles in the system is to be addressed.

Network Traffic Balancing: P2P systems generate a huge amount of redundant traffic over the Internet due to the topology mismatch problem. This traffic increases further, exponentially, whenever a P2P system deals with a database, raising Internet traffic to the level of congestion/choking. An efficient system must handle this increased traffic; reducing the traffic load on the Internet and balancing the traffic in case of congestion are to be addressed.


Database Partitioning: Databases may be partitioned to maintain availability, security, peer load, etc. The system performance also depends upon how the database is divided into partitions and how these partitions are accessed for a query submitted to the system.

Data Availability [77, 78]: The data availability of a system is a measure of how often the complete data is available for an arriving query. High data availability increases the performance of the system; hence, high data availability is required, and maintaining it is a challenge for the system.

Security of the Database Partition [79]: The database cannot be placed at untrusted peers, which could misuse, tamper with, or destroy it. This issue should be considered while addressing the other challenges.

Query Interface in a Heterogeneous Environment: Each peer participating in the system may be heterogeneous in its hardware and the platform used. Executing an arriving query through heterogeneous peers is also an important issue in implementing databases over P2P systems.

Schema Mapping [80]: At the time of data partitioning, a global schema is partitioned into local schemas. Arriving queries are based upon the global schema, so a mapping is required to execute such a global query through the local schemas. The technique used for schema mapping also affects the system performance.

Peers Mapping: In this operation, the peers used to store the database partitions are selected. The mechanism used to map data items onto the peers also affects the system performance.

Transparency in the Database: Transparency is the hiding of the intermediate processes from the user. This gives the illusion that queries are executed by the coordinator only, not by the peers storing the database partitions.

Data Consistency [81, 82]: The system deals with a number of replicas, each of which is accessed by read/write operations. Thus, the data must be identical in all replicas after the


execution of any operation in the system. An efficient update mechanism is required

to update each replica in the system.

One-Copy-Serializability: A P2P system uses distributed execution for a query, where the query is distributed to a number of participating peers and the system compiles the partial results from the distributed execution processes. The final result of the global query must be the same as if it were executed on a single machine. This is referred to as one-copy-serializability of the query.

Locality Awareness: Locality awareness can significantly improve high level routing and information exchange in the application layer. It also helps to check whether a peer is still in the network or has quit the network.

Adaptation: The overlay network should be endowed with good adaptation, since nodes can join and leave at any time. Thus, it has to be able to adapt to changes rapidly.

Fault Tolerance & Robustness: If one or more nodes of the overlay network fail, the overlay still has to function correctly. Failures have to be rapidly recognized and corrected. If nodes fail, the logical connections incident to those nodes also fail, so the overlay has to seek alternative connections.

State and Data Management: P2P systems are characterized by the fact that a single

failing peer must not bring down the system as a whole. Of course, specific services

(those that had lived on the dying peer) might not be available anymore, but the

system still fulfills a useful purpose. In many systems this requires facilities for some

kind of distributed data management [10]. As a consequence, we have to look at the

following challenges: replication, caching [63], consistency and synchronization, and

finding the nearest copy.

2.6 Parallelism in Databases [83, 84]

Parallel execution is referred to as parallelism. It may be employed in certain types of Online Transaction Processing (OLTP) and hybrid systems. The idea is to break down a task so that, instead of one process doing all of the work in a query,


many processes do part of the work at the same time. Parallelism is effective in the

systems having all of the following characteristics: (a) Symmetric Multi Processors

(SMP), clusters, or massively parallel systems, (b) Sufficient I/O bandwidth, (c)

Underutilized or intermittently used CPUs (for example, systems where CPU usage is

typically less than 30%) and (d) Sufficient memory to support additional memory

intensive processes such as sorts, hashing, and I/O buffers. An example of this is

when four processes handle four different tasks at a work place instead of one process

handling all four tasks by itself. The improvement in performance can be quite high.

In this case, each task will be a partition, a smaller and more manageable unit of an

index or table. The most common use of parallelism is in Decision Support Systems (DSS) and data warehousing environments, where parallel execution significantly reduces response time for data intensive operations on large databases. Complex queries, such as those involving joins of several tables or searches of very large tables, are often best executed in parallel. If a system lacks any of the characteristics listed above, parallelism might not significantly improve the performance.
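The idea of splitting one query's work across several workers can be sketched as follows. This is a minimal illustration, not a DBMS implementation: threads stand in for the parallel server processes, and the function names are ours.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(partition):
    # Each worker handles one partition of the data.
    return sum(partition)

def parallel_total(rows, workers=4):
    # Split the rows into one partition per worker, aggregate each
    # partition in parallel, then combine the partial results --
    # producing the same answer a single serial scan would.
    chunk = max(1, len(rows) // workers)
    partitions = [rows[i:i + chunk] for i in range(0, len(rows), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(partial_sum, partitions)
    return sum(partials)
```

The final combination step mirrors how a coordinator compiles partial results from the participating processes.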

2.6.1 Partitioning Methods

Database partitioning is the process of dividing a database into a number of parts depending upon some criterion. This criterion should be such that the data items from the various database partitions may be combined to generate the same results as the non-partitioned database. There are four partitioning methods:

• Range Partitioning

• Hash Partitioning

• List Partitioning

• Composite Partitioning

Each partitioning method has different advantages and design considerations. Thus,

each method is more appropriate for a particular situation.

Range Partitioning: Range partitioning maps data to partitions based on ranges of partition key values that are established for each partition. It is the most common type of partitioning and is often used with dates, e.g., to partition sales data into monthly partitions. Range partitioning maps rows to partitions based on ranges of column values. It is defined by the partitioning specification for a table or index, PARTITION BY RANGE (column_list), and by the partitioning specification for each individual partition, VALUES LESS THAN (value_list), where column_list is an ordered list of columns that determines the partition to which a row or an index entry belongs. These columns are called the partitioning columns. The values in the partitioning columns of a particular row constitute that row's partitioning key.

Hash Partitioning: Hash partitioning maps data to partitions based on a hashing algorithm applied to a partitioning key. The hashing algorithm evenly distributes rows among partitions, giving partitions approximately the same size. Hash partitioning is the ideal method for distributing data evenly across devices. It is a good and easy-to-use alternative to range partitioning when the data is not historical and there is no obvious column or column list where logical range partition pruning can be advantageous. Oracle Database uses a linear hashing algorithm to prevent data from clustering within specific partitions.

List Partitioning: List partitioning enables explicit control over how rows map to partitions, by specifying a list of discrete values for the partitioning column in the description of each partition. This is different from range partitioning, where a range of values is associated with a partition, and from hash partitioning, where the user has no control over the row-to-partition mapping. The advantage of list partitioning is that one can group and organize unordered and unrelated sets of data in a natural way.

Composite Partitioning: Composite partitioning combines range partitioning with hash or list partitioning. In Oracle Database, the data is first distributed into partitions by range partitioning; a hashing algorithm then further divides the data into subpartitions within each range partition. For range-list partitioning, Oracle divides the data into subpartitions within each range partition based on the explicit list.
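The three basic mappings above can be sketched in Python. The month boundaries, partition counts, and region lists below are illustrative assumptions, not taken from any particular DBMS:

```python
import bisect

def range_partition(month, boundaries=(4, 7, 10)):
    # VALUES LESS THAN style ranges on a hypothetical sales table:
    # months 1-3 -> partition 0, 4-6 -> 1, 7-9 -> 2, 10-12 -> 3.
    return bisect.bisect_right(boundaries, month)

def hash_partition(key, n_partitions=4):
    # A hash of the partitioning key spreads rows roughly evenly.
    return hash(key) % n_partitions

def list_partition(region, mapping=None):
    # Discrete values are explicitly assigned to partitions.
    mapping = mapping or {"north": 0, "south": 0, "east": 1, "west": 1}
    return mapping[region]
```

Composite partitioning would simply apply `range_partition` first and then `hash_partition` (or `list_partition`) within the chosen range partition.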

In combination with parallelism, partitioning can improve performance in data warehouses and other systems. Partitioning significantly enhances data access and improves overall application performance. Partitioned tables and indexes facilitate administrative operations by enabling these operations to work on subsets of data, e.g., creating a new partition, reorganizing an existing partition, or dropping a partition, while causing less than a second of interruption to a read-only application. Partitioned data greatly improves the manageability of very large databases and dramatically reduces the time required for administrative tasks such as backup and restore. Granularity can easily be added to or removed from the partitioning scheme by splitting partitions, and partitioning also allows one to swap a partition with a table. To improve the performance of databases over dynamic P2P networks, parallelism and database partitioning may be useful.

2.7 Concurrency Control

To distinguish the execution order of transactions, the system assigns each transaction, when it begins to execute, a unique integer that increases with time. We call this integer a timestamp. Timestamp-based concurrency control resolves conflicts according to the order of the timestamps, making a group of interleaved transaction executions equivalent to a serial sequence ordered by timestamp. The aim of timestamps is to ensure that conflicting read and write operations are executed in timestamp order. In the Timestamp Method, the system gives a timestamp TS(Ti) to each transaction Ti. For any data item R, the timestamps of the last read operation and write operation are RTS(R) and WTS(R), respectively. When Ti requests to read R, the timestamp of the read operation is TSR(Ti); and when Ti requests to write R, the timestamp is TSW(Ti).
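The read/write rules of basic timestamp ordering can be sketched as follows. This is a simplified single-copy sketch using the RTS/WTS notation above; transaction restart handling is omitted, and the class and function names are ours:

```python
class DataItem:
    def __init__(self):
        self.rts = 0  # RTS(R): timestamp of the last read
        self.wts = 0  # WTS(R): timestamp of the last write

def try_read(item, ts):
    # A read must not follow a younger write: if a transaction with
    # a later timestamp already wrote the item, reject the read
    # (the reading transaction would have to restart).
    if ts < item.wts:
        return False
    item.rts = max(item.rts, ts)
    return True

def try_write(item, ts):
    # A write is rejected if a younger transaction has already read
    # or written the item; otherwise it installs a new WTS(R).
    if ts < item.rts or ts < item.wts:
        return False
    item.wts = ts
    return True
```

Executing all accepted operations in this way yields a schedule equivalent to the serial order of the timestamps.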

The Concurrency Control (CC) mechanisms used in an RTDDBS have a significant impact on timeliness, and a large amount of work has been performed on the design of CC mechanisms for RTDBSs in past decades [46, 85, 86, 87, 88]. Locking-based protocols usually combine two phase locking (2PL) with a priority scheme to detect and resolve conflicts between transactions. However, some inherent problems of 2PL, such as the possibility of deadlocks and long blocking times, make it difficult for transactions to meet their deadlines. On the other hand, Optimistic Concurrency Control (OCC) protocols are non-blocking and deadlock-free, which makes them attractive for RTDDBSs. Conflict resolution between transactions is delayed until a transaction nears completion, so more information is available for making the choice in resolving conflicts. However, the late conflict detection makes the restart overhead heavy. Some concurrency control protocols based on

dynamic adjustment of serialization order have been developed to avoid unnecessary


restarts [112, 113, 114, 116]. Among these protocols, OCC-TI [112] and OCC-DATI [113], based on time intervals, are better than OCC-DA [114], which is based on a single timestamp, since a time interval can capture the partial ordering among transactions more flexibly. OCC-DATI is better than OCC-TI since it avoids some unnecessary restarts of the latter, but there are still some unnecessary restarts with these protocols. A newer version of OCC-DATI is the Timestamp Vector based Optimistic Protocol (OCC-TSV); with this protocol, more unnecessary restarts of transactions can be avoided. A feedback-based secure concurrency control for MLS distributed databases, which secures multi level databases, is presented in [89].

All these conventional protocols are defined for static networks; for dynamic networks like P2P, modifications have to be made. The protocols/algorithms may need to include the constraints of P2P environments. The concurrent processes are distributed over unreliable peers, which are prone to leave the network. In such an environment, one-copy-serializability at the global and local levels of transaction execution is hard to achieve. To achieve secure transaction execution over secure data items, with one-copy-serializability of transactions at the global and local levels, a secure protocol needs to be identified for the dynamic environment of P2P.

2.8 Topology Mismatch Problem

There are several traditional topology optimization approaches. In [90] the authors describe an approach called End System Multicast. Here the authors first construct a richly connected graph on which shortest path spanning trees are constructed. Each tree, rooted at the corresponding source, then uses a routing algorithm for message forwarding. This approach introduces large overhead for constructing the graph and spanning trees and does not consider the dynamic joining and leaving of peers. The overhead of End System Multicast is proportional to the multicast group size, so this approach is not feasible for large scale P2P systems.

Researchers have also considered clustering peers that are close to each other based on their IP addresses [91, 92]. This approach has two limitations: first, the mapping accuracy is not guaranteed, and second, it affects the searching scope, increasing the volume of network traffic.

In [93], the authors measure the latency between each peer and multiple stable Internet servers called "landmarks". The measured latency is used to determine the


distance between peers. This measurement is conducted in a global P2P domain and

needs the support of additional landmarks. Similarly, this approach also affects the

search scope in P2P systems.

GIA [94] introduces a topology adaptation algorithm to ensure that high capacity

peers are the ones with high degree and low capacity peers are within short reach of

high capacity peers. It addresses a different matching problem in overlay networks.

To address topology mismatch, Minimum Spanning Tree (MST) based approaches are used in [95, 96]. In these, peers build an overlay MST among the source peer and its neighbors within a certain hop count, and then optimize connections that are not on the tree. An early attempt at alleviating topology mismatch is Location-aware Topology Matching (LTM) [95], in which each peer issues a detector message in a small region so that the peers receiving the detector can record relative delay information. Based on the delay information, a receiver can detect and cut most of the inefficient and redundant logical links, as well as add closer peers as direct neighbors. The major drawback of LTM is that it needs to synchronize all peering peers and thus requires the support of NTP [97], which is a critical dependency.

In [92] the authors discuss the relationship between message duplication in overlay connections and the number of overlay links. The authors proposed Two Hop Neighbor Comparison and Selection (THANCS) to optimize the overlay network [98], which may change the overlay topology. This change in overlay topology may not be acceptable in many cases.

In the above approaches only the overlay network is considered for optimization; the problems of the underlay network are not considered. The network search space should not be altered, as that changes the overlay structure of the network. Thus, a methodology is required that reduces network traffic at the underlay level without affecting the overlay topology of the network, making the system fast and scalable.

2.9 Replication for Availability [99, 100, 101, 102]

Replication is one of the most important resiliency strategies and has been used to

increase the reliability of services and the availability of data in distributed systems

[103, 104]. By providing multiple identical instances of the same data at different

locations, the data can still be available when part of the system fails or goes offline.


Replication for availability is an especially important principle in end system based P2P networks, where failures and loss of access happen frequently to the peers. The availability of shared data in a P2P system can be improved from 24% to 99.8% at 6 times excess storage for replication [104]. With the help of erasure coding, data can be highly available even when only a small subset of peers is online.

Similarly, [105, 106, 107] implement a scalable, distributed file system that logically functions as a centralized file server but is physically built across a set of client desktop computers. The system monitors machine availability and places replicas of files on multiple client desktop machines to maximize effective system availability using different replication algorithms.
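Under the simplifying assumption that peers fail independently, the effect of adding replicas on availability can be estimated with a one-line model. This is a sketch of the general principle, not the measurement methodology of [104]:

```python
def replica_availability(p_online, n_replicas):
    # Probability that at least one of n independent replicas is
    # reachable, given each hosting peer is online with
    # probability p_online: 1 - P(all replicas offline).
    return 1 - (1 - p_online) ** n_replicas
```

The model shows why availability climbs steeply with replication: each added replica multiplies the failure probability by (1 - p_online).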

One of the most important problems in replication systems is replica placement. Choosing the right replica placement approach is a non-trivial and non-intuitive exercise. Replica placement techniques include both passive caching [108, 109] and proactive replication [110], and both centralized mechanisms [111] and distributed methods [110].

All the above methods of replication are interested only in replicating the data items, but in P2P networks data replication alone is not sufficient. What is required is a high probability of accessing updated data from the replicas, with data consistency maintained among all copies of the data items. Some protocols use a large number of replicas in the system, which causes network choking due to heavy network traffic in the P2P network. Thus, an efficient logical structure has to be established that is capable of placing a limited number of replicas (as per the requirements of the system) at appropriate places and provides a high probability of accessing updated data items in the P2P environment.

2.10 Quorum Consensus

Data replication is a technique to improve the performance of Distributed Database

Systems (DDBS) [11, 112, 113, 114, 115] and make the system fault tolerant [41,

116, 117, 118, 119, 120]. Replication improves system performance by reducing latency, increasing throughput and increasing availability. Moreover, data replication is a basic requirement for a DDBS [178] deployed on networks that are dynamic in nature, for example P2P systems [121]. The churn rate of peers is observed to be high in P2P networks [122, 123, 124, 125]. For such a highly dynamic


environment, the probability to access stale data from the replicas is higher as

compared with the static environment where peers do not leave the system. Several

protocols have been developed to solve the problem of accessing updated data items

from replicas in dynamic environments. The examples include single lock, distributed

lock, primary copy, majority protocol [126], biased protocol, and quorum consensus

protocol [128, 129, 130, 131]. These protocols are used to keep data consistent and to

access updated data items [124] by using multiple replicas maintained in the

distributed system.

A group of replicas is accessed to get updated data items from the replicas. This group is generally known as a "Quorum" [122] and, depending upon the operation, a quorum is said to be a "Read Quorum" or a "Write Quorum". To obtain the updated data item, read-write quorums and two consecutive write-write quorums must intersect. The intersection is the set of replicas which are common to the read-write and two consecutive write-write quorums. This ensures that the read quorum always gets updated data from the system, and this updated data can be propagated to all other replicas. The degree of intersection of two quorums makes the system resilient to the churn rate of the peers.
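The intersection conditions above can be stated as a small check (a sketch; the parameter names are ours):

```python
def valid_quorums(n_replicas, read_quorum, write_quorum):
    # Read-write intersection requires r + w > n, and intersection
    # of two consecutive write quorums requires 2w > n. Together
    # these guarantee that every read quorum contains at least one
    # replica holding the latest committed write.
    return (read_quorum + write_quorum > n_replicas
            and 2 * write_quorum > n_replicas)
```

For example, with n = 5 replicas, r = 3 and w = 3 is a valid choice, while r = 2 and w = 3 is not, because a read quorum could then miss the last write entirely.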

In the literature, many replication protocols have been suggested, e.g., for replica management in a Binary Balanced Tree [63, 132]. The simplest replication protocol is Read One Write All (ROWA) [133]. This protocol is suitable for static networks having fixed and dedicated servers for replication. It has the minimum read cost among these protocols and is highly fault tolerant, but it has the maximum communication cost for write operations, and this cost increases with the number of replicas. In a dynamic system, updating all replicas creates the problem of unlimited wait. A variation of this technique is known as Read One Write All Available (ROWAA). This scheme requires a write operation to update only the replicas that are currently available, which improves data availability in dynamic environments [134]. A Read-Few, Write-Many approach is presented in [136].

The Dynamic Voting protocol [135] and the Majority Consensus protocol [126] perform better than ROWAA in dynamic environments. In both protocols, replicas are accessed in groups. These protocols have good read and write availability but have the disadvantage of high read cost, with long search times to find the replicas, as the replicas are stored randomly in the network.


Rather than storing replicas randomly, logical structures [137, 138, 139, 140] have been proposed for storing replicas over the dynamic network. These protocols reduce the search time needed to form a quorum from the replicas and reduce communication cost. The Multi Level Voting protocol, Adaptive Voting [141], Weighted Voting, the Grid protocol [142] and the Tree Quorum protocol [143] are such replication protocols, each with a different operational process. The Multi Level Voting protocol is based on the concepts of the Hierarchical Quorum Consensus (HQC) strategy. HQC [132, 144, 145, 146] is a generalization of the Majority Scheme. In its tree structure, the replicas are located only at the leaves, whereas the non-leaf peers of the tree are regarded as "logical replicas", which in a way summarize the state of their descendants. The advantage of the tree structure is that it reduces the search time to find replicas compared to a random structure. HQC+ [147] is also a generalization of other protocols and uses a grid logical structure to form quorums. A tree structure also reduces the message transfer needed to find replicas; hence, it reduces the network traffic

generated in the system. A disadvantage of the Tree Quorum protocol is that the number of replicas grows rapidly as the tree level grows. In the case of the Adaptive Voting and Weighted Voting protocols, the formed quorum must satisfy two conditions: (a) the write and read quorums are always made up of more than half of the replicas, and (b) the write and read quorums must intersect with each other. The disadvantage of these protocols is that the size of the quorums grows with the number of replicas; hence, the network overhead in the system automatically increases.

Bandwidth Hierarchy Replication (BHR) is proposed in [148]. BHR reduces data access time by avoiding network congestion in a data grid network. In [149] the authors propose a BHR algorithm using a three level hierarchical structure; the proposal addresses both scheduling and replication problems. Two replication algorithms, Simple Bottom Up (SBU) and Aggregate Bottom Up (ABU), for multi tier data grids are proposed in [150]. These algorithms minimize data access time and network load. In these algorithms, replicas of the data are created and spread from the root center to regional centers, or even to national centers; these strategies are applicable only to multi tier grids. The strategy proposed in [151] creates replicas automatically in a generic decentralized P2P network; the goal of the proposed model is to maintain replica availability with some probabilistic guarantee. Various replication strategies are discussed in [152]; all these replication strategies were tested on a hierarchical Grid Architecture. A different cost model was proposed in [149] to


decide the dynamic replication. This model evaluates the data access gains by creating

a replica and the costs of creation and maintenance for the replica. Probabilistic

Quorum Systems are presented in [153].

There are several challenges in updating and accessing replicated data items over a dynamic network like a P2P network. Data consistency [134], the degree of intersection between two consecutive quorums, the search time to find a replica and fault tolerance are some of the identified problems. There is a need for new proposals for the dynamic environment of P2P systems that facilitate low search time, low network traffic, fast recovery from faults, a high degree of quorum intersection and access to updated data.

2.11 Databases

Database Systems are designed to manage large bodies of information. The

management of data involves both the definition of structures for the storage of

information and the provision of mechanisms for the manipulation of information.

Thus, a database is a collection of objects which satisfy a set of integrity constraints [82, 154].

Centralized Database Systems: These run on a single computer system and do not interact with other computer systems. Such systems range from single user database systems running on personal computers to high performance database systems running on mainframes [154].

Distributed Database Systems [155]: A distributed database system consists of a collection of sites, connected together via some kind of communications network, in which each site is a database system site in its own right, but the sites have agreed to work together so that a user at any site can access data anywhere in the network exactly as if the data were all stored at the user's own site. The distributed database system can thus be regarded as a kind of partnership among the individual local DBMSs at the individual local sites. A new software component at each site, logically an extension of the local DBMS, provides the necessary partnership functions, and it is the combination of this new component together with the existing DBMS that constitutes what is usually called the distributed database management system.


Real Time Database (RTDB) System [156, 157]: An RTDB system can be viewed as an amalgamation of a conventional Database Management System (DBMS) and a real time system. Like a DBMS, it has to process transactions and guarantee basic correctness criteria. Furthermore, it has to operate in real time, satisfying timing constraints imposed on transaction commitments and on the temporal validity of data [158].

When the programs used by users to interact with the database are executed, partially ordered sets of read and write operations are generated. Such a set of operations is called a transaction. A transaction is an atomic unit of work, which is either completed in its entirety or not at all. The transaction terminates by executing either a commit or an abort operation. A commit operation implies that the transaction was successful and hence all its updates should be incorporated into the database in a permanent fashion. An abort operation indicates that the transaction has failed and hence requires the database management system to cancel or abolish all its effects on the database system. In short, a transaction is an "all or nothing" unit of execution. A transaction that updates the objects of the database must preserve the integrity constraints of the database [159, 160, 161, 162]. For example, in a bank, an integrity constraint can be imposed that an account cannot have a negative balance. The transfer of money from one account to another, reservation of train tickets, filing of tax returns, entering marks on a student's grade sheet, etc. are all examples of transactions.
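The all-or-nothing behaviour and the bank-account constraint can be illustrated with a toy transfer transaction (a sketch; the account structure, exception name, and snapshot technique are illustrative, not a real DBMS mechanism):

```python
class AbortTransaction(Exception):
    """Raised when a transaction cannot commit."""

def transfer(accounts, src, dst, amount):
    # All-or-nothing execution: apply the updates to a private
    # copy first, and install them only if the integrity
    # constraint (no negative balance) still holds.
    snapshot = dict(accounts)
    snapshot[src] -= amount
    snapshot[dst] += amount
    if snapshot[src] < 0:
        raise AbortTransaction("insufficient funds")  # abort: no effects
    accounts.update(snapshot)                         # commit: all effects
```

An aborted transfer leaves the accounts exactly as they were, which is precisely the atomicity property described above.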

Many distributed real time database applications store their data distributed across various sites connected via a communication network. A single transaction may need to process various data within a specified period of time. The difficulty is that the data may be dispersed at various sites and therefore the transaction has to execute at various sites in a timely fashion. In such a distributed

environment, the problem is that the transaction at some sites could decide to commit

while at some sites it could decide to abort resulting in a violation of transaction

atomicity. To address and overcome this problem, distributed database systems use a

transaction commit protocol. A commit protocol ensures the uniform commitment of

the distributed transaction, that is, it ensures that all the participating sites agree on the

final outcome (commit or abort) of the transaction. Most importantly, this guarantee is

valid even in the presence of site or network failures.

Over the last two decades, database researchers have proposed a variety of

distributed transaction commit protocols. To achieve their functionality, these commit


protocols typically require exchange of multiple messages, in multiple phases,

between the participating sites (where the distributed transaction executes). In

addition, several log records are generated, some of which have to be "forced", that is,

flushed to disk immediately in a synchronous manner. Due to these costs, the commit

processing can result in a significant increase in transaction execution times, making

the choice of commit protocol an important design decision for distributed database

systems [163]. The commit protocols used in conventional database systems cannot be directly used in real time database systems, since conventional transaction commit protocols do not take into consideration the real time nature of the transactions; commit protocols therefore need some modifications to cater to the specific requirements of real time transactions.

2.11.1 Real Time Applications Framework

The real time applications can be classified into the following three categories based

on how the application is impacted by the violation of the task completion deadline

[164].

Hard Deadline Real Time Applications: In these applications, the consequences of

missing the deadline of even a single task could be catastrophic. Life critical

applications such as flight control systems or missile guidance systems belong to this

category. Database systems for efficiently supporting hard deadline real time

applications, where all transaction deadlines have to be met, appear infeasible due to

the large variance between the average case and the worst case execution times of a

typical database transaction. The large variance is due to transactions interacting with

the operating system, the I/O subsystem, and with each other in unpredictable ways.

Guaranteeing completion of all transactions within their deadlines under such

circumstances requires an enormous excess of resource capacity to account for the

worst possible combination of concurrently executing transactions.

Soft Deadline Real Time Applications: In these applications the tasks are associated with deadlines, but even if a task fails to complete within its deadline, it is allowed to execute to completion. Generally, in these systems, a "value function" assigns a value to each task. This value remains constant up to the deadline but starts decreasing after the deadline. The questions to be addressed in these applications include how to identify the proper value function, which may actually be application dependent.
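A value function of the shape described above might, for instance, look like the following sketch; the full value and the linear decay rate are invented parameters, since the proper function is application dependent:

```python
def task_value(completion_time, deadline, full_value=100.0, decay=10.0):
    # Soft deadline semantics: full value up to the deadline, then
    # the value decays linearly with tardiness (never below zero).
    if completion_time <= deadline:
        return full_value
    return max(0.0, full_value - decay * (completion_time - deadline))
```

A firm deadline application corresponds to the degenerate case where the value drops immediately to zero at the deadline.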

Firm Deadline Real Time Applications: These applications are different from the

soft deadline applications in the sense that tasks that miss the deadline are

considered worthless (and may even be harmful if executed to completion) and are

thrown out of the system immediately. The emphasis, thus, is on the number of tasks

that complete within their deadlines. Our interest in the RTDB systems is on the

applications in the firm deadline real time domain [164]. We believe that

understanding firm deadline RTDB systems will provide necessary insight into the

RTDB technology, which is necessary for addressing the more complex framework of

soft deadline applications. Therefore, we have carried out our work from the

perspective of a “Firm Deadline Real Time Database System"[163, 154, 158].

2.12 Some Middlewares

The P2P architecture is a way to structure a distributed application so that it consists of many identical software modules, each running on a different computer. These software modules communicate with each other to complete the processing required by the distributed application. One could view the P2P architecture as placing both a server module and a client module on each computer. Thus, each computer can access services from the software modules on other computers, as well as provide services to them. Each computer needs to know the network addresses of the other computers running the distributed application, or at least of the subset of computers with which it may need to communicate. Furthermore, propagating changes to the different software modules on all the different computers is much harder than in a client-server system. However, the combined processing power of several large computers can easily surpass the processing power available from even the best single computer, and the P2P architecture can thus result in much more scalable applications.

Napster [12, 19, 20]: Napster is a simply structured, centralized system. We present it here as the simplest model (which was very successful socially) against which to contrast the other systems. It uses a centralized server to create its own flat namespace of host


addresses. At startup, each client contacts the central server and reports a list of the files it maintains. When the server receives a query from a user, it searches for matches in its index, returning a list of users that hold the matching file. The user then directly connects to the peer that holds the requested file and downloads it, as shown in Figure 2.7. Using a centralized server raises problems, including the fact that it is a single point of failure. Napster does not replicate data. It uses "keepalives" to make sure that its directories are current.

Maintaining a unified view is computationally expensive in Napster, and it does not provide scalability. The focus of Napster on music sharing, in which users must be active in order to participate, has made it exceedingly popular. Napster does not use resource sharing, but it does use distributed file management. Regarding routing, it is simply a centralized directory system using Napster servers. The main advantage of Napster and similar systems is that they are simple and locate files quickly and efficiently. The main disadvantage is that such centralized systems are vulnerable to malicious attack and technical failure. Furthermore, these systems are inherently not largely scalable, as there are bound to be limitations to the size of the server database and its capacity to respond to queries. The system is not reliable, as it is prone to single point of failure and is easily attacked by denial of service (DoS). Napster provides communication level fault tolerance, as any packet dropped due to congestion can be retransmitted. Napster provides communication level security, but does not support system level or application level security. Performance of Napster is good under light load, but it falls sharply when the server is overloaded. The response time increases when the number of peers and requests exceeds the capability of the server.
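A minimal sketch of this centralized lookup, with invented peer and file names, is given below; it also makes the single point of failure obvious, since every lookup goes through the one index:

```python
# Sketch of Napster-style centralized indexing (names are illustrative):
# peers register their file lists with one server; queries return peer lists,
# and the actual download then happens directly between peers.

class CentralServer:
    def __init__(self):
        self.index = {}            # filename -> set of peer ids

    def register(self, peer_id, files):
        """A peer reports the list of files it maintains at startup."""
        for f in files:
            self.index.setdefault(f, set()).add(peer_id)

    def query(self, filename):
        # Single point of failure: if this server is down, no lookups work.
        return sorted(self.index.get(filename, set()))

server = CentralServer()
server.register("peer-A", ["song.mp3", "doc.txt"])
server.register("peer-B", ["song.mp3"])
```

A query such as `server.query("song.mp3")` returns both registered holders, after which a peer downloads directly from one of them.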


Figure 2.7 The Architecture of Napster


Gnutella [18, 5]: The Gnutella network originated as a project at Nullsoft, a subsidiary of America Online. Gnutella is one of the earliest P2P file sharing systems that are completely decentralized. The general architecture of Gnutella is given in Figure 2.8. Like most P2P systems, Gnutella builds, at the application level, a virtual overlay network with its own routing mechanisms. In Gnutella, each peer is identified by its IP address and connected to some other peers. All communication is done over the TCP/IP protocol. To join the network, a new peer needs to know the IP address of one peer that is already in the system. It first broadcasts a "join" message via that peer to the whole system. Each of these peers then responds to indicate its IP address, how many files it is sharing, and how much space those files take up. So, on connecting, the new peer immediately knows how much is available on the network to search through. Gnutella uses the file name as the key. In order to search for a file in unstructured systems, random searches are the only option, since the peers have no way of guessing where the file may lie. Each peer handles the search query in its own way. To save bandwidth, a peer does not have to respond to a query if it has no matching items. The peer also has the option of returning only a limited result set. After the client peer receives responses from other peers, it uses HTTP to download the files it wants.

Gnutella is completely decentralized, but the peers are organized loosely, so the costs for peer joining and searching are O(N), which means that Gnutella cannot grow to a very large scale. It is more reliable than Napster, as there is no single point of failure; objects are replicated proportionally to the square root of their query rate. Node failure can be detected by neighbors, and multiple paths exist to connect to a peer. Gnutella provides similar functions as Napster does. It does not provide resource sharing; it uses distributed file management. Gnutella provides fault tolerance at the system level, as a process can be recovered due to multiple-point execution. Data replication is also provided by this system. It also provides fault tolerance at the communication level: due to the use of IP addresses, dropped packets may be recovered by retransmission. Channel level tolerance, however, is not supported. Gnutella does not support security at any level (system, communication or application level). Threats include flooding, malicious content, virus spreading, attacks on queries, etc. The scalability is a little better than that of Napster, but Gnutella cannot grow beyond a limit, as the performance of the system drops sharply as the traffic on the network grows. Further, the response time is greater in Gnutella.
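The search just described can be sketched as a breadth-first flood with a time-to-live (TTL); the topology, peer names and file name below are invented for illustration:

```python
# Sketch of Gnutella-style flooding search (illustrative topology).
# A query is forwarded to all neighbours up to `ttl` hops away; peers
# that hold the file answer. Traffic grows with the number of peers
# reached, which is why flooding limits Gnutella's scalability.

def flood_query(topology, holders, start, filename, ttl):
    """Return the set of peers that answered the query within `ttl` hops."""
    visited, frontier, hits = {start}, [start], set()
    for _ in range(ttl):
        nxt = []
        for peer in frontier:
            for nb in topology.get(peer, []):
                if nb not in visited:
                    visited.add(nb)
                    nxt.append(nb)
                    if filename in holders.get(nb, ()):
                        hits.add(nb)
        frontier = nxt
    return hits

topology = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
holders = {"D": {"file.mp3"}, "E": {"file.mp3"}}
```

Raising the TTL reaches more holders but floods proportionally more of the network, which mirrors the O(N) search cost noted above.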



Figure 2.8 The Architecture of Gnutella

Freenet [79]: Freenet is a purely decentralized unstructured system, operating as a self organizing P2P network (see Figure 2.9). It essentially pools unused disk space to create a collaborative virtual file system providing both security and publisher anonymity. Freenet provides a file storage service, rather than the file sharing service provided by Gnutella. In Freenet, files are pushed to other peers for storage, replication and persistence. Freenet peers maintain their own local data store, which they make available to the network for reading and writing, as well as a dynamic routing table containing the addresses of other peers and the keys they are thought to hold.

Files in Freenet are identified by binary keys. There are three types of keys: keyword signed keys, signed subspace keys and content hash keys. To search for a file, the user sends a request message specifying the key and a timeout (hops-to-live) value. Joining the Freenet network is simply a matter of discovering the address of one or more existing peers, and then starting to send messages.

In order to insert a new file into the network, the user must first calculate a binary file key for it, and then send an insert message to its own peer specifying the proposed key and a hops-to-live value. When a peer receives the insert message, it first checks to see if the key is already taken. If the key is found to be taken, the peer returns the pre-existing file as if a request were made for it. If the key is not found, the peer looks up the nearest key in its routing table, and forwards the insert message to the corresponding peer. If the hops-to-live limit is reached without any key collision, an all clear result is propagated back to the original inserter, informing it that the insert was successful. In the basic model, the request for a key is passed along from peer to peer through a chain of requests in which each peer makes a local decision about


where to send the request next. Since there is no direct connection between the requester and the actual data source, anonymity is maintained, and the owners of file caches cannot be held responsible for the content of their caches (file encryption with original text names as keys is a further measure that is taken). Figure 2.9 shows the discovery mechanism in Freenet. Freenet supports multi-path searching, and a faulty peer can be detected by its neighbor peers, so Freenet is reliable in nature. Freenet uses file storing rather than file sharing. Load balancing and resource sharing are not supported by Freenet; it also does not support fault tolerance or security at any level. The performance and scalability of Freenet are not good.
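The chain-mode request can be sketched as greedy key-closeness routing with a hops-to-live counter; the integer keys, peer names and routing tables below are invented for illustration, and replies retracing the chain are left implicit:

```python
# Sketch of Freenet chain-mode routing (illustrative data). Each peer
# forwards a request to the neighbour advertising the key nearest the
# requested one, decrementing hops-to-live (htl) at every hop.

def route_request(routing_tables, stores, start, key, htl):
    """Follow the chain; return (path taken, data found or None)."""
    path, peer = [start], start
    while htl > 0:
        if key in stores.get(peer, {}):
            return path, stores[peer][key]          # found: reply retraces path
        table = routing_tables.get(peer, {})
        candidates = {k: p for k, p in table.items() if p not in path}
        if not candidates:
            return path, None                       # dead end
        nearest = min(candidates, key=lambda k: abs(k - key))
        peer = candidates[nearest]
        path.append(peer)
        htl -= 1
    return path, None                               # hops-to-live exhausted

tables = {"A": {40: "B", 90: "C"}, "B": {55: "D"}, "D": {}}
stores = {"D": {50: "data-bytes"}}
```

Because each hop is a purely local decision, the requester never learns which peer actually held the data, which is the anonymity property described above.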


Figure 2.9 The Freenet chain mode file discovery mechanism. The query is forwarded from peer to peer using the routing table, until it reaches the peer which has the requested data. The reply is passed back to the original peer following the reverse path.

TRIAD [165]: TRIAD is not a comprehensive P2P system, but a solution to the problem of content based routing. Its goal is to reduce the time needed to access content. Although it is focused on the performance problem, it also represents improvements in other traits. The core idea in TRIAD is network integrated content routing. It is an intermediary system between a centralized model and a fully decentralized model because it relies upon replicated servers, so a client can go through any of a variety of servers to reach content, as long as each server hosts the content. The content routers are integrated into the system and act as both IP routers and name servers. The main idea is that the content routers hold "name to next hop" information so that all routing is done through adjacent servers and each step is on the path to the data, avoiding some of the back and forth calling of traditional DNS (Domain Name Server) lookups. TRIAD also explores piggybacking connection setup on the name lookup, so that immediately upon locating the data the connection is already established. Reliability is increased because the system topology is structured so that there are multiple paths to content. TRIAD increases performance by proposing its name based content routing as a topological enhancement. This reduces a lot of the


overhead of a DNS (Domain Name Server) based system. Its protocols make it easier to maintain the system by using routing aggregates instead of a large number of individual names. The core ideas in TRIAD relate to P2P because, in such a system, end users' machines can act as content routers or servers, or both. At a minimum, this system could replace the centralized servers of a Napster type system. TRIAD supports a distributed file management system, but it does not support resource sharing or load balancing. TRIAD does not support fault tolerance or security at any level. TRIAD has good scalability.

Pastry: Pastry [55] is a generic P2P content location and routing system based on a self organizing overlay network of peers connected via the Internet. It is completely decentralized, scalable, fault resilient, and reliably routes a message to the live peer with a peerId numerically closest to a key within that message; it automatically adapts to the arrival, departure and failure of peers.

Each peer in the Pastry P2P overlay network has a unique 128-bit peerId; this peerId is assigned randomly when a peer joins the system, by computing a cryptographic hash of the peer's public key or its IP address. With this naming mechanism, Pastry makes an important assumption that peerIds are generated such that the resulting set of peerIds is uniformly distributed in the peerId space. Each data item also has a 128-bit key. This key can be the original key, or generated by a hash function. The data is stored in the peer whose id is numerically closest to the key.

Each Pastry peer maintains a routing table, a neighborhood set and a leaf set. The neighborhood set contains the peerIds and IP addresses of the peers that are closest to the present peer. The leaf set contains the peerIds and IP addresses of the half of the peers with numerically closest larger peerIds, and the half with numerically closest smaller peerIds, relative to the present peer's peerId. Given a message, the peer first checks to see if the key falls within the range of peerIds covered by its leaf set. If so, the message is forwarded directly to the destination peer, namely the peer in the leaf set whose peerId is closest to the key. If the key is not covered by the leaf set, then the routing table is used and the message is forwarded to a peer that shares a common prefix with the key by at least one more digit. In certain cases, it is possible that the appropriate entry in the routing table is empty or the associated peer is not reachable, in which case the message is forwarded to a peer that shares a prefix with the key at

least as long as the present peer, and is numerically closer to the key than the present peer's peerId. Such a peer must be in the leaf set unless the message has already arrived at the peer with the numerically closest peerId.

Pastry supports dynamic data object insertion and deletion, but does not explicitly support mobile objects. Pastry is reliable due to multi-path search and replication of data objects. Pastry supports dynamic peer join and departure. Pastry supports distributed file management and load balancing. It also supports communication level fault tolerance by maintaining the routing tables and a neighborhood set. Security at the communication level is supported, as hash functions and cryptography are used in communication. Pastry has good performance due to its content location, and good scalability due to self organization.
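The prefix-based forwarding step can be sketched as follows. Five-digit ids stand in for 128-bit peerIds, `next_hop` is a hypothetical helper, and the leaf-set and neighborhood-set logic is deliberately omitted:

```python
# Sketch of Pastry-style prefix routing (short ids for readability).
# Each hop forwards to a known peer sharing at least one more leading
# digit with the key, so a lookup takes O(log N) hops.

def shared_prefix(a, b):
    """Length of the common leading-digit prefix of two ids."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(current, key, known_peers):
    """Pick a known peer with a strictly longer shared prefix, else None."""
    best, best_len = None, shared_prefix(current, key)
    for p in known_peers:
        l = shared_prefix(p, key)
        if l > best_len:
            best, best_len = p, l
    return best
```

For example, a peer `02311` routing toward key `32212` would forward to a known peer such as `32100` (two shared digits), which in turn forwards to one sharing three or more.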

Tapestry: Tapestry [55, 47] is an overlay infrastructure designed as a routing and location layer in OceanStore [32]. Tapestry's mechanisms are modeled after the Plaxton scheme. Tapestry provides adaptability, fault tolerance against multiple faults, and introspective optimizations. In Tapestry, each peer has a neighbor map, which is organized into routing levels; each level contains entries that point to a set of peers closest in network distance that match the suffix for that level. Each peer also maintains a back pointer list that points to the peers that refer to it as a neighbor. These are used in the peer integration algorithm to generate neighbor maps for a new peer and to integrate it into Tapestry. Tapestry uses a distributed algorithm, called surrogate routing, to incrementally compute a unique root peer for an object; moreover, each object gets multiple root peers through concatenating a small, globally constant sequence of salt values to each object ID, and then hashing the result to identify the appropriate roots. The appropriate root searching is shown in Figure 2.10.

When locating an object, Tapestry performs the hashing process with the target object ID, generating a set of roots to search. Tapestry stores the locations of all such replicas to increase semantic flexibility. There are only some small modifications in the routing mechanism for improving fault tolerance; e.g., in case bad links are encountered, routing can be continued by jumping to a random neighbor peer. Tapestry sends publish and delete messages to multiple roots, and provides explicit support for mobile objects. Node insertion is easily implemented through populating neighbor maps and neighbor notification; node deletion is less trivial. It is worth noticing that Tapestry provides two introspective mechanisms that allow Tapestry to adapt to environmental changes. First, in order to adapt to the changes of


network distance and connectivity, Tapestry peers tune their neighbor pointers by running a refresher thread which uses network pings to update the network latency to each neighbor. Second, Tapestry presents an algorithm that detects query hotspots and offers suggestions on locations where additional copies can significantly improve query response time. Tapestry is reliable in nature, as it supports multi-path searching, a failed-peer detection mechanism and data replication. It does not support resource sharing, but the databases are shared among the peers. It supports a distributed file management system and a load balancing mechanism. Tapestry does not support security at any level. Performance of Tapestry is good due to reduced searching time (additional copies at hot spots). Tapestry has good scalability due to its populating-neighbors and neighbor notification techniques.
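The suffix matching of the Plaxton scheme can be sketched directly; the peer ids below are those of Figure 2.10, while the greedy choice of the weakest qualifying peer at each step is an assumption of this sketch, not part of the protocol:

```python
# Sketch of Plaxton/Tapestry suffix routing: hop i reaches a peer whose id
# agrees with the destination in its last i digits (XXXX7, XXX67, ...).

def suffix_match_len(a, b):
    """Number of trailing digits on which ids a and b agree."""
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        n += 1
    return n

def route_by_suffix(source, dest, peers):
    """Greedily extend the matched suffix by at least one digit per hop."""
    path, current = [source], source
    while current != dest:
        need = suffix_match_len(current, dest) + 1
        step = [p for p in peers if suffix_match_len(p, dest) >= need]
        if not step:
            return path   # routing hole; real Tapestry falls back to surrogates
        current = min(step, key=lambda p: suffix_match_len(p, dest))
        path.append(current)
    return path
```

With the peers of Figure 2.10, routing from 67493 to 34567 visits 98747 (XXXX7), 64267 (XXX67), 45567 (XX567) and 64567 (X4567) before reaching the destination.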


Figure 2.10 The path taken by a message originating from peer 67493 destined for peer 34567 in a Plaxton mesh using decimal digits of length 5 in Tapestry.

Chord: Chord [166] is a distributed lookup protocol designed at MIT (see Figure 2.11). It supports fast data locating and peer joining/leaving. Each machine is assigned an m-bit peerID, obtained by hashing its IP address. Each data record (K, V) has a unique key K. In Chord, it is also assigned an m-bit ID by hashing the key, P = hash(K). This ID is used to indicate the location of the data.

All the possible N = 2^m peerIDs are ordered in a one dimensional circle, and the machines are mapped to this virtual circle according to their peerIDs. For each peerID, the first physical machine on its clockwise side is called its successor peer, or succ(peerID). Each data record (K, V) has an identifier P = hash(K), which indicates the virtual position in the circle. The data record (K, V) is stored in the first physical


machine clockwise from P, as shown in Figure 2.11. This machine is called the successor peer of P, or succ(P). To do routing efficiently, each machine contains part of the mapping information. In the view of each physical machine, the virtual circle is partitioned into 1 + log N segments: the machine itself, and log N segments with lengths 1, 2, 4, ..., N/2. The machine maintains a table with log N entries; each entry contains the information for one segment: its boundaries and the successor of its first virtual peerID. In this way, each machine only needs O(log N) memory to maintain the topology information. This information is sufficient for fast locating/routing.

On a query for a record with key K, the virtual position is first calculated: P = hash(K). The locating can start from any physical machine. Using the mapping table, the successor of the segment that contains P is selected to be the next router, until P lies between the start of the segment and the successor (this means the successor is also P's successor, i.e., the target). The distance between the target and the current machine decreases by at least half after each hop; thus the routing time is O(log N).

For high availability, the data can be replicated using multiple hash functions; we can also replicate the data at the r machines succeeding its data ID. Chord also supports a failure detection mechanism for peers; hence this system is reliable. The time taken by each operation is O(log N). In Chord, machines can join and leave at any time. For normal peer arrival and departure, the cost is O(log^2 N) with high probability, but in the worst case, the cost is O(N). Peer failures can also be detected and recovered from automatically if each peer maintains a "successor list" of its r nearest successors on the Chord ring. Chord is reliable, as it supports failure detection and data replication. It supports a distributed file management system, but does not support resource sharing. No security is provided at any level. Performance of Chord is good due to fast location of objects (a distributed hash table is used for the purpose) and replication of objects using multiple hash functions. Chord has good scalability due to its distributed lookup protocol, which supports peer joining/leaving.



Figure 2.11 Chord identifier circle consisting of the three peers 0, 1 and 3. In this figure, key 1 is located at peer 1, key 2 at peer 3 and key 6 at peer 0.

CAN: The Content Addressable Network (CAN) [56, 121] is a distributed hash based infrastructure that provides fast lookup functionality on Internet-like scales. In CAN, the machines are addressed by their IP addresses. Each data record has a unique key K. A hash function assigns a d-dimensional vector P = hash(K) to each key, which corresponds to a point in d-dimensional space. In CAN, this point indicates the virtual position of the data. CAN maintains a d-dimensional virtual space on a "d-torus". The virtual space is partitioned into many small d-dimensional zones. Each physical machine corresponds to one zone and stores the data that are mapped to this zone by the hash function. These zones are divided between a newly joined peer and the previous peer, as shown in Figures 2.12(a) and 2.12(b). In the d-dimensional space, two peers are neighbors if their coordinate spans overlap along d-1 dimensions and abut along one dimension. Each machine knows the zones and IP addresses of its neighbors.

For a given key, the virtual position is calculated; then, starting from any physical machine, the query message is passed through the neighbors until it finds the IP address of the target machine. In a d-dimensional space, each peer maintains 2d neighbors (at most 4d, in fact).

CAN supports data insertion and deletion in (d/4)(n^(1/d)) hops. In CAN, a machine can also copy its data to one or more of its neighbors. This is very useful for load balancing and fault tolerance. CAN can also detect and recover from peer failure automatically. CAN also supports replication, which is why CAN is reliable in nature. CAN supports distributed file management and load balancing, but it does not support resource sharing. It supports good fault tolerance at the system and communication levels, as it can copy its contents to one or more of its neighbors. No security is provided at any level. Performance of CAN is good due to its distributed hash based infrastructure that provides fast lookup of contents. Due to topology updating, CAN supports dynamic machine joining and leaving. The average cost for machine joining is (d/4)(n^(1/d)) hops; for machine leaving and failure recovery, it is constant time. It is scalable.
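The neighbor-based forwarding can be sketched as greedy routing on a 2-d unit torus. The zone centres, neighbour lists and target point below are invented (they do not reproduce Figure 2.12 exactly), and real CAN routes toward the zone containing the point rather than toward zone centres:

```python
# Sketch of CAN-style greedy routing on a 2-d torus (illustrative zones).
# Each hop moves to the neighbour whose zone centre is closest to the
# target point; routing stops at the peer whose zone holds the point.

def torus_dist(a, b):
    """Squared distance on the unit torus (wrap-around per coordinate)."""
    return sum(min(abs(x - y), 1.0 - abs(x - y)) ** 2 for x, y in zip(a, b))

def greedy_route(centres, neighbours, start, target):
    path, current = [start], start
    while True:
        best = min(neighbours[current],
                   key=lambda p: torus_dist(centres[p], target))
        if torus_dist(centres[best], target) < torus_dist(centres[current], target):
            path.append(best)
            current = best
        else:
            return path   # no neighbour is closer: current zone owns the point

centres = {"A": (0.25, 0.75), "B": (0.75, 0.75), "C": (0.25, 0.25),
           "D": (0.625, 0.25), "E": (0.875, 0.25)}
neighbours = {"A": ["B", "C"], "B": ["A", "D", "E"], "C": ["A", "D"],
              "D": ["B", "C", "E"], "E": ["B", "D"]}
```

Each hop only consults local neighbour state, which is why per-peer routing state stays at 2d entries while path length grows as n^(1/d).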


Figure 2.12 (a) Example 2-d [0,1]x[0,1] coordinate space partitioned between 5 CAN peers. (b) Example 2-d space after peer F joins.

JXTA [167]: The JXTA architecture is organized in three layers, as shown in Figure 2.13: the JXTA core, JXTA services and JXTA applications. The core layer provides minimal and essential primitives that are common to P2P networking. The services layer includes network services that may not be absolutely necessary for a P2P network to operate, but are common or desirable in the P2P environment. The application layer provides integrated applications that aggregate services and usually provide a user interface.

Edutella [168]: Edutella attempts to design and implement a schema based P2P infrastructure for the semantic web. It uses the W3C standards RDF and RDF Schema as the schema language to annotate resources on the web, achieving a markup for educational resources. Edutella provides metadata services such as querying and replication, as well as semantic services such as mapping, mediation and clustering. Edutella services are built over JXTA [167], a widely used framework for building P2P applications. The Edutella query service provides the syntax and semantics for querying


both individual RDF repositories and for distributed querying across repositories. Edutella uses mediators to provide coherent views across data sources through semantic reconciliation. Edutella was visualized as providing a platform for educational institutions to participate in a global information network, while retaining autonomy over learning resources.

Figure 2.13 JXTA Architecture

The same authors have also attempted to use a super peer based organization of the Edutella peers to make searching more efficient. The paper [169] describes an organization of the super peers based on HyperCuP, a structured P2P system based on the hypercube topology [170]. The super peers maintain metadata for a set of peers, instead of each peer maintaining its own metadata. The super peers themselves are connected using the HyperCuP overlay. This makes searching for metadata quite efficient, as searches are executed only in the super peer overlay. They also use super peer indices based on schema information to facilitate faster search.

Atlas Peer-to-Peer Architecture (APPA): APPA is a data management system that provides scalability, availability and performance for advanced P2P applications that deal with semantically rich data (e.g., XML documents and relational tables) using a high level SQL-like query language. The replication service is placed in the upper layer of the APPA architecture, which provides an Application Programming Interface (API) to make it easy for P2P collaborative applications to take advantage of data replication. The architecture design also establishes the


integration of the replication service with other APPA services by means of service interfaces. APPA has a layered, service based architecture, shown in Figure 2.14. Besides the traditional advantages of using services (encapsulation, reuse, portability, etc.), this enables APPA to be network independent, so it can be implemented over different structured (e.g., DHT) and super-peer P2P networks. The advanced services layer provides services for semantically rich data sharing, including schema management, replication [171], query processing [172], security, etc., using the basic services.

Figure 2.14 APPA Architecture

Piazza [173]: Piazza is a peer data management system that facilitates decentralized sharing of heterogeneous data. Each peer contributes schemas, mappings, data and/or computation. Piazza provides query answering capabilities over a distributed collection of local schemas and pairwise mappings between them. It essentially provides a decentralized schema mediation mechanism for data integration over a P2P system. Peers in the system contribute stored relations, similar to data sources in data integration systems. Query reformulation occurs over stored relations, stored either locally or at other peers. Piazza also addresses the key issue of security, which enables users to share their data in a controlled manner. Another paper [179] describes the way a single data item is published in protected form using


cryptographic techniques. The owner of the data item encrypts the data and can

specify access control rights declaratively, restricting users to parts of the data.

PIER: P2P Information Exchange and Retrieval (PIER) [43] is a P2P query engine for query processing in Internet scale distributed systems. PIER provides a mechanism for scalable sharing and querying of fingerprint information, used in network monitoring applications such as intrusion detection. PIER uses four guiding principles in its design. First, it provides relaxed consistency semantics with best effort results, as achieving ACID properties may be difficult in Internet scale systems [174]. Second, it assumes organic scaling, meaning that there are no data centers/warehouses and machines can be added in typical P2P fashion. Third, the query engine assumes data is available in native file systems and need not necessarily be loaded into local databases. The fourth principle is that, instead of waiting for breakthroughs in semantic technologies for data integration, PIER tries to combine local and reporting mechanisms into a global monitoring facility. PIER is realized over CAN [33].

PeerDB [175]: PeerDB is an object management system that provides sophisticated searching capabilities. PeerDB is realized over BestPeer [176], which provides P2P enabling technologies. PeerDB can be viewed as a network of local databases on peers. It allows data sharing without a global schema by using metadata for each relation and its attributes. A query proceeds in two phases: in the first phase, relations that match the user's search are returned by searching on neighbors. After the user selects the desired relations, the second phase begins, where queries are directed to the peers containing the selected relations. Mobile agents are dispatched to perform the queries in both phases.

NADSE: The Neighbor Assisted Distributed and Scalable Environment (NADSE) [180] enables the fast and cost efficient deployment of self managed intelligent systems with low management cost at each peer. NADSE implements a structured P2P concept which enables efficient resource management in P2P systems even during high rates of network churn. It provides a distributed computing environment [177] to every peer node by grouping the nodes into clusters, deputing one node in each as the cluster head (CH), and treating the whole network as a group of clusters. Every CH manages the


topology of the network and the resources available in its cluster. NADSE provides a common solution to the fault tolerance and load balancing problems of P2P systems and gives a true distributed computing environment with the help of mobile agents (MAs). It also provides fault tolerant execution of processes (mobile devices/mobile codes), based on a realistic view of the current status of mobile process based computing.

In [181] the Local Relational Model (LRM) is presented. In this model the authors assume that the set of all data in a P2P network consists of local (relational) databases. Each peer can exchange data and services with a set of other peers, called acquaintances. Peers are fully autonomous in choosing their acquaintances. In this model the local relational database is stored at the peer itself; the complete information stored at a peer may be the target of an intruder or hacker, and a peer can misuse the information stored at it. In [37] the Cooperative File System is proposed, but it is a read only storage system developed at MIT. This file system provides robustness, load balancing and scalability. In this system data entries cannot be updated, i.e., the data is static in nature. The existing systems thus do not consider dynamic data, which may be updated when required. In the presented model we address this problem of the existing systems.

2.13 Analysis

A number of existing P2P systems, such as Napster, Gnutella, Kazaa and Overnet, are popular for file sharing over the Internet. Most of these systems deal with static data. Despite good research in this socially popular and emerging field, i.e., P2P networks and systems, there is still a lot of scope for research. It is identified that most P2P systems are popular for static data, which does not change while it is shared among the networks. Little work has been done in the direction of sharing dynamic data among P2P systems, i.e., data which changes while it is shared among the networks. This motivates the use of the resources freely available, and otherwise wasted, in P2P systems for implementing real time information dissemination.

To achieve the above objective of placing real-time data over a P2P environment, data must be partitioned and replicated over multiple peers for a number of reasons, e.g., security and data availability. A mechanism is also required to enhance the throughput of the system to match the expectations of an RTDBS. A distributed concurrency control mechanism is needed for the execution of concurrent


processes, which will maintain data consistency, serializability, etc. in the system. Network traffic is a major issue in P2P networks because of the topology mismatch problem. A mechanism is required to reduce this heavy traffic, since peers communicate with each other to a large extent, which can choke the network. Schemes for placing replicas are required to reduce replica search time; thus, suitable logical structures must be identified in which to place the replicas.

Reliability is another issue that needs more attention from the research community. Other issues are concurrency control, fault tolerance and load balancing. Response time and traffic cost need to be measured and compared as performance measures for the network. To enable resource awareness in such a large-scale dynamic distributed environment, specific middleware is required that takes into account the following P2P characteristics: managing underlay/overlay topologies, reduction of redundant network traffic, data distribution, load balancing, fault tolerance, replica placement/updating/assessment, data consistency, concurrency control, design and maintenance of a logical structure for replicas, controlling network traffic of overlay and underlay networks, etc. The architecture of the proposed middleware should be suitable for dissemination of dynamic information in P2P networks [41, 116, 117]. Table 2.1 presents a comparison of various P2P middleware approaches.

2.14 Summary

In this chapter we have presented Peer-to-Peer (P2P) networks, types of P2P networks, overlay networks, overlay P2P networks, limitations of P2P systems, and parallelism in databases. Concurrency control and the topology mismatch problem were also discussed, along with replication for availability, quorum consensus, databases and their requirements in a P2P environment, and some P2P middleware. At the end of the chapter, an analysis of the literature survey was presented, followed by this summary.

In the next chapter we propose the Statistics Manager and Action Planner (SMAP) for P2P systems.


Table 2.1 A Comparison of Various P2P Middlewares

(Attribute values are listed per middleware in the order: CAN, Tapestry, Chord, Pastry, Napster, Gnutella, Freenet, APPA, Piazza, PIER, PeerDB, NADSE.)

Load balancing: Y, Y, Y, Y, Y, N, N, Y, Y, Y, Y, Y
Fault tolerant (communication link): Y, N, Y, Y, Y, Y, Y, Y, Y, Y, Y
Fault tolerant (host level): Y, Y, Y, N, N, Y, Y, Y, Y, Y, Y, Y
Reliable: Replication for all systems except Napster (N)
Resource sharing: N, N, N, N, Y, Y, N, Y, Y, Y, Y, Y
Secure: N, N, N, Communication level, N, N, N, Y, N, Y, N, Y
Scalable: Y, Y, Y, Y, Little, Little, Little, Y, Little, Y, Y, Y
Performance: Good, Good, Good, Good, Better under load/poor at overload, Better under load/poor at overload, Better under load/poor at overload, Good, Better under load/poor at overload, Good, Better under load/poor at overload, Good
Distributed file management: Y, Y, Y, Y, N, N, Y, Y, Y, Y, Y, Y
Data partitioning: NA, NA, NA, NA, NA, NA, NA, N, N, N, N, N
Traffic optimization: NA, NA, NA, NA, NA, NA, NA, N, NA, N, N, N
Concurrency control: NA, NA, NA, NA, NA, NA, NA, Y, NA, Local, N, N
Parallel execution: NA, NA, NA, NA, NA, NA, NA, Y, N, N, N, Y
Schema management: NA, NA, NA, NA, NA, NA, NA, Y (Global), Y (Pairwise), Y (Global), N, N
File sharing: N, N, N, N, Y, Y, N, Y, Y, Y, Y, Y
Degree of decentralization: Distributed, Distributed, Distributed, Distributed, Centralized, Decentralized, Distributed, Hybrid, Hybrid, Super-peer based, Distributed, Hybrid
Network structure: Structured, Structured, Structured, Structured, Structured, Unstructured, Loosely structured, Independent, Unstructured, Structured, Loosely structured, Loosely structured

*NA: Not addressed, to the best of our knowledge


Chapter 3

Statistics Manager and Action Planner

(SMAP) for P2P Networks

P2P technology facilitates sharing the resources of geographically distributed peers connected to the Internet. P2P applications are attractive because of their scalability and enhanced performance; as a substitute for the client/server model, they enable direct, real-time communication among peers. Placing data over a dynamic P2P network while maintaining scalability, performance and resource management is difficult: peer churn rate and traffic are major issues that affect the performance of a P2P system.

In this chapter, we propose the Statistics Manager and Action Planner (SMAP) for P2P networks, an evolutionary approach to P2P systems. SMAP supports fault tolerance, short path lengths to requested resources, low overhead during network management operations, balanced load distribution between peers and a high probability of lookup success.

The rest of the chapter is organized as follows. Section 3.1 gives an introduction. Section 3.2 explores the architecture of SMAP. Section 3.3 gives the advantages behind the development of SMAP. Section 3.4 presents a discussion. Finally, the chapter is summarized in Section 3.5.

3.1 Introduction

P2P technology facilitates sharing the resources of geographically distributed peers connected to the Internet. As technology advances, the computation power, storage capacity and input/output capability of computing devices keep increasing. A major part of the CPU ticks and storage space of computing devices is wasted because of the limited requirements of typical users. Distributed technologies enable the sharing of data as well as resources. P2P networks share data storage, computation power, communications and administration among thousands of individual client workstations. The ability of P2P systems to share data and resources


can be utilized to pool these wasted resources, e.g., the CPU ticks and storage space of participating peers. Utilizing this pool of storage space to implement an RTDDBS over a P2P network is a pressing challenge. This vision is also supported by the increased usage and availability of the Internet and the popularity of P2P systems. To meet this challenge, a number of issues related to P2P systems, databases and the real-time constraints of databases have to be addressed.

The key issue in implementing an RTDDBS over P2P systems is to efficiently maintain target data and peer availability in an environment of high node churn and heavy network traffic, while delivering the fast response time and high throughput expected of a real-time environment. Load balancing, fault tolerance and replication are other issues without which such a system cannot be useful.

We are required to develop a computing/communication P2P system that fulfills most of the above challenges. Thus, a system is needed for P2P networks that increases the availability of data items, reduces the response time of the system, provides fast database updates, arranges secure access, distributes information over the P2P network and manages the other dynamic issues in the database.

3.2 System Architecture

Middleware is the necessary layer between the hardware, the operating system and the applications. The aim of the middleware layer is to provide appropriate interfaces to diverse applications and a runtime environment that supports and coordinates multiple applications, with mechanisms for adaptive and efficient use of system resources. Middleware is often used by traditional systems as a bridge between the operating system and the applications, and it makes the development of distributed applications possible. Traditional distributed middleware (DCOM, CORBA) is not adequate for the memory and computation requirements of dynamic P2P networks, and its maintenance is also not easy under dynamic P2P network constraints. For these kinds of networks, a middleware that is simple, light and easy to implement is needed.

To achieve the above, the Statistics Manager and Action Planner (SMAP) shown in Figure 3.1 is proposed. It is a decentralized management system that manages P2P applications and system resources in an integrated way, monitors the behavior of P2P applications transparently, obtains accurate resource projections, manages connections between peers, and distributes replicas/database objects in response to user requests and changed processing and networking conditions. SMAP is a middleware that supports P2P systems in storing and accessing real-time information. It consists of five layers, which in combination help users store and retrieve their data over P2P networks efficiently. SMAP distributes real-time data in P2P networks and replicates it to provide acceptable data availability. It also manages the replicas in an efficient overlay topology that provides fast, up-to-date data in response to any user query. SMAP improves response time and throughput by executing arriving queries in parallel, minimizes the network traffic generated by data/control information, and provides fault tolerance and load balancing in the system. The layers are briefly described as follows.

3.2.1 Interface Layer (IL)

It handles the heterogeneity of the participating systems. It receives queries from the outside world and forwards them to the next layer after checking the authenticity of the user/query/data corresponding to the query. The results corresponding to received queries are returned to the user. IL also maintains a log of the resources available and in use. A brief discussion of the components of this layer follows.

Authenticity Manager (AM): This module looks after the authenticity of a user and checks whether the user is authorized to use the system. The privileges/permissions granted to the user, e.g., read/write/execute/update, are also verified by this module. Conventional techniques, e.g., login IDs and code exchange techniques, are used to prevent unauthorized access. To prevent misuse by malware, techniques such as CAPTCHAs may be used, so that an unauthorized program cannot tamper with the information or the system as a whole.

Resource Manager (RM): It manages the resources of the system (viz., uplink/downlink bandwidth, storage space, CPU ticks, etc.) and also controls the participation of peers in the network. RM mainly collects the resources and maintains statistics about them. To simplify its functionality, RM is subdivided into a Resource Allocator (RA) and a Resource Publisher (RPB).

[Figure 3.1 is an architecture diagram showing the application on top and the Internet at the bottom, with the layers of SMAP in between: the Interface Layer (Authenticity Manager, Resource Manager, Query Interface), the Data Handling Layer (Schema Manager, Data Scheduler, Query Processor, Query Optimizer, Query Execution Engine, Data Manager, Data Storage Space), the Replication Management Layer (Replica Topology Manager, Replica Update Manager, Replica Search Manager, Quorum Manager) and the Network Management Layer (Network Manager, Group Communication, Traffic Load Optimizer, Peer Analyzer), with a Control Layer spanning all layers.]

Figure 3.1 Architecture of Statistics Manager and Action Planner (SMAP)

Resource Allocator (RA): allocates and controls the resources for newly subscribed

services. Resources are allocated fairly among peers, at the same time fulfilling

individual peer requirements. RA keeps the global state of the distributed resources

consistent among all local resources based on a given coherence strategy.


Resource Publisher (RPB): is responsible for collecting and publishing the resources permitted to be shared in the system.

Security Manager (SM): It provides coordination among all the applications running on a number of peers. Security, trust and privacy are addressed from the very beginning of system design and at all levels, such as hardware, operating system, protocols and architecture. SM has the following roles: (1) protecting channels against unauthorized access or modification; (2) program validation/verification (what an uploaded/downloaded piece of software really does) and trust modeling; (3) controlling how fragments of information can be efficiently shared, together with key/certificate management; and (4) handling the implications of a dynamic P2P network (what can be done without trusted servers). SM also provides and looks after the security levels of the data/query/user, etc. at various levels in the system.


Query Interface (QI): It accepts queries from the outside world. Before accepting a query, it forwards it to SM for validation of the user. If the user is authenticated, it returns the information of RC, from where the user receives the results for the submitted queries. All required information is exchanged with the user for smooth functioning of the operation. The submitted query is then passed to the query analyzer (QA).

3.2.2 Data Layer (DL)

The Data Layer (DL) is responsible for data handling, data integrity and data-flow management in the system. It receives queries from the IL, then subdivides, optimizes and distributes them to the different components as required. DL ensures efficient distribution of data to the different components and also supports the concurrent execution of processes in the system. DL implements the Matrix Assisted Technique (MAT) for data distribution and the Timestamp-based Secure Concurrency Control Algorithm (TSC2A) for concurrency control; MAT also manages the global/local schemas of the database. DL compiles the partial results corresponding to local schemas and generates results corresponding to the global schema. Details about MAT and TSC2A are given in Chapter 4 and Chapter 5, respectively. A brief discussion of the components of DL that execute the functionality of MAT and TSC2A follows.

Schema Scheduler (SS): It is responsible for handling the global database, which is partitioned horizontally, vertically or both. SS ensures the partitioning and reassembly of data in the system, and it helps DL compile the final results from the partial results received from the various peers holding the replicas.

Query Processor (QP): It subdivides a received query into subqueries according to the database schema and distributes them to the corresponding replicas. It uses the global schema, the local schemas and the MAT data partitioning algorithm for subdividing queries. QP also helps DL compile the partial results.

Query Optimizer (QO): It analyzes, resolves and optimizes the received queries, and it is also responsible for breaking a query into subqueries. QO decides whether or not a peer is suited for a particular subquery.

Query Execution Engine (QEE): It is responsible for executing the subqueries and producing the corresponding partial results, which are then sent to the peers responsible for compiling them and returned to QP. QEE also ensures parallel execution of the submitted subqueries and manages the various stages used in execution. It produces timely responses to subqueries and ensures the execution of all subqueries corresponding to a query, dispatching each subquery to a suitable peer and collecting the information/partial results.

Data Scheduler (DS): It maintains the global/local schemas of the database and finds the correlation between the global and local schemas. It checks and distributes information to the selected peers through a predefined pattern decided by the administrator. DS is also used to gather information from the replicas of a data partition held by the peers.

Data Storage Space (DSS): maintains the actual database to be accessed in response to any query. All data items belonging to a database partition are physically stored in the DSS, the shared region permitted by the owner of a peer.


Data Manager (DM): DM is responsible for distributing data across the network. It ensures the integrity of data and correct transfer rights in the system, and it grants access rights to the database stored at a peer in a serializable manner. DM also checks for and manages the timely retransfer of any data tampered with during transfer, whether between components or between participating peers.
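The timestamp-ordering idea underlying TSC2A (detailed in Chapter 5) can be illustrated with the classical basic timestamp-ordering rules: a transaction's timestamp is compared against the latest committed read and write timestamps of the data item, and conflicting late operations are aborted. The sketch below is a generic illustration under that assumption, not the thesis's actual algorithm, and all names in it are invented for the example.

```python
# Minimal sketch of basic timestamp ordering (the classical scheme that a
# timestamp-based concurrency control algorithm builds on; illustrative only).

class DataItem:
    def __init__(self):
        self.read_ts = 0   # largest timestamp of any accepted read
        self.write_ts = 0  # largest timestamp of any accepted write
        self.value = None

def read(item, txn_ts):
    """Reject a read that arrives after a younger transaction's write."""
    if txn_ts < item.write_ts:
        return None, "abort"
    item.read_ts = max(item.read_ts, txn_ts)
    return item.value, "ok"

def write(item, txn_ts, value):
    """Reject a write that would invalidate a younger transaction."""
    if txn_ts < item.read_ts or txn_ts < item.write_ts:
        return "abort"
    item.write_ts = txn_ts
    item.value = value
    return "ok"
```

Under these rules an aborted transaction is restarted with a fresh, larger timestamp, which guarantees a serializable schedule in timestamp order.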

3.2.3 Replication Layer (RL)

The Replication Layer (RL) replicates the data and ensures the availability of up-to-date data. To this end, RL implements a logical structure for managing replicas, which also provides fault tolerance to the system. RL controls the space available at peers and provides up-to-date data in response to a query with reduced network traffic. RL is responsible for all issues related to replication, i.e., the number of replicas, replica selection, the logical structure in which replicas are arranged, etc. Details about this layer are given in Chapter 7 and Chapter 8. The components of the layer are as follows.

Replica Topology Manager (RTM): RTM places the replicas in a logical structure that eases access to information, and it is responsible for locating any replica within the group of replicas. It supports read/write quorums. It implements the Logical Adaptive Replica Placement Algorithm (LARPA) and the Height Balanced Fault Adaptive Reshuffle (HBFAR) scheme to reduce replica search time; LARPA and HBFAR also identify the replicas that form a quorum. RTM maintains, over time, the logical structure in which the replicas are placed: whenever a replica leaves the network, it readjusts the remaining replicas by rearranging their addresses in the logical structure. LARPA and HBFAR are also used to maintain the overlay topology in the system. RTM plays a key role in SMAP.

Replica Update Manager (RUM): The major aim of RUM is to maintain the freshness of data items. It uses the LARPA and HBFAR schemes to keep the latest information. By minimizing the update time of the system, it minimizes the probability of accessing stale data.


Quorum Manager (QM): It is responsible for deciding the quorum consensus needed to access a data item. The quorum is chosen so that the system obtains the desired availability of replicas, i.e., the number of replicas to be accessed is increased if the availability of the peers storing the replicas is low. QM decides how many replicas must be accessed, maintains replica availability at the desired level, and identifies which replicas to access from the logical structure.
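The quorum-consensus rule QM relies on can be made concrete with a small sketch: with n replicas, any read quorum of size r and write quorum of size w satisfying r + w > n (every read overlaps every write) and 2w > n (any two writes overlap) guarantees consistency. The majority-based sizing below is one illustrative policy, not QM's actual one, which also weighs measured peer availability.

```python
# Sketch of quorum sizing for n replicas: pick a majority write quorum and
# the smallest read quorum that still intersects every write quorum.

def choose_quorums(n):
    w = n // 2 + 1          # majority write quorum => 2w > n
    r = n - w + 1           # smallest r with r + w > n
    assert r + w > n and 2 * w > n
    return r, w
```

For example, with n = 5 replicas this yields r = 3 and w = 3; with n = 4 it yields r = 2 and w = 3, so a read touching any 2 replicas is certain to see the latest committed write.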

3.2.4 Network Layer (NL)

The Network Layer (NL) maintains logical connections among the peers in the network and provides optimized network paths from one peer to another. NL is responsible for sending and receiving packets on the network, and it connects to the Internet layer of the TCP/IP model; all information is sent or received through NL only. It is completely responsible for the logical structure, the connections between participating peers, network traffic, path selection for transferring messages, etc. The detailed functioning of the components of this layer is described in Chapter 6. They are as follows:

Network Manager (NM): NM is responsible for managing the logical topology in which replicas are arranged and implements this topology over the P2P network. NM views the network as a hierarchical Distributed Computing Environment (DCE).

Group Communication Manager (GCM): Every logical address of a replica is checked and converted into the physical address of that replica. GCM also helps send parallel update messages to the group of replicas, routes data and control information, and establishes communication links to other replicas.

Traffic Load Optimizer (TLO): Huge network traffic is generated in P2P networks; this is the main bottleneck of a P2P network and prevents it from scaling beyond a limit. TLO reduces network traffic by optimizing network paths. It analyses network traffic and provides statistics to the system, which are used in managing the traffic. TLO implements the Common Junction Methodology (CJM) to minimize network traffic in the system; details are given in Chapter 6.

Peer Analyzer (PA): PA is responsible for collecting statistics about the peers available in the network. Peers are selected for storing replicas depending on the statistics received. Peers can leave and join the network with or without informing the system; to trace their behavior, PA keeps track of each peer's leaving and joining times, the bandwidth with which it is connected to the network, the storage space available, its CPU utilization, etc.
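As an illustration of how the statistics PA collects might drive replica placement, the sketch below ranks candidate peers by a simple product score favoring long-lived, well-connected, lightly loaded peers. The field names and the scoring formula are assumptions for the example, not the selection criterion actually used by SMAP.

```python
# Hedged sketch: ranking peers for replica placement from PA-style statistics.
from dataclasses import dataclass

@dataclass
class PeerStats:
    peer_id: str
    mean_session_time: float   # average seconds between join and leave
    bandwidth_kbps: float      # connection bandwidth
    free_storage_mb: float     # shared storage still available
    cpu_utilization: float     # 0.0 (idle) .. 1.0 (saturated)

def placement_score(s: PeerStats) -> float:
    # Prefer peers that stay long, transfer fast, have space, and are idle.
    return (s.mean_session_time * s.bandwidth_kbps
            * s.free_storage_mb * (1.0 - s.cpu_utilization))

def select_replica_holders(stats, k):
    """Return the k best-ranked peers to hold a partition's replicas."""
    return sorted(stats, key=placement_score, reverse=True)[:k]
```

A real policy would normalize each statistic and weight it, but the shape of the computation, collecting per-peer measurements and ranking candidates, is the same.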

3.2.5 Control Layer (CL)

CL provides interaction between the various layers and the corresponding components of SMAP and manages the complete working of SMAP. It synchronizes all layers of SMAP to improve system performance, takes decisions depending upon the statistics received from the various components, and activates the corresponding actions. It also ensures proper information and data flow in the system, and it keeps track of the working of the different components to ensure that the system runs efficiently.

3.3 Advantages of SMAP

P2P systems in general have a high overall management cost. The presence of SMAP, however, enables fast and cost-efficient deployment of a self-managed P2P system with low management cost at each peer. SMAP supports the P2P network in storing and retrieving real-time information. It first distributes the database into a number of partitions; these partitions are then stored on groups of highly available replicas. By distributing the database, SMAP provides primary security for it. SMAP enables the P2P network to effectively exploit the underutilized resources of the system and use them for a highly complicated RTDDBS. It utilizes three-stage parallelism for the execution of received queries, which enhances the throughput of the system.


SMAP accesses information from the partitions distributed over the network. It supports concurrent execution of processes and concurrent access to the partitions of the database. It also allows P2P networks to be used as a data management system and speeds up information retrieval, making the system behave like a conventional file management system stored on a static network. SMAP is a highly fault-tolerant system, and the availability of data items is also improved through it. SMAP receives both data and queries for processing and management, and it avoids any leakage of information from a high security level to a low security level.

SMAP also provides route management in the P2P network. It reduces network traffic at large scale, providing scalability to the network. This addresses the topology mismatch problem faced in P2P networks, which generates heavy redundant traffic. SMAP also helps balance the traffic load over the network.

Using SMAP, one may deploy large-scale intelligent systems without the need for cost-intensive supercomputing infrastructure, whose management is highly complex and requires highly skilled administrators. The approach is evolutionary in the sense that it gives a new direction for applying P2P to real-time service scenarios.

3.4 Discussion

SMAP enables fast and cost-efficient deployment of information over a P2P network with high availability of data and peers. It provides a DCE in which every peer can use the resources of all other peers participating in the network, utilizing the otherwise wasted resources offered by peer owners to implement an RTDDBS. SMAP is a self-managed P2P system with the capability to deal with a high churn rate of peers. It also reduces the redundant network traffic generated by the topology mismatch problem in any P2P system, and it provides efficient replica placement in the network, which supports high data availability.

3.5 Summary

In this chapter, we have presented the Statistics Manager and Action Planner (SMAP) for P2P networks. It is an evolutionary approach to P2P systems that manages dynamic information (which can change while shared between peers) over the highly dynamic environment of P2P networks. SMAP enables fast and cost-efficient deployment of a self-managed P2P system whose overall management cost is distributed so that the cost at each peer is low. It implements a structured P2P concept that


enables efficient resource management in P2P systems even during high peer churn. SMAP reduces the redundant traffic generated by the topology mismatch problem, and it provides a distributed computing environment to every peer participating in the network.

In the next chapter, the Data Placement and Execution Model for the RTDDBS is discussed.


Chapter 4

Load Adaptive Data Distribution over

P2P Networks

Data placement and management is a challenging task when storing an RTDDBS over P2P networks. It involves the problems of data distribution over the Internet, peer/data availability, replication of partitions, congestion-free load balancing, and response time and throughput in the presence of a high churn rate of peer nodes. An efficient data placement procedure plays an important role in enhancing the performance of P2P systems.

In this chapter, a 3-Tier Execution Model (3-TEM) is proposed. It divides the conventional execution process used in P2P systems into three independent stages. 3-TEM implements a Matrix Assisted Technique (MAT) to distribute the database over P2P networks. MAT uses range partitioning with dividing factors to partition the database horizontally, vertically or both, and it places each partition on a group of peers as replicas; it also provides primary security for the database. 3-TEM requires only small chunks of CPU ticks to execute subprocesses through its multiple stages, and it improves throughput, query completion ratio and resource utilization by executing processes in parallel.
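The horizontal side of range partitioning can be sketched as follows: each record is routed to a partition by comparing its key against an ordered list of boundary values (the "dividing factors"). The boundaries and the function below are illustrative; the actual MAT algorithm is presented later in this chapter.

```python
# Illustrative horizontal range partitioning: boundaries [b1, ..., bk]
# induce partitions (-inf, b1), [b1, b2), ..., [bk, +inf).

def range_partition(records, key, boundaries):
    """Split dict records into len(boundaries)+1 partitions by key ranges."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for rec in records:
        # Index = number of boundaries the key value has reached or passed.
        idx = sum(rec[key] >= b for b in boundaries)
        parts[idx].append(rec)
    return parts
```

Each resulting partition would then be handed to a replica group; a vertical split would analogously project each record onto disjoint subsets of fields.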

The rest of the chapter is organized as follows. Section 4.1 presents an introduction. Section 4.2 gives the system model. Section 4.3 describes the 3-Tier Execution Model (3-TEM). Section 4.4 discusses load balancing, and database partitioning is presented in Section 4.5. Implementation and a performance study are given in Section 4.6. Section 4.7 highlights the advantages of 3-TEM. Section 4.8 presents a discussion, and finally the chapter is summarized in Section 4.9.


4.1 Introduction

A large number of peers participate in P2P networks. P2P systems are dynamic in nature because participating peers may join or leave the network with or without informing the system. The churn rate is the rate at which peers leave and join the system, and each peer has a session time for which it remains connected. It is very difficult to select a suitable peer for a particular task from among the many participating peers, which differ in a variety of parameters that can affect system performance. P2P systems are popular for their unrestricted sharing of data files, e.g., Napster and Gnutella. In such an environment, processes require only small chunks of CPU ticks to execute. Managing data availability in the presence of churn during the service time is an issue that must be addressed.

To implement databases over P2P networks, a system has to address challenges related both to P2P networks and to databases. The challenges related to P2P networks are peer selection, churn rate, session time, network traffic, overlay and underlay topologies, the topology mismatch problem, etc.

The challenges related to databases are data availability, replication, concurrency control, security, real-time access to data, etc. Databases may be partitioned to maintain data availability, peer availability, primary security, peer load, etc. System performance also depends upon how the database is divided into partitions and how these partitions are accessed during the execution of a submitted query. A global schema is partitioned into local schemas, and proper placement of the partitions improves the performance of the system. To execute a global query through local schemas, a mapping between the global and local schemas is required, and the schema-mapping technique also affects system performance. A further challenge is that a real-time environment expects the execution of a query in bounded time; such time-bound execution and high throughput are hard to achieve in P2P networks because of peer churn.

To address several of the above issues, we have developed a 3-Tier Execution Model that handles discovery and peer selection, churn rate, data partitioning, data availability, primary security, schema mapping and data consistency.


4.2 System Model

To distribute a database over a P2P network, it may be partitioned into small parts. The partitions are placed on groups of selected peers to maintain data availability at an acceptable level. The peers holding replicas are selected from the network through a predefined selection criterion, and these groups are then consulted to obtain up-to-date data. A requester sends its query to the database in the form of transactions with respect to the global schema. These transactions are subdivided into subtransactions (with respect to the local schemas) depending upon the database partitions. The partial results obtained from the various groups of replicas are compiled into the desired result for the submitted transaction, which is returned to the requester. The system model is as follows.

The set of peers participating in the network is defined as $P = \{p_1, p_2, p_3, \ldots, p_{n_p}\}$, where $n_p$ is the number of peers. It is assumed that the relational database $DB$ has $n_f$ fields $f_1, f_2, f_3, \ldots, f_{n_f}$ and $n_r$ records $r_1, r_2, r_3, \ldots, r_{n_r}$. The database $DB$ is divided into $\eta_p$ partitions represented as $DB = \{Db_1, Db_2, Db_3, \ldots, Db_{\eta_p}\}$, $DB = \bigcup_{i=1}^{\eta_p} Db_i$, where $\cup$ is an operation which compiles the partial results to produce the final result corresponding to the global schema. Each partition $Db_i$ is stored at a set of peers, namely the replica set $DbR_i = \{p_1^i, p_2^i, p_3^i, \ldots, p_{\eta_r}^i\}$. A transaction $T_i$ over the database $DB$ is represented as $T_i \sim DB$. This transaction is further subdivided into subtransactions corresponding to the database partitions, $T_i = \{ts_1^i, ts_2^i, ts_3^i, \ldots, ts_{\eta_t}^i\}$, with $ts_j^i \sim Db_j$, where $ts_j^i$ is the $j$th subtransaction of the transaction $T_i$. Here $n_p$ is the number of peers participating in the system, $\eta_p$ is the number of partitions of the database, and $\eta_r$ is the number of peers over which a particular partition is replicated.

The partial results $rs_m^i$ from the replica sets of the partitions are compiled to generate the final result, i.e., $Rs_i = rs_1^i \cup rs_2^i \cup rs_3^i \cup \ldots \cup rs_{\eta_p}^i$.
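The decomposition of a transaction into per-partition subtransactions and the union of partial results can be sketched as follows. This is an illustrative sketch only: the names `split_transaction` and `compile_results` are hypothetical, and subtransactions are modelled simply as lists of operations keyed by partition id.

```python
def split_transaction(transaction, num_partitions):
    """Split a transaction T_i, given as (partition_id, operation) pairs,
    into per-partition subtransactions ts_1^i ... ts_eta^i."""
    subtransactions = {j: [] for j in range(num_partitions)}
    for partition_id, operation in transaction:
        subtransactions[partition_id].append(operation)
    return subtransactions

def compile_results(partial_results):
    """Compile the partial results rs_j^i (keyed by partition id) into the
    final result Rs_i, i.e., the union in partition order."""
    final = []
    for j in sorted(partial_results):
        final.extend(partial_results[j])
    return final
```

For instance, a transaction touching partitions 0 and 1 is split into two subtransactions, and the per-partition results are later merged back in partition order.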


4.3 3-Tier Execution Model (3-TEM)

In the conventional 1-Tier Execution Model (1-TEM), a head peer accepts a transaction from a user, resolves it and distributes it to the corresponding processing peers for execution. These processing peers execute the subtransactions received from the head peer and return the results after executing them. The head peer compiles the received results and returns them to the corresponding requester. The head peer becomes heavily overloaded because it is responsible for every event belonging to a transaction, viz., transaction caching, transaction division, result compilation, result caching and result delivery. The head peer is thus prone to single point of failure. Another issue in 1-TEM is that a query spends a large amount of time in the head peer. In the case of head peer failure, the execution time already spent on queries is wasted and the queries must be re-executed. This reduces throughput, because the throughput of the system depends upon the performance of the head peer alone.

[Figure 4.1 depicts the User/Requester, the Transaction Coordinator (TC), the Transaction Processing Peers (TPPs) and the Result Coordinator (RC).]

Figure 4.1 3-Tier Execution Model (3-TEM) for P2P Systems

There are three major subprocesses responsible for executing a transaction in this conventional P2P system, i.e., transaction coordination, transaction processing at remote peers and result coordination. To speed up the execution process of the conventional P2P system and to balance the load of the head peer, a 3-Tier Execution Model (3-TEM) is proposed {Figure 4.1}. It comprises three components: Transaction Coordinator (TC), Transaction Processing Peer (TPP) and Result Coordinator (RC). The execution process may be decomposed into small subprocesses executed independently by the TC, TPPs and RC. To improve the response time, these subprocesses are managed by dedicated peers along with the required information. The partial results received back from the subprocesses are compiled at the RC into the final results corresponding to the global schema.

These stages share control information for the execution of the subtransactions of the parent transaction. The three components require only small chunks of CPU time to carry out their respective responsibilities and may be executed in parallel. This parallelism improves the throughput of the system. A timestamp is used to maintain serializability among the subtransactions.

TC receives the global transactions from the users, and translates and decomposes a transaction $T_i$ against the global schema into (local) subtransactions $\{ts_1^i, ts_2^i, ts_3^i, \ldots, ts_{\eta_t}^i\}$ depending upon the partitioning mechanism used. The local subtransactions are then routed to the corresponding TPPs in serializable order for execution. A subtransaction may be executed on a number of TPPs. 3-TEM coordinates among the TPPs during the execution of subtransactions. To improve the performance of 3-TEM in terms of response time, a TPP executes the subtransactions in the order of the timestamps associated with them. RC receives the partial results from the TPPs and compiles them into the final results against the global transaction. The final results are returned to the owner of the global transaction. The details of the different components of 3-TEM are given in Figure 4.2.

4.3.1 Transaction Coordinator (TC)

TC acts as an interface used by the requesting sites for sending their requests in the form of transactions (queries). It is also responsible for providing a compatible environment for the global transactions coming from heterogeneous peers. When TC receives a transaction from a user, it directs the user to the Result Coordinator (RC) from which the user may obtain the results corresponding to the submitted transaction. TC checks the authenticity of the arrived transaction and assigns a security level to it. It resolves the transaction into subtransactions with respect to the local schemas and sends them to the corresponding TPPs with the help of the Data Access Tracker (DAT). TC has various components to process the received transactions. These components are as follows.

Transaction Interface (TI): It provides the interface to the user and receives the authorized global transactions $T_i$ from the external world. It helps the user to obtain the results $Rs_i$ corresponding to the submitted transaction from the RC. An authenticated user always receives a token for its submitted query/transaction $T_i$.

Security Checker (SC): It authenticates a requester/user; the authentication of users is done through username and password. SC blocks unauthorized access to the data. It also blocks low security level transactions from accessing high security data items and filters out possible malicious attacks through arrived queries. SC allocates security classes/levels $L(T_i) \in Sc$ to all authorized arrived transactions ($rts_i$) and $L(x) \in Sc$ to the data items stored in the database, respectively. It thereby secures the transactions and data in the system.

Transaction Manager (TM): It handles the transactions and data in the system and ensures global serializability. TM resolves the global transaction $T_i$ into subtransactions $rts_i$ and assigns the timestamp of the global transaction to all its subtransactions. Each subtransaction is sent to the TPP where the data required by that subtransaction is available.

Load Analyzer (LA): It analyzes the load at each participating peer and maintains load statistics. Depending upon these statistics, a load-distribution mechanism is activated to balance the load over the peers participating in the system.

Data Administrator (DA): It is responsible for all data and database related activities in the system. DA keeps track of the peers where the partitions are stored in the form of an address table. It also sends an update message to the DAT in the event of any data update.

Data Access Tracker (DAT): It controls, manages and provides the required information in the system. It keeps track of the read/write timestamps associated with each data item. Every time a data item is added, read or updated by a transaction, the corresponding timestamp of the data item is also updated in the DAT. Two types of timestamps are associated with every data item, i.e., read and write. The read timestamp is the timestamp of the last global transaction that read the particular data item. The write timestamp is the timestamp of the last global transaction that wrote the data item. DAT also detects and resolves conflicts between global transactions.

Peer Identifier (PI): It keeps track of the addresses of the peers where the database partitions are stored. PI also holds the routing information of the network and implements the peer selection procedure.

4.3.2 Transaction Processing Peer (TPP)

A Transaction Processing Peer (TPP) is responsible for executing the received subtransactions and maintaining local serializability among them. The components of TPP are shown in Figure 4.2 and are as follows.

[Figure 4.2 depicts the Transaction Coordinator (TC) with its components (Transaction Interface, Security Checker, Load Analyzer, Transaction Manager, Data Administrator, Peer Identifier, Data Access Tracker), the Transaction Processing Peer (TPP) with its components (Subtransaction Interface, Subtransaction Manager, Data Manager, Local Database), and the Result Coordinator (RC) with its components (Result Manager, Result Data Administrator, Result Pool), together with the flow of a transaction/query from the requester through the TC to the TPPs and on to the RC.]

Figure 4.2 System Architecture of 3-Tier Execution Model (3-TEM)

Subtransaction Interface (SI): SI receives a subtransaction $rts_i$ and checks it for the prescribed format. A subtransaction can be blocked, aborted, restarted or allowed to proceed for execution, depending upon the order of the subtransactions and the timestamps associated with them. SI places the subtransactions in a priority queue according to their timestamps. SI also checks the deadlines of subtransactions and aborts those that exceed their deadlines.


Subtransaction Manager (SSM): It resolves a subtransaction $rts_i$ and determines the data it requires at the TPP. It also checks the feasibility and availability of the requested data items in its local database. The subtransaction is then sent to the Data Manager (DM) for data mapping; the data items corresponding to the subtransaction are identified in the local database and local serializability is maintained.

Data Manager (DM): It is responsible for mapping a subtransaction $rts_i$ to the data items it requires from the database partition available at the TPP. It is responsible for all operations performed on the data items corresponding to read/write subtransactions and also maintains data consistency.

Local Database (LD): It is the actual partition of the global database within which the data items reside. It provides the data items corresponding to read/write subtransactions.

4.3.3 Result Coordinator (RC)

A Result Coordinator (RC) is responsible for compiling the partial results received from the TPPs. This compilation is a mapping of the partial results from the local schemas to the global schema and the generation of the final results corresponding to the global schema. RC forwards the compiled results to the authenticated user/owner of the received transaction. RC stores results in a result pool sorted by the deadlines of the transactions. The components of RC are shown in Figure 4.2 and briefly described as under.

Result Manager (RSM): It is similar to the Transaction Manager (TM) but works in the reverse direction of TC. It ensures global serializability of the final results; the serializability of the results is identified through the timestamps of the partial results. RSM is responsible for compiling the partial results $rrs_i$ into the global result $Rs_i$. It also compares the partial results received from the various replicas to select the updated result. RSM sends a message to the user indicating that the result is ready, and hands over the final result to the user (transaction owner) after checking the user's authenticity. A user is identified by comparing the token issued against the submitted transaction.


Result Data Administrator (RDA): It manages the global as well as the local databases and helps in the compilation of the partial results.

Result Pool (RP): It holds the results until they are handed over to their owners. It keeps a log of the peers from which the partial results are received. RP also keeps track of the deadline attached to each transaction and its corresponding results, which is utilized to discard the result after a certain period of time.

4.3.4 Working of 3-TEM

The requester peer sends its query to the Transaction Coordinator (TC). On receiving the query from the requester, TC checks the authenticity of the requester and whether the requester peer is authorized to access the system. The next check concerns the scope of the query, i.e., whether the result is within the scope of the system. After authentication and scope checking, TC assigns a timestamp to the received query, and a token number/query id corresponding to the arrived query is provided to the requester. This query id remains with the query and its corresponding subqueries throughout the life of the query within the system. The timestamp assigned to the arrived query further determines the sequence of subquery execution in the system. The address of the Result Coordinator is also provided to the requester so that after the specified time period the RC may be consulted to obtain the result corresponding to the query id. The partition ids are calculated through the partitioning algorithm, and the addresses of the corresponding TPPs are obtained through the address mapping table. Information about the number of partitions used and the number of replicas of each partition is sent to the RC, which further helps in compiling the result from the partial results received from the various TPPs. The packets of information (data and control information) for the TPPs are generated and forwarded to the corresponding TPPs.

A TPP extracts the data and control information from the received packet. The position of the required data is identified (corresponding to the control information provided in the information packet) at the TPP. The operation specified in the packet is performed on the identified data, and the results generated by the TPP are forwarded to the Result Coordinator (RC).

RC receives all partial results from the corresponding TPPs, analyzes them against the timestamps attached to the partial results, and compiles the result corresponding to the received query (with respect to the global schema). The final results are stored in the RP. The result is forwarded to the requester after authenticating the requester and the token id/timestamp associated with the query.

4.4 Load Balancing

The time required at a peer for the execution of a transaction is much higher in the conventional system (1-TEM) than in the 3-TEM system. In the dynamic environment of P2P, the session time of a participating peer is a constraint. Thus, processes that require only small numbers of CPU ticks have an advantage in 3-TEM over 1-TEM (the conventional execution model). The heavy load of the head peer is shared between the TC and RC in 3-TEM. 3-TEM generates no extra overhead for data transfer over the overlay network, as it follows the same task pattern as 1-TEM. 1-TEM follows the pattern TC-TPP-TC, where TC holds the dual responsibilities of transactions and results; in 3-TEM the pattern is TC-TPP-RC, and the load of TC is shared with RC at the cost of a tolerable increase in control overhead.

4.5 Database Partitioning

To place real time information in an untrusted P2P environment, a number of issues must be addressed. Efficient utilization of the storage space available on peers, primary security of the placed database, and availability of the data whenever required are the primary issues to be addressed in a P2P network with a high churn rate. For security reasons, the complete database cannot be placed on any single peer. The idea is to place only partial information on untrusted peers, which is not useful until it is complete. The database must therefore be partitioned and the partitions placed over multiple peers. Peer availability is another reason for database partitioning. Replication of the database is required to meet its data availability requirement: to achieve the desired data availability, each database partition is replicated over multiple peers so that the system keeps working in case of any peer failure.

Database partitioning is a primary requirement when placing a database over a P2P environment. In this case vertical partitioning is required along with horizontal partitioning of the database. To achieve vertical and horizontal partitioning, a simple, fast and efficient mechanism is required which can partition the database as per the requirements of the system. A Matrix Assisted Technique (MAT) is proposed, which resolves the transactions according to the database partitions and compiles the partial results received from the partitions after execution. It is a simple, fast and efficient technique for the P2P environment which addresses the above issues.

4.5.1 Matrix Assisted Technique (MAT)

To improve security and data availability in a dynamic P2P environment, the database must be stored in small partitions over the P2P network. The primary requirements for partitioning are fast execution, simplicity and efficiency. To fulfill these requirements in a dynamic P2P environment, a Matrix Assisted Technique (MAT) is proposed. This is a simple hashing technique capable of partitioning a relational database horizontally, vertically or in both ways, and of providing range partitioning of the database. MAT uses dividing factors to partition the database into small units. It efficiently performs all the required tasks within the time constraints, has low complexity and is easy to implement.

A relational database is considered for the implementation of MAT. Consider a relational database $DB$ with $n_f$ fields (columns) $f_1, f_2, f_3, \ldots, f_{n_f}$ and $n_r$ records (rows) $r_1, r_2, r_3, \ldots, r_{n_r}$ that is required to be placed over the P2P network. MAT uses two dividing factors, $rdf$ and $cdf$, for horizontal and vertical partitioning, respectively. The number of horizontal partitions depends upon the quotient obtained from dividing the number of records by $rdf$, and similarly the number of vertical partitions is obtained from dividing the number of columns by $cdf$. This decides which field comes under which partition and at what position, and the partitioning range of the partial data is thus decided for each partition.

Figure 4.3 presents a dummy relational database with 30 records, each with 9 columns. This database is partitioned with the factors $rdf = 10$ and $cdf = 3$. This generates nine partitions, ranging from [0, 0] to [2, 2]. Each partition contains partial data; e.g., [0, 0] contains only the first 3 columns of the first 10 records in the form of a stream of bytes. The number of partitions depends upon the dividing factors: the smaller the value of a dividing factor, the larger the number of partitions, and the larger the value, the smaller the number of partitions. The values of the database dividing factors are decided by the database owner/data administrator.


MAT is inspired by the method used to locate an element of a 2-D matrix in memory (RAM). It identifies the partition number and the local record number within the partition for a given (Row No., Column No.) of a record. MAT calculates the partition number and record number using the following procedure:

$Sr.No. / rdf = (Q_r, R_r)$, where $Q_r$, $R_r$ are the quotient and remainder record-wise, respectively. The quotient $Q_r$ is the partition id and the remainder $R_r$ is the local record id in the partition.

$Column\ No. / cdf = (Q_c, R_c)$, where $Q_c$, $R_c$ are the quotient and remainder column-wise, respectively. The quotient $Q_c$ is the partition id and the remainder $R_c$ is the local column id in the partition. Thus, the partition id $[Q_r, Q_c]$ and the local record id $[R_r, R_c]$ are obtained for the $(Sr.No., Column\ No.)$ record. MAT stores the information in the form of streams of bytes arranged in linked lists, for security reasons and ease of access; insertion, deletion and update in the linked lists are easy.

[Figure 4.3 shows a dummy 30-record, 9-column relational table (Sr.No, PAN No, Name, Address, Age, XY, YZ, AB, CD) divided into a 3 x 3 grid of partitions labelled [0, 0] through [2, 2].]

Figure 4.3 Logical View of Database Partitioning with $rdf = 10$, $cdf = 3$
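The MAT lookup procedure amounts to two integer divisions. The following is a minimal sketch, assuming 0-based $Sr.No.$ and column indices; the function name `mat_locate` is hypothetical, not from the thesis.

```python
def mat_locate(sr_no, column_no, rdf, cdf):
    """Return the partition id [Q_r, Q_c] and the local position [R_r, R_c]
    of a record cell, given the dividing factors rdf and cdf."""
    q_r, r_r = divmod(sr_no, rdf)        # Sr.No. / rdf  = (Q_r, R_r)
    q_c, r_c = divmod(column_no, cdf)    # Column No. / cdf = (Q_c, R_c)
    return (q_r, q_c), (r_r, r_c)
```

For the dummy database of Figure 4.3 ($rdf = 10$, $cdf = 3$), record 12, column 4 falls in partition [1, 1] at local position (2, 1).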


4.5.2 Database Partitioning

Records up to a fixed value of the primary key are selected for placement in each partition (decided at the time of partitioning), i.e., each partition contains the range of records up to a fixed value of the primary key. The data items between the starting and ending primary keys reside in the particular partition, and the data items in each partition are indexed locally. The database is repartitioned in case of a 25% change in a partition due to insertion/deletion of records. Each partition of the database is replicated on a suitable number of sites (peers); the PI identifies the suitable peers with the help of the peer selection criterion {Section 4.5.4, Eqn (4.1)}.

The more replicas are used, the better the data availability of the partition. Only the owner of the data has the complete information of the database; for security reasons no other peer has this complete information. The scheme requires mutual coordination mainly between the TC, TPPs and RC.

Creation of Partitions Corresponding to the Database

Prior to the creation of the partitions, all decisions regarding the number of replicas per partition are to be finalized, and the addresses of the peers going to act as TC, TPPs and RC have to be identified. These peers are selected through the peer selection criteria of {Section 4.5.6}. The number of replicas per partition is decided on the basis of the required availability of the partitions (which depends on the required quality of service). The number of partitions may be controlled (increased or decreased) by choosing the values of $rdf$ and $cdf$. The operations involved in creating the partitions corresponding to the database are as follows:

1. Selection of TC among all active peers participating in the system, through

peer selection criteria.

2. Depending upon the environment and the requirements of the system, the number of replicas is decided to maintain the required level of availability.

3. rdf and cdf are decided as per the required number of partitions.

4. Mapping table structure is to be initialized and prepared simultaneously.

5. The target database (to be partitioned) is prepared to hold all the relevant

information used by the system (record wise), e.g., timestamp, updated, in use

etc.


6. The partitions are generated with the help of rdf and cdf.

7. Each partition of the database is stored on a number of selected peers (depending upon the level of required availability). The TPP module of the system is installed on the selected peers so that they may execute the received subqueries.

8. Addresses of replicas corresponding to each partition are stored into the

mapping table.

9. RC(s) are also decided depending upon the availability of peer(s) and RC

module of the system is installed on the selected peer(s) to compile results

received from TPPs and synchronize with all other modules.

4.5.3 Algorithm to Access the Partitioned Database

To access any data location in the database (stored in parts on a number of peers), a set of operations has to be performed. These operations depend on how the database is partitioned and stored on the peers. The operations to be performed at the various components of 3-TEM are as follows.

The global query is received after checking the authenticity of the requester. TC parses the received query and generates subqueries against the local partitions. The addresses of all replicas corresponding to each partition id are identified through the mapping table. The address of the RC is also identified depending upon the peer selection criterion and is shared with the TPPs so that the partial results can be sent to the RC. Each TPP executes the received subqueries and updates its partition with the corresponding information packet received from the TC. RC collects the responses from all available replicas corresponding to each participating partition and consolidates the partial results. This result is further shared with the TC, and the various signals required to confirm the completion of the process are exchanged. The following steps are performed by 3-TEM.


Operations to Access the Partition at TC

1. The authenticity of and privileges granted to the requester are analyzed, and the query is received from the authenticated requester. A token number corresponding to each query is provided to the requester. The address of the RC, from which the results corresponding to the query will be received, is also shared with the requester.

2. TC resolves the incoming global query and identifies the position of the record in the global database to be accessed, i.e., $Sr.No.$. The primary key of the record may be used to identify the $Sr.No.$ against the global schema. Conventional methods may be used to resolve the query (as per the primary key).

3. This record may be accessed from various partitions stored at remote locations. The partition ids are calculated, i.e., from where the record is to be accessed.

(i) Row check: $Sr.No. / rdf = (Q_r, R_r)$.

Case I. If all columns/fields of the record are to be accessed, then the partitions with ids $(Q_r, i)$, $i = 0 \ldots n-1$, are the partitions to be accessed.

Case II. If a specific column corresponding to the specified row is to be accessed:

Column check: $Column\ No. / cdf = (Q_c, R_c)$.

The $(R_r, R_c)$th position of the $(Q_r, Q_c)$th partition holds the record with the $Sr.No.$ to be accessed.

(ii) The addresses of all available replicas corresponding to each partition are identified from the mapping table (addresses of the TPP(s)).

(iii) Information packets for the TPPs are prepared, i.e., PacketTPP[1,0], PacketTPP[1,1], PacketTPP[1,2], …, and so on, for all partitions that are actually going to be accessed.

(iv) The information packets are sent to all the available replicas (TPPs) corresponding to the selected partitions.


(v) The result information packet for the RC(s) is prepared, i.e., PacketRC[], which includes the operation to be performed at the TPPs, the number of replicas for each partition, the partial results/acknowledgments expected from the TPPs, the query ids and timestamps, and information about the requester.

(vi) The result information packets are sent to the RC(s).

Operations at TPP:

After receiving the information packet from the TC, the TPP checks the serial number ($Sr.No.$) of the record to be accessed in its local database, i.e., the position of the record.

(i) The information packet is analyzed for the operation to be performed and the data required for performing the operation.

(ii) The proper position of the required record within the local database of the corresponding replica is calculated through the supplied $R_r$ and $R_c$.

(iii) The specified operation is performed on the data.

(iv) The partial data/acknowledgment is sent to the RC(s) in the form of result packets after completion of the operation.

(v) An acknowledgement is sent to the TC for successful access of the data items.

Operations at RC:

After receiving the information packets from the TC, the RC prepares a store to hold the partial results/acknowledgements expected in the form of result packets from the TPPs. The store is prepared to hold all result packets from the multiple replicas corresponding to each partition of the database.


(i) The result information packet for the RC is analyzed for the operation to be performed and the space required for holding the results/acknowledgements corresponding to each replica.

(ii) The RC waits until the result packets are received from the TPPs.

(iii) Each partial result/acknowledgement from the TPPs is positioned at its proper place so that the partial results can be compiled into the final result corresponding to the global query.

(iv) RC collects the process completion messages from each TPP and compiles them. It also sends a process completion message to the TC.

(v) The compiled result is placed in the result pool.

The result is handed over to the authenticated requester after matching its corresponding token number.
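The RC's store for partial results can be sketched as follows. This is an illustrative sketch, not the thesis implementation: the class name `ResultStore` and its methods are hypothetical, and freshness is decided here simply by keeping the highest-timestamped partial result per partition.

```python
class ResultStore:
    """Holds partial results per partition until they can be compiled."""

    def __init__(self, replicas_per_partition):
        # Expected replica count per partition, taken from the TC's
        # result information packet, e.g. {partition_id: count}.
        self.expected = replicas_per_partition
        self.received = {pid: [] for pid in replicas_per_partition}

    def add_partial(self, partition_id, timestamp, payload):
        """Record a (timestamp, payload) result packet from a TPP replica."""
        self.received[partition_id].append((timestamp, payload))

    def complete(self):
        """True once at least one replica has reported for every partition
        (a real RC might wait for all expected replicas)."""
        return all(len(parts) >= 1 for parts in self.received.values())

    def compile(self):
        """Keep the freshest (highest-timestamped) partial result per
        partition, mirroring the RC's timestamp-based comparison."""
        assert self.complete()
        return {pid: max(parts)[1] for pid, parts in self.received.items()}
```

A usage pattern: the RC calls `add_partial` as result packets arrive, polls `complete()`, then calls `compile()` and places the result in the result pool.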

4.5.4 Peer Selection Criterion

The following equation can be used to compute the candidature of peers. This candidature is further utilized to select the peers for holding the replicas.

$Cd_i = AST_i \times w_1 + FSA_i \times w_2 + CPA_i \times w_3 + BD_i \times w_4 + Cr_i \times w_5$ … (4.1)

where

$Cd_i$: Candidature of a peer $P_i$ to hold the replica.

$AST_i$: Average Session Time for which the peer $P_i$ is active in the system.

$FSA_i$: Free Space Available at peer $P_i$.

$CPA_i$: Computation Power Available at peer $P_i$.

$BD_i$: Bandwidth with which peer $P_i$ is connected to the system/network.

$Cr_i$: Cardinality of peer $P_i$ (number of connections with peer $P_i$).

$w_1, w_2, w_3, w_4, w_5$: weights for adjusting the importance of the parameters.

Two parameters are considered while selecting a peer for storing the replicas. The first is the candidature of the peer, which is a measure of how capable the peer is of storing a replica. The second is the distance of each participating peer in the system, which is a measure of the cost spent in sending and receiving messages between the peers and the centre peer; for an efficient system this distance should be minimized. A priority queue is utilized to store the best peers, i.e., those with the largest candidature among all peers. The length of the priority queue is equal to double the number of peers required in the system (the length may be varied depending upon system requirements). All these peers are the best suited to store replicas among all participating peers in the overlay.
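Eqn (4.1) and the priority-queue selection can be sketched as follows. The weights and peer statistics are illustrative, `select_peers` is a hypothetical name, and for brevity the sketch returns the top $k$ peers directly rather than maintaining a queue of length $2k$ as the text describes.

```python
import heapq

def candidature(peer, w):
    """Cd_i = AST_i*w1 + FSA_i*w2 + CPA_i*w3 + BD_i*w4 + Cr_i*w5 (Eqn 4.1)."""
    return (peer["AST"] * w[0] + peer["FSA"] * w[1] + peer["CPA"] * w[2]
            + peer["BD"] * w[3] + peer["Cr"] * w[4])

def select_peers(peers, weights, k):
    """Return the k peers with the largest candidature, using a heap-based
    selection over the weighted score."""
    return heapq.nlargest(k, peers, key=lambda p: candidature(p, weights))
```

With all weights equal, the peer with the largest raw statistics wins; tuning the weights shifts the emphasis, e.g. towards long average session times in a high-churn overlay.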

4.6 Simulation and Performance Study

To study the performance of 3-TEM integrated with MAT, some assumptions are made for the simulation. These assumptions are as follows.

4.6.1 Assumptions

A relational database is considered for the simulation. It is assumed that the database satisfies all the normal forms and is free from any anomaly. TC takes care of the ACID properties of the database. All subtransactions carry the same security level as the main transaction. The serializability of the transactions and subtransactions is maintained by the TC and all TPPs. Timestamps are used to compare the freshness of data items; each subtransaction carries the timestamp of its parent transaction. TC performs the data placement and decides the dividing factors and the type of database partitioning (horizontal, vertical, or both). It has the complete information of all the fields in the database (DB). TC manages the concurrent execution of processes.

4.6.2 Simulation Model

To evaluate the performance of 3-TEM, an event driven simulation model for a firm deadline real time distributed database system has been developed (Figure 4.4). This model is an extended version of the model defined in [164]. The model consists of an RTDDBS distributed over n peers connected by a secure network. The various components of the model are categorized into global and local components, described as follows:

Global Components: The Transaction Generator generates the transaction workload of the system with a specified mean transaction arrival rate. It also provides timestamps to the arrived transactions. The Transaction Manager models the execution behavior of transactions over the network and also resolves them into subtransactions. The Transaction Scheduler schedules the global as well as local subtransactions and makes them ready to dispatch; all global conflicts are resolved by it through timestamps. The Transaction Dispatcher dispatches the generated transactions, in order, to the network, and finally they reach the ready queue at the local peer. The Network Manager routes all the messages (traffic) among the peers. It also keeps track of the addresses of all available peers.

Local Components: Ready Queue: all arrived subtransactions that are ready to execute are initially placed in it, according to their priority. Subtransactions get CPU ticks one by one, in order of their priority. Wait Queue: holds the subtransactions which are blocked for any reason, e.g., conflicts over resources, concurrent execution of processes, etc. It holds the subtransactions until their corresponding conflicts are resolved. A transaction from the blocked queue also gets the CPU once it is ready to execute, i.e., once all its corresponding conflicts are resolved.

The Concurrency Control Manager (CCM) implements the Timestamp based Secure Concurrency Control Algorithm (TSC2A), described in Chapter 5. It manages and resolves the concurrent execution of processes; all conflicts over resources and processes are resolved with the help of the timestamps associated with them. The Local Scheduler is responsible for managing the locks for subtransactions. Depending on the CCM, it decides whether a lock-requesting subtransaction can be processed, blocked in the wait queue, or restarted. It schedules the subtransactions and controls the processes for the CPU. At any given time, the transaction with the highest priority gets the CPU, unless it is blocked by other transactions due to a lock conflict.

In a firm deadline system, transactions that have missed their deadlines are useless and are aborted from the system. The deadline of each transaction is checked before execution. Ready transactions wait for execution in the ready queue according to their priorities. Since main memory database systems can better support real time applications, it is assumed that the databases reside in main memory. A transaction requests a lock on data items before it executes on them. A restarted subtransaction releases all locked resources and is restarted from its beginning. A successfully committed subtransaction also releases all its locked resources. Finally, the Sink collects statistics on the completed transactions from the peer.

[Figure: the global components (Coordinator with Transaction Generator, Transaction Manager, Transaction Scheduler, Transaction Dispatcher, and the Network Manager) feed Peers 1-5; at each peer the local components (Ready Queue, Wait Queue, Concurrency Control Manager, Local Scheduler, Memory, Sink) process transactions through the states Arrival, Blocked, Computation, Commit, and Terminate.]

Figure 4.4 Simulation Model for 3-TEM

The transaction scheduler is responsible for managing the locks for transactions. Depending on the Timestamp based Secure Concurrency Control Algorithm (TSC2A), the transaction scheduler determines whether a lock-requesting transaction can be processed, blocked, or restarted. A restarted transaction releases all locked resources and is restarted from its beginning. A transaction, after successful commitment, releases all the locked resources. The deadlines of the firm real time transactions are defined based on the execution time of the transactions as:

T_Deadline = T_ArrivalTime + T_ExecutionTime × SF … (4.2)

Where:

T_ArrivalTime: time when a transaction arrives in the system.

SF: slack factor, a random variable uniformly distributed over the slack range.


T_ExecutionTime = (T_LockTime + T_ProcessTime + T_UpdateTime) × No. of Operations … (4.3)

Where:

No. of Operations: number of operations in the transaction.

T_LockTime: CPU time required to set a lock.

T_ProcessTime: CPU time required to process an operation.

T_UpdateTime: CPU time to update a data object (for write operations).
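As an illustration, eqns (4.2) and (4.3) can be evaluated as below. The per-operation millisecond costs are hypothetical (the thesis does not fix them), and the slack factor is applied multiplicatively, as in standard firm-deadline models, with the 5-20 range of Table 4.3:

```python
import random

def execution_time(num_ops, lock_ms=1.0, process_ms=2.0, update_ms=1.5):
    """Eqn (4.3): per-operation CPU cost times the number of operations.
    The millisecond costs here are illustrative, not from the thesis."""
    return (lock_ms + process_ms + update_ms) * num_ops

def deadline(arrival_ms, num_ops, slack_range=(5, 20)):
    """Eqn (4.2): arrival time plus execution time scaled by a slack
    factor drawn uniformly from the slack range (Table 4.3: 5-20)."""
    sf = random.uniform(*slack_range)
    return arrival_ms + execution_time(num_ops) * sf
```

A transaction arriving at time 0 with 4 operations thus receives a firm deadline between 5 and 20 times its expected execution time.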

4.6.3 Performance Metrics

The performance of the MAT integrated 3-TEM is evaluated and compared with other

existing systems through simulation. In the simulation we have used the performance

metrics defined in Table 4.1 and Table 4.2 and the performance parameters defined in

Table 4.3.

Table 4.1 Performance Metrics-I

• Network Load: a measure of the number of messages transferred in the network to propagate an update message through the system in the underlay topology. For an efficient logical structure, the network load should be low; a high network load creates congestion in the P2P network.

• Peer Availability: the total up time of an individual peer out of its total time.

• Partition Availability: the total up time of a group of peers out of its total time.


Table 4.2 Performance Metrics-II

The following performance metrics are used to evaluate the performance:

• Transaction Miss Ratio (TMR): the percentage of input transactions that are unable to complete before expiry of their deadline, over the total number of transactions submitted to the system. TMR = T_Missed / T_Total

• Transaction Restart Ratio (TRR): the percentage of transactions that are restarted for any reason, over the total number of transactions submitted to the system. TRR = T_Restart / T_Total

• Transaction Success Ratio (TSR): the percentage of transactions that are committed successfully within their deadline, over the total number of transactions submitted to the system. TSR = T_Success / T_Total

• Transaction Abort Ratio (TAR): the percentage of transactions that are aborted for any reason, over the total number of transactions submitted to the system. TAR = T_Abort / T_Total

• Throughput: the number of transactions successfully committed before their deadlines per unit time. Logical structures with high throughput can be utilized for high performance databases over P2P networks. Throughput = T_Committed / TotalTime

• Response Time: the time duration between a transaction being submitted and getting its first response from the system.
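The ratio metrics above reduce to simple counts collected at the Sink; a minimal sketch (parameter names are illustrative):

```python
def transaction_metrics(missed, restarted, succeeded, aborted, total, total_time_s):
    """Compute the Table 4.2 ratios from raw transaction counts."""
    return {
        "TMR": missed / total,         # miss ratio
        "TRR": restarted / total,      # restart ratio
        "TSR": succeeded / total,      # success ratio
        "TAR": aborted / total,        # abort ratio
        "throughput_tps": succeeded / total_time_s,  # commits per second
    }
```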


Table 4.3 Performance Parameters Setup

Name of Parameter     Default Settings                Description
Num_Peer              200                             Number of peers participating in the network
DB_Size               200 data items/database        Size of data items/database
Mean_Arrival_Rate     0.2-2.0                         Number of transactions/sec
Ex_Pattern            Sequential                      Transaction type (Sequential or Parallel)
Num_CPUs              1                               Number of processors per peer
SlackFactor           5-20, uniformly distributed     Transaction deadline time
Min_HF                1.2                             Threshold value of health factor
Commit_Time           40 ms                           Minimum time for commit processing
Num_Operation         3-8, uniformly distributed      Number of operations in a transaction
U_Carda               5-15, uniformly distributed     Number of connections of a peer in the underlay
O_Carda               5-10, uniformly distributed     Number of connections of a peer in the overlay
Latency               5-20, uniformly distributed     Latency of a connection between peers
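For a simulator implementation, these defaults can be captured directly in a configuration structure (a sketch; the names mirror Table 4.3, and ranges are the (low, high) bounds of uniform distributions):

```python
# Default simulation parameters from Table 4.3.
PARAMS = {
    "Num_Peer": 200,
    "DB_Size": 200,                    # data items per database
    "Mean_Arrival_Rate": (0.2, 2.0),   # transactions/sec, swept in experiments
    "Ex_Pattern": "Sequential",
    "Num_CPUs": 1,
    "SlackFactor": (5, 20),
    "Min_HF": 1.2,
    "Commit_Time_ms": 40,
    "Num_Operation": (3, 8),
    "U_Carda": (5, 15),
    "O_Carda": (5, 10),
    "Latency": (5, 20),
}
```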

4.6.4 Simulation Results

To evaluate the performance of MAT integrated 3-TEM, a series of experiments was performed. The simulation uses the performance metrics defined in Table 4.1 and Table 4.2 and the performance parameters defined in Table 4.3.

Figure 4.5 shows that the availability of an individual partition is higher than the overall system availability. It is also observed from the graph that, with 4 replicas per partition, partition availability reaches the acceptable range of approximately 0.7 at a peer availability of 0.35 for individual partition availability, and at 0.55 for system availability. To achieve availability at the acceptable level of 0.95, 6-7 peers with more than 0.7 availability each may be used for the system.
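Under the usual independence assumption (not stated explicitly in the thesis), a replicated partition is available whenever at least one of its replicas is up, which gives a quick way to check the 0.95 target:

```python
def partition_availability(peer_avail, replicas):
    """P(at least one of `replicas` independent peers is up)."""
    return 1.0 - (1.0 - peer_avail) ** replicas
```

With 6 peers of availability 0.7 each, the partition comfortably exceeds the 0.95 target mentioned above.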

[Figure: partition availability (0-1) plotted against peer availability (0-1), with curves for Individual Availability and System Availability.]

Figure 4.5 Relationship between Peer Availability vs. Partition Availability

Figure 4.6 shows that the throughput initially increases with increase in the mean transaction arrival rate (MTAR). After reaching its peak it starts decreasing with further increase in MTAR. At the MTAR value where the throughput peaks, the system is at its peak performance. It is also observed that 1-TEM produces its best performance at MTAR values in the range 1-1.2, whereas 3-TEM produces its best at an MTAR value of 1.4. The 3-TEM bears extra load and executes more transactions per second than 1-TEM. The throughput of 3-TEM is higher than that of the 1-TEM system for all values of MTAR, because the 3-TEM system takes a smaller span of CPU time for the execution of a transaction.

Figure 4.7 shows that the response time is the same for 1-TEM (the conventional execution model) and 3-TEM. It is also observed that the response time increases with the number of partitions. This may be due to network delay, which increases with the number of partitions.

[Figure: throughput (tps, 0-8) plotted against MTAR (0-2), with curves for 1-TEM and 3-TEM.]

Figure 4.6 Relationship between Throughput vs. Mean Transaction Arrival Rate

[Figure: response time (ms, 0-70) plotted against the number of partitions (0-14), with curves for 1-TEM and 3-TEM.]

Figure 4.7 Relationship between Numbers of Partitions vs. Response Time

Figure 4.8 shows that the Query Completion Ratio is initially high and then decreases with increase in the value of MTAR. For smaller values of MTAR, the system has sufficient time and resources to execute the small number of queries arriving per second. The Query Completion Ratio in the case of 3-TEM is always higher than that of 1-TEM. It starts decreasing near an MTAR value of 1.4, compared to 1 in the case of 1-TEM. This indicates that 3-TEM completes more transactions than 1-TEM and bears a higher load (more transactions per second) than 1-TEM.

[Figure: Query Completion Ratio (0-1) plotted against MTAR (0-2), with curves for 3-TEM and 1-TEM.]

Figure 4.8 Relationship between Mean Transaction Arrival Rate vs. Query Completion Ratio

Figure 4.9 shows that the miss ratio increases with the value of MTAR. It is also observed that after a certain value of MTAR the miss ratio increases rapidly; this value is 1.6 in the case of 3-TEM and 0.6 in the case of 1-TEM. This is because beyond a certain MTAR, the number of transactions in the system grows beyond the execution rate. Most resources are occupied by transactions, and the dependency among the resources also increases, resulting in more transactions getting blocked in the wait queue or aborted.

From Figure 4.10 it is observed that the restart ratio of transactions increases with the value of MTAR and starts decreasing after some value of MTAR, which is 1.2 in the case of 1-TEM and 1.4 for 3-TEM. The peak in the restart ratio shows that beyond that value transactions do not have sufficient time to execute, i.e., the system refuses to allocate resources to transactions in the wait queue, because the remaining time available to them is too short to meet their deadlines. It is also observed that the restart ratio of 3-TEM is much lower than that of 1-TEM, because 3-TEM executes the subtransactions in parallel and resources are freed by the stages after executing, becoming readily available for the next subprocess.

[Figure: miss ratio (0-0.7) plotted against MTAR (0-2), with curves for 1-TEM and 3-TEM.]

Figure 4.9 Relationship between Mean Transaction Arrival Rate vs. Miss Ratio

[Figure: restart ratio (0-0.9) plotted against MTAR (0-2), with curves for 1-TEM and 3-TEM.]

Figure 4.10 Relationship between Mean Transaction Arrival Rate vs. Restart Ratio

Figure 4.11 shows that the abort ratio starts increasing with the value of MTAR. After a certain value of MTAR the abort ratio starts increasing rapidly; this value is 0.6 for 3-TEM and 0.8 for 1-TEM. The abort ratio for 3-TEM is much lower than for 1-TEM. Thus, the resource utilization in 3-TEM is higher than in 1-TEM.

[Figure: abort ratio (0-0.8) plotted against MTAR (0-2), with curves for 1-TEM and 3-TEM.]

Figure 4.11 Relationship between Mean Transaction Arrival Rate vs. Abort Ratio

4.7 Advantages of 3-TEM

In 3-TEM the complete conventional execution process is divided into small processes, i.e., TC, TPP and RC, which take fewer CPU ticks to execute. Such small-tick processes are well suited to a dynamic P2P environment, owing to the small session times of participating peers. Even peers with small session times may execute to completion the subprocesses that require few CPU ticks. The parallel execution may increase throughput approximately three times; the speedup factor approaches three when a large number of subprocesses are to be executed. The possibility of wasted time due to a participating peer leaving is lower with small-tick processes. The distribution of work reduces the effect of a single point of failure at the TC level. The dependency on the head peer is reduced because the responsibilities are distributed among the TC and RC.

4.8 Discussion

From the simulation results it is observed that partition availability reaches the acceptable range of more than 0.7 with an individual peer availability of 0.35. To achieve availability at the acceptable level of 0.95, 6-7 peers with more than 0.7 availability each may be recommended for the system. It is observed that the 1-Tier Execution Model (1-TEM) produces its best performance at MTAR values from 1 to 1.2, whereas for 3-TEM this range is 1.2 to 1.4. Thus, 3-TEM can bear extra load, i.e., it can execute more transactions per second than 1-TEM. The throughput is always higher for all values of MTAR in the case of 3-TEM, because a smaller span of CPU time is required to execute a transaction; 3-TEM executes the subtransactions in parallel.

The response time increases with the number of partitions; higher network delay is responsible for this. The query completion ratio for 3-TEM is always higher than for 1-TEM, because of parallelism and the resource availability in the system. Resource wastage is very low for 3-TEM compared with 1-TEM, because the restart ratio and miss ratio for 3-TEM are always lower than those of 1-TEM. Thus, resource utilization is higher for 3-TEM. Resource availability in the system is also higher for 3-TEM: because of the small subprocesses, a process holds a resource only for a short duration of execution. The reduced load at the TC also helps in improving the performance of 3-TEM.

4.9 Summary

In this chapter the 3-Tier Execution Model (3-TEM) is presented, which divides the complete execution process into three independent stages. It provides higher throughput and greater load-bearing capability than 1-TEM. It provides a high query completion ratio with comparable response time and enhances the resource utilization of the system. The 3-TEM is integrated with MAT. MAT partitions the database horizontally, vertically or both. It places small partitions of the database over the P2P network and supports all operations on the partitioned database. It also provides primary security to the database by dividing the database into small partitions. The possibility of misusing the information is also reduced, because only small, partial information is stored at each peer.

The next chapter presents the timestamp based concurrency control algorithm for distributed databases over P2P networks.


Chapter 5

Concurrency Control in Distributed Databases over P2P Networks

A Real Time Distributed Database System (RTDDBS) is one in which a global database is partitioned into a collection of local databases stored at different sites. Distribution of real time data over a network always raises issues of concurrency, security and time-bounded response. In this chapter we present a Timestamp based Secure Concurrency Control Algorithm (TSC2A). It maintains the security of data and time-bounded transactions along with controlled concurrency in the system.

The rest of the chapter is organized as follows. Section 5.1 presents the introduction. Section 5.2 describes the system model. Section 5.3 gives the transaction model. Section 5.4 discusses serializability of transactions. Section 5.5 presents the Timestamp based Secure Concurrency Control Algorithm (TSC2A). Section 5.6 presents the simulation and performance study. The findings are discussed in Section 5.7, and finally the chapter is summarized in Section 5.8.

5.1 Introduction

In a Real Time Distributed Database System (RTDDBS), multiple database sites are linked by a communication system in such a way that the data at any site is available to users at other sites. Such a system has several characteristics: (1) a transparent interface between users and data sites; (2) the ability to locate the data; (3) a Database Management System (DBMS) to process queries; (4) distributed concurrency control and recovery procedures over the network; and (5) mediators which translate queries and data between heterogeneous systems.

A Secure Real Time Distributed Database System (SRTDDBS) consists of security classes and restricts database operations based on security levels. It secures each transaction and data item in the system. The security level of a transaction represents its clearance, security and classification levels. Concurrency control is an integral part of a database system. It is used to manage the concurrent execution of different transactions on the same data item without violating consistency.

Communication in a distributed system is complex and rapidly changing. There are many different links, channels, or circuits over which data may travel on the network. In addition, several issues must be considered when transferring data/transactions across the network, viz., illegal information flows through covert channels, security of transactions, concurrency control over concurrent transactions, route establishment time, end-to-end network delay, network bandwidth (transfer rate), fault tolerance and reliability, etc. We concentrate on the covert channel problem, to avoid illegal information flow, and on a timestamp based concurrency control algorithm for executing concurrent transactions.

5.2 System Model

In an SRTDDBS a global database is partitioned into a collection of local databases stored at different sites. The system consists of a set of n_s sites. Each site N_i holds a secure database, which is a partition of the global database scattered over all the n_s sites. Each peer has an independent processor connected through communication links to the other peers.

A global transaction T_r is generated by the user and submitted to the system for execution. A secure distributed database is defined as a five-tuple <D_t, O_p, T_s, S_c, L_v>, where D_t is the set of data items and O_p is the set of operations for the SRTDDBS, defined as O_p = {O_p^1, O_p^2, O_p^3, ..., O_p^k}, where k is a number less than the maximum number of operations defined for the transactions. A global transaction T_r can be divided into i subtransactions, T_r = {t_r^1, t_r^2, t_r^3, ..., t_r^i}. The coordinator assigns a timestamp t_s to each transaction at the time of its arrival into the system. All transactions are ordered in ascending order of their timestamps. A subtransaction carries the timestamp of its corresponding parent transaction T_r. S_c is the partially ordered set of security levels with an ordering relation ≤, and L_v is a mapping from D_t ∪ T_s to S_c. A security level S_c^i is said to dominate a security level S_c^j iff S_c^j ≤ S_c^i. A subtransaction t_r^i can precede a subtransaction t_r^j if the timestamp t_s^i of t_r^i is smaller than the timestamp t_s^j of t_r^j, i.e., t_s^i < t_s^j.

For every data object x ∈ D_t, L_v(x) ∈ S_c, and for every t_r^i ∈ T_r, L_v(t_r^i) ∈ S_c. Each secure database N is also mapped to an ordered pair of security classes L_v^min(N) and L_v^max(N), where L_v^min(N), L_v^max(N) ∈ S_c and L_v^min(N) ≤ L_v^max(N). In other words, every secure database in the distributed system has a range of security levels associated with it. A data item x is stored in a secure database N if it satisfies the condition L_v^min(N) ≤ L_v(x) ≤ L_v^max(N). Similarly, a distributed transaction T_r is executed at N if it satisfies the condition L_v^min(N) ≤ L_v(T_r) ≤ L_v^max(N). A site N_i is allowed to communicate with another site N_j iff L_v^max(N_i) = L_v^max(N_j). The security policy used is based on the Bell-La Padula model [190] and enforces the following restrictions:

Simple Security Property: A transaction (subject) T_r is allowed to read a data item (object) x iff L_v(x) ≤ L_v(T_r).

Restricted Property: A transaction T_r is allowed to write a data item x iff L_v(x) = L_v(T_r). Thus, a transaction can read objects at its level or below, but it can write objects only at its own level. A transaction with a low security level is not allowed to write higher security level data objects. This restriction supports database integrity. In addition to these two requirements, a secure system must guard against illegal information flows through covert channels.
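A minimal sketch of these two access checks (integer security levels are an assumption for illustration; the model only requires a partial order):

```python
def can_read(txn_level, item_level):
    """Simple Security Property: read allowed iff L(x) <= L(T)."""
    return item_level <= txn_level

def can_write(txn_level, item_level):
    """Restricted Property: write allowed iff L(x) == L(T)."""
    return item_level == txn_level
```

A level-2 transaction can thus read level-1 and level-2 items, but write only level-2 items.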

5.3 Transaction Model

A user at any peer may issue a global transaction against a global schema. This schema may be made accessible to all users by one of the following configurations:

(a) Replicate a global schema on all peers.

(b) Select the number of peers (coordinators) to maintain copies of the global

schema and the global transaction manager, and direct requests for a global

schema to the nearest coordinator.

(c) Select only one peer (called the coordinator) to maintain the global schema and

the global transaction manager, and direct requests for a global schema to that

peer.


For the proposed algorithm, the second and third configurations are favored over the first, because it is difficult to maintain a copy of the global schema at every peer, and doing so hinders the expandability and simplicity of the system. In case (c) the coordinator solves the problem of assigning timestamps, being responsible for assigning timestamps to all global transactions. Case (c) is considered for the implementation of TSC2A.

5.4 Serializability of Transactions

To handle the concurrent execution of transactions in the system, serializability is enforced at the global and local levels, i.e., by the coordinator and the TPPs, respectively. To maintain global serializability, the coordinator, with the help of the data manager, uses the timestamp information available in the DAT. All global transactions enforce serializability by default, owing to the timestamp carried by each global transaction. To maintain local serializability at a TPP, the timestamps inherited from the global transactions are utilized. The coordinator sends subtransactions in order to the TPPs through communication links. When subtransactions are not received in order at a TPP, due to communication delay, path failure, etc., the TPP itself arranges these subtransactions in order, depending upon the associated timestamps. A subtransaction may be blocked after a subtransaction with a lower timestamp is received; these blocked transactions may be restarted at their turn in the order. Hence, local serializability is also guaranteed by this mechanism.

Let T_r be the set of global transactions to be executed. A transaction from T_r is resolved into subtransactions t_r. The subtransactions are executed such that, if a subtransaction t_r^i precedes a subtransaction t_r^j in this ordering, then for every pair of atomic operations O_p^i and O_p^j, from t_r^i and t_r^j respectively, O_p^i precedes O_p^j in each local schedule. The execution of subtransaction t_r^j can be blocked after t_r^i is received by the TPP, which yields local serializability. Therefore, if the Coordinator submits subtransactions in a serializable order to a TPP, then the TPP executes the subtransactions in serializable order, guaranteeing overall serializability in the system.
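The reordering a TPP performs on late-arriving subtransactions can be sketched with a heap keyed on the inherited timestamp (a simplified model; a real TPP also handles blocking and restarts):

```python
import heapq

class TPPQueue:
    """Orders incoming subtransactions by parent-transaction timestamp."""
    def __init__(self):
        self._heap = []

    def receive(self, timestamp, subtxn_id):
        # Subtransactions may arrive out of order due to network delay.
        heapq.heappush(self._heap, (timestamp, subtxn_id))

    def next_in_order(self):
        # Always execute the pending subtransaction with the smallest
        # timestamp first, restoring the coordinator's serializable order.
        return heapq.heappop(self._heap)
```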

5.5 A Timestamp Based Secure Concurrency Control Algorithm (TSC2A)

Timestamps are normally used to manage the sequence of execution of transactions in a distributed system. A global concurrency control algorithm designed using timestamps maintains the transactions in serializable order for execution and achieves authenticated results. The order of transactions in timestamp based concurrency control depends entirely upon the read/write timestamps associated with each data item and the timestamp of the submitted transaction. Serializability plays the major role in sequencing the transactions.

5.5.1 Algorithm for Write Operation

TSC2A specifies the sequence of operations performed to execute read/write transactions in the system. Prior to execution, the algorithm checks the timestamps assigned to the data items and compares them with the timestamp of the requesting transaction, in order to manage concurrent processes in the system. The steps involved in the execution of a write operation are as follows.

Algorithm for a write operation on data item x requested by subtransaction S_i with timestamp Ts_i:

If (RTs(x) > Ts_i)
{
    Abort(S_i);
}
Else
{
    If (WTs(x) > Ts_i)
    {
        Ignore(S_i);
    }
    Else
    {
        If (L_v(x) == L_v(S_i))
        /* L_v(x) and L_v(S_i) are the security levels of data item x
           and transaction S_i, respectively */
        {
            Writelock(x);
            Execution(x);
            WTs(x) = Ts_i;
            Update DAT to Ts_i;
        }
        Else
        {
            Abort(S_i); /* access denied due to security */
        }
    }
} // end Algorithm

5.5.2 Algorithm for Read Operations

The steps involved in the execution of a read operation are as follows.

Algorithm for a read operation on data item x requested by subtransaction S_i with timestamp Ts_i:

If (WTs(x) > Ts_i)
{
    Abort(S_i);
    Rollback(S_i);
}
Else
{
    If (L_v(x) <= L_v(S_i))
    {
        Readlock(x);
        Execute(x);
        RTs(x) = Ts_i;
        Update DAT to Ts_i;
    }
    Else
    {
        Abort(S_i);
        Rollback(S_i);
    }
} // end Algorithm
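A runnable sketch of the two TSC2A checks combined (locking, DAT updates and rollback are reduced to returned outcome strings, and integer security levels are assumed for illustration):

```python
def try_write(item, txn_ts, txn_level):
    """TSC2A write rule: abort on a later read, ignore on a later write,
    then require an exact security-level match (restricted property)."""
    if item["rts"] > txn_ts:
        return "abort"        # a younger transaction already read x
    if item["wts"] > txn_ts:
        return "ignore"       # obsolete write, safely skipped
    if item["level"] != txn_level:
        return "abort"        # access denied due to security
    item["wts"] = txn_ts      # execute and stamp the write
    return "written"

def try_read(item, txn_ts, txn_level):
    """TSC2A read rule: abort on a later write, then allow reads at the
    transaction's level or below (simple security property)."""
    if item["wts"] > txn_ts:
        return "abort"
    if item["level"] > txn_level:
        return "abort"
    item["rts"] = txn_ts      # execute and stamp the read
    return "read"
```

For example, after a level-1 item is written at timestamp 5, a read request with timestamp 3 aborts, while one with timestamp 6 succeeds.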

Global transactions are not likely to be rolled back frequently. But when a global subtransaction is rolled back by TSC2A, all subtransactions corresponding to its global transaction are rolled back. TSC2A enhances the execution autonomy of the TPPs by rolling back a global transaction at the coordinator site before sending its subtransactions to the relevant TPPs. This job is normally done by the DAT.

5.6 Simulation and Performance Study

To study the characteristics of TSC2A, we have implemented it on the 3-TEM system described in Section 4.4. It is an event driven simulation model for a firm deadline real time distributed database system. The model consists of an RTDDBS distributed over peers connected by a network, with data partitions replicated over multiple peers. The execution model and architecture of 3-TEM are shown in Figure 4.1 and Figure 4.2, respectively. The deadlines of the firm real time transactions are defined based on the execution time of the subtransactions using eqns (4.2) and (4.3) given in Section 4.6.2.

5.6.1 Performance Metrics

The performance of TSC2A is evaluated and compared through simulation for three cases, viz., low, medium and high security. In the simulation we have used the performance parameters defined in Table 4.3 and the performance metrics defined in Table 4.2 (Chapter 4) to study the performance of TSC2A.

5.6.2 Assumptions

The following assumptions are made during the implementation of the TSC2A:


• Arrivals of transactions at a peer are independent of the arrivals at other sites.

• The model assumes that each global transaction is assigned a unique identifier.

• Each global transaction is decomposed into subtransactions to be executed by the TPPs.

• Subtransactions inherit the identifier of the global transaction.

• No site or communication failure is considered.

• To execute, a transaction requires the use of the CPU and the data items located at a peer.

• A communication link is used to connect the peers.

• There is no global shared memory in the system, and all peers communicate via message exchange over the communication links.

• Each transaction is assigned a globally distinct real time priority by using a specific priority assignment technique; earliest deadline first is used in the simulation.

• The cohorts of a transaction are activated at the corresponding TPPs to perform the operations.

• A distributed real time transaction is said to commit if the coordinator has reached the commit decision before the expiry of the deadline.

• Each cohort makes a series of read and update accesses.

• A transaction already in the dependency set of another transaction, or already having another transaction in its dependency set, cannot permit another incoming transaction to read or update.

• Read accesses involve a concurrency control request to obtain access, followed by a disk I/O to read, followed by CPU usage for processing the data item.

5.6.3 Simulation Results

To evaluate the performance of TSC2A, a series of simulation experiments is performed. Three security levels are considered in the simulation, i.e., low, medium and high. We report the important results obtained from these experiments.

From Figure 5.1, it is observed that more transactions miss their deadlines as MTAR increases. This increase in miss ratio occurs because a growing number of transactions wait for their turn to be scheduled on the CPU of a peer. It is also observed that the miss ratio is higher for high security transactions than for the other two security levels, due to the interaction of high security transactions with low security transactions. This shows that implementing high security transactions in a distributed database system may compromise the miss ratio, and the throughput of these high security transactions is correspondingly reduced.

From Figure 5.2 it is observed that the transaction restart ratio first increases with the value of MTAR. After reaching a peak, the restart ratio decreases for further increases in MTAR; the peak is the point where the maximum number of transactions is restarted. The subsequent decrease occurs because transactions no longer have sufficient remaining time to execute before their deadlines expire, so the abort ratio is high at this point. Hence, the restart ratio keeps decreasing beyond this value of MTAR, which differs for each security level. The restart ratio is highest for high security transactions, because they are restarted while waiting for high security level data items.

[Plot: miss ratio (0–1) vs. MTAR (0–2) for high, medium and low security transactions.]

Figure 5.1 Comparison between Miss Ratio of Transactions and Mean Transaction Arrival Rate (MTAR)


[Plot: restart ratio (0–1.2) vs. MTAR (0–2) for high, medium and low security transactions.]

Figure 5.2 Comparison between Transaction Restart Ratio and MTAR

It is observed from Figure 5.3 that the success ratio initially increases for all three security levels, and after a certain value of MTAR it starts decreasing. The variation in this particular value is due to the transaction execution rate, which is higher for low security transactions than for medium and high, and higher for medium than for high security transactions. The system saturates at a higher value of MTAR for low security transactions than for medium and high security transactions, because transactions must wait in the queue when resources are unavailable. The success ratio starts decreasing after a certain value of MTAR: up to that point the system has sufficient resources and time to execute transactions at the rate they arrive, but beyond it the wait queue grows because the arrival rate exceeds the completion rate. Under this load all resources become busy and transactions must wait further for locks on resources; thus the success ratio decreases with further increase in MTAR. The low security transactions have the highest success ratio among the three security levels.


[Plot: success ratio (0–0.8) vs. MTAR (0–2) for high, medium and low security transactions.]

Figure 5.3 Comparison between Transaction Success Ratio and MTAR

From Figure 5.4 it is observed that the transaction abort ratio increases with the mean transaction arrival rate in all three cases: high, medium and low security levels. The abort ratio is highest for high security transactions compared to the other security levels, because higher priority is given to low security level transactions: whenever a data conflict occurs between a high security level and a low security level transaction, the high security level transaction is aborted and restarted after some delay.

[Plot: abort ratio (0–0.8) vs. MTAR (0–2) for high, medium and low security transactions.]

Figure 5.4 Comparison between Transaction Abort Ratio and MTAR


Figure 5.5 shows the transaction throughput as a function of MTAR per peer. The throughput of TSC2A initially increases with the arrival rate and then decreases as the arrival rate increases further. The peak values correspond to the transaction arrival workload the system can bear; this value differs across the three cases and is highest for low security transactions. The overall throughput of high security transactions is always less than that of medium and low security level transactions, and the throughput of high security transactions falls further below that of low security transactions as the arrival rate increases.

[Plot: throughput in tps (0–160) vs. MTAR (0–2) for high, medium and low security transactions.]

Figure 5.5 Comparison between Throughput and MTAR

5.7 Discussion

From the simulation results it is observed that the restart ratio degrades the performance of the system, both in the time required by the database coordinator to reset the database to its previous state and in the computation time of individual transactions. This degradation increases with the number of concurrent transactions. Restarts increase because of the time required to obtain permission to access high security data items and because of other conflicts over resource locks. The transaction arrival rate should therefore be managed so that the abort and restart ratios are minimized and the throughput is maximized.

The transaction execution rate is higher for less secure transactions than for medium and high security transactions. The system starts saturating at a different value of MTAR in each of the three cases, and this value is highest for low security transactions. The load bearing capacity of the system also varies with the security levels used for the transactions and the data items: in terms of transaction execution rate, it is higher for low security transactions than for medium and high security ones.

It is also observed that the throughput of the system decreases as the security level of the transactions increases, because the probability of successfully executing a transaction decreases; there is a tradeoff between the security level and the throughput of the system.

5.8 Summary

In this chapter, we have presented the Timestamp based Secure Concurrency Control Algorithm (TSC2A). The algorithm takes care of the security of transactions and of the data items stored at various peers and provided to the transactions. It also controls the flow of high security data items accessed by low security transactions.

TSC2A secures data items and transactions through security levels and restricts data access across these levels. It also avoids the covert channel problem in accessing the database. TSC2A ensures serializability in the execution of a transaction, enforcing the serializability property at the global (TC) and local (TPPs) levels of the system.

In the next chapter we discuss a topology adaptive traffic controller for P2P networks.


Chapter 6

Topology Adaptive Traffic Controller for P2P Networks

In structured and unstructured P2P systems, frequent joining and leaving of peers causes a topology mismatch between the P2P logical overlay network and the physical underlying network. When a peer communicates with another in the overlay topology, the exchanged message travels a multiple hop distance in the underlay topology. A large portion of the redundant traffic is caused by this mismatch between the overlay and underlay topologies, which makes unstructured P2P systems unscalable.

This chapter presents a Common Junction Methodology (CJM) to reduce the overlay traffic at the underlay level. It finds a common junction between available paths and routes the traffic through this common junction, avoiding the longer conventional paths. Simulation results show that CJM resolves the mismatch problem and significantly reduces redundant network traffic. The methodology works for both structured and unstructured P2P networks and reduces the response time by approximately up to 45%. It does not alter the overlay topology and performs without affecting the search scope of the network.

The rest of the chapter is organized as follows. Section 6.1 presents the introduction. Section 6.2 discusses the system model and Section 6.3 the system architecture. Section 6.4 presents CJM. The simulation and performance study is given in Section 6.5, Section 6.6 gives the advantages of CJM, a discussion is provided in Section 6.7, and finally the chapter is summarized in Section 6.8.

6.1 Introduction

A Peer-to-Peer (P2P) network is an abstract, logical network called an overlay network. Instead of strictly decomposing the system into clients (which consume services) and servers (which provide services), each peer in the system elects to provide services as well as consume them. All participating peers form a P2P network over a physical network. The network overlay abstraction provides flexible and extensible application-level management techniques that can be easily and incrementally deployed regardless of the underlying network. When a new peer joins the network, a bootstrapping node provides a list of IP addresses of existing peers. The new peer then tries to connect to these peers; the peers for which the attempts succeed become its neighbors. Once connected, the new peer periodically pings its connections and obtains the IP addresses of other peers, which it caches. When a peer leaves the network and later wants to rejoin (no longer the first time), it tries to connect to the peers whose IP addresses it has already cached.

Random joining and leaving of peers causes a topology mismatch between the P2P logical overlay and the physical underlying network, producing a large volume of redundant traffic. The flooding-based routing algorithm generates 330 TB/month in a Gnutella network with only 50,000 nodes [10]. A large portion of the heavy P2P traffic is caused by the topology mismatch between the overlay and underlay topologies, which makes unstructured P2P systems unscalable. A message exchanged between peers in the overlay topology travels a multiple hop distance in the underlay topology, and to maintain the topology many data and control messages have to be sent from one peer to another in the overlay network. Generally a flooding technique, bounded by TTL, is used to search for a peer or data item in the P2P overlay: search messages are sent to all connected peers, and this message load in the overlay is multiplied in the underlay topology, generating heavy redundant traffic in the network.

In P2P networks, peer nodes rely on one another for services rather than solely on dedicated and often centralized infrastructure; decentralized data-sharing and discovery algorithms and mechanisms are therefore the enabling option for the deployment of P2P networks. The challenge for researchers is to address the topology mismatch problem so as to remove unnecessary redundant traffic from the network. The problem can be introduced more clearly using Figure 6.1.

In Figure 6.1, eight peers (numbered 1–8) participate in the underlay network, of which only four peers are in the overlay network. Since we operate in the overlay network, there are two cases. First, we deliberately send a message from one peer to another. Second, there is no way to send a message from one peer to another without using some intermediate peer in the overlay network. Both cases cause heavy redundant traffic on the physical network. In Figure 6.1, suppose we send a query from peer 1 to peer 6 in the overlay; call this Path(1) from (1,6).

[Diagram: four peers in the overlay topology mapped onto eight peers (1–8) in the underlay topology.]

Figure 6.1 Overlay and Underlay Networks Setup

In the underlay this path is an ordered set of peers; for the considered example:

Path(1) from (1,6) = {1, 2, 3, 5, 6}

Path(2) from (6,4) = {6, 5, 3, 4}

Path(3) from (4,8) = {4, 3, 7, 8}

Say the query is sent along peers (1, 6, 4) in the overlay. The query then has to travel in the underlay as {1, 2, 3, 5, 6, 5, 3, 4}, which means twice the traffic cost of {3, 5, 6} is wasted. Similarly, if we send the query over 3 hops in the overlay network, then twice the traffic cost of {3, 5, 6} and of {3, 4} is wasted. This is one of the major reasons for unwanted heavy traffic in any P2P network, and it is this unwanted traffic that we have to save.

A mechanism to reduce redundant traffic in P2P networks therefore needs to be developed. The search scope of the overlay should not change while the network traffic is reduced, and the overlay topology must remain unaltered while paths are reduced. In this chapter a novel Common Junction Methodology (CJM) is proposed to reduce traffic in P2P networks; it also reduces the response time of the system.

6.2 System Model

To reduce the redundant traffic generated by the topology mismatch problem, the path over which a message is transferred must be reduced. A message traverses a number of underlay peers while being forwarded in the overlay topology, and some underlay peers are visited multiple times when messages are forwarded indirectly. This multiple traversal generates redundant traffic, which may be reduced without affecting system performance; it constitutes a large fraction of the total traffic generated in the system. To address this problem, the following system model is proposed:

The set of peers and network elements participating in the system is represented as P = {p1, p2, p3, ..., pn}, i.e., the set of participating peers in the underlay and overlay.

The set of peers participating in the overlay topology for computing, sharing of data and services, and forwarding of messages is represented as P_O = {p1, p2, p3, ..., pm}. All other peers, which are not in the overlay topology, form the set of underlay peers P_U, i.e., P_U = P − P_O.

The path traversed by a message forwarded between two peers a and b in the overlay, starting from a as source to b as destination (direct path), is represented as Path_i(a,b). This i-th path from a to b is the ordered set of peers traveled by the message in the underlay, Path_i(a,b) = {a, p_x, p_y, ..., b}.

A message may use two paths (an indirect path), Path_i(a,b) and Path_j(c,d), while being forwarded from source to destination, i.e., from a to b and then from c to d; otherwise message forwarding is not possible through the conventional method. This is represented as Path_(i,j)(a,d) = Path_i(a,b) ∪ Path_j(c,d), iff b = c. Two paths i and j can be combined only when the destination of path i and the source of path j are the same.

Similarly, the cost of a path i, Cost_i(a,b), is the cost to transfer unit data from source a to destination b. The cost of two consecutive paths is

Cost_(i,j)(a,d) = Cost_i(a,b) + Cost_j(c,d), if b = c; ∞ otherwise.

6.3 System Architecture

Figure 6.2 presents a 3-Layer Traffic Management System (3-LTMS) for overlay networks. The first layer of 3-LTMS is under overlay control, i.e., it provides the interface to the P2P system and receives queries or subqueries.

[Diagram: application layer on top; Query Analyzer, Query Optimizer, Query Execution Engine and Path Manager in the middle layer; Physical Network Manager below, spanning the overlay and underlay networks.]

Figure 6.2: 3-Layer Traffic Management System (3-LTMS) for Overlay Networks

This layer is implemented over the application layer of the network. The next layer comprises four components. The Query Analyzer accepts queries, resolves them, and forwards them to the Query Optimizer; it is also responsible for breaking a query into subqueries. The Query Optimizer decides whether a peer is suitable for a particular subquery. The Query Execution Engine executes the subquery and produces partial results, which are then sent to the requesting peer or to the peer responsible for compiling them. The Path Manager is responsible for reducing paths and implements CJM: it reduces the logical path between peers in the overlay network, and all paths are checked and reduced (if possible) using the database of underlay peers. The third layer, the Physical Network Manager, is responsible for managing the underlay and utilizes the information from the Path Manager.

6.4 Common Junction Methodology (CJM)

To reduce unnecessary traffic in the network, the query should be routed through shorter paths. Every path in the overlay network is assigned a number; let m be the number of such paths. Let P_i(a,b), the i-th path from peer a to peer b in the logical topology, be the ordered set of peers in the underlay topology, and let n be the hop count of the path for which we want to find the shortest path. The problem is to find the optimum path for a path of length n. The path and its total traffic cost are identified as follows.

CJM is based on the assumption that a physical path corresponding to the logical path (the path in the overlay topology) exists before queries/messages are sent across the network. In this chapter, path cost and traffic cost are used interchangeably. The assumptions are summarized as follows.

(a) A path exists between source and destination peers participating in the overlay network.

(b) A peer rejoins at the same position from which it left.

(c) Each edge of the graph carries its traffic cost, i.e., the total cost to transfer unit data.

(d) If a path is broken, a conventional strategy (possibly flooding) is used to find a new path.

6.4.1 Common Junction Methodology Algorithm

To avoid redundant traffic in P2P networks, any two physical paths in the underlay may be checked for a common junction other than the source/destination. This common junction may be used to reroute the traffic and avoid the conventional path of higher cost. The following algorithm finds the common junction.


Step 1. Initialize variables: Let n be the hop count of the path; S1 ... Sn be the source peers of the respective individual paths; D1 ... Dn be the destination peers of the respective individual paths; and r1 ... rn be the common peers found in the respective individual paths. Start_Peer = S1; TTC = 0; // TTC is the Total Traffic Cost

Step 2. For all values of i from [1 ... (n−1)]; j varies from [(i+1) ... n];

Step 3. While (j > i) goto Step 4; else goto Step 8;

Step 4. Find the common junction: Path_i(S_i, D_i) ∩ Path_j(S_j, D_j) = CJ{};

Step 5. If CJ{} = φ then j = j − 1 and goto Step 3; else let r_(i,j) be the common junction and Path_CJM_(i,j)(S_i, D_j) = Path_i(S_i, r_(i,j)) ∪ Path_j(r_(i,j), D_j), where Path_CJM is the shortest path identified through CJM;

Step 6. TTC = TTC + TC[Start_Peer, r_(i,j)]; // TC is the traffic cost between two peers

Step 7. Start_Peer = r_(i,j); i = j; goto Step 3;

Step 8. End; // Path_CJM_(i,j)(S_i, D_j) holds the shortest path; TTC holds its total traffic cost

In this way CJM finds the common junction and diverts the query from the regular path to the reduced/shorter one.

In Figure 6.1, for the two-hop path the common junction set CJ{} is obtained by Path(1) ∩ Path(2) = CJ{3}; here peer 3 is the common junction from which the query may be diverted to save the unwanted traffic cost of the network. The query then has to travel the path {1, 2, 3, 4}, whose traffic cost is far less than that of the conventional path. Second, taking the 3-hop path traveling along Path(1,2,3), we get the common junction peer 3 from the 1st and 3rd paths, which means the query has to travel the path {1, 2, 3, 7, 8}. As a result, CJM saves a lot of traffic cost on the identified paths.

If a query is sent logically from peer 1 to peer 4 through peer 6 in the overlay network, it is a two hop count path {Figure 6.1}. The physical path traversed by the messages is [1→2→3→5→6→5→3→4], which is the conventional path, described as Path(1,2) = (1, 2, 3, 5, 6, 5, 3, 4). This path is the combination of two paths, i.e., path 1 and path 2; under the conventional methodology the two paths can be merged only when the destination of the first path and the source of the second path are the same peer. At this point we can save traffic cost by routing the query through the path [1→2→3→4], which saves the cost of sending the query through the sub-path [3→5→6→5→3]; this portion of the traffic is redundant in the network. Here peer 3 is the common junction between the two paths, and CJM is based on this key idea of a common junction between two paths. The common junction can be identified as Path(1, 2, 3, 5, 6) ∩ Path(6, 5, 3, 4) = CJ{3, 6}, where CJ{} is the ordered set of peers common to both paths. Here peers 3 and 6 are the common junctions of paths Path(1) and Path(2), from which a query may be diverted. If the query is diverted through peer 6, it follows the conventional path; but by choosing peer 3 for routing, we can save a lot of traffic cost. Similarly, a common junction can be identified between any two available paths, and these paths may or may not be continuous.

6.4.2 System Analysis

The objective of CJM is to minimize the cost of existing paths when data is sent across multiple hops.

Let P_i(k,l) be the i-th path, the ordered set of peers along the path from source peer k to destination peer l. If we want to find a two-hop path from k to n, then we have to find P_(i,j)(k,n) = P_i(k,l) ∪ P_j(m,n). Assume that there is no direct path between the peers.

The traffic cost TC_i(k,l) of a path is the cost to transfer unit data from peer k to peer l over the conventional path. The cost of the two-hop path is

TC_(i,j)(k,n) = TC_i(k,l) + TC_j(m,n), if l = m; ∞ otherwise.


CJM finds the minimum path length between two connected paths, i.e., P_i(k,l) and P_j(m,n). It searches for a common junction between the paths using the following:

P_i(k,l) ∩ P_j(m,n) = CJ{}   … (6.1)

If CJ{} = φ then there is no common junction. If CJ{} ≠ φ there exists a common junction r, and CJM finds the cost TCC_(i,j)(k,n) accordingly, i.e., with respect to the common junction:

TCC_(i,j)(k,n) = TC_i(k,r) + (TC_j(m,n) − TC_j(m,r))   … (6.2)

From eqn (6.2) there exist two cases:

Case-I: CJ{} = φ

In Case-I there is no common junction between the two paths, i.e., there is no continuous path and no common junction peer present in the two paths. The costs computed by eqns (6.3) and (6.4) are as follows.

TC_(i,j)(k,n) = TC_i(k,l) + TC_j(m,n) = ∞   … (6.3)

TCC_(i,j)(k,n) = TC_i(k,r) + (TC_j(m,n) − TC_j(m,r)) = ∞   … (6.4)

From eqns (6.3) and (6.4), both cases show an infinite total traffic cost, i.e., there is no path available from source to destination.

Case-II: CJ{} ≠ φ

This means there is a common junction between the two paths; the path may be either continuous or have some common junction peers. The traffic costs for these two cases are computed as follows:

Case-II (a): When l = m there is a continuous path, i.e., the destination of the 1st path equals the source of the 2nd path. In this case the common junction r between the two paths is the destination of the 1st path and the source of the 2nd, i.e., the last peer of the 1st path and the starting peer of the 2nd.

The traffic cost for the normal (conventional) path and for CJM is computed as:

TC_(i,j)(k,n) = TC_i(k,l) + TC_j(m,n)   … (6.5)

TCC_(i,j)(k,n) = TC_i(k,r) + (TC_j(m,n) − 0) = TC_(i,j)(k,n)   … (6.6)

From eqns (6.5) and (6.6) it is observed that the traffic cost is the same in both cases.

Case-II (b): When l ≠ m and we find a common junction r between the two paths, the traffic cost of the joint path is:

TCC_(i,j)(k,n) = TC_i(k,r) + (TC_j(m,n) − TC_j(m,r))   … (6.7)

Since

TC_j(m,n) − TC_j(m,r) ≤ TC_j(m,n)   … (6.8)

it follows that TCC_(i,j)(k,n) ≤ TC_(i,j)(k,n).

From eqns (6.7) and (6.8) it can be concluded that the traffic cost for CJM is no more than that of the conventional method.

6.5 Simulation and Performance Study

It is assumed that there are 1,000 peers in the underlay, out of which 10%–20% are in the overlay network. The peers in the underlay may be connected in any network topology (regular/irregular/mesh). The cardinality of a peer in the underlay or overlay is the maximum number of peers to which that peer may connect. The underlay cardinality ranges from 3 to 20 and is generated with a uniform random number for each participating peer; the overlay cardinality ranges from 3 to 12, also drawn uniformly at random. The path between two peers in the overlay is the sequence of peers traversed in the underlay between source and destination, and the path length is the number of hops traversed from source to destination. Dijkstra's algorithm is used to find shortest paths in the underlay only. Both structured and unstructured topologies are implemented in the overlay network. The path cost is used as the measure of the cost to transfer a message from source to destination; it comprises all costs, including bandwidth, latency, processing cost, etc.


6.5.1 Simulation Model

To study the behavior of CJM, an event driven simulation model is developed in C++ {Figure 6.3}. A brief overview of the different components of the model follows.

[Diagram: peers P1 ... Pn connected to the Time Scheduler, Underlay Topology Manager, Overlay Topology Manager, Network Analyzer, Path Manager and Network Manager.]

Figure 6.3 Network Simulation Model for P2P Networks

Peers are the active entities participating in the network. Each peer has a predefined availability factor, decided at the time of its generation, which determines its availability behavior in the network.

Time Scheduler schedules all time based events for the system and analyzes the session times and other statistics of peers and the network. It decides the times at which a peer joins or leaves the network, depending on the peer's availability factor.

Underlay Topology Manager binds peers and manages their underlay topology. The number of connections a peer has is decided by its cardinality, which the Underlay Topology Manager chooses randomly (from the range decided by the user) at connection time. Other related parameters, viz., the latency of the communication link, etc., are also decided at the time of connection.

Overlay Topology Manager manages the topology used to connect the selected peers in the overlay, in either a structured or an unstructured topology. It uses the overlay cardinality to decide the number of connections a peer has in the overlay, and it also implements the logical topologies used to analyze the structure of the network.

Network Analyzer keeps track of statistics of the various elements, viz., peers, network, paths, and costs. It collects information from all other components, which helps in making decisions about the network.

Path Manager manages all the paths connecting peers in the underlay and overlay. It uses the CJM algorithm to find the shortest paths in the underlay and the required paths in the overlay topology, and it keeps track of the underlay path behind each connection between two peers in the overlay. The Path Manager provides multiple underlay paths for any connection between two overlay peers and updates the underlay paths after any peer leaves the underlay.

6.5.2 Performance Metrics

To study the behavior of a CJM based P2P network, we consider n distributed peer elements connected by communication links in the underlay. The following metrics are considered:

Response Time (RT) is the time taken by a test message to traverse the maximum hop

count path of the network.

Average Response Time (ART) is the average of the RTs over all possible paths of every hop count, computed as follows.

ART(j-Hop Path) = ( Σ_{i=1}^{t} RT of Path(i) ) / t   … (6.9)

ART = ( Σ_{j=1}^{s} ART[j-Hop Path] ) / s   … (6.10)

Where t is the number of possible paths of a particular hop count, and s is the number of all possible hop counts of available paths.


Path Length (PL) of a path is the maximum hop count between source and destination. Average Path Length (APL) is the average of all PLs in the network, computed as follows.

APL = ( Σ_{i=1}^{t} PL[i] × No. of Paths ) / Total No. of Paths   … (6.11)

Where t is the number of possible paths of a particular hop count.

Path Cost (PC) is the cost spent by the test message to travel over the communication links between source and destination. The path cost comprises all costs, including bandwidth, latency, etc. Average Path Cost (APC) is the average of the cost spent over all possible paths of every hop count. APC is computed as follows:

APC(j-hop path) = ( Σ_{i=1..t} PC(i) ) / t … (6.12)

APC = ( Σ_{j=1..s} APC(j-hop path) ) / s … (6.13)

where t is the number of possible paths of a particular hop count, and s is the number of all possible hop counts of available paths.

Path Cost Saved (PCS):

%age PCS = ( (APC − APC_CJM) / APC ) × 100 … (6.14)

where APC_CJM is the Average Path Cost through CJM.


Response Time Reduction (RTR):

%age RTR = ( (ART − ART_CJM) / ART ) × 100 … (6.15)

where ART_CJM is the Average Response Time through CJM.
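Eqns (6.14) and (6.15) translate directly into code. A minimal sketch follows; the function names and the sample input values are illustrative, not from the thesis.

```python
def pct_path_cost_saved(apc, apc_cjm):
    """Eqn 6.14: percentage path cost saved through CJM."""
    return (apc - apc_cjm) / apc * 100.0

def pct_response_time_reduction(art, art_cjm):
    """Eqn 6.15: percentage response time reduction through CJM."""
    return (art - art_cjm) / art * 100.0

# A conventional APC of 200 cut to 78 by CJM corresponds to the 61%
# best-case saving reported later in the chapter (values illustrative).
saving = pct_path_cost_saved(200.0, 78.0)
```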

6.5.3 Simulation Results

The results obtained from simulation show that even after optimizing the overlay network, there is scope to reduce the traffic in the underlay. A number of simulation experiments were performed to observe the behavior of CJM and the performance of P2P systems in the presence of CJM. The network consists of 1,000 peers in the underlay, of which 10-20% participate in the overlay topology.

From Figure 6.4, it is observed that the number of partitions in the network decreases with increasing underlay cardinality, and the number of partitions becomes approximately constant after a cardinality value of 11.


Figure 6.4 Average Number of Partitions vs. Underlay Cardinality

Figure 6.5 shows that the maximum path length in the underlay used to transfer data/control messages depends upon the underlay cardinality (the number of connections a peer has in the network). The cardinality in both overlay and underlay is assumed to be more than 3. Initially the path length is large at a cardinality value of 3. The path


length reduces as the cardinality increases and begins to stabilize after a value of 13.


Figure 6.5 Average Path Lengths for Maximum Reachability vs. Underlay Cardinality

From Figure 6.6, it is observed that the average path cost in the P2P system also depends upon the cardinality of the peers participating in the overlay topology. A high path cost is observed in the overlay topology at low cardinality values, and it reduces as the cardinality of a peer increases. The path cost starts stabilizing after a cardinality value of 6, because the number of connections is then sufficient to reach any peer in the network at approximately the same path cost.

Figure 6.7 shows that the path cost is initially approximately constant and starts decreasing with increasing underlay cardinality. In all three cases CJM provides the minimum path cost compared to the conventional path and the path suggested by the THANCS algorithm. The path cost stabilizes after an underlay cardinality of 13. The removal of redundant segments from conventional paths reduces the path cost.


[Plot comparing Normal Path, THANCS and CJM]

Figure 6.6 Average Path Cost vs. Overlay Cardinality

[Plot comparing Normal Path, THANCS and CJM]

Figure 6.7 Average Path Cost vs. Underlay Cardinality

It is observed from Figure 6.8 that the average response time of a path in the overlay increases steadily with increasing overlay hop count in the case of the conventional path. The response time for THANCS and CJM is much lower than for conventional paths, and CJM provides the minimum response time among the three, because it uses the common junction between two paths to route the messages. The reduction in average response time grows with the hop count of the overlay path, because for long paths the possibility of finding a common junction is higher.


From the results shown in Figure 6.9, it is observed that for short paths the average percentage reduction in path cost for CJM is lower than for THANCS. But once the path length exceeds 3 hops, the average reduction in path cost for CJM increases sharply and exceeds that of THANCS. The maximum average reduction in path cost observed is 61% for CJM and 46% for THANCS. The reason for this reduction is that the actual path traveled (through CJM) by the messages becomes shorter in the network.

[Plot comparing Normal Path, THANCS and CJM]

Figure 6.8 Average Response Time vs. Overlay Hop Count

[Plot comparing THANCS and CJM]

Figure 6.9 Average %age of reduction in Path Cost vs. Overlay Path (Hop Count)


[Plot comparing THANCS and CJM]

Figure 6.10 Average %age Reduction in Response Time vs. Overlay Hop Count

From Figure 6.10, it is observed that for initial hop count values of the overlay path, the RTR percentage for CJM is lower than for THANCS. But after a hop count value of 3, the RTR percentage of CJM becomes higher than that of THANCS and remains higher for larger hop counts. CJM provides an RTR percentage of approximately up to 45%, giving comparatively fast data transfer in P2P networks.

6.6 Advantages in using CJM

CJM saves traffic cost without modifying the network topology or the search space. Its advantages are: first, it saves the traffic cost of the P2P network; second, it finds the underlay path for two paths in the overlay network, i.e., discontinuous paths in the overlay network; third, CJM finds the shortest path for a path of any hop count, i.e., CJM is suitable for any path length. The saved traffic cost increases with the hop count of the path, since longer paths have a higher possibility of containing a common junction.

6.7 Discussion

Simulation results show that the average saving in path cost increases with the hop count of the path. The maximum path cost saved is up to 61%, for a hop count of 11 in the best case. Initially the average saving in path cost increases drastically, and beyond a limit the growth reduces, which indicates that the additional saving in path cost is small. It is also observed that up to a hop count of 8 the proposed


technique CJM gives good results. CJM also significantly reduces the response time of the network: approximately 45% of the network response time is saved on average using CJM. It is also observed from the results that a significant amount of response time is reduced up to 9 hops; after that, the reduction in response time is minor. CJM reduces the physical path without altering the overlay connections and the search space of the system. It is useful for any overlay topology in the system. CJM provides better results than THANCS for the majority of the performance metrics considered.

6.8 Summary

In this chapter, we have proposed the Common Junction Methodology (CJM). The technique achieves substantial savings in path cost and reductions in the response time of the network, and solves the topology mismatch problem to a large extent. CJM works on any overlay topology, centralized or decentralized, and can be implemented without changing the topology. Other salient features of CJM are its fast convergence speed and unchanged search scope in the network.

In the next chapter an efficient replica placement algorithm, LARPA, is discussed.


Chapter 7

Fault Adaptive Replica Placement over P2P Networks

Data replication is widely used to improve the performance of distributed databases. A replica logical structure reduces the search time to find replicas for quorums in P2P networks. It also helps to improve the turnaround time of transactions.

In this chapter a Logical Adaptive Replica Placement Algorithm (LARPA) is proposed. It is adaptive in nature and tolerates up to n − 1 faults. It efficiently stores replicas on one-hop-distance sites (peers) to improve data availability in an RTDDBS over a P2P system.

The rest of the chapter is organized as follows. Section 7.1 presents the introduction. Section 7.2 gives the system model. Section 7.3 introduces LARPA. Section 7.4 highlights the implementation. Section 7.5 explores the simulation and performance study. Section 7.6 discusses the findings, and finally the chapter is summarized in Section 7.7.

7.1 Introduction

Peer-to-Peer (P2P) networks are low-maintenance, massively distributed computing systems in which peers (nodes) communicate directly with one another to distribute tasks, exchange information, or share resources. P2P networks are also known for their huge amount of network traffic, a large portion of which is due to the topology mismatch problem between the overlay topology and the underlay topology. A number of P2P systems in operation, viz., Gnutella [67], construct an unstructured overlay without rigid constraints for the search and placement of data items. However, there is no guarantee of finding an existing data object within a bounded number of hops.

P2P systems are rich in freely available computing power and storage space. A Real Time Distributed Database System (RTDDBS) is one application that is


suitable for such resources. But there are various issues to be handled before implementation, viz., the time constraints on executing transactions in such a system. Depending upon the type of application, real time transactions can be categorized into three types: hard, soft and firm deadline transactions. In the case of a firm deadline transaction, any transaction that misses its deadline is considered worthless and is thrown out of the system immediately.

In replication, data items are replicated over a number of peers participating in the system. Replication is a lifeline for environments where nodes are prone to leave the system and data availability is the primary challenge. Data replication is used to provide fault tolerance. It improves the performance and reliability of distributed systems. It also reduces the response time and increases the data availability of conventional distributed systems.

Replica logical structures also improve the performance of the system by reducing the time of quorum formation. The quorums are decided from the structure such that data consistency and data availability are preserved. A special replica overlay structure is used to place replicas. Data availability is also a primary objective of P2P networks.

In the normal case, the number of replicas is increased blindly to improve data availability. Due to the large number of replicas, heavy redundant traffic is generated during the system maintenance and updating phases. Maintaining data consistency is also a challenge in a quorum system: increasing the number of replicas makes it harder to maintain data consistency, and it takes more time to update all replicas present in the system. The network overhead of the system increases exponentially with the number of replicas. This problem has a major impact in P2P networks, where network overhead is already very large due to the topology mismatch problem, because messages are transferred through a number of peers present in the underlay topology that are transparent in the overlay topology.

To implement an RTDDBS over P2P networks, the data distribution must be efficient enough to match the deadline requirements of transactions. To improve data availability and provide fast data access, replicas are normally placed in an efficient overlay structure. Modifications to the replica overlay topology are required to reduce network traffic and address other challenges in P2P networks. We have considered a few of the above challenges and developed LARPA for P2P networks. It is adaptive in nature and tolerates up to n − 1 faults. LARPA efficiently stores replicas


on one-hop-distance sites to improve data availability in an RTDDBS over a P2P system. A comparative study is also made with some existing systems.

7.2 System Model

The connectivity structure of a P2P network is represented by an undirected graph with vertices as peers and edges representing connections among the peers. The overlay is modeled as an undirected graph G = (P, E), where P is the set of active peers participating in the network and E is the set of edges (links) between the peers. Further, P and E are defined as P = {p1, p2, p3, ..., pnp} and E = {e1, e2, e3, ..., ene}, where np and ne are the number of peers participating in the network and the number of edges connecting the participating peers, respectively. Two peers p1 and p2 in a graph are said to be connected if there exists a series of consecutive edges {e1, e2, e3, ..., ep} such that e1 is incident upon the vertex p1 and ep is incident upon the vertex p2. An edge (p1, p2) in E means that p1 knows a direct way to send a message to p2. Henceforth, we use the terms graph and network interchangeably. Similarly, the terms peer and vertex are used equivalently, and so are the terms edge and connection. The series of edges leading from p1 to p2 is called a path from vertex p1 to p2, represented as p1 ∼ p2 iff they are at more than one hop distance, and by p1 → p2 in the case of one hop distance. The length of a path lHop(p1, p2) is the number of edges in the path from p1 to p2. The distance Dist(p1, p2) is the measure of the total cost, including all types of cost, to send unit data from source p1 to destination p2, and is defined as the shortest distance calculated in the underlay topology. Replicas of the database are stored on peers selected through some criterion among the peers in P. This set of replicas is defined by P_R = {pr1, pr2, pr3, ..., pri}, where ∀ pri ∈ P; P_R ⊆ P. Replicas form a replica overlay topology; hence, the replica overlay topology can also be defined by a graph G1, where G1 ⊆ G and P_R is the set of vertices of G1. The edges of G1 are E_R ⊆ {E ∪ Newly Established Overlay Links}. The one hop neighbors of any peer p1 are defined by N^1(p1) = {p2 | (p1, p2) ∈ E, ∀ p2 ∈ P}, and the two hop neighbors of any peer p1 are defined by N^2(p1) = {p3 | (p2, p3) ∈ E, ∀ p3 ∈ P, ∀ p2 ∈ N^1(p1)} − N^1(p1) − {p1}.


Data items D are defined by the set of tuples <Vri, Dci>, where Vri is the version number of the data item (the highest version number implies the latest value of the data item) and Dci is the value of the data contents stored at a replica. For every committed write query, the version number is incremented by one at the particular replica.
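The data item tuple <Vr, Dc> and its commit rule can be sketched as follows; the class and field names are illustrative, not from the thesis.

```python
from dataclasses import dataclass

@dataclass
class DataItem:
    """A replica-local data item as the tuple <Vr, Dc>."""
    version: int   # Vr: incremented by one on every committed write
    contents: str  # Dc: the data contents stored at this replica

    def committed_write(self, new_contents):
        self.version += 1
        self.contents = new_contents

# The copy with the highest version number holds the latest value.
a, b = DataItem(3, "x"), DataItem(4, "y")
latest = max((a, b), key=lambda d: d.version)
```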

7.3 Logical Adaptive Replica Placement Algorithm (LARPA)

A logical replica topology improves the performance of the replica system compared to random placement of replicas in P2P networks. To achieve data availability, the number of replicas is often increased blindly without considering the effects of network traffic and data consistency problems in the system; these are major factors affecting the performance of any system. A logical structure reduces the response time, the distance between replicas and the replica searching time. A small set of replicas in a P2P system reduces the system overhead. LARPA considers network traffic and data consistency, along with read/write quorum generation time, update time, the churn rate of peers and the performance of the system.

LARPA places the replicas close to the point from where a search starts. It selects a suitable peer with the highest candidature value for storing the replica, which acts as the centre of the structure {Figures 7.1 and 7.2}. All other replicas are stored at the peers having maximum candidature. One hop connections are established between each replica and the centre peer. Connectivity can be further improved, to minimize the effect of centre failure, by establishing new connections between the replicas that are at more than one hop distance and the peers at one hop distance from the centre (direct neighbors). These extra connections improve the search performance of the system.

7.3.1 LARPA Topology

In the LARPA overlay, peers are selected for placing replicas on the basis of their resource availability and the session time during which they participate in the network. Second, a threshold value of candidature (decided by the DBA, and possibly different for every situation) is used to select the peers {Section 4.5.6}. This threshold value is the minimum value at which one can expect acceptable results. These conditions ensure capable peers are chosen, which improves the durability and efficiency of the system.


For any read/write quorum, a group of replicas is selected from the logical structure to execute read/write operations in the system. The time spent in generating the quorums also affects system performance. LARPA selects a limited number of peers for placing replicas and forms a logical structure on the basis of the resource availability of the peers. The best peer among the identified peers is selected as the centre peer. This is the point from where a query enters the system for execution and where quorums are selected for it. All remaining special peers establish a direct overlay connection with the centre peer. In LARPA, one hop overlay connections improve the search time needed to generate quorums: the time to select a quorum is reduced by improving the search time for replicas in the replica overlay. These peers may also establish extra connections with the peers at one hop distance from the centre.

7.3.2 Identification of Number of Replicas in the System

To achieve the desired data availability in the network, data is replicated in the system. There is a tradeoff between the number of replicas and system overhead, so the number of replicas needs to be selected intelligently. The number of replicas should be minimal, to limit the data consistency problem, network overhead, network traffic and other factors. Further, data availability must be in an acceptable range, so that updated data can be accessed from the system. From the property of parallel systems, at least one peer must be active, and the resulting availability is defined by:

P = 1 − ∏_{i=1..n} (1 − Pi)

The target data availability to be achieved by any database system is assumed to be 95%. To achieve this, data is replicated over the P2P network. From Table 7.1, it is observed that only six replicas with a peer availability of 0.4 produce a data availability of about 95%.
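For peers with identical availability p, the parallel-system formula reduces to P = 1 − (1 − p)^n, which reproduces the entries of Table 7.1. A minimal sketch (function and variable names are ours):

```python
def data_availability(p, n):
    """P = 1 - (1 - p)^n: probability that at least one of n replicas is
    up, assuming independent peers with identical availability p."""
    return 1.0 - (1.0 - p) ** n

# Smallest replica count meeting the 95% target at peer availability 0.4;
# Table 7.1 shows this is six replicas.
n_required = next(k for k in range(1, 20)
                  if data_availability(0.4, k) >= 0.95)
```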

Table 7.1 Effect of Peer Availability over Data Availability in the System

Peer Availability           Number of Replicas in the System
in the System        2      3      4       5        6         7         8         9
0.3                0.51   0.657  0.7599  0.83193  0.882351  0.917646  0.942352  0.959646
0.4                0.64   0.784  0.8704  0.92224  0.953344  0.972006  0.983204  0.989922
0.5                0.75   0.875  0.9375  0.96875  0.984375  0.992188  0.996094  0.998047
0.6                0.84   0.936  0.9744  0.98976  0.995904  0.998362  0.999345  0.999738
0.7                0.91   0.973  0.9919  0.99757  0.999271  0.999781  0.999934  0.99998
0.8                0.96   0.992  0.9984  0.99968  0.999936  0.999987  0.999997  0.999999
0.9                0.99   0.999  0.9999  0.99999  0.999999  1         1         1
1.0                1      1      1       1        1         1         1         1


This data availability goes on increasing with increasing peer availability. From these facts it is concluded that one should avoid an unnecessarily large number of replicas in the system, and limit the number of replicas to 7. Hence, if a peer with more than 40% availability is selected for storing a replica, the data availability will be in the acceptable range. To guarantee the data availability, peers having more than 0.5 availability are considered for storing the replicas.

7.3.3 LARPA Peer Selection Criterion

Two parameters are considered while selecting a peer for storing replicas. The first is the candidature of the peer {Section 4.5.6}, which is a measure of how capable a peer is of storing a replica; the value of candidature is computed using eqn (4.1) {Section 4.5.6}. The second is the distance of each participating peer in the system. The distance is a measure of the cost spent to send and receive messages between a peer and the centre peer; for an efficient system this distance should be minimized. A priority queue is used to store the best peers, i.e., those having the largest candidature among all the peers. The length of the priority queue is double the number of peers required in the system (the length may vary depending upon system requirements). In the present case, the number of required replicas is seven and the length of the priority queue is fourteen. The remaining seven peers are used as replacements or to add a new replica, if required by the system. All these peers are best suited among all participating peers in the overlay to store replicas.

7.3.4 Algorithm 1: Selection of Best Suited Peers

The best i peers are identified among the np peers in set P. A peer with a candidature value greater than the threshold qualifies for the set P_R, but only a predefined number i of the best peers among these qualified peers are placed in P_R. The number i may vary depending on the requirements of the system. The qualified peers are arranged in descending order. On the basis of peer availability, LARPA selects double the number of peers required in the system. The number of peers decided by LARPA for the present system is 7, so the first 7 among 14 peers are selected from P_R. The steps of the algorithm are as follows.


Algorithm 1: LARPA1 (selecting best suited peers)

1. ∀ pi ∈ P, calculate candidature Cdi.
2. Put the first ns peers having maximum candidature, with candidature greater than the threshold value, in a priority queue sorted in descending order. This is the set P_R.
3. Take the first element from the priority queue, having the highest candidature among all participating peers. Consider this peer the centre peer pc.
4. Take the next ms peers from set P_R, where ms ≤ ns. Put all ms peers in a queue {Participating Peer Queue (PPQ)}.
5. ∀ pk ∈ PPQ, pk ∼ pc: find the shortest path Dist(pk, pc) from each selected peer in PPQ to the centre peer pc and establish the direct connection pk → pc.
6. Establish overlay connections among all selected peers (except pc), such that each peer connects with at least two other peers from the queue PPQ: ∀ pk ∈ PPQ: pk → pk+1; pLast → p1.
7. End
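The steps above can be sketched as follows. This is an illustrative rendering under our own naming (candidature is taken as a precomputed map, and link establishment is modeled as a set of peer pairs), not the thesis implementation.

```python
import heapq

def larpa1(peers, candidature, threshold, n_s, m_s):
    """Sketch of LARPA1: returns the centre peer, the PPQ, and the
    overlay links to establish. `candidature` maps peer -> Cd value."""
    # Steps 1-2: peers above the threshold, best n_s by candidature (P_R).
    qualified = [p for p in peers if candidature[p] > threshold]
    p_r = heapq.nlargest(n_s, qualified, key=candidature.get)
    # Step 3: the highest-candidature peer becomes the centre p_c.
    centre = p_r[0]
    # Step 4: the next m_s peers (m_s <= n_s) form the PPQ.
    ppq = p_r[1:1 + m_s]
    # Step 5: a direct connection from every PPQ peer to the centre.
    links = {(p, centre) for p in ppq}
    # Step 6: ring links so each replica has at least two other neighbors.
    links |= {(ppq[k], ppq[(k + 1) % len(ppq)]) for k in range(len(ppq))}
    return centre, ppq, links

cd = {'a': 0.9, 'b': 0.8, 'c': 0.7, 'd': 0.6, 'e': 0.2}
centre, ppq, links = larpa1(list(cd), cd, 0.5, 4, 3)
```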

7.3.5 Algorithm 2: Selection of Suitable Peers with Minimum Distance

A set P_R of the i best qualified peers is selected from the set P of participating peers, as in the previous algorithm. Peers to store replicas are then selected from P_R such that the distance between each peer and the centre peer is minimum; i.e., the peers are selected from P_R on the basis of both distance and candidature. The steps of the algorithm are as follows.

Algorithm 2: LARPA2 (selecting suitable peers with minimum distance)

1. ∀ pi ∈ P, calculate candidature Cdi.
2. Put the first ns peers having maximum candidature, with candidature greater than the threshold value, in a priority queue sorted in descending order. This is the set P_R.
3. Take the first element from the priority queue, having the highest candidature among all participating peers. Consider this peer the centre peer pc.
4. Take ms peers from set P_R, where ms ≤ ns, having minimum distance from the centre among all participating peers, i.e., min_k { Dist(pk, pc) }, where Dist(pk, pc) is the shortest distance from source to destination. Put these ms peers in a queue {Participating Peer Queue (PPQ)}.
5. ∀ pk ∈ PPQ, pk ∼ pc: find the shortest path Dist(pk, pc) from each selected peer in PPQ to the centre peer pc and establish the direct connection pk → pc.
6. Establish overlay connections among all selected peers, such that each peer connects with at least two other peers from the queue PPQ: ∀ pk ∈ PPQ: pk → pk+1; pLast → p1.
7. End
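Only step 4 differs from LARPA1: peers are drawn from P_R by minimum distance to the centre rather than in candidature order. A sketch of that step (the function name and the toy distance function are illustrative):

```python
def select_min_distance(p_r, centre, dist, m_s):
    """LARPA2 step 4: from the qualified set P_R (excluding the centre),
    pick the m_s peers with minimum distance Dist(pk, pc) to the centre."""
    candidates = [p for p in p_r if p != centre]
    return sorted(candidates, key=lambda p: dist(p, centre))[:m_s]

# Toy distance function for illustration only.
dist = lambda p, c: abs(p - c)
nearest = select_min_distance([5, 9, 2, 7, 4], 5, dist, 2)
```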

In Figure 7.1, eighteen peers, p1 to p18, are participating in the network. Seven peers are selected from the eighteen on the basis of candidature, which reflects the availability of resources and the capability of the peers. The peer with the highest candidature value among these seven is selected as the centre of the replica overlay; LARPA selects p5 as the centre peer. The remaining selected peers establish one hop overlay connections with the centre of the replica. These one hop overlay connections may or may not already be present; new one hop connections are established in case a connection does not already exist between a selected peer and the centre. LARPA also connects the selected peers with each other, in addition to the connection with the centre. New connections are represented with bold curved connectors, e.g., p12 to p5 and p17 to p5.



Figure 7.1 Peers Selection and Logical Connection for LARPA Structure

In Figure 7.1, circles filled with gray represent the peers selected for storing replicas among all the peers. Peer p5, filled with dark gray, is selected as the centre. Connections drawn with bold dark connectors represent existing overlay connections among replicas; connectors in bold gray (curved connectors) represent the new connections established for LARPA. All other dashed connectors represent other overlay connections and can be utilized in case of any path failure in LARPA. The LARPA logical structure presented in Figure 7.2 is obtained from the existing and newly generated connections of the network shown in Figure 7.1.


Figure 7.2 LARPA Logical Structure Obtained from the Network shown in Figure 7.1

7.4 Implementation

The arrivals of transactions at a site (peer) are independent of the arrivals at other

sites. The model assumes that each global transaction is assigned a unique identifier.

Each global transaction is decomposed into subtransactions to be executed by remote

136

Page 155: Real time information dissemination and management in peer-to-peer networks

sites. Subtransactions inherit the identifier of the global transaction. No site or communication failure is considered. The execution of a transaction requires the use of the CPU and data items located on remote sites. A trusted communication network is used to connect the sites. There is no global shared memory in the system, and all sites communicate via message exchange over the trusted communication channels. The cohorts of the transaction at the relevant sites are activated to perform the operations. A distributed real time transaction is said to commit if the master has reached the commit decision before the expiry of its deadline at its site. Each cohort makes a series of read and update accesses. A transaction already in the dependency set of another transaction, or already having other transaction(s) in its dependency set, cannot permit another incoming transaction to read or update. Read accesses involve a concurrency control request to obtain access, followed by a disk I/O to read, followed by CPU usage for processing the data item. Peers that vote yes lock their replica and send back their (global) version number to the requesting peer. Reading or writing without a quorum jeopardizes consistency. Firm deadlines are used for the transactions. The transactions supplied to the system are free from concurrency control.

LARPA inherits the read/write attributes of the ROWAA protocol [133]. It uses all active replicas available for the write quorum. LARPA requires quorum formation to start from the centre of the structure. The address of this centre is provided to the authorized owners/users. The replicas are searched through broadcasting from the centre peer. Replicas having active status can respond to the received query. Control messages are exchanged among replicas in the system to share the active status of the replicas.

Searching for Read/Write Quorums: An authorized owner/user of the database sends a query to the system. The system searches for the group of replicas to respond to the received query. The replicas are searched through broadcasting, starting from the centre replica. This takes less time because all other replicas are available at one hop distance. The read quorum is constituted of any one replica, the first found from the structure; LARPA performs well in a small replication system. The write quorum is constituted of all available replicas from the structure. A fixed wait time is allowed for replicas to participate; otherwise replicas are assumed to be unavailable.
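The read/write quorum rules above can be sketched as follows, modeling a replica's response within the wait time as a boolean active flag; the function names and the sample replica records are illustrative.

```python
def read_quorum(replicas):
    """Read quorum: any one replica, the first active one found."""
    for r in replicas:
        if r['active']:
            return [r]
    return []  # no active replica: the quorum cannot be formed

def write_quorum(replicas):
    """Write quorum: all replicas responding within the wait time
    (non-responders are treated as unavailable, ROWAA-style)."""
    return [r for r in replicas if r['active']]

rs = [{'id': 1, 'active': False},
      {'id': 2, 'active': True},
      {'id': 3, 'active': True}]
```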


7.4.1 Replica Leaving from the System

For LARPA, a fail-stop based system is assumed: if any of the peers fails, it stops sending all types of messages.


Figure 7.3 LARPA Structure Representing the Replica p14 Departing the Network


Figure 7.4 LARPA Structure Representing the Centre Replica p5 Departing the Network

Replica Leave: A replica in LARPA can leave with or without informing the system; it simply stops working and forwarding data/control messages. A ping message is always sent to the centre to report the active status of a replica. This provides information about whether a replica is active, working properly and maintaining an updated copy of the data. A replica leaving the system does not affect the functioning of the system as long as a single replica remains active {Figure 7.3}.

Centre Leave: In case the centre fails, as shown in Figure 7.4, the next replica automatically takes charge as centre. This replica manages the system from its present location in the structure.
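The centre-leave rule can be sketched as promoting the next best surviving replica; ordering the survivors by candidature is an assumption consistent with how the centre was originally chosen, and all names here are illustrative.

```python
def promote_new_centre(replicas, candidature, failed_centre):
    """On centre failure, the next-best surviving replica takes charge."""
    survivors = [r for r in replicas if r != failed_centre]
    return max(survivors, key=candidature.get) if survivors else None

cd = {'p5': 0.9, 'p12': 0.8, 'p14': 0.7}
new_centre = promote_new_centre(list(cd), cd, 'p5')
```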

7.4.2 Replica Joining the System

A rejoining replica tries to connect using the addresses of its old neighbors; these addresses were stored by the replica at the time it left the system. After connecting with its old neighbor replicas, it updates its data contents from those neighbors. An active central replica may also be utilized for updating the data items. The replica announces its active status, through control message passing, after successfully updating its data items.

Centre Joins: When the centre replica wants to rejoin the system, it first tries to connect with its old connections (stored in its memory). After connecting with the replicas in the system, the centre updates its data contents by matching version numbers and contents. The centre replica receives data-update acknowledgements from all available replicas participating in the system, and announces its active status, through a control message, after successfully updating its data items.
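The version-number matching used on rejoin can be sketched as follows; the record layout and field names are assumptions for illustration, and acknowledgement handling is omitted.

```python
def sync_on_rejoin(local, neighbours):
    """On rejoin, compare version numbers with the reachable replicas
    and copy the contents of the freshest one before announcing an
    active status (control-message exchange is abstracted away)."""
    freshest = max(neighbours + [local], key=lambda r: r["version"])
    if freshest["version"] > local["version"]:
        local["version"] = freshest["version"]
        local["data"] = dict(freshest["data"])
    local["active"] = True  # announced only after the update succeeds
    return local


# Assumed example: the rejoining replica is two versions behind.
rejoining = {"version": 3, "data": {"x": "old"}, "active": False}
neighbours = [{"version": 5, "data": {"x": "new"}, "active": True}]
sync_on_rejoin(rejoining, neighbours)
```

The same comparison works for an ordinary replica (against its old neighbours) and for the centre (against all available replicas).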

Network Traffic: In LARPA, network traffic is much lower compared to the random, Hierarchical Quorum Consensus, and Extended Hierarchical Quorum Consensus structures [145, 147]. This reduced traffic is due to LARPA's logical structure and placement of replicas: network traffic from message passing is limited to the replica overlay, which in turn reduces traffic in the underlay topology and over the Internet.

Fault Tolerance: This is a high-priority requirement for the system, especially in P2P systems. LARPA works even with only its last active replica available in the system: it tolerates up to sn − 1 faults (where sn is the number of replicas in the system), as a single replica provides the complete information to the system.

7.5 Simulation and Performance Study

To evaluate LARPA, an event-driven simulation model for a firm-deadline real-time distributed database system has been used (Figure 4.5). The model presented in Figure 4.5 is an extended version of the model defined in [164]. It consists of an RTDDBS distributed over n peers connected by a secure network.

7.5.1 Performance Metrics

The performance of the LARPA is evaluated and compared with other existing

systems through simulation. Performance metrics defined in Table 4.2 and in Table

7.2 are used to evaluate the system. In the simulation we have used the performance

parameters defined in Table 4.3.


Table 7.2 Performance Metrics-III

Performance Metrics used to measure the performance of the system

• Quorum Search Time: the time duration between when a request for a quorum is submitted and when the required replicas for the quorum have been found in the system.

• Network Overhead: a measure of the number of messages transferred in the network to propagate an update message through the underlay topology. For an efficient logical structure the network load should be low; a higher network load creates congestion in P2P networks.

7.5.2 Simulation Results

To evaluate the LARPA logical structure, a series of simulation experiments was performed. Figure 7.5 presents the behavior of the peers. It is observed that the session time of a peer increases with an increase in its availability. A sufficient time slot is required for the peers to respond to arriving queries and to increase the performance of the system. Peers with availability greater than 0.5 are suitable to store replicas, because such peers have sufficient session time to execute transactions; below this availability the replica leave/rejoin overhead is higher and performance is poor. A system constituted of high-availability peers provides minimum interference, so the effect of churn rate on the system is minimized.

Figure 7.5 Relationship between the session time of a peer and its availability in P2P networks (x-axis: Availability; y-axis: Average Up Session (ms))


Figure 7.6 compares the response times of the Random [153], HQC [145], and HQC+ [147] replication systems with LARPA. LARPA minimizes the response time; this shorter response time helps transactions execute faster and reduces the workload of the system. The lowest response time is due to the placement of the minimum required number of replicas in the LARPA-based system. The LARPA-based system therefore bears more workload than the other considered systems.

Figure 7.6 Variations in response time with quorum size (x-axis: Quorum Size; y-axis: Avg. Response Time (ms); series: Random, HQC, HQC+, LARPA)

It is observed from Figure 7.7 that the LARPA structure has the best restart ratio among all the considered structures. Each logical structure has its highest restart ratio at a particular value of the Mean Transaction Arrival Rate (MTAR): MTAR = 0.8 for Random, MTAR = 1.2 for HQC, MTAR = 1.2 for HQC+, and MTAR = 1.4 for LARPA. This indicates that a system starts exhausting near these values of MTAR. After the peak, the restart ratio decreases with further increase in MTAR, because the time left for a transaction to restart is small; the time is consumed in communication delays and other factors affecting performance. The restart ratio for LARPA is the minimum, and LARPA can bear a transaction load up to approximately MTAR = 1.4.

Figure 7.8 presents the relationship between success ratio and MTAR in the system. The success ratio is the number of transactions completed successfully within their deadline divided by the number of transactions submitted for execution. The success ratio decreases with increase in MTAR and is the same for all considered systems at MTAR = 0.2. Beyond MTAR = 0.2, however, the success ratio decreases very sharply for the random system and much less sharply for the LARPA-based system. LARPA executes more transactions successfully as compared with the other systems.
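The success-ratio metric plotted in Figure 7.8 can be computed directly from a transaction log; the record fields below are illustrative assumptions about how the simulator might report each transaction.

```python
def success_ratio(transactions):
    """Success ratio = transactions completed within their deadline
    divided by transactions submitted for execution (Figure 7.8's
    metric)."""
    ok = sum(1 for t in transactions
             if t["completed"] and t["finish_ms"] <= t["deadline_ms"])
    return ok / len(transactions)


# Assumed example log: one transaction misses its deadline, one aborts.
txns = [{"completed": True,  "finish_ms": 40, "deadline_ms": 50},
        {"completed": True,  "finish_ms": 70, "deadline_ms": 50},
        {"completed": False, "finish_ms": 0,  "deadline_ms": 50},
        {"completed": True,  "finish_ms": 20, "deadline_ms": 50}]
ratio = success_ratio(txns)
```

A transaction that finishes after its deadline counts as a miss even though it completed, which is exactly the firm-deadline semantics assumed by the simulation model.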

Figure 7.7 Variations in restart ratio with system workload (x-axis: Mean Transaction Arrival Rate (MTAR); y-axis: Transaction Restart Ratio; series: LARPA, HQC, HQC+, Random)

Figure 7.8 Relationship of transaction success ratio with system workload (x-axis: Mean Transaction Arrival Rate (MTAR); y-axis: Success Ratio; series: LARPA, HQC, HQC+, Random)

From Figure 7.9 it is observed that the throughput of LARPA is higher than that of the other selected structures, due to its minimum quorum search time and shorter response time. The throughput increases in its initial phase and starts decreasing after its peak; the peak marks the maximum rate of transactions that a structure can bear. After the peak, transaction waiting times, lock times, etc., start increasing and the throughput starts decreasing. The value of MTAR at the peak for LARPA is approximately 1.4, the highest among the selected logical structures.

Figure 7.9 Variation in throughput with system workload (x-axis: Mean Transaction Arrival Rate (MTAR); y-axis: Throughput (tps); series: Random, HQC, HQC+, LARPA)

It is observed from Figure 7.10 that the smaller the search time for quorum formation, the faster the response; a smaller search time also reduces the time to execute transactions. LARPA performs better than the other logical structures and takes the minimum search time to form a quorum.

Figure 7.10 Relationship between average search time and quorum size (x-axis: Quorum Size; y-axis: Avg. Search Time (ms); series: Random, HQC, HQC+, LARPA)

Figure 7.11 presents the variation in average message transfers: the random replica topology generates the highest number, whereas LARPA generates the minimum. This reduced average message transfer is due to the least number of replicas activated in the system; it is also observed that LARPA generates the minimum network load.

It is observed from Figure 7.12 that LARPA has a high probability of accessing updated data. This is because the time to update the system is minimal, owing to the one-hop distance and the reduced response time of the replicas.

Figure 7.11 Variation in network traffic with quorum size (x-axis: Quorum Size; y-axis: Avg. Message Transfer; series: LARPA, HQC, HQC+, Random)

Figure 7.12 Probability to Access Updated Data vs. Peer Availability (x-axis: Peer Availability; y-axis: Probability to Access Updated Data; series: CELL, HQC+, HQC, Random)

Figure 7.13 presents the comparison of response time between LARPA1 and LARPA2. It is observed that LARPA2 provides a better response time than LARPA1, because messages in LARPA2 take less time to travel from the centre to a peer in the LARPA structure, as its peers are selected on the basis of minimum distance from the centre.

Figure 7.14 gives the comparison of network overhead between LARPA1 and LARPA2; it is observed that LARPA2 generates less network overhead than LARPA1.

Figure 7.13 Response Time Comparison between LARPA1 and LARPA2 (x-axis: Quorum Size; y-axis: Avg. Response Time (ms))

Figure 7.14 Message Overhead Comparison between LARPA1 and LARPA2 (x-axis: Quorum Size; y-axis: Message Overhead)

7.6 Discussion

LARPA permits a replica to keep its old addresses and to rejoin at the same place from where it left, which reduces the rejoin overhead of the replica. Peer resume time is minimal, amounting to little more than checking its old neighbours. Thus, the system reconciliation time is very low as compared to the other considered systems. LARPA exploits the advantages of the overlay topology in establishing connections, and these connections can be disconnected or established without affecting the peers in the underlay topology.

The network overhead is minimized by limiting the number of replicas in the system. All replicas are placed at one-hop distance from the centre peer, from where every search starts. Data availability of the system is maintained by placing replicas on the peers with the maximum candidature value. Fault detection is fast due to the one-hop distance of all replicas from the centre. LARPA is adaptive in nature and tolerates up to sn − 1 faults; it allows the system to work as long as the last replica remains active.

In LARPA-based systems, a single replica may be accessed in the best case. The system provides a high probability of accessing updated data, together with the minimum quorum acquisition time and response time. LARPA provides the minimum transaction restart ratio, better throughput, and a better transaction success ratio. On the basis of the comparative analysis it is found that LARPA2 provides a better response time and generates less network overhead. All these features recommend the LARPA2 structure for dynamic-environment applications where high throughput is required.

7.7 Summary

In this chapter we have presented a Logical Adaptive Replica Placement Algorithm (LARPA). LARPA matches the requirements of RTDDBS over P2P, where a fast response is expected from the system. It uses its own peer-selection criterion to keep the data availability of the system within an acceptable range, and it efficiently stores replicas on one-hop-distance peers to improve data availability for RTDDBS over a P2P system.

To avoid long waiting times, LARPA inherits the read/write quorum attributes of the ROWAA protocol. LARPA is adaptive in nature and tolerates up to sn − 1 faults. It shows the minimum response time, search time to generate read/write quorums, transaction restart ratio, and transaction miss ratio. It generates the lowest message traffic for updates in P2P networks, and the LARPA-based system bears the maximum workload. It is further observed that algorithm LARPA2 performs slightly better than algorithm LARPA1 due to the shorter distances between the centre and the replicas. LARPA is suitable for implementing reliable RTDDBS over P2P networks.

The next chapter presents a hierarchy-based quorum consensus scheme for improving data availability in P2P systems.


Chapter 8

Height Balanced Fault Adaptive Reshuffle Logical Structure for P2P

Networks

Distributed databases are known for their improved performance over conventional databases. Data replication is a technique for enhancing the performance of distributed databases, in which data is replicated over geographically separated systems. In highly dynamic environments the probability of accessing stale data is comparatively high.

In this chapter a Height Balanced Fault Adaptive Reshuffle (HBFAR) scheme for improving hierarchical quorums over P2P systems is developed. It inherits the read-one-write-all attribute of the ROWA protocol [133]. The HBFAR scheme maximizes the replicas overlapped between read/write quorums and improves the response and search times.

The rest of the chapter is organized as follows. Section 8.1 introduces the chapter. Section 8.2 presents the system model. Section 8.3 outlines the system architecture. The HBFAR scheme is explored in Section 8.4. Section 8.5 explains the simulation and performance study. Section 8.6 discusses the findings. Finally, the chapter is summarized in Section 8.7.

8.1 Introduction

Data replication is one of the techniques used to enhance the performance of distributed databases. In replication, data is distributed over geographically separated systems, and each copy of the data stored on a peer is generally called a replica. Multiple replicas are consulted to obtain fresh data items from the distributed system, which makes the system reliable and resilient to faults. Data replication is a fundamental requirement of distributed database systems deployed on networks which are dynamic in nature, viz., P2P networks. Peers can join or leave the network at any time, with or without prior information, and it is found in the literature that the churn rate of peers in P2P networks is high. In such a highly dynamic environment the probability of accessing stale data is comparatively high compared to a static environment. There are a number of challenges in implementing databases over dynamic systems like P2P networks; the major ones are data consistency, one-copy serializability, fault tolerance, availability of data items, response time, churn rate of peers, and network overhead.

A number of protocols and algorithms have been proposed in the literature to implement and maintain consistency in distributed databases, e.g., single lock, distributed lock, primary copy, majority protocol, biased protocol, and quorum consensus. The availability of replicas in a dynamic P2P network is a major challenge because of the churn rate of peers; data availability is also affected by peer availability in the system. To maintain data availability within an acceptable range, a quorum consensus protocol for accessing the replicas is quite a good option. A system with replicas stored in a logical structure improves the quorum acquisition time.

If a quorum is formed such that it contains the maximum number of updated replicas, then the probability of accessing stale data is obviously reduced. To improve the probability of accessing updated data from the set of replicas, the degree of intersection between two consecutive quorums must be high. To improve the degree of intersection among consecutive read-write and write-write quorums, the logical structure needs to be accessed in a special way.

A logical structure over the replicas reduces unnecessary network traffic caused by multicasting search messages/queries for the existing replicas. The network traffic can be reduced by prioritizing the access of logical structures in P2P systems, and self-organization of the logical structures may further improve the performance of the system. Network traffic is reduced further by optimizing the underlay path.

In order to reduce search time we may take advantage of the overlay topology in the P2P network. If the logical structure is organized in such a way that all updated replicas are popped up, the search time reduces drastically and the probability of accessing an updated data item improves. To address the above challenges we have developed the Height Balanced Fault Adaptive Reshuffle (HBFAR) scheme for P2P systems. It is a self-organized scheme that arranges replicas in an almost complete binary tree with some special attributes. It improves the probability of accessing updated data from the quorums and provides a high degree of intersection between two consecutive quorums.

8.2 System Model

For database replication, peers are selected on the basis of the availability factor or the session time, and prioritized access is preferred when accessing the peers. A longer session time provides a better probability of accessing fresh data items. All replicas are organized in a binary logical tree structure. The model defined for the system is as follows:

P is the set of peers, defined as P = {p1, p2, p3, ..., pk}, where k is the number of peers participating in the system. The replica set R is the set of the l peers holding replicas, where R is a subset of P, i.e., R ⊆ P. A replica may be active or inactive according to its availability in the system.

Ra = {Ra1, Ra2, Ra3, ..., Ras} is the ordered set of all active replicas in the system, arranged in the logical tree structure. Ra1 has a longer session time than Ra2, Ra2 has a longer session time than Ra3, and so on. Replicas in the logical structure are managed according to session time; the replica with the longest session time is placed at the root.

Rd is the set of all dead/inactive replicas at a particular time. Ra ∩ Rd = φ, i.e., each replica is either active or dead depending upon the present state of the peer. The replica set is also defined in terms of Ra and Rd: R = Ra ∪ Rd.

A write quorum is an ordered set of replicas from Ra. Qw = {Qw1, Qw2, Qw3, ..., Qws} is the set of write quorums formed at various times. A write quorum starts from the first replica of Ra and includes replicas up to the write-quorum size decided by the administrator, say n:

Qw = {Rai | ti > ti+1, i = 1...n}

where n is the size of the write quorum and ti is the session time of the i-th replica for the period it is active in the system. Here i is the position of the replica in the logical structure starting from the root, i.e., the root of the logical structure is at position 1, and its left and right children are at positions 2 and 3, respectively.


Qr = {Qr1, Qr2, Qr3, ..., Qrs} is the set of read quorums formed at various times. A read quorum is defined as

Qr = {Rai | ti > ti+1, i = 1...m}

where m is the size of the read quorum, decided by the administrator. The read quorum must satisfy m ≤ n, where tm and tn are the session times of the last replicas involved in the read and write quorums, respectively, with tm ≥ tn. All replicas having the largest session times are involved in the read and write quorums. A read quorum is always a subset of the corresponding write quorum, i.e., for all Qri ∈ Qr and Qwi ∈ Qw, Qri ⊆ Qwi. Thus, a read quorum will always contain updated replicas, because all replicas with greater session times, including the root of the logical structure, are involved in the quorums. Only when all replicas, including the root, go down does the system fail; otherwise every read quorum holds the updated information.

The quorum sizes m and n depend upon the availability of the peers in the system. Their values may be increased in case of low availability of the peers holding replicas and decreased when the peers holding replicas are highly available.
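The model above can be sketched directly: order the active replicas by decreasing session time and take prefixes of sizes m and n (with m ≤ n), so that Qr ⊆ Qw holds by construction. The record layout is an illustrative assumption.

```python
def quorums(active_replicas, m, n):
    """Read quorum = first m replicas and write quorum = first n
    replicas (m <= n) of the active set ordered by decreasing
    session time, so the read quorum is a prefix of the write quorum."""
    assert m <= n <= len(active_replicas)
    ordered = sorted(active_replicas,
                     key=lambda r: r["session"], reverse=True)
    return ordered[:m], ordered[:n]


# Assumed example: four active replicas with distinct session times.
ra = [{"id": "Ra3", "session": 40}, {"id": "Ra1", "session": 90},
      {"id": "Ra4", "session": 25}, {"id": "Ra2", "session": 60}]
qr, qw = quorums(ra, m=2, n=3)
```

Because both quorums are prefixes of the same ordering, any read quorum intersects every write quorum at the longest-session replicas, which is what guarantees access to updated data.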

8.3 System Architecture

To study the behavior of the HBFAR scheme, a 7-Layers Transaction Management System (7-LTMS) is proposed (Figure 8.1). 7-LTMS helps to execute received queries and maintains the other necessary requirements of the system. A brief discussion of the 7-LTMS components follows:

Query Optimizer (QO): A query is divided into a number of subqueries. The QO optimizes these subqueries to reduce the execution time of the overall query and decides the order in which the subqueries execute, using various optimization techniques.

Subquery Schedule Manager (SQSM): A subquery that is ready to execute is scheduled so as to achieve one-copy serializability of the transaction. The SQSM is responsible for rearranging the order of subqueries to achieve this; it also maintains concurrency and data consistency in the system.


Figure 8.1 7-Layers Transaction Management System (7-LTMS). Layers: Query Optimizer, Subquery Schedule Manager, Quorum Manager, Replica Search Manager, Replica Overlay Manager, Update Manager, and Network Connection Manager, over the Local Storage of the Partial Database and the P2P Network

Quorum Manager (QM): is responsible for deciding the quorum consensus used to access the data items. The quorum is decided such that the system achieves an acceptable replica availability, i.e., the number of replicas accessed is increased if the availability of the peers storing replicas is low, and may be reduced if peer availability is high, in order to lower the network overhead. The QM maintains the availability of the replicas at the desired level, recognizes the replicas to be accessed from the logical structure, and implements the read/write quorum algorithm.

Replica Search Manager (RSM): is responsible for searching any replica from the group of replicas, which are arranged into a logical structure maintained by the ROM. The RSM also facilitates the searching of read/write quorums, using replica-search algorithms such as HQC, HQC+, and HBFAR.

Replica Overlay Manager (ROM): To increase the performance of the system, replicas are arranged into a logical structure. The ROM is responsible for placing each replica in the logical structure so that the replicas are easy to access; an efficient overlay logical structure may reduce the replica search time. The ROM identifies the replicas for making a quorum and maintains the logical structure over time. Whenever a replica leaves the network, the ROM readjusts the remaining replicas by rearranging their addresses in the logical structure.

Update Manager (UM): implements all the update strategies. The eager methodology is used to update the prioritized replicas selected for the write quorum; the lazy methodology is used for the remaining replicas in the system. This update is performed through the ROM. The UM maintains the freshness of the data items held by the replicas and implements the eager/lazy update algorithm. The algorithm is selected so as to propagate updates within minimum time, because update time plays a key role in system performance: the probability of accessing stale data is minimized by minimizing the update time of the system.

Network Connection Manager (NCM): Every logical address of a replica is checked and converted into the physical address of that replica. The NCM routes data and control information and maintains the connectivity of the replicas.

Local Storage/Database Partition: This is the actual local database to be accessed. All data items belonging to the database partition are physically stored here, in a region provided by the owner of that peer and shared over the network. Data encryption technologies may be employed to protect the data from attack or misuse by unauthorized persons.

8.4 Height Balanced Fault Adaptive Reshuffle (HBFAR) Scheme

The HBFAR scheme improves the replica acquisition time and the probability of accessing updated data items. It places replicas in a binary logical structure. Initially, the peers participating in the overlay topology are checked for their average session time, and the peers with the highest average session times are selected to hold the replicas. These replicas are arranged in a binary tree logical structure such that they form an almost complete tree. The replica with the longest up-session time in the system is selected as the root. Similarly, the other replicas are arranged as left and right children according to the session time of each replica. All replicas with higher session times are placed nearer the root of the logical structure. These top replicas participate in every read/write quorum; their regular participation maintains an updated copy of the data items, which increases the probability of accessing an updated copy of the data. The HBFAR scheme inherits some special properties, viz., replicas are arranged in the tree such that the session time of each parent replica is greater than that of its children, and the session time of the left child is greater than that of the right child of any parent. All peers participating in the system find an alternate path to all their ancestors, i.e., to all parent replicas along the path from leaf to root.

Replicas with the highest session times are given priority over those with smaller session times when they are included in a quorum. The quorum is formed by first taking the replica at the root of the tree and then the replicas along the branches, i.e., from top to bottom. The branches of a parent node are accessed such that the left child is accessed before the right child, i.e., from left to right. This pattern of accessing replicas from the tree is referred to as Top-to-Bottom and Left-to-Right. The HBFAR scheme thus permits each read quorum to obtain an updated copy of the data items.
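The Top-to-Bottom, Left-to-Right access pattern is a breadth-first traversal that visits the left child before the right child. A minimal sketch over a dictionary-based tree (the node layout is an assumption):

```python
from collections import deque


def form_quorum(root, q):
    """Collect a quorum of q replicas in Top-to-Bottom, Left-to-Right
    order: a breadth-first traversal enqueueing the left child before
    the right child."""
    quorum, frontier = [], deque([root])
    while frontier and len(quorum) < q:
        node = frontier.popleft()
        quorum.append(node["id"])
        for child in ("left", "right"):  # left before right
            if node.get(child):
                frontier.append(node[child])
    return quorum


# Tree mirroring Figure 8.3: P1 at the root, P2/P3 its children, etc.
tree = {"id": "P1",
        "left": {"id": "P2",
                 "left": {"id": "P4", "left": {"id": "P8"}},
                 "right": {"id": "P5"}},
        "right": {"id": "P3",
                  "left": {"id": "P6"}, "right": {"id": "P7"}}}
```

With this tree, a quorum of size three is P1, P2, P3 and a quorum of size four is P1, P2, P3, P4, matching the example given with Figure 8.3.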

Accessing replicas from the same positions in the logical structure every time increases the degree of intersection among consecutive quorums, and the common replicas increase the probability of accessing updated data items from the system. Since the replicas are accessed from the upper part of the tree in Top-to-Bottom and Left-to-Right fashion, the replica search time is minimized. All read-write and write-write quorums intersect each other; hence, every read quorum accesses an updated copy of the data item. The maximized intersection of two consecutive read-write and write-write quorums ensures access to fresh data items.

The problem of finding a path between a pair of source and destination peers in the overlay maps to the problem of finding a route between the source and destination peers in the underlay topology. The path between peers P1 and P2 is direct, of one hop-count distance, while the path between peers P1 and P4 is of two hop-count distance, as shown in Figure 8.2. The identified route between source and destination in the underlay may or may not be the shortest path; however, a shortest path in the underlay is advantageous for reducing communication cost.


The working of the HBFAR scheme is divided into two independent parts: accessing the group of replicas, and maintenance of the logical structure. Both parts execute in parallel to improve the efficiency of the system. The replicas from the root towards the terminal nodes are included in quorums to maximize the intersection set. The level down to which replicas are included depends upon the size of the quorum; e.g., replicas 1, 2, 3, 4, 5, 6, 7, 8 generate the HBFAR tree shown in Figure 8.3. Replicas 1, 2, 3 are used for a quorum of size three, and replicas 1, 2, 3, 4 form a quorum of size four. The replicas with higher session times are given priority in forming the quorums, as shown in Figure 8.3.

Figure 8.2 The arrangement of peers to make the Height Balanced Fault Adaptive Reshuffle tree over the peers of the underlay topology of a P2P network. The dotted-line connectors show connections between peers in the overlay topology; the dark-line connectors show connections between peers in the replica-topology tree. P14 is shown as an isolated peer in the network.

Peer Id: 1 Hop, 2 Hop, 3 Hop, 4 Hop
P1: X
P2: P1, X
P3: P1, X
P4: P2, P1, X
P5: P2, P1, X
P6: P3, P1, X
P7: P3, P1, X
P8: P4, P2, P1, X

Figure 8.3 Replica arrangements in the HBFAR scheme generated from Figure 8.2 (the table lists, for each peer, the stored ancestor addresses hop by hop up to the root, with X marking the end of the path). The session time of P1 is greater than those of P2 and P3. The order of the replicas according to session time in the HBFAR scheme is P1, P2, P3, P4, P5, P6, P7, and P8.


The performance of the system remains approximately the same even under a high churn rate of peers. With maximized overlapping of replicas in consecutive read and write quorums, this scheme ensures access to fresh data items from any number of replicas in the system. The HBFAR scheme also provides high fault tolerance; with self-organization, the scheme tolerates up to n − 1 faults among the n replicas in the system. Multicasting and directional forwarding are used to transfer messages in the system.

The HBFAR scheme triggers a maintenance procedure for every replica that leaves. The other replicas in the system are adjusted according to their session times: the replica with the next higher session time takes the position of the replica that left the network. By default, replicas with longer session times move upward with the passage of time. The scheme works on four rules, defined in the following sections.

8.4.1 Rule Set-I: Rules for Generation of the Height Balanced Fault Adaptive Reshuffle (HBFAR) Structure

The HBFAR structure is a special type of logical structure, similar to a complete binary tree, which is used for the replica overlay. A HBFAR structure of size n is a binary tree of n nodes which satisfies the following (Rule Set-I):

(i) The binary tree is almost complete: there is an integer k such that every leaf of the tree is at level k or k + 1, and if a node has a right descendant at level k + 1 then that node also has a left descendant at level k + 1.

(ii) The keys in the nodes are arranged in such a way that the content of each node is less than or equal to the content of its parent, i.e., Keyi ≤ Keyj for each node i, where j is the parent of node i.
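Rule Set-I is essentially the max-heap invariant keyed on session time. A minimal array-based sketch follows; note that a plain sift-up maintains rule (ii) (parent ≥ child), while the thesis's additional left-child ≥ right-child ordering would need an extra comparison, omitted here for brevity.

```python
def hbfar_insert(tree, session_time):
    """Insert a replica key into an array-based almost-complete binary
    tree (root at index 0, parent of index i at (i - 1) // 2) and sift
    it up until its parent's session time is >= its own."""
    tree.append(session_time)
    i = len(tree) - 1
    while i > 0 and tree[(i - 1) // 2] < tree[i]:
        parent = (i - 1) // 2
        tree[parent], tree[i] = tree[i], tree[parent]  # swap with parent
        i = parent
    return tree


# Assumed example session times (ms); insertion order is arbitrary.
tree = []
for s in [120, 300, 50, 500, 210]:
    hbfar_insert(tree, s)
```

The array representation keeps the tree almost complete automatically, since new keys always enter at the next free leaf position before sifting up.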

A peer at level k holds the addresses of its connected peers at levels k − 1 and k + 1 in the HBFAR structure, i.e., each peer stores the addresses of its directly connected children and of its parent. Each peer also stores the addresses of all its ancestor peers on the path from that peer to the root. All peers/replicas follow the rules of Rule Set-I to build this overlay logical structure, or replica overlay. The session time of each peer/replica is used as the key in the replica overlay; using session time as the key moves replicas with longer session times toward the root. Each newly joined peer connects at a leaf position in the logical structure, decided according to the rules


(Rule Set-I) of HBFAR. These peers also search for an alternate path to each ancestor up to the root. Each peer holding a replica transmits a beacon indicating its active status to all its directly connected peers. The stored addresses and these beacons are used to re-establish connections in case of any failure.

The addresses of peers are used to access the peers in a particular sequence. HBFAR reduces the search time in building quorums by using a minimum hop count; the resulting search time is even less than that achieved by HQC and HQC+.
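The Rule Set-I structure can be illustrated with a short sketch. This is a hypothetical array-backed model, not the thesis implementation: the HBFAR overlay behaves like a max-heap keyed on session time, so a joining replica is appended as a leaf and moved toward the root while its session time exceeds its parent's. Peer names and session times below are invented for illustration.

```python
# Hypothetical sketch (not the thesis code): the HBFAR overlay modeled as an
# array-backed complete binary tree, i.e., a max-heap keyed on session time.

class HBFAR:
    def __init__(self):
        self.nodes = []  # nodes[0] is the root; a list always stays a complete tree

    def join(self, peer, session_time):
        """A new replica joins as a leaf (Rule Set-I (i)), then moves up while
        its session time exceeds its parent's, so long-lived replicas drift
        toward the root (Rule Set-I (ii))."""
        self.nodes.append((session_time, peer))
        i = len(self.nodes) - 1
        while i > 0:
            parent = (i - 1) // 2
            if self.nodes[i][0] > self.nodes[parent][0]:
                self.nodes[i], self.nodes[parent] = self.nodes[parent], self.nodes[i]
                i = parent
            else:
                break

    def is_valid(self):
        """Rule Set-I (ii): every node's key is <= its parent's key."""
        return all(self.nodes[(i - 1) // 2][0] >= self.nodes[i][0]
                   for i in range(1, len(self.nodes)))

overlay = HBFAR()
for peer, st in [("P4", 50), ("P1", 90), ("P3", 40), ("P8", 30)]:
    overlay.join(peer, st)
print(overlay.nodes[0])  # the replica with the longest session time is the root
```

Using session time as the heap key is exactly what pushes long-lived replicas rootward, the property the rest of the scheme relies on.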

8.4.2 Rule Set-II: Rules for Replica Leaving from HBFAR

When any replica leaves the system, with or without informing it, the following steps (Rule Set-II) are taken to maintain the replica logical structure. Assume that replica x at level k is going to leave the network, e.g., peer 2 leaving the network as shown in Figure 8.4.

(i) The replicas at level k + 1 that are directly connected with replica x try to connect with their alive grandparent (whose address is stored at each peer joined in the system).

(ii) If multiple replicas approach to connect, the alive grandparent compares their session times. The replica with the highest session time connects with the active grandparent and takes the position of the departed replica x in the logical structure, as shown in Figure 8.5.

(iii) The replicas at level k under the parent at level k − 1 are adjusted according to the above conditions.
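The steps above can be sketched on the same hypothetical array model of the HBFAR tree (index 0 is the root; the children of index i sit at 2i+1 and 2i+2). Session times and peer names are invented to mirror Figures 8.4-8.5.

```python
def handle_leave(nodes, i):
    """Rule Set-II sketch: when the replica at index i leaves, the directly
    connected child with the higher session time repeatedly moves up to
    take the vacated position (steps (i)-(iii))."""
    n = len(nodes)
    while 2 * i + 1 < n:
        l, r = 2 * i + 1, 2 * i + 2
        best = l if (r >= n or nodes[l][0] >= nodes[r][0]) else r
        nodes[i] = nodes[best]
        i = best
    # In this example the vacancy ends at the last leaf slot; a full
    # implementation would re-balance the tree per Rule Set-I.
    nodes.pop(i)

# Tree of Figure 8.4: P1 at the root, P2 (index 1) about to leave.
tree = [(90, "P1"), (60, "P2"), (40, "P3"), (50, "P4"),
        (45, "P5"), (35, "P6"), (30, "P7"), (25, "P8")]
handle_leave(tree, 1)
print([p for _, p in tree])  # P4 takes P2's position, as in Figure 8.5
```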

8.4.3 Rule Set-III: Rules for Replica Joining into the Replica Logical Structure

When any replica rejoins the system, the following rules (Rule Set-III) are applied by the system to maintain the replica overlay topology:

(i) Initially, the rejoined peer searches for its position through ping-pong messages starting from the root; assume the position is at level k.

(ii) The replica establishes a connection with its parent peer and both save each other's addresses.


(iii) The replica updates its data items by comparing them with its alive parent and updates the version numbers of the data items.

(iv) The replica stores the addresses of all its ancestors up to the root, obtained through the path-find message.

Figure 8.4 Replica arrangements in an HBFAR logical tree structure, together with the parent-id and hop-count table for each peer. Peer 2, shown by dotted lines, is a peer leaving the network

Figure 8.5 The HBFAR logical tree structure after the departure of Peer 2, together with the updated parent-id and hop-count table. Peer 4 takes the position of Peer 2, which has left the network. All other replicas in the downlink are readjusted accordingly

8.4.4 Rule Set-IV: Rules for Acquisition of Read/Write Quorum from HBFAR

Logical Tree

The following rules (Rule Set-IV) are defined for accessing the HBFAR logical tree to form the read/write quorums:

(i) Sizeof(Qr_i) ≤ Sizeof(Qw_i), i.e., the size of the read quorum is always less than or equal to the size of the write quorum.

(ii) Quorum acquisition always starts from the root, i.e., the root is always included in the read/write quorum.


(iii) For any integer k, if a replica at level k is in the quorum, then every replica at level k − 1 must also be in the quorum of the HBFAR logical tree. This rule is referred to as Top-to-Bottom.

(iv) If a replica from the right descendants of a parent replica is in the quorum, then a replica from the left descendants must also be in the quorum. This is the Left-to-Right rule.

These rules are implemented in the HBFAR scheme by combining Top-to-Bottom and Left-to-Right access to replicas in the HBFAR logical structure.

Read/Write Quorum: The quorum size depends upon the overall availability of replicas and the overhead of the network. The size of the quorums may be increased when replica availability is low and reduced when it is high. The replicas included in quorums are selected according to Rule Set-IV. The quorum size also affects network traffic: the number of messages transferred to maintain the HBFAR logical structure increases with the quorum size, and therefore so does the network overhead. The quorum size is directly proportional to the network overhead, i.e., there is a tradeoff between network overhead and quorum size.

The HBFAR scheme uses a fixed number of replicas in quorums, chosen after considering all factors affecting the system. The sizes of the read and write quorums may differ depending upon the requirements of a system. Replicas are included in sequence from the HBFAR logical structure according to session time to form the quorums. All replicas accessed in a read quorum are compared for the updated version of the data items. In the best case, only the root needs to be accessed for updated data.

Write quorums are formed in the same way as read quorums: replicas are selected from the HBFAR logical structure from top to bottom and left to right. Whenever a write query is executed in the system, all replicas in the quorum are updated by the write-through method, i.e., the write is committed after receiving acknowledgements from all available replicas in the quorum. The remaining replicas in the structure are updated with the write-back method. Most queries are answered by the topmost replicas of the HBFAR logical structure, which have longer session times. The replicas not used in write quorums are updated through the lazy method. These extra links reduce the time for an update message to reach all replicas in the system.

If the quorum size equals four, the four replicas at the top of the HBFAR logical structure, taken from root to branch and left to right, form the read/write quorum. Considering the logical structure shown in Figure 8.5, peers P1 and P4 are used for a quorum of size two; peers P1, P4, and P3 for a quorum of size three; and peers P1, P4, P3, and P8 for a quorum of size four.

8.4.5 Correctness Proof of the Algorithm

We use mathematical induction to prove that the replicas accessed in a read quorum from the HBFAR logical structure include at least one replica holding updated data items. Assume the replicas are organized in an HBFAR logical structure of height h.

Basis:

(i) Assume the height of the HBFAR tree is 0, i.e., only one peer/replica is in the structure (placed at the root). According to Rule Set-IV (ii), every read as well as write quorum must include the root peer, so every access gets the updated data items from the root. These quorums are described mathematically as:

Qw_0 = {P_0}, Qr_0 = {P_0}, ∴ Qr_0 ⊆ Qw_0 and Qw_0 ∩ Qr_0 ≠ φ

The read quorum and write quorum intersect with each other, i.e., the read quorum gets the updated data items. The HBFAR scheme provides updated data for height 0. {Hence Proved}

(ii) Assume the height of the HBFAR logical structure is 1, i.e., the HBFAR logical tree has at most 2-3 peers: one replica at the root and 1-2 in the downlink of the root. According to Rule Set-IV, the size of the write quorum is greater than or equal to the size of the read quorum, and the replicas in the quorum are selected Top-to-Bottom and Left-to-Right. The write quorum Qw_1 is defined as:

Qw_1 = {{P_0}, {P_0, P_1}, {P_0, P_1, P_2}} … (8.1)

The read quorum is defined as:

Qr_1 = {{P_0}, {P_0, P_1}, {P_0, P_1, P_2}} … (8.2)


For every possible read quorum against a write quorum, the quorums intersect each other; thus, a read always gets the updated information. Using eqns (8.1) and (8.2) we conclude the following:

∀ Qr_1: Qr_1 ⊆ Qw_1, ∴ Qw_1 ∩ Qr_1 ≠ φ

All possible read quorums always contain at least one updated replica. This implies that a read quorum always accesses fresh data items, as the intersection is always non-empty. Thus, the HBFAR scheme provides updated data for height 1. {Hence Proved}

Hypothesis:

Assume an HBFAR logical tree of height i with write and read quorums Qw_i and Qr_i of sizes l and k, respectively, defined as:

Qw_i = {P_1, P_2, ..., P_k, ..., P_l} … (8.3)

Qr_i = {P_1, P_2, P_3, ..., P_k} … (8.4)

with k ≤ l and l, k ≤ 2^i − 1, where 2^i − 1 is the total number of replicas up to level i of the logical structure. Replicas are accessed in Top-to-Bottom and Left-to-Right fashion as mentioned in Rule Set-IV. Assume that all replicas up to P_k fall in the intersection of the write and read quorums according to Rule Set-IV. From eqns (8.3) and (8.4):

Qw_i ∩ Qr_i = {P_1, P_2, P_3, ..., P_k} … (8.5)

Therefore, each read quorum accesses updated replicas, as the intersection of the write and read quorums is not empty.

Inductive Step:

We now prove that the property also holds for an HBFAR logical structure of height i + 1. According to Rule Set-IV, a write quorum of size n is defined as:

Qw_{i+1} = {P_1, P_2, P_3, ..., P_k, ..., P_l, ..., P_n} … (8.6)

A read quorum of size m is defined as:

Qr_{i+1} = {P_1, P_2, P_3, ..., P_k, ..., P_m} … (8.7)


Let the size selected for the write quorum be n and for the read quorum be m, where l ≤ n and k ≤ m, with the quorums generated through Rule Set-IV. From eqns (8.6) and (8.7):

Qw_i ⊆ Qw_{i+1}, Qr_i ⊆ Qr_{i+1} … (8.8)

From eqns (8.5) and (8.8):

Qw_{i+1} ∩ Qr_{i+1} ≠ φ

since the intersection of the two quorums contains at least {P_1, P_2, P_3, ..., P_k} from eqn (8.5). This proves that every read quorum intersects the write quorum in the HBFAR scheme. Therefore, every read quorum carries the updated information. {Hence Proved}
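The induction above can also be checked mechanically: with quorums taken as level-order prefixes (Rule Set-IV) and the read quorum no larger than the write quorum, the read quorum is always a subset of the write quorum, so the intersection is never empty. A small self-contained check (replica names hypothetical):

```python
# Brute-force check of the quorum intersection property on a 15-replica tree.
replicas = [f"P{i}" for i in range(1, 16)]       # hypothetical HBFAR of height 4
for w in range(1, len(replicas) + 1):            # every write-quorum size
    for r in range(1, w + 1):                    # every read-quorum size <= w
        Qw, Qr = set(replicas[:w]), set(replicas[:r])
        assert Qr <= Qw and (Qw & Qr), "quorums must intersect"
print("read/write quorums intersect for all sizes")
```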

Adaptivity & Fault Tolerance: The HBFAR scheme easily adapts to any peer fault. It works for both cases, a peer leaving or joining the system, and tolerates up to n − 1 faults among the n replicas participating in the system.

Availability: the probability that at least one replica is available in the system, given as

Availability = 1 − ∏_{i=1}^{n} (1 − Pr_i) … (8.9)

where Pr_i is the probability that the i-th replica stays alive and n is the number of replicas.
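As a quick numeric sketch of eqn (8.9), with hypothetical probabilities:

```python
def availability(probs):
    # eqn (8.9): 1 - product over i of (1 - Pr_i); the product is the
    # probability that every replica is down simultaneously.
    unavailable = 1.0
    for p in probs:
        unavailable *= (1.0 - p)
    return 1.0 - unavailable

# Three replicas, each alive with probability 0.7 (hypothetical values):
print(round(availability([0.7, 0.7, 0.7]), 3))  # → 0.973
```

Even moderately available replicas combine into high system availability, which is why the quorum schemes above can keep quorum sizes small.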

8.5 Simulation and Performance Study

For the simulation, a discrete event-driven simulator was developed in C++. It is assumed that the network consists of 1000 peers in the underlay, with 10%-20% of the total peers used in the overlay topology. The random peer placement, HQC, and HBFAR topologies are implemented for the simulation. The average search time, response time, and average message transfer time are considered as performance metrics; the last measures the cost of maintaining the system after executing a write quorum. In the simulation, Dijkstra's algorithm is used to find the shortest path in all experimental setups.

Within a particular group of replicas, it is assumed that all peers are directly connected with each other. Sending and receiving messages are the only means of communication between the peers. A replica stores the addresses of the other connected replicas, which helps in searching for replicas while forming the read/write quorums. A peer contains stale information when it rejoins the system.

8.5.1 Performance Metrics

To study the behavior of the HBFAR scheme we use the performance parameters defined in Table 4.3 (Chapter 4), the performance metrics defined in Table 7.2 (Chapter 7), and a few from Table 4.2 (Chapter 4), along with the metrics defined below:

Availability: of a peer is calculated as the total active time of the peer over its total time, including active and down time. This measures the participation of a peer in the system; the longer its participation time, the more a peer contributes.

Percentage of Stale Data: is calculated as the number of accessed replicas with stale data over the total accesses in the quorum. Any system requires a minimal amount of stale data access; a system with a lower value of stale data access is considered better.
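The two metrics above can be written directly (all numbers hypothetical):

```python
def peer_availability(active_time, down_time):
    # fraction of total time (active + down) that the peer was active
    return active_time / (active_time + down_time)

def stale_percentage(stale_accesses, total_accesses):
    # share of quorum accesses that returned stale replicas, in percent
    return 100.0 * stale_accesses / total_accesses

print(peer_availability(80, 20))  # → 0.8
print(stale_percentage(3, 60))    # → 5.0
```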

8.5.2 Simulation Results

It is observed from Figure 8.6 that approximately 80% of peers are reachable under 100% availability. Peer availability is one of the factors responsible for network partitioning: the failure of any peer may partition the network, and partitions are more likely with low-availability peers. Peers in one partition are not reachable from peers in another partition. Another factor is the cardinality of the peers participating in the network. A connection in the overlay uses multiple underlay connections, so any loss of a connection in the underlay increases the path length to reach a peer in the overlay. It is also possible that a peer is not reachable within a predefined Time to Live (TTL). Reachability affects the search time for replicas: low reachability increases the path length to the searched peer, which increases the cost of accessing a replica.


Figure 8.6 Reachability of peers under availability in the network (x-axis: availability; y-axis: % reachability; series: up peers and reachable peers)

Figure 8.7 shows that the probability of accessing stale data decreases as the availability of the peers increases. In the HBFAR scheme, the probability of accessing stale data is much lower than in Random and HQC. The access of all subqueries is counted when measuring stale data access. It is also observed that the probability of accessing stale data is in an acceptable range for replicas with availability greater than 0.7. Peers with availability above 0.7 may therefore be given priority for storing replicas, so that the performance of the system is improved.

Figure 8.7 Comparison in accessing stale data under availability of peers (x-axis: availability; y-axis: probability of stale data; series: Random, HQC, HBFAR)


Figure 8.8 Comparison of average search time to form the quorum from the networks (x-axis: quorum size; y-axis: average search time in ms; series: HBFAR, HQC, Random)

Figure 8.8 shows that the quorum acquisition time increases with the quorum size. The average search time for Random Quorum Consensus is considerably higher than for the HBFAR scheme: HBFAR takes less time to search for a peer because the peers are located at known positions, whereas Random Quorum Consensus finds peers through flooding rather than through a structured search.

Figure 8.9 Comparison of average response time (x-axis: quorum size; y-axis: average response time in ms; series: Random, HQC, HBFAR)

It is observed from Figure 8.9 that the average response time increases with the quorum size. In all cases the response time becomes approximately constant beyond a quorum size of 12. The response time of the HBFAR scheme is the lowest among the considered schemes, and its quorum acquisition time is much lower than that of Random and HQC.

Figure 8.10 Comparison of average message transfer to maintain the system (x-axis: size of quorum; y-axis: average message transfer; series: Random, Hierarchical)

Figure 8.10 shows that the network overhead for Random Quorum Consensus increases rapidly as the quorum size increases. The network overhead of hierarchical quorum consensus is small compared to the random quorum because hierarchical quorum consensus uses the binary logical structure.

8.6 Discussion

The simulation results show that the average message transfer in the P2P network is minimized through directional search as compared with random search. The message transfer time in the hierarchical topology is also lower than in the random topology. The quorum acquisition time is a major factor for performance: a system that takes less time to search for replicas performs better. The HBFAR scheme takes less search time than Random and HQC because replica locations are fixed in the logical structure, whereas in Random and HQC replicas are searched randomly, which increases the time to form a quorum and affects the performance of the system.


The HBFAR scheme performs better than HQC in terms of search time to form quorums, response time, and the probability of accessing updated data items in the dynamic environment of the network. It provides better data availability in the system. It maximizes the degree of intersection among consecutive read-write and write-write quorums and thus provides a better probability of accessing updated data items from the system. The HBFAR scheme easily adapts to any peer leaving or joining the system, and system performance does not seriously degrade as the churn rate of the peers increases. It also works in case of faults, tolerating up to n − 1 of them.

8.7 Summary

In this chapter we have presented the HBFAR logical structure for overlay networks. The HBFAR logical structure is organized in such a way that updated replicas are popped toward the root, and only updated replicas participate in any quorum formation. Replicas with long session times are on the root side, while replicas with shorter session times are arranged toward the branches. The structure adjusts itself after every replica departure, so a replica with a long session time is always near the top of the tree. This also reduces the time spent forming a quorum of replicas and improves the response time of the system.

In the next chapter the work is concluded, with recommendations for future research.


Chapter 9

Conclusion and Future Work

In P2P networks, peers are rich in computing resources/services, viz., data files, cache storage, disk space, processing cycles, etc. Collectively these peers generate a huge amount of resources and collaboratively perform computing tasks using them. Peers can serve as both clients and servers, eliminating the need for a centralized node. However, nodes can join and leave continually, which makes P2P systems very dynamic, with a high rate of churn and an unpredictable topology. A major drawback of P2P systems is that resources and nodes are only temporarily available: a network element can disappear from the network at one time and reappear at another locality with an unpredictable pattern. Under these circumstances, one of the most challenging problems is to place and access real-time information over the network, because resources should always be successfully located by requesters, whenever needed, within some bounded delay. This requires the management of information under time constraints and the dynamism of the peers. Multiple challenges must be addressed to implement Real Time Distributed Database Systems (RTDDBS) over dynamic P2P networks. To enable resource awareness in such a large-scale dynamic distributed environment, a specific management system is required that takes into account the following P2P characteristics: reduction of redundant network traffic, data distribution, load balancing, fault tolerance, replica placement/updating/assessment, data consistency, concurrency control, design and maintenance of a logical structure for replicas, etc. In this thesis, we have developed a solution for resource management that supports fault-tolerant operations, short path lengths for requested resources, low overhead in network management operations, well-balanced load distribution between the peers, and a high probability of successful access from the defined quorums.

The rest of the chapter is organized as follows. The contributions of this dissertation are presented in Section 9.1. Finally, the chapter ends with future work in Section 9.2.


9.1 Contributions

Contributions of this dissertation are as follows.

1. We have designed the Statistics Manager and Action Planner (SMAP) system for P2P networks, along with algorithms that enhance the performance of its modules. SMAP enables fast and cost-efficient deployment of information over the P2P network. It is a self-managed P2P system with the capability to deal with a high churn rate of peers in the network. SMAP is fault adaptive and provides load balancing among participating peers. It permits a true distributed computing environment in which every peer node can use the resources of all other peers participating in the network, and it provides data availability by managing replicas in an efficient logical structure. To improve throughput, the execution process is divided by the system into three independent sub-processes that can execute in parallel. SMAP provides fast response times for transactions with time constraints. It also reduces redundant traffic in P2P networks by shortening the conventional overlay path, and it addresses most of the issues related to RTDDBS implemented over P2P networks.

2. We have developed a 3-Tier Execution Model (3-TEM) which comprises a Transaction Coordinator (TC), Transaction Processing Peers (TPP), and a Result Coordinator (RC). All of these operate in parallel to improve the throughput of the system. The model is adaptive in nature and balances excessive load in the system by distributing the work of the head peer to the TC and RC. The TC receives and manages the execution of transactions arriving in the system; it resolves a transaction mapped to the global schema into subtransactions mapped to the local schemas available at the TPPs. The partial results received from the TPPs are combined, prepared according to the global schema, and delivered to the requester through the RC. The TPPs receive subtransactions from the coordinator, execute them in serializable form, and submit partial results to the RC. These three stages are independent and execute transactions in parallel. A peer selection criterion to identify the peers best suited to hold replicas is also presented; the selection is performed on the basis of multiple parameters, e.g., available resources and session times of peers, which improves the performance of the system.


3. A Matrix Assisted Technique (MAT) is developed to partition a real-time database for P2P networks. It provides a mechanism to store partitions and access dynamic data over P2P networks under a dynamic environment. MAT also provides a primary level of security for the stored data while simultaneously improving data availability in the system.

4. A Timestamp based Secure Concurrency Control Algorithm (TSC2A) is developed which handles the concurrent execution of transactions in the dynamic environment of a P2P network. TSC2A maintains the security of data and time-bounded transactions along with controlled concurrency. It uses timestamps to resolve the conflicts arising in the system. Simultaneously, TSC2A provides security for each data item and for the transactions accessing it; three security levels are used to secure the execution of transactions. TSC2A also avoids the covert channel problem in the system and provides serializability in the execution of transactions at the global as well as the local level.

5. A Common Junction Methodology (CJM) reduces network traffic in the P2P network by addressing the redundant traffic generated by the topology mismatch problem. CJM finds its own route to transfer messages from one peer to another. Messages are usually forwarded from one peer to another in the overlay topology, and a single overlay hop traverses a multi-hop path in the underlay. These underlay paths may intersect, and such an intersection point, referred to as a Common Junction, is identified between two paths and utilized to reroute the messages. CJM also reduces traffic in the underlay network, and it does so without affecting the search scope in the P2P network. The correctness of the proposed CJM is analyzed through a mathematical model as well as through simulation.

6. A novel Logical Adaptive Replica Placement Algorithm (LARPA) is developed which implements a logical structure for dynamic environments. The algorithm is adaptive in nature and tolerates up to n − 1 faults. It efficiently distributes replicas to sites at one-hop distance to improve data availability in an RTDDBS over a P2P system. LARPA uses a minimum number of peers to place replicas in a system; these peers are identified through the peer selection criteria. All peers are placed at one-hop distance from the centre of LARPA, which is the place from where any search starts. Depending upon the selection of peers for the logical structure, LARPA is classified as LARPA1 and LARPA2: LARPA1 uses only the peers with the highest candidature value, calculated through the peer selection criteria, while in LARPA2 this candidature value is traded off against the distance of peers from the centre. LARPA improves the response time of the system, the throughput, the data availability, and the degree of intersection between two consecutive quorums. It also provides a high probability of accessing updated data items from the system and a short quorum acquisition time. Reconciliation in LARPA is fast because the system updates itself at a fast rate. It also reduces network traffic in the P2P network, owing to its one-hop-distance logical structure formed with a minimum number of replicas.

7. A self-organized Height Balanced Fault Adaptive Reshuffle (HBFAR) scheme is developed for improving hierarchical quorums over P2P systems. It arranges all replicas in a logical tree structure and adapts to the joining and leaving of peers in the system. It places all updated replicas on the root side of the logical structure, and to access updated data items it uses a special access order, i.e., Top-to-Bottom and Left-to-Right. The HBFAR scheme always selects updated replicas for quorums from the logical structure. It provides a short quorum acquisition time with a high degree of quorum intersection between two consecutive quorums, which maximizes the overlapped replicas for read/write quorums. HBFAR improves the response time and the search time of replicas for quorums, and in the best case it provides read-one access. The HBFAR scheme provides high data availability and a high probability of accessing updated data from the dynamic P2P system. High fault tolerance and low network traffic are reported for the HBFAR scheme under peer churn. The parallelism of quorum access and structure maintenance keeps the HBFAR scheme updated without affecting the quorum access time. HBFAR is analyzed mathematically as well as through the simulator.

A comparative study of SMAP is summarized in Table 9.1. It is found that SMAP meets most of the existing challenges.


9.2 Future Scope

Information delivery in ad hoc networks is, in any case, a resource-consuming task. We are heading towards a future of miniaturization and wireless connectivity, and ad hoc networks have the ability to deliver both at very low cost.

1. Future research can extend this work to the secure dissemination of information by integrating a security framework, in terms of trust establishment and trust management, into the P2P network. This system will also be developed for exploring and solving security issues on open networks.

2. To address unique security concerns, it would be imperative to study adjacent technological advances in distributed systems, ubiquitous computing, broadband wireless communication, nanofabrication, and bio-systems.

3. Despite good research in this socially popular and emerging field of P2P networks and systems, there is still a lot of scope for further work.

4. We identified that most P2P systems are popular for static data, i.e., data that does not change while it is shared across the network. Little work has been done on sharing dynamic data, i.e., data that changes while it is shared, among P2P systems. We have developed SMAP and tested it through simulation; in the future it will be ported to real networks.

5. Reliability is another issue which needs more attention from the research community. Other issues are secure concurrency control, secure fault tolerance, and secure load balancing.


Table 9.1 Comparison of Few Existing Systems with SMAP

Systems (in column order): CAN, Tapestry, Chord, Pastry, Napster, Gnutella, Freenet, APPA, Piazza, PIER, PeerDB, NADSE, SMAP

Load balancing: Y, Y, Y, Y, Y, N, N, Y, Y, Y, Y, Y, Y

Fault tolerant (communication link): Y, N, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y

Fault tolerant (host level): Y, Y, Y, N, N, Y, Y, Y, Y, Y, Y, Y, Y

Reliable: Replication, Replication, Replication, Replication, N, Replication, Replication, Replication, Replication, Replication, Replication, Replication, Replication

Resource sharing: N, N, N, N, Y, Y, N, Y, Y, Y, Y, Y, Y

Secure: N, N, N, Communication level, N, N, N, Y, N, Y, N, Y, Y

Scalable: Y, Y, Y, Y, Little, Little, Little, Y, Little, Y, Y, Y, Y

Performance: Good, Good, Good, Good, Better under load / poor at overload, Better under load / poor at overload, Better under load / poor at overload, Good, Better under load / poor at overload, Good, Better under load / poor at overload, Good, Good

Distributed file management: Y, Y, Y, Y, N, N, Y, Y, Y, Y, Y, Y, Y

Data partitioning: NA, NA, NA, NA, NA, NA, NA, N, N, N, N, N, Y

Traffic optimization: NA, NA, NA, NA, NA, NA, NA, N, NA, N, N, N, Y

Concurrency control: NA, NA, NA, NA, NA, NA, NA, Y, NA, Local, N, N, Y

Parallel execution: NA, NA, NA, NA, NA, NA, NA, Y, N, N, N, Y, Advanced parallel

Schema management: NA, NA, NA, NA, NA, NA, NA, Y (Global), Y (Pairwise), Y (Global), N, N, Y (Global)

File sharing: N, N, N, N, Y, Y, N, Y, Y, Y, Y, Y, Y

Degree of decentralization: Distributed, Distributed, Distributed, Distributed, Centralized, Decentralized, Distributed, Hybrid, Hybrid, Super-peer-based distributed, Hybrid, Hybrid, Hybrid

Network structure: Structured, Structured, Structured, Structured, Structured, Unstructured, Loosely structured, Independent, Unstructured, Structured, Loosely structured, Loosely structured, Structured

*NA: Not addressed, to the best of our knowledge


List of Publications

International Journals

1. Shashi Bhushan, M. Dave, R. B. Patel, “Self Organized Replica Overlay

Scheme for P2P Networks”, International Journal of Computer Network and

Information Security, Vol. 4(10), 13-23, 2012. ISSN: 2074-9090 (Print),

ISSN: 2074-9104 (Online). DOI: 10.5815/ijcnis.2012.10.02

2. Shashi Bhushan, R. B. Patel, Mayank Dave, “Height Balanced Reshuffle

Architecture for Improving Hierarchical Quorums over P2P Systems”,

International Journal of Information Systems and Communication, Vol. 3(1),

215-219, 2012. ISSN: 0976-8742 (Print) & E-ISSN: 0976-8750 (Online)

Available online at http://www.bioinfo.in/contents.php?id=45

3. Shashi Bhushan, R. B. Patel, M. Dave, “Reducing Network Overhead with

Common Junction Methodology”, International Journal of Mobile

Computing and Multimedia Communication (IJMcMc), IGI-Global, 3(3), 51-

61, December 2011. ISSN: 1937-9412, EISSN: 1937-9404, USA. DOI:

10.4018/jmcmc.2011070104

4. Shashi Bhushan, R. B. Patel, Mayank Dave, “A Secure Time-Stamp Based

Concurrency Control Protocol for Distributed Databases”, Journal of

Computer Science, 3(7), 561-565, 2007, ISSN: 1549-3636, New York.

DOI:10.3844/jcssp.2007.561.565

5. Shashi Bhushan, R. B. Patel, M. Dave, “LARPA - A Logical Adaptive

Replica Placement Algorithm to Improve Performance of Real Time

Distributed Database over P2P Networks”, Journal of Network and

Computer Applications (JNCA), Elsevier. {Communicated on October 27,

2012}

Papers in Conference Proceedings

6. Shashi Bhushan, R. B. Patel, M. Dave, “Hierarchical Data Distribution

Scheme for Peer-to-Peer Networks” In Proceedings of International

Conference on Methods and Models in Science and Technology (ICM2ST-10), December

25-26, 2010, Chandigarh. Indexed with AIP Conference Proceedings, volume


1324, pp. 332-336 (2010). (DOI: 10.1063/1.3526226)

7. Shashi Bhushan, R. B. Patel, M. Dave, “Adaptive Load Balancing within

Replicated Databases over Peer-to-Peer Networks” In Proceedings of 1st

International Conference on Mathematics & Soft Computing (Application in

Engineering) (ICMSCAE) December 4-5, 2010 at NC College of Engineering

ISRANA (HR), INDIA.

8. Shashi Bhushan, R. B. Patel, M. Dave, “CJM: A Technique to Reduce

Network Traffic in P2P Systems”, In Proceedings of IEEE International

Conference on Advances in Computer Engineering (ACE 2010), Bangalore,

INDIA, June 21-22, 2010.

DOI: 10.1109/ACE.2010.55. Available on IEEE Xplore, article number 5532818.

9. Shashi Bhushan, R. B. Patel, Mayank Dave, “A Distributed System For

Placing Data Over P2P Networks”, In Proceedings of International

Conference on Soft Computing and Intelligent Systems, Jabalpur Engineering

College Jabalpur, INDIA, 27-29 December, 2007, pp.160-164.

10. Shashi Bhushan, R. B. Patel, M. Dave, “Load Balancing within Hierarchical

Data Distribution in Peer-to-Peer Networks”, In Proceedings of 4th National

Conference on Machine Intelligence (NCMI-2008), Haryana Engg. College,

Jagadhri (HR) INDIA. August 22-23, 2008, pp. 392-395.

Page 194: Real time information dissemination and management in peer-to-peer networks

Bibliography

[1] V. Gorodetsky, O. Karsaev, V. Samoylov, S. Serebryakov, S. Balandin,

S. Leppanen, M. Turunen, “Virtual P2P Environment for Testing and

Evaluation of Mobile P2P Agents Networks”, Proceedings of the IEEE

Second International Conference on Mobile Ubiquitous Computing, Systems,

Services and Technologies, pp. 422-429, 2008.

[2] Javed I. Khan and Adam Wierzbicki, “Foundation of Peer-to-Peer

Computing”, Special Issue, Computer Communications, Vol. 31(2), pp. 187-

418, February 2008.

[3] C. Shirky, “What is P2P and What Isn’t”, Proceedings of The O'Reilly Peer to

Peer and Web Service Conference, Washington D.C., pp. 5-8, November 2001.

[4] William Sears, Zhen Yu, Yong Guan, “An Adaptive Reputation-based Trust

Framework for Peer-to-Peer Applications”, Proceedings of the Fourth IEEE

International Symposium on Network Computing and Applications (NCA’05),

pp. 1-8, 2005.

[5] Nikta Dayhim, Amir Masoud Rahmani, Sepideh Nazemi Gelyan, Golbarg

Zarrinzad, “Towards a Multi-Agent Framework for Fault Tolerance and QoS

Guarantee in P2P Networks”, Proceedings of the Third IEEE International

Conference on Convergence and Hybrid Information Technology, pp. 166-171,

2008.

[6] Qian Zhang, Yu Sun, Zheng Liu, Xia Zhang Xuezhi Wen, “Design of a

Distributed P2P-based Grid Content Management Architecture”, Proceedings

of the 3rd IEEE Annual Communication Networks and Services Research

Conference (CNSR’05), pp. 1-6, 2005.

[7] Grokster official homepage. http://www.grokster.com.

[8] Nouha Oualha, Jean Leneutre, Yves Roudier, “Verifying Remote Data

Integrity in Peer-to-Peer Data Storage: A comprehensive survey of protocols”,

Peer-to-Peer Networking Application (Springer), Vol. 4, pp. 1-11, October

2011.

[9] Morpheus official homepage. http://www.morpheus.com.

[10] J. Ritter, “Why Gnutella Can't Scale. No, Really”,

http://www.tch.org/gnutella.html.


[11] J. Holliday, D. Agrawal, A. E. Abbadi, “Partial Database Replication using

Epidemic Communication”, Proceedings of the 22nd International Conference

on Distributed Computing Systems, IEEE Computer Society, Vienna, Austria,

pp. 485–493, 2002.

[12] A. Gupta, Lalit K. Awasthi, “Peer-to-Peer Networks and Computation:

Current Trends and Future Perspectives”, International Journal of Computing

and Informatics, Vol. 30(3), pp. 559–594, 2011.

[13] Press Release, “Bertelsmann and Napster form Strategic Alliance”, Napster,

Inc., Oct 2000, http://www.napster.com/pressroom/pr/001031.html.

[14] SETI@home: Search for Extraterrestrial Intelligence at Home, Space Science

Laboratory, University of California, Berkeley, 2002,

http://setiathome.ssl.berkeley.edu/

[15] N. Oualha, Y. Roudier, “Securing P2P Storage with a Self Organizing

Payment Scheme”, 3rd international workshop on autonomous and

spontaneous security (SETOP 2010), Athens, Greece, September 2010.

[16] N. Oualha, “Security and Cooperation for Peer-to-Peer Data Storage”, PhD

Thesis, EURECOM/Telecom ParisTech, June, 2009.

[17] D.S. Milojicic, V. Kalogeraki, R. Lukose, “Peer-to-Peer Computing”, Tech

Report: HPL-2002-57, http://www.hpl.hp.com/techreports/2002/HPL-2002-

57.pdf.

[18] Kai Guo, Zhijing Liu, “A New Efficient Hierarchical Distributed P2P

Clustering Algorithm”, Proceedings of the IEEE Fifth International

Conference on Fuzzy Systems and Knowledge Discovery, pp. 352-355, 2008.

[19] Gareth Tyson, Andreas Mauthe, Sebastian Kaune, Mu Mu, Thomas

Plagemann, “Corelli: A Dynamic Replication Service for Supporting Latency-

Dependent Content in Community Networks”, Proceedings of the 16th

ACM/SPIE Multimedia Computing and Networking Conference (MMCN),

San Jose, CA, 2009.

[20] Ian Taylor, “Triana Generations”, Proceedings of the Second IEEE

International Conference on e-Science and Grid Computing (e-Science'06),

pp.1-8, 2006.

[21] B. Yang, H. Garcia-Molina, “Improving Search in Peer-to-Peer Networks”,

Proceedings of the 22nd International Conference on Distributed Computing

Systems (ICDCS’02), IEEE Computer Society, pp. 5, 2002.


[22] Lijiang Chen, Bin Cui, Hua Lu, Linhao Xu, Quanqing Xu, “iSky: Efficient and

Progressive Skyline Computing in a Structured P2P Network”, Proceedings of

the IEEE 28th International Conference on Distributed Computing Systems,

pp. 160-169, 2008.

[23] Lionel M. Ni, Yunhao Liu, “Efficient Peer-to-Peer Overlay Construction”,

Proceedings of the IEEE International Conference on E-Commerce

Technology for Dynamic E-Business (CEC-East’04) 2004.

[24] Hari Balakrishnan, M. Frans Kaashoek, David Karger, Robert Morris, Ion

Stoica, “Looking Up Data in P2P Systems”, Communications of the ACM,

Vol. 46(2), pp. 43-48, February 2003.

[25] David G. Andersen, Hari Balakrishnan, M. Frans Kaashoek, Robert Morris,

“Resilient Overlay Networks”, Proceedings of 18th ACM SOSP, Banff,

Canada, October, 2001.

[26] S. Sen, J. Wang, “Analyzing Peer-to-peer Traffic across Large Networks”,

Proceedings of ACM SIGCOMM Internet Measurement Workshop, France,

2002.

[27] Freenet project’s Official Website http://freenetproject.org/index.html.

[28] Yoram Kulbak, Danny Bickson, “The eMule Protocol Specification”,

Technical Report, DANSS (Distributed Algorithms, Networking and Secure

Systems) Lab, School of Computer Science and Engineering, The Hebrew,

University of Jerusalem, Jerusalem, pp. 1-67, January, 2005

http://www.cs.huji.ac.il/labs/danss/presentations/emule.pdf

[29] Rudiger Schollmeier, “A Definition of Peer-to-Peer Networking for the

Classification of Peer-to-Peer Architecture and Applications”, Proceedings of

the First International Conference on Peer-to-Peer Computing (P2P.01),

Sweden, pp. 101-102, August 27-29, 2001.

[30] S. Saroiu, K. P.Gummadi, R. J. Dunn, S. D. Gribble, H. M. Levy, “An

Analysis of Internet Content Delivery Systems”, Proceedings of the 5th

Symposium on Operating Systems Design and Implementation, Boston,

Massachusetts, USA, 2002.

[31] Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari

Balakrishnan, “Chord: A Scalable Peer-to-Peer Lookup Service for Internet

Applications”, Proceedings of the ACM SIGCOMM 2001, San Diego, CA, pp.

149-160, August 2001.


[32] H. C. Kim, “P2P Overview”, Technical Report, Korea Advanced Institute of

Science and Technology, August 2001.

[33] S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker, “A Scalable

Content-Addressable Network”, Proceedings of the ACM SIGCOMM, pp.

161-172, 2001.

[34] A. Rowstron, P. Druschel, “Pastry: Scalable, Distributed Object Location and

Routing for Large-Scale Peer-to-Peer Systems”, Proceedings of the International

Conference on Distributed Systems Platforms (Middleware), pp. 329-350,

2001.

[35] Ben Y. Zhao, John D. Kubiatowicz, Anthony D. Joseph, “Tapestry: An

Infrastructure for Fault-tolerant Wide-area Location and Routing”, U. C.

Berkeley Technical Report UCB//CSD-01-1141, April 2000.

[36] X. Shen, H. Yu, J. Buford, M. Akon, Handbook of Peer-to-Peer Networking,

(1st ed.), New York, Springer, pp. 118, 2010, ISBN 0387097503.

[37] B. Zheng, WC Lee, DL Lee, “On semantic caching and Query Scheduling for

Mobile Nearest-Neighbor Search”, Wireless Networks, Vol.10(6), pp. 653-664,

2004.

[38] Kin-Wah Kwong, Danny H. K. Tsang, “A Congestion-Aware Search Protocol for

Unstructured Peer-to-Peer Networks”, LNCS 3358, pp.

319-329, 2004.

[39] “HiPeer: An Evolutionary Approach to P2P Systems”, Dr.-Ing. (PhD) Thesis,

Berlin, 2006.

[40] Lalit Kumar, Manoj Misra, Ramesh Chander Joshi, “Low Overhead Optimal

Check Pointing for Mobile Distributed Systems”, 19th IEEE International

Conference on Data Engineering, pp. 686 – 688, March 2003.

[41] D. Agrawal, A. E. Abbadi, “An Efficient and Fault-Tolerant Solution for

Distributed Mutual Exclusion”, ACM Transactions on Computer Systems, Vol.

9(1), pp. 1–20, 1991.

[42] Aaron J. Elmore, Sudipto Das, Divyakant Agrawal, Amr El Abbadi,

“InfoPuzzle: Exploring Group Decision Making in Mobile Peer-to-Peer

Databases”, PVLDB, Vol. 5(12), 1998-2001, 2012.

[43] Ryan Huebsch, Joseph M. Hellerstein, Nick Lanham, Boon Thau Loo, Scott

Shenker, Ion Stoica. “Querying the Internet with PIER”, Proceedings of the

29th VLDB Conference, Berlin, Germany, 2003.


[44] Choudhary Suryakant, Dincturk Mustafa Emre, V. Bochmann Gregor, Jourdan

Guy-Vincent, Onut Iosif-Viorel, Ionescu Paul, “Solving Some

Modeling Challenges when Testing Rich Internet Applications for Security”,

Proceedings of Fifth International Conference on Software Testing,

Verification and Validation (ICST), University of Ottawa, ON, Canada, pp.

850 – 857, 17-21, April 2012.

[45] K. Ramesh, T. Ramesh, “Domain-Specific Modeling and Synthesis of

Distributed Networked Systems”, International Journal of Computer Science

and Communication Vol. 2(2), pp. 485-495, July-December 2011.

[46] S. Lin, Q. Lian, M. Chen, Z. Zhang, “A Practical Distributed Mutual

Exclusion Protocol in Dynamic Peer-to-Peer Systems”, Proceedings of the 3rd

International Workshop on Peer-to-Peer Systems, 2004.

[47] G. Koloniari, N. Kremmidas, K. Lillis, P. Skyvalidas, E. Pitoura, “Overlay

Networks and Query Processing: A Survey”, Technical Report TR2006-08,

Computer Science Department, University of Ioannina, October 2006.

[48] May Mar Oo, The’ Soe, Aye Thida, “Fault Tolerance by Replication of

Distributed Database in P2P System using Agent Approach”, International

Journal of Computers, Vol. 4(1), pp. 9-18, 2010.

[49] Erwan Le Merrer, Anne-Marie Kermarrec, Laurent Massoulié, “Peer-to-Peer

Size Estimation in Large and Dynamic Networks: A Comparative Study”,

Proceedings of 15th IEEE International Symposium on High Performance

Distributed Computing (HPDC), pp. 7-17, 19-23 June 2006.

[50] Runfang Zhou, Kai Hwang, Min Cai, “Gossip Trust for Fast Reputation

Aggregation in Peer-to-Peer Networks”, IEEE Transactions on Knowledge

And Data Engineering, Vol. 20(9), pp. 1282-1295, September, 2008.

[51] X.Y. Yang, P. Hern´andez, F. Cores, “Distributed P2P Merging Policy to

Decentralize the Multicasting Delivery”, Proceedings of the IEEE

EUROMICRO Conference on Software Engineering and Advanced

Applications (EUROMICRO-SEAA’05), pp. 1-8, 2005.

[52] S. Ktari, M. Zoubert, A. Hecker, H. Labiod, “Performance Evaluation of

Replication Strategies in DHTs under Churn”, Proceedings 6th International

Conference on Mobile and Ubiquitous Multimedia MUM ’07. New York, NY,

USA: ACM, pp. 90–97, 2007.

[53] Napster official website. http://www.napster.com


[54] S. Saroiu, P. Gummadi, S. Gribble, “A Measurement Study of Peer-to-Peer

File Sharing Systems”, Proceedings of Multimedia Computing and

Networking (MMCN’02), 2002.

[55] R. Bhagwan, S. Savage, G. Voelker, “Understanding Availability”, Proceedings

of the 2nd International Workshop on Peer-to-Peer Systems (IPTPS '03),

Berkeley, CA, USA, pp. 1-11, February 2003.

[56] R. Bhagwan, S. Savage, G. Voelker, “Replication Strategies for Highly

Available Peer-to-Peer Storage Systems”, Proceedings of FuDiCo: Future

directions in Distributed Computing, June, 2002.

[57] Osrael J, Froihofer L, Chlaupek N, Goeschka KM, “Availability and

Performance of the Adaptive Voting Replication”, Proceedings of

International Conference on Availability, Reliability and Security (ARES),

Vienna, Austria, pp. 53–60, 2007.

[58] Najme Mansouri, Gholam Hosein Dastghaibyfard, Ehsan Mansouri,

“Combination of Data Replication and Scheduling Algorithm for Improving

Data Availability in Data Grids”, Journal of Network and Computer

Applications, Vol. 36(2), pp. 711-722, March 2013.

[59] R. Kavitha, A. Iamnitchi, I. Foster, “Improving Data Availability through

Dynamic Model Driven Replication in Large Peer-to-Peer Communities”,

Proceedings of Global and Peer-to-Peer Computing on Large Scale

Distributed Systems Workshop, Berlin, Germany, May 2002.

[60] Heng Tao Shen, Yanfeng Shu, Bei Yu, “Efficient Semantic-Based Content

Search in P2P Network”, IEEE Transactions on Knowledge And Data

Engineering, Vol. 16(7), pp. 813- 826, July 2004.

[61] Hung-Chang Hsiao, Hao Liao, Po-Shen Yeh, “A Near-Optimal Algorithm

Attacking the Topology Mismatch Problem in Unstructured Peer-to-Peer

Networks”, IEEE Transactions on Parallel and Distributed Systems, Vol. 21(7),

pp. 983-997, July 2010.

[62] Z. Xu, C. Tang, Z. Zhang, “Building Topology-Aware Overlays Using Global

Soft-State”, Proceedings of the 23rd International Conference on Distributed

Computing Systems (ICDCS), RI, USA, 2003.

[63] D. Saha, S. Rangarajan, S. K. Tripathi, “An Analysis of the Average Message

Overhead in Replica Control Protocols”, IEEE Transactions Parallel

Distributed Systems, Vol. 7(10), pp. 1026–1034, 1996.


[64] Yao-Nan Lien, Hong-Qi Xu, “A UDP Based Protocol for Distributed P2P File

Sharing”, Eighth International Symposium on Autonomous Decentralized

Systems (ISADS'07), pp. 1-7, 2007.

[65] D.P. Vidyarthi, B.K. Sarker, A.K. Tripathi, L.T. Yang, “Scheduling in

Distributed Computing Systems: Analysis, Design and Models”, Book,

Springer-Verlag, ISBN-10: 0387744800, 1st Edition, 2008.

[66] A. El. Abbadi, S.Toueg, “Maintaining Availability in Partitioned Replicated

Databases”, ACM Transactions on Database Systems, Vol. 14(2), pp. 264-290,

1989.

[67] Gnutella official website. http://www.gnutella.com

[68] Hung-Chang Hsiao, Hao Liao, Cheng-Chyun Huang, “Resolving the

Topology Mismatch Problem in Unstructured Peer-to-Peer Networks”, IEEE

Transactions on Parallel and Distributed Systems, Vol. 20(11), pp. 1668-1681,

November 2009.

[69] Jing Tian, Zhi Yang, Yafei Dai, “A Data Placement Scheme with Time-

Related Model for P2P Storages”, Proceedings of the Seventh IEEE

International Conference on Peer-to-Peer Computing, pp. 151-158, 2007.

[70] R. Dunaytsev, D. Moltchanov, Y. Koucheryavy, O. Strandberg, H. Flinck, “A

Survey of P2P Traffic Management Approaches: Best Practices and Future

Directions”, Journal of Internet Engineering, Vol. 5(1), June 2012.

[71] E. Cohen, S. Shenker, “Replication Strategies in Unstructured Peer-to-Peer

Networks”, Proceedings of ACM SIGCOMM’02, Pittsburgh, USA, Aug. 2002.

[72] H. Lamehamedi, Z. Shentu, B. Szymanski, E. Deelman, “Simulation of

Dynamic Data Replication Strategies in Data Grids”, Proceedings of the 17th

International Parallel and Distributed Processing Symposium, IEEE Computer

Society, Nice France, 2003.

[73] Q. Lv, P. Cao, E. Cohen, K. Li, S. Shenker, “Search and Replication in

Unstructured Peer-to-Peer Networks”, Proceedings of the 16th annual ACM

International Conference on Supercomputing (ICS’02), New York, USA, June

2002.

[74] S. Jamin, C. Jin, T. Kurc, D. Raz, Y. Shavitt, “Constrained Mirror Placement

on the Internet”, Proceedings of the IEEE INFOCOM Conference, Alaska,

USA, pp. 1369 – 1382, 2001.


[75] A. Vigneron, L. Gao, M. Golin, G. Italiano, B. Li, “An Algorithm for Finding

a k-median in a Directed Tree”, Information Processing Letters, Vol. 74(1, 2),

pp. 81-88, April 2000.

[76] Anna Saro Vijendran, S.Thavamani, “Analysis Study on Caching and Replica

Placement Algorithm for Content Distribution in Distributed Computing

Networks”, International Journal of Peer to Peer Networks (IJP2P), Vol. 3(6),

pp. 13-21, November 2012.

[77] Francis Otto, Drake Patrick Mirembe, “A Model for Data Management in Peer-

to-Peer Systems”, International Journal of Computing and ICT Research, Vol.

1(2), pp. 67-73, December 2007.

[78] Houda Lamehamedi, Boleslaw Szymanski, Zujun Shentu, Ewa Deelman,

“Data Replication Strategies in Grid Environments”, Proceedings of the 5th

IEEE International Conference on Algorithms and Architecture for Parallel

Processing, ICA3PP'2002, Beijing, China, pp. 378-383, October 2002.

[79] S. Misra, N. Wicramasinghe, “Security of a Mobile Transaction: A Trust

Model”, Electronic Commerce Research / Kluwer Academic Publishers, Vol.

4(4), 359-372, 2004.

[80] P. Shvaiko, J. Euzenat, “A Survey of Schema-Based Matching Approaches”,

Journal on Data Semantics, Springer, Heidelberg, LNCS, Vol. 3730, pp. 146–

171, 2005.

[81] HK Tripathy, BK Tripathy, K Pradip, “An Intelligent Approach of Rough Set

in Knowledge Discovery Databases”, International Journal of Computer

Science and Engineering, Vol. 2 (1), pp. 45-48, 2007.

[82] Alireza Poordavoodi, Mohammadreza Khayyambashi, Jafar Hamin, “Using

Replicated Data to Reduce Backup Cost in Distributed Databases”, Journal of

Theoretical and Applied Information Technology (JATIT), pp. 23-29, 2010.

[83] Lin Wujuan, Bharadwaj Veeravalli, “An Object Replication Algorithm for

Real-Time Distributed Databases”, Distributed Parallel Databases, MA, USA,

Vol. 19, pp. 125–146, 2006.

[84] T. Loukopoulos, I. Ahmad, “Static and Adaptive Distributed Data Replication

using Genetic Algorithms”, Journal of Parallel and Distributed Computing,

Vol. 64(11), pp. 1270–1285, 2004.


[85] Amita Mittal, M.C. Govil, “Concurrency Control Design Protocol in Real

Time Distributed Databases”, Proceedings of Emerging Trends in Computing

& Communication (ETCC07), NIT Hamirpur, pp. 155-160, July 2007.

[86] A. Kumar, A. Segev, “Cost and Availability Tradeoffs in Replicated Data

Concurrency Control”, ACM Transactions on Database Systems (TODS), Vol.

18(1), pp. 102–131, 1993.

[87] J. Huang, J. A. Stankovic, K. Ramamritham, D. Towsley, “On using Priority

Inheritance in Real-Time Databases”, Proceedings of the 12th IEEE Real-Time

Systems Symposium, IEEE Computer Society Press, San Antonio. Texas.

USA, pp. 210-221, 1991.

[88] P. S. Yu, K.-L. Wu, K.-J. Lin, S. H. Son, “On Real-Time Databases:

Concurrency Control and Scheduling”, Proceedings of the IEEE, Vol. 82(1),

pp. 14-15, January 1994.

[89] Navdeep Kaur, Rajwinder Singh, Manoj Misra, A. K. Sarje, “A Feedback

Based Secure Concurrency Control For MLS Distributed Database”,

International Conference on Computational Intelligence and Multimedia

Applications 2007.

[90] Y. Chu, S. G. Rao, H. Zhang, “A Case for End System Multicast”,

Proceedings of ACM SIGMETRICS, Santa Clara, California. June, 2000.

[91] B. Krishnamurthy, J. Wang, “Topology Modeling via Cluster Graphs”,

Proceedings of the SIGCOMM Internet Measurement Workshop, San

Francisco, USA, November, 2001.

[92] V. N. Padmanabhan, L. Subramanian, “An Investigation of Geographic

Mapping Techniques for Internet Hosts”, Proceedings of the ACM

SIGCOMM, University of California, San Diego 2001.

[93] Z. Xu, C. Tang, Z. Zhang, “Building Topology-aware Overlays Using Global

Soft-state”, Proceedings of the 23rd International Conference on Distributed

Computing Systems (ICDCS), RI, USA, 2003.

[94] Y. Chawathe, S. Ratnasamy, L. Breslau, N. Lanham, S. Shenker, “Making

Gnutella-Like P2P Systems Scalable”, Proceedings of the ACM SIGCOMM,

Miami, Florida, USA 2003.

[95] Y. Liu, L. Xiao, L.M. Ni, “Building a Scalable Bipartite P2P Overlay

Network”, Proceedings of the IEEE Transaction Parallel and Distributed

Systems (TPDS), Vol. 18(9), pp. 1296-1306, September, 2007.


[96] L. Xiao, Y. Liu, L.M. Ni, “Improving Unstructured Peer-to- Peer Systems by

Adaptive Connection Establishment”, Proceedings of the IEEE Transaction

Computers, 2005.

[97] NTP: The Network Time Protocol. http://www.ntp.org/ 2007

[98] Yunhao Liu, “A Two-Hop Solution to Solving Topology Mismatch”,

Proceedings of the IEEE Transactions on Parallel and Distributed Systems,

Vol. 19(11), pp. 1591-1600, November 2008.

[99] G. Sushant, R. Buyya, “Data Replication Strategies in Wide-Area Distributed

Systems”, Chapter IX of Enterprise Service Computing: From Concept to

Deployment, IGI Global, pp. 211-241, 2006.

[100] Ashraf Ahmed, P.D.D.Dominic, Azween Abdullah, Hamidah Ibrahim, “A

New Optimistic Replication Strategy for Large-Scale Mobile Distributed

Database Systems”, International Journal of Database Management Systems

(IJDMS), Vol. 2(4), pp. 86-105, November 2010.

[101] R. Meersman, Z. Tari et al., “An Adaptive Probabilistic Replication Method

for Unstructured P2P Networks”, Springer-Verlag Berlin Heidelberg, LNCS

4275, pp. 480–497, 2006.

[102] K. Sashi, Antony Selvadoss Thanamani, “Dynamic Replica Management for

Data Grid”, IACSIT International Journal of Engineering and Technology,

Vol. 2(4), pp. 329-333, August 2010.

[103] Kubiatowicz, J., “Oceanstore: An Architecture for Global-Scale Persistent

Storage”, Proceedings of the International Conference on Architectural

Support for Programming Languages and Operating Systems (ASPLOS), pp.

190–201, November 2000.

[104] Cuenca-Acuna, F. M., Martin, R. P., Nguyen, T. D., “Autonomous Replication

for High Availability in Unstructured P2P systems”, 22nd International

Symposium on Reliable Distributed Systems (SRDS), 2003.

[105] Adya, A., Bolosky, W. J., Castro, M., Cermak, G., Chaiken, R., Douceur, J. R.,

Howell, J., Lorch, J. R., Theimer, M., Wattenhofer, R. P., “Farsite: Federated,

Available, and Reliable Storage for an Incompletely Trusted Environment”,

SIGOPS Oper. Syst. Rev., Vol. 36, (New York, NY, USA), pp. 1–14, ACM

Press, 2002.

[106] Adya, A., Bolosky, W. J., Castro, M., Cermak, G., Chaiken, R., Douceur, J. R.,

Howell, J., Lorch, J. R., Theimer, M., Wattenhofer, R. P., “Feasibility of a


Serverless Distributed File System Deployed on an Existing Set of Desktop

PCs”, OSDI, 2002.

[107] Douceur, J. R. and Wattenhofer, R. P., “Large-Scale Simulation of Replica

Placement Algorithms for a Serverless Distributed File System”, 9th

International Symposium on Modeling, Analysis and Simulation of Computer

and Telecommunication Systems (MASCOTS), 2001.

[108] Korupolu, M., Plaxton, G., Rajaraman, R., “Placement Algorithms for

Hierarchical Cooperative Caching”, Journal of Algorithms, Vol. 38(1), pp.

260–302, January 2001.

[109] Krishnaswamy, V., Walther, D., Bhola, S., Bommaiah, E., Riley, G. F., Topol,

B., Ahamad, M., “Efficient Implementation of Java Remote Method

Invocation (RMI)”, COOTS, pp. 19–27, 1998.

[110] Ko, B.-J., Rubenstein, D., “Distributed, Self-Stabilizing Placement of

Replicated Resources in Emerging Networks”, 11th IEEE International

Conference on Network Protocols (ICNP), November 2003.

[111] Qiu, L., Padmanabhan, V. N., Voelker, G. M., “On the Placement of Web

Server Replicas”, Proceedings of INFOCOM’01, pp. 1587–1596, 2001.

[112] T. Loukopoulos, I. Ahmad, “Static and Adaptive Distributed Data Replication

Using Genetic Algorithms”, Journal of Parallel and Distributed Computing,

Vol. 64(11), pp. 1270–1285, 2004.

[113] P. Francis, S. Jamin, V. Paxson, L. Zhang, D. Gryniewicz, Y. Jin, “An

Architecture for a Global Internet Host Distance Estimation Service”, IEEE

INFOCOM '99 Conference, New York, NY, USA, pp. 210-217, 1999.

[114] Chapram Sudhakar, T. Ramesh, “An Improved Lazy Release Consistency

Model”, Journal of Computer Science, Vol. 5(11), pp. 778-782, 2009.

[115] M. Naor, U. Wieder, “Scalable and Dynamic Quorum Systems”, Proceedings

of the ACM Symposium on Principles of Distributed Computing, 2003.

[116] T. Loukopoulos, I. Ahmad, “Static and Adaptive Data Replication Algorithms

for Fast Information Access in Large Distributed systems”, Proceedings of the

IEEE International Conference on Distributed Computing Systems, Taipei,

Taiwan, pp. 385 – 392, 2000.

[117] S. Abdul-Wahid, R. Andonie, J. Lemley, J. Schwing, J. Widger, “Adaptive

Distributed Database Replication through Colonies of Pogo Ants”, Parallel


and Distributed Processing Symposium, IEEE International, California USA,

pp. 358, 2007.

[118] I. Gashi, P. Popov, L. Strigini, “Fault Tolerance via Diversity for Off-the-

Shelf Products: A Study with SQL Database Servers”, IEEE Transactions on

Dependable and Secure Computing, Vol. 4(4), pp. 280–294, 2007.

[119] C. Wang, F. Mueller, C. Engelmann, S. Scott, “A Job Pause Service Under

Lam/Mpi+Blcr for Transparent Fault Tolerance”, Proceedings of IEEE

International Parallel and Distributed Processing Symposium, California USA,

pp. 1-10, 2007.

[120] Lin Wujuan, Veeravalli Bharadwaj, “An Object Replication Algorithm for

Real-Time Distributed Databases”, Distributed Parallel Databases, Vol. 19, pp.

125–146, 2006.

[121] A. Bonifati, E. Chang, T. Ho, L. V. Lakshmanan, R. Pottinger, Y. Chung,

“Schema Mapping and Query Translation in Heterogeneous P2P XML

Databases”, The VLDB Journal, Vol. 2 (19), pp. 231-256, April 2010.

[122] Ricardo Jiménez-Peris, M. Patiño-Martínez, Gustavo Alonso, Bettina

Kemme, “Are Quorums an Alternative for Data Replication?”, ACM

Transactions on Database Systems, Vol. 28(3), pp. 257–294, 2003.

[123] J. Kangasharju, K.W. Ross, D. Turner, “Optimal Content Replication in P2P

Communities”, Manuscript, 2002.

[124] Jiafu Hu, Nong Xiao, Yingjie Zhao, Wei Fu, “An Asynchronous Replica

Consistency Model in Data Grid”, ISPA Workshops, LNCS 3759, Nanjing,

China, pp. 475 – 484, 2005.

[125] Raddad Al King, Abdelkader Hameurlain, Franck Morvan, “Query Routing and Processing

in Peer-to-Peer Data sharing Systems”, International Journals of Database

Management Systems (IJDMS), Vol. 2(2), pp. 116-139, 2010.

[126] R.H. Thomas, “A Majority Consensus Approach to Concurrency Control for

Multiple Copy Databases”, ACM Transactions on Database Systems, Vol 4(2),

pp. 180–209, 1979.

[127] Ananth Rao, Karthik Lakshminarayanan, Sonesh Surana, Richard Karp, Ion

Stoica, “Load Balancing in Structured P2P Systems”, Proceedings of ACM,

Vol. 63(3), pp. 217-240, 2006.


[128] Ada Wai-Chee Fu, Yat Sheung Wong, Man Hon Wong, “Diamond Quorum

Consensus for High Capacity and Efficiency in a Replicated Database

System”, Distributed and Parallel Databases, Vol. 8, pp. 471–492, 2000.

[129] A. Sleit, W. Al Mobaideen, S. Al-Areqi, A. Yahya, “A Dynamic Object

Fragmentation and Replication Algorithm in Distributed Database Systems”,

American Journal of Applied Sciences, Vol 4(8), pp. 613-618, 2007.

[130] D. Agrawal, A. El. Abbadi, “The Generalized Tree Quorum Protocol: An

Efficient Approach for Managing Replicated Data”, ACM Transactions on

Database Systems, Vol. 17(4), pp. 689-717, 1992.

[131] David Del Vecchio, Sang H Son, “Flexible Update Management in Peer-to-

Peer Database Systems”, Proceedings of 9th International Conference on

Database Engineering and Application Symposium, VA, USA, pp. 435 – 444,

July 2005.

[132] Hidehisa Takamizawa, Kazuhiro Saji, “A Replica Management Protocol in a

Binary Balanced Tree Structure-Based P2P Network”. Journal of Computers,

Vol. 4(7), pp. 631-640, 2009.

[133] Ahmad N, Abdalla AN, Sidek RM. “Data Replication Using Read-One-Write-

All Monitoring Synchronization Transaction System in Distributed

Environment.” Journal of Computer Science, Vol. 6(10), pp. 1066–1069, 2010.

[134] P. A. Bernstein, N. Goodman, “An Algorithm for Concurrency Control and

Recovery in Replicated Distributed Databases”, ACM Transactions on

Database Systems, Vol. 9(4), pp. 596–615, 1984

[135] Jajodia, S., Mutchler, D., “Integrating Static and Dynamic Voting Protocols to

Enhance File Availability”, Fourth International Conference on Data

Engineering, IEEE, pp. 144-153, New York, 1988.

[136] B. Silaghi, P. Keleher, B. Bhattacharjee, “Multi-Dimensional Quorum Sets for Read-Few Write-Many Replica Control Protocols”, Proceedings of the 4th International Workshop on Global and Peer-to-Peer Computing, 2004.

[137] R. Latip, H. Ibrahim, M. Othman, M. N. Sulaiman, A. Abdullah, “Quorum Based Data Replication in Grid Environment”, Rough Sets and Knowledge Technology, LNCS, pp. 379-386, 2008.

[138] F. Oprea, M. K. Reiter, “Minimizing Response Time for Quorum-System Protocols Over Wide-Area Networks”, International Conference on Dependable Systems and Networks (DSN), pp. 409-418, Edinburgh, UK, 2007.

[139] Y. Sawai, M. Shinohara, A. Kanzaki, T. Hara, S. Nishio, “Consistency Management Among Replicas Using a Quorum System in Ad Hoc Networks”, International Conference on Mobile Data Management (MDM), pp. 128-132, Nara, Japan, 2006.

[140] I. Frain, A. M’zoughi, J. P. Bahsoun, “How to Achieve High Throughput with Dynamic Tree-Structured Coterie”, International Symposium on Parallel and Distributed Computing (ISPDC), pp. 82-89, Timisoara, Romania, 2006.

[141] J. Osrael, L. Froihofer, N. Chlaupek, K. M. Goeschka, “Availability and Performance of the Adaptive Voting Replication”, International Conference on Availability, Reliability and Security (ARES), pp. 53-60, Vienna, Austria, 2007.

[142] S. Cheung, M. Ammar, M. Ahamad, “The Grid Protocol: A High Performance Scheme for Maintaining Replicated Data”, IEEE Sixth International Conference on Data Engineering, pp. 438-445, Los Angeles, CA, USA, 1990.

[143] C. Storm, O. Theel, “A General Approach to Analyzing Quorum-Based Heterogeneous Dynamic Data Replication Schemes”, 10th International Conference on Distributed Computing and Networking, Hyderabad, India, pp. 349-361, 2009.

[144] Kevin Henry, Colleen Swanson, Qi Xie, Khuzaima Daudjee, “Efficient Hierarchical Quorums in Unstructured Peer-to-Peer Networks”, OTM 2009, Part I, LNCS 5870, pp. 183-200, 2009.

[145] A. Kumar, “Hierarchical Quorum Consensus: A New Algorithm for Managing Replicated Data”, IEEE Transactions on Computers, Vol. 40(9), pp. 996-1004, 1991.

[146] Dongming Huang, Zong Hu, “Research of Replication Mechanism in P2P Network”, WSEAS Transactions on Computers, Vol. 8(12), pp. 1845-1854, December 2009.

[147] Nabor das Chagas Mendonça, “Using Extended Hierarchical Quorum Consensus to Control Replicated Data: From Traditional Voting to Logical Structures”, Proceedings of the 27th Annual Hawaii International Conference on System Sciences, Minitrack on Parallel and Distributed Databases, Maui, pp. 303-312, 1993.


[148] Sang-Min Park, Jai-Hoon Kim, Young-Bae Ko, Won-Sik Yoon, “Dynamic Data Replication Strategy Based on Internet Hierarchy BHR”, LNCS 3033, Springer-Verlag, Heidelberg, pp. 838-846, 2004.

[149] A. Horri, R. Sepahvand, Gh. Dastghaibyfard, “A Hierarchical Scheduling and Replication Strategy”, International Journal of Computer Science and Network Security, Vol. 8, pp. 30-35, 2008.

[150] M. Tang, B. Lee, X. Tang, C. Yeo, “The Impact of Data Replication on Job Scheduling Performance in the Data Grid”, Future Generation Computer Systems, Vol. 22(3), 2006.

[151] K. Ranganathan, A. Iamnitchi, I. Foster, “Improving Data Availability through Dynamic Model Driven Replication in Large Peer-to-Peer Communities”, Global and Peer-to-Peer Computing on Large Scale Distributed Systems Workshop, Berlin, Germany, 2002.

[152] K. Ranganathan, I. Foster, “Design and Evaluation of Replication Strategies for a High Performance Data Grid”, International Conference on Computing in High Energy and Nuclear Physics (CHEP'01), Beijing, China, 2001.

[153] D. Malkhi, M. K. Reiter, A. Wool, “Probabilistic Quorum Systems”, Information and Computation, Vol. 170(2), pp. 184-206, 2001.

[154] Abraham Silberschatz, Henry F. Korth, S. Sudarshan, Database System Concepts, McGraw-Hill Computer Science Series, International Student Edition, 2005.

[155] Udai Shanker, Manoj Misra, Anil K. Sarje, “Distributed Real Time Database Systems: Background and Literature Review”, Distributed and Parallel Databases, Vol. 23, pp. 127-149, 2008.

[156] Arvind Kumar, Rama Shankar Yadav, Ranvijay, Anjali Jain, “Fault Tolerance in Real Time Distributed System”, International Journal on Computer Science and Engineering, Vol. 3(2), pp. 933-939, 2011.

[157] Amr El Abbadi, Mohamed F. Mokbel, “Social Networks and Mobility in the Cloud”, Proceedings of the VLDB Endowment (PVLDB), Vol. 5(12), pp. 2034-2035, 2012.

[158] K. Y. Lam, Tei-Wei Kuo, Real-Time Database Systems: Architecture and Techniques, Kluwer Academic Publishers, 2001.


[159] A. Datta, M. Hauswirth, K. Aberer, “Updates in Highly Unreliable, Replicated Peer-to-Peer Systems”, Proceedings of the 23rd International Conference on Distributed Computing Systems, 2003.

[160] J. Wang, K. Lam, S. H. Son, A. Mok, “An Effective Fixed Priority Co-Scheduling Algorithm for Periodic Update and Application Transactions”, Springer, Computing, pp. 184-202, November 2012.

[161] T. Hara, M. Nakadori, W. Uchida, K. Maeda, S. Nishio, “Update Propagation Based on Tree Structure in Peer-to-Peer Networks”, Proceedings of the ACS/IEEE International Conference on Computer Systems and Applications (AICCSA’05), pp. 40-47, 2005.

[162] A. Datta, M. Hauswirth, K. Aberer, “Updates in Highly Unreliable, Replicated Peer-to-Peer Systems”, Proceedings of the 23rd International Conference on Distributed Computing Systems (ICDCS’03), p. 76, 2003.

[163] R. Gupta, J. Haritsa, K. Ramamritham, S. Seshadri, “Commit Processing in Distributed Real Time Database Systems”, Proceedings of the Real-Time Systems Symposium, San Francisco, IEEE Computer Society Press, Washington DC, 1998.

[164] Jayant R. Haritsa, Michael J. Carey, Miron Livny, “Data Access Scheduling in Firm Real-Time Database Systems”, The Journal of Real-Time Systems, Vol. 4(3), 1992.

[165] K.-W. Lam, K.-Y. Lam, S. Hung, “Real-Time Optimistic Concurrency Control Protocol with Dynamic Adjustment of Serialization Order”, Proceedings of the IEEE Real-Time Technology and Applications Symposium, Chicago, Illinois, pp. 174-179, May 1995.

[166] S. Gribble, A. Halevy, Z. Ives, M. Rodrig, D. Suciu, “What Can Databases Do for Peer-to-Peer?”, Proceedings of the Fourth International Workshop on the Web and Databases (WebDB 2001), June 2001.

[167] L. Gong, “JXTA: A Network Programming Environment”, IEEE Internet Computing, Vol. 5(3), pp. 88-95, 2001.

[168] Wolfgang Nejdl, Wolf Siberski, Michael Sintek, “Design Issues and Challenges for RDF- and Schema-Based Peer-to-Peer Systems”, ACM SIGMOD Record, Vol. 32(3), pp. 41-46, 2003.

[169] Wolfgang Nejdl, Martin Wolpers, Wolf Siberski, Christoph Schmitz, Mario Schlosser, Ingo Brunkhorst, Alexander Löser, “Super-Peer Based Routing and Clustering Strategies for RDF-Based Peer-to-Peer Networks”, Proceedings of the 12th International Conference on World Wide Web, pp. 536-543, New York, NY, USA, 2003.

[170] Mario T. Schlosser, Michael Sintek, Stefan Decker, Wolfgang Nejdl, “HyperCuP - Hypercubes, Ontologies, and Efficient Search on Peer-to-Peer Networks”, Proceedings of the First International Workshop on Agents and Peer-to-Peer Computing, LNCS 2530, Springer, pp. 112-124, 2002.

[171] R. Akbarinia, V. Martins, E. Pacitti, P. Valduriez, Global Data Management (chapter: “Design and Implementation of Atlas P2P Architecture”), 1st Edition, IOS Press, July 2006.

[172] R. Akbarinia, V. Martins, E. Pacitti, P. Valduriez, “Top-k Query Processing in the APPA P2P System”, Proceedings of the International Conference on High Performance Computing for Computational Science (VecPar), Rio de Janeiro, Brazil, July 2006.

[173] Igor Tatarinov, Zachary Ives, Jayant Madhavan, Alon Halevy, Dan Suciu, Nilesh Dalvi, Xin Dong, Yana Kadiyska, Gerome Miklau, Peter Mork, “The Piazza Peer Data Management Project”, ACM SIGMOD Record, September 2003.

[174] Seth Gilbert, Nancy Lynch, “Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services”, ACM SIGACT News, Vol. 33(2), pp. 51-59, 2002.

[175] Wee Siong Ng, Beng Chin Ooi, Kian-Lee Tan, Aoying Zhou, “PeerDB: A P2P-based System for Distributed Data Sharing”, Proceedings of the 19th IEEE International Conference on Data Engineering, pp. 633-644, IEEE Computer Society, 2003.

[176] Wee Siong Ng, Beng Chin Ooi, Kian-Lee Tan, “BestPeer: A Self-Configurable Peer-to-Peer System”, Proceedings of the 18th IEEE International Conference on Data Engineering, San Jose, CA, IEEE Computer Society, p. 272, 26 February - 1 March 2002.

[177] B. K. Sarker, A. K. Tripathi, D. P. Vidyarthi, Kuniaki Uehara, “A Performance Study of Task Allocation Algorithms in a Distributed Computing System (DCS)”, IEICE Transactions on Information and Systems, Vol. 86(9), pp. 1611-1619, 2003.


[178] Anirban Mondal, Masaru Kitsuregawa, “Open Issues for Effective Dynamic Replication in Wide-Area Network Environments”, Peer-to-Peer Networking and Applications, Vol. 2(3), pp. 230-251, 2009.

[179] Gerome Miklau, Dan Suciu, “Controlling Access to Published Data Using Cryptography”, Proceedings of the Very Large Databases Conference, USA, pp. 898-909, September 2003.

[180] R. B. Patel, Vishal Garg, “Resource Management in Peer-to-Peer Networks: NADSE Network Model”, Proceedings of the 2nd International Conference on Methods and Models in Science and Technology (ICM2ST’11), Jaipur, India, pp. 159-164, 19-20 November 2011.

[181] Alfred Loo, “Distributed Multiple Selection Algorithm for Peer-to-Peer Systems”, Journal of Systems and Software, Vol. 78, pp. 234-248, 2005.