TRUST AND REPUTATION IN PEER-TO-PEER NETWORKS
A DISSERTATION
SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
Sergio Marti
May 2005
© Copyright by Sergio Marti 2005
All Rights Reserved
I certify that I have read this dissertation and that, in my opinion, it
is fully adequate in scope and quality as a dissertation for the degree
of Doctor of Philosophy.
Hector Garcia-Molina (Principal Adviser)
I certify that I have read this dissertation and that, in my opinion, it
is fully adequate in scope and quality as a dissertation for the degree
of Doctor of Philosophy.
Mary Baker
I certify that I have read this dissertation and that, in my opinion, it
is fully adequate in scope and quality as a dissertation for the degree
of Doctor of Philosophy.
Rajeev Motwani
Approved for the University Committee on Graduate Studies.
Abstract
The increasing availability of high bandwidth Internet connections and low-cost, com-
modity computers in people’s homes has stimulated the use of resource sharing peer-
to-peer networks. These systems employ scalable mechanisms that allow anyone to
offer content and services to other system users. However, the open accessibility of
these systems makes them vulnerable to malicious users who wish to poison the system
with corrupted data, harmful services and worms. Because of this danger, users
must be wary of the quality and validity of the resources they access.
To mitigate the adverse behavior of unreliable or malicious peers in a network,
researchers have suggested using reputation systems. Yet our understanding of how
to incorporate an effective reputation system into an autonomous network is limited.
This thesis categorizes and evaluates the components and mechanisms necessary to
build robust, effective reputation systems for use in decentralized autonomous net-
works. Borrowing techniques from game theory and economic analysis, we begin
with high-level models in order to understand general trends and properties of repu-
tation systems and their effect on a user’s behavior and experience. We then closely
examine the effects of limited reputation sharing through simulations based on large-
scale measurements from actual, operating P2P networks. Finally, we propose new
mechanisms for improving message routing throughput in decentralized networks of
untrusted peers: one geared towards structured DHTs (SPROUT) and two other
complementary mechanisms for mobile ad hoc networks (Watchdog and Pathrater).
Acknowledgements
I would like to thank my advisor Hector Garcia-Molina for his unending patience
and guidance. I appreciate his great passion for research that is only matched by
his strong commitment to his students. I am always amazed that, regardless of his
many duties and projects, Hector would make himself available to provide feedback and
insight on my work. Not only is Hector a wonderful advisor, but he is also a caring friend.
I am also deeply grateful for the opportunity to have had Mary Baker as my
advisor when I first came to Stanford. Her professionalism and enthusiasm for research
inspired me to pursue my Ph.D. Mary’s devotion to excellence is exemplified in the
work of her students.
My experience at Stanford has been joyful and enlightening, and I am grateful
to the members of both the Mosquitonet and Database groups for their insights,
constructive criticism and friendship. I would especially like to thank my co-authors
TJ Giuli, Kevin Lai and Prasanna Ganesan. I also thank Rajeev Motwani for
agreeing to serve on my reading committee.
Finally, I must thank my friends and family for their encouragement and support.
In particular, I am grateful to my parents for their love and for instilling in me a deep
sense of academic pride. And most of all, to my wife Wendy, whose patience and love
have kept me going, even when I doubted myself. From proofreading my papers to
preparing tasty treats, Wendy is always there for me.
Contents
Abstract v
Acknowledgements vi
1 Introduction 1
1.1 Research Contributions and Thesis Outline . . . . . . . . . . . . . . . 5
2 Taxonomy of Trust 8
2.0.1 Taxonomy Overview . . . . . . . . . . . . . . . . . . . . . . . 9
2.1 Terms and Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Assumptions and Constraints . . . . . . . . . . . . . . . . . . . . . . 12
2.2.1 User Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.2 Threat Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.3 Environmental Limitations . . . . . . . . . . . . . . . . . . . . 16
2.3 Gathering Information . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3.1 System Identities . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3.2 Information Sharing . . . . . . . . . . . . . . . . . . . . . . . 19
2.3.3 Dealing with Strangers . . . . . . . . . . . . . . . . . . . . . . 23
2.4 Reputation Scoring and Ranking . . . . . . . . . . . . . . . . . . . . 24
2.4.1 Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.4.2 Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.4.3 Peer Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.5 Taking Action . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.5.1 Incentives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.5.2 Punishment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6 Miscellaneous . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6.1 Resource Reputation . . . . . . . . . . . . . . . . . . . . . . . 30
2.6.2 Social Networks . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3 Agent Strategies Under Reputation 32
3.1 Definitions and Dimensions . . . . . . . . . . . . . . . . . . . . . . . 33
3.1.1 Game Setup and Rules . . . . . . . . . . . . . . . . . . . . . . 33
3.1.2 Knowledge-space . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.1.3 Player-space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.1.4 Price-space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.1.5 eBay Scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.2 Strategy Independent Analysis . . . . . . . . . . . . . . . . . . . . . . 38
3.2.1 Single Transaction Payoff . . . . . . . . . . . . . . . . . . . . . 38
3.2.2 Social Optimum . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.3 Selfish Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3.1 Zero Knowledge . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3.2 Perfect Knowledge . . . . . . . . . . . . . . . . . . . . . . . . 41
3.4 Perfect History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.4.1 Basic Reputation-based Strategies . . . . . . . . . . . . . . . . 45
3.4.2 Independent Decisions for MB-1S/VP . . . . . . . . . . . . . . 47
3.4.3 Independent Decisions for 1B-MS/FP . . . . . . . . . . . . . . 61
3.5 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.6 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.6.1 Variably-valuated goods . . . . . . . . . . . . . . . . . . . . . 64
3.6.2 Malicious Sellers . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.6.3 Costly Signaling . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4 Modeling Reputation and Incentives 69
4.1 Assumptions and Definitions . . . . . . . . . . . . . . . . . . . . . . . 71
4.1.1 Utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.1.2 Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.2 Formal Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.2.1 Incentive Schemes . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.2.2 Currency Scenarios . . . . . . . . . . . . . . . . . . . . . . . . 77
4.2.3 Trust . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.3.1 Trust over Time . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.3.2 Utility over Time . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.4 Simulation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.5 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.5.1 Base Population . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.5.2 NR and MTPP . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.5.3 Trust vs Capacity . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.5.4 Single-Peer Experiments . . . . . . . . . . . . . . . . . . . . . 110
4.6 Variations on the Model . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.6.1 Profit Trust Factor . . . . . . . . . . . . . . . . . . . . . . . . 115
4.6.2 Additional Trust Models . . . . . . . . . . . . . . . . . . . . . 117
4.6.3 Tying Service to Reputation . . . . . . . . . . . . . . . . . . . 121
4.7 Generalized Model of Trust and Profit . . . . . . . . . . . . . . . . . 127
4.8 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
4.8.1 Credits and Economic Stimulation . . . . . . . . . . . . . . . 134
4.9 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
4.10 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5 P2P Reputation System Metrics 138
5.1 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.1.1 Authenticity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
5.2 Threat Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.2.1 Document-based Threat Model . . . . . . . . . . . . . . . . . 143
5.2.2 Node-based Threat Model . . . . . . . . . . . . . . . . . . . . 143
5.3 Reputation Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
5.3.1 Identity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
5.4 Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.4.1 Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.4.2 Effectiveness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.4.3 Load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
5.4.4 Message Traffic . . . . . . . . . . . . . . . . . . . . . . . . . . 155
5.4.5 Threat-Reputation Distance . . . . . . . . . . . . . . . . . . . 156
5.5 Simulation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
5.6 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
5.6.1 Local Reputation System . . . . . . . . . . . . . . . . . . . . . 160
5.6.2 Voting-System . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
5.6.3 Node-based Threat Model . . . . . . . . . . . . . . . . . . . . 181
5.7 Statistical Analysis of Reputation Systems . . . . . . . . . . . . . . . 190
5.8 Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
5.9 Empirical Estimations . . . . . . . . . . . . . . . . . . . . . . . . . . 193
5.10 Long-Term Reputation System Performance . . . . . . . . . . . . . . 194
5.10.1 Random base case . . . . . . . . . . . . . . . . . . . . . . . . 195
5.10.2 Select-Best/Weighted ideal case with threshold . . . . . . . . . 196
5.10.3 Weighted ideal case without threshold . . . . . . . . . . . . . 196
5.10.4 Select-Best ideal case without threshold . . . . . . . . . . . . 197
5.10.5 Select-Best/Weighted local reputation system with threshold . 198
5.10.6 Weighted local system without threshold . . . . . . . . . . . . 199
5.10.7 Select-Best local system . . . . . . . . . . . . . . . . . . . . . 199
5.11 Comparison of Statistical Analysis to Simulation Results . . . . . . . 200
5.12 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
5.13 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
6 SPROUT: P2P Routing with Social Networks 205
6.1 Trust Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
6.1.1 Trust Function . . . . . . . . . . . . . . . . . . . . . . . . . . 208
6.1.2 Path Rating . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
6.2 Social Path Routing Algorithm . . . . . . . . . . . . . . . . . . . . . 211
6.2.1 Optimizations . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
6.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
6.3.1 Simulation Details . . . . . . . . . . . . . . . . . . . . . . . . 214
6.3.2 Algorithm Evaluation . . . . . . . . . . . . . . . . . . . . . . . 215
6.3.3 Calculating Trust . . . . . . . . . . . . . . . . . . . . . . . . . 218
6.3.4 Number of Friends . . . . . . . . . . . . . . . . . . . . . . . . 220
6.3.5 Comparison to Gnutella-like Networks . . . . . . . . . . . . . 223
6.3.6 Latency Comparisons . . . . . . . . . . . . . . . . . . . . . . . 225
6.3.7 Message Load . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
6.4 Related and Future Work . . . . . . . . . . . . . . . . . . . . . . . . 228
6.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
7 Mitigating MANET Misbehavior 231
7.1 Assumptions and Background . . . . . . . . . . . . . . . . . . . . . . 235
7.1.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
7.1.2 Physical Layer Characteristics . . . . . . . . . . . . . . . . . . 235
7.1.3 Dynamic Source Routing (DSR) . . . . . . . . . . . . . . . . . 236
7.2 Watchdog and Pathrater . . . . . . . . . . . . . . . . . . . . . . . . . 237
7.2.1 Watchdog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
7.2.2 Pathrater . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
7.3 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
7.3.1 Movement and Communication Patterns . . . . . . . . . . . . 243
7.3.2 Misbehaving Nodes . . . . . . . . . . . . . . . . . . . . . . . . 244
7.3.3 Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
7.4 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
7.4.1 Network Throughput . . . . . . . . . . . . . . . . . . . . . . 246
7.4.2 Routing Overhead . . . . . . . . . . . . . . . . . . . . . . . . 248
7.4.3 Effects of False Detection . . . . . . . . . . . . . . . . . . . . 250
7.5 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
7.6 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
7.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
8 Conclusion and Future Work 257
A Proof Of Long-Term Reputation Damage 262
A.1 Error Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
A.2 Improved Approximation . . . . . . . . . . . . . . . . . . . . . . . . . 266
B Unique Maximum of Segregated Schedule 269
C Optimal Schedule 272
D Math. Deriv. of Econ. Model 274
D.1 Utility Over Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
D.2 Generalized Trust Over Time (σ(T, p∗) = 1) . . . . . . . . . . . . . . 276
Bibliography 277
List of Tables
2.1 Breakdown of Reputation System Components . . . . . . . . . . . . . 9
3.1 Parameter descriptions with sample values . . . . . . . . . . . . . . . 34
3.2 General payoff matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3 Payoff matrix for fixed $2 priced goods with valuation $3 and cost $1 39
3.4 Payoff Matrix for variable priced p goods for default v = $3 and c = $1. 43
3.5 Payoff Matrix for fixed $2 priced goods with valuation $3, cost $1, and
maliciousness factor $1 . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.1 Trust and Profit Parameters and Default Values . . . . . . . . . . . . 88
4.2 Simulation Parameters and Default Values . . . . . . . . . . . . . . . 97
4.3 Definition of Generalized Model Terms . . . . . . . . . . . . . . . . . 130
5.1 Simulation statistics and metrics . . . . . . . . . . . . . . . . . . . . . 152
5.2 Configuration parameters, and default values . . . . . . . . . . . . . . 157
5.3 Distributions and their parameters with default values . . . . . . . . 158
6.1 SPROUT vs. Chord . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
6.2 Evaluating lookahead and MHD . . . . . . . . . . . . . . . . . . . . . 216
7.1 Maximum and minimum network throughput obtained by any simula-
tion at 40% misbehaving nodes with all features enabled. . . . . . . . 247
7.2 Maximum and minimum overhead obtained by any simulation at 40%
misbehaving nodes with all features enabled. . . . . . . . . . . . . . . 249
7.3 Comparison of the number of false positives between the 0 second and
60 second pause time simulations. Average taken from the simulations
with all features enabled. . . . . . . . . . . . . . . . . . . . . . . . . . 251
List of Figures
2.1 Representation of primary identity scheme properties. . . . . . . . . . 19
3.1 Number of transactions until gain from single defection equals loss from
lowered reputation k. . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.2 Optimal number of cooperation/defections as a function of total sales. 58
3.3 Relative utility error between optimal schedule and ±1 C/D. . . . . . 59
3.4 Relative utility error between optimal schedule using weak approxima-
tion and ±1 C/D. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.1 Relationship between a peer’s profit rate and the number of peers in
the network. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.2 Representation of a reputation system’s role in a trading network.
Transaction observations update peer reputations maintained in the
trust vector. Reputation information is then used by peers in transac-
tions to improve expected utility. . . . . . . . . . . . . . . . . . . . . 81
4.3 A peer’s trust rating over time. . . . . . . . . . . . . . . . . . . . . . 88
4.4 Convergence of T as t→∞. Note the logscale x-axis. CB = 0 in both. 90
4.5 A peer’s utility over time. Initial trust T(0) = 0.01. Higher is better. 92
4.6 A peer’s utility over time. Initial trust T(0) = 0.0035. . . . . . . . . . 93
4.7 Minimum capacity needed for a good peer to (eventually) generate
positive profit (using default πgt, kv, and kc) is approximately 0.035
(for default parameters). . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.8 Capacity distribution for base population. . . . . . . . . . . . . . . . 99
4.9 Trust and utility values for default population after 200 turns. . . . . 100
4.10 Distribution of credits in base population at turn 200. . . . . . . . . . 100
4.11 Trust and utility for base population after 1000 turns. . . . . . . . . . 102
4.12 Trust and utility for NR=400 after 1000 turns. . . . . . . . . . . . . . 104
4.13 Trust and utility for NR=1 after 1000 turns. . . . . . . . . . . . . . . 105
4.14 Trust and utility for MTPP=2 after 1000 turns. . . . . . . . . . . . . 107
4.15 Utility for MTPP=3 after 1000 turns. . . . . . . . . . . . . . . . . . . 108
4.16 Comparing the analytical and simulation results for the convergence
of T as t→∞ as a function of C = CG. Note the logscale x-axis. . . 110
4.17 Comparing the analytical and simulation results of trust over time for
new good peers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.18 Comparing the analytical and simulation results of trust over time.
MTPP=1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
4.19 Comparing utility over time for new good peer. MTPP=2. . . . . . . 113
4.20 Comparing utility over time for new bad peer. MTPP=1. . . . . . . . 114
4.21 Effects of varying trust factor σ. . . . . . . . . . . . . . . . . . . . . . 116
4.22 Comparison of ratio trust model to differential trust model. T(0) = 0.01 . . 119
4.23 πgt w.r.t T for various functions of T . . . . . . . . . . . . . . . . . . . 123
4.24 Effects of sample πgt w.r.t varying functions of T . . . . . . . . . . . . 124
4.25 Steady-state trust as a function of CB. C = 1 . . . . . . . . . . . . . 125
4.26 Steady-state profit as a function of CB. C = 1 . . . . . . . . . . . . . 125
4.27 Effects of varying σ(T, p). . . . . . . . . . . . . . . . . . . . . . . . . 132
5.1 Sample document and matching query . . . . . . . . . . . . . . . . . 141
5.2 Efficiency for varying ρ0. Lower value is better. 1 is optimal. . . . . . 161
5.3 Varying selection threshold values. . . . . . . . . . . . . . . . . . . . 162
5.4 Efficiency comparison. . . . . . . . . . . . . . . . . . . . . . . . . . . 164
5.5 Relative message traffic of Friends-First and maximum Friend-Cache
utilization w.r.t. cache size. . . . . . . . . . . . . . . . . . . . . . . . 167
5.6 Efficiency of voting reputation system w.r.t. varying quorum weight. . 170
5.7 Efficiency of the voting reputation system w.r.t. Friend-Cache size. . 172
5.8 Effects of front nodes on efficiency. . . . . . . . . . . . . . . . . . . . 174
5.9 Efficiency of two reputation systems with the random algorithm as a
function of πB. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
5.10 Average load on well-behaved nodes as a function of pB. . . . . . . . 177
5.11 Distribution of load on good nodes (and their corresponding number
of files shared). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
5.12 Efficiency comparison of local and ideal reputation systems under the
node-based threat model. . . . . . . . . . . . . . . . . . . . . . . . . . 182
5.13 Efficiency comparison of reputation systems with uniformly distributed
node threat ratings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
5.14 Comparison of the local reputation system with ρT of 0.0 and 0.15 and
the base case over time. . . . . . . . . . . . . . . . . . . . . . . . . . 185
5.15 Comparison of the local reputation system with both Weighted and
Select-Best variants and a selection threshold of 0.0 and 0.15 and the
base case over time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
5.16 Comparison of the efficiency of the reputation systems over time. . . 189
5.17 Expected steady-state system behavior . . . . . . . . . . . . . . . . . 201
6.1 Performance of SPROUT and AC in different size Small World networks. . . 217
6.2 Performance of SPROUT and AC for different trust functions and vary-
ing f . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
6.3 Performance of SPROUT and AC for varying r. . . . . . . . . . . . . 220
6.4 Performance as a function of a node’s degree. Club Nexus data. . . . 221
6.5 Performance of SPROUT and AC for different uniform networks with
varying degrees. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
6.6 Performance of SPROUT and AC versus unstructured flooding. . . . 224
6.7 Latency measurements for SPROUT vs AC w.r.t. network size. Lower
is better. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
6.8 Distribution of load (in fraction of routes) for augmented Chord and
SPROUT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
7.1 Example of a route request. . . . . . . . . . . . . . . . . . . . . . 236
7.2 Watchdog in action . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
7.3 Node A does not hear B forward packet 1 to C, because B’s transmis-
sion collides at A with packet 2 from the source S. . . . . . . . . . . . 238
7.4 Node A believes that B has forwarded packet 1 on to C, though C
never received the packet due to a collision with packet 2. . . . . . . 239
7.5 Overall network throughput as a function of the fraction of misbehav-
ing nodes in the network. . . . . . . . . . . . . . . . . . . . . . . . . . 246
7.6 This figure shows routing overhead as a ratio of routing packet trans-
missions to data packet transmissions. This ratio is plotted against the
fraction of misbehaving nodes. . . . . . . . . . . . . . . . . . . . . . . 248
7.7 Comparison of network throughput between the regular Watchdog and
a Watchdog that reports no false positives. . . . . . . . . . . . . . . . 250
Chapter 1
Introduction
Previously, the ability to both send and receive large amounts of digital content and
data was limited to large institutions with the funds and resources to install and
manage high-speed networks and fast server machines. However, the increasing avail-
ability of high bandwidth Internet connections and low-cost, commodity computers
in people’s homes allows regular home users to quickly communicate and share data
with each other. This spread of computing resources has stimulated the use of re-
source sharing peer-to-peer (P2P) networks. These systems employ a simple scalable
mechanism that allows anyone to offer content and services to other users, as well as
search for and request resources from the network.
What distinguishes P2P systems from other distributed systems is their focus on
full user autonomy. Typically, distributed systems consist of computers managed
by a single organization or hierarchy. Devising an efficient architecture that spans
many networked machines is much simpler when all machines can be monitored and
controlled by a single operator.
However, in pure P2P architectures there are no centralized services or control
mechanisms dictating the actions of other nodes. Each user decides what computing
resources he will contribute, as well as when and for how long. The architecture is
designed to handle large numbers of nodes joining and abruptly leaving the network.
In addition, these systems emphasize equality and balancing the load across nodes.
This flexibility, self-determination and low participation cost encourage a much larger
number of participants, which, in turn, greatly increases the number and value of the
services provided by the system to all.
The most important contribution of peer-to-peer system research is providing an
architecture that allows a group of users spread throughout the Internet to cheaply
and efficiently connect their commodity computing resources into one massive system,
useable by all. The implications for rapid prototyping and deployment of new ser-
vices by small teams of developers without large amounts of capital are astounding.
Already we see P2P systems that handle a plethora of applications, ranging from grid
computing to data storage to digital preservation.
However, current media attention to peer-to-peer systems is concentrated on the
legal issues of copyright infringement that plague popular file-sharing applications.
Users have discovered P2P networks to be an efficient and cheap method of
transmitting digital content. These transmissions, however, are often made without
the consent of the content's legal owners. No legally acceptable solution to content
distribution using P2P technology is deployed today. If such a solution existed, both
content owners/creators and consumers would benefit greatly.
To understand the potential impact of P2P systems, we must step back and chroni-
cle the evolution of media distribution. Currently, the cost of setting up and managing
traditional media distribution channels is too great for individual content creators to
overcome, resulting in a few large monopolistic companies that control all develop-
ment and distribution of media, such as music, movies and books. These companies
decide what media is produced based primarily on what can be marketed for max-
imum profit, not artistic merit. This filtering severely limits the public’s access to
new and diverse content and ideas.
The evolution of the World Wide Web has greatly helped independent artists
and authors to reach a larger segment of the population. Artists can now distribute
or sell their work in digital form from their websites, circumventing the packaging,
transportation, and retail costs of CDs, DVDs and books. The Web has also enabled
the sale of all kinds of material goods by ordinary people on a global scale. The best
example of this is the auction site eBay [42], which allows any individual to advertise
and auction items to people all over the world. Not only has the Web created new
distribution channels for digital content, but it provides a cheap solution for global
advertising of physical items.
Although the Web has lowered the cost of distribution and marketing, it does
impose costs that are still too great for many users. Websites that distribute songs
or movies will require large amounts of bandwidth to serve all their customers, and
bandwidth costs money. Running a commercial website with the necessary computing
resources to handle sales and distribution for a vast number of customers is still
beyond the capacity of most individuals. This need for technical capital has resulted
in the emergence of large companies that specialize in digital content distribution.
These new electronic distribution middlemen, such as eBay and Apple’s iTunes [8],
are once again in a position of power over the content creators. They decide what
is sold and what they charge for access to their service. Many eBay merchants are
unhappy with the fees they must pay eBay to use its services. Every increase in
fees results in sellers leaving eBay as they lose the already slim profit margins they
maintained [86]. A new distribution revolution is needed.
This revolution is coming in the form of P2P networks. When content can be
transferred between customers without involving a single centralized server, the com-
putational and bandwidth burdens on the content creator or owner are removed. The
cost of distribution would be much lower for the content owner and the distribution
channels could no longer be monopolized by a small group of middlemen. The result
4 CHAPTER 1. INTRODUCTION
would mean lower prices for consumers and increased profits for the producers. Mer-
chants who have left eBay (or never used it) due to the increasing fees may welcome
a pure P2P-commerce solution where no fees are collected and all sellers participate
equally.
Unfortunately, both producers and consumers are reluctant to use P2P
networks for distribution. P2P technology is not sufficiently mature to support a secure
and safe method for purchasing content through these systems. The primary hurdles
are: providing an efficient, secure mechanism for purchasing content, a universally ac-
cepted method for verifying content authenticity and ownership, and ways to prevent
or mitigate attacks on the system by malicious users. These attacks include:
• defrauding customers and stealing their money,
• intentionally modifying content to damage the owner and/or creator of the con-
tent, and
• using content distribution to infect computers with worms or viruses.
Because of the lack of a secure payment system that prevents or punishes malicious
attackers, P2P technology is not yet a viable distribution medium.
These worries have appeared before whenever a new distribution channel emerged,
most recently with e-commerce over the World Wide Web. Each time, methods
and practices were developed to combat malicious activity and instill confidence in
consumers and sellers alike. These mechanisms have proven successful. In 2004,
Americans spent approximately $115 billion on online purchases, up over 25% from
the previous year [66, 130, 134]. eBay alone posted 2004 revenues of $3.3 billion [135].
The success of eBay is of special relevance because eBay is a hybrid peer-to-peer
system. Although certain functions such as indexing and auction management are
operated by a centralized server, the distribution of goods and payment are handled
directly between the buyers and sellers.
Now researchers are working fervently to develop the secure payment, digital rights
management, auditing and enforcement mechanisms peer-to-peer systems need in or-
der to allow users to confidently purchase and distribute all kinds of content. A
major component in detecting and mitigating malicious attacks will be the reputa-
tion system. Online trading and auction systems, such as eBay, employ reputation
systems as a means of distinguishing well-behaved productive users from the selfish
or malicious peers. Reputation systems provide users with a summarized (perhaps
imperfect) history of another peer's transactions. Peers use this information to
decide how much to trust an unknown peer before interacting with it themselves.
Scholars and researchers have adopted reputation systems as a useful mechanism
for detecting, containing and discouraging misbehavior in P2P networks. Unfortu-
nately, the lack of a centralized trusted entity capable of monitoring user behavior and
enforcing rules complicates the design of mechanisms for detecting and preventing
malicious behavior in autonomous environments. However, it is this challenge that
most inspires the work presented in this thesis, as well as the research field of security
for peer-to-peer systems. Secure solutions will encourage more users to engage in
larger-valued transactions through the flexible and efficient commercial medium of
P2P systems. This growth will drive the burgeoning economy of digital goods and
services. Reputation systems are necessary if P2P systems are to revolutionize
content and information distribution as much as, if not more than, the World Wide
Web did, as the cost of distribution is lowered once again.
1.1 Research Contributions and Thesis Outline
This thesis presents a top-down exploration of designing reputation systems for au-
tonomous, decentralized computer systems. After an introductory decomposition and
survey of the research field, we present high-level models of the relationship between
reputation and user behavior in typical trading systems. We then focus on P2P
networks, using detailed simulations to investigate characteristics of basic system de-
sign decisions. Finally, we present two novel applications of trust and reputation
for routing security in different autonomous networks. The following thesis outline
describes the content of each chapter and touches on the major findings or research
contributions discussed therein.
Chapter 2 lays out an overview of the area of reputation system research geared
towards peer-to-peer networks. We decompose peer-to-peer reputation systems into
separate components. Each component must provide certain properties or capabilities
in order for the whole system to function. Designing mechanisms that achieve these
properties in an autonomous, transient network yields the most interesting research
problems. In addition to defining terms used throughout the thesis, this chapter
discusses in detail related work in this vast field of research. Further chapters briefly
describe related research that is more closely tied to results presented in the chapter.
The next two chapters study reputation in general systems where resources or
commodity goods are exchanged. Although the examples used for illustration focus
on online trade, the resulting conclusions are applicable to many economic systems.
Chapters 3 and 4 present theoretical models for how reputation affects user behavior
and utility, each applying a different approach at different granularity. These models
provide a framework for evaluating reputation algorithms using economic metrics,
which we then use to analyze high-level implementation issues. Based on these studies,
we propose guidelines for reputation system designers. Chapter 3 applies elementary
game theory to explore agent strategies on a microeconomic scale. Chapter 4 expands
these ideas to a macroeconomic mathematical model for expected user performance
in a large-scale online trading system. Our mathematical model is then compared to
simulation results.
In Chapter 5, we look closely at using limited reputation sharing in unstruc-
tured peer-to-peer resource-sharing networks. We propose several performance met-
rics (such as message traffic, load and efficiency) that allow us to evaluate and compare
reputation systems. Through detailed simulations of multiple variations on the basic
reputation system, we quantify the effects of certain system properties and design
choices. Our study demonstrates that even a small amount of reputation information
collecting and sharing can vastly improve a peer’s ability to locate and fetch valid
resources, even when faced with large-scale whitewashing and collusion by malicious
peers. In addition, certain methods for calculating reputation and ranking peers may
perform equally well in terms of detecting and avoiding malicious peers, but have
vastly differing effects on load balancing.
The following two chapters each present specific protocols/mechanisms that ex-
ploit reputation information in order to improve message routing performance in
two types of networks that vary both in their physical medium and their structure.
Chapter 6 proposes the SPROUT protocol for incorporating existing social network
information and services into a structured P2P network in order to improve the re-
liability of message transmission. Using our model of “social trust” we show that
SPROUT can improve expected message delivery by 50%.
Chapter 7 concentrates on the issue of trust in ad hoc wireless routing. The Watch-
dog mechanism uses the inherent broadcast nature of wireless transmission to detect
when packets are not forwarded correctly by eavesdropping on next-hop
transmissions. The reputation of nodes along a path is incremented or decremented based
on the message throughput. These reputations are used when selecting new paths as
nodes move around. Simulations show Watchdog improves routing throughput by up
to 27% under high mobility when 40% of the nodes fail to route correctly.
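As a rough sketch of the per-neighbor bookkeeping such a mechanism implies (the class name, the neutral starting rating, and the update increments below are our illustrative assumptions, not the actual parameters used in Chapter 7), each node might track its neighbors as follows:

```python
class Watchdog:
    """Illustrative sketch: a node overhears whether its next hop
    retransmits each packet and adjusts that neighbor's rating."""

    def __init__(self, delta_up=0.01, delta_down=0.05):
        self.rating = {}                    # neighbor id -> rating in [0, 1]
        self.delta_up = delta_up            # reward for an overheard forward
        self.delta_down = delta_down        # penalty for a dropped packet

    def observe(self, neighbor, forwarded):
        r = self.rating.get(neighbor, 0.5)  # strangers start neutral
        if forwarded:
            r = min(1.0, r + self.delta_up)
        else:
            r = max(0.0, r - self.delta_down)
        self.rating[neighbor] = r

    def best_next_hop(self, candidates):
        """Prefer the candidate neighbor with the highest rating."""
        return max(candidates, key=lambda n: self.rating.get(n, 0.5))
```

Penalizing drops more heavily than forwards are rewarded, as in this sketch, makes a neighbor's rating fall quickly once it starts misbehaving.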
Finally, we give our concluding comments in Chapter 8.
Chapter 2
Taxonomy of Trust: Categorizing
P2P Reputation Systems
The development of any complex computer architecture can be a challenge. This is
especially true of a complex distributed algorithm that is run by autonomous un-
trusted agents, yet is expected to be relatively reliable, efficient, and secure. Such is
the task of designing a complete reputation system for use in peer-to-peer networks.
To accomplish this task, it is necessary to break the problem down into separate,
simpler problems, each of constructing a mechanism that provides a specific set of
functions or properties, allowing developers to "divide and conquer" the problem of reputation
system design.
Our primary goal in this chapter is to provide a useful taxonomy of the field of
peer-to-peer reputation design. To accomplish this goal, we identify the three basic
components of a reputation system, break them down into the necessary separate
mechanisms, and categorize properties we feel the mechanisms need to provide in
order for the reputation system to fulfill its function. For each mechanism we list
possible design choices proposed by the research community.
In the process, we give examples of research in the area of trust and reputation. A
variety of research papers and implementations are referenced to illustrate ideas and
provide the reader avenues for further investigation. We often draw on work done by
the Peers research group [1] at Stanford University and do not pretend to produce
a complete survey of the research area. We feel this overview will be of particular
interest to those who are unfamiliar with the breadth of issues relating to reputation
system design for peer-to-peer networks.

Table 2.1: Breakdown of Reputation System Components

  Information Gathering   Scoring and Ranking     Response
  ---------------------   ---------------------   ----------
  Identity Scheme         Good vs. Bad Behavior   Incentives
  Info. Sources           Quantity vs. Quality    Punishment
  Info. Aggregation       Time-dependence
  Stranger Policy         Selection Threshold
                          Peer Selection
Taxonomies related to trust and reputation systems (either in part or as a whole)
have been proposed by others (e.g., Daswani [33] and O'Hara et al. [101]) and will be
discussed in the text when appropriate.
2.0.1 Taxonomy Overview
The following section defines terms we use throughout the thesis. We begin our tax-
onomy by classifying the common assumptions and constraints that guide reputation
system design in Section 2.2. These assumptions include expected user behavior, as
well as the goals of adversaries in the system and their capabilities. How effectively a
reputation system can deal with adversaries may be constrained by the technical
limitations imposed on the implementation by the target system environment. These
issues determine the necessary properties and powers of the reputation system.
Next, we break down the functionality of a reputation system into the three com-
ponents shown in Table 2.1. In general, a reputation system assists agents in choosing
a reliable peer (if possible) to transact with when one or more have offered the agent
a service or resource. To provide this function, a reputation system collects infor-
mation on the transactional behavior of each peer (information gathering), scores
and ranks the peers based on expected reliability (scoring and ranking), and allows
the system to take action against malicious peers while rewarding contributors (re-
sponse). Each component requires separate system mechanisms (listed in Table 2.1).
For each mechanism we study the possible desired properties and then discuss the
implementation limitations and trade-offs that may prevent some of the properties
from being met. In the discussion we will reference existing solutions or research to
illustrate how different mechanism designs achieve certain properties within the given
system constraints.
The three functionalities (gathering, scoring, and response) are covered in turn in
Sections 2.3, 2.4, and 2.5.
2.1 Terms and Definitions
Before discussing the various taxonomies we would like to define certain terms we will
be using throughout the thesis:
Peer A single active entity in any system or network of autonomous entities. In
general, a peer in a system is associated with a specific user and his/her rep-
resentation in a network. However, in some systems it is possible for a single
human user to control multiple network entities with different identities (as used
in Sybil attacks [38]). Also, a user’s computer may be compromised by a worm
or trojan horse and consequently the computer may behave differently in the
network than the user intended. The user may even be unaware the computer
is misbehaving. Therefore, we distinguish between a user and the user's
representation(s) or node(s) in the network. At times, we will use the terms node, agent,
or even user (when not considering compromised clients) synonymously with
peer. For instance, in Chapter 3 we use the term agent out of the tradition of
the field of game theory.
Transactions Peer-to-peer systems are defined by interactions between autonomous
agents or peers. These interactions may include swapping files, storing data,
answering queries, or sharing CPU cycles. In addition, money may be exchanged
when purchasing the desired resource. We refer to all interactions in general as
transactions between two parties.
Cooperate/Defect When well-behaved peers carry out transactions correctly, we
say they cooperate. Bad peers, however, may at times attempt to cheat or
defraud another peer, in which case they defect on the transaction. We will use
these terms (when applicable) when discussing general system/peer behavior.
Structured vs Unstructured P2P network architectures tend to be categorized
as either structured or unstructured, depending on how the overlay topology is
formed. Structured networks use a specific protocol to assign network IDs and
establish links to new peers and are exemplified by the class of systems called
Distributed Hash Tables (DHTs) (e.g. [127, 113, 118]). In purely unstructured
topologies, new users connect randomly to other peers. A hybrid approach is
to designate certain peers as supernodes (or ultrapeers); the supernodes form an
unstructured network among themselves, and all other peers connect to them. Such
organization is used in most
popular file-sharing systems (e.g. [56, 74]). However, for simplicity, we will
classify supernode networks as unstructured networks [139].
Strangers Peers that appear to be new to the system. They have not interacted
with other peers and therefore no trust information is available.
Adversary A general term for agents that wish to harm other peers
or the system, or act in ways contrary to “acceptable” behavior. This may
include accessing restricted information, corrupting data, maliciously attacking
other nodes in the network, or attempting to take down the system services.
2.2 Assumptions and Constraints
The driving force behind reputation system design is providing a service that severely
mitigates misbehavior while imposing a minimal cost on the well-behaved users. To
that end, it is important to understand the requirements imposed on system design by
each of the following: the behavior and expectations of typical good users, the goals
and attacks of adversaries, and the technical limitations resulting from the
environment where the system is deployed. We discuss each of these below. The
choices made will impact the necessary mechanism properties discussed in
Sections 2.3, 2.4, and 2.5.
2.2.1 User Behavior
A system designer must build a system that is accessible to its intended users, provides
the level of functionality they require and does not hinder or burden them to the
point of driving them away. Therefore, it is important to anticipate any allowable
user behavior and meet their needs, regardless of added system complexity.
Examples of user behavior and requirements that affect distributed mechanism
design include:
Node churn The rate at which peers enter and leave the network, as well as how
gracefully they disconnect, affects many areas from network routing to content
availability. Higher levels of churn require increased data replication, redun-
dant routing paths, and topology repair protocols [60]. The node lifetime in
the system determines how much information can be collected for the purpose of
computing its reputation, as well as how long that information is useful.
Reliability For most applications, users require certain guarantees on the reliabil-
ity or availability of system services. For example, a distributed data storage
application would want to guarantee that data stored by a user will always be
available to the user with high probability and that it will persist in the network
(even if temporarily offline) with a much higher probability [81]. The situation
is more difficult in peer-to-peer networks where adversaries are actively attempt-
ing to corrupt the content peers provide. Group auditing techniques may help
detect or prevent data loss [87].
Privacy Along with reliability, users that store data in an untrusted distributed sys-
tem would also want to protect the content from being accessed by unauthorized
users. One solution is to encrypt all data before storing [81]. However, in some
applications access to unencrypted data is necessary for processing. Separat-
ing sensitive data from subject identities, or using legally binding strict privacy
policies may be sufficient [115, 6, 7].
Anonymity As a specific application of privacy, users may only be willing to par-
ticipate if a certain amount of anonymity is guaranteed. This may vary from
no anonymity requirements, to hiding real-world identity behind a pseudonym,
to requiring that an agent’s actions be completely disconnected from both his
real-world identity and his other actions. Obviously, a reputation system would
be infeasible under the last requirement.
2.2.2 Threat Model
The two primary types of adversaries in peer-to-peer networks are selfish peers and
malicious peers. They are distinguished primarily by their goals in the system. Self-
ish peers wish to use system services while contributing minimal or no resources
themselves. A well-known example of selfish peers is the "freerider" [5] in file-sharing
networks such as Kazaa and Gnutella. To minimize their bandwidth and
CPU costs, freeriders refuse to share files in the network.
The goal of malicious peers, on the other hand, is to cause harm to either specific
targeted members of the network or the system as a whole. To accomplish this goal,
they are willing to spend any amount of resources (though resource-constrained
malicious peers can be considered a distinct subclass). Examples include
distributing corrupted audio files on music-sharing networks to discourage piracy [98]
or disseminating virus-infected files for notoriety [12].
Reputation system designers usually target a certain type of adversary. For in-
stance, incentive schemes that encourage cooperation may work well against selfish
peers but be ineffective against malicious peers. The number or fraction of peers that
are adversaries also impacts design. Byzantine protocols, for example, assume fewer
than a third of the peers are misbehaving [21].
The work presented in this thesis tackles both selfish and malicious peers, although
some sections may focus on a single type of adversary.
Adversarial Powers
Next, a designer must decide what techniques he expects the adversaries to employ
against the system and build in mechanisms to combat those techniques. The follow-
ing list briefly describes the more general techniques available to adversaries.
Traitors Some malicious peers may behave properly for a period of time in order to
build up a strongly positive reputation, then begin defecting. This technique
is effective when increased reputation gives a peer additional privileges, thus
allowing malicious peers to do extra damage to the system when they defect.
Examples of traitors are eBay merchants that participate in many small
transactions in order to build up a high positive reputation, and then defraud
one or more buyers on a high-priced item. Traitors may also be the computers
of well-behaved users that have been compromised through a virus or trojan
horse. These machines will act to further the goals of the malicious user that
subverted them.
Collusion In many situations multiple malicious peers acting together can cause
more damage than each acting independently. This is especially true in peer-
to-peer reputation systems, where covert affiliations are untraceable and the
opinions of unknown peers impact one's decisions. Most research devoted to
defeating collusion assumes that if a group of peers collude, they act as a single
unit, each peer being fully aware of the information and intent of every other
colluding peer [87].
Front peers Also referred to as “moles” [45], these malicious colluding peers always
cooperate with others in order to increase their reputation. They then provide
misinformation to promote actively malicious peers. This form of attack is par-
ticularly difficult to prevent in an environment where there are no pre-existing
trust relationships and peers have only the word and actions of others in guiding
their interactions [93] (see Sec. 5.6.2).
Whitewashers Peers that purposefully leave and rejoin the system with a new iden-
tity in an attempt to shed any bad reputation they have accumulated under their
previous identity [83]. Whitewashers are discussed in depth in later sections and
chapters (see Sec. 2.3.3 and Chp. 5).
Denial of Service (DoS) Whether conducted at the application layer or network
layer, Denial of Service attacks usually involve the adversary bringing to bear
large amounts of resources to completely disrupt service usage. Using Internet
worms, however, malicious users can minimize their own personal
resource usage while amplifying the damage done through Distributed DoS at-
tacks. Much work has been done on detecting, managing, and preventing DoS
attacks. P2P-specific applications include [34, 35, 55] in unstructured networks
and [21] in DHT networks. Not only would we like reputation systems to detect
DoS attackers, but such attacks could be used against the reputation mechanism
itself.
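Traitor attacks in particular motivate the time-dependence mechanism listed under scoring in Table 2.1: if older observations are decayed, a long run of good behavior cannot bankroll a burst of defections. A minimal sketch, with a hypothetical decay constant and scoring scale of our choosing:

```python
def decayed_reputation(outcomes, decay=0.9):
    """Exponentially weighted score of a chronological list of transaction
    outcomes (+1.0 for cooperate, -1.0 for defect). Recent behavior
    dominates, so a traitor's defections surface quickly."""
    score = 0.0
    for outcome in outcomes:
        score = decay * score + (1 - decay) * outcome
    return score
```

After 50 cooperations the score sits near 1.0, but only five subsequent defections drive it below 0.2, whereas a plain lifetime average of the same history would still read about 0.82.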
As we discuss different mechanisms, we will reference these tactics and explain
how certain system properties can help against them. Most of the existing research
does not claim to handle malicious peers that bring to bear all these attacks at once.
In fact, much of the work focuses solely on independent selfish peers.
While Chapter 3 deals solely with the simplest case of selfish peers, the following
chapters (and particularly Chapter 5) study in depth the issues surrounding malicious
peers that use all these adversarial techniques.
2.2.3 Environmental Limitations
The primary division among system component architectures is centralized versus
decentralized. Implementing certain functionality at a single trusted entity can sim-
plify mechanism design and provide a more efficient system. As we will see, some
component properties can only be attained using the management and auditing ca-
pabilities afforded by a single point of trust. Of course centralization also has several
drawbacks. It may be infeasible to have a single entity all agents trust. A cen-
tralized server becomes a single point of failure as well as a bottleneck. Providing
performance and robustness requires the controlling entity to unilaterally invest large
sums of money. It also makes for a single point of attack by adversaries, either by
infiltration, subversion, or DoS attacks.
Between purely centralized and purely decentralized lies a spectrum of hybrid
architectures. For simplicity, we will refer to proposed mechanisms as centralized
if they require one entity (or a small number of entities) trusted by all users to
handle some service for the entire system, even if that entity need only be available
intermittently rather than continuously. Otherwise, the mechanism is decentralized.
2.3 Gathering Information
The first component of a reputation system is responsible for collecting information
on the behavior of peers, which will be used to determine how “trustworthy” they
are (either on an absolute scale or relative to the other peers).
2.3.1 System Identities
Associating a history of behavior with a particular agent requires a sufficiently per-
sistent identifier. Therefore, our first concern is the type of identities employed by
the peers in the system. There are several properties an identity scheme may have,
not all of which can be met with a single design. In fact, some properties are in direct
conflict with each other. The properties we focus on are:
Anonymity As previously mentioned in Section 2.2.1, the level of anonymity offered
by an identity scheme can vary from using real-world identities to preventing
any correlation of actions as being from the same agent.
Most peer-to-peer networks, such as Kazaa [74], use simple, user-generated
pseudonyms. Since peers connect directly to one another, their IP addresses are
public, providing the closest association between the agent’s actions and their
real-world identity. To hide their IP addresses users can employ redirection
schemes, such as Onion routing [128]. A P2P-specific solution using anonymiz-
ing tunnels is Tarzan [47]. Frequently changing pseudonyms and routing tunnels
disassociates the user’s actions from each other.
Though full anonymity prevents building user reputation, some peer-to-peer
reputation systems, such as TrustMe [120], use pseudo-anonymity to encourage
honest information sharing without fear of retribution. Each peer is assigned
two identifiers: one for transactions and another for reporting reputation
information and scores. A centralized login server minimizes fraud and
whitewashing.
Spoof-resistant To prevent adversaries from impersonating other peers, identities
must be resistant to spoofing. One common solution is the use of public/private
key pairs. If a peer uses its public key as its identifier, other peers can verify
that any communication comes to or from that peer, assuming the use of nonces
to defeat replay attacks. However, initially transmitting one's public key may
still be susceptible to man-in-the-middle attacks. Certificates signed by an a
priori trusted certificate authority (CA) can help, but require a centralized
mechanism.
Unforgeable In addition to being spoof-resistant, unforgeable identities protect
against whitewashers and Sybil attacks [38], where a single user poses as several
distinct peers in the network. Unforgeable identities are usually generated by a
trusted system entity and given to new users as they join. These identifiers can
be proven to have been generated by this trusted entity and only that entity.
Notice a user’s public/private key pair is not sufficient. A certificate for that
public key issued by a trusted CA is. Login servers can also authenticate users
as they enter. The CA or login server may require real-world identity proof to
ensure that each user receives only one system identifier, perhaps using credit
card verification [48, 21]. These solutions are necessarily centralized. Decen-
tralized solutions usually require identifiers that are costly to produce, though
not strictly unforgeable. Costly identifiers help slow the rate of whitewashing
or generating multiple identities, but do not eliminate it [38].

Figure 2.1: Representation of primary identity scheme properties, spanning
unforgeability, anonymity, and cost.
The effectiveness of any solution at providing a given property lies on a cost
scale (e.g., cycles, bandwidth, dollars). An adversary with infinite resources can
compromise any property. For example, most resilient unforgeable or spoof-resistant
identity schemes rely on the secrecy of private keys. Given enough CPU power,
an adversary can crack the key and therefore negate its intended purpose. An informal
representation of the spectrum of identity choices is presented in Figure 2.1.
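One decentralized way to make identifiers costly can be sketched under the assumption of a hashcash-style proof of work (the function names and difficulty parameter below are ours, not from any particular system): a valid identifier must include a nonce whose hash with the public key clears a difficulty target, so minting is expensive while verification is cheap.

```python
import hashlib
import itertools

def mint_identity(pubkey, difficulty=16):
    """Search for a nonce such that SHA-256(pubkey || nonce) falls below
    a target with `difficulty` leading zero bits. Expected cost grows as
    2**difficulty hash evaluations."""
    target = 1 << (256 - difficulty)
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{pubkey}:{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

def verify_identity(pubkey, nonce, difficulty=16):
    """A single hash evaluation: cheap for honest peers to check."""
    digest = hashlib.sha256(f"{pubkey}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty))
```

Raising `difficulty` slows whitewashing and Sybil identity generation in proportion to the adversary's hash rate but, as noted above, does not eliminate either attack.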
2.3.2 Information Sharing
Using established network identities, a reputation system protocol collects
information on a given peer's behavior in previous transactions in order to determine its
reputation. Examples of useful information include reports on the success or failure
of a transaction by one or both parties, as well as the quality of the service/resource
provided. This information collection may be done individually by each peer in a
reactive method, or proactively by all peers collating together their experiences. In
this section we discuss the sources from which information is collected, what quality
of information agents can expect to collect, and how peers combine information from
different sources.
Sources of Information
In general, quantity and quality of information are diametrically opposed. As the
amount of information gathered increases, the credibility of each piece of information
usually decreases.
The most cautious individuals may only want to rely on their own personal ex-
perience and use only local information when determining whether to transact with
a given peer. Of course without additional information, the individual risks being
cheated the first time they interact with each adversary. However, local information
may be sufficient if the agent locates a few well-behaved peers able to repeatedly
provide good service [91].
To increase its information sources, a cautious agent can collect the opinions of
users with whom it has a priori trust relationships established outside the system.
These may include friends from the user's personal life, coworkers, business
relationships, or even trusted members of social networks [89, 63] (see Sec. 2.6.2 and
Chapter 6).
Even with personal experience and the opinions of friends (that are currently
online), an agent is unlikely to have any information on a particular random peer. To
gather more opinions an agent can ask peers it has met in the P2P network, such as
its neighbors in the overlay network, or peers who have already provided good service,
proving themselves reputable. The question now is how many peers to query for their
opinions (we discuss how to aggregate these opinions in Sec. 2.3.2). Asking a small
number of peers limits the communication overhead on the network [93], while asking
a larger number improves the chances of collecting useful information on a specific
peer [25].
If the number of personally-proven reputable peers is small, then an agent may
request that each of those peers collect the opinions of other peers they believe are
reputable, recursively. Each additional step exponentially increases the information
sources. Information located through a transitive trust chain may be more reliable
than asking a random peer [141, 84, 45].
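The escalation from personal experience to transitive queries can be sketched as a depth-bounded recursion (the `Agent` class and its field names are our own illustration, not an interface from the literature):

```python
class Agent:
    """Illustrative peer that records personal experience and a set of
    trusted friends, and gathers opinions transitively."""

    def __init__(self, name):
        self.name = name
        self.history = {}   # peer name -> score from personal experience
        self.trusted = {}   # trusted Agent -> rating we assign that friend

    def gather(self, target, depth):
        """Collect (score, trust_chain) opinions about `target`, asking
        trusted friends recursively up to `depth` hops away. Each hop
        extends the chain of friend ratings, which a caller can later
        use to discount the opinion."""
        opinions = []
        if target in self.history:
            opinions.append((self.history[target], []))  # first-hand: empty chain
        if depth > 0:
            for friend, rating in self.trusted.items():
                for score, chain in friend.gather(target, depth - 1):
                    opinions.append((score, [rating] + chain))
        return opinions
```

Each extra hop multiplies the number of reachable sources by the average number of trusted friends, matching the exponential growth in information sources noted above.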
Finally, there are the global history reputation systems that collect information
about all peers from all peers. These solutions are the most comprehensive as well
as the most complex to implement. While the probability that any single opinion is
fraudulent may be greater, the collective sum of all opinions is likely to be accurate,
even when a large fraction of the peers are malicious colluding adversaries.
While previous information-sharing techniques are easily decentralized, global his-
tory systems tend not to be. Perhaps the most widely used reputation system is that
of eBay [42], which consists of a single trusted entity that collects all transaction
reports and rates each user. Global history systems proposed for P2P networks tend
to be more distributed. TrustMe [120] relies on a centralized server to assign unforge-
able identities, but reputation adjustments and lookups are handled purely between
peers. EigenTrust [73] offers a fully decentralized solution using weak identities,
leaving it more vulnerable to whitewashing.
In conclusion, a peer’s reputation is based on information collected about that
peer from one or multiple sources. The primary sources are: personal experience,
external trusted sources, one-hop trusted peers, multi-hop trusted peers, and a global
system. Each successive source provides more information about peers. However,
that information also becomes less credible.
In [101], O’Hara et al. categorize “trust strategies” for the Semantic Web based
on how agents react to peers they have no personal experience with. Their five basic
strategies are: optimistically assuming all strangers are trustworthy unless proven
otherwise; pessimistically ignoring all strangers unless they are proven trustworthy;
investigating a stranger by asking trusted peers; transitively propagating the
investigation through friends of friends; or using a centralized reputation system. Notice
that their taxonomy mirrors that presented here.
Information Integrity
One major problem with reputation systems is guaranteeing the validity of opinions.
It is impossible to enforce honest, accurate reporting on transaction outcomes by all
peers. Most reputation systems do not attempt to verify the integrity of information
collected. Instead they assume the majority of users are honest and well-behaved,
and that collecting information from a large number of peers will result in a relatively
accurate assessment of a peer’s behavior.
Reputation systems that hope to combat colluding adversaries and front peers
(which promote each other while denigrating good users) use reputation to weigh the
information and opinions collected. Instead of considering the opinions of each peer,
or each reported transaction experience, equally, these systems weigh the information
based on the trustworthiness of the source when compiling a peer’s reputation rating.
For example, information provided by personal friends would likely be considered
two or three times more accurate than that of a seemingly reputable, but unknown
peer in the network. Of course, when available, personal experience would be valued
the most [93]. Often the opinions of system peers are weighted by their previously
determined reputation scores. Information collected through transitive trust may be
weighted by the reputation rating of the least reputable peer in the trust chain [45].
Or, if reputation ratings lie between 0 and 1, the opinion would be weighted by the
product of the ratings of the peers in the chain.
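These two chain-weighting rules can be sketched in a few lines (an illustrative fragment; the function names are ours, not taken from any cited system):

```python
def chain_weight_min(chain_ratings):
    """Weight transitive information by the least reputable peer in the chain."""
    return min(chain_ratings)

def chain_weight_product(chain_ratings):
    """Weight by the product of the ratings (each in [0, 1]) along the trust chain."""
    weight = 1.0
    for rating in chain_ratings:
        weight *= rating
    return weight
```

For a chain rated [0.9, 0.8, 0.5], the first rule yields 0.5 while the second yields 0.36; the product rule penalizes long chains more heavily.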
One may also want to distinguish second-hand information from third-hand in-
formation. For instance, I may trust a transaction history reported by a peer if it
is based on their personal experience, rather than a reputation based on information
they received from other peers. If peers maintain and share complete records of every
transaction, this allows each peer to individually determine how to weigh each piece
of information. This weight could be computed based on the reputation of the origi-
nal reporting peer, the degree of separation from that peer, and the reputation of the
intermediate peers. However, this introduces significant traffic due to transmitting
the detailed transaction logs. Peers may prefer to share only a single score based
either solely on their personal experience, or a cumulative rating that includes the
scores provided by others. While this method reduces the flexibility offered to each
peer when calculating a final reputation, the use of weights by all peers will still have
the effect of dampening the influence of information from distant peers.
Even global history reputation systems apply reputation-based weighting. Eigen-
Trust [73] uses a distributed algorithm similar to PageRank [103] to compute a global
reputation rating for every peer using individual transaction reports weighted by the
rating of the reporting peer. However, even this algorithm was found to be vulnerable
to widespread collusion. Therefore, the authors suggest each agent separately weigh a
globally computed rating with the personal opinions of trusted peers, when available.
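The flavor of such a computation can be sketched as a PageRank-style power iteration over row-normalized local trust values (an illustration only, not the actual EigenTrust algorithm, which adds pre-trusted peers and other safeguards):

```python
def global_trust(local_trust, iterations=50):
    """Power iteration over row-normalized local trust scores.

    local_trust[i][j] >= 0 is peer i's opinion of peer j. Returns a global
    rating vector summing to 1. A sketch of a PageRank-style computation,
    not the exact EigenTrust algorithm.
    """
    n = len(local_trust)
    # Row-normalize so each peer's outgoing opinions sum to 1.
    C = []
    for row in local_trust:
        s = sum(row)
        C.append([x / s for x in row] if s > 0 else [1.0 / n] * n)
    t = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        t = [sum(t[i] * C[i][j] for i in range(n)) for j in range(n)]
    return t
```

Peers that are trusted by other well-rated peers accumulate higher global scores, which is what allows individual transaction reports to be weighted by the rating of the reporter.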
Some systems attempt to improve the accuracy of the transaction reports by
requiring proof of interaction. TrustMe [120], for example, requires that both parties
in a transaction sign a transaction certificate that is then presented when reporting
on the outcome of the transaction. While this may not prevent malicious peers from
lying about the outcome of a transaction, it does prevent adversaries from submitting
fraudulent reports about peers they have not interacted with in order to besmirch
their reputation.
2.3.3 Dealing with Strangers
With new users joining the system periodically, agents will often encounter peers with
no transaction history available at any source. As the number of sources an agent
gathers information from increases, the frequency of encountering a local stranger
(a peer whom the agent has no direct or indirect experience with or knowledge of)
decreases. In the global history systems all local strangers are also global strangers
(peers whom no agent in the system has interacted with).
When no reputation information can be located, an agent must decide whether to
transact with a stranger based on its stranger policy. As mentioned previously, two
simple strategies are to optimistically trust all strangers, or pessimistically refuse to
interact with them. Both have their drawbacks. Optimistic agents may frequently
be defrauded, especially in systems with high levels of whitewashing. However, in
pessimistic systems, new users will be unable to participate in transactions and will
never build a reputation.
Feldman et al. have done extensive work in analyzing the problem of stranger
policies and whitewashing in P2P networks [83, 45, 46]. They suggest a “stranger
adaptive” strategy. All transaction information on first-time interactions with any
stranger is aggregated together. Using a “generosity” metric based on recent stranger
transactions, an agent estimates the probability of being cheated by the next
stranger and decides whether to trust the next stranger using that probability. This
probabilistic strategy adapts well to the current rate of whitewashing in the system.
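A minimal sketch of such a policy follows (our own simplification of the stranger-adaptive idea; the function name and the optimistic handling of the very first stranger are assumptions, not details from [45]):

```python
import random

def trust_stranger(recent_stranger_outcomes, rng=random):
    """Stranger-adaptive policy in the spirit of Feldman et al. [45] (a sketch,
    not their exact formulation).

    recent_stranger_outcomes: True/False outcomes of recent first-time
    interactions with strangers (True = the stranger cooperated).
    """
    if not recent_stranger_outcomes:
        return True  # no data yet: optimistically give the first stranger a chance
    # "Generosity": fraction of recent first-time interactions that went well.
    generosity = sum(recent_stranger_outcomes) / len(recent_stranger_outcomes)
    # Trust the next stranger with probability equal to that fraction.
    return rng.random() < generosity
```

As whitewashing increases, recent stranger interactions sour, the generosity estimate drops, and the agent automatically becomes less willing to trust the next stranger.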
2.4 Reputation Scoring and Ranking
Once a peer’s transaction history has been collected and properly weighted, a rep-
utation score is computed for that peer, either by an interested agent, a centralized
entity, or by all peers collectively, as in EigenTrust [73]. We will refer to the method
by which the score is computed as a general reputation score function.
The primary purpose of the reputation score is to help an agent decide which
available service provider in the network it should transact with. The two typical
scenarios are:
i) Agent A is offered a resource or service by peer P . A decides if transacting with
P is worth the expected risk of defection, based on P ’s reputation score.
ii) In response to A’s request for a certain resource or service, multiple service
providers (P1, P2,...) respond. A uses the reputation scores of each responder
to rank them in order of how likely they are to provide proper service. A then
chooses the highest ranked provider. Should that transaction fail, A may try
again with the next highest ranked peer.
In the next two sections, we consider the inputs and outputs of the reputation
score function. Which statistics gathered from a peer's transactional history are
most useful in computing its trustworthiness? How should reputation scores be
represented?
2.4.1 Inputs
Regardless of how a peer’s final reputation rating is calculated, it may be based
on various statistics collected from its history. But what statistics should be used
in computing the ranking score? Ideally, both the amount a peer cooperates and
the amount it defects would be taken into account. However, the amount a peer
defects is often unknown. While a malicious peer may openly defect on an agreed
transaction by providing bad service or no service, selfish peers usually defect
“silently”. For example, in file-sharing networks, freeriders refuse to share their
files and ignore queries they
could answer. Other peers cannot determine how often a peer selfishly ignores a
request. However, as suggested in [45], peers can calculate the rate at which an agent
contributes to the network. The contribution rate is a reputation rating based solely
on good work.
When defection information is available, it is usually more useful than cooperation
information. Notice that visible defections usually constitute malicious behavior,
which is more harmful than selfish behavior. While both good and bad behavior
can be taken into account, the negative impact of bad behavior on reputation should
outweigh the positive impact of good behavior.
When only information on positive contributions is available, the reputation will
have to be based solely on the amount of good work done. However, if a history
of a peer's cooperations and defections is available, should the peer's reputation be
based on the quality of the work it has done? Or should the quantity also matter? Our
work shows that while quality alone is useful (see Chapter 5), a score that properly
combines quality and quantity is much more effective and flexible under a variety of
adversarial techniques (see Sec. 4.6.2). Included in quantity should be the value of
each transaction. Intuitively, a peer that defects on one $100 transaction should have
a lower reputation than one who defects on two or three $1 transactions.
If a system wishes to defend against traitors, then reputation scores must consider
time. More recent transaction behavior should have a greater impact on a peer’s score
than older transactions. For example, a weighted transaction history could be used.
This would allow system agents to detect peers who suddenly “go bad” and defend
against them.
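One way to combine transaction value, the cooperate/defect asymmetry, and recency is sketched below; the decay and penalty parameters are illustrative choices of ours, not values taken from any particular system:

```python
def reputation_score(transactions, decay=0.9, penalty=2.0):
    """Value- and recency-weighted reputation score (an illustrative sketch).

    transactions: list of (value, cooperated) pairs ordered oldest to newest.
    Defections are weighted `penalty` times more heavily than cooperations,
    and each step back in time is discounted by `decay`, so a peer who
    suddenly "goes bad" is penalized quickly.
    """
    score = 0.0
    for age, (value, cooperated) in enumerate(reversed(transactions)):
        weight = decay ** age  # the most recent transaction has age 0
        score += weight * (value if cooperated else -penalty * value)
    return score
```

Under this scoring, defecting on one $100 transaction yields a far lower score than defecting on a few $1 transactions, and a recent defection hurts more than an old one.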
2.4.2 Outputs
In the end, the computed reputation rating may be a binary value (trusted or un-
trusted), a scaled integer (e.g. 1 to 10), or on a continuous scale (e.g. [0,1]). The
choice will be application dependent, although a binary value would likely be insuffi-
cient in a P2P environment where all peers are untrusted and we want to rank peers
based on how reliable they are likely to be.
Both scenarios detailed above imply a single scalar value is obtained for each
candidate and is compared either against other candidates’ ratings or against a trust
threshold determined by the transaction. However, it is useful to maintain a peer’s
reputation as multiple component scores. Applying different functions to the scores
allows a peer to calculate a rating best suited for the given situation. Many proposed
systems suggest maintaining multiple statistics about each peer. For example, keeping
separate ratings on a peer's likelihood to defect on a transaction and its likelihood to
recommend malicious peers helps mitigate the effects of front peers. The TRELLIS
system [53] keeps separate ratings for the likelihood a peer cooperates on a transaction
(referred to as its “reliability”) and the accuracy of its opinions or recommendations
(its “credibility”). Reliability would correspond to the reputation score as discussed
here, while the credibility score would be used for weighing information sources, as
discussed in Section 2.3.2. Guha et al. [59] suggest maintaining separate scores for
trust and distrust.
2.4.3 Peer Selection
Once an agent has computed reputation ratings for the peers interested in transacting
with it, it must decide which, if any, to choose. If there is only one peer, and the
question is whether to trust it with the offered transaction, the agent may decide based
on whether the peer’s reputation rating is above or below a set selection threshold [91].
If multiple peers are offering the same resource, the agent would likely go with the
peer with the highest reputation rating. However, even with many peers available,
an agent may decide to refuse all their transaction requests if all their reputations
lie below the selection threshold. It may not be uncommon in certain systems, such
as document-sharing systems, for all peers responding to a rare document request
to be malicious. Malicious peers disseminating inauthentic or virus-infected files
can reply to any request, while well-behaved peers will only reply if they have the
queried document. A selection threshold is necessary to protect against malicious
spam responses (see Sec. 5.6.1).
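Both the ranking behavior and the threshold safeguard can be sketched in a few lines (a hypothetical helper, not a specific system's interface):

```python
def select_provider(candidates, threshold):
    """Rank responders by reputation and pick the best one above the threshold.

    candidates: list of (peer_id, reputation) pairs. Returns the chosen
    peer_id, or None if every responder falls below the selection threshold
    (e.g. when all responses to a rare-document query come from malicious
    spammers).
    """
    eligible = [(rep, peer) for peer, rep in candidates if rep >= threshold]
    if not eligible:
        return None  # refuse all offers rather than risk a bad transaction
    return max(eligible)[1]
```

If the chosen provider defects, the agent can remove it from the candidate list and call the helper again to try the next highest ranked peer.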
2.5 Taking Action
In addition to guiding decisions on selecting transactional partners, reputation sys-
tems can be used to motivate peers to positively contribute to the network and/or
punish adversaries who try to disrupt the system.
2.5.1 Incentives
Mechanisms used to encourage cooperation in the system are referred to as incentive
schemes. They are most effective at combating selfishness because they offset the cost
of contribution with some benefit. However, incentive schemes can also mitigate some
maliciousness if access to system services requires that an adversary first provide good
resources. Such a reciprocative procedure raises the cost of misbehavior.
Most suggested incentive schemes offer one of two types of incentives: improved
service or money, with service improvements further decomposing into three general
categories:
Speed Agents that contribute resources to the network may be rewarded with faster
download speeds or reduced response latency for their requests and queries.
An example of this incentive can be seen in BitTorrent [23], a common P2P
application for downloading popular files that allows downloaders to share parts
of files as they are received. Applying the principle of “tit-for-tat”, the client
application throttles upload speed to a peer based on the download speed it
is receiving from that peer. Therefore, peers that are willing to devote more
upload bandwidth are rewarded with higher download speeds.
Quality Some systems may provide content at varying levels of quality, depending
on a peer’s contribution rate. For example, a P2P streaming movie service
could provide movies at different resolutions depending on a customer’s sub-
scription plan. This approach is already used by many online video providers
(e.g. IFILM [67]).
Quantity Similar to quality, the amount of information, content, or service providers
available to a peer would be determined by the amount the peer contributes.
This approach is also used by many online services that provide a limited amount
of content for free but require payment for access to all their content. Similar
ideas have been proposed for use in P2P systems. For instance, some solutions
encourage peers to route network messages for other peers (e.g. [137, 14]).
Money Currently, peer-to-peer systems are used to share files and resources that
require little or no cost for the contributing peer to produce and distribute.
However, supporting the exchange of more valuable content will require a pay-
ment mechanism that allows an agent to pay the content creator and provider
upon acquiring it. Most of the content will likely carry a low price since the
cost of distribution is spread over the users. Therefore a lightweight micropay-
ment mechanism is needed, allowing clients to make payments of a few cents
(or fractions of cents) without incurring a larger billing fee. Several papers have
proposed low-cost micropayment mechanisms for P2P systems (e.g. [64, 140]).
2.5.2 Punishment
While incentives are very useful at discouraging selfishness, curtailing misbehavior
requires the ability to punish malicious peers. As discussed earlier, the primary
function of reputation systems is to inform agents as to which peers are likely to
defect on a transaction. Not only does adversary avoidance benefit well-behaved
peers, but it punishes malicious peers who will quickly find themselves unable to
disseminate bad resources or cheat other peers. E-commerce sites such as eBay [42]
use reputation systems not only to provide customers with information on sellers,
giving buyers a sense of security, but also to discourage misbehavior in the first place.
If the reputation system can identify actively malicious peers it may retaliate in
several ways beyond simply warning other users. Overlay network neighbors can
disconnect from the adversary, immediately ejecting it from the network. Depending
on the type of identifiers used, the adversary may be kicked from the network for a
period of time, or permanently banned. To reenter the system, the adversary would
need to acquire a new valid identifier, which may be costly or impossible.
Finally, P2P systems tied to financial institutions for monetary payments could
fine a malicious peer for each verified act of misbehavior. Of course, such a solution
should be used cautiously, as adversaries could exploit it to wreak havoc in the system
by falsely accusing well-behaved peers of misbehavior.
2.6 Miscellaneous
Other work approaches the problem of trust and reputation with novel methods
that are not easily classified in terms of the basic mechanisms described above.
We feel it is important to include some examples of such work for completeness.
2.6.1 Resource Reputation
In [29], Damiani et al. enhance their previous peer reputation protocol [25] by propos-
ing the concept of resource reputation. In addition to reporting on the peers they
interact with, users give opinions on a resource’s authenticity based on its digest, or
hashed value. When a user requests a file or resource, each responder returns the di-
gest of the file it is willing to upload. Using the reputation system protocol, the user
looks up the digests to find which one corresponds to the correct file he is interested
in, and which is reported to be fake or corrupt. The user then fetches the file from
the provider reporting the digest most likely to be authentic. The user should then
recompute the digest on the received file. If it does not match that reported by the
file provider, the user can delete the file, report the provider for attempting to cheat,
and try a different provider. This technique complements the process of maintaining
peer reputations, which is still necessary in situations where the resource is rare and
no other peers have encountered it.
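The verification step can be sketched as follows; the digest function here (SHA-256) is our choice for illustration, not one mandated by the cited protocol:

```python
import hashlib

def verify_download(data, advertised_digest):
    """Check a fetched file against the digest its provider advertised.

    Recomputes the hash locally over the received bytes; a mismatch means
    the provider attempted to cheat and should be reported, and the user
    can try a different provider.
    """
    return hashlib.sha256(data).hexdigest() == advertised_digest
```

Because the digest is recomputed over the actual bytes received, a provider cannot advertise the digest of the authentic file and then deliver a corrupted one without being detected.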
2.6.2 Social Networks
Peer-to-peer reputation system research is conducted under the assumption that all
peers in the network are unknown and untrusted. However, in the real world this
is not likely to be the case. A user may know that some of his friends also use
the same peer-to-peer network and could benefit from connecting to them directly.
The use of a priori trust relationships was touched upon earlier in this chapter
(Sec. 2.3.2) as a source
of reputation information. However, existing social networks can be leveraged by
peer-to-peer systems in other ways. In Chapter 6 we present SPROUT, a protocol
for using social network access to locate and connect with friends in P2P networks in
order to improve message routing reliability. Using a social network trust model, we
show how such a protocol can be expected to improve performance.
2.7 Conclusion
Developing an implementable reputation system is an art involving many separate
design problems and choices. A reputation system is generally composed of three basic
functions: gathering behavioral information, scoring and ranking peers, and rewarding
or punishing peers. In turn, each component requires a combination of mechanisms to
function; each mechanism providing its own set of conflicting properties. We believe
a proper dissection of the overall design problem will allow researchers to develop
efficient solutions to each separate part without losing sight of the overall goal.
Chapter 3
Quantifying Agent Strategies Under Reputation
While most reputation system work has focused on developing specific protocols and
implementation designs that are tested through simulations, we believe much could be
learned through high-level theoretical analysis. In this chapter, we explore reputation
in online trade using a microeconomic model, primarily concentrating on individual
transactions between a small number of buyers and sellers. This chapter explores
the application of game theory to the study of reputation. Then, Chapter 4 takes a
macroeconomic approach, expanding the model to predicting typical user behavior in
highly populated systems.
This chapter concentrates only on selfish peers that seek to maximize their profit,
regardless of the harm to other parties. We do not discuss malicious agents that
gain additional utility from the act of harming other peers, though many reputation
systems are designed specifically to root out such agents [73, 93].
For this chapter, we are specifically interested in agents that have engaged in a
number of trades and therefore have accumulated a behavioral history. We ignore
the issue of bootstrapping reputation for new agents while preventing whitewashing.
We suggest a “stranger adaptive” technique similar to that proposed in [45] would be
effective. That work is further discussed in Section 3.5. We return to the issue of
bootstrapping in Chapter 5. Also, we do not address how the behavioral history is
collected. We simply assume that a perfect history is available to all agents, allowing
us to focus on agent strategies rather than on specific mechanisms for gathering
transaction information. Chapter 5 offers an examination of more realistic history
maintenance.
We begin in Section 3.1 by proposing a simple economic game that captures the
mechanics of transactions between a buyer and a seller. Section 3.2 describes the pos-
sible outcomes of each transaction and states the social optimum. In Section 3.3, we
discuss expected player response and outcome when buyers have no knowledge about
the sellers as well as when they have perfect knowledge of how a seller will respond.
While simple, this exercise introduces the model and the analysis techniques, and will
provide insight when we look at reputation in Section 3.4. Assuming a perfect repu-
tation system, we show that a Nash equilibrium is reached when players predominantly
cooperate. Sections 3.5 and 3.6 discuss related and future work. Finally, we conclude
in Section 3.7.
3.1 Definitions and Dimensions
This section defines a game that provides a simplified model of a generic trading
system. Next, we describe three dimensions which we vary to compose the specific
game scenarios we are interested in analyzing.
3.1.1 Game Setup and Rules
The players in our system are buyers and sellers.
• A seller can provide 1 unit of goods each turn, which we refer to as a bundle.
This bundle may be split by the seller between good resources, denoted by G,
and bad resources, denoted by B. Let 0 ≤ g ≤ 1 denote the fraction of the
bundle made up of good resources. For example, the bundle [3/4 G : 1/4 B]
corresponds to g = 3/4.

Table 3.1: Parameter descriptions with sample values

    Param.   Description                                  Value
    v        Valuation of 1G of goods to a buyer          3
    c        Seller's production cost of 1G of goods      1
    p        Price paid for a bundle                      2 (for FP)
    g        Fraction of bundle that is good              N/A
• Each unit of good resources costs a seller c to supply and has a valuation of v
to the buyer. Assume v > c. If not, there would be no price at which both the
seller and buyer could profit from a transaction and so no transactions would
occur.
• Each unit of bad resources costs a seller $0 to supply and has a valuation of $0
to the buyer.
• All sellers have the same production costs and all buyers have the same valua-
tion.
• A buyer can purchase at most one bundle per turn, but may choose not to
purchase any.
• The buyer always pays the seller before receiving the bundle. Consequently,
the buyer can never cheat a seller, only vice versa. This assumption reduces
the complexity of case analysis and mirrors most transactions, where payment
is verified before goods are received and their quality evaluated.
The parameters are listed with descriptions in Table 3.1, along with default values
used in concrete examples throughout the chapter.
As with most economic games, our interest will be to study how various strategies
affect the utility of each player in the game. Therefore, all values given are in units
of utility. We will use $ as the symbol for units of utility. Each player is solely
motivated to increase his own utility. When a buyer purchases goods from a seller,
we are interested in the change in utility for each participant of the transaction. We
refer to this change in utility as the profit (positive or negative) of each player. We
define social profit to be the sum of all the players’ profits. We consider the optimal
utilitarian strategy to be one that maximizes social profit.
Our investigation breaks down the range of options in three dimensions: knowl-
edge, players, and pricing. The following describes each dimension as well as the
scenarios we consider relevant.
3.1.2 Knowledge-space
As we wish to look at the effects of reputation information on market behavior, we
must specify what information about the seller is available to the buyer. We look at
three approaches of increasing complexity.
Zero Knowledge (0K) A buyer has no knowledge whatsoever of the transaction
history of any seller, even of sellers he himself has previously interacted with.
Perfect Knowledge (PK) A buyer knows exactly the composition of the current
bundle being offered by any seller.
Perfect History (PH) We define perfect history to mean that a buyer is aware of
the composition of every bundle each seller has previously sold but not the bun-
dle the seller is currently offering. Perfect history represents an ideal reputation
system capable of supplying the buyer with all information about any seller’s
previous actions.
3.1.3 Player-space
The number of each type of player in a scenario is determined as follows:
1B-1S The simplest player scenario we will look at is a game with one buyer and
one seller.
1B-MS In this scenario there is one buyer but many sellers competing for the buyer’s
attention and money.
MB-1S Conversely, there may be many buyers competing to purchase from only one
seller.
MB*MS After studying the previous three simpler scenarios we will consider more
complex player scenarios with multiple buyers and sellers. The relative number
of each will be indicated by the appropriate sign (i.e. =, <, or >) in place
of “*”. In most situations each of these cases reduces to one of the three
simpler scenarios, depending on relative population size.
When the number of buyers and/or sellers does not matter we will use asterisk
notation (e.g. *B-*S). We will refer to single buyers as B and single sellers as S.
When there may be multiple buyers and/or sellers we will use {B} to signify the set
of all buyers and {S} to signify the set of all sellers.
3.1.4 Price-space
The two pricing options we consider are:
Fixed price (FP) The system sets a constant price for each bundle. The seller may
vary the content of the bundle and the buyer may choose to buy a bundle or
not, but the price does not vary.
When multiple buyers are interested in a single seller in one turn, we assume
the buyers are randomly ordered. The first buyer chooses from all sellers and
the rest of the buyers choose from the remaining sellers, in order. This ordering
represents a real world phenomenon where an implicit ordering is obtained as
buyers compete for items offered on a “first come, first served” (FCFS) basis.
Variable price (VP) Each buyer bids on a bundle offered by the seller. The seller
accepts the highest bid, which determines the price the buyer must pay the
seller. In the case of a tie, the seller randomly chooses.
We do not concern ourselves with the specific mechanism of the auction, but
for simplicity assume an ascending auction or Vickrey auction [131]. Since all
buyers have the same valuation for goods and the same knowledge about the
seller, we expect all buyers to bid the same amount. Therefore, the second
highest bid will equal the highest bid in a Vickrey auction.
Because auctioning bundles does not make sense when there is only one buyer
we will ignore scenarios involving one buyer (i.e. 1B-1S/VP or 1B-MS/VP).
The variable p will denote the price paid for a bundle in either price scenario.
In FP, p denotes the fixed price set by the market, while in VP, p denotes the bid
accepted for the bundle.
3.1.5 eBay Scenario
To help in illustrating the implications of the model, we will at times use examples
within the framework of an online shopping site such as eBay [42]. While mostly
known for its variable priced auctions, many items on eBay also have an associated
fixed price allowing a bidder to purchase the item immediately for a specified amount.
Some items are offered on a solely fixed price basis. Therefore, eBay is an excellent
scenario in which to discuss the various aspects of the model across price and player
space. For example, when there are more interested buyers than items offered by
a particular seller at a fixed price, the order in which buyers purchase the items is
determined by when each clicked the “Buy Now” button; first come, first served.
Table 3.2: General payoff matrix

    Bundle(S)       Buyer     Seller    Social Profit
    [1G : 0B]       v − p     p − c     v − c
    [0G : 1B]       −p        p         0
    [g : (1−g)]     vg − p    p − cg    (v − c)g
Though eBay covers the spectrum of player and price-space, we specially focus on
the two most common scenarios: MB-1S/VP representing auctions and 1B-MS/FP
representing the sale of fixed-price commodities.
3.2 Strategy Independent Analysis
This section focuses on strategy-independent properties of the model. First we discuss
the payoffs each player receives from a single transaction with different bundles, as
well as the social profit. Given that, we derive the socially optimal bundle.
3.2.1 Single Transaction Payoff
For a transaction the buyer’s payoff equals the valuation of the good component of
the bundle minus the price paid: vg − p. The seller receives the price minus the cost
of producing the good component: p − cg. Adding the two gives the social profit of
(v− c)g. These expressions are summarized in Table 3.2 for easy reference, including
the two extreme bundles, 1G and 1B.
These expressions hold regardless of the strategy employed by players, the number
of players, or the information available to each player. Instead, these factors affect:
the bundle chosen by each seller, whether a buyer agrees to buy a bundle, and the
price offered by the buyers in the variable-priced scenario.
To illustrate, the following examples assume a bundle valuation of v = $3, a
production cost of c = $1, and a fixed price of p = $2 (listed in Table 3.1). The payoff
matrix for different sample bundle distributions for these specific parameter values is
given in Table 3.3. The last row gives the payoffs as a function of g, the fraction of
the bundle that is good.
Table 3.3: Payoff matrix for fixed $2 priced goods with valuation $3 and cost $1

    Bundle(S)         Buyer     Seller    Social Profit
    [1G : 0B]         1         1         2
    [1/2G : 1/2B]     -0.5      1.5       1
    [0G : 1B]         -2        2         0
    [g : (1−g)]       3g − 2    2 − g     2g
The payoff to buyers is the value of goods acquired minus the price. The payoff
to the seller is the amount paid minus the cost of producing the bundle of goods. For
example, consider the second row in Table 3.3 where a buyer purchases a bundle that
is half good resources and half bad resources. The buyer gains 1/2 · $3 in utility but
pays $2, for a total loss of $0.5. It cost the seller $0.5 to produce the bundle
(specifically the 1/2G) and it received $2 in payment, for a total gain of $1.5.
Therefore, the total
increase in utility, or social profit, from the transaction was $1.
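These payoff expressions are easy to check mechanically; the sketch below uses the sample parameter values from Table 3.1 as defaults:

```python
def payoffs(g, v=3.0, c=1.0, p=2.0):
    """Single-transaction payoffs from Table 3.2 (defaults are the sample
    values of Table 3.1: v = $3, c = $1, p = $2).

    Returns (buyer, seller, social) profit for a bundle whose good
    fraction is g.
    """
    buyer = v * g - p    # value of the good component minus the price paid
    seller = p - c * g   # price received minus the cost of the good component
    return buyer, seller, buyer + seller
```

For example, payoffs(0.5) reproduces the second row of Table 3.3: a buyer loss of $0.5, a seller gain of $1.5, and a social profit of $1.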
Remember, the buyer may always decline the transaction resulting in $0 profit for
both parties. If the seller is allowed to only produce [1G : 0B] or [0G : 1B] bundles,
this game resembles the one-sided prisoner's dilemma [111], where it is in one player's
interest to defect when the other cooperates, while the other player wants to strictly
cooperate.
3.2.2 Social Optimum
Our objective function is to maximize social profit, which we define as the sum of
utility gained/lost by both the buyer and the seller. From Section 3.2.1 we have the
social profit from a transaction as (v − c)g. Since v − c > 0 by definition, clearly the
social optimum results when the seller maximizes g by producing 1G.
Because social profit is independent of price or player strategy, this social optimum
holds for both fixed and variable pricing and is constant across knowledge-space and
player-space as well.
As we will see, the social optimum is an equilibrium for selfish agents in certain
scenarios. An additional advantage of this social optimum is that it does not require
the seller to know the valuation of the buyer, as long as v > c.
3.3 Selfish Analysis
Here we compare the optimal strategy previously described with the player strategies
that arise from independent selfish behavior. This section studies the 0K and PK
knowledge-space, while the following section focuses on the more interesting and
complex Perfect History. Each part begins by analyzing a one-buyer, one-seller
scenario with fixed prices
(1B-1S/FP). When applicable, variations in player-space and price-space will be dis-
cussed.
3.3.1 Zero Knowledge
Suppose 1B-1S/FP and consider the case of 0K, where the buyer has no knowledge of
the seller's current bundle or what she has offered in the past. If every transaction is
completely disconnected from all other transactions, then the seller's choice of bundle
has no effect, positive or negative, on future transactions. Each round is equivalent
to a one-shot Stackelberg game where the buyer always leads. Therefore, the seller
will offer 1B in order to maximize personal profit (p − cg, which for g = 0 gives p).
However, if the seller is expected to provide 1B, purchasing from her will result in
negative profit for the buyer (vg − p, which for g = 0 gives −p). Therefore, the buyer
will decline the transaction, resulting in $0 profit for each and thus no increase in
total utility.
Increasing the number of players or using variable pricing will not affect the fact
that it is in each seller’s interest to sell 1B if buyers are unable to distinguish between
sellers or their bundles in any way. Therefore, it is in every buyer’s interest to reject
the transaction.
3.3.2 Perfect Knowledge
Let's begin again with 1B-1S/FP. Suppose the buyer knows exactly what bundle the
seller is offering (PK). Unlike under 0K, each round is now a Stackelberg game where
the seller always leads. Given that buyer B's only choices are to purchase the offered
bundle or reject the transaction, seller S need only offer the minimal bundle that
gives B positive profit. Solving vg − p = 0 from Table 3.2 for g : (1 − g) yields a
threshold bundle of [p/v G : (v−p)/v B]. If the seller offers any bundle with more
good resources, the buyer will accept. Let S offer ε more good resources (and thus ε
fewer bad resources), where ε → 0+, to ensure a very small but positive profit for B.
This mixture results in a profit gain of vε → 0 for the buyer and p − c·p/v − cε →
p − c·p/v for the seller. The social profit is simply the profit of the seller, p − c·p/v.

Using the default values for the parameters from Table 3.1 produces a threshold
bundle of [2/3 G : 1/3 B] with a social profit of $4/3.
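As a quick sanity check on these numbers, the threshold bundle and the seller's limiting profit can be computed directly (a sketch of my own; `pk_threshold` is an invented name):

```python
def pk_threshold(v=3.0, c=1.0, p=2.0):
    """Perfect-knowledge threshold: the seller offers the minimal good
    fraction g = p/v at which the buyer breaks even; only the good
    fraction costs anything to produce."""
    g = p / v
    seller_profit = p - c * g  # the buyer nets ~0, so this is also the social profit
    return g, seller_profit

g, profit = pk_threshold()
# With the defaults this gives the bundle [2/3 G : 1/3 B] and a profit of $4/3.
```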
Next we expand our player set to include multiple buyers, then multiple sellers.
Buyer’s Market
Consider the 1B-MS/FP under PK scenario with n sellers but only one buyer. Each
turn the buyer chooses the seller with the best bundle, from which he purchases at
the fixed price. If all sellers offer the same bundle, each seller has probability 1/n of
being chosen.
Sellers can no longer offer the minimal bundle that gives a buyer a positive profit.
If they do, one seller will realize that she can increase her chance of selling her
bundle from 1/n to 1 by slightly improving her bundle above that of the rest. Quickly,
the other sellers will follow suit and improve their bundles up to or past that of the
first seller. In the end, all sellers will offer a bundle of 1G, resulting in an average
profit rate equal to the probability of being chosen times the utility gained from
selling 1G, or (1/n) · (p − c). No seller is motivated to change her bundle because
offering anything
less than the rest of the sellers guarantees they will not be chosen.
Now the Nash equilibrium equals the social optimum.
Next, consider the MB<MS scenario. Suppose there are m buyers and n sellers, with
m < n. The same equilibrium will result. After m − 1 buyers have each chosen a
seller, there will remain multiple sellers for the last buyer. For these final players the
problem degenerates to the 1B-MS situation, and so all remaining sellers must offer
1G. All previously chosen sellers must also have offered 1G: if one had not, she would
not have been chosen ahead of the remaining sellers, who are offering a better bundle.
Seller’s Market
Next, consider the MB-1S/FP/PK scenario. Let there be m buyers and one seller.
Since the seller can only sell one bundle per turn and the price of the bundle is fixed
at p, she will offer the minimal bundle that guarantees a sale. Just as in the first
case of an equal number of buyers and sellers, this bundle needs to be only slightly
better than [2/3 G : 1/3 B], resulting in the same Nash equilibrium as 1B-1S.
Now we consider the variable price scenario. Instead of randomly ordering the
buyers, thus guaranteeing that the last m − 1 will not be able to (or not want to)
purchase any goods, what if we allowed the buyers to bid for bundles?
We begin with a single seller and multiple buyers (MB-1S/VP/PK). Table 3.4
lists the payoffs to both the buyer and the seller, as well as the total social profit for
three different bundles. As expected, the social profit is the same for each bundle as
in the FP scenario.
If each buyer is free to bid any price, we can expect them to bid at or below the
bundle valuation. Assume the valuation of the bundle is vg > 0 (the seller offers a
bundle with at least some good content). A buyer B1 would like to pay as little as
possible, say $0. However, a second buyer B2 will happily offer a bit more in order
to secure winning the auction. It is then in B1's interest to raise his bid beyond that
of B2. This continues until one or both bid the actual valuation vg.
Table 3.4: Payoff matrix for variable priced p goods for default v = $3 and c = $1

  Bundle(S)          Buyer    Seller   Social Profit
  [1G : 0B]          3 - p    p - 1         2
  [1/2 G : 1/2 B]  1.5 - p  p - 0.5         1
  [0G : 1B]          0 - p      p           0
Using the MB>MS/FP scenario, we find that a similar analysis yields the same
equilibrium as for the MB-1S/FP scenario. The seller will choose a bundle so as to
limit the buyer’s profit to 0.
The MB>MS/VP scenario is not as trivial. The model specifies that a buyer can
acquire one bundle per round. With multiple sellers auctioning their bundles, should
buyers be allowed to bid on multiple concurrent auctions? One solution is to order
the sellers and conduct the auctions sequentially. A buyer who wins a bundle cannot
participate in subsequent auctions. With more buyers than sellers, we are guaranteed
to have multiple bidders for each bundle, and so the same equilibrium price as in
MB-1S is expected.
Similarly, if we allow buyers to purchase multiple bundles, and the valuation of
each bundle is not affected by the number acquired, then we would expect every buyer
to participate in each seller’s auction. Once again, this situation degenerates to the
MB-1S case.
The last scenario has auctions held in parallel, where buyers can purchase only one
bundle and therefore bid in only one auction. Here we break it down into two cases:
one with more than twice as many buyers as sellers, and one with fewer.
The first case is the simplest. Each seller's auction will be bid on by two or more
buyers, mirroring the MB-1S situation. If instead there were a seller with only one
buyer bidding on her bundle, then that buyer would have an advantage and would bid
low (less than v). However, there must then be a seller with three or more buyers
bidding for her bundle. One of those buyers would see that the single buyer was
bidding less than v, move his bid over to the single-buyer seller, and escalate the bid.
Now every seller has multiple buyers bidding.
In the second case, there are fewer than two buyers per seller. Some sellers will
have only one buyer bidding for their bundles.
To summarize, the 0K results indicate the need for some information about a
seller's behavior if any trades are to happen. Even with perfect knowledge, the seller
will not necessarily act in the best interest of the buyer. However, in many scenarios
the seller has incentive to offer the best possible bundle. While this is obvious in
situations where multiple sellers are competing for one buyer's attention (and money),
it also holds when multiple buyers are competing for one seller's item in an auction
scenario.
3.4 Perfect History
We begin by proposing very simple strategies for both buyers and sellers, then
incrementally modify them in response to the other players' current strategies until
the players reach a Nash equilibrium.
As defined in Section 3.1, perfect history (PH) entitles all buyers to know the
transaction history of every seller. We will simplify our model to allow sellers to sell
one of two bundles: 1G or 1B. If the seller offers 1G we say the seller cooperates on
the transaction. If she offers 1B, she is defecting on the transaction. We argue that
assuming a binary bundle does not greatly weaken our model. A buyer's decision on
whether to buy, and at what price, will be based on the probability with which he
expects the seller to cooperate or defect. This probability is estimated from each
seller's history/reputation.
We assume that each seller has accrued a number of transactions in her history
consistent with the strategy she employs. We do not focus on the reputation
bootstrapping problem (when a seller has no history), which is outside the scope of
this chapter but is discussed in subsequent chapters. When necessary, we simply
assume buyers expect sellers to cooperate on the first transaction.
To simplify our initial analysis of strategies for both buyers and sellers, we begin
with buyers assuming a simple model for the behavior of each seller. Given this
assumption, a buyer will choose a strategy. If sellers then assume each buyer follows
that strategy, they will choose their own strategy. We then repeat the process until
the progression of strategies reaches a Nash equilibrium, where neither player has
incentive to change their strategy.1
The first section proposes initial strategies for both buyers and sellers. The
following section explores improved strategies under the auction scenario (MB-1S/VP),
while the final section concentrates on strategies in the fixed-price market
(1B-MS/FP) scenario.
3.4.1 Basic Reputation-based Strategies
Coin Model (CM): Each round, seller S randomly chooses whether to cooperate
or defect with probability ρS of cooperating.
This simple model mimics each seller flipping a biased coin each turn. If there are
multiple sellers in the system, each seller may have a different bias ρi, i ∈ {S}, where
{S} is the set of all sellers, whether one or more.
Buyer Strategy β1 (BS-β1): Buyer B assumes seller S follows the coin model,
estimates S's probability of cooperating, and will pay up to vρS.
Regardless of the number of buyers and sellers (*B-*S), each buyer initially
considers each seller S independently. To determine the likelihood of S cooperating on
the next transaction, B needs to know ρS. Given ρS, the estimated valuation of S's
bundle is vρS + 0·(1 − ρS) = vρS. Therefore, B will be willing to pay up to vρS
for S's bundle. Consequently, the price a seller can command is proportional to her
reputation. This intuitive result is supported by empirical findings [75].

1Note, we do not claim it is the only existing Nash equilibrium.
Although B may not know ρS, he can estimate it from the seller's transactional
history. Specifically, counting the number of transactions she has previously
cooperated on and dividing by the total number of transactions gives an unbiased
estimator for ρS. Let TS be the total set of transactions S has participated in and CS
be those transactions in which S cooperated.

ρS = |CS| / |TS|    (3.1)
To understand how ρ affects the buyer's decision, first consider 1B-1S. Buyer B
calculates ρS and is willing to purchase from S if the fixed price (FP) satisfies
p ≤ vρS. If S is auctioning the bundle (MB-1S/VP), B will offer at most vρS.

Now suppose there are multiple sellers to choose from (1B-MS). B estimates ρi for
all i ∈ {S}. Now consider the following cases.
• FP: B seeks to maximize expected profit vρi − p for fixed price p. Therefore a
single buyer (1B-MS) will choose to purchase from the seller S whose ρS ≥ ρi
∀i ∈ {S}. If there are multiple buyers (MB*MS) competing for the bundles on a
first-come, first-served basis, then from among the remaining sellers with
available bundles, B will purchase from the seller with the highest ρi such that
vρi − p ≥ 0.

• VP: Variable pricing only applies to MB > MS, or to MB < MS if buyers can
purchase multiple bundles per turn. In either case, B will bid up to vρS, just
as in the single-seller scenario.
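A minimal sketch of BS-β1 follows (my own code; the helper names are invented). Each seller's cooperation probability is estimated from her public history via Eq. 3.1, and the buyer purchases from the seller maximizing expected profit vρ − p:

```python
def estimate_rho(history):
    """Unbiased estimator rho = |C_S| / |T_S| from a seller's history,
    given as a list of booleans (True = cooperated)."""
    return sum(history) / len(history)

def choose_seller(histories, v=3.0, p=2.0):
    """Return the index of the seller with the highest expected profit
    v*rho - p, or None if no seller offers non-negative expected profit."""
    best, best_profit = None, 0.0
    for i, h in enumerate(histories):
        profit = v * estimate_rho(h) - p
        if profit >= best_profit:
            best, best_profit = i, profit
    return best

histories = [
    [True, True, False, True],    # rho = 0.75, expected profit  0.25
    [True, False, False, False],  # rho = 0.25, expected profit negative
    [True, True, True, True],     # rho = 1.00, expected profit  1.00
]
# choose_seller(histories) picks the third seller (index 2)
```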
In BS-β1 the buyer(s) assumes the seller applies the coin model. Now we will look
at how the seller should respond if it assumes buyers are using BS-β1.
Seller Strategy σ1 (SS-σ1): Seller S assumes the buyer uses BS-β1. S follows the
coin model, but can choose an appropriate ρS when she enters the system. However,
S cannot vary ρS over time.
In other words, we allow S to freely choose ρS but not vary it over time (we relax
this constraint in the following sections). In the 1B-1S/FP scenario, ρS needs to be
sufficiently high that the expected valuation calculated by the buyer is greater
than or equal to the price p. Therefore, vρS ≥ p ⇒ ρS ≥ p/v. The same result holds
for MB-1S/FP.

However, in 1B-MS/FP, S expects the buyer B applying BS-β1 to choose the
seller with the highest ρS. All sellers will choose ρ = 1. If instead all sellers chose
some ρ < 1, then one seller i could unilaterally raise her ρi above that of the other
sellers and guarantee that she is chosen by B. This move would prompt the other
sellers to increase their ρ to be competitive, until all sellers are using ρ = 1. If one
seller does not follow suit, and keeps her ρ < 1, then there is no chance of B
choosing her.
Rational sellers will set ρ = 1 only if they can expect positive profits. Note that
if all sellers adhere to Seller Strategy σ1, the expected payoff every round for a
seller S with ρS = 1 is (p − c)/n, where n is the number of sellers with ρ = 1. Since
only ρ = 1 generates positive profit for the seller, we would not expect any rational
seller to choose ρ < 1.
Finally, consider the MB-1S/VP scenario. Following the same reasoning as for
MB-1S/VP under perfect knowledge, we once again find that the preferred ρS for
seller S is 1, as long as v > c.

If all sellers adhere to SS-σ1, then the buyers have no incentive to deviate
from BS-β1, resulting in a Nash equilibrium.
3.4.2 Independent Decisions for MB-1S/VP
In this section we consider only the one seller, multiple buyer, variable priced, perfect
history scenario. The work also applies to multiple sellers, but where buyers are
not restricted to purchasing at most one bundle per turn, thus allowing them to bid
in each seller’s auction. This scenario represents the type of markets we are most
48 CHAPTER 3. AGENT STRATEGIES UNDER REPUTATION
interested in, namely eBay-style auctions. Instead of insisting on a constant ρ over
all time as with SS-σ1, we allow the seller to decide whether to cooperate or defect
on each transaction separately. As we will see, a crucial factor in the seller’s strategy
is the total number of transactions the seller plans to execute.
Suppose seller S has committed n transactions, m of which were good and n − m of
which were bad. Assuming variable priced bids and buyers applying Buyer Strategy β1,
a buyer will bid up to v·m/n for the next bundle offered by the seller. Should the
seller cooperate or defect? If she cooperates, the expected bid price of the next
bundle will be v·(m+1)/(n+1). If the seller defects, she gains a one-time benefit of
c (compare 1G with 1B in Table 3.4), but the expected price of the next bundle will
be v·m/(n+1), slightly lower than if she had cooperated. Regardless of S's previous
or subsequent behavior, how many additional transactions must S perform before the
long-term damage done to her reputation by one defection outweighs the one-time
gain from that defection?

To measure the effect of a seller's decision on long-term utility, we calculate
utility over time for each case, cooperate or defect, and see after how many rounds
the values are equal.

Lemma 3.4.1 Assuming buyers follow BS-β1, a seller S that has committed n
transactions will gain more utility from defecting rather than cooperating on
transaction n + 1 if S performs fewer than k additional transactions, and less utility
if S performs more than k additional transactions, where k ≈ (n + 1/2)(e^{c/v} − 1).
Proof Suppose seller S has a history of n transactions, in m of which she cooperated.
On turn n + 1 the seller chooses either to defect or to cooperate. Let k be the number
of turns S sells bundles after she cooperates/defects on turn n + 1.

Let U(n) be S's utility after the first n turns. Let Uc(z) be the utility of S after
z > n turns, assuming S cooperated on turn n + 1. Similarly, let Ud(z) be the
utility of S after z > n turns, assuming S defected on turn n + 1. Before we
formulate Uc(z) and Ud(z) we must define some auxiliary functions.
Define the function fS(t) to return 1 if S cooperated (C) on turn t, or 0 otherwise.
Define the function FS(t) to be a nondecreasing function equal to the number of
turns S has cooperated after t turns. For example, since S cooperated m times in her
first n transactions, FS(n) = m.

FS(t) = Σ_{i=1}^{t} fS(i)    (3.2)

Define FS^{−y}(t) to be a nondecreasing function equal to the number of turns S has
cooperated after t turns, excluding turn y:

FS^{−y}(t) = Σ_{i=1, i≠y}^{t} fS(i)    (3.3)

Basically, for any t the value of FS^{−y}(t) is independent of how S acted on turn y.
Expressed mathematically,

∀t, y  {FS^{−y}(t) | fS(y) = 1} = {FS^{−y}(t) | fS(y) = 0}    (3.4)

Of specific interest to our problem is the substitution y = n + 1:

∀t  {FS^{−(n+1)}(t) | fS(n+1) = 1} = {FS^{−(n+1)}(t) | fS(n+1) = 0}    (3.5)
The function FS^{−(n+1)}(t) allows us to express the fact that the seller behaves
consistently after turn n + 1, whether she defects or cooperates on that turn. As we
are dealing with only one seller, we will drop the subscript from here on.

As stated above, buyers follow the bid model described by BS-β1; therefore each
turn S is paid the fraction of past transactions she has cooperated on, times the
value of cooperation, v.

The following equations express the seller's utility k turns after the
cooperate/defect choice:
Uc(n+1+k) = U(n) + (v·m/n − c) + Σ_{i=1}^{k} [ v·(F^{−(n+1)}(n+i) + 1)/(n+i) − f(n+1+i)·c ]    (3.6)

Ud(n+1+k) = U(n) + v·m/n + Σ_{i=1}^{k} [ v·F^{−(n+1)}(n+i)/(n+i) − f(n+1+i)·c ]    (3.7)
Notice in Equation 3.6 the additional 1 in the numerator of the summation fraction,
indicating that S cooperated on turn n + 1. Setting the two utility equations equal
to each other and solving for k (the full derivation is presented in Appendix A):

Uc(n+1+k) = Ud(n+1+k)    (3.8)

k ≈ (n + 1/2)(e^{c/v} − 1)    (3.9)
Using our default parameter values (v = 3 and c = 1) results in k ≈ 0.40n + 0.2,
which means a seller that has accumulated a history of 10 transactions profits more
from cooperating on the next sale than from defecting if she plans to participate in
5 or more additional transactions.
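The break-even point from Lemma 3.4.1 is easy to evaluate numerically (an illustrative sketch of my own; `breakeven_k` is an invented name):

```python
from math import ceil, exp

def breakeven_k(n, v=3.0, c=1.0):
    """Approximate number of additional transactions after which the
    long-term reputation loss from one defection outweighs its one-time
    gain (Lemma 3.4.1): k ~ (n + 1/2)(e^{c/v} - 1)."""
    return (n + 0.5) * (exp(c / v) - 1)

k = breakeven_k(10)   # ~4.15 with the defaults v = 3, c = 1,
# so a seller with 10 past transactions should cooperate if she plans
# ceil(k) = 5 or more additional sales
```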
Figure 3.1(a) shows the linear relation between n and k for three different values
of v/c. For example, for n = 40 and v/c = 2, k = 13.3; therefore, given that a buyer's
valuation of a good bundle is twice the cost of producing the bundle, a seller with
a history of 40 sales (good or bad) will profit less from defecting than from
cooperating on the next sale if she sells 14 or more additional bundles. Interestingly,
k does not depend on m or f(x), only on n. This means that the seller's decisions to
cooperate or defect on past or future transactions have no impact on whether she
should cooperate or defect on the current turn; only the quantity of past
transactions matters.
The other factor affecting k, in addition to the length of a seller's history (n), is
the cost and valuation of goods. More specifically, as valuation increases with
respect to cost, the optimal fraction of total transactions to defect on decreases.
The ratio of cost to valuation is illustrated in Figure 3.1(b) for three values of n.

Figure 3.1: Number of transactions k until the gain from a single defection equals
the loss from lowered reputation. (a) As a function of the seller's history n, for
three values of v/c. (b) As a function of the ratio of valuation to cost v/c, for
three values of n.
Intuitively, as the difference between cost and valuation shrinks, the potential for
profit goes down. For instance, if the valuation equals the cost plus a small δ, then
the highest price buyers will be willing to pay is the cost of the bundle plus δ. If
the profit a seller can make from the sale of a good bundle is only a fraction of the
cost, then the utility saved by skipping the cost of one bundle outweighs the profit
lost on many good bundles. This is represented by the sharp rise in k as v/c
approaches 1 in the figure. As the cost of producing a good bundle becomes a smaller
fraction of the valuation, and thus of the bid price the seller can command for a
bundle, the decrease in bid prices due to lower reputation quickly overtakes the
one-time gain from defection. As v/c approaches ∞, k converges to 0.
This analysis suggests a new seller strategy for the MB-1S/VP/PH scenario:
Seller Strategy σ2 (SS-σ2): Seller S assumes the buyer uses BS-β1. Suppose S
knows beforehand how many total bundles she wants to sell, Z, and the cost and
valuation of bundles. S will maximize her utility by cooperating on the first
⌈(Z − 1/2)·e^{−c/v} − 1/2⌉ transactions and then defecting on the rest.
If S knows the total number of bundles she will auction over her lifetime in the
system (call this Z), S can maximize her profit by cooperating for some number of
initial transactions and then, at a certain point, switching and defecting on the
rest. Lemma 3.4.1 gives, for a given number of completed transactions, how many more
transactions must be completed for the one-time gain from defecting to equal the
long-term loss due to a lower reputation. If a seller defects and performs fewer than
k additional transactions, the defection was to her benefit. If S performs more than
k, then she has less utility than had she cooperated. Therefore, ideally S's strategy
is to cooperate on all sales for a number of turns, then defect on the rest. Utility
is maximized when the number of transactions in the cooperating phase, n, and the
number in the defecting phase, k + 1, are related by Lemma 3.4.1.
Below we prove that SS-σ2 is optimal for a seller participating in a predetermined
number of transactions under the scenario MB-1S/VP/PH where the buyers are using
BS-β1.

Definition Let a transaction schedule of length Z be a permutation of exactly Z
cooperations and defections. Let Ξ^Z_x be the set of all possible transaction
schedules with x cooperations and Z − x defections. For example, (C C D D C D C) ∈ Ξ^7_4.
Definition The utility of a transaction schedule T, U(T), is the total utility gained
or lost by a seller who commits exactly Z transactions and cooperates or defects
in the order specified by T, assuming MB-1S/VP with buyers using strategy BS-β1.
U(T) for any schedule T is equal to the sum of the payments received for each bundle
minus the sum of the costs of producing good bundles. The total cost for a
transaction schedule T ∈ Ξ^Z_x is x·c (the number of cooperations times the cost of
each). The payment received by a seller S for each bundle is equal to the buyers'
valuation of a good bundle times ρS which, for the ith bundle, is the number of
cooperations in the first i − 1 turns divided by i − 1. For the first bundle we
assume ρS = 1 since, as stated earlier, buyers always trust new sellers on their
first bundle. This assumption only affects the payment on the first bundle and is
the same for all schedules. Mathematically,
∀T ∈ Ξ^Z_x,  U(T) = v + Σ_{i=2}^{Z} v·F_T(i−1)/(i−1) − x·c    (3.10)

where the first term v is the payment for the first bundle, the summation gives the
payments for the remaining bundles, and x·c is the total cost. Note, the subscript
in F_T(i−1) refers to the transaction schedule: we define F_T(i−1) as the number of
cooperations in the first i − 1 terms of transaction schedule T.
Given that a seller makes Z transactions with 0 ≤ x ≤ Z cooperations and Z − x
defections, we will show that

(i) a utility optimal transaction schedule consists of all x cooperations first,
then all Z − x defections, and

(ii) for such a transaction schedule the optimal number of cooperations is
x = ⌈(Z − 1/2)·e^{−c/v} − 1/2⌉.
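Claim (i) can be checked by brute force for small Z. The sketch below (my own code, not from the dissertation) implements the schedule utility of Eq. 3.10 and confirms that, among all schedules with Z = 7 and x = 4, the segregated schedule (C C C C D D D) is optimal:

```python
from itertools import permutations

def schedule_utility(T, v=3.0, c=1.0):
    """Utility of a transaction schedule T (True = cooperate) per Eq. 3.10:
    the first bundle sells at v (new sellers are trusted), bundle i > 1
    sells at v * F_T(i-1)/(i-1), and each cooperation costs c."""
    utility = v          # payment for the first bundle
    coops = 0            # running count F_T(i)
    for i, action in enumerate(T):
        if i > 0:
            utility += v * coops / i   # payment for bundle i + 1
        if action:
            coops += 1
            utility -= c               # cost of producing a good bundle
    return utility

Z, x = 7, 4
schedules = set(permutations([True] * x + [False] * (Z - x)))
best = max(schedules, key=schedule_utility)
# best == (True, True, True, True, False, False, False): all cooperations first
```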
Theorem 3.4.2 Assuming that buyers use strategy BS-β1, the utility optimal
transaction schedule of length Z with x cooperations and Z − x defections consists
of executing all x cooperations first, followed by all Z − x defections. We refer to
such a schedule as a segregated schedule.
Proof by contradiction Let T ∈ Ξ^Z_x be an optimal transaction schedule such that
at least one defection D appears before at least one cooperation C in the schedule.
Let i be the index of the first D in T and j be the index of the last C. By
definition i < j. Construct transaction schedule T′ by swapping the D at position i
with the C at position j. By the optimality of T, U(T) ≥ U(T′). Represent each
utility using Eq. 3.10.
U(T) ≥ U(T′)    (3.11)

v + Σ_{k=2}^{Z} v·F_T(k−1)/(k−1) − x·c ≥ v + Σ_{k=2}^{Z} v·F_{T′}(k−1)/(k−1) − x·c    (3.12)

Notice both schedules have the same total cost, due to having the same total number
of cooperations (x). Both also have the same initial payment. Cancelling the initial
payment, the x·c terms, and the common factor v leaves

Σ_{k=2}^{Z} F_T(k−1)/(k−1) ≥ Σ_{k=2}^{Z} F_{T′}(k−1)/(k−1)    (3.13)

Because only the terms at positions i and j in T were swapped to form T′, we have
F_T(k) = F_{T′}(k) for all k < i and all k ≥ j, so the inequality reduces to

Σ_{k=i+1}^{j} F_T(k−1)/(k−1) ≥ Σ_{k=i+1}^{j} F_{T′}(k−1)/(k−1)    (3.14)

However, T′ has a C in position i where T has a D, while all other positions less
than j are the same. Therefore, by the definition of F_T(k), F_T(k) = F_{T′}(k) − 1
for all i ≤ k < j. This contradicts Eq. 3.14, which implies that there exists some k,
i ≤ k < j, such that F_T(k) ≥ F_{T′}(k). Therefore, a utility optimal transaction
schedule cannot have a defection appear in the sequence before a cooperation.
Intuitively, the benefit from defecting is a one-time savings on cost, while the
benefit of cooperation is an improved reputation, which in turn increases the
expected payment for each future bundle. Therefore, executing a set number of
cooperations before any defections maximizes the benefit gained from those
cooperations.

Theorem 3.4.2 implies that once a seller has decided it is in her interest to defect
once, it will be in her interest to defect every time until she exits the system. Next
we check whether there is always a number of cooperations that maximizes the
utility of a segregated schedule of length Z.
First, we need an expression for the utility generated by a segregated schedule.
Definition Let Useg(Z, x) be the utility of a segregated transaction schedule of
length Z with x cooperations followed by Z − x defections. If we assume an
MB-1S/VP scenario with buyers using BS-β1, Useg(Z, x) can be expressed as

Useg(Z, x) = (v − c)·x + Σ_{i=x}^{Z−1} v·x/i    (3.15)

where (v − c)·x is the utility from the x cooperations, the i = x term of the
summation (equal to v) is the payment for the first defection, and the remaining
terms are the payments for the other defections. Note, as in Eq. 3.10, we are
assuming buyers always expect the seller to cooperate on the first transaction. This
assumption simplifies our derivations and analysis but does not affect our results.
As we will see, for Z ≥ 2 the seller should always cooperate on the first transaction.
Theorem 3.4.3 For a given value of Z, the utility function for a segregated
transaction schedule (given by Equation 3.15) has at most one global maximum for
valid values 0 < x ≤ Z.

Proof The formal proof of Theorem 3.4.3 is given in Appendix B. Basically, the
second derivative of Useg(Z, x) (Eq. 3.15) with respect to x (the number of
cooperations) is −2v·Σ_{k=0}^{∞} k/(x+k)^3, which is always negative between 0 and Z.
Therefore, Eq. 3.15 can have at most one maximum for any valid value of x.
Knowing now that a segregated schedule of the form (C C ... C D D ... D) with
x cooperations followed by Z − x defections has a unique optimal value of x that
maximizes Useg(Z, x) for a given Z, how can we compute it? Below, we derive
an approximate answer by approximating Eq. 3.15 with a continuous function. We
then state (Theorem 3.4.5) a tighter approximation based on Lemma 3.4.1 (proven
in Appendix A), whose full derivation is presented in Appendix C.
Theorem 3.4.4 Assuming that buyers use strategy BS-β1, the utility optimal
transaction schedule of length Z consists of approximately ⌈(Z − 1)·e^{−c/v}⌉
cooperations followed by ⌊(Z − 1)·(1 − e^{−c/v}) + 1⌋ defections.
Proof Approximate Useg(Z, x) by the continuous function U:

U = (v − c)·x + ∫_x^{Z−1} (v·x/t) dt    (3.16)

Simplifying and taking the derivative with respect to x yields

U = (v − c)·x + v·x·ln(Z − 1) − v·x·ln(x)    (3.17)

dU/dx = (v − c) + v·ln(Z − 1) − v·ln(x) − v    (3.18)
      = v·ln((Z − 1)/x) − c    (3.19)

Set dU/dx = 0 and solve for x:

v·ln((Z − 1)/x) − c = 0    (3.20)

ln((Z − 1)/x) = c/v    (3.21)

(Z − 1)/x = e^{c/v}    (3.22)

x = (Z − 1)·e^{−c/v}    (3.23)

Note that the second derivative of U is

d²U/dx² = −v/x    (3.24)

which is negative for all 0 < x ≤ Z. Therefore, the value of x given in Eq. 3.23
must give the unique maximum of U for all valid values of x, just as for Useg(Z, x)
(Theorem 3.4.3).
Because we are interested only in integer values for x and Z − x, the resulting
equations for the optimal number of cooperations and defections in a segregated
transaction schedule of length Z are

nC(Z) = ⌈(Z − 1)·e^{−c/v}⌉    (3.25)

nD(Z) = ⌊(Z − 1)·(1 − e^{−c/v}) + 1⌋    (3.26)

where nC(Z) and nD(Z) are the number of cooperations and defections (respectively)
in a utility optimal segregated schedule.
As stated earlier, performing the derivation on the discrete representation
of utility (presented in Appendix C) results in a better approximation, with error
bounds that approach 0 as Z approaches ∞. Restating Theorem 3.4.4 with the tighter
approximation:

Theorem 3.4.5 Assuming that buyers use strategy BS-β1, the utility optimal
transaction schedule of length Z consists of approximately ⌈(Z − 1/2)·e^{−c/v} − 1/2⌉
cooperations followed by ⌊(Z − 1/2)·(1 − e^{−c/v}) + 1⌋ defections.
We now focus solely on this improved approximation for constructing an optimal
segregated schedule of length Z:

nC(Z) = ⌈(Z − 1/2)·e^{−c/v} − 1/2⌉    (3.27)

nD(Z) = ⌊(Z − 1/2)·(1 − e^{−c/v}) + 1⌋    (3.28)
Figure 3.2 shows both nC(Z) and nD(Z) as functions of Z for v/c = 3. Notice that
both are linear in Z, though nC(Z) grows at a faster rate, so that it is always
roughly 2.5 times nD(Z). This ratio is determined by the valuation/cost ratio. For
our default values of c and v ($1 and $3, respectively) and a sufficiently large Z,
the equations indicate a seller should cooperate on roughly the first 70% of her
transactions and defect on the rest. For example, we see that at Z = 40,
nC(Z) = 28 and nD(Z) = 12; 28/40 = 70%.
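These closed forms can be verified against a direct search over Eq. 3.15 (a numerical sketch of my own; the function names are invented):

```python
from math import ceil, exp

def u_seg(Z, x, v=3.0, c=1.0):
    """Utility of a segregated schedule (Eq. 3.15): x cooperations at
    profit v - c each, then Z - x defections paid at v * x / i."""
    return (v - c) * x + sum(v * x / i for i in range(x, Z))

def n_coop(Z, v=3.0, c=1.0):
    """Approximately optimal number of cooperations per Eq. 3.27."""
    return ceil((Z - 0.5) * exp(-c / v) - 0.5)

Z = 40
best_x = max(range(1, Z + 1), key=lambda x: u_seg(Z, x))
# with v = 3, c = 1: best_x == n_coop(40) == 28, i.e. cooperate on the
# first 70% of the 40 transactions and defect on the rest
```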
Figure 3.2: Optimal number of cooperations/defections as a function of total sales Z.
In deriving Lemma 3.4.1, and consequently Eqs. 3.27 and 3.28, we used a closed-form
approximation of a finite harmonic series (see Appendix A). To numerically
evaluate the approximation error we compute the following two error functions:

Definition From Eq. 3.15, let Useg(Z, nC(Z)) be the utility of the transaction
schedule with the supposedly optimal number of cooperations and defections. Define
the error functions fe+(Z) and fe−(Z) as

fe+(Z) = [Useg(Z, nC(Z)) − Useg(Z, nC(Z) + 1)] / Useg(Z, nC(Z))    (3.29)

fe−(Z) = [Useg(Z, nC(Z)) − Useg(Z, nC(Z) − 1)] / Useg(Z, nC(Z))    (3.30)

The functions fe+(Z) and fe−(Z) give us the relative error between the schedule we
assume to be optimal and the two closest schedules of length Z, namely those with
one more and one fewer cooperation, respectively.
Below, error analysis is applied to the tighter approximation from Theorem 3.4.5
followed by an analogous evaluation for Theorem 3.4.4.
In Figure 3.3 we plot f_e^+(Z) and −f_e^−(Z); we negate the second function to better differentiate the two functions in one graph. For large enough Z, neither curve crosses 0, indicating that indeed the schedule we believe is optimal does result in
Figure 3.3: Relative utility error between optimal schedule and ±1 C/D.
Figure 3.4: Relative utility error between optimal schedule using weak approximation and ±1 C/D.
better utility than a schedule with one more or one less cooperation, and is therefore
at least a local maximum. For small Z (Z < 5), however, nC(Z) is not necessarily
optimal. In fact, though not visible in Figure 3.3 because it lies outside of the y-range, f_e^+(Z) attains negative values for Z = 1, 2, 3, and 4. These results indicate that
nC(Z) + 1 results in better utility than nC(Z) for very small Z, which is expected
because the approximation for k from Lemma 3.4.1 is weakest for very small Z.
However, for very small Z the behavior of buyers towards unknown, untested sellers
is an important factor. Originally, we stated we wanted to assume sufficient history
in order to ignore reputation bootstrapping issues.
Calculating the same error functions using the weaker approximations from Theorem 3.4.4 reveals that f_e^−(Z) is periodically negative, regardless of how large Z gets. In Figure 3.4 the dotted curve representing −f_e^−(Z) rises slightly above 0 with a
periodicity of 7. Therefore, for one in seven values of Z, the weaker approximation
from Theorem 3.4.4 does not compute the utility optimal schedule.
The previous numerical error analysis demonstrates that the value computed by Eq. 3.27 specifies a local maximum for Useg(Z, x). Applying Theorem 3.4.3, we know the value must be a global maximum because the utility function has a unique maximum in the valid range.
Notice that if the seller plans to participate in the system selling bundles indefi-
nitely, we may set Z = ∞. In this case nC(∞) = ∞. Therefore, SS-σ2 dictates that
a seller that plans to sell goods for the foreseeable future should always cooperate.
As expected, this result is exactly the same as SS-σ1, which sets ρ = 1.
So far we have constrained the buyers to strategy BS-β1. If we remove this re-
striction, how will buyers respond to sellers using SS-σ2?
Buyer Strategy β2: Buyer B assumes seller S uses SS-σ2. Not knowing how many bundles S will sell in all (Z), B should assume S will always cooperate until S defects once. From then on, B assumes S will always defect and never purchases from S again.
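BS-β2 is a grim-trigger rule, which can be sketched in one line (our own illustration; we encode a seller's history as a string of 'C'/'D' outcomes):

```python
def will_buy_from(history: str) -> bool:
    """BS-β2: purchase from a seller only if she has never defected."""
    return "D" not in history

print(will_buy_from("CCCCC"))  # True
print(will_buy_from("CCDCC"))  # False
```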
Knowing that the optimal strategy for sellers is to cooperate for their first x
transactions and then defect on the rest, a buyer will watch for a seller’s first defection
and then refuse to purchase any more bundles from it.
If a seller assumes all buyers are using BS-β2, the seller will adopt a new strategy.
Knowing that no buyer will purchase a bundle from her once she has defected once,
and given that the seller makes a larger profit from cooperating on a transaction than
not selling anything at all, the seller will cooperate on every transaction except on
the very last one.
Seller Strategy σ3: Seller S assumes buyers use BS-β2. Given a total of Z bundles to sell, S will cooperate on the first Z − 1 bundles and defect only on the last bundle.
To this seller strategy, a buyer will respond with BS-β2, indicating an equilibrium.
Notice SS-σ3 is almost equivalent to SS-σ1, where each seller cooperates on every
transaction in order to maximize ρ.
Using the estimator ρ alone, BS-β1 is unable to distinguish whether a seller, with a
history of 15 Cs and 3 Ds, is applying SS-σ1 or SS-σ2. Obviously, a player’s reputation
score must rely not only on the number of cooperations and defections, but also on their sequence.
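The ambiguity is easy to demonstrate (a sketch of ours, not from the original text): two histories with the same counts but very different orderings receive the same score from a count-based estimator.

```python
def rho(history: str) -> float:
    """Count-based estimator in the style of Eq. 3.1:
    fraction of transactions in which the seller cooperated."""
    return history.count("C") / len(history)

end_defector = "C" * 15 + "D" * 3      # defects only at the end, as in SS-σ2
occasional   = "CCCCCDCCCCCDCCCCCD"    # same 15 Cs and 3 Ds, spread out
print(rho(end_defector) == rho(occasional))  # True: ρ cannot tell them apart
```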
3.4.3 Independent Decisions for 1B-MS/FP
We continue studying the expected player behavior when sellers decide on a per turn
basis whether to cooperate or defect. While the previous section dealt specifically with
the MB-1S/VP scenario, in this section we concentrate on the 1B-MS/FP scenario.
A rational seller’s decision whether to cooperate or defect is not fixed over time (as
in SS-σ1); it may vary as both her and other sellers’ reputations vary. For example,
suppose there are 10 sellers, S1...S10 and one buyer B that is willing to pay a fixed
price p for one bundle and follows Buyer Strategy β1. Each seller has previously sold
10 bundles, of which 5 were good and 5 bad. All else being equal, the buyer will
prefer to purchase from the seller with the best transactional record, expressed as the
fraction of transactions in which they cooperated. To the buyer who must choose one,
all ten are identical and thus all have an equal chance of being chosen, 0.1. Suppose B
chooses S1. If S1 defects, its transaction record will drop to 5/11 while the rest remain at 5/10. In the following round, having a clearly worse record will disqualify S1 from selection, lowering S1's probability of being chosen to 0 and raising the other sellers' chance to 1/9. However, if S1 had instead cooperated with B then her record would be 6/11, higher than the other sellers'. In the following round we would expect B to choose
S1 with probability 1 as she clearly has a better record than the rest. These expected
outcomes provide incentive for S1 to cooperate.
In the following round, B chooses S1 again. If S1 defects, her record falls to 6/12, equal to that of the other sellers. S1 is no longer guaranteed to be chosen and once again has a probability of 0.1. Therefore, once again, S1 is incentivized to cooperate.

In the third round, however, the situation is more interesting. If S1 is chosen and defects, her record drops to 7/13, which is still better than the rest of the sellers at 5/10.
There is no disincentive for S1 to defect. In fact, comparing 1G to 1B in Table 3.3,
S1 clearly has incentive to defect and earn more utility than cooperating.
This analysis suggests a new strategy for the seller.
Suppose there are n sellers, each with ci cooperations and di defections (ci/di not necessarily equal to cj/dj for i ≠ j). Assume a buyer ranks sellers according to their reputation score, calculated as ci/(ci + di) for seller i. Then seller i would choose to defect on a transaction if ci/(ci + di + 1) > cj/(cj + dj) for all j ≠ i. Otherwise, the seller cooperates. More generally stated,
Seller Strategy σ4: Assuming buyers use BS-β1, seller S always cooperates unless her reputation is sufficiently higher than the other sellers' that a defection still gives S a higher reputation than the rest.
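The defection test in SS-σ4 can be sketched as follows (illustrative code of ours; the helper name and the (cooperations, defections) record format are our own conventions):

```python
def can_defect_safely(i, records):
    """SS-σ4 test: seller i defects only if, even counting the defection,
    her reputation stays strictly above every other seller's.
    records[j] = (cooperations, defections) for seller j."""
    ci, di = records[i]
    rep_after_defect = ci / (ci + di + 1)
    return all(rep_after_defect > c / (c + d)
               for j, (c, d) in enumerate(records) if j != i)

# Rounds from the example above: S1 (index 0) starts at 5 C / 5 D.
print(can_defect_safely(0, [(5, 5)] + [(5, 5)] * 9))  # False: 5/11 < 5/10
print(can_defect_safely(0, [(7, 5)] + [(5, 5)] * 9))  # True: 7/13 > 5/10
```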
Now we relax the assumption that buyers strictly use BS-β1.
In the example above, we noted that S1 had incentive to defect in round 3 while
keeping her standing as the highest reputable seller. Consequently, B may be better
off ignoring S1 and choosing one of the other sellers, contrary to BS-β1. Now, the
situation is reduced to the problem of 9 equal sellers, all with incentive to cooperate.
Suppose B chooses S2 now. As before, S2 is expected to cooperate, raising her reputation to 6/11. On the fourth round, B will choose S1 again, since if she defects, S1's new reputation of 7/13 will be lower than S2's.
If sellers are expected to use SS-σ4, a buyer should then choose the seller S such that cS/(cS + dS) ≥ cj/(cj + dj) for all j, unless cS/(cS + dS + 1) > cj/(cj + dj) for all j ≠ S. In such a case, the buyer should choose the seller T such that cT/(cT + dT) ≥ cj/(cj + dj) for all j ≠ S. In other words,
Buyer Strategy β3: Assuming sellers use SS-σ4, buyer B always chooses to buy from the seller with the highest reputation whose rank (from most reputable to least reputable) would fall if she defected.
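A minimal sketch of the BS-β3 selection rule (our own illustration; ties among equally reputable sellers are broken arbitrarily by sort order):

```python
def choose_seller(records):
    """BS-β3: buy from the highest-reputation seller whose rank would
    fall if she defected; skip a leader who could defect 'for free'.
    records[j] = (cooperations, defections) for seller j."""
    rep = lambda r: r[0] / (r[0] + r[1])
    ranked = sorted(range(len(records)), key=lambda j: rep(records[j]),
                    reverse=True)
    top = ranked[0]
    c, d = records[top]
    # Would the leader still out-rank everyone after one defection?
    if all(c / (c + d + 1) > rep(records[j]) for j in ranked[1:]):
        return ranked[1]  # leader can defect safely; pick the runner-up
    return top

# With S1 at 7 C / 5 D and nine sellers at 5 C / 5 D, the buyer skips
# S1 (she could defect and keep her lead) and picks another seller.
print(choose_seller([(7, 5)] + [(5, 5)] * 9))
```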
We believe that knowing buyers are using BS-β3 will not cause sellers to deviate
from SS-σ4, and hence equilibrium is reached. We do not present a formal proof here.
3.5 Related Work
The work presented here was initially inspired by the work of Feldman et al. In [45],
they used the Evolutionary Prisoner’s Dilemma to model peer interactions in a large
population. They developed a reciprocative strategy that employs subjective shared
history and adaptive stranger policies to discourage selfish behavior and whitewash-
ing. While this previous work relies primarily on simulations to evaluate the effectiveness of their design, we apply mathematical analysis to derive agent strategies and overall system behavior.
Much work has applied game theory to the problem of selfish agents (e.g. [18,
46, 107]). [18] predicts a socially beneficial Nash equilibrium given some incentive
scheme, while [46] concentrates on minimizing whitewashing. However, most of this
research uses one-shot games to model behavior and does not address peer history or
reputation.
In both [22] and [75], user groups participated in economic games in order to
experimentally compare the market efficiency from varying amounts of transaction
history. Their results are similar to our analytic results, indicating that the more
information available about an agent, the more likely it is to cooperate.
Economists have applied game theory to market analysis and reputation for decades [80, 100, 50]. Most of this work has focused on firms competing for market share. However, the explosion in online trade among countless small transient agents demands
a reevaluation of the subject. In addition, to the best of our knowledge no previous
work studies optimal segregated transaction schedules.
3.6 Future Directions
The work presented here assumed all buyers had an equal constant valuation that
was public. One extension will be to allow buyers to have different, private bundle
valuations. This would only affect seller strategies that rely on knowing v in order to
choose the proper course of action.
Much of the analysis of this simplified model indicated equilibrium in some sce-
narios is reached when sellers only cooperate. Introducing an unavoidable error rate that results in occasional defections, regardless of the seller's intention, may require buyers and sellers to devise more interesting strategies. What if sellers could gain
a cost reduction on all bundles by accepting a higher error rate? This would mimic
retailers choosing to stock cheaper items from lower quality manufacturers.
A necessary step will be to forego our assumption of perfect history and explore
the use of uncertain history provided by imperfect reputation systems. One solution
would be to assign a probability that any given transaction is incorrectly reported or
simply omitted from a seller’s recorded history.
3.6.1 Variably-Valuated Goods
So far we have discussed situations where every good sold by a seller had an equal cost.
In real markets a seller sells goods of varying value. How does this affect reputation?
How important is the value of the transactions in a seller's history? If no weight is placed on the transaction value, a seller could accumulate a high reputation selling inexpensive goods and then defect on one large transaction.
One improvement may be to use the price of each transaction when computing
a seller’s reputation. For example, let us assume a buyer is using BS-β1 and wants
to calculate ρS for seller S. Instead of using the formula in Equation 3.1 where each
previous transaction is reduced to 0 or 1, we can use the following equation
ρS = ( Σ_{i∈CS} p(i) ) / ( Σ_{i∈TS} p(i) )    (3.31)
This estimator allows a buyer to better detect a seller who purposefully cooperates
on small transactions, but defects on very large ones, and distinguish that seller from
one who makes accidental errors that tarnish its reputation.
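The effect of this price-weighted estimator can be illustrated with a small sketch (our own code; we represent a seller's history as a list of (price, cooperated) pairs, standing in for CS and TS):

```python
def price_weighted_rho(history):
    """Price-weighted reputation in the spirit of Eq. 3.31: fraction of
    transaction value on which the seller cooperated."""
    total = sum(price for price, _ in history)
    good = sum(price for price, ok in history if ok)
    return good / total if total else 0.0

def simple_rho(history):
    """Count-based estimator: each transaction counts as 0 or 1."""
    return sum(1 for _, ok in history if ok) / len(history)

# Twenty $1 cooperations followed by one $50 defection:
hist = [(1, True)] * 20 + [(50, False)]
print(simple_rho(hist))                    # ~0.95: looks reputable
print(round(price_weighted_rho(hist), 2))  # 0.29: the big defection shows
```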
3.6.2 Malicious Sellers
Until now, the players have acted selfishly. Selfish sellers are interested only in increasing their utility by raising the price of their goods and lowering their costs. We
will now consider malicious sellers that extract an additional benefit from passing bad
content to buyers that harms them in some way. Malicious users in the real world
include propagators of virus-infected software in order to gain access to machines or
disseminators of falsified copies of documents in order to promote their agenda.
Accounting for malicious motive in the model is difficult, as it is unclear how to
represent the effects of malicious activity in terms of monetary gain or loss. One
possibility is to represent malicious activity by adding an additional payoff term to
both the buyer and the seller: a negative term, −(1−g)m, representing the lost utility
from damage to the buyer, and a positive term, (1 − g)m, for the resulting benefit
the malicious seller derives. We call the coefficient m the maliciousness factor, a new
parameter in our model that relates the amount of bad content provided by a seller
to the damage in utility inflicted on the buyer, as well as the gain in utility to the
seller for causing it. Adding the maliciousness factor of m = $1 to Payoff Table 3.3
we have Table 3.5. Updating the payoff formulas from Section 3.2.1 yields a single-transaction payoff of vg − p − m(1 − g) for buyers and p − cg + m(1 − g) for sellers. Note the social profit remains the same at (v − c)g.

Table 3.5: Payoff matrix for fixed $2-priced goods with valuation $3, cost $1, and maliciousness factor $1

Bundle(S)        Buyer    Seller   Social Profit
[1G : 0B]          1        1           2
[1/2G : 1/2B]     -1        2           1
[0G : 1B]         -3        3           0
[g : (1−g)]     4g − 3    3 − 2g        2g
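These payoff formulas can be checked against the rows of Table 3.5 (a sketch; the function name is ours, the parameter values are those of the table):

```python
def payoffs(g, p=2.0, v=3.0, c=1.0, m=1.0):
    """Single-transaction payoffs with maliciousness factor m, where g
    is the fraction of good content in the bundle."""
    buyer = v * g - p - m * (1 - g)    # 4g - 3 for these parameters
    seller = p - c * g + m * (1 - g)   # 3 - 2g
    return buyer, seller, buyer + seller  # social profit: (v - c)g = 2g

for g in (1.0, 0.5, 0.0):
    print(g, payoffs(g))  # reproduces the rows of Table 3.5
```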
Of course, we are assuming that the benefit derived by a malicious seller for selling
malicious content exactly matches the cost imposed on the buyer. In most situations
the benefit/cost ratio would be imbalanced. For example, a seller that sends a virus-
laden file to a buyer might receive some small temporary joy from this act, but the
buyer may have his hard drive erased and lose years of work in the process.
Because of the uncertainty in modeling malicious activity on a per transaction
basis, we do not employ it in this study. A more sophisticated method for accounting
for the effects of maliciousness is presented in the following chapter.
3.6.3 Costly Signaling
One technique that may complement reputation is costly signaling, especially when
little or no transactional history is available. First proposed by Zahavi [142, 143],
costly signaling is a biological mechanism whereby organisms communicate their
“quality” to potential mates by overtly expending energy or resources as a sign of
their fitness. Since then, it has been explored by many biologists and economists
(e.g. [58] [57] [123]). A real world economic example would be a store that offers free
gifts to anyone who enters in order to entice them to consider additional purchases.
In fact, advertising in general constitutes costly signaling. Applying this concept to
our game would introduce a new cost, say c′, that all sellers must pay each round, re-
gardless of whether they sell a bundle or not. While the additional cost may decrease
sellers’ profits (if not offset by a raised price) it may encourage buyers to trust new
sellers.
Once a seller has obtained a good reputation, the extra cost may be unnecessary.
After a number of successful transactions, a seller may be allowed to waive this cost.
This procedure equates to an entrance fee imposed on newcomers to the system.
Similar techniques are discussed in the following chapters.
Although costly signaling may not be applicable to every transaction scenario, it
can be used in concrete peer-to-peer applications. Costly signaling forms the basis
of effort-balancing protocols that attempt to equalize the computational resources
expended by both parties. Effort-balancing employs artificial puzzles that require
complex CPU-bound or memory-bound functions to solve, but are simple to verify [41,
2]. This technique has been suggested for various applications ranging from email
spam reduction [40], to digital preservation [87]. However, adding artificial resource
burdens in order to guarantee equal effort is not advisable for applications where
speed and efficiency are of the utmost importance.
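The solve-hard/verify-cheap asymmetry these puzzles rely on can be illustrated with a minimal hashcash-style sketch (our own example, not a protocol from this dissertation):

```python
import hashlib
from itertools import count

def solve(challenge: bytes, difficulty: int = 12) -> int:
    """CPU-bound: search for a nonce whose SHA-256 digest, taken over
    the challenge plus nonce, starts with `difficulty` zero bits."""
    for nonce in count():
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") >> (256 - difficulty) == 0:
            return nonce

def verify(challenge: bytes, nonce: int, difficulty: int = 12) -> bool:
    """Cheap: a single hash suffices to check the proof of effort."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - difficulty) == 0

nonce = solve(b"service-request-42")
print(verify(b"service-request-42", nonce))  # True
```

Solving takes on the order of 2^difficulty hash evaluations on average, while verification is always one hash.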
3.7 Conclusion
This chapter presents our initial study of buyer/seller strategies, focusing primarily
on how knowledge of past transaction history affects both buyer and seller strategy.
We proposed a simple game model for transactions with cooperating and defecting
buyers and sellers in a rich spectrum of scenarios. Beginning with basic strategies for
all players, we incrementally improved them until an equilibrium was reached.
We concentrated on the two scenarios we believe to be the most natural, buyers
competing in an auction (MB-1S/VP) and many sellers competing for buyers in a
fixed-price commodities market (1B-MS/FP). It is interesting to note that at equi-
librium players are encouraged to cooperate, realizing the social optimum. In other
words, it does not pay to cheat when reputation is involved.
Chapter 4
Modeling Reputation and Incentives in Online Trade
The previous chapter presented a game theoretic approach for analyzing user strate-
gies when each seller’s transaction history is available. However, the study was limited
to a small number of participants using simple strategies. In addition, we ignored the
initial entry effects: how should new sellers with no history be treated? In this chapter
we approach the issue of reputation in trading systems from a macroeconomic level,
focusing not on individual transactions, but whole system trends and the expected
behavior and performance of different types of peers.
There are many ways a software designer can implement an online trading system.
Each design choice involves trade-offs that are not yet completely clear. For example:
How should peers choose among trusted and untrusted service providers?
How do peers use reputation information to choose who to transact with? A system
must balance avoiding bad peers with giving new peers the opportunity to partici-
pate. Our analysis in this chapter will show that, while the selection method has little
effect on the long-term profit rates of well-behaved agents, it does influence whether
malicious peers profit in the system or not.
Should a peer’s reputation reflect the amount of cooperation or purely
the quality of their cooperation? If the trust strategy measures only the quality
of a peer’s interactions, then malicious peers can gain a high reputation from a few
small transactions and then defect on large transactions. Using two separate scalar
metrics would be pointless because we want all peers ordered relative to each other
within the given context. We will find that calculating trust solely by the proportion
of good work done does not effectively deter misbehavior.
How easy or difficult should it be for a new user to join the system?
Should users pay an entrance fee to join the system as insurance against possible
malicious behavior? How much should we trust new peers? If the initial trust is too
high, then malicious peers can profit, at least in the short run. If initial trust is too
low, good new peers will never be able to contribute and raise their reputation. Given
specific values for the system parameters (e.g. expected payoff from maliciousness)
we can calculate an initial trust so that malicious peers are not expected to profit at
all. If we charge an initial entrance fee, then we can slightly raise the initial trust
so that good peers begin generating profit faster, while malicious peers are still not
expected to generate sufficient profit to outweigh the entrance cost.
Should a user reply to queries from only trusted peers? Peers may decide
whether to respond to a service request based on the reputation of the requestor.
Tying service responses to requestor reputation improves the expected performance
for good peers when the system is highly loaded with requests.
For each of these questions, what are the implications of various solutions in terms
of fairness, profitability and vulnerability to misbehavior?
To address these, and many other, questions we have developed a mathematical model for peer behavior in a trading system that employs per-transaction payments and a reputation system. Note, we are not modeling or analyzing any specific existing mechanism.1 Instead, we strive to develop an abstract model that is simple and general enough to analyze design issues and assist system engineers.
1For an analysis of a realistic reputation system, see Chapter 5.
The following section defines terms and lists our assumptions. Section 4.2 de-
scribes our basic economic system model. In Section 4.3, we present a time-based
analysis of our model. Section 4.6 discusses variations to specific components of the
model, which we then develop into our generalized economic model in Section 4.7.
Finally, we conclude in Section 4.10.
4.1 Assumptions and Definitions
The basic unit of work in a trading system is one party acquiring a resource or
service from another party. We will refer to this as a transaction. A transaction may
or may not involve a transfer of payment in exchange for the resource or service.
To distinguish between when a peer participates in a transaction by providing the
resource (server) or acquiring it (client), we will refer to any transaction a peer serves
as a contribution and any transaction a peer requests as an acquisition. For brevity
we will refer to the goods, services, resources, etc., acquired through one transaction
as a resource.
Our system model is that of a group or network of users or nodes2 that exchange
resources with each other. From now on, we will refer to the users or nodes in the
system generically as peers.
We assume a trusted reputation system collects the results of the transactions
(whether they succeed or fail and the quality or validity of the acquired resource)
in order to calculate a reputation rating for the peers involved. A peer uses these
ratings to determine which of the other peers offering the needed content to contact.
Well-behaved peers will in general prefer to interact with “reputable” peers. Below,
we explain how these assumptions are expressed in our economic model.
2We use node to indicate a user's virtual identity in the exchange system. One user may acquire several system identifiers and thus control several nodes [38]. A node's behavior may also differ from its user's if the user's machine has been compromised (e.g. infected by a virus).
We refer to any peer who has not previously participated in a transaction, and its
identity is unknown to the reputation system, as a stranger [45]. A stranger may be
a newcomer to the system or a whitewasher, a peer who has changed its identity in
order to reenter the system with no history of its past behavior.
4.1.1 Utility
The goal of each peer in a trading system is to increase its “utility” (defined formally
below) by acquiring resources which it values more than they cost to acquire.
In many cases, the utility of an acquisition is completely subjective (e.g. the senti-
mental value of a song purchased from iTunes [8]). Likewise, the cost of contributing
is dependent on the peer. For example, a student in a college dorm may have free
high-speed Internet access, while someone else pays a monthly fee for a fraction of
that bandwidth. We make a couple of assumptions for simplicity:
• The full utility gain or cost of a transaction can be expressed in a general unit
of utility, denoted by the symbol u.
• All peers gain the same utility from an acquisition and suffer the same cost in
utility for a contribution, though some peers are capable of contributing more
than others.
4.1.2 Time
Our model characterizes how a peer’s utility changes over time while participating
in the online trading system. We discuss how a peer’s behavior in a unit of time
influences its utility and choices for the next interval. Thus, we describe our model
in the context of discrete time units. When we refer to a given variable F in two
different units of time, we use F[i] and F[j] to distinguish them. For brevity, we
leave off this notation if all time-varying parameters refer to their values at the end
of the same unit of time.
In Section 4.3, we represent the time-varying parameters with continuous-time
functions. In this case we use parenthetical notation to denote the value of a parameter at a specific time (e.g. F(t)).
4.2 Formal Model
Now, we present a general mathematical model for the behavior of peers in a peer-
to-peer system. This model illustrates the effect of both the incentive scheme and
reputation system on a peer’s decisions whether to contribute positively or not.
In naïve exchange systems, freeriders profited because the amount of services one could request from the system was decoupled from what one contributed. The following equation demonstrates how a particular peer i's profit changes over a given period of time:

P[t] = kvA[t] − kcC[t] − κ    (4.1)

where kvA[t] is the peer's income and kcC[t] + κ is its cost.
We break down the characteristics affecting each peer’s strategy in the system into
the following three parameters: utility, contributive capacity, and acquisition rate.
• Utility (U) is the total value (in utility units u) of all resources available to a
peer, including its monetary wealth. We denote a peer’s total utility when it
enters the trading system as U(0) and assume a peer does not change its utility
through factors external to the system once it joins.
Profit is the amount a peer’s utility changes in a unit of time and is denoted
by P = ∆U . Profit is the utility gained by a peer from using the system (its
income) minus the cost of participation. Factors that increase income include
resources acquired and payments, or other incentives, received. Factors that
increase cost include resources expended (e.g. bandwidth) and payments made
if purchasing resources. Because we are generally more interested in the change
in a peer’s utility from using the system, than its absolute utility, we tend to
use the term “profit” more than “utility”.
• A peer’s contributive capacity (C), in general, indicates the number of trans-
actions it can serve (contribute) in a unit of time. The contributive capacity
takes into account, for example, the rate a peer can answer queries and upload
files. For typical freeriders, C = 0 as they provide no files or services.
For now, we assume a peer’s contributive capacity remains constant in a given
unit of time. Changes to its capacity are directly initiated by the user and not
affected by the system.
For convenience, we bound C to a normalized range of [0, 1] where 1 represents
the maximum contributive capacity of any peer. In a real-world system, while any
peer can lower their C to 0, not all can raise it to 1 (e.g. bandwidth constraints).
• Acquisition Rate (A) is the number of resources a peer can acquire in one
unit of time. Similar to the contributive capacity, we bound A to a normalized
range of [0, 1]. Though not necessarily true in real systems, we assume A is
not dependent on C (or vice versa) due to resource constraints (e.g. same
bandwidth for uploads and downloads).
When dealing with a specific named peer we will subscript the above variables
with the peer’s id (e.g. Ti for peer i trust). The equations we present in this section
all deal with the effects a single peer’s factors (U , T , C, A) have on each other.
Therefore, for brevity, we will leave off the subscripted id, unless specifically referring
to interaction between two distinct peers.
Figure 4.1: Relationship between a peer's profit rate and the number of peers in the network.

In Equation 4.1, kvA represents the peer's income from resources it acquired (e.g. the number of downloads times the value of the downloads to the peer, denoted by the constant kv), while kcC is the cost suffered by the peer for contributing C amount of
resources (kc is the cost of contributing one unit of C). κ is a low fixed cost peers
pay for belonging to the network (e.g. cost of bandwidth per unit time for doing
basic routing, or subscription fee). The equation is maximized when C is zero, or no
contribution is made. Therefore, peers are encouraged to be selfish and freeride in
order to maximize their profit.
We ignore the effect the number of peers participating in the exchange system (n)
has on any single peer’s ability to acquire or contribute resources. For low values of n,
peers may have difficulty locating the resources they need. However, for large enough
n, there is sufficient resource availability that the bottleneck in acquiring resources is
the requesting peer itself. For simplicity, our model assumes the system is operating
in this saturated scenario to the right of the dashed line in Figure 4.1.
4.2.1 Incentive Schemes
Freeriding thrives in the naïve model because the amount a peer is allowed to acquire is independent of its level of contribution. To discourage freeriders,
a trading system will employ some incentive scheme so that the amount of resources
a peer can acquire is directly related to the amount of resources it contributes to the
system. An incentive scheme is a set of rules of behavior, enforced by a centralized
mechanism or a distributed protocol, that encourage peers to contribute to the trading
system in order to increase the utility they gain from the system.
For our model we abstract all incentive schemes as a policy of payments for us-
ing system services (i.e. acquiring resources). This payment is typically earned by
contributing resources to the system.
P = kvA − kpA + kpC − kcC − κ    (4.2)

where kvA is the value of acquiring, kpA the cost of acquiring, kpC the payment for contributing, and kcC the cost of contributing.
In Equation 4.2, peers are paid proportionally to their contribution (kpC). We
can think of kp as the price other peers pay for each normalized unit of contribution
expressed in units of utility (u) in the equation. kc remains the same, the cost a
peer incurs for contributing. For each acquisition, a peer pays a price of kp, but gains
an average utility of kv from it.
We expect peers to be rational and thus only acquire resources whose acquisition
will increase their utility. Thus, we assume that the utility of the resources acquired
is greater than the utility of the price paid (kv > kp) or else the peer would not have
purchased the resource through the system.
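Comparing Eqs. 4.1 and 4.2 for a freerider versus a full contributor makes the effect of the scheme concrete. The following sketch uses assumed parameter values (kv = 3, kp = 2, kc = 1, κ = 0.1, our own choices satisfying kv > kp > kc) and ignores the credit-balance constraints discussed next:

```python
def naive_profit(A, C, kv=3.0, kc=1.0, kappa=0.1):
    """Eq. 4.1: profit per unit of time in the naive system."""
    return kv * A - kc * C - kappa

def incentive_profit(A, C, kv=3.0, kp=2.0, kc=1.0, kappa=0.1):
    """Eq. 4.2: each acquisition costs kp; each unit contributed pays kp."""
    return (kv - kp) * A + (kp - kc) * C - kappa

# Naive system: the freerider (C = 0) out-earns the full contributor.
print(naive_profit(1.0, 0.0), naive_profit(1.0, 1.0))          # 2.9 1.9
# Incentive scheme: contributing strictly pays, since kp > kc.
print(incentive_profit(1.0, 0.0), incentive_profit(1.0, 1.0))  # 0.9 1.9
```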
For purposes of discussion, we will assume the incentive scheme requires each
peer to pay the resource provider for any resource it acquires using a common trading
system currency we call credits, whose generation and distribution is managed by the
system. Using a distinct currency for transactions allows us to cleanly decouple the
resource acquisition and contribution functions and apply the concept of “price” to
each transaction. We also assume a global exchange rate between credits and utility
exists, rx. For instance, if kp = 2 and rx = 10, then a peer with a contributive capacity
of 1 will earn 20 credits per unit of time, worth 2u. A peer with C = 0.5 would make
10 credits worth 1u. However, the two are not necessarily directly exchangeable (as
we shall see in Section 4.2.2).
For simplicity, we will assume that all resources in the system are priced the same
in credits. Later, we will add price variability into our model.
We see in Equation 4.2 that a peer can generate credits by contributing, and that
it can only acquire resources by spending credits. Consequently, a peer’s acquisition
rate appears to be at least loosely related to its contributive capacity, as is the goal of
the incentive scheme. In the next section, we discuss specific methods for tightening
or relaxing the relation between A and C.
4.2.2 Currency Scenarios
The amount of resources a peer can purchase is limited by how many credits it has available.3 We now introduce three scenarios representing different methods of treating
credits and payments in the system. Each scenario will impose its own additional
restrictions to the incentive model.
In the first scenario a peer can purchase resources using its full utility. We assume
peers can freely purchase credits from or sell credits back to the system using some
real-world currency. We further assume all of a peer’s utility can be converted to this
currency and used for purchasing credits in the system. Then the amount of resources
a peer can acquire is limited only by its current total utility, which can be expressed
as
U [t− 1] ≥ kpA[t] (4.3)
where U [t− 1] is a peer’s total utility at the beginning of the unit of time t. Though
the peer gains an additional kpC−kcC of utility during that time, we assume that all
purchases are initiated at the beginning of the interval, simplifying our evaluation.
However, if the system does not allow credits to be purchased nor sold directly, a
peer is limited to using only the credits it has earned from earlier contributions plus
3 For now, we assume peers cannot “borrow” credits from the system.
any additional credits it has saved previously. In the second scenario, peers can only
purchase resources with the credits they have earned and saved from contributions.
Let Si[t] be the current credits saved up by peer i from contributions at the end of
time interval t. We now have the following two equations instead.
S[t− 1] + kpC ≥ kpA[t] (4.4)
∆S = kpC − kpA[t] (4.5)
Equation 4.4 states that the amount of credits a peer pays for resources in a time unit
cannot exceed the number of credits saved up plus what it earned from cooperating
in that same unit of time. Equation 4.5 demonstrates how a peer’s credit balance
(the amount of credits saved up) changes over time.
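The two constraints can be sketched as a budget check plus a balance update; the function names and worked values below are ours:

```python
def spend_limit(S_prev, C, kp):
    """Credits available to spend this interval (the bound in Eq. 4.4):
    savings carried in plus this interval's contribution earnings."""
    return S_prev + kp * C

def next_savings(S_prev, C, A, kp):
    """Balance update from Eq. 4.5: the change in savings is kp*C - kp*A."""
    assert kp * A <= spend_limit(S_prev, C, kp), "acquisition exceeds budget"
    return S_prev + kp * C - kp * A

# Contribute a full unit, acquire half a unit, at price kp = 2:
S = next_savings(S_prev=0.0, C=1.0, A=0.5, kp=2.0)  # balance grows to 1.0
```

A peer that acquires less than it contributes accumulates savings; one that tries to acquire beyond Eq. 4.4's bound trips the assertion.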
We have presented two scenarios, depending on whether the incentive scheme
allows credits to be freely purchased or not. We refer to a system governed by Eq. 4.3
as the common currency scenario, while a system following Eq. 4.4 and 4.5 will be
called disjoint currency scenario. “Common currency” refers to the fact that credits
in the system are freely exchangeable with currency outside the system, so that it can
be thought of as a single currency. “Disjoint currency” stresses the fact that credits
can only be earned or spent within the system by providing or acquiring resources.
We still include the credit payments received or made in our utility equation because
they represent the potential utility we would gain by purchasing resources with the
credits.
A third scenario, which will be the most useful for analysis and discussion, assumes that in each time interval a peer spends exactly as many credits as it earned cooperating,
which is expressed by the following equality:
kpC[t] = kpA[t] (4.6)
The disjoint currency scenario discussed before assumes credits cannot be exchanged
to or from the system for real-world currency so credits are only useful within the
system for purchasing resources. Consequently, we expect peers to spend all their
saved credits on resources before leaving the system. Thus, for a sufficiently large
time interval, we expect Equation 4.6 holds for any rational peer in the disjoint
currency scenario. Accordingly, we call this third scenario the long-term disjoint
currency scenario. With this model a peer's credit balance does not change and can be assumed to be 0 between intervals and ignored. Because credits are worthless outside the system, peers will spend all their credits. Therefore, the system does not need to enforce this policy: peers will carry it out in their own self-interest to maximize profit.
Obviously, because kp is a constant, A = C. We can now substitute C for A in
the term for the value of acquired resources in Equation 4.2, kvA, giving us kvC.
We can now simplify Equation 4.2 for the long-term disjoint currency scenario by
cancelling out the equal terms and substituting:
P = kv(kp/kp)C − kpA + kpC − kcC − κ = kvC − kcC − κ (4.7)

(the kpA and kpC terms cancel because A = C)
Constant kv represents the total income gained for each unit contributed, either in
value of resources acquired or credits received but not spent. As long as the incentive
system guarantees that kv > kc peers are motivated to contribute more resources and
not freeride.
4.2.3 Trust
Unfortunately, malicious peers insist on distributing bad content to other peers, prof-
iting from harming the system. To capture this effect we divide the contributive
capacity into good capacity CG and bad capacity CB. The latter includes resources
devoted to disrupting the system. Of course, C = CG +CB. We incorporate the effect
of malicious peers in our profit model from Equation 4.2 in the following formula:
P = πgkvA− kpA+ kmCB + kpC − kcC − κ (4.8)
In Equation 4.8, πg represents the fraction of the nodes in the system that are well-
behaved (not malicious). We would expect this same fraction of requested transac-
tions to complete successfully. Therefore, we only gain utility from a πg fraction of
the resources we purchase. Now that malicious peers are sharing bogus resources a
fraction 1−πg of each peers’ requested transactions will be worthless. This decreases
the utility of acquired resources to πgkvA. In addition to being paid for all their con-
tributed resources, malicious peers gain additional utility from causing damage with
the bad contributions (CB). The parameter km quantifies the additional value bad
nodes gain from harming the system with their bad content. If we assume km > 0 for
malicious peers, then it is in their interest to increase CB to maximize profit, resulting
in C = CB and CG = 0. For well-behaved nodes km = 0 and C = CG.
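A quick numerical reading of Equation 4.8 shows why, absent a reputation system, a peer with km > 0 maximizes profit by making all of its contribution bad. In this sketch kp and κ are illustrative values of ours; the remaining constants follow the defaults in Table 4.1:

```python
def profit_with_malice(A, C_G, C_B, pi_g, kv, kp, kc, km, kappa):
    """Equation 4.8: only a pi_g fraction of acquisitions have value,
    and bad contribution C_B earns the malicious bonus km*C_B."""
    C = C_G + C_B
    return pi_g * kv * A - kp * A + km * C_B + kp * C - kc * C - kappa

# Same total capacity C = 1, all-good versus all-bad profile:
good = profit_with_malice(1.0, 1.0, 0.0, 0.9, 2.0, 1.5, 1.0, 2.0, 0.1)
bad = profit_with_malice(1.0, 0.0, 1.0, 0.9, 2.0, 1.5, 1.0, 2.0, 0.1)
# Without a reputation system the all-bad profile is more profitable.
```

Here the all-bad peer is still paid kp for every unit it contributes, and pockets km*C_B on top.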
Earlier, when we introduced the long-term disjoint currency scenario we saw that
an incentive scheme promotes peers to contribute if it can guarantee that kv > kc.
Now the loss of profit due to bogus resources will lower the expected income from
acquisitions by a factor of 1 − πg. Consequently, if πgkv ≯ kc good peers will be
motivated to stop contributing or leave the network altogether.
For completeness, we may want to account for any additional cost incurred by a
malicious peer for sharing any amount of good resources (CG > 0). We can subtract
an extra term kmgCG from Equation 4.8, representing any unhappiness the malicious
peer may get for contributing useful resources to the system. Because we feel that in
most situations either kmg ' 0 and/or CG ' 0, we will ignore this factor.
Note that Equation 4.8 assumes that resource providers are paid for their resources
before the purchaser is able to verify the validity or value of the resource acquired.
This is expressed by the payments received term (kpC) indicating payment for all
Figure 4.2: Representation of a reputation system's role in a trading network. Transaction observations (∆T) update peer reputations maintained in the trust vector. Reputation information is then used by peers in transactions to improve expected utility (∆U).
contributions, not just good contributions. Likewise, the payments made term (kpA)
denotes that all acquisitions were paid for, while only a fraction πg received were
good. If instead we assumed peers paid after verifying the validity of resources (or
could reliably revoke their payments), then a peer would only be paid for the good
resources it provided (kpCG) and it would only pay for the fraction of resources that
were valid (πgkpA). This difference would yield instead the following equation.
P = πgtkvA− πgtkpA+ kmCB + kpCG − kcC − κ (4.9)
We do not believe we can expect peers to be able to verify the validity of resources
before paying for them. A system of payment revocation (as we have with credit cards)
may be possible, but if the malicious peer has already spent the credits it received,
it would be difficult to exact a currency-based penalty. Therefore, we continue our
economic model based on Equation 4.8.
To combat malicious nodes we deploy reputation systems. Reputation systems
can be abstracted as two separate mechanisms (illustrated in Figure 4.2).
1. A centralized authority or distributed protocol, represented by the eye, tracks
every peer’s positive and negative contributions and modifies that peer’s trust
rating based on their past and present contributions. The structure containing
reputation information for each peer is labelled a trust vector in Fig. 4.2. We
refer to the model of this mechanism as the trust model.
2. Peers access the trust vector to fetch the ratings of peers offering the resource
they desire and take into account the expected risk of bad service (given each
provider’s reputation) when selecting the provider from which to fetch the re-
source. We refer to the representation of the second mechanism as the profit
model.
Modeling the first mechanism depends on how trust is computed, which is system-specific. We therefore need an equation for how a peer's trust rating changes based on its contributions. A peer's trust rating or reputation
may be derived in three ways: from the quantity it contributes, the quality of its
contributions, or a combination of the two. We will ignore the first method: basing reputation purely on quantity is obviously counter-productive, since a malicious peer that contributes a lot of bad work will attain a high reputation. Basing the trust
rating purely on the quality of contributions can be effective but does not completely
capture the value a peer brings to the network. For example, we may consider a peer
that contributes twice as much good work as another peer to be more reputable. In
addition, malicious peers can take advantage of the trust mechanism by providing
good resources on one or two small transactions, then defecting on a larger or more
costly transaction. Therefore, the trust equation we will study combines both quan-
tity and quality measurements. However, we will compare our proposed strategy to
a purely quality-based strategy in Section 4.6.2.
We present one particular “∆T” formula below that exhibits many useful proper-
ties. We discuss these properties in Section 4.3.1 and present a more generalized ∆T
model in Section 4.7.
To utility, contributive capacity and rate of acquisition, we add a fourth parameter
indicating a peer’s reputation:
• Trust (T ) represents the perceived reliability or reputation of a node by its
peers. Trust is quantified by the peer’s rating in the reputation system. De-
pending on the reputation system, the reputation value range may be bounded
or unbounded, but for our model we assume a peer’s trust is between 0 and 1,
with 1 meaning a peer is most reputable. A peer's initial trust rating when it enters the system will be referred to as T(0); we assume it is equal for all newcomers.
The second mechanism must be modeled by the profit equation and must take
into account the effects on utility of both using reputation when choosing resource
providers and a peer’s own reputation on its ability to contribute to the system.
We augment Eq. 4.8 to express one specific way in which peers use trust ratings
to increase their profits (mechanism 2).
P = πgtkvA− kpA+ (kmCB + kpC − kcC)T − κ (4.10)
In Equation 4.10, parameter πgt is the fraction of transactions requested from well-
behaved peers who are likely to reply correctly. Unlike πg in Equation 4.8, πgt must
take into account that a peer will choose to interact with reputable peers, decreasing
the probability of contacting malicious peers. Therefore, we expect πgt > πg. In fact,
πgt is likely to be close to 1.
In Eq. 4.10, a peer’s reputation T affects how much of its system contribution C
is accessed by other peers. Peers are more likely to purchase resources from reputable
peers. As stated earlier, a peer’s trust T is bounded between 0 and 1. To model
the role reputation plays on a peer’s ability to sell resources in Eq. 4.10 we multiply
each term relating to a peer's contribution C by T. Thus, T determines the fraction of the contribution used by other peers, thereby generating credits and,
in the case of malicious nodes, disseminating bad content. Eq. 4.10 presents one
specific relation between trust and the rate of contribution (linearly proportional).
This relation is sufficiently simple to illustrate our intuition and allow us to perform
some interesting analysis. We discuss the generalized form of this relation between
trust and profit in Section 4.7.
If we apply the long-term disjoint currency scenario, we can simplify Equation 4.10.
By applying the same reasoning used to derive Eq. 4.7 from Eq. 4.2 to Eq. 4.10 we
get
P3 = (πgtkvC + kmCB − kcC)T − κ (4.11)
where the subscript 3 indicates this equation corresponds to our third scenario, long-
term disjoint currency. This equation will be useful in our analysis of utility over time
in Section 4.3.2.
Previously, our discussion of long-term disjoint currency indicated that an incen-
tive scheme will promote cooperation if, in general, kv > kc. When we introduced
malicious peers, the dissemination of bogus resources reduced the fraction of good
resources acquired to πg. As noted earlier, well-behaved peers would only be encour-
aged to contribute if πgkv > kc, which may not be the case. Consequently, good peers
would begin leaving the network, further decreasing the probability of locating good
resources, causing the network to collapse. By introducing a reputation system we
expect the probability of acquiring a good resource, πgt, to be close to 1. If so, then
the inequality πgtkv > kc is likely to hold for all good peers in the network, assuming
kv > kc. Once again, the incentive system will encourage cooperation.
We now look at representing the first mechanism of the reputation system: computing trust based on peer behavior. As stated above, a peer's reputation rating is
determined by the quality and/or quantity of positive and negative contributions.
There are many ways by which an actual reputation system expresses a peer’s repu-
tation given their behavior. Here, we present a specific formula for updating a peer’s
trust that is intuitive, maintains T between 0 and 1, and displays characteristics ben-
eficial to reputation systems, discussed in Section 4.3.1. We evaluate other choices in
Section 4.6.2.
∆T = (rgCG(1− T )− rbCBT )T (4.12)
Equation 4.12 demonstrates how trust changes over time. Trust increases with
positive interactions (CG) and decreases with negative interactions (CB). Constants rg and rb indicate the effect each unit of positive or negative contribution, respectively, has on a peer's trust value; both constants lie between 0 and 1. For example, we would most likely want to lower a node's reputation more for each bad file uploaded than we reward it for each good file uploaded. Consequently, we may set rb to 1 and rg to 0.25. We would also expect a peer's reputation to increase less for good behavior as it becomes more reputable. Inversely, we
would want a peer’s reputation to decrease more for bad behavior as their reputation
increases. Equation 4.12 meets these requirements by multiplying the positive factor
by (1− T ) and the negative factor by T .
In addition, both the positive and negative contribution terms are multiplied by
an additional T . Weighing the terms by the peer’s trust models real world behavior
where reputable peers will be more likely to be chosen for transactions, allowing them
more opportunities to increment (or decrement) their trust in the same amount of
time. Once again, this is based on our choice of a linear relation between trust and
contributions accepted.
∆T ∝ T implies that the low (but nonzero) reputation of strangers increases very
slowly at first. If strangers with no reputation or trust have T ' 0, then they will be
unable to gain profit or trust. To correct this problem, we assume a lower bound on
the initial trust (T (0)) of τ > 0, although very small, guaranteeing T will rise over
time.
When T is small, P in Equation 4.10 will be dominated by −κ, so the costs of belonging to the P2P system outweigh its benefits. Additionally, strangers' credit income
rate will be very low, as few peers will purchase resources from them, fearing they will be cheated. To encourage transactions and thus raise their reputation, strangers may have to “discount” the price of their services. A low income rate will severely limit strangers' access to offered resources. We call this the “reputation slow-start”
phase. Few peers may have the patience to suffer this entry cost long enough to gain
sufficient trust to attain positive profit.
A reputation system will likely want trust to decay over time. If a peer earns a
high reputation, it may then stop contributing resources (C → 0) and maintain its
high reputation rating. This might be exploited by malicious peers. For example,
they may watch for reputable peers that are inactive and target them for spoofing, assuming those peers are less likely to notice their identifiers being hijacked. To
model a constant drop in trust we add a decay factor, δ, to Equation 4.12.
∆T = (rgCG(1− T )− rbCBT )T − δT 2 (4.13)
The decay factor is combined with the other negative factor introduced by mali-
cious peers doing harmful work, but is independent of CB. It is multiplied by the trust
because stale reputation is a greater problem for highly reputable peers than for low-reputation peers. The T² term ensures that peers with low trust, such as strangers, are not greatly affected by the decay, regardless of the amount of contributions. We
require that rb + δ ≤ 1, which can be shown to guarantee that 0 ≤ T ≤ 1. An-
other possible solution to constraining the value range of T would have been to use
min/max functions in our formula for ∆T . However, by not resorting to min/max
functions to bound T we can solve for T (t) in Equation 4.13 and express trust as a
continuous function over time, which will be useful in our analysis in Section 4.3.1.
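Because Equation 4.13 avoids min/max clamping, it can also be iterated directly as a discrete map; a minimal sketch (step size of one time unit is our choice, parameter defaults follow Table 4.1):

```python
def trust_step(T, C_G, C_B, rg=0.2, rb=0.99, delta=0.01):
    """One discrete time step of Equation 4.13:
    dT = (rg*C_G*(1 - T) - rb*C_B*T)*T - delta*T^2."""
    return T + (rg * C_G * (1 - T) - rb * C_B * T) * T - delta * T * T

T = 0.01  # tau: the assumed lower bound on initial trust
for _ in range(200):
    T = trust_step(T, C_G=1.0, C_B=0.0)
# T settles where the trust gain balances the decay:
# rg*C_G*(1 - T) = delta*T, i.e. T = 0.2/0.21, about 0.952
```

With rb + δ ≤ 1 the iterate stays within [0, 1], matching the bound argued in the text.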
4.3 Analysis
Using the model presented above, we study the expected behavior of peers in a trading
system which conforms to our model. We illustrate the effects varying the different
parameters yield on the reputation and wealth of individual peers.
4.3.1 Trust Over time
Given our equation for ∆T from Equation 4.13, we can compute a function for how a
peer’s trust changes over time. If we assume the length of the discrete time intervals
approaches 0, and all other parameters stay constant, we can treat Equation 4.13 as
a differential equation. Solving this differential equation gives us
T(t) = rgCG / (rgCG + rbCB + δ + Z·e^(−rgCG·t)), where Z = rgCG/T(0) − (rgCG + rbCB + δ) (4.14)
Notice that T (0) appears in the denominator of the initial condition constant Z.
As we saw earlier, we cannot have T (t′) = 0 for any t′, else ∆T = 0, making T (t) = 0
for all t > t′. Therefore, we limit T ≥ τ . We will use a default τ of 0.01.
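The closed form in Equation 4.14 is easy to evaluate directly; a sketch (function name and defaults are ours, defaults from Table 4.1):

```python
import math

def trust_t(t, C_G, C_B=0.0, T0=0.01, rg=0.2, rb=0.99, delta=0.01):
    """Equation 4.14: T(t) = a / (b + Z*exp(-a*t)), with
    a = rg*C_G, b = rg*C_G + rb*C_B + delta, Z = a/T0 - b."""
    a = rg * C_G
    b = a + rb * C_B + delta
    Z = a / T0 - b
    return a / (b + Z * math.exp(-a * t))

# Sanity checks: at t = 0 the expression collapses to T0, and for large
# t the exponential vanishes, leaving the limit a / b.
```

Plotting this function for several (C_G, C_B) pairs reproduces the curve shapes discussed around Figure 4.3.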
Using this equation, we evaluate the effects of the different parameters on a peer’s
reputation. In our analysis we constrain each parameter to a default value (given in
the first half of Table 4.1) except for the parameter(s) under study.
Figure 4.3(a) shows the progression of trust over time for a new, well-behaved node
at varying amounts of cooperation, CG. As expected, the more a peer contributes, the
faster its reputation grows. This curve illustrates the effect of reputation slow-start.
Eventually, all three curves flatten out at different values of T . The curves reach a
steady-state value when the incremental gain in trust (given the current value of T
and C) equals the drop caused by the decay factor.
Next, we look at how quickly a peer’s trust falls due to misbehavior. We begin
Table 4.1: Trust and Profit Parameters and Default Values

Param.  Description                                 Default Value
C       Contributive capacity                       1
CG      Good content contributed                    C
CB      Bad content contributed                     C − CG
rg      Factor by which CG increases trust          0.2
rb      Factor by which CB decreases trust          0.99
T(0)    Starting trust value                        τ = 0.01
δ       Decay factor                                0.01
πgt     Prob. of acquired resource being good       0.9
kv      Utility gained from acquiring 1 unit        2
kc      Utility cost of contributing 1 unit         1
km      Utility bad peers gain for harming system   2
U(0)    Initial utility of peer at time 0           0
Figure 4.3: A peer's trust rating over time. (a) For different C = CG (1.00, 0.50, 0.25); T(0) = 0.01, δ = 0.01. (b) For different CB (0.99, 0.50, 0.20); C = 1, T(0) = 1, δ = 0.01. (c) For different δ (0.005, 0.010, 0.020); C = CG = 0.01, T(0) = 1.
by assuming a peer has previously cooperated prodigiously and attained a reputation
rating of 1 (T (0) = 1). Then it “turns bad” and, while maintaining a total contributive
capacity of 1 (C = 1), introduces bad content (CB > 0). In Figure 4.3(b) we see how
quickly its reputation falls. The rate at which it decreases, as well as the final level
it reaches, are dependent on the ratio of good to bad content the peer is providing
(CG : CB).
Figure 4.3(c) demonstrates the effects of different decay rates (δ). Once again,
we assume a peer has attained a high reputation in the past (T (0) = 1) and then
significantly drops their level of contribution (C = 0.01, CB = 0). Note the longer t
value range on the x-axis. In all other experiments we use a δ of 0.01, corresponding
to the middle curve.
What we see in all three graphs is that T (t) tends to converge to a different value
depending on the parameter values used. To evaluate the long-term effects of each
parameter we take the limit of T (t) as t tends to infinity.
lim(t→∞) T(t) = T(∞) = rgCG / (rgCG + rbCB + δ) (4.15)
Interestingly, T (∞) is independent of T (0), demonstrating that, regardless of the
current value of T , it will eventually converge to a particular value if parameters do
not change. We demonstrate this in Figures 4.4(a) and 4.4(b) where we plot T (∞)
as a function of C and δ in graphs (a) and (b), respectively. The different curves in
each graph correspond to three different values of rg. CB is set to 0 so the value of rb
is inconsequential. Note that the x-axis in both graphs is in logscale.
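The limiting values follow directly from Equation 4.15; a sketch (the quoted numbers come from the formula and only approximately match values read off the curves in Figure 4.4):

```python
def trust_inf(C_G, C_B=0.0, rg=0.2, rb=0.99, delta=0.01):
    """Steady-state trust from Equation 4.15; independent of T(0)."""
    return rg * C_G / (rg * C_G + rb * C_B + delta)

high = trust_inf(1.0)    # full-capacity good peer: 0.2/0.21, ~0.952
low = trust_inf(0.01)    # low-capacity good peer: 0.002/0.012, ~0.167
```

The gap between `high` and `low` is entirely the work of the decay term δ, which dominates the denominator when rgCG is small.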
Figure 4.4(a) shows the value to which a peer’s trust converges after being in
the system for a long time and has reached a steady-state where T does not change
assuming the peer does not change its behavior. This steady-state value is the max-
imum reputation a peer can attain for a particular contributive capacity, given our
trust model in Equation 4.13. If we consider the curve for rg with default value of 0.2,
Figure 4.4: Convergence of T as t → ∞; note the logscale x-axis; CB = 0 in both. (a) T(∞) as a function of C = CG, for different rg (0.1, 0.2, 0.5); δ = 0.01. (b) T(∞) as a function of δ, for different C = CG (1, 0.1, 0.01).
we see that T (∞) rises almost linearly (in logscale) from 0.2 at C = 0.01 to 0.8 at
C = 0.1. The downward shift of the curves as rg decreases is due to the decay factor.
While trust decays a constant amount each unit of time, the amount trust increases
is determined by rgCG. Therefore, a lower rg means a lower steady-state trust value.
This can be quite low for peers with low contributive capacities. A system designer
needs to balance the desire to have a low rg in order to bias the trust model against
bad behavior, and the desire for a high rg to allow contributive peers to attain a
useful reputation quickly.
We next look at how the decay factor affects T at steady-state. In Figure 4.4(b),
we plot T (∞) versus the decay factor δ, with separate curves for C = CG at 1, 0.1,
and 0.01. The first curve indicates the maximum trust attainable by a good node
contributing at maximum capacity. For δ < 0.001, T(∞) for C = 1 is effectively 1. But for δ > 0.01, the maximum trust quickly falls. Notice that the largest difference in
T (∞) between the three curves occurs around δ = 0.01. If we want the long-term
reputations of peers in our network to reflect mainly the quality of their contributions
(good or bad) and not the quantity, then we would want to use a much smaller decay,
such as 0.0001 where all the curves reach a high value of T (∞). However, if we want
reputation to indicate the amount a peer contributes along with its quality, then
δ = 0.01 seems good.
4.3.2 Utility over Time
To conduct a similar analysis for utility over time, we need to integrate Equation 4.10,
using our Equation 4.14 for T (t) in place of T . The equation for U(t) is given in
Equation 4.16.4
U(t) = (πgtkvA − kpA − κ)t + (kmCB + kpC − kcC) · [ln((rgCG + rbCB + δ)(e^(rgCG·t) − 1)·T(0)/(rgCG) + 1)] / (rgCG + rbCB + δ) + U(0) (4.16)
To simplify Equation 4.16 for analysis, we will consider only the long-term disjoint
currency scenario. Using the formula for P3, given in Equation 4.11, we can derive
an equation for utility over time in this scenario.
U3(t) = (πgtkvC + kmCB − kcC) · [ln((rgCG + rbCB + δ)(e^(rgCG·t) − 1)·T(0)/(rgCG) + 1)] / (rgCG + rbCB + δ) − κt + U(0) (4.17)
We now use Equation 4.17 to plot the profit gained from using the system for
various parameter settings. By default we use the values listed in Table 4.1.
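Equation 4.17 is straightforward to evaluate numerically. In this sketch the defaults follow Table 4.1; κ is not listed there, so κ = 0.01 is our assumption (it is the value consistent with the malicious-peer profitability threshold T > 0.0036 discussed below):

```python
import math

def utility3(t, C_G, C_B=0.0, T0=0.0035, rg=0.2, rb=0.99, delta=0.01,
             pi_gt=0.9, kv=2.0, kc=1.0, km=2.0, kappa=0.01, U0=0.0):
    """Equation 4.17: utility over time in the long-term disjoint
    currency scenario. Defaults follow Table 4.1; kappa is assumed."""
    C = C_G + C_B
    a = rg * C_G
    b = rg * C_G + rb * C_B + delta
    log_term = math.log(b * (math.exp(a * t) - 1) * T0 / a + 1)
    return (pi_gt * kv * C + km * C_B - kc * C) * log_term / b - kappa * t + U0

# A full-capacity good peer gains utility; a mostly-bad peer loses it,
# echoing the curve shapes in Figures 4.5 and 4.6.
u_good = utility3(100.0, C_G=1.0)
u_bad = utility3(1000.0, C_G=0.01, C_B=0.99)
```

Sweeping `t` for several capacities reproduces the flat-then-linear shape discussed next.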
We begin by plotting the utility of a new well-behaved peer joining the system
using all the default parameter values. In Figure 4.5(a) we see the utility over time
for three peers with different contributive capacities. The curves all have the same shape: beginning flat,5 while the peer's reputation is building, then curving up and climbing linearly once the reputation rating has stabilized and the system is in steady-state. What distinguishes the curves is the length of time profit is flat or
4 The derivation is presented in Appendix D.1.
5 In fact, we see the utility curve for C = 0.25 initially dips below 0 before recovering.
Figure 4.5: A peer's utility U3(t) over time; initial trust T(0) = 0.01; higher is better. (a) For different C = CG (1.00, 0.50, 0.25). (b) For different C = CB (0.99, 0.50, 0.25).
negative (determined by the length of time needed for the reputation to rise), and the
slope of the final linear component (determined by the trust value at which the peer
stabilized). Obviously, peers that contribute more will have their trust rise faster
and higher, resulting in greater utility over a given period of time.
Next, in Figure 4.5(b), we observe the utility of three malicious peers, each sharing
primarily bogus resources (C = CB + CG with CG = 0.01; if CG = 0, T is trivially 0). As expected, in the long run malicious peers lose utility since their low trust
ratings prevent them from making contributions and earning credits, while they still
must pay the fixed cost κ. Interestingly, malicious peers with a high CB generate
positive utility in the short run, though a very small amount relative to the amount
well-behaved nodes can generate. But any amount of positive utility would attract
malicious users for short-term gains. In fact, the behavior we see in Figure 4.5(b),
indicates that new malicious peers make a larger profit when first joining than new
good peers. This effect would actually encourage whitewashing, not discourage it.
Why do malicious peers profit when first joining the network? If malicious peers
are generating positive utility in the short run then P must be initially positive due
to the starting value of T (0). By setting Equation 4.11 equal to 0 and solving for
T , we find that a malicious peer with CB = 1 will have positive profits as long as
Figure 4.6: A peer's utility U3(t) over time; initial trust T(0) = 0.0035. (a) For different C = CG (1.00, 0.50, 0.25). (b) For different C = CB (0.99, 0.50, 0.25).
T > 0.0036. Though the malicious peers’ ratings eventually fall below this threshold,
a T (0) of 0.01 allows them to make a small initial profit. By setting T (0) = 0.0035 we
prevent purely bad peers from gaining any positive utility. In Figure 4.6 we present
the same two graphs as in Fig. 4.5, but now with the new, lower T (0) of 0.0035. While
the well-behaved peers are not greatly impacted, having only their slow-start period
extended, the malicious peers begin losing utility from the onset.
Looking at the third curve in Figure 4.6(a) corresponding to C = 0.25, we see
that after the initial slow-start phase, the peer gains utility very slowly. This seems
to indicate that for a low value of C good peers will be unable to attain positive
profit no matter how long they remain in the system. In order to calculate this
threshold capacity value of C at which the steady-state profit rate goes from positive
to negative, we need an equation for the steady-state slope of the utility curves. This
formula is derived by using Equation 4.11 for profit at steady-state trust, or T (∞).
Inserting Eq. 4.15 into Eq. 4.11 we have the following formula for the slope of the
utility curves in steady-state.
P3(∞) = ((πgtkv − kc)C + kmCB)·rgCG / (rgCG + rbCB + δ) − κ (4.18)
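The break-even capacity where P3(∞) = 0 can be found by simple bisection on Equation 4.18. In this sketch κ = 0.01 is our assumption (κ is not listed in Table 4.1); with it the root lands near 0.032, in the same range as the approximate 0.035 quoted for the default parameters:

```python
def p3_inf(C_G, C_B=0.0, rg=0.2, rb=0.99, delta=0.01,
           pi_gt=0.9, kv=2.0, kc=1.0, km=2.0, kappa=0.01):
    """Equation 4.18: steady-state profit rate (kappa = 0.01 assumed)."""
    C = C_G + C_B
    gain = ((pi_gt * kv - kc) * C + km * C_B) * rg * C_G
    return gain / (rg * C_G + rb * C_B + delta) - kappa

# Bisect for the break-even good capacity (C_B = 0); with C_B = 0 the
# profit rate is increasing in C_G, so one sign change exists.
lo, hi = 0.001, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if p3_inf(mid) < 0:
        lo = mid
    else:
        hi = mid
break_even = (lo + hi) / 2
```

Peers below `break_even` pay the fixed cost κ faster than their low steady-state trust lets them earn.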
Figure 4.7: Minimum capacity needed for a good peer to (eventually) generate positive profit (using default πgt, kv, and kc) is approximately 0.035 (for default parameters). (a) Steady-state profit P3(∞) as a function of C = CG. (b) Utility over time of a new peer joining with C = CG near 0.035 (0.02, 0.03, 0.04); T(0) = 0.0035.
We set CB = 0 and plot Equation 4.18 with respect to CG in Figure 4.7(a).
From Figure 4.7(a) we see that the steady-state slope is 0 when CG is approxi-
mately 0.035. Well-behaved peers with a greater contributive capacity than 0.035 are
expected to make a profit from the system in the long-run. Peers that contribute less
will only lose utility by participating as the fixed cost κ will outweigh the small rate
of credits they receive due to their low trust rating. To illustrate this effect we plot
utility over time for three values of C = CG near 0.035 in Figure 4.7(b). Notice that only the curve corresponding to C = 0.04 is capable of generating positive profit, and hence utility, in the long run. However, even for such a peer, the time during which it loses utility before reaching a positive utility gain is very long. If peers are
unwilling to commit themselves to participating in the system for such a long period
of time, then peers with low contributive capacities will be discouraged from participating
at all. Intuitively, designers of real-world systems must address two important
questions:
1. What amount of contributive capacity should be expected of participants in the
system?
2. What fraction of all interested parties are capable of providing and maintaining
that level of cooperation?
The answers to these questions determine the parameters of the system, as well as its
expected size.
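The break-even point can be located numerically by bisection on Equation 4.18. A minimal sketch follows; the numeric defaults are hypothetical placeholders for the Table 4.1 values (not reproduced here), and the steady-state trust factor rg·CG/(rg·CG + rb·CB + δ) is taken directly from the fraction in Equation 4.18, so with the dissertation's actual defaults the zero crossing would land near the 0.035 reported above.

```python
def steady_state_profit(cg, cb=0.0, pi_gt=1.0, kv=1.0, kc=0.1, km=0.5,
                        rg=10.0, rb=10.0, delta=0.05, kappa=0.01):
    """P3(infinity) per Equation 4.18. All numeric defaults here are
    hypothetical placeholders, not the actual Table 4.1 values."""
    trust = rg * cg / (rg * cg + rb * cb + delta)          # steady-state trust
    return ((pi_gt * kv - kc) * (cg + cb) + km * cb) * trust - kappa

def break_even_capacity(lo=1e-4, hi=1.0, iters=60):
    """Bisect for the CG at which a good peer's (CB = 0) steady-state
    profit crosses zero -- the threshold illustrated in Figure 4.7."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if steady_state_profit(mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi
```

Peers with CG below the returned threshold pay the fixed cost κ faster than they can earn credits, matching the behavior of the C = 0.02 and C = 0.03 curves in Figure 4.7(b).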
4.4 Simulation Details
The analytic model introduced in Section 4.2 is a macro-level model
with many simplifications that allow us to analyze and deduce trends about the
system. For example, the equations give us the expected trust and utility of a
typical peer given a certain capacity (CG, CB) assuming a large and varied population
of peers that remain relatively static. In addition, instead of accounting for discrete
transactions, the analytic model assumes continuous work at a rate determined by
the peer’s capacity and reputation.
To test the validity of our analytic model we use a micro-level discrete transactional
model based on our assumptions on capacity and utility pricing. Instead of one
simple formula for the expected trust and utility of a given peer, our transactional
model specifies how a peer chooses which peers to transact with, how transactions
affect each party’s utility, and how each peer’s reputation is updated over time. This
model forms the basis of our experiments, which we compare with the analytic model’s
predicted behaviors. If the simplifications of the analytic model were reasonable, we
would expect the trends observed earlier to be visible in our experiments, even if the
actual values do not match exactly.
Using a turn-based simulator based on our transactional model, we simulate a
peer-to-peer trading system where a large population of N individual peers, each
assigned its own capacity (CG and CB), exchange resources. In each turn, for each peer
p, the system randomly chooses a subset R of the entire population N , representing
peers who respond to p’s resource request. Using the current reputation ratings, p then
selects one (or more) peer r from R and purchases resources from it. This exchange
is represented by a transfer of credits from p to r, a deduction in r’s contributive
capacity C for the rest of the turn, and a change in p and r’s utility (depending on
the amount and type of capacity used). A centralized reputation system updates
each peer’s trust ratings T at the end of the turn based on the amount and type
of capacity contributed during that turn. The algorithm executed for each turn is
presented below.
Algorithm 1 DoTurn()
  for each peer p ∈ N (in random order) do
    select NumResponders peers
    put selected peers with C > 0 in set R
    count ← 0
    while p.credits > 0 AND |R| > 0 AND count < MaxTransactionsPerPeer do
      use Selector to choose responder r ∈ R
      p acquires as much capacity from r as possible (min(p.credits, r.remainingC))
      count++
    end while
  end for
  for each peer p ∈ N do
    use TrustFunction to update p.T based on the amount of p.CG and p.CB contributed during the turn
  end for
In Algorithm 1, the parameter NumResponders (NR) determines how many peers
are randomly selected by the simulator as possible contributors (size of R), mimicking
a subset of the peers responding to p’s resource request to offer their services. The
requesting peer then chooses one or more responders (but at most MaxTransactionsPerPeer
(MTPP) responders) to fulfill its need. We denote the total number of turns
simulated as NumTurns. A list of all simulation-specific parameters and their default
values is presented in Table 4.2.
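Algorithm 1 can be sketched in executable form as below. This is a shape sketch rather than the dissertation's actual simulator: the Peer class, the credit transfer, and the trust-update callback are simplified placeholders, and the utility bookkeeping is omitted.

```python
import random

class Peer:
    """Minimal peer state for the turn loop (a simplified placeholder)."""
    def __init__(self, cg, cb, credits=1.0, trust=0.01):
        self.CG, self.CB = cg, cb
        self.credits, self.T = credits, trust
        self.remaining = cg + cb          # contributive capacity left this turn

def do_turn(peers, selector, trust_update, num_responders=25, mtpp=1):
    """One simulator turn following the structure of Algorithm 1."""
    order = list(peers)
    random.shuffle(order)                 # peers act in random order
    for p in order:
        # Select NumResponders peers; keep those with capacity remaining.
        sample = random.sample(peers, min(num_responders, len(peers)))
        responders = [r for r in sample if r is not p and r.remaining > 0]
        count = 0
        while p.credits > 0 and responders and count < mtpp:
            r = selector(responders)      # e.g., the Polynomial Selector
            amount = min(p.credits, r.remaining)  # acquire as much as possible
            p.credits -= amount
            r.credits += amount
            r.remaining -= amount
            responders.remove(r)
            count += 1
    for p in peers:
        trust_update(p)                   # e.g., the differential model (Eq. 4.13)
        p.remaining = p.CG + p.CB         # capacity replenishes for the next turn
```

Here the requester simply pays credits equal to the capacity acquired; the text's kp pricing factor and the utility terms would slot into the transfer step.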
Table 4.2: Simulation Parameters and Default Values

    Parameter                       Default Value
    N                               500
    NumTurns                        200
    NumResponders (NR)              25
    MaxTransactionsPerPeer (MTPP)   1, 2
    Selector                        Polynomial
    Exponent (E)                    1 (σ = T)
    TrustFunction                   Differential
    Initial Credits C0              1

The Selector represents the selection function σ used by each peer to select which
resource provider to interact with, given the set of responders R. The Selector we
use in our experiments is a Polynomial Selector. Given a set of responders, each
responder is weighted by its trust rating raised to an exponent value, T E. The
Polynomial Selector then probabilistically chooses a responder given their weights.
Therefore, the relative probability that peer i is chosen over peer j is
p(X = i) / p(X = j) = (Ti)^E / (Tj)^E        (4.19)
For example, if E = 1, a peer with T = 0.6 is twice as likely to be chosen as one with
T = 0.3. However, if E = 0, then all peers are equally likely to be chosen, regardless
of their reputation.
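A minimal sketch of the Polynomial Selector, assuming trust ratings are held in a dictionary keyed by responder (the simulator's internal representation is not shown in the text):

```python
import random

def polynomial_selector(responders, trusts, E=1):
    """Choose a responder with probability proportional to T**E (Eq. 4.19)."""
    weights = [trusts[r] ** E for r in responders]
    if sum(weights) == 0:          # every responder has zero trust: pick uniformly
        return random.choice(responders)
    return random.choices(responders, weights=weights, k=1)[0]
```

With E = 1 this reproduces the example above: a peer with T = 0.6 is selected twice as often as one with T = 0.3, while E = 0 makes every weight 1 and the choice uniform.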
The TrustFunction implements the trust model used for updating trust. For our
experiments we look at both the differential and ratio trust models (discussed in
Sec. 4.6.2) using the default parameter values specified in Table 4.1. Unless other-
wise specified, assume our experiment used the differential trust model as defined in
Equation 4.13 to update each peer’s trust at the end of each turn.
In each experiment, we begin the simulation with N peers with varied amounts
of CG and CB. We conducted two types of experiments: population-focused and
individual-focused.
For our population-focused experiments we are interested in the behavior demonstrated
by all peers in the system after a given number of turns.
Individual-focused experiments, however, allow us to track the behavior of a single
peer over time during the experiment. For these experiments we will focus on a
peer entering an existing active trading network. Therefore, we insert N peers with
varying CG and CB values and run the experiment for some number of turns so that
these peers’ trust ratings have stabilized. We then insert a new peer with specific
parameters we are interested in testing. All single-peer results are for this new peer,
which we will refer to as p∗. All time-dependent graphs will plot the results of p∗
beginning at the time it was inserted into the experiment and continue until the end
of the simulation.
4.5 Simulation Results
We begin by examining the trends visible in the base population of 500 peers
itself. Afterwards, we will focus on the single-peer experiments, where a new peer is
inserted into the system after the base population has stabilized. In all experiments
presented here we set NumResponders (NR) to 25, equating to 5% of the population
responding to each service request. We begin by setting MaxTransactionsPerPeer
(MTPP) to 1, allowing each peer to make one transaction per turn.
4.5.1 Base Population
The default base population we use in our experiments consists of N=500 peers with
varied capacity distribution. We classify 70% as “good” peers. These peers all have
CB = 0 and their CG is distributed linearly from 0.01 to 1. The other 30% are the
“bad” peers (CB > 0). Their total capacity C is distributed linearly from 0.1 to 1.
Additionally, the fraction of C devoted to CB is distributed linearly from 0.5 to 1.
This allows us to have a uniform distribution of both capacity and proportion of bad
work to good work.

[Figure 4.8: Capacity distribution for base population.]

The capacity distribution for our base population is illustrated in
Figure 4.8, where the first 150 peers are the bad peers and the rest are the good peers
(sorted in increasing capacity). The different tones distinguish the good and the bad
capacities. For example, peer 75 has a total contributive capacity of C = 0.55, of
which only one quarter is good capacity (CG = 0.14, CB = 0.41).
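The construction just described can be sketched as follows. This is a hypothetical reconstruction that assumes plain linear interpolation over peer indices, but it reproduces the peer-75 example above (C ≈ 0.55, CG ≈ 0.14, CB ≈ 0.41).

```python
def base_population(n=500, bad_fraction=0.30):
    """Build the default base population: the first 30% of peers are bad
    (CB > 0), the rest good (CB = 0), with linearly interpolated capacities."""
    n_bad = int(n * bad_fraction)                    # peers 0..149 are bad
    pop = []
    for i in range(n_bad):
        c = 0.1 + 0.9 * i / (n_bad - 1)              # total capacity: 0.1 -> 1
        f = 0.5 + 0.5 * i / (n_bad - 1)              # fraction bad:  0.5 -> 1
        pop.append({"CG": c * (1 - f), "CB": c * f})
    for j in range(n - n_bad):
        cg = 0.01 + 0.99 * j / (n - n_bad - 1)       # good capacity: 0.01 -> 1
        pop.append({"CG": cg, "CB": 0.0})
    return pop
```

For example, base_population()[75] has total capacity ≈ 0.55 with CG ≈ 0.14 and CB ≈ 0.41, matching Figure 4.8.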
In Figures 4.9(a) and 4.9(b) we have a snapshot of the trust and utility (respectively)
values of the base population after 200 turns. As we discuss later, turn 200
is when we inject peer p∗ in the single-peer experiments. The values are ordered by
peer ids and so correspond directly to the capacities displayed in Fig. 4.8. In these
graphs, the different colors distinguish the 150 bad peers from the good peers. The
black curve indicates the expected values predicted by the analytic model.
As we can see in Figure 4.9(a), the experimental results are surprisingly close
to the analytic model’s prediction. Overall, the simulation values are less than the
predicted values. We believe this is due to the nature of the simulation. By limiting
the number of responders as well as the number of transactions each peer is allowed
per turn, some of the peers’ capacities are not fully utilized, while other peers are left
with a surplus of credits they were unable to spend. This hypothesis is supported by
the utility graph in Fig. 4.9(b).

[Figure 4.9: Trust and utility values for default population after 200 turns. (a) Trust. (b) Utility.]

[Figure 4.10: Distribution of credits in base population at turn 200.]

Once again, the experimental results are relatively
close to the predicted curve, but the predicted curve is steeper. Peers 0-300 appear to
have generated more utility than expected, while high-capacity good peers generated
less than expected. Malicious peers especially generated greater than expected utility.
Malicious peers receive an initial bonus because all peers start at the same time
with the same initial trust rating, allowing the malicious peers to contribute more
capacity at first, while trust ratings are still close together. As we shall see below,
this artificial bonus does not persist over time.
Other factors also contribute to the higher than expected utility for low-capacity
peers and the lower than expected utility for high-capacity peers. Analyzing the credit
distribution, shown in Figure 4.10, reveals that indeed the high-capacity peers maintain
a surplus of credits throughout the simulation run. This surplus is wasted utility
that could have purchased resources or services that increase the utility of the
purchasing peer. In addition, a smaller fraction of high-capacity peers' contributions
are being utilized each turn due to the limited number of responders (NR) and
transactions per peer (MTPP). Because the analytic model is based on continuous,
infinitely small transactions from a very large selection population, it predicts greater
utilization of high-capacity peers. This behavior, induced by the continuous-time
approach of our analytic model, was one of our main concerns that we wished to evaluate with
the transactional model. Fortunately, the results indicate the deviation is reasonably
small.
If we continue simulating our base population for an additional 800 turns (until
turn 1000), we see little difference in peer trust ratings from what we see after only
200 turns (Fig. 4.11(a) vs. Fig. 4.9(a)). The utility graph in Figure 4.11(b) shows a
larger difference between the utility gained by bad peers and those gained by good
peers than in the earlier snapshot (Fig. 4.9(b)), closely matching the predicted curve.
In fact, the utility of bad peers now closely matches the predicted curve, indicating
that their bad contributions have been mitigated, but not eliminated.
If we take a closer look at peers 100 through 200 (Fig. 4.11(c)), which are the bad
peers with high CB and the good peers with low C, we see that many have negative
utility. These peers are unable to contribute enough good capacity to gain sufficient
trust to stay competitive. Therefore, their trust ratings fall to 0 and they are ignored
by all peers. Consequently, the fixed cost κ is the only factor affecting their utility
and, as predicted, their utility falls below 0.
As mentioned earlier, the base population was chosen to represent different amounts
of capacity, both good and bad.

[Figure 4.11: Trust and utility for base population after 1000 turns. (a) Trust for all peers. (b) Utility for all peers. (c) Utility of peers 100-200.]

If we compare the analytic curves and simulation
results in Figure 4.11 for both trust and utility to the base capacity graph in
Figure 4.8, we can determine what the dominant factors are. For good peers, capacity
has little effect on trust unless the capacity is less than 0.1, as we shall discuss later.
The utility, on the other hand, is proportional to a peer’s trust times its capacity.
Consequently, given a relatively flat trust curve (for C > 0.1) and a linear capacity,
the result is a linear utility curve.
Bad peers, however, are more interesting. Here we have several possible factors
influencing trust and utility: total capacity, good capacity, bad capacity, and the ratio
between the last two. In Figure 4.11(a), we see that trust for bad peers is clearly a
linear curve sloping downwards. This indicates that it is proportional to the fraction
of total capacity that is good. Remember that the base population was constructed
with CG/C = 0.5 for peer 0, CG/C = 0 for peer 149, and linearly interpolated for peers
in between the two. Considering the utility graph in Figure 4.11(b) we see that the
predicted utility curve appears similar to the amount of good capacity on bad peers
as shown in Figure 4.8. This is expected, if utility is proportional to trust times total
capacity. Since trust is proportional to the fraction of total capacity that is good,
then, multiplying by each peer’s total capacity, utility would be proportional to the
amount of good capacity.
In another experiment, not presented in any graphs, we varied the model parameter
kp that determines the amount of credits charged per unit of contribution. As
expected, varying kp had no noticeable effect on the trust or utility values attained
by the base population. kp is merely a currency exchange rate and is equally applied
when receiving credits for contributions and when giving credits for acquisitions. This
experiment supports our simplification of profit in Equation 4.2 by cancelling out the
kp terms when applying the long-term disjoint currency scenario, resulting in Equation 4.7.
[Figure 4.12: Trust and utility for NR=400 after 1000 turns. (a) Trust. (b) Utility.]
4.5.2 NR and MTPP
The simulator adds several parameters not included in the analytic model. Foremost
among them are NR, the number of responders a peer has to choose from each turn,
and MTPP, the maximum number of those responders a peer may transact with
in one turn. In the following experiments we modify the value of each parameter
independently and observe its effect on the trust and utility of the peers in the base
population after 1000 turns.
First, we experimented with different NumResponders values. Remember that NR
determines the number of peers that are uniformly chosen at random by the simulator
to represent service providers responding to a peer’s request for service. From these
NR responders a peer selects one to transact with using the selection function, in
this case weighting each peer by its trust. Consequently, the larger the value
of NR, the more effect the responders' trust values have on which peer is selected,
resulting in relatively more transactions with highly trusted peers. Conversely, a small
NR means the initial uniform random selection has a larger impact on which peer is
selected for a transaction. With a small NR, therefore, we would expect low-trust peers
to participate in nearly as many transactions as high-trust peers, assuming equal total capacity.
[Figure 4.13: Trust and utility for NR=1 after 1000 turns. (a) Trust. (b) Utility.]
Looking at the graphs in Figures 4.12 and 4.13, we see exactly the predicted effect.
Comparing the trust and utility graphs for NR=400 (Fig. 4.12) with the default
NR=25 in Figure 4.11 we see the graphs are quite similar. On closer inspection we
notice that the utility and trust values for malicious peers are overall lower for
NR=400, while the trust and utility values for high-capacity good peers are slightly
higher. As expected, this effect is due to a slightly higher number of contributing
transactions for high trust peers versus low trust peers in the NR=400 experiment.
Now, if we compare the default case of NR=25 to NR=1 in Figure 4.13 we see
a much more striking difference. With NR=1 the selection function does not have
multiple peers to compare and so is unused. All transaction providers, therefore, are
chosen purely at random and so the number of transactions a peer contributes to
will be based solely on its total capacity. Now, the bad peers have a clear advantage.
Due to their high capacities, the fact that they gain additional utility for contributing
bad content (recall that bad peers gain a bonus of km·CB utility; see Sec. 4.2.3),
and the loss in utility by good peers for purchasing more bad content,
bad peers generate much more utility than the good peers. Even in the trust graph
in Figure 4.13(a) we see that the trust ratings for the bad peers are closer to the
predicted curve than before, while high-capacity good peers have seen their trust
values fall slightly, all due to the even distribution of transactions.
Clearly, if a peer is limited to choosing between one or two service providers, it is at
greater risk of being cheated by a malicious peer. But once a sufficient number of
responders are available, increasing the response set further provides no appreciable
advantage in avoiding bad peers. Therefore, system designers should strive to increase
accessibility and replication of rare content and resources, but can sacrifice a high level
of recall for popular items.
Next we will focus on the parameter MTPP, which specifies how many transactions
a peer can initiate in one turn. Increasing MTPP will increase the overall
number of transactions, resulting in faster, more accurate trust convergence. Also,
the total utility will increase as peers receive additional transactions. However, the
total available capacity does not change; therefore, the increased utility must come
from peers that were previously underutilized at a lower MTPP. While this provides
some benefit to low trust good peers, such as new peers entering the network, bad
peers will benefit most as they tend to be the most underutilized because of their
high capacity but low reputation. In fact, the purpose of reputation systems is to
keep malicious peers underutilized as much as possible.
If we study Figure 4.9, we notice that there are a few gaps in both the trust and
utility bar graphs, indicating peers with a trust rating of 0. These are peers that were
never chosen to contribute by other peers and thus their trust falls to zero through
decay. The decay factor δ in the trust model slowly decreases the trust rating of these
unutilized peers, which in turn further decreases their likelihood of being chosen in the
future. Once again, this is a factor of the granularity and number of transactions as
compared to the analytic model.
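This decay dynamic can be illustrated with a toy update rule. The functional form and constants below are invented stand-ins, not the actual differential model of Equation 4.13; they only show how a never-selected peer's trust drains to zero.

```python
def update_trust(T, good, bad, rg=0.1, rb=0.2, delta=0.05):
    """Toy differential-style trust update: reward good contribution,
    punish bad contribution, and decay existing trust by a factor delta.
    The constants and exact form are illustrative, not Eq. 4.13."""
    return max(0.0, min(1.0, T + rg * good - rb * bad - delta * T))

# A peer that is never selected contributes nothing, so its trust decays
# geometrically toward zero, making future selection ever less likely:
T = 0.5
for _ in range(200):
    T = update_trust(T, good=0.0, bad=0.0)
```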
To improve overall system utilization we will allow a peer to initiate a second
transaction if they have credits remaining after the first acquisition (MTPP=2). In
Figure 4.14 we see the trust and utility values for the base population after 1000
turns when MTPP=2.

[Figure 4.14: Trust and utility for MTPP=2 after 1000 turns. (a) Trust. (b) Utility.]

Comparing Fig. 4.14(a) to Fig. 4.11(a) we notice that the
trust values match the curve predicted by the analytic model much more closely for
MTPP=2 than for MTPP=1. The higher number of transactions per turn simply gives the
reputation system more data on which to compute a peer’s trust rating.
Setting MTPP to 2 does, however, produce three additional effects that cause it to
deviate from the analytic model, at least with respect to the utility graph. As we see in
Figure 4.14(b) the utility values are higher for all peers, regardless of amount or type
of capacity. This overall increase in utility is primarily due to the increased number
of transactions, which translates into increased utility through more acquisitions.
Additionally, the increase in transactions results in peers contributing more, giving
the reputation system more information on which to update the peers’ trust values,
which means their trust values converge and stabilize faster than before. For most
good peers, which stabilize at a high trust rating, faster convergence allows their
trust rating to rise quickly, giving them an earlier advantage over low trust peers. In
the graph of Figure 4.14(b), this effect is demonstrated by the fact that the slope of
the utility curve for the good peers in the simulation is steeper than in the MTPP=1
graph (Fig. 4.11(b)). In fact, unlike in the previous results, the slope now matches
the slope of the predicted curve.
[Figure 4.15: Utility for MTPP=3 after 1000 turns.]
Finally, we see that though all peers show an increase in utility, the bad peers
exhibit a much larger increase in utility relative to the good peers. This is because,
in the latter part of processing peer transactions during one turn, much of the
remaining capacity is likely to be on bad peers, since they were less likely to be
chosen earlier than good peers. Consequently, peers that complete transactions late
in the turn are often forced to transact with malicious peers as they are the only
peers with remaining capacity. Allowing peers additional transactions each turn only
amplifies this effect. By increasing MTPP we do somewhat improve the capacity
utilization of good peers, but we increase the utilization of bad peers even more.
If we raise MTPP further, this last effect begins to dominate the results. In
Figure 4.15, we see that MTPP=3 clearly benefits the malicious peers. Though the
trust graph is identical to that for the MTPP=2 scenario (Fig. 4.14(a)), the malicious
peers now generate much more utility than the good peers. In addition, the highest
capacity bad peers, which have the lowest trust ratings, now generate the most utility.
These peers are not being selected for transactions because of their reputation, but
only because they are the only responders with remaining capacity. Allowing peers
to perform three transactions per turn greatly increases the chances of choosing a
malicious peer from the response set, especially near the end of a simulation turn
when most good peers have contributed their capacity.
Intuitively, even in real-world systems raising the total number of transactions
is likely to saturate the capacity of trusted peers, resulting in a larger fraction of
malicious peers in the response set. However, the pronounced difference in utility
caused by slightly changing MTPP is an artifact of our turn-based simulator, which
amplifies the effect. This situation can be improved by utilizing a selection threshold
that limits the providers a peer will consider acquiring from to those whose reputation
lies above a certain threshold (see Sec. 5.6.1). Responders with reputations lower than
the threshold are simply ignored.
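Such a threshold filter (the mechanism of Sec. 5.6.1) can be layered on top of the trust-weighted selection; the cutoff value of 0.2 below is purely illustrative.

```python
import random

def threshold_selector(responders, trusts, threshold=0.2, E=1):
    """Drop responders whose reputation is below the threshold, then select
    among the rest by trust weight (T**E). Returns None when no responder
    qualifies, in which case the requester simply skips the transaction."""
    eligible = [r for r in responders if trusts[r] >= threshold]
    if not eligible:
        return None
    weights = [trusts[r] ** E for r in eligible]
    return random.choices(eligible, weights=weights, k=1)[0]
```

Under this rule, end-of-turn requesters are no longer forced into transactions with low-reputation peers merely because those peers have capacity left.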
4.5.3 Trust vs Capacity
In Section 4.3.1 we discussed the convergence of trust over time with respect to
capacity and presented a graph of the predicted T (∞) as a function of capacity in
Figure 4.16. For comparison, we simulated a population of 500 peers with C ranging
from 0.0001 to 1 and CB = 0. The simulation was run for 1000 turns and the
resulting trust ratings recorded. Figure 4.16 shows the results of two simulations,
one for MTPP=1 and one for MTPP=2, along with the corresponding analytic curve.
Once again, we see a significant similarity between the two simulation curves and the
predicted curve, especially for C < 0.001 and C > 0.1. Where the curves differ is in
the behavior of peers with C between 0.001 and 0.1. Both simulation curves sharply
fall off as C drops below 0.1, while the predicted curve tapers off more gradually. The
simulation produced a steeper curve, showing a larger range of capacities that result
in low trust ratings. For example, at C = 0.01 the analytic model predicts a T (∞)
of 0.2, yet the simulation produced trust ratings around 0.01 when MTPP=2 and 0
when MTPP=1.
Once again, this sharp drop-off is due to the different granularity of both time
and number of transactions between the continuous analytic model and the discrete
transactional model used in our experiments.

[Figure 4.16: Comparing the analytical and simulation results for the convergence of T as t → ∞, as a function of C = CG. Note the log-scale x-axis.]

Remember that peers cannot contribute
more than their capacity, but can contribute less, and not all credits are spent each
turn, underutilizing the system. Consequently, peers with little capacity are not likely
to be selected at all and their trust would decay to 0. In a continuous model they
would contribute, but simply to a small degree. In fact, this underutilization appears
to account for the simulation curve being slightly lower than the predicted curve, even
for high capacity peers.
As we saw earlier, increasing MTPP from 1 to 2 improves system utilization
and the likelihood of low-capacity peers being able to contribute, thus
improving their trust ratings. We see this improvement in the graph by the fact that
the MTPP=2 curve is higher than the MTPP=1 curve, closely hugging the predicted
curve, and does not drop quite as sharply as the MTPP=1 curve. The result is that
peers with capacity between 0.01 and 0.03 were ignored in the MTPP=1 experiments,
but were able to participate and trade with other peers in the MTPP=2 experiments.
4.5.4 Single-Peer Experiments
[Figure 4.17: Comparing the analytical and simulation results of trust over time for new good peers (C = 1.00 and C = 0.25). (a) MTPP=1. (b) MTPP=2.]

After studying the effects on our base population with its diverse distribution of
capacity, we now look at injecting a new peer p∗ into a “warm” trading system,
where peers in our base population have been transacting with each other for 200
turns and their trust ratings have stabilized. We then monitor p∗’s trust and utility
over 1000 turns. Typically, p∗’s trust rating converges to a stable value much earlier
than time 1000. Therefore, most result graphs will focus on the first 200 to 500
turns after p∗ is inserted. These experiments better mimic the situation a typical new
peer would encounter joining an established real-world trading system. To minimize
artifacts of randomness we repeat the simulation with 50 seeds and average the 50
resulting data sets.
In the following figures we present various graphs for simulation experiments
matching the analysis from Section 4.3. In each graph we present the experimen-
tal result curve and the the corresponding analytic model curve originally presented
in Figures 4.3 and 4.5.
In our first single-peer experiment we focus on the entry of a good peer. Thus, we
simulate the scenario presented in the analysis surrounding Figure 4.3, inserting both
a high-capacity peer with C = CG = 1 and a lower-capacity peer with C = CG = 0.25.
The results of running our experiment with an MTPP value of 1 are presented in
Figure 4.17(a). We see a large difference between the expected behavior and the
simulation results. Not only did it take longer for the trust values to converge, but
the final value is much lower than expected. For example, when C = 1, p∗ appears
to attain a trust rating of only 0.62, while we had predicted 0.95.
Upon closer evaluation of the experiment we found the reason for this discrepancy.
Remember that each experiment is an average of 50 separate simulation runs. In each
of those simulations p∗ would begin with the low initial trust value of 0.01. After some
turns, with its trust rating gradually decaying, p∗ would be chosen to contribute and
its trust rating would quickly climb until it matched the predicted stable value.
However, there was a large variance in the time it took for p∗ to first be chosen to
contribute. In fact, in some runs p∗ was never selected and its trust would decay to 0.
Therefore, we feel that presenting an average curve in this situation is not representative
of the actual performance.
Previously, we had seen that underutilized peers performed better when MTPP
was increased, improving the probability of their being selected. So we performed the
same experiment, except with MTPP=2. As Figure 4.17(b) shows, the results were
much closer to what we expected. Overall, the simulation and analytic curves appear
quite similar. In the trust graphs both the simulation and analytic curves converge
to the same value for each of the tested capacities. For instance, in Figure 4.17(b),
which depicts two new good peers (C = 1 and C = 0.25) entering the network, both
curves converge to 0.95 for C = 1 and 0.84 for C = 0.25. The primary difference
between the simulation and predicted results appears to be the rate of convergence.
Trust in the simulation converges faster than in the analytic model. In Figure 4.17(b)
we see that the simulation exhibits only a little slow-start behavior before increasing
linearly until near the convergence point. This reduced slow-start period is due
to the additional transaction allowed by MTPP=2. We noticed the same behavior
earlier in the base population studies.
In Figure 4.18 we simulated malicious peers with C = 1 and CB of 0.99 and 0.5.
[Figure 4.18: Comparing the analytical and simulation results of trust over time for new bad peers (CB = 0.99 and CB = 0.50). MTPP=1.]
[Figure 4.19: Comparing utility over time for new good peers (C = 1.00 and C = 0.25). MTPP=2.]
Here, parameter MTPP was set to 1. Notice that the simulation results mimic the
predicted curve exactly. When performed with MTPP=2, the simulation converges to the
same value as the analytic model, though, as before, the simulation converges faster
than with MTPP=1.
Figure 4.19 compares the experimental results to the expected behavior for the
same two good peers whose trust is presented in Figure 4.17(b). Due to our experience
with the new good peer trust experiments using MTPP=1, we only present the results
of the new good peer utility experiments with MTPP=2. As expected by our analysis
in Section 4.3, once the trust ratings have converged, the slope of the utility curve (the
profit rate) remains constant, matching the slope predicted by the analytic model.
114 CHAPTER 4. MODELING REPUTATION AND INCENTIVES
Figure 4.20: Comparing utility over time for a new bad peer. MTPP=1.
Because the simulation trust curves exhibited less slow-start than the analytic model,
we see that the corresponding utility curves also experience less slow-start, resulting
in an upward shift compared to the predicted curves.
Finally, in Figure 4.20 we simulate two new bad peers entering the population.
Both have minimal good capacity (CG = 0.01). One peer was a high capacity bad peer
with CB = 0.99, the other had less capacity (CB = 0.5). As the results indicate, their
capacities made no difference. Both performed almost identically. Both initially gain
utility, primarily from spending the initial credits we allot each peer at the start. But
quickly, their utility levels off at approximately 1.5, the same peak utility predicted
by the analytic model. The simulation, however, peaks much earlier around 75 turns,
while the analytic curves reach their maxima at approximately 150 or 300 turns,
depending on the peer’s capacity. In the simulation, the amount of bad capacity does
not seem to matter. The reason is that the difference between good and bad capacity for
both peers is so large that both peers earn near 0 trust ratings. While a low trust
rating is sufficient to maintain a small level of contribution in the continuous analytic
model, in the transactional model which already suffers from underutilization, these
almost 0 trust malicious peers are ignored by the rest of the population and never
selected.
From the experiments it appears that our macro-level model performs quite well
given our base assumptions. We have tested it with a varied distribution of peer
capacities and the results are highly correlated to the predicted values. The largest
variances appear to be due to the discrete nature of the simulator. We would expect
actual real-world systems to exhibit behavior between the analytic and transactional
model. While a real system would not suffer the artifacts of discrete turns, neither
would its peers engage in infinitely many, infinitely small transactions, as assumed by
the continuous functions we developed in our analysis of our economic model.
4.6 Variations on the Model
In this section we present variations of our trust/incentive model. Specifically, we will
analyze the effects of modifying particular components of Equations 4.10 and 4.13.
4.6.1 Profit Trust Factor
In Section 4.2.3 we presented one straightforward method in which a peer’s trust
influences its profit. We assumed that the probability of a peer being chosen for
a transaction by another peer is linearly proportional to its trust value. Thus, in
Equation 4.10, we multiply each term related to contributed transactions (cost and
payment received) by T . There are other ways in which a peer’s trust could relate to
their profit rate. For example, a peer may be four times as likely to choose a service
provider that has twice the reputation of another. In this case, the relationship would
be quadratic, not linear, and so we would multiply the contribution by T 2, not T .
We refer to this profit trust factor term as σ. Equation 4.20 illustrates how the profit
trust factor appears in our model of profit in the long-term disjoint currency scenario
(compare to Eq. 4.11).
P = (πgtkvC + kmCB − kcC)σ − κ (4.20)
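As a quick sanity check on Equation 4.20, the profit rate can be evaluated for the candidate profit trust factors compared below. This is only an illustrative sketch: the values of πgt, kv, and kc are assumptions made here for concreteness, while km = 2 and κ = 0.01 follow the defaults used later in this section.

```python
import math

def profit_rate(T, C, C_B, sigma, pi_gt=0.9, k_v=1.0, k_m=2.0, k_c=0.5, kappa=0.01):
    """Profit rate from Eq. 4.20: P = (pi_gt*k_v*C + k_m*C_B - k_c*C)*sigma(T) - kappa."""
    return (pi_gt * k_v * C + k_m * C_B - k_c * C) * sigma(T) - kappa

# The four candidate profit trust factors compared in Figure 4.21.
factors = {
    "sigma = 1":     lambda T: 1.0,
    "sigma = T^1/2": lambda T: math.sqrt(T),
    "sigma = T":     lambda T: T,
    "sigma = T^3/2": lambda T: T ** 1.5,
}

# At low trust (T = 0.25), lower exponents let a peer earn more per turn,
# which is why they shorten the startup period -- but, as discussed below,
# they also make misbehavior profitable.
for name, sigma in factors.items():
    print(name, round(profit_rate(0.25, C=1.0, C_B=0.0, sigma=sigma), 3))
```

With these assumed constants, the ordering of the four factors at T < 1 mirrors the startup behavior seen in Figure 4.21(a): the smaller the exponent on T, the sooner a new peer profits.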
Figure 4.21: Effects of varying trust factor σ. (a) Utility over time for various σ functions of T; C = CG = 1. (b) Utility over time for various σ functions of T; C = 1, CB = 0.99.
We experimented with substituting various functions of T for σ, and then cal-
culating the utility over time for both a good peer (C = CG = 1) and a bad peer
(CB = 0.99). The results are presented in Figure 4.21.
We see in Figure 4.21(a) that lowering the exponent on T results in well-behaved
peers’ reaching steady-state faster, but the steady-state profit rate (i.e. slope) remains
the same. In Figure 4.21(b), however, we see that low exponents allow malicious peers
to profit from harming the system. Ideally, we would like to shorten the startup period
for well-behaved peers, while keeping malicious peers from gaining utility from abusing
the system. This translates to choosing a profit trust factor whose curve is the farthest
left in Fig. 4.21(a), but is below 0 in Fig. 4.21(b). Of the four curves presented in
the figure, σ = T , which has been the default value we use, appears to give the best
performance.
How would system designers enforce a specific selection function? Specifying the
selection function used requires regulating the emphasis each peer places on resource
providers’ reputation. For example, to change the σ(T ) from linear to quadratic
would mean that, while previously peers were twice as likely to choose a provider
with double the reputation rating, now they are four times as likely to choose it.
If selection is handled automatically by client software, then updating every peer’s
software will change the selection function.
However, if users manually select the provider from a list of responders (with
their reputation ratings), then each peer applies its own σ(T ). Figure 4.21(b) shows
that if enough naïve peers disregard (or give little weight to) providers' reputations
(e.g., σ(T) = 1 or √T), they can hurt the entire system by making it profitable for
malicious peers to join, thus encouraging misbehavior. Fortunately, it is these naïve
peers that will be hurt most by misbehavior, as they are the ones excessively fetching
resources from non-reputable peers likely to be malicious. In the long run, naïve peers
will either adopt a more reasonable selection policy or leave the system entirely [83].
4.6.2 Additional Trust Models
In Equations 4.12 and 4.13 we introduced a specific formula for updating a peer’s
trust rating given their recent performance. In this section we present other methods
for calculating T and discuss their effect on profit for various peer strategies.
The new trust model we will be looking at considers only the ratio of good contri-
butions to total contributions in a unit of time when updating a peer’s trust rating.
Our previous model factored the amount of contributions into a peer's reputation.
Some argue that a peer’s trust should be orthogonal to the amount they contribute
and only consider the quality of those contributions. So we construct a trust model
where each interval, we combine the fraction of contributions that were good (a value
between 0 and 1) with the previous trust rating by some weight ω. The trust model
can be represented by either of the following equations:
T[i + 1] = ω · CG/(CG + CB) + (1 − ω) · T[i]   (4.21)

∆TR = ω · (CG/(CG + CB) − T)   (4.22)
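As a minimal sketch (not the dissertation's simulator), the ratio trust update of Equation 4.21 can be iterated directly; the starting trust T(0) = 0.01 and weight ω = 0.1 below are the values used in the experiments that follow.

```python
def ratio_trust_update(T, C_G, C_B, omega=0.1):
    """One interval of the ratio trust model (Eq. 4.21): blend the current
    good-contribution fraction C_G/(C_G + C_B) with the previous rating T."""
    ratio = C_G / (C_G + C_B)
    return omega * ratio + (1.0 - omega) * T

# A peer that only contributes good resources converges toward the trust
# ratio C_G/C = 1 (Eq. 4.23), with no slow-start phase.
T = 0.01
for _ in range(100):
    T = ratio_trust_update(T, C_G=1.0, C_B=0.0)
print(round(T, 3))  # -> 1.0
```

Note that the update depends only on the ratio of good to total contributions, not on their absolute amount, which is exactly the property discussed below.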
In Equation 4.21 the term CG/(CG + CB), or simply CG/C, is the trust ratio: the
fraction of times the peer has provided good resources when contributing in the last
unit of time. The term T[i] represents the peer's current reputation. The two terms
are combined by a weight ω, which determines how much of a peer's reputation results
from its current performance versus its past history. To distinguish between the
previously studied trust model and the new model proposed here, we refer to the
former (given in Eq. 4.13) as the differential trust model and to the new one as the
ratio trust model. For consistency with the differential trust model, Equation 4.21
can be expressed as the change in T in a unit of time (Eq. 4.22), with a subscript
R denoting the ratio model.
Notice that the ratio trust model is similar to how a seller’s reputation was cal-
culated in Chapter 3 using Buyer Strategy β1. The difference is that in Chapter 3
an unweighted history was used. We use the same trust model in Chapter 5 as it is
intuitive and simple to analyze, given the complexity of the system model used.
To compare the two trust models we ran experiments similar to those conducted
in Section 4.3, but with the ratio trust model, using ω = 0.1. We will not consider
other values of ω as it plays little role in the long-term behavior of the system. The
steady-state trust function for the ratio trust model is simply the trust ratio,
TR(∞) = CG/C   (4.23)
and is independent of ω. The weight has no effect on steady-state profit rate, which
is a function of T (∞). However, ω does affect the speed with which T converges to a
new value if the amount of contribution changes. A larger ω causes T to adapt faster,
shifting both the T (t) and U(t) curves to the left. Conversely, a smaller ω emphasizes
a peer’s past behavior, shifting the curves to the right.
Figure 4.22: Comparison of the ratio trust model to the differential trust model. T(0) = 0.01. (a) Trust as a function of time for new well-behaved nodes with two different values of C = CG. (b) Trust as a function of time for reputable peers that turn bad, with two different values of CB; C = 1. (c) Utility over time of a new good peer joining, for two values of C = CG. (d) Utility over time of a new bad peer joining, for two values of CB; CG = 0.01.

Figure 4.22 shows the results of four experiments, which include the corresponding
differential trust model curves for comparison. First, in Figure 4.22(a) we model
two well-behaved peers entering the system: one with C = CG = 1 and one with
C = CG = 0.25. Because the ratio trust model uses the ratio CG/C to compute ∆T,
and not the absolute amount, there is no difference between the results for the two
C = CG values. Therefore, we instead present two curves with different ω weights, 0.1
and 0.01. We see that, unlike the differential model, the ratio model has no slow-start
phase in which the change in slope, d²T(t)/dt², is positive. Instead, the ratio curves
climb fastest at first, then eventually level out to a trust rating of 1. As expected,
lowering ω dampens the impact of the current ratio CG/C = 1 when computing ∆T,
lengthening the time it takes to converge and thus shifting the curve towards the right.
In Figure 4.22(b) we look at two malicious peers who start with a reputation of 1:
one with CB = 0.99 and one with CB = 0.25.7 Here we see the opposite effect: the
ratio model responds more slowly to the misbehavior than the differential model.
In addition, for CB = 0.25 the steady-state trust value is significantly higher for the
ratio model than for the differential model.
Figures 4.22(c) and 4.22(d) show utility over time for the two good peers (as in
Fig. 4.22(a)), and the two bad peers (Fig. 4.22(b)), respectively. At first, the results in
Fig. 4.22(d) seem inconsistent with Fig. 4.22(b). Consider the curves corresponding
to CB = 0.99. Both the ratio and differential trust models appear to converge to
the same low trust rating in Fig. 4.22(b), yet they have wildly different utility curve
slopes at steady-state in Fig. 4.22(d). However, though the two curves seem to have
similar low trust values at time 100 (the edge of the graph in Fig. 4.22(b)), from
Equation 4.23 the ratio model converges to CG/C = 0.01 as t → ∞, while Equation 4.15
tells us that the differential model converges to approximately 0.002. The steady-state
utility slope is given by the generic formula for profit at time infinity (independent of
the trust model): P3(∞) = ((πgt kv − kc)C + km CB) · T(∞) − κ. The factor
7 C = 1 for both malicious peers; CG = C − CB.
of 5 difference in T (∞) between the two models is sufficient to make the first term
greater than κ for the ratio trust model (thus giving a positive slope), while making
the first term smaller than κ for the differential model (resulting in a negative slope).
Therefore, malicious users can profit in the long run in a system incorporating the
ratio trust model, but not with the differential trust model.
Consequently, from Figures 4.22(a) and 4.22(b) we conclude that the ratio model
would be vulnerable to malicious peers that occasionally contribute good resources
to raise their reputation, then switch to offering bad resources to damage the system.
This conclusion is validated by the results in Figures 4.22(c) and 4.22(d). While the
two trust models perform similarly with respect to well-behaved peers (though the
ratio model reaches steady-state faster), Fig. 4.22(d) demonstrates the most impor-
tant difference in their performance. While the differential model results in a limited
and eventually negative utility for bad peers, the ratio model allows bad peers to
continuously make large positive profit (the curves are linear).
To summarize, though the trust model used by a trading system is system-specific
and cannot be fully generalized, there are obvious advantages to choosing one ap-
proach over another. We have seen here that the differential trust model exhibits
much better characteristics than the ratio trust model, specifically with regards to
malicious behavior.
4.6.3 Tying Service to Reputation
We will now look at the probability of acquiring a good resource, denoted by πgt in
Equation 4.10. Until now, we have assumed πgt has a constant value for a given system
and is independent of the status or behavior of the peer acquiring the resource. This
may not necessarily be the case. For instance, a malicious peer may acquire a resource
from a cooperative peer but then claim to not have received it, negatively affecting
the good peer’s reputation or payment. To avoid being cheated in this way, peers
with limited contributive capacity may prefer to contribute to reputable peers. For
example, say peer A is searching for resource R. Some peer B that has R may decide
whether to offer it to A based on A’s reputation. If B is highly loaded with requests it
may ignore A’s request unless A is very reputable. Prioritizing contributions based on
requestors’ reputation provides peers further incentive to cooperate. If well-behaved
peers are more likely to respond to requests from peers with higher reputation ratings,
then the number of resource request responses a peer receives is directly related to
its reputation. The more responses received from good peers, the more likely a peer
is to acquire a valid resource. For example, assuming a uniform distribution of peer
trust ratings from 0 to 1, if A receives only one response, the expected reputation of
that responder is 0.5. If A receives 4 replies, the expected reputation of the highest
rated responder would be 0.8.8 Therefore, the higher A’s reputation, the greater the
number of resource offers it will receive for a given query, which in turn increases the
probability of acquiring a good version of resource R.
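The expected reputations quoted above follow from E[max(U1, ..., Un)] = n/(n + 1) for n uniform draws; a small Monte Carlo check (illustrative only) confirms the two cases:

```python
import random

def expected_max_reputation(n, trials=200_000, seed=1):
    """Estimate the expected maximum of n reputations drawn uniformly
    from [0, 1]; analytically this equals n/(n + 1)."""
    rng = random.Random(seed)
    return sum(max(rng.random() for _ in range(n)) for _ in range(trials)) / trials

print(round(expected_max_reputation(1), 2))  # one responder: about 0.5
print(round(expected_max_reputation(4), 2))  # best of four responders: about 0.8
```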
To represent reputation-weighted contributions in our model, we make πgt a func-
tion with one parameter denoting the acquiring peer’s trust rating Ti, giving us the
term πgt(Ti). Let us explore four possible πgt functions, given in Figure 4.23. These
functions capture a range of biases towards trustworthy peers. For example, the
fourth curve corresponds to a quadratic function of T, πgt(T) = T². A peer with
a trust rating of 0.3 is expected to have only 9% (0.3²) of its acquired resources be
valid. If its reputation doubles to 0.6, then 36% of its acquisitions will likely be valid.
Note that the area under the curve captures, in a way, the amount of good re-
sources contributed in the system. Consider a system with 100 peers, all with equal
number of acquisitions A = 1 and T values uniformly distributed between 0 and 1
(e.g., 0.01, 0.02, 0.03, etc.). We shall refer to this scenario as the “uniform” scenario.
8 Note that E[max(U1, U2, ..., Un)] = n/(n + 1) [76, 36].

Figure 4.23: πgt with respect to T for various functions of T.

According to our πgt(T) function, a given peer, say with T = 0.45, will
have acquired 0.45² = 0.2025 good resources in each time unit. Summing up all the
good resources acquired by all peers is equivalent to calculating the area under the
curve. The total amount of good resources acquired must equal the total amount
of good resources contributed (∑i CG,i over all n peers). So in this scenario, the area
under the curves represents the total good "work" done in the system. Figure 4.23
shows three other curves representing functions of T: (1/3)T⁰, (1/2)√T, and (2/3)T. Note that with
the given coefficients we have “normalized” the curves so that the area under each
curve is equal. It makes sense to consider curves with equal areas because it models
systems where the same amount of good work is being done, under the assumptions
of our uniform scenario. For brevity, we will ignore the constant coefficients when
referring to the four curves (i.e., T⁰, √T, T, T²). By studying these four functions
we can see the impact of a reputation bias. If bias is good we can then design a
search mechanism that indeed rewards trustworthy peers in a comparable way. If we
discover bias is bad, we can then downplay trustworthiness in our search mechanism.
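The normalization claim is easy to verify numerically: under the uniform scenario, each of the four curves of Figure 4.23 encloses the same area, 1/3, over T ∈ [0, 1]. A midpoint-rule sketch:

```python
def area(f, steps=100_000):
    """Midpoint-rule integral of f over [0, 1]."""
    h = 1.0 / steps
    return sum(f((i + 0.5) * h) for i in range(steps)) * h

# The four normalized pi_gt curves from Figure 4.23.
curves = {
    "(1/3) T^0":   lambda T: 1.0 / 3.0,
    "(1/2) T^1/2": lambda T: 0.5 * T ** 0.5,
    "(2/3) T":     lambda T: (2.0 / 3.0) * T,
    "T^2":         lambda T: T ** 2,
}

for name, f in curves.items():
    print(name, round(area(f), 4))  # each area rounds to 0.3333
```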
Figure 4.24: Effects of sample πgt functions of T. (a) Steady-state profit as a function of CG for various πgt; CB = 0. (b) Steady-state profit as a function of CB for various πgt; C = 1.

Using these functions of T for πgt in Equation 4.18, we obtain the expected profit
rate at steady-state. In Figure 4.24(a) we plot profit at steady-state as a function
of CG, with CB = 0. Notice that for the curves corresponding to functions with low
exponents (T⁰ and √T), peers will not have positive profit no matter how much they
contribute. The more skewed the probability of locating a good resource is towards
reputable peers, the more profit cooperative peers will gain in the long run, regardless
of the amount they contribute.9 Therefore, reputation-weighted contributions both
encourage good behavior and result in more profit for well-behaved peers when total
system contributive capacity is low. Of course, this simple analysis assumes that
the system is overloaded with requests (∑A ≥ ∑CG) and the distribution of peer
reputations remains uniform throughout time.
We have seen how varying function πgt(T ) affects the profit of good peers. Now we
focus on the effect on bad peers. Figure 4.24(b) plots profit at steady-state (P3(∞))
as a function of CB, given that C = 1 and, therefore, CG = 1−CB. To understand the
graph, consider a peer that contributes at the full rate of C = 1. The x-axis represents
a spectrum of behavior, with a highly cooperative peer that always contributes valid
resources at the left extreme, and a highly malicious peer that always contributes
false resources at the right extreme. Here we see that a lower T exponent in the πgt
function results both in lower profit for well-behaved peers (to the left) and higher
profit for somewhat malicious peers (in the middle). In fact, for the first two curves
(T⁰ and √T, respectively), only somewhat malicious peers make positive profit, with
a maximum profit rate of approximately 0.1 for T⁰ and 0.5 for √T, at CB ≈ 0.5.

9 As long as CG > 0.1. For smaller CG all πgt(T) functions give equal negative profits.

Figure 4.25: Steady-state trust as a function of CB; C = 1.

Figure 4.26: Steady-state profit as a function of CB for km ∈ {0, 0.5, 1, 2}; C = 1.
In contrast, the curves for T and T² indicate relatively high profit for purely well-
behaved peers (at CB = 0) as we saw in Fig. 4.24(a), but then sharply drop to a
profit rate less than 0 at approximately CB = 0.2. Afterwards, both curves exhibit
a gradual increase in P3(∞) as CB increases, similar to that of the lower-exponent
curves, but to a lesser degree. Finally, all four curves drop to a negative profit rate
at CB = 1.
The most interesting characteristic of all four curves is the gradual increase in
profit to a (local) maximum value for CB between 0.4 and 0.8, depending on the
particular curve, before falling again to less than 0 profit. This profit “hump” is
independent of the πgt function as it appears in all four curves. The hump is a result
of the other factors in Equation 4.18 that depend on CB, specifically the two terms
kmCB and T(∞). Notice that the first term is linear with respect to CB, with a
large coefficient km = 2. However, if we plot T (∞) w.r.t CB, as in Figure 4.25, we
see that the decrease is not linear and has a slope less than km. Therefore, in the
model represented by Eq. 4.10, the reputation system's attenuation of the profit from
malicious activity (CB) is not sufficient to overcome the gain in profit conferred by
parameter km. Unfortunately, some mixes of good and bad contributions therefore
yield positive profits, which a reputation system ideally would prevent.
By analyzing the underlying equations, we see there are various ways to further
decrease steady-state profits for CB > 0. The first would be to decrease the utility
malicious peers gain from misbehaving, lowering km. Figure 4.26 shows the effect on
the T curve from Fig. 4.24(b) of decreasing the value of km from the default value
of 2. Notice that all curves share the same endpoints at CB = 0, where the peer
is sharing no bad resources to derive malicious utility from, and CB = 1, where the
peer’s trust rating is 0 and so none of its resources are contributed, as no one trusts it.
In the middle, lowering km decreases the steady-state profit. Interestingly, the largest
impact is not for high values of CB, as we might expect, but for lower values, especially
around CB = 0.25. The reason is that, though a smaller fraction of a malicious peer’s
contributions are bad (CB), the total amount of bad contributions is greater because
of the effect of the profit trust factor on total contribution (C · T(∞)). With a negative
P(∞) of almost 0.2, the km = 0 curve attains the lowest profit rate, much lower than
can be accounted for by the fixed cost κ = 0.01. The poor profit rate is due to a πgt
value (based on the calculated T(∞)) that is less than kc/kv. Recall from Section 4.2.3
that for a well-behaved peer to gain utility in our model from Equation 4.11 based on
the long-term disjoint currency scenario, the inequality πgtkv > kc must hold. In the
worst case, exhibited at approximately CB = 0.25, πgt is less than kc/kv = 0.5 because
of a moderately low T (∞), causing negative profits per contribution, but T (∞) is
sufficiently high to have a high level of contribution. This experiment demonstrates
that if a peer is losing utility on each of its acquisitions, it will benefit from lowering
its contributions, which can be done by lowering its contributive capacity, leaving the
system, or developing a bad reputation. Figure 4.26 also shows that, except for high
km = 2, malicious peers are not able to maintain a positive rate of profit if they share
a substantial amount of bad resources (CB > 0.1). Unfortunately, because km is the
subjective personal utility gained by a malicious user for simply hurting another user,
it is beyond the control of the system designer.
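To see the mechanics behind Figure 4.26, steady-state profit can be sketched from the fixed point of the differential trust model. The parameter values below (rg, rb, δ, πgt, kv, kc) are illustrative assumptions, not the settings actually used in our experiments, so absolute values will differ from the figure; only the qualitative effect of lowering km carries over.

```python
def steady_state_trust(C_G, C_B, r_g=1.0, r_b=1.0, delta=0.05):
    """Fixed point of the differential trust model with sigma(T) = T:
    setting dT = (r_g*C_G*(1 - T) - r_b*C_B*T)*T - delta*T**2 to zero."""
    return (r_g * C_G) / (r_g * C_G + r_b * C_B + delta)

def steady_state_profit(C_B, C=1.0, k_m=2.0, pi_gt=0.9, k_v=1.0, k_c=0.5, kappa=0.01):
    """P3(inf) = ((pi_gt*k_v - k_c)*C + k_m*C_B) * T(inf) - kappa."""
    T_inf = steady_state_trust(C - C_B, C_B)
    return ((pi_gt * k_v - k_c) * C + k_m * C_B) * T_inf - kappa

# Lowering k_m monotonically lowers a partially malicious peer's
# steady-state profit, shrinking the profit "hump".
for k_m in (2.0, 1.0, 0.5, 0.0):
    print(k_m, round(steady_state_profit(C_B=0.25, k_m=k_m), 3))
```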
A second way to decrease profit for malicious peers would be to increase the
fixed participation cost (κ), which would lower profit for all participants. Adding
an additional fixed cost of κ′ will simply shift all the curves in Fig. 4.24(b) down
by κ′. A third solution would be to use a different profit trust factor (as discussed
in Section 4.6.1) to increase the emphasis a peer’s reputation has on the amount it
contributes. And finally, the trust model, given in Equation 4.13, could be changed.
The weight given to good contributions, rg, could be decreased, further lowering all
peers' trust ratings. Also, the entire trust model could be replaced with a different
method of calculating trust, as discussed in the following section.
To summarize, in situations where the average peer has little contributive capacity,
selectively offering resources based on the requestor’s reputation results in more profit
for well-behaved peers and further decreases the profit expectations of malicious peers.
4.7 Generalized Model of Trust and Profit
We now extend the system model represented by Equations 4.10 and 4.13 that we
have been studying. By augmenting the strategies available to a peer, we develop a
more general mathematical model for trust and incentives in trading systems.
Until now we have assumed that all peers charge the same amount for all resources.
If prices are fixed, regardless of a peer’s reputation, there would be no reason for peers
to choose a provider with a lower trust rating over one with a higher rating. Con-
sequently, reputable peers would become overloaded while new peers would seldom
be chosen to contribute, keeping them at a lower reputation rating. What if peers
price their contributions differently, depending on their reputation? The greater a
peer’s trust, the more they may charge for providing a resource. For instance, a peer
with a reputation of 0.5 may charge twice as much for the same resource as a peer
with reputation 0.25. Consequently, when a peer is selecting a service provider from
which to acquire a resource, they may have a choice of peers with different reputa-
tions, offering the resource for prices proportional to their reputation rating. The
requesting peer may then choose to pay more for a reliable provider, or pay less and
risk receiving a valueless resource.
To represent this effect in our model we focus on the term for profit from acquisitions
in Equation 4.10, (πgt kv − kp)A. For a particular resource a peer wants to acquire, there
may be several peers offering it. On average, choosing a cheaper resource provider
corresponds to lowering the value of kp. At the same time, we expect the risk of
buying a bad resource to increase, decreasing the value of πgt. We call the function a
peer uses to select a price for its resources, based on its current reputation, its pricing
function. We denote the pricing function for a particular peer i as pi(Ti).10 This
function determines the payment a peer receives for one contribution. For simplicity,
let us assume kp is the “full” or maximum price a peer may charge for a resource. A
peer’s pricing function will denote the fraction (between 0 and 1) of the maximum
price which it will charge for each contributed resource. Inserting this into our profit
10 As before, we will ignore the subscript when all subjective variables in question relate to the same peer.
equation (Eq. 4.10) yields
P = πgt kv A − kp A + (km CB + kp p(T) C − kc C) · T − κ   (4.24)
Notice that for a given peer i, the probability of acquiring a good resource πgt is
dependent on the price it is willing to pay for a resource and the pricing function used
by the other peers that service its acquisitions. To represent this variability in price
we introduce the discount factor di as a multiplicative factor of kp, which is assumed
to be the maximum price of a resource.
For simplicity, let us assume all other peers use the same pricing function, denoted
as ρ().11 We can now express the probability of a specific peer i acquiring a good
resource as a function of the global pricing function and the amount the peer is
willing to pay, denoted by πgt(di, ρ) or, briefly, πgt(d, ρ). Note that this new term
for the probability of acquiring a good resource (πgt(di, ρ)) is independent of Ti. To
account for the possibility that πgt is related to a peer’s reputation, as discussed
in Section 4.6.3, we add an additional function parameter, resulting in the function
πgt(di, Ti, ρ).
Further augmenting Equation 4.24 with the discount factor and πgt function we
have
P = πgt(d, T, ρ) kv A − kp d A + (km CB + kp p(T) C − kc C) · T − κ   (4.25)
When a resource requester must decide which provider to transact with, it will
compare the providers based on their reputation ratings and the prices they set.
Thus, the number of transactions a peer contributes in a given time interval will be
dependent on how other peers weigh its reputation and pricing function. We express
this relationship with a selection function σ(Ti, pi), a global black-box function which,
given a peer’s current trust and pricing function, determines the fraction of C the
peer actually contributes. A concise explanation of the functions and variables in our
generalized model appears in Table 4.3.

11 Consider ρ to be similar to an average of the pricing functions of all other peers.

Table 4.3: Definition of Generalized Model Terms

Term            Definition
di              Fraction of the maximum resource price a peer i pays (on average)
pi(Ti)          Pricing function used by peer i to determine the fraction of the maximum resource price to charge for a contribution
ρ()             Global pricing function we assume all other peers use
σ(Ti, pi)       Global selection function that determines the fraction of a peer's contributions that are used, given its current reputation and the pricing function it uses
πgt(di, Ti, ρ)  Global function that determines the probability of a peer's acquired resources being good, based on the price the peer is willing to pay (di), its reputation, and the pricing function used by the rest of the peers
kv              Utility of a full unit of acquired resources
kp              Maximum price in utility of a full unit of contributed resources
kc              Cost in utility of contributing at maximum capacity (C = 1)
We now generalize our model for profit and trust in a reputation system by em-
ploying both the pricing and selection functions. Inserting the selection function into
Equations 4.25 and 4.13 and collecting terms yields our generalized equations for
profit and changing trust.
P = (πgt(d, T, ρ) kv − kp d) A + (km CB + (kp p(T) − kc) C) σ(T, p) − κ   (4.26)

∆T = (rg CG (1 − T) − rb CB T) σ(T, p) − δT²   (4.27)
Because σ(T, p) determines the fraction of contribution used in a unit of time, we must
apply it to our change in trust equation as well. Thus, we replace the multiplicative
T in Equation 4.13 with σ(T, p) in Equation 4.27. The selection function replaces the
profit trust factor discussed in Section 4.6.1.
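For concreteness, the two generalized update rules can be transcribed directly into code. The following Python sketch implements one time step of Equations 4.26 and 4.27; the function name `step`, the helpers `sigma_T` and `p_one`, and all numeric constants are illustrative assumptions, not values taken from the text.

```python
def step(T, d, A, C, CB, CG, pi_gt, sigma, p,
         kv=1.0, kp=0.5, km=0.0, kc=0.1, kappa=0.05,
         rg=0.05, rb=0.1, delta=0.01):
    """One time step of the generalized model (Eqs. 4.26 and 4.27).

    T: current trust; d: fraction of the maximum price paid; A: resources
    acquired; C: total contribution; CB/CG: bad/good contribution rates;
    pi_gt: probability an acquired resource is good (a plain float here);
    sigma: selection function sigma(T, p); p: pricing function p(T).
    All constant defaults are illustrative placeholders.
    """
    s = sigma(T, p)
    # Eq. 4.26: profit from acquisition minus payment, plus the
    # contribution terms scaled by the selection function, minus kappa.
    P = (pi_gt * kv - kp * d) * A + (km * CB + (kp * p(T) - kc) * C) * s - kappa
    # Eq. 4.27: trust gain/loss scaled by the selection function, minus decay.
    dT = (rg * CG * (1 - T) - rb * CB * T) * s - delta * T ** 2
    return P, T + dT

# Example with the sigma(T, p) = T, p(T) = 1 configuration from the text:
sigma_T = lambda T, p: T
p_one = lambda T: 1.0
P, T_new = step(T=0.5, d=1.0, A=1.0, C=1.0, CB=0.0, CG=1.0,
                pi_gt=0.9, sigma=sigma_T, p=p_one)
# P ≈ 0.55, T_new ≈ 0.51 with these illustrative values
```

Passing σ and p as callbacks mirrors the model's treatment of them as black-box global functions.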
Notice that if σ(T) ≠ T, then we can no longer guarantee that T will remain
4.7. GENERALIZED MODEL OF TRUST AND PROFIT 131
between 0 and 1 even if rb + δ ≤ 1.12 The only way to maintain that guarantee is
if both rbCB and δ are multiplied by the same T-dependent term. To restore the
guarantee we would need to modify the decay term from δT 2 to δTσ(T ). However,
this implies that the rate of trust decay used by the reputation system is proportional
to the selection function used by peers when choosing resource providers based on
reputation.
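The claim that scaling both the loss and decay terms by the same σ(T)-dependent factor keeps T bounded can be checked numerically. The sketch below uses the modified decay term δTσ(T); the function name `trust_update`, the parameter values (chosen to satisfy rb + δ ≤ 1), and the random contribution rates are all assumptions of this illustration.

```python
import random

def trust_update(T, CG, CB, sigma, rg=0.3, rb=0.6, delta=0.4):
    """One trust step with the modified decay term delta*T*sigma(T).

    Both the loss term rb*CB*T and the decay are multiplied by the same
    sigma(T)-dependent factor, restoring the guarantee that T stays in
    [0, 1] whenever rb + delta <= 1. Parameter values are illustrative.
    """
    s = sigma(T)
    return T + (rg * CG * (1 - T) - rb * CB * T) * s - delta * T * s

# Numerical check: T never leaves [0, 1] under random contribution rates.
random.seed(1)
sigma_sqrt = lambda T: T ** 0.5  # one of the selection functions of Fig. 4.27
T = 0.01
for _ in range(500):
    T = trust_update(T, CG=random.random(), CB=random.random(), sigma=sigma_sqrt)
    assert 0.0 <= T <= 1.0
```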
We now consider two constraints on the generalized model: one where we hold
p(T ) constant, and one where we hold σ(T, p) constant.
Notice that we can think of our original model from Equation 4.10 as a specific
instance of Equation 4.26, with σ(T, p) = T and p(T) = 1 for any value of
T. We will denote this special pricing function, where p(T) = 1 for all T, as p̄(T).
In Equation 4.10 we had assumed that σ(T, p) = T for simplicity of illustration and
analysis.
In an analysis similar to that presented previously in Section 4.6.1, we varied σ(T, p)
in both Equations 4.26 and 4.27, while keeping the pricing function fixed at p(T) = 1. The difference between
these experiments and those in Section 4.6.1 is that the σ(T, p) term affecting ∆T
was also changed here, while it remained σ(T, p) = T in the previous experiments.
Figure 4.27(a) shows how increasing the exponent on T increases the slow-start
phase before a peer's trust rises quickly. The more emphasis placed on reputation
when selecting a peer, the less likely low-reputation peers are to be chosen, resulting
in a longer period of time needed to prove their trustworthiness.
Studying Figure 4.27(b) we see the effect of different selection functions on highly
cooperative peers (C = CG = 1). Notice that a longer reputation slow-start phase
translates into a longer period of low profit, before reaching steady-state where utility
climbs linearly. Comparing the graph to Figure 4.21(a) we see a larger variation in
the length of the startup phase, before trust and profit stabilize. This behavior is
12Discussed in Section 4.2.3 when decay was first introduced.
[Figure 4.27: Effects of varying σ(T, p), with curves for σ = 1, σ = T^(1/2), σ = T, and σ = T^(3/2). (a) Trust T(t) over time for various σ(T, p) functions of T. (b) Utility U(t) over time for various σ(T, p) functions of T; C = CG = 1. (c) Utility U(t) over time for various σ(T, p) functions of T; C = 1, CB = 0.99.]
most noticeable when comparing the curves for σ = T^(3/2) in the two graphs. Previously
we did not take into account the effect that varying the selection function (previously
referred to as the profit trust factor) would have on ∆T . We believe this generalized
form to be more reasonable and consistent. If the selection function influences the
amount of contribution affecting profit, it should similarly influence the amount of
contribution affecting reputation.
Figure 4.27(c) demonstrates the effects on highly malicious peers (CB = 0.99,
CG = 0.01). Comparing to the corresponding graph from Section 4.6.1 (Fig. 4.21(b))
we see little difference. Because T begins at a very low value (0.01) and only decreases,
the additional σ(T, p) term in the trust equation (Eq. 4.27) has little effect on the
system behavior. As before, however, the selection function term in the profit equation
(Eq. 4.26) does considerably influence the profit rate of malicious peers. Again,
σ(T, p) = T gives the most desirable results, limiting malicious peers to negative
profit while allowing good peers to quickly gain utility.
Instead of holding p(T) constant and varying σ(T, p), we could conversely
imagine a σ(T, p) that produces one constant value for any T. Suppose all peers
could price their contributions so that the expected risk of receiving a bogus resource
is offset by the lower cost. Then, for any resource, no matter what price
dkp is offered, the related risk πgt(d, T, ρ) changes so that (πgt(d, T, ρ)kv − kpd) always
remains constant.13 Consequently, all peers would be able to contribute their full
capacity, but more reputable peers would generate more profit for the same contributive
capacity. We denote the special pricing function where the price of a certain peer’s
contribution exactly offsets the expected risk from transacting with a peer of their
reputation as p∗(T ). Expressed mathematically
∀T σ(T, p∗) = 1 (4.28)
13Given that πgt is independent of T .
Every peer’s contributive capacity is equally likely to be used. The advantage of
σ(T, p∗) = 1 is that T will reach steady state much faster by negating the slow-start
reputation effect, since low-reputation peers perform more transactions per round from
which their behavior can be judged.
Unfortunately, p∗(T ) does not handle malicious peers well. Notice that if σ(T, p∗) =
1 in Equation 4.26, then purely malicious peers will generate positive profit if km > kc,
regardless of p∗(T ). In other words, if malicious peers derive sufficient “pleasure” from
harming the system by distributing bad content, then they are benefiting from the
system as long as they are allowed to make contributions, even if they do not receive
payment for them. Preferably, σ(T, p∗) resembles a step function that rises from 0
to 1 at T = T (0)− ε, for some very small epsilon. This would allow new good peers
to quickly raise their reputations. A peer whose reputation falls below T(0) will be
ignored for the remainder of its stay in the system, with no hope of redemption.
Of course, such a node could change its identity and reenter the system, but if it
misbehaves it will quickly find itself ignored once again.
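Such a step-shaped selection function is straightforward to write down. In the hypothetical sketch below, `T0` stands for the newcomer trust level T(0) and `eps` for the small ε; both values are illustrative.

```python
def sigma_step(T, T0=0.1, eps=1e-3):
    """Step-shaped selection function (Sec. 4.7): a peer's contributions
    are fully used once its trust is at least the newcomer level T(0)
    minus a small epsilon, and ignored otherwise. T0 and eps are
    illustrative values, not parameters from the text.
    """
    return 1.0 if T >= T0 - eps else 0.0

# A new peer entering at T(0) is served; one that has fallen below is ignored.
sigma_step(0.1)   # → 1.0
sigma_step(0.05)  # → 0.0
```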
4.8 Discussion
4.8.1 Credits and Economic Stimulation
One remaining question is how credits are distributed to the peers in the system.
One solution is to allow peers to purchase credits using real money at a certain
exchange rate, rx.14 Ideally, the monetary price of credits should outweigh their
cooperative price. If not, peers would rather purchase credits with money than acquire
them by contributing resources, lowering the total value of the network. Given that
kc, the cost of contributing (see Equation 4.2), is strictly less than kp, the payment
for contributing, this condition is satisfied, since rxkp > rxkc.
14For consistency, let rx be the exchange rate from credits to real-world currency.
Now any peer may acquire credits using real-world money if it wishes. However,
a stranger's low reputation limits its total credit income rate, encouraging it
to purchase credits in order to use the network to acquire resources during its
reputation slow-start phase. Paying the extra price of direct currency conversion in
order to gain instant access to offered resources is another aspect of the penalty, or
tax, on strangers.
Over time the system may lose currency as nodes leave the network abruptly with
positive credit balances. Also, peers that contribute more than they spend will hoard
credits. To avoid global stagnation the system can periodically inject credits into
the network in the form of payments or rewards. The question is: which peers should
be paid? One solution is to give credits to the nodes that need them most: those with
zero credits. Unfortunately, this would encourage nodes to freeride, since nodes that
are not contributing would earn no credits and hold a zero balance. Another choice is
to reward nodes that contribute the most, regardless of the rate they spend credits.
Of course these nodes are likely to have high income rates and large credit balances
already.
We suggest rewarding the peers who generate the most global wealth. Wealth in
our system is produced whenever a transaction occurs. The peer serving a resource
is paid more than what serving it costs. The peer acquiring the resource values
it more than the credits it paid for it. Therefore, the peer participating in
the most transactions will be helping to increase the total wealth. Using our earlier
variable A (the amount of resources acquired) this translates to
{Node s | (Cs + As) = max_{x ∈ N} (Cx + Ax)}    (4.29)
where N is the set of all nodes. We call this node the maximum economic stimulator.
In our ideal economic model we assumed every node spent all their credits, so that
A ∝ C. In this case the maximum economic stimulator is the node with highest C.
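Equation 4.29 translates directly into a small selection routine. In this sketch the `nodes` mapping and its (C, A) pairs are an assumed data layout, not a structure defined in the text.

```python
def max_economic_stimulator(nodes):
    """Return the node s maximizing C_s + A_s over all nodes (Eq. 4.29),
    i.e. the peer participating in the most transactions and hence
    generating the most global wealth.

    `nodes` maps a node id to a (C, A) pair: contribution made and
    resources acquired. The layout is an assumption of this sketch.
    """
    return max(nodes, key=lambda s: nodes[s][0] + nodes[s][1])

# Hypothetical per-node (C, A) records:
nodes = {"a": (0.9, 0.2), "b": (0.4, 0.8), "c": (0.5, 0.5)}
max_economic_stimulator(nodes)  # → "b"
```

In the ideal model where A ∝ C, the same routine reduces to picking the node with the highest C.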
4.9 Related Work
Significant effort has been put into understanding how to effectively design reputation
and/or incentive schemes (e.g. [59, 37, 93]). We briefly mention two examples that
closely relate to the method we have presented here.
Huberman and Wu [65] developed a sophisticated economic model of reputation
over time that they use to study the endogenous dynamics of reputations. Geared
towards the economics community, they focus on the stability and equilibria of reputa-
tion for persistent service providers with long-term relationships with a large customer
base.
In [45], Feldman et al. study the problem of whitewashers using the Prisoner’s
Dilemma model for interactions. They propose a reciprocative decision function to
minimize the benefits of whitewashing without excessively punishing new users. In our
model we assume certain parameters are static when this is not necessarily the case.
For example, we assume a global constant value for initial trust T0. Feldman et al. [45]
suggest dynamically varying the initial trust in strangers based upon the behavior of
past strangers. This allows the system to be optimistic when few whitewashers are
present, but quickly throttle back if the number of defecting strangers increases. As
demonstrated in Sections 4.6 and 4.7, it is possible to replace such constants with
variable functions, leading to further discussion and analysis of appropriate functions
and their effects. In addition, their work inspired our transactional model.
4.10 Conclusion
We have presented an economic model of peer behavior in a resource exchange envi-
ronment with reputation management. The model is sufficiently simple and extensible
to support detailed analysis. Our simulations show it adequately captures the prop-
erties of a simplified trading system. Applying our model we have been able to shed
light on various important system design questions.
Using the analytic model, we elucidated the desirable properties of reputation
systems. We have discussed the tradeoffs associated with how provider selection is
influenced by reputation information. We investigated how tying request response
rate to the requestor’s reputation can improve performance for well-behaved users,
especially in times of high load. Finally, we demonstrated that using both transac-
tion quality and quantity in calculating reputation outperforms using only quality or
quantity.
Our model has exposed “hidden” parameters and functions in trading systems, such
as πgt(T) and σ(T, p). Though a system may not explicitly state these parameters,
every system inherently exhibits a particular behavior based on its design choices.
Considering the impact of these choices beforehand will lead to better designs.
Chapter 5
Examining Metrics for Peer-to-Peer Reputation Systems
Until now, we have assumed global shared history is maintained about all peer inter-
actions, allowing the system to detect possibly malicious resource providers. Mecha-
nisms that implement global reputation systems have been proposed in the literature.
EigenTrust [73], for example, collects statistics on peer behaviors and computes a
global trust rating for each peer. However, global history schemes are complicated,
requiring long periods of time to collect statistics and compute a global rating. They
also suffer from the transience of peers and the continual anonymity afforded to
malicious peers through zero-cost identities.
In this chapter, we evaluate the performance of a peer-to-peer resource-sharing
network in the presence of malicious peers using, in contrast to global history schemes,
only limited or no information sharing between peers. We develop various techniques
based on collecting reputation information and present several interesting side-effects
resulting from some of the techniques.
We also study the trade-offs of two identity management schemes for peer-to-peer
networks: a trusted central login server and self-managed identities. We analyze the
performance of each scenario and compare it to the base case with no reputation
system. We look at how new peers should be treated, and present some mechanisms
to further improve system efficiency.
In Section 5.1 we present our system model and its assumptions. Section 5.2 de-
scribes the two threat models we consider. Then, Section 5.3 discusses the reputation
systems used in the experiments and their options. Section 5.4 describes the metrics
used for evaluating our experiments. In Section 5.5 we specify the details of the sim-
ulation environment used for the experiments, and present the results in Section 5.6.
Section 5.12 discusses related work. Finally, we conclude in Section 5.13.
Some of the work presented here has been previously published as [91] and [93].
5.1 System Model
A peer-to-peer system is composed of n peer nodes1 arranged in an overlay network.
In a resource-sharing network each node offers a set of resources to its peers, such as
multimedia files, documents, or services. When a node desires a resource, it queries all
or a subset of the peers in the network (depending on the system protocol), collects
responses from available resource providers, and selects a provider from which to
access or retrieve the resource.
Locating a willing resource provider does not guarantee the user will be satisfied
with its service. Selfish peers may offer resources to maintain the impression of
cooperation, but not put in the necessary effort to provide the service. Worse, certain
nodes may join the network not to use other peers' resources, but to propagate false
files or information for their own benefit.
In our model, each peer verifies the validity of any resource it uses. Accessing
invalid or falsified resources can be expensive in terms of time and money. A system
1In this chapter, we often use the term “node” rather than “peer”. A node refers to a network entity with a unique system identifier. “Node” emphasizes the distinction between such entities and network users. One user may control multiple entities [38], while another person's computer may have been compromised by a worm and forced to run a system node on behalf of an unknown malicious user.
may implement a micropayment scheme requiring users to pay a provider before being
able to verify the validity of the resource. In most cases the user must wait for a file to
be downloaded or a remote computation to conclude and then verify the correctness
of the result. Checking the validity of the file or service response may itself be a
costly but necessary operation in the presence of malicious nodes. Because such an
operation is highly domain-specific, we assume the existence of a global verification
function, V (R) which checks whether resource R is valid. Any node can perform this
verification, but it is indeterminately expensive to compute and may require human
interaction (such as listening to a song after downloading it from a music service to
ensure it is the correct song and uncorrupted) or even a third-party. A resource must
be downloaded or accessed before it can be verified, which costs time and bandwidth.
We include this cost in the verification function, so that it represents the full price of
accessing a bad resource.
To simplify the discussion we present our work in the context of a file-sharing
system, where users query the network, fetch files from other peers, and verify the
files’ content is correct. Nodes hearing the query reply to the query originator if
they have a copy of the file. The originator then fetches copies of the file from
the responders until a valid, or authentic, copy is located. File-sharing networks have
existed for some time and their characteristics have been thoroughly studied, allowing
us to more accurately model deployed, working systems. Though we use the term
“files” in the rest of the chapter, most concepts apply to generic resources.
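The query/fetch/verify cycle described above can be sketched as a simple loop. The `fetch` and `verify` callbacks below are assumptions standing in for the network fetch and the expensive, domain-specific verification function V(R); the function name `fetch_valid_copy` is likewise hypothetical.

```python
def fetch_valid_copy(responders, fetch, verify):
    """Fetch copies of a file from query responders until one passes
    verification.

    responders: iterable of peer ids that answered the query.
    fetch(peer): returns that peer's copy of the file.
    verify(copy): stands in for the global verification function V(R),
    which is expensive and domain-specific, so it is passed in as a
    callback. Returns (peer, copy) for the first authentic copy found,
    or None if every responder's copy is invalid.
    """
    for peer in responders:
        copy = fetch(peer)
        if verify(copy):
            return peer, copy
    return None

# Hypothetical responders, one of which serves a corrupted copy:
copies = {"p1": "corrupted", "p2": "authentic"}
fetch_valid_copy(["p1", "p2"], copies.get, lambda c: c == "authentic")
# → ("p2", "authentic")
```

Because verification is applied lazily inside the loop, its cost is paid once per fetched copy, matching the model in which each bad fetch carries the full price of accessing a bad resource.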
What does it mean for a file to be invalid or fake? The issue of file authenticity is
discussed in the following section. The behavior of peers in the system with respect
to the authenticity of the files they send each other is captured in the threat model,
which is discussed in Section 5.2. Reputation systems, which track node behavior in
order to mitigate the problem of inauthentic files, are covered in Section 5.3.
[Figure 5.1: Sample document and matching query. (a) A document consists of data or content and sufficient metadata to uniquely describe the content; e.g., Title: A Tale of Two Cities, Author: Charles Dickens, Publish Date: April 2002, Publisher: Barnes & Noble Books, Content: “It was the best of times, it was the worst of times...”. (b) A query in a file retrieval system consists of sufficient metadata to uniquely match only one document in the system.]
5.1.1 Authenticity
In our model the unit of storage and retrieval is the document. Every document D
consists of some content data CD and metadata MD which uniquely describes the
content. If two documents contained the same metadata but different content, there
must be some information pertaining to their differences that should be included in the
documents’ metadata to make them unique. For example, different editions of a book
should include the edition and the year published in their metadata. Figure 5.1(a)
illustrates a sample document for a specific edition of Dickens' novel A Tale of
Two Cities. If the only metadata provided were the title and author, the document
might not be unique, since other editions of the same book exist in other languages
or may include notes or pictures.
Given the definition of a document, we can now define document authenticity. A
document is considered authentic if and only if its metadata fields are consistent with
each other and the content. If any information in the metadata does not “agree”
with the content or the rest of the metadata, then the document is considered to be
inauthentic, or fake. For example in Figure 5.1(a), if the Author field were changed
to Charles Darwin, this document would be considered inauthentic, since Barnes &
Noble Books has never published a book titled A Tale of Two Cities written by
Charles Darwin that begins “It was the best of times”.
We assume the existence of a global authenticity function, A(D) which enables
one to verify the authenticity of a document D. Evaluating the function is likely to
be very expensive and may require human user interaction or even a third party. An
example would be if Alice were to download a song from a music sharing service, she
could determine whether it is the correct song by listening to it. We generalize the
document authenticity function to the resource verification function V (R). We also
use the terms “file” and “document” interchangeably.
5.2 Threat Models
As stated above, the threat we are studying is that of a group of malicious nodes
that wish to propagate inauthentic (or fake) copies of certain files. They do not care
if they themselves are unable to query the system for files, thus incentive schemes
fail to deter them. In addition, we assume they may pass false information to other
nodes to encourage them to fetch bad files. We consider three behaviors for malicious
nodes, abbreviated N, L, and C:
N : No misinformation is shared. All nodes give true opinions.
L: Malicious nodes lie independently for their own gain. They give a bad opinion
of everyone else.
C: Malicious nodes collude. They give good opinions of each other and bad opinions
of well-behaved nodes. For this model, we will briefly consider the situation
where some malicious nodes act as “front” nodes by providing only authentic
files (but never from the subversion set) in an attempt to gain the trust of other
nodes and spread their malicious opinions.
The percentage of nodes in the network that are malicious is given by the pa-
rameter πB. We propose two distinct threat models, one in which malicious nodes
target specific files, and another in which they act maliciously towards other nodes by
a certain probability. In both threat models, we assume no other malicious activity,
such as denial-of-service style attacks, are occurring in the network.
5.2.1 Document-based Threat Model
The first threat model is designed to emulate expected real-world malicious activ-
ity. We randomly select a set of files, called the subversion set, that all malicious
nodes wish to subvert by disseminating invalid copies. Each unique file has an equal
probability of being in the subversion set, specified by the parameter pB. We assume
no correlation exists between a file’s popularity and its likelihood to be targeted for
subversion. Malicious nodes also share valid copies of files not in the subversion set.
The effects on performance of varying both πB and pB are discussed in Sections 5.6.1
and 5.6.2.
We assume well-behaved nodes always verify the authenticity of any file they have
before sharing it in the network. Though this assumption may be unrealistic for many
peer-to-peer systems, experiments in which a small fraction of the files provided by
good nodes were invalid demonstrated little effect on our experimental results.
5.2.2 Node-based Threat Model
Our second threat model performs equivalently to the former, but provides us with
interesting avenues of research. We choose to model the node behavior described above
as a probability that a given node will send an authentic copy of a file to another
node requesting the file. For example, good nodes may reply correctly 95% of the
time (0.95)2, while malicious nodes only reply correctly 10% of the time (0.1). These
probabilities can be arranged in a threat matrix, T , where Ti,j contains the probability
that node j will reply with an authentic file to a request from node i. Section 5.4
discusses the advantage of modeling the threat in this form. The threat matrix
characterizes the threat model at a specific time, Ti,j(t), since nodes may behave well
at first and then begin acting maliciously. Though most of our experiments use a
static threat model, some look at dynamic node behavior, such as malicious nodes
behaving well for a period of time, then turning bad (see Sec. 5.6.3).
For the results in this chapter we assume all malicious nodes use the same prob-
ability of replying with a fake file, regardless of the query originator. We reuse the
parameter pB to indicate this probability. Therefore, for any malicious node m,
∀i, Ti,m = 1− pB. Similarly, we use pG to be the probability of a good node sending
an authentic file, thus assuming that a fraction equal to 1 − pG of the files on the
average well-behaved node are corrupted. For any good node g, ∀i, Ti,g = pG. In
Section 5.6.3 we look at the effects of having varying node threat values among the
well-behaved nodes and the malicious nodes.
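Under these assumptions the matrix T is constant down each column, so it can be generated mechanically. The sketch below builds T as a nested dictionary; the function name and data layout are assumptions of this illustration.

```python
def build_threat_matrix(nodes, malicious, pB, pG):
    """Node-based threat matrix T (Sec. 5.2.2).

    T[i][j] is the probability that node j replies to a request from
    node i with an authentic file: 1 - pB for every malicious node m and
    pG for every good node g, independent of the requester i.
    """
    return {i: {j: (1.0 - pB) if j in malicious else pG for j in nodes}
            for i in nodes}

# Hypothetical network: two good nodes and one malicious node.
T = build_threat_matrix(["a", "b", "m"], malicious={"m"}, pB=0.9, pG=0.95)
# T["a"]["m"] ≈ 0.1 and T["b"]["a"] == 0.95
```

A time-varying threat model Ti,j(t), such as malicious nodes behaving well before turning bad, would simply rebuild or patch this matrix at each time step.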
5.3 Reputation Systems
When a node queries the system for a file, it collects all replies (and their source IDs)
in a response set. The node repeatedly selects responses from the set, fetches the copy
of the file offered by the responder and verifies it (using the verification function) until
an authentic copy is found.
As nodes interact with each other, they record the outcome, such as whether
the file received was authentic or not. As a node collects statistics, it develops an
2The 5% accounts for the fake files that are shared before they are verified as authentic by the user.
opinion, or reputation rating, for each node. We make no assumptions of how this
rating should be computed, but since it is used to compare and rank nodes, it should
be scalar (see below for an example).
Each node records statistics and ratings in a reputation vector of length n, where
n is the total number of nodes in the network.3 When a node first enters the system
all entries are undefined. As the node receives and verifies files from peers, it updates
the corresponding entry. Nodes may also share their opinions about other nodes with
each other and incorporate them in their ratings. The reputation vectors can be
viewed as an n × n reputation matrix, R, where the ith row is node i’s reputation
vector. Cell Ri,j would contain node i’s “opinion” of node j.
When a node has collected replies to a query, the reputation system calls a selection
procedure, which takes as input the query response set and the node’s reputation
vector, and selects and fetches a file. The verification function is then calculated on
the selected file. As stated earlier, this may be done programmatically if possible, but
most likely requires presenting the file to the user. The system updates its statistics
for the selected response provider based on the verification result. If verification
failed, the selection procedure is called again with a decremented response set. This
is repeated until a valid file is located, the response set is empty, or the selection
procedure deems there are no responses worth selecting (such as if the remaining
responders’ ratings are too low).
For this chapter we study variants on two reputation systems, one in which peers
share their opinion and one in which only local statistics are used. They are compared
against a random selection algorithm.
Random Selection: Our base case for comparison is an algorithm which ran-
domly chooses from the query responses until an authentic file is located. Since no
3Or more accurately the number of identities in the network (see Section 5.3.1).
knowledge or state about previous interactions is stored, shared or used, this algo-
rithm models the performance of a system with no reputation system.
Local Reputation System: With this reputation system each node maintains
statistics on how many files it has verified from each peer and how many of those were
authentic. Each peer’s reputation rating is calculated as the fraction of verified files
which were authentic. This results in a rating ranging from 0 to 1, with 0 meaning
no authenticity check passed and 1 meaning all authenticity checks passed. When
processing a query, these ratings are used in the selection procedure to select the peer
from which to fetch the file. We consider two procedures in our experiments:
• The Select-Best selection procedure selects the response from the response node
with the highest rating. If the selected response is invalid, the procedure chooses
the next highest-rated node.
• Select-Best will prefer to choose good nodes it has previously encountered and
thus may overload a small subset of reputable peers. To spread out file requests
we propose the Weighted selection procedure, which probabilistically selects the
file to fetch weighted by the provider’s rating. For example, if nodes i and j
both provide replies to node q and R(q, i) = 0.1 and R(q, j) = 0.9, then j is nine
times as likely to be chosen as i. We study load distribution in Section 5.6.2.
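The two selection procedures can be sketched as follows. The dictionary-based rating store, the initial rating `rho0`, and the zero-weight `w0` (a small positive weight, discussed later in this section, that keeps zero-rated peers selectable) all take illustrative values.

```python
import random

def select_best(responders, rating, rho0=0.3):
    """Select-Best: choose the responder with the highest reputation
    rating; unknown peers get the initial rating rho0 (value illustrative)."""
    return max(responders, key=lambda r: rating.get(r, rho0))

def select_weighted(responders, rating, rho0=0.3, w0=0.01):
    """Weighted: choose a responder with probability proportional to its
    rating. Peers rated exactly 0 receive the small zero-weight w0 so
    they are never permanently excluded from selection."""
    weights = [rating.get(r, rho0) or w0 for r in responders]
    return random.choices(responders, weights=weights, k=1)[0]

# With R(q, i) = 0.1 and R(q, j) = 0.9 as in the text's example:
ratings = {"i": 0.1, "j": 0.9}
select_best(["i", "j"], ratings)   # → "j"
random.seed(0)
picks = [select_weighted(["i", "j"], ratings) for _ in range(1000)]
# "j" is chosen roughly nine times as often as "i"
```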
The Select-Best method requires that a node maintain an ordered list of the most
reputable nodes it knows. We call this list a Friend-Cache of maximum size FC.
There are additional benefits to maintaining a Friend-Cache in the local reputation
system. By sending queries directly to nodes in the Friend-Cache before propagating
the query normally, the message traffic of query floods in flat unstructured networks
can be greatly reduced. We call this the Friends-First technique and evaluate it in
Section 5.6.1.
Voting Reputation System: This system collects statistics and determines
local peer ratings just as the local system does. It extends the previous system by
considering the opinions of other peers in the selection stage. When a node, q, has
received a set of responses to a query, it contacts a set of nodes, Q, for their own
local opinion of the responders. Each polled node, or voter v ∈ Q, replies with its
rating (from 0 to 1) for any responder it has interacted with and thus has gathered
statistics. The final rating for each responder is calculated by the formula
ρr = (1 − wQ) R(q, r) + wQ · ( Σ_{v∈Q} R(q, v) R(v, r) ) / ( Σ_{v∈Q} R(q, v) )    (5.1)
For each responder r, the querying node q sums each voter's (v) rating of r, weighted
by q's rating for v. This result is the quorum rating. If node q has no prior knowledge
of r, it uses the quorum rating as r's rating in the selection procedure. If q already
has statistics from prior interaction with node r, the rating for node r is the combi-
nation of the local statistics and the quorum rating, by some given weight called the
quorum weight, wQ. Note that when wQ = 0 the voting system works exactly like the
local system.
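Equation 5.1 can be implemented directly from local rating tables. In this sketch, `R[i][j]` holding node i's rating of node j is an assumed data layout; missing entries model the “undefined” cells of the reputation matrix, and the handling of the no-usable-votes case is a choice of this illustration.

```python
def quorum_rating(q, r, R, voters, wQ):
    """Combined reputation rating of responder r at querying node q (Eq. 5.1).

    R[i][j] is node i's local rating of node j; an entry is absent when i
    has no statistics on j. Only voters v that q has rated and that have
    themselves rated r contribute to the quorum term.
    """
    rated = [v for v in voters if v in R.get(q, {}) and r in R.get(v, {})]
    denom = sum(R[q][v] for v in rated)
    local = R.get(q, {}).get(r)
    if denom == 0:
        return local                 # no usable votes: local rating only
    quorum = sum(R[q][v] * R[v][r] for v in rated) / denom
    if local is None:
        return quorum                # no prior interaction: quorum alone
    return (1 - wQ) * local + wQ * quorum

# Hypothetical ratings: q trusts voter v1 fully and v2 half as much.
R = {"q": {"v1": 1.0, "v2": 0.5, "r": 0.8},
     "v1": {"r": 0.6}, "v2": {"r": 0.9}}
quorum_rating("q", "r", R, voters=["v1", "v2"], wQ=0.5)  # ≈ 0.75
quorum_rating("q", "r", R, voters=["v1", "v2"], wQ=0.0)  # → 0.8 (local system)
```

The wQ = 0 case reproduces the purely local system, as the text notes.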
Until now we have not discussed how the nodes in the quorum Q are selected to
give their opinion. We consider two methods of selecting voters. The first method is
to ask one’s neighbors in the overlay topology. These are typically the first peers a
node is introduced to in the network and, though neighbors may come and go, the
number of voters will remain relatively constant. The other method is to ask peers
from whom one has fetched files and who have proven to be reputable. This group
would consist of the peers with the f highest local ratings at node q. The former
quorum selection we call Neighbor-voting while the latter is referred to as Friend-
voting. In the Friend-voting scheme we reuse the Friend-Cache described above to
maintain our list of voters. The cache has a maximum size of FC. We study the
effects of varying FC in Section 5.6.2.
Above, we describe the source node as contacting each voter for its opinion
on each and every query once it has collected the responses. Realistically, nodes
may instead periodically exchange reputation vectors with each other. If the rate at
which reputation vectors are exchanged is as frequent as once per query, then the two
methods are equivalent. For simplicity, we assume this equivalence in our simulator
and model the system as acquiring voter opinions at the time of the query.
Both reputation systems have two additional parameters. Since all entries in R
are initially undefined, an initial reputation rating ρ0 must be assigned to nodes for
which no statistics are available, to be used for comparing response nodes in the
selection stage. Analysis of different values for ρ0 is provided in Section 5.6.1.
In some domains it may be easy for malicious nodes to automatically generate
fake responses to queries. In situations where a node is querying for a rare document,
it may receive many replies, all of which are bad. To prevent the node from fetching
every false document and calculating V (R), we introduce a selection threshold value
(ρT ). Any response from a node whose reputation rating is below this threshold
is automatically discarded and never considered for selection.⁴ In Section 5.6.1 we
analyze the effects of varying the threshold value on performance.
In the weighted selection procedure a response from a node with a rating of 0
would never be chosen, since it has a weight of 0.⁵ The Weighted procedure differs
from the Select-Best procedure because it may choose any response from the response
set, with some probability (albeit small). To prevent nodes from being permanently
excluded from the selection process by the Weighted procedure, all nodes with a
reputation rating of 0 are assigned an artificial weight we call the zero-weight (w0) in
the selection procedure. In addition, a positive value avoids issues when all nodes in
the response set have a rating of 0. For the experiments performed, the ideal and local
Weighted reputation systems used a w0 of 0.01 in their weighted selection procedure.
The value 0.01 was chosen because it is positive, but significantly smaller than
any other node behavior value, such as pB. Experiments were performed using both
a zero-weight of 0 (no zero-weight) and 0.01; the results were similar.

⁴ New nodes are automatically exempt from being discarded, even if ρ0 < ρT.
⁵ A node would receive a rating of 0 if the first file fetched from it were inauthentic.
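The Weighted selection stage with these three parameters can be sketched as follows (a hypothetical helper, not the simulator's actual code; `ratings` maps node id to local rating, and nodes absent from it are treated as new, per footnote 4):

```python
import random

def weighted_select(responses, ratings, rho_0=0.3, rho_t=0.2, w_0=0.01,
                    rng=random):
    """Weighted selection with threshold and zero-weight (a sketch).
    Unknown responders get the initial rating rho_0 and, per footnote 4,
    are never discarded; known responders rated below rho_t are dropped;
    responders rated exactly 0 get the artificial weight w_0."""
    eligible = [(n, ratings.get(n, rho_0)) for n in responses
                if n not in ratings or ratings[n] >= rho_t]
    if not eligible:
        return None          # every response fell below the threshold
    nodes = [n for n, _ in eligible]
    weights = [r if r > 0 else w_0 for _, r in eligible]
    return rng.choices(nodes, weights=weights, k=1)[0]
```

Note that with a nonzero threshold the zero-weight never comes into play, since 0-rated nodes are already discarded; it matters only for the ρT = 0 variants.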
Finally, we present a prescient reputation system, which is applicable only when
using the node-based threat model:
Ideal: The ideal case bases its selection decisions on T. It represents the best
possible performance a reputation system can achieve by using the actual threat model
in selecting the document to present. Both the Select-Best and Weighted selection
procedures are evaluated for the ideal system.
This system is not realistic, but serves as a guide to the ideal performance of
the previously defined reputation systems, should the reputation matrix R converge to
the actual threat matrix T. Results for this system are presented in the third results
section, relating to the node-based threat model.
5.3.1 Identity
Maintaining statistics of node behavior requires some form of persistent node identi-
fication. In order to build reputation, a user or node must have some form of identity
which is valid over a period of time. The longer this period of time, and the more
resistant the identity is to spoofing, the more accurately the reputation system can
rate nodes [116].
The simplest way to identify a node is to use its IP address. This method is
severely limited because addresses are vulnerable to IP-spoofing and peers are often
dynamically assigned temporary IP addresses by their ISPs. Instead, a more reliable
method may be to use self-signed certificates. This technique allows well-behaved
nodes to build trust between each other over a series of disconnections and recon-
nections from different IP addresses. Although malicious nodes can always generate
new certificates, making it difficult to distinguish them from new users, this technique
prevents them from impersonating existing well-behaved nodes.
Some argue that the only effective solution to the identity problem in the presence
of malicious nodes is to use a central trusted login server, which assigns a node identity
based on a verifiable real-world identity. This would limit a malicious node’s ability
to masquerade as several nodes and to change identities when its misbehavior is
detected. It would also allow the system to impose more severe penalties for abuse
of the system.⁶
For simplicity, we generally assume that all nodes use the same identity for their
lifetime. This mimics a system with a centralized login server, assigning unforgeable
IDs based on real-world identities. This scheme ensures users cannot (easily) change
identities to hide their misbehavior, by limiting each real-world entity to one network
ID. In this system the trusted server need not know which system ID refers to which
real-world ID [48].
The second model relies on users generating their own certificates and public/private
key pairs as forms of identification. Though these identities are robust to spoofing,
any user can easily discard an identity and generate a new one. Using self-managed identities makes the
system vulnerable to whitewashing, where malicious nodes periodically change their
identities to hide their misbehavior [83]. This is modelled by erasing all informa-
tion gathered on a malicious node after it sends an invalid document to the query
source node for verification. If node M sends node S a fake document, all information
collected by nodes (including S) about M is erased. Essentially all nodes “forget”
about bad nodes. We abbreviate the references to the login server and self-managed
identities scenarios as Login and Self-Mgd, respectively. These two identity schemes
are compared in Section 5.6.1 of the results.
⁶ For example, a person might have to use a valid credit card to enter the system, allowing the system auditors to debit their card if they are caught misbehaving.
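The whitewashing model amounts to deleting one column of the (sparse) reputation matrix. A minimal sketch, assuming R is stored as a dict of dicts `{observer: {subject: rating}}` (our own representation):

```python
def whitewash(R, m):
    """Self-Mgd whitewash: after malicious node m is caught uploading a
    fake file, every node 'forgets' m -- clear column m of R.  The text
    specifies clearing only the column, so m's own row is left intact."""
    for row in R.values():
        row.pop(m, None)
```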
In Section 5.6.2 we experiment with a slightly different whitewashing scenario.
Instead of each malicious node constantly changing identities, malicious nodes white-
wash periodically. We conduct experiments using both identity models and distin-
guish the two as the whitewashing and static scenarios. In our results, the default
identity model is the static model, unless whitewashing (WW ) is specified.
5.4 Metrics
The main objective of a reputation system is to reduce the number of documents
the user must look at before finding the correct document for their query. We call
this the efficiency of the reputation system. This is equivalent to minimizing the
number of times the authenticity function is calculated in the selection stage. This
metric seems the most practical and direct measure of a particular selection heuristic’s
performance.
When studying reputation systems it is necessary to determine what metrics best
measure the success of a particular system. Here we present the metrics we use
to evaluate our experimental results. We ran simulations of our system model for a
period of time and gathered statistics at the end. These statistics are used to compute
the metrics; they are summarized in Table 5.1.

Table 5.1: Simulation statistics and metrics

  Metric  Description
  qtot    # of queries generated
  qgood   # of queries with an authentic file in at least one response
  qsucc   # of successful queries, where the selection procedure located an authentic file
  Vi      # of verification function evaluations performed on files fetched from node i
  nG      Number of good nodes in the network
  nfld    Average number of nodes that receive a query through flooding
  qFC     Number of queries successfully answered by a node in the Friend-Cache
  V       # of verification function evaluations
  VG      Total number of verification function evaluations of files fetched from good nodes
  rV      Verification ratio
  dTR     Threat-reputation distance
  rmiss   Miss rate
  ℓi      Load on node i
  ℓG      Average load on good nodes
  MTrel   Relative message traffic of Friends-First w.r.t. flooding
From among all the queries generated during execution (qtot) we are specifically
interested in the number of good queries (qgood) and the number of successful queries
(qsucc). A good query is any query whose response set includes at least one authen-
tic copy of the queried file, even if no authentic copy was located by the selection
procedure. A successful query is a query that results in an authentic copy of the
requested file being selected by the selection procedure. The relation between the
three statistics is given by the following equation:
qtot ≥ qgood ≥ qsucc (5.2)
For the reputation systems we are testing, if qsucc always equals qgood then the
system is considered to be 100% effective.
5.4.1 Efficiency
When designing reputation systems our primary concern is to reduce the number of
files which must be fetched and verified before locating a valid query response. During
execution we record the number of file verifications supplied by each node i, which we
refer to as Vi. From this data we compute the total number of verification function
evaluations, V, as

    V = Σ_{i=1}^{n} Vi        (5.3)
But V alone is insufficient. A system could ignore every response, report failure
on every query, and have V = 0. To account for the fact that some systems may incur
more verification checks, but locate valid files for more queries, we divide V by the
number of successful queries (qsucc). We call this metric the verification ratio (rV ).
    rV = V / qsucc        (5.4)
The lower the value of rV , the more efficient the system is. The best possible per-
formance would be a prescient algorithm which always chose a valid file if one was
available in the response set, and ignored all responses if not. This would give an rV
of 1. The verification ratio measures the efficiency of a reputation system and is our
principal metric of system performance.
5.4.2 Effectiveness
While systems reduce the number of file fetches and authenticity function compu-
tations to be more efficient, it often comes at the sacrifice of effectiveness. The
effectiveness of a search system relates to its ability to locate an answer, given that
one exists somewhere in the network. A reputation system’s effectiveness can be con-
sidered to be the fraction of queries for which an authentic file is selected, given that
one exists in the response set. We call this metric the miss rate. This measurement
of effectiveness is only accurate for systems in which the reputation algorithm does
not interfere with query response or query/response propagation, but it is accurate
for the systems described here.
Some reputation systems with selection thresholds may not locate an authentic file
even when one is available, and thus are not completely effective. We are interested
in measuring how often such systems report a failure to a good query. We introduce
the miss rate (rmiss), given by the equation
    rmiss = (qgood − qsucc) / qgood        (5.5)
The miss rate gives the fraction of good queries that were missed. A system which
returns a valid file for every good query will have a miss rate of 0. A system which
never returns a good response would have a miss rate of 1. Therefore, the miss rate
is inversely related to the effectiveness of the reputation system.
5.4.3 Load
We are also interested in measuring the load on the network under the various repu-
tation systems and threat models. We are primarily concerned with the load on the
well-behaved nodes in the network from file fetches. If each file is transferred only
when it is selected to be verified, then the number of files a node has uploaded is
equal to the number of verification function evaluations of files from that node. We
define the load on node i (ℓi) as the number of verification checks on files it supplies
normalized by the total number of queries, or
    ℓi = Vi / qtot        (5.6)
We measure the average load on the network as the average load across well-
behaved nodes, ℓG. Let G be the set of all good nodes in the network and let nG be
the total number of good nodes. Therefore, the average load is
    ℓG = ( Σ_{i∈G} ℓi ) / nG        (5.7)
Network load is analyzed in Section 5.6.2.
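Equations 5.4 through 5.7 follow directly from the raw counters; a sketch (counter names follow Table 5.1, but the function itself is ours):

```python
def compute_metrics(V_per_node, good_nodes, q_tot, q_good, q_succ):
    """Verification ratio (5.4), miss rate (5.5), and average load on
    good nodes (5.6, 5.7) computed from per-run statistics.
    V_per_node maps node id -> # of verifications of its files."""
    V = sum(V_per_node.values())                       # Eq. 5.3
    r_V = V / q_succ                                   # Eq. 5.4
    r_miss = (q_good - q_succ) / q_good                # Eq. 5.5
    loads = [V_per_node.get(i, 0) / q_tot for i in good_nodes]  # Eq. 5.6
    l_G = sum(loads) / len(good_nodes)                 # Eq. 5.7
    return r_V, r_miss, l_G
```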
5.4.4 Message Traffic
To measure the message efficiency of the Friends-First method we compare the net-
work query message traffic generated by this method to the default practice in un-
structured networks of flooding the network for each query. We calculate the relative
message traffic as
    MTrel = (Number of Friends-First Messages) / (Number of Flooding Messages)        (5.8)
and compute it using the system parameters and the statistics gathered from the
Select-Best experiments. Note that the number of Friends-First messages includes
messages sent directly to friends and messages from query floods, resulting when the
query goes unanswered by friends.
5.4.5 Threat-Reputation Distance
The final metric we introduce applies only to the node-based threat model defined in
Section 5.2.2. If a node's reputation rating is expressed as the perceived probability
that the node returns an authentic file, then the reputation matrix R approximates
the threat matrix T. Let the standardized reputation matrix R′ be an n×n matrix
such that R′i,j is the probability with which node Ni expects a file from Nj to be
authentic.⁷ For many reputation systems R′ is equal to or easily derived from R,
and R′ may converge to T;⁸ for some reputation systems, R itself converges to T. How
quickly this convergence takes place may be a useful metric. Since R is most likely a
sparse matrix, an appropriate matrix distance algorithm must be used.
We sum the square of the differences between each defined value of R′ and T ,
take the square root, and divide by the number of defined values in R′. This metric
we call the threat-reputation distance, or T-R distance (dTR) for short, and can be
mathematically expressed as
    dTR = √( Σ_{(i,j) : R′i,j defined} (Ti,j − R′i,j)² ) / ( Σ_{(i,j) : R′i,j defined} 1 )        (5.9)
If no cells in R′ are defined, the T-R distance is undefined. For the ideal reputation
system the T-R distance is always 0 (by definition of the ideal system).
⁷ For the local and voting reputation systems we simulate R′ = R.
⁸ Ti,j is the a priori probability that j sends an authentic file to i. Ri,j is the probability with
which i expects j to reply with a valid file, based on past experience.
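Computed over the defined cells only, Equation 5.9 is a few lines of code. A sketch, with R′ stored sparsely as a dict keyed by (i, j) pairs (our own representation, not the simulator's):

```python
import math

def tr_distance(T, R_prime):
    """Threat-reputation distance (Eq. 5.9): square root of the sum of
    squared differences over the defined cells of R', divided by the
    number of defined cells; None when no cell is defined (undefined)."""
    if not R_prime:
        return None
    sq_sum = sum((T[i][j] - r) ** 2 for (i, j), r in R_prime.items())
    return math.sqrt(sq_sum) / len(R_prime)
```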
Table 5.2: Configuration parameters and default values

  Param.  Description                                                         Value
  τ       Simulation runtime                                                  1000
  n       Number of nodes                                                     10,000
  dmax    Maximum allowed degree of a node in the network                     150
  davg    Average degree of a node in the network                             ≈ 3.5
  TTL     Distance from source queries are propagated                         5
  πB      Percentage of malicious nodes in the network                        0.3
  pG      Probability of a good node replying with an authentic file          0.99
  pB      Probability of a malicious node replying with a fake file           0.9
  ρ0      Initial reputation rating used for nodes with no prior interaction  0.3
  ρT      Selection threshold; nodes with reputation ratings below ρT are
          not considered in the selection procedure                           0.2, 0.15
  w0      Weight assigned to nodes with a reputation rating of 0              0.01
  αF      Zipf exponent for file popularity distribution                      1.2
  αQ      Zipf exponent for query popularity distribution                     1.24
  PQR     Number of popular queries                                           200
  αPQ     Zipf exponent for most popular queries                              0.63
  FC      Size of Friend-Cache used with Friends-First method                 -
  wQ      Quorum weight: weight given to voters' opinions with respect to
          local statistics                                                    0.1
5.5 Simulation Details
The following section describes the specific component models, parameters, and met-
rics used in the simulations. The key parameters for the simulations are summarized
in Table 5.2 along with their default values. Table 5.3 lists the various statistical
distributions used, along with the default values for their parameters. The results of
the simulations are presented and discussed in the following section.
We evaluate the reputation systems using our own P2P Simulator based on our
system model. The simulations were run on a dual 2.4 GHz Xeon processor machine
with 2GB of RAM. Each data point presented in the results section represents the
average of approximately 10 simulation runs with different seeds.
Table 5.3: Distributions and their parameters with default values

  Description            Distribution  Parameters (with default values)
  Network topology       Power-Law     n = 10,000, dmax = 150, β ≈ 1.9
  Query popularity       Zipf          αPQ = 0.63, PQR = 250, αQ = 1.24
  Query selection power  Zipf          αF = 1.2
Though most of our findings apply to any peer-to-peer network, for our experi-
ments we construct a Gnutella-like flat unstructured network. Specifying the overlay
topology is necessary for studying certain issues, such as Neighbor-voting and mes-
sage traffic reduction. Studies of unstructured peer-to-peer networks have shown
their topologies are power-law networks [44]. We use randomly generated, fully con-
nected power-law networks with an average node degree of davg ≈ 3.1. For the local
reputation system experiments, we used networks of size n = 10, 000 nodes with a
maximum node degree of dmax = 150. The voting reputation system experiments
used 1000 nodes with a maximum node degree of dmax = 50.⁹ Queries are propagated
to a TTL of 5. For simplicity we assume the network structure does not change,
though we simulate a node leaving and a new node taking its place in the network.
At each timestep a query is generated and completely evaluated before the next
timestep. Therefore, a simulation run of 100 timesteps processes 100 queries.
For the results dealing solely with the local reputation system, which does not ex-
change reputation information between peers, all queries are sent from a single node
randomly chosen at startup. Each simulation seed selects a different node. For exper-
iments using the voting-based system a node is randomly chosen as the query source
at each timestep.
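The simulation driver implied by this description is a simple sequential loop; a hypothetical sketch (the `sim` hooks are placeholders for the simulator's components, not its actual API):

```python
import random

def run(sim, timesteps, single_source=None, rng=random):
    """One query per timestep, fully evaluated before the next begins.
    With single_source set (local-reputation experiments) every query
    comes from that node; otherwise a random source is drawn each step."""
    for _ in range(timesteps):
        src = single_source if single_source is not None else rng.choice(sim.nodes)
        query = sim.generate_query(src)
        responses = sim.flood(query)            # propagate up to the TTL
        sim.select_and_verify(src, responses)   # selection stage; updates R
```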
The simulation component most specific to file-sharing (as opposed to general
resource-sharing) is our query model. It is similar to the one proposed in [138]. We
assume a total of 100,000 unique files. The number of copies of each file in the system
⁹ We have experimented with larger networks. Results are not shown, but observed trends are similar to what is reported here.
is determined by a Zipf distribution with α = 1.2. Each node is assigned a number
of files based on the distribution of shared files collected by Saroiu et al. [119]. The
query popularity distribution determines which file each query searches for. For this
distribution we use a two-part Zipf distribution with an α of 0.63 from rank 1 to
250, and an α of 1.24 for the remaining ranks. This distribution better models query popularity in existing
peer-to-peer systems [124]. Though our query model is based on data collected on
today’s file-sharing networks, we expect networks providing other content or services
to have similar distributions.
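The two-part query popularity distribution can be sampled by building explicit rank weights. A sketch (the continuity scaling at the cutoff rank is our assumption; the text does not specify how the two Zipf parts are joined):

```python
import random

def two_part_zipf_weights(n_ranks, alpha_head=0.63, alpha_tail=1.24,
                          cutoff=250):
    """Unnormalized weights: rank**-alpha_head up to the cutoff rank,
    rank**-alpha_tail beyond it, with the tail scaled so the two parts
    meet continuously at the cutoff (an assumption on our part)."""
    scale = cutoff ** (alpha_tail - alpha_head)   # makes the parts meet
    return [r ** -alpha_head if r <= cutoff else scale * r ** -alpha_tail
            for r in range(1, n_ranks + 1)]

def sample_query_ranks(n_ranks, k, rng=random):
    """Draw k query ranks according to the two-part Zipf distribution."""
    w = two_part_zipf_weights(n_ranks)
    return rng.choices(range(1, n_ranks + 1), weights=w, k=k)
```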
In Section 5.6.2, we model node turnover by having a random node leave the net-
work and a new node enter on average once per query from a single node. Therefore,
a turnover occurs every timestep for the single query source experiments and every
1000 timesteps for the multiple query source experiments in a 1000 node network. For
the reputation system, this is equivalent to clearing all information in the ith row and
column of R when node i leaves. For the whitewash experiments in Section 5.6.1,
each malicious node changes its identity after uploading a fake file to any node, by
clearing the column of R relating to the malicious node. In Section 5.6.2, all
malicious nodes change identity every 10 queries from a single node (or every 10,000
queries with multiple query sources).
Unless otherwise stated, we use a selection threshold of 0.2 in all experiments
reported in this chapter. We use an initial reputation rating of 0 for the whitewashing
experiments, and 0.3 otherwise. All experiments with constant πB, pB, and pG were
run with πB = 0.3, pB = 0.9, and pG = 0.99.
5.6 Results
The results section is divided into three components. First, we look solely at the local
reputation system and compare the two identity models in detail and measure the
effects of the parameters common to both reputation systems. We also evaluate the
Friends-First technique for message traffic reduction.
In Section 5.6.2, we focus on the voting-based reputation model, look at the effects
of the system parameters specific to it and also analyze the distribution of load on
the well-behaved peers in the network. We also look at the effects of the malicious
opinion-sharing (N, L, C).
The first two parts use only the document-based threat model defined in Sec-
tion 5.2.1. In the final part, we look at the performance of the node-based threat
model (see Sec. 5.2.2) and compare it to our results from the document-based threat
model.
5.6.1 Local Reputation System
In this section we address several of the questions brought up in the previous sections.
Specifically:
1. Is an initial reputation rating of zero always preferable to nonzero?
2. What is the cost in efficiency (as defined here) for using self-managed identities
in lieu of a trusted login server?
3. Is there a benefit to using a selection threshold?
4. Can maintaining a Friend-Cache reduce message traffic?
Here we compare the two extreme identity models, the Login model, in which
nodes do not change identities over time, and the Self-Mgd model, in which malicious
nodes change identities after every fake file they upload to a peer. The experiments
in this section were all conducted with no node turnover. Each simulation was run
for 1000 timesteps (unless otherwise noted).
[Figure 5.2: Efficiency for varying ρ0. Lower value is better; 1 is optimal. Verification ratio (rV) vs. initial reputation rating (ρ0) for Login Weighted, Login Best, Self-Mgd Weighted, and Self-Mgd Best.]
New Node Reputation
In this experiment we varied the initial reputation rating (ρ0) used by the local
reputation system for any node from which we have not received a document and
checked its authenticity. Our experiments demonstrate that, though a reputation
system performs similarly for both identity models for a ρ0 of 0, efficiency in the login
server scenario can improve substantially by increasing ρ0, while performance in the
self-managed identities scenario will only worsen.
Figure 5.2 shows that for the Login scenario, a nonzero initial reputation rating
(e.g., ρ0 = 0.4) performs better by a factor of 1.5 in terms of minimizing the number
of authenticity checks computed. If malicious nodes cannot change their identities to
pose as new nodes after misbehaving, there is a benefit to selecting new nodes over
previously encountered malicious nodes.
If malicious nodes are allowed to change their identities, as in the self-managed
identities scenario, they will usually be treated as new nodes with a reputation rating
of ρ0 in the selection procedures. We would expect that varying ρ0 would have a
significant effect for Self-Mgd. Figure 5.2 shows that increasing ρ0 decreases the
efficiency when using the Weighted procedure, though unexpectedly, the Select-Best
procedure is not affected (until ρ0 = 1). For example, from a ρ0 of 0.0 to 0.5, the
verification ratio (the average number of authenticity checks performed per query)
of the Weighted method goes from 9.7 to 25.1, while Select-Best stays constant at
9.4. Since the Weighted method considers all nodes (weighted by their ratings) in
the selection stage, it is important to lower the weight of new nodes, which are more
likely to be malicious nodes in the scenario of self-managed identities than in that of
a login server. The results support our intuition. The Select-Best method's unvaried
performance across all values of ρ0 can be attributed to the fact that often a node
receives a reply from a peer which has previously provided an authentic document, in
which case the node will always choose the reputable source over any unknown peer.

[Figure 5.3: Varying selection threshold values. Verification ratio (rV) vs. selection threshold (ρT) for Login Weighted, Login Best, Self-Mgd Weighted, and Self-Mgd Best.]
From these experiments we selected 0.3 as the default value for ρ0 for Login. Many
of the following experiments were additionally performed with other values of ρ0, but
the results did not vary noticeably from those at ρ0 = 0.3 and are not discussed.
For Self-Mgd simulations we use only ρ0 = 0, which clearly performed best for the
Weighted method.
Selection Threshold
Figure 5.3 shows tests varying the value of the selection threshold for both the
Weighted and Select-Best variants of the local reputation system. The verification
ratio is plotted as a function of ρT . As stated above, ρ0 was set to 0.3 for Login and
0 for Self-Mgd.
The result is surprising. For Login all values of ρT above 0 resulted in almost equal
performance, yet significantly better than ρT = 0 (rV of 6.3 for ρT = 0 down to 2.0
for ρT > 0).¹⁰ Because malicious nodes always reply with a copy when a document
in the subversion set is queried for, the vast majority of responses in the response set
come from malicious nodes supplying bad copies. When searching for rare content, it
is common to receive only bad copies from malicious nodes. The threshold prevents
nodes from repeatedly fetching and testing documents from peers which have proven
malicious or unreliable in the past. The drawback of the selection threshold is a
decrease in query effectiveness (discussed in the following section).
For Self-Mgd varying ρT had no effect. Remembering which nodes have lied in
the past is of no use if those nodes can immediately change their identities to hide
their misbehavior. The threshold may be useful if nodes were motivated to maintain
their identities, perhaps by providing incentives for building reputations.
In successive tests any system variant using a selection threshold uses a ρT value
of 0.2 or 0.15 unless otherwise stated. The primary simulations use ρT = 0.2. However,
early experiments, presented later in the results section, used a selection threshold of
0.15. As this and following experiments demonstrate, there is negligible difference in
results between using a selection threshold of 0.15 or 0.2.
Performance under Various Threat Conditions
In this section we look at system performance under different threat model parameter
values. Specifically, we demonstrate how overall efficiency is affected by varying the
percentage of malicious nodes in the system (πB) and the probability of a unique
document being in the subversion set (pB). Eight different variants of the local
reputation system were tested. These eight variants are derived from three system
parameters: the identity model (Login or Self-Mgd), ρT (0 or 0.2), and the selection
procedure (Weighted or Select-Best).

¹⁰ Though almost the same, the values of rV for different nonzero ρT for a given reputation system variant are not exactly identical.

[Figure 5.4: Efficiency comparison. (a) Verification ratio (rV) vs. percentage of bad nodes (πB). (b) Verification ratio (rV) vs. percentage of unique documents in the subversion set (pB). Each panel shows the Base case and the eight variants: {Login, Self-Mgd} × {Weighted, Best} with ρT ∈ {0.0, 0.2}.]
The graphs in Figure 5.4 present the system performance for varying πB and pB.
The results show that, overall, a trusted login server significantly reduces the cost
of ensuring authenticity over self-managed identities, roughly by a factor of 5.5. Yet,
using a reputation system with the Self-Mgd model outperforms having no reputation
system at all (Base curve in Figure 5.4) by an additional factor of 3.5.
For both graphs, the curve corresponding to the base case, of purely random
selection, quickly climbs out of the range of the graphs. In Figure 5.4(a) the base
curve increased steadily to 46 at πB = 0.4, 3.5 times the verification ratio of the
Self-Mgd variants and up to 20 times the rV of Login using a selection threshold. In
Figure 5.4(b) the base curve climbed to 35 at pB = 1, 3.4 times the rV
of the Self-Mgd variants and approximately 17 times that of Login with ρT = 0.2.
This means one would expect to have to fetch and test on average 20 times as many
query responses in order to find a valid response! Even using self-managed identities,
a rudimentary reputation system provides significant performance improvements over
no reputation system. Even then users would expect to fetch over ten bad copies for
every good copy they locate (for πB > 0.3). In contrast, a peer using a selection
threshold in a login server environment would only expect to encounter one or two
fakes for every authentic file, no matter the level of malicious activity in the network.
Figures 5.4(a) and 5.4(b) show that the Select-Best and the Weighted proce-
dures perform similarly. Overall the Select-Best method outperformed the Weighted
method, especially in the Login model. Though the Select-Best performed well and
served to mitigate the performance variance of other parameters (such as the initial
reputation rating), it does have drawbacks. A study of the load on well-behaved nodes
(measured as the number of documents fetched from a node) showed a much more
skewed distribution for the Select-Best variants than the Weighted variants. In fact,
the highest loaded good nodes in the Select-Best simulations were being asked for 2.5
times as many documents as the highest loaded nodes in the Weighted simulations. At
the bottom of the distribution, hundreds of nodes that were accessed in the Weighted
simulations were never accessed in the Select-Best simulations. This dramatic skew in load
distribution can result in unfair overloading, especially in a relatively homogeneous
peer-to-peer network. We study load distribution in detail in Section 5.6.2.
Both graphs illustrate that the selection threshold is useless in the Self-Mgd sce-
narios, but provides a large performance boost for Login. This supports our findings
in the previous section, and demonstrates it was not an artifact of the selected values
of the threat parameters. Using a selection threshold system efficiency is relatively
unaffected by variations in πB and pB.
Measurements of effectiveness in these experiments (only applicable to a nonzero
selection threshold) resulted in a miss rate well below 0.001 (0.1%) for the experiments
varying πB at a constant pB of 0.9. For the experiments varying pB, the miss
rate increases as pB decreases, but always remains below 0.0025 (0.25%). As pB
decreases, the subversion set decreases. Because malicious nodes become more likely
to provide authentic documents, but tend to fall under the threshold, the effectiveness
of the system decreases. For most applications these miss rates are acceptable, especially
when compared to the increased efficiency offered by the selection threshold.
Message Traffic
Now, we present our experiments on mitigating message traffic using the Friends-First
technique. As explained earlier, Friends-First takes advantage of the Friend-Cache to
try and locate a positive query response among the known reputable nodes, before
querying the entire system. As we will see, in a flood-based querying system, this can
result in 85% less message traffic!
Before presenting the results, we redefine the general formula for relative message
traffic, given in Equation 5.8, in terms specific to our model. The numerator is the
total message traffic for Friends-First. For all queries, messages are sent to all nodes in
the Friend-Cache (qtot ·FC). In addition there is the cost in messages of flooding the
network when a valid response is not located from the Friend-Cache. The number
of messages generated in the network to propagate a query will be at least equal
to the number of nodes which hear the query, and most likely much larger due to
several occurrences of two nodes forwarding the query to the same node. We roughly
estimate the number of messages generated by a query flood as the average number of
nodes reached by a query flood (nfld). Therefore, the additional cost of flooding for
Friends-First would be the number of queries not answered by a node in the Friend-
Cache (qtot− qFC) times nfld. The denominator is the number of messages generated
assuming every query is a flood (qtot · nfld).
Note that FC is greater than or equal to the actual number of nodes in the Friend-
Cache at any time, so not all queries will have FC nodes to query directly. Let FCi
be the number of nodes in the Friend-Cache after i − 1 queries. FCi is the number
of messages sent directly to reputable nodes for the ith query. Note that for all i
5.6. RESULTS 167
[Figure: Relative Message Traffic (MTrel, solid curve, left y-axis) and Maximum Number of Nodes in Friend-Cache (dashed curve, right y-axis) vs. Size of Friend-Cache]

Figure 5.5: Relative message traffic of Friends-First and maximum Friend-Cache utilization as a function of the size of the cache.
FCi ≤ FC and FC1 = 0 since all nodes are initially unknown. We can define our
message traffic metric as

    MTrel = ( Σ_{i=1}^{qtot} FCi + (qtot − qFC) · nfld ) / (qtot · nfld)        (5.10)
Note that this is still a conservative calculation of relative traffic since nfld is less
than or equal to the total number of messages generated due to a query flood.
We conducted these experiments using the local reputation system and the single-
source query generator. For the results in this section, we ignored whitewashing and
node turnover. We ran simulations for various numbers of queries (1000, 10,000,
50,000, etc).
The solid line in Figure 5.5 plots the relative message traffic of Friends-First with
respect to regular flooding (MTrel) as a function of the maximum Friend-Cache size,
after 50,000 queries. We see that, as the size of the Friend-Cache increases, the query
message traffic drops quickly to approximately MTrel=0.15 until it reaches a point
where growing the cache no longer provides any benefit. This means that the Friends-
First method is generating only 15% as much message traffic as flooding without any
loss in effectiveness!
Interestingly, for cache sizes greater than 120, the traffic overhead actually in-
creases slightly, before levelling off at around 200. For small FC, increasing the
cache size greatly reduces message traffic because of the high likelihood of locating
future query answers at the additional nodes stored in the cache. Every additional
query satisfied by a node in the cache saves the system a query flood, outweighing the
cost of the additional messages sent to the new nodes in the Friend-Cache for every
query. But when FC is large, any node added to the cache will likely be sharing
few files (and thus will rarely provide a response in the future). If the node had more files,
it would have been located earlier and already be in the Friend-Cache. We find that
well-behaved nodes sharing many files tend to be located quickly and be placed in the
Friend-Cache early. Nodes added later offer fewer files (approx. 5) and rarely provide
any further query responses, thus wasting bandwidth on query messages sent directly
to them.
As stated earlier, we performed experiments for varying lengths of time. In our
shorter simulations (e.g. 1000 queries) there was no rise in relative traffic for large
FC. Instead MTrel drops quickly and levels off, with no single minimum. These
shorter simulations end before the Friend-Cache begins collecting useless nodes with
very few files. Runs of 20,000 and 100,000 queries, on the other hand, also showed a
preferred FC around 130. This result supports our hypothesis that, once only small
nodes remain outside the cache, adding a node to the cache increases overall traffic
because the cost of sending them a direct query outweighs the slim probability of
their answering a request and avoiding a query flood.
In studying the efficiency of Friends-First, it is useful to consider the utilization
of the Friend-Cache. The right y-axis of Figure 5.5 corresponds to the number of
reputable nodes in the Friend-Cache when the simulation ended, represented in the
graph by the points on the dashed line. Notice that the number of nodes in the
cache increases linearly until MTrel reaches the minimum, and levels off when MTrel
levels off. Interestingly, the value it reaches is 126, approximately the same value
as the optimal cache size. We believe this is not a coincidence. This value is an
average of several simulation runs with different seeds. Some runs had lower values
and others higher, but it does indicate that, on average, the system did not use
responses from more than 130 reputable nodes. Thus, in the simulations where more
than 130 reputable nodes were located and placed in the cache, we would not expect
them to provide any further useful unique responses. Therefore, limiting the Friend-
Cache to a size of 130 prevents useless nodes from entering the cache and worsening
performance.
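A capped Friend-Cache of this kind is easy to sketch as a data structure. The dissertation does not give an implementation, so the class below is a hypothetical design (names are ours): it tracks reputation-ordered members and evicts the lowest-rated node once the cap is exceeded.

```python
class FriendCache:
    """Sketch of a size-capped Friend-Cache: a list of known reputable
    nodes ordered by their reputation statistics (hypothetical design)."""

    def __init__(self, capacity=130):
        self.capacity = capacity
        self.ratings = {}  # node id -> reputation rating

    def update(self, node, rating):
        """Record (or refresh) a node's rating; evict the worst if over cap."""
        self.ratings[node] = rating
        if len(self.ratings) > self.capacity:
            worst = min(self.ratings, key=self.ratings.get)
            del self.ratings[worst]

    def members(self):
        """All cached nodes, most reputable first."""
        return sorted(self.ratings, key=self.ratings.get, reverse=True)
```

Limiting `capacity` to around 130 mirrors the empirical optimum above: once only poorly stocked nodes remain outside the cache, admitting them only adds direct-query traffic.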
5.6.2 Voting-System
In this section we discuss the following important issues:
1. How well does the voting-based system perform? How do the parameters wQ
and FC affect the performance?
2. How does the voting system compare to the local reputation system? Remember
that the voting system with a quorum weight of 0 is equivalent to the local
system.
3. How do the Select-Best and Weighted methods compare in terms of overall
efficiency?
4. How does the reputation system affect the distribution of load across well-
behaved nodes?
As stated before, all experiments use the document-based threat model. We also
slightly relax the whitewashing scenario. Instead of a malicious node changing identities
after each false file upload, all malicious nodes whitewash after an average
of 10 queries per node in the network. To underscore this difference we refer to the
[Figure: Verification Ratio (rV) vs. Weight Given to Quorum's Opinion (wQ); curves: Frd L, Frd C, Nbr L, Nbr C, Frd L WW, Frd C WW, Nbr L/C WW]

Figure 5.6: Efficiency of the voting reputation system (using Select-Best) with respect to varying quorum weight (wQ). Lower rV is better. 1 is optimal.
two identity models as the whitewash (WW ) and no whitewash scenarios, as opposed
to Login and Self-Mgd, as done in the previous section.
Voting System Parameters
In this section, we analyze the performance of the voting-based reputation system
for various parameter values. All experiments were performed using the multi-source
query generator for a total of 100,000 queries. For this scenario the random algorithm
obtained an rV = 28.2, off the scale of the graphs. The relative performance of the
local reputation system is given by the data point for a quorum weight of 0.
Figure 5.6 presents the effects of varying the quorum weight, wQ. It shows results
for both with whitewashing (WW ) and without, both Neighbor (Nbr) and Friend
(Frd) voting, and both the selfish lying (L) and colluding (C) malicious opinion-
sharing models. The no-misbehavior (N) curves mirrored the L curves, performing
only marginally better across all experiments, and are not graphed. In the
selfish lying model, malicious nodes give themselves a rating of 1 and all others a
rating of 0. Since malicious nodes cannot vote for themselves and give everyone else
an equal rating of 0, they do not greatly impact a vote in favor of malicious nodes.
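The exact combination rule is defined earlier in the dissertation; purely as an illustrative sketch (the function name and linear-blend form are our assumptions), a rating that weighs each voter's opinion by the requester's rating of that voter, and that reduces to the local system at wQ = 0, can be written as:

```python
def voted_rating(local_rating, votes, w_q):
    """Blend a requester's own opinion of a candidate with its quorum's.

    local_rating -- requester's current rating of the candidate provider
    votes        -- (voter_weight, voter_opinion) pairs, where voter_weight
                    is the requester's rating of that voter; zero-weight
                    voters (e.g. unknown peers when rho_0 = 0) are ignored
    w_q          -- weight given to the quorum's opinion (0 => local system)
    """
    total = sum(w for w, _ in votes)
    if total == 0:  # no usable votes: fall back to the local opinion
        return local_rating
    quorum = sum(w * r for w, r in votes) / total
    return (1 - w_q) * local_rating + w_q * quorum
```

Note how a selfish liar with weight 0 cannot move the quorum average, matching the observation above that such nodes do not greatly impact a vote.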
Note that the values of rV in Figure 5.6 are relatively high. For example, an rV
value of 3 means we would expect to download and verify three files for each
query. In an actual system, it may not be feasible for a node to thoroughly check each
downloaded file’s authenticity. The node may simply trust the file to be valid. In this
case, rV can be viewed as the inverse probability that such a file is valid. Accounting
for well-behaved nodes offering bad copies of files complicates the threat model. We
have conducted experiments with this assumption and, as long as the probability of
a good node offering a bad file is small, it does not noticeably affect our results.
Observing the drop in rV from wQ = 0 to wQ = 0.05, we conclude that incorpo-
rating other nodes’ opinions tends to improve the efficiency of the system. Except
when malicious nodes collude to subvert the voting process, varying the weight of the
voters' opinions beyond wQ = 0.05 has no effect on the system performance. This
behavior indicates that the greatest benefit from voting comes when the local node
has no opinion of its own. When bad nodes collude (C), system performance
decreases as the weight given to the quorum's opinion increases, reinforcing
that there is no substitute for personal experience in an untrusted environment.
Comparing the Frd family of curves to the Nbr curves within the same white-
wash scenario (e.g. Frd L vs. Nbr L), we clearly see that Friend-voting outperforms
Neighbor-voting. Nodes that have given you good service in the past have demon-
strated some effort to be reliable and well-behaved. Asking them for their opinions
is more reasonable than relying on one’s neighbors, a third of which, in this scenario,
are likely to be bad. Not only does Neighbor-voting not perform as well, but it is
more susceptible to malicious collusion as neighbors’ opinions are given more weight
(see Nbr C curve). Friend-voting, however, tends to avoid asking malicious nodes for
their opinions, mitigating the effects of collusion.
Though the whitewash scenario performs worse than no whitewashing, it can
benefit more from opinion-sharing. As the Frd WW curves between wQ = 0 and
[Figure: Verification Ratio (rV) vs. Size of Friend-Cache; curves: No WW, WW]

Figure 5.7: Efficiency of the voting reputation system with respect to Friend-Cache size (FC).
wQ > 0 illustrate, efficiency for Friend-voting improves by a factor of 3 over the local
reputation system. The Nbr L/C WW curve shows that Neighbor-voting in the WW
scenario is almost completely unaffected by opinion-sharing, no matter the malicious
opinion-sharing model. As stated before, in the WW scenarios an initial reputation
rating of 0 is assigned to unknown nodes. Since this value is used for weighing the
opinions of the voting nodes, any unknown peer in the neighbor quorum (including
malicious nodes that have whitewashed) will have their votes ignored. Because the
average number of neighbors is small (approx. 3.1), the probability of a well-behaved
neighbor providing a query response that is tested, and thus becoming “known” and
having its opinion used, is low. In contrast, in the no WW scenario, since ρ0 = 0.3,
even untested peers’ opinions are considered, explaining its poor performance when
bad nodes collude.
In summary, this experiment shows that choosing a relatively small quorum weight
around 0.1 with Friend-voting improves performance by a factor of 2 or more across
all scenarios. But how many reputable nodes should one keep in the Friend-Cache?
Does increasing the size of the Friend-Cache always result in better efficiency? In a
real system, a larger cache means a greater maintenance cost in periodically checking the
liveness of the nodes in the cache. Is this cost always justified?
Figure 5.7 shows the performance of both the whitewashing and no whitewashing
scenarios for various Friend-Cache sizes (FC) with no bad opinion-sharing (N).11
Both scenarios stabilize so that increasing the size of the cache yields no performance
improvement, but a system dealing with whitewashing benefits from a larger cache.
For instance, while a Friend-Cache of 10 is sufficient when there is no whitewashing,
the whitewash scenario can benefit from a cache as large as 25. As expected, when
tested with the malicious opinion-sharing models (N , L, C), all three models produced
similar rV values, with the C values slightly greater than those of the N and
L values by about 0.4 in the no WW scenario. Thus, we only plot the N curve in
Figure 5.7. A surprisingly small cache is needed for this technique to be efficient.
In Section 5.6.1 we used the Friend-Cache to choose peers to query directly before
flooding the network. Though there is little benefit from gathering opinions from
more than the 10 or 15 most reputable nodes, the traffic results indicate that we can
take advantage of Friend-Caches larger than 100. Should we use our entire large cache
for gathering opinions? No. Though a large Friend-Cache is easy to maintain (it is
a list of known nodes ordered by their reputation statistics), asking a large number
of nodes to share their opinions, either per query or periodically, will greatly increase
the amount of message traffic produced yet not improve our selection performance.
Thus, though we may maintain a large Friend-Cache for direct querying, we would
only ask the top nodes to participate in our quorum.
Friend-voting is effective against collusion because it only considers the opinions of
nodes that have demonstrated good behavior by providing good files. Given our threat model,
this quickly bars malicious nodes from the Friend-Cache. One technique malicious
nodes may employ to defeat Friend-voting would be to set up front nodes. These
nodes properly trade only authentic files, but when asked for their opinion of other
nodes, act according to the collusion model, C, promoting only malicious nodes.
11. FC = 0 corresponds to the local reputation system.
[Figure: Verification Ratio (rV) vs. Fraction of Malicious Front Nodes; curves: wQ=0.1, wQ=0.8]

Figure 5.8: Effects of front nodes on efficiency.
We have run simulations where a fraction of the malicious nodes are set to be front
nodes. We present the results for both a quorum weight of 0.1 and 0.8 in Figure 5.8.
These experiments show that, in the case of wQ = 0.8, front nodes can cause consid-
erable harm to the system. The damage peaks when 40% of the malicious nodes are
front nodes, decreasing the system performance by more than a factor of 3! For a
larger number of front nodes, rV steadily drops, indicating that too many malicious
nodes are behaving well to promote a smaller group causing actual damage. To be
optimally effective, attackers would need to use the right balance of front nodes and
actively malicious nodes. Surprisingly, front nodes appear to have no adverse effect
when wQ = 0.1. We believe this shows that a very low quorum weight limits the impact
of front nodes’ bad opinions sufficiently that the damage caused by front nodes
is negated by the benefit of having fewer actively malicious nodes.
Efficiency Comparisons
Given the results of our analysis on the voting parameters, we wish to evaluate the
system with respect to varying threat parameters. Specifically, we demonstrate how
overall efficiency is affected by varying the percentage of malicious nodes in the system
(πB). We have run similar experiments varying the probability of a unique file being
[Figure, two panels: (a) Voting system and (b) Local system; Verification Ratio (rV) vs. Fraction of Bad Nodes (πB); curves: Base, Weighted, Best, Weighted WW, Best WW]

Figure 5.9: Comparison of the efficiency of the two reputation systems with the random algorithm as a function of πB.
in the subversion set (pB) and obtained similar results and performance comparisons.
We test the voting system with wQ = 0.1 and FC = 10, and using the two
selection procedures both with and without whitewashing. Malicious nodes did not
lie or collude with their opinions (N). These experiments were also run using the
multi-source query generator for 100,000 queries. We evaluate the efficiency of the
reputation systems for values of πB between 0 and 0.4 using the default pB of 0.9.
It may seem unlikely that a network would have 40% malicious peers attacking
90% of the files. But in the real world, there are large entities, with access to vast
resources, which have an interest in subverting peer-to-peer networks. We have sim-
ulated across several degrees of malicious activity (varying both πB and pB) and
the relative performance of the different reputation system variants is comparable in
weaker threat scenarios to those presented here.
Figure 5.9(a) shows the performance of the voting reputation system. Clearly,
using any local statistics when selecting a provider results in significantly better ef-
ficiency than purely random selection (base case). While the base case climbed to
42.5 at 40% malicious nodes, the voting reputation system attained an efficiency of 2
(with no whitewashing), an improvement by a factor of 21! Whitewashing adversely affects
the performance of the system, but not as badly as expected. For example, with
a verification ratio of 4.5 the reputation system in the whitewash scenario performs
2.3 times worse than when there are no whitewashers. This means that on average
a node would have to fetch more than twice as many copies of a file before finding a
valid one, showing a clear advantage to preventing whitewashing by requiring users to
log in through a trusted authority that can verify each real user has only one system
identity.
We also executed the experiments using the local reputation system under equiv-
alent conditions (100 queries from a single querying node). The results, shown in
Figure 5.9(b), were only a factor of 2 worse than the voting system in
the non-whitewashing scenario.12 The performance difference between the two systems
was greater in the whitewashing scenario, a factor of 4. These results support
our findings in Section 5.6.2 that opinion-sharing is worthwhile in spite of its slightly
higher implementation complexity.
When comparing the performance of the Select-Best (Best) and Weighted selection
procedures in either graph of Figure 5.9, we see no large efficiency advantage of
one procedure over the other, though the Select-Best method outperforms Weighted
across all values of πB. As expected, selecting the best known provider is slightly
more efficient than probabilistically choosing a provider, but this comes at a cost,
which we discuss in the following section.
Load on Good Nodes
One critical issue is that reputation systems may unfairly burden some of the good
nodes in the network. Thus, we now look at the amount of load placed on well-
behaved nodes in the network in terms of the number of files they upload. We are
12. Note the difference in scale between the two graphs in Figure 5.9.
[Figure: Load on Good Nodes vs. Percentage of Bad Nodes (πB); curves: Expected, Weighted, Best, Weighted WW, Best WW]

Figure 5.10: Average load on well-behaved nodes as a function of πB.
interested only in the effect produced by requests from well-behaved nodes running
the algorithms correctly. We use the same setup as above but concentrate on the
scenario with no whitewashing.
Figure 5.10 plots the average load on the well-behaved nodes, as a function of
the fraction of malicious nodes in the network. In an ideal system with no malicious
nodes, we would expect exactly 1 download per query, giving a value of ℓG = 0.001
for a 1000-node network. In our case, when there are no malicious nodes, the value
of ℓG is 0.00098. This value is less than expected (shown by the Expected curve in
the graph) because a few queries go unanswered by any node in the network.
As the fraction of malicious nodes increases, so does ℓG. For instance, when
πB = 0.3 the average load is 0.00138. With only 70% as many good nodes to service
requests, we would expect ℓG = 1/(0.70 · 1000) = 0.00143. Both the fact that malicious
nodes provide some good files, and that the probability of a successful query is lower,
account for the difference between the observed and expected loads. Comparing
the two selection procedures shows an insignificant difference in average load. Both
procedures fetch the same number of files from good nodes overall.
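The expected-load baseline used above follows from a simple count: if every query is served by exactly one of the (1 − πB) · n good nodes, each good node's per-query load is as computed in this small sketch (function name ours).

```python
def expected_good_node_load(pi_b, n_nodes):
    """Expected per-query upload load on each good node, assuming every
    query is answered by exactly one of the (1 - pi_b) * n_nodes good nodes."""
    return 1.0 / ((1 - pi_b) * n_nodes)
```

With πB = 0 and n = 1000 this gives 0.001, and with πB = 0.3 it gives 1/700 ≈ 0.00143, the two reference values quoted above.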
Though there was little difference between the selection procedures in terms of
average load, it is important to consider the load distribution. In a homogeneous
[Figure, two panels: (a) Load on each node and (b) Load per file on each node; left y-axis: load (per file) on good nodes; right y-axis: number of files offered; x-axis: good nodes ordered by load (logscale); curves/points: Base, Weighted, Best]

Figure 5.11: Distribution of load on good nodes (and their corresponding number of files shared). The x-axis corresponds to nodes sorted by amount of load in (a) and load per document stored on the node in (b) (note logscale axis). The curves relate to the left y-axis and specify the amount of load measured at each node. The points map to the right y-axis and indicate the number of documents on the corresponding node.
network where all nodes have similar bandwidth, it is preferable if load is distrib-
uted evenly across all nodes, as opposed to a few nodes handling most of the traffic
while the majority are idle. To study load distribution we measured the load (using
Eq. 5.6) on each individual node using the two voting-based reputation system selec-
tion procedures and the random selection algorithm. The values were then sorted in
descending load order. The results, averaged across 10 runs with different seeds, are
shown by the three line curves on the left y-axis in Figure 5.11(a). Here we see that,
using the Select-Best selection procedure, the most heavily loaded node (with rank
1) has a load of almost 0.015. This value is more than 10 times the average load of
0.00138.
Though both selection procedures incurred greater load on the highest ranked
nodes than the base case, Select-Best concentrated the load on a few nodes while
Weighted distributed the load better. The maximum load on a node with the Select-
Best method was almost twice that of Weighted. This is expected since Select-Best
locates a few good nodes and tries to reuse them when possible, while the Weighted
model encourages fetching files from new nodes (broadening its pool of known good
nodes). If load-balancing in a homogeneous system is an important requirement,
then the Weighted selection procedure would be preferable.
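The two selection procedures compared here can be sketched as follows. This is our paraphrase of the selection step (the simulator's actual code is not shown in the text), taking each responder's reputation rating as a nonnegative score.

```python
import random

def select_best(responders):
    """Select-Best: always pick the responder with the highest rating."""
    return max(responders, key=lambda nr: nr[1])[0]

def select_weighted(responders, rng=random):
    """Weighted: pick a responder with probability proportional to its
    rating, spreading load over the pool of known good nodes."""
    total = sum(r for _, r in responders)
    if total == 0:
        return rng.choice(responders)[0]  # no information: pick at random
    x = rng.uniform(0, total)
    for node, r in responders:
        x -= r
        if x <= 0:
            return node
    return responders[-1][0]
```

Select-Best reuses the few proven nodes it knows, concentrating load; the weighted draw occasionally fetches from lesser-known nodes, broadening the pool at a small cost in efficiency.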
Another factor to consider is how load relates to the number of files shared by
each node. It would be expected that good nodes with more files are more likely to
be able to answer queries, increasing the number of files they upload and thus their
load. Figure 5.11(a) plots as points the number of files on each good node on the
right-hand y-axis. For example, for rank 1, there are three points around 38,000. This
means that, for all three systems, the most heavily loaded node shared an average of
around 38,000 files. As expected, all distributions show a strong correlation between
nodes sharing more files and higher load.
In Figure 5.11(b) we divide the load on each node by the number of files it provides
and reorder the distribution. For instance, the node at rank 1 has a load per file of
2.8 × 10^−6 for the Weighted selection procedure, but only 2.2 × 10^−6 for the Select-
Best procedure. The result is surprising. The Select-Best method generated much
less load per file than the Weighted or random methods. To understand this result
we again plot the number of files offered by each node on the right y-axis. Here we
see two trends. The base case and the Weighted method both curve from the bottom
left upwards, showing that the nodes with highest load per file offer very few files.
This effect is due to the sublinearity of the answering power of a node with respect
to the number of files it is offering. For example, if node i has twice as many files as
node j, we expect node i to be able to answer less than twice as many queries as j. In
general, given a probability p that any individual file in the system matches a query,
the probability that a node with f files can respond to a query equals 1 − (1 − p)^f.
In a purely random selection model this probability is an indicator of the expected
load on a node; as f increases, so does the probability, and thus the likely load. This
is corroborated by our results in Figure 5.11(a). Now if we divide this probability by
f we have an indicator for the load per file: (1 − (1 − p)^f)/f. This equation has a maximum
value when f = 1 and decreases as f increases. This explains the behavior we see
from the random base case and the Weighted case in Figure 5.11(b).
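Both quantities are easy to check numerically. The sketch below (function names ours) evaluates the answer probability 1 − (1 − p)^f and the per-file load indicator, which peaks at f = 1 and decreases as f grows.

```python
def answer_probability(p, f):
    """Probability that a node holding f files answers a query, when each
    file independently matches the query with probability p."""
    return 1 - (1 - p) ** f

def load_per_file(p, f):
    """Per-file load indicator under random selection: (1 - (1-p)^f) / f."""
    return answer_probability(p, f) / f
```

For p = 0.01, a 1-file node has per-file load 0.01, a 10-file node about 0.0096, and a 1000-file node about 0.001, reproducing the downward curve seen for the random and Weighted cases.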
The Select-Best method, on the other hand, shows a different trend. The most
heavily loaded (per file) nodes share a very large number of files. The Select-Best
procedure selects nodes which have proven reliable in the past. This behavior favors
well-behaved nodes which respond to queries early in the simulation and often, nodes
sharing many files. This procedure gives nodes with many files an even greater chance
of being chosen with respect to the random model.
Whether or not it is desirable to send greater traffic to nodes with more files
is dependent on the environment. Some have suggested that in some peer-to-peer
systems, the number of files a node offers correlates to its available bandwidth. If so,
using the Select-Best selection procedure, which gives preference to nodes with more
files, may result in more effective bandwidth usage. But if peers have similar resource
constraints or fair load-balancing is a priority, then we would prefer the Weighted
selection procedure, which better equalizes load yet is almost as efficient at locating
authentic documents.
Susceptibility to Attack
In addition to fairness, a skewed load distribution also raises concerns with respect
to security. If a smaller number of peers are providing a larger portion of the net-
work services, these peers become easy, tempting targets for malicious entities. Once
highly-loaded, well-behaved nodes are detected, an adversary can mount a network
Denial of Service attack directed at these nodes in order to shut them down, or at-
tempt to subvert the nodes for its own purposes through security exploits. A more
balanced load distribution makes it harder to detect which peers are providing the
most resources. In addition, more of these nodes would have to be subverted in order
to do the same amount of damage to the network.
As stated earlier, we did not investigate DoS attacks or node subversion in this
study. However, it is important to consider these issues when choosing system parameters,
such as the selection procedure. In essence, using the Select-Best procedure
weakens one of the most important traits of P2P systems: robustness through widely
distributed and replicated files and resources. Any parameter that influences diversity
in peer selection will have repercussions on both load balancing and risk from point
attacks. For example, as discussed earlier, raising the initial reputation rating results
in more inauthentic file accesses. However, lowering it will reduce the likelihood
of discovering new well-behaved peers, thus increasing load skew and the system’s
susceptibility to other malicious attacks.
5.6.3 Node-based Threat Model
Here we present the results of experiments using the node-based threat model, built
on the threat matrix. We evaluate the local reputation system and the ideal reputation
system, which uses the threat matrix T as its reputation matrix. All results in this
section were performed with no node turnover or whitewashing.
This threat model allows us to perform a statistical analysis of the expected long-term
performance of the reputation system variants, which we present in Appendix 5.7.
This analysis gives the expected system behavior in steady state after
running for a sufficiently (perhaps infinitely) long time. Comparing the analytical results
with those presented in the next section gives us an understanding of the inherent
limitations of each of the reputation system variants. The statistical analysis
assumes no node whitewashing or node turnover. For more information please see the
appendix.
[Figure, three panels of Verification Ratio (rV): (a) vs. Percentage of Bad Nodes (πB), (b) vs. Probability of Bad Node Sending Fake Response (pB), (c) vs. Probability of Good Node Sending Authentic Response (pG); curves: Base, Ideal Weighted 0.0, Ideal Best 0.0, Ideal Weighted 0.2, Ideal Best 0.2, SL Weighted 0.0, SL Best 0.0, SL Weighted 0.2, SL Best 0.2]

Figure 5.12: Comparison of the efficiency of the local and ideal reputation systems under the node-based threat model. Lower is better. 1 is optimal.
Efficiency of the Reputation Systems
We first present the results of varying the three threat parameters, πB, pB and pG.
Figure 5.12 shows that the local reputation system performs quite well, matching
the associated ideal system, and even surpassing it in some situations. Though the
Weighted and Select-Best methods achieve equal efficiency, the use of a selection
threshold dramatically improves performance, allowing it to maintain a verification
ratio under 2.5.
The ideal system likewise benefited from a selection threshold, allowing the sys-
tem to maintain a near perfect ratio of approximately 1.01. The improvement in
performance due to the threshold reinforces our observations of the large number of
queries that returned no authentic documents, only fakes. As expected, the selection
threshold for the ideal case is useless once pB > ρT and it performs as if there is no
threshold (Figure 5.12(b)).
The most interesting observation is the shape of either Ideal Weighted curve. We
see that the verification ratio peaks when pB is around 0.6. The reason the Ideal
system performs badly in this situation is because, knowing the values of the threat
matrix, it expects bad nodes to reply correctly to 40% of the queries. In reality,
malicious nodes reply falsely to around 60% of the queries, but only reply correctly
to a small fraction of the other 40% of the queries. This is because a node can only return
a valid response for a document it holds, and each node holds only a small fraction of all
the unique documents in the system. This shows that the “ideal” reputation system is
not as good as we may have originally believed.
Comparing the SL curves in the graphs in Figure 5.12 to the corresponding ones
for the document-based threat model and the login server identity model, shown in Figure
5.4, we see that the relative performance stays the same. The behavior of the systems under
both threat models is quite similar, especially with respect to varying the common threat
parameters, πB and pB. In fact, we repeated the experiments for determining the
optimal values for the initial reputation rating and selection threshold (under both
identity models) but the results were so similar, we feel it would be redundant to
include them.
Distributed Node Ratings
In most experiments all well-behaved nodes have the same probability, pG, of sending an authentic document or rating. Malicious nodes, likewise, all have the same rating of pB. These values are expected to be far apart, allowing algorithms to more easily locate and isolate the two groups. But how do the reputation systems behave when the node behaviors are not so distinct?

184 CHAPTER 5. P2P REPUTATION SYSTEM METRICS

Figure 5.13: Comparison of the efficiency of the reputation systems with node threat ratings uniformly distributed in an interval of length 0.7 around pB or pG. (a) Verification ratio rv as a function of pB; (b) as a function of pG. [Plots compare the Base, Ideal, and SL systems, Weighted and Best variants, with thresholds 0.0 and 0.15.]
To test this, we randomize the node ratings in the threat matrix. Each node’s
rating is chosen from a uniform distribution with an interval of size 0.7 centered on
pB or pG, depending on whether it is a good or bad node. For example, if pB = 0.4
then malicious nodes will be assigned ratings in the range of 0.05 to 0.75 and the
average value will be 0.4. For values of pB and pG near 0 or 1, the interval cannot extend a full 0.35 in each direction from the center. Rather than shortening the interval equally on both sides of the center, we cut the interval abruptly at 0 or 1.
This results in a shift of the interval center (and average) away from the intended
center. For example, when pB = 0.1, then the values of malicious nodes are chosen
uniformly from the interval [0, 0.45], resulting in an average value of 0.225.
Figure 5.13 shows the results of simulations run with the scattered threat ratings, for varying pB and pG values. Comparing Figures 5.13(a) and 5.13(b) to Figures 5.12(b) and 5.12(c) respectively, we see little difference in the performance of
the base case and all the local variants.
Figure 5.14: Comparison of the local reputation system with ρT of 0.0 and 0.15 and the base case over time. The simulation was run for 1000 queries and statistics were collected every 50 queries. (a) Ratio of authenticity checks to successful queries in each 50-query interval; (b) distance between reputation and threat matrices (lower is better; 0 is optimal).
Convergence Over Time
The next tests we ran involved collecting statistics during each simulation run in order
to evaluate the change in performance over time and to see if the reputation matrix
converges to the threat matrix. In each set of simulations the simulation ran for Q
total queries and statistics were gathered every δ queries. The graphs measuring the
verification ratio over time compute the ratio based only on the number of authenticity
checks and successful queries in the last δ queries. The graphs measuring T-R distance
give the calculated T-R distance at the current time (i.e., after δ queries, 2δ queries,
etc.).
The simulations of Figure 5.14 ran for Q = 1000 queries with δ = 50. In them we compare the three reputation systems using only the Select-Best procedure, with ρT of 0 and 0.15 for the ideal and Simple Local cases.
Though in Figure 5.14(a) the values appear to vary randomly, notice that the local
curves appear to converge towards 1. This indicates that the statistics the reputation system is gathering allow it to make better decisions in the future when selecting
responses. The base case and the ideal case with no threshold, as expected, do not
show this convergence since they do not “learn” from previous experiences. The ideal
case with threshold performs well enough to stay near 1 for the entire run.
In Figure 5.14(b) we see the T-R distance during the same test. The ideal case's distance is trivially 0, and the base case is not present (since the Random algorithm does not maintain statistics). We see that both local curves converge to a value of almost
0.0004. The curve corresponding to ρT = 0.0 seems to converge faster and to a
slightly smaller distance than that for ρT = 0.15. Since the local system with no threshold performs worse in terms of the number of authenticity checks it must make, it collects statistics about other nodes faster and therefore converges faster and with a bit more accuracy. But once it locates a pool of good nodes, the Select-Best procedure
will always attempt to pick from this group. Therefore it will not choose documents
from other nodes if it can avoid it, and will not collect new statistics other than refining the reputation ratings of the good nodes.
To see if gaining more varied statistics results in faster and better convergence
we ran a similar longer test with Q = 10000 queries and δ = 100. We included the
Weighted procedure to see if its ability to choose a node that is not necessarily the best
known node will help it minimize the T-R distance. We also modified the base case
to collect statistics in order to calculate a reputation for each node, but still not use
the information in the selection process. Since the base case performs the worst by
performing the most authenticity checks, we would expect it to collect the most data
and converge faster and better than the rest.
Figure 5.15(a) shows the verification ratio over time. Though the local Select-Best
curves converge quickly to 1, the Weighted versions periodically jump to high values,
resulting in worse performance.
In Figure 5.15(b), all the curves converge at about the same rate. But both
ρT = 0.15 curves converge to a higher T-R distance than the rest. This is as expected given the previous results.

Figure 5.15: Comparison of the local reputation system with both Weighted and Select-Best variants and a selection threshold of 0.0 and 0.15 and the base case over time. The simulation was run for 10000 queries and statistics were collected every 100 queries. (a) Ratio of authenticity checks to successful queries in each 100-query interval; (b) distance between reputation and threat matrices over 10000 queries; (c) distance between reputation and threat matrices between 9000 and 10000 queries.

Figure 5.15(c) shows a more detailed view of the end of
the simulation run. Interestingly, the Weighted version with no threshold and the
base case did not converge faster or to a lower distance than the Select-Best version with no threshold, even though they both performed many more authenticity checks.
This may be because, though the Weighted and base case collect more statistics, the
statistics are across a larger number of nodes, while the Select-Best case leaves more
nodes as undefined. Since the undefined nodes are not included in our measure of distance, a system that concentrates its information on a smaller number of nodes will exhibit a lower T-R distance than one spreading the same amount of information (or even more) across a much larger number of nodes. A different method of calculating distance, one which rewards having fewer undefined nodes, would likely improve the measure for the badly performing systems.
Dynamic Misbehavior
A suggested strategy for malicious nodes in a reputation system is to behave cor-
rectly for some period of time and accrue a positive reputation, then begin acting
maliciously. What effect does such a scheme have on efficiency? Can good nodes
detect such misbehaving nodes? If so, how quickly? To determine the effects of this
strategy we devised a test where malicious nodes behaved correctly for 1000 queries
from a single source. At that time all bad nodes began misbehaving. The simulation
then continued for 1000 more queries. The verification ratio was calculated every
50 timesteps using the cumulative A(D) and qsucc at that time, except for the dynamic bad nodes, for which the verification ratio after 1000 queries is calculated using statistics gathered since the nodes began misbehaving at time 1000.
For comparison we graph the behavior of the standard static threat model where
the malicious nodes misbehave for the entire simulation of 1000 queries. Figure 5.16(a)
shows that the reputation systems quickly stabilize to a steady state.

Figure 5.16: Comparison of the efficiency of the reputation systems over time. (a) Regular static behavior (1000 queries total); (b) bad nodes begin misbehaving after 1000 queries (2000 queries total).

The ideal system with threshold performs the best (almost 1) for the entire simulation, but the
local system with threshold quickly converges to near optimal.
Comparing it now to the dynamic misbehavior scenario in Figure 5.16(b) we see
that the reputation systems perform just as well, even though the malicious nodes
have had time to build up a good reputation. The one interesting difference is the worse performance of the Weighted procedure as opposed to Select-Best. With
Select-Best as soon as the querying node fetches and checks one fake document from
a malicious node, that node’s rating will be lowered significantly so that other nodes
will always be selected before it in the future, if possible. On the other hand, with
the Weighted method, a malicious node is likely to be selected multiple times before
its reputation rating is lowered sufficiently that it is unlikely to be chosen in future
queries.
To illustrate this, let node A be a good node and node M be a dynamic malicious
node. Say that both have provided 4 good documents during the period of time that
M behaved well to accrue reputation and so both have a reputation of 1.0. Now say
that M delivers an inauthentic document. Its rating will drop to 4/5 = 0.8. While Select-Best will never pick M if A has also replied, with the Weighted method A is
only 25% more likely to be selected than M . In fact after 3 more false responses
from M , its rating has only dropped to 0.5 and is only half as likely to be chosen
as A. This demonstrates a weakness of the simple reputation rating function used.
In situations with dynamic malicious nodes, reputations based only on the statistics gathered over the last x queries would perform better, but would require more state to be maintained.
The reason why Weighted does not perform too badly in this dynamic test is
that in a period of only 1000 queries, few malicious nodes had the opportunity to
provide more than 1 or 2 correct answers to build their reputation statistics. If
the misbehaving nodes were to behave well for a longer period, the performance gap between Select-Best and Weighted would be greater. But if malicious nodes behave
well for very long periods of time, they would be very ineffective at disrupting the
network.
5.7 Statistical Analysis of Reputation Systems
In this section we provide a derivation for our statistical analysis of the steady-
state behavior of the reputation systems and their variants presented in this chapter.
We begin by presenting the variables and equations dictating the expected system
behavior. Next, we empirically compute values for certain parameters. Finally we
calculate the expected performance of the different systems. Specifically, we are
interested in the steady-state verification ratio of a system, which is approximately
equal to the expected number of documents fetched and checked for each query.
5.8 Equations
In the experiments all the network topologies were static. Therefore, for a given TTL and network topology, the set of nodes reached by a flood query is always the same for a given node, though it differs between nodes and topologies. Let nflood(i) be the number
of nodes reached by a query from node i. Let nflood be the average number of nodes
queried across all possible nodes in all possible network configurations. We empirically
estimate this value in the following section.
For the analysis we are interested in computing the expected probability of a node
being able to reply to a query. In the simulations the probability of a node having a
document matching a query q is
pN(d, q) = 1 − (1 − pD(q))^d   (5.11)
where pD(q) is the probability of any document in the system matching query q
and d is the number of documents stored at the specified node. The probability of
a document matching a query is determined by the query model using the query
popularity and query selection power distributions [138]. The number of documents
on a node is chosen from the file distribution sample collected by Saroiu [119]. The
expected probability of a node answering a query, pN , is computed experimentally.
More details are given below.
• The expected number of bad nodes that hear a query is nfloodπB.
• The expected number of good nodes that hear a query is nflood(1− πB).
• The probability of a good node replying with an authentic document is pNpG.

• The probability of a good node replying with a fake document is pN(1 − pG).

• The probability of a bad node replying with an authentic document is pNpB.

• The probability of a bad node replying with a fake document is 1 − pB.
Notice that the probabilities of sending an authentic document and sending a fake
document do not add up to 1. The remainder is the probability of the node not
replying. Also note the difference between good and bad nodes in the probability of
replying with a fake document. Because bad nodes can generate fake documents even
when they do not have a query match, pN is not taken into account.
From these equations we can derive formulas for the expected number of docu-
ments in a query’s response set, how many come from good or bad nodes, and how
many are authentic or not.
• Expected number of authentic documents received for a query

– From good nodes: dAG = nflood(1 − πB)pNpG

– From bad nodes: dAB = nflood πB pN pB

– Total: dA = nflood pN(pG(1 − πB) + pBπB)

• Expected number of fake documents received for a query

– From good nodes: dFG = nflood(1 − πB)pN(1 − pG)

– From bad nodes: dFB = nflood πB(1 − pB)

– Total: dF = nflood(pN(1 − pG)(1 − πB) + (1 − pB)πB)

• Expected number of total documents received for a query

– From good nodes: dTG = nflood(1 − πB)pN

– From bad nodes: dTB = nflood πB(1 − pB + pBpN)

– Total: dT = nflood(pN(1 − πB) + πB(1 − pB + pBpN))
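These expectations translate directly into code. The following sketch (function and variable names are ours, not the simulator's) computes the expected document counts from the threat parameters:

```python
def expected_documents(n_flood, pi_B, p_N, p_G, p_B):
    """Expected numbers of documents per query, following the items above.
    Names: d_AG = authentic from good nodes, d_FB = fake from bad nodes, etc."""
    good = n_flood * (1 - pi_B)      # good nodes hearing the query
    bad = n_flood * pi_B             # bad nodes hearing the query
    d_AG = good * p_N * p_G
    d_AB = bad * p_N * p_B
    d_FG = good * p_N * (1 - p_G)
    d_FB = bad * (1 - p_B)           # no p_N factor: fakes are generated
    return {
        "d_AG": d_AG, "d_AB": d_AB, "d_A": d_AG + d_AB,
        "d_FG": d_FG, "d_FB": d_FB, "d_F": d_FG + d_FB,
        "d_TG": d_AG + d_FG, "d_TB": d_AB + d_FB,
        "d_T": d_AG + d_FG + d_AB + d_FB,
    }

# Illustrative parameters (n_flood and p_N as estimated in Section 5.9).
d = expected_documents(n_flood=3950, pi_B=0.3, p_N=0.1090, p_G=0.95, p_B=0.2)
# Sanity check: the totals decompose exactly as in the formulas above.
assert abs(d["d_T"] - (d["d_A"] + d["d_F"])) < 1e-9
```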
Another set of equations which will be necessary gives the probabilities that a document received from a good or bad node is authentic. Each is the expected number of authentic documents from a good/bad node divided by the expected number of total documents from a good/bad node. This gives the following three equations:
P(DG = A) = dAG/dTG = pG   (5.12)

P(DB = A) = dAB/dTB = pBpN/(pBpN + 1 − pB)   (5.13)

P(D = A) = dA/dT = pN(pG(1 − πB) + pBπB) / (pN(1 − πB) + πB(1 − pB + pBpN))   (5.14)
For good nodes the probability is simply pG since they only reply with documents
they own, though a small percentage are fake. Bad nodes, on the other hand, can
generate false documents.
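Equations 5.12–5.14 can be evaluated numerically; a sketch with arbitrary illustrative parameter values:

```python
def authenticity_probabilities(pi_B, p_N, p_G, p_B):
    """P(document from good / bad / any node is authentic), Eqs. 5.12-5.14."""
    p_good = p_G                                       # Eq. 5.12
    p_bad = p_B * p_N / (p_B * p_N + 1 - p_B)          # Eq. 5.13
    p_any = (p_N * (p_G * (1 - pi_B) + p_B * pi_B)
             / (p_N * (1 - pi_B)
                + pi_B * (1 - p_B + p_B * p_N)))       # Eq. 5.14
    return p_good, p_bad, p_any

p_good, p_bad, p_any = authenticity_probabilities(0.3, 0.1090, 0.95, 0.2)
# A document from a bad node is rarely authentic, because most of its
# replies are generated fakes rather than owned documents.
print(round(p_bad, 4))  # 0.0265
```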
5.9 Empirical Estimations
We estimate nflood to be 3950 after 10,000 samples from 100 nodes in 100 power-law topologies of 10,000 nodes with an average degree of approximately 3.1 per node.
To compute pN we generate 4,000,000 random samples from both our query popu-
larity and query selection power distributions, and the sample document distribution
collected by Saroiu [119] (described in Section 5.5). For each generated value of pD(q)
and d we calculate pN(d, q) using Equation 5.11, and average across all samples. The
experimental estimate for pN was 0.1090, or approximately 10% probability that a
node can answer a query.
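The estimation procedure can be sketched as follows. We do not reproduce the actual query model or Saroiu's measured document distribution, so the Pareto draws below are placeholders and the resulting estimate will not match 0.1090:

```python
import random

def p_node(d, p_d):
    """Equation 5.11: probability that a node holding d documents
    has at least one document matching the query."""
    return 1 - (1 - p_d) ** d

def estimate_p_n(samples=100_000, seed=0):
    """Average p_node(d, p_d) over random draws of p_d and d."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # Placeholder heavy-tailed draws; the thesis instead samples the
        # query popularity and selection power distributions and the
        # Saroiu documents-per-node data.
        p_d = min(1.0, 0.001 / rng.paretovariate(1.2))  # per-document match prob.
        d = int(rng.paretovariate(1.1))                 # documents on the node
        total += p_node(d, p_d)
    return total / samples

print(estimate_p_n())
```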
5.10 Long-Term Reputation System Performance
In this section we calculate the expected performance of each of the reputation systems
and their variants after running for a very long period and having settled into a
steady-state. For simplicity, this analysis makes two assumptions. First, all nodes
maintain their public identities for all time. Second, the network topologies do not
change. These assumptions are not expected to be realistic. For discussion of system
performance in realistic scenarios, see the experimental results. We are interested in
the steady-state performance as a guide to optimal performance of each system. A
comparison of the experimental results to our statistical analysis should indicate how quickly the systems converge toward the steady-state behavior.
To determine the long-term efficiency of the reputation systems we are interested
in calculating the expected verification ratio. This ratio is the number of documents
that must be verified for authenticity for each successful query before an authentic
one is found. Above, we determined the expected number of documents of each type
received in response to a query. For all reputation systems, the expected verification
ratio will be a variation of the expected number of documents that must be chosen
from a subset of the responses, without replacement, until an authentic one is found.
Since the expected number of documents received in response to a query is large, we can approximate this problem by its with-replacement counterpart [36]. Given that the probability of choosing an authentic document on the first try is p, we assume that each successive attempt at choosing an authentic document also has probability p of success. The expected number of documents that must be checked before locating an authentic document, with replacement, is then
E(X) = Σ_{x=1..∞} x p(1 − p)^(x−1) = 1/p   (5.15)
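The with-replacement expectation E(X) = Σ x p(1 − p)^(x−1) = 1/p is easy to check numerically; a minimal sketch:

```python
def expected_checks(p, terms=10_000):
    """Truncated sum of x * p * (1 - p)**(x - 1), which converges to 1/p."""
    return sum(x * p * (1 - p) ** (x - 1) for x in range(1, terms + 1))

# The truncated sum matches 1/p for a range of success probabilities.
for p in (0.1, 0.25, 0.5):
    assert abs(expected_checks(p) - 1 / p) < 1e-6
```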
This is the formula we use in the analysis below.
We also assume (as in most experiments) that all well-behaved nodes have a threat
rating of pG and all malicious nodes have a threat rating of pB, and that their threat
ratings do not change over time (or at least that the systems reach a steady-state
between changes).
The equations in Section 5.8 give the expected number of documents per query. As stated before, the verification ratio rv is proportional to the number of successful
queries, not total queries. We account for this distinction by calculating a factor Q,
which is the fraction of total queries expected to be successful. Say we issue θ queries.
The expected number of successful queries would be Q·θ. Each query will fetch 1/p documents on average, so the total number of documents fetched will be (1/p)·θ. Plugging these values into Equation 5.4 gives

rv = ((1/p)·θ)/(Q·θ) = (1/p)/Q,   i.e.,   rvQ = 1/p   (5.16)
For each system, we give equations for:
• The probability of choosing an authentic document on the first try, p (when
applicable).
• The fraction of total queries expected to be successful, Q.
• The expected verification ratio multiplied by the Q factor.
5.10.1 Random base case
All responses are equally likely to be examined.

p = dA/dT   (5.17)

Q = 1 − (1 − P(D = A))^dT   (5.18)

Expected rvQ = 1/p = dT/dA = (pN(1 − πB) + πB(1 − pB + pBpN)) / (pN(pG(1 − πB) + pBπB))   (5.19)
5.10.2 Select-Best/Weighted ideal case with threshold
We assume all malicious nodes’ ratings fall below the selection threshold and so only
answers from good nodes will be considered. Since we assume all good nodes have the same threat rating, they will all be equally likely to be chosen by both the Select-Best and Weighted methods.
The probability of choosing an authentic document on the first try is

p = dAG/dTG   (5.20)

Q = 1 − (1 − P(DG = A))^dTG   (5.21)

Expected rvQ = 1/p = dTG/dAG = 1/pG   (5.22)
5.10.3 Weighted ideal case without threshold
All responses are considered but are weighted based on the rating of the sending
node. Given we have received dTG responses from good nodes and dTB responses
from bad nodes, then the probability of choosing a response from a good node with
the weighted method is pGdTG/(pGdTG + pBdTB). Likewise, the probability of choosing a response from a bad node is pBdTB/(pGdTG + pBdTB). The probability that a random document from a good node is authentic is simply dAG/dTG, and the probability that a document from a bad node is authentic is dAB/dTB. Combining these formulas we get the probability of choosing an authentic document using the weighted method:

p = (pGdTG)/(pGdTG + pBdTB) · dAG/dTG + (pBdTB)/(pGdTG + pBdTB) · dAB/dTB = (pGdAG + pBdAB)/(pGdTG + pBdTB)   (5.23)

Q = 1 − (1 − P(D = A))^dT   (5.24)

Expected rvQ = (pGdTG + pBdTB)/(pGdAG + pBdAB) = (pGpN(1 − πB) + pBπB(1 − pB + pBpN)) / (pG²pN(1 − πB) + pB²pNπB)   (5.25)
5.10.4 Select-Best ideal case without threshold
With no threshold all responses may be looked at. First, the responses from good
nodes will be checked one at a time. If no authentic document is found, then the
responses from bad nodes are checked. If there are responses from good nodes, then
the probability of locating an authentic document on the first try is P (DG = A), or
pG, on the second try (1− pG)pG, on the third try (1− pG)2pG, and so on. Let γ be
the number of responses from good nodes (dTG), and β be the number of responses
from bad nodes (dTB). The probability of the first authentic document found being
the first document from a bad node checked would be (1 − pG)γP (DB = A). The
probability of the first authentic document found being the second document from a
bad node checked would be (1− pG)γ(1− P (DB = A))P (DB = A).
We can now calculate the expected number of documents which must be down-
loaded and checked per query:
Expected rvQ = Σ_{k=1..γ} k pG(1 − pG)^(k−1) + Σ_{k=1..β} (γ + k)(1 − pG)^γ P(DB = A)(1 − P(DB = A))^(k−1)   (5.26)
By applying the well-known equations [77]

Σ_{0≤j≤n} a x^j = a(1 − x^(n+1))/(1 − x)   and   Σ_{0≤j≤n} a j x^j = a(n x^(n+2) − (n + 1)x^(n+1) + x)/((x − 1)²)   (5.27)
on Equation 5.26 and simplifying we obtain
E[rvQ] = 1/pG + (1 − pG)^γ [1/P(DB = A) − 1/pG − (1 − P(DB = A))^β (1/P(DB = A) + β + γ)]   (5.28)
Substituting dTG and dTB for γ and β gives us

E[rvQ] = 1/pG + (1 − pG)^dTG [1/P(DB = A) − 1/pG − (1 − P(DB = A))^dTB (1/P(DB = A) + dTB + dTG)]   (5.29)
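Equation 5.28 can be verified by evaluating the double sum of Equation 5.26 directly; a sketch using integer γ and β:

```python
def direct_sum(p_G, p_ba, gamma, beta):
    """Expected checks per Eq. 5.26: good nodes' responses are tried
    first, then responses from bad nodes. p_ba is P(DB = A)."""
    s = sum(k * p_G * (1 - p_G) ** (k - 1) for k in range(1, gamma + 1))
    s += sum((gamma + k) * (1 - p_G) ** gamma
             * p_ba * (1 - p_ba) ** (k - 1)
             for k in range(1, beta + 1))
    return s

def closed_form(p_G, p_ba, gamma, beta):
    """Eq. 5.28, the simplified form of the double sum."""
    return (1 / p_G + (1 - p_G) ** gamma
            * (1 / p_ba - 1 / p_G
               - (1 - p_ba) ** beta * (1 / p_ba + beta + gamma)))

# The direct sum and the closed form agree.
assert abs(direct_sum(0.9, 0.03, 20, 50) - closed_form(0.9, 0.03, 20, 50)) < 1e-9
```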
As with any system with no threshold, the Q factor is
Q = 1 − (1 − P(D = A))^dT   (5.30)
5.10.5 Select-Best/Weighted local reputation system with threshold
In the long run we would expect the local system's standard reputation matrix, R′, to converge to the threat matrix T. Therefore the local system in steady-state would approximate the ideal case. One exception arises in the scenario with a threshold. All bad nodes would eventually be rated below the threshold and ignored, just as in the ideal case. In addition, a fraction of good nodes would fall below this threshold and be ignored, owing to the fact that good nodes do reply with fake documents with probability (1 − pG). For very large pG and small ρT we approximate the fraction of good nodes which fall below the threshold by the fraction of good nodes which reply with a fake document the first time they reply to a particular node's queries, namely (1 − pG). Thus only a pG fraction of the responses from good nodes will be considered for both Select-Best and Weighted with a threshold.
p = (pG dAG)/(pG dTG)   (5.31)

Q = 1 − (1 − P(DG = A))^(pG dTG)   (5.32)

Expected rvQ = 1/p = (pG dTG)/(pG dAG) = 1/pG   (5.33)
Although the long-term efficiency of this system is almost equal to that of the ideal system (note the difference in Q), its effectiveness is reduced because of the additional exclusion of the small percentage of good nodes.
5.10.6 Weighted local system without threshold
With no threshold the local system should approximate T more closely than with a threshold, since good nodes which reply with a fake document their first time will be given a second chance. All responses are considered but are weighted based on the
rating of the sending node. The equations for p, Q, and the expected rvQ are given
by Eq. 5.23, 5.24, and 5.25, respectively.
5.10.7 Select-Best local system
As with the Weighted procedure the Select-Best local system with no threshold will
perform like the corresponding ideal case variant. Thus the equations for expected
rvQ and Q are given by Eq. 5.29 and 5.30, respectively.
5.11 Comparison of Statistical Analysis to Simulation Results
Using the equations derived above, we verify the correctness of our simulator by
comparing our simulation results to the expected value at steady state for the same
parameter values. In Figure 5.17 we graph the expected steady-state rv as a function
of three threat model parameters for the random base case, and the three variants of
the ideal case discussed above. These equations are represented by the line curves.
We reran the experiments from Section 5.6.3 for the base case and the ideal cases
with a modification to the document model, which we discuss below. We chose only
to test the base case and the ideal variants because those systems attain steady state
from the beginning. Unlike the local system, they gather no new information over
time, so their behavior remains constant. We plot these results as the datapoints in
the graphs of Figure 5.17. Clearly, the simulation results closely match the curves of
the expected performance.
If we compare the graphs in Figure 5.17 with those in Figure 5.4 we see that the
curves are similar in shape and relative proportions, but the absolute values are much larger in Figure 5.4. This difference is due to the modified document base model used
for the simulations for Figure 5.17. In these experiments we use a simplified model where each node has an equal probability, pN, of matching any query. pN was set to 0.1090, the expected probability of a node matching a query derived experimentally from our realistic document base model used in all other simulations.
That model is dependent on three Zipf distributions (query popularity, query selection
power, and number of documents per node). When comparing the simulator to our
derived equations it is necessary to substitute this simplified document model because
of a similar simplification which we made in our equations.
The formulas all use the expected value pN in place of the linearly dependent random variable pN(d, q).

Figure 5.17: Comparison of expected steady-state system behavior with a 1000-query simulation using a uniform document base with pN = 0.1090. Lines represent expected performance based on the analysis; points are the results of the corresponding simulations. (a) Verification ratio rv as a function of πB; (b) as a function of pB; (c) as a function of pG.

This simplification allowed us to derive relatively simple formulas (otherwise the formulas would be as complex as the simulator itself). Unfortunately, this
mitigates the effects of certain conditions, such as when a node receives no authentic
replies whatsoever in its response set. With approximately 4000 nodes hearing each
query, and each node having, on average, almost an 11% chance of matching the query,
it would seem unlikely that no authentic document would be received. But because of
the heavy-tail nature of Zipf distributions, this condition occurs much more frequently
than one might expect and should not be ignored. This shows the importance of
running complex simulations rather than relying solely on statistical derivations, whose seemingly insignificant simplifications can greatly affect the validity of one's conclusions.
5.12 Related Work
Extensive research has been done on general issues of reputation (e.g., [65, 72, 88]). Much work has been done in the area of locating reputable nodes in resource-sharing peer-to-peer networks, and many interesting reputation systems have been proposed (e.g., [37, 83, 61]). Here we describe a few related examples.
Reference [48] presents a game theoretical model, based on the prisoner’s dilemma,
for analyzing the social cost of allowing nodes to freely change identities. It proposes a mechanism, based on a centralized trusted intermediary, that ensures each user is assigned only one system identifier yet protects users' anonymity, so that even the intermediary does not know which identifier was assigned to which node.
In [38] Douceur discusses the problem of preventing users from using multiple
identities in a system with no trusted central agency (the Sybil attack). He presents
methods for imposing computational cost on identity creation and lists system con-
ditions necessary to limit the number of identities peers can generate.
In [83], Lai et al. propose a reciprocative incentive strategy to combat freeriders,
based on the Evolutionary Prisoner's Dilemma [9]. They compare the performance
of private history versus shared history and develop an adaptive stranger response
strategy that balances punishing whitewashers with overly taxing new nodes.
Reference [73] presents EigenTrust, a system to compute and publish a global
reputation rating for each node in a network using an algorithm similar to PageR-
ank [103]. Reputation statistics for each node are maintained at several nodes across
a content-addressable network to mitigate the effects of bad nodes colluding.
In [29], Damiani et al. enhance their previous work on reputation [25] by propos-
ing the concept of resource reputation, where peers give opinions on a resource’s
authenticity based on its reported digest. This technique complements the process of
maintaining peer reputations, which is still necessary in situations where the resource
is rare and no other peers have encountered it.
5.13 Conclusion
We have compared two practical identity infrastructures for peer-to-peer resource-
sharing environments. A centralized trusted login server that ties nodes’ network
pseudo-identities to their real-world identities provides better support for reputation
systems by preventing nodes from quickly changing identities. However, this benefit
comes at a high management cost and requires users to disclose information to a level
which they may not find acceptable. The decentralized approach, where each node
generates its own identity, provides a higher level of anonymity while simultaneously
preventing identity hijacking, at the cost of no enforced identity persistence for ma-
licious nodes. Though we have concentrated on two distinct identity models, many
practical solutions fall in a spectrum between them (such as providing incentives for
persistent identities) and perform accordingly.
Our results show that even simple reputation systems can work well in either of the
two identity schemes when compared to no reputation system. In environments where
system identities are generated by the peers themselves, all unknown nodes should be
regarded as malicious. But, if a centralized login authority enforces identities tied to
real world identities, then the optimal reputation for unknown nodes is nonzero. In
addition, certain techniques, such as using a selection threshold, provide large benefits
in efficiency for one identity scheme, but are ineffectual for the other.
We have presented a simple voting-based reputation system that significantly mit-
igates the deleterious effects of malicious nodes, by sharing information with a small
group of nodes. Even with 40% of the network attempting to subvert 90% of the
resources, a node would expect to make only two attempts before locating a good
provider, though this rises to four or five tries if the system is vulnerable to
whitewashing.
We compared two methods for selecting providers given reputation information
and showed that, while one provides better efficiency, it also significantly skews the
load on the well-behaved nodes in the network. Depending on the amount of hetero-
geneity in the network this may be acceptable. We also show how the Friend-Cache
developed for the reputation system can be applied to significantly reduce message
traffic in unstructured peer-to-peer networks.
Finally, we present two distinct threat models, allowing us to simulate a variety
of malicious behaviors. Both models affect system performance quite similarly, and
reputation system results under one model are proportionally equivalent under the other. This
allows us to compare reputation systems using whichever model is most convenient
and expect similar results from the other.
Chapter 6
SPROUT: P2P Routing with
Social Networks
Social networks are everywhere. Many people all over the world participate online in
established social networks every day. AOL, Microsoft, and Yahoo! all provide instant
messaging services to millions of users, alerting them when their friends log on. Many
community websites, such as Friendster [49], specialize in creating and utilizing social
networks. As another example, service agreements between ISPs induce a “social”
network through which information is routed globally. Social networks are valuable
because they capture trust relationships between entities. By building a P2P data-
management system “on top of”, or with knowledge of, an existing social network,
we can leverage these trust relationships in order to support efficient, reliable query
processing.
Several serious problems in peer-to-peer networks today are largely due to lack of
trust between peers. Peer anonymity and the lack of a centralized enforcement agency
make P2P systems vulnerable to a category of attacks we call misrouting attacks. We
use the term misrouting to refer to any failure by a peer node to forward a message
to the appropriate peer according to the correct routing algorithm. Failures include
dropping the message or forwarding the message to other colluding nodes instead of
the correct peer, perhaps in an attempt to control the results of a query. For instance,
in a distributed hash table (DHT), a malicious node may wish to masquerade as the
index owner of the key being queried for in order to disseminate bad information and
suppress content shared by other peers.
In addition, malicious users can acquire several valid network identifiers and thus
control multiple distinct nodes in the network. This is referred to as the Sybil attack
and has been studied by various groups (e.g., [48, 38, 91]); it is also discussed in Chapter 5.
This implies that a small number of malicious users can control a large fraction of the
network nodes, increasing the probability that they participate in any given message
route.
Using a priori relationship knowledge may be key to mitigating the effects of
misrouting. To avoid routing messages through possibly malicious nodes, we would
prefer forwarding our messages through nodes controlled by people we know person-
ally, perhaps from a real life social context. We could assume our friends would not
purposefully misroute our messages. 1 Likewise, our friends could try and forward our
message through their friends’ nodes. Social network services provide us the mech-
anism to identify who our social contacts are and locate them in the network when
they are online.
Misrouting is far from the only application of social networks to peer-to-peer
systems. Social networks representing explicit or implicit service agreements can also
be used to optimize quality of service by, for example, minimizing latency. Peers may
give queue priority to packets forwarded by friends or partners over those of strangers.
Thus, the shortest path through a network is not necessarily the fastest.
¹In our study, we assume a slim but nonzero chance (5%) that a virus or trojan has infected their machine, causing it to act maliciously (see Sec. 6.1.1).
In Section 6.1 we present a high-level model for evaluating the use of social net-
works for peer-to-peer routing, and apply it to the two problems we described above:
yielding more query results and reducing query times.
Unstructured networks can be easily molded to conform to the social links of their
participants. OpenNap, for example, allows supernodes to restrict themselves to link-
ing only with reputable or “friendly” peer supernodes, who manage message propa-
gation and indexing. However, structured networks, such as DHTs, are less flexible,
since their connections are determined algorithmically, and thus it is more challenging
to use social networks in such systems. In Section 6.2 we propose SPROUT, a routing
algorithm which uses social link information to improve DHT routing performance
with respect to both misrouting and latency. We then analyze and evaluate both our
model and SPROUT in Section 6.3.
Social networks can be exploited by P2P systems for a variety of other reasons.
In Section 6.4 we discuss application scenarios where our model is useful, as well as
other related and future work. Finally, we conclude in Section 6.5.
This work has been published as [96] and [89].
6.1 Trust Model
The basic intuition is that computers managed by friends are not likely to be selfish or
malicious and deny us service or misroute our messages. Similarly, friends of friends
are also unlikely to be malicious. Therefore, the likelihood of a node B purposefully
misrouting a message from node A is proportional to (or some function of) the distance
from A’s owner to B’s owner in the social network. Observe that in a real network
with malicious nodes, the above intuition cannot hold simultaneously for all nodes;
neighbors of malicious nodes, for example, will find malicious nodes close to them.
Rather our objective is to model trust from the perspective of a random good node
in the network. Likewise, we assume messages forwarded over social links would
experience less latency on average because of prioritizing based on friendship or service
agreements.
We now describe a flexible model for representing the behavior of peers relative
to a node based on social connections. We will illustrate the model usage for two
different specific issues: minimizing the risk of misrouting, and decreasing latency to
improve Quality of Service.
6.1.1 Trust Function
We express the trust that a node A has in peer B as T (A,B). Based on our assump-
tion, this value is dependent only on the distance (in hops) d from A to B in the
social network. To quantify this measure of trust for the misrouting scenario, we use
the expected probability that node B will correctly route a message from node A.
The reason for this choice will become apparent shortly.
One simple trust function would be to assume our friends’ nodes are very likely to
correctly route our messages, say with probability f = 0.95. But their friends are less
likely (0.90), and their friends even less so (0.85). Note that this is not the probability
that the peer forwards each packet, but rather the probability that the peer is not
misbehaving and dropping all packets; averaged over all nodes, the two are equivalent.
A node’s trustworthiness decreases linearly with respect to its distance from us in the
social network. This would level off when we hit the probability that any random
stranger node (far from us in the social network) will successfully route a message, say
r = 0.6. For large networks with large diameters, the probability r represents the fraction
of the network made up of good nodes willing to correctly route messages. Thus, r =
0.6 means that we expect that 40% of the network nodes (or more accurately network
node identifiers) will purposefully misroute messages. Here we have presented a linear
trust function. We consider others in Section 6.3.3.
Note that in this example the probability of a friend routing correctly is only 0.95
and not 100%. This value accounts for friends who do not wish us harm, but whose
computers may have been unknowingly subverted by an adversary, perhaps through
virus infection. However, we may assume that our friends are less likely to allow their
machines to be infected than a random stranger.
In addition, using social links that connect to known individuals helps reduce
the threat of Sybil attacks [38]. Assume all machines in the network have an equal
probability p of being subverted by an adversary. A friend’s computer will thus have
been subverted, and be acting maliciously, with probability p. However, if we
connect to a random peer in the network, we expect the probability of that peer being
malicious to be much higher. The adversary may have each subverted computer
posing as multiple peers by registering multiple IDs in the network. Therefore, by
relying on nodes discovered through the social network we are limiting the power of
the adversary to his/her physical presence in the network as opposed to his/her virtual
presence, which may be much larger. This technique is used in other P2P applications
to mitigate the effectiveness of malicious attacks. For example, the LOCKSS digital
preservation system uses a friends list in order to reduce the influence of an adversary
who registers many virtual identities in order to poison the reference lists of well-
behaved peers [87].
When measuring QoS we would want to use a very different function. Let T (A,B)
be the expected additional latency incurred by a message forwarded through node B,
which it received from node A. For simplicity, let us assume that T(A,B) = ε if a
social link exists between A and B, and ∆ · ε otherwise. For example, assume ε =
1 and ∆ = 3. If A has a service agreement with, or is friends with, B, then B gives any
message it receives from A priority and forwards it in about 1 ms; otherwise the message
is placed in a queue and takes on average 3 ms. We will use these same values for ε
and ∆ in our example below and in our analysis in Section 6.3.6.
We do not claim that any of these functions, with any specific parameter values, accurately
represents trust in any or all social networks, but they do serve to express
the relationship we believe exists between social structure and the quality of routing.
6.1.2 Path Rating
We wish to use our node trust model to compare peer-to-peer routing algorithms.
For this we need to calculate a path trust rating P to use as our performance metric.
The method for calculating P will be application-dependent (and we will present two
specific examples below), but a few typical decisions that must be made are:
1. Source-routing or hop-by-hop? Will the trust value of a node on the path be a
function of its social distance from the message originator, or only of its distance
from the node from which it received the message directly?
2. How do you combine node trust? Is the path rating the product, sum, maximum
value, or average value of the node trust values along the path? Any appropriate
function could be used.
We now give as an example a metric for reliability in the presence of misrouting.
We need to compare the likelihood that a message will reach its destination given
the path selected by a routing algorithm. We calculate the reliability path rating
by multiplying the separate node trust ratings for each node along the path from
the source to destination. For example, assume source node S wishes to route a
message to destination node D. In order to do so a routing algorithm calls for the
message to hop from S to A, then B, then C, and finally D. Then the reliability
path rating will be PR = T(S,A) · T(S,B) · T(S,C) · T(S,D). Given that T(X,Y) is
interpreted as the actual probability node Y correctly routes node X’s message, then
PR is the probability that the message is received and properly handled by D. Note
that T (X,Y ) is dependent only on the shortest path in the social network between
X and Y and thus independent of whether Y was the first, second, or nth node along
the path.
Including the final destination’s trust rating is optional and dependent on what
we are measuring. If we wish to account for the fact that the destination may be
malicious and ignore a message, we include it. Since we are using path rating to
compare routing algorithms going to the same destination, both paths will include
this factor, making the issue irrelevant.
For the Quality of Service we would want our path rating to express the expected
time a message would take to go from the source to the destination. Given that
T (A,B) is the latency incurred by each hop we would want to use an additive function.
And if each node decides whether to prioritize forwarding based on who it received
the message from directly, and not the originator, then the function would be hop-
by-hop. Calculating the latency path rating for the path used above gives PL =
T(S,A) + T(A,B) + T(B,C) + T(C,D).
Though we focus on linear paths in this chapter, the rating function can generalize
to arbitrary routing graphs, such as multicast trees.
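To make these two ratings concrete, the following sketch (ours, not the simulator used later in this chapter) computes the multiplicative reliability rating and the additive latency rating for the S → A → B → C → D example, using the linear trust values (f = 0.95, r = 0.6) and the latency parameters (ε = 1, ∆ = 3) from Section 6.1.1:

```python
def linear_trust(d, f=0.95, r=0.6):
    """Linear trust: drops by (1 - f) per social hop, floored at r."""
    return max(1 - (1 - f) * d, r)

def reliability_rating(social_dist, source, path):
    """Source-routed and multiplicative: the probability that every hop
    (including the destination) correctly handles the source's message."""
    p = 1.0
    for node in path:
        p *= linear_trust(social_dist(source, node))
    return p

def latency_rating(social_dist, source, path, eps=1.0, delta=3.0):
    """Hop-by-hop and additive: friends (social distance 1) forward in
    eps ms; strangers queue the message for delta * eps ms."""
    total, prev = 0.0, source
    for node in path:
        total += eps if social_dist(prev, node) == 1 else delta * eps
        prev = node
    return total

# Toy social distances for the path S -> A -> B -> C -> D.
dist = {("S", "A"): 1, ("S", "B"): 2, ("S", "C"): 3, ("S", "D"): 4,
        ("A", "B"): 1, ("B", "C"): 2, ("C", "D"): 1}
sd = lambda x, y: dist[(x, y)]

# P_R = T(S,A) * T(S,B) * T(S,C) * T(S,D) = 0.95 * 0.90 * 0.85 * 0.80
print(reliability_rating(sd, "S", ["A", "B", "C", "D"]))
# P_L = 1 + 1 + 3 + 1 = 6.0 ms with these distances
print(latency_rating(sd, "S", ["A", "B", "C", "D"]))
```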
6.2 Social Path Routing Algorithm
We wish to leverage the assumed correlation between routing reliability or efficiency
and social distance by creating a peer-to-peer system that utilizes social information
from a service such as a community website or instant messenger service. Though
there are many ways to exploit social links, for this chapter, we focus on building a
distributed hash table (DHT) routing algorithm. Specifically, we build on the basic
Chord routing algorithm [127]. Chord was chosen because it is a well-known scheme
and studies have shown it to provide great static resilience, a useful property in a
system with a high probability of misrouting that is difficult to detect and repair [60].
Our technique is equally applicable to other DHT designs, such as CAN [112] or
Pastry [118].
When a node first joins the Chord network, it is randomly assigned a network
identifier between 0 and 1. It then establishes links to its sequential neighbors in idspace,
forming a ring of nodes. It also makes roughly log2 n long links to nodes halfway
around the ring, a quarter of the way, an eighth, etc. When a node inserts or looks up
an item, it hashes the item’s key to a value between 0 and 1. Using greedy clockwise
routing, it can locate the peer whose ID is closest to the key’s hash (and is thus
responsible for indexing the item) in O(log n) hops. For simplicity, we will use “key”
to refer to a key’s hash value in this chapter.
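Greedy clockwise Chord lookup can be sketched as follows (a simplification assuming a static node set and ideal finger tables; this is not the full Chord protocol, and the function names are ours):

```python
import bisect
import math

def cw(a, b):
    """Clockwise distance from a to b on the unit ring."""
    return (b - a) % 1.0

def chord_route(nodes, src, key):
    """Greedy clockwise routing: at each step, hop to the known node that
    gets closest to the key without overshooting its owner."""
    nodes = sorted(nodes)
    n = len(nodes)

    def successor(x):
        # First node clockwise from point x (the owner of x).
        return nodes[bisect.bisect_left(nodes, x) % n]

    def fingers(cur):
        # Ideal fingers: successors of cur + 1/2, 1/4, ..., 1/n.
        return [successor((cur + 2.0 ** -k) % 1.0)
                for k in range(1, int(math.log2(n)) + 1)]

    path, cur, owner = [], src, successor(key)
    while cur != owner:
        cands = [f for f in fingers(cur)
                 if f != cur and cw(cur, f) <= cw(cur, key)]
        if cands:
            cur = min(cands, key=lambda f: cw(f, key))
        else:
            cur = owner  # no finger precedes the key: the successor owns it
        path.append(cur)
    return path

# 8 evenly spaced nodes; a lookup takes O(log n) hops.
print(chord_route([i / 8 for i in range(8)], 0.0, 0.7))
```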
Our Social Path ROUTing (SPROUT) algorithm adds to Chord additional links to
any friends that are online. All popular instant messenger services keep a user aware
of when their friends enter or leave the network. Using this existing mechanism, a
node can determine when its friends’ nodes are up and form links to them in the
DHT as well. This provides it with several highly trusted links to use for routing
messages. When a node needs to route to key k, SPROUT works as follows:
1. Locate the friend node whose ID is closest to, but not greater than, k.
2. If such a friend node exists, forward the message to it. That node repeats the
procedure from step 1.
3. If no friend node is closer to the destination, then use the regular Chord algo-
rithm to continue forwarding to the destination.
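One forwarding decision under this procedure can be sketched as follows (a schematic; friends() and chord_next_hop() are assumed helpers returning the current node's online friend links and the regular Chord next hop):

```python
def cw(a, b):
    """Clockwise distance from a to b on the unit ring."""
    return (b - a) % 1.0

def sprout_next_hop(cur, key, friends, chord_next_hop):
    """Steps 1-3 of SPROUT: prefer the friend whose ID is closest to the
    key without passing it; otherwise fall back to regular Chord routing."""
    cands = [f for f in friends(cur) if cw(cur, f) <= cw(cur, key)]
    if cands:                                   # steps 1-2: best friend hop
        return min(cands, key=lambda f: cw(f, key))
    return chord_next_hop(cur, key)             # step 3: Chord fallback

# With friends at 0.3, 0.6, and 0.8, a message for key 0.7 goes to 0.6;
# 0.8 would overshoot the key.
print(sprout_next_hop(0.0, 0.7, lambda n: [0.3, 0.6, 0.8],
                      lambda n, k: "chord"))
```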
6.2.1 Optimizations
Here we present two techniques to improve the performance of our routing algorithm.
We evaluate them in Section 6.3.2.
Lookahead
With the above procedure, when we choose the friend node closest to the destination
we do not know if it has a friend to take us closer to the destination. Thus, we may
have to resort to regular Chord routing after the first hop. To improve our chances of
finding social hops to the destination we can employ a lookahead cache of 1 or 2 levels.
Each node may share with its friends a list of its friends and, in 2-level lookahead,
its friends-of-friends. A node can then consider all nodes within 2 or 3 social hops
away when looking for the node closest to the destination. We still require that the
message be forwarded over the established social links.
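A lookahead cache can be sketched as a bounded expansion of the social graph that remembers, for each node it finds, the first-hop friend a message must actually traverse to reach it (a sketch; the friends adjacency map is hypothetical):

```python
def lookahead_set(cur, friends, levels=1):
    """Nodes within levels + 1 social hops of cur, mapped to the first-hop
    friend through which a message toward them must be forwarded."""
    reachable = {}                       # node -> first social hop toward it
    frontier = [(f, f) for f in friends[cur]]
    for _ in range(levels + 1):
        nxt = []
        for node, first in frontier:
            if node != cur and node not in reachable:
                reachable[node] = first
                nxt.extend((g, first) for g in friends.get(node, []))
        frontier = nxt
    return reachable

# 1-level lookahead sees friends and friends-of-friends:
friends = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": ["F"], "E": []}
print(lookahead_set("A", friends, levels=1))
```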
Minimum Hop Distance
Though SPROUT guarantees forward progress towards the destination with each hop,
it may happen that at each hop SPROUT finds the sequential neighbor is the closest
friend to the target. Thus, in the worst case, routing is O(n).
To prevent this we use a minimum hop distance (MHD) to ensure that the follow-
ing friend hop covers at least MHD fraction of the remaining distance (in idspace) to
the destination. For example, if MHD = 0.25, then the next friend hop must be at
least a quarter of the distance from the current node to the destination. If not then
we resort to Chord routing, where each hop covers approximately half of the distance.
This optimization guarantees us O(log n) hops to any destination but causes us to
give up on using social links earlier in the routing process. When planning multiple
hops at once, due to lookahead, we require the path to cover MHD/k additional
distance for each additional hop, for some appropriate k.
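The MHD test can be sketched as follows (a sketch under our reading that a lookahead path must cover an extra MHD/k fraction per additional planned hop; the parameter names are ours):

```python
def passes_mhd(cur, hop_end, key, mhd=0.25, hops=1, k=2.0):
    """Accept a planned friend path (hops social hops ending at hop_end)
    only if it covers enough of the remaining idspace distance to key:
    an MHD fraction, plus MHD / k more per additional planned hop."""
    remaining = (key - cur) % 1.0
    covered = (hop_end - cur) % 1.0
    required = mhd + (hops - 1) * mhd / k
    return covered >= required * remaining

# With MHD = 0.25, a single hop must cover a quarter of the distance:
print(passes_mhd(0.0, 0.3, 0.8))   # covers 0.3 of 0.8 -> accepted
print(passes_mhd(0.0, 0.1, 0.8))   # covers 0.1 of 0.8 -> rejected
```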
6.3 Results
In this section we evaluate our friend-routing algorithm as well as present optimiza-
tions. We compare SPROUT to regular Chord and Chord augmented with additional
links. We also discuss the trust model and compare different trust functions. We an-
alyze the effects of misrouting on both structured and unstructured search networks.
Finally, we apply SPROUT to QoS and reducing path latency.
6.3.1 Simulation Details
To test our SPROUT algorithm for DHTs, we compare it to Chord in the following
scenario. Assume the members of an existing social network wish to share files or
information by creating a distributed hash table. Believing that some peers in the network
are unreliable, each node would prefer to route messages through their friends’ nodes
if possible. We use two sources for social network data for our simulations. The first
is data taken from the Club Nexus community website established at Stanford Uni-
versity [19]. This dataset consists of over 2200 users and their links to each other as
determined by their Buddy Lists. The second source was a synthetic social network
generator based on the Small World topology algorithm presented in [110]. Both the
Club Nexus data and the Small World data created social networks with an average
of approximately 8 links per node. We randomly inserted each social network node
into the Chord idspace.
We also ran experiments using a trace of a social network based on 130,000 AOL
Instant Messenger users and their Buddy Lists provided by BuddyZoo [30]. Because
of the size of this dataset, we have only used the data to verify results of our other
experiments.
For each experiment we randomly chose a query source node and a key hash value
to look up (chosen uniformly from 0 to 1). We compute a path using each routing
algorithm and gather statistics on path length and path rating. Each data point
presented below is the average of 1,000,000 such query paths.

Table 6.1: SPROUT vs. Chord

                     Avg. Path Length   Avg. Reliability
  Regular Chord            5.343             0.3080
  Augmented Chord          4.532             0.3649
  SPROUT(1,0.5)            4.569             0.4661
6.3.2 Algorithm Evaluation
We first focus on the problem of misrouting. We use the linear trust function described
in Section 6.1 with f = 0.95 and r = 0.6, which corresponds to 40% of the nodes
misbehaving. We feel such a large fraction of bad nodes is reasonable because of
the threat of Sybil attacks [38]. We evaluate different trust functions and parameter
values in Section 6.3.3.
We compare SPROUT, using a lookahead of 1 and MHD = 0.5, to Chord using
the Club Nexus social network data. The first and third rows of Table 6.1 give the
measured values for both the average path length and average reliability path rating
of both regular Chord routing and SPROUT. With an average path length of 5.343
and average reliability of 0.3080, Chord performed much worse in both metrics than
SPROUT, which attained values of 4.569 and 0.4661, respectively. In fact, a path is
over 1.5 times as likely to succeed using standard SPROUT as with regular Chord.
But this difference in performance may be simply due to having additional links
available for routing, and the fact that they are friend links may have no effect
on performance. To equalize the comparison we augmented Chord by giving nodes
additional links to use for routing. Each node was given as many additional random
links as that node has social links (which SPROUT uses). Thus, the total number of
links usable at each node is equal for both SPROUT and augmented Chord. The
performance of the augmented Chord (AC) is given in the second row of Table 6.1.

Table 6.2: Evaluating lookahead and MHD

           No lookahead       1-level            2-level
  MHD     Length  Rating   Length  Rating   Length  Rating
  0        4.875  0.4068    5.101  0.4420    5.378  0.4421
  0.125    4.805  0.4070    5.003  0.4464    5.258  0.4478
  0.25     4.765  0.4068    4.872  0.4525    5.114  0.4551
  0.5      4.656  0.4033    4.569  0.4661    4.757  0.4730
As expected, with more links to choose from AC performs significantly better than
regular Chord, especially in terms of path length. But SPROUT is still 1.3 times as
likely to route successfully. In the following sections we compare SPROUT only to
the augmented Chord algorithm.
How were the lookahead and MHD values used above chosen? Table 6.2 shows the
results of our experiments in varying both parameters in the same scenario. As we
see, the largest increase in path rating comes from using a 1-level lookahead. But this
comes at a slight cost in average path length, due to the fact that more lookahead
allows us to route along friend links for more of the path. For example, for MHD
= 0.5, no lookahead averaged 0.977 social links per path, while 1-level lookahead
averaged 2.533 and 2-level averaged 3.491. Friend links tend to not be as efficient as
Chord links, so forward progress may require 2 or 3 hops, depending on the lookahead
depth. But friend links are more likely to reach nodes closer to the sending node in
the social network.
Increasing MHD limits the choices in forward progressing friend hops, causing
the algorithm to switch to Chord earlier than otherwise, but mitigates inefficient
progress. A large MHD seems to be most effective at both shortening path lengths
and increasing path ratings. This is not very surprising: since our reliability function
is multiplicative, each additional link appreciably drops the path reliability.
Figure 6.1: Performance of SPROUT and AC in different-size Small World networks.
The third curve shows the relative performance of SPROUT with respect to AC,
plotted on the right-hand y-axis. Note that the x-axis is log-scale.
From these results we chose to use a 1-level lookahead and an MHD of 0.5 for our
standard SPROUT procedure. Though 2-level lookahead produced slightly better
reliability, we did not feel it warranted the longer route paths and the exponentially
increased node state propagation and management. Our available social network data
indicates that a user has on average between 8 and 9 friends. Thus, we would expect
most nodes’ level-1 lookahead cache to hold fewer than 100 entries.
The path ratings presented above were relatively small, indicating a low, but per-
haps acceptable, probability of successfully routing to a destination in the DHT. If
the number of friends a user has remains constant but the total number of network
nodes increases we would expect reliability to drop. As the number of nodes n in-
creases, the average Chord path length increases as O(log n). Each additional node
in a path decreases the path rating. But by how much? To study this issue we ran
our experiment using our synthetic Small World model for networks of different sizes,
but always with an average of around 8 friends per node. We present these results in
Figure 6.1.
As expected, for larger networks the path length increases, thus decreasing overall
reliability. Because the average path length is O(log n) as in Chord, the reliability
drops exponentially with respect to log n. The range of network sizes tested is not
large enough to properly illustrate an exponential curve, giving it a misleadingly
linear appearance. The third curve gives the percent increase in reliability of
SPROUT with respect to augmented Chord. Notice that the reliability advantage of
SPROUT over AC remains relatively constant, and thus the relative (percent)
improvement of SPROUT over AC increases as overall reliability drops. In fact, at
10,000 nodes SPROUT performs over 50% better
than AC. As the network grows, the average number of social links increases slightly.
The benefit SPROUT derives from additional friend links is greater than the benefit
AC derives from additional random links.
6.3.3 Calculating Trust
All of our previous results used a linear trust function with f = 0.95. Of course other
trust functions or parameter values may be more appropriate for different scenarios.
T(A,B), using the linear trust function LT we previously described, is defined in
Equation 6.1 as a function of d, the distance from A to B in the social network.
LT(d) = max(1 − (1 − f) · d, r)        (6.1)
Instead of a linear drop in trust, we may want to model an exponential drop at
each additional hop. For this we use an exponential trust function ET , shown in
Equation 6.2.
ET(d) = max(f^d, r)        (6.2)
Another simple function we call the step trust function ST (d) assigns an equal
high trustworthiness of f to all nodes within h hops of us and the standard rating of
r to the rest. Equation 6.3 defines the step trust function.

ST(d) = if (d < h) then f else r        (6.3)

In our experiments we set h, the social horizon, to 5.

Figure 6.2: Performance of SPROUT and AC for different trust functions and varying f. Higher value is better.
All three functions are expressed so that f is the rating assigned to nodes one hop
away in the social network, the direct friends. In Figure 6.2 we graph both routing
algorithms under all three trust functions as a function of the parameter f .
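The three trust functions of Equations 6.1–6.3 translate directly into code (shown with the defaults used in our experiments: f = 0.95, r = 0.6, h = 5):

```python
def LT(d, f=0.95, r=0.6):
    """Linear trust (Eq. 6.1): loses (1 - f) per social hop, floored at r."""
    return max(1 - (1 - f) * d, r)

def ET(d, f=0.95, r=0.6):
    """Exponential trust (Eq. 6.2): multiplies by f per hop, floored at r."""
    return max(f ** d, r)

def ST(d, f=0.95, r=0.6, h=5):
    """Step trust (Eq. 6.3): f inside the social horizon h, r beyond it."""
    return f if d < h else r

# Friends (d = 1) rate f under all three; distant strangers rate r.
print(LT(1), ET(1), ST(1))
print(LT(20), ET(20), ST(20))
```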
We see here that both the linear (LT) and exponential (ET) trust functions perform
equivalently, while the step trust function (ST) shows less performance variation as
f changes. The key observation here is that SPROUT demonstrates a clear
improvement over augmented Chord for a whole variety of trust functions, especially for
f values greater than 0.85. For example, at f = 0.96 using the exponential function,
SPROUT succeeds in routing 47% of the time, while AC succeeds only 38% of the
time. Thus, even if
one does not know precisely the trust function, one can expect SPROUT to perform
substantially better.
We also varied r, the perceived reliability of random unknown nodes in the network,
and present the results in Figure 6.3. We find that for values of r < 0.75, path ratings
remained unchanged. Above 0.75, both algorithms’ ratings steadily increased. When
5% or less of unknown peers are likely to misroute (r ≥ 0.95), both algorithms perform
equally well, even with f also at 0.95, so that we trust our friends no more than any
stranger. This means that while SPROUT significantly improves path reliability in
a peer-to-peer network with many malicious and selfish peers, we suffer no
appreciable penalty for using it in a network with very few bad peers.

Figure 6.3: Performance of SPROUT and AC for varying r, the probability of strangers routing correctly.
6.3.4 Number of Friends
In a given network, a node with more friends is likely to perform better since it has
more choices of social links to use. But how much better? How much improvement
would a node expect to gain by establishing some trust relationship with another
node? To quantify this, we generated 100 queries from each node in the Club Nexus
network, calculated its path rating, and grouped and averaged the results based on
the number of social links each node has.
Figure 6.4 shows the results for SPROUT using 0- and 1-level lookahead, as well
as AC for the Club Nexus data. For example, 85 nodes in the network had exactly
10 social links. The average path rating for those 85 nodes when running SPROUT
with 1 lookahead was 0.553. Note that the three curves are linear with respect to the
log of the node degree, indicating exponentially diminishing returns for each
additional social link. For instance, nodes with only 1 social peer attained a reliability
rating of 0.265 with SPROUT with no lookahead, while nodes with 10 social peers
scored 0.471, a difference of 0.206. A node with 10 social peers would need to grow
to over 100 social peers to increase its rating by that same amount (the one node with
103 social links had a rating of 0.663).

Figure 6.4: Performance as a function of a node’s degree. Club Nexus data.
From these curves we can estimate how many links a typical node would need to
have in order to attain a specified level of reliability. For instance, considering the
SPROUT with 1-level lookahead curve, we see that a node would need about 100
social links to attain an average rating of 0.7, and about 600 social links to get a
rating of 0.9.
Though a single node increasing its number of friends does not greatly influence
its performance, what performance can nodes expect if we fix a priori the number of
friend connections each node must have? To analyze this we create a random regular
social network graph of 2500 nodes, in which every node has the same degree, and
vary this degree for each simulation run. The results are shown in Figure 6.5.
The curves correspond to SPROUT with 1-level lookahead and augmented Chord.
[Plot omitted: average reliability (y, 0-0.9) vs. number of links per node (x, log scale 1-1000); curves for Augmented Chord and SPROUT.]
Figure 6.5: Performance of SPROUT and AC for different uniform networks with varying degrees.
As expected, we see that both curves rise more steeply than in the previous graph.
If all nodes add an extra social link the probability of successful routing will rise
more than if only one node adds a link (as seen in Fig. 6.4). But the curves level
off just below 0.9. In fact, similar simulations for larger networks showed the same
results, with reliability leveling off under 0.9 at around 100 social links per node.
This confirms that even at high social degree, each path is expected to take multiple
hops through nodes that are, to some small degree, unreliable. Even if all nodes
were exactly two social hops away from each other, this would yield a reliability of
only 0.95 * 0.9 = 0.855. Therefore, we would not expect a node in the Club Nexus dataset,
as seen in Figure 6.4, to reach 0.9 reliability, even with 600 links.
Though SPROUT provides greater reliability than Chord, neither algorithm per-
forms particularly well. Our results from Table 6.1 showed ratings of less than 0.50,
indicating less than 50% of messages would be expected to reach their destination.
Perhaps DHT routing is incapable of providing acceptable performance when mem-
bers of the network seek to harm it. In the next section, we evaluate the brute force
method of query flooding.
6.3.5 Comparison to Gnutella-like Networks
So far we have limited our analysis of SPROUT to Chord-like DHT routing. We were
also interested in comparing the effects of misrouting on structured P2P networks to
unstructured, flooding-based networks, such as Gnutella. To balance the comparison
we assume the unstructured network’s topology is determined by the social network,
using only its social links, and apply the same linear trust function used before to
calculate the probability that a node forwards a query flood message.
Because querying the network is flooding-based, we cannot use the probability
of reaching a certain destination as our metric. Instead, we would like to find the
expected number of good responses a querying node would receive. For a DHT we
assume a node would receive all or no responses, depending on whether the query
message reached the correct well-behaved index node (we do not consider the problem
of inserting item keys into the DHT caused by misrouting). In an unstructured
network the number of good responses located is equal to the number of responses at
well-behaved nodes reached by the query flood. Because the flood is usually limited
in size by a time-to-live (TTL), even if there are no malicious nodes in the network,
not all query answers will be located.
Using the simulator described in Chapter 5, we modelled a Gnutella-like network
with a topology based on the Club Nexus data and used a TTL of 5, allowing us to
reach the vast majority of the nodes in the network (over 2000 on average). We seeded
the network with files based on empirically collected data from actual networks [119]
and ran 10000 queries for different files from varying nodes, dropping a query message
at a peer with a probability based on the trust function and the shortest path to the
querying node. We averaged across 10 runs (for different file distributions) and present
the results, as a function of r (the expected reliability of a node distant in the social
network), in Figure 6.6.
The top curve, labelled Total, indicates the total number of files in the entire
[Plot omitted: good answers per query (left y, 0-160) and percentage (right y, 0-60) vs. r (x); curves for Total, Flooding, DHT AC, and DHT SPROUT.]
Figure 6.6: Performance of SPROUT and AC versus unstructured flooding.
network matching each query (on average), independent of the routing algorithm used.
This value is approximately 150. The expected number of good answers received for
the DHT curves was calculated as this total number times the expected probability of
reaching the index node storing the queried-for items. Flooding results in significantly
more responses on average, by a factor of almost 2 for small r. More importantly, this
means we would expect to locate at least some good answers via flooding even when the DHT
completely fails. For values of r less than 0.5 all the curves level off. If r = 1 then
we assume no nodes in the network are malicious. Thus DHT outperforms flooding
since it will always locate the index node and retrieve all the available answers.
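The two query models compared above can be sketched in a few lines. The flood below is a simplified stand-in for the Chapter 5 simulator: it forwards a TTL-limited query over a social topology, with each stranger forwarding at a single flat probability (the real simulator uses the distance-based trust function). The function names and the flat-probability model are assumptions for illustration.

```python
import random
from collections import deque

def flood_query(adj, source, ttl, forward_prob, rng):
    # TTL-limited query flood: each reached node rebroadcasts to its
    # neighbors, and a neighbor accepts/forwards the query with
    # probability forward_prob (standing in for the stranger trust r).
    reached = {source}
    frontier = deque([(source, ttl)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == 0:
            continue
        for nbr in adj[node]:
            if nbr not in reached and rng.random() < forward_prob:
                reached.add(nbr)
                frontier.append((nbr, hops - 1))
    return reached

def expected_good_answers(total_answers, p_dht):
    # DHT model: all-or-nothing, so the expectation is the total number
    # of matching answers times P(the query reaches the index node).
    return total_answers * p_dht
```

In the flooding case, the number of good answers is the number of answers held at well-behaved nodes in `reached`; in the DHT case it is simply the product above, which is why the DHT curve collapses to zero as the path success probability does.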
Note that these results are meant to be a rough comparison of these two P2P styles.
The flooding model does not take into account messages dropped due to congestion,
a much larger problem for flooding protocols than for DHTs. In our simulations on
the 2200 node Club Nexus network each query reached, on average, over 2000 nodes.
This indicates the number of messages produced by the flood was even greater (due
to duplicate messages). The DHT algorithm, on the other hand, averaged around 5
messages to reach the index node. Thus, flooding schemes will not scale to very large
networks as well as DHTs.
On the other hand, in the DHT model, we are only considering the probability of
a query message being misrouted. We assume all good answers are inserted at the
correct index node, not taking into account that index insertions may fail just as well
as index queries. If we factor in index insertion failures, the DHT curves would shift
down, further increasing the relative performance difference with flooding.
Though flooding is more costly in terms of processor and network bandwidth
utilization, it is clearly a more reliable method of querying in a network suffering
from some amount of misrouting. A better solution may be a hybrid scheme
that uses DHT routing until misrouting or malicious nodes are detected, then
switches to query flooding. In fact, such a scheme is proposed in [21] and discussed
in Section 6.4.
6.3.6 Latency Comparisons
As we stated before, neither SPROUT nor our social trust model is limited to
studying misrouting. With few modifications our model can be used to evaluate other
issues, such as Quality of Service. If peers prioritized their message queues based on
service agreements and/or social connections we may want to use latency as the metric
for comparing routing algorithms. Using the latency trust function (with ε = 1 and
∆ = 3) and latency path rater we described in Section 6.1, we route messages using
both SPROUT and augmented Chord and see which provides the least latency. We
would expect SPROUT to perform even better with respect to Chord in such systems.
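The additive latency rater just mentioned might be sketched as below. The link costs ε and Δ come from the text; the exact rater is defined in Section 6.1, so the function shape and names here are assumptions.

```python
def path_latency(links, eps=1, delta=3):
    # Additive latency rater: each social (service-agreement) link is
    # assumed to cost eps, each ordinary link delta; a path's rating is
    # the sum of its link costs, and lower is better.
    return sum(eps if is_social else delta for is_social in links)

# With eps=1 and delta=3, a four-hop all-social path still beats a
# two-hop path through strangers:
print(path_latency([True] * 4), '<', path_latency([False] * 2))  # -> 4 < 6
```

This additivity is why, for small Δ, shortening the path can outweigh choosing social links, as the MHD analysis below observes.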
We performed an analysis to determine the optimal MHD for latency-based routing.
As in the misrouting scenario, an MHD of 0.5 performed the best. This is
surprising since the latency path rater is additive, not multiplicative. The difference
from other values of MHD was almost negligible, indicating that for small ∆, where
the costs of social links and regular links are similar, shortening the overall path
outweighs choosing social links. In fact, with a larger ∆ of 10, smaller MHD values
perform significantly better than 0.5.
[Plot omitted: average latency (left y, 0-18) and percentage (right y, 0-60) vs. number of nodes (x, log scale 100-100000); curves for AC, SPROUT, and Percent Improvement (SPROUT over AC).]
Figure 6.7: Latency measurements for SPROUT vs AC w.r.t. network size. Lower is better.
Figure 6.7 shows the average path latency for both SPROUT and augmented
Chord as a function of the network size (using a Small World topology). The
third curve shows the percent decrease in latency attained by switching from AC
to SPROUT. We see that SPROUT results in roughly half (40-60%) the latency of
AC. We would expect SPROUT to deliver messages twice as fast as AC by preferring
to take advantage of service agreements, rather than simply minimizing hop count.
Clearly, Quality of Service issues greatly benefit from routing algorithms which
account for service agreements between peers, as SPROUT does. In fact, real-world
systems which deal with QoS, such as ISPs and phone carriers, base their routing
decisions on service agreements among their peers, though their networks are not as
dynamic as peer-to-peer networks.
6.3.7 Message Load
One problem SPROUT faces is uneven load distribution due to the widely varying
social connectivity of the nodes. Peers with more social links are expected to for-
ward messages for friends at a higher rate than weakly socially connected peers. To
study this issue we measure the number of messages forwarded by each node over all
[Plot omitted: load (y, 0-0.045) vs. node rank (x, log scale 1-10000); curves for Augmented Chord, SPROUT, SPROUT (No Top 10), and SPROUT (Limit 20).]
Figure 6.8: Distribution of load (in fraction of routes) for augmented Chord and SPROUT. Lower is better. Social links were removed for the top 10 highest-connected nodes for the No Top 10 curve. All nodes were limited to at most 20 social links for the Limit 20 curve. Note the log-scale x-axis.
1,000,000 paths for both SPROUT(1,0.5) and augmented Chord. The resulting load
on each node, in decreasing order, is given by the first two curves in Figure 6.8. The
load is calculated as the fraction of all messages a node participated in routing.
The highest loaded node in the SPROUT experiment was very heavily loaded in
comparison to AC (4% vs 0.75%). As expected, a peer’s social degree is proportional
to its load, with the most connected peers forwarding the most messages. Though the
top 200 nodes suffer substantially more load with SPROUT than AC, the remaining
nodes report equal or less load. Because the average path length for SPROUT is
slightly higher than for AC, the total load is greater in the SPROUT scenario. Yet
the median load is slightly lower for SPROUT, further indicating an imbalanced load
distribution.
To analyze the importance of the highly connected nodes we removed the social
links from the top 10 most connected nodes, but kept their regular Chord links and
reran the experiment. As the third curve in Figure 6.8 shows, the load on the most
heavily loaded nodes has dropped, yet remains well above AC. Surprisingly, the
reliability was barely affected, dropping by 2% to 0.4569. If highly connected nodes
were to stop forwarding for friends due to too much traffic, the load would shift to
other nodes and the overall system performance would not be greatly affected.
Instead of reacting to high load, nodes may wish to only provide a limited number
of social links for routing from the start. We limited all nodes to using only at most
20 social links for SPROUT. As we can see from the Limit 20 curve in Figure 6.8, the
load on the highly loaded peers (excluding the most loaded peer) has fallen further,
though not significantly below the No Top 10 scenario. The average path reliability has
dropped only an additional 1.5% to 0.4500.
In the end, it is the system architect who must decide whether the load skew is
acceptable. For weakly connected homogeneous systems, fair load distribution may
be critical. For other systems, improved reliability may be more important. In fact,
one could take advantage of this skew. Adding one highly-connected large-capacity
node to the network would increase reliability while significantly decreasing all other
nodes’ load.
6.4 Related and Future Work
In [21], Castro et al. propose using stricter network identifier assignment and density
checks to detect misrouting attacks in DHTs. They suggest using constrained routing
tables and redundant routing to circumvent malicious nodes and provide more secure
routing. SPROUT is complementary to their approach, simply increasing the probability
that the message will be routed correctly the first time. One technique of theirs
that would be especially useful in our system is their route failure test, based on
measuring the density of network IDs around oneself and the purported destination.
Not only can this technique be used to determine when a route has failed, but it can
also be used to evaluate the trustworthiness of a node's sequential neighbors by comparing
local density to that at random locations in ID space or around friends.
As discussed in Section 6.1, the LOCKSS digital preservation system uses a friends
list containing peers with which one has a priori trust relationships. By including
some peers from this list in the voting process, it reduces the influence of an adversary
who attempts to poison the peers’ reference lists with its many virtual identities [87].
One open question is whether node IDs can be assigned more intelligently to
improve trustworthiness. That is, if identifiers were assigned to nodes based on
the current IDs of their connected friends, what algorithm or distribution for ID
assignment would optimize our ability to route over social links?
One method to provide greater reliability in a DHT, for fault tolerance and/or
security, is to replicate the index at multiple nodes. With k-replication,
when we insert, update, or search for an entry in the DHT, we must contact k nodes
determined by using k hash functions. If a good node A wishes to insert an item
into the DHT, it attempts to contact all k replicas. Each message has an expected
probability p of having traversed only well-behaved nodes to the destination. Likewise,
if node B wishes to look up the item A inserted it can try to contact all k replicas,
each time with an expected probability of success of p. Assuming neither A nor B can
determine whether they contacted a good node or are being lied to, the probability of
B locating A’s item is 1− (1− p2)k. Using the values in Table 6.1 for p and a typical
replication factor of k = 3, SPROUT would succeed 41% of the time compared to
only 26% for AC.
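The replication arithmetic above is easy to check directly. The p values below are illustrative round numbers chosen to reproduce the reported 41% and 26%; the exact single-path success probabilities are those of Table 6.1.

```python
def replicated_lookup_success(p, k):
    # Both the insert and the lookup message must traverse only
    # well-behaved nodes (probability p each, so p*p per replica);
    # with k independent replicas, the lookup fails only if all k fail.
    return 1 - (1 - p * p) ** k

print(round(replicated_lookup_success(0.40, 3), 2))  # SPROUT-like p -> 0.41
print(round(replicated_lookup_success(0.31, 3), 2))  # AC-like p     -> 0.26
```

Because success scales with p squared per replica, even modest improvements in single-path reliability compound noticeably once replication is added.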
6.5 Conclusion
Today’s peer-to-peer systems are very vulnerable to malicious attacks. The anonymity
and transience of their members make it difficult to determine whom to trust. Integrating
social networks with P2P networks will provide this much-needed trust information.
We have presented a method for leveraging the trust relationships gained by marrying
a peer-to-peer system with a social network, and have shown how to improve the
expected number of query results and how to reduce the expected delays. We
described a model for evaluating routing algorithms in such a system and proposed
SPROUT, a routing algorithm designed to leverage trust relationships given by social
links. Our results demonstrate how SPROUT can significantly improve the likelihood
of getting query results in a timely fashion, when a large fraction of nodes are mali-
cious. Though flooding-based search schemes are far more robust when threatened
by a large number of malicious users, with the right techniques structured networks
can obtain acceptable performance at far lower bandwidth cost.
Chapter 7
Mitigating Routing Misbehavior in
Mobile Ad Hoc Networks
There will be tremendous growth over the next decade in the use of wireless com-
munication, from satellite transmission into many homes to wireless personal area
networks. As the cost of wireless access drops, wireless communications could re-
place wired in many settings. One advantage of wireless is the ability to transmit
data among users in a common area while remaining mobile. However, the distance
between participants is limited by the range of transmitters or their proximity to
wireless access points. Ad hoc wireless networks mitigate this problem by allowing
out of range nodes to route data through intermediate nodes.
Ad hoc networks have a wide array of military and commercial applications. Ad
hoc networks are ideal in situations where installing an infrastructure is not possible
because the infrastructure is too expensive or too vulnerable, the network is too
transient, or the infrastructure was destroyed. For example, nodes may be spread
over too large an area for one base station and a second base station may be too
expensive. An example of a vulnerable infrastructure is a military base station on
a battlefield. Networks for wilderness expeditions and conferences may be transient
if they exist for only a short period of time before dispersing or moving. Finally,
if network infrastructure has been destroyed due to a disaster, an ad hoc wireless
network could be used to coordinate relief efforts. Since DARPA’s PRNET [71], the
area of routing in ad hoc networks has been an open research topic.
Ad hoc networks maximize total network throughput by using all available nodes
for routing and forwarding. Therefore, the more nodes that participate in packet
routing, the greater the aggregate bandwidth, the shorter the possible routing paths,
and the smaller the possibility of a network partition. However, a node may misbehave
by agreeing to forward packets and then failing to do so, because it is overloaded,
selfish, malicious, or broken. An overloaded node lacks the CPU cycles, buffer space
or available network bandwidth to forward packets. A selfish node is unwilling to
spend battery life, CPU cycles, or available network bandwidth to forward packets
not of direct interest to it, even though it expects others to forward packets on its
behalf. A malicious node launches a denial of service attack by dropping packets. A
broken node might have a software fault that prevents it from forwarding packets.
In ad hoc networks, misbehaving mobile nodes can be a significant problem. Sim-
ulations presented in this chapter show that if 10%-40% of the nodes in the ad hoc
network misbehave, then the average throughput degrades by 16%-32%. However,
the worst case throughput experienced by any one node may be worse than the av-
erage, because nodes that try to route through a misbehaving node experience high
loss while other nodes experience no loss. Thus, even a few misbehaving nodes can
have a severe impact.
One solution to misbehaving nodes is to forward packets only through nodes that
share an a priori trust relationship. A priori trust relationships are based on pre-
existing relationships built outside of the context of the network (e.g. friendships,
companies, and armies). SPROUT, discussed in the previous chapter, used these re-
lationships to improve routing. However, several issues prevent a priori trust routing
from being practical in mobile ad hoc networks:
a) SPROUT leveraged existing social network services, which may employ a central-
ized trusted entity, such as AOL Instant Messenger. No such service is likely to
exist in scenarios where ad hoc networks are deployed.
b) While in a wired overlay network any peer can contact a friendly peer directly,
mobile nodes are limited to routing through nodes within radio transmission range.
Even if one’s friends are participating in the ad hoc network, they may not be
in communication range. Routing packets towards a destination through friendly
nodes for even one or two hops may be impossible.
c) Although relying solely on a priori trust-based forwarding reduces the number of
misbehaving nodes, it will exclude untrusted well behaved nodes whose presence
could improve ad hoc network performance.
d) Finally, in the more hostile environments and scenarios for which mobile ad hoc
networks are envisioned (e.g. battlefield), trusted nodes are more likely to be
compromised.
Another solution to misbehaving nodes is to attempt to forestall or isolate these
nodes from within the actual routing protocol for the network. However, this would
add significant complexity to protocols whose behavior must be very well defined.
In fact, current versions of mature ad hoc routing algorithms, including DSR [70],
AODV [31], TORA [26], DSDV [106], STAR [51], and others [39] only detect if the
receiver’s network interface is accepting packets, but they otherwise assume that
routing nodes do not misbehave. Although trusting all nodes to be well behaved
increases the number of nodes available for routing, it also admits misbehaving nodes
to the network.
In this chapter we explore a different approach, and install extra facilities on top
of the network routing protocol to detect and mitigate routing misbehavior. In this
way, only minimal changes to the underlying routing algorithm are needed in order
to take advantage of our mechanisms. We introduce two extensions to any ad hoc
routing algorithm that mitigate the effects of routing misbehavior: the Watchdog
and the Pathrater. The Watchdog identifies misbehaving nodes, while the Pathrater
avoids routing packets through these nodes. When a node forwards a packet, the
node’s Watchdog verifies that the next node in the path also forwards the packet.
The Watchdog does this by listening promiscuously to the next node’s transmissions.
If the next node does not forward the packet, then it is misbehaving. The Pathrater
uses this knowledge of misbehaving nodes to choose the network path that is most
likely to deliver packets. In this chapter, we demonstrate how to use our extensions
with the Dynamic Source Routing algorithm (DSR) [70]. However, Watchdog and
Pathrater can be easily applied to other ad hoc routing protocols.
Using the ns network simulator [43], we show that the two techniques increase
throughput by 17% in the presence of up to 40% misbehaving nodes during moderate
mobility, while increasing the ratio of overhead transmissions to data transmissions
from the standard routing protocol’s 9% to 17%. During extreme mobility, Watchdog
and Pathrater can increase network throughput by 27%, while increasing the percent-
age of overhead transmissions from 12% to 24%. We describe mechanisms to reduce
this overhead in Section 7.6.
The remainder of this chapter is organized as follows. Section 7.1 specifies our
assumptions about ad hoc networks and gives background information about DSR.
Section 7.2 describes the Watchdog and Pathrater extensions. Section 7.3 describes
the methodology we use in our simulations and the metrics we use to evaluate the
results. We present these results in Section 7.4. Section 7.5 presents related work and
Section 7.7 concludes the chapter.
The basis of this chapter originally appeared in [97].
7.1 Assumptions and Background
This section outlines the assumptions we make regarding the properties of the physical
and network layers of ad hoc networks and includes a brief description of DSR, the
routing protocol we use.
7.1.1 Definitions
We use the term neighbor to refer to a node that is within wireless transmission
range of another node. Likewise, neighborhood refers to all the nodes that are within
wireless transmission range of a node.
7.1.2 Physical Layer Characteristics
Throughout this chapter we assume bidirectional communication symmetry on every
link between nodes. This means that if a node B is capable of receiving a message
from a node A at time t, then node A could instead have received a message from node
B at time t. This assumption is often valid, since many wireless MAC layer protocols,
including IEEE 802.11 and MACAW [13], require bidirectional communication for
reliable transmission. The Watchdog mechanism relies on bidirectional links.
In addition, we assume wireless interfaces that support promiscuous mode oper-
ation. Promiscuous mode means that if a node A is within range of a node B, it
can overhear communications to and from B even if those communications do not
directly involve A. Lucent Technologies’ WaveLAN interfaces have this capability.
While promiscuous mode is not appropriate for all ad hoc network scenarios (partic-
ularly some military scenarios) it is useful in other scenarios for improving routing
protocol performance [70].
[Figure omitted: three panels (a)-(c) showing the spread of a route request between nodes S and D.]
Figure 7.1: Example of a route request. (a) Node S sends out a route request packet to find a path to node D. (b) The route request is forwarded throughout the network, each node adding its address to the packet. (c) D then sends back a route reply to S using the path contained in one of the route request packets that reached it. The thick lines represent the path the route reply takes back to the sender.
7.1.3 Dynamic Source Routing (DSR)
DSR is an on-demand, source routing protocol. Every packet has a route path con-
sisting of the addresses of nodes that have agreed to participate in routing the packet.
The protocol is referred to as “on-demand” because route paths are discovered at the
time a source sends a packet to a destination for which the source has no path.
We divide DSR into two main functions: route discovery and route maintenance.
Figure 7.1 illustrates route discovery. Node S (the source) wishes to communicate
with node D (the destination) but does not know any paths to D. S initiates a route
discovery by broadcasting a route request packet to its neighbors that contains
the destination address D. The neighbors in turn append their own addresses to the
route request packet and rebroadcast it. This process continues until a route
request packet reaches D. D must now send back a route reply packet to inform S
of the discovered route. Since the route request packet that reaches D contains
a path from S to D, D may choose to use the reverse path to send back the reply
(bidirectional links are required here) or to initiate a new route discovery back to S.
Since there can be many routes from a source to a destination, a source may receive
multiple route replies from a destination. DSR caches these routes in a route cache
for future use.
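Route discovery as described above can be sketched as a breadth-first flood in which each copy of the route request carries the accumulated path. This is a deliberate simplification: real DSR also suppresses duplicate requests per node, uses route caches, and may return multiple replies. The function name and graph representation are illustrative.

```python
from collections import deque

def dsr_route_discovery(adj, src, dst):
    # Flood route requests outward from src; each node appends its
    # address before rebroadcasting. The first request to reach dst
    # (BFS order here) supplies the path carried home in the route reply.
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path  # dst returns this path to src in a route reply
        for nbr in adj.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(path + [nbr])
    return None  # no route reply: dst is unreachable

print(dsr_route_discovery(
    {'S': ['A', 'B'], 'A': ['C'], 'B': [], 'C': ['D'], 'D': []},
    'S', 'D'))  # -> ['S', 'A', 'C', 'D']
```

The returned list is exactly the source route that subsequent data packets would carry in their headers.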
The second main function in DSR is route maintenance, which handles link breaks.
A link break occurs when two nodes on a path are no longer in transmission range.
If an intermediate node detects a link break when forwarding a packet to the next
node in the route path, it sends back a message to the source notifying it of that link
break. The source must try another path or do a route discovery if it does not have
another path.
7.2 Watchdog and Pathrater
In this section we present the Watchdog and the Pathrater — tools for detecting
and mitigating routing misbehavior. We also describe the limitations of these meth-
ods. Though we implement these tools on top of DSR, some of our concepts can be
generalized to other source routing protocols. We note those concepts that can be
generalized during our descriptions of the techniques.
7.2.1 Watchdog
The Watchdog method detects misbehaving nodes. Figure 7.2 illustrates how the
Watchdog works. Suppose there exists a path from node S to D through intermediate
nodes A, B, and C. Node A cannot transmit all the way to node C, but it can listen
in on node B’s traffic. Thus, when A transmits a packet for B to forward to C, A
can often tell if B transmits the packet. If encryption is not performed separately for
[Figure omitted: chain S-A-B-C-D.]
Figure 7.2: When B forwards a packet from S toward D through C, A can overhear B's transmission and can verify that B has attempted to pass the packet to C. The solid line represents the intended direction of the packet sent by B to C, while the dashed line indicates that A is within transmission range of B and can overhear the packet transfer.
[Figure omitted: chain S-A-B-C-D with packets 1 and 2 colliding at A.]
Figure 7.3: Node A does not hear B forward packet 1 to C, because B's transmission collides at A with packet 2 from the source S.
each link, which can be expensive, then A can also tell if B has tampered with the
payload or the header.
We implement the Watchdog by maintaining a buffer of recently sent packets and
comparing each overheard packet with the packet in the buffer to see if there is a
match. If so, the packet in the buffer is removed and forgotten by the Watchdog,
since it has been forwarded on. If a packet has remained in the buffer for longer than
a certain timeout, the Watchdog increments a failure tally for the node responsible
for forwarding on the packet. If the tally exceeds a certain threshold bandwidth, it
determines that the node is misbehaving and sends a message to the source notifying
it of the misbehaving node.
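The buffer mechanism just described might be sketched as follows. The class name, timeout, and threshold values are illustrative assumptions, not the dissertation's implementation.

```python
import time

class Watchdog:
    def __init__(self, timeout=0.5, threshold=5):
        self.timeout = timeout      # seconds to wait for the next hop's forward
        self.threshold = threshold  # failure tally at which a node is accused
        self.pending = {}           # packet_id -> (next_hop, deadline)
        self.failures = {}          # next_hop -> failure tally

    def sent(self, packet_id, next_hop, now=None):
        # Buffer a packet we just handed to next_hop for forwarding.
        now = time.monotonic() if now is None else now
        self.pending[packet_id] = (next_hop, now + self.timeout)

    def overheard(self, packet_id):
        # We overheard the packet being forwarded: remove and forget it.
        self.pending.pop(packet_id, None)

    def tick(self, now=None):
        # Expire timed-out packets, incrementing the responsible node's
        # tally; return nodes whose tally just crossed the threshold
        # (at which point the real protocol notifies the source).
        now = time.monotonic() if now is None else now
        accused = []
        for pid, (hop, deadline) in list(self.pending.items()):
            if now >= deadline:
                del self.pending[pid]
                self.failures[hop] = self.failures.get(hop, 0) + 1
                if self.failures[hop] == self.threshold:
                    accused.append(hop)
        return accused
```

Keeping a tally rather than accusing on a single timeout is what lets the Watchdog tolerate the ambiguous-collision cases discussed below.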
The Watchdog technique has advantages and weaknesses. DSR with the Watchdog
has the advantage that it can detect misbehavior at the forwarding level and not just
the link level. Watchdog’s weaknesses are that it might not detect a misbehaving
node in the presence of 1) ambiguous collisions, 2) receiver collisions, 3) limited
transmission power, 4) false misbehavior, 5) collusion, and 6) partial dropping.
The ambiguous collision problem prevents A from overhearing transmissions from
B. As Figure 7.3 illustrates, a packet collision can occur at A while it is listening for B
to forward on a packet. A does not know if the collision was caused by B forwarding
[Figure omitted: chain S-A-B-C-D with packets 1 and 2 colliding at C.]
Figure 7.4: Node A believes that B has forwarded packet 1 on to C, though C never received the packet due to a collision with packet 2.
on a packet as it should or if B never forwarded the packet and the collision was
caused by other nodes in A’s neighborhood. Because of this uncertainty, A should
not immediately accuse B of misbehaving, but should instead continue to watch B
over a period of time. If A repeatedly fails to detect B forwarding on packets, then
A can assume that B is misbehaving.
In the receiver collision problem, node A can only tell whether B sends the packet
to C, but it cannot tell if C receives it (Figure 7.4). If a collision occurs at C when
B first forwards the packet, A only sees B forwarding the packet and assumes that
C successfully receives it. Thus, B could skip re-transmitting the packet and leave A
none the wiser. B could also purposefully cause the transmitted packet to collide at
C by waiting until C is transmitting and then forwarding on the packet. In the first
case, a node could be selfish and not want to waste power with retransmissions. In
the latter case, the only reason B would have for taking the actions that it does is
because it is malicious. B wastes battery power and CPU time, so it is not selfish.
An overloaded node would not engage in this behavior either, since it wastes badly
needed CPU time and bandwidth. Thus, this second case should be a rare occurrence.
Another problem can occur when nodes falsely report other nodes as misbehaving.
A malicious node could attempt to partition the network by claiming that some nodes
following it in the path are misbehaving. For instance, node A could report that
node B is not forwarding packets when in fact it is. This will cause S to mark B as
misbehaving when A is the culprit. This behavior, however, will be detected. Since
A is passing messages on to B (as verified by S), then any acknowledgements from
D to S will go through A to S, and S will wonder why it receives replies from D
240 CHAPTER 7. MITIGATING MANET MISBEHAVIOR
when supposedly B dropped packets in the forward direction. In addition, if A drops
acknowledgements to hide them from S, then node B will detect this misbehavior and
will report it to D.
Another problem is that a misbehaving node that can control its transmission
power can circumvent the Watchdog. A node could limit its transmission power
such that the signal is strong enough to be overheard by the previous node but too
weak to be received by the true recipient. This would require that the misbehaving
node know the transmission power required to reach each of its neighboring nodes.
Only a node with malicious intent would behave in this manner — selfish nodes have
nothing to gain since battery power is wasted and overloaded nodes would not relieve
any congestion by doing this.
Multiple nodes in collusion can mount a more sophisticated attack. For example,
B and C from Figure 7.2 could collude to cause mischief. In this case, B forwards
a packet to C but does not report to A when C drops the packet. Because of this
limitation, it may be necessary to disallow two consecutive untrusted nodes in a
routing path. In this study, we only deal with the possibility of nodes acting alone.
The harder problem of colluding nodes is being studied by Johnson at CMU [69].
Colluding nodes pose another threat by intercepting and dropping misbehavior
reports. If a node notices the next hop node is not forwarding packets and sends a
notification back along the path to the sender, a malicious node inserted earlier in
the path may drop the notification, preventing the source from receiving it. Once
again, Watchdog could be employed in the reverse direction. However, the node that
detects the dropped report would have to establish a new route to the source node,
adding complexity and resource usage to the protocol.
Finally, a node can circumvent the Watchdog by dropping packets at a lower
rate than the Watchdog's configured minimum misbehavior threshold. Although the
Watchdog will not detect this node as misbehaving, the node is still forced to forward at
the threshold bandwidth. In this way the Watchdog serves to enforce this minimum
bandwidth.
The Watchdog mechanism could be used to some degree to detect replay attacks
but would require maintaining a great deal of state information at each node as it
monitors its neighbors to ensure that they do not retransmit a packet that they have
already forwarded. Also, if a collision has taken place at the receiving node, it would
be necessary and correct for a node to retransmit a packet, which may appear as a
replay attack to the node acting as its Watchdog. Therefore, detecting replay attacks
would neither be an efficient nor an effective use of the Watchdog mechanism.
For the Watchdog to work properly, it must know where a packet should be in two
hops. In our implementation, the Watchdog has this information because DSR is a
source routing protocol. If the Watchdog does not have this information (for instance
if it were implemented on top of a hop-by-hop routing protocol), then a malicious or
broken node could broadcast the packet to a non-existent node and the Watchdog
would have no way of knowing. Because of this limitation, the Watchdog works best
on top of a source routing protocol.
7.2.2 Pathrater
The Pathrater, run by each node in the network, combines knowledge of misbehaving
nodes with link reliability data to pick the route most likely to be reliable. Each
node maintains a rating for every other node it knows about in the network. It
calculates a path metric by averaging the node ratings in the path. We choose this
metric because it gives a comparison of the overall reliability of different paths and
allows Pathrater to emulate the shortest length path algorithm when no reliability
information has been collected, as explained below. If there are multiple paths to the
same destination, we choose the path with the highest metric. Note that this differs
from standard DSR, which chooses the shortest path in the route cache. Further note
that since the Pathrater depends on knowing the exact path a packet has traversed,
it must be implemented on top of a source routing protocol.
The Pathrater assigns ratings to nodes according to the following algorithm. When
a node in the network becomes known to the Pathrater (through route discovery),
the Pathrater assigns it a “neutral” rating of 0.5. A node always rates itself with
a 1.0. This ensures that when calculating path rates, if all other nodes are neutral
nodes (rather than suspected misbehaving nodes), the Pathrater picks the shortest
length path. The Pathrater increments the ratings of nodes on all actively used paths
by 0.01 at periodic intervals of 200 ms. An actively used path is one on which the
node has sent a packet within the previous rate increment interval. The maximum
value a neutral node can attain is 0.8. We decrement a node’s rating by 0.05 when
we detect a link break during packet forwarding and the node becomes unreachable.
The lower bound rating of a “neutral” node is 0.0. The Pathrater does not modify
the ratings of nodes that are not currently in active use.
We assign a special highly negative value, −100 in the simulations, to nodes sus-
pected of misbehaving by the Watchdog mechanism. When the Pathrater calculates
the path metric, negative path values indicate the existence of one or more suspected
misbehaving nodes in the path. If a node is marked as misbehaving due to a temporary
malfunction or an incorrect accusation, it would be preferable if it were not permanently
excluded from routing. Therefore nodes that have negative ratings should have their
ratings slowly increased or set back to a non-negative value after a long timeout. This
is not implemented in our simulations since the current simulation period is too short
to reset a misbehaving node’s rating. Section 7.4.3 discusses the effect on throughput
of accusing well-behaving nodes.
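The rating rules above can be collected into a short sketch. The class and method names are ours; the constants are the values quoted in the text (0.5 initial, +0.01 per 200 ms interval capped at 0.8, −0.05 on link breaks floored at 0.0, and −100 for suspected misbehavers).

```python
# Sketch of the Pathrater's rating rules and path selection.
NEUTRAL = 0.5        # initial rating for a newly discovered node
SELF_RATING = 1.0    # a node always rates itself 1.0
INCREMENT = 0.01     # added every 200 ms for nodes on actively used paths
MAX_NEUTRAL = 0.8    # ceiling for a neutral node's rating
DECREMENT = 0.05     # subtracted when a link break is detected
MIN_NEUTRAL = 0.0    # floor for a neutral node's rating
MISBEHAVING = -100.0 # assigned to nodes the Watchdog suspects

class Pathrater:
    def __init__(self, own_id):
        self.ratings = {own_id: SELF_RATING}

    def discover(self, node):
        self.ratings.setdefault(node, NEUTRAL)

    def tick(self, active_nodes):
        """Periodic (200 ms) increment for nodes on actively used paths."""
        for n in active_nodes:
            r = self.ratings.get(n, NEUTRAL)
            if MIN_NEUTRAL <= r < MAX_NEUTRAL:  # skips misbehavers and self
                self.ratings[n] = min(r + INCREMENT, MAX_NEUTRAL)

    def link_break(self, node):
        r = self.ratings.get(node, NEUTRAL)
        if r >= MIN_NEUTRAL:
            self.ratings[node] = max(r - DECREMENT, MIN_NEUTRAL)

    def mark_misbehaving(self, node):
        self.ratings[node] = MISBEHAVING

    def path_metric(self, path):
        """Average node rating; negative means a suspected misbehaver."""
        return sum(self.ratings.get(n, NEUTRAL) for n in path) / len(path)

    def best_path(self, paths):
        return max(paths, key=self.path_metric)
```

Because the source's own 1.0 rating is included in the average, an all-neutral shorter path scores higher than a longer one, which is how the Pathrater emulates shortest-path routing when no reliability information has been collected.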
When the Pathrater learns that a node on a path currently in use misbehaves, and
it cannot find a path free of misbehaving nodes, it sends out a new route request,
provided we have enabled an extension we call Send Route Request (SRR).
7.3 Methodology
In this section we describe our simulator, simulation parameters, and measured met-
rics.
We use a version of Berkeley’s Network Simulator (ns) [43] that includes wireless
extensions made by the CMU Monarch project. We also use a visualization tool from
CMU called ad-hockey [109] to view the results of our simulations and detect overall
trends in the network. To execute the simulations, we use PCs (450 or 500 MHz
Pentium IIIs with at least 128 MB of RAM) running Red Hat Linux 6.1.
Our simulations take place in a 670 by 670 meter flat space filled with a scattering
of 50 wireless nodes. The physical layer and the 802.11 MAC layer we use are included
in the CMU wireless extensions to ns [15].
7.3.1 Movement and Communication Patterns
The nodes communicate using 10 constant bit rate (CBR) node-to-node connections.
Four nodes are sources for two connections each, and two nodes are sources for one
connection each. Eight of the flow destinations receive only one flow and the ninth
destination receives two flows. The communication pattern we use was developed by
CMU [15].
In all of our node movement scenarios, the nodes choose a destination and move
in a straight line towards the destination at a speed uniformly distributed between 0
meters/second (m/s) and some maximum speed. This is called the random waypoint
model [15]. We limit the maximum speed of a node to 20 m/s (10 m/s on average)
and we set the run-time of the simulations to 200 seconds. Once the node reaches
its destination, it waits for the pause time before choosing a random destination and
repeating the process. We use pause times of 0 and 60 seconds. In addition we use
two different variations of the initial node placement and movement patterns. By
combining the two pause times with two movement patterns, we obtain four different
mobility scenarios.
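The random waypoint model described above amounts to the following loop. This is a schematic sketch, not the ns implementation; the function and parameter names are ours, with defaults matching the 670 m area, 20 m/s maximum speed, 60 s pause time, and 200 s run-time quoted above.

```python
import random

def random_waypoint(area=670.0, max_speed=20.0, pause_time=60.0, sim_time=200.0):
    """Generate (time, x, y) waypoints for one node under the random
    waypoint model: pick a random destination, move toward it in a
    straight line at a uniform-random speed, pause, repeat."""
    t = 0.0
    x, y = random.uniform(0, area), random.uniform(0, area)
    waypoints = [(t, x, y)]
    while t < sim_time:
        dest_x, dest_y = random.uniform(0, area), random.uniform(0, area)
        speed = max(random.uniform(0.0, max_speed), 1e-9)  # avoid div by zero
        t += ((dest_x - x) ** 2 + (dest_y - y) ** 2) ** 0.5 / speed
        x, y = dest_x, dest_y
        waypoints.append((t, x, y))
        t += pause_time  # wait at the destination before moving again
    return waypoints
```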
7.3.2 Misbehaving Nodes
Of the 50 nodes in the simulated network, some variable percentage of the nodes
misbehave. In our simulations, a misbehaving node is one that agrees to participate
in forwarding packets (it appends its address to route request packets) but then
indiscriminately drops all data packets that are routed through it.
We vary the percentage of the network comprised of misbehaving nodes from
0% to 40% in 5% increments. While a network with 40% misbehaving nodes may
seem unrealistic, it is interesting to study the behavior of the algorithms in a more
hostile environment than we hope to encounter in real life. We use Tcl’s [102] built-in
pseudo-random number generator to designate misbehaving nodes randomly. We use
the same seed across the 0% to 40% variation of the misbehaving nodes parameter,
which means that the group of misbehaving nodes in the 10% case is a superset of the
group of misbehaving nodes in the 5% case. This ensures that the obstacles present
in lower percentage misbehaving node runs are also present in higher percentage
misbehaving node runs.
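This superset property follows from fixing the seed and taking a prefix of a single shuffled ordering. The sketch below uses Python's generator for illustration (the text uses Tcl's built-in generator); the function name and seed value are ours.

```python
import random

def misbehaving_nodes(num_nodes, fraction, seed=42):
    """Designate misbehaving nodes pseudo-randomly. Because the shuffled
    ordering depends only on the seed, the set chosen for a higher
    fraction is a superset of the set chosen for a lower fraction."""
    rng = random.Random(seed)
    order = list(range(num_nodes))
    rng.shuffle(order)
    return set(order[: round(num_nodes * fraction)])
```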
7.3.3 Metrics
We evaluate our extensions using the following three metrics:
• Throughput: This is the percentage of sent data packets actually received by
the intended destinations.
• Overhead: This is the ratio of routing-related transmissions (route request,
route reply, route error, and Watchdog) to data transmissions in a sim-
ulation. A transmission is one node either sending or forwarding a packet.
For example, one packet being forwarded across 10 nodes would count as 10
transmissions. We count transmissions instead of packets because we want to
compare routing-related transmissions to data transmissions, but some routing
packets are more expensive to the network than other packets: route re-
quest packets are broadcast to all neighbors which in turn broadcast to all of
their neighbors, causing a tree of packet transmissions. Unicast route reply,
route error, Watchdog, and data packets only travel along a single path.
• Effects of Watchdog false positives on network throughput. False positives occur
when the Watchdog mechanism reports that a node is misbehaving when in fact
it is not, for reasons discussed in Section 7.2.1. We study the impact of this on
throughput.
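The overhead metric above, which counts per-hop transmissions rather than packets, can be made concrete with a small tally. The event representation and names below are illustrative assumptions, not part of the simulator.

```python
from collections import Counter

# Packet types that count as routing-related overhead.
ROUTING_TYPES = {"route_request", "route_reply", "route_error", "watchdog"}

def overhead_ratio(events):
    """events: iterable of (packet_type, transmissions), where each node
    sending or forwarding a packet counts as one transmission, so one
    packet forwarded across 10 nodes contributes 10."""
    totals = Counter()
    for packet_type, transmissions in events:
        kind = "routing" if packet_type in ROUTING_TYPES else "data"
        totals[kind] += transmissions
    return totals["routing"] / totals["data"]
```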
7.4 Simulation Results
In this section we present the results of our simulations. We focus on three metrics of
evaluation: network throughput, routing overhead, and the effects of false positives
on throughput.
We test the utility of various combinations of our extensions: Watchdog (WD),
Pathrater (PR), and send (extra) route request (SRR). We use the SRR extension to
find new paths when all known paths include a suspected misbehaving node. Each of
the following sections includes two graphs of simulation results for two separate pause
times. The first graph is for a pause time of 0 (the nodes are in constant motion) and
the second is for a pause time of 60 seconds before and in between node movement.
We simulate two different node mobility patterns using four different pseudo-random
number generator seeds. The seeds determine which nodes misbehave. We plot the
average of the eight simulations.
Figure 7.5: Overall network throughput as a function of the fraction of misbehaving nodes in the network. (a) 0 second pause time; (b) 60 second pause time. Curves shown: WD=ON/PR=ON/SRR=ON, WD=ON/PR=ON/SRR=OFF, WD=OFF/PR=ON/SRR=OFF, WD=OFF/PR=OFF/SRR=OFF.
7.4.1 Network Throughput
We graph four curves for network throughput: everything enabled, Watchdog and
Pathrater enabled, only Pathrater enabled, and everything disabled. We choose to
graph both everything enabled and everything enabled except SRR, because we want
to isolate performance gains or problems caused by extra route requests. Since the
Pathrater is not strictly a tool to be used for circumventing misbehaving nodes, we
choose to include the graph where only Pathrater is enabled to determine if it increases
network throughput without any knowledge of suspected misbehaving nodes. We do
not graph Watchdog and SRR activated without Pathrater, since without Pathrater
the information about misbehaving nodes would not be used for routing decisions.
Figure 7.5 shows the total network throughput, calculated as the fraction of data
packets generated that are received, versus the fraction of misbehaving nodes in the
network for the combinations of extensions. In the case where the network contains
no misbehaving nodes, all four curves achieve around 95% throughput. After the 0%
misbehaving node case, the graphs diverge.
As expected, the simulations with all three extensions active perform the best by a
considerable margin as misbehaving nodes are added to the network. The mechanisms
                       Maximum   Minimum
0 second pause time    88.6%     75.2%
60 second pause time   95.0%     73.9%

Table 7.1: Maximum and minimum network throughput obtained by any simulation at 40% misbehaving nodes with all features enabled.
increase the throughput by up to 27% compared to the basic protocol, maintaining a
throughput greater than 80% for both pause times, even with 40% misbehaving nodes.
Table 7.1 lists the maximum and minimum throughput achieved in any simulation
run at 40% misbehaving nodes with all options enabled.
When a subset of the extensions is active, performance does not increase as much
over the simulations with no extensions. Watchdog alone does not affect routing
decisions, but it supplies Pathrater with extra information to combat misbehaving
nodes more effectively. When Watchdog is deactivated, the source node has no way of
detecting the misbehaving node in its path to the destination, and so its transmission
flow suffers total packet loss. Pathrater alone cannot detect misbehaving nodes in a
path in order to decrement their ratings (see Section 7.6).
One effect of the randomness of ns is that nodes may receive route replies to their
route requests in a different order in one simulation than in another simulation with
slightly varied parameters. This change can result in a node choosing a path with a
misbehaving node in one run, but not choosing that path in a simulation with more
misbehaving nodes in the network. This may actually result in slight increases in
network throughput when the number of misbehaving nodes increases. For instance,
this is noticeable in the Pathrater-only curve of Figure 7.5 (b) where the throughput
rises from 82% to 84% between 20% and 25% misbehaving nodes.
In both throughput graphs, the everything-disabled and Pathrater-only
curves closely follow each other. From the graphs we conclude that the Pathrater
Figure 7.6: Routing overhead as a ratio of routing packet transmissions to data packet transmissions, plotted against the fraction of misbehaving nodes. (a) 0 second pause time; (b) 60 second pause time. Curves shown: WD=ON/PR=ON/SRR=ON, WD=ON/PR=ON/SRR=OFF, WD=ON/PR=OFF/SRR=OFF, WD=OFF/PR=OFF/SRR=OFF.
alone does not significantly affect performance. In Section 7.6 we suggest some im-
provements to the Pathrater that may increase its utility in the absence of the other
extensions.
7.4.2 Routing Overhead
For routing overhead, we graph four curves: everything on, Pathrater and Watchdog
on, only Watchdog on (Watchdog-only), and everything off. Using the everything
off graph as our basis for comparison, we graph the Watchdog-only curve to find the
overhead generated just by the Watchdog when it sends notifications to senders. The
Watchdog and Pathrater curve shows the overhead added by Watchdog and Pathrater
but with Pathrater’s ability to send out extra route requests disabled. The everything
on curve includes the overhead created by Pathrater when sending out extra route
requests.
Figure 7.6 shows the amount of overhead incurred by activating the different
routing extensions. The greatest effect on routing overhead results from using the
SRR feature, which sends out route requests for a destination to which the only
                       Maximum   Minimum
0 second pause time    31.3%     18.9%
60 second pause time   23.5%     11.0%

Table 7.2: Maximum and minimum overhead obtained by any simulation at 40% misbehaving nodes with all features enabled.
known routes include suspected misbehaving nodes. For 40% misbehaving nodes in
the high mobility scenario, the overhead rises from 12% to 24% when SRR is activated
in the Pathrater. Any route requests generated by SRR will flood the network with
route request and route reply packets, which greatly increase the overhead.
Table 7.2 lists the maximum and minimum overhead for any of the simulations with
all options enabled at 40% misbehaving nodes.
The Watchdog mechanism itself only adds a very small amount of extra overhead
as seen by comparing the Watchdog-only graph with the all-disabled graph. Also, the
added overhead is not affected by the increase in misbehaving nodes in the network.
Using both the Watchdog and Pathrater mechanisms increases the throughput of the
network by 16% at 40% misbehaving nodes with only 6% additional network overhead
(see Figure 7.6 (a)).
Though the overhead added by these extensions is significant, especially when
Pathrater sends out route requests to avoid misbehaving nodes, these extensions still
improve net throughput. Therefore, the main concerns with high overhead involve
issues such as increased battery usage on portables and PDAs. Since the largest
factor accounting for the overhead is route requests, the overhead can be significantly
reduced by optimizing the delay between the Pathrater's route requests and by
incorporating some of the approaches developed for mitigating route requests and
broadcast storms in general [10, 20, 78].
Figure 7.7: Comparison of network throughput between the regular Watchdog and a Watchdog that reports no false positives. (a) 0 second pause time; (b) 60 second pause time.
7.4.3 Effects of False Detection
We compare simulations of the regular Watchdog with a Watchdog that does not
report false positives. Figure 7.7 shows the network throughput lost by the Watchdog
incorrectly reporting well-behaved nodes. These results show that throughput is not
appreciably affected by false positives and that they may even have beneficial side
effects, as described below.
The similarity in throughput can be attributed to a few factors. First, the nodes
incorrectly reported as misbehaving could have moved out of the previous node’s
listening range before forwarding on a packet. If these nodes move out of range
frequently enough to warrant an accusation of misbehavior, they may be unreliable
due to their location, and the source would be better off routing around them. The
fact that more false positives are reported in the 0 second pause time simulations as
compared to the 60 second pause time simulations, as shown in Table 7.3, supports
this conclusion. Table 7.3 shows the average value of false positives reported by the
simulation runs for each pause time and misbehaving node percentage.
Another factor that may account for the similar throughput of the Watchdog’s
performance with and without false positives concerns one of the limitations of the
% misbehaving nodes     0%     5%    10%   15%   20%   25%   30%   35%   40%
0 second pause time   111.2   82.8  90.3  66.5  75.5  60.8  67.5  31.3  50.8
60 second pause time   39.0   57.6  40.8  63.1  35.7  79.5  46.7  21.7  47.2

Table 7.3: Comparison of the number of false positives between the 0 second and 60 second pause time simulations. Averages taken from the simulations with all features enabled.
Watchdog. As described in Section 7.2.1, if a collision occurs while the Watchdog
is waiting for the next node to forward a packet, it may never overhear the packet
being transmitted. If many collisions occur over time, the Watchdog may incorrectly
assume that the next node is misbehaving. However, if a node constantly experiences
collisions, it may actually increase throughput to route packets around areas of high
communication density.
Yet another factor is that increased false positives will result in more paths in-
cluding a suspected misbehaving node. The Pathrater will then send out more route
requests to the destination. This increases the overhead in the network, but it also
provides the sending node with a fresher list of routes for its route cache.
7.5 Related Work
At the time this work was originally published [97] there was no previously published
work on detection of routing misbehavior specific to ad hoc networks, although there
is relevant work by Smith, Murthy and Garcia-Luna-Aceves on securing distance
vector routing protocols from Byzantine routing failures [122]. In their work, they
suggest countermeasures to secure routing messages and routing updates. This work
may be applicable to ad hoc networks in that distance vector routing protocols, such
as DSDV, have been proposed for ad hoc networks.
Zhou and Haas investigate distributed certificate authorities in ad hoc networks
using threshold cryptography[144]. Zhou and Haas take the view that no one sin-
gle node in an ad hoc network can be trusted due to low physical security and low
availability. Therefore, using a single node to provide an important network-wide
service, such as a certificate authority, is very risky. Threshold cryptography al-
lows a certificate authority’s private key to be broken up into shares and distributed
across multiple nodes. To sign a certificate, a subset of the nodes with private key
shares must jointly collaborate. Thus, to mount a successful attack on the certificate
authority, an intruder must compromise multiple nodes.
To further frustrate attack attempts over time, Zhou and Haas’ scheme uses share
refreshing. It is possible that over a long period of time enough share servers could be
compromised to recover the certificate authority’s secret key. Share refreshing allows
uncompromised servers to compute a new private key periodically from the old private
key’s shares. This periodic refreshing means that an attacker must infiltrate a large
number of nodes within a short time span to recover the certificate authority’s secret
key.
Stajano and Anderson [125] elucidate some of the security issues facing ad hoc
networks and investigate ad hoc networks composed of low compute-power nodes such
as home appliances, sensor networks, and PDAs where full public key cryptography
may not be feasible. The authors develop a system in which a wireless device “imprints”
itself on a master device, accepting a symmetric encryption key from the first
device that sends it a key. After receiving that key, the slave device will not recognize
any other device as a master except the device that originally sent it the key. The
authors bring up an interesting denial of service attack: the battery drain attack.
A misbehaving node can mount a denial-of-service attack against another node by
routing seemingly legitimate traffic through the node in an attempt to wear down the
other node’s batteries.
Since this study was originally conducted in 1999, there has been much work in the
MANET community building upon Watchdog and/or Pathrater. Below, we discuss
a sample of the more relevant projects.
Michiardi and Molva designed CORE [99], a reputation system for ad hoc routing
that utilizes Watchdog to detect misbehavior. This protocol is targeted at discourag-
ing misbehavior by selfish nodes and does not protect against malicious nodes. Each
CORE node maintains three types of reputation information about each peer. A
separate functional reputation is calculated for each node based on its performance
of specific network functions (e.g. packet forwarding, route discovery). This informa-
tion is further broken down into subjective and indirect reputations, based on personal
observations and second-hand reports, respectively, similar to the limited reputation
system analyzed in Chapter 5. The application of appropriate weight values when
combining the separate reputations into a single value is essential to the success of
this protocol. However, the authors do not discuss how these weights are determined.
CONFIDANT [16], proposed by Buchegger and Le Boudec, builds upon our work
with two significant extensions. First, in addition to detecting next-hop drops for
packets a node personally forwards, a node also eavesdrops on neighboring nodes
in an attempt to catch misrouting. This technique is likely to result in many false
positives due to the hidden terminal problem. The second extension is a reputation
system where nodes notify “friends” of possible malicious routers, in a way similar
to our limited reputation system. However, to avoid malicious nodes poisoning the
system with false reports, the protocol defines friends to be nodes with which one
has an a priori trust relationship. In the simulations, all well-behaved nodes are
considered to be friends with each other, an idealized assumption they acknowledge
and which is a focus of their future work.
Buchegger et al. [17] also constructed a test-bed architecture that allows experi-
ments on routing attack detection to be conducted using real-world wireless technol-
ogy in actual mobility scenarios. The initial test-bed experiments evaluated the ability
of enhanced passive acknowledgement (PACK), their improved version of Watchdog,
to detect misrouting attacks, as well as modification and fabrication attacks. They
find that some of the possible disadvantages of Watchdog we mention in Section 7.2.1,
such as partial dropping, have little or no effect in real-world scenarios.
7.6 Future Work
This chapter presents initial work in detecting misbehaving nodes and mitigating
their performance impact in ad hoc wireless networks. In this section we describe
some further ideas we would like to explore.
We plan on conducting more rigorous tests of the Watchdog and Pathrater pa-
rameters to determine optimal values to increase throughput in different situations.
Currently we are experimenting with different Watchdog thresholds for deciding when
a node is misbehaving. Some of the variables to optimize for the Pathrater include
the rating increment and decrement amounts, the rate incrementing interval, and the
delay between sending out route requests to decrease the overhead caused by this
feature.
Currently the Pathrater only decrements a node’s rating when another node tries
unsuccessfully to send to it or if the Watchdog mechanism is active and determines
that a node is misbehaving. Without the Watchdog active, the Pathrater cannot
detect misbehaving nodes. An obvious enhancement would be to receive updates from
a reliable transport layer, such as TCP, when ACKs fail to be received. This would
allow the Pathrater to detect bad paths and lower the nodes’ ratings accordingly.
The experiments conducted in this study were of relatively short duration, only 200
seconds. Longer simulations were infeasible because their complexity resulted in long
runtimes. While we believe these simulations adequately evaluate
the Watchdog mechanism, longer simulations may provide more information on the
performance of Pathrater. We would expect its performance to improve as more
information is collected, allowing it to calculate more accurate reputations for nodes
encountered. We postulate that the relative performance improvement gained by
utilizing Pathrater is likely to be similar to the gain seen in Chapter 5 by the local
reputation system over the base case.
All the simulations presented in this chapter use CBR data sources with no relia-
bility requirements. Our next goal is to analyze how the routing extensions perform
with TCP flows common to most network applications. Our focus would then change
from measuring throughput, or dropped packets, to measuring the time to complete
a reliable transmission, such as an FTP transfer. For these tests the modification to
Pathrater described above should improve performance significantly in the case where
the Watchdog is not active.
Finally, we would like to evaluate the Watchdog and Pathrater considering latency
in addition to throughput.
7.7 Conclusion
Ad hoc networks are an increasingly promising area of research with practical ap-
plications, but they are vulnerable in many settings to nodes that misbehave when
routing packets. For robust performance in an untrusted environment, it is necessary
to resist such routing misbehavior.
In this chapter we analyze two possible extensions to DSR to mitigate the effects of
routing misbehavior in ad hoc networks: the Watchdog and the Pathrater. We show
that the two techniques increase throughput by 17% in a network with moderate mo-
bility, while increasing the ratio of overhead transmissions to data transmissions from
the standard routing protocol’s 9% to 17%. During extreme mobility, Watchdog and
Pathrater can increase network throughput by 27%, while increasing the percentage
of overhead transmissions from 12% to 24%.
These results show that we can gain the benefits of an increased number of routing
nodes while minimizing the effects of misbehaving nodes. In addition we show that
this can be done without a priori trust or excessive overhead.
Chapter 8
Conclusion and Future Work
Designing an online resource exchange or content distribution system is a challenging
undertaking, especially when that system is required to be decentralized and
its members fully autonomous. To facilitate system architecture and protocol design,
it is important to understand the behavior of the users and the impact system
parameters have on their actions. This thesis presented our research into designing
reputation systems for autonomous, decentralized, peer-to-peer networks.
We began with an overview of research related to reputation system design. Our
goal was to organize existing ideas and work, to facilitate system design. Chapter 2
presented a taxonomy of reputation system components and their properties, and
discussed how user behavior and technical constraints can conflict. In our discussion,
we described current research (some of which is presented in this thesis) that ex-
emplifies solutions developed and compromises made in order to produce a useable,
implementable system.
Next in Chapters 3 and 4, we presented two theoretical models for how trust
influences users in an online trading system. The first model used a microeconomic
approach, focusing on individual transaction strategies. The second macroeconomic
model quantified system design choices on expected user behavior and participation.
In Chapter 3, we proposed a simple game model that captures the incentives
dictating the interaction between buyers and sellers, and studied the strategies that
evolve in different scenarios, such as eBay auctions. In particular, we focused on the
effect seller history has on player strategy. We proved that for simple reputation-based
buyer strategies, a seller's decision whether or not to cheat depends only on the
length of her history, not on the particular actions committed. Given a finite number
of transactions, a seller can compute a utility-optimal sequence of cooperations and
defections. As more advanced buyer/seller strategies evolve, equilibrium is reached
when players predominantly cooperate.
Chapter 4 identified key attributes that drive the actions of users of trading sys-
tems, whether they are cooperative, selfish, or malicious. We then presented an
economic model that captures the behavior of peers in a system that employs incen-
tive schemes and reputation systems to mitigate the effects of both freeriding and
misbehavior. We showed how the basic model could be modified to account for de-
sign decisions and derived a more generalized model. Results from an individual
transaction-based simulator approximated our economic model’s expectations, sug-
gesting the model captures the key elements of a reputation-based trading system.
Next, we concentrated on a more realistic study by fully simulating an unstruc-
tured peer-to-peer system using real statistics and traces from actual large file-sharing
P2P networks. We evaluated the effect of limited reputation information sharing on
the efficiency and load distribution of a peer-to-peer system. Chapter 5 presented ad-
vantages and disadvantages of resource selection techniques based on peer reputation.
We also investigated the cost in efficiency of two identity models for peer-to-peer rep-
utation systems. Our results show that, using some simple mechanisms, reputation
systems can provide a factor of 20 improvement in performance over no reputation
system.
Finally, we proposed two protocols to improve message routing throughput using
trust information. Each was targeted at a very different kind of peer-to-peer network:
one at “traditional” structured P2P systems, and the other at mobile ad hoc
networks. The sources of trust information also varied greatly.
In Chapter 6, we investigated how existing social networks can benefit P2P data
networks by leveraging the inherent trust associated with social links. We presented
a trust model that lets us compare routing algorithms for P2P networks overlaying
social networks. We proposed SPROUT, a DHT routing algorithm that, by using
social links, significantly increases the number of query results and reduces query
delays. We discussed further optimization and design choices for both the model and
the routing algorithm. Finally, we evaluated our model versus both regular DHT
routing and Gnutella-like flooding.
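To give a feel for the trade-off SPROUT exploits, here is a toy path-trust calculation. It assumes a simple multiplicative model in which every hop forwards correctly with some independent probability; the function name and the per-hop reliabilities (0.95 for a friend link, 0.6 for an unknown peer) are illustrative assumptions, not the trust function or parameters of Chapter 6.

```python
# Sketch: why routing over social links can pay off, assuming a
# multiplicative path-trust model (illustrative parameters only).

def path_reliability(hops, p_social=0.95, p_stranger=0.6):
    """Probability a message survives every hop of the path.

    `hops` is a sequence of 'social' / 'stranger' labels; each hop is
    assumed to forward correctly with the given independent probability.
    """
    r = 1.0
    for hop in hops:
        r *= p_social if hop == "social" else p_stranger
    return r

# Under this model, even a longer chain of trusted friends can be more
# reliable end-to-end than a shorter chain of unknown peers.
social_path = ["social"] * 5        # 5 friend-to-friend hops
dht_path = ["stranger"] * 3         # 3 regular DHT hops
print(path_reliability(social_path))  # 0.95**5 ≈ 0.774
print(path_reliability(dht_path))     # 0.6**3  ≈ 0.216
```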
Chapter 7 described two techniques that improve throughput in an ad hoc network
in the presence of nodes that agree to forward packets but fail to do so. To mitigate
this problem, we proposed categorizing nodes based upon their dynamically measured
behavior. We suggested a Watchdog that identifies misbehaving nodes and a Pathrater
that helps routing protocols avoid these nodes. Through simulation we evaluated
Watchdog and Pathrater using packet throughput, percentage of overhead (routing)
transmissions, and the accuracy of misbehaving node detection. When used together
in a network with moderate mobility, the two techniques increase throughput by
17% in the presence of 40% misbehaving nodes, while increasing the percentage of
overhead transmissions from the standard routing protocol’s 9% to 17%. During
extreme mobility, Watchdog and Pathrater can increase network throughput by 27%,
while increasing the overhead transmissions from the standard routing protocol’s 12%
to 24%.
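The mechanism can be summarized in a few lines of Python. This is an illustrative sketch only: the class name, buffer layout, and failure threshold are assumptions for exposition, not the implementation evaluated in Chapter 7.

```python
# Minimal watchdog sketch: a node remembers packets it handed to its next
# hop and checks, by overhearing the channel, that they were retransmitted.
# Threshold and data layout are illustrative assumptions.

class Watchdog:
    def __init__(self, threshold=5):
        self.pending = {}      # packet id -> next-hop node id
        self.failures = {}     # node id -> count of dropped packets
        self.threshold = threshold

    def sent(self, packet_id, next_hop):
        """Record a packet we handed to next_hop for forwarding."""
        self.pending[packet_id] = next_hop

    def overheard(self, packet_id):
        """The next hop retransmitted the packet: clear it from the buffer."""
        self.pending.pop(packet_id, None)

    def timeout(self, packet_id):
        """Packet lingered too long unforwarded: tally a failure."""
        node = self.pending.pop(packet_id, None)
        if node is not None:
            self.failures[node] = self.failures.get(node, 0) + 1

    def misbehaving(self, node):
        """Accuse a node once its failure tally crosses the threshold."""
        return self.failures.get(node, 0) >= self.threshold

wd = Watchdog(threshold=2)
wd.sent("p1", "B"); wd.timeout("p1")
wd.sent("p2", "B"); wd.timeout("p2")
print(wd.misbehaving("B"))   # True -- B dropped two packets in a row
```

A Pathrater-style component would then consume `misbehaving()` verdicts to bias route selection away from accused nodes.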
The work contained in this thesis runs the gamut of system design and analysis,
from categorization and high-level theoretical models to specific system protocols
and mechanisms. At every level, open problems remain: many facets of reputation
system design have yet to be explored beyond what is presented here. We end by
presenting areas of research, building upon the work described in this thesis, that
must be explored in order to provide efficient and effective reputation management
in P2P systems.
Refine High-Level Models
Both the micro- and macroeconomic models presented in Chapters 3 and 4 (respectively)
could be refined into more sophisticated, realistic models. Our study of agent
strategies under reputation was limited to the perfect-knowledge case, in which buyers
had access to a seller's complete transaction history. The next step would be to
analyze agent strategies with limited or inaccurate views of transaction histories.
The mathematical model presented in Chapter 4 also assumed a perfect reputa-
tion system capable of accurately and instantly collecting the amount and type of
contributions made by each peer and maintaining their trust rating. Perturbations
could be added to the model in order to improve realism. One example would be to
introduce a delay between when contributions occur and when they affect the
contributor's reputation. A peer's contributive capacity could be made a function of
time, rather than constant as we assumed it to be. In addition, adding an error factor
to each contribution in the trust equation would mimic inaccurate or incomplete
transaction reporting.
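Such perturbations are straightforward to prototype. The sketch below discretizes the trust update ΔT = r_gC_G(1 − T) − (r_bC_B + δ)T of Chapter 4 (see also Appendix D) and adds a reporting delay and Gaussian reporting noise; the numeric parameter values and the noise model are illustrative assumptions.

```python
import random

# Sketch: perturbing the ideal trust update of Chapter 4 with a
# reporting delay and per-step noise (parameters are illustrative).

def simulate(steps, rg_cg=0.05, rb_cb=0.01, delta=0.005,
             delay=0, noise=0.0, t0=0.1, seed=1):
    rng = random.Random(seed)
    history = [rg_cg] * steps   # contribution made at each step
    t = t0
    trace = [t]
    for i in range(steps):
        # contributions take `delay` steps to reach the reputation system
        contrib = history[i - delay] if i >= delay else 0.0
        contrib += rng.gauss(0.0, noise)          # inaccurate reporting
        t += contrib * (1.0 - t) - (rb_cb + delta) * t
        trace.append(t)
    return trace

ideal = simulate(500)                          # perfect reputation system
delayed = simulate(500, delay=50, noise=0.01)  # delayed, noisy reporting
# Both converge toward rg_cg / (rg_cg + rb_cb + delta) ≈ 0.77,
# but the perturbed run lags well behind early on.
```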
SPROUT and Watchdog
The connectivity of nodes in social networks is not as regular as in structured P2P
networks. The additional message routing links derived from social connections used
in SPROUT could lead to imbalanced load across peers. An analysis of the effects
of SPROUT on load distribution, perhaps similar to that presented in Chapter 5, is
needed.
Some of the advantages of SPROUT could be applied to Pathrater. We expect the
performance of Pathrater to increase when it can make use of explicitly trusted nodes.
Trusted node lists are available in some ad hoc network scenarios, and we would like
to analyze the performance of our routing extensions in these scenarios. However, due
to the constrained transmission range, a node’s choice of link neighbors is limited.
Therefore, locating a trusted node may be unlikely, diminishing the effectiveness of
using a priori trust.
Bootstrapping Trust
Throughout the thesis we touch on the problem of how peers should initially regard
a new member joining the system. This issue is addressed primarily in Chapter 5,
and touched on in Chapter 4 as well. In those chapters a default low initial trust
rating is assigned to newcomers. However, more research is needed on how to balance
giving new members opportunities to contribute while not falling prey to
whitewashing.
One possibility is to leverage a priori trust relationships with existing users.
Should a peer in good standing vouch for a newcomer, the new node could receive
preferential treatment by the reputation system. Should the newcomer abuse this
trust, a penalty would be incurred by both the vouching peer and the misbehaving
newcomer.
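One way such a vouching scheme could be structured is sketched below. Every name, rating value, and penalty fraction here is a hypothetical illustration of the idea, not a proposed protocol.

```python
# Sketch of the vouching idea: a newcomer inherits a fraction of the
# voucher's standing, and a misbehaving newcomer costs the voucher too.
# All fractions and ratings below are illustrative assumptions.

class ReputationLedger:
    NEWCOMER_RATING = 0.05           # default low initial trust

    def __init__(self):
        self.rating = {}             # peer -> trust rating in [0, 1]
        self.voucher = {}            # newcomer -> peer who vouched for them

    def join(self, peer, vouched_by=None, inherit=0.5):
        if vouched_by in self.rating:
            # preferential treatment: inherit part of the voucher's rating
            self.rating[peer] = inherit * self.rating[vouched_by]
            self.voucher[peer] = vouched_by
        else:
            self.rating[peer] = self.NEWCOMER_RATING

    def punish(self, peer, penalty=0.5, spillover=0.25):
        """Misbehavior cuts the offender's rating; the voucher shares the loss."""
        self.rating[peer] *= 1.0 - penalty
        backer = self.voucher.get(peer)
        if backer is not None:
            self.rating[backer] *= 1.0 - spillover

ledger = ReputationLedger()
ledger.rating["alice"] = 0.8         # peer in good standing
ledger.join("newbie", vouched_by="alice")
ledger.join("stranger")
print(ledger.rating["newbie"], ledger.rating["stranger"])   # 0.4 vs 0.05
ledger.punish("newbie")
print(ledger.rating["alice"])        # 0.6 -- vouching carried a risk
```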
Peer-to-peer systems rely on the goodwill of many unknown and untrusted users
to function effectively. This reliance makes naïve P2P systems vulnerable to malicious
attacks. Reputation systems are needed to detect and deter malicious users, making
these networks available and usable for all. Effective decentralized reputation
systems will establish P2P technology as a viable medium for content distribution
and services.
Appendix A
Proof of Long-Term Reputation
Damage
Set the two utility equations (Eq. 3.6 and Eq. 3.7) equal to each other and solve for
k.
\[ U_c(n+1+k) = U_d(n+1+k) \tag{A.1} \]

\[ U(n) + \left(\frac{vm}{n} - c\right) + \sum_{i=1}^{k}\left(\frac{v\left(F_{-(n+1)}(n+i) + 1\right)}{n+i} - f(n+i+1)\,c\right)
 = U(n) + \frac{vm}{n} + \sum_{i=1}^{k}\left(\frac{v\,F_{-(n+1)}(n+i)}{n+i} - f(n+i+1)\,c\right) \tag{A.2} \]

Cancelling \(U(n)\), \(\frac{vm}{n}\), and the \(\sum_{i=1}^{k} f(n+i+1)\,c\) terms from both sides (A.3), and then the matching \(\sum_{i=1}^{k}\frac{v\,F_{-(n+1)}(n+i)}{n+i}\) terms (A.4), leaves

\[ v\sum_{i=1}^{k}\frac{1}{n+i} = c \tag{A.5} \]

\[ \sum_{i=1}^{k}\frac{1}{n+i} = \frac{c}{v} \tag{A.6} \]

\[ \sum_{i=1}^{n+k}\frac{1}{i} - \sum_{i=1}^{n}\frac{1}{i} = \frac{c}{v} \tag{A.7} \]
Now we have two finite harmonic sums. To simplify the summations, we apply
the asymptotic expansion of the finite harmonic sum [77]:

\[ H_n = \sum_{i=1}^{n}\frac{1}{i} = \ln(n) + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + \frac{1}{120n^4} - \epsilon,
 \qquad \text{where } 0 < \epsilon < \frac{1}{252n^6} \tag{A.8} \]

where \(\gamma\) is the Euler–Mascheroni constant.

\[ \text{Let } \varepsilon'(n) = \frac{1}{2n} \tag{A.9} \]
\[ \text{Clearly, } \ln(n) + \gamma < H_n < \ln(n) + \gamma + \varepsilon'(n) \tag{A.10} \]

Next we substitute the appropriate upper or lower bound for \(H_n\) for each summation
in Eq. A.7 so as to get an upper and a lower bound on \(k\).
\[ \left(\ln(n+k) + \gamma\right) - \left(\ln(n) + \gamma + \varepsilon'(n)\right) < \frac{c}{v} < \left(\ln(n+k) + \gamma + \varepsilon'(n+k)\right) - \left(\ln(n) + \gamma\right) \tag{A.11} \]
\[ \ln(n+k) - \ln(n) - \varepsilon'(n) < \frac{c}{v} < \ln(n+k) + \varepsilon'(n+k) - \ln(n) \tag{A.12} \]
\[ \ln\left(\frac{n+k}{n}\right) - \varepsilon'(n) < \frac{c}{v} < \ln\left(\frac{n+k}{n}\right) + \varepsilon'(n+k) \tag{A.13} \]
Notice that \(\varepsilon'(n+k) \le \varepsilon'(n)\) for all \(k \ge 0\). We can replace \(\varepsilon'(n+k)\) with \(\varepsilon'(n)\) without
invalidating the inequality.
\[ \ln\left(\frac{n+k}{n}\right) - \varepsilon'(n) < \frac{c}{v} < \ln\left(\frac{n+k}{n}\right) + \varepsilon'(n) \tag{A.15} \]
\[ \frac{n+k}{n}\,e^{-\varepsilon'(n)} < e^{c/v} < \frac{n+k}{n}\,e^{\varepsilon'(n)} \tag{A.16} \]
\[ (n+k)\,e^{-\varepsilon'(n)} < n\,e^{c/v} < (n+k)\,e^{\varepsilon'(n)} \tag{A.17} \]

Solving each inequality separately, we have

\[ (n+k)\,e^{-\varepsilon'(n)} < n\,e^{c/v} \qquad\qquad n\,e^{c/v} < (n+k)\,e^{\varepsilon'(n)} \tag{A.19} \]
\[ n+k < n\,e^{c/v}e^{\varepsilon'(n)} \qquad\qquad n+k > n\,e^{c/v}e^{-\varepsilon'(n)} \tag{A.20} \]
\[ k < n\left(e^{c/v}e^{\varepsilon'(n)} - 1\right) \qquad\qquad k > n\left(e^{c/v}e^{-\varepsilon'(n)} - 1\right) \tag{A.21} \]

Notice that \(e^{-\varepsilon'(n)} < 1\) and \(e^{\varepsilon'(n)} > 1\). Therefore, we approximate \(k\) as
\[ k \approx n\left(e^{c/v} - 1\right) \tag{A.22} \]
A.1 Error Bounds
What is the error range for \(k\)? Subtracting the lower bound from the upper bound
we have

\[ n\left(e^{c/v}e^{\varepsilon'(n)} - 1\right) - n\left(e^{c/v}e^{-\varepsilon'(n)} - 1\right) = e^{c/v}\,n\left(e^{\varepsilon'(n)} - e^{-\varepsilon'(n)}\right) \tag{A.23} \]

Consider
\[ \lim_{n\to\infty} n\left(e^{\varepsilon'(n)} - e^{-\varepsilon'(n)}\right) = \lim_{n\to\infty} n\left(e^{\frac{1}{2n}} - e^{-\frac{1}{2n}}\right) \tag{A.24} \]

Substituting \(x = \frac{1}{2n}\) gives
\[ \lim_{x\to 0} \frac{e^{x} - e^{-x}}{2x} \tag{A.25} \]
\[ \lim_{x\to 0} \frac{e^{x} - e^{-x}}{2x} = 1 \tag{A.26} \]

As \(n\to\infty\) the error range decreases and converges to \(e^{c/v}\), a constant with respect
to \(n\). Therefore, the largest error occurs when \(n\) is as small as possible. Recall that
we stated originally that we are not concerned with the situation where the seller is
new to the system; we assume she has already generated a history of several
transactions. Therefore, \(n\) is not small and definitely not 0. For example, using \(n = 5\)
in Eq. A.23, the error range is approximately \(1.002\,e^{c/v}\).

By definition \(c < v\), therefore \(e^{c/v} < e\). In our running example of \(c = \$1\) and
\(v = \$3\), \(e^{1/3} \approx 1.4\), so the error range is less than 1.5. Because we are interested
in \(k\) as an integer, the approximate value from Eq. A.22 cannot be off by more than 1.
Even in the worst case, where \(n = 1\) and \(e^{c/v}\) approaches \(e\), the error range is less
than 3, so the approximate value of \(k\) cannot be off by more than 2. For
the range of \(n\) we are interested in, the approximation of \(k\) should be acceptable.
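The approximation and its error bound are easy to sanity-check numerically. The sketch below compares Eq. A.22 (and the refined approximation of Section A.2) against the smallest integer k satisfying Eq. A.5; the parameter choices are illustrative.

```python
import math

# Numeric check of k ≈ n(e^{c/v} - 1): compare it, and the tighter
# (n + 1/2)(e^{c/v} - 1) of Section A.2, against the exact smallest
# integer k with v * sum_{i=1}^{k} 1/(n+i) >= c  (Eq. A.5).

def exact_k(n, c, v):
    total, k = 0.0, 0
    while v * total < c:
        k += 1
        total += 1.0 / (n + k)
    return k

for n, c, v in [(5, 1, 3), (20, 1, 3), (50, 1, 2), (100, 1, 3)]:
    basic = n * (math.exp(c / v) - 1)
    improved = (n + 0.5) * (math.exp(c / v) - 1)
    k = exact_k(n, c, v)
    # rounded to the nearest integer, both land within 1 of the exact k
    assert abs(round(basic) - k) <= 1
    assert abs(round(improved) - k) <= 1
```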
A.2 Improved Approximation
In the previous section we bounded the error to a range of size \(e^{c/v}\), constant with
respect to \(n\). However, it is possible to do better by using a tighter bound on the
harmonic number [62, 132]:

\[ \frac{1}{24(n+1)^2} < H_n - \ln\left(n+\tfrac{1}{2}\right) - \gamma < \frac{1}{24n^2} \tag{A.27} \]

Using this equation we now have tighter bounds on the error. To make the bounds
easier to use,

\[ \text{Let } \varepsilon''(n) = \frac{1}{24n^2} \tag{A.28} \]
\[ 0 < \varepsilon''(n+1) < H_n - \ln\left(n+\tfrac{1}{2}\right) - \gamma < \varepsilon''(n) \tag{A.29} \]

Substituting into Eq. A.7 we have

\[ \ln\left(n+k+\tfrac{1}{2}\right) + \gamma - \left(\ln\left(n+\tfrac{1}{2}\right) + \gamma + \varepsilon''(n)\right) < \frac{c}{v} < \ln\left(n+k+\tfrac{1}{2}\right) + \gamma + \varepsilon''(n+k) - \left(\ln\left(n+\tfrac{1}{2}\right) + \gamma\right) \tag{A.30} \]

Clearly, \(\varepsilon''(n+k) < \varepsilon''(n)\).
\[ \ln\left(n+k+\tfrac{1}{2}\right) - \ln\left(n+\tfrac{1}{2}\right) - \varepsilon''(n) < \frac{c}{v} < \ln\left(n+k+\tfrac{1}{2}\right) + \varepsilon''(n) - \ln\left(n+\tfrac{1}{2}\right) \tag{A.31} \]
\[ \frac{n+k+\frac{1}{2}}{n+\frac{1}{2}}\,e^{-\varepsilon''(n)} < e^{c/v} < \frac{n+k+\frac{1}{2}}{n+\frac{1}{2}}\,e^{\varepsilon''(n)} \tag{A.32} \]
\[ \left(n+k+\tfrac{1}{2}\right)e^{-\varepsilon''(n)} < \left(n+\tfrac{1}{2}\right)e^{c/v} < \left(n+k+\tfrac{1}{2}\right)e^{\varepsilon''(n)} \tag{A.33} \]

Solving each inequality separately, we have

\[ \left(n+k+\tfrac{1}{2}\right)e^{-\varepsilon''(n)} < \left(n+\tfrac{1}{2}\right)e^{c/v} \qquad\qquad \left(n+\tfrac{1}{2}\right)e^{c/v} < \left(n+k+\tfrac{1}{2}\right)e^{\varepsilon''(n)} \tag{A.34} \]
\[ n+k+\tfrac{1}{2} < \left(n+\tfrac{1}{2}\right)e^{c/v}e^{\varepsilon''(n)} \qquad\qquad n+k+\tfrac{1}{2} > \left(n+\tfrac{1}{2}\right)e^{c/v}e^{-\varepsilon''(n)} \tag{A.35} \]
\[ k < \left(n+\tfrac{1}{2}\right)\left(e^{c/v}e^{\varepsilon''(n)} - 1\right) \qquad\qquad k > \left(n+\tfrac{1}{2}\right)\left(e^{c/v}e^{-\varepsilon''(n)} - 1\right) \tag{A.36} \]

A better approximation for \(k\) than Eq. A.22 is
\[ k \approx \left(n+\tfrac{1}{2}\right)\left(e^{c/v} - 1\right) \tag{A.37} \]
The error range is now

\[ \left(n+\tfrac{1}{2}\right)\left(e^{c/v}e^{\varepsilon''(n)} - 1\right) - \left(n+\tfrac{1}{2}\right)\left(e^{c/v}e^{-\varepsilon''(n)} - 1\right) = e^{c/v}\left(n+\tfrac{1}{2}\right)\left(e^{\varepsilon''(n)} - e^{-\varepsilon''(n)}\right) \tag{A.38} \]

While our previous approximation gave constant bounds on the error range as \(n\)
grew, this approximation's error range shrinks to zero as \(n\) grows. Consider

\[ \lim_{n\to\infty}\left(n+\tfrac{1}{2}\right)\left(e^{\varepsilon''(n)} - e^{-\varepsilon''(n)}\right) = \lim_{n\to\infty}\left(n+\tfrac{1}{2}\right)\left(e^{\frac{1}{24n^2}} - e^{-\frac{1}{24n^2}}\right) \tag{A.39} \]

\[ \text{Substitute } x = \frac{1}{24n^2} \tag{A.40} \]

\[ \lim_{x\to 0}\left(\frac{e^{x} - e^{-x}}{\sqrt{24x}} + \frac{e^{x} - e^{-x}}{2}\right) = 0 + 0 = 0 \tag{A.41} \]

The second term goes to 0, and, using L'Hôpital's rule, the first term also goes to
0. So for sufficiently large \(n\), the error range converges to 0.

What happens for small \(n\)? With our previous approximation, the error range
for \(n = 1\) was approximately \(1.04\,e^{c/v}\), which has an upper bound of approximately
2.8 when \(c = v\). With the improved approximation the error range at \(n = 1\) is less
than \(0.13\,e^{c/v}\), which in the worst case is less than 0.34. Calculating \(k\) to the nearest
integer will therefore be correct with very high probability.
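The constants quoted above can be checked directly: with ε′(n) = 1/(2n) and ε″(n) = 1/(24n²), the two error-range factors evaluate as follows (a quick numeric sketch).

```python
import math

# Numeric check of the two error-range factors: the first approximation's
# n(e^{eps'} - e^{-eps'}) with eps'(n) = 1/(2n), and the improved
# (n + 1/2)(e^{eps''} - e^{-eps''}) with eps''(n) = 1/(24 n^2).

def basic_range(n):
    eps = 1.0 / (2 * n)
    return n * (math.exp(eps) - math.exp(-eps))

def improved_range(n):
    eps = 1.0 / (24 * n * n)
    return (n + 0.5) * (math.exp(eps) - math.exp(-eps))

print(basic_range(1))       # ≈ 1.042  (the "approximately 1.04 e^{c/v}")
print(improved_range(1))    # ≈ 0.125  (less than 0.13 e^{c/v})
print(basic_range(1000))    # → 1 as n grows: range stays ≈ e^{c/v}
print(improved_range(1000)) # → 0 as n grows: range vanishes
```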
Appendix B
Proof of Unique Global Maximum
for Segregated Schedule Utility
From Equation 3.15 we have the following expression for the utility of a segregated
schedule of length \(Z\) with \(x\) cooperations followed by \(Z - x\) defections. We may
express the total utility of such a schedule as

\[ U_{seg}(Z, x) = U = (v - c)\,x + \sum_{i=x}^{Z-1}\frac{vx}{i} \tag{B.1} \]

To prove there can be only one value of \(x\) that maximizes \(U_{seg}(Z, x)\) for a
given \(Z\), we take the second derivative with respect to \(x\) and show that it takes
on only negative values for \(0 \le x \le Z\).

We begin by simplifying Eq. B.1:

\[ U = (v - c)\,x + \sum_{i=x}^{Z-1}\frac{vx}{i} \tag{B.2} \]
\[ = (v - c)\,x + vx\left(\sum_{i=1}^{Z-1}\frac{1}{i} - \sum_{i=1}^{x-1}\frac{1}{i}\right) \tag{B.3} \]
We now have the difference of two finite harmonic sums. A finite harmonic sum
can be expressed analytically as

\[ H_n = \gamma + \psi_0(n+1) \tag{B.4} \]

where \(\gamma\) is the Euler–Mascheroni constant and \(\psi_0\) is the digamma function [132].
Substituting in for the two series gives us

\[ U = (v - c)\,x + vx\left(\gamma + \psi_0(Z) - \left(\gamma + \psi_0(x)\right)\right) \tag{B.5} \]

Next, we simplify and take two derivatives. The derivative of \(\psi_0(z)\) is \(\psi_1(z)\) and
similarly the derivative of \(\psi_1(z)\) is \(\psi_2(z)\), where \(\psi_1(z)\) and \(\psi_2(z)\) are polygamma
functions [133].

\[ U = (v - c)\,x + vx\left(\psi_0(Z) - \psi_0(x)\right) \tag{B.7} \]
\[ = (v - c)\,x + v\,\psi_0(Z)\,x - vx\,\psi_0(x) \tag{B.8} \]
\[ \frac{dU}{dx} = (v - c) + v\,\psi_0(Z) - v\,\psi_0(x) - vx\,\psi_1(x) \tag{B.9} \]
\[ \frac{d^2U}{dx^2} = -v\,\psi_1(x) - v\,\psi_1(x) - vx\,\psi_2(x) \tag{B.10} \]
\[ = -v\left(2\psi_1(x) + x\,\psi_2(x)\right) \tag{B.12} \]
A polygamma function \(\psi_n(z)\) can be written as follows [133]:

\[ \psi_n(z) = (-1)^{n+1}\,n!\sum_{k=0}^{\infty}\frac{1}{(z+k)^{n+1}} \tag{B.13} \]

Applying Eq. B.13 to Eq. B.12 and simplifying:

\[ \frac{d^2U}{dx^2} = -v\left(2\psi_1(x) + x\,\psi_2(x)\right) \tag{B.14} \]
\[ = -v\left[2\left(\sum_{k=0}^{\infty}\frac{1}{(x+k)^2}\right) + x\left(-2\sum_{k=0}^{\infty}\frac{1}{(x+k)^3}\right)\right] \tag{B.15} \]
\[ = -2v\sum_{k=0}^{\infty}\left(\frac{x+k}{(x+k)^3} - \frac{x}{(x+k)^3}\right) \tag{B.16} \]
\[ = -2v\sum_{k=0}^{\infty}\frac{k}{(x+k)^3} \tag{B.17} \]
Notice that for any valid value of \(x\), \(0 < x \le Z\), the summation is strictly positive.
Therefore, the second derivative of \(U_{seg}(Z, x)\) must be negative over that entire
range, so \(U_{seg}\) is strictly concave and has a unique maximum.
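A brute-force check corroborates the proof: evaluating U_seg(Z, x) at every integer x shows a single interior peak. The sample values of v, c, and Z below are illustrative.

```python
# Brute-force check that U_seg(Z, x) = (v - c) x + sum_{i=x}^{Z-1} v x / i
# has a single peak in x, as the negative second derivative implies.

def u_seg(Z, x, v=3.0, c=1.0):
    return (v - c) * x + sum(v * x / i for i in range(x, Z))

for Z in (10, 30, 100):
    utils = [u_seg(Z, x) for x in range(1, Z + 1)]
    diffs = [b - a for a, b in zip(utils, utils[1:])]
    # strict concavity: the first differences change sign exactly once
    sign_changes = sum(1 for a, b in zip(diffs, diffs[1:])
                       if (a > 0) != (b > 0))
    assert sign_changes == 1
    best = 1 + max(range(len(utils)), key=utils.__getitem__)
    print(Z, best)   # e.g. Z = 30 gives x = 21 cooperations
```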
Appendix C
Estimating Optimal Schedule for
Fixed Number of Transactions
From Equation 3.9 we know that, given a number of completed transactions n and
a cost/valuation ratio c/v, we can calculate how many additional transactions k are
needed so that the utilities from cooperating and from defecting on turn n + 1 are
equal. Consequently, a seller benefits more from defecting if she participates in fewer
than k additional transactions, and benefits more from cooperating if she participates
in more than k additional transactions. Therefore, for a given number of total
transactions Z, we can determine how many cooperations are optimal by writing
Z = n + 1 + k, substituting Equation 3.9 for k, and solving for n.
\[ Z = n + 1 + k \tag{C.1} \]
\[ Z = n + 1 + \left(n + \tfrac{1}{2}\right)\left(e^{c/v} - 1\right) \tag{C.2} \]
\[ Z = 1 + n\,e^{c/v} + \tfrac{1}{2}e^{c/v} - \tfrac{1}{2} \tag{C.3} \]
\[ n = \left(Z - \tfrac{1}{2}\right)e^{-c/v} - \tfrac{1}{2} \tag{C.4} \]

We now have the optimal number of cooperations in terms of \(Z\), the total number
of transactions: \(n_C(Z)\). Subtracting the value of \(n\) in Eq. C.4 from \(Z\) gives us
the optimal number of defections in terms of \(Z\), \(n_D(Z)\):

\[ n_D(Z) = Z - \left(\left(Z - \tfrac{1}{2}\right)e^{-c/v} - \tfrac{1}{2}\right) \tag{C.5} \]
\[ = Z - \left(Z - \tfrac{1}{2}\right)e^{-c/v} + \tfrac{1}{2} \tag{C.6} \]
\[ = \left(Z - \tfrac{1}{2}\right) + \tfrac{1}{2} - \left(Z - \tfrac{1}{2}\right)e^{-c/v} + \tfrac{1}{2} \tag{C.7} \]
\[ = \left(Z - \tfrac{1}{2}\right)\left(1 - e^{-c/v}\right) + 1 \tag{C.8} \]
The previous equations allow real-numbered values. Because we are interested
in integer values, we must apply the proper integer conversions. For a fixed number of
transactions \(Z\), the utility-optimal numbers of cooperations and defections,
respectively, are

\[ n_C(Z) = \left\lceil \left(Z - \tfrac{1}{2}\right)e^{-c/v} - \tfrac{1}{2} \right\rceil \tag{C.9} \]
\[ n_D(Z) = \left\lfloor \left(Z - \tfrac{1}{2}\right)\left(1 - e^{-c/v}\right) + 1 \right\rfloor \tag{C.10} \]
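Eqs. C.9 and C.10 can be sanity-checked numerically; in particular, the ceiling and floor are complementary, so n_C(Z) + n_D(Z) = Z for every Z. A short sketch, using the running example c = $1 and v = $3:

```python
import math

# The integer-valued schedule of Eqs. C.9 / C.10: for a total number of
# transactions Z, the optimal cooperations and defections partition Z.

def n_coop(Z, c=1.0, v=3.0):
    return math.ceil((Z - 0.5) * math.exp(-c / v) - 0.5)

def n_defect(Z, c=1.0, v=3.0):
    return math.floor((Z - 0.5) * (1 - math.exp(-c / v)) + 1)

for Z in (10, 30, 100, 1000):
    assert n_coop(Z) + n_defect(Z) == Z
print(n_coop(30), n_defect(30))   # 21 cooperations, 9 defections
```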
Appendix D
Mathematical Derivations of
Economic Model
Here we present the derivations of equations from Chapter 4 to help the reader un-
derstand the process.
D.1 Utility Over Time
\[ \int P\,dt = \int \left(\pi_{gt}k_vA - k_uA + (k_mC_B + k_pC - k_cC)\,T(t) - \kappa\right)dt \tag{D.1} \]

\[ U(t) = \pi_{gt}k_vAt - k_uAt + (k_mC_B + k_pC - k_cC)\int T(t)\,dt - \kappa t + Y \tag{D.2} \]

\[ = \left(\pi_{gt}k_vA - k_uA - \kappa\right)t + (k_mC_B + k_pC - k_cC)\,\frac{\ln\left((r_gC_G + r_bC_B + \delta)\,e^{r_gC_Gt} + Z\right)}{r_gC_G + r_bC_B + \delta} + Y \tag{D.3} \]

\[ = \left(\pi_{gt}k_vA - k_uA - \kappa\right)t + (k_mC_B + k_pC - k_cC)\,\frac{\ln\left((r_gC_G + r_bC_B + \delta)\left(e^{r_gC_Gt} - 1\right) + \frac{r_gC_G}{T(0)}\right)}{r_gC_G + r_bC_B + \delta} + Y \tag{D.4} \]

\[ \text{where } Y = U(0) - (k_mC_B + k_pC - k_cC)\,\frac{\ln\left(\frac{r_gC_G}{T(0)}\right)}{r_gC_G + r_bC_B + \delta} \tag{D.5} \]

\[ U(t) = \left(\pi_{gt}k_vA - k_uA - \kappa\right)t + (k_mC_B + k_pC - k_cC)\,\frac{\ln\left((r_gC_G + r_bC_B + \delta)\left(e^{r_gC_Gt} - 1\right)\frac{T(0)}{r_gC_G} + 1\right)}{r_gC_G + r_bC_B + \delta} + U(0) \tag{D.7} \]
D.2 Generalized Trust Over Time (σ(T, p∗) = 1)
\[ \int \Delta T\,dt = \int \left(r_gC_G\left(1 - T(t)\right) - (r_bC_B + \delta)\,T(t)\right)\sigma(T, p^*)\,dt \]

With \(\sigma(T, p^*) = 1\), this yields

\[ T(t) = \frac{r_gC_G}{r_gC_G + r_bC_B + \delta} + Z\,e^{-(r_gC_G + r_bC_B + \delta)t},
 \qquad \text{where } Z = T(0) - \frac{r_gC_G}{r_gC_G + r_bC_B + \delta} \tag{D.8} \]
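Eq. D.8 can be verified against a direct numerical integration of the trust differential equation. The parameter values below are arbitrary illustrative choices.

```python
import math

# Check Eq. D.8 against a forward-Euler integration of
# dT/dt = r_g C_G (1 - T) - (r_b C_B + delta) T   (sigma = 1).
# Parameter values are illustrative.

RG_CG, RB_CB, DELTA = 0.05, 0.01, 0.005
R = RG_CG + RB_CB + DELTA
T0 = 0.2

def closed_form(t):
    z = T0 - RG_CG / R
    return RG_CG / R + z * math.exp(-R * t)

# integrate numerically with a small step
dt, t, T = 0.001, 0.0, T0
while t < 100.0:
    T += dt * (RG_CG * (1.0 - T) - (RB_CB + DELTA) * T)
    t += dt
print(T, closed_form(100.0))   # both ≈ 0.768, near the steady state rg CG / R
```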
Bibliography
[1] Stanford Peers research group. http://www-db.stanford.edu/peers/.
[2] Martín Abadi, Mike Burrows, Mark Manasse, and Ted Wobber. Moderately
hard, memory-bound functions. In Proceedings of the 10th Annual Network
and Distributed System Security Symposium, 2003.
[3] Lada Adamic. Search in power law networks. Physical Review E, 64:046135,
2001.
[4] Lada Adamic. Personal communication, 2002.
[5] Eytan Adar and Bernardo A. Huberman. Free riding on gnutella. First Monday,
5(10), October 2000.
[6] Gagan Agarwal, Mayank Bawa, Prasanna Ganesan, Hector Garcia-Molina, Kr-
ishnaram Kenthapadi, Nina Mishra, Rajeev Motwani, Utkarsh Srivastava, Dilys
Thomas, Jennifer Widom, and Ying Xu. Vision Paper: Enabling Privacy for
the Paranoids. In VLDB, 2004.
[7] Gagan Agarwal, Mayank Bawa, Prasanna Ganesan, Hector Garcia-Molina, Kr-
ishnaram Kenthapadi, Rajeev Motwani, Utkarsh Srivastava, Dilys Thomas, and
Ying Xu. Two Can Keep a Secret: A Distributed Architecture for Secure Data-
base Services. In CIDR, 2005.
[8] Apple Computer, Inc. iTunes, 2004. http://www.apple.com/itunes/.
[9] Robert Axelrod. The Evolution of Cooperation. Basic Books, 1984.
[10] Stefano Basagni, Imrich Chlamtac, Violet R. Syrotiuk, and Barry A. Woodward.
A distance routing effect algorithm for mobility (dream). In MobiCom ’98:
Proceedings of the 4th annual ACM/IEEE international conference on Mobile
computing and networking, pages 76–84. ACM Press, 1998.
[11] Mayank Bawa, Brian F. Cooper, Arturo Crespo, Neil Daswani, Prasanna Gane-
san, Hector Garcia-Molina, Sepandar Kamvar, Sergio Marti, Mario Schlosser,
Qi Sun, Patrick Vinograd, and Beverly Yang. Peer-to-peer research at stanford.
SIGMOD Rec., 32(3):23–28, 2003.
[12] BBC News. Viruses turn to peer-to-peer nets. BBC News, January 20, 2004.
[13] Vaduvur Bharghavan, Alan Demers, Scott Shenker, and Lixia Zhang. Macaw:
a media access protocol for wireless lan’s. In SIGCOMM ’94: Proceedings of the
conference on Communications architectures, protocols and applications, pages
212–225. ACM Press, 1994.
[14] Alberto Blanc, Yi-Kai Liu, and Amin Vahdat. Designing Incentives for Peer-
to-Peer Routing. In Workshop on Economics of Peer-to-Peer Systems.
[15] Josh Broch, David A. Maltz, David B. Johnson, Yih-Chun Hu, and Jorjeta
Jetcheva. A performance comparison of multi-hop wireless ad hoc network
routing protocols. In MobiCom ’98: Proceedings of the 4th annual ACM/IEEE
international conference on Mobile computing and networking, pages 85–97.
ACM Press, 1998.
[16] Sonja Buchegger and Jean-Yves Le Boudec. Performance analysis of the confi-
dant protocol (cooperation of nodes - fairness in dynamic ad-hoc networks). In
Proceedings of MobiHoc 2002, Lausanne, June 2002.
[17] Sonja Buchegger, Cedric Tissieres, and Jean-Yves Le Boudec. A test-bed for
misbehavior detection in mobile ad-hoc networks - how much can watchdogs
really do? In Proceedings of IEEE WMCSA 2004, English Lake District, UK,
December 2004.
[18] Chiranjeeb Buragohain, Divyakant Agrawal, and Subhash Suri. A Game The-
oretic Framework for Incentives in P2P Systems. In IEEE 3rd International
Conference on Peer-to-Peer Computing (P2P 2003).
[19] Orkut Buyukkokten. Club Nexus, 2001.
[20] Robert Castaneda and Samir R. Das. Query localization techniques for on-
demand routing protocols in ad hoc networks. In MobiCom ’99: Proceedings of
the 5th annual ACM/IEEE international conference on Mobile computing and
networking, pages 186–194. ACM Press, 1999.
[21] Miguel Castro, Peter Druschel, Ayalvadi Ganesh, Antony Rowstron, and Dan S.
Wallach. Secure routing for structured peer-to-peer overlay networks. In Pro-
ceedings of the Fifth Symposium on Operating Systems Design and Implemen-
tation, 2002.
[22] Kay-Yut Chen, Tad Hogg, and Nathan Wozny. Experimental Study of Market
Reputation Mechanisms. In ACM Conference on Electronic Commerce (EC’04),
2004.
[23] Bram Cohen. Incentives Build Robustness in BitTorrent. In Workshop on
Economics of Peer-to-Peer Systems, 2003.
[24] Brian F. Cooper, Mayank Bawa, Neil Daswani, and Hector Garcia-Molina. Pro-
tecting the PIPE from Malicious Peers. Technical report, Stanford University,
2002.
[25] Fabrizio Cornelli, Ernesto Damiani, and Sabrina De Capitani di Vimercati.
Choosing Reputable Servents in a P2P Network. In Proc. of the 11th International World
Wide Web Conference, 2002.
[26] Scott Corson and Vincent Park. Temporally-Ordered Routing Algorithm
(TORA) Version 1 Functional Specification. Mobile Ad-hoc Network (MANET)
Working Group, IETF, October 1999.
[27] Arturo Crespo and Hector Garcia-Molina. Routing Indices For Peer-to-Peer
Systems. Proceedings of the International Conference on Distributed Computing
Systems (ICDCS), July 2002.
[28] B. P. Crow, I. K. Widjaja, G. Jeong, and P. T. Sakai. IEEE-802.11 Wireless
Local Area Networks. IEEE Communications Magazine, 35(9):116–126, Sep-
tember 1997.
[29] Ernesto Damiani, Sabrina De Capitani di Vimercati, Stefano Paraboschi, Pierangela
Samarati, and Fabio Violante. A reputation-based approach for choosing reli-
able resources in peer-to-peer networks. In Proceedings of the 9th ACM confer-
ence on Computer and communications security, pages 207–216. ACM Press,
2002.
[30] Adam D’Angelo. BuddyZoo. http://www.buddyzoo.com.
[31] Samir Das, Charles E. Perkins, and Elizabeth M. Royer. Ad Hoc On Demand
Distance Vector (AODV) Routing (Internet-Draft). Mobile Ad-hoc Network
(MANET) Working Group, IETF, October 1999.
[32] Neil Daswani. Personal communication, 2004.
[33] Neil Daswani. Denial of Service Attacks and Commerce Infrastructure in Peer-
to-peer Networks. PhD thesis, Stanford University, 2004.
[34] Neil Daswani and Hector Garcia-Molina. Query-Flood DoS Attacks in Gnutella.
In ACM Conference on Computer and Communications Security, 2002.
[35] Neil Daswani and Hector Garcia-Molina. Pong-Cache Poisoning in GUESS. In
ACM Conference on Computer and Communications Security, 2004.
[36] Jay L. Devore. Probability and statistics for engineering and the sciences.
Brooks/Cole Publishing Co., 3rd edition, 1991.
[37] Roger Dingledine, Michael J. Freedman, David Hopwood, and David Molnar.
A reputation system to increase MIX-net reliability. Lecture Notes in Computer
Science, 2137:126+, 2001.
[38] John R. Douceur. The Sybil Attack. In Proc. of the International Workshop
on Peer-to-Peer Systems, 2002.
[39] IETF MANET Working Group Internet Drafts.
http://www.ietf.org/ids.by.wg/manet.html.
[40] Cynthia Dwork, Andrew Goldberg, and Moni Naor. On memory-bound func-
tions for fighting spam. In Advances in Cryptology – CRYPTO’03, 2003.
[41] Cynthia Dwork and Moni Naor. Pricing via processing. In Advances in Cryp-
tology – CRYPTO’92, 1992.
[42] eBay - The World’s Online Marketplace. http://www.ebay.com/.
[43] K. Fall and K. Varadhan. ns notes and documentation. The VINT Project,
UC Berkeley, LBL, USC/ISI, and Xerox PARC. Available from
http://www-mash.cs.berkeley.edu/ns/, July 1999.
[44] Michalis Faloutsos, Petros Faloutsos, and Christos Faloutsos. On power-law
relationships of the internet topology. In SIGCOMM, pages 251–262, 1999.
[45] Michal Feldman, Kevin Lai, Ion Stoica, and John Chuang. Robust Incentive
Techniques for Peer-to-Peer Networks. In ACM Conference on Electronic Com-
merce (EC’04), 2004.
[46] Michal Feldman, Christos Padimitriou, John Chuang, and Ion Stoica. Free-
Riding and Whitewashing in Peer-to-Peer Systems. In ACM SIGCOMM 2004,
Workshop of Practice and Theory of Incentives and Game Theory in Networked
Systems, 2004.
[47] Michael Freedman and Robert Morris. Tarzan: A Peer-to-Peer Anonymizing
Network Layer. In Proceedings of the 9th ACM Conference on Computer and
Communications Security, 2002.
[48] Eric Friedman and Paul Resnick. The social cost of cheap pseudonyms. Journal
of Economics and Management Strategy, 10(2):173–199, 1998.
[49] Friendster Inc. Friendster Beta, 2003. http://www.friendster.com.
[50] Drew Fudenberg and David K. Levine. Reputation and Equilibrium Selection
in Games with a Patient Player. Econometrica, (57), 1989.
[51] J. J. Garcia-Luna-Aceves and Marcelo Spohn. Source-tree routing in wireless
networks. In ICNP ’99: Proceedings of the Seventh Annual International Con-
ference on Network Protocols, page 273. IEEE Computer Society, 1999.
[52] J.J. Garcia-Luna-Aceves, Marcelo Spohn, and David Beyer. Source Tree
Adaptive Routing (STAR) Protocol (Internet-Draft). Mobile Ad-hoc Network
(MANET) Working Group, IETF, October 1999.
[53] Yolanda Gil and Varun Ratnakar. Trusting information sources one citizen at
a time. In Proceedings of the First International Semantic Web Conference
(ISWC), 2002.
[54] T.J. Giuli. Personal communication, 2005.
[55] TJ Giuli, Petros Maniatis, Mary Baker, David S. H. Rosenthal, and Mema
Roussopoulos. Attrition defenses for a peer-to-peer digital preservation system.
In Proceedings of the USENIX Technical Conference, 2005.
[56] Gnutella protocol specification v0.4.
http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf.
[57] H. Charles J. Godfray. Signalling of need by offspring to their parents. Nature,
(352):328–330, 1991.
[58] A. Grafen. Biological signals as handicaps. Journal of Theoretical Biology,
(144):517–546, 1990.
[59] R. Guha, Ravi Kumar, Prabhakar Raghavan, and Andrew Tomkins. Prop-
agation of trust and distrust. In Proceedings of the 13th World Wide Web
Conference (WWW2004), 2004.
[60] K. Gummadi, R. Gummadi, S. Gribble, S. Ratnasamy, S. Shenker, and I. Stoica.
The impact of DHT routing geometry on resilience and proximity. In Proc.
ACM SIGCOMM, 2003.
[61] Minaxi Gupta, Paul Judge, and Mostafa Ammar. A reputation system for
peer-to-peer networks. In ACM 13th International Workshop on Network and
Operating Systems Support for Digital Audio and Video, 2003.
[62] Julian Havil. Gamma: Exploring Euler’s Constant. Princeton University Press,
2003.
[63] Tad Hogg and Lada Adamic. Enhancing Reputation Mechanisms via Online
Social Networks. In ACM Conference on Electronic Commerce (EC’04), 2004.
[64] B. Horne, B. Pinkas, and T. Sander. Escrow Services and Incentives in Peer-to-
Peer Networks. In Proceedings of 3rd ACM Conference on Electronic Commerce,
2001.
[65] Bernardo A. Huberman and Fang Wu. The dynamics of reputations.
www.hpl.hp.com/shl/papers/reputations/, 2002.
[66] IDC. Internet commerce model, v9.3, January 2005.
[67] IFILM Corp. IFILM, 2004. http://www.ifilm.com.
[68] Per Johansson, Tony Larsson, Nicklas Hedman, Bartosz Mielczarek, and Mikael
Degermark. Scenario-based performance analysis of routing protocols for mobile
ad-hoc networks. In MobiCom ’99: Proceedings of the 5th annual ACM/IEEE
international conference on Mobile computing and networking, pages 195–206.
ACM Press, 1999.
[69] Dave Johnson. Personal Communication, February 2000.
[70] David B. Johnson, David A. Maltz, and Josh Broch. The Dynamic Source
Routing Protocol for Mobile Ad Hoc Networks (Internet-Draft). Mobile Ad-hoc
Network (MANET) Working Group, IETF, October 1999.
[71] J. Jubin and J. Tornow. The DARPA Packet Radio Network Protocols. Pro-
ceedings of the IEEE, 75(1):21–32, 1987.
[72] Radu Jurca and Boi Faltings. Towards incentive-compatible reputation man-
agement. In Proceedings of the AAMAS 2002 Workshop on Deception, Fraud
and Trust in Agent Societies.
[73] Sepandar D. Kamvar, Mario T. Schlosser, and Hector Garcia-Molina. The
EigenTrust Algorithm for Reputation Management in P2P Networks. In Pro-
ceedings of the Twelfth International World Wide Web Conference, 2003.
[74] KaZaA Home Page. http://www.kazaa.com/.
[75] C. Keser. Experimental games for the design of reputation management sys-
tems. IBM Systems Journal, 42(3):498–506, 2003.
[76] Donald E. Knuth. Seminumerical Algorithms, volume 2 of The Art of Computer
Programming. Addison-Wesley Publishing Co., 1969.
[77] Donald E. Knuth. Fundamental Algorithms, volume 1 of The Art of Computer
Programming. Addison-Wesley Publishing Co., 2nd edition, 1973.
[78] Young-Bae Ko and Nitin H. Vaidya. Location-aided routing (LAR) in mobile
ad hoc networks. In MobiCom ’98: Proceedings of the 4th annual ACM/IEEE
international conference on Mobile computing and networking, pages 66–75.
ACM Press, 1998.
[79] Young-Bae Ko and Nitin H. Vaidya. Geocasting in Mobile Ad Hoc Networks:
Location-Based Multicast Algorithms. In WMCSA’99, 1999.
[80] D. Kreps and R. Wilson. Reputation and Imperfect Information. Journal of
Economic Theory, 27:253–279, 1982.
[81] John Kubiatowicz, David Bindel, Yan Chen, Patrick Eaton, Dennis Geels,
Ramakrishna Gummadi, Sean Rhea, Hakim Weatherspoon, Westly Weimer,
Christopher Wells, and Ben Zhao. OceanStore: An Architecture for Global-
scale Persistent Storage. In Proceedings of ACM ASPLOS. ACM, November
2000.
[82] Kevin Lai. Personal communication, 2004.
[83] Kevin Lai, Michal Feldman, Ion Stoica, and John Chuang. Incentives for Coop-
eration in Peer-to-Peer Networks. In Workshop on Economics of Peer-to-Peer
Systems, 2003.
[84] Seungjoon Lee, Rob Sherwood, and Bobby Bhattacharjee. Cooperative Peer
Groups in NICE. In Proceedings of the IEEE INFOCOM, 2003.
[85] Qin Lv, Pei Cao, Edith Cohen, Kai Li, and Scott Shenker. Search and repli-
cation in unstructured peer-to-peer networks. In Proceedings of the 2002 ACM
SIGMETRICS international conference on Measurement and modeling of com-
puter systems.
[86] Mylene Mangalindan. Some Sellers Leave eBay Over New Fees. Wall Street
Journal, page B.1, January 31, 2005.
[87] Petros Maniatis, Mema Roussopoulos, TJ Giuli, David S. H. Rosenthal, Mary
Baker, and Yanto Muliadi. Preserving peer replicas by rate-limited sampled vot-
ing. In 19th ACM Symposium on Operating Systems Principles (SOSP 2003),
2003.
[88] R. Marimon, J. Nicolini, and P. Teles. Competition and reputation. In Pro-
ceedings of the World Conference Econometric Society, 2000.
[89] Sergio Marti, Prasanna Ganesan, and Hector Garcia-Molina. SPROUT: P2P
Routing with Social Networks. In International Workshop on Peer-to-Peer
Computing & DataBases (P2P&DB 2004), 2004.
[90] Sergio Marti, Prasanna Ganesan, and Hector Garcia-Molina. SPROUT:
P2P Routing with Social Networks. Technical report, 2004.
dbpubs.stanford.edu/pub/2004-5.
[91] Sergio Marti and Hector Garcia-Molina. Identity Crisis: Anonymity vs. Repu-
tation in P2P Systems. In IEEE 3rd International Conference on Peer-to-Peer
Computing (P2P 2003), 2003.
[92] Sergio Marti and Hector Garcia-Molina. Examining Metrics for Reputation
Systems (in progress). Technical report, 2003. dbpubs.stanford.edu/pub/2003-
39.
[93] Sergio Marti and Hector Garcia-Molina. Limited Reputation Sharing in P2P
Systems. In ACM Conference on Electronic Commerce (EC’04), 2004.
[94] Sergio Marti and Hector Garcia-Molina. Modeling Reputation and
Incentives in Online Trade (extended). Technical report, 2004.
dbpubs.stanford.edu/pub/2004-45.
[95] Sergio Marti and Hector Garcia-Molina. A Game Theoretic Approach to Rep-
utation (extended). Technical report, 2004. dbpubs.stanford.edu/pub/2004-49.
[96] Sergio Marti, Prasanna Ganesan, and Hector Garcia-Molina. DHT Routing
Using Social Links. In 3rd International Workshop on Peer-to-Peer Systems
(IPTPS’04), 2004.
[97] Sergio Marti, T.J. Giuli, Kevin Lai, and Mary Baker. Mitigating Routing
Misbehavior in Mobile Ad Hoc Networks. In MobiCom ’00: Proceedings of
the 6th annual ACM/IEEE international conference on Mobile computing and
networking, 2000.
[98] Les McClain. RIAA posting bad music files to deter illegal downloaders. The
Daily Texan, February 6, 2004.
[99] Pietro Michiardi and Refik Molva. Core: a collaborative reputation mecha-
nism to enforce node cooperation in mobile ad hoc networks. In Sixth IFIP
Conference on Security, Communications and Multimedia, 2002.
[100] P. Milgrom and J. Roberts. Limit Pricing and Entry Under Incomplete Infor-
mation: An Equilibrium Analysis. Econometrica, (50):443–60, 1982.
[101] Kieron O’Hara, Harith Alani, Yannis Kalfoglou, and Nigel Shadbolt. Trust
Strategies for the Semantic Web. In ISWC’04 Workshop on Trust, Security
and Reputation on the Semantic Web, 2004.
[102] John K. Ousterhout. Tcl and the Tk Toolkit. Addison Wesley, 1994.
[103] Larry Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The PageRank
citation ranking: Bringing order to the web. Technical report, Stanford Digital
Library Technologies Project, 1998.
[104] C. Palmer and J. Steffan. Generating network topologies that obey power laws.
In Proceedings of GLOBECOM 2000.
[105] Se Hyun Park, Aura Ganz, and Zvi Ganz. Security protocol for IEEE 802.11
wireless local area network. Mobile Networks and Applications, 3:237–246, 1998.
[106] Charles Perkins and Pravin Bhagwat. Highly dynamic destination-sequenced
distance-vector routing (DSDV) for mobile computers. In ACM SIGCOMM’94
Conference on Communications Architectures, Protocols and Applications,
pages 234–244, 1994.
[107] Ryan Porter and Yoav Shoham. Designing Efficient Online Trading Systems.
In ACM Conference on Electronic Commerce (EC’04), 2004.
[108] R. Prakash. Unidirectional links prove costly in wireless ad-hoc networks. In
Proceedings of DIMACS Workshop on Mobile Networks and Computers, 1999.
[109] The CMU Monarch Project. The CMU Monarch Project's wireless and mobility
extensions to ns. http://www.monarch.cs.cmu.edu/cmu-ns.html, October 1999.
[110] A. R. Puniyani, R. M. Lukose, and B. A. Huberman. Intentional Walks
on Scale Free Small Worlds. ArXiv Condensed Matter e-prints, July 2001.
http://aps.arxiv.org/abs/cond-mat/0107212.
[111] Eric Rasmusen. Games and Information. Basil Blackwell Ltd., 1989.
[112] Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and Scott
Shenker. A scalable content addressable network. Technical Report TR-00-
010, Berkeley, CA, 2000.
[113] Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and Scott
Shenker. A Scalable Content-Addressable Network. In Proceedings of the ACM
SIGCOMM Symposium on Communication, Architecture, and Protocols, pages
161–172, San Diego, CA, U.S.A., August 2001. ACM SIGCOMM.
[114] Vicky Reich and David S. H. Rosenthal. LOCKSS: A Permanent
Web Publishing and Access System. D-Lib Magazine, 7(6), June 2001.
http://www.dlib.org/dlib/june01/reich/06reich.html.
[115] Michael K. Reiter and Aviel D. Rubin. Crowds: Anonymity for web transac-
tions. In ACM Transactions on Information and System Security, 1998.
[116] Paul Resnick, Richard Zeckhauser, Eric Friedman, and Ko Kuwabara. Reputa-
tion systems. Communications of the ACM, pages 45–48, December 2000.
[117] Tim Roughgarden. Personal communication, 2004.
[118] Antony Rowstron and Peter Druschel. Pastry: Scalable, decentralized object
location, and routing for large-scale peer-to-peer systems. IFIP/ACM Interna-
tional Conference on Distributed Systems Platforms, pages 329–350, 2001.
[119] Stefan Saroiu, P. Krishna Gummadi, and Steven D. Gribble. A measurement
study of peer-to-peer file sharing systems. In Proceedings of Multimedia Com-
puting and Networking 2002 (MMCN ’02), San Jose, CA, USA, January 2002.
[120] Aameek Singh and Lin Liu. TrustMe: Anonymous Management of Trust Rela-
tionships in Decentralized P2P Systems. In IEEE 3rd International Conference
on Peer-to-Peer Computing (P2P 2003), 2003.
[121] Bradley Smith and J.J. Garcia-Luna-Aceves. Efficient Security Mechanisms for
the Border Gateway Routing Protocol. Computer Communications (Elsevier),
21(3):203–210, 1998.
[122] Bradley R. Smith, Shree Murthy, and J. J. Garcia-Luna-Aceves. Securing
distance-vector routing protocols. In Proceedings of Internet Society Sympo-
sium on Network and Distributed System Security, pages 85–92, February 1997.
[123] Herbert Gintis, Eric Alden Smith, and Samuel Bowles. Costly signaling and
cooperation. Journal of Theoretical Biology, (213):103–119, 2001.
[124] K. Sripanidkulchai. The popularity of gnutella queries and its implications on
scalability. Featured on O’Reilly’s www.openp2p.com website, February 2001.
[125] Frank Stajano and Ross Anderson. The resurrecting duckling: Security issues
for ad-hoc wireless networks. In Proceedings of the 7th International Workshop
on Security Protocols, pages 172–194, 1999.
[126] Douglas R. Stinson. Cryptography: Theory and Practice. CRC Press, 1995.
[127] Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans
Kaashoek, Frank Dabek, and Hari Balakrishnan. Chord: a scalable peer-to-peer
lookup protocol for internet applications. IEEE/ACM Trans. Netw., 11(1):17–
32, 2003.
[128] Paul Syverson, David Goldschlag, and Michael Reed. Anonymous Connections
and Onion Routing. In Proceedings of the IEEE Symposium on Security and
Privacy, 1997.
[129] C. K. Toh. Associativity-based routing for ad-hoc mobile networks. Wireless
Personal Communications Journal, Special Issue on Mobile Networking and
Computing Systems, 4(2):103–139, 1997.
[130] United States Department of Commerce. Quarterly Retail E-Commerce Sales
3rd Quarter 2004. United States Department of Commerce News, November 19,
2004.
[131] William Vickrey. Counter speculation, auctions, and competitive sealed tenders.
Journal of Finance, (16):8–37, 1961.
[132] Eric W. Weisstein. Harmonic number. From MathWorld–A Wolfram Web
Resource, 2004. http://mathworld.wolfram.com/HarmonicNumber.html.
[133] Eric W. Weisstein. Polygamma function. From MathWorld–A Wolfram Web
Resource, 2004. http://mathworld.wolfram.com/PolygammaFunction.html.
[134] Jay Wrolstad. Online Holiday Shopping Up 25 Percent. NewsFactor Network,
January 4, 2005.
[135] Yahoo! Finance. Quotes and info: eBay Inc.
http://finance.yahoo.com/q/ks?s=EBAY, January 31, 2005. Data provided by
Reuters.
[136] Beverly Yang. Personal communication, 2002.
[137] Beverly Yang, Tyson Condie, Sepandar Kamvar, and Hector Garcia-Molina.
Addressing the Non-Cooperation Problem in Competitive P2P Systems. In
Workshop on Economics of Peer-to-Peer Systems.
[138] Beverly Yang and Hector Garcia-Molina. Comparing hybrid peer-to-peer sys-
tems (extended). Technical report, 2000.
[139] Beverly Yang and Hector Garcia-Molina. Comparing hybrid peer-to-peer sys-
tems. In The VLDB Journal, pages 561–570, September 2001.
[140] Beverly Yang and Hector Garcia-Molina. PPay: Micropayments for Peer-to-
Peer Systems. In Proceedings of the 10th ACM Conference on Computer and
Communications Security (CCS), Washington D.C., 2003.
[141] B. Yu and M. P. Singh. A social mechanism of reputation management in
electronic communities. Cooperative Information Agents, pages 154–165, 2000.
[142] Amotz Zahavi. Mate selection: a selection for handicap. Journal of Theoretical
Biology, (53):205–214, 1975.
[143] Amotz Zahavi. The cost of honesty (further remarks on the handicap principle).
Journal of Theoretical Biology, (67):603–605, 1977.
[144] Lidong Zhou and Zygmunt J. Haas. Securing ad hoc networks. IEEE Network,
13(6):24–30, 1999.