TRUST AND REPUTATION IN PEER-TO-PEER NETWORKS
A DISSERTATION
SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
Sergio Marti
May 2005
© Copyright by Sergio Marti 2005
All Rights Reserved
I certify that I have read this dissertation and that, in my opinion, it
is fully adequate in scope and quality as a dissertation for the degree
of Doctor of Philosophy.
Hector Garcia-Molina (Principal Adviser)
I certify that I have read this dissertation and that, in my opinion, it
is fully adequate in scope and quality as a dissertation for the degree
of Doctor of Philosophy.
Mary Baker
I certify that I have read this dissertation and that, in my opinion, it
is fully adequate in scope and quality as a dissertation for the degree
of Doctor of Philosophy.
Rajeev Motwani
Approved for the University Committee on Graduate Studies.
Abstract
The increasing availability of high bandwidth Internet connections and low-cost, com-
modity computers in people’s homes has stimulated the use of resource sharing peer-
to-peer networks. These systems employ scalable mechanisms that allow anyone to
offer content and services to other system users. However, the open accessibility of
these systems makes them vulnerable to malicious users who wish to poison the system
with corrupted data, harmful services and worms. Because of this danger, users
must be wary of the quality and validity of the resources they access.
To mitigate the adverse behavior of unreliable or malicious peers in a network,
researchers have suggested using reputation systems. Yet our understanding of how
to incorporate an effective reputation system into an autonomous network is limited.
This thesis categorizes and evaluates the components and mechanisms necessary to
build robust, effective reputation systems for use in decentralized autonomous net-
works. Borrowing techniques from game theory and economic analysis, we begin
with high-level models in order to understand general trends and properties of repu-
tation systems and their effect on a user’s behavior and experience. We then closely
examine the effects of limited reputation sharing through simulations based on large-
scale measurements from actual, operating P2P networks. Finally, we propose new
mechanisms for improving message routing throughput in decentralized networks of
untrusted peers: one geared towards structured DHTs (SPROUT) and two other
complementary mechanisms for mobile ad hoc networks (Watchdog and Pathrater).
Acknowledgements
I would like to thank my advisor Hector Garcia-Molina for his unending patience
and guidance. I appreciate his great passion for research that is only matched by
his strong commitment to his students. I am always amazed that, regardless of his
many duties and projects, Hector would make himself available to provide feedback and
insight on my work. Not only is Hector a wonderful advisor, but he is also a caring friend.
I am also deeply grateful for the opportunity to have had Mary Baker as my
advisor when I first came to Stanford. Her professionalism and enthusiasm for research
inspired me to pursue my Ph.D. Mary’s devotion to excellence is exemplified in the
work of her students.
My experience at Stanford has been joyful and enlightening, and I am grateful
to the members of both the Mosquitonet and Database groups for their insights,
constructive criticism and friendship. I would especially like to thank my co-authors
TJ Giuli, Kevin Lai and Prasanna Ganesan. I also thank Rajeev Motwani for
agreeing to serve on my reading committee.
Finally, I must thank my friends and family for their encouragement and support.
In particular, I am grateful to my parents for their love and for instilling in me a deep
sense of academic pride. And most of all, to my wife Wendy, whose patience and love
have kept me going, even when I doubted myself. From proofreading my papers to
preparing tasty treats, Wendy is always there for me.
Contents
Abstract v
Acknowledgements vi
1 Introduction 1
1.1 Research Contributions and Thesis Outline . . . . . . . . . . . . . . . 5
2 Taxonomy of Trust 8
2.0.1 Taxonomy Overview . . . . . . . . . . . . . . . . . . . . . . . 9
2.1 Terms and Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Assumptions and Constraints . . . . . . . . . . . . . . . . . . . . . . 12
2.2.1 User Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.2 Threat Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.3 Environmental Limitations . . . . . . . . . . . . . . . . . . . . 16
2.3 Gathering Information . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3.1 System Identities . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3.2 Information Sharing . . . . . . . . . . . . . . . . . . . . . . . 19
2.3.3 Dealing with Strangers . . . . . . . . . . . . . . . . . . . . . . 23
2.4 Reputation Scoring and Ranking . . . . . . . . . . . . . . . . . . . . 24
2.4.1 Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.4.2 Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.4.3 Peer Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.5 Taking Action . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.5.1 Incentives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.5.2 Punishment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6 Miscellaneous . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6.1 Resource Reputation . . . . . . . . . . . . . . . . . . . . . . . 30
2.6.2 Social Networks . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3 Agent Strategies Under Reputation 32
3.1 Definitions and Dimensions . . . . . . . . . . . . . . . . . . . . . . . 33
3.1.1 Game Setup and Rules . . . . . . . . . . . . . . . . . . . . . . 33
3.1.2 Knowledge-space . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.1.3 Player-space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.1.4 Price-space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.1.5 eBay Scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.2 Strategy Independent Analysis . . . . . . . . . . . . . . . . . . . . . . 38
3.2.1 Single Transaction Payoff . . . . . . . . . . . . . . . . . . . . . 38
3.2.2 Social Optimum . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.3 Selfish Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3.1 Zero Knowledge . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3.2 Perfect Knowledge . . . . . . . . . . . . . . . . . . . . . . . . 41
3.4 Perfect History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.4.1 Basic Reputation-based Strategies . . . . . . . . . . . . . . . . 45
3.4.2 Independent Decisions for MB-1S/VP . . . . . . . . . . . . . . 47
3.4.3 Independent Decisions for 1B-MS/FP . . . . . . . . . . . . . . 61
3.5 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.6 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.6.1 Variably-valuated goods . . . . . . . . . . . . . . . . . . . . . 64
3.6.2 Malicious Sellers . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.6.3 Costly Signaling . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4 Modeling Reputation and Incentives 69
4.1 Assumptions and Definitions . . . . . . . . . . . . . . . . . . . . . . . 71
4.1.1 Utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.1.2 Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.2 Formal Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.2.1 Incentive Schemes . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.2.2 Currency Scenarios . . . . . . . . . . . . . . . . . . . . . . . . 77
4.2.3 Trust . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.3.1 Trust over Time . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.3.2 Utility over Time . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.4 Simulation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.5 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.5.1 Base Population . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.5.2 NR and MTPP . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.5.3 Trust vs Capacity . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.5.4 Single-Peer Experiments . . . . . . . . . . . . . . . . . . . . . 110
4.6 Variations on the Model . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.6.1 Profit Trust Factor . . . . . . . . . . . . . . . . . . . . . . . . 115
4.6.2 Additional Trust Models . . . . . . . . . . . . . . . . . . . . . 117
4.6.3 Tying Service to Reputation . . . . . . . . . . . . . . . . . . . 121
4.7 Generalized Model of Trust and Profit . . . . . . . . . . . . . . . . . 127
4.8 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
4.8.1 Credits and Economic Stimulation . . . . . . . . . . . . . . . 134
4.9 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
4.10 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5 P2P Reputation System Metrics 138
5.1 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.1.1 Authenticity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
5.2 Threat Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.2.1 Document-based Threat Model . . . . . . . . . . . . . . . . . 143
5.2.2 Node-based Threat Model . . . . . . . . . . . . . . . . . . . . 143
5.3 Reputation Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
5.3.1 Identity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
5.4 Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.4.1 Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.4.2 Effectiveness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.4.3 Load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
5.4.4 Message Traffic . . . . . . . . . . . . . . . . . . . . . . . . . . 155
5.4.5 Threat-Reputation Distance . . . . . . . . . . . . . . . . . . . 156
5.5 Simulation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
5.6 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
5.6.1 Local Reputation System . . . . . . . . . . . . . . . . . . . . . 160
5.6.2 Voting-System . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
5.6.3 Node-based Threat Model . . . . . . . . . . . . . . . . . . . . 181
5.7 Statistical Analysis of Reputation Systems . . . . . . . . . . . . . . . 190
5.8 Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
5.9 Empirical Estimations . . . . . . . . . . . . . . . . . . . . . . . . . . 193
5.10 Long-Term Reputation System Performance . . . . . . . . . . . . . . 194
5.10.1 Random base case . . . . . . . . . . . . . . . . . . . . . . . . 195
5.10.2 Select-Best/Weighted ideal case with threshold . . . . . . . . . 196
5.10.3 Weighted ideal case without threshold . . . . . . . . . . . . . 196
5.10.4 Select-Best ideal case without threshold . . . . . . . . . . . . 197
5.10.5 Select-Best/Weighted local reputation system with threshold . 198
5.10.6 Weighted local system without threshold . . . . . . . . . . . . 199
5.10.7 Select-Best local system . . . . . . . . . . . . . . . . . . . . . 199
5.11 Comparison of Statistical Analysis to Simulation Results . . . . . . . 200
5.12 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
5.13 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
6 SPROUT: P2P Routing with Social Networks 205
6.1 Trust Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
6.1.1 Trust Function . . . . . . . . . . . . . . . . . . . . . . . . . . 208
6.1.2 Path Rating . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
6.2 Social Path Routing Algorithm . . . . . . . . . . . . . . . . . . . . . 211
6.2.1 Optimizations . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
6.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
6.3.1 Simulation Details . . . . . . . . . . . . . . . . . . . . . . . . 214
6.3.2 Algorithm Evaluation . . . . . . . . . . . . . . . . . . . . . . . 215
6.3.3 Calculating Trust . . . . . . . . . . . . . . . . . . . . . . . . . 218
6.3.4 Number of Friends . . . . . . . . . . . . . . . . . . . . . . . . 220
6.3.5 Comparison to Gnutella-like Networks . . . . . . . . . . . . . 223
6.3.6 Latency Comparisons . . . . . . . . . . . . . . . . . . . . . . . 225
6.3.7 Message Load . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
6.4 Related and Future Work . . . . . . . . . . . . . . . . . . . . . . . . 228
6.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
7 Mitigating MANET Misbehavior 231
7.1 Assumptions and Background . . . . . . . . . . . . . . . . . . . . . . 235
7.1.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
7.1.2 Physical Layer Characteristics . . . . . . . . . . . . . . . . . . 235
7.1.3 Dynamic Source Routing (DSR) . . . . . . . . . . . . . . . . . 236
7.2 Watchdog and Pathrater . . . . . . . . . . . . . . . . . . . . . . . . . 237
7.2.1 Watchdog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
7.2.2 Pathrater . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
7.3 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
7.3.1 Movement and Communication Patterns . . . . . . . . . . . . 243
7.3.2 Misbehaving Nodes . . . . . . . . . . . . . . . . . . . . . . . . 244
7.3.3 Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
7.4 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
7.4.1 Network Throughput . . . . . . . . . . . . . . . . . . . . . . 246
7.4.2 Routing Overhead . . . . . . . . . . . . . . . . . . . . . . . . 248
7.4.3 Effects of False Detection . . . . . . . . . . . . . . . . . . . . 250
7.5 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
7.6 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
7.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
8 Conclusion and Future Work 257
A Proof Of Long-Term Reputation Damage 262
A.1 Error Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
A.2 Improved Approximation . . . . . . . . . . . . . . . . . . . . . . . . . 266
B Unique Maximum of Segregated Schedule 269
C Optimal Schedule 272
D Math. Deriv. of Econ. Model 274
D.1 Utility Over Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
D.2 Generalized Trust Over Time (σ(T, p∗) = 1) . . . . . . . . . . . . . . 276
Bibliography 277
List of Tables
2.1 Breakdown of Reputation System Components . . . . . . . . . . . . . 9
3.1 Parameter descriptions with sample values . . . . . . . . . . . . . . . 34
3.2 General payoff matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3 Payoff matrix for fixed $2 priced goods with valuation $3 and cost $1 39
3.4 Payoff Matrix for variable priced p goods for default v = $3 and c = $1. 43
3.5 Payoff Matrix for fixed $2 priced goods with valuation $3, cost $1, and
maliciousness factor $1 . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.1 Trust and Profit Parameters and Default Values . . . . . . . . . . . . 88
4.2 Simulation Parameters and Default Values . . . . . . . . . . . . . . . 97
4.3 Definition of Generalized Model Terms . . . . . . . . . . . . . . . . . 130
5.1 Simulation statistics and metrics . . . . . . . . . . . . . . . . . . . . . 152
5.2 Configuration parameters, and default values . . . . . . . . . . . . . . 157
5.3 Distributions and their parameters with default values . . . . . . . . 158
6.1 SPROUT vs. Chord . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
6.2 Evaluating lookahead and MHD . . . . . . . . . . . . . . . . . . . . . 216
7.1 Maximum and minimum network throughput obtained by any simula-
tion at 40% misbehaving nodes with all features enabled. . . . . . . . 247
7.2 Maximum and minimum overhead obtained by any simulation at 40%
misbehaving nodes with all features enabled. . . . . . . . . . . . . . . 249
7.3 Comparison of the number of false positives between the 0 second and
60 second pause time simulations. Average taken from the simulations
with all features enabled. . . . . . . . . . . . . . . . . . . . . . . . . . 251
List of Figures
2.1 Representation of primary identity scheme properties. . . . . . . . . . 19
3.1 Number of transactions until gain from single defection equals loss from
lowered reputation k. . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.2 Optimal number of cooperation/defections as a function of total sales. 58
3.3 Relative utility error between optimal schedule and ±1 C/D. . . . . . 59
3.4 Relative utility error between optimal schedule using weak approxima-
tion and ±1 C/D. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.1 Relationship between a peer’s profit rate and the number of peers in
the network. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.2 Representation of a reputation system’s role in a trading network.
Transaction observations update peer reputations maintained in the
trust vector. Reputation information is then used by peers in transac-
tions to improve expected utility. . . . . . . . . . . . . . . . . . . . . 81
4.3 A peer’s trust rating over time. . . . . . . . . . . . . . . . . . . . . . 88
4.4 Convergence of T as t→∞. Note the logscale x-axis. CB = 0 in both. 90
4.5 A peer’s utility over time. Initial trust T(0) = 0.01. Higher is better. 92
4.6 A peer’s utility over time. Initial trust T(0) = 0.0035. . . . . . . . . . 93
4.7 Minimum capacity needed for a good peer to (eventually) generate
positive profit (using default πgt, kv, and kc) is approximately 0.035
(for default parameters). . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.8 Capacity distribution for base population. . . . . . . . . . . . . . . . 99
4.9 Trust and utility values for default population after 200 turns. . . . . 100
4.10 Distribution of credits in base population at turn 200. . . . . . . . . . 100
4.11 Trust and utility for base population after 1000 turns. . . . . . . . . . 102
4.12 Trust and utility for NR=400 after 1000 turns. . . . . . . . . . . . . . 104
4.13 Trust and utility for NR=1 after 1000 turns. . . . . . . . . . . . . . . 105
4.14 Trust and utility for MTPP=2 after 1000 turns. . . . . . . . . . . . . 107
4.15 Utility for MTPP=3 after 1000 turns. . . . . . . . . . . . . . . . . . . 108
4.16 Comparing the analytical and simulation results for the convergence
of T as t→∞ as a function of C = CG. Note the logscale x-axis. . . 110
4.17 Comparing the analytical and simulation results of trust over time for
new good peers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.18 Comparing the analytical and simulation results of trust over time.
MTPP=1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
4.19 Comparing utility over time for new good peer. MTPP=2. . . . . . . 113
4.20 Comparing utility over time for new bad peer. MTPP=1. . . . . . . . 114
4.21 Effects of varying trust factor σ. . . . . . . . . . . . . . . . . . . . . . 116
4.22 Comparison of ratio trust model to differential trust model. T(0) = 0.01 . . 119
4.23 πgt w.r.t T for various functions of T . . . . . . . . . . . . . . . . . . . 123
4.24 Effects of sample πgt w.r.t varying functions of T . . . . . . . . . . . . 124
4.25 Steady-state trust as a function of CB. C = 1 . . . . . . . . . . . . . 125
4.26 Steady-state profit as a function of CB. C = 1 . . . . . . . . . . . . . 125
4.27 Effects of varying σ(T, p). . . . . . . . . . . . . . . . . . . . . . . . . 132
5.1 Sample document and matching query . . . . . . . . . . . . . . . . . 141
5.2 Efficiency for varying ρ0. Lower value is better. 1 is optimal. . . . . . 161
5.3 Varying selection threshold values. . . . . . . . . . . . . . . . . . . . 162
5.4 Efficiency comparison. . . . . . . . . . . . . . . . . . . . . . . . . . . 164
5.5 Relative message traffic of Friends-First and maximum Friend-Cache
utilization w.r.t. cache size. . . . . . . . . . . . . . . . . . . . . . . . 167
5.6 Efficiency of voting reputation system w.r.t. varying quorum weight. . 170
5.7 Efficiency of the voting reputation system w.r.t. Friend-Cache size. . 172
5.8 Effects of front nodes on efficiency. . . . . . . . . . . . . . . . . . . . 174
5.9 Efficiency of two reputation systems with the random algorithm as a
function of πB. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
5.10 Average load on well-behaved nodes as a function of pB. . . . . . . . 177
5.11 Distribution of load on good nodes (and their corresponding number
of files shared). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
5.12 Efficiency comparison of local and ideal reputation systems under the
node-based threat model. . . . . . . . . . . . . . . . . . . . . . . . . . 182
5.13 Efficiency comparison of reputation systems with uniformly distributed
node threat ratings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
5.14 Comparison of the local reputation system with ρT of 0.0 and 0.15 and
the base case over time. . . . . . . . . . . . . . . . . . . . . . . . . . 185
5.15 Comparison of the local reputation system with both Weighted and
Select-Best variants and a selection threshold of 0.0 and 0.15 and the
base case over time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
5.16 Comparison of the efficiency of the reputation systems over time. . . 189
5.17 Expected steady-state system behavior . . . . . . . . . . . . . . . . . 201
6.1 Performance of SPROUT and AC in different size Small World networks. . . 217
6.2 Performance of SPROUT and AC for different trust functions and vary-
ing f . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
6.3 Performance of SPROUT and AC for varying r. . . . . . . . . . . . . 220
6.4 Performance as a function of a node’s degree. Club Nexus data. . . . 221
6.5 Performance of SPROUT and AC for different uniform networks with
varying degrees. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
6.6 Performance of SPROUT and AC versus unstructured flooding. . . . 224
6.7 Latency measurements for SPROUT vs AC w.r.t. network size. Lower
is better. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
6.8 Distribution of load (in fraction of routes) for augmented Chord and
SPROUT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
7.1 Example of a route request. . . . . . . . . . . . . . . . . . . . . . 236
7.2 Watchdog in action . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
7.3 Node A does not hear B forward packet 1 to C, because B’s transmis-
sion collides at A with packet 2 from the source S. . . . . . . . . . . . 238
7.4 Node A believes that B has forwarded packet 1 on to C, though C
never received the packet due to a collision with packet 2. . . . . . . 239
7.5 Overall network throughput as a function of the fraction of misbehav-
ing nodes in the network. . . . . . . . . . . . . . . . . . . . . . . . . . 246
7.6 This figure shows routing overhead as a ratio of routing packet trans-
missions to data packet transmissions. This ratio is plotted against the
fraction of misbehaving nodes. . . . . . . . . . . . . . . . . . . . . . . 248
7.7 Comparison of network throughput between the regular Watchdog and
a Watchdog that reports no false positives. . . . . . . . . . . . . . . . 250
Chapter 1
Introduction
Previously, the ability to both send and receive large amounts of digital content and
data was limited to large institutions with the funds and resources to install and
manage high-speed networks and fast server machines. However, the increasing avail-
ability of high bandwidth Internet connections and low-cost, commodity computers
in people’s homes allows regular home users to quickly communicate and share data
with each other. This spread of computing resources has stimulated the use of re-
source sharing peer-to-peer (P2P) networks. These systems employ a simple scalable
mechanism that allows anyone to offer content and services to other users, as well as
search for and request resources from the network.
What distinguishes P2P systems from other distributed systems is their focus on
full user autonomy. Typically, distributed systems consist of computers managed
by a single organization or hierarchy. Devising an efficient architecture that spans
many networked machines is much simpler when all machines can be monitored and
controlled by a single operator.
However, in pure P2P architectures there are no centralized services or control
mechanisms dictating the actions of other nodes. Each user decides what computing
resources he will contribute, as well as when and for how long. The architecture is
designed to handle large numbers of nodes joining and abruptly leaving the network.
In addition, these systems emphasize equality and balancing the load across nodes.
This flexibility, self-determination and low participation cost encourage a much larger
number of participants, which, in turn, greatly increases the number and value of the
services provided by the system to all.
The most important contribution of peer-to-peer system research is providing an
architecture that allows a group of users spread throughout the Internet to cheaply
and efficiently connect their commodity computing resources into one massive system,
useable by all. The implications for rapid prototyping and deployment of new ser-
vices by small teams of developers without large amounts of capital are astounding.
Already we see P2P systems that handle a plethora of applications, ranging from grid
computing to data storage to digital preservation.
However, current media attention to peer-to-peer systems is concentrated on the
legal issues of copyright infringement that plague popular file-sharing applications.
Users have discovered P2P networks to be an efficient and cheap method of
transmitting digital content. These transmissions, however, are often made without
the consent of the content's legal owners. No legally acceptable solution to content
distribution using P2P technology is deployed today. If such a solution existed, both
content owners/creators and consumers would benefit greatly.
To understand the potential impact of P2P systems, we must step back and chroni-
cle the evolution of media distribution. Currently, the cost of setting up and managing
traditional media distribution channels is too great for individual content creators to
overcome, resulting in a few large monopolistic companies that control all develop-
ment and distribution of media, such as music, movies and books. These companies
decide what media is produced based primarily on what can be marketed for max-
imum profit, not artistic merit. This filtering severely limits the public’s access to
new and diverse content and ideas.
The evolution of the World Wide Web has greatly helped independent artists
and authors to reach a larger segment of the population. Artists can now distribute
or sell their work in digital form from their websites, circumventing the packaging,
transportation, and retail costs of CDs, DVDs and books. The Web has also enabled
the sale of all kinds of material goods by ordinary people on a global scale. The best
example of this is the auction site eBay [42], which allows any individual to advertise
and auction items to people all over the world. Not only has the Web created new
distribution channels for digital content, but it provides a cheap solution for global
advertising of physical items.
Although the Web has lowered the cost of distribution and marketing, it does
impose costs that are still too great for many users. Websites that distribute songs
or movies will require large amounts of bandwidth to serve all their customers, and
bandwidth costs money. Running a commercial website with the necessary computing
resources to handle sales and distribution for a vast number of customers is still
beyond the capacity of most individuals. This need for technical capital has resulted
in the emergence of large companies that specialize in digital content distribution.
These new electronic distribution middlemen, such as eBay and Apple’s iTunes [8],
are once again in a position of power over the content creators. They decide what
is sold and what they charge for access to their service. Many eBay merchants are
unhappy with the fees they must pay eBay to use its services. Every increase in
fees results in sellers leaving eBay as they lose the already slim profit margins they
maintained [86]. A new distribution revolution is needed.
This revolution is coming in the form of P2P networks. When content can be
transferred between customers without involving a single centralized server, the com-
putational and bandwidth burdens on the content creator or owner are removed. The
cost of distribution would be much lower for the content owner and the distribution
channels could no longer be monopolized by a small group of middlemen. The result
4 CHAPTER 1. INTRODUCTION
would mean lower prices for consumers and increased profits for the producers. Mer-
chants who have left eBay (or never used it) due to the increasing fees may welcome
a pure P2P-commerce solution where no fees are collected and all sellers participate
equally.
Unfortunately, both producers and consumers are reluctant to use P2P
networks for distribution. P2P technology is not sufficiently mature to support a secure
and safe method for purchasing content through these systems. The primary hurdles
are: providing an efficient, secure mechanism for purchasing content, a universally ac-
cepted method for verifying content authenticity and ownership, and ways to prevent
or mitigate attacks on the system by malicious users. These attacks include:
• defrauding customers and stealing their money,
• intentionally modifying content to damage the owner and/or creator of the con-
tent, and
• using content distribution to infect computers with worms or viruses.
Because of the lack of a secure payment system that prevents or punishes malicious
attackers, P2P technology is not yet a viable distribution medium.
These worries have appeared before whenever a new distribution channel emerged,
most recently with e-commerce over the World Wide Web. Each time, methods
and practices were developed to combat malicious activity and instill confidence in
consumers and sellers alike. These mechanisms have proven successful. In 2004,
Americans spent approximately $115 billion on online purchases, up over 25% from
the previous year [66, 130, 134]. eBay alone posted 2004 revenues of $3.3 billion [135].
The success of eBay is of special relevance because eBay is a hybrid peer-to-peer
system. Although certain functions such as indexing and auction management are
operated by a centralized server, the distribution of goods and payment are handled
directly between the buyers and sellers.
Now researchers are working fervently to develop the secure payment, digital rights
management, auditing and enforcement mechanisms peer-to-peer systems need in or-
der to allow users to confidently purchase and distribute all kinds of content. A
major component in detecting and mitigating malicious attacks will be the reputa-
tion system. Online trading and auction systems, such as eBay, employ reputation
systems as a means of distinguishing well-behaved productive users from the selfish
or malicious peers. Reputation systems provide users with a summarized (perhaps
imperfect) history of another peer's transactions. Peers use this information to
decide how much to trust an unknown peer before interacting with it themselves.
Scholars and researchers have adopted reputation systems as a useful mechanism
for detecting, containing and discouraging misbehavior in P2P networks. Unfortu-
nately, the lack of a centralized trusted entity capable of monitoring user behavior and
enforcing rules complicates the design of mechanisms for detecting and preventing
malicious behavior in autonomous environments. However, it is this challenge that
most inspires the work presented in this thesis, as well as the research field of security
for peer-to-peer systems. Secure solutions will encourage more users to engage in
larger-valued transactions through the flexible and efficient commercial medium of
P2P systems. This growth will drive the burgeoning economy of digital goods and
services. Reputation systems are necessary if P2P systems are to revolutionize
content and information distribution as much as, if not more than, the World Wide
Web did, as the cost of distribution is lowered once again.
1.1 Research Contributions and Thesis Outline
This thesis presents a top-down exploration of designing reputation systems for au-
tonomous, decentralized computer systems. After an introductory decomposition and
survey of the research field, we present high-level models of the relationship between
reputation and user behavior in typical trading systems. We then focus on P2P
networks, using detailed simulations to investigate characteristics of basic system de-
sign decisions. Finally, we present two novel applications of trust and reputation
for routing security in different autonomous networks. The following thesis outline
describes the content of each chapter and touches on the major findings or research
contributions discussed therein.
Chapter 2 lays out an overview of the area of reputation system research geared
towards peer-to-peer networks. We decompose peer-to-peer reputation systems into
separate components. Each component must provide certain properties or capabilities
in order for the whole system to function. Designing mechanisms that achieve these
properties in an autonomous, transient network yields the most interesting research
problems. In addition to defining terms used throughout the thesis, this chapter
discusses in detail related work in this vast field of research. Further chapters briefly
describe related research that is more closely tied to results presented in the chapter.
The next two chapters study reputation in general systems where resources or
commodity goods are exchanged. Although the examples used for illustration focus
on online trade, the resulting conclusions are applicable to many economic systems.
Chapters 3 and 4 present theoretical models for how reputation affects user behavior
and utility, each applying a different approach at different granularity. These models
provide a framework for evaluating reputation algorithms using economic metrics,
which we then use to analyze high-level implementation issues. Based on these studies,
we propose guidelines for reputation system designers. Chapter 3 applies elementary
game theory to explore agent strategies on a microeconomic scale. Chapter 4 expands
these ideas to a macroeconomic mathematical model for expected user performance
in a large-scale online trading system. Our mathematical model is then compared to
simulation results.
In Chapter 5, we look closely at using limited reputation sharing in unstruc-
tured peer-to-peer resource-sharing networks. We propose several performance met-
rics (such as message traffic, load and efficiency) that allow us to evaluate and compare
reputation systems. Through detailed simulations of multiple variations on the basic
reputation system, we quantify the effects of certain system properties and design
choices. Our study demonstrates that even a small amount of reputation information
collecting and sharing can vastly improve a peer’s ability to locate and fetch valid
resources, even when faced with large-scale whitewashing and collusion by malicious
peers. In addition, certain methods for calculating reputation and ranking peers may
perform equally well in terms of detecting and avoiding malicious peers, but have
vastly differing effects on load balancing.
The following two chapters each present specific protocols/mechanisms that ex-
ploit reputation information in order to improve message routing performance in
two types of networks that vary both in their physical medium and their structure.
Chapter 6 proposes the SPROUT protocol for incorporating existing social network
information and services into a structured P2P network in order to improve the re-
liability of message transmission. Using our model of “social trust” we show that
SPROUT can improve expected message delivery by 50%.
Chapter 7 concentrates on the issue of trust in ad hoc wireless routing. The Watch-
dog mechanism uses the inherent broadcast nature of wireless transmission to detect
when packets are not forwarded correctly by eavesdropping on next-hop
transmissions. The reputation of nodes along a path is incremented or decremented based
on the message throughput. These reputations are used when selecting new paths as
nodes move around. Simulations show Watchdog improves routing throughput by up
to 27% under high mobility when 40% of the nodes fail to route correctly.
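As a rough sketch of the per-neighbor bookkeeping such a mechanism implies (the class name, the neutral starting rating, and the update increments below are our illustrative assumptions, not the actual parameters used in Chapter 7), each node might track its neighbors as follows:

```python
class Watchdog:
    """Illustrative sketch: a node overhears whether its next hop
    retransmits each packet and adjusts that neighbor's rating."""

    def __init__(self, delta_up=0.01, delta_down=0.05):
        self.rating = {}                    # neighbor id -> rating in [0, 1]
        self.delta_up = delta_up            # reward for an overheard forward
        self.delta_down = delta_down        # penalty for a dropped packet

    def observe(self, neighbor, forwarded):
        r = self.rating.get(neighbor, 0.5)  # strangers start neutral
        if forwarded:
            r = min(1.0, r + self.delta_up)
        else:
            r = max(0.0, r - self.delta_down)
        self.rating[neighbor] = r

    def best_next_hop(self, candidates):
        """Prefer the candidate neighbor with the highest rating."""
        return max(candidates, key=lambda n: self.rating.get(n, 0.5))
```

Penalizing drops more heavily than forwards are rewarded, as in this sketch, makes a neighbor's rating fall quickly once it starts misbehaving.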
Finally, we give our concluding comments in Chapter 8.
Chapter 2
Taxonomy of Trust: Categorizing
P2P Reputation Systems
The development of any complex computer architecture can be a challenge. This is
especially true of a complex distributed algorithm that is run by autonomous un-
trusted agents, yet is expected to be relatively reliable, efficient, and secure. Such is
the task of designing a complete reputation system for use in peer-to-peer networks.
To accomplish this task, it is necessary to break the problem down into separate,
simpler problems, each of constructing a mechanism that provides a specific set of
functions or properties, allowing developers to "divide and conquer" the problem of reputation
system design.
Our primary goal in this chapter is to provide a useful taxonomy of the field of
peer-to-peer reputation design. To accomplish this goal, we identify the three basic
components of a reputation system, break them down into the necessary separate
mechanisms, and categorize properties we feel the mechanisms need to provide in
order for the reputation system to fulfill its function. For each mechanism we list
possible design choices proposed by the research community.
In the process, we give examples of research in the area of trust and reputation. A
variety of research papers and implementations are referenced to illustrate ideas and
provide the reader avenues for further investigation. We often draw on work done by
the Peers research group [1] at Stanford University and do not pretend to produce
a complete survey of the research area. We feel this overview will be of particular
interest to those who are unfamiliar with the breadth of issues relating to reputation
system design for peer-to-peer networks.

Table 2.1: Breakdown of Reputation System Components

  Information Gathering   Scoring and Ranking     Response
  ---------------------   ---------------------   ----------
  Identity Scheme         Good vs. Bad Behavior   Incentives
  Info. Sources           Quantity vs. Quality    Punishment
  Info. Aggregation       Time-dependence
  Stranger Policy         Selection Threshold
                          Peer Selection
Taxonomies related to trust and reputation systems (either in part or as a whole)
have been proposed by others (e.g., Daswani [33] and O'Hara et al. [101]) and will be
discussed in the text when appropriate.
2.0.1 Taxonomy Overview
The following section defines terms we use throughout the thesis. We begin our tax-
onomy by classifying the common assumptions and constraints that guide reputation
system design in Section 2.2. These assumptions include expected user behavior, as
well as the goals of adversaries in the system and their capabilities. How effectively a
reputation system can deal with adversaries may be constrained by the technical
limitations imposed on the implementation by the target system environment. These
issues determine the necessary properties and powers of the reputation system.
Next, we break down the functionality of a reputation system into the three com-
ponents shown in Table 2.1. In general, a reputation system assists agents in choosing
a reliable peer (if possible) to transact with when one or more have offered the agent
a service or resource. To provide this function, a reputation system collects infor-
mation on the transactional behavior of each peer (information gathering), scores
and ranks the peers based on expected reliability (scoring and ranking), and allows
the system to take action against malicious peers while rewarding contributors (re-
sponse). Each component requires separate system mechanisms (listed in Table 2.1).
For each mechanism we study the possible desired properties and then discuss the
implementation limitations and trade-offs that may prevent some of the properties
from being met. In the discussion we will reference existing solutions or research to
illustrate how different mechanism designs achieve certain properties within the given
system constraints.
The three functionalities (gathering, scoring, and response) are covered in turn in
Sections 2.3, 2.4, and 2.5.
2.1 Terms and Definitions
Before discussing the various taxonomies we would like to define certain terms we will
be using throughout the thesis:
Peer A single active entity in any system or network of autonomous entities. In
general, a peer in a system is associated with a specific user and his/her rep-
resentation in a network. However, in some systems it is possible for a single
human user to control multiple network entities with different identities (as used
in Sybil attacks [38]). Also, a user’s computer may be compromised by a worm
or trojan horse and consequently the computer may behave differently in the
network than the user intended. The user may even be unaware the computer
is misbehaving. Therefore, we distinguish between a user and the user's
representation(s) or node(s) in the network. At times, we will use the terms node, agent,
or even user (when not considering compromised clients) synonymously with
peer. For instance, in Chapter 3 we use the term agent out of the tradition of
the field of game theory.
Transactions Peer-to-peer systems are defined by interactions between autonomous
agents or peers. These interactions may include swapping files, storing data,
answering queries, or sharing CPU cycles. In addition, money may be exchanged
when purchasing the desired resource. We refer to all interactions in general as
transactions between two parties.
Cooperate/Defect When well-behaved peers carry out transactions correctly, we
say they cooperate. Bad peers, however, may at times attempt to cheat or
defraud another peer, in which case they defect on the transaction. We will use
these terms (when applicable) when discussing general system/peer behavior.
Structured vs Unstructured P2P network architectures tend to be categorized
as either structured or unstructured, depending on how the overlay topology is
formed. Structured networks use a specific protocol to assign network IDs and
establish links to new peers and are exemplified by the class of systems called
Distributed Hash Tables (DHTs) (e.g. [127, 113, 118]). In purely unstructured
topologies, new users connect randomly to other peers. A hybrid approach is
to designate certain peers as supernodes (or ultrapeers); the supernodes form an
unstructured network among themselves, and all other peers connect to them. Such
organization is used in most
popular file-sharing systems (e.g. [56, 74]). However, for simplicity, we will
classify supernode networks as unstructured networks [139].
Strangers Peers that appear to be new to the system. They have not interacted
with other peers and therefore no trust information is available.
Adversary A general term for agents that wish to harm other peers
or the system, or act in ways contrary to “acceptable” behavior. This may
include accessing restricted information, corrupting data, maliciously attacking
other nodes in the network, or attempting to take down the system services.
2.2 Assumptions and Constraints
The driving force behind reputation system design is providing a service that severely
mitigates misbehavior while imposing a minimal cost on the well-behaved users. To
that end, it is important to understand the requirements imposed on system design by
each of the following: the behavior and expectations of typical good users, the goals
and attacks of adversaries, and the technical limitations resulting from the
environment where the system is deployed. We discuss each of these below. The
choices made will impact the necessary mechanism properties discussed in
Sections 2.3, 2.4, and 2.5.
2.2.1 User Behavior
A system designer must build a system that is accessible to its intended users, provides
the level of functionality they require and does not hinder or burden them to the
point of driving them away. Therefore, it is important to anticipate any allowable
user behavior and meet their needs, regardless of added system complexity.
Examples of user behavior and requirements that affect distributed mechanism
design include:
Node churn The rate at which peers enter and leave the network, as well as how
gracefully they disconnect, affects many areas from network routing to content
availability. Higher levels of churn require increased data replication, redun-
dant routing paths, and topology repair protocols [60]. The node lifetime in
the system determines how much information can be collected for the purpose of
computing its reputation, as well as how long that information is useful.
Reliability For most applications, users require certain guarantees on the reliabil-
ity or availability of system services. For example, a distributed data storage
application would want to guarantee that data stored by a user will always be
available to the user with high probability and that it will persist in the network
(even if temporarily offline) with a much higher probability [81]. The situation
is more difficult in peer-to-peer networks where adversaries are actively attempt-
ing to corrupt the content peers provide. Group auditing techniques may help
detect or prevent data loss [87].
Privacy Along with reliability, users that store data in an untrusted distributed sys-
tem would also want to protect the content from being accessed by unauthorized
users. One solution is to encrypt all data before storing [81]. However, in some
applications access to unencrypted data is necessary for processing. Separat-
ing sensitive data from subject identities, or using legally binding strict privacy
policies may be sufficient [115, 6, 7].
Anonymity As a specific application of privacy, users may only be willing to par-
ticipate if a certain amount of anonymity is guaranteed. This may vary from
no anonymity requirements, to hiding real-world identity behind a pseudonym,
to requiring that an agent’s actions be completely disconnected from both his
real-world identity and his other actions. Obviously, a reputation system would
be infeasible under the last requirement.
2.2.2 Threat Model
The two primary types of adversaries in peer-to-peer networks are selfish peers and
malicious peers. They are distinguished primarily by their goals in the system. Self-
ish peers wish to use system services while contributing minimal or no resources
themselves. A well-known example of selfish peers is the "freerider" [5] in file-sharing
networks such as Kazaa and Gnutella. To minimize their bandwidth and
CPU costs, freeriders refuse to share files in the network.
The goal of malicious peers, on the other hand, is to cause harm to either specific
targeted members of the network or the system as a whole. To accomplish this goal,
they are willing to spend any amount of resources (though resource-constrained
malicious peers can be considered a distinct subclass). Examples include
distributing corrupted audio files on music-sharing networks to discourage piracy [98]
or disseminating virus-infected files for notoriety [12].
Reputation system designers usually target a certain type of adversary. For in-
stance, incentive schemes that encourage cooperation may work well against selfish
peers but be ineffective against malicious peers. The number or fraction of peers that
are adversaries also impacts design. Byzantine protocols, for example, assume fewer
than a third of the peers are misbehaving [21].
The work presented in this thesis tackles both selfish and malicious peers, although
some sections may focus on a single type of adversary.
Adversarial Powers
Next, a designer must decide what techniques he expects the adversaries to employ
against the system and build in mechanisms to combat those techniques. The follow-
ing list briefly describes the more general techniques available to adversaries.
Traitors Some malicious peers may behave properly for a period of time in order to
build up a strongly positive reputation, then begin defecting. This technique
is effective when increased reputation gives a peer additional privileges, thus
allowing malicious peers to do extra damage to the system when they defect.
Examples of traitors are eBay merchants that participate in many small
transactions in order to build up a high positive reputation, and then defraud
one or more buyers on a high-priced item. Traitors may also be the computers
of well-behaved users that have been compromised through a virus or trojan
horse. These machines will act to further the goals of the malicious user that
subverted them.
Collusion In many situations multiple malicious peers acting together can cause
more damage than each acting independently. This is especially true in peer-
to-peer reputation systems, where covert affiliations are untraceable and the
opinions of unknown peers impact one's decisions. Most research devoted to
defeating collusion assumes that if a group of peers collude, they act as a single
unit, each peer being fully aware of the information and intent of every other
colluding peer [87].
Front peers Also referred to as “moles” [45], these malicious colluding peers always
cooperate with others in order to increase their reputation. They then provide
misinformation to promote actively malicious peers. This form of attack is par-
ticularly difficult to prevent in an environment where there are no pre-existing
trust relationships and peers have only the word and actions of others in guiding
their interactions [93] (see Sec. 5.6.2).
Whitewashers Peers that purposefully leave and rejoin the system with a new iden-
tity in an attempt to shed any bad reputation they have accumulated under their
previous identity [83]. Whitewashers are discussed in depth in later sections and
chapters (see Sec. 2.3.3 and Chp. 5).
Denial of Service (DoS) Whether conducted at the application layer or network
layer, Denial of Service attacks usually involve the adversary bringing to bear
large amounts of resources to completely disrupt service usage. Using Internet
worms, however, malicious users can minimize their own personal
resource usage while amplifying the damage done through Distributed DoS at-
tacks. Much work has been done on detecting, managing, and preventing DoS
attacks. P2P-specific applications include [34, 35, 55] in unstructured networks
and [21] in DHT networks. Not only would we like reputation systems to detect
DoS attackers, but such attacks could be used against the reputation mechanism
itself.
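Traitor attacks in particular motivate the time-dependence mechanism listed under scoring in Table 2.1: if older observations are decayed, a long run of good behavior cannot bankroll a burst of defections. A minimal sketch, with a hypothetical decay constant and scoring scale of our choosing:

```python
def decayed_reputation(outcomes, decay=0.9):
    """Exponentially weighted score of a chronological list of transaction
    outcomes (+1.0 for cooperate, -1.0 for defect). Recent behavior
    dominates, so a traitor's defections surface quickly."""
    score = 0.0
    for outcome in outcomes:
        score = decay * score + (1 - decay) * outcome
    return score
```

After 50 cooperations the score sits near 1.0, but only five subsequent defections drive it below 0.2, whereas a plain lifetime average of the same history would still read about 0.82.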
As we discuss different mechanisms, we will reference these tactics and explain
how certain system properties can help against them. Most of the existing research
does not claim to handle malicious peers that bring to bear all these attacks at once.
In fact, much of the work focuses solely on independent selfish peers.
While Chapter 3 deals solely with the simplest case of selfish peers, the following
chapters (and particularly Chapter 5) study in depth the issues surrounding malicious
peers that use all these adversarial techniques.
2.2.3 Environmental Limitations
The primary division among system component architectures is centralized versus
decentralized. Implementing certain functionality at a single trusted entity can sim-
plify mechanism design and provide a more efficient system. As we will see, some
component properties can only be attained using the management and auditing ca-
pabilities afforded by a single point of trust. Of course centralization also has several
drawbacks. It may be infeasible to have a single entity all agents trust. A cen-
tralized server becomes a single point of failure as well as a bottleneck. Providing
performance and robustness requires the controlling entity to unilaterally invest large
sums of money. It also makes for a single point of attack by adversaries, either by
infiltration, subversion, or DoS attacks.
Between purely centralized and purely decentralized lies a spectrum of hybrid
architectures. For simplicity, we will refer to proposed mechanisms as centralized
if they require one entity (or a small number of entities) trusted by all users to
handle some service for the entire system, even if that entity need only be available
intermittently rather than continuously. Otherwise, the mechanism is decentralized.
2.3 Gathering Information
The first component of a reputation system is responsible for collecting information
on the behavior of peers, which will be used to determine how “trustworthy” they
are (either on an absolute scale or relative to the other peers).
2.3.1 System Identities
Associating a history of behavior with a particular agent requires a sufficiently per-
sistent identifier. Therefore, our first concern is the type of identities employed by
the peers in the system. There are several properties an identity scheme may have,
not all of which can be met with a single design. In fact, some properties are in direct
conflict with each other. The properties we focus on are:
Anonymity As previously mentioned in Section 2.2.1, the level of anonymity offered
by an identity scheme can vary from using real-world identities to preventing
any correlation of actions as being from the same agent.
Most peer-to-peer networks, such as Kazaa [74], use simple, user-generated
pseudonyms. Since peers connect directly to one another, their IP addresses are
public, providing the closest association between the agent’s actions and their
real-world identity. To hide their IP addresses users can employ redirection
schemes, such as Onion routing [128]. A P2P-specific solution using anonymiz-
ing tunnels is Tarzan [47]. Frequently changing pseudonyms and routing tunnels
disassociates the user’s actions from each other.
Though full anonymity prevents building user reputation, some peer-to-peer
reputation systems, such as TrustMe [120], use pseudo-anonymity to encourage
honest information sharing without fear of retribution. Each peer is assigned
two identifiers: one for transactions and another for reporting reputation
information and scores. A centralized login server minimizes fraud and
whitewashing.
Spoof-resistant To prevent adversaries from impersonating other peers, identities
must be resistant to spoofing. One common solution is the use of public/private
key pairs. If a peer uses its public key as its identifier, other peers can verify
that any communication comes to or from that peer, assuming the use of nonces
to defeat replay attacks. However, initially transmitting one's public key may
still be susceptible to man-in-the-middle attacks. Certificates signed by an a
priori trusted certificate authority (CA) can help, but require a centralized
mechanism.
Unforgeable In addition to being spoof-resistant, unforgeable identities protect
against whitewashers and Sybil attacks [38], where a single user poses as several
distinct peers in the network. Unforgeable identities are usually generated by a
trusted system entity and given to new users as they join. These identifiers can
be proven to have been generated by this trusted entity and only that entity.
Notice a user’s public/private key pair is not sufficient. A certificate for that
public key issued by a trusted CA is. Login servers can also authenticate users
as they enter. The CA or login server may require real-world identity proof to
ensure that each user receives only one system identifier, perhaps using credit
card verification [48, 21]. These solutions are necessarily centralized. Decen-
tralized solutions usually require identifiers that are costly to produce, though
not strictly unforgeable. Costly identifiers help slow the rate of whitewashing
or generating multiple identities, but do not eliminate it [38].

Figure 2.1: Representation of primary identity scheme properties, spanning
unforgeability, anonymity, and cost.
The effectiveness of any solution at providing a given property lies on a cost
scale (e.g., cycles, bandwidth, dollars). An adversary with infinite resources can
compromise any property. For example, most resilient unforgeable or spoof-resistant
identity schemes rely on the secrecy of private keys. Given enough CPU power,
an adversary can crack the key and therefore negate its intended purpose. An informal
representation of the spectrum of identity choices is presented in Figure 2.1.
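One decentralized way to make identifiers costly can be sketched under the assumption of a hashcash-style proof of work (the function names and difficulty parameter below are ours, not from any particular system): a valid identifier must include a nonce whose hash with the public key clears a difficulty target, so minting is expensive while verification is cheap.

```python
import hashlib
import itertools

def mint_identity(pubkey, difficulty=16):
    """Search for a nonce such that SHA-256(pubkey || nonce) falls below
    a target with `difficulty` leading zero bits. Expected cost grows as
    2**difficulty hash evaluations."""
    target = 1 << (256 - difficulty)
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{pubkey}:{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

def verify_identity(pubkey, nonce, difficulty=16):
    """A single hash evaluation: cheap for honest peers to check."""
    digest = hashlib.sha256(f"{pubkey}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty))
```

Raising `difficulty` slows whitewashing and Sybil identity generation in proportion to the adversary's hash rate but, as noted above, does not eliminate either attack.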
2.3.2 Information Sharing
Using established network identities, a reputation system protocol collects
information on a given peer's behavior in previous transactions in order to determine its
reputation. Examples of useful information include reports on the success or failure
of a transaction by one or both parties, as well as the quality of the service/resource
provided. This information collection may be done individually by each peer in a
reactive method, or proactively by all peers collating together their experiences. In
this section we discuss the sources from which information is collected, what quality
of information agents can expect to collect, and how peers combine information from
different sources.
Sources of Information
In general, quantity and quality of information are diametrically opposed. As the
amount of information gathered increases, the credibility of each piece of information
usually decreases.
The most cautious individuals may only want to rely on their own personal ex-
perience and use only local information when determining whether to transact with
a given peer. Of course without additional information, the individual risks being
cheated the first time they interact with each adversary. However, local information
may be sufficient if the agent locates a few well-behaved peers able to repeatedly
provide good service [91].
To increase its information sources, a cautious agent can collect the opinions of
users with whom it has a priori trust relationships established outside the system.
These may include friends from the user's personal life, coworkers, business
relationships, or even trusted members of social networks [89, 63] (see Sec. 2.6.2 and
Chapter 6).
Even with personal experience and the opinions of friends (that are currently
online), an agent is unlikely to have any information on a particular random peer. To
gather more opinions an agent can ask peers it has met in the P2P network, such as
its neighbors in the overlay network, or peers who have already provided good service,
proving themselves reputable. The question now is how many peers to query for their
opinions (we discuss how to aggregate these opinions in Sec. 2.3.2). Asking a small
number of peers limits the communication overhead on the network [93], while asking
a larger number improves the chances of collecting useful information on a specific
peer [25].
If the number of personally-proven reputable peers is small, then an agent may
request that each of those peers collect the opinions of other peers they believe are
reputable, recursively. Each additional step exponentially increases the information
sources. Information located through a transitive trust chain may be more reliable
than asking a random peer [141, 84, 45].
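The escalation from personal experience to transitive queries can be sketched as a depth-bounded recursion (the `Agent` class and its field names are our own illustration, not an interface from the literature):

```python
class Agent:
    """Illustrative peer that records personal experience and a set of
    trusted friends, and gathers opinions transitively."""

    def __init__(self, name):
        self.name = name
        self.history = {}   # peer name -> score from personal experience
        self.trusted = {}   # trusted Agent -> rating we assign that friend

    def gather(self, target, depth):
        """Collect (score, trust_chain) opinions about `target`, asking
        trusted friends recursively up to `depth` hops away. Each hop
        extends the chain of friend ratings, which a caller can later
        use to discount the opinion."""
        opinions = []
        if target in self.history:
            opinions.append((self.history[target], []))  # first-hand: empty chain
        if depth > 0:
            for friend, rating in self.trusted.items():
                for score, chain in friend.gather(target, depth - 1):
                    opinions.append((score, [rating] + chain))
        return opinions
```

Each extra hop multiplies the number of reachable sources by the average number of trusted friends, matching the exponential growth in information sources noted above.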
Finally, there are the global history reputation systems that collect information
about all peers from all peers. These solutions are the most comprehensive as well
as the most complex to implement. While the probability that any single opinion is
fraudulent may be greater, the collective sum of all opinions is likely to be accurate,
even when a large fraction of the peers are malicious colluding adversaries.
While previous information-sharing techniques are easily decentralized, global his-
tory systems tend not to be. Perhaps the most widely used reputation system is that
of eBay [42], which consists of a single trusted entity that collects all transaction
reports and rates each user. Global history systems proposed for P2P networks tend
to be more distributed. TrustMe [120] relies on a centralized server to assign unforge-
able identities, but reputation adjustments and lookups are handled purely between
peers. EigenTrust [73] offers a fully decentralized solution using weak identities,
leaving it more vulnerable to whitewashing.
In conclusion, a peer’s reputation is based on information collected about that
peer from one or multiple sources. The primary sources are: personal experience,
external trusted sources, one-hop trusted peers, multi-hop trusted peers, and a global
system. Each successive source provides more information about peers. However,
that information also becomes less credible.
In [101], O’Hara et al. categorize “trust strategies” for the Semantic Web based
on how agents react to peers they have no personal experience with. Their five basic
strategies are: optimistically assuming all strangers are trustworthy unless proven
otherwise; pessimistically ignoring all strangers unless they are proven trustworthy;
investigating a stranger by asking trusted peers; transitively propagating the
investigation through friends of friends; or using a centralized reputation system. Notice
that their taxonomy mirrors that presented here.
Information Integrity
One major problem with reputation systems is guaranteeing the validity of opinions.
It is impossible to enforce honest, accurate reporting on transaction outcomes by all
peers. Most reputation systems do not attempt to verify the integrity of information
collected. Instead they assume the majority of users are honest and well-behaved,
and that collecting information from a large number of peers will result in a relatively
accurate assessment of a peer’s behavior.
Reputation systems that hope to combat colluding adversaries and front peers
(which promote each other while denigrating good users) use reputation to weigh the
information and opinions collected. Instead of considering the opinions of each peer,
or each reported transaction experience, equally, these systems weigh the information
based on the trustworthiness of the source when compiling a peer’s reputation rating.
For example, information provided by personal friends would likely be considered
two or three times more accurate than that of a seemingly reputable, but unknown
peer in the network. Of course, when available, personal experience would be valued
the most [93]. Often the opinions of system peers are weighted by their previously
determined reputation scores. Information collected through transitive trust may be
weighted by the reputation rating of the least reputable peer in the trust chain [45].
Or, if reputation ratings lie between 0 and 1, the opinion would be weighted by the
product of the ratings of the peers in the chain.
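These two chain-weighting rules can be sketched in a few lines (an illustrative fragment; the function names are ours, not taken from any cited system):

```python
def chain_weight_min(chain_ratings):
    """Weight transitive information by the least reputable peer in the chain."""
    return min(chain_ratings)

def chain_weight_product(chain_ratings):
    """Weight by the product of the ratings (each in [0, 1]) along the trust chain."""
    weight = 1.0
    for rating in chain_ratings:
        weight *= rating
    return weight
```

For a chain rated [0.9, 0.8, 0.5], the first rule yields 0.5 while the second yields 0.36; the product rule penalizes long chains more heavily.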
One may also want to distinguish second-hand information from third-hand in-
formation. For instance, I may trust a transaction history reported by a peer if it
is based on their personal experience, rather than a reputation based on information
they received from other peers. If peers maintain and share complete records of every
transaction, this allows each peer to individually determine how to weigh each piece
of information. This weight could be computed based on the reputation of the origi-
nal reporting peer, the degree of separation from that peer, and the reputation of the
intermediate peers. However, this introduces significant traffic due to transmitting
the detailed transaction logs. Peers may prefer to share only a single score based
either solely on their personal experience, or a cumulative rating that includes the
scores provided by others. While this method reduces the flexibility offered to each
peer when calculating a final reputation, the use of weights by all peers will still have
the effect of dampening the influence of information from distant peers.
Even global history reputation systems apply reputation-based weighting. Eigen-
Trust [73] uses a distributed algorithm similar to PageRank [103] to compute a global
reputation rating for every peer using individual transaction reports weighted by the
rating of the reporting peer. However, even this algorithm was found to be vulnerable
to widespread collusion. Therefore, the authors suggest each agent separately weigh a
globally computed rating with the personal opinions of trusted peers, when available.
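The flavor of such a computation can be sketched as a PageRank-style power iteration over row-normalized local trust values (an illustration only, not the actual EigenTrust algorithm, which adds pre-trusted peers and other safeguards):

```python
def global_trust(local_trust, iterations=50):
    """Power iteration over row-normalized local trust scores.

    local_trust[i][j] >= 0 is peer i's opinion of peer j. Returns a global
    rating vector summing to 1. A sketch of a PageRank-style computation,
    not the exact EigenTrust algorithm.
    """
    n = len(local_trust)
    # Row-normalize so each peer's outgoing opinions sum to 1.
    C = []
    for row in local_trust:
        s = sum(row)
        C.append([x / s for x in row] if s > 0 else [1.0 / n] * n)
    t = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        t = [sum(t[i] * C[i][j] for i in range(n)) for j in range(n)]
    return t
```

Peers that are trusted by other well-rated peers accumulate higher global scores, which is what allows individual transaction reports to be weighted by the rating of the reporter.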
Some systems attempt to improve the accuracy of the transaction reports by
requiring proof of interaction. TrustMe [120], for example, requires that both parties
in a transaction sign a transaction certificate that is then presented when reporting
on the outcome of the transaction. While this may not prevent malicious peers from
lying about the outcome of a transaction, it does prevent adversaries from submitting
fraudulent reports about peers they have not interacted with in order to besmirch
their reputation.
2.3.3 Dealing with Strangers
With new users joining the system periodically, agents will often encounter peers with
no transaction history available at any source. As the number of sources an agent
gathers information from increases, the frequency of encountering a local stranger
(a peer whom the agent has no direct or indirect experience with or knowledge of)
decreases. In the global history systems all local strangers are also global strangers
(peers whom no agent in the system has interacted with).
When no reputation information can be located, an agent must decide whether to
transact with a stranger based on its stranger policy. As mentioned previously, two
simple strategies are to optimistically trust all strangers, or pessimistically refuse to
interact with them. Both have their drawbacks. Optimistic agents may frequently
be defrauded, especially in systems with high levels of whitewashing. However, in
pessimistic systems, new users will be unable to participate in transactions and will
never build a reputation.
Feldman et al. have done extensive work in analyzing the problem of stranger
policies and whitewashing in P2P networks [83, 45, 46]. They suggest a “stranger
adaptive” strategy. All transaction information on first-time interactions with any
stranger is aggregated together. Using a “generosity” metric based on recent stranger
transactions, an agent estimates the probability of being cheated by the next
stranger and decides whether to trust the next stranger using that probability. This
probabilistic strategy adapts well to the current rate of whitewashing in the system.
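A minimal sketch of such a policy follows (our own simplification of the stranger-adaptive idea; the function name and the optimistic handling of the very first stranger are assumptions, not details from [45]):

```python
import random

def trust_stranger(recent_stranger_outcomes, rng=random):
    """Stranger-adaptive policy in the spirit of Feldman et al. [45] (a sketch,
    not their exact formulation).

    recent_stranger_outcomes: True/False outcomes of recent first-time
    interactions with strangers (True = the stranger cooperated).
    """
    if not recent_stranger_outcomes:
        return True  # no data yet: optimistically give the first stranger a chance
    # "Generosity": fraction of recent first-time interactions that went well.
    generosity = sum(recent_stranger_outcomes) / len(recent_stranger_outcomes)
    # Trust the next stranger with probability equal to that fraction.
    return rng.random() < generosity
```

As whitewashing increases, recent stranger interactions sour, the generosity estimate drops, and the agent automatically becomes less willing to trust the next stranger.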
2.4 Reputation Scoring and Ranking
Once a peer’s transaction history has been collected and properly weighted, a rep-
utation score is computed for that peer, either by an interested agent, a centralized
entity, or by all peers collectively, as in EigenTrust [73]. We will refer to the method
by which the score is computed as a general reputation score function.
The primary purpose of the reputation score is to help an agent decide which
available service provider in the network it should transact with. The two typical
scenarios are:
i) Agent A is offered a resource or service by peer P . A decides if transacting with
P is worth the expected risk of defection, based on P ’s reputation score.
ii) In response to A’s request for a certain resource or service, multiple service
providers (P1, P2,...) respond. A uses the reputation scores of each responder
to rank them in order of how likely they are to provide proper service. A then
chooses the highest ranked provider. Should that transaction fail, A may try
again with the next highest ranked peer.
In the next two sections, we consider the inputs and outputs of the reputation
score function. Which statistics gathered from a peer's transactional history are
most useful in computing its trustworthiness? How should reputation scores be
represented?
2.4.1 Inputs
Regardless of how a peer’s final reputation rating is calculated, it may be based
on various statistics collected from its history. But what statistics should be used
in computing the ranking score? Ideally, both the amount a peer cooperates and
the amount it defects would be taken into account. However, the amount a peer
defects is often unknown. While a malicious peer may openly defect on an agreed
transaction by providing bad service or no service, selfish peers usually defect
“silently”. For example, in file-sharing networks, freeriders refuse to share their
files and ignore queries they
could answer. Other peers cannot determine how often a peer selfishly ignores a
request. However, as suggested in [45], peers can calculate the rate at which an agent
contributes to the network. The contribution rate is a reputation rating based solely
on good work.
When defection information is available, it is usually more useful than cooperation
information. Notice that visible defections usually constitute malicious behavior,
which is more harmful than selfish behavior. While both good and bad behavior
can be taken into account, the negative impact of bad behavior on reputation should
outweigh the positive impact of good behavior.
When only information on positive contributions is available, the reputation will
have to be based solely on the amount of good work done. However, if a history
of a peer's cooperations and defections is available, should the peer's reputation be
based on the quality of the work it has done? Or should the quantity also matter? Our
work shows that while quality alone is useful (see Chapter 5), a score that properly
combines quality and quantity is much more effective and flexible under a variety of
adversarial techniques (see Sec. 4.6.2). Included in quantity should be the value of
each transaction. Intuitively, a peer that defects on one $100 transaction should have
a lower reputation than one who defects on two or three $1 transactions.
If a system wishes to defend against traitors, then reputation scores must consider
time. More recent transaction behavior should have a greater impact on a peer’s score
than older transactions. For example, a weighted transaction history could be used.
This would allow system agents to detect peers who suddenly “go bad” and defend
against them.
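One way to combine transaction value, the cooperate/defect asymmetry, and recency is sketched below; the decay and penalty parameters are illustrative choices of ours, not values taken from any particular system:

```python
def reputation_score(transactions, decay=0.9, penalty=2.0):
    """Value- and recency-weighted reputation score (an illustrative sketch).

    transactions: list of (value, cooperated) pairs ordered oldest to newest.
    Defections are weighted `penalty` times more heavily than cooperations,
    and each step back in time is discounted by `decay`, so a peer who
    suddenly "goes bad" is penalized quickly.
    """
    score = 0.0
    for age, (value, cooperated) in enumerate(reversed(transactions)):
        weight = decay ** age  # the most recent transaction has age 0
        score += weight * (value if cooperated else -penalty * value)
    return score
```

Under this scoring, defecting on one $100 transaction yields a far lower score than defecting on a few $1 transactions, and a recent defection hurts more than an old one.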
2.4.2 Outputs
In the end, the computed reputation rating may be a binary value (trusted or un-
trusted), a scaled integer (e.g. 1 to 10), or on a continuous scale (e.g. [0,1]). The
choice will be application dependent, although a binary value would likely be insuffi-
cient in a P2P environment where all peers are untrusted and we want to rank peers
based on how reliable they are likely to be.
Both scenarios detailed above imply a single scalar value is obtained for each
candidate and is compared either against other candidates’ ratings or against a trust
threshold determined by the transaction. However, it is useful to maintain a peer’s
reputation as multiple component scores. Applying different functions to the scores
allows a peer to calculate a rating best suited for the given situation. Many proposed
systems suggest maintaining multiple statistics about each peer. For example, keeping
separate ratings on a peer's likelihood to defect on a transaction and its likelihood to
recommend malicious peers helps mitigate the effects of front peers. The TRELLIS
system [53] keeps separate ratings for the likelihood a peer cooperates on a transaction
(referred to as its “reliability”) and the accuracy of its opinions or recommendations
(its “credibility”). Reliability would correspond to the reputation score as discussed
here, while the credibility score would be used for weighing information sources, as
discussed in Section 2.3.2. Guha et al. [59] suggest maintaining separate scores for
trust and distrust.
2.4.3 Peer Selection
Once an agent has computed reputation ratings for the peers interested in transacting
with it, it must decide which, if any, to choose. If there is only one peer, and the
question is whether to trust it with the offered transaction, the agent may decide based
on whether the peer’s reputation rating is above or below a set selection threshold [91].
If multiple peers are offering the same resource, the agent would likely go with the
peer with the highest reputation rating. However, even with many peers available,
an agent may decide to refuse all their transaction requests if all their reputations
lie below the selection threshold. It may not be uncommon in certain systems, such
as document-sharing systems, for all peers responding to a rare document request
to be malicious. Malicious peers disseminating inauthentic or virus-infected files
can reply to any request, while well-behaved peers will only reply if they have the
queried document. A selection threshold is necessary to protect against malicious
spam responses (see Sec. 5.6.1).
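Both the ranking behavior and the threshold safeguard can be sketched in a few lines (a hypothetical helper, not a specific system's interface):

```python
def select_provider(candidates, threshold):
    """Rank responders by reputation and pick the best one above the threshold.

    candidates: list of (peer_id, reputation) pairs. Returns the chosen
    peer_id, or None if every responder falls below the selection threshold
    (e.g. when all responses to a rare-document query come from malicious
    spammers).
    """
    eligible = [(rep, peer) for peer, rep in candidates if rep >= threshold]
    if not eligible:
        return None  # refuse all offers rather than risk a bad transaction
    return max(eligible)[1]
```

If the chosen provider defects, the agent can remove it from the candidate list and call the helper again to try the next highest ranked peer.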
2.5 Taking Action
In addition to guiding decisions on selecting transactional partners, reputation sys-
tems can be used to motivate peers to positively contribute to the network and/or
punish adversaries who try to disrupt the system.
2.5.1 Incentives
Mechanisms used to encourage cooperation in the system are referred to as incentive
schemes. They are most effective at combating selfishness because they offset the cost
of contribution with some benefit. However, incentive schemes can also mitigate some
maliciousness if access to system services requires that an adversary first provide good
resources. Such a reciprocative procedure raises the cost of misbehavior.
Most suggested incentive schemes offer one of two types of incentives: improved
service or money, with service improvements further decomposing into three general
categories:
Speed Agents that contribute resources to the network may be rewarded with faster
download speeds or reduced response latency for their requests and queries.
An example of this incentive can be seen in BitTorrent [23], a common P2P
application for downloading popular files that allows downloaders to share parts
of files as they are received. Applying the principle of “tit-for-tat”, the client
application throttles upload speed to a peer based on the download speed it
is receiving from that peer. Therefore, peers that are willing to devote more
upload bandwidth are rewarded with higher download speeds.
Quality Some systems may provide content at varying levels of quality, depending
on a peer’s contribution rate. For example, a P2P streaming movie service
could provide movies at different resolutions depending on a customer’s sub-
scription plan. This approach is already used by many online video providers
(e.g. IFILM [67]).
Quantity Similar to quality, the amount of information, content, or service providers
available to a peer would be determined by the amount the peer contributes.
This approach is also used by many online services that provide a limited amount
of content for free but require payment for access to all their content. Similar
ideas have been proposed for use in P2P systems. For instance, some solutions
encourage peers to route network messages for other peers (e.g. [137, 14]).
Money Currently, peer-to-peer systems are used to share files and resources that
require little or no cost for the contributing peer to produce and distribute.
However, supporting the exchange of more valuable content will require a pay-
ment mechanism that allows an agent to pay the content creator and provider
upon acquiring it. Most of the content will likely carry a low price since the
cost of distribution is spread over the users. Therefore a lightweight micropay-
ment mechanism is needed, allowing clients to make payments of a few cents
(or fractions of cents) without incurring a larger billing fee. Several papers have
proposed low-cost micropayment mechanisms for P2P systems (e.g. [64, 140]).
2.5.2 Punishment
While incentives are very useful at discouraging selfishness, curtailing misbehavior
requires the ability to punish malicious peers. As discussed earlier, the primary
function of reputation systems is to inform agents as to which peers are likely to
defect on a transaction. Not only does adversary avoidance benefit well-behaved
peers, but it punishes malicious peers who will quickly find themselves unable to
disseminate bad resources or cheat other peers. E-commerce sites such as eBay [42]
use reputation systems not only to provide customers with information on sellers,
giving buyers a sense of security, but also to discourage misbehavior in the first place.
If the reputation system can identify actively malicious peers it may retaliate in
several ways beyond simply warning other users. Overlay network neighbors can
disconnect from the adversary, immediately ejecting it from the network. Depending
on the type of identifiers used, the adversary may be kicked from the network for a
period of time, or permanently banned. To reenter the system, the adversary would
need to acquire a new valid identifier, which may be costly or impossible.
Finally, P2P systems tied to financial institutions for monetary payments could
fine a malicious peer for each verified act of misbehavior. Of course, such a solution
should be used cautiously, as adversaries could exploit it to wreak havoc in the system
by falsely accusing well-behaved peers of misbehavior.
2.6 Miscellaneous
Other work approaches the problem of trust and reputation with novel methods
that are not easily classified in terms of the basic mechanisms described above.
We feel it is important to include some examples of such work for completeness.
2.6.1 Resource Reputation
In [29], Damiani et al. enhance their previous peer reputation protocol [25] by propos-
ing the concept of resource reputation. In addition to reporting on the peers they
interact with, users give opinions on a resource’s authenticity based on its digest, or
hashed value. When a user requests a file or resource, each responder returns the di-
gest of the file it is willing to upload. Using the reputation system protocol, the user
looks up the digests to find which one corresponds to the correct file he is interested
in, and which is reported to be fake or corrupt. The user then fetches the file from
the provider reporting the digest most likely to be authentic. The user should then
recompute the digest on the received file. If it does not match that reported by the
file provider, the user can delete the file, report the provider for attempting to cheat,
and try a different provider. This technique complements the process of maintaining
peer reputations, which is still necessary in situations where the resource is rare and
no other peers have encountered it.
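The verification step can be sketched as follows; the digest function here (SHA-256) is our choice for illustration, not one mandated by the cited protocol:

```python
import hashlib

def verify_download(data, advertised_digest):
    """Check a fetched file against the digest its provider advertised.

    Recomputes the hash locally over the received bytes; a mismatch means
    the provider attempted to cheat and should be reported, and the user
    can try a different provider.
    """
    return hashlib.sha256(data).hexdigest() == advertised_digest
```

Because the digest is recomputed over the actual bytes received, a provider cannot advertise the digest of the authentic file and then deliver a corrupted one without being detected.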
2.6.2 Social Networks
Peer-to-peer reputation system research is conducted under the assumption that all
peers in the network are unknown and untrusted. However, in the real world this
is not likely to be the case. A user may know that some of his friends also use
the same peer-to-peer network and could benefit from connecting to them directly.
The use of a priori trust relationships was touched upon earlier in this chapter
(Sec. 2.3.2) as a source
of reputation information. However, existing social networks can be leveraged by
peer-to-peer systems in other ways. In Chapter 6 we present SPROUT, a protocol
for using social network access to locate and connect with friends in P2P networks in
order to improve message routing reliability. Using a social network trust model, we
show how such a protocol can be expected to improve performance.
2.7 Conclusion
Developing an implementable reputation system is an art involving many separate
design problems and choices. A reputation system is generally composed of three basic
functions: gathering behavioral information, scoring and ranking peers, and rewarding
or punishing peers. In turn, each component requires a combination of mechanisms to
function; each mechanism providing its own set of conflicting properties. We believe
a proper dissection of the overall design problem will allow researchers to develop
efficient solutions to each separate part without losing sight of the overall goal.
Chapter 3
Quantifying Agent Strategies Under Reputation
While most reputation system work has focused on developing specific protocols and
implementation designs that are tested through simulations, we believe much could be
learned through high-level theoretical analysis. In this chapter, we explore reputation
in online trade using a microeconomic model, primarily concentrating on individual
transactions between a small number of buyers and sellers. This chapter explores
the application of game theory to the study of reputation. Then, Chapter 4 takes a
macroeconomic approach, expanding the model to predicting typical user behavior in
highly populated systems.
This chapter concentrates only on selfish peers that seek to maximize their profit,
regardless of the harm to other parties. We do not discuss malicious agents that
gain additional utility from the act of harming other peers, though many reputation
systems are designed specifically to root out such agents [73, 93].
For this chapter, we are specifically interested in agents that have engaged in a
number of trades and therefore have accumulated a behavioral history. We ignore
the issue of bootstrapping reputation for new agents while preventing whitewashing.
We suggest a “stranger adaptive” technique similar to that proposed in [45] would be
effective. That work is further discussed in Section 3.5. We return to the issue of
bootstrapping in Chapter 5. Also, we do not address how the behavioral history is
collected. We simply assume that a perfect history is available to all agents, allowing
us to focus on agent strategies rather than on specific mechanisms for gathering
transaction information. Chapter 5 offers an examination of more realistic history
maintenance.
We begin in Section 3.1 by proposing a simple economic game that captures the
mechanics of transactions between a buyer and a seller. Section 3.2 describes the pos-
sible outcomes of each transaction and states the social optimum. In Section 3.3, we
discuss expected player response and outcome when buyers have no knowledge about
the sellers as well as when they have perfect knowledge of how a seller will respond.
While simple, this exercise introduces the model and the analysis techniques, and will
provide insight when we look at reputation in Section 3.4. Assuming a perfect repu-
tation system, we show that a Nash equilibrium is reached when players predominantly
cooperate. Sections 3.5 and 3.6 discuss related and future work. Finally, we conclude
in Section 3.7.
3.1 Definitions and Dimensions
This section defines a game that provides a simplified model of a generic trading
system. Next, we describe three dimensions which we vary to compose the specific
game scenarios we are interested in analyzing.
3.1.1 Game Setup and Rules
The players in our system are buyers and sellers.
• A seller can provide 1 unit of goods each turn, which we refer to as a bundle.
This bundle may be split by the seller between good resources, denoted by G,
and bad resources, denoted by B. Let 0 ≤ g ≤ 1 denote the fraction of the
bundle made up of good resources. For example, the bundle [3/4 G : 1/4 B]
corresponds to g = 3/4.

Table 3.1: Parameter descriptions with sample values

    Param.   Description                                  Value
    v        Valuation of 1G of goods to a buyer          3
    c        Seller's production cost of 1G of goods      1
    p        Price paid for a bundle                      2 (for FP)
    g        Fraction of bundle that is good              N/A
• Each unit of good resources costs a seller c to supply and has a valuation of v
to the buyer. Assume v > c. If not, there would be no price at which both the
seller and buyer could profit from a transaction and so no transactions would
occur.
• Each unit of bad resources costs a seller $0 to supply and has a valuation of $0
to the buyer.
• All sellers have the same production costs and all buyers have the same valua-
tion.
• A buyer can purchase at most one bundle per turn, but may choose not to
purchase any.
• The buyer always pays the seller before receiving the bundle. Consequently,
the buyer can never cheat a seller, only vice versa. This assumption reduces
the complexity of case analysis and mirrors most transactions, where payment
is verified before goods are received and their quality evaluated.
The parameters are listed with descriptions in Table 3.1, along with default values
used in concrete examples throughout the chapter.
As with most economic games, our interest will be to study how various strategies
affect the utility of each player in the game. Therefore, all values given are in units
of utility. We will use $ as the symbol for units of utility. Each player is solely
motivated to increase his own utility. When a buyer purchases goods from a seller,
we are interested in the change in utility for each participant of the transaction. We
refer to this change in utility as the profit (positive or negative) of each player. We
define social profit to be the sum of all the players’ profits. We consider the optimal
utilitarian strategy to be one that maximizes social profit.
Our investigation breaks down the range of options in three dimensions: knowl-
edge, players, and pricing. The following describes each dimension as well as the
scenarios we consider relevant.
3.1.2 Knowledge-space
As we wish to look at the effects of reputation information on market behavior, we
must specify what information about the seller is available to the buyer. We look at
three approaches of increasing complexity.
Zero Knowledge (0K) A buyer has no knowledge whatsoever of the transaction
history of any seller, even of sellers he himself has previously interacted with.
Perfect Knowledge (PK) A buyer knows exactly the composition of the current
bundle being offered by any seller.
Perfect History (PH) We define perfect history to mean that a buyer is aware of
the composition of every bundle each seller has previously sold but not the bun-
dle the seller is currently offering. Perfect history represents an ideal reputation
system capable of supplying the buyer with all information about any seller’s
previous actions.
3.1.3 Player-space
The number of each type of player in a scenario is determined as follows:
1B-1S The simplest player scenario we will look at is a game with one buyer and
one seller.
1B-MS In this scenario there is one buyer but many sellers competing for the buyer’s
attention and money.
MB-1S Conversely, there may be many buyers competing to purchase from only one
seller.
MB*MS After studying the previous three simpler scenarios we will consider more
complex player scenarios with multiple buyers and sellers. The relative number
of each will be indicated by the appropriate sign (i.e. =, <, or >) in place
of “*”. In most situations each of these cases reduces to one of the three
simpler scenarios, depending on relative population size.
When the number of buyers and/or sellers does not matter we will use asterisk
notation (e.g. *B-*S). We will refer to single buyers as B and single sellers as S.
When there may be multiple buyers and/or sellers we will use {B} to signify the set
of all buyers and {S} to signify the set of all sellers.
3.1.4 Price-space
The two pricing options we consider are:
Fixed price (FP) The system sets a constant price for each bundle. The seller may
vary the content of the bundle and the buyer may choose to buy a bundle or
not, but the price does not vary.
When multiple buyers are interested in a single seller in one turn, we assume
the buyers are randomly ordered. The first buyer chooses from all sellers and
the rest of the buyers choose from the remaining sellers, in order. This ordering
represents a real world phenomenon where an implicit ordering is obtained as
buyers compete for items offered on a “first come, first served” (FCFS) basis.
Variable price (VP) Each buyer bids on a bundle offered by the seller. The seller
accepts the highest bid, which determines the price the buyer must pay the
seller. In the case of a tie, the seller randomly chooses.
We do not concern ourselves with the specific mechanism of the auction, but
for simplicity assume an ascending auction or Vickrey auction [131]. Since all
buyers have the same valuation for goods and the same knowledge about the
seller, we expect all buyers to bid the same amount. Therefore, the second
highest bid will equal the highest bid in a Vickrey auction.
Because auctioning bundles does not make sense when there is only one buyer
we will ignore scenarios involving one buyer (i.e. 1B-1S/VP or 1B-MS/VP).
The variable p will denote the price paid for a bundle in either price scenario.
In FP, p denotes the fixed price set by the market, while in VP, p denotes the bid
accepted for the bundle.
3.1.5 eBay Scenario
To help in illustrating the implications of the model, we will at times use examples
within the framework of an online shopping site such as eBay [42]. While mostly
known for its variable priced auctions, many items on eBay also have an associated
fixed price allowing a bidder to purchase the item immediately for a specified amount.
Some items are offered on a solely fixed price basis. Therefore, eBay is an excellent
scenario in which to discuss the various aspects of the model across price and player
space. For example, when there are more interested buyers than items offered by
a particular seller at a fixed price, the order in which buyers purchase the items is
determined by when each clicked the “Buy Now” button; first come, first served.
Table 3.2: General payoff matrix

    Bundle(S)       Buyer     Seller    Social Profit
    [1G : 0B]       v − p     p − c     v − c
    [0G : 1B]       −p        p         0
    [g : (1−g)]     vg − p    p − cg    (v − c)g
Though eBay covers the spectrum of player and price-space, we specially focus on
the two most common scenarios: MB-1S/VP representing auctions and 1B-MS/FP
representing the sale of fixed-price commodities.
3.2 Strategy Independent Analysis
This section focuses on strategy-independent properties of the model. First we discuss
the payoffs each player receives from a single transaction with different bundles, as
well as the social profit. Given that, we derive the socially optimal bundle.
3.2.1 Single Transaction Payoff
For a transaction the buyer’s payoff equals the valuation of the good component of
the bundle minus the price paid: vg − p. The seller receives the price minus the cost
of producing the good component: p − cg. Adding the two gives the social profit of
(v− c)g. These expressions are summarized in Table 3.2 for easy reference, including
the two extreme bundles, 1G and 1B.
These expressions hold regardless of the strategy employed by players, the number
of players, or the information available to each player. Instead, these factors affect:
the bundle chosen by each seller, whether a buyer agrees to buy a bundle, and the
price offered by the buyers in the variable-priced scenario.
To illustrate, the following examples assume a bundle valuation of v = $3, a
production cost of c = $1, and a fixed price of p = $2 (listed in Table 3.1). The payoff
matrix for different sample bundle distributions for these specific parameter values is
given in Table 3.3. The last row gives the payoffs as a function of g, the fraction of
the bundle that is good.
Table 3.3: Payoff matrix for fixed $2 priced goods with valuation $3 and cost $1

    Bundle(S)         Buyer     Seller    Social Profit
    [1G : 0B]         1         1         2
    [1/2G : 1/2B]     -0.5      1.5       1
    [0G : 1B]         -2        2         0
    [g : (1−g)]       3g − 2    2 − g     2g
The payoff to buyers is the value of goods acquired minus the price. The payoff
to the seller is the amount paid minus the cost of producing the bundle of goods. For
example, consider the second row in Table 3.3 where a buyer purchases a bundle that
is half good resources and half bad resources. The buyer gains 1/2 · $3 in utility but
pays $2, for a total loss of $0.5. It cost the seller $0.5 to produce the bundle
(specifically the 1/2G) and it received $2 in payment, for a total gain of $1.5.
Therefore, the total
increase in utility, or social profit, from the transaction was $1.
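These payoff expressions are easy to check mechanically; the sketch below uses the sample parameter values from Table 3.1 as defaults:

```python
def payoffs(g, v=3.0, c=1.0, p=2.0):
    """Single-transaction payoffs from Table 3.2 (defaults are the sample
    values of Table 3.1: v = $3, c = $1, p = $2).

    Returns (buyer, seller, social) profit for a bundle whose good
    fraction is g.
    """
    buyer = v * g - p    # value of the good component minus the price paid
    seller = p - c * g   # price received minus the cost of the good component
    return buyer, seller, buyer + seller
```

For example, payoffs(0.5) reproduces the second row of Table 3.3: a buyer loss of $0.5, a seller gain of $1.5, and a social profit of $1.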
Remember, the buyer may always decline the transaction resulting in $0 profit for
both parties. If the seller is allowed to only produce [1G : 0B] or [0G : 1B] bundles,
this game resembles the one-sided prisoner's dilemma [111], where it is in one player's
interest to defect when the other cooperates, while the other player wants to strictly
cooperate.
3.2.2 Social Optimum
Our objective function is to maximize social profit, which we define as the sum of
utility gained/lost by both the buyer and the seller. From Section 3.2.1 we have the
social profit from a transaction as (v − c)g. Since v − c > 0 by definition, clearly the
social optimum results when the seller maximizes g by producing 1G.
Because social profit is independent of price or player strategy, this social optimum
holds for both fixed and variable pricing and is constant across knowledge-space and
player-space as well.
As we will see, the social optimum is an equilibrium for selfish agents in certain
scenarios. An additional advantage of this social optimum is that it does not require
the seller to know the valuation of the buyer, as long as v > c.
3.3 Selfish Analysis
Here we compare the optimal strategy previously described with the player strategies
that arise from independent selfish behavior. This section studies the 0K and PK
knowledge-space, while the following section focuses on the more interesting and
complex Perfect History. Each part begins by analyzing a one-buyer, one-seller
scenario with fixed prices
(1B-1S/FP). When applicable, variations in player-space and price-space will be dis-
cussed.
3.3.1 Zero Knowledge
Suppose 1B-1S/FP and consider the case of 0K, where the buyer has no knowledge of
the seller's current bundle or what she has offered in the past. If every transaction is
completely disconnected from all other transactions, then the seller's choice of bundle
has no effect, positive or negative, on future transactions. Each round is equivalent
to a one-shot Stackelberg game where the buyer always leads. Therefore, the seller
will offer 1B in order to maximize personal profit (p − cg, which for g = 0 gives p).
However, if the seller is expected to provide 1B, purchasing from her will result in
negative profit for the buyer (vg − p, which for g = 0 gives −p). Therefore, the buyer
will decline the transaction, resulting in $0 profit for each and thus no increase in
total utility.
Increasing the number of players or using variable pricing will not affect the fact
that it is in each seller’s interest to sell 1B if buyers are unable to distinguish between
sellers or their bundles in any way. Therefore, it is in every buyer’s interest to reject
the transaction.
3.3.2 Perfect Knowledge
Let's begin again with 1B-1S/FP. Suppose the buyer knows exactly what bundle the
seller is offering (PK). Unlike under 0K, each round is now a Stackelberg game where
the seller always leads. Given that buyer B's only choices are to purchase the offered
bundle or reject the transaction, seller S need only offer the minimal bundle that
gives B positive profit. Solving vg − p = 0 from Table 3.2 for g : (1 − g) yields a
threshold bundle of [p/v G : (v−p)/v B]. If the seller offers any bundle with more
good resources, the buyer will accept. Let S offer ε more good resources (and thus ε
fewer bad resources), where ε → 0+, to ensure a very small but positive profit for B.
This mixture results in a profit gain of vε → 0 for the buyer and p − c·p/v − cε →
p − c·p/v for the seller. The social profit is simply the profit of the seller, p − c·p/v.

Using the default values for the parameters from Table 3.1 produces a threshold
bundle of [2/3 G : 1/3 B] with a social profit of $4/3.
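As a quick sanity check on these numbers, the threshold bundle and the seller's limiting profit can be computed directly (a sketch of my own; `pk_threshold` is an invented name):

```python
def pk_threshold(v=3.0, c=1.0, p=2.0):
    """Perfect-knowledge threshold: the seller offers the minimal good
    fraction g = p/v at which the buyer breaks even; only the good
    fraction costs anything to produce."""
    g = p / v
    seller_profit = p - c * g  # the buyer nets ~0, so this is also the social profit
    return g, seller_profit

g, profit = pk_threshold()
# With the defaults this gives the bundle [2/3 G : 1/3 B] and a profit of $4/3.
```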
Next we expand our player set to include multiple buyers, then multiple sellers.
Buyer’s Market
Consider the 1B-MS/FP under PK scenario with n sellers but only one buyer. Each
turn the buyer chooses the seller with the best bundle, from which he purchases at
the fixed price. If all sellers offer the same bundle, each seller has probability 1/n of
being chosen.
Sellers can no longer offer the minimal bundle that gives a buyer a positive profit.
If they do, one seller will realize that she can increase her chance of selling her
bundle from 1/n to 1 by slightly improving her bundle above that of the rest. Quickly,
the other sellers will follow suit and improve their bundles up to or past that of the
first seller. In the end, all sellers will offer a bundle of 1G, resulting in an average
profit rate equal to the probability of being chosen times the utility gained from
selling 1G, or (1/n) · (p − c). No seller is motivated to change her bundle because
offering anything
less than the rest of the sellers guarantees they will not be chosen.
Now the Nash equilibrium equals the social optimum.
Next, consider the MB<MS scenario. Suppose there are m buyers and n sellers, with
m < n. The same equilibrium will result. After m − 1 buyers have each chosen a
seller, there will remain multiple sellers for the last buyer. For these final players the
problem degenerates to the 1B-MS situation, and so all remaining sellers must offer
1G. All previously chosen sellers must also have offered 1G: if one had not, she would
not have been chosen ahead of the remaining sellers, who are offering a better bundle.
Seller’s Market
Next, consider the MB-1S/FP/PK scenario. Let there be m buyers and one seller.
Since the seller can only sell one bundle per turn and the price of the bundle is fixed
at p, she will offer the minimal bundle that guarantees a sale. Just as in the first
case of an equal number of buyers and sellers, this bundle needs to be only slightly
better than [2/3 G : 1/3 B], resulting in the same Nash equilibrium as 1B-1S.
Now we consider the variable price scenario. Instead of randomly ordering the
buyers, thus guaranteeing that the last m − 1 will not be able to (or not want to)
purchase any goods, what if we allowed the buyers to bid for bundles?
We begin with a single seller and multiple buyers (MB-1S/VP/PK). Table 3.4
lists the payoffs to both the buyer and the seller, as well as the total social profit for
three different bundles. As expected, the social profit is the same for each bundle as
in the FP scenario.
If each buyer is free to bid any price, we can expect them to bid at or below the
bundle valuation. Assume the valuation of the bundle is vg > 0 (the seller offers a
bundle with at least some good content). A buyer B1 would like to pay as little as
possible, say $0. However, a second buyer B2 will happily offer a bit more in order
to secure winning the auction. It is then in B1's interest to raise his bid beyond that
of B2. This continues until one or both bid the actual valuation vg.
Table 3.4: Payoff matrix for variable priced p goods for default v = $3 and c = $1

  Bundle(S)          Buyer    Seller   Social Profit
  [1G : 0B]          3 - p    p - 1         2
  [1/2 G : 1/2 B]  1.5 - p  p - 0.5         1
  [0G : 1B]          0 - p      p           0
Using the MB>MS/FP scenario, we find that a similar analysis yields the same
equilibrium as for the MB-1S/FP scenario. The seller will choose a bundle so as to
limit the buyer’s profit to 0.
The MB>MS/VP scenario is not as trivial. The model specifies that a buyer can
acquire one bundle per round. With multiple sellers auctioning their bundles, should
buyers be allowed to bid on multiple concurrent auctions? One solution is to order
the sellers and conduct the auctions sequentially. A buyer who wins a bundle cannot
participate in subsequent auctions. With more buyers than sellers, we are guaranteed
to have multiple bidders for each bundle, and so the same equilibrium price as in
MB-1S is expected.
Similarly, if we allow buyers to purchase multiple bundles, and the valuation of
each bundle is not affected by the number acquired, then we would expect every buyer
to participate in each seller’s auction. Once again, this situation degenerates to the
MB-1S case.
The last scenario has auctions held in parallel, where buyers can purchase only one
bundle and therefore bid in only one auction. Here we break it down into two cases:
one with more than twice as many buyers as sellers, and one with fewer.
The first case is the simplest. Each seller's auction will be bid on by two or more
buyers, mirroring the MB-1S situation. If instead there were a seller with only one
buyer bidding on her bundle, then that buyer would have an advantage and would bid
low (less than v). However, there must then be a seller with three or more buyers
bidding for her bundle. One of those buyers would see that the single buyer was
bidding less than v, move his bid over to the single-buyer seller, and escalate the bid.
Now every seller has multiple buyers bidding.
In the second case, there are fewer than two buyers per seller. Some sellers will
have only one buyer bidding for their bundles.
To summarize, the 0K results indicate the need for some information about a
seller's behavior if any trades are to happen. Even with perfect knowledge, the seller
will not necessarily act in the best interest of the buyer. However, in many scenarios
the seller has incentive to offer the best possible bundle. While this is obvious in
situations where multiple sellers are competing for one buyer's attention (and money),
it also holds when multiple buyers are competing for one seller's item in an auction
scenario.
3.4 Perfect History
We begin by proposing very simple strategies for both buyers and sellers, then
incrementally modify them in response to the other players' current strategies until
the players reach a Nash equilibrium.
As defined in Section 3.1, perfect history (PH) entitles all buyers to know the
transaction history of every seller. We will simplify our model to allow sellers to sell
one of two bundles: 1G or 1B. If the seller offers 1G we say the seller cooperates on
the transaction. If she offers 1B, she is defecting on the transaction. We argue that
assuming a binary bundle does not greatly weaken our model. A buyer's decision on
whether to buy, and at what price, will be based on the probability with which he
expects the seller to cooperate or defect. This probability is estimated from each
seller's history/reputation.
We assume that each seller has accrued a number of transactions in her history
consistent with the strategy she employs. We do not focus on the reputation
bootstrapping problem (when a seller has no history), which is outside the scope of
this chapter but is discussed in subsequent chapters. When necessary, we simply
assume buyers expect sellers to cooperate on the first transaction.
To simplify our initial analysis of strategies for both buyers and sellers, we begin
with buyers assuming a simple model for the behavior of each seller. Given this
assumption, a buyer will choose a strategy. If sellers then assume each buyer follows
that strategy, they will choose their own strategy. We then repeat the process until
the progression of strategies reaches a Nash equilibrium, where neither player has
incentive to change their strategy.1
The first section proposes initial strategies for both buyers and sellers. The
following section explores improved strategies under the auction scenario (MB-1S/VP),
while the final section concentrates on strategies in the fixed-price market
(1B-MS/FP) scenario.
3.4.1 Basic Reputation-based Strategies
Coin Model (CM): Each round, seller S randomly chooses whether to cooperate
or defect with probability ρS of cooperating.
This simple model mimics each seller flipping a biased coin each turn. If there are
multiple sellers in the system, each seller may have a different bias ρi, i ∈ {S}, where
{S} is the set of all sellers, whether one or more.
Buyer Strategy β1 (BS-β1): Buyer B assumes seller S follows the coin model,
estimates S's probability of cooperating, and will pay up to vρS.
Regardless of the number of buyers and sellers (*B-*S), each buyer initially
considers each seller S independently. To determine the likelihood of S cooperating on
the next transaction, B needs to know ρS. Given ρS, the estimated valuation of S's
bundle is vρS + 0·(1 − ρS) = vρS. Therefore, B will be willing to pay up to vρS
for S's bundle. Consequently, the price a seller can command is proportional to her
reputation. This intuitive result is supported by empirical findings [75].

1Note, we do not claim it is the only existing Nash equilibrium.
Although B may not know ρS, he can estimate it from the seller's transactional
history. Specifically, counting the number of transactions she has previously
cooperated on and dividing by the total number of transactions gives an unbiased
estimator for ρS. Let TS be the total set of transactions S has participated in and CS
be those transactions in which S cooperated.

ρS = |CS| / |TS|    (3.1)
To understand how ρ affects the buyer's decision, first consider 1B-1S. Buyer B
calculates ρS and is willing to purchase from S if the fixed price (FP) satisfies
p ≤ vρS. If S is auctioning the bundle (MB-1S/VP), B will offer at most vρS.

Now suppose there are multiple sellers to choose from (1B-MS). B estimates ρi for
all i ∈ {S}. Now consider the following cases.
• FP: B seeks to maximize expected profit vρi − p for fixed price p. Therefore a
single buyer (1B-MS) will choose to purchase from the seller S whose ρS ≥ ρi
∀i ∈ {S}. If there are multiple buyers (MB*MS) competing for the bundles on a
first-come, first-served basis, then from among the remaining sellers with
available bundles, B will purchase from the seller with the highest ρi such that
vρi − p ≥ 0.

• VP: Variable pricing only applies to MB > MS, or to MB < MS if buyers can
purchase multiple bundles per turn. In either case, B will bid up to vρS, just
as in the single-seller scenario.
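A minimal sketch of BS-β1 follows (my own code; the helper names are invented). Each seller's cooperation probability is estimated from her public history via Eq. 3.1, and the buyer purchases from the seller maximizing expected profit vρ − p:

```python
def estimate_rho(history):
    """Unbiased estimator rho = |C_S| / |T_S| from a seller's history,
    given as a list of booleans (True = cooperated)."""
    return sum(history) / len(history)

def choose_seller(histories, v=3.0, p=2.0):
    """Return the index of the seller with the highest expected profit
    v*rho - p, or None if no seller offers non-negative expected profit."""
    best, best_profit = None, 0.0
    for i, h in enumerate(histories):
        profit = v * estimate_rho(h) - p
        if profit >= best_profit:
            best, best_profit = i, profit
    return best

histories = [
    [True, True, False, True],    # rho = 0.75, expected profit  0.25
    [True, False, False, False],  # rho = 0.25, expected profit negative
    [True, True, True, True],     # rho = 1.00, expected profit  1.00
]
# choose_seller(histories) picks the third seller (index 2)
```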
In BS-β1 the buyer(s) assumes the seller applies the coin model. Now we will look
at how the seller should respond if it assumes buyers are using BS-β1.
Seller Strategy σ1 (SS-σ1): Seller S assumes the buyer uses BS-β1. S follows the
coin model, but can choose an appropriate ρS when she enters the system. However,
S cannot vary ρS over time.
In other words, we allow S to freely choose ρS but not vary it over time (we relax
this constraint in the following sections). In the 1B-1S/FP scenario, ρS needs to be
sufficiently high that the expected valuation calculated by the buyer is greater
than or equal to the price p. Therefore, vρS ≥ p ⇒ ρS ≥ p/v. The same result holds
for MB-1S/FP.

However, in 1B-MS/FP, S expects the buyer B applying BS-β1 to choose the
seller with the highest ρS. All sellers will choose ρ = 1. If instead all sellers chose
some ρ < 1, then one seller i could unilaterally raise her ρi above that of the other
sellers and guarantee that she is chosen by B. This move would prompt the other
sellers to increase their ρ to be competitive, until all sellers are using ρ = 1. If one
seller does not follow suit, and keeps her ρ < 1, then there is no chance of B
choosing her.
Rational sellers will set ρ = 1 only if they can expect positive profits. Note that
if all sellers adhere to Seller Strategy σ1, the expected payoff every round for a
seller S with ρS = 1 is (p − c)/n, where n is the number of sellers with ρ = 1. Since
only ρ = 1 generates positive profit for the seller, we would not expect any rational
seller to choose ρ < 1.
Finally, consider the MB-1S/VP scenario. Following the same reasoning as for
MB-1S/VP under perfect knowledge, we once again find that the preferred ρS for
seller S is 1, as long as v > c.

If all sellers adhere to SS-σ1, then the buyers have no incentive to deviate
from BS-β1, resulting in a Nash equilibrium.
3.4.2 Independent Decisions for MB-1S/VP
In this section we consider only the one seller, multiple buyer, variable priced, perfect
history scenario. The work also applies to multiple sellers, but where buyers are
not restricted to purchasing at most one bundle per turn, thus allowing them to bid
in each seller’s auction. This scenario represents the type of markets we are most
48 CHAPTER 3. AGENT STRATEGIES UNDER REPUTATION
interested in, namely eBay-style auctions. Instead of insisting on a constant ρ over
all time as with SS-σ1, we allow the seller to decide whether to cooperate or defect
on each transaction separately. As we will see, a crucial factor in the seller’s strategy
is the total number of transactions the seller plans to execute.
Suppose seller S has committed n transactions, m of which were good and n − m of
which were bad. Assuming variable priced bids and buyers applying Buyer Strategy β1,
a buyer will bid up to v·m/n for the next bundle offered by the seller. Should the
seller cooperate or defect? If she cooperates, the expected bid price of the next
bundle will be v·(m+1)/(n+1). If the seller defects, she gains a one-time benefit of
c (compare 1G with 1B in Table 3.4), but the expected price of the next bundle will
be v·m/(n+1), slightly lower than if she had cooperated. Regardless of S's previous
or subsequent behavior, how many additional transactions must S perform before the
long-term damage done to her reputation by one defection outweighs the one-time
gain from that defection?

To measure the effect of a seller's decision on long-term utility, we calculate
utility over time for each case, cooperate or defect, and see after how many rounds
the values are equal.

Lemma 3.4.1 Assuming buyers follow BS-β1, a seller S that has committed n
transactions will gain more utility from defecting rather than cooperating on
transaction n + 1 if S performs fewer than k additional transactions, and less utility
if S performs more than k additional transactions, where k ≈ (n + 1/2)(e^{c/v} − 1).
Proof Suppose seller S has a history of n transactions, in m of which she cooperated.
On turn n + 1 the seller chooses either to defect or to cooperate. Let k be the number
of turns S sells bundles after she cooperates/defects on turn n + 1.

Let U(n) be S's utility after the first n turns. Let Uc(z) be the utility of S after
z > n turns, assuming S cooperated on turn n + 1. Similarly, let Ud(z) be the
utility of S after z > n turns, assuming S defected on turn n + 1. Before we
formulate Uc(z) and Ud(z) we must define some auxiliary functions.
Define the function fS(t) to return 1 if S cooperated (C) on turn t, or 0 otherwise.
Define the function FS(t) to be a nondecreasing function equal to the number of
turns S has cooperated after t turns. For example, since S cooperated m times in her
first n transactions, FS(n) = m.

FS(t) = Σ_{i=1}^{t} fS(i)    (3.2)

Define FS^{−y}(t) to be a nondecreasing function equal to the number of turns S has
cooperated after t turns, excluding turn y:

FS^{−y}(t) = Σ_{i=1, i≠y}^{t} fS(i)    (3.3)

Basically, for any t the value of FS^{−y}(t) is independent of how S acted on turn y.
Expressed mathematically,

∀t, y  {FS^{−y}(t) | fS(y) = 1} = {FS^{−y}(t) | fS(y) = 0}    (3.4)

Of specific interest to our problem is the substitution y = n + 1:

∀t  {FS^{−(n+1)}(t) | fS(n+1) = 1} = {FS^{−(n+1)}(t) | fS(n+1) = 0}    (3.5)
The function FS^{−(n+1)}(t) allows us to express the fact that the seller behaves
consistently after turn n + 1, whether she defects or cooperates on that turn. As we
are dealing with only one seller, we will drop the subscript from here on.

As stated above, buyers follow the bid model described by BS-β1; therefore each
turn S is paid the fraction of past transactions she has cooperated on, times the
value of cooperation, v.

The following equations express the seller's utility k turns after the
cooperate/defect choice:
Uc(n+1+k) = U(n) + (v·m/n − c) + Σ_{i=1}^{k} [ v·(F^{−(n+1)}(n+i) + 1)/(n+i) − f(n+1+i)·c ]    (3.6)

Ud(n+1+k) = U(n) + v·m/n + Σ_{i=1}^{k} [ v·F^{−(n+1)}(n+i)/(n+i) − f(n+1+i)·c ]    (3.7)
Notice in Equation 3.6 the additional 1 in the numerator of the summation fraction,
indicating that S cooperated on turn n + 1. Setting the two utility equations equal
to each other and solving for k (the full derivation is presented in Appendix A):

Uc(n+1+k) = Ud(n+1+k)    (3.8)

k ≈ (n + 1/2)(e^{c/v} − 1)    (3.9)
Using our default parameter values (v = 3 and c = 1) results in k ≈ 0.40n + 0.2,
which means a seller that has accumulated a history of 10 transactions profits more
from cooperating on the next sale than from defecting if she plans to participate in
5 or more additional transactions.
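The break-even point from Lemma 3.4.1 is easy to evaluate numerically (an illustrative sketch of my own; `breakeven_k` is an invented name):

```python
from math import ceil, exp

def breakeven_k(n, v=3.0, c=1.0):
    """Approximate number of additional transactions after which the
    long-term reputation loss from one defection outweighs its one-time
    gain (Lemma 3.4.1): k ~ (n + 1/2)(e^{c/v} - 1)."""
    return (n + 0.5) * (exp(c / v) - 1)

k = breakeven_k(10)   # ~4.15 with the defaults v = 3, c = 1,
# so a seller with 10 past transactions should cooperate if she plans
# ceil(k) = 5 or more additional sales
```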
Figure 3.1(a) shows the linear relation between n and k for three different values
of v/c. For example, for n = 40 and v/c = 2, k = 13.3; therefore, given that a buyer's
valuation of a good bundle is twice the cost of producing the bundle, a seller with
a history of 40 sales (good or bad) will profit less from defecting than from
cooperating on the next sale if she sells 14 or more additional bundles. Interestingly,
k does not depend on m or f(x), only on n. This means that the seller's decisions to
cooperate or defect on past or future transactions have no impact on whether she
should cooperate or defect on the current turn; only the quantity of past
transactions matters.
The other factor affecting k, in addition to the length of a seller's history (n), is
the cost and valuation of goods. More specifically, as valuation increases with
respect to cost, the optimal fraction of total transactions to defect on decreases.
The ratio of cost to valuation is illustrated in Figure 3.1(b) for three values of n.

Figure 3.1: Number of transactions k until the gain from a single defection equals
the loss from lowered reputation. (a) As a function of the seller's history n, for
three values of v/c. (b) As a function of the ratio of valuation to cost v/c, for
three values of n.
Intuitively, as the difference between cost and valuation shrinks, the potential for
profit goes down. For instance, if the valuation equals the cost plus a small δ, then
the highest price buyers will be willing to pay is the cost of the bundle plus δ. If
the profit a seller can make from the sale of a good bundle is only a fraction of the
cost, then the utility saved by skipping the cost of one bundle outweighs the profit
lost on many good bundles. This is represented by the sharp rise in k as v/c
approaches 1 in the figure. As the cost of producing a good bundle becomes a smaller
fraction of the valuation, and thus of the bid price the seller can command for a
bundle, the decrease in bid prices due to lower reputation quickly overtakes the
one-time gain from defection. As v/c approaches ∞, k converges to 0.
This analysis suggests a new seller strategy for the MB-1S/VP/PH scenario:
Seller Strategy σ2 (SS-σ2): Seller S assumes the buyer uses BS-β1. Suppose S
knows beforehand how many total bundles she wants to sell, Z, and the cost and
valuation of bundles. S will maximize her utility by cooperating on the first
⌈(Z − 1/2)·e^{−c/v} − 1/2⌉ transactions and then defecting on the rest.
If S knows the total number of bundles she will auction over her lifetime in the
system (call this Z), S can maximize her profit by cooperating for some number of
initial transactions and then, at a certain point, switching and defecting on the
rest. Lemma 3.4.1 gives, for a given number of completed transactions, how many more
transactions must be completed for the one-time gain from defecting to equal the
long-term loss due to a lower reputation. If a seller defects and performs fewer than
k additional transactions, the defection was to her benefit. If S performs more than
k, then she has less utility than had she cooperated. Therefore, ideally S's strategy
is to cooperate on all sales for a number of turns, then defect on the rest. Utility
is maximized when the number of transactions in the cooperating phase, n, and the
number in the defecting phase, k + 1, are related by Lemma 3.4.1.
Below we prove that SS-σ2 is optimal for a seller participating in a predetermined
number of transactions under the scenario MB-1S/VP/PH where the buyers are using
BS-β1.

Definition Let a transaction schedule of length Z be a permutation of exactly Z
cooperations and defections. Let Ξ^Z_x be the set of all possible transaction
schedules with x cooperations and Z − x defections. For example, (C C D D C D C) ∈ Ξ^7_4.
Definition The utility of a transaction schedule T, U(T), is the total utility gained
or lost by a seller who commits exactly Z transactions and cooperates or defects
in the order specified by T, assuming MB-1S/VP with buyers using strategy BS-β1.
U(T) for any schedule T is equal to the sum of the payments received for each bundle
minus the sum of the costs of producing good bundles. The total cost for a
transaction schedule T ∈ Ξ^Z_x is x·c (the number of cooperations times the cost of
each). The payment received by a seller S for each bundle is equal to the buyers'
valuation of a good bundle times ρS which, for the ith bundle, is the number of
cooperations in the first i − 1 turns divided by i − 1. For the first bundle we
assume ρS = 1 since, as stated earlier, buyers always trust new sellers on their
first bundle. This assumption only affects the payment on the first bundle and is
the same for all schedules. Mathematically,
∀T ∈ Ξ^Z_x,  U(T) = v + Σ_{i=2}^{Z} v·F_T(i−1)/(i−1) − x·c    (3.10)

where the first term v is the payment for the first bundle, the summation gives the
payments for the remaining bundles, and x·c is the total cost. Note, the subscript
in F_T(i−1) refers to the transaction schedule: we define F_T(i−1) as the number of
cooperations in the first i − 1 terms of transaction schedule T.
Given that a seller makes Z transactions with 0 ≤ x ≤ Z cooperations and Z − x
defections, we will show that

(i) a utility optimal transaction schedule consists of all x cooperations first,
then all Z − x defections, and

(ii) for such a transaction schedule the optimal number of cooperations is
x = ⌈(Z − 1/2)·e^{−c/v} − 1/2⌉.
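Claim (i) can be checked by brute force for small Z. The sketch below (my own code, not from the dissertation) implements the schedule utility of Eq. 3.10 and confirms that, among all schedules with Z = 7 and x = 4, the segregated schedule (C C C C D D D) is optimal:

```python
from itertools import permutations

def schedule_utility(T, v=3.0, c=1.0):
    """Utility of a transaction schedule T (True = cooperate) per Eq. 3.10:
    the first bundle sells at v (new sellers are trusted), bundle i > 1
    sells at v * F_T(i-1)/(i-1), and each cooperation costs c."""
    utility = v          # payment for the first bundle
    coops = 0            # running count F_T(i)
    for i, action in enumerate(T):
        if i > 0:
            utility += v * coops / i   # payment for bundle i + 1
        if action:
            coops += 1
            utility -= c               # cost of producing a good bundle
    return utility

Z, x = 7, 4
schedules = set(permutations([True] * x + [False] * (Z - x)))
best = max(schedules, key=schedule_utility)
# best == (True, True, True, True, False, False, False): all cooperations first
```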
Theorem 3.4.2 Assuming that buyers use strategy BS-β1, the utility optimal
transaction schedule of length Z with x cooperations and Z − x defections consists
of executing all x cooperations first, followed by all Z − x defections. We refer to
such a schedule as a segregated schedule.
Proof by contradiction Let T ∈ Ξ^Z_x be an optimal transaction schedule such that
at least one defection D appears before at least one cooperation C in the schedule.
Let i be the index of the first D in T and j be the index of the last C. By
definition i < j. Construct transaction schedule T′ by swapping the D at position i
with the C at position j. By the optimality of T, U(T) ≥ U(T′). Represent each
utility using Eq. 3.10.
U(T) ≥ U(T′)    (3.11)

v + Σ_{k=2}^{Z} v·F_T(k−1)/(k−1) − x·c ≥ v + Σ_{k=2}^{Z} v·F_{T′}(k−1)/(k−1) − x·c    (3.12)

Notice both schedules have the same total cost, due to having the same total number
of cooperations (x). Both also have the same initial payment. Cancelling the initial
payment, the x·c terms, and the common factor v leaves

Σ_{k=2}^{Z} F_T(k−1)/(k−1) ≥ Σ_{k=2}^{Z} F_{T′}(k−1)/(k−1)    (3.13)

Because only the terms at positions i and j in T were swapped to form T′, we have
F_T(k) = F_{T′}(k) for all k < i and all k ≥ j, so the inequality reduces to

Σ_{k=i+1}^{j} F_T(k−1)/(k−1) ≥ Σ_{k=i+1}^{j} F_{T′}(k−1)/(k−1)    (3.14)

However, T′ has a C in position i where T has a D, while all other positions less
than j are the same. Therefore, by the definition of F_T(k), F_T(k) = F_{T′}(k) − 1
for all i ≤ k < j. This contradicts Eq. 3.14, which implies that there exists some k,
i ≤ k < j, such that F_T(k) ≥ F_{T′}(k). Therefore, a utility optimal transaction
schedule cannot have a defection appear in the sequence before a cooperation.
Intuitively, the benefit from defecting is a one-time savings on cost, while the
benefit of cooperation is an improved reputation, which in turn increases the
expected payment for each future bundle. Therefore, executing a set number of
cooperations before any defections maximizes the benefit gained from those
cooperations.

Theorem 3.4.2 implies that once a seller has decided it is in her interest to defect
once, it will be in her interest to defect every time until she exits the system. Next
we check whether there is always a number of cooperations that maximizes the
utility of a segregated schedule of length Z.
First, we need an expression for the utility generated by a segregated schedule.
Definition Let Useg(Z, x) be the utility of a segregated transaction schedule of
length Z with x cooperations followed by Z − x defections. If we assume an
MB-1S/VP scenario with buyers using BS-β1, Useg(Z, x) can be expressed as

Useg(Z, x) = (v − c)·x + Σ_{i=x}^{Z−1} v·x/i    (3.15)

where (v − c)·x is the utility from the x cooperations, the i = x term of the
summation (equal to v) is the payment for the first defection, and the remaining
terms are the payments for the other defections. Note, as in Eq. 3.10, we are
assuming buyers always expect the seller to cooperate on the first transaction. This
assumption simplifies our derivations and analysis but does not affect our results.
As we will see, for Z ≥ 2 the seller should always cooperate on the first transaction.
Theorem 3.4.3 For a given value of Z, the utility function for a segregated
transaction schedule (given by Equation 3.15) has at most one global maximum for
valid values 0 < x ≤ Z.

Proof The formal proof of Theorem 3.4.3 is given in Appendix B. Basically, the
second derivative of Useg(Z, x) (Eq. 3.15) with respect to x (the number of
cooperations) is −2v·Σ_{k=0}^{∞} k/(x+k)^3, which is always negative between 0 and Z.
Therefore, Eq. 3.15 can have at most one maximum for any valid value of x.
Knowing now that a segregated schedule of the form (C C ... C D D ... D) with
x cooperations followed by Z − x defections has a unique optimal value of x that
maximizes Useg(Z, x) for a given Z, how can we compute it? Below, we derive
an approximate answer by approximating Eq. 3.15 with a continuous function. We
then state (Theorem 3.4.5) a tighter approximation based on Lemma 3.4.1 (proven
in Appendix A), whose full derivation is presented in Appendix C.
Theorem 3.4.4 Assuming that buyers use strategy BS-β1, the utility optimal
transaction schedule of length Z consists of approximately ⌈(Z − 1)·e^{−c/v}⌉
cooperations followed by ⌊(Z − 1)·(1 − e^{−c/v}) + 1⌋ defections.
Proof Approximate Useg(Z, x) by the continuous function U:

U = (v − c)·x + ∫_x^{Z−1} (v·x/t) dt    (3.16)

Simplifying and taking the derivative with respect to x yields

U = (v − c)·x + v·x·ln(Z − 1) − v·x·ln(x)    (3.17)

dU/dx = (v − c) + v·ln(Z − 1) − v·ln(x) − v    (3.18)
      = v·ln((Z − 1)/x) − c    (3.19)

Set dU/dx = 0 and solve for x:

v·ln((Z − 1)/x) − c = 0    (3.20)

ln((Z − 1)/x) = c/v    (3.21)

(Z − 1)/x = e^{c/v}    (3.22)

x = (Z − 1)·e^{−c/v}    (3.23)

Note that the second derivative of U is

d²U/dx² = −v/x    (3.24)

which is negative for all 0 < x ≤ Z. Therefore, the value of x given in Eq. 3.23
must give the unique maximum of U for all valid values of x, just as for Useg(Z, x)
(Theorem 3.4.3).
Because we are interested only in integer values for x and Z − x, the resulting
equations for the optimal number of cooperations and defections in a segregated
transaction schedule of length Z are

nC(Z) = ⌈(Z − 1)·e^{−c/v}⌉    (3.25)

nD(Z) = ⌊(Z − 1)·(1 − e^{−c/v}) + 1⌋    (3.26)

where nC(Z) and nD(Z) are the number of cooperations and defections (respectively)
in a utility optimal segregated schedule.
As stated earlier, performing the derivation on the discrete representation
of utility (presented in Appendix C) results in a better approximation, with error
bounds that approach 0 as Z approaches ∞. Restating Theorem 3.4.4 with the tighter
approximation:

Theorem 3.4.5 Assuming that buyers use strategy BS-β1, the utility optimal
transaction schedule of length Z consists of approximately ⌈(Z − 1/2)·e^{−c/v} − 1/2⌉
cooperations followed by ⌊(Z − 1/2)·(1 − e^{−c/v}) + 1⌋ defections.
We now focus solely on this improved approximation for constructing an optimal
segregated schedule of length Z:

nC(Z) = ⌈(Z − 1/2)·e^{−c/v} − 1/2⌉    (3.27)

nD(Z) = ⌊(Z − 1/2)·(1 − e^{−c/v}) + 1⌋    (3.28)
Figure 3.2 shows both nC(Z) and nD(Z) as functions of Z for v/c = 3. Notice that
both are linear in Z, though nC(Z) grows at a faster rate, so that it is always
roughly 2.5 times nD(Z). This ratio is determined by the valuation/cost ratio. For
our default values of c and v ($1 and $3, respectively) and a sufficiently large Z,
the equations indicate a seller should cooperate on roughly the first 70% of her
transactions and defect on the rest. For example, we see that at Z = 40,
nC(Z) = 28 and nD(Z) = 12; 28/40 = 70%.
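These closed forms can be verified against a direct search over Eq. 3.15 (a numerical sketch of my own; the function names are invented):

```python
from math import ceil, exp

def u_seg(Z, x, v=3.0, c=1.0):
    """Utility of a segregated schedule (Eq. 3.15): x cooperations at
    profit v - c each, then Z - x defections paid at v * x / i."""
    return (v - c) * x + sum(v * x / i for i in range(x, Z))

def n_coop(Z, v=3.0, c=1.0):
    """Approximately optimal number of cooperations per Eq. 3.27."""
    return ceil((Z - 0.5) * exp(-c / v) - 0.5)

Z = 40
best_x = max(range(1, Z + 1), key=lambda x: u_seg(Z, x))
# with v = 3, c = 1: best_x == n_coop(40) == 28, i.e. cooperate on the
# first 70% of the 40 transactions and defect on the rest
```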
Figure 3.2: Optimal number of cooperations/defections as a function of total sales Z.
In deriving Lemma 3.4.1, and consequently Eqs. 3.27 and 3.28, we used a closed-form
approximation of a finite harmonic series (see Appendix A). To numerically
evaluate the approximation error we compute the following two error functions:

Definition From Eq. 3.15, let Useg(Z, nC(Z)) be the utility of the transaction
schedule with the supposedly optimal number of cooperations and defections. Define
the error functions fe+(Z) and fe−(Z) as

fe+(Z) = [Useg(Z, nC(Z)) − Useg(Z, nC(Z) + 1)] / Useg(Z, nC(Z))    (3.29)

fe−(Z) = [Useg(Z, nC(Z)) − Useg(Z, nC(Z) − 1)] / Useg(Z, nC(Z))    (3.30)

The functions fe+(Z) and fe−(Z) give us the relative error between the schedule we
assume to be optimal and the two closest schedules of length Z, namely those with
one more and one fewer cooperation, respectively.
Below, error analysis is applied to the tighter approximation from Theorem 3.4.5
followed by an analogous evaluation for Theorem 3.4.4.
In Figure 3.3 we plot f_e^+(Z) and −f_e^−(Z); we negate the second function to better differentiate the two functions in one graph. For large enough Z, neither curve crosses 0, indicating that indeed the schedule we believe is optimal does result in
Figure 3.3: Relative utility error between optimal schedule and ±1 C/D.
Figure 3.4: Relative utility error between optimal schedule using weak approximation and ±1 C/D.
better utility than a schedule with one more or one less cooperation, and is therefore
at least a local maximum. For small Z (Z < 5), however, nC(Z) is not necessarily
optimal. In fact, though not visible in Figure 3.3 because it lies outside of the y-range, f_e^+(Z) attains negative values for Z = 1, 2, 3, and 4. These results indicate that
nC(Z) + 1 results in better utility than nC(Z) for very small Z, which is expected
because the approximation for k from Lemma 3.4.1 is weakest for very small Z.
However, for very small Z the behavior of buyers towards unknown, untested sellers
is an important factor. Originally, we stated we wanted to assume sufficient history
in order to ignore reputation bootstrapping issues.
Calculating the same error functions using the weaker approximations from Theorem 3.4.4 reveals that f_e^−(Z) is periodically negative, regardless of how large Z gets. In Figure 3.4 the dotted curve representing −f_e^−(Z) rises slightly above 0 with a
periodicity of 7. Therefore, for one in seven values of Z, the weaker approximation
from Theorem 3.4.4 does not compute the utility optimal schedule.
The previous numerical error analysis demonstrates that the value computed by Eq. 3.27 specifies a local maximum for Useg(Z, x). Applying Theorem 3.4.3, we know the value must be a global maximum because the utility function has a unique maximum in the valid range.
Notice that if the seller plans to participate in the system selling bundles indefi-
nitely, we may set Z = ∞. In this case nC(∞) = ∞. Therefore, SS-σ2 dictates that
a seller that plans to sell goods for the foreseeable future should always cooperate.
As expected, this result is exactly the same as SS-σ1, which sets ρ = 1.
So far we have constrained the buyers to strategy BS-β1. If we remove this re-
striction, how will buyers respond to sellers using SS-σ2?
Buyer Strategy β2: Buyer B assumes seller S uses SS-σ2. Not knowing how many bundles S will sell in all (Z), B should assume S will always cooperate until S defects once. From then on, B assumes S will always defect and never purchases from S again.
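BS-β2 is a grim-trigger rule, which can be sketched in one line (our own illustration; we encode a seller's history as a string of 'C'/'D' outcomes):

```python
def will_buy_from(history: str) -> bool:
    """BS-β2: purchase from a seller only if she has never defected."""
    return "D" not in history

print(will_buy_from("CCCCC"))  # True
print(will_buy_from("CCDCC"))  # False
```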
Knowing that the optimal strategy for sellers is to cooperate for their first x
transactions and then defect on the rest, a buyer will watch for a seller’s first defection
and then refuse to purchase any more bundles from it.
If a seller assumes all buyers are using BS-β2, the seller will adopt a new strategy.
Knowing that no buyer will purchase a bundle from her once she has defected once,
and given that the seller makes a larger profit from cooperating on a transaction than
not selling anything at all, the seller will cooperate on every transaction except on
the very last one.
Seller Strategy σ3: Seller S assumes buyers use BS-β2. Given a total of Z bundles to sell, S will cooperate on the first Z − 1 bundles and defect only on the last bundle.
To this seller strategy, a buyer will respond with BS-β2, indicating an equilibrium.
Notice SS-σ3 is almost equivalent to SS-σ1, where each seller cooperates on every
transaction in order to maximize ρ.
Using the estimator ρ alone, BS-β1 is unable to distinguish whether a seller, with a
history of 15 Cs and 3 Ds, is applying SS-σ1 or SS-σ2. Obviously, a player’s reputation
score must rely not only on the number of cooperations and defections, but also on their sequence.
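The ambiguity is easy to demonstrate (a sketch of ours, not from the original text): two histories with the same counts but very different orderings receive the same score from a count-based estimator.

```python
def rho(history: str) -> float:
    """Count-based estimator in the style of Eq. 3.1:
    fraction of transactions in which the seller cooperated."""
    return history.count("C") / len(history)

end_defector = "C" * 15 + "D" * 3      # defects only at the end, as in SS-σ2
occasional   = "CCCCCDCCCCCDCCCCCD"    # same 15 Cs and 3 Ds, spread out
print(rho(end_defector) == rho(occasional))  # True: ρ cannot tell them apart
```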
3.4.3 Independent Decisions for 1B-MS/FP
We continue studying the expected player behavior when sellers decide on a per turn
basis whether to cooperate or defect. While the previous section dealt specifically with
the MB-1S/VP scenario, in this section we concentrate on the 1B-MS/FP scenario.
A rational seller’s decision whether to cooperate or defect is not fixed over time (as
in SS-σ1); it may vary as both her and other sellers’ reputations vary. For example,
suppose there are 10 sellers, S1...S10 and one buyer B that is willing to pay a fixed
price p for one bundle and follows Buyer Strategy β1. Each seller has previously sold
10 bundles, of which 5 were good and 5 bad. All else being equal, the buyer will
prefer to purchase from the seller with the best transactional record, expressed as the
fraction of transactions in which they cooperated. To the buyer who must choose one,
all ten are identical and thus all have an equal chance of being chosen, 0.1. Suppose B
chooses S1. If S1 defects, its transaction record will drop to 5/11 while the rest remain at 5/10. In the following round, having a clearly worse record will disqualify S1 from selection, lowering S1's probability of being chosen to 0 and raising the other sellers' chance to 1/9. However, if S1 had instead cooperated with B then her record would be 6/11, higher than the other sellers'. In the following round we would expect B to choose
S1 with probability 1 as she clearly has a better record than the rest. These expected
outcomes provide incentive for S1 to cooperate.
In the following round, B chooses S1 again. If S1 defects, her record falls to 6/12, equal to that of the other sellers. S1 is no longer guaranteed to be chosen and once again has a probability of 0.1. Therefore, once again, S1 is incentivized to cooperate.

In the third round, however, the situation is more interesting. If S1 is chosen and defects, her record drops to 7/13, which is still better than the rest of the sellers at 5/10.
There is no disincentive for S1 to defect. In fact, comparing 1G to 1B in Table 3.3,
S1 clearly has incentive to defect and earn more utility than cooperating.
This analysis suggests a new strategy for the seller.
Suppose there are n sellers, each with ci cooperations and di defections (ci/di not necessarily equal to cj/dj for i ≠ j). Assume a buyer ranks sellers according to their reputation score, calculated as ci/(ci + di) for seller i. Then seller i would choose to defect on a transaction if ci/(ci + di + 1) > cj/(cj + dj) for all j ≠ i. Otherwise, the seller cooperates. More generally stated,
Seller Strategy σ4: Assuming buyers use BS-β1, seller S always cooperates unless her reputation is sufficiently higher than the other sellers' that a defection still gives S a higher reputation than the rest.
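The defection test in SS-σ4 can be sketched as follows (illustrative code of ours; the helper name and the (cooperations, defections) record format are our own conventions):

```python
def can_defect_safely(i, records):
    """SS-σ4 test: seller i defects only if, even counting the defection,
    her reputation stays strictly above every other seller's.
    records[j] = (cooperations, defections) for seller j."""
    ci, di = records[i]
    rep_after_defect = ci / (ci + di + 1)
    return all(rep_after_defect > c / (c + d)
               for j, (c, d) in enumerate(records) if j != i)

# Rounds from the example above: S1 (index 0) starts at 5 C / 5 D.
print(can_defect_safely(0, [(5, 5)] + [(5, 5)] * 9))  # False: 5/11 < 5/10
print(can_defect_safely(0, [(7, 5)] + [(5, 5)] * 9))  # True: 7/13 > 5/10
```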
Now we relax the assumption that buyers strictly use BS-β1.
In the example above, we noted that S1 had incentive to defect in round 3 while
keeping her standing as the highest reputable seller. Consequently, B may be better
off ignoring S1 and choosing one of the other sellers, contrary to BS-β1. Now, the
situation is reduced to the problem of 9 equal sellers, all with incentive to cooperate.
Suppose B chooses S2 now. As before, S2 is expected to cooperate, raising her reputation to 6/11. On the fourth round, B will choose S1 again, since if she defects, S1's new reputation of 7/13 will be lower than S2's.
If sellers are expected to use SS-σ4, a buyer should then choose the seller S such that cS/(cS + dS) ≥ cj/(cj + dj) for all j, unless cS/(cS + dS + 1) > cj/(cj + dj) for all j ≠ S. In such a case, the buyer should choose the seller T such that cT/(cT + dT) ≥ cj/(cj + dj) for all j ≠ S. In other words,
Buyer Strategy β3: Assuming sellers use SS-σ4, buyer B always chooses to buy from the seller with the highest reputation whose rank (from most reputable to least reputable) would fall if she defected.
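A minimal sketch of the BS-β3 selection rule (our own illustration; ties among equally reputable sellers are broken arbitrarily by sort order):

```python
def choose_seller(records):
    """BS-β3: buy from the highest-reputation seller whose rank would
    fall if she defected; skip a leader who could defect 'for free'.
    records[j] = (cooperations, defections) for seller j."""
    rep = lambda r: r[0] / (r[0] + r[1])
    ranked = sorted(range(len(records)), key=lambda j: rep(records[j]),
                    reverse=True)
    top = ranked[0]
    c, d = records[top]
    # Would the leader still out-rank everyone after one defection?
    if all(c / (c + d + 1) > rep(records[j]) for j in ranked[1:]):
        return ranked[1]  # leader can defect safely; pick the runner-up
    return top

# With S1 at 7 C / 5 D and nine sellers at 5 C / 5 D, the buyer skips
# S1 (she could defect and keep her lead) and picks another seller.
print(choose_seller([(7, 5)] + [(5, 5)] * 9))
```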
We believe that knowing buyers are using BS-β3 will not cause sellers to deviate
from SS-σ4, and hence equilibrium is reached. We do not present a formal proof here.
3.5 Related Work
The work presented here was initially inspired by the work of Feldman et al. In [45],
they used the Evolutionary Prisoner’s Dilemma to model peer interactions in a large
population. They developed a reciprocative strategy that employs subjective shared
history and adaptive stranger policies to discourage selfish behavior and whitewash-
ing. While this previous work relies primarily on simulations to evaluate the effectiveness of their design, we apply mathematical analysis to derive agent strategies and overall system behavior.
Much work has applied game theory to the problem of selfish agents (e.g. [18,
46, 107]). [18] predicts a socially beneficial Nash equilibrium given some incentive
scheme, while [46] concentrates on minimizing whitewashing. However, most of this
research uses one-shot games to model behavior and does not address peer history or
reputation.
In both [22] and [75], user groups participated in economic games in order to
experimentally compare the market efficiency from varying amounts of transaction
history. Their results are similar to our analytic results, indicating that the more
information available about an agent, the more likely it is to cooperate.
Economists have applied game theory to market analysis and reputation for decades [80, 100, 50]. Most of this work has focused on firms competing for market share. However, the explosion in online trade among countless small transient agents demands
a reevaluation of the subject. In addition, to the best of our knowledge no previous
work studies optimal segregated transaction schedules.
3.6 Future Directions
The work presented here assumed all buyers had an equal constant valuation that
was public. One extension will be to allow buyers to have different, private bundle
valuations. This would only affect seller strategies that rely on knowing v in order to
choose the proper course of action.
Much of the analysis of this simplified model indicated equilibrium in some sce-
narios is reached when sellers only cooperate. Introducing an unavoidable error rate that results in occasional defections, regardless of the seller's intention, may require buyers and sellers to devise more interesting strategies. What if sellers could gain
a cost reduction on all bundles by accepting a higher error rate? This would mimic
retailers choosing to stock cheaper items from lower quality manufacturers.
A necessary step will be to forego our assumption of perfect history and explore
the use of uncertain history provided by imperfect reputation systems. One solution
would be to assign a probability that any given transaction is incorrectly reported or
simply omitted from a seller’s recorded history.
3.6.1 Variably-Valuated Goods
So far we have discussed situations where every good sold by a seller had an equal cost.
In real markets a seller sells goods of varying value. How does this affect reputation?
How important is the value of the transactions in a seller's history? If no weight is placed on the transaction value, a seller could accumulate a high reputation selling inexpensive goods and then defect on one large transaction.
One improvement may be to use the price of each transaction when computing
a seller’s reputation. For example, let us assume a buyer is using BS-β1 and wants
to calculate ρS for seller S. Instead of using the formula in Equation 3.1 where each
previous transaction is reduced to 0 or 1, we can use the following equation
ρS = ( Σ_{i∈CS} p(i) ) / ( Σ_{i∈TS} p(i) )    (3.31)
This estimator allows a buyer to better detect a seller who purposefully cooperates
on small transactions, but defects on very large ones, and distinguish that seller from
one who makes accidental errors that tarnish its reputation.
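The effect of this price-weighted estimator can be illustrated with a small sketch (our own code; we represent a seller's history as a list of (price, cooperated) pairs, standing in for CS and TS):

```python
def price_weighted_rho(history):
    """Price-weighted reputation in the spirit of Eq. 3.31: fraction of
    transaction value on which the seller cooperated."""
    total = sum(price for price, _ in history)
    good = sum(price for price, ok in history if ok)
    return good / total if total else 0.0

def simple_rho(history):
    """Count-based estimator: each transaction counts as 0 or 1."""
    return sum(1 for _, ok in history if ok) / len(history)

# Twenty $1 cooperations followed by one $50 defection:
hist = [(1, True)] * 20 + [(50, False)]
print(simple_rho(hist))                    # ~0.95: looks reputable
print(round(price_weighted_rho(hist), 2))  # 0.29: the big defection shows
```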
3.6.2 Malicious Sellers
Until now, the players have acted selfishly. Selfish sellers are interested only in increasing their utility by raising the price of their goods and lowering their costs. We
will now consider malicious sellers that extract an additional benefit from passing bad
content to buyers that harms them in some way. Malicious users in the real world
include propagators of virus-infected software in order to gain access to machines or
disseminators of falsified copies of documents in order to promote their agenda.
Accounting for malicious motive in the model is difficult, as it is unclear how to
represent the effects of malicious activity in terms of monetary gain or loss. One
possibility is to represent malicious activity by adding an additional payoff term to
both the buyer and the seller: a negative term, −(1−g)m, representing the lost utility
from damage to the buyer, and a positive term, (1 − g)m, for the resulting benefit
the malicious seller derives. We call the coefficient m the maliciousness factor, a new
parameter in our model that relates the amount of bad content provided by a seller
to the damage in utility inflicted on the buyer, as well as the gain in utility to the
seller for causing it. Adding the maliciousness factor of m = $1 to Payoff Table 3.3
we have Table 3.5. Updating the payoff formulas from Section 3.2.1 yields a single-transaction payoff of vg − p − m(1 − g) for buyers and p − cg + m(1 − g) for sellers. Note the social profit remains the same at (v − c)g.

Table 3.5: Payoff matrix for fixed $2-priced goods with valuation $3, cost $1, and maliciousness factor $1

Bundle(S)        Buyer    Seller   Social Profit
[1G : 0B]          1        1           2
[1/2G : 1/2B]     -1        2           1
[0G : 1B]         -3        3           0
[g : (1−g)]     4g − 3    3 − 2g        2g
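These payoff formulas can be checked against the rows of Table 3.5 (a sketch; the function name is ours, the parameter values are those of the table):

```python
def payoffs(g, p=2.0, v=3.0, c=1.0, m=1.0):
    """Single-transaction payoffs with maliciousness factor m, where g
    is the fraction of good content in the bundle."""
    buyer = v * g - p - m * (1 - g)    # 4g - 3 for these parameters
    seller = p - c * g + m * (1 - g)   # 3 - 2g
    return buyer, seller, buyer + seller  # social profit: (v - c)g = 2g

for g in (1.0, 0.5, 0.0):
    print(g, payoffs(g))  # reproduces the rows of Table 3.5
```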
Of course, we are assuming that the benefit derived by a malicious seller for selling
malicious content exactly matches the cost imposed on the buyer. In most situations
the benefit/cost ratio would be imbalanced. For example, a seller that sends a virus-
laden file to a buyer might receive some small temporary joy from this act, but the
buyer may have his hard drive erased and lose years of work in the process.
Because of the uncertainty in modeling malicious activity on a per transaction
basis, we do not employ it in this study. A more sophisticated method for accounting
for the effects of maliciousness is presented in the following chapter.
3.6.3 Costly Signaling
One technique that may complement reputation is costly signaling, especially when
little or no transactional history is available. First proposed by Zahavi [142, 143],
costly signaling is a biological mechanism whereby organisms communicate their
“quality” to potential mates by overtly expending energy or resources as a sign of
their fitness. Since then, it has been explored by many biologists and economists
(e.g. [58] [57] [123]). A real world economic example would be a store that offers free
gifts to anyone who enters in order to entice them to consider additional purchases.
In fact, advertising in general constitutes costly signaling. Applying this concept to
our game would introduce a new cost, say c′, that all sellers must pay each round, re-
gardless of whether they sell a bundle or not. While the additional cost may decrease
sellers’ profits (if not offset by a raised price) it may encourage buyers to trust new
sellers.
Once a seller has obtained a good reputation, the extra cost may be unnecessary.
After a number of successful transactions, a seller may be allowed to waive this cost.
This procedure equates to an entrance fee imposed on newcomers to the system.
Similar techniques are discussed in the following chapters.
Although costly signaling may not be applicable to every transaction scenario, it
can be used in concrete peer-to-peer applications. Costly signaling forms the basis
of effort-balancing protocols that attempt to equalize the computational resources
expended by both parties. Effort-balancing employs artificial puzzles that require
complex CPU-bound or memory-bound functions to solve, but are simple to verify [41,
2]. This technique has been suggested for various applications ranging from email
spam reduction [40], to digital preservation [87]. However, adding artificial resource
burdens in order to guarantee equal effort is not advisable for applications where
speed and efficiency are of the utmost importance.
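The solve-hard/verify-cheap asymmetry these puzzles rely on can be illustrated with a minimal hashcash-style sketch (our own example, not a protocol from this dissertation):

```python
import hashlib
from itertools import count

def solve(challenge: bytes, difficulty: int = 12) -> int:
    """CPU-bound: search for a nonce whose SHA-256 digest, taken over
    the challenge plus nonce, starts with `difficulty` zero bits."""
    for nonce in count():
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") >> (256 - difficulty) == 0:
            return nonce

def verify(challenge: bytes, nonce: int, difficulty: int = 12) -> bool:
    """Cheap: a single hash suffices to check the proof of effort."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - difficulty) == 0

nonce = solve(b"service-request-42")
print(verify(b"service-request-42", nonce))  # True
```

Solving takes on the order of 2^difficulty hash evaluations on average, while verification is always one hash.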
3.7 Conclusion
This chapter presents our initial study of buyer/seller strategies, focusing primarily
on how knowledge of past transaction history affects both buyer and seller strategy.
We proposed a simple game model for transactions with cooperating and defecting
buyers and sellers in a rich spectrum of scenarios. Beginning with basic strategies for
all players, we incrementally improved them until an equilibrium was reached.
We concentrated on the two scenarios we believe to be the most natural, buyers
competing in an auction (MB-1S/VP) and many sellers competing for buyers in a
fixed-price commodities market (1B-MS/FP). It is interesting to note that at equi-
librium players are encouraged to cooperate, realizing the social optimum. In other
words, it does not pay to cheat when reputation is involved.
Chapter 4
Modeling Reputation and Incentives in Online Trade
The previous chapter presented a game theoretic approach for analyzing user strate-
gies when each seller’s transaction history is available. However, the study was limited
to a small number of participants using simple strategies. In addition, we ignored the
initial entry effects: how should new sellers with no history be treated? In this chapter
we approach the issue of reputation in trading systems from a macroeconomic level,
focusing not on individual transactions, but whole system trends and the expected
behavior and performance of different types of peers.
There are many ways a software designer can implement an online trading system.
Each design choice involves trade-offs that are not yet completely clear. For example:
How should peers choose among trusted and untrusted service providers?
How do peers use reputation information to choose who to transact with? A system
must balance avoiding bad peers with giving new peers the opportunity to partici-
pate. Our analysis in this chapter will show that, while the selection method has little
effect on the long-term profit rates of well-behaved agents, it does influence whether
malicious peers profit in the system or not.
Should a peer’s reputation reflect the amount of cooperation or purely
the quality of their cooperation? If the trust strategy measures only the quality
of a peer’s interactions, then malicious peers can gain a high reputation from a few
small transactions and then defect on large transactions. Using two separate scalar
metrics would be pointless because we want all peers ordered relative to each other
within the given context. We will find that calculating trust solely by the proportion
of good work done does not effectively deter misbehavior.
How easy or difficult should it be for a new user to join the system?
Should users pay an entrance fee to join the system as insurance against possible
malicious behavior? How much should we trust new peers? If the initial trust is too
high, then malicious peers can profit, at least in the short run. If initial trust is too
low, good new peers will never be able to contribute and raise their reputation. Given
specific values for the system parameters (e.g. expected payoff from maliciousness)
we can calculate an initial trust so that malicious peers are not expected to profit at
all. If we charge an initial entrance fee, then we can slightly raise the initial trust
so that good peers begin generating profit faster, while malicious peers are still not
expected to generate sufficient profit to outweigh the entrance cost.
Should a user reply to queries from only trusted peers? Peers may decide
whether to respond to a service request based on the reputation of the requestor.
Tying service responses to requestor reputation improves the expected performance
for good peers when the system is highly loaded with requests.
For each of these questions, what are the implications of various solutions in terms
of fairness, profitability and vulnerability to misbehavior?
To address these, and many other, questions we have developed a mathematical model for peer behavior in a trading system that employs per-transaction payments and a reputation system. Note, we are not modeling or analyzing any specific existing mechanism.1 Instead, we strive to develop an abstract model that is simple and general enough to analyze design issues and assist system engineers.
1For an analysis of a realistic reputation system, see Chapter 5.
The following section defines terms and lists our assumptions. Section 4.2 de-
scribes our basic economic system model. In Section 4.3, we present a time-based
analysis of our model. Section 4.6 discusses variations to specific components of the
model, which we then develop into our generalized economic model in Section 4.7.
Finally, we conclude in Section 4.10.
4.1 Assumptions and Definitions
The basic unit of work in a trading system is one party acquiring a resource or
service from another party. We will refer to this as a transaction. A transaction may
or may not involve a transfer of payment in exchange for the resource or service.
To distinguish between when a peer participates in a transaction by providing the
resource (server) or acquiring it (client), we will refer to any transaction a peer serves
as a contribution and any transaction a peer requests as an acquisition. For brevity
we will refer to the goods, services, resources, etc., acquired through one transaction
as a resource.
Our system model is that of a group or network of users or nodes2 that exchange
resources with each other. From now on, we will refer to the users or nodes in the
system generically as peers.
We assume a trusted reputation system collects the results of the transactions
(whether they succeed or fail and the quality or validity of the acquired resource)
in order to calculate a reputation rating for the peers involved. A peer uses these
ratings to determine which of the other peers offering the needed content to contact.
Well-behaved peers will in general prefer to interact with “reputable” peers. Below,
we explain how these assumptions are expressed in our economic model.
2We use node to indicate a user's virtual identity in the exchange system. One user may acquire several system identifiers and thus control several nodes [38]. A node's behavior may also differ from its user's if the user's machine has been compromised (e.g. infected by a virus).
We refer to any peer who has not previously participated in a transaction, and its
identity is unknown to the reputation system, as a stranger [45]. A stranger may be
a newcomer to the system or a whitewasher, a peer who has changed its identity in
order to reenter the system with no history of its past behavior.
4.1.1 Utility
The goal of each peer in a trading system is to increase its “utility” (defined formally
below) by acquiring resources which it values more than they cost to acquire.
In many cases, the utility of an acquisition is completely subjective (e.g. the senti-
mental value of a song purchased from iTunes [8]). Likewise, the cost of contributing
is dependent on the peer. For example, a student in a college dorm may have free
high-speed Internet access, while someone else pays a monthly fee for a fraction of
that bandwidth. We make a couple of assumptions for simplicity:
• The full utility gain or cost of a transaction can be expressed in a general unit
of utility, denoted by the symbol u.
• All peers gain the same utility from an acquisition and suffer the same cost in
utility for a contribution, though some peers are capable of contributing more
than others.
4.1.2 Time
Our model characterizes how a peer’s utility changes over time while participating
in the online trading system. We discuss how a peer’s behavior in a unit of time
influences its utility and choices for the next interval. Thus, we describe our model
in the context of discrete time units. When we refer to a given variable F in two
different units of time, we use F[i] and F[j] to distinguish them. For brevity, we
leave off this notation if all time-varying parameters refer to their values at the end
of the same unit of time.
In Section 4.3, we represent the time-varying parameters with continuous-time
functions. In this case we use parenthetical notation to denote the value of a parameter at a specific time (e.g. F(t)).
4.2 Formal Model
Now, we present a general mathematical model for the behavior of peers in a peer-
to-peer system. This model illustrates the effect of both the incentive scheme and
reputation system on a peer’s decisions whether to contribute positively or not.
In naïve exchange systems, freeriders profited because the amount of services one could request from the system was decoupled from what one contributed. The following equation demonstrates how a particular peer i's profit changes over a given period of time:

P[t] = kvA[t] − kcC[t] − κ    (4.1)

where kvA[t] is the peer's income and kcC[t] + κ is its cost.
We break down the characteristics affecting each peer’s strategy in the system into
the following three parameters: utility, contributive capacity, and acquisition rate.
• Utility (U) is the total value (in utility units u) of all resources available to a
peer, including its monetary wealth. We denote a peer’s total utility when it
enters the trading system as U(0) and assume a peer does not change its utility
through factors external to the system once it joins.
Profit is the amount a peer’s utility changes in a unit of time and is denoted
by P = ∆U . Profit is the utility gained by a peer from using the system (its
income) minus the cost of participation. Factors that increase income include
resources acquired and payments, or other incentives, received. Factors that
increase cost include resources expended (e.g. bandwidth) and payments made
if purchasing resources. Because we are generally more interested in the change
in a peer’s utility from using the system, than its absolute utility, we tend to
use the term “profit” more than “utility”.
• A peer’s contributive capacity (C), in general, indicates the number of trans-
actions it can serve (contribute) in a unit of time. The contributive capacity
takes into account, for example, the rate a peer can answer queries and upload
files. For typical freeriders, C = 0 as they provide no files or services.
For now, we assume a peer’s contributive capacity remains constant in a given
unit of time. Changes to its capacity are directly initiated by the user and not
affected by the system.
For convenience, we bound C to a normalized range of [0, 1] where 1 represents
the maximum contributive capacity of any peer. In a real-world system, while any
peer can lower their C to 0, not all can raise it to 1 (e.g. bandwidth constraints).
• Acquisition Rate (A) is the number of resources a peer can acquire in one
unit of time. Similar to the contributive capacity, we bound A to a normalized
range of [0, 1]. Though not necessarily true in real systems, we assume A is
not dependent on C (or vice versa) due to resource constraints (e.g. same
bandwidth for uploads and downloads).
When dealing with a specific named peer we will subscript the above variables
with the peer’s id (e.g. Ti for peer i trust). The equations we present in this section
all deal with the effects a single peer’s factors (U , T , C, A) have on each other.
Therefore, for brevity, we will leave off the subscripted id, unless specifically referring
to interaction between two distinct peers.
Figure 4.1: Relationship between a peer's profit rate and the number of peers in the network.

In Equation 4.1, kvA represents the peer's income from resources it acquired (e.g. the number of downloads times the value of the downloads to the peer, denoted by the constant kv), while kcC is the cost suffered by the peer for contributing C amount of
resources (kc is the cost of contributing one unit of C). κ is a low fixed cost peers
pay for belonging to the network (e.g. cost of bandwidth per unit time for doing
basic routing, or subscription fee). The equation is maximized when C is zero, or no
contribution is made. Therefore, peers are encouraged to be selfish and freeride in
order to maximize their profit.
We ignore the effect the number of peers participating in the exchange system (n)
has on any single peer’s ability to acquire or contribute resources. For low values of n,
peers may have difficulty locating the resources they need. However, for large enough
n, there is sufficient resource availability that the bottleneck in acquiring resources is
the requesting peer itself. For simplicity, our model assumes the system is operating
in this saturated scenario to the right of the dashed line in Figure 4.1.
4.2.1 Incentive Schemes
Freeriding thrives in the naïve model because the amount a peer is allowed to acquire is independent of its level of contribution. To discourage freeriders,
a trading system will employ some incentive scheme so that the amount of resources
a peer can acquire is directly related to the amount of resources it contributes to the
system. An incentive scheme is a set of rules of behavior, enforced by a centralized
mechanism or a distributed protocol, that encourage peers to contribute to the trading
system in order to increase the utility they gain from the system.
For our model we abstract all incentive schemes as a policy of payments for us-
ing system services (i.e. acquiring resources). This payment is typically earned by
contributing resources to the system.
P = kvA − kpA + kpC − kcC − κ    (4.2)

where kvA is the value of acquiring, kpA the cost of acquiring, kpC the payment for contributing, and kcC the cost of contributing.
In Equation 4.2, peers are paid proportionally to their contribution (kpC). We
can think of kp as the price other peers pay for each normalized unit of contribution
expressed in units of utility (u) in the equation. kc remains the same, the cost a
peer incurs for contributing. For each acquisition, a peer pays a price of kp, but gains
an average utility of kv from it.
We expect peers to be rational and thus only acquire resources whose acquisition
will increase their utility. Thus, we assume that the utility of the resources acquired
is greater than the utility of the price paid (kv > kp) or else the peer would not have
purchased the resource through the system.
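Comparing Eqs. 4.1 and 4.2 for a freerider versus a full contributor makes the effect of the scheme concrete. The following sketch uses assumed parameter values (kv = 3, kp = 2, kc = 1, κ = 0.1, our own choices satisfying kv > kp > kc) and ignores the credit-balance constraints discussed next:

```python
def naive_profit(A, C, kv=3.0, kc=1.0, kappa=0.1):
    """Eq. 4.1: profit per unit of time in the naive system."""
    return kv * A - kc * C - kappa

def incentive_profit(A, C, kv=3.0, kp=2.0, kc=1.0, kappa=0.1):
    """Eq. 4.2: each acquisition costs kp; each unit contributed pays kp."""
    return (kv - kp) * A + (kp - kc) * C - kappa

# Naive system: the freerider (C = 0) out-earns the full contributor.
print(naive_profit(1.0, 0.0), naive_profit(1.0, 1.0))          # 2.9 1.9
# Incentive scheme: contributing strictly pays, since kp > kc.
print(incentive_profit(1.0, 0.0), incentive_profit(1.0, 1.0))  # 0.9 1.9
```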
For purposes of discussion, we will assume the incentive scheme requires each
peer to pay the resource provider for any resource it acquires using a common trading
system currency we call credits, whose generation and distribution is managed by the
system. Using a distinct currency for transactions allows us to cleanly decouple the
resource acquisition and contribution functions and apply the concept of “price” to
each transaction. We also assume a global exchange rate between credits and utility
exists, rx. For instance, if kp = 2 and rx = 10, then a peer with a contributive capacity
of 1 will earn 20 credits per unit of time, worth 2u. A peer with C = 0.5 would make
10 credits worth 1u. However, the two are not necessarily directly exchangeable (as
we shall see in Section 4.2.2).
For simplicity, we will assume that all resources in the system are priced the same
in credits. Later, we will add price variability into our model.
We see in Equation 4.2 that a peer can generate credits by contributing, and that
it can only acquire resources by spending credits. Consequently, a peer’s acquisition
rate appears to be at least loosely related to its contributive capacity, as is the goal of
the incentive scheme. In the next section, we discuss specific methods for tightening
or relaxing the relation between A and C.
4.2.2 Currency Scenarios
The amount of resources a peer can purchase is limited by how many credits it has available.3 We now introduce three scenarios representing different methods of treating
credits and payments in the system. Each scenario will impose its own additional
restrictions to the incentive model.
In the first scenario a peer can purchase resources using its full utility. We assume
peers can freely purchase credits from or sell credits back to the system using some
real-world currency. We further assume all of a peer’s utility can be converted to this
currency and used for purchasing credits in the system. Then the amount of resources
a peer can acquire is limited only by its current total utility, which can be expressed
as
U [t− 1] ≥ kpA[t] (4.3)
where U [t− 1] is a peer’s total utility at the beginning of the unit of time t. Though
the peer gains an additional kpC−kcC of utility during that time, we assume that all
purchases are initiated at the beginning of the interval, simplifying our evaluation.
However, if the system does not allow credits to be purchased nor sold directly, a
peer is limited to using only the credits it has earned from earlier contributions plus
3 For now, we assume peers cannot “borrow” credits from the system.
any additional credits it has saved previously. In the second scenario, peers can only
purchase resources with the credits they have earned and saved from contributions.
Let Si[t] be the current credits saved up by peer i from contributions at the end of
time interval t. We now have the following two equations instead.
S[t− 1] + kpC ≥ kpA[t] (4.4)
∆S = kpC − kpA[t] (4.5)
Equation 4.4 states that the amount of credits a peer pays for resources in a time unit
cannot exceed the number of credits saved up plus what it earned from cooperating
in that same unit of time. Equation 4.5 demonstrates how a peer’s credit balance
(the amount of credits saved up) changes over time.
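The two constraints can be sketched as a budget check plus a balance update; the function names and worked values below are ours:

```python
def spend_limit(S_prev, C, kp):
    """Credits available to spend this interval (the bound in Eq. 4.4):
    savings carried in plus this interval's contribution earnings."""
    return S_prev + kp * C

def next_savings(S_prev, C, A, kp):
    """Balance update from Eq. 4.5: the change in savings is kp*C - kp*A."""
    assert kp * A <= spend_limit(S_prev, C, kp), "acquisition exceeds budget"
    return S_prev + kp * C - kp * A

# Contribute a full unit, acquire half a unit, at price kp = 2:
S = next_savings(S_prev=0.0, C=1.0, A=0.5, kp=2.0)  # balance grows to 1.0
```

A peer that acquires less than it contributes accumulates savings; one that tries to acquire beyond Eq. 4.4's bound trips the assertion.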
We have presented two scenarios, depending on whether the incentive scheme
allows credits to be freely purchased or not. We refer to a system governed by Eq. 4.3
as the common currency scenario, while a system following Eq. 4.4 and 4.5 will be
called disjoint currency scenario. “Common currency” refers to the fact that credits
in the system are freely exchangeable with currency outside the system, so that it can
be thought of as a single currency. “Disjoint currency” stresses the fact that credits
can only be earned or spent within the system by providing or acquiring resources.
We still include the credit payments received or made in our utility equation because
they represent the potential utility we would gain by purchasing resources with the
credits.
A third scenario, which will be the most useful for analysis and discussion, assumes that in each time interval a peer spends exactly as many credits as it earned cooperating,
which is expressed by the following equality:
kpC[t] = kpA[t] (4.6)
The disjoint currency scenario discussed before assumes credits cannot be exchanged
to or from the system for real-world currency so credits are only useful within the
system for purchasing resources. Consequently, we expect peers to spend all their
saved credits on resources before leaving the system. Thus, for a sufficiently large
time interval, we expect Equation 4.6 holds for any rational peer in the disjoint
currency scenario. Accordingly, we call this third scenario the long-term disjoint
currency scenario. With this model a peer's credit balance does not change and can be assumed to be 0 between intervals and ignored. Because credits are worthless outside the system, peers will spend all their credits. Therefore, the system does not need to enforce this policy: peers will carry it out in their own self-interest to maximize profit.
Obviously, because kp is a constant, A = C. We can now substitute C for A in
the term for the value of acquired resources in Equation 4.2, kvA, giving us kvC.
We can now simplify Equation 4.2 for the long-term disjoint currency scenario by
cancelling out the equal terms and substituting:
P = kv(kp/kp)C − kpA + kpC − kcC − κ = kvC − kcC − κ (4.7)

(the kpA and kpC terms cancel because A = C)
Constant kv represents the total income gained for each unit contributed, either in
value of resources acquired or credits received but not spent. As long as the incentive
system guarantees that kv > kc peers are motivated to contribute more resources and
not freeride.
4.2.3 Trust
Unfortunately, malicious peers insist on distributing bad content to other peers, prof-
iting from harming the system. To capture this effect we divide the contributive
capacity into good capacity CG and bad capacity CB. The latter includes resources
devoted to disrupting the system. Of course, C = CG +CB. We incorporate the effect
of malicious peers in our profit model from Equation 4.2 in the following formula:
P = πgkvA− kpA+ kmCB + kpC − kcC − κ (4.8)
In Equation 4.8, πg represents the fraction of the nodes in the system that are well-
behaved (not malicious). We would expect this same fraction of requested transac-
tions to complete successfully. Therefore, we only gain utility from a πg fraction of
the resources we purchase. Now that malicious peers are sharing bogus resources a
fraction 1−πg of each peers’ requested transactions will be worthless. This decreases
the utility of acquired resources to πgkvA. In addition to being paid for all their con-
tributed resources, malicious peers gain additional utility from causing damage with
the bad contributions (CB). The parameter km quantifies the additional value bad
nodes gain from harming the system with their bad content. If we assume km > 0 for
malicious peers, then it is in their interest to increase CB to maximize profit, resulting
in C = CB and CG = 0. For well-behaved nodes km = 0 and C = CG.
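A quick numerical reading of Equation 4.8 shows why, absent a reputation system, a peer with km > 0 maximizes profit by making all of its contribution bad. In this sketch kp and κ are illustrative values of ours; the remaining constants follow the defaults in Table 4.1:

```python
def profit_with_malice(A, C_G, C_B, pi_g, kv, kp, kc, km, kappa):
    """Equation 4.8: only a pi_g fraction of acquisitions have value,
    and bad contribution C_B earns the malicious bonus km*C_B."""
    C = C_G + C_B
    return pi_g * kv * A - kp * A + km * C_B + kp * C - kc * C - kappa

# Same total capacity C = 1, all-good versus all-bad profile:
good = profit_with_malice(1.0, 1.0, 0.0, 0.9, 2.0, 1.5, 1.0, 2.0, 0.1)
bad = profit_with_malice(1.0, 0.0, 1.0, 0.9, 2.0, 1.5, 1.0, 2.0, 0.1)
# Without a reputation system the all-bad profile is more profitable.
```

Here the all-bad peer is still paid kp for every unit it contributes, and pockets km*C_B on top.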
Earlier, when we introduced the long-term disjoint currency scenario we saw that
an incentive scheme promotes peers to contribute if it can guarantee that kv > kc.
Now the loss of profit due to bogus resources will lower the expected income from
acquisitions by a factor of 1 − πg. Consequently, if πgkv ≯ kc good peers will be
motivated to stop contributing or leave the network altogether.
For completeness, we may want to account for any additional cost incurred by a
malicious peer for sharing any amount of good resources (CG > 0). We can subtract
an extra term kmgCG from Equation 4.8, representing any unhappiness the malicious
peer may get for contributing useful resources to the system. Because we feel that in
most situations either kmg ' 0 and/or CG ' 0, we will ignore this factor.
Note that Equation 4.8 assumes that resource providers are paid for their resources
before the purchaser is able to verify the validity or value of the resource acquired.
This is expressed by the payments received term (kpC) indicating payment for all
Figure 4.2: Representation of a reputation system's role in a trading network. Transaction observations (∆T) update peer reputations maintained in the trust vector. Reputation information is then used by peers in transactions to improve expected utility (∆U).
contributions, not just good contributions. Likewise, the payments made term (kpA)
denotes that all acquisitions were paid for, while only a fraction πg received were
good. If instead we assumed peers paid after verifying the validity of resources (or
could reliably revoke their payments), then a peer would only be paid for the good
resources it provided (kpCG) and it would only pay for the fraction of resources that
were valid (πgkpA). This difference would yield instead the following equation.
P = πgtkvA− πgtkpA+ kmCB + kpCG − kcC − κ (4.9)
We do not believe we can expect peers to be able to verify the validity of resources
before paying for them. A system of payment revocation (as we have with credit cards)
may be possible, but if the malicious peer has already spent the credits it received,
it would be difficult to exact a currency-based penalty. Therefore, we continue our
economic model based on Equation 4.8.
To combat malicious nodes we deploy reputation systems. Reputation systems
can be abstracted as two separate mechanisms (illustrated in Figure 4.2).
1. A centralized authority or distributed protocol, represented by the eye, tracks
every peer’s positive and negative contributions and modifies that peer’s trust
rating based on their past and present contributions. The structure containing
reputation information for each peer is labelled a trust vector in Fig. 4.2. We
refer to the model of this mechanism as the trust model.
2. Peers access the trust vector to fetch the ratings of peers offering the resource
they desire and take into account the expected risk of bad service (given each
provider’s reputation) when selecting the provider from which to fetch the re-
source. We refer to the representation of the second mechanism as the profit
model.
Modeling the first mechanism depends on how trust is computed, which is system-specific. We therefore need an equation for how a peer's trust rating changes based on its contributions. A peer's trust rating or reputation
may be derived in three ways: from the quantity it contributes, the quality of its
contributions, or a combination of the two. We will ignore the first method: basing reputation purely on quantity is obviously counter-productive, since a malicious peer that contributes a lot of bad work will attain a high reputation. Basing the trust
rating purely on the quality of contributions can be effective but does not completely
capture the value a peer brings to the network. For example, we may consider a peer
that contributes twice as much good work as another peer to be more reputable. In
addition, malicious peers can take advantage of the trust mechanism by providing
good resources on one or two small transactions, then defecting on a larger or more
costly transaction. Therefore, the trust equation we will study combines both quan-
tity and quality measurements. However, we will compare our proposed strategy to
a purely quality-based strategy in Section 4.6.2.
We present one particular “∆T” formula below that exhibits many useful proper-
ties. We discuss these properties in Section 4.3.1 and present a more generalized ∆T
model in Section 4.7.
To utility, contributive capacity and rate of acquisition, we add a fourth parameter
indicating a peer’s reputation:
• Trust (T ) represents the perceived reliability or reputation of a node by its
peers. Trust is quantified by the peer’s rating in the reputation system. De-
pending on the reputation system, the reputation value range may be bounded
or unbounded, but for our model we assume a peer’s trust is between 0 and 1,
with 1 meaning a peer is most reputable. A peer's initial trust rating when it enters the system will be referred to as T(0); we assume it is equal for all newcomers.
The second mechanism must be modeled by the profit equation and must take
into account the effects on utility of both using reputation when choosing resource
providers and a peer’s own reputation on its ability to contribute to the system.
We augment Eq. 4.8 to express one specific way in which peers use trust ratings
to increase their profits (mechanism 2).
P = πgtkvA− kpA+ (kmCB + kpC − kcC)T − κ (4.10)
In Equation 4.10, parameter πgt is the fraction of transactions requested from well-
behaved peers who are likely to reply correctly. Unlike πg in Equation 4.8, πgt must
take into account that a peer will choose to interact with reputable peers, decreasing
the probability of contacting malicious peers. Therefore, we expect πgt > πg. In fact,
πgt is likely to be close to 1.
In Eq. 4.10, a peer’s reputation T affects how much of its system contribution C
is accessed by other peers. Peers are more likely to purchase resources from reputable
peers. As stated earlier, a peer’s trust T is bounded between 0 and 1. To model
the role reputation plays on a peer’s ability to sell resources in Eq. 4.10 we multiply
each term relating to a peer's contribution C by T. Thus, T determines the fraction of the contribution used by other peers, thereby generating credits and,
in the case of malicious nodes, disseminating bad content. Eq. 4.10 presents one
specific relation between trust and the rate of contribution (linearly proportional).
This relation is sufficiently simple to illustrate our intuition and allow us to perform
some interesting analysis. We discuss the generalized form of this relation between
trust and profit in Section 4.7.
If we apply the long-term disjoint currency scenario, we can simplify Equation 4.10.
By applying the same reasoning used to derive Eq. 4.7 from Eq. 4.2 to Eq. 4.10 we
get
P3 = (πgtkvC + kmCB − kcC)T − κ (4.11)
where the subscript 3 indicates this equation corresponds to our third scenario, long-
term disjoint currency. This equation will be useful in our analysis of utility over time
in Section 4.3.2.
Previously, our discussion of long-term disjoint currency indicated that an incen-
tive scheme will promote cooperation if, in general, kv > kc. When we introduced
malicious peers, the dissemination of bogus resources reduced the fraction of good
resources acquired to πg. As noted earlier, well-behaved peers would only be encour-
aged to contribute if πgkv > kc, which may not be the case. Consequently, good peers
would begin leaving the network, further decreasing the probability of locating good
resources, causing the network to collapse. By introducing a reputation system we
expect the probability of acquiring a good resource, πgt, to be close to 1. If so, then
the inequality πgtkv > kc is likely to hold for all good peers in the network, assuming
kv > kc. Once again, the incentive system will encourage cooperation.
We now look at representing the first mechanism of the reputation system: computing trust based on peer behavior. As stated above, a peer's reputation rating is
determined by the quality and/or quantity of positive and negative contributions.
There are many ways by which an actual reputation system expresses a peer’s repu-
tation given their behavior. Here, we present a specific formula for updating a peer’s
trust that is intuitive, maintains T between 0 and 1, and displays characteristics ben-
eficial to reputation systems, discussed in Section 4.3.1. We evaluate other choices in
Section 4.6.2.
∆T = (rgCG(1− T )− rbCBT )T (4.12)
Equation 4.12 demonstrates how trust changes over time. Trust increases with
positive interactions (CG) and decreases with negative interactions (CB). Constants rg and rb indicate the effect each unit of positive or negative contribution, respectively, has on a peer's trust value; both constants lie between 0 and 1. For example, we would most likely want to lower a node's reputation more for each bad file uploaded than we reward it for each good file uploaded. Consequently, we may set rb to 1 and rg to 0.25. We would also expect a peer's reputation to increase less for good behavior as it becomes more reputable. Inversely, we
would want a peer’s reputation to decrease more for bad behavior as their reputation
increases. Equation 4.12 meets these requirements by multiplying the positive factor
by (1− T ) and the negative factor by T .
In addition, both the positive and negative contribution terms are multiplied by
an additional T . Weighing the terms by the peer’s trust models real world behavior
where reputable peers will be more likely to be chosen for transactions, allowing them
more opportunities to increment (or decrement) their trust in the same amount of
time. Once again, this is based on our choice of a linear relation between trust and
contributions accepted.
∆T ∝ T implies that the low (but nonzero) reputation of strangers increases very
slowly at first. If strangers with no reputation or trust have T ' 0, then they will be
unable to gain profit or trust. To correct this problem, we assume a lower bound on
the initial trust (T (0)) of τ > 0, although very small, guaranteeing T will rise over
time.
When T is small, P in Equation 4.10 will be dominated by −κ, so the costs of belonging to the P2P system outweigh its benefits. Additionally, strangers' credit income
rate will be very low, as few peers will purchase resources from them, fearing they will be cheated. To encourage transactions and thus raise their reputation, strangers may have to “discount” the price of their services. A low income rate will severely limit strangers' access to offered resources. We call this the “reputation slow-start”
phase. Few peers may have the patience to suffer this entry cost long enough to gain
sufficient trust to attain positive profit.
A reputation system will likely want trust to decay over time. If a peer earns a
high reputation, it may then stop contributing resources (C → 0) and maintain its
high reputation rating. This might be exploited by malicious peers. For example,
they may watch for reputable peers that are inactive and target them for spoofing, assuming those peers are less likely to notice their identifiers being hijacked. To
model a constant drop in trust we add a decay factor, δ, to Equation 4.12.
∆T = (rgCG(1− T )− rbCBT )T − δT 2 (4.13)
The decay factor is combined with the other negative factor introduced by mali-
cious peers doing harmful work, but is independent of CB. It is multiplied by the trust
because stale reputation is a greater problem for highly reputable peers than for low-reputation peers. The T² term ensures that peers with low trust, such as strangers, are not greatly affected by the decay, regardless of the amount of contributions. We
require that rb + δ ≤ 1, which can be shown to guarantee that 0 ≤ T ≤ 1. An-
other possible solution to constraining the value range of T would have been to use
min/max functions in our formula for ∆T . However, by not resorting to min/max
functions to bound T we can solve for T (t) in Equation 4.13 and express trust as a
continuous function over time, which will be useful in our analysis in Section 4.3.1.
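Because Equation 4.13 avoids min/max clamping, it can also be iterated directly as a discrete map; a minimal sketch (step size of one time unit is our choice, parameter defaults follow Table 4.1):

```python
def trust_step(T, C_G, C_B, rg=0.2, rb=0.99, delta=0.01):
    """One discrete time step of Equation 4.13:
    dT = (rg*C_G*(1 - T) - rb*C_B*T)*T - delta*T^2."""
    return T + (rg * C_G * (1 - T) - rb * C_B * T) * T - delta * T * T

T = 0.01  # tau: the assumed lower bound on initial trust
for _ in range(200):
    T = trust_step(T, C_G=1.0, C_B=0.0)
# T settles where the trust gain balances the decay:
# rg*C_G*(1 - T) = delta*T, i.e. T = 0.2/0.21, about 0.952
```

With rb + δ ≤ 1 the iterate stays within [0, 1], matching the bound argued in the text.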
4.3 Analysis
Using the model presented above, we study the expected behavior of peers in a trading
system which conforms to our model. We illustrate the effects varying the different
parameters yield on the reputation and wealth of individual peers.
4.3.1 Trust Over time
Given our equation for ∆T from Equation 4.13, we can compute a function for how a
peer’s trust changes over time. If we assume the length of the discrete time intervals
approaches 0, and all other parameters stay constant, we can treat Equation 4.13 as
a differential equation. Solving this differential equation gives us
T(t) = rgCG / (rgCG + rbCB + δ + Z·e^(−rgCG·t)), where Z = rgCG/T(0) − (rgCG + rbCB + δ) (4.14)
Notice that T (0) appears in the denominator of the initial condition constant Z.
As we saw earlier, we cannot have T (t′) = 0 for any t′, else ∆T = 0, making T (t) = 0
for all t > t′. Therefore, we limit T ≥ τ . We will use a default τ of 0.01.
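The closed form in Equation 4.14 is easy to evaluate directly; a sketch (function name and defaults are ours, defaults from Table 4.1):

```python
import math

def trust_t(t, C_G, C_B=0.0, T0=0.01, rg=0.2, rb=0.99, delta=0.01):
    """Equation 4.14: T(t) = a / (b + Z*exp(-a*t)), with
    a = rg*C_G, b = rg*C_G + rb*C_B + delta, Z = a/T0 - b."""
    a = rg * C_G
    b = a + rb * C_B + delta
    Z = a / T0 - b
    return a / (b + Z * math.exp(-a * t))

# Sanity checks: at t = 0 the expression collapses to T0, and for large
# t the exponential vanishes, leaving the limit a / b.
```

Plotting this function for several (C_G, C_B) pairs reproduces the curve shapes discussed around Figure 4.3.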
Using this equation, we evaluate the effects of the different parameters on a peer’s
reputation. In our analysis we constrain each parameter to a default value (given in
the first half of Table 4.1) except for the parameter(s) under study.
Figure 4.3(a) shows the progression of trust over time for a new, well-behaved node
at varying amounts of cooperation, CG. As expected, the more a peer contributes, the
faster its reputation grows. This curve illustrates the effect of reputation slow-start.
Eventually, all three curves flatten out at different values of T . The curves reach a
steady-state value when the incremental gain in trust (given the current value of T
and C) equals the drop caused by the decay factor.
Next, we look at how quickly a peer’s trust falls due to misbehavior. We begin
Table 4.1: Trust and Profit Parameters and Default Values

Param.  Description                                 Default Value
C       Contributive capacity                       1
CG      Good content contributed                    C
CB      Bad content contributed                     C − CG
rg      Factor by which CG increases trust          0.2
rb      Factor by which CB decreases trust          0.99
T(0)    Starting trust value                        τ = 0.01
δ       Decay factor                                0.01
πgt     Prob. of acquired resource being good       0.9
kv      Utility gained from acquiring 1 unit        2
kc      Utility cost of contributing 1 unit         1
km      Utility bad peers gain for harming system   2
U(0)    Initial utility of peer at time 0           0
Figure 4.3: A peer's trust rating over time. (a) For different C = CG (1.00, 0.50, 0.25); T(0) = 0.01, δ = 0.01. (b) For different CB (0.99, 0.50, 0.20); C = 1, T(0) = 1, δ = 0.01. (c) For different δ (0.005, 0.010, 0.020); C = CG = 0.01, T(0) = 1.
by assuming a peer has previously cooperated prodigiously and attained a reputation
rating of 1 (T (0) = 1). Then it “turns bad” and, while maintaining a total contributive
capacity of 1 (C = 1), introduces bad content (CB > 0). In Figure 4.3(b) we see how
quickly its reputation falls. The rate at which it decreases, as well as the final level
it reaches, are dependent on the ratio of good to bad content the peer is providing
(CG : CB).
Figure 4.3(c) demonstrates the effects of different decay rates (δ). Once again,
we assume a peer has attained a high reputation in the past (T (0) = 1) and then
significantly drops their level of contribution (C = 0.01, CB = 0). Note the longer t
value range on the x-axis. In all other experiments we use a δ of 0.01, corresponding
to the middle curve.
What we see in all three graphs is that T (t) tends to converge to a different value
depending on the parameter values used. To evaluate the long-term effects of each
parameter we take the limit of T (t) as t tends to infinity.
lim(t→∞) T(t) = T(∞) = rgCG / (rgCG + rbCB + δ) (4.15)
Interestingly, T (∞) is independent of T (0), demonstrating that, regardless of the
current value of T , it will eventually converge to a particular value if parameters do
not change. We demonstrate this in Figures 4.4(a) and 4.4(b) where we plot T (∞)
as a function of C and δ in graphs (a) and (b), respectively. The different curves in
each graph correspond to three different values of rg. CB is set to 0 so the value of rb
is inconsequential. Note that the x-axis in both graphs is in logscale.
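The limiting values follow directly from Equation 4.15; a sketch (the quoted numbers come from the formula and only approximately match values read off the curves in Figure 4.4):

```python
def trust_inf(C_G, C_B=0.0, rg=0.2, rb=0.99, delta=0.01):
    """Steady-state trust from Equation 4.15; independent of T(0)."""
    return rg * C_G / (rg * C_G + rb * C_B + delta)

high = trust_inf(1.0)    # full-capacity good peer: 0.2/0.21, ~0.952
low = trust_inf(0.01)    # low-capacity good peer: 0.002/0.012, ~0.167
```

The gap between `high` and `low` is entirely the work of the decay term δ, which dominates the denominator when rgCG is small.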
Figure 4.4(a) shows the value to which a peer’s trust converges after being in
the system for a long time and has reached a steady-state where T does not change
assuming the peer does not change its behavior. This steady-state value is the max-
imum reputation a peer can attain for a particular contributive capacity, given our
trust model in Equation 4.13. If we consider the curve for rg with default value of 0.2,
Figure 4.4: Convergence of T as t → ∞; note the logscale x-axis; CB = 0 in both. (a) T(∞) as a function of C = CG, for different rg (0.1, 0.2, 0.5); δ = 0.01. (b) T(∞) as a function of δ, for different C = CG (1, 0.1, 0.01).
we see that T (∞) rises almost linearly (in logscale) from 0.2 at C = 0.01 to 0.8 at
C = 0.1. The downward shift of the curves as rg decreases is due to the decay factor.
While trust decays a constant amount each unit of time, the amount trust increases
is determined by rgCG. Therefore, a lower rg means a lower steady-state trust value.
This can be quite low for peers with low contributive capacities. A system designer
needs to balance the desire to have a low rg in order to bias the trust model against
bad behavior, and the desire for a high rg to allow contributive peers to attain a
useful reputation quickly.
We next look at how the decay factor affects T at steady-state. In Figure 4.4(b),
we plot T (∞) versus the decay factor δ, with separate curves for C = CG at 1, 0.1,
and 0.01. The first curve indicates the maximum trust attainable by a good node
contributing at maximum capacity. For δ < 0.001, T(∞) for C = 1 is effectively 1. But for δ > 0.01, the maximum trust quickly falls. Notice that the largest difference in
T (∞) between the three curves occurs around δ = 0.01. If we want the long-term
reputations of peers in our network to reflect mainly the quality of their contributions
(good or bad) and not the quantity, then we would want to use a much smaller decay,
such as 0.0001 where all the curves reach a high value of T (∞). However, if we want
reputation to indicate the amount a peer contributes along with its quality, then
δ = 0.01 seems good.
4.3.2 Utility over Time
To conduct a similar analysis for utility over time, we need to integrate Equation 4.10,
using our Equation 4.14 for T (t) in place of T . The equation for U(t) is given in
Equation 4.16.4
U(t) = (πgtkvA − kpA − κ)t + (kmCB + kpC − kcC) · [ln((rgCG + rbCB + δ)(e^(rgCG·t) − 1)·T(0)/(rgCG) + 1)] / (rgCG + rbCB + δ) + U(0) (4.16)
To simplify Equation 4.16 for analysis, we will consider only the long-term disjoint
currency scenario. Using the formula for P3, given in Equation 4.11, we can derive
an equation for utility over time in this scenario.
U3(t) = (πgtkvC + kmCB − kcC) · [ln((rgCG + rbCB + δ)(e^(rgCG·t) − 1)·T(0)/(rgCG) + 1)] / (rgCG + rbCB + δ) − κt + U(0) (4.17)
We now use Equation 4.17 to plot the profit gained from using the system for
various parameter settings. By default we use the values listed in Table 4.1.
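Equation 4.17 is straightforward to evaluate numerically. In this sketch the defaults follow Table 4.1; κ is not listed there, so κ = 0.01 is our assumption (it is the value consistent with the malicious-peer profitability threshold T > 0.0036 discussed below):

```python
import math

def utility3(t, C_G, C_B=0.0, T0=0.0035, rg=0.2, rb=0.99, delta=0.01,
             pi_gt=0.9, kv=2.0, kc=1.0, km=2.0, kappa=0.01, U0=0.0):
    """Equation 4.17: utility over time in the long-term disjoint
    currency scenario. Defaults follow Table 4.1; kappa is assumed."""
    C = C_G + C_B
    a = rg * C_G
    b = rg * C_G + rb * C_B + delta
    log_term = math.log(b * (math.exp(a * t) - 1) * T0 / a + 1)
    return (pi_gt * kv * C + km * C_B - kc * C) * log_term / b - kappa * t + U0

# A full-capacity good peer gains utility; a mostly-bad peer loses it,
# echoing the curve shapes in Figures 4.5 and 4.6.
u_good = utility3(100.0, C_G=1.0)
u_bad = utility3(1000.0, C_G=0.01, C_B=0.99)
```

Sweeping `t` for several capacities reproduces the flat-then-linear shape discussed next.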
We begin by plotting the utility of a new well-behaved peer joining the system
using all the default parameter values. In Figure 4.5(a) we see the utility over time
for three peers with different contributive capacities. The curves all have the same shape: beginning flat,5 while the peer's reputation is building, then curving up and climbing linearly once the reputation rating has stabilized and the system is in steady-state. What distinguishes the curves is the length of time profit is flat or
4 The derivation is presented in Appendix D.1.
5 In fact, we see the utility curve for C = 0.25 initially dips below 0 before recovering.
Figure 4.5: A peer's utility U3(t) over time; initial trust T(0) = 0.01; higher is better. (a) For different C = CG (1.00, 0.50, 0.25). (b) For different C = CB (0.99, 0.50, 0.25).
negative (determined by the length of time needed for the reputation to rise), and the
slope of the final linear component (determined by the trust value at which the peer
stabilized). Obviously, peers that contribute more will have their trust rise faster
and higher, resulting in greater utility over a given period of time.
Next, in Figure 4.5(b), we observe the utility of three malicious peers, each sharing
primarily bogus resources (C = CB + CG with CG = 0.01; if CG = 0, T is trivially 0). As expected, in the long run malicious peers lose utility since their low trust
ratings prevent them from making contributions and earning credits, while they still
must pay the fixed cost κ. Interestingly, malicious peers with a high CB generate
positive utility in the short run, though a very small amount relative to the amount
well-behaved nodes can generate. But any amount of positive utility would attract
malicious users for short-term gains. In fact, the behavior we see in Figure 4.5(b),
indicates that new malicious peers make a larger profit when first joining than new
good peers. This effect would actually encourage whitewashing, not discourage it.
Why do malicious peers profit when first joining the network? If malicious peers
are generating positive utility in the short run then P must be initially positive due
to the starting value of T (0). By setting Equation 4.11 equal to 0 and solving for
T , we find that a malicious peer with CB = 1 will have positive profits as long as
Figure 4.6: A peer's utility U3(t) over time; initial trust T(0) = 0.0035. (a) For different C = CG (1.00, 0.50, 0.25). (b) For different C = CB (0.99, 0.50, 0.25).
T > 0.0036. Though the malicious peers’ ratings eventually fall below this threshold,
a T (0) of 0.01 allows them to make a small initial profit. By setting T (0) = 0.0035 we
prevent purely bad peers from gaining any positive utility. In Figure 4.6 we present
the same two graphs as in Fig. 4.5, but now with the new, lower T (0) of 0.0035. While
the well-behaved peers are not greatly impacted, having only their slow-start period
extended, the malicious peers begin losing utility from the onset.
Looking at the third curve in Figure 4.6(a) corresponding to C = 0.25, we see
that after the initial slow-start phase, the peer gains utility very slowly. This seems
to indicate that for a low value of C good peers will be unable to attain positive
profit no matter how long they remain in the system. In order to calculate this
threshold capacity value of C at which the steady-state profit rate goes from positive
to negative, we need an equation for the steady-state slope of the utility curves. This
formula is derived by using Equation 4.11 for profit at steady-state trust, or T (∞).
Inserting Eq. 4.15 into Eq. 4.11 we have the following formula for the slope of the
utility curves in steady-state.
P3(∞) = ((πgtkv − kc)C + kmCB)·rgCG / (rgCG + rbCB + δ) − κ (4.18)
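The break-even capacity where P3(∞) = 0 can be found by simple bisection on Equation 4.18. In this sketch κ = 0.01 is our assumption (κ is not listed in Table 4.1); with it the root lands near 0.032, in the same range as the approximate 0.035 quoted for the default parameters:

```python
def p3_inf(C_G, C_B=0.0, rg=0.2, rb=0.99, delta=0.01,
           pi_gt=0.9, kv=2.0, kc=1.0, km=2.0, kappa=0.01):
    """Equation 4.18: steady-state profit rate (kappa = 0.01 assumed)."""
    C = C_G + C_B
    gain = ((pi_gt * kv - kc) * C + km * C_B) * rg * C_G
    return gain / (rg * C_G + rb * C_B + delta) - kappa

# Bisect for the break-even good capacity (C_B = 0); with C_B = 0 the
# profit rate is increasing in C_G, so one sign change exists.
lo, hi = 0.001, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if p3_inf(mid) < 0:
        lo = mid
    else:
        hi = mid
break_even = (lo + hi) / 2
```

Peers below `break_even` pay the fixed cost κ faster than their low steady-state trust lets them earn.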
Figure 4.7: Minimum capacity needed for a good peer to (eventually) generate positive profit (using default πgt, kv, and kc) is approximately 0.035 (for default parameters). (a) Steady-state profit P3(∞) as a function of C = CG. (b) Utility over time of a new peer joining with C = CG near 0.035 (0.02, 0.03, 0.04); T(0) = 0.0035.
We set CB = 0 and plot Equation 4.18 with respect to CG in Figure 4.7(a).
From Figure 4.7(a) we see that the steady-state slope is 0 when CG is approxi-
mately 0.035. Well-behaved peers with a greater contributive capacity than 0.035 are
expected to make a profit from the system in the long-run. Peers that contribute less
will only lose utility by participating as the fixed cost κ will outweigh the small rate
of credits they receive due to their low trust rating. To illustrate this effect we plot
utility over time for three values of C = CG near 0.035 in Figure 4.7(b). Notice that only the curve corresponding to C = 0.04 is capable of generating positive profit, and hence utility, in the long run. However, even for such a peer, the time during which it loses utility before reaching a positive utility gain is very long. If peers are
unwilling to commit themselves to participating in the system for such a long period
of time, then peers with low contributive capacities will be discouraged from participating
at all. Intuitively, designers of real-world systems must address two important
questions:
1. What amount of contributive capacity should be expected of participants in the
system?
2. What fraction of all interested parties are capable of providing and maintaining
that level of cooperation?
The answers to these questions determine the parameters of the system, as well as its
expected size.
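The break-even point can be located numerically by bisection on Equation 4.18. A minimal sketch follows; the numeric defaults are hypothetical placeholders for the Table 4.1 values (not reproduced here), and the steady-state trust factor rg·CG/(rg·CG + rb·CB + δ) is taken directly from the fraction in Equation 4.18, so with the dissertation's actual defaults the zero crossing would land near the 0.035 reported above.

```python
def steady_state_profit(cg, cb=0.0, pi_gt=1.0, kv=1.0, kc=0.1, km=0.5,
                        rg=10.0, rb=10.0, delta=0.05, kappa=0.01):
    """P3(infinity) per Equation 4.18. All numeric defaults here are
    hypothetical placeholders, not the actual Table 4.1 values."""
    trust = rg * cg / (rg * cg + rb * cb + delta)          # steady-state trust
    return ((pi_gt * kv - kc) * (cg + cb) + km * cb) * trust - kappa

def break_even_capacity(lo=1e-4, hi=1.0, iters=60):
    """Bisect for the CG at which a good peer's (CB = 0) steady-state
    profit crosses zero -- the threshold illustrated in Figure 4.7."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if steady_state_profit(mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi
```

Peers with CG below the returned threshold pay the fixed cost κ faster than they can earn credits, matching the behavior of the C = 0.02 and C = 0.03 curves in Figure 4.7(b).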
4.4 Simulation Details
The analytic model introduced in Section 4.2 is a macro-level model
with many simplifications that allow us to analyze and deduce trends about the
system. For example, the equations give us the expected trust and utility of a
typical peer given a certain capacity (CG, CB) assuming a large and varied population
of peers that remain relatively static. In addition, instead of accounting for discrete
transactions, the analytic model assumes continuous work at a rate determined by
the peer’s capacity and reputation.
To test the validity of our analytic model we use a micro-level discrete transactional
model based on our assumptions on capacity and utility pricing. Instead of one
simple formula for the expected trust and utility of a given peer, our transactional
model specifies how a peer chooses which peers to transact with, how transactions
affect each party’s utility, and how each peer’s reputation is updated over time. This
model forms the basis of our experiments, which we compare with the analytic model’s
predicted behaviors. If the simplifications of the analytic model were reasonable, we
would expect the trends observed earlier to be visible in our experiments, even if the
actual values do not match exactly.
Using a turn-based simulator based on our transactional model, we simulate a
peer-to-peer trading system where a large population of N individual peers, each
assigned its own capacity (CG and CB), exchange resources. In each turn, for each peer
p, the system randomly chooses a subset R of the entire population N , representing
peers who respond to p’s resource request. Using the current reputation ratings, p then
selects one (or more) peer r from R and purchases resources from it. This exchange
is represented by a transfer of credits from p to r, a deduction in r’s contributive
capacity C for the rest of the turn, and a change in p and r’s utility (depending on
the amount and type of capacity used). A centralized reputation system updates
each peer’s trust ratings T at the end of the turn based on the amount and type
of capacity contributed during that turn. The algorithm executed for each turn is
presented below.
Algorithm 1 DoTurn()
  for each peer p ∈ N (in random order) do
    select NumResponders peers
    put selected peers with C > 0 in set R
    count ← 0
    while p.credits > 0 AND |R| > 0 AND count < MaxTransactionsPerPeer do
      use Selector to choose responder r ∈ R
      p acquires as much capacity from r as possible (min(p.credits, r.remainingC))
      count++
    end while
  end for
  for each peer p ∈ N do
    use TrustFunction to update p.T based on the amount of p.CG and p.CB contributed during the turn
  end for
In Algorithm 1, the parameter NumResponders (NR) determines how many peers
are randomly selected by the simulator as possible contributors (size of R), mimicking
a subset of the peers responding to p’s resource request to offer their services. The
requesting peer then chooses one or more responders (but at most MaxTransactionsPerPeer
(MTPP) responders) to fulfill its need. We denote the total number of turns
simulated as NumTurns. A list of all simulation-specific parameters and their default
values is presented in Table 4.2.
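Algorithm 1 can be sketched in executable form as below. This is a shape sketch rather than the dissertation's actual simulator: the Peer class, the credit transfer, and the trust-update callback are simplified placeholders, and the utility bookkeeping is omitted.

```python
import random

class Peer:
    """Minimal peer state for the turn loop (a simplified placeholder)."""
    def __init__(self, cg, cb, credits=1.0, trust=0.01):
        self.CG, self.CB = cg, cb
        self.credits, self.T = credits, trust
        self.remaining = cg + cb          # contributive capacity left this turn

def do_turn(peers, selector, trust_update, num_responders=25, mtpp=1):
    """One simulator turn following the structure of Algorithm 1."""
    order = list(peers)
    random.shuffle(order)                 # peers act in random order
    for p in order:
        # Select NumResponders peers; keep those with capacity remaining.
        sample = random.sample(peers, min(num_responders, len(peers)))
        responders = [r for r in sample if r is not p and r.remaining > 0]
        count = 0
        while p.credits > 0 and responders and count < mtpp:
            r = selector(responders)      # e.g., the Polynomial Selector
            amount = min(p.credits, r.remaining)  # acquire as much as possible
            p.credits -= amount
            r.credits += amount
            r.remaining -= amount
            responders.remove(r)
            count += 1
    for p in peers:
        trust_update(p)                   # e.g., the differential model (Eq. 4.13)
        p.remaining = p.CG + p.CB         # capacity replenishes for the next turn
```

Here the requester simply pays credits equal to the capacity acquired; the text's kp pricing factor and the utility terms would slot into the transfer step.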
Table 4.2: Simulation Parameters and Default Values

    Parameter                       Default Value
    N                               500
    NumTurns                        200
    NumResponders (NR)              25
    MaxTransactionsPerPeer (MTPP)   1, 2
    Selector                        Polynomial
    Exponent (E)                    1 (σ = T)
    TrustFunction                   Differential
    Initial Credits C0              1

The Selector represents the selection function σ used by each peer to select which
resource provider to interact with, given the set of responders R. The Selector we
use in our experiments is a Polynomial Selector. Given a set of responders, each
responder is weighted by its trust rating raised to an exponent value, T E. The
Polynomial Selector then probabilistically chooses a responder given their weights.
Therefore, the relative probability that peer i is chosen over peer j is
p(X = i) / p(X = j) = (Ti)^E / (Tj)^E        (4.19)
For example, if E = 1, a peer with T = 0.6 is twice as likely to be chosen as one with
T = 0.3. However, if E = 0, then all peers are equally likely to be chosen, regardless
of their reputation.
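A minimal sketch of the Polynomial Selector, assuming trust ratings are held in a dictionary keyed by responder (the simulator's internal representation is not shown in the text):

```python
import random

def polynomial_selector(responders, trusts, E=1):
    """Choose a responder with probability proportional to T**E (Eq. 4.19)."""
    weights = [trusts[r] ** E for r in responders]
    if sum(weights) == 0:          # every responder has zero trust: pick uniformly
        return random.choice(responders)
    return random.choices(responders, weights=weights, k=1)[0]
```

With E = 1 this reproduces the example above: a peer with T = 0.6 is selected twice as often as one with T = 0.3, while E = 0 makes every weight 1 and the choice uniform.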
The TrustFunction implements the trust model used for updating trust. For our
experiments we look at both the differential and ratio trust models (discussed in
Sec. 4.6.2) using the default parameter values specified in Table 4.1. Unless other-
wise specified, assume our experiment used the differential trust model as defined in
Equation 4.13 to update each peer’s trust at the end of each turn.
In each experiment, we begin the simulation with N peers with varied amounts
of CG and CB. We conducted two types of experiments: population-focused and
individual-focused.
For our population-focused experiments we are interested in the behavior demonstrated
by all peers in the system after a given number of turns.
Individual-focused experiments, however, allow us to track the behavior of a single
peer over time during the experiment. For these experiments we will focus on a
peer entering an existing active trading network. Therefore, we insert N peers with
varying CG and CB values and run the experiment for some number of turns so that
these peers’ trust ratings have stabilized. We then insert a new peer with specific
parameters we are interested in testing. All single-peer results are for this new peer,
which we will refer to as p∗. All time-dependent graphs will plot the results of p∗
beginning at the time it was inserted into the experiment and continue until the end
of the simulation.
4.5 Simulation Results
We begin by examining the trends visible in the base population of 500 peers
itself. Afterwards, we will focus on the single-peer experiments, where a new peer is
inserted into the system after the base population has stabilized. In all experiments
presented here we set NumResponders (NR) to 25, equating to 5% of the population
responding to each service request. We begin by setting MaxTransactionsPerPeer
(MTPP) to 1, allowing each peer to make one transaction per turn.
4.5.1 Base Population
The default base population we use in our experiments consists of N=500 peers with
varied capacity distribution. We classify 70% as “good” peers. These peers all have
CB = 0 and their CG is distributed linearly from 0.01 to 1. The other 30% are the
“bad” peers (CB > 0). Their total capacity C is distributed linearly from 0.1 to 1.
Additionally, the fraction of C devoted to CB is distributed linearly from 0.5 to 1.
This allows us to have a uniform distribution of both capacity and proportion of bad
work to good work.

[Figure 4.8: Capacity distribution for base population.]

The capacity distribution for our base population is illustrated in
Figure 4.8, where the first 150 peers are the bad peers and the rest are the good peers
(sorted in increasing capacity). The different tones distinguish the good and the bad
capacities. For example, peer 75 has a total contributive capacity of C = 0.55, of
which only one quarter is good capacity (CG = 0.14, CB = 0.41).
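The construction just described can be sketched as follows. This is a hypothetical reconstruction that assumes plain linear interpolation over peer indices, but it reproduces the peer-75 example above (C ≈ 0.55, CG ≈ 0.14, CB ≈ 0.41).

```python
def base_population(n=500, bad_fraction=0.30):
    """Build the default base population: the first 30% of peers are bad
    (CB > 0), the rest good (CB = 0), with linearly interpolated capacities."""
    n_bad = int(n * bad_fraction)                    # peers 0..149 are bad
    pop = []
    for i in range(n_bad):
        c = 0.1 + 0.9 * i / (n_bad - 1)              # total capacity: 0.1 -> 1
        f = 0.5 + 0.5 * i / (n_bad - 1)              # fraction bad:  0.5 -> 1
        pop.append({"CG": c * (1 - f), "CB": c * f})
    for j in range(n - n_bad):
        cg = 0.01 + 0.99 * j / (n - n_bad - 1)       # good capacity: 0.01 -> 1
        pop.append({"CG": cg, "CB": 0.0})
    return pop
```

For example, base_population()[75] has total capacity ≈ 0.55 with CG ≈ 0.14 and CB ≈ 0.41, matching Figure 4.8.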
In Figures 4.9(a) and 4.9(b) we have a snapshot of the trust and utility (respectively)
values of the base population after 200 turns. As we discuss later, turn 200
is when we inject peer p∗ in the single-peer experiments. The values are ordered by
peer ids and so correspond directly to the capacities displayed in Fig. 4.8. In these
graphs, the different colors distinguish the 150 bad peers from the good peers. The
black curve indicates the expected values predicted by the analytic model.
As we can see in Figure 4.9(a), the experimental results are surprisingly close
to the analytic model’s prediction. Overall, the simulation values are less than the
predicted values. We believe this is due to the nature of the simulation. By limiting
the number of responders as well as the number of transactions each peer is allowed
per turn, some of the peers’ capacities are not fully utilized, while other peers are left
with a surplus of credits they were unable to spend. This hypothesis is supported by
the utility graph in Fig. 4.9(b).

[Figure 4.9: Trust and utility values for default population after 200 turns. (a) Trust. (b) Utility.]

[Figure 4.10: Distribution of credits in base population at turn 200.]

Once again, the experimental results are relatively
close to the predicted curve, but the predicted curve is steeper. Peers 0-300 appear to
have generated more utility than expected, while high-capacity good peers generated
less than expected. Malicious peers especially generated greater than expected utility.
Malicious peers receive an initial bonus because all peers start at the same time
with the same initial trust rating, allowing the malicious peers to contribute more
capacity at first, while trust ratings are still close together. As we shall see below,
this artificial bonus does not persist over time.
Other factors also contribute to the higher than expected utility for low-capacity
peers and the lower than expected utility for high-capacity peers. Analyzing the credit
distribution, shown in Figure 4.10, reveals that indeed the high-capacity peers maintain
a surplus of credits throughout the simulation run. This surplus is wasted utility
that could have purchased resources or services that increase the utility of the
purchasing peer. In addition, a smaller fraction of high-capacity peers' contributions
are being utilized each turn due to the limited number of responders (NR) and
transactions per peer (MTPP). Because the analytic model is based on continuous,
infinitely small transactions from a very large selection population, it predicts greater
utilization of high-capacity peers. This behavior, induced by the continuous-time
approach of our analytic model, was one of our main concerns that we wished to evaluate with
the transactional model. Fortunately, the results indicate the deviation is reasonably
small.
If we continue simulating our base population for an additional 800 turns (until
turn 1000), we see little difference in peer trust ratings from what we see after only
200 turns (Fig. 4.11(a) vs. Fig. 4.9(a)). The utility graph in Figure 4.11(b) shows a
larger difference between the utility gained by bad peers and those gained by good
peers than in the earlier snapshot (Fig. 4.9(b)), closely matching the predicted curve.
In fact, the utility of bad peers now closely matches the predicted curve, indicating
that their bad contributions have been mitigated, but not eliminated.
If we take a closer look at peers 100 through 200 (Fig. 4.11(c)), which are the bad
peers with high CB and the good peers with low C, we see that many have negative
utility. These peers are unable to contribute enough good capacity to gain sufficient
trust to stay competitive. Therefore, their trust ratings fall to 0 and they are ignored
by all peers. Consequently, the fixed cost κ is the only factor affecting their utility
and, as predicted, their utility falls below 0.
As mentioned earlier, the base population was chosen to represent different amounts
of capacity, both good and bad.

[Figure 4.11: Trust and utility for base population after 1000 turns. (a) Trust for all peers. (b) Utility for all peers. (c) Utility of peers 100-200.]

If we compare the analytic curves and simulation
results in Figure 4.11 for both trust and utility to the base capacity graph in
Figure 4.8, we can determine what the dominant factors are. For good peers, capacity
has little effect on trust unless the capacity is less than 0.1, as we shall discuss later.
The utility, on the other hand, is proportional to a peer’s trust times its capacity.
Consequently, given a relatively flat trust curve (for C > 0.1) and a linear capacity,
the result is a linear utility curve.
Bad peers, however, are more interesting. Here we have several possible factors
influencing trust and utility: total capacity, good capacity, bad capacity, and the ratio
between the last two. In Figure 4.11(a), we see that trust for bad peers is clearly a
linear curve sloping downwards. This indicates that it is proportional to the fraction
of total capacity that is good. Remember that the base population was constructed
with CG/C = 0.5 for peer 0, CG/C = 0 for peer 149, and linearly interpolated for peers
in between the two. Considering the utility graph in Figure 4.11(b) we see that the
predicted utility curve appears similar to the amount of good capacity on bad peers
as shown in Figure 4.8. This is expected, if utility is proportional to trust times total
capacity. Since trust is proportional to the fraction of total capacity that is good,
then, multiplying by each peer’s total capacity, utility would be proportional to the
amount of good capacity.
In another experiment, not presented in any graphs, we varied the model parameter
kp that determines the amount of credits charged per unit of contribution. As
expected, varying kp had no noticeable effect on the trust or utility values attained
by the base population. kp is merely a currency exchange rate and is equally applied
when receiving credits for contributions and when giving credits for acquisitions. This
experiment supports our simplification of profit in Equation 4.2 by cancelling out the
kp terms when applying the long-term disjoint currency scenario, resulting in Equation 4.7.
[Figure 4.12: Trust and utility for NR=400 after 1000 turns. (a) Trust. (b) Utility.]
4.5.2 NR and MTPP
The simulator adds several parameters not included in the analytic model. Foremost
among them are NR, the number of responders a peer has to choose from each turn,
and MTPP, the maximum number of those responders a peer may transact with
in one turn. In the following experiments we modify the value of each parameter
independently and observe its effect on the trust and utility of the peers in the base
population after 1000 turns.
First, we experimented with different NumResponders values. Remember that NR
determines the number of peers that are uniformly chosen at random by the simulator
to represent service providers responding to a peer’s request for service. From these
NR responders a peer selects one to transact with using the selection function, in
this case weighting each peer by its trust. Consequently, the larger the value
of NR, the more effect the responders' trust values have on which peer is selected,
resulting in relatively more transactions with highly trusted peers. Conversely, a small
NR means the initial uniform random selection has a larger impact on which peer is
selected for a transaction. With a small NR, therefore, we would expect low-trust peers
to participate in nearly as many transactions as high-trust peers, assuming equal total capacity.
[Figure 4.13: Trust and utility for NR=1 after 1000 turns. (a) Trust. (b) Utility.]
Looking at the graphs in Figures 4.12 and 4.13, we see exactly the predicted effect.
Comparing the trust and utility graphs for NR=400 (Fig. 4.12) with the default
NR=25 in Figure 4.11 we see the graphs are quite similar. On closer inspection we
notice that the utility and trust values for malicious peers are overall lower for
NR=400, while the trust and utility values for high-capacity good peers are slightly
higher. As expected, this effect is due to a slightly higher number of contributing
transactions for high trust peers versus low trust peers in the NR=400 experiment.
Now, if we compare the default case of NR=25 to NR=1 in Figure 4.13 we see
a much more striking difference. With NR=1 the selection function does not have
multiple peers to compare and so is unused. All transaction providers, therefore, are
chosen purely at random and so the number of transactions a peer contributes to
will be based solely on its total capacity. Now, the bad peers have a clear advantage.
Due to their high capacities, the fact that they gain additional utility for contributing
bad content (recall that bad peers gain a bonus of km·CB utility; see Sec. 4.2.3),
and the loss in utility by good peers for purchasing more bad content,
bad peers generate much more utility than the good peers. Even in the trust graph
in Figure 4.13(a) we see that the trust ratings for the bad peers are closer to the
predicted curve than before, while high-capacity good peers have seen their trust
values fall slightly, all due to the even distribution of transactions.
Clearly, if a peer is limited to choosing between one or two service providers, it is at
greater risk of being cheated by a malicious peer. But once a sufficient number of
responders are available, increasing the response set further provides no appreciable
advantage in avoiding bad peers. Therefore, system designers should strive to increase
accessibility and replication of rare content and resources, but can sacrifice a high level
of recall for popular items.
Next we will focus on the parameter MTPP, which specifies how many transactions
a peer can initiate in one turn. Increasing MTPP will increase the overall
number of transactions, resulting in faster, more accurate trust convergence. Also,
the total utility will increase as peers receive additional transactions. However, the
total available capacity does not change; therefore, the increased utility must come
from peers that were previously underutilized at a lower MTPP. While this provides
some benefit to low trust good peers, such as new peers entering the network, bad
peers will benefit most as they tend to be the most underutilized because of their
high capacity but low reputation. In fact, the purpose of reputation systems is to
keep malicious peers underutilized as much as possible.
If we study Figure 4.9, we notice that there are a few gaps in both the trust and
utility bar graphs, indicating peers with a trust rating of 0. These are peers that were
never chosen to contribute by other peers and thus their trust falls to zero through
decay. The decay factor δ in the trust model slowly decreases the trust rating of these
unutilized peers, which in turn further decreases their likelihood of being chosen in the
future. Once again, this is a factor of the granularity and number of transactions as
compared to the analytic model.
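This decay dynamic can be illustrated with a toy update rule. The functional form and constants below are invented stand-ins, not the actual differential model of Equation 4.13; they only show how a never-selected peer's trust drains to zero.

```python
def update_trust(T, good, bad, rg=0.1, rb=0.2, delta=0.05):
    """Toy differential-style trust update: reward good contribution,
    punish bad contribution, and decay existing trust by a factor delta.
    The constants and exact form are illustrative, not Eq. 4.13."""
    return max(0.0, min(1.0, T + rg * good - rb * bad - delta * T))

# A peer that is never selected contributes nothing, so its trust decays
# geometrically toward zero, making future selection ever less likely:
T = 0.5
for _ in range(200):
    T = update_trust(T, good=0.0, bad=0.0)
```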
To improve overall system utilization we will allow a peer to initiate a second
transaction if they have credits remaining after the first acquisition (MTPP=2). In
Figure 4.14 we see the trust and utility values for the base population after 1000
turns when MTPP=2.

[Figure 4.14: Trust and utility for MTPP=2 after 1000 turns. (a) Trust. (b) Utility.]

Comparing Fig. 4.14(a) to Fig. 4.11(a) we notice that the
trust values match the curve predicted by the analytic model much more closely for
MTPP=2 than for MTPP=1. The higher number of transactions per turn simply gives the
reputation system more data on which to compute a peer’s trust rating.
Setting MTPP to 2 does, however, produce three additional effects that cause it to
deviate from the analytic model, at least with respect to the utility graph. As we see in
Figure 4.14(b) the utility values are higher for all peers, regardless of amount or type
of capacity. This overall increase in utility is primarily due to the increased number
of transactions, which translates into increased utility through more acquisitions.
Additionally, the increase in transactions results in peers contributing more, giving
the reputation system more information on which to update the peers’ trust values,
which means their trust values converge and stabilize faster than before. For most
good peers, which stabilize at a high trust rating, faster convergence allows their
trust rating to rise quickly, giving them an earlier advantage over low trust peers. In
the graph of Figure 4.14(b), this effect is demonstrated by the fact that the slope of
the utility curve for the good peers in the simulation is steeper than in the MTPP=1
graph (Fig. 4.11(b)). In fact, unlike in the previous results, the slope now matches
the slope of the predicted curve.
[Figure 4.15: Utility for MTPP=3 after 1000 turns.]
Finally, we see that though all peers show an increase in utility, the bad peers
exhibit a much larger increase in utility relative to the good peers. This is because,
in the latter part of processing peer transactions during one turn, much of the
remaining capacity is likely to be on bad peers, since they were less likely to be
chosen earlier than good peers. Consequently, peers that complete transactions late
in the turn are often forced to transact with malicious peers as they are the only
peers with remaining capacity. Allowing peers additional transactions each turn only
amplifies this effect. By increasing MTPP we do somewhat improve the capacity
utilization of good peers, but we increase the utilization of bad peers even more.
If we raise MTPP further, this last effect begins to dominate the results. In
Figure 4.15, we see that MTPP=3 clearly benefits the malicious peers. Though the
trust graph is identical to that for the MTPP=2 scenario (Fig. 4.14(a)), the malicious
peers now generate much more utility than the good peers. In addition, the highest
capacity bad peers, which have the lowest trust ratings, now generate the most utility.
These peers are not being selected for transactions because of their reputation, but
only because they are the only responders with remaining capacity. Allowing peers
to perform three transactions per turn greatly increases the chances of choosing a
malicious peer from the response set, especially near the end of a simulation turn
when most good peers have contributed their capacity.
Intuitively, even in real-world systems raising the total number of transactions
is likely to saturate the capacity of trusted peers, resulting in a larger fraction of
malicious peers in the response set. However, the pronounced difference in utility
caused by slightly changing MTPP is an artifact of our turn-based simulator, which
amplifies the effect. This situation can be improved by utilizing a selection threshold
that limits the providers a peer will consider acquiring from to those whose reputation
lies above a certain threshold (see Sec. 5.6.1). Responders with reputations lower than
the threshold are simply ignored.
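Such a threshold filter (the mechanism of Sec. 5.6.1) can be layered on top of the trust-weighted selection; the cutoff value of 0.2 below is purely illustrative.

```python
import random

def threshold_selector(responders, trusts, threshold=0.2, E=1):
    """Drop responders whose reputation is below the threshold, then select
    among the rest by trust weight (T**E). Returns None when no responder
    qualifies, in which case the requester simply skips the transaction."""
    eligible = [r for r in responders if trusts[r] >= threshold]
    if not eligible:
        return None
    weights = [trusts[r] ** E for r in eligible]
    return random.choices(eligible, weights=weights, k=1)[0]
```

Under this rule, end-of-turn requesters are no longer forced into transactions with low-reputation peers merely because those peers have capacity left.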
4.5.3 Trust vs Capacity
In Section 4.3.1 we discussed the convergence of trust over time with respect to
capacity and presented a graph of the predicted T (∞) as a function of capacity in
Figure 4.16. For comparison, we simulated a population of 500 peers with C ranging
from 0.0001 to 1 and CB = 0. The simulation was run for 1000 turns and the
resulting trust ratings recorded. Figure 4.16 shows the results of two simulations,
one for MTPP=1 and one for MTPP=2, along with the corresponding analytic curve.
Once again, we see a significant similarity between the two simulation curves and the
predicted curve, especially for C < 0.001 and C > 0.1. Where the curves differ is in
the behavior of peers with C between 0.001 and 0.1. Both simulation curves sharply
fall off as C drops below 0.1, while the predicted curve tapers off more gradually. The
simulation produced a steeper curve, showing a larger range of capacities that result
in low trust ratings. For example, at C = 0.01 the analytic model predicts a T (∞)
of 0.2, yet the simulation produced trust ratings around 0.01 when MTPP=2 and 0
when MTPP=1.
Once again, this sharp drop-off is due to the different granularity of both time
and number of transactions between the continuous analytic model and the discrete
transactional model used in our experiments.

[Figure 4.16: Comparing the analytical and simulation results for the convergence of T as t → ∞, as a function of C = CG. Note the log-scale x-axis.]

Remember that peers cannot contribute
more than their capacity, but can contribute less, and not all credits are spent each
turn, underutilizing the system. Consequently, peers with little capacity are not likely
to be selected at all and their trust would decay to 0. In a continuous model they
would contribute, but simply to a small degree. In fact, this underutilization appears
to account for the simulation curve being slightly lower than the predicted curve, even
for high capacity peers.
As we saw earlier, increasing MTPP from 1 to 2 improves system utilization
and the likelihood of low-capacity peers being able to contribute, thus
improving their trust ratings. We see this improvement in the graph by the fact that
the MTPP=2 curve is higher than the MTPP=1 curve, closely hugging the predicted
curve, and does not drop quite as sharply as the MTPP=1 curve. The result is that
peers with capacity between 0.01 and 0.03 were ignored in the MTPP=1 experiments,
but were able to participate and trade with other peers in the MTPP=2 experiments.
4.5.4 Single-Peer Experiments
[Figure 4.17: Comparing the analytical and simulation results of trust over time for new good peers (C = 1.00 and C = 0.25). (a) MTPP=1. (b) MTPP=2.]

After studying the effects on our base population with its diverse distribution of
capacity, we now look at injecting a new peer p∗ into a “warm” trading system,
where peers in our base population have been transacting with each other for 200
turns and their trust ratings have stabilized. We then monitor p∗’s trust and utility
over 1000 turns. Typically, p∗’s trust rating converges to a stable value much earlier
than time 1000. Therefore, most result graphs will focus on the first 200 to 500
turns after p∗ is inserted. These experiments better mimic the situation a typical new
peer would encounter joining an established real-world trading system. To minimize
artifacts of randomness we repeat the simulation with 50 seeds and average the 50
resulting data sets.
In the following figures we present various graphs for simulation experiments
matching the analysis from Section 4.3. In each graph we present the experimen-
tal result curve and the the corresponding analytic model curve originally presented
in Figures 4.3 and 4.5.
In our first single-peer experiment we focus on the entry of a good peer. Thus, we
simulate the scenario presented in the analysis surrounding Figure 4.3, inserting both
a high-capacity peer with C = CG = 1 and a lower-capacity peer with C = CG = 0.25.
The results of running our experiment with an MTPP value of 1 are presented in
Figure 4.17(a). We see a large difference between the expected behavior and the
simulation results. Not only did it take longer for the trust values to converge, but
the final value is much lower than expected. For example, when C = 1, p∗ appears
to attain a trust rating of only 0.62, while we had predicted 0.95.
Upon closer evaluation of the experiment we found the reason for this discrepancy.
Remember that each experiment is an average of 50 separate simulation runs. In each
of those simulations p∗ would begin with the low initial trust value of 0.01. After some
turns, with its trust rating gradually decaying, p∗ would be chosen to contribute and
its trust rating would quickly climb until it matched the predicted stable value.
However, there was a large variance in the time it took for p∗ to first be chosen to
contribute. In fact, in some runs p∗ was never selected and its trust would decay to 0.
Therefore, we feel that presenting an average curve in this situation is not representative
of the actual performance.
Previously, we had seen that underutilized peers performed better when MTPP
was increased, improving the probability of their being selected. So we performed the
same experiment, except with MTPP=2. As Figure 4.17(b) shows, the results were
much closer to what we expected. Overall, the simulation and analytic curves appear
quite similar. In the trust graphs both the simulation and analytic curves converge
to the same value for each of the tested capacities. For instance, in Figure 4.17(b),
which depicts two new good peers (C = 1 and C = 0.25) entering the network, both
curves converge to 0.95 for C = 1 and 0.84 for C = 0.25. The primary difference
between the simulation and predicted results appears to be the rate of convergence.
Trust in the simulation converges faster than in the analytic model. In Figure 4.17(b)
we see that the simulation exhibits only a little slow-start behavior before increasing
linearly until near the convergence point. This reduced slow-start period is due
to the additional transaction allowed by MTPP=2. We noticed the same behavior
earlier in the base population studies.
In Figure 4.18 we simulated malicious peers with C = 1 and CB of 0.99 and 0.5.
[Figure 4.18: Comparing the analytical and simulation results of trust over time for new bad peers (CB = 0.99 and CB = 0.50). MTPP=1.]
[Figure 4.19: Comparing utility over time for new good peers (C = 1.00 and C = 0.25). MTPP=2.]
Here, parameter MTPP was set to 1. Notice that the simulation results mimic the
predicted curve exactly. When performed with MTPP=2, the simulation converges to the
same value as the analytic model, though, as before, the simulation converges faster
than with MTPP=1.
Figure 4.19 compares the experimental results to the expected behavior for the
same two good peers whose trust is presented in Figure 4.17(b). Due to our experience
with the new good peer trust experiments using MTPP=1, we only present the results
of the new good peer utility experiments with MTPP=2. As expected by our analysis
in Section 4.3, once the trust ratings have converged, the slope of the utility curve (the
profit rate) remains constant, matching the slope predicted by the analytic model.
114 CHAPTER 4. MODELING REPUTATION AND INCENTIVES
Figure 4.20: Comparing utility over time for a new bad peer. MTPP=1.
Because the simulation trust curves exhibited less slow-start than the analytic model,
we see that the corresponding utility curves also experience less slow-start, resulting
in an upward shift compared to the predicted curves.
Finally, in Figure 4.20 we simulate two new bad peers entering the population.
Both have minimal good capacity (CG = 0.01). One peer was a high capacity bad peer
with CB = 0.99, the other had less capacity (CB = 0.5). As the results indicate, their
capacities made no difference. Both performed almost identically. Both initially gain
utility, primarily from spending the initial credits we allot each peer at the start. But
quickly, their utility levels off at approximately 1.5, the same peak utility predicted
by the analytic model. The simulation, however, peaks much earlier around 75 turns,
while the analytic curves reach their maxima at approximately 150 or 300 turns,
depending on the peer’s capacity. In the simulation, the amount of bad capacity does
not seem to matter. The reason is that the difference between good and bad capacity for
both peers is so large that both peers earn near 0 trust ratings. While a low trust
rating is sufficient to maintain a small level of contribution in the continuous analytic
model, in the transactional model which already suffers from underutilization, these
almost 0 trust malicious peers are ignored by the rest of the population and never
selected.
From the experiments it appears that our macro-level model performs quite well
given our base assumptions. We have tested it with a varied distribution of peer
capacities and the results are highly correlated to the predicted values. The largest
variances appear to be due to the discrete nature of the simulator. We would expect
actual real-world systems to exhibit behavior between the analytic and transactional
model. While a real system would not suffer the artifacts of discrete turns, neither
would its peers engage in infinitely many, infinitely small transactions, as assumed by
the continuous functions we developed in our analysis of our economic model.
4.6 Variations on the Model
In this section we present variations of our trust/incentive model. Specifically, we will
analyze the effects of modifying particular components of Equations 4.10 and 4.13.
4.6.1 Profit Trust Factor
In Section 4.2.3 we presented one straightforward method in which a peer’s trust
influences its profit. We assumed that the probability of a peer being chosen for
a transaction by another peer is linearly proportional to its trust value. Thus, in
Equation 4.10, we multiply each term related to contributed transactions (cost and
payment received) by T . There are other ways in which a peer’s trust could relate to
their profit rate. For example, a peer may be four times as likely to choose a service
provider that has twice the reputation of another. In this case, the relationship would
be quadratic, not linear, and so we would multiply the contribution by T 2, not T .
We refer to this profit trust factor term as σ. Equation 4.20 illustrates how the profit
trust factor appears in our model of profit in the long-term disjoint currency scenario
(compare to Eq. 4.11).
P = (πgtkvC + kmCB − kcC)σ − κ (4.20)
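As a quick sanity check on Equation 4.20, the profit rate can be evaluated for the candidate profit trust factors compared below. This is only an illustrative sketch: the values of πgt, kv, and kc are assumptions made here for concreteness, while km = 2 and κ = 0.01 follow the defaults used later in this section.

```python
import math

def profit_rate(T, C, C_B, sigma, pi_gt=0.9, k_v=1.0, k_m=2.0, k_c=0.5, kappa=0.01):
    """Profit rate from Eq. 4.20: P = (pi_gt*k_v*C + k_m*C_B - k_c*C)*sigma(T) - kappa."""
    return (pi_gt * k_v * C + k_m * C_B - k_c * C) * sigma(T) - kappa

# The four candidate profit trust factors compared in Figure 4.21.
factors = {
    "sigma = 1":     lambda T: 1.0,
    "sigma = T^1/2": lambda T: math.sqrt(T),
    "sigma = T":     lambda T: T,
    "sigma = T^3/2": lambda T: T ** 1.5,
}

# At low trust (T = 0.25), lower exponents let a peer earn more per turn,
# which is why they shorten the startup period -- but, as discussed below,
# they also make misbehavior profitable.
for name, sigma in factors.items():
    print(name, round(profit_rate(0.25, C=1.0, C_B=0.0, sigma=sigma), 3))
```

With these assumed constants, the ordering of the four factors at T < 1 mirrors the startup behavior seen in Figure 4.21(a): the smaller the exponent on T, the sooner a new peer profits.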
Figure 4.21: Effects of varying trust factor σ. (a) Utility over time for various σ functions of T; C = CG = 1. (b) Utility over time for various σ functions of T; C = 1, CB = 0.99.
We experimented with substituting various functions of T for σ, and then cal-
culating the utility over time for both a good peer (C = CG = 1) and a bad peer
(CB = 0.99). The results are presented in Figure 4.21.
We see in Figure 4.21(a) that lowering the exponent on T results in well-behaved
peers’ reaching steady-state faster, but the steady-state profit rate (i.e. slope) remains
the same. In Figure 4.21(b), however, we see that low exponents allow malicious peers
to profit from harming the system. Ideally, we would like to shorten the startup period
for well-behaved peers, while keeping malicious peers from gaining utility from abusing
the system. This translates to choosing a profit trust factor whose curve is the farthest
left in Fig. 4.21(a), but is below 0 in Fig. 4.21(b). Of the four curves presented in
the figure, σ = T , which has been the default value we use, appears to give the best
performance.
How would system designers enforce a specific selection function? Specifying the
selection function used requires regulating the emphasis each peer places on resource
providers’ reputation. For example, to change the σ(T ) from linear to quadratic
would mean that, while previously peers were twice as likely to choose a provider
with double the reputation rating, now they are four times as likely to choose it.
If selection is handled automatically by client software, then updating every peer’s
software will change the selection function.
However, if users manually select the provider from a list of responders (with
their reputation ratings), then each peer applies its own σ(T ). Figure 4.21(b) shows
that if enough naïve peers disregard (or give little weight to) providers' reputations
(e.g., σ(T) = 1 or √T), they can hurt the entire system by making it profitable for
malicious peers to join, thus encouraging misbehavior. Fortunately, it is these naïve
peers that will be hurt most by misbehavior, as they are the ones excessively fetching
resources from non-reputable peers likely to be malicious. In the long run, naïve peers
will either adopt a more reasonable selection policy or leave the system entirely [83].
4.6.2 Additional Trust Models
In Equations 4.12 and 4.13 we introduced a specific formula for updating a peer’s
trust rating given their recent performance. In this section we present other methods
for calculating T and discuss their effect on profit for various peer strategies.
The new trust model we will be looking at considers only the ratio of good contri-
butions to total contributions in a unit of time when updating a peer’s trust rating.
Our previous model factored the amount of contributions into a peer's reputation.
Some argue that a peer’s trust should be orthogonal to the amount they contribute
and only consider the quality of those contributions. So we construct a trust model
where each interval, we combine the fraction of contributions that were good (a value
between 0 and 1) with the previous trust rating by some weight ω. The trust model
can be represented by either of the following equations:
T[i + 1] = ω · CG/(CG + CB) + (1 − ω) · T[i]   (4.21)

∆TR = ω · (CG/(CG + CB) − T)   (4.22)
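As a minimal sketch (not the dissertation's simulator), the ratio trust update of Equation 4.21 can be iterated directly; the starting trust T(0) = 0.01 and weight ω = 0.1 below are the values used in the experiments that follow.

```python
def ratio_trust_update(T, C_G, C_B, omega=0.1):
    """One interval of the ratio trust model (Eq. 4.21): blend the current
    good-contribution fraction C_G/(C_G + C_B) with the previous rating T."""
    ratio = C_G / (C_G + C_B)
    return omega * ratio + (1.0 - omega) * T

# A peer that only contributes good resources converges toward the trust
# ratio C_G/C = 1 (Eq. 4.23), with no slow-start phase.
T = 0.01
for _ in range(100):
    T = ratio_trust_update(T, C_G=1.0, C_B=0.0)
print(round(T, 3))  # -> 1.0
```

Note that the update depends only on the ratio of good to total contributions, not on their absolute amount, which is exactly the property discussed below.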
In Equation 4.21 the term CG/(CG + CB), or simply CG/C, is the trust ratio: the
fraction of times the peer has provided good resources when contributing in the last
unit of time. The term T[i] represents the peer's current reputation. The two terms
are combined by a weight ω, which determines how much of a peer's reputation results
from its current performance versus its past history. To distinguish between the
previously studied trust model and the new model proposed here, we refer to the
former (given in Eq. 4.13) as the differential trust model and to the new one as the
ratio trust model. For consistency with the differential trust model, Equation 4.21
can be expressed as the change in T in a unit of time (Eq. 4.22), with a subscript
R denoting the ratio model.
Notice that the ratio trust model is similar to how a seller’s reputation was cal-
culated in Chapter 3 using Buyer Strategy β1. The difference is that in Chapter 3
an unweighted history was used. We use the same trust model in Chapter 5 as it is
intuitive and simple to analyze, given the complexity of the system model used.
To compare the two trust models we ran experiments similar to those conducted
in Section 4.3, but with the ratio trust model, using ω = 0.1. We will not consider
other values of ω as it plays little role in the long-term behavior of the system. The
steady-state trust function for the ratio trust model is simply the trust ratio,
TR(∞) = CG/C   (4.23)
and is independent of ω. The weight has no effect on steady-state profit rate, which
is a function of T (∞). However, ω does affect the speed with which T converges to a
new value if the amount of contribution changes. A larger ω causes T to adapt faster,
shifting both the T (t) and U(t) curves to the left. Conversely, a smaller ω emphasizes
a peer’s past behavior, shifting the curves to the right.
Figure 4.22: Comparison of the ratio trust model to the differential trust model. T(0) = 0.01. (a) Trust as a function of time for new well-behaved nodes with two different values of C = CG. (b) Trust as a function of time for reputable peers that turn bad, with two different values of CB; C = 1. (c) Utility over time of a new good peer joining, for two values of C = CG. (d) Utility over time of a new bad peer joining, for two values of CB; CG = 0.01.

Figure 4.22 shows the results of four experiments, which include the corresponding
differential trust model curves for comparison. First, in Figure 4.22(a) we model
two well-behaved peers entering the system: one with C = CG = 1 and one with
C = CG = 0.25. Because the ratio trust model uses the ratio CG/C to compute ∆T,
and not the absolute amount, there is no difference between the results for the two
C = CG values. Therefore, we instead present two curves with different ω weights, 0.1
and 0.01. We see that, unlike the differential model, the ratio model has no slow-start
phase in which the change in slope, d²T(t)/dt², is positive. Instead, the ratio curves
climb fastest at first, then eventually level out to a trust rating of 1. As expected,
lowering ω dampens the impact of the current ratio CG/C = 1 when computing ∆T,
lengthening the time it takes to converge and thus shifting the curve towards the right.
In Figure 4.22(b) we look at two malicious peers who start with a reputation of 1:
one with CB = 0.99 and one with CB = 0.25.7 Here we see the opposite effect: the
ratio model responds more slowly to the misbehavior than the differential model.
In addition, for CB = 0.25 the steady-state trust value is significantly higher for the
ratio model than for the differential model.
Figures 4.22(c) and 4.22(d) show utility over time for the two good peers (as in
Fig. 4.22(a)), and the two bad peers (Fig. 4.22(b)), respectively. At first, the results in
Fig. 4.22(d) seem inconsistent with Fig. 4.22(b). Consider the curves corresponding
to CB = 0.99. Both the ratio and differential trust models appear to converge to
the same low trust rating in Fig. 4.22(b), yet they have wildly different utility curve
slopes at steady-state in Fig. 4.22(d). However, though the two curves seem to have
similar low trust values at time 100 (the edge of the graph in Fig. 4.22(b)), from
Equation 4.23 the ratio model converges to CG/C = 0.01 as t → ∞, while Equation 4.15
tells us that the differential model converges to approximately 0.002. The steady-state
utility slope is given by the generic formula for profit at time infinity (independent of
the trust model): P3(∞) = ((πgt kv − kc)C + km CB) · T(∞) − κ. The factor
7 C = 1 for both malicious peers; CG = C − CB.
of 5 difference in T (∞) between the two models is sufficient to make the first term
greater than κ for the ratio trust model (thus giving a positive slope), while making
the first term smaller than κ for the differential model (resulting in a negative slope).
Therefore, malicious users can profit in the long run in a system incorporating the
ratio trust model, but not with the differential trust model.
Consequently, from Figures 4.22(a) and 4.22(b) we conclude that the ratio model
would be vulnerable to malicious peers that occasionally contribute good resources
to raise their reputation, then switch to offering bad resources to damage the system.
This conclusion is validated by the results in Figures 4.22(c) and 4.22(d). While the
two trust models perform similarly with respect to well-behaved peers (though the
ratio model reaches steady-state faster), Fig. 4.22(d) demonstrates the most impor-
tant difference in their performance. While the differential model results in a limited
and eventually negative utility for bad peers, the ratio model allows bad peers to
continuously make large positive profit (the curves are linear).
To summarize, though the trust model used by a trading system is system-specific
and cannot be fully generalized, there are obvious advantages to choosing one ap-
proach over another. We have seen here that the differential trust model exhibits
much better characteristics than the ratio trust model, specifically with regards to
malicious behavior.
4.6.3 Tying Service to Reputation
We will now look at the probability of acquiring a good resource, denoted by πgt in
Equation 4.10. Until now, we have assumed πgt has a constant value for a given system
and is independent of the status or behavior of the peer acquiring the resource. This
may not necessarily be the case. For instance, a malicious peer may acquire a resource
from a cooperative peer but then claim to not have received it, negatively affecting
the good peer’s reputation or payment. To avoid being cheated in this way, peers
with limited contributive capacity may prefer to contribute to reputable peers. For
example, say peer A is searching for resource R. Some peer B that has R may decide
whether to offer it to A based on A’s reputation. If B is highly loaded with requests it
may ignore A’s request unless A is very reputable. Prioritizing contributions based on
requestors’ reputation provides peers further incentive to cooperate. If well-behaved
peers are more likely to respond to requests from peers with higher reputation ratings,
then the number of resource request responses a peer receives is directly related to
its reputation. The more responses received from good peers, the more likely a peer
is to acquire a valid resource. For example, assuming a uniform distribution of peer
trust ratings from 0 to 1, if A receives only one response, the expected reputation of
that responder is 0.5. If A receives 4 replies, the expected reputation of the highest
rated responder would be 0.8.8 Therefore, the higher A’s reputation, the greater the
number of resource offers it will receive for a given query, which in turn increases the
probability of acquiring a good version of resource R.
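The expected reputations quoted above follow from E[max(U1, ..., Un)] = n/(n + 1) for n uniform draws; a small Monte Carlo check (illustrative only) confirms the two cases:

```python
import random

def expected_max_reputation(n, trials=200_000, seed=1):
    """Estimate the expected maximum of n reputations drawn uniformly
    from [0, 1]; analytically this equals n/(n + 1)."""
    rng = random.Random(seed)
    return sum(max(rng.random() for _ in range(n)) for _ in range(trials)) / trials

print(round(expected_max_reputation(1), 2))  # one responder: about 0.5
print(round(expected_max_reputation(4), 2))  # best of four responders: about 0.8
```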
To represent reputation-weighted contributions in our model, we make πgt a func-
tion with one parameter denoting the acquiring peer’s trust rating Ti, giving us the
term πgt(Ti). Let us explore four possible πgt functions, given in Figure 4.23. These
functions capture a range of biases towards trustworthy peers. For example, the
fourth curve corresponds to a quadratic function of T, πgt(T) = T². A peer with
a trust rating of 0.3 is expected to have only 9% (0.3²) of its acquired resources be
valid. If its reputation doubles to 0.6, then 36% of its acquisitions will likely be valid.
Note that the area under the curve captures, in a way, the amount of good re-
sources contributed in the system. Consider a system with 100 peers, all with equal
number of acquisitions A = 1 and T values uniformly distributed between 0 and 1
(e.g., 0.01, 0.02, 0.03, etc.). We shall refer to this scenario as the “uniform” scenario.
8 Note that E[max(U1, U2, ..., Un)] = n/(n + 1) [76, 36].

Figure 4.23: πgt with respect to T for various functions of T.

According to our πgt(T) function, a given peer, say with T = 0.45, will
have acquired 0.45² = 0.2025 good resources in each time unit. Summing up all the
good resources acquired by all peers is equivalent to calculating the area under the
curve. The total amount of good resources acquired must equal the total amount
of good resources contributed (∑i CG,i over all n peers). So in this scenario, the area
under the curves represents the total good "work" done in the system. Figure 4.23
shows three other curves representing functions of T: (1/3)T⁰, (1/2)√T, and (2/3)T. Note that with
the given coefficients we have “normalized” the curves so that the area under each
curve is equal. It makes sense to consider curves with equal areas because it models
systems where the same amount of good work is being done, under the assumptions
of our uniform scenario. For brevity, we will ignore the constant coefficients when
referring to the four curves (i.e., T⁰, √T, T, T²). By studying these four functions
we can see the impact of a reputation bias. If bias is good we can then design a
search mechanism that indeed rewards trustworthy peers in a comparable way. If we
discover bias is bad, we can then downplay trustworthiness in our search mechanism.
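The normalization claim is easy to verify numerically: under the uniform scenario, each of the four curves of Figure 4.23 encloses the same area, 1/3, over T ∈ [0, 1]. A midpoint-rule sketch:

```python
def area(f, steps=100_000):
    """Midpoint-rule integral of f over [0, 1]."""
    h = 1.0 / steps
    return sum(f((i + 0.5) * h) for i in range(steps)) * h

# The four normalized pi_gt curves from Figure 4.23.
curves = {
    "(1/3) T^0":   lambda T: 1.0 / 3.0,
    "(1/2) T^1/2": lambda T: 0.5 * T ** 0.5,
    "(2/3) T":     lambda T: (2.0 / 3.0) * T,
    "T^2":         lambda T: T ** 2,
}

for name, f in curves.items():
    print(name, round(area(f), 4))  # each area rounds to 0.3333
```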
Figure 4.24: Effects of sample πgt functions of T. (a) Steady-state profit as a function of CG for various πgt; CB = 0. (b) Steady-state profit as a function of CB for various πgt; C = 1.

Using these functions of T for πgt in Equation 4.18, we obtain the expected profit
rate at steady-state. In Figure 4.24(a) we plot profit at steady-state as a function
of CG, with CB = 0. Notice that for the curves corresponding to functions with low
exponents (T⁰ and √T), peers will not have positive profit no matter how much they
contribute. The more skewed the probability of locating a good resource is towards
reputable peers, the more profit cooperative peers will gain in the long run, regardless
of the amount they contribute.9 Therefore, reputation-weighted contributions both
encourage good behavior and result in more profit for well-behaved peers when total
system contributive capacity is low. Of course, this simple analysis assumes that
the system is overloaded with requests (∑A ≥ ∑CG) and the distribution of peer
reputations remains uniform throughout time.
We have seen how varying function πgt(T ) affects the profit of good peers. Now we
focus on the effect on bad peers. Figure 4.24(b) plots profit at steady-state (P3(∞))
as a function of CB, given that C = 1 and, therefore, CG = 1−CB. To understand the
graph, consider a peer that contributes at the full rate of C = 1. The x-axis represents
a spectrum of behavior, with a highly cooperative peer that always contributes valid
resources at the left extreme, and a highly malicious peer that always contributes
false resources at the right extreme. Here we see that a lower T exponent in the πgt
function results both in lower profit for well-behaved peers (to the left) and higher
profit for somewhat malicious peers (in the middle). In fact, for the first two curves
(T⁰ and √T, respectively), only somewhat malicious peers make positive profit, with
a maximum profit rate of approximately 0.1 for T⁰ and 0.5 for √T, at CB ≈ 0.5.

9 As long as CG > 0.1. For smaller CG all πgt(T) functions give equal negative profits.

Figure 4.25: Steady-state trust as a function of CB; C = 1.

Figure 4.26: Steady-state profit as a function of CB for km ∈ {0, 0.5, 1, 2}; C = 1.
In contrast, the curves for T and T² indicate relatively high profit for purely well-
behaved peers (at CB = 0) as we saw in Fig. 4.24(a), but then sharply drop to a
profit rate less than 0 at approximately CB = 0.2. Afterwards, both curves exhibit
a gradual increase in P3(∞) as CB increases, similar to that of the lower-exponent
curves, but to a lesser degree. Finally, all four curves drop to a negative profit rate
at CB = 1.
The most interesting characteristic of all four curves is the gradual increase in
profit to a (local) maximum value for CB between 0.4 and 0.8, depending on the
particular curve, before falling again to less than 0 profit. This profit “hump” is
independent of the πgt function as it appears in all four curves. The hump is a result
of the other factors in Equation 4.18 that depend on CB, specifically the two terms
kmCB and T(∞). Notice that the first term is linear with respect to CB, with a
large coefficient km = 2. However, if we plot T (∞) w.r.t CB, as in Figure 4.25, we
see that the decrease is not linear and has a slope less than km. Therefore, in the
model represented by Eq. 4.10, the reputation system's attenuation of the profit from
malicious activity (CB) is not sufficient to overcome the gain in profit conferred by
parameter km. Unfortunately, some mixes of good and bad contributions therefore
yield positive profits, which a reputation system ideally would prevent.
By analyzing the underlying equations, we see there are various ways to further
decrease steady-state profits for CB > 0. The first would be to decrease the utility
malicious peers gain from misbehaving, lowering km. Figure 4.26 shows the effect on
the T curve from Fig. 4.24(b) of decreasing the value of km from the default value
of 2. Notice that all curves share the same endpoints at CB = 0, where the peer
is sharing no bad resources to derive malicious utility from, and CB = 1, where the
peer’s trust rating is 0 and so none of its resources are contributed, as no one trusts it.
In the middle, lowering km decreases the steady-state profit. Interestingly, the largest
impact is not for high values of CB, as we might expect, but for lower values, especially
around CB = 0.25. The reason is that, though a smaller fraction of a malicious peer’s
contributions are bad (CB), the total amount of bad contributions is greater because
of the effect of the profit trust factor on total contribution (C · T(∞)). With a negative
P(∞) of almost 0.2, the km = 0 curve attains the lowest profit rate, much lower than
can be accounted for by the fixed cost κ = 0.01. The poor profit rate is due to a πgt
value (based on the calculated T(∞)) that is less than kc/kv. Recall from Section 4.2.3
that for a well-behaved peer to gain utility in our model from Equation 4.11 based on
the long-term disjoint currency scenario, the inequality πgtkv > kc must hold. In the
worst case, exhibited at approximately CB = 0.25, πgt is less than kc/kv = 0.5 because
of a moderately low T (∞), causing negative profits per contribution, but T (∞) is
sufficiently high to have a high level of contribution. This experiment demonstrates
that if a peer is losing utility on each of its acquisitions, it will benefit from lowering
its contributions, which can be done by lowering its contributive capacity, leaving the
system, or developing a bad reputation. Figure 4.26 also shows that, except for high
km = 2, malicious peers are not able to maintain a positive rate of profit if they share
a substantial amount of bad resources (CB > 0.1). Unfortunately, because km is the
subjective personal utility gained by a malicious user for simply hurting another user,
it is beyond the control of the system designer.
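To see the mechanics behind Figure 4.26, steady-state profit can be sketched from the fixed point of the differential trust model. The parameter values below (rg, rb, δ, πgt, kv, kc) are illustrative assumptions, not the settings actually used in our experiments, so absolute values will differ from the figure; only the qualitative effect of lowering km carries over.

```python
def steady_state_trust(C_G, C_B, r_g=1.0, r_b=1.0, delta=0.05):
    """Fixed point of the differential trust model with sigma(T) = T:
    setting dT = (r_g*C_G*(1 - T) - r_b*C_B*T)*T - delta*T**2 to zero."""
    return (r_g * C_G) / (r_g * C_G + r_b * C_B + delta)

def steady_state_profit(C_B, C=1.0, k_m=2.0, pi_gt=0.9, k_v=1.0, k_c=0.5, kappa=0.01):
    """P3(inf) = ((pi_gt*k_v - k_c)*C + k_m*C_B) * T(inf) - kappa."""
    T_inf = steady_state_trust(C - C_B, C_B)
    return ((pi_gt * k_v - k_c) * C + k_m * C_B) * T_inf - kappa

# Lowering k_m monotonically lowers a partially malicious peer's
# steady-state profit, shrinking the profit "hump".
for k_m in (2.0, 1.0, 0.5, 0.0):
    print(k_m, round(steady_state_profit(C_B=0.25, k_m=k_m), 3))
```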
A second way to decrease profit for malicious peers would be to increase the
fixed participation cost (κ), which would lower profit for all participants. Adding
an additional fixed cost of κ′ will simply shift all the curves in Fig. 4.24(b) down
by κ′. A third solution would be to use a different profit trust factor (as discussed
in Section 4.6.1) to increase the emphasis a peer’s reputation has on the amount it
contributes. And finally, the trust model, given in Equation 4.13, could be changed.
The weight given to good contributions, rg, could be decreased, further lowering all
peers' trust ratings. Also, the entire trust model could be replaced with a different
method of calculating trust, as discussed in the following section.
To summarize, in situations where the average peer has little contributive capacity,
selectively offering resources based on the requestor’s reputation results in more profit
for well-behaved peers and further decreases the profit expectations of malicious peers.
4.7 Generalized Model of Trust and Profit
We now extend the system model represented by Equations 4.10 and 4.13 that we
have been studying. By augmenting the strategies available to a peer, we develop a
more general mathematical model for trust and incentives in trading systems.
Until now we have assumed that all peers charge the same amount for all resources.
If prices are fixed, regardless of a peer’s reputation, there would be no reason for peers
to choose a provider with a lower trust rating over one with a higher rating. Con-
sequently, reputable peers would become overloaded while new peers would seldom
be chosen to contribute, keeping them at a lower reputation rating. What if peers
price their contributions differently, depending on their reputation? The greater a
peer’s trust, the more they may charge for providing a resource. For instance, a peer
with a reputation of 0.5 may charge twice as much for the same resource as a peer
with reputation 0.25. Consequently, when a peer is selecting a service provider from
which to acquire a resource, they may have a choice of peers with different reputa-
tions, offering the resource for prices proportional to their reputation rating. The
requesting peer may then choose to pay more for a reliable provider, or pay less and
risk receiving a valueless resource.
To represent this effect in our model we focus on the term for profit from acquisitions
in Equation 4.10, (πgt kv − kp)A. For a particular resource a peer wants to acquire, there
may be several peers offering it. On average, choosing a cheaper resource provider
corresponds to lowering the value of kp. At the same time, we expect the risk of
buying a bad resource to increase, decreasing the value of πgt. We call the function a
peer uses to select a price for its resources, based on its current reputation, its pricing
function. We denote the pricing function for a particular peer i as pi(Ti).10 This
function determines the payment a peer receives for one contribution. For simplicity,
let us assume kp is the “full” or maximum price a peer may charge for a resource. A
peer’s pricing function will denote the fraction (between 0 and 1) of the maximum
price which it will charge for each contributed resource. Inserting this into our profit
10 As before, we will ignore the subscript when all subjective variables in question relate to the same peer.
equation (Eq. 4.10) yields
P = πgt kv A − kp A + (km CB + kp p(T) C − kc C) · T − κ   (4.24)
Notice that for a given peer i, the probability of acquiring a good resource πgt is
dependent on the price it is willing to pay for a resource and the pricing function used
by the other peers that service its acquisitions. To represent this variability in price
we introduce the discount factor di as a multiplicative factor of kp, which is assumed
to be the maximum price of a resource.
For simplicity, let us assume all other peers use the same pricing function, denoted
as ρ().11 We can now express the probability of a specific peer i acquiring a good
resource as a function of the global pricing function and the amount the peer is
willing to pay, denoted by πgt(di, ρ) or, briefly, πgt(d, ρ). Note that this new term
for the probability of acquiring a good resource (πgt(di, ρ)) is independent of Ti. To
account for the possibility that πgt is related to a peer’s reputation, as discussed
in Section 4.6.3, we add an additional function parameter, resulting in the function
πgt(di, Ti, ρ).
Further augmenting Equation 4.24 with the discount factor and πgt function we
have
P = πgt(d, T, ρ) kv A − kp d A + (km CB + kp p(T) C − kc C) · T − κ   (4.25)
When a resource requester must decide which provider to transact with, it will
compare the providers based on their reputation ratings and the prices they set.
Thus, the number of transactions a peer contributes in a given time interval will be
dependent on how other peers weigh its reputation and pricing function. We express
this relationship with a selection function σ(Ti, pi), a global black-box function which,
given a peer’s current trust and pricing function, determines the fraction of C the
peer actually contributes. A concise explanation of the functions and variables in our
generalized model appears in Table 4.3.

11 Consider ρ to be similar to an average of the pricing functions of all other peers.

Table 4.3: Definition of Generalized Model Terms

Term            Definition
di              Fraction of the maximum resource price a peer i pays (on average)
pi(Ti)          Pricing function used by peer i to determine the fraction of the maximum resource price to charge for a contribution
ρ()             Global pricing function we assume all other peers use
σ(Ti, pi)       Global selection function that determines the fraction of a peer's contributions that are used, given its current reputation and the pricing function it uses
πgt(di, Ti, ρ)  Global function that determines the probability of a peer's acquired resources being good, based on the price the peer is willing to pay (di), its reputation, and the pricing function used by the rest of the peers
kv              Utility of a full unit of acquired resources
kp              Maximum price in utility of a full unit of contributed resources
kc              Cost in utility of contributing at maximum capacity (C = 1)
We now generalize our model for profit and trust in a reputation system by em-
ploying both the pricing and selection functions. Inserting the selection function into
Equations 4.25 and 4.13 and collecting terms yields our generalized equations for
profit and changing trust.
P = (πgt(d, T, ρ) kv − kp d) A + (km CB + (kp p(T) − kc) C) σ(T, p) − κ   (4.26)

∆T = (rg CG (1 − T) − rb CB T) σ(T, p) − δT²   (4.27)
Because σ(T, p) determines the fraction of contribution used in a unit of time, we must
apply it to our change in trust equation as well. Thus, we replace the multiplicative
T in Equation 4.13 with σ(T, p) in Equation 4.27. The selection function replaces the
profit trust factor discussed in Section 4.6.1.
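For concreteness, the two generalized update rules can be transcribed directly into code. The following Python sketch implements one time step of Equations 4.26 and 4.27; the function name `step`, the helpers `sigma_T` and `p_one`, and all numeric constants are illustrative assumptions, not values taken from the text.

```python
def step(T, d, A, C, CB, CG, pi_gt, sigma, p,
         kv=1.0, kp=0.5, km=0.0, kc=0.1, kappa=0.05,
         rg=0.05, rb=0.1, delta=0.01):
    """One time step of the generalized model (Eqs. 4.26 and 4.27).

    T: current trust; d: fraction of the maximum price paid; A: resources
    acquired; C: total contribution; CB/CG: bad/good contribution rates;
    pi_gt: probability an acquired resource is good (a plain float here);
    sigma: selection function sigma(T, p); p: pricing function p(T).
    All constant defaults are illustrative placeholders.
    """
    s = sigma(T, p)
    # Eq. 4.26: profit from acquisition minus payment, plus the
    # contribution terms scaled by the selection function, minus kappa.
    P = (pi_gt * kv - kp * d) * A + (km * CB + (kp * p(T) - kc) * C) * s - kappa
    # Eq. 4.27: trust gain/loss scaled by the selection function, minus decay.
    dT = (rg * CG * (1 - T) - rb * CB * T) * s - delta * T ** 2
    return P, T + dT

# Example with the sigma(T, p) = T, p(T) = 1 configuration from the text:
sigma_T = lambda T, p: T
p_one = lambda T: 1.0
P, T_new = step(T=0.5, d=1.0, A=1.0, C=1.0, CB=0.0, CG=1.0,
                pi_gt=0.9, sigma=sigma_T, p=p_one)
# P ≈ 0.55, T_new ≈ 0.51 with these illustrative values
```

Passing σ and p as callbacks mirrors the model's treatment of them as black-box global functions.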
Notice that if σ(T) ≠ T, then we can no longer guarantee that T will remain
4.7. GENERALIZED MODEL OF TRUST AND PROFIT 131
between 0 and 1 even if rb + δ ≤ 1.12 The only way to maintain that guarantee is
if both rbCB and δ are multiplied by the same T-dependent term. To restore the
guarantee we would need to modify the decay term from δT 2 to δTσ(T ). However,
this implies that the rate of trust decay used by the reputation system is proportional
to the selection function used by peers when choosing resource providers based on
reputation.
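The claim that scaling both the loss and decay terms by the same σ(T)-dependent factor keeps T bounded can be checked numerically. The sketch below uses the modified decay term δTσ(T); the function name `trust_update`, the parameter values (chosen to satisfy rb + δ ≤ 1), and the random contribution rates are all assumptions of this illustration.

```python
import random

def trust_update(T, CG, CB, sigma, rg=0.3, rb=0.6, delta=0.4):
    """One trust step with the modified decay term delta*T*sigma(T).

    Both the loss term rb*CB*T and the decay are multiplied by the same
    sigma(T)-dependent factor, restoring the guarantee that T stays in
    [0, 1] whenever rb + delta <= 1. Parameter values are illustrative.
    """
    s = sigma(T)
    return T + (rg * CG * (1 - T) - rb * CB * T) * s - delta * T * s

# Numerical check: T never leaves [0, 1] under random contribution rates.
random.seed(1)
sigma_sqrt = lambda T: T ** 0.5  # one of the selection functions of Fig. 4.27
T = 0.01
for _ in range(500):
    T = trust_update(T, CG=random.random(), CB=random.random(), sigma=sigma_sqrt)
    assert 0.0 <= T <= 1.0
```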
We now consider two constraints on the generalized model: one where we hold
p(T ) constant, and one where we hold σ(T, p) constant.
Notice that we can think of our original model from Equation 4.10 as a specific
instance of Equation 4.26, with σ(T, p) = T and p(T) = 1 for any value of
T. We will denote this special pricing function, where p(T) = 1 for all T, as p̄(T).
In Equation 4.10 we had assumed that σ(T, p) = T for simplicity of illustration and
analysis.
In an analysis similar to that presented previously in Section 4.6.1, we varied σ(T, p)
in both Equations 4.26 and 4.27, while keeping the pricing function fixed at p(T) = 1. The difference between
these experiments and those in Section 4.6.1 is that the σ(T, p) term affecting ∆T
was also changed here, while it remained σ(T, p) = T in the previous experiments.
Figure 4.27(a) shows how increasing the exponent on T increases the slow-start
phase before a peer's trust rises quickly. The more emphasis placed on reputation
when selecting a peer, the less likely low-reputation peers are to be chosen, resulting
in a longer period of time needed to prove their trustworthiness.
Studying Figure 4.27(b) we see the effect of different selection functions on highly
cooperative peers (C = CG = 1). Notice that a longer reputation slow-start phase
translates into a longer period of low profit, before reaching steady-state where utility
climbs linearly. Comparing the graph to Figure 4.21(a) we see a larger variation in
the length of the startup phase, before trust and profit stabilize. This behavior is
12Discussed in Section 4.2.3 when decay was first introduced.
[Figure 4.27: Effects of varying σ(T, p), with curves for σ = 1, σ = T^(1/2), σ = T, and σ = T^(3/2). (a) Trust T(t) over time for various σ(T, p) functions of T. (b) Utility U(t) over time for various σ(T, p) functions of T; C = CG = 1. (c) Utility U(t) over time for various σ(T, p) functions of T; C = 1, CB = 0.99.]
most noticeable when comparing the curves for σ = T^(3/2) in the two graphs. Previously
we did not take into account the effect that varying the selection function (previously
referred to as the profit trust factor) would have on ∆T . We believe this generalized
form to be more reasonable and consistent. If the selection function influences the
amount of contribution affecting profit, it should similarly influence the amount of
contribution affecting reputation.
Figure 4.27(c) demonstrates the effects on highly malicious peers (CB = 0.99,
CG = 0.01). Comparing to the corresponding graph from Section 4.6.1 (Fig. 4.21(b))
we see little difference. Because T begins at a very low value (0.01) and only decreases,
the additional σ(T, p) term in the trust equation (Eq. 4.27) has little effect on the
system behavior. As before, however, the selection function term in the profit equation
(Eq. 4.26) does considerably influence the profit rate of malicious peers. Again,
σ(T, p) = T gives the most desirable results, limiting malicious peers to negative
profit while allowing good peers to quickly gain utility.
Instead of holding p(T) constant and varying σ(T, p), we could conversely
imagine a σ(T, p) that produces one constant value for any T. Suppose all peers
could price their contributions so that the expected risk of receiving a bogus resource
is offset by the lower cost. Then, for any resource, no matter what price
dkp is offered, the related risk πgt(d, T, ρ) changes so that (πgt(d, T, ρ)kv − kpd) always
remains constant.13 Consequently, all peers would be able to contribute their full
capacity, but more reputable peers would generate more profit for the same contributive
capacity. We denote the special pricing function where the price of a certain peer’s
contribution exactly offsets the expected risk from transacting with a peer of their
reputation as p∗(T ). Expressed mathematically
∀T σ(T, p∗) = 1 (4.28)
13Given that πgt is independent of T .
Every peer’s contributive capacity is equally likely to be used. The advantage of
σ(T, p∗) = 1 is that T will reach steady state much faster by negating the slow-start
reputation effect, since low-reputation peers perform more transactions per round from
which their behavior can be judged.
Unfortunately, p∗(T ) does not handle malicious peers well. Notice that if σ(T, p∗) =
1 in Equation 4.26, then purely malicious peers will generate positive profit if km > kc,
regardless of p∗(T ). In other words, if malicious peers derive sufficient “pleasure” from
harming the system by distributing bad content, then they are benefiting from the
system as long as they are allowed to make contributions, even if they do not receive
payment for them. Preferably, σ(T, p∗) resembles a step function that rises from 0
to 1 at T = T (0)− ε, for some very small epsilon. This would allow new good peers
to quickly raise their reputations. A peer whose reputation falls below T(0) will be
ignored for the remainder of its stay in the system, with no hope of redemption.
Of course, such a node could change its identity and reenter the system, but if it
misbehaves it will quickly find itself ignored once again.
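Such a step-shaped selection function is straightforward to write down. In the hypothetical sketch below, `T0` stands for the newcomer trust level T(0) and `eps` for the small ε; both values are illustrative.

```python
def sigma_step(T, T0=0.1, eps=1e-3):
    """Step-shaped selection function (Sec. 4.7): a peer's contributions
    are fully used once its trust is at least the newcomer level T(0)
    minus a small epsilon, and ignored otherwise. T0 and eps are
    illustrative values, not parameters from the text.
    """
    return 1.0 if T >= T0 - eps else 0.0

# A new peer entering at T(0) is served; one that has fallen below is ignored.
sigma_step(0.1)   # → 1.0
sigma_step(0.05)  # → 0.0
```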
4.8 Discussion
4.8.1 Credits and Economic Stimulation
One remaining question is how credits are distributed to the peers in the system.
One solution is to allow peers to purchase credits using real money at a certain
exchange rate, rx.14 Ideally, the monetary price of credits should outweigh their
cooperative price. If not, peers would rather purchase credits with money than acquire
them by contributing resources, lowering the total value of the network. Given that
kc, the cost of contributing (see Equation 4.2), is strictly less than kp, the payment
for contributing, this condition is satisfied, since rxkp > rxkc.
14For consistency, let rx be the exchange rate from credits to real-world currency.
Now any peer may acquire credits using real-world money if it wishes. However,
a stranger's low reputation limits its total credit income rate, encouraging it
to purchase credits in order to use the network to acquire resources during its
reputation slow-start phase. Paying the extra price of direct currency conversion in
order to gain instant access to offered resources is another aspect of the penalty, or
tax, on strangers.
Over time the system may lose currency as nodes leave the network abruptly with
positive credit balances. Also, peers that contribute more than they spend will hoard
credits. To avoid global stagnation the system can periodically inject credits into
the network in the form of payments or rewards. The question is: which peers should
be paid? One solution is to give credits to the nodes that need them most: those with
zero credits. Unfortunately, this would encourage nodes to freeride, since nodes that
are not contributing would earn no credits and hold a zero balance. Another choice is
to reward nodes that contribute the most, regardless of the rate they spend credits.
Of course these nodes are likely to have high income rates and large credit balances
already.
We suggest rewarding the peers who generate the most global wealth. Wealth in
our system is produced whenever a transaction occurs. The peer serving a resource
is paid more than what serving it costs. The peer acquiring the resource values
it more than the credits it paid for it. Therefore, the peer participating in
the most transactions will be helping to increase the total wealth. Using our earlier
variable A (the amount of resources acquired) this translates to
{Node s | (Cs + As) = max_{x ∈ N} (Cx + Ax)}    (4.29)
where N is the set of all nodes. We call this node the maximum economic stimulator.
In our ideal economic model we assumed every node spent all their credits, so that
A ∝ C. In this case the maximum economic stimulator is the node with highest C.
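Equation 4.29 translates directly into a small selection routine. In this sketch the `nodes` mapping and its (C, A) pairs are an assumed data layout, not a structure defined in the text.

```python
def max_economic_stimulator(nodes):
    """Return the node s maximizing C_s + A_s over all nodes (Eq. 4.29),
    i.e. the peer participating in the most transactions and hence
    generating the most global wealth.

    `nodes` maps a node id to a (C, A) pair: contribution made and
    resources acquired. The layout is an assumption of this sketch.
    """
    return max(nodes, key=lambda s: nodes[s][0] + nodes[s][1])

# Hypothetical per-node (C, A) records:
nodes = {"a": (0.9, 0.2), "b": (0.4, 0.8), "c": (0.5, 0.5)}
max_economic_stimulator(nodes)  # → "b"
```

In the ideal model where A ∝ C, the same routine reduces to picking the node with the highest C.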
4.9 Related Work
Significant effort has been put into understanding how to effectively design reputation
and/or incentive schemes (e.g. [59, 37, 93]). We briefly mention two examples that
closely relate to the method we have presented here.
Huberman and Wu [65] developed a sophisticated economic model of reputation
over time that they use to study the endogenous dynamics of reputations. Geared
towards the economics community, they focus on the stability and equilibria of reputa-
tion for persistent service providers with long-term relationships with a large customer
base.
In [45], Feldman et al. study the problem of whitewashers using the Prisoner’s
Dilemma model for interactions. They propose a reciprocative decision function to
minimize the benefits of whitewashing without excessively punishing new users. In our
model we assume certain parameters are static when this is not necessarily the case.
For example, we assume a global constant value for initial trust T0. Feldman et al. [45]
suggest dynamically varying the initial trust in strangers based upon the behavior of
past strangers. This allows the system to be optimistic when few whitewashers are
present, but quickly throttle back if the number of defecting strangers increases. As
demonstrated in Sections 4.6 and 4.7, it is possible to replace such constants with
variable functions, leading to further discussion and analysis of appropriate functions
and their effects. In addition, their work inspired our transactional model.
4.10 Conclusion
We have presented an economic model of peer behavior in a resource exchange envi-
ronment with reputation management. The model is sufficiently simple and extensible
to support detailed analysis. Our simulations show it adequately captures the prop-
erties of a simplified trading system. Applying our model we have been able to shed
light on various important system design questions.
Using the analytic model, we elucidated the desirable properties of reputation
systems. We have discussed the tradeoffs associated with how provider selection is
influenced by reputation information. We investigated how tying request response
rate to the requestor’s reputation can improve performance for well-behaved users,
especially in times of high load. Finally, we demonstrated that using both transac-
tion quality and quantity in calculating reputation outperforms using only quality or
quantity.
Our model has exposed “hidden” parameters and functions in trading systems, such
as πgt(T) and σ(T, p). Though a system may not explicitly state these parameters,
every system inherently exhibits a particular behavior based on its design choices.
Considering the impact of these choices beforehand will lead to better designs.
Chapter 5
Examining Metrics for Peer-to-Peer Reputation Systems
Until now, we have assumed global shared history is maintained about all peer inter-
actions, allowing the system to detect possibly malicious resource providers. Mecha-
nisms that implement global reputation systems have been proposed in the literature.
EigenTrust [73], for example, collects statistics on peer behaviors and computes a
global trust rating for each peer. However, global history schemes are complicated,
requiring long periods of time to collect statistics and compute a global rating. They
also suffer from the transience of peers and the continual anonymity afforded to
malicious peers through zero-cost identities.
In this chapter, we evaluate the performance of a peer-to-peer resource-sharing
network in the presence of malicious peers using, in contrast to global history schemes,
only limited or no information sharing between peers. We develop various techniques
based on collecting reputation information and present several interesting side-effects
resulting from some of the techniques.
We also study the trade-offs of two identity management schemes for peer-to-peer
networks: a trusted central login server and self-managed identities. We analyze the
performance of each scenario and compare it to the base case with no reputation
system. We look at how new peers should be treated, and present some mechanisms
to further improve system efficiency.
In Section 5.1 we present our system model and its assumptions. Section 5.2 de-
scribes the two threat models we consider. Then, Section 5.3 discusses the reputation
systems used in the experiments and their options. Section 5.4 describes the metrics
used for evaluating our experiments. In Section 5.5 we specify the details of the sim-
ulation environment used for the experiments, and present the results in Section 5.6.
Section 5.12 discusses related work. Finally, we conclude in Section 5.13.
Some of the work presented here has been previously published as [91] and [93].
5.1 System Model
A peer-to-peer system is composed of n peer nodes1 arranged in an overlay network.
In a resource-sharing network each node offers a set of resources to its peers, such as
multimedia files, documents, or services. When a node desires a resource, it queries all
or a subset of the peers in the network (depending on the system protocol), collects
responses from available resource providers, and selects a provider from which to
access or retrieve the resource.
Locating a willing resource provider does not guarantee the user will be satisfied
with its service. Selfish peers may offer resources to maintain the impression of
cooperation, but not put in the necessary effort to provide the service. Worse, certain
nodes may join the network not to use other peers' resources, but to propagate false
files or information for their own benefit.
In our model, each peer verifies the validity of any resource it uses. Accessing
invalid or falsified resources can be expensive in terms of time and money. A system
1In this chapter, we often use the term “node” rather than “peer”. A node refers to a network entity with a unique system identifier. “Node” emphasizes the distinction between such entities and network users. One user may control multiple entities [38], while another person's computer may have been compromised by a worm and forced to run a system node on behalf of an unknown malicious user.
may implement a micropayment scheme requiring users to pay a provider before being
able to verify the validity of the resource. In most cases the user must wait for a file to
be downloaded or a remote computation to conclude and then verify the correctness
of the result. Checking the validity of the file or service response may itself be a
costly but necessary operation in the presence of malicious nodes. Because such an
operation is highly domain-specific, we assume the existence of a global verification
function, V (R) which checks whether resource R is valid. Any node can perform this
verification, but it is indeterminately expensive to compute and may require human
interaction (such as listening to a song after downloading it from a music service to
ensure it is the correct song and uncorrupted) or even a third-party. A resource must
be downloaded or accessed before it can be verified, which costs time and bandwidth.
We include this cost in the verification function, so that it represents the full price of
accessing a bad resource.
To simplify the discussion we present our work in the context of a file-sharing
system, where users query the network, fetch files from other peers, and verify the
files’ content is correct. Nodes hearing the query reply to the query originator if
they have a copy of the file. The originator then fetches copies of the file from
the responders until a valid, or authentic, copy is located. File-sharing networks have
existed for some time and their characteristics have been thoroughly studied, allowing
us to more accurately model deployed, working systems. Though we use the term
“files” in the rest of the chapter, most concepts apply to generic resources.
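The query/fetch/verify cycle described above can be sketched as a simple loop. The `fetch` and `verify` callbacks below are assumptions standing in for the network fetch and the expensive, domain-specific verification function V(R); the function name `fetch_valid_copy` is likewise hypothetical.

```python
def fetch_valid_copy(responders, fetch, verify):
    """Fetch copies of a file from query responders until one passes
    verification.

    responders: iterable of peer ids that answered the query.
    fetch(peer): returns that peer's copy of the file.
    verify(copy): stands in for the global verification function V(R),
    which is expensive and domain-specific, so it is passed in as a
    callback. Returns (peer, copy) for the first authentic copy found,
    or None if every responder's copy is invalid.
    """
    for peer in responders:
        copy = fetch(peer)
        if verify(copy):
            return peer, copy
    return None

# Hypothetical responders, one of which serves a corrupted copy:
copies = {"p1": "corrupted", "p2": "authentic"}
fetch_valid_copy(["p1", "p2"], copies.get, lambda c: c == "authentic")
# → ("p2", "authentic")
```

Because verification is applied lazily inside the loop, its cost is paid once per fetched copy, matching the model in which each bad fetch carries the full price of accessing a bad resource.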
What does it mean for a file to be invalid or fake? The issue of file authenticity is
discussed in the following section. The behavior of peers in the system with respect
to the authenticity of the files they send each other is captured in the threat model,
which is discussed in Section 5.2. Reputation systems, which track node behavior in
order to mitigate the problem of inauthentic files, are covered in Section 5.3.
[Figure 5.1: Sample document and matching query. (a) A document consists of data or content and sufficient metadata to uniquely describe the content; e.g., Title: A Tale of Two Cities, Author: Charles Dickens, Publish Date: April 2002, Publisher: Barnes & Noble Books, Content: “It was the best of times, it was the worst of times...”. (b) A query in a file retrieval system consists of sufficient metadata to uniquely match only one document in the system.]
5.1.1 Authenticity
In our model the unit of storage and retrieval is the document. Every document D
consists of some content data CD and metadata MD which uniquely describes the
content. If two documents contained the same metadata but different content, there
must be some information pertaining to their differences that should be included in the
documents’ metadata to make them unique. For example, different editions of a book
should include the edition and the year published in their metadata. Figure 5.1(a)
illustrates a sample document for a specific edition of Dickens' novel A Tale of
Two Cities. If the only metadata provided were the title and author, the document
might not be unique, since other editions of the same book exist in other languages
or may include notes or pictures.
Given the definition of a document, we can now define document authenticity. A
document is considered authentic if and only if its metadata fields are consistent with
each other and the content. If any information in the metadata does not “agree”
with the content or the rest of the metadata, then the document is considered to be
inauthentic, or fake. For example in Figure 5.1(a), if the Author field were changed
to Charles Darwin, this document would be considered inauthentic, since Barnes &
Noble Books has never published a book titled A Tale of Two Cities written by
Charles Darwin that begins “It was the best of times”.
We assume the existence of a global authenticity function, A(D) which enables
one to verify the authenticity of a document D. Evaluating the function is likely to
be very expensive and may require human user interaction or even a third party. An
example would be if Alice were to download a song from a music sharing service, she
could determine whether it is the correct song by listening to it. We generalize the
document authenticity function to the resource verification function V (R). We also
use the terms “file” and “document” interchangeably.
5.2 Threat Models
As stated above, the threat we are studying is that of a group of malicious nodes
that wish to propagate inauthentic (or fake) copies of certain files. They do not care
if they themselves are unable to query the system for files, thus incentive schemes
fail to deter them. In addition, we assume they may pass false information to other
nodes to encourage them to fetch bad files. We consider three behaviors for malicious
nodes, abbreviated N, L, and C:
N : No misinformation is shared. All nodes give true opinions.
L: Malicious nodes lie independently for their own gain. They give a bad opinion
of everyone else.
C: Malicious nodes collude. They give good opinions of each other and bad opinions
of well-behaved nodes. For this model, we will briefly consider the situation
where some malicious nodes act as “front” nodes by providing only authentic
files (but never from the subversion set) in an attempt to gain the trust of other
nodes and spread their malicious opinions.
The percentage of nodes in the network that are malicious is given by the pa-
rameter πB. We propose two distinct threat models, one in which malicious nodes
target specific files, and another in which they act maliciously towards other nodes by
a certain probability. In both threat models, we assume no other malicious activity,
such as denial-of-service style attacks, are occurring in the network.
5.2.1 Document-based Threat Model
The first threat model is designed to emulate expected real-world malicious activ-
ity. We randomly select a set of files, called the subversion set, that all malicious
nodes wish to subvert by disseminating invalid copies. Each unique file has an equal
probability of being in the subversion set, specified by the parameter pB. We assume
no correlation exists between a file’s popularity and its likelihood to be targeted for
subversion. Malicious nodes also share valid copies of files not in the subversion set.
The effects on performance of varying both πB and pB are discussed in Sections 5.6.1
and 5.6.2.
We assume well-behaved nodes always verify the authenticity of any file they have
before sharing it in the network. Though this assumption may be unrealistic for many
peer-to-peer systems, experiments in which a small fraction of the files provided by
good nodes were invalid demonstrated little effect on our experimental results.
5.2.2 Node-based Threat Model
Our second threat model performs equivalently to the former, but provides us with
interesting avenues of research. We choose to model the node behavior described above
as a probability that a given node will send an authentic copy of a file to another
node requesting the file. For example, good nodes may reply correctly 95% of the
time (0.95)2, while malicious nodes only reply correctly 10% of the time (0.1). These
probabilities can be arranged in a threat matrix, T , where Ti,j contains the probability
that node j will reply with an authentic file to a request from node i. Section 5.4
discusses the advantage of modeling the threat in this form. The threat matrix
characterizes the threat model at a specific time, Ti,j(t), since nodes may behave well
at first and then begin acting maliciously. Though most of our experiments use a
static threat model, some look at dynamic node behavior, such as malicious nodes
behaving well for a period of time, then turning bad (see Sec. 5.6.3).
For the results in this chapter we assume all malicious nodes use the same prob-
ability of replying with a fake file, regardless of the query originator. We reuse the
parameter pB to indicate this probability. Therefore, for any malicious node m,
∀i, Ti,m = 1− pB. Similarly, we use pG to be the probability of a good node sending
an authentic file, thus assuming that a fraction equal to 1 − pG of the files on the
average well-behaved node are corrupted. For any good node g, ∀i, Ti,g = pG. In
Section 5.6.3 we look at the effects of having varying node threat values among the
well-behaved nodes and the malicious nodes.
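Under these assumptions the matrix T is constant down each column, so it can be generated mechanically. The sketch below builds T as a nested dictionary; the function name and data layout are assumptions of this illustration.

```python
def build_threat_matrix(nodes, malicious, pB, pG):
    """Node-based threat matrix T (Sec. 5.2.2).

    T[i][j] is the probability that node j replies to a request from
    node i with an authentic file: 1 - pB for every malicious node m and
    pG for every good node g, independent of the requester i.
    """
    return {i: {j: (1.0 - pB) if j in malicious else pG for j in nodes}
            for i in nodes}

# Hypothetical network: two good nodes and one malicious node.
T = build_threat_matrix(["a", "b", "m"], malicious={"m"}, pB=0.9, pG=0.95)
# T["a"]["m"] ≈ 0.1 and T["b"]["a"] == 0.95
```

A time-varying threat model Ti,j(t), such as malicious nodes behaving well before turning bad, would simply rebuild or patch this matrix at each time step.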
5.3 Reputation Systems
When a node queries the system for a file, it collects all replies (and their source IDs)
in a response set. The node repeatedly selects responses from the set, fetches the copy
of the file offered by the responder and verifies it (using the verification function) until
an authentic copy is found.
As nodes interact with each other, they record the outcome, such as whether
the file received was authentic or not. As a node collects statistics, it develops an
2The 5% accounts for the fake files that are shared before they are verified as authentic by the user.
opinion, or reputation rating, for each node. We make no assumptions of how this
rating should be computed, but since it is used to compare and rank nodes, it should
be scalar (see below for an example).
Each node records statistics and ratings in a reputation vector of length n, where
n is the total number of nodes in the network.3 When a node first enters the system
all entries are undefined. As the node receives and verifies files from peers, it updates
the corresponding entry. Nodes may also share their opinions about other nodes with
each other and incorporate them in their ratings. The reputation vectors can be
viewed as an n × n reputation matrix, R, where the ith row is node i’s reputation
vector. Cell Ri,j would contain node i’s “opinion” of node j.
When a node has collected replies to a query, the reputation system calls a selection
procedure, which takes as input the query response set and the node’s reputation
vector, and selects and fetches a file. The verification function is then calculated on
the selected file. As stated earlier, this may be done programmatically if possible, but
most likely requires presenting the file to the user. The system updates its statistics
for the selected response provider based on the verification result. If verification
failed, the selection procedure is called again with a decremented response set. This
is repeated until a valid file is located, the response set is empty, or the selection
procedure deems there are no responses worth selecting (such as if the remaining
responders’ ratings are too low).
For this chapter we study variants on two reputation systems, one in which peers
share their opinion and one in which only local statistics are used. They are compared
against a random selection algorithm.
Random Selection: Our base case for comparison is an algorithm which ran-
domly chooses from the query responses until an authentic file is located. Since no
3Or more accurately the number of identities in the network (see Section 5.3.1).
knowledge or state about previous interactions is stored, shared or used, this algo-
rithm models the performance of a system with no reputation system.
Local Reputation System: With this reputation system each node maintains
statistics on how many files it has verified from each peer and how many of those were
authentic. Each peer’s reputation rating is calculated as the fraction of verified files
which were authentic. This results in a rating ranging from 0 to 1, with 0 meaning
no authenticity check passed and 1 meaning all authenticity checks passed. When
processing a query, these ratings are used in the selection procedure to select the peer
from which to fetch the file. We consider two procedures in our experiments:
• The Select-Best selection procedure selects the response from the response node
with the highest rating. If the selected response is invalid, the procedure chooses
the next highest-rated node.
• Select-Best will prefer to choose good nodes it has previously encountered and
thus may overload a small subset of reputable peers. To spread out file requests
we propose the Weighted selection procedure, which probabilistically selects the
file to fetch weighted by the provider’s rating. For example, if nodes i and j
both provide replies to node q and R(q, i) = 0.1 and R(q, j) = 0.9, then j is nine
times as likely to be chosen as i. We study load distribution in Section 5.6.2.
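The two selection procedures can be sketched as follows. The dictionary-based rating store, the initial rating `rho0`, and the zero-weight `w0` (a small positive weight, discussed later in this section, that keeps zero-rated peers selectable) all take illustrative values.

```python
import random

def select_best(responders, rating, rho0=0.3):
    """Select-Best: choose the responder with the highest reputation
    rating; unknown peers get the initial rating rho0 (value illustrative)."""
    return max(responders, key=lambda r: rating.get(r, rho0))

def select_weighted(responders, rating, rho0=0.3, w0=0.01):
    """Weighted: choose a responder with probability proportional to its
    rating. Peers rated exactly 0 receive the small zero-weight w0 so
    they are never permanently excluded from selection."""
    weights = [rating.get(r, rho0) or w0 for r in responders]
    return random.choices(responders, weights=weights, k=1)[0]

# With R(q, i) = 0.1 and R(q, j) = 0.9 as in the text's example:
ratings = {"i": 0.1, "j": 0.9}
select_best(["i", "j"], ratings)   # → "j"
random.seed(0)
picks = [select_weighted(["i", "j"], ratings) for _ in range(1000)]
# "j" is chosen roughly nine times as often as "i"
```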
The Select-Best method requires that a node maintain an ordered list of the most
reputable nodes it knows. We call this list a Friend-Cache of maximum size FC.
There are additional benefits to maintaining a Friend-Cache in the local reputation
system. By sending queries directly to nodes in the Friend-Cache before propagating
the query normally, the message traffic of query floods in flat unstructured networks
can be greatly reduced. We call this the Friends-First technique and evaluate it in
Section 5.6.1.
Voting Reputation System: This system collects statistics and determines
local peer ratings just as the local system does. It extends the previous system by
considering the opinions of other peers in the selection stage. When a node, q, has
received a set of responses to a query, it contacts a set of nodes, Q, for their own
local opinion of the responders. Each polled node, or voter v ∈ Q, replies with its
rating (from 0 to 1) for any responder it has interacted with and thus has gathered
statistics. The final rating for each responder is calculated by the formula
ρr = (1 − wQ) R(q, r) + wQ · ( Σ_{v∈Q} R(q, v) R(v, r) ) / ( Σ_{v∈Q} R(q, v) )    (5.1)
For each responder r, the querying node q sums each voter's (v) rating of r, weighted
by q's rating for v. This result is the quorum rating. If node q has no prior knowledge
of r, it uses the quorum rating as r's rating in the selection procedure. If q already
has statistics from prior interaction with node r, the rating for node r is the combi-
nation of the local statistics and the quorum rating, by some given weight called the
quorum weight, wQ. Note that when wQ = 0 the voting system works exactly like the
local system.
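Equation 5.1 can be implemented directly from local rating tables. In this sketch, `R[i][j]` holding node i's rating of node j is an assumed data layout; missing entries model the “undefined” cells of the reputation matrix, and the handling of the no-usable-votes case is a choice of this illustration.

```python
def quorum_rating(q, r, R, voters, wQ):
    """Combined reputation rating of responder r at querying node q (Eq. 5.1).

    R[i][j] is node i's local rating of node j; an entry is absent when i
    has no statistics on j. Only voters v that q has rated and that have
    themselves rated r contribute to the quorum term.
    """
    rated = [v for v in voters if v in R.get(q, {}) and r in R.get(v, {})]
    denom = sum(R[q][v] for v in rated)
    local = R.get(q, {}).get(r)
    if denom == 0:
        return local                 # no usable votes: local rating only
    quorum = sum(R[q][v] * R[v][r] for v in rated) / denom
    if local is None:
        return quorum                # no prior interaction: quorum alone
    return (1 - wQ) * local + wQ * quorum

# Hypothetical ratings: q trusts voter v1 fully and v2 half as much.
R = {"q": {"v1": 1.0, "v2": 0.5, "r": 0.8},
     "v1": {"r": 0.6}, "v2": {"r": 0.9}}
quorum_rating("q", "r", R, voters=["v1", "v2"], wQ=0.5)  # ≈ 0.75
quorum_rating("q", "r", R, voters=["v1", "v2"], wQ=0.0)  # → 0.8 (local system)
```

The wQ = 0 case reproduces the purely local system, as the text notes.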
Until now we have not discussed how the nodes in the quorum Q are selected to
give their opinion. We consider two methods of selecting voters. The first method is
to ask one’s neighbors in the overlay topology. These are typically the first peers a
node is introduced to in the network and, though neighbors may come and go, the
number of voters will remain relatively constant. The other method is to ask peers
from whom one has fetched files and who have proven to be reputable. This group
would consist of the peers with the f highest local ratings at node q. The former
quorum selection we call Neighbor-voting while the latter is referred to as Friend-
voting. In the Friend-voting scheme we reuse the Friend-Cache described above to
maintain our list of voters. The cache has a maximum size of FC. We study the
effects of varying FC in Section 5.6.2.
Above, we describe the source node as contacting each voter for its opinion
on each and every query once it has collected the responses. Realistically, nodes
may instead periodically exchange reputation vectors with each other. If the rate at
which reputation vectors are exchanged is as frequent as once per query, then the two
methods are equivalent. For simplicity, we assume this equivalence in our simulator
and model the system as acquiring voter opinions at the time of the query.
Both reputation systems have two additional parameters. Since all entries in R
are initially undefined, an initial reputation rating ρ0 must be assigned to nodes for
which no statistics are available, to be used for comparing response nodes in the
selection stage. Analysis of different values for ρ0 is provided in Section 5.6.1.
In some domains it may be easy for malicious nodes to automatically generate
fake responses to queries. In situations where a node is querying for a rare document,
it may receive many replies, all of which are bad. To prevent the node from fetching
every false document and calculating V (R), we introduce a selection threshold value
(ρT ). Any response from a node whose reputation rating is below this threshold
is automatically discarded and never considered for selection.⁴ In Section 5.6.1 we
analyze the effects of varying the threshold value on performance.
In the weighted selection procedure a response from a node with a rating of 0
would never be chosen, since it has a weight of 0.⁵ The Weighted procedure differs
from the Select-Best procedure because it may choose any response from the response
set, with some probability (albeit small). To prevent nodes from being permanently
excluded from the selection process by the Weighted procedure, all nodes with a
reputation rating of 0 are assigned an artificial weight we call the zero-weight (w0) in
the selection procedure. In addition, a positive value avoids issues when all nodes in
the response set have a rating of 0. For the experiments performed, the ideal and local
Weighted reputation systems used a w0 of 0.01 in their weighted selection procedure.
The value 0.01 was chosen because it is positive, but significantly smaller than
any other node behavior value, such as pB. Experiments were performed using both
a zero-weight of 0 (no zero-weight) and 0.01; the results were similar.

⁴ New nodes are automatically exempt from being discarded, even if ρ0 < ρT.
⁵ A node would receive a rating of 0 if the first file fetched from it were inauthentic.
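The Weighted selection stage with these three parameters can be sketched as follows (a hypothetical helper, not the simulator's actual code; `ratings` maps node id to local rating, and nodes absent from it are treated as new, per footnote 4):

```python
import random

def weighted_select(responses, ratings, rho_0=0.3, rho_t=0.2, w_0=0.01,
                    rng=random):
    """Weighted selection with threshold and zero-weight (a sketch).
    Unknown responders get the initial rating rho_0 and, per footnote 4,
    are never discarded; known responders rated below rho_t are dropped;
    responders rated exactly 0 get the artificial weight w_0."""
    eligible = [(n, ratings.get(n, rho_0)) for n in responses
                if n not in ratings or ratings[n] >= rho_t]
    if not eligible:
        return None          # every response fell below the threshold
    nodes = [n for n, _ in eligible]
    weights = [r if r > 0 else w_0 for _, r in eligible]
    return rng.choices(nodes, weights=weights, k=1)[0]
```

Note that with a nonzero threshold the zero-weight never comes into play, since 0-rated nodes are already discarded; it matters only for the ρT = 0 variants.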
Finally, we present a prescient reputation system, which is applicable only when
using the node-based threat model:
Ideal: The ideal case bases its selection decisions on T. It represents the best
possible performance a reputation system can achieve by using the actual threat model
in selecting the document to present. Both the Select-Best and Weighted selection
procedures are evaluated for the ideal system.
This system is not realistic, but serves as a guide to the ideal performance of
the previously defined reputation systems, should the reputation matrix R converge to
the actual threat matrix T. Results for this system are presented in the third results
section, relating to the node-based threat model.
5.3.1 Identity
Maintaining statistics of node behavior requires some form of persistent node identi-
fication. In order to build reputation, a user or node must have some form of identity
which is valid over a period of time. The longer this period of time, and the more
resistant the identity is to spoofing, the more accurately the reputation system can
rate nodes [116].
The simplest way to identify a node is to use its IP address. This method is
severely limited because addresses are vulnerable to IP-spoofing and peers are often
dynamically assigned temporary IP addresses by their ISPs. Instead, a more reliable
method may be to use self-signed certificates. This technique allows well-behaved
nodes to build trust between each other over a series of disconnections and recon-
nections from different IP addresses. Although malicious nodes can always generate
new certificates, making it difficult to distinguish them from new users, this technique
prevents them from impersonating existing well-behaved nodes.
Some argue that the only effective solution to the identity problem in the presence
of malicious nodes is to use a central trusted login server, which assigns a node identity
based on a verifiable real-world identity. This would limit a malicious node’s ability
to masquerade as several nodes and to change identities when its misbehavior is
detected. It would also allow the system to impose more severe penalties for abuse
of the system.⁶
For simplicity, we generally assume that all nodes use the same identity for their
lifetime. This mimics a system with a centralized login server, assigning unforgeable
IDs based on real-world identities. This scheme ensures users cannot (easily) change
identities to hide their misbehavior, by limiting each real-world entity to one network
ID. In this system the trusted server need not know which system ID refers to which
real-world ID [48].
The second model relies on users generating their own certificates and public/private
key pairs as forms of identification. Though these identities are robust to spoofing,
any user can easily discard an identity and generate a new one. Using self-managed identities makes the
system vulnerable to whitewashing, where malicious nodes periodically change their
identities to hide their misbehavior [83]. This is modelled by erasing all informa-
tion gathered on a malicious node after it sends an invalid document to the query
source node for verification. If node M sends node S a fake document, all information
collected by nodes (including S) about M is erased. Essentially all nodes “forget”
about bad nodes. We abbreviate the references to the login server and self-managed
identities scenarios as Login and Self-Mgd, respectively. These two identity schemes
are compared in Section 5.6.1 of the results.
⁶ For example, a person might have to use a valid credit card to enter the system, allowing the system auditors to debit their card if they are caught misbehaving.
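The whitewashing model amounts to deleting one column of the (sparse) reputation matrix. A minimal sketch, assuming R is stored as a dict of dicts `{observer: {subject: rating}}` (our own representation):

```python
def whitewash(R, m):
    """Self-Mgd whitewash: after malicious node m is caught uploading a
    fake file, every node 'forgets' m -- clear column m of R.  The text
    specifies clearing only the column, so m's own row is left intact."""
    for row in R.values():
        row.pop(m, None)
```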
In Section 5.6.2 we experiment with a slightly different whitewashing scenario.
Instead of each malicious node constantly changing identities, malicious nodes white-
wash periodically. We conduct experiments using both identity models and distin-
guish the two as the whitewashing and static scenarios. In our results, the default
identity model is the static model, unless whitewashing (WW ) is specified.
5.4 Metrics
The main objective of a reputation system is to reduce the number of documents
the user must look at before finding the correct document for their query. We call
this the efficiency of the reputation system. This is equivalent to minimizing the
number of times the authenticity function is calculated in the selection stage. This
metric seems the most practical and direct measure of a particular selection heuristic’s
performance.
When studying reputation systems it is necessary to determine what metrics best
measure the success of a particular system. Here we present the metrics we use
to evaluate our experimental results. We ran simulations of our system model for a
period of time and gathered statistics at the end. These statistics are used to compute
the metrics; they are summarized in Table 5.1.

Table 5.1: Simulation statistics and metrics

  Metric  Description
  qtot    # of queries generated
  qgood   # of queries with an authentic file in at least one response
  qsucc   # of successful queries, where the selection procedure located an authentic file
  Vi      # of verification function evaluations performed on files fetched from node i
  nG      Number of good nodes in the network
  nfld    Average number of nodes that receive a query through flooding
  qFC     Number of queries successfully answered by a node in the Friend-Cache
  V       # of verification function evaluations
  VG      Total number of verification function evaluations of files fetched from good nodes
  rV      Verification ratio
  dTR     Threat-reputation distance
  rmiss   Miss rate
  ℓi      Load on node i
  ℓG      Average load on good nodes
  MTrel   Relative message traffic of Friends-First w.r.t. flooding
From among all the queries generated during execution (qtot) we are specifically
interested in the number of good queries (qgood) and the number of successful queries
(qsucc). A good query is any query whose response set includes at least one authen-
tic copy of the queried file, even if no authentic copy was located by the selection
procedure. A successful query is a query that results in an authentic copy of the
requested file being selected by the selection procedure. The relation between the
three statistics is given by the following equation:
qtot ≥ qgood ≥ qsucc (5.2)
For the reputation systems we are testing, if qsucc always equals qgood then the
system is considered to be 100% effective.
5.4.1 Efficiency
When designing reputation systems our primary concern is to reduce the number of
files which must be fetched and verified before locating a valid query response. During
execution we record the number of file verifications supplied by each node i, which we
refer to as Vi. From this data we compute the total number of verification function
evaluations, V, as

    V = Σ_{i=1}^{n} Vi        (5.3)
But V alone is insufficient. A system could ignore every response, report failure
on every query, and have V = 0. To account for the fact that some systems may incur
more verification checks, but locate valid files for more queries, we divide V by the
number of successful queries (qsucc). We call this metric the verification ratio (rV ).
    rV = V / qsucc        (5.4)
The lower the value of rV , the more efficient the system is. The best possible per-
formance would be a prescient algorithm which always chose a valid file if one was
available in the response set, and ignored all responses if not. This would give an rV
of 1. The verification ratio measures the efficiency of a reputation system and is our
principal metric of system performance.
5.4.2 Effectiveness
While systems reduce the number of file fetches and authenticity function compu-
tations to be more efficient, it often comes at the sacrifice of effectiveness. The
effectiveness of a search system relates to its ability to locate an answer, given that
one exists somewhere in the network. A reputation system’s effectiveness can be con-
sidered to be the fraction of queries for which an authentic file is selected, given that
one exists in the response set. We call this metric the miss rate. This measurement
of effectiveness is only accurate for systems in which the reputation algorithm does
not interfere with query response or query/response propagation, but it is accurate
for the systems described here.
Some reputation systems with selection thresholds may not locate an authentic file
even when one is available, and thus are not completely effective. We are interested
in measuring how often such systems report a failure to a good query. We introduce
the miss rate (rmiss), given by the equation
    rmiss = (qgood − qsucc) / qgood        (5.5)
The miss rate gives the fraction of good queries that were missed. A system which
returns a valid file for every good query will have a miss rate of 0. A system which
never returns a good response would have a miss rate of 1. Therefore, the miss rate
is inversely related to the effectiveness of the reputation system.
5.4.3 Load
We are also interested in measuring the load on the network under the various repu-
tation systems and threat models. We are primarily concerned with the load on the
well-behaved nodes in the network from file fetches. If each file is transferred only
when it is selected to be verified, then the number of files a node has uploaded is
equal to the number of verification function evaluations of files from that node. We
define the load on node i (ℓi) as the number of verification checks on files it supplies
normalized by the total number of queries, or
    ℓi = Vi / qtot        (5.6)
We measure the average load on the network as the average load across well-
behaved nodes, ℓG. Let G be the set of all good nodes in the network and let nG be
the total number of good nodes. Therefore, the average load is
    ℓG = ( Σ_{i∈G} ℓi ) / nG        (5.7)
Network load is analyzed in Section 5.6.2.
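Equations 5.4 through 5.7 follow directly from the raw counters; a sketch (counter names follow Table 5.1, but the function itself is ours):

```python
def compute_metrics(V_per_node, good_nodes, q_tot, q_good, q_succ):
    """Verification ratio (5.4), miss rate (5.5), and average load on
    good nodes (5.6, 5.7) computed from per-run statistics.
    V_per_node maps node id -> # of verifications of its files."""
    V = sum(V_per_node.values())                       # Eq. 5.3
    r_V = V / q_succ                                   # Eq. 5.4
    r_miss = (q_good - q_succ) / q_good                # Eq. 5.5
    loads = [V_per_node.get(i, 0) / q_tot for i in good_nodes]  # Eq. 5.6
    l_G = sum(loads) / len(good_nodes)                 # Eq. 5.7
    return r_V, r_miss, l_G
```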
5.4.4 Message Traffic
To measure the message efficiency of the Friends-First method we compare the net-
work query message traffic generated by this method to the default practice in un-
structured networks of flooding the network for each query. We calculate the relative
message traffic as
    MTrel = (Number of Friends-First Messages) / (Number of Flooding Messages)        (5.8)
and compute it using the system parameters and the statistics gathered from the
Select-Best experiments. Note that the number of Friends-First messages includes
messages sent directly to friends and messages from query floods, resulting when the
query goes unanswered by friends.
5.4.5 Threat-Reputation Distance
The final metric we introduce applies only to the node-based threat model defined in
Section 5.2.2. If a node's reputation rating is expressed as the perceived probability
that the node returns an authentic file, then the reputation matrix R approximates
the threat matrix T. Let the standardized reputation matrix R′ be an n×n matrix
such that R′i,j is the probability with which node Ni expects a file from Nj to be
authentic.⁷ For many reputation systems R′ is equal to or easily derived from R,
and R′ may converge to T;⁸ for some reputation systems, R itself converges to T. How
quickly this convergence takes place may be a useful metric. Since R is most likely a
sparse matrix, an appropriate matrix distance algorithm must be used.
We sum the square of the differences between each defined value of R′ and T ,
take the square root, and divide by the number of defined values in R′. This metric
we call the threat-reputation distance, or T-R distance (dTR) for short, and can be
mathematically expressed as
    dTR = √( Σ_{(i,j) : R′i,j defined} (Ti,j − R′i,j)² ) / ( Σ_{(i,j) : R′i,j defined} 1 )        (5.9)
If no cells in R′ are defined, the T-R distance is undefined. For the ideal reputation
system the T-R distance is always 0 (by definition of the ideal system).
⁷ For the local and voting reputation systems we simulate R′ = R.
⁸ Ti,j is the a priori probability that j sends an authentic file to i. Ri,j is the probability with
which i expects j to reply with a valid file, based on past experience.
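Computed over the defined cells only, Equation 5.9 is a few lines of code. A sketch, with R′ stored sparsely as a dict keyed by (i, j) pairs (our own representation, not the simulator's):

```python
import math

def tr_distance(T, R_prime):
    """Threat-reputation distance (Eq. 5.9): square root of the sum of
    squared differences over the defined cells of R', divided by the
    number of defined cells; None when no cell is defined (undefined)."""
    if not R_prime:
        return None
    sq_sum = sum((T[i][j] - r) ** 2 for (i, j), r in R_prime.items())
    return math.sqrt(sq_sum) / len(R_prime)
```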
Table 5.2: Configuration parameters and default values

  Param.  Description                                                         Value
  τ       Simulation runtime                                                  1000
  n       Number of nodes                                                     10,000
  dmax    Maximum allowed degree of a node in the network                     150
  davg    Average degree of a node in the network                             ≈ 3.5
  TTL     Distance from source queries are propagated                         5
  πB      Percentage of malicious nodes in the network                        0.3
  pG      Probability of a good node replying with an authentic file          0.99
  pB      Probability of a malicious node replying with a fake file           0.9
  ρ0      Initial reputation rating used for nodes with no prior interaction  0.3
  ρT      Selection threshold; nodes with reputation ratings below ρT are
          not considered in the selection procedure                           0.2, 0.15
  w0      Weight assigned to nodes with a reputation rating of 0              0.01
  αF      Zipf exponent for file popularity distribution                      1.2
  αQ      Zipf exponent for query popularity distribution                     1.24
  PQR     Number of popular queries                                           200
  αPQ     Zipf exponent for most popular queries                              0.63
  FC      Size of Friend-Cache used with Friends-First method                 -
  wQ      Quorum weight: weight given to voters' opinions with respect to
          local statistics                                                    0.1
5.5 Simulation Details
The following section describes the specific component models, parameters, and met-
rics used in the simulations. The key parameters for the simulations are summarized
in Table 5.2 along with their default values. Table 5.3 lists the various statistical
distributions used, along with the default values for their parameters. The results of
the simulations are presented and discussed in the following section.
We evaluate the reputation systems using our own P2P Simulator based on our
system model. The simulations were run on a dual 2.4 GHz Xeon processor machine
with 2GB of RAM. Each data point presented in the results section represents the
average of approximately 10 simulation runs with different seeds.
Table 5.3: Distributions and their parameters with default values

  Description            Distribution  Parameters (with default values)
  Network topology       Power-Law     n = 10,000, dmax = 150, β ≈ 1.9
  Query popularity       Zipf          αPQ = 0.63, PQR = 250, αQ = 1.24
  Query selection power  Zipf          αF = 1.2
Though most of our findings apply to any peer-to-peer network, for our experi-
ments we construct a Gnutella-like flat unstructured network. Specifying the overlay
topology is necessary for studying certain issues, such as Neighbor-voting and mes-
sage traffic reduction. Studies of unstructured peer-to-peer networks have shown
their topologies are power-law networks [44]. We use randomly generated, fully con-
nected power-law networks with an average node degree of davg ≈ 3.1. For the local
reputation system experiments, we used networks of size n = 10, 000 nodes with a
maximum node degree of dmax = 150. The voting reputation system experiments
used 1000 nodes with a maximum node degree of dmax = 50.⁹ Queries are propagated
to a TTL of 5. For simplicity we assume the network structure does not change,
though we simulate a node leaving and a new node taking its place in the network.
At each timestep a query is generated and completely evaluated before the next
timestep. Therefore, a simulation run of 100 timesteps processes 100 queries.
For the results dealing solely with the local reputation system, which does not ex-
change reputation information between peers, all queries are sent from a single node
randomly chosen at startup. Each simulation seed selects a different node. For exper-
iments using the voting-based system a node is randomly chosen as the query source
at each timestep.
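The simulation driver implied by this description is a simple sequential loop; a hypothetical sketch (the `sim` hooks are placeholders for the simulator's components, not its actual API):

```python
import random

def run(sim, timesteps, single_source=None, rng=random):
    """One query per timestep, fully evaluated before the next begins.
    With single_source set (local-reputation experiments) every query
    comes from that node; otherwise a random source is drawn each step."""
    for _ in range(timesteps):
        src = single_source if single_source is not None else rng.choice(sim.nodes)
        query = sim.generate_query(src)
        responses = sim.flood(query)            # propagate up to the TTL
        sim.select_and_verify(src, responses)   # selection stage; updates R
```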
The simulation component most specific to file-sharing (as opposed to general
resource-sharing) is our query model. It is similar to the one proposed in [138]. We
assume a total of 100,000 unique files. The number of copies of each file in the system
⁹ We have experimented with larger networks. Results are not shown, but observed trends are similar to what is reported here.
is determined by a Zipf distribution with α = 1.2. Each node is assigned a number
of files based on the distribution of shared files collected by Saroiu et al. [119]. The
query popularity distribution determines which file each query searches for. For this
distribution we use a two-part Zipf distribution with an α of 0.63 from rank 1 to
250, and an α of 1.24 for the remaining ranks. This distribution better models query popularity in existing
peer-to-peer systems [124]. Though our query model is based on data collected on
today’s file-sharing networks, we expect networks providing other content or services
to have similar distributions.
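The two-part query popularity distribution can be sampled by building explicit rank weights. A sketch (the continuity scaling at the cutoff rank is our assumption; the text does not specify how the two Zipf parts are joined):

```python
import random

def two_part_zipf_weights(n_ranks, alpha_head=0.63, alpha_tail=1.24,
                          cutoff=250):
    """Unnormalized weights: rank**-alpha_head up to the cutoff rank,
    rank**-alpha_tail beyond it, with the tail scaled so the two parts
    meet continuously at the cutoff (an assumption on our part)."""
    scale = cutoff ** (alpha_tail - alpha_head)   # makes the parts meet
    return [r ** -alpha_head if r <= cutoff else scale * r ** -alpha_tail
            for r in range(1, n_ranks + 1)]

def sample_query_ranks(n_ranks, k, rng=random):
    """Draw k query ranks according to the two-part Zipf distribution."""
    w = two_part_zipf_weights(n_ranks)
    return rng.choices(range(1, n_ranks + 1), weights=w, k=k)
```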
In Section 5.6.2, we model node turnover by having a random node leave the net-
work and a new node enter on average once per query from a single node. Therefore,
a turnover occurs every timestep for the single query source experiments and every
1000 timesteps for the multiple query source experiments in a 1000 node network. For
the reputation system, this is equivalent to clearing all information in the ith row and
column of R when node i leaves. For the whitewash experiments in Section 5.6.1,
each malicious node changes its identity after uploading a fake file to any node, by
clearing the column of R relating to the malicious node. In Section 5.6.2, all
malicious nodes change identity every 10 queries from a single node (or every 10,000
queries with multiple query sources).
Unless otherwise stated, we use a selection threshold of 0.2 in all experiments
reported in this chapter. We use an initial reputation rating of 0 for the whitewashing
experiments, and 0.3 otherwise. All experiments with constant πB, pB, and pG were
run with πB = 0.3, pB = 0.9, and pG = 0.99.
5.6 Results
The results section is divided into three components. First, we look solely at the local
reputation system and compare the two identity models in detail and measure the
effects of the parameters common to both reputation systems. We also evaluate the
Friends-First technique for message traffic reduction.
In Section 5.6.2, we focus on the voting-based reputation model, look at the effects
of the system parameters specific to it and also analyze the distribution of load on
the well-behaved peers in the network. We also look at the effects of the malicious
opinion-sharing (N, L, C).
The first two parts use only the document-based threat model defined in Sec-
tion 5.2.1. In the final part, we look at the performance of the node-based threat
model (see Sec. 5.2.2) and compare it to our results from the document-based threat
model.
5.6.1 Local Reputation System
In this section we address several of the questions brought up in the previous sections.
Specifically:
1. Is an initial reputation rating of zero always preferable to nonzero?
2. What is the cost in efficiency (as defined here) for using self-managed identities
in lieu of a trusted login server?
3. Is there a benefit to using a selection threshold?
4. Can maintaining a Friend-Cache reduce message traffic?
Here we compare the two extreme identity models, the Login model, in which
nodes do not change identities over time, and the Self-Mgd model, in which malicious
nodes change identities after every fake file they upload to a peer. The experiments
in this section were all conducted with no node turnover. Each simulation was run
for 1000 timesteps (unless otherwise noted).
[Figure 5.2: Efficiency for varying ρ0. Lower value is better; 1 is optimal. Verification ratio (rV) vs. initial reputation rating (ρ0) for Login Weighted, Login Best, Self-Mgd Weighted, and Self-Mgd Best.]
New Node Reputation
In this experiment we varied the initial reputation rating (ρ0) used by the local
reputation system for any node from which we have not received a document and
checked its authenticity. Our experiments demonstrate that, though a reputation
system performs similarly for both identity models for a ρ0 of 0, efficiency in the login
server scenario can improve substantially by increasing ρ0, while performance in the
self-managed identities scenario will only worsen.
Figure 5.2 shows that for the Login scenario, a nonzero initial reputation rating
(e.g., ρ0 = 0.4) performs better by a factor of 1.5 in terms of minimizing the number
of authenticity checks computed. If malicious nodes cannot change their identities to
pose as new nodes after misbehaving, there is a benefit to selecting new nodes over
previously encountered malicious nodes.
If malicious nodes are allowed to change their identities, as in the self-managed
identities scenario, they will usually be treated as new nodes with a reputation rating
of ρ0 in the selection procedures. We would expect that varying ρ0 would have a
significant effect for Self-Mgd. Figure 5.2 shows that increasing ρ0 decreases the
efficiency when using the Weighted procedure, though unexpectedly, the Select-Best
procedure is not affected (until ρ0 = 1). For example, from a ρ0 of 0.0 to 0.5, the
verification ratio (the average number of authenticity checks performed per query)
of the Weighted method goes from 9.7 to 25.1, while Select-Best stays constant at
9.4. Since the Weighted method considers all nodes (weighted by their ratings) in
the selection stage, it is important to lower the weight of new nodes, which are more
likely to be malicious nodes in the scenario of self-managed identities than in that of
a login server. The results support our intuition. The Select-Best method's unvaried
performance across all values of ρ0 can be attributed to the fact that often a node
receives a reply from a peer which has previously provided an authentic document, in
which case the node will always choose the reputable source over any unknown peer.

[Figure 5.3: Varying selection threshold values. Verification ratio (rV) vs. selection threshold (ρT) for Login Weighted, Login Best, Self-Mgd Weighted, and Self-Mgd Best.]
From these experiments we selected 0.3 as the default value for ρ0 for Login. Many
of the following experiments were additionally performed with other values of ρ0, but
the results did not vary noticeably from those at ρ0 = 0.3 and are not discussed.
For Self-Mgd simulations we use only ρ0 = 0, which clearly performed best for the
Weighted method.
Selection Threshold
Figure 5.3 shows tests varying the value of the selection threshold for both the
Weighted and Select-Best variants of the local reputation system. The verification
ratio is plotted as a function of ρT . As stated above, ρ0 was set to 0.3 for Login and
0 for Self-Mgd.
The result is surprising. For Login all values of ρT above 0 resulted in almost equal
performance, yet significantly better than ρT = 0 (rV of 6.3 for ρT = 0 down to 2.0
for ρT > 0).¹⁰ Because malicious nodes always reply with a copy when a document
in the subversion set is queried for, the vast majority of responses in the response set
come from malicious nodes supplying bad copies. When searching for rare content, it
is common to receive only bad copies from malicious nodes. The threshold prevents
nodes from repeatedly fetching and testing documents from peers which have proven
malicious or unreliable in the past. The drawback of the selection threshold is a
decrease in query effectiveness (discussed in the following section).
For Self-Mgd varying ρT had no effect. Remembering which nodes have lied in
the past is of no use if those nodes can immediately change their identities to hide
their misbehavior. The threshold may be useful if nodes were motivated to maintain
their identities, perhaps by providing incentives for building reputations.
In successive tests any system variant using a selection threshold uses a ρT value
of 0.2 or 0.15 unless otherwise stated. The primary simulations use ρT = 0.2. However,
early experiments, presented later in the results section, used a selection threshold of
0.15. As this and following experiments demonstrate, there is negligible difference in
results between using a selection threshold of 0.15 or 0.2.
Performance under Various Threat Conditions
In this section we look at system performance under different threat model parameter
values. Specifically, we demonstrate how overall efficiency is affected by varying the
percentage of malicious nodes in the system (πB) and the probability of a unique
document being in the subversion set (pB). Eight different variants of the local
reputation system were tested. These eight variants are derived from three system
parameters: the identity model (Login or Self-Mgd), ρT (0 or 0.2), and the selection
procedure (Weighted or Select-Best).

¹⁰ Though almost the same, the values of rV for different nonzero ρT for a given reputation system variant are not exactly identical.

[Figure 5.4: Efficiency comparison. (a) Verification ratio (rV) vs. percentage of bad nodes (πB). (b) Verification ratio (rV) vs. percentage of unique documents in the subversion set (pB). Each panel shows the Base case and the eight variants: {Login, Self-Mgd} × {Weighted, Best} with ρT ∈ {0.0, 0.2}.]
The graphs in Figure 5.4 present the system performance for varying πB and pB.
The results show that, overall, a trusted login server significantly reduces the cost
of ensuring authenticity over self-managed identities, roughly by a factor of 5.5. Yet,
using a reputation system with the Self-Mgd model outperforms having no reputation
system at all (Base curve in Figure 5.4) by an additional factor of 3.5.
For both graphs, the curve corresponding to the base case, of purely random
selection, quickly climbs out of the range of the graphs. In Figure 5.4(a) the base
curve increased steadily to 46 at πB = 0.4, 3.5 times the verification ratio of the
Self-Mgd variants and up to 20 times the rV of Login using a selection threshold. In
Figure 5.4(b) the base curve climbed to 35 at pB = 1, 3.4 times the rV
of the Self-Mgd variants and approximately 17 times that of Login with ρT = 0.2.
This means one would expect to have to fetch and test on average 20 times as many
query responses in order to find a valid response! Even using self-managed identities,
a rudimentary reputation system provides significant performance improvements over
no reputation system. Even then users would expect to fetch over ten bad copies for
every good copy they locate (for πB > 0.3). In contrast, a peer using a selection
threshold in a login server environment would only expect to encounter one or two
fakes for every authentic file, no matter the level of malicious activity in the network.
Figures 5.4(a) and 5.4(b) show that the Select-Best and the Weighted proce-
dures perform similarly. Overall the Select-Best method outperformed the Weighted
method, especially in the Login model. Though the Select-Best performed well and
served to mitigate the performance variance of other parameters (such as the initial
reputation rating), it does have drawbacks. A study of the load on well-behaved nodes
(measured as the number of documents fetched from a node) showed a much more
skewed distribution for the Select-Best variants than the Weighted variants. In fact,
the highest loaded good nodes in the Select-Best simulations were being asked for 2.5
times as many documents as the highest loaded nodes in the Weighted simulations. At
the bottom of the distribution, hundreds of nodes that were accessed in the Weighted
simulations were never accessed in the Select-Best simulations. This dramatic skew in load
distribution can result in unfair overloading, especially in a relatively homogeneous
peer-to-peer network. We study load distribution in detail in Section 5.6.2.
Both graphs illustrate that the selection threshold is useless in the Self-Mgd sce-
narios, but provides a large performance boost for Login. This supports our findings
in the previous section, and demonstrates it was not an artifact of the selected values
of the threat parameters. Using a selection threshold system efficiency is relatively
unaffected by variations in πB and pB.
Measurements of effectiveness in these experiments (only applicable to a nonzero
selection threshold) resulted in a miss rate well below 0.001 (0.1%) for the experiments
varying πB at a constant pB of 0.9. For the experiments varying pB, the miss
rate increases as pB decreases, but always remains below 0.0025 (0.25%). As pB
decreases, the subversion set decreases. Because malicious nodes become more likely
to provide authentic documents, but tend to fall under the threshold, the effectiveness
of the system decreases. For most applications these miss rates are acceptable, especially
when compared to the increased efficiency offered by the selection threshold.
Message Traffic
Now, we present our experiments on mitigating message traffic using the Friends-First
technique. As explained earlier, Friends-First takes advantage of the Friend-Cache to
try and locate a positive query response among the known reputable nodes, before
querying the entire system. As we will see, in a flood-based querying system, this can
result in 85% less message traffic!
Before presenting the results, we redefine the general formula for relative message
traffic, given in Equation 5.8, in terms specific to our model. The numerator is the
total message traffic for Friends-First. For all queries, messages are sent to all nodes in
the Friend-Cache (qtot ·FC). In addition there is the cost in messages of flooding the
network when a valid response is not located from the Friend-Cache. The number
of messages generated in the network to propagate a query will be at least equal
to the number of nodes which hear the query, and most likely much larger due to
several occurrences of two nodes forwarding the query to the same node. We roughly
estimate the number of messages generated by a query flood as the average number of
nodes reached by a query flood (nfld). Therefore, the additional cost of flooding for
Friends-First would be the number of queries not answered by a node in the Friend-
Cache (qtot− qFC) times nfld. The denominator is the number of messages generated
assuming every query is a flood (qtot · nfld).
Note that FC is greater than or equal to the actual number of nodes in the Friend-
Cache at any time, so not all queries will have FC nodes to query directly. Let FCi
be the number of nodes in the Friend-Cache after i − 1 queries. FCi is the number
of messages sent directly to reputable nodes for the ith query. Note that for all i
5.6. RESULTS 167
[Figure: Relative Message Traffic (MTrel, solid curve, left y-axis) and Maximum Number of Nodes in Friend-Cache (dashed curve, right y-axis) vs. Size of Friend-Cache]

Figure 5.5: Relative message traffic of Friends-First and maximum Friend-Cache utilization as a function of the size of the cache.
FCi ≤ FC and FC1 = 0 since all nodes are initially unknown. We can define our
message traffic metric as

    MTrel = ( Σ_{i=1}^{qtot} FCi + (qtot − qFC) · nfld ) / (qtot · nfld)        (5.10)
Note that this is still a conservative calculation of relative traffic since nfld is less
than or equal to the total number of messages generated due to a query flood.
We conducted these experiments using the local reputation system and the single-
source query generator. For the results in this section, we ignored whitewashing and
node turnover. We ran simulations for various numbers of queries (1000, 10,000,
50,000, etc).
The solid line in Figure 5.5 plots the relative message traffic of Friends-First with
respect to regular flooding (MTrel) as a function of the maximum Friend-Cache size,
after 50,000 queries. We see that, as the size of the Friend-Cache increases, the query
message traffic drops quickly to approximately MTrel=0.15 until it reaches a point
where growing the cache no longer provides any benefit. This means that the Friends-
First method is generating only 15% as much message traffic as flooding without any
loss in effectiveness!
Interestingly, for cache sizes greater than 120, the traffic overhead actually in-
creases slightly, before levelling off at around 200. For small FC, increasing the
cache size greatly reduces message traffic because of the high likelihood of locating
future query answers at the additional nodes stored in the cache. Every additional
query satisfied by a node in the cache saves the system a query flood, outweighing the
cost of the additional messages sent to the new nodes in the Friend-Cache for every
query. But when FC is large, any node added to the cache will likely be sharing
few files (and thus will rarely provide a response in the future). If the node had more files,
it would have been located earlier and already be in the Friend-Cache. We find that
well-behaved nodes sharing many files tend to be located quickly and be placed in the
Friend-Cache early. Nodes added later offer fewer files (approx. 5) and rarely provide
any further query responses, thus wasting bandwidth on query messages sent directly
to them.
As stated earlier, we performed experiments for varying lengths of time. In our
shorter simulations (e.g. 1000 queries) there was no rise in relative traffic for large
FC. Instead MTrel drops quickly and levels off, with no single minimum. These
shorter simulations end before the Friend-Cache begins collecting useless nodes with
very few files. Runs of 20,000 and 100,000 queries, on the other hand, also showed a
preferred FC around 130. This result supports our hypothesis that, once only small
nodes remain outside the cache, adding a node to the cache increases overall traffic
because the cost of sending them a direct query outweighs the slim probability of
their answering a request and avoiding a query flood.
In studying the efficiency of Friends-First, it is useful to consider the utilization
of the Friend-Cache. The right y-axis of Figure 5.5 corresponds to the number of
reputable nodes in the Friend-Cache when the simulation ended, represented in the
graph by the points on the dashed line. Notice that the number of nodes in the
cache increases linearly until MTrel reaches the minimum, and levels off when MTrel
levels off. Interestingly, the value it reaches is 126, approximately the same value
as the optimal cache size. We believe this is not a coincidence. This value is an
average of several simulation runs with different seeds. Some runs had lower values
and others higher, but it does indicate that, on average, the system did not use
responses from more than 130 reputable nodes. Thus, in the simulations where more
than 130 reputable nodes were located and placed in the cache, we would not expect
them to provide any further useful unique responses. Therefore, limiting the Friend-
Cache to a size of 130 prevents useless nodes from entering the cache and worsening
performance.
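A capped Friend-Cache of this kind is easy to sketch as a data structure. The dissertation does not give an implementation, so the class below is a hypothetical design (names are ours): it tracks reputation-ordered members and evicts the lowest-rated node once the cap is exceeded.

```python
class FriendCache:
    """Sketch of a size-capped Friend-Cache: a list of known reputable
    nodes ordered by their reputation statistics (hypothetical design)."""

    def __init__(self, capacity=130):
        self.capacity = capacity
        self.ratings = {}  # node id -> reputation rating

    def update(self, node, rating):
        """Record (or refresh) a node's rating; evict the worst if over cap."""
        self.ratings[node] = rating
        if len(self.ratings) > self.capacity:
            worst = min(self.ratings, key=self.ratings.get)
            del self.ratings[worst]

    def members(self):
        """All cached nodes, most reputable first."""
        return sorted(self.ratings, key=self.ratings.get, reverse=True)
```

Limiting `capacity` to around 130 mirrors the empirical optimum above: once only poorly stocked nodes remain outside the cache, admitting them only adds direct-query traffic.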
5.6.2 Voting-System
In this section we discuss the following important issues:
1. How well does the voting-based system perform? How do the parameters wQ
and FC affect the performance?
2. How does the voting system compare to the local reputation system? Remember
that the voting system with a quorum weight of 0 is equivalent to the local
system.
3. How do the Select-Best and Weighted methods compare in terms of overall
efficiency?
4. How does the reputation system affect the distribution of load across well-
behaved nodes?
As stated before, all experiments use the document-based threat model. We also
slightly relax the whitewashing scenario. Instead of a malicious node changing identities
after each false file upload, all malicious nodes whitewash after an average
of 10 queries per node in the network. To underscore this difference we refer to the
[Figure: Verification Ratio (rV) vs. Weight Given to Quorum's Opinion (wQ); curves: Frd L, Frd C, Nbr L, Nbr C, Frd L WW, Frd C WW, Nbr L/C WW]

Figure 5.6: Efficiency of the voting reputation system (using Select-Best) with respect to varying quorum weight (wQ). Lower rV is better. 1 is optimal.
two identity models as the whitewash (WW ) and no whitewash scenarios, as opposed
to Login and Self-Mgd, as done in the previous section.
Voting System Parameters
In this section, we analyze the performance of the voting-based reputation system
for various parameter values. All experiments were performed using the multi-source
query generator for a total of 100,000 queries. For this scenario the random algorithm
obtained an rV = 28.2, off the scale of the graphs. The relative performance of the
local reputation system is given by the data point for a quorum weight of 0.
Figure 5.6 presents the effects of varying the quorum weight, wQ. It shows results
for both with whitewashing (WW ) and without, both Neighbor (Nbr) and Friend
(Frd) voting, and both the selfish lying (L) and colluding (C) malicious opinion-
sharing models. The no-misbehavior (N) curves mirrored the L curves, performing
only marginally better across all experiments, and are not graphed. In the
selfish lying model, malicious nodes give themselves a rating of 1 and all others a
rating of 0. Since malicious nodes cannot vote for themselves and give everyone else
an equal rating of 0, they do not greatly impact a vote in favor of malicious nodes.
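The exact combination rule is defined earlier in the dissertation; purely as an illustrative sketch (the function name and linear-blend form are our assumptions), a rating that weighs each voter's opinion by the requester's rating of that voter, and that reduces to the local system at wQ = 0, can be written as:

```python
def voted_rating(local_rating, votes, w_q):
    """Blend a requester's own opinion of a candidate with its quorum's.

    local_rating -- requester's current rating of the candidate provider
    votes        -- (voter_weight, voter_opinion) pairs, where voter_weight
                    is the requester's rating of that voter; zero-weight
                    voters (e.g. unknown peers when rho_0 = 0) are ignored
    w_q          -- weight given to the quorum's opinion (0 => local system)
    """
    total = sum(w for w, _ in votes)
    if total == 0:  # no usable votes: fall back to the local opinion
        return local_rating
    quorum = sum(w * r for w, r in votes) / total
    return (1 - w_q) * local_rating + w_q * quorum
```

Note how a selfish liar with weight 0 cannot move the quorum average, matching the observation above that such nodes do not greatly impact a vote.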
Note that the values of rV in Figure 5.6 are relatively high. For example, an rV
value of 3 means we would expect to download and verify three files for each
query. In an actual system, it may not be feasible for a node to thoroughly check each
downloaded file’s authenticity. The node may simply trust the file to be valid. In this
case, rV can be viewed as the inverse probability that such a file is valid. Accounting
for well-behaved nodes offering bad copies of files complicates the threat model. We
have conducted experiments with this assumption and, as long as the probability of
a good node offering a bad file is small, it does not noticeably affect our results.
Observing the drop in rV from wQ = 0 to wQ = 0.05, we conclude that incorpo-
rating other nodes’ opinions tends to improve the efficiency of the system. Except
when malicious nodes collude to subvert the voting process, varying the weight of the
voters' opinions beyond wQ = 0.05 has no effect on the system performance. This
behavior indicates that the greatest benefit from voting comes when the local node
has no opinion of its own. When bad nodes collude (C), system performance
decreases as the weight given to the quorum's opinion increases, reinforcing
that there is no substitute for personal experience in an untrusted environment.
Comparing the Frd family of curves to the Nbr curves within the same white-
wash scenario (e.g. Frd L vs. Nbr L), we clearly see that Friend-voting outperforms
Neighbor-voting. Nodes that have given you good service in the past have demon-
strated some effort to be reliable and well-behaved. Asking them for their opinions
is more reasonable than relying on one’s neighbors, a third of which, in this scenario,
are likely to be bad. Not only does Neighbor-voting not perform as well, but it is
more susceptible to malicious collusion as neighbors’ opinions are given more weight
(see Nbr C curve). Friend-voting, however, tends to avoid asking malicious nodes for
their opinions, mitigating the effects of collusion.
Though the whitewash scenario performs worse than no whitewashing, it can
benefit more from opinion-sharing. As the Frd WW curves between wQ = 0 and
[Figure: Verification Ratio (rV) vs. Size of Friend-Cache; curves: No WW, WW]

Figure 5.7: Efficiency of the voting reputation system with respect to Friend-Cache size (FC).
wQ > 0 illustrate, efficiency for Friend-voting improves by a factor of 3 over the local
reputation system. The Nbr L/C WW curve shows that Neighbor-voting in the WW
scenario is almost completely unaffected by opinion-sharing, no matter the malicious
opinion-sharing model. As stated before, in the WW scenarios an initial reputation
rating of 0 is assigned to unknown nodes. Since this value is used for weighing the
opinions of the voting nodes, any unknown peer in the neighbor quorum (including
malicious nodes that have whitewashed) will have their votes ignored. Because the
average number of neighbors is small (approx. 3.1), the probability of a well-behaved
neighbor providing a query response that is tested, and thus becoming “known” and
having its opinion used, is low. In contrast, in the no WW scenario, since ρ0 = 0.3,
even untested peers’ opinions are considered, explaining its poor performance when
bad nodes collude.
In summary, this experiment shows that choosing a relatively small quorum weight
around 0.1 with Friend-voting improves performance by a factor of 2 or more across
all scenarios. But how many reputable nodes should one keep in the Friend-Cache?
Does increasing the size of the Friend-Cache always result in better efficiency? In a
real system, a larger cache means a greater maintenance cost in periodically checking the
liveness of the nodes in the cache. Is this cost always justified?
Figure 5.7 shows the performance of both the whitewashing and no whitewashing
scenarios for various Friend-Cache sizes (FC) with no bad opinion-sharing (N).11
Both scenarios stabilize so that increasing the size of the cache yields no performance
improvement, but a system dealing with whitewashing benefits from a larger cache.
For instance, while a Friend-Cache of 10 is sufficient when there is no whitewashing,
the whitewash scenario can benefit from a cache as large as 25. As expected, when
tested with the malicious opinion-sharing models (N , L, C), all three models produced
similar rV values, with the C values slightly greater than those of the N and
L values by about 0.4 in the no WW scenario. Thus, we only plot the N curve in
Figure 5.7. A surprisingly small cache is needed for this technique to be efficient.
In Section 5.6.1 we used the Friend-Cache to choose peers to query directly before
flooding the network. Though there is little benefit from gathering opinions from
more than the 10 or 15 most reputable nodes, the traffic results indicate that we can
take advantage of Friend-Caches larger than 100. Should we use our entire large cache
for gathering opinions? No. Though a large Friend-Cache is easy to maintain (it is
a list of known nodes ordered by their reputation statistics), asking a large number
of nodes to share their opinions, either per query or periodically, will greatly increase
the amount of message traffic produced yet not improve our selection performance.
Thus, though we may maintain a large Friend-Cache for direct querying, we would
only ask the top nodes to participate in our quorum.
Friend-voting is effective against collusion because it only considers the opinions of
nodes that have demonstrated good behavior by providing good files. Given our threat model,
this quickly bars malicious nodes from the Friend-Cache. One technique malicious
nodes may employ to defeat Friend-voting would be to set up front nodes. These
nodes properly trade only authentic files, but when asked for their opinion of other
nodes, act according to the collusion model, C, promoting only malicious nodes.
11. FC = 0 corresponds to the local reputation system.
[Figure: Verification Ratio (rV) vs. Fraction of Malicious Front Nodes; curves: wQ=0.1, wQ=0.8]

Figure 5.8: Effects of front nodes on efficiency.
We have run simulations where a fraction of the malicious nodes are set to be front
nodes. We present the results for both a quorum weight of 0.1 and 0.8 in Figure 5.8.
These experiments show that, in the case of wQ = 0.8, front nodes can cause consid-
erable harm to the system. The damage peaks when 40% of the malicious nodes are
front nodes, decreasing the system performance by more than a factor of 3! For a
larger number of front nodes, rV steadily drops, indicating that too many malicious
nodes are behaving well to promote a smaller group causing actual damage. To be
optimally effective, attackers would need to use the right balance of front nodes and
actively malicious nodes. Surprisingly, front nodes appear to have no adverse effect
when wQ = 0.1. We believe this shows that a very low quorum weight limits the impact
of front nodes’ bad opinions sufficiently that the damage caused by front nodes
is negated by the benefit of having fewer actively malicious nodes.
Efficiency Comparisons
Given the results of our analysis on the voting parameters, we wish to evaluate the
system with respect to varying threat parameters. Specifically, we demonstrate how
overall efficiency is affected by varying the percentage of malicious nodes in the system
(πB). We have run similar experiments varying the probability of a unique file being
[Figure, two panels: (a) Voting system and (b) Local system; Verification Ratio (rV) vs. Fraction of Bad Nodes (πB); curves: Base, Weighted, Best, Weighted WW, Best WW]

Figure 5.9: Comparison of the efficiency of the two reputation systems with the random algorithm as a function of πB.
in the subversion set (pB) and obtained similar results and performance comparisons.
We test the voting system with wQ = 0.1 and FC = 10, and using the two
selection procedures both with and without whitewashing. Malicious nodes did not
lie or collude with their opinions (N). These experiments were also run using the
multi-source query generator for 100,000 queries. We evaluate the efficiency of the
reputation systems for values of πB between 0 and 0.4 using the default pB of 0.9.
It may seem unlikely that a network would have 40% malicious peers attacking
90% of the files. But in the real world, there are large entities, with access to vast
resources, which have an interest in subverting peer-to-peer networks. We have sim-
ulated across several degrees of malicious activity (varying both πB and pB) and
the relative performance of the different reputation system variants is comparable in
weaker threat scenarios to those presented here.
Figure 5.9(a) shows the performance of the voting reputation system. Clearly,
using any local statistics when selecting a provider results in significantly better ef-
ficiency than purely random selection (base case). While the base case climbed to
42.5 at 40% malicious nodes, the voting reputation system attained an efficiency of 2
(with no whitewashing), an improvement by a factor of 21! Whitewashing adversely affects
the performance of the system, but not as badly as expected. For example, with
a verification ratio of 4.5 the reputation system in the whitewash scenario performs
2.3 times worse than when there are no whitewashers. This means that on average
a node would have to fetch more than twice as many copies of a file before finding a
valid one, showing a clear advantage to preventing whitewashing by requiring users to
log in through a trusted authority that can verify each real user has only one system
identity.
We also executed the experiments using the local reputation system under equiv-
alent conditions (100 queries from a single querying node). The results, shown in
Figure 5.9(b), were only a factor of 2 worse than the voting system in
the non-whitewashing scenario.12 The performance difference between the two systems
was greater in the whitewashing scenario, a factor of 4. These results support
our findings in Section 5.6.2 that opinion-sharing is worthwhile in spite of its slightly
higher implementation complexity.
When comparing the performance of the Select-Best (Best) and Weighted selection
procedures in either graph of Figure 5.9, we see no large efficiency advantage of
one procedure over the other, though the Select-Best method outperforms Weighted
across all values of πB. As expected, selecting the best known provider is slightly
more efficient than probabilistically choosing a provider, but this comes at a cost,
which we discuss in the following section.
Load on Good Nodes
One critical issue is that reputation systems may unfairly burden some of the good
nodes in the network. Thus, we now look at the amount of load placed on well-
behaved nodes in the network in terms of the number of files they upload. We are
12. Note the difference in scale between the two graphs in Figure 5.9.
[Figure: Load on Good Nodes vs. Percentage of Bad Nodes (πB); curves: Expected, Weighted, Best, Weighted WW, Best WW]

Figure 5.10: Average load on well-behaved nodes as a function of πB.
interested only in the effect produced by requests from well-behaved nodes running
the algorithms correctly. We use the same setup as above but concentrate on the
scenario with no whitewashing.
Figure 5.10 plots the average load on the well-behaved nodes, as a function of
the fraction of malicious nodes in the network. In an ideal system with no malicious
nodes, we would expect exactly 1 download per query, giving a value of ℓG = 0.001
for a 1000-node network. In our case, when there are no malicious nodes, the value
of ℓG is 0.00098. This value is less than expected (shown by the Expected curve in
the graph) because a few queries go unanswered by any node in the network.
As the fraction of malicious nodes increases, so does ℓG. For instance, when
πB = 0.3 the average load is 0.00138. With only 70% as many good nodes to service
requests, we would expect ℓG = 1/(0.70 · 1000) = 0.00143. Both the fact that malicious
nodes provide some good files, and that the probability of a successful query is lower,
account for the difference between the observed and expected loads. Comparing
the two selection procedures shows an insignificant difference in average load. Both
procedures fetch the same number of files from good nodes overall.
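The expected-load baseline used above follows from a simple count: if every query is served by exactly one of the (1 − πB) · n good nodes, each good node's per-query load is as computed in this small sketch (function name ours).

```python
def expected_good_node_load(pi_b, n_nodes):
    """Expected per-query upload load on each good node, assuming every
    query is answered by exactly one of the (1 - pi_b) * n_nodes good nodes."""
    return 1.0 / ((1 - pi_b) * n_nodes)
```

With πB = 0 and n = 1000 this gives 0.001, and with πB = 0.3 it gives 1/700 ≈ 0.00143, the two reference values quoted above.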
Though there was little difference between the selection procedures in terms of
average load, it is important to consider the load distribution. In a homogeneous
[Figure, two panels: (a) Load on each node and (b) Load per file on each node; left y-axis: load (per file) on good nodes; right y-axis: number of files offered; x-axis: good nodes ordered by load (logscale); curves/points: Base, Weighted, Best]

Figure 5.11: Distribution of load on good nodes (and their corresponding number of files shared). The x-axis corresponds to nodes sorted by amount of load in (a) and load per document stored on the node in (b) (note logscale axis). The curves relate to the left y-axis and specify the amount of load measured at each node. The points map to the right y-axis and indicate the number of documents on the corresponding node.
network where all nodes have similar bandwidth, it is preferable if load is distrib-
uted evenly across all nodes, as opposed to a few nodes handling most of the traffic
while the majority are idle. To study load distribution we measured the load (using
Eq. 5.6) on each individual node using the two voting-based reputation system selec-
tion procedures and the random selection algorithm. The values were then sorted in
descending load order. The results, averaged across 10 runs with different seeds, are
shown by the three line curves on the left y-axis in Figure 5.11(a). Here we see that,
using the Select-Best selection procedure, the most heavily loaded node (with rank
1) has a load of almost 0.015. This value is more than 10 times the average load of
0.00138.
Though both selection procedures incurred greater load on the highest ranked
nodes than the base case, Select-Best concentrated the load on a few nodes while
Weighted distributed the load better. The maximum load on a node with the Select-
Best method was almost twice that of Weighted. This is expected since Select-Best
locates a few good nodes and tries to reuse them when possible, while the Weighted
model encourages fetching files from new nodes (broadening its pool of known good
nodes). If load-balancing in a homogeneous system is an important requirement,
then the Weighted selection procedure would be preferable.
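The two selection procedures compared here can be sketched as follows. This is our paraphrase of the selection step (the simulator's actual code is not shown in the text), taking each responder's reputation rating as a nonnegative score.

```python
import random

def select_best(responders):
    """Select-Best: always pick the responder with the highest rating."""
    return max(responders, key=lambda nr: nr[1])[0]

def select_weighted(responders, rng=random):
    """Weighted: pick a responder with probability proportional to its
    rating, spreading load over the pool of known good nodes."""
    total = sum(r for _, r in responders)
    if total == 0:
        return rng.choice(responders)[0]  # no information: pick at random
    x = rng.uniform(0, total)
    for node, r in responders:
        x -= r
        if x <= 0:
            return node
    return responders[-1][0]
```

Select-Best reuses the few proven nodes it knows, concentrating load; the weighted draw occasionally fetches from lesser-known nodes, broadening the pool at a small cost in efficiency.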
Another factor to consider is how load relates to the number of files shared by
each node. It would be expected that good nodes with more files are more likely to
be able to answer queries, increasing the number of files they upload and thus their
load. Figure 5.11(a) plots as points the number of files on each good node on the
right-hand y-axis. For example, for rank 1, there are three points around 38,000. This
means that, for all three systems, the most heavily loaded node shared an average of
around 38,000 files. As expected, all distributions show a strong correlation between
nodes sharing more files and higher load.
In Figure 5.11(b) we divide the load on each node by the number of files it provides
and reorder the distribution. For instance, the node at rank 1 has a load per file of
2.8 × 10^−6 for the Weighted selection procedure, but only 2.2 × 10^−6 for the Select-
Best procedure. The result is surprising. The Select-Best method generated much
less load per file than the Weighted or random methods. To understand this result
we again plot the number of files offered by each node on the right y-axis. Here we
see two trends. The base case and the Weighted method both curve from the bottom
left upwards, showing that the nodes with highest load per file offer very few files.
This effect is due to the sublinearity of the answering power of a node with respect
to the number of files it is offering. For example, if node i has twice as many files as
node j, we expect node i to be able to answer less than twice as many queries as j. In
general, given a probability p that any individual file in the system matches a query,
the probability that a node with f files can respond to a query equals 1 − (1 − p)^f.
In a purely random selection model this probability is an indicator of the expected
load on a node; as f increases, so does the probability, and thus the likely load. This
is corroborated by our results in Figure 5.11(a). Now if we divide this probability by
f we have an indicator for the load per file: (1 − (1 − p)^f)/f. This equation has a maximum
value when f = 1 and decreases as f increases. This explains the behavior we see
from the random base case and the Weighted case in Figure 5.11(b).
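Both quantities are easy to check numerically. The sketch below (function names ours) evaluates the answer probability 1 − (1 − p)^f and the per-file load indicator, which peaks at f = 1 and decreases as f grows.

```python
def answer_probability(p, f):
    """Probability that a node holding f files answers a query, when each
    file independently matches the query with probability p."""
    return 1 - (1 - p) ** f

def load_per_file(p, f):
    """Per-file load indicator under random selection: (1 - (1-p)^f) / f."""
    return answer_probability(p, f) / f
```

For p = 0.01, a 1-file node has per-file load 0.01, a 10-file node about 0.0096, and a 1000-file node about 0.001, reproducing the downward curve seen for the random and Weighted cases.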
The Select-Best method, on the other hand, shows a different trend. The most
heavily loaded (per file) nodes share a very large number of files. The Select-Best
procedure selects nodes which have proven reliable in the past. This behavior favors
well-behaved nodes which respond to queries early in the simulation and often, nodes
sharing many files. This procedure gives nodes with many files an even greater chance
of being chosen with respect to the random model.
Whether or not it is desirable to send greater traffic to nodes with more files
is dependent on the environment. Some have suggested that in some peer-to-peer
systems, the number of files a node offers correlates to its available bandwidth. If so,
using the Select-Best selection procedure, which gives preference to nodes with more
files, may result in more effective bandwidth usage. But if peers have similar resource
constraints or fair load-balancing is a priority, then we would prefer the Weighted
selection procedure, which better equalizes load yet is almost as efficient at locating
authentic documents.
Susceptibility to Attack
In addition to fairness, a skewed load distribution also raises concerns with respect
to security. If a smaller number of peers are providing a larger portion of the net-
work services, these peers become easy, tempting targets for malicious entities. Once
highly-loaded, well-behaved nodes are detected, an adversary can mount a network
Denial of Service attack directed at these nodes in order to shut them down, or at-
tempt to subvert the nodes for its own purposes through security exploits. A more
balanced load distribution makes it harder to detect which peers are providing the
most resources. In addition, more of these nodes would have to be subverted in order
to do the same amount of damage to the network.
As stated earlier, we did not investigate DoS attacks or node subversion in this
study. However, it is important to consider these issues when choosing system parameters,
such as the selection procedure. In essence, using the Select-Best procedure
weakens one of the most important traits of P2P systems: robustness through widely
distributed and replicated files and resources. Any parameter that influences diversity
in peer selection will have repercussions on both load balancing and risk from point
attacks. For example, as discussed earlier, raising the initial reputation rating results
in more inauthentic file accesses. However, lowering it will reduce the likelihood
of discovering new well-behaved peers, thus increasing load skew and the system’s
susceptibility to other malicious attacks.
5.6.3 Node-based Threat Model
Here we present the results of experiments using the node-based threat model, built
on the threat matrix. We evaluate the local reputation system and the ideal reputation
system, which uses the threat matrix T as its reputation matrix. All results in this
section were performed with no node turnover or whitewashing.
This threat model allows us to perform a statistical analysis of the expected long-term
performance of the reputation system variants, which we present in Appendix 5.7.
This analysis gives the expected system behavior in steady state after
running for a sufficiently (perhaps infinitely) long time. Comparing the analytical results
with those presented in the next section gives us an understanding of the inherent
limitations of each of the reputation system variants. The statistical analysis
assumes no node whitewashing or node turnover. For more information please see the
appendix.
[Figure, three panels of Verification Ratio (rV): (a) vs. Percentage of Bad Nodes (πB), (b) vs. Probability of Bad Node Sending Fake Response (pB), (c) vs. Probability of Good Node Sending Authentic Response (pG); curves: Base, Ideal Weighted 0.0, Ideal Best 0.0, Ideal Weighted 0.2, Ideal Best 0.2, SL Weighted 0.0, SL Best 0.0, SL Weighted 0.2, SL Best 0.2]

Figure 5.12: Comparison of the efficiency of the local and ideal reputation systems under the node-based threat model. Lower is better. 1 is optimal.
Efficiency of the Reputation Systems
We first present the results of varying the three threat parameters, πB, pB and pG.
Figure 5.12 shows that the local reputation system performs quite well, matching
the associated ideal system, and even surpassing it in some situations. Though the
Weighted and Select-Best methods achieve equal efficiency, the use of a selection
threshold dramatically improves performance, allowing it to maintain a verification
ratio under 2.5.
The ideal system likewise benefited from a selection threshold, allowing the sys-
tem to maintain a near perfect ratio of approximately 1.01. The improvement in
performance due to the threshold reinforces our observations of the large number of
queries that returned no authentic documents, only fakes. As expected, the selection
threshold for the ideal case is useless once pB > ρT and it performs as if there is no
threshold (Figure 5.12(b)).
The most interesting observation is the shape of either Ideal Weighted curve. We
see that the verification ratio peaks when pB is around 0.6. The reason the Ideal
system performs badly in this situation is because, knowing the values of the threat
matrix, it expects bad nodes to reply correctly to 40% of the queries. In reality,
malicious nodes reply falsely to around 60% of the queries, but only reply correctly
to a small fraction of the other 40% of the queries. This is because a node can only return
a valid response for a document it holds, and each node holds only a small fraction of all
the unique documents in the system. This shows that the “ideal” reputation system is
not as good as we may have originally believed.
Comparing the SL curves in the graphs in Figure 5.12 to the corresponding ones
for the document-based threat model and the login server identity model, shown in Figure
5.4, we see that the relative performance stays the same. The behavior of the systems under
both threat models is quite similar, especially with respect to varying the common threat
parameters, πB and pB. In fact, we repeated the experiments for determining the
optimal values for the initial reputation rating and selection threshold (under both
identity models) but the results were so similar, we feel it would be redundant to
include them.
Distributed Node Ratings
In most experiments all well-behaved nodes have the same probability, pG, of sending an authentic document or rating. Malicious nodes, likewise, all have the same rating of pB. These values are expected to be far apart, allowing algorithms to more easily locate and isolate the two groups. But how do the reputation systems behave when the node behaviors are not so distinct?

184 CHAPTER 5. P2P REPUTATION SYSTEM METRICS

Figure 5.13: Comparison of the efficiency of the reputation systems with node threat ratings uniformly distributed in an interval of length 0.7 around pB or pG. (a) Verification ratio rv as a function of pB; (b) as a function of pG. [Plots compare the Base, Ideal, and SL systems, Weighted and Best variants, with thresholds 0.0 and 0.15.]
To test this, we randomize the node ratings in the threat matrix. Each node’s
rating is chosen from a uniform distribution with an interval of size 0.7 centered on
pB or pG, depending on whether it is a good or bad node. For example, if pB = 0.4
then malicious nodes will be assigned ratings in the range of 0.05 to 0.75 and the
average value will be 0.4. For values of pB and pG near 0 or 1, the interval cannot extend a full 0.35 in each direction from the center. Rather than shortening the interval equally on both sides of the center, we cut the interval abruptly at 0 or 1.
This results in a shift of the interval center (and average) away from the intended
center. For example, when pB = 0.1, then the values of malicious nodes are chosen
uniformly from the interval [0, 0.45], resulting in an average value of 0.225.
Figure 5.13 shows the results of simulations run with the scattered threat ratings, for varying pB and pG values. Comparing Figures 5.13(a) and 5.13(b) to Figures 5.12(b) and 5.12(c) respectively, we see little difference in the performance of
the base case and all the local variants.
Figure 5.14: Comparison of the local reputation system with ρT of 0.0 and 0.15 and the base case over time. The simulation was run for 1000 queries and statistics were collected every 50 queries. (a) Ratio of authenticity checks to successful queries in each 50-query interval; (b) distance between reputation and threat matrices (lower is better; 0 is optimal).
Convergence Over Time
The next tests we ran involved collecting statistics during each simulation run in order
to evaluate the change in performance over time and to see if the reputation matrix
converges to the threat matrix. In each set of simulations the simulation ran for Q
total queries and statistics were gathered every δ queries. The graphs measuring the
verification ratio over time compute the ratio based only on the number of authenticity
checks and successful queries in the last δ queries. The graphs measuring T-R distance
give the calculated T-R distance at the current time (i.e., after δ queries, 2δ queries,
etc.).
The simulations of Figure 5.14 ran for Q = 1000 queries with δ = 50. In them we compare the three reputation systems using only the Select-Best procedure, with ρT of 0 and 0.15 for the ideal and Simple Local cases.
Though in Figure 5.14(a) the values appear to vary randomly, notice that the local
curves appear to converge towards 1. This indicates that the statistics the reputation system is gathering allow it to make better decisions in the future when selecting
responses. The base case and the ideal case with no threshold, as expected, do not
show this convergence since they do not “learn” from previous experiences. The ideal
case with threshold performs well enough to stay near 1 for the entire run.
In Figure 5.14(b) we see the T-R distance during the same test. The ideal case's distance is trivially 0, and the base case is not present (since the Random algorithm does not maintain statistics). We see that both local curves converge to a value of almost
0.0004. The curve corresponding to ρT = 0.0 seems to converge faster and to a
slightly smaller distance than that for ρT = 0.15. Since the local system with no threshold performs worse in terms of the number of authenticity checks it must make, it collects statistics about other nodes faster and therefore converges faster and with a bit more accuracy. But once it locates a pool of good nodes, the Select-Best procedure
will always attempt to pick from this group. Therefore it will not choose documents
from other nodes if it can avoid it, and will not collect new statistics other than refining the reputation ratings of the good nodes.
To see if gaining more varied statistics results in faster and better convergence
we ran a similar longer test with Q = 10000 queries and δ = 100. We included the
Weighted procedure to see if its ability to choose a node that is not necessarily the best
known node will help it minimize the T-R distance. We also modified the base case
to collect statistics in order to calculate a reputation for each node, but still not use
the information in the selection process. Since the base case performs the worst by
performing the most authenticity checks, we would expect it to collect the most data
and converge faster and better than the rest.
Figure 5.15(a) shows the verification ratio over time. Though the local Select-Best
curves converge quickly to 1, the Weighted versions periodically jump to high values,
resulting in worse performance.
In Figure 5.15(b), all the curves converge at about the same rate. But both
ρT = 0.15 curves converge to a higher T-R distance than the rest. This is as expected given the previous results.

Figure 5.15: Comparison of the local reputation system with both Weighted and Select-Best variants and a selection threshold of 0.0 and 0.15 and the base case over time. The simulation was run for 10000 queries and statistics were collected every 100 queries. (a) Ratio of authenticity checks to successful queries in each 100-query interval; (b) distance between reputation and threat matrices over 10000 queries; (c) distance between reputation and threat matrices between 9000 and 10000 queries.

Figure 5.15(c) shows a more detailed view of the end of
the simulation run. Interestingly, the Weighted version with no threshold and the
base case did not converge faster or to a lower distance than the Select-Best version with no threshold, even though they both performed many more authenticity checks.
This may be because, though the Weighted and base case collect more statistics, the
statistics are across a larger number of nodes, while the Select-Best case leaves more
nodes as undefined. Since the undefined nodes are not included in our measure of distance, a system that concentrates its information on a smaller number of nodes will exhibit a lower T-R distance than one spreading the same amount of information (or even more) across a much larger number of nodes. A different method of calculating distance, one which rewards having fewer undefined nodes, would likely improve the measure for the badly performing systems.
Dynamic Misbehavior
A suggested strategy for malicious nodes in a reputation system is to behave cor-
rectly for some period of time and accrue a positive reputation, then begin acting
maliciously. What effect does such a scheme have on efficiency? Can good nodes
detect such misbehaving nodes? If so, how quickly? To determine the effects of this
strategy we devised a test where malicious nodes behaved correctly for 1000 queries
from a single source. At that time all bad nodes began misbehaving. The simulation
then continued for 1000 more queries. The verification ratio was calculated every
50 timesteps using the cumulative A(D) and qsucc at that time, except for the dynamic bad nodes, for which the verification ratio after 1000 queries is calculated using statistics gathered since the nodes began misbehaving at time 1000.
For comparison we graph the behavior of the standard static threat model where
the malicious nodes misbehave for the entire simulation of 1000 queries. Figure 5.16(a)
shows that the reputation systems quickly stabilize to a steady state.

Figure 5.16: Comparison of the efficiency of the reputation systems over time. (a) Regular static behavior (1000 queries total); (b) bad nodes begin misbehaving after 1000 queries (2000 queries total).

The ideal system with threshold performs the best (almost 1) for the entire simulation, but the
local system with threshold quickly converges to near optimal.
Comparing it now to the dynamic misbehavior scenario in Figure 5.16(b) we see
that the reputation systems perform just as well, even though the malicious nodes
have had time to build up a good reputation. The one interesting difference is the worse performance of the Weighted procedure as opposed to Select-Best. With
Select-Best as soon as the querying node fetches and checks one fake document from
a malicious node, that node’s rating will be lowered significantly so that other nodes
will always be selected before it in the future, if possible. On the other hand, with
the Weighted method, a malicious node is likely to be selected multiple times before
its reputation rating is lowered sufficiently that it is unlikely to be chosen in future
queries.
To illustrate this, let node A be a good node and node M be a dynamic malicious
node. Say that both have provided 4 good documents during the period of time that
M behaved well to accrue reputation and so both have a reputation of 1.0. Now say
that M delivers an inauthentic document. Its rating will drop to 4/5 = 0.8. While Select-Best will never pick M if A has also replied, with the Weighted method A is
only 25% more likely to be selected than M . In fact after 3 more false responses
from M , its rating has only dropped to 0.5 and is only half as likely to be chosen
as A. This demonstrates a weakness of the simple reputation rating function used.
In situations with dynamic malicious nodes, reputations based only on the statistics gathered over the last x queries would perform better, but would require more state to be maintained.
The reason why Weighted does not perform too badly in this dynamic test is
that in a period of only 1000 queries, few malicious nodes had the opportunity to
provide more than 1 or 2 correct answers to build their reputation statistics. If
the misbehaving nodes were to behave well for a longer period, the performance gap between Select-Best and Weighted would be greater. But if malicious nodes behave
well for very long periods of time, they would be very ineffective at disrupting the
network.
5.7 Statistical Analysis of Reputation Systems
In this section we provide a derivation for our statistical analysis of the steady-
state behavior of the reputation systems and their variants presented in this chapter.
We begin by presenting the variables and equations dictating the expected system
behavior. Next, we empirically compute values for certain parameters. Finally we
calculate the expected performance of the different systems. Specifically, we are
interested in the steady-state verification ratio of a system, which is approximately
equal to the expected number of documents fetched and checked for each query.
5.8 Equations
In the experiments all the network topologies were static. Therefore, for a given TTL and network topology, the set of nodes reached by a flood query is always the same for a given node, though it differs between nodes and topologies. Let nflood(i) be the number
of nodes reached by a query from node i. Let nflood be the average number of nodes
queried across all possible nodes in all possible network configurations. We empirically
estimate this value in the following section.
For the analysis we are interested in computing the expected probability of a node
being able to reply to a query. In the simulations the probability of a node having a
document matching a query q is
pN(d, q) = 1 − (1 − pD(q))^d   (5.11)
where pD(q) is the probability of any document in the system matching query q
and d is the number of documents stored at the specified node. The probability of
a document matching a query is determined by the query model using the query
popularity and query selection power distributions [138]. The number of documents
on a node is chosen from the file distribution sample collected by Saroiu [119]. The
expected probability of a node answering a query, pN , is computed experimentally.
More details are given below.
• The expected number of bad nodes that hear a query is nfloodπB.
• The expected number of good nodes that hear a query is nflood(1− πB).
• The probability of a good node replying with an authentic document is pNpG.

• The probability of a good node replying with a fake document is pN(1 − pG).

• The probability of a bad node replying with an authentic document is pNpB.

• The probability of a bad node replying with a fake document is 1 − pB.
Notice that the probabilities of sending an authentic document and sending a fake
document do not add up to 1. The remainder is the probability of the node not
replying. Also note the difference between good and bad nodes in the probability of
replying with a fake document. Because bad nodes can generate fake documents even
when they do not have a query match, pN is not taken into account.
From these equations we can derive formulas for the expected number of docu-
ments in a query’s response set, how many come from good or bad nodes, and how
many are authentic or not.
• Expected number of authentic documents received for a query

– From good nodes: dAG = nflood(1 − πB)pNpG

– From bad nodes: dAB = nflood πB pN pB

– Total: dA = nflood pN(pG(1 − πB) + pBπB)

• Expected number of fake documents received for a query

– From good nodes: dFG = nflood(1 − πB)pN(1 − pG)

– From bad nodes: dFB = nflood πB(1 − pB)

– Total: dF = nflood(pN(1 − pG)(1 − πB) + (1 − pB)πB)

• Expected number of total documents received for a query

– From good nodes: dTG = nflood(1 − πB)pN

– From bad nodes: dTB = nflood πB(1 − pB + pBpN)

– Total: dT = nflood(pN(1 − πB) + πB(1 − pB + pBpN))
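These expectations translate directly into code. The following sketch (function and variable names are ours, not the simulator's) computes the expected document counts from the threat parameters:

```python
def expected_documents(n_flood, pi_B, p_N, p_G, p_B):
    """Expected numbers of documents per query, following the items above.
    Names: d_AG = authentic from good nodes, d_FB = fake from bad nodes, etc."""
    good = n_flood * (1 - pi_B)      # good nodes hearing the query
    bad = n_flood * pi_B             # bad nodes hearing the query
    d_AG = good * p_N * p_G
    d_AB = bad * p_N * p_B
    d_FG = good * p_N * (1 - p_G)
    d_FB = bad * (1 - p_B)           # no p_N factor: fakes are generated
    return {
        "d_AG": d_AG, "d_AB": d_AB, "d_A": d_AG + d_AB,
        "d_FG": d_FG, "d_FB": d_FB, "d_F": d_FG + d_FB,
        "d_TG": d_AG + d_FG, "d_TB": d_AB + d_FB,
        "d_T": d_AG + d_FG + d_AB + d_FB,
    }

# Illustrative parameters (n_flood and p_N as estimated in Section 5.9).
d = expected_documents(n_flood=3950, pi_B=0.3, p_N=0.1090, p_G=0.95, p_B=0.2)
# Sanity check: the totals decompose exactly as in the formulas above.
assert abs(d["d_T"] - (d["d_A"] + d["d_F"])) < 1e-9
```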
Another set of equations which will be necessary gives the probabilities that a document received from a good or bad node is authentic. Each is the expected number of authentic documents from a good/bad node divided by the expected number of total documents from a good/bad node. This gives the following three equations:
P(DG = A) = dAG/dTG = pG   (5.12)

P(DB = A) = dAB/dTB = pBpN/(pBpN + 1 − pB)   (5.13)

P(D = A) = dA/dT = pN(pG(1 − πB) + pBπB) / (pN(1 − πB) + πB(1 − pB + pBpN))   (5.14)
For good nodes the probability is simply pG since they only reply with documents
they own, though a small percentage are fake. Bad nodes, on the other hand, can
generate false documents.
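Equations 5.12–5.14 can be evaluated numerically; a sketch with arbitrary illustrative parameter values:

```python
def authenticity_probabilities(pi_B, p_N, p_G, p_B):
    """P(document from good / bad / any node is authentic), Eqs. 5.12-5.14."""
    p_good = p_G                                       # Eq. 5.12
    p_bad = p_B * p_N / (p_B * p_N + 1 - p_B)          # Eq. 5.13
    p_any = (p_N * (p_G * (1 - pi_B) + p_B * pi_B)
             / (p_N * (1 - pi_B)
                + pi_B * (1 - p_B + p_B * p_N)))       # Eq. 5.14
    return p_good, p_bad, p_any

p_good, p_bad, p_any = authenticity_probabilities(0.3, 0.1090, 0.95, 0.2)
# A document from a bad node is rarely authentic, because most of its
# replies are generated fakes rather than owned documents.
print(round(p_bad, 4))  # 0.0265
```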
5.9 Empirical Estimations
We estimate nflood to be 3950 after 10,000 samples from 100 nodes in 100 power-law topologies of 10,000 nodes with an average degree of approximately 3.1 per node.
To compute pN we generate 4,000,000 random samples from both our query popu-
larity and query selection power distributions, and the sample document distribution
collected by Saroiu [119] (described in Section 5.5). For each generated value of pD(q)
and d we calculate pN(d, q) using Equation 5.11, and average across all samples. The
experimental estimate for pN was 0.1090, or approximately 10% probability that a
node can answer a query.
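The estimation procedure can be sketched as follows. We do not reproduce the actual query model or Saroiu's measured document distribution, so the Pareto draws below are placeholders and the resulting estimate will not match 0.1090:

```python
import random

def p_node(d, p_d):
    """Equation 5.11: probability that a node holding d documents
    has at least one document matching the query."""
    return 1 - (1 - p_d) ** d

def estimate_p_n(samples=100_000, seed=0):
    """Average p_node(d, p_d) over random draws of p_d and d."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # Placeholder heavy-tailed draws; the thesis instead samples the
        # query popularity and selection power distributions and the
        # Saroiu documents-per-node data.
        p_d = min(1.0, 0.001 / rng.paretovariate(1.2))  # per-document match prob.
        d = int(rng.paretovariate(1.1))                 # documents on the node
        total += p_node(d, p_d)
    return total / samples

print(estimate_p_n())
```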
5.10 Long-Term Reputation System Performance
In this section we calculate the expected performance of each of the reputation systems
and their variants after running for a very long period and having settled into a
steady-state. For simplicity, this analysis makes two assumptions. First, all nodes
maintain their public identities for all time. Second, the network topologies do not
change. These assumptions are not expected to be realistic. For discussion of system
performance in realistic scenarios, see the experimental results. We are interested in
the steady-state performance as a guide to optimal performance of each system. A
comparison of the experimental results to our statistical analysis should indicate how quickly the systems converge toward the steady-state behavior.
To determine the long-term efficiency of the reputation systems we are interested
in calculating the expected verification ratio. This ratio is the number of documents
that must be verified for authenticity for each successful query before an authentic
one is found. Above, we determined the expected number of documents of each type
received in response to a query. For all reputation systems, the expected verification
ratio will be a variation of the expected number of documents that must be chosen
from a subset of the responses, without replacement, until an authentic one is found.
Since the expected number of documents received in response to a query is large, we can approximate this problem by its with-replacement counterpart [36]. Given that the probability of choosing an authentic document on the first try is p, we assume that each successive attempt at choosing an authentic document also has probability p of success. The expected number of documents that must be checked before locating an authentic document, with replacement, is then
E(X) = Σ_{x=1..∞} x p(1 − p)^(x−1) = 1/p   (5.15)
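The with-replacement expectation E(X) = Σ x p(1 − p)^(x−1) = 1/p is easy to check numerically; a minimal sketch:

```python
def expected_checks(p, terms=10_000):
    """Truncated sum of x * p * (1 - p)**(x - 1), which converges to 1/p."""
    return sum(x * p * (1 - p) ** (x - 1) for x in range(1, terms + 1))

# The truncated sum matches 1/p for a range of success probabilities.
for p in (0.1, 0.25, 0.5):
    assert abs(expected_checks(p) - 1 / p) < 1e-6
```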
This is the formula we use in the analysis below.
We also assume (as in most experiments) that all well-behaved nodes have a threat
rating of pG and all malicious nodes have a threat rating of pB, and that their threat
ratings do not change over time (or at least that the systems reach a steady-state
between changes).
The equations in Section 5.8 give the expected number of documents per query. As stated before, the verification ratio rv is proportional to the number of successful
queries, not total queries. We account for this distinction by calculating a factor Q,
which is the fraction of total queries expected to be successful. Say we issue θ queries.
The expected number of successful queries would be Q·θ. Each query will fetch 1/p documents on average, so the total number of documents fetched will be (1/p)·θ. Plugging these values into Equation 5.4 gives

rv = ((1/p)·θ)/(Q·θ) = (1/p)/Q,   i.e.,   rvQ = 1/p   (5.16)
For each system, we give equations for:
• The probability of choosing an authentic document on the first try, p (when
applicable).
• The fraction of total queries expected to be successful, Q.
• The expected verification ratio multiplied by the Q factor.
5.10.1 Random base case
All responses are equally likely to be examined.

p = dA/dT   (5.17)

Q = 1 − (1 − P(D = A))^dT   (5.18)

Expected rvQ = 1/p = dT/dA = (pN(1 − πB) + πB(1 − pB + pBpN)) / (pN(pG(1 − πB) + pBπB))   (5.19)
5.10.2 Select-Best/Weighted ideal case with threshold
We assume all malicious nodes’ ratings fall below the selection threshold and so only
answers from good nodes will be considered. Since we assume all good nodes have the same threat rating, they will all be equally likely to be chosen by both the Select-Best and Weighted methods.
The probability of choosing an authentic document on the first try is

p = dAG/dTG   (5.20)

Q = 1 − (1 − P(DG = A))^dTG   (5.21)

Expected rvQ = 1/p = dTG/dAG = 1/pG   (5.22)
5.10.3 Weighted ideal case without threshold
All responses are considered but are weighted based on the rating of the sending
node. Given we have received dTG responses from good nodes and dTB responses
from bad nodes, then the probability of choosing a response from a good node with
the weighted method is pGdTG/(pGdTG + pBdTB). Likewise, the probability of choosing a response from a bad node is pBdTB/(pGdTG + pBdTB). The probability that a random document from a good node is authentic is simply dAG/dTG, and the probability that a document from a bad node is authentic is dAB/dTB. Combining these formulas we get the probability of choosing an authentic document using the weighted method:

p = (pGdTG)/(pGdTG + pBdTB) · dAG/dTG + (pBdTB)/(pGdTG + pBdTB) · dAB/dTB = (pGdAG + pBdAB)/(pGdTG + pBdTB)   (5.23)

Q = 1 − (1 − P(D = A))^dT   (5.24)

Expected rvQ = (pGdTG + pBdTB)/(pGdAG + pBdAB) = (pGpN(1 − πB) + pBπB(1 − pB + pBpN)) / (pG²pN(1 − πB) + pB²pNπB)   (5.25)
5.10.4 Select-Best ideal case without threshold
With no threshold all responses may be looked at. First, the responses from good
nodes will be checked one at a time. If no authentic document is found, then the
responses from bad nodes are checked. If there are responses from good nodes, then
the probability of locating an authentic document on the first try is P (DG = A), or
pG, on the second try (1− pG)pG, on the third try (1− pG)2pG, and so on. Let γ be
the number of responses from good nodes (dTG), and β be the number of responses
from bad nodes (dTB). The probability of the first authentic document found being
the first document from a bad node checked would be (1 − pG)γP (DB = A). The
probability of the first authentic document found being the second document from a
bad node checked would be (1− pG)γ(1− P (DB = A))P (DB = A).
We can now calculate the expected number of documents which must be down-
loaded and checked per query:
Expected rvQ = Σ_{k=1..γ} k pG(1 − pG)^(k−1) + Σ_{k=1..β} (γ + k)(1 − pG)^γ P(DB = A)(1 − P(DB = A))^(k−1)   (5.26)
By applying the well-known equations [77]

Σ_{0≤j≤n} a x^j = a(1 − x^(n+1))/(1 − x)   and   Σ_{0≤j≤n} a j x^j = a(n x^(n+2) − (n + 1)x^(n+1) + x)/((x − 1)²)   (5.27)
on Equation 5.26 and simplifying we obtain
E[rvQ] = 1/pG + (1 − pG)^γ [1/P(DB = A) − 1/pG − (1 − P(DB = A))^β (1/P(DB = A) + β + γ)]   (5.28)
Substituting dTG and dTB for γ and β gives us

E[rvQ] = 1/pG + (1 − pG)^dTG [1/P(DB = A) − 1/pG − (1 − P(DB = A))^dTB (1/P(DB = A) + dTB + dTG)]   (5.29)
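Equation 5.28 can be verified by evaluating the double sum of Equation 5.26 directly; a sketch using integer γ and β:

```python
def direct_sum(p_G, p_ba, gamma, beta):
    """Expected checks per Eq. 5.26: good nodes' responses are tried
    first, then responses from bad nodes. p_ba is P(DB = A)."""
    s = sum(k * p_G * (1 - p_G) ** (k - 1) for k in range(1, gamma + 1))
    s += sum((gamma + k) * (1 - p_G) ** gamma
             * p_ba * (1 - p_ba) ** (k - 1)
             for k in range(1, beta + 1))
    return s

def closed_form(p_G, p_ba, gamma, beta):
    """Eq. 5.28, the simplified form of the double sum."""
    return (1 / p_G + (1 - p_G) ** gamma
            * (1 / p_ba - 1 / p_G
               - (1 - p_ba) ** beta * (1 / p_ba + beta + gamma)))

# The direct sum and the closed form agree.
assert abs(direct_sum(0.9, 0.03, 20, 50) - closed_form(0.9, 0.03, 20, 50)) < 1e-9
```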
As with any system with no threshold, the Q factor is
Q = 1 − (1 − P(D = A))^dT   (5.30)
5.10.5 Select-Best/Weighted local reputation system with threshold
In the long run we would expect the local system's standard reputation matrix, R′, to converge to the threat matrix T. Therefore the local system in steady-state would approximate the ideal case. One exception arises in the scenario with a threshold. All bad nodes would eventually be rated below the threshold and ignored, just as in the ideal case. In addition, a fraction of good nodes would fall below this threshold and be ignored, owing to the fact that good nodes do reply with fake documents with probability (1 − pG). For very large pG and small ρT we approximate the fraction of good nodes which fall below the threshold by the fraction of good nodes which reply with a fake document the first time they reply to a particular node's queries, namely (1 − pG). Thus only a pG fraction of the responses from good nodes will be considered for both Select-Best and Weighted with a threshold.
p = (pG dAG)/(pG dTG)   (5.31)

Q = 1 − (1 − P(DG = A))^(pG dTG)   (5.32)

Expected rvQ = 1/p = (pG dTG)/(pG dAG) = 1/pG   (5.33)
Although the long-term efficiency of this system is almost equal to that of the ideal system (note the difference in Q), its effectiveness is reduced because of the additional exclusion of the small percentage of good nodes.
5.10.6 Weighted local system without threshold
With no threshold the local system should approximate T more closely than with a threshold, since good nodes which reply with a fake document their first time will be given a second chance. All responses are considered but are weighted based on the
rating of the sending node. The equations for p, Q, and the expected rvQ are given
by Eq. 5.23, 5.24, and 5.25, respectively.
5.10.7 Select-Best local system
As with the Weighted procedure the Select-Best local system with no threshold will
perform like the corresponding ideal case variant. Thus the equations for expected
rvQ and Q are given by Eq. 5.29 and 5.30, respectively.
5.11 Comparison of Statistical Analysis to Simulation Results
Using the equations derived above, we verify the correctness of our simulator by
comparing our simulation results to the expected value at steady state for the same
parameter values. In Figure 5.17 we graph the expected steady-state rv as a function
of three threat model parameters for the random base case, and the three variants of
the ideal case discussed above. These equations are represented by the line curves.
We reran the experiments from Section 5.6.3 for the base case and the ideal cases
with a modification to the document model, which we discuss below. We chose only
to test the base case and the ideal variants because those systems attain steady state
from the beginning. Unlike the local system, they gather no new information over
time, so their behavior remains constant. We plot these results as the datapoints in
the graphs of Figure 5.17. Clearly, the simulation results closely match the curves of
the expected performance.
If we compare the graphs in Figure 5.17 with those in Figure 5.4 we see that the
curves are similar in shape and relative proportions, but the absolute values are much larger in Figure 5.4. This difference is due to the modified document base model used
for the simulations for Figure 5.17. In these experiments we use a simplified model where each node has an equal probability, pN, of matching any query. pN was set to 0.1090, the expected probability of a node matching a query derived experimentally from our realistic document base model used in all other simulations.
That model is dependent on three Zipf distributions (query popularity, query selection
power, and number of documents per node). When comparing the simulator to our
derived equations it is necessary to substitute this simplified document model because
of a similar simplification which we made in our equations.
The formulas all use the expected value pN in place of the linearly dependent random variable pN(d, q).

Figure 5.17: Comparison of expected steady-state system behavior with a 1000-query simulation using a uniform document base with pN = 0.1090. Lines represent expected performance based on the analysis; points are the results of the corresponding simulations. (a) Verification ratio rv as a function of πB; (b) as a function of pB; (c) as a function of pG.

This simplification allowed us to derive relatively simple formulas (otherwise the formulas would be as complex as the simulator itself). Unfortunately, this
mitigates the effects of certain conditions, such as when a node receives no authentic
replies whatsoever in its response set. With approximately 4000 nodes hearing each
query, and each node having, on average, almost an 11% chance of matching the query,
it would seem unlikely that no authentic document would be received. But because of
the heavy-tail nature of Zipf distributions, this condition occurs much more frequently
than one might expect and should not be ignored. This shows the importance of
running complex simulations rather than relying solely on statistical derivations, whose seemingly insignificant simplifications can greatly affect the validity of one's conclusions.
5.12 Related Work
Extensive research has been done on general issues of reputation (e.g., [65, 72, 88]). Much work has been done in the area of locating reputable nodes in resource-sharing peer-to-peer networks, and many interesting reputation systems have been proposed (e.g., [37, 83, 61]). Here we describe a few related examples.
Reference [48] presents a game theoretical model, based on the prisoner’s dilemma,
for analyzing the social cost of allowing nodes to freely change identities. It proposes a mechanism, based on a centralized trusted intermediary, that ensures each user is assigned only one system identifier yet protects users' anonymity, so that even the intermediary does not know which identifier was assigned to which node.
In [38] Douceur discusses the problem of preventing users from using multiple
identities in a system with no trusted central agency (the Sybil attack). He presents
methods for imposing computational cost on identity creation and lists system con-
ditions necessary to limit the number of identities peers can generate.
In [83], Lai et al. propose a reciprocative incentive strategy to combat freeriders,
based on the Evolutionary Prisoner's Dilemma [9]. They compare the performance
of private history versus shared history and develop an adaptive stranger response
strategy that balances punishing whitewashers with overly taxing new nodes.
Reference [73] presents EigenTrust, a system to compute and publish a global
reputation rating for each node in a network using an algorithm similar to PageR-
ank [103]. Reputation statistics for each node are maintained at several nodes across
a content-addressable network to mitigate the effects of bad nodes colluding.
In [29], Damiani et al. enhance their previous work on reputation [25] by propos-
ing the concept of resource reputation, where peers give opinions on a resource’s
authenticity based on its reported digest. This technique complements the process of
maintaining peer reputations, which is still necessary in situations where the resource
is rare and no other peers have encountered it.
5.13 Conclusion
We have compared two practical identity infrastructures for peer-to-peer resource-
sharing environments. A centralized trusted login server that ties nodes’ network
pseudo-identities to their real-world identities provides better support for reputation
systems by preventing nodes from quickly changing identities. However, this benefit
comes at a high management cost and requires users to disclose information to a level
which they may not find acceptable. The decentralized approach, where each node
generates its own identity, provides a higher level of anonymity while simultaneously
preventing identity hijacking, at the cost of no enforced identity persistence for ma-
licious nodes. Though we have concentrated on two distinct identity models, many
practical solutions fall in a spectrum between them (such as providing incentives for
persistent identities) and perform accordingly.
Our results show that even simple reputation systems can work well in either of the
two identity schemes when compared to no reputation system. In environments where
system identities are generated by the peers themselves, all unknown nodes should be
regarded as malicious. But, if a centralized login authority enforces identities tied to
real world identities, then the optimal reputation for unknown nodes is nonzero. In
addition, certain techniques, such as using a selection threshold, provide large benefits
in efficiency for one identity scheme, but are ineffectual for the other.
We have presented a simple voting-based reputation system that significantly mit-
igates the deleterious effects of malicious nodes, by sharing information with a small
group of nodes. Even with 40% of the network attempting to subvert 90% of the
resources, a node would expect to make only two attempts before locating a good
provider, though this rises to four or five tries if the system is vulnerable to
whitewashing.
We compared two methods for selecting providers given reputation information
and showed that, while one provides better efficiency, it also significantly skews the
load on the well-behaved nodes in the network. Depending on the amount of hetero-
geneity in the network this may be acceptable. We also show how the Friend-Cache
developed for the reputation system can be applied to significantly reduce message
traffic in unstructured peer-to-peer networks.
Finally, we present two distinct threat models, allowing us to simulate a variety
of malicious behaviors. Both models affect system performance quite similarly, and
reputation system results under one model are proportionally equivalent under the other. This
allows us to compare reputation systems using whichever model is most convenient
and expect similar results from the other.
Chapter 6
SPROUT: P2P Routing with
Social Networks
Social networks are everywhere. Many people all over the world participate online in
established social networks every day. AOL, Microsoft, and Yahoo! all provide instant
messaging services to millions of users, alerting them when their friends log on. Many
community websites, such as Friendster [49], specialize in creating and utilizing social
networks. As another example, service agreements between ISPs induce a “social”
network through which information is routed globally. Social networks are valuable
because they capture trust relationships between entities. By building a P2P data-
management system “on top of”, or with knowledge of, an existing social network,
we can leverage these trust relationships in order to support efficient, reliable query
processing.
Several serious problems in peer-to-peer networks today are largely due to lack of
trust between peers. Peer anonymity and the lack of a centralized enforcement agency
make P2P systems vulnerable to a category of attacks we call misrouting attacks. We
use the term misrouting to refer to any failure by a peer node to forward a message
to the appropriate peer according to the correct routing algorithm. Failures include
dropping the message or forwarding the message to other colluding nodes instead of
the correct peer, perhaps in an attempt to control the results of a query. For instance,
in a distributed hash table (DHT), a malicious node may wish to masquerade as the
index owner of the key being queried for in order to disseminate bad information and
suppress content shared by other peers.
In addition, malicious users can acquire several valid network identifiers and thus
control multiple distinct nodes in the network. This is referred to as the Sybil attack
and has been studied by various groups (e.g., [48, 38, 91]); it is also discussed in Chapter 5.
This implies that a small number of malicious users can control a large fraction of the
network nodes, increasing the probability that they participate in any given message
route.
Using a priori relationship knowledge may be key to mitigating the effects of
misrouting. To avoid routing messages through possibly malicious nodes, we would
prefer forwarding our messages through nodes controlled by people we know person-
ally, perhaps from a real life social context. We could assume our friends would not
purposefully misroute our messages. 1 Likewise, our friends could try and forward our
message through their friends’ nodes. Social network services provide us the mech-
anism to identify who our social contacts are and locate them in the network when
they are online.
Misrouting is far from the only application of social networks to peer-to-peer
systems. Social networks representing explicit or implicit service agreements can also
be used to optimize quality of service by, for example, minimizing latency. Peers may
give queue priority to packets forwarded by friends or partners over those of strangers.
Thus, the shortest path through a network is not necessarily the fastest.
¹In our study, we assume a slim but nonzero chance (5%) that a virus or trojan has infected their machine, causing it to act maliciously (see Sec. 6.1.1).
In Section 6.1 we present a high-level model for evaluating the use of social net-
works for peer-to-peer routing, and apply it to the two problems we described above:
yielding more query results and reducing query times.
Unstructured networks can be easily molded to conform to the social links of their
participants. OpenNap, for example, allows supernodes to restrict themselves to link-
ing only with reputable or “friendly” peer supernodes, who manage message propa-
gation and indexing. However, structured networks, such as DHTs, are less flexible,
since their connections are determined algorithmically, and thus it is more challenging
to use social networks in such systems. In Section 6.2 we propose SPROUT, a routing
algorithm which uses social link information to improve DHT routing performance
with respect to both misrouting and latency. We then analyze and evaluate both our
model and SPROUT in Section 6.3.
Social networks can be exploited by P2P systems for a variety of other reasons.
In Section 6.4 we discuss application scenarios where our model is useful, as well as
other related and future work. Finally, we conclude in Section 6.5.
This work has been published as [96] and [89].
6.1 Trust Model
The basic intuition is that computers managed by friends are not likely to be selfish or
malicious and deny us service or misroute our messages. Similarly, friends of friends
are also unlikely to be malicious. Therefore, the likelihood of a node B purposefully
misrouting a message from node A is proportional to (or some function of) the distance
from A’s owner to B’s owner in the social network. Observe that in a real network
with malicious nodes, the above intuition cannot hold simultaneously for all nodes;
neighbors of malicious nodes, for example, will find malicious nodes close to them.
Rather our objective is to model trust from the perspective of a random good node
in the network. Likewise, we assume messages forwarded over social links would
experience less latency on average because of prioritizing based on friendship or service
agreements.
We now describe a flexible model for representing the behavior of peers relative
to a node based on social connections. We will illustrate the model usage for two
different specific issues: minimizing the risk of misrouting, and decreasing latency to
improve Quality of Service.
6.1.1 Trust Function
We express the trust that a node A has in peer B as T (A,B). Based on our assump-
tion, this value is dependent only on the distance (in hops) d from A to B in the
social network. To quantify this measure of trust for the misrouting scenario, we use
the expected probability that node B will correctly route a message from node A.
The reason for this choice will become apparent shortly.
One simple trust function would be to assume our friends’ nodes are very likely to
correctly route our messages, say with probability f = 0.95. But their friends are less
likely (0.90), and their friends even less so (0.85). Note that this is not the probability
that the peer forwards each packet, but rather the probability that the peer is not
misbehaving and dropping all packets; averaged over all nodes, the two are equivalent.
A node’s trustworthiness decreases linearly with respect to its distance from us in the
social network. This would level off when we hit the probability that any random
stranger node (far from us in the social network) will successfully route a message, say
r = 0.6. For large networks with large diameters, the probability r represents the fraction
of the network made up of good nodes willing to correctly route messages. Thus, r =
0.6 means that we expect that 40% of the network nodes (or more accurately network
node identifiers) will purposefully misroute messages. Here we have presented a linear
trust function. We consider others in Section 6.3.3.
Note that in this example the probability of a friend routing correctly is only 0.95
and not 100%. This value accounts for friends who do not wish us harm, but whose
computers may have been unknowingly subverted by an adversary, perhaps through
virus infection. However, we may assume that our friends are less likely to allow their
machines to be infected than a random stranger.
In addition, using social links that connect to known individuals helps reduce
the threat of Sybil attacks [38]. Assume all machines in the network have an equal
probability p of being subverted by an adversary. A friend’s computer will thus have
been subverted, and be acting maliciously, with probability p. However, if we
connect to a random peer in the network, we expect the probability of that peer being
malicious to be much higher. The adversary may have each subverted computer
posing as multiple peers by registering multiple IDs in the network. Therefore, by
relying on nodes discovered through the social network we are limiting the power of
the adversary to his/her physical presence in the network as opposed to his/her virtual
presence, which may be much larger. This technique is used in other P2P applications
to mitigate the effectiveness of malicious attacks. For example, the LOCKSS digital
preservation system uses a friends list in order to reduce the influence of an adversary
who registers many virtual identities in order to poison the reference lists of well-
behaved peers [87].
When measuring QoS we would want to use a very different function. Let T (A,B)
be the expected additional latency incurred by a message forwarded through node B,
which it received from node A. For simplicity, let us assume that T(A,B) = ε if a
social link exists between A and B, and ∆ · ε otherwise. For example, assume ε =
1 and ∆ = 3. If A has a service agreement with, or is friends with, B, then B gives any
message it receives from A priority and forwards it in about 1 ms; otherwise the message
is placed in a queue and takes on average 3 ms. We will use these same values for ε
and ∆ in our example below and in our analysis in Section 6.3.6.
We do not claim that any of these functions, with any specific parameter values, accurately
represents trust in any or all social networks, but they do serve to express
the relationship we believe exists between social structure and the quality of routing.
6.1.2 Path Rating
We wish to use our node trust model to compare peer-to-peer routing algorithms.
For this we need to calculate a path trust rating P to use as our performance metric.
The method for calculating P will be application-dependent (and we will present two
specific examples below), but a few typical decisions that must be made are:
1. Source-routing or hop-by-hop? Will the trust value of a node on the path be a
function of its social distance from the message originator, or only of its distance
from the node from which it received the message directly?
2. How do you combine node trust? Is the path rating the product, sum, maximum
value, or average value of the node trust values along the path? Any appropriate
function could be used.
We now give as an example a metric for reliability in the presence of misrouting.
We need to compare the likelihood that a message will reach its destination given
the path selected by a routing algorithm. We calculate the reliability path rating
by multiplying the separate node trust ratings for each node along the path from
the source to destination. For example, assume source node S wishes to route a
message to destination node D. In order to do so a routing algorithm calls for the
message to hop from S to A, then B, then C, and finally D. Then the reliability
path rating will be PR = T(S,A) · T(S,B) · T(S,C) · T(S,D). Given that T(X,Y) is
interpreted as the actual probability node Y correctly routes node X’s message, then
PR is the probability that the message is received and properly handled by D. Note
that T (X,Y ) is dependent only on the shortest path in the social network between
X and Y and thus independent of whether Y was the first, second, or nth node along
the path.
Including the final destination’s trust rating is optional and dependent on what
we are measuring. If we wish to account for the fact that the destination may be
malicious and ignore a message, we include it. Since we are using path rating to
compare routing algorithms going to the same destination, both paths will include
this factor, making the issue irrelevant.
For the Quality of Service we would want our path rating to express the expected
time a message would take to go from the source to the destination. Given that
T (A,B) is the latency incurred by each hop we would want to use an additive function.
And if each node decides whether to prioritize forwarding based on who it received
the message from directly, and not the originator, then the function would be hop-
by-hop. Calculating the latency path rating for the path used above gives PL =
T(S,A) + T(A,B) + T(B,C) + T(C,D).
Though we focus on linear paths in this chapter, the rating function can generalize
to arbitrary routing graphs, such as multicast trees.
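To make these two ratings concrete, the following sketch (ours, not the simulator used later in this chapter) computes the multiplicative reliability rating and the additive latency rating for the S → A → B → C → D example, using the linear trust values (f = 0.95, r = 0.6) and the latency parameters (ε = 1, ∆ = 3) from Section 6.1.1:

```python
def linear_trust(d, f=0.95, r=0.6):
    """Linear trust: drops by (1 - f) per social hop, floored at r."""
    return max(1 - (1 - f) * d, r)

def reliability_rating(social_dist, source, path):
    """Source-routed and multiplicative: the probability that every hop
    (including the destination) correctly handles the source's message."""
    p = 1.0
    for node in path:
        p *= linear_trust(social_dist(source, node))
    return p

def latency_rating(social_dist, source, path, eps=1.0, delta=3.0):
    """Hop-by-hop and additive: friends (social distance 1) forward in
    eps ms; strangers queue the message for delta * eps ms."""
    total, prev = 0.0, source
    for node in path:
        total += eps if social_dist(prev, node) == 1 else delta * eps
        prev = node
    return total

# Toy social distances for the path S -> A -> B -> C -> D.
dist = {("S", "A"): 1, ("S", "B"): 2, ("S", "C"): 3, ("S", "D"): 4,
        ("A", "B"): 1, ("B", "C"): 2, ("C", "D"): 1}
sd = lambda x, y: dist[(x, y)]

# P_R = T(S,A) * T(S,B) * T(S,C) * T(S,D) = 0.95 * 0.90 * 0.85 * 0.80
print(reliability_rating(sd, "S", ["A", "B", "C", "D"]))
# P_L = 1 + 1 + 3 + 1 = 6.0 ms with these distances
print(latency_rating(sd, "S", ["A", "B", "C", "D"]))
```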
6.2 Social Path Routing Algorithm
We wish to leverage the assumed correlation between routing reliability or efficiency
and social distance by creating a peer-to-peer system that utilizes social information
from a service such as a community website or instant messenger service. Though
there are many ways to exploit social links, for this chapter, we focus on building a
distributed hash table (DHT) routing algorithm. Specifically, we build on the basic
Chord routing algorithm [127]. Chord was chosen because it is a well-known scheme
and studies have shown it to provide great static resilience, a useful property in a
system with a high probability of misrouting that is difficult to detect and repair [60].
Our technique is equally applicable to other DHT designs, such as CAN [112] or
Pastry [118].
When a node first joins the Chord network, it is randomly assigned a network
identifier between 0 and 1. It then establishes links to its sequential neighbors in idspace,
forming a ring of nodes. It also makes roughly log2 n long links to nodes halfway
around the ring, a quarter of the way, an eighth, etc. When a node inserts or looks up
an item, it hashes the item’s key to a value between 0 and 1. Using greedy clockwise
routing, it can locate the peer whose ID is closest to the key’s hash (and is thus
responsible for indexing the item) in O(log n) hops. For simplicity, we will use “key”
to refer to a key’s hash value in this chapter.
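Greedy clockwise Chord lookup can be sketched as follows (a simplification assuming a static node set and ideal finger tables; this is not the full Chord protocol, and the function names are ours):

```python
import bisect
import math

def cw(a, b):
    """Clockwise distance from a to b on the unit ring."""
    return (b - a) % 1.0

def chord_route(nodes, src, key):
    """Greedy clockwise routing: at each step, hop to the known node that
    gets closest to the key without overshooting its owner."""
    nodes = sorted(nodes)
    n = len(nodes)

    def successor(x):
        # First node clockwise from point x (the owner of x).
        return nodes[bisect.bisect_left(nodes, x) % n]

    def fingers(cur):
        # Ideal fingers: successors of cur + 1/2, 1/4, ..., 1/n.
        return [successor((cur + 2.0 ** -k) % 1.0)
                for k in range(1, int(math.log2(n)) + 1)]

    path, cur, owner = [], src, successor(key)
    while cur != owner:
        cands = [f for f in fingers(cur)
                 if f != cur and cw(cur, f) <= cw(cur, key)]
        if cands:
            cur = min(cands, key=lambda f: cw(f, key))
        else:
            cur = owner  # no finger precedes the key: the successor owns it
        path.append(cur)
    return path

# 8 evenly spaced nodes; a lookup takes O(log n) hops.
print(chord_route([i / 8 for i in range(8)], 0.0, 0.7))
```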
Our Social Path ROUTing (SPROUT) algorithm adds to Chord additional links to
any friends that are online. All popular instant messenger services keep a user aware
of when their friends enter or leave the network. Using this existing mechanism, a
node can determine when its friends’ nodes are up and form links to them in the
DHT as well. This provides it with several highly trusted links to use for routing
messages. When a node needs to route to key k, SPROUT works as follows:
1. Locate the friend node whose ID is closest to, but not greater than, k.
2. If such a friend node exists, forward the message to it. That node repeats the
procedure from step 1.
3. If no friend node is closer to the destination, then use the regular Chord algo-
rithm to continue forwarding to the destination.
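One forwarding decision under this procedure can be sketched as follows (a schematic; friends() and chord_next_hop() are assumed helpers returning the current node's online friend links and the regular Chord next hop):

```python
def cw(a, b):
    """Clockwise distance from a to b on the unit ring."""
    return (b - a) % 1.0

def sprout_next_hop(cur, key, friends, chord_next_hop):
    """Steps 1-3 of SPROUT: prefer the friend whose ID is closest to the
    key without passing it; otherwise fall back to regular Chord routing."""
    cands = [f for f in friends(cur) if cw(cur, f) <= cw(cur, key)]
    if cands:                                   # steps 1-2: best friend hop
        return min(cands, key=lambda f: cw(f, key))
    return chord_next_hop(cur, key)             # step 3: Chord fallback

# With friends at 0.3, 0.6, and 0.8, a message for key 0.7 goes to 0.6;
# 0.8 would overshoot the key.
print(sprout_next_hop(0.0, 0.7, lambda n: [0.3, 0.6, 0.8],
                      lambda n, k: "chord"))
```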
6.2.1 Optimizations
Here we present two techniques to improve the performance of our routing algorithm.
We evaluate them in Section 6.3.2.
Lookahead
With the above procedure, when we choose the friend node closest to the destination
we do not know if it has a friend to take us closer to the destination. Thus, we may
have to resort to regular Chord routing after the first hop. To improve our chances of
finding social hops to the destination we can employ a lookahead cache of 1 or 2 levels.
Each node may share with its friends a list of its friends and, in 2-level lookahead,
its friends-of-friends. A node can then consider all nodes within 2 or 3 social hops
away when looking for the node closest to the destination. We still require that the
message be forwarded over the established social links.
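A lookahead cache can be sketched as a bounded expansion of the social graph that remembers, for each node it finds, the first-hop friend a message must actually traverse to reach it (a sketch; the friends adjacency map is hypothetical):

```python
def lookahead_set(cur, friends, levels=1):
    """Nodes within levels + 1 social hops of cur, mapped to the first-hop
    friend through which a message toward them must be forwarded."""
    reachable = {}                       # node -> first social hop toward it
    frontier = [(f, f) for f in friends[cur]]
    for _ in range(levels + 1):
        nxt = []
        for node, first in frontier:
            if node != cur and node not in reachable:
                reachable[node] = first
                nxt.extend((g, first) for g in friends.get(node, []))
        frontier = nxt
    return reachable

# 1-level lookahead sees friends and friends-of-friends:
friends = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": ["F"], "E": []}
print(lookahead_set("A", friends, levels=1))
```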
Minimum Hop Distance
Though SPROUT guarantees forward progress towards the destination with each hop,
it may happen that at each hop SPROUT finds the sequential neighbor is the closest
friend to the target. Thus, in the worst case, routing is O(n).
To prevent this we use a minimum hop distance (MHD) to ensure that the follow-
ing friend hop covers at least MHD fraction of the remaining distance (in idspace) to
the destination. For example, if MHD = 0.25, then the next friend hop must be at
least a quarter of the distance from the current node to the destination. If not then
we resort to Chord routing, where each hop covers approximately half of the distance.
This optimization guarantees us O(log n) hops to any destination but causes us to
give up on using social links earlier in the routing process. When planning multiple
hops at once, due to lookahead, we require the path to cover MHD/k additional
distance for each additional hop, for some appropriate k.
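The MHD test can be sketched as follows (a sketch under our reading that a lookahead path must cover an extra MHD/k fraction per additional planned hop; the parameter names are ours):

```python
def passes_mhd(cur, hop_end, key, mhd=0.25, hops=1, k=2.0):
    """Accept a planned friend path (hops social hops ending at hop_end)
    only if it covers enough of the remaining idspace distance to key:
    an MHD fraction, plus MHD / k more per additional planned hop."""
    remaining = (key - cur) % 1.0
    covered = (hop_end - cur) % 1.0
    required = mhd + (hops - 1) * mhd / k
    return covered >= required * remaining

# With MHD = 0.25, a single hop must cover a quarter of the distance:
print(passes_mhd(0.0, 0.3, 0.8))   # covers 0.3 of 0.8 -> accepted
print(passes_mhd(0.0, 0.1, 0.8))   # covers 0.1 of 0.8 -> rejected
```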
6.3 Results
In this section we evaluate our friend-routing algorithm as well as present optimiza-
tions. We compare SPROUT to regular Chord and Chord augmented with additional
links. We also discuss the trust model and compare different trust functions. We an-
alyze the effects of misrouting on both structured and unstructured search networks.
Finally, we apply SPROUT to QoS and reducing path latency.
6.3.1 Simulation Details
To test our SPROUT algorithm for DHTs, we compare it to Chord in the following
scenario. Assume the members of an existing social network wish to share files or
information by creating a distributed hash table. Believing that some peers in the network
are unreliable, each node would prefer to route messages through their friends’ nodes
if possible. We use two sources for social network data for our simulations. The first
is data taken from the Club Nexus community website established at Stanford Uni-
versity [19]. This dataset consists of over 2200 users and their links to each other as
determined by their Buddy Lists. The second source was a synthetic social network
generator based on the Small World topology algorithm presented in [110]. Both the
Club Nexus data and the Small World data created social networks with an average
of approximately 8 links per node. We randomly inserted each social network node
into the Chord idspace.
We also ran experiments using a trace of a social network based on 130,000 AOL
Instant Messenger users and their Buddy Lists provided by BuddyZoo [30]. Because
of the size of this dataset, we have only used the data to verify results of our other
experiments.
For each experiment we randomly chose a query source node and a key hash value
to look up (chosen uniformly from 0 to 1). We compute a path using each routing
algorithm and gather statistics on path length and path rating. Each data point
presented below is the average of 1,000,000 such query paths.

Table 6.1: SPROUT vs. Chord

                     Avg. Path Length   Avg. Reliability
  Regular Chord            5.343             0.3080
  Augmented Chord          4.532             0.3649
  SPROUT(1,0.5)            4.569             0.4661
6.3.2 Algorithm Evaluation
We first focus on the problem of misrouting. We use the linear trust function described
in Section 6.1 with f = 0.95 and r = 0.6, which corresponds to 40% of the nodes
misbehaving. We feel such a large fraction of bad nodes is reasonable because of
the threat of Sybil attacks [38]. We evaluate different trust functions and parameter
values in Section 6.3.3.
We compare SPROUT, using a lookahead of 1 and MHD = 0.5, to Chord using
the Club Nexus social network data. The first and third rows of Table 6.1 give the
measured values for both the average path length and average reliability path rating
of both regular Chord routing and SPROUT. With an average path length of 5.343
and average reliability of 0.3080, Chord performed much worse in both metrics than
SPROUT, which attained values of 4.569 and 0.4661, respectively. In fact, a path is
over 1.5 times as likely to succeed using standard SPROUT as with regular Chord.
But this difference in performance may be simply due to having additional links
available for routing, and the fact that they are friend links may have no effect
on performance. To equalize the comparison we augmented Chord by giving nodes
additional links to use for routing. Each node was given as many additional random
links as that node has social links (which SPROUT uses). Thus, the total number of
links usable at each node is equal for both SPROUT and augmented Chord. The
performance of the augmented Chord (AC) is given in the second row of Table 6.1.

Table 6.2: Evaluating lookahead and MHD

           No lookahead       1-level            2-level
  MHD     Length  Rating   Length  Rating   Length  Rating
  0        4.875  0.4068    5.101  0.4420    5.378  0.4421
  0.125    4.805  0.4070    5.003  0.4464    5.258  0.4478
  0.25     4.765  0.4068    4.872  0.4525    5.114  0.4551
  0.5      4.656  0.4033    4.569  0.4661    4.757  0.4730
As expected, with more links to choose from AC performs significantly better than
regular Chord, especially in terms of path length. But SPROUT is still 1.3 times as
likely to route successfully. In the following sections we compare SPROUT only to
the augmented Chord algorithm.
How were the lookahead and MHD values used above chosen? Table 6.2 shows the
results of our experiments in varying both parameters in the same scenario. As we
see, the largest increase in path rating comes from using a 1-level lookahead. But this
comes at a slight cost in average path length, due to the fact that more lookahead
allows us to route along friend links for more of the path. For example, for MHD
= 0.5, no lookahead averaged 0.977 social links per path, while 1-level lookahead
averaged 2.533 and 2-level averaged 3.491. Friend links tend to not be as efficient as
Chord links, so forward progress may require 2 or 3 hops, depending on the lookahead
depth. But friend links are more likely to reach nodes closer to the sending node in
the social network.
Increasing MHD limits the choices in forward progressing friend hops, causing
the algorithm to switch to Chord earlier than otherwise, but mitigates inefficient
progress. A large MHD seems to be most effective at both shortening path lengths
and increasing path ratings. This is not very surprising: since our reliability function
is multiplicative, each additional link appreciably drops the path reliability.
Figure 6.1: Performance of SPROUT and AC in different-size Small World networks.
The third curve shows the relative performance of SPROUT with respect to AC,
plotted on the right-hand y-axis. Note that the x-axis is log-scale.
From these results we chose to use a 1-level lookahead and an MHD of 0.5 for our
standard SPROUT procedure. Though 2-level lookahead produced slightly better
reliability, we did not feel it warranted the longer route paths and the exponentially
increased node state propagation and management. Our available social network data
indicates that a user has on average between 8 and 9 friends. Thus, we would expect
most nodes’ level-1 lookahead cache to hold fewer than 100 entries.
The path ratings presented above were relatively small, indicating a low, but per-
haps acceptable, probability of successfully routing to a destination in the DHT. If
the number of friends a user has remains constant but the total number of network
nodes increases we would expect reliability to drop. As the number of nodes n in-
creases, the average Chord path length increases as O(log n). Each additional node
in a path decreases the path rating. But by how much? To study this issue we ran
our experiment using our synthetic Small World model for networks of different sizes,
but always with an average of around 8 friends per node. We present these results in
Figure 6.1.
As expected, for larger networks the path length increases, thus decreasing overall
reliability. Because the average path length is O(log n) as in Chord, the reliability
drops exponentially with respect to log n. The range of network sizes tested is not
large enough to properly illustrate an exponential curve, giving it a misleadingly
linear appearance. The third curve gives the percent increase in reliability of
SPROUT with respect to augmented Chord. Notice that the reliability advantage of
SPROUT over AC remains relatively constant, and thus the relative (percent)
improvement of SPROUT over AC increases as overall reliability drops. In fact, at
10,000 nodes SPROUT performs over 50% better
than AC. As the network grows, the average number of social links increases slightly.
The benefit SPROUT derives from additional friend links is greater than the benefit
AC derives from additional random links.
6.3.3 Calculating Trust
All of our previous results used a linear trust function with f = 0.95. Of course other
trust functions or parameter values may be more appropriate for different scenarios.
T(A,B), using the linear trust function LT we previously described, is defined in
Equation 6.1 as a function of d, the distance from A to B in the social network.
LT(d) = max(1 − (1 − f) · d, r)        (6.1)
Instead of a linear drop in trust, we may want to model an exponential drop at
each additional hop. For this we use an exponential trust function ET , shown in
Equation 6.2.
ET(d) = max(f^d, r)        (6.2)
Another simple function we call the step trust function ST (d) assigns an equal
high trustworthiness of f to all nodes within h hops of us and the standard rating of
r to the rest. Equation 6.3 defines the step trust function.

ST(d) = if (d < h) then f else r        (6.3)

In our experiments we set h, the social horizon, to 5.

Figure 6.2: Performance of SPROUT and AC for different trust functions and varying f. Higher value is better.
All three functions are expressed so that f is the rating assigned to nodes one hop
away in the social network, the direct friends. In Figure 6.2 we graph both routing
algorithms under all three trust functions as a function of the parameter f .
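The three trust functions of Equations 6.1–6.3 translate directly into code (shown with the defaults used in our experiments: f = 0.95, r = 0.6, h = 5):

```python
def LT(d, f=0.95, r=0.6):
    """Linear trust (Eq. 6.1): loses (1 - f) per social hop, floored at r."""
    return max(1 - (1 - f) * d, r)

def ET(d, f=0.95, r=0.6):
    """Exponential trust (Eq. 6.2): multiplies by f per hop, floored at r."""
    return max(f ** d, r)

def ST(d, f=0.95, r=0.6, h=5):
    """Step trust (Eq. 6.3): f inside the social horizon h, r beyond it."""
    return f if d < h else r

# Friends (d = 1) rate f under all three; distant strangers rate r.
print(LT(1), ET(1), ST(1))
print(LT(20), ET(20), ST(20))
```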
We see here that both the linear (LT) and exponential (ET) trust functions perform
equivalently, while the step trust function (ST) shows less performance variation as
f changes. The key observation here is that SPROUT demonstrates a clear
improvement over augmented Chord for a whole variety of trust functions, especially for
f values greater than 0.85. For example, at f = 0.96 using the exponential function,
SPROUT succeeds in routing 47% of the time, while AC succeeds only 38% of the
time. Thus, even if
one does not know precisely the trust function, one can expect SPROUT to perform
substantially better.
We also varied r, the perceived reliability of random unknown nodes in the network,
and present the results in Figure 6.3. We find that for values of r < 0.75, path ratings
remained unchanged. Above 0.75, both algorithms’ ratings steadily increased. When
5% or less of unknown peers are likely to misroute (r ≥ 0.95), both algorithms perform
equally well, even with f also at 0.95, so that we trust our friends no more than any
stranger. This means that while SPROUT significantly improves path reliability in
a peer-to-peer network with many malicious and selfish peers, we suffer no
appreciable penalty for using it in a network with very few bad peers.

Figure 6.3: Performance of SPROUT and AC for varying r, the probability of strangers routing correctly.
6.3.4 Number of Friends
In a given network, a node with more friends is likely to perform better since it has
more choices of social links to use. But how much better? How much improvement
would a node expect to gain by establishing some trust relationship with another
node? To quantify this, we generated 100 queries from each node in the Club Nexus
network, calculated its path rating, and grouped and averaged the results based on
the number of social links each node has.
Figure 6.4 shows the results for SPROUT using 0- and 1-level lookahead, as well
as AC for the Club Nexus data. For example, 85 nodes in the network had exactly
10 social links. The average path rating for those 85 nodes when running SPROUT
with 1 lookahead was 0.553. Note that the three curves are linear with respect to the
log of the node degree, indicating exponentially diminishing returns for each
additional social link. For instance, nodes with only 1 social peer attained a reliability
rating of 0.265 with SPROUT with no lookahead, while nodes with 10 social peers
scored 0.471, a difference of 0.206. A node with 10 social peers would need to grow
to over 100 social peers to increase its rating by that same amount (the one node with
103 social links had a rating of 0.663).

Figure 6.4: Performance as a function of a node’s degree. Club Nexus data.
From these curves we can estimate how many links a typical node would need to
have in order to attain a specified level of reliability. For instance, considering the
SPROUT with 1-level lookahead curve, we see that a node would need about 100
social links to attain an average rating of 0.7, and about 600 social links to get a
rating of 0.9.
Though a single node increasing its number of friends does not greatly influence
its performance, what performance can nodes expect if we fix a priori the number of
friend connections each node must have? To analyze this we create a random regular
social network graph of 2500 nodes, in which every node has the same degree, and
vary this degree for each simulation run. The results are shown in Figure 6.5.
The curves correspond to SPROUT with 1-level lookahead and augmented Chord.
[Plot omitted: average reliability (y, 0-0.9) vs. number of links per node (x, log scale 1-1000); curves for Augmented Chord and SPROUT.]
Figure 6.5: Performance of SPROUT and AC for different uniform networks with varying degrees.
As expected, we see that both curves rise more steeply than in the previous graph.
If all nodes add an extra social link the probability of successful routing will rise
more than if only one node adds a link (as seen in Fig. 6.4). But the curves level
off just below 0.9. In fact, similar simulations for larger networks showed the same
results, with reliability leveling off under 0.9 at around 100 social links per node.
This confirms that even at high social degree, each path is expected to take multiple
hops through nodes that are, to some small degree, unreliable. Even if all nodes
were exactly two social hops away from each other, this would yield a reliability of
only 0.95 * 0.9 = 0.855. Therefore, we would not expect a node in the Club Nexus dataset,
as seen in Figure 6.4, to reach 0.9 reliability, even with 600 links.
Though SPROUT provides greater reliability than Chord, neither algorithm per-
forms particularly well. Our results from Table 6.1 showed ratings of less than 0.50,
indicating less than 50% of messages would be expected to reach their destination.
Perhaps DHT routing is incapable of providing acceptable performance when mem-
bers of the network seek to harm it. In the next section, we evaluate the brute force
method of query flooding.
6.3.5 Comparison to Gnutella-like Networks
So far we have limited our analysis of SPROUT to Chord-like DHT routing. We were
also interested in comparing the effects of misrouting on structured P2P networks to
unstructured, flooding-based networks, such as Gnutella. To balance the comparison
we assume the unstructured network’s topology is determined by the social network,
using only its social links, and apply the same linear trust function used before to
calculate the probability that a node forwards a query flood message.
Because querying the network is flooding-based, we cannot use the probability
of reaching a certain destination as our metric. Instead, we would like to find the
expected number of good responses a querying node would receive. For a DHT we
assume a node would receive all or no responses, depending on whether the query
message reached the correct well-behaved index node (we do not consider the problem
of inserting item keys into the DHT caused by misrouting). In an unstructured
network the number of good responses located is equal to the number of responses at
well-behaved nodes reached by the query flood. Because the flood is usually limited
in size by a time-to-live (TTL), even if there are no malicious nodes in the network,
not all query answers will be located.
Using the simulator described in Chapter 5, we modelled a Gnutella-like network
with a topology based on the Club Nexus data and used a TTL of 5, allowing us to
reach the vast majority of the nodes in the network (over 2000 on average). We seeded
the network with files based on empirically collected data from actual networks [119]
and ran 10000 queries for different files from varying nodes, dropping a query message
at a peer with a probability based on the trust function and the shortest path to the
querying node. We averaged across 10 runs (for different file distributions) and present
the results, as a function of r (the expected reliability of a node distant in the social
network), in Figure 6.6.
The top curve, labelled Total, indicates the total number of files in the entire
[Plot omitted: good answers per query (left y, 0-160) and percentage (right y, 0-60) vs. r (x); curves for Total, Flooding, DHT AC, and DHT SPROUT.]
Figure 6.6: Performance of SPROUT and AC versus unstructured flooding.
network matching each query (on average), independent of the routing algorithm used.
This value is approximately 150. The expected number of good answers received for
the DHT curves was calculated as this total number times the expected probability of
reaching the index node storing the queried-for items. Flooding results in significantly
more responses on average, by a factor of almost 2 for small r. More importantly, this
means we would expect to locate at least some good answers via flooding even when the DHT
completely fails. For values of r less than 0.5 all the curves level off. If r = 1 then
we assume no nodes in the network are malicious. Thus DHT outperforms flooding
since it will always locate the index node and retrieve all the available answers.
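The two query models compared above can be sketched in a few lines. The flood below is a simplified stand-in for the Chapter 5 simulator: it forwards a TTL-limited query over a social topology, with each stranger forwarding at a single flat probability (the real simulator uses the distance-based trust function). The function names and the flat-probability model are assumptions for illustration.

```python
import random
from collections import deque

def flood_query(adj, source, ttl, forward_prob, rng):
    # TTL-limited query flood: each reached node rebroadcasts to its
    # neighbors, and a neighbor accepts/forwards the query with
    # probability forward_prob (standing in for the stranger trust r).
    reached = {source}
    frontier = deque([(source, ttl)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == 0:
            continue
        for nbr in adj[node]:
            if nbr not in reached and rng.random() < forward_prob:
                reached.add(nbr)
                frontier.append((nbr, hops - 1))
    return reached

def expected_good_answers(total_answers, p_dht):
    # DHT model: all-or-nothing, so the expectation is the total number
    # of matching answers times P(the query reaches the index node).
    return total_answers * p_dht
```

In the flooding case, the number of good answers is the number of answers held at well-behaved nodes in `reached`; in the DHT case it is simply the product above, which is why the DHT curve collapses to zero as the path success probability does.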
Note that these results are meant to be a rough comparison of these two P2P styles.
The flooding model does not take into account messages dropped due to congestion,
a much larger problem for flooding protocols than for DHTs. In our simulations on
the 2200 node Club Nexus network each query reached, on average, over 2000 nodes.
This indicates the number of messages produced by the flood was even greater (due
to duplicate messages). The DHT algorithm, on the other hand, averaged around 5
messages to reach the index node. Thus, flooding schemes will not scale to very large
networks as well as DHTs.
On the other hand, in the DHT model, we are only considering the probability of
a query message being misrouted. We assume all good answers are inserted at the
correct index node, not taking into account that index insertions may fail just as well
as index queries. If we factor in index insertion failures, the DHT curves would shift
down, further increasing the relative performance difference with flooding.
Though flooding is more costly in terms of processor and network bandwidth
utilization, it is clearly a more reliable method of querying in a network suffering
from some amount of misrouting. A better solution may be a hybrid scheme
that uses DHT routing until misrouting or malicious nodes are detected, then
switches to query flooding. In fact, such a scheme is proposed in [21] and discussed
in Section 6.4.
6.3.6 Latency Comparisons
As we stated before, neither SPROUT nor our social trust model is limited to
studying misrouting. With few modifications our model can be used to evaluate other
issues, such as Quality of Service. If peers prioritized their message queues based on
service agreements and/or social connections we may want to use latency as the metric
for comparing routing algorithms. Using the latency trust function (with ε = 1 and
∆ = 3) and latency path rater we described in Section 6.1, we route messages using
both SPROUT and augmented Chord and see which provides the least latency. We
would expect SPROUT to perform even better with respect to Chord in such systems.
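The additive latency rater just mentioned might be sketched as below. The link costs ε and Δ come from the text; the exact rater is defined in Section 6.1, so the function shape and names here are assumptions.

```python
def path_latency(links, eps=1, delta=3):
    # Additive latency rater: each social (service-agreement) link is
    # assumed to cost eps, each ordinary link delta; a path's rating is
    # the sum of its link costs, and lower is better.
    return sum(eps if is_social else delta for is_social in links)

# With eps=1 and delta=3, a four-hop all-social path still beats a
# two-hop path through strangers:
print(path_latency([True] * 4), '<', path_latency([False] * 2))  # -> 4 < 6
```

This additivity is why, for small Δ, shortening the path can outweigh choosing social links, as the MHD analysis below observes.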
We performed an analysis to determine the optimal MHD for latency-based routing.
As in the misrouting scenario, an MHD of 0.5 performed the best. This is
surprising since the latency path rater is additive, not multiplicative. The difference
from other values of MHD was almost negligible, indicating that for small ∆, where
the costs of social links and regular links are similar, shortening the overall path
outweighs choosing social links. In fact, with a larger ∆ of 10, smaller MHD values
perform significantly better than 0.5.
[Plot omitted: average latency (left y, 0-18) and percentage (right y, 0-60) vs. number of nodes (x, log scale 100-100000); curves for AC, SPROUT, and Percent Improvement (SPROUT over AC).]
Figure 6.7: Latency measurements for SPROUT vs AC w.r.t. network size. Lower is better.
Figure 6.7 shows the average path latency for both SPROUT and augmented
Chord as a function of the network size (using a Small World topology). The
third curve shows the percent decrease in latency attained by switching from AC
to SPROUT. We see that SPROUT results in roughly half (40-60%) the latency of
AC. We would expect SPROUT to deliver messages twice as fast as AC by preferring
to take advantage of service agreements, rather than simply minimizing hop count.
Clearly, Quality of Service issues greatly benefit from routing algorithms which
account for service agreements between peers, as SPROUT does. In fact, real-world
systems which deal with QoS, such as ISPs and phone carriers, base their routing
decisions on service agreements among their peers, though their networks are not as
dynamic as peer-to-peer networks.
6.3.7 Message Load
One problem SPROUT faces is uneven load distribution due to the widely varying
social connectivity of the nodes. Peers with more social links are expected to for-
ward messages for friends at a higher rate than weakly socially connected peers. To
study this issue we measure the number of messages forwarded by each node over all
[Plot omitted: load (y, 0-0.045) vs. node rank (x, log scale 1-10000); curves for Augmented Chord, SPROUT, SPROUT (No Top 10), and SPROUT (Limit 20).]
Figure 6.8: Distribution of load (in fraction of routes) for augmented Chord and SPROUT. Lower is better. Social links were removed for the top 10 highest-connected nodes for the No Top 10 curve. All nodes were limited to at most 20 social links for the Limit 20 curve. Note the log-scale x-axis.
1,000,000 paths for both SPROUT(1,0.5) and augmented Chord. The resulting load
on each node, in decreasing order, is given by the first two curves in Figure 6.8. The
load is calculated as the fraction of all messages a node participated in routing.
The highest loaded node in the SPROUT experiment was very heavily loaded in
comparison to AC (4% vs 0.75%). As expected, a peer’s social degree is proportional
to its load, with the most connected peers forwarding the most messages. Though the
top 200 nodes suffer substantially more load with SPROUT than AC, the remaining
nodes report equal or less load. Because the average path length for SPROUT is
slightly higher than for AC, the total load is greater in the SPROUT scenario. Yet
the median load is slightly lower for SPROUT, further indicating an imbalanced load
distribution.
To analyze the importance of the highly connected nodes we removed the social
links from the top 10 most connected nodes, but kept their regular Chord links and
reran the experiment. As the third curve in Figure 6.8 shows, the load on the most
heavily loaded nodes has dropped, yet remains well above AC. Surprisingly, the
reliability was barely affected, dropping by 2% to 0.4569. If highly connected nodes
were to stop forwarding for friends due to too much traffic, the load would shift to
other nodes and the overall system performance would not be greatly affected.
Instead of reacting to high load, nodes may wish to only provide a limited number
of social links for routing from the start. We limited all nodes to using only at most
20 social links for SPROUT. As we can see from the Limit 20 curve in Figure 6.8, the
load on the highly loaded peers (excluding the most loaded peer) has fallen further,
though not significantly below the No Top 10 scenario. The average path reliability has
dropped only an additional 1.5% to 0.4500.
In the end, it is the system architect who must decide whether the load skew is
acceptable. For weakly connected homogeneous systems, fair load distribution may
be critical. For other systems, improved reliability may be more important. In fact,
one could take advantage of this skew. Adding one highly-connected large-capacity
node to the network would increase reliability while significantly decreasing all other
nodes’ load.
6.4 Related and Future Work
In [21], Castro et al. propose using stricter network identifier assignment and density
checks to detect misrouting attacks in DHTs. They suggest using constrained routing
tables and redundant routing to circumvent malicious nodes and provide more secure
routing. SPROUT is complementary to their approach, simply increasing the probability
that the message will be routed correctly the first time. One technique of theirs
that would be especially useful in our system is their route failure test, based on
measuring the density of network IDs around oneself and the purported destination.
Not only can this technique be used to determine when a route has failed, but it can
also be used to evaluate the trustworthiness of a node's sequential neighbors by comparing
local density to that at random locations in ID space or around friends.
As discussed in Section 6.1, the LOCKSS digital preservation system uses a friends
list containing peers with which one has a priori trust relationships. By including
some peers from this list in the voting process, it reduces the influence of an adversary
who attempts to poison the peers’ reference lists with its many virtual identities [87].
One open question is whether node IDs can be assigned more intelligently to
improve trustworthiness. That is, if identifiers were assigned to nodes based on
the current IDs of their connected friends, what algorithm or distribution for ID
assignment would optimize our ability to route over social links?
One method to provide greater reliability in a DHT, for fault tolerance and/or
security, is to replicate the index at multiple nodes. With k-replication,
when we insert, update, or search for an entry in the DHT, we must contact k nodes
determined by using k hash functions. If a good node A wishes to insert an item
into the DHT, it attempts to contact all k replicas. Each message has an expected
probability p of having traversed only well-behaved nodes to the destination. Likewise,
if node B wishes to look up the item A inserted it can try to contact all k replicas,
each time with an expected probability of success of p. Assuming neither A nor B can
determine whether they contacted a good node or are being lied to, the probability of
B locating A’s item is 1− (1− p2)k. Using the values in Table 6.1 for p and a typical
replication factor of k = 3, SPROUT would succeed 41% of the time compared to
only 26% for AC.
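The replication arithmetic above is easy to check directly. The p values below are illustrative round numbers chosen to reproduce the reported 41% and 26%; the exact single-path success probabilities are those of Table 6.1.

```python
def replicated_lookup_success(p, k):
    # Both the insert and the lookup message must traverse only
    # well-behaved nodes (probability p each, so p*p per replica);
    # with k independent replicas, the lookup fails only if all k fail.
    return 1 - (1 - p * p) ** k

print(round(replicated_lookup_success(0.40, 3), 2))  # SPROUT-like p -> 0.41
print(round(replicated_lookup_success(0.31, 3), 2))  # AC-like p     -> 0.26
```

Because success scales with p squared per replica, even modest improvements in single-path reliability compound noticeably once replication is added.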
6.5 Conclusion
Today’s peer-to-peer systems are very vulnerable to malicious attacks. The anonymity
and transience of their members make it difficult to determine whom to trust. Integrating
social networks with P2P networks will provide this much-needed trust information.
We have presented a method for leveraging the trust relationships gained by marrying
a peer-to-peer system with a social network, and have shown how to improve the
expected number of query results and how to reduce the expected delays. We
described a model for evaluating routing algorithms in such a system and proposed
SPROUT, a routing algorithm designed to leverage trust relationships given by social
links. Our results demonstrate how SPROUT can significantly improve the likelihood
of getting query results in a timely fashion, when a large fraction of nodes are mali-
cious. Though flooding-based search schemes are far more robust when threatened
by a large number of malicious users, with the right techniques structured networks
can obtain acceptable performance at far lower bandwidth cost.
Chapter 7
Mitigating Routing Misbehavior in
Mobile Ad Hoc Networks
There will be tremendous growth over the next decade in the use of wireless com-
munication, from satellite transmission into many homes to wireless personal area
networks. As the cost of wireless access drops, wireless communications could re-
place wired in many settings. One advantage of wireless is the ability to transmit
data among users in a common area while remaining mobile. However, the distance
between participants is limited by the range of transmitters or their proximity to
wireless access points. Ad hoc wireless networks mitigate this problem by allowing
out of range nodes to route data through intermediate nodes.
Ad hoc networks have a wide array of military and commercial applications. Ad
hoc networks are ideal in situations where installing an infrastructure is not possible
because the infrastructure is too expensive or too vulnerable, the network is too
transient, or the infrastructure was destroyed. For example, nodes may be spread
over too large an area for one base station and a second base station may be too
expensive. An example of a vulnerable infrastructure is a military base station on
a battlefield. Networks for wilderness expeditions and conferences may be transient
if they exist for only a short period of time before dispersing or moving. Finally,
if network infrastructure has been destroyed due to a disaster, an ad hoc wireless
network could be used to coordinate relief efforts. Since DARPA’s PRNET [71], the
area of routing in ad hoc networks has been an open research topic.
Ad hoc networks maximize total network throughput by using all available nodes
for routing and forwarding. Therefore, the more nodes that participate in packet
routing, the greater the aggregate bandwidth, the shorter the possible routing paths,
and the smaller the possibility of a network partition. However, a node may misbehave
by agreeing to forward packets and then failing to do so, because it is overloaded,
selfish, malicious, or broken. An overloaded node lacks the CPU cycles, buffer space
or available network bandwidth to forward packets. A selfish node is unwilling to
spend battery life, CPU cycles, or available network bandwidth to forward packets
not of direct interest to it, even though it expects others to forward packets on its
behalf. A malicious node launches a denial of service attack by dropping packets. A
broken node might have a software fault that prevents it from forwarding packets.
In ad hoc networks, misbehaving mobile nodes can be a significant problem. Sim-
ulations presented in this chapter show that if 10%-40% of the nodes in the ad hoc
network misbehave, then the average throughput degrades by 16%-32%. However,
the worst case throughput experienced by any one node may be worse than the av-
erage, because nodes that try to route through a misbehaving node experience high
loss while other nodes experience no loss. Thus, even a few misbehaving nodes can
have a severe impact.
One solution to misbehaving nodes is to forward packets only through nodes that
share an a priori trust relationship. A priori trust relationships are based on pre-
existing relationships built outside of the context of the network (e.g. friendships,
companies, and armies). SPROUT, discussed in the previous chapter, used these re-
lationships to improve routing. However, several issues prevent a priori trust routing
from being practical in mobile ad hoc networks:
a) SPROUT leveraged existing social network services, which may employ a central-
ized trusted entity, such as AOL Instant Messenger. No such service is likely to
exist in scenarios where ad hoc networks are deployed.
b) While in a wired overlay network any peer can contact a friendly peer directly,
mobile nodes are limited to routing through nodes within radio transmission range.
Even if one’s friends are participating in the ad hoc network, they may not be
in communication range. Routing packets towards a destination through friendly
nodes for even one or two hops may be impossible.
c) Although relying solely on a priori trust-based forwarding reduces the number of
misbehaving nodes, it will exclude untrusted well behaved nodes whose presence
could improve ad hoc network performance.
d) Finally, in the more hostile environments and scenarios for which mobile ad hoc
networks are envisioned (e.g. battlefield), trusted nodes are more likely to be
compromised.
Another solution to misbehaving nodes is to attempt to forestall or isolate these
nodes from within the actual routing protocol for the network. However, this would
add significant complexity to protocols whose behavior must be very well defined.
In fact, current versions of mature ad hoc routing algorithms, including DSR [70],
AODV [31], TORA [26], DSDV [106], STAR [51], and others [39] only detect if the
receiver’s network interface is accepting packets, but they otherwise assume that
routing nodes do not misbehave. Although trusting all nodes to be well behaved
increases the number of nodes available for routing, it also admits misbehaving nodes
to the network.
In this chapter we explore a different approach, and install extra facilities on top
of the network routing protocol to detect and mitigate routing misbehavior. In this
way, only minimal changes to the underlying routing algorithm are needed in order
to take advantage of our mechanisms. We introduce two extensions to any ad hoc
routing algorithm that mitigate the effects of routing misbehavior: the Watchdog
and the Pathrater. The Watchdog identifies misbehaving nodes, while the Pathrater
avoids routing packets through these nodes. When a node forwards a packet, the
node’s Watchdog verifies that the next node in the path also forwards the packet.
The Watchdog does this by listening promiscuously to the next node’s transmissions.
If the next node does not forward the packet, then it is misbehaving. The Pathrater
uses this knowledge of misbehaving nodes to choose the network path that is most
likely to deliver packets. In this chapter, we demonstrate how to use our extensions
with the Dynamic Source Routing algorithm (DSR) [70]. However, Watchdog and
Pathrater can be easily applied to other ad hoc routing protocols.
Using the ns network simulator [43], we show that the two techniques increase
throughput by 17% in the presence of up to 40% misbehaving nodes during moderate
mobility, while increasing the ratio of overhead transmissions to data transmissions
from the standard routing protocol’s 9% to 17%. During extreme mobility, Watchdog
and Pathrater can increase network throughput by 27%, while increasing the percent-
age of overhead transmissions from 12% to 24%. We describe mechanisms to reduce
this overhead in Section 7.6.
The remainder of this chapter is organized as follows. Section 7.1 specifies our
assumptions about ad hoc networks and gives background information about DSR.
Section 7.2 describes the Watchdog and Pathrater extensions. Section 7.3 describes
the methodology we use in our simulations and the metrics we use to evaluate the
results. We present these results in Section 7.4. Section 7.5 presents related work and
Section 7.7 concludes the chapter.
The basis of this chapter originally appeared in [97].
7.1 Assumptions and Background
This section outlines the assumptions we make regarding the properties of the physical
and network layers of ad hoc networks and includes a brief description of DSR, the
routing protocol we use.
7.1.1 Definitions
We use the term neighbor to refer to a node that is within wireless transmission
range of another node. Likewise, neighborhood refers to all the nodes that are within
wireless transmission range of a node.
7.1.2 Physical Layer Characteristics
Throughout this chapter we assume bidirectional communication symmetry on every
link between nodes. This means that if a node B is capable of receiving a message
from a node A at time t, then node A could instead have received a message from node
B at time t. This assumption is often valid, since many wireless MAC layer protocols,
including IEEE 802.11 and MACAW [13], require bidirectional communication for
reliable transmission. The Watchdog mechanism relies on bidirectional links.
In addition, we assume wireless interfaces that support promiscuous mode oper-
ation. Promiscuous mode means that if a node A is within range of a node B, it
can overhear communications to and from B even if those communications do not
directly involve A. Lucent Technologies’ WaveLAN interfaces have this capability.
While promiscuous mode is not appropriate for all ad hoc network scenarios (partic-
ularly some military scenarios) it is useful in other scenarios for improving routing
protocol performance [70].
[Figure omitted: three panels (a)-(c) showing the spread of a route request between nodes S and D.]
Figure 7.1: Example of a route request. (a) Node S sends out a route request packet to find a path to node D. (b) The route request is forwarded throughout the network, each node adding its address to the packet. (c) D then sends back a route reply to S using the path contained in one of the route request packets that reached it. The thick lines represent the path the route reply takes back to the sender.
7.1.3 Dynamic Source Routing (DSR)
DSR is an on-demand, source routing protocol. Every packet has a route path con-
sisting of the addresses of nodes that have agreed to participate in routing the packet.
The protocol is referred to as “on-demand” because route paths are discovered at the
time a source sends a packet to a destination for which the source has no path.
We divide DSR into two main functions: route discovery and route maintenance.
Figure 7.1 illustrates route discovery. Node S (the source) wishes to communicate
with node D (the destination) but does not know any paths to D. S initiates a route
discovery by broadcasting a route request packet to its neighbors that contains
the destination address D. The neighbors in turn append their own addresses to the
route request packet and rebroadcast it. This process continues until a route
request packet reaches D. D must now send back a route reply packet to inform S
of the discovered route. Since the route request packet that reaches D contains
a path from S to D, D may choose to use the reverse path to send back the reply
(bidirectional links are required here) or to initiate a new route discovery back to S.
Since there can be many routes from a source to a destination, a source may receive
multiple route replies from a destination. DSR caches these routes in a route cache
for future use.
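Route discovery as described above can be sketched as a breadth-first flood in which each copy of the route request carries the accumulated path. This is a deliberate simplification: real DSR also suppresses duplicate requests per node, uses route caches, and may return multiple replies. The function name and graph representation are illustrative.

```python
from collections import deque

def dsr_route_discovery(adj, src, dst):
    # Flood route requests outward from src; each node appends its
    # address before rebroadcasting. The first request to reach dst
    # (BFS order here) supplies the path carried home in the route reply.
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path  # dst returns this path to src in a route reply
        for nbr in adj.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(path + [nbr])
    return None  # no route reply: dst is unreachable

print(dsr_route_discovery(
    {'S': ['A', 'B'], 'A': ['C'], 'B': [], 'C': ['D'], 'D': []},
    'S', 'D'))  # -> ['S', 'A', 'C', 'D']
```

The returned list is exactly the source route that subsequent data packets would carry in their headers.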
The second main function in DSR is route maintenance, which handles link breaks.
A link break occurs when two nodes on a path are no longer in transmission range.
If an intermediate node detects a link break when forwarding a packet to the next
node in the route path, it sends back a message to the source notifying it of that link
break. The source must try another path or do a route discovery if it does not have
another path.
7.2 Watchdog and Pathrater
In this section we present the Watchdog and the Pathrater — tools for detecting
and mitigating routing misbehavior. We also describe the limitations of these meth-
ods. Though we implement these tools on top of DSR, some of our concepts can be
generalized to other source routing protocols. We note those concepts that can be
generalized during our descriptions of the techniques.
7.2.1 Watchdog
The Watchdog method detects misbehaving nodes. Figure 7.2 illustrates how the
Watchdog works. Suppose there exists a path from node S to D through intermediate
nodes A, B, and C. Node A cannot transmit all the way to node C, but it can listen
in on node B’s traffic. Thus, when A transmits a packet for B to forward to C, A
can often tell if B transmits the packet. If encryption is not performed separately for
[Figure omitted: chain S-A-B-C-D.]
Figure 7.2: When B forwards a packet from S toward D through C, A can overhear B's transmission and can verify that B has attempted to pass the packet to C. The solid line represents the intended direction of the packet sent by B to C, while the dashed line indicates that A is within transmission range of B and can overhear the packet transfer.
[Figure omitted: chain S-A-B-C-D with packets 1 and 2 colliding at A.]
Figure 7.3: Node A does not hear B forward packet 1 to C, because B's transmission collides at A with packet 2 from the source S.
each link, which can be expensive, then A can also tell if B has tampered with the
payload or the header.
We implement the Watchdog by maintaining a buffer of recently sent packets and
comparing each overheard packet with the packet in the buffer to see if there is a
match. If so, the packet in the buffer is removed and forgotten by the Watchdog,
since it has been forwarded on. If a packet has remained in the buffer for longer than
a certain timeout, the Watchdog increments a failure tally for the node responsible
for forwarding on the packet. If the tally exceeds a certain threshold bandwidth, it
determines that the node is misbehaving and sends a message to the source notifying
it of the misbehaving node.
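The buffer mechanism just described might be sketched as follows. The class name, timeout, and threshold values are illustrative assumptions, not the dissertation's implementation.

```python
import time

class Watchdog:
    def __init__(self, timeout=0.5, threshold=5):
        self.timeout = timeout      # seconds to wait for the next hop's forward
        self.threshold = threshold  # failure tally at which a node is accused
        self.pending = {}           # packet_id -> (next_hop, deadline)
        self.failures = {}          # next_hop -> failure tally

    def sent(self, packet_id, next_hop, now=None):
        # Buffer a packet we just handed to next_hop for forwarding.
        now = time.monotonic() if now is None else now
        self.pending[packet_id] = (next_hop, now + self.timeout)

    def overheard(self, packet_id):
        # We overheard the packet being forwarded: remove and forget it.
        self.pending.pop(packet_id, None)

    def tick(self, now=None):
        # Expire timed-out packets, incrementing the responsible node's
        # tally; return nodes whose tally just crossed the threshold
        # (at which point the real protocol notifies the source).
        now = time.monotonic() if now is None else now
        accused = []
        for pid, (hop, deadline) in list(self.pending.items()):
            if now >= deadline:
                del self.pending[pid]
                self.failures[hop] = self.failures.get(hop, 0) + 1
                if self.failures[hop] == self.threshold:
                    accused.append(hop)
        return accused
```

Keeping a tally rather than accusing on a single timeout is what lets the Watchdog tolerate the ambiguous-collision cases discussed below.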
The Watchdog technique has advantages and weaknesses. DSR with the Watchdog
has the advantage that it can detect misbehavior at the forwarding level and not just
the link level. Watchdog’s weaknesses are that it might not detect a misbehaving
node in the presence of 1) ambiguous collisions, 2) receiver collisions, 3) limited
transmission power, 4) false misbehavior, 5) collusion, and 6) partial dropping.
The ambiguous collision problem prevents A from overhearing transmissions from
B. As Figure 7.3 illustrates, a packet collision can occur at A while it is listening for B
to forward on a packet. A does not know if the collision was caused by B forwarding
[Figure omitted: chain S-A-B-C-D with packets 1 and 2 colliding at C.]
Figure 7.4: Node A believes that B has forwarded packet 1 on to C, though C never received the packet due to a collision with packet 2.
on a packet as it should or if B never forwarded the packet and the collision was
caused by other nodes in A’s neighborhood. Because of this uncertainty, A should
not immediately accuse B of misbehaving, but should instead continue to watch B
over a period of time. If A repeatedly fails to detect B forwarding on packets, then
A can assume that B is misbehaving.
In the receiver collision problem, node A can only tell whether B sends the packet
to C, but it cannot tell if C receives it (Figure 7.4). If a collision occurs at C when
B first forwards the packet, A only sees B forwarding the packet and assumes that
C successfully receives it. Thus, B could skip re-transmitting the packet and leave A
none the wiser. B could also purposefully cause the transmitted packet to collide at
C by waiting until C is transmitting and then forwarding on the packet. In the first
case, a node could be selfish and not want to waste power with retransmissions. In
the latter case, the only reason B would have for taking the actions that it does is
because it is malicious. B wastes battery power and CPU time, so it is not selfish.
An overloaded node would not engage in this behavior either, since it wastes badly
needed CPU time and bandwidth. Thus, this second case should be a rare occurrence.
Another problem can occur when nodes falsely report other nodes as misbehaving.
A malicious node could attempt to partition the network by claiming that some nodes
following it in the path are misbehaving. For instance, node A could report that
node B is not forwarding packets when in fact it is. This will cause S to mark B as
misbehaving when A is the culprit. This behavior, however, will be detected. Since
A is passing messages on to B (as verified by S), then any acknowledgements from
D to S will go through A to S, and S will wonder why it receives replies from D
240 CHAPTER 7. MITIGATING MANET MISBEHAVIOR
when supposedly B dropped packets in the forward direction. In addition, if A drops
acknowledgements to hide them from S, then node B will detect this misbehavior and
will report it to D.
Another problem is that a misbehaving node that can control its transmission
power can circumvent the Watchdog. A node could limit its transmission power
such that the signal is strong enough to be overheard by the previous node but too
weak to be received by the true recipient. This would require that the misbehaving
node know the transmission power required to reach each of its neighboring nodes.
Only a node with malicious intent would behave in this manner — selfish nodes have
nothing to gain since battery power is wasted and overloaded nodes would not relieve
any congestion by doing this.
Multiple nodes in collusion can mount a more sophisticated attack. For example,
B and C from Figure 7.2 could collude to cause mischief. In this case, B forwards
a packet to C but does not report to A when C drops the packet. Because of this
limitation, it may be necessary to disallow two consecutive untrusted nodes in a
routing path. In this study, we only deal with the possibility of nodes acting alone.
The harder problem of colluding nodes is being studied by Johnson at CMU [69].
Colluding nodes pose another threat by intercepting and dropping misbehavior
reports. If a node notices the next hop node is not forwarding packets and sends a
notification back along the path to the sender, a malicious node inserted earlier in
the path may drop the notification, preventing the source from receiving it. Once
again, Watchdog could be employed in the reverse direction. However, the node that
detects the dropped report would have to establish a new route to the source node,
adding complexity and resource usage to the protocol.
Finally, a node can circumvent the Watchdog by dropping packets at a lower
rate than the Watchdog's configured minimum misbehavior threshold. Although the
Watchdog will not detect this node as misbehaving, the node is still forced to forward at
the threshold bandwidth. In this way the Watchdog serves to enforce this minimum
bandwidth.
The Watchdog mechanism could be used to some degree to detect replay attacks
but would require maintaining a great deal of state information at each node as it
monitors its neighbors to ensure that they do not retransmit a packet that they have
already forwarded. Also, if a collision has taken place at the receiving node, it would
be necessary and correct for a node to retransmit a packet, which may appear as a
replay attack to the node acting as its Watchdog. Therefore, detecting replay attacks
would neither be an efficient nor an effective use of the Watchdog mechanism.
For the Watchdog to work properly, it must know where a packet should be in two
hops. In our implementation, the Watchdog has this information because DSR is a
source routing protocol. If the Watchdog does not have this information (for instance
if it were implemented on top of a hop-by-hop routing protocol), then a malicious or
broken node could broadcast the packet to a non-existent node and the Watchdog
would have no way of knowing. Because of this limitation, the Watchdog works best
on top of a source routing protocol.
7.2.2 Pathrater
The Pathrater, run by each node in the network, combines knowledge of misbehaving
nodes with link reliability data to pick the route most likely to be reliable. Each
node maintains a rating for every other node it knows about in the network. It
calculates a path metric by averaging the node ratings in the path. We choose this
metric because it gives a comparison of the overall reliability of different paths and
allows Pathrater to emulate the shortest length path algorithm when no reliability
information has been collected, as explained below. If there are multiple paths to the
same destination, we choose the path with the highest metric. Note that this differs
from standard DSR, which chooses the shortest path in the route cache. Further note
that since the Pathrater depends on knowing the exact path a packet has traversed,
it must be implemented on top of a source routing protocol.
The Pathrater assigns ratings to nodes according to the following algorithm. When
a node in the network becomes known to the Pathrater (through route discovery),
the Pathrater assigns it a “neutral” rating of 0.5. A node always rates itself with
a 1.0. This ensures that when calculating path rates, if all other nodes are neutral
nodes (rather than suspected misbehaving nodes), the Pathrater picks the shortest
length path. The Pathrater increments the ratings of nodes on all actively used paths
by 0.01 at periodic intervals of 200 ms. An actively used path is one on which the
node has sent a packet within the previous rate increment interval. The maximum
value a neutral node can attain is 0.8. We decrement a node’s rating by 0.05 when
we detect a link break during packet forwarding and the node becomes unreachable.
The lower bound rating of a “neutral” node is 0.0. The Pathrater does not modify
the ratings of nodes that are not currently in active use.
We assign a special highly negative value, −100 in the simulations, to nodes sus-
pected of misbehaving by the Watchdog mechanism. When the Pathrater calculates
the path metric, negative path values indicate the existence of one or more suspected
misbehaving nodes in the path. If a node is marked as misbehaving due to a temporary
malfunction or an incorrect accusation, it would be preferable if it were not permanently
excluded from routing. Therefore nodes that have negative ratings should have their
ratings slowly increased or set back to a non-negative value after a long timeout. This
is not implemented in our simulations since the current simulation period is too short
to reset a misbehaving node’s rating. Section 7.4.3 discusses the effect on throughput
of accusing well-behaving nodes.
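The rating rules above can be collected into a short sketch. The class and method names are ours; the constants are the values quoted in the text (0.5 initial, +0.01 per 200 ms interval capped at 0.8, −0.05 on link breaks floored at 0.0, and −100 for suspected misbehavers).

```python
# Sketch of the Pathrater's rating rules and path selection.
NEUTRAL = 0.5        # initial rating for a newly discovered node
SELF_RATING = 1.0    # a node always rates itself 1.0
INCREMENT = 0.01     # added every 200 ms for nodes on actively used paths
MAX_NEUTRAL = 0.8    # ceiling for a neutral node's rating
DECREMENT = 0.05     # subtracted when a link break is detected
MIN_NEUTRAL = 0.0    # floor for a neutral node's rating
MISBEHAVING = -100.0 # assigned to nodes the Watchdog suspects

class Pathrater:
    def __init__(self, own_id):
        self.ratings = {own_id: SELF_RATING}

    def discover(self, node):
        self.ratings.setdefault(node, NEUTRAL)

    def tick(self, active_nodes):
        """Periodic (200 ms) increment for nodes on actively used paths."""
        for n in active_nodes:
            r = self.ratings.get(n, NEUTRAL)
            if MIN_NEUTRAL <= r < MAX_NEUTRAL:  # skips misbehavers and self
                self.ratings[n] = min(r + INCREMENT, MAX_NEUTRAL)

    def link_break(self, node):
        r = self.ratings.get(node, NEUTRAL)
        if r >= MIN_NEUTRAL:
            self.ratings[node] = max(r - DECREMENT, MIN_NEUTRAL)

    def mark_misbehaving(self, node):
        self.ratings[node] = MISBEHAVING

    def path_metric(self, path):
        """Average node rating; negative means a suspected misbehaver."""
        return sum(self.ratings.get(n, NEUTRAL) for n in path) / len(path)

    def best_path(self, paths):
        return max(paths, key=self.path_metric)
```

Because the source's own 1.0 rating is included in the average, an all-neutral shorter path scores higher than a longer one, which is how the Pathrater emulates shortest-path routing when no reliability information has been collected.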
When the Pathrater learns that a node on a path currently in use misbehaves, and
it cannot find a path free of misbehaving nodes, it sends out a new route request,
provided we have enabled an extension we call Send Route Request (SRR).
7.3 Methodology
In this section we describe our simulator, simulation parameters, and measured met-
rics.
We use a version of Berkeley’s Network Simulator (ns) [43] that includes wireless
extensions made by the CMU Monarch project. We also use a visualization tool from
CMU called ad-hockey [109] to view the results of our simulations and detect overall
trends in the network. To execute the simulations, we use PCs (450 or 500 MHz
Pentium IIIs with at least 128 MB of RAM) running Red Hat Linux 6.1.
Our simulations take place in a 670 by 670 meter flat space filled with a scattering
of 50 wireless nodes. The physical layer and the 802.11 MAC layer we use are included
in the CMU wireless extensions to ns [15].
7.3.1 Movement and Communication Patterns
The nodes communicate using 10 constant bit rate (CBR) node-to-node connections.
Four nodes are sources for two connections each, and two nodes are sources for one
connection each. Eight of the flow destinations receive only one flow and the ninth
destination receives two flows. The communication pattern we use was developed by
CMU [15].
In all of our node movement scenarios, the nodes choose a destination and move
in a straight line towards the destination at a speed uniformly distributed between 0
meters/second (m/s) and some maximum speed. This is called the random waypoint
model [15]. We limit the maximum speed of a node to 20 m/s (10 m/s on average)
and we set the run-time of the simulations to 200 seconds. Once the node reaches
its destination, it waits for the pause time before choosing a random destination and
repeating the process. We use pause times of 0 and 60 seconds. In addition we use
two different variations of the initial node placement and movement patterns. By
combining the two pause times with two movement patterns, we obtain four different
mobility scenarios.
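The random waypoint model described above amounts to the following loop. This is a schematic sketch, not the ns implementation; the function and parameter names are ours, with defaults matching the 670 m area, 20 m/s maximum speed, 60 s pause time, and 200 s run-time quoted above.

```python
import random

def random_waypoint(area=670.0, max_speed=20.0, pause_time=60.0, sim_time=200.0):
    """Generate (time, x, y) waypoints for one node under the random
    waypoint model: pick a random destination, move toward it in a
    straight line at a uniform-random speed, pause, repeat."""
    t = 0.0
    x, y = random.uniform(0, area), random.uniform(0, area)
    waypoints = [(t, x, y)]
    while t < sim_time:
        dest_x, dest_y = random.uniform(0, area), random.uniform(0, area)
        speed = max(random.uniform(0.0, max_speed), 1e-9)  # avoid div by zero
        t += ((dest_x - x) ** 2 + (dest_y - y) ** 2) ** 0.5 / speed
        x, y = dest_x, dest_y
        waypoints.append((t, x, y))
        t += pause_time  # wait at the destination before moving again
    return waypoints
```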
7.3.2 Misbehaving Nodes
Of the 50 nodes in the simulated network, some variable percentage of the nodes
misbehave. In our simulations, a misbehaving node is one that agrees to participate
in forwarding packets (it appends its address to route request packets) but then
indiscriminately drops all data packets that are routed through it.
We vary the percentage of the network comprised of misbehaving nodes from
0% to 40% in 5% increments. While a network with 40% misbehaving nodes may
seem unrealistic, it is interesting to study the behavior of the algorithms in a more
hostile environment than we hope to encounter in real life. We use Tcl’s [102] built-in
pseudo-random number generator to designate misbehaving nodes randomly. We use
the same seed across the 0% to 40% variation of the misbehaving nodes parameter,
which means that the group of misbehaving nodes in the 10% case is a superset of the
group of misbehaving nodes in the 5% case. This ensures that the obstacles present
in lower percentage misbehaving node runs are also present in higher percentage
misbehaving node runs.
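This superset property follows from fixing the seed and taking a prefix of a single shuffled ordering. The sketch below uses Python's generator for illustration (the text uses Tcl's built-in generator); the function name and seed value are ours.

```python
import random

def misbehaving_nodes(num_nodes, fraction, seed=42):
    """Designate misbehaving nodes pseudo-randomly. Because the shuffled
    ordering depends only on the seed, the set chosen for a higher
    fraction is a superset of the set chosen for a lower fraction."""
    rng = random.Random(seed)
    order = list(range(num_nodes))
    rng.shuffle(order)
    return set(order[: round(num_nodes * fraction)])
```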
7.3.3 Metrics
We evaluate our extensions using the following three metrics:
• Throughput: This is the percentage of sent data packets actually received by
the intended destinations.
• Overhead: This is the ratio of routing-related transmissions (route request,
route reply, route error, and Watchdog) to data transmissions in a sim-
ulation. A transmission is one node either sending or forwarding a packet.
For example, one packet being forwarded across 10 nodes would count as 10
transmissions. We count transmissions instead of packets because we want to
compare routing-related transmissions to data transmissions, but some routing
packets are more expensive to the network than other packets: route re-
quest packets are broadcast to all neighbors which in turn broadcast to all of
their neighbors, causing a tree of packet transmissions. Unicast route reply,
route error, Watchdog, and data packets only travel along a single path.
• Effects of Watchdog false positives on network throughput. False positives occur
when the Watchdog mechanism reports that a node is misbehaving when in fact
it is not, for reasons discussed in Section 7.2.1. We study the impact of this on
throughput.
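The overhead metric above, which counts per-hop transmissions rather than packets, can be made concrete with a small tally. The event representation and names below are illustrative assumptions, not part of the simulator.

```python
from collections import Counter

# Packet types that count as routing-related overhead.
ROUTING_TYPES = {"route_request", "route_reply", "route_error", "watchdog"}

def overhead_ratio(events):
    """events: iterable of (packet_type, transmissions), where each node
    sending or forwarding a packet counts as one transmission, so one
    packet forwarded across 10 nodes contributes 10."""
    totals = Counter()
    for packet_type, transmissions in events:
        kind = "routing" if packet_type in ROUTING_TYPES else "data"
        totals[kind] += transmissions
    return totals["routing"] / totals["data"]
```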
7.4 Simulation Results
In this section we present the results of our simulations. We focus on three metrics of
evaluation: network throughput, routing overhead, and the effects of false positives
on throughput.
We test the utility of various combinations of our extensions: Watchdog (WD),
Pathrater (PR), and send (extra) route request (SRR). We use the SRR extension to
find new paths when all known paths include a suspected misbehaving node. Each of
the following sections includes two graphs of simulation results for two separate pause
times. The first graph is for a pause time of 0 (the nodes are in constant motion) and
the second is for a pause time of 60 seconds before and in between node movement.
We simulate two different node mobility patterns using four different pseudo-random
number generator seeds. The seeds determine which nodes misbehave. We plot the
average of the eight simulations.
Figure 7.5: Overall network throughput as a function of the fraction of misbehaving nodes in the network. (a) 0 second pause time; (b) 60 second pause time. Curves shown: WD=ON/PR=ON/SRR=ON, WD=ON/PR=ON/SRR=OFF, WD=OFF/PR=ON/SRR=OFF, WD=OFF/PR=OFF/SRR=OFF.
7.4.1 Network Throughput
We graph four curves for network throughput: everything enabled, Watchdog and
Pathrater enabled, only Pathrater enabled, and everything disabled. We choose to
graph both everything enabled and everything enabled except SRR, because we want
to isolate performance gains or problems caused by extra route requests. Since the
Pathrater is not strictly a tool to be used for circumventing misbehaving nodes, we
choose to include the graph where only Pathrater is enabled to determine if it increases
network throughput without any knowledge of suspected misbehaving nodes. We do
not graph Watchdog and SRR activated without Pathrater, since without Pathrater
the information about misbehaving nodes would not be used for routing decisions.
Figure 7.5 shows the total network throughput, calculated as the fraction of data
packets generated that are received, versus the fraction of misbehaving nodes in the
network for the combinations of extensions. In the case where the network contains
no misbehaving nodes, all four curves achieve around 95% throughput. After the 0%
misbehaving node case, the graphs diverge.
As expected, the simulations with all three extensions active perform the best by a
considerable margin as misbehaving nodes are added to the network. The mechanisms
                       Maximum   Minimum
0 second pause time    88.6%     75.2%
60 second pause time   95.0%     73.9%

Table 7.1: Maximum and minimum network throughput obtained by any simulation at 40% misbehaving nodes with all features enabled.
increase the throughput by up to 27% compared to the basic protocol, maintaining a
throughput greater than 80% for both pause times, even with 40% misbehaving nodes.
Table 7.1 lists the maximum and minimum throughput achieved in any simulation
run at 40% misbehaving nodes with all options enabled.
When a subset of the extensions is active, performance does not increase as much
over the simulations with no extensions. Watchdog alone does not affect routing
decisions, but it supplies Pathrater with extra information to combat misbehaving
nodes more effectively. When Watchdog is deactivated, the source node has no way of
detecting the misbehaving node in its path to the destination, and so its transmission
flow suffers total packet loss. Pathrater alone cannot detect misbehaving nodes in a
path in order to decrement their ratings (see Section 7.6).
One effect of the randomness of ns is that nodes may receive route replies to their
route requests in a different order in one simulation than in another simulation with
slightly varied parameters. This change can result in a node choosing a path with a
misbehaving node in one run, but not choosing that path in a simulation with more
misbehaving nodes in the network. This may actually result in slight increases in
network throughput when the number of misbehaving nodes increases. For instance,
this is noticeable in the Pathrater-only curve of Figure 7.5 (b) where the throughput
rises from 82% to 84% between 20% and 25% misbehaving nodes.
In both throughput graphs, the everything-disabled and Pathrater-only
curves closely follow each other. From the graphs we conclude that the Pathrater
Figure 7.6: Routing overhead as a ratio of routing packet transmissions to data packet transmissions, plotted against the fraction of misbehaving nodes. (a) 0 second pause time; (b) 60 second pause time. Curves shown: WD=ON/PR=ON/SRR=ON, WD=ON/PR=ON/SRR=OFF, WD=ON/PR=OFF/SRR=OFF, WD=OFF/PR=OFF/SRR=OFF.
alone does not significantly affect performance. In Section 7.6 we suggest some im-
provements to the Pathrater that may increase its utility in the absence of the other
extensions.
7.4.2 Routing Overhead
For routing overhead, we graph four curves: everything on, Pathrater and Watchdog
on, only Watchdog on (Watchdog-only), and everything off. Using the everything
off graph as our basis for comparison, we graph the Watchdog-only curve to find the
overhead generated just by the Watchdog when it sends notifications to senders. The
Watchdog and Pathrater curve shows the overhead added by Watchdog and Pathrater
but with Pathrater’s ability to send out extra route requests disabled. The everything
on curve includes the overhead created by Pathrater when sending out extra route
requests.
Figure 7.6 shows the amount of overhead incurred by activating the different
routing extensions. The greatest effect on routing overhead results from using the
SRR feature, which sends out route requests for a destination to which the only
                       Maximum   Minimum
0 second pause time    31.3%     18.9%
60 second pause time   23.5%     11.0%

Table 7.2: Maximum and minimum overhead obtained by any simulation at 40% misbehaving nodes with all features enabled.
known routes include suspected misbehaving nodes. For 40% misbehaving nodes in
the high mobility scenario, the overhead rises from 12% to 24% when SRR is activated
in the Pathrater. Any route requests generated by SRR will flood the network with
route request and route reply packets, which greatly increase the overhead.
Table 7.2 lists the maximum and minimum overhead for any of the simulations with
all options enabled at 40% misbehaving nodes.
The Watchdog mechanism itself only adds a very small amount of extra overhead
as seen by comparing the Watchdog-only graph with the all-disabled graph. Also, the
added overhead is not affected by the increase in misbehaving nodes in the network.
Using both the Watchdog and Pathrater mechanisms increases the throughput of the
network by 16% at 40% misbehaving nodes with only 6% additional network overhead
(see Figure 7.6 (a)).
Though the overhead added by these extensions is significant, especially when
Pathrater sends out route requests to avoid misbehaving nodes, these extensions still
improve net throughput. Therefore, the main concerns with high overhead involve
issues such as increased battery usage on portables and PDAs. Since the largest
factor accounting for the overhead is route requests, the overhead can be significantly
reduced by optimizing the delay between the Pathrater's route requests and by
incorporating some of the approaches developed for mitigating route requests and
broadcast storms in general [10, 20, 78].
Figure 7.7: Comparison of network throughput between the regular Watchdog and a Watchdog that reports no false positives. (a) 0 second pause time; (b) 60 second pause time.
7.4.3 Effects of False Detection
We compare simulations of the regular Watchdog with a Watchdog that does not
report false positives. Figure 7.7 shows the network throughput lost by the Watchdog
incorrectly reporting well-behaved nodes. These results show that throughput is not
appreciably affected by false positives and that they may even have beneficial side
effects, as described below.
The similarity in throughput can be attributed to a few factors. First, the nodes
incorrectly reported as misbehaving could have moved out of the previous node’s
listening range before forwarding on a packet. If these nodes move out of range
frequently enough to warrant an accusation of misbehavior, they may be unreliable
due to their location, and the source would be better off routing around them. The
fact that more false positives are reported in the 0 second pause time simulations as
compared to the 60 second pause time simulations, as shown in Table 7.3, supports
this conclusion. Table 7.3 shows the average value of false positives reported by the
simulation runs for each pause time and misbehaving node percentage.
Another factor that may account for the similar throughput of the Watchdog’s
performance with and without false positives concerns one of the limitations of the
% misbehaving nodes     0%     5%    10%   15%   20%   25%   30%   35%   40%
0 second pause time   111.2   82.8  90.3  66.5  75.5  60.8  67.5  31.3  50.8
60 second pause time   39.0   57.6  40.8  63.1  35.7  79.5  46.7  21.7  47.2

Table 7.3: Comparison of the number of false positives between the 0 second and 60 second pause time simulations. Averages taken from the simulations with all features enabled.
Watchdog. As described in Section 7.2.1, if a collision occurs while the Watchdog
is waiting for the next node to forward a packet, it may never overhear the packet
being transmitted. If many collisions occur over time, the Watchdog may incorrectly
assume that the next node is misbehaving. However, if a node constantly experiences
collisions, it may actually increase throughput to route packets around areas of high
communication density.
Yet another factor is that increased false positives will result in more paths in-
cluding a suspected misbehaving node. The Pathrater will then send out more route
requests to the destination. This increases the overhead in the network, but it also
provides the sending node with a fresher list of routes for its route cache.
7.5 Related Work
At the time this work was originally published [97] there was no previously published
work on detection of routing misbehavior specific to ad hoc networks, although there
is relevant work by Smith, Murthy and Garcia-Luna-Aceves on securing distance
vector routing protocols from Byzantine routing failures [122]. In their work, they
suggest countermeasures to secure routing messages and routing updates. This work
may be applicable to ad hoc networks in that distance vector routing protocols, such
as DSDV, have been proposed for ad hoc networks.
Zhou and Haas investigate distributed certificate authorities in ad hoc networks
using threshold cryptography[144]. Zhou and Haas take the view that no one sin-
gle node in an ad hoc network can be trusted due to low physical security and low
availability. Therefore, using a single node to provide an important network-wide
service, such as a certificate authority, is very risky. Threshold cryptography al-
lows a certificate authority’s private key to be broken up into shares and distributed
across multiple nodes. To sign a certificate, a subset of the nodes with private key
shares must jointly collaborate. Thus, to mount a successful attack on the certificate
authority, an intruder must compromise multiple nodes.
To further frustrate attack attempts over time, Zhou and Haas’ scheme uses share
refreshing. It is possible that over a long period of time enough share servers could be
compromised to recover the certificate authority’s secret key. Share refreshing allows
uncompromised servers to compute a new private key periodically from the old private
key’s shares. This periodic refreshing means that an attacker must infiltrate a large
number of nodes within a short time span to recover the certificate authority’s secret
key.
Stajano and Anderson [125] elucidate some of the security issues facing ad hoc
networks and investigate ad hoc networks composed of low compute-power nodes such
as home appliances, sensor networks, and PDAs where full public key cryptography
may not be feasible. The authors develop a system in which a wireless device “imprints”
itself on a master device, accepting a symmetric encryption key from the first
device that sends it a key. After receiving that key, the slave device will not recognize
any other device as a master except the device that originally sent it the key. The
authors bring up an interesting denial of service attack: the battery drain attack.
A misbehaving node can mount a denial-of-service attack against another node by
routing seemingly legitimate traffic through the node in an attempt to wear down the
other node’s batteries.
Since this study was originally conducted in 1999, there has been much work in the
MANET community building upon Watchdog and/or Pathrater. Below, we discuss
a sample of the more relevant projects.
Michiardi and Molva designed CORE [99], a reputation system for ad hoc routing
that utilizes Watchdog to detect misbehavior. This protocol is targeted at discourag-
ing misbehavior by selfish nodes and does not protect against malicious nodes. Each
CORE node maintains three types of reputation information about each peer. A
separate functional reputation is calculated for each node based on its performance
of specific network functions (e.g. packet forwarding, route discovery). This informa-
tion is further broken down into subjective and indirect reputations, based on personal
observations and second-hand reports, respectively, similar to the limited reputation
system analyzed in Chapter 5. The application of appropriate weight values when
combining the separate reputations into a single value is essential to the success of
this protocol. However, the authors do not discuss how these weights are determined.
CONFIDANT [16], proposed by Buchegger and Le Boudec, builds upon our work
with two significant extensions. First, in addition to detecting next-hop drops for
packets a node personally forwards, a node also eavesdrops on neighboring nodes
in an attempt to catch misrouting. This technique is likely to result in many false
positives due to the hidden terminal problem. The second extension is a reputation
system where nodes notify “friends” of possible malicious routers, in a way similar
to our limited reputation system. However, to avoid malicious nodes poisoning the
system with false reports, the protocol defines friends to be nodes with which one
has an a priori trust relationship. In the simulations, all well-behaved nodes are
considered to be friends with each other, an idealized assumption they acknowledge
and which is a focus of their future work.
Buchegger et al. [17] also constructed a test-bed architecture that allows experi-
ments on routing attack detection to be conducted using real-world wireless technol-
ogy in actual mobility scenarios. The initial test-bed experiments evaluated the ability
of enhanced passive acknowledgement (PACK), their improved version of Watchdog,
to detect misrouting attacks, as well as modification and fabrication attacks. They
find that some of the possible disadvantages of Watchdog we mention in Section 7.2.1,
such as partial dropping, have little or no effect in real-world scenarios.
7.6 Future Work
This chapter presents initial work in detecting misbehaving nodes and mitigating
their performance impact in ad hoc wireless networks. In this section we describe
some further ideas we would like to explore.
We plan on conducting more rigorous tests of the Watchdog and Pathrater pa-
rameters to determine optimal values to increase throughput in different situations.
Currently we are experimenting with different Watchdog thresholds for deciding when
a node is misbehaving. Some of the variables to optimize for the Pathrater include
the rating increment and decrement amounts, the rate incrementing interval, and the
delay between sending out route requests to decrease the overhead caused by this
feature.
Currently the Pathrater only decrements a node’s rating when another node tries
unsuccessfully to send to it or if the Watchdog mechanism is active and determines
that a node is misbehaving. Without the Watchdog active, the Pathrater cannot
detect misbehaving nodes. An obvious enhancement would be to receive updates from
a reliable transport layer, such as TCP, when ACKs fail to be received. This would
allow the Pathrater to detect bad paths and lower the nodes’ ratings accordingly.
The experiments conducted in this study were of relatively short duration, only 200
seconds. Longer simulations were infeasible because their complexity resulted in long
runtimes. While we believe these simulations adequately evaluate
the Watchdog mechanism, longer simulations may provide more information on the
performance of Pathrater. We would expect its performance to improve as more
information is collected, allowing it to calculate more accurate reputations for nodes
encountered. We postulate that the relative performance improvement gained by
utilizing Pathrater is likely to be similar to the gain seen in Chapter 5 by the local
reputation system over the base case.
All the simulations presented in this chapter use CBR data sources with no relia-
bility requirements. Our next goal is to analyze how the routing extensions perform
with TCP flows common to most network applications. Our focus would then change
from measuring throughput, or dropped packets, to measuring the time to complete
a reliable transmission, such as an FTP transfer. For these tests the modification to
Pathrater described above should improve performance significantly in the case where
the Watchdog is not active.
Finally, we would like to evaluate the Watchdog and Pathrater considering latency
in addition to throughput.
7.7 Conclusion
Ad hoc networks are an increasingly promising area of research with practical ap-
plications, but they are vulnerable in many settings to nodes that misbehave when
routing packets. For robust performance in an untrusted environment, it is necessary
to resist such routing misbehavior.
In this chapter we analyze two possible extensions to DSR to mitigate the effects of
routing misbehavior in ad hoc networks: the Watchdog and the Pathrater. We show
that the two techniques increase throughput by 17% in a network with moderate mo-
bility, while increasing the ratio of overhead transmissions to data transmissions from
the standard routing protocol’s 9% to 17%. During extreme mobility, Watchdog and
Pathrater can increase network throughput by 27%, while increasing the percentage
of overhead transmissions from 12% to 24%.
These results show that we can gain the benefits of an increased number of routing
nodes while minimizing the effects of misbehaving nodes. In addition we show that
this can be done without a priori trust or excessive overhead.
Chapter 8
Conclusion and Future Work
Designing an online resource exchange or content distribution system is a challenging
undertaking, especially when that system is required to be decentralized and
its members fully autonomous. To facilitate system architecture and protocol design,
it is important to understand the behavior of the users and the impact system
parameters have on their actions. This thesis presented our research into designing
reputation systems for autonomous, decentralized, peer-to-peer networks.
We began with an overview of research related to reputation system design. Our
goal was to organize existing ideas and work, to facilitate system design. Chapter 2
presented a taxonomy of reputation system components and their properties, and
discussed how user behavior and technical constraints can conflict. In our discussion,
we described current research (some of which is presented in this thesis) that ex-
emplifies solutions developed and compromises made in order to produce a useable,
implementable system.
Next in Chapters 3 and 4, we presented two theoretical models for how trust
influences users in an online trading system. The first model used a microeconomic
approach, focusing on individual transaction strategies. The second macroeconomic
model quantified system design choices on expected user behavior and participation.
In Chapter 3, we proposed a simple game model that captures the incentives
dictating the interaction between buyers and sellers, and studied the strategies that
evolve in different scenarios, such as eBay auctions. In particular, we focused on the
effect seller history has on player strategy. We proved that for simple reputation-based
buyer strategies, a seller's decision whether or not to cheat depends only on the
length of her history, not on the particular actions committed. Given a finite number
of transactions, a seller can compute a utility-optimal sequence of cooperations and
defections. As more advanced buyer/seller strategies evolve, equilibrium is reached
when players predominantly cooperate.
Chapter 4 identified key attributes that drive the actions of users of trading sys-
tems, whether they are cooperative, selfish, or malicious. We then presented an
economic model that captures the behavior of peers in a system that employs incen-
tive schemes and reputation systems to mitigate the effects of both freeriding and
misbehavior. We showed how the basic model could be modified to account for de-
sign decisions and derived a more generalized model. Results from an individual
transaction-based simulator approximated our economic model’s expectations, sug-
gesting the model captures the key elements of a reputation-based trading system.
Next, we concentrated on a more realistic study by fully simulating an unstruc-
tured peer-to-peer system using real statistics and traces from actual large file-sharing
P2P networks. We evaluated the effect of limited reputation information sharing on
the efficiency and load distribution of a peer-to-peer system. Chapter 5 presented ad-
vantages and disadvantages of resource selection techniques based on peer reputation.
We also investigated the cost in efficiency of two identity models for peer-to-peer rep-
utation systems. Our results show that, using some simple mechanisms, reputation
systems can provide a factor of 20 improvement in performance over no reputation
system.
Finally, we proposed two protocols to improve message routing throughput using
trust information. Each was targeted at a very different kind of peer-to-peer network:
one at “traditional” structured P2P systems, and the other at mobile ad hoc
networks. The sources of trust information also varied greatly.
In Chapter 6, we investigated how existing social networks can benefit P2P data
networks by leveraging the inherent trust associated with social links. We presented
a trust model that lets us compare routing algorithms for P2P networks overlaying
social networks. We proposed SPROUT, a DHT routing algorithm that, by using
social links, significantly increases the number of query results and reduces query
delays. We discussed further optimization and design choices for both the model and
the routing algorithm. Finally, we evaluated our model versus both regular DHT
routing and Gnutella-like flooding.
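To give a feel for the trade-off SPROUT exploits, here is a toy path-trust calculation. It assumes a simple multiplicative model in which every hop forwards correctly with some independent probability; the function name and the per-hop reliabilities (0.95 for a friend link, 0.6 for an unknown peer) are illustrative assumptions, not the trust function or parameters of Chapter 6.

```python
# Sketch: why routing over social links can pay off, assuming a
# multiplicative path-trust model (illustrative parameters only).

def path_reliability(hops, p_social=0.95, p_stranger=0.6):
    """Probability a message survives every hop of the path.

    `hops` is a sequence of 'social' / 'stranger' labels; each hop is
    assumed to forward correctly with the given independent probability.
    """
    r = 1.0
    for hop in hops:
        r *= p_social if hop == "social" else p_stranger
    return r

# Under this model, even a longer chain of trusted friends can be more
# reliable end-to-end than a shorter chain of unknown peers.
social_path = ["social"] * 5        # 5 friend-to-friend hops
dht_path = ["stranger"] * 3         # 3 regular DHT hops
print(path_reliability(social_path))  # 0.95**5 ≈ 0.774
print(path_reliability(dht_path))     # 0.6**3  ≈ 0.216
```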
Chapter 7 described two techniques that improve throughput in an ad hoc network
in the presence of nodes that agree to forward packets but fail to do so. To mitigate
this problem, we proposed categorizing nodes based upon their dynamically measured
behavior. We suggested a Watchdog that identifies misbehaving nodes and a Pathrater
that helps routing protocols avoid these nodes. Through simulation we evaluated
Watchdog and Pathrater using packet throughput, percentage of overhead (routing)
transmissions, and the accuracy of misbehaving node detection. When used together
in a network with moderate mobility, the two techniques increase throughput by
17% in the presence of 40% misbehaving nodes, while increasing the percentage of
overhead transmissions from the standard routing protocol’s 9% to 17%. During
extreme mobility, Watchdog and Pathrater can increase network throughput by 27%,
while increasing the overhead transmissions from the standard routing protocol’s 12%
to 24%.
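The mechanism can be summarized in a few lines of Python. This is an illustrative sketch only: the class name, buffer layout, and failure threshold are assumptions for exposition, not the implementation evaluated in Chapter 7.

```python
# Minimal watchdog sketch: a node remembers packets it handed to its next
# hop and checks, by overhearing the channel, that they were retransmitted.
# Threshold and data layout are illustrative assumptions.

class Watchdog:
    def __init__(self, threshold=5):
        self.pending = {}      # packet id -> next-hop node id
        self.failures = {}     # node id -> count of dropped packets
        self.threshold = threshold

    def sent(self, packet_id, next_hop):
        """Record a packet we handed to next_hop for forwarding."""
        self.pending[packet_id] = next_hop

    def overheard(self, packet_id):
        """The next hop retransmitted the packet: clear it from the buffer."""
        self.pending.pop(packet_id, None)

    def timeout(self, packet_id):
        """Packet lingered too long unforwarded: tally a failure."""
        node = self.pending.pop(packet_id, None)
        if node is not None:
            self.failures[node] = self.failures.get(node, 0) + 1

    def misbehaving(self, node):
        """Accuse a node once its failure tally crosses the threshold."""
        return self.failures.get(node, 0) >= self.threshold

wd = Watchdog(threshold=2)
wd.sent("p1", "B"); wd.timeout("p1")
wd.sent("p2", "B"); wd.timeout("p2")
print(wd.misbehaving("B"))   # True -- B dropped two packets in a row
```

A Pathrater-style component would then consume `misbehaving()` verdicts to bias route selection away from accused nodes.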
The work contained in this thesis runs the gamut of system design and analysis,
from categorization and high-level theoretical models to specific system protocols
and mechanisms. At every level, open problems remain: many facets of reputation
system design have yet to be explored beyond what is presented here. We end by
presenting areas of research, building upon the work described in this thesis, that
must be explored in order to provide efficient and effective reputation management
in P2P systems.
Refine High-Level Models
Both the micro- and macroeconomic models presented in Chapters 3 and 4 (respectively)
could be refined into more sophisticated, realistic models. Our study of agent
strategies under reputation was limited to the perfect-knowledge case, in which buyers
had access to a seller's complete transaction history. The next step would be to
analyze agent strategies with limited or inaccurate views of transaction histories.
The mathematical model presented in Chapter 4 also assumed a perfect reputa-
tion system capable of accurately and instantly collecting the amount and type of
contributions made by each peer and maintaining their trust rating. Perturbations
could be added to the model in order to improve realism. One example would be to
introduce a delay between when contributions occur and when they affect the
contributor's reputation. A peer's contributive capacity could be made a function of
time, rather than constant as we assumed it to be. In addition, adding an error factor
to each contribution in the trust equation would mimic inaccurate or incomplete
transaction reporting.
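Such perturbations are straightforward to prototype. The sketch below discretizes the trust update ΔT = r_gC_G(1 − T) − (r_bC_B + δ)T of Chapter 4 (see also Appendix D) and adds a reporting delay and Gaussian reporting noise; the numeric parameter values and the noise model are illustrative assumptions.

```python
import random

# Sketch: perturbing the ideal trust update of Chapter 4 with a
# reporting delay and per-step noise (parameters are illustrative).

def simulate(steps, rg_cg=0.05, rb_cb=0.01, delta=0.005,
             delay=0, noise=0.0, t0=0.1, seed=1):
    rng = random.Random(seed)
    history = [rg_cg] * steps   # contribution made at each step
    t = t0
    trace = [t]
    for i in range(steps):
        # contributions take `delay` steps to reach the reputation system
        contrib = history[i - delay] if i >= delay else 0.0
        contrib += rng.gauss(0.0, noise)          # inaccurate reporting
        t += contrib * (1.0 - t) - (rb_cb + delta) * t
        trace.append(t)
    return trace

ideal = simulate(500)                          # perfect reputation system
delayed = simulate(500, delay=50, noise=0.01)  # delayed, noisy reporting
# Both converge toward rg_cg / (rg_cg + rb_cb + delta) ≈ 0.77,
# but the perturbed run lags well behind early on.
```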
SPROUT and Watchdog
The connectivity of nodes in social networks is not as regular as in structured P2P
networks. The additional message routing links derived from social connections used
in SPROUT could lead to imbalanced load across peers. An analysis of the effects
of SPROUT on load distribution, perhaps similar to that presented in Chapter 5, is
needed.
Some of the advantages of SPROUT could be applied to Pathrater. We expect the
performance of Pathrater to increase when it can make use of explicitly trusted nodes.
Trusted node lists are available in some ad hoc network scenarios, and we would like
to analyze the performance of our routing extensions in these scenarios. However, due
to the constrained transmission range, a node’s choice of link neighbors is limited.
Therefore, locating a trusted node may be unlikely, diminishing the effectiveness of
using a priori trust.
Bootstrapping Trust
Throughout the thesis we touch on the problem of how peers should initially regard
a new member joining the system. This issue is addressed primarily in Chapter 5,
and touched on in Chapter 4 as well. In those chapters a default low initial trust
rating is assigned to newcomers. However, more research is needed on how to balance
giving new members opportunities to contribute while not falling prey to
whitewashing.
One possibility is to leverage a priori trust relationships with existing users.
Should a peer in good standing vouch for a newcomer, the new node could receive
preferential treatment by the reputation system. Should the newcomer abuse this
trust, a penalty would be incurred by both the vouching peer and the misbehaving
newcomer.
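One way such a vouching scheme could be structured is sketched below. Every name, rating value, and penalty fraction here is a hypothetical illustration of the idea, not a proposed protocol.

```python
# Sketch of the vouching idea: a newcomer inherits a fraction of the
# voucher's standing, and a misbehaving newcomer costs the voucher too.
# All fractions and ratings below are illustrative assumptions.

class ReputationLedger:
    NEWCOMER_RATING = 0.05           # default low initial trust

    def __init__(self):
        self.rating = {}             # peer -> trust rating in [0, 1]
        self.voucher = {}            # newcomer -> peer who vouched for them

    def join(self, peer, vouched_by=None, inherit=0.5):
        if vouched_by in self.rating:
            # preferential treatment: inherit part of the voucher's rating
            self.rating[peer] = inherit * self.rating[vouched_by]
            self.voucher[peer] = vouched_by
        else:
            self.rating[peer] = self.NEWCOMER_RATING

    def punish(self, peer, penalty=0.5, spillover=0.25):
        """Misbehavior cuts the offender's rating; the voucher shares the loss."""
        self.rating[peer] *= 1.0 - penalty
        backer = self.voucher.get(peer)
        if backer is not None:
            self.rating[backer] *= 1.0 - spillover

ledger = ReputationLedger()
ledger.rating["alice"] = 0.8         # peer in good standing
ledger.join("newbie", vouched_by="alice")
ledger.join("stranger")
print(ledger.rating["newbie"], ledger.rating["stranger"])   # 0.4 vs 0.05
ledger.punish("newbie")
print(ledger.rating["alice"])        # 0.6 -- vouching carried a risk
```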
Peer-to-peer systems rely on the goodwill of many unknown and untrusted users
to function effectively. This reliance makes naïve P2P systems vulnerable to malicious
attacks. Reputation systems are needed to detect and deter malicious users, making
these networks available and usable for all. Effective decentralized reputation
systems will establish P2P technology as a viable medium for content distribution
and services.
Appendix A
Proof of Long-Term Reputation
Damage
Set the two utility equations (Eq. 3.6 and Eq. 3.7) equal to each other and solve for
k.
\[ U_c(n+1+k) = U_d(n+1+k) \tag{A.1} \]

\[ U(n) + \left(\frac{vm}{n} - c\right) + \sum_{i=1}^{k}\left(\frac{v\left(F_{-(n+1)}(n+i) + 1\right)}{n+i} - f(n+i+1)\,c\right)
 = U(n) + \frac{vm}{n} + \sum_{i=1}^{k}\left(\frac{v\,F_{-(n+1)}(n+i)}{n+i} - f(n+i+1)\,c\right) \tag{A.2} \]

Cancelling \(U(n)\), \(\frac{vm}{n}\), and the \(\sum_{i=1}^{k} f(n+i+1)\,c\) terms from both sides (A.3), and then the matching \(\sum_{i=1}^{k}\frac{v\,F_{-(n+1)}(n+i)}{n+i}\) terms (A.4), leaves

\[ v\sum_{i=1}^{k}\frac{1}{n+i} = c \tag{A.5} \]

\[ \sum_{i=1}^{k}\frac{1}{n+i} = \frac{c}{v} \tag{A.6} \]

\[ \sum_{i=1}^{n+k}\frac{1}{i} - \sum_{i=1}^{n}\frac{1}{i} = \frac{c}{v} \tag{A.7} \]
Now we have two finite harmonic sums. To simplify the summations, we apply
the asymptotic expansion of the finite harmonic sum [77]:

\[ H_n = \sum_{i=1}^{n}\frac{1}{i} = \ln(n) + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + \frac{1}{120n^4} - \epsilon,
 \qquad \text{where } 0 < \epsilon < \frac{1}{252n^6} \tag{A.8} \]

where \(\gamma\) is the Euler–Mascheroni constant.

\[ \text{Let } \varepsilon'(n) = \frac{1}{2n} \tag{A.9} \]
\[ \text{Clearly, } \ln(n) + \gamma < H_n < \ln(n) + \gamma + \varepsilon'(n) \tag{A.10} \]

Next we substitute the appropriate upper or lower bound for \(H_n\) for each summation
in Eq. A.7 so as to get an upper and a lower bound on \(k\).
\[ \left(\ln(n+k) + \gamma\right) - \left(\ln(n) + \gamma + \varepsilon'(n)\right) < \frac{c}{v} < \left(\ln(n+k) + \gamma + \varepsilon'(n+k)\right) - \left(\ln(n) + \gamma\right) \tag{A.11} \]
\[ \ln(n+k) - \ln(n) - \varepsilon'(n) < \frac{c}{v} < \ln(n+k) + \varepsilon'(n+k) - \ln(n) \tag{A.12} \]
\[ \ln\left(\frac{n+k}{n}\right) - \varepsilon'(n) < \frac{c}{v} < \ln\left(\frac{n+k}{n}\right) + \varepsilon'(n+k) \tag{A.13} \]
Notice that \(\varepsilon'(n+k) \le \varepsilon'(n)\) for all \(k \ge 0\). We can replace \(\varepsilon'(n+k)\) with \(\varepsilon'(n)\) without
invalidating the inequality.
\[ \ln\left(\frac{n+k}{n}\right) - \varepsilon'(n) < \frac{c}{v} < \ln\left(\frac{n+k}{n}\right) + \varepsilon'(n) \tag{A.15} \]
\[ \frac{n+k}{n}\,e^{-\varepsilon'(n)} < e^{c/v} < \frac{n+k}{n}\,e^{\varepsilon'(n)} \tag{A.16} \]
\[ (n+k)\,e^{-\varepsilon'(n)} < n\,e^{c/v} < (n+k)\,e^{\varepsilon'(n)} \tag{A.17} \]

Solving each inequality separately, we have

\[ (n+k)\,e^{-\varepsilon'(n)} < n\,e^{c/v} \qquad\qquad n\,e^{c/v} < (n+k)\,e^{\varepsilon'(n)} \tag{A.19} \]
\[ n+k < n\,e^{c/v}e^{\varepsilon'(n)} \qquad\qquad n+k > n\,e^{c/v}e^{-\varepsilon'(n)} \tag{A.20} \]
\[ k < n\left(e^{c/v}e^{\varepsilon'(n)} - 1\right) \qquad\qquad k > n\left(e^{c/v}e^{-\varepsilon'(n)} - 1\right) \tag{A.21} \]

Notice that \(e^{-\varepsilon'(n)} < 1\) and \(e^{\varepsilon'(n)} > 1\). Therefore, we approximate \(k\) as
\[ k \approx n\left(e^{c/v} - 1\right) \tag{A.22} \]
A.1 Error Bounds
What is the error range for \(k\)? Subtracting the lower bound from the upper bound
we have

\[ n\left(e^{c/v}e^{\varepsilon'(n)} - 1\right) - n\left(e^{c/v}e^{-\varepsilon'(n)} - 1\right) = e^{c/v}\,n\left(e^{\varepsilon'(n)} - e^{-\varepsilon'(n)}\right) \tag{A.23} \]

Consider
\[ \lim_{n\to\infty} n\left(e^{\varepsilon'(n)} - e^{-\varepsilon'(n)}\right) = \lim_{n\to\infty} n\left(e^{\frac{1}{2n}} - e^{-\frac{1}{2n}}\right) \tag{A.24} \]

Substituting \(x = \frac{1}{2n}\) gives
\[ \lim_{x\to 0} \frac{e^{x} - e^{-x}}{2x} \tag{A.25} \]
\[ \lim_{x\to 0} \frac{e^{x} - e^{-x}}{2x} = 1 \tag{A.26} \]

As \(n\to\infty\) the error range decreases and converges to \(e^{c/v}\), a constant with respect
to \(n\). Therefore, the largest error occurs when \(n\) is as small as possible. Recall that
we stated originally that we are not concerned with the situation where the seller is
new to the system; we assume she has already generated a history of several
transactions. Therefore, \(n\) is not small and definitely not 0. For example, using \(n = 5\)
in Eq. A.23, the error range is approximately \(1.002\,e^{c/v}\).

By definition \(c < v\), therefore \(e^{c/v} < e\). In our running example of \(c = \$1\) and
\(v = \$3\), \(e^{1/3} \approx 1.4\), so the error range is less than 1.5. Because we are interested
in \(k\) as an integer, the approximate value from Eq. A.22 cannot be off by more than 1.
Even in the worst case, where \(n = 1\) and \(e^{c/v}\) approaches \(e\), the error range is less
than 3, so the approximate value of \(k\) cannot be off by more than 2. For
the range of \(n\) we are interested in, the approximation of \(k\) should be acceptable.
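The approximation and its error bound are easy to sanity-check numerically. The sketch below compares Eq. A.22 (and the refined approximation of Section A.2) against the smallest integer k satisfying Eq. A.5; the parameter choices are illustrative.

```python
import math

# Numeric check of k ≈ n(e^{c/v} - 1): compare it, and the tighter
# (n + 1/2)(e^{c/v} - 1) of Section A.2, against the exact smallest
# integer k with v * sum_{i=1}^{k} 1/(n+i) >= c  (Eq. A.5).

def exact_k(n, c, v):
    total, k = 0.0, 0
    while v * total < c:
        k += 1
        total += 1.0 / (n + k)
    return k

for n, c, v in [(5, 1, 3), (20, 1, 3), (50, 1, 2), (100, 1, 3)]:
    basic = n * (math.exp(c / v) - 1)
    improved = (n + 0.5) * (math.exp(c / v) - 1)
    k = exact_k(n, c, v)
    # rounded to the nearest integer, both land within 1 of the exact k
    assert abs(round(basic) - k) <= 1
    assert abs(round(improved) - k) <= 1
```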
A.2 Improved Approximation
In the previous section we bounded the error to a range of size \(e^{c/v}\), constant with
respect to \(n\). However, it is possible to do better by using a tighter bound on the
harmonic number [62, 132]:

\[ \frac{1}{24(n+1)^2} < H_n - \ln\left(n+\tfrac{1}{2}\right) - \gamma < \frac{1}{24n^2} \tag{A.27} \]

Using this equation we now have tighter bounds on the error. To make the bounds
easier to use,

\[ \text{Let } \varepsilon''(n) = \frac{1}{24n^2} \tag{A.28} \]
\[ 0 < \varepsilon''(n+1) < H_n - \ln\left(n+\tfrac{1}{2}\right) - \gamma < \varepsilon''(n) \tag{A.29} \]

Substituting into Eq. A.7 we have

\[ \ln\left(n+k+\tfrac{1}{2}\right) + \gamma - \left(\ln\left(n+\tfrac{1}{2}\right) + \gamma + \varepsilon''(n)\right) < \frac{c}{v} < \ln\left(n+k+\tfrac{1}{2}\right) + \gamma + \varepsilon''(n+k) - \left(\ln\left(n+\tfrac{1}{2}\right) + \gamma\right) \tag{A.30} \]

Clearly, \(\varepsilon''(n+k) < \varepsilon''(n)\).
\[ \ln\left(n+k+\tfrac{1}{2}\right) - \ln\left(n+\tfrac{1}{2}\right) - \varepsilon''(n) < \frac{c}{v} < \ln\left(n+k+\tfrac{1}{2}\right) + \varepsilon''(n) - \ln\left(n+\tfrac{1}{2}\right) \tag{A.31} \]
\[ \frac{n+k+\frac{1}{2}}{n+\frac{1}{2}}\,e^{-\varepsilon''(n)} < e^{c/v} < \frac{n+k+\frac{1}{2}}{n+\frac{1}{2}}\,e^{\varepsilon''(n)} \tag{A.32} \]
\[ \left(n+k+\tfrac{1}{2}\right)e^{-\varepsilon''(n)} < \left(n+\tfrac{1}{2}\right)e^{c/v} < \left(n+k+\tfrac{1}{2}\right)e^{\varepsilon''(n)} \tag{A.33} \]

Solving each inequality separately, we have

\[ \left(n+k+\tfrac{1}{2}\right)e^{-\varepsilon''(n)} < \left(n+\tfrac{1}{2}\right)e^{c/v} \qquad\qquad \left(n+\tfrac{1}{2}\right)e^{c/v} < \left(n+k+\tfrac{1}{2}\right)e^{\varepsilon''(n)} \tag{A.34} \]
\[ n+k+\tfrac{1}{2} < \left(n+\tfrac{1}{2}\right)e^{c/v}e^{\varepsilon''(n)} \qquad\qquad n+k+\tfrac{1}{2} > \left(n+\tfrac{1}{2}\right)e^{c/v}e^{-\varepsilon''(n)} \tag{A.35} \]
\[ k < \left(n+\tfrac{1}{2}\right)\left(e^{c/v}e^{\varepsilon''(n)} - 1\right) \qquad\qquad k > \left(n+\tfrac{1}{2}\right)\left(e^{c/v}e^{-\varepsilon''(n)} - 1\right) \tag{A.36} \]

A better approximation for \(k\) than Eq. A.22 is
\[ k \approx \left(n+\tfrac{1}{2}\right)\left(e^{c/v} - 1\right) \tag{A.37} \]
The error range is now

\[ \left(n+\tfrac{1}{2}\right)\left(e^{c/v}e^{\varepsilon''(n)} - 1\right) - \left(n+\tfrac{1}{2}\right)\left(e^{c/v}e^{-\varepsilon''(n)} - 1\right) = e^{c/v}\left(n+\tfrac{1}{2}\right)\left(e^{\varepsilon''(n)} - e^{-\varepsilon''(n)}\right) \tag{A.38} \]

While our previous approximation gave constant bounds on the error range as \(n\)
grew, this approximation's error range shrinks to zero as \(n\) grows. Consider

\[ \lim_{n\to\infty}\left(n+\tfrac{1}{2}\right)\left(e^{\varepsilon''(n)} - e^{-\varepsilon''(n)}\right) = \lim_{n\to\infty}\left(n+\tfrac{1}{2}\right)\left(e^{\frac{1}{24n^2}} - e^{-\frac{1}{24n^2}}\right) \tag{A.39} \]

\[ \text{Substitute } x = \frac{1}{24n^2} \tag{A.40} \]

\[ \lim_{x\to 0}\left(\frac{e^{x} - e^{-x}}{\sqrt{24x}} + \frac{e^{x} - e^{-x}}{2}\right) = 0 + 0 = 0 \tag{A.41} \]

The second term goes to 0, and, using L'Hôpital's rule, the first term also goes to
0. So for sufficiently large \(n\), the error range converges to 0.

What happens for small \(n\)? With our previous approximation, the error range
for \(n = 1\) was approximately \(1.04\,e^{c/v}\), which has an upper bound of approximately
2.8 when \(c = v\). With the improved approximation the error range at \(n = 1\) is less
than \(0.13\,e^{c/v}\), which in the worst case is less than 0.34. Calculating \(k\) to the nearest
integer will therefore be correct with very high probability.
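The constants quoted above can be checked directly: with ε′(n) = 1/(2n) and ε″(n) = 1/(24n²), the two error-range factors evaluate as follows (a quick numeric sketch).

```python
import math

# Numeric check of the two error-range factors: the first approximation's
# n(e^{eps'} - e^{-eps'}) with eps'(n) = 1/(2n), and the improved
# (n + 1/2)(e^{eps''} - e^{-eps''}) with eps''(n) = 1/(24 n^2).

def basic_range(n):
    eps = 1.0 / (2 * n)
    return n * (math.exp(eps) - math.exp(-eps))

def improved_range(n):
    eps = 1.0 / (24 * n * n)
    return (n + 0.5) * (math.exp(eps) - math.exp(-eps))

print(basic_range(1))       # ≈ 1.042  (the "approximately 1.04 e^{c/v}")
print(improved_range(1))    # ≈ 0.125  (less than 0.13 e^{c/v})
print(basic_range(1000))    # → 1 as n grows: range stays ≈ e^{c/v}
print(improved_range(1000)) # → 0 as n grows: range vanishes
```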
Appendix B
Proof of Unique Global Maximum
for Segregated Schedule Utility
From Equation 3.15 we have the following expression for the utility of a segregated
schedule of length \(Z\) with \(x\) cooperations followed by \(Z - x\) defections. We may
express the total utility of such a schedule as

\[ U_{seg}(Z, x) = U = (v - c)\,x + \sum_{i=x}^{Z-1}\frac{vx}{i} \tag{B.1} \]

To prove there can be only one value of \(x\) that maximizes \(U_{seg}(Z, x)\) for a
given \(Z\), we take the second derivative with respect to \(x\) and show that it takes
on only negative values for \(0 \le x \le Z\).

We begin by simplifying Eq. B.1:

\[ U = (v - c)\,x + \sum_{i=x}^{Z-1}\frac{vx}{i} \tag{B.2} \]
\[ = (v - c)\,x + vx\left(\sum_{i=1}^{Z-1}\frac{1}{i} - \sum_{i=1}^{x-1}\frac{1}{i}\right) \tag{B.3} \]
We now have the difference of two finite harmonic sums. A finite harmonic sum
can be expressed analytically as

\[ H_n = \gamma + \psi_0(n+1) \tag{B.4} \]

where \(\gamma\) is the Euler–Mascheroni constant and \(\psi_0\) is the digamma function [132].
Substituting in for the two series gives us

\[ U = (v - c)\,x + vx\left(\gamma + \psi_0(Z) - \left(\gamma + \psi_0(x)\right)\right) \tag{B.5} \]

Next, we simplify and take two derivatives. The derivative of \(\psi_0(z)\) is \(\psi_1(z)\) and
similarly the derivative of \(\psi_1(z)\) is \(\psi_2(z)\), where \(\psi_1(z)\) and \(\psi_2(z)\) are polygamma
functions [133].

\[ U = (v - c)\,x + vx\left(\psi_0(Z) - \psi_0(x)\right) \tag{B.7} \]
\[ = (v - c)\,x + v\,\psi_0(Z)\,x - vx\,\psi_0(x) \tag{B.8} \]
\[ \frac{dU}{dx} = (v - c) + v\,\psi_0(Z) - v\,\psi_0(x) - vx\,\psi_1(x) \tag{B.9} \]
\[ \frac{d^2U}{dx^2} = -v\,\psi_1(x) - v\,\psi_1(x) - vx\,\psi_2(x) \tag{B.10} \]
\[ = -v\left(2\psi_1(x) + x\,\psi_2(x)\right) \tag{B.12} \]
A polygamma function \(\psi_n(z)\) can be written as follows [133]:

\[ \psi_n(z) = (-1)^{n+1}\,n!\sum_{k=0}^{\infty}\frac{1}{(z+k)^{n+1}} \tag{B.13} \]

Applying Eq. B.13 to Eq. B.12 and simplifying:

\[ \frac{d^2U}{dx^2} = -v\left(2\psi_1(x) + x\,\psi_2(x)\right) \tag{B.14} \]
\[ = -v\left[2\left(\sum_{k=0}^{\infty}\frac{1}{(x+k)^2}\right) + x\left(-2\sum_{k=0}^{\infty}\frac{1}{(x+k)^3}\right)\right] \tag{B.15} \]
\[ = -2v\sum_{k=0}^{\infty}\left(\frac{x+k}{(x+k)^3} - \frac{x}{(x+k)^3}\right) \tag{B.16} \]
\[ = -2v\sum_{k=0}^{\infty}\frac{k}{(x+k)^3} \tag{B.17} \]
Notice that for any valid value of \(x\), \(0 < x \le Z\), the summation is strictly positive.
Therefore, the second derivative of \(U_{seg}(Z, x)\) must be negative over that entire
range, so \(U_{seg}\) is strictly concave and has a unique maximum.
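A brute-force check corroborates the proof: evaluating U_seg(Z, x) at every integer x shows a single interior peak. The sample values of v, c, and Z below are illustrative.

```python
# Brute-force check that U_seg(Z, x) = (v - c) x + sum_{i=x}^{Z-1} v x / i
# has a single peak in x, as the negative second derivative implies.

def u_seg(Z, x, v=3.0, c=1.0):
    return (v - c) * x + sum(v * x / i for i in range(x, Z))

for Z in (10, 30, 100):
    utils = [u_seg(Z, x) for x in range(1, Z + 1)]
    diffs = [b - a for a, b in zip(utils, utils[1:])]
    # strict concavity: the first differences change sign exactly once
    sign_changes = sum(1 for a, b in zip(diffs, diffs[1:])
                       if (a > 0) != (b > 0))
    assert sign_changes == 1
    best = 1 + max(range(len(utils)), key=utils.__getitem__)
    print(Z, best)   # e.g. Z = 30 gives x = 21 cooperations
```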
Appendix C
Estimating Optimal Schedule for
Fixed Number of Transactions
From Equation 3.9 we know that, given a number of completed transactions n and
a cost/valuation ratio c/v, we can calculate how many additional transactions k are
needed so that the utilities from cooperating and from defecting on turn n + 1 are
equal. Consequently, a seller benefits more from defecting if she participates in fewer
than k additional transactions, and benefits more from cooperating if she participates
in more than k additional transactions. Therefore, for a given number of total
transactions Z, we can determine how many cooperations are optimal by writing
Z = n + 1 + k, substituting Equation 3.9 for k, and solving for n.
\[ Z = n + 1 + k \tag{C.1} \]
\[ Z = n + 1 + \left(n + \tfrac{1}{2}\right)\left(e^{c/v} - 1\right) \tag{C.2} \]
\[ Z = 1 + n\,e^{c/v} + \tfrac{1}{2}e^{c/v} - \tfrac{1}{2} \tag{C.3} \]
\[ n = \left(Z - \tfrac{1}{2}\right)e^{-c/v} - \tfrac{1}{2} \tag{C.4} \]

We now have the optimal number of cooperations in terms of \(Z\), the total number
of transactions: \(n_C(Z)\). Subtracting the value of \(n\) in Eq. C.4 from \(Z\) gives us
the optimal number of defections in terms of \(Z\), \(n_D(Z)\):

\[ n_D(Z) = Z - \left(\left(Z - \tfrac{1}{2}\right)e^{-c/v} - \tfrac{1}{2}\right) \tag{C.5} \]
\[ = Z - \left(Z - \tfrac{1}{2}\right)e^{-c/v} + \tfrac{1}{2} \tag{C.6} \]
\[ = \left(Z - \tfrac{1}{2}\right) + \tfrac{1}{2} - \left(Z - \tfrac{1}{2}\right)e^{-c/v} + \tfrac{1}{2} \tag{C.7} \]
\[ = \left(Z - \tfrac{1}{2}\right)\left(1 - e^{-c/v}\right) + 1 \tag{C.8} \]
The previous equations allow real-numbered values. Because we are interested
in integer values, we must apply the proper integer conversions. For a fixed number of
transactions \(Z\), the utility-optimal numbers of cooperations and defections,
respectively, are

\[ n_C(Z) = \left\lceil \left(Z - \tfrac{1}{2}\right)e^{-c/v} - \tfrac{1}{2} \right\rceil \tag{C.9} \]
\[ n_D(Z) = \left\lfloor \left(Z - \tfrac{1}{2}\right)\left(1 - e^{-c/v}\right) + 1 \right\rfloor \tag{C.10} \]
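Eqs. C.9 and C.10 can be sanity-checked numerically; in particular, the ceiling and floor are complementary, so n_C(Z) + n_D(Z) = Z for every Z. A short sketch, using the running example c = $1 and v = $3:

```python
import math

# The integer-valued schedule of Eqs. C.9 / C.10: for a total number of
# transactions Z, the optimal cooperations and defections partition Z.

def n_coop(Z, c=1.0, v=3.0):
    return math.ceil((Z - 0.5) * math.exp(-c / v) - 0.5)

def n_defect(Z, c=1.0, v=3.0):
    return math.floor((Z - 0.5) * (1 - math.exp(-c / v)) + 1)

for Z in (10, 30, 100, 1000):
    assert n_coop(Z) + n_defect(Z) == Z
print(n_coop(30), n_defect(30))   # 21 cooperations, 9 defections
```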
Appendix D
Mathematical Derivations of
Economic Model
Here we present the derivations of equations from Chapter 4 to help the reader un-
derstand the process.
D.1 Utility Over Time
\[ \int P\,dt = \int \left(\pi_{gt}k_vA - k_uA + (k_mC_B + k_pC - k_cC)\,T(t) - \kappa\right)dt \tag{D.1} \]

\[ U(t) = \pi_{gt}k_vAt - k_uAt + (k_mC_B + k_pC - k_cC)\int T(t)\,dt - \kappa t + Y \tag{D.2} \]

\[ = \left(\pi_{gt}k_vA - k_uA - \kappa\right)t + (k_mC_B + k_pC - k_cC)\,\frac{\ln\left((r_gC_G + r_bC_B + \delta)\,e^{r_gC_Gt} + Z\right)}{r_gC_G + r_bC_B + \delta} + Y \tag{D.3} \]

\[ = \left(\pi_{gt}k_vA - k_uA - \kappa\right)t + (k_mC_B + k_pC - k_cC)\,\frac{\ln\left((r_gC_G + r_bC_B + \delta)\left(e^{r_gC_Gt} - 1\right) + \frac{r_gC_G}{T(0)}\right)}{r_gC_G + r_bC_B + \delta} + Y \tag{D.4} \]

\[ \text{where } Y = U(0) - (k_mC_B + k_pC - k_cC)\,\frac{\ln\left(\frac{r_gC_G}{T(0)}\right)}{r_gC_G + r_bC_B + \delta} \tag{D.5} \]

\[ U(t) = \left(\pi_{gt}k_vA - k_uA - \kappa\right)t + (k_mC_B + k_pC - k_cC)\,\frac{\ln\left((r_gC_G + r_bC_B + \delta)\left(e^{r_gC_Gt} - 1\right)\frac{T(0)}{r_gC_G} + 1\right)}{r_gC_G + r_bC_B + \delta} + U(0) \tag{D.7} \]
D.2 Generalized Trust Over Time (σ(T, p∗) = 1)
\[ \int \Delta T\,dt = \int \left(r_gC_G\left(1 - T(t)\right) - (r_bC_B + \delta)\,T(t)\right)\sigma(T, p^*)\,dt \]

With \(\sigma(T, p^*) = 1\), this yields

\[ T(t) = \frac{r_gC_G}{r_gC_G + r_bC_B + \delta} + Z\,e^{-(r_gC_G + r_bC_B + \delta)t},
 \qquad \text{where } Z = T(0) - \frac{r_gC_G}{r_gC_G + r_bC_B + \delta} \tag{D.8} \]
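Eq. D.8 can be verified against a direct numerical integration of the trust differential equation. The parameter values below are arbitrary illustrative choices.

```python
import math

# Check Eq. D.8 against a forward-Euler integration of
# dT/dt = r_g C_G (1 - T) - (r_b C_B + delta) T   (sigma = 1).
# Parameter values are illustrative.

RG_CG, RB_CB, DELTA = 0.05, 0.01, 0.005
R = RG_CG + RB_CB + DELTA
T0 = 0.2

def closed_form(t):
    z = T0 - RG_CG / R
    return RG_CG / R + z * math.exp(-R * t)

# integrate numerically with a small step
dt, t, T = 0.001, 0.0, T0
while t < 100.0:
    T += dt * (RG_CG * (1.0 - T) - (RB_CB + DELTA) * T)
    t += dt
print(T, closed_form(100.0))   # both ≈ 0.768, near the steady state rg CG / R
```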
Bibliography
[1] Stanford Peers research group. http://www-db.stanford.edu/peers/.
[2] Martín Abadi, Mike Burrows, Mark Manasse, and Ted Wobber. Moderately
hard, memory-bound functions. In Proceedings of the 10th Annual Network
and Distributed System Security Symposium, 2003.
[3] Lada Adamic. Search in power law networks. Physical Review E, 64:046135,
2001.
[4] Lada Adamic. Personal communication, 2002.
[5] Eytan Adar and Bernardo A. Huberman. Free riding on gnutella. First Monday,
5(10), October 2000.
[6] Gagan Agarwal, Mayank Bawa, Prasanna Ganesan, Hector Garcia-Molina, Kr-
ishnaram Kenthapadi, Nina Mishra, Rajeev Motwani, Utkarsh Srivastava, Dilys
Thomas, Jennifer Widom, and Ying Xu. Vision Paper: Enabling Privacy for
the Paranoids. In VLDB, 2004.
[7] Gagan Agarwal, Mayank Bawa, Prasanna Ganesan, Hector Garcia-Molina, Kr-
ishnaram Kenthapadi, Rajeev Motwani, Utkarsh Srivastava, Dilys Thomas, and
Ying Xu. Two Can Keep a Secret: A Distributed Architecture for Secure Data-
base Services. In CIDR, 2005.
[8] Apple Computer, Inc. iTunes, 2004. http://www.apple.com/itunes/.
[9] Robert Axelrod. The Evolution of Cooperation. Basic Books, 1984.
[10] Stefano Basagni, Imrich Chlamtac, Violet R. Syrotiuk, and Barry A. Woodward.
A distance routing effect algorithm for mobility (dream). In MobiCom ’98:
Proceedings of the 4th annual ACM/IEEE international conference on Mobile
computing and networking, pages 76–84. ACM Press, 1998.
[11] Mayank Bawa, Brian F. Cooper, Arturo Crespo, Neil Daswani, Prasanna Gane-
san, Hector Garcia-Molina, Sepandar Kamvar, Sergio Marti, Mario Schlosser,
Qi Sun, Patrick Vinograd, and Beverly Yang. Peer-to-peer research at stanford.
SIGMOD Rec., 32(3):23–28, 2003.
[12] BBC News. Viruses turn to peer-to-peer nets. BBC News, January 20, 2004.
[13] Vaduvur Bharghavan, Alan Demers, Scott Shenker, and Lixia Zhang. Macaw:
a media access protocol for wireless lan’s. In SIGCOMM ’94: Proceedings of the
conference on Communications architectures, protocols and applications, pages
212–225. ACM Press, 1994.
[14] Alberto Blanc, Yi-Kai Liu, and Amin Vahdat. Designing Incentives for Peer-
to-Peer Routing. In Workshop on Economics of Peer-to-Peer Systems.
[15] Josh Broch, David A. Maltz, David B. Johnson, Yih-Chun Hu, and Jorjeta
Jetcheva. A performance comparison of multi-hop wireless ad hoc network
routing protocols. In MobiCom ’98: Proceedings of the 4th annual ACM/IEEE
international conference on Mobile computing and networking, pages 85–97.
ACM Press, 1998.
[16] Sonja Buchegger and Jean-Yves Le Boudec. Performance analysis of the confi-
dant protocol (cooperation of nodes - fairness in dynamic ad-hoc networks). In
Proceedings of MobiHoc 2002, Lausanne, June 2002.
[17] Sonja Buchegger, Cedric Tissieres, and Jean-Yves Le Boudec. A test-bed for
misbehavior detection in mobile ad-hoc networks - how much can watchdogs
really do? In Proceedings of IEEE WMCSA 2004, English Lake District, UK,
December 2004.
[18] Chiranjeeb Buragohain, Divyakant Agrawal, and Subhash Suri. A Game The-
oretic Framework for Incentives in P2P Systems. In IEEE 3rd International
Conference on Peer-to-Peer Computing (P2P 2003).
[19] Orkut Buyukkokten. Club Nexus, 2001.
[20] Robert Castaneda and Samir R. Das. Query localization techniques for on-
demand routing protocols in ad hoc networks. In MobiCom ’99: Proceedings of
the 5th annual ACM/IEEE international conference on Mobile computing and
networking, pages 186–194. ACM Press, 1999.
[21] Miguel Castro, Peter Druschel, Ayalvadi Ganesh, Antony Rowstron, and Dan S.
Wallach. Secure routing for structured peer-to-peer overlay networks. In Pro-
ceedings of the Fifth Symposium on Operating Systems Design and Implemen-
tation, 2002.
[22] Kay-Yut Chen, Tad Hogg, and Nathan Wozny. Experimental Study of Market
Reputation Mechanisms. In ACM Conference on Electronic Commerce (EC’04),
2004.
[23] Bram Cohen. Incentives Build Robustness in BitTorrent. In Workshop on
Economics of Peer-to-Peer Systems, 2003.
[24] Brian F. Cooper, Mayank Bawa, Neil Daswani, and Hector Garcia-Molina. Pro-
tecting the PIPE from Malicious Peers. Technical report, Stanford University,
2002.
[25] Fabrizio Cornelli, Ernesto Damiani, and Sabrina De Capitani di Vimercati.
Choosing Reputable Servents in a P2P Network. In Proc. of the 11th International World
Wide Web Conference, 2002.
[26] Scott Corson and Vincent Park. Temporally-Ordered Routing Algorithm
(TORA) Version 1 Functional Specification. Mobile Ad-hoc Network (MANET)
Working Group, IETF, October 1999.
[27] Arturo Crespo and Hector Garcia-Molina. Routing Indices For Peer-to-Peer
Systems. Proceedings of the International Conference on Distributed Computing
Systems (ICDCS), July 2002.
[28] B. P. Crow, I. K. Widjaja, G. Jeong, and P. T. Sakai. IEEE-802.11 Wireless
Local Area Networks. IEEE Communications Magazine, 35(9):116–126, Sep-
tember 1997.
[29] Ernesto Damiani, Sabrina De Capitani di Vimercati, Stefano Paraboschi, Pierangela
Samarati, and Fabio Violante. A reputation-based approach for choosing reli-
able resources in peer-to-peer networks. In Proceedings of the 9th ACM confer-
ence on Computer and communications security, pages 207–216. ACM Press,
2002.
[30] Adam D’Angelo. BuddyZoo. http://www.buddyzoo.com.
[31] Samir Das, Charles E. Perkins, and Elizabeth M. Royer. Ad Hoc On Demand
Distance Vector (AODV) Routing (Internet-Draft). Mobile Ad-hoc Network
(MANET) Working Group, IETF, October 1999.
[32] Neil Daswani. Personal communication, 2004.
[33] Neil Daswani. Denial of Service Attacks and Commerce Infrastructure in Peer-
to-peer Networks. PhD thesis, Stanford University, 2004.
[34] Neil Daswani and Hector Garcia-Molina. Query-Flood DoS Attacks in Gnutella.
In ACM Conference on Computer and Communications Security, 2002.
[35] Neil Daswani and Hector Garcia-Molina. Pong-Cache Poisoning in GUESS. In
ACM Conference on Computer and Communications Security, 2004.
[36] Jay L. Devore. Probability and statistics for engineering and the sciences.
Brooks/Cole Publishing Co., 3rd edition, 1991.
[37] Roger Dingledine, Michael J. Freedman, David Hopwood, and David Molnar.
A reputation system to increase MIX-net reliability. Lecture Notes in Computer
Science, 2137:126+, 2001.
[38] John R. Douceur. The Sybil Attack. In Proc. of the International Workshop
on Peer-to-Peer Systems, 2002.
[39] IETF MANET Working Group Internet Drafts.
http://www.ietf.org/ids.by.wg/manet.html.
[40] Cynthia Dwork, Andrew Goldberg, and Moni Naor. On memory-bound func-
tions for fighting spam. In Advances in Cryptology – CRYPTO’03, 2003.
[41] Cynthia Dwork and Moni Naor. Pricing via processing. In Advances in Cryp-
tology – CRYPTO’92, 1992.
[42] eBay - The World’s Online Marketplace. http://www.ebay.com/.
[43] K. Fall and K. Varadhan. ns notes and documentation. The VINT Project,
UC Berkeley, LBL, USC/ISI, and Xerox PARC. Available from
http://www-mash.cs.berkeley.edu/ns/, July 1999.
[44] Michalis Faloutsos, Petros Faloutsos, and Christos Faloutsos. On power-law
relationships of the internet topology. In SIGCOMM, pages 251–262, 1999.
[45] Michal Feldman, Kevin Lai, Ion Stoica, and John Chuang. Robust Incentive
Techniques for Peer-to-Peer Networks. In ACM Conference on Electronic Com-
merce (EC’04), 2004.
[46] Michal Feldman, Christos Padimitriou, John Chuang, and Ion Stoica. Free-
Riding and Whitewashing in Peer-to-Peer Systems. In ACM SIGCOMM 2004,
Workshop of Practice and Theory of Incentives and Game Theory in Networked
Systems, 2004.
[47] Michael Freedman and Robert Morris. Tarzan: A Peer-to-Peer Anonymizing
Network Layer. In Proceedings of the 9th ACM Conference on Computer and
Communications Security, 2002.
[48] Eric Friedman and Paul Resnick. The social cost of cheap pseudonyms. Journal
of Economics and Management Strategy, 10(2):173–199, 1998.
[49] Friendster Inc. Friendster Beta, 2003. http://www.friendster.com.
[50] Drew Fudenberg and David K. Levine. Reputation and Equilibrium Selection
in Games with a Patient Player. Econometrica, (57), 1989.
[51] J. J. Garcia-Luna-Aceves and Marcelo Spohn. Source-tree routing in wireless
networks. In ICNP ’99: Proceedings of the Seventh Annual International Con-
ference on Network Protocols, page 273. IEEE Computer Society, 1999.
[52] J.J. Garcia-Luna-Aceves, Marcelo Spohn, and David Beyer. Source Tree
Adaptive Routing (STAR) Protocol (Internet-Draft). Mobile Ad-hoc Network
(MANET) Working Group, IETF, October 1999.
[53] Yolanda Gil and Varun Ratnakar. Trusting information sources one citizen at
a time. In Proceedings of the First International Semantic Web Conference
(ISWC), 2002.
[54] T.J. Giuli. Personal communication, 2005.
[55] TJ Giuli, Petros Maniatis, Mary Baker, David S. H. Rosenthal, and Mema
Roussopoulos. Attrition defenses for a peer-to-peer digital preservation system.
In Proceedings of the USENIX Technical Conference, 2005.
[56] Gnutella protocol specification v0.4.
http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf.
[57] H. Charles J. Godfray. Signalling of need by offspring to their parents. Nature,
(352):328–330, 1991.
[58] A. Grafen. Biological signals as handicaps. Journal of Theoretical Biology,
(144):517–546, 1990.
[59] R. Guha, Ravi Kumar, Prabhakar Raghavan, and Andrew Tomkins. Prop-
agation of trust and distrust. In Proceedings of the 13th World Wide Web
Conference (WWW2004), 2004.
[60] K. Gummadi, R. Gummadi, S. Gribble, S. Ratnasamy, S. Shenker, and I. Stoica.
The impact of DHT routing geometry on resilience and proximity. In Proc.
ACM SIGCOMM, 2003.
[61] Minaxi Gupta, Paul Judge, and Mostafa Ammar. A reputation system for
peer-to-peer networks. In ACM 13th International Workshop on Network and
Operating Systems Support for Digital Audio and Video, 2003.
[62] Julian Havil. Gamma: Exploring Euler’s Constant. Princeton University Press,
2003.
[63] Tad Hogg and Lada Adamic. Enhancing Reputation Mechanisms via Online
Social Networks. In ACM Conference on Electronic Commerce (EC’04), 2004.
[64] B. Horne, B. Pinkas, and T. Sander. Escrow Services and Incentives in Peer-to-
Peer Networks. In Proceedings of 3rd ACM Conference on Electronic Commerce,
2001.
[65] Bernardo A. Huberman and Fang Wu. The dynamics of reputations.
www.hpl.hp.com/shl/papers/reputations/, 2002.
[66] IDC. Internet commerce model, v9.3, January 2005.
[67] IFILM Corp. IFILM, 2004. http://www.ifilm.com.
[68] Per Johansson, Tony Larsson, Nicklas Hedman, Bartosz Mielczarek, and Mikael
Degermark. Scenario-based performance analysis of routing protocols for mobile
ad-hoc networks. In MobiCom ’99: Proceedings of the 5th annual ACM/IEEE
international conference on Mobile computing and networking, pages 195–206.
ACM Press, 1999.
[69] Dave Johnson. Personal Communication, February 2000.
[70] David B. Johnson, David A. Maltz, and Josh Broch. The Dynamic Source
Routing Protocol for Mobile Ad Hoc Networks (Internet-Draft). Mobile Ad-hoc
Network (MANET) Working Group, IETF, October 1999.
[71] J. Jubin and J. Tornow. The DARPA Packet Radio Network Protocols. Pro-
ceedings of the IEEE, 75(1):21–32, 1987.
[72] Radu Jurca and Boi Faltings. Towards incentive-compatible reputation man-
agement. In Proceedings of the AAMAS 2002 Workshop on Deception, Fraud
and Trust in Agent Societies.
[73] Sepandar D. Kamvar, Mario T. Schlosser, and Hector Garcia-Molina. The
EigenTrust Algorithm for Reputation Management in P2P Networks. In Pro-
ceedings of the Twelfth International World Wide Web Conference, 2003.
[74] KaZaA Home Page. http://www.kazaa.com/.
[75] C. Keser. Experimental games for the design of reputation management sys-
tems. IBM Systems Journal, 42(3):498–506, 2003.
[76] Donald E. Knuth. Seminumerical Algorithms, volume 2 of The Art of Computer
Programming. Addison-Wesley Publishing Co., 1969.
[77] Donald E. Knuth. Fundamental Algorithms, volume 1 of The Art of Computer
Programming. Addison-Wesley Publishing Co., 2nd edition, 1973.
[78] Young-Bae Ko and Nitin H. Vaidya. Location-aided routing (LAR) in mobile
ad hoc networks. In MobiCom ’98: Proceedings of the 4th annual ACM/IEEE
international conference on Mobile computing and networking, pages 66–75.
ACM Press, 1998.
[79] Young-Bae Ko and Nitin H. Vaidya. Geocasting in Mobile Ad Hoc Networks:
Location-Based Multicast Algorithms. In WMCSA’99, 1999.
[80] D. Kreps and R. Wilson. Reputation and Imperfect Information. Journal of
Economic Theory, 27:253–279, 1982.
[81] John Kubiatowicz, David Bindel, Yan Chen, Patrick Eaton, Dennis Geels,
Ramakrishna Gummadi, Sean Rhea, Hakim Weatherspoon, Westly Weimer,
Christopher Wells, and Ben Zhao. OceanStore: An Architecture for Global-
scale Persistent Storage. In Proceedings of ACM ASPLOS. ACM, November
2000.
[82] Kevin Lai. Personal communication, 2004.
[83] Kevin Lai, Michal Feldman, Ion Stoica, and John Chuang. Incentives for Coop-
eration in Peer-to-Peer Networks. In Workshop on Economics of Peer-to-Peer
Systems, 2003.
[84] Seungjoon Lee, Rob Sherwood, and Bobby Bhattacharjee. Cooperative Peer
Groups in NICE. In Proceedings of the IEEE INFOCOM, 2003.
[85] Qin Lv, Pei Cao, Edith Cohen, Kai Li, and Scott Shenker. Search and repli-
cation in unstructured peer-to-peer networks. In Proceedings of the 2002 ACM
SIGMETRICS international conference on Measurement and modeling of com-
puter systems.
[86] Mylene Mangalindan. Some Sellers Leave eBay Over New Fees. Wall Street
Journal, page B.1, January 31, 2005.
[87] Petros Maniatis, Mema Roussopoulos, TJ Giuli, David S. H. Rosenthal, Mary
Baker, and Yanto Muliadi. Preserving peer replicas by rate-limited sampled vot-
ing. In 19th ACM Symposium on Operating Systems Principles (SOSP 2003),
2003.
[88] R. Marimon, J. Nicolini, and P. Teles. Competition and reputation. In Pro-
ceedings of the World Conference Econometric Society, 2000.
[89] Sergio Marti, Prasanna Ganesan, and Hector Garcia-Molina. SPROUT: P2P
Routing with Social Networks. In International Workshop on Peer-to-Peer
Computing & DataBases (P2P&DB 2004), 2004.
[90] Sergio Marti, Prasanna Ganesan, and Hector Garcia-Molina. SPROUT:
P2P Routing with Social Networks. Technical report, 2004.
dbpubs.stanford.edu/pub/2004-5.
[91] Sergio Marti and Hector Garcia-Molina. Identity Crisis: Anonymity vs. Repu-
tation in P2P Systems. In IEEE 3rd International Conference on Peer-to-Peer
Computing (P2P 2003), 2003.
[92] Sergio Marti and Hector Garcia-Molina. Examining Metrics for Reputation
Systems (in progress). Technical report, 2003. dbpubs.stanford.edu/pub/2003-
39.
[93] Sergio Marti and Hector Garcia-Molina. Limited Reputation Sharing in P2P
Systems. In ACM Conference on Electronic Commerce (EC’04), 2004.
[94] Sergio Marti and Hector Garcia-Molina. Modeling Reputation and
Incentives in Online Trade (extended). Technical report, 2004.
dbpubs.stanford.edu/pub/2004-45.
[95] Sergio Marti and Hector Garcia-Molina. A Game Theoretic Approach to Rep-
utation (extended). Technical report, 2004. dbpubs.stanford.edu/pub/2004-49.
[96] Sergio Marti, Prasanna Ganesan, and Hector Garcia-Molina. DHT Routing
Using Social Links. In 3rd International Workshop on Peer-to-Peer Systems
(IPTPS’04), 2004.
[97] Sergio Marti, T.J. Giuli, Kevin Lai, and Mary Baker. Mitigating Routing
Misbehavior in Mobile Ad Hoc Networks. In MobiCom ’00: Proceedings of
the 6th annual ACM/IEEE international conference on Mobile computing and
networking, 2000.
[98] Les McClain. RIAA posting bad music files to deter illegal downloaders. The
Daily Texan, February 6, 2004.
[99] Pietro Michiardi and Refik Molva. Core: a collaborative reputation mecha-
nism to enforce node cooperation in mobile ad hoc networks. In Sixth IFIP
Conference on Security, Communications and Multimedia, 2002.
[100] P. Milgrom and J. Roberts. Limit Pricing and Entry Under Incomplete Infor-
mation: An Equilibrium Analysis. Econometrica, (50):443–60, 1982.
[101] Kieron O’Hara, Harith Alani, Yannis Kalfoglou, and Nigel Shadbolt. Trust
Strategies for the Semantic Web. In ISWC’04 Workshop on Trust, Security
and Reputation on the Semantic Web, 2004.
[102] John K. Ousterhout. Tcl and the Tk Toolkit. Addison Wesley, 1994.
[103] Larry Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The PageRank
citation ranking: Bringing order to the web. Technical report, Stanford Digital
Library Technologies Project, 1998.
[104] C. Palmer and J. Steffan. Generating network topologies that obey power laws.
In Proceedings of GLOBECOM 2000.
[105] Se Hyun Park, Aura Ganz, and Zvi Ganz. Security protocol for IEEE 802.11
wireless local area network. Mobile Networks and Applications, 3:237–246, 1998.
[106] Charles Perkins and Pravin Bhagwat. Highly dynamic destination-sequenced
distance-vector routing (DSDV) for mobile computers. In ACM SIGCOMM’94
Conference on Communications Architectures, Protocols and Applications,
pages 234–244, 1994.
[107] Ryan Porter and Yoav Shoham. Designing Efficient Online Trading Systems.
In ACM Conference on Electronic Commerce (EC’04), 2004.
[108] R. Prakash. Unidirectional links prove costly in wireless ad-hoc networks. In
Proceedings of DIMACS Workshop on Mobile Networks and Computers, 1999.
[109] The CMU Monarch Project. The CMU Monarch Project's wireless and mobility
extensions to ns. http://www.monarch.cs.cmu.edu/cmu-ns.html, October 1999.
[110] A. R. Puniyani, R. M. Lukose, and B. A. Huberman. Intentional Walks
on Scale Free Small Worlds. ArXiv Condensed Matter e-prints, July 2001.
http://aps.arxiv.org/abs/cond-mat/0107212.
[111] Eric Rasmusen. Games and Information. Basil Blackwell Ltd., 1989.
[112] Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and Scott
Shenker. A scalable content addressable network. Technical Report TR-00-
010, Berkeley, CA, 2000.
[113] Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and Scott
Shenker. A Scalable Content-Addressable Network. In Proceedings of the ACM
SIGCOMM Symposium on Communication, Architecture, and Protocols, pages
161–172, San Diego, CA, U.S.A., August 2001. ACM SIGCOMM.
[114] Vicky Reich and David S. H. Rosenthal. LOCKSS: A Permanent
Web Publishing and Access System. D-Lib Magazine, 7(6), June 2001.
http://www.dlib.org/dlib/june01/reich/06reich.html.
[115] Michael K. Reiter and Aviel D. Rubin. Crowds: Anonymity for web transac-
tions. In ACM Transactions on Information and System Security, 1998.
[116] Paul Resnick, Richard Zeckhauser, Eric Friedman, and Ko Kuwabara. Reputa-
tion systems. Communications of the ACM, pages 45–48, December 2000.
[117] Tim Roughgarden. Personal communication, 2004.
[118] Antony Rowstron and Peter Druschel. Pastry: Scalable, decentralized object
location, and routing for large-scale peer-to-peer systems. IFIP/ACM Interna-
tional Conference on Distributed Systems Platforms, pages 329–350, 2001.
[119] Stefan Saroiu, P. Krishna Gummadi, and Steven D. Gribble. A measurement
study of peer-to-peer file sharing systems. In Proceedings of Multimedia Com-
puting and Networking 2002 (MMCN ’02), San Jose, CA, USA, January 2002.
[120] Aameek Singh and Lin Liu. TrustMe: Anonymous Management of Trust Rela-
tionships in Decentralized P2P Systems. In IEEE 3rd International Conference
on Peer-to-Peer Computing (P2P 2003), 2003.
[121] Bradley Smith and J.J. Garcia-Luna-Aceves. Efficient Security Mechanisms for
the Border Gateway Routing Protocol. Computer Communications (Elsevier),
21(3):203–210, 1998.
[122] Bradley R. Smith, Shree Murthy, and J. J. Garcia-Luna-Aceves. Securing
distance-vector routing protocols. In Proceedings of Internet Society Sympo-
sium on Network and Distributed System Security, pages 85–92, February 1997.
[123] Herbert Gintis, Eric Alden Smith, and Samuel Bowles. Costly signaling and
cooperation. Journal of Theoretical Biology, (213):103–119, 2001.
[124] K. Sripanidkulchai. The popularity of gnutella queries and its implications on
scalability. Featured on O’Reilly’s www.openp2p.com website, February 2001.
[125] Frank Stajano and Ross Anderson. The resurrecting duckling: Security issues
for ad-hoc wireless networks. In Proceedings of the 7th International Workshop
on Security Protocols, pages 172–194, 1999.
[126] Douglas R. Stinson. Cryptography: Theory and Practice. CRC Press, 1995.
[127] Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans
Kaashoek, Frank Dabek, and Hari Balakrishnan. Chord: a scalable peer-to-peer
lookup protocol for internet applications. IEEE/ACM Trans. Netw., 11(1):17–
32, 2003.
[128] Paul Syverson, David Goldschlag, and Michael Reed. Anonymous Connections
and Onion Routing. In Proceedings of the IEEE Symposium on Security and
Privacy, 1997.
[129] C. K. Toh. Associativity-based routing for ad-hoc mobile networks. Wireless
Personal Communications Journal, Special Issue on Mobile Networking and
Computing Systems, 4(2):103–139, 1997.
[130] United States Department of Commerce. Quarterly Retail E-Commerce Sales
3rd Quarter 2004. United States Department of Commerce News, November 19,
2004.
[131] William Vickrey. Counter speculation, auctions, and competitive sealed tenders.
Journal of Finance, (16):8–37, 1961.
[132] Eric W. Weisstein. Harmonic number. From MathWorld–A Wolfram Web
Resource, 2004. http://mathworld.wolfram.com/HarmonicNumber.html.
[133] Eric W. Weisstein. Polygamma function. From MathWorld–A Wolfram Web
Resource, 2004. http://mathworld.wolfram.com/PolygammaFunction.html.
[134] Jay Wrolstad. Online Holiday Shopping Up 25 Percent. NewsFactor Network,
January 4, 2005.
[135] Yahoo! Finance. Quotes and info: eBay Inc.
http://finance.yahoo.com/q/ks?s=EBAY, January 31, 2005. Data provided by
Reuters.
[136] Beverly Yang. Personal communication, 2002.
[137] Beverly Yang, Tyson Condie, Sepandar Kamvar, and Hector Garcia-Molina.
Addressing the Non-Cooperation Problem in Competitive P2P Systems. In
Workshop on Economics of Peer-to-Peer Systems.
[138] Beverly Yang and Hector Garcia-Molina. Comparing hybrid peer-to-peer sys-
tems (extended). Technical report, 2000.
[139] Beverly Yang and Hector Garcia-Molina. Comparing hybrid peer-to-peer sys-
tems. In The VLDB Journal, pages 561–570, September 2001.
[140] Beverly Yang and Hector Garcia-Molina. PPay: Micropayments for Peer-to-
Peer Systems. In Proceedings of the 10th ACM Conference on Computer and
Communications Security (CCS), Washington D.C., 2003.
[141] B. Yu and M. P. Singh. A social mechanism of reputation management in
electronic communities. Cooperative Information Agents, pages 154–165, 2000.
[142] Amotz Zahavi. Mate selection: a selection for handicap. Journal of Theoretical
Biology, (53):205–214, 1975.
[143] Amotz Zahavi. The cost of honesty (further remarks on the handicap principle).
Journal of Theoretical Biology, (67):603–605, 1977.
[144] Lidong Zhou and Zygmunt J. Haas. Securing ad hoc networks. IEEE Network,
13(6):24–30, 1999.