Upload
sakibpathan
View
218
Download
0
Embed Size (px)
Citation preview
8/3/2019 p815-wei
1/5
PeerMate: A Malicious Peer Detection Algorithm
for P2P Systems based on MSPCA
Xianglin WeiDepartment of Computer
Science & Engineering
PLA University of
Science & Technology
Nanjing, China
Email: wei [email protected]
Tarem AhmedDepartment of
Computer Science
International Islamic
University Malaysia
Kuala Lumpur, Malaysia
Email: [email protected]
Ming ChenDepartment of Computer
Science and Engineering
PLA University of
Science and Technology
Nanjing, China
Email: [email protected]
Al-Sakib Khan PathanDepartment of
Computer Science
International Islamic
University Malaysia
Kuala Lumpur, Malaysia
Email: [email protected]
1
AbstractMany reputation management schemes have beenproposed to assist peers in choosing the most trustworthy collab-
orators in a P2P environment where honest peers coexist withmalicious ones. While these schemes indeed generally providesome useful information regarding the reliability of peers, theystill suffer from various attacks such as slandering, collusion, etc.Consequently, being able to detect the malicious peers plays acritical role in the successful functioning of these mechanisms,and this is our focus in this paper. First, we divide the maliciouspeers into several categories. Second, we introduce PeerMate, amalicious peer detection algorithm based on Multiscale PrincipalComponent Analysis and Quality of Reconstruction, to detectmalicious peers in Reputation-based P2P systems. Finally, weexperimentally demonstrate that PeerMate is able to detectmalicious peers accurately and efficiently.
I. INTRODUCTION
In order to stimulate peers to contribute resources and to
assist peers in selecting the most trustworthy collaborators,
several reputation management schemes have been proposed
[1], [2]. These schemes try to evaluate the transactions per-
formed by peers and assign reputation values to them to reflect
their past behavioral features. These reputation values will
then be the basis for identifying trustworthy peers in reducing
the blindness of peer selection. Although these schemes have
been proved to be theoretically attractive, they still have a long
way to go before practical deployment, as they are still faced
with various attacks including self-promoting, whitewashing,
slandering, collusion [3] and Sybil attacks [4].As a burgeoning field, malicious peer detection has attracted
the attention of many researchers in recent years, as is dis-
cussed in Section II. These methods, however, mostly either
concentrate on malicious peers of some particular categories,
or are based on global assumptions. Contrarily, we focus in
this work on developing a general detection algorithm, which
we have named PeerMate. The main differences between Peer-
Mate and existing methods are as follows: first, PeerMate aims
at detecting malicious peers of multi-categories rather than
some particular categories; second, PeerMate is based only on
1Tarem Ahmed is also affiliated with the Department of Electrical andElectronic Engineering, BRAC University, Dhaka, Bangladesh.
reputation information, which can be collected in many current
Reputation-based P2P (RP2P) systems, rather than relying onglobal information. Furthermore, the performance of PeerMate
is independent of the underlying P2P workload model or the
number of the types of malicious peers, as long as they have
different behaviors with the honest peers.
Our primary contributions in this work are threefold: first,
we aim to detect malicious peers in RP2P systems from the
signal processing aspect; second, we develop PeerMate to de-
tect malicious peers based on Multiscale Principal Component
Analysis (MSPCA) [5]; third, the performance of PeerMate is
experimentally evaluated through simulations.
The rest of the paper is organized as follows. Section
II summarizes the related work. PeerMate is introduced in
Section III. Section IV presents our experimental results. Weconclude in Section V with a discussion of the future potential
of our work.
I I . RELATED WOR K
Mekouar et al. have proposed a Malicious Detector Al-
gorithm to detect liar peers that send wrong feedback to
subvert reputation system [6]. That is, after each transaction
between a pair of peers, both peers are required to generate
feedback to describe the transaction. If there is an obvious gap
between the two pieces of feedback, both are regarded being
suspicious. Ji et al. have proposed a group-based metric for
protecting P2P network against Sybil attacks and collusionsby dividing the whole network into trust groups based on
global structure information [7]. Global structure information,
however, is difficult to obtain. Lian et al. have recommended
various collusion detection approaches in [3], including pair-
wise detector and traffic concentration detector, using trace
analysis on data from the Maze file sharing application [8].
Recently, an upload entropy scheme has been developed by
Liu et al. to prevent collusions and further enhance robustness
of private trackers sites [1]. The threshold of this scheme,
however, needs to be set manually. Lee et al. have proposed a
simplified clique detection method to detect the colluders [9],
but their method is restricted to colluders who form a clique.
International Conference on Computing, Networking and Communications, Network Architecture & P2P Protocol Symposium
978-1-4673-0009-4/12/$26.00 2012 IEEE 815
8/3/2019 p815-wei
2/5
III. PEE RMATE
In this section we first describe the detection context, and
outline the various categories of malicious peers that we aim to
detect, before introducing our proposed PeerMate algorithm.
A. Detecting Scenario
The detecting context. Our detecting context is derived
from the current popular RP2P systems including TVTorrents
[10], EigenTrust [2] and Maze [8]. The exchange process is
divided into several time slots (rounds), and the process obeys
a typical P2P workload model, such as the classic workload
model described by Gummadi et al. in [11] or the BitTorrent
workload model [1]. We have found that the effectiveness of
PeerMate is independent of the underlying workload model,
and in this paper, we present results using the classic model
from literature [11]. Each peer initiates requests during eachround, which follows the request model in literature. Each
peer is assigned an initial reputation value. A peers reputation
value will increase by Ru when the peer uploads a valid object
and will decrease by Rd when the peer downloads a valid
object, with Ru Rd. Details of the workload model is
illustrated in Section IV.
Malicious peers. According to their different behavioral
features, the malicious peers can be divided into various
categories, and hence it is hard to summarize all the categories
comprehensively due to the complexity of the behaviors.
Here, we mainly focus on the following categories to help
us understand the effectiveness of PeerMate and to serve
as a benchmark for evaluating the forthcoming detectionalgorithms. MP1: free-riding peers that utilize P2Ps resources
without providing appropriate amount of resources, for exam-
ple BitTyrant and BitThief clients; MP2: peers that upload
inauthentic objects to persecute the community, for example
those peers controlled by the music industry that inject fake
files into KaZaA; MP3: peers that collaborate with each
other to promote their reputation values through uploading
object to each other preferentially, such as the colluders in
Maze, and then using these reputation values to download
their desired objects; MP4: peers that create Sybil peers for
uploading to promote their own reputation values; MP5: peers
that exploit resources of P2P for malicious purposes like worm
dispatching, denial-of-service, etc [12]. We recognize that thispartition is incomplete, and there exist other categories of
malicious peers with more complex behaviors. For example,
some peers may belong to multiple categories at the same time,
and their behavior is a combination of the characteristics of
multiple categories. As another example, a peer may pretend
to be nice until its reputation becomes high, and then begin
to exploit the system. In summary, all these malicious peers
behave differently from honest peers. For the sake of simplic-
ity, these malicious peers are called strategic malicious peers,
and we will discuss them further in Section IV. Other peers
besides the malicious ones will be referred to as honest peers.
Furthermore, we assume that majority of peers are honest.
B. Reputation Matrix
Let N be the total number of peers, and RTp be the
reputation value of peer p at the end of round T, with
1 p N. Consequently, the reputation value of all peers
form a reputation vector RVT = (RT1 , RT2 , , RTN) at theend of the Tth round. Besides, from the aspect of a single
peer p with reputation values over time Rtp, 1 t T, wecan form a reputation time-series RSp = (R
1
p, R2
p, , RTp ).
We can then form a reputation matrix RTN at the end of the
Tth round:
RTN =
R11
R12
. . . R1NR21
R22
. . . R2N...
.... . .
...
RT1
RT2
. . . RTN
(1)
Thus the ith column of RTN is the reputation time-series
of peer i, and the tth row of RTN is the reputation vectorfor all peers at the end of round t. Each entry Rtp is the
reputation value of peer p at the end of the tth round. In RP2P
systems, RTN may be iteratively calculated by each peer as
in EigenTrust, or be collected and calculated by a centralized
facility such as the tracker servers in private tracker sites or the
central server in Maze. Distributed Hash Table (DHT) -based
methods also provide a feasible way of collecting reputation
information in a distributed manner. Thus it is always possible
to calculate RTN. For the sake of simplicity of notation, we
will use R to represent RTN from now on. Note that due
to network congestion and churn in overlay network, a few
elements in R may be inaccurate or missing. We discuss the
effects of this in Subsection IV-C.
C. Fundamental Idea
As described in Subsection III-A, malicious peers span
various types. Since the reputation value of a peer reflects its
behavioral features, different behaviors will lead to different
reputation reputation values, which will subsequently lead
to different reputation time-series in R. The behavior of a
malicious peer, of whatever type, is however expected to be
different from that of an honest peer. Therefore, we will be
able to isolate malicious peers from honest ones, if we are
able to mine the different behavioral features embedded in
the different deterministic features of the malicious peersreputation time-series within R.
To extract the deterministic features of reputation time-
series in Rwhile also coping with inaccurate or missing data,
PeerMate begins by applying Multiscale Principal Component
Analysis (MSPCA) [5] to R to obtain the reconstructed
reputation matrix R. We have found that the variability in
R may be captured in a lower-dimensional space, thereby
justifying the application of MSPCA in this manner. We then
define a Quality of Reconstruction QR metric as is described
in Subsection III-E, and finally distinguish between malicious
and honest ones using a threshold . The setting of this
threshold is discussed in Subsection IV-D.
816
8/3/2019 p815-wei
3/5
5 :
=/ /
= + 5
=/ /
< * 5
=P P
< * 5
=< * 5
#
#
:DYHOHW
FRHIILFLHQW
7KUHVKROG
ZDYHOHW
FRHIILFLHQW
WKUHVKROG
/=
/
8/3/2019 p815-wei
4/5
TABLE ISIMULATION PARAMETERS.
Symbol Meaning Base value
N # of peers 200
O # of objects 4000
R per-user request rate 2 objects/roundO object arrival rate varies
PMratio of # of malicious peers
to # of total peersvaries
Phprobability of strategic
malicious peer acting honestvaries
F. Complexity
The time complexity of PeerMate is governed by that of
MSPCA. The time complexity is evaluated to be O(TN2).The storage cost of PeerMate is O(TN).
IV. EXPERIMENTAL RESULTS
In this section, we conduct simulations to evaluate theperformance of PeerMate. We first describe our simulation
context, then present the results with analyses of the sensitiv-
ities of the various algorithm parameters.
A. Simulation Context
In our simulation, the basic workload model follows the
typical workload model from literature [11]. The objects arrive
at constant rate O > 0, and their popularity follows theZipf probability distribution. On average, a client requests a
constant number of objects per round, from which it chooses
objects to comply with a Zipf distribution with parameter 1.0.We assume that all objects in the system are of equal size, and
malicious peers are selected randomly from all peers. Table Ipresents our parameters values.
B. Comparison Benchmark and Evaluation Metrics
We choose the two recently proposed, representative
schemes of Upload Entropy (UEntropy) and Interaction En-
tropy (IEntropy) [1] as our comparison benchmarks. UEntropy
and IEntropy are both aimed at providing incentive to peers
for sharing content in Private BT society, in which peers
with the lowest entropy are considered the least trustworthy
collaborators, i.e. they are the suspected malicious peers.
Therefore, in order to guarantee the fairness of comparison,
in UEntropy and IEntropy schemes, those peers that have the
lowest entropy will be treated as suspicious malicious peers.
Let MPS be the malicious peers set, HPS be the set of
honest peers, and SMPS be suspected malicious peers set. We
define the True Positive Ratio (TPR) and the False Negative
Ratio (FNR) as TPR = |SMPS MPS|/|MPS| and(FNR) = |SMPS HPS|/|SMPS| respectively, where ||represents the rank of a set and denotes the intersection oftwo sets. TPR and FNR both lie in [0, 1].
C. Without Strategic Malicious Peers
We first compare detection performance of PeerMate with
the chosen benchmark schemes of UEntropy and IEntropy
with O = 2, Ph = 0, PM = 0.2, = 0.9 and default
TABLE IIDETECTION PERFORMANCES.
Scheme TPR FNR
PeerMate 97.5% 7%
UEntropy 42.5% 57.5%
PeerMate 0% 100%
values for the other parameters as stated in Table I. We used
200 rounds. The results are presented in Table II. It is clearthat PeerMate exhibits the best performance, with the highest
TPR and lowest FNR.
After manual inspection, we found that the 2.5% maliciouspeers that could be distinguished by PeerMate were those
belonging to type MP4, the behaviors of which are similar to
honest peers. In order to detect these malicious peers, one can
investigate the behavior of Sybil peers, as they only download
content from malicious peers belonging to MP4. The 7% FNRmay occur because the objects of these honest peers may be
at a lower rank among all objects during the request process.
D. Threshold Setting
Threshold plays a critical role in PeerMate and determinesthe TPR and FNR to some extent. To investigate the impact of
, we ran PeerMate using a range of possible values for thethreshold. The other parameters remained as before: O =2, Ph = 0 and PM = 0.2. Figure 2 shows the variation inTPR and FNR with the selection of . We observe that theTPR of PeerMate increases from 55% to 97.5% and the FNRincreases from 4.5% to 11.5%, as goes from 0.84to0.93. Asthe objective of PeerMate is to detect as many malicious peers
as possible with as low FNR as possible, we choose = 0.9as the default in our experiments.
E. With Strategic Malicious Peers
We study two other more complex scenarios to evaluate the
performance of PeerMate.
In order to avoid being detected, many strategic malicious
peers will act as honest ones with certain probability Phduring each round. To study the effect of this on the detection
performance of PeerMate, we simulated with a range of values
for Ph. The other algorithm parameters remained constant attheir default values: O = 2, PM = 0.2 and = 0.9. Figure 3
presents the results. We observe that the TPR decreases from97.5% to 65% as Ph goes from 0 to 0.8. TPR exceeds 75%as long as Ph is below the significantly high value of 0.4.The FNR remains within a generally low band, with values
fluctuating between 3% and 11.5%. This experiment indicatesthat the performance of PeerMate is acceptable as long as Phremains below the significantly high value of 0.4.
F. Mixture Model
We have also evaluated PeerMate with strategic malicious
peers which belong to multiple categories at the same time,
and whose behaviors are a combination of the behaviors of
multiple categories. Generally speaking, even if a malicious
818
8/3/2019 p815-wei
5/5
0.84 0.86 0.88 0.9 0.92 0.940.5
0.6
0.7
0.8
0.9
1
TPR
0.84 0.86 0.88 0.9 0.92 0.940.04
0.05
0.06
0.07
0.08
0.09
0.1
0.11
0.12
FNR
Fig. 2. Variation in detection performance of PeerMate with threshold .
peer exhibits a mixture of different types of malicious behav-
iors, its specific behavior will still be different from that of an
honest peer.
We have injected mixed malicious peers which act as MP1,
,MP5
with certain probabilities. We found that TPR was95% and FNR was 9% after 100 rounds. Thus PeerMate is
also able to detect malicious peers of mixture types.
G. Discussion
We see that PeerMate can detect malicious types of various
types with high success rates. Our experiments have also
shown that missing data does not affect the detection perfor-
mance of PeerMate, when up to 50% of the data is missing.
In addition, we have found that PeerMate is insensitive to the
value of PM, the ratio of the number of malicious peers to
total peers, when PM is below 60%. We do not show these
results here due to space constraints.
Each reputation management scheme needs to implementsome algorithms to detect malicious peers. This may be proved
through a game-theoretic model and will be presented in our
future papers.
PeerMate may be applied to Maze-like and EigenTrust-
like systems directly since they have common context. It can
also be applied to other Reputation-based P2P systems if
some schemes are implemented to collect and compute the
reputation values of all the peers.
V. CONCLUSION AND FUTURE WOR K
In this paper, we have presented PeerMate, a novel ma-
licious peer detection algorithm for Reputation-based RP2P
systems, which distinguishes malicious peers from honest onesusing Multiscale Principal Component Analysis and Quality
of Reconstruction. Experimental results have indicated that
PeerMate achieves high detection accuracy and flexibility, has
low complexity, and is insensitive to parameter selection.
As a future task, we are planning to extend PeerMate to
incorporate adaptive, online anomaly detection algorithms that
have recently been suggested. Examples include the KOAD
[15], KEAD [16] and OCNM [17] algorithms.
ACKNOWLEDGEMENTS
This work was supported in part by the National Natural ScienceFoundation of China Grants # 61070173 and 61103225, and theJiangsu Province Natural Science Foundation Grants # BK2010133and BK2009058.
0 0.2 0.4 0.6 0.80.65
0.7
0.75
0.8
0.85
0.9
0.95
1
Ph
TPR
0 0.2 0.4 0.6 0.80.03
0.04
0.05
0.06
0.07
0.08
0.09
0.1
0.11
0.12
Ph
FNR
Fig. 3. Variation in detection performance of PeerMate with the value ofPh, the probability of a strategic malicious peer acting honest.
REFERENCES
[1] Z. Liu, P. Dhungel, D. Wu, C. Zhang, and K. Ross, Understanding andimproving ratio incentives in private communities, in Proc. IEEE Int.Conf. on Distributed Computing Systems (ICDCS), Genoa, Italy, Aug.2010.
[2] S. Kamvar, M. Schlosser, and H. Garcia-Molina, The EigenTrustalgorithm for reputation management in P2P networks, in Proc. Int.World Wide Web Conf. (WWW), Budapest, Hungary, May 2003.
[3] Q. Lian, Z. Zhang, M. Yang, B. Zhao, Y. Dai, and X. Li, An empiricalstudy of collusion behavior in the maze P2P file-sharing system, inProc. IEEE Int. Conf. on Distributed Computing Systems (ICDCS),Toronto, ON, Canada, Jun. 2007.
[4] J. Douceur, The sybil attack, in Proc. Int. Workshop on Peer-to-PeerSystems, Cambridge, MA, USA, Mar. 2007.
[5] B. Bakshi, Multiscale PCA with application to multivariate statisticalprocess monitoring, AIChE Journal, vol. 44, no. 7, pp. 15961610, Jul.1998.
[6] L. Mekouar, Y. Iraqi, and R. Boutaba, Peer-to-Peers most wanted:Malicious peers, Computer Networks, vol. 50, no. 4, pp. 545562, Mar.2006.
[7] W. Ji, S. Yang, and B. Chen, A group-based trust metric for P2Pnetworks: Protection against sybil attack and collusion, in Proc. IEEE
Int. Conf. on Computer Science and Software Engineering, Wuhan,China, Dec. 2008.[8] Maze. Portal. [Online]. Available: http://maze.tianwang.com/[9] H. Lee, J. Kim, and K. Shin, Simplified clique detection for collusion-
resistant reputation management scheme in P2P networks, in Proc. Int.Symposium on Communications and Information Technologies (ISCIT),Tokyo, Japan, Oct. 2010.
[10] TV Torrents. Portal. [Online]. Available: http://www.tvtorrents.com/[11] K. Gummadi, R. Dunn, S. Saroiu, S. Gribble, H. Levy, and J. Zahorjan,
Measurement, modeling, and analysis of a peer-to-peer file-sharingworkload, ACM SIGOPS Operating Systems Review (OSR), vol. 37,no. 5, pp. 314329, Dec. 2003.
[12] K. Hoffman, D. Zage, and C. Nita-Rotaru, A survey of attack anddefense techniques for reputation systems, ACM Computing Surveys,vol. 42, no. 1, pp. 131, Dec. 2009.
[13] D. Donoho, I. Johnstone, G. Kerkyacharian, and D. Picard, Waveletshrinkage: Asymptopia? J. of the Royal Stat. Soc., vol. 57, no. 2, pp.
301369, 1995.[14] A. Lakhina, M. Crovella, and C. Diot, Diagnosing network-wide trafficanomalies, in Proc. ACM SIGCOMM, Portland, OR, Aug. 2004.
[15] T. Ahmed, M. Coates, and A. Lakhina, Multivariate online anomalydetection using kernel recursive least squares, in Proc. IEEE Int. Conf.on Computer Communications (INFOCOM), Anchorage, AK, USA,May 2007.
[16] T. Ahmed, Online anomaly detection using KDE, in Proc. IEEEGlobal Communications Conf. (GLOBECOM), Honolulu, HI, USA, Nov.2009.
[17] T. Ahmed, X. Wei, S. Ahmed, and A.-S. Pathan, Intruder detection incamera networks using the one-class neighbor machine, in Proc. Amer-ican Telecommunications Systems Management Association (ATSMA)
Networking and Electronic Commerce Research Conf. (NAEC), Riva delGarda, Italy, Oct 2011.
819