p815-wei

8/3/2019 p815-wei

1/5

PeerMate: A Malicious Peer Detection Algorithm

for P2P Systems based on MSPCA

Xianglin WeiDepartment of Computer

Science & Engineering

PLA University of

Science & Technology

Nanjing, China

Email: wei [email protected]

Tarem AhmedDepartment of

Computer Science

International Islamic

University Malaysia

Kuala Lumpur, Malaysia

Email: [email protected]

Ming ChenDepartment of Computer

Science and Engineering

PLA University of

Science and Technology

Nanjing, China


Al-Sakib Khan PathanDepartment of

Computer Science

International Islamic

University Malaysia

Kuala Lumpur, Malaysia


1

AbstractMany reputation management schemes have beenproposed to assist peers in choosing the most trustworthy collab-

orators in a P2P environment where honest peers coexist withmalicious ones. While these schemes indeed generally providesome useful information regarding the reliability of peers, theystill suffer from various attacks such as slandering, collusion, etc.Consequently, being able to detect the malicious peers plays acritical role in the successful functioning of these mechanisms,and this is our focus in this paper. First, we divide the maliciouspeers into several categories. Second, we introduce PeerMate, amalicious peer detection algorithm based on Multiscale PrincipalComponent Analysis and Quality of Reconstruction, to detectmalicious peers in Reputation-based P2P systems. Finally, weexperimentally demonstrate that PeerMate is able to detectmalicious peers accurately and efficiently.

I. INTRODUCTION

In order to stimulate peers to contribute resources and to

assist peers in selecting the most trustworthy collaborators,

several reputation management schemes have been proposed

[1], [2]. These schemes try to evaluate the transactions per-

formed by peers and assign reputation values to them to reflect

their past behavioral features. These reputation values will

then be the basis for identifying trustworthy peers in reducing

the blindness of peer selection. Although these schemes have

been proved to be theoretically attractive, they still have a long

way to go before practical deployment, as they are still faced

with various attacks including self-promoting, whitewashing,

slandering, collusion [3] and Sybil attacks [4].As a burgeoning field, malicious peer detection has attracted

the attention of many researchers in recent years, as is dis-

cussed in Section II. These methods, however, mostly either

concentrate on malicious peers of some particular categories,

or are based on global assumptions. Contrarily, we focus in

this work on developing a general detection algorithm, which

we have named PeerMate. The main differences between Peer-

Mate and existing methods are as follows: first, PeerMate aims

at detecting malicious peers of multi-categories rather than

some particular categories; second, PeerMate is based only on

1Tarem Ahmed is also affiliated with the Department of Electrical andElectronic Engineering, BRAC University, Dhaka, Bangladesh.

reputation information, which can be collected in many current

Reputation-based P2P (RP2P) systems, rather than relying onglobal information. Furthermore, the performance of PeerMate

is independent of the underlying P2P workload model or the

number of the types of malicious peers, as long as they have

different behaviors with the honest peers.

Our primary contributions in this work are threefold: first,

we aim to detect malicious peers in RP2P systems from the

signal processing aspect; second, we develop PeerMate to de-

tect malicious peers based on Multiscale Principal Component

Analysis (MSPCA) [5]; third, the performance of PeerMate is

experimentally evaluated through simulations.

The rest of the paper is organized as follows. Section

II summarizes the related work. PeerMate is introduced in

Section III. Section IV presents our experimental results. Weconclude in Section V with a discussion of the future potential

of our work.

I I . RELATED WOR K

Mekouar et al. have proposed a Malicious Detector Al-

gorithm to detect liar peers that send wrong feedback to

subvert reputation system [6]. That is, after each transaction

between a pair of peers, both peers are required to generate

feedback to describe the transaction. If there is an obvious gap

between the two pieces of feedback, both are regarded being

suspicious. Ji et al. have proposed a group-based metric for

protecting P2P network against Sybil attacks and collusionsby dividing the whole network into trust groups based on

global structure information [7]. Global structure information,

however, is difficult to obtain. Lian et al. have recommended

various collusion detection approaches in [3], including pair-

wise detector and traffic concentration detector, using trace

analysis on data from the Maze file sharing application [8].

Recently, an upload entropy scheme has been developed by

Liu et al. to prevent collusions and further enhance robustness

of private trackers sites [1]. The threshold of this scheme,

however, needs to be set manually. Lee et al. have proposed a

simplified clique detection method to detect the colluders [9],

but their method is restricted to colluders who form a clique.

International Conference on Computing, Networking and Communications, Network Architecture & P2P Protocol Symposium

978-1-4673-0009-4/12/$26.00 2012 IEEE 815

8/3/2019 p815-wei

2/5

III. PEE RMATE

In this section we first describe the detection context, and

outline the various categories of malicious peers that we aim to

detect, before introducing our proposed PeerMate algorithm.

A. Detecting Scenario

The detecting context. Our detecting context is derived

from the current popular RP2P systems including TVTorrents

[10], EigenTrust [2] and Maze [8]. The exchange process is

divided into several time slots (rounds), and the process obeys

a typical P2P workload model, such as the classic workload

model described by Gummadi et al. in [11] or the BitTorrent

workload model [1]. We have found that the effectiveness of

PeerMate is independent of the underlying workload model,

and in this paper, we present results using the classic model

from literature [11]. Each peer initiates requests during eachround, which follows the request model in literature. Each

peer is assigned an initial reputation value. A peers reputation

value will increase by Ru when the peer uploads a valid object

and will decrease by Rd when the peer downloads a valid

object, with Ru Rd. Details of the workload model is

illustrated in Section IV.

Malicious peers. According to their different behavioral

features, the malicious peers can be divided into various

categories, and hence it is hard to summarize all the categories

comprehensively due to the complexity of the behaviors.

Here, we mainly focus on the following categories to help

us understand the effectiveness of PeerMate and to serve

as a benchmark for evaluating the forthcoming detectionalgorithms. MP1: free-riding peers that utilize P2Ps resources

without providing appropriate amount of resources, for exam-

ple BitTyrant and BitThief clients; MP2: peers that upload

inauthentic objects to persecute the community, for example

those peers controlled by the music industry that inject fake

files into KaZaA; MP3: peers that collaborate with each

other to promote their reputation values through uploading

object to each other preferentially, such as the colluders in

Maze, and then using these reputation values to download

their desired objects; MP4: peers that create Sybil peers for

uploading to promote their own reputation values; MP5: peers

that exploit resources of P2P for malicious purposes like worm

dispatching, denial-of-service, etc [12]. We recognize that thispartition is incomplete, and there exist other categories of

malicious peers with more complex behaviors. For example,

some peers may belong to multiple categories at the same time,

and their behavior is a combination of the characteristics of

multiple categories. As another example, a peer may pretend

to be nice until its reputation becomes high, and then begin

to exploit the system. In summary, all these malicious peers

behave differently from honest peers. For the sake of simplic-

ity, these malicious peers are called strategic malicious peers,

and we will discuss them further in Section IV. Other peers

besides the malicious ones will be referred to as honest peers.

Furthermore, we assume that majority of peers are honest.

B. Reputation Matrix

Let N be the total number of peers, and RTp be the

reputation value of peer p at the end of round T, with

1 p N. Consequently, the reputation value of all peers

form a reputation vector RVT = (RT1 , RT2 , , RTN) at theend of the Tth round. Besides, from the aspect of a single

peer p with reputation values over time Rtp, 1 t T, wecan form a reputation time-series RSp = (R

1

p, R2

p, , RTp ).

We can then form a reputation matrix RTN at the end of the

Tth round:

RTN =

R11

R12

. . . R1NR21

R22

. . . R2N...

.... . .

...

RT1

RT2

. . . RTN

(1)

Thus the ith column of RTN is the reputation time-series

of peer i, and the tth row of RTN is the reputation vectorfor all peers at the end of round t. Each entry Rtp is the

reputation value of peer p at the end of the tth round. In RP2P

systems, RTN may be iteratively calculated by each peer as

in EigenTrust, or be collected and calculated by a centralized

facility such as the tracker servers in private tracker sites or the

central server in Maze. Distributed Hash Table (DHT) -based

methods also provide a feasible way of collecting reputation

information in a distributed manner. Thus it is always possible

to calculate RTN. For the sake of simplicity of notation, we

will use R to represent RTN from now on. Note that due

to network congestion and churn in overlay network, a few

elements in R may be inaccurate or missing. We discuss the

effects of this in Subsection IV-C.

C. Fundamental Idea

As described in Subsection III-A, malicious peers span

various types. Since the reputation value of a peer reflects its

behavioral features, different behaviors will lead to different

reputation reputation values, which will subsequently lead

to different reputation time-series in R. The behavior of a

malicious peer, of whatever type, is however expected to be

different from that of an honest peer. Therefore, we will be

able to isolate malicious peers from honest ones, if we are

able to mine the different behavioral features embedded in

the different deterministic features of the malicious peersreputation time-series within R.

To extract the deterministic features of reputation time-

series in Rwhile also coping with inaccurate or missing data,

PeerMate begins by applying Multiscale Principal Component

Analysis (MSPCA) [5] to R to obtain the reconstructed

reputation matrix R. We have found that the variability in

R may be captured in a lower-dimensional space, thereby

justifying the application of MSPCA in this manner. We then

define a Quality of Reconstruction QR metric as is described

in Subsection III-E, and finally distinguish between malicious

and honest ones using a threshold . The setting of this

threshold is discussed in Subsection IV-D.

816

8/3/2019 p815-wei

3/5

5 :

=/ /

= + 5

=/ /

< * 5

=P P

< * 5

=< * 5

#

#

:DYHOHW

FRHIILFLHQW

7KUHVKROG

ZDYHOHW

FRHIILFLHQW

WKUHVKROG

/=

/

8/3/2019 p815-wei

4/5

TABLE ISIMULATION PARAMETERS.

Symbol Meaning Base value

N # of peers 200

O # of objects 4000

R per-user request rate 2 objects/roundO object arrival rate varies

PMratio of # of malicious peers

to # of total peersvaries

Phprobability of strategic

malicious peer acting honestvaries

F. Complexity

The time complexity of PeerMate is governed by that of

MSPCA. The time complexity is evaluated to be O(TN2).The storage cost of PeerMate is O(TN).

IV. EXPERIMENTAL RESULTS

In this section, we conduct simulations to evaluate theperformance of PeerMate. We first describe our simulation

context, then present the results with analyses of the sensitiv-

ities of the various algorithm parameters.

A. Simulation Context

In our simulation, the basic workload model follows the

typical workload model from literature [11]. The objects arrive

at constant rate O > 0, and their popularity follows theZipf probability distribution. On average, a client requests a

constant number of objects per round, from which it chooses

objects to comply with a Zipf distribution with parameter 1.0.We assume that all objects in the system are of equal size, and

malicious peers are selected randomly from all peers. Table Ipresents our parameters values.

B. Comparison Benchmark and Evaluation Metrics

We choose the two recently proposed, representative

schemes of Upload Entropy (UEntropy) and Interaction En-

tropy (IEntropy) [1] as our comparison benchmarks. UEntropy

and IEntropy are both aimed at providing incentive to peers

for sharing content in Private BT society, in which peers

with the lowest entropy are considered the least trustworthy

collaborators, i.e. they are the suspected malicious peers.

Therefore, in order to guarantee the fairness of comparison,

in UEntropy and IEntropy schemes, those peers that have the

lowest entropy will be treated as suspicious malicious peers.

Let MPS be the malicious peers set, HPS be the set of

honest peers, and SMPS be suspected malicious peers set. We

define the True Positive Ratio (TPR) and the False Negative

Ratio (FNR) as TPR = |SMPS MPS|/|MPS| and(FNR) = |SMPS HPS|/|SMPS| respectively, where ||represents the rank of a set and denotes the intersection oftwo sets. TPR and FNR both lie in [0, 1].

C. Without Strategic Malicious Peers

We first compare detection performance of PeerMate with

the chosen benchmark schemes of UEntropy and IEntropy

with O = 2, Ph = 0, PM = 0.2, = 0.9 and default

TABLE IIDETECTION PERFORMANCES.

Scheme TPR FNR

PeerMate 97.5% 7%

UEntropy 42.5% 57.5%

PeerMate 0% 100%

values for the other parameters as stated in Table I. We used

200 rounds. The results are presented in Table II. It is clearthat PeerMate exhibits the best performance, with the highest

TPR and lowest FNR.

After manual inspection, we found that the 2.5% maliciouspeers that could be distinguished by PeerMate were those

belonging to type MP4, the behaviors of which are similar to

honest peers. In order to detect these malicious peers, one can

investigate the behavior of Sybil peers, as they only download

content from malicious peers belonging to MP4. The 7% FNRmay occur because the objects of these honest peers may be

at a lower rank among all objects during the request process.

D. Threshold Setting

Threshold plays a critical role in PeerMate and determinesthe TPR and FNR to some extent. To investigate the impact of

, we ran PeerMate using a range of possible values for thethreshold. The other parameters remained as before: O =2, Ph = 0 and PM = 0.2. Figure 2 shows the variation inTPR and FNR with the selection of . We observe that theTPR of PeerMate increases from 55% to 97.5% and the FNRincreases from 4.5% to 11.5%, as goes from 0.84to0.93. Asthe objective of PeerMate is to detect as many malicious peers

as possible with as low FNR as possible, we choose = 0.9as the default in our experiments.

E. With Strategic Malicious Peers

We study two other more complex scenarios to evaluate the

performance of PeerMate.

In order to avoid being detected, many strategic malicious

peers will act as honest ones with certain probability Phduring each round. To study the effect of this on the detection

performance of PeerMate, we simulated with a range of values

for Ph. The other algorithm parameters remained constant attheir default values: O = 2, PM = 0.2 and = 0.9. Figure 3

presents the results. We observe that the TPR decreases from97.5% to 65% as Ph goes from 0 to 0.8. TPR exceeds 75%as long as Ph is below the significantly high value of 0.4.The FNR remains within a generally low band, with values

fluctuating between 3% and 11.5%. This experiment indicatesthat the performance of PeerMate is acceptable as long as Phremains below the significantly high value of 0.4.

F. Mixture Model

We have also evaluated PeerMate with strategic malicious

peers which belong to multiple categories at the same time,

and whose behaviors are a combination of the behaviors of

multiple categories. Generally speaking, even if a malicious

818

8/3/2019 p815-wei

5/5

0.84 0.86 0.88 0.9 0.92 0.940.5

0.6

0.7

0.8

0.9

1

TPR

0.84 0.86 0.88 0.9 0.92 0.940.04

0.05

0.06

0.07

0.08

0.09

0.1

0.11

0.12

FNR

Fig. 2. Variation in detection performance of PeerMate with threshold .

peer exhibits a mixture of different types of malicious behav-

iors, its specific behavior will still be different from that of an

honest peer.

We have injected mixed malicious peers which act as MP1,

,MP5

with certain probabilities. We found that TPR was95% and FNR was 9% after 100 rounds. Thus PeerMate is

also able to detect malicious peers of mixture types.

G. Discussion

We see that PeerMate can detect malicious types of various

types with high success rates. Our experiments have also

shown that missing data does not affect the detection perfor-

mance of PeerMate, when up to 50% of the data is missing.

In addition, we have found that PeerMate is insensitive to the

value of PM, the ratio of the number of malicious peers to

total peers, when PM is below 60%. We do not show these

results here due to space constraints.

Each reputation management scheme needs to implementsome algorithms to detect malicious peers. This may be proved

through a game-theoretic model and will be presented in our

future papers.

PeerMate may be applied to Maze-like and EigenTrust-

like systems directly since they have common context. It can

also be applied to other Reputation-based P2P systems if

some schemes are implemented to collect and compute the

reputation values of all the peers.

V. CONCLUSION AND FUTURE WOR K

In this paper, we have presented PeerMate, a novel ma-

licious peer detection algorithm for Reputation-based RP2P

systems, which distinguishes malicious peers from honest onesusing Multiscale Principal Component Analysis and Quality

of Reconstruction. Experimental results have indicated that

PeerMate achieves high detection accuracy and flexibility, has

low complexity, and is insensitive to parameter selection.

As a future task, we are planning to extend PeerMate to

incorporate adaptive, online anomaly detection algorithms that

have recently been suggested. Examples include the KOAD

[15], KEAD [16] and OCNM [17] algorithms.

ACKNOWLEDGEMENTS

This work was supported in part by the National Natural ScienceFoundation of China Grants # 61070173 and 61103225, and theJiangsu Province Natural Science Foundation Grants # BK2010133and BK2009058.

0 0.2 0.4 0.6 0.80.65

0.7

0.75

0.8

0.85

0.9

0.95

1

Ph

TPR

0 0.2 0.4 0.6 0.80.03

0.04

0.05

0.06

0.07

0.08

0.09

0.1

0.11

0.12

Ph

FNR

Fig. 3. Variation in detection performance of PeerMate with the value ofPh, the probability of a strategic malicious peer acting honest.

REFERENCES

[1] Z. Liu, P. Dhungel, D. Wu, C. Zhang, and K. Ross, Understanding andimproving ratio incentives in private communities, in Proc. IEEE Int.Conf. on Distributed Computing Systems (ICDCS), Genoa, Italy, Aug.2010.

[2] S. Kamvar, M. Schlosser, and H. Garcia-Molina, The EigenTrustalgorithm for reputation management in P2P networks, in Proc. Int.World Wide Web Conf. (WWW), Budapest, Hungary, May 2003.

[3] Q. Lian, Z. Zhang, M. Yang, B. Zhao, Y. Dai, and X. Li, An empiricalstudy of collusion behavior in the maze P2P file-sharing system, inProc. IEEE Int. Conf. on Distributed Computing Systems (ICDCS),Toronto, ON, Canada, Jun. 2007.

[4] J. Douceur, The sybil attack, in Proc. Int. Workshop on Peer-to-PeerSystems, Cambridge, MA, USA, Mar. 2007.

[5] B. Bakshi, Multiscale PCA with application to multivariate statisticalprocess monitoring, AIChE Journal, vol. 44, no. 7, pp. 15961610, Jul.1998.

[6] L. Mekouar, Y. Iraqi, and R. Boutaba, Peer-to-Peers most wanted:Malicious peers, Computer Networks, vol. 50, no. 4, pp. 545562, Mar.2006.

[7] W. Ji, S. Yang, and B. Chen, A group-based trust metric for P2Pnetworks: Protection against sybil attack and collusion, in Proc. IEEE

Int. Conf. on Computer Science and Software Engineering, Wuhan,China, Dec. 2008.[8] Maze. Portal. [Online]. Available: http://maze.tianwang.com/[9] H. Lee, J. Kim, and K. Shin, Simplified clique detection for collusion-

resistant reputation management scheme in P2P networks, in Proc. Int.Symposium on Communications and Information Technologies (ISCIT),Tokyo, Japan, Oct. 2010.

[10] TV Torrents. Portal. [Online]. Available: http://www.tvtorrents.com/[11] K. Gummadi, R. Dunn, S. Saroiu, S. Gribble, H. Levy, and J. Zahorjan,

Measurement, modeling, and analysis of a peer-to-peer file-sharingworkload, ACM SIGOPS Operating Systems Review (OSR), vol. 37,no. 5, pp. 314329, Dec. 2003.

[12] K. Hoffman, D. Zage, and C. Nita-Rotaru, A survey of attack anddefense techniques for reputation systems, ACM Computing Surveys,vol. 42, no. 1, pp. 131, Dec. 2009.

[13] D. Donoho, I. Johnstone, G. Kerkyacharian, and D. Picard, Waveletshrinkage: Asymptopia? J. of the Royal Stat. Soc., vol. 57, no. 2, pp.

301369, 1995.[14] A. Lakhina, M. Crovella, and C. Diot, Diagnosing network-wide trafficanomalies, in Proc. ACM SIGCOMM, Portland, OR, Aug. 2004.

[15] T. Ahmed, M. Coates, and A. Lakhina, Multivariate online anomalydetection using kernel recursive least squares, in Proc. IEEE Int. Conf.on Computer Communications (INFOCOM), Anchorage, AK, USA,May 2007.

[16] T. Ahmed, Online anomaly detection using KDE, in Proc. IEEEGlobal Communications Conf. (GLOBECOM), Honolulu, HI, USA, Nov.2009.

[17] T. Ahmed, X. Wei, S. Ahmed, and A.-S. Pathan, Intruder detection incamera networks using the one-class neighbor machine, in Proc. Amer-ican Telecommunications Systems Management Association (ATSMA)

Networking and Electronic Commerce Research Conf. (NAEC), Riva delGarda, Italy, Oct 2011.

819

Documents

p815-wei