
Social Closeness Based Clone Attack Detection for Mobile Healthcare System

Yanzhi Ren∗, Yingying Chen∗, Mooi Choo Chuah†
∗Dept. of ECE, Stevens Institute of Technology, Castle Point on Hudson, Hoboken, NJ 07030
†Dept. of CSE, Lehigh University, Bethlehem, PA 18015

{yren2, yingying.chen}@stevens.edu [email protected]

Abstract—The inclusion of embedded sensors in mobile phones, and the explosion of their usage in people’s daily lives, provide users with the ability to collectively sense the world. The sensing data collected from such a mobile phone enabled social network can be mined for users’ behaviors and their social communities, and can support a broad range of applications including mobile healthcare systems. However, such mobile healthcare systems built upon social networks are vulnerable to clone attacks, in which the adversary replicates legitimate nodes and distributes the clones throughout the network to undermine successful application deployment. Existing clone attack mitigation approaches either focus only on prevention techniques or work only in static or well-connected networks, and hence are not applicable to our targeted mobile healthcare systems. In this paper, we propose a social closeness based method for a mobile healthcare disease control system to detect clone attacks that may be launched to disrupt the normal operation of the system. Our social closeness based method exploits the social relationships among users for clone attack detection. Specifically, we define a new metric called community betweenness, which considers mobile users’ community information. We find that the value of this metric changes significantly under a clone attack, which makes it suitable for clone attack detection. We derive both analytical and training based approaches to determine the threshold setting of the community betweenness for robust clone attack detection. Extensive trace-driven simulation studies reveal that our social closeness based method can detect clone attacks with a high detection ratio and a low false positive rate.

I. INTRODUCTION

Mobile phones have become increasingly popular and play significant roles in our daily lives. In particular, with the rapid deployment of sensing technology in mobile devices, the collected sensing data can be mined for understanding human behaviors. For example, we can use mobile devices equipped with Bluetooth technology to discover the encounter events between people such that their social relationships can be derived and analyzed. Information about users’ social relationships can assist in the development of applications in various domains including healthcare applications. For instance, the discovered social relationships can be used to extract social communities [1], [2], which reflect either close relationships or similar behavioral patterns among people. Such extracted social community information provides new opportunities for epidemiology research and facilitates the control of disease spread in the healthcare domain [3], [4]. Furthermore, information about social communities extracted from human contact traces can also be used in other mobile healthcare systems to determine how socially active senior citizens are.

However, wireless devices may easily be captured by adversaries, and an unlimited number of clones of the compromised nodes can be deployed. Since the cloned devices have legitimate IDs and have access to security keys and other credentials, they can participate in the wireless network as legitimate nodes. The existence of the cloned devices may not affect the network performance; however, they aim to have a significant impact on pervasive mobile healthcare systems. For example, in the mobile healthcare systems for facilitating epidemiology research and disease propagation control [3], [4], which utilize the social community information, the adversary can attract more vaccine allocation to certain geographical areas by deploying many cloned devices, and thus undermine the regular operations of social community-based mobile healthcare systems. Therefore, detecting the presence of clone attacks is important to support the wide deployment of emerging mobile healthcare systems.

Nevertheless, detecting clone attacks is not an easy task. The cloned devices have access to all the security information of the compromised device and hence they can pass all security checks without being detected. Existing techniques for mitigating malicious attacks in sensor networks mostly focus on attack prevention through key distribution schemes [5]. These prevention-oriented schemes are ineffective [6] in preventing the launching of clone attacks, since cloned devices have access to the legitimate information and can thus fool other legitimate devices without being detected. Recently, new techniques utilizing neighbor information [7], [8] have been proposed for detecting clone attacks in wireless sensor networks. However, they can be applied only to static networks or to online social networks, making them less suitable for mobile healthcare systems.

In this work, we take the viewpoint of exploiting users’ social relationships extracted from human contact-based traces for detecting clone attacks in mobile healthcare systems. Our design goal is to enable clone attack detection without relying on other hardware such as GPS, which may drain a phone’s battery quickly and whose readings can also be easily hacked [9]. The basic idea is that a user often belongs to one or two static communities during a certain period of time, and the community membership often remains unchanged during that time period. However, the presence of cloned users causes the victim user to belong to multiple distinct social communities simultaneously, contradicting that user’s regular social behaviors. Thus, we propose a social closeness based approach that can detect clone attacks in a mobile healthcare system even with colluding cloned users. To the best of our knowledge, our work is the first that utilizes social closeness to detect the clone attack for the emerging mobile healthcare systems.

In particular, our attack detection scheme develops a new metric called community betweenness for distinguishing between legitimate and clone users. This metric, computed based on mobile users’ social community information, changes significantly during clone attacks. Three variants are developed to compute this community betweenness value, namely contact-frequency based, contact-duration based, and shortest-path based. To make the detection more robust, we present an analytical approach for determining the threshold setting of the community betweenness value for a homogeneous mobile social network such that our detection scheme can achieve a desirable detection rate with high confidence. We then describe a training-based method for determining such a threshold for heterogeneous mobile social networks where the sizes of the communities vary widely.

We show the effectiveness of our detection method using the disease control mobile healthcare system proposed in [3], [4]. We evaluated our detection scheme via simulations using the SWIM trace and the MIT Reality Mining trace [10], [11]. The results showed that our scheme is highly effective for detecting various clone attacks in this mobile healthcare system. Our technique can also be applied to other mobile healthcare systems that utilize human contact-based traces.

The rest of the paper is organized as follows. We first put our work in the context of current research in Section II. We then present the disease propagation control framework (which we focus on as a case study) in the mobile healthcare system and the attack model in Section III. We next present our social community based detection scheme in Section IV with the community betweenness metric. We describe our analytical and training-based approaches for robust detection in Section V. In Section VI, we validate the feasibility of our proposed detection scheme using the trace-driven approach. Finally, we conclude our work in Section VII.

II. RELATED WORK

To thwart clone attacks, using tamper-resistant hardware for mobile nodes appears effective. However, approaches with tamper-resistant hardware may involve high cost and are hard to apply to already deployed mobile nodes. Furthermore, it may still be possible for an attacker to bypass the tamper-resistant hardware and extract secret keys from captured nodes given enough time and computation ability, even though the tamper-resistant hardware makes this much harder to accomplish [12]. Thus, it is desirable to seek low cost detection methods that do not require any hardware changes to cope with clone attacks.

Along this direction, several software-based clone attack detection schemes have been proposed for sensor networks [7], [13]. The basic idea of these schemes is to let each node report its neighboring information and attempt to find conflicting reports. However, these existing methods can only be applied to static networks and cannot be used in networks with mobility. In [14], when a node announces its location, each of its neighbors sends a copy of the location claim to a set of randomly selected witness nodes. Any conflicting location claims provide the evidence for clone attack detection. However, the location information of nodes is needed constantly for this scheme to work, and such information may not always be available. In our situation, the nodes may be sparsely connected in the mobile social network, so it can be hard to find enough nodes to act as witnesses for clone attack detection.

Schemes in which neighboring nodes vote on the sanity of a given node based on their local observations have also been developed [15], [16]. However, these schemes are not capable of detecting clones if cloned nodes move only in close proximity to the victim node, and may also fail when multiple clone nodes in close proximity collude. Our work is different in that we aim to detect the presence of the clone attack by using the social community information extracted from widely used mobile phones, rather than adding additional overhead on mobile users. Our approach can detect clone attacks even when multiple users collude to trick the system. Our work is the first to address the clone attack problem in mobile healthcare systems which exploit social community information to make either control decisions (e.g., for effective vaccine distribution and reduction of the disease propagation rate) or diagnoses (e.g., whether seniors are depressed or socially active).

III. MODEL ASSUMPTIONS

In this section, we first describe the system model of a mobile healthcare system that is used in this work. We then present a high level overview of the dynamic community extraction method utilized in our mobile healthcare system and describe our threat model.

A. System Model

In this work, we employ the mobile healthcare system that provides effective vaccine distribution by utilizing the social community information extracted from mobile phone traces [4]. We refer to social communities as subsets of users within which user-to-user connections are dense, but between which connections are less dense [2]. The mobile healthcare system can be deployed for a geographical region, e.g., a town or a city. Instead of the traditional random vaccine distribution, targeting vaccination to a group of people with higher risk of infection can provide more effective control of the propagation of an infectious disease.

In our mobile healthcare system, each person has a unique user identifier (e.g., phone number) and owns a mobile device equipped with Bluetooth technology. Each user registers with the system by using his/her identifier and the mobile device’s Bluetooth ID. Encounter traces collected by Bluetooth technology are sent to the system server by each user, and the server can derive encounter events between users according to the registration information in the system. The dynamic community extraction scheme is applied and the extracted social community information is then used in the decision process for targeted vaccine distribution. When a new disease is discovered [17], those people within the same communities as a sick person have higher risks of being infected, and thus should receive disease alert or vaccination messages sent by the server.

B. Preliminaries of Dynamic Community Extraction

In general, people may belong to different social communities during different time periods. A social community may appear several times during a day or week. Instead of directly extracting communities from the contact graphs constructed from the contact events, the basic idea of the dynamic community extraction method we are using is to first extract the communities for each non-overlapping time period and then merge those communities with high similarity [4].

In particular, a contact graph G = (V, E) consists of a vertex set V and an edge set E. Each vertex denotes a person, while each edge weight denotes how frequently two persons meet during a time period. Each contact graph is a representation of people’s relationships, where the relationship strength is derived from the frequency of their encounters. The dynamic community extraction scheme constructs R contact graphs $G_i = \{(V_i, E_i),\ i = 1, \ldots, R\}$ from R non-overlapping periods. Then, it extracts multiple communities from each contact graph $G_i$ using the hierarchical clustering algorithm [1] and the modularity metric Q [18]. The community set extracted from each time period is denoted as $AS_i$. Assume that each $AS_i$ has $k_i$ communities as follows:

$$AS_i = \{A_i^k,\ k = 1, \ldots, k_i\}. \qquad (1)$$
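As a minimal sketch of how the R contact graphs could be assembled from a raw encounter trace, the following Python function buckets encounters into non-overlapping periods and counts how often each pair meets. The function name `build_contact_graphs`, the `(timestamp, u, v)` record layout, and the `period_length` parameter are illustrative assumptions and are not taken from the system in [4].

```python
from collections import defaultdict


def build_contact_graphs(encounters, period_length):
    """Group encounters into non-overlapping periods and count contacts per pair.

    encounters: iterable of (timestamp, u, v) records (assumed layout).
    period_length: length of one time period, in the same unit as timestamp.
    Returns {period_index: {(u, v): contact_frequency}} with u, v sorted.
    """
    graphs = defaultdict(lambda: defaultdict(int))
    for timestamp, u, v in encounters:
        if u == v:
            continue  # a device meeting itself carries no social information
        period = int(timestamp // period_length)
        edge = tuple(sorted((u, v)))  # undirected edge, order-independent key
        graphs[period][edge] += 1
    return {period: dict(edges) for period, edges in sorted(graphs.items())}


# Hypothetical trace with one-hour periods (3600 s).
trace = [(0, "a", "b"), (10, "b", "a"), (20, "a", "c"), (3700, "a", "b")]
graphs = build_contact_graphs(trace, 3600)
```

Each returned dictionary plays the role of one contact graph, with edge weights equal to the contact frequency in that period; community extraction would then be run on each graph separately.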

Furthermore, we compare each community in $AS_i$ with all the communities discovered in $AS_{i+1}$ to see if a community in $AS_i$ can be merged or removed based on one of the following conditions:

• It is part of a bigger community in $AS_{i+1}$ and hence can be removed.

• It can be merged with one community in $AS_{i+1}$ using the community merge operation for two communities $A_i^j$ and $A_{i+1}^l$ with an adjustable community threshold $\tau_c$:

$$\frac{|A_i^j \cap A_{i+1}^l|}{\mathrm{Max}(|A_i^j|, |A_{i+1}^l|)} > \tau_c \qquad (2)$$

• It is a superset of a community $A_{i+1}^j$ in $AS_{i+1}$; then $A_{i+1}^j$ is removed from set $AS_{i+1}$.

At the end of such operation, the two sets $AS_i$ and $AS_{i+1}$ are unioned to form a new $AS'_{i+1}$, which will be merged with $AS_{i+2}$ in the next round. The merging process iterates through R time windows to yield the final M social communities: $AS = \{A_j,\ j = 1, \ldots, M\}$.
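The merge step can be sketched in Python as follows. This is an illustrative reading of the three conditions and of the overlap ratio in (2), not the implementation of [4]: the function names, the order in which the conditions are tested, and the choice to merge with the first sufficiently similar community are all assumptions.

```python
from functools import reduce


def overlap_ratio(a, b):
    """Left-hand side of (2): |a ∩ b| / max(|a|, |b|)."""
    return len(a & b) / max(len(a), len(b))


def merge_community_sets(as_i, as_next, tau_c):
    """Combine the community sets of two consecutive periods (assumed order of checks)."""
    current = [frozenset(c) for c in as_next]
    for comm in map(frozenset, as_i):
        # Condition 1: comm is part of a bigger (or identical) community, drop it.
        if any(comm <= other for other in current):
            continue
        # Condition 3: comm is a superset of some communities, remove those.
        current = [other for other in current if not other < comm]
        # Condition 2: merge with the first community whose overlap exceeds tau_c.
        for idx, other in enumerate(current):
            if overlap_ratio(comm, other) > tau_c:
                current[idx] = comm | other
                break
        else:
            current.append(comm)  # no match: keep comm in the union
    return list(dict.fromkeys(current))  # remove duplicates, keep order


def extract_dynamic_communities(period_sets, tau_c):
    """Fold the merge step over the community sets of all R periods."""
    return reduce(lambda acc, nxt: merge_community_sets(acc, nxt, tau_c), period_sets)


# Hypothetical community sets for two periods.
as_1 = [{1, 2, 3}, {4, 5, 6, 7}, {10, 11}]
as_2 = [{1, 2, 3, 9}, {5, 6, 7, 8}, {10}]
merged = merge_community_sets(as_1, as_2, tau_c=0.6)
```

In this made-up input, {1, 2, 3} is absorbed by {1, 2, 3, 9}, {4, 5, 6, 7} merges with {5, 6, 7, 8} because their overlap ratio is 0.75, and {10} is replaced by its superset {10, 11}.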

C. Attack Model

The mobile healthcare systems are vulnerable to clone attacks [3], [4], [17]. An adversary can capture mobile devices equipped with Bluetooth capability within the network and deploy an unlimited number of clone devices with duplicated Bluetooth IDs. Since these replicas have legitimate IDs, the users carrying such devices can participate in the mobile social network and thus significantly affect the effectiveness of these mobile healthcare systems.

In particular, the adversary can distribute devices with cloned Bluetooth IDs to a group of users from the same geographical area who share similar interests with him. These users can collude with the adversary to launch a clone attack so as to gain some benefits. For example, when there is an outbreak of a new infectious disease with limited vaccine supply, some pharmacy companies or clinics may be interested in having more vaccine shots allocated to their geographical area, e.g., a county or town. They can distribute devices with cloned IDs to different colluding users. A registered user whose ID has been cloned may end up belonging to many social communities. If he is sick, users who are in the same social community as him will have higher chances of being selected to receive vaccinations. If an adversarial company replicates many clone IDs which belong to legitimate registered users, the number of vaccines allocated to this geographical area will increase significantly. This attack strategy is especially harmful when there is only a limited supply of expensive vaccine shots.

IV. CLONE ATTACK DETECTION

In this section, we first describe our new metric of community betweenness. We show how to calculate the community betweenness values by providing three methods. We then present our clone attack detection criteria based on the community betweenness metric.

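[Figure 1: contact graph around node k, with communities I to IV drawn as dotted circles and contact links labeled with cumulative contact frequencies. Legend: node under consideration, nodes in R(k), nodes not in R(k), contact link, community.]

Fig. 1. An example of computing community betweenness.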

A. Overview

Our clone attack detection method is based on the observation that users of mobile devices typically belong to a number of stable communities. By stable, we mean that the communities a user belongs to do not change rapidly after that user’s community profile has been constructed based on observations over a sufficiently long time period. However, when a clone attack is launched, the number of communities that a user with a cloned device ID belongs to may increase significantly within a short time period, and hence such an observation can be used for clone attack detection.

To capture this observation in a quantitative way, we propose the concept of community betweenness, inspired by the idea that the betweenness of a node in a network graph is defined as the number of shortest paths between pairs of other vertices which go through that node. Betweenness has been used in [2] as a measure of the centrality and influence of a node in social or biological networks. Thus, in order to find out which nodes in a social network are always “between” different social communities for clone attack detection, we generalize the betweenness concept to community betweenness. The community betweenness of a node is defined as the number of shortest paths between pairs of communities that go through that node in the contact graph. The basic idea is that if there are users who share the same cloned ID in a social network and these cloned nodes belong to different communities, then almost all shortest paths between these communities in the contact graph may go through this cloned ID. Thus, this particular cloned ID will have a higher community betweenness value compared to normal situations. A more formal definition of community betweenness is given in Sections IV-B and IV-C.

B. Definitions

We next define some basic concepts for the computation of the community betweenness: the neighbor set, the shortest path between a node and a node set, the distance between a node and a node set, and the node-centric set in the contact graph.

We consider a mobile network in which each node is assigned a unique identifier. We assume that these nodes represent mobile devices having sensing capabilities, e.g., Bluetooth discovery that allows encounter events between owners of devices to be recorded as a contact-based trace. As in Section III, we construct R contact graphs $G_i = \{(V_i, E_i),\ i = 1, \ldots, R\}$ for the R non-overlapping time periods. In each contact graph, each node represents a user (identified by a unique identifier), and an edge connecting two nodes represents encounter events between the two users. The weight of each edge represents the cumulative contact frequency of these encounters in a time period. Such a contact graph is a geometric representation of the relationships between users.

By utilizing the final dynamic communities $AS = \{A_j,\ j = 1, \ldots, M\}$ extracted in Section III, we define the following concepts for a node k in a contact graph $G_j$:

Definition 1. Neighbor set N(k): The collection of node k’s 1-hop neighbors in a contact graph is defined as its neighbor set N(k).

Definition 2. Shortest path between nodes: If node k and node d are connected within the contact graph, the shortest path $P_s(k, d)$ between them is defined as the path with the smallest hop length from node k to node d; otherwise, $P_s(k, d) = \infty$.

Definition 3. Shortest path between a node and a node set: Let $\|\cdot\|$ denote the number of nodes in a set and $\{d_1, \ldots, d_{\|D\|}\}$ be the nodes in a set D. The shortest path $P_s(k, D)$ from a node k to node set D is defined as:

$$P_s(k, D) = \arg\min_{P_s(k, d_l)} |P_s(k, d_l)|, \quad d_l \in \{d_1, \ldots, d_{\|D\|}\} \qquad (3)$$

Definition 4. Distance between a node and a node set: The distance between node k and node set D is defined as the hop length of the shortest path between k and D:

$$Dist(k, D) = |P_s(k, D)| \qquad (4)$$

Definition 5. Node-centric set: Suppose node k belongs to $M_k$ different social communities: $\{A_i \mid k \in A_i,\ i = 1, \ldots, M_k\}$. The node-centric set R(k) of node k is defined as:

$$R(k) = N(k) \cap \Big(\bigcup_{i=1}^{M_k} A_i\Big)$$

In other words, node k’s node-centric set includes its neighboring nodes which belong to the same social community as k.
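Definitions 1 to 5 translate fairly directly into code. The sketch below assumes an unweighted adjacency-set representation of one contact graph and uses breadth-first search for hop lengths; the function names and the small example graph are hypothetical and only serve to illustrate the definitions.

```python
from collections import deque


def hop_distances(adj, source):
    """Hop length of the shortest path from source to every reachable node (Def. 2)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def distance_to_set(adj, k, targets):
    """Dist(k, D) from Def. 4: smallest hop length from k to any node of D."""
    dist = hop_distances(adj, k)
    return min((dist[d] for d in targets if d in dist), default=float("inf"))


def node_centric_set(adj, communities, k):
    """R(k) from Def. 5: neighbors of k that share at least one community with k."""
    own = set().union(*(c for c in communities if k in c))
    return set(adj[k]) & own  # adj[k] is the neighbor set N(k) of Def. 1


# Hypothetical contact graph: node "z" is isolated.
adj = {"a": {"k", "b"}, "b": {"a", "k"}, "c": {"k"}, "k": {"a", "b", "c"}, "z": set()}
communities = [{"a", "k"}, {"k", "b", "c"}]
r_k = node_centric_set(adj, communities, "k")  # {"a", "b", "c"}
```

Unreachable targets yield an infinite distance, mirroring the convention $P_s(k, d) = \infty$ for disconnected nodes.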

C. Community betweenness

1) Definition of Community Betweenness: Community betweenness of node k is defined as the number of shortest paths going through k between each node in R(k) and any community in R(k) which this node does not belong to. To compute the community betweenness, we need to define the weight of the betweenness link between k and a node n in R(k). This link weight considers the number of shortest paths going through k between node n and other communities in R(k) that node n does not belong to. Then, the community betweenness of node k is the sum of all the betweenness link weights between k and each node in R(k). Typically, whether two persons meet or not determines how likely one person is to spread a disease to another. But sometimes, the larger the number of encounters between two persons, the higher the chance of one person spreading a disease to another. Thus, we define two variants of how the community betweenness value is computed: (a) Contact Frequency Based and (b) Contact Duration Based. In addition, we define another variant (c) Shortest Path Based, where the community betweenness value is defined as the number of shortest paths going through k between any node n in R(k) and nodes in other communities in R(k) which node n does not belong to.

[Figure 2: two panels plotting Node IDs (x-axis) against Community IDs (y-axis) for the neighbors of node 7. (a) Before clone attack, community betweenness: 9. (b) After clone attack, community betweenness: 52. Legend: node belongs to this community; nodes in the node-centric set belong to this community; the shortest path goes through the considered node; the shortest path does not go through the considered node.]

Fig. 2. The illustration of calculating community betweenness of node 7 in MIT Reality trace using Contact Frequency Based (CFB) method.

2) Variants of Betweenness Computation: The main differences among the three variants are: (a) the Contact Frequency Based betweenness utilizes the size of the intersection set of R(k) and each community in its link weight calculation; (b) the Contact Duration Based betweenness relies on both the size of the intersection set and the cumulative contact times between nodes in the link weight calculation; and (c) the Shortest Path Based betweenness considers the number of shortest paths between nodes from different communities in R(k) that go through node k for its link weight computation. Details of each variant are given below. Initially, the link weight between node k and each node in R(k) is set to 0.

Contact Frequency Based (CFB). For a node $n \in R(k)$, if there exists a community $A_i$ such that $n \notin A_i$ and $W_i = A_i \cap R(k) \neq \emptyset$, we compute the link weight $L_{nk}$ between node k and n as follows: if there exists a shortest path from node n to set $W_i$ that goes through node k ($k \in P_s(n, W_i)$), the link weight $L_{nk}$ between node n and k is increased by the number of nodes in $W_i$: $L_{nk} = L_{nk} + \|W_i\|$; otherwise it is not changed. This process is repeated for all of n’s qualifying communities $A_i$ to obtain the final value of the link weight $L_{nk}$. The community betweenness of node k is then the sum of all the link weights between k and each node in R(k):

$$bet(k) = \sum_{n \in R(k)} L_{nk} \qquad (5)$$

We next give an example to show how the CFB betweenness value is computed. A contact graph of node k is shown in Figure 1, with nodes {a, k}, {k, c, b}, {k, d} and {e, f} belonging to communities I-IV respectively; the communities are shown as dotted circles. The cumulative contact frequency in a time period between each pair of neighboring nodes is also shown in Figure 1. From the definitions above we see that k’s neighbors a, b and c are in the node-centric set R(k) because they belong to the same community as k. We consider each node in R(k) and compute its link weight with node k. Node a belongs to community I, and its shortest path to node set $W_2 = \{b, c\}$ (i.e., the intersection of R(k) and community II) is the path a-b, which does not go through node k. Thus, from the definition of link weight, the link weight between a and k is 0. Similarly, the link weight between b and k is also 0, because the shortest path between b and $W_1 = \{a\}$ (i.e., the intersection of R(k) and community I) also does not go through node k. However, if we consider node c, its shortest path to set $W_1 = \{a\}$ goes through node k. Thus, the link weight between c and k is increased by 1 because there is only one node, a, in set $W_1$. At this point we have considered every node in R(k), so the computation stops. The contact frequency based community betweenness of k is therefore 0 + 0 + 1 = 1.

Contact Duration Based (CDB). The difference between the CFB and CDB community betweenness lies in the link weight computation. If there exists a shortest path from node n to $W_i$ which goes through node k ($k \in P_s(n, W_i)$), the link weight $L_{nk}$ is increased by $\|W_i\| \times w_{nk}$, where $w_{nk}$ is the cumulative contact times between node n and k in this time period.

Thus, in Figure 1, the link weight between node c and k using the CDB method is 1 × 3 = 3. The link weights between a and k and between b and k are 0, as before; thus, the contact duration based betweenness of k is 3.

Shortest Path Based (SPB). This method considers the shortest paths between pairs of nodes from different communities in R(k), instead of the shortest paths between a node and node sets as in the former two variants. The computation of $L_{nk}$ for each qualifying set $W_i$ is carried out as follows: if there exist $m_i$ ($0 < m_i \leq \|W_i\|$) nodes in $W_i$ such that, for each of them (node g), the shortest path from node n to g goes through node k ($k \in P_s(n, g)$), then the link weight $L_{nk}$ is increased by $m_i$.

Using the same example in Figure 1, node a in community I has only one shortest path to $W_2$ (i.e., the intersection of R(k) and community II) which goes through node k, namely a-k-c. Thus, the link weight between k and a is 1. Similarly, the link weight between k and c is also 1. The link weight between k and b is 0 because we cannot find a shortest path which goes through k between b and a node in $W_1$ (i.e., the intersection of R(k) and community I). Thus, the SPB betweenness value for node k is 2.
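The three variants can be compared side by side with a short Python sketch. It is a hedged illustration, not the authors’ implementation: the function names are hypothetical, and the graph is a reduced version of the Figure 1 example that keeps only nodes a, b, c and k. The links a-k, a-b, b-k and c-k are inferred from the worked example in the text, nodes d, e and f are omitted because their links cannot be recovered from the text, and all edge weights except the value 3 on link c-k are placeholders.

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search giving hop lengths from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def community_betweenness(adj, weights, communities, k, variant="CFB"):
    """Sum of link weights L_nk over n in R(k) for variant CFB, CDB or SPB."""
    dist = {v: hop_distances(adj, v) for v in adj}
    inf = float("inf")

    def d(u, v):
        return dist[u].get(v, inf)

    own = set().union(*(c for c in communities if k in c))
    r_k = set(adj[k]) & own  # node-centric set R(k)
    total = 0
    for n in r_k:
        for comm in communities:
            w_i = comm & r_k
            if n in comm or not w_i:
                continue  # only communities that n does not belong to qualify
            if variant == "SPB":
                # count nodes g in W_i whose shortest path from n passes through k
                total += sum(1 for g in w_i if d(n, k) + d(k, g) == d(n, g))
                continue
            to_set = min(d(n, g) for g in w_i)
            via_k = d(n, k) + min(d(k, g) for g in w_i)
            if via_k == to_set:  # some shortest path from n to W_i goes through k
                gain = len(w_i)
                if variant == "CDB":
                    gain *= weights[frozenset((n, k))]
                total += gain
    return total


# Reduced Figure 1 example (assumed links; weights other than c-k are placeholders).
adj = {"a": {"k", "b"}, "b": {"a", "k"}, "c": {"k"}, "k": {"a", "b", "c"}}
weights = {frozenset(("c", "k")): 3, frozenset(("a", "k")): 1,
           frozenset(("b", "k")): 1, frozenset(("a", "b")): 1}
communities = [{"a", "k"}, {"k", "b", "c"}]
results = {v: community_betweenness(adj, weights, communities, "k", v)
           for v in ("CFB", "CDB", "SPB")}
```

On this reduced graph the sketch reproduces the values of the worked examples: 1 for CFB, 3 for CDB, and 2 for SPB.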

3) Feasibility Study of Betweenness: Figure 2 illustrates how the CFB community betweenness value changes when a clone attack involving a particular node (node 7) happens within the MIT Reality trace [11]. Due to space limitations, we only show the results of CFB; a similar trend is observed when using CDB and SPB. In Figure 2, we show the neighbors and community information of node 7 in an eight-hour time period by plotting Node IDs versus Community IDs.

In the example depicted in Figure 2, the community IDs marked as green squares represent the communities which the nodes in R(7) belong to. The red circles mark the community IDs that a particular node in R(7) belongs to. The blue dots and blue stars show whether or not there exists a shortest path which goes through node 7 between a node n in R(7) and the intersection set of R(7) and another community that node n does not belong to. For instance, in Figure 2 (a), node 37 belongs to communities 12 and 13, as shown by the red circles. The figure also shows that the shortest path between node 37 and the intersection set of R(7) and communities 4 and 5 goes through node 7, while the shortest path between node 37 and the intersection set of R(7) and community 10 does not go through node 7.

Thus, using the definition of the Contact Frequency Based (CFB) community betweenness, we can compute the betweenness link weight between each node n in R(7) and node 7 by adding the number of red circles in the rows which have blue dots in the corresponding column of node n. The total community betweenness can be computed by adding the link weights together. Figure 2 (a) shows that the CFB community betweenness of node 7 is 9 prior to a clone attack. After the clone attack is launched, the CFB community betweenness increases from 9 to 52, as Figure 2 (b) shows. The changes of CFB community betweenness in Figure 2 confirm the feasibility of detecting the clone attack by using community betweenness.

D. Detection Criteria

A straightforward method of detecting a clone attack is to determine whether there exists any node with a community betweenness value which exceeds a pre-defined threshold during a time period. However, an uncloned node may also have a high community betweenness value in a particular time period. In order to make the detection more robust, the presence of a cloned node is declared only when a node has community betweenness values that exceed a pre-defined betweenness threshold $\tau_b$ for multiple (i.e., S) time periods within an observation window which consists of L time periods. Our detection module searches through each consecutive observation window and declares the presence of the clone attack when the ratio S/L is larger than a (ratio) threshold $\tau_r$.
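The windowed criterion can be sketched as follows. The function name `detect_clone` and the choice to slide the observation window forward one time period at a time are assumptions; the text only states that consecutive observation windows of L periods are searched.

```python
def detect_clone(bet_values, tau_b, tau_r, window):
    """Return True if some observation window has S/L > tau_r.

    bet_values: community betweenness of one node ID, one value per time period.
    tau_b: betweenness threshold; tau_r: ratio threshold; window: L time periods.
    """
    for start in range(len(bet_values) - window + 1):
        periods = bet_values[start:start + window]
        s = sum(1 for bet in periods if bet > tau_b)  # S: periods above tau_b
        if s / window > tau_r:
            return True
    return False


# Made-up betweenness series; thresholds are illustrative only.
suspicious = [9, 52, 8, 60, 55, 7]
normal = [9, 52, 8, 7, 9, 10]
flags = (detect_clone(suspicious, tau_b=30, tau_r=0.5, window=4),
         detect_clone(normal, tau_b=30, tau_r=0.5, window=4))
```

A single spike, as in the second series, does not trigger detection; the first series is flagged because three of the four periods in its second window exceed the betweenness threshold.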

V. ROBUST DETECTION

The choice of the community betweenness threshold $\tau_b$ is critical for the performance of our clone attack detection scheme. However, the complexity and varying characteristics of mobile social networks make it difficult to provide a generic analytical framework for determining a suitable $\tau_b$. In this section, we show two possible approaches to determine $\tau_b$ such that our clone attack detection scheme yields robust detection results: an analytical approach for homogeneous social networks and a training based method for heterogeneous social networks.

A. Analytical Approach

1) Modeling of the Contact Process: In a homogeneous social network, the size of each community is the same and the number of nodes which two communities have in common is also fixed. We propose a new contact model that considers scenarios where nodes belong to multiple communities, inspired by the small world graph model [19] and the Watts and Strogatz model [20]. We note that the small world graph model [19] only considered users belonging to a single community, and the Watts and Strogatz model did not take the community concept into consideration. In our contact model, N nodes are numbered sequentially and arranged in a ring. Every K consecutive nodes within the ring are in the same social community. Since one user may belong to multiple communities, we let each pair of neighboring communities in the ring have P common nodes. To generate the sequence of encounter events, each node selects an encountered node every Q seconds: with probability q, it selects a peer uniformly at random from its community members, whereas with probability 1 − q it selects a peer uniformly at random from its non-community members.

An example of the contact model is illustrated in Figure 3 (a): 76 nodes (N = 76) are arranged in a ring and they have been divided into 19 communities. Each community has 7 nodes (K = 7) and each pair of neighboring communities in the ring has 3 nodes (P = 3) in common. Thus, nodes {1, ..., 7}, {5, ..., 11}, ..., {73, 74, 75, 76, 1, 2, 3} belong to communities (1) to (19), respectively. The nodes marked with the dark color in Figure 3 only belong to one social community and the nodes marked with the light color belong to two neighboring communities.

[Figure 3: (a) Social network model: 76 nodes arranged in a ring and grouped into communities (1) to (19). (b) Simulation results: Detection Ratio (0 to 1) versus Number of cloned nodes (1 to 4) for CFB, CDB and SPB.]

Fig. 3. An example of the social network model and simulation results.
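The ring model and its encounter process can be sketched in Python as below. The function names, the event tuple layout `(time, node, peer)`, and the use of a seeded `random.Random` instance are illustrative assumptions; the parameter values in the example follow the N = 76, K = 7, P = 3 configuration above.

```python
import random


def ring_communities(n_nodes, size, overlap):
    """Communities of `size` consecutive ring nodes sharing `overlap` nodes with each neighbor."""
    step = size - overlap
    if step <= 0 or n_nodes % step != 0:
        raise ValueError("n_nodes must be a positive multiple of size - overlap")
    return [
        {(start + offset) % n_nodes + 1 for offset in range(size)}
        for start in range(0, n_nodes, step)
    ]


def generate_encounters(communities, n_nodes, q, period, duration, rng):
    """Every `period` seconds each node picks one peer: inside its communities with probability q."""
    members = {v: set() for v in range(1, n_nodes + 1)}
    for comm in communities:
        for v in comm:
            members[v] |= comm
    everyone = set(members)
    events = []
    for t in range(0, duration, period):
        for v in range(1, n_nodes + 1):
            inside = sorted(members[v] - {v})
            outside = sorted(everyone - members[v])
            pool = inside if (rng.random() < q or not outside) else outside
            events.append((t, v, rng.choice(pool)))
    return events


comms = ring_communities(76, 7, 3)  # 19 communities, as in the example above
events = generate_encounters(comms, 76, q=0.9, period=600, duration=3600,
                             rng=random.Random(0))
```

With these parameters, node 4 belongs only to community (1), while nodes 1 to 3 are shared between communities (1) and (19), matching the single-community and two-community nodes described for Figure 3 (a).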

Based on our proposed contact model, we first derive the probability distribution of the community betweenness bet(k) under normal situations; the detailed description of our analysis may be found in the technical report [21]. Given the knowledge of this probability distribution, we can then determine the detection threshold τb by using a confidence level α (e.g., α = 95%):

$$P(\mathrm{bet}(k) \le \tau_b) = \sum_{x_b \le \tau_b} P(\mathrm{bet}(k) = x_b) = \alpha \qquad (6)$$

For a high confidence level α = 95%, it is unlikely that the community betweenness value will be larger than τb in a normal social network.
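As a minimal sketch of how Equation (6) could be applied to a discrete distribution, the following Python function returns the smallest value whose cumulative probability reaches α. The function name `threshold_from_pmf` and the example distribution are hypothetical; the actual distribution of bet(k) is derived in the technical report [21] and is not reproduced here. Because a discrete distribution rarely hits α exactly, the sketch assumes the threshold is the smallest value with cumulative probability at least α.

```python
def threshold_from_pmf(pmf, alpha=0.95):
    """Smallest tau with P(bet <= tau) >= alpha for a discrete pmf {value: prob}."""
    cumulative = 0.0
    for value in sorted(pmf):
        cumulative += pmf[value]
        # Small tolerance guards against floating-point rounding in the sum.
        if cumulative >= alpha - 1e-12:
            return value
    raise ValueError("probabilities sum to less than alpha")

# Hypothetical distribution of community betweenness values (illustration only).
example_pmf = {0: 0.50, 1: 0.30, 2: 0.15, 3: 0.05}
tau_b = threshold_from_pmf(example_pmf, alpha=0.95)
print(tau_b)
```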

2) Evaluation: To evaluate the effectiveness of our detection threshold setting under homogeneous social networks, we generated human contact traces with parameter settings similar to the well-known SWIM trace [10], which contains 3 days of 76 persons' mobility patterns in conference and university campus environments. We let each node select a peer from its community members with probability 0.9 (q = 0.9) every 600 seconds (Q = 600). We launch clone attacks by randomly selecting a certain number of nodes in the trace and changing their node identifiers to that of the victim node. We study the detection ratio, which is the percentage of clone nodes that are detected by the detection scheme, and the corresponding false positive ratio, which is the percentage of normal nodes that are mistakenly detected as clone nodes. Figure 3 (b) depicts the detection ratio when using CFB, CDB, and SPB for community betweenness computation under different numbers of clone nodes. We observe that our detection scheme is effective under homogeneous social networks: it achieves a detection ratio over 80%, and the corresponding false positive ratios are zero for all the scenarios.
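The two evaluation metrics defined above can be computed as in the following Python sketch. The function name `detection_metrics` and the node labels are hypothetical; the sketch assumes ground-truth labels for clone and normal nodes are available from the simulation setup.

```python
def detection_metrics(clone_nodes, normal_nodes, flagged):
    """Return (detection ratio, false positive ratio) as fractions in [0, 1]."""
    clone_nodes, normal_nodes, flagged = set(clone_nodes), set(normal_nodes), set(flagged)
    # Fraction of clone nodes that the scheme reported.
    detection_ratio = len(clone_nodes & flagged) / len(clone_nodes)
    # Fraction of normal nodes that the scheme mistakenly reported.
    false_positive_ratio = len(normal_nodes & flagged) / len(normal_nodes)
    return detection_ratio, false_positive_ratio

# Illustration with made-up labels: 4 clones, 6 normal nodes, 3 clones flagged.
ratios = detection_metrics(
    clone_nodes=["c1", "c2", "c3", "c4"],
    normal_nodes=["n1", "n2", "n3", "n4", "n5", "n6"],
    flagged=["c1", "c2", "c3"],
)
print(ratios)
```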

B. Training Based Approach

The analysis using homogeneous social networks shows the feasibility of our approach. When community sizes and the number of overlapping nodes between different communities vary significantly, we resort to a training-based approach for choosing an appropriate threshold so that we can achieve robust detection. The basic idea is that we analyze historical contact traces to determine typical community betweenness values and then choose appropriate detection thresholds for nodes with similar encounter rates.

In particular, we first classify the nodes in a mobile social network into different sets based on their encounter frequencies. We analyze the encounter frequencies of all the nodes in a mobile social network and divide these nodes into 3 categories: (i) nodes that have the lowest 30% of encounter rates are classified as non-active nodes, (ii) nodes that have the highest 30% of encounter rates are classified as active nodes, and (iii) the remaining 40% of nodes are considered regular nodes.
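The three-way classification above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `classify_by_encounter_rate` is hypothetical, set sizes are rounded down when 30% of the node count is not an integer, and ties in encounter rate are broken by node identifier; the paper does not specify either detail.

```python
def classify_by_encounter_rate(rates, low_pct=30, high_pct=30):
    """Split nodes into non-active, regular and active sets by encounter rate.

    `rates` maps node id -> encounter rate (e.g., encounters per time period).
    """
    ordered = sorted(rates, key=lambda node: (rates[node], node))
    n = len(ordered)
    n_low = n * low_pct // 100    # lowest 30% -> non-active (rounded down)
    n_high = n * high_pct // 100  # highest 30% -> active (rounded down)
    return {
        "non_active": ordered[:n_low],
        "regular": ordered[n_low:n - n_high],
        "active": ordered[n - n_high:],
    }

# Made-up encounter rates for ten nodes (illustration only).
example_rates = {f"u{i}": float(i) for i in range(1, 11)}
node_sets = classify_by_encounter_rate(example_rates)
print(node_sets)
```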

Next, we divide the training data into different time periods and compute the average community betweenness values over these time periods for each node in the three node sets specified above. Suppose there are $N_a$, $N_r$ and $N_n$ nodes respectively in the active, regular and non-active sets, and $b_a^i$, $b_r^i$ and $b_n^i$ denote the average community betweenness value of each node in these three sets. Thus, the average community betweenness values over time periods for these three sets of nodes can be represented as $B_a = \{b_a^i, i = 1, ..., N_a\}$, $B_r = \{b_r^i, i = 1, ..., N_r\}$ and $B_n = \{b_n^i, i = 1, ..., N_n\}$, respectively. Then, we determine the averages and standard deviations of community betweenness values from $B_a$, $B_r$ and $B_n$, denoted as $AvgB_a$, $AvgB_r$, $AvgB_n$ and $\sigma_a$, $\sigma_r$, $\sigma_n$ respectively. In our threshold setting method, we empirically set the threshold $\tau_b$ as the average value plus two standard deviations ($AvgB_a + 2\sigma_a$, $AvgB_r + 2\sigma_r$, $AvgB_n + 2\sigma_n$) for nodes from the active, regular and non-active sets respectively.
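The mean-plus-two-standard-deviations rule can be sketched in Python as follows. The function name `training_thresholds` and the sample values are hypothetical, and the sketch assumes the population standard deviation (`statistics.pstdev`); the paper does not state whether the population or sample form is used.

```python
from statistics import mean, pstdev

def training_thresholds(avg_betweenness_by_set):
    """Map each node set to its threshold: average plus two standard deviations.

    `avg_betweenness_by_set` maps a set name to the list of per-node average
    community betweenness values observed in the training window.
    """
    return {
        name: mean(values) + 2 * pstdev(values)  # assumption: population std
        for name, values in avg_betweenness_by_set.items()
    }

# Made-up per-node averages (illustration only, not results from the paper).
thresholds = training_thresholds({
    "active": [2, 4, 4, 4, 5, 5, 7, 9],
    "regular": [1.0, 1.0, 1.0, 1.0],
})
print(thresholds)
```

A node is then compared against the threshold of the set it was classified into, so an active node must exceed a larger value than a regular node before it is considered suspicious.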

VI. PERFORMANCE EVALUATION

In this section, we evaluate the performance of our detection method using the training based approach when the community size and the number of overlapping nodes vary.

A. Simulation Methodology

We conducted simulations using two human contact-based traces: the SWIM [10] and MIT Reality [11] traces. The SWIM traces were generated by the SWIM mobility generator to mimic human mobility traces for 76 participants during a 3-day period in conference and university campus settings. The MIT traces, which lasted for 20 days, were collected from smart phones equipped with Bluetooth devices carried by 97 participants in a university environment. Each trace contains the IDs of the Bluetooth devices that are within transmission range of each other, and the starting and ending times of their encounters.

We divide the SWIM trace into 72 time periods with each time period being 1 hour. Similarly, we divide the MIT trace into 60 non-overlapping time periods with each time period being 8 hours. In both traces, we let each observation window contain L = 12 time periods, and set the ratio threshold τr = 0.5. We use the first observation window of the traces as training data to determine the community betweenness thresholds, and the remaining observation windows as the testing data to validate our detection approach.

[Figure 4: (a) Contact Frequency Based (CFB). (b) Contact Duration Based (CDB). Each panel plots the detection ratio versus the number of clone nodes (1 to 4) for four cases: Active set + Same, Active set + Different, Regular set + Same, Regular set + Different.]

Fig. 4. Detection ratio under different numbers of clone nodes when the victim and clone nodes are from the same node set, either active or regular, from the SWIM trace.
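The partitioning of a trace into observation windows can be sketched as below. Two points are assumptions rather than statements from this section: the windows are taken to be consecutive and non-overlapping, and the ratio threshold τr is interpreted as the fraction of time periods in a window during which a node's community betweenness must exceed τb for the node to be flagged. The function names `observation_windows` and `is_flagged` are hypothetical.

```python
def observation_windows(num_periods, window_len=12):
    """Split period indices 0..num_periods-1 into consecutive windows of L periods."""
    return [list(range(start, start + window_len))
            for start in range(0, num_periods - window_len + 1, window_len)]

def is_flagged(betweenness_per_period, tau_b, tau_r=0.5):
    """Assumed rule: flag a node if more than tau_r of the periods exceed tau_b."""
    exceed = sum(1 for b in betweenness_per_period if b > tau_b)
    return exceed / len(betweenness_per_period) > tau_r

swim_windows = observation_windows(72)  # 72 one-hour periods
mit_windows = observation_windows(60)   # 60 eight-hour periods
training, testing = swim_windows[0], swim_windows[1:]
print(len(swim_windows), len(mit_windows), len(testing))
```

Under these assumptions the SWIM trace yields one training window and five testing windows, and the MIT trace yields one training window and four testing windows.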

We conduct extensive simulations on these two sets of traces by varying the number of clone nodes from 1 to 4. Since nodes from the non-active set seldom come into contact with other nodes, we only consider nodes from either the active or regular sets. For each testing scenario, we randomly choose the nodes from either the active or regular sets and repeat the simulation 50 times to get a statistical view of the results. As in Section V.A, we use the detection ratio and the corresponding false positive ratio to evaluate the effectiveness of our detection scheme.

B. Results

The results of using the Shortest Path Based (SPB) betweenness in our clone attack detection scheme exhibit the same trend as we observed when using the Contact Frequency Based (CFB) betweenness. We thus only present the results obtained from the Contact Frequency Based (CFB) and Contact Duration Based (CDB) betweenness. Furthermore, the results from the SWIM traces achieve detection ratios and false positive ratios similar to those from the MIT Reality traces. Due to the space limitation, we only present the results using the SWIM traces in the following subsections. The detailed results may be found in our comprehensive technical report [21].

Same Node Set Study. Figure 4 presents the detection ratio versus the number of clone nodes when both the victim and clone nodes belong to the same node set (either active or regular), using the Contact Frequency Based (CFB) and Contact Duration Based (CDB) betweenness in our clone attack detection scheme. In Figure 4, “active set” and “regular set” denote that both the victim and clone nodes are from the active set or the regular set, respectively. “Same” and “Different” denote that the victim and clone nodes are chosen from the same or different communities.

We observed that the detection ratio increases to 100% as the number of cloned nodes increases. This is because more clone nodes increase the number of shortest paths going through the victim node, which increases its observed community betweenness value. The corresponding false positive ratio remains at zero. In addition, we found that the detection ratio is higher when the nodes are from different communities than when they are from the same community. Moreover, a higher detection ratio is also achieved when both the victim node and clone nodes are from the active set. This indicates that our proposed approach is more effective when the victim and clone nodes come from different communities or from the active set. This is in line with the design of our betweenness-based detection algorithm, because clone nodes from different communities or from the active set result in more shortest paths going through the victim node.

Different Node Set Study. Next, we let the clone nodes and the victim node come from different node sets, i.e., the victim node is chosen from the active set while the clone nodes are chosen from the regular set, and vice versa. Figure 5 depicts the results of using CFB and CDB betweenness respectively when the victim node and clone nodes are chosen from different node sets, i.e., active or regular. “Active victim node” denotes that the victim node comes from the active set and the clone nodes are from the regular set, and vice versa for “regular victim node”. “Same” and “Different” again denote that the victim and clone nodes are chosen from the same or different communities.

We found that the detection ratio often improves when the number of clone nodes increases or when the clone nodes are from different communities. Comparing the results from different combinations of victim and clone nodes in Figure 5, we further observed that the detection ratio is higher when the victim node is chosen from the regular node set and the clone nodes are chosen from the active node set. This is because active nodes typically have a larger number of neighboring nodes, and thus more shortest paths exist. Thus, with a lower detection threshold used for a regular node, the community betweenness value for such a node during a clone attack can easily exceed the threshold and hence be detected. Comparing the first two bars in Figure 5 with Figure 4 when the number of cloned nodes is 1 or 2, we found that the detection ratio when the victim node and clone nodes are all from the active set is higher than the detection ratio when the victim node is from the active set while the clone nodes are from the regular set. The victim node from the active set uses a higher detection threshold, and thus if the cloned nodes are from the regular set, the community betweenness value of the victim node may not exceed the threshold for the attack to be detected. In addition, when we compare the last two bars in Figure 5 with Figure 4 when the number of cloned nodes is 1 or 2, we observe that the detection ratio is higher when the victim node is chosen from the regular node set and the clone nodes are chosen from the active node set. This is also in line with our expectations: the cloned nodes from the active set, with more neighboring nodes, cause the community betweenness value of the victim node to easily exceed the threshold.

[Figure 5: (a) Contact Frequency Based (CFB). (b) Contact Duration Based (CDB). Each panel plots the detection ratio versus the number of clone nodes (1 to 4) for four cases: Active victim node + Same, Active victim node + Different, Regular victim node + Same, Regular victim node + Different.]

Fig. 5. Detection ratio under different numbers of clone nodes when the victim and clone nodes are from different node sets (active or regular) from the SWIM trace.

Finally, we found that the false positive ratio is zero for all clone attack scenarios, which is encouraging, as these results suggest that our proposed approach is feasible and effective in detecting the presence of clone nodes in a mobile social network.

VII. CONCLUSION

In this paper, we introduced clone attacks in a mobile healthcare system and proposed a social community based detection method that exploits social relationships to detect the presence of clone attacks. The concept of community betweenness is introduced by considering both the community and neighboring information of mobile users. The existence of a clone attack is identified by calculating the community betweenness value of a node in multiple time periods. To achieve robust attack detection, we developed both analytical and training based approaches to find a suitable community betweenness threshold for clone attack detection. Through extensive simulations using SWIM and MIT Reality traces, we showed that by considering social community information, our proposed method can detect clone attacks efficiently with high detection ratio and zero false positive rate. Our results demonstrated the feasibility of exploiting the social community information derived from mobile device daily traces for solving security problems (e.g., clone attacks) in mobile healthcare systems.

VIII. ACKNOWLEDGMENT

This work is supported in part by National Science Foundation Grants CNS1016303 and CNS1016296.

REFERENCES

[1] J. Scott, Social Network Analysis: A Handbook. Sage Publication Ltd, 2000.

[2] M. Girvan and M. E. J. Newman, “Community structure in social and biological networks,” in Proceedings of the National Academy of Sciences of the United States of America, June 2002.

[3] M. A. Kazandjieva, J. W. Lee, M. Salathé, M. W. Feldman, J. H. Jones, and P. Levis, “Experiences in measuring a human contact network for epidemiology research,” in Proceedings of the ACM Workshop on Hot Topics in Embedded Networked Sensors (HotEmNets), 2010.

[4] Y. Ren, J. Yang, M. C. Chuah, and Y. Chen, “Mobile phone enabled social community extraction for controlling of disease propagation in healthcare,” in Proceedings of the IEEE International Conference on Mobile Ad Hoc and Sensor Systems (IEEE MASS), concise paper, 2011.

[5] R. R. Brooks, P. Y. Govindaraju, M. Pirretti, N. Vijaykrishnan, and M. T. Kandemir, “On the detection of clones in sensor networks using random key predistribution,” IEEE Transactions on Systems, Man, and Cybernetics, Part C, pp. 1246–1258, 2007.

[6] H. Choi, S. Zhu, and T. F. La Porta, “Set: Detecting node clones in sensor networks,” in Security and Privacy in Communications Networks and the Workshops, 2007, Sept. 2007, pp. 341–350.

[7] K. Xing, F. Liu, X. Cheng, and D. H. C. Du, “Real-time detection of clone attacks in wireless sensor networks,” in Proceedings of the 2008 The 28th International Conference on Distributed Computing Systems.

[8] L. Jin, H. Takabi, and J. B. Joshi, “Towards active detection of identity clone attacks on online social networks,” in Proceedings of the first ACM conference on Data and application security and privacy.

[9] N. O. Tippenhauer, C. Pöpper, K. B. Rasmussen, and S. Capkun, “On the requirements for successful GPS spoofing attacks,” in Proceedings of the 18th ACM conference on Computer and communications security, 2011, pp. 75–86.

[10] A. Mei and J. Stefa, “SWIM: A simple model to generate small mobile worlds,” in Proc. of IEEE INFOCOM, 2009, pp. 2106–2113.

[11] N. Eagle and A. Pentland, “Reality mining: Sensing complex social systems,” in Personal and Ubiquitous Computing, vol. 10, no. 4, 2005.

[12] A. Becher, E. Becher, Z. Benenson, and M. Dornseif, “Tampering with motes: Real-world physical attacks on wireless sensor networks,” in Proceedings of the 3rd International Conference on Security in Pervasive Computing (SPC), 2006, pp. 104–118.

[13] M. Conti, R. Di Pietro, L. V. Mancini, and A. Mei, “A randomized, efficient, and distributed protocol for the detection of node replication attacks in wireless sensor networks,” in Proceedings of the 8th ACM international symposium on Mobile ad hoc networking and computing.

[14] B. Parno, A. Perrig, and V. Gligor, “Distributed detection of node replication attacks in sensor networks,” in Proceedings of the 2005 IEEE Symposium on Security and Privacy, 2005, pp. 49–63.

[15] F. Liu, X. Cheng, and D. Chen, “Insider attacker detection in wireless sensor networks,” in 26th IEEE International Conference on Computer Communications, Anchorage, Alaska, USA, 2007, pp. 1937–1945.

[16] H. Jun-won, W. Matthew, and D. S. K, “Fast detection of replica node attacks in mobile sensor networks using sequential analysis,” in Proc. of IEEE INFOCOM, 2009, pp. 1773–1781.

[17] S. Huang, “Probabilistic model checking of disease spread and prevention,” in Scholarly Paper for the Degree of Masters in University of Maryland, 2009.

[18] L. Tang, X. Wang, and H. Liu, “Uncovering groups via heterogeneous interaction analysis,” in Proceedings of IEEE International Conference on Data Mining (ICDM), 2009.

[19] T. Hossmann, T. Spyropoulos, and F. Legendre, “Know thy neighbor: towards optimal mapping of contacts to social graphs for DTN routing,” in Proceedings of the 29th conference on Information communications.

[20] D. J. Watts, Small Worlds: The Dynamics of Networks between Order and Randomness. Princeton University Press, 2003.

[21] Y. Ren, Y. Chen, and M. C. Chuah, “Detection schemes against clone attacks in mobile social network,” Technical report, Stevens Institute of Technology, November 2011.
