

Fully Anonymous Profile Matching in Mobile Social Networks

Xiaohui Liang, Student Member, IEEE, Xu Li, Kuan Zhang, Rongxing Lu, Member, IEEE, Xiaodong Lin, Member, IEEE, and Xuemin (Sherman) Shen, Fellow, IEEE

Abstract—In this paper, we study user profile matching with privacy-preservation in mobile social networks (MSNs) and introduce a family of novel profile matching protocols. We first propose an explicit Comparison-based Profile Matching protocol (eCPM) which runs between two parties, an initiator and a responder. The eCPM enables the initiator to obtain the comparison-based matching result about a specified attribute in their profiles, while preventing their attribute values from disclosure. We then propose an implicit Comparison-based Profile Matching protocol (iCPM) which allows the initiator to directly obtain some messages instead of the comparison result from the responder. The messages, which are unrelated to the user profile, can be divided into multiple categories by the responder. The initiator implicitly chooses the category of interest, which is unknown to the responder. Two messages in each category are prepared by the responder, and only one message can be obtained by the initiator according to the comparison result on a single attribute. We further generalize the iCPM to an implicit Predicate-based Profile Matching protocol (iPPM) which allows complex comparison criteria spanning multiple attributes. The anonymity analysis shows all these protocols achieve the confidentiality of user profiles. In addition, the eCPM reveals the comparison result to the initiator and provides only conditional anonymity; the iCPM and the iPPM do not reveal the result at all and provide full anonymity. We analyze the communication overhead and the anonymity strength of the protocols. We then present an enhanced version of the eCPM, called eCPM+, by combining the eCPM with a novel prediction-based adaptive pseudonym change strategy. The performance of the eCPM and the eCPM+ is comparatively studied through extensive trace-based simulations. Simulation results demonstrate that the eCPM+ achieves significantly higher anonymity strength with a slightly larger number of pseudonyms than the eCPM.

Index Terms—Mobile social network, profile matching, privacy preservation, homomorphic encryption, oblivious transfer.

Manuscript received 28 February 2012; revised 22 July 2012. This work was partially supported by the International Cooperative Program of Shenzhen City (No.ZYA201106090040A).

X. Liang, K. Zhang, R. Lu, and X. Shen are with the Department of Electrical and Computer Engineering, University of Waterloo, Canada (emails: {x27liang, k52zhang, rxlu, xshen}@bbcr.uwaterloo.ca).

X. Li is with Huawei Technologies Canada. Part of this work was done when he was with Inria, France (email: [email protected]).

X. Lin is with the Faculty of Business and Information Technology, University of Ontario Institute of Technology, Canada (email: [email protected]).

I. INTRODUCTION

Social networking turns digital communication technologies into sharp tools for extending people’s social circles. It has already become an integral part of our daily lives, enabling us to stay in touch with friends and family in a timely manner. As reported by ComScore [1], social networking sites such as Facebook and Twitter have reached 82 percent of the world’s online population, representing 1.2 billion users around the world. In the meantime, fueled by the pervasive adoption of advanced handheld devices and the ubiquitous connections of Bluetooth/WiFi/GSM/LTE networks, the use of Mobile Social Networks (MSNs) has surged. In the MSNs, users are able to not only surf the Internet but also communicate with peers in close vicinity using short-range wireless communications [2]–[6]. Due to their geographical nature, the MSNs support many promising and novel applications [7]–[12]. For example, through Bluetooth communications, PeopleNet [7] enables efficient information search among neighboring mobile phones; a message-relay approach is suggested in [8] to facilitate carpool and ride sharing in a local region. Realizing the potential benefits brought by the MSNs, recent research efforts have focused on improving the effectiveness and efficiency of the communications among the MSN users [9], [11], [12]. These works developed specialized data routing and forwarding protocols associated with the social features exhibited in the behavior of users, such as social friendship [9], social selfishness [11], and social morality [12]. It is encouraging that the traditional solutions can be further extended to solve the MSN problems by considering the unique social features.

Privacy preservation is a significant research issue in social networking. Since more personalized information is shared with the public, violating the privacy of a target user becomes much easier [13]–[17]. Research efforts [13], [14], [17] have focused on identity presentation and privacy concerns in social networking sites. Gross and Acquisti [13] argued that users are putting themselves at risk both offline (e.g., stalking) and online (e.g., identity theft) based on a behavior analysis of more than 4,000 students who have joined a popular social networking site. Stutzman [14] presented a quantitative analysis of identity information disclosure in social network communities and subjective opinions from students regarding identity protection and information disclosure. When the social networking platforms are extended into the mobile environment, users require more extensive privacy-preservation because they are unfamiliar with the neighbors in close vicinity who may eavesdrop, store, and correlate their personal information at different time periods and locations. Once the personal information is correlated to the location information, the behavior of users will be completely disclosed to the public. Chen and Rahman [17] surveyed various mobile Social Networking Applications (SNAs), such as neighborhood exploring applications, mobile-specific SNAs, and content-sharing applications, all of which provide no feedback or control mechanisms to users and may cause inappropriate location and identity

IEEE TRANSACTIONS ON NETWORKING YEAR 2013


[Figure 1 panels: (a) Scenario-1, (b) Scenario-2. Labels: Initiator ui with ai,x = 0.8; Responder uj with aj,x = 0.5; comparison outcomes ai,x > aj,x, ai,x = aj,x, ai,x < aj,x; ciphertext vector E(0), …, E(1), …, E(0) with E(1) in the y-th dimension; messages s1,1, …, s1,y, …, s1,λ and s0,1, …, s0,y, …, s0,λ; returned ciphertext E(s1,y).]

Fig. 1: Two considered scenarios: (a) Attribute value ai,x and attribute value aj,x will not be disclosed to uj and ui, respectively. The initiator obtains the comparison result at the end of the protocol. (b) ai,x and aj,x will not be disclosed to uj and ui, respectively. In addition, category Ty will not be disclosed to uj, and the comparison result will not be disclosed to either ui or uj. The initiator obtains either s1,y or s0,y depending on the comparison result between ai,x and aj,x.

information disclosure. To overcome the privacy violation in MSNs, many privacy enhancing techniques have been adopted into the MSN applications [4], [12], [17]–[23]. For example, when two users encounter each other in the MSNs, privacy-preserving profile matching acts as a critical initial step to help users, especially strangers, initialize conversation with each other in a distributed and privacy-preserving manner. Many research efforts on privacy-preserving profile matching [20]–[23] have been carried out. The common goal of these works is to enable the handshake between two encountered users if both users satisfy each other’s requirement while eliminating unnecessary information disclosure if they do not. The original idea is from [18], where an agent of the Central Intelligence Agency (CIA) wants to authenticate herself to a server, but does not want to reveal her CIA credentials unless the server is a genuine CIA outlet. In the meantime, the server does not want to reveal its CIA credentials to anyone but CIA agents.

In the MSNs, we consider a generalized function to support information exchange by using profile matching as a metric. Following the previous example, we consider two CIA agents with two different priority levels in the CIA system, A with a low priority lA and B with a high priority lB. They know each other as CIA agents. However, they do not want to reveal their priority levels to each other. B wants to share some messages with A. The messages are not related to the user profile, and they are divided into multiple categories, e.g., the messages related to different regions (New York or Beijing) in different years (2011 or 2012). B shares one message of a specified category T at a time. The category T is chosen by A, but the choice is unknown to B. For each category, B prepares two self-defined messages, e.g., a low-confidential message for the CIA agent at a lower level and a high-confidential message for the agent at a higher level. Because lA < lB, A eventually obtains the low-confidential message without knowing that it is a low-confidential one. In the meantime, B does not know which message A receives. The above function offers both A and B the highest anonymity, since the comparison result between lA and lB is disclosed to neither A nor B, and the category T of A’s interest is not disclosed to B. In the following, we refer to A as the initiator ui, B as the responder uj, the attribute used in the comparison (i.e., priority level) as ax, and the category T of A’s interest as Ty. The attribute values of ui and uj on the attribute ax are denoted by ai,x and aj,x, respectively. We first formally describe two scenarios from the above examples.

Scenario-1: The initiator wants to know the comparison result, i.e., whether it has a value larger than, equal to, or smaller than that of the responder on a specified attribute. For example, as shown in Fig. 1 (a), the initiator ui expects to know if ai,x > aj,x, ai,x = aj,x, or ai,x < aj,x.

Scenario-2: The initiator expects that the responder shares one message related to the category of its interest, which is however kept unknown to the responder. In the meantime, the responder wants to share with the initiator one message which is determined by the comparison result of their attribute values. For example, as shown in Fig. 1 (b), both ui and uj know that ax is used in the comparison and the categories of messages are T1, · · · , Tλ. The initiator ui first generates a (0, 1)-vector where the y-th dimension value is 1 and other dimension values are 0. Then, ui encrypts the vector with its own public key and sends the ciphertexts (E(0), · · · , E(1), · · · , E(0)) to the responder uj. The ciphertexts imply ui’s interested category Ty, but uj is unable to know Ty since E(0) and E(1) are indistinguishable without a decryption key. ui also provides its attribute value ai,x in an encrypted form so that uj is unable to obtain ai,x. On the other hand, uj prepares λ pairs of messages, each pair (s1,h, s0,h) relating to one category Th (1 ≤ h ≤ λ). uj executes a calculation over the ciphertexts and sends the result to ui. Finally, ui obtains E(s1,y) if ai,x > aj,x or E(s0,y) if ai,x < aj,x, and obtains s1,y or s0,y by decryption.
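The category selection in Scenario-2 can be sketched with any additively homomorphic cryptosystem. The example below is a minimal sketch, assuming a toy Paillier instance with tiny hard-coded primes (insecure, and not necessarily the scheme used by the protocols in this paper). The names `encrypt`, `decrypt`, `selection_vector`, and `respond` are hypothetical. It shows only how uj can return an encryption of the y-th message without learning y; the comparison-dependent choice between s1,y and s0,y is not modelled.

```python
import math
import random

# Toy Paillier parameters: tiny primes, insecure, for illustration only.
P, Q = 1009, 1013
N = P * Q
N2 = N * N
G = N + 1
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)  # modular inverse (Python 3.8+)


def encrypt(m):
    """Encrypt integer m (0 <= m < N) under the toy public key (N, G)."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def decrypt(c):
    """Decrypt ciphertext c with the toy private key (LAM, MU)."""
    return ((pow(c, LAM, N2) - 1) // N) * MU % N


def selection_vector(y, lam):
    """Initiator side: encrypt a (0,1)-vector whose y-th entry (1-based) is 1."""
    return [encrypt(1 if h == y else 0) for h in range(1, lam + 1)]


def respond(ciphertexts, messages):
    """Responder side: compute E(sum_h b_h * s_h) without decrypting anything."""
    result = encrypt(0)
    for c, s in zip(ciphertexts, messages):
        # Raising E(b) to the power s yields E(b * s); multiplying adds plaintexts.
        result = (result * pow(c, s, N2)) % N2
    return result


messages = [111, 222, 333, 444]   # one integer message per category (hypothetical)
reply = respond(selection_vector(3, len(messages)), messages)
selected = decrypt(reply)         # the initiator recovers only the chosen message
```

Because every ciphertext is randomized, the responder sees λ values that look alike regardless of which position holds E(1), which is the property the scenario relies on.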

A. Problem Statement

In the literature, there are many privacy-preserving profile matching protocols [10], [20]–[23]. They aim to determine the overall similarity of two profiles rather than their relation in specific attributes. They commonly check whether the proximity measure of the two profiles is larger than, equal to, or smaller than a pre-defined threshold value. The proximity measurement can be the size of the intersection of two sets or the distance of two vectors, where sets and vectors are used to represent profiles. They do not consider the larger, equal, or smaller relations of the attribute values as the matching metrics. Moreover, the profile matching results are revealed to the participating users under certain conditions, and behavior linkage happens when the matching results are distinctive.


Consider users who adopt the multiple-pseudonym technique [24], [25], i.e., users achieve high anonymity by frequently changing the unlinkable pseudonyms in the communication. As shown in Fig. 2, users uk and uj both change their pseudonyms at time t and t′ (> t). Since the matching result between uk and ui is the non-unique value 0.7, ui is unable to link uk’s behavior. However, ui is likely to know that user uj stays in its neighborhood because the matching result remains 0.1, which is quite distinct from the other matching results. In addition, if 0.1 is unique among all possible matching results of users, they would easily recognize each other by executing the matching protocols even though their profiles are not disclosed. Hence, the privacy protection of users is related to both their profiles and their profile matching results. Considering that a user has ν possible instances of the profile, we classify the anonymity of profile matching into three classes, non-anonymity, conditional anonymity, and full anonymity, based on the following definitions.

[Figure 2: matching results (MR) observed by ui at time t and at time t′ (> t). Labels: MR = 0.1 for uj at both times; MR = 0.7 for uk; other neighbors with MR = 0.6, 0.7, and 0.8.]

Fig. 2: Behavior linkage
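The linkage risk illustrated in Fig. 2 can be sketched as follows. This is a minimal illustration, assuming the observer simply records one matching result per neighboring pseudonym at each time; the function name `link_pseudonyms` and the pseudonym labels are hypothetical. A pseudonym pair is linked only when its matching result is unique at both times.

```python
from collections import Counter


def link_pseudonyms(before, after):
    """Return (old, new) pseudonym pairs whose matching result is unique
    at both observation times and identical across them."""
    count_before = Counter(before.values())
    count_after = Counter(after.values())
    # Map each result that is unique at the later time to its pseudonym.
    unique_after = {v: p for p, v in after.items() if count_after[v] == 1}
    return [
        (p, unique_after[v])
        for p, v in before.items()
        if count_before[v] == 1 and v in unique_after
    ]


# Hypothetical observations by ui: pseudonym -> matching result.
at_t = {"p1": 0.7, "p2": 0.1, "p3": 0.6, "p4": 0.7, "p5": 0.6}
at_t_prime = {"q1": 0.7, "q2": 0.1, "q3": 0.6, "q4": 0.7, "q5": 0.6, "q6": 0.8}

linked = link_pseudonyms(at_t, at_t_prime)
```

In this data only the pseudonyms with result 0.1 are linked; the results 0.6 and 0.7 are shared by several neighbors, and 0.8 appears only at the later time.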

Definition 1 (Non-Anonymity). A profile matching protocol provides non-anonymity if after executing multiple runs of the protocol with any user, the probability of correctly guessing the profile of the user is equal to 1.

Definition 2 (Conditional Anonymity). A profile matching protocol achieves conditional anonymity if after executing multiple runs of the protocol with some user, the probability of correctly guessing the profile of the user is larger than 1/ν.

Definition 3 (Full Anonymity). A profile matching protocol achieves full anonymity if after executing multiple runs of the protocol with any user, the probability of correctly guessing the profile of the user is always 1/ν.
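To make the three definitions concrete, the sketch below considers a single attribute with values in [1, l], so that ν = l, and an observer who learns one comparison result per protocol run. This is an illustrative assumption rather than the paper's analysis; the function names are hypothetical, and the observations are assumed to be consistent with some true value. Each result removes candidate values, and the guessing probability becomes one over the number of candidates left.

```python
from fractions import Fraction


def remaining_candidates(l, observations):
    """Candidate target values in [1, l] consistent with the observations.

    Each observation is (own_value, result), where result is '>', '=' or '<'
    and means: own_value <result> target_value.
    """
    candidates = set(range(1, l + 1))
    for own, result in observations:
        if result == ">":
            candidates = {a for a in candidates if own > a}
        elif result == "<":
            candidates = {a for a in candidates if own < a}
        else:
            candidates = {a for a in candidates if own == a}
    return candidates


def guess_probability(l, observations):
    """Probability of guessing the target value uniformly among candidates."""
    return Fraction(1, len(remaining_candidates(l, observations)))


def label(prob, nu):
    """Label a guessing probability following Definitions 1 to 3."""
    if prob == 1:
        return "non-anonymity"
    if prob > Fraction(1, nu):
        return "conditional anonymity"
    return "full anonymity"


# Two runs reveal that the target value lies strictly between 3 and 8.
p_after = guess_probability(10, [(8, ">"), (3, "<")])
p_none = guess_probability(10, [])
```

With l = 10, the two revealed comparisons leave four candidates, so the guessing probability rises from 1/10 to 1/4; a protocol that reveals no result keeps it at 1/ν.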

The profile matching protocols [21]–[23] allow the users to obtain the profile matching results which contain partial profile information. Further, the profile matching results may cause behavior linkage in certain conditions such that the revealed profile information will be correlated to break user anonymity. By cross-checking the profile matching results with the profile set, some possible instances may be excluded, and then the probability of correctly guessing the profile of the target user must be larger than 1/ν. Thus, the previous works [21]–[23] only provide conditional anonymity. In this paper, we aim to design the profile matching protocols with conditional anonymity and full anonymity. We propose an explicit Comparison-based Profile Matching protocol (eCPM) with conditional anonymity, an implicit Comparison-based Profile Matching protocol (iCPM) and an implicit Predicate-based Profile Matching protocol (iPPM) both with full anonymity.

TABLE I: Frequently used notations
Notation            Description
u1, · · · , uN      N users
ui and uj           The initiator and the responder
a1, · · · , aw      The attributes in user profiles
ax                  The attribute used in comparison
ai,x and aj,x       Two attribute values of ui and uj
[1, l]              The range of attribute values
T1, · · · , Tλ      The categories of messages
Ty                  The category of the initiator ui’s interest
pidi and pidj       The pseudonyms of ui and uj
pki and pkj         The public keys of ui and uj
certpidi            The certificate on (pidi, pki)
s1,y and s0,y       Two messages in the category Ty
Π                   The predicate set by the responder uj
A                   The attribute set of Π, |A| = n
t                   The threshold value of Π

B. Network Model

We consider a homogeneous MSN composed of N mobile users. Users have equal wireless communication range, and the communication is bi-directional. The multi-pseudonym technique [24], [25] is adopted to preserve user identity and location privacy, i.e., users use pseudonyms rather than their real identities during communication and change their pseudonyms periodically. A Trusted Central Authority (TCA) is used for bootstrapping but is not involved in user communication. During the bootstrapping, the TCA generates profiles, pseudonyms, and associated certificates for individual users.

Similar to previous works [22], [23], each user is assumed to have w attributes, and its profile is a w-dimension vector spanning all these attributes. An integer value between 1 and l is assigned to each dimension, representing the priority level, knowledge level, or capability of users on the corresponding attribute. Therefore, a user ui will have a profile pi = (ai,1, · · · , ai,w) where ai,h ∈ Z, 1 ≤ ai,h ≤ l and 1 ≤ h ≤ w. The messages to be shared are divided into λ categories Th for 1 ≤ h ≤ λ. A non-exhaustive list of notations to be used throughout the rest of the paper can be found in Table I.
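The profile model above maps directly onto a small data structure. The following is a minimal plaintext sketch, assuming Python dataclasses; the class name `Profile` and the helper `compare` are hypothetical, and the comparison is done in the clear purely to fix notation, whereas the protocols in this paper operate on encrypted values.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Profile:
    values: Tuple[int, ...]  # (a_{i,1}, ..., a_{i,w})
    l: int                   # upper bound of the attribute range [1, l]

    def __post_init__(self):
        # Enforce a_{i,h} in Z with 1 <= a_{i,h} <= l for every dimension h.
        if not all(isinstance(a, int) and 1 <= a <= self.l for a in self.values):
            raise ValueError("each attribute value must be an integer in [1, l]")

    def attribute(self, x):
        """Return a_{i,x} using the 1-based attribute index of the text."""
        return self.values[x - 1]


def compare(p_i, p_j, x):
    """Plaintext comparison of a_{i,x} and a_{j,x}: returns '>', '=' or '<'."""
    a, b = p_i.attribute(x), p_j.attribute(x)
    return ">" if a > b else "<" if a < b else "="


p_i = Profile(values=(8, 3, 5), l=10)
p_j = Profile(values=(5, 3, 9), l=10)
results = [compare(p_i, p_j, x) for x in (1, 2, 3)]
```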

Malicious users exist in the network. They are curious about the personal information of others, such as unique identities, locations, and profiles. Personal information revealed in profile matching imposes direct privacy threats to users. Fractions of such information may be aggregated by colluding users, and the behavior of the target user may be linked [15], [16]. To prevent privacy violation completely, personal information, and even profile matching results, must not be disclosed. Protocol-dependent techniques are needed for preventing behavior linkage.

C. Our Contributions

In this paper, we propose a family of novel protocols, eCPM, iCPM, and iPPM, to solve the considered profile matching problems based on the above network model. These protocols rely on homomorphic encryption to protect the content of user profiles from disclosure. They provide increasing levels of anonymity (from conditional to full). Our contributions are summarized below.

Firstly, we propose the eCPM for Scenario-1. For a specified attribute, the eCPM allows the initiator to know the comparison result, i.e., whether it has a larger, equal, or smaller value than the responder on the attribute. Due to the exposure of the comparison result, user profiles may be leaked and linked under some conditions. We provide a numerical analysis on the conditional anonymity of the eCPM. We study the anonymity risk level in relation to the pseudonym change for consecutive eCPM runs.

Secondly, we propose the iCPM for Scenario-2. In this protocol, the responder prepares multiple categories of messages where two messages are generated for each category. The initiator can obtain only one message related to one category for each run. During the protocol, the responder is unable to know the category of the initiator’s interest. Which message in the category the initiator receives depends on the comparison result on a specified attribute. The responder does not know which message the initiator receives, while the initiator cannot derive the comparison result from the received message. We provide an analysis of the effectiveness of the iCPM, and show that the iCPM achieves full anonymity.

Thirdly, we extend the iCPM to obtain the iPPM, which has the same anonymity property as the iCPM. The iPPM allows the comparisons of multiple attributes for profile matching. The responder defines a predicate, similar to [26], which is a logical expression made of multiple comparisons between its own attribute values and the initiator’s attribute values. The initiator receives one message from the responder corresponding to the specified category. Which message in the category the initiator receives depends on whether the initiator’s attribute values satisfy the predicate or not. We provide an analysis of the effectiveness of the iPPM.
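A plaintext reading of such a predicate can be sketched as follows. The exact form of the predicate is not given in this section, so the sketch assumes one interpretation suggested by the notation in Table I: the predicate Π lists n comparisons over an attribute set A and is satisfied when at least t of them hold. The function names and the sample data are hypothetical, and the real protocol evaluates the predicate over encrypted values.

```python
import operator

# Supported comparison operators between initiator and responder values.
OPS = {">": operator.gt, "<": operator.lt}


def satisfies(initiator, responder, comparisons, t):
    """True if at least t of the listed comparisons hold.

    initiator, responder: dicts mapping attribute index x to a value.
    comparisons: list of (x, op), read as initiator[x] <op> responder[x].
    """
    hits = sum(1 for x, op in comparisons if OPS[op](initiator[x], responder[x]))
    return hits >= t


def message_for(initiator, responder, comparisons, t, s1, s0):
    """Select s1 when the predicate is satisfied and s0 otherwise."""
    return s1 if satisfies(initiator, responder, comparisons, t) else s0


a_i = {1: 8, 2: 3, 3: 5}                     # initiator attribute values
a_j = {1: 5, 2: 6, 3: 4}                     # responder attribute values
predicate = [(1, ">"), (2, ">"), (3, ">")]   # n = 3 comparisons

chosen = message_for(a_i, a_j, predicate, 2, "s1", "s0")
```

Here two of the three comparisons hold (attributes 1 and 3), so a threshold of t = 2 selects s1, while t = 3 would select s0.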

Fourthly, we improve the eCPM by combining it with an adaptive pseudonym change strategy and obtain a new variant, eCPM+. In the eCPM+, each user measures its current neighborhood status periodically and predicts its future neighborhood status using an Autoregressive Moving Average (ARMA) model. Based on the aggregate neighborhood status since the last pseudonym change, it periodically estimates its anonymity risk level and changes its pseudonym when the level is too high. The extensive trace-based simulation shows that the eCPM+ achieves significantly higher anonymity strength with a slightly larger number of used pseudonyms than the eCPM.
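The prediction step can be illustrated with a much simpler model than the one the eCPM+ uses. The sketch below fits a first-order autoregressive model, AR(1), which is the special case ARMA(1, 0), to a hypothetical series of periodic neighbor counts and forecasts the next value. The function name `ar1_forecast` and the sample series are assumptions; how a forecast feeds into the anonymity risk level is not specified in this section and is not modelled here.

```python
def ar1_forecast(series):
    """One-step-ahead forecast with an AR(1) model fitted by least squares.

    Model: x[t+1] - mean = phi * (x[t] - mean). Falls back to the mean
    when the series has no variation to fit.
    """
    mean = sum(series) / len(series)
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev[:-1])
    if denom == 0:
        return mean
    # Least-squares estimate of phi from consecutive deviations.
    phi = sum(dev[t] * dev[t - 1] for t in range(1, len(dev))) / denom
    return mean + phi * dev[-1]


# Hypothetical neighbor counts measured once per period.
neighbor_counts = [4, 6, 4, 6, 4, 6]
predicted = ar1_forecast(neighbor_counts)
```

For this alternating series the fitted coefficient is negative, so the forecast swings back below the mean after a high reading.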

The remainder of this paper is organized as follows. We review existing profile matching protocols in Sec. II and introduce some fundamental techniques in Sec. III. The three protocols eCPM, iCPM, and iPPM are presented respectively in Sec. IV-VI, along with the effectiveness discussion. Their communication overhead and anonymity strength are analyzed in Sec. VII. We present the eCPM+ in Sec. VIII and evaluate its performance in Sec. IX. Finally, we conclude the paper in Sec. X.

II. RELATED WORK

Mobile social networks, as emerging social communication platforms [27]–[29], have attracted great attention recently, and their mobile applications have been developed and implemented pervasively. In mobile social networking applications, profile matching acts as a critical initial step to help users, especially strangers, initialize conversation with each other in a distributed manner. Yang et al. [30] introduced a distributed mobile communication system, called E-SmallTalker, which facilitates social networking in physical proximity. E-SmallTalker automatically discovers and suggests common topics between users for easy conversation. Lu et al. [20] studied e-healthcare cases by proposing a symptom matching scheme for mobile health social networks. They considered that such a matching scheme is valuable to patients who have the same symptom and wish to exchange their experiences, mutual support, and inspiration with each other.

In general, profile matching can be categorized based on the formats of profiles and the types of matching operations. A well-known profile matching scheme is the FNP scheme [19], where a client and a server compute their intersection set such that the client gets the result while the server learns nothing. Later, Kissner et al. [31] implemented profile matching with more operations, including set intersection, union, cardinality, and over-threshold operations. On the other hand, Ye et al. [32] further extended the FNP scheme to a distributed private matching scheme, and Dachman-Soled et al. [33] aimed at reducing the protocol complexity. All the above solutions to the set intersection rely on homomorphic encryption operations. In the meantime, other works [34], [35] employed an oblivious pseudo-random function to build their profile matching protocols, improving communication and computational efficiency. Li et al. [21] implemented profile matching according to three increasing privacy levels: i) revealing the common attribute set of the two users; ii) revealing the size of the common attribute set; and iii) revealing the size rank of the common attribute sets between a user and its neighbors. They considered an honest-but-curious (HBC) adversary model, which assumes that users try to learn more information than allowed by inferring from the profile matching results, but honestly follow the protocol. They applied secure multi-party computation, the Shamir secret sharing scheme, and the homomorphic encryption scheme to achieve the confidentiality of user profiles.

In another category of profile matching [22], [23], [36], profiles can be represented as vectors, and the matching operation can be inner product or distance. Such profile matching is a special instance of secure two-party computation, which was initially introduced by Yao [37] and later generalized to secure multi-party computation by Goldreich et al. [38]. Specifically, we introduce two recent works in this category. Dong et al. [23] considered user profiles consisting of attribute values and measured the proximity of two user profiles using the dot product fdot(u, v). An existing dot product protocol [39] is improved to enable verifiable secure computation. The improved protocol only reveals whether the dot product is above or below a given threshold. The threshold value is selected by


the user who initiates the profile matching. They pointed out the potential anonymity risk of their protocols; an adversary may adaptively adjust the threshold value to quickly narrow down the value range of the victim profile. Thus, it is required that the threshold value be larger than a pre-defined lower bound (a system parameter) to guarantee user anonymity. The same problem exists in other works [21], [22]. Furthermore, Dong et al. [23] required users to make a commitment about their profiles to ensure profile consistency, but a profile forgery attack may still take place during the commitment phase. In the same category, Zhang et al. [22] set the matching operation fdis(u, v) of two d-dimension user profiles u and v as the calculation of the following distances: i) Manhattan distance, i.e.,

$$f_{dis}(u, v) = l_{\alpha}(u, v) = \Big(\sum_{i=1}^{d} |v_i - u_i|^{\alpha}\Big)^{1/\alpha};$$

or ii) Max distance, i.e., fdis(u, v) = lmax(u, v) = max{|v1 − u1|, · · · , |vd − ud|}. The distance is compared with a pre-defined threshold τ to determine whether u and v match. Then, three increasing privacy levels are defined as: i) one of u and v learns fdis(u, v), and the other only learns fdis; ii) one of them learns fdis(u, v), and the other learns nothing; and iii) one of them learns whether fdis(u, v) < τ, and the other learns nothing.

The proposed profile matching protocols are novel since the comparison of attribute values is considered as the matching operation. The intuitive idea is inspired by the famous Yao’s millionaires’ problem [37] and its solution [40]. Similar to other works [21]–[23], we propose three different protocols with different anonymity levels. For the eCPM with conditional anonymity, we provide a detailed anonymity analysis and show the relation between pseudonym change and anonymity variation. For the iCPM and the iPPM with full anonymity, we show that the use of these protocols does not affect the user anonymity level and that users are able to completely preserve their privacy.

III. PRIMITIVES

In this section, we introduce the homomorphic encryption scheme and the Autoregressive Moving Average (ARMA) model that will be used in our proposed profile matching protocols.

A. Homomorphic Encryption

There are several existing homomorphic encryption schemes that support different operations such as addition and multiplication on ciphertexts, e.g., [41], [42]. By using these schemes, a user is able to process the encrypted plaintext without knowing the secret keys. Due to this property, homomorphic encryption schemes are widely used in data aggregation and computation, specifically for privacy-sensitive content [43]. We review the homomorphic encryption scheme [42] that serves as a building block of our proposed profile matching protocols.

A central authority runs a generator G which outputs ⟨p, q, R, Rq, Rp, χ⟩ as system public parameters:

• p < q are two primes s.t. q ≡ 1 mod 4 and p ≫ l;
• Rings R = Z[x]/⟨x^2 + 1⟩, Rq = R/qR = Zq[x]/⟨x^2 + 1⟩;
• Message space Rp = Zp[x]/⟨x^2 + 1⟩;
• A discrete Gaussian error distribution χ = D_{Z^n,σ} with standard deviation σ.

Suppose user ui has a public/private key pair (pki, ski) such that pki = {ai, bi}, with ai = −(bi s + p e), bi ∈ Rq and s, e ← χ, and ski = s. Let bi,1 and bi,2 be two messages encrypted by ui.

• Encryption Epki(bi,1): ci,1 = (c0, c1) = (ai ut + p gt + bi,1, bi ut + p ft), where ut, ft, gt are samples from χ.
• Decryption Dski(ci,1): Denoting ci,1 = (c0, · · · , cα1), the plaintext is recovered as

$$b_{i,1} = \Big(\sum_{k=0}^{\alpha_1} c_k s^k\Big) \bmod p.$$

Consider the two ciphertexts ci,1 = E(bi,1) = (c0, · · · , cα1) and ci,2 = E(bi,2) = (c′0, · · · , c′α2).

• Addition: Let α = max(α1, α2). If α1 < α, let cα1+1 = · · · = cα = 0; if α2 < α, let c′α2+1 = · · · = c′α = 0. Thus, we have E(bi,1 ± bi,2) = (c0 ± c′0, · · · , cα ± c′α).
• Multiplication: Let v be a symbolic variable and compute

$$\Big(\sum_{k=0}^{\alpha_1} c_k v^k\Big) \cdot \Big(\sum_{k=0}^{\alpha_2} c'_k v^k\Big) = c_{\alpha_1+\alpha_2} v^{\alpha_1+\alpha_2} + \cdots + c_1 v + c_0,$$

where the coefficients on the right-hand side are those of the product polynomial. Thus, we have E(bi,1 × bi,2) = (c0, · · · , cα1+α2).
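The operations above can be illustrated with a toy sketch over Rq = Zq[x]/⟨x^2 + 1⟩. It is only an illustration under stated assumptions, not the parameterization of [42]: the moduli P and Q are small toy values, the error distribution χ is replaced by uniform samples from {−1, 0, 1}, decryption lifts coefficients to the centered interval around zero before reducing mod p, and all function names are hypothetical. Ring elements are pairs (c, d) standing for c + d·x.

```python
import random

P = 257          # toy plaintext modulus p (assumption)
Q = 998244353    # toy ciphertext modulus q, a prime with q % 4 == 1 (assumption)

def ring_add(a, b, mod):
    return ((a[0] + b[0]) % mod, (a[1] + b[1]) % mod)

def ring_mul(a, b, mod):
    # (a0 + a1*x)(b0 + b1*x) reduced with x^2 = -1
    return ((a[0] * b[0] - a[1] * b[1]) % mod,
            (a[0] * b[1] + a[1] * b[0]) % mod)

def ring_scale(k, a, mod):
    return ((k * a[0]) % mod, (k * a[1]) % mod)

def small():
    # Stand-in for the discrete Gaussian chi: tiny uniform noise.
    return (random.randint(-1, 1), random.randint(-1, 1))

def keygen():
    s, e = small(), small()
    b = (random.randrange(Q), random.randrange(Q))
    # a = -(b*s + p*e)
    a = ring_scale(-1, ring_add(ring_mul(b, s, Q), ring_scale(P, e, Q), Q), Q)
    return (a, b), s

def encrypt(pk, m):
    a, b = pk
    u, f, g = small(), small(), small()
    c0 = ring_add(ring_add(ring_mul(a, u, Q), ring_scale(P, g, Q), Q), m, Q)
    c1 = ring_add(ring_mul(b, u, Q), ring_scale(P, f, Q), Q)
    return [c0, c1]

def decrypt(sk, c):
    # Evaluate sum_k c_k * s^k in R_q, center the result, then reduce mod p.
    acc, s_pow = (0, 0), (1, 0)
    for ck in c:
        acc = ring_add(acc, ring_mul(ck, s_pow, Q), Q)
        s_pow = ring_mul(s_pow, sk, Q)
    centered = [x - Q if x > Q // 2 else x for x in acc]
    return (centered[0] % P, centered[1] % P)

def he_add(c1, c2):
    # Pad the shorter ciphertext with zeros, then add componentwise.
    n = max(len(c1), len(c2))
    p1 = c1 + [(0, 0)] * (n - len(c1))
    p2 = c2 + [(0, 0)] * (n - len(c2))
    return [ring_add(x, y, Q) for x, y in zip(p1, p2)]

def he_mul(c1, c2):
    # Polynomial product in the symbolic variable v (a convolution).
    out = [(0, 0)] * (len(c1) + len(c2) - 1)
    for i, x in enumerate(c1):
        for j, y in enumerate(c2):
            out[i + j] = ring_add(out[i + j], ring_mul(x, y, Q), Q)
    return out
```

With these toy parameters the noise stays far below q/2 after one multiplication, so `decrypt(sk, he_mul(encrypt(pk, (5, 0)), encrypt(pk, (7, 0))))` recovers the product; note that the product ciphertext has three ring elements rather than two.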

B. Autoregressive Moving Average (ARMA) Model

The autoregressive model (AR) is a classic tool for understanding and predicting time series data [44]. It estimates the current term zk of the series by a linear weighted sum of the previous p terms (i.e., observations) in the series. The model order p is generally much smaller than the length of the series. AR is often combined with the Moving-Average model (MA) to obtain the more complex ARMA model for generally improved accuracy. While AR depends on the previous terms of a time series, MA describes the current value of the series using a linear weighted sum of white Gaussian noise or random shocks of its prior q values. As a straightforward combination of AR and MA, the ARMA model is notated as ARMA(p, q) and written as

$$z_k = c + \sum_{i=1}^{p} \phi_i z_{k-i} + \sum_{j=1}^{q} \theta_j \epsilon_{k-j} + \epsilon_k,$$

where c is a constant standing for the mean of the series, ϕi the autoregression coefficients, θj the moving-average coefficients, and ϵk the zero-mean white Gaussian noise error. For simplicity, the constant c is often omitted. Deriving ARMA(p, q) involves determining the coefficients ϕi for i ∈ [1..p] and θj for j ∈ [1..q] that give a good prediction. The model can be updated as new samples arrive so as to ensure accuracy, or it may be recomputed only when the prediction is too far from the true measurement. ARMA modeling has been applied to solve various problems, e.g., routing [45], and it will be used in Sec. VIII to enable pre-adaptive pseudonym change.
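As a minimal sketch of how an ARMA(p, q) model yields a one-step-ahead prediction, the function below evaluates the formula above with the unknown current noise term set to its mean of zero. The function name and argument layout are illustrative assumptions; coefficient fitting is not shown.

```python
def arma_predict(history, residuals, phi, theta, c=0.0):
    """One-step-ahead ARMA(p, q) forecast.

    history   -- past observations, most recent last (needs at least p values)
    residuals -- past prediction errors, most recent last (needs at least q values)
    phi       -- autoregression coefficients phi_1..phi_p
    theta     -- moving-average coefficients theta_1..theta_q
    c         -- constant term (often omitted, i.e. 0)
    """
    ar = sum(phi[i] * history[-1 - i] for i in range(len(phi)))
    ma = sum(theta[j] * residuals[-1 - j] for j in range(len(theta)))
    # The current noise term is unknown at prediction time; its mean is zero.
    return c + ar + ma
```

For example, with history [1.0, 2.0, 3.0], residuals [0.5, −0.5], ϕ = (0.5, 0.25), and θ = (0.4), the forecast is 0.5·3 + 0.25·2 + 0.4·(−0.5) = 1.8.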

IV. EXPLICIT COMPARISON-BASED APPROACH

In this section, we present the explicit Comparison-based Profile Matching protocol, i.e., eCPM. This protocol allows two users to compare their attribute values on a specified attribute without disclosing the values to each other. However, the protocol reveals the comparison result to the initiator, and therefore offers conditional anonymity.


A. Bootstrapping

The protocol has a fundamental bootstrapping phase, where the TCA generates all system parameters, user pseudonyms, and keying materials. Specifically, the TCA runs G to generate ⟨p, q, R, Rq, Rp, χ⟩ for initiating the homomorphic encryption (see Sec. III-A). The TCA generates a pair of public and private keys (pkTCA, skTCA) for itself. The public key pkTCA is open to all users; the private key skTCA is a secret which will be used to issue certificates for user pseudonyms and keying materials, as shown below.

The TCA generates disjoint sets of pseudonyms (pidi) and disjoint sets of homomorphic public keys (pki) for users (ui). For every pidi and pki of ui, the TCA generates the corresponding secret keys pski and ski. In correspondence to each pseudonym pidi, it assigns a certificate certpidi to ui, which can be used to confirm the validity of pidi. Generally, the TCA uses skTCA to generate a signature on pidi and pki. The TCA outputs certpidi as a tuple (pki, SignskTCA(pidi, pki)). The homomorphic secret key ski is delivered to ui together with pski; pki is tied to pidi and changes whenever the pseudonym changes.

B. Protocol Steps

Consider user ui with a neighboring user uj. Denote by pidi the current pseudonym of ui and by pidj that of uj. Recall that ax is an attribute, ai,x and aj,x the values of ui and uj on ax, and l the largest attribute value. Suppose that ui as an initiator starts profile matching on ax with a responder uj. Let pski and pskj be the secret keys corresponding to pidi and pidj, respectively. The protocol is executed as follows.

Step 1. ui calculates di = Epki(ai,x), and sends a 5-tuple (pidi, certpidi, ax, di, Signpski(ax, di)) to uj.

Step 2. After receiving the 5-tuple, uj opens the certificate certpidi and obtains the homomorphic public key pki and a signature. It checks certpidi to verify that (pki, pidi) are generated by the TCA, and it checks the signature to validate (ax, di). If any check fails, uj stops; otherwise, uj proceeds as follows. It uses pki to encrypt its own attribute value aj,x, i.e., dj = Epki(aj,x); it chooses a random value φ ∈ Zp such that 1 ≤ φ < ⌊p/(2l)⌋ and m|φ for any integer m ∈ [1, l − 1] (φ can be chosen dependent on uj’s anonymity requirement). By the homomorphic property, it calculates Epki(ai,x − aj,x) and d′j = Epki(φ(ai,x − aj,x)); it finally sends a 5-tuple (pidj, certpidj, ax, d′j, Signpskj(ax, d′j)) to ui.

Step 3. After receiving the 5-tuple, ui opens the certificate certpidj and checks the signature to verify the validity of pidj and (ax, d′j). If the check is successful, ui uses ski to decrypt d′j and obtains the comparison result c = φ(ai,x − aj,x). ui knows ai,x > aj,x if 0 < c ≤ (p − 1)/2, ai,x = aj,x if c = 0, or ai,x < aj,x otherwise.
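The arithmetic behind Steps 2 and 3 can be sketched on plaintexts alone. The sketch below is an illustration only: it omits encryption, certificates, and signatures entirely, the function names are hypothetical, and the `multiplier` argument stands in for the responder’s random choice of φ.

```python
from functools import reduce
from math import gcd

def choose_phi(p, l, multiplier=1):
    """Pick phi divisible by every m in [1, l-1] with 1 <= phi < floor(p / (2l))."""
    base = reduce(lambda x, y: x * y // gcd(x, y), range(1, l), 1)  # lcm(1..l-1)
    phi = base * multiplier
    if not 1 <= phi < p // (2 * l):
        raise ValueError("phi out of range: use a larger p or a smaller multiplier")
    return phi

def blinded_difference(a_i, a_j, phi, p):
    """Plaintext that the initiator would obtain after decrypting d'_j."""
    return (phi * (a_i - a_j)) % p

def interpret(c, p):
    """Map the decrypted value c in Z_p to the comparison result."""
    if c == 0:
        return "="
    return ">" if c <= (p - 1) // 2 else "<"
```

With p = 2^31 − 1 and l = 10, `choose_phi` returns 2520 (the least common multiple of 1, …, 9), and `interpret(blinded_difference(7, 3, 2520, p), p)` yields ">".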

C. Effectiveness Discussion

The effectiveness of the eCPM is guaranteed by the following theorems.

Theorem 1 (Correctness). In the eCPM, the initiator ui is able to obtain the correct comparison result with the responder uj on a specified attribute ax.

Proof: Recall p ≫ l and 1 ≤ φ < ⌊p/(2l)⌋. As 1 ≤ ai,x, aj,x ≤ l, we have −l < ai,x − aj,x < l. If ai,x > aj,x, we have 0 < φ(ai,x − aj,x) < ⌊p/(2l)⌋ × l ≤ p/2. Because p is a prime and φ(ai,x − aj,x) is an integer, we have 0 < φ(ai,x − aj,x) ≤ (p − 1)/2. In the case of ai,x < aj,x, we may similarly derive (p + 1)/2 ≤ φ(ai,x − aj,x) mod p < p. Thus, by comparing φ(ai,x − aj,x) mod p with 0, (p − 1)/2, and (p + 1)/2, ui is able to know whether ai,x > aj,x, ai,x = aj,x, or ai,x < aj,x.

Theorem 2 (Anonymity). The eCPM does not disclose the attribute values of participating users.

Proof: The initiator ui who starts the protocol for attribute ax encrypts its attribute value ai,x using its homomorphic public key pki. Thus, the responder uj is unable to learn any information about ai,x. On the other side, the responder uj does not transmit its attribute value aj,x, but returns φ(ai,x − aj,x) to ui, where φ is a random factor added for anonymity. Since m|φ for 1 ≤ m ≤ l − 1, we have m|(φ(ai,x − aj,x)). Thus, (ai,x − aj,x) can be any value between −(l − 1) and l − 1 from ui’s view, and the exact value of aj,x is thus protected.

Theorem 3 (Non-forgeability). The eCPM discourages profile forgery attacks at the cost of involving the TCA for signature verification and data decryption.

Proof: Consider two users ui and uj running the eCPM with each other on attribute ax. Their public keys pki and pkj used for homomorphic encryption are generated by the TCA, and the TCA has full knowledge of the corresponding private keys ski and skj. In addition, their attribute values are generated by the TCA and recorded in the TCA’s local repository, and the TCA can retrieve any attribute value of users (e.g., ai,x or aj,x) whenever necessary. After the two users finish the protocol, ui will have Signpskj(d′j), and uj will have Signpski(di). If ui (uj) uses a forged profile in the protocol, uj (ui) can cooperate with the TCA to trace such a malicious attack. Specifically, uj (ui) can send Signpski(di) (Signpskj(d′j)) to the TCA. The TCA will be able to check whether the signatures are valid and the encrypted values are consistent with ai,x and aj,x. Thus, any profile forgery attack can be detected with help from the TCA, and such attacks will be discouraged.

V. IMPLICIT COMPARISON-BASED APPROACH

In this section, we propose the implicit Comparison-based Profile Matching (iCPM) by adopting the oblivious transfer cryptographic technique [40]. We consider that users have distinct values for any given attribute. As shown in Fig. 3, the iCPM consists of three main steps. In the first step, ui chooses an interested category Ty by setting the y-th element to 1 and the other elements to 0 in a λ-length vector Vi. ui then encrypts the vector using the homomorphic encryption and sends the encrypted vector to uj. Thus, uj is unable to know Ty but can still operate on the ciphertext. In the second step, uj computes the ciphertexts with input of self-defined messages (s1,h, s0,h) for 1 ≤ h ≤ λ, two encrypted vectors (mi, di), and its own


attribute value aj,x. In the last step, ui decrypts the ciphertext and obtains s1,y if ai,x > aj,x or s0,y if ai,x < aj,x.

A. Protocol Steps

[Figure: (1) The initiator ui sets Vi = (0, …, 1, …, 0), mi = Epki(Vi), and di = (Epki(bi,x,1), …, Epki(bi,x,θ)), and sends (mi, di) to the responder uj. (2) uj, holding bj,x,1, …, bj,x,θ and (s1,h, s0,h) for 1 ≤ h ≤ λ, computes Epki(µ) = (Epki(µ1), …, Epki(µθ)) and returns dj. (3) ui decrypts dj and obtains s1,y if ai,x > aj,x, or s0,y if ai,x < aj,x.]

Fig. 3: The iCPM flow

In the iCPM, the responder uj prepares λ pairs of messages (s0,h, s1,h) for category Th (1 ≤ h ≤ λ), where s0,h, s1,h ∈ Zp and s0,h, s1,h ≤ (p − 1)/2. These messages are not related to uj’s profile. The initiator ui first decides which category Ty it wants to receive messages from, but does not disclose Ty to uj. Then, the responder uj shares either s0,y or s1,y with ui without knowing which one will be received by ui. When the protocol finishes, ui receives one of s0,y and s1,y with no clue about the comparison result. We elaborate the protocol steps below.

Step 1. ui generates a vector Vi = (v1, · · · , vλ), where vy = 1 and vh = 0 for 1 ≤ h ≤ λ and h ≠ y. This vector implies that ui is interested in the category Ty. ui sets mi = Epki(Vi) = (Epki(v1), · · · , Epki(vλ)). It converts ai,x to binary bits ⟨bi,x,1, · · · , bi,x,θ⟩, where θ = ⌈log l⌉, and sets di = (Epki(bi,x,1), · · · , Epki(bi,x,θ)). It sends a 6-tuple (pidi, certpidi, ax, di, mi, Signpski(ax, di, mi)) to uj.

Step 2. After receiving the 6-tuple, uj checks if (pidi, certpidi) are generated by the TCA and the signature is generated by ui. If both checks are successful, it knows that (ax, di, mi) is valid. uj proceeds as follows:

1) Convert aj,x to binary bits ⟨bj,x,1, · · · , bj,x,θ⟩ and compute Epki(bj,x,t) for 1 ≤ t ≤ θ.
2) Compute e′t = Epki(bi,x,t) − Epki(bj,x,t) = Epki(ζ′t).
3) Compute e′′t = (Epki(bi,x,t) − Epki(bj,x,t))^2 = Epki(ζ′′t).
4) Set γ0 = 0, and compute Epki(γt) as 2Epki(γt−1) + e′′t, which implies γt = 2γt−1 + ζ′′t.
5) Select a random rt ∈ Rp in the form of ax + b where a, b ∈ Zp, a ≠ 0, and compute Epki(δt) as Epki(ζ′t) + Epki(rt) × (Epki(γt) − Epki(1)), which implies δt = ζ′t + rt(γt − 1).
6) Select a random rp ∈ Zp (rp ≠ 0), and compute Epki(µt) as

$$E_{pk_i}(\mu_t) = \sum_{h=1}^{\lambda} \Big[ \big( (s_{1,h} + s_{0,h}) E_{pk_i}(1) + s_{1,h} E_{pk_i}(\delta_t) - s_{0,h} E_{pk_i}(\delta_t) \big) \times \big( r_p ( (E_{pk_i}(v_h))^2 - E_{pk_i}(v_h) ) + E_{pk_i}(v_h) \big) \Big] + r_p \Big( \sum_{h=1}^{\lambda} E_{pk_i}(v_h) - E_{pk_i}(1) \Big),$$

which implies

$$\mu_t = \sum_{h=1}^{\lambda} \big( s_{1,h}(1 + \delta_t) + s_{0,h}(1 - \delta_t) \big) \big( (v_h^2 - v_h) r_p + v_h \big) + \Big( \sum_{h=1}^{\lambda} v_h - 1 \Big) r_p.$$

Then, uj compiles Epki(µ) = (Epki(µ1), · · · , Epki(µθ)), and applies a random permutation to obtain dj = P(Epki(µ)). It finally sends a 5-tuple (pidj, certpidj, ax, dj, Signpskj(ax, dj)) to ui.

Step 3. ui checks the validity of the received 5-tuple. Then, it decrypts every ciphertext Epki(µt) in dj as follows: for Epki(µt) = (c0, · · · , cα), obtain µt by

$$\mu_t = \Big(\sum_{h=0}^{\alpha} c_h s^h\Big) \bmod p.$$

If ai,x > aj,x, ui is able to find a plaintext µt ∈ Zp with µt = 2s1,y ≤ p − 1 and computes s1,y; if ai,x < aj,x, ui is able to find µt = 2s0,y and computes s0,y.
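The plaintext relations implied by Steps 1–3 can be checked with a small simulation. This is a sketch under explicit assumptions: no encryption is performed (only the values that the ciphertexts would decrypt to are computed), elements of Rp are modeled as pairs (constant, x-coefficient), bits are ordered most significant first, P is a toy prime, and all function names are hypothetical.

```python
import random

P = 2_147_483_647  # toy prime plaintext modulus p (assumption)

def to_bits(value, theta):
    """Binary expansion of length theta, most significant bit first."""
    return [(value >> (theta - 1 - t)) & 1 for t in range(theta)]

def responder_plaintexts(bits_i, a_j, V, s1, s0):
    """Plaintexts mu_t behind the responder's theta ciphertexts (Step 2)."""
    bits_j = to_bits(a_j, len(bits_i))
    r_p = random.randrange(1, P)                 # r_p != 0
    gamma, mus = 0, []
    for b_i, b_j in zip(bits_i, bits_j):
        zeta1 = b_i - b_j                        # zeta'_t
        gamma = 2 * gamma + zeta1 * zeta1        # gamma_t = 2*gamma_{t-1} + zeta''_t
        r_b, r_a = random.randrange(P), random.randrange(1, P)  # r_t = a*x + b, a != 0
        # delta_t = zeta'_t + r_t * (gamma_t - 1), as (constant, x-coefficient)
        delta = ((zeta1 + r_b * (gamma - 1)) % P, (r_a * (gamma - 1)) % P)
        mu_c = mu_x = 0
        for h in range(len(V)):
            sel = ((V[h] * V[h] - V[h]) * r_p + V[h]) % P
            mu_c += ((s1[h] + s0[h]) + (s1[h] - s0[h]) * delta[0]) * sel
            mu_x += (s1[h] - s0[h]) * delta[1] * sel
        mu_c += (sum(V) - 1) * r_p
        mus.append((mu_c % P, mu_x % P))
    random.shuffle(mus)                          # random permutation before sending
    return mus

def initiator_recover(mus):
    """Step 3: keep the plaintext lying in Z_p (no x term) and halve it."""
    scalars = [c for c, x in mus if x == 0]
    return scalars[0] // 2 if scalars else None
```

For instance, with Vi = (0, 1, 0), s1 = (11, 22, 33), s0 = (44, 55, 66), and θ = 4, an initiator value of 9 against a responder value of 5 recovers 22 (= s1,y), while swapping the two values recovers 55 (= s0,y). Positions other than t∗ carry a non-zero x-coefficient and are therefore discarded.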

B. Effectiveness Discussion

The correctness of the iCPM can be verified as follows. If ai,x > aj,x, then there must exist a position, say the t∗-th position, in the binary expressions of ai,x and aj,x such that bi,x,t∗ = 1, bj,x,t∗ = 0 and bi,x,t′ = bj,x,t′ for all t′ < t∗. Since γt = 2γt−1 + ζ′′t, we have γt′ = 0, γt∗ = 1, and δt∗ = 1. For t′′ > t∗, we have γt′′ ≥ 2, and δt′′ is a random value due to rt′′. Since s0,y and s1,y are elements of Zp and rt is in the form of ax + b (a, b ∈ Zp, a ≠ 0), ui can always distinguish the effective plaintext from the others. The effective plaintext will be

$$\mu_{t^*} = \sum_{h=1}^{\lambda} \big( s_{1,h}(1 + \delta_{t^*}) + s_{0,h}(1 - \delta_{t^*}) \big) \big( (v_h^2 - v_h) r_p + v_h \big) + \Big( \sum_{h=1}^{\lambda} v_h - 1 \Big) r_p.$$

If the vector Vi from ui does not satisfy ∑_{h=1}^{λ} vh = 1 or vh ∈ {0, 1}, ui cannot remove the random factor rp; if Vi satisfies the conditions, only s1,y and s0,y will be involved in the computation. Because δt∗ = 1, ui can obtain µt∗ = 2s1,y ≤ p − 1 and recover s1,y. If ai,x < aj,x, we similarly have µt∗ = 2s0,y and ui can obtain s0,y.

The confidentiality of user profiles is guaranteed by the homomorphic encryption. The comparison result δt is always in the encrypted format, and δt is not directly disclosed to ui. The revealed information is either s1,y or s0,y, which is unrelated to user profiles. Therefore, the protocol transactions do not help in guessing the profiles, and full anonymity is provided. In the meantime, vector Vi is always in an encrypted format so that uj is unable to know the interested category Ty of ui. In addition, uj ensures that only one of s1,y and s0,y will be revealed to ui. The non-forgeability property is similar to that of the eCPM. ui will not lie as it makes signature Signpski(ax, di) and gives it to uj. The profile forgery attack will be detected if uj reports the signature to the TCA. Moreover, uj has no need to lie as it can achieve the same objective by simply modifying the contents of s1,y and s0,y.

VI. IMPLICIT PREDICATE-BASED APPROACH

Both the eCPM and the iCPM perform profile matching on a single attribute. For a matching involving multiple attributes, they have to be executed multiple times, each time on one attribute. In this section, we extend the iCPM to the multi-attribute case, without jeopardizing its anonymity property, and obtain an implicit Predicate-based Profile Matching protocol, i.e., iPPM. This protocol relies on a predicate, which is a logical expression made of multiple comparisons spanning distinct attributes, and thus supports sophisticated matching criteria within a single protocol run.


As shown in Fig. 4, the iPPM is composed of three main steps. In the first step, different from the iCPM, ui sends to uj n encrypted vectors of its attribute values corresponding to the attributes in A, where A (|A| = n ≤ w) is the attribute set of the predicate Π. In the second step, uj sets 2λ polynomial functions fsat,h(x), funsat,h(x) for 1 ≤ h ≤ λ. uj then generates 2λn secret shares from fsat,h(x), funsat,h(x) by choosing 1 ≤ h ≤ λ, 1 ≤ x ≤ n, and arranges them in a certain structure according to the predicate Π. For every 2λ secret shares with the same index h, similar to Step 2 of the iCPM, uj generates θ ciphertexts. uj obtains nθ ciphertexts at the end of the second step. In the third step, ui decrypts these nθ ciphertexts and finds n secret shares of s1,y and s0,y. ui can finally obtain s1,y or s0,y from the secret shares.

A. Protocol Steps

[Figure: (1) The initiator ui sets Vi = (0, …, 1, …, 0), mi = Epki(Vi), and di = (Epki(bi,x,1), …, Epki(bi,x,θ), …), and sends (mi, di) to the responder uj. (2) uj sets the predicate Π and the functions fsat and funsat; for each ax, it generates s1,h,x and s0,h,x for 1 ≤ h ≤ λ, uses bj,x,1, …, bj,x,θ together with (mi, di), and returns dj. (3) ui decrypts dj and obtains n secret shares of s1,y and s0,y.]

Fig. 4: The iPPM flow

The iPPM is obtained by combining the iCPM with a secret sharing scheme [46] to support predicate matching. The initiator ui sends its attribute values corresponding to the attributes in A to the responder uj. Without loss of generality, we assume A = {a1, · · · , an}. Then, uj defines a predicate Π = “t of {(ai,x, opt, aj,x) | ax ∈ A}”, where the comparison operator opt is either > or < and t ≤ n. The predicate contains n requirements (i.e., comparisons), each for a distinct ax. The responder uj determines λ pairs of messages (s0,h, s1,h) for attributes ah (1 ≤ h ≤ λ). The initiator ui receives s1,h if at least t of the n requirements are satisfied, or s0,h otherwise. Similar to the iCPM, Ty is determined by ui but unknown to uj. The threshold gate 1 ≤ t ≤ n is chosen by uj. When n = 1, the iPPM reduces to the iCPM. The protocol steps are given below.

Step 1. ui generates a vector Vi = (v1, · · · , vλ), where vy = 1 and vh = 0 for 1 ≤ h ≤ λ and h ≠ y, and sets mi = Epki(Vi) = (Epki(v1), · · · , Epki(vλ)). In addition, ui selects the attribute set A (|A| = n), and sends a 6-tuple (pidi, certpidi, A, di, mi, Signpski(A, di, mi)) to uj, where di contains nθ (θ = ⌈log l⌉) ciphertexts as the homomorphic encryption results of each bit of ai,x for ax ∈ A.

Step 2. uj checks the validity of the received 6-tuple (similar to Step 2 of the iCPM). It creates a predicate Π and chooses the threshold gate t. Using the secret sharing scheme [46], uj creates 2λ polynomials: fsat,h(v) = ρt−1,h v^{t−1} + · · · + ρ1,h v + s1,h and funsat,h(v) = ρ′n−t,h v^{n−t} + · · · + ρ′1,h v + s0,h for 1 ≤ h ≤ λ, where ρt−1,h, · · · , ρ1,h, ρ′n−t,h, · · · , ρ′1,h are random numbers from Z∗p. For each attribute ax ∈ A, it calculates the secret shares s1,h,x and s0,h,x as follows (s1,h,x, s0,h,x ≤ (p − 1)/2 are required):

s0,h,x = 0||funsat,h(x), s1,h,x = 1||fsat,h(x), if “ai,x > aj,x” ∈ Π;
s0,h,x = 1||fsat,h(x), s1,h,x = 0||funsat,h(x), if “ai,x < aj,x” ∈ Π.

Note that uj adds a prefix 0 or 1 to each secret share such that ui is able to differentiate the two sets of shared secrets, one for s1,h, the other for s0,h. uj runs Step 2 of the iCPM n times, each time for a distinct attribute ax ∈ A and with (s1,h,x, s0,h,x) for 1 ≤ h ≤ λ being input as s1,h and s0,h, respectively. uj then obtains dj including nθ ciphertexts. Finally, it sends a 6-tuple (pidj, certpidj, t, A, dj, Signpskj(dj)) to ui.

Step 3. ui checks the validity of the received 6-tuple. ui can obtain n secret shares, and each of these shares is either for s0,y or s1,y. It then classifies the n shares into two groups by looking at the starting bit (either ‘0’ or ‘1’). Thus, if Π is satisfied, ui can obtain at least t secret shares of s1,y and is able to recover s1,y; otherwise, it must obtain at least n − t + 1 secret shares of s0,y and can recover s0,y.
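The threshold behavior that Step 3 relies on can be sketched with standard Shamir-style sharing over Zp and Lagrange interpolation at zero. This is an illustrative sketch only: the 0/1 prefix on shares, the encryption layer, and the per-category index h are omitted, P is a toy prime, and the function names are hypothetical.

```python
import random

P = 2_147_483_647  # toy prime modulus p (assumption)

def make_shares(secret, threshold, n):
    """Evaluate a random degree-(threshold-1) polynomial with constant term
    `secret` at x = 1..n; any `threshold` shares determine the secret."""
    coeffs = [secret] + [random.randrange(1, P) for _ in range(threshold - 1)]
    return [(x, sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def recover(shares):
    """Lagrange interpolation at 0 over Z_p."""
    secret = 0
    for xi, yi in shares:
        num = den = 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        # Modular inverse via Fermat's little theorem (P is prime).
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret
```

With n = 5 and t = 2, fsat corresponds to `make_shares(s1, 2, 5)` (degree t − 1) and funsat to `make_shares(s0, 4, 5)` (degree n − t), so any 2 shares of the first or any 4 shares of the second recover the respective message.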

B. Effectiveness Discussion

The correctness of the iPPM is as follows. At Step 2, the responder uj executes Step 2 of the iCPM n times; each time, it effectively delivers only one secret share of either s0,y or s1,y to ui. When ui receives either t shares of s1,y or n − t + 1 shares of s0,y, it can recover either s1,y or s0,y. The interpolation function corresponding to the secret sharing scheme always guarantees the correctness. The anonymity and non-forgeability of the iPPM are achieved similarly to those of the iCPM and the eCPM, respectively.

VII. PERFORMANCE ANALYSIS

In this section, we analytically study the performance of the three proposed protocols eCPM, iCPM, and iPPM in terms of communication overhead and anonymity strength. When analyzing anonymity, we consider the case that users have distinct values for any given attribute. Non-distinct attribute values and the comparison operations “≥” and “≤” will be considered in our future work.

A. Communication Overhead

Let |R| be the size of one ring element in Rq. In the eCPM, the initiator and the responder both need to send ciphertexts of size 2|R|, and the communication overhead is thus subject only to the system parameter |R|.

In order to achieve full anonymity, the iCPM constructs ciphertexts through a sequence of operations. From Sec. III-A, we know |Enc(b)| = 2|R|. Thus, the communication overhead of the initiator is 2(θ + λ)|R| with θ = ⌈log l⌉. It can be seen that the initiator’s communication overhead increases with the system parameters (θ, λ). According to Sec. III-A, an addition operation of homomorphic encryption does not increase the ciphertext size, while a multiplication with inputs of two ciphertexts of lengths a|R| and b|R| outputs an (a + b − 1)|R|-length ciphertext. Thus, in the iCPM, the communication


overhead of the responder increases to 6θ|R|. We conclude that the communication overhead of the eCPM and the iCPM is fixed once the system parameters (θ, λ) are given.

The iPPM extends the iCPM by building complex predicates. From the protocol description, we observe that if a predicate includes n ≥ 1 comparisons, the communication overhead of the iPPM would be approximately n times that of the iCPM.
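The expressions in this subsection can be collected into a small calculator. It is a sketch under assumptions: overheads are reported in units of |R|, the logarithm in θ = ⌈log l⌉ is taken to base 2 (since θ counts binary bits), the iPPM figures simply scale the iCPM figures by n as the approximation above suggests, and the function name is hypothetical.

```python
from math import ceil, log2

def overhead_in_ring_elements(l, lam, n=1):
    """Ciphertext overhead per protocol, counted in ring elements (units of |R|).

    l   -- largest attribute value
    lam -- number of message categories (lambda)
    n   -- number of comparisons in the iPPM predicate
    """
    theta = ceil(log2(l))
    icpm = {"initiator": 2 * (theta + lam), "responder": 6 * theta}
    return {
        "eCPM": {"initiator": 2, "responder": 2},
        "iCPM": icpm,
        # Approximation from the text: about n times the iCPM overhead.
        "iPPM": {role: n * size for role, size in icpm.items()},
    }
```

For l = 16 and λ = 4 (so θ = 4), the iCPM initiator sends 16 ring elements and the responder 24; a predicate with n = 3 comparisons roughly triples both figures.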

B. Anonymity

Suppose that user ui is currently using pseudonym pidi to execute profile matching with others. We consider an adversary aiming to break the k-anonymity of ui. k-anonymity [47] is a classic concept for evaluating anonymity. It implies that a series of comparison results provide k-anonymity protection to a user if the user’s behavior cannot be distinguished from at least k − 1 other users. We have the following definition:

Definition 4. The k-anonymity risk level of a user is defined as the inverse of the minimum number of distinct protocol runs (MNDPR) that are required to break the user’s k-anonymity.

From this definition, the k-anonymity risk level reflects how easily the adversary can break a user’s k-anonymity. In the iCPM and the iPPM, the profile matching initiator does not reveal its attribute values to the responder, and the responder has no clue about the comparison result and only reveals the self-defined messages, which are not related to the profile. In this case, a user’s k-anonymity risk level is at its minimum, i.e., no matter how many protocol runs are executed, its k-anonymity risk level is always the lowest. Therefore, the iCPM and the iPPM both provide full anonymity (put users at minimum anonymity risk).

[Figure: the target value lies between two compared values; K and L un-compared values lie inside the anonymity set of size K + L + 1, out of I and J un-compared values on either side.]

Fig. 5: Identifying the target from others

The eCPM exposes the comparison results to users and thus obviously puts users at risk of disclosing their attribute values. Because every eCPM run is executed for a particular attribute (which is specified by the initiator), any user ui has a k-anonymity risk level on each of its individual attributes. When the “=” case happens, users have a higher anonymity level because they will be indistinguishable from other users with the same attribute values. In the following, we consider the worst case where users have distinct attribute values on a single attribute. For a given attribute ax, we assume a1,x > a2,x > · · · > aN,x, where ai,x is the value of ui on ax. In order to break ui’s k-anonymity on ax, the adversary has to make comparisons ‘aα,x > ai,x’ and ‘ai,x > aβ,x’ for β − α − 1 < k so that the anonymity set of ai,x has a size smaller than k. Let I and J respectively be the numbers of larger and smaller values on ax among all the users that have not been compared to ai,x. Let K ≤ I and L ≤ J respectively be the numbers of such un-compared values in the k-anonymity set of ai,x. The relations among I, J, K, and L are shown in Fig. 5. Assuming the contact is uniformly random, we define a recursive function f as shown in Eqn. (1).

The above function f(I, J, K, L) returns the MNDPR with respect to a user’s k-anonymity on ax in the eCPM. Thus, the user’s anonymity risk level in this case is defined as L = 1/f(I, J, K, L). Since we assumed a1,x, · · · , aN,x are sorted in descending order, the index i actually reflects the rank of ai,x among the attribute values. Fig. 6 plots the MNDPR f(I, J, K, L) and the k-anonymity risk level L in terms of 78 users’ attribute values where k = 5, 10, · · · , 25. It can be seen that a user with a median attribute value will have a lower k-anonymity risk level than those with larger or smaller values. This is reasonable because the user with a median attribute value is less distinctive from other users.

[Figure: (a) MNDPR per 78 users, plotted against attribute value rank; (b) Anonymity risk level, plotted against attribute value rank. Each panel shows curves for 5-, 10-, 15-, 20-, and 25-anonymity.]

Fig. 6: Numerical results on user anonymity risk level

VIII. PERFORMANCE ENHANCEMENT

We have derived the minimum number of distinct eCPM runs (MNDPR) required to break a user’s k-anonymity. This number is obtained under an assumption of uniformly random contact. However, in reality, users as social entities are likely to gather with others who have similar attribute values. This situation increases the user anonymity risk level quickly when profile matching is executed frequently, and the k-anonymity can be broken within a much smaller number of eCPM runs as a result. Recall that multi-pseudonym techniques are used to protect user identity and location privacy. Similar to previous work [24], [25], here we consider that pseudonyms themselves are unlinkable. In the eCPM, if a user does not change the pseudonym, the comparison results can easily be linked to break the k-anonymity. If a user changes pseudonym for each protocol run, information revealed by the protocol cannot be directly linked, and the user obtains the highest anonymity. Nevertheless, it is desirable that the user changes pseudonym only when necessary, since pseudonyms are limited resources and have associated cost [24], [25] (e.g., communication cost for obtaining them from the TCA and computation cost for generating them on the TCA). Thus, user anonymity is tightly related to pseudonym change frequency.


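$$
f(I,J,K,L)=
\begin{cases}
0, & \text{if } K+L<k-1 \text{ or } I<K \text{ or } J<L;\\[6pt]
\dfrac{I-K}{I+J}\bigl(f(I-1,J,K,L)+1\bigr)+\dfrac{J-L}{I+J}\bigl(f(I,J-1,K,L)+1\bigr) & \\[6pt]
\quad+\displaystyle\sum_{z=1}^{K}\dfrac{f(I-1,J,K-z,L)+1}{I+J}+\sum_{z=1}^{L}\dfrac{f(I,J-1,K,L-z)+1}{I+J}, & \text{otherwise.}
\end{cases}
\tag{1}
$$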
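The recurrence in Eqn. 1 can be evaluated directly with memoization. The following is a minimal sketch rather than the authors' implementation; the names `mndpr` and `risk_level` are illustrative, $k$ is passed as an explicit fifth argument, and $k \ge 2$ is assumed.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def mndpr(I: int, J: int, K: int, L: int, k: int) -> float:
    """Evaluate f(I, J, K, L) of Eqn. 1 for a given k (assumed k >= 2)."""
    # Base case of Eqn. 1: anonymity already broken or infeasible arguments.
    if K + L < k - 1 or I < K or J < L:
        return 0.0
    n = I + J
    total = 0.0
    # First two terms of Eqn. 1 (skipped when their weight is zero).
    if I > K:
        total += (I - K) / n * (mndpr(I - 1, J, K, L, k) + 1)
    if J > L:
        total += (J - L) / n * (mndpr(I, J - 1, K, L, k) + 1)
    # The two sums over z in Eqn. 1.
    for z in range(1, K + 1):
        total += (mndpr(I - 1, J, K - z, L, k) + 1) / n
    for z in range(1, L + 1):
        total += (mndpr(I, J - 1, K, L - z, k) + 1) / n
    return total


def risk_level(I: int, J: int, K: int, L: int, k: int) -> float:
    """Anonymity risk level 1/f; infinite when f evaluates to 0."""
    value = mndpr(I, J, K, L, k)
    return float("inf") if value == 0 else 1.0 / value
```

For instance, `mndpr(2, 0, 2, 0, 2)` evaluates to 1.5. The number of memoized states grows quickly with $I$ and $J$, so evaluating all 78 ranks in pure Python may be slow; a tabulated or vectorized variant would likely be preferable for that scale.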

Our goal is to improve the anonymity strength of the eCPM by combining it with a pre-adaptive pseudonym change strategy which enables users to take the necessary pseudonym change action before their k-anonymity is broken. The new version of the eCPM is referred to as eCPM+. Before presenting the pre-adaptive strategy, we first propose a post-adaptive pseudonym change strategy, where users measure their anonymity risk levels periodically and change their pseudonym after their anonymity risk levels become larger than a pre-defined threshold value.

The post-adaptive strategy assumes that a user $u_j$ as responder runs the protocol on an attribute $a_x$ with an initiator $u_i$ (recognized by seeing the same pseudonym) only once, and refuses to participate in any subsequent protocol run on the same $a_x$ with $u_i$. However, if $u_i$ has changed its pseudonym since the last protocol run with $u_j$, then $u_j$ will consider $u_i$ as a new partner and participate in the protocol. Time is divided into slots of equal duration. The neighborhood status of $u_i$ on attribute $a_x$ in a time slot is characterized by a pair of values $NS_{i,x} = (n_{i,x,s}, n_{i,x,l})$, respectively denoting the number of new partners (identified in the time slot) with attribute values smaller than $a_{i,x}$ and the number of those with attribute values larger than $a_{i,x}$. It varies over time due to user mobility and can be modeled as time series data.

The core of this strategy is the continuous measurement of the user anonymity risk level based on neighborhood status. In the eCPM, attribute values are protected, and users obtain the matching results. For every attribute $a_x$, user $u_i$ maintains the numbers $N_{i,x,s}$ and $N_{i,x,l}$ of discovered values that are smaller and larger than its own value $a_{i,x}$ since the last change of pseudonyms. These two numbers are respectively the sum of individual $n_{i,x,s}$ and the sum of $n_{i,x,l}$ corresponding to the past several time slots. Recall that $a_{i,x}$ is ranked the $i$-th largest among all $N$ users in the network. Let $I = i-1$ and $J = N-i$. $u_i$ is not able to compute the accurate MNDPR because it does not have the information of the last two arguments of function $f()$ (see Eqn. 1). The anonymity risk level of $u_i$ on $a_x$ may be estimated as $L = 1/f'(N_{i,x,s}, N_{i,x,l})$, where $f'(N_{i,x,s}, N_{i,x,l})$ approximates the MNDPR of $u_i$ regarding $a_x$ and is given as

$$
\sum_{\substack{0\le\alpha\le I-N_{i,x,s}\\ 0\le\beta\le J-N_{i,x,l}}} \Pr[(\alpha,\beta)]\cdot f(I-N_{i,x,s},\, J-N_{i,x,l},\, \alpha,\, \beta).
$$

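For simplicity, we assume that the $N_{i,x,s}$ values are randomly distributed among the $I-\alpha$ users ($0 \le \alpha \le I - N_{i,x,s}$) with larger values on $a_x$ than $u_i$ and the $N_{i,x,l}$ values are randomly distributed among the $J-\beta$ smaller-value users ($0 \le \beta \le J - N_{i,x,l}$). Thus, for $N_{i,x,s} \ge 1$ and $N_{i,x,l} \ge 1$, we have $f'(N_{i,x,s}, N_{i,x,l})$ as

$$
\sum_{\substack{0\le\alpha\le I-N_{i,x,s}\\ 0\le\beta\le J-N_{i,x,l}}}
\frac{\binom{I-\alpha-1}{N_{i,x,s}-1}\binom{J-\beta-1}{N_{i,x,l}-1}}{\binom{I}{N_{i,x,s}}\binom{J}{N_{i,x,l}}}\,
f(I-N_{i,x,s},\, J-N_{i,x,l},\, \alpha,\, \beta).
$$

For $N_{i,x,s} = 0$ and $N_{i,x,l} \ge 1$, $f'(N_{i,x,s}, N_{i,x,l})$ is

$$
\sum_{0\le\beta\le J-N_{i,x,l}}
\frac{\binom{J-\beta-1}{N_{i,x,l}-1}}{\binom{J}{N_{i,x,l}}}\cdot
f(I,\, J-N_{i,x,l},\, I,\, \beta).
$$

For $N_{i,x,s} \ge 1$ and $N_{i,x,l} = 0$, $f'(N_{i,x,s}, N_{i,x,l})$ is

$$
\sum_{0\le\alpha\le I-N_{i,x,s}}
\frac{\binom{I-\alpha-1}{N_{i,x,s}-1}}{\binom{I}{N_{i,x,s}}}\cdot
f(I-N_{i,x,s},\, J,\, \alpha,\, J).
$$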
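The three cases of $f'$ above share one structure: each side contributes a weight $\binom{T-g-1}{n-1}/\binom{T}{n}$ for leaving $g$ undiscovered users next to $u_i$, or a single fixed value when nothing has been discovered on that side. The sketch below illustrates this under stated assumptions. The names `window_prob` and `approx_mndpr` are hypothetical, the recurrence of Eqn. 1 is repeated so the block is self-contained, $N_{i,x,s}$ is paired with $I$ and $N_{i,x,l}$ with $J$ exactly as in the formulas, and the case $N_{i,x,s} = N_{i,x,l} = 0$ (not listed in the text) is assumed to reduce to $f(I, J, I, J)$.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def mndpr(I, J, K, L, k):
    """f(I, J, K, L) of Eqn. 1 for a given k (assumed k >= 2)."""
    if K + L < k - 1 or I < K or J < L:
        return 0.0
    n = I + J
    total = 0.0
    if I > K:
        total += (I - K) / n * (mndpr(I - 1, J, K, L, k) + 1)
    if J > L:
        total += (J - L) / n * (mndpr(I, J - 1, K, L, k) + 1)
    for z in range(1, K + 1):
        total += (mndpr(I - 1, J, K - z, L, k) + 1) / n
    for z in range(1, L + 1):
        total += (mndpr(I, J - 1, K, L - z, k) + 1) / n
    return total


def window_prob(total, found, gap):
    """Weight C(total-gap-1, found-1) / C(total, found) used in f'."""
    return comb(total - gap - 1, found - 1) / comb(total, found)


def side_distribution(total, found):
    """(gap, probability) pairs for one side; a fixed gap if nothing was found."""
    if found == 0:
        return [(total, 1.0)]
    return [(g, window_prob(total, found, g)) for g in range(total - found + 1)]


def approx_mndpr(I, J, n_s, n_l, k):
    """f'(N_s, N_l): weighted average of Eqn. 1 over the unknown (alpha, beta)."""
    return sum(
        pa * pb * mndpr(I - n_s, J - n_l, alpha, beta, k)
        for alpha, pa in side_distribution(I, n_s)
        for beta, pb in side_distribution(J, n_l)
    )
```

The estimated risk level is then `1 / approx_mndpr(...)` whenever the result is non-zero. The weights on each side sum to one, which follows from the identity $\sum_{g=0}^{T-n}\binom{T-g-1}{n-1} = \binom{T}{n}$.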

In the above computation, $u_i$ needs to know $N$ and its value rank $i$. The information can be obtained from the TCA when $u_i$ registers to the TCA. If users are allowed to freely leave and enter the network, they will need to de-register/re-register themselves with the TCA when leaving/joining the network. In this case, $(N, i)$ are changing, and the TCA has to be involved in the network operation in order to maintain the latest network status and update users with the latest information.

The post-adaptive strategy also relies on pseudonym lifetime for making pseudonym change decisions. Suppose that user $u_i$ is currently using pseudonym $pid_i$. The longer $pid_i$ has been used, the more private information of $u_i$ is leaked in case its anonymity has been broken. Hence, when $u_i$'s anonymity risk level $L_i$ has stayed unchanged for a certain duration, called the lifetime of $pid_i$ and denoted by $\tau(pid_i)$, $u_i$ changes its pseudonym for damage control. However, $\tau(pid_i)$ should not be given as a constant value, but subject to $L_i$. The higher $L_i$ is, the more likely it is that the anonymity of $u_i$ is broken, and therefore the smaller $\tau(pid_i)$ is. We define $\tau(pid_i) = \xi \cdot \frac{MNDPR_i}{L_i}$, where $MNDPR_i$ is obtained by Eqn. 1 and $\xi > 1$ is the pseudonym lifetime factor.

For the pre-adaptive pseudonym change strategy, each user $u_i$ initializes an ARMA model for its neighborhood status on every attribute when entering the network. Since it has $w$ attributes, the number of ARMA models to be initialized is $w$. At the end of each time slot, it measures its current neighborhood status on each attribute and updates the corresponding ARMA models. It applies the post-adaptive strategy for each attribute to determine whether to change its pseudonym. In case a pseudonym change is not suggested, it proceeds to predict the neighborhood status on all the attributes in the following time slot using the ARMA models. If one of the predicted neighborhood statuses leads to an unacceptable anonymity risk level, it changes its pseudonym; otherwise, it does not. The pre-adaptive strategy strengthens the post-adaptive strategy by one-step-ahead prediction-based decision making and generally enhances user anonymity.
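The two decision rules can be summarized in a few lines. The sketch below is only illustrative and rests on several assumptions that the text does not fix: the lifetime $\tau(pid_i)$ is measured in time slots, "unacceptable" for a predicted risk level means exceeding the same threshold `th`, and the ARMA one-step-ahead forecast is abstracted as a caller-supplied function `predict_next_risk` (no ARMA fitting is shown). All function names are hypothetical.

```python
from typing import Callable


def pseudonym_lifetime(xi: float, mndpr_i: float, risk: float) -> float:
    """tau(pid_i) = xi * MNDPR_i / L_i; requires a positive risk level."""
    if risk <= 0:
        raise ValueError("risk level must be positive")
    return xi * mndpr_i / risk


def post_adaptive_change(risk: float, unchanged_slots: int, th: float,
                         xi: float, mndpr_i: float) -> bool:
    """Change when the risk exceeds th or the pseudonym lifetime has expired."""
    if risk > th:
        return True
    # Assumption: lifetime is counted in slots with an unchanged risk level.
    return unchanged_slots >= pseudonym_lifetime(xi, mndpr_i, risk)


def pre_adaptive_change(risk: float, unchanged_slots: int, th: float,
                        xi: float, mndpr_i: float,
                        predict_next_risk: Callable[[], float]) -> bool:
    """Post-adaptive rule first; otherwise act on the predicted next-slot risk."""
    if post_adaptive_change(risk, unchanged_slots, th, xi, mndpr_i):
        return True
    # Stand-in for the ARMA-based prediction of the next neighborhood status.
    return predict_next_risk() > th
```

With several attributes, a user would evaluate the rule once per attribute and change its pseudonym if any of them returns `True`.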

IX. PERFORMANCE EVALUATION

The eCPM+ addresses accumulative anonymity risk in multiple protocol runs and tunes itself automatically to maintain the desired anonymity strength. Some previous works [21], [22] are concerned only with the anonymity risk brought by each individual protocol run, and some works [23] reduce anonymity risk by manually adjusting certain threshold values. Though they provide conditional anonymity as the eCPM does, they are not comparable to the eCPM and the eCPM+, in which the anonymity protection of users is considered in terms of consecutive protocol runs. Therefore, in this section we evaluate the eCPM+ (which uses a pre-adaptive pseudonym change strategy) in comparison with two other eCPM variants, respectively employing a constant pseudonym change interval $z$ (CONST-z) and a post-adaptive pseudonym change strategy (Post).

A. Simulation Setup

Our simulation study is based on the real trace [48] collected from 78 users attending a conference during a four-day period. A contact means that two users come close to each other and their attached Bluetooth devices detect each other. The users' Bluetooth devices run a discovery program every 120 seconds on average and logged about 128,979 contacts. Each contact is characterized by two users, a start-time, and a duration. In CONST-z, we set the pseudonym change interval $z$ from 1 to 40 (time slots); in the post-adaptive and pre-adaptive strategies, we set the pseudonym lifetime factor $\xi = 30$. In the pre-adaptive strategy, we use ARMA order (10, 5).

We use the contact data to generate user profiles. According to social community observations [49], users within the same social community often have common interests and are likely interconnected through strong social ties [11]. The stronger the tie two users have, the more likely they are to contact each other frequently. Let $f_{i,j}$ denote the number of contacts of users $u_i$ and $u_j$. We build a complete graph of users and weight each edge $(u_i, u_j)$ by $f_{i,j}$. By removing the edges with a weight smaller than 100, we obtain a graph $G$ containing 78 vertices and 2863 edges. We find all maximal cliques in $G$ using the Bron-Kerbosch algorithm [50]. A clique is a complete subgraph. A maximal clique is a clique that cannot be extended by including one more adjacent vertex. We obtain the 7550 maximal cliques $C_1, \cdots, C_{7550}$ that all contain $\ge 15$ users.
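The graph construction and clique enumeration can be sketched as follows. This is a minimal illustration using the basic Bron-Kerbosch recursion without pivoting, not the simulation code. The function names, the `contact_counts` dictionary format, and the `min_size` filter argument are assumptions made for the example.

```python
def build_graph(contact_counts, min_weight=100):
    """Adjacency sets of G: keep an edge only if its contact count >= min_weight."""
    adj = {}
    for (u, v), count in contact_counts.items():
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if count >= min_weight:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def maximal_cliques(adj, min_size=1):
    """Yield every maximal clique (as a frozenset) with at least min_size vertices."""
    def expand(r, p, x):
        # r: current clique, p: candidate vertices, x: already processed vertices.
        if not p and not x:
            if len(r) >= min_size:
                yield frozenset(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


# Toy usage with hypothetical contact counts.
contacts = {("a", "b"): 150, ("a", "c"): 120, ("b", "c"): 200,
            ("c", "d"): 130, ("a", "d"): 40}
graph = build_graph(contacts)
cliques = set(maximal_cliques(graph))
```

In the toy data the edge between `a` and `d` falls below the threshold, so the maximal cliques are `{a, b, c}` and `{c, d}`. In the setting of the text, `min_size=15` would correspond to keeping only cliques with at least 15 users.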

Without loss of generality, we assume that these cliques are sorted in descending order of the weight sum of their edges (the weight sum of $C_1$ is the largest). We then construct communities in the following way. Scan the sequence of cliques from $C_1$ to $C_{7550}$. For a scanned clique $C_i$, find a clique $C_j$ that has been previously scanned and identified as a core clique and contains $\ge 80\%$ of the vertices of $C_i$. If there are multiple such cliques, take the one with the largest weight sum as $C_j$. If $C_j$ is found, assign $C_i$ the same attribute as $C_j$; otherwise, generate a new attribute, assign it to $C_i$, and mark $C_i$ as a core clique. After the attribute generation and assignment, merge the cliques with the same attribute into a community. A community contains multiple users, and a user may belong to multiple communities. From the above settings, we generate 349 attributes and thus obtain 349 communities. However, we concentrate on the first 100 generated attributes and their corresponding communities for simplicity. On average, each of these considered communities contains 28 users, and each user belongs to 38 communities.
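The scan over cliques described above can be sketched as below. The function name `build_communities`, the `weight_sum` callable, and the toy data are assumptions for illustration. Because cliques are scanned in descending weight order, the first matching core clique is also the one with the largest weight sum.

```python
def build_communities(cliques, weight_sum, overlap=0.8):
    """Group cliques into communities via core cliques.

    cliques: iterable of frozensets of users.
    weight_sum: function mapping a clique to the weight sum of its edges.
    Returns a list of user sets; the list index plays the role of the attribute.
    """
    ordered = sorted(cliques, key=weight_sum, reverse=True)
    cores = []        # core cliques, in the order their attributes were generated
    communities = []  # communities[a]: union of all cliques assigned attribute a
    for clique in ordered:
        match = next(
            (a for a, core in enumerate(cores)
             if len(core & clique) >= overlap * len(clique)),
            None,
        )
        if match is None:
            # No core clique covers enough of this clique: new attribute.
            cores.append(clique)
            communities.append(set(clique))
        else:
            communities[match] |= clique
    return communities


# Toy usage with hypothetical cliques and weight sums.
A = frozenset(range(1, 11))
B = frozenset(list(range(1, 10)) + [11])          # shares 9 of 10 users with A
C = frozenset({20, 21, 22})
D = frozenset({1, 2, 3, 4, 5, 30, 31, 32, 33, 34})  # shares 5 of 10 users with A
weights = {A: 100, B: 90, D: 80, C: 70}
result = build_communities([C, D, B, A], weights.get)
```

Here `B` joins the community of `A`, while `D` and `C` each start a new one.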

Fig. 7: Anonymity break period under the constant strategy. (a) z = 1; (b) z = 20. Horizontal axis: attribute value rank; vertical axis: break ratio. Curves: 10-anonymity and 30-anonymity.

Afterwards, we assign values to each user in $G$ for these 100 attributes. For an attribute $a_x$, we find the corresponding community $C_x$ and do the following. For each user in $C_x$, we compute the weight sum of its incident edges in $C_x$; for each vertex outside $C_x$, we compute the weight sum of its incident edges to the vertices in $C_x$; then, we sort all the users in decreasing order of their weight sums and assign their values on $a_x$ as $(78, 77, \cdots, 1)$. This assignment method is reasonable because a large weight sum indicates a large interest in communicating with users in $C_x$ and thus a strong background in the aspect represented by $a_x$.
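The value assignment amounts to ranking users by their total edge weight towards the community. A minimal sketch follows, in which the function name, the edge dictionary keyed by `frozenset` pairs, and the toy weights are assumptions. The text does not specify how ties are broken, so this sketch simply keeps the input order for equal weight sums.

```python
def assign_values(users, community, edge_weight):
    """Map each user to a value N, N-1, ..., 1 by weight sum towards the community.

    edge_weight: dict from frozenset({u, v}) to the weight of edge (u, v) in G;
    pairs without an edge in G count as weight 0.
    """
    def weight_to_community(u):
        return sum(edge_weight.get(frozenset((u, v)), 0)
                   for v in community if v != u)

    ranked = sorted(users, key=weight_to_community, reverse=True)
    n = len(ranked)
    return {u: n - position for position, u in enumerate(ranked)}


# Toy usage with hypothetical edge weights.
edges = {frozenset(("a", "b")): 300, frozenset(("a", "c")): 150,
         frozenset(("b", "c")): 120, frozenset(("b", "d")): 110}
values = assign_values(["a", "b", "c", "d"], {"a", "b", "c"}, edges)
```

With 78 users this produces the values $(78, 77, \cdots, 1)$ used in the text.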

Our simulation spans 10,000 time slots, each lasting 30 seconds, and focuses on a randomly selected attribute. Users can change their pseudonym at the beginning of each time slot. The pseudonym is corrupted in terms of k-anonymity (on the selected attribute) if there are fewer than $k-1$ other users in the network that will obtain the same matching results in the same protocol settings. A user experiences an anonymity break (on the selected attribute) if it is using a corrupted pseudonym.
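One way to read the corruption criterion is to count how many other users would have produced exactly the same sequence of comparison outcomes against the values that a pseudonym has been matched with. The sketch below encodes that reading. The function names and the representation of "protocol settings" as a plain list of compared values are assumptions, not the simulator's actual data model.

```python
def same_result_count(values, user, compared_values):
    """Number of other users whose comparison outcomes equal those of `user`."""
    def outcomes(a):
        # +1 if a is larger than the compared value, -1 if smaller, 0 if equal.
        return tuple((a > b) - (a < b) for b in compared_values)

    target = outcomes(values[user])
    return sum(1 for other, a in values.items()
               if other != user and outcomes(a) == target)


def is_corrupted(values, user, compared_values, k):
    """True if fewer than k-1 other users would obtain the same matching results."""
    return same_result_count(values, user, compared_values) < k - 1


# Toy usage: ten users with values 1..10; user "u5" was compared against 3 and 8.
values = {f"u{v}": v for v in range(1, 11)}
others = same_result_count(values, "u5", [3, 8])
```

In the toy case only the users with values 4, 6, and 7 are indistinguishable from `u5`, so the pseudonym is corrupted for $k = 5$ but not for $k = 4$.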

B. Simulation Results

Figure 7 shows the anonymity break period experienced by each user with the constant strategy being used. It can be seen that when $z = 1$, each user experiences the shortest anonymity break period at the cost of 10,000 pseudonyms per user. Anonymity break is still possible in this extreme case because users may have multiple contacts within a single time slot while they are still using the same pseudonym. If a user has a more restrictive anonymity requirement (e.g., from 10-anonymity to 30-anonymity) or uses a larger pseudonym change interval (from 1 time slot to 20 time slots), it will have more corrupted pseudonyms and thus suffer a longer period of anonymity break.

Fig. 8: Neighborhood status over time. (a) The 7th user; (b) The 32nd user. Horizontal axis: time slots; vertical axis: number of ranked users. Series: smaller values and larger values.


Fig. 9: Anonymity risk level over time (th = 0.15). (a) Time period (2000, 3200), annotated with the intervals where the pseudonym is changed most frequently and less frequently; (b) Time period (8200, 9400), annotated with pseudonym changes made beyond the threshold and beyond the maximum duration. Horizontal axis: time slot; vertical axis: 10-anonymity risk level.

The neighborhood status of a user on a given attribute is characterized by the number of neighbors with larger values and the number of neighbors with smaller values. We investigate the regularity of the neighborhood status of individual users over time and justify the effectiveness of the pre-adaptive strategy. To do so, we randomly choose two users, ranked respectively the 7th and the 32nd. Figure 8 shows their neighborhood status. The 7th user's neighborhood status exhibits regular change, i.e., the number of neighbors with larger values stays stable, and that of neighbors with smaller values decreases linearly over time. For the 32nd user, the number of users with larger values and the number of users with smaller values both decrease.

We choose the 32nd user, who in general has a lower anonymity risk level than the 7th user, and show its 10-anonymity risk level in two time periods (2000, 3200) and (8200, 9400) with the post-adaptive strategy in Fig. 9. The anonymity risk level threshold is th = 0.15. In the figure, a drop from a high risk level to a low risk level indicates a pseudonym change. Recall that a user changes its pseudonym not only when the anonymity risk level is beyond the threshold th but also when its current pseudonym expires. This is reflected by the anonymity risk level drops that happen below the threshold line in the figure. Comparing with Fig. 8, we can see that the pseudonym change frequency is high when the user encounters a large number of neighbors. This is reasonable as a large number of profile matching runs are executed in this case, and the user's anonymity risk level grows quickly. When the level is beyond a pre-defined threshold, the user changes its pseudonym.

Figure 10 shows the performance of the constant, the post-adaptive and the pre-adaptive strategies respectively for 5-anonymity and 10-anonymity, in relation to threshold th. The results are obtained with respect to the 32nd user. For the constant strategy, multiple lines are plotted, respectively corresponding to $z = \{1, 2, 4, 10, 20, 40\}$. As $z$ goes up, the user consumes a decreasing number of pseudonyms and has an increasing break ratio (the ratio of the number of time slots in which the k-anonymity of the 32nd user is broken to 10,000). It can be seen that the number of pseudonyms consumed by the post-adaptive and pre-adaptive strategies is much smaller than that of the constant strategy. For example, in the case of 5-anonymity and th = 0.0763, the post-adaptive strategy spends 369 pseudonyms and results in a 514 time slot anonymity break period, i.e., a 0.0514 break ratio. The constant strategy consumes 500 (> 369) pseudonyms and has a 0.0540 (> 0.0514) break ratio. The post-adaptive strategy outperforms the constant strategy in anonymity protection by using fewer pseudonyms to achieve a smaller break ratio. Similar phenomena are observed for other th values and the 10-anonymity scenario as well. In particular, we find that, as expected, the pre-adaptive strategy leads to yet better anonymity performance than the post-adaptive one. Fig. 10 shows that in the case of 5-anonymity and th = 0.0763, the pre-adaptive strategy consumes 449 (> 369) pseudonyms and results in a 0.0445 (< 0.0514) break ratio. The pre-adaptive strategy consumes slightly more pseudonyms, but achieves a significantly shorter anonymity break period.

Fig. 10: Pseudonyms and break ratio (the 32nd user). (a) # of pseudonyms for 5-anonymity; (b) 5-anonymity break period; (c) # of pseudonyms for 10-anonymity; (d) 10-anonymity break period. Horizontal axis: threshold; vertical axis: number of pseudonyms in (a) and (c), break ratio in (b) and (d). Curves: CONST-1, CONST-2, CONST-4, CONST-10, CONST-20, CONST-40, Post, and Pre (eCPM+). In (a) and (c), CONST-1 (10000) and CONST-2 (5000) are marked beyond the plotted range.
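The quantities compared above follow from simple ratios over the 10,000 simulated time slots. As a small worked illustration (the helper names are hypothetical, and the constant strategy is assumed to use exactly one pseudonym per interval of $z$ slots):

```python
TOTAL_SLOTS = 10_000  # length of the simulation in time slots


def break_ratio(broken_slots: int, total_slots: int = TOTAL_SLOTS) -> float:
    """Fraction of time slots in which the user's k-anonymity is broken."""
    return broken_slots / total_slots


def const_pseudonyms(z: int, total_slots: int = TOTAL_SLOTS) -> int:
    """Pseudonyms used by CONST-z, assuming one new pseudonym every z slots."""
    return total_slots // z


post_ratio = break_ratio(514)        # post-adaptive, 5-anonymity, th = 0.0763
const20_count = const_pseudonyms(20)  # constant strategy with z = 20
```

Under this assumption, a 514-slot break period corresponds to the reported 0.0514 break ratio, and the 500 pseudonyms quoted for the constant strategy match $z = 20$.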

X. CONCLUSION

We have investigated a unique comparison-based profile matching problem in Mobile Social Networks (MSNs), and proposed novel protocols to solve it. The explicit Comparison-based Profile Matching (eCPM) protocol provides conditional anonymity. It reveals the comparison result to the initiator. Considering k-anonymity as a user requirement, we analyze the anonymity risk level in relation to the pseudonym change for consecutive eCPM runs. We have further introduced an enhanced version of the eCPM, i.e., eCPM+, by exploiting the prediction-based strategy and adopting the pre-adaptive pseudonym change. The effectiveness of the eCPM+ is validated through extensive simulations using real-trace data. We have also devised two protocols with full anonymity, i.e., implicit Comparison-based Profile Matching (iCPM) and implicit Predicate-based Profile Matching (iPPM). The iCPM handles profile matching based on a single comparison of an attribute while the iPPM is implemented with a logical expression made of multiple comparisons spanning multiple attributes. The iCPM and the iPPM both enable users to anonymously request messages and respond to the requests according to the profile matching result, without disclosing any profile information.

In the current version of the iCPM and the iPPM, we implement “>” and “<” operations for profile matching. One direction for future work is to extend them to support more operations, such as “≥” and “≤”. Another is to hide the predicate information in the iPPM. Currently, the responder needs to transmit the threshold value of the predicate to the initiator, which may reveal partial information about the responder's interest. Restricting the disclosure of such parameters will be of significance for advancing the comparison-based family of profile matching protocols and warrants deep investigation.

REFERENCES

[1] “Comscore,” http://www.comscoredatamine.com/.

[2] A. G. Miklas, K. K. Gollu, K. K. W. Chan, S. Saroiu, P. K. Gummadi, and E. de Lara, “Exploiting social interactions in mobile systems,” in Ubicomp, 2007, pp. 409–428.

[3] S. Ioannidis, A. Chaintreau, and L. Massoulie, “Optimal and scalable distribution of content updates over a mobile social network,” in Proc. IEEE INFOCOM, 2009, pp. 1422–1430.

[4] R. Lu, X. Lin, and X. Shen, “Spring: A social-based privacy-preserving packet forwarding protocol for vehicular delay tolerant networks,” in Proc. IEEE INFOCOM, 2010, pp. 632–640.

[5] W. He, Y. Huang, K. Nahrstedt, and B. Wu, “Message propagation in ad-hoc-based proximity mobile social networks,” in PERCOM workshops, 2010, pp. 141–146.

[6] D. Niyato, P. Wang, W. Saad, and A. Hjørungnes, “Controlled coalitional games for cooperative mobile social networks,” IEEE Transactions on Vehicular Technology, vol. 60, no. 4, pp. 1812–1824, 2011.

[7] M. Motani, V. Srinivasan, and P. Nuggehalli, “Peoplenet: engineering a wireless virtual social network,” in MobiCom, 2005, pp. 243–257.

[8] M. Brereton, P. Roe, M. Foth, J. M. Bunker, and L. Buys, “Designing participation in agile ridesharing with mobile social software,” in OZCHI, 2009, pp. 257–260.

[9] E. Bulut and B. Szymanski, “Exploiting friendship relations for efficient routing in delay tolerant mobile social networks,” IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 12, pp. 2254–2265, 2012.

[10] Z. Yang, B. Zhang, J. Dai, A. C. Champion, D. Xuan, and D. Li, “E-smalltalker: A distributed mobile system for social networking in physical proximity,” in ICDCS, 2010, pp. 468–477.

[11] Q. Li, S. Zhu, and G. Cao, “Routing in socially selfish delay tolerant networks,” in Proc. IEEE INFOCOM, 2010, pp. 857–865.

[12] X. Liang, X. Li, T. H. Luan, R. Lu, X. Lin, and X. Shen, “Morality-driven data forwarding with privacy preservation in mobile social networks,” IEEE Transactions on Vehicular Technology, vol. 61, no. 7, pp. 3209–3222, 2012.

[13] R. Gross, A. Acquisti, and H. J. H. III, “Information revelation and privacy in online social networks,” in WPES, 2005, pp. 71–80.

[14] F. Stutzman, “An evaluation of identity-sharing behavior in social network communities,” iDMAa Journal, vol. 3, no. 1, pp. 10–18, 2006.

[15] K. P. N. Puttaswamy, A. Sala, and B. Y. Zhao, “Starclique: guaranteeing user privacy in social networks against intersection attacks,” in CoNEXT, 2009, pp. 157–168.

[16] E. Zheleva and L. Getoor, “To join or not to join: the illusion of privacy in social networks with mixed public and private user profiles,” in WWW, 2009, pp. 531–540.

[17] G. Chen and F. Rahman, “Analyzing privacy designs of mobile social networking applications,” IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, vol. 2, pp. 83–88, 2008.

[18] D. Balfanz, G. Durfee, N. Shankar, D. K. Smetters, J. Staddon, and H.-C. Wong, “Secret handshakes from pairing-based key agreements,” in IEEE Symposium on Security and Privacy, 2003, pp. 180–196.

[19] M. J. Freedman, K. Nissim, and B. Pinkas, “Efficient private matching and set intersection,” in EUROCRYPT, 2004, pp. 1–19.

[20] R. Lu, X. Lin, X. Liang, and X. Shen, “A secure handshake scheme with symptoms-matching for mhealthcare social network,” ACM Mobile Networks and Applications (MONET), vol. 16, no. 6, pp. 683–694, 2011.

[21] M. Li, N. Cao, S. Yu, and W. Lou, “Findu: Privacy-preserving personal profile matching in mobile social networks,” in Proc. IEEE INFOCOM, 2011, pp. 2435–2443.

[22] R. Zhang, Y. Zhang, J. Sun, and G. Yan, “Fine-grained private matching for proximity-based mobile social networking,” in Proc. IEEE INFOCOM, 2012, pp. 1969–1977.

[23] W. Dong, V. Dave, L. Qiu, and Y. Zhang, “Secure friend discovery in mobile social networks,” in Proc. IEEE INFOCOM, 2011, pp. 1647–1655.

[24] J. Freudiger, M. H. Manshaei, J.-P. Hubaux, and D. C. Parkes, “On non-cooperative location privacy: a game-theoretic analysis,” in ACM CCS, 2009, pp. 324–337.

[25] R. Lu, X. Lin, H. Luan, X. Liang, and X. Shen, “Pseudonym changing at social spots: An effective strategy for location privacy in vanets,” IEEE Transactions on Vehicular Technology, vol. 61, no. 1, pp. 86–96, 2011.

[26] J. Katz, A. Sahai, and B. Waters, “Predicate encryption supporting disjunctions, polynomial equations, and inner products,” in EUROCRYPT, 2008, pp. 146–162.

[27] N. Eagle and A. Pentland, “Social serendipity: mobilizing social software,” IEEE Pervasive Computing, vol. 4, no. 2, pp. 28–34, 2005.

[28] J. Teng, B. Zhang, X. Li, X. Bai, and D. Xuan, “E-shadow: Lubricating social interaction using mobile phones,” in ICDCS, 2011, pp. 909–918.

[29] B. Han and A. Srinivasan, “Your friends have more friends than you do: identifying influential mobile users through random walks,” in MobiHoc, 2012, pp. 5–14.

[30] Z. Yang, B. Zhang, J. Dai, A. C. Champion, D. Xuan, and D. Li, “E-smalltalker: A distributed mobile system for social networking in physical proximity,” in ICDCS, 2010, pp. 468–477.

[31] L. Kissner and D. X. Song, “Privacy-preserving set operations,” in CRYPTO, 2005, pp. 241–257.

[32] Q. Ye, H. Wang, and J. Pieprzyk, “Distributed private matching and set operations,” in ISPEC, 2008, pp. 347–360.

[33] D. Dachman-Soled, T. Malkin, M. Raykova, and M. Yung, “Efficient robust private set intersection,” in ACNS, 2009, pp. 125–142.

[34] S. Jarecki and X. Liu, “Efficient oblivious pseudorandom function with applications to adaptive ot and secure computation of set intersection,” in TCC, 2009, pp. 577–594.

[35] C. Hazay and Y. Lindell, “Efficient protocols for set intersection and pattern matching with security against malicious and covert adversaries,” Journal of Cryptology, vol. 23, no. 3, pp. 422–456, 2010.

[36] B. Goethals, S. Laur, H. Lipmaa, and T. Mielikainen, “On private scalar product computation for privacy-preserving data mining,” in ICISC, 2004, pp. 104–120.

[37] A. C.-C. Yao, “Protocols for secure computations (extended abstract),” in FOCS, 1982, pp. 160–164.

[38] O. Goldreich, S. Micali, and A. Wigderson, “How to play any mental game or a completeness theorem for protocols with honest majority,” in STOC, 1987, pp. 218–229.

[39] I. Ioannidis, A. Grama, and M. J. Atallah, “A secure protocol for computing dot-products in clustered and distributed environments,” in ICPP, 2002, pp. 379–384.

[40] I. F. Blake and V. Kolesnikov, “Strong conditional oblivious transfer and computing on intervals,” in ASIACRYPT, 2004, pp. 515–529.

[41] P. Paillier, “Public-key cryptosystems based on composite degree residuosity classes,” in EUROCRYPT, 1999, pp. 223–238.

[42] M. Naehrig, K. Lauter, and V. Vaikuntanathan, “Can homomorphic encryption be practical?” in CCSW, 2011, pp. 113–124.

[43] R. Lu, X. Liang, X. Li, X. Lin, and X. Shen, “Eppa: An efficient and privacy-preserving aggregation scheme for secure smart grid communications,” IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 9, pp. 1621–1631, 2012.

[44] H. Lütkepohl, New introduction to multiple time series analysis. Springer, 2005.

[45] X. Liang, X. Li, Q. Shen, R. Lu, X. Lin, X. Shen, and W. Zhuang, “Exploiting prediction to enable secure and reliable routing in wireless body area networks,” in Proc. IEEE INFOCOM, 2012, pp. 388–396.

[46] A. Shamir, “How to share a secret,” Communications of the ACM, vol. 22, no. 11, pp. 612–613, 1979.

[47] L. Sweeney, “k-anonymity: A model for protecting privacy,” International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 10, no. 5, pp. 557–570, 2002.

[48] J. Scott, R. Gass, J. Crowcroft, P. Hui, C. Diot, and A. Chaintreau, “CRAWDAD trace cambridge/haggle/imote/infocom (v. 2006-01-31),” Jan. 2006.

[49] D. J. Watts, “Small worlds: The dynamics of networks between order and randomness,” J. Artificial Societies and Social Simulation, vol. 6, no. 2, 2003.

[50] C. Bron and J. Kerbosch, “Finding all cliques of an undirected graph (algorithm 457),” Communications of the ACM, vol. 16, no. 9, pp. 575–576, 1973.


Xiaohui Liang (IEEE S’10) received the B.Sc. degree in Computer Science and Engineering and the M.Sc. degree in Computer Software and Theory from Shanghai Jiao Tong University (SJTU), China, in 2006 and 2009, respectively. He is currently working toward a Ph.D. degree in the Department of Electrical and Computer Engineering, University of Waterloo, Canada. His research interests include applied cryptography, and security and privacy issues for e-healthcare system, cloud computing, mobile social networks, and smart grid.

Xu Li is a research engineer at Huawei Technologies Canada. Prior to joining Huawei, he worked at Inria, France (2011-2012) as research scientist, and at the University of Waterloo (2010-2011) and the University of Ottawa (2009-2010) as post-doc fellow. He received a PhD (2008) degree from Carleton University, an M.Sc. (2005) degree from the University of Ottawa, and a B.Sc. (1998) degree from Jilin University, China, all in computer science. During 2004.1-8, he held a visiting researcher position at National Research Council Canada. His research interests are in next-generation wireless networks, with over 70 refereed publications. He is on the editorial boards of the IEEE Transactions on Parallel and Distributed Systems, the Wiley Transactions on Emerging Telecommunications Technologies, Ad Hoc & Sensor Wireless Networks, and Parallel and Distributed computing and Networks. He is/was a guest editor of a number of international archive journals. He was a recipient of NSERC PDF awards and a number of other awards.

Kuan Zhang received the B.Sc. degree in Electrical and Computer Engineering and the M.Sc. degree in Computer Science from Northeastern University (NEU), China, in 2009 and 2011, respectively. He is currently working toward a Ph.D. degree in the Department of Electrical and Computer Engineering, University of Waterloo, Canada. His research interests include packet forwarding, and security and privacy for mobile social networks.

Rongxing Lu (IEEE S’09-M’11) received the Ph.D. degree in computer science from Shanghai Jiao Tong University, Shanghai, China in 2006 and the Ph.D. degree in electrical and computer engineering from the University of Waterloo, Waterloo, ON, Canada, in 2012. He is currently a Postdoctoral Fellow with the Broadband Communications Research (BBCR) Group, University of Waterloo. His research interests include wireless network security, applied cryptography, and trusted computing.

Xiaodong Lin (IEEE S’07-M’09) received the Ph.D. degree in information engineering from Beijing University of Posts and Telecommunications, Beijing, China, in 1998 and the Ph.D. degree (with Outstanding Achievement in Graduate Studies Award) in electrical and computer engineering from the University of Waterloo, Waterloo, ON, Canada, in 2008. He is currently an assistant professor of information security with the Faculty of Business and Information Technology, University of Ontario Institute of Technology, Oshawa, ON, Canada. His research interests include wireless network security, applied cryptography, computer forensics, software security, and wireless networking and mobile computing. Dr. Lin was the recipient of a Natural Sciences and Engineering Research Council of Canada (NSERC) Canada Graduate Scholarships (CGS) Doctoral and the Best Paper Awards of the 18th International Conference on Computer Communications and Networks (ICCCN 2009), the 5th International Conference on Body Area Networks (BodyNets 2010), and IEEE International Conference on Communications (ICC 2007).

Xuemin (Sherman) Shen (IEEE M’97-SM’02-F09) received the B.Sc. (1982) degree from Dalian Maritime University (China) and the M.Sc. (1987) and Ph.D. degrees (1990) from Rutgers University, New Jersey (USA), all in electrical engineering. He is a Professor and University Research Chair, Department of Electrical and Computer Engineering, University of Waterloo, Canada. He was the Associate Chair for Graduate Studies from 2004 to 2008. Dr. Shen’s research focuses on resource management in interconnected wireless/wired networks, wireless network security, wireless body area networks, vehicular ad hoc and sensor networks. He is a co-author/editor of six books, and has published more than 600 papers and book chapters in wireless communications and networks, control and filtering. Dr. Shen served as the Technical Program Committee Chair for IEEE VTC’10 Fall, the Symposia Chair for IEEE ICC’10, the Tutorial Chair for IEEE VTC’11 Spring and IEEE ICC’08, the Technical Program Committee Chair for IEEE Globecom’07, the General Co-Chair for Chinacom’07 and QShine’06, the Chair for IEEE Communications Society Technical Committee on Wireless Communications, and P2P Communications and Networking. He also serves/served as the Editor-in-Chief for IEEE Network, Peer-to-Peer Networking and Application, and IET Communications; a Founding Area Editor for IEEE Transactions on Wireless Communications; an Associate Editor for IEEE Transactions on Vehicular Technology, Computer Networks, and ACM/Wireless Networks, etc.; and the Guest Editor for IEEE JSAC, IEEE Wireless Communications, IEEE Communications Magazine, and ACM Mobile Networks and Applications, etc. Dr. Shen received the Excellent Graduate Supervision Award in 2006, and the Outstanding Performance Award in 2004, 2007 and 2010 from the University of Waterloo, the Premier’s Research Excellence Award (PREA) in 2003 from the Province of Ontario, Canada, and the Distinguished Performance Award in 2002 and 2007 from the Faculty of Engineering, University of Waterloo. Dr. Shen is a registered Professional Engineer of Ontario, Canada, an IEEE Fellow, an Engineering Institute of Canada Fellow, a Canadian Academy of Engineering Fellow, and a Distinguished Lecturer of IEEE Vehicular Technology Society and Communications Society. Dr. Shen has been a guest professor of Tsinghua University, Shanghai Jiao Tong University, Zhejiang University, Beijing Jiao Tong University, Northeast University, etc.