
IEEE INTERNET OF THINGS JOURNAL, VOL. 3, NO. 2, APRIL 2016

EPLQ: Efficient Privacy-Preserving Location-Based Query Over Outsourced Encrypted Data

Lichun Li, Rongxing Lu, Senior Member, IEEE, and Cheng Huang

Abstract—With the pervasiveness of smart phones, location-based services (LBS) have received considerable attention and become more popular and vital recently. However, the use of LBS also poses a potential threat to user’s location privacy. In this paper, aiming at spatial range query, a popular LBS providing information about points of interest (POIs) within a given distance, we present an efficient and privacy-preserving location-based query solution, called EPLQ. Specifically, to achieve privacy-preserving spatial range query, we propose the first predicate-only encryption scheme for inner product range (IPRE), which can be used to detect whether a position is within a given circular area in a privacy-preserving way. To reduce query latency, we further design a privacy-preserving tree index structure in EPLQ. Detailed security analysis confirms the security properties of EPLQ. In addition, extensive experiments are conducted, and the results demonstrate that EPLQ is very efficient in privacy-preserving spatial range query over outsourced encrypted data. In particular, for a mobile LBS user using an Android phone, around 0.9 s is needed to generate a query, and it also only requires a commodity workstation, which plays the role of the cloud in our experiments, a few seconds to search POIs.

Index Terms—Location-based services (LBS), outsourced encrypted data, privacy-enhancing technology, spatial range query.

I. INTRODUCTION

A FEW decades ago, location-based services (LBS) were used in military only. Today, thanks to advances in information and communication technologies, more kinds of LBS have appeared, and they are very useful for not only organizations but also individuals. Let us take the spatial range query, one kind of LBS that we will focus on in this paper, as an example. Spatial range query is a widely used LBS, which allows a user to find points of interest (POIs) within a given distance to his/her location, i.e., the query point. As illustrated in Fig. 1, with this kind of LBS, a user could obtain the records of all restaurants within walking distance (say 500 m). Then, the user can go through these records to find a desirable restaurant considering price and reviews.

While LBS are popular and vital, most of these services today including spatial range query require users to submit their locations, which raises serious concerns about the leaking and misusing of user location data. For example, criminals may utilize the data to track potential victims and predict their locations. For another example, some sensitive location data of organization users may involve trade secret or national security. Protecting the privacy of user location in LBS has attracted considerable interest. However, significant challenges still remain in the design of privacy-preserving LBS, and new challenges arise particularly due to data outsourcing. In recent years, there is a growing trend of outsourcing data including LBS data because of its financial and operational benefits. Lying at the intersection of mobile computing and cloud computing, designing privacy-preserving outsourced spatial range query faces the challenges below.

Manuscript received June 16, 2015; revised July 23, 2015; accepted August 10, 2015. Date of publication August 17, 2015; date of current version March 23, 2016.

The authors are with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798 (e-mail: [email protected]; [email protected]; [email protected]).

Digital Object Identifier 10.1109/JIOT.2015.2469605

Fig. 1. Example of spatial range query.

1) Challenge on querying encrypted LBS data. The LBS provider is not willing to disclose its valuable LBS data to the cloud. As illustrated in Fig. 2, the LBS provider encrypts and outsources private LBS data to the cloud, and LBS users query the encrypted data in the cloud. As a result, querying encrypted LBS data without privacy breach is a big challenge, and we need to protect not only the user locations from the LBS provider and cloud but also LBS data from the cloud.

2) Challenge on the resource consumption in mobile devices. Many LBS users are mobile users, and their terminals are smart phones with very limited resources. However, the cryptographic or privacy-enhancing techniques used to realize privacy-preserving query usually result in high computational cost and/or storage cost at user side.

3) Challenge on the efficiency of POI searching. Spatial range query is an online service, and LBS users are sensitive to query latency. To provide good user experiences, the POI search performed at the cloud side must be done in a short time (e.g., a few seconds at most). Again, the techniques used to realize privacy-preserving query usually increase the search latency.

4) Challenge on security. LBS data are about POIs in real world. It is reasonable to assume that the attacker may have some knowledge about original LBS data. With such knowledge, known-sample attacks are possible (elaborated later in Section II).

Fig. 2. System model of outsourced LBS under consideration.

2327-4662 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Several solutions for privacy-preserving spatial range query have been proposed recently [1]–[6]. However, as elaborated in Section VIII later, existing solutions cannot address all the above challenges. Aiming at these, in this paper, we propose an efficient solution for privacy-preserving spatial range query named EPLQ, which allows queries over encrypted LBS data without disclosing user locations to the cloud or LBS provider. To protect the privacy of user location in EPLQ, we design a novel predicate-only encryption scheme for inner product range (IPRE scheme for short), which, to the best of our knowledge, is the first predicate/predicate-only scheme of this kind. To improve the performance, we also design a privacy-preserving index structure named $\widehat{ss}$-tree. Specifically, the main contributions of this paper are threefold.

1) We propose IPRE, which allows testing whether the inner product of two vectors is within a given range without disclosing the vectors. In predicate encryption, the key corresponding to a predicate $f$ can decrypt a ciphertext if and only if the attribute $x$ of the ciphertext satisfies the predicate, i.e., $f(x) = 1$. Predicate-only encryption is a special type of predicate encryption not designed for encrypting/decrypting messages. Instead, it reveals only whether $f(x) = 1$ or not. Predicate-only encryption schemes supporting different types of predicates [7], [8] have been proposed for privacy-preserving query on outsourced data. To the best of our knowledge, no existing predicate/predicate-only scheme supports inner product range. Though our scheme is used for privacy-preserving spatial range query in this paper, it may be applied in other applications as well.

2) We propose EPLQ, an efficient solution for privacy-preserving spatial range query. In particular, we show that whether a POI matches a spatial range query or not can be tested by examining whether the inner product of two vectors is in a given range. The two vectors contain the location information of the POI and the query, respectively. Based on this discovery and our IPRE scheme, spatial range query without leaking location information can be achieved. To avoid scanning all POIs to find matched POIs, we further exploit a novel index structure named $\widehat{ss}$-tree, which conceals sensitive location information with our IPRE scheme. Experiments on our implementation demonstrate that our solution is very efficient. Moreover, security analysis shows that EPLQ is secure under known-sample attacks and ciphertext-only attacks.

3) Our techniques can be used for more kinds of privacy-preserving queries over outsourced data. In the spatial range query discussed in this work, we consider Euclidean distance, which is widely used in spatial databases. Our IPRE scheme and $\widehat{ss}$-tree may be used for searching records within a given weighted Euclidean distance or great-circle distance as well. Weighted Euclidean distance is used to measure the dissimilarity in many kinds of data, while great-circle distance is the distance of two points on the surface of a sphere. Using great-circle distance instead of Euclidean distance for long distances on the surface of earth is more accurate. By supporting these two kinds of distances, privacy-preserving similarity query and long spatial range query can also be realized.
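The observation in contribution 2) can be illustrated with plaintext vectors. The following is a minimal sketch, not the attribute and predicate vectors actually used by EPLQ: the helper names `poi_vector`, `query_vector`, and `matches` are hypothetical, coordinates are assumed to be integers, and no encryption is involved. It only shows that a Euclidean range test reduces to an inner product range test.

```python
def poi_vector(x, y):
    # Hypothetical plaintext encoding of a POI location (illustration only).
    return (-(x * x + y * y), 2 * x, 2 * y, 1)

def query_vector(xq, yq, r):
    # Hypothetical plaintext encoding of a circular query area.
    return (1, xq, yq, r * r - (xq * xq + yq * yq))

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def matches(poi, query, r_max):
    # The inner product equals r^2 - dist^2, so it lies in [0, r_max^2]
    # exactly when the POI is inside the query circle (assuming r <= r_max).
    xq, yq, r = query
    ip = inner(query_vector(xq, yq, r), poi_vector(*poi))
    return 0 <= ip <= r_max * r_max
```

In this sketch the lower limit of the range is 0 and the upper limit is the square of the maximum allowed query radius, which mirrors the role that $\tau_1$ and $\tau_2$ play later in the paper.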

This paper is organized as follows. In Section II, we formalize the system model and attack models considered in our work, and identify the design goal. In Section III, we recall bilinear pairing and related complexity assumptions as preliminaries, which will be used in subsequent sections. In Section IV, we propose IPRE. Then, based on IPRE, we design the EPLQ solution for privacy-preserving spatial range query in Section V, followed by its security analysis and performance evaluation in Sections VI and VII, respectively. We give related work in Section VIII, and finally conclude this work in Section IX.

II. MODELS AND DESIGN GOAL

In this section, we formalize the system model and attack models considered in this paper, and identify the design goal.

A. System Model

Privacy-preserving POI query has been studied in two settings of LBS: 1) public LBS and 2) outsourced LBS. In this paper, we focus on the latter setting. In the former setting, there is an LBS provider holding a spatial database of POI records in plaintext, and LBS users query POIs at the provider’s site. In outsourced LBS, as shown in Fig. 2, the system consists of three kinds of entities, LBS provider, LBS users, and cloud, as follows.

1) The LBS provider has abundant LBS data, which are POI records. The LBS provider allows authorized users (i.e., LBS users) to utilize its data through location-based queries. Because of the financial and operational benefits of data outsourcing, the LBS provider offers the query services via the cloud. However, the LBS provider is not willing to disclose the valuable LBS data to the cloud. Therefore, the LBS provider encrypts the LBS data, and outsources the encrypted data to the cloud.

2) The cloud has rich storage and computing resources. It stores the encrypted LBS data from the LBS provider, and provides query services for LBS users. So, the cloud has to search the encrypted POI records in local storage to find the ones matching the queries from LBS users.

3) LBS users have the information of their own locations, and query the encrypted records of nearby POIs in the cloud. Cryptographic or privacy-enhancing techniques are usually utilized to hide the location information in the queries sent to the cloud. To decrypt the encrypted records received from the cloud, LBS users need to obtain the decryption key from the LBS provider in advance.

B. Attack Models

Similar to most previous works on outsourced data query, the cloud is assumed honest but curious and considered as the potential attacker in this work. That is, the cloud would honestly store and search data as requested; however, the cloud would also have financial incentives to learn the stored LBS data and the user location data in queries. Because both LBS data and user location data are valuable, they should be protected and hidden from the cloud. In general, in the outsourced LBS setting, the cloud can observe both queries from LBS users and encrypted LBS data from the LBS provider, which could be an advantage to learn user locations. Therefore, assuming different abilities of the attacker, there are mainly four attack models in the outsourced LBS setting.

1) Ciphertext-only attack. In this model, the attacker is able to observe the ciphertexts of POIs’ locations and queries but does not know the plaintexts. Obviously, every cloud has this ability. This is a weak attack model.

2) Known-sample attack. In this model, the attacker knows the plaintexts of some POIs’ locations and/or queries. The attacker also knows that their corresponding ciphertexts must exist in all the ciphertexts observed by the attacker. However, the attacker does not know which ciphertext corresponds to a known plaintext. Utilizing such information, the attacker may be able to reveal the plaintext corresponding to any given ciphertext. Such information is not hard to obtain if the attacker has the background knowledge that the LBS database must contain the POIs of certain type in a certain area.

3) Known-plaintext attack. In this model, the attacker knows the plaintexts of some POIs’ locations and/or queries as well as their corresponding ciphertexts. Utilizing this information, the attacker may be able to reveal the plaintexts corresponding to other ciphertexts.

4) Access-pattern attack. In this model, the attacker has some background knowledge about the pattern of POI accessing. For example, the attacker knows that a known POI would be the most popular POI. If an encrypted POI appears most frequently in query results, it must be the encrypted version of the known POI. Then, the attacker knows that corresponding query points must be close to the known POI.

In addition to the above attacks, other attacks such as insider attacks may be possible. In this paper, we consider ciphertext-only and known-sample attacks, which do not require attackers with very strong abilities. We will leave the attacks requiring very strong abilities for future study.

C. Design Goal

Under the outsourced LBS system model, our design goal is to develop an efficient, accurate, and secure solution for privacy-preserving spatial range query. Specifically, the following three objectives should be achieved.

1) Efficiency. As discussed in Section I, spatial range query has stringent performance requirements. A good solution should not consume many resources of mobile LBS users, and the POI search latency should be acceptable for online query.

2) Accuracy. It is desirable that a query result contains the exact records matching the query. False negatives would hurt user experience, while false positives would increase communication cost. Additional computational cost is also required at the user side to filter out false positives.

3) Security. The proposed solution should be resilient to ciphertext-only attacks and known-sample attacks. An accurate and efficient solution for spatial range query [1] already exists, which is resilient to ciphertext-only attacks but not to known-sample attacks and more powerful attacks. The proposed solution should be more secure than the solution in [1].

Though subject to more powerful attacks such as known-plaintext attacks, the solution proposed in this paper still can be used in many situations where the attackers do not have the required abilities or knowledge. Our solution also has advantages over the solutions resilient to such attacks. As we will see in the related works in Section VIII, such solutions are either very computationally costly or not applicable to outsourced LBS.

III. PRELIMINARIES: BILINEAR PAIRING AND COMPLEXITY ASSUMPTIONS

In this section, we outline the cryptographic technique of bilinear pairing and related complexity assumptions, which will serve as the basis of our IPRE scheme.

Let $\mathbb{G}_1$ and $\mathbb{G}_2$ be two cyclic groups of the same big prime order $p$, and $g$ be a generator of $\mathbb{G}_1$. Let $e : \mathbb{G}_1 \times \mathbb{G}_1 \to \mathbb{G}_2$ be a pairing, i.e., a map satisfying the following properties:

1) bilinearity: $e(P^a, Q^b) = e(P, Q)^{ab}$ for any $a, b \in \mathbb{Z}_p^*$ and any $P, Q \in \mathbb{G}_1$;

2) nondegeneracy: $e(g, g) \neq 1$;

3) computability: $e$ can be computed efficiently.

The definitions of pairing parameter generator and pairing-related complexity assumptions are given below. For more comprehensive descriptions, refer to [9].


Definition 1: Gen. The pairing parameter generator Gen is a probabilistic algorithm that takes a security parameter $\lambda$ as input and outputs a 5-tuple $(\mathbb{G}_1, \mathbb{G}_2, g, p, e)$.

Definition 2: Computational Diffie–Hellman (CDH) problem. The CDH problem is: Given $(P, P^a, P^b) \in \mathbb{G}_1$ for unknown $a, b \in \mathbb{Z}_p^*$, compute $P^{ab} \in \mathbb{G}_1$.

Definition 3: Decisional bilinear Diffie–Hellman (DBDH) problem. The DBDH problem is: Given $(P, P^a, P^b, P^c) \in \mathbb{G}_1$ and $W \in \mathbb{G}_2$ for unknown $a, b, c \in \mathbb{Z}_p^*$, determine whether $W = e(P, P)^{abc}$ or a random element from $\mathbb{G}_2$.

Definition 4: Discrete logarithm (DL) problem. The DL problem is: Given $Q \in \mathbb{G}_1$, compute $a \in \mathbb{Z}_p^*$ such that $g^a = Q$.

IV. IPRE: A NOVEL PREDICATE-ONLY ENCRYPTION SCHEME FOR INNER PRODUCT RANGE

In this section, we present IPRE, which will serve as the basis of our EPLQ solution for privacy-preserving spatial range query.

A. Overview

The proposed IPRE scheme allows computing inner products and comparing their values with a predefined range in a privacy-preserving way. As far as we know, our scheme is the first predicate/predicate-only encryption scheme for inner product range. In IPRE, both attributes and predicates are vectors. So, we use attribute vectors and predicate vectors to refer to the attributes and predicates in IPRE. Let $\Lambda \subseteq \mathbb{Z}_p^t$ be the attribute set, and let the class of predicates in IPRE likewise be a subset of $\mathbb{Z}_p^t$. Here, $p$ is a big prime. IPRE allows testing if the inner product of a vector from $\Lambda$ and a vector from the predicate class is in a predefined range without disclosing the vectors.

The IPRE scheme is a symmetric predicate-only encryption scheme, and it consists of four algorithms: 1) Setup algorithm for generating a public parameter PP, an attribute encryption key AK, and a predicate encryption key PK; 2) Enc algorithm for encrypting attribute vectors to ciphertexts; 3) GenToken algorithm for encrypting predicate vectors to tokens; and 4) Check algorithm for checking if a ciphertext’s attribute satisfies a token’s predicate.

For the reader’s convenience, we summarize the important notations to be used in Table I.

B. Encoding Attribute Vectors and Predicate Vectors

Before describing IPRE’s algorithms, we define the encodings of attribute vectors and predicate vectors, which serve as a building block of IPRE. Let EncodeU() and EncodeV() be the functions of encoding predicate vectors and attribute vectors, respectively. EncodeU() takes a predicate vector $\vec{U}_i = (u_{i,1}, u_{i,2}, \ldots, u_{i,t})$ and a random integer $h_i$ as input, and outputs a vector $\overrightarrow{EU}_i$. Similarly, EncodeV() takes an attribute vector $\vec{V}_j = (v_{j,1}, v_{j,2}, \ldots, v_{j,t})$ and a random integer $s_j$ as input, and outputs a vector $\overrightarrow{EV}_j$.

We want the encoding functions to meet the requirement that $\langle \overrightarrow{EU}_i, \overrightarrow{EV}_j \rangle = (\alpha \times \langle \vec{U}_i, \vec{V}_j \rangle + \beta)^d + h_i + s_j$ for any pair of $\overrightarrow{EU}_i$ and $\overrightarrow{EV}_j$. $\alpha$ and $\beta$ are two predefined secret integers.

TABLE I
NOTATIONS FREQUENTLY USED IN IPRE AND EPLQ

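Next, we show how to find such encoding functions. Following the well-known multinomial theorem, we have the following equation:

$$
\begin{aligned}
(\alpha \times \langle \vec{U}_i, \vec{V}_j \rangle + \beta)^d
&= \Bigl(\beta + \alpha \times \sum_{k=1}^{t} u_{i,k} \times v_{j,k}\Bigr)^d \\
&= \sum_{\substack{l_1 + l_2 + \cdots + l_{t+1} = d \\ l_1, l_2, \ldots, l_{t+1} \in [0, d]}} \binom{d}{l_1, l_2, \ldots, l_{t+1}} \times \beta^{l_{t+1}} \prod_{k=1}^{t} (\alpha \times v_{j,k})^{l_k} \prod_{k=1}^{t} u_{i,k}^{l_k}
\end{aligned}
$$

where $\binom{d}{l_1, l_2, \ldots, l_{t+1}} = \frac{d!}{l_1!\, l_2! \cdots l_{t+1}!}$.

Then, the last sum of the above equation has $\binom{d+t}{t}$ terms. Let $a_{i,m}$ and $b_{j,m}$ be $\prod_{k=1}^{t} u_{i,k}^{l_k}$ and $\binom{d}{l_1, l_2, \ldots, l_{t+1}} \beta^{l_{t+1}} \prod_{k=1}^{t} (\alpha \times v_{j,k})^{l_k}$, respectively, in the $m$th term. Then, the last sum can be written as $\sum_{m=1}^{\binom{d+t}{t}} a_{i,m} \times b_{j,m}$. Without loss of generality, let $l_{t+1} = d$ and $l_1, l_2, \ldots, l_t = 0$ when $m = 1$. So, $a_{i,1} = 1$ and $b_{j,1} = \beta^d$. Then, we have the following equation:

$$
\begin{aligned}
(\alpha \times \langle \vec{U}_i, \vec{V}_j \rangle + \beta)^d + h_i + s_j
&= h_i + s_j + \sum_{m=1}^{\binom{d+t}{t}} a_{i,m} \times b_{j,m} \\
&= h_i + s_j + a_{i,1} \times b_{j,1} + \sum_{m=2}^{\binom{d+t}{t}} a_{i,m} \times b_{j,m} \\
&= h_i \times 1 + 1 \times (\beta^d + s_j) + \sum_{m=2}^{\binom{d+t}{t}} a_{i,m} \times b_{j,m} \\
&= \langle (h_i, 1, a_{i,2}, a_{i,3}, \ldots, a_{i,n-1}),\ (1, \beta^d + s_j, b_{j,2}, b_{j,3}, \ldots, b_{j,n-1}) \rangle.
\end{aligned}
$$

Thus, we can define EncodeU() and EncodeV() as

$$\mathrm{EncodeU}(\vec{U}_i, h_i) = (h_i, 1, a_{i,2}, a_{i,3}, \ldots, a_{i,n-1}) \tag{1}$$

$$\mathrm{EncodeV}(\vec{V}_j, s_j) = (1, \beta^d + s_j, b_{j,2}, b_{j,3}, \ldots, b_{j,n-1}) \tag{2}$$

where $n = \binom{d+t}{t} + 1$.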
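The construction of (1) and (2) can be checked numerically. The following is a minimal sketch under stated assumptions: the function names `exponent_tuples`, `encode_u`, and `encode_v` are ours, all arithmetic is done modulo a small illustrative modulus `p`, and the ordering of the terms for $m \ge 2$ is arbitrary as long as both encoders use the same one.

```python
from itertools import product
from math import factorial, prod

def exponent_tuples(t, d):
    # All (l_1, ..., l_{t+1}) with entries in [0, d] summing to d.
    # The tuple (0, ..., 0, d) is placed first so that it is the m = 1 term.
    tuples = [l for l in product(range(d + 1), repeat=t + 1) if sum(l) == d]
    first = (0,) * t + (d,)
    tuples.remove(first)
    return [first] + tuples

def multinomial(d, ls):
    return factorial(d) // prod(factorial(l) for l in ls)

def encode_u(u, h, d, p):
    # (h_i, 1, a_{i,2}, ..., a_{i,n-1}) with a_{i,m} = prod_k u_{i,k}^{l_k}
    rest = exponent_tuples(len(u), d)[1:]
    a = [prod(pow(uk, lk, p) for uk, lk in zip(u, ls)) % p for ls in rest]
    return [h % p, 1] + a

def encode_v(v, s, d, alpha, beta, p):
    # (1, beta^d + s_j, b_{j,2}, ..., b_{j,n-1}) with
    # b_{j,m} = multinomial * beta^{l_{t+1}} * prod_k (alpha * v_{j,k})^{l_k}
    rest = exponent_tuples(len(v), d)[1:]
    b = [multinomial(d, ls) * pow(beta, ls[-1], p)
         * prod(pow(alpha * vk, lk, p) for vk, lk in zip(v, ls)) % p
         for ls in rest]
    return [1, (pow(beta, d, p) + s) % p] + b
```

Both encoders return vectors of length $n = \binom{d+t}{t} + 1$, and their inner product modulo `p` should equal $(\alpha \times \langle \vec{U}_i, \vec{V}_j \rangle + \beta)^d + h_i + s_j$ modulo `p`.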

C. Setup Algorithm

The setup algorithm is a probabilistic algorithm, which takes a security parameter $\lambda$, the attribute/predicate vector length $t$, and an inner product range $[\tau_1, \tau_2]$ as input. The algorithm outputs an attribute encryption key $AK = (\alpha, \beta, d, M)$, a predicate encryption key $PK = (d, M)$, and a public parameter $PP = ((\mathbb{G}_1, \mathbb{G}_2, g, p, e), (\Omega_k)_{k=\tau_1}^{\tau_2})$.

$\alpha, \beta \in \mathbb{F}_p$ are two random numbers for encoding functions. $d$ is a positive integer, and its value depends on the security parameter. The scheme is more secure if $d$ is bigger. $d$ could be 2, or $d$ is an integer satisfying $\mathrm{GCD}(d, p - 1) = 1$. If $d = 2$, $\alpha$ and $\beta$ must be chosen to make sure that the intersection of the set $\{z : z = -z_1 - 2\beta/\alpha \bmod p,\ \tau_1 \le z_1 \le \tau_2\}$ and the set $S - [\tau_1, \tau_2]$ is empty. Here, $S$ is the set of all possible values of inner products. $M$ is an $n \times n$ random invertible matrix over the field $\mathbb{F}_p$. $n$ is the length of encoded attribute/predicate vectors.

$(\mathbb{G}_1, \mathbb{G}_2, g, p, e)$ is the pairing parameter generated by running Gen($\lambda$), and $\Omega_k = \mathrm{Hash}(e(g, g)^{(\alpha \times perm(k) + \beta)^d})$. Here, Hash() is a hash function and perm() is a random bijection mapping from $[\tau_1, \tau_2]$ to $[\tau_1, \tau_2]$, i.e., a random permutation function. Let $|\mathrm{Hash}()|$ be the range size of Hash(). Hash() is chosen to make $p \gg |\mathrm{Hash}()| \gg \tau_2 - \tau_1$.

D. Enc Algorithm

The algorithm of encrypting attribute vectors is a probabilistic algorithm, which takes an attribute vector $\vec{V}_j = (v_{j,1}, v_{j,2}, \ldots, v_{j,t})$ and a random number $s_j \in \mathbb{F}_p$ as input, and outputs $C_j = (c_{j,1}, c_{j,2}, \ldots, c_{j,n+1}) = ((g^{v''_{j,k}})_{k=1}^{n}, e(g, g)^{s_j})$. Here, $(v''_{j,1}, v''_{j,2}, \ldots, v''_{j,n})^T = M^{-1} (\mathrm{EncodeV}(\vec{V}_j, s_j))^T \bmod p$. $T$ is the matrix transpose operator.

E. GenToken Algorithm

The token generation algorithm GenToken is a probabilistic algorithm, which takes a predicate vector $\vec{U}_i = (u_{i,1}, u_{i,2}, \ldots, u_{i,t})$ and a random number $h_i \in \mathbb{F}_p$ as input, and outputs $K_i = (q_{i,1}, q_{i,2}, \ldots, q_{i,n+1}) = ((g^{u''_{i,k}})_{k=1}^{n}, e(g, g)^{h_i})$. Here, $(u''_{i,1}, u''_{i,2}, \ldots, u''_{i,n}) = \mathrm{EncodeU}(\vec{U}_i, h_i)\, M \bmod p$.

F. Check Algorithm

The check algorithm takes a ciphertext $C_j = (c_{j,1}, c_{j,2}, \ldots, c_{j,n+1})$ of an attribute vector $\vec{V}_j$ and a token $K_i = (q_{i,1}, q_{i,2}, \ldots, q_{i,n+1})$ associated with a predicate vector $\vec{U}_i$ as input. The algorithm computes

$$\Psi = \frac{\prod_{k=1}^{n} e(c_{j,k}, q_{i,k})}{c_{j,n+1} \times q_{i,n+1}}.$$

If Hash($\Psi$) is in the set $\{\Omega_k : \tau_1 \le k \le \tau_2\}$, the algorithm outputs 1. Otherwise, it outputs 0.

Remark 1: If two inner products are equal, their $\Psi$ are the same. This is not desirable for some applications. This problem can be circumvented by adding randomness in the generation of predicate/attribute vectors. For a pair of fixed predicate vector and attribute vector, the value $\Psi$ is still fixed. However, given a pair of predicate and attribute, their vectors and their vectors’ inner product all have multiple possible values. We will demonstrate it in our EPLQ solution in Section V.

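Correctness proof. First, we prove $\mathrm{Hash}(\Psi) \in \{\Omega_k : \tau_1 \le k \le \tau_2\}$ if $\tau_1 \le \langle \vec{U}_i, \vec{V}_j \rangle \le \tau_2$. Recall that $\Omega_k = \mathrm{Hash}(e(g, g)^{(\alpha \times perm(k) + \beta)^d})$. From the following equation, it is easy to find out that $\mathrm{Hash}(\Psi) \in \{\Omega_k : \tau_1 \le k \le \tau_2\}$ if $\tau_2 \ge \langle \vec{U}_i, \vec{V}_j \rangle \ge \tau_1$:

$$
\begin{aligned}
\Psi &= \frac{\prod_{k=1}^{n} e(c_{j,k}, q_{i,k})}{c_{j,n+1} \times q_{i,n+1}} \\
&= \frac{\prod_{k=1}^{n} e(g^{u''_{i,k}}, g^{v''_{j,k}})}{e(g, g)^{h_i} \times e(g, g)^{s_j}} \\
&= \frac{e(g, g)^{\sum_{k=1}^{n} u''_{i,k} \times v''_{j,k}}}{e(g, g)^{h_i + s_j}} \\
&= \frac{e(g, g)^{\mathrm{EncodeU}(\vec{U}_i, h_i)\, M M^{-1} (\mathrm{EncodeV}(\vec{V}_j, s_j))^T}}{e(g, g)^{h_i + s_j}} \\
&= \frac{e(g, g)^{\langle \mathrm{EncodeU}(\vec{U}_i, h_i),\, \mathrm{EncodeV}(\vec{V}_j, s_j) \rangle}}{e(g, g)^{h_i + s_j}} \\
&= \frac{e(g, g)^{(\alpha \times \langle \vec{U}_i, \vec{V}_j \rangle + \beta)^d + h_i + s_j}}{e(g, g)^{h_i + s_j}} \\
&= e(g, g)^{(\alpha \times \langle \vec{U}_i, \vec{V}_j \rangle + \beta)^d}.
\end{aligned}
$$

Since perm() is a bijection on $[\tau_1, \tau_2]$, an inner product in $[\tau_1, \tau_2]$ equals $perm(k)$ for some $k \in [\tau_1, \tau_2]$, so $\mathrm{Hash}(\Psi) = \Omega_k$ for that $k$.

Second, we prove $\mathrm{Hash}(\Psi) \notin \{\Omega_k : \tau_1 \le k \le \tau_2\}$ with overwhelming probability if $\langle \vec{U}_i, \vec{V}_j \rangle \notin [\tau_1, \tau_2]$.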

Lemma 1: $\forall a \in \mathbb{F}_p$, $a$ has at most one $d$th root if $\mathrm{GCD}(d, p - 1) = 1$ and $p$ is a prime.

Proof: Clearly, if $a = 0$, it has only one $d$th root 0. If $a \neq 0$, we prove the lemma by contradiction. If $a$ had two or more distinct $d$th roots, let $\chi_1, \chi_2 \in \mathbb{F}_p$ be two of them. Since $\chi_1^d = \chi_2^d = a$, we have $(\chi_1/\chi_2)^d = 1$. Noticing that $\mathbb{Z}_p^*$ is a cyclic group of order $p - 1$, we have $\mathrm{order}(\chi_1/\chi_2) \mid d$ and $\mathrm{order}(\chi_1/\chi_2) \mid (p - 1)$. So, $\mathrm{order}(\chi_1/\chi_2)$ is a common divisor of $d$ and $p - 1$. Since $\chi_1$ and $\chi_2$ are distinct, we have $\mathrm{order}(\chi_1/\chi_2) \neq 1$. This contradicts $\mathrm{GCD}(d, p - 1) = 1$. Therefore, the assumption of $a$ having two or more distinct $d$th roots must be false. $\square$
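Lemma 1 can be checked by brute force on a small field. The following sketch uses the illustrative prime $p = 101$ and a helper name of our choosing; it relies on the standard fact that the map $x \mapsto x^d$ on $\mathbb{F}_p$ is injective exactly when $\mathrm{GCD}(d, p - 1) = 1$, which is slightly stronger than the direction stated in the lemma.

```python
from math import gcd

def dth_power_is_injective(d, p):
    # True when x -> x^d mod p never sends two field elements to the same value,
    # i.e., when every element of F_p has at most one d-th root.
    return len({pow(x, d, p) for x in range(p)}) == p

p = 101  # small illustrative prime; p - 1 = 100
coprime_exponents = [d for d in range(1, 20) if gcd(d, p - 1) == 1]
```

For every exponent in `coprime_exponents`, `dth_power_is_injective(d, p)` should hold, while exponents sharing a factor with $p - 1$ (such as $d = 2$) produce collisions, which is why the case $d = 2$ is treated separately.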

Lemma 2: $\forall a \in \mathbb{F}_p$, $a$ has at most two square roots if $p$ is a prime.

Proof: Clearly, if $a = 0$, it has only one square root 0. If $a \neq 0$, we prove the lemma by contradiction. Assume that $a$ has three or more distinct square roots. Then, we can find two distinct square roots $\chi_1, \chi_2 \in \mathbb{F}_p$ satisfying $\chi_1/\chi_2 \neq -1 \bmod p$. Because $\chi_1^2 = \chi_2^2 = a$ and $\chi_1 \neq \chi_2$, we have $(\chi_1/\chi_2)^2 = 1$ and the order of $\chi_1/\chi_2$ is 2. This contradicts the fact that $\mathbb{Z}_p^*$ has only one subgroup of order 2 and the subgroup’s elements are 1 and $-1 \bmod p$. $\square$

Let $\Upsilon_k$ be $e(g, g)^{(\alpha \times perm(k) + \beta)^d}$ for any $k \in [\tau_1, \tau_2]$. Then, $\mathrm{Hash}(\Upsilon_k) = \Omega_k$. Let us consider two cases of $d$ as follows.

Case 1: $\mathrm{GCD}(d, p - 1) = 1$. Consider an equation $f_k(z)$ over the field $\mathbb{F}_p$: $(\alpha \times z + \beta)^d \bmod p = \log \Upsilon_k \bmod p$. From Lemma 1, we know that $\log \Upsilon_k$ has only one $d$th root over the field $\mathbb{F}_p$. Then, the equation has only one root $z = (\chi - \beta)/\alpha \bmod p$, where $\chi \in \mathbb{F}_p$ is the $d$th root of $\log \Upsilon_k$. From the definition of $\Upsilon_k$, we know that $\tau_1 \le z \le \tau_2$ for any $k \in [\tau_1, \tau_2]$. Therefore, if $\langle \vec{U}_i, \vec{V}_j \rangle \notin [\tau_1, \tau_2]$, the resulting $\Psi = e(g, g)^{(\alpha \times \langle \vec{U}_i, \vec{V}_j \rangle + \beta)^d}$ is not equal to any $\Upsilon_k$. Because the range size of Hash() is much bigger than the size of $[\tau_1, \tau_2]$, the probability of $\mathrm{Hash}(\Psi) \in \{\Omega_k : \tau_1 \le k \le \tau_2\}$ is very low.

Case 2: $d = 2$. From Lemma 2, we know that the equation $f_k(z) : (\alpha \times z + \beta)^2 \bmod p = \log \Upsilon_k \bmod p$ has at most two roots. Suppose there are two roots $z_1$ and $z_2$. We have $(\alpha \times z_2 + \beta) \bmod p = -(\alpha \times z_1 + \beta) \bmod p$ and $z_2 = (-z_1 - 2\beta/\alpha) \bmod p$. From the definition of $\Upsilon_k$, we know that at least one of the roots is in $[\tau_1, \tau_2]$. Without loss of generality, suppose $z_1 \in [\tau_1, \tau_2]$. Then, $z_2$ is in the set $\{z : z = -z_1 - 2\beta/\alpha \bmod p,\ \tau_1 \le z_1 \le \tau_2\}$. Recall that the values of $\beta$ and $\alpha$ are chosen to avoid the overlap between this set and the set $S - [\tau_1, \tau_2]$. Then, if $\langle \vec{U}_i, \vec{V}_j \rangle \notin [\tau_1, \tau_2]$, the inner product is not in the set of roots. Therefore, the resulting $\Psi$ is not equal to any $\Upsilon_k$. Again, because $|\mathrm{Hash}()| \gg \tau_2 - \tau_1$, the probability of $\mathrm{Hash}(\Psi) \in \{\Omega_k : \tau_1 \le k \le \tau_2\}$ is very low.
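The condition on $\alpha$ and $\beta$ for $d = 2$ can be tested directly. The following is a minimal sketch under the assumption (ours, for illustration) that the set $S$ of possible inner products is the integer interval $[0, s_{\max}]$; the function names and the parameter `s_max` are hypothetical.

```python
def second_root(z1, alpha, beta, p):
    # z2 = -z1 - 2*beta/alpha mod p, the other root of (alpha*z + beta)^2 = const.
    return (-z1 - 2 * beta * pow(alpha, -1, p)) % p

def params_ok(alpha, beta, p, tau1, tau2, s_max):
    # Assumption: S = [0, s_max]. The mirrored roots of in-range values must
    # not fall into S - [tau1, tau2], otherwise Check would accept them.
    mirrored = {second_root(z1, alpha, beta, p) for z1 in range(tau1, tau2 + 1)}
    outside = set(range(0, tau1)) | set(range(tau2 + 1, s_max + 1))
    return mirrored.isdisjoint(outside)
```

A setup routine for $d = 2$ could draw $\alpha$ and $\beta$ repeatedly until `params_ok` holds; this retry strategy is our suggestion and is not described in the text.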

Remark 2: With a suitable Hash() function, using $\{\Omega_k : \tau_1 \le k \le \tau_2\}$ instead of $\{\Upsilon_k : \tau_1 \le k \le \tau_2\}$ to check inner product range can reduce the size of public parameter, and the resulting false positives are rare. We will show in Section VII that the false positives are negligible in our EPLQ solution.

V. EPLQ: PROPOSED SOLUTION FOR PRIVACY-PRESERVING SPATIAL RANGE QUERY

In this section, we propose a novel tree data structure named $\widehat{ss}$-tree and the EPLQ solution based on the above IPRE scheme and $\widehat{ss}$-tree.

A. Preliminary: ss-tree

The $\widehat{ss}$-tree introduced in this work is a variant of ss-tree [10]. For indexing spatial data, there actually exist quite a few data structures such as r-tree [11] and ss-tree, and some of them (e.g., ss-tree and sh-tree [12]) can be used for spatial range query. When such data structures are used for privacy-preserving query, location data, e.g., the location data of points, circular areas, rectangular areas, and single-dimension ranges, must be concealed. Since the above-proposed IPRE scheme can conceal location data of points and circular areas, and at the same time ss-tree and some of its variants only need these location data, it is natural to apply IPRE to these data structures for privacy-preserving query. Hence, we choose ss-tree for its simplicity, and propose $\widehat{ss}$-tree based on ss-tree and IPRE.

Fig. 3. Index POIs with ss-tree and $\widehat{ss}$-tree. (a) POI distribution. (b) ss-tree and $\widehat{ss}$-tree structure.

Fig. 4. Data structures of ss-tree node and $\widehat{ss}$-tree node.

As shown in Fig. 3, ss-tree and $\widehat{ss}$-tree share the same structure, and the data structures of their nodes are compared in Fig. 4. Before describing $\widehat{ss}$-tree, we give an introduction on ss-tree. An ss-tree has the following properties: 1) each nonroot parent node has $m_{\min}$ to $m_{\max}$ children; 2) the root node has 0 to $m_{\max}$ children; 3) each leaf node represents a record; and 4) each node has a centroid field and a radius field.

In the context of a spatial database using a Cartesian coordinate system, the centroid is a pair of coordinates $(x, y)$. A leaf node’s centroid is the corresponding POI’s coordinates, and its radius is 0. A nonleaf node’s centroid and radius depend on its children. Its centroid is the mean of all its children’s centroids. Its radius is not smaller than the distance between its centroid and any descendant node’s centroid. In other words, a nonleaf node is associated with a circular area defined by the node’s centroid and radius, and all its descendant nodes are in this area.
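One way to obtain a valid centroid and radius for a nonleaf node is sketched below. The helper name `parent_area` and the tuple layout are our assumptions; the radius computed here is a safe upper bound derived from the triangle inequality, not necessarily the value chosen by the ss-tree construction in [10].

```python
from math import hypot

def parent_area(children):
    # children: list of (x, y, radius) triples, one per child node.
    cx = sum(x for x, _, _ in children) / len(children)
    cy = sum(y for _, y, _ in children) / len(children)
    # Every descendant of a child lies within that child's radius of the child's
    # centroid, so this radius covers all descendants (it may not be minimal).
    radius = max(hypot(x - cx, y - cy) + r for x, y, r in children)
    return cx, cy, radius
```

For leaf children the radius entries are 0, so the parent radius reduces to the largest distance between the parent centroid and a child POI.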

Similar to other tree index structures, a nonleaf node has a field (child_pointer_array) for pointers to its children, and a leaf node has a field (leaf_data) storing some data of the corresponding record (e.g., the whole record, the pointer to the record, or the record’s ID). A node of ss-tree also has some other fields to support tree building, approximation search, and sampling operations. We omit these fields in this paper as they are not relevant to our solution. Refer to [10] for the details and the building of ss-tree.


With the ss-tree, searching POI records matching a spatial range query is very efficient. Because all descendant nodes of a nonleaf node are in the nonleaf node’s associated circular area, POI records can be searched by scanning the ss-tree from root to leaves. If a node’s circular area intersects with the query area, all children nodes of the node will be scanned. Otherwise, its descendant nodes will be skipped. Then, $O(\log N + R)$ tree nodes will be scanned to find matched records, where $N$ is the number of POI records in the database and $R$ is the number of matched records.
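The plaintext search procedure can be sketched as follows. This is an illustrative sketch with a hypothetical `Node` class that keeps only the centroid, radius, child pointers, and leaf data mentioned above; it does not include the encryption that the proposed index adds.

```python
from dataclasses import dataclass, field
from math import hypot

@dataclass
class Node:
    centroid: tuple                               # (x, y)
    radius: float                                 # 0 for a leaf node
    children: list = field(default_factory=list)  # child_pointer_array
    leaf_data: object = None                      # e.g., record ID; leaves only

def range_search(node, center, r, out=None):
    out = [] if out is None else out
    dist = hypot(node.centroid[0] - center[0], node.centroid[1] - center[1])
    if dist > node.radius + r:
        return out                       # areas do not intersect: skip the subtree
    if node.leaf_data is not None:
        out.append(node.leaf_data)       # a leaf has radius 0, so it is a match
    for child in node.children:
        range_search(child, center, r, out)
    return out
```

Two circles intersect when the distance between their centers is at most the sum of their radii, so the same test prunes subtrees at nonleaf nodes and detects matched records at leaves.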

B. Proposed $\widehat{ss}$-tree

$\widehat{ss}$-tree is the core of our EPLQ solution. It is a variant of ss-tree. They share the same tree structure, which is shown in Fig. 3. The difference between ss-tree and $\widehat{ss}$-tree is the tree nodes’ data, which are shown in Fig. 4. $\widehat{ss}$-tree hides each tree node’s location information using our predicate-only encryption scheme, and removes unnecessary information. Because of the encryption, detecting circular area intersection and matched records are also different when searching matched records with the tree. More specifically, we use two kinds of inner products for detecting circular area intersection and matched records, and our IPRE scheme assures the detection via inner product range in a privacy-preserving way.

1) Building $\widehat{ss}$-tree: $\widehat{ss}$-tree can be built from ss-tree. After building an ss-tree for the spatial database, an $\widehat{ss}$-tree can be built by the following steps.

Step 1) Configure parameters $\tau_1$, $\tau_2$, and $\phi$. As mentioned in the IPRE scheme, $\tau_1$ and $\tau_2$ are the lower limit and upper limit of the given inner product range, respectively. The lower limit $\tau_1$ is fixed to 0 in $\widehat{ss}$-tree. $\tau_2$ is set to a value not smaller than $r_{\max 1}^2$, where $r_{\max 1}$ is the maximum query range allowed in the system. $\phi$ is a positive integer used to scale down the inner products for detecting area intersection, and $\tau_2 \ge (\lceil r_{\max 1}/\phi \rceil + \lceil r_{\max 2}/\phi \rceil + 2)^2$, where $r_{\max 2}$ is the biggest radius of ss-tree’s circular areas. (Such inner products are scaled down to make them fit in a smaller range, which reduces the size of IPRE’s public parameter. As explained later at the end of this section, the scaling down will not decrease accuracy.)

Step 2) Generate an attribute vector for each tree node from the node’s centroid $(x_j, y_j)$ and radius $r_j$.

Case 1: The node is a leaf node. Generate the attribute vector $\vec{V}_j$ as follows:

$$\vec{V}_j = \bigl((-1)^{\xi_j},\ (-1)^{\xi_j}(r_j^2 - x_j^2 - y_j^2),\ (-1)^{\xi_j} \times 2x_j,\ (-1)^{\xi_j} \times 2y_j,\ (-1)^{\xi_j} \times 2r_j,\ \xi_j \tau_2,\ (-1)^{\xi_j}\bigr).$$

Here, $\xi_j$ is a random number, and its value is 1 or 0.

Case 2: The node is a nonleaf node. Do the following twosteps.

1) Modify the node’s original circular area to aminimal normalized area covering the originalone. In this paper, we say a circular area isa normalized area if its centroid’s coordinates

and its radius are all integral multiples of φ.The centroid’s coordinates will be modified tothe closest coordinates that are integral multi-ples of φ, and the radius will be modified to thesmallest integral multiple of φ that makes thenormalized area cover the original one.

2) After modifying (xj , yj) and rj , generate the

attribute vector−→Vj as follows:

((−1)ξj μj , (−1)ξj (μj(r2j − x2

j − y2j )/φ2 + θj),

(−1)ξj μj xj/φ,(−1)ξj μj yj/φ,(−1)ξj μj rj/φ,

ξjτ2, (−1)ξj ).

Here, μj = �τ2/(�rmax1/φ+ 1 + rj/φ)2�,

and θj is a random nonnegative integer notmore than μj and τ2 − μj(�rmax1/φ+ 1 +rj/φ)

2. Again, ξj is a random number, and itsvalue is 1 or 0.

Step 3) Encrypt each attribute vector with IPRE scheme.Step 4) Remove all unnecessary fields of each tree node.

At last, a node’s data include its encrypted attributevector. If the node is a nonleaf node, the data alsoinclude pointers to its children. If the node is a leafnode, its leaf_data field also store the pointer to thecorresponding record.
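As a rough illustration of Step 2, the following Python sketch builds the plaintext attribute vectors before encryption. It assumes integer coordinates and radii. The function names are hypothetical, and drawing $\theta_j$ strictly below $\mu_j$ is an assumption of this sketch (the text above only says "not more than"), made so that a disjoint pair of areas always yields a negative scaled value.

```python
import math
import random


def normalize_area(x, y, r, phi):
    """Return the minimal circle on the phi-grid that covers circle (x, y, r)."""
    nx = ((2 * x + phi) // (2 * phi)) * phi    # closest multiple of phi to x
    ny = ((2 * y + phi) // (2 * phi)) * phi    # closest multiple of phi to y
    d2 = (nx - x) ** 2 + (ny - y) ** 2         # squared shift of the centroid
    shift = math.isqrt(d2)
    if shift * shift < d2:                     # round sqrt(d2) up to an integer
        shift += 1
    nr = -(-(r + shift) // phi) * phi          # smallest multiple of phi >= r + shift
    return nx, ny, nr


def leaf_attribute_vector(x, y, tau2, rng):
    """Case 1: a leaf node is a single POI, so its radius is 0."""
    xi = rng.randrange(2)
    s = (-1) ** xi
    r = 0
    return [s, s * (r * r - x * x - y * y), s * 2 * x, s * 2 * y, s * 2 * r,
            xi * tau2, s]


def nonleaf_attribute_vector(x, y, r, phi, r_max1, tau2, rng):
    """Case 2: normalize the area, then scale every entry down by phi."""
    nx, ny, nr = normalize_area(x, y, r, phi)
    kx, ky, kr = nx // phi, ny // phi, nr // phi   # exact divisions
    bound = (-(-r_max1 // phi) + 1 + kr) ** 2      # (ceil(r_max1/phi) + 1 + r_j/phi)^2
    mu = tau2 // bound
    assert mu >= 1, "tau2 is too small for this radius"
    # Assumption of this sketch: theta is strictly smaller than mu.
    theta = rng.randint(0, min(mu - 1, tau2 - mu * bound))
    xi = rng.randrange(2)
    s = (-1) ** xi
    return [s * mu, s * (mu * (kr * kr - kx * kx - ky * ky) + theta),
            s * mu * kx, s * mu * ky, s * mu * kr, xi * tau2, s]


rng = random.Random(7)
print(normalize_area(7, 8, 3, 5))  # (5, 10, 10)
print(nonleaf_attribute_vector(7, 8, 3, 5, 20, 1000, rng))
```

In the first call, the centre (7, 8) moves to the grid point (5, 10), and a radius of 5 would no longer cover the original circle, so the radius grows to 10.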

2) Searching ŝs-tree: Searching the ŝs-tree is the same as searching the ss-tree, except that detecting circular area intersection and matched records is based on our IPRE scheme.

Suppose a spatial range query wants to find all POIs within a circular area centered at coordinates $(x_i, y_i)$ with radius $r_i$. To search the ŝs-tree, the tokens of two predicate vectors associated with the query should be provided. The two vectors $\vec{U}_i$ and $\vec{U}'_i$ are shown below, and the tokens are generated with IPRE’s GenToken algorithm:

$$\vec{U}_i = \big((-1)^{\xi_i}(\mu_i(r_i^2 - x_i^2 - y_i^2) + \theta_i),\ (-1)^{\xi_i}\mu_i,\ (-1)^{\xi_i}\mu_i x_i,\ (-1)^{\xi_i}\mu_i y_i,\ (-1)^{\xi_i}\mu_i r_i,\ 1,\ \xi_i\tau_2\big)$$

$$\vec{U}'_i = \big((-1)^{\xi'_i}(r_i'^2 - x_i'^2 - y_i'^2)/\phi^2,\ (-1)^{\xi'_i},\ (-1)^{\xi'_i}2x'_i/\phi,\ (-1)^{\xi'_i}2y'_i/\phi,\ (-1)^{\xi'_i}2r'_i/\phi,\ 1,\ \xi'_i\tau_2\big).$$

Here, $\xi_i$ and $\xi'_i$ are random integers in {0, 1}, $\mu_i = \lfloor \tau_2/r_i^2 \rfloor$, and $\theta_i$ is a random nonnegative integer not more than $\mu_i$ and $\tau_2 - \mu_i r_i^2$. $(x'_i, y'_i)$ and $r'_i$ are the centroid coordinates and radius of the minimal normalized area covering the query area, respectively. The way to find this normalized area is the same as that in the generation of attribute vectors.

The tokens of $\vec{U}_i$ and $\vec{U}'_i$ are used for detecting matched records and circular area intersection, respectively. We call the former vector the POI-matching predicate vector and the latter the area-intersecting predicate vector.
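The two predicate vectors can be sketched in Python as follows, before any IPRE encryption is applied. The function names are hypothetical, inputs are assumed to be integers, and the area-intersecting vector is assumed to receive an already normalized query area. As a further assumption of this sketch, $\theta_i$ is drawn strictly below $\mu_i$.

```python
import random


def poi_matching_vector(x, y, r, tau2, rng):
    """Plaintext POI-matching predicate vector U_i for query (x, y, r)."""
    mu = tau2 // (r * r)
    assert mu >= 1, "tau2 must be at least r^2"
    # Assumption of this sketch: theta is strictly smaller than mu.
    theta = rng.randint(0, min(mu - 1, tau2 - mu * r * r))
    xi = rng.randrange(2)
    s = (-1) ** xi
    return [s * (mu * (r * r - x * x - y * y) + theta),
            s * mu, s * mu * x, s * mu * y, s * mu * r, 1, xi * tau2]


def area_intersecting_vector(nx, ny, nr, phi, tau2, rng):
    """Plaintext area-intersecting predicate vector U'_i.

    (nx, ny, nr) must describe a normalized area, i.e. multiples of phi.
    """
    assert nx % phi == 0 and ny % phi == 0 and nr % phi == 0
    kx, ky, kr = nx // phi, ny // phi, nr // phi
    xi = rng.randrange(2)
    s = (-1) ** xi
    return [s * (kr * kr - kx * kx - ky * ky), s,
            s * 2 * kx, s * 2 * ky, s * 2 * kr, 1, xi * tau2]


rng = random.Random(3)
u = poi_matching_vector(10, 10, 5, 1010, rng)
u_prime = area_intersecting_vector(10, 20, 30, 10, 1010, rng)
print(u)
print(u_prime)
```

The random sign and the random offset change on every call, so two queries for the same area produce different vectors.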

Given the above tokens associated with the query, POI records matching the query can be found by searching the ŝs-tree. The pseudocode of the search algorithm is shown in Algorithm 1. The search starts from the root node. If a nonleaf node’s area intersects with the query area, all children of the node will be scanned. Otherwise, all descendant nodes of this nonleaf node are skipped. Detecting circular area intersection and matched records is based on our IPRE scheme for inner product range. The correctness of both detections is proved in the remainder of this section.

Algorithm 1. Search_ŝs-tree(node nd, query_tokens Ks, node_list ndl)

    // nd: the node to be searched
    // Ks: the array of two tokens associated with the query’s predicate vectors.
    //     Ks[0] is the token for POI matching detection, while Ks[1] is the one
    //     for detecting intersection of circular areas.
    // ndl: the list to store matched leaf nodes

    C ← nd.encrypted_attribute_vector
    if nd is a leaf node then
        if Check(Ks[0], C) == 1 then
            // nd’s record matches the query’s area
            Add nd to node_list ndl.
        end if
    else
        if Check(Ks[1], C) == 1 then
            // nd’s area intersects with the query’s area
            for each child node cld_i of nd do
                Search_ŝs-tree(cld_i, Ks, ndl)
            end for
        end if
    end if
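The control flow of Algorithm 1 can be sketched in Python as below. IPRE’s Check algorithm is not implemented here: `toy_check` is a hypothetical stand-in that treats a token as a set of ciphertext labels, and `EncNode` with its fields is likewise an assumption made for this example. The sketch only shows that the matching token is used on leaf nodes and the intersection token on nonleaf nodes.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class EncNode:
    encrypted_attribute_vector: Any
    children: List["EncNode"] = field(default_factory=list)
    record_pointer: Any = None          # set on leaf nodes only


def search_tree(nd, ks, ndl, check):
    """Recursive search: ks[0] tests leaves, ks[1] tests nonleaf nodes."""
    c = nd.encrypted_attribute_vector
    if not nd.children:                 # nd is a leaf node
        if check(ks[0], c) == 1:
            ndl.append(nd)
    elif check(ks[1], c) == 1:          # nd's area intersects the query area
        for cld in nd.children:
            search_tree(cld, ks, ndl, check)
    return ndl


def toy_check(token, ciphertext):
    """Stand-in for IPRE's Check: 1 if the ciphertext satisfies the token."""
    return 1 if ciphertext in token else 0


leaf_a = EncNode("ct_a", record_pointer="rec_a")
leaf_b = EncNode("ct_b", record_pointer="rec_b")
leaf_c = EncNode("ct_c", record_pointer="rec_c")
left = EncNode("ct_left", children=[leaf_a, leaf_b])
right = EncNode("ct_right", children=[leaf_c])
root = EncNode("ct_root", children=[left, right])

ks = [{"ct_a", "ct_b"}, {"ct_root", "ct_left"}]
found = search_tree(root, ks, [], toy_check)
print([nd.record_pointer for nd in found])  # ['rec_a', 'rec_b']
```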

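3) Correctness of Detecting Matched Records via Inner Product Range: For any POI-matching predicate vector $\vec{U}_i$ and any leaf node’s attribute vector $\vec{V}_j$, we have the following equation:

$$
\begin{aligned}
\langle \vec{U}_i, \vec{V}_j \rangle
&= \big\langle \big((-1)^{\xi_i}(\mu_i(r_i^2 - x_i^2 - y_i^2) + \theta_i),\ (-1)^{\xi_i}\mu_i,\ (-1)^{\xi_i}\mu_i x_i,\ (-1)^{\xi_i}\mu_i y_i,\ (-1)^{\xi_i}\mu_i r_i,\ 1,\ \xi_i\tau_2\big), \\
&\qquad \big((-1)^{\xi_j},\ (-1)^{\xi_j}(r_j^2 - x_j^2 - y_j^2),\ (-1)^{\xi_j} \times 2x_j,\ (-1)^{\xi_j} \times 2y_j,\ (-1)^{\xi_j} \times 2r_j,\ \xi_j\tau_2,\ (-1)^{\xi_j}\big) \big\rangle \\
&= (-1)^{\xi_i+\xi_j}(\mu_i(r_i^2 - x_i^2 - y_i^2) + \theta_i) + (-1)^{\xi_i+\xi_j}\mu_i(r_j^2 - x_j^2 - y_j^2) \\
&\qquad + (-1)^{\xi_i+\xi_j}2\mu_i x_i x_j + (-1)^{\xi_i+\xi_j}2\mu_i y_i y_j + (-1)^{\xi_i+\xi_j}2\mu_i r_i r_j + \xi_j\tau_2 + (-1)^{\xi_j}\xi_i\tau_2 \\
&= (-1)^{\xi_i+\xi_j}\big(\mu_i((r_i + r_j)^2 - (x_i - x_j)^2 - (y_i - y_j)^2) + \theta_i\big) + \xi_j\tau_2 + (-1)^{\xi_j}\xi_i\tau_2 \\
&= \begin{cases}
\mu_i((r_i + r_j)^2 - (x_i - x_j)^2 - (y_i - y_j)^2) + \theta_i, & \text{if } \xi_i = \xi_j \\
\tau_2 - \mu_i((r_i + r_j)^2 - (x_i - x_j)^2 - (y_i - y_j)^2) - \theta_i, & \text{otherwise.}
\end{cases}
\end{aligned}
$$

As the tree node is a leaf node, $r_j = 0$. Then, we have

$$
\langle \vec{U}_i, \vec{V}_j \rangle =
\begin{cases}
\mu_i(r_i^2 - (x_i - x_j)^2 - (y_i - y_j)^2) + \theta_i, & \text{if } \xi_i = \xi_j \\
\tau_2 - \mu_i(r_i^2 - (x_i - x_j)^2 - (y_i - y_j)^2) - \theta_i, & \text{otherwise.}
\end{cases}
$$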

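The distance between the leaf node’s POI and the query point is $\sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$. Then, the record matches the query if and only if $r_i^2 \ge r_i^2 - (x_i - x_j)^2 - (y_i - y_j)^2 \ge 0$. As shown below, record matching can equivalently be detected by examining whether $\langle \vec{U}_i, \vec{V}_j \rangle$ is in the range $[0, \tau_2]$:

$$
\begin{aligned}
& r_i^2 \ge r_i^2 - (x_i - x_j)^2 - (y_i - y_j)^2 \ge 0 \\
\Leftrightarrow\ & \tau_2 \ge \mu_i(r_i^2 - (x_i - x_j)^2 - (y_i - y_j)^2) + \theta_i \ge 0 \\
& \text{and}\quad \tau_2 \ge \tau_2 - \mu_i(r_i^2 - (x_i - x_j)^2 - (y_i - y_j)^2) - \theta_i \ge 0 \\
\Leftrightarrow\ & \tau_2 \ge \langle \vec{U}_i, \vec{V}_j \rangle \ge 0.
\end{aligned}
$$

For security reasons, the field order $p$ in the IPRE scheme is at least 160 bits, and $p \gg \langle \vec{U}_i, \vec{V}_j \rangle \gg -p$ holds. Then, we have

$$\tau_2 \ge \langle \vec{U}_i, \vec{V}_j \rangle \ge 0 \Leftrightarrow \tau_2 \ge \langle \vec{U}_i, \vec{V}_j \rangle \bmod p \ge 0.$$

Therefore, record matching can be detected by examining whether $\langle \vec{U}_i, \vec{V}_j \rangle \bmod p$ is in the range $[0, \tau_2]$.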
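The equivalence above can be checked numerically on plaintext vectors. The sketch below is only a sanity check under stated assumptions: there is no encryption, the reduction mod $p$ is skipped because all values are far smaller than a 160-bit $p$, coordinates are integers, and $\theta_i$ is drawn strictly below $\mu_i$. The small value of $\tau_2$ and all function names are hypothetical.

```python
import random


def predicate_vector(x, y, r, tau2, rng):
    """POI-matching predicate vector U_i (plaintext)."""
    mu = tau2 // (r * r)
    theta = rng.randint(0, min(mu - 1, tau2 - mu * r * r))  # assumption: theta < mu
    xi = rng.randrange(2)
    s = (-1) ** xi
    return [s * (mu * (r * r - x * x - y * y) + theta),
            s * mu, s * mu * x, s * mu * y, s * mu * r, 1, xi * tau2]


def leaf_vector(x, y, tau2, rng):
    """Leaf attribute vector V_j (plaintext); a leaf has radius 0."""
    xi = rng.randrange(2)
    s = (-1) ** xi
    return [s, s * (-x * x - y * y), s * 2 * x, s * 2 * y, 0, xi * tau2, s]


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def record_matches(u, v, tau2):
    """The range test that IPRE's Check performs on encrypted data."""
    return 0 <= inner(u, v) <= tau2


rng = random.Random(2016)
tau2, qx, qy, qr = 1010, 10, 10, 5
mismatches = 0
for px in range(21):
    for py in range(21):
        u = predicate_vector(qx, qy, qr, tau2, rng)   # fresh randomness each time
        v = leaf_vector(px, py, tau2, rng)
        inside = (px - qx) ** 2 + (py - qy) ** 2 <= qr ** 2
        mismatches += record_matches(u, v, tau2) != inside
print(mismatches)  # 0
```

Every grid point is classified the same way by the range test and by the direct distance comparison, whichever signs $\xi_i$ and $\xi_j$ happen to be drawn.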

4) Correctness of Detecting the Intersection of Circular Areas via Inner Product Range: Similarly, for any area-intersecting predicate vector $\vec{U}'_i$ and any nonleaf node’s attribute vector $\vec{V}_j$, we have the following equation:

$$
\langle \vec{U}'_i, \vec{V}_j \rangle =
\begin{cases}
\mu_j((r'_i + r_j)^2 - (x'_i - x_j)^2 - (y'_i - y_j)^2)/\phi^2 + \theta_j, & \text{if } \xi'_i = \xi_j \\
\tau_2 - \mu_j((r'_i + r_j)^2 - (x'_i - x_j)^2 - (y'_i - y_j)^2)/\phi^2 - \theta_j, & \text{otherwise.}
\end{cases}
$$

The nonleaf node’s circular area (after normalization) intersects with the query’s normalized area if and only if $(r'_i + r_j)^2 \ge (r'_i + r_j)^2 - (x'_i - x_j)^2 - (y'_i - y_j)^2 \ge 0$. As shown below, the intersection of normalized areas can equivalently be detected by examining whether $\langle \vec{U}'_i, \vec{V}_j \rangle \bmod p$ is in the range $[0, \tau_2]$:

$$
\begin{aligned}
& (r'_i + r_j)^2 \ge (r'_i + r_j)^2 - (x'_i - x_j)^2 - (y'_i - y_j)^2 \ge 0 \\
\Leftrightarrow\ & \tau_2 \ge \mu_j((r'_i + r_j)^2 - (x'_i - x_j)^2 - (y'_i - y_j)^2)/\phi^2 + \theta_j \ge 0 \\
& \text{and}\quad \tau_2 \ge \tau_2 - \mu_j((r'_i + r_j)^2 - (x'_i - x_j)^2 - (y'_i - y_j)^2)/\phi^2 - \theta_j \ge 0 \\
\Leftrightarrow\ & \tau_2 \ge \langle \vec{U}'_i, \vec{V}_j \rangle \ge 0 \\
\Leftrightarrow\ & \tau_2 \ge \langle \vec{U}'_i, \vec{V}_j \rangle \bmod p \ge 0.
\end{aligned}
$$

Note that the purpose of detecting intersection is to rule out subtrees not containing matched POIs. Expanding original areas to normalized areas only results in scanning more tree nodes. All matched POI records can still be found, and unmatched records will not be included in the result.
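A similar plaintext sanity check can be written for the intersection test. The sketch below assumes that both areas are already normalized, that the normalized query radius does not exceed $(\lceil r_{max1}/\phi \rceil + 1)\phi$, and that $\theta_j$ is drawn strictly below $\mu_j$. The parameter values and function names are hypothetical, and the reduction mod $p$ is again skipped.

```python
import random


def node_vector(nx, ny, nr, phi, r_max1, tau2, rng):
    """Attribute vector V_j of a normalized nonleaf node (plaintext)."""
    kx, ky, kr = nx // phi, ny // phi, nr // phi
    bound = (-(-r_max1 // phi) + 1 + kr) ** 2      # (ceil(r_max1/phi) + 1 + r_j/phi)^2
    mu = tau2 // bound
    theta = rng.randint(0, min(mu - 1, tau2 - mu * bound))  # assumption: theta < mu
    xi = rng.randrange(2)
    s = (-1) ** xi
    return [s * mu, s * (mu * (kr * kr - kx * kx - ky * ky) + theta),
            s * mu * kx, s * mu * ky, s * mu * kr, xi * tau2, s]


def query_vector(nx, ny, nr, phi, tau2, rng):
    """Area-intersecting predicate vector U'_i of a normalized query area."""
    kx, ky, kr = nx // phi, ny // phi, nr // phi
    xi = rng.randrange(2)
    s = (-1) ** xi
    return [s * (kr * kr - kx * kx - ky * ky), s,
            s * 2 * kx, s * 2 * ky, s * 2 * kr, 1, xi * tau2]


def areas_intersect(u, v, tau2):
    return 0 <= sum(a * b for a, b in zip(u, v)) <= tau2


phi, r_max1, tau2 = 10, 30, 200        # tau2 >= (3 + 5 + 2)^2 = 100
rng = random.Random(42)
mismatches = 0
for qr in (10, 20, 30, 40):            # normalized query radii
    for nr in (10, 30, 50):            # normalized node radii
        for nx in range(-60, 61, 10):
            for ny in range(-60, 61, 10):
                u = query_vector(0, 0, qr, phi, tau2, rng)
                v = node_vector(nx, ny, nr, phi, r_max1, tau2, rng)
                expected = (qr + nr) ** 2 >= nx * nx + ny * ny
                mismatches += areas_intersect(u, v, tau2) != expected
print(mismatches)  # 0
```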


C. EPLQ Design

Our EPLQ solution consists of two algorithms: 1) system setup and 2) spatial range search.

1) System Setup: The LBS provider initializes the system by the following steps.

Step 1) The LBS provider initializes the parameters and keys for the solution.
The LBS provider initializes the public parameter and keys of the proposed IPRE scheme as well as the key of a standard encryption scheme (e.g., AES). Let $AK = (\alpha, \beta, d, M)$, $PK = (d, M)$, and $PP = ((G_1, G_2, g, p, e), (\Omega_k)_{k=\tau_1}^{\tau_2})$ be the attribute encryption key, predicate encryption key, and public parameter of the IPRE scheme. $PP$ is shared with the cloud. $PK$, $(G_1, G_2, g, p, e)$, and the key of the standard encryption scheme are shared with LBS users.
Remark. The standard scheme will be used to encrypt POI records. The IPRE scheme is for searching encrypted records. $(\Omega_k)_{k=\tau_1}^{\tau_2}$ in the public parameter is used for IPRE’s Check algorithm only, and LBS users do not need it to generate tokens.

Step 2) The LBS provider builds an ŝs-tree for the LBS database.

Step 3) The LBS provider encrypts each POI record with the standard encryption scheme.

Step 4) The LBS provider outsources all encrypted POI records and the ŝs-tree to the cloud.

2) Spatial Range Search: Suppose an LBS user wants to find all POIs within a circular area centered at coordinates $(x_i, y_i)$ with radius $r_i$. The privacy-preserving query is performed by the following steps.

Step 1) The LBS user generates two tokens for searching POI records with the proposed IPRE scheme.
As elaborated earlier in Section V-B, to search the ŝs-tree, two tokens associated with the query area should be generated. The LBS user generates them in the way described in Section V-B. Let Ks[0] and Ks[1] be the two generated tokens.

Step 2) The user sends (Ks[0], Ks[1]) as a query to the cloud.

Step 3) The cloud searches the ŝs-tree to find all leaf nodes matching the query from the user.
The search algorithm has been given in Section V-B, and its pseudocode is shown in Algorithm 1.

Step 4) The cloud returns the corresponding POI records of matched leaf nodes to the user.

Step 5) The LBS user decrypts the received POI records with the shared key of the standard encryption scheme.

VI. SECURITY ANALYSIS

In this section, we analyze the security properties of the proposed EPLQ solution. Specifically, following the security requirements discussed earlier, our analysis focuses on how the proposed EPLQ solution achieves LBS data confidentiality and user location privacy.

The confidentiality of LBS data includes not only the confidentiality of POI records but also the confidentiality of location information in the ŝs-tree. On the other hand, user location privacy involves protecting sensitive location information in user queries and the ŝs-tree. The security of the EPLQ solution depends on the underlying standard encryption scheme and the IPRE scheme. The standard encryption scheme is responsible for preventing the cloud from learning POI records, while our IPRE scheme is responsible for protecting user locations and POI locations from the cloud. The current AES standard can be used as the standard scheme, and it is secure under ciphertext-only, known-sample, and known-plaintext attacks. Thus, we focus on the analysis of user/POI location protection with the IPRE scheme.

A. Security of Query and POI Index Encryption

In EPLQ, user queries and the sensitive location information in the ŝs-tree are encrypted with the IPRE scheme. A query consists of two tokens associated with two predicate vectors, which contain the LBS user’s location information. For a predicate vector $\vec{U}_i = (u_{i,1}, u_{i,2}, \ldots, u_{i,t})$, the corresponding token is $((g^{u''_{i,k}})_{k=1}^{n}, e(g, g)^{h_i})$, where $(u''_{i,1}, u''_{i,2}, \ldots, u''_{i,n}) = \mathrm{EncodeU}(\vec{U}_i, h_i)M \bmod p$. Because of the hardness of the CDH problem, the attacker cannot reveal any exponent in the token even if knowing the predicate vector. Without the secret keys of IPRE (i.e., TK and AK), no one can reveal the predicate vector, the secret matrix $M$, or the random number $h_i$. Because of the randomness and secrecy of $h_i$, encrypting a predicate vector to a token is semantically secure. Therefore, it is secure under ciphertext-only and known-sample attacks. The sensitive location information in the ŝs-tree is concealed with attribute vector encryption. This encryption is very similar to predicate vector encryption, and their security properties are the same. Therefore, we omit its security analysis.

B. Security Under the Attack on Inner Products

As discussed above, it is hard to reveal user queries and POI locations directly from the ciphertexts of attribute vectors and predicate vectors. Alternatively, the attacker may attempt to recover the inner products of predicate vectors and attribute vectors first, and then reveal the vectors containing information about user locations and POI locations. Next, we show how this attack works and present its countermeasure.

The attacker may attempt to recover inner products through exhaustive attacks on $(\Upsilon_k = e(g, g)^{(\alpha \times perm(k) + \beta)^d})_{k=\tau_1}^{\tau_2}$. The exponents $((\alpha \times perm(k) + \beta)^d)_{k=\tau_1}^{\tau_2}$ could be viewed as $d$-degree polynomials of $d + 1$ terms where the variables are $\alpha$ and $\beta$. The coefficients of the polynomials are in the range $[\tau_1^d, \tau_2^d]$. These polynomials are in the same vector space of dimension $d + 1$. Any $d + 1$ of these polynomials are linearly independent, and any $d + 2$ of them are not. Thus, for any $\Upsilon_{k_1}, \Upsilon_{k_2}, \ldots, \Upsilon_{k_{d+2}}$, there exist nonzero $\lambda_1, \lambda_2, \ldots, \lambda_{d+2}$ satisfying the following equation:

$$\prod_{l=1}^{d+2} \Upsilon_{k_l}^{\lambda_l} = \prod_{l=1}^{d+2} e(g, g)^{\lambda_l \times (\alpha \times perm(k_l) + \beta)^d} = e(g, g)^0.$$
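The linear dependence can be illustrated with exact rational arithmetic. When the values $perm(k_l)$ are known, one valid choice of weights is $\lambda_l = 1/\prod_{m \ne l}(perm(k_l) - perm(k_m))$. This closed form is the standard divided-difference identity and is an addition for illustration, not something stated in the paper. All concrete numbers below are hypothetical. In the attack itself the values are unknown, which is why a search is needed.

```python
from fractions import Fraction


def dependency_weights(values):
    """Weights lambda_l = 1 / prod_{m != l} (s_l - s_m) for distinct values s_l."""
    weights = []
    for l, s_l in enumerate(values):
        prod = 1
        for m, s_m in enumerate(values):
            if m != l:
                prod *= s_l - s_m
        weights.append(Fraction(1, prod))
    return weights


d = 2
alpha, beta = 5, 9                  # hypothetical secret values
values = [3, 7, 11, 20]             # hypothetical perm(k_1), ..., perm(k_{d+2})
lam = dependency_weights(values)

# Weighted sum of the d+2 exponents (alpha * perm(k_l) + beta)^d.
combo = sum(w * (alpha * s + beta) ** d for w, s in zip(lam, values))
print(combo)  # 0
```

Multiplying all weights by a common denominator gives integer exponents, and the weights do not depend on $\alpha$ or $\beta$, only on the candidate inner product values.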


The attacker can find such $\lambda_1, \lambda_2, \ldots, \lambda_{d+2}$ through exhaustive search, and then recover the $d + 2$ inner products $perm(k_1), \ldots, perm(k_{d+2})$. Recall that $\tau_1$ is set to 0 when using IPRE in EPLQ. There are $(\tau_2 + 1) \times \tau_2 \times \cdots \times (\tau_2 - d)$ possible values for the combination $(perm(k_1), perm(k_2), \ldots, perm(k_{d+2}))$. Therefore, the complexity of recovering inner products is $O(\tau_2^{d+2})$.

With enough known inner products, the attacker can further reveal sensitive location data by generating and solving an overdetermined nonlinear polynomial equation system. Given the inner product $\rho_{i,j}$ of an unknown predicate vector $\vec{U}_i$ and attribute vector $\vec{V}_j$, the attacker can generate the equation $\langle \vec{U}_i, \vec{V}_j \rangle = \rho_{i,j}$. The unknowns are the location data and random numbers used to generate the vectors. From the inner products of $\varepsilon$ encoded predicate vectors and $w$ encoded attribute vectors, a system of $\varepsilon \times w$ equations and $O(\varepsilon + w)$ unknowns can be generated. If $\varepsilon$ and $w$ are big enough, the system is overdetermined. There are polynomial time algorithms [13], [14] for solving overdetermined nonlinear polynomial systems. Though the generated system has many solutions, an attacker knowing enough plaintext samples can reveal user locations from the solutions.

Since the complexity of recovering inner products is $O(\tau_2^{d+2})$, we prevent the attack above by configuring big enough $\tau_2$ and $d$. $d$ is a configurable parameter in our IPRE scheme, while, as demonstrated in our EPLQ solution, $\tau_2$ can be configured by controlling the generation of predicate/attribute vectors. In our prototype, $\tau_2 = 10^8$ and $d = 2$. Then, the complexity of the attack is over $O(2^{106})$.
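The figure of $2^{106}$ can be reproduced with a short calculation. The sketch below simply counts the candidate combinations $(\tau_2 + 1) \times \tau_2 \times \cdots \times (\tau_2 - d)$ for the prototype setting; the variable names are chosen for this example only.

```python
import math

tau2 = 10 ** 8
d = 2

# Product of the d + 2 factors (tau2 - d), ..., tau2, (tau2 + 1).
candidates = math.prod(range(tau2 - d, tau2 + 2))

# Roughly 106.3 bits of work for an exhaustive search.
print(math.log2(candidates))
```

The count is slightly below $10^{32}$, which lies between $2^{106}$ and $2^{107}$.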

VII. PERFORMANCE EVALUATION

In this section, we evaluate the performance of the proposed EPLQ solution in terms of communication cost, computational cost, storage cost, and accuracy.

A. Implementation and Experimental Settings

We have implemented EPLQ in Java, and the user interface is shown in Fig. 5. We tested EPLQ’s performance in a testbed of two workstations and one Android phone. These machines play the roles of LBS provider, cloud, and mobile LBS user, respectively. The hardware and software of these machines are shown in Table II(a), while the parameter settings are shown in Table II(b).

In addition, three datasets are extracted from the OpenStreetMap project (www.openstreetmap.org) for our experiments. The POIs in the datasets are the restaurants from New York, California, and France. Their distributions are shown in Fig. 6, and the POI counts of the three datasets are 1676, 6994, and 28 340, respectively.

B. Computational Cost at User Side

To generate a query, an LBS user needs to encrypt two predicate vectors, which requires $2n$ modular exponentiations, about $2n^2$ multiplications, and about $2n^2$ additions, where $n$ is the length of encoded vectors. To see whether the computational cost is acceptable for mobile LBS users, we measured the query generation latency. We let the Android phone in our testbed generate 1000 queries, and the average latency per query generation is about 0.9 s.

Fig. 5. User interface of EPLQ.

TABLE II: EXPERIMENTAL SETTING

Fig. 6. POI distributions of experimental datasets including restaurants in (a) New York, (b) California, and (c) France.

C. LBS Provider’s Computational Cost

During system setup, the LBS provider needs to encrypt POI records, set up IPRE, and build the ŝs-tree. Because the cost of record encryption is the same as that in most solutions, we evaluate only the computational cost related to IPRE and the ŝs-tree here. The cost is evaluated by system setup latency, which is defined as the time used to set up IPRE and build the ŝs-tree. As shown in Table III, the latencies for the three datasets are between 1 and 3 h. Considering that system setup is conducted only once, the computational cost is acceptable.


TABLE III: SYSTEM SETUP LATENCY

Fig. 7. POI query latency at cloud side. Note that the latency should be much lower once deployed at a real cloud.

D. Cloud’s Computational Cost

Recall that searching the ŝs-tree to find matched records requires scanning O(log N + R) tree nodes for a database with N records and a query having R matched records. Determining whether a record or tree node matches a query requires computing Check($K_i$, $C_j$). Computing the function requires n = 37 pairings and multiplications.

To see whether the computational cost of searching the database is acceptable, we conducted experiments on three datasets. For each dataset, 1000 random query points are chosen. If a query point’s location is not near any POI, the query is not realistic and the query latency is lower than that in normal situations. To avoid that, in the experiments, each query point’s location is the same as that of one random POI. For each query point, we generate three queries with radii of 500 m, 1 km, and 2 km, respectively. Therefore, 3000 queries are generated for each dataset. We measured the average search latency of these queries for each dataset, and the results are shown in Fig. 7. As expected, the latency increases very slowly with the POI count and query radius. In the experiments, a workstation plays the role of the cloud, and only four CPU cores can be utilized for the computation. A real cloud has much more computing resources, and the query latency at a real cloud should be much lower.

E. Communication Cost and Storage Cost

To make a query, an LBS user sends two tokens to the cloud. The communication cost is $O(n \times \log p)$. Under the settings in Table II(b), the traffic is 4.75 KB, which is acceptable. The user has to store the predicate encryption key PK and the pairing parameter locally. The storage usage is dominated by $M$, which is about 27 KB. This is negligible even for a mobile LBS user.

Let N be the number of POI records in the database. In addition to LBS data, the cloud also needs to store the public parameter of IPRE and an ŝs-tree of less than 4N/3 nodes (for the case of $m_{min} = 4$). The size of the public parameter is dominated by $(\Omega_k)_{k=\tau_1}^{\tau_2}$, which is $(\tau_2 - \tau_1 + 1) \times \lceil \log |Hash()| \rceil$ bits. It is about 763 MB for the settings in Table II(b). A tree node’s size is around 1.3 KB. Assume that there are 1 million records in the database. The cloud needs at most 2.42 GB of storage space in total. The public parameter and the ŝs-tree can fit in the memory of even one single server. Therefore, the storage cost is acceptable. The LBS provider sends the public parameter and the tree to the cloud only once. The communication cost is also acceptable.
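These storage figures can be cross-checked with a back-of-the-envelope calculation. Table II(b) is not reproduced here, so the sketch below relies on assumptions: a 160-bit field order $p$ (the minimum stated in Section V), $n = 37$, $\tau_2 = 10^8$ (the prototype setting from Section VI-B), a 64-bit hash output, and binary units (1 KB = 1024 bytes).

```python
n = 37                     # length of encoded vectors
p_bits = 160               # assumed size of the field order p
tau1, tau2 = 0, 10 ** 8    # inner product range of the prototype
hash_bits = 64             # assumed hash output length
num_records = 10 ** 6      # N, number of POI records
node_kib = 1.3             # approximate size of one tree node

# Secret matrix M: n x n field elements.
m_kib = n * n * p_bits / 8 / 1024

# Public parameter: one hash value per integer in [tau1, tau2].
pp_bytes = (tau2 - tau1 + 1) * hash_bits / 8
pp_mib = pp_bytes / 2 ** 20

# Tree: fewer than 4N/3 nodes.
tree_gib = (4 * num_records / 3) * node_kib * 1024 / 2 ** 30
total_gib = tree_gib + pp_bytes / 2 ** 30

print(round(m_kib, 1), round(pp_mib, 1), round(total_gib, 2))
```

Under these assumptions the matrix takes about 27 KB, the public parameter about 763 MB, and the total comes to roughly 2.4 GB, in line with the figures quoted above.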

F. Accuracy

As discussed in Section IV, using a hash function in the IPRE scheme reduces the size of the public parameter but introduces some false positives. This will not hurt the accuracy of the EPLQ solution. The false positive rate is $(\tau_2 - \tau_1 + 1)/|Hash()|$, which is about $5.42 \times 10^{-12}$. O(log N + R) tree nodes are scanned during a query. This number is at most a few hundred. Then, the probability that a query result contains false positive(s) is at most a few hundred times $5.42 \times 10^{-12}$, which is negligible.
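The rate quoted above is consistent with a 64-bit hash output and $\tau_2 = 10^8$. Both values are assumptions here, since Table II(b) is not reproduced, as is the figure of 300 scanned nodes used for the per-query bound.

```python
tau1, tau2 = 0, 10 ** 8
hash_bits = 64                 # assumed hash output length
nodes_scanned = 300            # hypothetical "few hundred" nodes per query

# Probability that one Check call reports a false positive.
rate = (tau2 - tau1 + 1) / 2 ** hash_bits

# Union bound on the probability that a query result has any false positive.
per_query_bound = nodes_scanned * rate

print(f"{rate:.3e}")             # about 5.42e-12
print(f"{per_query_bound:.3e}")  # about 1.6e-09
```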

VIII. RELATED WORKS

Our work is related to not only privacy-preserving LBS but also privacy-preserving query over outsourced encrypted data. In this section, we introduce some related works that can be used to realize privacy-preserving POI query, though some of them are not designed for POI query or LBS. In the literature, there are four kinds of privacy-preserving queries over POIs: spatial range query [1], nearest neighbor (NN) query [15], K nearest neighbors (KNN) query [2], [16]–[20], and multidimensional range query [20]–[26]. Spatial range query cannot be replaced by NN and KNN queries, which return the nearest neighbor(s) to a given location and therefore serve different purposes. Sometimes spatial range query may be replaced by multidimensional range query, which returns POIs in a rectangular area instead of a circular area. However, the resulting inaccuracy is not desirable. Next, we review the works applicable to privacy-preserving spatial range query.

A. Solutions Applicable to Outsourced LBS

1) Privacy-Preserving Spatial Range Query Based on Coordinate Transformation: In the solution based on coordinate transformation [1], the coordinates of queries and POIs in the original coordinate system are transformed to new coordinates in a new coordinate system. After the transformation, the distance information of any two points is still preserved. Coordinate transformation is very efficient, and the returned results are accurate. However, solutions based on coordinate transformation are vulnerable to known-sample attacks [2].

2) Privacy-Preserving POI Query Based on PIR: As far as we know, only PIR-based solutions [3], [4] can protect privacy in both public LBS and outsourced LBS. Private information retrieval (PIR) [5] is a privacy primitive hiding the retrieved data item’s ID from the database server(s). Because the data items being retrieved are hidden from the database server(s), whether two queries’ results are the same or not is undetectable. Therefore, PIR-based solutions are resilient to access-pattern attacks. PIR can be used to realize all four kinds of POI queries. However, PIR is costly in both communication and computation [6] for the following reasons. PIR requires linearly scanning all POI records including their location data (coordinates and radii) and nonlocation data. Moreover, to use PIR in LBS, an LBS user must additionally access the LBS database’s index data in a privacy-preserving manner. PIR can retrieve records if given their IDs. To support spatial range query, an LBS user should obtain nearby POIs’ record IDs from index data in a privacy-preserving manner. PIR or other techniques may be used to obtain such IDs.

B. Solutions for Public LBS Only

1) Privacy-Preserving LBS Based on Anonymous Communication: In this kind of solution [27], [28], one or more third parties relay messages between users and the LBS provider. This approach hides the linkage between user identities and messages from the LBS provider. The query area would be exposed to the LBS provider, but the user sending the query is hidden among a set of users.

2) Privacy-Preserving LBS Based on Location Obfuscation: In this kind of solution [29], [30], to prevent the LBS provider from knowing users’ precise locations, users submit low-precision locations or fake locations along with real locations. These solutions offer a weak level of privacy.

3) Privacy-Preserving LBS Based on Spatial Cloaking: This kind of solution [31], [32] combines anonymous communication and location obfuscation techniques. To the LBS provider, a user cannot be identified from a set of users in a cloaking area, and the cloaking area instead of users’ precise locations is sent to the LBS provider.

All the above solutions can be applied to a wide range of LBS including POI query. However, their techniques do not allow the cloud to search encrypted data. Therefore, they cannot be used for outsourced LBS where LBS data in the cloud are encrypted.

IX. CONCLUSION

In this paper, we have proposed EPLQ, an efficient privacy-preserving spatial range query solution for smart phones, which preserves the privacy of user location and achieves confidentiality of LBS data. To realize EPLQ, we have designed an IPRE scheme and a novel privacy-preserving index tree named ŝs-tree. EPLQ’s efficacy has been evaluated with theoretical analysis and experiments, and detailed analysis shows its security against known-sample attacks and ciphertext-only attacks. Our techniques have potential usages in other kinds of privacy-preserving queries. If a query can be performed through comparing inner products to a given range, the proposed IPRE and ŝs-tree may be applied to realize the privacy-preserving query. Two potential usages are privacy-preserving similarity query and long spatial range query. In the future, we will design solutions for these scenarios and identify more usages.

REFERENCES

[1] A. Gutscher, “Coordinate transformation—A solution for the privacy problem of location based services?” in Proc. 20th Int. Parallel Distrib. Process. Symp. (IPDPS’06), Rhodes Island, Greece, Apr. 25–29, 2006, p. 424.

[2] W. K. Wong, D. W.-l. Cheung, B. Kao, and N. Mamoulis, “Secure kNN computation on encrypted databases,” in Proc. SIGMOD, 2009, pp. 139–152.

[3] G. Ghinita, P. Kalnis, A. Khoshgozaran, C. Shahabi, and K.-L. Tan, “Private queries in location based services: Anonymizers are not necessary,” in Proc. SIGMOD, 2008, pp. 121–132.

[4] X. Yi, R. Paulet, E. Bertino, and V. Varadharajan, “Practical k nearest neighbor queries with location privacy,” in Proc. 30th Int. Conf. Data Eng. (ICDE), 2014, pp. 640–651.

[5] B. Chor, E. Kushilevitz, O. Goldreich, and M. Sudan, “Private information retrieval,” J. ACM, vol. 45, no. 6, pp. 965–981, 1998.

[6] F. Olumofin and I. Goldberg, “Revisiting the computational practicality of private information retrieval,” in Financial Cryptography and Data Security. New York, NY, USA: Springer, 2012, pp. 158–172.

[7] J. Katz, A. Sahai, and B. Waters, “Predicate encryption supporting disjunctions, polynomial equations, and inner products,” in Proc. 27th Ann. Int. Conf. Theory Appl. Cryptograph. Tech. Adv. Cryptol. (EUROCRYPT’08), Istanbul, Turkey, Apr. 13–17, 2008, pp. 146–162.

[8] D. Boneh and B. Waters, “Conjunctive, subset, and range queries on encrypted data,” in Proc. 4th Theory Cryptograph. Conf. (TCC’07), Amsterdam, The Netherlands, Feb. 21–24, 2007, pp. 535–554.

[9] D. Boneh and M. K. Franklin, “Identity-based encryption from the Weil pairing,” SIAM J. Comput., vol. 32, no. 3, pp. 586–615, 2003.

[10] D. A. White and R. Jain, “Similarity indexing with the ss-tree,” in Proc. 12th Int. Conf. Data Eng. (ICDE), 1996, pp. 516–523.

[11] A. Guttman, “R-trees: A dynamic index structure for spatial searching,” in Proc. Annu. Meeting (SIGMOD’84), Boston, MA, USA, Jun. 18–21, 1984, pp. 47–57.

[12] T. K. Dang, J. Küng, and R. Wagner, “The sh-tree: A super hybrid index structure for multidimensional data,” in Proc. 12th Int. Conf. Database Expert Syst. Appl. (DEXA’01), Munich, Germany, Sep. 3–5, 2001, pp. 340–349.

[13] B.-Y. Yang and J.-M. Chen, “All in the XL family: Theory and practice,” in Proc. Int. Conf. Inf. Secur. Cryptol., 2004, pp. 67–86.

[14] G. Ars, J.-C. Faugere, H. Imai, M. Kawazoe, and M. Sugita, “Comparison between XL and Gröbner basis algorithms,” in Proc. ASIACRYPT, 2004, pp. 338–353.

[15] B. Yao, F. Li, and X. Xiao, “Secure nearest neighbor revisited,” in Proc. IEEE 29th Int. Conf. Data Eng. (ICDE’13), 2013, pp. 733–744.

[16] Y. Elmehdwi, B. K. Samanthula, and W. Jiang, “Secure k-nearest neighbor query over encrypted data in outsourced environments,” in Proc. IEEE 30th Int. Conf. Data Eng. (ICDE), 2014, pp. 664–675.

[17] A. Khoshgozaran and C. Shahabi, “Blind evaluation of nearest neighbor queries using space transformation to preserve location privacy,” in Advances in Spatial and Temporal Databases. New York, NY, USA: Springer, 2007, pp. 239–257.

[18] B. Hore, S. Mehrotra, M. Canim, and M. Kantarcioglu, “Secure multidimensional range queries over outsourced data,” VLDB J., vol. 21, no. 3, pp. 333–358, 2012.

[19] I.-T. Lien, Y.-H. Lin, J.-R. Shieh, and J.-L. Wu, “A novel privacy preserving location-based service protocol with secret circular shift for k-NN search,” IEEE Trans. Inf. Forensics Secur., vol. 8, no. 6, pp. 863–873, Jun. 2013.

[20] M. L. Yiu, G. Ghinita, C. S. Jensen, and P. Kalnis, “Enabling search services on outsourced private spatial data,” VLDB J., vol. 19, no. 3, pp. 363–384, 2010.

[21] E. Shi, J. Bethencourt, T.-H. Chan, D. Song, and A. Perrig, “Multi-dimensional range query over encrypted data,” in Proc. IEEE Symp. Secur. & Privacy, 2007, pp. 350–364.

[22] B. Wang, Y. Hou, M. Li, H. Wang, and H. Li, “Maple: Scalable multi-dimensional range search over encrypted cloud data with tree-based index,” in Proc. 9th ACM Symp. Inf. Comput. Commun. Secur., 2014, pp. 111–122.

[23] R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu, “Order preserving encryption for numeric data,” in Proc. SIGMOD, 2004, pp. 563–574.

[24] J. Shao, R. Lu, and X. Lin, “Fine: A fine-grained privacy-preserving location-based service framework for mobile devices,” in Proc. IEEE INFOCOM, 2014, pp. 244–252.

[25] A. Boldyreva, N. Chenette, Y. Lee, and A. O’Neill, “Order-preserving symmetric encryption,” in Proc. EUROCRYPT, 2009, pp. 224–241.

[26] P. Wang and C. Ravishankar, “Secure and efficient range queries on outsourced databases using Rp-trees,” in Proc. Int. Conf. Data Eng. (ICDE), 2013, pp. 314–325.

[27] A. R. Beresford and F. Stajano, “Location privacy in pervasive computing,” Pervasive Comput., vol. 2, no. 1, pp. 46–55, Jan./Mar. 2003.

[28] Y. Zhu, D. Ma, D. Huang, and C. Hu, “Enabling secure location-based services in mobile cloud computing,” in Proc. 2nd ACM SIGCOMM Workshop Mobile Cloud Comput., 2013, pp. 27–32.

[29] H. Kido, Y. Yanagisawa, and T. Satoh, “An anonymous communication technique using dummies for location-based services,” in Proc. Int. Conf. Perv. Serv. (ICPS), 2005, pp. 88–97.

[30] C. A. Ardagna, M. Cremonini, E. Damiani, S. D. C. Di Vimercati, and P. Samarati, “Location privacy protection through obfuscation-based techniques,” in Proc. Data Appl. Secur. XXI, 2007, pp. 47–60.

[31] M. Gruteser and D. Grunwald, “Anonymous usage of location-based services through spatial and temporal cloaking,” in Proc. 1st Int. Conf. Mobile Syst. Appl. Serv., 2003, pp. 31–42.

[32] M. F. Mokbel, C.-Y. Chow, and W. G. Aref, “The new Casper: Query processing for location services without compromising privacy,” in Proc. 32nd Int. Conf. Very Large Data Bases (VLDB’06), 2006, pp. 763–774.

Lichun Li received the Bachelor’s degree in information engineering from the Beijing University of Posts and Telecommunications, Beijing, China, in 2002, the Master’s degree in communication and information systems from the China Academy of Telecommunication Technology, Beijing, China, in 2006, and the Ph.D. degree in computer science from the Beijing University of Posts and Telecommunications, Beijing, China, in 2009.

He is currently a Postdoctoral Research Fellow with the INFINITUS Laboratory, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. His research interests include privacy and security in cloud and big data.

Rongxing Lu (S’09–M’11–SM’15) received the Ph.D. degree in computer science from Shanghai Jiao Tong University, Shanghai, China, in 2006, and the Ph.D. degree in electrical and computer engineering from the University of Waterloo, Waterloo, ON, Canada, in 2012.

From May 2012 to April 2013, he was a Postdoctoral Fellow with the University of Waterloo. Since May 2013, he has been an Assistant Professor with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. His research interests include computer network security, mobile and wireless communication security, and applied cryptography.

Dr. Lu was the recipient of the Canada Governor General Gold Medal.

Cheng Huang received the B.Eng. degree in information security from Xidian University, Xi’an, China, in 2013.

He is currently a Project Officer with the INFINITUS Laboratory, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. His research interests include applied cryptography, cyber security, and privacy.
