A Customizable k-Anonymity Model for Protecting Location Privacy Written by: B. Gedik, L.Liu Presented by: Tal Shoseyov



Agenda

• Introduction: Data Privacy & Anonymity

• k-Anonymity & location k-Anonymity

• Previous models

• The CliqueCloak Theorem

• The CliqueCloak Algorithm

• Different CliqueCloak variations


Our Key Players

Client – A user with network access who is a customer of a service.

Database – An application that manages data as records. For example, an LBS (Location Based Service) holds information on services in specific locations.

Server – A computer that receives requests from a client and passes them on to the requested service.

Record – A data unit on the database that contains information by attributes (name, age, color, …)

Subject – A person or organization that is associated with a record in a database (a hospital patient, a company’s employee, a customer, …)


Client – Database Workflow

Participants: Client (10.0.0.1), Server (10.0.0.99), Database (T)

1. Message 1 from 10.0.0.1 to 10.0.0.99: Retrieve record from T where “color”=“red”

2. Query (server to database): Select “record” from T where “color” = “red”

3. Reply (database to server)

4. Message 1 from 10.0.0.99 to 10.0.0.1: Reply


The Database Privacy Problem

“How can a client efficiently retrieve a record from an untrusted database without having the database know information about the record in question?”

Participants: Client, Server, Database

Play 1: Query: Retrieve record from T where “color”=“red” (!)

Play 2: Query: Retrieve record from T where “color”=“any” (?)


The Subject Anonymity Problem

“How can a database prevent untrusted clients from identifying the subject of a record, while the contents of the record remain useful to the user?”

Participants: Client, Server, Database

Query: Retrieve record from T where “sex” = […] and “salary” = […]

[Figure: replies to the query in two scenarios, Play 1 and Play 2]


The Location Based Service Workflow

Participants: Client, Server, LBS Database (Location Based Service)

1. Request (client to server): Retrieve all available services in client’s location

2. Forward to local service (server to LBS): Retrieve all available services in location

3. Reply (LBS to server)

4. Reply (server to client)


The Location Anonymity Problem

Participants: Client, Server, LBS Database (Location Based Service)

Request: Retrieve all bus lines from location to address


Anonymity

“A message from a database to a client is called anonymous if the subject of the message cannot be distinguished from other subjects.”


Location Anonymity

“A message from a client to a database is called location anonymous if the client’s identity cannot be distinguished from other users based on the client’s location information.”


k-Anonymity

“A message from a database to a client is called subject k-anonymous if the subject of the message cannot be distinguished from k-1 other subjects.”

“A message from a client to a database is called location k-anonymous if the database cannot tell the client apart from k-1 other clients based on the client’s location.”
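As a rough illustration of the location k-anonymity definition, the sketch below counts how many clients are consistent with the location reported in a message. The function names, the rectangle layout and the sample coordinates are assumptions made for this example; they do not come from the presentation.

```python
from typing import Dict, Set, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def candidates_in_region(region: Rect, clients: Dict[str, Point]) -> Set[str]:
    """Return the ids of all clients whose position lies inside the reported region."""
    x_min, y_min, x_max, y_max = region
    return {
        uid
        for uid, (x, y) in clients.items()
        if x_min <= x <= x_max and y_min <= y <= y_max
    }


def is_location_k_anonymous(region: Rect, clients: Dict[str, Point], k: int) -> bool:
    """The reported region hides the sender among at least k clients,
    i.e. the sender cannot be told apart from k-1 other clients."""
    return len(candidates_in_region(region, clients)) >= k


# Hypothetical client positions, for illustration only.
clients = {"a": (1.0, 1.0), "b": (2.0, 1.5), "c": (9.0, 9.0)}
exact_point = (1.0, 1.0, 1.0, 1.0)  # precise location: only "a" matches
cloaked = (0.0, 0.0, 3.0, 3.0)      # enlarged region: "a" and "b" match

print(is_location_k_anonymous(exact_point, clients, 2))  # False
print(is_location_k_anonymous(cloaked, clients, 2))      # True
```

Reporting the exact point identifies the sender, while the enlarged region leaves two possible senders.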


Implementation of Location Anonymity

1. Client sends a plain request to the server.

2. Server transforms the message by “anonymizing” the location data in the message.

3. Server sends the “anonymized” message to the database.

4. Database executes the request according to the received anonymous data.

5. Database replies to the server with the compiled data.

6. Server forwards the data to the client.


Implementation of Location k-Anonymity

Spatial Cloaking – Setting a range of space to be a single box, where all clients located within the range are said to be in the “same location”.

Temporal Cloaking – Setting a time interval, where all the clients in a specific location sending a message in that time interval are said to have sent the message at the “same time”.

[Figure: a spatial box on x-y axes and a time interval on the t axis]
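One simple way to picture spatial and temporal cloaking is to snap exact values onto a fixed grid. The sketch below does that; the fixed cell size, the fixed interval length and the function names are assumptions for illustration, since the slide only says that a box and an interval are chosen, not how.

```python
import math
from typing import Tuple


def spatial_cloak(x: float, y: float, cell: float) -> Tuple[float, float, float, float]:
    """Replace an exact point by the grid cell (x_min, y_min, x_max, y_max) containing it."""
    x_min = math.floor(x / cell) * cell
    y_min = math.floor(y / cell) * cell
    return (x_min, y_min, x_min + cell, y_min + cell)


def temporal_cloak(t: float, interval: float) -> Tuple[float, float]:
    """Replace an exact timestamp by the (start, end) interval containing it."""
    start = math.floor(t / interval) * interval
    return (start, start + interval)


# Two clients at different points of the same cell report the "same location".
print(spatial_cloak(12.3, 47.9, 10.0))  # (10.0, 40.0, 20.0, 50.0)
print(spatial_cloak(18.0, 41.2, 10.0))  # (10.0, 40.0, 20.0, 50.0)

# A message sent at t=125 is reported as sent somewhere in [120, 180).
print(temporal_cloak(125.0, 60.0))      # (120.0, 180.0)
```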


Implementation of Location k-Anonymity

Spatial-Temporal Cloaking – Setting a range of space and a time interval, where all the messages sent by clients inside the range during that time interval are said to come from the “same location” at the “same time”. This spatial and temporal region is called a “cloaking box”.

[Figure: a cloaking box on x, y and t axes]


Previous solutions

M. Gruteser, D. Grunwald (2003) – For a fixed k value, the server finds the smallest area around the client’s location that potentially contains k-1 other clients, and monitors that area over time until k-1 such clients are found.

Drawback:

Fixed anonymity value for all clients (service dependent)
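The spatial part of the fixed-k search described above can be sketched as a quadtree-style descent. This is only an illustration: the subdivision strategy, the `min_size` stopping parameter and all names are assumptions, since the slide does not say how the smallest area is found, and the monitoring over time is left out.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def count_inside(box: Box, points: List[Point]) -> int:
    """Number of points lying inside the box (borders included)."""
    return sum(box[0] <= x <= box[2] and box[1] <= y <= box[3] for x, y in points)


def smallest_area(client: Point, others: List[Point], k: int,
                  area: Box, min_size: float = 1.0) -> Box:
    """Halve the area around `client` for as long as the quadrant holding the
    client still contains at least k clients (the client plus k-1 others)."""
    points = [client] + list(others)
    while area[2] - area[0] > min_size:
        x_mid = (area[0] + area[2]) / 2
        y_mid = (area[1] + area[3]) / 2
        quadrant = (
            area[0] if client[0] < x_mid else x_mid,
            area[1] if client[1] < y_mid else y_mid,
            x_mid if client[0] < x_mid else area[2],
            y_mid if client[1] < y_mid else area[3],
        )
        if count_inside(quadrant, points) < k:
            break  # the next quadrant would hold too few clients
        area = quadrant
    return area


others = [(3.0, 3.0), (13.0, 13.0)]  # hypothetical positions of other clients
print(smallest_area((1.0, 1.0), others, k=2, area=(0.0, 0.0, 16.0, 16.0)))
# (0.0, 0.0, 4.0, 4.0)
```

Note that `k` is a single argument shared by every client, which is the drawback named on the slide.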


The CliqueCloak Approach

Motivation:

A separate anonymity value for each message – Each client can choose the level of anonymity (k) for each of their messages.

Preventing useless information from being sent to the client – Limiting the spatial area and providing a time limit, after which the message expires.


The CliqueCloak Approach

Definitions:

Constraint Area:

For a message m, a constraint area is a spatial-temporal area that contains the sending client’s location. A client sends the message along with a constraint area to prevent the database from returning useless information about locations outside the constraint area.

[Figure: message m (k=3) on x-y axes]


The CliqueCloak Approach

Definitions:

Cloaking Box:

A spatial and temporal area assigned to a transformed message. A valid cloaking box must satisfy the following conditions:

1. The client that sent the message m is located in the cloaking box.

2. The number of different clients inside the cloaking box must be at least m.k (the anonymity level of the message).

3. The cloaking box must lie inside the message’s constraint area.

[Figure: messages m1 (k=2), m2 (k=3) and m4 (k=3) on x-y axes]
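The three validity conditions can be checked mechanically. The sketch below does so for the spatial part only; the box layout, the dictionary of client positions and the function names are assumptions made for this example.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def contains_point(box: Box, p: Point) -> bool:
    return box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]


def contains_box(outer: Box, inner: Box) -> bool:
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def is_valid_cloaking_box(box: Box, sender: str, k: int,
                          constraint: Box, clients: Dict[str, Point]) -> bool:
    """Check conditions 1-3 for the message sent by `sender` with anonymity level k."""
    cond1 = contains_point(box, clients[sender])                         # sender is inside
    cond2 = sum(contains_point(box, p) for p in clients.values()) >= k   # at least k clients
    cond3 = contains_box(constraint, box)                                # inside constraint area
    return cond1 and cond2 and cond3


# Hypothetical positions and constraint area.
clients = {"u1": (2.0, 2.0), "u2": (3.0, 4.0), "u3": (8.0, 8.0)}
constraint_u1 = (0.0, 0.0, 5.0, 5.0)

print(is_valid_cloaking_box((2.0, 2.0, 3.0, 4.0), "u1", 2, constraint_u1, clients))  # True
print(is_valid_cloaking_box((2.0, 2.0, 8.0, 8.0), "u1", 3, constraint_u1, clients))  # False
```

The second box holds three clients but reaches outside the sender's constraint area, so condition 3 fails.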


The CliqueCloak Approach

Definitions:

Constraint Graph:

Each mobile node is a vertex in the graph, and two nodes are connected iff each of them is inside the other node’s constraint area.

[Figure: messages m1 (k=2), m2 (k=3), m3 (k=2) and m4 (k=3) on x-y axes]

Approach:

An l-clique in that graph such that l ≥ mi.k for each message mi in the clique is mapped by the algorithm to a spatial cloaking box. All messages in the clique are transformed using that cloaking box, making the senders of those messages indistinguishable from one another.
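The constraint graph definition translates almost directly into code. The sketch below builds the graph as an adjacency dictionary; the data layout and the sample coordinates are assumptions for illustration.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def inside(p: Point, box: Box) -> bool:
    return box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]


def build_constraint_graph(messages: Dict[str, Tuple[Point, Box]]) -> Dict[str, Set[str]]:
    """messages maps an id to (P(m), B(m)). Two messages are connected iff
    each sender's point lies inside the other message's constraint area."""
    graph: Dict[str, Set[str]] = {mid: set() for mid in messages}
    for a, b in combinations(messages, 2):
        point_a, box_a = messages[a]
        point_b, box_b = messages[b]
        if inside(point_a, box_b) and inside(point_b, box_a):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical messages: id -> (location, constraint area).
messages = {
    "m1": ((2.0, 2.0), (0.0, 0.0, 5.0, 5.0)),
    "m2": ((3.0, 3.0), (1.0, 1.0, 6.0, 6.0)),
    "m3": ((4.0, 4.0), (3.0, 3.0, 9.0, 9.0)),
    "m4": ((8.0, 8.0), (6.0, 6.0, 10.0, 10.0)),
}
graph = build_constraint_graph(messages)
print(graph["m2"] == {"m1", "m3"})  # True
```

Here m3 lies inside m1's constraint area, but m1 lies outside m3's, so there is no edge between them: the relation has to hold in both directions.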

The CliqueCloak Theorem

Definitions:

A plain message (from client to server) m consists of:

• m.uid = Unique identifier of the sender

• m.rno = Message’s reference number

• P(m) = Message’s spatial point (e.g. the client’s current location).

• B(m) = Message’s spatial constraint area

• m.t = Message’s temporal constraint (expiration time)

• m.C = Message’s content

• m.k = Message’s anonymity level

The CliqueCloak Theorem

Definitions:

A transformed message (from server to database) mT consists of:

• m.uid , m.rno

• Bcl(m) = Message’s spatial cloaking box

• m.C
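The two message formats can be written down as plain records. The sketch below uses Python dataclasses; the field names, types and sample values are assumptions chosen to mirror the slide's notation.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


@dataclass(frozen=True)
class PlainMessage:
    uid: str          # m.uid, unique identifier of the sender
    rno: int          # m.rno, message reference number
    point: Point      # P(m), the sender's location
    constraint: Box   # B(m), spatial constraint area
    t: float          # m.t, temporal constraint (expiration time)
    content: str      # m.C
    k: int            # m.k, requested anonymity level


@dataclass(frozen=True)
class TransformedMessage:
    uid: str
    rno: int
    cloaking_box: Box  # Bcl(m), replaces the exact location
    content: str


def transform(m: PlainMessage, cloaking_box: Box) -> TransformedMessage:
    """Build mT from m: keep uid, rno and content, swap the point for the cloaking box."""
    return TransformedMessage(m.uid, m.rno, cloaking_box, m.content)


m = PlainMessage("u1", 7, (2.0, 2.0), (0.0, 0.0, 5.0, 5.0), 30.0, "nearest bus stop?", 2)
mt = transform(m, (2.0, 2.0, 3.0, 4.0))
print(mt)
```

The exact point, the constraint area, the expiration time and k stay on the server; only the cloaking box reaches the database.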


The CliqueCloak Theorem

Definitions:

S = the set of original messages

T = the set of anonymized messages


The CliqueCloak Theorem

Let:

• M a subset of S (a set of messages from different clients)

• Bcl(M) a spatial cloaking box

• For each m in M: m.k ≤ |M|

• For each m in M: mT = <m.uid, m.rno, Bcl(M), m.C>

Then:

For each m in M, m can be transformed into mT if and only if M forms an |M|-clique in G(S,E).


The CliqueCloak Theorem

Proof (if every m in M can be transformed into mT, then M is a clique):

• For each m and m’ in M: Bcl(M) is in mT and m’T.

→ m and m’ are in the same cloaking box Bcl(M).

→ P(m) is in the constraint area of m’ and vice versa, because a valid cloaking box contains both senders’ locations and lies inside both constraint areas.

→ There is an edge (m,m’) in G(S,E).

→ For every pair (m, m’), where m, m’ are in M, the edge (m,m’) is in G(S,E).

→ M forms a clique of size |M|.


The CliqueCloak Theorem

Proof (if M is a clique, then every m in M can be transformed into mT):

We build Bcl(M) as the box with minimal size that contains the locations of all |M| clients whose messages are in M.

We show that Bcl(M) is a valid cloaking box for all messages in M:

Condition 1: Bcl(M) includes the locations from all messages in M:

→ True, by definition of Bcl(M).

Condition 2: Bcl(M) contains |M| different clients, and m.k ≤ |M| for each m in M:

→ True, by definition of M.

Proof:

Condition 3: Bcl(M) is included inside B(m) for each m in M:

• For each m and for each n in M, there is an edge (m,n) in G(S,E).

→ For a given m, P(m) is in B(n) for each n in M.

→ P(m) is in the intersection of B(n) over all n in M (for every m).

→ Assuming the constraint areas are boxes, that intersection is itself a box, so by definition and minimality of Bcl(M), Bcl(M) is inside the intersection of B(n) over all n in M.

→ Bcl(M) is in B(n) for each n in M.

→ For each m in M, Bcl(M) is within the constraint area of m. □
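The argument for condition 3 can be checked numerically on a small clique. The sketch below assumes the constraint areas are axis-aligned boxes and uses made-up coordinates; it is a sanity check of the reasoning on one instance, not a proof.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def point_inside(p: Point, box: Box) -> bool:
    return box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]


def bounding_box(points: List[Point]) -> Box:
    """Bcl(M): the minimal box containing all sender locations."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def intersect(boxes: List[Box]) -> Box:
    """Intersection of boxes (assumed non-empty here)."""
    return (max(b[0] for b in boxes), max(b[1] for b in boxes),
            min(b[2] for b in boxes), min(b[3] for b in boxes))


def box_inside(inner: Box, outer: Box) -> bool:
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


# Three hypothetical messages: P(m) and B(m), listed in the same order.
points = [(2.0, 2.0), (3.0, 3.0), (4.0, 2.0)]
constraints = [(0.0, 0.0, 5.0, 5.0), (1.0, 1.0, 6.0, 6.0), (1.0, 0.0, 7.0, 4.0)]

# Clique: every point lies inside every constraint box.
is_clique = all(point_inside(p, b) for p in points for b in constraints)
bcl = bounding_box(points)
common = intersect(constraints)
print(is_clique, bcl, common, box_inside(bcl, common))
# True (2.0, 2.0, 4.0, 3.0) (1.0, 1.0, 5.0, 4.0) True
```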


The CliqueCloak Algorithm

The Idea:

• For each plain message, along with its constraints and anonymity level k, we try to find a k-clique in the constraint graph and convert the clique into a spatial cloaking box.

• Each of the messages inside the cloaking box is converted into a transformed message by replacing its location value with the cloaking box.

• We keep trying to find a cloaking box for a message until it expires (exceeds its temporal constraint).

The CliqueCloak Algorithm

Input:

• S = set of plain messages

Output:

• T = a set of transformed messages

The CliqueCloak Algorithm

while TRUE do
    pick a message m from S
    // Building constraint graph G
    N ← all messages in range B(m)
    for each n in N do
        if P(m) is in B(n) then
            add the edge (m,n) into G
    // Finding a subset M of S s.t. m is in M, m.k = |M|, for each n in M n.k ≤ |M|, and M forms a clique in G
    M ← local_k_search(m.k, m, G)
    if M ≠ Ø then
        // Building transformed messages from all messages in M
        Bcl(M) ← the minimal area that contains M
        for each n in M do
            remove n from S
            remove n from G
            nT ← < n.uid, n.rno, Bcl(M), n.C >
            output transformed message nT
    remove expired messages from S
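The loop above can be sketched in Python as follows. Several simplifications are assumed: messages are processed once, in arrival order, instead of inside an endless loop; a brute-force `find_clique` stands in for local_k_search; and the dictionary keys (`p`, `box`, `k`, `sent`, `expires`, `content`) are names invented for this example.

```python
from itertools import combinations


def inside(p, box):
    """True if point p = (x, y) lies in box = (x_min, y_min, x_max, y_max)."""
    return box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]


def find_clique(m, graph, msgs):
    """Brute-force stand-in for local_k_search (exponential in the worst case)."""
    k = msgs[m]["k"]
    candidates = sorted(n for n in graph[m] if msgs[n]["k"] <= k)
    for subset in combinations(candidates, k - 1):
        if all(b in graph[a] for a, b in combinations(subset, 2)):
            return {m, *subset}
    return set()


def remove_node(graph, node):
    for neighbour in graph.pop(node):
        if neighbour in graph:
            graph[neighbour].discard(node)


def clique_cloak(arrivals):
    """arrivals: list of (id, message dict) in arrival order.
    Returns {id: (cloaking_box, content)} for every message that was cloaked."""
    msgs, graph, output = {}, {}, {}
    for mid, msg in arrivals:
        # Remove expired messages.
        for old in [o for o in graph if msgs[o]["expires"] < msg["sent"]]:
            remove_node(graph, old)
        # Add m to the constraint graph.
        msgs[mid] = msg
        graph[mid] = set()
        for n in graph:
            if n != mid and inside(msgs[n]["p"], msg["box"]) and inside(msg["p"], msgs[n]["box"]):
                graph[mid].add(n)
                graph[n].add(mid)
        # Search for a clique and emit the transformed messages.
        clique = find_clique(mid, graph, msgs)
        if clique:
            xs = [msgs[n]["p"][0] for n in clique]
            ys = [msgs[n]["p"][1] for n in clique]
            box = (min(xs), min(ys), max(xs), max(ys))
            for n in clique:
                output[n] = (box, msgs[n]["content"])
                remove_node(graph, n)
    return output


arrivals = [
    ("a", {"p": (2.0, 2.0), "box": (0.0, 0.0, 5.0, 5.0), "k": 2, "sent": 0, "expires": 10, "content": "q1"}),
    ("b", {"p": (3.0, 3.0), "box": (1.0, 1.0, 6.0, 6.0), "k": 3, "sent": 1, "expires": 10, "content": "q2"}),
    ("c", {"p": (4.0, 2.0), "box": (1.0, 0.0, 7.0, 4.0), "k": 2, "sent": 2, "expires": 10, "content": "q3"}),
]
result = clique_cloak(arrivals)
print(sorted(result))  # ['a', 'c']
```

In this run the arrival of c completes a 2-clique with a, so both are cloaked together, while b (which asked for k=3) stays pending.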


The CliqueCloak Algorithm

local_k_search(k, m, G)

    // Find a group U of neighbors to m in G s.t. their anonymity value doesn’t exceed k
    U ← { n | (m,n) is an edge in G and n.k ≤ k }
    if |U| < k-1 then
        return Ø
    // Remove members of U with fewer than k-2 neighbors in U, since they cannot be part of a (k-1)-clique
    l ← 0
    while l ≠ |U| do
        l ← |U|
        for each u in U do
            if |{G neighbors of u in U}| < k-2 then
                U ← U \ {u}
    // Look for a k-clique made of m and members of U
    find any subset M in U s.t. |M| = k-1 and M ∪ {m} forms a clique
    return M ∪ {m} (or Ø if no such subset exists)
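A possible Python rendering of local_k_search is given below. The graph is assumed to be a dictionary mapping each message id to the set of its neighbors, and `k_of` (a name invented here) maps each id to its anonymity level. The final clique search is done by brute force, which the slide leaves unspecified.

```python
from itertools import combinations
from typing import Dict, Set


def local_k_search(k: int, m: str, graph: Dict[str, Set[str]],
                   k_of: Dict[str, int]) -> Set[str]:
    # Neighbors of m whose anonymity level does not exceed k.
    U = {n for n in graph[m] if k_of[n] <= k}
    if len(U) < k - 1:
        return set()
    # Repeatedly drop members with fewer than k-2 neighbors inside U.
    size = 0
    while size != len(U):
        size = len(U)
        for u in list(U):
            if len(graph[u] & U) < k - 2:
                U.discard(u)
    # Brute-force search for k-1 members of U that form a clique with m.
    for subset in combinations(sorted(U), k - 1):
        if all(b in graph[a] for a, b in combinations(subset, 2)):
            return set(subset) | {m}
    return set()


# Hypothetical constraint graph and anonymity levels.
graph = {
    "m1": {"m2", "m3", "m4"},
    "m2": {"m1", "m3"},
    "m3": {"m1", "m2"},
    "m4": {"m1"},
}
k_of = {"m1": 3, "m2": 2, "m3": 3, "m4": 2}

print(sorted(local_k_search(3, "m1", graph, k_of)))  # ['m1', 'm2', 'm3']
print(local_k_search(2, "m4", graph, k_of))          # set()
```

For m1, the pruning step removes m4 (it has no neighbor inside U), leaving m2 and m3, which form a 3-clique with m1. For m4, the only neighbor m1 demands k=3, so no 2-clique is acceptable.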


Alternative CliqueCloak Algorithms

• NBR-k search

• Deferred CliqueCloak

NBR-k Search

For a given plain message m, we search for the maximal k value, among those of m and m’s neighbors, for which local_k_search gives us a valid subset M.

Nbr-k_search(m, G)

    // No solution for groups with fewer than m.k members
    if there are < m.k-1 neighbors to m in G then
        return Ø
    // Getting all k values of m and m’s neighbors
    L ← {n.k | n = m or n is a neighbor of m}
    for all distinct k in L in decreasing order do
        // No valid set M was found for k ≥ m.k
        if k < m.k then
            return Ø
        // Looking for a valid k-clique that includes m
        M ← local_k_search(k, m, G)
        // Valid set M found with maximal k value
        if M ≠ Ø then
            return M
    return Ø
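The NBR-k search can be sketched on top of a compact local search. In the sketch below the local search omits the pruning step for brevity and uses brute force, so it is a stand-in rather than the exact procedure; `k_of` and the sample graph are assumptions for illustration.

```python
from itertools import combinations
from typing import Dict, Set


def local_k_search(k: int, m: str, graph: Dict[str, Set[str]],
                   k_of: Dict[str, int]) -> Set[str]:
    """Compact stand-in: brute-force clique search without the pruning step."""
    U = sorted(n for n in graph[m] if k_of[n] <= k)
    for subset in combinations(U, k - 1):
        if all(b in graph[a] for a, b in combinations(subset, 2)):
            return set(subset) | {m}
    return set()


def nbr_k_search(m: str, graph: Dict[str, Set[str]], k_of: Dict[str, int]) -> Set[str]:
    if len(graph[m]) < k_of[m] - 1:
        return set()                      # too few neighbors for any valid clique
    levels = {k_of[n] for n in graph[m]} | {k_of[m]}
    for k in sorted(levels, reverse=True):
        if k < k_of[m]:
            return set()                  # nothing found for any k >= m.k
        M = local_k_search(k, m, graph, k_of)
        if M:
            return M                      # valid set with the largest k
    return set()


# Hypothetical triangle: a and c ask for k=2, b asks for k=3.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
k_of = {"a": 2, "b": 3, "c": 2}

print(sorted(local_k_search(k_of["c"], "c", graph, k_of)))  # ['a', 'c']
print(sorted(nbr_k_search("c", graph, k_of)))               # ['a', 'b', 'c']
```

Searching only at c's own level (k=2) cloaks a and c and leaves b waiting. Trying the neighbors' higher level first (k=3) cloaks all three messages at once.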

Deferred CliqueCloak

In order to minimize clique searches for a plain message m, the server delays the search until m has at least α*m.k neighbors (with α ≥ 1) in the constraint graph, thus increasing the likelihood that a matching cloaking box will be found.

Participants: Client, Server, LBS

1. Client sends message m to the server.

2. Server waits until there are α*m.k neighbors to m in the constraint graph before looking for a valid cloaking box.

3. After finding a valid cloaking box for transforming m, the server sends the transformed message to the LBS.
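The deferral rule reduces to a single comparison before each search. The sketch below shows that test; the function name, the `k_of` dictionary and the arrival sequence are assumptions made for this example.

```python
from typing import Dict, Set


def ready_for_search(m: str, graph: Dict[str, Set[str]],
                     k_of: Dict[str, int], alpha: float = 1.0) -> bool:
    """Deferred CliqueCloak test: True once m has at least alpha * m.k neighbors."""
    if alpha < 1.0:
        raise ValueError("alpha must be at least 1")
    return len(graph[m]) >= alpha * k_of[m]


# Hypothetical message m with k=2 and alpha=1.5, so the threshold is 3 neighbors.
graph = {"m": set()}
k_of = {"m": 2}
history = []
for newcomer in ["a", "b", "c"]:
    graph["m"].add(newcomer)  # a new neighbor of m appears in the constraint graph
    history.append(ready_for_search("m", graph, k_of, alpha=1.5))

print(history)  # [False, False, True]
```

A larger α means fewer, later searches with a better chance of success, at the cost of extra waiting time for the message.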


Q&A