Lecture 19: 590.03 Fall 12 1
Location Privacy
CompSci 590.03
Instructor: Ashwin Machanavajjhala
Some slides are from a tutorial by Mohamed Mokbel (ICDM 2008)
news.consumerreports.org
Outline
• Location based services
• Location Privacy Challenges
• Achieving Location Privacy
– Concepts
– Solutions
• Open Questions
Location Based services
“Imagine being a victim of cardiac arrest with about ten minutes to live, and first responders more than ten minutes away. A CPR-trained passerby gets a mobile ping from the fire department that someone nearby needs help; the good Samaritan then rushes to your side, administers CPR, and keeps you alive long enough to get professional help.”
Mayor of Starbucks Today, Local Hero Tomorrow: The Power and Privacy Pitfalls of Location Sharing
Julie Adler, June 2011
Location Based Services
• Location based Traffic Reports
– How many cars on 15-501?
– What is the shortest travel time?
• Location based Search
– “showtimes near me”
– Is there an ophthalmologist within 3 miles of my current location?
– What is the nearest gas station?
• Location based advertising/recommendation
– Starbucks (.5 miles away) is giving away free lattes.
[Slide labels: Analysis of location data; User initiated; System Initiated]
Location Based Services
[Figure labels: GIS / Spatial Databases; Mobile Devices; Internet; GPS Devices; Yahoo! Maps; Google Maps; …]
Privacy Threats
http://www.thereporteronline.com/article/20121102/NEWS01/121109915/man-accused-of-stalking-hatfield-woman
http://wifi.weblogsinc.com/2004/09/24/companies-increasingly-use-gps-enabled-cell-phones-to-track/
GPS Act (http://www.wyden.senate.gov/download/wyden-chaffetz-gps-amendment-text)
Privacy-utility tradeoff
Example: What is my nearest gas station?
[Figure: Utility (0% to 100%) plotted against Privacy (0% to 100%)]
Why is Location Privacy different?
Database Privacy
• Each individual’s record must be kept secret.
• Queries are not private
• Data is usually static
• Privacy is common across all individuals
Location Privacy
• Individual’s current and future locations (and other inferences) must be secret.
• Queries (location) themselves are private!
• Must tolerate updates to locations.
• Privacy is personalized for different individuals
Location Perturbation
• The user location is represented with a wrong value
• Privacy is achieved because the reported location is false
• The accuracy and the amount of privacy mainly depend on how far the reported location is from the exact location
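A minimal sketch of perturbation, assuming planar (x, y) coordinates. The function name `perturb_location` and the parameter `max_offset` are illustrative, not from the slides. A larger `max_offset` gives more privacy and less accuracy.

```python
import math
import random

def perturb_location(x, y, max_offset, rng=random):
    """Return a false location at most max_offset away from the true (x, y)."""
    # Pick a random direction and a random distance up to max_offset.
    angle = rng.uniform(0.0, 2.0 * math.pi)
    distance = rng.uniform(0.0, max_offset)
    return x + distance * math.cos(angle), y + distance * math.sin(angle)

true_location = (10.0, 20.0)
reported = perturb_location(*true_location, max_offset=2.0)
# The error the service has to live with is the distance between the two points.
error = math.dist(true_location, reported)
```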
Spatial Cloaking
• The user’s exact location is represented as a region that includes the exact user location
• An adversary knows that the user is located in the cloaked region, but has no clue where exactly in the region the user is
• The area of the cloaked region trades off user privacy against quality of service
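A minimal sketch of spatial cloaking, assuming planar coordinates and an axis-aligned rectangular region. The name `cloak` and the `width`/`height` parameters are illustrative. The box is placed at a random offset so that the user is not always at its center.

```python
import random

def cloak(x, y, width, height, rng=random):
    """Return (xmin, ymin, xmax, ymax): a width-by-height box containing (x, y)."""
    # A random offset keeps the true location from sitting at a predictable spot.
    xmin = x - rng.uniform(0.0, width)
    ymin = y - rng.uniform(0.0, height)
    return xmin, ymin, xmin + width, ymin + height

region = cloak(3.0, 4.0, width=2.0, height=1.0)
```

A larger `width` and `height` give more privacy, at the cost of a less precise answer from the service.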
Spatio-temporal cloaking
• In addition to spatial cloaking, the user information can be delayed for a while to cloak the temporal dimension
• Temporal cloaking can tolerate queries about stationary objects (e.g., gas stations)
• It is challenging to support queries about moving objects, e.g., “where is my nearest friend?”
[Figure: cloaked region drawn on X, Y, and T axes]
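A minimal sketch of the temporal part, assuming numeric timestamps (for example seconds) and a fixed window length. The names `temporal_cloak`, `cloaked_report`, and `release_at` are illustrative. The report carries a time window instead of an instant and is held back until that window has closed.

```python
def temporal_cloak(t, window):
    """Return the [start, end) time window of length `window` that contains t."""
    start = (t // window) * window
    return start, start + window

def cloaked_report(region, t, window):
    """Pair an already-cloaked spatial region with a cloaked time window."""
    t_start, t_end = temporal_cloak(t, window)
    # Delaying the release until t_end hides when in the window the user was there.
    return {"region": region, "time": (t_start, t_end), "release_at": t_end}

report = cloaked_report(region=(10, 5, 15, 10), t=125, window=60)
```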
Data Dependent Cloaking
• If you know the locations of other individuals, you can use a single coarse region to represent all of them.
[Figure panels: Naïve cloaking; MBR (minimum bounding rectangle) cloaking]
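A minimal sketch of MBR cloaking, assuming planar (x, y) points. The function name `mbr` is illustrative. The region is the smallest axis-aligned rectangle that contains every user in the group.

```python
def mbr(points):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a group of users."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)

group = [(1, 5), (3, 2), (2, 4)]
region = mbr(group)
```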
Space Dependent Cloaking
[Figure panels: Adaptive grid cloaking; Fixed grid cloaking]
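A minimal sketch of the fixed grid variant only, assuming square cells of side `cell_size` anchored at the origin (both assumptions, not stated on the slide). The cloaked region depends only on where the user is in space, not on where other users are.

```python
import math

def fixed_grid_cloak(x, y, cell_size):
    """Return the grid cell (xmin, ymin, xmax, ymax) that contains (x, y)."""
    col = math.floor(x / cell_size)
    row = math.floor(y / cell_size)
    return (col * cell_size, row * cell_size,
            (col + 1) * cell_size, (row + 1) * cell_size)

region = fixed_grid_cloak(12.5, 7.0, cell_size=5.0)
```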
K-anonymity
• The cloaked region contains at least k users
• The user is indistinguishable among the other k-1 users in the region
• The cloaked area largely depends on the surrounding environment.
• A value of k = 100 may result in a very small area if the user is located in a stadium, or in a very large area if the user is in the desert.
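A minimal sketch of the k-anonymity condition, with made-up integer coordinates standing in for a dense "stadium" and a sparse "desert". The function names and the data are illustrative only.

```python
def users_in_region(region, users):
    xmin, ymin, xmax, ymax = region
    return [u for u in users if xmin <= u[0] <= xmax and ymin <= u[1] <= ymax]

def is_k_anonymous(region, users, k):
    """A cloaked region is k-anonymous if it contains at least k users."""
    return len(users_in_region(region, users)) >= k

# 100 users packed closely together versus 100 users spread far apart.
stadium = [(i, j) for i in range(10) for j in range(10)]
desert = [(i * 1000, j * 1000) for i in range(10) for j in range(10)]

small = (0, 0, 9, 9)
large = (0, 0, 9000, 9000)
stadium_ok = is_k_anonymous(small, stadium, k=100)   # dense: small area suffices
desert_small = is_k_anonymous(small, desert, k=100)  # sparse: small area fails
desert_large = is_k_anonymous(large, desert, k=100)  # sparse: needs a huge area
```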
Queries in Location services
• Private Queries over Public Data
– What is my nearest gas station?
– The user location is private while the objects of interest are public
• Public Queries over Private Data
– How many cars are in the downtown area?
– The query location is public while the objects of interest are private
• Private Queries over Private Data
– Where is my nearest friend?
– Both the query location and the objects of interest are private
Modes of Privacy
• User Location Privacy
– Users want to hide their location information and their query information
• User Query Privacy
– Users do not mind revealing, or are obligated to reveal, their locations; however, they want to hide their queries
• Trajectory Privacy
– Users do not mind revealing a few locations; however, they want to avoid having these locations linked together to form a trajectory
Solution Architectures for Location Privacy
• Client-Server architecture
– Users communicate directly with the server to do the anonymization process, possibly employing an offline phase with a trusted entity
• Trusted third party architecture
– A centralized trusted entity is responsible for gathering information and providing the required privacy for each user
• Peer-to-Peer cooperative architecture
– Users collaborate with each other, without the involvement of a centralized entity, to provide customized privacy for each user
Client-Server
[Diagram: the client sends Query + Perturbed Location to the Location Based Service, which returns an Answer]
Client-Server
• Clients try to cheat the server using either fake locations or fake space
• Simple to implement, easy to integrate with existing technologies
• Lower quality of service
• Examples: Landmark objects, false dummies
Client-Server Solution 1: Landmarks
• Instead of reporting the exact location, report the location of the closest landmark
• The query answer will be based on the landmark
• Voronoi diagrams can be used to efficiently identify the closest landmark
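A minimal sketch of landmark reporting, assuming planar coordinates. It finds the closest landmark by a brute-force scan, which gives the same result as looking up the user's Voronoi cell but without the speedup. The name `nearest_landmark` and the sample landmarks are illustrative.

```python
import math

def nearest_landmark(location, landmarks):
    """Return the landmark closest to the user's exact location."""
    return min(landmarks, key=lambda lm: math.dist(location, lm))

landmarks = [(0, 0), (10, 0), (0, 10)]
exact = (7, 1)
# Only the landmark is sent to the server, never the exact location.
reported = nearest_landmark(exact, landmarks)
```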
Client-Server Solution 2: False Dummies
• A user sends m locations; only one of them is true while m-1 are false dummies
• The server replies with a service for each received location
• The user is the only one who knows the true location, and hence the true answer
• Generating false dummies is hard: they should follow a pattern similar to a user pattern but with different locations
[Diagram: the Server returns a separate answer for each received location]
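A minimal sketch of the message flow, assuming the dummies are already given (generating realistic ones is the hard part and is not attempted here). `build_request` and `server_answer` are hypothetical names. The stand-in server answers a nearest-gas-station query for each location it receives.

```python
import math
import random

def build_request(true_location, dummies, rng=random):
    """Mix the true location among the dummies; only the client keeps the index."""
    locations = list(dummies) + [true_location]
    rng.shuffle(locations)
    return locations, locations.index(true_location)

def server_answer(location, stations):
    # Hypothetical service: nearest station to whatever location was received.
    return min(stations, key=lambda s: math.dist(location, s))

stations = [(0, 0), (6, 6), (10, 10)]
locations, true_index = build_request((1, 1), dummies=[(5, 5), (9, 9)])
answers = [server_answer(loc, stations) for loc in locations]  # one per location
true_answer = answers[true_index]  # the client discards the other m-1 answers
```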
Trusted Third Party
[Diagram: the Location Anonymizer sits between the clients and the Location Based Service and forwards Query + Cloaked Spatial location]
Trusted Third Party
• A trusted third party receives the exact locations from clients, blurs the locations, and sends the blurred locations to the server
• Provides powerful privacy guarantees with high-quality services
• Need to trust a third party …
Mix Zones
• A strategy for anonymization for continuous location tracking
• Server only sees locations and user’s pseudonyms
• Mix zone is like a “no track zone” + “change of pseudonyms”
[Figure: Mix Zone, with pseudonym labels User1234, User1235, User5768, User5678]
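A minimal sketch of the mix zone idea, assuming a single rectangular zone and a trusted component that rewrites reports before the server sees them. The class `MixZoneTracker` and its methods are hypothetical. Nothing is forwarded while a user is inside the zone, and the user leaves the zone under a fresh pseudonym.

```python
import secrets

class MixZoneTracker:
    def __init__(self, zone):
        self.zone = zone        # (xmin, ymin, xmax, ymax)
        self.pseudonym = {}     # real user id -> current pseudonym
        self.inside = set()     # users currently in the mix zone

    def _in_zone(self, x, y):
        xmin, ymin, xmax, ymax = self.zone
        return xmin <= x <= xmax and ymin <= y <= ymax

    def report(self, user, x, y):
        """Return (pseudonym, x, y) for the server, or None inside the zone."""
        if self._in_zone(x, y):
            self.inside.add(user)       # "no track zone": report is suppressed
            return None
        if user in self.inside or user not in self.pseudonym:
            # "change of pseudonyms": a new random name on leaving the zone
            self.pseudonym[user] = "User" + secrets.token_hex(8)
            self.inside.discard(user)
        return self.pseudonym[user], x, y

tracker = MixZoneTracker(zone=(0, 0, 10, 10))
before = tracker.report("alice", -5, 5)
hidden = tracker.report("alice", 5, 5)
after = tracker.report("alice", 15, 5)
```

With several users in the zone at once, the server cannot tell which outgoing pseudonym belongs to which incoming one.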
Quad-tree Spatial Cloaking
• Achieve k-anonymity, i.e., a user is indistinguishable from k-1 other users
• Recursively divide the space into quadrants until a quadrant has fewer than k users.
• The previous quadrant, which still meets the k-anonymity constraint, is returned
Achieve 5-anonymity for
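A minimal sketch of the recursive subdivision, assuming planar coordinates, half-open quadrants, and a querying user who is in `users` and inside `region`. The function name, the `max_depth` guard, and the sample data are illustrative.

```python
def quadtree_cloak(user, users, region, k, max_depth=16):
    """Smallest quadrant containing `user` that still holds at least k users."""
    xmin, ymin, xmax, ymax = region
    # Quadrants are half-open: [xmin, xmax) x [ymin, ymax).
    inside = [u for u in users if xmin <= u[0] < xmax and ymin <= u[1] < ymax]
    if len(inside) < k:
        return None                      # this quadrant is too sparse
    if max_depth == 0:
        return region                    # guard against endless subdivision
    xmid, ymid = (xmin + xmax) / 2, (ymin + ymax) / 2
    # Descend into the child quadrant that contains the user.
    child = (xmin if user[0] < xmid else xmid,
             ymin if user[1] < ymid else ymid,
             xmid if user[0] < xmid else xmax,
             ymid if user[1] < ymid else ymax)
    smaller = quadtree_cloak(user, inside, child, k, max_depth - 1)
    # If the child has fewer than k users, fall back to the previous quadrant.
    return smaller if smaller is not None else region

users = [(1, 1), (2, 2), (3, 1), (1, 3), (3, 3), (6, 6), (7, 1)]
region = quadtree_cloak((1, 1), users, (0, 0, 8, 8), k=5)
```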
Nearest Neighbor k-Anonymization
• STEP 1: Determine a set S containing u and u’s k - 1 nearest neighbors.
• STEP 2: Randomly select v from S.
• STEP 3: Determine a set S’ containing v and v’s k - 1 nearest neighbors.
• STEP 4: The cloaked spatial region is the MBR of all users in S’ and u.
• Need to pick a random node first. Otherwise, an adversary can reconstruct u’s location (by picking the centroid of the spatial region)
[Figure labels: S; S’]
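The four steps can be sketched as follows, assuming planar coordinates, Euclidean distance, and that u is one of the points in `users`. The helper names `k_nearest` and `nn_cloak` are illustrative.

```python
import math
import random

def k_nearest(center, users, k):
    """The k users closest to `center` (center itself included if it is a user)."""
    return sorted(users, key=lambda p: math.dist(center, p))[:k]

def nn_cloak(u, users, k, rng=random):
    s = k_nearest(u, users, k)           # Step 1: u and its k-1 nearest neighbors
    v = rng.choice(s)                    # Step 2: random member of S
    s_prime = k_nearest(v, users, k)     # Step 3: v and its k-1 nearest neighbors
    members = s_prime + [u]              # Step 4: MBR of S' together with u
    xs = [p[0] for p in members]
    ys = [p[1] for p in members]
    return min(xs), min(ys), max(xs), max(ys)

users = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6), (10, 10)]
region = nn_cloak((5, 5), users, k=3)
```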
Pyramid Anonymization
• Divide region into grids at different resolutions
• Each grid cell maintains the number of users in that cell
• To anonymize a user request, we traverse the pyramid structure from the bottom level to the top level until a cell satisfying the user privacy profile is found.
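A minimal sketch of the pyramid, assuming a square space [0, size) x [0, size), level h being a 2^h by 2^h grid, and a privacy profile that consists only of k (a real profile may carry more, such as a minimum area). The function names are illustrative.

```python
from collections import Counter

def build_pyramid(users, size, height):
    """pyramid[h][(col, row)] = number of users in that cell at level h."""
    pyramid = []
    for h in range(height + 1):
        cell = size / 2 ** h
        pyramid.append(Counter((int(x // cell), int(y // cell)) for x, y in users))
    return pyramid

def pyramid_cloak(user, pyramid, size, k):
    """Walk from the finest level up until the user's cell holds at least k users."""
    for h in range(len(pyramid) - 1, -1, -1):
        cell = size / 2 ** h
        col, row = int(user[0] // cell), int(user[1] // cell)
        if pyramid[h][(col, row)] >= k:
            return (col * cell, row * cell, (col + 1) * cell, (row + 1) * cell)
    return None  # even the whole space has fewer than k users

users = [(1, 1), (2, 2), (3, 1), (1, 3), (3, 3), (6, 6), (7, 1)]
pyramid = build_pyramid(users, size=8, height=3)
region = pyramid_cloak((1, 1), pyramid, size=8, k=5)
```

Because only per-cell counts are stored, anonymizing a request needs no scan over individual user locations.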
Outline
• Location based services
• Location Privacy Challenges
• Algorithms for Location Privacy
– Concepts
– Solutions
• Answering Queries over Anonymized Data
• Open Questions
Range Queries
• Q1: “Find all gas stations within 5 miles from my location”
– Query is private, but results are public
• But “my location” is a cloaked region and not a point
• Extend the cloaked region by 5 miles in each direction.
• Database returns all gas stations in the larger region.
• Client filters out “extra” gas stations
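A minimal sketch of this flow, assuming planar coordinates in miles and rectangular regions. The function names and sample data are illustrative. The server only ever sees the expanded region; the exact location is used on the client alone.

```python
import math

def expand(region, r):
    """Grow a cloaked region by r in every direction."""
    xmin, ymin, xmax, ymax = region
    return xmin - r, ymin - r, xmax + r, ymax + r

def server_range_query(region, stations):
    xmin, ymin, xmax, ymax = region
    return [s for s in stations if xmin <= s[0] <= xmax and ymin <= s[1] <= ymax]

def client_filter(true_location, candidates, r):
    """Drop the 'extra' stations that are not within r of the real location."""
    return [s for s in candidates if math.dist(true_location, s) <= r]

stations = [(12, 5), (-4, -4), (20, 20), (3, 3)]
cloaked = (0, 0, 10, 10)
candidates = server_range_query(expand(cloaked, 5), stations)
answer = client_filter((2, 2), candidates, 5)
```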
Range Queries
• Q1: “Find all gas stations within 5 miles from my location”
• Three ways to report the answer:
1. All possible answers
2. Probabilistic Answers
3. Answers per area
[Figure: candidate answers annotated with probabilities 0.4, 0.25, 0.4, 0.05, 0.1]
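A minimal sketch of option 2, assuming the user is equally likely to be anywhere in the rectangular cloaked region (an assumption, not stated on the slide). The probability that a station qualifies is estimated by sampling; `within_range_probability` is an illustrative name.

```python
import math
import random

def within_range_probability(station, region, r, samples=10_000, rng=random):
    """Estimate P(station is within r of the user) for a user uniform in region."""
    xmin, ymin, xmax, ymax = region
    hits = 0
    for _ in range(samples):
        p = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        if math.dist(p, station) <= r:
            hits += 1
    return hits / samples

cloaked = (0, 0, 2, 2)
certain = within_range_probability((1, 1), cloaked, r=5)      # always in range
never = within_range_probability((100, 100), cloaked, r=5)    # never in range
partial = within_range_probability((6, 1), cloaked, r=5)      # sometimes in range
```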
Range Queries
• Q2: Find all cars/people within a certain area
– Query is public, but results are private
• Objects of interest are represented as cloaked spatial regions in which the objects of interest can be anywhere
• Any cloaked region that overlaps with the query region is a candidate answer
• Can also answer with probabilities (A, 0.1), (B, 0.2), (C, 1.0), (D, 0.25)
[Figure: cloaked regions A, B, C, D overlapping the query region]
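A minimal sketch of how such probabilities could be computed, assuming rectangular regions and an object that is equally likely to be anywhere in its cloaked region (an assumption). The probability is then the fraction of the cloaked region that falls inside the query region. The coordinates below are made up and are not the ones in the slide's figure.

```python
def overlap_probability(cloaked, query):
    """Fraction of the cloaked region's area that lies inside the query region."""
    ax0, ay0, ax1, ay1 = cloaked
    bx0, by0, bx1, by1 = query
    w = min(ax1, bx1) - max(ax0, bx0)
    h = min(ay1, by1) - max(ay0, by0)
    if w <= 0 or h <= 0:
        return 0.0                       # no overlap: not a candidate
    return (w * h) / ((ax1 - ax0) * (ay1 - ay0))

query = (0, 0, 10, 10)
objects = {"A": (9, 0, 19, 1), "C": (2, 2, 4, 4), "E": (20, 20, 22, 22)}
answer = {name: overlap_probability(region, query)
          for name, region in objects.items()}
candidates = [name for name, p in answer.items() if p > 0]
```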
Radius Queries
• Q3: “How many friends are there in a 5 mile radius?”
– Query is private, objects are also private
• Use a combination of the previous 2 techniques
Nearest Neighbor Queries
• Q1: “Find the gas stations nearest to my location”
– Query is private, but results are public
• Step 1: Identify a set of candidate answers
• Step 2: Return all candidate answers, or determine the probability of answers, or return answers in terms of areas
[Figure: cloaked region with corner vertices v1, v2, v3, v4]
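One way to get a safe candidate set for Step 1 is a standard distance bound, sketched here; it is not necessarily the construction on the slide. A station can be the nearest neighbor of some point in the cloaked rectangle only if its minimum distance to the rectangle is no larger than the smallest maximum distance of any station. The function names are illustrative.

```python
import math

def min_dist(p, region):
    """Distance from p to the closest point of the rectangle (0 if p is inside)."""
    xmin, ymin, xmax, ymax = region
    dx = max(xmin - p[0], 0, p[0] - xmax)
    dy = max(ymin - p[1], 0, p[1] - ymax)
    return math.hypot(dx, dy)

def max_dist(p, region):
    """Distance from p to the farthest corner of the rectangle."""
    xmin, ymin, xmax, ymax = region
    dx = max(abs(p[0] - xmin), abs(p[0] - xmax))
    dy = max(abs(p[1] - ymin), abs(p[1] - ymax))
    return math.hypot(dx, dy)

def nn_candidates(region, stations):
    # Wherever the user is, some station is within `bound` of them.
    bound = min(max_dist(s, region) for s in stations)
    # Stations farther than `bound` from the whole region can never be nearest.
    return [s for s in stations if min_dist(s, region) <= bound]

stations = [(1, 1), (3, 1), (10, 10)]
candidates = nn_candidates((0, 0, 2, 2), stations)
```

The result is a superset of the true answers, so the client can still pick its exact nearest station from it.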
Privacy Guarantees
• Most existing algorithms provide k-anonymity type guarantees
• However, this does not provide privacy against:
– Homogeneity attacks: hundreds of people may be at a race track, but one can still learn that an individual was at the race track.
– Background knowledge attacks, where the adversary knows something about individuals.
– Minimality attacks, where the adversary knows how the algorithm anonymizes the data
Differential Privacy
• Differential privacy tolerates the aforementioned attacks
• Can work effectively in the trusted third party model
• Montreal Traffic, Trajectory Anonymization
• … but there are no good solutions for the typical location based services problem. No known techniques to personalize differential privacy.
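To illustrate the trusted third party setting only, here is a minimal sketch of the standard Laplace mechanism applied to a public count query over private locations ("how many cars in this area?"). It is a textbook building block, not the method of the cited papers. The names `laplace_noise` and `private_count` and the sample data are illustrative.

```python
import random

def laplace_noise(scale, rng=random):
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(locations, region, epsilon, rng=random):
    """Noisy number of users in `region`, released by the trusted party."""
    xmin, ymin, xmax, ymax = region
    true_count = sum(1 for x, y in locations
                     if xmin <= x <= xmax and ymin <= y <= ymax)
    # One user changes the count by at most 1, so the noise scale is 1/epsilon.
    return true_count + laplace_noise(1 / epsilon, rng)

cars = [(1, 1), (2, 2), (50, 50)]
noisy = private_count(cars, region=(0, 0, 10, 10), epsilon=0.5)
```

A smaller `epsilon` means more noise and more privacy; the same value applies to every user.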
Utility
• Cloaking techniques can provide good utility. But, if you need to cloak trajectories, rather than locations, utility can degrade.
• Not much adoption of privacy technology due to this issue.
Summary
• Our locations are being tracked in a number of ways
– Search queries
– Location based services
– GPS
– …
• Defining privacy is tricky.
– Data is not static. Location keeps changing.
– Must be personalized …
• There are a number of solutions, but they have privacy/utility problems and hence see little adoption in real systems.
References
M. Mokbel, “Privacy Preserving Location Services”, Tutorial, ICDM 2008, http://www-users.cs.umn.edu/~mokbel/tutorials/icdm08.pptx (see references in the tutorial for more pointers)
R. Chen, B. C. Fung, B. Desai, N. Sossou, “Differentially Private Transit Data Publication: A Case Study on the Montreal Transportation System”, KDD 2012
V. Rastogi, S. Nath, “Differentially private aggregation of distributed time-series with transformation and encryption”, SIGMOD 2010