Physical Access Control for Captured RFID Data



© 2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

For more information, please see www.ieee.org/web/publications/rights/index.html.

MOBILE AND UBIQUITOUS SYSTEMS www.computer.org/pervasive

Travis Kriplean, Evan Welbourne, Nodira Khoussainova, Vibhor Rastogi, Magdalena Balazinska, Gaetano Borriello, Tadayoshi Kohno, and Dan Suciu

Vol. 6, No. 4 October–December 2007

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.


48 PERVASIVEcomputing Published by the IEEE Computer Society ■ 1536-1268/07/$25.00 © 2007 IEEE

Physical Access Control for Captured RFID Data

Radio frequency identification technology has become popular as an effective, low-cost solution for tagging and wireless identification. Although early RFID deployments focused primarily on industrial settings, successes have led to a boom in more personal, pervasive applications such as reminders1 and eldercare.2 RFID promises to enhance many everyday activities but also raises great challenges, in particular with respect to security and privacy.

At the University of Washington, we’ve deployed the RFID Ecosystem, a pervasive computing system based on a building-wide RFID infrastructure with 80 RFID readers, 300 antennas, tens of tagged people, and thousands of tagged objects.3 The RFID Ecosystem is a capture-and-access system that streams all data from the readers into a central database, where applications can access it. Our goal is to provide a laboratory for long-term research in security and privacy, as well as applications, data management, and systems issues for RFID-based, community-oriented pervasive computing.

RFID security is a vibrant research area, with many protection mechanisms against unauthorized RFID cloning and reading attacks emerging.4 However, little work has yet addressed the complementary issue of protecting the privacy of RFID data after an authorized system has captured and stored it. We’ve investigated peer-to-peer privacy for personal RFID data through an access-control policy called Physical Access Control. PAC protects privacy by constraining the data a user can obtain from the system to those events that occurred when and where that user was physically present. While strictly limiting information disclosure, PAC also affords a database view that augments users’ memory of places, objects, and people. PAC is appropriate as a default level of access control because it models the physical boundaries in everyday life. Here, we focus on the privacy, utility, and security issues raised by its implementation in the RFID Ecosystem.

Privacy and utility in pervasive architectures

The 18th-century legal philosopher Jeremy Bentham first described the perfect architecture for surveillance: the panopticon, a prison that arranges its cells about a central tower from which a guard can monitor every cell while remaining invisible to the inmates. The architecture’s innovation is that the guard’s presence becomes unnecessary except for occasional public demonstrations of power. Many privacy concerns in pervasive computing stem from a similar potential for an unseen observer to access and act on data about someone else. Under these conditions, the “state of conscious and permanent visibility [assures] the automatic functioning of power”5 because individuals must constantly conform to the code of conduct their peers or superiors hold them to.

To protect the privacy of RFID data after an authorized system captures it, this policy-based approach constrains the data users can access to system events that occurred when and where they were physically present.

S E C U R I T Y & P R I V A C Y

Travis Kriplean, Evan Welbourne, Nodira Khoussainova, Vibhor Rastogi, Magdalena Balazinska, Gaetano Borriello, Tadayoshi Kohno, and Dan Suciu
University of Washington

Just as surveillance can be built into an architecture, so can privacy assurances. Our fundamental conviction regarding privacy in the RFID Ecosystem is that privacy must be designed into the system from the ground up. The challenge in architecting privacy into a pervasive-sensing system is to provide enough utility to support the desired applications suite while carefully controlling what information to disclose, to whom, how, and under what conditions. Effective decisions must holistically trade off privacy and utility.6 In particular, a proposed privacy mechanism must consider perspectives and methods from computer security, databases, human-computer interaction, and the social sciences. Only in this way can we understand the mechanism’s security vulnerabilities, how well it matches users’ expectations of privacy, how easy it is to understand, and its utility.

Most pervasive-sensing systems represent one of two architectural models: wearable or infrastructure. Generally speaking, each model makes different trade-offs between privacy and utility. The wearable model processes and stores sensor data on devices that the user owns and wears. MyLifeBits embodies this model: users wear microphones, video cameras, and other sensors that continually record sensor data.7 Such systems can put the device wearer in control if they store the data locally and disclose no information without the user’s explicit consent. These “perfect memory” systems, however, pose privacy concerns for others who encounter the user but don’t consent to information capture.8 Plausible deniability is lost: although human memory is lossy, captured sensor data isn’t.

In contrast, the infrastructure model has a central authority that manages sensor data on users’ behalf. This model gives rise to the threat of permanent visibility, but we adopted it for the RFID Ecosystem because it enables much richer services through data and resource sharing. It’s also less expensive because cost is amortized over many users. Moreover, a central database allows a system to leverage database security and privacy techniques such as privacy-preserving data mining9 and k-anonymity.10 These techniques enable privacy-preserving statistical queries that complement and extend the utility of careful access control for point queries, such as “When did I see Bob today?” A principled, privacy-sensitive framework for managing data should employ both statistical and point queries, as in a Hippocratic database,11 for example. However, our focus here is on point queries under one possible access-control policy.

PAC: How it works

We could define many access-control policies for captured RFID data. For example, policies might allow a manager to track employee location during work hours, support staff to locate inventory objects, and individual users to grant conditional tracking permissions to their friends. However, these policies present problems when applied as a default. The managerial example raises surveillance concerns, the object finder can be abused to track people, and user-defined policies often lack foresight and vigilance.12 On the other hand, a restrictive policy that lets users access only their own data precludes many interesting applications that might be safe.

Marc Langheinrich argues that proximity-based disclosure could limit the surveillance threat.13 PAC implements this proposal. It attempts to realize the spatial privacy features of wearable systems within an infrastructure architectural model. We propose it as a default access-control policy because it models spatial privacy in everyday life. It places upper and lower bounds on accessible information, restricting the information users can obtain to what they could have observed in person. Specifically, a user can “see” other users and her own tagged objects at any time and place when she was physically present, but can’t see the other users’ objects. For each user, the database stores a persistent record of all encountered persons and personal objects. PAC thus respects Yitao Duan and John Canny’s data discretion principle,14 which states that users should have access to media recorded when they were physically present and shouldn’t when they weren’t present. We believe that PAC provides an intuitive, easily understandable flow of captured information.12

Although PAC is conservative in the information it reveals, its memory-like view of the data provides useful service primitives. For example, a user’s query on the location of his lost object will return the location where he last saw it, a likely answer. Queries over a user’s entire history of activities could enable other personal-memory applications, for example, a reminder service that alerts users when they forget to take an item with them as they go home for the day.1 This balance between privacy and utility makes PAC a suitable default access-control policy for a pervasive RFID deployment. Moreover, privacy-preserving extensions and relaxations of PAC could enable an even greater range of applications. As such, rather than plunge a community into an information-promiscuous environment, our strategy is to start with information disclosure commensurate with everyday life and carefully extend it as needed for useful applications.

Implementation

RFID systems collect data as a stream of triples having the form (Identity, Location, Time). PAC defines a database view consisting of only those triples representing a user’s own location and the locations of users and objects he or she could plausibly have seen. A database respecting the PAC policy always responds to a user’s queries through her view, rather than the complete data; in other words, PAC employs a Truman model.15 For example, if a user asks, “How many people were on the fourth floor yesterday?” the system effectively responds, “You saw five people on the fourth floor,” instead of “You saw five out of the 11 total people on the fourth floor.”
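The Truman-model view can be sketched in a few lines. The sketch below is illustrative only: the `Sighting` record, the `was_present`, `owns`, and `is_user_tag` helpers, and all example names are hypothetical stand-ins for the RFID Ecosystem's actual mutual-visibility and ownership machinery.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sighting:
    identity: str   # tag identity (a user or an object)
    location: str   # antenna/room where the tag was read
    time: int       # time step of the read

def pac_view(user, sightings, was_present, owns, is_user_tag):
    """Return only the triples the PAC policy lets `user` see:
    other users, plus the user's own objects, at places and times
    where the user was physically present."""
    visible = []
    for s in sightings:
        if not was_present(user, s.location, s.time):
            continue  # the user wasn't there, so the triple stays hidden
        if is_user_tag(s.identity) or owns(user, s.identity):
            visible.append(s)  # other users, or the user's own objects
    return visible

# Hypothetical example data:
sightings = [
    Sighting("bob", "room-401", 1),
    Sighting("bob-backpack", "room-401", 1),  # Bob's object: hidden from Alice
    Sighting("alice-keys", "room-401", 1),    # Alice's own object: visible
    Sighting("carol", "room-212", 1),         # Alice wasn't there: hidden
]
alice_view = pac_view(
    "alice", sightings,
    was_present=lambda u, loc, t: (u, loc, t) == ("alice", "room-401", 1),
    owns=lambda u, ident: (u, ident) == ("alice", "alice-keys"),
    is_user_tag=lambda ident: ident in {"alice", "bob", "carol"},
)
```

Queries are then answered against `alice_view` only, never against the full `sightings` stream, which is what makes the model "Truman": the user cannot tell what was filtered out.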

A PAC implementation thus requires a procedure for inferring when a user could plausibly have seen another user or object. Our implementation relies on a notion of mutual visibility for this procedure. Two users or a user and an object are mutually visible if they share an unobstructed line of sight. Every such instance of mutual visibility is called a visibility event.

Consider the scenario in figure 1. It presents a snapshot of six users, A–F, going about their daily routines and a table enumerating all visibility events.

This definition of mutual visibility is an ideal that the system must approximate using captured sensor data. In the RFID Ecosystem, the mutual-visibility computation incorporates the spatiotemporal relationships between the RFID tag reads that antennas collect. The RFID Ecosystem’s formal definition of mutual visibility depends on these relations between tag reads:

• Spatial. Determining the unobstructed line of sight between two tag reads poses two challenges. First, a sighted tag’s exact location is unknown; instead, the antenna’s location serves as a proxy for the tag’s location. Second, two tags might be mutually visible yet read by two different antennas, which motivates the definition of mutually visible antennas: pairings of antennas A1 and A2 such that the system can interpret a tag read at A1 as mutually visible with a tag read at A2.

• Temporal. By protocol, each antenna reads tags rapidly and in sequence, so two tags are rarely read at exactly the same time. We therefore use a parameterizable time window, δ, that defines how close in time two tag reads must occur for the tags to be considered mutually visible.

We can now express a formal definition of mutual visibility in terms of the data captured by the system: Two tags X and Y are mutually visible if X is read by antenna A1 at time tX and Y is read by antenna A2 at time tY, such that |tX − tY| ≤ δ and A1 and A2 are mutually visible.
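As a sketch, this formal definition translates directly into code. Everything below is illustrative, not the authors' implementation: real reads arrive as a stream rather than a list, and the set of mutually visible antenna pairs comes from the measurement procedure described later in the article.

```python
from itertools import combinations

def visibility_events(reads, visible_pairs, delta):
    """Detect visibility events from captured tag reads.

    reads: list of (tag, antenna, time) tuples.
    visible_pairs: set of frozensets {A1, A2} of mutually visible
        antennas; an antenna is always mutually visible with itself.
    delta: time window -- reads within `delta` time units can pair up.

    Returns a set of (tagX, tagY, tX, tY) events, tags ordered.
    """
    events = set()
    for (x, a1, tx), (y, a2, ty) in combinations(reads, 2):
        if x == y:
            continue  # two reads of the same tag are not a visibility event
        close_in_time = abs(tx - ty) <= delta
        antennas_visible = a1 == a2 or frozenset((a1, a2)) in visible_pairs
        if close_in_time and antennas_visible:
            events.add((x, y, tx, ty) if x < y else (y, x, ty, tx))
    return events

# Toy reads echoing the figure 1 discussion: antennas 18 and 15 are
# mutually visible, antennas 11 and 12 are not.
pairs = {frozenset((15, 18))}
reads = [("A", 18, 3), ("C", 15, 3), ("E", 11, 3), ("D", 12, 3)]
events = visibility_events(reads, pairs, delta=1)
```

Here A and C produce a visibility event, while E and D do not, because their antennas are not in the mutually visible set.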

Figure 1. Six people moving over the course of 15 time steps. Each user’s location is annotated with the time stamp at which they move to the space. No time stamp indicates that they remain in that space. The table shows visibility events between users. The last column shows visibility events inferred by the system when the parameterizable time window δ is 1 and antennas 1–3, 4–11, and 12–19 are mutually visible.

This definition yields the visibility events marked in figure 1. In this scenario, antennas 18 and 15 are mutually visible, so the system will correctly interpret A and C as mutually visible at t = 3. In contrast, antennas 11 and 12 are not mutually visible, so E and C won’t be considered mutually visible. Note how varying δ tunes mutual visibility’s strictness. In figure 1, δ = 1, so the system never detects D and A as mutually visible; however, if δ = 2, then they are mutually visible during time steps 5 to 7.

Users, objects, and ownership

We distinguish between user tags and object tags. A user can see an object’s location only when the user and object are mutually visible and the user owns that object. The ownership restriction is required because RFID tags are readable through opaque materials, such as backpacks. X-ray vision is not part of the PAC information ethic. In our current model, ownership is simple: each object is singly owned. However, our goal is to study community-oriented systems, so future work will need to investigate how PAC can operate with shared objects.

Measuring mutual visibility

Given a pair of antennas Ai, Aj, we would like to label them as mutually visible or not in a way that minimizes false visibility events. Our approach has been to label each antenna pair with the probability that two tags in these coverage areas share a line of sight. The motivation is to give system administrators a way to systematically reason about potential information leakage.

One method is to sample a large number of points from each antenna’s expected coverage area and calculate the fraction of point pairs that share a line of sight. Let Ci and Cj be two sets of points uniformly drawn from Ai’s and Aj’s respective coverage areas and let visiblePoints(pa, pb) be 1 when points pa and pb share a line of sight, and 0 otherwise. Then

We then consider Ai, Aj mutually visible if visibility(Ai, Aj) ≥ τ, where τ sets the lower bound on the fraction of point pairs that must share a line of sight for two antennas to be considered mutually visible. With τ = 1.0, all points must share a line of sight, providing the highest privacy by minimizing falsely detected, mutually visible tags. However, τ = 1.0 will likely miss many actual visibility events, thus degrading utility. After studying our deployment, we labeled the antennas we thought should be mutually visible, yielding τ = 0.84.
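This measure amounts to straightforward Monte Carlo sampling. The sketch below is hypothetical: the `cov_i`/`cov_j` samplers and the `line_of_sight` test stand in for real coverage models and building geometry (the toy example assumes a wall at x = 0 blocking sight between the half-planes).

```python
import random

def visibility(cov_i, cov_j, line_of_sight, n=1000):
    """Estimate the fraction of point pairs, drawn from two antennas'
    expected coverage areas, that share a line of sight."""
    ci = [cov_i() for _ in range(n)]  # samples from Ai's coverage area
    cj = [cov_j() for _ in range(n)]  # samples from Aj's coverage area
    hits = sum(line_of_sight(pa, pb) for pa in ci for pb in cj)
    return hits / (len(ci) * len(cj))

rng = random.Random(42)
# Toy wall at x = 0: two points see each other only on the same side.
los = lambda pa, pb: (pa[0] >= 0) == (pb[0] >= 0)
v = visibility(
    cov_i=lambda: (rng.uniform(-1, 1), rng.uniform(-1, 1)),
    cov_j=lambda: (rng.uniform(0, 2), rng.uniform(-1, 1)),
    line_of_sight=los,
    n=200,
)
```

Since `cov_j` lies entirely at x ≥ 0, `v` estimates the fraction of `cov_i` samples on that side (about half), well below a threshold like τ = 0.84, so this pair would be labeled not mutually visible.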

This measure has limitations. First, visibility(Ai, Aj) is an approximation because true antenna coverage varies over time and with environmental conditions. Second, antennas can sometimes read through opaque surfaces (for example, an interior laboratory window with curtains drawn). Further techniques are necessary to accurately model antenna behavior.

PAC feasibility

Our definition of PAC assumes a well-behaved, lossless model for our RFID equipment. We therefore wanted to determine how PAC performed in a real deployment with antennas that might not behave as expected. Erroneous behavior can have adverse effects on both privacy and utility. For example, privacy violations can occur if an antenna reads a tag beyond its expected range (possibly causing false visibility events); likewise, utility can be degraded when an antenna fails to read a tag. Here, we discuss our experiment to evaluate PAC in practice.

Experimental setup and methodology

We evaluated PAC over a set of user scenarios that cover some ways visibility events could occur. For each scenario, we enacted multiple trials and collected the corresponding stream of raw tag reads that the antennas captured on each trial. For each trial, we also captured ground-truth location data. A simulator then processed the ground-truth data, producing a stream of simulated tag reads that models the case of a well-behaved, lossless RFID deployment.

Representative scenarios. A visibility event can occur in many ways. We’ve defined four scenarios that, while not exhaustive, represent common types of visibility events:

• In the personal-objects scenario, a single user walked around the halls carrying six tagged objects on various parts of the body and inside a duffel bag.

• In the glimpses scenario, one user stood at one end of a hallway while another entered the opposite end from an office and quickly walked around the corner.

• In the walking-together scenario, two users walked around the hallways together.

• In the passing-by scenario, two users passed one another while walking in opposite directions.

Data collection and ground truth. The scenarios gave a rough script for experimenters to follow. To collect ground truth accurately for each trial, experimenters used tablet PCs with a map-based data-collection tool.16 Using the tool’s stylus, experimenters could track their current location as they enacted the scenarios by moving a cross-hair to the corresponding location on a map. In this way, each trial produced an XML trace of time-stamped latitude and longitude coordinates. This trace is fed to the simulator to produce the simulated tag reads. On the basis of experience with our RFID deployment, we set the antenna coverage area as a circle with a 2-meter radius about the antenna.

(The visibility measure defined earlier is visibility(Ai, Aj) = ( Σ over pa in Ci, pb in Cj of visiblePoints(pa, pb) ) / ( |Ci| · |Cj| ).)

Comparing visibility events. For each pair of tags X and Y in a given trial, we compared the visibility events detected in the raw and simulated data. Let S and R denote the set of all visibility events for X and Y in the simulated and raw data, respectively. By definition, a visibility event v occurs during a window of time (tX, tY) such that |tX − tY| ≤ δ. We define the visibility-event time stamp of v as τv = (tY + tX)/2. A visibility event s in S occurs in R if there exists an r in R such that |τr − τs| ≤ ε. So, a visibility event in S also occurs in R when a visibility event in R has a time stamp within ε of the visibility event’s time stamp in S. In our experiments, ε = 1 second.

To measure privacy and utility, we compared the recall and precision of the visibility events detected in the raw and simulated data. Recall measures utility according to the fraction of simulated visibility events that also occurred in the raw data. A recall of 1 indicates that the raw data accurately captured all visibility events in the simulated data. A recall less than 1 indicates that the raw data missed some visibility events because of missed tag reads. In contrast, precision measures privacy according to the fraction of detected visibility events that also occurred in the simulated data. A precision less than 1 indicates privacy loss because the system detects false visibility events.
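The matching and scoring steps can be sketched as follows. The event-timestamp representation and function names here are illustrative, not the authors' code, and the sketch assumes both event sets are nonempty.

```python
def match_events(sim, raw, eps):
    """Compute (precision, recall) of detected visibility events.

    sim, raw: nonempty lists of visibility-event time stamps, i.e.,
    the midpoint (tX + tY) / 2 of each event's time window.
    An event in one set "occurs" in the other if some time stamp
    there lies within `eps` of it.
    """
    def occurs(t, others):
        return any(abs(t - u) <= eps for u in others)

    recall = sum(occurs(s, raw) for s in sim) / len(sim)     # utility
    precision = sum(occurs(r, sim) for r in raw) / len(raw)  # privacy
    return precision, recall

# Hypothetical timestamps: one simulated event (t = 5.0) was missed
# by the deployment, and one raw event (t = 20.0) is a false detection.
precision, recall = match_events(sim=[1.0, 5.0, 9.0],
                                 raw=[1.5, 9.2, 20.0],
                                 eps=1.0)
```

In this toy run both precision and recall come out to 2/3: the missed simulated event lowers recall (utility), and the spurious raw event lowers precision (privacy).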

Computing results. We performed 10 trials of each scenario. After each trial, we calculated precision and recall for the visibility events of every tag pair. We then computed the mean and standard deviations of precision and recall across all the trials for these visibility events. In the personal-objects scenario, visibility events occur between the experimenter and each of his or her objects. In this case, we give the average and standard deviation across all these pairings.

Results

Figure 2 shows the precision and recall for all four scenarios. The experimental outcome is encouraging and indicates that PAC can indeed operate effectively in a lossy environment to provide both privacy and utility.

Privacy. The high precision demonstrates that nearly all the visibility events detected by our RFID deployment also occurred in the simulated data, suggesting that little information leaked. This result also verifies the integrity of the data-collection procedure because high precision depends on correct ground-truth input.

Utility. Recall suffers when antennas fail to read tags and so miss visibility events. This can happen for various reasons, such as the properties of the material to which the tag is affixed and the tag’s orientation with respect to the antennas. The tags hung from the experimenters’ shirt or pants. This resulted in high read rates for the user tags (recall between 90 and 95 percent) in all the experiments and correspondingly high detection of visibility events between user tags (more than 80 percent).

In the personal-objects scenario, we observed lower recall because the antennas couldn’t consistently read the tags in pockets or the duffel bag. However, several algorithms and tools could ameliorate this problem by cleaning RFID data. We evaluated whether such tools could improve PAC performance by comparing the precision and recall of the raw data stream against a third set of tag reads produced by the PEEX (Probabilistic Event Extractor for RFID Data) research prototype.17 PEEX uses integrity constraints to correct raw RFID data. We ran PEEX with a few integrity constraints that capture logical, physical relations between objects and people (for example, an object can’t move by itself). Figure 2 shows that PEEX significantly improves recall in the personal-objects and walking-together scenarios (t-test with p < 0.001), without affecting precision.

Figure 2. Average precision and recall for visibility events in each scenario. Each group of experimental results represents a single visibility event between two tags, except for the first, which is the average of A’s visibility events with their objects.

Overall, our results show the practical feasibility of employing PAC. Despite the noisy, lossy, and inaccurate nature of real RFID data, PAC effectively limits information disclosure while providing good system utility. It captures user-user interactions quite well. Inherent RFID unreliabilities hamper the capture of user-object interactions, but even simple cleaning tools such as PEEX significantly improve performance in this area.

“Misplaced” user tags

Our PAC implementation assumes that users are always wearing their user tags. We must, however, anticipate users who accidentally or intentionally “misbehave.” For example, Alice might accidentally forget her user tag in Bob’s office, or she might maliciously place it in Bob’s backpack. In both scenarios, the system would incorrectly believe that Alice is in Bob’s proximity and would grant her access to his data. (Such errors aren’t possible with object tags because they can only be mutually visible with their owner.)

We’re developing several mechanisms for addressing “misplaced” user tags. We focus here on users who intentionally misbehave; mechanisms that defend against malicious parties will also account for accidental misuse. Our defensive techniques fall under the principle of security risk management. While an adversary might still be able to circumvent our security mechanisms, the cost of mounting an attack against users’ privacy should outweigh the benefits to the adversary.

Detection

The threat of detection can deter malicious activities because it might lead to social sanctions and punitive measures for the offending party. We’re exploring two classes of detection mechanisms. First, the RFID Ecosystem could automatically detect anomalies in a user tag’s movements. The system could trigger an alert if Alice’s user tag is always mutually visible with Bob or one of Bob’s objects, such as his backpack, or if Alice’s user tag has been in an unusual location for too long.

Second, because nonvisually impaired users generally know the ground truth about the people (or at least the number of people) in their immediate vicinity, we can explicitly involve users in anomaly detection. For example, Bob could detect Alice’s maliciously planted RFID user tag if he’s in an elevator alone but the elevator’s front panel says that there are two occupants.

Prevention

We’re exploring two classes of prevention mechanisms: making attacks too costly or inconvenient for adversaries and periodically verifying that the RFID user tags are in the appropriate user’s possession.

One method for increasing the cost to an adversary is to combine user tags with expensive or essential devices, such as cell phones or employee badges. A second method is to stop capturing reads for user tags that become separated from their legitimate owners. For example, we could consider a user tag “capturable” for n time units whenever the RFID Ecosystem detects that the legitimate user is actually in possession of that tag, say, by detecting when the tag enters that user’s office.
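The "capturable window" idea can be sketched with a simple possession timer. All names below are hypothetical; a real implementation would sit in the stream-processing pipeline rather than hold state in a closure.

```python
def make_capture_filter(n):
    """Sketch of the capturable-window defense: only treat reads of a
    user tag as capturable within `n` time units of the last moment
    the system verified the legitimate owner possessed it (for
    example, the tag entering the owner's office)."""
    last_verified = {}  # tag -> time of last possession check

    def verify(tag, t):
        last_verified[tag] = t  # record a possession check at time t

    def capturable(tag, t):
        # Reads of never-verified or stale tags are dropped.
        return tag in last_verified and 0 <= t - last_verified[tag] <= n

    return verify, capturable

verify, capturable = make_capture_filter(n=5)
verify("alice-tag", 10)  # e.g., the tag was seen entering Alice's office
```

After the check at time 10, a read at time 12 would be captured, but a read at time 30, long after the window expired, would not, limiting how long a planted tag keeps reporting.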

Other attack vectors

Alice could share her legitimate observations of Bob (as accessed through the PAC system) with Charlie, thereby revealing to Charlie information about Bob’s location that Charlie couldn’t have observed. Direct attacks on the RFID hardware (for example, cloning tags) are also possible, but we don’t consider them here. (Ari Juels surveys the state of the art in preventing such attacks.4)

Future work

We’re focusing our future work in three areas.

Principled PAC relaxations

We’re looking at other access-control mechanisms to implement alongside PAC to provide additional information when socially appropriate. For example, user-defined access-control rules are important for applications that rely on shared context between users, such as location-shared buddy lists.18 Because users explicitly grant permission to make their information available when they opt in to such applications, physical proximity is not a necessary access requirement.

Socially situated events offer another potential relaxation. For example, an augmented calendar system might let a user query the location of a meeting’s invitees during the scheduled meeting time. Such a relaxation would give users socially acceptable information at particular times.

A third class of relaxations could involve mediation with the system. For example, a lost object’s owner might request its location. The system could choose to reveal the object’s location to the requester, or it could send an email to the person most recently detected to have moved it. By involving the system or an administrator, this relaxation could prevent abuse by not revealing sensitive information while still supporting useful actions.

Finally, an opportunistic access-control scheme19 could let users access private information in rare circumstances such as emergencies. Administrators would log and investigate these actions to decide if they were legitimate. Determining the conditions and frequency under which to use this access mechanism is an open problem.

User studies

We still need to empirically validate our assertion that PAC is an intuitive policy that will match users’ expectations of privacy in everyday life. Moreover, a number of studies have examined user privacy expectations for captured audio and video data (for example, Giovanni Iachello and his colleagues20) and disclosure of information to others across potentially great distances (for example, Sunny Consolvo and her colleagues21), but few studies examine how physical space factors into people’s expectations of privacy for captured RFID data. Apu Kapadia and his colleagues have begun to explore the use of spatial metaphors,22 but further work is necessary. The PAC definition and implementation has helped in obtaining the Institutional Review Board “minimal risk” approval for these user studies.

Probabilistic data

We’ve demonstrated that using PEEX to clean data enhances data utility, but PEEX can also produce probabilistic data. In this case, each tuple has an associated probability that represents the system’s confidence about its validity.

Implementing PAC in a probabilistic context leads to a challenging problem. Suppose user A asks, “Is user B currently at location L?” If A is at L, then PAC allows the correct answer. If A isn’t at L, then PAC refuses to reveal B’s location. However, cleaned data would assign probabilities pA and pB to the chance that A is at L and B is at L, respectively. In the probabilistic context, the correct answer is no longer yes or no but pB. Yet the system can’t return pB when A isn’t at L.

One approach is to return pA · pB, the probability that both A and B are at L. However, this reveals too much. Even if pA is small (A is not likely to be at L), A can still compute pB. More generally, the requirement is that if A is likely to be at L (pA is large), then the system should reveal pB. Otherwise, the system should hide this information. One ad hoc strategy is to return min(pA, pB). We plan to explore more principled approaches that fulfill this requirement.
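The leak in the product strategy, and the ad hoc alternative, can be seen in a few lines (a sketch; the function name is ours):

```python
def answer_probabilistic(p_a, p_b):
    """Two candidate responses to "Is B at L?" when the data is
    probabilistic and asker A is at L only with probability p_a.

    The product p_a * p_b leaks: an asker who knows her own p_a can
    divide it out and recover p_b exactly, even when p_a is tiny.
    The ad hoc min(p_a, p_b) caps what a probably-absent asker learns.
    """
    leaky = p_a * p_b        # recoverable: p_b = leaky / p_a
    ad_hoc = min(p_a, p_b)   # reveals p_b only when p_a >= p_b
    return leaky, ad_hoc

# A is almost certainly not at L (p_a = 0.01), B probably is (p_b = 0.9).
leaky, ad_hoc = answer_probabilistic(p_a=0.01, p_b=0.9)
```

Dividing the leaky answer by p_a recovers p_b = 0.9 despite A's near-certain absence, whereas the min strategy returns only 0.01, no more than A's own presence probability.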

The RFID Ecosystem project aims to research socially appropriate RFID systems to provide the community (including businesses and policy makers) examples of effective methods for balancing utility with privacy. PAC is a first step in this direction because, as a default access-control policy, it provides an upper and lower bound on accessible information that models human experience. It also leaves open the possibility of utility-enhancing extensions. Our experiments show that PAC is a practical solution that works well in a real-world RFID deployment where sensors are unreliable.

ACKNOWLEDGMENTS

We thank Garret Cole, Patricia Lee, Robert Spies, and Jordan Walke, with special thanks to Caitlin Lustig for her work on the simulator we used. We also acknowledge the University of Washington College of Engineering. The US National Science Foundation funded this research under its Computing Research Initiative grants 0454394, IIS-0428168, and IIS-0415193.

REFERENCES1. G. Borriello et al., “Reminding ABOUT

Tagged Objects Using Passive RFIDs,”Proc. Ubiquitous Computing 6th Int’lConf. (Ubicomp 04), LNCS 3205, Springer,2004, pp. 36–53.

2. D. Patterson et al., “Fine-Grained Activity Recognition by Aggregating Abstract Object Usage,” Proc. 9th Int’l Symp. Wearable Computers (ISWC 05), IEEE CS Press, 2005, pp. 44–51.

3. E. Welbourne et al., “Challenges for Pervasive RFID-Based Infrastructures,” Proc. 5th Ann. IEEE Int’l Conf. Pervasive Computing and Communications Workshops (PerTec 07), IEEE CS Press, 2007, pp. 388–394.

4. A. Juels, “RFID Security and Privacy: A Research Survey,” IEEE J. Selected Areas in Communications, Feb. 2006, pp. 381–395.

5. M. Foucault, Discipline and Punish, Random House, 1975.

6. G. Iachello et al., “Privacy and Proportionality: Adapting Legal Evaluation Techniques to Inform Design in Ubiquitous Computing,” Proc. SIGCHI Conf. Human Factors in Computing Systems, ACM Press, 2005, pp. 91–100.

7. J. Gemmell et al., “Passive Capture and Ensuing Issues for a Personal Lifetime Store,” Proc. 1st ACM Workshop on Continuous Archival and Retrieval of Personal Experiences (CARPE 04), ACM Press, 2004, pp. 48–55.

8. S. Intille et al., New Challenges for Privacy Law: Wearable Computers that Create Electronic Digital Diaries, tech. report, MIT Dept. of Architecture House_n, Sept. 2003.

9. R. Agrawal et al., “Privacy-Preserving Data Mining,” ACM SIGMOD Record, vol. 29, no. 2, 2000, pp. 439–450.

10. L. Sweeney, “K-Anonymity: A Model for Protecting Privacy,” Int’l J. Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 10, no. 5, 2002, pp. 557–570.


11. R. Agrawal et al., “Hippocratic Databases,” Proc. 28th Int’l Conf. Very Large Databases (VLDB 02), Morgan Kaufmann, 2002, pp. 143–154.

12. S. Lederer et al., “Personal Privacy through Understanding and Action,” Personal and Ubiquitous Computing, vol. 8, no. 6, 2004, pp. 440–454.

13. M. Langheinrich, “Privacy by Design: Principles of Privacy-Aware Ubiquitous Systems,” Proc. 3rd Int’l Conf. Ubiquitous Computing (Ubicomp 01), LNCS 2201, Springer, 2001, pp. 273–291.

14. Y. Duan and J. Canny, “Protecting User Data in Ubiquitous Computing,” Privacy Enhancing Technologies, LNCS 3424, Springer, 2004, pp. 273–291.

15. S. Rizvi et al., “Extending Query Rewriting Techniques for Fine-Grained Access Control,” Proc. SIGMOD, ACM Press, 2004, pp. 551–562.

16. Y. Li et al., “Design and Experimental Analysis of Continuous Location Tracking Techniques for Wizard of Oz Testing,” Proc. SIGCHI Conf. Human Factors in Computing Systems, ACM Press, 2006, pp. 1019–1022.

17. N. Khoussainova et al., “Probabilistic RFID Data Management,” tech. report UW-CSE-07-03-01, Univ. of Washington, Computer Science and Engineering Dept., Mar. 2007.

18. J. Hong et al., “An Architecture for Privacy-Sensitive Ubiquitous Computing,” Proc. MobiSys, ACM Press, 2004, pp. 177–189.

19. D. Povey, “Optimistic Security,” Proc. 1999 Workshop on New Security Paradigms, ACM Press, 1999, pp. 40–45.

20. G. Iachello et al., “Prototyping and Sampling Experience to Evaluate Ubiquitous Computing Privacy in the Real World,” Proc. SIGCHI Conf. Human Factors in Computing Systems, ACM Press, 2006, pp. 1009–1018.

21. S. Consolvo et al., “Location Disclosure to Social Relations,” Proc. SIGCHI Conf. Human Factors in Computing Systems, ACM Press, 2005, pp. 81–90.

22. A. Kapadia et al., “Virtual Walls,” Proc. Pervasive, LNCS 4480, Springer, May 2007, pp. 162–179.



THE AUTHORS

Travis Kriplean is a doctoral student in computer science and engineering at the University of Washington. His primary research interests lie in human-computer interaction, specifically in computer-supported cooperative work. Aside from working on privacy issues for the RFID Ecosystem, he works on supporting public deliberation in the urban-planning domain through the UrbanSim project. He received his BS in computer science and sociology from the University of Wisconsin. Contact him at 101 Paul G. Allen Center, CSE Dept., Univ. of Washington, Box 352350, Seattle, WA 98195-2350; [email protected].

Evan Welbourne is a doctoral student in computer science at the University of Washington. His research interests lie at the intersection of pervasive computing, databases, and human-computer interaction. He’s the lead graduate student on the RFID Ecosystem project and has worked on a variety of pervasive computing projects at the University of Washington, Intel Research, and Microsoft Research. He received his MS in computer science from the University of Washington. Contact him at 101 Paul G. Allen Center, CSE Dept., Univ. of Washington, Box 352350, Seattle, WA 98195-2350; [email protected].

Nodira Khoussainova is a doctoral student in the University of Washington’s database group. Her research interests include the management of uncertain data from sensors such as RFID antennas. More specifically, she focuses on the areas of extracting high-level events from raw data and of data privacy. She received her bachelor’s degree in computer science from the University of Auckland in New Zealand. Contact her at 101 Paul G. Allen Center, CSE Dept., Univ. of Washington, Box 352350, Seattle, WA 98195-2350; [email protected].

Vibhor Rastogi is a doctoral student in the University of Washington’s database group. His research interests include database privacy, security, and probabilistic databases. He obtained his master’s in computer science from the University of Washington. Contact him at 101 Paul G. Allen Center, CSE Dept., Univ. of Washington, Box 352350, Seattle, WA 98195-2350; [email protected].

Magdalena Balazinska is an assistant professor of computer science and engineering at the University of Washington. Her research interests focus on data management systems for distributed data streams, stream data archives, and dirty, uncertain sensor data. She received her PhD in computer science from the Massachusetts Institute of Technology. She’s a member of the IEEE, the ACM, and ACM SIGMOD. Contact her at 101 Paul G. Allen Center, CSE Dept., Univ. of Washington, Box 352350, Seattle, WA 98195-2350; [email protected].

Gaetano Borriello is a professor of computer science and engineering at the University of Washington. He also founded Intel Research Seattle, where he launched the lab on applications of ubiquitous computing technology to healthcare and eldercare in particular. His research interests include location-based systems, sensor-based inferencing, and tagging objects. He received his PhD in computer science from the University of California, Berkeley. He’s an associate editor in chief of IEEE Pervasive Computing. Contact him at 101 Paul G. Allen Center, CSE Dept., Univ. of Washington, Box 352350, Seattle, WA 98195-2350; [email protected].

Tadayoshi Kohno is an assistant professor of computer science and engineering at the University of Washington. His research interests are computer security, privacy, and cryptography. He received his PhD in computer science from the University of California, San Diego. He is a member of the ACM, the International Association for Cryptologic Research, and the IEEE. Contact him at 101 Paul G. Allen Center, CSE Dept., Univ. of Washington, Box 352350, Seattle, WA 98195-2350; [email protected].

Dan Suciu is a professor of computer science and engineering at the University of Washington. His research is in data management, emphasizing topics arising from sharing data on the Internet. He received his PhD in computer science from the University of Pennsylvania. Contact him at 101 Paul G. Allen Center, CSE Dept., Univ. of Washington, Box 352350, Seattle, WA 98195-2350; [email protected].