3770 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 24, …20 feet; (2) active tags, which have their own power sources and have relatively longer communication range. A reader has a dedicated

3770 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 24, NO. 6, DECEMBER 2016

Fast and Reliable Detection and Identificationof Missing RFID Tags in the Wild

Muhammad Shahzad and Alex X. Liu

Abstract— Radio-frequency identification (RFID) systems havebeen deployed to detect and identify missing products by affixingthem with cheap passive RFID tags and monitoring them withRFID readers. Existing missing tag detection and identificationprotocols require the tag population to contain only those tagswhose IDs are already known to the reader. However, in reality,tag populations often contain tags with unknown IDs, calledunexpected tags. These unexpected tags cause unexpected falsepositives, i.e. , due to them, missing tags are detected as present.We take the first step toward addressing the problem of detectingand identifying missing tags from a population that containsunexpected tags. Our protocol, RUN, uses standardized frameslotted Aloha for communication between tags and readers.It executes multiple frames with different seeds to reducethe effects of unexpected false positives. At the same time, itminimizes the missing tag detection and identification time byfirst estimating the number of unexpected tags in the populationand then using it along with the false-positive probability toobtain optimal frame sizes and minimum number of times Alohaframes should be executed to mitigate the effects of false positives.RUN works with multiple readers with overlapping regions.It is easy to deploy, because it is implemented on readers asa software module and does not require any modifications totags or to the communication protocol between the tags and thereaders. We implemented RUN along with four major missingtag detection and identification protocols, namely, TRP, IIP, MTI,and SFMTI, and the fastest tag ID collection protocol TH andcompared them side by side. Our performance evaluation resultsshow that RUN is the only protocol that achieves requiredreliability in the presence of unexpected tags, whereas thebest existing protocol achieves a maximum reliability of only67%. RUN identifies 100% of missing tags in the presence ofunexpected tags, whereas the best existing protocol identifies amaximum of only 60% of missing tags.

Index Terms— RFID, missing tag detection, missing tagidentification.

I. INTRODUCTION

A. Background and Motivation

SHOPLIFTING, employee theft, and vendor fraud havebecome major causes of lost capital for retailers [8], [19].

Manuscript received May 24, 2015; revised December 12, 2015 andFebruary 20, 2016; accepted March 24, 2016; approved by IEEE/ACMTRANSACTIONS ON NETWORKING Editor X. Lin. Date of publicationAugust 12, 2016; date of current version December 15, 2016. This workwas supported in part by the National Natural Science Foundation ofChina under Grants 61472184, 61321491, and 61272546, the Jiangsu FutureInternet Program under Grant BY2013095-4-08, and the Jiangsu High-LevelInnovation and Entrepreneurship (Shuangchuang) Program. The preliminaryversion of this paper, titled “Expecting the Unexpected: Fast and ReliableDetection of Missing RFID Tags in the Wild,” was published in the pro-ceedings of IEEE INFOCOM 2015 (pp. 1939–1947). (Corresponding author:Alex X. Liu.)

M. Shahzad is with the Department of Computer Science, North CarolinaState University, Raleigh, NC 27606 USA (e-mail: [email protected]).

A. X. Liu is with the Department of Computer Science, Michigan StateUniversity, East Lansing, MI 48824 USA (e-mail: [email protected]).

Digital Object Identifier 10.1109/TNET.2016.2559539

In 2011 alone, the retailers lost an estimated 34.5 billion dol-lars due to these causes [1]. With the benefits of not requiringa line-of-sight and low cost of tags (e.g., 5 cents per tag [16]),radio frequency identification (RFID) systems have beendeployed for monitoring products by affixing them with cheappassive RFID tags and using RFID readers, which are giventhe IDs of the tags that are being monitored, to detect andidentify any missing tags. A tag is a microchip with an antennain a compact package that has limited computing power andcommunication range. There are two types of tags: (1) passivetags, which power up by harvesting the radio frequency energyfrom readers and have communication range often less than20 feet; (2) active tags, which have their own power sourcesand have relatively longer communication range. A reader hasa dedicated power source with significant computing power.It transmits queries to a set of tags and the tags respond overa shared wireless medium. In this paper, we deal with bothpassive and active RFID tags.

B. Summary and Limitations of Prior Art

There are two types of missing tag detection andidentification protocols: probabilistic [13], [20] and determin-istic [11], [12], [24]. The probabilistic protocols are fasterbut only report the event that some tags are missing, withoutpinpointing exactly which ones. The deterministic protocolsreturn IDs of all the missing tags but are comparativelyslower. Both approaches have their merits. In fact, they arecomplementary to each other, and should be used together. Forexample, a probabilistic protocol may be scheduled to executefrequently to detect if any tags are missing. Once a missingtag event is detected, a deterministic protocol may be invokedto identify the tags that are missing. Several probabilisticprotocols such as TRP [20] and EMTD [13] and deterministicprotocols such as IIP [11], MTI [24], and SFMTI [12] havebeen proposed.

There are two key limitations of existing protocols. Thefirst limitation is that all existing protocols require a perfectenvironment that contains only the expected tags whose IDsare already known to the reader. However, in reality, tagpopulations often contain unexpected tags whose IDs areunknown. Here we give three examples. For the first example,in airports where an airline company uses RFID readers tomonitor baggage of its passengers, the tags of other airline’sbaggage, which are in the vicinity of this airline’s readers, alsorespond to the queries of this airline’s readers. For the secondexample, in a large warehouse rented to multiple tenants, onetenant’s RFID readers receive responses from tags of othertenants. For the third example, in a retail store that usesRFID readers to monitor only expensive merchandize such

1063-6692 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

SHAHZAD AND LIU: FAST AND RELIABLE DETECTION AND IDENTIFICATION OF MISSING RFID TAGS IN THE WILD 3771

as jewelry, the readers will receive responses from tags ofinexpensive merchandize as well. Similar scenarios exist inother settings such as hospitals, prisons, and shopping malls.Existing protocols propose variations in standard frame slottedAloha protocol to speed up the detection and identificationprocess. However, they cannot handle the presence of unex-pected tags because unexpected tags fill up unexpected slotsin Aloha frames resulting in unexpected false positives.

The second major limitation of existing protocols is thatexcept TRP, none of these protocols is compliant with theEPCGlobal Class 1 Generation 2 (C1G2) RFID standard [6].These protocols require the manufacturers to put random bitsequences in tags that the tags use to calculate specialized hashfunctions. They also require the tags to be able to receive andinterpret “pre-vector” and/or “post-vector” frames and use theinformation in those frames to select slots in Aloha frames.Such functionalities are not provisioned in the C1G2 standardbecause tags, especially the passive ones, do not have enoughcomputational power to calculate complex hashes and processsuch pre- and post-vector frames. It is important for an RFIDprotocol to be compliant with the C1G2 standard because thecheap commercially available off-the-shelf (COTS) RFID tagsfollow the C1G2 standard. A protocol that is not compliantwith the C1G2 standard will require home brewed tags, whichwill not only cost more but will also work only in limitedsettings. For example, if an airline uses a protocol and tagsthat are non-compliant with the C1G2 standard, it will be ableto track its luggage only at its home airport but not at theairports in rest of the world, which support only the C1G2compliant tags.

C. Problem Statement and Proposed Approach

Now we formally define the missing tag detection andidentification problem. Let E represent the set of IDs of theexpected tags, i.e. , the tags that are expected to be presentin a population and need to be monitored. Let an unknownnumber of tags, m, out of these |E| tags be missing, where0 ≤ m ≤ |E|. Let Ep be the set of IDs of the remaining |E|−mtags that are actually present in the population. Let U be theset of IDs of all the unexpected tags in the population thatdo not need to be monitored. We neither know exactly whichIDs belong to sets Ep and U nor do we know their sizes,but we do know that Ep ⊆ E. Let T be a threshold on thenumber of missing tags. Our objective is to design a missingtag monitoring protocol using which a set of readers shouldquickly detect a missing tag event with a probability ≥ αwhenever the number of missing tags m is greater than orequal to the threshold T , and if required, identify the IDsof the missing tags. This probability α is called the requiredreliability and lies in the range 0 ≤ α < 1. Additionally,a tag monitoring protocol should (1) be able to estimate thenumber of missing tags if the exact IDs of the missing tagsare not required, (2) work in single as well as multiple readersenvironment, and (3) be compliant with the C1G2 standard.

For the problem of detecting and identifying missing tagsin the presence of unexpected tags, there are three seeminglyobvious solutions based on previous work. The first solutionis to let RFID readers repeatedly execute a tag collection

protocol to first collect IDs of all tags and then compare themwith the IDs in set E to detect and identify missing tags.This solution works; however, it is too slow. For example,even using the fastest existing tag collection protocol TH [18],our performance evaluation results show that this solution is14.3 times slower than our scheme. Delayed identification ofmissing RFID tags may have unbearable consequences suchas giving thieves enough time to steal and run. The secondsolution is to first execute a tag collection protocol to getthe IDs of unexpected tags and then repeatedly execute anexisting RFID missing tag detection or identification protocol.This solution has two limitations. First, it is slow becausethe missing tag detection protocol will have to monitor theunexpected tags in addition to the expected tags. Second, themissing tag detection protocol will report a missing tag eventeven when some unexpected tags go missing, which is notthe requirement. Furthermore, these solutions can’t be usedin settings where readers aren’t allowed to read IDs in set U

due to privacy reasons. An example of such a setting is theaforementioned multi-tenant warehouse, where one tenant maynot permit readers of other tenants to read the IDs of itstags.

The third solution is to repeatedly execute a tag estimationprotocol and look for a net change in the population size. Thelimitation of this solution is that if some expected tags gomissing but an equal or greater number of unexpected tagsjoin the population, the estimation protocol can not detectthe missing tag event. Furthermore, missing tag detectionprotocols are much faster compared to estimation protocolsdue to the knowledge of set E [11], [12], [20].

In this paper, we propose a new protocol calledRFID monitoring protocol with unexpected tags (RUN), thefirst protocol that can achieve required reliability in detectingand identifying missing tags when unexpected tags might bepresent in the population. RUN uses the frame slotted Alohaprotocol specified in the C1G2 standard as its MAC layer com-munication protocol. In Aloha protocol, the reader first tellsthe tags a frame size f and a random seed number R. Eachtag within the transmission range of the reader then uses f , R,and its ID to select a slot in the frame by evaluating a hashfunction h(f, R, ID) whose result is uniformly distributedin [1, f ]. Each tag has a counter initialized with the slot numberit chose to reply. After each slot, the reader first transmits anend of slot signal and then each tag decrements its counter byone. In any given slot, all the tags whose counters are equalto 1 respond with a pseudo-random sequence called RN16. Ifno tag replies in a slot, it is called an empty slot. If one or moretags reply in a slot, it is called a nonempty slot. As per theC1G2 standard, tags do not transmit their IDs unless the readerspecifically asks them to do so. In RUN, the reader checks ifa slot is empty or nonempty using the RN16 sequence andnever asks the tags to transmit their IDs. This preserves theprivacy in settings where a reader is not allowed to read IDsof tags in set U. C1G2 standard provisions this functionalityof not asking the tags for their IDs.

RUN can detect a missing tag event and if required, itcan identify the tags that are missing. To distinguish betweenthese two functionalities of RUN, we represent it by RUND


when used for detection of missing tags and by RUNI

when used for identification of missing tags. In rest of thispaper, whenever we use the word RUN, it means that weare referring to both RUND and RUNI. Next, we brieflydescribe how RUND detects a missing tag event and how RUNI

identifies the missing tags.Missing Tag Detection: To detect if any tags are missing,

RUND executes multiple Aloha frames with different seeds.In each frame, each tag uses the seed for that frame toselect its slot. As RUND already knows the IDs of all tagsin set E, it pre-computes which tags in E will select whichslots in each frame. Thus, it knows which slots in the framesshould be nonempty if all the tags in E are present in thepopulation. When a reader executes a frame, RUND comparesthe response in each slot of that frame with the correspondingslot in the pre-computed frame. If it finds that a particularpre-computed slot was nonempty but the corresponding slot inthe executed frame is empty, it stops and declares that sometags are missing. To minimize the effect of unexpected falsepositives and consequently the detection time, RUND estimatesthe size of U implicitly without running an extra estimationphase and uses this estimate to calculate optimal values ofsystem parameters.

Missing Tag Identification: To identify exactly which tagsare missing from the population of RFID tags, instead of stop-ping on encountering the first empty slot that was nonemptyin the pre-computed frames, RUNI continues and executes vframes with different seeds. On each encounter with an emptyslot in a frame that was nonempty in the corresponding pre-computed frame, it marks all the tags in E that should haveresponded in that slot as absent. The value of v is such that inexecuting v frames, RUNI is expected to identify all missingtags.

D. Technical Challenges and Solutions

There are four key technical challenges in detecting andidentifying missing tags. The first technical challenge is tohandle the presence of unexpected tags. Due to the presenceof such tags, it is possible that a particular slot that RUNexpected to be nonempty due to a specific tag in E actuallyturns out to be nonempty even though that specific tag in E

was missing. To address this challenge, RUN executes multipleframes with different seeds, which reduces the effects ofsuch unexpected false positives. We derive an expression tocalculate the false positive probability due to tags in Ep ∪ U

and use that expression to calculate optimal values of framesizes and the number of times the frames should be executedto mitigate the effects of false positives.

The second technical challenge is to estimate the number ofunexpected tags |U| in the population of RFID tags, which isrequired to calculate the optimal values of system parameterssuch as frame size f and number of times n the frames shouldbe executed. To address this challenge, RUN first pre-computeswhich slots in each frame will the tags in E not select. Second,it executes the frames and sees how many of such slots turn outto be nonempty. The number of such slots that are nonemptyin the executed frames is a function of |U| but is independentof |E| because we know from the pre-computed frames that

the tags in E never select these slots. Therefore, only the tagsin U could have selected these slots. Thus, by observing thenumber of slots that are empty in the pre-computed frames andnonempty in the executed frames, RUN estimates the numberof unexpected tags |U| in the population of RFID tags. Notethat RUN does not carry out a separate estimation phase toestimate the size of U. It obtains the estimate while executingAloha frames to detect and identify missing tags and thus,does not incur any extra time cost.

The third technical challenge is to achieve the requiredreliability in smallest possible time in case of detection ofmissing RFID tags. To address this challenge, we use theexpression for false positive probability to derive a “reliabilitycondition”, which, if satisfied by the values of frame size f andnumber of frames n, guarantees that RUND will achieve therequired reliability. These values of frame size f and numberof frames n ensure with probability α that there will be at leastone slot in the n frames that is nonempty in the pre-computedframes and empty in the executed frames when m ≥ T .To minimize RUND’s execution time, we express it in termsof f and n and minimize it under the constraint that f and nsatisfy the reliability condition.

The fourth technical challenge is to identify all missing tagsin smallest possible time in case of identification of missingRFID tags. To address this challenge, we derive an expressionto calculate the number of missing tags that RUNI identifiesin each frame while incorporating the false positives. Usingthis expression, we derive a second expression to calculatethe number of frames RUNI needs to execute to identify allmissing tags. We use this second expression to calculate totalexecution time of RUNI and minimize it with respect to framesize f to get the optimal value of frame size. In additionto minimizing execution time, the optimal value of framesize f also ensures that each missing tag is expected tomap to at least one such slot that is nonempty in a pre-computed frame and empty in the corresponding executedframe.

E. Key Novelty and Advantages Over Prior Art

The key novelty of this paper is twofold. First, we identifythe problem of detecting and identifying missing tags in thepractical scenario where unexpected tags are present. Second,we propose RUND for detecting missing tags and RUNI foridentifying missing tags in the presence of unexpected tags.RUN has two key advantages over prior art. First, it achievesthe required reliability in the presence of unexpected tags,whereas none of the existing protocols achieve the requiredreliability. We have extensively evaluated and compared RUNwith four state-of-the-art missing tag detection protocols(TRP [20], IIP [11], MTI [24], and SFMTI [12]) in a varietyof scenarios for a large range of tag population sizes. Amongexisting protocols, SFMTI achieves the highest reliability of67% whereas RUND achieves arbitrarily high reliability as perthe requirement such as 99%. Similarly, SFMTI identifies upto 60% of missing tags in the presence of unexpected tagswhereas RUNI identifies 100% of missing tags. Second, it iscompliant with the C1G2 standard whereas existing protocols,except TRP, are not.


II. RELATED WORK

Several probabilistic [3], [13], [20] and deterministic [11],[12], [24], [26] missing tag detection and identification proto-cols have been proposed. The common and major drawback ofall of these protocols is that none of them handle unexpectedtags and assume that the readers already know the IDs of alltags that can be present in the population.

A. Probabilistic Protocols

Tan et al. proposed the first probabilistic protocol calledTRP [20]. TRP pre-computes slots in a frame and comparesthem with the executed slots to detect missing tags. Thedifference with RUN, however, lies in that TRP does notconsider false positives from unexpected tags because theyassume that the reader already knows all the IDs in thepopulation. In comparison, RUN does not know which slotswill cause false positives due to the presence of unexpectedtags, making the problem challenging. Furthermore, for largepopulations, TRP requires frame size that exceeds the C1G2specified upper limit of 215, which is not possible in practicalRFID systems. Among existing protocols, TRP is the only onethat is compliant with the C1G2 standard as long as the framesize is below 215. Luo et al. proposed another probabilisticprotocol called EMTD [13]. Unfortunately, this protocol isnon-compliant with the C1G2 standard because it assumesthe RFID tags to be intelligent with enough computing powerto implement a hash ring and calculate hashes using thatring. Bu et al. proposed a suite of protocols, namely Cardiff,Cardiff+, and Divar, to anonymously detect missing tags [3].Cardiff is essentially the same as the frame slotted Alohaprotocol except that the tags never transmit their IDs, rathertransmit only a 10-bit sequence. Cardiff counts the number oftags in the population and if the counted number of tags isless than the expected number of tags, it declares that sometags are missing. Cardiff+ slightly enhances the performanceof Cardiff by increasing the frame size from equal to thetag population size to 1.2 to 1.4 times the tag populationsize. In Divar, the authors propose to use dynamic sequencespread spectrum (DSSS) to verify the intactness of an RFIDpopulation. Unfortunately, RFID tags in general and C1G2compliant tags in particular, cannot implement a sophisticatedscheme such as DSSS. Neither of the three protocols in [3]handle unexpected tags. None of the existing probabilisticprotocols have been designed to work in multiple-readerenvironment.

B. Deterministic Protocols

Li et al. proposed a suite of protocols in [11] out of whichIIP performs the best. IIP is non-compliant with the C1G2standard due to following three reasons. First, it requirestags to interpret pre-vector frames and reply to the readerqueries as described in those frames. Second, it requires framesizes greater than 215 for large populations. Last, it requiresmanufacturers to insert a ring of random bits in tag memoryat the time of manufacturing. IIP can not handle multiplereaders and requires frame sizes greater than 215 for largepopulations. Zhang et al. proposed a deterministic protocolin [24], called MTI, that handles multiple readers but is also

non-compliant with the C1G2 standard. Note that MTI isreferred to as protocol 3 in [24]. The protocol is essentiallya tag collection protocol that first collects the IDs of allthe tags in the population and then checks which tags inthe population are missing. The authors have not provided atheoretical frame work to calculate optimal system parameters,due to which MTI cannot be used to achieve an arbitrarydesired accuracy. Liu et al. proposed a deterministic protocolin [12], called SFMTI. Although this protocol handles the useof multiple readers, it is non-compliant with the C1G2 tagsbecause it requires tags to interpret non-standardized vectorstransmitted by the readers before and after selecting a slotin a frame. Zheng and Li proposed P-MTI, a physical layermissing tag identification protocol that leverages physical layersignals to identify missing tags [26]. Unfortunately, P-MTIworks only when there are no unexpected tags because thereader in P-MTI needs the IDs of all tags in the population topre-compute a sequence of bits for each tag. In the presenceof unexpected tags with unknown IDs, the reader cannotcompute such sequences, and thus not solve the system oflinear equations, rendering P-MTI unable to identify missingtags.

III. SYSTEM MODEL

A. Architecture

For detecting and identifying missing tags, RUN uses acentral controller connected with a set of readers that coverthe area where the tags in set E are located. The use of acentral controller ensures that all readers use consistent valuesof frame sizes and seeds when executing frames, which helpsin efficiently aggregating and processing information returnedby the readers. The readers use the standardized frame slottedAloha protocol to communicate with tags and never ask thetags to transmit their IDs.

The use of multiple readers with overlapping coverageregions introduces following two problems: (1) scheduling thereaders such that no two readers with overlapping regionstransmit at the same time, and (2) mitigating the effect ofsome tags responding to multiple readers due to overlap inthe coverage region of those readers. For the first problem, thecontroller uses one of the several existing reader schedulingprotocols [4], [21], [23], [27] to avoid reader-reader collisions.For the second problem, RUN uses same values of framesize f and seed R across all readers with the help of thecentral controller. When a reader transmits seed Ri in itsith frame, it does not generate Ri on its own, rather it usesthe ith seed Ri issued by the central controller. That is, eachreader generates the same sequence of seeds in consecutiveframes. As all readers use the same seed Ri in the ith frame,the slot number that a particular tag chooses in the ith frameof each reader covering this tag is the same i.e. , h(f, Ri, ID)evaluated by the tag results in same value for each reader.Once a reader completes its frame, it sends the responses to thecentral controller. The controller applies logical OR operatoron all the ith frames from all readers and gets a single ith frameas if returned by one reader covering the entire tag population.It then compares it with the ith pre-computed frame to detectand identify missing tags.


B. C1G2 Compliance

RUN does not require any modifications to tags or read-ers. It only requires the readers to receive the frame size,persistence probability, and seed number from the controllerand communicate the responses in the frames back to thecontroller. Persistence probability p is the probability withwhich a tag decides whether it will participate in a frame ornot before selecting a slot in that frame. Later in the paper,we will show how we use p to handle frame sizes that exceedthe C1G2 specified upper limit of 215. As the C1G2 standarddoes not specify the use of p, COTS tags do not supportit. To avoid making any modifications to tags, in RUN, thereader implements p by announcing a frame size of f/p butterminating the frame after the first f slots. To terminate aframe, the reader issues a SELECT command, specified in theC1G2 standard, with its position, target, and action parametersset to 0. This command “resets” all tags and they go into astate where they expect a new frame to start. For further detailson frame termination, see [6, Sec. 6].

C. Communication Channel

We assume that the communication channel between readersand tags is reliable i.e. , tags correctly receives queries fromthe readers and the readers correctly detect transmission ofRN16 sequence in a slot if one or more tags in the populationtransmit in that slot. If the channel is unreliable, the solutionproposed in [18] can be easily adapted for use with RUN.

D. Formal Development Assumption

To make the formal development tractable, we assume thatinstead of picking a single slot to transmit at the start ofith frame of size f , a tag independently decides to transmitin each slot of the frame with probability 1/f regardless ofits decision about previous or forthcoming slots. Vogt first usedthis assumption for the analysis of Aloha protocol for RFIDand justified its use by recognizing that this problem belongsto a class of problems called occupancy problem, which dealswith the allocation of balls to urns [22]. Ever since, the useof this assumption has become a norm in the formal analysisof all Aloha based RFID protocols [17], [22], [25].

The implication of this assumption is that a tag can endup choosing more than one slots in the same frame or evennot choosing any at all, which is not in accordance withthe C1G2 standard that requires a tag to pick exactly oneslot in a frame. However, this assumption does not createany problems because the expected number of slots that atag chooses in a frame is still one. The analysis with thisassumption is, therefore, asymptotically the same as thatwithout this assumption [2]. This independence assumptionis made only to make the formal development tractable. In allour simulations, a tag chooses exactly one slot at the start ofa frame. Table I lists the symbols used in this paper.

IV. PROTOCOL FOR DETECTION: RUND

To detect if any of the tags in set E is missing from thepopulation, in RUND, the central controller executes up ton Aloha frames using the RFID readers. There are 6 steps

TABLE I

SYMBOLS USED IN THE PAPER

involved in executing each frame. First, before executing anyframe i, the controller calculates the optimal values of framesize fi, persistence probability pi, and generates a randomseed number Ri. Second, as the controller knows the IDs inset E, it pre-computes which tag in E will choose which slotin the ith frame. Thus, it knows which slots of the executedith frame should be nonempty if all the tags in E were presentand a single reader covered the entire population. It representsthe nonempty slots in the pre-computed frame with 1s and allother slots with 0s. Third, it provides each reader with theparameters fi, pi, and Ri and asks each of them to executethe ith frame using these parameters. The motivation behindusing the same values of fi, pi, and Ri across all readers forthe ith frame is to enable RUND to work with multiple readerswith overlapping regions. As all readers use the same valuesof fi, pi, and Ri in the ith frame, the slot number that aparticular tag chooses in the ith frame of each reader coveringthis tag is the same i.e. , h(fi/pi, Ri, ID) evaluated by thetag results in same value for each reader. Fourth, each readerexecutes the frame on its turn as per the reader schedulingprotocol and sends the responses in the frame back to thecontroller. Fifth, when the controller receives the ith frame of


each reader, it applies logical OR operator on all the receivedith frames and obtains a resultant ORed frame. This resultantORed frame is same as if received by a single reader coveringall the tags. Sixth, the controller compares all the slots in thepre-computed ith frame with the corresponding slots in theresultant ORed ith frame. If there is any slot that is 1 in thepre-computed frame but 0 in the resultant ORed frame, thecontroller detects this as a missing tag event because such aslot implies that all tags in E that are mapped to this slot in thepre-computed frame are absent from the population. At thispoint, the controller stops the protocol and does not executethe remaining n − i frames. If the controller does not detecta missing tag event even after each reader has executed nframes, it declares that the number of missing tags m is lessthan the threshold T .

V. PARAMETER OPTIMIZATION: RUND

Recall from the previous section that before executingany frame i, the controller calculates the optimal valuesof frame size fi and persistence probability pi. For this,the controller first estimates the value of number of unex-pected tags |U| at the start of the ith frame, representedby |Ui|, based on the responses from the tag population in theprevious i−1 frames. Then, using this estimate along with thevalues of number of expected tags |E|, required reliability α,and threshold T , the controller calculates the optimal valuesof the frame size fi and persistence probability pi such thatRUND achieves the required reliability in shortest time. Beforeasking the readers to execute the ith frame, the controller alsorecalculates the maximum number of frames that it shouldexecute, represented by ni. As the controller executes moreand more frames, i.e. , as i increases, the estimate |Ui| of thenumber of unexpected tags asymptotically becomes equal tothe actual number of unexpected tags |U|. Consequently, fi,pi, and ni asymptotically become equal to constants f , p, andn, respectively. When the estimate of |U| does not changeby more than 1% in 10 consecutive frames, the controllerconsiders the estimate to be close enough to |U|. At this point,the controller calculates the values of fi, pi, and ni and putsf = fi, p = pi, and n = ni, and uses these fixed values of fand p to execute subsequent frames until the total number offrames executed since the first frame become equal to n. Notethat the controller executes n frames only if it does not detectany missing tag event in any frame. Otherwise, it terminatesthe protocol as soon as it detects a missing tag event. For thefirst frame, i.e. , when i = 1, the controller uses f1 = 2× |E|,p1 = 1, and n1 = ∞. The choices of the values of f1, p1,and n1 are arbitrary and do not really matter because asthe controller executes more frames, the frame size, thepersistence probability, and the number of frames converge toconstants f , p, and n, respectively.

In rest of this section, we will derive equations that thecontroller uses at the start of each frame to calculate theoptimal values of frame size f , number of times the framesshould be repeated n, and persistence probability p to mini-mize the execution time of RUND while ensuring that its actualreliability is no less than the required reliability. We havedropped the subscript i from these parameters to make the

presentation simple. To calculate these optimal values, thecontroller requires the estimate of |U|. Next, we will firstpresent a method to obtain this estimate at the start of anyframe i based on the responses from the tag populationin the previous i − 1 frames. Second, using the estimateof |U|, we will derive an expression for the false positiveprobability, i.e. , the probability that a missing tag is detectedas present. Third, we will use the expression for false positiveprobability in conjunction with the required reliability α andthreshold T to obtain an equation with three unknowns f , p,and n. To ensure that the actual reliability is greater than orequal to the required reliability, the controller must use thevalues of f , p, and n that satisfy this equation. We call thisequation the reliability condition. Fourth, we will derive anexpression for the total execution time of RUND and minimizeit with respect to n to get an expression involving p and n.The controller simultaneously solves this expression with thereliability condition using p = 1 to obtain the optimal valuesof f and n. Last, we will show how to bring the value of fwithin limit when the optimal value of the frame size exceedsthe C1G2 specified upper limit of 215, We will also calculatethe expected number of slots RUND takes to detect the firstmissing tag event. Next, we describe these five steps in detail.

A. Estimating Number of Unexpected Tags

In this section, we present a method to estimate the numberof unexpected tags in the population at the start of anyframe i. Although a lot of work has been done by the researchcommunity to estimate the number of tags present in anRFID tag population [5], [9], [10], [17], there is no work onestimating the size of some subset of RFID tag population.In our case, that subset is the set of unexpected tags in thepopulation.

Recall from Section IV that in any frame i, the slots that are0 in the ith pre-computed frame are the slots that only the tagsin set U can select when the reader executes the ith frame.This is because we have prior knowledge that the tags inset E will select only those slots in the ith executed framethat are 1 in the ith pre-computed frame. The intuition behindour estimation method is that as the number of unexpected tagsin a population increases, the number of slots that are 0 in apre-computed frame but are 1 in the corresponding executedresultant ORed frame also increase. The number of such slotsin any given frame is a function of |U| and can, therefore, beused to estimate the value of |U|.

Next, we derive an expression that relates the number ofslots that are 0 in a pre-computed frame but are 1 in thecorresponding executed resultant frame with the value of |U|.We will use this expression to obtain the estimate of |U|. Letthe size of the ith frame be fi and let ki of these fi slots be1s in the pre-computed frames. Let j be the jth 0 slot in thepre-computed frame. Thus, 1 ≤ j ≤ fi − ki. Let Xij be anindicator random variable for the event that the jth 0 slot inthe ith pre-computed frame turns out to be 1 in the ith executedresultant frame. The expected value of Xij is given by

E[Xij ] = P {Xij = 1} = 1 −(1 − pi

fi

)|U|≈ 1 − e

−pifi

|U|


Let N 01i be a random variable representing the number of

slots that are 0 in the ith pre-computed frame but 1 inthe ith executed resultant frame. Thus, N 01

i =∑fi−ki

j=1 Xij .As

{Xi1, Xi2, . . . , Xi(fi−ki)

}forms a set of identically dis-

tributed random variables, E[N 01i ] is given by

E[N 01i ] = E[

fi−ki∑j=1

Xij ] = (fi − ki) × E[Xij ]

= (fi − ki) × (1 − e−pi

fi|U|) (1)

Let N 01i represent the observed value of the number of slots

that were 0 in the ith pre-computed frame but 1 in thecorresponding executed resultant frame. Replacing E[N 01

i ]in the equation above with N 01

i and solving for |U| givesan estimate of |U|. This estimate is obtained by utilizingthe information from the ith frame only. While this estimatemay not be accurate, if we use the information from a largenumber of frames, the estimate will become more accurate.Specifically, we leverage the well known statistical resultthat the variance in the observed value of a random variablereduces by x times if we take the average of x observationsof that random variable. Therefore, to obtain the estimate |Ui|of |U| at the start of the ith frame, we obtain an estimatefrom each of the previous i−1 frames and take their average.Solving Equation (1) for |U| and averaging over past i − 1frames, the formal expression for |Ui| becomes

|Ui| = − 1i − 1

i−1∑l=1

fl

plln

{1 − N 01

l

fl − kl

}(2)

Finally, note that the controller obtains this estimate withoutexecuting any additional frames. It gets this estimate from theframes it was already executing to detect missing tag events.

B. False Positive Probability

A false positive occurs when all the slots that a particularmissing tag maps to in the n pre-computed frames turn outto be nonempty when the frames are executed because someother tags in the population also selected those slots. Lemma 1gives the expression to calculate the false positive probability.

Lemma 1: Let m out of |E| tags be missing, and let therebe |U| unexpected tags in the population. With persistenceprobability p, frame size f , and number of frames n, the falsepositive probability, Pfp, is given by:

Pfp =

{1 − p

(1 − p

f

)|U|+|E|−m}n

(3)

Proof: The total number of tags in the population are|U| + |E| − m. Consider an arbitrary tag in E that is missingfrom the population. As this tag participates in each pre-computed frame with probability p, it is possible that it doesnot participate in one or more of the n pre-computed frames.Let Z be the random variable for the number of pre-computedframes in which this missing tag participates. Let q be theprobability that a slot that this missing tag maps to in a pre-computed frame is selected by one or more of the tags present

Fig. 1. False positive probability (Pfp) comparison.

in the population in the executed frame. Therefore,

Pfp =n∑

z=0

P {Z = z} × qz (4)

As a missing tag participates in each pre-computed frame withprobability p and there are n pre-computed frames, the numberof pre-computed frames in which the missing tag participatesfollows a binomial distribution i.e. , Z ∼ Binom(n, p). Whena frame is executed, probability that a tag in the populationchooses the same slot to which the missing tag maps in thepre-computed frame is p

f . Probability that it does not choose

that same slot is 1 − pf . Probability that none of the tags

in the population choose that same slot is (1 − pf )|U|+|E|−m.

Probability that at least one tag in the population chooses that

same slot is 1 − (1 − pf )|U|+|E|−m, which is the value of q.

Therefore, Equation (4) becomes

Pfp =n∑

z=0

(n

z

)pz(1 − p)n−z

{1 − (1 − p

f)|U|+|E|−m

}z

To simplify the equation above, we use the binomial theorem,which states that

n∑z=0

(n

z

)xzyn−z = (x + y)n

Substituting x = p×{1− (1− p

f )|U|+|E|−m}

and y = 1−p inthe equation above, its left hand side (L.H.S) becomes equalto the right hand side (R.H.S) of Equation (5) and its R.H.Sbecomes equal to the R.H.S of Equation (3). Thus, the R.H.Ssof Equations (3) and (5) are equal. �

Figure 1 shows the theoretically calculated false positiveprobability from Equation (3) represented by the solid lineand values of false positive probability observed from simula-tions represented by the dots. To obtain this figure, we use|E| = 100, |U| = 500, f = 300, p = 1, and n = 2.Each dot represents the false positive probability calculatedfrom 100 runs of simulation. We observe that the theoreticallycalculated values match perfectly with the values observedfrom simulations, showing that our independence assumptionthat we stated in Section III-D does not cause the theoreticalanalysis to deviate from practically observed values. We alsoobserve that as the number of missing tags increases, the falsepositive probability decreases. This means that it is hardest forRUND to detect a missing tag event when m = T and becomeseasier as m increases beyond T . Thus, we will use m = T inall further analytical development, because if RUND is able todetect a missing tag event with probability α when m = T ,


it will be able to detect a missing tag event with probabilitygreater than α when m > T .

C. Achieving Required Reliability

Following theorem gives the reliability condition that thevalues of f , p, and n need to satisfy in order for RUND to beable to achieve the required reliability i.e. , detect missing tagevent with probability greater than or equal to α if the numberof missing tags are greater than or equal to T .

Theorem 1: Given a set E with expected IDs, set U withunexpected IDs, threshold T , and required reliability α, RUND

will achieve the required reliability if the values of f , p, andn satisfy the reliability condition given below.

f =p(T − |E| − |U|)ln

{ 1−(1−α)1

nT

p

} (5)

Proof: The expression for Pfp in Equation (3) gives theprobability that RUND does not detect a given missing tag.Probability that it does not detect any of the T missing tagsis PT

fp. Probability that it detects at least one of the missingtags is 1 − PT

fp, which should be greater than or equal to α.In worst case, the probability that RUND detects at least one ofthe missing tags should at least be equal to α i.e. , 1−PT

fp = α.Substituting Pfp by the R.H.S of Equation (3), we get

1 − α =

{1 − p

(1 − p

f

)|U|+|E|−T}nT

≈{

1 − pe−pf (|U|+|E|−T )

}nT

Rearranging the equation above gives Equation (5). �

D. Minimizing Execution Time

Following theorem gives the condition that p and n need tosatisfy to make the execution time of RUND minimum underthe constraint that it achieves the required reliability.

Theorem 2: Given a threshold T and required reliability α,the execution time of RUND is minimum under the constraintthat it achieves the required reliability if the values of p and nsatisfy the following equation:

p ={1 − (1 − α)

1nT

}{(1 − α)

(1−α)1

nT

nT(−1+(1−α)1

nT )

}(6)

Proof: Execution time is directly proportional to thetotal number of slots required to detect the missing tag eventbecause the duration of each slot is the same and is equal tothe time it takes the reader to distinguish between an emptyand a nonempty slot (roughly around 300μs for Philips I-CodeRFID reader [14]). Let Sd represent the total number of slots,which equals the product of frame size f and total number offrames n, i.e. , Sd = f ×n. To ensure that RUND achieves therequired reliability, we use the value of f from Equation (5).Thus,

Sd =pn(T − |E| − |U|)ln

{1−(1−α)

1nT

p

} (7)

Fig. 2. Number of slots (Sd) vs. number of frames (n).

Fig. 3. Frame size (f ) vs. persistence probability (p).

Figure 2 plots Sd as a function of n. We observe that Sd

is a convex function of n. Therefore, optimum value of nexists, represented by nop, that minimizes the total number ofslots Sd. To find optimal value of n, we differentiate Equation(7) with respect to n and equate the resulting expression to 0,which gives Equation (6). �

At the start of each frame, the controller replaces |U| with itsestimate, puts p = 1 in Equation (6), and solves it numericallyusing Brent’s method to obtain the optimal value of numberof frames nop. Then it puts n = nop and p = 1 in Equation (5)to get the optimal value of frame size fop. When the controllercalculates fop and nop like this at the start of each frame, theexecution time of RUND is minimized. At the same time, asthe reliability condition is satisfied, the protocol achieves therequired reliability.

E. Handling Large Frame Sizes

For large populations, high required reliability, and/or smallthreshold, it is possible for the value of fop to exceed the C1G2specified upper limit of 215. Next, we describe how we use pto bring the frame size within limits. Bringing the frame sizewithin limits comes at a cost of increased number of slots;greater than the minimum value of Sd that would have beenachieved if the controller could use fop > 215.

When we decrease the value of p, the number of tagsthat participate in a frame decrease. Therefore, intuitively, therequired value of f should also decrease. Figure 3 confirmsthis intuition. This figure shows the plot of frame size vs.persistence probability, obtained using Equations (5) and (6).We can see that when p decreases, f decreases. Participationby lesser tags means that participation by the tags belongingto both the sets E and U decreases. This increases the chancesthat a given missing tag will not map to any slot in a givenpre-computed frame, which means that chances of detectingits absence decrease. Therefore, the overall uncertainty indetection of missing tags increases. To reduce this uncertainty,intuitively, the value of n should increase when p decreases toachieve the required reliability. Figure 4 confirms this intuition.


Fig. 4. Number of frames (n) vs. persistence probability (p).

This figure shows the plot of number of frames vs. persistenceprobability, obtained using Equations (5) and (6). We observethat when p decreases, n increases.

We use these two observations to reduce the value of fwhenever fop > 215. When fop > 215, the controller usesf = fmax = 215 in Equation (5), which leaves two unknowns,p and n, in the resulting equation. The controller solves theresulting equation simultaneously with Equation (6) to get newvalues of p and n. The new value of p is less than 1 and thenew value of n is greater than nop because fmax < fop. Puttingf = fmax in Equation (5) and solving for n, we get

n =ln {1 − α}

T ln{1 − pe

pfmax

(T−|E|−|U|)} (8)

Replacing n in Equation (6) with the R.H.S of the equationabove, and simplifying, we get

p2(T − |E| − |U|)f(e

pfmax

(|E|+|U|−T ) − p)= ln

{1 − pe

pfmax

(T−|E|−|U|)}

(9)

The numerical solution of the equation above gives the newvalue of p, which the controller puts in Equation (8) to getthe new value of n. The controller uses these new valuesof n and p along with f = fmax to pre-compute the ith

frame. Although the total number of slots Sd = fmax × n >fop × nop, this is still the smallest under the constraints thatthe required reliability is achieved and the frame size does notexceed fmax.

F. Expected Detection Time

The values of f and n calculated above ensure that inexecuting n frames, RUND will detect a missing tag eventwith probability ≥ α if number of missing tags is ≥ T .However, in many cases, the first missing tag event is detectedbefore all n frames are executed. We calculate the expectedvalue of the number of slots that RUND takes to detectthe first missing tag event. For this, we first calculate theprobability that a missing tag event is detected in a givenslot.

Lemma 2: Given a set E with expected IDs, set U withunexpected IDs, and threshold T , when controller executesRUND with persistence probability p and frame size f , theprobability g that a missing tag event is detected in any slotis given by the following equation.

g ={

1 −(1 − p

f

)T}×

{(1 − p

f

)|U|+|E|−T}

(10)

Fig. 5. Comparison of analytical and empirical E[D].

Proof: Probability that a missing tag event is detected ina given slot is the product of the probability that at least onemissing tag maps to this slot in the pre-computed frame andthe probability that no tag in the population selects that slotin the executed frame. Considering the scenario where it ishardest for RUND to detect a missing tag event i.e. , whenm = T , probability that at least one of the missing tags maps

to the given slot in the pre-computed frame is 1−(1− p

f

)T

.

The probability that none of the tags present in the population

selects that slot is(1− p

f

)|U|+|E|−T

. The product of these two

probabilities gives the expression for g in Equation (10). �Following theorem gives the expected value of the num-

ber of slots that RUND takes to detect the first missingtag.

Theorem 3: Let D be the random variable for the slotnumber when the first missing tag event is detected. Giventhat the probability of detecting a missing tag event in a slotis g, as calculated in Lemma 2, frame size is f , and numberof frames is n, we get

E[D] =1 − (1 − g)fn − fng(1 − g)fn

g(11)

Proof: The random variable D follows geometric distrib-ution with parameter g i.e. , P {D = i} = (1 − g)i−1g. Theexpected value, thus, becomes

E[D] =Sd∑i=1

iP {D = i} =f×n∑i=1

ig(1 − g)i−1

=1 − (1 − g)fn − fng(1 − g)fn

g�

Figure 5 shows the theoretically calculated expected slotnumber of first detection (calculated using Equation (11))represented by the solid line and values of slot number offirst detection observed from simulations represented by thedots. This figure is obtained using |E| = 100, |U| = 500,and f , p, and n are calculated using the method explainedin Section V. Each dot represents average of results from100 runs of simulation. We see that the values observedfrom simulations track the theoretically calculated values verywell.

G. Estimating Number of Missing Tags

In some scenarios, exact IDs of missing tags may not beneeded, rather only a count of the number of missing tags maybe of interest. For example, consider a warehouse that stores a


large number of boxes of shoes of the same type among otherproducts. If some of those boxes of shoes are stolen (or maybe shipped), the manager may only be interested in calculatinghow many boxes have left the warehouse. Furthermore, insuch a scenario, if the number of boxes in the warehousechange frequently, estimation is a better choice comparedto identification because estimation is usually much fastercompared to identification for large populations [10], [17].

RUND can obtain a rough estimate of the number oftags missing from the population without incurring any extratime cost. We adapt the two most accurate and fast esti-mation schemes proposed until now i.e. , Enhanced ZeroBased Estimator (EZB) [10] and Average Run based TagEstimator (ART) [17]. The idea behind EZB is that whenthe number of tags in the population increases, the expectednumber of empty slots in the frame decreases and the expectednumber of collision slots increases. A collision slot is a slotselected by two or more tags. Similarly, a singleton slot isa slot which is selected by only one tag. As we do notdistinguish between singleton and collision slots, we adapt theidea behind EZB to our scenario, viz., as the tag populationsize increases, the number of empty slots decreases and thenumber of nonempty slots increases. Therefore, the number ofempty slots and nonempty slots is a function of tag populationsize.

Next, we derive an expression to obtain a rough estimateof the tag population size. Let j be any arbitrary slot in theframe, where 1 ≤ j ≤ f . Let Xj be an indicator randomvariable for the event that jth slot is nonempty, i.e. , Xj = 1 ifone or more tags in the population select the jth slot, otherwiseXj = 0. Let N be a random variable representing the numberof nonempty slots in the frame. The value of N in terms ofXj is given by N =

∑fj=1 Xj . Therefore, when m tags are

missing from the population

E[Xj ] = P {Xj = 1} = 1 −(1 − p

f

)|U|+|E|−m

≈ 1 − e−pf (|U|+|E|−m)

As {X1, X2, . . . , Xf} forms a set of identically distributedrandom variables, E[N ] is given by

E[N ] = E[f∑

j=1

Xj ] = f × E[Xj ] = f(1 − e−pf (|U|+|E|−m))

Solving the equation above for m, we get

m =p|U| + p|E| + f ln

{1 − E[N ]

f

}

p(12)

To obtain an estimate m of m, we execute n frames, count thenumber of 1s in each frame and take their average across then frames. We replace E[N ] in Equation (12) with this averagevalue and obtain the estimate m.

Similarly, we apply ART on the executed frames to obtainanother estimate of m and take its average with m obtainedusing Equation (12) to get a final estimate of number ofmissing tags. The estimate of number of missing tags thatwe obtain this way is only a rough estimate. The accuracy

of the estimate improves when more and more frames areexecuted. Note that this estimate is obtained without executingany additional frames.

VI. PROTOCOL FOR IDENTIFICATION: RUNI

In this section, we explain the RUNI protocol for identifyingthe tags in E that are missing from the population. RUNI

protocol for identification builds on the RUND protocol fordetection. The controller maintains two sets A and P andinitializes them as A = φ and P = E. When the protocolcompletes, set A contains the IDs of all the tags in E that aremissing from the population and set P contains the IDs of allthe tags in E that are present. The controller calculates thevalues of v, p, and f , generates v random seed numbers Ri,where 1 ≤ i ≤ v, and pre-computes which tag will choosewhich slot in each frame. It executes the ith frame by providingeach reader with f , p, and Ri. Each reader sends the responsesin the ith frame back to the controller and the controller appliesthe logical OR operator on all received ith frames to obtaina resultant ORed frame. It then compares all the slots in thepre-computed ith frame with the corresponding slots in theORed ith frame. For each slot that is 1 in the pre-computedframe but 0 in the ORed frame, the controller considers all thetags in E that mapped to this slot in the pre-computed frameas absent. Let Ii represent the set of all the newly identifiedmissing tags by the controller after comparing the ith ORedframe with the ith pre-computed frame. After the ith frame, thecontroller adds the set Ii to set A and removes it from set P

i.e. , A = A ∪ Ii and P = P − Ii. The controller comparesthe frames and updates the sets A and P for all values ofi ∈ [1, v]. RUNI completes when the controller has comparedall v frames. At this point, set A is expected to contain theIDs of all the missing tags.

VII. PARAMETER OPTIMIZATION: RUNI

In this section, we derive expressions to calculate the totalnumber of frames v, persistence probability p, and frame size fthat minimize the execution time of RUNI while ensuring thatafter executing v frames, the controller is expected to haveidentified all tags in E that are missing from the population.For this, as a first step, we obtain an equation with threeunknowns: v, p, and f . To ensure that RUNI is expected toidentify all missing tags, controller must use the values of v, p,and f that satisfy this equation. In the second step, we derivean expression for the execution time of RUNI and minimizeit with respect to p and f . This results in two more equationsinvolving v, p, and f . Thus, there are three equations withthree unknowns that the controller solves simultaneously toobtain the optimal values of v, p, and f . In the last step, thecontroller checks whether or not the frame size obtained fromthe simultaneous solution of these equations exceeds 215, andif it does, it brings its value in limits by adjusting the valuesof v and p in a similar way as described in Section V-E.

1) Identifying All Missing Tags: Following theorem givesthe equation that the values of v, p, and f must satisfy toensure that the controller is expected to identify all missingtags in v frames.


Theorem 4: Given a set E with expected IDs, set U withunexpected IDs, number of missing tags m, frame size f ,and persistence probability p, the controller should exe-cute v frames, calculated using the equation below, whichwill ensure that the expected number of missing tags itidentifies in v frames equal the actual number of missingtags.

v = 1 − ln {m}ln

{1 − p(1 − p

f )|U|+|E|−m} (13)

Proof: Let mi be the expected number of missing tags thatare not yet identified by the controller as missing at the startof ith frame. Let Yij be the random variable that representsthe number of tags that the controller identifies as missing injth slot of ith frame. Probability that the random variable Yij

takes on a value l in jth slot of ith frame is the product of theprobability that l out of the remaining mi missing tags map tothe jth slot of the pre-computed ith frame and the probabilitythat no tag in the population selects jth slot of the executedith frame. Thus, the distribution of Yij is given by:

P {Yij = l} =(

mi

l

)( p

f

)l

(1 − p

f)mi−l × (1 − p

f)|U|+|E|−m

Expected value of the number of tags that the controlleridentifies as missing in jth slot of the ith frame is, therefore,given by

E[Yij ] =mi∑l=1

lP {Yij = l}

=mi∑l=1

l

(mi

l

)( p

f

)l

(1 − p

f)mi−l × (1 − p

f)|U|+|E|−m

=mip

f× (1 − p

f)|U|+|E|−m

The last step follows from the observation that the termon the left of “×” is expectation of a binomial randomvariable ∼ Binom(mi,

pf ).

Let Yi be the random variable that represents the number oftags the controller identifies as missing in the ith frame. It isgiven by:

Yi =f∑

j=1

Yij

Expected value of Yi, therefore, becomes

E[Yi] =f∑

j=1

E[Yij ] =f∑

j=1

mip

f(1 − p

f)|U|+|E|−m

= mip(1 − p

f)|U|+|E|−m

After ith frame, the expected number of missing tags that thecontroller has not yet identified is mi+1, and is given by

mi+1 = mi − E[Yi] = mi − mip(1 − p

f)|U|+|E|−m (14)

Equation (14) gives the recurrence relation to calculateexpected number of missing tags the controller has not iden-tified after i frames. Using this equation, we derive a closed

form expression for mi. As m1 = m, it follows that

mi = m

{1 − p

(1 − p

f

)|U|+|E|−m}i−1

Next, we calculate the value of v such that before the startof vth frame, the expected number of missing tags that thecontroller has not yet identified is only 1, which we expect tobe identified in the vth frame. Thus, substituting 1 for mi andv for i and solving for v, we get Equation (13) in the theoremstatement. �

2) Minimizing the Execution Time: Following theorem givesthe two conditions that the values of p and f need to satisfyto make the execution time of RUNI minimum under theconstraint that the expected number of tags identified by thecontroller equal the actual number of missing tags.

Theorem 5: The execution time of RUNI is minimum underthe constraint that the expected number of tags identified bythe controller equal the actual number of missing tags if thevalues of p and f satisfy the following two equations:

∂

∂p

(f − f × ln {m}

ln{1 − p(1 − p

f )|U|+|E|−m}

)= 0 (15)

∂

∂f

(f − f × ln {m}

ln{1 − p(1 − p

f )|U|+|E|−m}

)= 0 (16)

Proof: Let Si represent the total number of slots that thecontroller executes for RUNI. It is equal to the product offrame size f and total number of frames v, i.e. , Si = f × v.To satisfy the constraint mentioned in the theorem statement,we use the value of v from Equation (13) to calculate Si. Thetotal number of slots Si is a concave function of p and f andtherefore, optimum values of p and f exist, represented bypop and fop, that minimize the total number of slots Si. Dueto limitation of space, we have omitted the figures showingconcavity of Si with respect to p and f . To find optimalvalues of p and f , we take partial derivatives of Si withrespect to p and f and equate them to 0, which is whatEquations (15) and (16), respectively, show. �

The expanded expressions of Equations (15) and (16)are long and complex and do not have a closed formsolution. Therefore, we have omitted them to keep thepresentation simple. The controller first numerically solvesEquations (15) and (16) using Levenberg-Marquardt Algo-rithm to obtain the values of pop and fop and then substitutesthese values in Equation (13) to get the optimal value of v.

From Equation (13), we see that to calculate the valueof v, the controller needs the value of m, which is usuallynot known apriori. To address this problem, the controllerstarts with m = |E| because that results in the largest valueof v and then, after each frame, estimates the value of m asexplained in Section V-G and recalculates the value of v. Asthe controller executes more and more frames, the estimateof m becomes more and more accurate and eventually thevalue of v converges to its correct value. As the values ofv are noisy in the initial few frames because the estimateof m is not accurate at the start, the controller uses sometolerance when deciding to stop i.e. , it does not stop when ithas obtained and compared v ORed frames, rather executes a


Fig. 6. Convergence of value of v with frames.

few more frames to ensure that the value of v has stabilized.It stops only when the value of v has been relatively constantin last max {0.05v, 50} frames, and it has already receivedat least v ORed frames. Figure 6 shows the evolution of thevalue of v in an example scenario as the controller executesframes. This figure is obtained using |E| = 1000, |U| = 10000,and m = 200. We observe that the value of v is noisy atthe start because the estimate of m is not accurate at thestart, but as the controller executes more frames, the valueconverges to its correct value of 2017. In this example, it takesabout 1511 frames for v to converge to its correct value, whichis much smaller than v itself.

VIII. PERFORMANCE EVALUATION

We implemented RUND and RUNI in Matlab. Although,none of the existing protocols handles the presence of unex-pected tags and except for TRP, none of the existing protocolsis C1G2 compliant, we still implemented four prior state of theart missing tag detection and identification protocols in Matlabnamely TRP [20], IIP [11], MTI [24], and SFMTI [12], andcompared their performance with RUND and RUNI. We cal-culated parameter values for these protocols by following theinstructions in their respective papers. We also implementedthe fastest existing tag collection protocol TH [18]. We choosetag ID length of 64 bits as specified in the C1G2 standard.We do not make any assumptions about the distribution ofIDs of expected, unexpected, and missing tags in the ID spacebecause RUND and RUNI are inherently independent of tag IDdistributions. We calculate the optimal values of n, f , and pfor RUND as described in Sections V-A through V-E, and ofv, f , and p as described in Section VII.

In this section, we first evaluate the actual reliability ofRUND and the existing protocols for multiple values ofrequired reliability, keeping the unexpected tag populationsize fixed and changing the number of missing tags. We alsoshow the time taken by each protocol to detect first missingtag event. We further compare the number of missing tagsidentified by RUNI and existing missing tag identificationprotocols in f × v slots where f and v are calculated asdescribed in Section VII. Second, we evaluate the actualreliability of RUND and the existing protocols for multiplevalues of required reliability by keeping the number of missingtags fixed and changing the unexpected tag population size.We again show the time taken by each protocol to detect firstmissing tag event and the number of missing tags identified byRUNI and existing protocols in f×v slots. Third, we study the

actual reliability achieved by each protocol when the numberof tags missing from the population is different from the valueof threshold T . Fourth, we evaluate the estimation accuracyof RUND. Last, we compare the detection and identificationtimes of our protocols with the fastest tag collection proto-col TH. We also study whether there is a scenario in whichthe tag collection protocol is faster than our proposed proto-cols. For each scenario, we repeated simulations 500 timesusing different seeds and obtained the results from those500 repetitions.

Recall that the controller in RUND uses Equation (2) toestimate the number of unexpected tags |U| at the start of theith frame. However, for the very first frame, the controller doesnot have any knowledge of |U| and thus does not know whatframe size to start with. To address this problem, the controllerexecutes a scheme based on Flajolet and Martin’s probabilisticcounting algorithm [7] that provides a quick but rough estimateof the number of tags in a population. This method forestimating the number of tags using Flajolet and Martin’salgorithm was first proposed by Qian et al. in [15]. In thisscheme, the reader executes multiple single-slot frames, wherethe persistence probability p follows a geometric distributionstarting from p = 1 (i.e. , p = 1

2j−1 in the jth single slotframe), until the reader gets an empty slot. As an example, ifthe number of tags in a population are 10000, the expectedvalue of the frame number in which reader will get an emptyslot is log2 10000� = 14. Suppose the empty slot occurredin the jth single-slot frame, then t = 1.2897 × 2j−2 is arough estimate of number of tags t in the population [7], [15].Elaborate details on how Flajolet and Martin’s algorithm wasadapted for use in RFID estimation can be found in [15]. Thecontroller uses this value t as the first very rough estimateof |U|. This ensures that the size of the first frame is not grosslyinaccurate, which in turns allows the controller to convergeat the optimal frame size quickly. All results presented inthis section incorporate this cost of obtaining this first roughestimate.

A. Impact of Number of Missing Tags on RUND

RUND is the only protocol that achieves the required reli-ability in the presence of unexpected tags for any number ofmissing tags. Figure 7 shows the actual reliability achievedby RUND and all existing protocols for required reliabilityα = 0.9 and 0.99, respectively. This figure is plotted using|E| = 1000, |U| = 10000 and m is varied from 50 to 900.The actual reliabilities are obtained using 100 runs of eachprotocol for each value of m. None of the existing protocolsachieves the required reliability because none of them hasbeen designed to handle unexpected tags. Among the existingprotocols, SFMTI has the highest actual reliability of upto 0.67, which increases as the number of missing tags increasebecause increase in number of missing tags makes it easier fora protocol to detect a missing tag event.

RUND is the fastest protocol that achieves the requiredreliability compared to the existing protocols. Figure 8 showsthe average times each protocol took to either detect the firstmissing tag event if it finds a missing tag or to complete exe-cution if it does not find a missing tag. From this figure, MTI


Fig. 7. Actual reliability vs. # missing tags.

Fig. 8. Detection time vs. # missing tags.

seems to have smaller detection time compared to RUND, butwhen we analyze these figures in conjunction with Figure 7,we see that the actual reliability of MTI is close to 0, far lowerthan the required reliability. This shows that for majority oftimes, MTI completed execution without detecting any missingtags due to the presence of unexpected tags.

B. Impact of Number of Unexpected Tags on RUND

RUND is the only protocol that achieves the requiredreliability in the presence of unexpected tags while existingprotocols achieve the required reliability only when there areno unexpected tags in the population. Figure 9 shows theactual reliability obtained by RUND and the existing protocolsfor α = 0.9, and 0.99, respectively. This figure is plotted using|E| = 1000, m = 200, and |U| is varied from 0 to 10000.As before, the actual reliabilities are obtained using 100 runsof each protocol for each value of |U|. RUND always achievesthe required reliability whereas the existing protocols achievethe required reliability only when |U| = 0, i.e. , with nounexpected tags. As soon as the value of |U| increases, theactual reliability of existing protocols drops swiftly.

RUND is the fastest protocol that achieves the requiredreliability compared to the existing protocols even when thereare no unexpected tags in the population. Figure 10 showsthe average times each protocol took to either detect the firstmissing tag event or complete execution without detectingany missing tags. From this figure, MTI again seems to havesmaller detection time compared to RUND when number ofunexpected tags in the population is comparatively larger, butwhen we analyze these figures in conjunction with Figure 9,we see that actual reliability of MTI is close to 0 when numberof unexpected tags in the population is comparatively larger.

C. Impact of Number of Missing Tags on RUNI

RUNI identifies up to 100% of all the missing tags inf × v slots, regardless of the number of missing tags, whereas

Fig. 9. Actual reliability vs. # unexpected tags.

Fig. 10. Detection time vs. # unexpected tags.

Fig. 11. Percentage of identified missing tags vs. total number of missingtags.

Fig. 12. Percentage of identified missing tags vs. total number of unexpectedtags.

the existing protocols do not identify more than 30% of allthe missing tags in same number of slots. This can be seen inFigure 11, which shows the bar plots of number of missingtags identified by each protocol for m = 50, 300, and 500.In Figure 8, which shows the average times each protocoltook to detect the first missing tag event, we do not show aplot for RUNI because RUNI is same as RUND except that itdoes not stop after the first detection of missing tag event.Therefore, the average time RUNI takes to detect the firstmissing tag event is same as that of RUND.

D. Impact of Number of Unexpected Tags on RUNI

RUNI identifies the largest number of missing tagsin f×v slots among all the protocols even when |U| = 0. This


Fig. 13. Affect of difference between m and T .

can be seen in Figure 12, which shows the bar plots of numberof missing tags identified by each protocol for |U| = 0, 3000,and 5000. RUNI identifies up to 100% of all the missingtags, whereas the existing protocols do not identify more than60% of all the missing tags. Note that Figures 11 and 12 donot compare RUNI with TRP because TRP can not identify themissing tags; it can only detect that some tags are missing.

E. Impact of Deviation From Threshold

The actual reliability of RUND exceeds the required reliabil-ity when the number of missing tags in the population exceedthe threshold T , used to obtain the values of f , p, and n.This is seen in Figure 13, which plots the actual reliabilitiesof all protocols when number of missing tags are larger orsmaller compared to T . This figure is made using |E| = 1000,|U| = 10000, T = 200, α = 0.99, and m is varied from50 to 900. Note that the actual reliability of RUND is lessthan the required reliability when the number of missing tagsare less than T , but this is insignificant because as per therequirement, we are interested in detecting the missing tagsonly if the number of missing tags in a population exceed thethreshold T .

F. Estimation Accuracy

The estimates of number of missing tags obtained usingthe method described in Section V-G lie within ±5% of theactual number of missing tags when the required reliabilityα ≥ 0.95. The estimates are dependent on required reliabilitybecause required reliability is a factor that determines thenumber of frames n that the controller executes to ensurethat RUND achieves the required reliability. The larger thevalue of the required reliability α, greater is the value ofn and thus, more accurate are the estimates. Figure 14plots the coefficient of variation of the estimates of num-ber of missing tags obtained using the method described inSection V-G. Coefficient of variation, cv, is a statistical mea-sure that determines the deviation of data from its mean andis defined as the ratio of standard deviation in a set of valuesto the mean of those values. We see in Figure 14 that for α >0.95, cv < 0.03, which means that standard deviation is 3% ofthe mean. In a standard normal distribution, 99.7% values liewithin three standard deviations from mean. Therefore, almostall the values of the estimates lie within ±4.5% of the actualnumber of missing tags when α > 0.95.

Fig. 14. cv in estimates of missing tags.

G. Comparison With Tag ID Collection Protocol

RUND is faster than the fastest tag collection protocol, TH,whenever T ≥ 4 for α > 0.9 and T ≥ 7 for α > 0.99.This means that when the the required reliability is 0.9, itis faster to execute TH only when the threshold T is lessthan 4. Similarly, when the the required reliability is 0.99,it is faster to execute TH only when T < 3. This result isintuitive because when the threshold is small, RUND needs toexecute more slots to ensure that it does not fail to detect amissing tag. Note that in most large scale settings, such asin large warehouses, containing tens of thousands of tags, thethreshold T is set greater than 7 because it is quite likely thatdue to the communication errors between tags and readers,some tags may appear to be missing when they actually arenot. Therefore, in large scale settings, RUND will alwaysbe faster than TH as long as T ≥ 7. We achieved theseresults both theoretically and through simulations. Next, wefirst describe the theoretical relationship between the executiontimes (in terms of number of slots) of RUND and TH andthen describe how we conducted the simulations to verify thetheoretical results.

The authors of TH have shown that to collect the IDsof all tags in a population containing t tags, TH exe-cutes approximately 1.5t slots (refer to [18, Fig. 7(a)]. Letk represent the number of slots TH executes to collect theIDs of |U| + |E| − T tags. As per the results from [18],k ≈ 1.5× (|U|+ |E|−T ). To calculate the values of α and Tfor which the number of slots of RUND equal those of TH ,we put Sd = k. As Sd = f ×n, thus, f ×n = k. Substitutingf = k/n and p = 1 in Equation (5), we get an expression thatis only a function of n. Solving this expression with Equation(6), and putting k = 1.5 × (|U| + |E| − T ), we get

α = 1 − 0.51.5T ln 2 (17)

This equation expresses the relationship between the values ofα and T for which the execution times of TH and RUND areequal. Note that this equation does not depend on the valuesof |U| and |E|. Figure 15 shows a plot of this equation.

To validate this equation, we executed both RUND and THby varying |E| from 500 to 10000, |U| from 0 to 10000,and T from 100 to |E|/2. For each set of values of |E|, |U|,and T , we executed both RUND and TH 500 times to get theircorresponding average number of slots. In these simulations,to calculate the optimal values of n and f for RUND, foreach set of values of |E|, |U|, and T , we put the value


Fig. 15. α vs. T when time of TH and RUN equal.

of T in Equation 17 to obtain the corresponding value of α.We observed from these simulations that the execution timesof RUND and TH for each set of values of |E|, |U|, and T ,were almost the same. The maximum difference between theaverage values of the number of slots for RUND and of TH forany set of values of |E|, |U|, and T was not more than 0.01%.This validates Equation 17 and consequently the observationmentioned at the start of this subsection.

IX. CONCLUSIONS

The key technical contribution of this paper is in proposinga protocol to detect and identify missing tag events in thepresence of unexpected tags. This paper represents the firsteffort on addressing this important and practical problem. Thekey technical depth of this paper is in the mathematical devel-opment of the theory that RUND and RUNI are based upon.The solid theoretical underpinning ensures that the actualreliability of RUND is greater than or equal to the requiredreliability. We have proposed a technique that our protocoluses to handle large frame sizes to ensure compliance with theC1G2 standard. We have also proposed a method to implicitlyestimate the size of the unexpected tag population withoutrequiring an explicit estimation phase. We implemented RUND

and RUNI and conducted side-by-side comparisons withfour major prior missing tag detection and identificationprotocols even though none of the existing protocols handlethe presence of unexpected tags. Our protocols significantlyoutperform all prior protocols in terms of actual reliabilityand detection time.

REFERENCES

[1] Preliminary National Retail Security Survey Findings, accessed onJun. 22, 2012. [Online]. Available: https://nrf.com/news/national-retail-security-survey-retail-shrinkage-totaled-345-billion-2011

[2] C. Bordenave, D. McDonald, and A. Proutiere, “Performance of ran-dom medium access control, an asymptotic approach,” in Proc. ACMSIGMETRICS, 2008, pp. 1–12.

[3] K. Bu et al., “Who stole my cheese?: Verifying intactness of anonymousRFID systems,” Ad Hoc Netw., vol. 36, pp. 111–126, Jan. 2016.

[4] B. Carbunar, M. K. Ramanathan, M. Koyuturk, C. Hoffmann, andA. Grama, “Redundant reader elimination in RFID systems,” in Proc.IEEE SECON, Sep. 2005, pp. 176–184.

[5] B. Chen, Z. Zhou, and H. Yu, “Understanding RFID counting pro-tocols,” in Proc. 19th Annu. Int. Conf. Mobile Comput. Netw., 2013,pp. 291–302.

[6] Radio-Frequency Identity Protocols C1G2 UHF RFID Protocol forCommunications at 860 MHz–960 MHz, EPCglobal Inc., 2013.

[7] P. Flajolet and G. N. Martin, “Probabilistic counting algorithms for database applications,” Comput. Syst. Sci., vol. 31, no. 2, pp. 182–209, 1985.

[8] M. S. Humphreys, “Shoplifting—In 100 words,” Brit. J. Psychiatry,vol. 202, no. 2, p. 128, 2013.

[9] M. Kodialam and T. Nandagopal, “Fast and reliable estimation schemesin RFID systems,” in Proc. 12th Annu. Int. Conf. MobiCom, 2006,pp. 322–333.

[10] M. Kodialam, T. Nandagopal, and W. C. Lau, “Anonymous track-ing using RFID tags,” in Proc. 26th IEEE INFOCOM, May 2007,pp. 1217–1225.

[11] T. Li, S. Chen, and Y. Ling, “Identifying the missing tags in a largeRFID system,” in Proc. 11th ACM Int. Symp. MobiHoc, 2010, pp. 1–10.

[12] X. Liu, K. Li, G. Min, Y. Shen, A. X. Liu, and W. Qu, “Completelypinpointing the missing RFID tags in a time-efficient way,” IEEE Trans.Comput., vol. 64, no. 1, pp. 87–96, Jan. 2015.

[13] W. Luo, S. Chen, T. Li, and Y. Qiao, “Probabilistic missing-tag detec-tion and energy-time tradeoff in large-scale RFID systems,” in Proc.13th ACM Int. Symp. MobiHoc, 2012, pp. 95–104.

[14] SL2ICS11 I·CODE UID Smart Label IC Functional SpecificationDatasheet, Philips Semiconductors, Eindhoven, The Netherlands, 2004.

[15] C. Qian, H. Ngan, and Y. Liu, “Cardinality estimation for large-scaleRFID systems,” in Proc. 6th Annu. IEEE Int. Conf. PerCom, Mar. 2008,pp. 30–39.

[16] M. Roberti, “A 5-cent breakthrough,” RFID J., 2006. [Online]. Avail-able: http://www.rfidjournal.com/articles/view?2295

[17] M. Shahzad and A. X. Liu, “Every bit counts: Fast and scalable RFIDestimation,” in Proc. 18th ACM MobiCom, 2012, pp. 365–376.

[18] M. Shahzad and A. X. Liu, “Probabilistic optimal tree hopping for RFIDidentification,” in Proc. ACM SIGMETRICS, 2013, pp. 293–304.

[19] A. D. Smith, A. A. Smith, and D. L. Baker, “Inventory managementshrinkage and employee anti-theft approaches,” Int. J. Electron. Finance,vol. 5, no. 3, pp. 209–234, 2011.

[20] C. C. Tan, B. Sheng, and Q. Li, “How to monitor for missing RFIDtags,” in Proc. 28th IEEE ICDCS, Jun. 2008, pp. 295–302.

[21] S. Tang, J. Yuan, X.-Y. Li, G. Chen, Y. Liu, and J. Zhao, “RASPberry:A stable reader activation scheduling protocol in multi-reader RFIDsystems,” in Proc. 17th IEEE ICNP, Oct. 2009, pp. 304–313.

[22] H. Vogt, “Efficient object identification with passive RFID tags,” inPervasive Computing. Berlin, Germany: Springer, 2002.

[23] J. Waldrop, D. W. Engels, and S. E. Sarma, “Colorwave: A MACfor RFID reader networks,” in Proc. IEEE Wireless Commun. Netw.,Mar. 2003, pp. 1701–1704.

[24] R. Zhang, Y. Liu, Y. Zhang, and J. Sun, “Fast identification of themissing tags in a large RFID system,” in Proc. IEEE SECON, Jun. 2011,pp. 278–286.

[25] B. Zhen, M. Kobayashi, and M. Shimizu, “Framed ALOHA for multipleRFID objects identification,” IEICE Trans. Commun., vol. 88-B, no. 3,pp. 991–999, 2005.

[26] Y. Zheng and M. Li, “P-MTI: Physical-layer missing tag identifica-tion via compressive sensing,” in Proc. IEEE INFOCOM, Apr. 2013,pp. 917–925.

[27] Z. Zhou, H. Gupta, S. R. Das, and X. Zhu, “Slotted scheduled tagaccess in multi-reader RFID systems,” in Proc. IEEE ICNP, Oct. 2007,pp. 61–70.

Muhammad Shahzad received the Ph.D. degree incomputer science from Michigan State University in2015. He is currently an Assistant Professor withthe Department of Computer Science, North Car-olina State University, USA. His research interestsinclude design, analysis, measurement, and model-ing of networking and security systems. He receivedthe 2015 Outstanding Graduate Student Award, the2015 Fitch Beach Award, and the 2012 OutstandingStudent Leader Award at Michigan State University.

Alex X. Liu received the Ph.D. degree in computerscience from the University of Texas at Austin in2006. He is currently an Associate Professor with theDepartment of Computer Science and Engineering,Michigan State University (MSU), East Lansing, MI,USA. His research interests focus on networking,security, and dependable systems. He received theIEEE and IFIP William C. Carter Award in 2004and an NSF CAREER Award in 2009. He alsoreceived the MSU College of Engineering WithrowDistinguished Scholar Award in 2011.

Documents

3770 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 24, …20 feet; (2) active tags, which have their own power sources and have relatively longer communication range. A reader has a dedicated