13
Efficient Data Collection for Large-Scale Mobile Monitoring Applications Haiying Shen, Senior Member, IEEE, Ze Li, Lei Yu, Member, IEEE, and Chenxi Qiu Abstract—Radio frequency identification (RFID) and wireless sensor networks (WSNs) have been popular in the industrial field, and both have undergone dramatic development. RFID and WSNs are well known for their abilities in identity identification and data transmission, respectively, and hence widely used in applications for environmental and health monitoring. Though the integration of a sensor and an RFID tag was proposed to gather both RFID tag and sensed information, few previous research efforts explore the integration of data transmission modes in the RFID and WSN systems to enhance the performance of the applications. In this paper, we propose a hybrid RFID and WSN system (HRW) that synergistically integrates the traditional RFID system and WSN system for efficient data collection. HRW has hybrid smart nodes that combine the function of RFID tags, the reduced function of RFID readers, and wireless sensors. Therefore, nodes can read each other’s sensed data in tags, and all data can be quickly transmitted to an RFID reader through the node that first reaches it. The RFID readers transmit the collected data to the back-end servers for data processing and management. We also propose methods to improve data transmission efficiency and to protect data privacy and avoid malicious data selective forwarding in data transmission. Comprehensive simulation and trace-driven experimental results show the high performance of HRW in terms of the cost of deployment, transmission delay and capability, and tag capacity requirement. Index Terms—Radio frequency identification (RFID), wireless sensor networks (WSNs), distributed hash tables (DHTs), data routing Ç 1 INTRODUCTION R ADIO Frequency Identification (RFID) and wireless sensor networks (WSNs) are two of the most important systems widely used in many monitoring applications such as environmental and health monitoring and enterprise supply chains. WSNs are mainly used for monitoring physical or environmental condition, collecting environ- mental data such as temperature, sound. RFID is a technology that uses radio waves to transfer data between RFID tags and RFID readers (readers in short). RFID can be implemented on the objects to be identified, improving the efficiency of individual object tracking and management. More than 104 Wal-Mart stores have installed RFID systems to monitor the stock levels and track merchandizes in the supply chain [1] so that the products will not be out of stock or lost. RFID tag data usually is collected using direct transmis- sion mode, in which an RFID reader communicates with a tag only when the tag moves into its transmission range. If many tags move to a reader at the same time, they will contend to access the channels for information transmis- sion. Normally, the percentage of tags that can success- fully transmit their data in one transmission is merely 34.6 percent to 36.8 percent [2]. Such a transmission architecture for RFID data collection is not sufficient to meet the requirements of low economic cost, high performance and real-time individual monitoring in large-scale mobile monitoring applications. . Economic cost. An RFID reader cannot quickly receive information from tags due to its immobility and short transmission range. Thus, enormous RFID readers are required to increase their coverage for fast data collection. This would cause significant cost of the system deployment considering the high price of a high-quality RFID reader (at least $500) and the high cost of establishing connections between back-end servers and RFID readers. There- fore, it is important to constrain the number of RFID readers while still achieves efficient data collection. . High performance. In traditional RFID monitoring applications, such as supply chain management and baggage checking in Delta Airlines, an RFID reader is required to quickly process several tags at different distances. An RFID reader can only read tags in its range. Limited communication band- width, background noise, multi-path fading and channel accessing contention between tags would severely deteriorate the performance of the data collection. These problems can be avoided by transmitting data in short distances via the multi- hop data transmission mode in WSNs. . Real-time individual monitoring. In applications that require real-time monitoring on individual objects (e.g., endangered animals and patients), real-time information (e.g., body temperature, blood pres- sure, location) retrieval of individual objects is the most important. Though the integration of a sensor . The authors are with the Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634 USA. E-mail: {shenh, zel, leiy, chenxiq}@clemson.edu. Manuscript received 16 Jan. 2013; revised 14 Mar. 2013; accepted 12 Apr. 2013. Date of publication 18 Apr. 2013; date of current version 16 May 2014. Recommended for acceptance by X.-Y. Li. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference the Digital Object Identifier below. Digital Object Identifier no. 10.1109/TPDS.2013.122 1045-9219 Ó 2013 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/ redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 2014 1424

Efficient data collection for large scale mobile monitoring applications

Embed Size (px)

DESCRIPTION

2014 IEEE / Non IEEE / Real Time Projects & Courses for Final Year Students @ Wingz Technologies It has been brought to our notice that the final year students are looking out for IEEE / Non IEEE / Real Time Projects / Courses and project guidance in advanced technologies. Considering this in regard, we are guiding for real time projects and conducting courses on DOTNET, JAVA, NS2, MATLAB, ANDROID, SQL DBA, ORACLE, JIST & CLOUDSIM, EMBEDDED SYSTEM. So we have attached the pamphlets for the same. We employ highly qualified developers and creative designers with years of experience to accomplish projects with utmost satisfaction. Wingz Technologies help clients’ to design, develop and integrate applications and solutions based on the various platforms like MICROSOFT .NET, JAVA/J2ME/J2EE, NS2, MATLAB,PHP,ORACLE,ANDROID,NS2(NETWORK SIMULATOR 2), EMBEDDED SYSTEM,VLSI,POWER ELECTRONICS etc. We support final year ME / MTECH / BE / BTECH( IT, CSE, EEE, ECE, CIVIL, MECH), MCA, MSC (IT/ CSE /Software Engineering), BCA, BSC (CSE / IT), MS IT students with IEEE Projects/Non IEEE Projects and real time Application projects in various leading domains and enable them to become future engineers. Our IEEE Projects and Application Projects are developed by experienced professionals with accurate designs on hot titles of the current year. We Help You With… Real Time Project Guidance Inplant Training(IPT) Internship Training Corporate Training Custom Software Development SEO(Search Engine Optimization) Research Work (Ph.d and M.Phil) Offer Courses for all platforms. Wingz Technologies Provide Complete Guidance 100% Result for all Projects On time Completion Excellent Support Project Completion & Experience Certificate Real Time Experience Thanking you, Yours truly, Wingz Technologies Plot No.18, Ground Floor,New Colony, 14th Cross Extension, Elumalai Nagar, Chromepet, Chennai-44,Tamil Nadu,India. Mail Me : [email protected], [email protected] Call Me : +91-9840004562,044-65622200. Website Link : www.wingztech.com,www.finalyearproject.co.in

Citation preview

Page 1: Efficient data collection for large scale mobile monitoring applications

Efficient Data Collection for Large-Scale MobileMonitoring Applications

Haiying Shen, Senior Member, IEEE, Ze Li, Lei Yu, Member, IEEE, and Chenxi Qiu

Abstract—Radio frequency identification (RFID) and wireless sensor networks (WSNs) have been popular in the industrial field, andboth have undergone dramatic development. RFID and WSNs are well known for their abilities in identity identification and datatransmission, respectively, and hence widely used in applications for environmental and health monitoring. Though the integrationof a sensor and an RFID tag was proposed to gather both RFID tag and sensed information, few previous research effortsexplore the integration of data transmission modes in the RFID and WSN systems to enhance the performance of theapplications. In this paper, we propose a hybrid RFID and WSN system (HRW) that synergistically integrates the traditionalRFID system and WSN system for efficient data collection. HRW has hybrid smart nodes that combine the function ofRFID tags, the reduced function of RFID readers, and wireless sensors. Therefore, nodes can read each other’s sensed data intags, and all data can be quickly transmitted to an RFID reader through the node that first reaches it. The RFID readerstransmit the collected data to the back-end servers for data processing and management. We also propose methods to improvedata transmission efficiency and to protect data privacy and avoid malicious data selective forwarding in data transmission.Comprehensive simulation and trace-driven experimental results show the high performance of HRW in terms of the costof deployment, transmission delay and capability, and tag capacity requirement.

Index Terms—Radio frequency identification (RFID), wireless sensor networks (WSNs), distributed hash tables (DHTs), data routing

Ç

1 INTRODUCTION

RADIO Frequency Identification (RFID) and wirelesssensor networks (WSNs) are two of the most important

systems widely used in many monitoring applications suchas environmental and health monitoring and enterprisesupply chains. WSNs are mainly used for monitoringphysical or environmental condition, collecting environ-mental data such as temperature, sound. RFID is atechnology that uses radio waves to transfer data betweenRFID tags and RFID readers (readers in short). RFID can beimplemented on the objects to be identified, improving theefficiency of individual object tracking and management.More than 104 Wal-Mart stores have installed RFIDsystems to monitor the stock levels and track merchandizesin the supply chain [1] so that the products will not be outof stock or lost.

RFID tag data usually is collected using direct transmis-sion mode, in which an RFID reader communicates with atag only when the tag moves into its transmission range. Ifmany tags move to a reader at the same time, they willcontend to access the channels for information transmis-sion. Normally, the percentage of tags that can success-fully transmit their data in one transmission is merely34.6 percent to 36.8 percent [2]. Such a transmission

architecture for RFID data collection is not sufficient tomeet the requirements of low economic cost, highperformance and real-time individual monitoring inlarge-scale mobile monitoring applications.

. Economic cost. An RFID reader cannot quicklyreceive information from tags due to its immobilityand short transmission range. Thus, enormous RFIDreaders are required to increase their coverage forfast data collection. This would cause significantcost of the system deployment considering the highprice of a high-quality RFID reader (at least $500)and the high cost of establishing connectionsbetween back-end servers and RFID readers. There-fore, it is important to constrain the number of RFIDreaders while still achieves efficient data collection.

. High performance. In traditional RFID monitoringapplications, such as supply chain management andbaggage checking in Delta Airlines, an RFID readeris required to quickly process several tags atdifferent distances. An RFID reader can only readtags in its range. Limited communication band-width, background noise, multi-path fading andchannel accessing contention between tags wouldseverely deteriorate the performance of the datacollection. These problems can be avoided bytransmitting data in short distances via the multi-hop data transmission mode in WSNs.

. Real-time individual monitoring. In applications thatrequire real-time monitoring on individual objects(e.g., endangered animals and patients), real-timeinformation (e.g., body temperature, blood pres-sure, location) retrieval of individual objects is themost important. Though the integration of a sensor

. The authors are with the Department of Electrical and ComputerEngineering, Clemson University, Clemson, SC 29634 USA. E-mail:{shenh, zel, leiy, chenxiq}@clemson.edu.

Manuscript received 16 Jan. 2013; revised 14 Mar. 2013; accepted 12 Apr.2013. Date of publication 18 Apr. 2013; date of current version 16 May 2014.Recommended for acceptance by X.-Y. Li.For information on obtaining reprints of this article, please send e-mail to:[email protected], and reference the Digital Object Identifier below.Digital Object Identifier no. 10.1109/TPDS.2013.122

1045-9219 � 2013 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 20141424

Page 2: Efficient data collection for large scale mobile monitoring applications

and an RFID tag helps gather both RFID tag andsensed information [3], [4], [5], [6] from objects,quickly collecting the information still remains as achallenge.

In this paper, we propose a Hybrid RFID and WSNsystem (HRW) that synergistically integrates the RFID andWSN data transmission modes for efficient data collectionin large-scale monitoring applications for moving objects(e.g., environmental and health monitoring). HRW novellyleverages the integration to reduce the number of requiredRFID readers hence economic cost and enhance datatransmission efficiency. HRW has a new type of nodescalled Hybrid Smart Nodes (smart nodes/nodes in short)that combine the function of RFID tags, and the reducedfunction of wireless sensors and RFID readers. The systemmainly consists of three components: smart nodes, RFIDreaders and the back-end server infrastructure. The RFIDreaders collect data from smart nodes and transmit the datato the infrastructure. We summarize the contribution ofthis paper in below.

. Proactive data transmission. Inspired by the multi-hoptransmission mode in WSNs, rather than passivelywaiting for RFID readers to read data, smart nodesactively transmit data to readers in a multi-hopmanner. Smart nodes read tag data between eachother. In this way, instead of reading every tag oneby one when they move into the reading range, RFIDreader can receive the information of a group of tagsby reading only one first-encountered node. As aresult, the channel contention and noise interferenceduring the data transmission can be significantlyreduced. In the traditional WSN, a node in thesleeping mode cannot receive and forward data. InHRW, a node can read data from the RFID tag ofanother node even if it is in sleep mode, whichgreatly increases transmission efficiency.

. Algorithms to enhance efficiency. We further improvethe information collection efficiency by lettingcluster nodes replicate their data to each other orto one specified cluster head that has high encoun-tering frequency with cluster nodes and RFIDreaders. We also propose a tag clean-up algorithmto remove delivered data from tags to reducetransmission overhead.

. Security strategies. We propose solutions to handletwo security threats in the data communication toreaders to reduce privacy and security risks.

Our comprehensive simulation and trace-driven exper-imental results show that HRW can reduce the number ofRFID readers, the transmission delay of each node, and thedemand on the capacity of tags, compared to the traditionalRFID monitoring system. The results also show theeffectiveness of our proposed algorithms to enhance theefficiency and security. Compared to our previous confer-ence version of this work [7], this paper enhances theperformance of HRW with cluster-based data transmissionand security mechanisms. It also presents theoreticalanalysis and additional experimental results.

The rest of this paper is organized as follows. Section 2describes the HRW system for monitoring applications.Section 3 presents our security mechanisms. Section 4presents the simulation results. Finally, Section 5 concludesthis paper with remarks on future works. In the supple-ment material http://doi.ieeecomputersociety.org/10.1109/122, we present a theoretical analysis of our proposedmethods compared to the traditional RFID system, a bloomfilter based data indexing to efficiently check redundantdata in the local tag that does not need to be replicated orneeds to be removed, additional experimental results andan overview of related works.

2 HYBRID RFID AND WSN SYSTEM (HRW)2.1 Hybrid Smart NodesHRW has smart nodes that synergistically integrate RFIDand WSN functions. A smart node has the following typicalcomponents.

. Reduced-function sensor. Unlike the normal sensors,this sensor does not have transmission function. Itcollects the environmental data and the sensed data(e.g., pressure, temperature) from hosts.

. RFID tag. As the normal RFID tags, it serves astraditional packet memory buffer for informationstorage. The RFID information such as identity andproperties is configured into the RFID tag during theproduction stage.

. Reduced-function RFID reader (RFRR). It is used forthe data transmission between smart nodes. A smartnode uses RFRR to read other smart nodes’ tags andwrite the information into its own tag.

RFRR can just be a simple ultra-high frequency readermodule from traditional RFID readers. An RFID readermodule can be as low as $29, which costs much less than ahigh-quality RFID reader (at least $500). Using RFRR, nodescan exchange their tag data in a proactive manner. RFRRalso helps to store the data sensed from monitored hostsand environment into the tags. The smart nodes are feasibleto build as they consist of simpler and partial parts fromnodes integrating RFID tag and sensor functions [3], [4], [5],[6], [8] and RFRR. Compared to RFID tags, HRW achieveshigher performance at the cost of additional components ofreduced-function sensor and RFRR for each node. Howev-er, this additional cost is much less than the cost of itsreduced many high-cost RFID readers. The nodes withintegrated RFID tag and sensor functions can also use HRWfor efficient data collection with RFRR modules.

Each smart node has two modes: sleep mode and activemode. In the active mode, the sensor unit in the smart nodecollects the physical information of the smart node host andasks RFRR to write the data into the node’s tag. While in thesleep mode, smart nodes do nothing. The tag informationin a node can be read by other active nodes; no matter itis in sleep mode or not [2]. Since there are many smartnodes in the system, and the transmission of the collectedinformation to RFID readers is delay tolerant, it is notnecessary to let all of the smart nodes remain active all

SHEN ET AL.: EFFICIENT DATA COLLECTION FOR MOBILE MONITORING APPLICATIONS 1425

Page 3: Efficient data collection for large scale mobile monitoring applications

the time, which otherwise consumes considerable batterypower.

2.2 Proactive Data TransmissionFig. 1 shows the traditional RFID architecture, and Fig. 2shows the architecture of the HRW system. Botharchitectures are hierarchical. The upper layer is com-posed of RFID readers connected to the back-endinfrastructure with high-speed backbone cables. Theback-end infrastructure connects to the applications (e.g.,database in a hospital). The lower layer is formed by aconsiderable number of object hosts that transmit data toRFID readers. The difference between these two architec-tures is the transmission mode. In Fig. 1, only the nodes(hosts) in the transmission range of RFID readers can sendtheir tag information to the RFID readers. As explained inSection 1, this direct transmission mode would lead tochannel contention and hence low successful transmissionrate and slow data collection. In Fig. 2, the nodes are smartnodes that can exchange and replicate tag information witheach other using wireless RF channels. Each RFID readerreads tags within its transmission range. Since the data canbe transmitted to the RFID reader using a multi-hoptransmission mode, each RFID reader can also receive theinformation in tags outside of its transmission range. In thisway, HRW can quickly collect data and expedite the datacollection.

After smart node A collects the sensed data, itappends the sensed data with a timestamp and storesthe data in its tag through RFRR. Fig. 3 shows an exampleof data collection process of two smart nodes. After thesensor unit in a smart node collects the information aboutits tag host (Step 1), it asks RFRR to store the informationinto its tag (Step 2). Once two nodes move into thetransmission range of each other, the RFRR in a node readsthe information stored in another node’s tag (Step 3). Basedon the host ID and timestamp, the node checks if it hasstored the information previously. If not, the RFRR thenstores the acquired information into the local tag (Step 4).

When node i replicates node j’s data, node i alsorecords the timestamp of the replication time denoted bytij. Next time when node i meets node j, node i will notreplicate node j’s data with timestamps prior to tij.Suppose the timestamp of smart node 3 for node 4 is11230337, which represents the time 03:37 am, Nov. 23th.When node 3 meets node 4 next time, node 3 ignores theinformation with timestamp less than 11230337 in theinformation replication. In this way, smart nodes avoid

recording duplicated information, and hence avoid theunnecessary overhead in the transmission. Algorithm 1shows the pseudo-code of the information collectionprocess.

By the data exchange, the data of a node can bereplicated to a number of other nodes in the system andany one of these nodes that meets the RFID reader cantransmit the information to the reader. In this way, thelikelihood that the information is delivered to the RFIDreader is greatly increased and the number of RFID readersneeded for fast information delivery is reduced.

Algorithm 1 Pseudo-code of the process of informationreplication executed by smart node i.

1: if this.state ¼ active then

2: Collect the sensed data of its host Di

3: ==StoreDi into its tagi4: StoreðDi; tagiÞ5: for every node j in its transmission range do

6: if this.linkAvailable(j) then

7: Read data Dj with timestamp 9 tij from tagj8: //Store data Dj in itstagi9: StoreðDj; tagiÞ

10: Update timestamp tij with current time

11: end if

12: end for

13: end if

Fig. 2. HRW architecture.

Fig. 3. Replication process of two smart nodes.Fig. 1. Traditional RFID architecture.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 20141426

Page 4: Efficient data collection for large scale mobile monitoring applications

When a node enters the reading range of an RFID reader,the RFID reader reads the information in the node’s tag. Ifseveral nodes enter the range of RFID reader at the sametime, the RFID reader gives the first meeting tag the highestpriority to access the channel, reducing channel contentionand long distance transmission interference. The RFIDreader can erase the information in the tag once afterobtaining it. Algorithm 2 shows the pseudocode of thereading process of an RFID reader.

Algorithm 2 Pseudo-code of the process of informationreading executed by RFID reader i.

1: for every node j in its transmission range do

2: if this.linkAvailable(j) then

3: Read data Dj from tagj in node j

4: //Store data Dj in storage Si in the RFID reader5: StoreðDj; SiÞ6: Erase Dj from tagj in node j

7: end if

8: end for

With this data transmission algorithm, after an RFIDreader receives the information of a node, many nodes stillhold the replicas of the information. Exchanging suchdelivered and redundant information incurs high trans-mission overhead but does not contribute to informationcollection. In order to reduce the unnecessary messagetransmission, we use a tag clean-up algorithm to delete thedelivered messages in the system. Specifically, after anRFID reader reads the information from a node, the readersends the node a directory containing the tag IDs andtimestamps of recently received data items. This directoryhas a TTL (Time to live) with it. It is then broadcastedamong nodes and will be deleted when TTL expires. Afterreceiving the directory, the nodes delete the deliveredinformation in their own tags. Considering that the time-stamp and ID of each information item have much less sizethan a complete data item, the overhead of the directorybroadcasting is small.

2.3 Cluster-Based Data TransmissionReplicating data between any two encountered smartnodes generates a high cost. Concurrent data transmissionfrom many nodes to an RFID reader causes channel accesscongestion. Also, it is not easy to erase duplicate data that isalready reported to the RFID readers from replica nodes.We propose enhanced data transmission algorithms tomitigate these problems.

A simple algorithm to reduce the cost is to enable asource node to replicate its data to a limited number ofnodes. Here, we describe two enhanced algorithms calledcluster-member based and cluster-head algorithms, inwhich smart nodes are clustered to different virtualclusters and each cluster has a cluster head. In thecluster-member based algorithm, cluster members repli-cate their tag data between each other. When a clustermember of a virtual cluster enters the reading range of anRFID reader, by reading the aggregated tag informationfrom the cluster member, the RFID reader receives all

information of nodes in this virtual cluster. In the cluster-head based algorithm, cluster members replicate their tagdata to the cluster head. When a cluster head of a virtualcluster reaches an RFID reader, the RFID reader receivesall information of nodes in this virtual cluster. Thisenhanced method greatly reduces channel access conges-tion, reduces the information exchanges between nodesand makes it easy to erase duplicate information in acluster. The method is suitable to the applications wheremonitored objects (e.g., zebras, birds, people) tend to movein clusters.

Algorithm 3 Pseudo-code of cluster head determinationand data transmission conducted by smart node i.

1: Receive cluster head candidates from an RFID reader

2: for each cluster head candidate j do

3: Calculate ðfnij � frjÞ4: end for

5: Choose the cluster head with maxðfnij � frjÞ6: if it is a cluster head and meet its cluster member then

7: Read data from the cluster member

8: end if

9: if it is a cluster head and meet an RFID reader then

10: Send its data to the RFID reader

11: end if

To form the clusters in the cluster-member basedalgorithm, nodes report their encountering frequency tothe server through the RFID readers. The server formsnodes with high encountering frequency into a clusterusing the method in [9] and notifies the cluster nodesthrough the RFID readers. The cluster head for a cluster canbe selected in a number of ways depending on theapplication requirement. For example, in a health moni-toring application where real-time data collection isrequired, the nodes with the most contact frequency withcluster members and RFID readers should be the clusterheads. In the supply chain where nodes are always close toeach other, the nodes with the highest energy should be thecluster heads. We use the former example to show how tochoose cluster heads. Algorithm 3 shows the pseudocode ofcluster head determination and data transmission con-ducted by each smart node in the second algorithm. RFIDreaders record the meeting frequency with each node andreport the data to the back-end server. The server calculatesthe sum of the frequencies from different readers for eachnode j, denoted by frj , and selects N nodes with the highestfrj as the cluster heads. The information of the selectedcluster heads along with their fr is transmitted back to theRFID readers, which will forward the information to thenodes. We use fnij to denote the meeting frequencybetween node i and a cluster head j. A node measures itsfnij � frj for each cluster head candidate, and selects the onewith the highest metric as its cluster head. The metric offnij � frj indicates how fast cluster head j can forward nodei’s data to an RFID reader. Through RFID readers, eachnode reports its selected cluster head to the server and theserver then notifies all heads about their cluster members.

SHEN ET AL.: EFFICIENT DATA COLLECTION FOR MOBILE MONITORING APPLICATIONS 1427

Page 5: Efficient data collection for large scale mobile monitoring applications

The head determination can also be solely conducted at theserver to reduce the communication. As a result, eachcluster head is associated with a group of nodes, and it canmost quickly forward the data to RFID readers for itscluster members.

In the HRW system, since the data is stored in tags,active nodes can retrieve the information at any time from asleeping node. In traditional WSNs, however, nodes insleeping mode cannot conduct data transmission. There-fore, the HRW system can greatly improve packet trans-mission efficiency with the RFID technology.

3 COMMUNICATION SECURITY MECHANISMS

The multi-hop message transmission mode in HRWimproves the communication efficiency. However, suchmethod introduces privacy and security risks. Low-costRFID nodes are not tamper-resistant and deployed in openenvironment, thus the attackers can easily physicallyaccess and take control of these nodes. The attacker canobtain all the information in the compromised nodes anduse the compromised nodes to obtain sensitive informa-tion and disrupt system functions. Thus, in this section,we consider two security threats arising from nodecompromise attacks: data manipulation and data selectiveforwarding.

3.1 Data Privacy and Data ManipulationIn the system, each smart node replicates its information toother nodes. Once a node is compromised, all theinformation of other nodes is exposed to the adversaries,which is dangerous especially in privacy sensitive applica-tions such as health monitoring. A malicious node can alsomanipulate the gathered information and provide falseinformation to the readers. Therefore, it is important toprotect the confidentiality and authenticity of tag informa-tion in data transmission.

Public key operations are too expensive for the smartnodes due to their limited computing, storage andbandwidth resources. We then develop a symmetric keybased security scheme in our system. In this paper, wefocus on the threats due to the compromised smart nodesand assume the readers are secure. In our securityscheme, each smart node N is initially assigned with anindividual key KN . The pairs ðN;KNÞ of all smart nodesare stored in a central server, which can be securelyaccessed by the readers. To achieve data confidentiality,each smart node N generates a temporary keyK0N ¼ HðNoncejKNÞ, where Nonce is a nonce numberwhich can be the timestamp of RFID data, Hð�Þ is a

system-wide secure hash function known by every node,and ‘‘j’’ represents the concatenation of two strings. NodeN uses this symmetric key to encrypt its data DN and sendsthe encrypted data, denoted by EnðK0N;DNÞ, to othernodes. The use of temporary keys for every data transmis-sion further enhance the security against the ciphertext-only attacks by interpreting historical transmissions. Toprotect data authenticity, node N also computes themessage authentication code with the temporary key K0N ,denoted by MACðK0N;NjDNÞ. The message from a smartnode is in the format of

N;Nonce; En K0N;DN

� �;MAC K0N;N jDN

� �� �:

Fig. 4 shows the procedure of data reading withencryption and authentication. When a reader receivesthe data, it first sends to the central server the tag ID N andNonce. The server finds KN and computes the temporarykey K0N , and then securely sends K0N to the reader. Afterreceiving K0N , the reader is able to decrypt the data DN

from EnðK0N;DNÞ and then verifies whether MAC iscorrect. If the recomputed MAC is consistent with theMAC received from the smart node, the reader considersthe MAC is correct and the data set is authentic. Otherwise,the EnðK0N;DNÞ is changed by an adversary node.

To avoid being detected for changing data, an adversarymay launch old message replay attack by replacing a newmessage from a node with an old message from the node.When a reader forwards the N and Nonce to the centralserver, the central server can easily detect outdated noncevalues which were reported previously. As a result, the oldmessage replay attack can be detected.

Once a smart node N is compromised, its individual keyKN is exposed and the adversary can derive all previoustemporary keys to decrypt data in the old messages. Thus,it is important to achieve the backward security byupdating the individual key periodically. However, peri-odically distributing new keys from a central server to allsmart nodes incurs expensive communication cost. There-fore, we propose a key hash chain method to avoid the keydistribution cost. Initially, each node is loaded with a keyhash chain computed with a secure one-way hash functionHð�Þ as follows:

K0 ¼)H

K1 ¼)H

K2 � � � ¼)H

KL (1)

where Ki ¼ HðKi�1Þ. The smart node uses a key as itsindividual key on the chain in the order from K0 to KL. Itperiodically updates its individual key from Ki to Kiþ1 anderases Ki from its storage. Because H is one-way hashfunction, even the attacker obtains Ki, it cannot derive anykeys Kj with j G i, and thus cannot decrypt previoustransmissions encrypted by Kj.

In a large-scale system with a large amount of nodes, itcould be an expensive and time-consuming operation tofind the individual key of a specific smart node among allnodes’ keys. The searching time is linear to the totalnumber of nodes. We propose two methods to resolve thisproblem. First, we propose to compute individual keys inrun time rather than storing all keys in advance andsearching keys on-demand. To this end, the central servermaintains a secret key Kc. For each node with the tag ID

Fig. 4. Procedure for secure data reading and verification.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 20141428

Page 6: Efficient data collection for large scale mobile monitoring applications

N , its individual key KN is computed by the cryptogra-phical secure hash function H with Kc, i.e., KN ¼HðNjKcÞ. In this way, the server does not need to storeany individual keys. When receiving the tag ID N , theserver directly recomputes HðN jKcÞ and obtains theindividual key KN , which avoids the searching. Sincethe computation time of the hash function is independentof the number of nodes, the time for finding individualkeys can be significantly reduced in large-scale systemscompared to linear searching. Second, we proposedistributed key storage in the back-end servers. We formthe back-end servers into a distributed hash table (DHT).The DHT overlay supplies Insert(key, data) and Lookup(key)functions. Insert(key, data) stores the data into its assignedserver. Lookup(key) retrieves the data with the indicatedkey. Regarding key here as the consistent hash [10] valueof a node’s tag ID and data here as the node’s individualkey, the back-end servers can efficiently store and retrieve anode’s individual key usingOðlognÞ path length, where n isthe number of nodes.

3.2 Data Selective ForwardingIn the cluster-head based transmission algorithm, thecluster head in each cluster is responsible for forwardingthe tag data of all cluster members to the reader. Amalicious cluster head can drop part of the data andselectively forward the gathered information to the reader.Since an RFID reader may not know all the smart nodes in ahead’s cluster in advance, it cannot detect such attacks. Toprevent the selective forwarding attack, we can exploit thecluster-member based data transmission algorithm, inwhich all cluster members hold the data of all other nodesin the cluster. A reader can compare cluster members’reported data with the cluster head’s reported data toverify the correctness of the latter.

We use Dall to denote the set of all encrypted tag dataðN;Nonce; EnðK0N;DNÞ;MACðK0N;N jDNÞÞ in a cluster. Af-ter node N collects encrypted data from all other nodes inits cluster, it creates its MAC on DallN and sends itsðN;Nounce;MACðK0N;N jDallN ÞÞ to the reader.

After receiving Dallc from a cluster head and the MACsof Dall from cluster members, the reader can verify theauthenticity of Dallc . Based on a cluster member’s N andK0N , the reader creates MACðK0N;N jDallcÞ and compares itwith the received MACðK0N;N jDallN Þ from node N . If twoMAC values are different, it means that the data from thecluster head or from node N is not valid. After conductingmany comparisons for many cluster nodes, if the majoritycomparisons are valid, then the data from the cluster headshould be valid, otherwise, it is not valid.

Obviously, it causes excessive communication cost if thereader needs every cluster member to send its MAC forDall. A simple solution is to let the reader only collectMACs from T ðT � 1Þ number of cluster members. Once Tnumber of MACs are collected, the reader verifies theauthenticity of the data set and considers it valid if all theMACs are correct. However, this method cannot preventthe collusion attack of multiple compromised nodes.Suppose that a node sent a pruned data set to the reader,other T compromised nodes can compute valid MACs forthe pruned data set and send them to the reader.

To prevent the collusion attack, we propose a securerandomized solution, in which each smart node randomlydecides whether to send its MAC to the reader. Suppose Fis a cryptographically secure pseudo-random functionwhich uniformly maps the input values into the range of[0, 1). Each node N checks the inequality

F N jK0N� �

G �ð0 G � G 1Þ; (2)

where � is a threshold which decides the expected numberof MACs the reader will receive. If the inequality holds, thenode sends its MAC to the reader. Otherwise, it does not.As a result, each smart node in the cluster has a probabilityof � to send its MAC to the reader. When the readerreceives the MAC from a smart node N , it recomputes Fand accepts the MAC only when the inequality holds. Oncethe reader finds that all received MACs are correct, itconsiders the data set valid and complete. In this way, thecollusion attack is prevented through verifying the legiti-macy of nodes for providing their MACs (i.e., checkingwhether Inequality (2) holds), while the communicationcost between the nodes and the reader is reduced. Thethreshold � is a system parameter loaded into the tag nodesand servers when the system is initialized. Larger thresh-old means stronger security strength at the expense ofhigher communication cost. The threshold � is initiallydecided by the users according to their security strengthdemand.

4 PERFORMANCE EVALUATION

4.1 Evaluation on Data TransmissionSince the transmission links between hosts are usuallyintermittently connected, we created a delay tolerantnetwork environment for the performance evaluation.

The simulation was built on a custom discrete event-driven simulator [11].

We use the random way-point model [12] to simulate thesituation where there is no movement pattern and use thereal mobile traces [13], [14] to simulate the situation wherethere is movement pattern for the applications wheremonitored objects (e.g., zebras, birds, people) tend to movein clusters. In the random way-point model, each nodewaits for a pause time randomly chosen from (1-5)s, thenmoves to another random position with a speed chosenbetween 1 to 10 m/s. In the simulation, 50 nodes and 5RFID readers were independent identically distributed(i.i.d.) over a 600 m � 700 m area.

We used two transmission modes in HRW: epidemic[15] and source-replication. In the epidemic transmission,the packets of nodes are replicated to other nodes withinTTL hops, which was set to 6 by default. In the source-replication transmission, a source node allows a certainnumber (10 by default) of nodes to read its packets. Wecompared these methods with the ‘‘direct’’ transmissionmethod in the traditional RFID systems, in which a nodekeeps its collected information in its tag until it reaches therange of an RFID reader. If one of the copies of a packetarrives at an RFID reader, we consider this packetsuccessfully delivered. We only considered the firstdelivered replica of a packet in the measurement.

SHEN ET AL.: EFFICIENT DATA COLLECTION FOR MOBILE MONITORING APPLICATIONS 1429

Page 7: Efficient data collection for large scale mobile monitoring applications

The entire simulation time was set to 4000 s. Thewarmup time was set to 100 s during which nodesrandomly move around. Then, during the following1000 s, at every second, we randomly selected a node togenerate a packet in its own tag. By default, the active timeof each node was randomly chosen from 10-15 s and thesleeping time was randomly chosen from 0-10 s. Unlessotherwise specified, the tag capacity of each node was set to30 packets, and the reading range of the RFID and smartnode readers was set to 30 m. There was no tag capacitylimit for RFID readers. In the simulation, a node droppedpackets if its tag was full and these dropped packets wouldnot be retransmitted. The packets dropped by sleepingnodes would be retransmitted. We run each simulation testfor 10 times and report the average value.

4.1.1 Comparison of Data Delivery DelayWe use the delivery latency of the copy of a packet that firstarrives at an RFID reader as the packet’s transmissiondelay. We calculated the average transmission delay perpacket for successfully delivered packets. Fig. 5 shows theaverage transmission delay versus the tag capacity withtwo different reading ranges of all nodes denoted byR ¼ 20 m and R ¼ 40 m. Comparing the two figures, wefind that as the reading range increases, the averagetransmission delay of all three protocols decreases. This isbecause a larger reading range makes it easier for a node tofind other neighbor nodes, which may either be the RFIDreaders or promising relay nodes moving to RFID readers.Moreover, the movement speed of the electromagneticwaves is much faster than the nodes, thus message deliverydelay is reduced with a larger transmission range. In eachfigure, we find that as the tag capacity grows, the averagetransmission delay also increases. With a smaller tagcapacity, more packets are dropped without retransmis-sion, leading to lower average transmission delay. With the

increase of the tag capacity, more packets are able to residein the tags long enough to be delivered to readers, leadingto longer transmission delay.

The comparison of the three protocols indicates that thedirect transmission always produces higher average trans-mission delay than epidemic and source-replication. Epi-demic and source-replication proactively transmit data toRFID readers using multi-hop routing, while the directtransmission lets nodes wait until meeting readers totransmit data. The result indicates the higher efficiency ofthe HRW transmission mode than the traditional RFIDsystem.

We see that when R ¼ 20 m and the tag capacity is lessthan 15, source-replication has higher delay than epidemic,and when the tag capacity is larger than 15, source-replication has lower delay than epidemic. In the formercase, many packets are dropped in epidemic. Such packetdropping reduces the delivery delay in epidemic. Source-replication does not have so many drops, thus it has longerdelay. As the tag size increases, more packets can bebuffered in the tag. As some of the previously droppedpackets have long transmission delay, the average delay inepidemic increases. However, as source-replication is notgreatly affected by the tag size, the delay performance doesnot increase significantly. When R ¼ 40 and tag capacity issmall, source-replication leads to higher delay thanepidemic due to the same reason as when R ¼ 20 m.

4.1.2 Comparison of Data Delivery CapacityFig. 6 shows the number of successfully delivered packets(i.e., received packets) versus the tag capacity. Both figuresindicate that as the tag capacity increases, the number ofreceived packets increases due to fewer packet drops asexplained previously. Comparing the two figures, we seethat as the transmission range of the nodes increases, thenumber of received packets also increases. The reason is

Fig. 5. Transmission delay versus tag capacity. (a) Range ¼ 20 m. (b) Range ¼ 40 m.

Fig. 6. Delivery capability versus tag capacity. (a) Range ¼ 20 m. (b) Range ¼ 40 m.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 20141430

Page 8: Efficient data collection for large scale mobile monitoring applications

that with the increasing reading range, it is more likely for acertain node to meet an RFID reader or to find moreneighbor nodes for the information replication. WithR ¼ 40 m, when the tag capacity is larger than 10, allpackets in three protocols can be successfully delivered.This is because packets can be quickly transmitted to RFIDreaders without a long-time buffering, leading to fewerpacket drops.

Fig. 6 also shows that the direct transmission suffersfrom more congestion than the other two protocols whena tag has small capacity. This is because the transmis-sion delay of direct transmission is much longer than theother two protocols, so that nodes have much less freetag buffer than the nodes in the other two protocols.Therefore, when the tag capacity of nodes is limited, thetraditional RFID system is not a wise choice forenvironmental and health monitoring applications.Wecan also see that as the tag size increases, the number ofdelivered packets in the direct transmission increasesmuch faster than those in source-replication and epidemicand finally exceeds them. This is because the nodes in thesource-replication and epidemic need to buffer multiplecopies of a packet, thus resulting in more packet drops dueto tag capacity limitation. Source-replication producesmore successfully delivered packets than epidemic becauseit buffers fewer duplicated copies of a packet, leading tofewer drops of different packets.

4.2 Evaluation on Cluster-Based Data TransmissionWe then compare the cluster-head based and source-replication (i.e., cluster-member) based data transmission.In the simulation, every 5 nodes form a cluster and movetogether to random places based on the random way-pointmodel to simulate the group moving features of the

monitored objects. During the 1000 s packet initiationtime, every node generates one packet every 100 s.According to the specification of MICA2 motes [16], weassume that when reading range R ¼ 20 m and R ¼ 40 m,the energy consumed to receive a byte is 0.0057 micro-joules (mJ) and 0.0228 mJ, and to transmit a byte takes0.0144 mJ and 0.0576 mJ, respectively. The size of a tagdata is 30 bytes. We also evaluated the tag clean-upalgorithm in this section.

Fig. 7 shows the comparison results of the averagetransmission delay versus the network size excludingreaders when R ¼ 20 m and R ¼ 40 m, respectively. Wesee that as the network size increases, the packet transmis-sion delay of both algorithms decreases slightly. The reasonis that given the same number of packets, increasing thenumber of nodes in the same area increases the nodedensity. Therefore, source nodes gain higher probability ofmeeting other nodes or cluster heads to forward theirpackets, which reduces the transmission delay. We see thatsource-replication decreases slightly faster than the cluster-head as the network size increases. This is because theprobability of the head of a cluster to meet readers is notincreased as much as that of any node in a cluster as thenetwork size increases.

We also see that cluster-head has longer delay thansource-replication. In the cluster-head method, the clusterhead holds the replicas of the packets in the cluster andsends the replicas to an RFID reader when it meets an RFIDreader. In the source-replication method, every node in acluster holds a copy of packets from other nodes in thecluster. All information can be transmitted to an RFIDreader whenever one cluster member meets an RFIDreader, which greatly reduces the packet transmissiondelay.

Fig. 7. Comparison of transmission delay versus network size. (a) Range ¼ 20 m. (b) Range ¼ 40 m.

Fig. 8. Comparison of the delivery capacity versus network size. (a) Range ¼ 20 m. (b) Range ¼ 40 m.

SHEN ET AL.: EFFICIENT DATA COLLECTION FOR MOBILE MONITORING APPLICATIONS 1431

Page 9: Efficient data collection for large scale mobile monitoring applications

Comparing Figs. 7a and 7b, we find that as the transmissionrange of the nodes increases, the packet transmission delaydecreases. This is because a larger transmission rangeenables the nodes to communicate with RFID readers atlonger distances, which decreases the packet transmissiondelay. Both figures show that the tag clean-up algorithmhelps to reduce transmission delay. This algorithm reducesthe number of redundant packets in the tag, saving morespace for the transmission of other undelivered packets.More replicas of a packet increases the probability that onereplica is delivered, thus reducing its transmission delay.

Fig. 8 shows the comparison results of the number ofdelivered packets versus different network sizes. We seethat as the network size increases, the number of deliveredpackets in the system increases. The reason is that morenodes in the system generate more packets and alsoincrease the network density, thus increasing the proba-bility of successful packet delivery. We also see that whenthe transmission range of nodes is 40 m, all methods candeliver all packets due to the same reason as for Fig. 6.When R ¼ 20, no less than 97 percent packets weresuccessfully delivered. Also, source-replication has veryslightly less number of delivered packets than othermethods. As source-replication does not use the clean-upalgorithm and the probability for a cluster member to meetan RFID reader is smaller with a small reading range,packets are more likely to be dropped because of thecongestion.

We define the transmission overhead as the number oftransmission operations between nodes, and betweennodes and RFID readers for all packets. Fig. 9 shows thecomparison results of transmission overhead versus thenetwork size. We see that cluster-head generates much lessoverhead than source-replication. This is because incluster-head, only the cluster head holds the replicas of

the packets in all nodes in the cluster while in source-replication, every node in the cluster holds a replica of thepackets of all other cluster members. We also see that as thenetwork size increases, the overhead grows since morepackets are initiated. We also see that the transmissionoverhead increases as the number of nodes increases, but itis not greatly affected by the reading range. The transmis-sion overhead increases in proportion to the number ofnodes. After a cluster head gathers all information in itscluster in cluster-head or after cluster members exchangethe information in source-replication, no more additionalreplicas are generated. As nearly all packets were finallydelivered to the readers in the simulation time, thetransmission range does not affect the transmissionoverhead greatly. The figures also show that the tagclean-up algorithm reduces the transmission overhead.Since this algorithm helps nodes reduce redundant pack-ets, fewer outdated information is exchanged among nodesand between nodes and readers, leading to less overhead.

Fig. 10a shows the comparison results of the averagetransmission delay versus different cluster sizes. As thecluster size increases, the transmission delay of bothtransmission algorithms decreases. In cluster-head, as thecluster size increases, the cluster head holds more infor-mation of the nodes in the system. Therefore, once thecluster head meets an RFID reader, all information of thecluster is delivered to it at one time. This greatly reducesthe packet transmission delay when the packets are sent tothe RFID readers separately by multiple nodes. Similarly,in source-replication, with more nodes in a cluster, moreinformation of the nodes is gathered for transmission. Oncea cluster member meets an RFID reader, all information ofthe cluster members is delivered. Since the probability thatany cluster member of a cluster meets an RFID reader ishigher than the probability that a cluster head meets an

Fig. 9. Comparison of transmission overhead versus network size. (a) Range ¼ 20 m. (b) Range ¼ 40 m.

Fig. 10. Transmission delay and overhead versus cluster size. (a) Transmission delay. (b) Transmission overhead.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 20141432

Page 10: Efficient data collection for large scale mobile monitoring applications

RFID reader, the transmission delay of source-replication ismuch lower than cluster-head. We also see that the tagclean-up algorithm reduces the packet transmission delaydue to the same reason as for Fig. 7.

Fig. 10b shows the transmission overhead versus thenumber of nodes in a cluster. In source-replication, everynode in a cluster needs to exchange packets with eachother. In cluster-head, only the cluster head needs to collectpackets from its cluster members. Therefore, the transmis-sion overhead of source-replication is much higher thanthat of cluster-head. We see that the tag clean-up algorithmreduces the transmission overhead due to the same reasonas for Fig. 9.

Fig. 11 shows the comparison results of total transmis-sion energy cost versus the network size. We see that theresults of transmission energy cost of different methodscoincide with the corresponding transmission overhead inFig. 9, since the transmission overhead represents thenumber of packet transmissions. However, unlike Fig. 9,the total transmission energy cost for each method withR ¼ 40 is significantly larger than the corresponding resultwith R ¼ 20, since larger reading range requires highertransmission power. Due to the consistent relationshipbetween the transmission energy cost and the transmissionoverhead, we focus on the transmission overhead in thefollowing evaluations, which directly reflects the transmis-sion energy cost.

Fig. 12a compares the energy cost of a cluster head withthat of a cluster member in a cluster-head based transmis-sion for collecting the data from all nodes in the cluster. Aswe see, in both cases of R ¼ 20 and R ¼ 40, the transmis-sion energy cost of the cluster head is significantly larger

than that of each cluster member. The energy cost gapbetween a cluster head and a cluster member becomeslarger when the number of nodes increases, because theenergy cost of the cluster head increases with the clustersize, and the energy cost of each cluster member remainsthe same. We see that the energy cost of a cluster head is notextremely high and can be afforded by a node when acluster size is limited. Fig. 12b compares the energy cost ofthe cluster head in the cluster-head based transmissionwith that of the cluster member node that forwards theaggregated data in the source-replication algorithm. Thefigure shows that the cluster head has much less energyconsumption than the forwarding cluster-member node.This is because the latter consumes much energy toreplicate its data to all other cluster members. Theexperimental results show that though a cluster headconsumes high energy as it forwards all data of its clustermembers to a reader, the energy cost of the cluster-headmethod is much lower than that of the source-replicationmethod.

4.3 Analysis of the Security MechanismsIn this section, we evaluate the performance of our securitymechanisms. We use cluster-head and cluster-memberreplication method for data transmission. Cost analysis. Inthe transmitted packet with the security mechanismsðN;Nonce; EnðK0N;DNÞ;MACðK0N;N jDNÞÞ, the size of thetag ID N , Nonce and the encrypted data EnðK0N;DNÞ wasset to 32 bits respectively, and the size of the encrypted dataEnðK0N;DNÞ was set to 64 bits. The size of the plain datawithout the security mechanism was set to 64 bits. We usedthe total transmitted bits to measure the communication

Fig. 12. Transmission energy cost of a cluster head versus cluster size. (a) Cluster head vs cluster member. (b) Cluster head energy cost.

Fig. 11. Comparison of total transmission energy cost versus network size. (a) Range ¼ 20 m. (b) Range ¼ 40 m.

SHEN ET AL.: EFFICIENT DATA COLLECTION FOR MOBILE MONITORING APPLICATIONS 1433

Page 11: Efficient data collection for large scale mobile monitoring applications

cost. We evaluated the ratio of the increase of totalcommunication cost, including the data replication cost ofnodes and the communication cost between nodes and thereader. The ratio was computed by ðCs � CÞ=C, where Cand Cs are the total communication cost without and withthe security mechanisms respectively.

Fig. 13a shows the ratio of the cost increase with regardto different threshold � value and the number of nodes in acluster. In the figure, � ¼ 0 means that only messageencryption and authentication are used and there is noprotection against data selective forwarding. Because ofadditional nonce and message authentication code, the sizeof each data message increases by more than half, about67 percent. We see that a larger threshold leads to higherratio of cost increase as it causes more nodes to send theirMACs to the RFID reader. Given a threshold, the ratio ofcost increase grows as the network size increases becauselarger network size makes more nodes to send theirMACs to the RFID readers.

Security analysis. Compromised tags can collude to-gether to provide enough MACs to authenticate a falsedata. Thus, a critical problem for our secure randomizedsolution is how much resiliency it has against compro-mised nodes. To answer this question, we analyze thedetection probability of false data when a number ofcompromised nodes exist in the system.

Theorem 4.1. Suppose in a group of n tags, nc number ofcompromised tags collude with a compromised node whichsends pruned data set to the RFID reader. Given the threshold� for the probability of a node sending its MAC to the readerin the cluster, then the probability of the reader successfullydetecting data selective drop attack, denoted by Prd, is theprobability that at least one non-compromised tag choose tosend its MAC for all data set to the sender. Then, we have

Prd ¼ 1� ð1� �Þðn�ncÞ: (3)

Proof. According to Formula (2), each node is selected withprobability � to send MAC to the reader. Iff only thecompromised nodes send their MACs to the reader, thefalse data cannot be detected. The probability of suchcase is ð1� �Þðn�ncÞ. Then, the detection probability canbe derived as 1� ð1� �Þðn�ncÞ. Ì

Given n ¼ 100, we show the relationship between Prdand nc with regard to different � values in Fig. 13b. Thefigure shows that our security mechanism is resilient to thenode compromising attack, since the detection probability

is greater than 90 percent even when half of nodes in acluster are compromised. A larger � value further increasesthe resiliency of our security mechanism. When � is greaterthan 0.05, the detection probability is almost one evenwhen 50 percent nodes are compromised.

5 CONCLUSION

In this paper, we propose Hybrid RFID and WSN System(HRW) that integrates the multi-hop transmission mode ofWSNs and direction transmission mode of RFID systems toimprove the efficiency of data collection, hence to meet therequirements of low economic cost, high performance andreal-time monitoring in mobile monitoring applications.HRW is composed of RFID readers and hybrid smartnodes. Instead of waiting for RFID readers to read data,smart nodes replicate packets with neighbor nodes usingspecial reduced functional RFID readers. The collectedpackets are sent to a RFID reader when one of the replicanodes moves into the range of the RFID reader. We furtherpropose enhanced data transmission algorithms andsecurity mechanisms. Extensive simulation and trace-driven experimental results show that HRW outperformstraditional RFID in terms of the cost of deployment,transmission capacity and delay and tag capacity require-ment. In the future, we plan to evaluate HRW in a real-world testbed.

ACKNOWLEDGMENT

This research was supported in part by U.S. NSF grantsCNS-1249603, CNS-1049947, CNS-1156875, CNS-0917056and CNS-1057530, CNS-1025652, CNS-0938189, CSR-2008826, CSR-2008827, Microsoft Research Faculty Fellow-ship 8300751, and U.S. Department of Energy’s Oak RidgeNational Laboratory including the Extreme Scale SystemsCenter located at ORNL and DoD 4000111689.

REFERENCES

[1] RFID Progress at Wal-Mart in IDTechEx Website. [Online].Available: http://www.idtechex.com/research/articles/rfid_progress_at_wal_mart_00000161.asp

[2] R. Clauberg, ‘‘RFID and Sensor Networks,’’ in Proc. RFIDWorkshop, St. Gallen, Switzerland, Sept. 2004.

[3] L. Zhang and Z. Wang, ‘‘Integration of RFID into WirelessSensor Networks: Architectures, Opportunities and Challeng-ing Problems,’’ in Proc. Grid Coop. Comput. Workshops, 2006,pp. 433-469.

Fig. 13. Overhead and detection probability on security mechanism. (a) Overhead. (b) Detection probability.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 20141434

Page 12: Efficient data collection for large scale mobile monitoring applications

[4] H. Liu, M. Bolic, A. Nayak, and I. Stojmenovic, ‘‘Taxonomy andChallenges of the Integration of RFID and Wireless SensorNetworks,’’ IEEE Netw., vol. 22, no. 6, pp. 26-35, Nov./Dec. 2008.

[5] J.Y. Daniel, J.H. Holleman, R. Prasad, J.R. Smith, and B.P. Otis,‘‘NeuralWISP: A Wirelessly Powered Neural Interface with1-m Range,’’ IEEE Trans. Biomed. Circuits Syst., vol. 3, no. 6,pp. 379-387, Dec. 2009.

[6] A.P. Sample, D.J. Yeager, and J.R. Smith, ‘‘A Capacitive TouchInterface for Passive RFID Tags,’’ in Proc. IEEE Int’l Conf. RFID,2009, pp. 103-109.

[7] Z. Li, H. Shen, and B. Alsaify, ‘‘Integrating RFID with WirelessSensor Networks for Inhabitant, Environment and HealthMonitoring,’’ in Proc. ICPADS, 2008, pp. 639-646.

[8] T. Lez and D. Kim, ‘‘Wireless Sensor Networks and RfidIntegration for Context Aware Services,’’ Auto-ID Labs,Cambridge, MA, USA, Tech. Rep., 2007.

[9] F. Li and J. Wu, ‘‘MOPS: Providing Content-Based Servicein Disruption-Tolerant Networks,’’ in Proc. ICDCS, 2009,pp. 526-533.

[10] D. Karger, E. Lehman, T. Leighton, M. Levine, D. Lewin, andR. Panigrahy, ‘‘Consistent Hashing and Random Trees: Distrib-uted Caching Protocols for Relieving Hot Spots on the WorldWide Web,’’ in Proc. STOC, 1997, pp. 654-663.

[11] T. Spyropoulos, K. Psounis, and C. Raghavendra, ‘‘EfficientRouting in Termittently Connected Mobile Networks: The Single-Copy Case,’’ IEEE/ACM Trans. Netw., vol. 16, no. 1, pp. 63-76,Feb. 2007.

[12] Y. Wu, T. Ho, W. Liao, and C. Tsao, ‘‘Epoch, Length of theRandom Waypoint Model in Mobile Ad Hoc Networks,’’ IEEECommun. Lett., vol. 9, no. 11, pp. 1003-1005, Nov. 2005.

[13] N. Eagle, A. Pentland, and D. Lazer, ‘‘Inferring Social NetworkStructure Using Mobile Phone Data,’’ Proc. Nat. Acad. Sci., USA,vol. 106, no. 36, pp. 15 274-15 278, Sept. 2009.

[14] A. Chaintreau, P. Hui, J. Scott, R. Gass, J. Crowcroft, and C. Diot,‘‘Impact of Human Mobility on Opportunistic ForwardingAlgorithms,’’ IEEE Trans. Mobile Comput., vol. 6, no. 6, pp. 606-620, June 2007.

[15] A. Vahdat and D. Becker, ‘‘Epidemic Routing for PartiallyConnected Ad Hoc Networks,’’ Duke Univ., Durham, NC, USA,Tech. Rep. CS-200006, 2000.

[16] MPR-Mote Processor Radio Board User’s Manual, Crossbow Inc.,San Jose, CA, USA, , 2004.

[17] S.M. Ross, Introduction to Probability Models, 8th ed., Amsterdam,The Netherlands: Academic, 2003.

[18] B.H. Bloom, ‘‘Space/Time Trade-Offs in Hash Coding withAllowable Errors,’’ Commun. ACM, vol. 13, no. 7, pp. 422-426,July 1970.

[19] D. Simplot-Ryl, I. Stojmenovic, A. Micic, and A. Nayak, ‘‘AHybrid Randomized Protocol for RFID Tag Identification,’’Sensor Rev., vol. 26, no. 2, pp. 147-154, 2006.

[20] B. Firner, P. Jadhav, Y. Zhang, R. Howard, W. Trappe, andE. Fenson, ‘‘Towards Continuous Asset Tracking: Low-PowerCommunication and Fail-Safe Presence Assurance,’’ in Proc. IEEESECON, 2009, pp. 1-9.

[21] Y. Gu, G. Yu, X. Li, and Y. Wang, ‘‘RFID Data InterpolationAlgorithm Based on Dynamic Probabilistic Path-Event Model,’’J. Softw., vol. 21, no. 3, pp. 439-441, Mar. 2010.

[22] C. Lee and C. Chung, ‘‘RFID Data Processing in Supply ChainManagement Using a Path Encoding Scheme,’’ IEEE Trans.Knowl. Data Eng., vol. 23, no. 5, pp. 742-758, May 2011.

[23] C. Tan, B. Sheng, and Q. Li, ‘‘How to Monitor for Missing RFIDTags,’’ in Proc. ICDCS, 2008, pp. 295-302.

[24] B. Sheng, Q. Li, and W. Mao, ‘‘Efficient Continuous Scanningin RFID Systems,’’ in Proc. INFOCOM, 2010, pp. 1-9.

[25] W. Luo, S. Chen, T. Li, and Y. Qiao, ‘‘Probabilistic Missing-TagDetection and Energy-Time Tradeoff in Large-Scale RFIDSystems,’’ in Proc. MobiHoc, 2012, pp. 95-104.

[26] W. Luo, S. Chen, T. Li, and S. Chen, ‘‘Efficient Missing TagDetection in RFID Systems,’’ in Proc. IEEE INFOCOM, 2011,pp. 356-360.

[27] Y. Bai, F. Wang, P. Liu, C. Zaniolo, and S. Liu, ‘‘RFID DataProcessing with a Data Stream Query Language,’’ in Proc. IEEEICDE, 2007, pp. 1184-1193.

[28] R. Szewczyk, A. Mainwaring, J. Polastre, J. Anderson, andD. Culler, ‘‘An Analysis of a Large Scale Habitat MonitoringApplication,’’ in Proc. SenSys, 2004, pp. 214-226.

[29] A. Maffei, K. Fall, and D. Chayes, ‘‘Ocean instrument internet,’’presented at the Proceedings of AGU Ocean Sciences Conference,San Francisco, CA, USA, 2006, Paper OS45D-14.

[30] E. Jovanov, A. Milenkovic, C. Otto, and P.C. Groen, ‘‘A WirelessBody Area Network of Intelligent Motion Sensors for ComputerAssisted Physical Rehabilitation,’’ J. Neuroeng. Rehab., vol. 2,no. 1, p. 6, Mar. 2005.

[31] T. Taleb, D. Bottazzi, M. Guizani, and H. Nait-Charif, ‘‘Angelah:A Framework for Assisting Elders at Home,’’ IEEE J. Sel. AreasCommun., vol. 27, no. 4, pp. 480-494, May 2009.

[32] P. Mohan, V.N. Padmanabhan, and R. Ramjee, ‘‘Nericell: RichMonitoring of Road and Traffic Conditions Using MobileSmartphones,’’ in Proc. SenSys, 2008, pp. 323-326.

[33] A. Thiagarajan, L. Ravindranath, K. LaCurts, S. Madden,H. Balakrishnan, S. Toledo, and J. Eriksson, ‘‘Vtrack: Accurate,Energy-Aware Road Traffic Delay Estimation Using MobilePhones,’’ in Proc. SenSys, 2009, pp. 85-98.

[34] M. Li, Y. Liu, J. Wang, and Z. Yang, ‘‘Sensor NetworkNavigation Without Locations,’’ in Proc. IEEE INFOCOM, 2009,pp. 2419-2427.

[35] S. Li, X. Wu, A. Zhan, and G. Chen, ‘‘ERN: Emergence RescueNavigation with Wireless Sensor Networks,’’ in Proc. IPDPS,2009, pp. 361-368.

[36] A. Thiagarajan, J. Biagioni, T. Gerlich, and J. Eriksson, ‘‘Coop-erative Transit Tracking Using Smart-Phones,’’ in Proc. SenSys,2010, pp. 85-98.

[37] H. Wu, R. Fujimoto, R. Guensler, and M. Hunter, ‘‘MDDV:Mobility-Centric Data Dissemination Algorithm for VehicularNetworks,’’ in Proc. ACM VANET, 2004, pp. 47-56.

[38] G. Zhou, J. Lu, C.Y. Wan, M. Yarvis, and J. Stankovic, ‘‘BodyQoS:Adaptive and Radio-Agnostic QoS for Body Sensor Networks,’’in Proc. IEEE INFOCOM, 2008, pp. 565-573.

[39] D. Vlasic, R. Adelsberge, G. Vannucc, J. Barnwell, M. Gross,W. Matusik, and J. Popovi, ‘‘Practical Motion Capture inEveryday Surroundings,’’ ACM Trans. Graph., vol. 26, no. 3, p. 35,July 2007.

[40] S.B. Eisenman, E. Miluzzo, N.D. Lane, R.A. Peterson, G.S. Ahn,and A.T. Campbell, ‘‘The Bikenet Mobile Sensing Systemfor Cyclist Experience Mapping,’’ in Proc. SenSys, 2007,pp. 87-101.

[41] X. Chen, K. Makki, K. Yen, and N. Pissinou, ‘‘Sensor NetworkSecurity: a Survey,’’ IEEE Commun. Surveys Tuts., vol. 11, no. 2,pp. 52-73, 2nd Quarter.

[42] H. Yang, F. Ye, Y. Yuan, S. Lu, and W.A. Arbaugh, ‘‘TowardResilient Security in Wireless Sensor Networks,’’ in Proc. ACMMobiHoc, 2005, pp. 34-45.

[43] Z. Yu and Y. Guan, ‘‘A Dynamic En-Route Scheme for FilteringFalse Data Injection in Wireless Sensor Networks,’’ in Proc. IEEEINFOCOM, Apr. 2006, pp. 1-12.

[44] K. Ren, W. Lou, and Y. Zhang, ‘‘LEDS: Providing Location-Aware End-To-End Data Security in Wireless Sensor Networks,’’in Proc. IEEE INFOCOM, Apr. 2006, pp. 1-12.

[45] L. Yu and J. Li, ‘‘Grouping-Based Resilient Statistical En-RouteFiltering for Sensor Networks,’’ in Proc. IEEE INFOCOM, 2009,pp. 1782-1790.

[46] Z. Yang and H. Wu, ‘‘FINDERS: A Featherlight InformationNetwork with Delay-Endurable RFID Support,’’ IEEE/ACM Trans.Netw., vol. 19, no. 4, pp. 961-974, Aug. 2011.

[47] X. Zhang, G. Neglia, J. Kurose, and D. Towsley, ‘‘PerformanceModeling of Epidemic Routing,’’ in Proc. IFIP, 2006, pp. 535-546.

[48] Y. Wang, S. Jain, M. Martonosi, and K. Fall, ‘‘Erasure-CodingBased Routing for Opportunistic Networks,’’ in Proc. SIGCOMM,2005, pp. 229-236.

[49] Y. Wang and H. Wu, ‘‘Delay/Fault-Tolerant Mobile SensorNetwork (DFT-MSN): A New Paradigm for Pervasive Infor-mation Gathering,’’ IEEE Trans. Mobile Comput., vol. 6, no. 9,pp. 1021-1034, Sept. 2007.

[50] J. Burgess, B. Gallagher, D. Jensen, and B. Levine, ‘‘MaxProp:Routing for Vehicle-Based Disruption-Tolerant Networks,’’ inProc. IEEE INFOCOM, 2006, pp. 1-11.

[51] J. Ghosh, S.J. Philip, and C. Qiao, ‘‘Sociological Orbit AwareLocation Approximation and Routing (SOLAR) in MANET,’’ AdHoc Netw., vol. 5, no. 2, pp. 189-209, Mar. 2007.

[52] E.M. Daly and M. Haahr, ‘‘Social Network Analysis for Routingin Disconnected Delay-Tolerant MANETs,’’ in Proc. Mobihoc,2007, pp. 32-40.

SHEN ET AL.: EFFICIENT DATA COLLECTION FOR MOBILE MONITORING APPLICATIONS 1435

Page 13: Efficient data collection for large scale mobile monitoring applications

Haiying Shen received the BS degree incomputer science and engineering from TongjiUniversity, Shanghai, China, in 2000, and theMS and PhD degrees in computer engineeringfrom Wayne State University, Detroit, MI, USA,in 2004 and 2006, respectively. Currently, she isan Associate Professor in the Department ofElectrical and Computer Engineering at Clem-son University, Clemson, SC, USA. Her re-search interests include distributed computersystems and computer networks, with an em-

phasis on P2P and content delivery networks, mobile computing,wireless sensor networks, and cloud computing. She is a MicrosoftFaculty Fellow of 2010 and a Senior Member of the IEEE and ACM.

Ze Li received the BS degree in electronics andinformation engineering from Huazhong Univer-sity of Science and Technology, Wuhan, China,in 2007, and the PhD degree in computerengineering from Clemson University, Clemson,SC, USA. His research interests include distrib-uted networks, with an emphasis on peer-to-peer and content delivery networks. Currently,he is a Data Scientist in the MicroStrategyIncorporation.

Lei Yu received the PhD degree in computerscience from Harbin Institute of Technology,Harbin, China, in 2011. Currently, he is aPostdoctoral Research Fellow in the Departmentof Electrical and Computer Engineering atClemson University, Clemson, SC, USA. Hisresearch interests include sensor networks,wireless networks, cloud computing, and net-work security. He is a member of the IEEE.

Chenxi Qiu received the BS degree in telecom-munication engineering from Xidian University,Xi’an, China, in 2009. Currently, he is a PhDstudent in the Department of Electrical andComputer Engineering at Clemson University,Clemson, SC, USA. His research interestsinclude sensor networks and wireless networks.

. For more information on this or any other computing topic,please visit our Digital Library at www.computer.org/publications/dlib.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 20141436