

J. Parallel Distrib. Comput. 66 (2006) 586–599
www.elsevier.com/locate/jpdc

Fault-tolerant wireless sensor network routing protocols for the supervision of context-aware physical environments

Azzedine Boukerche a,∗, Richard Werner Nelem Pazzi a, Regina Borges Araujo b

a PARADISE Research Laboratory, SITE, University of Ottawa, Ottawa, Canada K1N 6N5
b DC, Universidade Federal de São Carlos, CP 676, 13565-905 São Carlos, SP, Brazil

Received 7 February 2005; received in revised form 4 November 2005; accepted 10 December 2005

Abstract

Applications that require fine-grain monitoring of physical environments subjected to critical conditions, such as fire, leaking of toxic gases and explosions, pose a great challenge to sensor network protocols. These networks have to provide a fast, reliable, fault-tolerant and energy-aware channel for event diffusion that meets the requirements of query-based, event-driven and periodic sensor network application scenarios. These requirements have to be met even in the presence of emergency conditions that can lead to node failures and path disruption to the sink. This paper proposes two routing protocols: the periodic, event-driven and query-based protocol (PEQ) and its variation CPEQ, two fault-tolerant and low-latency algorithms that meet sensor network requirements for critical conditions supervision in context-aware physical environments. While PEQ can provide low latency for event notification, fast broken path reconfiguration, and high reliability in the delivery of event packets for low network data traffic, CPEQ is a cluster-based routing protocol that groups sensor nodes to efficiently relay the sensed data to the sink by uniformly distributing energy dissipation among the nodes and reducing latency for high network data traffic (typical in emergency situations). PEQ and its variant CPEQ use the publish/subscribe paradigm to disseminate requests across the network. We discuss both PEQ and CPEQ protocols and their implementation, and report on the performance results of several scenarios using the NS-2 simulator. The results obtained are compared with the well-known directed diffusion (DD) protocol, and indicate that our proposed algorithms can meet the constraints and requirements of critical condition supervision in context-aware physical environments. Our results indicate that PEQ outperforms DD in average delay, since it uses the shortest path for the delivery of packets and speeds up new subscriptions by using the reverse of the path used for event notification packets. CPEQ also outperforms DD in both average delay and packet delivery ratio when the network scales up.

© 2006 Published by Elsevier Inc.

Keywords: Sensor networks; Hierarchical routing protocol; Publish/subscribe

This work was supported by the Canada Research Chair Program, NSERC, Canada Foundation for Innovation Funds, and OTI/Distinguished Researcher Award.

∗ Corresponding author. E-mail address: [email protected] (A. Boukerche).

0743-7315/$ - see front matter © 2006 Published by Elsevier Inc. doi:10.1016/j.jpdc.2005.12.007

1. Introduction

With the recent developments in wireless networks and multifunctional sensors with digital processing, power supply and communication capabilities, wireless sensor networks are being widely deployed in physical environments for fine-grain monitoring in different classes of applications [1,5,7,10,12,25]. One of the most appealing applications is security surveillance and supervision of context-aware physical environments for critical conditions monitoring. In a prison, for instance, it is important to maintain reliable monitoring of the physical environment, especially when emergency situations arise, such as prisoner rebellions that can lead to fires and to losses of human lives and property. In such situations, it is important that information can be “sensed” from the physical environment while the emergency is in progress, since more precise information can be used by security and rescue teams for operation management and better strategic decisions. However, in order to keep the information flowing from the sensors during the emergency, a wireless sensor network solution has to cope with the failure of sensor nodes (sensors can be burnt, can have their propagation jeopardized by interference, such as water or dense smoke present in the environment, can malfunction, etc.). Therefore, wireless sensor network solutions for such environments have to be fault tolerant and reliable, and have to provide low latency, fast reconfiguration and energy saving. In terms of energy savings, in a silent monitoring state, sensor nodes can be programmed to report events in a periodic fashion (send the temperature every 10 min) or in an event-driven fashion (send the temperature only when it is above 60 °C). In these cases the interest may not change for quite some time.

Some existing energy-saving solutions take that into consideration and switch some nodes off, putting them in an inactive state; these nodes are woken up only when an interest matches the events “sensed” [19]. On the other hand, in query-based application scenarios, queries (new interests) can be propagated to sensors arbitrarily, according to the needs of the application and/or user, so some existing energy-saving solutions may not be adequate, because the transition from the inactive state to the data transfer state can be costly in terms of energy when many arbitrary transitions are necessary [8]. Moreover, energy saving and fault-tolerance support can conflict when additional paths involving inactive nodes have to be set up quickly because nodes on previous paths have failed.

In this paper, we propose two novel periodic, event-driven and query-based wireless sensor network protocols for the supervision of context-aware physical environments: the periodic, event-driven and query-based protocol (PEQ) and its variation, cluster-based PEQ (CPEQ). Both are fault-tolerant, low-latency algorithms that meet sensor network requirements for critical conditions supervision in context-aware physical environments. PEQ can provide low latency for event notification packets, fast broken path reconfiguration, and high reliability in the delivery of packets with low energy dissipation. Low latency is achieved by using the shortest path for the delivery of packets. Fast subscription of new interests (for query-based scenarios) is provided by the concept of driven delivery of packets, in which new subscriptions to a sensor region are sped up by using the reverse of the path used for event notification packets. This has an impact on energy saving, since less traffic is disseminated through the network for both event notification packets and broken path reconfiguration. The network nodes can trigger fault tolerance when they detect a node failure, in which case the nodes cooperatively find the fastest path with the smallest possible number of transmissions. The sensor network is configured through a hop tree, which is built at configuration time. The publish/subscribe paradigm is used to promote the interaction between sensors and sink. Subscriptions are propagated to the sensors through the hop tree. In order to better describe the algorithm, a grid model is used. However, the solution can be applied to mesh and dense, randomly deployed sensor networks as well.

In order to reduce latency even further by reducing the network traffic, and to uniformly distribute energy dissipation among the nodes, a variation of PEQ, the cluster-based routing protocol CPEQ, was devised to efficiently relay the sensed data to the sink. In the CPEQ protocol, nodes with more residual energy are selected as aggregator nodes that relay data to the sink, which distributes energy dissipation uniformly among the nodes. The strength of CPEQ is its simplicity and effectiveness in the packet delivery process.

The provision of low latency, high reliability even in the presence of failures, fast subscription of new interests and energy saving makes PEQ and CPEQ suitable algorithms to support applications in areas ranging from health care (body vital signs monitoring, medical instruments, localization of objects and people in health-care facilities, laboratories, etc.) to transportation (traffic control, vehicle supervision and control, etc.), government (environmental control, meteorological services, key national symbols, e.g., cultural institutions and national sites and monuments), manufacturing (including the chemical industry and the defense industrial base) and miscellaneous uses (smart supermarkets, tourism guides, entertainment, etc.).

The paper is organized as follows: Section 2 describes the PEQ algorithm, showing the network configuration, the subscription and notification of events, and network reconfiguration in the face of node failures. Section 3 describes the CPEQ algorithm, including the network configuration, aggregator selection, cluster configuration and data transmission to the sink. The performance evaluation of PEQ and CPEQ is discussed in Section 5, along with the simulation scenarios, the metrics used and the results obtained compared to the directed diffusion (DD) paradigm. In Section 6, related work is discussed. Conclusions are presented in Section 7.

2. Description of PEQ

The main motivation for the work described here is the need to support all of the following requirements simultaneously: low latency, reliability, fast path recovery in the presence of failures, and energy savings. Although several interesting solutions have been reported in the literature, they generally do not meet all of these requirements at the same time. Moreover, some solutions require either special hardware or sophisticated processing at the nodes. The basic idea of the PEQ algorithm is to use ordinary nodes, with no special hardware and only simple processing at each node, using the hop level as the main information to minimize data transmission. In the presence of failures, the algorithm switches to a fast recovery mode that, unlike other solutions, keeps the exchange of information among neighbor nodes to a minimum. PEQ is a routing algorithm that is realized in three steps. The first step is the construction of the hop tree: the sink starts the process of building the hop tree, which will be used as the mechanism for propagating configuration and subscription packets through the sensor network. The second step is the propagation of subscriptions to the sensor network. Finally, the last step is the delivery of packets from the sensors to the sink, using the fastest and least costly route in terms of energy.

The next sections describe the publish/subscribe paradigm as the mechanism for sensors/sink interaction, followed by the description of the routing steps. It is assumed that the nodes are arranged as a grid, so that the transmission coverage of one sensor node reaches its eight neighbor nodes. However, the solution can also be applied to mesh networks as well as to dense, randomly deployed sensor networks.

2.1. The publish/subscribe paradigm for sensors/sink interaction

Sensor networks can have thousands of nodes, each one producing events that are delivered to one or more static or mobile sinks. Several communication paradigms can be used to promote the interaction between sensors that produce events and sinks that consume those events. Examples of such paradigms include message passing, remote invocations, notifications, shared spaces and message queuing. The basic problem with these paradigms is that they fail to promote full decoupling between participants, making the system less flexible and less scalable [15]. Eugster and colleagues [14] present an excellent study of all these paradigms and compare them to the publish/subscribe paradigm, which has received increased attention because it decouples consumers and producers in time (publishers and subscribers do not need to be active in the interaction at the same time), space (publishers and subscribers do not need to know each other) and flow (publishers and subscribers do not need to be synchronized to interact). In the publish/subscribe interaction paradigm, one or more sinks receive event notification packets from a sensor network. The sink expresses interest in a sensor by subscribing to certain information it requires from the sensor. When the sensor detects that information, it publishes it by firing an event, which is sent asynchronously to the sink(s) in a notification packet.

The publish/subscribe mechanism can be used to convey and receive notifications for the following types of interest: periodic, query-based and event-driven. The PEQ algorithm has a simple and effective solution for subscription propagation and event notification, as described in the next sections.
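As an illustration of the three interest types, the following Python sketch shows how a sensor might decide whether a reading should be published. The `Interest` fields and the `should_publish` helper are assumptions made for this example and are not part of the PEQ specification; the 60 °C threshold and the 10 min period are the examples given in the Introduction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interest:
    """Hypothetical representation of a sink's subscription."""
    attribute: str
    threshold: Optional[float] = None  # event-driven: notify when value > threshold
    period_s: Optional[float] = None   # periodic: notify every period_s seconds


def should_publish(interest, attribute, value, now_s, last_sent_s):
    """Return True if the reading must be sent to the sink as an event."""
    if attribute != interest.attribute:
        return False
    if interest.threshold is not None:      # event-driven interest
        return value > interest.threshold
    if interest.period_s is not None:       # periodic interest
        return now_s - last_sent_s >= interest.period_s
    return True                             # query-based: answer with the current reading


event_driven = Interest("temperature", threshold=60.0)
periodic = Interest("temperature", period_s=600.0)
query = Interest("temperature")
```

With these definitions, a reading of 72.5 °C satisfies the event-driven interest, while the periodic interest is satisfied only once 600 s have elapsed since the last notification.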

2.2. Building the hop tree

In the wireless sensor network considered here, a node does not have a global understanding of the network, i.e., a node only knows a small amount of information about its nearest neighbors (those that are within its coverage reach). Initially, each node knows only its own hop level in the hop tree. The hop tree is started by a sink, which transmits to its neighbor(s) an attribute-value pair called hop.

The algorithm for building the hop tree is based on flooding the network, starting from the sink, with a hop value that is stored, incremented and transmitted to the neighbor nodes. These neighbor nodes store the received hop value, increment it and transmit it to their own neighbor nodes, and so on until the whole sensor network is configured with different hop levels. Because the network nodes communicate by radio frequency, all the neighbors of a node receive its transmission. So, a node that has already transmitted will receive its neighbors' transmissions, generating a loop. In order to avoid these useless transmissions, which waste energy, a set of rules was established as part of the hop diffusion algorithm. One of the local rules establishes that when a node receives a hop from its neighbor, it checks this value against its local hop value. If the local hop value is greater than the received one, the node updates its hop, increments this value and retransmits it to its neighbors. If the locally stored hop is smaller than or equal to the received hop, the node does not update its hop and does not transmit it. Fig. 1 shows the initial configuration of a mesh network.

Fig. 1. Hop configuration in a mesh network.
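The hop diffusion rule can be sketched in a few lines of Python. This is a minimal simulation under stated assumptions, not the authors' implementation: the sink is taken to be at hop level 0 and to advertise the value 1, transmissions are processed in FIFO order, and the helper names `grid_neighbors` and `build_hop_tree` are hypothetical.

```python
from collections import deque


def grid_neighbors(rows, cols):
    """Grid model: each node reaches its (up to) eight surrounding neighbors."""
    nbrs = {}
    for r in range(rows):
        for c in range(cols):
            nbrs[(r, c)] = [
                (r + dr, c + dc)
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0) and 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
    return nbrs


def build_hop_tree(neighbors, sink):
    """Flood hop values from the sink; return (hop levels, number of transmissions)."""
    hop = {node: float("inf") for node in neighbors}
    hop[sink] = 0                      # assumption: the sink itself is level 0
    queue = deque([(sink, 1)])         # (transmitter, hop value it advertises)
    transmissions = 0
    while queue:
        sender, value = queue.popleft()
        transmissions += 1
        for node in neighbors[sender]:
            # Local rule: update and retransmit only if the stored hop
            # is greater than the received one; otherwise stay silent.
            if hop[node] > value:
                hop[node] = value
                queue.append((node, value + 1))
    return hop, transmissions


hops, sent = build_hop_tree(grid_neighbors(5, 5), sink=(4, 2))
```

In this idealized FIFO ordering each node transmits exactly once, which is the energy saving the local rule is meant to achieve; on a real radio channel a node could receive a larger value first and transmit more than once.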

The data structure used in the algorithm comprises three tables: configTable, routingTable and subscriptionTable. The configTable holds the configuration parameters associated with a sink. A node uses the routingTable to forward packets to its neighbor nodes. Finally, the subscriptionTable is used to store the subscriptions a node receives. The routingTable has four fields: sinkID, senderID, destID, and coordinates. The coordinates attribute indicates the position of the node, so an application can know where the readings come from, and a sink can send a subscription to a region delimited by coordinates instead of sending it to specific source IDs.
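One possible in-memory layout for these tables is sketched below in Python. Only the four routingTable fields are taken from the text; the `Subscription` fields, the choice to key the routing table by sinkID, and the `next_hop_to_sink` helper are assumptions made for illustration (the configTable is omitted because its fields are not listed).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class RoutingEntry:
    sink_id: int
    sender_id: Optional[int]                 # neighbor on the reverse path (toward the source)
    dest_id: Optional[int]                   # neighbor on the path toward the sink
    coordinates: Optional[Tuple[int, int]]   # position of the node that produced the reading


@dataclass
class Subscription:
    """Hypothetical subscription record; the paper does not list its fields."""
    sink_id: int
    attribute: str
    threshold: float


@dataclass
class NodeState:
    node_id: int
    hop: int
    routing_table: Dict[int, RoutingEntry] = field(default_factory=dict)  # keyed by sinkID
    subscription_table: List[Subscription] = field(default_factory=list)

    def next_hop_to_sink(self, sink_id: int) -> Optional[int]:
        """Look up the destID to which data for the given sink is forwarded."""
        entry = self.routing_table.get(sink_id)
        return entry.dest_id if entry else None


node = NodeState(node_id=12, hop=3)
node.routing_table[1] = RoutingEntry(sink_id=1, sender_id=None, dest_id=7, coordinates=None)
```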

2.3. The subscription packet propagation

In the publish/subscribe paradigm, for a sink to be notified about the events that the sensors capture from the physical environment, it needs to subscribe to one or more nodes for given information by setting one or more criteria (temperature > 60 °C, presence of smoke, etc.) that have to be matched before any event packet is sent. Sending event packets only when they match a criterion reduces network traffic, which wastes less energy and extends the lifetime of the sensor network. After the initial configuration of the network, the only information a node has is its hop level. This information alone is not enough for efficient subscription propagation. In the absence of any information about which node of the network can satisfy the sink's interest, one way to propagate the initial subscription is to flood the network with this interest. Each node of the network keeps a small subscription table and a routing table. Each record of the subscription table represents a different subscription. During subscription propagation, when a node receives a subscription packet, it compares the packet's coordinates attribute to its own coordinates. If they are the same, the subscription is meant for this node, so it is stored in the node's subscription table. Otherwise, the node only retransmits the subscription. Also, when a node receives a transmission during subscription propagation, it sets its routingTable destID to the node that transmitted the packet and its sinkID to the sink that sent the subscription. This information will be used to forward data back to the sink: when a node needs to forward data to a sink, it checks its routingTable and forwards the packet to the destID corresponding to the sinkID.
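The flooding of an initial subscription can be sketched as follows. This is a simplified Python simulation under assumptions of our own: nodes are identified by integers rather than coordinates, each node keeps the first transmitter it hears as its destID, and transmissions are processed in FIFO order. The function name `propagate_subscription` is hypothetical.

```python
from collections import deque


def propagate_subscription(neighbors, sink, target):
    """Flood a subscription from the sink toward the node identified by target.

    Returns (dest, stored_at): dest maps each reached node to its destID for
    this sink, and stored_at is the node that stored the subscription.
    """
    dest = {sink: None}          # the sink has no next hop toward itself
    stored_at = None
    queue = deque([sink])
    while queue:
        sender = queue.popleft()
        for node in neighbors[sender]:
            if node in dest:
                continue         # this node already heard the subscription
            dest[node] = sender  # routingTable: destID = transmitting node
            if node == target:
                stored_at = node     # coordinates match: store, do not retransmit
            else:
                queue.append(node)   # otherwise only retransmit
    return dest, stored_at


# Small example topology: sink 0, subscription addressed to node 4.
topology = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
dest, stored_at = propagate_subscription(topology, sink=0, target=4)
```

After propagation, following the destID entries from node 4 (4 to 3 to 1 to 0) gives the route that data packets will take back to the sink.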

2.4. Sending the notification packet

When a sensor captures information from the physical environment, it checks its subscription list to determine whether there is any registered interest. If a criterion is met, the node looks up the senderID of the node that transmitted the subscription. The node then assembles an event notification packet that contains the attributes type, value, coordinates and sinkID, and sends it to its neighbors. When a neighbor node receives the packet, it compares the received destID with its own ID. If they match, the node stores coordinates and senderID in its routing table and gets the destID from the routing table; each node on the path repeats this procedure until the notification reaches the sink.

Each node processes packets only from nodes that are in the previous hop level. This makes it easy to select the neighbor that transmitted fastest, and it also avoids packet loops. Suppose that a sink S sends a subscription to the network, and that the top-left-most node is the sensor that produces an event that meets the subscription criterion of sink S. The path that is created down to the sink for sending the notification packet can be seen in Fig. 2. Note that the arrows indicate the links that could form alternative paths, depending only on the choice each node makes of the fastest node that delivered the subscription. An important feature of the hop value configuration can be observed in the notification transmission phase. When a node receives a transmission from a neighbor node, it retransmits the packet only if the neighbor's hop number is one unit higher than its own. For instance, only the nodes with hop = 4 retransmit the information received from nodes with hop = 5, and so on.

The path used to forward data from the source node to the sink can also be used to forward subscriptions to the source node, as shown in Fig. 3. To use this reverse path, the nodes on the path send data to the senderID they get from their routingTables. This is useful when query-driven subscriptions have to be supported. Otherwise, if a node does not have a matching value for coordinates, it transmits without specifying the senderID, so that all neighbors will transmit the subscription packet. According to this algorithm, a node transmits only if its hop value matches the hop value received.

Fig. 2. Path for notification delivery.

Fig. 3. Driven delivery of subscriptions.

Fig. 4. Energy map of the network.

Because the driven delivery of subscriptions uses the same path created for notification packets, only the nodes on this path spend energy on transmission. The other nodes either receive and do not transmit (as is the case for the neighbors of the path nodes) or do not receive packets at all. Fig. 4 shows a map that represents the energy consumed by the network when using this path. The darker nodes denote a larger expenditure of energy.
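The relationship between notification delivery and driven delivery can be sketched in Python as follows. The sketch assumes that the destID of every node has already been set during subscription propagation (here supplied as the dictionary `dest`), and the function names are hypothetical.

```python
def send_notification(dest, source):
    """Forward a notification hop by hop via destID.

    Each receiving node records the senderID of the node it heard the packet
    from, which later defines the reverse path. Returns (path, sender_of).
    """
    sender_of = {}
    path = [source]
    current = source
    while dest[current] is not None:   # the sink has no destID
        nxt = dest[current]
        sender_of[nxt] = current       # routingTable: senderID
        path.append(nxt)
        current = nxt
    return path, sender_of


def driven_subscription(sender_of, sink, source):
    """Deliver a new subscription along the reverse of the notification path."""
    path = [sink]
    while path[-1] != source:
        path.append(sender_of[path[-1]])
    return path


# destID values as they might look after subscription propagation.
dest = {0: None, 1: 0, 2: 0, 3: 1, 4: 3}
forward_path, sender_of = send_notification(dest, source=4)
reverse_path = driven_subscription(sender_of, sink=0, source=4)
```

Only the nodes on `reverse_path` transmit during driven delivery; node 2, which is off the path, never retransmits the subscription.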

2.5. Path repair mechanism

The path created for sending the notification packet is unique and efficient (it promotes low latency and saves energy). It can also be used for the driven delivery of new subscriptions (for query-based scenarios, for instance, that may require random subscriptions). However, because the path is unique, a failure in any one of its nodes will cause disruption, preventing the delivery of event packets as well as subscriptions. Possible causes of failure include low energy, physical destruction of one or more nodes, communication blockage, etc. Many routing algorithms that cope with node failures have been proposed in the literature for sensor networks. Some are based on periodic flooding mechanisms [18,19], rooted at the sink, to repair broken paths and to discover new routes to forward traffic around faulty nodes. However, this mechanism is not satisfactory in terms of energy saving, because it wastes a lot of energy broadcasting repair packets. Furthermore, during the flooding interval, these algorithms are unable to route data around failed nodes, causing data losses. The PEQ algorithm offers an ACK-based path repair mechanism. This repair mechanism consists of two parts: detection of a failure at the destination node and selection of a new destination. Right after the initial configuration phase, each node has only one destination node through which to forward data to the sink, due to the single (shortest) path created.

When a sensor node needs to forward data to its destination, it simply sends the data packet, sets a timeout and waits for the neighbor's acknowledgment. If the transmitting node receives its neighbor's ACK, it can infer that the neighbor is alive. The neighbor sends the ACK packet right after it has forwarded the original packet, so the transmitting node knows that its packet was properly forwarded, and it needs neither to retransmit the packet nor to choose another neighbor. If the transmitting node does not receive the ACK packet, a problem must have occurred with the neighbor, and another node should be selected as the new destination. The transmitting node then immediately broadcasts a SEARCH packet to its neighbors. The neighbors reply to the transmitting node with a packet containing their hop level and identification. The next step is to select a new destination: the transmitting node chooses the neighbor with the lowest hop level. It then updates its routing table to ease the forwarding of subsequent packets. In order to avoid creating closed paths (loops), the transmitting node sets its own hop level to the destination's hop level plus one. If no neighbor replies to the SEARCH packet, the transmitting node has to retransmit it. If the node is isolated, the only solution is to increase its radio range. Note that this implements a backtracking mechanism, as any node may respond to the request, even nodes with higher hop levels, including the originator of the packet.
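The repair procedure can be sketched in Python as shown below. This is a simplified model under assumptions of our own: the ACK timeout is replaced by a lookup in an `alive` dictionary, ties between neighbors with the same hop level are broken by the lowest node ID, and the names `forward` and `repair_path` are hypothetical.

```python
def repair_path(node, hop, dest, alive, neighbors):
    """Select a new destination after a missing ACK; return it, or None if isolated."""
    # SEARCH broadcast: every live neighbor replies with (hop level, ID).
    replies = [(hop[n], n) for n in neighbors[node] if alive.get(n, False)]
    if not replies:
        return None                      # isolated node: no reply to SEARCH
    new_hop, new_dest = min(replies)     # neighbor with the lowest hop level
    dest[node] = new_dest                # update the routing table
    hop[node] = new_hop + 1              # avoid closed paths (loops)
    return new_dest


def forward(node, hop, dest, alive, neighbors):
    """Send a packet to the current destination, repairing the path if needed."""
    target = dest[node]
    if alive.get(target, False):         # stands in for "ACK received before timeout"
        return target
    return repair_path(node, hop, dest, alive, neighbors)


hop = {2: 1, 3: 1, 5: 2, 6: 3}
dest = {5: 2}
neighbors = {5: [2, 3, 6]}
alive = {2: False, 3: True, 6: True}     # node 2, the current destination, has failed
first_choice = forward(5, hop, dest, alive, neighbors)
```

If node 3 also fails, the only remaining neighbor is node 6 at a higher hop level; selecting it corresponds to the backtracking case described above, and node 5's own hop level rises to 4.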

A disrupted path is shown in Fig. 5. After the repair mechanism is applied, the path is reconstructed as shown in Fig. 6.

Clearly, if all neighbors of a node fail, this node will be isolated and its transmissions will not reach any neighbor. One solution would be to configure the node's radio module to increase its coverage area, but this consumes more energy. Another solution is to provide fault tolerance through the establishment of multiple paths from the nodes to the sink.

PEQ is a simple protocol with dynamic broken path reconfiguration support. Low latency is expected through the use of the shortest path for the delivery of event packets. New subscriptions to a sensor region can be sped up by using the reverse of the path used for event notification packets. The fault-tolerance mechanism is triggered locally by individual nodes instead of by a sink-based mechanism.

Fig. 5. Region with destroyed nodes.

Fig. 6. Repaired path.

In order to further reduce the network traffic and to uniformly distribute energy dissipation among the nodes, a variation of PEQ, named cluster-based PEQ (CPEQ), was devised to efficiently relay the sensed data to the sink by aggregating data from neighbor nodes. CPEQ is described in the next sections.

3. Improving PEQ through data aggregation: CPEQ, a cluster-based periodic, event-driven and query-based protocol

Sensor nodes in a wireless sensor network may generate a lot of data traffic, especially in an emergency situation. For instance, some nodes may detect that the temperature in part of a building is increasing while other nodes detect fire, and all these data might be sent to a sink that immediately triggers some nodes to track people inside the building. Sensor nodes in a certain region may detect the same event and send this redundant data to the sink. In order to avoid this redundancy, which generates unnecessary traffic and dissipates more energy, a data aggregation method is necessary [13,15,16,21,30]. Thus, a cluster-based approach was devised that groups sensor nodes to efficiently relay the sensed data to the sink. CPEQ adopts a cluster-based approach in which nodes with more residual energy are selected as aggregator nodes. An aggregator node builds up a cluster, and the nodes of this cluster send their data to the aggregator. The aggregator then executes some function on the collected data and relays the result to the sink. Every node in the network can become an aggregator node for a certain period of time. It is assumed that every node of the sensor network belongs to a cluster. When the time expires, other nodes are selected as aggregators. The main goals of the CPEQ protocol are to uniformly distribute energy dissipation among the nodes, and to reduce latency and data traffic in the network.

Fig. 7. Selection of aggregator nodes in CPEQ: (a) the node that generates a number < p sends a REQ_EN message, (b) neighbor nodes receive the message and reply with REP_EN and (c) the node chooses the neighbor with the most energy and assigns it as aggregator node by sending SET_AGR.

Fig. 8. Cluster configuration in CPEQ: (a) before clustering and (b) after clustering.

The CPEQ algorithm is realized in five steps: initial configuration; selection of aggregators; clusters configuration; data transmission to the aggregator; and data transmission to the sink. Each step is detailed in the following sections.

3.1. Initial configuration

The network needs to be configured before being used to gather data from the environment. The initial configuration is based on the PEQ algorithm, where the sink starts a flooding mechanism to configure the whole network. At the end of this step, each node knows the number of hops it is away from the sink, and is able to find the neighbor that is closest to the sink in order to forward data to it. The CPEQ packet for this step has an additional field that contains the percentage of nodes that will become aggregators.
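The flooding step can be pictured as a breadth-first traversal that starts at the sink. The following is only a minimal sketch under simplifying assumptions: the topology is given as a static adjacency dictionary, packets are never lost, and the names `configure_hop_levels`, `topology`, `hop` and `next_hop` are illustrative and do not come from the PEQ/CPEQ implementation.

```python
from collections import deque

def configure_hop_levels(neighbors, sink):
    """Flood from the sink; return (hop level, next hop toward the sink) per node."""
    hop = {sink: 0}
    next_hop = {sink: None}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for nb in neighbors[node]:
            if nb not in hop:            # the first copy of the flood to arrive wins
                hop[nb] = hop[node] + 1
                next_hop[nb] = node      # neighbor that is closest to the sink
                queue.append(nb)
    return hop, next_hop

# Hypothetical five-node topology used only for illustration.
topology = {
    "sink": ["a", "b"],
    "a": ["sink", "c"],
    "b": ["sink", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
hop, next_hop = configure_hop_levels(topology, "sink")
```

In this toy topology node `d` ends up three hops from the sink and forwards through `c`.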

3.2. Aggregator selection

The aggregator selection scheme is based on the idea presented in LEACH [17]. After the initial configuration, any node in the network can become an aggregator with a certain probability. For instance, if the desired percentage of aggregators is 5%, the probability of a node becoming an aggregator will be p = 0.05. Therefore, each node generates a random number between 0 and 1 and, if this number is less than the probability p, the node requests the energy level of its immediate neighbors by sending them a REQ_EN packet. Each neighbor replies with a REP_EN packet that contains its ID and its amount of energy. The node then selects the neighbor with the most energy and sends it a SET_AGR packet to inform it that it is the new aggregator. A node remains in the aggregator state for a specific time and, when this time expires, the aggregator selection scheme is executed again. Fig. 7 shows the selection scheme.
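The selection rule can be sketched in a few lines. This is an illustration, not the authors' code: the REQ_EN/REP_EN exchange is replaced by a dictionary of neighbor energies that is assumed to be already collected, and the function name `select_aggregator` and the sample values are hypothetical.

```python
import random

def select_aggregator(neighbor_energy, p, rng):
    """Return the neighbor to appoint via SET_AGR, or None.

    neighbor_energy maps neighbor ID -> residual energy and stands in for
    the REP_EN replies gathered after a REQ_EN broadcast.
    """
    if rng.random() >= p:        # the node did not draw a number below p
        return None
    if not neighbor_energy:      # no neighbor answered
        return None
    return max(neighbor_energy, key=neighbor_energy.get)

# Hypothetical replies; p = 1.0 forces the draw to succeed for the example.
replies = {"n1": 0.4, "n2": 0.9, "n3": 0.7}
chosen = select_aggregator(replies, 1.0, random.Random(0))
skipped = select_aggregator(replies, 0.0, random.Random(0))
```

With p = 0.05, as in the text, roughly one node in twenty would start this exchange in each round.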

3.3. Clusters configuration

The newly selected aggregator node is responsible for notifying its neighbors that it is the new aggregator. This way each aggregator builds up its cluster of nodes. The cluster configuration is performed through the broadcasting of a notification packet, AGR_NTF, that acts like the initial configuration algorithm of PEQ, but this time for a cluster. In order to limit the size of a cluster, the AGR_NTF packet carries a time to live (ttl) field. When a node receives an AGR_NTF packet, it stores the ID of the transmitter node in its routing table, so it knows the route to the aggregator, and will send data through this route. It may happen that a node receives AGR_NTF from more than one aggregator. In that case the node joins the cluster whose aggregator is closest, which it determines by checking the ttl field of the packets. Fig. 8 shows a cluster and the paths between nodes and aggregator. In this example a ttl = 2 was used.
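A ttl-limited flood of this kind might be sketched as follows. The sketch assumes that the ttl is decremented at each rebroadcast, so that a larger remaining ttl means a closer aggregator, that ties are resolved in favor of the first notification heard, and that aggregators do not join or relay for other clusters. These details, along with the names `build_clusters`, `chain` and `clusters`, are assumptions made for illustration.

```python
from collections import deque

def build_clusters(neighbors, aggregators, ttl):
    """Flood AGR_NTF from each aggregator for at most `ttl` hops.

    Returns {node: (aggregator, next_hop, remaining_ttl)}. A node that hears
    several aggregators keeps the packet with the larger remaining ttl.
    """
    membership = {}
    for agr in aggregators:
        seen = {agr}
        queue = deque([(agr, ttl)])
        while queue:
            sender, remaining = queue.popleft()
            if remaining == 0:               # ttl exhausted, stop relaying
                continue
            for nb in neighbors[sender]:
                if nb in seen or nb in aggregators:
                    continue
                seen.add(nb)
                best = membership.get(nb)
                if best is None or remaining > best[2]:
                    membership[nb] = (agr, sender, remaining)
                queue.append((nb, remaining - 1))
    return membership

# Hypothetical chain A - n1 - n2 - B with two aggregators and ttl = 2.
chain = {"A": ["n1"], "n1": ["A", "n2"], "n2": ["n1", "B"], "B": ["n2"]}
clusters = build_clusters(chain, ["A", "B"], ttl=2)
```

Here `n1` stays with aggregator `A` and `n2` joins `B`, each being one hop from its aggregator.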

3.4. Data transmission to the aggregator

When a node detects an event in the environment, the sensed data have to be sent to the cluster aggregator node. The data routing algorithm is the same one employed by PEQ when it routes data to the sink, in which nodes use the information in their routing tables to find the paths to the sink. In CPEQ, the aggregator can be thought of as a sink for its cluster. CPEQ also inherits the path repair mechanism from PEQ.

Fig. 9. Aggregator sending data to a sink.

3.5. Data transmission to the sink

After receiving data from the sensor nodes in its cluster, the aggregator needs to forward this data to the sink. The aggregator may use a data aggregation or fusion function to reduce the number of packets that will be transmitted to the sink. LEACH has a problem with scalability [1] because the communication between cluster heads (aggregators) and the sink is direct (one hop). Due to this communication scheme, the cluster heads must spend more energy to transmit over longer distances. It works fine with small networks, because every cluster head can reach the sink. However, with large networks, it can happen that a cluster head is unable to send data to a sink because of the distance between them. In CPEQ, the communication between aggregators and the sink is multi-hop: an aggregator sends its data through the shortest path to the sink, which was configured during the initial configuration step. Fig. 9 shows an aggregator sending data through the sensor mesh to the sink. Other solutions [6,31] employ cluster heads with specialized hardware, which consumes more energy and communication resources. These solutions have higher costs and need special care with the distribution of the cluster heads, as they will remain fixed throughout the network lifetime.

It is possible that some nodes do not belong to any cluster. In this case, the nodes can use the routes found during the initial configuration, and relay data to the sink through these routes.
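The two ideas of this step, collapsing cluster readings into one packet and relaying it hop by hop, can be illustrated with a short sketch. The paper does not specify the aggregation function, so the count/max/mean summary below is only an assumed example, and `aggregate`, `path_to_sink` and the sample routing table are hypothetical names and values.

```python
def aggregate(readings):
    """Toy aggregation function: one summary packet per batch of readings."""
    return {
        "count": len(readings),
        "max": max(readings),
        "mean": sum(readings) / len(readings),
    }

def path_to_sink(node, next_hop):
    """Follow the next-hop pointers stored during the initial configuration."""
    path = [node]
    while next_hop[path[-1]] is not None:
        path.append(next_hop[path[-1]])
    return path

# Hypothetical routing table: the aggregator "agr" is two hops from the sink.
routes = {"sink": None, "a": "sink", "agr": "a"}
summary = aggregate([40.0, 42.0, 44.0])
route = path_to_sink("agr", routes)
```

Three readings leave the cluster as a single packet that travels `agr`, `a`, `sink`. A node outside any cluster would call `path_to_sink` on itself in the same way.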

4. Simulation experiments

In order to assess the effectiveness of both PEQ and CPEQ protocols, we have carried out an extensive set of simulation experiments using the NS-2 simulator [29]. We divide our discussion of the experimental results into two parts. In the first part we evaluate the PEQ protocol, pointing out significant factors affecting PEQ’s performance. Following this, we discuss PEQ event packet delivery rate at different percentages of node failures. In the second part, we discuss the experimental results we have obtained from the performance evaluation of CPEQ, assessing its effectiveness in further minimizing the latency when compared to the PEQ protocol. We also discuss other results we have obtained when comparing PEQ, CPEQ and DD [19].

Fig. 10. Example scenario.

4.1. The simulation scenario and metrics for PEQ

We have implemented the PEQ protocol and evaluated its performance using the NS-2 simulator. It is assumed, for the experiments, that the coordinates used by PEQ to calculate the distance of a node to the sink are determined in a phase previous to the initial configuration and thus, their determination was not taken into account for the experiments. In the course of our simulation experiments, we have chosen scenarios which consist of several sensor fields of different population sizes ranging from 100 to 500 sensor nodes. A mesh network was considered, in which sensors are deployed randomly across an area of 1000 × 1000 m². A fixed workload of five sources and one sink was used. Sources were placed at the left side of the field and the sink at the right-hand side so that when the network size is increased, the number of paths between sources and sink is also increased (due to the number of nodes), as shown in Fig. 10. Thus, the impact of network size on PEQ’s performance can be more noticeable than with randomly selected sources and sink.

Table 1 lists the simulation parameters we have used in our experiments. The input values were largely based on the values reported for DD in [19], and the energy model was based on [11,27]. The data rate for each source was set to 10 event packets per second (instead of 2 as reported for DD, because we found that 2 event packets per second might not be enough for people localization in emergency situations). Since 5 source nodes were used for the experiment, 50 events were generated per second. However, experiments with varying rates of 5, 10 and 15 events per second were also carried out in order to evaluate how a varying event packet rate can impact the average delay and event packet delivery ratio. The interval for sending a repairing event packet was set to 20 s.

Table 1
Simulation parameters [19]

Parameters                        Value
Simulation time (s)               500
Number of nodes                   100–500
Source data rate (eventMsgs/s)    10
Number of source nodes            5
Repairing event interval (s)      20
Radio range (m)                   20
Transmit energy (mW)              14.88
Receive energy (mW)               12.50
Dissipation in idle (mW)          12.36
Dissipation in sleep (mW)         0.016

It is assumed that the sensors have enough energy to deliver the event packet. However, during the simulation, the nodes in the network were configured to dissipate energy according to the parameters of Table 1. The nodes’ radio range was set to 20 m to more closely mimic realistic sensor radio modules. Each measured value is the mean of 20 simulations. Event packet delivery delay and delivery rate are critical metrics for the performance of supervision applications. Moreover, dissipated energy may have a large impact on these delays. Thus, PEQ is evaluated through the following metrics:

• Average delay: Average latency from the moment an event packet is transmitted to the moment it is received at the sink.
• Average event packet delivery ratio: Ratio of the number of distinct received event packets to the number of originally sent event packets.
• Average dissipated energy: Ratio of the total dissipated energy to the number of nodes.
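The three metrics translate directly into code. The sketch below assumes a simple trace representation (dictionaries from packet ID to send and receive timestamps, and a list of per-node energies). This representation and all sample values are hypothetical, not the NS-2 trace format used in the experiments.

```python
def average_delay(sent_at, received_at):
    """Mean latency, in seconds, over the packets that reached the sink."""
    delays = [received_at[p] - sent_at[p] for p in received_at]
    return sum(delays) / len(delays)

def delivery_ratio(sent_at, received_at):
    """Distinct received event packets divided by originally sent packets."""
    return len(set(received_at)) / len(sent_at)

def average_dissipated_energy(energy_per_node):
    """Total dissipated energy divided by the number of nodes."""
    return sum(energy_per_node) / len(energy_per_node)

# Hypothetical trace: four packets sent, packet 3 lost on the way.
sent = {1: 0.0, 2: 1.0, 3: 2.0, 4: 3.0}
received = {1: 0.5, 2: 1.5, 4: 3.25}
delay = average_delay(sent, received)
ratio = delivery_ratio(sent, received)
energy = average_dissipated_energy([1.0, 2.0, 3.0])
```

Keying the received packets by ID makes duplicates collapse, which matches the "distinct" wording of the delivery ratio definition.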

4.2. The effectiveness of PEQ

Average delay is particularly important in query-based applications, which demand a fast and reliable response, like people localization queries for finding people in a building subject to an emergency situation, such as detection of fire or high temperatures. Low delay is important in this example since people can move fast in a few seconds, so information can become out of date very quickly, preventing actions that could save lives if taken in time. PEQ uses the subscription packet to propagate the initial configuration that builds the path to the sink and, when the source receives the subscription, it uses this path to deliver data to the sink. The average latency from the moment an event packet is transmitted to the moment it is received at the sink is shown in Fig. 11.

Node failures are simulated by turning off a fixed fraction of the simulated nodes. These nodes were randomly chosen from the sensor field and turned off at a random time during the simulation. As network size is increased, the delay gets higher due to the greater number of hops an event packet has to travel from source to sink. The delay also increases with the percentage of node failures. This makes sense as, in order to repair a broken path, the algorithm has to find “alive” nodes. As the number of failed nodes increases, the newly created path becomes longer. PEQ always tries to find the shortest path to the sink. As shown in the graph of Fig. 11, for a fixed network size, say 500 nodes, the delay increases from 0.049 to 0.058 s, an acceptable latency for the application scenario considered in this paper. The graph plots the comparison of the algorithms in a sensor field with 30% of node failures. Sensor network reliability can be measured by its average event packet delivery ratio, which reflects the success rate of event packet transmissions to the sink. Fig. 12 shows that PEQ is able to maintain a reasonable event packet delivery rate even at a high percentage of node failures. The results shown in Fig. 12 take into account the events that are generated at the end of the simulation: the nodes stop sending new event packets 500 ms before the end of the simulation, so that all event packets can reach the sink.

Fig. 11. Average delay with node failures.

Fig. 12. Average event delivery ratio with node failures.
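The failure model described above (a fixed fraction of nodes, chosen at random, each switched off at a random time) could be scheduled as in the following sketch. It assumes that every node, including sources, is eligible to fail, which the text does not state. The function name `schedule_failures` and the seed are illustrative.

```python
import random

def schedule_failures(node_ids, fraction, sim_time, rng):
    """Pick round(fraction * N) nodes and give each a random switch-off time."""
    count = round(fraction * len(node_ids))
    failed = rng.sample(node_ids, count)
    return {node: rng.uniform(0.0, sim_time) for node in failed}

# 30% failures in a 500-node field over a 500 s run, as in the scenario above.
rng = random.Random(42)
failures = schedule_failures(list(range(500)), 0.30, 500.0, rng)
```

Passing an explicit `random.Random` instance keeps each of the repeated runs reproducible from its seed.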

Fig. 13 depicts the average energy dissipation per node considering node failures. As can be seen, the results were quite similar, as the idle time energy consumption practically dominates all simulations. As more node failures occur, more path reconstructions are performed, which adds extra overhead. However, as a result of having more node failures, fewer nodes generate traffic in the network, which explains the similar curves in the graph.

All experiments reported above were conducted with a fixed event packet rate of 10 event packets per second per source node. However, it is important to know how a varying event packet rate can impact average delay and event packet delivery ratio. As mentioned in Section 4.1, experiments with varying rates of 5, 10 and 15 events per second were also carried out. Fig. 14 shows that an increase in the event packet rate raised the observed latency, especially when 15 or more events per second were generated by the sources.

Fig. 13. Average dissipated energy with node failures.

Fig. 14. Average delay.

Fig. 15. Average delivery ratio.

This, of course, was expected since a rate of 15 event packets per second generates much more traffic and therefore more packet collisions and losses, as shown in Fig. 15, where the average delivery ratio decreases as more events per second are generated.

The sensor field considered here has only one sink, so that the algorithm could be better evaluated in a traffic jam situation. The nodes closer to the sink have to deal with a great number of packets per second, limiting the performance of the network and impacting its lifetime. One solution would be to deploy more sinks and distribute the load among them, but this is a subject of future work.

From the experiments, it can be seen that PEQ can provide low latency for event notification, fast broken path reconfiguration, and high reliability in the delivery of the event packets with low energy dissipation. Low latency was achieved by the use of the shortest path for the delivery of event packets. Fast subscriptions of new interests (for query-based scenarios) are provided by the concept of driven delivery of event packets, in which new subscriptions to a sensor region are sped up by using the inverse path used for event notification packets. This has an impact on energy saving, since less traffic is disseminated through the network for both event notification packets and broken path reconfiguration.

CPEQ, described above, has been devised to minimize latency even further, especially under high network data traffic (as is the case in emergency situations, such as fire, gas leaks, etc.). The next sections describe CPEQ’s effectiveness when compared to PEQ and the DD paradigm.

4.3. Performance evaluation of the CPEQ protocol

CPEQ was also implemented using the NS-2 simulator [29]. In the first set of simulation experiments, the network considered had 500 nodes. Each source node sent 2 event packets per second, the AGR_NTF time to live field was set to 2, and the percentage of aggregator nodes in the network was based on the studies of LEACH [17] and set to 5%. The percentage of source nodes in the network varied from 2% to 16%, so that it can better reflect an emergency situation, where many nodes detect events from the physical environment. These source nodes were randomly selected from the network. The interval for swapping aggregator nodes was set to 50 s. The 802.11 MAC layer was used for the simulations in NS-2. Table 2 summarizes the simulation parameters.

CPEQ was assessed through three metrics: average event packet delay, average event packet delivery ratio, and average dissipated energy. PEQ and DD were simulated under the same scenario and parameters. A confidence level of 95% was achieved with a 4% error level.
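The text does not say how the confidence figure was obtained. One common way to check such a target, shown here purely as an assumed illustration, is to compute the half-width of a normal-approximation 95% confidence interval (z = 1.96) over repeated runs and compare it with 4% of the mean. The function name `relative_half_width` and the sample delays are hypothetical.

```python
import math
import statistics

def relative_half_width(samples, z=1.96):
    """Half-width of a normal-approximation CI, as a fraction of the mean."""
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    return z * sem / mean

# Hypothetical per-run average delays (seconds) from five repetitions.
runs = [0.100, 0.102, 0.098, 0.101, 0.099]
error = relative_half_width(runs)
within_target = error <= 0.04
```

With only a handful of runs a Student t quantile would be more appropriate than z = 1.96, and the constant would need to be changed accordingly.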

4.4. Effectiveness of CPEQ

As outlined earlier, PEQ has shown low latency for event notification, especially when the number of source nodes is not too large. Therefore, a variation of PEQ, which we refer to as CPEQ, was developed to deal with a large number of source nodes. In this section, we assess the performance of CPEQ, and compare it with the PEQ and DD protocols. DD is data-centric, i.e., the data generated by the sensor nodes is named using attribute–value pairs. A node requests data by sending “interests” for named data. Data that matches the interest is then “drawn” down towards that node by selecting a single path or through multiple paths by using a low-latency tree. Each intermediate sensor receiving the interest must broadcast it at least once to set up the reverse path to the sink. The sensor specified by the interest sends back the data through several paths. The sink may reinforce the preferred path after the initial exploratory stage. Without location information, the interest must be broadcast globally. Intermediate nodes can cache or transform data, and may direct interests based on previously cached data [19].

Table 2
Simulation parameters

Parameters                        Value
Simulation time (s)               1000
Number of nodes                   500
Percentage of source nodes (%)    2–6
Percentage of aggregators (%)     5
Time to change aggregators (s)    50
Source data rate (eventMsgs/s)    2
Clustering time to live           2
Repairing event interval (s)      20
Radio range (m)                   20
Transmit energy (mW)              14.88
Receive energy (mW)               12.50
Dissipation in idle (mW)          12.36
Dissipation in sleep (mW)         0.016

Let us now turn to our results. During the course of our experiments, we used the same metrics as in Section 4.3. Recall that the average delay is the end-to-end delay observed between a source node and the sink. With low data flow, CPEQ is outperformed by the other protocols due to its extra delay: the aggregators have to wait to receive some data from the nodes in their clusters before performing an aggregation function on the data, in addition to the delay observed during the data delivery from the nodes to the aggregators. As the number of nodes producing events increases, as in a typical emergency situation, CPEQ is able to sustain a stable delay, because it reduces the number of packets sent through the network by performing a simple aggregation function on each aggregator. The nodes in PEQ and DD have to forward all data generated by the source nodes to the sink, which contributes to the increase in delay. The graph in Fig. 16 shows the stable delay observed in CPEQ and the abrupt increase in delay for PEQ and DD as the percentage of source nodes increases. The dynamic clustering setup (swapping of aggregators, cluster configuration, etc.) adds extra overhead to CPEQ. With just a few source nodes in the network, PEQ and DD provide better results in terms of energy dissipation.

However, with an increasing number of sources, which is expected in an emergency situation, CPEQ outperforms both protocols by reducing the data traffic and, consequently, the number of transmissions and receptions, resulting in less energy dissipation, as can be seen in Fig. 17. It is worth noting that none of the three simulated protocols saves energy by turning nodes off and, since idle radio modules spend almost as much energy as when receiving transmissions, the idle time energy utilization dominates all simulations.

Fig. 16. Average event delivery delay.

Fig. 17. Average dissipated energy per node.

Fig. 18. Average event delivery ratio.
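A back-of-the-envelope calculation with the power figures of Table 2 shows why idle time dominates. The sketch assumes energy is simply power multiplied by time spent in each radio state. The split of a 1000 s run into 50 s transmitting, 50 s receiving and 900 s idle is a hypothetical example, not a measured profile.

```python
# Power draw per radio state in mW, taken from Table 2.
POWER_MW = {"tx": 14.88, "rx": 12.50, "idle": 12.36, "sleep": 0.016}

def dissipated_joules(seconds_in_state):
    """Energy in joules: sum of power (mW) x time (s), divided by 1000."""
    return sum(POWER_MW[s] * t for s, t in seconds_in_state.items()) / 1000.0

always_idle = dissipated_joules({"idle": 1000.0})
busy = dissipated_joules({"tx": 50.0, "rx": 50.0, "idle": 900.0})
idle_share = dissipated_joules({"idle": 900.0}) / busy
```

Under these assumptions a node that communicates for 10% of the run dissipates only about 1% more energy than one that stays idle throughout, so differences between protocols are hard to see unless idle consumption is factored out.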

The clustering and aggregation approaches of CPEQ also have an impact on the packet delivery success. With just a few source nodes, CPEQ behaves like the PEQ protocol, but when there is a great amount of data traffic, CPEQ shows better performance because only its aggregator nodes send data to the sink, reducing data traffic and, therefore, reducing packet collisions. Fig. 18 presents the results for the data delivery ratio.


Fig. 19. PEQ energy map.

Fig. 20. CPEQ energy map.

In order to have a better perception of the nodes’ utilization and of the dissipated energy distribution, the protocols were simulated with a grid network scenario of 625 (25 × 25) nodes. The idle state energy consumption was set to zero, so that it did not interfere with the measurements. After 1000 s of simulation, a snapshot of the dissipated energy was taken, resulting in the energy map of the network. Fig. 19 shows the PEQ energy map.
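Turning a per-node energy snapshot into a map is a matter of reshaping a flat list into a grid. The sketch below assumes node IDs are assigned in row-major order, which the paper does not specify, and uses a hypothetical 3 × 3 field in place of the 25 × 25 grid.

```python
def energy_map(energy_by_node, side):
    """Arrange per-node energies (row-major node IDs) into a side x side grid."""
    return [[energy_by_node[r * side + c] for c in range(side)]
            for r in range(side)]

def hottest_cell(grid):
    """Return (row, col) of the node that dissipated the most energy."""
    cells = ((r, c) for r in range(len(grid)) for c in range(len(grid[r])))
    return max(cells, key=lambda rc: grid[rc[0]][rc[1]])

# Hypothetical snapshot in joules for a 3 x 3 field.
snapshot = [0.1, 0.2, 0.1, 0.3, 0.9, 0.4, 0.1, 0.2, 0.1]
grid = energy_map(snapshot, 3)
hotspot = hottest_cell(grid)
```

The resulting nested list can be passed to any heat-map plotting routine to produce a picture with a color bar.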

It can be observed that the nodes closer to the sink were used more heavily and the energy dissipation was high in that area. It is also important to note that many nodes were rarely used. This is due to the unique paths the source nodes use to forward data to the sink. The CPEQ protocol showed a slightly better dissipated energy distribution, as seen in Fig. 20.

The color bar represents the dissipated energy in Joules. The periodic random distribution of aggregators is responsible for this effect. With only one static sink, the nodes closer to it are more used and therefore dissipate more energy. CPEQ dissipated less energy due to its data aggregation feature performed by the aggregators, which reduces data traffic.

In the second set of simulation experiments, the percentage of source nodes was fixed at 14%, and network size ranged from 100 to 500 nodes. All the other parameters were left the same as in Table 2. These experiments were carried out to show how the protocols behave under different network sizes and, therefore, to express their scalability. Fig. 21 shows that by increasing the network size, PEQ and DD take longer to deliver a packet to the sink. As mentioned earlier in this paper, CPEQ reduces the number of transmitted packets, and thus reduces traffic and packet collisions, which directly affects end-to-end delay and the other metrics.

Fig. 21. Average event delivery delay.

Fig. 22. Average dissipated energy.

The average dissipated energy is shown in Fig. 22. All three protocols had similar results, with a slightly better performance of CPEQ as the network becomes larger. Packet losses are minimized in CPEQ due to its clustering and data aggregation features, because not all packets produced by the source nodes have to be relayed to the sink. Fig. 23 shows that CPEQ presents the highest delivery ratio (close to 1) even when increasing the number of nodes.

5. Related work

The main motivation for the work described here is to meet the challenging requirements posed simultaneously by different types of scenarios for critical conditions monitoring applications: periodic, event-driven and query-based. These scenarios demand fast path setup for query-based subscriptions, such as localization of people in a building during a fire emergency, low latency for event packet delivery and reliability (high event packet delivery ratio), with minimum energy dissipation. Because energy saving is an important issue to preserve and extend a sensor network lifetime, various energy-saving solutions were reported in the literature. It is well known that a sensor in idle state consumes almost the same amount of energy as when it is awake, and that saving energy means turning off communications completely (sleep mode) [4,19]. STEM [28] provides a good solution for energy saving when the sensors have to be switched from the sleep mode to the data path mode (awake state) from time to time, i.e., when the application scenario is basically periodic. However, when a state switch from sleep to awake mode has to be made often, such as when different types of subscriptions are requested (query-based or event-driven), the cost of switching could outweigh the energy savings. Moreover, a dual radio setup is used for separating the data path frequency from the wakeup frequency, which can add complexity to the sensor node hardware. PEQ/CPEQ adopts coordinate information (which is set up in a pre-configuration phase) to know the location of the nodes and uses this information to choose the shortest route to the destination (sink or a cluster head node). Similarly to PEQ/CPEQ, Virtual Backbone for Energy Saving in Wireless Sensor Networks (ViBES) also uses hop counter information to choose the closest sinks. The idea behind ViBES is to use only a small subset of the nodes for the delivery of data, by setting up a delivery “backbone” and preserving most of the nodes only for target detection [VIBES]. Thus, under ViBES, the sensor network is divided into primary ViBES nodes (that are part of the backbone) and sensing nodes. The main difference between ViBES and PEQ/CPEQ is how the primary ViBES/aggregator nodes are selected, and how fault tolerance is provided. In PEQ/CPEQ, the network nodes can trigger fault tolerance when they detect a node failure, in which case the nodes find, cooperatively, the fastest path, with the smallest possible number of transmissions. With ViBES, multiple paths are set up for the delivery so that, if one fails, another will deliver the data. The problem with this solution is that more traffic is generated in the network through the multiple paths.

Fig. 23. Event delivery ratio.

The Geographic Random Forwarding (GeRaF) is also a multi-hop transmission scheme, based on geographical location, which selects the relaying nodes through a “receiver contention scheme” [GERAF]. In GeRaF, when a node wishes to transmit a message, it broadcasts to all active nodes in its coverage area; each of these nodes calculates its distance from the destination and assesses how adequate it is to act as relay. For that, the coverage area has to be divided into relay and non-relay regions. The relay region is further divided into priority regions, which are based on the distance from the destination. Although it is an interesting idea, the selection of the relaying nodes can be a complex and time-consuming task if the topology changes dynamically. EAD [3], a network-level energy-aware routing protocol, uses novel concepts of neighboring broadcast scheduling and distributed competition among neighbors based on residual energy in order to set up a backbone for reliable delivery of notifications.

It is a great challenge to meet energy-saving and fault-tolerance requisites simultaneously, because these requirements can be conflicting. For instance, the multi-path version of the DD paradigm [16] uses multiple routing paths to transfer data, so that node failures in one path can be overcome by sending the data through multiple paths, which increases energy consumption and can cause packet collisions. ARRIVE [20] and INSENS [9] use variations of this concept to cope with node failures. The PFR protocol [8] is inspired by the probabilistic multi-path solution of the DD paradigm. Basically, it favors transmissions towards the sink using nodes within a zone around an imaginary line connecting a source node to the sink. The protocol forms a “thin zone” of nodes to propagate the data to the sink. The capacity to estimate the direction of a received transmission can increase the node energy consumption and cost, as the node needs to have a magnetometer module. An extended version of PFR (SW-PFR) introduces sleep–awake periods in order to save energy [26]. The Variable Transmission Range Protocol (VTRP) [2] copes with fault tolerance and energy saving by allowing the data transmission range to vary in such a way as to overcome node failures or obstacles. Network lifetime is increased since critical sensors (those that are close to the sink) are not overused; however, more complex hardware has to be deployed.

Algorithms that deal with node failures include SPIN (SPMS) [17], which uses meta-data exchange prior to the exchange of data to decide if a node requires the data. It uses shortest-distance multi-hop routing for the request and data transfers, saving energy and reducing end-to-end delay. The SPMS fault-tolerance mechanism consists of keeping both the current and the second shortest route in the routing table. When node failures occur in the current shortest path, the second path can be chosen. However, in an emergency situation such as a fire, many nodes can fail, including nodes from pre-defined and stored paths.
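The limitation of a stored backup route can be made concrete with a small sketch. It models only the fallback idea described above (try the stored routes in order and use the first one whose nodes are all alive). It is not SPMS itself, and the function name `pick_route` and the routes are hypothetical.

```python
def pick_route(routes, alive):
    """Return the first stored route whose nodes are all alive, else None."""
    for route in routes:
        if all(node in alive for node in route):
            return route
    return None

# Stored shortest and second-shortest routes from a hypothetical node "a".
stored = [["a", "b", "sink"], ["a", "c", "d", "sink"]]
after_one_failure = pick_route(stored, {"a", "c", "d", "sink"})
after_many_failures = pick_route(stored, {"a", "sink"})
```

A single failure on the primary route is absorbed by the backup, but once nodes on both stored routes have failed nothing usable remains, which is the situation expected in a fire.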

In order to further reduce traffic volume, energy-efficient hierarchical routing protocols such as LEACH [17] build up clusters of sensor nodes based on their signal strength and use cluster heads as routers to the sink. The communication with the sink is performed only by the cluster heads, which are rotated among the sensor nodes so that nodes are stressed uniformly. The clustering performed by LEACH reduces traffic and energy dissipation. However, because the cluster heads transmit directly to the sink (single-hop), the network may not scale, since for a larger sensor network, the cluster heads’ communication power may not be enough to reach the sink.

PEQ/CPEQ were compared with the well-known DD for energy, fault tolerance and latency performance. The main differences between our solution and DD are as follows. Basically, DD has to flood the network, and then exploratory event messages are sent to the sink. The sink then increases the packet rate for some of the paths that the exploratory event messages found to be better. PEQ does not use this mechanism, as it generates more traffic and delay. PEQ makes use of fast subscriptions using the inverse path, as described in Section 2. CPEQ is a variation of PEQ that deals with higher-density networks. It was devised to minimize latency even further through the introduction of clustering and data aggregation features, in which not all packets produced by the source nodes have to be relayed to the sink: only its aggregator nodes send data to the sink, reducing data traffic and, therefore, packet collisions, which directly affects end-to-end delay and the other metrics. CPEQ uses PEQ for routing packets to the sink so that fault tolerance and fast and reliable delivery are preserved, an essential requisite for critical conditions monitoring applications. Unlike LEACH, CPEQ transmission to the sink is multi-hop, so it can support large-scale sensor networks.

Other clustering-based protocols such as APTEEN [23], PEGASIS [22] and energy-aware routing for cluster-based sensor networks [31] present good solutions to provide near-optimum energy dissipation and reduced latency, but they can be complex. Marcucci et al. [24] demonstrate the benefits of hierarchical routing protocol solutions. The strength of CPEQ is its simplicity and effectiveness for data transmission and node processing compared to those approaches, which makes CPEQ a potential solution to meet the requirements of critical conditions monitoring applications.

6. Conclusions

Sensor networks are increasingly being used for continuous sensing, event detection, location sensing as well as micro-sensing in application areas ranging from health care to transportation, finance, defense, food, government, manufacturing, fire fighting, and much more. One of the most appealing applications is security surveillance and supervision of context-aware physical environments that can be subjected to critical conditions such as fire, leaking of toxic gases and explosions. A great challenge to these networks is to provide a fast, reliable and fault-tolerant channel for event packet diffusion, which meets the requirements of query-based, event-driven and periodic sensor network application scenarios, even in the presence of emergency conditions that can lead to node failures and path disruption to the sink that receives those event packets.

In this paper, we propose PEQ, a novel wireless network algorithm that uses ordinary sensor node hardware with short radio range to meet periodic, event-driven and query-based interests. PEQ uses a small amount of information for the routing mechanism (basically the hop level and routing table). If a failure is detected, unlike other solutions that use a three-way protocol, PEQ broadcasts a SEARCH packet to its neighbors, and receives a reply with their hop level and identification. The neighbor with the lower hop level is chosen as the new destination, and loop back is avoided. PEQ provides low latency for event notification, dynamic broken path reconfiguration, and high reliability in the delivery of the event packets with low energy dissipation. Low latency is achieved by the use of the shortest path for the delivery of event packets. New subscriptions to a sensor region are sped up by using the reverse path used for event notifications. Individual nodes, instead of a sink-based mechanism, trigger the fault-tolerance mechanism locally.
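The local repair step summarized above can be sketched as a choice among the replies to a SEARCH broadcast. The sketch reads "lower hop level" as the lowest level among the alive neighbors that answered and leaves out loop avoidance entirely. Both points are simplifying assumptions, and `repair_next_hop` and the sample replies are hypothetical.

```python
def repair_next_hop(replies):
    """Choose a new next hop from SEARCH replies.

    replies maps neighbor ID -> hop level reported by that (alive) neighbor.
    Returns None when no neighbor answered.
    """
    if not replies:
        return None
    return min(replies, key=replies.get)

# Hypothetical replies received after the current next hop stopped responding.
new_hop = repair_next_hop({"n1": 3, "n2": 2, "n3": 4})
isolated = repair_next_hop({})
```

Because the decision uses only information held by immediate neighbors, the repair does not involve the sink, in line with the local triggering described above.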

In order to decrease latency even further, especially under high-network data traffic conditions, a variation of PEQ was devised: CPEQ, a cluster-based routing protocol that groups sensor nodes to efficiently relay the sensed data to the sink. In the CPEQ protocol, nodes with more residual energy are selected as aggregator nodes that relay data to the sink, uniformly distributing energy dissipation among the nodes. Important metrics were evaluated and compared to the DD paradigm and the PEQ protocol. As the number of nodes gets larger (above 500), the CPEQ average delay is half the delay presented by DD (200 ms for DD versus 100 ms for CPEQ and 150 ms for PEQ). Delivery ratio is also better for CPEQ as the network size increases (nearly 97% of delivery success for 500 nodes versus 70% for DD and nearly 80% for PEQ). In terms of energy dissipation, CPEQ consumes less energy due to its data aggregation feature, which reduces data traffic. The strength of CPEQ is its simplicity and effectiveness in the delivery of event packets, which makes it a good candidate to meet the constraints and requirements of event packet delivery in critical situation monitoring applications.
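The effect of selecting aggregators by residual energy can be illustrated with a small Python sketch. It only demonstrates the selection rule stated above and is not the CPEQ implementation. The function names, the fixed per-round `relay_cost`, and the tie-break on the smaller node id are hypothetical choices made for the example; how candidate nodes are grouped into a cluster is not modeled.

```python
from typing import Dict, List, Mapping, Optional


def select_aggregator(residual_energy: Mapping[int, float]) -> Optional[int]:
    """Return the id of the node with the most residual energy.

    Ties are broken in favor of the smaller id (an assumed tie-break).
    """
    if not residual_energy:
        return None
    return max(residual_energy, key=lambda n: (residual_energy[n], -n))


def rotate_aggregators(
    residual_energy: Mapping[int, float],
    rounds: int,
    relay_cost: float,
) -> List[int]:
    """Select an aggregator per round and charge it a fixed relay cost.

    relay_cost is a hypothetical constant standing in for the energy an
    aggregator spends relaying data to the sink during one round.
    """
    energy: Dict[int, float] = dict(residual_energy)  # leave the input untouched
    chosen: List[int] = []
    for _ in range(rounds):
        aggregator = select_aggregator(energy)
        if aggregator is None:
            break
        chosen.append(aggregator)
        energy[aggregator] -= relay_cost
    return chosen


schedule = rotate_aggregators({1: 1.0, 2: 1.0, 3: 1.0}, rounds=3, relay_cost=0.3)
```

With three nodes of equal initial energy, the aggregator role passes from node 1 to node 2 to node 3 over three rounds, because each node drops below its peers after relaying. This rotation is the mechanism by which energy dissipation is spread across the nodes instead of draining a single relay.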

References

[1] K. Akkaya, M. Younis, A survey of routing protocols in wireless sensor networks, Elsevier Ad Hoc Network J. 3/3 (2005) 325–349.

[2] A. Boukerche, I. Chatzigiannakis, S. Nikoletseas, A new energy efficient and fault-tolerant protocol for data propagation in smart dust networks using varying transmission range, in: Annual Simulation Symposium, Proceedings of the 37th ACM/IEEE Annual Simulation Symposium (ANSS), 2004.

[3] A. Boukerche, X. Cheng, J. Linus, Energy-aware data-centric routing in microsensor networks, in: MSWiM’03, September 19, 2003, San Diego, CA, USA, 2003.

[4] A. Boukerche, I. Nikoletseas, Protocols for data propagation in wireless sensor networks, in: M. Guizani (Ed.), Wireless Communications Systems and Networks, Kluwer, Dordrecht, 2004, pp. 23–51 (Chapter 2).

[5] N. Xu, S. Rangwala, K.K. Chintalapudi, D. Ganesan, A. Broad, R. Govindan, D. Estrin, A wireless sensor network for structural monitoring, in: Proceedings of the Second International Conference on Embedded Networked Sensor Systems, Baltimore, MD, USA, November 03–05, 2004, SenSys ’04, ACM Press, New York, NY, pp. 13–24.

[6] A. Cerpa, J. Elson, M. Hamilton, J. Zhao, Habitat monitoring: application driver for wireless communications technology, in: ACM SIGCOMM’2000, Costa Rica, April 2001.

[7] Y.J. Zhao, R. Govindan, D. Estrin, Computing aggregates for monitoring wireless sensor networks, in: Proceedings of the First IEEE International Workshop on Sensor Network Protocols and Applications (SNPA ’03), Anchorage, AK, USA, 2003.

[8] A. Boukerche, Handbook on Algorithms for Wireless Networking and Computing, CRC/Hall, 2005.

[9] J. Deng, R. Han, S. Mishra, INSENS: intrusion-tolerant routing in wireless sensor networks, Poster Paper, in: The 23rd IEEE International Conference on Distributed Computing Systems (ICDCS 2003), Providence, RI, May, 2003.

[10] G. Werner-Allen, J. Johnson, M. Ruiz, J. Lees, M. Welsh, Monitoring volcanic eruptions with a wireless sensor network, in: EWSN’05, Istanbul, Turkey, January 2005.

[11] C. Efthymiou, S. Nikoletseas, J. Rolim, Energy balanced data propagation in wireless sensor networks, in: Proceedings of the Fourth International Workshop on Algorithms for Wireless, Mobile, Ad-Hoc and Sensor Networks (WMAN ’04), IPDPS 2004, 2004.

[12] E. Ardizzone, M. La Cascia, G.L. Re, M. Ortolani, An integrated architecture for surveillance and monitoring in an archeological site, in: Proceedings of the Third ACM International Workshop on Video Surveillance & Sensor Networks, Hilton, Singapore, November 11, 2005, VSSN’05, ACM Press, New York, NY, pp. 79–86.

[13] B. Przydatek, D. Song, A. Perrig, SIA: secure information aggregation in sensor networks, in: Proceedings of the First International Conference on Embedded Networked Sensor Systems, Los Angeles, California, USA, November 05–07, 2003, SenSys ’03, ACM Press, New York, NY, pp. 255–265.

[14] P.T. Eugster, P. Felber, R. Guerraoui, A. Kermarrec, The many faces of publish/subscribe, ACM Comput. Survey 35 (2) (2003) 114–131.

[15] P.T. Eugster, R. Guerraoui, J. Sventek, in: E. Bertino (Ed.), Distributed Asynchronous Collections: Abstractions for Publish/Subscribe Interaction, ECOOP 2000, Lecture Notes in Computer Science, vol. 1850, Springer, Berlin, Heidelberg, 2000, pp. 252–276.

[16] D. Ganesan, R. Govindan, S. Shenker, D. Estrin, Highly resilient, energy efficient multipath routing in wireless sensor networks, MC2R 1 (2) (2002).

[17] W. Heinzelman, A. Chandrakasan, H. Balakrishnan, Energy-efficient communication protocol for wireless sensor networks, in: The Proceeding of the Hawaii International Conference System Sciences, Hawaii, January 2000.

[18] J. Hill, R. Szewczyk, A. Woo, S. Hollar, D. Culler, K. Pister, System architecture directions for networked sensors, in: Proceedings of ACM ASPLOS IX, November 2000.

[19] C. Intanagonwiwat, R. Govindan, D. Estrin, Directed diffusion: a scalable and robust communication paradigm for sensor networks, in: Proceedings of the Sixth ACM/IEEE International Conference on Mobile Computing, 2000.

[20] C. Karlof, Y. Li, J. Polastre, Arrive: algorithm for robust routing in volatile environments, Technical Report UCBCSD-02-1233, Computer Science Department, University of California at Berkeley, May 2002.

[21] N. Shrivastava, C. Buragohain, D. Agrawal, S. Suri, Medians and beyond: new aggregation techniques for sensor networks, in: Proceedings of the Second International Conference on Embedded Networked Sensor Systems, Baltimore, MD, USA, November 03–05, 2004, SenSys ’04, ACM Press, New York, NY, pp. 239–249.

[22] S. Lindsey, C.S. Raghavendra, PEGASIS: power efficient GAthering in sensor information systems, in: The Proceedings of the IEEE Aerospace Conference, Big Sky, Montana, March 2002.

[23] A. Manjeshwar, D.P. Agrawal, APTEEN: a hybrid protocol for efficient routing and comprehensive information retrieval in wireless sensor networks, in: The Proceedings of the Second International Workshop on Parallel and Distributed Computing Issues in Wireless Networks and Mobile computing, Ft. Lauderdale, FL, April 2002.

[24] A. Marcucci, M. Nati, C. Petrioli, A. Vitaletti, Directed diffusion light: low overhead data dissemination in wireless sensor networks, in: Proceedings of IEEE VTC 2005, Spring, Stockholm, Sweden, May 30–June 1, 2005.

[25] R. Min, M. Bhardwaj, S. Cho, A. Sinha, E. Shih, A. Wang, A. Chandrakasan, Low-power wireless sensor networks, in: VLSI Design 2001, January 2001.

[26] S. Nikoletseas, I. Chatzigiannakis, A. Antoniou, H. Euthimiou, A. Kinalis, G. Mylonas, Energy efficient protocols for sensing multiple events in smart dust networks, in: Proceedings of the 37th Annual ACM/IEEE Simulation Symposium (ANSS’04), IEEE Computer Society Press, Silver Spring, MD, 2004, pp. 15–24.

[27] A. Savvides, C.-C. Han, M. Srivastava, Dynamic fine grained localization in ad-hoc networks of sensors, in: MobiCom 2001, Rome, Italy, July 2001, pp. 166–179.

[28] C. Schurgers, V. Tsiatsis, S. Ganeriwal, M. Srivastava, Topology management for sensor networks: exploiting latency and density, in: Proceedings of the MOBICOM’2002, 2002.

[29] The Network Simulator ns-2, http://www.isi.edu/nsman/ns.

[30] T. He, B.M. Blum, J.A. Stankovic, T. Abdelzaher, AIDA: adaptive application-independent data aggregation in wireless sensor networks, Trans. Embedded Comput. Syst. 3 (2) (May 2004) 426–457.

[31] M. Younis, M. Youssef, K. Arisha, Energy-aware routing in cluster-based sensor networks, in: The Proceedings of the 10th IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS2002), Fort Worth, TX, October 2002.

Azzedine Boukerche is a Full Professor and holds a Canada Research Chair position at the University of Ottawa. He is the Founding Director of PARADISE Research Laboratory at the University of Ottawa. He spent a year at the JPL/NASA–California Institute of Technology, where he contributed to a project centered on the specification and verification of the software used to control interplanetary spacecraft.

His current research interests include wireless ad hoc and sensor networks, wireless multimedia, wireless networking, distributed and mobile computing, distributed interactive simulation, QoS service provisioning, and context-aware physical environments for emergency preparedness, where wireless sensor networks and collaborative virtual simulation systems are integrated for accurate monitoring and visualization of physical environments subject to emergency and harsh conditions.

Richard Werner Nelem Pazzi is a Computer Science Ph.D. Candidate at the University of Ottawa. He received his B.Sc. and M.Sc. in 2002 and 2004, respectively, from the Department of Computer Sciences at the Federal University of Sao Carlos, Brazil. His research interests include protocol design for wireless sensor networks for emergency situations, fault tolerant wireless sensor networks, large-scale distributed simulations and computer graphics. Recently, he has been focusing on designing protocols for remote virtual environments as well as multimedia streaming protocols over wireless networks.

Regina B. Araujo is an Associate Professor at the Computer Science Department of Federal University of São Carlos, SP, Brazil.

Araujo’s research interests are context-aware physical environments for emergency preparedness, where wireless sensor networks, distributed simulations, context-aware computing and collaborative virtual environments are integrated for accurate monitoring and visualization of physical environments subject to emergency conditions.

Prof. Araujo was a Program Co-Chair for the First ACM Workshop on QoS and Security for Wireless Networks (Q2SWinet 2005). In 2005, she was a Visiting Professor at the PARADISE Research Laboratory, University of Ottawa. She serves as the Program Co-Chair for the Second ACM International Workshop on Wireless Multimedia Networking and Performance Modeling.