20
Inf Syst Front (2011) 13:481–500 DOI 10.1007/s10796-009-9211-y TMS-RFID: Temporal management of large-scale RFID applications Xue Li · Jing Liu · Quan Z. Sheng · Sherali Zeadally · Weicai Zhong Published online: 31 July 2009 © Springer Science + Business Media, LLC 2009 Abstract In coming years, there will be billions of RFID tags living in the world tagging almost everything for tracking and identification purposes. This phenom- enon will impose a new challenge not only to the network capacity but also to the scalability of event processing of RFID applications. Since most RFID applications are time sensitive, we propose a notion of Time To Live (TTL), representing the period of time that an RFID event can legally live in an RFID data management system, to manage various temporal event patterns. TTL is critical in the “Internet of Things” for handling a tremendous amount of partial event- tracking results. Also, TTL can be used to provide X. Li (B ) School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Australia e-mail: [email protected] J. Liu · W. Zhong Institute of Intelligent Information Processing, Xidian University, Xi’an, China J. Liu e-mail: [email protected] W. Zhong e-mail: [email protected] Q. Z. Sheng School of Computer Science, The University of Adelaide, Adelaide, Australia e-mail: [email protected] S. Zeadally Department of Computer Science and Information Technology, University of the District of Columbia, Washington, D.C., USA e-mail: [email protected] prompt responses to time-critical events so that the RFID data streams can be handled timely. We di- vide TTL into four categories according to the general event-handling patterns. Moreover, to extract event se- quence from an unordered event stream correctly and handle TTL constrained event sequence effectively, we design a new data structure, namely Double Level Sequence Instance List (DLSIList), to record interme- diate stages of event sequences. On the basis of this, an RFID data management system, namely Temporal Management System over RFID data streams (TMS- RFID), has been developed. This system can be con- structed as a stand-alone middleware component to manage temporal event patterns. We demonstrate the effectiveness of TMS-RFID on extracting complex tem- poral event patterns through a detailed performance study using a range of high-speed data streams and various queries. The results show that TMS-RFID has a very high throughput, namely 170,000–870,000 events per second for different highly complex continuous queries. Moreover, the experiments also show that the main structure to record the intermediate stages in TMS-RFID does not increase exponentially with the number of events. These results demonstrate that TMS- RFID not only supports high processing speeds, but is also highly scalable. Keywords RFID data management · TTL · RFID event processing · Unordered event stream 1 Introduction The size and characteristics of RFID (Radio Frequency Identification) data pose many new challenges to the

TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

Inf Syst Front (2011) 13:481–500DOI 10.1007/s10796-009-9211-y

TMS-RFID: Temporal management of large-scaleRFID applications

Xue Li · Jing Liu · Quan Z. Sheng ·Sherali Zeadally · Weicai Zhong

Published online: 31 July 2009© Springer Science + Business Media, LLC 2009

Abstract In coming years, there will be billions ofRFID tags living in the world tagging almost everythingfor tracking and identification purposes. This phenom-enon will impose a new challenge not only to thenetwork capacity but also to the scalability of eventprocessing of RFID applications. Since most RFIDapplications are time sensitive, we propose a notion ofTime To Live (TTL), representing the period of timethat an RFID event can legally live in an RFID datamanagement system, to manage various temporal eventpatterns. TTL is critical in the “Internet of Things”for handling a tremendous amount of partial event-tracking results. Also, TTL can be used to provide

X. Li (B)School of Information Technology and ElectricalEngineering, The University of Queensland,Brisbane, Australiae-mail: [email protected]

J. Liu · W. ZhongInstitute of Intelligent Information Processing,Xidian University, Xi’an, China

J. Liue-mail: [email protected]

W. Zhonge-mail: [email protected]

Q. Z. ShengSchool of Computer Science, The University of Adelaide,Adelaide, Australiae-mail: [email protected]

S. ZeadallyDepartment of Computer Science and InformationTechnology, University of the District of Columbia,Washington, D.C., USAe-mail: [email protected]

prompt responses to time-critical events so that theRFID data streams can be handled timely. We di-vide TTL into four categories according to the generalevent-handling patterns. Moreover, to extract event se-quence from an unordered event stream correctly andhandle TTL constrained event sequence effectively,we design a new data structure, namely Double LevelSequence Instance List (DLSIList), to record interme-diate stages of event sequences. On the basis of this,an RFID data management system, namely TemporalManagement System over RFID data streams (TMS-RFID), has been developed. This system can be con-structed as a stand-alone middleware component tomanage temporal event patterns. We demonstrate theeffectiveness of TMS-RFID on extracting complex tem-poral event patterns through a detailed performancestudy using a range of high-speed data streams andvarious queries. The results show that TMS-RFID hasa very high throughput, namely 170,000–870,000 eventsper second for different highly complex continuousqueries. Moreover, the experiments also show that themain structure to record the intermediate stages inTMS-RFID does not increase exponentially with thenumber of events. These results demonstrate that TMS-RFID not only supports high processing speeds, but isalso highly scalable.

Keywords RFID data management · TTL ·RFID event processing · Unordered event stream

1 Introduction

The size and characteristics of RFID (Radio FrequencyIdentification) data pose many new challenges to the

Page 2: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

482 Inf Syst Front (2011) 13:481–500

current data management systems. RFID events havetheir own characteristics that cannot be supported bytraditional event systems (Chawathe et al. 2004; Wanget al. 2006; Wu et al. 2006; Sheng et al. 2008). RFIDApplication Level Event (ALE) standard proposed byEPCglobal,1 a common interface to process raw RFIDevents, also emphasizes the importance of RFID datastream processing. RFID data are time-dependent, dy-namically changing, in large volumes, and are carryingimplicit semantics. To support these characteristics, ageneral RFID data processing framework is requiredto process high volume RFID data streams in real time,and automatically transform the physical RFID obser-vations into the virtual counterparts linked to businessapplications (Chawathe et al. 2004; Sheng et al. 2008;Franklin et al. 2005).

Among various RFID applications, simple and com-plex event detection plays an important role. Severalevent processing systems that execute complex eventqueries over real-time streams of RFID readings havebeen proposed in recent years (Wang et al. 2006; Wuet al. 2006; Bai et al. 2007; Wang and Liu 2005). Thecomplex event queries in these systems can filter andcorrelate events to match with specific patterns, andtransform the relevant events into high-level businessevents to be used by external applications.

However, most RFID applications have time restric-tions on target events. An event is valid only if ithappens within or after a time limit. For example,in general cases, it is required in an airport that fora baggage to be loaded on time, it must arrive ata specified place 30 min before the scheduled flighttakes off. Unfortunately, this kind of time-restrictionissues are largely ignored by most of the current eventprocessing systems and left as separate or individual ap-plication problems. SASE (Wu et al. 2006) can extracta target event sequence according to the prescribedsequence order in a query, or check whether a targetevent finishes within a prescribed period of time. ButSASE cannot restrict the temporal distance betweenany two successive events. Although Wang et al. (2006)proposed several temporal operators and related con-structs that can restrict the temporal distance betweentwo events, these operators can only detect a sequencewith two events or the same type of events.

To handle various temporal event patterns, both sim-ple and complex, with time restrictions, it is importantto develop a novel RFID data management system.Since temporal event patterns restrict that events canonly be valid within a certain period of time, we identify

1http://www.epcglobalinc.org

a new time notion, namely Time To Live (TTL), tocover all kinds of complex temporal event patterns, in-cluding those that can be processed in existing systems.

In the Internet, TTL is an important concept thathelps the Internet to discard the datagrams that maynot be able to reach their destinations. RFID is widelybelieved as a promising technology to create an “Inter-net of Things” in the near future (Gershenfeld et al.2004). As a result, we suggest that TTL in RFID ap-plications should become an important notion to helpthe “Internet of Things” to be free from a large amountof partial event-tracking results and make prompt re-sponses to time-critical events.

All available RFID systems have a time notion, i.e.,timestamps, which are assigned to RFID data by RFIDreaders when tags are read. Timestamps indicate thetime points that events occurred in the real world, whileTTL indicates the time restrictions that target eventsshould satisfy. Thus, in addition to timestamps, weadvocate that TTL should become another importanttime notion for the RFID data management.

When processing complex events, it is general thatthe order of arriving events may not match with theorder of the occurrence of the events in the realworld. The problem, named as Unordered Event Stream(UnES), can be caused by network routing delay or byan arbitrary selection of records when a single tag isread by multiple readers simultaneously. An analogousproblem can be found in the Internet communicationswhen a datagram arrives at the system applicationlevel being out of sequence. Existing works address theUnES problem in two ways: one is to simply ignorethe problem by assuming that the events enter into thesystem with the same order of their occurrences in thereal world, such as SASE (Wu et al. 2006). The otheris to sort events according to their timestamps beforeextracting complex events patterns from them, such asDemers et al. (2006).

Both approaches, however, have disadvantages. Forthe first approach, since RFID readers are widely dis-tributed and each one may have a delay, it is obviousthat events cannot always enter into the system in theorder of their occurrences. Moreover, RFID tags canbe read by different readers simultaneously, but thereaders may send these primitive events to the systemat different times. Also, even after the events enter intothe system, they may still be subjected to the queuing ofdifferent processes. Clearly, ignoring the UnES prob-lem is impractical to most RFID applications. For thesecond approach, they require the time point that anevent enters into the system must be no later than abound, cope with all applications by the same bound.If there are many applications simultaneously handled

Page 3: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

Inf Syst Front (2011) 13:481–500 483

by the system, this single bound cannot be sufficient todeal with the combined effect of all incoming data.

To solve the UnES problem, we have developed adata structure, Double Level Sequence Instance List(DLSIList), to record the unordered event streams sothat the system can extract sequence patterns from anunordered event stream. To avoid the size of the inter-mediate stages growing exponentially, an algorithm hasalso been developed to maintain and update DLSIListby using TTL.

Our contributions in this paper can be summarizedas follows:

– A notion of TTL: TTL is a concept used to en-force the time restrictions in RFID applications.The main advantage of TTL is that it allows theconstruction of a generic mechanism to handle allkinds of time restrictions. In addition, we designand illustrate a variety of TTL queries that mayappear in different RFID applications.

– An effective solution to the UnES problem based onTTL: A new data structure, namely DLSIList, isdesigned based on TTL to mitigate the exponentialgrowth of partial-events tracking results, as well asto extract the complex events effectively.

– A prototype system for handling TTL queries:We have developed a prototype system, Tempo-ral Management System over RFID data streams(TMS-RFID) that implements the ideas proposedin this paper. TMS-RFID has been developed asa stand-alone component that is adaptable fordifferent event processing systems. We also demon-strate the effectiveness and efficiency of TMS-RFID in handling large-scale, high-speed RFIDdata streams.

The remainder of this paper is organized as follows.Section 2 overviews related work. Sections 3 and 4introduce the concept of TTL and the TTL query lan-guage respectively. Section 5 describes an algorithmbased on TTL for solving the unordered event streamproblem. Section 6 presents the system architecture ofTMS-RFID and Section 7 reports the results of a de-tailed performance study. Finally, Section 8 concludesthis paper.

2 Related work

Although RFID technology has existed for more than50 years (Landt 2005), it poses many new challenges fordata processing and management in recent years whenRFID tags are applied in large-scale applications, such

as missed and unreliable RFID readings, redundantRFID data, in-flood of RFID data, spatial and temporalmanagement of RFID data. Chawathe et al. (2004)present a brief introduction to RFID technology andhighlight several data management challenges. The im-portance of event processing for RFID data is pointedout by Palmer (2009).

Wu et al. (2006) propose a stream-based RFIDevent processing system, namely SASE. SASE can ef-ficiently execute monitoring queries over RFID datastreams, and is the first proposed model for RFIDevent processing. However, SASE cannot control timeintervals between two successive events. Since manyapplications require two successive primitive events in atarget complex event satisfying some time intervals, theability to control such intervals is extremely important.Moreover, SASE assumes that all events are totallyordered by their timestamps. Such an assumption istoo restrictive and would not be true for most RFIDapplications.

Wang et al. (2006), Bai et al. (2007) and Wangand Liu (2005) address the problem of complex eventprocessing over RFID data stream from the viewpointof Entity-Relationship (ER) model and SQL-basedstream query language. Wang and Liu (2005) proposean expressive temporal-based data modeling of RFIDdata, using a Dynamic Relationship ER Model, andthe method on how to use rules to transform RFIDdata from observations into the data model. Wanget al. (2006) further formulate a declarative rule basedapproach to provide support of automatic RFID datatransformation between the physical world and thevirtual world. Such an approach is capable of detect-ing complex temporal-pattern-based high level events.Among the four proposed temporal complex event con-structors, SEQ and TSEQ can only detect sequences oftwo events, while TSEQ and TSEQ+ can only detectsequences of the same type of events. Although TSEQand TSEQ+ can control time intervals between twoevents, TSEQ is only for two events and TSEQ+ is onlyfor the same type of events. Bai et al. (2007) extenda SQL-based stream query language with temporaloperators and related constructors and use several ex-ample scenarios to illustrate the power of the proposedlanguage.

Rizvi et al. (2005) discuss a system architecture andgeneral issues in processing RFID data and propose asystem, namely, HiFi. HiFi aggregates events along atree-structured network on various temporal and geo-graphic scales.

Liu et al. (1998) and Mansouri-Samani and Sloman(1997) also consider temporal restrictions. However,those temporal restrictions cannot be used to support

Page 4: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

484 Inf Syst Front (2011) 13:481–500

the special RFID events such as temporal sequence andtemporal negation.

Finally, Demers et al. (2006) and Brenna et al. (2007)propose a system, Cayuga, to process complex events inpublish/subscribe systems. Cayuga offers a powerful ex-pressive language similar to the one required in RFIDdata management systems. Cayuga can deal with a widerange of complex event patterns, including sequenceand Kleene closure patterns. However, Cayuga cannotcontrol time intervals between two successive events.For the UnES problem, Cayuga sorts events accordingto their timestamps before extracting complex eventspatterns from them, using the technique proposed bySrivastava and Widom (2004).

3 Time to live

Most RFID applications have time restrictions on tar-get events. However, many partial results would notbe able to reach their final stages because of timeoutrestrictions. We therefore need a mechanism to timeout RFID events, either simple or complex, to reducethe size of partial results, and thereby achieving highscalability with large RFID data streams.

3.1 RFID event model

RFID data are presented to a system in terms of events.Some events may only need single readings, while theothers may need multiple readings over a period of timeat different locations (Wang et al. 2006; Wu et al. 2006;Bai et al. 2007). The RFID event model serves as a basisfor TTL.

Definition 1 Event, Event Type An event is an occur-rence of interest happening in a particular time andlocation in the real world and is recorded in RFID datamanagement systems. An event type is a template thatprescribes a class of events that consists of a set ofattributes and the values of these attributes. Each eventhas an event type and a set of values corresponding tothe attributes of this event type.

Events can be divided into different types accordingto applications. We adopt the definitions of events fromEPC Information Services (EPCIS) standard.2 That is,each core event type has the fields that represent fourkey dimensions of any EPCIS event: (i) the object(s)or other entities that are the subject of the event; (ii)

2http://www.epcglobalinc.org

the date and time; (iii) the location at which the eventoccurred; and (iv) the business context. These fourdimensions usually represent “what, when, where, andwhy”, respectively. Here, we encode the information ofall four dimensions into the attributes of an event.

Definition 2 Primitive Event A primitive event is anRFID reader observation. In particular, the time at-tribute of each primitive event is a timestamp, repre-senting the time point that an RFID reader reads anRFID tag.

Definition 3 Complex Event A complex event is a pat-tern of primitive events and happens over a period oftime.

A complex event usually is composed of a few prim-itive events, and has a start time and an end time.

3.2 TTL taxonomy

Definition 4 Time To Live Time To Live, denoted asTTL, is the period of time that an event can legally livein an RFID data management system.

In RFID applications, both primitive and complexevents are detected by RFID readers in many differentsituations:

– Single event: A primitive event is detected by asingle RFID reading.

– Event sequence: A complex event that involves thesame or different event type is read in a certainorder within a time period by different readers.

– Repeating events: RFID tags can be detected atdifferent locations and time points with a fixednumber of times.

– Periodic events: A number of successive RFIDreadings, or the period of times that a sequence ofevents can be detected.

After carefully analyzing various RFID applications,we divide TTL into four categories.

Category 1: Absolute TTL

Definition 5 Absolute TTL Absolute TTL, denoted asTTLa, is the period of time that a primitive event canenter into RFID data management systems.

TTLa specifies the period of time that an RFID tagis regarded as alive in the physical world. After thatperiod, even if the primitive events generated by thattag are sent to the system again, the system would

Page 5: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

Inf Syst Front (2011) 13:481–500 485

not recognize them any more or would raise alarms toadvice user that the tag is invalid but still used in thephysical world.

Example 1 Suppose each one-time only train ticket isattached with an RFID tag. This kind of RFID tagcan only be valid within the period of time that apassenger travels from place A to B, and cannot bereused. After the passenger arrives B, the primitiveevents corresponding this tag will be labeled as invalidin the system.

Example 2 In pharmacy, drugs have expiry dates. Sup-pose bottles of medicines are attached with RFID tags.This kind of RFID tag can also only be valid within theexpiry dates, and be used only once. After the expirydate, if the medicine is still on the shelf, the systemshould raise an alarm.

Although both at train stations and in pharmacystores, there are many RFID readers, and one RFIDtag can generate many primitive events with differenttimestamps, there is a unique period of time related toeach RFID tag. This period of time is TTLa. In otherwords, TTLa of the primitive events in Example 1 isthe period of time that the train moves from A to B,while that of Example 2 is the expiry date. Obviously,TTLa indicates the life span of the tagged objects in thephysical world.

Since the number of tags in the physical world willincrease dramatically, if a system just tries to processall tags it is reading, it would be worn down quickly.The main function of TTLa is in fact to help the systemto distinguish between the tags that are still in use andthe tags that cannot be used any more. As a result, theworkload can be reduced in later processing steps.

One may argue that we can kill the tags if theycannot be used any more, and there is in fact a “kill”feature in the EPCglobal Architecture Framework.3

However, the “kill” feature in EPCglobal ArchitectureFramework is only a part of a comprehensive privacypolicy, not aimed at reducing the workload of systems.In addition, what would happen if someone physicallyreproduces the tags with the same EPC values of thetags that have been killed? Since these fake tags havethe same EPC values of the killed ones, the systemwould be easily cheated. We therefore strongly advo-cate that it is important to ensure that dead tags cannotbe recreated at a system logic level. TTLa is designedfor meeting this purpose. TTLa can be assigned when

3http://www.epcglobalinc.org/standards/architecture/

tags are physically generated and recorded in the sys-tem when the tags are first read. In some cases, TTLa

can also be opened until users notify the system.

Category 2: Relative TTL

Definition 6 Relative TTL For primitive events, Rela-tive TTL, denoted as TTLr, is the period of time that aprimitive event is valid in one application. For complexevents, TTLr is the period of time that a complex eventcan last.

For primitive events, TTLr specifies the period oftime that RFID tags can stay in a particular application.In Examples 1 and 2, the tags can only be used once.There are still many situations where each tag canbe used in many applications. Each application maygenerate different types of events, and each type ofevents may have different time restrictions. TTLr isused to denote the period of time for each applicationthat the tags are involved in. After that period, they canbe reassigned to other applications.

Example 3 In building access control, suppose eachvisitor is assigned an RFID tagged card when he comes.Visitors can only stay in the building for a prescribedperiod of time, T , based on their purposes. When vis-itors leave the building, cards are taken back, and willbe reassigned to other visitors in the future.

In this example, an RFID tag can be used manytimes, and each time its valid usage time (i.e., TTLr) isdifferent. Clearly, TTLr in this example is the period oftime that a visitor can stay in the building, which helpssystems to find out over-staying visitors.

For complex events, TTLr specifies the time periodthat this complex event can occupy. That is, TTLr isthe time restriction on the start and end time of thiscomplex event.

Since TTLa stands for the life span in the physicalworld of the RFID tag, each primitive event can onlybe associated with one TTLa. In contrast, each type ofevent can have several different TTLr depending on theapplications. An event can have both the restriction ofTTLa and TTLr simultaneously in some applications.Such tags can be assigned to more than one applicationduring their life span, and each application may have adifferent TTLr.

Category 3: Periodic TTL

Definition 7 Periodic TTL Periodic TTL, denoted asTTLp, is the time restriction on the interval between

Page 6: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

486 Inf Syst Front (2011) 13:481–500

two successive primitive events in a periodically oc-curred events sequence. The primitive events in thesequence have the same event type.

In some applications, systems need to check event se-quences. These events belong to either the same eventtype or different event types. The intervals betweenany two successive events can be identical, or can bedifferent. TTLp is designed for the cases that the eventsbelong to the same event type and the intervals be-tween any two successive events are identical. In otherwords, TTLp controls the period that the same type ofevents occurs.

Example 4 Suppose a kind of machine, e.g. an airplane,can be used for many years, but some parts have ashorter life span, and must be replaced in a regularbasis. If an RFID tag is attached to each of such parts,the system should determine when a part needs to bereplaced according to the readings of RFID readers.

In this example, each time a reader reads an RFIDtag attached to such parts, an event will be sent tothe system. However, since these parts only needto be replaced periodically, the system does not need toprocess the events generated within that period. In thiscase, TTLp is the interval between two replacements ofthe same part. Usually, TTLp requires that the events inthe sequence belong to the same event type. The eventsin the sequence can be primitive or complex.

It should be noted that the applications that TTLa

and TTLr are applied are different from those of TTLp.The former only involves a single event, while the latterinvolves an event sequence. In fact, Example 4 is akind of regular events detection. Such an applicationinvolves more than one occurrences of the same typeof events, and any two consecutive occurrences of theevents must be within some time interval.

Category 4: Sequential TTL

Definition 8 Sequential TTL Sequential TTL, denotedas TTLs, is the time restriction on the intervals be-tween any two successive primitive events in an eventsequence.

Similar to TTLp, TTLs is designed for an eventsequence, but has no limitation on the event types andthe intervals between two successive events. In essence,TTLp can be viewed as a special case of TTLs, sinceTTLs has fewer restrictions on the event types and theintervals. It may or may not have restrictions on theintervals.

Fig. 1 Time restrictions in Example 5

Example 5 Suppose many students do an experimentin a laboratory. There are four steps, namely, A, B, C,and D, to finish this experiment, and these four stepsmust be performed in the correct order and satisfy thetime restriction in Fig. 1. That is, the interval betweenA and B must be smaller than a time interval, T1, theinterval between C and D must be larger than a timeinterval, T2, and smaller than T3 (T2 < T3), there is nolimitation on the interval between B and C, and thewhole experiment must be finished within a given timeperiod, T4. It is difficult for a teacher to watch whethereach student do the experiment in correct order andintervals. If each student takes an RFID reader on hiswrist, and the reagents or equipments in each step areattached with RFID tags, then the sequence of stepscan be detected as a sequence of RFID readings, whichcan be generated when students’ RFID readers are veryclose to the reagents or equipments. Thus, the systemcan watch the experimental process of each studentaccording to these RFID readings.

Example 5 is a kind of event sequence detectionwith interval restriction. Such an application involvesdifferent types of events which occurred in a prescribedorder, and there may or may not be limitations on theintervals between two successive events. In addition, wemay also have a time restriction on the whole sequence,such as T4 in Example 5. In this example, different typesof events will be sent to the system in different steps,and the system needs to check whether the interval be-tween two successive events is valid. Each interval is akind of TTLs. Each event in the sequence can also haveTTLa and TTLr, and TTLr can also act on the wholeevent sequence. Both TTLp and TTLs are attached toeach occurrence of the event in the sequence, whichprescribes when the next event will happen.

To summarize, on the basis of analysing a widerange of RFID applications, we have identified fourcategories of TTL:

– TTLa: the life span an RFID tag can have;– TTLr: the period of time in applications related

only to one event, the event can be primitive, suchas T in Example 3, and can also be complex, suchas T4 in Example 5;

Page 7: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

Inf Syst Front (2011) 13:481–500 487

Table 1 Objects that TTL can exert to

Objects Event Event sequence

Simple Complex Same type Different type

TTLa√ × × ×

TTLr√ √ × ×

TTLp × × √ ×TTLs × × √ √

– TTLp: the interval between two successive eventswithin a periodically occurred event sequence;

– TTLs: the interval between two successive eventswithin an event sequence.

The objects that each category of TTL can exert to areshown in Table 1. From an implementation viewpoint,although we design four categories of TTL, there aresome internal relations among different categories, asshown in Fig. 2. That is, TTLa can be viewed as a specialcase of TTLr, while TTLp can be viewed as a specialcase of TTLs.

Generally speaking, no matter which category TTLbelongs to, the different types of TTL represent thetime that the system should take an action in an appli-cation. The actions can be an alert, a warning, or some-thing else that is related to applications. Such kinds ofactions are usually caused by TTLa or TTLr, which arenot related to event sequences. The actions can alsodetermine whether an expected event occurred. Suchkinds of actions are usually caused by TTLp and TTLs,which are related to event sequences.

4 TTL query language

In this section, we present the TTL query languageand illustrate how this language can be used to sup-port a range of emerging RFID applications. Sinceour purpose is not to design a new language, butto enhance the temporal management ability of theavailable languages, we leverage the available com-plex event query languages developed for active data-bases (Chakravarthy et al. 1994; Gehani et al. 1992;

Fig. 2 Internal relationshipamong different categories ofTTL

Zimmer and Unland 1999) and RFID data manage-ment systems (Wang et al. 2006; Wu et al. 2006; Baiet al. 2007).

Our TTL query language is a declarative language,which can be used to specify how individual eventsare filtered, and how multiple events are correlatedvia time-based and value-based conditions. The overallstructure of the TTL query language is as follows:

[DEFINE 〈 event_type1=specification,event_type2=specification,

......event_typen=specification〉]

EVENT 〈event pattern〉[WHERE 〈conditions for attribute values〉][TTLA 〈a list of events〉

[{operations for invalid TTLa}]][TTLRP 〈a list of events〉

[{operations for invalid TTLr ofprimitive events}]]

[TTLRC 〈a period of time〉][TTLP 〈a period of time〉][TTLS 〈a list of period of times〉]

The latter five clauses are specific for TTL queries.Both TTLRP and TTLRC are TTLr, but for primi-tive events and complex events, respectively. Amongthe five TTL clauses, TTLA and TTLRP clauses haverestrictions on events, while the three others have re-strictions on the period of time. Furthermore, TTLRCclause restricts the time of the whole sequence whileTTLP and TTLS clauses restrict the time between twosuccessive events in the sequence. The TTL query lan-guage can capture a wide range of RFID applicationsrelated to temporal restrictions.

Example 6 Let us visit again the application in Ex-ample 1. Suppose there are several RFID read-ers at the check-in counter, and the event type ofthe primitive events generated by these readers isTICKET-CHECKIN. When a passenger passes theseplaces, his/her ticket needs to be checked. The querycan be expressed as:

EVENT TICKET-CHECKINTTLA {Raise an alarm: cannot check in}

When a ticket is sold, its corresponding TTLa issaved in a central database. This query checks TTLa

when a TICKET- CHECKIN type of event is received.If its TTLa is invalid, an alarm will be raised, and thepassenger cannot check in.

Page 8: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

488 Inf Syst Front (2011) 13:481–500

Example 7 For the application in Example 3, supposemany RFID readers are installed in a building, and theevent type of the primitive events generated by the tagson cards is CARD. When the card belongs to a visitor,TTLr needs to be checked. The query can be expressedas:

EVENT CARDWHERE Type = VisitorTTLRP {Raise an alarm: overstaying visitor}

Similarly, when a card is assigned to a visitor, itsTTLr has been saved in a database. This query checksall visitors’ TTLr to prevent from over-staying visitors.

Example 8 For the application in Example 4, sup-pose the event type of the primitive events generatedby the tags on the special part of this machine isSPECIAL-PART. The query can be expressed as:

EVENT SEQ+ (SPECIAL-PART)

WHERE [ID]TTLP 1 year

Here the SEQ+ is for the event sequence pattern inTTLp, and the WHERE clause is used to make surethat it is the same part. Since it is a special part, eachtime the system receives such an event, the system willcheck its last repairing record to see whether 1 year haspast.

Example 9 See the application in Example 5. Supposethe event types of the primitive events generated bydifferent steps are A, B, C, and D. The query can beexpressed as:

EVENT SEQ(A a, B b , C c, D d)

WHERE (b .I D = a.I D) ∧ (c.I D = a.I D)

∧(d.I D = a.I D)

TTLRC T4

TTLS (0, T1); ; (T2, T3)

Here the SEQ is for event sequence pattern in TTLs.a, b , c, and d are the events. The WHERE clauseensures the four steps come from the same student.TTLRC clause is used to restrict the whole process.Although there is no limitation on the interval betweenB and C, the second “;” in TTLS clause is still required.

Example 10 Here, we give an example for negation.Suppose in airport, a baggage must arrive at a specifiedplace to wait to be loaded onto an airplane within60 min after this baggage is checked in. If no such anevent, an alarm will be raised accordingly. Suppose theevent type of the primitive events generated by RFID

readers at check-in counters are CHECKIN, and that atthe specified place is WAIT-LOADED. The query can beexpressed as:

EVENT SEQ(CHECKIN x, !WAIT_LOADED y)

WHERE x.ID=y.IDTTLS (0, 60) minutes

5 Algorithm for UnES problem based on TTL

For SEQ processing, a system can only send out queryresults after a complete sequence is received. There isa huge amount of intermediate stages to be stored. Thesystem needs to check the order of the incoming events.The order required by a query is always based on theorder that events occurred in the real world recordedas timestamps. However, the order that events enterinto the system does not necessarily reflect the orderof their occurrences. As a consequence, the system hasto extract event sequences from an unordered eventstream. The problem is defined as the following:

Definition 9 Unordered Event Stream An eventstream is E = (et1,t′1 , et2,t′2 , . . . , eti,t′i , . . .). The list (t1, t2,. . . , ti, . . .) contains timestamps, and the list (t′1, t′2,. . . , t′i, . . .) contains the times that the events areprocessed by the system. They satisfy t1 ≤ t2 ≤ . . . ≤ti ≤ . . ., t′1 ≤ t′2 ≤ . . . ≤ t′i ≤ . . ., t1 ≤ t′1, t2 ≤ t′2, . . . , ti ≤t′i, . . . If ∃eti,t′i , et j,t′j , ti ≤ t j and t′i > t′j, then E is anunordered event stream (UnES).

Based on the above discussions, we propose a newdata structure and an algorithm to deal with the UnESproblem, illustrated by comprehensive examples.

5.1 Double level sequence instance list

Definition 10 Sequence Instance, Partial Sequence In-stance Given a SEQ pattern, SEQ(EventType1, Event-Type2, . . . , EventTypen), and n events in the inputevent stream, e1, e2, . . . , en. If the event types of e1,e2, . . . , en are EventType1, EventType2, . . . , EventTypen,respectively, and

e1.timestamp < e2.timestamp < . . . < en.timestamp (1)

and e1, e2, . . . , en are also satisfy WHERE and TTLSclauses, then (e1, e2, . . . , en) is a sequence instance, de-noted as SI, and (ei, ei+1, . . . , e j) (1 ≤ i ≤ j ≤ n) is apartial sequence instance, denoted as PSI.

Page 9: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

Inf Syst Front (2011) 13:481–500 489

In query processing, the system needs to find outall sequence instances from the input event stream.Any sequential parts of a sequence instance are partialsequence instances, which satisfy the correspondingconditions in the query. From another viewpoint, se-quence instances are composed of partial sequence in-stances. Therefore, we can first record partial sequenceinstances, and then use them to compose sequenceinstances. To this purpose, we design the followingstructure for recording such information.

Definition 11 Double Level Sequence Instance ListGiven a SEQ pattern, SEQ(EventType1, EventType2,. . . , EventTypen), a double level sequence instance list,namely DLSIList, is defined as follows:

DLSIList = {PSIList = (PSIList2, PSIList3, . . . , PSIListn);LP = (LP

2 , LP3 , . . . , LP

n );SIList = {SI1,SI2, . . . ,SIm};

}

PSIList has (n-1) components, and each one is a listof partial sequence instances with single event belong-ing to the same event type, that is,

PSIListi = {e1, e2, . . . , eLPi}, 2 ≤ i ≤ n (2)

Where LPi is the number of components in PSIListi,

and the event type of each component is EventTypei

in the SEQ pattern. Each component of SIList willform a sequence instance, and has an attribute, namelyChanged, to indicate whether new event has beenadded into this component. For each event in SI1, SI2,. . . , SIm, another two attributes are appended whenthat event is added into SIList. That is, systemstamp,which indicates the time that this event is processedby the system, and First, which indicates whether thisevent is contained in the former components. That is,for 1 ≤ j < i

If (ei ∈ SI i) ∧ (ei.First = true), then ei ∈ SI j (3)

Intuitively, DLSIList has two lists: one for partialsequence instances, and the other for sequence in-stances. PSIList is used to store single event so thatthe system can handle UnES problems. SIList is usedto store intermediate results in different stages, whichmay lead to the final results in the future. It should benoted that events can only be added into a componentof SIList one by one in the order defined by theSEQ pattern. In addition, the PSIList starts from 2,

not 1. This is because the first event in the sequence cango into SIList directly. There are three new attributesappended to different components in DLSIList, andthey are designed to help the updating process inthis module. Before being inserted into the DLSIList,each event has an attribute, timestamp, which is as-signed by RFID readers. After events are added to theDLSIList, another attribute, systemstamp, will be ap-pended to them. Timestamp indicates the time an eventoccurs in the real world, while systemstamp indicatesthe time an event is processed by the system (after alltransmission delays). Both of them help the system todeal with UnES problems. Here, we assume the clockof each reader is synchronized, and clock synchroniza-tion of all readers and the system is out of the scopeof this paper. In the following, PSIListi, j represents thejth event in PSIListi, and SI i, j represents the jth eventin SI i.

5.2 The algorithm

When a SEQ pattern is received, a new DLSIList willbe created. At the beginning, the DLSIList is empty.As the events come, the DLSIList is updated andquery results are sent to users when some sequenceinstances are obtained. The main technique is the coop-eration between PSIList and SIList in the DLSIList.Events are first used to update SIList, and then storedin PSIList waiting to be checked in the future. Af-ter a component in SIList is changed, the PSIList ischecked to see whether former events can make thiscomponent grow further.

Although the cooperation between PSIList andSIList can solve the UnES problem, the sizes ofPSIList and SIList may grow dramatically. Their sizescan be reduced by exploiting TTL. When the condi-tions in TTL clauses indicate that some componentsin PSIList and SIList are useless, they are deleted.The details are given in Algorithm 1. Some details areexplained as follows.

The conditions in Line 14 make sure that the avail-able events in a component of SIList satisfy thequery. Let the restriction in TTLS clause on the inter-vals between any two successive events are (LIndexe−1,UIndexe−1), then e.timestamp must satisfy:

SI i,Indexe−1.timestamp + LIndexe−1 ≤ e.timestamp ≤SI i,Indexe−1.timestamp + UIndexe−1 (4)

Since the events may come in different orders, thesystem needs to check PSIList to see previous events

Page 10: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

490 Inf Syst Front (2011) 13:481–500

Algorithm 1 Algorithm for the UnES problem basedon TTL//Initialize a new DLSIList1: If (receive a SEQ pattern from Query Analyzer)2: {3: Create a DLSIList;4: PSIList2 ← ∅; . . . ; PSIListn ← ∅; LP

2 ← ∅;5: SIList← ∅; m ← 0;6: }//Update the DLSIList when a new event comes7: While (a new event e comes)8: {9: Indexe ← Event type of e in the sequence

pattern;e.systemstamp ← SystemTime; e.First ←True;

10: Set Changed to False for all components inSIList;

//Add e into SIList11: If (Indexe = 1) m++; SIm ← (e, ∅2, ∅3, . . . , ∅n);

SIm.Changed ← True;12: If (1 < Indexe ≤ n)13: For (i = 1; i ≤ m; i++)14: If (SI i,Indexe−1 = ∅)∧

(SI i,Indexe−1.First = True)∧(SI i,Indexe−1 and e satisfy TTLS clause)

15: {16: If (SI i,Indexe = ∅) SI i,Indexe ← e;

SI i.Changed ← True;17: Else m++;

SIm ← (SI i,1, . . . ,SI i,Indexe−1, e,∅i,Indexe+1, . . . , ∅i,n);

SIm,1.First ← False; . . . ;SIm,Indexe−1.First ← False;SIm.Changed ← True;

18: }//Check PSIList19: For (i = Indexe + 1; i ≤ n; i++)20: For( j = 1; j ≤ LP

i ; j++)21: For (each SIk in SIList whose Changed is

True)22: If (SIk,i−1, PSIListi, j satisfy TTLS clause)23: {24: If (SIk,i = ∅) SIk,i ← PSIListi, j;

SIk.Changed ← True;25: Else m++;

SIm ← (SIk,1, . . . ,SIk,i−1,

PSIListi, j, ∅k,i+1, . . . , ∅k,n);SIm,1.First ← False; . . . ;SIm,Indexe−1.First ← False;SIm.Changed ← True;

26: }

//Add e into PSIList27: If (1 < Indexe ≤ n) LP

Indexe+ +;

PSIListIndexe,LPIndexe

← e;//Reduce the size of PSIList28: For (i = 2; i ≤ n; i + +)29: For (each event PSIListi, j in PSIListi)30: If (no valid events in PSIListi−1)∧

(future events cannot satisfyTTLS clause)

31: Delete PSIListi, j; LPi − −;

//Reduce the size of SIList32: For (each component SI i in SIList)33: If (current system time cannot satisfy the

TTLS clause for the last non-empty eventSI i)∨(SI i does not satisfy TTLRC clause)

34: Delete SI i from SIList ; m − −;//Check the results in SIList35: For (each changed component SI i in SIList)36: If (SI i,n = ∅) Send out this sequence;

SI i,n ← ∅;37: }

for these changed components (Line 21). If a compo-nent and an event in PSIList satisfy the condition inLine 22, this event is added into the component. Thecondition in Line 22 is similar to that of Line 14:

SIk,i−1.timestamp + Li−1 ≤ PSIListi, j.timestamp ≤SIk,i−1.timestamp + Ui−1

(5)

Events in PSIList are waiting for events which arein front of them in the SEQ pattern but come laterthan them. If the system time indicates that no suchevents will come, the corresponding events in PSIListcan be deleted to reduce the size of PSIList. There aretwo conditions need to be checked (Line 30). The firstone checks whether there is an event, PSIListi−1,k inPSIListi−1 which satisfies

PSIListi−1,k.timestamp + Li−1 ≤PSIListi, j.timestamp ≤

PSIListi−1,k.timestamp + Ui−1 (6)

If yes, PSIListi, j cannot be deleted. The second oneneeds to check the system time. Let the current systemtime be SystemTime, and the maximum delay betweenthe time that an RFID tag is read and the time that the

Page 11: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

Inf Syst Front (2011) 13:481–500 491

corresponding primitive event is processed be Delay. IfSystemTime and PSIListi, j satisfy

SystemTime > PSIListi, j.timestamp − Li−1 + Delay

(7)

then no former events will come and PSIListi, j can bedeleted. If Delay is unknown or too long, Systemstampcan be used as the criterion instead, so that the size ofthe intermediate results can be reduced.

Since (Delay ≥ PSIListi, j.Delay) and (PSIListi, j.

timestamp+PSIListi,j.Delay=PSIListi, j.Systemstamp),the condition can be changed to

SystemTime > PSIListi, j.Systemstamp − Li−1 (8)

If a component SI i already has j events, it is waitingfor the ( j+1)th event. Since events in PSIList havebeen checked, if the system time indicates that no suchevents will come, SI i can be deleted to reduce the sizeof SIList. The condition is

SystemTime > SI i, j.timestamp + U j + Delay (9)

If Delay is unknown or too long, since (Delay ≥SI i, j.Delay) and (SI i, j.timestamp + SI i, j.Delay =SI i, j.Systemstamp), then condition can be changed to

SystemTime > SI i, j.Systemstamp + U j (10)

If a query has a TTLRC clause, the system still needsto check the interval between the first event and theexisting last event in SI i. If the current interval doesnot satisfy TTLRC clause, SI i can be deleted now.

Generally speaking, Algorithm 1 shows that TTL isuseful for reducing the size of intermediate results andincreasing the system’s scalability.

5.3 Example application

An example application is given to illustrate the execu-tion of Algorithm 1. Suppose a query is:

EVENT SEQ(A, B, C, D)TTLRC 60TTLS (0, 5); ; (10, 40)

The time unit is second, and the processing for othertime units is similar. This query is the event sequencesof a, b , c, d, belonging to event types A, B, C, D,and satisfy the time restrictions in TTLRC and TTLSclauses. Given that the event stream is (a1,2, b 5,7, a15,21,a16,21, b 18,20, c19,19, b 21,22, a25,31, c28,31, d30,32, b 30,30, c55,56,d62,62, c65,66, d77,77, d78,79), where the first number of

each event is timestamp and the second is systemstamp,and Delay=6s. The events came in the order of sys-temstamp, and the executive process of Algorithm 1 isshown in Fig. 3.

When a1,2 arrives, since it is the first event of the SEQpattern, a new component with a1,2 as the first event isadded into SIList, and its Check is assigned to True, seeFig. 3b.

When b 5,7 arrives, first, it matches with the firstcomponent of SIList, and added into this component,then it is added into PSIList to wait for future events,see Fig. 3c.

When c19,19 arrives, it is added into both the firstcomponent of SIList and PSIList. Then, since thecurrent system time is already 19, according to thesecond condition in Line 30 and (7), all future a eventscannot match with b 5 any more, so b 5 is deleted fromPSIList, see Fig. 3d.

When b 18,20 arrives, no component of SIList canmatch with it, so it is just added into PSIList (Fig. 3e).When a15,21 arrives, a new component is added intoSIList. b 18 and c19 came before a15, and are stored inPSIList, now they are added into the new component(Fig. 3f).

When a16,21 arrives, a new component is added intoSIList and b 18 and c19 are added into the new compo-nent (Fig. 3g). When b 21,22 arrives, the third componentof SIList matches with it, but this component alreadyhas a b event, so a new component is created, and a16

is copied to and b 21 is added to this new component(Fig. 3h).

When b 30,30 arrives, since the current system timeis already 30, b 18 and b 21 are deleted from PSIList.Although there is no restriction on the interval betweenB and C, the current system time indicates that no bevents before c19 will come, so c19 is also deleted, seeFig. 3i.

When a25,31 arrives, a new component is addedinto SIList and b 30 is added into the new component(Fig. 3j). When c28,31 arrives, all the first four com-ponents of SIList match with it. But the first threecomponents already have c event, so three new com-ponents are created. Then c28 is added into PSIList(Fig. 3k).

When d30,32 arrives, three query results are obtained.Since (7) indicates that no future c events can matchwith d30 any more, so d30 is not added into PSIList(Fig. 3l). When c55,56 arrives, although the last threecomponents match with it, their First is not True. Thenc55 is added into PSIList. The current system time is56, so b 30 and c28 are deleted from PSIList (Fig. 3m).

When d62,62 arrives, the 6th component of SIListdoes not satisfy TTLRC clause when the system

Page 12: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

492 Inf Syst Front (2011) 13:481–500

Fig. 3 Execution ofAlgorithm 1 for the examplein Section 5.3

Page 13: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

Inf Syst Front (2011) 13:481–500 493

Fig. 3 (continued)

reduces the size of SIList, and is deleted (Fig. 3n).When c65,66 arrives, five new components are created.When the system reduces the size of SIList, sincethe current system time is already 66, the first threecomponents do not satisfy (9), and are deleted. The firstnew component does not satisfy TTLRC clause, and isalso deleted (Fig. 3o).

When d77,77 arrives, nine components of SIListmatch with it. But only the second and last satisfy theconditions, and the other ones do not satisfy TTLRCclause. Thus, two new query results are obtained(Fig. 3p). When the last event, d78,79, arrives, five new

components are created. But only the second and lastnew components satisfy the conditions, and the threeothers do not satisfy TTLRC clause (Fig. 3q). Finally,the obtained ten query results are shown in Fig. 3r.

6 System architecture

An RFID data management system, namely TemporalManagement System over RFID data streams (TMS-RFID), has been developed based on TTL, and thesystem architecture is shown in Fig. 4.

Page 14: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

494 Inf Syst Front (2011) 13:481–500

Fig. 4 System architecture ofTMS-RFID

TMS-RFID consists of i) three databases: the EventType Database, the Query Database, and the CentralDatabase, and ii) six modules: the Filter and Cleaner,the Event Type Checker, the Query Checker, the QueryAnalyzer, the Event Processor, and the IntermediateResults Updater. TMS-RFID takes the input of queriesand an infinite RFID data stream and outputs the queryresults of events that match with the queries.

In general, there are three independent informa-tion flows in TMS-RFID, namely the Database ControlFlow, the Query Flow, and the Data Flow. The first oneis for administrators to manage the three databases.The second one represents the process where TMS-RFID handles queries submitted by users. Queries aresent to the Query Analyzer, which updates the QueryDatabase, Event Type Database, and IntermediateResults Updater.

Data Flow in TMS-RFID flows from receiving prim-itive events captured by RFID readers to publishing

query results to the user. The process is shown in Fig. 5.The raw RFID data are first filtered and cleaned, andthen the events are sent to the Event Processor, waitingfor being processed. When receiving the list of queriesrelated to the events, the Event Processor starts toprocess the queries. First, queries that are not related toevent sequences are processed, followed by the queriesthat are related to event sequences. Both query resultswill be sent to users, and the Central Database willbe updated if the query results require updating theinformation of TTLa. The function of each databaseand module is introduced in detail as follows.

Event Type Database This database is used to recordthe event types in TMS-RFID, and the main functionis to provide the information of event type for theEvent Type Checker when it needs to determine theevent type of an incoming event. Some event typesare defined by administrators, while some are defined

Page 15: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

Inf Syst Front (2011) 13:481–500 495

Fig. 5 Data flow

by queries. The Event Type Database is controlledand updated by both administrators and the QueryAnalyzer.

Query Database This database is used to record thequeries lodged from users, and the main function is toprovide the available queries for the Query Checkerwhen it needs to determine the queries that an eventtype is involved. Since the Query Checker requiresthe list of queries related to an event type, the QueryDatabase saves and organizes the submitted queriesaccording to the event types. This database is con-trolled and updated by administrators and the QueryAnalyzer.

Central Database This database serves as the tradi-tional database, recording the information extractedfrom event streams. On the other hand, it is usedto record the information of TTLa and TTLr for allprimitive events that enter into the system. Since TTLp

and TTLs are for complex events and only used whenthe system checks the intermediate results, there is noneed to save them in database. The Central Databaseis controlled and updated by administrators and theIntermediate Results Updater.

Query Analyzer The main function of this moduleis to analyze the input queries. Since queries are ex-pressed in a query language, the Query Analyzer needsto compile the query language. Based on the analysis ofresults obtained, the Query Analyzer needs to updatethe Query Database and the Event Type Database. Ifthe new query involves a sequence, the Query Analyzerneeds to notify the Intermediate Results Updater aboutthe checking of the new sequence.

Filter and Cleaner The main function of this moduleis to filter and clean the raw RFID readings, whichfocuses on dealing with missed readings, unreliablereadings, and data redundancy. After RFID readingsare processed, they are sent to the Event Type Checkerand the Event Processor.

Event Type Checker The main function of this moduleis to get the event type of the incoming event, andsend the event type to the Query Checker for furtheroperations.

Query Checker This module matches a list of querieswith the incoming events, and sends the list to the EventProcessor so that it can check whether the incomingevents meet the requirements of the available queries.

Event Processor This module processes the queries inthe list sent by the Query Checker one by one accordingto the incoming event. Usually, there are two kinds ofconditions in a query that need to be checked for asingle event: (i) attribute values, and (ii) TTLa or TTLr.If a query needs to extract an event sequence, the thirdkind of conditions, TTLp or TTLs, are needed to bechecked. However, in this case, the conditions will bechecked in the Intermediate Results Updater, and theEvent Processor only needs to send the incoming eventto the Intermediate Results Updater. The flowchartthat the Event Processor processes a query is shownin Fig. 6. For an incoming event, the related queriesthat do not involve an event sequence will be processedin the Event Processor, and the results will be sent tousers. Additionally, if the results require the system toupdate TTLa of some events, the Event Processor willupdate the Central Database accordingly. The queriesthat involve an event sequence will be processed furtherin Intermediate Results Updater.

Intermediate Results Updater This module is espe-cially designed for queries that need to extract eventsequences. For SEQ+, all events belong to the sameevent type, and each time the system receives a validevent, query results must be sent out. Thus, for eachinstance of a sequence, the Intermediate Results Up-dater only needs to record the last event, and to check

Page 16: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

496 Inf Syst Front (2011) 13:481–500

Fig. 6 Event processor

whether the next event is valid. When the IntermediateResults Updater receives a SEQ pattern, it will createa new double level sequence instance list (DLSIList).When new events come, it will update the DLSILists.When some intermediate stages reach the final stages,the Intermediate Results Updater will send the resultsto user. If the results require the system to update TTLa

for some events, this module will update the CentralDatabase accordingly.

7 System evaluation

7.1 Experimental setup

We implemented the techniques presented in the previ-ous sections in Visual C++. All experiments were per-formed on a PC with a Pentium IV 2.8 GHz processorand 512 MB memory. An event generator is developedto create a stream of events to be input into TMS-RFID. The number of events generated per secondranges from 1000 to 5000, and Delay=5s. There are20 event types and 5 attributes (A1, A2, A3, A4, A5)

for each event type in addition to the timestamp. Theoccurrences of event types are based on the Zipf distri-bution, with the key characteristic of Zipf distributionbeing set to 0. For each attribute, the number of pos-sible values this attribute can take is chosen from therange of [10, 10000]. The time for event generation isnot counted in the performance metric. Each time, afterthe event generator has generated events for 10 s, theseevents are sent to the system according to their delay.The stop criterion is whenever more than 10 millionevents have been processed.

To generate a query, the event types in EVENTclause are chosen from 20 event types according againto Zipf distribution, with the key characteristic is set to0. The WHERE clause requires that all events in thesame sequence must have the same attribute values onattribute A1. In TTLS clause, the lower bound Li ischosen from the range of [0, 5] in second, and the upperbound Ui is chosen from the range of [Li, Li+10], where1 ≤ i < n.

The above setting ought to be reasonable for a large-scale application that would have continuous TTLqueries to events detected by a few thousands of RFIDreaders with a quarter of million tags moving around inthe real world.

7.2 Experimental results

We put emphasis on testing TMS-RFID’s performanceon handling queries with SEQ pattern because suchqueries are the most complicated, which need to checknot only the time restrictions but also the order ofsequence.

7.2.1 Experiment on domain size

Since the WHERE clause requires that all events in thesame sequence must have the same attribute values onA1, the domain size of A1 will affect the performance.Thus, this set of experiments is executed to test theeffect of the domain size of A1 on the performance ofTMS-RFID. The sequence length is set to 2, 3, 4, 5, and6, respectively. The domain size of A1 increases from500 to 10000 in step of 500, and the results are shown inFig. 7.

For different sequence lengths, the throughput ofTMS-RFID always achieves a stable level when thedomain size of A1 increases. When the sequence lengthis 2, the throughput always stabilizes at about 870,000events/sec. When the sequence length is 3, the through-put increases from about 230,000 to 470,000 events/sec,and when the domain size of A1 is larger than 4000,the throughput stabilizes at about 450,000 events/sec.

Page 17: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

Inf Syst Front (2011) 13:481–500 497

Fig. 7 The effect of the domain size of A1

Similar performances are obtained for the sequencelength being 4, 5, and 6, which stabilize at 300,000events/sec, 210,000 events/sec, and 170,000 events/sec,

respectively. To summarize, Fig. 7 shows that when thedomain size of A1 is larger than about 2000, TMS-RFIDalways has a high throughput.

Figure 8 further shows the throughput variation withthe number of events. The domain sizes are set to 500,1000, 5000, and 10000, and the sequence lengths areset to three longer ones, namely 4, 5, and 6. The re-sults show that whatever the domain size and sequencelength are, the performance of the system can reach astable level quickly.

7.2.2 Experiment on sequence length

The longer the sequence length, the more intermediatestages there are. Thus, this set of experiments is exe-cuted to test the effect of the sequence length on theperformance of TMS-RFID. The domain size of A1 isset to 500, 1000, 5000, 10000, and the sequence lengthincreases from 2 to 6, and the results are shown inFig. 9.

Fig. 8 Throughput of TMS-RFID, and the sequence length is a 4, b 5, c 6

Page 18: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

498 Inf Syst Front (2011) 13:481–500

Fig. 9 The effect of the sequence length

Figure 9 shows that TMS-RFID scales very well withthe sequence length when the domain size of A1 is 1000,5000, and 10000, respectively. When the domain size ofA1 is 500, the throughput of TMS-RFID drops a large

level for the sequence length being 5 and 6, but stillscales well for the sequence length being 2, 3, 4.

7.2.3 Scalability of TMS-RFID

The main intermediate results of TMS-RFID are storedin DLSIList, and the size of DLSIList has a greateffect on the performance of TMS-RFID. If its sizeincreases exponentially with the number of events, thesystem could not handle RFID data streams with hugeamount of events. This set of experiments is thereforeperformed to check the size DLSIList. Since DLSIListincludes two lists: PSIList and SIList, the number ofcomponents in these two lists is shown in Fig. 10. Here,the sequence length is set to three longer ones, namely4, 5, and 6. The domain size of A1 is set to 500 sincethe above experiments on domain size shows that this isthe most difficult case. The results show that what-ever the sequence length is, although the sizes ofboth PSIList and SIList fluctuate with the numberof events, they stabilize at a fixed level, and do not

Fig. 10 The size of DLSIList, and the sequence length is a 4, b 5, c 6

Page 19: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

Inf Syst Front (2011) 13:481–500 499

increase exponentially. This demonstrates the scalabil-ity of our proposed TMS-RFID architecture.

8 Conclusions

The ability of RFID technology for precisely identi-fying objects at low-cost and non-line-of-sight createsmany new and exciting application areas. This widerange of applications will make RFID an integral partof our daily lives. Despite the obvious potential, RFIDalso presents a new challenge on efficiently processinglarge-scale and time sensitive RFID events. In thispaper, we presented TMS-RFID, a system for tempo-ral management of RFID applications over high-speedRFID data streams. The key idea behind TMS-RFIDis the notion of TTL, which is used to control the timerestrictions in various RFID applications. To solve theproblem of unordered event stream (UnES), we devel-oped a new data structure, namely DLSIList, to miti-gate the exponential growth of the intermediate resultswhen processing RFID queries. We demonstrated theeffectiveness of TMS-RFID in a detailed performancestudy. The results illustrate that TMS-RFID is capableof processing high-speed RFID data streams with agood scalability.

Compared to other systems such as SASE (Wu et al.2006), HiFi (Rizvi et al. 2005), and Cayuga (Brennaet al. 2007), TMS-RFID provides a unique add-onmiddleware component to deal with the TTL querieswith a solution to the UnES problem, which is onestep further towards deployment of large-scale RFIDapplications.

References

Bai, Y., Wang, F., Liu, P., Zaniolo, C., & Liu, S. (2007). RFIDdata processing with a data stream query language. In Pro.of the 23rd international conference on data engineering(ICDE’07). Istanbul, Turkey.

Brenna, L., Demers, A. J., Gehrke, J., Hong, M., Ossher, J.,Panda, B., et al. (2007). Cayuga: A high-performance eventprocessing engine. In Proc. of the ACM SIGMOD inter-national conference on management of data (SIGMOD’07).Beijing, China.

Chakravarthy, S., Krishnaprasad, V., Anwar, E., & Kim, S.(1994). Composite events for active databases: Semantics,contexts and detection. In Proc. of the 20th internationalconference on very large data bases (VLDB’94). Chile.

Chawathe, S. S., Krishnamurthy, V., Ramachandran, S., &Sarma, S. (2004). Managing RFID data. In Proc. of the 30thinternational conference on very large data bases (VLDB’04).Toronto, Canada.

Demers, A. J., Gehrke, J., Hong, M., Riedewald, M., & White,W. M. (2006). Towards expressive publish/subscribe systems.

In Proc. of the 10th international conference on extendingdatabase technology (EDBT’06). Munich, Germany.

Franklin, M. J., Jeffery, S. R., Krishnamurthy, S., Reiss, F., Rizvi,S., Wu, E., et al. (2005). Design considerations for high fan-insystems: The HiFi approach. In Proc. of the second biennialconference on innovative data systems research (CIDR’05).Asilomar, CA, USA.

Gehani, N. H., Jagadish, H. V., & Shmueli, O. (1992). Compositeevents specification in active databases: Model and imple-mentation. In Proc. of the 18th international conference onvery large data bases (VLDB’92). Vancouver, Canada.

Gershenfeld, N., Krikorian, R., & Cohen, D. (2004). The internetof things. Scientific American, 291(4), 76–81.

Landt, J. (2005). The history of RFID. IEEE Potentials, 24(4),8–11.

Liu, G., Mok, A., & Konana, P. (1998). A unified approach forspecifying timing constraints and composite events in activereal-time database sysems. In Proc. of the 4th IEEE real-timetechnology and applications symposium (RTAS’98). USA.

Palmer, M. (2009). Seven principles of effective rfid data manage-ment. http://www.objectstore.com/docs/articles/7principles_rfid_mgmnt.pdf.

Mansouri-Samani, M., & Sloman, M. (1997). GEM: A gener-alized event monitoring language for distributed systems.Distributed Systems Engineering, 4(2), 96–108.

Rizvi, S., Jeffery, S. R., Krishnamurthy, S., Franklin, M. J.,Burkhart, N., Edakkunni, A., et al. (2005). Events on theedge. In Proc. of the ACM SIGMOD international con-ference on management of data (SIGMOD’05). Baltimore,Maryland, USA.

Sheng, Q. Z., Li, X., & Zeadally, S. (2008). Enabling next-generation RFID applications: Solutions and challenges.IEEE Computer, 41(9), 21–28.

Srivastava, U., & Widom, J. (2004). Flexible time management indata stream systems. In Proc. of the 23rd ACM SIGACT-SIGMOD-SIGART symposium on principles of databasesystems (PODS’04). Paris, France.

Wang, F., & Liu, P. (2005). Temporal management of RFID data.In Proc. of the 31st international conference on very large databases (VLDB’05). Trondheim, Norway.

Wang, F., Liu, S., Liu, P., & Bai, Y. (2006). Bridging physi-cal and virtual worlds: Compex event processing for RFIDdata streams. In Proc. of the 10th international conferenceon extending database technology (EDBT’2006). Munich,Germany.

Wu, E., Diao, Y., & Rizvi, S. (2006). High-performance complexevent processing over streams. In Proc. of the ACM inter-national conference on management of data (SIGMOD’06).Chicago, Illinois, USA.

Zimmer, D., & Unland, R. (1999). On the semantics of com-plex events in active database management systems. InProc. of the 15th international conference on data engineering(ICDE’99). Sydney, Australia.

Xue Li is currently an Associate Professor in the School of Infor-mation Technology and Electrical Engineering at the Universityof Queensland. His research interests include data mining, mul-timedia data security, RFID data management, and intelligentWeb information systems. He received the BSc degree in Com-puter Science from Chongqing University, China. He receivedthe PhD degree in Information Systems from the QueenslandUniversity of Technology. He is a member of the IEEE and theACM.

Page 20: TMS-RFID: Temporal management of large-scale RFID applicationsweb.science.mq.edu.au/~qsheng/papers/ISF-2011.pdf · RFID in handling large-scale, high-speed RFID data streams. The

500 Inf Syst Front (2011) 13:481–500

Jing Liu was born in Xi’an, China. She received the BSc degreein Computer Science and Technology from Xidian University,Xi’an, China, in 2000, and received the PhD degree in Cir-cuits and Systems from the Institute of Intelligent InformationProcessing of Xidian University in 2004. From Apr. 2007 to Apr.2008, she was a Postdoctoral Research Fellow in the Universityof Queensland, Australia. Now she is a Professor in Xidian Uni-versity. Her research interests include evolutionary computation,data mining, and RFID data management.

Quan Z. Sheng is currently a lecturer in the School of ComputerScience at the University of Adelaide. He holds a PhD degreein Computer Science from the University of New South Walesand was a Research Scientist at CSIRO ICT Centre, Australia.His main research interests include Web services, Internet com-puting, and pervasive computing. He is the founding chair ofthe International Workshop on RFID Technology. Dr Shenghas edited nine books and proceedings and published more than50 refereed technical papers in journals and conferences includ-ing VLDB Journal, IEEE Transactions on Services Computing,IEEE Computer, IEEE Internet Computing, Communications ofthe ACM, VLDB, ICDE, ICSE, and CAiSE. He is a member ofthe IEEE and the ACM.

Sherali Zeadally received his Bachelor’s Degree in ComputerScience from University of Cambridge, England, and the Doc-toral Degree in Computer Science from University of Buck-ingham, England, in 1996. He is an Associate Professor in theDepartment of Computer Science and Information Technologyat the University of the District of Columbia. He currently serveson the Editorial Boards of several international journals. He is afellow of the British Computer Society and a fellow of the Insti-tution of Engineering Technology, UK. His research interests in-clude computer networks including wired and wireless networks),RFID, network and system security, and mobile computing.

Weicai Zhong was born in Jiangxi, China. He received theBSc degree in Computer Science and technology from XidianUniversity, Xi’an, China, in 2000, and received the PhD degree inpattern recognition and intelligent information system from theInstitute of Intelligent Information Processing of Xidian Univer-sity in 2004. Now he is a Postdoctoral Fellow in Xidian University.His research interests include evolutionary computation, datamining, and RFID data management.