A better tool than Allen's relations for expressing temporal knowledge in interval data

Fabian Mörchen
Siemens Corporate Research
755 College Road East
Princeton, NJ 08540, USA

[email protected]

ABSTRACT

Temporal patterns composed of symbolic intervals are commonly formulated with Allen's interval relations originating in temporal reasoning. We show that this representation has severe disadvantages for knowledge discovery. The patterns are not robust, in the sense that small disturbances of interval boundaries lead to different patterns for similar situations. The representation is ambiguous since the same pattern can have quantitatively widely varying appearances. For all but very simple cases the patterns are not understandable because the textual descriptions are lengthy and unstructured. We present the Time Series Knowledge Representation (TSKR), a new hierarchical language for interval patterns to express the temporal concepts of coincidence and partial order. We demonstrate the superiority of this novel form of representing temporal knowledge over Allen's relations for data mining. Results on a real data set support our claims and show a successful application.

Categories and Subject Descriptors
I.5 [Computing Methodologies]: Pattern Recognition

General Terms
Models

Keywords
knowledge discovery, time series, interval patterns, Allen's interval relations

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. TDM'06, August 20, 2006, Philadelphia, Pennsylvania, USA. Copyright 2006 ACM X-XXXXX-XX-X/XX/XX ...$5.00.

1. INTRODUCTION

Symbolic interval time series are an important data format in temporal knowledge discovery [11, 13, 29, 7, 14, 12, 5, 19, 30]. In particular, numerical time series are often converted to symbolic interval time series by segmentation [14, 12], discretization [29] or clustering [11]. Alternatively, the intervals can be obtained directly from other temporal data, e.g., video [9] or association rules over time [21].

Mining such data for patterns has largely been performed based on Allen's interval relations [3], e.g., in [13, 7, 12, 5, 19, 30]. The relations were originally developed in the context of temporal reasoning where inference about past, present, and future supports applications in planning, understanding, and diagnosis. The input usually consists of exact but incomplete data and temporal constraints, often expressed by Allen's relations. Typical problems include determining the consistency of the data and answering queries about scenarios satisfying all constraints. But these problems do not occur in the data mining context: almost complete interval data is given, and the goal is to find meaningful and understandable patterns [12]. One may have to cope with some missing data, but more importantly with possibly noisy and incorrect data.

We show that Allen's relations have severe disadvantages when used for pattern discovery from interval time series. We introduce an alternative representation for temporal knowledge in interval data and compare it to approaches using Allen's relations conceptually and experimentally. We show that our representation is more expressive, more robust, and arguably more understandable than the patterns of Höppner [12] using Allen's relations.

2. MOTIVATION

The 13 interval relations of Allen are [3]: before, meets, overlaps, starts, during, finishes, the corresponding inverses, and equals. They can describe any relative positioning of two intervals. The relations are commonly used for the formulation of temporal rules involving intervals [13, 7, 12, 22, 5, 19, 30].

Three methods use Allen's relations for unsupervised rule mining from symbolic interval data with variants of the Apriori algorithm. In [13] interval patterns are restricted to right concatenations of intervals to existing patterns, using one of Allen's relations to describe the relative positioning of the new interval to the interval of the complete pattern. Similarly, in [7] the patterns are restricted to composites of two already significant patterns or single intervals. Both approaches thus use at most $k-1$ relations for $k$ intervals. In contrast, the patterns of [12] list all $k(k-1)/2$ pairwise relations of the intervals within a pattern. In [19] the problem of mining patterns from interval data was rediscovered. The pattern format of [12] is used with a subset of Allen's relations. The patterns are mined with a tree-based enumeration algorithm [4]. In [30] the patterns of [12] are mined with a modified sequential pattern algorithm [15].

We think that using Allen's relations to represent interval patterns has severe disadvantages for knowledge discovery. We first formulate and justify our claims with examples.

(1) Patterns from noisy interval data expressed with Allen's interval relations are not robust: Several of Allen's relations require the equality of two or more interval endpoints. Small disturbances can create patterns where a very similar relationship between intervals is fragmented into different relations as defined by Allen's relations. Figure 1 shows several examples of almost equal intervals.

Figure 1: Examples for different patterns according to Allen that are fragments of the same approximate relation almost equals: (a) overlaps, (b) during, (c) finishes.
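The fragmentation effect of Figure 1 can be reproduced with a small sketch. The function `allen_relation`, the inverse relation names, and the example coordinates are illustrative assumptions and not part of the original study; intervals are taken as `(start, end)` pairs with `start < end`, and two endpoints count as equal only if they are numerically identical.

```python
def allen_relation(a, b):
    """Name Allen's relation between intervals a and b, given as (start, end).

    Illustrative sketch: assumes start < end for both intervals and
    exact comparison of endpoints.
    """
    (s1, e1), (s2, e2) = a, b
    if (s1, e1) == (s2, e2):
        return "equals"
    if e1 < s2:
        return "before"
    if e1 == s2:
        return "meets"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    if s1 < s2 < e1 < e2:
        return "overlaps"
    # The remaining cases mirror before, meets, and overlaps with a and b swapped.
    inverse = {"before": "after", "meets": "met-by", "overlaps": "overlapped-by"}
    return inverse[allen_relation(b, a)]


# Four almost identical situations, each differing by one time unit at most.
almost_equal = [
    ((0, 10), (0, 10)),
    ((0, 10), (1, 11)),
    ((1, 9), (0, 10)),
    ((1, 10), (0, 10)),
]
relations = [allen_relation(a, b) for a, b in almost_equal]
```

In this sketch the four nearly identical pairs are assigned four different relation names, which is the lack of robustness described in claim (1).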

(2) Patterns expressed with Allen's interval relations are ambiguous: The same relation of Allen can visually and intuitively represent very different situations. In Figure 2 three very different versions of the overlaps relation are shown as an example. Even more ambiguous is the compact representation of patterns from [13] or [7]: several different descriptions are valid for the exact same pattern (see Figure 3).

Figure 2: Three instances of Allen's overlaps relation with large quantitative differences: (a) small, (b) medium, (c) large.

Figure 3: Pattern that can be described by three different compact rules using Allen's relations.

(3) Patterns expressed with Allen's interval relations are not easily comprehensible: The representation of patterns with Allen's relations does not follow the Gricean maxims [10] suggested for the representation of knowledge discovery results to humans [25]. For example, the maxim of manner is violated. Since the compact format of [13, 7] is ambiguous, patterns need to be expressed with an unstructured list of pairwise relations of all intervals [12] that grows quickly with the number of intervals.

We present the Time Series Knowledge Representation (TSKR), a new hierarchical language for the representation of temporal knowledge based on interval time series. The TSKR extends the Unification-based Temporal Grammar (UTG) that was proposed by Ultsch in [26, 27] and applied to the analysis of sleeping disorders in [11]. The central pattern elements of the UTG are Events and Sequences shown in Figure 4. Events combine several more or less simultaneous intervals, a robust version of Allen's equals. The number of intervals in an Event is restricted, however, to the dimensionality of the interval series. Sequences describe an ordering of several Events with immediately followed by or followed by after at most $t$ time units.

Figure 4: Core patterns of the UTG: Events describe a group of almost simultaneously starting and ending intervals, Sequences describe an order of Events.

The TSKR extends the UTG significantly by allowing an arbitrary number of coinciding parts of intervals, in contrast to the fixed number of complete intervals within Events, and by relaxing the total order in Sequences to a partial order. For more details on the relation of the TSKR to the UTG we refer the interested reader to [18]. Other interval mining methods are not as powerful as Allen's relations or the UTG. In [29] only containments of intervals and in [14] only associations of successive intervals are searched.

3. TIME SERIES KNOWLEDGE REPRESENTATION

The TSKR is a hierarchical interval language describing the temporal concepts of duration, coincidence, and partial order in interval time series. The basic primitives are labeled intervals called Tones representing duration. A Tone pattern describes a property of the temporal process that is (repeatedly) observed during time intervals. For example, a Tone labeled temperature increasing could be obtained by segmenting a numerical time series and considering the slope on each segment. In the following paragraphs we describe the higher level TSKR patterns composed of Tones. Simultaneously occurring Tones form a Chord, representing coincidence. Several Chords connected with a partial order form a Phrase. We only briefly outline the mining algorithms; for more details see [17, 18].

In data mining the input data is usually measured at discrete time points of a certain resolution representing a sample of the generating time continuous process. Without loss of generality, we define the following patterns based on the natural numbering $T$ of a set of uniformly spaced time points. Let $\Sigma$ be a set of unique symbols.

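Definition 1. A symbolic time interval is a triple $[\sigma, s, e]$ with $\sigma \in \Sigma$, $[s, e] \in T^2$, $s \le e$. The duration of a symbolic interval is $d([s, e]) = e - s + 1$. We write $[\sigma, s, e] \subseteq [\sigma', s', e']$ if $s' \le s$ and $e \le e'$, with equality iff $s = s'$ and $e = e'$. A symbolic time interval $[\sigma, s, e]$ is maximal if no strictly larger interval carries the same symbol, i.e., $\forall [\sigma', s', e'] \supset [\sigma, s, e]: \sigma' \ne \sigma$. If $\{s, ..., e\} \cap \{s', ..., e'\} \ne \emptyset$ we say that the intervals $[\sigma, s, e]$ and $[\sigma', s', e']$ overlap.

Definition 2. We define a partial order $\prec$ of intervals as $[s_1, e_1] \prec [s_2, e_2] \Leftrightarrow e_1 < s_2$. We say that $[s_1, e_1]$ is before $[s_2, e_2]$.

Definition 3. A symbolic interval series is a set of non-overlapping symbolic time intervals $\mathcal{I} = \{[\sigma_i, s_i, e_i] \mid \sigma_i \in \Sigma, [s_i, e_i] \in T^2, i = 1, ..., N; e_j < s_{j+1}, j = 1, ..., N-1\}$. The duration of an interval series is $d(\mathcal{I}) = \sum_{i=1}^{N} d([s_i, e_i])$.

Definition 4. A symbolic interval sequence is a set of symbolic time intervals $\mathcal{I} = \{[\sigma_i, s_i, e_i] \mid \sigma_i \in \Sigma, [s_i, e_i] \in \mathbb{I}, i = 1, ..., N\}$.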
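Definitions 1 to 4 translate almost directly into code. The following is a minimal sketch on integer time points with inclusive end points; the class and function names are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class SymbolicInterval(NamedTuple):
    """A symbolic time interval [symbol, start, end] on discrete time points."""
    symbol: str
    start: int
    end: int  # inclusive, start <= end


def duration(iv: SymbolicInterval) -> int:
    # d([s, e]) = e - s + 1, because both end points belong to the interval.
    return iv.end - iv.start + 1


def overlap(a: SymbolicInterval, b: SymbolicInterval) -> bool:
    # The intervals share at least one time point.
    return max(a.start, b.start) <= min(a.end, b.end)


def before(a: SymbolicInterval, b: SymbolicInterval) -> bool:
    # Partial order of Definition 2: a ends strictly before b starts.
    return a.end < b.start


def is_interval_series(intervals: list[SymbolicInterval]) -> bool:
    # Definition 3: sorted by start, each interval must end before the next starts.
    ordered = sorted(intervals, key=lambda iv: iv.start)
    return all(before(x, y) for x, y in zip(ordered, ordered[1:]))


def series_duration(intervals: list[SymbolicInterval]) -> int:
    # Duration of a series: sum of the durations of its intervals.
    return sum(duration(iv) for iv in intervals)


series = [SymbolicInterval("A", 1, 4), SymbolicInterval("A", 6, 9)]
sequence = series + [SymbolicInterval("B", 3, 7)]
```

Here `series` satisfies the non-overlap condition of Definition 3, while `sequence` is only an interval sequence in the sense of Definition 4 because the interval labeled B overlaps both intervals labeled A.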

At the core of our knowledge representation stand the patterns and the occurrences of patterns in the input data.

Definition 5. Let $\Lambda$ be a finite set of labels $\lambda$. Let $l : \Sigma \mapsto \Lambda$ be the function assigning each symbol a label. Let $\Phi$ be a set of characteristic functions $\phi_X : \mathbb{I} \to \{True, False\}$, where $X$ is some arbitrary input data. A pattern $(\sigma, \lambda, \phi_X)$ is a semiotic triple [26] composed of:

$\sigma \in \Sigma$, a unique symbol representing the pattern in higher level constructs (syntax),

$\lambda = l(\sigma) \in \Lambda$, a label providing a textual description of the practical meaning of the pattern (pragmatic),

$\phi_X \in \Phi$, a characteristic function determining when the pattern occurs (semantic).

The semantic of $\phi$ and the data type of $X$ will differ for the specific patterns defined below. We write $\phi$ for $\phi = True$ and $\neg\phi$ for $\phi = False$.

The concept of semiotic triples is consistently used on all levels of the TSKR pattern hierarchy. It enables the pragmatic annotation of each pattern with labels to aid the later interpretation when used as parts of larger patterns. A value-based Tone obtained by discretizing a univariate time series $\{v_t \in \mathbb{R} \mid t \in T\}$ has a characteristic function of the form $\phi_A([s, e]) \Leftrightarrow v_{min} < v_i \le v_{max} \; \forall i \in \{s, ..., e\} \subseteq T$ where $v_{min}, v_{max} \in \mathbb{R}$, and could be labeled high or medium.

Definition 6. An occurrence of a pattern is an interval $[s, e] \in \mathbb{I}$ with $\phi_X([s, e])$. A maximal occurrence is an occurrence $[s, e]$ such that $\forall [s', e'] \supset [s, e]: \neg\phi_X([s', e'])$.

In this study we assume that a set of Tones is given. The observations according to the characteristic function of a single Tone form a symbolic interval series; the observations of several Tones form a symbolic interval sequence.

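Definition 7. A Chord pattern is a semiotic triple $c = (\sigma, \lambda, \phi^{\mathbf{T}}_{\mathcal{T}})$ with $\sigma \in \Sigma$, $\lambda \in \Lambda$, and a characteristic function $\phi^{\mathbf{T}}_{\mathcal{T}}$ indicating the simultaneous occurrences of $k$ Tones $\mathbf{T} = \{t_i = (\sigma_i, \lambda_i, \phi_i) \mid i = 1, ..., k, k > 0\}$ on a given time interval according to the interval sequence $\mathcal{T}$ with occurrences of the Tones $\mathbf{T}$:

$\phi^{\mathbf{T}}_{\mathcal{T}}([s, e]) \Leftrightarrow \phi_1([s, e]) \wedge ... \wedge \phi_k([s, e])$ (1)

We say the Chord $c_i \supset c_j$ is a super-Chord of $c_j$ if $c_i$ describes the coincidence of a superset of the Tones from $c_j$.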

Definition 8. Let the support set of a Chord be the symbolic interval series of all maximal intervals. The support $sup(c)$ of a Chord $c$ is the duration of the support set.

Often, the support set will be restricted to intervals with a duration of at least $\delta > 0$, to exclude very short temporal phenomena that are not meaningful for the application under study. In Figure 5 the occurrences of the Tones A, B, and C are shown in the top segment. In the middle segment all maximal Chords of size 2 and 3 are shown. All support sets consist of two intervals. The occurrence of a Chord pattern implies the occurrence of all sub-Chords on the same interval, e.g., AB, BC, and AC for ABC. In general, larger Chords can be considered more interesting because they are more specific. We use the concept of closedness, motivated by closed itemsets [20], to non-redundantly represent a set of Chords.

Figure 5: Simultaneously occurring Tones form maximal Chords. Phrases describe a (partial) order of (not necessarily maximal) Chords.
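As a rough illustration of Definitions 7 and 8, the maximal occurrences of a Chord can be computed by intersecting the time points covered by each participating Tone. This is only a sketch on small integer time axes, not the mining algorithm of [17, 18]; the function names, the `min_duration` parameter, and the example intervals are assumptions made for illustration.

```python
def chord_occurrences(tone_series, min_duration=1):
    """Maximal intervals on which all given Tones are observed simultaneously.

    tone_series: one interval series per Tone, each a list of (start, end)
    pairs on integer time points with inclusive end points.
    Simplification: directly adjacent intervals of one Tone are merged.
    """
    active = None
    for series in tone_series:
        points = set()
        for s, e in series:
            points.update(range(s, e + 1))
        active = points if active is None else active & points
    # Group consecutive time points into maximal runs.
    occurrences = []
    run_start = prev = None
    for t in sorted(active or ()):
        if prev is None or t != prev + 1:
            if prev is not None:
                occurrences.append((run_start, prev))
            run_start = t
        prev = t
    if prev is not None:
        occurrences.append((run_start, prev))
    # Drop occurrences shorter than the minimum duration.
    return [(s, e) for s, e in occurrences if e - s + 1 >= min_duration]


def support(occurrences):
    # Support of a Chord: total duration of its support set.
    return sum(e - s + 1 for s, e in occurrences)


tone_a = [(1, 10), (20, 30)]
tone_b = [(5, 25)]
tone_c = [(8, 12), (22, 35)]
chord_ab = chord_occurrences([tone_a, tone_b])
chord_abc = chord_occurrences([tone_a, tone_b, tone_c])
```

In this made-up example both support sets consist of two intervals, and the support of the super-Chord ABC is smaller than that of AB, as expected from the definitions.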

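Definition 9. A Chord $c_i$ is closed if there are no super-Chords that have the same support, i.e., $\forall c_j \supset c_i$, $sup(c_j) < sup(c_i)$.

[...] $k > 0\}$ according to a partial order $E \subseteq \{\sigma_i\}^2$ on a given time interval according to the interval sequence $\mathcal{C}$ with occurrences of the Chords $C$:

$\phi_{\mathcal{C}}([s, e]) \Leftrightarrow (\forall i = 1, ..., k \; \exists [s_i, e_i] \subseteq [s, e]: \phi_i([s_i, e_i]))$ (2)

$\wedge \; (\exists i \in \{1, ..., k\}: s_i = s)$ (3)

$\wedge \; (\exists i \in \{1, ..., k\}: e_i = e)$ (4)

$\wedge \; (\forall i \ne j \in \{1, ..., k\}^2$ (5)

$[\sigma_i, s_i, e_i] \prec [\sigma_j, s_j, e_j]$ (6)

$\Leftarrow (\sigma_i, \sigma_j) \in E)$ (7)

We say the Phrase $p_i \supset p_j$ is a super-Phrase of $p_j$ if $p_i$ describes the partial order of a superset of the Chords of $p_j$ and all common Chords have the same partial order.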

The characteristic function for a Phrase consists of four necessary conditions. Line 2 ensures that the intervals of all Chords are within the Phrase interval, while Line 3 and Line 4 prevent extra room before the first and after the last Chord, respectively. The occurrence of a Phrase is thus maximal by definition. A Phrase occurs on a particular interval but not on the sub-intervals. Lines 5-7 require the Chords to be in the partial order specified by $E$. Note that any two intervals that have an order relation in $E$ are not allowed to overlap. This restriction makes sense because Chords already describe the concept of coincidence. Allowing overlapping Chords within a Phrase would in general mean repeatedly representing the same concept, which can in fact be equally represented by a larger Chord on the interval where two Chords overlap. The bottom segment of Figure 5 shows Chord intervals that are part of a Phrase. The interval for AB is non-maximal to avoid overlap with the interval of ABC.

The power of partial order is shown in Figure 6. The two similar Chord sequences in the bottom rows of Figure 6(a) and Figure 6(b) are summarized by the partial order graph of a Phrase shown in Figure 6(c). Note that, similar to [12], multiple observation intervals of a Tone (here A) are allowed and treated as distinct intervals in the Phrase. In contrast to [12] and similar to [9] we further allow that only a part of an observed interval is used in a pattern. Different parts of the long Tone interval B in Figure 6(a) are used in the Chords AB, ABC, and BC. The Phrase in Figure 6(c) thus summarizes cases where the Tone B in the three Chords AB, BC, and ABC stems from one (Figure 6(a)) or more (Figure 6(b)) occurrences of the Tone B.
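The role of the partial order can be illustrated with a small check of the order condition only. The sketch below assumes that every Chord label occurs exactly once in an instance (the TSKR itself allows repeated Chords), and the edge set `order` is an assumed reading of the graph in Figure 6(c): AB precedes ABC and BC, both of which precede AC, while the order of ABC and BC is left open. All names and interval values are hypothetical.

```python
def matches_phrase(occurrences, order):
    """Check the partial order condition of a Phrase on one candidate instance.

    occurrences: dict mapping a Chord label to its (start, end) interval.
    order: set of (earlier, later) label pairs, the partial order E.
    """
    labels = {label for pair in order for label in pair}
    if not labels.issubset(occurrences):
        return False
    # Every ordered pair must satisfy "before": earlier ends before later starts.
    return all(occurrences[a][1] < occurrences[b][0] for a, b in order)


def phrase_interval(occurrences):
    # The Phrase spans from the first Chord start to the last Chord end.
    starts = [s for s, _ in occurrences.values()]
    ends = [e for _, e in occurrences.values()]
    return (min(starts), max(ends))


order = {("AB", "ABC"), ("AB", "BC"), ("ABC", "AC"), ("BC", "AC")}
instance_a = {"AB": (1, 3), "ABC": (4, 6), "BC": (7, 9), "AC": (10, 12)}
instance_b = {"AB": (1, 3), "BC": (4, 6), "ABC": (7, 9), "AC": (10, 12)}
instance_c = {"AC": (1, 3), "AB": (4, 6), "ABC": (7, 9), "BC": (10, 12)}
```

Both `instance_a` and `instance_b`, which differ only in the order of the two middle Chords, satisfy the same partial order, whereas `instance_c` violates it.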

Definition 12. Let the support set of a Phrase be the symbolic interval series of all maximal intervals. The support $sup(p)$ of a Phrase $p$ is the size of the support set.

Definition 13. A Phrase $p_i$ is closed if there are no super-Phrases that have the same support, i.e., $\nexists p_j \supset p_i$, $sup(p_i) = sup(p_j)$.

Definition 14. A Phrase $p_i$ is margin-closed w.r.t. a threshold $\alpha < 1$ if there are no super-Phrases that have almost the same support, i.e., $\forall p_j \supset p_i$, $\frac{sup(p_j)}{sup(p_i)} < 1 - \alpha$.
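The margin-closedness test can be sketched on the simpler itemset analogy, where a pattern is a set of items and a super-pattern is a proper superset. This is not the modified CHARM procedure used for mining; the function name and the support values are made up for illustration.

```python
def margin_closed(supports, alpha):
    """Keep the patterns that are margin-closed w.r.t. the threshold alpha.

    supports: dict mapping a pattern (frozenset of items) to its support.
    A pattern is kept if every proper super-pattern q satisfies
    sup(q) / sup(p) < 1 - alpha. With alpha = 0 this reduces to closedness.
    """
    kept = []
    for p, sup_p in supports.items():
        # For frozensets, q > p tests whether q is a proper superset of p.
        if all(sup_q / sup_p < 1 - alpha
               for q, sup_q in supports.items() if q > p):
            kept.append(p)
    return kept


supports = {
    frozenset("A"): 100,
    frozenset("AB"): 95,
    frozenset("ABC"): 60,
}
closed = margin_closed(supports, alpha=0.0)
pruned = margin_closed(supports, alpha=0.1)
```

With these made-up numbers all three patterns are closed, but for a threshold of 0.1 the pattern A is dropped because its super-pattern AB retains 95% of its support.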

Just as Chords relate to itemsets, Phrases relate to episodes [16]. The mining of margin-closed Phrases can be performed in several steps [17, 18]: First, the interval sequence of Chords is converted to an itemset sequence with one itemset per interval where no Chords change, containing all currently active Chords. Next, an algorithm for closed sequence mining, e.g., CloSpan [31], is applied using a windowing of the itemset sequence. Similar to [6] the closed sequences are grouped according to their transaction lists. Each group is then converted to a partial order. We perform the grouping according to margin-closedness with a modified version of the CHARM [32] algorithm considering closed sequences as items and the groups as itemsets.

Figure 6: Chords summarize several overlapping Tones. Phrases summarize similar sequences of Chords. Panels: (a) Some Chords, (b) Similar Chords, (c) Phrase for both.

4. ANALYSIS

Having introduced a new language for the description of interval patterns, we compare the TSKR to the patterns of [12, 30] using Allen's interval relations. We show that the pattern language of the TSKR has a higher temporal expressivity, is more robust, and has advantages in interpretability.

4.1 Expressivity

In data mining the purpose of a pattern expressed in a knowledge representation language is to collectively describe many similar or possibly equal situations observed in the data. We say a pattern is more expressive than another if it summarizes more similar, yet qualitatively different situations.

We first show by construction that all patterns expressible with pairwise Allen's interval relations can also be expressed with the TSKR. That means that all occurrences of a single interval pattern described with the pattern format of [12] can also be described by a single TSKR pattern. We write AB for a Chord with coinciding Tones A and B. We write AB → CD for the Phrase expressing the total order of the Chord AB followed by the Chord CD.

Let $I = \{(\sigma_i, s_i, e_i) \mid i = 1, ..., k\}$ be all involved symbolic intervals with arbitrary pairwise relations according to Allen. Let $B = \{b_j\} = \bigcup_{i=1}^{k} \{s_i, e_i\}$ be the sorted set of all interval boundaries. We construct a Phrase with at most $|B| - 1$ Chords where the $j$-th Chord describes the non-empty coincidence of all $\sigma_i$ where $s_i \le b_j$ and $e_i \ge b_{j+1}$, or is skipped otherwise. The Chords have a complete ordering according to the indices $j$.

Consider the example pattern in Figure 7 consisting of six intervals and at least one representative of each of Allen's operators within the pairwise relations. The six resulting Chords are shown in the bottom row.

Figure 7: Conversion of complex pattern using all of Allen's relations to a TSKR pattern.
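The construction described above can be sketched in a few lines. The function name and the example intervals are hypothetical; interval end points are treated as boundaries, so a Chord is emitted for each pair of consecutive boundaries that is covered by at least one interval.

```python
def intervals_to_phrase(intervals):
    """Convert symbolic intervals into a totally ordered list of Chords.

    intervals: list of (symbol, start, end) with start < end.
    Returns one Chord (a sorted tuple of symbols) per pair of consecutive
    boundaries; boundary pairs covered by no interval are skipped.
    """
    boundaries = sorted({t for _, s, e in intervals for t in (s, e)})
    chords = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        # An interval takes part in the j-th Chord if it spans [b_j, b_{j+1}].
        active = tuple(sorted(sym for sym, s, e in intervals
                              if s <= lo and e >= hi))
        if active:
            chords.append(active)
    return chords


overlaps_example = intervals_to_phrase([("A", 0, 6), ("B", 4, 10)])
before_example = intervals_to_phrase([("A", 0, 3), ("B", 5, 8)])
during_example = intervals_to_phrase([("A", 2, 4), ("B", 0, 6)])
```

For A overlaps B the sketch yields the Chord sequence A, AB, B; for A before B the gap between the intervals is skipped; and for A during B the Tone B appears in three consecutive Chords.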

Note that a TSKR pattern constructed in this way describes even more similar situations in the data than the original pattern using Allen's relations. In Figure 8(a) several instances of Allen's relation A overlaps B are shown with the corresponding TSKR pattern A → AB → B. In Figure 8(b) we give some examples of instances that are also covered by the TSKR pattern but not by the Allen pattern. The Phrase also matches observations where the long intervals of A and B in Figure 8(a) are interrupted by noise.

Figure 8: The TSKR pattern A → AB → B can describe all instances of Allen's A overlaps B pattern and more. Panels: (a) Examples of A overlaps B and A → AB → B. (b) Examples also described by A → AB → B.

On the other hand it is not always possible to find a single pattern defined with Allen's relations to describe all situations covered by a certain single TSKR pattern. In particular, Phrases that summarize similar situations as in Figure 6 by utilizing the concept of partial order cannot be expressed by a single pattern using all pairwise relations of Allen. Consider the Chords AB, ABC, BC, and AC as shown in Figure 6(a) at the bottom, resulting from the Tone patterns A, B, C in the top rows. In Figure 6(b) the same Tone intervals result in the same Chords but with the middle two Chords exchanged. Both versions can be captured in a single Phrase (see Figure 6(c)) using the concept of partial order. The order relation of ABC and BC is simply not specified, so both versions of the pattern match this description. The two similar instances cannot be captured with a single pattern using the pattern format of [12]. Further, the instances of the TSKR pattern A → AB → B in Figure 8(a) and Figure 8(b) cannot be described with a single pattern of [12] because they have a different number of intervals.

Even if we restrict ourselves for the moment to TSKR patterns using only a total order among the Chords in a Phrase and exactly the same set of participating Tone intervals, constructing a pattern with pairwise Allen's relations that describes the same situations in the data is not always possible. For example the simple Phrase AB → C, i.e., Tones A and B occur simultaneously followed by an interval where the Tone C occurs, covers instances that correspond to very different relations of Allen. The relation between A and B could be overlaps, starts, during, finishes, equals plus the corresponding inverses. The relation between A (or B) and C could be before, meets, or overlaps. See Figure 9 for examples.

Figure 9: Example of the same simple TSKR Phrase AB → C with different relations according to Allen.

In summary, a single TSKR pattern can express many observations in the data where time intervals have a very similar relative positioning but different relations according to Allen. In contrast, each pattern according to Allen can also be described with a single TSKR pattern. The TSKR achieves a higher level of abstraction over small disturbances in the data while offering clear temporal semantics with the concepts of coincidence and partial order.

4.2 Robustness

Symbolic interval series or sequences obtained from numeric time series inherit the noise present in the original data. The interval boundaries gained from preprocessing steps like discretization of values or segmentation are subject to noise in the measurements. Such time points should thus be considered approximate.

When Allen's relations are used to formulate patterns, such slight variations of interval boundaries can create fragmented patterns that describe the same intuitive relationship between intervals with different operators. An example for two almost equal intervals was given in Figure 1. Many more examples can be constructed. Any pattern using one of Allen's relations that requires equality of two interval boundaries can be destroyed by changing one boundary by only one time unit.

There are approaches to relax the strictness of Allen's relations by using a threshold to consider temporally close interval boundaries equal [2]. This has not been used in temporal pattern mining. Fragmented patterns are still possible if noise causes interval boundaries to be shifted around the threshold value.

In contrast, the Chord and Phrase patterns of the TSKR are designed to be insensitive to small changes of the interval boundaries. The TSKR operator coincides is extremely robust. It only considers the intersection of all participating intervals; any interval can individually be stretched to infinity without changing the pattern at all. All three cases in Figure 1 could be represented well with a Chord describing the coincidence of A and B. The leading and trailing intervals where only A or B is observed would not be considered of significant duration for appropriate choices of the minimum Chord duration.

When making intervals smaller, the pattern breaks down as soon as the smallest interval disappears, as is true for both temporal knowledge representations. The robustness of Phrases directly depends on the Chords. Other aspects of noisy data like missing intervals similarly affect both representations. This can be compensated by allowing alternatives within a pattern [26, 12].

4.3 Interpretability

Interpretability is hard to specify or measure. In [25] natural language descriptions of time series are generated and interpreted by experts. The authors suggest that the presentation should follow the Gricean maxims of quality, relevance, manner, and quantity. We describe the implications of some of the maxims in the context of interval patterns and judge how well the different pattern languages support this paradigm.

The maxim of relevance requires that only patterns relevant to the expert are listed. This is a rather application dependent requirement; still, there are differences in how the pattern languages support this task. The hierarchical structure of the TSKR enables the user to view and filter lower level patterns before the next level constructs are searched. Mining Phrases from only the relevant Chords will be much faster, and fewer and smaller Phrase patterns need to be analyzed in the next step. This form of pattern representation supports the human analysis according to the zoom, filter, and details on demand paradigm [23]. The patterns expressed with Allen's relations are commonly mined in a single step, resulting in a much larger set of patterns for manual analysis.

The maxim of manner suggests to be brief and orderly and to avoid obscurity and ambiguity. The TSKR patterns can be longer or shorter than the Allen patterns of [12]. A pattern of [12] with $k$ intervals always consists of $k(k-1)/2$ pairwise relations. When counting each Chord and each consecutive order relation in a Phrase separately, the worst case for the number of atomic relations in a TSKR pattern is $(2k-1)(2k-2)$ because with $2k$ interval boundaries there can be at most $2k-1$ Chords. These Chords would then occur consecutively involving $2k-2$ binary order relations. In the best case there is only a single relation if all intervals are equal and thus form a single Chord. The typical size of a TSKR pattern given $k$ intervals depends on the data set at hand (see Section 5).

The textual representation of the TSKR can be argued to be more orderly and to avoid obscurity because it uses a hierarchical structure with a partial order relation on the highest level that offers details on demand. In contrast, there is no inherent order in the list of pairwise Allen relations. They could be ordered by the intervals of the first argument of each relation according to a temporal order of the intervals or the symbolic labels. In either case, the list needs to be considered as a whole to understand the pattern.

To avoid ambiguity, a given pattern should semantically describe a single intuitive notion of the relationship of several intervals. This is not the case for all of Allen's relations. The instances of a single pattern can quantitatively vary significantly and represent intuitively different patterns. This was already demonstrated for the overlaps relation in Figure 2; similar examples can be constructed for other relations. Whether different versions of a pattern are semantically equivalent depends on the application. The separation of temporal concepts in the TSKR, however, completely avoids this problem. Chord patterns simply describe the overlapping parts and ignore the rest of the intervals. Phrase patterns express the concept of partial order, allowing variation only in the length of Chords and possible gaps.

The ambiguity problem is even worse for the compact pattern format used in [13] and [7]. The exact same instance of a pattern with $k$ intervals can be written in $k!/2$ different ways by consecutively choosing an interval for the next position in the pattern and excluding duplicates caused by inverse operators. Again, this is not the case for the TSKR: given a set of intervals and a minimum duration for Chords there is only one Phrase of all maximal Chords.

5. EXPERIMENTSWe compare the TSKR with Allens relations on sym-

bolic interval data describing videos1. Different scenariosinvolving a hand moving colored blocks were analyzed withthe Leonard system for recognition of visual events fromvideo camera input [24]. The scenes include simple actionslike putting one block on another and more complex sceneswhere a whole stack of blocks is built. In [9] each sceneis described with a logical formula found by inductive logicprogramming in a supervised process. We use the data inan unsupervised manner and use the ground truth on thescenarios for evaluation purposes only. We preprocessed thedescriptions by normalizing the argument order, filtering outsome redundant descriptions, and merging scenarios thatwere equivalent.We mined Chords with a minimum support of 1%, mini-

mum size of 1, and a minimum count of 10. This leads to 19closed Chords and 15 margin-closed Chords using = 0.1.We further dropped three Chords of size one not involvingthe hand, but rather one of the contacts Tones. These Tonesare only interesting when combined with a hand action. Thelattice of the remaining Chords is shown in Figure 10.The trivial Chord C9 describes the hand holding the red

block. This Chord has very high support, covering 50% ofall time points and has several super-Chords. C13 and C14describe the hand holding the red block, while it sits on topof the green or blue block, respectively. Both have furthersuper-Chords C11 and C12 where yet another block is partof the stack. Note, that the support of the super-Chords isalways at least 10% smaller than that of any immediate sub-Chords, according to = 0.1. The Chords C2 and C3 inthe upper left represent a stack of three blocks not touchedby the hand. Without looking at the original Video data, itis unclear at this point which block sits on top and which atthe bottom. For C2 we only know that blue is in the middle,because it is an argument of both contacts relations. Theco-occurrences of the Chords with the known scenarios arestriking, a selection is listed in Table 1. The Chord patternsare not sufficient, however, to discriminate the scenarios,because they always appear in at least two different scenesthat are reversed versions of one another, e.g. stack andunstack. In order to distinguish each pair, we need patternsexpressing an order among the Chords.Using a minimum frequency of 12, 20 closed sequential

patterns were found and grouped into 10 margin-closedPhrases ( = 0.1). The co-occurrences of the Phrases withthe known scenarios showed an almost perfect correspon-dence, see Table 2. The Phrases further explain the actionsin the videos. The Phrase for the disassemble scenario isshown with pictures from the original Videos in Figure 11.For mining patterns expressed with Allens relations we

used the format of [12] to avoid ambiguity and counted sup-

1ftp://ftp.ecn.purdue.edu/qobi/ama.tar.Z

C2 (#44, 3%):CONTACTS-BLUE-GREEN

CONTACTS-BLUE-RED


CONTACTS-GREEN-RED


ATTACHED-HAND-RED

C7 (#64, 16%):ATTACHED-BLUE-HAND

C8 (#61, 5%):ATTACHED-BLUE-HAND

ATTACHED-BLUE-GREEN

C9 (#213, 50%):ATTACHED-HAND-RED


ATTACHED-BLUE-GREEN


ATTACHED-BLUE-RED


ATTACHED-GREEN-RED


ATTACHED-BLUE-GREEN

ATTACHED-BLUE-RED


ATTACHED-BLUE-GREEN

ATTACHED-GREEN-RED

C15 (#176, 16%):ATTACHED-BLUE-GREEN

Figure 10: The lattice of the most interesting margin-closed Chords from the Video data annotated withfrequency (#) and support (%). The three Chords in the upper right represent the hand holding one of thethree blocks. The two most specific Chords with three Tones describe a stack of three blocks with the handholding the topmost block.

BLUE-CONTACTS-RED

BLUE-CONTACTS-GREEN

HAND-HOLDS-RED

BLUE-CONTACTS-RED

BLUE-CONTACTS-GREEN

HAND-HOLDS-RED

BLUE-CONTACTS-GREEN

HAND-HOLDS-BLUE

BLUE-CONTACTS-GREENHAND-HOLDS-BLUE

Figure 11: Video frames [9] and Phrase explaining the complete disassemble scene.

Scenes/Chords

BLUE

-CO

NTAC

TS-G

REEN

BLUE

-CO

NTAC

TS-R

ED

BLUE

-CO

NTAC

TS-G

REEN

BLUE

-CO

NTAC

TS-R

ED

BLUE

-CO

NTAC

TS-G

REEN

HAN

D-H

OLD

S-RE

D

stack 0 28 30unstack 0 29 30assemble 14 0 34disassemble 30 0 30

Table 1: Co-occurrences of Chords and Videoscenes.

port as occurrences in windows [13] to be comparable withthe Phrase mining. We further pruned the set of patternsapplying the concept of margin-closedness in a brute forcepost-processing step. Using minimum size of two and a min-imum frequency of 10, 363 patterns were found, 174 of whichwere margin-closed for = 0.1. This large amount of pat-terns could not be analyzed manually, some filtering neededto be applied.We first looked at the 14 largest patterns with five or more

intervals. Patterns of these sizes were only observed within the (dis)assemble scenes with at most 21 out of 30 repetitions. The patterns for the assemble scene further described only fragments of the true temporal phenomena. None of them contained any information about the hand taking the red block and placing it on top of the stack. We alternatively filtered the mining results by a minimum frequency of 24 and a minimum size of three. These patterns were all composed of only three symbolic intervals, not sufficiently explaining the complex events of these scenes. As an example, Figure 12 shows a pattern found in 26 out of 30 of the disassemble scenes. The intervals only correspond to the first three video frames of Figure 11 and fail to mention important parts of the scene. The TSKR Phrases consist of up to 10 (sub-)intervals, providing much more information.

Figure 12: Example of frequent Allen pattern A1 explaining only fragments of the disassemble scene in Figure 11.

Apart from the more specific explanations that the TSKR

patterns provide, they are also almost always more precise and more distinctive than the Allen patterns of [12], as shown in Table 3. For each video scene, precision and recall of the best pattern from each representation were calculated. Only the recall for the complex assemble scene is better for Allen/Höppner. In this case the TSKR pattern is much more complex and provides better insight into the temporal phenomena of the scene, sacrificing some discrimination power for explanation.

Scenes/Phrases   (1)   (2)   (3)   (4)
stack             28     0     0     0
unstack            0    28     0     0
assemble           0     0    28     0
disassemble        0     0     0    30

Columns (1)-(4) are four of the Phrases, composed of Chords over the ATTACHED-*, CONTACTS-*, BLUE-CONTACTS-*, and HAND-HOLDS-* Tones.

Table 2: Co-occurrences of Phrases and Video scenes.

A detailed analysis of the pattern for put-down revealed

five visually very similar patterns with the same intervals but slightly different pairwise relations (see Figure 13). Only the first combination is frequent enough to appear in the mining result; the others would commonly be filtered out from the large number of patterns found because they are small and rare.

We further evaluated the typical size of TSKR patterns on this data set. We ordered the intervals by their start points and end points. Each interval was used to generate a pattern of size k by selecting the next $k-1$ intervals as long as consecutive intervals shared at least one time point. Such a pattern can be described by $k(k-1)/2$ of Allen's relations. For the creation of TSKR patterns we considered all intervals between any two consecutive time points taken from the de-duplicated set of the 2k start and end points

Scenes         TSKR                  Allen/Höppner
               Precision   Recall    Precision   Recall
pick-up        100.0       100.0      42.3        36.7
put-down        97.8        62.0      41.9        60.0
stack          100.0        93.3      80.0        66.7
unstack        100.0        93.3     100.0        90.0
assemble       100.0        93.3     100.0       100.0
disassemble    100.0       100.0      76.5        86.7

Table 3: Precision and recall for Allen and TSKR patterns.
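The precision and recall figures in Table 3 can be illustrated with a minimal sketch. The function name `precision_recall` and the scene identifiers are hypothetical, and the example numbers (a pattern matching 28 of 30 scenes of one type and nothing else) are chosen for illustration only; that the table entries were derived exactly this way is an assumption.

```python
def precision_recall(matched, relevant):
    """Return (precision, recall) in percent.

    matched:  scenes in which the pattern occurs (hypothetical identifiers)
    relevant: scenes that carry the target scene label
    """
    matched, relevant = set(matched), set(relevant)
    hits = len(matched & relevant)
    precision = 100.0 * hits / len(matched) if matched else 0.0
    recall = 100.0 * hits / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative data: 30 labeled scenes, the pattern is found in 28 of them
# and in no scene of any other type.
labeled = {f"stack-{i}" for i in range(30)}
found = {f"stack-{i}" for i in range(28)}

p, r = precision_recall(found, labeled)
print(round(p, 1), round(r, 1))  # 100.0 93.3
```

A pattern that never fires outside its scene type has perfect precision; its recall then depends only on how many repetitions it covers.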

[Figure 14 image: line plot. x-axis: Number of intervals (1 to 9); y-axis: Pattern size (0 to 30); legend: Allen/Hoeppner, TSKR mean ± std, TSKR maximum, TSKR minimum.]

Figure 14: Fixed size of Allen patterns compared to typical size of TSKR patterns given k intervals of the video data.

of the k intervals. Assuming the strictest minimum Chord duration of 1, we created a Chord for each of these intervals describing the coincidence of all original intervals that intersect it. Finally, a Phrase describing the total order of these Chords was considered. Let the size of the de-duplicated set of time points be l. Counting each Chord and each consecutive order within a Phrase as an elementary relation, the size of the TSKR pattern is $(l-1) + (l-2)$. In Figure 14 the mean and standard deviation, as well as the minimum and maximum of the TSKR pattern sizes, are compared to the fixed size of Allen patterns for several values of k.

For very small k, Allen's relations can usually describe a group of intervals more compactly. As an explanation consider the example A overlaps B from Figure 8(a). While there is only one relation of Allen for the two intervals, the TSKR considers 3 Chords with 2 order relations (size 5). For larger patterns the opposite effect is observed on this data set. For k = 4 the typical size of TSKR patterns is about the same as that of the Allen patterns, and for larger patterns it is much smaller. Even the maximum TSKR pattern size observed on this data set is well below the size of Allen patterns for $k \geq 7$. The more intervals are involved, the more the pattern size seems to profit from Chords that group more than 2 concurrently observed intervals. For k = 8 only a few TSKR patterns, all of the same size, could be constructed with the process described above; for $k \geq 9$ no such patterns were available.
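The size comparison described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used for the experiments: the function names are hypothetical, intervals are given as a name-to-(start, end) mapping, and an original interval is taken to belong to a Chord when it overlaps the segment between two consecutive time points.

```python
def allen_pattern_size(k):
    """Number of pairwise Allen relations for k intervals: k(k-1)/2."""
    return k * (k - 1) // 2


def tskr_chords(intervals):
    """Build one Chord per segment between consecutive distinct endpoints.

    intervals: dict mapping a label to a (start, end) pair with start < end.
    Returns a list of Chords, each a sorted list of labels active in the segment.
    """
    points = sorted({p for start, end in intervals.values() for p in (start, end)})
    chords = []
    for left, right in zip(points, points[1:]):
        active = sorted(
            name
            for name, (start, end) in intervals.items()
            if start < right and end > left  # interval overlaps the segment
        )
        chords.append(active)
    return chords


def tskr_pattern_size(intervals):
    """(l - 1) Chords plus (l - 2) consecutive order relations."""
    n_chords = len(tskr_chords(intervals))
    return n_chords + max(n_chords - 1, 0)


# "A overlaps B": one Allen relation, but 3 Chords and 2 order relations.
example = {"A": (0, 5), "B": (3, 8)}
print(allen_pattern_size(len(example)))  # 1
print(tskr_chords(example))              # [['A'], ['A', 'B'], ['B']]
print(tskr_pattern_size(example))        # 5
```

The Allen size grows quadratically in k, while the TSKR size grows linearly in the number of distinct endpoints, which is consistent with the crossover reported for this data set.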

Figure 13: Five similar patterns for put-down, only one of which is frequent. A is the hand holding the red block, B is the placing of the red on top of the green block, C is the red on the green block without the hand.
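The fragmentation visible in Figure 13 can be reproduced with a small sketch that classifies two intervals by Allen's thirteen relations. The function name `allen_relation` and the numeric intervals are hypothetical and chosen for illustration; intervals are assumed to be (start, end) pairs with start < end.

```python
def allen_relation(a, b):
    """Return the name of Allen's relation that holds between intervals a and b."""
    (a_start, a_end), (b_start, b_end) = a, b
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    return "overlaps" if a_start < b_start else "overlapped-by"


# Shifting one boundary by a single time unit changes the relation.
a = (0, 10)
for b in [(9, 20), (10, 20), (11, 20)]:
    print(b, allen_relation(a, b))
# (9, 20) overlaps
# (10, 20) meets
# (11, 20) before
```

Three visually almost identical situations yield three different relations, so their support is split across three patterns.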

6. DISCUSSION

The descriptions found by the TSKM in the Video data in an unsupervised process directly explained the known experimental setups. Several valid patterns based on Allen's relations were also found, but they mostly explained only fragments of the scenarios and showed less correspondence to the ground truth. This was partly explained by pattern fragmentation.

In [12] the rule sets are defragmented with disjunctions of patterns, but this makes the result even longer and harder to understand. Take for example a rule disjunctively combining the 5 cases of Figure 13. Another solution to the problem is to use thresholds, effectively merging the problematic patterns. This does not solve the problems of ambiguity and comprehensibility, however, and poses the problem of threshold selection. Ambiguity could be reduced by splitting patterns with potentially high variability into several different patterns, e.g. using the mid points [22]. But using 49 instead of 13 relations with many additional conditions requiring equality of time points will in turn increase the effects of pattern fragmentation.

The reason for the TSKR being more expressive than Höppner's representation using Allen's relations is the permission of partial order and of subintervals in patterns. One could also mine partially related Allen patterns by allowing blanks in the matrix of pairwise relations. This would make the mining even more costly. Further, imagine being given a listing of 10 pairwise relations of 5 intervals and the task to draw a qualitative example of the pattern. This is not at all a trivial task; many dependencies need to be considered. In fact, this corresponds to the constraint satisfaction problem of temporal reasoning, which is NP-complete [28]. The same task is rather easy given a description in the TSKR language, and shouldn't this be the case in order to call a pattern understandable?

If the semantics of Allen's relations are explicitly desired for an application, one should be aware of the negative effects of noise and the general disadvantages of this representation for knowledge discovery. For small patterns a qualitative visualization of the pattern might not adequately represent all instances found, as there can be large quantitative differences. For larger patterns this effect may be weaker because with more pairwise relations there are fewer possibilities for quantitative variance. In addition to interval-based pattern visualizations, the TSKR also offers a more abstract graph visualization of Phrases and Chords.

The TSKR was designed for unsupervised learning in noisy interval data. It can just as well be used with data where exact relations of interval endpoints matter. It is assumed in general, however, that the temporal process producing the symbolic intervals is not chaotic and that the properties described by Tones can be observed during intervals of a minimum length meaningful to the analyst. We believe that other data mining tasks like classification or anomaly detection can also profit from the high robustness and high understandability of the TSKR. Other possible extensions of the TSKR include the use of quantitative information [12], generation of implication rules [1, 12, 5, 30], and fuzzy itemsets and association rules [8].
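The threshold-based workaround mentioned in this section can be sketched as follows. This is an illustrative construction, not the procedure of [12]: two intervals are described by the signs of their four endpoint comparisons, and endpoints closer than a hypothetical tolerance `eps` are treated as equal. The function names and numbers are assumptions.

```python
def compare(x, y, eps):
    """Three-way comparison that treats values within eps as equal."""
    if abs(x - y) <= eps:
        return 0
    return -1 if x < y else 1


def relation_signature(a, b, eps=0.0):
    """Signs of the four endpoint comparisons between intervals a and b.

    With eps = 0 each of Allen's relations has its own signature; with
    eps > 0 neighboring relations collapse into the same signature.
    """
    (a_start, a_end), (b_start, b_end) = a, b
    return (
        compare(a_start, b_start, eps),
        compare(a_start, b_end, eps),
        compare(a_end, b_start, eps),
        compare(a_end, b_end, eps),
    )


a = (0, 10)
variants = [(9, 20), (10, 20), (11, 20)]  # overlaps, meets, before

exact = {relation_signature(a, b) for b in variants}
merged = {relation_signature(a, b, eps=1.0) for b in variants}
print(len(exact), len(merged))  # 3 1
```

The merge depends entirely on the choice of `eps`, and tolerance-based equality is not transitive, which illustrates why threshold selection remains a problem of its own.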

7. SUMMARY

We presented a novel way of extracting understandable patterns from multivariate temporal data with the aim of knowledge discovery. We defined the Time Series Knowledge Representation (TSKR), a new pattern language for expressing temporal knowledge. Efficient algorithms for mining such patterns are available [17, 18]. The TSKR is more robust, more expressive, and easier to interpret than previously proposed approaches using Allen's relations. Unsupervised pattern mining with the TSKR was demonstrated using video data. Compared to Allen's relations, many fewer patterns needed to be analyzed with the TSKR, and they were more specific and more accurate.

Acknowledgements: This work was done in the Databionic Research Group at the Philipps-University Marburg, Germany. We thank Alfred Ultsch for guidance and discussion during this time. We further thank Alan Fern for donating and explaining the video data.

8. REFERENCES

[1] R. Agrawal, T. Imielinski, and A. N. Swami. Mining association rules between sets of items in large databases. In Proc. SIGMOD, pages 207-216, 1993.

[2] M. Aiello, C. Monz, L. Todoran, and M. Worring. Document understanding for a broad class of documents. IJDAR, 5(1):1-16, 2002.

[3] J. F. Allen. Maintaining knowledge about temporal intervals. Communications of the ACM, 26(11):832-843, 1983.

[4] R. J. Bayardo. Efficiently mining long patterns from databases. In Proc. PODS, pages 85-93, 1998.

[5] R. Bellazzi, C. Larizza, P. Magni, and R. Bellazzi. Temporal data mining for the quality assessment of hemodialysis services. Artificial Intelligence in Medicine, 34:25-39, 2005.

[6] G. Casas-Garriga. Summarizing sequential data with closed partial orders. In Proc. SDM'05, pages 380-391, 2005.

[7] P. R. Cohen. Fluent learning: Elucidating the structure of episodes. In Proc. IDA'01, pages 268-277, 2001.

[8] D. Dubois, E. Hüllermeier, and H. Prade. A systematic approach to the assessment of fuzzy association rules. Data Mining and Knowledge Discovery, 2006. To appear.

[9] A. Fern. Learning Models and Formulas of a Temporal Event Logic. PhD thesis, Purdue University, 2004.

[10] H. Grice. Studies in the Way of Words. Harvard University Press, 1989.

[11] G. Guimarães and A. Ultsch. A method for temporal knowledge conversion. In Proc. IDA'99, pages 369-380, 1999.

[12] F. Höppner. Knowledge Discovery from Sequential Data. PhD thesis, Technical University Braunschweig, Germany, 2003.

[13] P.-S. Kam and A. W.-C. Fu. Discovering temporal patterns for interval-based events. In Proc. DaWaK'00, pages 317-326, 2000.

[14] M. Last, Y. Klein, and A. Kandel. Knowledge discovery in time series databases. IEEE Trans. on Systems, Man, and Cybernetics, 31(1):160-169, 2001.

[15] M.-Y. Lin and S.-Y. Lee. Fast discovery of sequential patterns by memory indexing. In Proc. DaWaK, pages 150-160, 2002.

[16] H. Mannila, H. Toivonen, and I. Verkamo. Discovery of frequent episodes in event sequences. In Proc. KDD'96, pages 210-215, 1995.

[17] F. Mörchen. Algorithms for time series knowledge mining. In Proc. KDD, 2006.

[18] F. Mörchen. Time Series Knowledge Mining. PhD thesis, Philipps-University Marburg, Germany, 2006.

[19] P. Papapetrou, G. Kollios, S. Sclaroff, and D. Gunopulos. Discovering frequent arrangements of temporal intervals. In Proc. ICDM, pages 354-361, 2005.

[20] N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. Discovering frequent closed itemsets for association rules. In Proc. ICDT'99, pages 398-416, 1999.

[21] C. Rainsford and J. Roddick. Adding temporal semantics to association rules. In Proc. PKDD'99, pages 504-509, 1999.

[22] J. F. Roddick and C. H. Mooney. Linear temporal sequences and their interpretation using midpoint relationships. IEEE TKDE, 17(1):133-135, 2005.

[23] B. Shneiderman. The eyes have it: A task by data type taxonomy for information visualizations. In Proc. IEEE Symposium on Visual Languages, page 336, 1996.

[24] J. M. Siskind. Grounding the lexical semantics of verbs in visual perception using force dynamics and event logic. Journal of Artificial Intelligence Research, 15:31-90, 2001.

[25] S. G. Sripada, E. Reiter, and J. Hunter. Generating English summaries of time series data using the Gricean maxims. In Proc. KDD'03, pages 187-196, 2003.

[26] A. Ultsch. Eine unifikationsbasierte Grammatik zur Beschreibung von komplexen Mustern in multivariaten Zeitreihen. Personal notes, 1996. German.

[27] A. Ultsch. Unification-based temporal grammar. Technical Report 37, CS Dept., Philipps-University Marburg, Germany, 2004.

[28] M. Vilain, H. A. Kautz, and P. G. van Beek. Constraint propagation algorithms for temporal reasoning. Readings in Qualitative Reasoning about Physical Systems, pages 373-381, 1989.

[29] R. Villafane, K. Hua, D. Tran, and B. Maulik. Knowledge discovery from series of interval events. Journal of Intelligent Information Systems, 15(1):71-89, 2000.

[30] E. Winarko and J. F. Roddick. Armada - an algorithm for discovering richer relative temporal association rules from interval-based data. Data & Knowledge Engineering, 2006. To appear.

[31] X. Yan, J. Han, and R. Afshar. CloSpan: Mining closed sequential patterns in large datasets. In Proc. SDM'03, pages 166-177, 2003.

[32] M. J. Zaki and C.-J. Hsiao. Efficient algorithms for mining closed itemsets and their lattice structure. IEEE TKDE, 17(4):462-478, 2005.
