VLDB Journal (2005) 14: 2–29 / Digital Object Identifier (DOI) 10.1007/s00778-003-0111-3

Join operations in temporal databases

Dengfeng Gao1, Christian S. Jensen2, Richard T. Snodgrass1, Michael D. Soo3

1 Computer Science Department, P.O. Box 210077, University of Arizona, Tucson, AZ 85721-0077, USA; e-mail: {dgao,rts}@cs.arizona.edu

2 Department of Computer Science, Aalborg University, Fredrik Bajers Vej 7E, 9220 Aalborg Ø, Denmark; e-mail: [email protected]

3 Amazon.com, Seattle; e-mail: [email protected]

Edited by T. Sellis. Received: October 17, 2002 / Accepted: July 26, 2003. Published online: October 28, 2003 – © Springer-Verlag 2003

Abstract. Joins are arguably the most important relational operators. Poor implementations are tantamount to computing the Cartesian product of the input relations. In a temporal database, the problem is more acute for two reasons. First, conventional techniques are designed for the evaluation of joins with equality predicates rather than the inequality predicates prevalent in valid-time queries. Second, the presence of temporally varying data dramatically increases the size of a database. These factors indicate that specialized techniques are needed to efficiently evaluate temporal joins.

We address this need for efficient join evaluation in temporal databases. Our purpose is twofold. We first survey all previously proposed temporal join operators. While many temporal join operators have been defined in previous work, this work has been done largely in isolation from competing proposals, with little, if any, comparison of the various operators. We then address evaluation algorithms, comparing the applicability of various algorithms to the temporal join operators and describing a performance study involving algorithms for one important operator, the temporal equijoin. Our focus, with respect to implementation, is on non-index-based join algorithms. Such algorithms do not rely on auxiliary access paths but may exploit sort orderings to achieve efficiency.

Keywords: Attribute skew – Interval join – Partition join – Sort-merge join – Temporal Cartesian product – Temporal join – Timestamp skew

1 Introduction

Time is an attribute of all real-world phenomena. Consequently, efforts to incorporate the temporal domain into database management systems (DBMSs) have been ongoing for more than a decade [39,55]. The potential benefits of this research include enhanced data modeling capabilities and more conveniently expressed and efficiently processed queries over time.

Whereas most work in temporal databases has concentrated on conceptual issues such as data modeling and query languages, recent attention has been on implementation-related issues, most notably indexing and query processing strategies. In this paper, we consider an important subproblem of temporal query processing, the evaluation of ad hoc temporal join operations, i.e., join operations for which indexing or secondary access paths are not available or appropriate. Temporal indexing, which has been a prolific research area in its own right [44], and query evaluation algorithms that exploit such temporal indexes are beyond the scope of this paper.

Joins are arguably the most important relational operators. This is so because efficient join processing is essential for the overall efficiency of a query processor. Joins occur frequently due to database normalization and are potentially expensive to compute [35]. Poor implementations are tantamount to computing the Cartesian product of the input relations. In a temporal database, the problem is more acute. Conventional techniques are aimed at the optimization of joins with equality predicates, rather than the inequality predicates prevalent in temporal queries [27]. Moreover, the introduction of a time dimension may significantly increase the size of the database. These factors indicate that new techniques are required to efficiently evaluate joins over temporal relations.

This paper aims to present a comprehensive and systematic study of join operations in temporal databases, including both semantics and implementation. Many temporal join operators have been proposed in previous research, but little comparison has been performed with respect to the semantics of these operators. Similarly, many evaluation algorithms supporting these operators have been proposed, but little analysis has appeared with respect to their relative performance, especially in terms of empirical study.

The main contributions of this paper are the following:

• To provide a systematic classification of temporal join operators as natural extensions of conventional join operators.

• To provide a systematic classification of temporal join evaluation algorithms as extensions of common relational query evaluation paradigms.

• To empirically quantify the performance of the temporal join algorithms for one important, frequently occurring, and potentially expensive temporal operator.


Our intention is for DBMS vendors to use the contributions of this paper as part of a migration path toward incorporating temporal support into their products. Specifically, we show that nearly all temporal query evaluation work to date has extended well-accepted conventional operators and evaluation algorithms. In many cases, these operators and techniques can be implemented with small changes to an existing code base and with acceptable, though perhaps not optimal, performance.

Research has identified two orthogonal dimensions of time in databases – valid time, modeling changes in the real world, and transaction time, modeling the update activity of the database [23,51]. A database may support none, one, or both of the given time dimensions. In this paper, we consider only single-dimension temporal databases, so-called valid-time and transaction-time databases. Databases supporting both time dimensions, so-called bitemporal databases, are beyond the scope of this paper, though many of the described techniques extend readily to bitemporal databases. We will use the terms snapshot, relational, or conventional database to refer to databases that provide no integrated support for time.

The remainder of the paper is organized as follows. We propose a taxonomy of temporal join operators in Sect. 2. This taxonomy extends well-established relational operators to the temporal context and classifies all previously defined temporal operators. In Sect. 3, we develop a corresponding taxonomy of temporal join evaluation algorithms, all of which are non-index-based algorithms. The next section focuses on engineering the algorithms. It turns out that getting the details right is essential for good performance. In Sect. 5, we empirically investigate the performance of the evaluation algorithms with respect to one particular, and important, valid-time join operator. The algorithms are tested under a variety of resource constraints and database parameters. Finally, conclusions and directions for future work are offered in Sect. 6.

2 Temporal join operators

In the past, temporal join operators were defined in different temporal data models; at times essentially the same operators were even given different names when defined in different models. Further, the existing join algorithms have also been constructed within the contexts of different data models. This section enables the comparison of join definitions and implementations across data models. We thus proceed to propose a taxonomy of temporal joins and then use this taxonomy to classify all previously defined temporal joins.

We take as our point of departure the core set of conventional relational joins that have long been accepted as "standard" [35]: Cartesian product (whose "join predicate" is the constant expression TRUE), theta join, equijoin, natural join, left and right outerjoin, and full outerjoin. For each of these, we define a temporal counterpart that is a natural, temporal generalization of it. This generalization hinges on the notion of snapshot equivalence [26], which states that two temporal relations are equivalent if they consist of the same sequence of time-indexed snapshots. We note that some other join operators do exist, including semijoin, antisemijoin, and difference. Their temporal counterparts have been explored elsewhere [11] and are not considered here.

Having defined this set of temporal joins, we show how all previously defined operators are related to this taxonomy of temporal joins. The previous operators considered include Cartesian product, Θ-JOIN, EQUIJOIN, NATURAL JOIN, TIME JOIN [6,7], TE JOIN, TE OUTERJOIN, and EVENT JOIN [20,46,47,52] and those based on Allen's [1] interval relations ([27,28,36]). We show that many of these operators incorporate less restrictive predicates or use specialized attribute semantics and thus are variants of one of the taxonomic joins.

2.1 Temporal join definitions

To be specific, we base the definitions on a single data model. We choose the model that is used most widely in temporal data management implementations, namely, the one that timestamps each tuple with an interval. We assume that the timeline is partitioned into minimal-duration intervals, termed chronons [12], and we denote intervals by inclusive starting and ending chronons.

We define two temporal relational schemas, R and S, as follows.

R = (A1, ..., An, Ts, Te)
S = (B1, ..., Bm, Ts, Te)

The Ai, 1 ≤ i ≤ n, and Bi, 1 ≤ i ≤ m, are the explicit attributes found in corresponding snapshot schemas, and Ts and Te are the timestamp start and end attributes, recording when the information recorded by the explicit attributes holds (or held or will hold) true. We will use T as shorthand for the interval [Ts, Te] and A and B as shorthand for {A1, ..., An} and {B1, ..., Bm}, respectively. Also, we define r and s to be instances of R and S, respectively.

Example 1 Consider the following two temporal relations. The relations show the canonical example of employees, the departments they work for, and the managers who supervise those departments.

Employee

EmpName Dept T

Ron     Ship  [1,5]
George  Ship  [5,9]
Ron     Mail  [6,10]

Manages

Dept MgrName T

Load  Ed   [3,8]
Ship  Jim  [7,15]

Tuples in the relations represent facts about the modeled reality. For example, the first tuple in the Employee relation represents the fact that Ron worked for the Shipping department from time 1 to time 5, inclusive. Notice that none of the attributes, including the timestamp attributes T, are set-valued – the relation schemas are in 1NF. □

2.2 Cartesian product

The temporal Cartesian product is a conventional Cartesian product with a predicate on the timestamp attributes. To define it, we need two auxiliary definitions.

First, intersect(U, V), where U and V are intervals, returns TRUE if there exists a chronon t such that t ∈ U ∧ t ∈ V. Second, overlap(U, V) returns the maximum interval contained in its two argument intervals. If no nonempty interval exists, the function returns ∅. To state this more precisely, let first and last return the smallest and largest of two argument chronons, respectively. Also, let Us and Ue denote, respectively, the starting and ending chronons of U, and similarly for V.

overlap(U, V) =
    [last(Us, Vs), first(Ue, Ve)]   if last(Us, Vs) ≤ first(Ue, Ve)
    ∅                               otherwise
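To make these auxiliary functions concrete, here is a minimal Python sketch, assuming intervals are represented as pairs of inclusive integer chronons; the type alias Interval and the function bodies are ours, only the names mirror the definitions above.

from typing import Optional, Tuple

Interval = Tuple[int, int]  # inclusive [start, end] chronons

def intersect(u: Interval, v: Interval) -> bool:
    # TRUE iff some chronon t lies in both u and v
    return max(u[0], v[0]) <= min(u[1], v[1])

def overlap(u: Interval, v: Interval) -> Optional[Interval]:
    # Maximum interval contained in both arguments; None stands for the empty interval
    start = max(u[0], v[0])  # last(Us, Vs)
    end = min(u[1], v[1])    # first(Ue, Ve)
    return (start, end) if start <= end else None

For instance, overlap((1, 5), (3, 8)) yields (3, 5), the timestamp of the first result tuple in Example 2 below.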

Definition 1 The temporal Cartesian product, r ×T s, of two temporal relations r and s is defined as follows.

r ×T s = { z^(n+m+2) | ∃x ∈ r ∃y ∈ s (z[A] = x[A] ∧ z[B] = y[B] ∧
           z[T] = overlap(x[T], y[T]) ∧ z[T] ≠ ∅) }

The second line of the definition sets the explicit attribute values of the result tuple z to the concatenation of the explicit attribute values of x and y. The third line computes the timestamp of z and ensures that it is nonempty. □

Example 2 Consider the query "Show the names of employees and managers where the employee worked for the company while the manager managed some department in the company." This can be satisfied using the temporal Cartesian product.

Employee ×T Manages

EmpName Dept Dept MgrName T

Ron     Ship  Load  Ed   [3,5]
George  Ship  Load  Ed   [5,8]
George  Ship  Ship  Jim  [7,9]

Ron     Mail  Load  Ed   [6,8]
Ron     Mail  Ship  Jim  [7,10]

□

The overlap function is necessary and sufficient to ensure snapshot reducibility, as will be discussed in detail in Sect. 2.7. Basically, we want the temporal Cartesian product to act as though it is a conventional Cartesian product applied independently at each point in time. When operating on interval-stamped data, this semantics corresponds to an intersection: the result will be valid during those times when contributing tuples from both input relations are valid.
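Definition 1 translates directly into a tuple-at-a-time computation. The following Python sketch is illustrative only and assumes a relation is a list of (attrs, (ts, te)) pairs with inclusive chronon timestamps; the function name temporal_cartesian_product and the data layout are not from the paper.

def temporal_cartesian_product(r, s):
    # r, s: lists of (attrs, (ts, te)), with attrs a tuple of explicit attribute values
    result = []
    for x_attrs, (xs, xe) in r:
        for y_attrs, (ys, ye) in s:
            ts, te = max(xs, ys), min(xe, ye)  # overlap of the two timestamps
            if ts <= te:                       # keep only nonempty intersections
                result.append((x_attrs + y_attrs, (ts, te)))
    return result

# The relations of Example 1:
employee = [(("Ron", "Ship"), (1, 5)), (("George", "Ship"), (5, 9)), (("Ron", "Mail"), (6, 10))]
manages = [(("Load", "Ed"), (3, 8)), (("Ship", "Jim"), (7, 15))]
# temporal_cartesian_product(employee, manages) produces the five tuples shown in Example 2.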

The temporal Cartesian product was first defined by Segev and Gunadhi [20,47]. This operator was termed the time join, and the abbreviation T-join was used. Clifford and Croker [7] defined a Cartesian product operator that is a combination of the temporal Cartesian product and the temporal outerjoin, to be defined shortly. Interval join is a building block of the (spatial) rectangle join [2]. The interval join is a one-dimensional spatial join that can thus be used to implement the temporal Cartesian product.

2.3 Theta join

Like the conventional theta join, the temporal theta join supports an unrestricted predicate P on the explicit attributes of its input arguments. The temporal theta join, r ⋈T_P s, of two relations r and s selects those tuples from r ×T s that satisfy predicate P(r[A], s[B]). Let σ denote the standard selection operator.

Definition 2 The temporal theta join, r ⋈T_P s, of two temporal relations r and s is defined as follows.

r ⋈T_P s = σ_P(r[A],s[B])(r ×T s)  □

A form of this operator, the Θ-JOIN, was proposed by Clifford and Croker [6]. This operator was later extended to allow computations more general than overlap on the timestamps of result tuples [53].

2.4 Equijoin

Like snapshot equijoin, the temporal equijoin operator enforces equality matching among specified subsets of the explicit attributes of the input relations.

Definition 3 The temporal equijoin on two temporal relations r and s on attributes A′ ⊆ A and B′ ⊆ B is defined as the theta join with predicate P ≡ r[A′] = s[B′]:

r ⋈T_{r[A′]=s[B′]} s  □

Like the temporal theta join, the temporal equijoin was first defined by Clifford and Croker [6]. A specialized operator, the TE-join, was developed independently by Segev and Gunadhi [47]. The TE-join required the explicit join attribute to be a surrogate attribute of both input relations. Essentially, a surrogate attribute would be a key attribute of a corresponding nontemporal schema. In a temporal context, a surrogate attribute value represents a time-invariant object identifier. If we augment schemas R and S with surrogate attributes ID, then the TE-join can be expressed using the temporal equijoin as follows.

r TE-join s ≡ r ⋈T_{r[ID]=s[ID]} s

The temporal equijoin was also generalized by Zhang et al. to yield the generalized TE-join, termed the GTE-join, which specifies that the joined tuples must have their keys in a specified range while their intervals should intersect a specified interval [56]. The objective was to focus on tuples within interesting rectangles in the key-time space.
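Because the temporal theta join and the temporal equijoin are selections over r ×T s, a sketch needs only an extra predicate over the explicit attributes. The following hedged Python fragment reuses the illustrative tuple layout assumed earlier; temporal_theta_join and temporal_equijoin are our names, not operators from the cited work.

def temporal_theta_join(r, s, pred):
    # pred plays the role of the predicate P over the explicit attributes
    result = []
    for x_attrs, (xs, xe) in r:
        for y_attrs, (ys, ye) in s:
            ts, te = max(xs, ys), min(xe, ye)
            if ts <= te and pred(x_attrs, y_attrs):
                result.append((x_attrs + y_attrs, (ts, te)))
    return result

def temporal_equijoin(r, s, r_pos, s_pos):
    # Equality on the attribute positions playing the roles of A' and B'
    return temporal_theta_join(
        r, s, lambda xa, ya: all(xa[i] == ya[j] for i, j in zip(r_pos, s_pos)))

# A TE-join-style call, assuming the surrogate is stored at position 0 of both relations:
# temporal_equijoin(r, s, r_pos=[0], s_pos=[0])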

2.5 Natural join

The temporal natural join and the temporal equijoin bear the same relationship to one another as their snapshot counterparts. That is, the temporal natural join is simply a temporal equijoin on identically named explicit attributes followed by a subsequent projection operation.

To define this join, we augment our relation schemas with explicit join attributes, Ci, 1 ≤ i ≤ k, which we abbreviate by C.

R = (A1, ..., An, C1, ..., Ck, Ts, Te)
S = (B1, ..., Bm, C1, ..., Ck, Ts, Te)

Definition 4 The temporal natural join of r and s, r ⋈T s, is defined as follows.


r ⋈T s = { z^(n+m+k+2) | ∃x ∈ r ∃y ∈ s (x[C] = y[C] ∧
           z[A] = x[A] ∧ z[B] = y[B] ∧ z[C] = y[C] ∧
           z[T] = overlap(x[T], y[T]) ∧ z[T] ≠ ∅) }

The first two lines ensure that tuples x and y agree on the values of the join attributes C and set the explicit attributes of the result tuple z to the concatenation of the nonjoin attributes A and B and a single copy of the join attributes, C. The third line computes the timestamp of z as the overlap of the timestamps of x and y and ensures that x[T] and y[T] actually overlap. □

This operator was first defined by Clifford and Croker [6], who named it the natural time join. We showed in earlier work that the temporal natural join plays the same important role in reconstructing normalized temporal relations as the snapshot natural join for normalized snapshot relations [25]. Most previous work in temporal join evaluation has addressed, either implicitly or explicitly, the implementation of the temporal natural join or the closely related temporal equijoin.

2.6 Outerjoins and outer Cartesian products

Like the snapshot outerjoin, temporal outerjoins and Cartesian products retain dangling tuples, i.e., tuples that do not participate in the join. However, in a temporal database, a tuple may dangle over a portion of its time interval and be covered over others; this situation must be accounted for in a temporal outerjoin or Cartesian product.

We may define the temporal outerjoin as the union of two subjoins, like the snapshot outerjoin. The two subjoins are the temporal left outerjoin and the temporal right outerjoin. As the left and right outerjoins are symmetric, we define only the left outerjoin.

We need two auxiliary functions. The coalesce function collapses value-equivalent tuples – tuples with mutually equal nontimestamp attribute values [23] – in a temporal relation into a single tuple with the same nontimestamp attribute values and a timestamp that is the finite union of intervals that precisely contains the chronons in the timestamps of the value-equivalent tuples. (A finite union of time intervals is termed a temporal element [15], which we represent in this paper as a set of chronons.) The definition of coalesce uses the function chronons that returns the set of chronons contained in the argument interval.

coalesce(r) = { z^(n+1) |
    ∃x ∈ r (z[A] = x[A] ∧ chronons(x[T]) ⊆ z[T] ∧
        ∀x′ ∈ r (x[A] = x′[A] ⇒ (chronons(x′[T]) ⊆ z[T]))) ∧
    ∀t ∈ z[T] ∃x′′ ∈ r (z[A] = x′′[A] ∧ t ∈ chronons(x′′[T])) }

The second and third lines of the definition coalesce all value-equivalent tuples in relation r. The last line ensures that no spurious chronons are generated.

We now define a function expand that returns the set of maximal intervals contained in an argument temporal element, T.

expand(T) = { [ts, te] |
    ts ∈ T ∧ te ∈ T ∧ ∀t ∈ chronons([ts, te]) (t ∈ T) ∧
    ¬∃t′s ∈ T (t′s < ts ∧ ∀t (t′s < t < ts ⇒ t ∈ T)) ∧
    ¬∃t′e ∈ T (t′e > te ∧ ∀t (te < t < t′e ⇒ t ∈ T)) }

The second line ensures that a member of the result is an interval contained in T. The last two lines ensure that the interval is indeed maximal.
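Both auxiliary functions are easy to state operationally over temporal elements represented as sets of chronons. The Python sketch below follows the definitions under that representation; the concrete encoding (intervals as inclusive integer pairs, temporal elements as Python sets) is an assumption of ours.

def chronons(interval):
    ts, te = interval
    return set(range(ts, te + 1))

def coalesce(r):
    # Collapse value-equivalent tuples into (attrs, temporal element) pairs
    merged = {}
    for attrs, interval in r:
        merged.setdefault(attrs, set()).update(chronons(interval))
    return list(merged.items())

def expand(temporal_element):
    # Return the maximal intervals contained in a set of chronons
    intervals, run = [], []
    for t in sorted(temporal_element):
        if run and t != run[-1] + 1:
            intervals.append((run[0], run[-1]))
            run = []
        run.append(t)
    if run:
        intervals.append((run[0], run[-1]))
    return intervals

# expand({1, 2, 3, 6, 7}) == [(1, 3), (6, 7)]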

We are now ready to define the temporal left outerjoin. Let R and S be defined as for the temporal equijoin. We use A′ ⊆ A and B′ ⊆ B as the explicit join attributes.

Definition 5 The temporal left outerjoin, r ⟕T_{r[A′]=s[B′]} s, of two temporal relations r and s is defined as follows.

r ⟕T_{r[A′]=s[B′]} s = { z^(n+m+2) |
    ∃x ∈ coalesce(r) ∃y ∈ coalesce(s)
        (x[A′] = y[B′] ∧ z[A] = x[A] ∧ z[T] ≠ ∅ ∧
         ((z[B] = y[B] ∧ z[T] ∈ expand(x[T] ∩ y[T])) ∨
          (z[B] = null ∧ z[T] ∈ expand(x[T] − y[T])))) ∨
    ∃x ∈ coalesce(r) ∀y ∈ coalesce(s)
        (x[A′] ≠ y[B′] ⇒ z[A] = x[A] ∧ z[B] = null ∧
         z[T] ∈ expand(x[T]) ∧ z[T] ≠ ∅) }

The first five lines of the definition handle the case where, for a tuple x deriving from the left argument, a tuple y with matching explicit join attribute values is found. For those time intervals of x that are not shared with y, we generate tuples with null values in the attributes of y. The final three lines of the definition handle the case where no matching tuple y is found. Tuples with null values in the attributes of y are generated. □

The temporal outerjoin may be defined as simply the union of the temporal left and the temporal right outerjoins (the union operator eliminates the duplicate equijoin tuples). Similarly, a temporal outer Cartesian product is a temporal outerjoin without the equijoin condition (A′ = B′ = ∅).
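The tuples with null values in Definition 5 arise from the portion of an outer tuple's timestamp that is not covered by any matching inner tuple, i.e., expand(x[T] − y[T]). A small hedged sketch of that step, reusing the chronon-set representation assumed above (the helper name uncovered_intervals is ours):

def uncovered_intervals(outer_interval, matching_intervals):
    # Chronons of the outer timestamp not overlapped by any matching inner tuple
    remaining = set(range(outer_interval[0], outer_interval[1] + 1))
    for ms, me in matching_intervals:
        remaining -= set(range(ms, me + 1))
    # Expand the leftover chronons into maximal intervals (as in expand above)
    intervals, run = [], []
    for t in sorted(remaining):
        if run and t != run[-1] + 1:
            intervals.append((run[0], run[-1]))
            run = []
        run.append(t)
    if run:
        intervals.append((run[0], run[-1]))
    return intervals

# uncovered_intervals((1, 10), [(3, 5), (9, 12)]) == [(1, 2), (6, 8)]

Each interval returned this way yields one outerjoin tuple whose inner attributes are null.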

Gunadhi and Segev were the first researchers to investigate outerjoins over time. They defined a specialized version of the temporal outerjoin called the EVENT JOIN [47]. This operator, of which the temporal left and right outerjoins were components, used a surrogate attribute as its explicit join attribute. This definition was later extended to allow any attributes to serve as the explicit join attributes [53]. A specialized version of the left and right outerjoins called the TE-outerjoin was also defined. The TE-outerjoin incorporated the TE-join, i.e., temporal equijoin, as a component.

Clifford and Croker [7] defined a temporal outer Cartesian product, which they termed simply Cartesian product.

2.7 Reducibility

We proceed to show how the temporal operators reduce to snapshot operators. Reducibility guarantees that the semantics of the snapshot operator is preserved in its more complex temporal counterpart.

For example, the semantics of the temporal natural join reduces to the semantics of the snapshot natural join in that the result of first joining two temporal relations and then transforming the result to a snapshot relation yields a result that is the same as that obtained by first transforming the arguments to snapshot relations and then joining the snapshot relations. This commutativity diagram is shown in Fig. 1 and stated formally in the first equality of the following theorem.


[Figure 1 shows a commutativity diagram: the temporal relations r and r′ are mapped by the timeslice operator τT_t to the snapshot relations τT_t(r) and τT_t(r′); joining with ⋈T and then timeslicing yields the same result as timeslicing first and then applying the snapshot natural join, i.e., τT_t(r ⋈T r′) = τT_t(r) ⋈ τT_t(r′).]

Fig. 1. Reducibility of temporal natural join to snapshot natural join

The timeslice operation τT takes a temporal relation r as argument and a chronon t as parameter. It returns the corresponding snapshot relation, i.e., with the schema of r but without the timestamp attributes, that contains (the nontimestamp portion of) all tuples x from r for which t belongs to x[T]. It follows from the theorem below that the temporal joins defined here reduce to their snapshot counterparts.
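Operationally, a timeslice simply filters on interval containment and projects away the timestamp. A minimal Python sketch, under the illustrative tuple layout used earlier:

def timeslice(r, t):
    # tau^T_t: keep the explicit attributes of every tuple whose interval contains chronon t
    return {attrs for attrs, (ts, te) in r if ts <= t <= te}

# With the Example 1 data, timeslice(employee, 5) == {("Ron", "Ship"), ("George", "Ship")}

Reducibility can then be checked experimentally: timeslicing the output of a temporal join at chronon t gives the same snapshot relation as joining the two timeslices.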

Theorem 1 Let t denote a chronon and let r and s be relation instances of the proper types for the operators they are applied to. Then the following hold for all t.

τT_t(r ⋈T s) = τT_t(r) ⋈ τT_t(s)
τT_t(r ×T s) = τT_t(r) × τT_t(s)
τT_t(r ⋈T_P s) = τT_t(r) ⋈_P τT_t(s)
τT_t(r ⟗T s) = τT_t(r) ⟗ τT_t(s)
τT_t(r ×T_o s) = τT_t(r) ×_o τT_t(s)

Here ⟗T denotes the temporal outerjoin and ×T_o the temporal outer Cartesian product.

Proof: An equivalence is shown by proving its two inclusions separately. The nontimestamp attributes of r and s are AC and BC, respectively, where A, B, and C are sets of attributes and C denotes the join attribute(s) (cf. the definition of temporal natural join). We prove one inclusion of the first equivalence, that is, τT_t(r ⋈T s) ⊆ τT_t(r) ⋈ τT_t(s). The remaining proofs are similar in style.

Let x′′ ∈ τT_t(r ⋈T s) (the left-hand side of the equivalence to be proved). Then there is a tuple x′ ∈ r ⋈T s such that x′[ABC] = x′′ and t ∈ x′[T]. By the definition of ⋈T, there exist tuples x1 ∈ r and x2 ∈ s such that x1[C] = x2[C] = x′[C], x1[A] = x′[A], x2[B] = x′[B], and x′[T] = overlap(x1[T], x2[T]). By the definition of τT_t, there exist a tuple x′1 ∈ τT_t(r) such that x′1 = x1[AC] = x′[AC] and a tuple x′2 ∈ τT_t(s) such that x′2 = x2[BC] = x′[BC]. Then there exists x′′12 ∈ τT_t(r) ⋈ τT_t(s) (the right-hand side of the equivalence) such that x′′12[AC] = x′1 and x′′12[B] = x′2[B]. By construction, x′′12 = x′′. This proves the ⊆ inclusion. □

2.8 Summary

We have defined a taxonomy for temporal join operators. The taxonomy was constructed as a natural extension of corresponding snapshot database operators. We also briefly described how previously defined temporal operators are accommodated in the taxonomy.

Table 1 summarizes how previous work is represented in the taxonomy. For each operator defined in previous work, the table lists the defining publication, researchers, the corresponding taxonomy operator, and any restrictions assumed by the original operators. In early work, Clifford [8] indicated that an INTERSECTION JOIN should be defined that represents the categorized nonouter joins and Cartesian products, and he proposed that a UNION JOIN be defined for the outer variants.

3 Evaluation algorithms

In the previous section, we described the semantics of all previously proposed temporal join operators. We now turn our attention to implementation algorithms for these operators. As before, our purpose is to enumerate the space of algorithms applicable to the temporal join operators, thereby providing a consistent framework within which existing temporal join evaluation algorithms can be placed.

Our approach is to extend well-understood paradigms from conventional query evaluation to temporal databases. Algorithms for temporal join evaluation are necessarily more complex than their snapshot counterparts. Whereas snapshot evaluation algorithms match input tuples based on their explicit join attributes, temporal join evaluation algorithms typically must additionally ensure that temporal restrictions are met. Furthermore, this problem is exacerbated in two ways. Timestamps are typically complex data types, e.g., intervals requiring inequality predicates, which conventional query processors are not optimized to handle. Also, a temporal database is usually larger than a corresponding snapshot database due to the versioning of tuples.

We consider non-index-based algorithms. Index-based algorithms use an auxiliary access path, i.e., a data structure that identifies tuples or their locations using a join attribute value. Non-index-based algorithms do not employ auxiliary access paths. While some attention has been focused on index-based temporal join algorithms, the large number of temporal indexes that have been proposed in the literature [44] precludes a thorough investigation in this paper.

We first provide a taxonomy of temporal join algorithms. This taxonomy, like the operator taxonomy of Table 1, is based on well-established relational concepts. Sections 3.2 and 3.3 describe the algorithms in the taxonomy and place existing work within the given framework. Finally, conclusions are offered in Sect. 3.4.

3.1 Evaluation taxonomy

All binary relational query evaluation algorithms, including those computing conventional joins, are derived from four basic paradigms: nested-loop, partitioning, sort-merge, and index-based [18].


Table 1. Temporal join operators

Operator Initial citation Taxonomy operator Restrictions

Cartesian product        [7]   Outer Cartesian product   None
EQUIJOIN                 [6]   Equijoin                  None
GTE-join                 [56]  Equijoin                  2, 3
INTERVAL JOIN            [2]   Cartesian product         None
NATURAL JOIN             [6]   Natural join              None
TIME JOIN                [6]   Cartesian product         1
T-join                   [20]  Cartesian product         None
TE-JOIN                  [47]  Equijoin                  2
TE-OUTERJOIN             [47]  Left outerjoin            2
EVENT JOIN               [47]  Outerjoin                 2
Θ-JOIN                   [6]   Theta join                None
Valid-time theta join    [53]  Theta join                None
Valid-time left join     [53]  Left outerjoin            None

Restrictions:
1 = restricts also the valid time of the result tuples
2 = matching only on surrogate attributes
3 = includes also intersection predicates with an argument surrogate range and a time range


Partition-based join evaluation divides the input tuples into buckets using the join attributes of the input relations as key values. Corresponding buckets of the input relations contain all tuples that could possibly match with one another, and the buckets are constructed to best utilize the available main memory buffer space. The result is produced by performing an in-memory join of each pair of corresponding buckets from the input relations.
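A hedged sketch of these two phases, with in-memory lists standing in for disk-resident buckets; key(tuple) is assumed to extract the explicit join attribute value, and join_bucket is any in-memory join of two corresponding buckets (all names are ours):

def partition_join(r, s, key, num_buckets, join_bucket):
    # Phase 1: route each tuple to a bucket by hashing its join attribute value
    r_buckets = [[] for _ in range(num_buckets)]
    s_buckets = [[] for _ in range(num_buckets)]
    for x in r:
        r_buckets[hash(key(x)) % num_buckets].append(x)
    for y in s:
        s_buckets[hash(key(y)) % num_buckets].append(y)
    # Phase 2: only corresponding buckets can contain matching tuples
    result = []
    for rb, sb in zip(r_buckets, s_buckets):
        result.extend(join_bucket(rb, sb))
    return result

For a temporal natural join, join_bucket would perform an exhaustive nested-loop matching of the two memory-resident buckets, including the overlap test on timestamps (cf. the algorithms in Sect. 3.2).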

Sort-merge join evaluation also divides the input relation but uses physical memory loads as the units of division. The memory loads are sorted, producing sorted runs, and written to disk. The result is produced by merging the sorted runs, where qualifying tuples are matched and output tuples generated.

Index-based join evaluation utilizes indexes defined on the join attributes of the input relations to locate joining tuples efficiently. The index could be preexisting or built on the fly. Elmasri et al. presented a temporal join algorithm that utilizes a two-level time index, which used a B+-tree to index the explicit attribute in the upper level, with the leaves referencing other B+-trees indexing time points [13]. Son and Elmasri revised the time index to require less space and used this modified index to determine the partitioning intervals in a partition-based timestamp algorithm [52]. Bercken and Seeger proposed several temporal join algorithms based on a multiversion B+-tree (MVBT) [4]. Later Zhang et al. described several algorithms based on B+-trees, R∗-trees [3], and the MVBT for the related GTE-join. This operation requires that joined tuples have key values that belong to a specified range and have time intervals that intersect a specified interval [56]. The MVBT assumes that updates arrive in increasing time order, which is not the case for valid-time data. We focus on non-index-based join algorithms that apply to both valid-time and transaction-time relations, and we do not discuss these index-based joins further.

We adapt the basic non-index-based algorithms (nested-loop, partitioning, and sort-merge) to support temporal joins. To enumerate the space of temporal join algorithms, we exploit the duality of partitioning and sort-merge [19]. In particular, the division step of partitioning, where tuples are separated based on key values, is analogous to the merging step of sort-merge, where tuples are matched based on key values. In the following, we consider the characteristics of sort-merge algorithms and apply duality to derive corresponding characteristics of partition-based algorithms.

For a conventional relation, sort-based join algorithms order the input relations on their explicit join attributes. For a temporal relation, which includes timestamp attributes in addition to explicit attributes, there are four possibilities for ordering the relation. First, the relation can be sorted by the explicit attributes exclusively. Second, the relation can be ordered by time, using either the starting or ending timestamp [29,46]. The choice of starting or ending timestamp dictates an ascending or descending sort order, respectively. Third, the relation can be ordered primarily by the explicit attributes and secondarily by time [36]. Finally, the relation can be ordered primarily by time and secondarily by the explicit attributes.

By duality, the division step of partition-based algorithms can partition using any of these options [29,46]. Hence four choices exist for the dual steps of merging in sort-merge or partitioning in partition-based methods.
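In code, the four orderings correspond to four sort keys over a tuple (attrs, (ts, te)); the Python key functions below are illustrative only:

explicit_only      = lambda tup: tup[0]               # explicit attributes only
timestamp_only     = lambda tup: tup[1][0]            # time only (ascending start; use tup[1][1] with reverse=True for descending end)
explicit_then_time = lambda tup: (tup[0], tup[1][0])  # explicit attributes, then time
time_then_explicit = lambda tup: (tup[1][0], tup[0])  # time, then explicit attributes

# e.g., sorted(relation, key=explicit_then_time) orders tuples primarily by their
# explicit attributes and secondarily by their starting chronon.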

We use this distinction to categorize the different approaches to temporal join evaluation. The first approach above, using the explicit attributes as the primary matching attributes, we term explicit algorithms. Similarly, we term the second approach timestamp algorithms. We retain the generic term temporal algorithm to mean any algorithm to evaluate a temporal operator.

Finally, it has been recognized that the choice of buffer allocation strategy, GRACE or hybrid [9], is independent of whether a sort-based or partition-based approach is used [18]. Hybrid policies retain most of the last run of the outer relation in main memory and so minimize the flushing of intermediate buffers to disk, thereby potentially decreasing the I/O cost.

Figure 2 lists the choices of sort-merge vs. partitioning, the possible sorting/partitioning attributes, and the possible buffer allocation strategies.


{ Sort-merge, Partitioning } × { Explicit, Timestamp, Explicit/timestamp, Timestamp/explicit } × { GRACE, Hybrid }

Fig. 2. Space of possible evaluation algorithms

Combining all possibilities yields 16 possible evaluation algorithms. Including the basic nested-loop algorithm and GRACE and hybrid variants of the sort-based interval join mentioned in Sect. 2.2 results in a total of 19 possible algorithms. The 19 algorithms are named and described in Table 2.

We noted previously that time intervals lack a natural order. From this point of view, spatial join is similar because there is no natural order preserving spatial closeness. Previous work on spatial join may be categorized into three approaches. Early work [37,38] used a transformation approach based on space-filling curves, performing a sort-merge join along the curve to solve the join problem. Most of the work falls in the index-based approaches, utilizing spatial index structures such as the R-tree [21], R+-tree [48], R∗-tree [3], Quad-tree [45], or seeded tree [31]. While some algorithms use preexisting indexes, others build the indexes on the fly.

In recent years, some work has focused on non-index-based spatial join approaches. Two partition-based spatial join algorithms have been proposed. One of them [32] partitions the input relations into overlapping buckets and uses an indexed nested-loop join to perform the join within each bucket. The other [40] partitions the input relations into disjoint partitions and uses a computational-geometry-based plane-sweep algorithm that can be thought of as the spatial equivalent of the sort-merge algorithm. Arge et al. [2] introduced a highly optimized implementation of the sweeping-based algorithm that first sorts the data along the vertical axis and then partitions the input into a number of vertical strips. Data in each strip can then be joined by an internal plane-sweep algorithm. All the above non-index-based spatial join algorithms use a sort- or partition-based approach or combine these two approaches in one algorithm, which is the approach we adopt in some of our temporal join algorithms (Sect. 4.3.2).

In the next two sections, we examine the space of explicit algorithms and timestamp algorithms, respectively, and classify existing approaches using the taxonomy developed in this section. We will see that most previous work in temporal join evaluation has centered on timestamp algorithms. However, for expository purposes, we first examine those algorithms based on manipulation of the nontimestamp columns, which we term "explicit" algorithms.

3.2 Explicit algorithms

Previous work has largely ignored the fact that conventional query evaluation algorithms can be easily modified to evaluate temporal joins. In this section, we show how the three paradigms of query evaluation can support temporal join evaluation. To make the discussion concrete, we develop an algorithm to evaluate the valid-time natural join, defined in Sect. 2, for each of the three paradigms. We begin with the simplest paradigm, nested-loop evaluation.

explicitNestedLoop(r, s):
    result ← ∅;
    for each block br ∈ r
        read(br);
        for each block bs ∈ s
            read(bs);
            for each tuple x ∈ br
                for each tuple y ∈ bs
                    if x[C] = y[C] and overlap(x[T], y[T]) ≠ ∅
                        z[A] ← x[A];
                        z[B] ← y[B];
                        z[C] ← x[C];
                        z[T] ← overlap(x[T], y[T]);
                        result ← result ∪ {z};
    return result;

Fig. 3. Algorithm explicitNestedLoop

3.2.1 Nested-loop-based algorithms

Nested-loop join algorithms match tuples by exhaustively comparing pairs of tuples from the input relations. As an I/O optimization, blocks of the input relations are read into memory, with comparisons performed between all tuples in the input blocks. The size of the input blocks is constrained by the available main memory buffer space.

The algorithm operates as follows. One relation is designated the outer relation, the other the inner relation [35,18]. The outer relation is scanned once. For each block of the outer relation, the inner relation is scanned. When a block of the inner relation is read into memory, the tuples in that "inner block" are joined with the tuples in the "outer block."

The temporal nested-loop join is easily constructed from this basic algorithm. All that is required is that the timestamp predicate be evaluated at the same time as the predicate on the explicit attributes. Figure 3 shows the temporal algorithm. (In the figure, r is the outer relation and s is the inner relation. We assume their schemas are as defined in Sect. 2.)

While conceptually simple, nested-loop-based evaluation is often not competitive due to its quadratic cost. We now describe temporal variants of the sort-merge and partition-based algorithms, which usually exhibit better performance.

3.2.2 Sort-merge-based algorithms

Sort-merge join algorithms consist of two phases. In the first phase, the input relations r and s are sorted by their join attributes. In the second phase, the result is produced by simultaneously scanning r and s, merging tuples with identical values for their join attributes.

Complications arise if the join attributes are not key attributes of the input relations. In this case, multiple tuples in r and in s may have identical join attribute values. Hence a given r tuple may join with many s tuples, and vice versa. (This is termed skew [30].)

As before, we designate one relation as the outer relation and the other as the inner relation.


Table 2. Algorithm taxonomy

Algorithm Name Description

Explicit sort                           ES      GRACE sort-merge by explicit attributes
Hybrid explicit sort                    ES-H    Hybrid sort-merge by explicit attributes
Timestamp sort                          TS      GRACE sort-merge by timestamps
Hybrid timestamp sort                   TS-H    Hybrid sort-merge by timestamps
Explicit/timestamp sort                 ETS     GRACE sort-merge by explicit attributes/time
Hybrid explicit/timestamp sort          ETS-H   Hybrid sort-merge by explicit attributes/time
Timestamp/explicit sort                 TES     GRACE sort-merge by time/explicit attributes
Hybrid timestamp/explicit sort          TES-H   Hybrid sort-merge by time/explicit attributes
Interval join                           TSI     GRACE sort-merge by timestamps
Hybrid interval join                    TSI-H   Hybrid sort-merge by timestamps
Explicit partitioning                   EP      GRACE partitioning by explicit attributes
Hybrid explicit partitioning            EP-H    Hybrid partitioning by explicit attributes
Timestamp partitioning                  TP      Range partition by time
Hybrid timestamp partitioning           TP-H    Hybrid range partitioning by time
Explicit/timestamp partitioning         ETP     GRACE partitioning by explicit attributes/time
Hybrid explicit/timestamp partitioning  ETP-H   Hybrid partitioning by explicit attributes/time
Timestamp/explicit partitioning         TEP     GRACE partitioning by time/explicit attributes
Hybrid timestamp/explicit partitioning  TEP-H   Hybrid partitioning by time/explicit attributes
Nested-loop                             NL      Exhaustive matching

structure state
    integer current block;
    integer current tuple;
    integer first block;
    integer first tuple;
    block tuples;

Fig. 4. State structure for merge scanning

When consecutive tuples in the outer relation have identical values for their explicit join attributes, i.e., their nontimestamp join attributes, the scan of the inner relation is "backed up" to ensure that all possible matches are found. Prior to showing the explicitSortMerge algorithm, we define a suite of algorithms that manage the scans of the input relations. For each scan, we maintain the state structure shown in Fig. 4. The fields current block and current tuple together indicate the current tuple in the scan by recording the number of the current block and the index of the current tuple within that block. The fields first block and first tuple are used to record the state at the beginning of a scan of the inner relation in order to back up the scan later if needed. Finally, tuples stores the block of the relation currently in memory. For convenience, we treat the block as an array of tuples.

The initState algorithm shown in Fig. 5 initializes the state of a scan. Essentially, counters are set to guarantee that the first block read and the first tuple scanned are the first block and first tuple within that block in the input relation. We assume that a seek operation is available that repositions the file pointer associated with a relation to a given block number.

The advance algorithm advances the scan of the argument relation and state to the next tuple in the sorted relation. If the current block has been exhausted, then the next block of the relation is read. Otherwise, the state is updated to mark the next tuple in the current block as the next tuple in the scan.

initState(relation, state):
    state.current block ← 1;
    state.current tuple ← 0;
    state.first block ← ⊥;
    state.first tuple ← ⊥;
    seek(relation, state.current block);
    state.tuples ← read block(relation);

advance(relation, state):
    if (state.current tuple = MAX TUPLES)
        state.tuples ← read block(relation);
        state.current block ← state.current block + 1;
        state.current tuple ← 1;
    else
        state.current tuple ← state.current tuple + 1;

currentTuple(state):
    return state.tuples[state.current tuple]

backUp(relation, state):
    if (state.current block ≠ state.first block)
        state.current block ← state.first block;
        seek(relation, state.current block);
        state.tuples ← read block(relation);
    state.current tuple ← state.first tuple;

markScanStart(state):
    state.first block ← state.current block;
    state.first tuple ← state.current tuple;

Fig. 5. Merge algorithms


explicitSortMerge(r, s, C):
    r′ ← sort(r, C);
    s′ ← sort(s, C);
    initState(r′, outer state);
    initState(s′, inner state);
    x′[C] ← ⊥;
    result ← ∅;
    advance(s′, inner state);
    y ← currentTuple(inner state);
    for i ← 1 to |r′|
        advance(r′, outer state);
        x ← currentTuple(outer state);
        if x[C] = x′[C]
            backUp(s′, inner state);
            y ← currentTuple(inner state);
        x′[C] ← x[C];
        while (x[C] > y[C])
            advance(s′, inner state);
            y ← currentTuple(inner state);
        markScanStart(inner state);
        while (x[C] = y[C])
            if (overlap(x[T], y[T]) ≠ ∅)
                z[A] ← x[A]; z[B] ← y[B]; z[C] ← x[C];
                z[T] ← overlap(x[T], y[T]);
                result ← result ∪ {z};
            advance(s′, inner state);
            y ← currentTuple(inner state);
    return result;

Fig. 6. explicitSortMerge algorithm

The currentTuple algorithm merely returns the next tuple in the scan, as indicated by the scan state. Finally, the backUp and markScanStart algorithms manage the backing up of the inner relation scan. The backUp algorithm reverts the current block and tuple counters to their last values. These values are stored in the state at the beginning of a scan by the markScanStart algorithm.

We are now ready to exhibit the explicitSortMerge algorithm, shown in Fig. 6. The algorithm accepts three parameters, the input relations r and s and the join attributes C. We assume that the schemas of r and s are as given in Sect. 2. Tuples from the outer relation are scanned in order. For each outer tuple, if the tuple matches the previous outer tuple, the scan of the inner relation is backed up to the first matching inner tuple. The starting location of the scan is recorded in case backing up is needed by the next outer tuple, and the scan proceeds forward as normal. The complexity of the algorithm, as well as its performance degradation as compared with conventional sort-merge, is due largely to the bookkeeping required to back up the inner relation scan. We consider this performance hit in more detail in Sect. 4.2.2.

Segev and Gunadhi developed three algorithms based on explicit sorting, differing primarily by the code in the inner loop and by whether backup is necessary. Two of the algorithms, TEJ-1 and TEJ-2, support the temporal equijoin [46]; the remaining algorithm, EJ-1, evaluates the temporal outerjoin [46].

TEJ-1 is applicable if the equijoin condition is on the surrogate attributes of the input relations. The surrogate attributes are essentially key attributes of a corresponding snapshot schema. TEJ-1 assumes that the input relations are sorted primarily by their surrogate attributes and secondarily by their starting timestamps. The surrogate matching, sort-ordering, and 1TNF assumption described in Sect. 3.3.1 allow the result to be produced with a single scan of both input relations, with no backup.

The second equijoin algorithm, TEJ-2, is applicable when the equijoin condition involves any explicit attributes, surrogate or not. TEJ-2 assumes that the input relations are sorted primarily by their explicit join attribute(s) and secondarily by their starting timestamps. Note that since the join attribute can be a nonsurrogate attribute, tuples sharing the same join attribute value may overlap in valid time. Consequently, TEJ-2 requires the scan of the inner relation to be backed up in order to find all tuples with matching explicit attributes.

For the EVENT JOIN, Segev and Gunadhi described the sort-merge-based algorithm EJ-1. EJ-1 assumes that the input relations are sorted primarily by their surrogate attributes and secondarily by their starting timestamps. Like TEJ-1, the result is produced by a single scan of both input relations.

3.2.3 Partition-based algorithms

As in sort-merge-based algorithms, partition-based algorithms have two distinct phases. In the first phase, the input relations are partitioned based on their join attribute values. The partitioning is performed so that a given bucket produced from one input relation contains tuples that can only match with tuples contained in the corresponding bucket of the other input relation. Each produced bucket is also intended to fill the allotted main memory. Typically, a hash function is used as the partitioning agent. Both relations are filtered through the same hash function, producing two parallel sets of buckets. In the second phase, the join is computed by comparing tuples in corresponding buckets of the input relations. Partition-based algorithms have been shown to have superior performance when the relative sizes of the input relations differ [18].

A partitioning algorithm for the temporal natural join is shown in Fig. 7. The algorithm accepts as input two relations r and s and the names of the explicit join attributes C. We assume that the schemas of r and s are as given in Sect. 2.

As can be seen, the explicit partition-based join algorithm is conceptually very simple. One relation is designated the outer relation, the other the inner relation. After partitioning, each bucket of the outer relation is read in turn. For a given "outer bucket," each page of the corresponding "inner bucket" is read, and tuples in the buffers are joined.

The partitioning step in Fig. 7 is performed by the partition algorithm. This algorithm takes as its first argument an input relation. The resulting n partitions are returned in the remaining parameters. Algorithm partition assumes that a hash function hash is available that accepts the join attribute values x[C] as input and returns an integer, the index of the target bucket, as its result.


explicitPartitionJoin(r, s, C):
    result ← ∅;
    partition(r, r1, ..., rn);
    partition(s, s1, ..., sn);
    for i ← 1 to n
        outer bucket ← read partition(ri);
        for each page p ∈ si
            p ← read page(si);
            for each tuple x ∈ outer bucket
                for each tuple y ∈ p
                    if (x[C] = y[C] and overlap(x[T], y[T]) ≠ ∅)
                        z[A] ← x[A];
                        z[B] ← y[B];
                        z[C] ← x[C];
                        z[T] ← overlap(x[T], y[T]);
                        result ← result ∪ {z};
    return result;

partition(r, r1, ..., rn):
    for i ← 1 to n
        ri ← ∅;
    for each block b ∈ r
        read block(b);
        for each tuple x ∈ b
            i ← hash(x[C]);
            ri ← ri ∪ {x};

Fig. 7. Algorithms explicitPartitionJoin and partition

3.3 Timestamp algorithms

In contrast to the algorithms of the previous section, timestamp algorithms perform their primary matching on the timestamps associated with tuples.

In this section, we enumerate, to the best of our knowledge, all existing timestamp-based evaluation algorithms for the temporal join operators described in Sect. 2. Many of these algorithms assume sort ordering of the input by either their starting or ending timestamps. While such assumptions are valid for many applications, they are not valid in the general case, as valid-time semantics allows correction and deletion of previously stored data. (Of course, in such cases one could re-sort within the join.) As before, all of the algorithms described here are derived from nested loop, sort-merge, or partitioning; we do not consider index-based temporal joins.

3.3.1 Nested-loop-based timestamp algorithms

One timestamp nested-loop-based algorithm has been proposed for temporal join evaluation. Like the EJ-1 algorithm described in the previous section, Segev and Gunadhi developed their algorithm, EJ-2, for the EVENT JOIN [47,20] (Table 1).

EJ-2 does not assume any ordering of the input relations. It does assume that the explicit join attribute is a distinguished surrogate attribute and that the input relations are in Temporal First Normal Form (1TNF). Essentially, 1TNF ensures that tuples within a single relation that have the same surrogate value may not overlap in time.

EJ-2 simultaneously produces the natural join and left outerjoin in an initial phase and then computes the right outerjoin in a subsequent phase.

For the first phase, the inner relation is scanned once from front to back for each outer relation tuple. For a given outer relation tuple, the scan of the inner relation is terminated when the inner relation is exhausted or the outer tuple's timestamp has been completely overlapped by matching inner tuples. The outer tuple's natural join is produced as the scan progresses. The outer tuple's left outerjoin is produced by tracking the subintervals of the outer tuple's timestamp that are not overlapped by any inner tuples. An output tuple is produced for each subinterval remaining at the end of the scan. Note that the main memory buffer space must be allocated to contain the nonoverlapped subintervals of the outer tuple.

In the second phase, the roles of the inner and outer relations are reversed. Now, since the natural join was produced during the first phase, only the right outerjoin needs to be computed. The right outerjoin tuples are produced in the same manner as above, with one small optimization. If it is known that a tuple of the (current) outer relation did not join with any tuples during the first phase, then no scanning of the inner relation is required and the corresponding outerjoin tuple is produced immediately.
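The first phase of EJ-2 can be paraphrased in Python as follows. This is our own hedged reading of the prose above, not Segev and Gunadhi's code: relations are again lists of (attrs, (ts, te)), the surrogate is assumed to sit at position 0 of the explicit attributes, a chronon set stands in for the buffer of nonoverlapped subintervals, and a single placeholder value stands in for the nulled inner attributes.

def ej2_phase1(outer, inner, null=("NULL",)):
    joined, outer_dangling = [], []
    for x_attrs, (xs, xe) in outer:
        uncovered = set(range(xs, xe + 1))  # subintervals of x not yet overlapped
        for y_attrs, (ys, ye) in inner:
            ts, te = max(xs, ys), min(xe, ye)
            if ts <= te and x_attrs[0] == y_attrs[0]:  # surrogate match with overlap
                joined.append((x_attrs + y_attrs, (ts, te)))  # natural join output
                uncovered -= set(range(ts, te + 1))
                if not uncovered:  # outer timestamp completely overlapped: stop the scan
                    break
        # Left outerjoin output: one tuple per maximal uncovered subinterval
        for run_start, run_end in _maximal_intervals(uncovered):
            outer_dangling.append((x_attrs + null, (run_start, run_end)))
    return joined, outer_dangling

def _maximal_intervals(chronon_set):
    intervals, run = [], []
    for t in sorted(chronon_set):
        if run and t != run[-1] + 1:
            intervals.append((run[0], run[-1]))
            run = []
        run.append(t)
    if run:
        intervals.append((run[0], run[-1]))
    return intervals

The second phase would swap the roles of the two relations and keep only the dangling output.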

Incidentally, Zurek proposed several algorithms for evaluating temporal Cartesian product on multiprocessors based on nested loops [57].

3.3.2 Sort-merge-based timestamp algorithms

To date, four sets of researchers – Segev and Gunadhi, Leung and Muntz, Pfoser and Jensen, and Rana and Fotouhi – have developed timestamp sort-merge algorithms. Additionally, a one-dimensional spatial join algorithm proposed by Arge et al. can be used to implement a temporal Cartesian product.

Segev and Gunadhi modified the traditional merge-join algorithm to support the T-join and the temporal equijoin [47,20]. We describe the algorithms for each of these operators in turn.

For the T-join, the relations are sorted in ascending order of starting timestamp. The result is produced by a single scan of the input relations.

For the temporal equijoin, two timestamp sorting algorithms, named TEJ-3 and TEJ-4, are presented. Both TEJ-3 and TEJ-4 assume that their input relations are sorted by starting timestamp only. TEJ-4 is applicable only if the equijoin condition is on the surrogate attribute. In addition to assuming that the input relations are sorted by their starting timestamps, TEJ-4 assumes that all tuples with the same surrogate value are linked, thereby allowing all tuples with the same surrogate to be retrieved when the first is found. The result is produced with a linear scan of both relations, with random access needed to traverse surrogate chains.

Like TEJ-2, TEJ-3 is applicable for temporal equijoins on both the surrogate and explicit attribute values. TEJ-3 assumes that the input relations are sorted in ascending order of their starting timestamps, but no sort order is assumed on the explicit join attributes. Hence TEJ-3 requires that the inner relation scan be backed up should consecutive tuples in the outer relation have overlapping interval timestamps.
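The backup behavior can be illustrated with a small sketch (ours, not the published TEJ-3 code): both inputs are assumed to be in-memory lists of (value, (start, end)) tuples sorted by start time, and the inner cursor restarts from a saved mark for every outer tuple, re-reading inner tuples whenever consecutive outer intervals overlap.

```python
def temporal_equijoin_merge(outer, inner):
    """Timestamp sort-merge equijoin sketch with inner-scan backup.
    Both inputs are lists of (value, (start, end)) sorted by start time."""
    results = []
    mark = 0                                   # position the inner scan backs up to
    for v_o, (s_o, e_o) in outer:
        # inner tuples ending at or before s_o can never match later outer
        # tuples either, because outer start times are nondecreasing
        while mark < len(inner) and inner[mark][1][1] <= s_o:
            mark += 1
        j = mark                               # back up to the saved position
        while j < len(inner) and inner[j][1][0] < e_o:
            v_i, (s_i, e_i) = inner[j]
            if v_i == v_o and s_i < e_o and s_o < e_i:
                results.append((v_o, (max(s_o, s_i), min(e_o, e_i))))
            j += 1
    return results
```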

Leung and Muntz developed a series of algorithms based on the sort-merge algorithm to support temporal join predicates such as "contains" and "intersect" [1]. Although their algorithms do not explicitly support predicates on nontemporal attribute values, their techniques are easily modified to support more complex join operators such as the temporal equijoin. Like that of Segev and Gunadhi, this work describes evaluation algorithms appropriate for different sorting assumptions and access paths.

Leung and Muntz use a stream-processing approach. Abstractly, the input relations are considered as sequences of time-sorted tuples where only the tuples at the front of the streams may be read. The ordering of the tuples is a tradeoff with the amount of main memory needed to compute the join. For example, Leung and Muntz show how a contain join [1] can be computed if the input streams are sorted in ascending order of their starting timestamp. They summarize, for various sort orders on the starting and ending timestamps, what tuples must be retained in main memory during the join computation. A family of algorithms is developed assuming different orderings (ascending/descending) of the starting and ending timestamps.

Leung and Muntz also show how checkpoints, essentially the set of tuples valid during some chronon, can be used to evaluate temporal joins where the join predicate implies some overlap between the participating tuples. Here, the checkpoints actually contain tuple identifiers (TIDs) for the tuples valid during the specified chronon and the TIDs of the next tuples in the input streams. Suppose a checkpoint exists at time t. Using this checkpoint, the set of tuples participating in a join over a time interval containing t can be computed by using the cached TIDs and "rolling forward" using the TIDs of the next tuples in the streams.

Rana and Fotouhi proposed several techniques to improve the performance of time-join algorithms, which they characterized as using a nested-loop approach [43]. Since they assumed the input relations were sorted by start time and/or end time, those algorithms are more like the second phase of sort-merge-based timestamp algorithms. The algorithms are very similar to the sort-merge-based algorithms developed by Segev and Gunadhi.

Arge et al. described the interval join, a one-dimensional spatial join algorithm, which is a building block of a two-dimensional rectangle join [2]. Each interval is defined by a lower boundary and an upper boundary. The problem is to report all intersections between an interval in the outer relation and an interval in the inner relation. If the interval is a time interval instead of a spatial interval, this problem is equivalent to the temporal Cartesian product. They assumed the two input relations were first sorted by the algorithm into one list by their lower boundaries. The algorithm maintains two initially empty lists of tuples with "active" intervals, one for each input relation. When the sorted list is scanned, the current tuple is put into the active list of the relation it belongs to and joins only with the tuples in the active list of the other relation. Tuples becoming inactive during scanning are removed from the active list.
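The following is an in-memory rendering of that scan under the assumption of half-open intervals; the original algorithm processes the sorted list block by block and manages the active lists on secondary storage, which this sketch omits.

```python
def interval_join(r, s):
    """Interval join sketch: r and s are lists of (payload, start, end).
    Both inputs are merged into one list ordered by lower boundary, and each
    relation keeps a list of tuples whose intervals are still "active"."""
    tagged = [(start, end, 0, payload) for payload, start, end in r] + \
             [(start, end, 1, payload) for payload, start, end in s]
    tagged.sort(key=lambda t: t[0])
    active = ([], [])                      # active[0] for r, active[1] for s
    out = []
    for start, end, side, payload in tagged:
        other = 1 - side
        # tuples of the other relation that ended before `start` become inactive
        active[other][:] = [t for t in active[other] if t[1] > start]
        # every remaining active tuple of the other relation intersects this one
        for other_payload, _ in active[other]:
            out.append((payload, other_payload) if side == 0 else (other_payload, payload))
        active[side].append((payload, end))
    return out
```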

Most recently, Pfoser and Jensen [41] applied the sort-merge approach to the temporal theta join in a setting where each argument relation consists of a noncurrent and a current partition. Tuples in the former all have intervals that end before the current time, while all tuples of the latter have intervals that end at the current time. They assume that updates arrive in time order, so that tuples in noncurrent partitions are ordered by their interval end times and tuples in current partitions are ordered by their interval start times. A join then consists of three different kinds of subjoins. They develop two join algorithms for this setting and subsequently use these algorithms for incremental join computation.

As can be seen from the above discussion, a large number of timestamp-based sort-merge algorithms have been proposed, some for specific join operators. However, each of these proposals has been developed largely in isolation from other work, with little or no cross comparison. Furthermore, published performance figures have been derived mainly from analytical models rather than from empirical observations. An empirical comparison, as provided in Sect. 5, is needed to truly evaluate the different proposals.

3.3.3 Partition-based timestamp algorithms

Partitioning a relation over explicit attributes is relatively straightforward if the partitioning attributes have discrete values. Partitioning over time is more difficult since our timestamps are intervals, i.e., range data, rather than discrete values. Previous timestamp partitioning algorithms therefore developed various means of range partitioning the time intervals associated with tuples.

In previous work, we described a valid-time join algorithm using partitioning [54]. This algorithm was presented in the context of evaluating the valid-time natural join, though it is easily adapted to other temporal joins. The range partitioning used by this algorithm mapped tuples to singular buckets and dynamically migrated the tuples to other buckets as needed during the join computation. This approach avoided data redundancy, and associated I/O overhead, at the expense of more complex buffer management.

Sitzmann and Stuckey extended this algorithm by using histograms to decide the partition boundaries [49]. Their algorithm takes the number of long-lived tuples into consideration, which renders its performance insensitive to the number of long-lived tuples. However, it relies on a preexisting temporal histogram.

Lu et al. described another range-partitioning algorithm for computing temporal joins [33]. This algorithm is applicable to theta joins, where a result tuple is produced for each pair of input tuples with overlapping valid-time intervals. Their approach is to map intervals to a two-dimensional plane, which is then partitioned into regions. The join result is produced by computing the subjoins of pairs of partitions corresponding to adjacent regions in the plane. This method applies to a restricted temporal model where future time is not allowed. They utilize a spatial index to speed up the joining phase.


Table 3. Existing algorithms and taxonomy counterparts

Algorithm   Defined by             Taxonomy                  Assumptions
TEJ-1       Segev and Gunadhi      Explicit/timestamp sort   Surrogate attribute and 1TNF
TEJ-2       Segev and Gunadhi      Explicit/timestamp sort   None
EJ-2        Segev and Gunadhi      Nested-loop               Surrogate attribute and 1TNF
EJ-1        Segev and Gunadhi      Explicit/timestamp sort   Surrogate attribute and 1TNF
Time-join   Segev and Gunadhi      Timestamp sort            None
TEJ-3       Segev and Gunadhi      Timestamp sort            None
TEJ-4       Segev and Gunadhi      Timestamp sort            Surrogate attribute/access chain
Several     Leung and Muntz        Timestamp sort            None
Interval    Arge et al.            Timestamp sort            None
Two         Pfoser and Jensen      Timestamp sort            Partitioned relation; time-ordered updates
–           Soo et al.             Timestamp partition       None
–           Sitzmann and Stuckey   Timestamp partition       Requires preexisting temporal histogram
–           Lu et al.              Timestamp partition       Disallows future time; uses spatial index

3.4 Summary

We have surveyed temporal join algorithms and proposed a taxonomy of such algorithms. The taxonomy was developed by adapting well-established relational query evaluation paradigms to the temporal operations.

Table 3 summarizes how each temporal join operation proposed in previous work is classified in the taxonomy. We believe that the framework is complete since, disregarding data-model-specific considerations, all previous work naturally fits into one of the proposed categories.

One important property of an algorithm is whether it delivers a partial answer before the entire input is read. Among the algorithms listed in Table 3, only the nested-loop algorithm has this property. Partition-based algorithms have to scan the whole input relation to get the partitions. Similarly, sort-based algorithms have to read the entire input to sort the relation. We note, however, that it is possible to modify the temporal sort-based algorithms to be nonblocking, using the approach of progressive merge join [10].

4 Engineering the algorithms

As noted in the previous section, an adequate empirical investigation of the performance of temporal join algorithms has not been performed. We concentrate on the temporal equijoin, defined in Sect. 2.4. This join and the related temporal natural join are needed to reconstruct normalized temporal relations [25]. To perform a study of implementations of this join, we must first provide state-of-the-art implementations of the 19 different types of algorithms outlined for this join. In this section, we discuss our implementation choices.

4.1 Nested-loop algorithm

We implemented a simple block-oriented nested-loop algorithm. Each block of the outer relation is read in turn into memory. The outer block is sorted by the explicit joining attribute (actually, pointers are sorted to avoid copying of tuples). Each block of the inner relation is then brought into memory. For a given inner block, each tuple in that block is joined by binary searching the sorted outer tuples.
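A compact sketch of this block-oriented nested loop follows; it sorts the outer block's tuples directly rather than pointers and assumes half-open [start, end) intervals, so it illustrates the control flow rather than the exact buffer management.

```python
from bisect import bisect_left, bisect_right

def block_nested_loop_join(outer, inner, block_size):
    """Each outer block is sorted on the explicit attribute; every inner tuple
    probes the sorted block by binary search.  Tuples are (value, (start, end))."""
    out = []
    for i in range(0, len(outer), block_size):
        block = sorted(outer[i:i + block_size], key=lambda t: t[0])
        keys = [t[0] for t in block]
        for v_i, (s_i, e_i) in inner:
            lo, hi = bisect_left(keys, v_i), bisect_right(keys, v_i)
            for v_o, (s_o, e_o) in block[lo:hi]:
                if s_o < e_i and s_i < e_o:            # valid-time intervals overlap
                    out.append((v_o, (max(s_o, s_i), min(e_o, e_i))))
    return out
```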

This algorithm is simpler than the nested-loop algorithm, EJ-2, described in Sect. 3.3.1 [20,47]. In particular, our algorithm computes only the valid-time equijoin, while EJ-2 computes the valid-time outerjoin, which includes the equijoin in the form of the valid-time natural join. However, our algorithm supports a more general equijoin condition than EJ-2 in that we support matching on any explicit attribute rather than solely on a designated surrogate attribute.

4.2 Sort-merge-based algorithms

We were careful to use a high-performance sort-merge algorithm with the features covered next.

4.2.1 Combining last sort step with merge step

Sort-merge join uses a disk-based sorting phase that starts by generating many small, fully sorted runs and then repeatedly merges these into increasingly longer runs until a single run is obtained (this is done for the left-hand side and right-hand side independently). Each step of the sort phase reads and writes the entire relation. The merge phase then scans the fully sorted left-hand and right-hand relations to produce the output relation. A common optimization is to stop the sorting phase one step early, when there is a small number of fully sorted runs. The final step is done in parallel with the merge phase of the join, thereby avoiding one read and one write scan. The sort-merge algorithms we implemented for the performance analysis are based on this optimization. We generated initial runs using an in-memory quicksort on the explicit attributes (ES and ES-H), the timestamp attributes (TS and TS-H), or both (ETS and ETS-H) and then merged the multiple runs of the two relations.

4.2.2 Efficient skew handling

As noted in Sect. 3.2.2, sort-merge join algorithms become complicated when the join attributes are not key attributes. Our previous work on conventional joins [30] shows that intrinsic skew is generally present in this situation. Even a small amount of intrinsic skew can result in a significant performance hit because the naive approach to handling skew is to reread the previous tuples in the same value packet (containing the identical values for the equijoin attribute); this rereading involves additional I/O operations. We previously proposed several techniques to handle skew efficiently [30]. Among them, SC-n (spooled cache on multiple runs) was recommended due to its strikingly better performance in the presence of skew for both conventional and band joins. This algorithm also exhibits virtually identical performance to a traditional sort-merge join in the absence of skew. SC-n uses a small cache to hold the skewed tuples from the right-hand relation that satisfy the join condition. At the cache's overflow point, the cache data are spooled to disk.

Skew is prevalent in temporal joins. SC-n can be adapted for temporal joins by adding a supplemental predicate (requiring that the tuples overlap) and calculating the resulting timestamps by intersection. We adopt this spooled cache in ES instead of rereading the previous tuples. The advantage of using the spooled cache is shown in Fig. 8. ES Reread is the multirun version of the explicitSortMerge algorithm exhibited in Sect. 3.2.2, which backs up the right-hand relation when a duplicate value is found in the left-hand relation.

[Fig. 8. Performance improvement of ES with spooled cache on skewed data: elapsed time (secs) vs. skew (percentage); curves: ES Reread, ES Cache]

The two algorithms were executed in the TimeIT system. The parameters are the same as those that will be used in Sect. 5.1. In this experiment, the memory size was fixed at 8 MB and the cache size at 32 KB. The relations were generated with different percentages of smooth skew on the explicit attribute. A relation has 1% smooth skew when 1% of the tuples in the relation have one duplicate value on the join attribute and the remaining 98% of the tuples have no duplicates. Since the cache can hold the skewed tuples in memory, no additional I/O is caused by backing up the relation. The performance improvement of using a cache is approximately 25% when the data have 50% smooth skew. We thus use a spooled cache to handle skew. Spooling will generally not occur but is available in case a large value packet is present.
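The spooled cache itself can be pictured with the following sketch; the class, its capacity measured in tuples rather than kilobytes, and its use of a temporary file are illustrative simplifications, not the structure used in TimeIT.

```python
import pickle
import tempfile

class SpooledCache:
    """Holds the tuples of the current right-hand value packet in memory and
    spools them to a temporary file when the in-memory capacity is exceeded."""
    def __init__(self, capacity):
        self.capacity = capacity          # capacity in tuples (simplification)
        self.tuples = []
        self.spill = None                 # temporary file created on overflow

    def add(self, tup):
        if len(self.tuples) >= self.capacity:
            if self.spill is None:
                self.spill = tempfile.TemporaryFile()
            self.spill.seek(0, 2)                      # append at end of file
            pickle.dump(self.tuples, self.spill)       # spool the in-memory part
            self.tuples = []
        self.tuples.append(tup)

    def __iter__(self):
        if self.spill is not None:                     # replay spooled batches first
            self.spill.seek(0)
            while True:
                try:
                    yield from pickle.load(self.spill)
                except EOFError:
                    break
        yield from self.tuples

    def purge(self, keep):
        """Drop in-memory tuples that cannot join with future left-hand tuples
        (the spooled portion is left untouched in this sketch)."""
        self.tuples = [t for t in self.tuples if keep(t)]
```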

4.2.3 Time-varying value packets and optimized prediction rule

ES utilizes a prediction rule to judge if skew is present. (Recall that skew occurs if two tuples have the same join attribute value.) The prediction rule works as follows. When the last tuple in the right-hand relation (RHR) buffer is visited, the last tuple in the left-hand relation (LHR) buffer is checked to determine if skew is present and the current RHR value packet needs to be put into the cache.

We also implemented an algorithm (TS) that sorts the input relations by start time rather than by the explicit join attribute. Here the RHR value packet associated with a specific LHR tuple is not composed of those RHR tuples with the same start time but rather of those RHR tuples that overlap with the interval of the LHR tuple. Hence value packets are not disjoint, and they grow and shrink as one scans the LHR. In particular, TS puts into the cache only those tuples that could overlap in the future: the tuples that do not stop too early, that is, before subsequent LHR tuples start. For an individual LHR tuple, the RHR value packet starts with the first tuple that stops sometime during the LHR tuple's interval and goes through the first RHR tuple that starts after the LHR tuple stops. Value packets are also not totally ordered when sorting by start time.

These considerations suggest that we change the prediction rule in TS. When the RHR reaches a block boundary, the maximum stop time in the current value packet is compared with the start time of the last tuple in the LHR buffer. If the maximum stop time of the RHR value packet is less than the last start time of the LHR, none of the tuples in the value packet will overlap with the subsequent LHR tuples. Thus there is no need to put them in the cache. Otherwise, the value packet is scanned and only those tuples with a stop time greater than the last start time of the LHR are put into the cache, thereby minimizing the utilization of the cache and thus the possibility of cache overflow.
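In code, the optimized prediction rule amounts to a simple filter over the current value packet; this sketch assumes half-open intervals and (value, (start, stop)) tuples, and it omits the block-boundary bookkeeping that triggers the check.

```python
def tuples_to_cache(value_packet, last_lhr_start):
    """Return the right-hand tuples worth caching: those whose stop time
    extends past the start time of the last left-hand tuple in the buffer."""
    if not value_packet:
        return []
    max_stop = max(stop for _, (_, stop) in value_packet)
    if max_stop <= last_lhr_start:
        return []          # nothing in the packet can overlap later LHR tuples
    return [t for t in value_packet if t[1][1] > last_lhr_start]
```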

ETS sorts the input relations by explicit attribute first and then by start time. Here the RHR value packet associated with a left tuple is composed of those right tuples that not only have the same value of the explicit attribute but also overlap with the interval of the left tuple. The prediction rules used in ES and TS are combined to decide whether or not to put a tuple or a value packet into the cache.

To make our work complete, we also implemented TES, which sorts the input relations primarily by start time and secondarily by the explicit attribute. The logic of TES is exactly the same as that of TS for the joining phase. We expect the extra sorting by explicit attribute will not help to optimize the algorithm but rather will simply increase the CPU time.

4.2.4 Specialized cache purging

Since the cache size is small, it could fill up if a value packet is very large or if several value packets accumulate in the cache. For the former, nothing but spooling the cache can be done. However, purging the cache periodically can avoid unnecessary cache spooling for the latter and may result in fewer I/O operations.

Purging the cache costs more in TS since the RHR value packets are not disjoint, while in ES they are disjoint both in each run and in the cache. The cache purging process in ES scans the cache from the beginning and stops whenever the first tuple that belongs to the current value packet is met. But in TS, this purging stage cannot stop until the whole cache has been scanned because the tuples belonging to the current value packet are spread across the cache. An inner long-lived tuple could be kept in the cache for a long time because its time interval could intersect with many LHR tuples.

[Fig. 9. Performance improvement of using a heap in ES: CPU time (secs) vs. memory size (MB); curves: No Heap, Heap]

4.2.5 Using a heap

As stated in Sect. 4.2.1, the final step of sorting is done in parallel with the merging stage. Assuming the two relations are sorted in ascending order, in the merging stage the algorithm first has to find the smallest value from the multiple sorted runs of each relation and then compare the two values to see if they can be joined. The simplest way to find the smallest value is to scan the current value of each run. If the relation is divided into m runs, the cost of selecting the smallest value is O(m). A more efficient way to do this is to use a heap to select the smallest value. The cost of using a heap is O(log2 m) when m > 1. By utilizing a heap, the time complexity is reduced.

At the beginning of the merging step, the heap is built based on the value of the first tuple in each run. Whenever advance is called, the run currently on the top of the heap advances its reading pointer to the next tuple. Since the key value of this tuple is no less than that of the tuple in the current state, it should be propagated down to maintain the heap structure. When a run is backed up, its reading pointer is restored to point to a previously visited tuple, which has a smaller key value and thus should be propagated up the heap.
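A heapq-based sketch of the selection step is shown below; it models only the forward advance (backing up a run would reinsert an earlier, smaller-keyed position) and keeps whole runs in memory for brevity.

```python
import heapq

def merge_runs(runs, key=lambda tup: tup):
    """Merge m sorted runs: the heap holds (key, run index, position) entries,
    so the smallest current tuple is found in O(log m) rather than O(m)."""
    heap = [(key(run[0]), i, 0) for i, run in enumerate(runs) if run]
    heapq.heapify(heap)
    while heap:
        k, i, pos = heapq.heappop(heap)            # smallest current tuple of all runs
        yield runs[i][pos]
        if pos + 1 < len(runs[i]):                 # advance this run's reading pointer
            heapq.heappush(heap, (key(runs[i][pos + 1]), i, pos + 1))
```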

When the memory size is relatively small, which indicates that the size of each run is small and therefore that a relation has to be divided into more runs (the number of runs m is large), the performance of using a heap will be much better than that without a heap. However, using a heap causes some pointer swaps when sifting down or propagating up a tuple in the heap, which are not needed in the simple algorithm. When the memory size is sufficiently large, the performance of using a heap will be close to or even worse than that of the simple algorithm.

Figure 9 shows the total CPU time of ES when using and not using a heap. The data used in Fig. 9 are two 64-MB relations. The input relations are joined while using different sizes of memory. Note that the CPU time includes the time of both the sorting step and the merging step. As expected, the performance of using a heap is better than that without a heap when the memory is small. The performance improvement is roughly 40% when the memory size is 2 MB. The performance difference decreases as the memory increases. When the memory size is greater than 32 MB, which is one half of the relation size, using a heap has no benefit. Since using a heap significantly improves the performance when the memory is relatively small and barely degrades performance when the memory is large, we use a heap in all sort-based algorithms.

4.2.6 GRACE and hybrid variants

We implemented both GRACE and hybrid versions of each sort-based algorithm. In the GRACE variants, all the sorted runs of a relation are written to disk before the merging stage. The hybrid variants keep most of the last run of the outer relation in memory. This guarantees that one (multiblock) disk read and one disk write of the memory-resident part will be saved. When the available memory is slightly smaller than the dataset, the hybrid algorithms will require relatively fewer I/O operations.

4.2.7 Adapting the interval join

We consider the interval join a variant of the timestamp sort-merge algorithm (TS). In this paper, we call it TSI and its hybrid variant TSI-H. To be fair, we do not assume the input relations are sorted into one list. Instead, TSI begins with sorting as its first step. Then it combines the last step of the sort with the merge step. The two active lists are essentially two spooled caches, one for each relation. Each cache has the same size as that in TS. This is different from the strategy of keeping a single block of each list in the original paper. A small cache can save more memory for the input buffer, thus reducing the random reads. However, it will cause more cache spools when skew is present. Since timestamp algorithms tend to encounter skew, we choose a cache size that is the same as that in TS, rather than one block.

4.3 Partition-based algorithms

Several engineering considerations also arise when implementing the partition-based algorithms.

4.3.1 Partitioning details

The details of algorithm TP are described elsewhere [54]. We changed TP to use a slightly larger input buffer (32 KB) and a cache for the inner relation (also 32 KB) instead of using a one-page buffer and cache. The rest of the available main memory is used for the outer relation. There is a tradeoff between a large outer input buffer and a large inner input buffer and cache. A large outer input buffer implies a large partition size, which results in fewer seeks for both relations, but the cache is more likely to spool. On the other hand, allocating a large cache and a large inner input buffer results in a smaller outer input buffer and thus a smaller partition size. This will increase random I/O. We chose 32 KB instead of 1 KB (the page size) as a compromise. The identification of the best cache size is given in Sect. 6 as a direction for future research.

The algorithms ETP and TEP partition the input relations in two steps. ETP partitions the relations by explicit attribute first. For each pair of buckets to be joined, if neither fits in memory, a further partitioning by timestamp attribute is applied to these buckets to increase the possibility that the resulting buckets do not overflow the available buffer space. TEP is similar to ETP, except that it partitions the relations in the reverse order, first by timestamp and then, if necessary, by explicit attribute.
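The two-step partitioning can be sketched as follows; the fanout, the memory threshold, and the use of equal-width time ranges are our illustrative choices, and both input relations would of course have to be partitioned with the same boundaries.

```python
from collections import defaultdict

def etp_partition(relation, fanout=16, memory_tuples=100_000):
    """ETP-style partitioning sketch: hash on the explicit attribute first;
    any bucket too large for memory is range-partitioned further on the
    starting timestamp.  Tuples are (value, (start, end)) pairs."""
    buckets = defaultdict(list)
    for tup in relation:
        buckets[hash(tup[0]) % fanout].append(tup)      # step 1: explicit attribute

    partitions = {}
    for b, tuples in buckets.items():
        if len(tuples) <= memory_tuples:
            partitions[(b, 0)] = tuples
            continue
        starts = [start for _, (start, _) in tuples]    # step 2: timestamp ranges
        lo, hi = min(starts), max(starts)
        width = max(1, (hi - lo + 1) // fanout)
        for tup in tuples:
            slot = min((tup[1][0] - lo) // width, fanout - 1)
            partitions.setdefault((b, slot), []).append(tup)
    return partitions
```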

4.3.2 Joining the partitions

The partition-based algorithms perform their second phase, the joining of corresponding partitions of the outer and inner relations, as follows. The outer partition is fetched into memory, assuming that it will not overflow the available buffer space, and pointers to the outer tuples are sorted using an in-memory quicksort. The inner partition is then scanned, using all memory pages not occupied by the outer partition. For each inner tuple, matching outer tuples are found by binary search. If the outer partitions overflow the available buffer space, then the algorithms default to an explicit attribute sort-merge join of the corresponding partitions.

4.3.3 GRACE and hybrid variants

In addition to the conventional GRACE algorithms, we implemented hybrid buffer management for each partition-based algorithm. In the hybrid algorithms, one outer bucket is designated as memory-resident. Its buffer space is increased accordingly to hold the whole bucket in memory. When the inner relation is partitioned, the inner tuples that map to the corresponding bucket are joined with the tuples in the memory-resident bucket. This eliminates the I/O operations to write and read one bucket of tuples for both the inner and the outer relation. As with the hybrid sort-based algorithms, the hybrid partition-based algorithms are expected to have better performance when the input relation is slightly larger than the available memory size.

4.4 Supporting the iterator interface

Most commercial systems implement the relational operators as iterators [18]. In this model, each operator is realized by three procedures called open, next, and close. The algorithms we investigate in this paper can be redesigned to support the iterator interface.

The nested-loop algorithm and the explicit partitioning algorithms are essentially the corresponding snapshot join algorithms except that a supplemental predicate (requiring that the tuples overlap) and the calculation of the resulting timestamps are added in the next procedure.
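The supplemental predicate and timestamp calculation are small enough to state directly; a sketch, assuming half-open [start, end) intervals and a snapshot join that yields (value, outer interval, inner interval) triples:

```python
def overlaps(a, b):
    """Supplemental temporal predicate: the two intervals intersect."""
    return a[0] < b[1] and b[0] < a[1]

def intersect(a, b):
    """Timestamp of a result tuple: the intersection of the input intervals."""
    return (max(a[0], b[0]), min(a[1], b[1]))

def temporal_next(snapshot_pairs):
    """Wrap a snapshot equijoin's next() stream: keep only overlapping pairs
    and attach the intersection as the result timestamp."""
    for value, iv_outer, iv_inner in snapshot_pairs:
        if overlaps(iv_outer, iv_inner):
            yield (value, intersect(iv_outer, iv_inner))
```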

The timestamp partitioning algorithms determine the periods for partitions by sampling the outer relation and partition the input relations in the open procedure. The next procedure calls the next procedure of nested-loop join for each pair of partitions. An additional predicate is added in the next procedure to determine if a tuple should be put into the cache.

The sort-based algorithms generate the initial sorted runs for the input relations and merge runs until only the final merge step is left, all in the open procedure. In the next procedure, the inner runs, the cache, and the outer runs are scanned to find a match. At the same time, the inner tuple is examined to decide whether to put it in the cache. The close procedure destroys the input runs and deallocates the cache. The open and close procedures of the interval join algorithms are the same as for the other sort-based algorithms. The next procedure gets the next tuple from the sorted runs and scans the cache to find matching tuples, purging the cache at the same time.

5 Performance

We implemented all 19 algorithms enumerated in Table 2 and tested their performance under a variety of data distributions, including skewed explicit and timestamp distributions, timestamp durations, memory allocations, and database sizes. We ensured that all algorithms generated exactly the same output tuples in all of the experiments (the ordering of the tuples will differ).

The remainder of this section is organized as follows. We first give details on the join algorithms used in the experiments and then describe the parameters used in the experiments. Sections 5.2 to 5.9 contain the actual results of the experiments. Section 5.10 summarizes the results of the experiments.

5.1 Experimental setup

The experiments were developed and executed using the TimeIT [17] system, a software package supporting the prototyping of temporal database components. Using TimeIT, we fixed several parameters describing all test relations used in the experiments. These parameters and their values are shown in Table 4. In all experiments, tuples were 16 bytes long and consisted of two explicit attributes, both being integers and occupying 4 bytes, and two integer timestamps, each also requiring 4 bytes. Only one of the explicit attributes was used as the joining attribute. This yields result tuples that are 24 bytes long, consisting of the 16 bytes of explicit attributes from the two input tuples and 8 bytes for the result timestamp.

We fixed the relation size at 64 MB, giving four million tuples per relation. We were less interested in the absolute relation size than in the ratio of input size to available main memory. Similarly, the ratio of the page size to the main memory size and the relation size is more relevant than the absolute page size. A scaling of these factors would provide similar results. In all cases, the generated relations were randomly ordered with respect to both their explicit and timestamp attributes.

Table 4. System characteristics

Parameter                    Value
Relation size                64 MB
Tuple size                   16 bytes
Tuples per relation          4 million
Timestamp size ([s,e])       8 bytes
Explicit attribute size      8 bytes
Relation lifespan            1,000,000 chronons
Page size                    1 KB
Output buffer size           32 KB
Cache size in sort-merge     64 KB
Cache size in partitioning   32 KB

Table 5. Cost metrics

Parameter                Value
Sequential I/O cost      1 ms
Random I/O cost          10 ms
Join attribute compare   20 ns
Timestamp compare        20 ns
Pointer compare          20 ns
Pointer swap             60 ns
Tuple move               80 ns

The metrics used for all experiments are listed in Table 5. In a modern computer system, a random disk access takes about 10 ms, whereas accessing a main memory location typically takes less than 60 ns [42]. It is reasonable to assume that a sequential I/O takes about one tenth the time of a random I/O. Modern computer systems usually have a hardware data cache, which can reduce the CPU time on a cache hit. Therefore, we chose the join attribute compare time as 20 ns, which is slightly less than, though of the same order of magnitude as, the memory access time. The compare cost we used is thus the average memory access time given a high cache hit ratio (> 90%). It is possible that the CPU cache has a lower hit ratio when running some algorithms. However, the magnitude of the memory access time will not change. We assumed that the sizes of both a timestamp and a pointer were the same as the size of an integer. Thus, their compare times are the same as that of the join attribute. A pointer swap takes three times as long as the pointer compare time because it needs to access three pointers. A tuple move takes four times as long as the integer compare time since the size of a tuple is four times that of an integer.

We measured both main memory operations and disk I/O operations. To eliminate any undesired system effects from the results, all operations were counted using facilities provided by TimeIT. For disk operations, random and sequential accesses were measured separately. We included the cost of writing the output relation in the experiments since sort-based and partition-based algorithms exhibit dual random and sequential I/O patterns when sorting/coalescing and partitioning/merging. The total time was then computed by weighing each operation count by the time values listed in Table 5.
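The total elapsed time is thus a weighted sum of operation counts; the following sketch shows the computation with the Table 5 weights, where the dictionary keys are our own names for the counted operations.

```python
# Cost weights from Table 5, expressed in seconds per operation.
WEIGHTS = {
    "sequential_io": 1e-3,    # 1 ms
    "random_io": 10e-3,       # 10 ms
    "compare": 20e-9,         # join attribute, timestamp, or pointer compare
    "pointer_swap": 60e-9,    # 60 ns
    "tuple_move": 80e-9,      # 80 ns
}

def elapsed_time(counts):
    """Weigh each counted operation by its Table 5 cost and sum the result."""
    return sum(WEIGHTS[op] * n for op, n in counts.items())

# Example: 10,000 random I/Os, 500,000 sequential I/Os, 2 * 10**8 compares.
# elapsed_time({"random_io": 10_000, "sequential_io": 500_000, "compare": 2 * 10**8})
```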

Table 6 summarizes the values of the system parameters that varied among the different experiments. Each row of the table identifies the figures that illustrate the results of the experiments given the parameters for the experiment. The reader may have the impression that the intervals are so small that they are almost like standard equijoin attributes. Are there tuples overlapping each other? In many cases, we performed a self-join, which guaranteed for each tuple in one relation that there is at least one matching tuple in the other relation. Long-duration timestamps (100 chronons) were used in two experiments, which guaranteed that there were on average four tuples valid in each chronon. Two other experiments examine the case where one relation has short-duration timestamps and the other has long-duration timestamps. Therefore, our experiments actually examined different degrees of overlap.

5.2 Simple experiments

In this section, we perform three "base case" experiments, where the join selectivity is low, i.e., for an equijoin of valid-time relations r and s, a given tuple x ∈ r joins with one, or few, tuples y ∈ s. The experiments incorporate random data distributions in the explicit join attributes and short and long time intervals in the timestamp attributes.

5.2.1 Low explicit selectivity with short timestamps

In this experiment, we generated a relation with little explicit matching and little overlap and joined the relation with itself. This mimics a foreign key-primary key natural join in that the cardinality of the result is the same as that of one of the input relations. The relation size was fixed at 64 MB, corresponding to four million tuples. The explicit joining attribute values were integers drawn from a space of 2^31 - 1 values. For the given cardinality, a particular explicit attribute value appeared on average in only one tuple in the relation. The starting timestamp attribute values were randomly distributed over the relation lifespan, and the duration of the interval associated with each tuple was set to one chronon. We ran each of the 19 algorithms using the generated relation, increasing the main memory allocation from 2 MB, a 1:32 memory to input size ratio, to 64 MB, a 1:1 ratio.
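Data generation for this base case can be sketched as follows; the parameter names and the use of Python's random module are ours, and half-open intervals are assumed.

```python
import random

def generate_relation(n_tuples=4_000_000, lifespan=1_000_000,
                      duration=1, value_space=2**31 - 1):
    """Generate a relation with low explicit selectivity and short intervals:
    explicit values drawn uniformly from a 2^31 - 1 space, start times uniform
    over the relation lifespan, and a fixed interval duration in chronons."""
    relation = []
    for _ in range(n_tuples):
        value = random.randrange(value_space)
        start = random.randrange(lifespan)
        relation.append((value, (start, start + duration)))
    return relation

# The base-case experiment joins the generated relation with itself:
# r = generate_relation()
```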

The results of the experiment are shown in Fig. 10. In each panel, the ordering of the legend corresponds to the order of either the rightmost points or the leftmost points of each curve. The actual values of each curve in all the figures may be found in the appendix of the associated technical report [16]. Note that both the x-axis and the y-axis are log-scaled. As suspected, nested loop is clearly not competitive. The general nested-loop algorithm performs very poorly in all cases but the highest memory allocation. At the smallest memory allocation, the least expensive algorithm, EP, enjoys an 88% performance increase. Only at the highest memory allocation, that is, when the entire left-hand side relation can fit in main memory, does the nested-loop algorithm have performance comparable to the other algorithms. Given the disparity in performance and given that various characteristics, such as skew or the presence of long-duration tuples, do not impact the performance of the nested-loop algorithm, we will not consider this algorithm in the remainder of this section.

To get a better picture of the performance of the remaining algorithms, we plot them separately in Fig. 11. From this figure on, we eliminate the noncompetitive nested loop. We group the algorithms that have similar performance and retain only a representative curve for each group in the figures. In this figure, TES-H and TSI-H have performance very similar to that of TS-H; ETS-H has performance very similar to that of ES-H; ETP-H, TP-H, and TEP-H have performance similar to that of EP-H; the remaining algorithms all have performance similar to that of EP.

Table 6. Experiment parameters

Figure numbers   Explicit skew (%)   Timestamp skew (%)   Timestamp duration (chronons)   Outer size (MB)   Inner size (MB)   Memory size (MB)
10 and 11        None                None                 1                               64                64                2–64
12               None                None                 100                             64                64                2–64
13               None                None                 Outer: 1; Inner: 100            64                64                2–64
14               None                None                 1                               4–64              64                16
15 and 16        None                None                 100                             4–64              64                16
17               None                None                 Outer: 1; Inner: 100            4–64              64                16
18               0–100% one side     None                 1                               64                64                16
19               None                0–100% one side      1                               64                64                16
20               0–100% one side     0–100% one side      1                               64                64                16
21               0–4% both sides     None                 1                               64                64                16
23               None                0–4% both sides      1                               64                64                16
25               0–4% both sides     0–4% both sides      1                               64                64                16

[Fig. 10. Low explicit selectivity, low timestamp selectivity: elapsed time (secs) vs. memory size (MB) for all 19 algorithms]

In this graph, only the x-axis is log-scaled. The sort-based and partition-based algorithms exhibit largely the same performance, and the hybrid algorithms outperform their GRACE counterparts at high memory allocations, in this case when the ratio of main memory to input size reaches approximately 1:8 (2 MB of main memory) or 1:4 (4 MB of main memory). The poor performance of the hybrid algorithms stems from reserving buffer space to hold the resident run/partition, which takes buffer space away from the remaining runs/partitions, causing the algorithms to incur more random I/O. At small memory allocations, the problem is acute. Therefore, the hybrid group starts from a higher position and ends in a lower position. The GRACE group behaves in the opposite way.

The performance differences between the sort-based algorithms and their partitioning counterparts are small, and there is no absolute winner. TES, the sort-merge algorithm that sorts the input relation primarily by start time and secondarily by explicit attribute, has a slightly worse performance than TS, which sorts the input relation by start time only. Since the order of the start time is not the order of the time interval, the extra sorting by explicit attribute does not help in the merging step. The program logic is the same as for TS, except for the extra sorting. We expect TES will always perform a little worse than TS. Therefore, neither TES nor TES-H will be considered in the remainder of this section.

[Fig. 11. Low explicit selectivity, low timestamp selectivity (without nested loop): elapsed time (secs) vs. memory size (MB); curves: TS-H, ES-H, EP-H, EP]

5.2.2 Long-duration timestamps

In the experiment described in the previous section, the join selectivity was low since explicit attribute values were shared among few tuples and tuples were timestamped with intervals of short duration. We repeated the experiment using long-duration timestamps. The duration of each tuple timestamp was fixed at 100 chronons, and the starting timestamps were randomly distributed throughout the relation lifespan. As before, the explicit join attribute values were randomly distributed integers; thus the size of the result was just slightly larger due to the long-duration timestamps.

[Fig. 12. Low explicit selectivity (long-duration timestamps): elapsed time (secs) vs. memory size (MB); curves: TS-H, TSI-H, TS, TSI, ES-H, TP-H, ES]

The results are shown in Fig. 12, where the x-axis is log-scaled. In this figure, the group of ES-H and ETS-H is represented by ES-H; the group of ETP-H, EP-H, TEP-H, and TP-H by TP-H; the group of TP, TEP, ES, ETS, EP, and ETP by ES; and the rest are retained. The timestamp sorting algorithms, TS and TS-H, suffer badly. Here, the long duration of the tuple lifespans did not cause overflow of the tuple cache used in these algorithms. To see this, recall that our input relation cardinality was four million tuples. For a 1,000,000-chronon relation lifespan, this implies that 4,000,000/1,000,000 = 4 tuples arrive per chronon. Since tuple lifespans were fixed at 100 chronons, it follows that 4 × 100 = 400 tuples should be scanned before any purging of the tuple cache can occur. A 64-KB tuple cache, capable of holding 4000 tuples, thus does not tend to overflow. Detailed examination verified that the cache never overflowed in these experiments. The poor performance of TS and TS-H is caused by the repeated in-memory processing of the long-lived tuples.

TSI and TSI-H also suffer in the case of long durations but are better than TS and TS-H when the main memory size is small. TSI improves on the performance of TS by 32% at the smallest memory allocation, while TSI-H improves on the performance of TS-H by 13%. Our detailed results show that TS had slightly less I/O time than TSI. TS also saved some time in tuple moving since it did not move every tuple into the cache. However, it spent much more time in timestamp comparing and pointer moving. In TSI, each tuple joined only with the tuples in the cache of the other relation. The caches in TSI were purged during the join process; thus the number of timestamp comparisons needed by the next tuple was reduced. In TS, an outer tuple joined with both cache tuples and tuples in the input buffer of the inner relation, and the input buffer was never purged. Therefore, TS had to compare more timestamps. Pointer moving is needed in the heap maintenance, which is used to sort the current tuples in each run. TS frequently backed up the inner runs inside the inner buffer and scanned tuples in the value packets multiple times. In each scan, the heap for the inner runs had to sort the current inner tuples again. In TSI, the tuples are sorted once and kept in order in the caches. Therefore, the heap overhead is small. When the main memory size is small, the number of runs is large, as are the heap size and the heap overhead.

The timestamp partitioning algorithms, TP and TP-H, have a performance very similar to that described in Sect. 5.2.1. There are two main causes of the good performance of TP and TP-H. The first is that TP does not replicate long-lived tuples that overlap with multiple partition intervals. Otherwise, TP would need more I/O for the replicated tuples. The second is that TP sorts each partition by the explicit attribute. The long duration does not have any effect on the performance of the in-memory joining. All the other algorithms sort or partition the relations by explicit attributes. Therefore, their performance is not affected by the long duration.

We may conclude from this experiment that the timestamp sort-based algorithms are quite sensitive to the durations of input tuple intervals. When tuple durations are long, the in-memory join in TS and TS-H performs poorly due to the need to repeatedly back up the tuple pointers.

5.2.3 Short- and long-duration timestamps

In the experiments described in the previous two sections, the timestamps are either short or long for both relations. We examined the case where the durations for the two input relations are different. The duration of each tuple timestamp in the outer relation was fixed at 1 chronon, while the duration in the inner relation was fixed at 100 chronons. We carefully generated the two relations so that the outer relation and the inner relation had a one-to-one relationship. For each tuple in the outer relation, there is one tuple in the inner relation that has the same values of the explicit attributes and the same start time as the outer tuple, but with a long duration instead of a short duration. This guaranteed that the selectivity was between that of the two previous experiments. As before, the explicit join attribute values and the start times were randomly distributed.

[Fig. 13. Low explicit selectivity (short-duration timestamps join long-duration timestamps): elapsed time (secs) vs. memory size (MB); curves: TS-H, TSI-H, TS, TSI, ES-H, TP-H, ES]

The results are shown in Fig. 13, where the x-axis is log-scaled. The groups of the curves are the same as in Fig. 12. The relative positions of the curves are similar to those in the long-duration experiment. The performance of the timestamp sorting algorithms was even worse than that of the others, but better than in the experiment where long-duration tuples were in both input relations. Having long-duration tuples on only one side reduces the size of the value packets for each tuple and therefore results in fewer timestamp comparisons in all four timestamp sorting algorithms and fewer backups in TS and TS-H.

We also exchanged the outer and inner relations for this experiment and observed results identical to those in Fig. 13. This indicates that whether the long-duration tuples exist in the outer relation or the inner relation has little impact on the performance of any algorithm.

5.3 Varying relation sizes

It has been shown for snapshot join algorithms that the relative sizes of the input relations can greatly affect which sort- or partition-based strategy is best [18]. We investigated this phenomenon in the context of valid-time databases.

We generated a series of relations, increasing in size from 4 MB to 64 MB, and joined them with a 64-MB relation. The memory allocation used in all trials was 16 MB, the size at which all algorithms performed most closely in Fig. 11. As in the previous experiments, the explicit join attribute values in all relations were randomly distributed integers. Short-duration timestamps were used to mitigate the in-memory effects on TS and TS-H seen in Fig. 12. As before, starting timestamps were randomly distributed over the relation lifespan. Since the nested-loop algorithm is expected to be a competitor when one of the relations fits in memory, we incorporated this algorithm into this experiment. The results of the experiment are shown in Fig. 14. In this figure, ES represents all the GRACE sorting algorithms, ES-H all the hybrid sorting algorithms, EP all the GRACE partitioning algorithms, TP-H the hybrid timestamp partitioning algorithms, and EP-H the hybrid explicit partitioning algorithms; NL is retained.

[Fig. 14. Different relation sizes (short-duration timestamps): elapsed time (secs) vs. outer relation size (MB); curves: ES, ES-H, EP, NL, TP-H, EP-H]

The impact of a differential in relation sizes for the partition-based algorithms is clear. When an input relation is small relative to the available main memory, the partition-based algorithms use this relation as the outer relation and build an in-memory partition table from it. The inner relation is then linearly scanned, and for each inner tuple the in-memory partition table is probed for matching outer tuples. The benefit of this approach is that each relation is read only once, i.e., no intermediate writing and reading of generated partitions occurs. Indeed, the inner relation is not partitioned at all, further reducing main memory costs in addition to I/O savings.

The nested-loop algorithm has the same I/O costs as the partition-based algorithms when one of the input relations fits in main memory. When the size of the smaller input relation is twice as large as the memory size, the performance of the nested-loop algorithm is worse than that of any other algorithm. This is consistent with the results shown in Fig. 10.

An important point to note is that this strategy is beneficial regardless of the distribution of either the explicit join attributes and/or the timestamp attributes, i.e., it is unaffected by either explicit or timestamp skew. Furthermore, no similar optimization is available for sort-based algorithms. Since each input relation must be sorted, both relations must be read and written once to generate sorted runs and subsequently read once to scan and match joining tuples.

To further investigate the effectiveness of this strategy, we repeated the experiment of Fig. 14 with long-duration timestamps, i.e., tuples were timestamped with timestamps 100 chronons in duration. We did not include the nested-loop algorithm because we did not expect the long-duration tuples to have any impact on it. The results are shown in Fig. 15. The grouping of the curves in this figure is slightly different from the grouping in Fig. 14 in that the timestamp sorting algorithms are separated instead of grouped together.

As expected, long-duration timestamps adversely affect the performance of all the timestamp sorting algorithms for the reasons stated in Sect. 5.2.2. The performance of TSI and TSI-H is slightly better than that of TS and TS-H, respectively. This is consistent with the results at the 16 MB memory size in Fig. 12. Replotting the remaining algorithms in Fig. 16 shows that the long-duration timestamps do not significantly impact the efficiency of the other algorithms.

In both the short-duration and the long-duration cases, the hybrid partitioning algorithms show the best performance. They save about half of the I/O operations of their GRACE counterparts when the size of the outer relation is 16 MB. This is due to the hybrid strategy.

[Fig. 15. Different relation sizes (long-duration timestamps): elapsed time (secs) vs. outer relation size (MB); curves: TS, TSI, TS-H, TSI-H, ES, ES-H, EP, TP-H, EP-H]

[Fig. 16. Different relation sizes (long-duration timestamps, without TS/TS-H): elapsed time (secs) vs. outer relation size (MB); curves: ES, ES-H, EP, TP-H, EP-H]

We further changed the input relations so that the tuples in the outer relation have a fixed short duration of 1 chronon and those in the inner relation have a fixed long duration of 100 chronons. Other features of the input relations remain the same. The results, as shown in Fig. 17, are very similar to the long-duration case. The performance of the timestamp sorting algorithms is slightly better than that in Fig. 15. Again, we regenerated the relations such that the tuples in the outer relation have the long duration fixed at 100 chronons and those in the inner relation have the short duration fixed at 1 chronon. The results are almost identical to those shown in Fig. 17.

The graph shows that partition-based algorithms should be chosen whenever the size of one or both of the input relations is small relative to the available buffer space. We conjecture that the choice between explicit partitioning and timestamp partitioning is largely dependent on the presence or absence of skew in the explicit and/or timestamp attributes. Explicit and timestamp skew may or may not increase I/O cost; however, they will increase main memory searching costs for the corresponding algorithms, as we now investigate.

[Fig. 17. Different relation sizes (short- and long-duration timestamps): elapsed time (secs) vs. outer relation size (MB); curves: TS, TSI, TS-H, TSI-H, ES, ES-H, EP, TP-H, EP-H]

5.4 Explicit attribute skew

As in the experiments described in Sect. 5.3, we fixed the main memory allocation at 16 MB to place all algorithms on a nearly even footing. The inner and outer relation sizes were fixed at 64 MB each. We generated a series of outer relations with increasing explicit attribute skew, from 0% to 100% in 20% increments. Here we generated data with chunky skew. The explicit attribute has 20% chunky skew, for example, when 20% of the tuples in the relation have the same explicit attribute value. Explicit skew was ensured by generating tuples with the same explicit join attribute value. Short-duration timestamps, randomly distributed over the relation lifespan, were used to mitigate the long-duration timestamp effect on the timestamp sorting algorithms. The results are shown in Fig. 18. In this figure, TSI, TS, TEP, and TP are represented by TS and their hybrid counterparts by TS-H; the other algorithms are retained.

There are three points to emphasize in this graph. First, the explicit partitioning algorithms, i.e., EP, EP-H, ETP, and ETP-H, show increasing costs as the explicit skew increases. The performance of EP and EP-H degrades dramatically with increasing explicit skew. This is due to the overflowing of main memory partitions, causing subsequent buffer thrashing. The effect, while pronounced, is relatively small since only one of the input relations is skewed. Encountering skew in both relations would exaggerate the effect. Although the performance of ETP and ETP-H also degrades, the changes are much less pronounced. The reason is that they employ time partitioning to reduce the effect of explicit attribute skew.

As expected, the group of algorithms that perform sorting or partitioning on timestamps, TS, TS-H, TP, TP-H, TEP, and TEP-H, have relatively flat performance. By ordering or partitioning by time, these algorithms avoid effects due to explicit attribute distributions.

The explicit sorting algorithms, ES, ES-H, ETS, and ETS-H, perform very well. In fact, the performance of ES and ES-H improves as the skew increases. As the skew increases, the relations in effect become increasingly sorted. Hence, ES and ES-H expend less effort during run generation.

[Fig. 18. Explicit attribute skew (short-duration timestamps): elapsed time (secs) vs. explicit skew in the outer relation (percentage); curves: EP, EP-H, ETP, TS, ETP-H, TS-H, ETS, ES, ETS-H, ES-H]

We conclude from this experiment that if high explicit skew is present in one input relation, then explicit sorting, timestamp partitioning, and timestamp sorting appear to be the better alternatives. The choice among these is then dependent on the distribution and the length of tuple timestamps, which can increase the amount of timestamp skew present in the input, as we will see in the next experiment.

5.5 Timestamp skew

Like explicit attribute distributions, the distribution of timestamp attribute values can greatly impact the efficiency of the different algorithms. We now describe a study of the effect of this aspect.

As in the experiments described in Sect. 5.3, we fixed the main memory allocation at 16 MB and the sizes of all input relations at 64 MB. We fixed one relation with randomly distributed explicit attributes and randomly distributed tuple timestamps, and we generated a series of relations with increasing timestamp attribute chunky skew, from 0% to 100% in 20% increments. The timestamp attribute has 20% chunky skew, for example, when 20% of the tuples in the relation are in one value packet. The skew was created by generating tuples with the same interval timestamp. Short-duration timestamps were used in all relations to mitigate the long-duration timestamp effect on the timestamp sorting algorithms. Explicit join attribute values were distributed randomly. The results of the experiment are shown in Fig. 19. In this figure, all the GRACE explicit algorithms are represented by EP, the hybrid explicit sorting algorithms by ES-H, and the hybrid explicit partitioning algorithms by EP-H; the remaining algorithms are retained.

Four interesting observations may be made. First, as expected, the timestamp partitioning algorithms, i.e., TP, TEP, TP-H, and TEP-H, suffered increasingly poorer performance as the amount of timestamp skew increased. This skew causes overflowing partitions. The performance of all four of these algorithms is good when the skew is 100% because TP and TP-H become explicit sort-merge joins and TEP and TEP-H become explicit partition joins. Second, TSI and TSI-H also exhibited poor performance as the timestamp skew increased because 20% skew in the outer relation caused the outer cache to overflow. Third, TS and TS-H show improved performance at the highest skew percentage. This is due to the sortedness of the input, analogous to the behavior of ES and ES-H in the previous experiment. Finally, as expected, the remaining algorithms have flat performance across all trials.

When timestamp skew is present, timestamp partitioning is a poor choice. We expected this result, as it is analogous to the behavior of partition-based algorithms in conventional databases, and similar results have been reported for temporal coalescing. The interval join algorithms are also bad choices when the amount of timestamp skew is large. A small amount of timestamp skew can be handled efficiently by increasing the cache size in the interval join algorithms. We will discuss this issue again in Sect. 5.8. Therefore, the two main dangers to good performance are explicit attribute skew and/or timestamp attribute skew. We investigate the effects of simultaneous skew next.

[Fig. 19. Timestamp attribute skew (short-duration timestamps): elapsed time (secs) vs. timestamp skew in the outer relation (percentage); curves: TSI, TSI-H, TP, TEP, TP-H, EP, ES-H, TS, EP-H, TS-H]

5.6 Combined explicit/timestamp attribute skew

Again, we fixed the main memory allocation at 16 MB and set the input relation sizes at 64 MB. Timestamp durations were set to 1 chronon to mitigate the long-duration timestamp effect on the timestamp sorting algorithms. We then generated a series of relations with increasing explicit and timestamp chunky skew, from 0% to 100% in 20% increments. Skew was created by generating tuples with the same explicit joining attribute value and tuple timestamp. The explicit skew and the timestamp skew are orthogonal. The results are shown in Fig. 20. In this figure, ETS, ES, and TS are represented by ES, and ETS-H, ES-H, and TS-H by ES-H; the other algorithms are retained.

[Fig. 20. Combined explicit/timestamp attribute skew: elapsed time (secs) vs. explicit and timestamp skew in the outer relation (percentage); curves: TSI, TSI-H, TEP, EP, TEP-H, EP-H, ETP, ETP-H, ES, TP, TP-H, ES-H]

The algorithms are divided into three groups in terms of performance. As expected, most of the partition-based algorithms and the interval join algorithms, TEP, TEP-H, TP, TP-H, EP, EP-H, TSI, and TSI-H, show increasingly poorer performance as the explicit and timestamp skew increases. The remaining combined explicit/timestamp sorting algorithms show relatively flat performance across all trials, and the explicit sorting and timestamp sorting algorithms exhibit improving performance as the skew increases, analogous to their performance in the experiments described in Sects. 5.4 and 5.5. While the elapsed time of ETP and ETP-H increases slowly with increasing skew, these two algorithms perform very well. This is analogous to their performance in the experiments described in Sect. 5.4.

5.7 Explicit attribute skew in both relations

In previous work [30], we studied the effect of data skew on the performance of sort-merge joins. There are three types of skew: outer relation skew, inner relation skew, and dual skew.

Fig. 21. Explicit attribute skew in both relations. [Plot omitted: elapsed time (secs) vs. explicit skew in both relations (percentage)]

Fig. 22. Explicit attribute skew in both relations. [Plot omitted: CPU time (secs) vs. explicit skew in both relations (percentage)]

Outer skew occurs when value packets in the outer relation cross buffer boundaries. Similarly, inner skew occurs when value packets in the inner relation cross buffer boundaries. Dual skew indicates that outer skew occurs in conjunction with inner skew. While outer skew does not cause any problems for TS and TS-H, it degrades the performance of TSI and TSI-H; dual skew degrades the performance of the TS and TS-H joins. In this section, we compare the performance of the join algorithms in the presence of dual skew in the explicit attribute.

The main memory allocation was fixed at 16 MB and the size of all input relations at 64 MB. We generated a series of relations with increasing explicit attribute chunky skew, from 0% to 4% in 1% increments. To ensure dual skew, we performed a self-join on these relations. Short-duration timestamps, randomly distributed over the relation lifespan, were used to mitigate the long-duration timestamp effect on the timestamp sorting algorithms. The results are shown in Fig. 21. In this figure, all the explicit partitioning algorithms are represented by EP, all the timestamp partitioning algorithms by TP, all the sort-merge algorithms except ES and ES-H by TS, and ES and ES-H are retained.

There are three points to discuss regarding the graph. First, the explicit algorithms, i.e., ES, ES-H, EP, EP-H, ETP, and ETP-H, suffer when the skew increases. Although the numbers of I/O operations of these algorithms increase along with the increasing skew, the I/O-incurred difference between the highest and the lowest skew is only 2 s. The difference in the output relation size between the highest and the lowest skew is only 460 KB, which leads to about a 4.6-s performance difference. Then what is the real reason for the performance hit of these algorithms? Detailed examination revealed that it is in-memory operations that cause the poor performance of these algorithms. When data skew is present, these algorithms have to do substantial in-memory work to perform the join. This is illustrated in Fig. 22, which shows the CPU time used by each algorithm. To present the difference clearly, we do not use a log-scale y-axis. Note that six algorithms, i.e., ETS, TSI, TS, ETS-H, TSI-H, and TS-H, have very low CPU cost (less than 30 s) in all cases. So their performance does not degrade when the degree of skew increases.

Second, the performance of the timestamp partitioning algorithms, i.e., TP, TP-H, TEP, and TEP-H, degrades with increasing skew, but not as badly as that of the explicit algorithms. Although the timestamp partitioning algorithms sort each partition by the explicit attribute, the explicit attribute inside each partition is not highly skewed. For example, if n tuples have the same value for the explicit attribute, they will be put into one partition after being hashed in EP. In the join phase, there will be an n × n loop within the join. In TP, this value packet will be distributed evenly across partitions. Assuming there are m partitions, each partition will have n/m of these tuples, which leads to an n²/m² loop within the join per partition. The total number of join operations in TP will be n²/m, which is 1/m of that of EP. This factor can be seen from Fig. 22.
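The loop-count argument can be written out explicitly; assuming, as above, a value packet of n tuples and m evenly loaded timestamp partitions, the in-memory join work is
\[
\underbrace{n \times n = n^{2}}_{\text{EP: the packet hashes to one partition}}
\qquad\text{vs.}\qquad
\underbrace{m \cdot \left(\frac{n}{m}\right)^{2} = \frac{n^{2}}{m}}_{\text{TP: the packet is spread over } m \text{ partitions}},
\]
so timestamp partitioning reduces the in-memory cost of an explicit value packet by a factor of m.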

Finally, the timestamp sorting algorithms, i.e., TS, TS-H, TSI, TSI-H, ETS, and ETS-H, perform very well under explicit skew. TS and TS-H only use the timestamp to determine if a backup is needed. TSI and TSI-H only use the timestamp to determine if the cache tuples should be removed. We see the benefit of the secondary sorting on the timestamp in the algorithms ETS and ETS-H. Since these two algorithms define the value packet by both the explicit attribute and the timestamp, the big loop in the join phase is avoided.
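The effect of the secondary sort can be sketched as follows (illustrative Python; the field names and tuple representation are assumptions, not our implementation): ES and ES-H group tuples into value packets by the explicit attribute alone, whereas ETS and ETS-H require agreement on the timestamp as well, so duplicate explicit values with randomly distributed timestamps split into many small packets instead of one large one.

def value_packet_key_es(t):
    # ES / ES-H: every tuple with the same explicit join value falls into one
    # packet, so n duplicates force an n x n in-memory loop
    return t["explicit"]

def value_packet_key_ets(t):
    # ETS / ETS-H: the packet is keyed on the explicit value and the timestamp,
    # which keeps packets small when timestamps are random (representation assumed)
    return (t["explicit"], t["start"], t["end"])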

From this experiment, we conclude that when explicit dual skew is present, all the explicit algorithms are poor choices except for ETS and ETS-H. The effects of timestamp dual skew are examined next.

5.8 Timestamp dual skew

Like explicit dual skew, timestamp dual skew can affect the performance of the timestamp sort-merge join algorithms. We look into this effect.

We fixed main memory at 16 MB and input relations at 64 MB. We generated a series of relations with increasing timestamp chunky skew, from 0% to 4% in 1% increments. To ensure dual skew, we performed a self-join on these relations. Short-duration timestamps, randomly distributed over the relation lifespan, were used to mitigate the long-duration timestamp effect on the timestamp sorting algorithms. The explicit attribute values were also distributed randomly. The results are shown in Fig. 23. In this figure, the GRACE explicit sort-merge algorithms are represented by ES; all hybrid partitioning algorithms by EP-H, TSI and TSI-H by TSI, and TEP, TP, ETP, EP, ETS-H, and ES-H by EP; the remaining algorithms are retained.

Fig. 23. Timestamp attribute skew in both relations. [Plot omitted: elapsed time (secs) vs. timestamp skew in both relations (percentage)]

Fig. 24. Timestamp attribute skew in both relations. [Plot omitted: CPU time (secs) vs. timestamp skew in both relations (percentage)]

The algorithms fall into three groups. All the timestamp sort-merge algorithms exhibit poor performance. However, the performance of TS and TS-H is much better than that of TSI and TSI-H. At the highest skew, the performance of TS is 174 times better than that of TSI. This is due to the cache overflow in TSI. One percent of 64 MB is 640 KB, which is ten times the cache size. The interval join algorithm scans and purges the cache once for every tuple to be joined. Cache thrashing occurs when the cache overflows. As before, there is no cache overflow in TS and TS-H. The performance gap between these two algorithms and the group with flat curves is caused by in-memory join operations. The CPU time used by each algorithm is plotted separately in Fig. 24. In this figure, all the explicit sort-merge algorithms are represented by ES, all the explicit partitioning algorithms by EP, all the timestamp partitioning algorithms by TP, and TSI and TSI-H by TSI; the remaining algorithms are retained. Since all but the timestamp sort-merge algorithms perform the in-memory join by sorting the relations or the partitions on the explicit attribute, their performance is not at all affected by dual skew.
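The scan-and-purge step referred to above can be sketched as follows (a simplified illustration under the assumption of inputs sorted by start time and closed intervals; the tuple layout and names are ours, not the actual TSI code):

def probe_and_purge(cache, new_tuple, output):
    # Cached tuples whose intervals end before the new tuple starts can never
    # join with it or with any later tuple (inputs are sorted by start time),
    # so they are purged; surviving tuples are probed for overlap and, for the
    # temporal equijoin, for equality on the explicit attribute.
    survivors = []
    for cached in cache:
        if cached["end"] < new_tuple["start"]:
            continue                                    # expired: purge from the cache
        survivors.append(cached)
        overlaps = cached["start"] <= new_tuple["end"]  # end >= start already holds
        if overlaps and cached["explicit"] == new_tuple["explicit"]:
            output.append((cached, new_tuple))          # emit a result tuple
    return survivors

Because this pass touches every cached tuple for every input tuple, a cache that overflows to disk is scanned repeatedly, which is the thrashing observed here.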

Fig. 25. Explicit/timestamp attribute skew in both relations. [Plot omitted: elapsed time (secs) vs. explicit/timestamp skew in both relations (percentage)]

It is interesting that the CPU time spent by TSI is less than that spent by TS. The poor overall performance of TSI due to the cache overflow can be improved by increasing the cache size of TSI. Actually, TSI performs the join operation in the two caches rather than in the input buffers. Therefore, a large cache size can be chosen when dual skew is present to avoid cache thrashing. In this case, a 1-MB cache size for TSI will result in a performance similar to that of TS.
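The cache-size requirement implied by this discussion is straightforward to quantify: to avoid thrashing, the interval-join cache must be able to hold the largest timestamp value packet. With dual chunky skew of fraction s over relations of size |r|, that packet occupies roughly s · |r|, so for the setting above
\[
s \cdot |r| \;=\; 0.01 \times 64\ \mathrm{MB} \;=\; 640\ \mathrm{KB} \;=\; 10 \times \underbrace{64\ \mathrm{KB}}_{\text{TSI cache used here}},
\]
and the 1-MB cache suggested above therefore accommodates 1% dual skew with room to spare.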

5.9 Explicit/timestamp dual skew

In this section, we investigate the simultaneous effect of dual skew in both the explicit attribute and the timestamp. This is a challenging situation for any temporal join algorithm.

The memory size is 16 MB, and we generated a series of 64-MB relations with increasing explicit and timestamp chunky skew, from 0% to 4% in 1% increments. Dual skew was guaranteed by performing a self-join on these relations. The results are shown in Fig. 25. In this figure, TSI and TSI-H are represented by TSI, TS and ES by ES, TS-H and ES-H by TS-H, all the explicit partitioning algorithms by EP, and the remaining algorithms by TP.

The interesting point is that all the algorithms are affected by the simultaneous dual skew in both the explicit and timestamp attributes. But they fall into two groups. The algorithms that are sensitive to the dual skew in either the explicit attribute or the timestamp attribute perform as badly as they do in the experiments described in Sects. 5.7 and 5.8. The performance of the algorithms not affected by the dual skew in either the explicit attribute or the timestamp attribute degrades with increasing skew. However, their performance is better than that of the algorithms in the first group. This is due to the orthogonality of the explicit skew and the timestamp skew.

5.10 Summary

The performance study described in this section is the first comprehensive, empirical analysis of temporal join algorithms. We investigated the performance of 19 non-index-based join algorithms, namely, nested-loop (NL), explicit partitioning (EP and EP-H), explicit sorting (ES and ES-H), timestamp sorting (TS and TS-H), interval join (TSI and TSI-H), timestamp partitioning (TP and TP-H), combined explicit/timestamp sorting (ETS and ETS-H) and timestamp/explicit sorting (TES and TES-H), and combined explicit/timestamp partitioning (ETP and ETP-H) and timestamp/explicit partitioning (TEP and TEP-H) for the temporal equijoin. We varied the following main aspects in the experiments: the presence of long-duration timestamps, the relative sizes of the input relations, and the explicit-join and timestamp attribute distributions.

The findings of this empirical analysis can be summarized as follows.

• The algorithms need to be engineered well to avoid performance hits. Care needs to be taken in sorting, in purging the cache, in selecting the next tuple in the merge step, in allocating memory, and in handling intrinsic skew.

• Nested-loop is not competitive.
• The timestamp sorting algorithms, TS, TS-H, TES, TES-H, TSI, and TSI-H, were also not competitive. They were quite sensitive to the duration of input tuple timestamps. TSI and TSI-H had very poor performance in the presence of large amounts of skew due to cache overflow.

• The GRACE variants were competitive only when there was low selectivity and a large memory size relative to the size of the input relations. In all other cases, the hybrid variants performed better.

• In the absence of explicit and timestamp skew, our results parallel those from conventional query evaluation. In particular, when attribute distributions are random, all sorting and partitioning algorithms (other than those already eliminated as noncompetitive) have nearly equivalent performance, irrespective of the particular attribute type used for sorting or partitioning.

• In contrast with previous results in temporal coalescing [5], the binary nature of the valid-time equijoin allows an important optimization for partition-based algorithms. When one input relation is small relative to the available main memory buffer space, the partitioning algorithms have uniformly better performance than their sort-based counterparts.

• The choice of timestamp or explicit partitioning depends on the presence or absence of skew in either attribute dimension. Interestingly, the performance differences are dominated by main memory effects. The timestamp partitioning algorithms were less affected by increasing skew.

• ES and ES-H were sensitive to explicit dual skew.
• The performance of the partition-based algorithms, EP and EP-H, was affected by both outer and dual explicit attribute skew.
• The performance of TP and TP-H degraded when outer skew was present. Except for this one situation, these partition-based algorithms are generally more efficient than their sort-based counterparts since sorting, and associated main memory operations, are avoided.

• It is interesting that the combined explicit/timestamp-based algorithms can mitigate the effect of either explicit attribute skew or timestamp skew. However, when dual skew was present in the explicit attribute and the timestamp simultaneously, the performance of all the algorithms degraded, though again less so for timestamp partitioning.

6 Conclusions and research directions

As a prelude to investigating non-index-based temporal join evaluation, this paper initially surveyed previous work, first describing the different temporal join operations proposed in the past and then describing join algorithms proposed in previous work. The paper then developed evaluation strategies for the valid-time equijoin and compared the evaluation strategies in a sequence of empirical performance studies. The specific contributions are as follows.

• We defined a taxonomy of all temporal join operators proposed in previous research. The taxonomy is a natural one in the sense that it classifies the temporal join operators as extensions of conventional operators, irrespective of special joining attributes or other model-specific restrictions. The taxonomy is thus model independent and assigns a name to each temporal operator consistent with its extension of a conventional operator.

• We extended the three main paradigms of query evaluation algorithms to temporal databases, thereby defining the space of possible temporal evaluation algorithms.

• Using the taxonomy of temporal join algorithms, we defined 19 temporal equijoin algorithms, representing the space of all such possible algorithms, and placed all existing work into this framework.

• We defined the space of database parameters that affect the performance of the various join algorithms. This space is characterized by the distribution of the explicit and timestamp attributes in the input relation, the duration of timestamps in the input relations, the amount of main memory available to the join algorithm, the relative sizes of the input relations, and the amount of dual attribute and/or timestamp skew for each of the relations.

• We empirically compared the performance of the algorithms over this parameter space.

Our empirical study showed that some algorithms can be eliminated from further consideration: NL, TS, TS-H, TES, TES-H, ES, ES-H, EP, and EP-H. Hybrid variants generally dominated GRACE variants, eliminating ETP, TEP, and TP. When the relation sizes were different, explicit sorting (ETS, ETS-H, ES, ES-H) performed poorly.

This leaves three algorithms, all partitioning ones: ETP-H, TEP-H, and TP-H. Each dominates the other two in certain circumstances, but TP-H performs poorly in the presence of timestamp and attribute skew and is significantly more complicated to implement. Of the other two, ETP-H came out ahead more often than TEP-H. Thus we recommend ETP-H, a hybrid variant of explicit partitioning that partitions primarily by the explicit attribute. If this attribute is skewed so that some buckets do not fit in memory, a further partition on the timestamp attribute increases the possibility that the resulting buckets will fit in the available buffer space.

The salient point of this study is that simple modifications to an existing conventional evaluation algorithm (EP) can be used to effect temporal joins with acceptable performance and at relatively small development cost. While novel algorithms (such as TP-H) may have better performance in certain circumstances, well-understood technology can be easily adapted and will perform acceptably in many situations. Hence database vendors wishing to implement temporal join may do so with a relatively low development cost and still achieve acceptable performance.

The above conclusion focuses on independent join operations rather than a query consisting of several algebraic operations. Given the correlation between various operations, the latter is more complex. For example, one advantage of sort-merge algorithms is that the output is also sorted, which can be exploited in subsequent operations. This interesting order is used in traditional query optimization to reduce the cost of the whole query. We believe temporal query optimization can also take advantage of this [50]. Among the sort-merge algorithms we have examined, the output of the explicit algorithms (ES, ES-H, ETS, ETS-H) is sorted by the explicit join attribute; the interval join algorithms produce output sorted by the start timestamp. Of these six algorithms, we recommend ETS-H due to its higher efficiency.

Several directions for future work exist. Important problems remain to be addressed in temporal query processing, in particular with respect to temporal query optimization. While several researchers have investigated algebraic query optimization, little research has appeared with respect to cost-based temporal query optimization.

In relation to query evaluation, additional investigation of the algorithm space described in Sect. 5 is needed. Many optimizations originally developed for conventional databases, such as read-ahead and write-behind buffering, forecasting, eager and lazy evaluation, and hash filtering, should be applied and investigated. Cache size and input buffer allocation tuning is also an interesting issue.

All of our partitioning algorithms generate maximal partitions for the left-hand relation of the join, each of the main memory size minus a few blocks, and then apply that partitioning to the right-hand relation. In the join step, a full left-hand partition is brought into main memory and joined with successive blocks from the associated right-hand partition. Sitzmann and Stuckey term this a static buffer allocation strategy and instead advocate a dynamic buffer allocation strategy in which the left-hand and right-hand relations are partitioned in one step, so that two partitions, one from each relation, can simultaneously fit in the main memory buffer [49]. The advantage over the static strategy is that fewer seeks are required to read the right-hand-side partition; the disadvantage is that this strategy results in smaller, and thus more numerous, partitions and requires that the right-hand side also be sampled, both of which increase the number of seeks. It might be useful to augment the timestamp partitioning to incorporate dynamic buffer allocation, though it is not clear at the outset that this will yield a performance benefit over our TP-H algorithm or over ETP-H.
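For concreteness, here is a minimal sketch of the static buffer allocation strategy just described (the function, its parameters, and the page sizes in the example are illustrative assumptions, not our implementation): the left-hand relation is split into the fewest partitions such that each one fits in main memory minus a few blocks, and the resulting partition boundaries are then reused for the right-hand relation.

import math

def static_partition_count(left_pages, memory_pages, reserved_blocks=2):
    # Static strategy: each left-hand partition is sized to fill main memory
    # minus a few blocks kept for reading the right-hand partition and writing
    # output; the right-hand relation is partitioned with the same boundaries.
    usable = memory_pages - reserved_blocks
    return max(1, math.ceil(left_pages / usable))

# Example: a 64-MB left-hand relation and a 16-MB buffer with 8-KB pages
print(static_partition_count(left_pages=8192, memory_pages=2048))   # -> 5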

Dynamic buffer allocation for conventional joins was first proposed by Harris and Ramamohanarao [22]. They built the cost model for nested-loop and hash join algorithms with the size of the buffers as one of the parameters. Then for each algorithm they computed the optimal, or suboptimal but still good, buffer allocation that led to the minimum join cost. Finally, the optimal buffer allocation was used to perform the join. It would be interesting to see if this strategy can improve the performance of temporal joins. It would also be useful to develop cost models for the most promising temporal join algorithm(s), starting with ETP-H.

The next logical progression in future work is to extend this work to index-based temporal joins, again investigating the effectiveness of both explicit attribute indexing and timestamp indexing. While a large number of timestamp indexes have been proposed in the literature [44] and there has been some work on temporal joins that use temporal or spatial indexes [13,33,52,56], a comprehensive empirical comparison of these algorithms is needed.

Orthogonally, more sophisticated techniques for temporal database implementation should be considered. In particular, we expect specialized temporal database architectures to have a significant impact on query processing efficiency. It has been argued in previous work that incremental query evaluation is especially appropriate for temporal databases [24,34,41]. In this approach, a query result is materialized and stored back into the database if it is anticipated that the same query, or one similar to it, will be issued in the future. Updates to the contributing relations trigger corresponding updates to the stored result. The related topic of global query optimization, which attempts to exploit commonality between multiple queries when formulating a query execution plan, also has yet to be explored in a temporal setting.

Acknowledgements. This work was sponsored in part by National Science Foundation Grants IIS-0100436, CDA-9500991, EAE-0080123, IRI-9632569, and IIS-9817798, by the NSF Research Infrastructure Program Grants EIA-0080123 and EIA-9500991, by the Danish National Centre for IT-Research, and by grants from Amazon.com, the Boeing Corporation, and the Nykredit Corporation.

We also thank Wei Li and Joseph Dunn for their help in implementing the temporal join algorithms.

References

1. Allen JF (1983) Maintaining knowledge about temporal intervals. Commun ACM 26(11):832–843
2. Arge L, Procopiuc O, Ramaswamy S, Suel T, Vitter JS (1998) Scalable sweeping-based spatial join. In: Proceedings of the international conference on very large databases, New York, 24–27 August 1998, pp 570–581
3. Beckmann N, Kriegel HP, Schneider R, Seeger B (1990) The R∗-tree: an efficient and robust access method for points and rectangles. In: Proceedings of the ACM SIGMOD conference, Atlantic City, NJ, 23–25 May 1990, pp 322–331
4. van den Bercken J, Seeger B (1996) Query processing techniques for multiversion access methods. In: Proceedings of the international conference on very large databases, Mumbai (Bombay), India, 3–6 September 1996, pp 168–179
5. Bohlen MH, Snodgrass RT, Soo MD (1997) Temporal coalescing. In: Proceedings of the international conference on very large databases, Athens, Greece, 25–29 August 1997, pp 180–191
6. Clifford J, Croker A (1987) The historical relational data model (HRDM) and algebra based on lifespans. In: Proceedings of the international conference on data engineering, Los Angeles, 3–5 February 1987, pp 528–537. IEEE Press, New York
7. Clifford J, Croker A (1993) The historical relational data model (HRDM) revisited. In: Tansel A, Clifford J, Gadia S, Jajodia S, Segev A, Snodgrass RT (eds) Temporal databases: theory, design, and implementation, ch 1. Benjamin/Cummings, Reading, MA, pp 6–27
8. Clifford J, Uz Tansel A (1985) On an algebra for historical relational databases: two views. In: Proceedings of the ACM SIGMOD international conference on management of data, Austin, TX, 28–31 May 1985, pp 1–8
9. DeWitt DJ, Katz RH, Olken F, Shapiro LD, Stonebraker MR, Wood D (1984) Implementation techniques for main memory database systems. In: Proceedings of the ACM SIGMOD international conference on management of data, Boston, 18–21 June 1984, pp 1–8
10. Dittrich JP, Seeger B, Taylor DS, Widmayer P (2002) Progressive merge join: a generic and non-blocking sort-based join algorithm. In: Proceedings of the conference on very large databases, Madison, WI, 3–6 June 2002, pp 299–310
11. Dunn J, Davey S, Descour A, Snodgrass RT (2002) Sequenced subset operators: definition and implementation. In: Proceedings of the IEEE international conference on data engineering, San Jose, 26 February–1 March 2002, pp 81–92
12. Dyreson CE, Snodgrass RT (1993) Timestamp semantics and representation. Inform Sys 18(3):143–166
13. Elmasri R, Wuu GTJ, Kim YJ (1990) The time index: an access structure for temporal data. In: Proceedings of the conference on very large databases, Brisbane, Queensland, Australia, 13–16 August 1990, pp 1–12
14. Etzion O, Jajodia S, Sripada S (1998) Temporal databases: research and practice. Lecture notes in computer science, vol 1399. Springer, Berlin Heidelberg New York
15. Gadia SK (1988) A homogeneous relational model and query languages for temporal databases. ACM Trans Database Sys 13(4):418–448
16. Gao D, Jensen CS, Snodgrass RT, Soo MD (2002) Join operations in temporal databases. TimeCenter TR-71. http://www.cs.auc.dk/TimeCenter/pub.htm
17. Gao D, Kline N, Soo MD, Dunn J (2002) TimeIT: the Time integrated testbed, v. 2.0. Available via anonymous FTP at: ftp.cs.arizona.edu
18. Graefe G (1993) Query evaluation techniques for large databases. ACM Comput Surv 25(2):73–170
19. Graefe G, Linville A, Shapiro LD (1994) Sort vs. hash revisited. IEEE Trans Knowl Data Eng 6(6):934–944
20. Gunadhi H, Segev A (1991) Query processing algorithms for temporal intersection joins. In: Proceedings of the IEEE conference on data engineering, Kobe, Japan, 8–12 April 1991, pp 336–344
21. Guttman A (1984) R-trees: a dynamic index structure for spatial searching. In: Proceedings of the ACM SIGMOD conference, Boston, 18–21 June 1984, pp 47–57
22. Harris EP, Ramamohanarao K (1996) Join algorithm costs revisited. J Very Large Databases 5(1):64–84
23. Jensen CS (ed) (1998) The consensus glossary of temporal database concepts – February 1998 version. In [14], pp 367–405
24. Jensen CS, Mark L, Roussopoulos N (1991) Incremental implementation model for relational databases with transaction time. IEEE Trans Knowl Data Eng 3(4):461–473
25. Jensen CS, Snodgrass RT, Soo MD (1996) Extending existing dependency theory to temporal databases. IEEE Trans Knowl Data Eng 8(4):563–582
26. Jensen CS, Soo MD, Snodgrass RT (1994) Unifying temporal models via a conceptual model. Inform Sys 19(7):513–547
27. Leung TY, Muntz R (1990) Query processing for temporal databases. In: Proceedings of the IEEE conference on data engineering, Los Angeles, 6–10 February 1990, pp 200–208
28. Leung TYC, Muntz RR (1992) Temporal query processing and optimization in multiprocessor database machines. In: Proceedings of the conference on very large databases, Vancouver, BC, Canada, pp 383–394
29. Leung TYC, Muntz RR (1993) Stream processing: temporal query processing and optimization. In: Tansel A, Clifford J, Gadia S, Jajodia S, Segev A, Snodgrass RT (eds) Temporal databases: theory, design, and implementation, ch 14. Benjamin/Cummings, Reading, MA, pp 329–355
30. Li W, Gao D, Snodgrass RT (2002) Skew handling techniques in sort-merge join. In: Proceedings of the ACM SIGMOD conference on management of data, Madison, WI, 3–6 June 2002, pp 169–180
31. Lo ML, Ravishankar CV (1994) Spatial joins using seeded trees. In: Proceedings of the ACM SIGMOD conference, Minneapolis, MN, 24–27 May 1994, pp 209–220
32. Lo ML, Ravishankar CV (1996) Spatial hash-joins. In: Proceedings of the ACM SIGMOD conference, Montreal, 4–6 June 1996, pp 247–258
33. Lu H, Ooi BC, Tan KL (1994) On spatially partitioned temporal join. In: Proceedings of the conference on very large databases, Santiago de Chile, Chile, 12–15 September 1994, pp 546–557
34. McKenzie E (1988) An algebraic language for query and update of temporal databases. Ph.D. dissertation, Department of Computer Science, University of North Carolina, Chapel Hill, NC
35. Mishra P, Eich M (1992) Join processing in relational databases. ACM Comput Surv 24(1):63–113
36. Navathe S, Ahmed R (1993) Temporal extensions to the relational model and SQL. In: Tansel A, Clifford J, Gadia S, Jajodia S, Segev A, Snodgrass RT (eds) Temporal databases: theory, design, and implementation. Benjamin/Cummings, Reading, MA, pp 92–109
37. Orenstein JA (1986) Spatial query processing in an object-oriented database system. In: Proceedings of the ACM SIGMOD conference, Washington, DC, 28–30 May 1986, pp 326–336
38. Orenstein JA, Manola FA (1988) PROBE spatial data modeling and query processing in an image database application. IEEE Trans Software Eng 14(5):611–629
39. Ozsoyoglu G, Snodgrass RT (1995) Temporal and real-time databases: a survey. IEEE Trans Knowl Data Eng 7(4):513–532
40. Patel JM, DeWitt DJ (1996) Partition based spatial-merge join. In: Proceedings of the ACM SIGMOD conference, Montreal, 4–6 June 1996, pp 259–270
41. Pfoser D, Jensen CS (1999) Incremental join of time-oriented data. In: Proceedings of the international conference on scientific and statistical database management, Cleveland, OH, 28–30 July 1999, pp 232–243
42. Ramakrishnan R, Gehrke J (2000) Database management systems. McGraw-Hill, New York
43. Rana S, Fotouhi F (1993) Efficient processing of time-joins in temporal data bases. In: Proceedings of the international symposium on DB systems for advanced applications, Daejeon, South Korea, 6–8 April 1993, pp 427–432
44. Salzberg B, Tsotras VJ (1999) Comparison of access methods for time-evolving data. ACM Comput Surv 31(2):158–221
45. Samet H (1990) The design and analysis of spatial data structures. Addison-Wesley, Reading, MA
46. Segev A (1993) Join processing and optimization in temporal relational databases. In: Tansel A, Clifford J, Gadia S, Jajodia S, Segev A, Snodgrass RT (eds) Temporal databases: theory, design, and implementation, ch 15. Benjamin/Cummings, Reading, MA, pp 356–387
47. Segev A, Gunadhi H (1989) Event-join optimization in temporal relational databases. In: Proceedings of the conference on very large databases, Amsterdam, 22–25 August 1989, pp 205–215
48. Sellis T, Roussopoulos N, Faloutsos C (1987) The R+-tree: a dynamic index for multidimensional objects. In: Proceedings of the conference on very large databases, Brighton, UK, 1–4 September 1987, pp 507–518
49. Sitzmann I, Stuckey PJ (2000) Improving temporal joins using histograms. In: Proceedings of the international conference on database and expert systems applications, London/Greenwich, UK, 4–8 September 2000, pp 488–498
50. Slivinskas G, Jensen CS, Snodgrass RT (2001) A foundation for conventional and temporal query optimization addressing duplicates and ordering. Trans Knowl Data Eng 13(1):21–49
51. Snodgrass RT, Ahn I (1986) Temporal databases. IEEE Comput 19(9):35–42
52. Son D, Elmasri R (1996) Efficient temporal join processing using time index. In: Proceedings of the conference on statistical and scientific database management, Stockholm, Sweden, 18–20 June 1996, pp 252–261
53. Soo MD, Jensen CS, Snodgrass RT (1995) An algebra for TSQL2. In: Snodgrass RT (ed) The TSQL2 temporal query language, ch 27. Kluwer, Amsterdam, pp 505–546
54. Soo MD, Snodgrass RT, Jensen CS (1994) Efficient evaluation of the valid-time natural join. In: Proceedings of the international conference on data engineering, Houston, TX, 14–18 February 1994, pp 282–292
55. Tsotras VJ, Kumar A (1996) Temporal database bibliography update. ACM SIGMOD Rec 25(1):41–51
56. Zhang D, Tsotras VJ, Seeger B (2002) Efficient temporal join processing using indices. In: Proceedings of the IEEE international conference on data engineering, San Jose, 26 February–1 March 2002, pp 103–113
57. Zurek T (1997) Optimisation of partitioned temporal joins. Ph.D. dissertation, Department of Computer Science, Edinburgh University, Edinburgh, UK