Snapshot Isolation and Integrity Constraints in Replicated Databases

Yi Lin1, Bettina Kemme1

School of Computer Science, McGill University, Canada

[email protected], [email protected]

Ricardo Jiménez-Peris2, Marta Patiño-Martínez2

Facultad de Informática, Universidad Politécnica de Madrid, Spain

(rjimenez, mpatino)@fi.upm.es

and

José Enrique Armendáriz-Íñigo3

Departamento de Ingeniería Matemática e Informática, Universidad Pública de Navarra, Spain

[email protected]

Database replication is widely used for fault-tolerance and performance. However, it requires replica control to keep data copies consistent despite updates. The traditional correctness criterion for the concurrent execution of transactions in a replicated database is 1-copy-serializability. It is based on serializability, the strongest isolation level in a non-replicated system. In recent years, however, snapshot isolation (SI), a slightly weaker isolation level, has become popular in commercial database systems. There exist already several replica control protocols that provide SI in a replicated system. However, most of the correctness reasoning for these protocols has been rather informal. Additionally, most of the work so far ignores the issue of integrity constraints. In this paper, we provide a formal definition of 1-copy-SI using and extending a well-established definition of SI in a non-replicated system. Our definition considers integrity constraints in a way that conforms to the way integrity constraints are handled in commercial systems. We discuss a set of necessary and sufficient conditions for a replicated history to be producible under 1-copy-SI. This makes our formalism a convenient tool to prove the correctness of replica control algorithms.

Categories and Subject Descriptors: H [Information Systems]; H.2 [Database Management]; H.2.4 [Systems]: Distributed databases

General Terms: Theory, Verification, Reliability

Additional Key Words and Phrases: Replication, Snapshot Isolation, Integrity Constraints

1 This work was partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) under its Discovery Grants Program.

2 This work was supported in part by the Spanish National Science Foundation (MEC) (grant TIN2007-67353-C02), Madrid Regional Research Council under the Autonomic project (grant S-0505/TIC/000285) and the EU Commission under the NEXOF-RA project (grant FP7-216446).

3 This work has been partially supported by the Spanish MEC and EU FEDER under grant TIN2006-14738-C02 and IMPIVA under grant IMAETB/2007/30.

Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee.

© 2009 ACM 0362-5915/2009/0300-0001 $5.00

ACM Transactions on Database Systems, Vol. V, No. N, February 2009, Pages 1–54.

1. INTRODUCTION

Database systems are an important component in current information systems architectures. In these multi-tier architectures, the database system forms the back-end tier that provides persistence and transactional properties. With businesses increasingly providing their clients and business partners online access to their services, and with the emergence of web-service standards, these information systems face immense scalability issues. Often, the database is a bottleneck, and the only commercial solution to achieve scalability is to buy expensive parallel database software. A cheaper alternative is database replication. In this case, the database system is installed on a cluster of machines each holding a copy of the database. Typically, a ROWA (read-one-write-all) approach is used. A read access can be executed by any replica while writes have to be performed by all replicas.

Database Replication. In recent years, many cluster-based replication solutions have been proposed (e.g., [Carey and Livny 1991; Chundi et al. 1996; Breitbart et al. 1999; Pacitti et al. 1999; Kemme and Alonso 2000; Pedone et al. 2003; Amza et al. 2003; Holliday et al. 2003; Plattner and Alonso 2004; Cecchet et al. 2004; Patiño-Martínez et al. 2005; Lin et al. 2005; Plattner et al. 2008]) that have been shown to provide excellent scalability for transactional workloads. Some of them integrate replica control directly into the database kernel. The clients connect to any of the database replicas and submit their requests as if it were a non-replicated database system. Other solutions implement the replication logic in a middleware layer between the client and the database replicas. The middleware provides a standard database interface such as JDBC, and controls where reads and writes are executed.

Correctness in Replicated Databases. Many of the solutions assume that the underlying database system provides the isolation level serializability using strict two-phase locking (2PL). Based on the locking mechanisms of the database system, the replication module guarantees 1-copy-serializability at the global level, i.e., the execution in the replicated environment is equivalent to a serial execution over a logical single copy of the database.

Recently, Snapshot Isolation (SI) has emerged as a new isolation level [Berenson et al. 1995]. SI is slightly weaker than serializability and has become quite popular. It requires that transactions read data from a snapshot committed at the time point when they start. Furthermore, if two transactions want to update the same data item concurrently, one will be aborted. SI has been adopted by many database vendors such as Oracle, PostgreSQL, Interbase 4 and Microsoft SQL Server 2005. Implementations of SI allow for more concurrency than strict 2PL, the standard mechanism to achieve serializability, since read operations read from a snapshot and do not need to set locks. SI avoids all isolation anomalies as defined by the industrial ANSI standard [ANSI X3.135-1992 1992]. However, it does not provide serializability as defined in the research literature. Berenson et al. [1995] provide an adjusted set of anomalies, and show that SI exhibits some anomalies that cannot occur under their definition of serializability.

Given the popularity of SI, it makes sense for a replicated database to provide what we call 1-copy-SI, meaning that the execution in the replicated environment is equivalent to an execution over a logical single copy of the database that is possible under SI. Indeed, several replica control protocols have been proposed that provide SI at the global level (e.g., [Plattner and Alonso 2004; Plattner et al. 2008; Lin et al. 2005; Elnikety et al. 2005; Wu and Kemme 2005; Daudjee and Salem 2006]). Often, however, correctness reasoning is rather informal.

Integrity Constraints. While SI and its relationship to serializability have been discussed in depth in the research literature [Adya 1999; Berenson et al. 1995; Fekete et al. 2005], its behavior in regard to integrity constraints is not well defined. As pointed out by Berenson et al. [1995], if all operations follow SI semantics, integrity constraints such as foreign key constraints are easily violated. Clearly, commercial database systems maintain integrity constraints even if they are based on SI. That is, they actually implement an isolation level that is stronger than SI but weaker than serializability. We are not aware of any work that formalizes this behavior. Adya [1999] discusses integrity constraints and their integration with SI. However, the author proposes to use the serializable isolation level for update transactions and SI only for read-only transactions. This is stricter than the isolation level implemented in commercial systems. In regard to replication tools, integrity constraints are generally ignored, and it is not clear whether the systems can handle them. Some might handle them, others not. However, in order to judge whether correctness is given, we need a way to express when an execution in a replicated environment provides SI at the global level and at the same time does not violate any integrity constraints.

Contribution of the Paper. This paper proposes a framework that allows us to reason about SI and integrity constraints in a replicated environment. Our framework is based on the General Isolation Definition (GID) introduced in [Adya 1999; Adya et al. 2000]. GID is a very powerful tool and allows the definition of isolation levels in an implementation-independent manner. In particular, Adya [1999] defines SI using GID. We extend this definition and the GID framework to reason about correctness in a replicated environment. We define 1-copy-SI as a correctness level in a replicated system. Furthermore, we introduce integrity constraints and define an isolation level SI+IC that corresponds to the isolation level implemented in commercial systems. We extend this isolation level to 1-copy-SI+IC to be used in a replicated environment. We present conditions that help to decide whether a replicated history conforms to 1-copy-SI or 1-copy-SI+IC. In particular, we show that in order to be 1-copy-SI/1-copy-SI+IC a history must avoid certain cycles in its dependency graph. Our formalism is a convenient tool to prove the correctness of a given replica control algorithm. We present three example protocols and show that two provide 1-copy-SI+IC while one only provides 1-copy-SI.

The remainder of this paper is structured as follows. In Section 2, we present GID as introduced in [Adya 1999] to reason about SI. In Section 3, we define 1-copy-SI based on GID and give some necessary and sufficient conditions for a replicated execution to be 1-copy-SI. In Section 4, we extend the formalism to express integrity constraints (ICs) and define SI+IC as a new isolation level. In Section 5, we derive 1-copy-SI+IC which provides SI and proper handling of integrity constraints in a replicated environment. In Section 6, we describe several example replica control protocols and prove their correctness. Section 7 deals with replica failures. Section 8 presents related work and Section 9 concludes the paper.

2. SNAPSHOT ISOLATION (SI)

Berenson et al. [1995] define SI by two properties. Snapshot-Read requires that a transaction $T$ reads data from a snapshot which contains all updates committed before $T$ starts (plus its own updates). Snapshot-Read is typically implemented via a multi-version system where read operations access previously committed versions. Snapshot-Write requires that no two concurrent transactions may write the same object. That is, if two concurrent transactions both want to write the same data item, only one of them will be allowed to commit. Conflict detection for Snapshot-Write can be implemented via locking or via validation.

Our correctness reasoning is based on the formalism introduced in [Adya 1999; Adya et al. 2000], denoted as General Isolation Definition (GID). In his thesis [Adya 1999], Adya defines GID and uses it to reason about various isolation levels in a non-replicated environment, including snapshot isolation. In the remainder of this section, we present GID for snapshot isolation, slightly adjusted to our needs.

2.1 General Isolation Definition (GID)

2.1.1 Data Items and Transactions. A data item (object) $x$ of the database has a lifetime from its initial unborn version, $x_{init}$, to its dead version, $x_{dead}$, created by a transaction deleting $x$. A transaction $T_i$ starts with a start operation $s_i$, then contains a sequence of read and write operations, and terminates with a commit operation (i.e., $c_i$) or an abort operation (i.e., $a_i$). A transaction $T_i$ creates a version $x_i$ of object $x$ by performing a write operation $w_i(x_i)$. If $T_i$ reads $x$ it reads a specific version $x_j$, denoted as $r_i(x_j)$. Reads cannot read unborn or dead versions. If $T_i$ writes $x$, the version $x_i$ becomes a committed version at the time $T_i$ commits. We also say that $T_i$ installs $x_i$ at commit time. Before the commit, $x_i$ is a tentative version. If $T_i$ aborts, $x_i$ becomes an aborted version that is no longer visible. For simplicity, we assume $T_i$ does not read or write the same object twice, and if it reads and writes an object, it performs the read before the write$^4$.

2.1.2 Transaction Histories. Execution is described through histories.

Definition 2.1. History. Let $\mathcal{T}$ be a set of transactions. A history $H$ over $\mathcal{T}$ describes the execution of all transactions in $\mathcal{T}$ and consists of two parts.

(1) It describes a partial order$^5$, called time-precedes order $\prec_t$, over operations of transactions of $\mathcal{T}$ with the following properties:

    (a) Each transaction in $\mathcal{T}$ has a start, and either a commit or an abort operation in $H$. $H$ contains all operations of committed transactions. For aborted transactions some of the read or write operations might be missing.

    (b) $H$ includes the order in which operations within a transaction are executed. That is, for any two operations $o_{ij}$ and $o_{ik}$ of $T_i \in \mathcal{T}$, if $o_{ij}$ happens before $o_{ik}$ in the execution, then $o_{ij} \prec_t o_{ik}$. In particular $s_i \prec_t c_i$.

    (c) If $w_i(x_i)$ and $r_j(x_i)$ occur in $H$, then $w_i(x_i) \prec_t r_j(x_i)$.

    (d) For any two committed transactions $T_i$ and $T_j$: either $c_i \prec_t s_j$ or $s_j \prec_t c_i$.

(2) $H$ includes a version order $\ll$. For each object $x$ there exists a total order on the committed versions of $x$. $x_{init}$ is the unborn version, and $x_{dead}$ (if existing) is the last version.

$^4$ Extending to multiple writes on an object or to write-then-read relationships is conceptually very simple but makes the notation and descriptions more cumbersome.

$^5$ Partial order in this paper refers to an order $<$ with irreflexivity (i.e., $\neg(a < a)$) and transitivity (i.e., $(a < b) \wedge (b < c) \Rightarrow (a < c)$).

The description is very flexible and does not consider any isolation level. Different isolation levels are then defined by putting specific restrictions on the possible histories. For convenience, we present a history $H$ as a sequence of operations (i.e., start, read, write, commit, abort) with a total order (from left to right) consistent with $\prec_t$. For example, consider the history $H_{example}$:

$H_{example}$: $s_1\ s_2\ w_1(x_1)\ r_2(x_1)\ w_2(x_2)\ w_2(y_2)\ c_1\ c_2\ s_3\ r_3(x_1)\ c_3\ s_4\ w_4(y_4)\ a_4 \quad [x_2 \ll x_1]$

In this history, $T_2$ reads version $x_1$ although it is not yet committed. $x_2$ is ordered before $x_1$ in the version order, although in $\prec_t$, $w_1(x_1)$ is ordered before $w_2(x_2)$, and $c_1$ is ordered before $c_2$. This shows that, in general, the version order is independent of the execution or commit order. Furthermore, $T_3$ reads $x_1$ although $x_2$ was installed later. Finally, $y_4$ is not considered in the version order since it was created by an aborted transaction. Clearly, this history is not SI since it violates both Snapshot-Read ($T_2$ reads a data version that was not committed before $T_2$ started) and Snapshot-Write ($T_1$ and $T_2$ are concurrent, write the same object, and both commit). Those familiar with traditional serializability theory [Bernstein et al. 1987] will easily see that the history is actually serializable.
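The compact history notation lends itself to mechanical checks. The following Python sketch is an illustration only and not part of the GID formalism: the names `parse_history` and `uncommitted_reads`, the tuple encoding, and the restriction to single-letter object names are assumptions made here. It parses the operation sequence of $H_{example}$ and reports every read of a version whose writer had not yet committed at the time of the read.

```python
import re

# Token pattern for the compact notation: s1, c1, a4, w1(x1), r2(x1).
# Assumption: single-letter object names and numeric transaction ids.
TOKEN = re.compile(r"([swrca])(\d+)(?:\(([a-z])(\d+)\))?")

def parse_history(text):
    """Return (kind, txn, obj, version_writer) tuples in left-to-right order."""
    ops = []
    for m in TOKEN.finditer(text):
        kind, txn, obj, ver = m.groups()
        ops.append((kind, int(txn), obj, int(ver) if ver is not None else None))
    return ops

def uncommitted_reads(ops):
    """List (reader, obj, writer) for reads of versions not yet committed."""
    commit_pos = {txn: i for i, (kind, txn, _, _) in enumerate(ops) if kind == "c"}
    dirty = []
    for i, (kind, txn, obj, writer) in enumerate(ops):
        # A writer that never commits is treated as committing "after the end".
        if kind == "r" and commit_pos.get(writer, len(ops)) > i:
            dirty.append((txn, obj, writer))
    return dirty

H_EXAMPLE = "s1 s2 w1(x1) r2(x1)w2(x2)w2(y2) c1 c2 s3 r3(x1) c3 s4 w4(y4) a4"
ops = parse_history(H_EXAMPLE)
print(uncommitted_reads(ops))  # [(2, 'x', 1)]
```

The single reported read is $r_2(x_1)$, which matches the observation that $T_2$ reads $x_1$ before $T_1$ commits.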

In the following, our example histories often do not start with an empty database but assume that before the history $H$ over a set of transactions $\mathcal{T}$ started, a transaction $T_0$ committed and created some data versions. We assume that if $T_0$ wrote object version $x_0$, then $x_0 \ll x_i$ for any transaction $T_i \in \mathcal{T}$ that writes $x$.

2.1.3 Predicates. A database query often accesses an entire set of data items and performs a predicate evaluation. In the context of this paper we are only interested in predicate reads$^6$. Adya [1999] introduces a predicate evaluation as a special read operation. We slightly enrich the formalism of [Adya 1999] to better serve our needs. A transaction $T_i$ can have a predicate read operation $r_i(F : P : Oset(P) : Iset(P))$. $P$ is a function over a set of relations defining a predicate. $Iset(P)$ contains a version for each data item of each relation specified in $P$. This can include unborn and dead versions. $P$ takes $Iset(P)$ as input and returns the versions $Oset(P) \subseteq Iset(P)$ that match the predicate. Unborn and dead versions cannot be in the return set. Function $F$ takes $Oset(P)$ as input and returns the outcome of the query. Predicate read operations are added to the history just as normal read or write operations.

For instance, assume a relation D(did, location) with two data items $d1$ and $d2$. A transaction $T_0$ has already created version $d1_0$ = ('d1', 'Chicago') while $d2$ still only has its unborn version $d2_{init}$. If a query of transaction $T_1$ now asks for all departments in Chicago (e.g., select * from D where location = 'Chicago') we can write this as a predicate read $r_1(\text{select} : \text{D.location} = \text{Chicago} : \{d1_0\} : \{d1_0, d2_{init}\})$. For simplicity, this notation does not indicate the function $P$ but only the predicate defined by $P$. The predicate is evaluated over each of the data versions $Iset(P) = \{d1_0, d2_{init}\}$. $d1_0$ is the only matching tuple in $Oset(P)$. $F$ simply returns this tuple as outcome of the query. If a query only returns the number of departments in Chicago (e.g., select count(*) from D where location = 'Chicago') then $F$ returns the value "1" as outcome of the query.

$^6$ Predicate writes can be described in a similar way and are omitted for space reasons.

Fig. 1. $SSG(H_{example})$ [graph over the nodes $T_1$, $T_2$, $T_3$; drawing not recoverable from the extraction]
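The department example can be replayed in a few lines of code. This is a minimal sketch under assumptions of our own: the `Version` record, the `state` field used to mark unborn versions, and the helper `predicate_read` are hypothetical names chosen for illustration and do not appear in Adya's formalism.

```python
from typing import NamedTuple, Optional, Tuple

class Version(NamedTuple):
    item: str                          # data item name, e.g. "d1"
    state: str                         # "unborn", "dead", or "live"
    value: Optional[Tuple[str, str]]   # (did, location) for live versions

def predicate_read(F, P, iset):
    """Evaluate P over Iset(P), build Oset(P), then apply F to Oset(P)."""
    # Unborn and dead versions can be in Iset(P) but never in Oset(P).
    oset = [v for v in iset if v.state == "live" and P(v.value)]
    return F(oset), oset

d1_0 = Version("d1", "live", ("d1", "Chicago"))   # created by T0
d2_init = Version("d2", "unborn", None)           # d2 not yet inserted
iset = [d1_0, d2_init]

def in_chicago(row):
    return row[1] == "Chicago"

def select_all(oset):
    return [v.value for v in oset]

def count_all(oset):
    return len(oset)

rows, oset = predicate_read(select_all, in_chicago, iset)
count, _ = predicate_read(count_all, in_chicago, iset)
print(rows, count)  # [('d1', 'Chicago')] 1
```

Both queries share the same $P$, $Iset(P)$ and $Oset(P)$ and differ only in $F$, which is why the formalism keeps $F$ separate from the predicate.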

2.1.4 Serialization Graph. GID uses data-flow graphs to reason about the properties of a history. In this paper, we are interested in the Start-ordered Serialization Graph (SSG). It records dependencies between two committed transactions of a given history $H$ over $\mathcal{T}$. $T_j$ start-depends on $T_i$ if $T_i$ commits before $T_j$ starts in the time-precedes order. $T_j$ directly write-depends on $T_i$ if both write a common data item $x$ and $x_i$ and $x_j$ are consecutive versions of $x$ in $H$'s version order. $T_j$ directly read-depends on $T_i$ if $T_i$ installs some object version $x_i$ and $T_j$ accesses $x_i$ in its read operation (i.e., $r_j(x_i)$, or $r_j(F : P : Oset(P) : Iset(P))$ with $x_i \in Iset(P)$). $T_j$ directly anti-depends on $T_i$ if $T_i$ accesses an object version $x_k$ in a standard or predicate read operation and $T_j$ creates $x$'s next version $x_j$ in the version order$^7$.

Definition 2.2. Start-ordered Serialization Graph (SSG). The $SSG(H)$ of a history $H$ over a set of transactions $\mathcal{T}$ is a directed graph where each node in $SSG(H)$ corresponds to a committed transaction in $H$, and there is a write-, read-, anti- or start-dependency edge from $T_i$ to $T_j$ iff $T_j$ directly write-, directly read-, directly anti-, or start-depends on $T_i$, respectively.

The dependency definitions and edges are summarized in Table I, and Figure 1 shows the $SSG(H_{example})$ of the above example history. Since $T_4$ aborts it is not contained in the graph. In the following we refer to write-, read-, and anti-dependency edges also as ww-, wr- and rw-dependency edges, respectively. The particular data item $x$ that leads to a dependency usually does not need to be considered. But if it does, we say that the dependency or the dependency edge is due to data item $x$. Note that a dependency edge can be due to several data items.
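The four dependency types can be computed directly from a history and its version order. The sketch below is one possible encoding, not the authors' implementation: the tuple format for operations, the dictionary `version_order`, and the function name `build_ssg` are assumptions, and predicate reads are left out for brevity.

```python
def build_ssg(history, version_order):
    """Return the SSG as a set of (from_txn, to_txn, label) edges.

    history: list of ("s", i), ("c", i), ("a", i), ("w", i, obj),
             ("r", i, obj, j) where j is the writer of the version read.
    version_order: obj -> list of writers of committed versions, in order.
    """
    pos = {(op[0], op[1]): k for k, op in enumerate(history) if op[0] in ("s", "c")}
    committed = {t for (kind, t) in pos if kind == "c"}
    edges = set()
    # Start-dependency: Ti commits before Tj starts.
    for i in committed:
        for j in committed:
            if i != j and pos[("c", i)] < pos[("s", j)]:
                edges.add((i, j, "s"))
    # Write-dependency: consecutive versions in the version order.
    for writers in version_order.values():
        for a, b in zip(writers, writers[1:]):
            if a in committed and b in committed:
                edges.add((a, b, "ww"))
    for op in history:
        if op[0] != "r" or op[1] not in committed:
            continue
        _, reader, obj, writer = op
        # Read-dependency: reader accessed a version installed by writer.
        if writer in committed and writer != reader:
            edges.add((writer, reader, "wr"))
        # Anti-dependency: some transaction installs the next version.
        writers = version_order.get(obj, [])
        if writer in writers:
            nxt = writers.index(writer) + 1
            if nxt < len(writers) and writers[nxt] in committed and writers[nxt] != reader:
                edges.add((reader, writers[nxt], "rw"))
    return edges

H_EXAMPLE = [("s", 1), ("s", 2), ("w", 1, "x"), ("r", 2, "x", 1), ("w", 2, "x"),
             ("w", 2, "y"), ("c", 1), ("c", 2), ("s", 3), ("r", 3, "x", 1),
             ("c", 3), ("s", 4), ("w", 4, "y"), ("a", 4)]
ssg = build_ssg(H_EXAMPLE, {"x": [2, 1], "y": [2]})
print(sorted(ssg))
```

For $H_{example}$ this yields the edges $T_1 \to T_2$ (wr), $T_2 \to T_1$ (ww), $T_1 \to T_3$ (wr, s) and $T_2 \to T_3$ (s); the aborted $T_4$ contributes no node.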

In the following, given the $SSG(H)$ of a history $H$, we denote as $T_i \xrightarrow{ww+} T_j$ a path in the graph from $T_i$ to $T_j$ consisting only of write-dependency edges. Similarly, we denote as $T_i \xrightarrow{S+} T_j$ a path in $SSG(H)$ with only start-dependency edges.

2.2 Snapshot isolation in GID

Adya [1999] derives the set of histories allowable under SI by defining how Snapshot-Read and Snapshot-Write impose further restrictions on the $\prec_t$ order of certain start and commit operations.

| Dependency type | SSG edge | Edge name |
|---|---|---|
| Directly write-depends | $T_i \xrightarrow{ww} T_j$ | write-dependency edge or ww-dependency edge |
| Directly read-depends | $T_i \xrightarrow{wr} T_j$ | read-dependency edge or wr-dependency edge |
| Directly anti-depends | $T_i \xrightarrow{rw} T_j$ | anti-dependency edge or rw-dependency edge |
| Start-depends | $T_i \xrightarrow{S} T_j$ | start-dependency edge |

Table I. Dependencies (based on Fig. 2 in Adya et al. [2000])

$^7$ Adya [1999] defines anti-dependency for predicate reads to the first transaction to change the outcome of the predicate read. For SI, however, we need the anti-dependency to the next version.

Definition 2.3. Snapshot-Read. All read operations of transaction $T_i$ occur at $T_i$'s start point. That is, if $r_i(x_j)$ ($i \neq j$) occurs in history $H$, then:

(1) $c_j \prec_t s_i$, and

(2) if $w_k(x_k)$ also occurs in $H$ ($j \neq k \neq i$), then either
    (a) $s_i \prec_t c_k$, or
    (b) $c_k \prec_t s_i$ and $c_k \prec_t c_j$.

Part (1) requires that the read version was committed at start time of the reading transaction. Part (2) requires that the latest of the committed versions is read. That is, if both $x_j$ and $x_k$ were installed (committed) before $T_i$ started, and $x_k \ll x_j$, then $T_i$ does not read the "outdated" version $x_k$.
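The two parts of Snapshot-Read translate into a direct check over a totally ordered history. The following sketch is hypothetical: the function name `snapshot_read_violations` and the tuple encoding are ours, and it assumes that only committed writers $T_k$ are relevant for part (2), since an aborted transaction has no commit operation to compare against.

```python
def snapshot_read_violations(history):
    """Return (read_op, failed_part) pairs for reads that break Snapshot-Read.

    history: list of ("s", i), ("c", i), ("a", i), ("w", i, obj),
             ("r", i, obj, j) where j is the writer of the version read.
    """
    start = {op[1]: k for k, op in enumerate(history) if op[0] == "s"}
    commit = {op[1]: k for k, op in enumerate(history) if op[0] == "c"}
    writers = {}
    for op in history:
        if op[0] == "w" and op[1] in commit:
            writers.setdefault(op[2], set()).add(op[1])
    bad = []
    for op in history:
        if op[0] != "r":
            continue
        _, i, obj, j = op
        if i == j:
            continue
        # Part (1): the version read was committed before the reader started.
        if j not in commit or not commit[j] < start[i]:
            bad.append((op, "part 1"))
            continue
        # Part (2): no other committed version lies between x_j and the snapshot.
        for k in writers.get(obj, set()) - {i, j}:
            ok = start[i] < commit[k] or (commit[k] < start[i] and commit[k] < commit[j])
            if not ok:
                bad.append((op, "part 2"))
                break
    return bad

# T2 reads x1 although T1 commits only after T2 has started.
DIRTY = [("s", 1), ("s", 2), ("w", 1, "x"), ("c", 1), ("r", 2, "x", 1), ("c", 2)]
# T3 reads the outdated y0 although T1 committed y1 before T3 started.
STALE = [("s", 0), ("w", 0, "y"), ("c", 0), ("s", 1), ("w", 1, "y"), ("c", 1),
         ("s", 3), ("r", 3, "y", 0), ("c", 3)]
# T2 reads x0 from its snapshot; T1 commits x1 after T2 started.
GOOD = [("s", 0), ("w", 0, "x"), ("c", 0), ("s", 1), ("s", 2), ("w", 1, "x"),
        ("c", 1), ("r", 2, "x", 0), ("c", 2)]

print(snapshot_read_violations(DIRTY))
print(snapshot_read_violations(STALE))
print(snapshot_read_violations(GOOD))
```

The three small histories are made up for this illustration: the first fails part (1), the second fails part (2), and the third satisfies both.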

Definition 2.4. Snapshot-Write. For two committed transactions $T_i$ and $T_j$ in $H$ that modify the same object $x$:

(1) Either $c_i \prec_t s_j$ or $c_j \prec_t s_i$.

(2) If $c_i \prec_t s_j (\prec_t c_j)$ then $x_i \ll x_j$, and if $c_j \prec_t s_i (\prec_t c_i)$ then $x_j \ll x_i$.

That is, no concurrent committed transactions may update the same object, and the version order of an object $x$ follows the order in which the transactions that updated $x$ committed.
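Snapshot-Write can be checked pairwise over the committed writers of each object. The sketch below is a hedged illustration: `snapshot_write_violations`, the tuple encoding and the `version_order` dictionary are assumptions introduced here, not notation from the paper.

```python
from itertools import combinations

def snapshot_write_violations(history, version_order):
    """Return (obj, i, j, reason) for committed writer pairs breaking Snapshot-Write.

    history: list of ("s", i), ("c", i), ("a", i), ("w", i, obj) tuples.
    version_order: obj -> list of writers of committed versions, in order.
    """
    start, commit, writes = {}, {}, {}
    for k, op in enumerate(history):
        if op[0] == "s":
            start[op[1]] = k
        elif op[0] == "c":
            commit[op[1]] = k
        elif op[0] == "w":
            writes.setdefault(op[2], []).append(op[1])
    bad = []
    for obj, writers in writes.items():
        committed = [t for t in writers if t in commit]
        for i, j in combinations(committed, 2):
            # Part (1): one of the two must commit before the other starts.
            if commit[i] < start[j]:
                first, second = i, j
            elif commit[j] < start[i]:
                first, second = j, i
            else:
                bad.append((obj, i, j, "concurrent writers"))
                continue
            # Part (2): the version order must follow the commit order.
            order = version_order[obj]
            if order.index(first) > order.index(second):
                bad.append((obj, i, j, "version order"))
    return bad

CONCURRENT = [("s", 1), ("s", 2), ("w", 1, "y"), ("c", 1), ("w", 2, "y"), ("c", 2)]
SERIAL = [("s", 1), ("w", 1, "y"), ("c", 1), ("s", 2), ("w", 2, "y"), ("c", 2)]

print(snapshot_write_violations(CONCURRENT, {"y": [1, 2]}))
print(snapshot_write_violations(SERIAL, {"y": [1, 2]}))
print(snapshot_write_violations(SERIAL, {"y": [2, 1]}))
```

The last call shows that part (2) is independent of part (1): the two writers are not concurrent, yet a version order that contradicts the commit order is still rejected.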

Similar in spirit to the ANSI definitions, GID now identifies phenomena that a history must avoid to be SI. Some of them are defined through properties of the history that are simple to verify. Others are properties of the SSG.

- G-1a: Aborted Reads. A history $H$ over $\mathcal{T}$ exhibits phenomenon G-1a if it contains an aborted transaction $T_1$ and a committed transaction $T_2$ such that $T_2$ has read some objects modified by $T_1$.

- G-1b: Intermediate Reads. A history $H$ exhibits phenomenon G-1b if it contains a committed transaction $T_2$ that has read a version of object $x$ written by transaction $T_1$ that was not $T_1$'s final modification of $x$. We do not further consider this phenomenon because our transaction model assumes that each transaction writes an object at most once.

- G-1c: Circular Information Flow. A history $H$ has phenomenon G-1c if the start-ordered serialization graph $SSG(H)$ contains a directed cycle consisting entirely of ww-dependency and wr-dependency edges. We call this a G-1c cycle.

- G-SIa: Interference. A history $H$ exhibits phenomenon G-SIa if $SSG(H)$ contains a ww- or wr-dependency edge from $T_i$ to $T_j$ without there also being a start-dependency edge from $T_i$ to $T_j$.

- G-SIb: Missed Effects. A history $H$ exhibits phenomenon G-SIb if $SSG(H)$ contains a directed cycle with exactly one rw-dependency edge. We refer to such a cycle as a G-SIb cycle.

Fig. 2. $SSG(H_{non-SI})$ in Example 1 [graph over the nodes $T_1$, $T_2$, $T_3$; drawing not recoverable from the extraction]

Fig. 3. $SSG(H_{SI})$ in Example 1 [graph over the nodes $T_1$, $T_2$, $T_4$; drawing not recoverable from the extraction]

GID defines an isolation level PL-SI corresponding to SI as the one in which the G-1 and G-SI phenomena are disallowed. Roughly, G-1 captures the essence of dirty read and dirty write while G-SI captures the essence of violating Snapshot-Read and Snapshot-Write$^8$. For the convenience of discussion, we refer to a history as an SI-history if it avoids phenomena G-1 and G-SI.

Example 1. $H_{non-SI}$ is not an SI-history while $H_{SI}$ is an SI-history. Their SSGs are shown in Figures 2 and 3, respectively. We assume that a transaction $T_0$ installs versions $x_0$ and $y_0$ before the transactions $T_1$ to $T_3$ start.

$H_{non-SI}$: $s_1\ s_2\ w_1(x_1)\ w_1(y_1)\ c_1\ r_2(x_1)\ w_2(y_2)\ c_2\ s_3\ r_3(y_0)\ c_3 \quad [y_1 \ll y_2]$

$H_{SI}$: $s_1\ s_2\ s_3\ w_1(x_1)\ c_1\ r_2(x_0)\ w_2(y_2)\ c_2\ w_3(y_3)\ a_3\ s_4\ r_4(x_1)\ w_4(y_4)\ c_4 \quad [y_2 \ll y_4]$

In both $H_{non-SI}$ and $H_{SI}$, $T_1$ is the first to write and install $x$. In $H_{non-SI}$, $T_2$ reads the version of $x$ created by $T_1$ ($r_2(x_1)$). This violates property (1) of Snapshot-Read because $T_1$ has not committed at the time $T_2$ starts. Correspondingly we can see that there is a $T_1 \xrightarrow{wr} T_2$ edge but no $T_1 \xrightarrow{S} T_2$ edge in $SSG(H_{non-SI})$ (Figure 2). This means $H_{non-SI}$ has phenomenon G-SIa. Moreover, $T_1$ and $T_2$ both write $y$ concurrently and both are allowed to commit. This violates Snapshot-Write. Correspondingly we can see that there is a $T_1 \xrightarrow{ww} T_2$ edge but no $T_1 \xrightarrow{S} T_2$ edge in $SSG(H_{non-SI})$. Furthermore, there is a G-SIb cycle $T_1 \xrightarrow{S} T_3 \xrightarrow{rw} T_1$ in $SSG(H_{non-SI})$ having exactly one anti-dependency edge. Phenomenon G-SIb always occurs if a transaction reads an outdated version, which violates property (2) of Snapshot-Read. In $H_{non-SI}$, $T_3$ reads $y_0$, although $T_1$ wrote $y_1$ and committed before $T_3$ started. Thus, $T_3$ should have read $y_1$ and not $y_0$. This results in a G-SIb cycle between $T_1$ and $T_3$ with one start- and one rw-dependency edge.

In $H_{SI}$, $T_2$ reads $x$ from $T_0$ instead of $T_1$ ($r_2(x_0)$). This is correct, because $T_2$ started after $T_0$ committed but before $T_1$ committed. Although $T_1$ and $T_2$ are concurrent, both are able to commit because they write different objects. However, $T_3$ is aborted because it writes $y$, is concurrent to $T_2$, and $T_2$ commits (only one may commit). $T_4$ reads the last committed version as of start time. Figure 3 shows $SSG(H_{SI})$. $T_3$ does not appear in the SSG as it aborted. $H_{SI}$ avoids phenomenon G-1a since no transaction reads from $T_3$, G-1b since no transaction updates the same data item twice, and G-SIa since both the wr-dependency edge from $T_1$ to $T_4$ and the ww-dependency edge from $T_2$ to $T_4$ are accompanied by start-dependency edges. Furthermore, since $SSG(H_{SI})$ is acyclic, G-1c and G-SIb are avoided. Hence, $H_{SI}$ is an SI-history.
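The graph-based phenomena of Example 1 can be checked mechanically once the SSG is given as a set of labeled edges. The sketch below is an illustration under our own assumptions: the function names are hypothetical, and the two edge sets are written out by hand from the histories $H_{non-SI}$ and $H_{SI}$ (transaction $T_0$ is omitted, as in the figures).

```python
def reachable(edges, src, dst, kinds):
    """True if dst can be reached from src using only edges labeled in kinds."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(v for (u, v, k) in edges if u == node and k in kinds)
    return False

def has_g1c(edges):
    """G-1c: a directed cycle made only of ww- and wr-dependency edges."""
    info = {"ww", "wr"}
    return any(reachable(edges, v, u, info) for (u, v, k) in edges if k in info)

def has_g_sia(edges):
    """G-SIa: a ww- or wr-edge without an accompanying start-dependency edge."""
    return any((u, v, "s") not in edges for (u, v, k) in edges if k in {"ww", "wr"})

def has_g_sib(edges):
    """G-SIb: a directed cycle with exactly one rw-dependency edge."""
    non_rw = {"ww", "wr", "s"}
    # Close each rw-edge u -> v with a path v -> ... -> u that has no rw-edge.
    return any(reachable(edges, v, u, non_rw) for (u, v, k) in edges if k == "rw")

SSG_NON_SI = {(1, 2, "wr"), (1, 2, "ww"), (1, 3, "s"), (2, 3, "s"), (3, 1, "rw")}
SSG_SI = {(1, 4, "wr"), (1, 4, "s"), (2, 4, "ww"), (2, 4, "s"), (2, 1, "rw")}

for name, g in (("non-SI", SSG_NON_SI), ("SI", SSG_SI)):
    print(name, has_g1c(g), has_g_sia(g), has_g_sib(g))
```

For the first graph the checks report G-SIa and G-SIb but no G-1c cycle; for the second graph none of the three phenomena is reported, in line with the discussion above.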

$^8$ We refer to [Adya 1999; Adya et al. 2000] for the proofs that G-1 and G-SI are necessary and sufficient conditions for a history to provide Snapshot-Read and Snapshot-Write.

2.3 Observations

This section discusses some further properties of SI-histories and general histories and their SSGs. They will be useful when we discuss SI in a replicated system.

First of all, we want to point out a property that holds in the $SSG(H)$ of any history $H$. Figure 4 shows an illustration of this property.

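Proposition 2.5. Let $H$ be a history over $\mathcal{T}$. Let $T_i, T_j \in \mathcal{T}$ write $x$, and $T_k \in \mathcal{T}$ read $x$. If $T_i \xrightarrow{ww} T_j$ and $T_i \xrightarrow{wr} T_k$ are two edges in $SSG(H)$ due to $x$, then $T_k \xrightarrow{rw} T_j$ is an edge in $SSG(H)$ due to $x$.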

The proposition follows directly from the definition of dependency edges. $T_i \xrightarrow{ww} T_j$ due to $x$ means that $x_i$ and $x_j$ are consecutive versions in $x$'s version order. $T_i \xrightarrow{wr} T_k$ due to $x$ means that $T_k$ reads version $x_i$. Since $T_j$ installs the next version after $x_i$, according to the definition of direct anti-dependency edges, there must be a $T_k \xrightarrow{rw} T_j$ edge in $SSG(H)$ due to $x$.

Secondly, we want to look at the relationship between dependencies and the start and commit order of transactions. In Section 2.1, we have shown in our first example history, $H_{example}$, that there are generally very few restrictions on how operations are $\prec_t$-ordered. However, an SI-history has quite strong properties in regard to the $\prec_t$-order. Table II summarizes these ordering implications. Clearly, a start-dependency edge between $T_i$ and $T_j$ means $c_i \prec_t s_j$ for any history $H$ by definition. Furthermore, in order to avoid G-SIa, every ww- or wr-dependency edge in the $SSG(H)$ of an SI-history $H$ is accompanied by a start-dependency edge, and thus, we have $c_i \prec_t s_j$ in $H$. Finally, an anti-dependency $T_i \xrightarrow{rw} T_j$ implies $s_i \prec_t c_j$ in $H$. Assume that this were not the case. Then, by property (1)(d) of Definition 2.1, $c_j \prec_t s_i$ holds. Thus, there would be an edge $T_j \xrightarrow{S} T_i$ resulting in a cycle between $T_i$ and $T_j$ with exactly one rw-dependency edge. This is phenomenon G-SIb, which SI-histories avoid.

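Fig. 4. Relationship of read-, write-, and anti-dependency edge [graph over the nodes $T_i$, $T_j$, $T_k$ with edges $T_i \xrightarrow{ww} T_j$, $T_i \xrightarrow{wr} T_k$ and $T_k \xrightarrow{rw} T_j$]

| Dependency | Order requirement in SI-history |
|---|---|
| $T_i \xrightarrow{S} T_j$ | $c_i \prec_t s_j$ |
| $T_i \xrightarrow{ww} T_j$ | $c_i \prec_t s_j$ |
| $T_i \xrightarrow{wr} T_j$ | $c_i \prec_t s_j$ |
| $T_i \xrightarrow{rw} T_j$ | $s_i \prec_t c_j$ |

Table II. Order requirements for SI-histories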
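The order requirements of Table II can be verified against a concrete history. In this hypothetical sketch, `ORDER_RULE` and `order_violations` are names introduced for illustration; the histories keep only the start, commit and abort events of $H_{SI}$ and $H_{non-SI}$ from Example 1, and the edge sets are written out by hand.

```python
# For each edge label from Ti to Tj: (event of Ti, event of Tj) that must be
# ordered "Ti's event before Tj's event" in an SI-history (Table II).
ORDER_RULE = {
    "s": ("c", "s"),
    "ww": ("c", "s"),
    "wr": ("c", "s"),
    "rw": ("s", "c"),
}

def order_violations(events, edges):
    """Return the edges whose Table II order requirement does not hold.

    events: list of ("s", i), ("c", i), ("a", i) in time-precedes order.
    edges: set of (i, j, label) SSG edges over committed transactions.
    """
    pos = {ev: k for k, ev in enumerate(events)}
    bad = []
    for (i, j, label) in sorted(edges):
        first, second = ORDER_RULE[label]
        if not pos[(first, i)] < pos[(second, j)]:
            bad.append((i, j, label))
    return bad

EVENTS_SI = [("s", 1), ("s", 2), ("s", 3), ("c", 1), ("c", 2), ("a", 3),
             ("s", 4), ("c", 4)]
SSG_SI = {(1, 4, "wr"), (1, 4, "s"), (2, 4, "ww"), (2, 4, "s"), (2, 1, "rw")}

EVENTS_NON_SI = [("s", 1), ("s", 2), ("c", 1), ("c", 2), ("s", 3), ("c", 3)]
SSG_NON_SI = {(1, 2, "wr"), (1, 2, "ww"), (1, 3, "s"), (2, 3, "s"), (3, 1, "rw")}

print(order_violations(EVENTS_SI, SSG_SI))
print(order_violations(EVENTS_NON_SI, SSG_NON_SI))
```

In the non-SI history the offending edges are exactly those involved in the G-SIa and G-SIb phenomena, which illustrates why the requirements hold only for SI-histories.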

3. SNAPSHOT ISOLATION IN A REPLICATED SYSTEM

In this section we extend the notion of SI to a replicated environment. In order for a replicated database to provide a certain level of isolation, it should behave like a non-replicated database that runs under this isolation level. The concept of 1-copy-serializability is well known and understood [Bernstein et al. 1987]. It requires the execution in the replicated system to be equivalent to a serial execution in a non-replicated system. In this section, we formally define what it means for a history to be 1-copy-snapshot-isolation (1-copy-SI), and discuss necessary and sufficient conditions for a history to be 1-copy-SI.

3.1 Transactions and histories in a replicated database

A replicated database consists of a set of replicas $\mathcal{R}$, each of which keeps a copy of the database. That is, our framework assumes full replication. Our model follows a ROWA (read-one-write-all) approach in which each update transaction executes on one replica that performs all its operations. The transaction is called local at this replica, and remote at the other replicas. Only the write operations of a transaction are applied at the remote replicas. Hence, all replicas execute the same set of update transactions, but an update transaction $T_i$ has a readset $RS_i$, consisting of all its read operations, at only one replica, while it has the same writeset $WS_i$, consisting of its write operations, at all replicas. Read-only transactions, in contrast, only exist at the local replica. We express this by using a ROWA mapper function.

Definition 3.1. Mapper function. A ROWA mapper function, rmap, takes a set of transactions $\mathcal{T}$ and a set of replicas $\mathcal{R}$ as input, and transforms $\mathcal{T}$ into a set of transactions $\mathcal{T}' = rmap(\mathcal{T}, \mathcal{R})$. $rmap(\mathcal{T}, \mathcal{R})$ transforms each update transaction $T_i \in \mathcal{T}$ into a set $\{T_i^k \mid R^k \in \mathcal{R}\}$. In this set there is exactly one local transaction $T_i^l$ where $WS_i^l = WS_i$ and $RS_i^l = RS_i$ ($T_i$ is local at $R^l$). The rest are remote transactions $T_i^r$, where $WS_i^r = WS_i$ and $RS_i^r = \emptyset$ ($T_i$ is remote at $R^r$). A read-only transaction $T_i$ is transformed into a single local transaction $T_i^l$ with $RS_i^l = RS_i$. We denote as $\mathcal{T}^k = \{T_i^k \mid T_i^k \in \mathcal{T}'\}$ the set of transactions executed at replica $R^k$.
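The mapper function of Definition 3.1 can be sketched in a few lines of Python. The `Txn` record, its `home` field (the replica at which a transaction is local), and the dictionary returned by `rmap` are illustrative assumptions; the paper defines rmap only abstractly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    tid: int
    readset: frozenset
    writeset: frozenset
    home: str  # replica at which the transaction is local (assumed field)


def rmap(txns, replicas):
    """Return, per replica, the transactions executed there under ROWA."""
    mapped = {r: [] for r in replicas}
    for t in txns:
        if not t.writeset:
            # Read-only transaction: a single local copy at its home replica.
            mapped[t.home].append(t)
            continue
        for r in replicas:
            # Update transaction: full readset only at the local replica,
            # empty readset (but the same writeset) at every remote replica.
            rs = t.readset if r == t.home else frozenset()
            mapped[r].append(Txn(t.tid, rs, t.writeset, t.home))
    return mapped


t1 = Txn(1, frozenset({"x"}), frozenset({"x"}), "A")   # update, local at A
t2 = Txn(2, frozenset({"x"}), frozenset(), "B")        # read-only, local at B
for replica, copies in rmap([t1, t2], ["A", "B"]).items():
    print(replica, [(t.tid, sorted(t.readset), sorted(t.writeset)) for t in copies])
```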

Executing $\mathcal{T}'$ at the replicas $\mathcal{R}$ leads to what we denote a replicated history.

Definition 3.2. Replicated history. Let $\mathcal{T}$ be a set of transactions, $\mathcal{R}$ a set of replicas and rmap a ROWA mapper function generating $\mathcal{T}' = rmap(\mathcal{T}, \mathcal{R})$. Let $RH^k$ be the history over $\mathcal{T}^k$ at $R^k \in \mathcal{R}$. We denote the union over all histories $RH^k$, $R^k \in \mathcal{R}$, as a replicated history RH over $rmap(\mathcal{T}, \mathcal{R})$, i.e., $RH = \bigcup_{R^k \in \mathcal{R}} RH^k$.

In the remainder of the paper, we assume that before the start of a replicated history RH, all replicas have the same state of the database, i.e., for each data item x, each replica $R^k$ has the same last committed data version.

3.2 1-copy-SI

We now have to define when a replicated history provides 1-copy-SI, i.e., when it is equivalent to an SI-history over a non-replicated database. We model this by requiring a replicated history over $rmap(\mathcal{T}, \mathcal{R})$ to have the same dependencies between read and write operations as a non-replicated SI-history over $\mathcal{T}$. In GID, any such dependency is captured by means of a ww-, wr- or rw-dependency edge in the SSG. Thus each history $RH^k$ at replica $R^k$ has its own $SSG(RH^k)$ reflecting the dependencies that occurred in this history. The union of all these SSGs reflects the sum of all dependencies. An equivalent non-replicated SI-history has to have the same dependencies, except for the start-dependency edges. We first define this set of dependencies as a graph:

Definition 3.3. Union Serialization Graph (USG). Let $RH = \bigcup RH^k$ be a replicated history over $rmap(\mathcal{T}, \mathcal{R})$. We denote as USG(RH) the following graph.

(1) $\forall R^k \in \mathcal{R}$, if $SSG(RH^k)$ has node $T_i^k \in \mathcal{T}^k$, then USG(RH) has node $T_i$.

(2) $\forall R^k \in \mathcal{R}$ and each ww-, wr-, or rw-dependency edge from $T_i^k$ to $T_j^k$ in $SSG(RH^k)$, USG(RH) has a corresponding ww-, wr-, or rw-dependency edge from $T_i$ to $T_j$.

(3) There are no further edges or nodes in USG(RH).

[Figure omitted. Panels: (a) $SSG(RH^A_{\text{exact-edge}})$ over $T_1$, $T_2$, $T_3$ with edges labeled "ww,s", "wr,s" and "rw"; (b) $SSG(RH^B_{\text{exact-edge}})$ over $T_1$, $T_2$, $T_4$ with edges labeled "ww,s" and "wr,s"; (c) $SSG(H_{\text{exact-edge}})$ over $T_1$, $T_2$, $T_3$, $T_4$ with edges labeled "ww,s", "wr,s", "rw" and "wr,s".]

Fig. 5. SSGs in Example 2
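Definition 3.3 amounts to a union of the local graphs with the start-dependency edges dropped. A minimal Python sketch follows, assuming each local SSG is given as a pair (nodes, edges) with edges encoded as (kind, i, j) triples; this encoding and the function name are choices made for illustration only.

```python
def union_serialization_graph(local_ssgs):
    """Build USG(RH) from a dict mapping replica name -> (nodes, edges).

    Transaction copies T_i^k are identified with their index i, so the
    union over replicas yields the nodes T_i of the USG directly.
    Start-dependency edges (kind "S") are not carried over.
    """
    nodes, edges = set(), set()
    for ssg_nodes, ssg_edges in local_ssgs.values():
        nodes |= set(ssg_nodes)
        edges |= {(kind, i, j) for kind, i, j in ssg_edges
                  if kind in ("ww", "wr", "rw")}
    return nodes, edges


# Two local graphs in the style of Fig. 5 (a) and (b).
ssg_a = ({1, 2, 3}, {("ww", 1, 2), ("S", 1, 2), ("wr", 1, 3), ("S", 1, 3), ("rw", 3, 2)})
ssg_b = ({1, 2, 4}, {("ww", 1, 2), ("S", 1, 2), ("wr", 1, 4), ("S", 1, 4)})
nodes, edges = union_serialization_graph({"A": ssg_a, "B": ssg_b})
print(sorted(nodes))
print(sorted(edges))
```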

Definition 3.4. 1-copy-SI. Let $RH = \bigcup RH^k$ be a replicated history over $rmap(\mathcal{T}, \mathcal{R})$. We say RH is 1-copy-SI if

(1) $\forall R^k \in \mathcal{R}$, $RH^k$ is an SI-history.

(2) For all update transactions $T_i \in \mathcal{T}$ and for all $R^k, R^l \in \mathcal{R}$: $c_i^k \iff c_i^l$.

(3) There exists an SI-history H over $\mathcal{T}$ such that
    (a) SSG(H) and USG(RH) have the same nodes;
    (b) SSG(H) has exactly the same ww-, wr-, and rw-dependency edges as USG(RH).

(1) means that the histories at all replicas must be SI-histories. In the following we often refer to them as the local histories. (2) means all local histories must commit the same set of update transactions. This is an obvious requirement of ROWA. Finally, (3) means an SI-history over the original set of transactions must exist with the same dependencies. We often refer to this non-replicated history over $\mathcal{T}$ as the global history. As with GID in the general case, the data items that lead to the dependency edges do not need to be considered. We show in Appendix A that this is indeed the case and that our definition of 1-copy-SI is sufficient.

Example 2. In this example, there are two replicas $R^A$ and $R^B$. Transactions $T_1$, $T_2$, and $T_3$ are local at $R^A$ while $T_4$ is local at $R^B$. The replicated history $RH_{\text{exact-edge}}$ is the union of the local histories $RH^A_{\text{exact-edge}}$ and $RH^B_{\text{exact-edge}}$.

$RH^A_{\text{exact-edge}}$: $s_1^A \; w_1^A(x_1) \; w_1^A(y_1) \; c_1^A \; s_2^A \; w_2^A(x_2) \; s_3^A \; c_2^A \; r_3^A(x_1) \; c_3^A \quad [x_1 \ll x_2]$

$RH^B_{\text{exact-edge}}$: $s_1^B \; w_1^B(x_1) \; w_1^B(y_1) \; c_1^B \; s_2^B \; w_2^B(x_2) \; s_4^B \; c_2^B \; r_4^B(y_1) \; c_4^B \quad [x_1 \ll x_2]$

$SSG(RH^A_{\text{exact-edge}})$ and $SSG(RH^B_{\text{exact-edge}})$ are shown in Fig. 5. For simplicity, the superscripts A and B, which indicate replicas, are omitted from the transactions in the figure. It is easy to verify that both $RH^A$ and $RH^B$ are SI-histories. USG(RH) is the union graph of all ww-, wr- and rw-dependency edges of $SSG(RH^A_{\text{exact-edge}})$ and $SSG(RH^B_{\text{exact-edge}})$.

We can show that the replicated history $RH_{\text{exact-edge}}$ is 1-copy-SI by building the following global history $H_{\text{exact-edge}}$ over $\{T_1, T_2, T_3, T_4\}$:

$H_{\text{exact-edge}}$: $s_1 \; w_1(x_1) \; w_1(y_1) \; c_1 \; s_2 \; w_2(x_2) \; s_3 \; s_4 \; c_2 \; r_3(x_1) \; c_3 \; r_4(y_1) \; c_4 \quad [x_1 \ll x_2]$


$SSG(H_{\text{exact-edge}})$ is shown in Figure 5(c). It has exactly the same ww-, wr- and rw-dependency edges as $USG(RH_{\text{exact-edge}})$. We can also easily see that H avoids G1 and G-SI. Hence, $RH_{\text{exact-edge}}$ is 1-copy-SI.
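The ww-, wr- and rw-edges of Example 2 can be derived from the version order and from the versions each read returned. The sketch below assumes a simplified encoding (item-level reads only, versions identified by the index of the writing transaction); the function name and argument shapes are hypothetical.

```python
def dependency_edges(version_order, reads):
    """Derive direct ww-, wr- and rw-dependency edges.

    version_order: item -> list of writer transaction ids, in version order.
    reads: (reader, item, writer) triples; the reader read the version of
           the item that was installed by the writer.
    """
    edges = set()
    for writers in version_order.values():
        for a, b in zip(writers, writers[1:]):
            edges.add(("ww", a, b))            # b installs the next version after a
    for reader, item, writer in reads:
        if writer != reader:
            edges.add(("wr", writer, reader))  # reader read writer's version
        writers = version_order[item]
        k = writers.index(writer)
        if k + 1 < len(writers) and writers[k + 1] != reader:
            edges.add(("rw", reader, writers[k + 1]))  # next version overwrites what was read
    return edges


# Global history of Example 2: x1 << x2, r3(x1), r4(y1).
edges = dependency_edges({"x": [1, 2], "y": [1]}, [(3, "x", 1), (4, "y", 1)])
print(sorted(edges))
# [('rw', 3, 2), ('wr', 1, 3), ('wr', 1, 4), ('ww', 1, 2)]
```

For this input the result coincides with the union of the ww-, wr- and rw-edges of the two local histories, which is what condition (3b) of Definition 3.4 asks for.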

In the above example, we have shown the 1-copy-SI property by constructing a non-replicated history that fulfills the conditions of the 1-copy-SI definition. However, constructing an appropriate non-replicated global SI-history for an arbitrary replicated history that fulfills the 1-copy-SI property is not always trivial. Furthermore, in case a replicated history is not 1-copy-SI, it is difficult to prove that no global SI-history with the appropriate properties exists. Thus, we need a more convenient way to determine whether a replicated history is 1-copy-SI.

Bernstein et al. [1987] showed that if the union of the serialization graphs of the histories at different replicas, enhanced with certain edges, is acyclic then the replicated history is 1-copy-serializable. It would be nice if we could use the USG(RH) for a similar purpose. That is, given that USG(RH) has certain properties, e.g., avoids certain cycles, then we know that RH is 1-copy-SI. Indeed, the next two sections will discuss a set of properties for USG(RH) that help to determine whether the replicated history is 1-copy-SI.

3.3 Necessary conditions for a replicated history to be 1-copy-SI

It is clear that if USG(RH) has a G-1c or G-SIb cycle, then RH is not 1-copy-SI because it is not possible for an SI-history H to have an SSG(H) with the same edges. Our first question is whether any other characteristics of USG(RH) can be determined that show that RH is not 1-copy-SI. Let's have a look at an example.

Example 3. In this example, there are two replicas $R^A$ and $R^B$. Transactions $T_1$ and $T_2$ are local at $R^A$, and $T_3$ and $T_4$ are local at $R^B$. We assume an initial transaction $T_0$ created $x_0$ and $y_0$ and committed before the following execution starts.

$RH^A_{\text{hole}}$: $s_1^A \; w_1^A(x_1) \; c_1^A \; s_2^A \; r_2^A(x_1) \; r_2^A(y_0) \; c_2^A \; s_4^A \; w_4^A(y_4) \; c_4^A$

$RH^B_{\text{hole}}$: $s_4^B \; w_4^B(y_4) \; c_4^B \; s_3^B \; r_3^B(y_4) \; r_3^B(x_0) \; c_3^B \; s_1^B \; w_1^B(x_1) \; c_1^B$

$SSG(RH^A_{\text{hole}})$ and $SSG(RH^B_{\text{hole}})$ are shown in Figures 6(a) and (b) respectively. The USG(RH) shown in Figure 6(c) has no G-1c or G-SIb cycles.

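Still, $RH_{\text{hole}}$ is not 1-copy-SI. We show this by contradiction. Assume $RH_{\text{hole}}$ is 1-copy-SI. Then, there must be a global SI-history $H_{\text{hole}}$ which contains the same ww-, wr-, and rw-dependency edges as $USG(RH_{\text{hole}})$. Hence, based on $T_1 \xrightarrow{wr} T_2 \xrightarrow{rw} T_4$ in $USG(RH_{\text{hole}})$ and Table II, we derive for the $\prec_t$-order of H:

$$\left.\begin{array}{l} T_1 \xrightarrow{wr} T_2 \implies c_1 \prec_t s_2 \\ T_2 \xrightarrow{rw} T_4 \implies s_2 \prec_t c_4 \end{array}\right\} \implies c_1 \prec_t c_4$$

Similarly, due to $T_4 \xrightarrow{wr} T_3 \xrightarrow{rw} T_1$ we derive:

$$\left.\begin{array}{l} T_4 \xrightarrow{wr} T_3 \implies c_4 \prec_t s_3 \\ T_3 \xrightarrow{rw} T_1 \implies s_3 \prec_t c_1 \end{array}\right\} \implies c_4 \prec_t c_1$$

This results in $c_1 \prec_t c_4 \prec_t c_1$, which is impossible since $\prec_t$ is transitive and irreflexive. Thus, no SI-history could have a graph with the above edges, and $RH_{\text{hole}}$ is not 1-copy-SI.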

[Figure omitted. Panels: (a) $SSG(RH^A_{\text{hole}})$; (b) $SSG(RH^B_{\text{hole}})$; (c) $USG(RH_{\text{hole}})$, containing the cycle $T_1 \xrightarrow{wr} T_2 \xrightarrow{rw} T_4 \xrightarrow{wr} T_3 \xrightarrow{rw} T_1$.]

Fig. 6. SSGs in Example 3

The problem of $RH_{\text{hole}}$ is that $T_2$ and $T_3$ indirectly order $T_1$ and $T_4$ although $T_1$ and $T_4$ do not conflict. In $R^A$, $T_2$ reads x and y from a snapshot after $T_1$ commits but before $T_4$ commits. This indirectly requires $T_1$ to commit before $T_4$. In contrast, in $R^B$, $T_3$ reads x and y from a snapshot after $T_4$ commits but before $T_1$ commits, indirectly ordering $T_4$ before $T_1$. In a non-replicated history, only one of the snapshots is possible, that is, either $T_1$ commits before $T_4$ or it commits after $T_4$, but not both. $USG(RH_{\text{hole}})$ (see Figure 6(c)) expresses this behavior by having a cycle with more than one rw-dependency edge. In principle, this is not explicitly forbidden by the definition of SI. But it turns out that the particular cycle above is not possible in a non-replicated history. Thus, we define a further phenomenon.

—G-SIb*: rw-dependency cycle. A history H exhibits phenomenon G-SIb* if SSG(H) has a cycle with at least one rw-dependency edge and each rw-dependency edge is prefixed by a ww-, wr-, or start-dependency edge. We refer to such a cycle as a G-SIb* cycle.

G-SIb* refers to cycles where there are no consecutive rw-dependency edges (see footnote 9). Note that G-SIb* includes G-SIb because if there is a cycle with exactly one rw-dependency edge, then this rw-dependency edge must be prefixed with a non-rw-dependency edge. G-SIb* is a derived phenomenon, i.e., if a history avoids G-1a-c and G-SIa-b, then it automatically avoids G-SIb*.

Lemma 3.5. A (non-replicated) SI-history H over a set $\mathcal{T}$ avoids G-SIb*.

Proof Sketch. The proof follows the lines of reasoning taken in Example 3. Any cycle can be broken into m (m > 1) sections where each section $k \in \{0, \ldots, m-1\}$ follows the pattern $T_{i_k} \xrightarrow{(ww/wr/S)^+} T_{j_k} \xrightarrow{rw} T_{i_{(k+1)\%m}}$. From there, we can derive $c_{i_k} \prec_t s_{j_k} \prec_t c_{i_{(k+1)\%m}}$ in the history, eventually leading to $c_{i_0} \prec_t c_{i_0}$, which is a contradiction. A complete proof is given in Appendix A.1.

In Example 3, as $USG(RH_{\text{hole}})$ has a G-SIb* cycle, we know that $RH_{\text{hole}}$ is not 1-copy-SI. In summary we observe the following necessary conditions.

Proposition 3.6. Necessary Conditions for 1-copy-SI. If a replicated history RH is 1-copy-SI, then USG(RH) has no G-1c or G-SIb* cycles.
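One way to test the conditions of Proposition 3.6 on a concrete USG is to translate every edge into its Table II order requirement on start and commit events and look for a cycle among those events. The Python sketch below makes two assumptions that are not spelled out in the paper: edges are encoded as (kind, i, j) triples, and a closed walk is treated as a cycle. Any cycle among the events alternates between commit and start nodes, so it corresponds to a USG cycle in which no two rw-edges are consecutive.

```python
from collections import defaultdict, deque


def has_cycle(edges):
    """Kahn's algorithm: True if the directed graph given as (a, b) pairs has a cycle."""
    succ, indeg, nodes = defaultdict(set), defaultdict(int), set()
    for a, b in set(edges):
        nodes |= {a, b}
        succ[a].add(b)
        indeg[b] += 1
    queue = deque(n for n in nodes if indeg[n] == 0)
    removed = 0
    while queue:
        n = queue.popleft()
        removed += 1
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    return removed < len(nodes)


def has_g1c(usg_edges):
    """G-1c: a cycle made of ww- and wr-edges only."""
    return has_cycle((i, j) for kind, i, j in usg_edges if kind in ("ww", "wr"))


def violates_necessary_conditions(usg_edges):
    """True if the start/commit constraints implied by the edges are cyclic,
    i.e. the USG has a G-1c or a G-SIb* cycle."""
    constraints = set()
    for kind, i, j in usg_edges:
        constraints.add((("s", i), ("c", i)))
        constraints.add((("s", j), ("c", j)))
        if kind == "rw":
            constraints.add((("s", i), ("c", j)))   # s_i before c_j
        else:
            constraints.add((("c", i), ("s", j)))   # c_i before s_j
    return has_cycle(constraints)


usg_hole = [("wr", 1, 2), ("rw", 2, 4), ("wr", 4, 3), ("rw", 3, 1)]   # Example 3
usg_exact = [("ww", 1, 2), ("wr", 1, 3), ("rw", 3, 2), ("wr", 1, 4)]  # Example 2
print(has_g1c(usg_hole), violates_necessary_conditions(usg_hole))     # False True
print(has_g1c(usg_exact), violates_necessary_conditions(usg_exact))   # False False
```

A cycle of two consecutive rw-edges, such as the write-skew pattern between two transactions, is not reported by this check, in line with footnote 9.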

3.4 Sufficient conditions for a replicated history to be 1-copy-SI.

It turns out that avoiding G-1c and G-SIb* is not only necessary but also sufficient for a replicated history RH to be 1-copy-SI. That is, for a replicated history RH, if all local histories $RH^k$ are SI, all $R^k$ commit the same update transactions, and USG(RH) has no G-1c and G-SIb* cycles, then RH is 1-copy-SI. In particular, we are able to construct a global SI-history H such that SSG(H) has the same ww-, wr- and rw-dependency edges as USG(RH). We start with some interesting properties of an RH whose local histories are SI-histories.

Footnote 9: SI allows cycles with two consecutive rw-dependency edges. Fekete et al. [2005] show that all histories that are SI but not serializable contain cycles with consecutive rw-dependency edges.

Lemma 3.7. Let RH be a replicated history over $rmap(\mathcal{T}, \mathcal{R})$. At each $R^k \in \mathcal{R}$, let $RH^k$ be an SI-history over $\mathcal{T}^k$. Let each update transaction $T_i \in \mathcal{T}$ commit at either all or none of the replicas.

If USG(RH) has no G-1c cycle, then for each $R^k, R^l \in \mathcal{R}$ ($k \neq l$): $x_i \ll x_j$ in $RH^k$ $\iff$ $x_i \ll x_j$ in $RH^l$. That is, all local histories have the same version orders for all data items, and thus, the same ww-dependency edges in their $SSG(RH^k)$.

Proof Sketch. By definition of ww-dependency edges, $x_i \ll x_j$ implies a path $T_i \xrightarrow{ww^+} T_j$ in the local SSG. If there are different version orders $x_i \ll x_j$ in $RH^k$ and $x_j \ll x_i$ in $RH^l$, then $SSG(RH^k)$ has a path $T_i \xrightarrow{ww^+} T_j$ and $SSG(RH^l)$ a reverse path $T_j \xrightarrow{ww^+} T_i$. Thus, in contradiction to our assumption, USG(RH) has a G-1c cycle. A complete proof is given in Appendix A.2.

As in SI the version order of an object is consistent with the commit order of the transactions updating the object, we can derive the following:

Proposition 3.8. Let RH be a replicated history over $rmap(\mathcal{T}, \mathcal{R})$. At each $R^k \in \mathcal{R}$, let $RH^k$ be an SI-history over $\mathcal{T}^k$. Let each update transaction $T_i \in \mathcal{T}$ commit at either all or none of the replicas.

If USG(RH) has no G-1c cycles, then for any $T_i, T_j \in \mathcal{T}$ writing a common data item x and for any replicas $R^A, R^B \in \mathcal{R}$: $c_i^A \prec_t c_j^A$ in $RH^A$ if and only if $c_i^B \prec_t c_j^B$ in $RH^B$. That is, two conflicting committed transactions commit in the same order in all local histories.

Readers can verify that the replicated history $RH_{\text{exact-edge}}$ in Example 2 does obey Lemma 3.7 and Proposition 3.8. Each local history is SI, all histories commit the same set of update transactions and USG(RH) is acyclic. Both histories have the same version order for x and commit $T_1$ and $T_2$ in the same order.

Based on the discussion above, we can state sufficient conditions for a replicated history to be 1-copy-SI.

Theorem 3.9. Sufficient conditions for 1-copy-SI. Let RH be a replicated history over $rmap(\mathcal{T}, \mathcal{R})$. RH is 1-copy-SI if the following holds:

(1) For each $R^k \in \mathcal{R}$, $RH^k$ is an SI-history.

(2) For all update transactions $T_i \in \mathcal{T}$ and for all $R^k, R^l \in \mathcal{R}$: $c_i^k \iff c_i^l$.

(3) USG(RH) has no G-1c or G-SIb* cycles.

Proof Sketch. To prove this, according to the definition of 1-copy-SI (Definition 3.4), it is sufficient to show that we are able to construct an SI-history H over $\mathcal{T}$ with the same ww-, wr-, and rw-dependencies as USG(RH). The proof consists of three parts. First, we create a global history H based on the dependency edges in USG(RH). Then, we show that H really has the same dependency edges as USG(RH). Finally, we show that H is actually an SI-history.

[Figure omitted. Panels: (a) $USG(RH_{\text{exact-edge}})$ over $T_1$, $T_2$, $T_3$, $T_4$ with edges labeled ww, wr, rw and wr; (b) incomplete $SCSG(RH_{\text{exact-edge}})$ over the nodes $s_1, c_1, s_2, c_2, s_3, c_3, s_4, c_4$; (c) complete $SCSG(RH_{\text{exact-edge}})$ over the same nodes.]

Fig. 7. USG and SCSG of $RH_{\text{exact-edge}}$

We only sketch the main ideas behind each of these three parts and give a flavor of how the details can be derived. The complete proof and a detailed description of every part are given in Appendix A.3.

Part 1: We show the construction of H along the history $RH_{\text{exact-edge}}$ of Example 2. We first build a total order between start and commit operations of all committed transactions. Then we fill in the read and write operations, determine the version orders, and determine the versions in the read operations.

We first order start and commit operations. Clearly, for each transaction $T_i$, $s_i \prec_t c_i$. Then, we consider the dependency edges in USG. The $USG(RH_{\text{exact-edge}})$ of our example is shown in Figure 7(a). Whenever there is a wr- or ww-dependency edge from $T_i$ to $T_j$, we require $c_i \prec_t s_j$; whenever there is a rw-dependency edge from $T_i$ to $T_j$, we require $s_i \prec_t c_j$. This is derived from Table II. We can present the $\prec_t$-relations built so far as a Start-Commit-Order Serialization Graph SCSG(RH) where the start and commit operations are nodes, and there is an edge from node $n_i$ to node $n_j$ if $n_i \prec_t n_j$. Figure 7(b) shows $SCSG(RH_{\text{exact-edge}})$ for our example so far. As $\prec_t$ must be transitive, $n_i \prec_t n_j$ whenever there is a path in the graph. We can see that the graph is acyclic. Indeed, we show in Appendix A.3 that our construction rules avoid any cycle $c_i \prec_t c_i$ because any such cycle would be due to a G-1c or G-SIb* cycle in USG. We now extend SCSG to order any pair of start and commit operations because a history must order all start/commit pairs. For any $c_i$, $s_j$ where there is not yet a path from $c_i$ to $s_j$ or from $s_j$ to $c_i$ in SCSG, we set $s_j \prec_t c_i$. Figure 7(c) now shows the complete $SCSG(RH_{\text{exact-edge}})$. By construction, the graph remains acyclic, and thus, $\prec_t$ remains a partial order.

The next step includes into the $\prec_t$-order of the global history H the read and write operations of each committed transaction $T_i$ by simply ordering them between $s_i$ and $c_i$. After that, we determine the version order in H. According to Lemma 3.7, all local histories have the same version orders for all data items. We use these version orders for H. Finally, we let each read operation of transaction $T_i$ in H read the same version that $T_i^l$ did in the history $RH^l$ of RH in which $T_i$ was local.

Coming back to our example, the global history $H_{\text{exact-edge}}$ given in Example 2 and repeated below conforms to the construction rules above. Note that there exist other possible global histories, as the commit order between $c_2$, $c_3$ and $c_4$ is not restricted. Also the order of non-conflicting operations can be varied.

$H_{\text{exact-edge}}$: $s_1 \; w_1(x_1) \; w_1(y_1) \; c_1 \; s_2 \; w_2(x_2) \; s_3 \; s_4 \; c_2 \; r_3(x_1) \; c_3 \; r_4(y_1) \; c_4 \quad [x_1 \ll x_2]$
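The ordering of start and commit operations in Part 1 can be sketched as follows. This is an illustrative reading of the construction, not the procedure of Appendix A.3: the edge encoding, the function names, and the tie-breaking by transaction index are assumptions, and the code presumes that USG has no G-1c or G-SIb* cycle.

```python
import heapq
from itertools import product


def reachable(succ, src, dst):
    """Depth-first search: is there a path from src to dst?"""
    stack, seen = [src], set()
    while stack:
        n = stack.pop()
        if n == dst:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(succ[n])
    return False


def start_commit_order(txns, usg_edges):
    """Return one total order of start/commit events compatible with the SCSG."""
    succ = {(kind, t): set() for t in txns for kind in "sc"}
    for t in txns:
        succ[("s", t)].add(("c", t))            # s_i before c_i
    for kind, i, j in usg_edges:
        if kind == "rw":
            succ[("s", i)].add(("c", j))        # rw edge: s_i before c_j
        else:
            succ[("c", i)].add(("s", j))        # ww/wr edge: c_i before s_j
    for i, j in product(txns, repeat=2):
        c, s = ("c", i), ("s", j)
        if not reachable(succ, c, s) and not reachable(succ, s, c):
            succ[s].add(c)                      # unordered pair: set s_j before c_i
    indeg = {n: 0 for n in succ}
    for n in succ:
        for m in succ[n]:
            indeg[m] += 1
    # Kahn's algorithm; ties are broken by transaction index (an arbitrary choice).
    ready = [(t, kind) for (kind, t) in succ if indeg[(kind, t)] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        t, kind = heapq.heappop(ready)
        order.append(f"{kind}{t}")
        for m in succ[(kind, t)]:
            indeg[m] -= 1
            if indeg[m] == 0:
                heapq.heappush(ready, (m[1], m[0]))
    return order


usg = [("ww", 1, 2), ("wr", 1, 3), ("rw", 3, 2), ("wr", 1, 4)]
print(start_commit_order([1, 2, 3, 4], usg))
# ['s1', 'c1', 's2', 's3', 's4', 'c2', 'c3', 'c4']
```

With this tie-breaking the result matches the start/commit order of the global history of Example 2; other tie-breaking rules yield other valid orders. If the constraints were cyclic, the returned list would contain fewer than two events per transaction.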

Part 2: Next, we have to show that the SSG(H) of the global history H has exactly the same ww-, wr- and rw-dependency edges as USG(RH). This is true as the version order in H is the same as in the local histories of RH, and transactions in H read the same data versions as the corresponding local transactions do in RH. We can easily confirm this property for our example history $H_{\text{exact-edge}}$.

Part 3: The final part shows that H is actually an SI-history. The proof for G-1 and G-SIa can be easily derived by looking at how we constructed H and the fact that USG(RH) has no G-1c cycles. Showing that H avoids G-SIb is slightly more complex as SSG(H) has more edges than USG(RH), namely start-dependency edges. The idea is to show that any G-SIb cycle in SSG(H) would require $c_i \prec_t c_i$ in H, which is impossible because our construction of $\prec_t$ guarantees a partial order.

3.5 Observations

Proposition 3.8 indicates that all conflicting transactions must commit in the same order at all replicas. But when is a transaction allowed to commit? According to the Snapshot-Write property of SI, if two transactions have write/write conflicts and are concurrent, one of them must be aborted. This rule also needs to hold in a replicated database. But when are two transactions concurrent in a distributed system? In a non-replicated system, two transactions $T_i$ and $T_j$ are concurrent if their lifetimes overlap (i.e., $s_i \prec_t c_j \wedge s_j \prec_t c_i$). We can define the concurrency of two transactions in a replicated database according to this rule.

Definition 3.10. Concurrency. Let RH be a replicated history over $rmap(\mathcal{T}, \mathcal{R})$. Two transactions $T_i, T_j \in \mathcal{T}$ are concurrent in RH iff $\exists R^k, R^l \in \mathcal{R}$: $s_i^k \prec_t c_j^k / a_j^k$ in $RH^k$ and $s_j^l \prec_t c_i^l / a_i^l$ in $RH^l$.

That is, $T_i$ and $T_j$ are concurrent if and only if there is a replica at which $T_i$ starts before $T_j$ commits/aborts and a replica at which $T_j$ starts before $T_i$ commits/aborts. Note that $R^k$ might be the same as $R^l$. It means that if $T_i$ and $T_j$ are concurrent in one local history they are considered concurrent. But they are also considered concurrent if $T_i$ executes completely before $T_j$ in one history and completely after $T_j$ in another history. Based on this definition, we can derive another rule for 1-copy-SI.

Lemma 3.11. Let RH be a replicated history over $rmap(\mathcal{T}, \mathcal{R})$ that is 1-copy-SI. If two transactions $T_i, T_j \in \mathcal{T}$ have write/write conflicts and are concurrent in RH, at least one of them aborts.

The proof of this lemma is given in Appendix A.4.
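Definition 3.10 translates directly into a check over the local histories. In the sketch below, each local history is assumed to be a list of (event, transaction) pairs with event in {"s", "c", "a"}; this encoding and the helper names are illustrative only.

```python
def starts_before_end(history, i, j):
    """True if, in this local history, T_i starts before T_j commits or aborts."""
    pos = {event: n for n, event in enumerate(history)}
    end_j = pos.get(("c", j), pos.get(("a", j)))
    return ("s", i) in pos and end_j is not None and pos[("s", i)] < end_j


def concurrent(local_histories, i, j):
    """Definition 3.10: the two witnessing replicas may be the same or differ."""
    histories = list(local_histories.values())
    return (any(starts_before_end(h, i, j) for h in histories)
            and any(starts_before_end(h, j, i) for h in histories))


# Start/commit events of the local histories of Example 3.
rh_hole = {
    "A": [("s", 1), ("c", 1), ("s", 2), ("c", 2), ("s", 4), ("c", 4)],
    "B": [("s", 4), ("c", 4), ("s", 3), ("c", 3), ("s", 1), ("c", 1)],
}
print(concurrent(rh_hole, 1, 4))  # True
print(concurrent(rh_hole, 1, 2))  # False
```

In Example 3, T1 and T4 run serially at each replica but in opposite orders, so they count as concurrent in the replicated history, whereas T1 and T2 do not.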

4. SNAPSHOT ISOLATION AND INTEGRITY CONSTRAINTS

Database systems allow the definition of a whole range of integrity constraints, such as primary keys and foreign keys. In this section, we discuss the relationship between snapshot isolation and integrity constraints in a non-replicated system. The next section extends our notions to a replicated environment.

4.1 Motivation

An integrity constraint puts constraints on the existence and values of data objects in the system. During the execution of a transaction these constraints might be violated. However, at commit time, all constraints must be obeyed.

The simplest constraint is the primary key constraint that disallows the existence of two records in a table with the same value in the primary key attribute.

A very common constraint is the foreign key constraint. Assume a department relation D(did, location) with identifier did as primary key, and an employee relation E(eid, ename, did) with identifier eid as primary key and the attribute did as foreign key referring to the department the employee works in. The foreign key constraint requires that if there is an employee record with did=‘d1’ in the employee table, then there is a department record in the department table with did=‘d1’.

[Figure omitted: nodes $T_1$ and $T_2$ connected by two rw-edges.]

Fig. 8. $SSG(H_{\text{skew}})$ in Example 4

Another example is that the balance of an account may not be below zero. A bit more advanced, the constraint could require the sum of the balances of all accounts of a client to be at least zero while each individual account can be below zero.

Such constraints can be defined at database design time and then are enforced by the database system itself. In order to do so, a database system needs to perform some implicit read operations upon receiving certain update requests. For instance, upon the insert of a new tuple, the system checks whether a record with the same primary key value already exists, and if yes, aborts the transaction. In the foreign key example above, upon the insert of an employee record or the update of the did field of an existing employee record, the system performs an implicit read operation on the department table to check whether a department record exists with the corresponding value in the did attribute. If it exists, the insert/update of the employee record is allowed, otherwise the transaction is aborted. Similarly, upon the delete of a department record or the change of the value of the did field, the system looks at the employee table and checks whether an employee record exists that has the same did value. If yes, the transaction is aborted, otherwise the modification is allowed (see footnote 10). In the examples with the account balances, the values of the balances are checked to determine whether the update is possible.

In most cases, these read operations are predicate reads. For instance, in the foreign key example, when inserting an employee tuple, what matters is the existence of a corresponding department tuple, which can only be expressed as a predicate read. The problem is that if these integrity read operations run under snapshot isolation, integrity constraints could be violated.

Footnote 10: In this paper we do not consider the SQL CASCADE option where the delete/update of the department tuple would automatically delete/update all corresponding employee tuples.

Example 4. In fact, the most common example given in the literature to show that SI does not provide serializability is an example of the violation of the constraint that the sum of two given accounts should be above zero. If transactions want to withdraw money from one of the accounts, the values of both accounts have to be checked. Let x and y be such accounts with primary key values id=‘a1’ and id=‘a2’. Let $T_0$ have created versions $x_0$ and $y_0$ with balances of 50 for both. A further account $z_0$ exists in the accounts table. Now assume two concurrent transactions, one withdrawing 80 from x and the other 80 from y.

$H_{\text{skew}}$: $s_1 \; s_2$
  $r_1(\text{sum(balance)} \geq 80 : \text{id=a1} \lor \text{id=a2} : \{x_0, y_0\} : \{x_0, y_0, z_0\})$
  $r_2(\text{sum(balance)} \geq 80 : \text{id=a1} \lor \text{id=a2} : \{x_0, y_0\} : \{x_0, y_0, z_0\})$
  $w_1(x_1) \; w_2(y_2) \; c_1 \; c_2$

The check is modeled as a predicate read (see Section 2.1.3). Input is $Iset(P) = \{x_0, y_0, z_0\}$, the predicate of P is id=a1 ∨ id=a2 and thus, P finds the matching set $Oset(P) = \{x_0, y_0\}$. The evaluation F is sum(balance) ≥ 80, executes over Oset(P) and returns true. Both transactions perform the same predicate read over the same versions, which were the committed versions as of the start time of $T_1$ and $T_2$. At the end of execution, the sum over both balances is below zero. The SSG is shown in Figure 8. The execution is SI but integrity constraints are violated.
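The arithmetic of Example 4 can be replayed in a few lines. The sketch below is a toy model under stated assumptions, not a database implementation: the snapshot is a plain dictionary, and both transactions evaluate their check against the versions created by T0 before either write becomes visible.

```python
def run_write_skew():
    """Replay Example 4: two withdrawals of 80 checked against the same snapshot."""
    snapshot = {"x": 50, "y": 50}      # versions x0 and y0 created by T0
    committed = dict(snapshot)

    def check(amount):
        # Predicate read of both accounts, evaluated on the snapshot.
        return snapshot["x"] + snapshot["y"] >= amount

    ok1 = check(80)                    # T1's read: 100 >= 80
    ok2 = check(80)                    # T2's read: same snapshot, same result
    if ok1:
        committed["x"] = snapshot["x"] - 80   # w1(x1)
    if ok2:
        committed["y"] = snapshot["y"] - 80   # w2(y2)
    return committed


final = run_write_skew()
print(final, sum(final.values()))  # {'x': -30, 'y': -30} -60
```

Each check passes on its own, yet the combined balance ends at -60.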

The problem is that reading from a snapshot is not the right thing to do for checking integrity constraints because it does not really help if the constraint holds at the beginning of the transaction. Instead, the constraint needs to hold at the time the transaction commits.

4.2 A new isolation level: SI+IC

SI-based database systems guarantee that integrity constraints are not violated by distinguishing between standard read operations, which read from a snapshot, and integrity reads that are done to check constraint violations. We model a new isolation level SI+IC based on integrity reads. It is stronger than the basic SI that we discussed in the last two sections, because it avoids integrity constraint violations. It is weaker than serializability because standard read operations continue to read from a snapshot. An SI+IC history should satisfy the following two requirements.

(1) It should provide SI properties to operations not related to integrity constraints;

(2) If a transaction commits, its updates do not violate the integrity of the database.

We model an integrity read operation as a special form of a predicate read.

Definition 4.1. Integrity Read. An integrity read operation of transaction $T_i$ is a special predicate read operation $ir_i(F:P:Oset(P):Iset(P)) = \{f, t\}$ where the evaluation function F always returns a boolean outcome of either true (t) or false (f). Furthermore, the predicate in function P may only contain single-record conditions, i.e., for any $x_j \in Iset(P)$, $x_j \in Oset(P)$ if and only if $P(\{x_j\}) = \{x_j\}$.

Requiring that the predicate be evaluated individually on each version in Iset(P), without taking the other versions in Iset(P) into account, disallows complex conditions such as joins. We will need this restriction to define anti-dependencies appropriately. No such restriction is needed for F. Note that most common integrity constraints can be checked using our definition of integrity reads. This is true, for instance, for all examples of integrity constraints discussed in this paper.

Example 5. Assume in our foreign key example tables D(did, location) and E(eid, ename, did), with D consisting of $d1_0$=(‘d1’, ‘Chicago’) and $d2_0$=(‘d2’, ‘New York’) inserted by transaction $T_0$. When transaction $T_1$ inserts a new employee (‘e1’, ‘Mike’, ‘d1’) it performs an integrity read $ir_1(\neq\emptyset : \text{D.did=d1} : \{d1_0\} : \{d1_0, d2_0\})$. The predicate defined in P is D.did=d1, searching for records in D with id d1. The versions accessed are $Iset(P) = \{d1_0, d2_0\}$. The function F is $\neq\emptyset$. It receives the only matching version $Oset(P) = \{d1_0\}$ as input, and thus, returns true.


Of course, only performing integrity reads is not enough. The transaction must also perform the proper actions depending on the outcome of the integrity read.

Definition 4.2. IC-obeying. We say a transaction $T_i$ is IC-obeying if it performs the integrity reads necessary to confirm that its write operations do not violate integrity constraints and it aborts when at least one of these integrity reads returns false.

Example 6. Let us stay with the foreign key example. Assume the above tables D(did, location) and E(eid, ename, did) and the same data versions $d1_0$ and $d2_0$ in the department table. Furthermore, the employee table has two unborn data versions $e1_{init}$ and $e2_{init}$. Now assume a transaction $T_1$ inserts employee e1 and transaction $T_2$ deletes the department d1.

T1: insert into E values (‘e1’, ‘Mike’, ‘d1’);
T2: delete from D where did=‘d1’;

Now assume a serial execution where $T_1$ runs before $T_2$. For simplicity, we ignore in this and all following examples that $T_1$ also needs to check a primary key constraint.

$H_{\text{IC-obey}}$: $s_1 \; ir_1(\neq\emptyset : \text{D.did=d1} : \{d1_0\} : \{d1_0, d2_0\}){=}t \; w_1(e1_1) \; c_1$
  $s_2 \; ir_2(=\emptyset : \text{E.did=d1} : \{e1_1\} : \{e1_1, e2_{init}\}){=}f \; a_2$

$T_1$'s integrity read determines that a department tuple with department id d1 exists and returns true. Thus, $T_1$ performs the insert and commits. After that, $T_2$'s integrity read determines that the department to be deleted already has an employee and returns false. Thus, $T_2$ aborts.

As mentioned above, it is not the transaction written by the application programmer that performs the integrity reads. Instead, the database system automatically extends the application transaction with the necessary integrity reads and forces it to be IC-obeying. In commercial systems the integrity read typically takes place before the corresponding write operations or just at commit time (using deferred constraint checking). In theory, it could be any time during the execution of the transaction. The important issue is that the integrity constraint should hold at the time the transaction commits. That is, while the read takes place sometime before the commit, it should still be valid at the time of commit. It is useless if a transaction T performs an integrity read on an object x, but the object x is overwritten before T commits in such a way that the integrity constraint does not hold anymore. This is exactly the problem of history $H_{\text{skew}}$ of Example 4. While $T_2$'s read finds a sufficiently large balance, the balance is too low at commit time.

The question is what it means that the integrity read is still valid at the time of commit. We can observe that the outcome of an integrity read $ir_i(F:P:Oset(P):Iset(P))$ can only be changed by a write operation if it affects Oset(P), as this is the input for the evaluation function F. For instance, in the above foreign key example, it matters whether $T_2$ performs its integrity read before $T_1$'s insert ($Oset(P) = \{\}$ and thus evaluation F returns true) or after the insert ($Oset(P) = \{e1_1\}$ and F returns false). In contrast, if $T_1$ inserted (‘e1’, ‘Mike’, ‘d2’), then $T_2$'s integrity read would return true independently of when $T_1$'s insert occurs, because Oset(P) would always be the empty set.

We express such behavior by defining anti-dependencies for integrity reads differently from those for ordinary reads.


Definition 4.3. IC-dependencies. Let H be a history over transactions $\mathcal{T}$. Let $T_i \in \mathcal{T}$ perform an integrity read $ir_i(F:P:Oset(P):Iset(P)) = t$.

1. $\forall x_j \in Iset(P)$, $T_i$ directly IC-read-depends on $T_j$.

2. $\forall x_j \in Oset(P)$, and $x_k$ follows $x_j$ in the version order, $T_k$ directly IC-anti-depends on $T_i$.

3. $\forall x_j \in Iset(P) \setminus Oset(P)$ and $x_k$, $T_k$ directly IC-anti-depends on $T_i$ if the following conditions are fulfilled:
   • $x_j \ll x_k$ and
   • $P(\{x_k\}) = \{x_k\}$ and
   • $\forall x_l$ such that $x_j \ll x_l \ll x_k$: $P(\{x_l\}) = \emptyset$

Property (1) defines IC-read-dependencies in the same way as for normal predicate reads. Property (2) indicates that $T_k$ directly IC-anti-depends on $T_i$ if there is a data item x, the version $x_j$ accessed by $T_i$'s integrity read matches the predicate and $T_k$ creates the next version for x. This IC-anti-dependency reflects that if $T_i$'s integrity read accessed $x_k$ instead of $x_j$, the outcome of its evaluation F might change. Property (3) indicates that $T_k$ directly IC-anti-depends on $T_i$ if there is a data item x, the version $x_j$ accessed by $T_i$ does not match the predicate, and $T_k$ is the first transaction to create a version of x that matches the predicate while all versions $x_l$ that are in the version order after $x_j$ but before $x_k$ do not match the predicate. If $T_i$'s integrity read accessed $x_l$ instead of $x_j$, the outcome of F would not change as neither $x_j$ nor $x_l$ appear in Oset(P), thus $T_l$ does not IC-anti-depend on $T_i$. However, if $T_i$'s integrity read accessed $x_k$ instead of $x_j$, Oset(P) would contain $x_k$ and the outcome of F could change.
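Properties (2) and (3) of Definition 4.3 can be sketched as a small function. The representation is an assumption made for illustration: each data item maps to its versions in version order as (writer, value) pairs, unborn and dead versions carry the value None, and the single-record predicate P is a Python function on a value.

```python
def ic_anti_dependents(versions, accessed, pred):
    """Transactions that directly IC-anti-depend on the reader of an integrity read.

    versions: item -> list of (writer, value) in version order.
    accessed: item -> index of the version in Iset(P) that the read accessed.
    pred:     single-record predicate P, applied to one value at a time.
    """
    result = set()
    for item, idx in accessed.items():
        later = versions[item][idx + 1:]
        if not later:
            continue
        if pred(versions[item][idx][1]):
            # Property (2): accessed version is in Oset(P); the writer of the
            # next version anti-depends on the reader.
            result.add(later[0][0])
        else:
            # Property (3): the first later version that matches P.
            for writer, value in later:
                if pred(value):
                    result.add(writer)
                    break
    return result


def is_d1(value):
    return value == "d1"   # value models the did attribute; None = unborn/dead


# Department table after a delete of d1 by T2; T1's read accessed d1_0 and d2_0.
dept = {"d1": [("T0", "d1"), ("T2", None)], "d2": [("T0", "d2")]}
print(ic_anti_dependents(dept, {"d1": 0, "d2": 0}, is_d1))   # {'T2'}

# Employee table after an insert of e1 by T1; T2's read accessed e1_init and e2_init.
emp = {"e1": [("Tinit", None), ("T1", "d1")], "e2": [("Tinit", None)]}
print(ic_anti_dependents(emp, {"e1": 0, "e2": 0}, is_d1))    # {'T1'}
```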

With this, we express the following requirements for integrity reads.

Definition 4.4. IC-Consistency. Let H be a history over a set of transactions T. An integrity read operation iri(F:P:Oset(P):Iset(P))=t of committed transaction Ti ∈ T is IC-consistent if the following holds.

(1) If Ti directly IC-read-depends on transaction Tj due to this integrity read then cj ≺t ci.

(2) If transaction Tk directly IC-anti-depends on Ti due to this integrity read then ci ≺t ck.

Property (1) guarantees that the read reflects a committed version at the time Ti commits. Property (2) guarantees that any transaction that changes the outcome of the integrity read commits after Ti. If all integrity reads of a transaction T are IC-consistent and T is IC-obeying, then it is guaranteed that the integrity constraints related to T’s write operations hold when T commits.

Example 7. Let us continue with the same setup as in Example 6 but with an interleaved execution. In the following history, although the transactions are IC-obeying, the foreign key constraint is violated at the end of the execution.

HIC−bad : s1 s2 ir1(≠∅:D.did=d1:{d10}:{d10, d20})=t ir2(=∅:E.did=d1:{}:{e1init, e2init})=t w1(e11) w2(d1dead) c2 c1

T1’s integrity read finds a department with id d1. Hence, T1 can continue to insert the employee tuple. Similarly, T2’s integrity read finds no employee associated with the department. Hence, T2 can continue to delete the department. After both commit, the employee (‘e1’, ‘Mike’, ‘d1’) refers to a non-existing department. Clearly this history does not respect foreign key constraints. A closer look reveals that T2 directly IC-anti-depends on T1 as T1’s integrity read accesses data version d10, d10 is in Oset(P) and T2 creates the next version d1dead of d1. However, T1 does not commit before T2. Thus, property (2) of IC-consistency is violated and T1’s integrity read is not IC-consistent. Note that T1 also directly IC-anti-depends on T2 as T2’s integrity read accesses e1init and e2init, neither of which matches P, and T1 creates data version e11 that matches P. Nevertheless, T2’s integrity read is IC-consistent, as T2 commits before T1.

We now derive our new isolation level as follows.

Definition 4.5. Snapshot Isolation and Integrity Constraints (SI+IC). A history H over a set of IC-obeying transactions T is an SI+IC history if it fulfills the Snapshot-Read and Snapshot-Write properties (Definitions 2.3 and 2.4), and all integrity reads of committed transactions are IC-consistent.

Using this definition, the history HIC−bad of Example 7 is not an SI+IC history because T1 has an integrity read that is not IC-consistent. If T1’s integrity read did actually read the version d1dead (leading to Oset(P) being empty and F returning false) but the remaining operations remained the same, then T1 would no longer be IC-obeying. The integrity read would detect that no department exists but the insert would nevertheless occur. In contrast, HIC−obey of Example 6 is an SI+IC history, as the transactions are IC-obeying, and T2’s integrity read is IC-consistent. Note that IC-consistency is not defined for T1’s integrity read as T1 does not commit.

In fact, our definition is somewhat stronger than what is needed. Let us explain this through an example.

Example 8. In a variation of the foreign key example, T2 does not delete the department tuple but simply changes the location of the department (e.g., update D set location = ‘New York’ where did = ‘d1’). Note that T2 does not need to perform an integrity read for this update. Consider the following execution:

HIC−rename : s1 s2 ir1(≠∅:D.did=d1:{d10}:{d10, d20})=t w1(e11) w2(d12) c2 c1

T1’s integrity read has the initial version d10 matching the predicate and the write operation is executed. Then, T2 renames the department, creating d12, and commits before T1 terminates. As T2 creates a new data version d12 where the previous version d10 is an element of Oset(P) of T1’s integrity read, the integrity read is not IC-consistent, and thus HIC−rename is not considered SI+IC.

However, the history does not violate integrity constraints. If the integrity read were performed on d12, the outcome would still be true. An execution with deferred integrity reads (performed at commit time) would capture this fact:

HIC−rename′ : s1 s2 w1(e11) w2(d12) c2 ir1(≠∅:D.did=d1:{d12}:{d12, d20})=t c1

If the integrity read is performed at commit time on the latest committed versions, then the true state of the database at commit time is captured. In HIC−rename′ both transactions commit and the history is SI+IC.

Although our definition is too restrictive, i.e., some histories that do not violate integrity constraints (e.g., HIC−rename) are not considered SI+IC, we think it is appropriate as it is simple and, as will be shown in the next section, captures well how locking-based integrity reads and deferred integrity checking work.

4.3 Implementing Integrity Constraints

Many commercial database systems use locking for integrity reads. As integrity reads are mostly predicate reads, this is tricky. Thus, often only primary key, foreign key, and constraints on individual tuples are handled correctly, as they can be implemented through locks on the primary key index.

In many cases, the integrity read takes place immediately before the corresponding write operation is executed. The read does not read from a snapshot but the latest committed version and it has to be guaranteed that the outcome of evaluation does not change until commit time. Thus, long locks are set.

Example 9. Assume again T1 inserting an employee and T2 deleting the department. T1 has to get a shared lock on the department key d1 before inserting the employee tuple. T2 has to get a write lock on d1 as it is going to delete this record. Furthermore, it has to check for employee records with foreign key d1. It has to find all committed entries, i.e., it may not read from the snapshot. Let us denote by S a shared and by X an exclusive lock request. Then, a possible history is:

Hlocks : s1 s2 S1(D.did=d1) X2(D.did=d1)[blocked] ir1(≠∅:D.did=d1:{d10}:{d10, d20})=t w1(e11) c1 ir2(=∅:E.did=d1:{e11}:{e11, e2init})=f a2

In Hlocks T1 is the first to get the shared lock on d1; it then finds a department record. When T2 now tries to get an exclusive lock on d1, T2 has to wait. T1 inserts the employee tuple, commits, and releases its lock. Now T2 gets the lock, performs the integrity read over all committed versions of employee records and finds the record inserted by T1. It has to abort. Note that if a transaction wants to delete or update d2 it can do so concurrently as T1 and T2 only set locks on d1.

If T2 got the lock first, then T1 would be blocked. T2 would check the employee table with only unborn versions, and thus delete d1, commit and release its locks. After that T1 would get the lock on d1, find no department tuple, and abort.

Two parts play a role in the correct implementation. The lock on d1 guarantees that the conflict is detected and one transaction is blocked until the other terminates. The fact that the integrity read of T2 does not access a snapshot but the latest committed versions guarantees that no updates are missed.

Integrity constraints can also be defined with the “deferred option”. A possible implementation can be as follows. The write operation first executes without checking any integrity violation. At the end of the transaction, a validation takes place, performing integrity reads on the latest committed values. The values of the transaction’s own writes can be considered. For simplicity of description, we assume validation and commit are done atomically so no locks need to be set.

Example 10. Taking again the example above, one possible history could be

Hopt : s1 s2 w1(e11) w2(d1dead) ir1(≠∅:D.did=d1:{d10}:{d10, d20})=t c1 ir2(=∅:E.did=d1:{e11}:{e11, e2init})=f a2

Both perform their writes. T1 first enters validation, reads the last committed version d10 of d1 and commits. Thanks to the atomicity of validation, T2 reads the last committed version e11 of e1 during its validation. It has to abort. If validation is performed in reverse order, the situation is similar.

Dependency type            SSG edge                       Name
Directly IC-read-depends   $T_i \xrightarrow{wir} T_j$    IC-read-dependency edge or wir-edge
Directly IC-anti-depends   $T_i \xrightarrow{irw} T_j$    IC-anti-dependency edge or irw-edge
commit-depends             $T_i \xrightarrow{C} T_j$      commit-dependency edge

Table III. IC dependencies

The implementation of integrity reads to check primary key constraints is similar. In a locking-based approach, the primary key value would be locked. For other constraints on a single record (e.g., the balance may not be below zero), SI already disallows two transactions to concurrently perform updates. For constraints spanning more records (e.g., the sum of the balances of a set of accounts), more advanced predicate locks would be needed. Thus, many systems do not allow the specification of such constraints. In particular, only a few systems support assertions. In such a case, the application has to include explicit read operations in the transactions. However, in this case, the database system is not able to recognize them as integrity reads, and thus typically lets them, incorrectly, read from a snapshot.

4.4 SI+IC in GID

We have seen in Section 2 how we can check a set of phenomena (G-1, G-SI) to determine whether a history runs under SI. In this section, we show how we can extend the list of phenomena to check whether a history runs under SI+IC.

Apart from the IC-read- and IC-anti-dependencies that were already introduced in the last section, we say that Tj commit-depends on Ti if Ti commits before Tj commits. All new dependencies are summarized in Table III. We have to extend the definition of SSG to include these new dependencies.

Definition 4.6. Start-ordered Serialization Graph (SSG). The SSG(H) of a history H over a set of IC-obeying transactions T is a directed graph where each node in SSG(H) corresponds to a committed transaction in H, and there is a ww-, wr-, rw-, wir-, irw-, start-, or commit-dependency edge from Ti to Tj if Tj directly write-, directly read-, directly anti-, directly IC-read-, directly IC-anti-, start-, or commit-depends on Ti, respectively.

Given that the graph now has more types of edges, the question is how many of the phenomena G-1 and G-SI have to be adjusted to consider the new edges, and whether we have to add new phenomena. It turns out that we have to adjust very little. G-1 and G-SIa remain as they are. We only have to adjust G-SIb and add one new phenomenon:

—G-SIb: Missed Effects. A history H over a set of IC-obeying transactions T exhibits phenomenon G-SIb if SSG(H) contains a directed cycle with exactly one rw-dependency edge that is prefixed by a ww-, wr-, or start-dependency edge. We refer to such a cycle as a G-SIb cycle.

—G-IC: IC Violation. A history H over a set of IC-obeying transactions T exhibits phenomenon G-IC if SSG(H) contains a wir- or irw-dependency edge from Ti to Tj without there also being a commit-dependency edge from Ti to Tj.

G-IC reflects the requirements that an object version accessed in an integrity read must be installed before the reading transaction commits (wir-dependency edge accompanied by a commit-dependency edge) and that if a later version changes the outcome of an integrity read, then it is only installed after the reading transaction commits (irw-dependency edge accompanied by a commit-dependency edge). G-SIb is simply extended to reflect that the phenomenon only occurs if the rw-dependency edge in the cycle is prefixed by ww-, wr- or start-dependency edges, as SI+IC-histories are allowed to have a cycle where the rw-dependency edge is prefixed by a wir- or irw-dependency edge.
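The G-IC condition lends itself to a direct mechanical check. The following Python sketch is illustrative only: the edge encoding as `(source, target, kind)` tuples and the helper names `commit_edges` and `g_ic_violations` are assumptions, and edges to the initial transaction are left out for brevity.

```python
from itertools import combinations


def commit_edges(commit_order):
    """Derive commit-dependency edges (Ti, Tj) from a list of transactions
    given in commit order: Ti commits before Tj."""
    return set(combinations(commit_order, 2))


def g_ic_violations(ic_edges, commit_deps):
    """Return every wir-/irw-dependency edge that is not accompanied by a
    commit-dependency edge in the same direction (phenomenon G-IC).

    ic_edges    -- iterable of (src, dst, kind) with kind in {"wir", "irw"}
    commit_deps -- set of (src, dst) commit-dependency edges
    """
    return [(src, dst, kind) for (src, dst, kind) in ic_edges
            if (src, dst) not in commit_deps]
```

Applied to HIC−bad of Example 7, where T2 commits before T1 and each transaction IC-anti-depends on the other, `g_ic_violations([("T1", "T2", "irw"), ("T2", "T1", "irw")], commit_edges(["T2", "T1"]))` reports the irw-edge from T1 to T2, which is the edge lacking a commit-dependency edge.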

Example 11. Assume a transaction T0 created versions x0, y0 and z0. Now assume the following history

Hcycle : s1 s2 r1(x0) w2(x2) w2(y2) c2 s3 r3(y2) w3(z3) c3 ir1(F:P:{z3}:{z3})=t c1

This history is SI+IC as the read operations r1(x0) and r3(y2) read committed versions as of transaction start, no conflicting writes exist, and T1’s integrity read accesses z3 which is the latest installed version at the time T1 commits. As SSG(Hcycle) (Fig. 9) contains a cycle where an rw-dependency edge is prefixed by a wir-dependency and a commit-dependency edge, such cycles need to be allowed.

We now show that the avoidance of G-1, G-SI and G-IC is sufficient and necessary for a history to be SI+IC.

Theorem 4.7. Necessary conditions for SI+IC. An SI+IC history H over a set of IC-obeying transactions T avoids G-1, G-SI and G-IC.

Proof Sketch. As G-1 and G-SIa are not concerned with integrity reads, the main part of the proof is to show that G-IC and the new definition of G-SIb are avoided. This is straightforward for G-IC. If Tj directly IC-read-depends on Ti, then an SI+IC history orders ci ≺t cj. Therefore, the wir-dependency edge $T_i \xrightarrow{wir} T_j$ in SSG(H) is accompanied by a $T_i \xrightarrow{C} T_j$ edge. If Tj directly IC-anti-depends on Ti, then an SI+IC history orders ci ≺t cj. Therefore, the irw-dependency edge $T_i \xrightarrow{irw} T_j$ in SSG(H) is accompanied by a $T_i \xrightarrow{C} T_j$ edge.

Assume that G-SIb is not avoided. There will be a cycle in which the rw-dependency edge is prefixed by a ww-, wr- or start-dependency edge. Since G-SIa and G-IC are avoided, there must also be a cycle that consists only of start- and commit-dependency edges and a single rw-dependency edge. That is, the cycle has the form

$(T_i \xrightarrow{S^*} T_j \xrightarrow{C^*} T_k)^* \xrightarrow{S^+} T_p \xrightarrow{rw} T_i$.

One can derive that this implies (ci ≺t sj ≺t cj ≺t ck) ≺t sp ≺t ci in H which is impossible. The detailed and complete proof is given in Appendix B.1.

Fig. 9. SSG(Hcycle) of Example 11

Fig. 10. SSG(HIC−bad) of Example 12

Theorem 4.8. Sufficient Conditions for SI+IC. If a history H over a set of IC-obeying transactions T avoids G-1, G-SI and G-IC, then it is an SI+IC history.

Proof Sketch. We have to show that H fulfills the Snapshot-Read and Snapshot-Write properties and all its integrity reads are IC-consistent. For Snapshot-Read and Snapshot-Write we refer to Appendix B.2. Showing that there is no integrity read that is not IC-consistent is again straightforward. As the avoidance of G-IC guarantees that each IC-dependency edge from Ti to Tj has a commit-dependency edge in the same direction, the proper commit order required by the IC-consistency Definition 4.4 is always maintained. The details are given in Appendix B.2.

Example 12. Let us revisit HIC−bad of Example 7.

HIC−bad : s1 s2 ir1(≠∅:D.did=d1:{d10}:{d10, d20})=t ir2(=∅:E.did=d1:{}:{e1init, e2init})=t w1(e11) w2(d1dead) c2 c1

SSG(HIC−bad) is shown in Figure 10. In the figure, the irw-dependency edge from T2 to T1 is associated with a commit-dependency edge, but the other irw-dependency edge is not. Hence, HIC−bad exhibits the G-IC phenomenon. As discussed, it is not an SI+IC history because one of the integrity reads is not IC-consistent, and this anomaly is expressed through the G-IC phenomenon.

4.5 Observations

Theorem 4.8 states that it is sufficient to show that a history avoids G-1, G-SI and G-IC in order to know that it is an SI+IC history. Now we show that such a history avoids a further phenomenon:

—A history H over a set of IC-obeying transactions T exhibits phenomenon G-1c* if SSG(H) contains a cycle that consists entirely of wr-, ww-, wir-, and irw-dependency edges. We refer to such a cycle as a G-1c* cycle.

Lemma 4.9. An SI+IC history H avoids G-1c*.

Proof. Assume it has such a cycle. As the history avoids G-SIa and G-IC, there is also a cycle that consists only of commit- and start-dependency edges. This is impossible since each edge from Ti to Tj in the cycle implies ci ≺t cj, and thus transitively ci ≺t ci.
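As a rough illustration of how G-1c* could be tested on a concrete graph, the sketch below runs a depth-first search restricted to wr-, ww-, wir- and irw-edges. The edge encoding and the function name `has_g1c_star_cycle` are assumptions of this sketch, not notation from the formalism.

```python
def has_g1c_star_cycle(edges):
    """Return True iff the graph contains a cycle made up only of
    ww-, wr-, wir- and irw-dependency edges (a G-1c* cycle).

    edges -- iterable of (src, dst, kind) tuples describing an SSG
    """
    allowed = {"ww", "wr", "wir", "irw"}
    graph = {}
    for src, dst, kind in edges:
        graph.setdefault(src, set())
        graph.setdefault(dst, set())
        if kind in allowed:          # all other edge kinds are ignored
            graph[src].add(dst)

    WHITE, GREY, BLACK = 0, 1, 2     # unvisited, on the DFS stack, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:  # back edge closes a cycle
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in graph)
```

For the edges discussed for Hcycle in Example 11 (a wr- and start-edge from T2 to T3, a wir- and commit-edge from T3 to T1, and an rw-edge from T1 to T2) the function returns False, because the only cycle passes through an rw-edge.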

5. 1-COPY-SI+IC

In this section we extend our definition of 1-copy-SI to cover integrity constraints, denoting the new correctness criterion as 1-copy-SI+IC, and discuss sufficient conditions for a replicated history to be 1-copy-SI+IC.

A first issue is how to handle integrity reads in a replicated environment. Normal reads are executed at only one replica. However, an integrity read is tightly related to the write operation. It checks something that has to hold in order for the write operation to be allowed to execute. One possibility to ensure the proper behavior of the write is to include the integrity read at all replicas. We assume this execution model. Therefore, we extend the ROWA mapper function of Definition 3.1 to include integrity reads at all replicas. We denote as IRSi the set of all integrity read operations of transaction Ti.

Definition 5.1. Mapper function. A ROWA mapper function, rmap-ic, takes a set of IC-obeying transactions T and a set of replicas R as inputs, and transforms T into a set of IC-obeying transactions T′ = rmap-ic(T, R). rmap-ic(T, R) transforms each update transaction Ti ∈ T into a set of transactions $\{T_i^k \mid R_k \in R\}$. In this set there is exactly one local transaction $T_i^l$ where $WS_i^l = WS_i$, $IRS_i^l = IRS_i$ and $RS_i^l = RS_i$ (Ti is local at Rl). The rest are remote transactions $T_i^r$, where $WS_i^r = WS_i$, $IRS_i^r = IRS_i$ and $RS_i^r = \emptyset$ (Ti is remote at Rr). A read-only transaction Ti is transformed into a single local transaction $T_i^l$ with $RS_i^l = RS_i$. We denote as $T^k = \{T_i^k \mid T_i^k \in T'\}$ the set of transactions executed at replica Rk.

From there we define 1-copy-SI+IC as below. Recall that the USG(RH) of a replicated history RH contains the ww-, wr- and rw-dependency edges of the SSG($RH^k$) of all replicas.

Definition 5.2. 1-copy-SI+IC. Let $RH = \bigcup_{R_k \in R} RH^k$ be a replicated history over rmap-ic(T, R). We say that RH is 1-copy-SI+IC if

(1) For each Rk ∈ R, $RH^k$ is an SI+IC history;

(2) For all update transactions Ti ∈ T and for all Rk, Rl ∈ R: $c_i^k \iff c_i^l$;

(3) There exists a global SI+IC history H over IC-obeying T such that
(a) SSG(H) and USG(RH) have the same nodes.
(b) SSG(H) has exactly the same ww-, wr-, and rw-dependency edges as USG(RH).

Note that IC-dependency edges are not considered in USG(RH). Thus, local histories can have different integrity reads as long as all integrity reads have the same effect, i.e., either all local histories and the global history have integrity reads that return true, and thus the transaction commits, or all local histories and the global history have integrity reads that detect a violation, and thus abort the transaction. Which version of a data item each of the histories reads is not relevant, as long as the outcome of the integrity read is the same everywhere.

Example 13. Let’s revisit the example where T1 inserts an employee and T2 changes the name of the corresponding department. Again, before the execution there exist data versions d10, d20, e1init and e2init. A possible execution is

$RH^A_{rename}$ : $s_1^A$ $ir_1^A$(≠∅:D.did=d1:{d10}:{d10, d20})=t $w_1^A$(e11) $c_1^A$ $s_2^A$ $w_2^A$(d12) $c_2^A$

$RH^B_{rename}$ : $s_2^B$ $w_2^B$(d12) $c_2^B$ $s_1^B$ $ir_1^B$(≠∅:D.did=d1:{d12}:{d12, d20})=t $w_1^B$(e11) $c_1^B$

T1 performs an integrity read on d10 at RA, and on d12 at RB. In both cases, the subsequent write (insert of employee) can succeed. SSG($RH^A_{rename}$) and SSG($RH^B_{rename}$) are shown in Figure 11.(a) and (b), respectively. At the commit time of any of the transactions, no integrity constraint is violated. The USG(RHrename) only contains T1 and T2 but no edges. A global history could be equivalent to either $RH^A_{rename}$ or $RH^B_{rename}$. Although T1 reads different versions at the different replicas, this does not matter as both integrity reads return true.

Fig. 11. SSGs of Example 13: (a) SSG($RH^A_{rename}$), with an edge labeled irw, c, s between T1 and T2; (b) SSG($RH^B_{rename}$), with an edge labeled wir, c, s between T1 and T2

In the following we describe a set of sufficient conditions for a replicated history to be 1-copy-SI+IC. If these conditions hold then we are able to construct an SI+IC-history H that has the same wr-, ww-, and rw-dependency edges as USG(RH), and for every committed transaction Ti, there exists at least one replica Rk, such that the integrity reads of Ti in H access exactly the same data versions as the integrity reads of $T_i^k$ in $RH^k$. If $RH^k$ is an SI+IC-history over IC-obeying transactions, we have the guarantee that these integrity reads return true. Therefore, if the corresponding transaction Ti in H performs the integrity reads over the same versions, we know that they also return true, and thus Ti is also IC-obeying. Note that we allow different transactions to have integrity reads from different replicas, e.g., Ti can have the same integrity reads as in $RH^k$, while Tj has the same integrity reads as in $RH^l$. But we require all integrity reads of an individual transaction Ti to be taken from one local history because they might be related to each other (e.g., the sum of x and y may not be below 100).

Definition 5.3. Union Serialization Graph with Integrity Dependencies (USG-IC). Let $RH = \bigcup_{R_k \in R} RH^k$ be a replicated history over rmap-ic(T, R). We denote as USG-IC(RH) the following graph.

(1) It has the same nodes as USG(RH).

(2) It has the same ww-, wr-, and rw-dependency edges as USG(RH).

(3) For each Ti ∈ T, there exists Rk ∈ R such that each wir-dependency edge from $T_j^k$ to $T_i^k$ in SSG($RH^k$) has a corresponding wir-dependency edge from Tj to Ti in USG-IC(RH), and each irw-dependency edge from $T_i^k$ to $T_j^k$ in SSG($RH^k$) has a corresponding irw-dependency edge from Ti to Tj in USG-IC(RH).

(4) There are no further edges or nodes in USG-IC(RH).

Note that there is no unique USG-IC(RH) since there can be many combinations of choosing a local history $RH^k$ for a transaction Ti. That is, if there are n replicas and t transactions there could be as many as $n^t$ different USG-IC(RH).
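The $n^t$ bound can be made concrete by enumerating the choices of Definition 5.3. In this sketch the function names, the dictionary layout of the local IC-dependency edges, and the notion of an edge "owner" (the transaction whose integrity read induces the edge: the target of a wir-edge, the source of an irw-edge) are illustrative assumptions.

```python
from itertools import product


def usg_ic_choices(transactions, replicas):
    """Yield every assignment transaction -> replica from which the
    transaction's IC-dependency edges are taken (n**t assignments)."""
    for combo in product(replicas, repeat=len(transactions)):
        yield dict(zip(transactions, combo))


def usg_ic_edges(choice, local_ic_edges):
    """Collect the IC-dependency edges of one candidate USG-IC(RH).

    choice         -- dict: transaction -> chosen replica
    local_ic_edges -- dict: replica -> set of (src, dst, kind) edges of
                      SSG(RH^k), with kind in {"wir", "irw"}
    """
    edges = set()
    for txn, replica in choice.items():
        for src, dst, kind in local_ic_edges[replica]:
            # A wir-edge belongs to the reading (target) transaction,
            # an irw-edge to the reading (source) transaction.
            owner = dst if kind == "wir" else src
            if owner == txn:
                edges.add((src, dst, kind))
    return edges
```

For Example 13, with the irw-edge from T1 to T2 at RA and the wir-edge from T2 to T1 at RB, the four assignments collapse into two distinct graphs, since only T1 performs an integrity read. The bound $n^t$ is therefore an upper bound on the number of distinct USG-IC(RH).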

Theorem 5.4. Sufficient Conditions for 1-copy-SI+IC. Let RH be a replicated history over rmap-ic(T, R). RH is 1-copy-SI+IC if the following holds:

—For each Rk ∈ R, $RH^k$ is an SI+IC-history.

—For all update transactions Ti ∈ T and for all Rk, Rl ∈ R, $c_i^k \iff c_i^l$;

—There exists a USG-IC(RH) that has no G-1c* or G-SIb* cycles.

Fig. 12. USG-IC and SCSG of RHrename: (a) USG-IC(RHrename); (b) complete SCSG(RHrename)

Proof Sketch. The proof is similar to the one for Theorem 3.9. We have to construct an SI+IC-history H with the same nodes and the same ww-, wr-, and rw-dependency edges as USG(RH). Again, we only describe the main ideas and refer to Appendix C.1 for the details. We use RHrename of Example 13 as an example to illustrate our steps. We choose for USG-IC(RHrename) the IC-dependency edge from SSG($RH^A$). Thus, USG-IC(RHrename) in Figure 12.(a) consists of nodes T1 and T2 with a single irw-dependency edge from T1 to T2.

Part 1: We build a similar SCSG(RH) graph as we have done for Theorem 3.9 which provides a total order between pairs of start- and commit operations. As USG-IC(RHrename) does not have any ww-, wr- or rw-dependency edges, s1 and s2 are both ordered before c1 and c2. Additionally, we also relate some commit pairs. Whenever there is a wir- or irw-dependency edge from Ti to Tj in USG-IC, we require ci ≺t cj, and thus we connect ci to cj in SCSG. This reflects that the global history must have IC-consistent integrity reads. In our example, we require c1 ≺t c2. Figure 12.(b) shows the completed SCSG(RHrename) which remains acyclic. The detailed proof in Appendix C.1 shows that this construction generally provides an acyclic SCSG, and thus a partial order ≺t, if USG-IC(RH) has no G-1c* and G-SIb* cycles.

The ≺t order of read and write operations, the version order and the versions read by read operations are determined as in the proof of Theorem 3.9. Additionally, for each integrity read operation iri of committed Ti we need to determine the set of versions accessed. When constructing USG-IC(RH), let Rk ∈ R be the replica such that the wir/irw-dependency edges for Ti were taken from SSG($RH^k$). Then we let iri access the same versions as the corresponding $ir_i^k$ accessed in $RH^k$. In our example T1 performs in Hrename the same integrity read as T1 in $RH^A_{rename}$. Thus, a possible final global history is

Hrename : s1 s2 ir1(≠∅:D.did=d1:{d10}:{d10, d20})=t w1(e11) c1 w2(d12) c2

Part 2: We have to show that the SSG(H) of the newly constructed global history H has the same ww-, wr- and rw-dependency edges as USG(RH). This part of the proof is the same as part (2) of the proof of Theorem 3.9. For our example, it is trivially true, since there are no ww-, wr- or rw-dependency edges.

Part 3: Finally, we have to show that H is an SI+IC history. As most of the proof is similar to the proof of Theorem 3.9, we only look at integrity constraints. As the global history performs the integrity reads of committed transactions on the same data versions as the corresponding integrity reads in one of the local histories, the outcome must be the same, namely true. Thus, all committed transactions are IC-obeying. Furthermore, SSG(H) now has the same IC-dependency edges as USG-IC(RH). As we set ci ≺t cj in our construction of H whenever USG-IC(RH) has an IC-dependency edge from Ti to Tj, H avoids G-IC. The details are given in Appendix C.1.

6. REPLICATION PROTOCOL

In this section, we show how our formalism can be used to prove the correctness of replica control protocols. We first present SRCA, a replica control protocol presented in [Lin et al. 2005]. We show that this protocol provides 1-copy-SI+IC. We then extend the protocol to accommodate some extensions proposed in [Lin et al. 2005] and show that this extended protocol SRCA-Ex is still 1-copy-SI but no longer 1-copy-SI+IC. Finally, we provide a third protocol SRCA-2PC, a simple extension of SRCA-Ex, that again is 1-copy-SI+IC.

In all protocols, there is one middleware that coordinates transaction execution among the database replicas R. We assume that integrity constraints are checked in deferred mode. Furthermore, we assume that all database instances implement SI in the following way. A write of transaction Ti on data item x creates a new version; a read of transaction Ti on data item x reads the last version of x that was committed before Ti started (or its own version if it has created one). Snapshot-Write uses the first-committer-wins strategy. Transactions perform their write operations optimistically, creating new versions on-the-fly. At commit time of a transaction Ti, a validation takes place. If a transaction Tj committed after Ti started and wrote one of the objects that was written by Ti, then Ti has to abort and all versions it created are discarded. Otherwise, validation succeeds. Ti commits and its versions become the latest committed versions.
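The assumed first-committer-wins validation can be sketched as follows. The class `SIDatabase`, its logical clock, and the dictionary used to represent a transaction are hypothetical simplifications; reads, version storage and integrity checks are omitted.

```python
class SIDatabase:
    """Toy model of commit-time validation under snapshot isolation."""

    def __init__(self):
        self.clock = 0        # logical time, advanced on begin and commit
        self.committed = []   # list of (commit_time, writeset)

    def begin(self):
        self.clock += 1
        return {"start": self.clock, "writes": set()}

    def write(self, txn, item):
        # Writes are optimistic: they only record the item in the writeset.
        txn["writes"].add(item)

    def commit(self, txn):
        """Return True if txn commits, False if it must abort."""
        for commit_time, writeset in self.committed:
            # A transaction that committed after txn started and wrote a
            # common item forces txn to abort (first committer wins).
            if commit_time > txn["start"] and writeset & txn["writes"]:
                return False
        self.clock += 1
        self.committed.append((self.clock, frozenset(txn["writes"])))
        return True
```

Two concurrent transactions that both write x illustrate the rule: whichever requests its commit first succeeds, and the other one aborts. A transaction that starts after the first commit and writes x again is not affected.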

6.1 SRCA Protocol

This section presents the SRCA protocol proposed in [Lin et al. 2005]. It works as follows. The client connects to the middleware via a standard database interface such as JDBC. When the client starts a transaction Ti, the middleware chooses any database replica Rl ∈ R as local replica. All operations of Ti are forwarded to this replica and executed within transaction $T_i^l$. At commit time, if Ti was a read-only transaction, $T_i^l$ is simply committed at its local replica. If Ti is an update transaction, the middleware extracts the records changed by $T_i^l$ from Rl. These changed records represent the writeset of Ti. It then performs a validation similar to the one within the database system described above. For that purpose it keeps track of the writesets of all committed transactions and uses a timestamp mechanism to determine whether transactions are concurrent. Validation checks whether Ti’s writeset overlaps with the writeset of any transaction Tj that already validated and is concurrent to Ti. If such an overlap exists, Ti’s validation fails and the middleware aborts $T_i^l$. Otherwise, validation succeeds and Ti’s writeset has to be applied at all replicas. For that, the middleware keeps a queue Qk for each database replica Rk. It appends Ti and its writeset to the queues of all replicas. At the local replica Rl, when Ti is the first in Ql, the middleware commits $T_i^l$ and removes Ti from Ql. At a remote replica Rr, when Ti is the first in Qr, the middleware starts $T_i^r$, applies the changes, commits $T_i^r$ and removes Ti from Qr. If a commit fails, the middleware tracks this accordingly.

Note that the protocol does not conform to our mapper functions defined in Definitions 3.1 and 5.1. An update transaction Ti that aborts does not have matching transactions $T_i^r$ at remote replicas. We can simply imagine dummy transactions at these remote replicas consisting only of start and abort operations.

Fig. 13. Example 14: (a) Execution; (b) SSGs
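The validation and queueing steps of SRCA described above can be pictured with the following Python sketch. It is a simplification under stated assumptions and not the implementation of [Lin et al. 2005]: the class `Middleware`, its method names, and the use of a validation counter as the timestamp mechanism are hypothetical, and the actual interaction with the replicas (starting, applying and committing transactions) is reduced to removing the head of a queue.

```python
from collections import deque


class Middleware:
    """Toy model of SRCA validation and per-replica queues."""

    def __init__(self, replicas):
        self.queues = {r: deque() for r in replicas}  # Qk for each replica Rk
        self.validated = []   # list of (validation_timestamp, writeset)
        self.ts = 0           # assumption: counter of validated transactions

    def begin(self):
        # Timestamp taken at transaction start; used to decide concurrency.
        return self.ts

    def request_commit(self, txn_id, start_ts, writeset):
        """Return True if the transaction may commit, False if it is aborted."""
        writeset = frozenset(writeset)
        if not writeset:
            return True       # read-only: committed at its local replica only
        for ts, other in self.validated:
            # Validated after txn started (hence concurrent) and overlapping.
            if ts > start_ts and other & writeset:
                return False
        self.ts += 1
        self.validated.append((self.ts, writeset))
        for queue in self.queues.values():
            queue.append((txn_id, writeset))
        return True

    def apply_next(self, replica):
        """Remove and return the head of the replica's queue, or None."""
        queue = self.queues[replica]
        return queue.popleft() if queue else None
```

With three concurrent transactions writing x, y and x respectively, the first two validate and are appended to every queue in validation order, while the third fails validation and is never enqueued.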

Example 14. Figure 13.(a) shows an example execution. The set of transactions is T = {T1 = w1(x), T2 = w2(y), T3 = w3(x), T4 = r4(x), r4(y), T5 = r5(x), r5(y)}. T1 and T4 are local at RA, while the rest are local at RB. We use grey boxes to identify remote transactions at each replica. The figure shows the temporal evolution of the middleware and database execution from left to right. Dashed lines indicate the causal relationship of events between middleware and databases. For better readability we omit superscripts, that is, we write T1 instead of $T_1^A$, as it should become clear from the description where the transaction executes.

T1 is started at RA concurrently to T2 and T3 at RB. All of them can finish execution until commit. Assume T1 is the first to submit the commit request. The middleware extracts the writeset and performs validation. T1’s validation succeeds and T1 is added to the queues QA and QB. Shortly after, T2 wants to commit; its validation succeeds as T1 and T2 do not have a write/write conflict. T2 is also added to queues QA and QB. When T3 now wants to commit, validation fails as the middleware detects that concurrent transaction T1 has already validated and has a write/write conflict with T3. Thus, the middleware tells RB to abort T3. T3 is not added to the queues. At RA T1 now commits and T2 starts, applies its writeset and commits. Before T2 commits, read-only transaction T4 starts at RA. As the database uses SI, it reads the version x1 created by T1 and y0 as T2 has not yet committed. At RB, T1 is started, the writeset applied and T1 committed. The execution succeeds within the database as concurrent transaction T2 has no write/write conflict and T3 is aborted. After T1’s commit, the middleware submits T2’s commit. Again it succeeds as there are no conflicts. Before T1 commits at RB, T5 is started, thus reading data versions x0 and y0. Figure 13.(b) shows the SSGs of the two local histories and the USG of the replicated history. No cycles exist.

Example 15. Our next example shows the foreign key constraint example with existing department records d10 and d20, T1 inserting the first employee e1 for this department and T2 deleting the department. A possible execution is as follows. T1 is submitted to RA and T2 to RB. At commit time of T1 the middleware extracts the writeset containing e11. Validation succeeds and T1 is added to queues QA and QB. Now T2 submits its commit request, and the middleware extracts the writeset containing d1dead. Validation again succeeds because there is no write/write conflict on any data item (the middleware does not check integrity constraints). T2 is added to both queues. Now the middleware submits the commit of T1 to RA. The integrity read accesses d10 at RA and returns true. Thus, T1 can commit at RA. Now T2 starts at RA, and the write operation is applied. When the middleware submits the commit for T2 to RA, the integrity read accesses e11 and returns false. Thus, T2 aborts at RA. At the same time, the middleware starts T1 at replica RB, executes the insert and submits the commit. T1 performs the integrity read accessing the last committed version d10 of the department (T2 has not yet committed at RB). Thus T1 also commits at RB. When the middleware now submits the commit of T2 to RB, the integrity read accesses the last committed version e11 of the employee tuple and T2 aborts. Therefore, although the middleware validates both transactions successfully, T2 aborts at both replicas. The middleware has to track this accordingly and adjust counters, etc. Thus, the execution is as follows. The A/B superscripts are omitted for better readability.

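$RH^A$: $s_1 \; w_1(e1_1) \; ir_1(\neq\emptyset : D.did{=}d1 : \{d1_0\} : \{d1_0, d2_0\}){=}t \; c_1 \;\; s_2 \; w_2(d1_{dead}) \; ir_2({=}\emptyset : E.did{=}d1 : \{e1_1\} : \{e1_1, e2_{init}\}){=}f \; a_2$

$RH^B$: $s_2 \; w_2(d1_{dead}) \; s_1 \; w_1(e1_1) \; ir_1(\neq\emptyset : D.did{=}d1 : \{d1_0\} : \{d1_0, d2_0\}){=}t \; c_1 \;\; ir_2({=}\emptyset : E.did{=}d1 : \{e1_1\} : \{e1_1, e2_{init}\}){=}f \; a_2$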

As only T1 commits, the graphs contain only T1 and will not be shown here.

6.2 SRCA is 1-copy-SI+IC

Theorem 6.1. SRCA provides 1-copy-SI+IC if the underlying DB replicas provide SI+IC using the first-committer-wins strategy and deferred mode for integrity constraints.

Proof. Based on Theorem 5.4, we need to prove that for any replicated history RH possible under the protocol, (i) all local histories $RH^k$ are SI+IC-histories, (ii) an update transaction commits at either none or all replicas, and (iii) there exists a USG-IC(RH) with no G-1c* and G-SIb* cycles.

Property (i) is fulfilled by assumption.

Property (ii). It is clear that an update transaction T that aborts at its local replica before or at time of validation is not even started at any remote replica. Thus, we only look at transactions that validate successfully. We show two properties. First, a validated transaction will not abort due to a write/write conflict. Second, the transaction will perform the same integrity reads at all replicas.

The first part is from [Lin et al. 2005]. The middleware submits the commit of a transaction $T_i^k$ to replica $R^k$ only if $T_i$ is the first in queue $Q^k$. If $T_i^k$ is a remote transaction, no other transaction commits between $T_i^k$’s start and $T_i^k$’s commit. Thus, when database instance $R^k$ performs the validation of $T_i^k$ internally at $T_i^k$’s commit time, validation succeeds. If $T_i^k$ is a local transaction, then $T_i^k$ has already started at $R^k$ when the commit is submitted to $R^k$. If $T_i^k$ had a write/write conflict with any other transaction that committed since the start of $T_i^k$, then the middleware would have detected a conflict at validation time of $T_i$ and not appended $T_i$ to any queue. Note that in all replicas there might be concurrent local transactions that have not yet validated. But these transactions are of no interest because they are not considered in any validation process.

For integrity reads we assume they are always checked at commit time and read the last committed version. We show that all transactions perform the same integrity reads based on the fact that commit requests are submitted in the same order at all replicas. Assuming that before the commit of the first transaction all replicas have the same state, the integrity reads of the first transaction have the same outcome at all replicas, and all replicas will either commit or abort the transaction (as transactions are IC-obeying). By induction, assume all replicas have committed the same set of n transactions in the same order. When transaction $T_{n+1}$ now performs its commit, its integrity reads access the same data versions at all replicas, and thus, the outcome is the same at all replicas.

Property (iii). We have just shown that all replicas commit the same set of write transactions in exactly the same order, and that all committed transactions perform their integrity reads on exactly the same data versions. This means, for any two replicas $R^k$ and $R^l$, if there is a ww-, wir- or irw-dependency edge from $T_i$ to $T_j$ in $SSG(RH^k)$, then there is the same edge from $T_i$ to $T_j$ in $SSG(RH^l)$. As a result, there exists actually only a single USG-IC(RH), since independently of which replica $R^k$ we choose for a transaction $T_i$, its IC-dependency edges are the same as in the other replicas. We now show that this USG-IC(RH) avoids G-1c* and G-SIb* cycles.

(1) Assume a G-1c* cycle exists in USG-IC(RH). There can be wr-, ww-, wir- and irw-dependency edges in the cycle. Note that all transactions in the cycle must be update transactions. This is true because each transaction in the cycle is the start node of a wr-, ww-, wir-, or irw-dependency edge. The start node of a wr-, ww- or wir-dependency edge is obviously an update transaction. Being the start node of an irw-dependency edge means the transaction performed an integrity read which is followed by a successful write operation. Thus, all transactions are update transactions, and thus, executed at all replicas. Each edge $T_i \xrightarrow{wr/ww/wir/irw} T_j$ in the cycle is taken from the $SSG(RH^k)$ of at least one replica $R^k$. As the replicas provide SI+IC, this implies that in $RH^k$ it holds that $c_i \prec_t c_j$. As both $T_i$ and $T_j$ are update transactions, they commit at all replicas in the same order. Thus $c_i \prec_t c_j$ holds at all replicas and a cycle is not possible.

(2) Assume a G-SIb* cycle exists. We break the cycle into q sections of

$$T_{i_p} \xrightarrow{(wr/ww/wir/irw)^*} T_{j_p} \xrightarrow{wr/ww} T_{k_p} \overset{rw}{\dashrightarrow} T_{i_{(p+1)\%q}} \quad (\text{where } 0 \le p < q)$$

In section p, $T_{i_p}$, $T_{j_p}$, and $T_{i_{(p+1)\%q}}$ must be write transactions while $T_{k_p}$ can be a read-only transaction. $T_{j_p} \xrightarrow{wr/ww} T_{k_p} \overset{rw}{\dashrightarrow} T_{i_{(p+1)\%q}}$ is derived from the $SSG(RH^l)$ of $T_{k_p}$’s local history $RH^l$. This implies $c_{j_p} \prec_t s_{k_p} \prec_t c_{i_{(p+1)\%q}}$, which means that $T^l_{j_p}$ commits before $T^l_{i_{(p+1)\%q}}$ at $R^l$. And since they are update transactions, their corresponding transactions in the other replicas commit in the same order.

Now let’s consider $T_{i_p} \xrightarrow{(wr/ww/wir/irw)^*} T_{j_p}$. As they are update transactions, this order occurs at all replicas and implies $c_{i_p} \prec_t c_{j_p}$. Therefore, for each section we have $c_{i_p} \prec_t c_{i_{(p+1)\%q}}$ in all local histories. Putting all sections together results in $c_{i_0} \prec_t c_{i_0}$, which is impossible. Hence, G-SIb* cannot happen in the USG-IC(RH) of a replicated history RH produced by SRCA.

The proof assumes that the database system validates transactions at commit time and checks integrity constraints at commit time. In case of a non-deferred mode for integrity constraints, or if write/write conflicts are detected during runtime, the protocol above might lead to deadlocks. The protocol would need to be extended for this purpose. With this in place, correctness reasoning would be similar but correspondingly more complex since more cases need to be considered.

The crucial characteristics of the protocol that are used in the proof are that all replicas provide SI+IC, and that all replicas commit successfully validated update transactions in the same order. However, they are not necessary conditions. Lemma 3.7 shows that conflicting transactions need to be committed in the same order, but this is not necessarily required for non-conflicting transactions. Nevertheless, committing all transactions in the same order makes it easy to show that all replicas decide on the same outcome of the transaction and that there are no G-1c* and G-SIb* cycles because any such cycle would appear in a local history.

6.3 SRCA-Extension

Lin et al. [2005] present an extension to SRCA that does not require all transactions to commit in the same order at all database replicas. In this section we show that with this extension, the protocol provides 1-copy-SI but no longer 1-copy-SI+IC.

The extended algorithm, denoted as SRCA-Ex, works as follows. Local execution of read-only and update transactions is as with SRCA. The same holds for the validation and the abort of failed transactions. The differences are as follows. If the validation of a transaction $T_i$ succeeds, the middleware commits $T_i^l$ at $T_i$’s local replica $R^l$ immediately. Furthermore, it appends $T_i$ to the queues of all remote replicas. The middleware then starts $T_i^r$ at remote replica $R^r$ and applies the writeset when there is no transaction before $T_i$ in $Q^r$ that has a conflicting write operation. This means that if a previously validated transaction $T_j$ has a write/write conflict with $T_i$, then $T_i^r$ only starts after $T_j^r$ commits. However, $T_i^r$ can run concurrently with other validated transactions with which there is no write/write conflict. After all updates of $T_i^r$ have been applied, $T_i^r$ commits. As a result, it is now possible that transactions do not commit in validation order and that the commit order differs between replicas.
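The start condition for remote transactions can be written down compactly. The following is a minimal sketch under stated assumptions: the function name `may_start_remote`, the representation of a queue as a list of `(tid, writeset)` pairs for validated but not yet committed transactions, and the three sample transactions are all hypothetical and not taken from Lin et al. [2005].

```python
def may_start_remote(queue, tid):
    """Return True if tid may start at this remote replica.

    queue lists (tid, writeset) pairs in validation order and contains only
    transactions that have not yet committed at this replica (assumption).
    """
    position = [t for t, _ in queue].index(tid)
    writeset = queue[position][1]
    # tid may start only if no transaction ahead of it writes a common item.
    return all(not (ws & writeset) for _, ws in queue[:position])


# Hypothetical queue: T3 conflicts with T1 on x, T2 conflicts with nobody.
queue = [("T1", {"x"}), ("T2", {"y"}), ("T3", {"x", "z"})]
t2_now = may_start_remote(queue, "T2")    # True: T2 may overtake T1
t3_now = may_start_remote(queue, "T3")    # False: T3 must wait for T1

queue_after_t1 = queue[1:]                # T1 committed and left the queue
t3_later = may_start_remote(queue_after_t1, "T3")   # True
```

Because T2 may start (and commit) while T1 is still pending, this rule is what allows commit order and validation order to diverge.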

The algorithm furthermore puts restrictions on when to start transactions. A local transaction may only start if there is no ‘hole’ in the commit order. That is, if the middleware assigns new transaction $T_i$ to be local at replica $R^l$ and $T_j$ is the last validated transaction that committed at $R^l$, then transaction $T_i^l$ may only start at $R^l$ if all transactions that validated before $T_j$ have also committed at $R^l$.$^{11}$
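The ‘hole’ condition amounts to a prefix test on validation sequence numbers. The sketch below assumes that validated update transactions are numbered 1, 2, ... in validation order; the function names and this numbering are illustrative assumptions. The sample values correspond to the situation that Example 17 walks through, where T1 validates first and T2 second.

```python
def has_hole(committed_seqs):
    """committed_seqs: validation sequence numbers of the update transactions
    that have already committed at one replica (numbering is an assumption)."""
    if not committed_seqs:
        return False
    # No hole means: everything validated up to the newest commit is committed.
    return set(committed_seqs) != set(range(1, max(committed_seqs) + 1))


def may_start_local(committed_seqs):
    """A new local transaction may start only if the commit order has no hole."""
    return not has_hole(committed_seqs)


ra_when_t4_starts = may_start_local({1})      # RA has committed T1 only
rb_when_t5_arrives = may_start_local({2})     # RB has committed T2 only
rb_after_t1_commit = may_start_local({1, 2})  # RB has committed T1 and T2
```

Under these assumptions the first and third calls allow the start, while the second one delays it.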

It is easy to see that the protocol does not provide 1-copy-SI+IC even if the underlying database replicas provide SI+IC.

$^{11}$ This might lead to starvation of transactions as there might always be holes. Lin et al. [2005] provide extensions that avoid starvation. For space reasons, we ignore them here.


[Figure 14: (a) Execution, showing the timelines of RA, RB and the middleware (MW) with queues QA and QB, where s5 is delayed; (b) SSGs, showing $SSG^A$, $SSG^B$ and the USG.]

Fig. 14. Example 17

Example 16. Let’s revisit Example 15. T1 executes at RA and inserts employee e1. T2 executes at RB and deletes the department. Both submit their commit request, and the middleware validates first T1 and then T2 without detecting a conflict. At RA, T1 commits immediately. So does T2 at RB. When T2 now executes at RA, a constraint violation is detected and RA aborts T2. Similarly, RB aborts T1 due to an integrity violation. The replicas do not commit the same set of transactions. The execution is as follows:

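$RH^A$: $s_1 \; w_1(e1_1) \; ir_1(\neq\emptyset : D.did{=}d1 : \{d1_0\} : \{d1_0, d2_0\}){=}t \; c_1 \;\; s_2 \; w_2(d1_{dead}) \; ir_2({=}\emptyset : E.did{=}d1 : \{e1_1\} : \{e1_1, e2_{init}\}){=}f \; a_2$

$RH^B$: $s_2 \; w_2(d1_{dead}) \; ir_2({=}\emptyset : E.did{=}d1 : \{\} : \{e1_{init}, e2_{init}\}){=}t \; c_2 \;\; s_1 \; w_1(e1_1) \; ir_1(\neq\emptyset : D.did{=}d1 : \{\} : \{d1_{dead}, d2_0\}){=}f \; a_1$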

However, assuming that the application has not specified any integrity constraints, the execution provides 1-copy-SI.
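The difference between Example 15 (same commit order at both replicas) and Example 16 (different commit orders) can be replayed with a toy model. This is a sketch under simplifying assumptions, not a model of a real database: the function `run_replica` is hypothetical, it hard-codes the two transactions of the examples, and it evaluates each integrity read against the last committed state, which is how deferred mode is described above.

```python
def run_replica(commit_order):
    """Replay T1 (insert employee e1 into department d1) and T2 (delete
    department d1) in the given commit order at one replica.

    Integrity reads are evaluated at commit time on the committed state.
    Returns the set of transactions that commit at this replica.
    """
    departments = {"d1", "d2"}
    employees = {}                    # employee id -> department id
    committed = set()
    for tid in commit_order:
        if tid == "T1":
            # Integrity read of T1: the referenced department must exist.
            if "d1" in departments:
                employees["e1"] = "d1"
                committed.add(tid)
        elif tid == "T2":
            # Integrity read of T2: no employee may reference the department.
            if "d1" not in employees.values():
                departments.discard("d1")
                committed.add(tid)
    return committed


# SRCA (Example 15): both replicas commit in validation order T1, T2.
srca = {"RA": run_replica(["T1", "T2"]), "RB": run_replica(["T1", "T2"])}

# SRCA-Ex (Example 16): each replica commits its local transaction first.
srca_ex = {"RA": run_replica(["T1", "T2"]), "RB": run_replica(["T2", "T1"])}
```

With identical commit orders both replicas end up with the same committed set; with different orders the sets diverge, which is the violation of property (ii) described in Example 16.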

Example 17. Let’s revisit Example 14 and observe the execution under SRCA-Ex shown in Figure 14.(a). T1 executes first at RA, writes x and validates successfully. T1 is immediately committed at its local replica RA and queued in QB. T2 executes first at RB, writes y and also validates successfully as there is no write/write conflict with T1. T2 is immediately committed at its local replica RB and queued in QA. T3 executes at RB and writes x. At validation time, the middleware detects a conflict with concurrent transaction T1 and aborts T3 at RB. T2 is now started at RA, its writeset applied and then committed. Before T2 commits at RA, read-only transaction T4 starts at RA. This is allowed as T1 is the only transaction that has committed at RA and no transaction validated before it, so there is no hole. T4 reads x1 and y0. At RB, T1 is started, its writeset applied and then committed. Before T1 commits, read-only transaction T5 wants to start at RB. At this point in time only T2 has committed at RB, but it validated after T1. There is a ‘hole’ in the order of committed transactions. Therefore, T5’s start is delayed until T1 has also committed at RB. T5 reads versions x1 and y2.

Figure 14.(b) shows the SSGs of the two local histories and the USG of the replicated history. No cycles exist.

Theorem 6.2. SRCA-Ex provides 1-copy-SI if the underlying DB replicas provide SI using the first-committer-wins strategy and there are no integrity constraints.

Proof. Based on Theorem 3.9, we need to prove that for any replicated history RH possible under the protocol, (i) all local histories $RH^k$ are SI-histories, (ii) a write transaction commits at either none or all replicas, and (iii) USG(RH) has no G-1c and G-SIb* cycles.

Property (i) is fulfilled by assumption.

Property (ii). Again, we only look at transactions that validate successfully. This time, we only have to show that a validated transaction $T_i$ will not abort due to a write/write conflict. The middleware submits the commit for $T_i^l$ to $T_i$’s local replica immediately after validation. As it has checked for write/write conflicts against concurrent validated transactions, the validation within the local database system $R^l$ will also succeed and $T_i^l$ commits. The middleware starts a remote transaction $T_i^r$ for $T_i$ at $R^r$ when all other transactions that validated before $T_i$ and had conflicting write operations have committed. Furthermore, for any transaction $T_j$ validating after $T_i$ and conflicting, $T_j^r$ is not started until $T_i^r$ commits. Thus, while $T_i^r$ might run concurrently with other remote or local transactions that validated before or after $T_i$, it is assured that they do not have a write/write conflict with $T_i$. Thus, independently of the order in which they actually perform their commit, validation will succeed. The only transactions that might run concurrently with $T_i^r$ and conflict are local transactions that have not yet validated (they will fail their validation).

Property (iii). We need to show that USG(RH) avoids G-1c cycles and G-SIb* cycles. We first show two properties that help us in our proof.

First, we show that whenever there is an edge $T_i^k \xrightarrow{ww/wr} T_j^k$ in any $SSG(RH^k)$ (and thus $T_i \xrightarrow{ww/wr} T_j$ in USG(RH)), then $T_i$ validated before $T_j$. We look first at ww-dependency edges. As all histories are SI, a ww-dependency edge in $SSG(RH^k)$ implies that $RH^k$ committed $T_i^k$ before $T_j^k$ started. As validation always occurs before commit, it implies that $T_i$ validates before $T_j^k$ starts. Assume now the edge occurs in $T_j$’s local history $RH^l$. As $T_j^l$ starts before validation in its local history, this implies $T_i$ validates before $T_j$. Now assume the edge occurs in a history $RH^r$ where $T_j^r$ is remote and $T_j$ validates before $T_i$ validates. Transactions are appended to $Q^r$ in validation order and $T_i$ may only “overtake” $T_j$ in the queue $Q^r$, and thus $T_i^r$ commit before $T_j^r$, if they do not conflict. As there is a conflict, such an overtake may not take place. Therefore, as $T_i^r$ commits before $T_j^r$, $T_i$ must have validated before $T_j$. A wr-dependency edge $T_i \xrightarrow{wr} T_j$ in USG(RH) must be derived from $T_i^l \xrightarrow{wr} T_j^l$ in $SSG(RH^l)$ of $T_j$’s local history $RH^l$. As $RH^l$ is SI, $T_i^l$ commits before $T_j^l$ starts, and as $T_j^l$ is local in $RH^l$, $T_j^l$ starts in $RH^l$ before it validates. Thus, $T_i$ validates before $T_j$ validates.

Second, if $T_i$ validates before $T_j$, $T_i$ and $T_j$ have a write/write conflict and validation succeeds for both transactions (only possible if they are not concurrent), then there is a path $T_i^k \xrightarrow{ww+} T_j^k$ in the $SSG(RH^k)$ of each local history $RH^k$. Assume there is no such path in $SSG(RH^k)$. As the transactions write a common data item, this implies there is a path $T_j^k \xrightarrow{ww+} T_i^k$. However, as we have seen above, this implies that $T_j$ validates before $T_i$, violating our assumption.

With this, it is clear that USG(RH) avoids G-1c cycles (consisting only of ww- and wr-dependency edges) as this would imply a cycle in the validation order.

Now assume G-SIb* exists. We can break the cycle into q sections of

$$T_{i_p} \xrightarrow{(wr/ww)^*} T_{j_p} \xrightarrow{wr/ww} T_{k_p} \overset{rw}{\dashrightarrow} T_{i_{(p+1)\%q}} \quad (\text{where } 0 \le p < q)$$

Let us consider one section p. $T_{i_p} \xrightarrow{(wr/ww)^*} T_{j_p}$ implies that $T_{i_p}$ validates before $T_{j_p}$ according to the discussion above. Now assume the next edge is $T_{j_p} \xrightarrow{ww} T_{k_p}$ and is derived from $T^k_{j_p} \xrightarrow{ww} T^k_{k_p}$ of $SSG(RH^k)$ of history $RH^k$. This implies $T_{j_p}$ validates before $T_{k_p}$, and as a result $T_{j_p} \xrightarrow{ww+} T_{k_p}$ in the SSGs of all local histories, including $T_{k_p}$’s local history $RH^l$. This means we have either $T^l_{j_p} \xrightarrow{ww+} T^l_{k_p}$ or $T^l_{j_p} \xrightarrow{wr} T^l_{k_p}$ in the $SSG(RH^l)$ of the local history $RH^l$ of $T_{k_p}$. This implies that all transactions that validated before $T_{j_p}$ also committed before $T^l_{k_p}$ started in $RH^l$ according to SRCA-Ex (no holes when a transaction starts). As the edge $T^l_{k_p} \overset{rw}{\dashrightarrow} T^l_{i_{(p+1)\%q}}$ also occurs in $SSG(RH^l)$ and implies $s_{k_p} \prec_t c_{i_{(p+1)\%q}}$, we can derive that $T_{j_p}$ validated before $T_{i_{(p+1)\%q}}$. Adding all sections together we can derive a cycle in the validation order, which is impossible. Hence, G-SIb* cannot happen in the USG(RH) of a replicated history RH produced by SRCA-Ex.

6.4 SRCA with 2-Phase-Commit

SRCA-Ex allows integrity reads to access different data versions, leading to the possibility of replicas not committing the same set of update transactions. Accessing different data versions is possible because replicas can commit non-conflicting update transactions in different order. The question arises whether one can actually build a replica control protocol providing 1-copy-SI+IC that allows non-conflicting transactions to commit in different order and/or integrity reads to access different data versions at the different replicas. In fact, it is easy to derive such a protocol from SRCA-Ex. We denote it as SRCA-2PC. The only change is that the middleware does not commit each $T_i^k$ individually once execution has completed at $R^k$. Instead, only when execution has completed at all replicas, the middleware runs a 2-Phase-Commit protocol (2PC) with all replicas being participants. Assuming deferred mode for integrity constraints, transactions perform their integrity reads upon receiving the prepare-to-commit request of the 2PC from the middleware. When the integrity read at a replica $R^k$ evaluates to false, $R^k$ votes to abort the transaction. Only if the integrity reads at all replicas evaluate to true do all replicas vote to commit the transaction, and the transaction can commit. Note that the different replicas might access different data versions in their integrity reads.
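The voting step can be illustrated with a short Python sketch. It is deliberately abstract and makes several assumptions: the function `two_phase_commit` and its argument format are hypothetical, each replica’s integrity read is modelled as a zero-argument callable, and the sample state (RA has already committed an employee of department d1, RB has not) is invented for illustration. It does not model logging, timeouts or coordinator failure.

```python
def two_phase_commit(integrity_reads):
    """Decide the outcome of one transaction across all replicas.

    integrity_reads maps a replica name to a zero-argument callable that
    performs the transaction's integrity reads at that replica and returns
    True (constraint holds) or False (constraint violated).
    """
    # Phase 1 (prepare): every replica evaluates its integrity reads and votes.
    votes = {replica: bool(check()) for replica, check in integrity_reads.items()}
    # Phase 2: commit only on a unanimous yes, otherwise abort everywhere.
    decision = "commit" if all(votes.values()) else "abort"
    return decision, votes


# Hypothetical committed state seen at prepare time of a transaction that
# deletes department d1: the replicas see different data versions.
committed_employees = {"RA": {"e1": "d1"}, "RB": {}}
checks = {
    replica: (lambda replica=replica:
              "d1" not in committed_employees[replica].values())
    for replica in committed_employees
}
decision, votes = two_phase_commit(checks)
```

Here RB alone would have committed the deletion, but the single negative vote from RA aborts the transaction at both replicas, so all replicas still commit the same set of update transactions.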

Theorem 6.3. SRCA-2PC provides 1-copy-SI+IC if the underlying DB replicas provide SI+IC using the first-committer-wins strategy and deferred mode for integrity constraints.

Proof. Properties (i) and (iii) hold with the same reasoning as for SRCA-Ex.


Property (ii) holds because of the properties of the 2PC.

6.5 Discussion

All the protocols discussed in this section use a middleware and assume that the underlying database systems provide SI+IC. This makes the proof of the first property (local SI+IC histories) trivial. Protocols that are implemented within the database kernel have to show explicitly that the local histories are SI(+IC)-histories.

Requiring that all update transactions commit at all replicas is an obvious requirement of ROWA approaches. While this is automatically provided if 2PC is used, showing this property for protocols that do not rely on a 2PC is not trivial. In fact, SRCA-Ex does not provide 1-copy-SI+IC because the replicas might commit different sets of update transactions.

The complexity of showing that USG(RH) avoids G-1c* and G-SIb* cycles again depends on the replica control algorithm itself. It is likely that the more conservative and restrictive the algorithm is, the easier the proof will be.

7. FAILURES

7.1 Motivation

A ROWA approach cannot continue executing update transactions if one replica fails. Thus, replica control protocols typically implement a read-one-write-all-available (ROWAA) approach where only the available copies need to perform the update transactions. For space reasons, the following discussion excludes integrity constraints. They can be included in a straightforward way into the formalism.

We first have to define a history in the presence of failures. We assume the crash-failure model where a database system that fails simply stops execution.

Definition 7.1. History with failure. A history H with failure over a set of transactions T is a history according to Definition 2.1 with the following exceptions.

1. The last event in H is a failure event, denoted as f.

2. If $c_i$ of transaction $T_i \in T$ is contained in H, then all operations of $T_i$ are contained in H. If $a_i$ of transaction $T_i \in T$ is contained in H, then at least $s_i$ is contained in H.

Definition 3.2 of a replicated history can then be extended by simply indicating that each local history $RH^k$ can possibly be a history with failure.

Example 18. Assume two replicas RA, RB and transactions T1, T2 local at RB.

$RH^A_{fail1}$: $s^A_1 \; w^A_1(x_1) \; c^A_1 \; f^A$

$RH^B_{fail1}$: $s^B_1 \; r^B_1(y_0) \; w^B_1(x_1) \; c^B_1 \;\; s^B_2 \; w^B_2(x_2) \; c^B_2$

T2 does not execute at RA since the replica fails before being able to do so. Thus, T2 only commits at the local and single available replica RB. Definition 3.4 of 1-copy-SI is violated as the two histories do not commit the same set of transactions.

In a ROWAA approach, however, only the available replicas should be required to commit the same set of transactions while a history with failure only needs to execute properly until it fails. With this change, the history $RH_{fail1}$ can be 1-copy-SI. $USG(RH_{fail1})$ of Figure 15 contains only a ww-dependency edge from T1 to T2. A corresponding global history would be the same as $RH^B_{fail1}$. However, what exactly does it mean for a history with failure to execute properly until failure?

Fig. 15. $USG(RH_{fail1})$ and $USG(RH_{fail2})$

Fig. 16. $USG(RH_{fail3})$

Example 19. Assume the same setup as in Example 18 but a different execution.

$RH^A_{fail2}$: $s^A_2 \; w^A_2(x_2) \; c^A_2 \; f^A$

$RH^B_{fail2}$: $s^B_1 \; r^B_1(y_0) \; w^B_1(x_1) \; c^B_1 \;\; s^B_2 \; w^B_2(x_2) \; c^B_2$

This time RA executes and commits T2 but not T1 while RB commits both transactions. The execution again seems to be fine. $USG(RH_{fail2})$ is the same as $USG(RH_{fail1})$ (see Figure 15), having a single ww-dependency edge from T1 to T2. The global history could again be the same as $RH^B$.

However, if RA had not failed, it would not have been possible to commit both T1 and T2 at RA and extend $RH^A$ so that the extended history is 1-copy-SI. The issue is that $USG(RH_{fail2})$ in Figure 15 does not have any cycle because RA failed before committing T1. If $RH^A$ had completed execution, there would have been an additional ww-dependency edge from T2 to T1 leading to a cycle.

This is in contrast to $RH_{fail1}$ where the execution at RA could have been extended to execute and commit T2, and the USG would still be acyclic.

Example 20. Let us reconsider Example 3 with T1 writing x, T4 writing y, and T2 and T3 reading both x and y. The history $RH_{hole}$ was not 1-copy-SI because the two read-only transactions implicitly ordered the non-conflicting update transactions. In the following execution we let RB fail before executing T1.

$RH^A_{fail3}$: $s^A_1 \; w^A_1(x_1) \; c^A_1 \;\; s^A_2 \; r^A_2(x_1) \; r^A_2(y_0) \; c^A_2 \;\; s^A_4 \; w^A_4(y_4) \; c^A_4$

$RH^B_{fail3}$: $s^B_4 \; w^B_4(y_4) \; c^B_4 \;\; s^B_3 \; r^B_3(y_4) \; r^B_3(x_0) \; c^B_3 \; f^B$

The $USG(RH_{fail3})$ in Figure 16 is also acyclic. However, the replicated history is not 1-copy-SI since T2 and T3 are reading from incompatible snapshots. Again, if RB had not crashed and had applied T1’s update, then the history would not be 1-copy-SI. The issue, as before, lies in the fact that $RH^B$ is incomplete due to its failure, and it misses some edges needed to capture the fact that the replicated history is not 1-copy-SI. In this case, it misses an rw-dependency edge from T3 to T1.

7.2 Failure Completed Histories

Failures result in incomplete local histories. Taking the SSGs of these incomplete histories to build the USG might prevent us from observing violations of 1-copy-SI. Our approach is to complete these local histories with failures in order to be able to capture such violations. We can observe that the missing edges in the USG are always due to the fact that a history with failure misses the write operations of some committed transactions. By adding the missing committed update transactions, local histories become complete. In $RH^A_{fail2}$, if we add the writes (and commit) of T1, then the USG contains a ww-dependency edge from T2 to T1 (see Figure 17), resulting in a G-1c cycle. Thus, we can detect that the history is not 1-copy-SI. Similarly, adding T1 to $RH^B_{fail3}$ results in an additional rw-dependency edge from T3 to T1 leading to a G-SIb* cycle (see Figure 18). In contrast, when adding T2 to $RH^A_{fail1}$, the USG remains the same as in Figure 15, and we can see that the history is 1-copy-SI. Let us define this concept formally.

Fig. 17. Extended $USG(RH_{fail2})$

Fig. 18. Extended $USG(RH_{fail3})$

Definition 7.2. Failure Completed Replicated History. Let $RH = \bigcup RH^k$ be a replicated history over rmap(T, R) with at least one local history with failure. A failure completed replicated history FCRH for RH has the following properties:

(1) $FCRH = \bigcup FCRH^k$ is a replicated history over rmap(T, R) where no local history has a failure.

(2) For each local history with failure $RH^k$ in RH, there is a local history $FCRH^k$ without failure in FCRH, such that $RH^k - f^k$ is a prefix of $FCRH^k$.

(3) For each local history $RH^k$ in RH without failure, there is a local history $FCRH^k$ in FCRH where $RH^k = FCRH^k$.

(4) For all update transactions $T_i \in T$ and for all $R^k, R^l \in R$: $c_i^k \iff c_i^l$, that is, $T_i$ commits at $R^k$ if and only if it commits at $R^l$.

Missed update transactions can be added in different ways. A failure completed history represents one possible continuation of the execution if no failure had occurred. If at least one continuation exists that represents a 1-copy-SI history, then we consider the replicated history RH with failures to be 1-copy-SI.

Definition 7.3. 1-copy-SI in the presence of failures. Let $RH = \bigcup RH^k$ be a replicated history over rmap(T, R) with at least one local history with failure. RH is 1-copy-SI if there exists a failure completed history FCRH for RH that is 1-copy-SI.

This means, in order to show that a replicated history RH with failures is 1-copy-SI, we have to find a failure completed history FCRH for RH where each local history is an SI-history and where USG(FCRH) has no G-1c or G-SIb* cycles. For our examples, we can do this for $RH_{fail1}$ but not for $RH_{fail2}$ and $RH_{fail3}$. Both $RH_{fail2}$ and $RH_{fail3}$ miss one transaction. As the local histories in FCRH must be extensions of the local histories in RH with failures, we can only add the missing transaction at the end of the execution, resulting in USGs with cycles.
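The cycle checks on the (extended) USGs of Examples 18 to 20 can be mechanised. The sketch below rests on an assumption about the phenomena rather than on their formal definitions: following the section decomposition used in the proofs of Section 6, it treats a cycle as forbidden if no two rw-edges are adjacent in it, which covers G-1c cycles (no rw-edge at all) and G-SIb* cycles (every rw-edge preceded by a ww- or wr-edge). The function name, the edge encoding as `(source, target, kind)` triples, and the write-skew example are illustrative choices; the other edge lists are read off the histories above.

```python
from collections import defaultdict


def has_forbidden_cycle(edges):
    """Return True if the graph has a cycle without two consecutive rw-edges.

    edges: iterable of (source, target, kind), kind in {"ww", "wr", "rw"}.
    """
    edges = list(edges)
    succ = defaultdict(list)
    for src, dst, kind in edges:
        succ[src].append((dst, "rw" if kind == "rw" else "dep"))

    # Search states are (node, kind of the edge used to reach the node).
    def moves(state):
        node, last = state
        for nxt, kind in succ[node]:
            if not (last == "rw" and kind == "rw"):  # never two rw-edges in a row
                yield (nxt, kind)

    done, on_stack = set(), set()

    def visit(state):
        on_stack.add(state)
        for nxt in moves(state):
            if nxt in on_stack:                      # back edge: cycle found
                return True
            if nxt not in done and visit(nxt):
                return True
        on_stack.discard(state)
        done.add(state)
        return False

    starts = {(dst, "rw" if kind == "rw" else "dep") for _, dst, kind in edges}
    return any(visit(s) for s in starts if s not in done)


usg_fail1 = [("T1", "T2", "ww")]
usg_fail2_completed = usg_fail1 + [("T2", "T1", "ww")]
usg_fail3 = [("T1", "T2", "wr"), ("T2", "T4", "rw"), ("T4", "T3", "wr")]
usg_fail3_completed = usg_fail3 + [("T3", "T1", "rw")]
write_skew = [("T1", "T2", "rw"), ("T2", "T1", "rw")]   # allowed under SI

verdicts = [has_forbidden_cycle(g) for g in
            (usg_fail1, usg_fail2_completed, usg_fail3,
             usg_fail3_completed, write_skew)]
```

Under the stated assumption, only the two completed graphs are flagged, which matches the discussion of Figures 17 and 18; the write-skew cycle consists of two adjacent rw-edges and is therefore not flagged.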

7.3 Failure Handling in the SRCA Protocols

In this section, we outline how the SRCA protocols of Section 6 handle failures. We only consider the failure of any individual database replica but not the failure of the middleware. For brevity, our discussion ignores integrity reads. In SRCA, for each failed replica the sequence of committed transactions is a prefix of the sequence of transactions committed at the available replicas. The failed history up to the failure represents an SI-history. The history might contain some transactions that did not complete before the failure. A local transaction that had not validated before the crash can be considered aborted in the entire system because the other replicas have not applied it. A local transaction that had validated before the crash was applied by the available replicas. Thus, the failed history has to be extended by the commit of this transaction. A remote transaction has to be extended by the missing write operations and the commit. Furthermore, all committed update transactions that are completely missing in the history have to be appended, too. The missing write and commit operations are appended in the order that conforms to the order in one of the available histories. For SRCA-Ex and SRCA-2PC, the failed history is extended in a similar way. The order of missing write and commit operations should conform to the order in which these transactions were validated.
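At the level of commit sequences, this completion step can be sketched as follows. The function `complete_failed_history` is a hypothetical helper that only tracks the order of committed update transactions (not individual read and write operations); the two sample inputs are the commit sequences of Examples 18 and 19.

```python
def complete_failed_history(failed_commits, available_commits):
    """Extend the commit sequence of a failed replica.

    Both arguments list committed update transactions in commit order.
    Missing transactions are appended in the order of the available replica.
    Returns (completed sequence, whether the failed sequence was a prefix).
    """
    is_prefix = available_commits[:len(failed_commits)] == failed_commits
    missing = [t for t in available_commits if t not in failed_commits]
    return failed_commits + missing, is_prefix


# Example 18: RA committed T1 before failing, RB committed T1 then T2.
completed_18, prefix_18 = complete_failed_history(["T1"], ["T1", "T2"])

# Example 19: RA committed only T2 before failing, RB committed T1 then T2.
completed_19, prefix_19 = complete_failed_history(["T2"], ["T1", "T2"])
```

In the first case the failed sequence is a prefix, as is always the case under SRCA, and the completed sequence equals the one at RB. In the second case the prefix property does not hold, and the completion necessarily orders the conflicting transactions T2 and T1 differently from RB.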

The proofs that these extended histories remain 1-copy-SI+IC (for SRCA and SRCA-2PC) and 1-copy-SI (for SRCA-Ex) are relatively straightforward and omitted. One can show that all additional dependency edges that appear in USG(FCRH) of the failure completed history FCRH would have also appeared if no local replica had failed. As the protocols provide 1-copy-SI(+IC) histories in the failure-free case, the failure completed history will also be 1-copy-SI(+IC).

8. RELATED WORK

8.1 Work on Snapshot Isolation in General

SI has become a popular isolation level since Oracle implemented it as its highest isolation level, and other commercial solutions have followed with their own implementations. Basically all commercial systems we are aware of do not implement the first-committer-wins strategy but detect write/write conflicts during execution using locks (first-updater-wins strategy). Oracle [Oracle Corporation 2007] does not actually store multiple versions but reconstructs previous versions by accessing a page-specific undo log which contains the old data values of records. PostgreSQL [2007] has always had a record-based multi-version system. Microsoft SQL Server 2005 [2007] offers both serializability via locking and SI by reconstructing record versions stored in persistent storage.

In the research literature, Berenson et al. [1995] defined SI by specifying a set of anomalies that SI avoids or allows. In particular, compared to serializability, SI does not avoid the anomaly “write skew”. Adya [1999] introduced the concepts of Snapshot-Read and Snapshot-Write and defined the properties of SI through GID. From there, Fekete et al. [2005] observe that the set of histories allowed by SI, but not by serializability, are those that have cycles in the SSG with two consecutive anti-dependency edges. We made a similar observation and refined the G-SIb phenomenon to reflect this fact.

Some work has analyzed how serializability can be achieved on top of an SI scheduler. Based on the GID formalism, Fekete et al. [2005] describe a set of tools to convert a given database application so that even if it runs on a database system providing SI, only serializable executions are produced. Depending on the kind of application, certain vulnerable edges (conflicts) between concurrent transactions have to be determined and restructured. More recently, Cahill et al. [2008] have shown how SI concurrency control can be extended within the DB kernel to enforce serializability in an efficient way. The idea is to keep track of anti-dependencies and, whenever there are two consecutive anti-dependencies, to abort one of the transactions. Similarly, Elnikety et al. [2005] show how serializability can be achieved in a replicated system even if the local databases only provide SI. For that, read- and writesets of transactions have to be monitored and their intersections determined. Our assumption is that serializability is not always needed but the execution in a replicated system should reflect the behavior of a non-replicated system using SI.

As we have discussed in this paper, the definitions given for SI in [Berenson et al. 1995; Adya 1999] allow for the violation of integrity constraints. Adya [1999] explores this topic further and shows that if update transactions run under serializability and only queries use snapshot isolation, integrity constraints do not pose a problem. However, this does not reflect the behavior of commercial systems. Alomari et al. [2008] handle integrity violations in systems where integrity reads are embedded in the application programs by running those transactions in a serializable mode similar to [Fekete et al. 2005]. Our paper addresses integrity constraints in detail and extends the GID to model actions that are performed to maintain database integrity. Our SI+IC guarantees the preservation of integrity constraints by avoiding G-IC and forbidding G-SIb* and G-1c* cycles. It does not require the entire transaction to be serializable. An option that could be further explored is to run integrity reads as sub-transactions that require serializability [Weikum and Vossen 2001]. It might be difficult, though, to combine this with a formalism like GID which allows a simple description of SI properties.

8.2 Snapshot Isolation in a Distributed System

Snapshot Isolation in a Federated System. In a federated system, data is partitioned (not replicated) across a set of databases. Transactions from users are accepted by a federation layer which redirects them to underlying databases and performs any necessary pre- and post-processing. Schenkel et al. [1999] propose two algorithms to provide SI globally, assuming the underlying database systems provide SI locally. The challenge in a distributed setting is that a transaction might need to read data from different databases. Under SI, this means it should read from the same snapshot at all databases. Schenkel et al. [1999] indicate that for a schedule in a federated system to provide SI at the global level it should not have any two transactions $T_i$ and $T_j$ that are concurrent at one database $D_1$ while $T_i$ executes after $T_j$ in database $D_2$ and reads a data version written by $T_j$. The reasoning is that in this case $T_i$ would read from a transaction $T_j$ that is globally concurrent to $T_i$ because there exists a database where both are concurrent. This is disallowed under SI.

Snapshot Isolation in a Replicated System. In the last few years, several groups started to work concurrently on the concept of SI in a replicated system. Elnikety et al. [2005] present Generalized Snapshot Isolation (GSI), which is a generalization of SI in the context of a centralized database. GSI is based on two definitions that are similar to Snapshot-Read and Snapshot-Write. However, it allows a transaction, instead of reading the committed snapshot at the time of transaction start, to read an older snapshot. This is equivalent to artificially setting the start point of a transaction into the past. Although the paper presents a replication protocol, no formalization is presented for the replicated case. A similar concept was defined by Daudjee and Salem [2006] as weak SI. GSI or weak SI are interesting in a replicated system since when a transaction starts locally, some update transactions might have already committed at other replicas but not yet at the local one. Thus, when looking at the global state, these transactions have already committed but the local transaction does not yet see the effects, thus violating SI. In our framework, we are similar in concept to GSI because the start time of the transaction is set to the current state of the local database, and therefore, automatically put into the past (looking at the global state). Using GSI, however, a transaction's start timepoint can be put arbitrarily into the past. An update transaction is more likely to abort if its start timepoint is farther in the past.

Considering that the updates of a transaction are never committed at the same physical time at the different replicas, it can happen that a transaction misses updates that are important from an application point of view. For instance, in the protocols of Section 6 an update transaction $T_i$ could be local at history $RH^l$. Then a consecutive read-only transaction $T_j$ of the same user could be local at a different replica $R^k$. It is possible that at the start of $T_j$ at $R^k$, $R^k$ has not yet applied and committed $T_i$. Thus, $T_j$ will not see the changes of $T_i$. The system provides 1-copy-SI but from a user point of view, the execution is not correct since a user does not see its previous writes. In order to capture such dependencies between transactions, Elnikety et al. [2005] introduce prefix-consistent GSI that requires that a transaction $T_i$'s snapshot needs to contain updates of transactions that committed before $T_i$ and are related to $T_i$, for instance because they were submitted by the same user or because they belong to the same workflow. In similar spirit, Daudjee and Salem [2006] say that a system provides strong session SI if it provides Snapshot-Read and Snapshot-Write, and for any two transactions $T_i$ and $T_j$ of the same user session, if $T_i$'s commit precedes the first read/write operation of $T_j$, then $c_i \prec_t s_j$.
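The session condition just described can be checked mechanically on a recorded history. The following is a minimal sketch under assumptions of our own: the history is given as a list of `(kind, transaction)` events in $\prec_t$ order, with `kind` being `"start"`, `"first_op"` or `"commit"`, and `session_of` maps transactions to sessions. Neither this encoding nor the function name comes from the cited work.

```python
def strong_session_violations(history, session_of):
    """Return pairs (Ti, Tj) of the same session that violate the condition
    'if Ti commits before Tj's first operation, then c_i precedes s_j'.

    history: list of (kind, txn) events in time order (hypothetical encoding).
    """
    pos = {event: idx for idx, event in enumerate(history)}
    txns = sorted({t for _, t in history})
    violations = []
    for ti in txns:
        for tj in txns:
            if ti == tj or session_of[ti] != session_of[tj]:
                continue
            commit_i = pos.get(("commit", ti))
            first_j = pos.get(("first_op", tj))
            start_j = pos.get(("start", tj))
            if commit_i is None or first_j is None or start_j is None:
                continue
            # Ti committed before Tj became active, yet Tj's snapshot
            # (its start point) lies before Ti's commit.
            if commit_i < first_j and not commit_i < start_j:
                violations.append((ti, tj))
    return violations


# T2's start point is placed before T1's commit although T2 only becomes
# active after T1 has committed: T2 misses the writes of its own session.
stale = [("start", "T1"), ("first_op", "T1"), ("start", "T2"),
         ("commit", "T1"), ("first_op", "T2"), ("commit", "T2")]
print(strong_session_violations(stale, {"T1": "u", "T2": "u"}))
```

Placing the start point of a transaction before its first operation models the GSI and weak SI behavior discussed above, where the snapshot may be older than the time at which the transaction becomes active.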

As far as we know, none of the formalisms developed to reason about SI in a distributed or replicated environment considers integrity constraints and their impact on the correctness of the system.

Replica Control for Snapshot Isolation. Several database replication protocols have been developed based on SI [Plattner and Alonso 2004; Plattner et al. 2008; Daudjee and Salem 2006; Kemme and Alonso 2000; Lin et al. 2005; Wu and Kemme 2005; Elnikety et al. 2005; Muñoz-Escoí et al. 2006]. For most, however, no formal proof of correctness has been given. Protocols are either implemented in the kernel of a database system or at a middleware layer. Primary copy protocols let all update transactions execute at a single primary replica while secondary replicas may only execute read-only transactions. In contrast, update anywhere protocols allow any transaction to be local at any replica. Lazy protocols send writesets to other replicas only after commit of the local transactions while eager protocols send them before commit. Several middleware approaches have one middleware instance for each database replica. They often use a total order multicast [Chockler et al. 2001] in order to allow for a distributed, yet deterministic validation. Snapshot isolation has also been used in the context of multi-tier middleware systems such as J2EE. In [Pérez-Sorrosal et al. 2007] a replication protocol is presented for replicating both the application server and database tiers. It provides 1-copy-SI and cache transparency. It guarantees that the cached data objects in the middle tier are versioned in a way consistent with SI and the underlying database.

8.3 Other Correctness Criteria

1-copy-serializability [Bernstein et al. 1987] was the first and strongest correctness criterion developed for replicated database systems, and is offered by many replication protocols [Carey and Livny 1991; Chundi et al. 1996; Breitbart et al. 1999; Pacitti et al. 1999; Kemme and Alonso 2000; Pedone et al. 2003; Amza et al. 2003; Holliday et al. 2003; Cecchet et al. 2004; Patiño-Martínez et al. 2005]. The proper serialization order is provided by various mechanisms, such as locking [Carey and Livny 1991; Cecchet et al. 2004], using the total order multicast of group communication systems [Kemme and Alonso 2000; Pedone et al. 2003; Patiño-Martínez et al. 2005], versioning [Amza et al. 2003], vector clocks and other timing mechanisms [Pacitti et al. 1999; Holliday et al. 2003], serialization graphs [Breitbart and Korth 1997], or by restricting how object copies can be collocated [Chundi et al. 1996; Breitbart et al. 1999].

The concept of data freshness (or staleness) levels has received considerable attention since it allows systems to provide faster response times at the cost of staler data. Röhm et al. [2002] present a lazy primary copy system where freshness of a secondary replica is based on the time between the last applied update at the secondary and the most recent update on the primary replica. Queries can indicate the minimum freshness of the data they want to see. Thus, applying writesets at secondaries can be delayed to timepoints when there is little load in the system or until secondaries are too stale. In [Gançarski et al. 2007], staleness is defined on a relation basis and reflects the number of tuple changes a replica has not yet seen. In [Plattner et al. 2008], secondary replicas can be designed to maintain an important snapshot or to load a required past snapshot. As in [Röhm et al. 2002], applying writesets to secondaries can be delayed to give preference to queries that will get faster response at the price of less accurate data. In all these approaches, global correctness is not violated, and the systems still provide 1-copy-serializability or 1-copy-SI. As with GSI, from an abstract point of view it means that the start timepoint for a query can be put into the past, limited by the freshness value. In [Bernstein et al. 2006], the concept of Relaxed Currency Serializability is introduced and applied to a distributed and replicated cache. Several constraints, such as time-bound, value-bound or drift constraints, can be defined over a set of data items. A transaction may read from different snapshots (such as in the read-committed isolation level [Adya et al. 2000]) as long as freshness constraints are satisfied.

9. CONCLUSION

In this paper, we present a formal framework to reason about snapshot isolation in a replicated environment. Our framework is based on the General Isolation Definition (GID), which provides a graph-based way to reason about the correctness of snapshot isolation schedules. We extend GID in several ways. Firstly, we extend it to reason about replicated histories, and define what it means for a replicated history to provide SI at the global level, i.e., to provide 1-copy-SI. By extending the graph-based reasoning tool of GID, we can derive sufficient and necessary conditions for a replicated history to be 1-copy-SI by looking at the dependency graph of the replicated history and testing for certain cycles. From there, we analyze how commercial systems that provide SI handle integrity constraints. Since basic SI can lead to violations of integrity constraints but commercial systems maintain them, we derive a new isolation level, denoted as SI+IC, which is supported by several database systems. It models the maintenance of integrity constraints by first requiring a transaction to perform special integrity read operations on relevant data items and then ensuring that at commit time the data versions read are still valid. The dependency graph of histories with integrity reads is extended and new graph-based conditions define when a history provides SI+IC. We then extend the notation to a replicated history to define 1-copy-SI+IC and identify conditions that allow determining whether a history is 1-copy-SI+IC. In order to handle failures, some special care has to be taken to capture all dependencies.

10. ACKNOWLEDGEMENTS

We want to thank the anonymous reviewers for their insightful reviews and many constructive suggestions. They helped us tremendously in improving this paper.

REFERENCES

Adya, A. 1999. Weak consistency: A generalized theory and optimistic implementations for distributed transactions. Ph.D. thesis, MIT, Cambridge.

Adya, A., Liskov, B., and O’Neil, P. E. 2000. Generalized isolation level definitions. In Proc. of the IEEE Int. Conf. on Data Engineering (ICDE). 67–78.

Alomari, M., Cahill, M. J., Fekete, A., and Röhm, U. 2008. The cost of serializability on platforms that use snapshot isolation. In Proc. of the IEEE Int. Conf. on Data Engineering (ICDE). 576–585.

Amza, C., Cox, A. L., and Zwaenepoel, W. 2003. Distributed versioning: Consistent replication for scaling back-end databases of dynamic content web sites. In Proc. of the ACM/IFIP/USENIX Int. Middleware Conf. 282–302.

ANSI X3.135-1992. 1992. American National Standard for Information Systems - Database Language - SQL.

Berenson, H., Bernstein, P., Gray, J., Melton, J., O’Neil, E., and O’Neil, P. 1995. A critique of ANSI SQL isolation levels. In Proc. of the ACM SIGMOD Int. Conf. on Management of Data. 1–10.

Bernstein, P. A., Fekete, A., Guo, H., Ramakrishnan, R., and Tamma, P. 2006. Relaxed-currency serializability for middle-tier caching and replication. In Proc. of the ACM SIGMOD Int. Conf. on Management of Data. 599–610.

Bernstein, P. A., Hadzilacos, V., and Goodman, N. 1987. Concurrency Control and Recovery in Database Systems. Addison-Wesley.

Breitbart, Y., Komondoor, R., Rastogi, R., Seshadri, S., and Silberschatz, A. 1999. Update propagation protocols for replicated databases. In Proc. of the ACM SIGMOD Int. Conf. on Management of Data. 97–108.

Breitbart, Y. and Korth, H. F. 1997. Replication and consistency: Being lazy helps sometimes. In Proc. of the ACM Int. Symp. on Principles of Database Systems (PODS). 173–184.

Cahill, M., Röhm, U., and Fekete, A. 2008. Serializable isolation for snapshot databases. In Proc. of the ACM SIGMOD Int. Conf. on Management of Data. 729–738.

Carey, M. J. and Livny, M. 1991. Conflict detection tradeoffs for replicated data. ACM Transactions on Database Systems (TODS) 16, 4, 703–746.

Cecchet, E., Marguerite, J., and Zwaenepoel, W. 2004. C-JDBC: Flexible database clustering middleware. In Proc. of USENIX Annual Technical Conference, FREENIX Track. 9–18.

Chockler, G., Keidar, I., and Vitenberg, R. 2001. Group communication specifications: a comprehensive study. ACM Computing Surveys 33, 4, 427–469.

Chundi, P., Rosenkrantz, D. J., and Ravi, S. S. 1996. Deferred updates and data placement in distributed databases. In Proc. of the IEEE Int. Conf. on Data Engineering (ICDE). 469–476.

Daudjee, K. and Salem, K. 2006. Lazy database replication with snapshot isolation. In Proc. of Int. Conf. on Very Large Data Bases (VLDB). 715–726.

Elnikety, S., Pedone, F., and Zwaenepoel, W. 2005. Database replication using generalized snapshot isolation. In Proc. of the Int. Symp. on Reliable Distributed Systems (SRDS). 73–84.

Fekete, A., Liarokapis, D., O’Neil, E., O’Neil, P., and Shasha, D. 2005. Making snapshot isolation serializable. ACM Transactions on Database Systems (TODS) 30, 2, 492–528.

Gançarski, S., Naacke, H., Pacitti, E., and Valduriez, P. 2007. The Leganet system: freshness-aware transaction routing in a database cluster. Information Systems 32, 2, 320–343.

Holliday, J., Steinke, R. C., Agrawal, D., and Abbadi, A. E. 2003. Epidemic algorithms for replicated databases. IEEE Transactions on Knowledge and Data Engineering (TKDE) 15, 5, 1218–1238.

Kemme, B. and Alonso, G. 2000. A new approach to developing and implementing eager database replication protocols. ACM Transactions on Database Systems (TODS) 25, 3, 333–379.

Lin, Y., Kemme, B., Patiño-Martínez, M., and Jiménez-Peris, R. 2005. Middleware based data replication providing snapshot isolation. In Proc. of the ACM SIGMOD Int. Conf. on Management of Data. 419–430.

Microsoft SQL Server 2005. 2007. SQL Server 2005 row versioning-based transaction isolation.

Muñoz-Escoí, F. D., Pla-Civera, J., Ruiz-Fuertes, M. I., Irún-Briz, L., Decker, H., Armendáriz-Íñigo, J. E., and González de Mendívil, J. R. 2006. Managing transaction conflicts in middleware-based database replication architectures. In Proc. of the Int. Symp. on Reliable Distributed Systems (SRDS). 401–410.

Oracle Corporation. 2007. Oracle 11g Release 1.

Pacitti, E., Minet, P., and Simon, E. 1999. Fast algorithm for maintaining replica consistency in lazy master replicated databases. In Proc. of Int. Conf. on Very Large Data Bases (VLDB). 126–137.

Patiño-Martínez, M., Jiménez-Peris, R., Kemme, B., and Alonso, G. 2005. MIDDLE-R: Consistent database replication at the middleware level. ACM Transactions on Computer Systems (TOCS) 23, 4, 375–423.

Pedone, F., Guerraoui, R., and Schiper, A. 2003. The database state machine approach. Distributed and Parallel Databases 14, 1, 71–98.

Pérez-Sorrosal, F., Patiño-Martínez, M., Jiménez-Peris, R., and Kemme, B. 2007. Consistent and scalable cache replication for multi-tier J2EE applications. In Proc. of the ACM/IFIP/USENIX Int. Middleware Conf. 328–347.

Plattner, C. and Alonso, G. 2004. Ganymed: Scalable replication for transactional web applications. In Proc. of the ACM/IFIP/USENIX Int. Middleware Conf. 155–174.

Plattner, C., Alonso, G., and Özsu, M. T. 2008. Extending DBMSs with satellite databases. VLDB J. 17, 4, 657–682.

PostgreSQL. 2007. PostgreSQL, the world’s most advanced open source database.

Röhm, U., Böhm, K., Schek, H.-J., and Schuldt, H. 2002. FAS - a freshness-sensitive coordination middleware for a cluster of OLAP components. In Proc. of Int. Conf. on Very Large Data Bases (VLDB). 754–765.

Schenkel, R., Weikum, G., Weißenberg, N., and Wu, X. 1999. Federated transaction management with snapshot isolation. In Int. Workshop on Foundations of Models and Languages for Data and Objects (FMLDO) - Selected Papers. 1–25.

Weikum, G. and Vossen, G. 2001. Transactional Information Systems. Morgan Kaufmann, Chapter 6.

Wu, S. and Kemme, B. 2005. Postgres-R(SI): Combining replica control with concurrency control based on snapshot isolation. In Proc. of the IEEE Int. Conf. on Data Engineering (ICDE). 422–433.

Appendix A. SERIALIZATION GRAPH DEPENDENCY EDGES

Definition 3.4 of Section 3.1 defines a replicated history RH to be 1-copy-SI if all local histories are SI-histories, if all local histories commit the same set of update transactions, and if there is a global history H with the same set of committed transactions, whose SSG(H) has the same nodes and the same ww-, wr-, and rw-dependency edges as USG(RH). In this definition, the serialization graphs (SSG or USG) do not consider the data items that lead to the dependency edges. In this section we argue that considering the individual data items is not necessary.

For the non-replicated case, Adya shows in [Adya 1999] that GID does not require looking at the individual data items that trigger dependencies. But in Definition 3.4 it seems feasible that a dependency edge from $T_i$ to $T_j$ in the USG(RH) of a replicated history is due to a conflict on data item x, while the SSG(H) of the “equivalent” global SI-history H has this same dependency edge due to a data item y. One option to express the requirement that dependency edges reflect conflicts on the same data items would be to tag dependency edges with the data items that cause them. For instance, if a history had operations $\ldots w_i(x_i) \ldots r_j(x_i) \ldots$, then the SSG (resp. USG) could have a $T_i \xrightarrow{wr_x} T_j$ dependency edge. The 1-copy-SI definition could then require the dependency edges of SSG(H) of the global history H to have the same item tags as the corresponding dependency edges in USG(RH). The reasoning we used in Section 3.1 would work equally well with this extended notation but for simplicity, we did not use it.

Instead, we show in this appendix that the dependency edges in SSG(H) must be due to the same data items as the corresponding edges in USG(RH).

Lemma Appendix A.1. Let $RH = \bigcup RH^k$ be a replicated history over $rmap(T, R)$ with the following properties.

(1) $\forall R^k \in R$, $RH^k$ is an SI-history.

(2) For all update transactions $T_i \in T$ and for all $R^k, R^l \in R$: $c_i^k \Longleftrightarrow c_i^l$.

(3) There exists a global SI-history H over T such that

(a) SSG(H) and USG(RH) have the same nodes;

(b) SSG(H) has exactly the same ww-, wr-, and rw-dependency edges as USG(RH).

Then the following holds. If a dependency edge in USG(RH) is due to data item x, then the corresponding dependency edge in SSG(H) is also due to x. If a dependency edge in SSG(H) is due to data item x, then the corresponding dependency edge in USG(RH) is also due to x.

Proof. We show for each type of dependency edge that it must be due to the same data items in both USG(RH) and SSG(H). Our proof only shows one direction; the other follows the same reasoning.

(1) Assume that USG(RH) contains $T_j \xrightarrow{wr} T_i$ due to data item x and taken from $SSG(RH^k)$ of replica $R^k$ while the corresponding edge in SSG(H) is not due to x but due to y. This means $T_j$ writes both x and y. Furthermore, there must be an edge $T_k \xrightarrow{wr} T_i$ in SSG(H) due to x as $T_i$ must read x from some other transaction (assuming a start transaction that writes all data items before they are read). As a result $SSG(RH^k)$ must also contain $T_k \xrightarrow{wr} T_i$ (all read-dependency edges to $T_i$ in USG(RH) are from $T_i$'s local history). As $T_k$ writes x, and $T_i$ in $RH^k$ reads x from $T_j$ and not from $T_k$, and $RH^k$ is an SI-history, it follows that $T_k \xrightarrow{s,ww^+} T_j \xrightarrow{s,wr} T_i$. But as SSG(H) and USG(RH) have the same ww-dependency edges and H is an SI-history, it follows that $T_k \xrightarrow{ww^+} T_j$ in SSG(H), and as we also have $T_j \xrightarrow{wr,s} T_i$ in SSG(H) due to y, $T_i$ will actually also read x from $T_j$ and not $T_k$ according to the Snapshot-Read property of reading the last committed version. Thus, $T_j \xrightarrow{wr} T_i$ in SSG(H) is due to x and our assumption is wrong.

(2) Assume that USG(RH) contains a $T_j \xrightarrow{ww} T_i$ due to data item x taken from $SSG(RH^k)$ while the corresponding edge in SSG(H) is not due to x but due to y. At the same time, as $T_j$ and $T_i$ both write x, H must order $x_j$ and $x_i$ and SSG(H) must also contain a $ww^+$-dependency path between $T_j$ and $T_i$. As $T_i \xrightarrow{ww^+} T_j$ would lead to a G-1c cycle in SSG(H), $T_j \xrightarrow{ww^+} T_i$ must hold. As by assumption $T_i$ cannot directly write-depend on $T_j$ due to x, there must be at least one other transaction $T_k$ with $T_j \xrightarrow{ww^+} T_k \xrightarrow{ww} T_i$ in SSG(H), and thus in USG(RH). As $T_k$ also writes x and we assume $T_j \xrightarrow{ww} T_i$ in $SSG(RH^k)$ due to data item x, we must either have $T_k \xrightarrow{ww^+} T_j$ or $T_i \xrightarrow{ww^+} T_k$ in $SSG(RH^k)$ due to x. But both will result in a G-1c cycle in USG(RH) since there is already $T_j \xrightarrow{ww^+} T_k \xrightarrow{ww} T_i$ in USG(RH).

(3) Now assume that $SSG(RH^k)$ contains a $T_i \overset{rw}{\dashrightarrow} T_j$ due to data item x for transaction $T_i$ local at $R^k$. Furthermore there must be a $T_k \xrightarrow{wr} T_i$ due to x as we assume that a start transaction writes all data items read by transactions in the history. By the definition of wr- and rw-dependency edges this means that there is also a $T_k \xrightarrow{ww} T_j$ in $SSG(RH^k)$ due to x. As we have seen already that all ww- and wr-dependency edges in USG(RH) and SSG(H) are due to the same data item, $T_k \xrightarrow{wr} T_i$ and $T_k \xrightarrow{ww} T_j$ also exist in SSG(H) due to x. Thus, due to Proposition 2.5, this means that SSG(H) also has $T_i \overset{rw}{\dashrightarrow} T_j$ due to x.

Appendix A.1 Complete Proof of Lemma 3.5

Recall that the lemma states that a (non-replicated) SI-history H over a set of transactions T avoids G-SIb*.

Proof. Assume there is an SI-history H that has phenomenon G-SIb*. SSG(H) cannot have a cycle with only one rw-dependency because it avoids G-SIb. Thus, SSG(H) has a cycle c with m (m > 1) rw-dependency edges and each rw-dependency edge is prefixed by a ww-, wr-, or start-dependency edge. Firstly, we can easily derive that SSG(H) must have a cycle c′ with m (m > 1) rw-dependency edges in which all other edges are start-dependency edges. This is because whenever there is a ww- or wr-dependency edge from $T_i$ to $T_j$ there is also a start-dependency edge because of G-SIa. Thus, in the following, we only consider a cycle that consists of m rw-dependency edges, in which all other edges are start-dependency edges, and each rw-dependency edge is prefixed by a start-dependency edge. We can break the cycle into m sections. Each section $k \in \{0, \ldots, m-1\}$ has the pattern $T_{i_k} \xrightarrow{S^+} T_{j_k} \overset{rw}{\dashrightarrow} T_{i_{(k+1)\%m}}$. According to Table II, we can derive for the $\prec_t$-order of H for each section k due to transitivity:

$$\left.\begin{array}{l} T_{i_k} \xrightarrow{S^+} T_{j_k} \;\Rightarrow\; c_{i_k} \prec_t s_{j_k} \\ T_{j_k} \overset{rw}{\dashrightarrow} T_{i_{(k+1)\%m}} \;\Rightarrow\; s_{j_k} \prec_t c_{i_{(k+1)\%m}} \end{array}\right\} \;\Rightarrow\; c_{i_k} \prec_t c_{i_{(k+1)\%m}}$$

If we now look at all sections, we obtain: $c_{i_0} \prec_t c_{i_1} \prec_t \cdots \prec_t c_{i_k} \prec_t c_{i_{k+1}} \cdots \prec_t c_{i_{m-1}} \prec_t c_{i_0}$. Since $\prec_t$ is irreflexive this results in a contradiction.

Appendix A.2 Complete Proof of Lemma 3.7

Recall that this lemma indicates that for a replicated history RH where USG(RH) has no G-1c cycles, two conflicting committed transactions commit in the same order in all local histories.

Proof. Assume two write transactions $T_i$ and $T_j$ updating the same data object, and two arbitrary replicas $R^A$ and $R^B$. Since all local histories commit the same set of update transactions, we know that if $c_i^B$ and $c_j^B$ occur in $RH^B$ so do $c_i^A$ and $c_j^A$ in $RH^A$ and vice versa. Now assume $c_i^A \prec_t c_j^A$ in $RH^A$ and $c_j^B \prec_t c_i^B$ in $RH^B$. Let x be one of the objects that $T_i$ and $T_j$ both update. As the Snapshot-Write property requires the version order to follow the commit order, $c_i^A \prec_t c_j^A$ implies $x_i \ll x_j$ in $RH^A$. By the definition of ww-dependency edges (as in Table I), if $x_i \ll x_j$, then $SSG(RH^A)$, and thus USG(RH), have a path $T_i \xrightarrow{ww^+} T_j$ consisting of only ww-dependency edges. Similarly, $c_j^B \prec_t c_i^B$ in $RH^B$ will lead to $T_j \xrightarrow{ww^+} T_i$ in USG(RH). This results in USG(RH) having a cycle consisting only of ww-dependency edges. This contradicts the assumption that USG(RH) avoids G-1c.
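The property established by this proof can also be tested directly on recorded local histories. The sketch below is an illustration under assumptions of our own: each replica's history is reduced to the list of update transactions in local commit order, every replica commits the same set of update transactions, and `writesets` gives the items each transaction writes. The function name is hypothetical.

```python
from itertools import combinations


def commit_order_mismatches(commit_orders, writesets):
    """Return conflicting pairs that commit in different orders somewhere.

    commit_orders: dict replica -> list of txn ids in local commit order.
    writesets:     dict txn id -> set of written data items.
    """
    positions = {
        replica: {txn: idx for idx, txn in enumerate(order)}
        for replica, order in commit_orders.items()
    }
    mismatches = []
    for ti, tj in combinations(sorted(writesets), 2):
        if not writesets[ti] & writesets[tj]:
            continue  # no ww-conflict, so the local orders may differ
        # Collect "Ti before Tj?" for every replica; more than one distinct
        # answer corresponds to a ww-cycle between Ti and Tj in the USG.
        answers = {pos[ti] < pos[tj] for pos in positions.values()}
        if len(answers) > 1:
            mismatches.append((ti, tj))
    return mismatches


orders = {"A": ["T1", "T2", "T3"], "B": ["T2", "T1", "T3"]}
print(commit_order_mismatches(orders, {"T1": {"x"}, "T2": {"x"}, "T3": {"y"}}))
print(commit_order_mismatches(orders, {"T1": {"x"}, "T2": {"z"}, "T3": {"y"}}))
```

In the second call T1 and T2 do not conflict, so their differing commit order at the two replicas is not reported.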

Appendix A.3 Complete Proof of Theorem 3.9

Recall that Theorem 3.9 indicates that a replicated history RH is 1-copy-SI if the following holds:

(1) For each $R^k \in R$, $RH^k$ is an SI-history.

(2) For all update transactions $T_i \in T$ and for all $R^k, R^l \in R$: $c_i^k \Longleftrightarrow c_i^l$.

(3) USG(RH) has no G-1c or G-SIb* cycles.

Proof. To prove this, according to the definition of 1-copy-SI (Definition 3.4), it is sufficient to show that we are able to construct an SI-history H over T with the same ww-, wr-, and rw-dependencies as USG(RH). The proof consists of three parts. First, we create a global history H based on the dependency edges in USG(RH). Then, we show that H really has the same dependency edges as USG(RH). Finally, we show that H is actually an SI-history.

Part (1): To construct a history H.

We first build a total order between start and commit operations of all committed transactions. Then we fill in the read and write operations, determine the version orders, and indicate the versions read by the read operations.

Step 1: Partially ordering starts and commits. In order to obtain this total order we construct a Start-Commit-Order Serialization Graph, SCSG(RH), where the vertices are the start and commit operations of all committed transactions.

1. For each $T_i$ in USG(RH), there is an edge $s_i \xrightarrow{<} c_i$ in SCSG(RH). This reflects the fact that the $\prec_t$-order requires the start of a transaction to be before its commit, i.e., $s_i \prec_t c_i$.

2. For each $T_i, T_j$, $i \neq j$, in USG(RH) there is an edge $c_i \xrightarrow{<} s_j$ in SCSG(RH) iff $T_i \xrightarrow{e} T_j$ in USG(RH) where $e \in \{ww, wr\}$. This reflects the fact that these dependencies imply $c_i \prec_t s_j$ in an SI-history.$^{12}$

3. For each $T_i, T_j$, $i \neq j$, in USG(RH) there is an edge $s_i \xrightarrow{<} c_j$ in SCSG(RH) iff $T_i \overset{rw}{\dashrightarrow} T_j$. This reflects the fact that an rw-dependency implies $s_i \prec_t c_j$.$^{13}$

Now we show there is no cycle in SCSG(RH), and thus there is a partial order of start and commit operations. We do this by contradiction. Assume there is a cycle. It is important to note that all edges in SCSG(RH) are placed between start and commit operations (i.e., there are neither $s_i \xrightarrow{<} s_j$ nor $c_i \xrightarrow{<} c_j$ edges). Thus, without loss of generality, we can break the cycle into m (m ≥ 1) sections:

$$c_{i_k} \xrightarrow{<} s_{j_k} \xrightarrow{<} c_{i_{(k+1)\%m}} \quad (\text{where } 0 \le k < m)$$

In section k, the first edge $c_{i_k} \xrightarrow{<} s_{j_k}$ must be derived from an edge $T_{i_k} \xrightarrow{e} T_{j_k}$ ($e \in \{wr, ww\}$) in USG(RH). The second edge $s_{j_k} \xrightarrow{<} c_{i_{(k+1)\%m}}$ must be derived either by (a) the $\prec_t$-order within a transaction, i.e., $j_k = i_{(k+1)\%m}$, or by (b) an rw-dependency between different transactions $T_{j_k} \overset{rw}{\dashrightarrow} T_{i_{(k+1)\%m}}$ ($j_k \neq i_{(k+1)\%m}$). We discuss all possibilities.

Assume that all edges of type $s_{j_k} \xrightarrow{<} c_{i_{(k+1)\%m}}$ are derived by (a) (i.e., $j_k = i_{(k+1)\%m}$), i.e., no edge was derived by an rw-dependency. Thus, the cycle in SCSG(RH) is due to a cycle in USG(RH) that consists only of ww- and wr-dependency edges. However, USG(RH) does not have G-1c cycles.

Therefore, there must be at least one section in the cycle such that $s_{j_k} \xrightarrow{<} c_{i_{(k+1)\%m}}$ is due to (b) (i.e., due to an rw-dependency). Note that $s_{j_k} \xrightarrow{<} c_{i_{(k+1)\%m}}$ must be prefixed with a $c_{i_k} \xrightarrow{<} s_{j_k}$ in the cycle. Thus, the cycle in SCSG(RH) must be due to a cycle in USG(RH) with one or more rw-dependencies where each rw-dependency is prefixed by a ww- or wr-dependency edge. This contradicts the fact that USG(RH) has no G-SIb* cycles.
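The three edge rules of Step 1 and the acyclicity argument can be sketched in a few lines. This is a minimal illustration, not part of the proof: the USG is assumed to be given as a list of `(Ti, kind, Tj)` triples with `kind` in `{"ww", "wr", "rw"}`, the function names are hypothetical, and cycle detection is delegated to the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter


def build_scsg(transactions, usg_edges):
    """Build SCSG(RH) as a dict mapping each node to its predecessors.

    Nodes are ("s", T) and ("c", T) for the start and commit of T.
    """
    preds = {}
    for t in transactions:
        preds[("s", t)] = set()
        preds[("c", t)] = {("s", t)}          # rule 1: s_i -> c_i
    for ti, kind, tj in usg_edges:
        if ti == tj:
            continue
        if kind in ("ww", "wr"):
            preds[("s", tj)].add(("c", ti))   # rule 2: c_i -> s_j
        elif kind == "rw":
            preds[("c", tj)].add(("s", ti))   # rule 3: s_i -> c_j
        else:
            raise ValueError(f"unknown dependency kind: {kind}")
    return preds


def is_acyclic(preds):
    """True if the graph has a topological order, i.e. no cycle."""
    try:
        list(TopologicalSorter(preds).static_order())
        return True
    except CycleError:
        return False


# Write skew (two rw edges, none prefixed by ww/wr): SCSG stays acyclic.
skew = build_scsg(["T1", "T2"], [("T1", "rw", "T2"), ("T2", "rw", "T1")])
print(is_acyclic(skew))
# A wr edge followed by an rw edge back: c_1 -> s_2 -> c_1 is a cycle.
bad = build_scsg(["T1", "T2"], [("T1", "wr", "T2"), ("T2", "rw", "T1")])
print(is_acyclic(bad))
```

The second example corresponds to a G-SIb* cycle in the USG, which is exactly the case the proof rules out.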

Step 2: Totally ordering start and commit operations. SCSG(RH) so far defines a partial order between start and commit operations. We make this a total order (that is, connecting any start with any commit) in the following way: For any $c_i, s_j$, $i \neq j$, that are not connected in the graph (i.e., there is no path from $c_i$ to $s_j$ or from $s_j$ to $c_i$), we set $s_j \xrightarrow{<} c_i$. This will not lead to any new cycles by construction.

$^{12}$ Note that we assume that a transaction does not read its own writes and only writes an object once; therefore there is no $T_i \xrightarrow{ww,wr} T_i$ edge in USG(RH).

$^{13}$ Note that we do not consider $i = j$ as we have already $s_i \xrightarrow{<} c_i$ in SCSG(RH) due to step 1.

Now we set the $\prec_t$-order between start and commit operations in T for our global history H according to SCSG(RH). For any aborted transaction $T_i$, we just order the $s_i$ at the very beginning (as sources of SCSG(RH)). We simply set $a_i$ immediately after its $s_i$.
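Step 2 can be sketched as follows. The sketch reflects one possible reading of "by construction": edges are added one at a time and connectivity is rechecked before each addition, so that an edge $s_j \to c_i$ is only inserted when no path from $c_i$ to $s_j$ exists at that moment. The successor-set representation and the function names are assumptions made for this example.

```python
def reachable(succ, src, dst):
    """Depth-first search: is there a path from src to dst?"""
    stack, seen = [src], {src}
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        for nxt in succ.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def totalize(succ, transactions):
    """Connect every commit c_i with every start s_j (i != j).

    succ: dict node -> set of successor nodes, nodes being ("s", T) or
    ("c", T). Returns a new graph; the input is left unchanged.
    """
    total = {node: set(targets) for node, targets in succ.items()}
    for ti in transactions:
        for tj in transactions:
            if ti == tj:
                continue
            c, s = ("c", ti), ("s", tj)
            # Recheck connectivity on the graph as extended so far.
            if not reachable(total, c, s) and not reachable(total, s, c):
                total.setdefault(s, set()).add(c)   # set s_j -> c_i
    return total


txns = ["Ti", "Tj", "Tk", "Tl"]
graph = {("s", t): {("c", t)} for t in txns}
graph[("c", "Ti")] = {("s", "Tl")}   # c_i -> s_l
graph[("c", "Tk")] = {("s", "Tj")}   # c_k -> s_j
total = totalize(graph, txns)
print(sorted(total[("s", "Tj")]))
```

In this example the pairs $(c_i, s_j)$ and $(c_k, s_l)$ are both unconnected initially. Inserting $s_j \to c_i$ and $s_l \to c_k$ at the same time would close the cycle $c_i \to s_l \to c_k \to s_j \to c_i$, whereas after inserting the first edge the second pair is already connected and no further edge is added for it.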

Step 3: Ordering write and read operations. Then we include the read and write operations of each committed transaction $T_i$ into $\prec_t$ of H by setting them after $s_i$ and before $c_i$ according to the execution order within the transaction.

Step 4: Totally ordering versions of data items. We now have to determine the version order of all versions created by committed transactions. According to Proposition 3.7, all local histories $RH^k$ at the different replicas have the same version orders for all data items. We use these version orders for H.

Step 5: Determining the versions of read operations. Finally, we have to determine for each read operation of $T_i$ on object x, the version that is read. We do this in the following way. Let $T_i$ be a transaction local in $R^l$ for RH and let $T_i^l$ perform $r_i^l(x_j)$. Then we set $r_i(x_j)$ in H.

Part (2): SSG(H) has exactly the same ww-, wr- and rw-dependency edges as USG(RH)

USG(RH) and SSG(H) have the same ww-dependency edges because both have the same version order for each object x by construction step 4. As a transaction only performs read operations at its local history, USG(RH) contains for each read operation of $T_i$ only the wr-dependency edge derived from $T_i$'s local history. As $T_i$ in H reads the same version as it reads in the local history in RH according to step 5 above, SSG(H) has the same wr-dependency edges. Finally, since SSG(H) and USG(RH) have the same wr- and ww-dependency edges, based on Proposition 2.5, they must have the same rw-dependency edges.
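The last step of this argument relies on rw-dependency edges being fully determined by the version orders and the versions read. A small sketch makes this concrete. It assumes the usual definition of a direct anti-dependency (a reader of version $x_k$ anti-depends on the transaction that installs the next version of x after $x_k$); the data layout and the function name are hypothetical.

```python
def derive_rw_edges(version_order, reads):
    """Derive rw (anti-dependency) edges from version orders and reads.

    version_order: dict item -> list of writer txns in version order.
    reads:         iterable of (reader, item, writer_of_version_read).
    Returns a set of (reader, next_writer) pairs.
    """
    edges = set()
    for reader, item, writer in reads:
        versions = version_order[item]
        idx = versions.index(writer)
        if idx + 1 < len(versions):
            next_writer = versions[idx + 1]
            if next_writer != reader:
                edges.add((reader, next_writer))
    return edges


order = {"x": ["T0", "T1", "T2"]}
print(derive_rw_edges(order, [("T3", "x", "T1")]))
```

Since H and RH use the same version orders (step 4) and the same versions read (step 5), feeding either of them to such a function yields the same set of rw edges.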

Part (3): H is an SI-history

G-1a is avoided since aborted transactions do not have any write operations in H. G-1c is avoided since SSG(H) has the same ww- and wr-dependency edges as USG(RH) and USG(RH) has no G-1c cycle. G-SIa is avoided because during Step 1 of the construction of H, whenever $T_i \xrightarrow{ww/wr} T_j$ in USG(RH), we have $c_i \prec_t s_j$ in H, and thus, whenever $T_i \xrightarrow{ww/wr} T_j$ in SSG(H) we have $T_i \xrightarrow{S} T_j$ in SSG(H).

In order to show that H avoids G-SIb we cannot rely only on USG(RH) having no G-SIb* cycle, because SSG(H) has more edges than USG(RH), namely start-dependency edges, and thus might still have a cycle containing an rw-dependency and some start-dependency edges. Through construction step 1 for H we made sure that whenever there is a wr- or ww-dependency edge in USG(RH), and thus in SSG(H), we have a corresponding start-dependency edge in SSG(H). Therefore, if there is any G-SIb cycle, then SSG(H) must have a cycle $T_i \xrightarrow{S^+} T_j \overset{rw}{\dashrightarrow} T_i$. $T_i \xrightarrow{S^+} T_j$ is derived from $c_i \prec_t s_j$ in H. $T_j \overset{rw}{\dashrightarrow} T_i$ must also appear in USG(RH) and therefore, according to step 1 of the construction, we have set $s_j \prec_t c_i$ in H. This results in $c_i \prec_t c_i$, which is impossible because we have shown that our construction of $\prec_t$ is irreflexive. Hence G-SIb is avoided.

Appendix A.4 Complete Proof of Lemma 3.11

Recall that this lemma indicates that in a 1-copy-SI replicated history RH, if there are two concurrent transactions updating the same data item, at least one of them aborts.

Proof. Assume both transactions commit. It is not possible that there is a local history $RH^k$ in which $T_i$ and $T_j$ are concurrent, as $RH^k$ is SI and the Snapshot-Write property forbids two concurrent transactions from updating the same object and both committing. Therefore, $T_i$ and $T_j$ must be concurrent because there are two local histories $RH^k$ and $RH^l$, $k \neq l$, with $s_i^k \prec_t c_j^k$ and $s_j^l \prec_t c_i^l$. Furthermore, according to Proposition 3.8, both $RH^k$ and $RH^l$ must commit $T_i$ and $T_j$ in the same order. Assume without loss of generality that this order is $c_i \prec_t c_j$. This implies $s_i \prec_t c_j$. Therefore, at $RH^l$ we have $s_i^l \prec_t c_j^l$ and $s_j^l \prec_t c_i^l$, meaning $T_i$ and $T_j$ are concurrent at $RH^l$, which is impossible as shown above.
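The concurrency condition used in this proof can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: the `Txn` record, its integer `start` and `commit` positions (standing in for the order $\prec_t$ within one local history), and the function names are hypothetical and do not appear in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    """A committed transaction in one local history (hypothetical model)."""
    name: str
    start: int          # position of s_i in the time-precedes order
    commit: int         # position of c_i in the time-precedes order
    writes: frozenset   # data items updated by the transaction


def concurrent(a: Txn, b: Txn) -> bool:
    """T_a and T_b are concurrent iff s_a precedes c_b and s_b precedes c_a."""
    return a.start < b.commit and b.start < a.commit


def violates_snapshot_write(a: Txn, b: Txn) -> bool:
    """True if two committed transactions are concurrent and update a common item."""
    return concurrent(a, b) and bool(a.writes & b.writes)


# Example values (assumed for illustration only).
t1 = Txn("T1", start=1, commit=4, writes=frozenset({"x"}))
t2 = Txn("T2", start=2, commit=5, writes=frozenset({"x"}))
t3 = Txn("T3", start=5, commit=6, writes=frozenset({"x"}))

print(violates_snapshot_write(t1, t2))  # overlapping lifetimes, same item
print(violates_snapshot_write(t1, t3))  # T3 starts after T1 commits
```

In this model, the lemma says that a pair such as `t1` and `t2` cannot both appear as committed in any local history of a 1-copy-SI replicated history.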

Appendix B. PROOFS FOR SI+IC

Appendix B.1 Complete Proof of Theorem 4.7

The theorem states that an SI+IC history H over a set of IC-obeying transactions T avoids G-1, G-SI and G-IC.

Proof. Adya [1999] contains the proofs that show that a history that fulfills Snapshot-Read and Snapshot-Write avoids G-1 and G-SIa. Since their definitions have not changed as they do not relate to integrity reads, we refer the interested reader to [Adya 1999]. Thus, we only need to prove that G-IC and the new definition of G-SIb are avoided.

Since all integrity reads in H are IC-consistent, SSG(H) clearly avoids G-IC. An IC-read-dependency edge $T_i \xrightarrow{wir} T_j$ means that $T_j$ directly IC-read-depends on $T_i$. Since $T_j$'s predicate reads are IC-consistent, $c_i \prec_t c_j$ must hold in H, which implies a commit-dependency edge from $T_i$ to $T_j$ in SSG(H). An IC-anti-dependency edge $T_i \xrightarrow{irw} T_j$ means that $T_j$ directly IC-anti-depends on $T_i$. Since the read is IC-consistent, $c_i \prec_t c_j$ must hold, which implies a commit-dependency edge from $T_i$ to $T_j$.

Assume that G-SIb is not avoided. There will be a cycle in which the rw-dependency edge is prefixed by a ww-, wr- or start-dependency edge. Since G-SIa and G-IC are avoided, there must also be a cycle that consists only of start- and commit-dependency edges and a single rw-dependency edge. That is, the cycle has the form

$$(T_i \xrightarrow{S^{*}} T_j \xrightarrow{C^{*}} T_k)^{*} \xrightarrow{S^{+}} T_p \xrightarrow{rw} T_i.$$

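The path $(T_i \xrightarrow{S^{*}} T_j \xrightarrow{C^{*}} T_k)^{*} \xrightarrow{S^{+}} T_p$ implies $c_i \prec_t s_j \prec_t c_j \prec_t c_k \prec_t s_p$. As $\prec_t$ is irreflexive, it is not possible that $s_p \prec_t c_i$. We now show that $c_i \prec_t s_p$ is also not possible, and thus, no such cycle can exist.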

Assume $c_i \prec_t s_p$. The edge $T_p \xrightarrow{rw} T_i$ is due to one of $T_p$'s standard read operations $r_p(x_k)$, $T_i$ performing a $w_i(x_i)$, and $x_i$ following $x_k$ in the version order, i.e., $x_k \ll x_i$. Taking our assumption that $c_i \prec_t s_p$, if we look at the Snapshot-Read Definition 2.3, we can see that property 2.b must hold, requiring that $c_i \prec_t c_k$. According to the Snapshot-Write property (Definition 2.4) this means that $x_i \ll x_k$, which is a contradiction.
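The cycle shape considered in this proof (a single rw-dependency edge closed by a path of other edges) can be searched for mechanically. The sketch below is an illustration only, not the paper's definition of G-SIb: it assumes a graph given as a list of labelled edges and treats every edge that is not labelled `"rw"` alike, which simplifies the distinction between start-, commit-, ww- and wr-dependency edges. The function name and edge labels are hypothetical.

```python
from collections import defaultdict


def has_single_rw_cycle(edges):
    """Return True if some cycle contains exactly one rw edge.

    edges: iterable of (src, dst, kind); kind "rw" marks an anti-dependency,
           any other kind (e.g. "S", "C", "ww", "wr") is treated as non-rw.
    A cycle with exactly one rw edge u -> v exists iff v reaches u
    through non-rw edges only.
    """
    non_rw = defaultdict(set)
    rw_edges = []
    for src, dst, kind in edges:
        if kind == "rw":
            rw_edges.append((src, dst))
        else:
            non_rw[src].add(dst)

    def reaches(start, goal):
        # Iterative depth-first search over non-rw edges.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(non_rw[node])
        return False

    return any(reaches(dst, src) for src, dst in rw_edges)


# Tk starts-before Ti, and Ti anti-depends back to Tk: the forbidden shape.
print(has_single_rw_cycle([("Tk", "Ti", "S"), ("Ti", "Tk", "rw")]))
# An rw edge alone closes no cycle.
print(has_single_rw_cycle([("Ti", "Tk", "rw")]))
```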

Appendix B.2 Complete Proof of Theorem 4.8

The theorem states that if a history H over a set of IC-obeying transactions T avoids G-1, G-SI and G-IC, then it is an SI+IC history.

Proof. For Snapshot-Read (see Definition 2.3) and Snapshot-Write (see Definition 2.4), we use proofs similar to those in [Adya 1999].

—Assume Snapshot-Write is not satisfied because property (1) is violated, that is, committed transactions $T_i$ and $T_j$ both update data item $x$ and are concurrent. Without loss of generality, assume $T_i$ commits before $T_j$. Then there is a write-dependency edge from $T_i$ to $T_j$ without a start-dependency edge in the same direction. This contradicts the avoidance of G-SIa.

—Assume Snapshot-Write is not satisfied because property (2) is violated, that is, $T_i$ and $T_j$ both update the same data item, $T_i$ commits before $T_j$ but $x_j \ll x_i$. The commit order implies $T_i \xrightarrow{S} T_j$ and the version order implies a $T_j \xrightarrow{ww} T_i$ in the opposite direction. Thus, it contradicts the avoidance of G-SIa.

—Assume Snapshot-Read is not satisfied because property (1) is violated, that is, because $T_i$ reads a data item (e.g., $x$) written by a concurrent transaction $T_j$ (i.e., $r_i(x_j)$ and $s_i \prec_t c_j$). But this would mean SSG(H) has a read-dependency edge from $T_j$ to $T_i$ without there being also a start-dependency edge, and H would not avoid G-SIa.

—Assume Snapshot-Read is not satisfied because property (2) is violated, that is, because $T_i$ reads data from an old snapshot instead of the latest snapshot, i.e., $r_i(x_j)$ and there is a $w_k(x_k)$ with $c_k \prec_t s_i$ and $x_j \ll x_k$.

  —Assume $x_k$ is the version directly following $x_j$. Due to Proposition 2.5, there is $T_i \xrightarrow{rw} T_k$ in SSG(H). According to our assumption $c_k \prec_t s_i$, we have $T_k \xrightarrow{S} T_i$, which leads to a cycle between $T_i$ and $T_k$ where the anti-dependency edge from $T_i$ to $T_k$ is prefixed by a start-dependency edge from $T_k$ to $T_i$. Thus H would not avoid G-SIb.

  —Assume $x_{j+1}$ is the version directly following $x_j$ and $x_{j+1} \ll x_k$. Then, we have $T_i \xrightarrow{rw} T_{j+1}$, $T_{j+1} \xrightarrow{ww^{+}/S^{+}} T_k$ (due to G-SIa) and $T_k \xrightarrow{S} T_i$ (according to our assumption), again leading to a cycle with one anti-dependency which is prefixed by a start-dependency edge. Thus, again H would not avoid G-SIb.

Assume there exists an integrity read that is not IC-consistent.

—The integrity read could violate property (1) of Definition 4.4, i.e., a transaction $T_i$ directly IC-read-depends on $T_j$ but $c_i \prec_t c_j$ or there is no order between $c_i$ and $c_j$. However, then SSG(H) would have an IC-read-dependency edge $T_j \xrightarrow{wir} T_i$ without a $T_j \xrightarrow{C} T_i$ and not avoid G-IC.

—The integrity read could violate property (2) of Definition 4.4, i.e., a transaction $T_k$ directly IC-anti-depends on $T_i$ but either $c_k \prec_t c_i$ or there is no order between $c_i$ and $c_k$. Then SSG(H) would have an IC-anti-dependency edge $T_i \xrightarrow{irw} T_k$ without a $T_i \xrightarrow{C} T_k$ and not avoid G-IC.
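The two cases above share one pattern: an IC-read- or IC-anti-dependency edge that is not accompanied by a commit-dependency edge in the same direction. As a minimal sketch, assuming commit events are given as integer positions in $\prec_t$ (with a missing entry standing for "no order"), the check can be written as follows. The function name, argument layout and edge labels are hypothetical.

```python
def g_ic_violations(commit_pos, ic_edges):
    """Return the IC edges that are not backed by a commit-dependency edge.

    commit_pos: dict mapping a transaction name to the position of its
                commit in the time-precedes order; a missing name means
                the commit is not ordered.
    ic_edges:   iterable of (src, dst, kind) with kind in {"wir", "irw"}.
    An edge src -> dst is backed iff commit(src) precedes commit(dst).
    """
    bad = []
    for src, dst, kind in ic_edges:
        c_src, c_dst = commit_pos.get(src), commit_pos.get(dst)
        if c_src is None or c_dst is None or not c_src < c_dst:
            bad.append((src, dst, kind))
    return bad


# Example values (assumed for illustration only).
commits = {"T1": 3, "T2": 7}
edges = [("T1", "T2", "wir"), ("T2", "T1", "irw")]
print(g_ic_violations(commits, edges))
```

An empty result corresponds to every integrity read being IC-consistent with respect to the two properties of Definition 4.4 discussed above.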

Appendix C. PROOFS FOR 1-COPY-SI+IC

Appendix C.1 Complete Proof of Theorem 5.4

The theorem states that a replicated history RH is 1-copy-SI+IC if the following holds:

—For each $R^k \in R$, $RH^k$ is an SI+IC-history.

—For all update transactions $T_i \in T$ and for all $R^k, R^l \in R$, $c_i^k \iff c_i^l$.

—There exists a USG-IC(RH) that has no G-1c* or G-SIb* cycles.

Proof. The proof is similar to the one for Theorem 3.9. We only show the parts that differ from the proof of Theorem 3.9. We have to construct an SI+IC-history H with the same nodes and the same ww-, wr-, and rw-dependency edges as USG(RH).

Part (1): To construct a history H.

Step 1 builds the same Start-Commit-Order Serialization Graph, SCSG(RH), from USG-IC(RH) as described in the proof of Theorem 3.9, which provides a partial order between pairs of start and commit operations. However, we add additional edges:

4. For each $T_i, T_j$ ($i \neq j$) in USG-IC(RH), there is an edge $c_i \xrightarrow{<} c_j$ in SCSG(RH) iff $T_i \xrightarrow{e} T_j$ in USG-IC(RH) where $e \in \{wir, irw\}$. This reflects the requirement that integrity reads be IC-consistent.

Now we show there is no cycle in SCSG(RH). Assume there is a cycle. The cycle in SCSG(RH) consists either (a) entirely of commit operations or (b) of start and commit operations.

For case (a), there will be a corresponding cycle in USG-IC(RH) that consists entirely of wir- and irw-dependency edges. This contradicts the fact that USG-IC(RH) has no G-1c* cycle.

For case (b), since there is no $s_i \xrightarrow{<} s_j$ in SCSG(RH), we can break the cycle into sections with either (i) the pattern $c_i \xrightarrow{<^{+}} c_j$, or (ii) the pattern $c_i \xrightarrow{<} s_j \xrightarrow{<} c_k$.

Pattern (i) $c_i \xrightarrow{<^{+}} c_j$ is due to a path of wir- and irw-dependency edges from $T_i$ to $T_j$ in USG-IC(RH). In pattern (ii), $c_i \xrightarrow{<} s_j$ must be due to $T_i \xrightarrow{wr,ww} T_j$. The edge $s_j \xrightarrow{<} c_k$ might be because $s_j$ and $c_k$ are in the same transaction (i.e., $j = k$) or because of $T_j \xrightarrow{rw} T_k$. If all dependencies $s_j \xrightarrow{<} c_k$ are due to $j = k$, we know that there is no rw-dependency edge in the cycle. The cycle must consist entirely of ww-, wr-, wir-, and irw-dependency edges. This contradicts the fact that USG-IC(RH) has no G-1c* cycles. If some dependencies $s_j \xrightarrow{<} c_k$ are due to $T_j \xrightarrow{rw} T_k$, we know that each must be preceded by a $c_i \xrightarrow{<} s_j$ that was due to a $T_i \xrightarrow{wr,ww} T_j$. Thus, there must be a cycle in USG-IC(RH) such that all of its rw-dependency edges are prefixed with a ww- or wr-dependency edge. This contradicts the fact that USG-IC(RH) has no G-SIb* cycles.

Thus, our extended SCSG(RH) does not contain any cycles. We can construct H by using steps 2-5 of part (1) of the proof of Theorem 3.9. Additionally, we add Step 6 after Step 5.
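The construction of the extended SCSG(RH) and the acyclicity argument can be illustrated by a short sketch. The edge rules below are reconstructed from the case analysis above ($s_i$ before $c_i$, $c_i$ before $s_j$ for ww/wr, $s_i$ before $c_j$ for rw, and $c_i$ before $c_j$ for wir/irw); the original Step 1 of Theorem 3.9 is not reproduced in this appendix section, so this is an assumption. The function names are hypothetical, and the topological sort (Python 3.9+ `graphlib`) merely stands in for one possible way of extending the partial order to a total order; it is not claimed to be the paper's Step 2.

```python
from graphlib import CycleError, TopologicalSorter


def build_scsg(txns, usg_edges):
    """Return SCSG as a map: event -> set of events that must precede it.

    Events are pairs ("s", T) and ("c", T) for start and commit.
    usg_edges: iterable of (src, dst, kind),
               kind in {"ww", "wr", "rw", "wir", "irw"}.
    """
    preds = {}
    for t in txns:
        preds[("s", t)] = set()
        preds[("c", t)] = {("s", t)}            # s_i before c_i
    for src, dst, kind in usg_edges:
        if kind in ("ww", "wr"):
            preds[("s", dst)].add(("c", src))   # c_i before s_j
        elif kind == "rw":
            preds[("c", dst)].add(("s", src))   # s_i before c_j
        elif kind in ("wir", "irw"):
            preds[("c", dst)].add(("c", src))   # added rule: c_i before c_j
        else:
            raise ValueError(f"unknown edge kind: {kind}")
    return preds


def time_order(txns, usg_edges):
    """Return one total order of all events, or None if SCSG has a cycle."""
    try:
        return list(TopologicalSorter(build_scsg(txns, usg_edges)).static_order())
    except CycleError:
        return None


# T1 is read by T2 (wr) and also feeds T2's integrity read (wir).
print(time_order(["T1", "T2"], [("T1", "T2", "wr"), ("T1", "T2", "wir")]))
# An rw edge prefixed by a wr edge (a G-SIb* shape) yields a cycle.
print(time_order(["T1", "T2"], [("T1", "T2", "wr"), ("T2", "T1", "rw")]))
```

In the second call, $c_1$ must precede $s_2$ (wr) and $s_2$ must precede $c_1$ (rw), which is exactly the kind of cycle the proof rules out.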

Step 6: Determining the versions of integrity read operations. For each integrity read operation $ir_i$ of committed $T_i$ we need to determine the set of versions accessed. When constructing USG-IC(RH), let $R^k \in R$ be the replica such that the wir- and irw-dependency edges for $T_i$ were taken from SSG($RH^k$). Then we let $ir_i$ access the same versions as the corresponding $ir_i^k$ accessed in $RH^k$.

Part (2): SSG(H) has the same ww-, wr- and rw-dependency edges as USG(RH)

This part of the proof is the same as part (2) of the proof of Theorem 3.9.

Part (3): H is an SI+IC history

The part of the proof that shows that H is an SI-history is similar to the proof of Theorem 3.9, part (3), and thus, is omitted.

Construction step 6 guarantees that all committed transactions are IC-obeying. As all integrity reads perform their evaluation on the same data versions as their corresponding integrity reads in one of the local histories, the outcome must be the same, namely true.

What remains to be shown is that the integrity reads are actually IC-consistent. Property (1) of IC-consistency requires that for an integrity read $ir_i(F : P : Oset(P) : Iset(P))$ in H, if $x_j \in Iset(P)$, then $c_j \prec_t c_i$. When constructing USG-IC(RH), let $R^k \in R$ be the replica such that the wir- and irw-dependency edges for $T_i$ were taken from SSG($RH^k$). As $x_j$ is also accessed by the corresponding integrity read $ir_i^k$ in $RH^k$, USG-IC(RH) contains a $T_j \xrightarrow{wir} T_i$. Therefore, our construction step 1.5 includes a $c_j \xrightarrow{<} c_i$ in SCSG(RH), and thus, step 2 sets $c_j \prec_t c_i$.

Property (2) requires that if a transaction $T_j$ directly IC-anti-depends on $T_i$ due to an integrity read $ir_i(F : P : Oset(P) : Iset(P))$ of $T_i$, then $c_i \prec_t c_j$. Let $x_l$ be the version accessed in $Iset(P)$. If $x_l \in Oset(P)$, then $x_j$ is the version following $x_l$ in the version order. If $x_l \notin Oset(P)$, then $x_j$ is the first version after $x_l$ that matches the predicate defined in $P$. In both cases, according to the construction of H, $x_l$ is also accessed by the corresponding integrity read $ir_i^k$ in $RH^k$, and $RH^k$ and H have the same sequence of versions from $x_l$ to $x_j$. Therefore, it is guaranteed that USG-IC(RH) contains a $T_i \xrightarrow{irw} T_j$, and thus, in our construction of H, steps 1.5 and 2, we set $c_i \prec_t c_j$.

...
