29
German Shegalov Transaction Timestamping in Transaction Timestamping in Temporal Databases Temporal Databases FR Informatik FR Informatik Graduiertenkolleg Graduiertenkolleg Ringvorlesung, May 26 th , 2003 based on the research by D. Lomet, C. Jensen and R. Snodgrass

Transaction Timestamping in Temporal Databases

Embed Size (px)

DESCRIPTION

 

Citation preview

Page 1: Transaction Timestamping in Temporal Databases

German Shegalov

Transaction Timestamping in Temporal Transaction Timestamping in Temporal DatabasesDatabases

FR InformatikFR InformatikGraduiertenkollegGraduiertenkolleg

Ringvorlesung, May 26th, 2003

based on the research by D. Lomet, C. Jensen and R. Snodgrass

Page 2: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 2

Outline

• Introduction: conventional vs. temporal– Temporal databases: valid-time, transaction-time– Databases and transactions (ACID principles)– (Optimistic) concurrency control: TO, …– (Pessimistic) concurrency control: 2PL, …

• Timestamping in transaction-time databases – Timestamping and strong 2PL (SS2PL)– Timestamping in distributed setting (2PC)– Timestamping since SQL-92 (CURRENT_TIME)

Page 3: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 3

Conventional vs. Temporal DB

• A conventional DB captures only the most current state of modeled world – e.g. current account balance, employee's salary

• A temporal DB supports a time domain and is thus able to manage time varying data– real-time stock quotes– employee's salaries between 1997 and 2000

Page 4: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 4

Notions of Time• Transaction-timeTransaction-time

– is defined as the time when a fact is stored in the database that allows for as-of queries

• Valid-timeValid-time– is defined when a fact becomes effective (valid) in

reality

• BitemporalBitemporal databases support both of above

Page 5: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 5

Transaction (ACID contract)• AtomicityAtomicity (all or nothing in case of a failure)

begin;

acc1 -= money; acc2 += money;

commit;

• Consistency– rollback updates upon a failed consistency check

• IsolationIsolation– mask inconsistent intermediate state resulting

from concurrent execution

• DurabilityDurability– committed updates must be failure-resilient

Page 6: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 6

Transaction Isolationx=0

r1(x=0)

r2(x=0) w2(x=x+20)w1(x=x+10) x=30

x=10x=10

Lost

Update:

w1(x=10)

r2(x=10)

abort1=w1-1(x)

w2(x=x+10)x=0

x=10x=20x=20

Dirty

Read:

x=0y=0

x=0y=10

Inconsistent

Read: r1(x=0)w2(x=5) w2(y=10)

r1(y=10)

Read/Write, Write/Read, Write/Write are not commutable

Page 7: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 7

CC Protocols• Basic Timestamp Ordering (BTO)Basic Timestamp Ordering (BTO)

– each transaction i obtains a ti timestamp right away

– operations are executed in the scheduled order

– ri(x): if ti ≥ w-time(x) then schedule else aborti

– wi(x): if ti ≥ max{w-time(x), r-time(x)} then schedule else aborti

• Two Phase Locking (2PL) Two Phase Locking (2PL) – prior to execution of an operation an appropriate lock is requested

– no further lock requests after some lock has been released

– lock is granted when no conflicting locks already present

– otherwise add an edge to the Wait-For-Graph (WFG)

– outperforms BTO

• Strong 2PL (SS2PL)Strong 2PL (SS2PL)– locks are held until commit (IBM DB2, MS SQL Server, …)

Page 8: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 8

Outline

• Introduction: conventional vs. temporal– Temporal databases: valid-time, transaction-time– Databases and transactions (ACID principles)– (Optimistic) concurrency control: TO, …– (Pessimistic) concurrency control: 2PL, …

• Timestamping in transaction-time databases – Timestamping and strong 2PL (SS2PL)– Timestamping in distributed setting (2PC)– Timestamping since SQL-92 (CURRENT_TIME)

Page 9: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 9

TT Database Semantics• Each record has a timestamptimestamp

• InsertInsert creates a new record

• UpdateUpdate inserts a new record version

• DeleteDelete inserts an empty record version (delete-stub) for the record being deleted

• Timeslice Timeslice Q(t) Q(t) executes QQ against DB as of tt– returns for each qualifying record the latest version with

timestamp ≤ t≤ t unless it is a delete-stub

– implies that timestamp order must agree with serialization order

Page 10: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 10

Timestamp Selection (simple)

• BTO provides proper timestamp order automatically– but it causes too many transaction restarts

• SS2PL for any pi(x) < qj(x) in conflict:

– pli(x) < pi(x) < puli(x) < ccii < qlj(x) < qulj(x) < ccjj

commit order agrees with serialization order chose commit timecommit time as timestamp timestamping is not used for CC, thus no

additional concurrency limitation

Page 11: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 11

Two Phase Commit (2PC)Coordinator DB1 DB2

force-log begin

Tim

eline

prepareprepare

force-log preparedyes

yesforce-log prepared

commitcommit

force-log commit

force-log commit

force-log commit

force-log end

ack

ack

Page 12: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 12

Timestamping Issues in 2PC• ProblemProblem

– network latencies and loosely synced clocks

– commit points are different at all sites

– max_commit_time < begin_time as perceived by the user

• ObservationObservation– when X is prepared, all conflicting concurrent transactions

will commit after X

• Solution:Solution:– each database ii votes EARLIESTEARLIESTii acceptable timestamp

that is updated after logging prepared

– commit with max{EARLIESTEARLIESTii, begin_time}

Page 13: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 13

2PC for Transaction Time DBCoordinator DB1 DB2

force-log begin(10)

Tim

eline

prepareprepare

force-log prepared;EARLIEST1++ yes(9)

yes(11)force-log prepared;EARLIEST2++

commit(11)commit(11)

force-log commit(11)

force-log commit(11)

force-log commit(11)

force-log end

ack

ack

/*begin_time = 10*/ /*EARLIEST1 = 8*/ /*EARLIEST2 = 10*/

Page 14: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 14

Timestamping since SQL-92

• SQL query can ask for current time with some precision: year, month, date, …, millisecond

• SQL-92 explicitly requires current time value to be fixed just within a single SQL statement

• In TTDB a transaction logically takes place at a single point in time– current time value must not change until commit

Page 15: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 15

"Current Time" Matters

• X1 reads non-current y as of tcurrent

• X3 updates unlocked current y (e.g. a stock goes up enormously)

• some time later: was X1 aware of X3 ?! – based on transaction timestamps: ct1 > ct3 => YES!YES!

– in fact: NOT GUILTY!!!!!!!!!!!!!!!!!!!!!NOT GUILTY!!!!!!!!!!!!!!!!!!!!!

• current time determines user-perceived transaction time

r(y0)X1

X3

time

fix tcurrent

ct1ct3

w3(y3)X2

ct2

w2(y2)

buy

Page 16: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 16

55

c2

Inconsistent Timeslice

• SS2PL acceptsaccepts the schedule above

• X1 reads y from X2 (hence, c2 < c1) => serialization X2 < X1

• timeslice(2) = { (x, 1), (y,0), (z,2) }, when taken after 8 is transaction inconsistent, it has never been currenttransaction inconsistent, it has never been current

• Reason: tX2 > tX1

although X2 < X1

time

X1

X2

fix t1current

11

fix t2current

33

w1(x=1)

22

w2(y=1)

44

r1(y=1)

66

w1(z=2)

77 88

c1

x=0y=0z=0

Page 17: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 17

66

c2

44

c1

Unrepeatable TimesliceX1

X2

fix t1current

22

timeslice1(t1current)

33

w2(y=1)

55

timeslice3(t1current)

77 88

c3

y=0

y=0 y=1

fix t2current

11

• writers after timeslice have to commit with a later timestamp than that of the concurrent timeslicing transaction

time

X3

Page 18: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 18

Solution Requirements

• SS2PL remains the primary CC mechanism– reduce the likelihood of transaction aborts

• If X has tcurrent = t then X has started and not yet committed at time t

• X1 and X2 with t1current < t2

current then

– X1 must not see X2's updates

– there exists an equivalent serial schedule: X1 < X2

Page 19: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 19

Algorithm Design

• each data item d has write-timestamp d.TT

• read timestamp d.TR in volatile memory

• reads define the lower transaction time bound tl

– initially tl := ts (transaction start time)

• VR (initially ) volatile transaction's read-set

• VI (initially ) volatile transaction's write-set (newly inserted versions)

• tX timestamp of transaction X

Page 20: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 20

Before tX Assignment

• Read(d): /*sync tX with conflict write*/

– tl := max{ tl , d.TT } /* prevent tX ≤ d.TT */

– VR := VR {d} /* will have to update d.TR */

• Write(d): /*sync tX with conflict write&read*/

– tl := max{ tl , d.TR, d.TT } /*prevent tX ≤ d.TT and tX ≤

d.TR */

– VI := VI {d} /* will have to update d.TT */

Page 21: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 21

Timestamp tX Assignment

• if because of CURRENT_TIME request– tX := tcurrent /* safe because tcurrent > tl */

• if immediately before COMMIT– tX := tl++ /* smallest possible time greater than tl */

Page 22: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 22

"Who comes too late … " will be punished by scheduler

• Read(d):– if tX < d.TT then abort X

– else VR := VR {d} /* as before */

• Write(d):– if tX < max{ d.TR, d.TT } then abort X

– else VI := VI {d} /* as before */

Page 23: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 23

Optimization I (Precision)

• user-specified current time precision allows for a broader range of acceptable timestamps– e.g. current year "now" and on Dec 31th 2003,

23:59:59,999 is still the same

• tX := tcurrent tX := (tl, th=max(tcurrent ,p)]

• allow data access as before and thus potentially increasing tl

• if tl ≥ th then abort X

• if X could be completely executed tX := tl++

Page 24: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 24

Optimization II ( RTT )• no way to maintain d.TR in main memory

• fixed-size hash table RTT: e.g. 1024 entries

• Di := {d | hash(d) = i} for i in 1 … 1024

• trade-off: RTT size vs. read timestamp accuracy

• Write(d), RTT is checked immediately– tl := max{ tl ,RTT[hash(d)], d.TT }

• Redefine VR to be 1024-bit-bitvector with

– VR[i] = 1, if d has been read and i=hash(d)

– 128 byte overhead to track accessed data items

Page 25: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 25

Commit Processing

• /* update volatile RTT*/ for i:=1 to 1024 do if VR [i] = 1 then RTT[i] := max{ RTT[i], tX }

• /* timestamp data, part of transaction*/ /*either directly or by X-id-tX mapping */ for each d in VI do d.TT := tX

Page 26: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 26

System Crashes• Observation

– timestamping for committed X is safe– RTT passed away and

• so did crash-interrupted X which needed RTT

• committed transactions do not need RTT

– last commit time is before crash time– each new X will start and commit after crash time

• Recovery Action– RTT[i]:= last commit time– conservative read-write sync detection w/o penalty

Page 27: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 27

Summary

• transaction-consistent view on historical data– timestamp order consistent with transaction

serialization order

• Simple timestamp selection at commit time

• Solution for distributed transactions with 2PC

• Solution for "CURRENT_TIME" requests

Page 28: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 28

Outlook

• Impact on multiversion concurrency control– Read-Only Multiversion, Snapshot Isolation

[Weikum + Vossen 01]

Page 29: Transaction Timestamping in Temporal Databases

May 26, 2003 G. Shegalov: Transaction Timestamping in Temporal Databases 29

Questions