Anthony D. Joseph and John Kubiatowicz Slides by Alan Fekete ( University of Sydney)

  • View
    34

  • Download
    0

Embed Size (px)

DESCRIPTION

EECS 262a Advanced Topics in Computer Systems Lecture 10 Transactions and Isolation Levels 2 October 7 th , 2013. Anthony D. Joseph and John Kubiatowicz Slides by Alan Fekete ( University of Sydney) http://www.eecs.berkeley.edu/~kubitron/cs262. Today’s Papers. - PowerPoint PPT Presentation

Text of Anthony D. Joseph and John Kubiatowicz Slides by Alan Fekete ( University of Sydney)

EECS 252 Graduate Computer Architecture Lec 01 - Introduction

EECS 262a Advanced Topics in Computer SystemsLecture 10

Transactions and Isolation Levels 2October 6th, 2014

Alan FeketeSlides by Alan Fekete (University of Sydney), Anthony D. Joseph and John Kubiatowicz (UC Berkeley)

http://www.eecs.berkeley.edu/~kubitron/cs262

10/6/2014#Cs262a-F14 Lecture-10CS252 S051Page #Todays PapersThe Notions of Consistency and Predicate Locks in a Database SystemK.P. Eswaran, J.N. Gray, R.A. Lorie, and I.L. Traiger. Appears in Communications of the ACM, Vol. 19, No. 11, 1976Key Range Locking Strategies for Improved ConcurrencyDavid Lomet. Appears in Proceedings of the 19th VLDB Conference, 1993

Thoughts?10/6/2014#Cs262a-F14 Lecture-10OverviewSerializabilityThe Phantom IssuePredicate LockingKey-Range LocksNext-Key Locking techniquesIndex Management and TransactionsMulti-level reasoning

10/6/2014#Cs262a-F14 Lecture-10Theory and realityTraditional serializability theory treats database as a set of items (Eswaran et al 76 says entities) which are read and writtenTwo phase locking is proved correct in this modelWe now say serializableBut, database has a richer set of operations than just read/writeDeclarative selectsInsertDelete10/6/2014#Cs262a-F14 Lecture-10Review: Goals of Transaction SchedulingMaximize system utilization, i.e., concurrencyInterleave operations from different transactions

Preserve transaction semanticsSemantically equivalent to a serial schedule, i.e., one transaction runs at a time

T1: R, W, R, WT2: R, W, R, R, WR, W, R, W, R, W, R, R, WSerial schedule (T1, then T2):R, W, R, R, W, R, W, R, WSerial schedule (T2, then T1):10/6/2014#Cs262a-F14 Lecture-10Two Key QuestionsIs a given schedule equivalent to a serial execution of transactions?

How do you come up with a schedule equivalent to a serial schedule?

R, W, R, W, R, W, R, R, WR, W, R, R, W, R, W, R, WR, R, W, W, R, R, R, W, WSchedule:Serial schedule (T1, then T2)::Serial schedule (T2, then T1):

10/6/2014#Cs262a-F14 Lecture-10Transaction SchedulingSerial schedule: A schedule that does not interleave the operations of different transactionsTransactions run serially (one at a time)

Equivalent schedules: For any storage/database state, the effect (on storage/database) and output of executing the first schedule is identical to the effect of executing the second schedule

Serializable schedule: A schedule that is equivalent to some serial execution of the transactionsIntuitively: with a serializable schedule you only see things that could happen in situations where you were running transactions one-at-a-time

10/6/2014#Cs262a-F14 Lecture-10Anomalies with Interleaved Execution May violate transaction semantics, e.g., some data read by the transaction changes before committing

Inconsistent database state, e.g., some updates are lost

Anomalies always involves a write; Why?10/6/2014#Cs262a-F14 Lecture-10Anomalies with Interleaved Execution Read-Write conflict (Unrepeatable reads)

Violates transaction semanticsExample: Mary and John want to buy a TV set on Amazon but there is only one left in stock(T1) John logs first, but waits(T2) Mary logs second and buys the TV set right away(T1) John decides to buy, but it is too lateT1:R(A), R(A),W(A)T2: R(A),W(A) 10/6/2014#Cs262a-F14 Lecture-10Anomalies with Interleaved Execution Write-read conflict (reading uncommitted data)

Example: (T1) A user updates value of A in two steps(T2) Another user reads the intermediate value of A, which can be inconsistentViolates transaction semantics since T2 is not supposed to see intermediate state of T1

T1:R(A),W(A), W(A)T2: R(A), 10/6/2014#Cs262a-F14 Lecture-10Anomalies with Interleaved Execution Write-write conflict (overwriting uncommitted data)

Get T1s update of B and T2s update of AViolates transaction serializabilityIf transactions were serial, youd get either:T1s updates of A and BT2s updates of A and B

T1:W(A), W(B)T2: W(A),W(B) 10/6/2014#Cs262a-F14 Lecture-10Conflict Serializable SchedulesTwo operations conflict if theyBelong to different transactionsAre on the same data At least one of them is a write

Two schedules are conflict equivalent iff:Involve same operations of same transactions Every pair of conflicting operations is ordered the same way

Schedule S is conflict serializable if S is conflict equivalent to some serial schedule10/6/2014#Cs262a-F14 Lecture-10Page #Conflict Equivalence IntuitionIf you can transform an interleaved schedule by swapping consecutive non-conflicting operations of different transactions into a serial schedule, then the original schedule is conflict serializableExample:T1:R(A),W(A), R(B),W(B)T2: R(A),W(A), R(B),W(B) T1:R(A),W(A), R(B), W(B)T2: R(A), W(A), R(B),W(B) T1:R(A),W(A),R(B), W(B)T2: R(A),W(A), R(B),W(B) 10/6/2014#Cs262a-F14 Lecture-10Conflict Equivalence Intuition (contd)If you can transform an interleaved schedule by swapping consecutive non-conflicting operations of different transactions into a serial schedule, then the original schedule is conflict serializableExample:T1:R(A),W(A),R(B), W(B)T2: R(A),W(A), R(B),W(B) T1:R(A),W(A),R(B), W(B)T2: R(A), W(A),R(B),W(B) T1:R(A),W(A),R(B),W(B)T2: R(A),W(A),R(B),W(B) 10/6/2014#Cs262a-F14 Lecture-10Conflict Equivalence Intuition (contd)If you can transform an interleaved schedule by swapping consecutive non-conflicting operations of different transactions into a serial schedule, then the original schedule is conflict serializable

Is this schedule serializable?T1:R(A), W(A)T2: R(A),W(A), 10/6/2014#Cs262a-F14 Lecture-10Dependency GraphDependency graph: Transactions represented as nodesEdge from Ti to Tj: an operation of Ti conflicts with an operation of TjTi appears earlier than Tj in the schedule

Theorem: Schedule is conflict serializable if and only if its dependency graph is acyclic

10/6/2014#Cs262a-F14 Lecture-10Page #ExampleConflict serializable schedule:

No cycle!T1T2ADependency graphBT1:R(A),W(A), R(B),W(B)T2: R(A),W(A), R(B),W(B) 10/6/2014#Cs262a-F14 Lecture-10Page #ExampleConflict that is not serializable:

Cycle: The output of T1 depends on T2, and vice-versa

T1:R(A),W(A), R(B),W(B)T2: R(A),W(A),R(B),W(B) T1T2ABDependency graph10/6/2014#Cs262a-F14 Lecture-10Page #Notes on Conflict SerializabilityConflict Serializability doesnt allow all schedules that you would consider correctThis is because it is strictly syntactic - it doesnt consider the meanings of the operations or the data

In practice, Conflict Serializability is what gets used, because it can be done efficientlyNote: in order to allow more concurrency, some special cases do get implemented, such as for travel reservations,

Two-phase locking (2PL) is how we implement it10/6/2014#Cs262a-F14 Lecture-10T1:R(A), W(A), T2: W(A),T3: WA Serializability Conflict SerializabilityFollowing schedule is not conflict serializable

However, the schedule is serializable since its output is equivalent with the following serial schedule

Note: deciding whether a schedule is serializable (not conflict-serializable) is NP-complete

T1T2ADependency graphT1:R(A),W(A), T2: W(A),T3: WA T3AAA10/6/2014#Cs262a-F14 Lecture-10Locks (Simplistic View)Use locks to control access to data

Two types of locks:shared (S) lock multiple concurrent transactions allowed to operate on dataexclusive (X) lock only one transaction can operate on data at a timeSXSXLockCompatibilityMatrix10/6/2014#Cs262a-F14 Lecture-10Page #Two-Phase Locking (2PL)1) Each transaction must obtain: S (shared) or X (exclusive) lock on data before reading, X (exclusive) lock on data before writing2) A transaction can not request additional locks once it releases any locksThus, each transaction has a growing phase followed by a shrinking phase

GrowingPhaseShrinkingPhaseLock Point!Avoid deadlockby acquiring locksin some lexicographic order10/6/2014#Cs262a-F14 Lecture-10Page #Two-Phase Locking (2PL)2PL guarantees conflict serializability

Doesnt allow dependency cycles. Why?

Answer: a dependency cycle leads to deadlockAssume there is a cycle between Ti and TjEdge from Ti to Tj: Ti acquires lock first and Tj needs to waitEdge from Tj to Ti: Tj acquires lock first and Ti needs to waitThus, both Ti and Tj wait for each other Since with 2PL neither Ti nor Tj release locks before acquiring all locks they need deadlockSchedule of conflicting transactions is conflict equivalent to a serial schedule ordered by lock point10/6/2014#Cs262a-F14 Lecture-10ExampleT1 transfers $50 from account A to account B

T2 outputs the total of accounts A and B

Initially, A = $1000 and B = $2000

What are the possible output values?3000, 2950, 3050T1:Read(A),A:=A-50,Write(A),Read(B),B:=B+50,Write(B)T2:Read(A),Read(B),PRINT(A+B)10/6/2014#Cs262a-F14 Lecture-103000, 2950, 3050Page #Is this a 2PL Schedule?1Lock_X(A) 2Read(A)Lock_S(A)3A: = A-504Write(A)5Unlock(A) 6Read(A)7Unlock(A)8Lock_S(B) 9Lock_X(B)10Read(B)11 Unlock(B)12PRINT(A+B)13Read(B)14B := B +5015Write(B)16Unlock(B)No, and it is not serializable10/6/2014#Cs262a-F14 Lecture-10Is this a 2PL Schedule?1Lock_X(A) 2Read(A)Lock_S(A)3A: = A-504Write(A)5Lock_X(B) 6Unlock(A) 7Read(A)8Lock_S(B)9Read(B)10B := B +5011Write(B)12Unlock(B) 13Unlock(A)14Read(B)15Unlock(B)16PRINT(A+B)Yes, so it is se