29
Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Mad

Concurrency control in transactional systems

  • Upload
    mayten

  • View
    45

  • Download
    0

Embed Size (px)

DESCRIPTION

Concurrency control in transactional systems. Jinyang Li. Some slides adapted from Stonebraker and Madden. What we ’ ve learnt so far…. All-or-nothing atomicity in transactions So that failures do not result in (undesirable) intermediate state How to realize all-or-nothing atomicity? - PowerPoint PPT Presentation

Citation preview

Page 1: Concurrency control in transactional systems

Concurrency control in transactional systems

Jinyang Li

Some slides adapted from Stonebraker and Madden

Page 2: Concurrency control in transactional systems

What we’ve learnt so far…

• All-or-nothing atomicity in transactions– So that failures do not result in (undesirable)

intermediate state

• How to realize all-or-nothing atomicity?– Logging– REDO/UNDO logging vs. REDO-only

logging

Page 3: Concurrency control in transactional systems

Concurrency control

• Many application clients are concurrently accessing a storage system.

• Concurrent transactions might interfere– Similar to data races in parallel programs

Page 4: Concurrency control in transactional systems

A_bal = READ(A)If (A_bal >50) { A_bal -= 50 WRITE(A, A_bal) dispense $50 to user}

Withdraw $50 from A

A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)

Report sum of money

An example

Storage system

A_bal = READ(A)If (A_bal >100) { B_bal = READ(B) B_bal += 100 A_bal -= 100 WRITE(A, A_bal) WRITE(B, B_bal)}

Transfer $100 from A to B

Page 5: Concurrency control in transactional systems

Yfs lab example

extentserver

d = GET(dir)newd = append “x” to dPUT(newd)

yfs client 1 (CREATE)

d = GET(dir)newd = append “y” to dPUT(newd)

yfs client 2 (CREATE)

Page 6: Concurrency control in transactional systems

Solutions?

1. Leave it to application programmers– Application programmers are responsible

for locking data items

2. Storage system performs automatic concurrency control

– Concurrent transactions execute as if serially

Page 7: Concurrency control in transactional systems

ACID transactions

• A (Atomicity)– All-or-nothing w.r.t. failures

• C (Consistency)– Transactions maintain any internal storage state

invariants

• I (Isolation)– Concurrently executing transactions do not

interfere

• D (Durability)– Effect of transactions survive failures

Page 8: Concurrency control in transactional systems

What’s ACID? an example

T1: Transfer $100 from A to B

T2: Transfer $50 from B to A

• A T1 completes or nothing (ditto for T2)

• D once T1/T2 commits, stays done, no updates lost

• I no races, as if T1 happens either before or after T2

• C preserves invarants, e.g. account balance > 0

Page 9: Concurrency control in transactional systems

Ideal isolation semantics: serializability

• Definition: execution of a set of transactions is equivalent to some serial order– Two executions are equivalent if they have the

same effect on database and produce same output.

Page 10: Concurrency control in transactional systems

Conflict serializability

• An execution schedule is the ordering of read/write/commit/abort operations

A_bal = READ(A)B_bal = READ(B)B_bal += 100A_bal -= 100WRITE(A, A_bal)WRITE(B, B_bal)Commit

A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)Commit

A (serial) schedule: R(A),R(B),W(A),W(B),C,R(A),R(B),C

Page 11: Concurrency control in transactional systems

Conflict serializability

• Two schedules are equivalent if they:– contain same operations– order conflicting operations the same way

• A schedule is serializable if it’s identical to some serial schedule

Two ops conflict if they access the same data item and one is a write.

Page 12: Concurrency control in transactional systems

Examples

A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)

A_bal = READ(A)B_bal = READ(B)B_bal += 100A_bal -= 100WRITE(A, A_bal)WRITE(B, B_bal)

Serializable? R(A),R(B),R(A),R(B),C W(A),W(B),C

Serializable? R(A),R(B), W(A), R(A),R(B),C, W(B),C

Equivalent serial schedule: R(A),R(B),C,R(A),R(B),W(A),W(B),C

Page 13: Concurrency control in transactional systems

Realize a serializable schedule

• Locking-based approach

• Strawman solution 1:– Grab global lock before transaction starts– Release global lock after transaction commits

• Strawman solution 2:– Grab lock on item X before reading/writing X– Release lock on X after reading/writing X

Page 14: Concurrency control in transactional systems

Strawman 2’s problem

A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)

A_bal = READ(A)B_bal = READ(B)B_bal += 100A_bal -= 100WRITE(A, A_bal)WRITE(B, B_bal)

Possible with strawman 2? (short-duration locks on writes) R(A),R(B), W(A), R(A),R(B),C, W(B),C

Locks on writes should be held till end of transaction

Read an uncommitted value

Page 15: Concurrency control in transactional systems

More Strawmans

• Strawman 3– Grab lock on item X before writing X– Release locks at the end of transaction– Grab lock on item X before reading X– Release locks immediately after reading

Possible with strawman 3? (short-duration locks on reads) R(A),R(B), W(A), R(A),R(B),C, W(B),C

R(A), R(A),R(B), W(A), W(B),C, R(B), C

Non-repeatable readsRead locks must be held

till commit time

Page 16: Concurrency control in transactional systems

Strawman 3’s problem

Possible with strawman 3? (short-duration locks on reads) R(A),R(B), W(A), R(A),R(B),C, W(B),C

R(A), R(A),R(B), W(A), W(B),C, R(B), C

Non-repeatable readsRead locks must be held

till commit time

A_bal = READ(A)B_bal = READ(B)Print(A_bal+B_bal)

A_bal = READ(A)B_bal = READ(B)B_bal += 100A_bal -= 100WRITE(A, A_bal)WRITE(B, B_bal)

Page 17: Concurrency control in transactional systems

• 2 phase locking (2PL)– A growing phase in which the transaction is acquiring

locks– A shrinking phase in which locks are released

• In practice,– The growing phase is the entire transaction– The shrinking phase is at the commit time

• Optimization:– Use read/write locks instead of exclusive locks

Realize a serializable schedule

Page 18: Concurrency control in transactional systems

2PL: an example

R_lock(A)A_bal = READ(A)R_lock(B)B_bal = READ(B)B_bal += 100A_bal -= 100W_lock(A)WRITE(A, A_bal)W_lock(B)WRITE(B, B_bal)W_unlock(A)W_unlock(B)

R_lock(A)A_bal = READ(A)R_lock(B)B_bal = READ(B)Print(A_bal+B_bal)R_unlock(A)R_unlock(B)

Possible? R(A),R(B), W(A), R(A),R(B),C W(B),C

Page 19: Concurrency control in transactional systems

More on 2PL

• What if a lock is unavailable?

• Deadlocks possible?

• How to cope with deadlock? – Grab locks in order? No always possible– Transaction manager detects deadlock cycles

and aborts affected transactions– Alternative: timeout and abort yourself

wait

detect & abort

Use crash recovery mechanism (e.g. UNDO records) to clean up

Page 20: Concurrency control in transactional systems

The Phantom problem

T1: begin_txupdate emp (set salary = 1.1 * salary) where dept = ‘CS’end_tx

T2: begin_txUpdate emp (set dept = ‘CS’) where emp.name = ‘jinyang’end_tx

T1: update BobT1: update HerbertT2: move JinyangT2: move LakshmiT2 commits and releases all locksT1: update Lakshmi(!!!!!)T1: update MichaelT1: commits

Page 21: Concurrency control in transactional systems

The Phantom problem

• Issue is lock things you touch – but must guarantee non-existence of any new ones!

• Solutions:– Predicate locking– Range locks in a B-tree index (assumes an

index on dept). Otherwise, lock entire table– Often ignored in practice

Page 22: Concurrency control in transactional systems

Why less than serializability?

• Performance of locking?

• What to do:– run analytics on a companion data

warehouse (common practice)– take the stall (rarely done)– run less than serializability

T1: begin_txSelect avg (sal) from empend_tx

T2: begin_txUpdate emp …end_tx

Page 23: Concurrency control in transactional systems

Degrees of isolation• Degree 0 (read uncommitted)

– no locks (guarantees nothing)

• Degree 1 (read committed)– Short read locks, long write locks

• Degree 2– Long read/write locks. (serializable unless you squint)

• Degree 3– Long read/write locks plus solving the phantom

problem

Page 24: Concurrency control in transactional systems

Disadvantages of locking-based approach

• Need to detect deadlocks– Distributed implemention needs distributed

deadlock detection (bad)

• Read-only transactions can block update transactions– Big performance hit if there are long

running read-only transactions

Page 25: Concurrency control in transactional systems

Multi-version concurrency control

• No locking during transaction execution!• Each data item is associated with multiple

versions• Multi-version transactions:

– Buffer writes during execution– Reads choose the appropriate version– At commit time, system validates if okay to make

writes visible (by generating new versions)

Page 26: Concurrency control in transactional systems

Snapshot isolation

• A popular multi-version concurrency control scheme

• A transaction:– reads a “snapshot” of database image– Can commit only if there are no write-write

conflict

Page 27: Concurrency control in transactional systems

Implementing snapshot isolation

• T is assigned a start timestamp,T.sts

R(A) W(B)

• T buffers writes to B and addsB to its writeset, T.wset += {B}

• T is assigned a commit timestamp• System checks forall T’, s.t. T’.cts > T.sts && T’.cts < T.ctsT’.wset and T.wset do not overlap

• T reads the biggest versionof A, A(i), such that i <= T.sts

time

Page 28: Concurrency control in transactional systems

An example

R(A) R(B) W(B)W(A)

R(A) R(B)

R(A) R(B)

No read-uncommitted

No unrepeatable-read

105 50 55 60

Page 29: Concurrency control in transactional systems

Snapshot isolation < serializability

• The write-skew problem

Possible under SI? R(A),R(B),W(B), W(A),C,C

Serializable?