Transactions Management and Concurrent Control


  • 8/6/2019 Transactions Management and Concurrent Control


    TRANSACTIONS MANAGEMENT AND CONCURRENT CONTROL.

    Process - A process (sometimes called a task or a job) is, informally, a program in execution. A process is not the same as a program: there is a difference between a passive program stored on disk and an actively executing process. Multiple people can run the same program; each running copy corresponds to a distinct process. The program is only part of a process; the process also contains the execution state.

    -Distributed transactions reflect real-world transactions that are triggered by events such as buying products, registering for a course, making a deposit to your account, etc. A transaction may contain many parts, e.g. a sales transaction may require updating the customer's account and adjusting the product inventory. All parts of a transaction must be successfully completed to prevent data integrity problems.

    Definition.

    A transaction is a series of actions carried out by a single user or application program which must be treated as a logical unit of work: it must be either entirely completed or aborted, and no intermediate states are accepted. It results from the execution of a user program delimited by statements (function calls) of the form begin transaction and end transaction.

    Transaction States

    A transaction that changes the contents of the database must transform the database from one consistent state to another. The database state is the collection of all the stored data items (values) in the database at a given point in time. A consistent database state is one in which all data integrity constraints are satisfied. If a transaction completes successfully it is said to have committed, and the database reaches a new consistent state. On the other hand, if the transaction does not execute successfully, it is aborted. If a transaction is aborted, the database is restored to the previous consistent state through ROLLBACK (undoing). The partially committed state occurs after the final statement has been executed. The transaction may be aborted due to a violation of integrity constraints. Alternatively, the system may fail before the data is recorded on secondary storage, in which case the transaction goes to the failed state and is aborted.

    Failed state occurs when the transaction cannot be committed or is aborted while in the active state.

    Begin transaction - This marks the beginning of transaction execution.

    End transaction - Specifies that the READ and WRITE transaction operations have ended.

    Active state - A transaction goes into an active state immediately after it starts execution, where it can issue read and write operations.

    Partially committed state - At this point, the recovery protocol checks whether the transaction execution violates the integrity constraints. If not, the updates are committed; otherwise the transaction is aborted.

    Terminated state - Corresponds to the transaction leaving the system.

    Commit transaction - Signals a successful end of a transaction, so that any changes (updates) executed by the transaction can be safely committed to the database and will not be lost.

    Abort (Roll back) - This signals that a transaction has ended unsuccessfully and that any changes or effects that the transaction may have applied to the database must be undone.

    Undo: similar to ROLLBACK, but it applies to a single operation rather than to a whole transaction.

    Redo: specifies that certain transaction operations must be redone to ensure that all the operations of a committed transaction have been applied successfully to the database.
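
    The states listed above can be read as a small state machine. The sketch below is one possible reading, not a definitive model: the transition table is an assumption reconstructed from the descriptions (the original state diagram is not available), and the names TRANSITIONS, Transaction, move_to and run are hypothetical.

```python
# Allowed moves between the states named in the text.
# Assumed from the prose descriptions; the original figure is not available.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "committed": {"terminated"},
    "failed": {"aborted"},
    "aborted": {"terminated"},
    "terminated": set(),
}


class Transaction:
    def __init__(self):
        # A transaction becomes active immediately after it starts execution.
        self.state = "active"

    def move_to(self, new_state):
        """Change state, rejecting any move the table does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition: {self.state} -> {new_state}")
        self.state = new_state


def run(path):
    """Walk a new transaction through a list of states and return the final state."""
    txn = Transaction()
    for state in path:
        txn.move_to(state)
    return txn.state
```

    For example, run(["partially committed", "committed", "terminated"]) follows the successful path, while run(["committed"]) raises an error because an active transaction cannot commit without first passing through the partially committed state.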


    Transaction execution

    PROPERTIES OF A TRANSACTION (ACID)

    (i) ATOMICITY (All or nothing) property - This means a transaction is performed in its entirety or not performed at all. This requires all operations (parts) of the transaction to be reflected in the database; otherwise the transaction is aborted.

    A transaction is treated as an atomic unit of work, and it is the responsibility of the recovery subsystem of the DBMS to ensure atomicity.

    (ii) CONSISTENCY (SERIALIZABILITY) - A transaction is consistency preserving if its complete execution takes the DB from one consistent state to another. Concurrent transactions are treated as though they were executed in serial order (one after another). Thus execution of a transaction in isolation preserves the consistency of the database. So a transaction should always transform the DB from one consistent state to another. It is the responsibility of the DBMS to enforce integrity and consistency.

    (iii) ISOLATION (Independence) - States that the execution of a transaction should not be interfered with in any way by other transactions executing concurrently, i.e. the data used during execution by one transaction cannot be used by another transaction until the first one is completed.

    It is the responsibility of the concurrency control system to ensure isolation.

    (iv) DURABILITY OR PERSISTENCE - Ensures that changes (updates) applied to the DB by a committed transaction persist and cannot be lost, even in the event of system failure. This indicates the permanence of the DB's consistent state. It is the responsibility of the recovery system to ensure durability.
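
    Atomicity and consistency can be observed with a small experiment. The sketch below uses Python's built-in sqlite3 module; the account table, its CHECK constraint, the starting balances and the helper names (make_bank, transfer, balances) are assumptions made for illustration. The transfer credits one account first and then debits the other, so that a failed debit leaves a partial update that must be rolled back.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two accounts (illustrative values)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (id TEXT PRIMARY KEY, bal INTEGER CHECK (bal >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("x", 100), ("y", 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True if committed."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE account SET bal = bal + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE account SET bal = bal - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint was violated, so the credit above was undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, bal FROM account"))
```

    Transferring 80 out of an account that holds only 50 violates the integrity constraint, so the whole transaction is rolled back and both balances stay as they were; a transfer that satisfies the constraint is committed in full.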

    CONCURRENCY, SERIALIZABILITY AND DEADLOCK

    Concurrency Control - The process of managing simultaneous operations on the database without having them interfere with one another. (Connolly)

    Required because:

    Many users wish to access the same data.

    Accessing the same data can lead to errors.

    -In a single-user database only one user is accessing the data at any time. This means that the DBMS does not have to be concerned with how changes made to the database will affect other users. In a multi-user database many users may be accessing the data at the same time. The operations of one user may interfere with other users of the database.

    -The DBMS uses concurrency control to manage multi-user databases. McFadden et al define concurrency control as being concerned with preventing loss of data integrity due to interference between users in a multi-user environment.

    Concurrency control provides a mechanism for avoiding and managing conflict between users.

    -Conditions for conflict include:

    1. Transactions that begin at the same time.

    2. Transactions that operate independently of each other. That is, transactions that do not co-ordinate their access to the database.

    3. Transactions reading and/or writing the same data items.

    Conflicting Transactions

    1. Record for product 10 has value 50.
    2. Transaction Y reads record for product 10.
    3. Transaction X reads record for product 10.
    4. Transaction Y increments product 10's value by 15.
    5. Transaction X decrements product 10's value by 20.
    6. Transaction Y writes new record for product 10 to disc.
    7. Transaction X writes new record for product 10 to disc.

    Transactions Y and X have been executed at the same time:

    Transaction Y

    Read product 10's value as 50.
    Added 15 to produce a value of 65.
    Wrote the updated product 10 record with a value of 65.

    Transaction X

    Read product 10's value as 50.
    Subtracted 20 to produce a value of 30.
    Wrote the updated product 10 record with a value of 30.

    The result of this execution is that product 10 has a value of 30. Transaction X has overwritten the result of transaction Y. Product 10 should have a value of 45, i.e. 50 + 15 - 20 = 45.


    Problem

    -Both transactions read and updated the same value for product 10.
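
    The product 10 example can be replayed in a few lines. This is only a sketch that mimics the interleaving with local variables (no real database or threads are involved), and the function names interleaved and serial are hypothetical.

```python
def interleaved(value):
    """Replay the seven steps: both transactions read before either one writes."""
    y_local = value      # Y reads the record
    x_local = value      # X reads the same, still unchanged, record
    y_local += 15        # Y increments its private copy
    x_local -= 20        # X decrements its private copy
    value = y_local      # Y writes its copy back
    value = x_local      # X writes its copy back, overwriting Y's update
    return value


def serial(value):
    """Run Y to completion, then X, so X reads the value Y wrote."""
    value += 15          # transaction Y
    value -= 20          # transaction X
    return value


print(interleaved(50), serial(50))
```

    The interleaved run ends with 30 because Y's increment is lost, while the serial run ends with the expected 45.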

    There are three common types of conflict problem:

    1. The lost update problem
    2. The uncommitted dependency problem
    3. The inconsistent analysis problem

    The Lost Update Problem

    -In this example, transaction Y has read the value of bal at time t2 as 100 and transaction X has read the value of bal at time t3 as 100. At time t4, transaction Y writes the new value of bal (200) to disc. But at time t4, transaction X has subtracted 10 from its value of bal (100) to produce 90. Transaction X updates the value of bal on disc at time t5. The result of this operation is that the update performed by transaction Y (bal + 100 = 200) has been lost. Transaction X has overwritten the result of transaction Y. This problem is avoided by not allowing transaction X to read bal until transaction Y has committed its update.

    The Uncommitted Dependency Problem

    -In this example, transaction Y reads and updates the value of bal (100 + 100 = 200) and writes the result at time t4. At time t5, transaction X reads the value of bal (200) written by transaction Y and updates it. However, at time t6 transaction Y has failed and rolled back. This means that the update it made to bal is undone and the value of bal is returned to 100. Therefore, transaction X is updating the incorrect value of bal (200). Transaction X should be updating bal = 100 because transaction Y has been rolled back and its changes undone. Transaction X has used the result of transaction Y, but this result was incorrect as transaction Y failed. This problem is avoided by not allowing transaction X to read the value of bal until transaction Y either commits or rolls back.
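
    The timeline above can be traced step by step. In this sketch the amount that transaction X subtracts (10) is an assumption, since the text only says that X updates bal; the function name dirty_read_timeline is hypothetical.

```python
def dirty_read_timeline(bal=100):
    """Trace t4..t6 and return (value X read, bal after Y's rollback, value X computes)."""
    before_y = bal           # last committed value of bal
    bal = before_y + 100     # t4: Y writes 200, but has not committed
    x_read = bal             # t5: X reads Y's uncommitted value
    bal = before_y           # t6: Y fails and rolls back, restoring 100
    x_result = x_read - 10   # X's update (assumed amount) is based on 200, not 100
    return x_read, bal, x_result


print(dirty_read_timeline())
```

    X computes 190 from a value that, after the rollback, never officially existed; had it read the committed value it would have computed 90.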

    The Inconsistent Analysis Problem

    -In this example, transaction Y is summing the values of balx, baly and balz. However, at the same time, transaction X is transferring 10 pounds between balx and balz. Because transaction Y reads some balances before the transfer is applied and others after it, its final result is incorrect. This problem would be solved by preventing transaction X from transferring the money between accounts before transaction Y has committed.
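
    One interleaving that produces a wrong total is sketched below. The starting balances (100, 50, 25), the direction of the transfer (from balx to balz) and the exact order of the steps are assumptions for illustration, because the original figure is not available.

```python
def sum_during_transfer():
    """Return (total seen by Y, true total) for one assumed interleaving."""
    bal = {"balx": 100, "baly": 50, "balz": 25}  # assumed starting balances
    total = bal["balx"]    # Y reads balx before the transfer touches it
    bal["balx"] -= 10      # X takes 10 out of balx
    bal["balz"] += 10      # X puts 10 into balz
    total += bal["baly"]   # Y reads baly (unaffected by the transfer)
    total += bal["balz"]   # Y reads balz after the transfer
    return total, sum(bal.values())


print(sum_during_transfer())
```

    Y counts the transferred 10 pounds twice, so its total is 185 although the accounts hold 175 both before and after the transfer.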


    Schedules

    Schedule - A sequence of reads/writes by a group of transactions.

    Types of schedules

    Serial Schedule - A schedule where transactions are executed consecutively.

    Non-serial Schedule - A schedule where the operations of the transactions are interleaved.

    -The lost update, uncommitted dependency and inconsistent analysis problems are caused by executing two or more transactions at the same time. It is possible to avoid all problems by executing the transactions one at a time. Each transaction is committed before the next begins. However, it is frequently possible to interleave the execution of transactions. That is, it is possible for the operations of two transactions to overlap as they execute.

    -The sequence of operations performed by a set of transactions is called a schedule.

    -When transactions are run consecutively, the schedule is a serial schedule. When the operations of the transactions overlap, the schedule is a non-serial schedule. A serial schedule guarantees that the transactions will not conflict because the transactions are run at different times. However, different serial schedules may produce different results. A non-serial schedule does not guarantee that transactions will not conflict.

    Serial Schedule

    -Transactions Y and X are executed one after another. Therefore, they cannot interfere with each other.

    This is a serial schedule.

    Non-Serial Schedule

    -Transactions X and Y are interleaved. That is, the operations of transaction X overlap with the operations of transaction Y.

    This is a non-serial schedule. This schedule produces a conflict because at the same time transaction X is transferring 10 from bal1 to bal2, transaction Y is also transferring money between bal1 and bal2.


    Serialisable schedule

    -A non-serial schedule that produces the same result as some serial schedule is called a serialisable schedule. For instance, consider the figure below:

    -Transactions X and Y are interleaved and, therefore, this is a non-serial schedule. However, this schedule does not cause conflict. The result of this schedule is the same as executing transaction X before transaction Y.

    Serialisability

    Serialisable schedule - A non-serial schedule that produces the same result as some serial schedule.

    Executing a serialisable schedule is equivalent to executing some serial schedule. However, the serialisable schedule may make better use of the computing resources.
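
    The definition above suggests a brute-force check: run the interleaved schedule, run every serial ordering of the same transactions, and compare the results. The sketch below does this for one given starting database only, which is weaker than the conflict-based tests a DBMS would rely on. The step format (transaction, action, item, optional amount) and the names run and is_serialisable are assumptions for illustration.

```python
from itertools import permutations


def run(schedule, db):
    """Execute the steps in order and return the resulting database state."""
    db = dict(db)
    local = {}  # (transaction, item) -> private copy held by that transaction
    for txn, action, item, *arg in schedule:
        if action == "read":
            local[txn, item] = db[item]
        elif action == "add":
            local[txn, item] += arg[0]
        elif action == "write":
            db[item] = local[txn, item]
    return db


def is_serialisable(schedule, db):
    """True if some serial order of the transactions gives the same final state."""
    txns = sorted({step[0] for step in schedule})
    outcome = run(schedule, db)
    for order in permutations(txns):
        # Keep each transaction's own step order, but run transactions one at a time.
        serial = [step for t in order for step in schedule if step[0] == t]
        if run(serial, db) == outcome:
            return True
    return False


def update(txn, item, amount):
    """Read-modify-write of one item by one transaction."""
    return [(txn, "read", item), (txn, "add", item, amount), (txn, "write", item)]


start = {"bal1": 100, "bal2": 100}
# X moves 10 from bal1 to bal2; Y adds 5 to each balance.
good = update("X", "bal1", -10) + update("Y", "bal1", 5) \
    + update("X", "bal2", 10) + update("Y", "bal2", 5)
# Same work, but both transactions read bal1 before either writes it.
bad = [("X", "read", "bal1"), ("Y", "read", "bal1"),
       ("X", "add", "bal1", -10), ("Y", "add", "bal1", 5),
       ("X", "write", "bal1"), ("Y", "write", "bal1")] \
    + update("X", "bal2", 10) + update("Y", "bal2", 5)
print(is_serialisable(good, start), is_serialisable(bad, start))
```

    The first schedule is interleaved yet matches the serial result, so it is serialisable; the second loses X's update to bal1 and matches no serial order.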

    Serialisability Example

    -Schedule 1, above, is a non-serial schedule that produces the same result as the serial schedule 2. Therefore, schedule 1 is a serialisable schedule. That is, there is an equivalent serial schedule which may be used. Schedule 1 would not produce the same result as executing transaction Y before transaction X. But it produces the same result as executing transaction Y after transaction X.

    Concurrency Control Techniques

    Locking - Controls concurrent access to data.

    Read lock - Allows a transaction to read a data item but not to update it.

    Write lock - Allows a transaction to read and update a data item.

    -A DBMS can ensure that a schedule for a set of transactions is a serialisable schedule by requiring the transactions to lock data items before they use them.

    -Connolly et al define locking as a procedure used to control concurrent access to data. When one transaction is accessing the database, a lock may deny access to other transactions to prevent incorrect updates.

    -A lock is used by a transaction to notify the DBMS that the transaction is about to read or write a particular data item. The DBMS may then take steps to avoid conflict with other transactions.

    -There are two main types of locks:

    Read Lock - allows a transaction to read a data item but not to change its contents.

    Write Lock - allows a transaction to read or write a data item.

    -More than one transaction may have a read lock on a data item. This is because none of the transactions can change the data item and, therefore, they will all be working with the same value. Only one transaction may have a write lock on a data item at any one time. In addition, other transactions may not hold read locks on the data item. Other transactions will conflict if they try to use the data item's value as it is being updated.

    Using Locks

    -Locks determine how a transaction may use a data item. When a transaction wishes to access a data item it must lock it first. When a transaction requests a lock on a data item, the DBMS checks if the data item is already locked. If the data item is not locked then the transaction is allowed to lock the data item. If the data item is read-locked then the transaction may also have a read lock on the data item. If the data item is write-locked then the transaction must wait until the lock is released. Hence, when two transactions wish to update the same data item, one of them will be given a write lock and the other will be required to wait until the first transaction finishes.
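
    The rules for granting locks can be sketched as a small lock table. This is a simplified illustration rather than how any particular DBMS implements locking: a request that cannot be granted simply returns False instead of queueing the transaction, lock upgrades are not handled, and the class and method names are hypothetical.

```python
class LockTable:
    def __init__(self):
        self.locks = {}  # data item -> (mode, set of transactions holding the lock)

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        held = self.locks.get(item)
        if held is None:
            # Item is not locked: grant whatever was asked for.
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "read" and held_mode == "read":
            # Read locks can be shared by many transactions.
            holders.add(txn)
            return True
        # Any combination involving a write lock conflicts.
        return False

    def release(self, txn, item):
        mode, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]
```

    Two readers can share an item, a writer has to wait for both of them to release it, and once the writer holds the item every other request is refused until the write lock is released.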

    Granularity of Lock

    -Locks can be categorized according to their level of granularity. There are three major levels of granularity:

    Database Locks - A database lock stops access to the whole database.

    Relation Locks - A table lock stops access to a single relation.

    Tuple Locks - A record lock stops access to a single tuple in a relation.

    -Selecting the correct granularity of locks used by a transaction is important. For example, a transaction that locks the database will exclude all other transactions from accessing the database. (A database lock creates a single-user database.) A database lock is normally used to perform operations that require exclusive access to the database, for example, a complete database backup. Simple reads and updates will normally use tuple locks.

    TYPES OF LOCKS

    1. Binary locks - A binary lock has two states (values), i.e. locked (1) or unlocked (0). Consider a data item: if the value of its lock is 1, the item cannot be accessed by a database operation that requests it. If the value of the lock is 0, the item can be accessed when requested. This lock and unlock scheme prevents conflicting access, but it is considered too restrictive to yield optimal concurrency, because at most one transaction can hold a lock on an item at any time.

    2. Shared Locks - Used during read operations, since reads cannot conflict, i.e. transactions can be allowed to access the same data if they all access it for reading purposes only. A shared lock is issued when a transaction wants to read data in the database while no other transaction is updating the same data.


    3. Exclusive Locks - If a transaction is to write to the database then it must have exclusive access to the data it writes. While a transaction holds an exclusive lock, no other transaction can read or update that data. Exclusive locks must be used when the potential for conflict exists, so an exclusive lock is granted if a transaction wants to update (write) the database and no other locks are held on the data.

    Two Phase Locking

    -To be able to guarantee serialisability, DBMSs require transactions to use the two-phase locking protocol. Two-phase locking is the procedure used by transactions to obtain and release locks on data items.

    -Transactions that use the two-phase locking protocol can be safely interleaved with other transactions that also use the two-phase locking protocol.

    -A transaction that uses two-phase locking has two parts to it:

    1. The Growing Phase - during the growing phase a transaction obtains all the locks it will require during its processing.

    2. The Shrinking Phase - during the shrinking phase a transaction releases all the locks it obtained during the growing phase.

    -Transactions must obtain locks on all data items they will use. When a transaction releases a lock it is not allowed to obtain any other locks. Using the two-phase locking protocol, a transaction will obtain all the locks it requires (growing), process the data and release all the locks (shrinking).

    All locking operations (read_lock, write_lock) precede the first unlock operation in the transaction.

    Two phases:

    Expanding phase: new locks on items can be acquired but none can be released.

    Shrinking phase: existing locks can be released but no new ones can be acquired.
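
    The two-phase rule can be enforced with a single flag that is set on the first unlock. The sketch below only tracks one transaction's own locks and ignores conflicts with other transactions; the class name TwoPhaseTransaction and its methods are hypothetical.

```python
class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.held = set()       # items currently locked by this transaction
        self.shrinking = False  # becomes True at the first unlock

    def lock(self, item):
        """Acquire a lock; only allowed during the growing (expanding) phase."""
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot lock {item!r} after releasing a lock"
            )
        self.held.add(item)

    def unlock(self, item):
        """Release a lock; this starts (or continues) the shrinking phase."""
        self.shrinking = True
        self.held.remove(item)
```

    A transaction may lock several items and then unlock them in any order, but a lock request made after its first unlock is rejected.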

    There are types of two-phase locking:


    Solving the Lost Update Problem

    Solving the Uncommitted Dependency Problem

    Deadlock


    -When two or more transactions request a lock on a data item that is already locked it is possible for the transactions to become deadlocked. In the example above, transaction X is waiting for a lock on record 20 while holding a lock on record 10. At the same time, transaction Y is waiting for a lock on record 10 while holding a lock on record 20. The transactions are waiting for each other to release a lock before they can continue. This situation is called deadlock. Connolly et al define deadlock as an impasse that may result when two (or more) transactions are each waiting for locks held by the other to be released.

    Conditions for Deadlock

    -Deadlock occurs when the transactions in a circular chain each hold write locks on data items and are each waiting for a lock held by the next transaction in the chain. A transaction cannot force another transaction to release a lock.
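
    A circular chain of waiting transactions can be found by looking for a cycle in a wait-for graph, a common way to picture which transaction is waiting on which. The text does not say how a DBMS detects the chain, so the representation below (a dict mapping each transaction to the set of transactions it waits on) and the name has_deadlock are assumptions for illustration.

```python
def has_deadlock(waits_for):
    """Return True if the wait-for graph contains a cycle (depth-first search)."""
    visiting, done = set(), set()

    def visit(txn):
        if txn in done:
            return False       # already explored, no cycle through here
        if txn in visiting:
            return True        # reached a transaction already on the current path
        visiting.add(txn)
        if any(visit(other) for other in waits_for.get(txn, ())):
            return True
        visiting.discard(txn)
        done.add(txn)
        return False

    return any(visit(txn) for txn in list(waits_for))


# X holds record 10 and waits for Y; Y holds record 20 and waits for X.
print(has_deadlock({"X": {"Y"}, "Y": {"X"}}))
```

    If X waits for Y while Y waits for nobody, there is no cycle and the same function returns False.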

    Avoiding Deadlock

    Only one resource is locked at one time by each transaction. Not useful when more than one record must be updated.

    Resource locks are obtained in one order, e.g. locks on record 10 are always obtained before locks on record 20.

    Obtain all locks before updating begins. Transactions cannot start until all locks are available.
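
    The second rule, obtaining locks in one fixed order, can be illustrated with ordinary threads. The sketch below uses Python's threading.Lock to stand in for record locks; the record values, the transfer amounts and the function names are assumptions for illustration.

```python
import threading

records = {10: 100, 20: 100}                        # illustrative record values
locks = {key: threading.Lock() for key in records}  # one lock per record


def transfer(src, dst, amount):
    """Move amount between two records, always locking the lower number first."""
    first, second = sorted((src, dst))
    with locks[first], locks[second]:
        records[src] -= amount
        records[dst] += amount


def run_both():
    """Run two opposite transfers concurrently and return the final records."""
    workers = [
        threading.Thread(target=transfer, args=(10, 20, 5)),
        threading.Thread(target=transfer, args=(20, 10, 3)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return dict(records)
```

    Without the sorted() call, one thread could hold record 10 and wait for record 20 while the other holds record 20 and waits for record 10. With a single locking order that circular wait cannot form.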

    Three general techniques for handling deadlock:

    Timeouts.

    Deadlock prevention.

    Deadlock detection and recovery.

    (i) Timestamp methods - Transactions are ordered globally so that older transactions (transactions with smaller timestamps) get priority in the event of conflict. Conflict is resolved by rolling back and restarting a transaction.

    Timestamp - A unique identifier created by the DBMS that indicates the relative starting time of a transaction.

    Can be generated by using the system clock at the time the transaction started, or by incrementing a logical counter every time a new transaction starts.

    Read/write proceeds only if the last update on that data item was carried out by an older transaction.

    Otherwise, the transaction requesting the read/write is restarted and given a new timestamp.
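
    The rule above can be sketched with a logical counter. This is a deliberately reduced version that only compares against the timestamp of the last writer, as the text describes; fuller timestamp-ordering protocols also track read timestamps. The class name TimestampScheduler and its methods are hypothetical.

```python
import itertools


class TimestampScheduler:
    def __init__(self):
        self._counter = itertools.count(1)  # logical counter, one tick per (re)start
        self.last_write = {}                # data item -> timestamp of its last writer

    def begin(self):
        """Hand out the next timestamp to a starting (or restarting) transaction."""
        return next(self._counter)

    def access(self, ts, item, write=False):
        """Return ts if the operation may proceed, else a new timestamp (restart)."""
        if self.last_write.get(item, 0) > ts:
            # The item was last updated by a younger transaction: restart.
            return self.begin()
        if write:
            self.last_write[item] = ts
        return ts
```

    If a younger transaction has already written an item, an older transaction that then tries to read or write it receives a new, larger timestamp, which models being restarted.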

    (ii) Optimistic methods - Based on the assumption that conflict is rare and that it is more efficient to let transactions proceed without the delays needed to ensure serializability. At commit, a check is made to determine whether a conflict has occurred. If there is a conflict, the transaction must be rolled back and restarted.

    Three phases:

    Read: from start to just before commit; read from database into local variables and update local data.

    Validation: check to ensure serializability is not violated; if violated, the transaction is aborted and restarted.

    Write (for update transactions): updates made to local variables are then applied to the database.
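
    The three phases can be sketched with per-item version numbers: a transaction remembers the version of everything it reads, and validation fails if any of those versions has changed by commit time. Version counting is one common way to validate and is an assumption here, as are the class and method names.

```python
class OptimisticStore:
    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}  # bumped on every committed write

    def begin(self):
        return {"reads": {}, "writes": {}}       # a transaction's private workspace

    def read(self, txn, key):
        """Read phase: copy from the database and remember the version seen."""
        if key in txn["writes"]:
            return txn["writes"][key]
        txn["reads"][key] = self.version[key]
        return self.data[key]

    def write(self, txn, key, value):
        """Read phase: updates go to local data only."""
        txn["writes"][key] = value

    def commit(self, txn):
        """Validation, then write phase. Return False if the caller must restart."""
        if any(self.version[key] != seen for key, seen in txn["reads"].items()):
            return False                         # something we read has changed
        for key, value in txn["writes"].items():
            self.data[key] = value
            self.version[key] += 1
        return True
```

    When two transactions both read bal and then try to commit, the first commit succeeds and the second fails validation; restarting the second transaction lets it read the new value and commit.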

    (iii) Recovery- Process of restoring database to a correct state in the event of a failure.

    Need for Recovery Control

    Two types of storage: volatile (main memory) and nonvolatile. Volatile storage does not survive system crashes.

    Stable storage represents information that has been replicated in several nonvolatile storage media with

    independent failure modes.

    Types of Failure

    System crashes, resulting in loss of main memory.

    Media failures, resulting in loss of parts of secondary storage.

    Application software errors.

    Natural physical disasters.

    Carelessness or unintentional destruction of data or facilities.

    Sabotage.

    Transactions and recovery

    Transactions represent the basic unit of recovery.

    Recovery manager is responsible for atomicity and durability.

    If failure occurs between commit and database buffers being flushed to secondary storage then, to ensure durability, recovery manager has to redo (rollforward) the transaction's updates.

    If transaction had not committed at failure time, recovery manager has to undo (rollback) any effects of that transaction for atomicity.

    Partial undo - only one transaction has to be undone.

    Global undo - all transactions have to be undone.

    DBMS starts at time t0, but fails at time tf. Assume data for transactions T2 and T3 have been written to secondary storage. T1 and T6 have to be undone. In absence of any other information, recovery manager has to redo T2, T3, T4, and T5.

    Log file

    -Contains information about all updates to database:

    Transaction records.

    Checkpoint records.

    Often used for other purposes (for example, auditing).
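
    The redo and undo steps described earlier can be sketched on top of such a log. The record layout used here, ("update", transaction, item, before value, after value) and ("commit", transaction), is an assumption for illustration, checkpoints are ignored, and the function name recover is hypothetical.

```python
def recover(db, log):
    """Return the database state after redoing committed and undoing uncommitted work."""
    db = dict(db)
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Redo (rollforward): reapply, in log order, every update of a committed transaction.
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, txn, item, before, after = rec
            db[item] = after

    # Undo (rollback): walking backwards, restore the before value of uncommitted updates.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, txn, item, before, after = rec
            db[item] = before

    return db


log = [
    ("update", "T2", "bal", 100, 200),
    ("commit", "T2"),
    ("update", "T1", "bal", 200, 190),  # T1 never committed
]
print(recover({"bal": 190}, log))
```

    Whether the disk held T1's uncommitted value (190) or had not yet received T2's committed value (100) at the time of the crash, recovery ends with bal at 200.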

    Recovery facilities

    DBMS should provide following facilities to assist with recovery:


    Backup mechanism, which makes periodic backup copies of database.

    Logging facilities, which keep track of current state of transactions and database changes.

    Checkpoint facility, which enables updates to database in progress to be made permanent.

    Recovery manager, which allows DBMS to restore database to consistent state following a failure.