1
Module 6: Log Manager
COP 6730
2
Log Manager
• Log knows everything. It is the temporal database
– The online durable data are just their current versions
– The log has their complete histories.
• Uses of the Log:
– Transaction recovery
– Auditing
– Performance analysis
– Accounting
• The log can easily become a performance problem, and it can get very large. Intelligent algorithms are needed.
3
Log Sequence Numbers (LSNs)
• A log file consists of a sequence of log records
• Each log record has a unique identifier, or key, called its log sequence number (LSN)
• The LSN is composed of the record’s file number and the relative byte offset of the record within that file.
typedef struct
{
    long file; /* number of log file in log directory */
    long rba;  /* relative byte address of (first byte) record in file */
} LSN;
4
LSN Property
Property: If log record A for an object is created “after” log record B for that object, then
LSN (A) > LSN (B)
This monotonicity property is used by the write-ahead log (WAL) protocol.
Note: If two objects send their log records to different logs, then their LSNs are incomparable.
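The monotonicity property amounts to a total order on the LSNs of one log: compare file numbers first, then byte offsets. A minimal sketch, assuming the LSN struct defined earlier; the function name lsn_cmp is hypothetical and not part of the course material:

```c
#include <assert.h>

typedef struct {
    long file; /* number of log file in log directory */
    long rba;  /* relative byte address of record in file */
} LSN;

/* Returns <0, 0, or >0 when a is older than, equal to, or newer than b.
 * Only meaningful when both LSNs come from the same log. */
int lsn_cmp(LSN a, LSN b)
{
    assert(a.rba >= 0 && b.rba >= 0); /* offsets are never negative */
    if (a.file != b.file)
        return (a.file < b.file) ? -1 : 1;
    if (a.rba != b.rba)
        return (a.rba < b.rba) ? -1 : 1;
    return 0;
}
```

A later file number always wins, regardless of the byte offsets inside each file.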
5
Value Logging
Each log record contains the old and the new states of the object.
UNDO Program: set the object to the old state.
REDO Program: set the object to the new state.
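As an illustration only, value logging for a small byte range might look like the sketch below. The record layout, the fixed VAL_MAX size, and the names value_do, value_undo, and value_redo are assumptions made for this example:

```c
#include <assert.h>
#include <string.h>

#define VAL_MAX 16 /* hypothetical upper bound on the logged value size */

/* Value (physical) log record: holds both states of the object. */
typedef struct {
    char *addr;            /* where the object lives */
    size_t len;            /* number of bytes logged */
    char old_val[VAL_MAX]; /* state before the update */
    char new_val[VAL_MAX]; /* state after the update */
} ValueLogRec;

/* Apply the update and build the log record describing it. */
ValueLogRec value_do(char *addr, const char *new_val, size_t len)
{
    ValueLogRec rec;
    assert(len <= VAL_MAX);
    memset(&rec, 0, sizeof rec);
    rec.addr = addr;
    rec.len = len;
    memcpy(rec.old_val, addr, len);
    memcpy(rec.new_val, new_val, len);
    memcpy(addr, new_val, len);
    return rec;
}

/* UNDO: set the object to the old state. */
void value_undo(const ValueLogRec *rec)
{
    memcpy(rec->addr, rec->old_val, rec->len);
}

/* REDO: set the object to the new state. */
void value_redo(const ValueLogRec *rec)
{
    memcpy(rec->addr, rec->new_val, rec->len);
}
```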
6
Logical Logging
• Value logging is often called physical logging because it records the physical addresses and values of objects
• Logical (or operation) logging records the name of an UNDO-REDO function and its parameters
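By contrast with value logging, a logical log record stores only which operation ran and with what argument. A hedged sketch for a counter with a single "add" operation; all names here are hypothetical:

```c
#include <assert.h>
#include <string.h>

/* Logical log record: the operation's name and its parameter. */
typedef struct {
    const char *op; /* name of the UNDO-REDO function, here always "add" */
    long param;     /* the operation's parameter */
} LogicalLogRec;

/* Perform the operation and return the record that describes it. */
LogicalLogRec counter_add(long *counter, long delta)
{
    LogicalLogRec rec = { "add", delta };
    *counter += delta;
    return rec;
}

/* REDO: re-run the named operation with the logged parameter. */
void logical_redo(long *counter, const LogicalLogRec *rec)
{
    assert(strcmp(rec->op, "add") == 0);
    *counter += rec->param;
}

/* UNDO: run the inverse of the named operation. */
void logical_undo(long *counter, const LogicalLogRec *rec)
{
    assert(strcmp(rec->op, "add") == 0);
    *counter -= rec->param;
}
```

Note that the record never stores the counter's old or new value, only the operation and its argument.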
7
Log Manager: Overview
The log manager provides an interface to the log table, which is a sequence of log records.
create table log_table(
    lsn LSN,
    prev_lsn LSN, /* for scanning the log backward */
    timestamp TIMESTAMP, /* for time domain addressing */
    resource_mgr RMID, /* this RM will handle the UNDO-REDO work */
    trid TRID, /* creator of this record */
    tran_prev_lsn LSN, /* avoid scanning the log backward during transaction UNDO */
    body varchar, /* UNDO-REDO information generated by the RM */
    primary key (lsn),
    foreign key (prev_lsn) references log_table (lsn),
    foreign key (tran_prev_lsn) references log_table (lsn)
) entry sequenced; /* inserts go at end of file */
8
Log Manager: Overview (Cont’d)
Example: find the records written by a RM.
select *
from log_table
where resource_mgr = :rmid
order by lsn descending;
9
File System and Archive System
• The log manager provides read and write access to the log table for all the other RMs, and for the TM.
– In a perfect world, the log manager simply writes log records, and no one ever reads them.
– In the event of failures, the log is used to return each logged object to its most recent consistent state.
• The log manager maps the log table onto a growing collection of sequential files.
– As the log table fills one file, another is allocated.
– Only recent records are kept online.
– Log records more than a few hours old are stored in less expensive tertiary storage managed by the archive system.
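The mapping of the log table onto sequential files can be pictured as LSN allocation with rollover: when a record does not fit in the current file, the next file is started. This is a sketch under assumed names (LogTail, log_allocate) and an assumed fixed file size; it does not model archiving:

```c
#include <assert.h>

typedef struct { long file; long rba; } LSN;

#define LOG_FILE_SIZE 1024L /* hypothetical capacity of one log file, in bytes */

typedef struct {
    LSN next; /* where the next record will be written */
} LogTail;

/* Reserve space for a record of rec_len bytes and return its LSN.
 * When the record does not fit in the current file, move to a new one. */
LSN log_allocate(LogTail *tail, long rec_len)
{
    LSN lsn;
    assert(rec_len > 0 && rec_len <= LOG_FILE_SIZE);
    if (tail->next.rba + rec_len > LOG_FILE_SIZE) {
        tail->next.file += 1; /* stands in for allocating the next file */
        tail->next.rba = 0;
    }
    lsn = tail->next;
    tail->next.rba += rec_len;
    return lsn;
}
```

Because the file number only grows and the offset only grows within a file, the LSNs handed out this way are monotonically increasing.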
10
Why Have a Log Manager?
Can’t we maintain the log table using SQL operations?
At restart, almost none of the system is functioning. The log manager must be able to find, read, and write the log without much help from the SQL system.
• The log manager must maintain and use a special catalog listing the physical log files.
• It must use low-level interfaces to read and write these catalogs and files.
11
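Normal Execution
[Figure: message flow during normal execution among the Application, the RMs, the TM, the Lock Manager, and the Log Manager. The application calls Begin_Work( ) (new transaction, TRID), issues Work Requests, and calls Commit_Work( ). The RMs provide Normal Functions and Callback Functions (UNDO, REDO, COMMIT). Other labels: Lock Records, Lock Requests. Commit steps: 1. Want to Commit; 2. Commit Phase 1?; 3. YES to Phase 1; 4. Write Commit log record and ?; 5. Commit Phase 2; 6. Acknowledge.]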
12
Transaction Abort
[Figure: message flow for Rollback_Work( ) among the Application, the RMs (Normal Functions, Callback Functions), the TM, and the Log Manager. Steps: 1. rollback transaction; 2. Read transaction’s log records; 3. UNDO (log record); 4. Aborted (TRID); 5. write abort records.]
13
DO-UNDO-REDO Protocol
• The DO-UNDO-REDO protocol is a programming style for RMs implementing transactional objects
– DO program: Old State → New State + Log Record
– UNDO program: New State + Log Record → Old State
– REDO program: Old State + Log Record → New State
• RMs have the following structure:
– Normal Function: DO program
– Callback Functions: UNDO & REDO programs
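One way to picture the RM structure is a table of function pointers: one normal function (DO) and two callbacks (UNDO, REDO). The sketch below is an assumption-laden illustration for a single long-valued cell; the type and function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value-style log record for a single long cell. */
typedef struct { long old_val; long new_val; } LogRec;

/* An RM exposes one normal function (DO) and two callbacks (UNDO, REDO). */
typedef struct {
    LogRec (*do_op)(long *state, long new_val);   /* old state -> new state + log record */
    void (*undo)(long *state, const LogRec *rec); /* new state + log record -> old state */
    void (*redo)(long *state, const LogRec *rec); /* old state + log record -> new state */
} ResourceManager;

static LogRec cell_do(long *state, long new_val)
{
    LogRec rec;
    assert(state != NULL);
    rec.old_val = *state;
    rec.new_val = new_val;
    *state = new_val;
    return rec;
}

static void cell_undo(long *state, const LogRec *rec) { *state = rec->old_val; }
static void cell_redo(long *state, const LogRec *rec) { *state = rec->new_val; }

/* The callback table that a TM could invoke during abort or restart. */
const ResourceManager cell_rm = { cell_do, cell_undo, cell_redo };
```

The application only ever calls do_op; undo and redo are reached through the table by the recovery machinery.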
14
Restart
1. The TM regularly invokes checkpoints during normal processing: it informs each RM to checkpoint its state to persistent memory.
2. At restart, the TM scans the log table forward from the most recent checkpoint to the end.
3. For each transaction that has not committed (e.g., T2), the TM calls the UNDO( ) callback of the RMs to undo it to the most recent persistent savepoint.
[Figure: timeline from Checkpoint to Crash showing transactions T1, T2, and T3.]
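The forward scan in step 2 can be sketched as follows. This simplified version assumes every transaction of interest writes at least one record after the checkpoint and that TRIDs are small integers; the record layout and the name find_uncommitted are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

typedef enum { REC_BEGIN, REC_UPDATE, REC_COMMIT } RecType;
typedef struct { int trid; RecType type; } LogRec;

#define MAX_TRANS 16 /* hypothetical bound on TRIDs in this sketch */

/* Scan the log forward from the checkpoint. Store in losers[] the TRIDs
 * that appear in the scan but have no commit record, and return how many
 * there are. These are the transactions the TM must UNDO. */
size_t find_uncommitted(const LogRec *log, size_t n, int *losers)
{
    int seen[MAX_TRANS] = {0};
    int committed[MAX_TRANS] = {0};
    size_t i, count = 0;
    int t;

    for (i = 0; i < n; i++) {
        assert(log[i].trid >= 0 && log[i].trid < MAX_TRANS);
        seen[log[i].trid] = 1;
        if (log[i].type == REC_COMMIT)
            committed[log[i].trid] = 1;
    }
    for (t = 0; t < MAX_TRANS; t++)
        if (seen[t] && !committed[t])
            losers[count++] = t;
    return count;
}
```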
15
2-Phase Commit: Commit
Phase I:
• Prepare: Invoke each RM asking for its vote.
• Decide: If all vote yes, durably write the transaction commit log record.
Note: The commit record write is what makes a transaction atomic and durable. If the system fails prior to that instant, the transaction will be undone at restart; otherwise, phase 2 will be carried forward by the restart logic.
16
2-Phase Commit: Commit (Cont’d)
Phase II:
• Commit: Invoke each RM telling it the commit decision.
Note: The RM can now release locks, deliver real messages, and perform other clean-up tasks.
• Complete: When all acknowledge the commit message, write a commit completion record to the log, indicating that phase 2 ended. When the completion message is durable, deallocate the live transaction state.
Note: The phase 2 completion record is used at restart to indicate that the RMs have all been informed about the transaction.
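The four steps (prepare, decide, commit, complete) can be lined up in a short coordinator sketch. It uses an in-memory array in place of a durable log, and every name in it (RM, Log, commit_work, always_yes, and so on) is a hypothetical stand-in rather than an interface from the course material:

```c
#include <assert.h>
#include <stddef.h>

typedef enum { VOTE_NO = 0, VOTE_YES = 1 } Vote;

typedef struct {
    Vote (*prepare)(int trid); /* phase 1 callback: the RM's vote */
    void (*commit)(int trid);  /* phase 2 callback: the commit decision */
} RM;

typedef enum { LOG_COMMIT, LOG_COMPLETE } LogKind;

/* In-memory stand-in for the log; a real log write would be durable. */
typedef struct { LogKind kinds[8]; size_t n; } Log;

static void log_write(Log *log, LogKind k)
{
    assert(log->n < 8);
    log->kinds[log->n++] = k;
}

/* Returns 1 if the transaction committed, 0 if an RM voted no
 * (the caller would then roll the transaction back). */
int commit_work(int trid, const RM *rms, size_t n_rms, Log *log)
{
    size_t i;
    for (i = 0; i < n_rms; i++)   /* Phase 1, prepare: collect votes */
        if (rms[i].prepare(trid) != VOTE_YES)
            return 0;
    log_write(log, LOG_COMMIT);   /* Phase 1, decide: the commit point */
    for (i = 0; i < n_rms; i++)   /* Phase 2, commit: inform each RM */
        rms[i].commit(trid);
    log_write(log, LOG_COMPLETE); /* Phase 2, complete */
    return 1;
}

/* Sample RMs used to exercise the sketch. */
int commits_delivered = 0;
Vote always_yes(int trid) { (void)trid; return VOTE_YES; }
Vote always_no(int trid) { (void)trid; return VOTE_NO; }
void count_commit(int trid) { (void)trid; commits_delivered++; }
```

If any RM votes no, nothing is written to the log and no RM receives a commit message.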
17
2-Phase Commit: Abort
• If any RM votes no during the prepare step, or if it does not respond at all, then the transaction cannot commit.
• The simplest thing to do in this case is to roll back the transaction by calling Abort_work ( ).
18
2-Phase Commit: Abort (Cont’d)
The logic for Abort_work ( ) is as follows:
Undo: Read the transaction’s log backwards, issuing UNDO of each record. The RM that wrote the record is invoked to undo the operation.
Broadcast: At each savepoint, invoke each RM telling it the transaction is at the savepoint.
Abort: Write the transaction abort record to the log (UNDO of begin_work( )).
Complete: Write a complete record to the log indicating that abort ended. Deallocate the live transaction state.
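The Undo step relies on the tran_prev_lsn chain, which links each record to the previous record of the same transaction. A minimal sketch, assuming LSNs are array indexes and that each record carries a target address and its old value; the savepoint broadcast and the abort and complete records are left out, and all names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

#define NO_LSN (-1) /* hypothetical "null" LSN marking the start of a chain */

typedef struct {
    int trid;          /* creator of this record */
    int tran_prev_lsn; /* previous record of the same transaction */
    long *target;      /* object the record describes */
    long old_val;      /* value to restore on UNDO */
} LogRec;

/* Walk the transaction's records from newest to oldest, undoing each.
 * Returns the number of records undone. */
int abort_work(const LogRec *log, int last_lsn, int trid)
{
    int undone = 0;
    int lsn;
    for (lsn = last_lsn; lsn != NO_LSN; lsn = log[lsn].tran_prev_lsn) {
        assert(log[lsn].trid == trid); /* the chain never leaves the transaction */
        *log[lsn].target = log[lsn].old_val;
        undone++;
    }
    return undone;
}
```

Records of other transactions that are interleaved in the log are never visited.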
19
Multiple Logs
• In systems with very high update rates, the bandwidth of the log can become a bottleneck.
– Such bottlenecks can be eliminated by creating multiple logs and by directing the log records of different objects to different logs.
• In some situations, a particular RM keeps its own log table for portability reasons.
• Distributed systems are likely to have one or more logs per network node.
– They maintain multiple logs for performance and for node autonomy.
– With a local log, each node can recover its local transactions without involving the other nodes.
20
Group Commit
Log Insert:
1. The program acquires the log lock.
2. It fixes the log page in the buffer pool.
3. It allocates space for the log record in the page, and fills in the record.
4. It unfixes the page in the buffer pool, and unlocks the semaphore.
5. The movement of data to durable storage is coordinated by an asynchronous process called the log flush daemon.
21
Group Commit (Cont’d)
Group Commit:
The log daemon wakes up once every t ms and does all the log writing that has accumulated in the buffer pool (batch processing log writes).
Advantage: I/O overhead is reduced.
Disadvantage: It makes transactions last longer and delays the release of locks.
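The I/O saving can be seen in a single-threaded model: inserts only touch the buffer, and each wakeup of the daemon writes everything that has accumulated in one I/O. This is a sketch with hypothetical names and counters; a real daemon runs asynchronously on a timer:

```c
#include <assert.h>

#define BUF_CAP 64 /* hypothetical capacity of the log buffer, in records */

typedef struct {
    int pending;  /* records inserted but not yet on durable storage */
    int io_count; /* number of physical log writes so far */
    int durable;  /* records made durable so far */
} LogBuffer;

/* Log insert: fills in a record in the buffer; performs no I/O. */
void log_insert(LogBuffer *b)
{
    assert(b->pending < BUF_CAP);
    b->pending++;
}

/* One wakeup of the log flush daemon: a single write covers every
 * record that accumulated since the previous wakeup. */
void log_flush(LogBuffer *b)
{
    if (b->pending == 0)
        return; /* nothing to write, no I/O issued */
    b->io_count++;
    b->durable += b->pending;
    b->pending = 0;
}
```

Five inserts followed by one flush cost one I/O rather than five, but none of the five records is durable until that flush happens.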
22
The FIX Rule
While the semaphore is set, the page is said to be fixed, and releasing the page is called unfixing it.
FIX Rule:
1. Get the page semaphore in exclusive mode prior to altering the page.
2. Get the semaphore in shared or exclusive mode prior to reading the page.
3. Hold the semaphores until the page and log are again consistent, and the read or update is complete.
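The shared and exclusive modes in the rule follow the usual compatibility pattern: many readers or one writer. A non-blocking, single-threaded model of a page semaphore is sketched below; the names are hypothetical, and a real implementation would block the caller using an operating-system synchronization primitive instead of returning 0:

```c
#include <assert.h>

typedef enum { FIX_SHARED, FIX_EXCLUSIVE } FixMode;

typedef struct {
    int readers; /* holders in shared mode */
    int writer;  /* 1 if held in exclusive mode */
} PageSem;

/* Try to fix the page. Returns 1 on success and 0 if the request
 * conflicts with a current holder (the caller would have to wait). */
int page_fix(PageSem *s, FixMode mode)
{
    if (mode == FIX_SHARED) {
        if (s->writer)
            return 0;
        s->readers++;
        return 1;
    }
    if (s->writer || s->readers > 0)
        return 0;
    s->writer = 1;
    return 1;
}

/* Unfix the page: release the semaphore in the mode it was acquired. */
void page_unfix(PageSem *s, FixMode mode)
{
    if (mode == FIX_SHARED) {
        assert(s->readers > 0);
        s->readers--;
    } else {
        assert(s->writer);
        s->writer = 0;
    }
}
```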
23
The FIX Rule: 2-Phase Locking
This is just two-phase locking at the page-semaphore level.
The isolation theorem tells us that all read and write actions on the page will be isolated.
Page updates are actually mini-transactions.
When the page is unfixed, the page should be consistent and the log record should allow UNDO or REDO of the page transformation.
24
Multi-Page Actions
• Some actions modify several pages at once.
Examples: Inserting a multi-page record. Splitting a B-tree node.
• These actions are structured as follows:
1. Fix all the relevant pages.
2. Do all the modifications and generate many log records.
3. Unfix the pages.
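The three-step structure can be sketched for an action that touches two pages, loosely in the spirit of moving entries between B-tree nodes. The Page type, the fixed flag standing in for an exclusive semaphore, and the log-record counter are all assumptions made for this illustration:

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int fixed;  /* 1 while the page semaphore is held exclusively */
    long value; /* stand-in for the page contents */
} Page;

/* Move `amount` from one page to another as a single multi-page action.
 * Returns the number of log records generated. */
int move_between_pages(Page *from, Page *to, long amount)
{
    Page *pages[2];
    size_t i;
    int log_records = 0;

    pages[0] = from;
    pages[1] = to;

    for (i = 0; i < 2; i++) { /* 1. fix all the relevant pages */
        assert(!pages[i]->fixed);
        pages[i]->fixed = 1;
    }

    from->value -= amount;    /* 2. modify; one log record per page */
    log_records++;
    to->value += amount;
    log_records++;

    for (i = 0; i < 2; i++)   /* 3. unfix the pages */
        pages[i]->fixed = 0;

    return log_records;
}
```

No page is unfixed until every modification is done, so no reader can see one page updated and the other not.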