
Recovery and Atomicity in DBMS

When a system crashes, several transactions may be in progress, with various files open for modifying data items.
According to the ACID properties of a DBMS, the atomicity of each transaction must be maintained: either all of its operations are executed or none are.
Database recovery means restoring the data to a correct state after it has been lost, tampered with, or accidentally damaged.
Atomicity is a must: whether or not a transaction had finished at the time of the failure, its effects must either be reflected in the database permanently or not affect the database at all.

When a DBMS recovers from a crash, it should ensure the following:

It should check the states of all the transactions that were being executed.
A transaction may be in the middle of some operation; the DBMS must ensure the atomicity of the transaction in this case.
It should check whether the transaction can be completed now or needs to be rolled back.
No transaction should be allowed to leave the DBMS in an inconsistent state.

There are two types of techniques that can help a DBMS recover while maintaining the atomicity of a transaction:

Maintaining a log of each transaction, and writing it to stable storage before actually modifying the database.
Maintaining shadow paging, where changes are made to copies of the affected pages while the original (shadow) pages stay intact, and the database switches to the new pages only when the transaction commits.

Log-Based Recovery

Log-based recovery is a widely used approach in database management systems to recover from system failures and maintain atomicity and durability of transactions. The fundamental idea behind log-based recovery is to keep a log of all changes made to the database, so that after a failure, the system can use the log to restore the database to a consistent state.

How Log-Based Recovery Works

1. Transaction Logging

For every transaction that modifies the database, an entry is made in the log. This entry typically includes:

Transaction ID: A unique identifier for the transaction.
Data item identifier: Identifier for the specific item being modified.
OLD value: The value of the data item before the modification.
NEW value: The value of the data item after the modification.

We represent an update log record as <\(T_i\), \(X_j\), \(V_1\), \(V_2\)>, indicating that transaction \(T_i\) has performed a write on data item \(X_j\). \(X_j\) had value \(V_1\) before the write, and has value \(V_2\) after the write. Other special log records exist to record significant events during transaction processing, such as the start of a transaction and the commit or abort of a transaction. Among the types of log records are:

<\(T_i\) start>. Transaction \(T_i\) has started.
<\(T_i\) commit>. Transaction \(T_i\) has committed.
<\(T_i\) abort>. Transaction \(T_i\) has aborted.
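The record types above can be modeled as small immutable objects. The following is a minimal sketch, not any particular DBMS's log format; the class names `Start`, `Update`, `Commit`, and `Abort`, and the helper `format_record`, are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Start:
    txn: str  # transaction ID, e.g. "T1"


@dataclass(frozen=True)
class Update:
    txn: str   # transaction ID
    item: str  # data item identifier
    old: int   # value before the write (V1)
    new: int   # value after the write (V2)


@dataclass(frozen=True)
class Commit:
    txn: str


@dataclass(frozen=True)
class Abort:
    txn: str


def format_record(rec):
    """Render a record in the <...> notation used in the text."""
    if isinstance(rec, Update):
        return f"<{rec.txn}, {rec.item}, {rec.old}, {rec.new}>"
    # Start, Commit, and Abort carry only the transaction ID.
    return f"<{rec.txn} {type(rec).__name__.lower()}>"


# A transaction T1 that changes X from 100 to 50 and then commits.
log = [Start("T1"), Update("T1", "X", 100, 50), Commit("T1")]
lines = [format_record(r) for r in log]
```

Here `lines` holds the three records in the notation used above, in the order they were appended to the log.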

2. Writing to the Log

Before any change is written to the actual database (on disk), the corresponding log entry must be written to stable storage. This is called the Write-Ahead Logging (WAL) principle. Because the log is written first, the system can later redo or undo any change during recovery.
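The WAL rule can be illustrated with a toy in-memory model. This is a sketch under simplifying assumptions: the class `WalStore`, its dictionaries standing in for disk and buffer, and the use of a sequence number (`lsn`) per log record are all hypothetical and chosen only for illustration.

```python
class WalStore:
    def __init__(self, initial):
        self.disk = dict(initial)  # stands in for database pages on disk
        self.buffer = {}           # item -> (value, lsn of its last update)
        self.log_buffer = []       # log records still in volatile memory
        self.stable_log = []       # log records on stable storage

    def write(self, txn, item, new):
        """Update an item in the buffer and append its log record."""
        if item in self.buffer:
            old = self.buffer[item][0]
        else:
            old = self.disk.get(item)
        lsn = len(self.stable_log) + len(self.log_buffer)
        self.log_buffer.append((lsn, txn, item, old, new))
        self.buffer[item] = (new, lsn)

    def flush_log(self, up_to_lsn):
        """Move log records with lsn <= up_to_lsn to stable storage, in order."""
        while self.log_buffer and self.log_buffer[0][0] <= up_to_lsn:
            self.stable_log.append(self.log_buffer.pop(0))

    def flush_page(self, item):
        """Write a buffered item to disk, forcing its log record out first."""
        value, lsn = self.buffer[item]
        self.flush_log(lsn)      # WAL: the log record reaches stable storage first
        self.disk[item] = value  # only then is the database itself modified


store = WalStore({"X": 100})
store.write("T1", "X", 50)
before = (store.disk["X"], list(store.stable_log))
store.flush_page("X")
after = (store.disk["X"], list(store.stable_log))
```

After `write`, the disk still holds the old value and nothing is on stable storage; `flush_page` forces the log record out before it touches the disk copy, which is the ordering the principle requires.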

3. Checkpointing

Periodically, the DBMS might decide to take a checkpoint. A checkpoint is a point of synchronization between the database and its log. At the time of a checkpoint:

All the changes in main memory (buffer) up to that point are written to disk.
A special entry is made in the log indicating a checkpoint. This helps in reducing the amount of log that needs to be scanned during recovery.
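To see why a checkpoint shortens recovery, consider the following sketch. It assumes, beyond what the text states, that the checkpoint record carries the list of transactions active at that moment (a common textbook convention), and it represents log records as plain tuples. The function name `transactions_to_examine` is hypothetical.

```python
def transactions_to_examine(log):
    """Return the transactions recovery must look at.

    Log records are tuples: ("start", txn), ("update", txn, item, old, new),
    ("commit", txn), ("abort", txn), or ("checkpoint", [active txns]).
    """
    # Locate the most recent checkpoint record, if any.
    last_cp = None
    for i, rec in enumerate(log):
        if rec[0] == "checkpoint":
            last_cp = i

    if last_cp is None:
        # No checkpoint: every transaction in the log must be examined.
        return {rec[1] for rec in log if rec[0] == "start"}

    # Transactions active at the checkpoint, plus any that started after it.
    relevant = set(log[last_cp][1])
    for rec in log[last_cp + 1:]:
        if rec[0] == "start":
            relevant.add(rec[1])
    return relevant


log = [
    ("start", "T1"),
    ("update", "T1", "X", 100, 50),
    ("commit", "T1"),
    ("start", "T2"),
    ("checkpoint", ["T2"]),
    ("start", "T3"),
]
relevant = transactions_to_examine(log)
```

In this example `T1` committed before the checkpoint, so its changes are already on disk and recovery can ignore it; only `T2` and `T3` remain to be examined.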

4. Recovery Process

Redo: If a transaction is identified (from the log) as having committed but its changes have not been reflected in the database (due to a crash before the changes could be written to disk), then the changes are reapplied using the 'After Image' (the new value) from the log.
Undo: If a transaction is identified as not having committed at the time of the crash, any changes it made are reversed using the 'Before Image' (the old value) from the log to ensure atomicity.
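The redo and undo rules can be sketched as two passes over the log. This is a simplified illustration, not a production algorithm: it assumes log records are plain tuples, that no transaction overwrites another transaction's uncommitted data, and that the whole log is scanned (no checkpoints). The function name `recover` is hypothetical.

```python
def recover(log, disk):
    """Return the database state after applying undo and redo.

    Log records are tuples: ("start", txn), ("update", txn, item, old, new),
    ("commit", txn), or ("abort", txn). `disk` maps items to the values
    found on disk after the crash.
    """
    db = dict(disk)
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Undo pass: scan backward, restoring the before image of every
    # update made by a transaction that did not commit.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, _txn, item, old, _new = rec
            db[item] = old

    # Redo pass: scan forward, reapplying the after image of every
    # update made by a committed transaction.
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _txn, item, _old, new = rec
            db[item] = new

    return db


log = [
    ("start", "T1"),
    ("update", "T1", "A", 100, 50),
    ("commit", "T1"),
    ("start", "T2"),
    ("update", "T2", "B", 200, 250),
]
# At the crash, T1's committed write had not reached disk, but T2's
# uncommitted write had.
disk_after_crash = {"A": 100, "B": 250}
recovered = recover(log, disk_after_crash)
```

The sketch runs the undo pass before the redo pass so that an old before image can never overwrite a later committed write to the same item. Both passes are idempotent, so running `recover` again after a crash during recovery yields the same state.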

5. Commit/Rollback

Once a transaction is fully complete, a commit record is written to the log. If a transaction is aborted, the system uses the log to undo any changes made by that transaction, and an abort record is written to the log.

Benefits of Log-Based Recovery

Atomicity: Guarantees that even if a system fails in the middle of a transaction, the transaction can be rolled back using the log.
Durability: Ensures that once a transaction is committed, its effects are permanent and can be reconstructed even after a system failure.
Efficiency: Since logging typically involves sequential writes, it is generally faster than random access writes to a database.

Shadow Paging: Its Working Principle

Shadow paging is an alternative recovery technique to the more common log-based mechanisms. Instead of logging changes and updating pages in place, it writes modified pages to new locations and leaves the originals untouched. The fundamental concept behind shadow paging is to maintain two page tables during the lifetime of a transaction: the current page table and the shadow page table.

Here's a step-by-step breakdown of the working principle of shadow paging:

Initialization

When the transaction begins, the database system creates a copy of the current page table. This copy is called the shadow page table.

The actual data pages on disk are not duplicated; only the page table entries are. This means both the current and shadow page tables point to the same data pages initially.

During Transaction Execution

When a transaction modifies a page for the first time, a copy of the page is made. The current page table is updated to point to this new page.

Importantly, the shadow page table remains unaltered and continues pointing to the original, unmodified page.

Any subsequent changes by the same transaction are made to the copied page, and the current page table continues to point to this copied page.

On Transaction Commit

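Once the transaction reaches its commit point, the modified pages and the current page table are written to disk, and the current page table becomes the new "truth" for the database state, typically by atomically updating a single on-disk pointer to the page table. The old shadow page table is then discarded.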

The old data pages that were modified during the transaction (and which the shadow page table pointed to) can be reclaimed.

Recovery after a Crash

If a crash occurs before the transaction commits, recovery is straightforward. Since the original data pages (those referenced by the shadow page table) were never modified, they still represent a consistent database state.

The system simply discards the changes made during the transaction (i.e., discards the current page table) and reverts to the shadow page table.
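The whole life cycle described above (initialization, copy-on-write, commit, and crash recovery) can be sketched with an in-memory model. This is a toy illustration under stated assumptions: the class `ShadowPagedDB` and its method names are hypothetical, a dictionary stands in for disk pages, each write replaces a whole page, and only one transaction runs at a time.

```python
class ShadowPagedDB:
    def __init__(self, pages):
        # Physical page id -> page content (stands in for disk pages).
        self.storage = dict(enumerate(pages))
        # Shadow page table: logical page -> physical page (committed state).
        self.shadow = {i: i for i in range(len(pages))}
        self.current = None  # current page table, present only in a transaction
        self.next_id = len(pages)

    def begin(self):
        # Copy only the page table; data pages are not duplicated.
        self.current = dict(self.shadow)

    def write(self, logical, content):
        if self.current[logical] == self.shadow[logical]:
            # First write to this page: redirect it to a fresh physical page.
            self.current[logical] = self.next_id
            self.next_id += 1
        # The original page, still referenced by the shadow table, is untouched.
        self.storage[self.current[logical]] = content

    def read(self, logical):
        table = self.current if self.current is not None else self.shadow
        return self.storage[table[logical]]

    def commit(self):
        replaced = [p for l, p in self.shadow.items() if self.current[l] != p]
        self.shadow = self.current  # the single switch that makes changes visible
        self.current = None
        for pid in replaced:
            del self.storage[pid]   # reclaim the old versions of modified pages

    def crash_recover(self):
        # Drop the current page table and any pages only it referenced.
        self.current = None
        live = set(self.shadow.values())
        for pid in list(self.storage):
            if pid not in live:
                del self.storage[pid]


db = ShadowPagedDB(["a", "b"])
db.begin()
db.write(0, "A")
seen_in_txn = db.read(0)
db.crash_recover()            # crash before commit
seen_after_crash = db.read(0)

db.begin()
db.write(1, "B")
db.commit()
seen_after_commit = (db.read(0), db.read(1))
```

The first transaction's write is visible inside the transaction but disappears after the simulated crash, because the shadow table never stopped pointing at the original page. The second transaction commits, so its write survives and the replaced page is reclaimed.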
