When a DBMS recovers from a crash, it should maintain the following −
Two types of techniques help a DBMS recover from a failure while maintaining the atomicity of a transaction: log-based recovery and shadow paging.
Log-based recovery is a widely used approach in database management systems to recover from system failures and maintain atomicity and durability of transactions. The fundamental idea behind log-based recovery is to keep a log of all changes made to the database, so that after a failure, the system can use the log to restore the database to a consistent state.
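For every transaction that modifies the database, an entry is made in the log. This entry typically includes the identifier of the transaction, the identifier of the data item written, the old value of the item, and its new value.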
We represent an update log record as \(\langle T_i, X_j, V_1, V_2 \rangle\), indicating that transaction \(T_i\) has performed a write on data item \(X_j\). \(X_j\) had value \(V_1\) before the write, and has value \(V_2\) after the write. Other special log records mark significant events during transaction processing: the start of a transaction, \(\langle T_i\ \text{start} \rangle\); its commit, \(\langle T_i\ \text{commit} \rangle\); and its abort, \(\langle T_i\ \text{abort} \rangle\).
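The record types described above can be sketched as small Python data classes. This is only an illustration: the class names `Start`, `Update`, `Commit`, and `Abort`, the helper `describe`, and the sample transfer of 50 from A to B are assumptions made for the example, not part of any particular DBMS.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Start:
    txn: str           # T_i began


@dataclass(frozen=True)
class Update:
    txn: str           # T_i, the transaction performing the write
    item: str          # X_j, the data item written
    old: Any           # V_1, value before the write
    new: Any           # V_2, value after the write


@dataclass(frozen=True)
class Commit:
    txn: str           # T_i committed


@dataclass(frozen=True)
class Abort:
    txn: str           # T_i aborted


def describe(rec) -> str:
    """Render a record in the angle-bracket notation used in the text."""
    if isinstance(rec, Update):
        return f"<{rec.txn}, {rec.item}, {rec.old}, {rec.new}>"
    return f"<{rec.txn} {type(rec).__name__.lower()}>"


# Hypothetical log for a transaction T1 that moves 50 from A to B.
log = [
    Start("T1"),
    Update("T1", "A", 1000, 950),
    Update("T1", "B", 2000, 2050),
    Commit("T1"),
]

for rec in log:
    print(describe(rec))
```

Because every update record carries both the old and the new value, the same log supports undoing a write (restore `old`) and redoing it (reapply `new`).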
Before any change is written to the actual database (on disk), the corresponding log entry must be written to stable storage. This is called the Write-Ahead Logging (WAL) principle. Because the log reaches stable storage first, the system can later recover by redoing or undoing any change recorded in it.
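A minimal sketch of the write-ahead rule follows, assuming an in-memory stand-in for both stable storage and the disk. The class `WalStore` and its methods are hypothetical names chosen for this example; a real DBMS would write pages and log blocks to durable media.

```python
class WalStore:
    """Toy store that enforces 'log first, data second'."""

    def __init__(self):
        self.stable_log = []   # stands in for the log on stable storage
        self.disk = {}         # stands in for the database on disk
        self.log_buffer = []   # log records still in volatile memory
        self.cache = {}        # modified items not yet written to disk

    def write(self, txn, item, new_value):
        # Record (txn, item, old, new) in the log buffer, then change
        # only the in-memory copy of the item.
        old = self.cache.get(item, self.disk.get(item))
        self.log_buffer.append((txn, item, old, new_value))
        self.cache[item] = new_value

    def flush_log(self):
        # Move buffered log records to stable storage, preserving order.
        self.stable_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def flush_item(self, item):
        # WAL rule: if any log record for this item is still buffered,
        # force the log out before the data reaches the disk.
        if any(rec[1] == item for rec in self.log_buffer):
            self.flush_log()
        self.disk[item] = self.cache[item]


store = WalStore()
store.disk["A"] = 100
store.write("T1", "A", 90)
store.flush_item("A")
print(store.stable_log, store.disk)
```

The check inside `flush_item` is the whole point of the sketch: there is no path by which a new value reaches `disk` while its log record exists only in memory.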
Periodically, the DBMS might decide to take a checkpoint. A checkpoint is a point of synchronization between the database and its log. At the time of a checkpoint:
Once a transaction is fully complete, a commit record is written to the log. If a transaction is aborted, the system uses the log to undo every change the transaction made, and an abort (rollback) record is written to the log.
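How the log drives recovery after a crash can be sketched as below. This is a simplified illustration, not a full recovery algorithm: it assumes log records are plain tuples, that no checkpoints are used, and that transactions do not overwrite each other's uncommitted data. The function name `recover` and the sample values are assumptions for the example.

```python
def recover(log, disk):
    """Return the database state after undoing incomplete transactions
    and redoing committed ones.

    Records are tuples: ("start", txn), ("update", txn, item, old, new),
    ("commit", txn) or ("abort", txn).
    """
    db = dict(disk)
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Undo pass: scan backward and restore old values written by
    # transactions that have no commit record.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, _, item, old, _ = rec
            db[item] = old

    # Redo pass: scan forward and reapply new values written by
    # committed transactions, in case they never reached the disk.
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _, item, _, new = rec
            db[item] = new

    return db


log = [
    ("start", "T1"),
    ("update", "T1", "A", 1000, 950),
    ("update", "T1", "B", 2000, 2050),
    ("commit", "T1"),
    ("start", "T2"),
    ("update", "T2", "C", 700, 600),   # T2 never committed
]

# Disk image at the crash: T1's writes were not flushed, T2's write was.
disk_at_crash = {"A": 1000, "B": 2000, "C": 600}
print(recover(log, disk_at_crash))
```

Both passes are idempotent: running `recover` again on its own output gives the same state, which matters because a crash can also occur during recovery.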
Shadow paging is a recovery technique that serves as an alternative to the more common log-based mechanisms. The fundamental concept behind shadow paging is to maintain two page tables during the lifetime of a transaction: the current page table and the shadow page table.
Here's a step-by-step breakdown of the working principle of shadow paging:
When the transaction begins, the database system creates a copy of the current page table. This copy is called the shadow page table.
The actual data pages on disk are not duplicated; only the page table entries are. This means both the current and shadow page tables point to the same data pages initially.
When a transaction modifies a page for the first time, a copy of the page is made. The current page table is updated to point to this new page.
Importantly, the shadow page table remains unaltered and continues pointing to the original, unmodified page.
Any subsequent changes by the same transaction are made to the copied page, and the current page table continues to point to this copied page.
Once the transaction reaches a commit point, the shadow page table is discarded, and the current page table becomes the new "truth" for the database state.
The old versions of the pages that the transaction replaced (those the shadow page table pointed to) are no longer referenced and can be reclaimed.
If a crash occurs before the transaction commits, recovery is straightforward. Since the original data pages (those referenced by the shadow page table) were never modified, they still represent a consistent database state.
The system simply discards the changes made during the transaction (i.e., discards the current page table) and reverts to the shadow page table.
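The steps above can be sketched in a few lines of Python. This is a toy model under stated assumptions: pages are small dictionaries held in memory, one transaction runs at a time, and the class `ShadowPagedDB` with its method names is hypothetical. A real system would keep both page tables on disk and switch between them with a single atomic pointer update.

```python
class ShadowPagedDB:
    def __init__(self, pages):
        # Physical page id -> page contents (each page is a small dict).
        self.pages = {i: dict(p) for i, p in enumerate(pages)}
        self.next_id = len(pages)
        # Shadow page table: logical page -> physical page (committed state).
        self.shadow = {i: i for i in range(len(pages))}
        self.current = None

    def begin(self):
        # Copy only the table entries; no data page is duplicated.
        self.current = dict(self.shadow)

    def write(self, logical, key, value):
        if self.current[logical] == self.shadow[logical]:
            # First modification of this page: copy it and repoint the
            # current table. The shadow table is left untouched.
            new_id = self.next_id
            self.next_id += 1
            self.pages[new_id] = dict(self.pages[self.shadow[logical]])
            self.current[logical] = new_id
        # Later writes go to the already copied page.
        self.pages[self.current[logical]][key] = value

    def read(self, logical):
        table = self.current if self.current is not None else self.shadow
        return self.pages[table[logical]]

    def commit(self):
        # The current table becomes the committed state; pages that only
        # the old shadow table referenced can be reclaimed.
        stale = set(self.shadow.values()) - set(self.current.values())
        self.shadow, self.current = self.current, None
        for pid in stale:
            del self.pages[pid]

    def abort(self):
        # Also models crash recovery: drop the current table and the
        # copied pages, and fall back to the shadow table.
        copies = set(self.current.values()) - set(self.shadow.values())
        self.current = None
        for pid in copies:
            del self.pages[pid]


db = ShadowPagedDB([{"A": 1000}, {"B": 2000}])
db.begin()
db.write(0, "A", 950)
db.abort()                 # the original page was never touched
print(db.read(0))
```

Note that neither `commit` nor `abort` needs to read old or new values from a log; the choice of which page table to keep is what decides the database state.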