DBMS NotesPYQs

Chapter 8

Crash Recovery

Full Syllabus Outline

Every topic from the syllabus data is preserved here.

Chapter Notes

Block-based notes with markdown, diagrams, code, and math.

Chapter 8 — Crash Recovery

8.1 Failure Classification

Database systems must handle three categories of failures, each requiring a different recovery strategy:

Failure TypeCauseWhat is LostRecovery Method
Transaction failureLogical error (overflow, division by zero, constraint violation, user cancel, deadlock)Only that transaction's uncommitted changesRollback the single transaction using undo log
System crashPower failure, OS kernel panic, hardware fault — but disk survives intactEverything in RAM (entire buffer pool contents)Redo/undo using the write-ahead log from the last checkpoint
Disk failure (media failure)Disk head crash, corrupt sectors, catastrophic hardware failureContents of the failed diskRestore from backup + replay archived log records

Stable Storage

Stable storage is storage that survives any single failure. Approximated in practice by:

  • Writing data to multiple physically independent disks (RAID 1, RAID 10) before acknowledging the write
  • The probability of ALL copies failing simultaneously is engineered to be negligibly small

Block-level operations (important for understanding WAL):

  • Input(B): transfer disk block B from disk to a RAM buffer frame
  • Output(B): transfer RAM buffer block B to disk (write to stable storage)

8.2 Write-Ahead Log (WAL) and Log-Based Recovery

The Fundamental WAL Rule

Before a modified data page is written to disk (Output), the corresponding log record describing that modification MUST already be in stable storage.

This rule ensures that even if the system crashes immediately after writing a data page, the log record already exists on disk — enabling recovery by undo or redo.

Log Record Format

Each log record describes a single operation by a single transaction:

Record TypeFormatMeaning
Start<Ti, start>Transaction Ti has begun
Update<Ti, Xj, old_value, new_value>Ti updated item Xj from old_value to new_value
Commit<Ti, commit>Ti has committed; all changes made durable
Abort<Ti, abort>Ti has been rolled back; changes undone
Checkpoint<checkpoint, [T1, T2, ...]>Checkpoint taken with active transactions listed

Undo: use old_value to restore Xj before Ti's modification
Redo: use new_value to reapply Ti's modification to Xj

Immediate vs Deferred Modification

Immediate ModificationDeferred Modification
When data written to diskBefore the transaction commits (as soon as WRITE executes)Only AFTER the transaction commits
Recovery operations neededBoth UNDO (uncommitted) AND REDO (committed but not yet flushed)REDO only (nothing reaches disk before commit, so no undo needed)
Log records neededMust log both old AND new valuesOnly new values needed
PerformanceBetter write concurrencySimpler recovery logic

Recovery Algorithm (with Checkpointing)

Recovery uses two phases. Both phases start from the most recent checkpoint, not from the beginning of the entire log.


Phase 1 — Redo Phase (forward scan from last checkpoint to end of log):

  1. Begin scanning forward from the last checkpoint record
  2. Redo every update found in the log (even from transactions that will ultimately be undone — simpler to redo everything, then undo)
  3. While scanning, build the undo-list:
    • Add to undo-list: transactions that have a <Ti, start> record but no <Ti, commit> record

Phase 2 — Undo Phase (backward scan from end of log):

  1. Scan the log backward from the end
  2. For each transaction Ti in the undo-list:
    • Apply undo: set Xj = old_value for each update record of Ti
    • Write a Compensation Log Record (CLR): <Ti, Xj, new_val, old_val, CLR>
  3. When you encounter <Ti, start>: write <Ti, abort> and remove Ti from undo-list
  4. Continue until the undo-list is empty

Checkpointing

A checkpoint reduces recovery time by establishing a known-good point:

  1. Write <checkpoint, [active_transactions]> to log
  2. Flush all dirty (modified) buffer pages to disk
  3. Flush the log to stable storage

Recovery shortcut: During recovery, only scan the log from the most recent checkpoint. All transactions committed BEFORE the checkpoint are already on disk — no need to redo them.

Comprehensive Checkpoint Recovery — Fully Worked Example

LOG contents (in order):
  <T0, start>
  <T0, A, 1000, 900>          ← T0: A changes from 1000 to 900
  <T1, start>
  ── CHECKPOINT, active = {T0, T1} ──   ← last checkpoint point
  <T1, B, 2000, 2100>          ← T1: B changes from 2000 to 2100
  <T1, commit>                 ← T1 commits successfully
  <T2, start>
  <T2, C, 500, 600>            ← T2: C changes from 500 to 600
  ** SYSTEM CRASH **           ← no commit for T0 or T2

────────────────────────────────────────────────────────
Recovery Procedure:

Step 0: Locate last checkpoint → "CHECKPOINT, active = {T0, T1}"
  Initial undo-list = {T0, T1}   (transactions active at checkpoint time)

────────────────────────────────────────────────────────
REDO PHASE — scan FORWARD from checkpoint to end of log:

  Process <T1, B, 2000, 2100>:   REDO → set B = 2100 ✓
  Process <T1, commit>:           T1 committed → REMOVE T1 from undo-list
                                  undo-list = {T0}
  Process <T2, start>:            T2 not seen before → ADD T2 to undo-list
                                  undo-list = {T0, T2}
  Process <T2, C, 500, 600>:     REDO → set C = 600 ✓ (will be undone in next phase)

After redo phase:
  State: A=900, B=2100, C=600
  undo-list = {T0, T2}   (T0 started before checkpoint, never committed;
                           T2 started after, never committed)

────────────────────────────────────────────────────────
UNDO PHASE — scan BACKWARD from end of log:

  Process <T2, C, 500, 600>:   UNDO T2 → set C = 500; write <T2,C,600,500,CLR>
  Process <T2, start>:          write <T2, abort>; REMOVE T2 from undo-list
                                 undo-list = {T0}
  Process <T1, commit>:         T1 NOT in undo-list → skip
  Process <T1, B, 2000, 2100>: T1 NOT in undo-list → skip
  (checkpoint record)
  Process <T0, A, 1000, 900>:  UNDO T0 → set A = 1000; write <T0,A,900,1000,CLR>
  Process <T0, start>:          write <T0, abort>; REMOVE T0 from undo-list
                                 undo-list = {} → DONE

Final state after recovery:
  A = 1000   (T0 rolled back — never committed)
  B = 2100   (T1 redone — committed before crash)
  C = 500    (T2 rolled back — never committed)  ✓

8.3 Shadow Paging

Shadow paging is an alternative recovery technique that avoids using a traditional log. It uses two page tables instead.

Mechanism

ComponentDescription
Shadow page tableThe stable, committed page table pointing to data as of the last commit — never modified during an active transaction
Current page tableThe working copy used during the transaction — updated as writes happen

How it works step-by-step:

  1. At the start of a transaction, copy the current page table to use as the shadow (or the existing shadow remains as-is)
  2. During the transaction: all writes go to NEW disk pages. The current page table is updated to point to these new pages. The shadow table still points to the old pre-transaction pages.
  3. On COMMIT: Atomically write a single pointer (the DB root pointer) to the current page table. This single atomic write makes the current state the new committed state.
  4. On ABORT or CRASH (before commit): Discard the current page table. The shadow table is still intact, so the database is exactly as it was before the transaction started.

How Atomicity and Durability Are Ensured

  • Atomicity: If crash before commit, shadow page table is untouched → automatic rollback to last committed state. No undo log is needed.
  • Durability: After commit, the new root pointer is durably written. If crash after commit, shadow table now points to the new pages → the committed state persists.

Shadow Paging vs Log-Based Recovery

AspectShadow PagingLog-Based Recovery
Undo loggingNot needed (just discard current page table)Required (must log old values)
Redo loggingNot needed (keep new pages)Required (must log new values)
ConcurrencyPoor — very difficult to support multiple concurrent transactions sharing page tablesExcellent — log is append-only
Storage fragmentationYes — data pages spread across disk as new pages are allocatedNo — pages updated in place
Garbage collectionRequired (must free old shadow pages)Not required
Practical adoptionLimited (SQLite uses a variant; mostly historical)Universal (PostgreSQL, MySQL, Oracle, SQL Server)

Exam answer — "Describe shadow paging for atomicity and durability, and state its main advantage and disadvantage over log-based recovery:"

Shadow paging maintains two page tables: the shadow (last committed state) and the current (in-progress). All writes go to new disk pages and the current table is updated. On commit, a single atomic pointer swap replaces the shadow with the current table — achieving durability. On crash before commit, the shadow is intact and no undo is needed — achieving atomicity.

Main advantage over log-based: No undo/redo log records are needed — simpler recovery.
Primary disadvantage: Very poor concurrency support (multiple concurrent transactions are difficult to manage) and storage fragmentation as data pages scatter across disk.


8.4 Remote Backup Systems (High Availability)

A remote backup system maintains a copy of the database at a geographically separate site to ensure availability when the primary site fails.

Architecture: The primary site writes to its WAL; log records are shipped to the backup site continuously or periodically. The backup site applies these log records to stay synchronized.

Types of Standby Systems

Standby TypeReplication ModeFailover TimeData Loss RiskDescription
Hot standbySynchronous — primary waits for backup to acknowledge every writeNear-zero (seconds)Near-zeroBackup is always fully up-to-date; highest cost
Warm standbySemi-synchronous — background replication with some lagMinutesSeconds to minutesGood cost/availability balance for most systems
Cold standbyPeriodic log shipping (every few hours or days)HoursMinutes to hoursCheapest; acceptable for non-critical, low-RPO systems

Transfer of control (failover procedure):

  1. Detect primary site failure (via heartbeat timeout or manual trigger)
  2. Backup site applies any remaining log records it has already received
  3. Backup site opens its database for client connections
  4. Clients are redirected to the backup site (via DNS change, load balancer update, or VIP failover)

Exam Quick-Reference — Chapter 8

  • WAL rule: log record MUST reach stable storage BEFORE the corresponding data page is written to disk
  • Log record format: <Ti, Xj, old_val, new_val> for update; <Ti, commit> for commit; <Ti, start> for begin
  • Immediate modification: data written before commit → need both UNDO and REDO
  • Deferred modification: data written only after commit → REDO only (no undo)
  • Checkpoint shortcut: start recovery from last checkpoint, not from the beginning of the entire log
  • Redo phase: scan FORWARD from checkpoint; redo all updates; build undo-list of uncommitted transactions
  • Undo phase: scan BACKWARD; undo all transactions in undo-list (those with no commit record)
  • Shadow paging commit: atomic pointer swap from current page table to shadow → no undo log needed
  • Shadow paging disadvantage: poor concurrency + storage fragmentation
  • Hot standby: synchronous; near-zero failover + data loss. Cold standby: periodic log shipping; hours to recover

Solved PYQs

A few past questions with short answers.

CT30108006Chapter 86 marksasked 2x2080 Chaitra R

How does log based recovery method work? Explain the conditions of using redo and undo operations during the recovery process.

Solution

Method: WAL records changes before disk write. On crash, replay log. REDO: Applied to committed txns whose changes weren't flushed to disk. Forward scan. Sets value=new. UNDO: Applied to uncommitted txns with partial writes on disk. Backward scan. Sets value=old. Immediate Modification: Needs both. Deferred: Only REDO needed.

CT30108011Chapter 86 marksasked 2x2080 Ashwin B, 2076 Baisakh B

Consider the following log contents when a crash occurs. Explain how recovery would be done for each log state (a), (b), (c) — covering T0 and T1 with commit/no-commit scenarios.

Solution

(a) Both Commit: REDO T0, REDO T1. Apply all new values. (b) T0 Commit, T1 No Commit: REDO T0 (apply changes). UNDO T1 (restore old values). (c) Neither Commit: UNDO both (immediate modification) or discard (deferred). Ensures no partial updates persist. Recovery scans log forward for REDO list, backward for UNDO list.

CT30108005Chapter 88 marksasked 1x2079 Ashwin B

Write the different types of failures that may occur in DBMS. Differentiate between shadow paging and log-based recovery.

Solution

Failures: Transaction (logical/system error), System Crash (power/OS, volatile memory lost), Disk Failure (media damage). Shadow Paging vs Log:

  • Shadow: Dual page tables, atomic commit pointer swap. No undo. Poor concurrency, fragmentation.
  • Log: WAL records old/new values. Supports many concurrent txns, fine-grained recovery. Complex undo/redo logic.

More PYQs

Additional practice from this chapter.