Chapter 8

Crash Recovery

4 hours4 marks15 past questionsBack to chapters Open filtered PYQs Previous: Ch 7 Next: Ch 9

Full Syllabus Outline

Every topic from the syllabus data is preserved here.

8.1
Failure classification
DE
Transaction failure (logical error, system error). System crash (power failure, hardware fault). Disk failure. Stable storage concept and implementation.
8.2
Recovery and atomicity
DEI
Why atomicity requires a recovery mechanism. Shadow-copy approach as a basic introduction. Log-based approach motivation.
8.3
Log-based recovery
DEIDMPS
Log record format: <Ti, start>, <Ti, Xj, V1, V2>, <Ti, commit>, <Ti, abort>. Immediate database modification vs deferred database modification. Undo and redo operations. Recovery algorithm: undo phase, redo phase. Checkpointing: fuzzy checkpoint. Log-based recovery with checkpoints. Worked examples with given log states. Demonstration using MS SQL Server backup/restore.
8.4
Shadow paging
DEI
Current page table and shadow page table. Write operation mechanism. Commit: make current page table the new shadow. How atomicity and durability are ensured. Advantages (no undo logging) and disadvantages (fragmentation, no concurrent transactions) vs log-based recovery.
8.5
High availability using remote backup systems
DEIDM
Remote backup architecture: primary site and remote backup site. Log shipping. Transfer of control on primary failure. Hot standby vs warm standby vs cold standby. Synchronous vs asynchronous replication.

Chapter Notes

Block-based notes with markdown, diagrams, code, and math.

Chapter 8 — Crash Recovery

8.1 Failure Classification

Database systems must handle three categories of failures, each requiring a different recovery strategy:

Failure Type	Cause	What is Lost	Recovery Method
Transaction failure	Logical error (overflow, division by zero, constraint violation, user cancel, deadlock)	Only that transaction's uncommitted changes	Rollback the single transaction using undo log
System crash	Power failure, OS kernel panic, hardware fault — but disk survives intact	Everything in RAM (entire buffer pool contents)	Redo/undo using the write-ahead log from the last checkpoint
Disk failure (media failure)	Disk head crash, corrupt sectors, catastrophic hardware failure	Contents of the failed disk	Restore from backup + replay archived log records

Stable Storage

Stable storage is storage that survives any single failure. Approximated in practice by:

Writing data to multiple physically independent disks (RAID 1, RAID 10) before acknowledging the write
The probability of ALL copies failing simultaneously is engineered to be negligibly small

Block-level operations (important for understanding WAL):

Input(B): transfer disk block B from disk to a RAM buffer frame
Output(B): transfer RAM buffer block B to disk (write to stable storage)

8.2 Write-Ahead Log (WAL) and Log-Based Recovery

The Fundamental WAL Rule

Before a modified data page is written to disk (Output), the corresponding log record describing that modification MUST already be in stable storage.

This rule ensures that even if the system crashes immediately after writing a data page, the log record already exists on disk — enabling recovery by undo or redo.

Log Record Format

Each log record describes a single operation by a single transaction:

Record Type	Format	Meaning
Start	`<Ti, start>`	Transaction Ti has begun
Update	`<Ti, Xj, old_value, new_value>`	Ti updated item Xj from old_value to new_value
Commit	`<Ti, commit>`	Ti has committed; all changes made durable
Abort	`<Ti, abort>`	Ti has been rolled back; changes undone
Checkpoint	`<checkpoint, [T1, T2, ...]>`	Checkpoint taken with active transactions listed

Undo: use old_value to restore Xj before Ti's modification
Redo: use new_value to reapply Ti's modification to Xj

Immediate vs Deferred Modification

	Immediate Modification	Deferred Modification
When data written to disk	Before the transaction commits (as soon as WRITE executes)	Only AFTER the transaction commits
Recovery operations needed	Both UNDO (uncommitted) AND REDO (committed but not yet flushed)	REDO only (nothing reaches disk before commit, so no undo needed)
Log records needed	Must log both old AND new values	Only new values needed
Performance	Better write concurrency	Simpler recovery logic

Recovery Algorithm (with Checkpointing)

Recovery uses two phases. Both phases start from the most recent checkpoint, not from the beginning of the entire log.

Phase 1 — Redo Phase (forward scan from last checkpoint to end of log):

Begin scanning forward from the last checkpoint record
Redo every update found in the log (even from transactions that will ultimately be undone — simpler to redo everything, then undo)
While scanning, build the undo-list:
- Add to undo-list: transactions that have a <Ti, start> record but no <Ti, commit> record

Phase 2 — Undo Phase (backward scan from end of log):

Scan the log backward from the end
For each transaction Ti in the undo-list:
- Apply undo: set Xj = old_value for each update record of Ti
- Write a Compensation Log Record (CLR): <Ti, Xj, new_val, old_val, CLR>
When you encounter <Ti, start>: write <Ti, abort> and remove Ti from undo-list
Continue until the undo-list is empty

Checkpointing

A checkpoint reduces recovery time by establishing a known-good point:

Write <checkpoint, [active_transactions]> to log
Flush all dirty (modified) buffer pages to disk
Flush the log to stable storage

Recovery shortcut: During recovery, only scan the log from the most recent checkpoint. All transactions committed BEFORE the checkpoint are already on disk — no need to redo them.

Comprehensive Checkpoint Recovery — Fully Worked Example

LOG contents (in order):
  <T0, start>
  <T0, A, 1000, 900>          ← T0: A changes from 1000 to 900
  <T1, start>
  ── CHECKPOINT, active = {T0, T1} ──   ← last checkpoint point
  <T1, B, 2000, 2100>          ← T1: B changes from 2000 to 2100
  <T1, commit>                 ← T1 commits successfully
  <T2, start>
  <T2, C, 500, 600>            ← T2: C changes from 500 to 600
  ** SYSTEM CRASH **           ← no commit for T0 or T2

────────────────────────────────────────────────────────
Recovery Procedure:

Step 0: Locate last checkpoint → "CHECKPOINT, active = {T0, T1}"
  Initial undo-list = {T0, T1}   (transactions active at checkpoint time)

────────────────────────────────────────────────────────
REDO PHASE — scan FORWARD from checkpoint to end of log:

  Process <T1, B, 2000, 2100>:   REDO → set B = 2100 ✓
  Process <T1, commit>:           T1 committed → REMOVE T1 from undo-list
                                  undo-list = {T0}
  Process <T2, start>:            T2 not seen before → ADD T2 to undo-list
                                  undo-list = {T0, T2}
  Process <T2, C, 500, 600>:     REDO → set C = 600 ✓ (will be undone in next phase)

After redo phase:
  State: A=900, B=2100, C=600
  undo-list = {T0, T2}   (T0 started before checkpoint, never committed;
                           T2 started after, never committed)

────────────────────────────────────────────────────────
UNDO PHASE — scan BACKWARD from end of log:

  Process <T2, C, 500, 600>:   UNDO T2 → set C = 500; write <T2,C,600,500,CLR>
  Process <T2, start>:          write <T2, abort>; REMOVE T2 from undo-list
                                 undo-list = {T0}
  Process <T1, commit>:         T1 NOT in undo-list → skip
  Process <T1, B, 2000, 2100>: T1 NOT in undo-list → skip
  (checkpoint record)
  Process <T0, A, 1000, 900>:  UNDO T0 → set A = 1000; write <T0,A,900,1000,CLR>
  Process <T0, start>:          write <T0, abort>; REMOVE T0 from undo-list
                                 undo-list = {} → DONE

Final state after recovery:
  A = 1000   (T0 rolled back — never committed)
  B = 2100   (T1 redone — committed before crash)
  C = 500    (T2 rolled back — never committed)  ✓

8.3 Shadow Paging

Shadow paging is an alternative recovery technique that avoids using a traditional log. It uses two page tables instead.

Mechanism

Component	Description
Shadow page table	The stable, committed page table pointing to data as of the last commit — never modified during an active transaction
Current page table	The working copy used during the transaction — updated as writes happen

How it works step-by-step:

At the start of a transaction, copy the current page table to use as the shadow (or the existing shadow remains as-is)
During the transaction: all writes go to NEW disk pages. The current page table is updated to point to these new pages. The shadow table still points to the old pre-transaction pages.
On COMMIT: Atomically write a single pointer (the DB root pointer) to the current page table. This single atomic write makes the current state the new committed state.
On ABORT or CRASH (before commit): Discard the current page table. The shadow table is still intact, so the database is exactly as it was before the transaction started.

How Atomicity and Durability Are Ensured

Atomicity: If crash before commit, shadow page table is untouched → automatic rollback to last committed state. No undo log is needed.
Durability: After commit, the new root pointer is durably written. If crash after commit, shadow table now points to the new pages → the committed state persists.

Shadow Paging vs Log-Based Recovery

Aspect	Shadow Paging	Log-Based Recovery
Undo logging	Not needed (just discard current page table)	Required (must log old values)
Redo logging	Not needed (keep new pages)	Required (must log new values)
Concurrency	Poor — very difficult to support multiple concurrent transactions sharing page tables	Excellent — log is append-only
Storage fragmentation	Yes — data pages spread across disk as new pages are allocated	No — pages updated in place
Garbage collection	Required (must free old shadow pages)	Not required
Practical adoption	Limited (SQLite uses a variant; mostly historical)	Universal (PostgreSQL, MySQL, Oracle, SQL Server)

Exam answer — "Describe shadow paging for atomicity and durability, and state its main advantage and disadvantage over log-based recovery:"

Shadow paging maintains two page tables: the shadow (last committed state) and the current (in-progress). All writes go to new disk pages and the current table is updated. On commit, a single atomic pointer swap replaces the shadow with the current table — achieving durability. On crash before commit, the shadow is intact and no undo is needed — achieving atomicity.

Main advantage over log-based: No undo/redo log records are needed — simpler recovery.
Primary disadvantage: Very poor concurrency support (multiple concurrent transactions are difficult to manage) and storage fragmentation as data pages scatter across disk.

8.4 Remote Backup Systems (High Availability)

A remote backup system maintains a copy of the database at a geographically separate site to ensure availability when the primary site fails.

Architecture: The primary site writes to its WAL; log records are shipped to the backup site continuously or periodically. The backup site applies these log records to stay synchronized.

Types of Standby Systems

Standby Type	Replication Mode	Failover Time	Data Loss Risk	Description
Hot standby	Synchronous — primary waits for backup to acknowledge every write	Near-zero (seconds)	Near-zero	Backup is always fully up-to-date; highest cost
Warm standby	Semi-synchronous — background replication with some lag	Minutes	Seconds to minutes	Good cost/availability balance for most systems
Cold standby	Periodic log shipping (every few hours or days)	Hours	Minutes to hours	Cheapest; acceptable for non-critical, low-RPO systems

Transfer of control (failover procedure):

Detect primary site failure (via heartbeat timeout or manual trigger)
Backup site applies any remaining log records it has already received
Backup site opens its database for client connections
Clients are redirected to the backup site (via DNS change, load balancer update, or VIP failover)

Exam Quick-Reference — Chapter 8

WAL rule: log record MUST reach stable storage BEFORE the corresponding data page is written to disk
Log record format: <Ti, Xj, old_val, new_val> for update; <Ti, commit> for commit; <Ti, start> for begin
Immediate modification: data written before commit → need both UNDO and REDO
Deferred modification: data written only after commit → REDO only (no undo)
Checkpoint shortcut: start recovery from last checkpoint, not from the beginning of the entire log
Redo phase: scan FORWARD from checkpoint; redo all updates; build undo-list of uncommitted transactions
Undo phase: scan BACKWARD; undo all transactions in undo-list (those with no commit record)
Shadow paging commit: atomic pointer swap from current page table to shadow → no undo log needed
Shadow paging disadvantage: poor concurrency + storage fragmentation
Hot standby: synchronous; near-zero failover + data loss. Cold standby: periodic log shipping; hours to recover

Solved PYQs

A few past questions with short answers.

CT30108006Chapter 86 marksasked 2x2080 Chaitra R

How does log based recovery method work? Explain the conditions of using redo and undo operations during the recovery process.

Solution

Method: WAL records changes before disk write. On crash, replay log. REDO: Applied to committed txns whose changes weren't flushed to disk. Forward scan. Sets value=new. UNDO: Applied to uncommitted txns with partial writes on disk. Backward scan. Sets value=old. Immediate Modification: Needs both. Deferred: Only REDO needed.

CT30108011Chapter 86 marksasked 2x2080 Ashwin B, 2076 Baisakh B

Consider the following log contents when a crash occurs. Explain how recovery would be done for each log state (a), (b), (c) — covering T0 and T1 with commit/no-commit scenarios.

Solution

(a) Both Commit: REDO T0, REDO T1. Apply all new values. (b) T0 Commit, T1 No Commit: REDO T0 (apply changes). UNDO T1 (restore old values). (c) Neither Commit: UNDO both (immediate modification) or discard (deferred). Ensures no partial updates persist. Recovery scans log forward for REDO list, backward for UNDO list.

CT30108005Chapter 88 marksasked 1x2079 Ashwin B

Write the different types of failures that may occur in DBMS. Differentiate between shadow paging and log-based recovery.

Solution

Failures: Transaction (logical/system error), System Crash (power/OS, volatile memory lost), Disk Failure (media damage). Shadow Paging vs Log:

Shadow: Dual page tables, atomic commit pointer swap. No undo. Poor concurrency, fragmentation.
Log: WAL records old/new values. Supports many concurrent txns, fine-grained recovery. Complex undo/redo logic.

More PYQs

Additional practice from this chapter.

Describe the mechanism by which Shadow Paging ensures atomicity and durability, and state its main advantage over log-based recovery and its primary disadvantage.
CT301080014 marksasked 1xModel Q
What is a stable storage? What are checkpoints in Log based recovery Mechanism? How is data recovered using checkpoints?
CT301080026 marksasked 1x2082 Kartik B
What are logs in DBMS? How do you recover from a crash using log-based recovery? Explain briefly.
CT301080036 marksasked 1x2081 Chaitra R
Define stable storage. Explain log-based recovery technique with example.
CT301080046 marksasked 1x2081 Ashwin B
Write the different types of failures that may occur in DBMS. Differentiate between shadow paging and log-based recovery.
CT301080058 marksasked 1x2079 Ashwin B

Crash Recovery

Failure classification

Recovery and atomicity

Log-based recovery

Shadow paging

High availability using remote backup systems

Chapter 8 — Crash Recovery

8.1 Failure Classification

Stable Storage

8.2 Write-Ahead Log (WAL) and Log-Based Recovery

The Fundamental WAL Rule

Log Record Format

Immediate vs Deferred Modification

Recovery Algorithm (with Checkpointing)

Checkpointing

8.3 Shadow Paging

Mechanism

How Atomicity and Durability Are Ensured

Shadow Paging vs Log-Based Recovery

8.4 Remote Backup Systems (High Availability)

Types of Standby Systems

Exam Quick-Reference — Chapter 8