Chapter 8 — Crash Recovery
8.1 Failure Classification
| Type | Cause | Recovery |
|---|
| Transaction failure | Logical error (divide by zero, constraint violation) or deadlock | Rollback single transaction |
| System crash | Power failure, OS bug, hardware fault | Recover from log using checkpoint |
| Disk failure | Head crash, corrupt sector | Restore from backup + replay log |
Stable storage: storage that survives all failures (achieved by writing to multiple physically independent disk copies before acknowledging a write).
8.2 Log-Based Recovery
The Write-Ahead Log (WAL) rule: before a data page is written to disk, its log record must reach stable storage.
Log Record Types
| Record | Meaning |
|---|
<Ti, start> | Transaction Ti has started |
<Ti, Xj, old_val, new_val> | Ti updated attribute Xj from old to new |
<Ti, commit> | Ti has committed |
<Ti, abort> | Ti has aborted |
<checkpoint, [active list]> | Checkpoint taken with listed active transactions |
Immediate vs Deferred Modification
| Immediate Modification | Deferred Modification |
|---|
| When data written to disk | As soon as operation executes (before commit) | Only after commit |
| Recovery needs | Both UNDO and REDO | REDO only |
| Performance | Better concurrency | Simpler recovery |
Recovery Algorithm (with checkpointing)
Phase 1 — Redo phase (forward scan from last checkpoint):
- Redo ALL updates found in log (even aborted ones — simpler to redo everything then undo).
- Build undo-list: transactions that appear in log but have no
<Ti, commit> record.
Phase 2 — Undo phase (backward scan from end):
- For each transaction in undo-list: apply UNDO (restore old_val) and write
<Ti, abort> to log.
- Continue until all undo-list transactions are undone.
Checkpointing
Periodically write a checkpoint to reduce recovery time:
- Write
<checkpoint, active_list> to log.
- Flush all dirty (modified) buffer pages to disk.
- Flush log to stable storage.
Recovery shortcut: during recovery, only scan from the most recent checkpoint — no need to redo transactions committed before that checkpoint.
Worked example — Checkpoint recovery:
Log:
<T0, start>
<T0, A, 1000, 900>
<T1, start>
<checkpoint, {T0, T1}> ← last checkpoint
<T1, B, 2000, 2100>
<T1, commit>
<T2, start>
<T2, C, 500, 600>
** CRASH **
Recovery:
Start scanning from checkpoint.
Redo-list = {T1 committed after checkpoint → REDO T1}
Undo-list = {T0 started before checkpoint, no commit → UNDO T0}
{T2 started, no commit → UNDO T2}
Redo phase (forward):
REDO <T1, B, 2000, 2100>: set B=2100
REDO <T2, C, 500, 600>: set C=600 (even though we'll undo T2 next)
Undo phase (backward):
UNDO T2: <T2, C, 600, 500>: set C=500 → write <T2, abort>
UNDO T0: <T0, A, 900, 1000>: set A=1000 → write <T0, abort>
Final state: A=1000, B=2100, C=500 ✓
8.3 Shadow Paging
Mechanism:
- Maintain two page tables: shadow (stable, from last commit) and current (in-progress).
- All writes go to new disk pages; shadow page table is never modified during a transaction.
- Commit: atomically switch the shadow page table pointer to point to the current page table.
- Abort/crash: discard current page table; shadow table still intact.
How atomicity is ensured: if crash before commit, shadow table is untouched → rollback to last committed state. If crash after commit, shadow table now points to new pages → state persists.
| Shadow Paging | Log-Based Recovery |
|---|
| Undo logging | Not needed | Required |
| Concurrency | Poor (difficult to support) | Excellent |
| Storage fragmentation | Yes (data spreads across pages) | No |
| Practical use | Limited (old systems) | Universal |
8.4 Remote Backup (High Availability)
| Standby Type | Sync | Failover time | Data loss risk |
|---|
| Hot standby | Synchronous replication | Near-zero | Near-zero |
| Warm standby | Semi-synchronous | Minutes | Seconds–minutes |
| Cold standby | Periodic log shipping | Hours | Minutes–hours |
Transfer of control: when primary fails, the backup site takes over, applying any remaining log records before accepting client connections.
Exam Quick-Reference
- WAL rule: log must reach stable storage BEFORE data page is written.
- Log record format:
<Ti, Xj, old, new> for update; <Ti, commit> for commit.
- Immediate modification: needs both UNDO and REDO. Deferred: REDO only.
- Checkpoint shortcut: only scan log from last checkpoint during recovery.
- Shadow paging: commit = atomically switch shadow pointer. No undo log needed.