DBMS NotesPYQs

Chapter 7

Transaction Processing and Concurrency Control

Full Syllabus Outline

Every topic from the syllabus data is preserved here.

Chapter Notes

Block-based notes with markdown, diagrams, code, and math.

Chapter 7 — Transaction Processing and Concurrency Control

7.1 What is a Transaction?

A transaction is a logical unit of work — a sequence of one or more database operations that is treated as a single, indivisible operation. It either completes entirely (commit) or has no effect at all (abort/rollback).

Practical examples:

  • Bank transfer: READ(A), A ← A−100, WRITE(A), READ(B), B ← B+100, WRITE(B)
  • Seat reservation: check availability, lock seat, charge card, issue ticket confirmation

Transaction States

State Descriptions

StateDescription
ActiveTransaction is executing operations normally (reading and writing)
Partially CommittedThe last operation has been executed; changes are in the buffer pool but NOT yet flushed to stable disk
CommittedTransaction has completed successfully; all changes are durably written to stable storage
FailedTransaction cannot proceed — caused by error, user ROLLBACK, system failure, or deadlock
AbortedTransaction has been rolled back; ALL its effects have been undone; database is as if the transaction never ran

Exam question: "Briefly describe the four primary states in a database transaction and illustrate with a state diagram."
Draw the diagram above and describe Active, Partially Committed, Committed, and Failed/Aborted.


7.2 ACID Properties

ACID is the cornerstone of database transaction management. All four properties must be guaranteed.

PropertyFormal DefinitionMechanism in DBMS
AtomicityA transaction is all-or-nothing — ALL operations complete or NONE take effectUndo logging (rollback journal)
ConsistencyA transaction takes the DB from one valid state to another, preserving all integrity constraintsIntegrity constraints + application logic
IsolationConcurrent transactions appear to execute serially — each transaction is unaware of othersConcurrency control (locking, MVCC)
DurabilityOnce a transaction commits, its changes persist permanently even if the system crashes immediately afterRedo logging + stable storage

Detailed ACID Example — Bank Transfer of $100 from A to B

T: Transfer $100 from Account A (balance=1000) to Account B (balance=500)
  1. READ(A)          → A_balance = 1000
  2. A ← A − 100     → A_balance = 900
  3. WRITE(A)         → write A=900 to buffer
  4. READ(B)          → B_balance = 500
  5. B ← B + 100     → B_balance = 600
  6. WRITE(B)         → write B=600 to buffer
  7. COMMIT           → flush to stable disk
ACID PropertyHow it applies
AtomicityIf crash after step 3 but before step 6, the system must undo WRITE(A) and restore A=1000
ConsistencyAt every point, total A+B must be preserved: 1500 before = 1500 after
IsolationIf T2 reads A while T is between step 3 and step 6, T2 should NOT see A=900 yet
DurabilityAfter step 7 (commit), A=900 and B=600 must survive any subsequent crash

7.3 Concurrent Execution and Anomalies

Multiple transactions execute concurrently to improve throughput. Without control, concurrency causes these anomalies:

AnomalyDescriptionExample
Lost UpdateT1 and T2 both read X; T1 writes X; T2 then overwrites T1's change using the stale value it read earlierTwo ticket agents both read "1 seat left", both write "0 seats" — one reservation is lost
Dirty ReadT2 reads a value written by T1 before T1 commits; T1 later abortsT2 reads T1's uncommitted salary update; T1 rolls back; T2 has acted on phantom data
Unrepeatable ReadT1 reads X; T2 updates X and commits; T1 reads X again and sees a different valueT1 checks account balance (1000); T2 withdraws and commits (800); T1 re-checks and sees 800
Phantom ReadT1 reads rows matching a predicate; T2 inserts new matching rows; T1 re-reads — sees a different setT1 counts CS students: 100; T2 admits new CS student; T1 re-counts: 101

7.4 Serializability

Schedules

A schedule is an ordering of operations from multiple concurrent transactions.

  • Serial schedule: transactions execute one after another with no interleaving — always correct by definition
  • Serializable schedule: an interleaved schedule equivalent to some serial schedule — the goal of concurrency control

Conflict Serializability

Conflicting operations: two operations from different transactions on the same data item, where at least one is a WRITE.

Ti operationTj operationConflict?
Read(X)Read(X)No — both reading
Read(X)Write(X)Yes — read-write conflict
Write(X)Read(X)Yes — write-read conflict
Write(X)Write(X)Yes — write-write conflict

A schedule is conflict serializable if it can be transformed into a serial schedule by repeatedly swapping adjacent non-conflicting operations.

Precedence (Serialization) Graph Test

Build a directed graph G:

  • Nodes = one node per transaction
  • Edges = draw edge Ti → Tj if any operation of Ti conflicts with and precedes any operation of Tj

Theorem:

  • G has a cycle → schedule is NOT conflict serializable
  • G is acyclic → schedule IS conflict serializable; topological order of G gives the equivalent serial schedule
Conflict Serialization Graph — Worked Example

Schedule S (interleaved):
  R1(A)  R2(A)  W1(A)  W2(A)  R1(B)  W1(B)

T1 operations: R1(A), W1(A), R1(B), W1(B)
T2 operations: R2(A), W2(A)

Step 1: Identify ALL conflicting pairs (diff transactions, same item, ≥1 write):
  • R1(A) at pos1, W2(A) at pos4  → T1 precedes T2  → draw T1 → T2
  • W1(A) at pos3, W2(A) at pos4  → T1 precedes T2  → T1 → T2 (already have it)
  • R2(A) at pos2, W1(A) at pos3  → T2 precedes T1  → draw T2 → T1

Step 2: Precedence graph:
  T1 → T2   AND   T2 → T1   ← CYCLE EXISTS

Conclusion: Schedule S is NOT conflict serializable.

──────────────────────────────────────────────────────────────
Example of a SERIALIZABLE schedule:

Schedule S2:
  R1(A)  W1(A)  R1(B)  W1(B)  R2(A)  W2(A)

Conflicts:
  W1(A) at pos2, R2(A) at pos5  → T1 → T2
  W1(A) at pos2, W2(A) at pos6  → T1 → T2
  R1(A) at pos1, W2(A) at pos6  → T1 → T2

Precedence graph: T1 → T2 only — NO CYCLE
Conclusion: S2 IS conflict serializable.
Equivalent serial schedule: T1 then T2 (from topological sort)

View Serializability

View serializability is weaker (broader) than conflict serializability. Schedule S is view equivalent to serial schedule S' if all three conditions hold:

  1. Same initial reads: If Ti reads the initial value of data item Q in S, Ti also reads it in S'
  2. Same dependent reads: If Ti reads the value of Q written by Tj in S, Ti also reads Q written by Tj in S'
  3. Same final writes: If Ti performs the last write of Q in S, Ti also performs the last write of Q in S'

Key relationship:

  • Every conflict-serializable schedule is view-serializable
  • Some view-serializable schedules are NOT conflict-serializable (e.g., blind writes — writing without first reading)
  • Checking view serializability is NP-hard → DBMS use conflict serializability in practice

7.5 Lock-Based Protocols

Locking is the primary mechanism for ensuring conflict serializability in practice.

Lock Types and Compatibility

Lock TypeSymbolMeaningCompatible with S?Compatible with X?
Shared (Read)SAllows concurrent reads✓ Yes✗ No
Exclusive (Write)XNo other read or write allowed✗ No✗ No

Compatibility Matrix:

S requestedX requested
S held by another T✓ Compatible — grant✗ Must wait
X held by another T✗ Must wait✗ Must wait

Two-Phase Locking (2PL)

Rules:

  1. Growing phase: A transaction may acquire locks but may NOT release any locks
  2. Shrinking phase: A transaction may release locks but may NOT acquire any new locks
  3. The lock point is the moment when the transaction acquires its LAST lock (the peak)

Guarantee: Any schedule produced by 2PL is conflict serializable.

Variations of 2PL

VariantAdditional RuleExtra Guarantee
Basic 2PLGrow then shrink; release locks at any time in shrinking phaseConflict serializability
Strict 2PLHold all exclusive (write) locks until COMMIT or ABORTPrevents cascading rollbacks; recoverable schedules
Rigorous 2PLHold ALL locks (read + write) until COMMIT or ABORTStrongest guarantee; most commonly used in practice

Cascading rollback problem (why Strict 2PL matters):
If T1 writes X and releases its write lock early, T2 reads X (dirty read). If T1 then aborts, T2 must also abort because it read uncommitted data. This cascade can propagate to many transactions — a nightmare for recovery. Strict 2PL prevents this by keeping write locks until commit.


7.6 Deadlock Handling

Deadlock occurs when two or more transactions are each waiting for a lock held by another — circular wait with no progress possible.

Classic example:

  • T1 holds lock on A, waiting for lock on B
  • T2 holds lock on B, waiting for lock on A
  • Neither can proceed → deadlock

Deadlock Prevention Schemes (Timestamp-Based)

Both schemes assign a timestamp to each transaction when it starts. Older timestamp = higher priority.

SchemeTi requests lock held by TjWho is killed?Style
Wait-DieIf Ti is OLDER: Ti waits. If Ti is YOUNGER: Ti dies (aborts).Younger transaction always diesNon-preemptive: older never forcibly killed
Wound-WaitIf Ti is OLDER: Ti wounds Tj (forces Tj to abort and retry). If Ti is YOUNGER: Ti waits.Younger transaction may be killed by an older onePreemptive: older can preempt younger

Starvation prevention: When a transaction is restarted after abort, it keeps its original timestamp so it gradually becomes the "oldest" and is never killed again.

Mnemonic:

  • Wait-Die: OLD waits patiently; YOUNG dies immediately
  • Wound-Wait: OLD wounds (attacks) young; YOUNG waits for old

Deadlock Detection: Wait-For Graph

Build a directed graph:

  • Nodes = all active transactions
  • Edge Ti → Tj = Ti is waiting for a lock currently held by Tj

Periodically check for cycles. If a cycle exists → deadlock detected.
Resolution: Choose a victim (typically the youngest or the one that has done the least work) and abort it. The cycle is broken; other transactions can proceed.


7.7 Multiple Granularity Locking

Instead of always locking at the individual tuple level, allow locking at different levels of the database hierarchy:

Database > Table (Relation) > Page > Tuple (Row)

Why useful? A transaction that scans an entire table can lock the table once (coarse granularity) instead of locking each of potentially millions of tuples individually.

Intention Locks

Before locking a node N at a fine level, you must place an intention lock on all ancestor nodes in the hierarchy (top-down).

Lock ModeMeaning
IS (Intention Shared)Will acquire S lock(s) on some descendant(s)
IX (Intention Exclusive)Will acquire X lock(s) on some descendant(s)
S (Shared)Read the entire subtree (all descendant tuples)
SIX (Shared + Intention Exclusive)Read entire subtree + write some children
X (Exclusive)Write entire subtree

Multiple granularity protocol:

  1. Top-down to acquire: acquire IS or IX on all ancestors before locking a node
  2. Bottom-up to release: release fine-grained locks before releasing coarse-grained locks

Exam Quick-Reference — Chapter 7

  • Transaction states: Active → Partially Committed → Committed (happy path); Active → Failed → Aborted (sad path)
  • ACID: Atomicity=undo log, Consistency=constraints+app logic, Isolation=locking/MVCC, Durability=redo log+stable storage
  • Conflicting operations: different transactions + same data item + at least one write
  • Precedence graph: edge Ti→Tj if Ti's operation conflicts with and precedes Tj's. Cycle → NOT serializable. Acyclic → IS serializable.
  • 2PL: growing phase (acquire only) → lock point → shrinking phase (release only)
  • Strict 2PL: hold write locks until commit — prevents cascading rollbacks
  • Deadlock prevention — Wait-Die: old waits, young dies. Wound-Wait: old wounds young, young waits for old
  • Deadlock detection: wait-for graph; cycle = deadlock; abort a victim to break the cycle

Solved PYQs

A few past questions with short answers.

CT30107005Chapter 78 marksasked 1x2080 Ashwin B

Define transaction and its properties. Explain the two-phase locking protocol for concurrency control in DBMS.

Solution

Transaction: Atomic unit of work. ACID: Atomicity, Consistency, Isolation, Durability.

2PL: Two phases. Growing: Acquire locks (S for read, X for write), no releases. Shrinking: Release locks, no acquisitions. Lock point defines serial order. Guarantees conflict serializability. Variants: Strict (hold X until commit), Rigorous (hold all until commit).

CT30107006Chapter 78 marksasked 1x2081 Ashwin B

Explain ACID properties in database transaction with examples. Explain how wait-die scheme and wound-wait scheme helps to prevent deadlock.

Solution

ACID: Atomic (all/nothing), Consistent (rules preserved), Isolated (concurrent ≡ serial), Durable (commit survives crash).

Wait-Die: Older waits, younger aborts. Prevents cycle because young never waits for old. Wound-Wait: Older preempts (aborts) younger. Younger waits for old. Prevents cycle because old never waits for young. Both use timestamps to break circular waits deterministically.

CT30107009Chapter 78 marksasked 1x2078 Poush B

Explain ACID properties of transactions. Explain the states of a transaction along with a state-transition diagram.

Solution

ACID: Atomicity (all-or-nothing), Consistency (constraints hold), Isolation (concurrent ≡ serial), Durability (committed persists). States: Active → PartiallyCommitted → Committed. Or Active/PartiallyCommitted → Failed → Aborted.

More PYQs

Additional practice from this chapter.