Chapter 4 — Database Constraints and Normalization
4.1 Integrity Constraints
| Constraint | Scope | Purpose | SQL Example |
|---|
| NOT NULL | Column | Prevent missing values | name VARCHAR(100) NOT NULL |
| UNIQUE | Column / group | No duplicate values | UNIQUE(email) |
| PRIMARY KEY | Column / group | Unique + not null identifier | sid INT PRIMARY KEY |
| FOREIGN KEY | Column | Referential integrity | REFERENCES Student(sid) |
| CHECK | Column / row | Custom business rule | CHECK (gpa BETWEEN 0 AND 4) |
| DEFAULT | Column | Value when none supplied | dept VARCHAR(50) DEFAULT 'CS' |
| ASSERTION | Table / DB | Multi-table constraint | CREATE ASSERTION ... |
CREATE TABLE Student (
sid INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(200) UNIQUE,
dept VARCHAR(50) DEFAULT 'Undeclared',
gpa DECIMAL(3,2) CHECK (gpa >= 0 AND gpa <= 4.0),
mentor INT REFERENCES Student(sid) ON DELETE SET NULL -- self-ref FK
);
-- Table-level constraint (composite)
CREATE TABLE Enrollment (
sid INT REFERENCES Student(sid),
cid INT REFERENCES Course(cid),
grade CHAR(2),
CONSTRAINT pk_enroll PRIMARY KEY (sid, cid),
CONSTRAINT valid_grade CHECK (grade IN ('A','B','C','D','F','IP','W'))
);
-- Assertion (SQL standard, limited DBMS support)
CREATE ASSERTION max_salary
CHECK (NOT EXISTS (
SELECT 1 FROM Employee WHERE salary > 1000000
));
4.2 Functional Dependencies (FD)
A functional dependency A → B means: for any two tuples with the same A value, they must have the same B value. "A functionally determines B."
Armstrong's Axioms (sound and complete)
| Axiom | Rule | Example |
|---|
| Reflexivity | If B ⊆ A, then A → B | {sid, name} → {name} |
| Augmentation | If A → B, then AC → BC | sid → name ⟹ {sid,dept} → {name,dept} |
| Transitivity | If A → B and B → C, then A → C | sid → dept, dept → dean ⟹ sid → dean |
Derived rules (from the axioms):
- Union: A → B, A → C ⟹ A → BC
- Decomposition: A → BC ⟹ A → B and A → C
- Pseudotransitivity: A → B, CB → D ⟹ CA → D
Attribute Closure Algorithm
To find A⁺ (closure of attribute set A under FD set F):
Algorithm: AttributeClosure(A, F)
result ← A
repeat
for each FD (X → Y) in F:
if X ⊆ result:
result ← result ∪ Y
until result does not change
return result
Example:
Relation: R(A, B, C, D, E)
FDs: F = { A→BC, B→D, CD→E }
Closure of {A}:
Start: {A}
A→BC: {A, B, C}
B→D: {A, B, C, D}
CD→E: {A, B, C, D, E} ← all attributes!
So {A} is a superkey (and candidate key since no proper subset of {A} gives full closure).
Canonical Cover (Minimal Cover)
A canonical cover F_c has no extraneous attributes and no redundant FDs. Steps:
- Use Union rule to combine FDs with same LHS.
- Remove extraneous attributes from each LHS and RHS.
- Remove FDs that are derivable from the rest.
4.3 Normalization
Goal: eliminate update, insert, and delete anomalies by decomposing relations.
1NF — First Normal Form
Condition: every attribute is atomic (no repeating groups, no sets, no arrays).
Fix: move multivalued attributes into separate rows or a separate relation.
2NF — Second Normal Form
Condition: in 1NF AND no non-prime attribute is partially dependent on any candidate key.
(A partial dependency: non-key attribute depends on part of a composite key.)
Fix: move the partially dependent attribute + the partial key into a new relation.
Example violation:
OrderItem(order_id, product_id, product_name, quantity)
product_name depends only on product_id (part of composite PK).
Fix: split into OrderItem(order_id, product_id, quantity) and Product(product_id, product_name).
3NF — Third Normal Form
Condition: in 2NF AND no non-prime attribute transitively depends on any candidate key.
Fix: remove transitive dependency chain into a new relation.
Example violation:
Student(sid, dept, dept_head)
sid → dept → dept_head (transitive).
Fix: Student(sid, dept) and Department(dept, dept_head).
BCNF — Boyce-Codd Normal Form
Condition: for every non-trivial FD X → Y, X must be a superkey.
Stricter than 3NF — BCNF removes anomalies 3NF can miss.
Trade-off: BCNF decomposition may not preserve all FDs; 3NF decomposition always does.
| 3NF | BCNF |
|---|
| Lossless decomposition | ✓ always possible | ✓ always possible |
| Dependency preservation | ✓ always possible | ✗ not always |
| Anomaly elimination | Partial | Complete |
BCNF Decomposition Algorithm:
result ← {R}
while some relation Ri in result is not in BCNF:
find a non-trivial FD X → Y in Ri that violates BCNF (X is not superkey of Ri)
replace Ri with:
Ri1 = X ∪ Y (contains the violating FD)
Ri2 = Ri − Y + X (retains X as a join key)
return result
3NF Synthesis Algorithm:
1. Compute canonical cover Fc of F
2. For each FD X → Y in Fc, create relation (X ∪ Y)
3. If no relation contains a candidate key of R, add one
4. Remove redundant relations (subsets of another)
4.4 Lossless Decomposition
Decomposing R into R1 and R2 is lossless if:
- R1 ∩ R2 → R1 (the common attributes are a key for R1), OR
- R1 ∩ R2 → R2 (the common attributes are a key for R2).
Test: natural joining R1 and R2 should reproduce exactly R — no spurious tuples.
Dependency preservation: a decomposition preserves dependencies if every FD in F can be checked within a single relation of the decomposition (no cross-relation FD checking needed).
Exam Quick-Reference
- Armstrong axioms: Reflexivity · Augmentation · Transitivity (RAT).
- Closure test: if the closure of an attribute set includes all attributes, it's a superkey.
- 2NF removes partial deps on composite PK; 3NF removes transitive deps.
- BCNF: every determinant is a candidate key — stricter than 3NF.
- BCNF may lose dependency preservation; 3NF always preserves dependencies.