DBMS NotesPYQs

Chapter 3

Relational Query Languages

Full Syllabus Outline

Every topic from the syllabus data is preserved here.

Chapter Notes

Block-based notes with markdown, diagrams, code, and math.

Chapter 3 — Relational Query Languages

3.1 Relational Algebra

Relational algebra is a procedural formal query language — it specifies both what data to retrieve and how (the sequence of operations). SQL is based on relational algebra but is declarative (specifies only what, not how).

Core Operators

OperatorSymbolOutputDescription
Selectionσ (sigma)Subset of rowsFilters rows satisfying a predicate
Projectionπ (pi)Subset of columnsKeeps only specified columns; removes duplicates
Cartesian Product×All row combinationsPairs every tuple of R with every tuple of S
Natural JoinJoined relationJoins on all common attributes; merges duplicate columns
Theta Join⋈_θJoined relationCartesian product filtered by predicate θ
UnionAll tuples in R or SBoth relations must be union-compatible
IntersectionTuples in bothBoth must be union-compatible
Set DifferenceTuples in R but not SBoth must be union-compatible
Renameρ (rho)Renamed relationRename relation and/or attributes
AssignmentNamed resultStore intermediate result for reuse

Union-compatible = same number of attributes with compatible domains.

Selection: σdept=CSgpa>3.5(Student)\text{Selection: } \sigma_{dept='CS' \wedge gpa > 3.5}(Student)
Projection: πname,gpa ⁣(σdept=CS(Student))\text{Projection: } \pi_{name,\, gpa}\!\left(\sigma_{dept='CS'}(Student)\right)
Natural Join: StudentEnrollmentCourse\text{Natural Join: } Student \bowtie Enrollment \bowtie Course
Theta Join: StudentStudent.sid=Enrollment.sidEnrollment\text{Theta Join: } Student \bowtie_{Student.sid = Enrollment.sid} Enrollment

Common Relational Algebra Query Examples

QueryRelational Algebra Expression
CS students with GPA > 3.5σ_{dept='CS' ∧ gpa>3.5}(Student)
Names of CS studentsπ_{name}(σ_{dept='CS'}(Student))
Students enrolled in course 101π_{sid}(σ_{cid=101}(Enrollment))
Student names enrolled in course 101π_{name}(Student ⋈ σ_{cid=101}(Enrollment))
Students NOT enrolled in any courseπ_{sid}(Student) − π_{sid}(Enrollment)

Optimization tip: Always push σ as close to the base relation as possible, and push π early to eliminate unneeded columns. This reduces intermediate result sizes.


3.2 SQL Reference Schema

CREATE TABLE Student (
  sid    INT          PRIMARY KEY,
  name   VARCHAR(100) NOT NULL,
  dept   VARCHAR(50),
  gpa    DECIMAL(3,2) CHECK (gpa BETWEEN 0 AND 4)
);
CREATE TABLE Course (
  cid     INT          PRIMARY KEY,
  title   VARCHAR(100) NOT NULL,
  credits INT,
  dept    VARCHAR(50)
);
CREATE TABLE Enrollment (
  sid   INT  REFERENCES Student(sid) ON DELETE CASCADE,
  cid   INT  REFERENCES Course(cid)  ON DELETE CASCADE,
  grade CHAR(2),
  year  INT,
  PRIMARY KEY (sid, cid)
);
CREATE TABLE Instructor (
  iid    INT          PRIMARY KEY,
  name   VARCHAR(100),
  dept   VARCHAR(50),
  salary DECIMAL(10,2)
);
CREATE TABLE Teaches (
  iid INT REFERENCES Instructor(iid),
  cid INT REFERENCES Course(cid),
  PRIMARY KEY (iid, cid)
);

3.3 SQL SELECT — Full Syntax and Logical Execution Order

SELECT  [DISTINCT] column_list | *
FROM    table_list | join_expression
[WHERE  row_predicate]
[GROUP BY grouping_columns]
[HAVING group_predicate]
[ORDER BY sort_column [ASC | DESC]]
[LIMIT n  OFFSET m]

Logical execution order (how the engine evaluates — NOT the writing order):

  1. FROM + JOIN — identify all rows from all tables
  2. WHERE — filter individual rows
  3. GROUP BY — group the filtered rows
  4. HAVING — filter groups (applied after grouping)
  5. SELECT — compute output columns (aliases defined here)
  6. DISTINCT — remove duplicate output rows
  7. ORDER BY — sort the result
  8. LIMIT / OFFSET — paginate

Exam traps:

  • You CANNOT use a SELECT alias in a WHERE clause (WHERE runs before SELECT).
  • You CAN use a SELECT alias in ORDER BY (ORDER BY runs after SELECT).
  • Never use aggregate functions in WHERE — use HAVING instead.
  • WHERE filters rows (before grouping); HAVING filters groups (after GROUP BY).
-- ── BASIC SELECT + AGGREGATES ────────────────────────────────────────────
SELECT name, gpa
FROM   Student
WHERE  dept = 'CS' AND gpa > 3.5
ORDER  BY gpa DESC;

-- DISTINCT — remove duplicate rows
SELECT DISTINCT dept FROM Student;

-- Aggregate functions: COUNT, AVG, MAX, MIN, SUM
-- COUNT(*): counts all rows including NULLs
-- COUNT(col): counts non-null values only; AVG, MAX, MIN, SUM all ignore NULLs
SELECT
  dept,
  COUNT(*)             AS total_students,
  ROUND(AVG(gpa), 2)   AS avg_gpa,
  MAX(gpa)             AS top_gpa,
  MIN(gpa)             AS lowest_gpa
FROM Student
GROUP BY dept            -- group rows by dept first
HAVING AVG(gpa) > 3.0    -- then filter groups where avg GPA > 3.0
ORDER BY avg_gpa DESC;
-- ── JOINS ────────────────────────────────────────────────────────────────

-- INNER JOIN: only matching rows from both tables
SELECT s.name, c.title, e.grade
FROM   Student s
JOIN   Enrollment e ON s.sid = e.sid
JOIN   Course c     ON e.cid = c.cid
WHERE  c.dept = 'CS';

-- LEFT OUTER JOIN: all rows from left table; NULLs for non-matching right rows
-- Use: find students NOT enrolled (grade will be NULL)
SELECT s.name, e.cid, e.grade
FROM   Student s
LEFT JOIN Enrollment e ON s.sid = e.sid;

-- RIGHT OUTER JOIN: all rows from right table
SELECT s.name, c.title
FROM   Enrollment e
RIGHT JOIN Course c ON e.cid = c.cid;

-- FULL OUTER JOIN: all rows from both tables
SELECT s.name, c.title
FROM   Student s
FULL OUTER JOIN Enrollment e ON s.sid = e.sid
FULL OUTER JOIN Course c     ON e.cid = c.cid;

-- CROSS JOIN (Cartesian product): WARNING — |R| × |S| rows
SELECT s.name, c.title FROM Student s CROSS JOIN Course c;

-- NATURAL JOIN: auto-join on same-named columns (fragile — use with care)
SELECT * FROM Student NATURAL JOIN Enrollment;

-- Self-join: list pairs of students in the same department
SELECT a.name AS student1, b.name AS student2, a.dept
FROM   Student a
JOIN   Student b ON a.dept = b.dept AND a.sid < b.sid;
-- a.sid < b.sid prevents (Alice,Bob) AND (Bob,Alice) both appearing
-- ── SUBQUERIES ───────────────────────────────────────────────────────────

-- Non-correlated: independent of outer query, runs ONCE
SELECT name FROM Student
WHERE sid IN (
  SELECT sid FROM Enrollment WHERE grade = 'A'
);

-- Correlated: references outer query row; runs ONCE PER OUTER ROW (slower)
-- Find students with GPA above their own department's average
SELECT name, gpa, dept
FROM   Student s
WHERE  gpa > (
  SELECT AVG(gpa) FROM Student s2 WHERE s2.dept = s.dept
);

-- EXISTS: true if subquery returns at least one row
SELECT name FROM Student s
WHERE EXISTS (
  SELECT 1 FROM Enrollment e WHERE e.sid = s.sid AND e.grade = 'A'
);

-- NOT EXISTS: students who never got an A grade
SELECT name FROM Student s
WHERE NOT EXISTS (
  SELECT 1 FROM Enrollment e WHERE e.sid = s.sid AND e.grade = 'A'
);

-- Students never enrolled in any course
SELECT name FROM Student s
WHERE NOT EXISTS (SELECT 1 FROM Enrollment e WHERE e.sid = s.sid);
-- ⚠ Prefer NOT EXISTS over NOT IN when NULLs may be present in the subquery result

-- Scalar subquery: subquery returning a single value, usable in SELECT
SELECT name,
       (SELECT COUNT(*) FROM Enrollment e WHERE e.sid = s.sid) AS num_courses
FROM   Student s;
-- ── SET OPERATIONS ──────────────────────────────────────────────────────
-- Both relations must be union-compatible (same columns, compatible types)

-- UNION: rows in first OR second result; REMOVES duplicates
SELECT sid FROM Student WHERE dept = 'CS'
UNION
SELECT sid FROM Student WHERE dept = 'IT';

-- UNION ALL: keep ALL duplicates (faster — no dedup step)
SELECT sid FROM Student WHERE dept = 'CS'
UNION ALL
SELECT sid FROM Student WHERE gpa > 3.8;

-- INTERSECT: rows in BOTH results
SELECT sid FROM Enrollment WHERE cid = 101
INTERSECT
SELECT sid FROM Enrollment WHERE cid = 102;  -- enrolled in BOTH courses

-- EXCEPT (MINUS in Oracle): rows in first but NOT in second
SELECT sid FROM Enrollment WHERE cid = 101
EXCEPT
SELECT sid FROM Enrollment WHERE cid = 102;  -- enrolled in 101 but NOT 102
-- ── DML: INSERT · UPDATE · DELETE ──────────────────────────────────────

-- INSERT — single row
INSERT INTO Student (sid, name, dept, gpa) VALUES (10, 'Ravi', 'CS', 3.7);

-- INSERT — multiple rows at once
INSERT INTO Student VALUES
  (11, 'Priya', 'IT', 3.5),
  (12, 'Sita',  'CS', 3.9);

-- INSERT from SELECT (bulk insert from another query)
INSERT INTO CS_Backup SELECT * FROM Student WHERE dept = 'CS';

-- UPDATE — update rows matching condition
UPDATE Student
SET    gpa = LEAST(gpa + 0.1, 4.0)   -- LEAST() prevents exceeding 4.0
WHERE  dept = 'CS';

-- UPDATE with subquery
UPDATE Student SET dept = 'CS'
WHERE  sid IN (SELECT sid FROM Enrollment WHERE cid = 101);

-- DELETE — remove rows
DELETE FROM Student WHERE gpa < 1.5;

-- DELETE using subquery — remove students never enrolled anywhere
DELETE FROM Student
WHERE sid NOT IN (SELECT DISTINCT sid FROM Enrollment);
-- ── VIEWS ────────────────────────────────────────────────────────────────

-- Simple view: stores the query definition, not data
CREATE VIEW CS_Students AS
  SELECT sid, name, gpa FROM Student WHERE dept = 'CS';

-- View with join (hides complexity from user)
CREATE VIEW Student_Courses AS
  SELECT s.name AS student_name, c.title AS course_title, e.grade
  FROM   Student s
  JOIN   Enrollment e ON s.sid = e.sid
  JOIN   Course c     ON e.cid = c.cid;

-- WITH CHECK OPTION: prevents INSERT/UPDATE that violates the view's WHERE
CREATE VIEW High_GPA_Students AS
  SELECT * FROM Student WHERE gpa > 3.5
  WITH CHECK OPTION;
-- INSERT INTO High_GPA_Students VALUES(99,'X','CS',2.0) → ERROR

-- Materialized view (PostgreSQL): stores the RESULT physically; must be refreshed
CREATE MATERIALIZED VIEW Dept_Stats AS
  SELECT dept, COUNT(*) AS num_students, AVG(gpa) AS avg_gpa
  FROM   Student GROUP BY dept;

REFRESH MATERIALIZED VIEW Dept_Stats;  -- update stored result

-- Access control: grant access to view, not base table
GRANT SELECT ON CS_Students TO student_app_user;

DROP VIEW CS_Students;
DROP MATERIALIZED VIEW Dept_Stats;

Views — Key Concepts

ConceptRegular ViewMaterialized View
StorageStores query definition onlyStores the query result (physically)
FreshnessAlways current (re-executes on access)Stale until refreshed (REFRESH command)
PerformanceSlower for complex queriesFaster (pre-computed result)
Use caseSecurity, simplification, abstractionReporting, OLAP, dashboards

Updatable views: A view supports INSERT/UPDATE/DELETE only if it selects from a single base table with no DISTINCT, GROUP BY, HAVING, aggregates, or UNION.

-- ── TRIGGERS ─────────────────────────────────────────────────────────────
-- Trigger: automatically fires BEFORE or AFTER INSERT/UPDATE/DELETE
-- Use cases: audit logging, enforce complex constraints, auto-compute derived values

-- PostgreSQL: trigger function + trigger binding

-- Example 1: Validate GPA before insert/update
CREATE OR REPLACE FUNCTION validate_gpa()
RETURNS TRIGGER AS $$
BEGIN
  IF NEW.gpa < 0 OR NEW.gpa > 4.0 THEN
    RAISE EXCEPTION 'GPA must be between 0.0 and 4.0, got %', NEW.gpa;
  END IF;
  RETURN NEW;  -- BEFORE trigger must RETURN NEW (or NULL to cancel the operation)
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_gpa_check
BEFORE INSERT OR UPDATE ON Student
FOR EACH ROW EXECUTE FUNCTION validate_gpa();
-- FOR EACH ROW: fires once per modified row (row-level trigger)
-- FOR EACH STATEMENT: fires once per SQL statement (statement-level trigger)

-- Example 2: Audit trigger — log every deleted student
CREATE TABLE Student_Audit (
  deleted_at  TIMESTAMP DEFAULT NOW(),
  sid INT, name VARCHAR(100), dept VARCHAR(50)
);
CREATE OR REPLACE FUNCTION audit_student_delete()
RETURNS TRIGGER AS $$
BEGIN
  INSERT INTO Student_Audit(sid, name, dept)
  VALUES (OLD.sid, OLD.name, OLD.dept);  -- OLD = the deleted row
  RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_student_audit
AFTER DELETE ON Student
FOR EACH ROW EXECUTE FUNCTION audit_student_delete();

-- Trigger timing:
-- BEFORE trigger: can modify NEW or return NULL to cancel operation
-- AFTER trigger:  cannot modify NEW; OLD available for DELETE; used for audit
-- ── STORED PROCEDURES ────────────────────────────────────────────────────

CREATE OR REPLACE PROCEDURE enroll_student(
  p_sid   INT,
  p_cid   INT,
  p_grade CHAR(2) DEFAULT 'IP'
)
LANGUAGE plpgsql AS $$
BEGIN
  IF NOT EXISTS (SELECT 1 FROM Student WHERE sid = p_sid) THEN
    RAISE EXCEPTION 'Student % not found', p_sid;
  END IF;
  IF NOT EXISTS (SELECT 1 FROM Course WHERE cid = p_cid) THEN
    RAISE EXCEPTION 'Course % not found', p_cid;
  END IF;
  INSERT INTO Enrollment(sid, cid, grade, year)
  VALUES (p_sid, p_cid, p_grade, EXTRACT(YEAR FROM NOW())::INT)
  ON CONFLICT (sid, cid) DO UPDATE SET grade = EXCLUDED.grade;
  RAISE NOTICE 'Student % enrolled in course %', p_sid, p_cid;
END;
$$;

CALL enroll_student(1, 101, 'A');

-- ── GRANT / REVOKE ────────────────────────────────────────────────────────

CREATE ROLE student_role;
GRANT SELECT ON Student, Course, Enrollment TO student_role;

CREATE USER alice WITH PASSWORD 'pass123';
GRANT student_role TO alice;

-- WITH GRANT OPTION: alice can further grant this privilege to others
GRANT SELECT ON Student TO alice WITH GRANT OPTION;

REVOKE INSERT ON Enrollment FROM alice;
-- CASCADE: also revokes privileges alice granted using WITH GRANT OPTION
REVOKE SELECT ON Student FROM alice CASCADE;

DCL — GRANT and REVOKE Summary

CommandPurposeExample
GRANT privilege ON object TO userGive privilegeGRANT SELECT ON Student TO alice
GRANT ... WITH GRANT OPTIONGive privilege + allow re-grantingGRANT SELECT ON Student TO alice WITH GRANT OPTION
GRANT role TO userAssign a roleGRANT faculty_role TO bob
REVOKE privilege FROM userRemove privilegeREVOKE INSERT ON Student FROM alice
REVOKE ... CASCADERemove + propagate to dependentsREVOKE SELECT ON Student FROM alice CASCADE

Authorization graph: A directed graph of granted privileges. If a privilege is revoked with CASCADE, all downstream grants (granted by that user) are also revoked.


Exam Quick-Reference — Chapter 3

  • Relational algebra evaluation order: σ first (push down), π next, then ×/⋈
  • SQL logical execution order: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT
  • HAVING vs WHERE: WHERE filters rows before grouping; HAVING filters groups after GROUP BY. Use aggregates only in HAVING.
  • Correlated vs Non-correlated subquery: correlated references outer query, runs per outer row; non-correlated runs once
  • NOT IN vs NOT EXISTS: NOT EXISTS is safer when NULLs may be present in subquery result
  • View vs Materialized View: view = stored query (always fresh); materialized view = stored result (stale until refreshed)
  • Trigger: fires automatically on DML events; BEFORE can modify/cancel; AFTER cannot modify NEW
  • GRANT WITH GRANT OPTION: user can re-grant; REVOKE CASCADE removes downstream grants

Solved PYQs

A few past questions with short answers.

CT30103010Chapter 38 marksasked 3x2080 Chaitra R, 2080 Ashwin B

Consider the insurance database: PERSON(licenseNo, name, address), CAR(modelNo, brand, year), ACCIDENT(reportNo, date, location), OWNS(licenseNo, modelNo), PARTICIPATED(licenseNo, reportNo, damageAmount). Write SQL for: (i) All details of persons whose name ends with 'ta' involved in an accident. (ii) License numbers and location where accident took place on Jan 20, 2020. (iii) Update brand 'BMW' to 'BMW-X' for cars manufactured in 2020. (iv) Create view PERSON_REPORT with licenseNo, Name and reportNo where damage amount <= 100000.

Solution
(i) SELECT p.*, a.* FROM PERSON p JOIN PARTICIPATED part ON p.licenseNo = part.licenseNo
    JOIN ACCIDENT a ON part.reportNo = a.reportNo WHERE p.name LIKE '%ta';

(ii) SELECT part.licenseNo, a.location FROM PARTICIPATED part JOIN ACCIDENT a ON part.reportNo = a.reportNo
    WHERE a.date = '2020-01-20';

(iii) UPDATE CAR SET brand = 'BMW-X' WHERE brand = 'BMW' AND year = 2020;

(iv) CREATE VIEW PERSON_REPORT AS
    SELECT p.licenseNo, p.name, part.reportNo
    FROM PERSON p JOIN PARTICIPATED part ON p.licenseNo = part.licenseNo
    WHERE part.damageAmount <= 100000;
CT30103009Chapter 38 marksasked 2x2080 Chaitra R

Consider Employee(empid, ename, address, title, headid), Project(pid, pname, budget, location), Work(empid, pid, responsibility, duration), Payment(title, salary). Write relational algebra to: (i) Find name and salary of employees working in 'Kathmandu'. (ii) Show employee name with their head name. (iii) Employees who live in the same location as their project. (iv) Employee title, name, project name, salary for projects with duration > 5 years.

Solution
(i) π_{ename, salary}(σ_{location='Kathmandu'}(Employee ⋈ Work ⋈ Project))

(ii) ρ_{Head(hid, hname)}(Employee)
    π_{e.ename, Head.hname}(Employee ⋈_{Employee.headid=Head.hid} Head)

(iii) π_{ename}(Employee ⋈_{Employee.address=Project.location} (Work ⋈ Project))

(iv) π_{title, ename, pname, salary}(σ_{duration>5}(Employee ⋈ Work ⋈ Project ⋈ Payment))
CT30103007Chapter 36 marksasked 2x2073 Magh B, 2071 Magh B

Consider: Employee(empid, name, address, manager_id), Department(deptid, dname), Project(pid, title, budget, deptid), Works_on(empid, pid, hours). Write relational algebra for: (i) Names of all employees from computer department with their manager name. (ii) Names of employees who work on projects with budget > 50000. (iii) Total number of projects from each department with department name.

Solution
(i) ρ_{Mgr(mgr_id, mgr_name)}(Employee) 
    π_{ename, mgr_name}(σ_{dname='computer'}(Employee ⋈ Department ⋈_{mgr_id=headid} Mgr))

(ii) π_{ename}(Employee ⋈ Works_on ⋈ σ_{budget>50000}(Project))

(iii) π_{dname, count}(Department ⋈ ρ_{ProjCount(pid, count)}(π_{pid, COUNT(pid)→count}(Project)))

More PYQs

Additional practice from this chapter.