DBMS NotesPYQs

Chapter 1

Introduction

3 hours3 marks20 past questionsBack to chaptersOpen filtered PYQsNext: Ch 2

Full Syllabus Outline

Every topic from the syllabus data is preserved here.

Chapter Notes

Block-based notes with markdown, diagrams, code, and math.

Chapter 1 — Introduction to DBMS

1.1 Core Definitions

TermDefinition
DataRaw, unprocessed facts with no inherent meaning (e.g., 42, "Alice", "2024-01-01")
InformationProcessed, organized data that carries meaning and context (e.g., "Alice scored 42 marks on Jan 1 2024")
DatabaseA large, organized, interrelated collection of data stored persistently and managed as a unit
DBMSA software system that provides mechanisms to define, construct, manipulate, and share databases among users and applications
Database SystemThe database itself combined with the DBMS software

Exam tip: The distinction between data and information is a frequent 1-mark question. Data is raw; information has context.

Why Do We Need a DBMS?

Without a DBMS, applications maintain their own data files. This leads to well-known problems:

Problem in File SystemsConcrete ExampleDBMS Solution
Data redundancyStudent address stored in Admissions file AND Library fileSingle centralized table
Data inconsistencyStudent phone updated in one file but not anotherOne source of truth; all apps share same data
Difficult data accessNeed to write a new program for every new queryDeclarative SQL — express what, not how
Data isolationEach department stores files in different formatsUniform relational schema
Integrity violationsNo way to enforce "salary must be positive" in a flat fileCHECK, NOT NULL, FK constraints
Atomicity failuresPower cut during bank transfer leaves A debited but B not creditedTransaction ACID guarantees
Concurrent access anomaliesTwo agents book the last seat simultaneouslyLocking and concurrency control
SecurityFile-level permissions are coarseFine-grained role-based access control

1.2 Applications and Evolution of Databases

Key Application Domains

  • Banking & Finance: account management, transactions, loan processing
  • Airlines: reservation systems, flight scheduling
  • Universities: student registration, grades, course catalog
  • Healthcare: patient records, prescriptions, billing
  • E-Commerce: product catalogs, orders, customer profiles

Evolution Timeline

EraTechnologyKey Feature
1960sFlat files, IMS (IBM)Hierarchical model; first structured storage
1970sNetwork model (CODASYL); relational model proposed (Codd 1970)Relational algebra and set-based access
1980sRelational DBMS go commercial (Oracle, DB2, SQL Server)SQL standardized (SQL-86, SQL-89)
1990sObject-oriented extensions; client-server architecturesSQL:1999 adds OO features
2000sData warehousing, OLAP, XML databasesAnalytical workloads separate from OLTP
2010sNoSQL (MongoDB, Cassandra, Redis), NewSQL, cloud databasesHorizontal scalability, CAP theorem trade-offs
2020sAI-augmented optimization, vector databases, cloud-nativeLLM integration, real-time analytics

1.3 Three-Level Data Abstraction (ANSI/SPARC Architecture)

The three-level architecture was proposed by ANSI/SPARC in 1975 to achieve data independence — the ability to change one level without affecting higher levels.

Physical Level (Internal Schema)

The lowest level. Describes HOW data is physically stored on disk.

  • Defines file organization: heap files, sorted files, clustered files
  • Defines index structures: B+ trees, hash indexes
  • Describes disk block size, buffer pool management, compression
  • Managed entirely by the DBA and the storage engine
  • Application developers and end users have no visibility into this level

Logical Level (Conceptual Schema)

The middle level. Describes WHAT data exists and how entities relate, without any physical storage details.

  • Defines all tables (relations), their attributes, and data types
  • Declares all integrity constraints (keys, foreign keys, CHECK)
  • This is the level where application developers and DBAs work
Student(sid: int PK, name: varchar(100), dept: varchar(50), gpa: decimal(3,2))
Enrollment(sid: int FK→Student, cid: int FK→Course, grade: char(2))
Course(cid: int PK, title: varchar(100), credits: int)

View Level (External Schema)

The highest level. Each user or application sees only a relevant subset of the database.

  • Multiple external schemas coexist over a single conceptual schema
  • Provides security: payroll view shows salaries; student view does not
  • Example: payroll staff see salary; students see only their own transcript

Exam tip: When drawing the 3-level diagram, always show: (1) multiple views at the top, (2) a single conceptual schema in the middle, (3) physical storage at the bottom. This shows the many-to-one-to-one relationship.


1.4 Data Independence

Data independence means changes at one level do not require changes at a higher level.

Physical Data Independence

The ability to modify the physical schema without requiring changes to the logical schema or application programs.

  • Add an index → no SQL change needed
  • Reorganize from heap file to sorted file → applications unaffected
  • Switch from HDD to SSD → schema unchanged
  • This is the more achievable and more practical form. Modern RDBMS fully support it.

Logical Data Independence

The ability to modify the logical schema without requiring changes to external schemas (views) or application programs.

  • Add a new column to a table → existing views unaffected
  • Split one table into two → applications may still use a view that hides the split
  • This is harder to achieve because views are defined in terms of the logical schema
Physical IndependenceLogical Independence
What changesStorage layout, indexesTable structure, relationships
Impact on logical schemaNoneView definitions may need updating
DifficultyEasier to achieveHarder to achieve

1.5 Schema vs Instance

ConceptSchemaInstance
What it isBlueprint / structural definition of the databaseActual data stored at a particular point in time
When it changesRarely (ALTER TABLE)Continuously (every INSERT/UPDATE/DELETE)
OOP analogyClass definitionObject instantiated from that class
SQL analogyCREATE TABLE statementRows in the table
ExampleStudent(sid INT PK, name VARCHAR(100), dept VARCHAR(50))Row: (1, 'Asha', 'CS')

Analogy: Think of schema as a "type" and instance as a "value" of that type — just as an int variable has a type (int) and a value (42).


1.6 DDL · DML · DCL

-- ── DDL — Data Definition Language (define/change schema) ──────────────────
CREATE TABLE Student (
  sid   INT          PRIMARY KEY,
  name  VARCHAR(100) NOT NULL,
  dept  VARCHAR(50)  DEFAULT 'Undeclared',
  gpa   DECIMAL(3,2) CHECK (gpa BETWEEN 0 AND 4)
);
ALTER TABLE Student ADD COLUMN email VARCHAR(200) UNIQUE;
DROP TABLE Student;

-- ── DML — Data Manipulation Language (read and write data) ──────────────────
INSERT INTO Student VALUES (1, 'Asha', 'CS', 3.8);

SELECT name, gpa
FROM   Student
WHERE  dept = 'CS'
ORDER  BY gpa DESC;

UPDATE Student SET dept = 'IT' WHERE sid = 1;
DELETE FROM Student WHERE gpa < 1.0;

-- ── DCL — Data Control Language (access control) ────────────────────────────
GRANT  SELECT, INSERT ON Student TO app_user;
REVOKE INSERT         ON Student FROM app_user;

-- ── TCL — Transaction Control Language ──────────────────────────────────────
BEGIN;
  UPDATE Account SET balance = balance - 100 WHERE id = 1;
  UPDATE Account SET balance = balance + 100 WHERE id = 2;
COMMIT;  -- or ROLLBACK to undo

DDL vs DML vs DCL — Quick Comparison

CategoryFull NamePurposeCommon Commands
DDLData Definition LanguageDefine/modify database structure (schema)CREATE, ALTER, DROP, TRUNCATE
DMLData Manipulation LanguageInsert, read, update, delete data (instances)SELECT, INSERT, UPDATE, DELETE
DCLData Control LanguageManage access permissionsGRANT, REVOKE
TCLTransaction Control LanguageManage transactionsBEGIN, COMMIT, ROLLBACK, SAVEPOINT

Exam trap: DDL changes the structure (schema); DML changes the data (instance).


1.7 DBMS Internal Architecture

Exam Quick-Reference — Chapter 1

  • Data vs Information: data = raw facts; information = processed data with meaning.
  • Three levels: Physical (HOW stored) → Logical (WHAT exists) → View (WHAT each user sees). Always draw this diagram when asked.
  • Physical independence: change storage → no logical change. Logical independence: change schema → views still work. Physical is easier.
  • Schema = blueprint (stable, changed with ALTER TABLE); Instance = snapshot (changes with every DML operation).
  • DDL = structure (CREATE/ALTER/DROP); DML = data (SELECT/INSERT/UPDATE/DELETE); DCL = permissions (GRANT/REVOKE).
  • File system disadvantages: redundancy, inconsistency, poor access, isolation, integrity, atomicity, security, concurrency.

Solved PYQs

A few past questions with short answers.

CT30101011Chapter 16 marksasked 2x2080 Chaitra R

Explain the advantages of using Relational Database Management Systems. What are the levels of data abstraction in Database Management system?

Solution

Advantages of RDBMS: Reduced redundancy (normalization), data integrity (constraints), concurrent ACID access, crash recovery (WAL), role-based security, standardized SQL, data independence.

Abstraction Levels: Physical (storage/indexes), Logical (schema/constraints), View (user subsets). Isolates concerns between storage, developers, and end-users.

CT30101016Chapter 16 marksasked 1x2081 Ashwin B

Define database. Explain DDL, DML and DCL with examples.

Solution

Database: Structured, organized data collection. DDL defines structure (CREATE, ALTER, DROP - auto-commit). DML manipulates data (SELECT, INSERT, UPDATE, DELETE - transactional). DCL manages access (GRANT, REVOKE).

-- DDL
CREATE TABLE Student(sid INT PRIMARY KEY, name VARCHAR(100) NOT NULL);
ALTER TABLE Student ADD COLUMN email VARCHAR(100);

-- DML
INSERT INTO Student(sid, name) VALUES (1, 'Asha');
SELECT * FROM Student WHERE sid = 1;
UPDATE Student SET name = 'Asha Sharma' WHERE sid = 1;
DELETE FROM Student WHERE sid = 1;

-- DCL
GRANT SELECT ON Student TO app_user;
REVOKE INSERT ON Student FROM app_user;
CT30101003Chapter 15 marksasked 1x2079 Ashwin B

What do you mean by Schema and Instances? Briefly explain the different level of data abstraction.

Solution

Schema vs Instance:

  • Schema: The logical blueprint/structure (table definitions, constraints, types). Defined via DDL. Rarely changes.
  • Instance: The actual data stored at a given moment. Changes with every INSERT/UPDATE/DELETE.

Data Abstraction Levels:

  • Physical: Storage mechanics (blocks, indexes, hashing).
  • Logical: Schema, relationships, and integrity rules.
  • View: User-specific virtual tables.

Abstraction ensures schema design is independent of runtime data and physical storage choices.

More PYQs

Additional practice from this chapter.