DBMS NotesPYQs

Chapter 1

Introduction

3 hours3 marks20 past questionsBack to chaptersOpen filtered PYQsNext: Ch 2

Chapter Notes

Block-based notes with markdown, diagrams, code, and math.

Chapter 1 — Introduction

1.1 What is a DBMS?

TermDefinition
DataRaw facts with no context (42, "Alice")
InformationProcessed data with meaning
DatabaseOrganized, interrelated collection of data
DBMSSoftware that stores, retrieves, and manages data

Common applications: banking, airlines, hospitals, e-commerce, student registration.


1.2 File System vs DBMS

Problem in File SystemsDBMS Solution
Data redundancy — same data in many filesCentralized storage, one source of truth
Data inconsistency — copies divergeIntegrity constraints enforce consistency
Difficult data access — need a new program per queryDeclarative SQL
Data isolation — multiple formatsUniform schema
Integrity problems — hard to enforce rulesConstraint declarations (NOT NULL, CHECK, FK)
Concurrent access anomaliesConcurrency control (locking, MVCC)
Security — file-level onlyRole-based access control
No atomicity — partial update corrupts dataTransaction atomicity via logging

1.3 Three-Level Data Abstraction (ANSI/SPARC)

The architecture separates how data is stored from what data exists from what each user sees.

Physical Level

Describes HOW records are stored on disk. Covers file organization, block structure, index structures, buffer management. Managed by DBAs and the storage engine — invisible to developers.

Logical Level (Conceptual)

Describes WHAT data exists and how entities relate. This is where table schemas live. Application developers and DBAs work here.

Student(sid: int PK, name: varchar(100), dept: varchar(50), gpa: decimal(3,2))
Enrollment(sid: int FK→Student, cid: int FK→Course, grade: char(2))
Course(cid: int PK, title: varchar(100), credits: int)

View Level

Each user/application sees only the relevant subset. Multiple views coexist on the same logical schema. Example: payroll staff see salary; students see only their own transcript.


1.4 Data Independence

Physical data independence — change storage layout (add index, reorganize file, switch SSD) without changing the logical schema or application queries. This is the more achievable and more important kind in practice.

Logical data independence — change the logical schema (add a column, split a table) without breaking existing user views. Harder because views are defined in terms of the schema.


1.5 Schema vs Instance

SchemaInstance
WhatBlueprint / structure definitionActual data at a point in time
ChangesRarely (ALTER TABLE)Constantly (INSERT / UPDATE / DELETE)
AnalogyClass definition in OOPObject created from that class
ExampleStudent(sid, name, dept)Row: (1, 'Asha', 'CS')

1.6 DDL · DML · DCL

-- ── DDL — Data Definition Language (define/change schema) ──────────────────
CREATE TABLE Student (
  sid   INT          PRIMARY KEY,
  name  VARCHAR(100) NOT NULL,
  dept  VARCHAR(50),
  gpa   DECIMAL(3,2) CHECK (gpa BETWEEN 0 AND 4)
);

ALTER TABLE Student ADD COLUMN email VARCHAR(200);
DROP TABLE Student;

-- ── DML — Data Manipulation Language (read/write data) ──────────────────────
INSERT INTO Student VALUES (1, 'Asha', 'CS', 3.8);

SELECT name, gpa
FROM   Student
WHERE  dept = 'CS'
ORDER  BY gpa DESC;

UPDATE Student SET dept = 'IT' WHERE sid = 1;
DELETE FROM Student WHERE gpa < 1.0;

-- ── DCL — Data Control Language (access control) ────────────────────────────
GRANT  SELECT, INSERT ON Student TO app_user;
REVOKE INSERT         ON Student FROM app_user;

1.7 DBMS Architecture

Exam Quick-Reference

  • Always draw the 3-level diagram when asked about data abstraction.
  • Physical independence = storage changes don't touch logical schema.
  • Schema = stable blueprint; instance = snapshot at a moment in time.
  • DDL/DML/DCL: memorize 3 SQL keywords per category.
  • File system disadvantages: redundancy, inconsistency, poor security, no atomicity.

Full Syllabus Outline

Every topic from the syllabus data is preserved here.

Solved PYQs

A few past questions with short answers.

Chapter 14 marksasked 1x2082 Kartik B

Briefly explain the three levels of data abstraction in RDBMS. What is physical data independence?

Solution

Data abstraction means hiding internal complexity and presenting data through simplified interfaces at three distinct levels. The goal is to allow users and developers to interact with the database without needing to understand the underlying storage details.

Physical Level (Internal Level): The lowest level, describing how data is physically stored on disk — file organization (heap, sequential, hashed), block size, index structures (B+ trees), compression, and encryption. Managed by the DBA and storage engine; application developers never interact with this level.

Logical Level (Conceptual Level): Describes what data is stored and the relationships among the data — table schemas, attribute types, constraints (primary keys, foreign keys, NOT NULL), and integrity rules. Developers write SQL against this level.

View Level (External Level): The highest level; each user or application sees only a relevant subset or derived view of the database. Different users can have different views of the same logical schema, improving both usability and security.

Physical Data Independence: The ability to modify the physical storage structure (e.g., switching from a heap file to a B+ tree index, moving to SSD, changing block size) without requiring any change to the logical schema or application code. This is the most practically important form of data independence in production systems.

Chapter 14 marksasked 1x2080 Ashwin B

What do you mean by data abstraction? Describe the various levels of data abstraction in database management system.

Solution

Data Abstraction is the process of hiding the complexity of the underlying storage and implementation from users, presenting data through clean, simplified interfaces at multiple levels. It allows different categories of users to interact with the database at the level appropriate to their role.

Three Levels of Data Abstraction:

1. Physical Level: Describes how data is stored on physical storage media. Details include file organization (heap, sequential, clustered), index structures (B+ tree, hash), block and record formats, and physical access paths. Only the storage engine and DBA are concerned with this level.

2. Logical Level: Describes what data is stored in the database and the relationships among the data. This is where schemas are defined using DDL (CREATE TABLE, ADD CONSTRAINT). Developers and DBAs work at this level. Example:

Student(sid INT, name VARCHAR(100), dept VARCHAR(50), gpa DECIMAL(3,2))

3. View Level: The highest level of abstraction. Different users see different views (virtual tables) of the same database. A student portal might show only name, dept, and gpa; an admin panel shows all columns including fee_status. Views provide both simplification and security.

Key Benefit — Physical Data Independence: Applications remain unaffected by changes to physical storage. For example, adding an index to speed up queries or reorganizing files for better I/O does not require rewriting any SQL queries or application logic.

Chapter 15 marksasked 1x2079 Ashwin B

What do you mean by Schema and Instances? Briefly explain the different level of data abstraction.

Solution

Schema vs Instance

Schema is the logical structure (blueprint) of a database, defined using DDL. It describes what tables exist, what columns each table has, and what constraints apply. Schema rarely changes.

Instance is the actual data stored in the database at a particular point in time. Every INSERT, UPDATE, or DELETE changes the instance without changing the schema.

PropertySchemaInstance
Defined byDDL (CREATE TABLE, ALTER TABLE)DML (INSERT, UPDATE, DELETE)
StabilityRarely changesChanges with every write operation
AnalogyBlueprint / type definitionThe actual house / object value
ExampleStudent(sid INT, name VARCHAR)Row: (1, 'Asha', 'CS', 3.8)

Three Levels of Data Abstraction

Physical Level: How data is stored on disk — file layouts, indexes, block structures. Changes here do not propagate upward (physical data independence).

Logical Level: What data exists — table schemas, attribute types, relationships, and integrity constraints. Developers write against this level.

View Level: What specific users see — customized subsets defined via CREATE VIEW. Different users see different views for security and simplicity.

Significance: Schema-instance separation means developers can design and reason about the schema independently of runtime data. Physical-logical separation means storage engineers can optimize hardware without breaking application queries.

More PYQs

Additional practice from this chapter.