Explain the different levels of RAID briefly. What is indexing in database? Differentiate between dense index and sparse index.
Solution
RAID and Indexing:
RAID (Redundant Array of Independent Disks):
| Level | Technique | Fault Tolerance | Performance | Storage Overhead |
|---|---|---|---|---|
| RAID 0 | Striping only | None | Excellent R/W | None |
| RAID 1 | Mirroring | 1 disk failure | Excellent R; good W | 2× |
| RAID 5 | Striping + distributed parity | 1 disk failure | Good R; moderate W | 1 disk |
| RAID 6 | Striping + 2 parity disks | 2 disk failures | Good R; slower W | 2 disks |
| RAID 10 | RAID 1+0 (mirror then stripe) | 1 per mirror pair | Excellent R/W | 2× |
RAID 10 is preferred for databases because it combines the write performance of RAID 0 striping with the fault tolerance of RAID 1 mirroring.
Index Types:
Dense Index: One index entry per data record (tuple). Can find any record by looking up its key in the index directly. More storage but faster search.
Sparse Index: One index entry per data block (only works on sorted/clustered data). Fewer entries but requires scanning within the block after finding it via the index. Less storage.
Primary Index: Built on the attribute(s) that determine the physical ordering of records in the file (the search key matches the file's sort order).
Secondary Index: Built on an attribute that is not the file's sort order. Must be dense (since records are not sorted by this attribute).