INDEXES: Clustered, Unclustered , Dense, Sparse indexes

An index is a condensed version of a database table, containing selected fields and always maintained in a sorted form. It includes a pointer to the corresponding record in the original table, allowing retrieval of non-indexed fields. Each index entry consists of a value and a pointer to the first record containing that value.

A database index serves as a data structure that accelerates specific operations on a file. These operations typically involve a search key, which is often a single field representing a set of record files. The entries within an index are referred to as data entries, which can be actual data records. A single data file can have multiple indexes, each with different search keys,

Search engines employ two methods for searching values within a table or file: table scan (sequential) and index scan (random). Indexes function as specialized lookup tables that enhance data retrieval efficiency. They resemble the index found at the back of a book.

An index table or file is comprised of index entries, organized in the following format:

Search Key       …. Pointer

The search key field enables the sorting of rows within the index column, while the pointer field indicates the location from which the actual data in the table can be retrieved. When a table possesses an index, it signifies that the records in that table have been sorted in some manner. Indexes are automatically generated when the primary key and unique constraints are applied to table columns.

INDEX CLASSIFICATION

Indexes can be classified as either clustered or unclustered.

CLUSTERED

A clustered index determines the physical storage order of rows/records in a table based on its sorting order. A table can only have one clustered index since there can only be one way to arrange the records at any given time. Analogously, when arranging tables in a room, they can be organized in a circular, row-based, or tightly packed manner—only one arrangement at a time. A clustered index ensures that related values in a table are stored near each other according to the index order.

  1. CLUSTERED INDEX

A clustered index organizes a file in such a way that the order of data records closely matches the order of data entries. This can only occur if the data records are sorted based on the search key field. For instance, if student records are sorted by age, an index on age that stores data entries in sorted order by age would be a clustered index.

Indexes maintaining data entries in sorted order based on the search key utilize a collection of index entries structured as a tree to facilitate searches for data entries. Consequently, clustered indexes are relatively resource-intensive to maintain when the file is updated. When data entries need to be moved across pages, or when records are identified by a combination of page ID and slot (as is often the case), all references to the relocated record within the database must be updated to point to the new location. These additional updates can be time-consuming.

  1. UNCLUSTERED INDEX

An unclustered index does not dictate the storage order of rows/records within a table. In this case, the search keys in the index column are sorted in one order, while the actual records or rows may be sorted differently or not sorted at all.

An unclustered index is a type of index that is not clustered. A data file can contain multiple unclustered indexes. For example, if student records are sorted by age, and an additional index on the GPA field is included, it would be referred to as an unclustered index.

Dense versus sparse indexes

Dense Index:

A dense index is characterized by having at least one data entry for every search key value present in the indexed file. In other words, each search key in the index column corresponds to a specific record in the table or file. This means that every search key in the index is associated with a particular record in the base table. The dense index ensures a one-to-one mapping between search keys and records.

Sparse Index:

In contrast, a sparse index does not have a corresponding record for every search key value. Instead, it may point to a group of records in the base table. For example, certain search keys may not have corresponding records in the index itself, but they can still be found by searching through other related search keys. A sparse index typically has one entry for each page of records in the data file. Each index record in the sparse index contains the search key and a pointer to the first data record with that search key value. Unlike a dense index, a sparse index must be clustered, and it occupies less space.

Primary and Secondary Index:

A primary index is created on a primary key column(s) of a relation, enforcing a unique constraint on the field and determining the physical storage order of records on the disk. It is also known as a clustered index. Essentially, a primary index is an index on a set of fields that includes the primary key. Records in a primary index are typically clustered.

On the other hand, a secondary index is defined on a non-key field, allowing duplicate values, and does not determine the physical storage order of records on the disk. It is also referred to as a non-clustered index. For instance, in a student database, the student ID serves as the primary key for looking up students. However, a secondary index can be created on the LastName column to facilitate searching for students based on their last names. Unlike a primary index, a secondary index does not include the primary key and can be created on non-key attributes. It contains duplicate data entries.

Composite Search Keys:

Composite search keys, also known as concatenated keys, are used when the search key for an index consists of multiple fields. For example, consider a collection of employee records with fields such as name, age, and salary stored in sorted order by name. When the search key is composite, an equality query involves binding each field in the search key to a constant value. For instance, one could retrieve all data entries with age = 20 and salary = 10. However, in a hashed file organization, which supports only equality queries, a value must be specified for each field in the search key for the hash function to identify the bucket containing the desired records.

Range Queries:

Range queries occur when not all fields in the search key are bound to constants. For example, retrieving all data entries with age = 20 implies that any value is acceptable for the salary field. Another example of a range query is retrieving data entries with age < 30 and salary > 40. Range queries involve selecting records based on a range of values rather than specific equality conditions.

Easy Ways to Insert Pictures, Images, and Page Numbers in Microsoft Word

Step-by-Step Guide: Align Text, Insert Blank Pages, Insert Tables, and Erase Tables

Mastering Microsoft Word: Ways to Format, Copy, Cut, Paste & Apply Effects to Texts

Mastering Microsoft Word: Essential Tips for Efficient Document Creation

Convert CR2 Images to JPGs on Windows with Step-by-Step Guide

Leave a Comment

Your email address will not be published. Required fields are marked *

Get Fully Funded Scholarships

Free Visa, Free Scholarship Abroad

           Click Here to Apply

Acadlly