A file is a named collection of data or information; its name is called the filename. Almost all information stored in a computer must be in a file. There are many different types of files: data files, text files, program files, directory files, and so on. Different types of files store different types of information. For example, program files store programs, whereas text files store text.
A computer file is a resource for storing information, which is available to a computer program and is usually based on some kind of durable storage. A file is “durable” in the sense that it remains available for other programs to use after the program that created it has finished executing. Computer files can be considered as the modern counterpart of paper documents which traditionally are kept in office and library files, and this is the source of the term.
“File organization” refers to the logical relationships among the various records that constitute the file, particularly with respect to the means of identifying and accessing any specific record. “File structure” refers to the format of the label and data blocks and of any logical record control information.
Selecting file organizations means determining an efficient file organization for each base relation. For example, if we want to retrieve student records in alphabetical order of name, sorting the file by student name is a good file organization. However, if we want to retrieve all students whose marks are in a certain range, a file ordered by student name would not be a good file organization. Some file organizations are efficient for bulk loading data into the database but inefficient for retrieval and other activities.
The objective of this selection is to choose an optimal file organization for each relation.
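The student example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the names `students`, `find_by_name`, and `marks_in_range`, and the use of an in-memory list in place of a disk file are all hypothetical. It shows that a file ordered by name supports a fast lookup on name, while a query on marks gets no help from that ordering and has to examine every record.

```python
from bisect import bisect_left

# Hypothetical student records (name, marks), kept sorted by name to
# mimic a file ordered on the name field.
students = sorted([
    ("Asha", 72), ("Bilal", 55), ("Chen", 91), ("Dina", 64), ("Emeka", 83),
])
names = [name for name, _ in students]


def find_by_name(name):
    """Binary search on the ordering field (name)."""
    i = bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return students[i]
    return None


def marks_in_range(low, high):
    """The name ordering does not help here: every record is examined."""
    return [rec for rec in students if low <= rec[1] <= high]


print(find_by_name("Chen"))
print(marks_in_range(60, 85))
```

A file ordered on marks would reverse the trade-off, which is why the choice depends on the queries each relation has to serve.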
Types of File Organization
To support an effective selection of file organizations and indexes, the different types of file organization are described in detail here. These are:
- Heap File Organization
- Hash File Organization
- Indexed Sequential Access Methods (ISAM) File Organization
Heap (unordered) File Organization
An unordered file, sometimes called a heap file, is the simplest type of file organization.
Records are placed in the file in the same order as they are inserted. A new record is inserted in the last page of the file; if there is insufficient space in the last page, a new page is added to the file. This makes insertion very efficient. However, as a heap file has no particular ordering with respect to field values, a linear search must be performed to access a record. A linear search involves reading pages from the file until the required record is found. This makes retrievals from heap files that have more than a few pages relatively slow, unless the retrieval involves a large proportion of the records in the file.
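The insertion and search behaviour described above can be illustrated with a minimal in-memory sketch. The class name `HeapFile`, the `page_capacity` parameter, and the sample records are assumptions made for this example; a real heap file would read and write disk pages rather than Python lists.

```python
class HeapFile:
    """Toy heap file: a list of pages, each holding a few records."""

    def __init__(self, page_capacity=2):
        self.page_capacity = page_capacity  # records per page (assumed value)
        self.pages = [[]]

    def insert(self, record):
        # Always append to the last page; add a new page when it is full.
        if len(self.pages[-1]) >= self.page_capacity:
            self.pages.append([])
        self.pages[-1].append(record)

    def search(self, field, value):
        """Linear search. Returns (record, number of pages read)."""
        for pages_read, page in enumerate(self.pages, start=1):
            for record in page:
                if record[field] == value:
                    return record, pages_read
        return None, len(self.pages)


heap = HeapFile(page_capacity=2)
for sid, name in [(3, "Chen"), (1, "Asha"), (5, "Emeka"), (2, "Bilal"), (4, "Dina")]:
    heap.insert({"id": sid, "name": name})

print(len(heap.pages))       # pages allocated so far
print(heap.search("id", 4))  # found only after reading every page
print(heap.search("id", 9))  # a miss also reads every page
```

Insertion touches only the last page, whereas a search for a record near the end of the file, or for a record that does not exist, reads all pages.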
Pros of Heap storage
Heap is a good storage structure in the following situations:
- When data is being bulk-loaded into the relation.
- When the relation is only a few pages long. In this case, the time to locate any tuple is short, even if the entire relation has to be searched serially.
- When every tuple in the relation has to be retrieved (in any order) every time the relation is accessed. For example, retrieving the names of all the students.
Cons of Heap storage
Heap files are inappropriate when only selected tuples of a relation are to be accessed.
Hash File Organization
In a hash file, records are not stored sequentially in the file; instead, a hash function is used to calculate the address of the page in which the record is to be stored.
The field on which the hash function is calculated is called the hash field; if that field is also the key of the relation, it is called the hash key. Because records appear to be distributed randomly across the file, hash files are also called random or direct files. Commonly, an arithmetic function is applied to the hash field so that records are distributed evenly throughout the file.
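As a rough sketch of the idea, the example below uses division-remainder hashing (the hash field modulo the number of pages) as the arithmetic function. That choice, the class name `HashFile`, the `num_pages` parameter, and the sample records are assumptions for illustration only; the text does not prescribe a particular function. Records whose keys map to the same address are simply kept in the same page here, with no overflow handling.

```python
class HashFile:
    """Toy hash file: the hash field decides which page holds a record."""

    def __init__(self, num_pages=4, hash_field="id"):
        self.num_pages = num_pages    # assumed fixed number of pages
        self.hash_field = hash_field  # field the hash function is applied to
        self.pages = [[] for _ in range(num_pages)]

    def page_address(self, key):
        # Division-remainder hashing (an assumed choice of function).
        return key % self.num_pages

    def insert(self, record):
        address = self.page_address(record[self.hash_field])
        self.pages[address].append(record)

    def search(self, key):
        # Only the one page computed from the key is examined.
        for record in self.pages[self.page_address(key)]:
            if record[self.hash_field] == key:
                return record
        return None


hash_file = HashFile(num_pages=4)
for sid, name in [(10, "Asha"), (7, "Bilal"), (13, "Chen"), (4, "Dina")]:
    hash_file.insert({"id": sid, "name": name})

print(hash_file.page_address(13))
print(hash_file.search(13))
print(hash_file.search(99))
```

Unlike the linear search of a heap file, a lookup on the hash field goes straight to a single page, at the cost of offering no help for queries on other fields or on ranges of values.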