Database storage structures

Database storage structures: Database tables/indexes are typically stored on hard disk in one of many forms, ordered/unordered Flat files, ISAM, Heaps, Hash buckets or B+ Trees. These have various advantages and disadvantages discussed in this topic. The most commonly used are B+trees and ISAM.

Contents

1 Unordered

2 Ordered

3 Structured files

3.1 Heap Files

3.2 Hash buckets

3.3 B+ trees

3.4 ISAM

4 See also

Unordered

Unordered storage typically stores the records in the order they are inserted. While having good insertion efficiency ( $O\left(1\right)$ ), it may seem that it would have inefficient retrieval times ( $O\left(n\right)$ ), but this is usually never the case as most databases use indexes on the primary keys, resulting in $O\left(\log n\right)$ or $O\left(1\right)$ for keys that are the same as database row offsets within the database file storage system, efficient retrieval times.

Ordered

Ordered storage typically stores the records in order and may have to rearrange or increase the file size in the case a record is inserted, this is very inefficient. However is better for retrieval as the records are pre-sorted, leading to a complexity of $O\left(\log n\right)$ .

Structured files

Heap Files

simplest and most basic method

insert efficient, records added at end of file – ‘chronological’ order

retrieval inefficient as searching has to be linear

deletion – deleted records marked

requires periodic reorganization if file is very volatile

advantages

good for bulk loading data

good for relatively small relations as indexing overheads are avoided

good when retrievals involve large proportion of records

disadvantages

not efficient for selective retrieval using key values, especially if large

sorting may be time-consuming

not suitable for ‘volatile’ tables

Heap files are lists of unordered records of variable size. Although sharing a similar name, heap files are widely different from in-memory heaps.

Hash buckets

Main article: Hash table

Hash functions calculate the address of the page in which the record is to be stored based on one or more fields in the record

Hashing functions chosen to ensure that addresses are spread evenly across the address space

‘occupancy’ is generally 40% – 60% of total file size

unique address not guaranteed so collision detection and collision resolution mechanisms are required

open addressing

chained/unchained overflow

pros and cons

efficient for exact matches on key field

not suitable for range retrieval, which requires sequential storage

calculates where the record is stored based on fields in the record

hash functions ensure even spread of data

collisions are possible, so collision detection and restoration is required

B+ trees

Main article: B+ tree

These are the most used in practice.

the time taken to access any tuple is the same because same number of nodes searched

index is a full index so data file does not have to be ordered

Pros and cons

versatile data structure – sequential as well as random access

access is fast

supports exact, range, part key and pattern matches efficiently

‘volatile’ files are handled efficiently because index is dynamic – expands and contracts as table grows and shrinks

less well suited to relatively stable files – in this case, ISAM is more efficient

ISAM

Main article: ISAM

See also

ISAM

B+tree

Flat file database

Hash tables

Heaps

v · d · eDatabase management systems

Database models

Database normalization

Database storage

Distributed DBMS

Federated database system

Referential integrity

Relational algebra

Relational calculus

Relational database

Relational DBMS

Relational model

Object-relational database

Transaction processing

Concepts

Database

ACID

CRUD

Null

Candidate key

Foreign key

Primary key

Superkey

Surrogate key

Armstrong's axioms

NoSQL

Objects

Relation (Table)

View

Transaction

Log

Trigger

Index

Stored procedure

Cursor

Partition

Components

Concurrency control

Data dictionary

JDBC

ODBC

Query language

Query optimizer

Query plan

Database products:

Object-oriented (comparison)

Relational (comparison)

Document-oriented

Categories:
Database management systems
Databases
Database theory

Игры ⚽ Нужна курсовая?

Look at other dictionaries:

Database — A database is an organized collection of data for one or more purposes, usually in digital form. The data are typically organized to model relevant aspects of reality (for example, the availability of rooms in hotels), in a way that supports… … Wikipedia
Database design — is the process of producing a detailed data model of a database. This logical data model contains all the needed logical and physical design choices and physical storage parameters needed to generate a design in a Data Definition Language, which… … Wikipedia
Database security — concerns the use of a broad range of information security controls to protect databases (potentially including the data, the database applications or stored functions, the database systems, the database servers and the associated network links)… … Wikipedia
Database model — A database model is the theoretical foundation of a database and fundamentally determines in which manner data can be stored, organized, and manipulated in a database system. It thereby defines the infrastructure offered by a particular database… … Wikipedia
Database normalization — In the design of a relational database management system (RDBMS), the process of organizing data to minimize redundancy is called normalization. The goal of database normalization is to decompose relations with anomalies in order to produce… … Wikipedia
Database management system — A database management system (DBMS) is a software package with computer programs that control the creation, maintenance, and the use of a database. It allows organizations to conveniently develop databases for various applications by database… … Wikipedia
database — /day teuh bays /, n. 1. a comprehensive collection of related data organized for convenient access, generally in a computer. 2. See data bank. Also, data base, data base. [1965 70; DATA + BASE1] * * * Collection of data or information organized… … Universalium
Database Console Commands (Transact-SQL) — The Database Console Commands (DBCC) are a series of statements in Transact SQL programming language to check the physical and logical consistency of a Microsoft SQL Server database.[1] These commands are also used to fix existing issues.[1] They … Wikipedia
Oracle Database — Developer(s) Oracle Corporation Development status Active Written in … Wikipedia
Correlation database — A Correlation database is a database management system (DBMS) that is data model independent and designed to efficiently handle unplanned, ad hoc queries in an analytical system environment. It was developed in 2005 by database architect Joseph… … Wikipedia

Academic Dictionaries and Encyclopedias

Database storage structures

Contents

Unordered

Ordered

Structured files

Heap Files

Hash buckets

B+ trees

ISAM

See also

Look at other dictionaries:

Share the article and excerpts

Academic Dictionaries and Encyclopedias

Wikipedia

Database storage structures

Contents

Unordered

Ordered

Structured files

Heap Files

Hash buckets

B+ trees

ISAM

See also

Look at other dictionaries:

Share the article and excerpts

Direct link