Query optimization

Query optimization

Query optimization is a function of many relational database management systems in which multiple query plans for satisfying a query are examined and a good query plan is identified. This may or not be the absolute best strategy because there are many ways of doing plans. There is a trade off between the amount of time spent figuring out the best plan and the amount running the plan. Different qualities of database management systems have different ways of balancing these two. Cost based query optimizers evaluate the resource footprint of various query plans and use this as the basis for plan selection. Typically the resources which are costed are CPU path length, amount of disk buffer space, disk storage service time, and interconnect usage between units of parallelism. The set of query plans examined is formed by examining possible access paths (e.g., primary index access, secondary index access, full file scan) and various relational table join techniques (e.g, merge join, hash join, product join). The search space can become quite large depending on the complexity of the SQL query. There are two types of optimization. These consist of logical optimization which generates a sequence of relational algebra to solve the query. In addition there is physical optimization which is used to determine the means of carrying out each operation.

Translating SQL queries into relational algebra

Different opt sequences such as the different Greek alphabet characters which are assigned to them. This is explained in the relational algebra and query optimizer.

The goal of query optimization

The goal is to eliminate as many unneeded tuples, or rows as possible. The following is a look at relational algebra as it eliminates unneeded tuples.The project operator is straightforward to implement if contains a key to relation R. If it does not include a key of R, it must be eliminated. This must be done by sorting (see sort methods below) and eliminating duplicates. This method can also use hashing to eliminate duplicates Hash table.

SQL command distinct: this does not change the actual data. This just eliminates the duplicates from the results.

Set operations: Database management heavily relies on the mathematical principles of set theory which is key in comprehending these operations.

Union: This displays all that appear in both sets, each listed once. These must be union compatible. This means all sequences of selected columns must designate the same number of columns. The data types of the corresponding columns must thereby comply with the conditions valid for comparability. Each data type can be compared to itself. Columns of data type CHAR with the different ASCII and EBCDIC code attributes can be compared to each other, whereby they are implicitly adapted. Columns with the ASCII code attribute can be compared to date, time, and timestamp specifications. All numbers can be compared to each other. In ANSI SQL mode, it is not enough for the data types and lengths of the specified columns to be compatible: they must match. Moreover, only column specifications or * may be specified in the selected columns of the QUERY specifications linked with UNION. Literals may not be specified (wisc). Intersection: This only lists items whose keys appear in both lists. This too must be union compatible. See above for definition.

Set difference: This lists all items whose keys appear in the first list but not the second. This too must be union compatible.

Cartesian Product: Cartesian products (R * S) takes a lot of memory because its result contains a record for each combination of records from R and S.

A common searching algorithm

The following is a quick overview of an algorithm used to search through the different tuples:External sorting methods are those that are suitable for large files of records stored on disk that do not fit in memory, and most database files typically fit this description. The most common example is the sort merge. The following is an explanation for this type of sort.

MergeSort is a recursive sorting procedure that uses O(n log n) comparisons in the worst case. To sort an array of n elements, we perform the following three steps in sequence:

* If n<2 then the array is already sorted. Stop now. * Otherwise, n>1, and we perform the following three steps in sequence: 1. Sort the left half of the array. 2. Sort the right half of the array. 3. Merge the now-sorted left and right halves.


Wikimedia Foundation. 2010.

Игры ⚽ Поможем написать курсовую

Look at other dictionaries:

  • Query optimizer — The query optimizer is the component of a database management system that attempts to determine the most efficient way to execute a query. The optimizer considers the possible query plans for a given input query, and attempts to determine which… …   Wikipedia

  • Query plan — A query plan (or query execution plan) is a set of steps used to access or modify information in a SQL relational database management system. This is a specific case of the relational model concept of access plans.Since SQL is declarative, there… …   Wikipedia

  • Conjunctive query — In database theory, a conjunctive query is a restricted form of first order queries. A large part of queries issued on relational databases can be written as conjunctive queries, and large parts of other first order queries can be written as… …   Wikipedia

  • Web search query — A web search query is a query that a user enters into web search engine to satisfy his or her information needs. Web search queries are distinctive in that they are unstructured and often ambiguous; they vary greatly from standard query languages …   Wikipedia

  • Search engine optimization — SEO redirects here. For other uses, see SEO (disambiguation). Internet marketing …   Wikipedia

  • Nexus (Process integration and optimization) — Nexus Screenshot of Nexus 1.1.05 Developer(s) iChrome …   Wikipedia

  • Database — A database is an organized collection of data for one or more purposes, usually in digital form. The data are typically organized to model relevant aspects of reality (for example, the availability of rooms in hotels), in a way that supports… …   Wikipedia

  • Relational algebra — Not to be confused with Relation algebra. Relational algebra, an offshoot of first order logic (and of algebra of sets), deals with a set of finitary relations (see also relation (database)) that is closed under certain operators. These operators …   Wikipedia

  • Feature Oriented Programming — (FOP) or Feature Oriented Software Development (FOSD) is a general paradigm for program synthesis in software product lines. FOSD arose out of layer based designs of network protocols and extensible database systems in the late 1980s cite web |… …   Wikipedia

  • Microsoft SQL Server — Developer(s) Microsoft Stable release SQL Server 2008 R2 (10.50.2500.0 Service Pack 1) / July 11, 2011; 4 months ago …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”