Candidate key

Candidate key

In the relational model, a candidate key of a relvar (relation variable) is a set of attributes of that relvar such that
# at all times it holds in the relation assigned to that variable that there are no two distinct tuples with the same values for these attributes and
# there is not a proper subset of this set of attributes for which (1) holds.

Since a superkey is defined as a set of attributes for which (1) holds, we can also define a candidate key as a minimal superkey, i.e. a superkey of which no proper subset is also a superkey.

The importance of candidate keys is that they tell us how we can identify individual tuples in a relation. As such they are one of the most important types of database constraint that should be specified when designing a database schema. Since a relation is a set (no duplicate elements), it holds that every relation will have at least one candidate key (because the entire heading is always a superkey). Since in some RDBMSs tables may also represent multisets (which strictly means these DBMSs are not relational), it is an important design rule to specify explicitly at least one candidate key for each relation. For practical reasons RDBMSs usually require that for each relation one of its candidate keys is declared as the primary key, which means that it is considered as the preferred way to identify individual tuples. Foreign keys, for example, are usually required to reference such a primary key and not any of the other candidate keys.

Example

The definition of candidate keys can be illustrated with the following (abstract) example. Consider a relation variable (relvar) "R" with attributes ("A", "B", "C", "D") that has only the following two legal values "r1" and "r2":

|

Here "r2" differs from "r1" only in the A and D values of the last tuple.

For "r1" the following sets have the uniqueness property, i.e., there are no two tuples in the instance with the same values for the attributes in the set:: {A,B}, {A,C}, {B,C}, {A,B,C}, {A,B,D}, {A,C,D}, {B,C,D}, {A,B,C,D}For "r2" the uniqueness property holds for the following sets;: {B,D}, {C,D}, {B,C}, {A,B,C}, {A,B,D}, {A,C,D}, {B,C,D}, {A,B,C,D}Since superkeys of a relvar are those sets of attributes that have the uniqueness property for "all" legal values of that relvar and because we assume that "r1" and "r2" are all the legal values that "R" can take, we can determine the set of superkeys of "R" by taking the intersection of the two lists:: {B,C}, {A,B,C}, {A,B,D}, {A,C,D}, {B,C,D}, {A,B,C,D}Finally we need to select those sets for which there is no proper subset in the list, which are in this case:: {B,C}, {A,B,D}, {A,C,D}These are indeed the candidate keys of relvar "R".

We have to consider "all" the relations that might be assigned to a relvar to determine whether a certain set of attributes is a candidate key. For example, if we had considered only "r1" then we would have concluded that {A,B} is a candidate key, which is incorrect. However, we "might" be able to conclude from such a relation that a certain set is "not" a candidate key, because that set does not have the uniqueness property (example {A,D} for "r1"). Note that the existence of a proper subset of a set that has the uniqueness property "cannot" in general be used as evidence that the superset is not a candidate key. In particular, note that in the case of an empty relation, every subset of the heading has the uniqueness property, including the empty set.

Determining Candidate Keys

The previous example only illustrates the definition of candidate key and not how these are in practice determined. Since most relations have a large number or even infinitely many instances it would be impossible to determine all the sets of attributes with the uniqueness property for each instance. Instead it is easier to consider the sets of real-world entities that are represented by the relation and determine which attributes of the entities uniquely identify them. For example a relation "Employee"("Name", "Address", "Dept") probably represents employees and these are likely to be uniquely identified by a combination of "Name" and "Address" which is therefore a superkey, and unless the same holds for only "Name" or only "Address", then this combination is also a candidate key.

In order to determine correctly the candidate keys it is important to determine "all" superkeys, which is especially difficult if the relation represents a set of relationships rather than a set of entities. Therefore it is often useful to attempt to find any "forgotten" superkeys by also determining the functional dependencies. Consider for example the relation "Marriage"("Husband", "Wife", "Date") for which it will trivially hold that {"Husband", "Wife", "Date"} is a superkey. If we assume that a certain person can only marry once on a given date then this implies the functional dependencies {"Husband","Date"}→"Wife" and {"Wife","Date"}→"Husband". From this then we can derive more superkeys by applying the following rule:: if "S" is a superkey and "X"→"Y" a functional dependency: then ("S"-"Y")+"X" is also a superkeywhere '-' is the set difference and '+' the set union. In this case this leads to the derivation of the superkeys {"Husband", "Date"} and {"Wife", "Date"}.

References

* cite book
last = Date
first = Christopher
authorlink = Christopher J. Date
title = An Introduction to Database Systems
publisher = Addison-Wesley
isbn = 978-0321189561
year = 2003
pages = pp. 268-276
chapter = 5: Integrity

See also

* alternate key
* compound key
* primary key
* relational database
* superkey


Wikimedia Foundation. 2010.

Игры ⚽ Поможем сделать НИР

Look at other dictionaries:

  • Candidate Key —   [dt. »Sekundärschlüssel«], Schlüssel …   Universal-Lexikon

  • candidate key — potencialusis raktas statusas T sritis informatika apibrėžtis Unikalus eilutės identifikatorius sąryšyje (duomenų bazėje). Potencialusis raktas gali būti paprastas (iš vieno atributo) arba sudėtinis (iš kelių atributų). Kiekvienas sąryšis privalo …   Enciklopedinis kompiuterijos žodynas

  • Key, David M(cKendree) — born Jan. 27, 1824, Greene county, Tenn., U.S. died Feb. 3, 1900, Chattanooga, Tenn. U.S. politician. Admitted to the bar in 1850, he practiced law in Chattanooga and became active in Democratic Party politics. He opposed secession by the South… …   Universalium

  • Unique key — In relational database design, a unique key can uniquely identify each row in a table, and is closely related to the Superkey concept. A unique key comprises a single column or a set of columns. No two distinct rows in a table can have the same… …   Wikipedia

  • Foreign key — In the context of relational databases, a foreign key is a referential constraint between two tables. [ [http://www.visualcase.com/kbase/database basics foreign keys.htm For a simpler visualization, see VisualCase.com] ] The foreign key… …   Wikipedia

  • Alternate key — You may have meant the Alt key on the keyboard. An alternate key (or secondary key) is any candidate key which is not selected to be the primary key (PK).For example, a relational database with a table employee could have attributes like employee …   Wikipedia

  • Foreign Key — Ein Schlüssel dient in einer Relationalen Datenbank dazu, die Tupel einer Relation eindeutig zu identifizieren. Anschaulich kann man sich eine Relation als Tabelle vorstellen. Ein Schlüssel für eine solche Tabelle ist dann eine Gruppe von Spalten …   Deutsch Wikipedia

  • Primary Key — Ein Schlüssel dient in einer Relationalen Datenbank dazu, die Tupel einer Relation eindeutig zu identifizieren. Anschaulich kann man sich eine Relation als Tabelle vorstellen. Ein Schlüssel für eine solche Tabelle ist dann eine Gruppe von Spalten …   Deutsch Wikipedia

  • Primary key — Ein Schlüssel dient in einer Relationalen Datenbank dazu, die Tupel einer Relation eindeutig zu identifizieren. Anschaulich kann man sich eine Relation als Tabelle vorstellen. Ein Schlüssel für eine solche Tabelle ist dann eine Gruppe von Spalten …   Deutsch Wikipedia

  • Surrogate key — A surrogate key in a database is a unique identifier for either an entity in the modeled world or an object in the database. The surrogate key is not derived from application data. Definition Two definitions of a surrogate appear in the… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”