Natural key

A natural key (also known as a business key[1]) is a type of unique key in a database formed of attributes that exist and are used in the external world outside the database (i.e. in the business domain or domain of discourse).[2] In the relational model of data, a natural key is a candidate key and is therefore a functional determinant for all attributes in a relation. A natural key is sometimes called a domain key.[3]

A natural key serves two complementary purposes: it provides a means of identification for data and it imposes a rule, specifically a uniqueness constraint, to ensure that data remains unique within an information system. The uniqueness constraint assures uniqueness of data within a certain technical context (e.g. a set of values in a table, file or relation variable) by rejecting input of any data that would otherwise violate the constraint. This means that the user can rely on a guaranteed correspondence between facts identified by key values recorded in a system and the external domain of discourse (a single version of the truth).
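As a minimal sketch of the uniqueness constraint described above, the following uses Python's built-in sqlite3 module. The `country` table, its `country_code` column and the `try_insert` helper are hypothetical names chosen for illustration; the point is only that the database rejects a second row carrying an existing key value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: country_code is a code used outside the database,
# so declaring it PRIMARY KEY makes it a natural key.
conn.execute(
    "CREATE TABLE country (country_code TEXT PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO country VALUES ('FR', 'France')")


def try_insert(code, name):
    """Return True if the row was accepted, False if the key rejected it."""
    try:
        conn.execute("INSERT INTO country VALUES (?, ?)", (code, name))
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite when the uniqueness constraint would be violated.
        return False


accepted_new = try_insert("DE", "Germany")          # new key value
accepted_dup = try_insert("FR", "French Republic")  # key value already present
row_count = conn.execute("SELECT COUNT(*) FROM country").fetchone()[0]
```

In this sketch the second 'FR' row is refused, so each stored key value continues to identify exactly one fact.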

Examples of natural keys could include:

The presence of a key guarantees uniqueness within an information system, but the key values need not be unique or immutable within some wider population of objects or concepts outside that system. For example, a key on a CITY attribute means that the set of city names assigned to that attribute must be unique at any point in time, so the system can hold only one city called "Washington" at a time. That does not imply that every possible city which might one day be referred to within the system must have a unique name. In logical terms, the proposition being represented by the value "Washington" is that there is a city called Washington within the domain of discourse at a point in time, not that there is only one city of that name in every conceivable domain or for all time.
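The point-in-time nature of the CITY key can be illustrated with a small sketch, again assuming Python's sqlite3 module. The `city` table and the `can_insert` helper are hypothetical; the example only shows that a key value becomes available again once the row holding it is gone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table with a key on the CITY attribute.
conn.execute("CREATE TABLE city (city TEXT PRIMARY KEY)")
conn.execute("INSERT INTO city VALUES ('Washington')")


def can_insert(name):
    """Try to record a city name; report whether the key allowed it."""
    try:
        conn.execute("INSERT INTO city VALUES (?)", (name,))
        return True
    except sqlite3.IntegrityError:
        return False


# Rejected while a row with the same name exists.
second_washington = can_insert("Washington")

# Once that row is removed, the same name can be recorded again:
# the constraint applies to the current set of values, not to all time.
conn.execute("DELETE FROM city WHERE city = 'Washington'")
washington_later = can_insert("Washington")
```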

Similarly, the potential occurrence of erroneous or unwanted duplicate information does not necessarily rule out the use of an attribute as a natural key. For example, in the US there may be instances of duplicate Social Security numbers (SSNs) mistakenly issued to individuals, or instances of a person fraudulently or mistakenly using another person's SSN. In these situations the use of SSN as a natural key serves as a data integrity check: it detects potential duplication or fraud by rejecting any duplicate values, with the implication that any error should be identified and resolved before entry into the system.
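The integrity-check role described above might look like the following sketch, which assumes Python's sqlite3 module. The `person` table, the placeholder SSN strings and the names are all made up for illustration; records rejected by the key are set aside for review rather than silently stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table using SSN as the natural key.
conn.execute(
    "CREATE TABLE person (ssn TEXT PRIMARY KEY, full_name TEXT NOT NULL)"
)

# Made-up input records; the third reuses the first record's SSN.
incoming = [
    ("000-00-0001", "Alice Example"),
    ("000-00-0002", "Bob Example"),
    ("000-00-0001", "Carol Example"),
]

needs_review = []  # records the key rejected, to be resolved before entry
for ssn, full_name in incoming:
    try:
        conn.execute("INSERT INTO person VALUES (?, ?)", (ssn, full_name))
    except sqlite3.IntegrityError:
        needs_review.append((ssn, full_name))

stored = [
    row[0] for row in conn.execute("SELECT full_name FROM person ORDER BY ssn")
]
```

Here the conflicting record never reaches the table; whether it reflects an error or fraud is left to a separate resolution step outside the database.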

A natural key differs from a surrogate key, which has no meaning outside the database itself and is not based on real-world observation or intended as a statement about the reality being modelled. A natural key therefore provides a certain data quality guarantee, whereas a surrogate key does not. It is common for elements of data to have several keys, any number of which may be natural or surrogate.
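The difference in data quality guarantee can be sketched by comparing two hypothetical tables, assuming Python's sqlite3 module: one keyed only by a system-generated surrogate `id`, and one that also declares a natural key on `sku`. The table names, column names and the `count_accepted` helper are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Surrogate key only: id is generated by the database and has no outside meaning.
conn.execute(
    "CREATE TABLE product_surrogate_only (id INTEGER PRIMARY KEY, sku TEXT NOT NULL)"
)
# Surrogate key plus a natural key on sku (declared with UNIQUE).
conn.execute(
    "CREATE TABLE product_both (id INTEGER PRIMARY KEY, sku TEXT NOT NULL UNIQUE)"
)


def count_accepted(table, skus):
    """Insert each sku into the given table and count the rows accepted."""
    accepted = 0
    for sku in skus:
        try:
            # The table name comes from the two fixed names above, not user input.
            conn.execute(f"INSERT INTO {table} (sku) VALUES (?)", (sku,))
            accepted += 1
        except sqlite3.IntegrityError:
            pass
    return accepted


skus = ["SKU-1", "SKU-1"]  # the same real-world product submitted twice
surrogate_only_rows = count_accepted("product_surrogate_only", skus)
both_rows = count_accepted("product_both", skus)
```

With only the surrogate key, both submissions are stored under different `id` values; with the natural key in place, the duplicate is rejected.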

References