Talk:Unique key

	This article is within the scope of WikiProject Databases, a project which is currently considered to be inactive.DatabasesWikipedia:WikiProject DatabasesTemplate:WikiProject DatabasesDatabases articles
Start	This article has been rated as Start-class on Wikipedia's content assessment scale.

Remark edit

Latest comment: 16 years ago3 comments3 people in discussion

Unique with respect to DBMS is a value which is unique. Unique key is like a primary key but unique key may start from "0".

Such semantics (=0) are by no means implied by unique keys. --Stolze 14:48, 14 February 2007 (UTC)Reply

As "The values in a unique key columns may ... be NULL"; it is not possible that: "A unique key must uniquely identify all possible rows that exist in a table" because rows that are identified by NULL cannot be uniquely distinguished. 212.136.56.20 12:16, 27 July 2007 (UTC)Reply

Which is why a column with a unique key will only allow one row where the value of the column is null. Jerazol 14:47, 27 July 2007 (UTC)Reply

Please delete this article -- it's redundant edit

Latest comment: 17 years ago3 comments3 people in discussion

Editors,

This article, "Unique Key", is redundant with Candidate Key, Primary Key and Superkey. It has no reason to exist. Please delete it. Jberkus 20:04, 23 September 2006 (UTC)Reply

Take it to WP:AFD if you feel that strongly about it. --Spartaz 21:47, 23 September 2006 (UTC)Reply

the primary key page doesn't really explain unique key, if this is removed it would be helpful to move the info there.

I copied the contect from the PK article here because it logically belongs here (PK is just a special unique key). Unless there are some objects, I will make primary key a redirect to this article here. --Stolze 14:48, 14 February 2007 (UTC)Reply

Difference: unique key vs. primary key edit

Latest comment: 13 years ago5 comments4 people in discussion

I got redirected from "Primary key". This doesn't make sense to me. Because that's not the same. A primary key necessarily is an unique key but a unique key must not be a primary key. Is this correct or am i wrong?

--Florian Sening 09:43, 30 July 2007 (UTC)Reply

A primary key is essentially a unique key, with a NOT NULL constraint added. It seems redundant to maintain 2 separate articles for 2 concepts as similar as these. Jerazol 10:23, 30 July 2007 (UTC)Reply

Okay. Then we should kill this redundancy as stated above. Okay?

--Florian Sening 14:41, 31 July 2007 (UTC)Reply

A primary key is NOT the same as a unique key. The fact that unique key is the entry and primary key redirects to it is very wrong, because primary keys are one of the most fundamental concepts of relational databases.

I was going to link to wikipedia from a forum post, but now I can't because it will redirect them to Unique Key.

99.138.111.110 (talk) 20:15, 11 April 2010 (UTC)Phil SandlerReply

The statements: "The relational model, as expressed through relational calculus and relational algebra, does not distinguish between primary keys and other kinds of keys. Primary keys were added to the SQL standard mainly as a convenience to the application programmer." are incorrect. primary keys are the intrinsic requirement of Codds Rule #2 and are a foundational feature making the relational calculus possible. Xuancris (talk) 04:49, 25 January 2011 (UTC)Reply

Surrogate Keys edit

Latest comment: 14 years ago4 comments3 people in discussion

[...]surrogate primary keys as an inalienable part of the relational data model.

This is sooo wrong. Surrogate keys are by no means a part of the relational data model. They are just there because relational databases are abused to store data from an object-oriented world where everybody is thinking in objects. If you use relational databases like they are meant to be used (for predicate logic) then most of the time there is no need for surrogate keys. Don't get me wrong - i don't favor natural keys over surrogate keys or vice versa, but this passage is just wrong. You get the feeling that surrogate keys are the way to "do it" but it doesn't even explain why they exist. I mean - it's not that this is the only reason to use surrogate keys - but this sentence just gives the wrong clue.

What do others think about that?

--Florian Sening 09:57, 30 July 2007 (UTC)Reply

Did you read the complete passage? Jerazol 10:21, 30 July 2007 (UTC)Reply

You mean the part about "Object-Oriented Programming"? Please be more precise. I sure have read it but that doesn't change the fact that this statement is just plain wrong.

--Florian Sening 14:42, 31 July 2007 (UTC)Reply

A surrogate key can (and often is) used as the primary key. In these cases it is absolutely part of the relational model. —Preceding unsigned comment added by 99.138.111.110 (talk) 20:12, 11 April 2010 (UTC)Reply

Primary keys and clustered indexes? edit

Latest comment: 16 years ago2 comments2 people in discussion

Would a primary key be the key that happens to have the clustered index over it? --Damian Yerrick (talk | stalk) 14:08, 2 August 2007 (UTC)Reply

Indexes have nothing to do (logically) with constraints. A DBMS may choose to implement a unique key constraint using an index that checks for uniqueness. --Stolze 19:22, 7 August 2007 (UTC)Reply

missing difference between Unique and Primary keys. edit

Latest comment: 16 years ago1 comment1 person in discussion

Primary keys can be the target of foreign key constraints, and as such are part of referential integrity. --David Garfield (talk) 21:46, 29 January 2008 (UTC)Reply

OK, there is a claim that unique keys can be part of a foreign key constraing. I've never seen that in a database. I would agree that it is a logical generalization, and have wondered at times why it isn't supported. --22:21, 29 January 2008 (UTC) —Preceding unsigned comment added by Divad27182 (talk • contribs)

SSN edit

Latest comment: 12 years ago2 comments2 people in discussion

I am surprised that the article gives the (US) SSN as an example of a unique key, since it violates the fundamental property in two different ways: "a unique key must uniquely identify all possible rows".

For most databases, the SSN cannot identify all possible rows, since the vast majority of individuals do not have an SSN. (The exceptions are databases that are intended to only ever include individuals who are citizens or residents of the USA.)

Secondly, the SSN is does not uniquely identify even those who have them: millions of Americans have been assigned more than one SSN (usually they are given a replacement because of some problem). And there are some SSNs that are in use (mistakenly) by multiple individuals. The result is that there is no unique mapping between SSNs and American citizens/residents.

There is another problem: the widespread use of SSNs as both a personal identifier and for authentication (in the mistaken belief that no-one will know someone else's SSN) makes the number very attractive to identity thieves. As as result, many Americans will refuse to give their SSN - or if pushed will supply a fake number. Also, privacy laws in some US states may make it illegal to use SSNs or even to ask for them.

All in all, this makes the SSN a very poor choice for a unique key. Its inclusion in this article might make people think that SSNs are a good choice for a unique key, and thus lead to the development of yet more databases that use SSNs for this purpose. This is bad for two reasons: such databases exclude the ~96% of the world's population who do not have an American SSN, and are a privacy and identity theft risk for Americans.

TomH (talk) 23:31, 28 May 2008 (UTC)Reply

It says, correctly, the SSN can identify all possible rows, not all possible people. It might be a table of people with SSNs. The point is that it's unique. As for people lying about their SSN, well, if the data is just plain wrong then that's not a database issue. You might as well argue that numbers aren't unique in general because someone might lie and say there are two different numbers called 5. 86.183.163.162 (talk) 18:32, 14 June 2011 (UTC)Reply

Primary key vs. Unique key vs. Superkey edit

Latest comment: 15 years ago1 comment1 person in discussion

These are related, possibly even overlapping concepts, but their relationship is not discussed either here nor in Superkey. Any volunteers? DES (talk) 16:51, 29 September 2008 (UTC)Reply

Merge proposed with superkey edit

Latest comment: 12 years ago6 comments5 people in discussion

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.

I don't see a consensus to merge Superkey, so I will remove merge tags if there are no objections. --Boson (talk) 14:31, 2 September 2011 (UTC)Reply

Reading the unique key article, I cannot see a significant conceptual difference between it and candidate key and/or superkey. The definition in the lead seems to say either one, or the other, or (perhaps) a special instance of either.
Perhaps there is a confusion between the abstract relational database model and practical implementations of it?
I have propsed a merge of "unique key" and "superkey". If that is inappropriate, it seems that further clarification is needed. All the best, --Jorge Stolfi (talk) 02:03, 11 August 2009 (UTC)Reply

The only problem I could see is that "Primary Key" is the more common term. There isn't even an attribution for who came up with that, and PostgreSQL and Oracle both use "primary key". Where did "superkey" even come from?Mephistopheles (talk) 15:43, 21 October 2009 (UTC)Reply

It's OK that most of the DBMS's use primary key. But the "superkey" term is used in the relational model, which, is a model, so every DBMS is an implementation of it. Outsider.dave (talk) 06:57, 25 October 2009 (UTC)Reply

This is how I see it. Unique key is an implementation of a super key in SQL. Super keys are a set of attributes (columns) that can be used to uniquely identify any tuple in the relation (row in the table). A unique key is a type of constraint on SQL table that ensures that no two rows have the same values for the specified column. These definitions are ALMOST equivalent, the only difference is that Unique Keys, as commonly implemented, don't automatically enforce NOT NULL. This means that some or all columns in a unique key may be NULL. Now, because NULL is not an actual value (it represents the lack of a value), when two rows are compared, and both rows have NULL in a column, the column values are not considered to be equal. Because neither column has a value, there are no values to be compared and two non-existent values cannot be equal. Therefore, in a table with a unique key {color,name} where {name} is nullable, you may get two rows with values {red,NULL}. Because of this, the unique key cannot be used to identify each row in the table individually, and therefore does not qualify as a super key. This problem only exists for unique keys with nullable attributes, unique keys where all attributes are not nullable do qualify as super keys. So, unique key != super key. Jesselong (talk) 11:05, 14 January 2010 (UTC)Reply

Actually, looking at the text: "A unique key must uniquely identify all possible rows that exist in a table and not only the currently existing rows." from the Unique key article. This is wrong, see my paragraph above. Common implementations of unique keys allows duplicate rows where two rows both have null in the same column. The quoted text describes super keys, a data modeling concept which can be correctly implemented using non-nullable unique keys. Jesselong (talk) 11:05, 14 January 2010 (UTC)Reply

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Merge proposed with Key field edit

Latest comment: 11 years ago2 comments2 people in discussion

The Key field article has no business being a separate article. It should, at most, be a subsection of this article. It's currently a stub. I recommend getting rid of that stub and merging it into this article. 69.116.234.82 (talk) 02:55, 27 March 2010 (UTC)Reply

No real discussion on this after 2.5 years, so I've merged this as uncontroversial. WTF? (talk) 23:07, 10 February 2013 (UTC)Reply

Unique Keys Using NULL edit

Latest comment: 12 years ago2 comments2 people in discussion

Does a unique key uniquely identify all rows when NULL is used or not? The second paragraph of the introduction both denies and then affirms this. --Tag (talk) 17:34, 29 October 2010 (UTC)Reply

you must mean the contradiction between these two sentences:

"However, a column defined as a unique key column allows only one NULL value, which in turn can uniquely identify that row/tuple."

and

"Thus, the values in unique key columns may or may not be NULL, and in fact such a column may contain multiple NULL fields" — Preceding unsigned comment added by 71.83.65.96 (talk) 08:03, 6 June 2011 (UTC)Reply

In many databases (I have experience with only a few) no two NULL values are considered the same. So having multiple NULL values does not break the unique constraint. Not sure whether this is simply implementation or theoretical as well?

Merged information from Alternate key edit

Latest comment: 13 years ago1 comment1 person in discussion

Reformatted the information as it was extremely spread out, also found a page about an alternate key page on the Oracle FAQ. I feel it could maybe use some peer improvement though. I'm not satisfied with my text. 62.140.137.122 (talk) 00:05, 18 April 2011 (UTC)Reply

Misconception: unique and null edit

Latest comment: 12 years ago1 comment1 person in discussion

Misconception. The current text (October 8, 2011) states that "a column defined as unique key column allows only one NULL value, which in turn can identify that row/tuple." This is wrong! According to the ANSI / ISO SQL and Relational Model theory, NULL should not be considered as a value repetition (since it represents "no value" or "value unknown"). Thus, a "unique constraint" (unique key) should allow more than one row with null. Some RDBMS do not allow (such as Microsoft SQL Server), but are not in accordance with the standard. Some others (like Oracle or PostgreSQL) implement in accordance with correct concept. — Preceding unsigned comment added by 201.81.69.28 (talk) 16:51, 8 October 2011 (UTC)Reply

References:

--Andre Santos 17:39, 8 October 2011 (UTC) — Preceding unsigned comment added by Andre.psantos (talk • contribs)

Natural and surrogate key definitions edit

Latest comment: 9 years ago23 comments2 people in discussion

Hello @DesmondW:, I'm planning on reverting your edits because I think they are less clear or inaccurate. If you're going to replace my examples, then be specific and come up with better ones. Thanks. Alrich44 (talk) 22:31, 15 August 2014 (UTC)Reply

Natural key: An SSN is generated by a system, the social security system.

Alrich44: A key made from data that exists outside of systems. The data is not system assigned or generated. For example, a person's name and birthdate.

DesmondW: A key made from data that exists outside the system. The data is not system assigned or generated. For example, a person's social security number

Surrogate key: There are lots of numbers inside computer systems and outside them. Saying "a number" is not a good example. An SSN is not a surrogate key in the social security system. In my state, the state driver's license ID is a compound key, part from the driver's name and part from a system generated code.

Alrich44: A non-natural key made from data that is system assigned or generated. For example, a social security number or driver's license ID.

DesmondW: A non-natural key made from data that is system assigned or generated, usually a number.

Alrich44 (talk) 02:55, 25 August 2014 (UTC)Reply

Hello DesmondW, I read farther into the Natural key article, clear to the second sentence, "For example, a USA citizen's social security number could be used as a natural key." Then I sought some other references to bolster my viewpoint. Unfortunately, I found that it is easier to find articles that provide an SSN as an example of a natural key. Apparently, the database world has strayed far from what I consider the "real" or natural world.[1][2] Apparently, any number that is not system generated in that table, most designers consider it to be natural.

I did find an article that managed to get beyond using the SSN as an example.[3] Apparently, we, computer people, need to get out more.

Bottom line, you're welcome to re-instate your SSN example. I just cannot bring myself to type it. Maybe the reason we have some duplicate SSNs is that the US government is making the same mistake. Alrich44 (talk) 12:47, 27 August 2014 (UTC)Reply

Hello Alrich44, yes it's good to discuss these things. I'm not quite sure which part of my work you agree to re-instate, please clarify. Please also consider that the birthdate / name / address example should be removed, no programmer in their right mind would use such a cumbersome key. One of the reasons for using numeric surrogate keys is to make the key more compact, even when a candidate key exists.

Isn't your example of part name part number a concatenated key? Many such examples exist.

"Apparently, any number that is not system generated in that table, most designers consider it to be natural." I agree with that statement, perhaps better to say "any key" rather than "any number" since the key does not need to be numeric. DesmondW (talk) 16:45, 27 August 2014 (UTC)Reply

P.S. Apologies if my critique was a bit sharp DesmondW (talk) 16:51, 27 August 2014 (UTC)Reply

Hello DesmondW, Apology accepted. Although, you do continue to have a way with words; this has me reflecting on my own use of them. I suppose that you're just not used to working with co-workers in their right minds who have a different opinion. I believe there is hope that we can reach a consensus. I apologize too if for some reason I'm not in my right mind. How can one tell?

You are correct about the key with the name being a concatenated key, not a compound key. Thank you for catching that one.

I've lost track of changes. I believe I've discussed the ones that were changed. Perhaps we can move forward with its current state as a discussion point of reference. There is the built in tool to see the differences in versions. I'll even add in the SSN example.

Regarding compact keys, I can see two possibilities for differences in implementation.

1. The difference between a unique index and a key. In TSQL implementations, I see little practical difference between the two. I believe in initially doing a logical model based on natural keys, as close to the natural, real world as possible. Then the usual physical implementation would convert these to unique indexes, and each table would have a system-assigned, sequential numeric ID. As a general pattern, I've learned do this even for M:M resolution tables because their keys may migrate and other reasons, including systems with front-ends having a single ID for objects has proven simpler as a general pattern.

Just because something is a key or has a unique index does not mean it is the PK, not all keys need to be compact.

Apparently, there are two camps in the database world. Those who only use natural keys, and those who always use system-assigned unique keys. The design and implementation above allows for both.

2. The difference in the size and complexity of the system. In most systems, I would follow the pattern as listed above. In a smaller, simpler system, I can see the concatenate key example (full name, birthdate, location) being used to prevent duplicates entries. Apparently, you may not have looked at my reference previously provided, so I will put it here and in the article. The cited reference is even more constricted because it does not include birthdate. In my existing example, for a real duplicate to need to exist, there were would need to be twins living together with the same full name. In the referenced example, there would only need to be a parent and child with the same name, as this often happens where one drops the Sr. and Jr. from their names, it didn't seem like a good enough example to me.

In larger, enterprise systems, there are much more complex, 3rd-party applications dedicated to preventing duplicate people from being entered; these typically still require human intervention and a system that allows for it. So, I will agree that I have not created such an index even though I can see the situation in which one could exist.

Regarding, the SSN as a good example of a key. SSNs are known to not be unique.[4] I would not create a unique index on one, and it would not be a PK in any system I design. I would treat it as almost unique. For a UI entering new people records, it would have a warning that one already exists. Processes that listed potential duplicates would identify these records. And processes that returned a result from an SSN lookup would need to allow for more than one record, including the one for looking up existing people records.

Regarding, "Apparently, any number that is not system generated in that table, most designers consider it to be natural." My statement was within the context of numbers that the database is generating as keys. I'm not sure why we need to clarify this. Will it be used in the article? I suppose a GUID could be argued either way, text or numeric, but there are numbers that contain alpha characters. I do agree that there are a lot of numbers that are not keys, so it would be a point to clarify if used in the article.

Alrich44 (talk) 16:09, 28 August 2014 (UTC)Reply

Hello again Alrich44, a good reply, we are mostly in agreement and I am enjoying the discussion.

Regarding the lost changes I suggest that we simply try to reach consensus on the general definitions.

1 I agree that a unique index and key are similar, except for the vital constraint that a key may not be null (already mentioned in the article). I suggest that this article should not delve into a discussion about compact keys (even though I champion them) since this is an implementation question which will differ between database systems.

2 In any system I implement surrogate keys are always used and any natural key (SSN, ISBN, etc) may be empty (or null) but if specified will be checked for uniqueness. Please remember that unique identifiers are actually numeric (128 bit integers) but some systems, MySQL for example (which in any case is a hybrid ISAM database), treat them as characters, which is incorrect but should not enter this discussion. Unique identifiers are, properly, numeric and, as I am sure you know, the usual alpha numeric representation (e.g. 64BAD4FD-A514-475E-8956-E78440A6B4A7) is formatted hexdecimal simply to make the number shorter.

Regarding numeric versus character keys I don't think that is relevant for this article, but I do think that the examples shown should reflect good practice.

DesmondW (talk) 18:21, 28 August 2014 (UTC)Reply

Hello DesmondW, I'm glad we're making progress and it is going well for you. I'm reviewing things too and picking up a little on the way. What do you think of the current definitions? I've added SSN to both the natural and surrogate key examples, even though I believe that it is good for neither. I can hardly believe it came off my keyboard. Maybe we should move the examples into their own sections to have the article discuss them. There will be more space to cover the pros and cons of the SSN, etc.. Please propose definition changes here. Since I wrote the originals, I'm updating them per our conversation. Are they reaching a middle ground? Alrich44 (talk) 20:39, 28 August 2014 (UTC)Reply

Hello again Alrich44, we are certainly progressing but there are a couple of things I don't like and maybe some additions. The one thing that I most disagree with is that SSN or ISBN can never be considered surrogate keys. In the master database they may be considered the original key, but a surrogate is an artificial key used instead of any other candidate, either natural or original. I also don't agree with the example of name / birthdate / address / location since the data is not necessarily external.

My proposals:

Concatenated: A key made from more than one attribute joined together as a single key, such as part of a person's name with a system generated number appended.

Natural: A key made from data that exists outside the current database. In other words, the data is not system generated. For example, a social security number in an employee payroll database.

Surrogate: An artificial key made from data that is system assigned or generated when another candidate key exists. Surrogate keys are usually numeric ID values and often used for performance reasons.

DesmondW (talk) 12:01, 29 August 2014 (UTC)Reply

←

Hello DesmondW, Do you think that US SSNs are 100% unique?[5][6][7] Alrich44 (talk) 15:46, 29 August 2014 (UTC)Reply

Hello Alrich44, I have no idea (but assume that there are duplicates, since you raise the topic) and don't think that such an aberration is relevant to this article. In any other country they will be unique, as they surely should be.

My key point is that SSN, ISBN, or other master keys can *never* be surrogate keys (and if duplicates exist they are not keys at all).DesmondW (talk) 16:00, 29 August 2014 (UTC)Reply

Hello DesmondW, Why do you consider the uniqueness of an SSN being off topic? The article is Unique key, and you just said, "The one thing that I most disagree with is that SSN or ISBN can never be considered surrogate keys."

I have never said that an SSN can never be used. What I said is that they have duplicates. Because there are duplicates, an SSN fails my first test of considering them to be used generically as a key in large systems. The owners, not the designer, need to make an informed decision as to the odds of the duplicates existing in their database and how best to handle them when/if they occur. One of the references that you apparently didn't read discusses this point and goes on to state that even the Social Security people do not assume them to be unique; they provide a process for resolving duplicates. The carefully crafted Wikipedia SSN article states that the duplicates occurred only early on in the system. Another reference points to them still occurring.

So, I do object to an SSN being used as a general example as key. My proposal was to remove it from the single sentence example, and locate it in a paragraph that can fully address it.

I don't know why you keep bringing ISBN into it. I have never addressed ISBN yet. For it, like any key, we would need to consider the odds of duplicates existing. If I was working on the Amazon book system, I wouldn't worry about duplicates, and I would not use the ISBN as a key. If I was working on the ISBN system, I would have a process to resolve duplicates.

Maybe what we are running into is that natural keys are prone to duplicates in large data sets. I do like the example of either the code or the name in a domain lookup table. In systems I build, these are not null and have unique indexes.

I've included most of your proposed definitions into the article.

Alrich44 (talk) 16:51, 29 August 2014 (UTC)Reply

Hello Alrich44, we are drifting away again and going off topic. Duplicate SSN or ISBN are data errors and completely outside the scope of this article. The main point is that, by definition, any such master key *cannot* be a surrogate key, a surrogate (stand in or substitute, do look up the meaning) is an additional artificial key used in place of any other candidate key. Please look up surrogate key definitions, you will see that this is true. It is this basic error that gets under my skin and it must be rectified.DesmondW (talk) 17:06, 29 August 2014 (UTC)Reply

Hello DesmondW, Have you noticed that SSN has been removed as an example of a surrogate key? I have been discussing you wanting to use it as a Natural key example. We are under the topic of "Natural and surrogate key definitions".

Regarding the use of SSN in the Social Security system. I'm fairly certain that the part that is the serial number is being system generated, then appended to the state segment to form what they consider to be a unique number in that database. Duplicate numbers in the SS system with the current numbering system are not projected to be an issue for another 30-40 years. Alrich44 (talk) 17:12, 29 August 2014 (UTC)Reply

Hello Alrich44, no I missed that and am happy to see consensus on such a fundamental point. So now considering natural keys, in database design SSN and ISBN uniqueness is hoped for but not guaranteed, as you point out, which is another reason why best practice is always to have a surrogate key. However, I do think that the possibility of duplicates is not part of this article and, as we both know, SSN, ISBN, or stock codes are commonplace natural keys in many systems and obvious examples. "the code or name in a domain look-up table" really is not clear, a generally known key such as SSN or ISBN will be much more obvious to most readers and it is still a valid example of a natural key even where errors exist. This article cited http://www.databasejournal.com/features/mssql/article.php/3922066/SQL-Server-Natural-Key-Verses-Surrogate-Key.htm specifically includes SSN and ISBN as natural key examples. DesmondW (talk) 17:30, 29 August 2014 (UTC)Reply

Hello DesmondW, I conceded earlier on that due to a popular myth of SSNs being unique that I could see someone, not me, entering into the example of a natural key the example of an SSN. Since you just finished stating, "...if duplicates exist they are not keys at all", and hopefully, you now recognize SSNs as having duplicates, I am befuddled why you return to insisting they be added as an example. I would rather see the examples being discussed in a later paragraph. I've removed the domain table name example. Alrich44 (talk) 17:44, 29 August 2014 (UTC)Reply

Hello Alrich44, we are drifting off topic again, let's stick to the subject. Duplicate SSN or ISBN (data errors) are not part of this article. You will find SSN and or ISBN commonly referred to as natural keys, they are everyday examples. In the spirit of Wikipedia that is what should be quoted as examples and not some theoretical example, which looks like original research. DesmondW (talk) 17:52, 29 August 2014 (UTC)Reply

Another problem I have just noticed, the table heading is "Unique key type" but foreign key is not a unique key. DesmondW (talk) 17:57, 29 August 2014 (UTC)Reply

Oh, by the way, social security number is cited in the introduction as an example of a unique key. DesmondW (talk) 18:10, 29 August 2014 (UTC)Reply

Hello DesmondW, I'm working on addressing one problem at a time. We don't cite Wikipedia as a reference.

Regarding the topic: The topic is Natural Key examples. Natural keys fit within the topic of Unique key, which is the topic of the article. Unique means one and only one. Therefore, I consider discussing the uniqueness of a natural key example as being exactly on topic.

Regarding presenting original research: I am the one who has brought references to this discussion, both those who use it as a natural key example, and those that show it is not unique. Lookup/domain/reference tables are discussed in database surrogate vs natural key discussions.[8]

"The fundamental issue is that keys are a significant source of coupling within a relational schema, and as a result they are difficult to change. The implication is that you generally want to avoid keys with business meaning because business meaning changes. Having said that, I have a tendency to use natural keys for lookup/reference tables, particularly when I suspect that the key values won't change any time soon, as I describe below."[9]

For some reason, you seem to be insistent that a single sentence example of SSN be included in the table definition and are unwilling to accept presenting the pros and cons of using an SSN as a natural key in a paragraph/section. I have agreed several times that it is a common example, one that I consider commonly misinformed, similar to propagating ideas on the web and why we don't cite Wikipedia.

Proposed Natural key section / SSN paragraph:

A common example of a natural key is the US Social Security Number (SSN).[10][11] Note: PhilP, in the DatabaseJournal reference, brings up the issue of SSNs changing, along with others who have experience using SSNs. In a few decades, SSNs will need to be either re-used, or altered to allow for more numbers.[12] Multiple people are continuing to be assigned to one SSN.[13]

The US Social Security agency provides a process for resolving duplicate assignments. "When SSA determines that the same SSN was accidentally assigned to two different people, and consequently assigns a new number to one of these individuals, the numbers are not cross-referenced."[14]

Desmond, In the DatabaseJournal reference, you may want to search down to "Marcus Alabama"'s comment on ISBNs. "For example, ISBN's were created in 1970 as a surrogate key for books. Before then you could reference a book by the Library of Congress Number but that is also a surrogate key. What naturally identifies a book is its title, author, publisher and publication date. Further proof that ISBN is a surrogate key: can you tell me what book has ISBN 978-0-07-154808-3 or 978-0-07-154994-8 or 0-07-226090-4. I bet the author couldn't even tell you."

For SSNs and ISBNs, within their systems, they may function as unique identifiers. When brought outside their systems, they may be technically referred to as a natural key, but they apparently do not pass one of the main criteria for a key, and that is uniqueness.

Alrich44 Firstly, of course natural keys are part of this article, what I was getting at is that I think discussion of key errors, such as duplicate SSN, should not be part of the article. I wasn't intending to cite Wikipedia, merely to point out that the reference is already in the article.

Secondly, I want to make a definition of a original key type which, very curiously, is not part of the SQL definitions. Taking Amazon as an example (any trading company will do), when opening an account you will be assigned an internal ID which I am going to call the master key. When this key is exported to another system then it becomes a natural key, as we have been discussing.

Now let's come to the article by Marcus Alabama, which I haven't read and am taking your quotations as given. I am absolutely astounded, what absolute rubbish (aimed at him, not you). Complete nonsense, are you sure that this was not written on April 1st? In no way is book title, author, publisher, and publication date a candidate key (I call these gross concatenations unnatural keys). Of course there can be two books with the same title by two authors with the same name published on the same date by the same publisher. Any database engineer worth a cent knows instinctively that a system generated ID is the only way to guarantee a unique ID, and in the original system this is what I am calling the master key. ISBN is such a master key in the original system and so is an SSN.

His discussion about not knowing the meaning of a master key is just hot air. A surrogate key means that there is another candidate key, master or natural, in the table that could have been used. In the case of the original ISBN system, or original social security system, the generated ID value is a master key and no other guaranteed unique key exists, they are not and can never be surrogate keys (although of course there could be an internal surrogate key in addition to the master key).

Any other article I have read describing surrogate keys follows this definition, here's one: http://www.agiledata.org/essays/keys.html

Apologies for blowing hot, but this is just the sort of misleading information that makes our business so difficult.

Regarding your proposed Natural key section, there should not be any national bias. Wikipedia is international:

Proposed Natural key section / SSN paragraph: A common example of a natural key is a national Social Security Number (SSN) for individuals,[10][11] and another is ISBN for books. Both these references have the problem from a database perspective that duplicates can, and have been, issued in error. Furthermore, they may possibly change, be extended, or even re-used, all undesirable in a database system.

I always teach that database keys must be not just unique but immutable: never changed, and never re-used. SSN, ISBN, stock codes, and any other public keys fail all these criteria, and are also character making them cumbersome. However, this discussion is implementation and, while the problems should be mentioned, these are still common natural keys and I really think that it helps to quote good examples, people don't fully understand without examples. DesmondW (talk) 18:34, 30 August 2014 (UTC)Reply

I see that you have removed "unique" from the key table column name, but that does not solve the problem. Foreign keys are not unique (obviously) and should not be included in this article about unique keys. DesmondW (talk) 21:56, 30 August 2014 (UTC)Reply

Hello DesmondW, Proposed paragraph accepted. Go ahead and add the section. I'm in full agreement with what you teach. Thanks for talking this through. Regarding FKs: You have a good technical point. For some reason, the originators of SQL decided to call them keys (unique IDs). I would find it odd to discuss all keys accept FKs, so we could look at re-titling the article, maybe consider all the SQL articles and how to best organize them. Alrich44 (talk) 20:26, 3 September 2014 (UTC)Reply

Hello Alrich44, thank you for being so magnanimous, I have enjoyed our discussion. I am busy just now with family, but next week will have time to add the agreed paragraph and some other changes that I hope will make it much clearer.

Foreign keys are keys and unique in the referenced table, so I agree that they should be in this article.

I'll set up a new talk section for any further discussion. DesmondW (talk) 14:43, 4 September 2014 (UTC)Reply

Compound and Alternate key definitions edit

Latest comment: 9 years ago4 comments2 people in discussion

Hi DesmondW, I've encouraged you to use this talk page to discuss the definitions (see above). Per the previous attempted discussion, I'm changing back some of the the recent changes, leaving the change to Concatenated, and combining the Surrogate definitions. An SSN is a system generated ID; it is not a "natural" key.

Previous Compound key: A key made from at least two attributes or simple keys, only simple keys exist in a compound key.
Desmond: A key made from more than one attribute considered in combination.
Change: Reverted to original.
Discussion: You've deleted important information in the original definition. I'm not sure what you mean by consideration. At the time of the select? An attribute does not need to be indexed, but a key is indexed when it is defined. A query will typically look for indexes first.

Previous Alternate key: A non-primary key that can be used to identify only one row in a table.
Desmond: A alternate key that can be used to identify only one row in a table.
Change: Reverted to original.
Discussion: A good definition does not use the word in its definition, unless it is using it in a sentence as an example.

You can solicit a third opinion if you like, usually it is discussed first. Alrich44 (talk) 02:55, 25 August 2014 (UTC)Reply

You must read the other articles on keys as you simply don't understand the definitions at all. I am a database engineer and teach these topics. SSN is ***not*** a system generated key in any system other than the main government system. In any other system it is imported and therefore a natural key. Likewise ISBN, in any general commercial system.
A person's name and birthdate is not a key at all, duplicates will exist. A worse example would be hard to conceive.
A surrogate key is always locally generated without any reference to external (natural) keys and surrogate keys are the most common in any system. I know, I develop them.
Please read the other Wikipedia articles, you will see that your definitions are incorrect. Also please revert my edits before I do so, this article is already terrible without the errors. DesmondW (talk) 15:56, 25 August 2014 (UTC)Reply

P.S. sorry I missed your earlier message which I should have replied to. DesmondW (talk) 17:15, 25 August 2014 (UTC)Reply

Hi DesmondW, No problem on missing the other message. Please don't change the definitions until we've reached a consensus. If you have a specific article or reference you would like me to read, then list it.

We may be down to the definition of a surrogate key. See the previous section.

By the way, I did not write the rest of the article but did seek to improve it. I hardly think that the difference we have is considered a complete lack of understanding on my part, and all-knowing on your part. This may shock you that you are not the only database engineer in existence. I'm reserving judgement on your teaching ability; I wouldn't want to judge it based on your ability to have a conversation. One of the things I work on is not using sweeping statements; they are rarely true and may lead to the wrong impression.

Thanks for beginning to have a conversation.

Alrich44 (talk) 21:44, 26 August 2014 (UTC)Reply

Lacks a single topic. Should be split. edit

Latest comment: 9 years ago1 comment1 person in discussion

This article is extremely unclear and has many inaccuracies. The cause seems to be that at least three quite different topics are included but not properly differentiated: 1) Keys in Entity Relationship Modelling, 2) Keys in the Relational model, 3) UNIQUE and PRIMARY KEY constraints in SQL. For example the statement "the key that is allowed to migrate to other entities to define the relationships" makes some sense in an ER modelling context but is irrelevant to the Relation model. Similarly the statements about nulls in UNIQUE constraints are specific to SQL. Would it be better split into separate pages for "Keys (Relational)" and "Keys (ER)". Optionally, either of those pages could say something about SQL implementation issues. Oradium (talk) 22:33, 13 January 2015 (UTC)Reply

External links modified edit

Latest comment: 7 years ago1 comment1 person in discussion

Hello fellow Wikipedians,

I have just modified one external link on Unique key. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

Added archive http://web.archive.org/web/20120425052025/http://www.xcdsql.org/Summary%20of%20SQL.html to http://www.xcdsql.org/Summary%20of%20SQL.html#chapter-Table%20constraints

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 18 January 2022).

If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—InternetArchiveBot (Report bug) 20:05, 20 July 2016 (UTC)Reply

Key vs Unique key vs Superkey edit

Latest comment: 6 years ago1 comment1 person in discussion

The article intro says:

a unique key (a redundant term since a key, by definition, implies attribute(s) value(s) uniqueness, therefore it should be avoided) is a superkey

This suggests that "key" = "unique key" = "superkey" are all synoyms. But then why is there a separate article Superkey, which doesn't say anything about this? If there is a difference between the terms, even if it's subtle or context-dependent, can someone please edit both articles to make that clear? --79.227.176.47 (talk) 07:16, 9 October 2017 (UTC)Reply

Rename as Key (database) edit

Latest comment: 5 years ago1 comment1 person in discussion

The page should be renamed as "Key (database)" with Unique key as a redirect. As the text already states, "unique key" is a pleonasm. Throughout there seems to be some confusion with unique constraints, but a constraint is not a key and the set of columns covered by a unique constraint is not necessarily a key. Renaming the article might help avoid that confusion. Renaming it would also make more sense of the fact that we now have a separate article for "Primary key": Key (database) would cover keys generally, with primary key dealing with the "special case" explanations. — Preceding unsigned comment added by Oradium (talk • contribs) 05:52, 25 July 2018 (UTC)Reply

Difference between Unique key and Candidate key edit

Latest comment: 5 years ago2 comments2 people in discussion

The first sentence of Unique key specifically says that it is synonymous with Candidate key:

In database relational modeling and implementation, a unique key (also known as a candidate key) of a relation is a minimal superkey for that relation; that is, a set of attributes such that:

And Candidate key has almost the same first sentence, although strangely it says it's synonymous with "secondary key" (note: Secondary key redirects to Secondary chord, a totally unrelated term):

In the relational model of databases, a candidate key(also known as secondary key) of a relation is a minimal superkey for that relation; that is, a set of attributes such that:

And on top of that the two share identical definitions:

the relation does not have two distinct tuples (i.e. rows or records in common database language) with the same values for these attributes (which means that the set of attributes is a superkey)

there is no proper subset of these attributes for which (1) holds (which means that the set is minimal).

So, is there a difference?

If so, that should be explained in one or both articles, and currently it is not. Candidate key does not mention Unique key, and the only places where Unique key mention Candidate key are to say they are synonymous, or to explain the difference between a Candidate key and a Primary key.
If not, one of these articles should be merged into the other.

Note: I write some basic SQL professionally, but that is the full extent of my qualifications on this. I'm just trying to use Wikipedia to get a basic grasp of these concepts, but this whole set of articles is really kind of hard to follow as a layperson. Let's fix it!

— nhinchey (talk) 20:10, 31 October 2018 (UTC)Reply

There is no difference. The title "candidate key" is preferable because "unique key" is a tautology. I suggest merging the two under the name Candidate key or, perhaps better: Key_(database), which currently redirects here. Oradium (talk) 22:10, 2 November 2018 (UTC)Reply

Simplified and corrected the 1st paragraph, including the two bullets edit

Latest comment: 4 years ago2 comments2 people in discussion

This paragraph has so many hyperlinks that to a reader who was unaware of what unique keys are, it would be necessary to spend half an hour perusing concepts like "relation", "attribute", "tuple" , "superkey",etc. in order to understand the definition, when the use of common synonyms (table, column, row, etc.) would suffice without the need for such perusal. I've hence taken the liberty of simplifying the prose by using the common terms instead of the technical E.F.Codd-speak terms.

In response to the previous note, while UNIQUE KEY is indeed a tautology, SQL uses the statements "CREATE Primary Key Constraint" and "Create Unique Constraint", not "Create Candidate Constraint" (or, for that matter, it uses "create table" instead of "create relation") and so, it's probably here to stay.

The 1st sentence falsely stated that "alternate key" and "primary key" were synonyms of unique key. A table can contain multiple unique keys (e.g., a surrogate (machine-generated) key and a multi-column key, but only one of these (typically the surrogate key) serves as the primary key. The multi-column key serves as the basis of a UNIQUE constraint.

Prakash Nadkarni (talk) 02:49, 24 March 2020 (UTC)Reply

Your edit is probably an improvement except that I don't see a good reason to exclude the mention of alternate and primary keys. This article certainly encompasses both those things since both are keys (and of course they are unique). This is not an article about SQL syntax but about the concept of keys. There is a separate "primary key" page that forked from this one and pointlessly duplicates some of the same info but since key is the more general and more important term this entry ought to deal with the general case. There is a small inaccuracy in your edit because a key can consist of zero attributes - the phrase "one or more" is not correct.

Oradium (talk) 19:42, 26 March 2020 (UTC)Reply

Add topic