Talk:Second normal form

Latest comment: 7 years ago by ForbiddenRocky in topic Proper Definition?
WikiProject iconDatabases Start‑class (inactive)
WikiProject iconThis article is within the scope of WikiProject Databases, a project which is currently considered to be inactive.
StartThis article has been rated as Start-class on Wikipedia's content assessment scale.

Archives: 1 2

Proper Definition? edit

From what I can gather the definition of 2NF is that there are no partial dependencies w.r.t the primary key. The definition here seems to state that there are no partial dependencies w.r.t any candidate key. This would be the definition of BCNF with the addition of 3NF. —Preceding unsigned comment added by 173.19.225.62 (talk) 21:55, 14 April 2009 (UTC)Reply

Agreed, 2NF does not demand checking all candidate keys (actually SUPER keys), AFAI'veL. Just the Chosen One (the PK). Otherwise it's BCNF (with 3NF), right? —Preceding unsigned comment added by 130.237.158.90 (talk) 16:16, 29 January 2010 (UTC)Reply

No, the article is correct: 2NF demands checking all the candidate keys. Codd's definition states that every non-key attribute must be "fully dependent on each candidate key of R". This is from the Codd paper in the Rustin book cited in the article (p43). A great deal of normalization material on the internet and in less-rigorous published guides glosses over this subtlety, and talks purely in terms of the primary key - it makes for an easier-to-grasp explanation, but strictly speaking it's not right. --Nabav (talk) 08:14, 15 May 2010 (UTC)Reply
Elmasri, Navathe: "A relation schema R is in 2NF if every nonprime attribute Ain R is fully functionally dependent on the primary key of R." section 12.3.3, 3rd edition ForbiddenRocky (talk) 05:30, 12 March 2017 (UTC)Reply
Later Elmasri, Navathe: "A relation schema R is in second normal form (2NF) if every nonoprime attribute A in R is not partially dependent on any key of R." section 12.4.1, 3rd edition ForbiddenRocky (talk) 05:34, 12 March 2017 (UTC)Reply

The two given definitions of 2NF don't actually mean the same thing. The classic text book definition (I have a copy of Silberschatz, Korth, Sudarshan; 6th edition open in front of me, turn to page 372, Ex. 8.17) is that there is no FD where the left-hand side is a subset of some candidate key (this is the first def. given in the article). However, this is not the same as the "Put simply..." summary at the end of the introduction. Since the "put simply" def. would say a relation with FD's A->BCD, B->D is NOT in 2NF but it is by the first definition the article gives. I suggest rewording the "Put simply" part, or better yet, deleting it entirely since the definition given at the beginning is about as simply put as things can get for 2NF. 66.109.61.47 (talk) 20:27, 19 August 2016 (UTC)Reply

Inconsistent Examples edit

Awefully written and very hard to understand as a beginner. I have been working with DB's a bit and thought I understood some stuff, now I am totally confused again.... —Preceding unsigned comment added by 118.127.9.172 (talk) 01:31, 8 January 2009 (UTC)Reply

I'm sorry to hear you're finding it confusing. The "Example" section is very much geared towards helping beginners understand 2NF; so if it's not meeting that aim, then maybe we need to tweak it a bit. Could you tell us what it is in particular that you are finding difficult to understand about the example? Is it perhaps that it's not giving you a clear idea what a candidate key is? --Nabav (talk) 09:44, 8 January 2009 (UTC)Reply
To help beginners understand the article better, I've added some explanatory text about why the candidate key is what it is. This is because I strongly suspect that the concept of a candidate key is a stumbling block for readers new to the subject. --Nabav (talk) 08:46, 1 February 2009 (UTC)Reply

The author states that the example cannot suffer update anomalies. This is not the case. When updating the name Jones for example, one must modify all instances of Jones, or inconsistencies will occur. —Preceding unsigned comment added by 144.82.250.214 (talk) 15:50, 11 February 2009 (UTC)Reply

These examples, like various other examples in the NF articles, make use of the informal convention of treating people's names as identifiers. That is, instead of of designating an Employee as "123" (or whatever), we just designate him by his surname. Obviously in a real business application this would not be done; but for the purpose of the example the convention is harmless in my opinion. The reader is being asked to assume for the sake of the example that "Jones" really does immutably designate one specific employee, and that there will never be any need to change the name "Jones" to "Jonze" or what have you. I seem to recall that this sort of convention is used even in the journal articles that introduced these normal forms in the first place. If it's good enough for Codd, Fagin, et. al., then it's good enough for Wikipedia, surely. So I've reinstated the assertion that these tables are free of anomalies - that's the concept we're trying to demonstrate, after all - and I've added a clarification as to what each table's candidate key is. --Nabav (talk) 17:33, 11 February 2009 (UTC)Reply

The first example having attributes Employee, Skills & Current Work Location is incorrect. 'Current Work Location' is not functionally dependent on the 'Employee' attribute as more than 1 employees have the same 'Current Work Location'. Another example should be substituted.

Mistake in Example? edit

While I could be mistaken, the Electric Toothbrushes example seems to have a mistake. It claims that {Manufacturer, Model} is a candidate key. However, {Model} is a proper subset of {Manufacturer, Model} and contains no repeated values, making it a candidate key. Does that not disqualify {Manufacturer, Model} as a candidate key, thus invalidating the example? Zilong (talk) 18:50, 13 February 2009 (UTC)Reply

I can certainly see why you would think that, but no, there is no mistake. When we designate some set of attributes as a candidate key, we are making the claim that no two rows could ever have the same combination of values in those candidate key fields; but that a proper subset of the candidate key attributes could have identical values across two rows. A candidate key is about guaranteeing uniqueness regardless of what might happen to the content of the table in the future, and regardless of what the content is right now. In the example, if we were to designate {Model} as a candidate key, we would be making the claim that no two manufacturers will ever introduce the same name for their electric toothbrush lines - so for example, there could never be both a Hoch Classic and a Forte Classic. Designating the candidate key as {Manufacturer, Model}, on the other hand, allows the possibility that there could be both a Hoch Classic and a Forte Classic. Thus our choice of candidate key is a matter of our understanding of the practical realm that the table is "about". At some point I might add an explanation of this point to the article, because you are not the first person to raise this sort of question on an NF-article discussion page. --Nabav (talk) 13:08, 14 February 2009 (UTC)Reply

The first example having attributes Employee, Skills & Current Work Location is incorrect. 'Current Work Location' is not functionally dependent on the 'Employee' attribute as more than 1 employees have the same 'Current Work Location'. Another example should be substituted.

Two words: edit

Dictionary prime! This article's use of "prime attribute" in the first sentence defies the english definition of "prime". To a beginner, this could mean anything from "primary key" to "first column" to "unique column" to (heck, might as well say it!) the second, third, fifth, seventh, etc. columns. It is ok to have a new meaning for prime that is specific to database standardization, but it is confusing to bring esoteric jargon into a discussion without defining it first. When the jargon immediately suggests a contrary meaning to how it is used, then it gets very annoying very quickly and it is highly unlikely that the reader will read on to see the rest.70.113.72.73 (talk) 00:24, 20 August 2012 (UTC)Reply

Tournament Winners not in 2NF? edit

As written, {Winner, Year} is a candidate key (because there are no winners that won two different tournaments in a year, though clearly this is logically possible). Thus winner, a subset of a candidate key, determines the non-prime Winner-DOB and the table is not 2NF. Suggest adding a row with someone winning a second tournament in a particular year to fix it. Chris Smowton (talk) 14:12, 9 February 2014 (UTC)Reply

I noted the same problem in the talk page for 3NF. Tbessler (talk) 22:14, 21 January 2016 (UTC)Reply

Definition of non-prime wrong!? edit

The current definition of a non-prime key states that "A non-prime attribute of a table is an attribute that is not a part of any candidate key of the table." This definition is essential to the further definition of 2NF and I think that it is wrong! Lets look at the example of the simple FD set of: {A->B, B->A, X->Y, Y->X}. Obviously the candidate key set is {AX, BX, AY, BY}. According to the definition given the set of non-prime attributes would be empty(!). Since a relation with an empty non-prime attribute set easily satisfies 2NF (and 3NF by the way), it would mean that this relation is in 2NF. But the relation is obviously not in 2NF since I can easily generate insert-anomalies. Masgo (talk) 10:36, 1 September 2015 (UTC)Reply

The definition is not wrong. Your {A,B,X,Y} relation has no non-prime attributes and therefore it does satisfy 2NF and 3NF with respect to your given FDs. 2NF and 3NF don't eliminate such anomalies and that is why we need the other NFs instead. "Let us call any attribute of R which participates in at least one candidate key of R a prime attribute of R. All other attributes of R are called non-prime." (E.F.Codd, 1971, Further Normalization of the Data Base Relational Model)--Oradium (talk) 20:26, 7 October 2015 (UTC)Reply

Mistake in example and in didactic edit

It looks like somebody attempted to fix the mistake in the example and failed to do so correctly. In the current version, {Model, Manufacturer Country} is a candidate key and thus Manufacturer Country is prime. Therefore the reasoning that the table is not in 2NF due to Manufacturer Country is wrong. It would be necessary to adapt the table in a way that Manufacturer Country loses its primality property.

Furthermore the approach taken here is unfortunate from the didactic point of view: Normal forms should be explained as based on the functional dependencies to be expected by the application (and thus on the schema level). They should not be explained by the functional dependencies which show up as a consequence of a particular instance (ie table) of that schema: The design depends on the schema dependencies not on the dependencies which, at a particular moment, show up in an instance table. Unfortunately, this mistake is very wide spread. This problem applies to all NF articles.

Suggestion of how to build a better example: Assuming a world, in which the first name of a person determines the gender and FirstName, LastName determines the person. In this case we could look at a scheme with the attributes FirstName, LastName, PlaceOfBirth, Gender. Our scheme would have the (only) candidate key {FirstName, LastName} and PlaceOfBirth and Gender would be non-prime attributes. Gender would depend on FirstName. A single relation (scheme) would violate 2NF.

--Oradium (talk) 17:23, 20 February 2016 (UTC)There is no basis on which to say that {Model, Manufacturer Country} is a candidate key. As stated, dependencies cannot be inferred from a instances of a relation so there is no reason to conclude that the country should be a prime attribute. It might be worth modifying the example data to make this point clear but I don't think there is any actual mistake in the example. The suggested example where Firstname determines Gender seems less realistic and perhaps more likely to cause confusion than the example as it stands.Reply

--a quick question: you are saying "There is no basis on which to say that {Model, Manufacturer Country} is a candidate key". Is it not a candidate key by just definition, as described here: [1], as for the data present in the table ?Dmetrios (talk) 14:43, 28 March 2016 (UTC)Reply

As stated in that reference a candidate key cannot be deduced from a single instance of a relation value - candidate keys are based on functional dependencies that must hold for all permitted values of a relation. By looking at the example data we can determine what candidate keys do *not* hold (e.g. {Manufacturer} is not a candidate key in the Electric Toothbrush Models table); we cannot conclude what candidate keys *do* apply. The stated candidate keys are {Model Full Name} and {Manufacturer, Model} so those are the only ones we can know about. 92.233.110.178 (talk) 21:56, 30 March 2016 (UTC)Reply

  • Yes, identifying candidate keys must rest on dependencies that must hold for all permitted values, not just those shown in the table. However, it's clearer when constructing an example to ensure that deductions from the table match deductions based on the design. This stuff is difficult enough already! Peter coxhead (talk) 10:10, 19 April 2016 (UTC)Reply

References