Talk:Boyce-Codd normal form/Archive 2

This example isn't even in 2NF

Tutor SSN is dependent on Tutor ID or vice versa, and not at all on the Student ID which is part of all candidate keys.

To say that the example falls short of 2NF would be to say that at least one of its non-prime attributes is functionally dependent on a part of a candidate key (rather than the whole thing). Tutor SSN is indeed functionally dependent on only part of the candidate key {Tutor ID, Student ID}. However, Tutor SSN is a prime attribute. For this reason, there is no breach of 2NF. --Nabav 14:38, 15 July 2007 (UTC)

Probably could be a lot clearer

Someone might want to consider rewriting this with a particular audience in mind, namely people who have "taught themselves database management" but don't have a background in formal theory. Right now, I think such a person would find this impenetrable, even though they would have all of the knowledge necessary to understand a differently written article on the topic. - Jmabel | Talk 21:01, 15 January 2008 (UTC)

I can well understand that anyone tackling this BCNF article without a background in formal theory would have trouble. But BCNF is formal theory. Those who have taught themselves database management are typically familiar with the first three normal forms, and understand these normal forms to be basically concerned with primary keys and the way in which non-key attributes depend on them. That, however, is a rather informal and loose (and strictly speaking, incorrect) way of understanding the normal forms. The distinction between 3NF and BCNF can only become apparent if you are familiar with the rigorous definition of 3NF in terms of what are known as "non-prime attributes". The rigorous 3NF definition has a loophole - which BCNF addresses. Most people's less-formal intuitive definition of 3NF, on the other hand, has no such loophole, which is why they find it very hard to understand why BCNF should exist or what it is trying to say! I know I did, prior to learning the theory. The upshot is that if you have a good intuitive understanding of 3NF, and no particular interest in learning the theory in a rigorous way, BCNF is not something you need to concern yourself with. --Nabav (talk) 20:53, 23 April 2008 (UTC)

another problem

SSNs are not unique to individuals, are they?. Isn't it possible for the same SSN to appear for two different tutors and not be an error or inconsistency? —Preceding unsigned comment added by Bbacher (talkcontribs) 16:51, 30 April 2008 (UTC)

A Social Security Number is unique to an individual. --Nabav (talk) 16:40, 21 May 2008 (UTC)
Nabav - is no doubt correct, but the "database" doesn't know this unless it is made explicit. Where non-key attributes are indeed unique, I'd suggest implementing the UNIQUE attribute constraint. Otherwise there would be nothing to stop a user entering a duplicate SS# in this example. --Niddriesteve —Preceding unsigned comment added by 125.236.56.6 (talk) 04:05, 5 June 2008 (UTC)

Vanity

I grow weary of this. Some University of Bristol undergraduates have inserted their own names as "Student IDs" in the example database table. I reverted, but have been reverted in my turn. The particularly disappointing thing is that the new table completely runs roughshod over three important facts which the original table respected: 1) a tutor (Tutor ID) can tutor multiple students, 2) a student (Student ID) can be tutored by multiple tutors, and 3) the same SSN does not correspond to multiple different people (in this case, tutors). As this was a vanity edit which had the side effect of introducing errors, once again I shall revert. There may be a case to be made for using names rather than IDs ... but if we go that way, let's do it properly ... and let's not use the names of University of Bristol undergraduates. --Nabav (talk) 23:45, 20 May 2008 (UTC)

They attempted to insert their names into the 1NF and 2NF articles as well, I see. --Nabav (talk) 23:56, 20 May 2008 (UTC)

Factual accuracy disputed?

User Niddriesteve disputed the factual accuracy of the article. He placed a note in the body of the article (I have moved it here) saying "This is a bad example and the explanation is incorrect - I will rewrite when I have time however I am posting this as some of my students have relied upon this example in assignments - Database students, please refer to your text book and ignore this article for the time being." Unfortunately, he did not give details of what he believes the inaccuracy to be. In order to clear this up, we need details from Niddriesteve as to what he thinks is in error. —Preceding unsigned comment added by Nabav (talkcontribs) 20:22, 4 June 2008 (UTC)

Further information from Niddriesteve 5 June 2008, follows

The original article suggests "there is nothing to prevent two different Tutor IDs from being shown, illegitimately, as corresponding to the same Tutor Social Security Number."

However the suggested solution (which seems to have only just appeared on the page within the last 24 hrs) proposes a relation (table)

  {Tutor_id, Tutor_name, Tutor-SS#} 

I can only presume the determinant is Tutor_ID although this isn't made explicit in the example - nevertheless in this case there is nothing (explicitly stated) to prevent the same social security number being associated with different tutors - the data anomaly inferred as the main problem thus hasn't been resolved by the proposed solution. (Of course, this is an entirely different issue to BCNF yet it has the result of providing a misleading explanation as to what BCNF supposedly resolves)

The real issue here is that BCNF resolves data anomalies occuring as the result of overlapping (composite) candidate keys (not mentioned in the example). There can be no BCNF issues in relations where a single surrogate key is chosen (also not mentioned), and the erroneous data issue I mention above, in fact has nothing to do with BCNF yet this is also not made clear in the example.

Of course there is nothing that can be done to eliminate erroneous data as a result of bad data entry within enforced constraints. Although setting a UNIQUE constraint on the SS# in the TUTOR table might help.

I don't quite have the time at present however I do have better examples and I think the explanation of BCNF could be improved considerably. —Preceding unsigned comment added by 125.236.56.6 (Niddriesteve) 01:06, 5 June 2008 (UTC)

Niddriesteve, it's a valid point you make. I have to apologise for the example (I am the originator of it). Although I think it succeeds in its aim of adhering to 3NF and violating BCNF, I see now that it doesn't succeed very well at throwing light on the particular anomalies that BCNF can be expected to address. I will revise the example and explanation to address the issues your raised, or perhaps you will get a chance to do so before I do. Thanks for the feedback. --Nabav (talk) 12:55, 6 June 2008 (UTC)
Incidentally, regarding the issue of the FD Tutor ID --> Tutor SSN, as well as the FD Tutor SSN --> Tutor ID ... in the original table that violates BCNF, we cannot guarantee that these FDs will be respected - even if we have the DBMS enforce the uniqueness of both candidate keys ({Tutor ID, Student ID} and {Tutor Social Security Number, Student ID}). Whereas in the design that I say adheres to BCNF, having the DBMS enforce the uniqueness of both candidate keys (namely, {Tutor ID} and {Tutor SSN}) will ensure that both FDs are respected. For this reason, I disagree with you slightly when you say that the data anomaly persists in the proposed solution. It persists only if one of the candidate keys is left unenforced; whereas in the original BCNF-violating table, the problem is intrinsic to the design and remains regardless of how much or how little is enforced by the DBMS. Of course, it's all very well for me to make these subtle points here ... the main issue I suppose is that the article didn't make them, and that it would be nice to have an example that removes the need to make the subtle points at all. --Nabav (talk) 20:13, 6 June 2008 (UTC)
Article has been overhauled: examples changed, new section included, references added, etc. I believe that the issues raised above have now pretty much been addressed. --Nabav (talk) 12:49, 12 June 2008 (UTC)