Wikipedia talk:Manual of Style/Dates and numbers/Archive 112



Concrete examples (year links)

The above discussions keep veering away from the point I'm trying to pin down. Can those who think single-linking of years is unhelpful please either say they want all single-year links removed, or give examples of when they think it is acceptable to link a year (or not acceptable - whichever list is shortest). I will give a few examples to start things off (apologies if this is repeating arguments made 5 or 6 years ago, but I wasn't here then):

[Allow me to give my personal answers by interspersing. --Kotniski (talk) 13:30, 24 September 2008 (UTC)]

  • Category:Time, date and calendar templates (more than you can shake a stick at!)
  • Template:Year nav (used to navigate between year and other history chronology articles)
  • Template:Year nav topic2 (complex template with a variety of possible outputs)
  • Template:Decade category header (used in categories)
  • Category:1850 (example of year navigation in categories)
  • Category:19th century (another example of year navigation in categories)
  • Links from "year in" articles: 1854 in literature
    • I'm happy for all the above to be linked. Kot
  • Year(s) and sometimes date(s) when an event took place: John F. Kennedy assassination
  • Birth and death years (and dates) in biographical articles: Martin Luther King, Jr.
  • Publication dates of a book or other publication (eg. film): Ulysses (novel)
  • Founding year of an organisation or construction of a building or monument: Albert Memorial
    • I'd be OK with this as long as we could successfully restrict it to specific cases like this (bearing in mind that editors tend to see certain links and insert what they think are analogous ones elsewhere - leading to overlinking). Kot
  • Mentions of years in the references for an article: Eduardo Abaroa
    • Fairly pointless I'd have thought. Kot
  • Disambiguation pages: 2000 (disambiguation)
    • Yes. Kot
  • A relevant mention of a year in an article
    • Don't understand - relevant to what? Kot
  • Any random mention of a year in an article
    • No, this has long been recognized as leading to overlinking. Kot

Most of those articles don't have the relevant years linked. Would it be acceptable for any of these examples (some are obviously acceptable, so please state which ones you would use year links in), and should anything else be linked instead? (When commenting, please don't assume I support all or any of these examples.) I'm aware that many of these are from template and category space, and involve "year in" articles, rather than year articles, but if the main points are to push chronological navigation away from wikilinks and into template and category space (eg. birth and death years are covered by categories), then please say that (it will avoid a lot of arguments).

On a broader point, when removing links, can a bot make some of the necessary distinctions? If not, is it acceptable for a bot to remove all links and then rely on humans to rebuild the previously-overlinked articles? This latter point seems to be what some people are implying by their statements, and it is an important philosophical point - when something, such as date-formatting linking, gets out of control and gets entangled with other sorts of overlinking, is it best to rip it all down with some collateral damage to be fixed by humans, or rely on humans to slowly fix things the wiki-way? The latter approach relies on people carefully reading the MoS (cue much hilarity), rather than copying what they see in articles, so I have some sympathy with the bot and bot-assisted and script approaches that seem to be aiming to level the playing field and start date linking from scratch again. But if there are clear examples people can agree on, and which have utility, those should be retained. And please, please, remember that linking, though it shouldn't have been, has been used to accumulate vast amounts of meta-data that should not be stripped out without retaining the utility of the meta-data. See the links at Wikipedia:Metadata. Carcharoth (talk) 04:41, 24 September 2008 (UTC)
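As an illustration of the question of which distinctions a bot can make, here is a minimal sketch of a purely mechanical rule. It is not the code of Lightbot or any other real bot; the names `SOLITARY_YEAR_LINK` and `unlink_solitary_years` are hypothetical, and the sketch assumes year links appear in wikitext as a bare `[[1963]]`. Such a rule can tell a bare year link from a "year in" link or a piped link, but it cannot tell whether the surrounding page is one where the year link is wanted (a disambiguation page or a "year in" article, for example). That judgment would still fall to a human.

```python
import re

# Assumption: a solitary-year link is written as [[1963]], i.e. one to four
# digits with no pipe and no other text inside the brackets.
SOLITARY_YEAR_LINK = re.compile(r"\[\[(\d{1,4})\]\]")


def unlink_solitary_years(wikitext: str) -> str:
    """Replace bare year links like [[1963]] with plain text; leave all other links alone."""
    return SOLITARY_YEAR_LINK.sub(r"\1", wikitext)


sample = (
    "Kennedy was assassinated in [[1963]]. "
    "See also [[1963 in film]] and [[1954 in literature|1954]]."
)
# The bare year is unlinked; the "year in" link and the piped link are untouched.
print(unlink_solitary_years(sample))
```

The rule sees only the link itself, never the context around it, which is why "remove everything and let humans restore the exceptions" and "let humans fix things gradually" are the two options under discussion.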

Yes, I'm on record as saying that links to date fragments are appropriate in articles on date fragments; links to solitary years are fine in "year-in-X" articles; this should probably be written into MOSNUM—it's a very reasonable compromise. On very rare occasions, a year-link may be justified in normal articles, but the onus is on an editor to demonstrate why it functions to significantly increase readers' understanding of the topic; a talented editor such as Anderson has shown that this may be possible, although I believe it is not commonly applicable. (In this respect, merely providing a magic blue flying carpet to a year link for discretionary browsers is insufficient). What is the difference between "a relevant mention" and "any random mention" of a year in an article?
There is generally no justification for linking solitary years in the other types of article listed above, except for those that are purely to do with navigation. In the J.F. Kennedy article, are you suggesting the linking of "1963"? It's not useful, IMO, because of the same old specificity–generality conundrum: if the year article is comprehensive, and in particular not culturally or racially biased, it will be huge, at least in the past hundred years, and very little of the information will be of even vague relevance to the good president. Providing such contextual information is the job of the article itself, and one of the most rewarding aspects of building a good WP article. Such context should not be delegated to a sea of factoids in isolation. Tony (talk) 05:05, 24 September 2008 (UTC)
Sorry, Tony, you've scored in your own goal with this one. Read the events of 1963; about 20% have been directly connected by someone with the assassination, and most of the rest are instances of trends (violence, Southern racism, the growth of the Soviet Empire, the weakness of the Soviet Empire....) which have been.
I agree that most of the year articles are ill-written, incomplete, and need work. What genre of two thousand articles doesn't meet that description?
But I do appreciate the buttering up. Please don't let this deter you from doing more of it. Septentrionalis PMAnderson 05:16, 24 September 2008 (UTC)
Hmm. Do you think you (Tony and Anderson) could discuss this without mentioning each other? Carcharoth (talk) 05:24, 24 September 2008 (UTC)
And spoil our Mutual Admiration Society ;)? You should have seen us two months ago; I'm certain you'd prefer this. Septentrionalis PMAnderson 05:30, 24 September 2008 (UTC)

Getting back on topic, hopefully, the alternative navigation route for that sort of browsing to find out about events and other things about the year 1963, involves waiting until the end of the article (has advantages and disadvantages), clicking Category:1963 in the United States and browsing from there. To reach 1963 takes three more clicks: Category:1963 by country, Category:1963 and then 1963. Alternatively, all year subcategories could have the year linked, so you could go straight from Category:1963 in the United States to 1963. Or you could have a link to 1963 in the article. Incidentally, November 22, 1963 redirects to the most famous event of that day. But does anyone know what else happened that day? How would one find out? Well, jumping over to November 22, we find that that page currently lists the assassination, one birth, and four other deaths. Some people will be completely uninterested in who the other people are who died on that day (thousands of people are born every day, and thousands die every day), but some will go to that page and look. That's human nature. It might not be what people want to see an encyclopedia provide, but there is a demand for it. I'm not saying that linking 22 November is the right way to satisfy that demand, but if you want to divert such demand away from wikilinking and towards categories or templates or "year in" articles, then understanding and acknowledging the demand for such navigational links is the first step in managing them. Carcharoth (talk) 05:40, 24 September 2008 (UTC)

(edit conflict) Anderson and I will disagree in the future as in the past on some matters, but understanding and goodwill should be encouraged in this viper's pit, not be the subject of objection, Carcharoth. I agree that someone has inserted quite a few factoids concerning Kennedy in that year article, although you do have to hunt through many irrelevant facts to find them (Britain's "big freeze", a double murder in Sydney, a triple murder in Perth, WA, ...). You'd think the US, Australia and the UK were just about the only countries on the planet in January 1963. This is a problem in most year pages: they're not a world view, but show extreme culture-centric bias. They're also by their design fragmentary: the information is isolated into disconnected facts through the whole year, where the article on Kennedy is the place to do what WP does so brilliantly: wind them into a cohesive whole that presents a context for the subject. Isn't there also a daughter article on the assassination of Kennedy? That is probably the place into which these factoids should be herded and bound together into the Internet's most cogent tertiary source on the Kennedy-related factoids currently in "1963". But if editors are really keen to retain the link to that article, well, let them make a case. It doesn't change the basic decision by the community that solitary years should generally not be linked. And, BTW, Anderson's right in that 1963 is a good example of the case for the odd exception. You won't find many like it. Tony (talk) 05:55, 24 September 2008 (UTC)
And Rome was not built in a day (gratuitous overlinking!). The existing links may contain data that could help build such articles (both the year one and the specific event ones). Sure, it will all need to be sourced and properly written, but categories, for example, are often good places to look for topic organisation when trying to plan an article and interweave different strands. "What links here" is a similar resource. The balance between tearing down the existing structure and rebuilding it, and taking it down brick-by-brick is something I'm not sure about myself. Sometimes tearing it down and rebuilding it does work better. My objection to the disagreement was more the naming of people - that personalises it - not the disagreement per se. Carcharoth (talk) 06:07, 24 September 2008 (UTC) Will be away for most of a week now, so that's my lot for this page.

To answer the direct question above about pre-1500 (or whenever) year pages: I can only say that they're quite unsatisfactory at the moment, since they're so small. They'd be less trivial and provide the opportunity for greater cohesion if conflated into decade pages, I believe. Even for the diversionary browser, they're not much good, but could be made into a great strength of WP with skilled editing; then I'd be in favour of highlighting them on the main page in a big way, and they'd be appearing as nominations at FLC and, better, FAC on a regular basis. That would be excellent, and if one WikiProject could coordinate (in some respects) all of the chronological pages, including the year-in-X pages that would link to them, year pages could be lifted out of their present malaise. But who's going to do this? And it wouldn't solve the current structural impediments to freely linking every year that pops up in every article, the way WP used to. Tony (talk) 15:41, 24 September 2008 (UTC)

I have to disagree strongly with Kotniski that linking dates of birth and death should be standard practice. I can see absolutely no justification for this unless, in rare instances, it can be demonstrated that a year-page is not swamped with irrelevant information. This would simply open the floodgates to the linking of all solitary years. Tony (talk) 16:36, 24 September 2008 (UTC)

I’ll be pleased to give you an example. Experienced editors know how these blue date abominations work. But they are Easter eggs to many new readers. To give you an example, when {cite book} is used. If one uses this code…

{{cite book | author = A. Rupert Hall and Marie Boas Hall
| title = A Brief History of Science
| publisher = New American Library of Canada
| date = 1964
| pages = 6}}

…You produce this: [1]

  1. ^ A. Rupert Hall and Marie Boas Hall (1964). A Brief History of Science. New American Library of Canada. p. 6.

Many readers assume that the “1964” will take you to more information on that edition of the book. Notwithstanding the fact that they are reading a science article, and the citation is a science book, and notwithstanding the fact that they are clicking on a date about an edition of a particular book, readers are only taken to an absolutely random list of unrelated historical trivia, such as this one: “May 27 - Prime Minister Jawaharlal Nehru of India dies; he is succeeded by Lal Bahadur Shastri.”  Once a new reader becomes an experienced reader, they learn to ignore these worthless links.

If I write “The Secret Service updated the way it guards presidents after the 1963 assassination of John F. Kennedy”, many readers assume they will go to an article on the assassination. But no, all they are met with is a metric ton of weapons-grade disappointment. Every instance I’ve seen of linked dates is done in a way where the context of the linked date leads many readers to assume they will be taken to an article with more detail of what happened on that particular subject during that year.

Well written articles anticipate what the reader would likely be interested in further exploring. By judiciously limiting the quantity of blue links, they remain novel to the reader’s mind so they don’t start being filtered out and ignored. When so written, we invite exploration and learning. But when we over-link an article with disappointing Easter eggs, we just turn that article into a giant blue turd.

If we are to adhere to WP:Principle of least astonishment, we would write “The Secret Service updated the way it guards presidents after the 1963 assassination of John F. Kennedy (click here for a list of random trivia that occurred during 1963).” How many readers are going to click on that now that we’ve given full disclosure and fair warning? A few times. At first—for the novelty. Never again after that.

So my short answer is this: do not link any single years unless it is fully disclosed via piping (aliasing) as to precisely what the reader will be taken to. And, even then, such uses should generally be limited to intrinsically historical articles, like French Revolution, since readers of such articles are often interested in many things historical.

Furthermore, dates like “May 22” should never, ever be linked—even when properly piped and in intrinsically historical articles. Why? Because if someone is reading French Revolution, they might well want to see what else occurred in 1799 (if fair disclosure is given so they aren’t Easter egg hunts). But if they read that “Abbé Sieyès moved that the Third Estate on 10 June”, such readers are not going to give a holy dump that “[On this date in] 1960, Marilyn Monroe wears her “Saturday” panties for the fourth day in a row!”

We mustn’t assume that readers must first become experienced readers before they know how to not step on these land mines. Autoformatting is beyond worthless. Deprecating its use was one of the wiser things done here on MOSNUM. Praise be to people like Tony, who has been patiently working this issue for years. For just as long, I’ve been patiently reverting these blue turds in the articles I’ve been working on after some editor wades into the middle of it, linking everything in sight for the sheer sake that an article exists on Wikipedia that can be linked to. That’s not the way proper writing and technical writing is done. Greg L (talk) 16:49, 24 September 2008 (UTC)

Well, there was a long-standing consensus that links not take us to articles we aren't expecting. There is an article on 1963. There is an article on Kennedy assassination. We have lots of articles such as 1963 in film which get hidden under year links, which I am hoping nobody is suggesting be unlinked. Could we discuss whether a link such as Greg L describes above be hidden beneath a link to another article or not? Corvus cornixtalk 02:21, 25 September 2008 (UTC)
I agree with most of Greg L's points, and I agree that easter eggs are really annoying. Putting "The Lord of the Rings was published in 1954" just hides the article 1954 in literature behind a single year link. Some people will assume it is a link to 1954, and will refuse to click on it. Some will think it is a link to 1954 and be surprised when they click on it to get something different. Others will use mouse-over and see what the destination is, but some will not. Better to write instead: "The Lord of the Rings was published in 1954. Among the other books published that year were The Horse and His Boy by C. S. Lewis". Strictly speaking, you still don't need to link to the "1954 in literature" page from a section of an article discussing Tolkien and Lewis (see The Inklings), but I can see cases where 1954 in literature would be a useful "see also" link. Some guidelines ask for "see also" links to be incorporated into the text, but some types of articles require piping (unless you can devise a wording that uses "1954 in literature") to do that, and if the "date fragment" article (to use Tony's wording) is an aside, it is better linked as an aside. The other place to link to such articles is in the introduction. By the way, Greg, WP:EGG is a good summary of this, though I'm sure you already know that.
Turning to the specific example, an experienced reader should be able to distinguish between:
  1. The Secret Service updated the way it guards presidents after the 1963 assassination of John F. Kennedy.
  2. The Secret Service updated the way it guards presidents after the 1963 assassination of John F. Kennedy.
  3. The Secret Service updated the way it guards presidents after the 1963 assassination of John F. Kennedy.
  4. The Secret Service updated the way it guards presidents after the 1963 assassination of John F. Kennedy.
  5. The Secret Service updated the way it guards presidents after the 1963 assassination of John F. Kennedy.
  6. The Secret Service updated the way it guards presidents after the 1963 assassination of John F. Kennedy.
  7. The Secret Service updated the way it guards presidents after the 1963 assassination of John F. Kennedy.
I think most people would agree that (1) is overlinked and poorly-linked at that. (2) makes 'assassination' an Easter egg, but does have the advantage of linking separately to JFK for the readers who want to go straight to that article instead of being routed through the assassination article. (3) is a version of (2) without the year link. (4) is clearer (ie. no Easter Egg), but hides the JFK links behind the assassination article. (5) is a version of (4) without the year link. (6) and (7) are a year-linked and non-year-linked pair for those who think linking to the assassination is overlinking. I would personally go for 3 or 5, as I think linking the year is unnecessary here, but linking to the assassination article is good. I would still aim to link from the assassination article to 1963, but either in context, or in the lead section.
Turning to birth and death years. I disagree with Tony that this is overlinking. Many biographical articles (the good ones, anyway) do describe the state of the world when the subject of the article was born, and when they died (as well as during their lifetime). Greg L's objections about the principle of 'least astonishment' don't seem to apply here. When a reader comes across a link to two year articles at the beginning of an article, clearly identified as the birth year and the death year, then I can't think of anything else they would expect other than to be taken to articles about the years. Staying with the JFK example:
  1. John Fitzgerald "Jack" Kennedy (May 29, 1917–November 22, 1963), often referred to by his initials JFK, was the thirty-fifth President of the United States, serving from 1961 until his assassination in 1963.
  2. John Fitzgerald "Jack" Kennedy (May 29, 1917–November 22, 1963), often referred to by his initials JFK, was the thirty-fifth President of the United States, serving from 1961 until his assassination in 1963.
  3. John Fitzgerald "Jack" Kennedy (May 29, 1917–November 22, 1963), often referred to by his initials JFK, was the thirty-fifth President of the United States, serving from 1961 until his assassination in 1963.
I've given three examples above. (1) is the current version in the article. (2) adds links to the birth and death years. (3) pipes in 'easter egg' links to the birth and death categories (already at the bottom of the article, but why should the reader wait until the end of the article for this?). An alternative is to wait until the birth and death are mentioned in the main text, and to link the years at that point:
  • Kennedy was born at 83 Beals Street in Brookline, Massachusetts on Tuesday, May 29, 1917, at 3:00 p.m. (sets a bad example and would encourage overlinking)
Or add a sentence or two to the "early life" section to give the context of the year 1917, and "set the scene". Mention that he was born in the closing years of World War I, that conscription began in the United States the next month, and so on. Not overdoing it, of course, as the article should be about JFK, not the history of the wider world, but some passing references to give people a flavour of the times he grew up in (the 1920s and 1930s) and the times his parents lived in. It is very difficult to do this well, but when done well it can work. One big problem is deciding what to put in and what to leave out, and to avoid slanting things by implication. In my view, one way to avoid problems with this is to simply link to years if you want to guide people to the wider "state of the world" at that time. It is, in essence, a placeholder, saying: "something needs to be said here, but for now let's link to the year article". Does anyone agree with this? Maybe something like a redirect from State of the world in 1917 to 1917 would help distinguish thoughtful links from gratuitous links? Carcharoth (talk) 05:43, 25 September 2008 (UTC)
If we are going to link years, I think they should be linked directly to the year article, without any imaginative piping to surprise the reader - this is presumably what's meant by least astonishment. But frankly I'm not sure editors would be disciplined and informed enough to respect a rule that says (for example) that years of birth are to be linked. We know what happened when autoformatting links were encouraged - some editors assumed (perfectly reasonably - I know I did) that this means that year links in general must be encouraged, and went around putting them in all over the place. I fear similar effects if we start encouraging year links in particular places. If there were some great benefit to it, then I would support it, but knowing what our year articles are generally like, I think the links would be more likely to disappoint users following them than to be of any use. If we think it's significant that someone was born in the same year that World War II started or the 500th episode of Happy Days was broadcast, then we can say that explicitly in the section on their early life.--Kotniski (talk) 08:53, 25 September 2008 (UTC)
  • 'Days of the year should never, ever, be linked.' Tony has been carried away by his rhetoric; he is forgetting the cases where a day of the year is a common name for something. The most obvious cases are in the French Revolutionary calendar, because they haven't run into disambiguation problems with the day articles discussed here; let 18 Brumaire be the type case. But there are other examples: August 10, July 4, 5 November. Septentrionalis PMAnderson 16:59, 25 September 2008 (UTC)
 
Just swell. I was reading the speed of light article. What the hell has this got to do with it?
  • To Kotniski: Your #1 JFK example seems to me to be the most well written one. Each link properly anticipates what a reader visiting that article would like to further explore.

    To Septentrionalis/PMAnderson: Tony and I have maintained a consistent position on the issue of linking dates. I note that Tony has been active here for a long time and—unlike other Wikipedia venues where self-appointed leaders run roughshod over others—he seems to carefully work to build consensus and read the prevailing mood. I think it is unfortunate that he has to put up with so much crap from various quarters; you just can’t make 100% of the people happy 100% of the time—especially in a world-wide collaborative writing environment like this.

    It’s a simple message point: To be effective, links should only be employed when they are topical and germane to the article. Period. The only links to mindless, rambling lists of pure trivia (events that happened on this date across history) should be limited to Wikipedia’s Trivia article. Readers of Wikipedia’s Trivia article might be fascinated and just happy as a clam to read the following:

[On this date in] 1600 - Tokugawa Ieyasu defeats the leaders of rival Japanese clans in the Battle of Sekigahara, which marks the beginning of the Tokugawa shogunate, who in effect rule Japan until the mid-nineteenth century.

But most everyone else reading all of Wikipedia’s other articles would just as soon stub their little toe on the foot of their bed than be forced to wade through these lists. Greg L (talk) 21:22, 25 September 2008 (UTC)
    • I like to see Tony "carefully work to build consensus"; he has done so sometimes, especially recently. I deny that it has always been his style. But this need not be continued while he abstains from riding in rough shoes, or here in any case.
    • I deny that unlinking all years, without exception, is consensus; I fail to see anyone but you and Tony support it. (LM has not argued on either side; he has merely permitted his bot to act.)
    • I also deny that linking all years is consensus; I don't see anyone support it, and arguing against it is a straw man.
    • We are therefore in the middle ground of deciding whether there is a "particular reason" to link or not. If Lightbot could tell that, it would be an AI - and Lightmouse would be wasting his time here; he should publish and buy tickets to Stockholm. Septentrionalis PMAnderson 18:43, 26 September 2008 (UTC)
  • In other words: WP:IDONTLIKEIT. Then don't read it; don't click on years, and use the back button if you do. Nobody can be forced to wade through anything. Septentrionalis PMAnderson 18:43, 26 September 2008 (UTC)
  • No, it’s not a matter of “I don’t like it.” Your response speaks straight to the heart of the issue: you don’t “get it”. It’s not about trying to force someone to wade through a mountain of trivia crap they would prefer not to read. It’s all about proper technical writing practices and not desensitizing readers to links by having articles on Wikipedia that are overlinked with wholly unnecessary ones that few—if any—readers will be interested in. What part of “To be effective, links should only be employed when they are topical and germane to the article” don’t you understand? We’re trying to get Wikipedia away from this fad that developed a couple of years ago of “if an article exists on Wikipedia to link to, then link to it.” Or is it that you do understand this fundamental point of technical writing but don’t care?

    Lightmouse’s bots are the only way to handle this issue. There are far too many date links across the entire Wikipedia project to shake a stick at and manually fixing them all is humanly impossible. He is doing the proper thing by automating the de-linking process. The (very) few false positives that get swept up in the process can easily be manually restored. As I’ve spoken of before, the judicious use of linked years in intrinsically historical articles—like French Revolution—is appropriate; such readers of history might like to see random ramblings of historical trivia. And linked dates are a no-brainer for the bot. There are only three articles across all of en.Wikipedia where dates should be linked to random rambling lists of trivia: Trivia, Boredom and Chaos theory.

    P.S. If you are going to break into the middle of my posts, please have the courtesy of setting them off in small type or something similar to make it clearer who’s writing what. Greg L (talk) 20:16, 26 September 2008 (UTC)

  • I'm vaguely amused to note that your example of an article where year links would be appropriate, French Revolution, actually has had all of its date links removed by Lightmouse. -Chunky Rice (talk) 20:44, 26 September 2008 (UTC)
  • Dates? Like when the reader clicks on a particular date in 1799 and comes to a list of mindless trivia like “[On this date in] 1960, Marilyn Monroe wears her “Saturday” panties for the fourth day in a row!”? Good riddance. Years? If properly linked so they aren’t Easter eggs and readers know the link will take them to an article showing other historical happenings of 1799, that makes sense… in a historical article, doesn’t it? These can be manually restored (after they’ve been revised so they aren’t the Easter eggs they often take the form of).

    P.S. In my 20:16, 26 September 2008 post above, I wrote as follows: “…the judicious use of linked years in intrinsically historical articles…”. Greg L (talk) 21:18, 26 September 2008 (UTC)

  • Years, like this one [1]. The event to which the year is attached is even listed on the year page. Yet it was removed, citing the MOS. I just find it amusing is all. Why even bother relinking it when someone else will just come along and indiscriminately unlink it again. -Chunky Rice (talk) 21:25, 26 September 2008 (UTC)
  • The particular example you provided doesn’t look like a particularly well-deserving or properly done link when taken in context with the rest of the sentence. But to your broader point Chunky Rice: Judicious use of properly linked years, IMO, in intrinsically historical articles would be appropriate. As to the rest of your point (“why even bother relinking it when someone else will just come along and indiscriminately unlink it again”), I think we clearly need to arrive at a broad-based consensus on how and when to link years. I personally can see where that sort of information would be of interest to those who take a fancy to history. That is what links are for: to properly anticipate what the typical reader would be interested in further exploring.

    An important issue in my mind is to keep these links from acting like Easter eggs for new readers. I remember the first time I encountered some text on Wikipedia (I was quite “new”) and the text read something like this: “But with the May 22, 1989 discovery of cold fusion”. So I clicked on the date portion, thinking I would go to an article that drilled in with keen detail on that aspect of the discovery. Of course, nothing of the sort; totally unrelated trivia. Lesson learned: don’t click on blue years and dates. Were it me adding year links to historical articles, I’d do something like this…

During the Reign of Terror phase of the French Revolution, the French people managed to lop off the heads of perhaps 12,000 victims in only the first half of 1794 (notable events of 1794).

It looks a bit ungainly for those who have long been used to just linking the year without piping, but by George, there’s no Easter egg quality to the link whatsoever; readers know precisely what to expect. This is in keeping with WP:Principle of least astonishment. And we wouldn’t even have to link every year in a historical article; after the first one, readers will have the “Ah haaa” to know about these links. Then if they encounter the year “1795”, they can just go type it into the search field. Greg L (talk) 00:50, 27 September 2008 (UTC)
Greg, I just got back from a weekend away, and I've been reading through this section. Some of your comments (and others as well, to be fair) are a little bit rhetorical and over the top, especially the condescending and faintly insulting comments about trivia, boredom and chaos theory. Do you think you could tone it down a bit? I provided some examples above (though you seem to have got me and Kotniski mixed up). Do you think we could refocus the discussion on the examples I gave? Technical writing is a skill, but there is a spectrum of opinion on how much linking is appropriate, and no one correct answer in many cases. My view is that, within reason, a little bit of overlinking or underlinking is not too bad, as long as the information is still getting across to the reader. Also, some of the points I raised are getting missed in the sturm und drang - the points I made about categories and templates, for example. Carcharoth (talk) 19:39, 28 September 2008 (UTC)
  • I’ve seen your style of argument before, Carcharoth: all the bias masquerading as ‘high-road/big picture thinker’ who employs politically correct slogans so you come across as a moderate. You dished out a metric butt-load of that tactic on the ANI against Lightmouse [2]. Lightmouse is a valued contributor on Wikipedia who earned two barnstars from his peers. He listens to other editors if his bots need tweaking and he is responsible and follows MOSNUM guidelines.

    Of course, this isn’t a criticism of you as a person; it’s just that I don’t think you made a wise “choice” of the tactic to employ over on the ANI—and here, for that matter. I will not further respond to someone who has an extreme agenda but tries to hide behind the apron strings of being a politically correct moderate. Your motives are clear. Goodbye. Greg L (talk) 17:33, 29 September 2008 (UTC)

  • Greg, just a brief response, as you've made some serious assertions here that shouldn't go unchallenged. I don't have an extreme agenda. I stand by what I said on ANI (in your link), though I think you've seriously mischaracterised what I said there (what I actually said, among other things, was that a team would handle things better than a single person and their bot). I'm going to continue to participate in this debate, and I hope you do as well. That's not a platitude, I mean that. If you want to discuss this issue of discussion style in more detail, my talk page would probably be a more appropriate place than here. Similarly, the next time I think your discussion style is disrupting a discussion, I will initially raise it on your talk page, rather than distract from the discussion. When you and I are ready to get back to discussing examples and not our styles of argument, then I'll be happy to discuss those here, as I see we have both continued to do below. Carcharoth (talk) 05:24, 30 September 2008 (UTC)

Why shouldn't all dates be linked?

I've tried to take in as much of the discussion above. But I just don't understand why all dates shouldn't be linked? What's so wrong with that? Petemyers (talk) 16:45, 28 September 2008 (UTC)

If you've taken in the discussion, I'm assuming you have already seen the personal feelings against a "sea of blue" and "links to random trivia". What's mentioned less often is that this is closely tied to an attempt at complying with our general linking guideline, WP:CONTEXT, which in a nutshell says that links not helping the reader are not only unnecessary but also unwanted. -- Jao (talk) 17:15, 28 September 2008 (UTC)
Yes, and I went to WP:CONTEXT and had a look around. I understand that insane linking is stupid and unhelpful, as it misdirects readers from high-quality links. But a date has something of an objective fact about it, that reference to a country or organisation might have. So I'd also say it would be good to link any organisation that an article mentions, even if it's a bit loose to the article's content. Why aren't dates in that kind of category? Petemyers (talk) 17:25, 28 September 2008 (UTC)
I have to agree with Petemyers here. One of the points of Wikipedia is that it provides simple cross-referencing that is easy to follow but easy to ignore. Certainly, I think the linking of years can often be helpful (and category links simply don't cut the mustard, as some posters have mentioned above), whilst day links are hardly harmful. Date autoformatting self-evidently doesn't yet work properly — I get a big mixture of formats in articles at the moment, which is far worse than seeing all dates be blue, frankly. I'd rather see link-soup than inconsistency.
At the moment, there seems (to me) to be very little in the way of consensus on this topic and it's far too soon for bots (or users) to be unlinking dates willy-nilly. I'm still not convinced I see a net advantage in unlinking dates, over the formatting and consistency benefits, personally. — User:OwenBlacker (talk) 20:56, 28 September 2008 (UTC)
It is absurd to think that a practice can be discouraged in the guideline, yet correcting articles to conform with the guideline would be unacceptable. There is doubt about whether some of the automated campaigns to fix the problem are appropriate, but I think that is more about the ability of a bot to make the changes appropriately than about the desire to remove indiscriminate linking of dates.
Another point of view is that indiscriminate date linking should be retained pending a better date autoformatting system, but I think most of the editors who commented feel that such an improved mechanism will not be created any time soon. --Gerry Ashton (talk) 21:28, 28 September 2008 (UTC)
Especially that last part is too bad. Developers are really waiting a long time with that. (True, they are all just volunteers). Still, I prefer a sea of blue links over a removal of autoformatting since I really like to see the dates in my preferred format and to stop some lame edit wars. Garion96 (talk) 21:37, 28 September 2008 (UTC)
OwenBlacker says above: "category links simply don't cut the mustard". I, for one, find category dating links useful to a certain extent, but only if they were more efficiently used in a similar manner to tagging (or keywording - ie. index term), rather than categorising or linking. Being able to tag an article about an event that took place in year X with a tag for that year, would be very helpful. Currently, this is partially done with category structures like Category:Births by year, Category:Years by country, Category:Deaths by year, Category:Disestablishments by year, Category:Establishments by year and Category:Events by year. The over-arching categories are Category:Categories by year and Category:Categories by time. Please do browse through these category structures and say what you think of them - how well do they address the need that motivates some people to link years? Carcharoth (talk) 21:46, 28 September 2008 (UTC)
(reply to Gerry). I see no contradiction between the statements that date links should not generally be added (without good reason) and that date links should not generally be removed (without good reason). Nor should anyone editing a consensus document. — Arthur Rubin (talk) 16:21, 29 September 2008 (UTC)

US military articles

While I do not think the MoS need specify under what exact circumstances DMY is appropriate for US military articles, such as this one, it probably should mention, before there's revert-warring, that such is sometimes acceptable and should NOT be switched without good reason. --JimWae (talk) 09:04, 22 September 2008 (UTC)

I agree. I've audited the dates in many many articles, which has taught me that US military editors almost always prefer their articles to be in international date formats. I respect this when I audit. The provision was in some of the earlier proposals for rewording, but appears to have been lost. Tony (talk) 11:54, 22 September 2008 (UTC)
I've restored this proviso - feel free to improve my wording.--Kotniski (talk) 12:05, 22 September 2008 (UTC)
Seems perfect to me! Tony (talk) 12:13, 22 September 2008 (UTC)
It does not seem to be true for historical articles, like Battle of Gettysburg, or even the far more recent Battle of Chosin Reservoir; but present-day should fix this. By the same token, we should perhaps mention in general that historical articles (i.e. Commonwealth historical articles) from before the introduction of the international format don't have to use it. (That's carefully phrased; we don't want some other Date Warrior going out of his way to switch Frederick North, Lord North either.) Septentrionalis PMAnderson 15:54, 22 September 2008 (UTC)
" for example, articles on present-day U.S. military topics often use day before month, in accordance with usage in that field." Anderson, I see you've added "present-day". But this seems to exclude articles about, say, WWI and WWII battleship et al, which in my considerable experience of auditing are almost entirely in international format. Don't you think "often" and "in accordance with usage in that field" provide enough lattitude? "Present-day" appears to be more restrictive than the reality out there, which I'd have thought you'd dislike. Tony (talk) 02:59, 23 September 2008 (UTC)
    • Chosin is from the Korean War, which is later. We should describe the reality, whatever it is; I don't want to encourage Date Warring in Battle of Trenton, however. Perhaps a generally will suffice. Septentrionalis PMAnderson 15:23, 23 September 2008 (UTC)
I would not recommend using a term like “present-day” as it begs that former British usage (what is now called the American style) would then inferentially become justifiable, if not altogether appropriate, which would just reopen that nasty can of worms. As long as it’s a military-related article, let the editors use what they are most comfortable with. Askari Mark (Talk) 21:46, 27 September 2008 (UTC)
  • Then you'll have disputes over articles on WW1 US battleships. How would you prevent that? Tony (talk) 02:38, 28 September 2008 (UTC)
You know, Tony, that trying to prevent all disputes is a Sisyphean task. My advice on date style usage has been consistently to let the editors work it out rather than over-prescribe. Why force US military topic editors to use what they’re less accustomed to? They’ve been doing it for several years now with very little incident. Maybe some pointy types at FA get overwrought about it – you’d know much better than I – but that’s the exception in my experience. If there truly is a need for a “present-day” qualification, then it’s really the kind of thing for WikiProjects like MILHIST and SHIP to work out for themselves; I just don’t see it as something that MOSDATE needs to prescribe a “one size fits all” rule over. In short, I'm agreeing with you that there's no need for us here to be overly prescriptive. After all, why fix what isn't broken? Askari Mark (Talk) 04:17, 28 September 2008 (UTC)
Mark, have you read Anderson's wording, which seems to fit the bill? "In certain subject areas the customary format may differ from the usual national one: for example, articles on the modern U.S. military often use day before month, in accordance with usage in that field." My italics highlight all of the words that provide latitude, within an overall wording that says "it's OK to use international in modern US military articles if editors want to" (doesn't it?). If there were ever a dispute (there hasn't been thus far in many many hundreds of articles), we'd probably just go back to the "first contributor" rule. But it seems like a cohesive, good-natured field to me, having audited many articles within it. Tony (talk) 04:34, 28 September 2008 (UTC)
Actually, it’s more the case that “articles written by modern U.S. military writers often use day before month” [changes emphasized]; it’s not unique to modern military topics. Askari Mark (Talk) 04:47, 28 September 2008 (UTC)
Sure, but it's not worth disputing on the article talk pages, is it? I'm all for letting editors decide, and in the absence of that, of going with the current format. Tony (talk) 10:06, 30 September 2008 (UTC)

Proposal for bot/script-assisted date unlinking

I've created Template:Linked dates to help editors interested in unlinking dates from large numbers of articles. If this is placed on article talk pages, I believe that it would help educate editors about the recent linked date guideline changes and give interested editors a chance to comment in advance. I'm hopeful that its use would help to avoid some of the more contentious and/or confrontational methods now being employed for unlinking dates. I put the template together in about five minutes and have absolutely no problem with it being 'edited mercilessly'. I selected a time period of two weeks because I think that's a reasonable amount of time, but would not care if it were longer. The template does not populate any maintenance categories, but it would be easy to add them in if needed. — Bellhalla (talk) 13:55, 27 September 2008 (UTC)

  • What is the point of the above post? I don’t understand. Why is the link unavailable? If this post was a tongue-in-cheek joke (I see no evidence of your having worked on such a template in your September contributions), it doesn’t really read as a joke, Bellhalla, and should be regarded as unhelpful graffiti. If you changed your mind and deleted your own template, it would be nice to give notice of your having done so, or delete or strike your post. If someone else was responsible for deleting the above template (because they disagree with what it does and it is soooo damned easy to delete someone else’s labors than actually create something), then it would be nice if they had the backbone to put a post here stating that they are the “responsible” party. If the reason is: “none of the above”, it would be nice to post a notice explaining why there is an orphaned link. Greg L (talk) 18:34, 27 September 2008 (UTC)
    • I proposed it be deleted as {{db-t2}} (incorrect representation of policy), and another admin apparently agreed. — Arthur Rubin (talk) 20:26, 27 September 2008 (UTC)
  • I see. Perhaps then, if there are no objections, I will delete this thread about ten+ hours from now? Greg L (talk) 20:59, 27 September 2008 (UTC)
    • The point of the post was a good-faith attempt at trying to avoid problems and conflict over the mass unlinking of dates that some editors are undertaking. The template was deleted under {{db-t2}} which is reserved for "blatant misrepresentations of established policy", a characterization of the template (as I wrote it, at least) that I disagree with. The draft wording that I had written was, I thought, very neutrally worded and provided a link to MOS:NUM and a link to Wikipedia:Manual of Style (dates and numbers)/Date autoformatting. The thanks you get for trying to help solve a problem… — Bellhalla (talk) 04:13, 28 September 2008 (UTC)
Bellhalla, I did this originally for 100 articles as an experiment—not a template such as this, but a pasted in section on talk, clearly headed. The silence was deafening. I have the stats for the proportion of articles in which there were positive comments "Please do it", "Fine with me", etc., which I recall were about 20%. About 6% flushed out editors who were negative about it, and who have since participated in the debate; quite a few of this hard core who took a conservative stance have since changed their tune. In the rest of the articles, no one was motivated to stop by.
I get a few queries from people who hit the MOSNUM link on edit summaries and who want reassurance; this I'm only too happy to provide. A few people are thankful enough to post nice comments, such as this and this, yesterday.
I can't see the point of this now. It's quite enough work to assist editors in this way without placing artificial hurdles in the way. Tony (talk) 04:43, 28 September 2008 (UTC)
Tony, I'm not trying to put hurdles in anyone's way at all. I'm genuinely and sincerely trying to help. Look at it this way: If your data is representative of Wikipedia as a whole, and there's a bot or script-assisted user, say UserA, that unlinks dates in, say, 500 articles in one day, there will be around 30 editors who will either revert the changes (the net result is UserA's time was wasted) or be angry (resulting in posts on talk pages, ANI, or more that end up occupying UserA's time). No one can please everybody, but use of a method like this allows editors—like you, for example—to focus on articles where their efforts are welcomed. — Bellhalla (talk) 12:24, 28 September 2008 (UTC)
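As a rough check of the arithmetic in the comment above, the figure of "around 30" follows from applying the roughly 6% negative-response rate recalled earlier in this thread to 500 articles. This is only a sketch: the function name is hypothetical, and it assumes that the rate is uniform across articles and that each objecting article corresponds to one editor.

```python
def expected_objections(articles: int, negative_rate: float) -> int:
    """Estimate how many articles draw an objection, assuming a uniform rate."""
    return round(articles * negative_rate)


# Figures quoted in the thread: about 6% negative responses, 500 articles in one day.
objections = expected_objections(500, 0.06)
print(objections)
```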

I think you have highlighted an important issue. Given the huge number of editors involved, 30 editors sounds like a drop in the ocean to me. The number of pages that have had date links reduced or eliminated is getting on for half a million. As Tony says, the statistics show that most editors say nothing, do nothing, or both. Some explicitly say they are happy about delinking. Furthermore, any pro-linking argument is worthless unless it comes up with a proposal to fix the *huge* number of systematic errors, broken autoformats, and popular misunderstandings. What are your proposals for that? Lightmouse (talk) 18:55, 28 September 2008 (UTC)

That argument would only work if you could run a counter-experiment where a bot added links and we could observe whether: "most editors say nothing, do nothing, or both". It is possible that the true conclusion would then be that most editors don't care whether dates are linked or not. That might be difficult, though, for either "side" to accept (that no-one really cares what the outcome is). Carcharoth (talk) 19:44, 28 September 2008 (UTC)
I guess that explains a lot about your approach, Lightmouse. If I'm reading what you wrote correctly, you seem to have a very cavalier attitude towards "drop-in-the-bucket" editors who might have a difference of opinion. But with that aside, what sort of "pro-linking argument" are you talking about, Lightmouse? I hope you're not talking about my proposed template. If so, can you please explain what exactly in it is "pro-linking"? I'd really like to know… — Bellhalla (talk) 20:11, 28 September 2008 (UTC)

If reluctance to wait for 100% agreement on all things at all times is cavalier, then bring on the horses. Lightmouse (talk) 20:17, 28 September 2008 (UTC)

In fact, you are nowhere near 100%. Disturbing thirty editors about anything is a remarkable achievement; doubly so since this is not one of our established and widely watched polls, like RfA or AfD. There are doubtless innumerable editors who don't care, and a few who care deeply on both sides. My own position, as often, is near the middle: we should not autoformat, we should link rarely, but we should link sometimes (and therefore that decision should not be made by a bot); but, overall, whether we do or not is not of the earthshaking importance that Tony, Lightmouse and some on the other side have been claiming. I therefore expect flack from both sides. Septentrionalis PMAnderson 20:39, 28 September 2008 (UTC)

I'd just like to point out that by my data, more than 10% of links on "core" Wikipedia articles are to dates. That's a lot. Date unlinking is a really huge change to how wikicode looks and feels: many paragraphs late in articles will lose all wikilinks, and it's going to become just that much harder for new editors to grasp both what a wikilink is and when to use one. Regardless of any other arguments (see below), and while I accept that an overwhelming consensus can certainly be reached against my views, I must say that I do not feel there has been enough discussion about this: literally tens of thousands of changes, disturbance to the normal editing process (I, at least, was hit by this, by a badly-timed bot edit making page comparisons useless in the middle of an article actively being updated), and the best argument advanced for this still seems to be "too much blue on the screen". Even with a javascript workaround, it's not a lot of work to change that colour (nevermind that if an article consists of dates and wikilinks to the degree that it's a "sea of blue", it needs reworking anyway).

Is there any way we can move quickly to change this back to a proposal under discussion before too much damage is done? Extraordinarily radical changes require either extraordinarily strong consensus or at least extraordinarily good discussion and advertising; this is probably the most radical change to the manual of style in many years, certainly in the last year or two.

In case this isn't clear, the very least this should count as is as strongly opposing making any further mass changes by bot. Even if this continues, it's absolutely essential that the bot look at recent history and process only pages that have not been edited for a while: most articles don't get edited in any 24-hour period, and that seems a reasonable cut-off for any cosmetic bot changes, not just this one, to me.

Again, more than 10% of links. Extraordinary change. No advertising, and consensus that seems, at least now, extraordinarily weak.

RandomP (talk) 09:41, 1 October 2008 (UTC)
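The 24-hour cut-off suggested above could be sketched roughly as follows. This is not how any existing bot works; the function name, the page titles and the timestamps are all hypothetical, and a real bot would have to obtain the last-edit time from the page history.

```python
from datetime import datetime, timedelta, timezone


def is_quiet(last_edit: datetime, now: datetime,
             cutoff: timedelta = timedelta(hours=24)) -> bool:
    """Return True if the page has gone unedited for at least `cutoff`."""
    return now - last_edit >= cutoff


# Hypothetical pages with made-up last-edit timestamps.
now = datetime(2008, 10, 1, 9, 41, tzinfo=timezone.utc)
pages = {
    "Active article": datetime(2008, 10, 1, 3, 0, tzinfo=timezone.utc),
    "Dormant article": datetime(2008, 9, 28, 12, 0, tzinfo=timezone.utc),
}

# Only pages that have been quiet for the whole cut-off period are processed.
to_process = [title for title, ts in pages.items() if is_quiet(ts, now)]
print(to_process)
```

Under this rule a cosmetic bot edit would never land in the middle of an article that is actively being updated, at the cost of deferring those pages to a later run.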

  • Ten percent? That's far too much crowding out of useful links. Who invented this binary structure of "core" and "non-core", anyway? Tony (talk) 07:13, 2 October 2008 (UTC)

Enforcement of MOS:UNLINKDATES

Understanding that I could be trout slapped for this: I've noticed that enforcement of MOS:UNLINKDATES has begun but I can find no consensus on the policy. The main article that should support it is just an essay. Is this being discussed somewhere where the conversation is not forked to the point that it is incomprehensible? -- Mufka (u) (t) (c) 13:05, 24 September 2008 (UTC)

See the footnote at the bottom of MOSNUM.--Kotniski (talk) 14:21, 24 September 2008 (UTC)
That's the discussion where there is no clear consensus. And the new discussion is forked into uselessness. -- Mufka (u) (t) (c) 15:29, 24 September 2008 (UTC)
While there's enforcement of a type at FAC and FLC, I see no one at either place raising a finger in resistance—perhaps they're keen for a professional look to their nominations. More broadly, to call the script-assisted auditing of articles "enforcement" sounds a little like spin. Most people see it rather as a service to editors, sparing them the manual labour of removing the square brackets while improving the look of their text and the exposure of their high-value links. Having decided that date autoformatting is undesirable (link available to wide enthusiastic support on request), it's hard to see why people would ever object to date auditing that involves the unblueing of dates, particularly when it involves:
  • the correction of faulty DF syntax;
  • the ironing out of inconsistencies (widespread, I'm afraid); and
  • the correction of wrong choices of DF for whole articles (also surprisingly common; today, NASA was not in US formatting, and neither were several articles on US performers—and vice versa).

This kind of clean-up is long overdue. Tony (talk) 15:51, 24 September 2008 (UTC)

There is no consensus that it's appropriate. Silence does not indicate consent, and it seems possible that the consensus that autolinking is depreciated (not deprecated) might be revisited. — Arthur Rubin (talk) 16:37, 24 September 2008 (UTC)
Actually, silence does imply consent; WP:CONSENSUS is quite clear on that, and says so in its very second sentence! — SMcCandlish [talk] [cont] ‹(-¿-)› 05:07, 4 October 2008 (UTC)
My point is that if someone is citing a policy when they make a cleanup edit, it stands to reason that the policy should exist in a valid form (i.e consensus). I don't have a penguin in this race, but it seems like the way things are, the situation is ripe for edit warring. Cleanup is good, but what if someone objects and reverts - where's the apparatus to remedy that? I'd love to see a bot steamroll every article and iron out inconsistencies. But alas, it can't be done without consensus. -- Mufka (u) (t) (c) 18:20, 24 September 2008 (UTC)
Please point to this rash of edit-warring; links? Tony (talk) 02:46, 25 September 2008 (UTC)
My concern is in the possibility for edit warring because of lack of consensus, not a current rash of edit warring. Good fences make good neighbors. -- Mufka (u) (t) (c) 13:13, 25 September 2008 (UTC)

I have nominated MOS:UNLINKDATES and MOS:UNLINKYEARS for deletion as they do not reflect consensus. Corvus cornix talk 02:29, 25 September 2008 (UTC)

That is absurd. The community has clearly decided that date autoformatting (if that's what you're referring to) is undesirable. Why is a completely separate imprimatur required to implement what the Manual of Style says? So, shall we start requests for consensus to enable people to correct (in whatever way) every single clause in MOSNUM and MoS? Over to you: it will take a lot of headings; we're waiting ... Tony (talk) 02:44, 25 September 2008 (UTC)
Where does the MOS say to delete links? Corvus cornix talk 02:51, 25 September 2008 (UTC)
Corvus cornix, your position amounts to "Nobody should do it, but if somebody does it, nobody should undo it". I'm pretty good at reading even very intricate maps with lots of twists and turns, but I am having a very difficult time trying to fathom how that position could possibly seem at all logically defensible to you. I think you face a very difficult sale here (and in your deletion proposals), but perhaps I'll be proved wrong. —Scheinwerfermann (talk) 03:10, 25 September 2008 (UTC)
Corvus says: "Where does the MOS say to delete links?" Two things: (1) where does it say not to do so? It would be a peculiar state of affairs if the MOS deprecated something and banned people from removing instances of it to comply with the deprecation. (2) Where does it say to delete the "p.m." from the 24-hour time "14:45 p.m."? Does it have to say explicitly that this should be done, as well as saying "24-hour clock times have no a.m., p.m., noon or midnight suffix." What you propose is a revolution in the writing and interpretation of our style guides. I'd like to see you post a proposal for this sudden necessity to double-up every single provision with an explicit "OK, do it" clause.
More likely, this is just another case of I don't like it. I'd prefer to engage with you as to the benefits of the change rather than waste time contemplating an entirely new double-up speak for our style guides (and policy pages, indeed). We've had remarkable success in convincing people who are initially cautious about the benefits of ridding WP of this cancer. You're welcome to continue the conversation on our talk pages, where it won't clutter more important business on this page. Tony (talk) 04:07, 25 September 2008 (UTC)
MOS:UNLINKDATES as a shortcut name is a kind of double-speak itself. The general question of unlinking dates is entirely separate from the autoformatting issue. The shortcuts should be more specific (off the top of my head I can't think of one, but I'm sure somebody can) so that people can continue the good work of unlinking the "deprecated" autoformatting-type dates while remaining aware that there is as yet no consensus for unlinking all dates. Scolaire (talk) 07:36, 25 September 2008 (UTC)
It seems reasonable that calling something UNLINKDATES would lead to the misunderstanding that policy exists that calls for the unlinking of existing dates. Is this what we want? -- Mufka (u) (t) (c) 13:13, 25 September 2008 (UTC)

There's nothing "double" about the shortcut. The community, after long and detailed discussion, has made a decision that has wide support. You happen to question it, which is fine, but I think your strategy is disruptive and your reasoning circular, as Schein points out above. What is double-speak is the notion that we should have a strong guideline based on consensus, but balk at implementing it. Bizarre. Tony (talk) 13:55, 25 September 2008 (UTC)

Absolutely wrong. There's consensus that dates (day-of-year) should not generally be linked, because of the autoformatting. There is now a consensus established, at your (probably improperly introduced) RfC, that dates should be unlinked. There's no consensus established for years. I'm not even sure there's consensus that years should not be linked without a particular reason, only that years should not be linked without a reason. — Arthur Rubin (talk) 15:19, 25 September 2008 (UTC)
Where is this RfC? -- Mufka (u) (t) (c) 15:27, 25 September 2008 (UTC)
Wikipedia:Requests for comment/Tony1 — Arthur Rubin (talk) 15:53, 25 September 2008 (UTC)
I read that RfC conclusion as saying that Tony1 was not going against procedure by de-linking, but I'm still looking for the place where it was decided that dates should not be linked, without reference to auto-formatting. If there is a consensus other than the AF consensus, why is it so d****d hard to link to it? Scolaire (talk) 16:56, 25 September 2008 (UTC)
MOS:UNLINKYEARS and MOS:SYL are much, much older than the recent MOS:UNLINKDATES. This MOS had a stable "Date elements that do not contain both a day number and a month should not generally be linked; for example, solitary months, solitary days of the week, solitary years, decades, centuries, and month and year combinations. Such links should not be used unless following the link would genuinely help the reader understand the topic more fully; see WP:CONTEXT." clause for a long time, only removed a month ago (presumably because it wouldn't be needed with the new, more general date unlinking). -- Jao (talk) 10:48, 28 September 2008 (UTC)

As far as I can tell, the vast majority of dates and years should be unlinked. I'm fine with that. But as far as I can tell, there's certainly no consensus that no dates or years should ever be linked. Is that correct? -Chunky Rice (talk) 00:46, 26 September 2008 (UTC)

Your understanding matches mine. TTBOMK nobody advocates a blanket prohibition on linking dates or years. The MOS entry says dates should not be linked, unless there is a particular reason to do so. —Scheinwerfermann (talk) 13:54, 26 September 2008 (UTC)

Please re-markup dates

It doesn't have to be as links, but leaving dates in the wikicode as "plain text" makes life extremely hard for any automated parsing of those; plain text should be plain text, and dates are, I think, still being reformatted.

Seriously, this is, IMHO, an incredibly wrong thing to do: plain text should be passed through unchanged all the way to HTML; only marked-up sections are changed. Plain-text-like dates just don't fit nicely into that: the markup isn't recognisable, it's unclear what the effect of <nowiki> would be, and we lose a lot of those links that are most useful, to me, at least, in parsing articles.

Can we please find some way of leaving in that useful information? I'm sure a date-matching regexp would run into things that look like dates but aren't meant as such quite often. Template syntax would work, and the main criticism of the Blue Sea would be avoided by, uh, not colouring date links blue. 5-minute change to the code, avoid millions of changes to the articles?

RandomP (talk) 20:33, 29 September 2008 (UTC)
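The concern above about a date-matching regexp catching things that merely look like dates can be illustrated with a deliberately naive pattern. This is a sketch only: the pattern, the function name and the sample sentence (including the album title) are invented for illustration and are not taken from any bot or script discussed here.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Naive pattern for "3 March 1990" or "March 3, 1990" in plain text.
DATE_RE = re.compile(
    rf"\b(?:\d{{1,2}} (?:{MONTHS}) \d{{4}}|(?:{MONTHS}) \d{{1,2}}, \d{{4}})\b"
)


def find_dates(text: str) -> list[str]:
    """Return every substring that looks like a full date."""
    return DATE_RE.findall(text)


# Hypothetical text: the second match is a title, not a date to be reformatted.
sample = ('The meeting took place on 3 March 1990. '
          'The album "3 March 1990" was released later.')
print(find_dates(sample))
```

The pattern cannot tell the genuine date from the quoted title, which is the kind of ambiguity that explicit markup would remove.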

I oppose the concept of a "5 minute change to the code". Quick and dirty code changes are what got us into this mess to begin with. I think proper date markup would include the following elements:
  • The ability to set defaults by article, for example,
    • All dates in a certain article are AD but "AD" is not to be displayed
    • Any date after 14 September 1582 in a certain article is Gregorian; earlier dates are Julian.
    • For a certain article, dates are displayed in the format day-month-year.
  • The ability to override the article defaults for a particular date.
It seems to me that when half-baked "solutions" are implemented, people go off and write software that depends on the "solution", that software gets garbage as input, and produces garbage as output. --Gerry Ashton (talk) 20:56, 29 September 2008 (UTC)
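One way to picture the requirements listed above (article-wide defaults plus a per-date override) is the sketch below. All class, field and function names are hypothetical, and the Julian/Gregorian cut-over is left out to keep the example short; it shows only the era-display and display-format defaults.

```python
from dataclasses import dataclass
from typing import Optional

MONTH_NAMES = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]


@dataclass(frozen=True)
class ArticleDefaults:
    era: str = "AD"          # era assumed for every date in the article
    show_era: bool = False   # whether the era is displayed
    fmt: str = "dmy"         # "dmy" or "mdy"


@dataclass(frozen=True)
class MarkedDate:
    year: int
    month: int
    day: int
    fmt: Optional[str] = None        # per-date override of the display format
    show_era: Optional[bool] = None  # per-date override of era display


def render(d: MarkedDate, defaults: ArticleDefaults) -> str:
    """Render a date, falling back to the article defaults where no override is set."""
    fmt = d.fmt if d.fmt is not None else defaults.fmt
    show_era = d.show_era if d.show_era is not None else defaults.show_era
    month = MONTH_NAMES[d.month - 1]
    text = f"{d.day} {month} {d.year}" if fmt == "dmy" else f"{month} {d.day}, {d.year}"
    return f"{text} {defaults.era}" if show_era else text


defaults = ArticleDefaults()
print(render(MarkedDate(2008, 9, 29), defaults))
print(render(MarkedDate(2008, 9, 29, fmt="mdy"), defaults))
```

With defaults held in one place, most dates need only year, month and day; the override fields are used for the exceptions.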
A template solution works, save for having a single point where the date format can be set in an article, though this can be easily changed by a regex tool (AWB). The only qualification is that the Gregorian/Julian date issue cannot be handled without major template programming; we need to make sure that dates, when entered, follow the above requirement. --MASEM 22:41, 29 September 2008 (UTC)
I would support (and use) a proper date markup system (for the metadata as RandomP says), but I'm unclear why article defaults are needed. The most editors will want to do is either explain what the date is in plain text (ie. write "Julian" or "Gregorian" or "AD" or "BC" somewhere) or use a date template. New editors will invariably use plain text and any date markup system needs to be simple to use, otherwise people just won't use it, or will be discouraged from editing. Article defaults sounds like an extra option that might just make the system a bit less usable for some. Articles that use both Gregorian and Julian dates could have a standard template warning that the article uses both Gregorian and Julian dates. I'm presuming here that the Gregorian/Julian date issue only affects a small proportion of articles, though still a large absolute number. Also, there are more date systems than just the BC/AD and Julian/Gregorian ones - any date markup system would have to incorporate those. Someone mentioned an ISO standard above - would it not make sense to pick an up-to-date standard and work with that? Carcharoth (talk) 04:31, 30 September 2008 (UTC)
If there are no article defaults, then there must be encyclopedia-wide defaults. In the absence of any defaults, then we must specify everything for every date, and the markup for today might look like {{adate|2008|9|30|AD|disp-era=n|cal=g|fmt=dmy}} which is too much typing.
Also, since we have no effective mechanism to reach an agreement with our readers to use years before 1583, we cannot use ISO 8601 for any year before 1583 or after 9999. Even if we could reach such an agreement, we would always have to use the Gregorian calendar with that standard. That being the case, I have more use for a pile of horse manure than I do for ISO 8601 dates. (This statement is the literal truth.) --Gerry Ashton (talk) 05:02, 30 September 2008 (UTC)
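The idea of article-level defaults with per-date overrides, as described in this thread, can be illustrated with a minimal Python sketch. It is not an existing MediaWiki feature or template: the names `DateStyle` and `render` are hypothetical, and the fields only loosely mirror the `fmt`, `disp-era` and `cal` parameters of the {{adate}} example. The calendar value is carried along but no Julian/Gregorian conversion is attempted.

```python
from dataclasses import dataclass, replace

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


@dataclass(frozen=True)
class DateStyle:
    """Hypothetical per-article defaults; every field can be overridden per date."""
    fmt: str = "dmy"             # "dmy" or "mdy"
    show_era: bool = False       # whether to display the era suffix
    era: str = "AD"
    calendar: str = "gregorian"  # recorded only; no conversion is done here


def render(year, month, day, defaults, **overrides):
    """Render one date using the article defaults plus any per-date overrides."""
    style = replace(defaults, **overrides)
    name = MONTHS[month - 1]
    if style.fmt == "dmy":
        text = f"{day} {name} {year}"
    else:
        text = f"{name} {day}, {year}"
    if style.show_era:
        text += f" {style.era}"
    return text


# Set once per article, so that most dates need only year, month and day.
article_defaults = DateStyle(fmt="dmy")

print(render(2008, 9, 30, article_defaults))
print(render(2008, 9, 30, article_defaults, fmt="mdy"))
```

With defaults set once, the common case needs three arguments; only the unusual date has to spell out an override.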
I would like to support keeping date markup as well. From the discussions above and elsewhere it's clear that no consensus for conversion to plain text exists. Perhaps not enough time was given in the poll, or perhaps not a broad enough audience was sought, or perhaps polling doesn't work in the first place, I don't know, but I think the version as it stands now is controversial to begin with and will make changing over to specific date formatting in future MediaWiki versions as well as manipulating dates from user scripts significantly harder. It also ignores the reader's date format preference, even when specifically set, forcing the editor's preference down the reader's throat. Shinobu (talk) 13:14, 30 September 2008 (UTC)

I try to stay out of these continued repetitive debates about date linking out of concern for my sanity, but I can't help feeling what's being proposed is not very good solutions to non-existent problems. Parsers ought to be able to recognise dates when they see them to at least the accuracy you can ever achieve by extracting anything from WP. Primarily WP is written by humans for humans, and we shouldn't be making extra work for editors and complications for readers in order to make some theoretical parsers work better. It would be more helpful for automatic text analysis to write other types of information in a standard way - something like {{capitalof|Paris|France}} would generate "Paris is the capital of France" - but that's clearly not going to happen, and nor should we adopt a similar unnatural policy with regard to dates. Let people type what they want readers to read - with links if there's some real reason for those links, but usually there isn't - and worry about more important things.--Kotniski (talk) 13:41, 30 September 2008 (UTC)

First off, the parsers aren't theoretical: I have one. Second, we already do have the information you gave as an example (Paris being the capital of France) in an easily-extractable format: templates.
Most importantly, though, if you let "humans" write dates in the variety of formats they use in plain text, you end up with an unrecognisable and unusable mess. There simply is no option of not having a standard form for English dates, because the most common standard forms, month/day and day/month, are often ambiguous. If we're going to have a standard form, why not throw in template markup to make people realise they're writing in a restricted syntax?
RandomP (talk) 14:57, 30 September 2008 (UTC)
How are "30 September 2008" and "September 30, 2008" ambiguous? It's not like we're advocating numeric dates, we spell out the month. -- Jao (talk) 19:55, 30 September 2008 (UTC)

If I may summarise:

  1. clearly, no consensus exists for ripping out all date markup the way bots currently are doing
  2. even Wikipedia software, in expanding ~~~~, generates dates that are not of the form the mythical perfect parser would have to recognise
  3. that parser would have to know the names of the months at least in English (including the common English word "may", the name "april", the somewhat common word "march", and the occasionally-used adjective "august")
  4. the parser would have to guess whether numbers are year numbers or not, including small numbers. ([[70]] appears surprisingly often in historical articles, for example, and is probably currently being changed to an unrecognisable instance of "70").
  5. as evidenced by ~~~~ expansion above, the parser would also have to detect incorrect date formats: those are already quite common, even with redlinks to highlight them. Without redlinks, I see very little to stop a free-for-all as far as date formatting is concerned.
  6. year links are actually useful in some instances: 70, 1066, 1453, 1945, September 11 are all occasionally used as shorthand for the event they are identified with. Changing those exceptionally-useful links to plain text that the reader can't even click on, in the standard WP reaction of "this is a term I don't understand, and it's blue, so I'll click it to find out what it means", seems problematic to me.

I propose a solution that's about as simple as it gets for users: [[April 25]], [[2009]] should become {{April 25, 2009}}, a template which usually expands to "April 25, 2009" without any highlighting. No loss of information in wikicode, a couple of thousand template pages would have to be created, we retain warnings for incorrect date formats, and it'll remain relatively easy to extract metadata, either by CSS/JS or by an automated parser. Furthermore, I suggest that ~~~~ start using that template as well.

Dates are not plain text: depending on dialect, at least one of the numbers in them is usually read differently from the standard reading of that number, for example, and many speakers find it more natural to reverse order as well. While the things you can do with a date in an encyclopaedia are somewhat limited, wikicode should be universal: universal for all languages (so "May" shouldn't be a magic word that triggers the parser into attempting to identify it as a date), but also universal for what it's used for: a business wiki used for scheduling meetings would probably want to use dates as metadata quite extensively.

Quotes are going to contain unusual date formats in English and unrecognisable ones in other languages: not having any way at all to mark those up seems an unnecessary loss. Usage examples themselves are another source of strings that would look like dates but aren't meant as such.

Lastly, maybe it's worth remembering that it was the change from implicit, plaintext-like links (based on capitalisation) to explicit marked-up links that allowed Wikipedia to become readable, multilingual, and successful. Going the other way for no better reason than "we find it too difficult to assign a subtle link class to links to very-frequently-linked-to articles and there's too much blue on my screen" seems rather ill-advised to me.

RandomP (talk) 14:57, 30 September 2008 (UTC)

Quite - The counter-suggestions seem to do nothing but make the wiki less useful for purely arbitrary style reasons or to provide new ways to obfuscate the dates so badly that I was, prior to your post, about to propose that we date everything prior to 1970 in negative Unix time. Please don't, though. Please? MrZaiustalk 15:33, 30 September 2008 (UTC)
After reading the above, and going back through the various proposals and what seems like a very weak consensus, I'd now be in favour of restoring the auto-formatting of dates (via the user's preferences), and by association, the linking of date/month and year pairs. - fchd (talk) 18:58, 30 September 2008 (UTC)
Let me also say that going back to the status quo ante is perfectly okay with me. Auto-format dates, including making them black rather than blue if that's what the user wants. But there is no strongly-defended consensus for what is currently happening, at least in this thread, so far. RandomP (talk) 19:35, 30 September 2008 (UTC)
I think basically because most people have got fed up talking about this over and over and over again, so the only people left are those with an axe to grind or who missed the previous (very extensive) discussions. Autoformatting never worked and will never work for the vast majority of our readership, and can hardly be said to provide any significant benefit to those who used it, so it should be ignored as a red herring. If you type linked dates in the old autoformatted style but then ask the software not to link them, it means editors are prevented from linking dates in those less usual instances when they might wish to. OK if you really think it will be useful you can make a date template as has been suggested already, something like {{d|2001|9|11}} with an extra optional parameter for format, and then existing autoformat-linked dates could be converted to that. But asking human editors to use such unnatural constructions when they can perfectly well just write out dates directly seems entirely unreasonable. Parsers ought to be able to pick up dates anyway with a very low error rate (the occasional phrase like "On December 1, 2008 people were killed"; but considering the density of factual errors already present in WP the significance of that kind of thing must be negligible).--Kotniski (talk) 09:02, 1 October 2008 (UTC)
My guess is most people haven't heard about the change. It's a huge and radical change, affecting more than 10% of links in my "core" set. If you want radical change, but don't feel strongly enough to defend it a couple of days after people hear about it first, that's probably a very good indication you shouldn't go ahead with it.
We're not asking human editors to change what they're used to: we're sending out bots to go over the markup human editors already put in, removing it, rendering dates hardly-recognisable plaintext.
It's interesting how you appear to be simultaneously saying that dates should be "natural", which I take to mean "in natural language", and easily picked out by parsers. Sorry, but parsing natural language is hard. For a start, it requires you to know the language you're parsing, while my parser is currently perfectly fine all by itself looking at the French Wikipedia, and I can never remember what they call February ...
"Existing" date links cannot be converted to anything anymore, because many of them have already been destroyed. If your point is that "most people" were aware of that, shouldn't it surprise you that you just implied it hadn't happened? If you weren't aware of the bot-assisted markup deletion, clearly it wasn't discussed enough.
Have a look at articles relating to 1st-century history: a lot of numbers occur with no context whatsoever to identify them easily as years, and are now lost. That that's a relatively small proportion of date links makes it less bad, but what makes it worse is that it's also a systematically biased subset. I see no way to distinguish "2000" as a common product name component in the 1990s, "year 2000" as a date reference, and "year 2000" as part of a phrase relating, say, to the Y2K problem.
It's not an easy problem to solve with a parser. It's an easy problem to catch, maybe, 80% of links, which still means I'd have to spend several days looking at numbers in my data set and deciding whether they're years to get back to the quality of information I have with year links.
RandomP (talk) 10:17, 1 October 2008 (UTC)
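The difficulty with lone years such as "70" can be shown with a small Python sketch. It is only an illustration under stated assumptions: the two regular expressions and the function names are hypothetical, the sample sentences are invented, and this is not the parser mentioned in the comment above.

```python
import re

# A linked year is unambiguous in the wikicode: [[70]], [[1066]], ...
YEAR_LINK = re.compile(r"\[\[(\d{1,4})\]\]")

# A bare number of one to four digits that is not inside link brackets
# and not part of a longer word or number.
BARE_NUMBER = re.compile(r"(?<![\w\[])(\d{1,4})(?![\w\]])")


def linked_years(wikitext):
    """Years that an editor explicitly marked up as links."""
    return [int(m) for m in YEAR_LINK.findall(wikitext)]


def bare_numbers(wikitext):
    """Candidate years found in plain text; nothing says which ones are years."""
    return [int(m) for m in BARE_NUMBER.findall(wikitext)]


# Invented sample sentences, before and after delinking.
linked = "Jerusalem fell in [[70]]; the list below has 70 entries."
delinked = "Jerusalem fell in 70; the list below has 70 entries."

print(linked_years(linked))    # the marked-up year only
print(bare_numbers(delinked))  # both numbers, indistinguishable
```

With the link in place the year is recovered exactly. After delinking, both occurrences of "70" look the same to the pattern, so telling them apart would need language-specific context rules.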

Just a few points:

  1. With the {{April 25, 2009}} solution, are you suggesting that we create these templates separately for all dates in history (and future) about which something can conceivably be known? (We also need a line of {{25 April 2009}} templates.) I'm not saying it can't be done or will cause problems, just pointing out that that's not a couple of thousand templates, but more like a couple of million. Would we also want {{April 2009}} and {{2009}}? What if an editor is writing a plot synopsis for a novel that takes place on April 25, 6009, and nobody had anticipated that? Should that editor create that template?
  2. If the markup is invisible to editors (which, for better or worse, are human), then how will we know which dates are correctly marked and which ones need to be fixed?
  3. "even Wikipedia software, in expanding ~~~~, generates dates that are not of the form the mythical perfect parser would have to recognise", I don't understand? If the parser were capable of recognizing non-marked dates, why shouldn't it recognise "30 September 2008" as a date? That's as regular as it gets.
  4. "year links are actually useful in some instances: 70, 1066, 1453, 1945, September 11 are all occasionally used as shorthand for the event they are identified with", this I definitely agree with, and nobody is suggesting the delinking of these. True though, sometimes mass delinking catches too much, and that is a problem.

Also, could you tell us a little more about how you are using your parser? I understand that this proposal is not about that specific example, but a real example of how automatic date detection is useful to a real person would probably make us understand your points better. -- Jao (talk) 19:55, 30 September 2008 (UTC)

If you are just parsing articles for birth and death dates, that's what the {{persondata}} template is for. What other need do you have for "automated parsing" of dates? Kaldari (talk) 19:58, 30 September 2008 (UTC)
I think he is parsing articles for more than just birth and death dates. As for persondata, that is currently used on between 25,000 and 30,000 articles, which is only a small fraction of the biographical articles (though hopefully it is used on the more "notable" articles). Looking here, we see that there are 542,884 articles with the {{WPBiography}} template (though that does, for historical reasons, include music groups and other 'group' biography articles). Using the handy category totals (remember that these include the odd template or Wikipedia namespace pages), we see that Category:Living people has 306,001 articles, Category:Biography articles without listas parameter (an approximate equivalent, assuming people keep the listas and DEFAULTSORT parameters usage synchronised [a big if], to the number of articles lacking DEFAULTSORT) has 332,367 articles (if someone can tell directly how many biographical articles lack DEFAULTSORT, that would be great, but I've tried asking and no-one seems to know how to do this). Unfortunately, it is not possible to do the same calculations for Category:Deaths by year and Category:Births by year because the articles are split up. If anyone could provide a total figure for how many articles have birth and death categories on them that would be wonderful. Ultimately, this would lead to a list of those biographical articles lacking birth and death categories. It's a lot of number/list/category-crunching, but would almost certainly produce something as useful as delinking or linking birth or death years. Is Lightmouse reading this? Could his bot be used to do this, or should I do a separate bot request? [I'll leave a note on his talk page anyway - done]. Carcharoth (talk) 23:57, 30 September 2008 (UTC)
Thanks for the notes. You are right that the ~~~~ date format is indeed listed as a correct date format now; on the one hand, that invalidates my point about even that getting it wrong. On the other hand, that makes parsing, again, even harder.
I've done a number of things that I've used year links for; let me stay with a simple example, which was simply to select a list of "core" articles in order of core-i-ness, if you will, select the years, and see how they're distributed: this worked because core-i-ness, as it happened, used links. That experiment surprised me, in that 70 showed up way before some years even, if I recall correctly, in the 17th or 18th century.
With the rules as currently written on the project page (but not yet implemented widely), that experiment would just fail. In fact, it's extremely unlikely a parser would treat a lone appearance of "70" as a reference to a year at all, because it's such a small number; indeed, I probably wouldn't even have thought of that, and missed the surprise.
I realise that my personal little experiments are hardly an incredibly strong argument; but then, is "I don't like to see too much blue" one? Furthermore, shouldn't it count for something that changing the colour of links to dates is a simple exercise in javascript, would solve what appears to be the main (visual) argument against date-linking; while, if date linking is abolished, I need to teach my parser to dig into plain text, learn the English (and French, and German, and Italian, ad infinitum) rules for date formatting, and still lose out on corner cases.
I did not want the markup to be invisible to editors. In fact, as I've said before, I think it's extremely valuable in that someone just hitting "edit this page" gets a good demonstration of what a wikilink is: a standard term which there is more information about, such as a year number, enclosed in [[]].
And, indeed, it doesn't really count for much that some very few date links are still available (for a while, at least, until the plaintext nature of the new-style dates leads to editors using just any date format, including ambiguous and unparsable ones) through templates. I'm in no way restricted to birth/death dates.
RandomP (talk) 10:01, 1 October 2008 (UTC)
I could add that I'm tiring of fixing the syntax errors in date autoformatting: they are plentiful, and solved during my date audits. Tony (talk) 10:51, 1 October 2008 (UTC)
(reply to Random after ec) I'm not sure I understand all your points (but "I don't like to see too much blue" is a strong argument when it's the long-held opinion of the Wikipedia community at large - see WP:OVERLINK and various ancient discussions on that topic). However the recent change in the guidelines about linking related only to day-month or day-month-year dates, not to solitary years, which have long been subject to delinking anyway (by hand or by bot). I appreciate it's hard to pick out a solitary 70 as a year, but in the context that concerns us here, namely 1 June 70 or June 1, 70, it ought to be much easier. In any case, Wikipedia isn't written for parsers, and dates are in no way a special case of information that could be made more parsable if marked up by hand.--Kotniski (talk) 10:54, 1 October 2008 (UTC)
Re invisible to editors: if it doesn't show in the HTML, it will be invisible to editors. Granted, if an editor spots something else to change in an article, he will click "edit this page" and then he might see date markup problems, but that's a big "if" and a big "might". That's exactly how raw date formats were invisible to editors when most dates were autoformatted and most(?) editors had a preference set. -- Jao (talk) 15:13, 1 October 2008 (UTC)
All that the presence of a date in an article tells you is that something (could be anything) related (in any possible way) to the subject of the article happened on that date. That's pretty low-value metadata, definitely not worth marking up all dates just for that. Colonies Chris (talk) 12:43, 1 October 2008 (UTC)

Let me summarise again: Please re-markup dates. It doesn't have to be as blue links; a template (such as a fixed Template:Date) is fine. Structured dates are parsable, and while some people think they're of low value, others (or "other", if it's just me) think they are of significant interest. There is no strong consensus for, as is currently happening, removing markup that adds value, losing information and prohibiting editors from adding markup or templates that add value, which would add information.

I'm perfectly okay with a policy that says you don't need to bother marking-up dates that you add to an article; someone else, who cares, will eventually be along to do it.

However, unstructured dates also lead people to think that just any date format is okay. This might be part of the reason that ambiguous dates are being added to Wikipedia (I didn't go out looking for that one, just stumbled over it because the template also isn't closed).

Don't want them blue? okay, let's make that the template's default behaviour. Want them blue and automatic weekday calculation? Write a little javascript.

But let's keep useful (even marginally useful, if you wish) information in the wikicode, even if it doesn't show up in the HTML. Let's stop bots from removing useful information for superficial reasons.

At this point, it's not even clear to me whether the bots are still removing markup, or have finished doing so, or what. If you're going to radically change Wikipedia by removing every tenth link, wouldn't it make sense to at least provide updates about how that's going on the talk page of the policy that was changed?

RandomP (talk) 20:58, 1 October 2008 (UTC)

RandomP, just go ahead and revert any of these changes you disagree with. The people currently unlinking dates have no authority to do so, and there certainly isn't consensus for their actions. Your preference to keep them (which several others share, including me) is just as valid. --UC_Bill (talk) 20:32, 3 October 2008 (UTC)
Comment: replacing linked dates with templated dates would be fine with me, although I don't think creating a separate template for every date is practical. However, a template that takes a date as an argument that gives something like <span class=date>...</span> would work, don't you think? That still allows people to change their date format (if ambiguous formats are properly tagged) or even link them using a user script or gadget. Alternatively, we could do something like {{October|3|2008}}, that looks sort of normal, whatever that is, and only requires twelve templates. Even a more logical format like {{3|October|2008}} would be possible, the number templates don't do anything useful anyway, except {{24}} which would have to be renamed. But all in all I still think a single template would be more practical. Shinobu (talk) 01:28, 4 October 2008 (UTC)
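The single-template idea, one template that takes the date as arguments and emits a classed span, might look roughly like the following Python sketch. The function name `date_span` and the `data-*` attributes are assumptions added for illustration; the suggestion above specifies only a span with class "date".

```python
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def date_span(year, month, day, fmt="dmy"):
    """Render a date as visible text wrapped in a machine-readable span.

    The data-* attributes are a hypothetical way to keep the components
    available to user scripts, gadgets, CSS selectors or external parsers.
    """
    year, month, day = int(year), int(month), int(day)
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    name = MONTHS[month - 1]
    if fmt == "dmy":
        text = f"{day} {name} {year}"
    else:
        text = f"{name} {day}, {year}"
    return (f'<span class="date" data-year="{year}" data-month="{month}" '
            f'data-day="{day}">{text}</span>')


print(date_span(2008, 10, 3))
print(date_span(2008, 10, 3, fmt="mdy"))
```

Because the components stay in the markup, a user script could reformat or link the date without having to guess at plain text, and one template serves every date.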

WP:Update

Quick questions to help me finish WP:Update.

  • this page used to say to add an &nbsp; on the left of an en-dash; now it says to do it only if "necessary for comprehension". WP:MOS says not to do it, so one or the other should change. When is it necessary for comprehension, and can't we just write the devs asking them to break lines in front of en-dashes? - Dan Dank55 (send/receive) 22:12, 1 October 2008 (UTC)
    • My view is that we should put in a hardspace only if the line does break and someone finds the line beginning with endash hard to follow. One point here is that the line will rarely break before the dash; normally the dash and space will fit on the higher line. If somebody sees a given break as a problem, then we should clarify. Septentrionalis PMAnderson 22:30, 2 October 2008 (UTC)
Um, we can't actually determine if the line does break, since that is dependent upon monitor resolution, font size, window width, and (notably) browser behavior. The only case where we could be absolutely certain is when some form of no-wrap code ({{nowrap}} or manual coding) is applied around the passage in question. — SMcCandlish [talk] [cont] ‹(-¿-)› 04:47, 4 October 2008 (UTC)
Dan: Frankly, I don't really care either way, but MOS trumps sub-MOS pages, so go with what MOS says unless/until it says something different. — SMcCandlish [talk] [cont] ‹(-¿-)› 04:47, 4 October 2008 (UTC)
  • Can someone summarize where the IEC prefix debate is and whether it's likely to change? - Dan Dank55 (send/receive) 22:48, 1 October 2008 (UTC)
As to your second bullet: There is no active debate that I am aware of (although, your question and my response may change all that). Editors shall not use the IEC prefixes—kibibytes (KiB) and mebibits (Mibit or Mib)—to routinely denote the capacity of computer storage. The only exception is for articles that directly discuss the IEC prefixes, in which case, they may be used as examples to illustrate the concept. As to whether this policy is likely to change: When the rest of the computing world sees the wisdom of the IEC prefixes and adopts the IEC’s proposal, then most computing magazines will start using them. That’s when Wikipedia should follow suit. Greg L (talk) 23:08, 1 October 2008 (UTC)
Thanks much, Greg. I dreaded trying to sort out the changes for the monthly update. - Dan Dank55 (send/receive) 13:28, 2 October 2008 (UTC)
(For once lately) I completely concur with Greg L on this one. I understand the rationales of the IEC-promoters, and even side with them from a standards advocacy position off WP, but this is a WP:SOAPBOX issue. — SMcCandlish [talk] [cont] ‹(-¿-)› 04:47, 4 October 2008 (UTC)
  • I tend to agree with Anderson: where a spaced en dash occurs towards the start of the first line of a paragraph, it's pointless inserting a hard-space. I don't think hanging dashes are a major problem, in any case. Tony (talk) 04:59, 4 October 2008 (UTC)

A very special case

I've been editing WP for 5 years, with a lot of attention to bios, and never thought about this until the last 24 hours. DYK that WP has a Dab Francis, Dauphin of France? Would you have imagined that for half of the individuals appearing there, his YoB and YoD are equal? (I hasten to defend the notability of some such people: being heirs to a throne, their birth may be the abatement of an ongoing political crisis, and even if their death is not the resumption of such a crisis, it may be the beginning of a crisis of confidence, by focusing attention on monarchies' perpetual question of succession.)
I have found two approaches to stating such people's vital stats in use:

  1. The year (or full date) stated once, as (1466) in the abovementioned Dab.
  2. "Born and died", as (born and died 1466) for that François's father's bio
    and imagined two more:
  3. The same year (or month or full date) stated twice, as (1466-1466)
  4. "Died in infancy", as (died in infancy, 1466)

All of them bear the burden of being so rarely needed, as to require at least a bit of head-scratching.
I don't like any but 2: In option 1, that extra thought may run

"Is that YoB? No, that'd be '(born 1466)'; well then ... no, the same logic rules out YoD; i guess i can't make any sense of it other than 'born and died in 1466'. Hmm ... or an accidental erasure... of which year?"

In option 3, it's

"1466-1466"? That's not a range of years; you'd never say "lived from 1466 to 1466". Is it a slip of the keyboard, getting confused and copy-and-pasting the same year twice instead of one each? ... And if so, did he die, or get born in that year?

In option 4, (altho based on the 1st meaning in the dictionary, infancy does end at the first birthday),

OK, died after a week or so in 1466, ... but maybe born 1465 and died 6 months later in 1466? Wait, when does infancy end, could he have been born in 1464 or 1465 and died in 1466 at the age of 18 months?

Of course even option 2 is a little bit annoying, with a third of your attention perhaps on "who felt the need to accompany the date with words and quadruple its length?"
So i propose this approach (but not this language):

Where it makes sense to mention two distinguishable dates, everything takes care of itself, e.g., May - June 23, 1492
Avoid confusion in the following situations with the corresponding formats:
  1. Only the year in which both birth and death occurred is known, or the context does not call for providing a specific day of that year -- use, e.g., "(born and died 1492)".
  2. The year and month in which both birth and death occurred is known, more detail than the year is desirable, but neither the day of birth nor that of death is known -- use, e.g., "(born and died May 1492)".
  3. The day on which both birth and death occurred is known, and the context calls for more detail than the year -- use, e.g., "(born and died May 19, 1492)".

The actual language i contemplate is:

In rare cases, someone dying in infancy is notable enough for vital-stats information to appear, but no sentence or clause devoted to vital stats is called for. The combination of the precision of what is known, and the amount of detail that the context makes desirable, may indicate mention of two dates; otherwise, one of the following formats is likely to cause less confusion than any alternative:
  1. (born and died 1492)
  2. (born and died May 1492)
  3. (born and died May 19, 1492)

--Jerzyt 23:05, 2 October 2008 (UTC)

This is much longer than the other bullet points in the relevant section. Is there some reason we can't leave this special case to IAR? Or, perhaps say: if the date of birth and death are the same, only one has to be stated, with one to three of the above examples? Septentrionalis PMAnderson 18:03, 3 October 2008 (UTC)
I find the ‘born and died’ formula clearest. When this is not possible, perhaps ‘born 12 March 456, died 7 August later that year’ (or ‘in that/the same year’) can be used. Shinobu (talk) 01:34, 4 October 2008 (UTC)

Proposed replacement of "Strong national ties to a topic"

Given the state of tempers, I thought I would discuss this here, rather than being bold. I propose that we replace the "Strong national ties to a topic" section as follows. I use a level-3 section head to indicate a return to commentary.

Choosing a format

While the decision must be made by the editors of each article, the following principles should be observed.

  • If a Wikiproject has achieved a consensus for the date format to use within a subject area, that format is strongly preferred.
  • When the sources for an article predominantly use one format or the other, the predominant format is preferred.
  • Otherwise, articles on topics with strong ties to a particular English-speaking country should generally use whichever format predominates, assuming that one does.
    • For the United States of America, use the month-before-day format.
    • For most other English-speaking countries, use the day-before-month format.
    • Canada uses both about equally, so either may be used.
    • Many non-English-speaking countries use a format that is very similar to one of the two acceptable formats. In such cases, the closer of the two should be used.

Discussion

The concept that uniformity of date format will somehow increase Wikipedia's prestige strikes me as misguided. If we wanted uniformity, we would prescribe a particular variety of English and enforce its use in all articles. Exceptions would be limited to quotations and references to dialectal usage. Instead, we use various national varieties, including the use of "Indianisms" in some cases. The train has left the station, been boarded on ship, and the ship has sailed and is in International Waters.

Wikipedia covers a lot of fields. If the best sources in a field use a date format consistently, we want to follow the sources in style as we do in content. The best judges of that are active subject-matter Wikiprojects. Failing that, the matter is best left to the editors on an individual page. If having a link take the reader from seeing "the color of the sulfur" to reading "the colour of the sulphur" is acceptable, I don't see why taking him from "April 1" to "1 April" is one iota worse.

As for a comprehensive list of countries, I see no need and little purpose. If the result is obvious, there is no need. If the result is not obvious, I do not see that this little band is more qualified than the editors of an article to make a decision. Robert A.West (Talk) 16:25, 21 September 2008 (UTC)

  • I object to the last clause. There is no reason why articles on Sweden should use 22 September, just because Swedish uses 22 september; the only Swede to have commented here expressed puzzlement that we should do this. The Swedish WP is another matter. Septentrionalis PMAnderson 17:29, 21 September 2008 (UTC)

Yeah, that last clause is really just proposing something that's already been firmly rejected. The others may have some sense, but is this continuous debate over a triviality really what Wikipedia needs?--Kotniski (talk) 17:45, 21 September 2008 (UTC)

There is no reason why articles on Sweden should use 22 September, just because Swedish uses 22 september. Ummm. That's actually a reason TO use it. I think you'd have to come up with a pretty good reason NOT to use Sweden's preferred format, all else being equal.
Not if the Swedes don't see it as one; we are not helping them or anybody else. Septentrionalis PMAnderson 20:13, 21 September 2008 (UTC)
Well, make up your mind - you say they use 22 September in one breath, and then dispute yourself in the next. Easily solved - just set your computer preferences to Swedish and see what the preferred format is. That's something any computer user can do for themselves. The Swedes don't use American date format, so please don't insult our intelligence by implying that they do. --Pete (talk) 01:16, 22 September 2008 (UTC)
What Jao said was
The idea that "13 September 2008" should somehow feel more natural than "September 13, 2008" because I consistently read and write "13 september 2008" in my native Swedish never crossed my mind. Why would it?
That seems clear enough; why would it indeed? Adopting the forms of Language A into Language B, when they are not natural in both, is a sign of incomplete mastery of one or the other. Septentrionalis PMAnderson 01:49, 22 September 2008 (UTC)
You two obviously feel the need to defend your preferred date format. So far as I can see you are both causing a lot of unnecessary disruption over something that's pretty trivial really. Wikipedia is an international project. You should get used to working with people from diverse backgrounds.
On the contrary, we support letting editors of various backgrounds use their preferred formats; Skyring has been continually revert warring to have this page discountenance that. Then again, Skyring has been doing nothing but Date Warring for his preferred format all over article space for some time now. Septentrionalis PMAnderson 18:14, 21 September 2008 (UTC)
As for the wording of the proposal above, it's just common sense and courtesy. I like it. --Pete (talk) 18:06, 21 September 2008 (UTC)
It is neither. But I note the appeal to that all-prevailing rationale: WP:ILIKEIT; at base, Skyring has no other. Septentrionalis PMAnderson 20:13, 21 September 2008 (UTC)
Let me add mine to the many voices urging you to contemplate this as a useful source of guidance. --Pete (talk) 01:11, 22 September 2008 (UTC)
The rejection is here. That was Option D. By my count, 38 voices; 7 approved it, 5 were willing to tolerate it, 26 rejected it.
On the substance: we have the present text (R), which sets a firm rule for only some articles, those with strong ties to English-speaking countries. We had three alternatives which extended it to a rule for all articles (B resembled R, but was longer and differed in detail):
  • A would have required all articles in a national dialect of English to use the corresponding date format.
  • C would have required that all articles not strongly tied to the United States, US possessions and Canada use the British format.
  • D is a wording almost identical to West's clause.
A and C did best, although none had a majority. There was then a runoff, which rejected A and had a hairline majority for C; C has then polled against the present text, and failed - despite widespread canvassing by Pete - to win even a majority.
It may be worth tweaking the present text to insert
  • If a Wikiproject has achieved a consensus for the date format to use within a subject area, that format is strongly preferred.
  • When the sources for an article predominantly use one format or the other, the predominant format is preferred.
although the first will run into objections from those who regard Wikiprojects as bumptious, with no right to object to the project-wide guidance here. Septentrionalis PMAnderson 18:11, 21 September 2008 (UTC)
Looking at WP:CONSENSUS, I think you are reading too much into interpreting polls. I've summarised above the areas where we have consensus and where we don't. We've still got a way to go. --Pete (talk) 18:21, 21 September 2008 (UTC)
So can you at least please stop edit-warring on the policy page until we have consensus to change it. I've explained above why the word "English-speaking" that you keep removing is desirable; please at least answer the reasoning before continuing to repeat the change (I know it's not only you this time, but the same applies to others).--Kotniski (talk) 18:38, 21 September 2008 (UTC)
  • And there is no particular reason why we need those claims of fact at all. Septentrionalis PMAnderson 18:45, 21 September 2008 (UTC)
  • I don't think we need them for the rule to operate; editors will determine which format is prevalent in Canada, or Jamaica, the same way they tell whether those countries use color or colour: by knowing the local variant of English. We should not decide that; as has been pointed out, we don't know the answer for Jamaica, and the answer for Canada has been questioned in the course of the discussion. Septentrionalis PMAnderson 18:52, 21 September 2008 (UTC)
Ahh, I hate to be picky, but knowing the local English variant in Canada doesn't give you the date format. Canada bats both ways. English-speaking nations aren't a problem anyway. The current discussion revolves around how we treat non-English-speaking nations in Wikipedia. It's easy enough to determine the date format used in a specific country - just set your computer preferences from the list provided and see what comes up. Anyone can do this. --Pete (talk) 19:14, 21 September 2008 (UTC)
Pardon me, but this appears to be a suggestion that Microsoft (!) is a reliable source. Tell it to the Marines. Septentrionalis PMAnderson 22:40, 21 September 2008 (UTC)
I'm using a Mac. --Pete (talk) 09:16, 22 September 2008 (UTC)
  • Skyring is, as often, seeing what others say through his idée fixe. We have said nothing more than that a rule which has no majority can scarcely be consensus (rules which do have majorities may or may not be) and that if a rule has no consensus, MOS should not require it. (Again, consensus claims may or may not belong here.) In short, we discuss necessary, not sufficient conditions. Septentrionalis PMAnderson 18:45, 21 September 2008 (UTC)
I am not strongly attached to the point. I consider the national-preference clause to be better than a coin-flip, but not by much. The previous discussion and polls were quite lengthy, and I obviously misread them. I am strongly attached to following relevant scholars, so I regret muddying the waters with this point. Robert A.West (Talk) 19:01, 21 September 2008 (UTC)
I agree that Wikipedia is an international effort, but en.wikipedia.org does not bear the responsibility for the internationalization of all of Wikipedia. This is the English version of Wikipedia. See the list of all the other Wikipedias. To continually bring up formatting matters concerning non-English speaking countries in English Wikipedia is trying to broaden the responsibility of this MOS beyond its scope. I'm sure the hundred(s) of other Wikipedias have some MOS guideline for their use, and highly doubt they debate imposing the English Wikipedia's methods upon their editors. So why do we continue to debate this point? It has been rejected by previous consensus and to keep rehashing it distracts from improving the current wording of MOS on date formatting as it applies to English Wikipedia.--«JavierMC»|Talk 23:12, 21 September 2008 (UTC)
Agreed. I've struck that part out of my proposal. Robert A.West (Talk) 23:35, 21 September 2008 (UTC)

Mr. West's proposal contains "When the sources for an article predominantly use one format or the other, the predominant format is preferred." I can't agree with that, because (1) it can cause flip-flopping of the date format as sources come and go, and (2) if the sources are not online, no single editor may have access to enough of the sources to determine what format predominates. Also, one can probably argue that newspapers are a major source for almost any topic, and UK newspapers often use the mdy format (or so some editors have claimed in this discussion), so this proposal could be disruptive.

A variation of this proposal that I would support in an individual article, although it may not need to be in the guideline, is that if an article contains extensive quotations that use a particular format, it would be appropriate for the unquoted parts of the article to use that format too. --Gerry Ashton (talk) 18:44, 21 September 2008 (UTC)

I would take the clause as applying to articles like Frederick North, Lord North, which should not be sourced from newspapers. Predominant should prevent switching; if the sources are so evenly divided that a few sources would overturn the balance, neither side was "predominant" to begin with. Septentrionalis PMAnderson 18:48, 21 September 2008 (UTC)
Yes, the "Frederick North" article is a fine example. There are five sources, only one of which is online, and the online source is from 1867, so is not a good guide to date format in modern writing. If a question arose about the appropriate date format for the article, few editors would be able to determine which format predominates in the sources. --Gerry Ashton (talk) 18:56, 21 September 2008 (UTC)
Actually, all five sources are at Google Books. The American Tuchman is the only one to use the European style, presumably because her book is about more than the eighteenth century. But this is a reason to leave the question to the judgment of the editors who have actually consulted the sources; I would prefer acknowledging that they may wish to diverge from the norm to any rule permitting kibitzers to switch dates on their arbitrary judgment. Septentrionalis PMAnderson 19:09, 21 September 2008 (UTC)
(ec)Both fair points. How about, "When the scholarship on a subject predominantly uses one format or the other..."? That is more to my meaning in any case. As to the subject of quotations, I agree that it is reasonable to harmonize an article's style with its quotations, but that could lead to the sort of flip-flopping that Mr. Ashton so properly deprecates. Robert A.West (Talk) 18:56, 21 September 2008 (UTC)

I would think the word "predominantly" in "When the sources for an article predominantly" should be enough to assuage Gerry's otherwise valid concern - would adding the words When "the most notable" scholarship ... help? Slrubenstein | Talk 19:01, 21 September 2008 (UTC)

This would have the interesting effect of making many British articles use American dates, as English-language newspapers, including The Times, typically use American-format dates. I think a lot of people would find this confusing. --Pete (talk) 19:16, 21 September 2008 (UTC)
If British newspapers predominantly use "American" format, on what basis do we say that "International" usage predominates in the U.K.? I would think that newspapers would be strongly motivated to use the format preferred by their readers. Robert A.West (Talk) 19:25, 21 September 2008 (UTC)
I think this is hard to phrase. I would go along with an individual article using the format that predominates in all the modern English-language scholarly sources on a topic, but not on which predominates in the subset that happens to have been cited in a particular article. Any editor who embarks on editing a decent article should consult several good sources (which might not be the ones that have been cited so far) and if one format really does predominate the field, the editor will see that. One problem with expanding this point from consensus on an individual article to a guideline for all articles is that some topics are just ignored by scholarly sources, and I shudder to think what the predominant date format might turn out to be for sources on those articles.
As for extensive quotes, I suspect that will only come up when the subject of an article is a document, and most of the quotes will come from the subject document. Since the subject document won't change, there won't be a concern with flip-flopping. And anyway, I really think that should be decided by consensus for a particular article. --Gerry Ashton (talk) 19:27, 21 September 2008 (UTC)
My concern is that if we leave "national ties" as the one guiding principle, and note that they are the reason to change an article's format, we may prevent just the sort of per-article consensus that should evolve. That result would make Wikipedia worse. Robert A.West (Talk) 19:35, 21 September 2008 (UTC)
  • How about When the sources on a subject predominantly use one format or the other, the predominant format is preferred? The minor premise that the sources actually used tend to represent the whole universe of sources will usually be true, but not always. Septentrionalis PMAnderson 19:45, 21 September 2008 (UTC)
    This sounds constructive. I agree with Robert West's comment. As for Pete's - well, this gets back to a point I made a few days ago, what is wrong with listing countries where the official date system is x or y. That is like listing countries where the official religion is x or y. We just do not know whether it means that this is the system that the state requires used on all official documents, but is otherwise ignored by most people, or is it really what most people practice? We are giving the word "official" too much credit. Maybe it is all we have .... but let's not pretend that it necessarily means that this is the only dating system people use, let alone understand, in a given country. Slrubenstein | Talk 19:58, 21 September 2008 (UTC)
With all due respect, Robert, this is really just re-plowing old ground. While your suggestion of devolving the decision to individual Wikiprojects is a novel one, I’m not sure it’s altogether a wise one. First of all, the issue is not intrinsic to any Wikiproject’s purview, but rather it’s a general, encyclopedia-wide one; secondly, it is likely to result in more inconsistency and more edit-warring; thirdly, there are many articles not covered by any Wikiproject (and even more not covered by an active one). The second bullet, if it is meant to be second in importance, basically defaults to a universal “use whichever style was first introduced” and thus obviates the need for the remainder of your points. The last sub-bullet – the one most of those above are pointedly taking exception to – was, of course, the least-preferred option according to the polls (however valid one may feel them to be), despite having a few quite vocal champions. Nor is there likely to be found a scholarly consensus. An American scholar and a British scholar writing on the American Revolutionary War will each use their native style – and neither will be bothered by it. Askari Mark (Talk) 20:10, 21 September 2008 (UTC)
Actually, both are likely to use the format presently used in the United States, as with Lord North above; that's the format on primary documents on both sides of the Atlantic. Mark Askari forgets, I conjecture, how recently the European format was introduced in Britain and the Commonwealth. Septentrionalis PMAnderson 20:16, 21 September 2008 (UTC)
No, Askari Mark does not forget how recently the European format was adopted; however, I have read a lot of scholarly works originating from both sides of the “Big Puddle”. There is also the issue that when a manuscript is submitted for publication in a journal based on the opposite bank, that journal’s style guide may call for re-rendering in the local usage. Askari Mark (Talk) 20:23, 21 September 2008 (UTC)
I acknowledge my conjecture. Journal style guides would seem to strengthen the case for the proposed language; if one or the other is predominant after that randomization, then it must have been so common in MS. that it would be surprising to see it otherwise. Septentrionalis PMAnderson 20:46, 21 September 2008 (UTC)
With all due respect, Askari, I believe that style is often field-dependent, and that Wikiprojects are precisely the groups that are most likely to make good decisions about style for each field. As for your criticism of the second bullet point, it is a decision based on the universe of sources. That is a far cry from "pick a style and stick with it." You are correct about the last point, and I have already apologized. I've struck it out to avoid having to apologize a third time. Robert A.West (Talk) 23:54, 21 September 2008 (UTC)
I agree with Slrubenstein that the word "official" should be avoided in relation to date and time formats. As examples of official neglect of the subject, the Gregorian calendar is in effect in the United States because of the British Calendar (New Style) Act 1750. In Britain, the Parliament has declined to decide whether GMT means Universal Time or Coordinated Universal Time.
I also share the concerns about this bullet: When the sources for an article predominantly use one format or the other, the predominant format is preferred. Does this imply that if the sources used for an article change or that enough sources are added that use a different format that the article should be changed? It seems to me that if a guideline is to be provided it should be one that can not easily be gamed and encourages stability. In many cases one date format is no better than another, though all date formats will seem at least unusual to at least some of the editors, and most of the editors will find at least one format unusual. I think something akin to the first format used in article to be as good a rule as any, and makes a concrete point that can be used to prevent edit warring. PaleAqua (talk) 01:32, 22 September 2008 (UTC)
Would you prefer sources on a subject, as suggested above? Septentrionalis PMAnderson 01:51, 22 September 2008 (UTC)
While that and the scholarship versions seem a little better they still seem like they present gaming risks. It is doubtful that the average editor would know most of the sources on a topic, which means that any set of sources revealed could be chosen with bias. Also there may be sources that have multiple editions, where versions that agree with a particular editor's date format could be chosen. If such a call would be made it seems that a wikiproject involving the articles would be in a better position to make such a ruling, but that's already covered by the first point. I still think something simple that is easily determinable is the best guideline. Something like "If a Wikiproject has achieved a consensus for choosing a date format to use within a subject area, that choice is strongly preferred.", drop the "When the sources..." clause, and add at the end something akin to retaining the existing format as the final option. PaleAqua (talk) 02:39, 22 September 2008 (UTC)
While I fully agree with Robert and PMA that journal styles can certainly display usages dominant to a particular field (terminology, specialist definitions, abbreviations, etc.), I cannot recall seeing such with regards to date formatting in particular. It’s an interesting conjecture, but I suspect if there’s a preponderance of use in a field that it may be more a reflection of a larger number of related publications existing in countries using one form vice other. Assuming there are cases that fit this conjecture, I’d recommend caution in formulating this guidance and restrict it to scholarly sources. Otherwise we might, as PaleAqua observes, come to find it being used as a justification for editwarring over the number of citation sources favoring one usage over the other. Askari Mark (Talk) 03:13, 23 September 2008 (UTC)
Ok, I have a real stupid question, why not a date template which changes the date per the browser's local settings, ie ((month=9|date=23|year=2008)) which would just show the format as preferred by the browser? Or is this impossible to implement? Paranormal Skeptic (talk) 13:47, 2 October 2008 (UTC)
It's a sensible suggestion, though I guess it's an extra step for editors to jump through. On balance, I am strongly in favour of this kind of markup, as it would enable "best guess" formatting for anonymous users and the-way-I-like-it formatting for those with accounts — plus no information is lost when converting a date link into date markup.
Comments? —pmj (talk) 04:27, 4 October 2008 (UTC)
This is being discussed below. —pmj (talk) 04:40, 4 October 2008 (UTC)
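
For illustration only, the template idea floated in this thread (store the date as separate month, day and year fields, then render it in the reader's preferred style) might be sketched in Python roughly as follows. The function name `render_date` and the `style` values "dmy" and "mdy" are hypothetical, invented for this sketch; it is not a description of how MediaWiki autoformatting or any existing template works.

```python
from datetime import date

# English month names are spelled out explicitly so the output does not
# depend on the locale settings of the machine running the code.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def render_date(d: date, style: str = "dmy") -> str:
    """Render a stored date in one of the two formats under discussion.

    `style` is a hypothetical reader preference:
    "dmy" gives day-before-month, "mdy" gives month-before-day.
    """
    month = MONTHS[d.month - 1]
    if style == "dmy":
        return f"{d.day} {month} {d.year}"
    if style == "mdy":
        return f"{month} {d.day}, {d.year}"
    raise ValueError(f"unknown style: {style!r}")


# The date from the example markup above: month=9, date=23, year=2008.
stored = date(2008, 9, 23)
print(render_date(stored, "dmy"))
print(render_date(stored, "mdy"))
```

Because the underlying fields are kept intact, no information is lost whichever style is shown; choosing a default for readers with no stated preference remains the open question.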

I agree with comments by Kotniski and Anderson above. Let us not forget that many many articles on Sweden-related topics (as an example) are written by Americans of Swedish ancestry. Just what the Swedes do in their own language is far from the equation. This is English—either American usage or another usage. Tony (talk) 05:02, 4 October 2008 (UTC)

When to change an article

It would seem to go along with my suggestion above that "scholarly practice" should replace "national ties" as the reason for change par excellence. Otherwise, a carefully-made choice based on scholarly practice could be reversed on the basis of alleged strong national ties. Robert A.West (Talk) 19:31, 21 September 2008 (UTC)

While it would be nice to think of Wikipedians as scholars, in general we aren't, unless we accept Borat's definition. Any rules on date formats (or anything else) should have two features:
  1. Clear and easily understood by editors
  2. Producing consistent results accepted by readers.
The reason for introducing date autoformatting in the first place was so that editors would see dates in the style they preferred. Combined with the strong national ties rule which acted to keep American articles in American format and British articles in International format, this system worked well for years. Making radical changes to a working system is something that should be approached cautiously. --Pete (talk) 20:06, 21 September 2008 (UTC)
It's already been radically changed, when date linking at all was deprecated, so that a person's Preferences no longer matter. Corvus cornix talk 20:15, 21 September 2008 (UTC)
The system may have reduced edit wars over dates, but it did not do anything for the vast majority of readers, that is, those who are not registered. So it didn't work well for years, it swept the problem under the rug for years. --Gerry Ashton (talk) 20:53, 21 September 2008 (UTC)
So, who are the majority of readers? Are they not Americans? Your argument then should be to require American format. Corvus cornix talk 21:06, 21 September 2008 (UTC)

However, the script that is now being used to remove date-linking could easily be modified to make date format consistent, yet still leave date-autoformatting intact. I understand somebody also has a script that could remove date-linking & retain date auto-formatting. --JimWae (talk) 21:05, 21 September 2008 (UTC)

No, that was a patch to the wikimedia software, which was written and proposed, but not adopted. Since date autoformatting is deprecated, I think it is unlikely that any date-autoformatting patch will be accepted. --Gerry Ashton (talk) 21:14, 21 September 2008 (UTC)

Additionally, the headers for all discussion involving deprecation read "date-linking" and did not read "date-autoformatting" --JimWae (talk) 21:16, 21 September 2008 (UTC)

We are not by definition scholars, as Pete says, but we do rely on scholars and other notable, reliable sources when writing most articles; that is the point. Slrubenstein | Talk 21:47, 21 September 2008 (UTC)
Actually, I would venture a guess that the "strong national ties" rule for formatting is unknown to most editors and to essentially all readers who are not editors. We don't make people pass an MOS certification exam before contributing. So long as date autoformatting was the norm, I gave the matter no thought, and I suspect that most editors were like me. As Mr. Ashton points out, that did not improve Wikipedia from the point of view of the unregistered reader, but it is the way it was. As for the casual reader, I would be astonished if more than a handful who link from George III of England to Boston Tea Party look at the difference in format and care one bit. Robert A.West (Talk) 00:14, 22 September 2008 (UTC)
I think most of the points above have been made previously in discussion. I thank my fellow editors for reminding me. Most of all I thank those who reminded me of the points that I had made. That was sweet. One new thing I really like is the notion that the various wikiprojects decide how to handle matters of style in articles. Who else is better placed to know themselves, their subject and their readership?
One of the points already agreed upon as consensus is that it doesn't matter which of the two formats is used - no confusion arises as to the date. The date-linking thing arose to prevent conflicts between editors, some of whom, as we can see, are strongly attached to their preferred formats. Using national ties as a determinant worked well. Of course date-linking did not and does not conceal date formats from editors working on an article; we see the raw text when we hit that "edit" button. Problems arise when we get chauvinists attempting to push American date formats, spellings, units of measurement and so on out into subject areas that do not normally use them. And vice versa, of course.
The fact that English-language newspapers commonly use American date formats is a matter of convenience - the major syndicated news agencies all use American format and newspapers do not care to employ people to change one format to another, story after story, hour after hour, night after night. National usage is a different thing, and we don't have to go hunting down official sources to see what format Malaysia prefers - just look at the control panel in our computers, and we can see that Mr Gates, Mr Jobs and Mr Linux have done the work for us. Presumably they have researched their markets and know exactly what computer users in each country prefer.
I'd prefer that the Manual of Style be as simple, fair and practical as possible. That way we minimise friction and disruption between editors. Asking editors to hunt through sources, or balance "ise" and "ize" word endings or trawl through the history is needlessly complex, and ensures that none but the most determined of nitpickers will do it. The most relevant wikipractice concerns which units of measurement we use, and here we use whichever system of units best suits the topic, a practice that works well for all except those troublemakers who wish metres and kilograms on Americans, and vice versa. Minimising conflict and disruption with clear, well-chosen guidelines is what we should be about. --Pete (talk) 01:06, 22 September 2008 (UTC)
I'd prefer that the Manual of Style be as simple, fair and practical as possible. When did you change your mind? You have spent the past month arguing for complex and impractical rules which would allow you to bully as many articles as possible into your preferred dating format. Septentrionalis PMAnderson 01:29, 22 September 2008 (UTC)
Anderson's right. Please give it a rest, Skyring. Tony (talk) 01:30, 22 September 2008 (UTC)
Since Skyring/Pete brought up Microsoft, it is worth noting that Microsoft admits that many countries have regional variations, which is why it advises use of APIs and permits the user to customize national settings. Robert A.West (Talk) 02:07, 22 September 2008 (UTC)
Robert, that's just good programming practice. There are always people or groups of people who like things a different way and giving them the tools to personalise their experience sells more boxes. The big computer/software companies are excellent examples of internationalism, and we could learn a lot from them.
  • Yes, and Microsoft's Australian English spellchecker got it horribly wrong with the s/z thing; they still haven't corrected it, so we have to use the BrEng spellchecker. Don't hold them up as an example. Tony (talk) 11:56, 22 September 2008 (UTC)
Just checked, and Microsoft Word 2008 running on my Mac doesn't flag "ise" words as spelling errors. It accepts both as valid, which seems to accord well with current Australian practice. Are you sure your software is up to date? I suppose we can do a comparison of the date formats recommended by Microsoft, Apple, the various flavours of Linux, Unix, Solaris etc. I wouldn't expect any great difference between them. I doubt that they pull this stuff out of thin air. Looking at what Mac recommends for Australia, I see:
  • Long date: Saturday, 5 January 2008
  • Short date: 5 January 2008
  • Abbreviated dates: 05/01/2008 and 5/01/08
  • Calendar: Gregorian
  • Times: 12:34 AM and 4:56 PM
  • Numbers: $1,234.56, 1,234.56, 123.456% and 1.23456E3
  • Currency: Australian Dollar
  • Measurement Units: Metric
That looks about right to me, though I'd tend to write the shortest date as 5/1/8 and I've customized (yes, the Mac control panel uses the "ize" form) the time to use 24 hour clock because that's the way I like it, given my military background. --Pete (talk) 22:57, 22 September 2008 (UTC)
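
As a rough illustration, the four Australian date forms in the list above can be reproduced with a few lines of Python. This is a sketch only: the function name `australian_formats` and the dictionary keys are hypothetical labels made up here, and the patterns are simply read off the listed examples rather than taken from any vendor's locale data.

```python
from datetime import date

# Names are listed explicitly so the result does not depend on system locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def australian_formats(d: date) -> dict:
    """Return the long, short and two abbreviated forms for a date.

    The patterns mirror the examples quoted in the list above.
    """
    month = MONTHS[d.month - 1]
    return {
        "long": f"{WEEKDAYS[d.weekday()]}, {d.day} {month} {d.year}",
        "short": f"{d.day} {month} {d.year}",
        # Zero-padded day and month, four-digit year.
        "abbreviated": f"{d.day:02d}/{d.month:02d}/{d.year}",
        # Unpadded day, zero-padded month, two-digit year.
        "abbreviated_short": f"{d.day}/{d.month:02d}/{d.year % 100:02d}",
    }


for label, text in australian_formats(date(2008, 1, 5)).items():
    print(f"{label}: {text}")
```

Note that the abbreviated forms are the only ones a reader could misread: 05/01/2008 is 5 January in day-first usage but 1 May in month-first usage, which is not a problem for either spelled-out format.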
Tony, Anderson is dead wrong. I prefer a simple effective solution. The last thing I want to do is bully anybody. Maybe Anderson feels pressured, but that seems to be SOP for him, looking back over his contributions long before he ever heard of me. --Pete (talk) 09:13, 22 September 2008 (UTC)
  • I beg to differ: Anderson's solution is much simpler, which is to respect the way in which an article is begun if it has no strong national ties to an anglophone country. Your system is complicated and requires research and often precarious judgement for many articles (can a Philippine-related article be in international if the editors want? What about some South American countries? Have a look at the article on date formatting, which is enough to give you the chills—and it's not even referenced.) Besides, why can't US authors write about topics unconnected with other anglophone countries in US English and US date format? It's absolutely unreasonable to upset the apple-cart in this way, and inconsistent with our "first contributor" criterion for Engvar, which works superbly well. Tony (talk) 11:50, 22 September 2008 (UTC)
Upset the applecart? That's just bizarre. We've been using strong national ties for years. The relevant wording remained unchanged for nine months. I've pointed this out before, and anybody may check for themselves. Here's how the wording developed:
  • 2004: It's generally preferable to use the format used by local English speakers at the location of the event. For events within Europe and Oceania, that is usually 11 February 2004 (no comma). For the United States it's usually February 11, 2004 (with comma).[3]
  • 2005: It is usually preferable to use the format preferred in the variety of English that is closest to the topic. For topics concerning Europe, Australia, Oceania and Africa, the formatting is usually 17 February 1958 (no comma and no "th"). In the United States and Canada, February 17, 1958, (with two commas—the year in this format is a parenthetical phrase) is correct, and in Canada, 17 February 1958 is common.[4]
  • 2006: If the topic itself concerns a specific country, editors may choose to use the date format used in that country. This is useful even if the dates are linked, because new users and users without a Wikipedia account do not have any date preferences set, and so they see whatever format was typed. For topics concerning Ireland, all member states of the Commonwealth of Nations except Canada, and most international organizations such as the United Nations, the formatting is usually 17 February 1958 (no comma and no "th"). In the United States, it is most commonly February 17, 1958. Elsewhere, either format is acceptable.[5]
  • Early 2007: If the topic itself concerns a specific country, editors may choose to use the date format used in that country. This is useful even if the dates are linked, because new users and users without a Wikipedia account do not have any date preferences set, and so they see whatever format was typed.[6]
  • Late 2007: Articles on topics with strong ties to a particular English-speaking nation should generally use the more common date format for that nation.[7]
  • 2008: Articles on topics with strong ties to a particular English-speaking country should generally use the more common date format for that nation; articles related to Canada may use either format consistently. Articles related to other countries that commonly use one of the two acceptable formats above should use that format.[8]
Anderson changed the wording - without consensus, I might add - and I reverted until I got sick of his disruptive edit-warring. If anyone upset the applecart, with the resulting shitfight you now see, it's Anderson. We were doing just fine until he intervened. The guidelines were simple - use the date format of a relevant country - and there was very little confusion or disruption. The system worked. US editors can and did write about foreign countries using whatever format they wanted. Nobody stopped them doing so. Nobody really cared. I certainly don't mind if someone adds useful information without getting everything exactly as per the MoS - someone is bound to come along and square it away, and if an article gets to FA status, which presumably is something we want for every article, then we'll have the real wikiwonks come along and get everything into showroom condition. As noted, anybody can check the most common date format used in a country by looking at their computer's control panel. If you are editing Wikipedia, you have a computer right there in front of you.
My preferred wording is simple, fair and practical. Just remove "English-speaking" from the current wording: Articles on topics with strong ties to a particular country should generally use the more common date format for that nation. For the U.S. this is month before day; for most others it is day before month. Articles related to Canada may use either format consistently.
This is similar to the way we handle units of measurement and local currencies - we don't look at the history of articles on non-English-speaking nations and if some editor used yards instead of metres initially, keep that forever. It gets changed to the appropriate unit and nobody bothers. Except for a few chauvinists who seem to think that every time a date or a unit is changed from the American way, it's another star ripped off Old Glory. --Pete (talk) 23:28, 22 September 2008 (UTC)
I have no problem with that. My problem is with the requirement that unless it's specifically American, non-American formatting must be used. Corvus cornixtalk 23:30, 22 September 2008 (UTC)
I certainly don't support compulsion along those lines. In fact there must be a huge range of articles where date formats are not an issue and either format is fine. UN agencies, for example. Or, as Tony has noted, British filmstars who move to Hollywood. Either format is acceptable there. But if an article has a natural and strong tie to a single nation, then why not use the date formats and units of measurement commonly used there? This applies to the USA, France, the UK, New Zealand, the Philippines... --Pete (talk) 23:39, 22 September 2008 (UTC)
You yourself said, If the date format used in a place where they don't speak English is day before month, then what on earth is wrong with using that format in written English? Am I missing something here? The only reason I can think of why people would edit-war and abuse other editors for the sake of using one date format over another is that they care very deeply about their own personal preference, and that's not the attitude of a reasonable person.. Corvus cornixtalk 23:55, 22 September 2008 (UTC)
Yeah. What's wrong with that? Using the Swedish date format in an article about Sweden sounds pretty reasonable to me. But if there are good reasons not to use it - through local consensus or whatever - then I certainly wouldn't compel any editor to use a format they are not happy with. --Pete (talk) 00:20, 23 September 2008 (UTC)
Because it's advocating limiting American format to American subjects, which has been my objection all along. Corvus cornixtalk 00:25, 23 September 2008 (UTC)
Well, no. It's advocating using Swedish format in Swedish subjects and American format in American subjects and British format in British subjects and so on. That sounds pretty reasonable to me. Going beyond articles with strong ties to specific nations we have articles on international topics such as Olympics or subjects with no specific ties, such as Commando. These categories have no preferred format and thus stay in the format first chosen. I'm certainly not advocating compulsion on date formats - just a return to the way we've always done things and which worked well. --Pete (talk) 12:13, 23 September 2008 (UTC)
And it's also something that was soundly rebutted by the last three weeks of discussion, polling and various other methods of trying to get the point through that this is English Wikipedia, not Russian, African, Japanese, Dutch, and a plethora of other Wikipedias, that have and maintain their own MoS, and should not be a consideration for this one. How non-English speaking countries write their date has no basis for consideration of how the English Wikipedia will address the dating issue concerning articles written about them. It is restrictive, an unnecessary broadening of this MoS's responsibility, and frankly attempting to overly internationalize the English Wikipedia, when the non-English speaking countries have their own Wikipedias for use list. Go to their Wikipedias and try and impose our MoS on them and see how far you get. Let's have not only articles written in English here, but include all languages in this one, and delete all the others. Before I'm accused of balderdash once again, I will stop now. The notion to include, other than English-speaking countries date format conventions in this style guide, has been rejected. Let's move on please. Cheers.--«JavierMC»|Talk 02:48, 23 September 2008 (UTC)
I've yet to see consensus for any one method of dealing with this, certainly not anything that justifies a change to our long-standing, workable and uncontroversial practice. Anderson changed the wording without obtaining consensus and since then it's been one unholy mess here. ---Pete (talk) 12:13, 23 September 2008 (UTC)
Pete, note the evolution from "may use" to "should use" to a virtual "must use". That is a crucial change and one that, manifestly, never had a broad consensus. Robert A.West (Talk) 22:41, 23 September 2008 (UTC)
I can't see anything saying "must use", though I haven't checked the latest wording "tweak", maybe there is a mob with torches and pitchforks standing by ready to go. I don't support must use for date formats, with the exception that we shouldn't use ISO 8601 dates in written text. Otherwise, the difference between the two date formats is much like hanging your toilet roll underhand or overhand. Either way works perfectly well, but by jingo, you get some zealots on this topic! --Pete (talk) 01:33, 24 September 2008 (UTC)
I like the 2004 wording most. We could add that consistency within an article trumps over the "generally preferable", and that, in the case of events located in a place with no significant number of "local English speakers", we should use 5 October 2008 if the article uses Commonwealth English and October 5, 2008, if the article uses American English. (Note the comma after the year in the US format). -- Army1987 ! ! ! 14:49, 5 October 2008 (UTC)

All this debate, but things would be so much easier...

...if the rule was simply: use the date formatting produced by ~~~~. Teemu Leisti (talk) 08:59, 25 September 2008 (UTC)

But nevermind. I'm done with this discussion, at least for a few months. Teemu Leisti (talk) 11:47, 26 September 2008 (UTC)

WP:DATED merge

Wikipedia:Avoid statements that will date quickly has no reason to exist as a standalone mini-guideline. It is about nothing but date-related issues. It can be significantly compressed and simply merged into WP:MOSNUM. — SMcCandlish [talk] [cont] ‹(-¿-)› 04:36, 4 October 2008 (UTC)

Absolutely. Thanks for identifying this, Stanton. Have you posted a tag? Tony (talk) 05:03, 4 October 2008 (UTC)
Agreed - yes he has. Johnbod (talk) 10:49, 4 October 2008 (UTC)
Agreed. - Dan Dank55 (send/receive) 11:29, 4 October 2008 (UTC)
But the redirects, as a section link, should be retained. Septentrionalis PMAnderson 14:31, 4 October 2008 (UTC)
Er, that page is long out-of-date. The most current page seems to be Wikipedia:As of (which states that links such as As of 1990 are deprecated) and the current set-up can be seen at Template:As of (which has been set-up that way since July 2008), which outputs plain text and puts pages into a hidden category (the change in software that allowed this previously controversial issue to be revisited). See also Wikipedia:Updating information, which also seems in need of merging. But please don't merge stuff too quickly without finding out what has been done and what is linking to where. See Wikipedia:Featured article candidates/Congregation Beth Elohim for an example of where confusion and misunderstandings occurred over this. I know merging will help avoid future confusion, but let's not add to the confusion either. I think Wikipedia:MOSDATE#Precise language is the section that people want to merge to. I pointed this out to User:Ikara, who posted a link to the July 2008 village pump discussion. I will point them here as well. Carcharoth (talk) 20:27, 5 October 2008 (UTC)
I also found Template:Update after and Template:Update and Template:Out of date. It is rather a sprawling system, so any merge will have to do a lot of updating to make sure we are not introducing inconsistency across pages. Carcharoth (talk) 20:30, 5 October 2008 (UTC)
Excellent, this was my next plan of action after the WP:As of update two months ago, and it looks like someone got to it before I did. I fully support merging WP:DATED into another project, it is not particularly substantial by itself, and since the update half of it is wrong anyway. However I propose merging to a new, more detailed "Precise language" section within WP:As of, especially as the relevant section in WP:MOSNUM points editors to that page already. The technique discussed on WP:As of relies on precise language, and situations requiring precise language usually warrant the implementation of the "As of" technique, so it is a good target candidate for the merge. WP:As of could then be treated as a sub-project or see-also for the current "Precise language" section of MOSNUM. WP:Updating information is less relevant to precise language or WP:As of, but may be a potential merge candidate at a later date. If there is any reason not to merge to WP:As of, I still support merging DATED into MOSNUM as proposed above – Ikara talk → 22:34, 5 October 2008 (UTC)

Is there such a thing as a 'Search tool'?

A lot of people say that they need to search for articles that relate to dates. I think it would be useful if there were such a thing as a 'Search tool'. For example, the article United Kingdom general election, 2005 does not contain [[2005]]. So it is impossible to find in 'What links here' for the article '2005'.

What we really need is a 'Search tool' where the software automatically finds words. You could put a box in a prominent position at the top left with a button called 'Search' and permit more than one word. Lightmouse (talk) 16:37, 7 October 2008 (UTC)

Could you specify how Special:Search is not useful for this?--Aervanath lives in the Orphanage 16:49, 7 October 2008 (UTC)
Lightmouse is being sarcastic. And it's not especially helpful to the discussion. Shereth 16:50, 7 October 2008 (UTC)
Ah, silly me.--Aervanath lives in the Orphanage 17:03, 7 October 2008 (UTC)

OK, let me reword this. The article United Kingdom general election, 2005 does not contain [[2005]] and it is impossible to find in 'What links here' for the article '2005'. So why do people say that links to date fragments are useful for finding articles or for 'metadata' (whatever that means)? Lightmouse (talk) 17:09, 7 October 2008 (UTC)

They are an aid to searching, not an end unto themselves. I don't think anyone is suggesting that these links are required for the sake of finding articles, just that they expedite the process by providing a handy link as opposed to going over to the search box and typing it in. Is it difficult to use the search box? No. But that is not, in and of itself, justification for disallowing links. Shereth 17:13, 7 October 2008 (UTC)

When I look at 'What links here' for [[2005]], I see a long list of seemingly random articles. I can keep clicking for page after page (it is more than 25,000 articles long) but I don't know why anyone would do that. We have seen that it doesn't contain 'relevant' articles like United Kingdom general election, 2005 and anyone searching for something in particular will use a search tool. You say it is a 'handy link' to the '2005' article and that is a clear statement. But can we put an end to the myth that 'What links here' for date articles is useful for searching? Lightmouse (talk) 17:26, 7 October 2008 (UTC)

Considering Human history covers about 7000 individual years, repeatedly picking years within only the past decade to bolster arguments that none of the other 6990 years should be linked to is a straw man, imo. -- Kendrick7talk 17:30, 7 October 2008 (UTC)

OK. I am not tied to the last ten years, the issue seems generic to me. Name another year and we can discuss that. Lightmouse (talk) 17:32, 7 October 2008 (UTC)

I just wanted to snipe my comment in here. I use AWB a lot to do mass edits and I frequently (at least previous to this issue about delinking dates anyway) used the What links here to pull in a year such as 2008 to cleanse typos and the like.--Kumioko (talk) 17:34, 7 October 2008 (UTC)

Yes, I do that too because it is easy. However, 'Wiki search' and 'Google search' return more articles. I can understand that reason but I don't think our AWB needs have been mentioned in the MOS or in talk as a reason for linking. Lightmouse (talk) 17:53, 7 October 2008 (UTC)

It appears that 1066 has what seems to me a reasonable level of internal content. What links here yields just over 500 entries, many of which are of course other date articles. Even 1492 is tolerable. How does 1500 sound as an arbitrary cutoff threshold for discussion purposes?LeadSongDog (talk) 20:45, 7 October 2008 (UTC)

For information, here are the statistics on those dates (mainspace articles):

  • [[1066]] What links here: 387
  • [[1066]] Wiki search: 972
  • [[1066]] Google search: 781
  • [[1492]] What links here: 520
  • [[1492]] Wiki search: 1422
  • [[1492]] Google search: 848

Lightmouse (talk) 21:23, 7 October 2008 (UTC)

For 1066 Whatlinkshere, I got 540 (all spaces) narrowing to mainspace, then removing day, year, list, category and timeline articles cuts it to 279 real articles. But who's counting? ;/p LeadSongDog (talk) 22:15, 7 October 2008 (UTC)
http://en.wikipedia.org/w/index.php?title=United_Kingdom_general_election,_2005&oldid=233932869 Did contain a link before the current pogrom against linked dates. And it is still in what links here, thanks to one of the templates it contains. Rich Farmbrough, 20:48 10 October 2008 (UTC).
"or for 'metadata' (whatever that means)" - try metadata and Wikipedia:Metadata. For what it is worth, when I type "2005" into the search box, I get a mass of results that are not useful. Refining the search does help, of course, but if the right balance of linking was used, something like Wikipedia:Link intersection would really work well. That latter proposal is a good example of proposed features that depend on the data and "tagging" information contained in links - which is why overlinking and underlinking must be guarded against. Of course, it is agreeing where the line should be drawn that is the real problem. One person's overlinking is another person's underlinking. Carcharoth (talk) 04:02, 11 October 2008 (UTC)
As with many statistics, the utility of "What links here" is easy to blow out of proportion. It is, indeed, dangerous to treat it too seriously, since it contains so many contaminating factors. Used with care in a specific range of situations, it may be helpful to certain editors; but it should not be considered on the level of the search tools and google. Tony (talk) 04:29, 11 October 2008 (UTC)
Right. So we come down to the nub of it. Let me ask you, Tony, what do you use "what links here" for and what do you use the search box for? I rarely use the search box at all, as I find typing in something in the URL and browsing from there gets me where I want to get to faster. I use "what links here" to <gasp> find out what links to an article. I would be happier if there was a similar tool called "what is linked from here". Being able to scan a list of the outgoing links from an article would be one way of spotting overlinking. Carcharoth (talk) 04:45, 11 October 2008 (UTC)
I don't find it useful. Tony (talk) 07:46, 11 October 2008 (UTC)

Delimiting numbers

Can we agree that if we delimit values to the right of the decimal place, that it shall be done in accordance with

  1. BIPM: 5.3.4 Formatting numbers, and the decimal marker, and per
  2. NIST More on Printing and Using Symbols and Numbers in Scientific and Technical Documents: 10.5.3, Grouping digits, and
  3. ISO (which follows what the BIPM says)…

…all of which require that digits be delimited every three digits to the right of the decimal marker.

This issue was thoroughly discussed in Archive 94 and at least two templates created ( {{delimitnum}} and {{val}} ) were made in conformance to those discussions (and in conformance to internationally accepted convention) in order to make it easier for editors.

There is an editor who has been changing articles from 3-digit delimiting to 5-digit delimiting [9] and states that it “looks better” that way. Well… perhaps; beauty is in the eye of the beholder. But, whether it be three or five digits, I don’t think we need Wikipedia flouting the way numbers are delimited because an editor thinks the world ought to work that way; it doesn’t.

MOSNUM is currently silent on this. We should be officially following international standards. Greg L (talk) 23:19, 7 October 2008 (UTC)

P.S. This same editor also brought this issue up here on Wikipedia talk:Manual of Style (mathematics). Let’s all get on the same page here on this one. Greg L (talk) 23:21, 7 October 2008 (UTC)
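
For readers who want to see the two groupings side by side, here is a minimal Python sketch of delimiting the digits to the right of the decimal marker. The function name `group_fraction`, its parameters and the thin-space separator (U+2009) are assumptions made for this illustration only. It is not how {{delimitnum}} or {{val}} are implemented, and it does not attempt any refinements the standards may call for.

```python
THIN_SPACE = "\u2009"  # assumed separator; any other string can be passed instead


def group_fraction(number: str, size: int = 3, sep: str = THIN_SPACE) -> str:
    """Insert `sep` between every `size` digits to the right of the decimal point.

    Digits to the left of the decimal point are returned unchanged.
    """
    whole, point, frac = number.partition(".")
    # Slice the fractional digits into consecutive chunks of `size` characters;
    # the last chunk may be shorter.
    groups = [frac[i:i + size] for i in range(0, len(frac), size)]
    return whole + point + sep.join(groups)


if __name__ == "__main__":
    digits = "2.718281828459045"
    print(group_fraction(digits, size=3, sep=" "))  # three-digit grouping
    print(group_fraction(digits, size=5, sep=" "))  # five-digit grouping
```

The number is handled as a string rather than a float so that no digits are lost before grouping; only the `size` argument differs between the two conventions being debated.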

I have no objection to Wikipedia:Manual of Style (mathematics) using the mathematical convention of 5-digit groupings, while non-mathematical articles use 3-digit groupings. I think you'll find the de facto standard, both here and in the real world, is 5-digit groups if there are more than 15 digits after the decimal point (where the template Greg refers to fails, anyway). — Arthur Rubin (talk) 23:30, 7 October 2008 (UTC)
I see no consensus there, except that the templates don't work for long numbers which are rounded differently to real number format than one would expect. Perhaps there was a consensus in principle before the implementation methods were developed? — Arthur Rubin (talk) 23:35, 7 October 2008 (UTC)
See also the new consensus on KiB / MiB / GiB, where we state that the recognized international convention is not used. Here, we should also recognize that the convention is not used for very long numbers. — Arthur Rubin (talk) 23:38, 7 October 2008 (UTC)
Well, you provided some links above, in a “If it’s blue, it must be true”–fashion. You wrote “I think you'll find the de facto standard, both here and in the real world, is 5-digit groups.” Well, why do you think we’ll find as much? Reading either of your links doesn’t come up with any evidence to substantiate your allegation that the mathematics world decided to flout the rule of the SI. Please provide some evidence by a proper governing body for how things are done differently in the mathematics world.

Criminy, your arguments are weak. The IEC proposal was just that: a proposal. The consensus was to follow the way the world really works. Now ante up with the evidence of how the mathematics world marches to the tune of a different drummer or hold your peace please. Greg L (talk) 23:43, 7 October 2008 (UTC)

I don't have a copy of Abramowitz and Stegun where I can get to it, but the first 20 pages at http://www.math.sfu.ca/~cbm/aands/ demonstrate my point. 3-digit spacing is used for physical constants (even if known to many decimal places), but 5-digit spacing is used reliably for unitless numbers of 8 digits or longer. It would be hard to find a mathematician who actually works with numbers who hasn't used that reference. — Arthur Rubin (talk) 23:58, 7 October 2008 (UTC)
Google scholar 23846 26433 (digits 15–24 of π): 46
Google scholar 238 462 433 (digits 15–23 of π): 15
Arthur Rubin (talk) 00:05, 8 October 2008 (UTC)
Note that the official SI publications call for a narrow space every three digits on both sides of the decimal point. A proposal that was discussed in the past was to use commas to the left of the decimal, and narrow spaces to the right, which would have been a brand new style invented by Wikipedia. --Gerry Ashton (talk) 00:16, 8 October 2008 (UTC)
  • I know Gerry. But en.Wikipedia settled on the use of commas to delimit to the left of the decimal marker. Nothing we’re going to be doing here can change any of that. Different cultures use different decimal markers and delimiters. Now we’re talking about how to handle the right hand side of the decimal marker. And it’s quite a specific discussion: whether to abide by the three-digit convention. The issue is whether or not proper, modern mathematics publications also follow the three-digit rule. I’ll bet dollars to doughnuts they do. Greg L (talk) 00:30, 8 October 2008 (UTC)
unindented
  • Who are you trying to kid here? I can cite Web sites that say the World Trade Center was brought down by pre-planted explosives. That doesn’t mean it is a mainstream, accepted fact. Providing a Google search that comprises a grand total of 46 Google hit examples of your point falls (a *tad*) short of proving your case; if anything, it supports my theory that the mathematics world follows the rule of the SI. Please do tell: what are the dominant mathematics journals and what convention do they require in their publications? As I said above: Please provide some evidence by a proper governing body for how things are done differently in the mathematics world. Greg L (talk) 00:14, 8 October 2008 (UTC)

    P.S. Will someone please help me here with Arthur? I’ve pretty much run out of patience dealing with him. I’m done for the evening. He edit-warred with me over on Pi and on Natural logarithm—which got me wound up—and now his evidence seems to amount to nothing more than “I like it with five digits and can find examples where others have done it that way before.” That’s not nearly good enough. The issue is whether the mathematics world really (professional publications) flouts the SI and delimits to five digits rather than three. If so, I’m sure there is a style guide for mathematicians that affirms this. I’m pretty skeptical there is. Greg L (talk) 00:21, 8 October 2008 (UTC)

    • Who are you trying to kid. "Governing bodies" are exactly what we cannot use, per KiB, as it was accepted by the standards organizations and IEEE, but rejected by IEEE authors.
    • One wouldn't expect "Google scholar" to have thousands of references for anything.
    • Inserted (this is referring to digits 15-24 of π with 5-digit grouping, and digits 15-23 of π with 3-digit grouping, as noted above. it adds more searches.)
      • 5 digit spacing has 16400 on the web, 481 for books, and 46 for scholar
      • 3 digit spacing has 2820 on the web, 111 for books, and 15 for scholar
    • If you can suggest another search which could be done, please do so. Or you could check the corresponding digits of e or some other well-known constant. — Arthur Rubin (talk) 00:23, 8 October 2008 (UTC)
  • *sigh* You haven’t proven your case that professional mathematics publications delimit numbers every five digits. And that’s because professional mathematics publications simply follow the rule of SI. Now please stop being disruptive on Wikipedia by edit warring on Natural logarithm (which had been stable for many months). You’ve stated that “I think [5-digit grouping] is both ugly to edit and difficult to read.” Earth calling Arthur: It doesn’t matter what you think is *ugly* or pretty. You will not be permitted to hijack Wikipedia and impose non-standard ways of doing things. Just showing that it is sometimes done that way (notably with Pi, which is a unique case) isn’t proof and it’s absurd you’d think so. In the face of clear, convincing, standards (NIST, BIPM, and ISO) that it is three-digit groupings, then Wikipedia is three-digit groupings.

    I can accede to Pi being five digits because people are obsessed with counting all those digits and having a lot of them too. But for virtually all other purposes, three-digit delimiting is standard—it doesn’t matter what the discipline is. Greg L (talk) 00:45, 8 October 2008 (UTC)

  • Should I try one of the other standard mathematical constants? I probably wouldn't get enough hits to convince you, but I'm sure the ratio would be the same. (The journals I subscribe to seem to have no spacing whatsoever on either side of the decimal point. I see a 37-digit number in a table. I don't know what happens if the number exceeds a line of text.) — Arthur Rubin (talk) 01:03, 8 October 2008 (UTC)
  • Actually, professional mathematical journals use TeX, and the author doesn't have the choice of formatting the numbers. I don't know why you would expect otherwise. — Arthur Rubin (talk) 01:09, 8 October 2008 (UTC)
  • If the mathematical journals that you subscribe to don’t employ spaces, then why are you saying five-digit spaces are normal in mathematics? I’m no mathematician; I’m an engineer and know the SI writing style inside and out. And it is now becoming increasingly clear to me, Arthur, that notwithstanding that you are strongly advocating that all mathematics articles on Wikipedia depart from the rule of the SI (because you think the BIPM/NIST/ISO convention is “ugly”), you also have no Ph.D. in mathematics. Perhaps there is a Wikipedian who does have a Ph.D. in mathematics who will weigh in here. One who has had a mathematical paper or two published would be ideal. If no such person has weighed in by tomorrow, I plan on getting to the bottom of this.

    TeX appears to be a software tool for making complex algebraic expressions. Much of math is symbolic and TeX appears to be principally (or exclusively) a tool for dealing with the complex symbolics of mathematical expressions. However, constants still have to be dealt with on occasion and the appearance of these numeric equivalencies in professional mathematics journals will conform to style guides that editors rigorously adhere to when authors submit papers.

    I’m quite sure that when it comes to delimiting numeric equivalencies that exceed a certain number of digits in the fractional side of significands, mathematical journals—if they are going to add thin-spaces at all—perceive no need to depart from the rule of SI; that would seem quite odd to me. We’ll see; I’m not holding my breath though. Greg L (talk) 03:59, 8 October 2008 (UTC)

  • What are you saying about TeX, Arthur?  . -- Army1987 (t — c) 14:29, 10 October 2008 (UTC)

The pi and e (mathematical constant) and golden ratio articles (also Square root of 2, Square root of 3, Square root of 5) have been stable for a long time with 5-digit groups. Greg L didn't get away with changing them to 3-digit groups, so now he's a bit peeved. He ignores the evidence that in books, at least, these numbers are much more frequently presented with 5-digit groups than with 3-digit groups, which basically are too hard to read for so many digits. Proposed standards or otherwise, this is just what's commonly done, and not disallowed by any blanket style rule in wikipedia, so it seems OK to leave it. Nobody but Greg L seems to mind this way. Dicklyon (talk) 06:31, 8 October 2008 (UTC)

For the benefit of those who wish to know the difference between mathematics and arithmetic: All of the (two dozen, or so) papers I've written, and most of the papers I refer to, have no number over 5 digits past the decimal point (and I think even Greg would accept that 9.23456 is acceptable as written). All the current journals require submission in TeX, so the numeric style can be set by the journal, whatever the author's preference. The online style guides for journals published by the Mathematical Association of America and the American Mathematical Society are silent on number groupings. I could download the full set of specialized macros from some journals to determine the style, but that seems to be bordering on {{or}}. Of course, if I ask one of my publishers what their style specification is, Greg wouldn't believe me if they hadn't published their answer, so I don't really see the point in asking.
Very few of the papers I read have real numbers with more than 5 digits (as opposed to integers), and styles of grouping to the left of the decimal point are irrelevant to this issue. I recall one I read a few weeks ago which had a table of probabilities to 12 digits (I think it had something to do with sabermetrics).
I should also point out that someone re-edited the pointer for previous "consensus" Archive 98 (which discussed the problems with the template) to Archive 94 (which shows a proposal, with the apparent guideline consensus of 3 editors). Furthermore, I'm not proposing (yet) that 3-digit grouping be banned, only that the standard in Mathematics articles should be 5-digit grouping for numbers 10 digits or longer. (As for the paste-to-spreadsheet argument, numbers longer than 16 digits won't evaluate properly if pasted, so there's little point.) — Arthur Rubin (talk) 14:34, 8 October 2008 (UTC)
As a further aside, numbers over 100 digits (50 on some old computer monitors) will run off the right side of the screen without hope of repair if <span> or <nowrap> is used. Breaking spaces need to be used to allow the user to read the numbers. — Arthur Rubin (talk) 16:41, 8 October 2008 (UTC)
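
The paste-to-spreadsheet remark above comes down to the limited precision of binary floating point. Here is a minimal Python sketch of the effect, assuming the receiving program stores numbers as IEEE 754 double-precision values the way Python's `float` does (individual spreadsheet programs may behave differently). The 20-digit string is an arbitrary example, not a value from this discussion.

```python
from decimal import Decimal

long_number = "0.12345678901234567890"  # arbitrary 20-digit example (hypothetical)

as_float = float(long_number)

# repr() returns the shortest string that converts back to the same double;
# that never takes more than 17 significant digits.
round_tripped = repr(as_float)

# Decimal(as_float) is the exact value actually stored. It differs from the
# typed value because the trailing digits cannot be represented in binary.
exactly_stored = Decimal(as_float)
precision_lost = exactly_stored != Decimal(long_number)

if __name__ == "__main__":
    print(round_tripped)
    print(precision_lost)
```

Whatever digit grouping is chosen for display, digits beyond roughly the sixteenth or seventeenth significant figure do not survive conversion to a double, which is consistent with the point made above.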

Greg L asked below, in opposition to the 5 digit proposal, "One other note: en.Wikipedia adopted the U.S. style and standardized on delimiting to the left of the decimal marker using commas. Let’s please accept that nothing in this debate can change that and limit the discussion to the number of digits per group." He also asked above "Can we agree that if we delimit values to the right of the decimal place, that it shall be done in accordance with" BIPM and NIST standards?

My answer is no. It is not appropriate to pick apart the BIPM and NIST standards and use just the parts we like. Either format the whole number with thin spaces (or some span trick that looks like thin spaces) or use commas just to the left. It is not the role of Wikipedia to invent a brand new format. Similarly, it would look really silly to group a number every three spaces with a comma to the left of the decimal, but with a thin space every five digits to the right of the decimal. --Gerry Ashton (talk) 20:06, 8 October 2008 (UTC)

Proposal to delimit long numeric strings in mathematics articles every five digits

Arthur Rubin, above, proposed that long numeric strings in Wikipedia’s “mathematical” articles should be delimited (where a gap is added between groups of digits via a &thinsp; or a <span>) every five digits. Thus Wikipedia would not follow the rule of SI, which requires that delimiting be done every three digits. He has written that groups of three are “both ugly to edit and difficult to read.” (here).

The facts: Currently, the following mathematics-related articles on Wikipedia have the numbers delimited every five digits:

The question is whether Wikipedia should standardize on this practice on all mathematics articles. Our Natural logarithm article has been stable at three digits (to name one) but Arthur put a {dubious-discuss} tag on it yesterday.

How do others feel about this? Let’s weigh in and discuss this. Whatever the outcome of this is, we need to get it memorialized in an explicit guideline in MOSNUM that in mathematics articles, long numeric strings shall (or shall not) be delimited differently than the rest of Wikipedia.


  • Oppose The rule of the SI (BIPM: More on Printing and Using Symbols and Numbers in Scientific and Technical Documents: 10.5.3, Grouping digits) is clear that long numeric strings are always broken every three digits. Unless (perhaps) the number is Pi—which is a special case because of the great interest in the long, repeating nature of it and people are especially interested in counting the digits—Wikipedia’s math-related articles should follow the rule of the SI. By the way, different countries use different delimiters. Some use thin-spaces, some use commas, some use periods. Many HP RPN-entry calculators like the HP 41 allow the user to select either comma or period delimiting but the delimiting is always done every three digits, not five. One other note: en.Wikipedia adopted the U.S. style and standardized on delimiting to the left of the decimal marker using commas. Let’s please accept that nothing in this debate can change that and limit the discussion to the number of digits per group. Greg L (talk) 15:16, 8 October 2008 (UTC)
  • Inappropriate. There is no established consensus for the 3-digit grouping, even though it's generally rational. Discussion for this should be at WT:MSM, as the discussion for the overall 3-digit grouping with spans (which I'd also oppose, but only weakly) should be here. However, natural logarithm and its base should use the same notation. Stability suggests that of the latter article. — Arthur Rubin (talk) 16:29, 8 October 2008 (UTC)
    Counterproposal. Ban Greg L from commenting on formatting proposals. Even his signature doesn't meet Wikipedia guidelines. — Arthur Rubin (talk) 16:44, 8 October 2008 (UTC)
    Arthur, you've crossed the line into the area of personal attack. Asking that someone be gagged is a sign that you've lost the debate. I will return tomorrow in support of Greg's points. Tony (talk) 17:07, 8 October 2008 (UTC)
    It may be that Greg has a point, but, as I and others have pointed out at WT:MSM, where this particular discussion should be taking place even if there is consensus for 3-digit grouping in Wikipedia in general, the real-world consensus in mathematics is 5-digit spacing or no spacing.
    This discussion should be at, and only at WT:MSM. Discussion of whether there is consensus for the 3-digit grouping in Wikipedia in general should be in this article. If Greg wishes to rephrase his proposal to a form appropriate for this style guide, we can attempt to return to civility. It should also be pointed out that I only noticed this because Greg started vandalising Pi. And I do mean, vandalizing, rather than merely making harmful edits. — Arthur Rubin (talk) 17:22, 8 October 2008 (UTC)
    The paragraph starting "Who are you trying to kid here?" indicates that Greg does not have an accurate concept of the real world, or of standards bodies. — Arthur Rubin (talk) 17:25, 8 October 2008 (UTC)
  • Oppose as per User:Greg L and what appears to be normal practice in the real world (i.e. not just mathematicians). - fchd (talk) 17:29, 8 October 2008 (UTC)

Note: I just got off the phone with a Ph.D. mathematician at Gonzaga University and had a nice talk about delimiting numbers and the nature of Wikipedia. These guys’ heads tend to be in the clouds and he had only heard of Wikipedia. Since mathematics is typically symbolic, he didn’t know anything about delimiting high-precision numbers—standard or not. So he gave me the names of the three mathematics organizations that dominate the publishing in that field. I’ve begun contacting the editors at AMS.org, SIAM.org, and MAA.org to get to the bottom of this. It might be that the mathematics world does not follow SI writing style (nor that of the NIST and ISO). It may also be that some of Wikipedia’s math articles are marching to the tune of a different drummer.

As I did over on Kilogram, where I corresponded maybe… 50 times with the guy who is working on the NIST’s watt balance, I’m going to go straight to the horse’s mouth on this one and ascertain the true facts. I just now contacted the publisher, publications manager, and managing editor at SIAM.

I think what may have happened here is that what is often done with pi (breaking it up every five digits for ease of *counting all them digits*) has been misconstrued as some sort of standard mathematical convention for delimiting large numbers across the entire discipline of mathematics. Greg L (talk) 17:50, 8 October 2008 (UTC)


Comment – bogus proposal – What sense does it make to consider a proposal written by a person who opposes it? Let Arthur Rubin or Greg L make their own proposal, instead of one writing a biased case for the other. Dicklyon (talk) 18:02, 8 October 2008 (UTC)

  • It’s not complex Dicklyon. Arthur’s allegation is unambiguous and clear: he said the mathematics world has a five-digit convention for high-precision numeric strings (∆ here). And his edit warring on this issue [10][11][12][13] on the Natural logarithm article—which had been stable at the three-digit convention—makes it quite clear that he thinks Wikipedia should conform to his views on this matter. The question is this: is his proposal proper and wise?

    And if you really think I’m putting words in Arthur’s mouth or have a bias here, please examine Arthur’s 14:34, 8 October 2008 post, above, where he wrote “Furthermore, I'm not proposing (yet) that 3-digit grouping be banned, only that the standard in Mathematics articles should be 5-digit grouping for numbers 10 digits or longer”. As I found the underlined portion of his suggestion (my emphasis) to be quite absurd (where nine-digit strings after the decimal wouldn’t be delimited at all), I left that bit of absurdity out of my summation of the proposal as I perceived it to be utterly inane.

    And I completely ignored his suggestion that consideration should be given to banning three-digit grouping altogether across all of Wikipedia; I found that to be just posturing. But you are more than welcome to revise the proposal to narrowly reflect precisely what Arthur was suggesting. Be my guest. Greg L (talk) 18:26, 8 October 2008 (UTC)

As well you know, your convention for using "*" for your replies instead of the conventional ":" on most talk pages (excluding only !votes, I believe) is a probable violation of WP:TALK. I think the <span> tag in your signature violates WP:SIGNATURE, but I'm not sure.
That being said, I was explicitly requesting this as a convention in mathematics articles, even if there were a guideline for 3-digit grouping for long numbers in Wikipedia in general. In fact, there is not such a guideline, only a weak consensus from February, which was never specifically proposed as a guideline here. In fact, I'm proposing that long numbers in mathematics articles be spaced every 5 spaces after the decimal point (with "long" being subject to debate, but certainly anything longer than 15 digits, and possibly 10.) Greg quotes standards organizations, but no books which actually use a lot of numbers, journals, or journal guidelines. He also fails to note that the IEC standards for KiB, etc. were actual standards, and accepted by IEEE, but not by any of their authors. Even if he is able to find editorial standards which mandate 3-digit spacing, it might still not be relevant to the real world, without evidence those standards are actually followed.
Still, I'm saying that Greg is welcome to propose guidelines here, and I will continue to support my proposed (draft) guidelines at WT:MSM. — Arthur Rubin (talk) 18:45, 8 October 2008 (UTC)
I look forward to reading his replies from the math organizations he claims to be contacting above. I was considering contacting them myself, but I'm sure that Greg wouldn't believe my statements as to what they said. — Arthur Rubin (talk) 18:53, 8 October 2008 (UTC)

Greg:

We do not delimit. I can't speak for others, but that is our policy.

Best regards

Given that the Ph.D. mathematician I spoke to this morning didn’t even understand the concept of delimiting, I suspect that the other two journals will have the same style guide.

Arthur, the *fluidity* of your above 18:45, 8 October 2008 proposal (“…anything longer than 15 digits, and possibly 10” ) makes it increasingly clear to me that this ‘standard in mathematics’ never came out of the professional mathematics world but is instead an accidental invention of some Wikipedians who noted that pi is often grouped that way (for demonstration purposes with a uniquely famous number) and went on a roll with it.

So now the issue is how, when high-precision numbers are used here on Wikipedia, they should be delimited. Is there any reason mathematics-related articles should be any different from the rest of the world? Numbers with high precision on the integer side of the decimal marker (like 65,812,016) are already delimited because 65812016 is hard to parse. The same can also be said about numbers with high precision on the fractional side of the significand, such as e = 2.718281828459, which is much easier to parse when it is delimited (2.718 281 828 459).
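To make the two conventions under discussion concrete, here is a minimal Python sketch. The function name and parameters are hypothetical, it assumes a plain string of decimal digits with at most one decimal point, and it uses an ordinary space where an article would use a thin space or a span.

```python
def group_number(text: str, size: int = 3, sep: str = " ") -> str:
    """Group integer digits in threes with commas (the en.Wikipedia
    convention) and fractional digits in runs of `size`."""
    whole, dot, frac = text.partition(".")
    # Integer part: commas every three digits, counted from the right.
    whole_grouped = f"{int(whole):,}"
    # Fractional part: groups counted away from the decimal marker.
    frac_grouped = sep.join(frac[i:i + size] for i in range(0, len(frac), size))
    return whole_grouped + dot + frac_grouped

print(group_number("65812016"))            # 65,812,016
print(group_number("2.718281828459"))      # 2.718 281 828 459
print(group_number("3.1415926535", 5))     # 3.14159 26535
```

The only difference between the three-digit and five-digit proposals, in this sketch, is the value passed as `size`.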

I think we’ve come down to two questions here:

  1. Given the apparent fact that professional mathematic journals don’t delimit to the right of the decimal point, should Wikipedia do so in its mathematics-related articles? I would say that with nasty-ass big numbers, “yes.”
  2. If, for ease of parsing, Wikipedia’s mathematics articles do use delimiting, should Wikipedia adopt a special practice just for its mathematics-related articles that departs from what is prescribed by the BIPM (and the NIST and the ISO) and what is used in the applied world such as physics? I would say “no.”
Notwithstanding Arthur’s healthy skepticism that anyone in the world actually bothers to follow the SI-compliant practice of delimiting digits to the right of the decimal marker in groups of three (a notion most well-educated Europeans would find utterly laughable), it is actually followed throughout the world (NIST example here). After all, much of what is in the SI is just the memorializing of long-standing practices. For him to evince a skepticism on this fact betrays, in my opinion, a serious lack of knowledge of how the applied mathematics world (physics and engineering) works—either that, or a disingenuous debate tactic that backfired.

I will relent on the issue where this dispute started: I would propose that Wikipedia’s Pi article should stay with 5‑digit groupings because that practice is quite common with Pi (Google book search of pi in 5‑digits). The number pi is unique and many readers are particularly interested in counting its digits and marveling at its irrational nature. For instance, in Wikipedia’s article on pi, the text just before the value says “The numerical value of π truncated to 53 decimal places is…” When the focus is on a specific number of digits, five-digit groupings has its virtues.

But for most everything else, like e, where an arbitrarily chosen number of digits are shown and there isn’t a special emphasis on counting them, there are plenty of easy-to-find examples showing that the mathematics world is no stranger to the standard three-digit convention familiar to any European or anyone who is familiar with how to use the SI (Google book search of e delimited in 3‑digits). I see no reason for Wikipedia to stray from standard, SI-compliant practices here.

And finally, when we do delimit really big numbers, I would propose that we recommend that editors use the hand-coded <span>-based technique until character-counting parsing functions become available for tools like {{val}} and {{delimitnum}}, which are limited as to the number of digits they can handle. The virtue of using spans, like so…

2.718<span style="margin-left:0.25em">281</span><span style="margin-left:0.2em">828</span><span style="margin-left:0.25em">459</span>

…is readers can copy and paste values into Excel, where the first sixteen digits will be treated like a real number without the necessity of hand-deleting any non-breaking spaces. Greg L (talk) 21:32, 8 October 2008 (UTC)
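For illustration only, the hand-coded span technique described above could be generated by a short Python helper such as the following. This is a sketch under stated assumptions: the function name is hypothetical, the 0.25em margin is taken from the example above, and nothing here reflects how {{val}} or {{delimitnum}} are actually implemented.

```python
import re

def span_delimit(number: str, size: int = 3, margin: str = "0.25em") -> str:
    """Wrap each group of fractional digits after the first in a span
    whose left margin produces the visual gap."""
    whole, dot, frac = number.partition(".")
    groups = [frac[i:i + size] for i in range(0, len(frac), size)]
    if not groups:
        return number  # no fractional part, nothing to delimit
    head = whole + dot + groups[0]
    tail = "".join(
        f'<span style="margin-left:{margin}">{g}</span>' for g in groups[1:]
    )
    return head + tail

markup = span_delimit("2.718281828459")
print(markup)
# Stripping the tags, roughly what a copy and paste keeps, restores the digits.
print(re.sub(r"<[^>]+>", "", markup))
```

Because the gaps come from CSS margins rather than space characters, the text content of the markup is still the undelimited number.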

If ability to copy and paste numbers into spreadsheets were really relevant, we would need to avoid using commas to delimit thousands (or we would need to use some magic to make them disappear when copied and pasted), and we would need to avoid using notation like 6.02 × 10²³ (or we would need to use some magic to make it become 6.02e23 when copied and pasted). -- Army1987 (t — c) 15:01, 10 October 2008 (UTC)
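The "magic" mentioned above would amount to a clean-up step on the pasted text. A minimal Python sketch of such a step follows. The function name is hypothetical, and it assumes the power of ten arrives written with a caret (10^23), which is only one of several forms a pasted exponent might take.

```python
import re

def parse_displayed_number(text: str) -> float:
    """Convert a displayed number such as '65,812,016' or
    '6.02 × 10^23' into a float."""
    # Drop comma delimiters, ordinary spaces and thin spaces.
    cleaned = text.replace(",", "").replace(" ", "").replace("\u2009", "")
    # Rewrite 'a×10^b' as 'aeb', the form a spreadsheet or parser accepts.
    match = re.fullmatch(r"([0-9.]+)×10\^(-?[0-9]+)", cleaned)
    if match:
        cleaned = f"{match.group(1)}e{match.group(2)}"
    return float(cleaned)

print(parse_displayed_number("65,812,016"))
print(parse_displayed_number("6.02 × 10^23"))
```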


For the corresponding digits of e (mathematical constant) (16-25 for 5-grouping and 16-24 for 3-grouping), the results are:
  • 5: 2640 web, 184 books, 13 scholar
  • 3: 805 web, 19 books, 2 scholar (one of which doesn't appear to be "E", actually, but intended as a random string of digits)
I can't find a way to get google to search for a substring of an unspaced string. I don't deny that 3-digit grouping is used for short numbers, and have no objection to it being used. I'm suggesting that in mathematical articles where the number has over (somewhere between 10 and 15, TBD) digits past the decimal point, 5 digit spacing should be used, and that 5-digit spacing is allowable for shorter numbers in those articles.
(If possible, I'd also like to see the nested span approach be deprecated; I have less objection to the code suggested here than to the code you inserted in pi, which might possibly fail if a browser is unable to handle a stack of 18 spans (in addition to whatever styles are inserted normally).)
I also am stating again that this discussion is misplaced and amounts to Greg making a WP:POINT. I admit to not having been ready for a specific proposal, but I wanted to counter Greg's edits to insert his preferred notation in stable mathematics articles against consensus, even if there were a Wikipedia guideline to use 3-digit spacing. There's no such guideline agreed to. If Greg wants to propose the guideline, then I want to make it clear that Mathematics articles should have their own guideline, which is properly discussed at MT:MSM, regardless of a general guideline here. — Arthur Rubin (talk) 22:08, 8 October 2008 (UTC)
I refuse to discuss any style issue at MT:MSM because this is the English Wikipedia and that page is in some other language. --Gerry Ashton (talk) 22:28, 8 October 2008 (UTC)
That was a typo -- try WT:MSM. Dicklyon (talk) 02:17, 9 October 2008 (UTC)
But so are the articles in question. Perhaps we should move them to the math.en or en.math Wikipedia.  :) — Arthur Rubin (talk) 22:33, 8 October 2008 (UTC)
Since Arthur Rubin insists the issue should be discussed in a foreign language, I will disregard all his views about grouping numbers. --Gerry Ashton (talk) 23:00, 8 October 2008 (UTC)
In that case, the mathematics articles should, under Wikipedia guidelines, disregard any guideline established here. I would ask you to reconsider. — Arthur Rubin (talk) 23:11, 8 October 2008 (UTC)
As for Greg, I think I could propose a guideline appropriate for this discussion which we could all (except, apparently, Gerry) live with. But I don't want to put words in his mouth. I ask him to make a proposal for style guideline which he would accept, noting that (almost) all style guidelines can be overridden by subject-specific style guidelines, and we can go on from there.
Oh, and, since no one has spoken in favor of this guideline in this venue, this section should be dropped. I spoke in favor of it (as well as requesting helpful modifications) in MT:MSM, where the discussion belongs. — Arthur Rubin (talk) 23:23, 8 October 2008 (UTC)
Instead of making claims about Wikipedia guidelines which he fails to cite, perhaps Arthur would like to explain why this should be discussed in Maltese. --Gerry Ashton (talk) 23:22, 8 October 2008 (UTC)
He obviously meant WT:MSM (since he has referenced it several times before) and it was a typo. Also, not that "authority" has much weight with the WP crowd, but Arthur Rubin is certainly more qualified to address issues of standards in mathematical publications than anyone else in this discussion. He has an Erdos number of 1, for crying out loud! --Sapphic (talk) 00:00, 9 October 2008 (UTC)
  • Well I know as much or more than probably anyone around here about PEM fuel cells and I don’t even bother to edit that damned article—I’ve barely even looked at it, much less read it. I fear I’d go try to fix something and would get into an editwar with someone who got everything he knows out of Popular Mechanics. But I did try to be somewhat informed about competing energy technologies (solar for instance) so I could be well prepared to deal with the real world and help design products that fulfilled a real marketing need.

    I also try to be thoroughly familiar with all things SI and metric; it’s a classy system of units. And I’ve authored enough patent papers and white papers to actually know how to be SI-compliant when doing so. What’s got me skeptical here about Arthur’s take on this issue—and just pardon me all over the place for thinking this—is that Arthur has busied himself here denying that the SI way of delimiting numbers every three digits to the right of the decimal point is a standard that is remotely observed in the real world. He even equated this to the lack of adoption of the IEC prefixes (mebibyte, etc.) and challenged me to cite proof that the SI method is actually adhered to in the real world. While Arthur may be an exceedingly wonderful fellow to drink beer with, the above facts tell me he is certainly not coming into this argument with a sufficiently sophisticated world view and, further, that too much education on the essential facts is needed just to bring him up to speed. His arguments, that Wikipedia needs to stray from the SI on an issue that the professional math journals are silent on are… less than persuasive. Greg L (talk) 00:37, 9 October 2008 (UTC)

    P.S. I see he’s down below, expanding on how he can find no evidence that the real world follows the rule of the SI. Breathtaking. Greg L (talk) 00:39, 9 October 2008 (UTC)


This isn’t complex, Arthur. It’s quite clear that professional mathematics journals don’t typically deal with big numbers—it’s all mostly symbolic—and when they do, they don’t bother with delimiting. If Wikipedia is going to be delimiting big numbers for readability (which we probably should do), then we should do so in a way that is SI-compliant. I can see absolutely no reason why numbers would be delimited every three digits to the left of the decimal place and then start being delimited every five digits on the right. The SI is clear that one delimits in groups of three regardless of which side of the decimal marker you are on. The only exception is in cases where you precede very special numbers with wording like “Here are the first one-hundred digits of…”. In that case, go ahead and do it in groups of five. Greg L (talk) 23:25, 8 October 2008 (UTC)

Yes, it is complex, because you keep changing stable articles such as pi. > Several times now, I’ve stipulated that because of its unique nature, Pi should stay with five-digit groupings and it’s been a day since I even argued the point on Talk:Pi. See the above post; it’s short enough for those with even short attention spans. Don’t you even read posts before responding to them? You’re relying again on fallacious arguments. Greg L (talk) 00:47, 9 October 2008 (UTC) < May I suggest the following modification of whatever 3-digit grouping rule you come up with.
In any article about a mathematical constant or constants, where at least one is known and stated to 15 places, and (almost) all are less than 10, any grouping of digits to the right of the decimal point should be in groups of 5, and breaking spaces (thin or not; I don't know if there is such a thing as a thin breaking space, although one could probably construct one out of a thin non-breaking space and a non-displaying optional line break character) should be used. If, instead, these are only known and stated to 8 (or more) places, this format is optional, if such spacing is frequently used in that field.
I'm perfectly willing to accept suggestions such as changing the "15" to "20", or the "8" to "10", but guideline as a whole should stand.
Careful study shows that all the articles pointed to by the pseudo-template in square root of 2 (and I don't know why it isn't a real template) meet that condition (except for the obvious readability requirement that breaking spaces be used if a (word, formula, or number) is likely to be wider than a page), and the entries in mathematical constant#Table of selected mathematical constants. I would rather have the condition be that mathematical constants in mathematical articles be so formatted, but it's not really important to me if there is an article about the constant. I also point to the article illegal prime which uses 5-digit spacing for an integer, but that's only formerly a featured article, and the standards may not have been as precise back then, and it may not have been featured at the time the number was present. (And it may be illegal for Wikipedia to have the number.) — Arthur Rubin (talk) 00:09, 9 October 2008 (UTC)
As for the "real world", the only cases in which large numbers are grouped ("delimited" seems incorrect), in papers that I've read, are:
  • REALLY long numbers, likely not to fit on a line (spaced in groups of 5).
  • Long tables of numbers with approximately the same number of digits (some spaced in 3-digit groups, some in 5-, one in 4-digit groups, believe it or not. If this were the only case, I wouldn't be able to assert it's usually spaced in 5)
  • Abramowitz and Stegun (in which not only stand-alone numbers, and numbers in tables, but also numbers in formulas are spaced in groups of 5. However, the physical constants page are spaced in groups of 3.).
This applies not only to US publications, but to such as Fundamenta Mathematicae at the time I published there. But all of these are over 20 years old. I haven't seen a mathematical paper with grouped digits since then.
Arthur Rubin (talk) 00:29, 9 October 2008 (UTC)
(Reply to interpolated comments). All articles about mathematical constants should have 5-digit spacing, not just Pi. Pi is just the only one Greg has attacked. — Arthur Rubin (talk) 01:13, 9 October 2008 (UTC)
Oh for God’s sake, Arthur. Abramowitz and Stegun was first printed in 1964 and it is just a humungous tabular list of numbers. It was jointly produced with the help of the NIST, which is infinitely clear as to the proper, modern way to express numbers that are included as part of standard prose (More on Printing and Using Symbols and Numbers in Scientific and Technical Documents: 10.5.3, Grouping digits and NIST example of proper use). Is this part of why you think “mathematics is five-digit groupings?” Perhaps if Wikipedia had mind-boggling lists of purely tabular data, they too should use five-digit groupings. But that is not what we’re talking about here. Wikipedia needs to be SI-compliant. It’s that simple. It’s quite clear from having spoken to a Ph.D. mathematician at a university this morning and having read your writings here, that neither of you guys would recognize SI-compliant writing if it bit you on the butt!

And please stop insisting that “mathematics does it this way or that way.” I just communicated this morning with a mother of all mathematical journals and they’re completely silent on this issue. And now you’re here saying we should ignore the way it’s done in the applied sciences (and the way even semi-educated Europeans do it and the way the NIST and the BIPM prescribe) because you can dredge up reference books of tabular numbers that show them that way. Greg L (talk) 01:27, 9 October 2008 (UTC)

Wrong. Abramowitz and Stegun is "tabular" in a sense, but it's the same sense a properly formatted HTML document is tabular; Chapter 1 may contain sections 1.1, 1.2 (table of physical constants), 1.3, 1.4 (table), 1.5, 1.6, etc.; Section 1.1 may contain formulas 1.1.1 through 1.1.10, table 1.1.11, subsection 1.1.12 which contains formulas 1.1.12.1 through 1.1.12.5, etc. It contains many tables, but as far as I can tell, all numbers other than physical constants are spaced in 5-digit groups, while physical constants are spaced in 3-digit groups. You can't say that wasn't intentional. (Well, you can, but it would be strange).
Google books and Google scholar confirm that many more books and papers use 5-digit spacing for precise values of mathematical constants than 3-digit spacing, at least for representations of π and e of at least 25 digits after the decimal point. (I chose "25" to be fair to the 3-digit representations; 25 digits for the 5-digit and 24 for the 3-digit.) I can't get google to search for unspaced constants, so that may be more prevalent.
  • Oppose None of the proposals from Arthur Rubin actually help the situation and plenty of other suggestions are much better. Fnagaton 13:53, 9 October 2008 (UTC)
  • Oppose – I acknowledge that it is more or less common to format long numbers that run over several lines in groups of five. But I doubt the need of such deviation from a consistent guideline here on WP. I also suggest to always use spans for spacing. Template:spaced could easily be expanded to work for long numbers. —Quilbert (talk) 14:14, 9 October 2008 (UTC)

Dueling proposals

After consideration, I propose that 5-digit spacing be used in articles about mathematical constants known precisely. You can propose that the (not exactly SI)-standard form (modified to use commas left of the decimal point) be generally used, but you really haven't done that yet, so we can't say there's a consensus to do it. — Arthur Rubin (talk) 02:05, 9 October 2008 (UTC)
  • My proposal is extraordinarily simple: Bring virtually all delimited values on Wikipedia into compliance with the SI, which underlies pretty much everything done here on Wikipedia. Period.

    There are some exceptions to this principle of adherence to the SI; for instance, the SI prescribes that a space be inserted before the percent symbol (e.g. 75 %). But Wikipedia wisely ignores this and follows the common practice observed in the real world. This is not the case with your suggestion that we flout the SI and delimit with commas every three digits to the left and every five with gaps on the right. The BIPM (and NIST and ISO) don’t mention any exception for “articles about mathematical constants known precisely”. Further, compliance with the SI has the virtue of following how the real world in real, every-day life in Europe works: delimiting in groups of three—regardless of which side of the decimal point you’re on.

    But I would stipulate that in cases where the text preceding very special numbers has wording that invites readers to count digits, such as “Here are the first fifty digits of…”, then in that case, go ahead and do it in groups of five. The same should apply for large tabular seas of numeric values, such as high-precision trigonometric tables. Greg L (talk) 02:43, 9 October 2008 (UTC)

I support Arthur Rubin's as representative of typical published typography for long digit sequences, and oppose Greg L's as an unnecessary change to a narrow class of items, those mathematical constants that people want to see lots of digits of. The break point is around 10 digits, or between physical and mathematical constants. For example, 299,792,458 m/s, but 3.14159 26536. Which is how it has been for quite a while, and nobody but Greg L is seeking to change it. Dicklyon (talk) 03:38, 9 October 2008 (UTC)

I would support Greg's proposal in the paragraph before Dicklyon's if I thought we could pull it off; unfortunately, I don't think we could get it adopted. If the proposal were adopted, WP:MOS#Large numbers would change as shown:

  • <s>Commas</s> Thin spaces are used to break the sequence every three places (2 900 000).

Interestingly, the rule about not grouping digits to the right of the decimal is hidden away someplace away from where comma-grouping is discussed, making it difficult to find.

Please interpret my version of the new rule only in terms of how it appears to the reader; the mechanism to create the appearance is still up in the air. --Gerry Ashton (talk) 03:55, 9 October 2008 (UTC)

  • For an online readership of our range and type, I fully support Greg's line on this matter. I find the aggressive stance here by [unnamed] to be odd given Greg's experience in the editing of engineering and mathematics topics. Tony (talk) 08:08, 9 October 2008 (UTC)
  • Support 5-digit grouping for very long numbers because that's the way I've always seen them grouped in the real world. Long numbers doesn't really fall under the scope of SI and NIST guidelines; these rules are meant for measurements, which are never known with more than a dozen significant figures or so. Trying to shoehorn mathematical constants and long numbers into the SI is not helpful. --Itub (talk) 09:46, 9 October 2008 (UTC)
  • Any such standard needs to allow valid exceptions. For instance, geographic coordinates are usually quoted to six decimal places (and can have more, or fewer), always with no spacing: "52.342345,-23.765134". Andy Mabbett (User:Pigsonthewing); Andy's talk; Andy's edits 10:39, 9 October 2008 (UTC)
  • This has to be one of the lamest disputes I've seen in a long time… I pinch myself, but editors really are arguing about whether to write 3.141 592 653 589 793 238 46…, 3.14159 26535 89793 23846… or just plain old 3.14159265358979323846…! Actually, I note that not one editor has proposed the third option, despite the fact that this is what is used in the article. Neither has any editor commented on the lamentable state of most of the articles about "special irrational numbers", nor about the fact that the infoboxes in such articles give the "binary" form (if such a beast really existed) before the decimal form, and also include ludicrous translations into hexadecimal notation. Oppose all proposals as WP:CREEP, lacking in consensus and an obviously serious distraction of editors from the task of improving articles. Physchim62 (talk) 13:55, 9 October 2008 (UTC)
  • No, you are wrong, Itub, when you wrote “Long numbers doesn't really fall under the scope of SI and NIST guidelines”. Let’s get our facts here straight. The BIPM’s SI style guide has a section, 10.5.3, dealing specifically with long numbers. It calls for groups of three digits. Period. This simply reflects the Europeans’ centuries-long practice of grouping in threes to the right of the decimal marker. Further, that practice is a logical extension of how delimiting is always done to the left of the decimal marker in groups of three wherever you live; one doesn’t suddenly change to groups of five just because one is to the right of the decimal point. This issue also clearly falls within the “scope” of the NIST (splendid example here), as they too have a style guide that calls for SI compliance. So too does the ISO.

    The US practice of delimiting with commas to the left has been adopted for use here on en.Wikipedia; we’re not going to be changing that with this discussion. When delimiting is employed to the right of the decimal marker as well, then it should be A) logical (you don’t suddenly change to groups of five), and B) SI compliant.

    And with due respect to Physchim62, style guides—like MOS and MOSNUM—serve a valuable editorial purpose; that’s why all publications, whether it’s the NY Times or the Encyclopedia Britannica, have one. We don’t need editors on Wikipedia inventing new systems here. We don’t need two versions of {{val}}: one called {{val_(SI-compliant)}} and the other called {{val_(for_mathematical_constants_known_precisely)}}. Greg L (talk) 17:05, 9 October 2008 (UTC)

  • I believe Greg L cited the wrong publication; it is NIST, not BIPM, that has a style guide. That style guide (section 10.5.3) says:
Because the comma is widely used as the decimal marker outside the United States, it should not be used to separate digits into groups of three. Instead, digits should be separated into groups of three, counting from the decimal marker towards the left and right, by the use of a thin, fixed space. However, this practice is not usually followed for numbers having only four digits on either side of the decimal marker except when uniformity in a table is desired.
[examples omitted]
Note: The practice of using a space to group digits is not usually followed in certain specialized applications, such as engineering drawings and financial statements. [boldface added]
Notice the use of commas to group digits, anywhere, is clearly contrary to this style guide. --Gerry Ashton (talk) 17:39, 9 October 2008 (UTC)
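The rule quoted from the NIST guide can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `group_threes` is hypothetical, the input is assumed to be a plain digit string with "." as the decimal marker, and the "four digits" exception is applied to each side of the marker separately, which is one possible reading of the quoted text.

```python
THIN_SPACE = "\u2009"  # Unicode THIN SPACE


def group_threes(number: str, sep: str = THIN_SPACE) -> str:
    """Group digits in threes on both sides of the decimal marker.

    Hypothetical helper: a side with four or fewer digits is left
    ungrouped, following the exception in the quoted guideline.
    """
    sign = ""
    if number.startswith(("+", "-", "\u2212")):
        sign, number = number[0], number[1:]
    integer, _, fraction = number.partition(".")

    # Integer part: count from the decimal marker towards the left.
    if len(integer) > 4:
        head = len(integer) % 3 or 3
        chunks = [integer[:head]]
        chunks += [integer[i:i + 3] for i in range(head, len(integer), 3)]
        integer = sep.join(chunks)

    # Fractional part: count from the decimal marker towards the right.
    if len(fraction) > 4:
        fraction = sep.join(
            fraction[i:i + 3] for i in range(0, len(fraction), 3)
        )

    return sign + integer + ("." + fraction if fraction else "")


# An ordinary space is passed here only so the grouping is easy to see.
print(group_threes("86164.555368", sep=" "))
print(group_threes("1234.5678", sep=" "))
```

No commas appear anywhere in the output, which is the point of contention in this thread: the quoted rule uses the same separator on both sides of the decimal marker.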
  • (*sigh*) Here is the link to the BIPM’s style guide (which the NIST mirrors): 5.3.4 BIPM: SI brochure (8th ed.): Rules and style conventions for expressing values of quantities: Formatting numbers, and the decimal marker. I cited and linked to this in my 23:19, 7 October 2008 post, above, but may have gotten the two twisted in a copy/paste since then.

    Now, Gerry, let’s get real shall we? I’ve mentioned several times above that en.Wikipedia adopted the US convention of delimiting to the left with commas. Nothing we’re discussing here is ever going to change that fact. The people over on fr.Wikipedia will keep doing as they like. As I wrote several times above (it would be nice if you actually read some of the goings-on here because I’m way ahead of you here) this practice of comma-delimiting to the left is far too entrenched in the U.S. and across the Internet for you to change that with your above epiphany. None of us here in this debate on this mote of a backwater discussion is going to change the way the U.S. works in this regard nor en.Wikipedia’s adoption of that widespread convention.

    As I also wrote above, this is an issue of simply adhering to the three-digit practice that is common throughout Europe and which has been standardized for use with the SI. And if your point is that we should ignore the entire SI style guide because we ignore parts of it, I reject that as utterly absurd. We already reject the BIPM’s call that a space be inserted before the percent symbol, e.g. 75 % (5.3.7 Stating values of dimensionless quantities, or quantities of dimension one). Why? Because the real world doesn’t work that way. Well, the real world actually delimits in groups of three and the BIPM and NIST and ISO know that. One doesn’t suddenly change to five-digit groupings to the right (retaining three-digit groupings to the left) just because the article mentions “mathematics” four times in the body text. We don’t need two versions of {{val}}: one called {{val_(SI-compliant)}} and the other called {{val_(for_mathematical_constants_known_precisely)}}. Greg L (talk) 18:07, 9 October 2008 (UTC)

  • Using commas to group digits, either to the left or the right of the decimal marker, is not compliant with the NIST Special Publication 811 guideline, nor is it compliant with BIPM brochure. If you want to propose something that groups with commas to the left of the decimal marker and spaces to the right of the decimal marker, go ahead, but do not claim it is "SI-compliant". YOUR PROPOSAL IS AN OBVIOUS VIOLATION OF SI. --Gerry Ashton (talk) 18:22, 9 October 2008 (UTC)
  • You obviously didn’t read or understand what I wrote above, Gerry. Either that, or you are using utterly fallacious arguments that ignore common sense in an effort to justify an inane proposal. The only part of my suggestion that is a violation of the SI is that we not *pretend to* change the United States’ long-standing practice of using commas to delimit to the left. You’re simply being absurd and childish. And since you seem to have a brain-block on this, 3.141592658 is perfectly SI-compliant. Your implicit suggestion that all that must go out the window with U.S.’s 31,415.926585 is ludicrous. Finally, you’re shouting. As you no longer intend to debate here in a helpful or constructive manner, I will no longer respond to you here on this issue. Goodbye. Greg L (talk) 18:36, 9 October 2008 (UTC)
  • P.S. I now see that your response to this is to nominate the {{val}} template for deletion (your nomination notice here). Note that there are articles that use this. At least one of them has been awarded GA status. Do you enjoy being a pain for others? Greg L (talk) 18:53, 9 October 2008 (UTC)
STOP! Wikipedia has dispute resolution procedures, I beg you to use them now. Otherwise I will refer this discussion to WP:AN: I think that the possible consequences of this dispute are sufficient to justify such a step. All users involved are being silly, but the party's over, sorry, I will not sit by and see the format of scientific and mathematical articles be fought over in this way. Physchim62 (talk) 19:16, 9 October 2008 (UTC)
  • I consider it prudent to stop the use of badly designed templates that just happen to look ok when their capabilities are not pushed. For example, while 3.141592658 is indeed SI compliant, 86,164.555368 mean sidereal seconds per mean solar day is not. --Gerry Ashton (talk) 19:19, 9 October 2008 (UTC)


  • It doesn’t matter what just one editor thinks Gerry. Wikipedia would grind to a halt if every editor with strong feelings on a subject flouted the general consensus and deleted whatever he disagreed with. Those two templates you nominated for deletion were extensively discussed and well received on both WT:MOSNUM and WT:MOS (here and here). You were the only holdout and were quite clear that you opposed these templates ([14]). You know full well what the consensus was regarding these templates. You also know full well that the deletion of those templates would be exceedingly disruptive and would damage Wikipedia. One of the articles that makes extensive use of these templates just received WP:GA status. To the others: (Physchim62). No worries. I’ve already filed an ANI here over what Gerry did today. Greg L (talk) 20:32, 9 October 2008 (UTC)
Large or long numbers
  • Commas are used to break the sequence every three places left of the decimal point; spaces or dots are not used in this role (2,900,000, not 2 900 000). Optionally, thin spaces or markup that creates the appearance of thin spaces may be used to break the sequence every three places to the right of the decimal point; commas are never used to the right of the decimal point. Note this convention is unique to the English Wikipedia. Technical tables may have a unique format if it aids readability. Quotations retain the format in the original.
I would be much less concerned about the Val template. I don't mind if Wikipedia creates its very own format, as long as the community does so with its eyes wide open. I would also want some assurance that the problems alluded to in the following passage from MOSNUM have been resolved:
    • {{val}} is meant to be used to automatically handle all of this, but currently has known bugs, principal among them, not displaying some values as typed in the code (see Talk:val). Use with great consideration and always check that it will give the correct results before using it.
--Gerry Ashton (talk) 22:19, 9 October 2008 (UTC)
  • All: What Gerry is saying above is unsupportable. The {{val}} template is intended for making fully SI-compliant numeric equivalencies. Examine these examples:

SI-compliant output
  1. {{val|6.62606896|(33)|e=-34|u=[[Joule-second|J·s]]}} → 6.62606896(33)×10−34 J·s (compare to NIST’s version here)
  2. {{val|1.3806504|(24)|e=-23|u=J K<sup>−1</sup>}} → 1.3806504(24)×10−23 J K−1 (compare to NIST’s version here)
  3. {{val|1.660538782|(83)|e=-27|ul=kg}} → 1.660538782(83)×10−27 kg (compare to NIST’s version here)
Note that {val} uses thinspaces to the left and right of the × sign. This was a compromise solution that made everyone happy on WT:MOS.
What Gerry has objected to in the past (it is a bit unclear what he is complaining about here), is that {val} can also be used to make the U.S.-style delimiting on the left-hand side of the decimal point that has been standardized here on en.Wikipedia. Thus:
  • {{val|12345678}} → 12,345,678.
It appears that Gerry would have this written out as follows: 12 345 678 since that is what the BIPM prescribes for world-wide use. Well, this is en.Wikipedia and delimiting with gaps to the left of the decimal point is simply not in the offing here. It would be exceedingly naive and unrealistic to expect otherwise. It’s just that simple.
This specific issue (commas to the left) was discussed by very many editors back in February and Gerry expressed his opposition at that time. But Gerry’s views were heard and rejected by the majority as unworkable and impractical. All aspects of {{delimitnum}} (fully embodied in {{val}} ) were thoroughly discussed on both WT:MOSNUM and WT:MOS, the tool was enthusiastically supported, was put forth for being made, and a Bugzilla was posted asking the developers to make the special parser functions necessary to employ it. Gerry disagreed at that time. And he’s agitating here again on the issue.

Uses <span> to create thin spaces
Note that there is something else that the {val} and {delimitnum} templates do. They don’t use “spaces” to delimit the fractional portion of the significand (the portion of the significand to the right of the decimal marker). Instead, they use what typographers refer to as “pair kerning” via em-based control of margins (e.g. <span style="margin-left:0.25em">). Margin positioning is a Cascading Style Sheets (CSS) property, applied here through what the Web-authoring community calls span tags. Effectively, what appears to be a space would really only be a visual effect caused by the precise placement of the digits; the “spaces” wouldn’t be separate, typeable characters.

To see the difference, slowly select the two values below with your mouse:

6.022464342 (via the em-based span tags that {val} uses. Note how the cursor snaps across the gaps)
6.022 464 342 (via non-breaking spaces, note how the spaces can be individually selected)

One might ask “Why is em-based margin control via span tags nice?” Note how, as you select the two values above, the lower version has spaces that can be selected because they are distinct characters. Now try double-clicking on both of the above values. Note how you can select the entire significand of only the top value with a double-click. By using the technique illustrated in the top example, people will be able to select entire significands from Wikipedia and paste them into Excel, where they will be recognized as real numbers! This beats the hell out of the old system, where (as exemplified at Font size) simple regular spaces and non-breaking spaces are used to delimit numbers. These values can’t be copied and used in Excel without first hand-deleting each of the spaces from every value. Until the spaces have been deleted, Excel treats the numbers as text strings upon which mathematical operations can’t be performed. If you try, you’ll just be met with a #VALUE! error.
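The margin technique described above can be illustrated with a short sketch. It is not the actual {{val}} or {{delimitnum}} markup: the function names `span_delimit` and `visible_text` are hypothetical, the 0.25em gap is simply the figure quoted in the post, and tag stripping is only a rough stand-in for what a browser hands to the clipboard.

```python
import re


def span_delimit(number: str, gap: str = "0.25em") -> str:
    """Wrap each three-digit group after the first in a margin-shifted span.

    Hypothetical helper: no space characters are emitted, so the gaps
    exist only as a visual offset between groups of digits.
    """
    integer, _, fraction = number.partition(".")
    if not fraction:
        return integer
    groups = [fraction[i:i + 3] for i in range(0, len(fraction), 3)]
    html = groups[0] + "".join(
        f'<span style="margin-left:{gap}">{group}</span>'
        for group in groups[1:]
    )
    return f"{integer}.{html}"


def visible_text(html: str) -> str:
    """Strip tags, approximating the text a copy and paste would yield."""
    return re.sub(r"<[^>]+>", "", html)


markup = span_delimit("6.022464342")
print(markup)
print(visible_text(markup))         # the digits with no separators
print(float(visible_text(markup)))  # parses directly as a number
```

By contrast, a value delimited with real space characters, such as "6.022 464 342", makes `float()` raise a `ValueError`, which mirrors the spreadsheet behaviour described in the post.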

True minus sign for exponents
And there is another bit of attention to detail that SkyLined took care of with {val}. When we hand type a negative exponent, like “-34” we type using the hyphen key on our keyboard. Even if we press the ‘minus’ key on a numeric keypad, we still end up with a hyphen (ASCII character 45). The trouble with the hyphen is it appears rather short when superscripted and looks like 1 × 10-34. SkyLined’s {val} template substitutes the true minus sign (Unicode &#x2212;) when rendering the expression to produce 1×10−34.
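The substitution can be sketched as follows. This is an illustration only, not the template's code; the name `typeset_exponent` is hypothetical. The character names come from Python's standard `unicodedata` module, which identifies the keyboard character as HYPHEN-MINUS (code 45) and the typographic minus as MINUS SIGN (U+2212).

```python
import unicodedata

HYPHEN_MINUS = "\u002d"  # what the keyboard produces (ASCII 45)
MINUS_SIGN = "\u2212"    # the typographic minus sign


def typeset_exponent(exponent: str) -> str:
    """Replace a typed hyphen-minus with the true minus sign."""
    return exponent.replace(HYPHEN_MINUS, MINUS_SIGN)


for char in (HYPHEN_MINUS, MINUS_SIGN):
    print(f"U+{ord(char):04X}", unicodedata.name(char))

print(typeset_exponent("-34"))
```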

No-wrap
Of course, {val} takes care of all the {{nowrap}} details so that no part of your entire numeric equivalency will do a line-end break.
The {val} template is easy to use, produces gorgeous, SI-compliant output (as compliant as possible if U.S.-style commas show on the left of the decimal point), and readers can double-click to select entire significands and paste them into Excel where they will be instantly treated as numeric values without any further editing. Greg L (talk) 00:40, 10 October 2008 (UTC)


Let's not forget to look at another number from the same web site 12906.4037787 Ω, and compare that to NIST's version. Are we to accept comma grouping and space grouping not only in the same article, but in the same number, without a revision to MOSNUM? Note that the Val template also made a small binary-to-decimal conversion error; the 699 on the extreme right should just be 7.--Gerry Ashton (talk) 00:59, 10 October 2008 (UTC)
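One plausible mechanism for a stray trailing "699" is sketched below, assuming (this is not confirmed anywhere in the thread) that the template's arithmetic passes the value through binary floating point. The sketch does not reproduce the template; it only shows that the typed decimal string and the nearest binary double are different numbers, so digits derived from the double can drift from what was typed.

```python
from decimal import Decimal

typed = "12906.4037787"

# Exact decimal value of the digits as typed.
exact_decimal = Decimal(typed)

# Exact value of the nearest IEEE 754 double, written out in decimal.
exact_binary = Decimal(float(typed))

print(exact_decimal)
print(exact_binary)
print(exact_decimal == exact_binary)
```

Working on the digit string itself, rather than on a floating-point value, avoids the discrepancy entirely.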
  • Then don’t use {val} for that value if you don’t want to. Values like that are rare on Wikipedia anyway. But if you start using the Euro method of using spaces on both sides of the decimal point, some confused reader is going to change it. Greg L (talk) 01:06, 10 October 2008 (UTC)
  • Real-world example: the CRC Handbook of Chemistry and Physics has a table of mathematical constants, which is spaced every five digits, and a table of physical constants, which is spaced every three digits. They can be flexible when needed; why can't we? I'd like to see an official publication by BIPM or SI that shows a mathematical constant with more than, say, 20 figures and is spaced every three digits as a counterexample. --Itub (talk) 05:20, 10 October 2008 (UTC)
  • There is tons of evidence in books and papers that math constants like pi and e are much more often printed with 5-digit groups than with 3. Greg L keeps ignoring this fact of such long numbers being treated differently, essentially as digit sequences as opposed to just values, and wants to unify such digit sequences with ordinary numbers. In spite of the NIST/BIPM guide that says one may divide into groups of 3, division into groups of 5 remains widespread for such cases. So there's no compelling reason to change how it has long been done in wikipedia. Dicklyon (talk) 07:02, 10 October 2008 (UTC)
  • Google books test: 164 hits for "14159 26535 89793 23846 26433 83279" vs 58 hits for "141592653589793238462643383279" and only 16 hits for "141 592 653 589 793 238 462 643 383 279". Ok, all three styles have seen some use in the real world for 30-decimal pi, but the five-digit grouping is the most popular one and the three-digit grouping is by far the least popular. --Itub (talk) 12:58, 10 October 2008 (UTC)
    • Note: similar results are found for e, the square root of 2, and the golden ratio. In one case (I don't remember which) there were zero hits for three-digit grouping while there were a handful for five-digit grouping. Also note that to have a meaningful comparison the numbers need to be at least 15 digits long. --Itub (talk) 15:41, 10 October 2008 (UTC)
  • I've felt all along that separating digits into groups of three with thin spaces both left and right of the decimal place would be a hard sell; the only places I knew of that used that standard were BIPM, NIST, and IEEE. NIST couldn't even get the rest of the government to go along with the standard. I have finally found one additional source that follows the standard: Blackburn & Holford-Strevens The Oxford Companion to the Year published by Oxford University Press in 1999, with corrections in 2003. An example from page 805 is "40 929.397 74".
I've always felt it was a choice among bad alternatives; the usual US typography, which is hard to read, the BIPM standard, which is unpopular (or maybe "rare" is a better word), and Greg L's proposal, which is unique to Wikipedia. --Gerry Ashton (talk) 13:22, 10 October 2008 (UTC)
  • "In certain subject areas the customary format may differ from the usual national one: for example, articles on the modern U.S. military often use day before month, in accordance with usage in that field." The quote is, of course, from WP:MOSNUM! If we can have such a common-sense compromise for dates, why not for numbers? Why not restrict ourselves to saying that number format should correspond to English-language usage in the subject area of the article? That would be ISO 31-0 for the physical sciences, but could be different in other fields. Hence, the speed of light would be 299 792 458 m/s but the population of New York City would be 8,274,527. Physchim62 (talk) 13:30, 10 October 2008 (UTC)
    • Note that, for the speed of light, the Google hits are about 80,000 for both 299,792,458 and 299792458, as opposed to 18.8 million for 299 792 458. I would hardly call ISO 31-0 unpopular or rare! Physchim62 (talk) 13:37, 10 October 2008 (UTC)
    • Restricting the search to Google Books, the figures are 2120 for the version with spaces and 651 for the version without spaces. None of the first 150 hits in the search for "299,792,458" actually used commas to separate the groups of digits; all simply had no separation. Physchim62 (talk) 13:56, 10 October 2008 (UTC)
      • You must have forgotten to put your spaced google web query in quotes, because it is also matching pages where 299, 792, and 458 occur in different parts of the page. Also, google web has very weird rules for dealing with punctuation, so a search for 299,792,458 or for "299 792 458" can return results for 299792458. Google books has different rules, because "299 792 458" doesn't return hits for 299792458. --Itub (talk) 14:33, 10 October 2008 (UTC)
    • OK, the figures come out about the same for spaced and unspaced versions of the speed of light, both in Google Books and Google Web. Can't say anything about the use of commas, as Google appears to systematically remove them from large numbers. Physchim62 (talk) 15:00, 10 October 2008 (UTC)

Dueling proposals (cont'd)

I think the important distinction here is to not look towards reference books filled with tabular data for guidance here on this issue. If Wikipedia had a page filled with lots of high-precision tabular data with values less than 1—such as high-precision trig tables (who uses those anymore anyway?), then we might follow that convention for our tabular trig tables. We might have tables that look like this:


90.00°  1.00000 00000 00000
89.99°  0.99999 99847 69129
89.98°  0.99999 99390 76517
89.97°  0.99999 98629 22164
89.96°  0.99999 97563 06074
89.95°  0.99999 96192 28249
89.94°  0.99999 94516 88695
89.93°  0.99999 92536 87414
89.92°  0.99999 90252 24415
89.91°  0.99999 87662 99704

89.90°  0.99999 84769 13288
89.89°  0.99999 81570 65176
89.88°  0.99999 78067 55379
89.87°  0.99999 74259 83907
89.86°  0.99999 70147 50771
89.85°  0.99999 65730 55985
89.84°  0.99999 61008 99561
89.83°  0.99999 55982 81513
89.82°  0.99999 50652 01858
89.81°  0.99999 45016 60611

89.80°  0.99999 39076 57790
89.79°  0.99999 32831 93413
89.78°  0.99999 26282 67498

Now that I’ve just made one, I just love the look of the above table. But I think for the purposes of this discussion, we should strictly limit ourselves to how standard numeric equivalences and simple numbers that are used mid-stream in the main body text ought to be expressed. I also don’t think we can even look towards how special numbers like pi are done in books since all these monster-size numbers originally came from computerized sources that adhered to the long-standing practice first used in actual line-feed printouts. I can see that notwithstanding this “source” issue, many books still saw fit to strip out the 5-digit delimiting and format them into the SI-compliant form. We simply can’t look towards the number of books that gush over high-precision values of pi nor reference books.

For most numeric equivalencies and numbers used in in-line prose (those values that don’t invite readers to start counting numbers by using wording like “Here are the first 50 digits of this never-ending number…”), I don’t see why we can’t simply follow what the NIST and the BIPM and the ISO recommend here. None of them mention a special exception for mathematical constants. What are we to do with “mathematical constants known precisely” if they exceed a hundred-thousand? Is pi12 to be written like 9,24269.18152 3374 or 9 24269.18152 3374? I don’t think we want to start delimiting on both sides in groups of five nor would we want to mix it up and delimit every three to the left and every five to the right.

For simple ordinary numbers in regular body text where we aren’t inviting readers to count digits, it makes abundant sense to me to just follow what the standard bodies say to do: delimit in groups of three regardless of which side of the decimal marker you’re on. We would then have numbers that look like these: h ≈ 6.62606896(33)×10−34 J·s and k ≈ 1.3806504(24)×10−23 J K−1 and mu ≈ 1.660538782×10−27 kg and e ≈ 2.718281828. This is the right way to do it. Greg L (talk) 20:33, 10 October 2008 (UTC)

I think your "example" of π12 is just an Aunt Sally. Let me give you a real example, the von Klitzing constant – fundamental, as it happens, for watt balance measurements. Its value in current SI units, as formatted by {{val}}, is 25812.807557(18) Ω, and its conventional unit is the ohm rather than the kilohm: are you seriously suggesting that we mix comma delimitation and space delimitation in the same number? Physchim62 (talk) 21:45, 10 October 2008 (UTC)
Greg, like Gerry, is also violating WP:POINT in this discussion by intentionally misrepresenting my (draft) proposal, which is and should be discussed at WT:MSM only. The way I see it, we're discussing two guidelines which do not precisely conflict.
You are proposing we formalize your guideline that we generally use the SI format (to the right of the decimal point) by inserting formatting spaces (<span>, etc.) every 3 digits. This was approximately agreed upon in discussion, but it was never actually proposed as a guideline. If proposed, I'd probably !vote to approve, even without evidence of real-world consensus. (I'd propose a separate note that numbers with 12 digits past the decimal point are generally not encyclopedic unless they are mathematical or definitional constants.)
I'm proposing a separate guideline for mathematics articles with long precisely-known (unitless) constants less than 10 in value. As 5-digit spacing is what's used in the few mathematics books which actually have long numbers in them, this reflects a real-world consensus with respect to that topic.
Arthur Rubin (talk) 21:45, 10 October 2008 (UTC)
I would agree with Arthur Rubin, with the additional point that the recommendations are only really for technical articles: general articles do not usually quote numbers with more than four decimal places. Obviously I'm in favor of using ISO 31-0 format in physical sciences articles to the left of the decimal marker as well. Physchim62 (talk) 21:59, 10 October 2008 (UTC)
What's unencyclopedic with ge = −2.002 319 304 3622(15) (which {{val}} fails to display correctly, BTW) [15]? -- Army1987 (t — c) 00:33, 11 October 2008 (UTC)
  • I’m done. Do what you want. Greg L (talk) 22:32, 10 October 2008 (UTC)
How about: "In values with five or more digits after the decimal point, these can be separated in groups of three by thin spaces (except that a single digit at the end would not have a preceding space, e.g. 0.1234567); but representations with fifteen or more digits after the point of mathematically defined irrational constants can optionally use groups of five digits (e.g. e = 2.71828 18284 59045…); in this case, the number of digits used after the point and before the ellipsis should be a multiple of five." -- Army1987 (t — c) 01:03, 11 October 2008 (UTC)
  • Makes perfect sense, Army. ;-P Greg L (talk) 01:55, 11 October 2008 (UTC)

Physchim62's proposal

Change
"Commas are used to break the sequence every three places left of the decimal point; spaces or dots are not used in this role (2,900,000, not 2 900 000), except in technical tables or in quotations where the original does so (such as in scientific publications)."
to
"Large numbers are usually broken into groups of three digits by commas, eg 'the population of New York City is 8,274,527 (figure for July 1, 2007).' Certain subject areas have different rules, notably the physical sciences, and these should be respected if they are verifiable and have the consensus of editors."

What do people think? Physchim62 (talk) 22:56, 10 October 2008 (UTC)

  • It is quite unclear what is meant. “Certain subject areas have different rules…and these should be respected.” What does that mean? It seems ambiguous to me and ambiguity leads to edit warring.

    There is already a serious lack of standardization in our articles. The “physical sciences” includes astronomy. Our own Moon article says this in the body text:

  1. “…since the common centre of mass of the system (the barycentre) is located about 1 700 km beneath the surface of the Earth”
  2. “The Moon's diameter is 3,474 km.”
Euro-style number with a space in the first; American-style with a comma in the second. That whole Moon article is a mess and I suspect that MOSNUM and its inability to decide on anything can probably share some of the blame. Note the “centre of mass” part first. The Moon article has “center” (U.S. dialect) seven times and “centre” (British dialect) six times. The barycentre link (British) is broken and redirects to Center of mass (American). And as you can also see from the way the numbers are formatted, there isn’t even any consistency within that article for five-digit numbers! What a mess. That article used to be a FA article. And judging from your proposal, it now seems that how numbers will be formatted will depend on whether a reader goes to Moon or to Cheese. I think we need to get our act together here. MOSNUM is becoming the butt of jokes among the admins. Greg L (talk) 23:59, 10 October 2008 (UTC)
    • Another Aunt Sally. Let's take another style guideline, WP:ENGVAR: it doesn't prevent problems (or edit wars), but it gives a clue as to how to solve them. Moon, from your evidence, doesn't comply with the most basic of requirements of WP:ENGVAR, that to be consistent within the same article. As for numbers, the first example is incorrect according to the current guidelines but it's hardly worse than taking readers on several transatlantic dialect-trips. Physchim62 (talk) 00:21, 11 October 2008 (UTC)
  • Please be specific. What is meant by “Certain subject areas have different rules…and these should be respected.”? Are you suggesting that numbers like “8,274,527” would look different in articles that belong to the physical sciences? With your current wording, there is too much room for interpretation as to what is meant for a guideline to be posted on MOSNUM. This isn’t the NY Times where everyone has a journalism degree and works for the same boss. There are editors from different countries, different backgrounds, different educations, and different preferences to have that much ambiguity. Guidelines here need to be even clearer than those at a newspaper. Just spell out what you’re driving at. Greg L (talk) 00:48, 11 October 2008 (UTC)
  • Yes, were the number “8,274,527” significant for a physical reason (rather than being an estimate, quoted at a ridiculously low level of uncertainty, of the population of New York City on a given day) it should have a different style, that style being the one which is defined in ISO 31-0 and accepted throughout the world of scientific publication. Physchim62 (talk) 01:07, 11 October 2008 (UTC)
  • That is much more helpful (although it still requires that editors go read the ISO 31-0 article in depth). I like the part where ISO 31-0 says “long sequences of digits can be made more readable by separating them into groups, preferably groups of three…”.

    However, I don’t think the part where it says “…groups of digits should never be separated by a comma or point, as these are reserved for use as the decimal sign.” is a good idea for en.Wikipedia. Thus, according to your proposal, the main body text in the Moon article would state that the moon has a mean radius of 1 737 100 meters. Is that right? If so, I prefer the current wording on MOSNUM.

    Why? Because there are a number of ways that Europeans format numbers. If I recall correctly, there are even different ways of formatting numbers within a country, like Swedish-1 and Swedish-2. Or was it Swiss-1 and Swiss-2??? Well… one of those “S” countries with blonds walking the streets in bikinis, that’s all I remember. Accordingly, Europeans are quite accustomed to dealing with the American convention of delimiting numbers with commas (as well as four or five other systems used throughout Europe); nothing much confuses you Europeans. But Americans are not nearly so mentally ambidextrous. We could always take an attitude of “F*ck ‘em if Americans are so damned ignorant, they don’t deserve to learn… they’re still using pounds and feet!” Well, my inner child agrees with this attitude to a degree.

    But my handy, catch-all filter that I use in deciding what is best on Wikipedia is this: The goal in all technical writing is to communicate to the intended readership with minimal confusion. So the best thing to do, IMO, is keep the current guideline—and actually do a better job of adhering to it in our articles. The Moon article, to name just one, is a big mess, but I’m not about to go get into a holy war on numbers over there trying to get it consistent in spelling and formatting. Greg L (talk) 01:49, 11 October 2008 (UTC)

Army1987's proposal

Replace the first bullet of MOS:NUM#Large numbers with:

  • In large numbers (i.e., in numbers greater than or equal to 10,000), commas are generally used to break the sequence every three places left of the decimal point, e.g. "8,274,527". In scientific context, thin spaces can also be used (but using {{nowrap}} to prevent line-breaking within numbers), e.g. "8 274 527" ({{nowrap|8&thinsp;274&thinsp;527}}, or using the thin space character instead of its HTML entity). Consistency within an article is desirable as always.

After the last bullet of MOS:NUM#Decimal points, add:

  • For numbers with more than four digits after the decimal point, these can be separated in groups of three by thin spaces protected by {{nowrap}}, or with {{val}}, which obtains the same appearance using margins instead of thin space characters. (An exception is that a single digit at the end would not have a preceding space; {{val}} handles this automatically, e.g. "0.1234567".) Optionally, truncated representations of mathematically defined constants with fifteen or more digits after the point can use groups of five digits (e.g. "e = 2.71828 18284 59045…"); in this case, the number of digits used between the point and the ellipsis, being arbitrary, should be a multiple of five (or of three, if three-digit groups are used). Note that {{val}} can fail to correctly handle numbers with too many significant digits.

Army1987 (t — c) 14:38, 11 October 2008 (UTC)
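The two bullets proposed above can be read as a small algorithm. The sketch below is one interpretation only: the names `group_fraction` and `format_number` are hypothetical, input is assumed to be an unsigned digit string with "." as the decimal marker, and the five-digit option is treated as opt-in through a `size` parameter.

```python
THIN_SPACE = "\u2009"  # Unicode THIN SPACE


def group_fraction(fraction: str, size: int = 3, sep: str = THIN_SPACE) -> str:
    """Group the digits to the right of the point, per the proposal."""
    if len(fraction) <= 4:
        return fraction  # four or fewer digits are left alone
    if size == 5 and (len(fraction) < 15 or len(fraction) % 5):
        raise ValueError("five-digit groups need 15+ digits, a multiple of 5")
    groups = [fraction[i:i + size] for i in range(0, len(fraction), size)]
    # A single digit at the end is not given a preceding space.
    if size == 3 and len(groups[-1]) == 1:
        last = groups.pop()
        groups[-1] += last
    return sep.join(groups)


def format_number(number: str, size: int = 3, comma_left: bool = True,
                  sep: str = THIN_SPACE) -> str:
    """Commas (or thin spaces) left of the point, grouped digits right of it."""
    integer, _, fraction = number.partition(".")
    if len(integer) >= 5:  # i.e. 10,000 or more
        integer = f"{int(integer):,}"
        if not comma_left:
            integer = integer.replace(",", sep)
    fraction = group_fraction(fraction, size, sep)
    return integer + ("." + fraction if fraction else "")


# An ordinary space is passed here only so the grouping is easy to see.
print(format_number("0.1234567", sep=" "))
print(format_number("8274527"))
print(format_number("8274527", comma_left=False, sep=" "))
print(format_number("2.718281828459045", size=5, sep=" "))
```

With the default `comma_left=True`, a value with digits on both sides of the point comes out with commas on the left and spaces on the right, which is exactly the mixed form debated in the replies.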

  • Army1987: My 01:49, 11 October 2008, above, applies to this proposed revision of the current guideline too. Some of Wikipedia’s articles are already becoming a hodgepodge of number conventions within individual articles, sometimes within the same section of an article. Under this proposal, not only would there be different delimiting characters depending on where the reader lands, but there would be different group sizes of digits (three for this and five for that) depending on where the reader lands. The first thing, IMO, we need to do on Wikipedia is go get our current articles into proper compliance with the guidelines we already have. Greg L (talk) 16:18, 11 October 2008 (UTC)
    The current guideline says to use commas every three places left of the point, and says nothing about right of the point. And, given the way you cite BIPM et al., I don't believe 8,987,551,787.3681764 N m²/C² is the formatting you would advocate. -- Army1987 (t — c) 00:32, 12 October 2008 (UTC)
  • Since no publication would use the format 8,987,551,787.368 176 4 N·m²/C² it never would have occurred to whoever wrote the current guideline to prohibit a mixture of commas and thinspaces as digit grouping characters. Therefore, the absence of an explicit prohibition in MOSNUM does not constitute approval of such a practice. The only digit grouping method approved of by the current MOSNUM is commas (except in tables and quotations). --Gerry Ashton (talk) 01:17, 12 October 2008 (UTC)
  • So… does 8,987,551,787.368176 make you happy? Because that’s what MOSNUM calls for and that’s the standard way it’s done in the US in daily life. Also, it is well recognized by any European with an IQ that is at least as great as their heart rate. Greg L (talk) 04:36, 12 October 2008 (UTC)
  • Greg's last example certainly complies with the practice of most American publishers and the current MOSNUM. Granted, it is a bit hard to read the digits after the decimal point. A way to respond to the difficulty would be to accept the BIPM/NIST system only in science, technology, and maybe applied math articles that contain at least one number with five or more digits to the right of the decimal point. This would accord with the usual Wikipedia practice, and the practice of some other publishers, of allowing different rules of style in different articles, so long as articles are self-consistent. Greg's idea of using commas to the left and half-spaces to the right could be adopted as an English Wikipedia house style, but I would want to see that explicitly put in the MOSNUM and see it stick. If Greg were to try such a change, I would not be the one to revert it. --Gerry Ashton (talk) 04:49, 12 October 2008 (UTC)
  • The use of such large numbers should, in theory, be fairly limited. Articles on fundamental constants and planetary data and mathematics, and suchlike. It would be nice to get a standardised format, but it shouldn't really be necessary to spend a huge amount of time on this, surely? I know the MoS is prone to WP:LAME stuff, and settling things at this level can avoid such lame edit wars breaking out, but why not just close our eyes, pick a reasonable standard, and stick to it? Carcharoth (talk) 05:38, 12 October 2008 (UTC)
  • Sometimes it does make sense to use different standards in different places. (For example, in special relativity, there are different conventions about whether m denotes rest mass or relativistic mass; whether time-like four-vectors have positive or negative norm; whether space and time are measured in the same unit (i.e. c = 1) or not, and if they aren't, whether the timelike component of the position four-vector is ct and the metric is ±diag(1, −1, −1, −1), or they are t and ±diag(c², −1, −1, −1); whether time is the zeroth component (ct, x, y, z) or the fourth component (x, y, z, ct); whether time is imaginary and the metric is positive; and so on, and so forth. Some of these issues have settled, but others haven't.) This does not make sense, and if in this case no one standard is used, guess about other cases...

    As for me, I can see nothing wrong with Mathematical_constant#Table of selected mathematical constants using five-digit groupings and Planck units#Base Planck units using three-digit groupings. After all, this won't cause both sizes to be used in the same article, as physicists seldom use numbers with more than thirteen digits or so after the point, and, even when they do, my proposed wording doesn't forbid to use three-digit groups even for 200-digit numbers. -- Army1987 (t — c) 10:22, 12 October 2008 (UTC)

  • As for 8,987,551,787.368176, yes, it is the format used in the US in daily life, but do Americans really use numbers with that many significant figures, in "daily life"? Also, in cases such as e^(π√163) = 262,537,412,640,768,743.999999999999250072…, it becomes nearly illegible: how long does it take you to tell how many nines there are? -- Army1987 (t — c) 10:22, 12 October 2008 (UTC)


  • Carcharoth is dead-on correct in his 05:38, 12 October post, above—from start to finish. So too is Army1987, above. Big numbers with quantities of digits that require delimiting on both sides are relatively rare—there is only one such number in the entire Kilogram article (the number of wavelengths delineating the meter). High-precision values in technical articles are usually in scientific notation (only one digit on the left).

    After a reader has seen numbers like 1.660538782(83)×10⁻²⁷ kg a half-dozen times in the article, delimiting to the right confuses no one—particularly in scientific articles. For non-scientific articles, the main point of big numbers is to be illustrative, the implication being “look, this value is really, really BIG”, or “look how precisely this value has been measured by them scientific folk.” For such illustrative purposes, numbers don’t even need to be delimited on the right. But for scientific purposes, where a proposed new value is being compared to another or to see whether it falls within an uncertainty, we delimit to the right because doing so makes it easier to parse and this facilitates understanding and minimizes confusion. Greg L (talk) 18:00, 12 October 2008 (UTC)

  • And in science we delimit to the left with spaces: any person, whatever their nationality, "with an IQ that is at least as great as their heart rate" to use your pathetic little snipe, can understand it. It's rare that it has to be done, but that is what is done, sorry. At the moment, this guideline forces editors to use a format which is illogical, ridiculous and used by nobody outside of Wikipedia. Are you in some way proud of that? Physchim62 (talk) 20:00, 12 October 2008 (UTC)
  • Oh jeez, don’t be so quick to take personal offense to everything.

    To the larger issue: I’ve already explained clearly enough that Europe has a half-dozen ways of formatting numbers and European school children are taught to deal with them all. Further, Europeans are accustomed to the American system of delimiting numbers. Americans are not so mentally ambidextrous; they have been exposed to only one number-formatting system their whole life. That’s why it’s used on Wikipedia: because the American numbering notation results in the least confusion to the greatest possible number of readers. Please accept that reality.

    And to an even broader point. There’s a tendency to make our scientific articles too complex. Advanced and obscure styles, conventions, and customs used in the scientific papers are too often being incorporated into our Wikipedia articles. Our scientific articles need to clearly explain complex subjects in simpler terms, not start looking like they were written by Ph.D.s and are ready for publication in a journal. Sometimes editors are maybe a little too anxious to show off how they are *comfortable* with “scientific” conventions.

    Take the example of numeric equivalencies with superscripted negative exponents (reciprocated factors) such as 8.854187817×10⁻¹² F m⁻¹. A simple, two-component unit of measure like this is unnecessarily complex of a math concept for use here on Wikipedia when 8.854187817×10⁻¹² F/m means the exact same thing and is much more accessible to a much wider readership. If we’ve got a truly very advanced scientific article, or there are three or more factors to the unit of measure, then there is little choice, such as G = 6.67428(67)×10⁻¹¹ m³ kg⁻¹ s⁻². Some of us have got to stop thinking like we’re writing for other editors and think about who the customer is. Why would we write “The Avogadro constant L is 6.02214179(30)×10²³ mol⁻¹” in an introductory paragraph when we can write “The Avogadro constant L is 6.02214179(30)×10²³ entities per mole”? The latter is much more accessible to a general audience. We’re here to make the complex less complex, not simply ralph what was in the science paper all over onto our articles. The same principle applies with numbering conventions.

    And I frankly think that some of our European editors have become so convinced of the innate superiority of the BIPM-prescribed number notation, that they believe that “science” can not somehow be “correct” without observing the most modern of the European numbering conventions. I think some of these editors—not all—see “science” as some sort of inroad to getting the superior Euro style into Wikipedia. “Superiority” isn’t the test; it’s simply the “degree of familiarity with the largest audience.” These editors are just deluding themselves if they think these efforts will promote the adoption of the SI numbering style in the U.S. Just like our three-year failed experiment with “promoting” the use of the IEC prefixes (mebibyte instead of megabyte, etc.), all we did was confuse our readers for three long years. Some advocates thought Wikipedia was being progressive. Others flat out were trying to promote the adoption of the IEC prefixes (I read that admission, from one proponent to another, on a talk page). But after that three-year failed experiment, Wikipedia didn’t show the computer manufacturers *the light to a new and better future.* No manufacturer was following our *lead*, readers were baffled, and Wikipedia just looked foolish. This is what happens when Wikipedia is used as a vehicle to promote change in how the world works.

    The simple fact is that America has always delimited with commas to the left and Americans are not familiar at all with other conventions. Taking an attitude of “well, Americans will just damned learn the new numbering system when they land on Wikipedia” isn’t the correct attitude. Flies like a lead balloon with me. As I said before, our Moon article is now an utter mess, with a hodgepodge of number styles being used in its body text. Perhaps, one day, Americans will be more familiar with other numeric conventions. Until that day, Wikipedia must consistently adhere to the numeric convention that causes the least confusion to the greatest number of readers. Wikipedia shouldn’t be used as a vehicle to promote change in how things work. It would be quite unwise to start separating certain classes of articles (e.g. science-related ones) to begin using different rules for formatting numbers.

    There… I said what was on my mind. Please don’t howl over great injustices here over how I engaged in “personal attacks” and am “failing to assume good faith” and all the other red-herring diversions away from the main point: having style guides here on MOSNUM that promote writing as clearly as possible to the widest possible audience with minimal confusion. Greg L (talk) 22:13, 12 October 2008 (UTC)

No, I'm not going to howl over personal attacks because I didn't take it as one. I will point out though that by far the most widely used thousands separator in Europe is the point-on-the-line, exactly what Brits and Americans use for the decimal marker! Imagine if you saw the speed of light written as 299.792.458 m/s – looks strange, doesn't it? Which is why someone came up with a compromise, which has actually taken very well in technical fields and is widely used. An international format for writing physically significant numbers which can't be used on Wikipedia because of this guideline. If anyone tries, this sort of edit is the result. Why on earth should the speed of light be written in the same format as the population of the United States?
In your laudable quest not to confuse our readers, you are actually depriving them of information. Let's take your Avogadro example, because I remember that it's you who changed mol⁻¹ to entities per mole. The first paragraph of the article used to read:
"The Avogadro constant (symbols: L, NA), also called Avogadro's number, is the number of "elementary entities" (usually atoms or molecules) in one mole, that is (from the definition of the mole) the number of atoms in exactly 12 grams of carbon-12.[1][2] The 2006 CODATA recommended value is 6.02214179(30)×10²³ mol⁻¹.[3]"
That second sentence gives two important pieces of information. 1) the conventional unit for the Avogadro constant (and the very fact that it has one, and is not simply a number); and 2) the symbol for that unit. Your version "6.02214179(30)×10²³ entities per mole" doesn't give either, and is hardly any more comprehensible, given that it uses thinspace delimitation to the right of the decimal point and parenthetical expression of the standard uncertainty!
Anyone who studies physical science above the most basic level must come to terms with the language it uses for expressing concepts, just like any child has to learn that "+" represents an addition operation. By your misguided attempt to impose a false uniformity between articles and concepts which have nothing in common, you are condemning our readers to irritation if they already know the correct form or, worse, continued ignorance if they don't. Physchim62 (talk) 00:46, 13 October 2008 (UTC)
  • Yes, that was me who changed Avogadro constant so it’s easier for non-scientists to understand the essential nature of the measure. I don’t understand your concern. The preceding sentence (what also happens to be the first sentence of a two-sentence paragraph) states what the symbol for the Avogadro constant is: “(symbols: L, NA)”. Then the very next sentence gives the magnitude and the measure. If you think the symbols (L, NA) should be in that second sentence as well, be my guest. It seemed rather redundant to me. If scientists were responsible for school systems, they’d send policy papers to parents saying the objective was for district-wide class sizes Nclass ≤ 24 classroom⁻¹. Greg L (talk) 02:02, 13 October 2008 (UTC)
  • I meant the symbol for the unit, not the symbol for the constant. And I think you'll find there are far more scientists employed in school-teaching than in industrial research (I've done both). Physchim62 (talk) 11:59, 13 October 2008 (UTC)
  • I’m sorry. When I mentioned “scientists,” above, that wasn’t directed to you Physchim62. Wikipedia needs more true experts (many more) who know what they are talking about. There are far too many editors on scientific subjects contributing to Wikipedia who simply parrot what they read out of Popular Mechanics. I was referring to the guys with their heads in the clouds who write extremely dry, abstract scientific papers for publishing. Then, when it comes time for them to write supplemental curricula material for their students, they often can’t ‘walk and chew gum at the same time’ when it comes to clarifying the abstract and unclear.

    I’ve worked with Ph.D.’s in an R&D environment. I kid you not. There was a Ph.D. chemist in his 40s who had a phone book in front of him on a table.

• The Ph.D.: “Greg, if you wanted to find a company to do laser cutting, how would you find a company like that?”
• I told him “I’m pretty certain they have a section under ‘Laser,’ but if not, I know for a fact that under ‘Machine shops,’ there are advertisements saying they do laser cutting.”
• “OK. But… where in the phone book?”
• “Uhm… under ‘Laser’ and ‘Machine shops.’ ”
• (*pause*) “Well… but where in the phone book would I look?”
• (*I blink once or twice*) “Uhm… in the *Yellow Pages.*”
• (*pregnant pause and awkward uncomfortable look from the Ph.D. as he shifted his balance a bit from side to side on his feet*) “Well… will you show me?”
I shit you not. This is completely true. Ph.D. scientists out of an academic setting are a diverse lot. Some can start their own companies, are extraordinarily knowledgeable about a range of subjects beyond the scope of their Ph.D., and have amazing people and leadership skills that make you want to follow them anywhere. Others, however, have led a sheltered life in the academic world where they never had to order a bottle of reagent on their own; everything is done for them. This sort—I’ve found—tends to have a difficult time with “theory of mind”: understanding what others don’t understand and being able to come down to their level. Greg L (talk) 17:19, 13 October 2008 (UTC)