Wikipedia talk:Full-date unlinking bot/Archive 1

Latest comment: 14 years ago by Apoc2400 in topic Ready?

Notice

The proposal is edited continuously, so comments below may refer to old versions.

Purpose

I hope this proposal can largely end this far too long-running dispute. There will still be disagreement over exactly when date links are relevant, but this is better handled on a case-by-case basis for each article. Getting rid of the date links for autoformatting would allow editors to focus on links that actually require discussion. I have not been much involved in the conflict, so I hope this discussion can focus on the proposal, not people.

This is still just a draft proposal, so please wait with direct support or opposition. There will be an open and widely advertised RfC soon. Comments, potential problems and suggested improvements are welcome. --Apoc2400 (talk) 19:35, 29 May 2009 (UTC)

All seems eminently sensible to me. How do you define "intrinsically chronological articles"? Will these be enumerated explicitly as part of the exclusion list, or will there be some additional general criteria (e.g. any article with "Day" as part of the title)? --Kotniski (talk) 19:41, 29 May 2009 (UTC)
I expect most of them can be found in various subcategories of Category:Chronology. The easiest way would be to automatically create a list of the most common types of chronological articles, then manually add any missing articles to the exclusion list. --Apoc2400 (talk) 20:05, 29 May 2009 (UTC)
"A list of the most common types" - does this mean a list of regular expressions to match article titles?--Kotniski (talk) 07:21, 30 May 2009 (UTC)
I meant a list of articles, generated by regular expressions and categories. The details really aren't decided though. Any suggestions here are useful. --Apoc2400 (talk) 12:12, 30 May 2009 (UTC)
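As a rough illustration of a list "generated by regular expressions and categories", the sketch below combines title patterns with category members and manual additions. The patterns, the function name build_exclusion_list and the sample titles are assumptions made up for this example; the proposal has not settled any of these details.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Hypothetical title patterns for "intrinsically chronological" articles.
CHRONOLOGICAL_TITLE_PATTERNS = [
    re.compile(r"^\d{1,4}( BC)?$"),                       # years: "1987", "44 BC"
    re.compile(rf"^({MONTHS}) \d{{1,2}}$"),               # days: "August 1"
    re.compile(r"^\d{1,4}s( BC)?$"),                      # decades: "1990s"
    re.compile(r"^\d{1,2}(st|nd|rd|th) century( BC)?$"),  # centuries
]


def build_exclusion_list(titles, category_members=(), manual_additions=()):
    """Return the sorted titles the bot should skip.

    titles: all candidate article titles, filtered by the patterns above.
    category_members: titles taken from chronology categories (assumed input).
    manual_additions: titles added by editors during the listing phase.
    """
    excluded = {
        title for title in titles
        if any(p.match(title) for p in CHRONOLOGICAL_TITLE_PATTERNS)
    }
    excluded.update(category_members)
    excluded.update(manual_additions)
    return sorted(excluded)
```

Anything the patterns miss would still have to be added by hand, as suggested above.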

Template:Bots

The proposal should state whether the bot will fully parse and respect the {{Bots}} template. --Jc3s5h (talk) 19:42, 29 May 2009 (UTC)

It does say that "The bot will follow normal bot exclusion rules". I will make this more clear. --Apoc2400 (talk) 19:57, 29 May 2009 (UTC)
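For reference, a minimal sketch of what respecting the exclusion templates could look like. It covers only a simplified subset of the {{Bots}} conventions ({{nobots}}, and {{bots}} with an allow= or deny= list); other parameters are ignored. The function name allow_bot is an assumption for this example, not the bot's actual code.

```python
import re

NOBOTS = re.compile(r"\{\{\s*nobots\s*\}\}", re.IGNORECASE)
BOTS = re.compile(
    r"\{\{\s*bots\s*\|\s*(allow|deny)\s*=\s*([^}|]*)", re.IGNORECASE
)


def allow_bot(wikitext, bot_name):
    """Return False if the page opts out of edits by bot_name (simplified)."""
    if NOBOTS.search(wikitext):
        return False
    match = BOTS.search(wikitext)
    if not match:
        return True  # no exclusion template, or a bare {{bots}}
    kind = match.group(1).lower()
    names = {name.strip().lower() for name in match.group(2).split(",")}
    listed = bot_name.lower() in names or "all" in names
    # allow=... permits only listed bots; deny=... blocks only listed bots.
    return listed if kind == "allow" else not listed
```

With this reading, a page carrying {{bots|deny=Full-date unlinking bot}} would be skipped, while {{bots|deny=SomeOtherBot}} would not.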

Comments

  • Will there be a defined time frame before the bot begins operating in which users are invited to add articles to the list that this bot won't touch?
  • Yes, one month.
  • Will the bot fix problems left over from no autoformatting (such as January 15 2008 - needs a comma)? Will it even catch these types of dates?
  • Any completely uncontroversial problem fixes - sure. I added an item about this. I think there is already code for this written that could be copied.
  • Probably want to ask specifically about birth and death dates - some people believe that these full dates should always be linked; I suspect most disagree, but we need to be able to show consensus.
  • I think the cases where both year and month-day of a birth or death are relevant links are very few. Those can be added to the exclusion list. Ultimately, which links are relevant in a particular article should be decided on its talk page. There is no special provision for birth and death dates, so I think it is clear that most will be unlinked.
  • Will there be a second phase to delink some month-day combinations? Will we even try to determine what consensus is on when month-day combos should be linked? Karanacs (talk) 20:09, 29 May 2009 (UTC)
  • I have no such plan. It would be much harder to avoid false positives. Perhaps with a longer exclusion listing phase first. I think there are not as many month-day combinations that were linked solely for autoformatting. The main purpose for me is to remove all those autoformatting links, not enforce a strict limit of when date links are relevant. --Apoc2400 (talk) 20:32, 29 May 2009 (UTC)
  • "The bot user page and talk page will describe its task and function in respectful language." This is slightly redundant, as two points up, we have "The bot operator will be someone who has not edit-warred or been uncivil in relation to date links." I don't think we need to state the obvious. Dabomb87 (talk) 22:08, 29 May 2009 (UTC)
  • Agree. I was tempted to remove that bit with my recent tidy-up edit. Good-faith indicates the assumption that "respectful language" is used (and there are mechanisms in place to handle the situation when it isn't).  HWV258  22:15, 29 May 2009 (UTC)
  • I merged them to one point for now. I do not know if any date-delinking bot has had a disrespectful user page. I was thinking of the (long gone) Betacommandbot which used to have rather dismissive messages at the top of the talk page. --Apoc2400 (talk) 22:36, 29 May 2009 (UTC)
    • Please do not remove. Stating the obvious has great merits when good faith is in question. Septentrionalis PMAnderson 16:47, 3 June 2009 (UTC)
  • Birth and death dates: the community has recently endorsed by a landslide majority that "the years of birth and death of [for example] architect Philip C. Johnson should not be linked, because little, if any, of the contents of 1906 and 2005 are germane to either Johnson or to architecture." This text was inserted into MOSNUM soon after by the organising Clerk, Ryan Postlethwaite. Tony (talk) 08:32, 30 May 2009 (UTC)
    This is a misstatement on two points:
    • The sentence Tony quotes was part of a long phrasing crafted and revert warred for by GregL; Ryan's part was merely to give up and accept that text.
    • The landslide support was for the general principle here quoted, that links should be germane to the article as a whole. Septentrionalis PMAnderson 16:14, 3 June 2009 (UTC)
      • Are you saying that people voted for something that they didn't agree with?--Kotniski (talk) 16:32, 3 June 2009 (UTC)
        • Several editors, most clearly Gavia Immer, said they preferred silence to this option, but would accept it as a second-best compromise. That shows limited enthusiasm for the details. Septentrionalis PMAnderson 16:46, 3 June 2009 (UTC)
      • I am not sure what you mean. The text in the beginning of the introduction of this proposal is almost the same as statements that gained consensus in the RFC. --Apoc2400 (talk) 16:31, 3 June 2009 (UTC)
        • They were offered four choices. Endorsement of one of them as better than the other three, is not endorsement of every detail of that one; indeed, several protested the details of that option while voting for it. Septentrionalis PMAnderson 16:39, 3 June 2009 (UTC)

Date autoformatting

This proposal presupposes that the current system cannot be fixed, and that's an unfair characterization. As noted during the second RFC from 2008 the community supports "some form" of date autoformatting. Further, simply because a year and month/day pair are linked does not mean those links are not germane to the article. The burden should be on editors desiring this change to do due diligence rather than simply blindly removing pairs of links via a bot. —Locke Coletc 21:10, 29 May 2009 (UTC)

I understand you will probably disagree with this proposal, but are there any specific changes that would make it more acceptable to you? I think you are wrong about the community view on autoformatting, but it does say that "This proposal does not preclude any future method of date autoformatting that does not require links." The cases where both month-day and year are relevant links are very rare. --Apoc2400 (talk) 21:16, 29 May 2009 (UTC)
Agree that such "links are very rare" (could someone provide examples of germane month-day-year links?). Of course, there is nothing to stop localised debate re-establishing a link if it is germane to its place in an article. Thanks Apoc2400 for your work on this.  HWV258  21:49, 29 May 2009 (UTC)
Yes, I have a hard time imagining any such cases really. I suppose if Jesus was actually born on December 25, 1, those would both be relevant links in his birth date. Now, I don't mind that an article is added to the exception list if there is discussion on the talk page. If it turns out the date should not be linked after all, it can be removed manually. --Apoc2400 (talk) 22:29, 29 May 2009 (UTC)
I think it's a mistake to assume that the square brackets will always cause a link to be emitted. As the community seems to reject links this is something that can be changed in the software (so that [[August 1]] [[1975]] need not necessarily produce <a href="http://en.wikipedia.org/wiki/August 1">August 1</a> <a href="http://en.wikipedia.org/wiki/1975">1975</a> but rather just August 1, 1975). Basically: square brackets in syntax need not necessarily mean links in the final output. I think a better use of our time would be asking the MediaWiki devs to consider updating Dynamic Dates to simply not link dates (while providing an alternate syntax to allow linked dates where such links are intentional). —Locke Coletc 02:34, 30 May 2009 (UTC)
It seems pretty perverse to complicate the MediaWiki linking syntax permanently because of a one-off problem. Why make things difficult and confusing for all editors in the future when we can keep things simple just by doing a one-time (though rather large) bot run?--Kotniski (talk) 07:19, 30 May 2009 (UTC)
The linking syntax is already perverted (see categories and images/files which both use the linking syntax but produce wildly different results from a typical wikilink). This is at least more intuitive than the existing "perversions" (when you wikilink a date it still gets output), and is still utilitarian enough that the community seems to be mildly interested in retaining the functionality. —Locke Coletc 14:54, 30 May 2009 (UTC)
I don't see that the now irreversible bad decisions made in the distant past about category and image syntax are justification for making the same mistake again (this time even more confusingly, since it would involve mainspace links). The functionality which some people seem mildly interested in would have to work without links.--Kotniski (talk) 15:17, 30 May 2009 (UTC)
Because a) there may be an option to display dates and years linked or unlinked in Special:Preferences (in which case the linking syntax is 100% intuitive), and b) any other syntax for formatting dates is likely to be even more convoluted than simply wrapping the date fragments in square brackets (so there's a case for simplicity to keep it how it is). —Locke Coletc 15:21, 30 May 2009 (UTC)
The community has already decided it is not helpful to have dates in simple square brackets solely for the purpose of autoformatting. The statement that the "linking syntax is already perverted (see categories and images/files which both use the linking syntax but produce wildly different results from a typical wikilink)" omits the fact that we use [[Category:American architects]] and not [[American architects]] to denote the category, and [[File:Einstein1921 by F Schmutzer 4.jpg]] and not [[Einstein1921 by F Schmutzer 4.jpg]] for the image; even if the community were to agree on the use of square brackets in the form of [[date:4 July 1776]], there would need to be a wholesale change in the markup which is not relevant to the proposed removal of the current autoformatting (sic) links by bot. However, all this 'Son of DA' talk is mere hypothesis at this point, as we have yet to see any proposal of its specifications to that effect, let alone get community endorsement. Ohconfucius (talk) 02:08, 8 June 2009 (UTC)
You need to stop misrepresenting things. In the second RFC from November the community supported "some form" of autoformatting. It seemed the major reason people opposed the current system had less to do with the actual syntax (square brackets) and more with the fact that it simply didn't work as intended. If the current system were fixed and improved the community would likely support it. This bot proposal is premature when work like this could more easily resolve the problems identified with the current system. —Locke Coletc 03:47, 8 June 2009 (UTC)
It's for all to see: I was referring to the exact wording used. FYI, it was "Dates should not be linked purely for the purpose of autoformatting". As I recall, it was overwhelmingly endorsed. Ohconfucius (talk) 04:34, 8 June 2009 (UTC)
Again you misrepresent things: the community overwhelmingly endorsed abandoning the current (broken) auto formatting system. They did not endorse abandoning any auto formatting as evidenced by the second question at that RFC ("is some form of autoformatting desirable?") which turned up a majority in favor of some form of auto formatting. You cannot simply point to one set of results and ignore the others because they're inconvenient to your crusade. BTW, it is not a personal attack to note your distortion of the facts. —Locke Coletc 06:36, 8 June 2009 (UTC)
I said what I said above, so stop putting words into my mouth about what I did or didn't say. Ohconfucius (talk) 06:44, 8 June 2009 (UTC)

You seem to be repeating the same logical errors that were made when date autoformatting was designed. Firstly, any option in preferences will be used by only a tiny minority of readers. For the vast majority, dates will be linked if the editor decided to link them, and not otherwise. So it would most certainly not be intuitive for editors to have a method of linking them that was different from (and partially even the reverse of) the method used to link anything else. As for (b), it's not hard to find a syntax that would be equally simple as the square brackets; but even more simple (and surely what every sane editor would prefer) would be not to have any syntax at all! You want a date, you just write it. How hard is that?--Kotniski (talk) 15:39, 30 May 2009 (UTC)

I take a dim view of short sighted opinions such as this. Dates must be marked up so the software is absolutely certain it is dealing with a date. Methods of auto detecting dates have too high of a false-positive rate to ever be useful for Wikipedia. Further, it's possible for Whatlinkshere to be populated with dates which are marked up (but not necessarily resulting in links), returning a useful bit of data for reusers of our content. There's a laundry list of reasons to fix Dynamic Dates and only a handful of reasons to abandon it outright as has been suggested... —Locke Coletc 16:50, 3 June 2009 (UTC)
Why does the software need to be certain it's dealing with a date? Why should dates be marked up, as opposed to personal names, flowers, football teams, compound prepositions, etc. etc.? Imposing something like this on editors would just make all our lives that much more complicated. --Kotniski (talk) 17:04, 3 June 2009 (UTC)
It's not imposed at all. Gnoming editors can add these details as they edit articles. No deadline, remember? As for your other ideas, we can actually tell if a link points to a person or a flower (or football team) based on categorization of the target article. So amusingly, we already do do this. But without being marked up, the software has no idea if it's dealing with a month/day/year with any form of certainty. —Locke Coletc 03:47, 8 June 2009 (UTC)
Nor with any of the other categories if they happen not to be linked. But it doesn't need to, nor does it need to with dates. --Kotniski (talk) 19:20, 8 June 2009 (UTC)
Interesting, so those categories are useless in your opinion? I wonder why it is the community finds them valuable, often spending inordinate amounts of time carefully categorizing articles. Just because you personally do not find something useful does not mean others don't. —Locke Coletc 19:34, 8 June 2009 (UTC)
??? What has this to do with categories? If a phrase in article text isn't linked, it isn't mapped to a category or anything else. Same with names, flowers, dates, anything. And that isn't a problem.--Kotniski (talk) 06:38, 9 June 2009 (UTC)

Exclusions

Rather than a list of excluded articles (above and beyond chronological articles), I suggest that some sort of wrapper template should be placed around those few dates that should remain linked for some special reason - something like

{{keepdatelink|[[12 September]] [[1953]]|because="it's my birthday"}}

The wrapper template would have one essential characteristic: its syntax would require a reason for keeping the link to be supplied. This would both allow discussion of the reason, and document it for the benefit of future editors. The template would have no other intrinsic function - it would be just a marker to the bot and to editors. Colonies Chris (talk) 21:45, 29 May 2009 (UTC)

I am reluctant to add things related to this bot to various articles. It seems better to keep it in one place. Ideally, the editor who adds an article to the exclusion list should remove any irrelevant links from the article manually. Having a comment/reason and editor signature after each article in the list would be a good idea, but I am not sure about requiring it. --Apoc2400 (talk) 00:38, 30 May 2009 (UTC)
This template wouldn't just be for this bot. It would also serve, in the spirit of WP:IAR, to document for future editors exactly why someone considered a specific date link should be an exception to the general rule of not linking dates. If, as you suggest, an editor were to first remove irrelevant date links, leaving just those he feels is valuable, and add the article to the exclusion list, there would be no record of why those particular links were considered important (and it's more work for an editor too, removing links that the bot could fix). An additional benefit would be that the bot would not have to keep track of which articles it's visited. The bot's edit summary should include a link to a description of how to use the template to protect a date link; once date links that someone considers valuable had been protected that way, the bot would not need to worry about whether it reprocessed the article later. Colonies Chris (talk) 08:30, 30 May 2009 (UTC)
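To make the wrapper idea concrete, here is a minimal sketch of how a bot could leave protected spans alone while processing the rest of a page. {{keepdatelink}} is the hypothetical template suggested above and does not exist as far as this discussion shows; apply_outside_protected and the tiny unlink_dmy helper are likewise assumptions for illustration only.

```python
import re

# Matches the hypothetical wrapper, e.g.
# {{keepdatelink|[[12 September]] [[1953]]|because="..."}}
KEEP = re.compile(r"(\{\{\s*keepdatelink\s*\|.*?\}\})",
                  re.IGNORECASE | re.DOTALL)

# Deliberately tiny stand-in for the real unlinking logic (day-month-year only).
DMY = re.compile(r"\[\[(\d{1,2} [A-Z][a-z]+)\]\] \[\[(\d{4})\]\]")


def unlink_dmy(fragment):
    """Unlink [[27 May]] [[2007]] style dates in a text fragment."""
    return DMY.sub(r"\1 \2", fragment)


def apply_outside_protected(wikitext, transform):
    """Apply transform to everything except keepdatelink-wrapped spans."""
    parts = KEEP.split(wikitext)
    # With one capturing group, re.split puts the protected spans at odd indices.
    return "".join(
        part if index % 2 else transform(part)
        for index, part in enumerate(parts)
    )
```

Because the protection travels with the wikitext, reprocessing the same article later would give the same result, which is the point made above about the bot not needing to track visited articles.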

RFC length

Two weeks sounds about right to me. --Apoc2400 (talk) 18:33, 2 June 2009 (UTC)

That seems reasonable. I congratulate this proposal also for being a quite sensible view of what a bot can do, and what a bot should not do. I will probably oppose it mildly, as unnecessary; we linked all these dates by persuasion, and can unlink them by persuasion. Septentrionalis PMAnderson 16:36, 3 June 2009 (UTC)

Add a step

One change that would probably gain some support is the following, two-step, process: when the bot first visits any article, and finds full dates linked, it should leave a brief message on the talk page, saying what it proposes to do, with a link to a FAQ and another to the stop-list. This message should also say it will only be treating the article once, after which useful links can be restored.

In a week or so, if (as is most likely) nobody cares, it can revisit the article and do its edit. Septentrionalis PMAnderson 16:36, 3 June 2009 (UTC)

That would almost double the amount of server/watchlist traffic, though. I would have thought it sufficient to include relevant info and links in the edit summary when it makes the edit, including a "let me know if I've made a mistake" link like the anti-vandalism bots do. --Kotniski (talk) 16:55, 3 June 2009 (UTC)
Then you will get many more complaints; but as long as the summary links to FAQs and to the stoplist, that may be acceptable. Septentrionalis PMAnderson 19:36, 3 June 2009 (UTC)
I dislike bots that spam article talk pages. Article watchers are free to revert the bot if they believe the links are relevant. --Apoc2400 (talk) 16:59, 3 June 2009 (UTC)

Questions

What about moved articles? Will the bot come back and remove dates from an article once on its stop list if the article is moved?

After the bot has swept through WP, it would be just as well to turn it off; when newbies are no longer going to pick up the impression that linking is normal, do we need this bot? Septentrionalis PMAnderson 16:36, 3 June 2009 (UTC)

Hopefully, the bot would have a pre-made list of articles. If it doesn't, is it a big problem? I agree with you that this bot should be run once and then turned off. I want it for removing old links for autoformatting, not for enforcing strict MOS compliance. --Apoc2400 (talk) 17:04, 3 June 2009 (UTC)
As long as it is used once and turned off, the details of handling moved articles are incidental, although they should be in the FAQ. Septentrionalis PMAnderson 19:34, 3 June 2009 (UTC)

Fundamental mistake

This document is fundamentally in error. The community support of that RFC is for the principle that links should be germane; to claim community support for every word of the present MOSNUM text is a lie. Most of them were not even discussed; some of them were opposed and deprecated by those who weakly backed the proposal in general.

A simple change in phrasing, separating the text of MOSNUM from the claim of community support, would fix this; I hope that WP:BRD. Septentrionalis PMAnderson 16:56, 3 June 2009 (UTC)

I think this is a discussion for another place - if you think the results of the RfC were misinterpreted, you'd better take it to Ryan.--Kotniski (talk) 17:08, 3 June 2009 (UTC)
I think Ryan interpreted the results of RfC correctly: of the four wordings proposed, one of them was, as a package, the most acceptable, and it therefore stands in MOSNUM.
You and Tony would exaggerate that to a 75% vote in favor of every aspect of that wording; that is entirely your invention, and Ryan has nothing to do with it. Septentrionalis PMAnderson 19:30, 3 June 2009 (UTC)
Can we keep the personal comments to a minimum, please? Thanks. Dabomb87 (talk) 01:07, 7 June 2009 (UTC)

Background

Septentrionalis (PMAnderson): You disputed some of the background section. Could you explain more what you mean? I am open to changing it if something is wrong. The wording there is copied almost exactly from the RFC here. Still, I am thinking of rewriting the proposal introduction anyway. --Apoc2400 (talk) 17:10, 3 June 2009 (UTC)

I reworded the introduction completely. I was always uncomfortable having a "background" section because it is unclear whether it is a neutral description of the background or part of the proposal. Now I think it is more clear that it is the latter. --Apoc2400 (talk) 18:26, 3 June 2009 (UTC)
I don't see why we need to "reconfirm" that package anyway; the results, like the original RfC, will endorse the principle of germane and topical, without discussing the entire two paragraphs. Please keep this to the bot proposal, to which the phrasing of MOSNUM is irrelevant. The bot isn't going to, and can't, decide whether a link is germane and topical. Septentrionalis PMAnderson 19:33, 3 June 2009 (UTC)
What part is it you object to? Is it "Such links should share an important connection with that subject other than that the events occurred on the same date." or the Sydney Opera House example, or both? --Apoc2400 (talk) 19:47, 3 June 2009 (UTC)
I tend to agree with Septentrionalis as far as reconfirming goes. Let people see what action the bot will take, and approve or disapprove, without also having to wonder exactly what "germane and topical" means. Maybe editor X will decide that year links are good, month-day links are bad, and autoformatting is bad. Perhaps, on the whole, editor X thinks the good done by the bot outweighs the bad. So let him support the bot without having to support any other baggage. --Jc3s5h (talk) 19:50, 3 June 2009 (UTC)
I assume anyone who thinks all years or all month-days should be linked would oppose the bot. I added the reconfirmation because it is the basis for running the bot. Perhaps it is the wrong way to go. What I want to avoid is claims afterwards that the basis of the bot does not have consensus, even if the bot itself does. Would nobody try such claims? --Apoc2400 (talk) 20:06, 3 June 2009 (UTC)
Ok, I removed the "reconfirm" part. I agree that we don't want dispute over the exact wording of MOSNUM to spill over here. --Apoc2400 (talk) 20:15, 3 June 2009 (UTC)

Proposed rewording

This proposal does not preclude any future method of date autoformatting that does not require links.

to

The previous RfCs showed that the community rejected the general principle of autoformatting. This new RfC does not attempt to address that subject. It strictly focuses on how best to deal with the double-bracketed date links in articles.

I imagine the first sentence of the proposed change will be a bit controversial; I am open to rewording if so desired. Dabomb87 (talk) 23:57, 6 June 2009 (UTC)

My reading of the previous RfCs was that the community was rather split about the general principle of date autoformatting, so I think you would be inviting further dispute unnecessarily by making this change. The point on which I believe there is near-unanimous agreement is that the current method of date autoformatting is deprecated. The current method involves putting square brackets around dates to autoformat: this is what the bot addresses. Personally, I believe that DA is a solution looking for a problem; but I think that the original wording goes a long way towards alleviating the concerns of those who passionately believe in the value of DA. I'd recommend you stick with the original wording --RexxS (talk) 00:30, 7 June 2009 (UTC)
Alright, but I added a footnote explaining the situation with the current form of autoformatting. Dabomb87 (talk) 01:03, 7 June 2009 (UTC)

Question about YYYY-MM-DD format

"The bot will unlink only day-month-year (triple) combinations, such as 5 November 1989, November 5, 1989 and 1989-11-05."

As ISO format dates are generally deprecated in prose, is there any way for the bot to leave a comment (or make a list of articles) when it delinks an ISO date? Hopefully there won't be many of them, but it would be useful for wikignomes to be able to find such articles and make a decision on rewriting the date into dmy or mdy. Perhaps a category "Articles with ISO dates in prose" could be made for temporary use? --RexxS (talk) 00:41, 7 June 2009 (UTC)

Making a list of articles should be no problem. ISO dates are uncommon in prose, but rather common in tables and especially in references. It may be hard to determine what is prose and what is not. --Apoc2400 (talk) 15:07, 7 June 2009 (UTC)
No one should write "ISO dates" unless they have read the spec (the link will prompt you to download a PDF). Wikipedia articles frequently violate the spec. --Jc3s5h (talk) 15:18, 7 June 2009 (UTC)
I'm concerned that the wider the task, the more the potential pitfalls. Isn't this a separate issue? Some editors, regrettably, like ISO dates a lot. Tony (talk) 16:25, 7 June 2009 (UTC)
Tony is right that broadening the task is not desirable and it is indeed a separate issue. I never meant to suggest that the bot might attempt to fix these sort of problems, but I was hoping that it might be possible to "add value" by creating lists of possible problem articles that it recognises, to aid others at a later date. This is not an important feature, so there's no problem if it's not implemented. --RexxS (talk) 17:00, 7 June 2009 (UTC)
ISO dates are very common in citations and in tables. I convert many of these into mdy or dmy format routinely (except for sort tables) when I come across them, and have never received any objections to so doing. MOSNUM currently suggests using the same formats within the body of the article, and also encourages consistency within the reference section (without requirement for universal consistency). Since DA was deprecated, the autoformatting within citation templates has been switched off, and now we often have a multiplicity of date formats in reference sections. I think we could rely on the list of linked ISO dates from, for example, linked as at 6 March 2009. Once dates are delinked, this particular data will no longer be available. Ohconfucius (talk) 04:59, 8 June 2009 (UTC)
Ohconfucius, I agree that the vast majority of dates in citations obey ISO 8601 because they seldom refer to non-Gregorian dates or to dates before 1583. I disagree that "now we often have a multiplicity of date formats in reference sections"; the multiplicity has always been there and been visible to the vast majority of readers. I disagree that a date in a table can be presumed to be an ISO 8601 date, because there is a greater likelihood of it being a non-Gregorian or pre-1583 date. --Jc3s5h (talk) 18:22, 8 June 2009 (UTC)
Sure, perhaps we should call it yyyy-mm-dd dates instead to avoid any confusion. It is easier to understand too. --Apoc2400 (talk) 23:49, 9 June 2009 (UTC)
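A minimal sketch of the "make a list of articles" idea discussed above, assuming linked yyyy-mm-dd dates appear as a single link such as [[1989-11-05]]. The function name pages_with_linked_ymd and the title-to-wikitext mapping are assumptions for this example. It makes no attempt to tell prose from tables or references, which was noted above as the hard part.

```python
import re

# Assumed form of a linked yyyy-mm-dd date: [[1989-11-05]]
LINKED_YMD = re.compile(r"\[\[(\d{4}-\d{2}-\d{2})\]\]")


def pages_with_linked_ymd(pages):
    """Map each title to the linked yyyy-mm-dd dates found in its wikitext.

    pages: dict of article title -> wikitext (assumed to be supplied by the
    caller, e.g. from a database dump). Titles with no such dates are omitted.
    """
    report = {}
    for title, wikitext in pages.items():
        found = LINKED_YMD.findall(wikitext)
        if found:
            report[title] = found
    return report
```

Such a report would have to be produced before or during the bot run, since the links are the only marker and are gone once the dates are unlinked.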

Clarification request

"The bot operator will be someone who has not edit-warred or been uncivil in relation to date links." Can we say that anyone who has been restricted or admonished in any way at the case cannot be an operator? Dabomb87 (talk) 00:55, 7 June 2009 (UTC)

That seemed like a good idea to me, so I added it here. Please feel free to change the wording if you feel the necessity to do so. NW (Talk) (How am I doing?) 03:12, 7 June 2009 (UTC)
Thanks. I felt that points 8 and 9 were similar (they both concerned the bot operator), so I merged them and split off the bit about the bot page. I removed the ref tags from the footnote, as I feel that it is an important base to cover. Dabomb87 (talk) 03:23, 7 June 2009 (UTC)
Good, thanks. I made some further changes. --Apoc2400 (talk) 23:46, 9 June 2009 (UTC)

Slight extension

In order to avoid leaving anomalies behind itself, I suggest that the bot should, in certain very specific circumstances where the linking of the month-day fragment is clearly for the purpose of autoformatting, either also unlink those fragments or flag them up for manual attention. I'm thinking of cases such as:

  • [[17 October ]] to [[8 November]] [[1987]] (this could just be unlinked)
  • [[February 5]]/[[February 6]], [[2008]] (this might be best flagged up and handled manually)

and possibly also (though it would be more difficult to program)

These all seem to be clearly DA, but I suppose the bot needs to be extremely narrow, cautious and conservative to gain approval. I'm unsure how the third example is, in concept, different from the first (in terms of its classification as DA). Tony (talk) 08:19, 5 June 2009 (UTC)
It's not different in concept, I only gave it a separate item because of the greater difficulty in coding the bot to look for it. It's a pattern that's quite often used though, so likely to be worth the effort. Colonies Chris (talk) 09:39, 5 June 2009 (UTC)
Are these cases common? --Apoc2400 (talk) 11:13, 5 June 2009 (UTC)
It's hard to come up with any sort of estimate, but in my gnoming career I've encountered them often enough to feel that they're worth catering for. Colonies Chris (talk) 12:03, 5 June 2009 (UTC)
This is good to know. I think we should make a list of possible instances of autoformatted (i.e. linked) dates. Dabomb87 (talk) 12:53, 5 June 2009 (UTC)
Sure, can you suggest a wording for an extra point in the proposal? --Apoc2400 (talk) 13:53, 5 June 2009 (UTC)

(outdent) Sorry for the very delayed response; how about this?:

The bot will unlink only day-month-year (triple) combinations, such as 5 November 1989, November 5, 1989 and 1989-11-05. Examples are:

  • [[January 15]], [[2005]] --> January 15, 2005
  • [[27 May]] [[2007]] --> 27 May 2007

The bot will also unlink month-day items that appear in combination with such a triple i.e. in date ranges and slashed dates. The bot will unlink only those month-day items that are closely associated with a triple in that way. Examples are:

  • [[October 17]] – [[November 8]], [[1987]] --> October 17 – November 8, 1987
  • [[23 April|23]]/[[24 April]] [[1966]] --> 23/24 April 1966

Colonies Chris (talk) 10:59, 12 June 2009 (UTC)
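The four examples above can be sketched with one regular expression. This is only an illustration under stated assumptions: English month names, a plain [[year]] link, optional piped month-day links, and a single optional leading month-day joined by a dash, slash or "to". The names unlink_full_dates, TRIPLE and LINK are made up for this sketch. Range punctuation is left exactly as found, and isolated month-day or year links are not touched.

```python
import re

MONTH = (r"(?:January|February|March|April|May|June|July|August|"
         r"September|October|November|December)")
DAY = r"\d{1,2}"
# Linked month-day, optionally piped: [[January 15]], [[27 May]], [[23 April|23]]
MD_LINK = rf"\[\[(?:{MONTH} {DAY}|{DAY} {MONTH})(?:\|[^\]|]*)?\]\]"
YEAR_LINK = r"\[\[\d{1,4}\]\]"
# Range or slash separator: en dash, em dash (\u2014), hyphen, slash or "to"
SEP = r"\s*(?:–|\u2014|-|/|\bto\b)\s*"
# Optional leading month-day plus separator, then month-day, optional comma, year
TRIPLE = re.compile(rf"(?:{MD_LINK}{SEP})?{MD_LINK},?\s*{YEAR_LINK}")
# Any wikilink; group 1 is the displayed text (the part after a pipe, if any)
LINK = re.compile(r"\[\[(?:[^\]|]*\|)?([^\]|]*)\]\]")


def unlink_full_dates(wikitext):
    """Strip link brackets from full dates and from ranges ending in one."""
    return TRIPLE.sub(lambda m: LINK.sub(r"\1", m.group(0)), wikitext)
```

Because the month-day pattern must be followed by a linked year, a lone [[August 1]] or [[1975]] elsewhere in the text falls outside the match and stays linked.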

Thx, Chris; I substituted en dashes (but from my experience, you get all types of range punctuation, one reason for the importance of wikignoming dates). May I suggest a slight modification of the wording (but not of the substantive meaning)? "The bot will unlink month-day items that are clearly adjacent to and in combination with such a triple—i.e., in date ranges and slashed dates. Isolated month-day items will not be touched, as explained above." Unsure whether the last bit is necessary. Tony (talk) 11:59, 12 June 2009 (UTC)

Fine with me. It doesn't hurt to make absolutely clear that these are specific exceptions to the general rule that month-day fragments will be untouched. The bot should probably be programmed to accept m-dashes or n-dashes or 'to' as a range indicator. Colonies Chris (talk) 15:18, 12 June 2009 (UTC)
Yep, I hadn't thought of "to", but it's common; so is the hyphen. Hyphens, en dashes and em dashes, each spaced on both sides and unspaced, should probably be included (that's seven types, given that "to" is unlikely to be unspaced). If someone wants to be nit-picky, occasionally the punctuation is spaced on left or right only (can you believe it?), which looks very scrappy, but that makes another six instances, which may be unfair to the programmer(s) for the few instances that will be encountered. The punctuation and spacing can be wikignomed later for the cases in which they're wrong and, of course, remain wrong. Tony (talk) 15:29, 12 June 2009 (UTC)
It's not quite as bad as that, since a regular expression can match '{linked_date}{0 or more whitespace}{endash, emdash or hyphen}{0 or more whitespace}{linked_date}' and replace it with 'unlinked_date\space\endash\space\unlinked_date' (and the same sort of thing for "to" and "/"), which would clean up the spacing as it went along. --RexxS (talk) 19:39, 12 June 2009 (UTC)
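
A rough Python sketch of the kind of expression described above, for illustration only. The names MONTH, LINKED_MD, YEAR, RANGE and normalize_ranges are hypothetical, and the sketch covers only the dash variants (not "to" or "/"). It assumes the second date of the range is followed by a linked year, as in the proposal's range example.

```python
import re

# Hypothetical helper patterns; full English month names only.
MONTH = (r"(?:January|February|March|April|May|June|July|August|"
         r"September|October|November|December)")
MD = r"(?:" + MONTH + r" \d{1,2}|\d{1,2} " + MONTH + r")"
LINKED_MD = r"\[\[(" + MD + r")\]\]"      # [[October 17]] or [[17 October]]
YEAR = r"\[\[(\d{1,4})\]\]"               # [[1987]]

# {linked date}{optional space}{hyphen, en dash or em dash}{optional space}
# {linked date}{optional comma}{linked year}
RANGE = re.compile(
    LINKED_MD + r"\s*[-\u2013\u2014]\s*" + LINKED_MD + r"(,?) ?" + YEAR
)

def _unlink_range(match):
    first, second, comma, year = match.groups()
    # Whatever dash and spacing was found, emit a spaced en dash.
    return f"{first} \u2013 {second}{comma} {year}"

def normalize_ranges(text):
    """Unlink linked date ranges and tidy the range punctuation."""
    return RANGE.sub(_unlink_range, text)

print(normalize_ranges("[[October 17]]-[[November 8]], [[1987]]"))
```

Since both sides must be month-day links, a pair of ordinary links joined by a hyphen does not match and passes through unchanged.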
Yes, this kind of thing can be handled very well with regular expressions. Would it be good to replace all dash and hyphen variants with space endash space? --Apoc2400 (talk) 14:51, 14 June 2009 (UTC)
I'm a little cautious about this; yes, hyphens are no good, but whether the en dash is spaced or unspaced depends on whether there are no internal spaces in both items (unspaced en dash) or at least one internal space (spaced en dash). 3–7 January, but 3 January – 3 February (In the first case, 3 and 7 are the ranged items, since January applies to each, by ellipsis for the 3; in the second case, both date and month items have an internal space). It sounds like an unfortunate complication in a process that needs to be as simple and uncontroversial as possible. The alternative is to leave range punctuation/spacing untouched, to be dealt with by gnoming later (and now). I think I favour the simple and narrow at this stage, fixing only the very gawky within items (January 3,1980). What do others think? Tony (talk) 16:19, 14 June 2009 (UTC)
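
To make the spacing rule above concrete, here is a minimal Python sketch. The function name join_range is hypothetical, and this only illustrates the rule as stated in the comment above; it says nothing about how the bot would locate the two items.

```python
EN_DASH = "\u2013"

def join_range(start, end):
    """Join two range items with an en dash.

    The dash is spaced if either item contains an internal space,
    and unspaced if neither does.
    """
    if " " in start or " " in end:
        return f"{start} {EN_DASH} {end}"
    return f"{start}{EN_DASH}{end}"

# The ranged items are "3" and "7"; "January" applies to both.
print(join_range("3", "7") + " January")
# Both items have an internal space.
print(join_range("3 January", "3 February"))
```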

About hyphen-like characters: I could imagine someone using astronomical year numbering for years before AD 1, such as 28 February -45. It would require some thoughtful coding to make sure such a case does not get caught up in the hyphen-like character changes. --Jc3s5h (talk) 18:42, 14 June 2009 (UTC)

Yes, let's remember to check that case. --Apoc2400 (talk) 18:25, 16 June 2009 (UTC)
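
One possible guard for that case, sketched in Python for illustration only. The names RANGE_HYPHEN and replace_range_hyphens are hypothetical, and the link form [[-45]] is an assumption. The sketch treats a hyphen as range punctuation only when it sits between the closing brackets of one link and the opening brackets of the next, so a minus sign attached to a year is never touched. It is deliberately simplified: it does not check that the links on either side are dates, which a real implementation would need to do.

```python
import re

# A hyphen counts as range punctuation only between "]]" and "[[".
RANGE_HYPHEN = re.compile(r"(\]\])\s*-\s*(\[\[)")

def replace_range_hyphens(text):
    """Replace link-to-link hyphens with a spaced en dash."""
    return RANGE_HYPHEN.sub(
        lambda m: f"{m.group(1)} \u2013 {m.group(2)}", text
    )

# The sign of an astronomical year is inside the link (or in plain text),
# so neither of these is changed.
print(replace_range_hyphens("[[28 February]] [[-45]]"))
print(replace_range_hyphens("28 February -45"))
```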

Endash is the correct punctuation for date ranges, so hyphen and emdash should be replaced by a spaced endash if one or both of the terms contain spaces per MOS:ENDASH. But to answer Tony, I think we were discussing where the bot finds something like '{linked date} hyphen {linked date}' which excludes the 3–7 January case. As far as I can see, all linked date ranges (except for something like [[3 January 2008]]–[[2009]], which is pretty improbable, imho) that the bot finds would require spaced endashes to replace endash, hyphen, and emdash, and whatever whitespace - or lack of it - has been put around them. --RexxS (talk) 02:46, 15 June 2009 (UTC)

To me, it's much preferable to get rid of the squidgy little hyphens and the em dashes now, in one go, even if spacing will need to be checked later by gnomes. That would do the project a favour, and has been required by the MoS for ages. If a sure-fire method for programming the correction of the spacing were at hand, we'd be in clover, but it sounds hard to me. I'd say only if there are no complications in the programming, or from BAG. Jc3 seems to be suggesting that there might be problems. Tony (talk) 05:38, 15 June 2009 (UTC)

I changed the proposal per this discussion. Please take a look. --Apoc2400 (talk) 18:25, 16 June 2009 (UTC)

Ready?

The arbitration case is closed and I see no reason to wait. Have all concerns raised above been answered? Is there anything else that needs to be changed? I just made some edits, so please take a final look at the proposal. --Apoc2400 (talk) 18:47, 16 June 2009 (UTC)

Would it be appropriate to add a comment that if any part of the proposal proves difficult to code in a way that achieves a low false-positive rate, that part will not be implemented? --Jc3s5h (talk) 18:52, 16 June 2009 (UTC)
I don't think it's necessary to state. If the bot will do less in an uncontroversial way, that is not a problem. We don't want to ask for blanket permission to change anything. --Apoc2400 (talk) 19:05, 16 June 2009 (UTC)
I am always confused by dates like 1989-11-05, as I still cannot figure out whether the real date is May 11 or November 5. I wonder if other readers/editors have the same problem. Galoubet (talk) 10:37, 22 June 2009 (UTC)
Very likely, that is why we recommend the formats November 5, 1989 and 5 November 1989, which are both unambiguous. --Apoc2400 (talk) 11:05, 22 June 2009 (UTC)
I see one reason to wait: At least some parts of the community are sick of this whole date mess and could use a break. But it seems to be too late now, and given the format of the discussion below any useful comments are likely to be lost in a mass of "Blue linked dates! DESTROY!!!!111oneone" supports. Anomie 11:45, 22 June 2009 (UTC)
Wouldn't it be better to get a consensus and move on, rather than let it fester? Is it separating the comments into Support/Oppose/Neutral that you dislike? --Apoc2400 (talk) 11:51, 22 June 2009 (UTC)