Wikipedia talk:Full-date unlinking bot/Archive 2

Solitary years and day-months

Links of solitary years (1989) and solitary month-days (November 5, 5 November), ... will not be unlinked by the bot.

Why? Solitary years and solitary day-months are just as much deprecated by consensus. Moreover, leaving these in could lead to inconsistency. JIMp talk·cont 14:32, 22 June 2009 (UTC)

I believe this proposal is intended to be conservative; we can always add tasks later if the community wishes to. Of all the issues, it is most clear that using links to autoformat dates is against community consensus, so that is what is being addressed first. Dabomb87 (talk) 14:41, 22 June 2009 (UTC)
The MOSNUM wording does not have a clear consensus; that dates links should be germane probably does, but each RfC has had inappropriate wording which seems to have confused !voters. — Arthur Rubin (talk) 14:57, 22 June 2009 (UTC)
Arthur already made it clear that he did not like the wording used in the WP:DATEPOLL. Nevertheless I will remind him that in Day-month linking, 256 contributors !voted in favour of that wording, against a total of 68 who preferred the other options; while in Year linking, 208 contributors !voted in favour of the wording, against a total of 84 who preferred the other options. The Arbitration clerk, Ryan Postlethwaite commented, "I think this should be fairly uncontroversial - the results aren't ambiguous at all." Any reasonable editor would agree that there is consensus for the wording copied into MOSNUM. I find it most regrettable that Arthur is still "banging the drum" in favour of his isolated view, against all the evidence to the contrary. --RexxS (talk) 18:00, 22 June 2009 (UTC)
It's nonsense to count votes for option #1 as being in favour of that wording, as quite a few claim to have !voted only on the basis of the option title, which (in my opinion, and that of many other editors) fits option #2 or #4 better. Still, I'm willing to abide by consensus that almost all date links should be removed in non-calendar articles. — Arthur Rubin (talk) 21:58, 22 June 2009 (UTC)
And it's unbelievably rude to insult over 200 other editors by pretending that you know their intentions better than they did. The datepoll specifically requested comments: not one contributor quibbled with the wording. Silence is one of the strongest forms of consent here. Wikipedia does not sanction contributors for commenting on wording; and the fact that over 200 did not do so speaks volumes. You're entitled to your opinion, but please don't try to force it down the throats of the majority. --RexxS (talk) 03:36, 23 June 2009 (UTC)

Judgement

[dates] should not be linked unless their target article content is germane and topical to the subject of the article - how will the bot determine this? Is the list 1-13 intended to explain how the bot decision making process is conceived? --Joopercoopers (talk) 15:31, 22 June 2009 (UTC)

That's why the editors will be given a month to add articles to an exclusion list (point 6). Dabomb87 (talk) 15:34, 22 June 2009 (UTC)
Gotcha - thanks - how will the month be advertised? --Joopercoopers (talk) 15:38, 22 June 2009 (UTC)
Here, on WT:MOSNUM, the technical village pump and bot owners' noticeboard at least. It will start soon after the RFC ends. --Apoc2400 (talk) 15:50, 22 June 2009 (UTC)
Needs more than that for such a wide ranging proposals IMHO, see above for my conditional support. --Joopercoopers (talk) 15:51, 22 June 2009 (UTC)
Yes, probably, after seeing the Lightbot backlash. At the very least, a watchlist notice. Dabomb87 (talk) 15:51, 22 June 2009 (UTC)
This RFC is advertised far more widely, and I asked for a watchlist notice. Do you think the exclusion list itself should also be advertised in many places? --Apoc2400 (talk) 15:58, 22 June 2009 (UTC)
No I think here's fine for the detail and discussion, but you've got to give editors (all 155,000 of the actives) some warning - so few participate in RFC's and the like. --Joopercoopers (talk) 16:12, 22 June 2009 (UTC)
The bullet points at the beginning are the background. The list 1-13 explains how the bot would work. The idea is to unlink those dates that are most certain to not be "germane and topical". That is when both the month-day and year are linked, which is very unlikely to be correct. Chronological articles like 1789 and January are also excluded because links to dates there are more likely to be relevant. --Apoc2400 (talk) 15:43, 22 June 2009 (UTC)

staged implementation

This is probably a matter for BAG to work out when it lands on their plate, but I would like to see this bot do a significant run before it has the bot flag, and a few pauses along the way. This will allow time to ensure that the community has an opportunity to analyse the reported problems thoroughly before the next run begins, so we can be sure that reported problems aren't being squashed. John Vandenberg (chat) 20:02, 22 June 2009 (UTC)

As a BAG member, this sounds like a very sensible idea to me. – Quadell (talk) 20:29, 22 June 2009 (UTC)
I think BAG almost always requires bots to do a test run first, if nothing else then to verify that it actually works like it is supposed to. A few stops during the main run would be fine too. --Apoc2400 (talk) 21:08, 22 June 2009 (UTC)
I think John is probably right. One of the big questions surrounding this type of bot is the error rate. Probably staged test runs to look at the error rate in 30 edits, error rate in 300 edits and in 3000 edits would be reasonable, rather than just the rate in 30 edits as is normally done. A low percentage error rate may lead to an unacceptable absolute error rate when a very large number of edits are involved. Since this bot will also be seen by a large number of editors, it would also be best to put significant thought into the edit summary and the bot's user/talk pages. AKAF (talk) 07:05, 23 June 2009 (UTC)
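A rough sketch of the arithmetic behind the point above, assuming a purely hypothetical 1% error rate and independent errors (neither figure comes from this discussion; the function names are made up for illustration). It shows why a 30-edit trial can easily look clean while a large run still produces many bad edits.

```python
def expected_errors(edits: int, error_rate: float) -> float:
    """Expected number of bad edits in a run of the given size."""
    return edits * error_rate


def chance_of_seeing_an_error(edits: int, error_rate: float) -> float:
    """Probability that a trial of this size contains at least one bad edit,
    assuming errors occur independently of each other."""
    return 1.0 - (1.0 - error_rate) ** edits


# Hypothetical figure chosen only for illustration, not a measurement.
ASSUMED_ERROR_RATE = 0.01

for trial_size in (30, 300, 3000):
    print(
        trial_size,
        expected_errors(trial_size, ASSUMED_ERROR_RATE),
        round(chance_of_seeing_an_error(trial_size, ASSUMED_ERROR_RATE), 3),
    )
```

Under these assumptions, the smallest trial has a good chance of containing no errors at all, which is the case for looking at the larger staged trials as well.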
I agree with everything above. For the record, there are no set limits on trial sizes, so you're welcome to suggest as large a number as you think manageable. - Jarry1250 (t, c, rfa) 07:31, 23 June 2009 (UTC)
See, for instance, the ongoing extended trial at Wikipedia:Bots/Requests for approval/Erik9bot 9. – Quadell (talk) 13:07, 23 June 2009 (UTC)

I can do it

Having read this page, and seeing that no one has declared their intent, I would not mind writing the bot to get the task done. I have a history of bot writing (see User:RFC bot), a history of publishing my code, and for the most part, a history of civility. If I were the programmer of this unlinking bot, I would adhere strictly to the conditions for which there is consensus, I would subject the code to BAG approval, and I would be receptive of input and criticisms just as I have been for RFC bot for the past two years. —harej (talk) 16:22, 23 June 2009 (UTC)

Thank you very much. That would be highly useful. --Apoc2400 (talk) 17:01, 23 June 2009 (UTC)
Thanks harej. For the record, I think others may want to lend a hand, and I wouldn't rule out some sort of wider arrangement being reached here, regarding the handling of complaints. But it's nice to have some volunteering nevertheless. - Jarry1250 (t, c, rfa) 13:30, 24 June 2009 (UTC)
Thanks for stepping up to the plate harej.
This is a rather complex task, requiring a pretty sophisticated parsing engine, and it affects most content pages.
One of my motivations behind desiring that the code be open source is so that the task can be distributed. Causes of "less than ideal" edits can be quickly determined, suggestions made, and diffs offered. John Vandenberg (chat) 05:10, 26 June 2009 (UTC)

Is this too soon?

Rather than clog the comments above, I'll note here that some concerns are being raised that this proposal (to use a bot to delink autoformatted date-triples) has been made too soon. I can see that it is a legitimate concern, given the problems in the past, but it prompted me to wonder: when is it not too soon? - in a month, two months, three, a year? What will have changed by then? It took me a while to realise, but the only thing likely to change is editors' sensibilities. So the counter-argument must be: Look at the edits, not the editor. Read through exactly what is being proposed here, and consider the effect of those edits. Then ask yourself: In your opinion, would those edits undoubtedly improve the encyclopedia? I submit that if your answer is "no", then oppose this proposal; if "yes", then support it. In the end, that must be the acid test. --RexxS (talk) 00:44, 24 June 2009 (UTC)

It's never too soon. It's overdue. JIMp talk·cont
Essentially the concern is about how people react to an edit, rather than the edit itself. And how people take on board what has happened at the ArbCom, and the decisions there. Some people have been very passionate about this on both sides of the debate, and sometimes it does well to allow emotions to settle, especially for those who were in favour of date linking. This bot proposal, coming so quickly after the ArbCom decision, does seem rather hasty. And the bot itself is designed to speed up the process. I don't see the need for such haste. And I see the use of a bot, which will inevitably remove links that editors find useful, to be heading for drama. Let people unlink dates with a gadget or a script or manually if they like, article by article as they come upon it. There are some edits that are best done by humans, and unlinking dates is perhaps one of those. SilkTork *YES! 18:42, 24 June 2009 (UTC)
It is my impression that manual and semi-automated delinking leads to far more conflict and edit warring. A bot does not get angry and overstep its purpose. --Apoc2400 (talk) 23:44, 24 June 2009 (UTC)

Apoc2400's comment about bots not becoming angry is a good reason for the task to be done by a bot rather than humans. However in addition to the "too soon after RFAR" argument, the style manual is still protected. I am concerned that the community is approving a bot to perform changes before the relevant section of the style manual is considered stable. However this proposal is focusing on autoformatting, rather than date delinking, and the deprecation of autoformatting has been accepted as consensus since August last year. John Vandenberg (chat) 05:40, 26 June 2009 (UTC)

Baby and bathwater

This is an approach that throws the proverbial baby out with the bathwater. Previous discussions have already established some date linking can be useful (which the person suggesting this bot already admits). Rather than fully delinking everything, we should put some thought in a policy on what to link and start a project to systematically go through articles with the new rules in mind. - Mgm|(talk) 07:55, 24 June 2009 (UTC)

...this proposal isn't to fully delink everything? - Jarry1250 (t, c, rfa) 07:58, 24 June 2009 (UTC)
It's a very small baby in an ocean of bathwater. Send the bot through and reconnect the rare instances later. JIMp talk·cont 13:57, 24 June 2009 (UTC)
(to Mgm) The exclusion list will be used for this reason. Dabomb87 (talk) 14:29, 24 June 2009 (UTC)
(to Mgm) Currently, there is an ocean of dates that are highlighted for no good reason, because they are linked for no good reason. These obscure the very few dates that should be linked (in the opinions of some editors), so very little would be harmed by delinking them all; the way things stand now, how do readers know whether there is a good reason why a date is in blue? And nothing prevents a reader from typing "June 11" or "1657" in the search box, even if the dates are not linked. Even after years of following discussions of this topic, I can't think of a single article with a single date that really needs to drag readers off to a date or year article in order to fully educate them. If I ever came to imagine that such a date link was very important for a reader, I would list it as an item in the "See also" section. My opinion probably fails to allay your concerns, but Dabomb87's answer about the exclusion list should; editors who care about these few links will have a month to act to save them. Chris the speller (talk) 17:53, 24 June 2009 (UTC)

Consistency

IIRC, one of the main objections to date autoformatting is that the default preference is to not autoformat, so anons and users who haven't changed their preferences would see a random jumble of date formats. Won't this just make that worse? Now everyone will see the inconsistency and it will be harder to find them. Any bot to delink dates should really try to make all the dates consistent in each article by determining the most common format in the article and converting them all to the same format when delinking. Mr.Z-man 13:50, 24 June 2009 (UTC)

More than 99% of readers don't have DA preferences set and so see a random jumble of mixed date formats now. How does delinking make them any harder to find? In fact, there will be many articles on just one or two editors' watchlists, and if they have DA preferences set, they will suddenly realise that the articles they have been watching over actually have that jumble - a fact that was hidden from them by DA. That will make correcting those errors easier, not harder.
The job of working out the correct date format for any given article is unfortunately not a simple one, nor is it easy to express as an algorithm that a bot would need for implementation - it's hard enough for human editors. Frankly, attempting to broaden the scope of this proposal is a recipe for more drama. Let's keep it simple. --RexxS (talk) 16:31, 24 June 2009 (UTC)
It's easier to find for a bot or a script if they're linked. As for implementation, it shouldn't be that hard. Gather all the dates into an array, test them against the same or similar regexes that MediaWiki uses, to determine the dominant format, then use a function like strftime to put them all into the same format. Mr.Z-man 18:13, 24 June 2009 (UTC)
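The approach described in the comment above could be sketched roughly as follows. This is only an illustration: the two regexes are simplified stand-ins (not the patterns MediaWiki actually uses), only two formats are recognised, English month names and an English locale are assumed, and the names `find_dates`, `dominant_style` and `harmonize` are hypothetical. It implements "majority format wins" and nothing else, so it takes no account of national ties or the first major contributor.

```python
import re
from collections import Counter
from datetime import datetime

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
# Simplified, assumed patterns for "June 1, 1973" and "1 June 1973".
MDY = re.compile(rf"\b({MONTHS}) (\d{{1,2}}), (\d{{4}})\b")
DMY = re.compile(rf"\b(\d{{1,2}}) ({MONTHS}) (\d{{4}})\b")


def find_dates(text):
    """Return a list of (style, datetime, matched_text) for each date found."""
    found = []
    for style, pattern, order in (("mdy", MDY, (1, 2, 3)), ("dmy", DMY, (2, 1, 3))):
        for m in pattern.finditer(text):
            month, day, year = (m.group(i) for i in order)
            try:
                parsed = datetime.strptime(f"{month} {day} {year}", "%B %d %Y")
            except ValueError:
                continue  # skip impossible dates such as "February 30, 2001"
            found.append((style, parsed, m.group(0)))
    return found


def dominant_style(found):
    """Most common style among the dates found; None if there are no dates."""
    counts = Counter(style for style, _, _ in found)
    return counts.most_common(1)[0][0] if counts else None


def render(parsed, style):
    """Format a date in the requested style, using strftime for the month name."""
    month = parsed.strftime("%B")
    if style == "mdy":
        return f"{month} {parsed.day}, {parsed.year}"
    return f"{parsed.day} {month} {parsed.year}"


def harmonize(text):
    """Rewrite every recognised date in the text to the dominant style."""
    found = find_dates(text)
    style = dominant_style(found)
    if style is None:
        return text
    for _, parsed, original in found:
        text = text.replace(original, render(parsed, style))
    return text
```

A tie between formats is resolved arbitrarily here (whichever style was counted first), which is one of several decisions a real bot would need to justify.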
It may be quicker, but not easier: the bot still has to match the contents of the link with a regex that determines if it's a date - just as it would without the links. The problem with using the "dominant format" is that is not what MOS dictates - see WP:MOSNUM#Full date formatting and the three guidelines it gives. I'm afraid that is not an easy determination for a bot to do, and the last thing we want is a bot changing dates to the wrong format! --RexxS (talk) 22:45, 24 June 2009 (UTC)
That's a good reason for this bot not to unlink dates unless it can determine the correct date format, and assign the dates it unlinks to that format. — Arthur Rubin (talk) 00:04, 25 June 2009 (UTC)
I don't know what you mean. If it matches the same regexes MediaWiki uses, then it should be a date, we're already treating it as one. As currently proposed, the bot would not make such a determination, it would just unlink it regardless. At best, making them all the same format would make the article more consistent and make all the dates consistent with the MoS, at worst it would make a minority of the dates more inconsistent with the MoS but would make the article itself more consistent. If most of the article is already non-compliant with the MoS before the bot starts, then it'll be non-compliant when it finishes, regardless of whether or not we make the format consistent. But I would think having them be wrong but consistent would be at least slightly better than inconsistent and wrong, including things that are obviously wrong like [[January 1]],[[2001]], which even with no preference set will still be rendered correctly (with a space between the comma and the year), but would be broken by simply removing the brackets. Mr.Z-man 00:52, 25 June 2009 (UTC)
If you were replying to me, then I don't know what you're trying to say. If not, I think I agree with you, that the bot should change the "linked" text to that which would appear if the editor had some preference; and the bot should make some attempt to determine the "correct" date preference for the article. — Arthur Rubin (talk) 01:39, 25 June 2009 (UTC)
I assume the reply was to me. First, I'm sorry if I was not clear. The process of matching things like 'number+none-or-more-whitespace+text-that-is-the-name-of-a-month' is exactly the same whether or not there are square brackets around it - and similarly for years. So, no, it's not "easier to find for a bot or a script if they're linked". Second, I have no idea what you mean by "we are treating it as one". Nobody is doing anything with an entity which is recognised as a date, so what's the relevance? The bot will have to determine that what it is unlinking is a date. That's the point of the proposal. As I said before, this is a proposal to rectify the linking caused by years of editors trying to make the broken system of date autoformatting work. It is not a proposal to make dates consistent within each article. Attempting to expand the scope of this proposal is just asking for the problems that this modest proposal seeks to avoid. If you want to make a separate proposal for a bot that goes through articles making dates consistent, go ahead, you'll have my support. But please don't confuse this proposal by trying to make it something that it isn't. --RexxS (talk) 02:07, 25 June 2009 (UTC)
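To make the matching point in the comment above concrete, here is a minimal sketch, assuming English month names and a deliberately simplified day-month pattern (the names `CORE`, `PLAIN` and `LINKED` are made up for illustration). The date-recognising part of the pattern is identical in both cases; the linked variant only adds the surrounding brackets.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# "number + none-or-more whitespace + name of a month", as described above.
CORE = rf"\d{{1,2}}\s*(?:{MONTHS})"

PLAIN = re.compile(rf"\b{CORE}\b")        # matches with or without brackets around it
LINKED = re.compile(rf"\[\[{CORE}\]\]")   # matches only the bracketed form

text = "He arrived on 5 November and left on [[12 December]]."
print(PLAIN.findall(text))
print(LINKED.findall(text))
```

Whether restricting the search to the bracketed form counts as "easier" is the point under dispute; the sketch only shows that the same core pattern does the recognising either way.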
Yes, I do agree with Arthur Rubin. The problem is not just the links. Part of the problem is that the date autoformatting results in inconsistency for people without a date preference set. This does nothing to help that, and arguably makes it worse by making them inconsistent for everyone and potentially leaving things broken. Using a separate bot to make them consistent is just stupid. You'd be wasting tons of resources by editing all of the pages twice to edit the links. Mr.Z-man 02:24, 25 June 2009 (UTC)
Indeed the problem is not just the links. But linking for DA is the part of the problem this proposal is aimed at. There is no need to try to do everything at once, just start with what is easiest to fix. Consider what you are saying in your opposition: leave it alone simply because the 0.03% of readers with prefs set will now see what everybody else is seeing. Is that really the point of your argument? Now, you think that "the date autoformatting results in inconsistency for people without a date preference set". That couldn't be further from the truth. The inconsistency for people without a date preference set is caused solely because an editor typed in a date in the wrong format or in a broken format. DA does nothing but hide that from a few editors who might otherwise rectify the problem. I suggested you make a separate proposal (for a bot to make dates consistent). The reason that is not stupid, is that this bot can't do it. Just try to make a proposal to make a bot to do that. Either you'll have an unacceptable algorithm (because it does not conform to MOS, like the one you suggested above) or you'll find that you're trying to codify the MOS requirements into an algorithm requiring judgement beyond what a bot can provide. But don't take my word for it, go ahead and try it out. It might provide a better perspective on just what jobs are not suitable for a bot after all. --RexxS (talk) 02:56, 25 June 2009 (UTC)
No, my point is that if we're going to do it, we should do it right the first time; there's no rush. Your argument seems to be "We need to do something, this is something, so let's do this!" I really don't understand your complaints about the result not conforming to the MoS. If the result doesn't conform to the MoS, it would be because the majority of the dates in the article already didn't conform to the MoS. It would be non-conforming when it started and non-conforming when it finished. The end result of this bot won't conform to the MoS in many cases, why are you not trying to force that restriction here? It would arguably be worse in every case where the dates aren't all in 100% the same format. WP:MOSNUM says that "Dates in article body text should all have the same format" and "If an article has evolved using predominantly one format, the whole article should conform to it, unless there are reasons for changing it based on strong national ties to the topic" so the only case it would be wrong to change them all to the same format would be if there was a strong national tie to a particular format AND that format is not the dominant style. Mr.Z-man 04:55, 25 June 2009 (UTC)
Sorry to say, but this whole section is a side issue. The main point is not one of consistency (inconsistencies will be a tiny minority, and can be mopped-up by editors; and as you say, there is no rush). The real issue is in the blanket (community-consensus-based) removal of unwanted date-links. The removal of those date links is something that will never happen by gradual edits. We need to wipe the slate clean. We've all been waiting too long for proposals to do it "right"—proposals incidentally, that never eventuate (from even the most ardent supporters of date linking and formatting).  HWV258  05:39, 25 June 2009 (UTC)
Yes, the proposals to do it right are taking too long, so let's do the first thing to be proposed after the ArbCom case ends and figure out the rest later. That sounds like a plan for success. Mr.Z-man 05:51, 25 June 2009 (UTC)
Ah, but the hidden message in your response suggests that there is a way to do it right. Sadly, this issue has been thrashed around for years with no viable (technical) solution emerging. You should take time to consider that there is a solution to this problem: and that is to link as few dates as possible (and, as previously indicated in this thread, the community has realised that that solution is both simple and practical). Lastly, please understand that the forums for deciding whether this gets done happened long ago (and I promise you that there is close to zero appetite for revisiting that dark and lonely corner). What we are now considering is how this thing will happen. Cheers.  HWV258  06:17, 25 June 2009 (UTC)
No, I'm not hiding it at all. There may not be a perfect way, but this proposal is far from ideal; it's questionable whether it's even a good way. I'm familiar with that bug report. I was one of the developers called stupid, ignorant, apathetic, and negligent for trying to help with it. That was for a way to autoformat dates without linking, but that's not what I'm suggesting. I'm curious about this discussion that said all linked dates needed to be removed by a bot; I don't remember that one. The MoS only mentions that it shouldn't be done, not that it needs to be removed immediately. I'm not suggesting we reopen the debate. In fact, I'm suggesting a better option for this proposal. Mr.Z-man 06:32, 25 June 2009 (UTC)
Changing date formats was never part of the many-and-varied RfCs; removing date-linking and date-formatting syntax was. So that we can get this all in perspective, could you please provide some examples of the mixed-date-format problem (a statistical analysis would be nice)? Would you like me to check 100 random pages in order to see which ones have a mixture of date formats (I must admit that I don't have much appetite for the task as I suspect the result will be zero)?  HWV258  06:41, 25 June 2009 (UTC)
If no articles have a mixture of date formats, then standardizing them would have no effect and there would be nothing to worry about. Mr.Z-man 16:53, 25 June 2009 (UTC)
Mr.Z, please refer to the message I left on your talk page. Ohconfucius (talk) 10:14, 25 June 2009 (UTC)

Harmonizing the date format within each article is a good goal, but it is outside of the scope of this bot. It will not make every article it touches completely MOS-compliant. Rather it will bring them closer by:

  1. Reducing the amount of blue links.
  2. Fixing simple errors like missing spaces and commas so that autoformatting can be turned off without any negative effect for the huge majority with no date preference set.

It would do this while minimizing the risk of breaking dates or delinking those few date links that are actually relevant. This is why we kept it minimal, and I think it has been thoroughly established that it is extremely unlikely that triple-date links such as [[January 1]], [[1953]] should be linked.

Yes, articles with varying date formats will still have varying date formats. Those can be fixed either using Lightmouse's script (which would be much less controversial if the links were already taken care of), or with a bot if someone can figure out how to do this much more complex task without too many mistakes. --Apoc2400 (talk) 11:23, 25 June 2009 (UTC)
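As an illustration of the narrow task described in the comment above, here is a minimal sketch of unlinking only full dates while repairing a missing space or comma. It is an assumption-laden toy, not the bot's actual code: the patterns are simplified, only English month names are handled, and `unlink_full_dates` is a hypothetical name. Solitary years, solitary month-days and piped year links are deliberately left untouched.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# [[January 1]], [[2001]] with any spacing, with or without the comma.
MDY_LINK = re.compile(
    rf"\[\[((?:{MONTHS}) \d{{1,2}})\]\]\s*,?\s*\[\[(\d{{4}})\]\]")
# [[1 January]] [[2001]]; an existing comma is preserved rather than judged.
DMY_LINK = re.compile(
    rf"\[\[(\d{{1,2}} (?:{MONTHS}))\]\](,?)\s*\[\[(\d{{4}})\]\]")


def unlink_full_dates(wikitext: str) -> str:
    """Remove the link brackets from full dates only, normalising spacing."""
    # Month-day-year: always emit "Month D, YYYY".
    text = MDY_LINK.sub(r"\1, \2", wikitext)
    # Day-month-year: keep whatever comma was there, force a single space.
    text = DMY_LINK.sub(r"\1\2 \3", text)
    return text


print(unlink_full_dates("Born [[January 1]],[[2001]] in [[Paris]]."))
```

Because the year must be a bare four-digit link, a piped link such as [[1973 in sports|1973]] does not match and the whole date is skipped.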

if someone can figure out how to do this much more complex task without too many mistakes - Am I talking to myself here? I'm pretty sure I gave a way to do that. It would be correct in the majority of cases and the only cases it would be incorrect would be cases that were incorrect before the bot edited it. So the worst case scenario would be that we have the same amount of non-compliant articles we did when it started. Doing it with separate bots just because we're in a hurry is just a massive waste of resources. Compared to many of the bots we have running, it's not complex at all. Mr.Z-man 16:53, 25 June 2009 (UTC)
Think again. Your "way to do that" is just plain wrong. The algorithm to make dates consistent in an article is not to make the minority conform to the majority. If you would simply take the time to read WP:MOSNUM#Full date formatting as suggested earlier, you would know that you have to consider (1) "Articles on topics with strong ties to a particular English-speaking country should generally use the more common date format for that nation" (2) "If an article has evolved using predominantly one format, the whole article should conform to it, unless there are reasons for changing it based on strong national ties to the topic" and (3) "In the early stages of writing an article, the date format chosen by the first major contributor to the article should be used, unless there is reason to change it based on strong national ties to the topic". Those processes are not simple even for a human editor to carry out (e.g. what are the strong national ties in a biography of a British national who became a naturalised American? how do you identify an article's first major contributor? or who made the first edit containing a date?). Unpalatable as it may be, that's not a job for a bot. Or go ahead and make one, I'd be delighted to be proved wrong if this job actually were capable of being automated. --RexxS (talk) 18:50, 25 June 2009 (UTC)
I read the damn page, thank you. You obviously haven't read anything that I said though. Why does a bot that makes the dates consistent have to be perfect but a bot that just delinks them can do whatever the hell it wants? I am not proposing a bot to fully enforce the MoS, just the "consistency" part, just as this bot proposal is only to enforce the "no linking" part. The only case where a bot to make them consistent would produce a wrong result would be IF THE ARTICLE IS ALREADY WRONG. The bot wouldn't just randomly change the format for the whole article, it would standardize it using whatever format the human editors who edited the article use most. If the article is mostly right, it would become completely right. If the article is inconsistent and in mostly the wrong format, it would at least make it consistent, fulfilling one of the date formatting criteria rather than the previous zero. If you wanted extra precision, you could look at the categories for ones that would suggest a national tie and/or look in the early revisions of the page for the first instances of dates, both of which would not be difficult to do with a bot, but since I'm not proposing a bot to make articles fully compliant (why you repeatedly insist that I am, I have no idea), only more compliant than they are currently, it would just be extra. But you're basically arguing that a bot that edits links has to make them fully compliant with the MoS, or it's unacceptable, unless it's only delinking them, then it doesn't. Mr.Z-man 19:04, 25 June 2009 (UTC)
Knock off the personal attacks, swearing and shouting. It does nothing to advance your argument and much to make you look like an upset kid. Please try to understand that not everybody is going to agree with your analysis and try to accept that others may be right. In this case, this is a proposal to have a bot remove date-autoformatting from date-triples. It is a modest proposal that stands a good chance of meeting the concerns of everyone, with respect to the task it is intended to do. It has never been intended to produce perfection in making all dates MOS-compliant in every respect. As many editors have pointed out over a large span of time, that task will require human editing in several areas. I am simply trying to explain to you that the task of making dates consistent falls into the latter category. As such, I disagree with your suggestion that this bot should take on that task. --RexxS (talk) 19:47, 25 June 2009 (UTC)
It has never been intended to produce perfection in making all dates MOS-compliant in every respect. - And neither has my idea. I'll admit that I'm wrong when someone can explain how having inconsistent and wrong dates is better than consistent and wrong dates. Mr.Z-man 20:10, 25 June 2009 (UTC)
It's not better, it's just that it has nothing to do with this bot. I like your idea. It would be great to have DateConsistencyBot. But this isn't it and it never could be. Maybe we could agree on that? --RexxS (talk) 23:49, 25 June 2009 (UTC)
I'm not sure what the dispute is about here. We all agree that consistency of date formats in articles is a Good Thing, and that a bot to do that might be possible (and would be a Good Thing if it could be done), no? The conclusion of this recent saga seems to be that all bar a handful of editors now accept that the consensus is that minimal linking of dates in articles is a Good Thing, no? How are those two issues connected? They are two separate problems that can quite happily have separate resolutions. A DateConsistencyBot couldn't search only for linked dates anyway, because dates are not universally linked; so the fact that another bot might be delinking dates is not relevant. I don't see these two tasks as conflicting in any way. Happymelon 17:36, 28 June 2009 (UTC)
My main problem with this proposal is that it's totally opposite to how bots normally work. Rather than trying to do it as efficiently as possible using as few edits as necessary, the goal seems to be to do it as inefficiently as possible. Rather than removing all date links and making formats consistent with one bot on one run, we're going to use one bot to remove full date links, one or two more to remove links on partial dates, and then possibly another to make the formats consistent. So rather than one bot that would make no more than 2.9 million edits, we're going to have 3 or 4 that combined could make up to 11.6 million edits. IMO it's pointless inefficiency just for the sake of getting a proposal pushed through as fast as possible. Mr.Z-man 16:42, 30 June 2009 (UTC)
I think the intention is to minimize drama. We have at least one editor !voting "oppose" because he suspects the bot may delink [[June 1]], [[1973 in sports|1973]], which he opposes, even though the bot description explicitly says it won't do this. And two other people have !voted citing this person's opinion as a main reason. It makes sense to me to operate under clear consensus, without trying to do too many things at once, to avoid controversy. It's hard enough just to determine consensus for this one step: some oppose because they are die-hard fans of autoformatting, some because they don't like bots doing this sort of thing, some because this task will make too many changes at once, and some (you) because it will make too few changes at once. And all this for a task whose general outline (not indiscriminately autoformatting via date links) has had community approval for many months now. – Quadell (talk) 16:57, 30 June 2009 (UTC)
That still sounds a lot like "just for the sake of getting a proposal pushed through as fast as possible." There's no reason we have to do this right now. We've lived with autoformatting for around 6 years now IIRC, it's not going to hurt anything to let it be for another month or 2 while the MoS stabilizes and a complete proposal is drafted. Additionally, no one has really said why we need to do this in the first place. Yes a bot would be faster than a person, but that doesn't really explain why it needs to be done. Personally, I think the best option would be to add this to AWB general fixes; if we do that, then doing it in steps wouldn't be as big a deal as we'd just be piggybacking on edits that would be made regardless. Mr.Z-man 18:55, 30 June 2009 (UTC)
There is no reason to wait either. Will anything be better later? A problem with adding it to AWB general fixes at this point is that the user running AWB may not be able to explain the edits to others. --Apoc2400 (talk) 20:26, 30 June 2009 (UTC)
I have been arguing similarly to User:Mr.Z-man elsewhere on this page. I also think there is no rush to get the dates unlinked and that it is worth taking the time to have the bot get the dates formatted right the first time. I see a support consensus on this RfC as becoming a self-fulfilling answer to your question "Will anything be better later?" If the dates are indeed unlinked, many will see the date issue as resolved and not see the need to revisit autoformatting features, and nothing better will come along. However, if a discussion is held before sending the bot to correct dates, appropriate consensus can be reached while there is active discussion on the subject. I apologize if I am being presumptuous and there are intentions to revisit formatting with as much enthusiasm as this unlinking RfC; such an action would dispel most of my concerns. —Ost (talk) 22:16, 30 June 2009 (UTC)
Will anything be better later? - Quite likely, but even if there isn't, it's not like this proposal will expire. Not being able to predict the future is not a reason to rush. And, if something shouldn't be done with AWB general fixes, where each edit is reviewed, it definitely shouldn't be done with a fully automated bot. If something is going to be controversial and need frequent explanations when each edit is manually reviewed, it will be 10 times as controversial and drama-causing with a bot. Mr.Z-man 03:04, 1 July 2009 (UTC)

Questions?

Ok, the discussions above are getting a bit messy. If anyone has any questions about the proposal or the bot, then ask below, and I will answer unless someone does it before me. --Apoc2400 (talk) 00:10, 25 June 2009 (UTC)

Did I get it right that relevant date links could be kept? In that case I even more support the idea. The thing is, personally I seem to struggle with dates like for example 2009-06-04. Is this April 6, 2009 or June 4, 2009? Galoubet (talk) 05:39, 25 June 2009 (UTC)
  • Yes, those links that someone thinks are relevant can be marked in one of several different ways, and the bot will leave them alone. Or they can be linked again after the bot runs and it will not delink them again. That is June 4, btw. I personally have no problem with this format, but I do with e.g. 06/04/09. This is why we recommend only using the unambiguous formats June 4, 2009 or 4 June 2009. --Apoc2400 (talk) 11:28, 25 June 2009 (UTC)
Thanks, very much appreciate your comment. And yes, the unambiguous formats make life much easier. Galoubet (talk) 12:27, 25 June 2009 (UTC)
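
For readers who share the confusion above, here is a minimal sketch of why 2009-06-04 reads as June 4: in the ISO 8601 style the order is always year, month, day. The function name and the month table are illustrative assumptions only and are not part of the bot proposal.

```python
from datetime import date

# Month names are spelled out here so the output does not depend on locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def unambiguous_forms(iso_text: str) -> tuple[str, str]:
    """Return the two recommended spellings for a YYYY-MM-DD date.

    Hypothetical helper for illustration; the proposed bot is not
    described as doing this conversion.
    """
    d = date.fromisoformat(iso_text)  # always year-month-day
    month = MONTHS[d.month - 1]
    return f"{month} {d.day}, {d.year}", f"{d.day} {month} {d.year}"


print(unambiguous_forms("2009-06-04"))
```

A string such as 06/04/09 cannot be handled this way, because nothing in the text says which number is the month.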

Suggestion on exclusion list

I suggest that whenever an article is placed on the exclusion list, it also be tagged with {{bots|deny=FDUBot}} (where FDUBot is the name of the full date unlinking bot). This will notify those who have the article on their watchlist, but who do not follow the exclusion list, that the article has been excluded. --Jc3s5h (talk) 15:53, 25 June 2009 (UTC)

And perhaps if an article is tagged with {{bots|deny=FDUBot}}, it should also be placed on the exclusion list for scrutiny? --RexxS (talk) 18:54, 25 June 2009 (UTC)
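
A rough sketch of how the two suggestions above could fit together, assuming the exclusion list is available as a plain set of page titles and that the tag takes the simple {{bots|deny=...}} form with a comma-separated list of names. FDUBot is the placeholder name used above; the function names and the simplified pattern are assumptions for illustration, not the bot's actual code.

```python
import re

# Simplified pattern: only the plain {{bots|deny=Name1, Name2}} form is handled.
BOTS_DENY = re.compile(r"\{\{\s*bots\s*\|\s*deny\s*=\s*([^}|]*)\}\}", re.IGNORECASE)


def is_denied(wikitext: str, bot_name: str) -> bool:
    """True if a {{bots|deny=...}} tag names this bot (or 'all')."""
    for match in BOTS_DENY.finditer(wikitext):
        denied = {name.strip().lower() for name in match.group(1).split(",")}
        if bot_name.lower() in denied or "all" in denied:
            return True
    return False


def should_skip(title, wikitext, exclusion_list, bot_name="FDUBot"):
    """Skip a page if it is on the list or carries the deny tag."""
    return title in exclusion_list or is_denied(wikitext, bot_name)


def needs_cross_listing(title, wikitext, exclusion_list, bot_name="FDUBot"):
    """True when the tag and the list disagree, so a human can reconcile them."""
    return (title in exclusion_list) != is_denied(wikitext, bot_name)
```

Under these assumptions, `needs_cross_listing` would flag both cases raised above: a listed article that lacks the tag, and a tagged article that is missing from the list.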

Flame wars

(Moved from vote section above.)
4. Oppose, the previous flamewars show that this area is one in which automatization would be great source of unnecessary drama. Titoxd(?!? - cool stuff) 20:32, 22 June 2009 (UTC)

  • Just out of curiosity, what would you recommend we do? Dabomb87 (talk) 00:38, 23 June 2009 (UTC)
    • Leave it alone, for starters. A bot does not have the ability to detect how to treat edge cases as well as a human does. Humans can use AWB or something similar to manually ensure that the links that are being removed are truly useless. Titoxd(?!? - cool stuff) 06:08, 23 June 2009 (UTC)
      • "A bot does not have the ability to detect..."—but that's not the point. As the ratio of useless linked dates to useful linked dates is (at a guess) tens-of-thousands to one, the only effective solution is to wipe the slate clean (exception page willing), and start again. This was mentioned many times in the debate over the previous six months.  HWV258  07:18, 23 June 2009 (UTC)
        • On the contrary, the fact that bots do not have the ability to detect which links are appropriate and which are not is the crux of the whole issue, and resulted in a couple of bans in the Arbitration case (c.f. Lightmouse and Locke Cole). There is nothing that this bot would do that cannot be accomplished by editors using semi-automated tools, so using a fully-automated bot would only serve to poke a hornet's nest that was just appeased—an act that is not only unnecessary, but IMO is imprudent and reckless. The more I think of this, the worse an idea it sounds. Titoxd(?!? - cool stuff) 11:00, 24 June 2009 (UTC)
          • The bans had nothing to do with the functions of the bot/script (remember that arbcom only addresses behavioural issues). You have clearly not appreciated the scale of the problem (unlike the editors who have been involved in this for the duration). Please read some history before replying. Cheers.  HWV258  22:25, 24 June 2009 (UTC)
            • Ad hominem. Awesome. I'll bite anyway. The remedies in the case include a prohibition on automatic mass delinking, so the ArbCom found it prudent to apply a restriction on automated editing to avoid inflaming users in a controversial situation. A bot would essentially restart the circus again. In any case, the behavior of the parties in the case demonstrates that this issue should be handled with care, which cannot be done by a bot. Titoxd(?!? - cool stuff) 06:59, 25 June 2009 (UTC)
              • You are unaware of the scale of the issue. Did you know that during last year, the bot edited approx. 100,000 articles in November alone? Those edits resulted in little or no residual issues. What went wrong is when the bot stumbled across a tennis-related article that dragged one particular editor (no longer active) out of the woodwork. That editor behaved unconscionably and ended up dragging us all through this mire. Interesting that the editors watching the other (approx.) 99,999 articles in November felt no need to continue with such actions. A storm in a teacup prolonged by one or two (currently inactive) editors. Please learn some of the history of this issue before dragging us all down paths of little statistical significance.  HWV258  22:18, 25 June 2009 (UTC)
                • Indeed, as an editor who looks after entire categories of articles I actually invited Lightmouse in at one stage, and his bot worked exactly to specification in the articles I was watching (several thousand Australian ones). Orderinchaos 20:23, 30 June 2009 (UTC)

(Outdent) I'm sorry — it's certainly accurate to state that LockeCole filed the arbitration request, and I know that it's fashionable to blame this entire situation on Locke and TennisExpert. But the latter is really a dramatic oversimplification. There was concern expressed about the bot edits completely separate and apart from the edit war on the tennis articles. Witness this AN discussion from mid-November and this AN/I discussion from a month later. Neither of those editors was involved in the first discussion and, while they were both involved in the second discussion, neither of them initiated it.

I'm likely going to support this proposal, as I think it's a reasonable and measured step forward toward addressing the issue. I say that being one of those who was expressing strong concerns about the bot edits at the time this was happening, and as someone who provided evidence in relation to those concerns during the arbitration case. I view the fact that I'm likely going to support as being a good thing, as it demonstrates that it's possible for users from all sides of this issue to find a common ground, and is evidence that we're making progress. However, I'm not yet fully committed, because I'm watching some of the concerns expressed here about inflaming things coming true before my eyes. HWV258, you've undoubtedly marked your support above, as Titoxd has marked their oppose. Discussion is good and necessary, but it's important to remember that everyone commenting, on all sides of the proposal, is entitled to their opinion. They're particularly entitled to express their opinion without being told that the point about which they're concerned isn't the point, and without presumption that they haven't sufficiently educated themselves about the issue before commenting. I can't imagine that any user would respond well to being talked down to in that manner. Mlaffs (talk) 23:11, 25 June 2009 (UTC)

Correct (and apologies), but the above was liable to sway the debate with pejorative phrases such as "...great source of unnecessary drama". Statistically, that is simply not the case. If a debate is going to happen, then it must happen on a level field without the use of emotional phrases designed to achieve a specific outcome. To get a better idea of scale, anyone thinking of voting "oppose" should take some time to read the comments at an RfC such as this one  HWV258  23:39, 25 June 2009 (UTC)
That only says that people went through 100,000 articles before they hit the flashpoint. If it had been only the fault of Tennis Expert, we wouldn't have had a five-month Arbitration case with 19 editors receiving sanctions. My original point still stands: There's been a great deal of animosity surrounding this topic, and there's no indication that all of the fuel that provided the original pyrotechnics has been consumed. A bot run would cause more friction than a gradual semi-automatic, human-assisted process, which is what I recommend. And sorry, but the drama surrounding the link wars cannot be described as anything but completely unnecessary. Titoxd(?!? - cool stuff) 23:57, 25 June 2009 (UTC)
"That only says that people went through 100,000 articles before they hit the flashpoint"—but that's the entire point (and incidentally, that was only November 2008—there were many other bot-months involved). The enormous difference between when TE overreacted and now, is that there are many RfCs that show clear community consensus (and now direction) as to the delinking of dates. In other words, when the next obstreperous editor crawls out of the woodwork and makes false claims about "community-support" (and such-like), it will be much easier to point to the reasons as to why the bot's work is deemed necessary. It will also be possible for the editor in question to add pages to an exclusion list (if justifiable). Please try to appreciate that we have all come a long way (and suffered much) in order to avoid a future "great source of unnecessary drama".  HWV258  01:25, 26 June 2009 (UTC)

Additional date combinations to clean up

I have seen some other goofy date links that can be added to this bot's task list:

  • [[June]] [[29]]
  • [[June 29]]-[[30]]

Andrwsc (talk · contribs) 19:24, 29 June 2009 (UTC)

I expect there are numerous cases where the dates are broken to start with, as with the links to the years 29 and 30 in the examples above. However, I think it is best to limit the scope of the proposed bot to only processing correctly formatted linked dates, and those with minor punctuation errors that are presently recognized and corrected by default-mode autoformatting. (For an example of the latter, note that "[[29 June]][[2009]]" and "[[29 June]] , [[2009]]" both display as "29 June 2009" to the IP user.) To enumerate, identify, and handle the many other possibilities would greatly increase the complexity and opportunity for error. I believe these cases would best be handled separately – either manually or through a semi-auto process such as AWB. -- Tcncv (talk) 23:08, 29 June 2009 (UTC)
These could be fixed by the bot as uncontroversial mistakes. It depends on how hard it is to implement reliably. I cannot promise that the bot will fix these mistakes, but unless there is some protest, we will make sure to look at it. Even for uncontroversial fixes, we need to focus on the common ones, leaving less common mistakes for manual or semi-automation as Tcncv says above. --Apoc2400 (talk) 23:28, 29 June 2009 (UTC)
It turns out that there are only about a thousand pages that inappropriately link to a year in the 1–31 range, but not all are of the form I posted above. For example, a good percentage should be disambiguated (e.g. 1 (The Beatles album), 24 (TV series), etc.) Perhaps it is indeed simpler to keep this as a separate task because the scale of the problem is not so enormous. — Andrwsc (talk · contribs) 16:30, 30 June 2009 (UTC)
Is it possible to compile a list of these inappropriate links to the 1–31 range? I would be happy to go through slowly and disambiguate, cleanup, etc. Dabomb87 (talk) 16:32, 30 June 2009 (UTC)
Sure! I've put the remaining working list at User:Andrwsc/bad year links. — Andrwsc (talk · contribs) 17:14, 30 June 2009 (UTC)
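
To make the narrower scope discussed in this thread concrete, here is a hedged sketch of a matcher that only touches well-formed full-date links plus the minor spacing and comma slips mentioned above, and leaves the "goofy" combinations alone for manual or semi-automated cleanup. The regular expressions and the function name are assumptions for illustration and are not taken from the bot's code.

```python
import re

MONTH = (
    r"(?:January|February|March|April|May|June|July|"
    r"August|September|October|November|December)"
)
# Optional whitespace and an optional comma are tolerated between the two links.
GAP = r"\s*,?\s*"

# [[29 June]] [[2009]] style (day first)
DMY = re.compile(r"\[\[(\d{1,2}) (" + MONTH + r")\]\]" + GAP + r"\[\[(\d{4})\]\]")
# [[June 29]], [[2009]] style (month first)
MDY = re.compile(r"\[\[(" + MONTH + r") (\d{1,2})\]\]" + GAP + r"\[\[(\d{4})\]\]")


def unlink_full_dates(wikitext: str) -> str:
    """Unlink full dates only; solitary or malformed date links are untouched."""
    text = DMY.sub(r"\1 \2 \3", wikitext)
    text = MDY.sub(r"\1 \2, \3", text)
    return text


print(unlink_full_dates("[[29 June]] , [[2009]]"))
```

Because the year link must be exactly four digits with no piped text, inputs such as [[June]] [[29]], [[June 29]]-[[30]], and [[June 1]], [[1973 in sports|1973]] pass through this sketch unchanged.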

Query

Once this bot is done running, can the misbegotten autoformatting code finally be deleted? It's currently impossible to use its original functionality, of allowing history pages to be in the desired format, without the highly undesirable functionality of disguising problems on the page coming into play. Shoemaker's Holiday (talk) 03:41, 30 June 2009 (UTC)

Likely, yes. The reason autoformatting hasn't been disabled yet (bugzilla request) is that it would expose some date errors that autoformatting is currently fixing even for readers with no date preference set. This bot will fix those errors, as well as unlink the 1998-05-23 style dates that would otherwise become redlinks if autoformatting was turned off now. --Apoc2400 (talk) 08:53, 30 June 2009 (UTC)
It is likely that the autoformatting code would be turned off for the English Wikipedia, but would be left in the Wikimedia software for use by other projects. --Jc3s5h (talk) 14:25, 30 June 2009 (UTC)

Closing soon

This RFC will be closed in a few hours. Get your last comments in if you haven't already. --Apoc2400 (talk) 21:44, 6 July 2009 (UTC)