User talk:Citation bot/Archive 1


Which templates?

Does DOI bot handle {{citation}}, or only the {{cite}} series? —David Eppstein (talk) 15:15, 23 May 2008 (UTC)

It's restricted to {{Cite journal}} at the moment, as these are the most likely to have DOIs - allowing the bot to edit other templates would open up a whole new world of potential little bugs. Maybe I'll broaden its scope to include these someday... Smith609 Talk 15:51, 23 May 2008 (UTC)
I think {{citation}}s that include a journal= field are just about equally likely to have DOIs. Other cites, and non-journal citations, are iffier, I agree. —David Eppstein (talk) 17:22, 23 May 2008 (UTC)
Update: It now handles {Citation} and {cite book} too. Martin (Smith609 – Talk) 04:38, 7 December 2008 (UTC)

Abstract

[1] Is it really reasonable to label something an "abstract" when the linked page has the full document linked in PDF and/or GIF form? WilyD 12:50, 24 May 2008 (UTC)

Thanks for pointing this out. Now fixed.

Is it really fixed? DOI bot has gone through all my marked articles with (abstract). See, for example, mired and D65. Just because a subscription is required it does not mean that the full article is not provided. Are you going to fix this or do we have to go through the backlog manually? --Adoniscik(t, c) 12:27, 28 May 2008 (UTC)

I think the "abstract" summary is accurate. The page the URLs on these pages link to is the abstract, or a summary of the article. If the editor had intended to supply a link to the full text, they would have saved their reader a click and linked directly to the full text, which the bot would mark "subscription required" if necessary. Most academic readers will know how to procure a full text article from an abstract link, or through their own library via DOIs. An abstract link gives the casual reader an impression of what is on the other end of the link; it doesn't say "abstract only" and imply that no other information is available. Perhaps "free abstract" would be a better wording?
I have disabled this feature for the time being until the matter is resolved; feel free to argue back! Smith609 Talk 12:51, 28 May 2008 (UTC)

Not at all. I linked to the abstract page as a matter of policy for two reasons:

  1. There is no way to go back from the PDF to the abstract page. The abstract page, however, does link to the PDF. Frequently it also provides invaluable information not readily available from the PDF, such as a permalink. The permalink generally does not resolve to the PDF.
  2. I connect to subscriber sites through a proxy server affiliated with my educational institution. The redirection is not entirely transparent and hinders my ability to save the PDF to the hard disk (which is my preferred course of action) rather than attempting to open it in the browser. However, if I first attempt to access the abstract page, I go through the redirection process and then can readily right click and save the PDF on the redirected page. If I attempt to right click and save the PDF without first going through the redirected abstract page, I merely save the HTML page where I have to enter my user credentials for my educational institution.

I do appreciate that your bot fixes the en dash in the page numbers! --Adoniscik(t, c) 15:33, 28 May 2008 (UTC)

If you've linked to the abstract page, then surely the bot should mark the link as a link to an abstract page? Smith609 Talk 16:08, 28 May 2008 (UTC)
Only if the abstract is all that is available. That is what I understand when I see the "(abstract)" remark. Maybe a better wording is indeed advisable. I can live with (free abstract), but I think it is just clutter. I would probably say nothing at all. Adoniscik(t, c) 16:22, 28 May 2008 (UTC)
How about "abstract page"...? Smith609 Talk 16:43, 28 May 2008 (UTC)

Barnstar

  The Minor Barnstar
For the great job of improving citations. utcursch | talk 16:11, 25 May 2008 (UTC)

Please restrict your bot to DOI's

I'm grateful for the handful of DOI's that your bot came up with. Nonetheless, I am going to revert your bot's changes to problem of Apollonius, an article I have been working on for months, for the following reason. I deliberately write &ndash; instead of – to help me proofread the article. Forgive me for saying so, but it doesn't seem wise to make invisible formatting changes that also interfere with a human editor's ability to maintain the article. My suggestion would be to turn off that feature of your bot. Otherwise, well done! :) Willow (talk) 21:58, 27 May 2008 (UTC)

Done: The bot will no longer replace –. Thanks for your feedback, and sorry for the inconvenience! Smith609 Talk 07:19, 28 May 2008 (UTC)

Thanks, Smith! :) I plan on using your bot a lot in the future; thank you a lot for that as well! Willow (talk) 11:20, 28 May 2008 (UTC)

I see that em-dashes are fair game, still. As they should be. Again, thanks for DOI bot; although I've been running around cleaning up after its changes on my watchlist (mostly either replacing or removing deadlinks), I find what it does very valuable. —David Eppstein (talk) 17:15, 28 May 2008 (UTC)

approved for this?

I was looking over my watchlist and I noticed this change. I then checked out the two bot approval links on the bot's page. I may have missed it, but it doesn't look like your bot was approved to make changes like this.--Rockfang (talk) 16:09, 31 May 2008 (UTC)

The bot is approved to "correct common mistakes". Many edits such as this one were made during the bot's trial period and did not elicit comment. I hope it's not causing you any inconvenience? If your watchlist is becoming cluttered by bot edits, the "hide bot edits" link may come in handy. Smith609 Talk 06:12, 1 June 2008 (UTC)
Cool. I just wanted to make sure it was approved. I reread the bot approval requests and found it. Thanks for the info.--Rockfang (talk) 06:36, 1 June 2008 (UTC)

Parsing page numbers

I noticed in this edit that the bot picked out part of the pages parameter (|pages=223), but not the whole (|pages=223–235). Perhaps it is because Blackwell Synergy (gasp) use en dashes for their page range, and not the more common and less correct hyphen-minus? Who knows. Thought I'd let you know. +mt 18:35, 7 June 2008 (UTC)

Unfortunately the database that the bot consults very rarely contains "end page" data. I feel that scraping it from the website itself is an unjustifiable use of time and resources - so unfortunately you'll have to be content with the start only. Thanks for pointing it out, and sorry I can't do anything about it! Smith609 Talk 21:13, 7 June 2008 (UTC)
Quite alright, I'll just keep my eye out for it. Thanks for the info. +mt 17:26, 8 June 2008 (UTC)

Page numbers

This bot is replacing the dash for pages numbers with the endash. While this appears to be correct style, this is not an approved function for the bot. --EncycloPetey (talk) 19:21, 9 June 2008 (UTC)

Did you see the discussion on similar changes two sections up? —David Eppstein (talk) 20:06, 9 June 2008 (UTC)
Yes, I saw that claim, but that function is not listed on the user page for this bot, nor could I find the discussion approving it for that function. When "correcting common mistakes" was explained in the bot approval, stylistic mistakes were not discussed. What was discussed were corrections such as replacing id= with pmid= when appropriate, or correcting Journal= to journal=. These corrections are invisible, and this is not the same as making stylistic changes in editing. --EncycloPetey (talk) 20:35, 9 June 2008 (UTC)
This is becoming petty.
I've created a very specifically worded request for bot approval for this task. You are invited to contribute to the discussion. Smith609 Talk 07:31, 10 June 2008 (UTC)

Time Delay

Hello,

I missed reverting some vandalism due to a bot edit to Aluminium diff. Can the bot be configured not to make an edit within x minutes of an edit by an IP user, or by a user with fewer than x edits? Whilst it wasn't long before someone read and reverted the vandal's edits diff, for a less major article than Aluminium and less blatant vandalism, the vandalism could have gone undetected for a while. Something along these lines would be good, as I ignored the bot edit on my watchlist. User A1 (talk) 11:47, 12 June 2008 (UTC)

DOI bot overeager?

DOI bot is completely deleting the "access date" on the templates in a couple of articles, can this be avoided? Montanabw(talk) 05:15, 24 June 2008 (UTC)

It deletes access dates where there is nothing that can have been accessed. An accessdate should only be specified in relation to a URL, since journals do not change with time. Martin (Smith609 – Talk) 09:21, 24 June 2008 (UTC)

URL vs. DOI

In the Bird nest article, the DOI bot added a link to a commercial abstract when a valid URL to the free full article was already specified. Should it be doing that? MeegsC | Talk 23:01, 24 June 2008 (UTC)

Yes - a DOI is a permanent link to an article, whereas URLs - especially to free versions - are prone to link rot. Obviously the URL link is far more useful at present, but if and when it breaks the DOI will come into its own. A DOI is also useful to people using some citation manager tools. Martin (Smith609 – Talk) 08:56, 25 June 2008 (UTC)
Thanks. MeegsC | Talk 18:43, 25 June 2008 (UTC)

Damn useful little bot

Thanks. --Blechnic (talk) 22:34, 13 July 2008 (UTC)

Citing sources

Please keep the citation bot away from project pages such as Wikipedia:Citing sources. Indeed, it would probably be best to restrict it to article space only. --Gerry Ashton (talk) 17:54, 5 October 2008 (UTC)

It only acts on article space unless a user activates it on another page. You should take the issue up with Askthemanwhoowns1. Martin (Smith609 – Talk)


OCLC

Hello. Today I saw for the first time that the bot adds an OCLC (diff), but perhaps it has always done this. What is the rationale behind this? The OCLC was added to a book that already has an ISBN, and you can search WorldCat via the ISBN search page, so I am wondering what extra benefit the OCLC gives. -- Jitse Niesen (talk) 21:56, 13 October 2008 (UTC)

I've never used an OCLC myself; I assumed that since it has a parameter in the template it is useful to at least some readers. Is this not the case? If not, the bot can easily go round removing OCLCs wherever an ISBN exists. It'd probably be best to ensure there's a consensus before getting the bot to remove information though. Martin (Smith609 – Talk) 04:26, 14 October 2008 (UTC)
I have no idea. One possibility is that the OCLC is useful in case the book does not have an ISBN. But that's pure speculation on my part. I can't find any discussion at Template talk:Citation on this field, so let me ask there. By the way, thanks for running the bot. -- Jitse Niesen (talk) 11:42, 14 October 2008 (UTC)
OCLC is also useful when finding different editions that may be more conveniently accessible than the one used as ref. LeadSongDog (talk) 18:51, 13 December 2008 (UTC)


DOI from URL for ACS and RSC publications

Hey there

To read the DOIs for American Chemical Society publications, for example:

Publisher doi is 10.1021

Look at "ja00001a054"

DOI should be 10.1021/ja00001a054

(ja refers to J. Am. Chem. Soc.; cr refers to Chemical Reviews, etc.).

For Royal Society of Chemistry, e.g.:

prefix the DOI with 10.1039/

so, 10.1039/b804604m

Hope this helps! --Rifleman 82 (talk) 16:23, 14 October 2008 (UTC)
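Note: the rule described in the post above (publisher prefix plus the article code taken from the URL) could be sketched roughly as follows. Only the two prefixes and the two example codes come from the post; the function name and the "acs"/"rsc" labels are hypothetical, and this is not the bot's actual code.

```python
# Publisher DOI prefixes quoted in the post above. The dictionary keys
# ("acs", "rsc") are labels invented for this sketch.
PUBLISHER_PREFIXES = {
    "acs": "10.1021",  # American Chemical Society
    "rsc": "10.1039",  # Royal Society of Chemistry
}


def doi_from_article_code(publisher: str, article_code: str) -> str:
    """Join a publisher prefix and an article code taken from a URL."""
    prefix = PUBLISHER_PREFIXES[publisher]
    # Tolerate stray whitespace or slashes around the code.
    return prefix + "/" + article_code.strip().strip("/")


print(doi_from_article_code("acs", "ja00001a054"))
print(doi_from_article_code("rsc", "b804604m"))
```

How the article code is pulled out of a given publisher URL is left out, since the URL layouts are not given in the thread.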

Thanks for the info. Are there any instances where the bot fails to retrieve a DOI from citations of this format? If so, could you provide some examples so I can see what improvements are possible? Thanks, Martin (Smith609 – Talk) 23:04, 14 October 2008 (UTC)

Hi there

I haven't tried, but I was looking at your bot's user page, and that you said only two publishers have DOIs in the URL. I thought I might fill you in? ;) --Rifleman 82 (talk) 05:25, 15 October 2008 (UTC)

Sites that I've seen with DOIs in the URL are only BIOONE and Blackwell publishing. The former of these encodes the title in an invisible span.

Do the <meta> tags contain a dc.Identifier or citation_doi? If so, check the dc.title or citation_title matches the title we want.

Is there a DOI in the page, anywhere? Are there lots of DOIs? Do any occur in association with the title? If there are any <br>, <p>, <li> or <td> tags between the title and a DOI, the DOI could refer to a different reference, and we'll have to ignore it.

Is there a unique DOI?

Does the DOI appear in the first 5000 characters of the document? If so, it is probably part of the document description. Any later, and it's more likely to be a reference.
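Note: a minimal sketch of the checks listed above, assuming the page HTML is already available as a string. It covers only the <meta> tag check, the "unique DOI" check and the 5000-character window; the test for <br>, <p>, <li> or <td> tags between a title and a DOI is omitted. The DOI pattern, the function and class names, and the decision to give up when the <meta> title does not match are all assumptions made for illustration, not the bot's actual behaviour.

```python
import re
from html.parser import HTMLParser

# Rough DOI pattern (an assumption; real DOIs can be messier, and trailing
# punctuation would be captured by this simple expression).
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"'<>]+")
META_DOI_NAMES = ("citation_doi", "dc.identifier")
META_TITLE_NAMES = ("citation_title", "dc.title")


class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = attributes.get("content")
        if name and content:
            self.meta.setdefault(name, content)


def find_doi(html, wanted_title, head_chars=5000):
    collector = MetaCollector()
    collector.feed(html)
    meta = collector.meta

    # Step 1: a DOI in the <meta> tags is trusted only if the title matches.
    meta_doi = next((meta[n] for n in META_DOI_NAMES if n in meta), None)
    if meta_doi is not None:
        meta_title = next((meta[n] for n in META_TITLE_NAMES if n in meta), "")
        if meta_title.strip().lower() == wanted_title.strip().lower():
            match = DOI_PATTERN.search(meta_doi)
            return match.group(0) if match else None
        return None  # assumption: the page describes a different article

    # Step 2: otherwise accept a DOI only if it is unique within the first
    # head_chars characters, where it is probably part of the description.
    candidates = set(DOI_PATTERN.findall(html[:head_chars]))
    if len(candidates) == 1:
        return candidates.pop()
    return None
```

Returning None whenever more than one DOI appears is deliberately conservative: a missed DOI costs less than attaching the wrong one to a citation.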


Template:Cite pmid/19240221

Could you please take a look at Wikipedia:Templates for deletion/Log/2009 March 9#Template:Cite pmid/19240221? The bot seems to have created a redirect (Template:Cite pmid/19240221) to a non-existent page (Template:Cite doi/10.1073.2Fpnas.0812570106). Thanks, –Black Falcon (Talk) 05:00, 10 March 2009 (UTC)

In fact, the bot appears to have created this page at least 4 times, and it has been deleted 4 times. --R'n'B (call me Russ) 18:24, 10 March 2009 (UTC)
I've created the missing page manually and will investigate why the bot didn't when I get the opportunity. Martin (Smith609 – Talk) 20:33, 14 March 2009 (UTC)
  Fixed, I think: typo in source code. Martin (Smith609 – Talk) 16:14, 21 March 2009 (UTC)

Posting at WP:BON

I've posted two questions regarding this bot at WP:BON#A separate template for each cited source?. -- John Broughton (♫♫) 13:42, 26 March 2009 (UTC)

Not a bug, but ...

Am I the only one who thinks that putting {{dead link}} into the "format" field makes the citation look kinda ugly, having superscripts parenthesised with normal parentheses: ([dead link]Scholar search). Also adding "scholar search" links for news articles doesn't actually help for link recovery. Maybe a link to an Internet Archive search would be more useful in some cases, depending on what fields are in the citation (e.g. journal/issn, vs. periodical). Cheers, cab (talk) 07:25, 28 March 2009 (UTC)

Yes, it's pretty ugly! Maybe this will encourage editors to fix the dead links...
On a serious note, it would be very easy for me to change this tag - feel free to suggest alternative wikicode and I'll use that. Martin (Smith609 – Talk) 12:38, 28 March 2009 (UTC)

cite doi slash

The bot seems to be confused about slashes in DOIs.

Starting from this: {{cite doi|10.1016/j.jebo.2006.05.017}}

The template is looking for this: Template:Cite doi/10.1016.2Fj.jebo.2006.05.017

But the bot (at least when prodded via "jump the queue") creates this instead: Template:Cite doi/10.1016/j.jebo.2006.05.017

Result:

  • doi:10.1016/j.jebo.2006.05.017
    This citation will be automatically completed in the next few minutes. You can jump the queue or expand by hand


Rl (talk) 15:28, 28 March 2009 (UTC)
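Note: the mismatch reported above comes down to how the slash in the DOI is encoded in the subpage name. The sketch below reproduces the form the template looks for by percent-encoding the DOI and swapping "%" for "."; this is an approximation that is only known to match the examples quoted on this page, and the function name is hypothetical.

```python
from urllib.parse import quote


def cite_doi_subpage(doi: str) -> str:
    """Build the subpage title in the dot-encoded form shown above."""
    # quote() with safe="" turns "/" into "%2F"; letters, digits and
    # "_.-~" are left alone. Replacing "%" with "." gives ".2F".
    encoded = quote(doi, safe="").replace("%", ".")
    return "Template:Cite doi/" + encoded


print(cite_doi_subpage("10.1016/j.jebo.2006.05.017"))
```

DOIs containing other special characters may be encoded differently by MediaWiki, so this should not be relied on beyond the slash case.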

  Fixed Thanks for the heads up. Martin (Smith609 – Talk) 15:43, 28 March 2009 (UTC)
Thx. There are more of these pages that should probably be deleted. Both variants of pages were created back in February: [2]. I don't see the pattern. Was it the "jump the queue link"? Rl (talk) 16:25, 28 March 2009 (UTC)
The link was the problem. I am actually in the process of applying for adminship; if I get access to administrative tools I'll delete the errant pages myself rather than making work for someone else. Martin (Smith609 – Talk) 16:55, 28 March 2009 (UTC)
You got it. Congratulations.
Btw, I searched for other potential problems with the cite doi template.
  • You may want to have your bot add/modify the templates to highlight problems it finds. For instance, DOIs that contain special characters [3].
  • Also, I'm a bit concerned about broken DOIs; there's not a lot of redundancy in a DOI if DOI resolution fails. See, for instance, the reference to Monk et al. (1996) in [4].
Rl (talk) 08:29, 29 March 2009 (UTC)
Thanks (-:
It's a good idea for the bot to highlight links to broken/incorrectly copied DOIs - I'll think about the best way to implement this. Martin (Smith609 – Talk) 20:53, 29 March 2009 (UTC)

Cite arXiv to Cite journal conversion

Many {{cite arXiv}} could be converted to {{cite journal}}. For example, this page http://arxiv.org/abs/cond-mat/0605258v1 gives the journal in which this was published: Nature 442, 54-58 (2006). Why not change it to a cite journal (with |id={{arXiv|0605258v1}})? Headbomb {ταλκκοντριβς – WP Physics} 20:25, 28 March 2009 (UTC)

That sounds like a great idea. I'm not entirely familiar with the ArXiV system; could you give me a set of rules which I could turn into bot code? I would need to know the situations where converting to cite journal is appropriate, because presumably there are many scenarios where it isn't (or cite ArXiV wouldn't exist). Martin (Smith609 – Talk) 01:32, 29 March 2009 (UTC)

As far as I'm aware, when you can find the journal the article has been published in, it's appropriate to change {{cite arXiv}} to {{cite journal}}; otherwise they should remain {{cite arXiv}} (I'm not very familiar with {{citation}}, but I suspect there wouldn't be any problems beyond those mentioned for {{cite journal}}). As long as you adapt the parameters |eprint=foobar |class=barfoo |version=v123456 from {{cite arXiv}} into an |id={{arXiv|foobarv123456}} [barfoo] in {{cite journal}}, there should be no problem. |class= and |version= are optional parameters; there is no need to modify the logic used when |version= is not used, as {{arXiv|foobar}} will produce the correct output; however, if class is not present, the brackets should not be present (i.e., it should look like |id={{arXiv|foobarv123456}} rather than |id={{arXiv|foobarv123456}} []). All other fields from {{cite arXiv}} (i.e., author, title, etc...) overlap with those of {{cite journal}}.

For example {{cite arXiv |author=Tom Leinster |eprint=0707.0835 |class=math.CT |title=The Euler characteristic of a category as the sum of a divergent series |year=2007 }} produces the following:
Tom Leinster (2007). "The Euler characteristic of a category as the sum of a divergent series". arXiv:0707.0835 [math.CT].

This can be duplicated using cite journal with {{cite journal |author=Tom Leinster |id={{arXiv|0707.0835v1}} [math.CT] |title=The Euler characteristic of a category as the sum of a divergent series |year=2007 }} which produces the following (not incorporating other changes by DOI bot, such as add doi, journal, volume, page, etc...):
Tom Leinster (2007). "The Euler characteristic of a category as the sum of a divergent series". arXiv:0707.0835v1 [math.CT].
Headbomb {ταλκκοντριβς – WP Physics} 02:26, 29 March 2009 (UTC)
I would still recommend a trial period (50 or so edits) on random articles from Special:WhatLinksHere/Template:Cite_arXiv just to make sure that I didn't miss something, or that I'm not aware of some subtleties.Headbomb {ταλκκοντριβς – WP Physics} 02:35, 29 March 2009 (UTC)
Thanks for your suggestions - this should be easy enough to code. I've copied the discussion to Wikipedia:Bots/Requests_for_approval/Citation_bot_5 for formal approval. Martin (Smith609 – Talk) 20:57, 29 March 2009 (UTC)
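Note: the parameter mapping described in this thread could be sketched as below. The placeholder values foobar, barfoo and v123456 are the ones used in the discussion; the function names, the dictionary representation of a template and the rule of returning None when no journal is known are assumptions for illustration, not the code submitted for approval.

```python
def arxiv_id_field(eprint, arxiv_class=None, version=None):
    """Build the |id= value from the cite arXiv parameters."""
    value = "{{arXiv|" + eprint + (version or "") + "}}"
    if arxiv_class:
        # Brackets are added only when a class is present.
        value += " [" + arxiv_class + "]"
    return value


def cite_arxiv_to_cite_journal(params):
    """Return cite journal wikitext, or None if no journal is known."""
    if not params.get("journal"):
        return None  # leave the citation as cite arXiv
    dropped = ("eprint", "class", "version")
    converted = {k: v for k, v in params.items() if k not in dropped}
    converted["id"] = arxiv_id_field(
        params["eprint"], params.get("class"), params.get("version")
    )
    fields = " |".join(name + "=" + value for name, value in converted.items())
    return "{{cite journal |" + fields + " }}"


print(arxiv_id_field("foobar", "barfoo", "v123456"))
print(arxiv_id_field("foobar"))
```

Parsing the original wikitext into the params dictionary is left out; nested templates make that harder than a simple split on "|".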

Algorithm tweak.

I noticed that the bot misses dois (and the rest of what goes along with them) when inputs like |journal=Phys. Rev. |volume=B23 are used instead of |journal=Phys. Rev. B|volume=23. See the difference it makes [5] when you fix those [6].

IMO, the algorithm should be tweaked to search for the original input first Phys. Rev. B23. If it fails, then search for Phys. Rev. B 23. If it finds something with that new search, then change Phys. Rev. B23 to Phys. Rev. B 23, otherwise leave the journal/volume info alone. Headbomb {ταλκκοντριβς – WP Physics} 08:05, 2 April 2009 (UTC)

That's really useful feedback, thanks. I'll implement that at the weekend. Martin (Smith609 – Talk) 19:03, 2 April 2009 (UTC)
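Note: the proposed tweak (try the original input first, then retry with the series letter moved from the volume to the journal name, and only rewrite the citation if the retry succeeds) could look roughly like this. The lookup argument is a hypothetical stand-in for whatever database query the bot performs; its signature is an assumption.

```python
import re

# A volume such as "B23": one capital series letter followed by digits.
SERIES_VOLUME = re.compile(r"^([A-Z])(\d+)$")


def resolve_journal_volume(journal, volume, lookup):
    """Return (journal, volume, record); record is None if nothing is found."""
    # First attempt: search with the input exactly as the editor wrote it.
    record = lookup(journal, volume)
    if record:
        return journal, volume, record

    # Second attempt: move the series letter into the journal name.
    match = SERIES_VOLUME.match(volume)
    if match:
        new_journal = journal + " " + match.group(1)
        new_volume = match.group(2)
        record = lookup(new_journal, new_volume)
        if record:
            return new_journal, new_volume, record

    # Nothing found: leave the journal/volume info alone.
    return journal, volume, None
```

For example, with a lookup that only knows ("Phys. Rev. B", "23"), the input ("Phys. Rev.", "B23") comes back rewritten, while an unknown journal is returned unchanged.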

Commercial links

Can something be done about adding DOIs that link to commercial sites that sell the article? —Preceding unsigned comment added by 83.180.250.211 (talk) 17:24, 7 April 2009 (UTC)

When you say "can something be done" do you mean preventing such links from being made? There is no prohibition against linking to commercial sites. In WP:EL, under "Links normally to be avoided", see the big bold exception at the very start of the section: Except for a link to an official page of the article's subject. A DOI is the official page to a reference, usually a scholarly paper on the article's subject. Wikipedia is an encyclopedia of everything, not just an encyclopedia of free stuff on the web. —David Eppstein (talk) 18:09, 7 April 2009 (UTC)
If the article is also available for free (but the DOI resolves to a non-free site), specify the url parameter to {{citation}} (sometimes Citation Bot will even do this for you). One special case: if the article is on PubMed Central, specify pmc instead of url (and the template will take care of the rest). Again, Citation Bot will sometimes detect this (but I'm not sure the databases it queries always contain enough information to find a PubMed Central entry even if one exists). Kingdon (talk) 01:07, 8 April 2009 (UTC)
But please, (1) only link to a free copy if there is reason to believe that it is a legitimate free copy (e.g. it's on the author's web site or was placed by its author on a public preprint server); we shouldn't be linking to copyright violations; and (2) even if you link to a free copy, also include the DOI, because frequently the two versions will differ and the publisher's one will be the one with more of the corrections. (Also, including the DOI helps automated tools such as Citation bot match up citations to articles.) —David Eppstein (talk) 02:00, 8 April 2009 (UTC)

Some things

In {{cite arxiv}}, some people are using the |journal= parameter (often to indicate that a paper has been submitted/accepted by a journal). This is wrong, so Citation bot should comment that parameter out. Also, sometimes |journal=arxiv is used in various citation templates, when that's the case, and when possible, Citation bot should switch to a {{cite arxiv}} template. Also, the Bot should stop overriding human input for authors.Headbomb {ταλκκοντριβς – WP Physics} 05:17, 19 April 2009 (UTC)

Under the new bot request, if the journal parameter is not empty, the template will be changed to cite journal. If the journal parameter is left blank, it doesn't cause any harm.
You've mentioned the journal=arxiv tweak elsewhere, so I won't reply to it here.
The bot's tasks include 'fixing editor errors' and 'adding missing information'. The cases you refer to fall within these categories. Feel free to head to template talk:cite journal (or citation) if you want to establish consensus to prefer incomplete citations. Martin (Smith609 – Talk) 14:15, 19 April 2009 (UTC)


Seemingly contradictory behaviour

Hello. In Navier–Stokes equations‎ and Dispersion (water waves)‎ the bot changed "cite web" and "cite journal" into "citation" templates, while in Airy wave theory‎ the opposite was done: changing "citation" templates into "cite book" and "cite journal". -- Crowsnest (talk) 20:35, 20 May 2009 (UTC)

It uses whichever template family is prevalent on the page. Martin (Smith609 – Talk) 21:24, 20 May 2009 (UTC)
Thanks. Is there no preferred family? I most often use the "citation" template nowadays, since I only have to remember the fields and behaviour of one template. -- Crowsnest (talk) 21:56, 20 May 2009 (UTC)
There's no universal preference - the only difference in output is one of punctuation, but you'd be surprised how emotional some editors can get in defending their preferred format! Martin (Smith609 – Talk) 22:46, 20 May 2009 (UTC)
Same problem in Law enforcement in Westchester County, I am citing a webpage, NOT a book... RayYung (talk) 23:56, 1 June 2009 (UTC)
Looks like a false positive to me - see the bot page for instructions. Martin (Smith609 – Talk) 02:42, 3 June 2009 (UTC)
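Note: "whichever template family is prevalent on the page" could be decided by a simple count, as in the sketch below. The thread does not spell out the bot's actual rule, so the regular expressions, the function name and the tie handling (return None and change nothing) are all assumptions.

```python
import re

# "{{cite journal", "{{Cite book", "{{cite web", ...
CITE_X = re.compile(r"\{\{\s*cite\s+\w+", re.IGNORECASE)
# "{{citation |..." or "{{Citation}}"
CITATION = re.compile(r"\{\{\s*citation\s*[|}]", re.IGNORECASE)


def prevalent_family(wikitext):
    """Return "cite", "citation", or None when neither family dominates."""
    cite_count = len(CITE_X.findall(wikitext))
    citation_count = len(CITATION.findall(wikitext))
    if cite_count > citation_count:
        return "cite"
    if citation_count > cite_count:
        return "citation"
    return None
```

A count like this also shows how a page with a single {{citation}} and no other templates could end up converted on the strength of one unrelated {{cite ...}} call, which may be what happened in the false positive mentioned above.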


Converting citation to cite journal and cite book?

I notice (here) that the bot now converts {{citation}} to {{cite book}} and {{cite journal}}. What motivated this change?--Srleffler (talk) 04:54, 8 June 2009 (UTC)

See Wikipedia:Bots/Requests_for_approval/Citation_bot_4. Martin (Smith609 – Talk) 15:36, 9 June 2009 (UTC)
I have a similar question to Srleffler's. I visited the above Citation_bot_4 page, but it did not seem to specifically address why common citation references would be split into cite book and cite journal, especially when no book or journal is referenced. As the Citation_bot_4 project page is archived, I was unsure how to leave a question there and created a discussion page instead. As I have no idea whether anyone would ever see a new discussion page for an archived project, I will ask the same question here and apologize for the duplication. Recently, citation bot changed the Blue Valley Creamery Company article so that some citation references became cite book references and others became cite journal references. First, I thought the purpose of citation bot was to create consistent citations within an article. Why would it take one consistent citation type and change it to two different types? Second, neither cited reference is a book or a journal, so the bot actually created erroneous citations.--Rpclod (talk) 18:07, 7 January 2010 (UTC)
Use of the "Cite X" family of templates throughout an article is typically considered a consistent method of citation, because the entire family produces similar outputs, which are slightly different from the outputs of the {{Citation}} template. How the bot decides which one to choose, I couldn't say. --RL0919 (talk) 18:40, 7 January 2010 (UTC)

Should doi templates be protected?

An interesting issue was recently presented on Global warming

"careful using "Cite doi", it transcludes data from an unprotected page in addition to providing a edit button for anons, watching all those pages can be a pain"

Since vandalism is a serious problem with many pages, perhaps all doi templates should be protected from anonymous editing. The other alternative is to modify the "watch" logic so that if a template is used on a page I am watching, then I will automatically get a notification any time the template is changed. (Maybe that should be done anyway.) Q Science (talk) 00:00, 10 June 2009 (UTC)

There is additional discussion at Global Warming Talk Q Science (talk) 16:19, 10 June 2009 (UTC)

Expand and standardize templates in manual mode

In my gnoming work, I often have to expand refs and standardize how they look in the edit window just so I can see if there are missing parameters, inconsistencies in formatting, and so on. It would be incredibly useful if citation BOT could do the dirty work (on a per-request basis, not by default). For cite journal, it should place them as (* means if present):

cite journal
Blahblah blah blah blugh.<ref>
{{cite journal
 |author= *|authorlink=
---------
OR
 *|last= *|first= *|authorlink=
 *|last2= *|first2= *|authorlink2=
 *|last3= *|first3= *|authorlink3=
 *|last4= *|first4= *|authorlink4=
and so on
---------
 *|coauthors=
 |year= *|month= *|day=    (or |date=)
 |title=
 *|language=
 *|trans_title=
 *|url=
 *|format=
 |journal=
 *|series=
 |volume= |issue= |page=
 *|publisher=
 *|location=
 *|bibcode=
 *|doi=
 *|id=
 *|isbn=
 *|issn=
 *|pmid=
 *|pmd=
 *|oclc=
 *|accessdate=
 *|laysummary=
 *|laysource=
 *|laydate=
 *|quote=
----
Other parameters (if found) place here, one per line
}}</ref> Blahblahblah blug blah.
cite web
Blahblah blah blah blugh.<ref>
{{cite web
 |author= *|authorlink=
---------
OR
 *|last= *|first= *|authorlink=
 *|last2= *|first2= *|authorlink2=
 *|last3= *|first3= *|authorlink3=
 *|last4= *|first4= *|authorlink4=
and so on
---------
 *|coauthors=
 |date=         (or |year= *|month= *|day=)
 |title=
 *|language=
 *|trans_title=
 |url=
 *|format=
 *|page= (or |pages=)
 *|work=
 *|publisher=
 *|location=
 *|bibcode=
 *|doi=
 *|id=
 *|isbn=
 *|issn=
 *|oclc=
 *|pmd=
 *|pmid=
 |accessdate=
 *|archiveurl=
 *|quote=
----
Other parameters (if found) place here, one per line
}}</ref> Blahblahblah blug blah.
cite book
Blahblah blah blah blugh.<ref>
{{cite book
 |author= *|authorlink=
---------
OR
 *|last= *|first= *|authorlink=
 *|last2= *|first2= *|authorlink2=
 *|last3= *|first3= *|authorlink3=
 *|last4= *|first4= *|authorlink4=
and so on
---------
 *|coauthors=
 *|separator=
 *|lastauthoramp=
 |year= *|month= *|day=    (or |date=)
 *|origyear=
 *|editor= *|editor-link=
 *|editor2= *|editor2-link=
 *|editor3= *|editor3-link=
 *|editor4= *|editor4-link=
---------
OR
 *|editor-last= *|editor-first= *|editor-link=
 *|editor2-last= *|editor2-first= *|editor2-link=
 *|editor3-last= *|editor3-first= *|editor3-link=
 *|editor4-last= *|editor4-first= *|editor4-link=
and so on
---------
 *|chapter=
 *|trans_chapter=
 *|chapterurl=
 |title=
 *|language=
 *|trans_title=
 *|url=
 *|format=
 *|edition=
 *|series=
 *|volume= *|issue= |page= (or |pages=)
 *|nopp=
 |publisher=
 *|location=
 *|bibcode=
 *|doi=
 *|id=
 |isbn=
 *|issn=
 *|pmid=
 *|pmd=
 *|oclc=
 *|accessdate=
 *|laysummary=
 *|laysource=
 *|laydate=
 *|quote=
 *|ref=
 *|postscript=
----
Other parameters (if found) place here, one per line
}}</ref> Blahblahblah blug blah.

And similar for cite book, cite web, and so on... Headbomb {ταλκκοντριβς – WP Physics} 18:07, 29 June 2009 (UTC)

This should be quite easy, although I'd like to be sure that there is a consensus for this move before I do the coding. (I'm not sure whether this would class as a separate task, and thus require approval at WP:BRFA.) Could you provide the desired order for the other templates, too? Thanks, Martin (Smith609 – Talk) 14:29, 13 August 2009 (UTC)
Well that's why I said in manual mode (there would have to be a checkbox which is off by default and a warning that this should be only used when subsequent work on citations is intended). I'll provide the desired order for others citation templates (tell me if I missed any). Headbomb {ταλκκοντριβς – WP Physics} 14:44, 13 August 2009 (UTC)
I'd suggest that the sequence used should agree with the sequence in the respective /doc so that, when a new blank instance is copy/pasted from the /doc to an article's wikitext it remains the same. LeadSongDog come howl 15:35, 13 August 2009 (UTC)
It would be nice, but the documentation does not list parameters in the most logical order. I'm aligning parameters with presentation as best I can, while the documentation usually presents them in order of importance. Headbomb {ταλκκοντριβς – WP Physics} 15:39, 13 August 2009 (UTC)
I'm trialling this in the Cite Doi namespace - keep your eyes out for errors there; if it succeeds I'll investigate enabling this for either manual or automatic mode. Martin (Smith609 – Talk) 23:19, 5 September 2009 (UTC)
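Note: the requested "expand and standardize" step amounts to re-emitting a template with its parameters in a fixed order, one per line, with unrecognised parameters kept at the end. The sketch below assumes the template has already been parsed into a dictionary; the order list is abridged from the cite journal layout proposed above (numbered author fields and the lay* fields are left out), and the function name is hypothetical.

```python
# Abridged from the cite journal layout proposed in this thread.
CITE_JOURNAL_ORDER = [
    "author", "authorlink", "coauthors", "year", "month", "day", "date",
    "title", "language", "trans_title", "url", "format", "journal",
    "series", "volume", "issue", "page", "publisher", "location",
    "bibcode", "doi", "id", "isbn", "issn", "pmid", "oclc",
    "accessdate", "quote",
]


def standardize(template_name, params, order):
    """Return wikitext with known parameters in order, then the rest."""
    known = [name for name in order if name in params]
    # Parameters not in the list keep their original relative order.
    other = [name for name in params if name not in order]
    lines = ["{{" + template_name]
    lines += [" |" + name + "=" + params[name] for name in known + other]
    lines.append("}}")
    return "\n".join(lines)


example = {"doi": "D", "title": "T", "custom": "C", "author": "A"}
print(standardize("cite journal", example, CITE_JOURNAL_ORDER))
```

Keeping unknown parameters instead of dropping them means the rewrite never loses information an editor supplied.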

Journal parameters cleanup

You can look through (Wikipedia:WikiProject_Academic_Journals/Journals_cited_by_Wikipedia alphabetical) and see patterns. For example, many journal parameters start with a ' for no reason, others are italicized twice (templates place entries in italics automatically, no need to tell it twice), and so on (AWB typo team also contacted). Headbomb {ταλκκοντριβς – WP Physics} 01:17, 30 June 2009 (UTC)

AWB typo team appear to have fixed the bulk of them already... I think I'll leave it in their capable hands. Martin (Smith609 – Talk) 23:17, 5 September 2009 (UTC)

Feature request

I love this bot. I wonder how hard it would be to fix the dashes in page ranges of the {{Harv}}, {{Harvnb}} and {{Harvtxt}} templates? In other words, fix Smith 2009, pp. 5-6 so it becomes Smith 2009, pp. 5–6? I'm guessing this is a pretty easy enhancement, since it already does this for the citation templates. ---- CharlesGillingham (talk) 10:36, 8 August 2009 (UTC)
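Note: a rough sketch of the requested fix, replacing a hyphen with an en dash in the pp= parameter of {{Harv}}, {{Harvnb}} and {{Harvtxt}} calls. The regular expressions and the function name are assumptions; only simple numeric ranges inside templates without nested braces are handled.

```python
import re

# A Harv-family template call with no nested templates inside it.
HARV = re.compile(r"\{\{\s*(?:Harv|Harvnb|Harvtxt)\s*\|[^{}]*\}\}", re.IGNORECASE)
# A numeric page range written with a hyphen, e.g. "|pp=5-6".
PAGE_RANGE = re.compile(r"(\|\s*pp\s*=\s*\d+)\s*-\s*(\d+)")


def fix_harv_dashes(wikitext):
    """Use an en dash in page ranges inside Harv-family templates only."""
    def fix_one(match):
        return PAGE_RANGE.sub("\\1\u2013\\2", match.group(0))
    return HARV.sub(fix_one, wikitext)


print(fix_harv_dashes("{{Harvnb|Smith|2009|pp=5-6}}"))
```

Restricting the substitution to text matched by HARV leaves hyphens elsewhere on the page untouched.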


Administrators' discussion re bot

A request to stop this bot was recently made (not by me; but I've chimed in) and there's currently a thread at Wikipedia:Administrators' noticeboard/Incidents #Citation bot. Eubulides (talk) 13:42, 12 September 2009 (UTC)

Note

Just a note; after an extensive bug-fixing session, I am now satisfied that the bot is safe to leave running unattended. I've kept an eye on it for a while, and now intend to leave it running overnight. If there are any systematic and major bugs that require the bot to stop, blocking User:Citation bot 1 will prevent the unsupervised running - without inconveniencing other users. Minor bugs should still be reported so that the bot can fix them on its next run.

Thanks,

Martin (Smith609 – Talk) 11:34, 6 October 2009 (UTC)

Is it still overriding author input or has that been fixed? Headbomb {ταλκκοντριβς – WP Physics} 14:35, 6 October 2009 (UTC)
It should only edit citations to add missing data and bring them in line with the documentation. Martin (Smith609 – Talk) 22:33, 6 October 2009 (UTC)
This is not such an edit: [7]. This is overriding a human-made decision on a style issue. Headbomb {ταλκκοντριβς – WP Physics} 22:48, 6 October 2009 (UTC)
Ah, the bot uses a different protocol for the Cite ArXiv template, which I hadn't tested - I will amend this imminently. Martin (Smith609 – Talk) 00:15, 7 October 2009 (UTC)
Well if that's fixed, then unleash the bot away. Headbomb {ταλκκοντριβς – WP Physics} 00:18, 7 October 2009 (UTC)

Blocked

I think there has been a clear request for the bot not to reset the list of authors to the multiple-parameter style if "author=" has already been defined - see User:Citation_bot/bugs#Does_the_bot_have_consensus_to_operate ? and the subsequent section on the example at Autism. As these requests have not been heeded, the bot, which if I understand correctly is munching its way through articles automatically, will be seen to be disruptive, at least to the medical topics where there has been a definite preference for Diberri-generated references (temporarily off-line but hopefully will eventually be back in action). Now the bot is trying to do good work and add citation details, but on the issue of author parameters it clearly needs a short pause for thought... hence why I've turned it off (for now). If there has been some past great discussion on this and an alternative consensus, then fine, and any admin is free (just act, no need to fear wheel-warring) to unblock and turn it back on (or just ask me). David Ruben Talk 00:45, 9 October 2009 (UTC)

Is the bot completely blocked? I ran it manually on Zedoary. The toolpage produced
However, it did not create the template Template:Cite_doi/10.1016.2Fj.ab.2005.03.037, nor updated the page. BTW, if the bot is disabled for a long period, then the {{cite}} template should be updated to remove This citation will be automatically completed in the next few minutes. CS Miller (talk) 17:57, 10 October 2009 (UTC)
David: While it might be a bit onerous, can we not just edit the templates the bot creates after it's done to reflect whatever style has been agreed upon? Or is it overriding even those? It seems to me that that's the better choice here, rather than having the bot not running at all (which people probably won't clue into quickly, since lags are not uncommon). —RobinHood70 (talkcontribs) 03:03, 11 October 2009 (UTC)
Sounds long winded approach, and I wonder whether style would be maintained if a reasonable number of references within an article. More important is whether the bot also free-runs as well as user single requests. As I understand (per wording in its own description and comments of others) the bot free runs, and no way I know to selectively pause a bot from free-running rather than on demand. The request, for name parameters not to be adjusted where "author=" already defined, sounds like a simple (I hope) programming task to be considered, but seems bot's maintainer User:Smith609 currently on a wikibreak until middle of October, so this matter should get sorted in a few days time :-) David Ruben Talk 03:52, 11 October 2009 (UTC)
I hate to be the bearer of bad news, but an instance of template:cite doi, such as template:cite doi/10.1016.2Fj.ab.2005.03.037, will only produce one format of output on all the pages upon which it is transcluded. It is incapable of being adapted to the local style choices on an article. See Template:Cite doi/doc#Formatting for details. Effectively, articles using it must adopt its format choices for all citations. Editing its output for format breaks even that possibility of consistent format in the transcluding articles. LeadSongDog come howl 06:34, 11 October 2009 (UTC)
Oh I agree, in the absence of a MediaWiki feature of local variables that could be defined on a per-article basis (as has been part of the debate on date-style consistency), a solitary {{cite doi|10.1146/annurev.earth.33.092203.122621}} can only be expanded to one fixed style. The issue that caused the concern of the group of editors was where the bot runs in an article with the references already providing most of the citation details (i.e. either no modification will result or perhaps just items of 'pmc' & 'doi' or a missing 'month' added) - then if the "author=" is already defined as the chosen approach for a specific reference, it should be left alone. The lack of an article-style variable also of course prevents the use of bots to spell check articles for consistent AmE vs BrE. Anyway, the pausing allows people to consider the various implications of possible approaches. David Ruben Talk 12:37, 11 October 2009 (UTC)
Ok, I'll keep an eye on the page, pending Smith609 returning. CS Miller (talk) 08:54, 11 October 2009 (UTC)
David, Citation Bot is really very helpful for the articles I work on. Is there some kind of compromise here allowing for it to be ran manually (e.g. by Request on an article, but not automatically)? -- Scarpy (talk) 18:03, 11 October 2009 (UTC)
The bot is also very useful for the articles I work on. We all want it to get working again. Unfortunately the bot can be disruptive even when it is manually triggered, because editors may not have the time or the expertise to realize that it is imposing an undesired style change. One possible way to fix the problem is to give manual invokers the option to change the |author= style, with the default to leave the style alone. However, since the bot's maintainer is offline for a few days, we'll need to wait to see whether this way is feasible. Eubulides (talk) 21:36, 11 October 2009 (UTC)
I really don't get why someone would throw a hissy fit over a style change in the citation, this whole banning seems to violate wp:point--UltraMagnusspeak 09:33, 12 October 2009 (UTC)
I have to say, I agree. It is difficult for me to understand why we need to have subtly different citations in different articles. For any other publication, this would not be an issue. ---- CharlesGillingham (talk) 16:47, 12 October 2009 (UTC)
Firstly, it is a block and not a ban. It will be lifted when/if the issues are fixed, I'm sure. Secondly, medical articles often have multiple authors, often going into seven or eight coauthors. If every reference were to write these out in full, it'd take up a silly amount of physical space on the article page. It's also contradictory to how WP:MED reference things, it goes against our guidelines and consensus. How is David violating WP:POINT? He's not disrupting it to make a point, he's blocking a bot which several good-standing editors have noted is causing unintentional disruption. Until it's fixed, we will continue to have a problem with it. Sorry if this seems bitey. Regards, --—Cyclonenim | Chat  20:14, 12 October 2009 (UTC)
Banning an extremely useful bot, that is exclusion compliant, that is only triggered by a user, just because a few editors don't like what it did to a handful of articles, seems extremely wp:pointy to me. --UltraMagnusspeak 21:16, 12 October 2009 (UTC)
Blocking. They're entirely separate things, see WP:BLOCK and WP:BAN. And I disagree that it's only triggered by a user and that we're a small handful of users. It's been an issue since it ran for approval, see here for what Headbomb (talk · contribs) said. The bot has been changing "et al." in references, even when its actions will create a huge list of authors. It can't be POINTY, either, if David Ruben isn't involved in the discussion (which he wasn't before the block was made). He was uninvolved, and therefore not disrupting Wikipedia to prove his point. He took action based on the evidence presented here. Sorry that this isn't very coherent, few beers! Best Regards, --—Cyclonenim | Chat  21:32, 12 October 2009 (UTC)
Also, you'll notice at WP:BOTPOL that bots are only authorized to make edits which there is consensus for it to do. It doesn't seem to me like there's much consensus here about changing the style of citations. In fact, if you look on the talk page, there are multiple complaints. Regards, --—Cyclonenim | Chat  21:34, 12 October 2009 (UTC)

Why is |author=Chen CC, Zimmer A, Sun WH, Hall J, Brownstein MJ, Zimmer A better than |first1=Chih-Cheng |last1=Chen |first2=Anne |last2=Zimmer |first3=Wei-Hsin |last3=Sun |first4=Jennifer |last4=Hall |first5=Michael J. |last5=Brownstein |first6=Andreas |last6=Zimmer for doi:10.1073/pnas.122245999 / PMID 12060708?? This supposed marvelous "Diberri" format doesn't get the first or third names right. And what's the benefit of hiding first names if they're known? And why block a useful bot over a preference for a broken formatting tool? Just block the bot on the specific articles you don't want to be edited.  —Chris Capoccia TC 09:20, 14 October 2009 (UTC)

Diberri's tool has issues, like any other automated tool such as Citation bot. Don't go by Diberri's current performance, it's broken so any results you get from it are more likely to be full of errors. Under usual conditions, the tool is entirely accurate. Secondly, you can't just block the bot on certain articles. Blocking ceases activity everywhere, you can't specify which articles can be edited and which can't. The only person with that influence is the bot's developer, who is unavailable. Until he's available, the options available were limited to either leave things as they were (which, in my eyes, were disruptive to our project as they were increasing the variation in citation styles, which is messy and confusing) or to temporarily block the bot. I suspect David saw things the same way, and chose the latter option, but I won't speak for him. Regards, --—Cyclonenim | Chat  15:33, 14 October 2009 (UTC)
incorrect, as it states on the citation bot page, all you need to do is add {{bots|deny=Citation bot}} at the top of an article and it will effectively block citation bot from editing it.--UltraMagnusspeak 15:39, 14 October 2009 (UTC)
I had missed that, but even so that method is entirely impractical. Do you suggest we go and place that tag on every single page under WP:MED's listings? Regards, --—Cyclonenim | Chat  16:05, 14 October 2009 (UTC)
sounds like a good task for a bot ;)   —Chris Capoccia TC 18:20, 14 October 2009 (UTC)
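For illustration, the per-article opt-out mentioned above can be sketched as a check a bot runs before editing. This is a simplified Python sketch, not Citation bot's actual code: it only looks at {{nobots}} and the deny= form of {{bots}} (a comma-separated list, or "all"), and the function name is hypothetical.

```python
import re

# {{bots|deny=Name1, Name2}}: captures the comma-separated deny list.
BOTS_DENY = re.compile(r"\{\{\s*bots\s*\|\s*deny\s*=\s*([^{}|]*)\}\}", re.IGNORECASE)
# {{nobots}}: opts the page out of all compliant bots.
NOBOTS = re.compile(r"\{\{\s*nobots\s*\}\}", re.IGNORECASE)


def bot_may_edit(wikitext: str, bot_name: str) -> bool:
    """Return False if the page opts out of edits by bot_name."""
    if NOBOTS.search(wikitext):
        return False
    for match in BOTS_DENY.finditer(wikitext):
        denied = {name.strip().lower() for name in match.group(1).split(",")}
        if "all" in denied or bot_name.lower() in denied:
            return False
    return True
```

An exclusion-compliant bot would skip any page for which this check returns False, which is what makes the template a page-level alternative to an account-wide block.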

By the way, blocking the bot doesn't mean you can't use its results. Simply don't check the "commit edits" box. The bot will then give you the text of the resulting page. View the source of that text (use control-U in Firefox), copy whichever part of that text you like, and paste it into an edit window, and commit the results yourself. It's more awkward to do this by hand, of course, but it's a reasonable temporary workaround until the bot's maintainer comes back online. Eubulides (talk) 17:09, 14 October 2009 (UTC)

i did that for one article, and it's a real pain in the neck! the bot removes all the ref tags. you have to carefully copy and paste each ref and try not to mix them up when there are a lot close together.  —Chris Capoccia TC 18:20, 14 October 2009 (UTC)
It sounds like you did something different from what I suggested, and cut from the text page that the bot gives you. That doesn't work (as you discovered). Instead, please cut from the HTML source of that text page. With Firefox, you can type Control-U on the text page to see the HTML source code, and cut from that. I've done it; it works. (Of course the bot should be changed so that one can cut from the text page directly without having to type Control-U, but that's a different problem....) Eubulides (talk) 23:23, 14 October 2009 (UTC)
I agree. Cyclonenim, you are making some good points, but the "if/when" language seems fatalistic for what is a small and obscure detail. Looking at this from a bigger picture perspective the bot improves many more articles than it might have a negative stylistic impact on. Ultimately, it makes citations easier to verify, a purpose more central to Wikipedia's operation than minor cosmetic issues. -- Scarpy (talk) 21:58, 14 October 2009 (UTC)
I do understand your point, but I don't see the harm in killing two birds with one stone. We have issues with how the bot is working on medical articles and, primarily, Rome wasn't built in a day. Where is the rush? The operator will only be gone for another couple of weeks, at which point the problem can be fixed permanently and everyone can continue with their work. A slight delay seems a lot more sensible than allowing things to continue, only to have to backtrack on ourselves again each time the bot edits a medical page; it's absurd. Ultimately, I don't think this block will be lifted until the issues are resolved, as Wikipedia relies heavily upon consensus and bots are only allowed to make edits for which there is no controversy, no issues with consensus. Just a brief look at this thread will show you that there is very little consensus on this matter, and therefore the bot should not be making these edits until the issue is resolved. I suppose that is, though, up to the administrator who did the block. David is perfectly capable of making his own decisions. Regards, --—Cyclonenim | Chat  23:16, 14 October 2009 (UTC)

Spelling out journal names in full

This has nothing to do with author name formatting, but: can I ask you to please spell out journal names such as "Eur Arch Paediatr Dent"? It may be the standard for technical writing within the field of Paediatr Dent to not spell things out, but this convention creates an unnecessary barrier to entry for anyone else not already familiar with the standards and titles of the field. We're writing here for all Wikipedia users, not for Paediatr Dents. —David Eppstein (talk) 23:54, 15 October 2009 (UTC)

I'd be happy to try - but first you would have to establish consensus that this should be done in all articles. From my experience with similar issues, I would predict that some people will strongly defend their decision to abbreviate journals in the articles that they edit. As the authors issue above shows, even if you don't change what the end user sees, people will still go to great lengths to defend their personal preferences. If you can show that this isn't the case here, then I'll gladly go ahead. Martin (Smith609 – Talk) 01:07, 17 October 2009 (UTC)
Well WP:CITE recommends writing the journals in full. I doubt the bot will have consensus on this, but I'm for it. At the very least, the bot could add the full journal when it's missing, and not touch the abbreviated ones. Headbomb {ταλκκοντριβς – WP Physics} 01:36, 17 October 2009 (UTC)
Yes, please do write the journals out in full, let's not use the esoteric abbreviations that would be meaningful typically only to scholars who publish in the particular field of the journal. N2e (talk) 13:06, 28 October 2009 (UTC)
Actually, my request was addressed to Eubulides (talk · contribs), as it was placed directly under his comment before someone moved it. Although I wouldn't object to the bot performing this replacement, I can see that it might be controversial. —David Eppstein (talk) 03:12, 17 October 2009 (UTC)
I think we're all in agreement that it would be controversial! The topic of journal abbreviations should be thrashed out on some other talk page, though, I expect. Eubulides (talk) 06:00, 17 October 2009 (UTC)
For "should be" substitute "has been": e.g. see Template talk:Citation#Italy vs. Rome. —David Eppstein (talk) 06:51, 17 October 2009 (UTC)

Template uniformity

Given this discussion (a change a month ago broke harv formatting for {{cite journal}} style citations), perhaps it would be a bad idea to switch from citation to cite journal, at least for articles that use harv, harvtxt, harvnb, harvs, etc. —David Eppstein (talk) 06:43, 25 October 2009 (UTC)

A list of cleared ISBN?

re [8]. The bot added "|ISBN-status= May be invalid - please double check" to ISBN 9986-9216-9-4. I presume that's because it could not find it in WorldCat or some similar catalog. However, it is valid as shown at National Library of Lithuania. So is there a way to prevent the bot from flagging the same ISBN in other articles? This particular website is used very often in Lithuania-related articles. Thanks, Renata (talk) 13:15, 5 November 2009 (UTC)

New tool

Visitors to this page may be interested in a new inline reference-adding tool, Ref++. Martin (Smith609 – Talk) 00:26, 12 November 2009 (UTC)

Current revision

How can I tell what the current revision of the bot is that will run when I invoke it manually (to know whether a recent bug fix is in or not)? Thanks Rjwilmsi 08:40, 15 November 2009 (UTC)

I'll add this information to the start page anon (follow http://code.google.com/p/citation-bot/issues/detail?id=26 for updates); meanwhile all you can do is start the bot running and 'STOP' pageload (press esc or click stop) before it commits an edit if it is an unfavourable version. Or you can check the contributions and note the number of a recent edit. Although I've just poked around and realised that the correct version number is only indicated if a specific file is edited in that change, which hasn't happened in the change from 68 -> 70. I'll have to work out a workaround for that... Martin (Smith609 – Talk) 15:50, 17 November 2009 (UTC)


Title Case

Minor point: an author's name, if it has no punctuation or spaces, should probably be in Title Case. diff User A1 (talk) 12:21, 5 January 2010 (UTC)

There is no way to distinguish initials from short names. Therefore user input will be respected. Martin (Smith609 – Talk) 19:20, 20 February 2010 (UTC)

Fixes

Hi all, apologies for the long delay in getting back to your requests. Sometimes life gets in the way!! Anyhow, I have now fixed what I think is the contentious issue – the bot will now (r74) only unify citation types if "harv" templates are absent. It checks the page code for the string "{{harv" (case insensitive) and, if it finds it, it leaves the citation types untouched. I've fixed most of the other outstanding bugs too, and don't believe that there is anything left that warrants the bot being blocked. I've requested the unblock, but feel free to block the bot again if necessary. Martin (Smith609 – Talk) 16:15, 20 February 2010 (UTC)
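The check described above amounts to a case-insensitive substring test. A minimal Python sketch follows (the bot itself is not written in Python, and the function name is hypothetical):

```python
def may_unify_citation_types(page_wikitext: str) -> bool:
    """Allow citation-type unification only when no harv-style template is present.

    Looks for the literal string "{{harv" anywhere in the page source,
    ignoring case, as described for r74.
    """
    return "{{harv" not in page_wikitext.lower()
```

Because the test is a plain prefix match, it also covers {{Harvnb}}, {{Harvtxt}} and {{Harvs}}, at the cost of skipping unification on any page that merely mentions such a template.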

{{unblock|Bugs that caused block to be requested have been fixed (see above)}}

 

Your request to be unblocked has been granted for the following reason(s):

Unblocked

Request handled by: NW (Talk)

bot out of commission again?

the bot doesn't seem to be working? has it been blocked? or is there a problem to fix?  —Chris Capoccia TC 19:57, 22 February 2010 (UTC)

It's not blocked, but it seems to stall after announcing "Activated by Bloggins". A new bug perhaps? User:LeadSongDog come howl 21:39, 22 February 2010 (UTC)
Seems to coincide in time with R80 on the codebase. User:LeadSongDog come howl 23:00, 22 February 2010 (UTC)
I've raised an issue (#31) for this on the bot's Google code page. Rjwilmsi 16:57, 23 February 2010 (UTC)
Now resolved, bot is working again. Rjwilmsi 11:58, 24 February 2010 (UTC)

coauthor field

There's been a recent discussion at Template_talk:Citation#Coauthors touching on the fact that because the {{citation}} and {{cite x}} family don't use the coauthor in generation of the CITEREF anchor, coauthor names should not be used by {{harv}} etc. I've suggested another possible task for this bot that could find a lot of broken linkages, if you wouldn't mind having a look. User:LeadSongDog come howl 18:51, 25 February 2010 (UTC)

Bot adds author names to citations which already have author names, resulting in duplicated names! See: http://en.wikipedia.org/w/index.php?title=Fungus&diff=prev&oldid=346406814 and the following edit, which corrected the bot. Fconaway (talk) 03:17, 26 February 2010 (UTC)
The input to the bot had ".<ref>{{cite journal |last=Steenkamp ET, Wright J, Baldauf SL.|year=2006 |title=The protistan origins of animals and fungi |journal=Molecular Biology and Evolution |volume=23 |issue=1 |pages=93–106|url=http://mbe.oxfordjournals.org/cgi/content/full/23/1/93|doi=10.1093/molbev/msj011 |pmid=16151185}}</ref>" which was replaced by "<ref>{{cite journal |last=Steenkamp ET, Wright J, Baldauf SL.|year=2006 |title=The protistan origins of animals and fungi |journal=Molecular Biology and Evolution |volume=23 |issue=1 |pages=93–106|url=http://mbe.oxfordjournals.org/cgi/content/full/23/1/93|doi=10.1093/molbev/msj011 |pmid=16151185 |first1=ET |last2=Wright |first2=J |last3=Baldauf |first3=SL}}</ref>"
What the bot couldn't correct was apparently "|last=Steenkamp ET, Wright J, Baldauf SL.|" which is a fairly predictable typo in manually created input. The data in pubmed for this citation appears to be correct. User:LeadSongDog come howl 14:32, 26 February 2010 (UTC)
I agree, this wasn't a bad change. It improved the citation, but not enough. The correct response to the improvement should have been to fix the incorrect last= parameter that the bot didn't change, to last=Steenkamp, rather than reverting the whole thing. —David Eppstein (talk) 15:47, 26 February 2010 (UTC)
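One way a tool could spot this kind of input is to test whether a last= value looks like a whole author list. The Python sketch below is an assumption-laden illustration, not the bot's behaviour: it only recognises the "Surname INITIALS, Surname INITIALS" pattern seen in the example, and the function name is hypothetical.

```python
import re

# "Surname INITIALS": one to four capital letters at the end, e.g. "Steenkamp ET".
AUTHOR_PIECE = re.compile(r"^(?P<last>.+?)\s+(?P<initials>[A-Z]{1,4})$")


def split_crammed_authors(last_value: str):
    """Return [(surname, initials), ...] if last_value holds several authors, else None."""
    pieces = [p.strip() for p in last_value.strip().rstrip(".").split(",")]
    pieces = [p for p in pieces if p]
    if len(pieces) < 2:
        return None  # a single name: nothing to split
    authors = []
    for piece in pieces:
        match = AUTHOR_PIECE.match(piece)
        if match is None:
            return None  # unfamiliar shape: leave the value for a human
        authors.append((match.group("last"), match.group("initials")))
    return authors
```

For the value from the Fungus citation this returns three (surname, initials) pairs, which would let a tool rewrite last= to the first surname instead of leaving the list in place next to the new last2/last3 parameters.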

DOI broken?

Hello,

Your bot claimed that a reference on Transmission electron microscopy has a broken DOI diff. The DOI in question is 10.1017/S143192760708124X, which I double-checked at the journal's site, here. User A1 (talk) 09:13, 20 March 2010 (UTC)

The DOI is not currently recognised by CrossRef: http://dx.doi.org/10.1017/S143192760708124X
You could contact the publisher or CrossRef to inform them of this problem.
Martin (Smith609 – Talk) 12:40, 29 March 2010 (UTC)

Blocked temporarily

I have blocked the bot temporarily, until the bot operator confirms that everything is well. Any other admin should also feel free to unblock. — Carl 21:59, 11 January 2010 (UTC)

I've had negligible Wiki time for a while; could someone please outline the steps that require implementation for the bot to be unblocked, so that I can put them into practice ASAP? Thanks, Martin (Smith609 – Talk) 23:13, 11 January 2010 (UTC)
There's no single person in charge here, but by far the most pressing problem is the one noted in User:Citation bot/bugs #"citation" does not mean "cite news", #Breaking harv anchors to citations, #Citation type unification breaks harv templates, #"citation" does not mean "cite news". I think these are all the same bug. Also, the last 10 bugs in User:Citation bot/bugs have not been responded to; I doubt whether they all need to be fixed but it has been nearly two months without a response.... Eubulides (talk) 00:27, 12 January 2010 (UTC)
It might also be helpful to have a check to prevent substrings "ed.", "edit", "tr.", "trans", "ill.", "illus" from being automatically inserted in an authorn or lastn field. While it isn't really a bot bug, these fairly common problems in the cataloguing data should somehow or other trigger manual intervention. LeadSongDog come howl 17:54, 12 January 2010 (UTC)
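The suggested check can be sketched as a plain substring test over a candidate author value. This Python sketch uses the six substrings listed above; the function name is hypothetical, and the match is deliberately crude, so it will also flag legitimate names such as "Edith".

```python
# Substrings that suggest an editor, translator or illustrator credit
# has leaked into an author field (list taken from the comment above).
ROLE_MARKERS = ("ed.", "edit", "tr.", "trans", "ill.", "illus")


def needs_manual_review(author_value: str) -> bool:
    """Return True if the value contains any role marker, ignoring case."""
    lowered = author_value.lower()
    return any(marker in lowered for marker in ROLE_MARKERS)
```

Since the proposal is to trigger manual intervention rather than to reject the value outright, false positives only cost a reviewer a moment of attention.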

How are things going in moving towards unblocking this bot again? It's extremely helpful even with these minor bugs. I'd much rather be able to run the bot on a page and fix a few minor things than have to manually check everything about all the citations. It's also very helpful for automatically finding DOIs that diberri's tool misses.  —Chris Capoccia TC 08:02, 15 January 2010 (UTC)

Yes, the tool is very helpful, but unfortunately the pressing bug mentioned above is not that minor, as it breaks references (the references no longer work). The bot's maintainer is active elsewhere on Wikipedia (last edited within the past 24 hours), so we can hope for progress. Perhaps at some point somebody could ping him on his talk page. Eubulides (talk) 08:15, 15 January 2010 (UTC)
Given that the maintainer is busy in RL, we might expedite matters by consolidating the list of problems in one place for him. Are there any elsewhere that are not already identified at /bugs? LeadSongDog come howl 13:43, 15 January 2010 (UTC)
I've been trying to keep User:Citation bot/bugs up-to-date with respect to the bugs I've observed or seen reported. The current problems are the last dozen bugs in that list. I may well have missed some. Eubulides (talk) 17:31, 15 January 2010 (UTC)

A plea for Citation Bot

I can't believe this INCREDIBLY USEFUL bot has been blocked AGAIN over extraordinarily esoteric and petty citation style details. Why don't we eliminate every useful tool on Wikipedia while we're at it because it happens to be less than perfect? If you guys can't provide a work-around, could you please consider the editors that this bot is working very well for before making knee-jerk decisions like this in the future? -- Scarpy (talk) 14:14, 22 January 2010 (UTC)

If it's breaking citations (so that internal links no longer work) that doesn't sound all that petty. I think we all share your opinion of the bot's value when it's working. The bot maintainer was pinged a couple of weeks ago, but seems to be fairly busy. We can't really have an unmaintained bot running around: someone needs to take responsibility for it. I hope the maintainer can free up some time soon. Or perhaps someone else can step up as a comaintainer. Any volunteers? Eubulides (talk) 17:14, 22 January 2010 (UTC)
is the bot really running around on its own? my understanding is that citation bot is user-initiated.  —Chris Capoccia TC 17:20, 22 January 2010 (UTC)
It can be run in both modes. But it's a problem even in user-initiated mode, as the bugs are not easy for users to check. When the bot breaks a wikilink to a citation, a user cannot be expected to check for the bug, since the user is (naturally) focusing on the citation that changed, and won't know to check every faraway wikilink to the citation. Also, in practice many users run the bot without checking the results carefully, and thus break citations due to bugs other than broken wikilinks. Of course ultimately these changes are the users' responsibility, but when the problem happens often enough, the bot itself is the problem. A bot like this is a sharp tool, and should be kept in good repair, and should limit itself to edits that have consensus. Eubulides (talk) 17:35, 22 January 2010 (UTC)
I think it's appropriate that the bot is temporarily disabled until the bugs reported can be addressed, since the bot is inadvertently introducing errors into articles. However, we mustn't forget that this bot is able to make significant improvements to articles, therefore it would be very good if another user could temporarily assist in maintenance of the bot to resolve the bugs, so that the bot may continue. Perhaps in the short term certain functions performed by the bot can be disabled, so that the bugs are resolved and the remaining functionality can continue to be used? Rjwilmsi 18:32, 22 January 2010 (UTC)
Yes, that would make sense. Also, to try to help out, I've created a how-to for using the bot even when it's blocked. It's a bit of a pain, but it's better than nothing. Please see User:Citation bot/use #If blocked. Eubulides (talk) 19:50, 22 January 2010 (UTC)

Is there a possibility to run the bot at least manually - it would be still tremendously useful until it can be reenabled in full glory. I have added a bunch of references to Hysterectomy in the hope the bot would expand them for me and I could sure check the output. Richiez (talk) 13:58, 10 February 2010 (UTC)

See Eubulides' note above. The how-to is at User:Citation bot/use #If blocked. Tedious, but it works. --ἀνυπόδητος (talk) 15:07, 10 February 2010 (UTC)
Thanks for the pointer - that is really more tedious than I realised. I will check if my ancient programs to expand pmids intelligently still work.. would appear a lot easier than the aforementioned procedure.
As an opinion from an experienced programmer, the main problem of this bot appears to be that it does an incredible, nearly innumerable number of tasks. If there is only one problem with any of these many tasks, the whole bot still gets blocked - despite it being well known that more than 90% of the functionality is working perfectly. Given the nature of many of the tasks - fixing broken things can never be automated 100% reliably - such breakage can be expected to recur frequently. So how about partitioning the bot into several smaller bots that could be blocked separately? Richiez (talk) 21:24, 10 February 2010 (UTC)
Are your ancient programs by any chance available for the public, or do they only run on your computer? --ἀνυπόδητος (talk) 20:57, 11 February 2010 (UTC)
See my user page. The programs are very limited in scope - all they do is expand pmid numbers to full citations (harvard or footnote style). At this simple task they offer some extras over the citation bot. Richiez (talk) 21:15, 12 February 2010 (UTC)
Thanks! ἀνυπόδητος (talk) 09:41, 13 February 2010 (UTC)
Let me know if you need some help, it might be slightly bitrotten over time but usually fairly easy to fix. Richiez (talk) 21:55, 14 February 2010 (UTC)
I don't have much time at the moment anyway, but I will probably come back to your offer. Cheers, ἀνυπόδητος (talk) 09:28, 15 February 2010 (UTC)
  • Re PMIDs, are you familiar with the Cite PMID template, that negates the need for any manual expansion of PMIDs?
Thanks, I am using that heavily :) AFAICS the functionality is actually implemented by this Citation_bot?
My somewhat dusty programs did slightly more than Cite PMID does for now, so if you want some inspiration here it is:
  • autogenerate REF tag including name parameter as AuthorLastName_pubdate or AuthorLastName_pubdate_pmid for the unlikely case that the former is not unique within the article
  • multiple pmid templates are automatically pointed to same reference (via ref name=) for same ids
  • syntax was just {{pmid XXXX}} and {{isbn XXXXX}} without the REF tags which were autogenerated.
My programs did expand a locally edited text file, so I have no idea if that is realistic for a bot. Richiez (talk) 12:31, 5 April 2010 (UTC)
  • Re multiple bots; I have already done this to some extent; however it might be an idea to set up a simplified version of the bot that does the bare minimum, that can be used whenever the bells & whistles version is unavailable. I'll get to that anon. Martin (Smith609 – Talk) 16:33, 20 February 2010 (UTC)
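The ref-naming scheme from Richiez's list (AuthorLastName_pubdate, falling back to AuthorLastName_pubdate_pmid on a clash, with repeated PMIDs sharing one name) could look roughly like this in Python. It is a sketch of the described idea, not his program; the function name, the `used` dictionary and the sample values in the usage note are hypothetical.

```python
def make_ref_name(last_name: str, pub_date: str, pmid: str, used: dict) -> str:
    """Pick a <ref name=...> for a PMID citation.

    `used` maps each ref name already handed out on the page to its PMID.
    The same PMID always gets the same name; a different PMID that would
    collide on AuthorLastName_pubdate gets the PMID appended instead.
    """
    base = f"{last_name}_{pub_date}"
    # Free name, or already assigned to this very PMID: reuse it.
    if used.get(base, pmid) == pmid:
        used[base] = pmid
        return base
    # Clash with another paper by the same author and date.
    name = f"{base}_{pmid}"
    used[name] = pmid
    return name
```

With an empty dictionary, two calls for a hypothetical ("Smith", "2009", "111") both return "Smith_2009", while a later ("Smith", "2009", "222") returns "Smith_2009_222".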

DOI toolbox

I noticed that CitationBOT fails to run (when prompted by the .js toolbox) in pages that have endashes and emdashes in their name. Prompting the BOT from the toolserver works just fine, but not when you do it from the toolbox. Headbomb {ταλκκοντριβς – WP Physics}

Eugh, character encodings. Trying to fix those is the fastest route to a headache... I can't access the toolserver databases at the moment, to work out how page titles are stored there, so won't be able to make any headway on this for now. Martin (Smith609 – Talk) 03:40, 25 August 2009 (UTC)
  Fixed Martin (Smith609 – Talk) 02:11, 10 April 2010 (UTC)

Please make CitationBot stop adding months and ISSNs

There's no consensus for that, and it clutters citations for no good reason. See the discussion on Template talk:Cite journal. Thanks. Headbomb {ταλκκοντριβς – WP Physics} 21:17, 9 September 2009 (UTC)

Yes, it's quite clear that there is no consensus for adding either months or ISSNs. But I'm afraid Martin hasn't been active in the past few days, while the bot chugs away adding stuff people don't want. Plus, it's messing up author names. Is it time to ask for an emergency shutoff? Eubulides (talk) 08:55, 11 September 2009 (UTC)
It's running in manual operation (aka supervised mode) only, so I doubt that's appropriate. But who knows. Headbomb {ταλκκοντριβς – WP Physics} 11:05, 11 September 2009 (UTC)
Suspended addition of ISSNs and months pending establishment of consensus. Martin (Smith609 – Talk) 14:27, 17 September 2009 (UTC)


Possibly fixed by disabling the function. Keep an eye out and let me know if it recurs.

I've disabled author functions

Hi all,

I've just got back from away and haven't had time to review the above comments. For now, I have disabled the new author handling functions, so the bot will no longer remove 'et al's, split author= into author1=, author2=, etc, and will no longer detect translators. I imagine that this will allow the bot to be used again without upsetting anyone. When I have more time I will see if I can come up with a solution that placates everybody - I suspect that it may be a case of the bot leaving the author parameter alone. Thus I believe that it is safe to unpause the bot. I've only had time to do a quick test (no edit found to be necessary at this test page) so if it turns out that unwanted behaviour is occurring, feel free to re-pause it.

For future reference, different accounts are used for different functions. This allows certain aspects of bot activity to be paused without stopping others. At present, the accounts operate as follows:

  • Citation bot  : Only operates when initiated manually by a user. May be blocked if absolutely necessary; however do bear in mind that this will inconvenience users of the tool.
  • Citation bot 1 : Operates automatically, either under my supervision whilst I am fixing bugs or developing new features, or unsupervised when there are no outstanding bugs. Blocking this account is encouraged where unhandled systemic errors are present.
  • Citation bot 2 : Creates Cite Doi subpages when they don't exist. Converts Cite arXiv to Cite journal when papers are published. There should rarely be cause to block this account.
  • Citation bot 3 : Expands and formats Cite Doi subpages. Should only be blocked if it is deleteriously affecting the written output of the Cite Doi template, against the template documentation.
  • Citation bot 4 : Used for one-off tasks, using different code to other accounts. Should only be blocked if errant edits are being produced from this account.

Hope that helps. If my prompt action is required, I can be reached via e-mail or my user talk page. Martin (Smith609 – Talk) 16:12, 15 October 2009 (UTC)


Thanks Martin - speedy work. With the author parameter left as is, I think that at least stops interference with existing markup (editors are free to hand-adjust once the bot has done its main work of filling in missing citation details from, say, {Cite DOI:123445} or {PMID:12345}). I therefore think that addresses the main concerns that necessitated pausing the bot... a quick call for comments please (this is not a vote, just another few sets of eyes) and I'll unblock in a couple of hours.
However, I'm sure you are right, Martin, that some uniformly acceptable parser checking of the author parameter is possible, with some extended discussion and careful example testing:
  • easy stuff is double punctuation of say "Smith J.., Blog F." and probably some capitalisation eg "Smith a" (but beware of "de" as in "de X" where I think the "d" often is lowercase)
  • Harder is uppercase to titlecase, eg "SMITH" to "Smith", as short names may be difficult to untangle - is "NG WIN" a "Win, N" or "Ng, W" ?
  • Whilst chopping long lists of authors to a fixed number plus "et al" requires both consensus to use "et al" at all, and a determination of how many names are kept, I'm sure agreement can be reached that the full list of authors should be given if there are just 2 or 3 authors and yet only one lead author is currently defined (with or without "et al")
  • Chopping out titles from author lists might also be possible: "Smith J MD, Jones K MD" seems obvious but "Smith MD, James DO, Jones RCN" would I think need the human touch ("MD" & "DO" are medical qualifications whilst "RCN" is the Royal College of Nursing). David Ruben Talk 21:24, 15 October 2009 (UTC)
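
The "easy stuff" in the first bullet above (doubled punctuation and a lone lowercase initial) might look something like this minimal sketch. The function name is hypothetical and nothing here reflects the bot's actual code. It deliberately leaves particles such as "de" alone and makes no attempt at the harder cases (all-caps names, stripping qualifications), which the bullets above suggest need human judgement.

```python
import re


def tidy_author_list(author):
    """Apply two conservative clean-ups to a comma-separated author string."""
    # Collapse runs of full stops: "Smith J.." -> "Smith J."
    author = re.sub(r"\.{2,}", ".", author)

    tidied = []
    for name in author.split(","):
        name = name.strip()
        # Upper-case a single lowercase letter at the end of a name,
        # e.g. "Smith a" -> "Smith A". A leading particle such as the
        # "de" in "de X" is untouched because it is not at the end.
        name = re.sub(
            r"(?<=\s)([a-z])(\.?)$",
            lambda m: m.group(1).upper() + m.group(2),
            name,
        )
        tidied.append(name)
    return ", ".join(tidied)
```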

I'm afraid it's still changing some author lists. I just tried it on Water fluoridation (without installing the result) and it made the following insertions:

  • {{cite journal |author=Bailey W, Barker L, Duchon K, Maas W |title=Populations receiving optimally fluoridated public drinking water—United States, 1992–2006 |journal=MMWR Morb Mortal Wkly Rep |volume=57 |issue=27 |pages=737–41 |year=2008 |pmid=18614991 |url=http://cdc.gov/mmwr/preview/mmwrhtml/mm5727a1.htm |author1=Centers for Disease Control and Prevention (CDC) }}
  • {{cite journal |author=CDC |title= Ten great public health achievements—United States, 1900–1999 |journal= MMWR Morb Mortal Wkly Rep |volume=48 |issue=12 |pages=241–3 |year=1999 |pmid=10220250 |url=http://cdc.gov/mmwr/preview/mmwrhtml/00056796.htm |author1=Centers for Disease Control and Prevention (CDC) }}
  • {{cite journal |author= Truman BI, Gooch BF, Sulemana I ''et al.'' |title= Reviews of evidence on interventions to prevent dental caries, oral and pharyngeal cancers, and sports-related craniofacial injuries |journal= Am J Prev Med |volume=23 |issue= 1 Suppl |pages=21–54 |year=2002 |pmid=12091093 |doi=10.1016/S0749-3797(02)00449-X |url=http://thecommunityguide.org/oral/oral-ajpm-ev-rev.pdf |format=PDF |accessdate=2009-02-03 |author9= Task Force on Community Preventive Services }}
  • {{cite journal |journal=Eur Arch Paediatr Dent |year=2009 |volume=10 |issue=3 |pages=129–35 |title=Guidelines on the use of fluoride in children: an EAPD policy document |author=European Academy Of Paediatric Dentistry |url=http://www.eapd.gr/EAPDJournal/2009v10/Issue_3/Vol_10_2_June_F_Guide.pdf |format=PDF |pmid=19772841 |author1=European Academy Of Paediatric Dentistry EA }}

There were some others but this should be enough to let you see the bugs. Eubulides (talk) 23:15, 15 October 2009 (UTC)

  Fixed in r56; thanks for testing. Martin (Smith609 – Talk) 01:31, 17 October 2009 (UTC)
Could the bot be unblocked now (especially the manual part of it)? Headbomb {ταλκκοντριβς – WP Physics} 01:41, 17 October 2009 (UTC)
Just a wee bit of a tweak to me here: the bot added a |first1= for no real reason. Headbomb {ταλκκοντριβς – WP Physics} 01:59, 17 October 2009 (UTC)
  • I'll fix that before operating the bot in automatic mode. I don't see it as a significant enough problem to stop the manual-mode bot (User:Citation bot) from being unpaused - it should be safe to unblock it now. Martin (Smith609 – Talk) 02:58, 17 October 2009 (UTC)

Tried it out on Autism article

I tried the new bot out on Autism. Most of the changes it made were good ones (thanks!) and I agree that it's safe to unblock now. The problems I noted were:

  • A bogus "|first2=C." in a single-author article (see this diff). In some sense this is the most troubling because if I run the bot again it'll do it again, with no obvious workaround in general.
  • The following insertion (underlined) was not desired:
{{cite journal |author=[[Leo Kanner|Kanner L]] |title=Autistic disturbances of affective contact |journal=Nerv Child |volume=2 |pages=217–50 |year=1943 }} Reprinted in {{cite journal |year=1968 |journal=Acta Paedopsychiatr |volume=35 |issue=4 |pages=100–36 |pmid=4880460 |author1=Kanner |title=Autistic disturbances of affective contact. }}
I worked around the problem by replacing "|author1=Kanner |title=Autistic disturbances of affective contact." with "|author=<!-- Pacify Citation bot. --> |title=<!-- Pacify Citation bot. -->".
  • Adding |doi_brokendate=2009-10-17 to doi:10.1001/jama.285.24.3093, a DOI that is not broken. That's the Journal of the American Medical Association, by the way.
  • Insertion of several |pmc= values that are counterproductive because the PubMed Central articles are embargoed until next year. The affected articles were PMC 2677584, PMC 2677593, PMC 2692135, and PMC 2692092. We've talked about this before; I just wanted to mention it again, as it's an ongoing problem. I worked around this by replacing, for example, "|pmc=2677584" with "|pmc=<!-- 2677584 embargoed until 2010-05-27 -->", but this is unsatisfactory as it will require manual intervention on each due date, something that's unlikely to be done correctly by hand.

Eubulides (talk) 05:58, 17 October 2009 (UTC)

The pmc-embargo-date parameter is now in place to address this latter concern. Making the bot understand it is on the to-do list. Martin (Smith609 – Talk) 07:30, 17 October 2009 (UTC)
  • The bot has now been unblocked. Martin (Smith609 – Talk) 07:38, 17 October 2009 (UTC)
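
One way a bot or template could respect the pmc-embargo-date parameter mentioned above is sketched below. The parameter names `pmc` and `pmc-embargo-date` come from this discussion; the function name, the ISO date format, and the assumption that the article becomes available on the embargo date itself are all illustrative guesses rather than a description of the bot.

```python
from datetime import date


def pmc_to_display(params, today):
    """Return the PMC id to link to, or None if it is missing or embargoed.

    `params` is a dict of citation parameters; `today` is a datetime.date.
    Assumes 'pmc-embargo-date' is written as YYYY-MM-DD.
    """
    pmc = params.get("pmc")
    if not pmc:
        return None
    embargo = params.get("pmc-embargo-date")
    if embargo and date.fromisoformat(embargo) > today:
        # Still under embargo: do not link to PubMed Central yet.
        return None
    return pmc
```

With this approach the value stays in the citation and starts displaying automatically once the date passes, so no manual intervention is needed on each due date.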


Authors

This one is weird [9]. Headbomb {ταλκκοντριβς – WP Physics} 04:04, 21 October 2009 (UTC)

Another weird one: [10]. Headbomb {ταλκκοντριβς – WP Physics} 15:05, 26 October 2009 (UTC)
Again the bot strips et al to make its own replacements. [11]. Headbomb {ταλκκοντριβς – WP Physics} 19:30, 14 November 2009 (UTC)

Three issues

PMID citations redirecting to redlinked (nonexistent) DOI citations

At Psychopathy#References, there are several instances of PMID citations being redirected to redlinked DOI citations. Here they are:

Why is this happening? Other than editing each citation manually, is there a way to fix this? Thanks! Alamanth (talk) 16:37, 11 November 2009 (UTC)

Hurrah! Blue links! The red links were caused when the bot's run was interrupted. Now, instead of saving the new redirect landing pages to the end, the bot creates them as soon as it comes across them. I've also built in a feature to search for dud redirects and complete them – just include the page in the category Category:Pages with incomplete PMID references and the bot will fix them. Be sure to decategorise once the bot has done its magic. Martin (Smith609 – Talk) 02:54, 12 November 2009 (UTC)
Hooray! Thanks for the explanation. I don't understand it entirely (the third sentence in particular), but I do appreciate it: if you've got time and can re-explain the third sentence, that'd be neat.
The category is automatically added to a page when an incomplete 'cite pmid' reference exists. Each time the bot runs (every few minutes) it checks all pages in the category and completes the incomplete references. So if you add the category manually to the article that contains the incomplete templates, you should remember to remove it, or the bot will visit the page each time it runs in perpetuity, which is a waste of its resources. Martin (Smith609 – Talk) 16:43, 13 November 2009 (UTC)
Question: what interrupted the bot's run; was it something I or another user did, and can it be avoided? Also, when using the handy "Pages with incomplete..." category, which page should I include in the category: the article page, or the incomplete template "pages" (like those above)? Thanks so much. Alamanth (talk) 17:56, 12 November 2009 (UTC)
It was probably interrupted on the server - there's nothing that users can do to interrupt its run, so don't worry about that. Martin (Smith609 – Talk) 16:43, 13 November 2009 (UTC)

References incomplete after several hours, 'jump the queue' does not work

These references appear with the following text: "This citation will be automatically completed in the next few minutes. You can jump the queue or expand by hand." Jumping the queue does not work, and rather than expand them by hand, I would prefer for the citations to be expanded automatically, to ensure they are expanded completely, accurately, and uniformly.

Thank you. Alamanth (talk) 16:34, 11 November 2009 (UTC)

The error here seems to be that the DOIs aren't yet functional. Check that they are free of errors (I suspect that the space in the second one should not be there, for instance); if so, you will have to wait for them to be added to the CrossRef database before the bot can touch them, I'm afraid. Martin (Smith609 – Talk) 03:00, 12 November 2009 (UTC)
I thought I might have done those incorrectly (I was tired); forward slashes were formatted incorrectly (as ".2F"). Here are the correct links, which, sadly, also do not work:
I clicked, found errors for each, and reported them to DOI.org. Is there anything else I (or anyone) can do to fix these? Also, is there another way to automatically populate reference templates, using the information I do have? Thanks! Alamanth (talk) 17:18, 12 November 2009 (UTC)
Actually, the template names require the slashes to be dot-encoded. The easiest way to complete them manually is to click on the 'expand by hand' link by the reference - this will give you a pre-prepared template to complete. Otherwise, I'm afraid you've done all you can, and it's just a case of waiting for the links to be fixed! Martin (Smith609 – Talk) 16:43, 13 November 2009 (UTC)
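
For illustration, the dot-encoding of slashes described above (a "/" appearing as ".2F") can be reproduced by percent-encoding the DOI and swapping "%" for ".". Only the ".2F" form is confirmed in this thread; applying the same treatment to other characters such as parentheses is an assumption, and the function name is hypothetical.

```python
from urllib.parse import quote


def dot_encode(doi):
    """Percent-encode a DOI, then write each '%' as '.'.

    Letters, digits and the characters '-', '_', '.', '~' pass through
    unchanged; everything else (including '/') is encoded.
    """
    return quote(doi, safe="").replace("%", ".")
```

Under these assumptions, `dot_encode("10.1016/S0749-3797(02)00449-X")` gives `10.1016.2FS0749-3797.2802.2900449-X`.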

WebCite temporarily down – shouldn't call that a dead link

Can I suggest that the bot not add "dead link" notices to citation templates that already have the |archiveurl= and |archivedate= parameters? It seems that WebCite is temporarily down, which is why the bot was unable to access the archive page in some citation templates in "Criminal Law (Temporary Provisions) Act (Singapore)": see [12]. — Cheers, JackLee talk 05:53, 10 April 2010 (UTC)

Done in r102. I'll try to roll it out in the stable version too before r102 goes public. Martin (Smith609 – Talk) 06:14, 10 April 2010 (UTC)
Thanks. * Sigh * – WebCite seems to go down once too often for my liking. — Cheers, JackLee talk 17:53, 10 April 2010 (UTC)
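
The behaviour JackLee asks for could be expressed as a small guard like the one below. It is a sketch only: the function name and the `url_is_reachable` flag are hypothetical, and it is not the code that went into r102. The parameter names `archiveurl` and `archivedate` are those quoted in the request.

```python
def should_flag_dead_link(params, url_is_reachable):
    """Decide whether a citation should be tagged as a dead link.

    A citation that already carries both archiveurl and archivedate is
    never tagged, since an unreachable archive may only be temporarily down.
    """
    if url_is_reachable:
        return False
    has_archive = bool(params.get("archiveurl", "").strip()) and bool(
        params.get("archivedate", "").strip()
    )
    return not has_archive
```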

Wiki doi is a covert advertising instrument

For a discussion of what is in my view a gross violation of our core principle Wikipedia:NOTADVERTISING, please see HERE. Gun Powder Ma (talk) 11:47, 11 June 2010 (UTC)

Numerous responses to this post are here; the discussion should be kept centralized rather than occur in multiple places. postdlf (talk) 18:16, 12 June 2010 (UTC)

{{Resolved}}

Dead link + search

Would it be possible to use templates to place "[dead link]Scholar search" in citations? This way they could be excluded from print, and PDFs and Books would come out nicer. Headbomb {talk / contribs / physics / books} 02:09, 12 May 2010 (UTC)

The bot used to add a scholar search link to citations but this was disabled on the request of editors who felt that it added too much clutter. Does this answer your question? Martin (Smith609 – Talk) 21:26, 15 May 2010 (UTC)

{{resolved}}

Translated titles and brackets and doublequotes, oh my!

Please see Wikipedia_talk:WikiProject_Medicine#Ref_formatting_on_non-English_journals for a report on malformed rendering of anglicized titles using cite journal with |url=http://www.somewhere.net and |title=[Some translated title]. I think this likely relates to Crum's edits on 23, 25 June 2009 to template:citation/core. It seemed to me that there should be a way to automagically convert to using |trans_title= instead, perhaps with the help of user:citation bot. Then, looking at the source for citation bot I find that r43 seems to have deliberately disabled this, evidently at the request of Eubulides, for reasons which are still opaque. Any chance of getting this sorted out on a more permanent basis? LeadSongDog come howl 15:42, 28 April 2010 (UTC)

I can't remember the ins and outs of it; I'm sure that Eubulides would be able to fill you in. If you and he agree, I'm happy to re-implement it. The best thing would be if you could spell out a procedure here and agree on it (e.g. "move any component of the title parameter in square brackets to the trans_title parameter"); once I have explicit details of what the bot should do, I'll be able to implement it. Martin (Smith609 – Talk) 16:08, 28 April 2010 (UTC)
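
As a starting point for spelling out such a procedure, here is a minimal sketch of the simplest case: a title that consists entirely of one bracketed phrase. The function name is hypothetical, and this is not the code that was disabled in r43. It skips wikilinked titles (which begin with "[[") and never overwrites an existing trans_title. What should then go in the emptied title field (presumably the original-language title) is left to the editor.

```python
import re

# A whole title wrapped in single square brackets, optionally followed
# by a full stop. "[[wikilinks]]" are excluded by the negative lookahead.
BRACKETED_TITLE = re.compile(r"^\s*\[(?!\[)(.+?)\]\.?\s*$")


def move_bracketed_title(params):
    """Return a copy of params with a fully bracketed title moved to trans_title."""
    result = dict(params)
    match = BRACKETED_TITLE.match(result.get("title", ""))
    if match and not result.get("trans_title"):
        result["trans_title"] = match.group(1).strip()
        result["title"] = ""
    return result
```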

ISSN for journal

The citation bot failed to add the issn for the Journal of Design History in this edit. The correct issn would be 0952-4649. Is this a bug? Smallman12q (talk) 13:24, 5 May 2010 (UTC)

No, but thanks for pointing it out. Some editors prefer ISSNs not to be added to their articles, so the bot won't add an ISSN to an existing citation. Martin (Smith609 – Talk) 14:17, 5 May 2010 (UTC)

Proposal to block

See WP:ANI#Citation bot. —David Eppstein (talk) 21:44, 11 January 2010 (UTC)

Excellent, thanks for raising it and getting a quick block. Perhaps a WP:Kill all humans 'bot policy is in order?—Ash (talk) 22:02, 11 January 2010 (UTC)
I've emailed the operator to advise, since he's evidently off-wiki. I can't say that I agree with the need for the block, though. The problem cases are rarely other than garbage in, garbage out, mostly characterized by {{cite x}} without a stated |author1= or |last1=. At least by making them look odd they'll be more apt to get some human attention. LeadSongDog come howl 22:27, 11 January 2010 (UTC)
The section "Citation type unification breaks harv templates" above is exactly the sort of thing the operator should be fixing. Really the "unification" task is dodgy to begin with; I am surprised it was approved. But if it is going to run, the bot is going to have to avoid breaking anything in the process. — Carl (CBM · talk) 22:40, 11 January 2010 (UTC)
I am also concerned that it overrides the title by consulting some database that the average editor has trouble understanding or accessing. If that database has the title of a publication wrong, the bot will edit-war with editors who know better forever (or until someone puts a bot exclusion template into the article). --Jc3s5h (talk) 23:07, 11 January 2010 (UTC)
It isn't just "some" database. For books, it is the same database that most libraries rely on as authoritative. While WorldCat can only be as accurate as the data that the bibliographic metadata source (usually a publisher or a library) provides, that isn't a flaw in the bot. Those gripes should be directed to the appropriate source library or publisher to fix their metadata. Similarly, for DOIs the responsibility rests with the publisher to correctly register the data, although in that case the registry is widely distributed by design. I don't know if the Wikimedia Foundation has established a membership in the OCLC, but if not, I'd suggest that it would be worth discussing, if only to ensure that WikiBooks would be accurately catalogued. There is considerable possibility for improvement, e.g. it could bring an end to the "Daily ISBN query limit exceeded" messages. LeadSongDog come howl 17:29, 12 January 2010 (UTC)
Deciding who to believe in case of conflicting information is a human activity, not a bot activity. The bot should not override the title, especially in an obscure way without providing explicit instructions about what to do when the bot is wrong. --Jc3s5h (talk) 17:41, 12 January 2010 (UTC)
Can you give another example where it has done this, or are you referring only to the z/Journal instance you mention above? LeadSongDog come howl 18:51, 12 January 2010 (UTC)
The z/Journal example is the only one I have come across. --Jc3s5h (talk) 19:00, 12 January 2010 (UTC)
We've had problems in the past with the Citation bot messing with journal title capitalizations; I ran into it with FEBS Journal and with S.A.P.I.EN.S., journal titles that the bot incorrectly changed to title case. Eventually this was fixed with an exception list maintained in User:Citation bot/capitalisation exclusions. Will it fix this problem to add z/Journal to that list? Since the source code to the Citation bot is not publicly available, I don't know how it uses that list or whether adding z/Journal will fix the problem. Eubulides (talk) 19:21, 12 January 2010 (UTC)
See here. Read-only for nonmembers, but it is there. LeadSongDog come howl 19:35, 12 January 2010 (UTC)
Ah, thanks, and I took the liberty of adding this URL to User:Citation bot #External links. The source code answers my question: no, adding z/Journal to that list would not fix things, because the code always capitalizes the first letter of a journal title, no matter what's been excluded. This needs to get fixed. Eubulides (talk) 19:55, 12 January 2010 (UTC)
  Fixed in r140 — Martin (Smith609 – Talk) 20:58, 15 May 2010 (UTC)
But what can be done is to email the editor of z/Journal, advise them of their error, and request that they fix their cataloguing data at the Library of Congress Catalogue and on WorldCat. That fixes the problem for everyone, not just WP. LeadSongDog come howl 20:18, 12 January 2010 (UTC)

(unindent) I have emailed z/Journal as LeadSongDog suggests. If I understand Eubulides, however, that will not prevent CitationBot from capitalizing the first letter. Since I don't know what conventions the Library of Congress Catalogue and WorldCat impose on titles, I can't say whether we might run into some irreconcilable differences between what a publication calls itself and how it is named by these organizations. --Jc3s5h (talk) 21:51, 12 January 2010 (UTC)

There does not appear to be an imposed convention, just an error. For example the heading "z/OS" appears in the LOC authorities. The authorities are sorted ignoring capitalization and non-alphabetic characters. Strangely there is no authority record for z/Journal or obvious variations. The cataloguing error could also be reported here. LeadSongDog come howl 13:37, 15 January 2010 (UTC)
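
The capitalisation problem Eubulides identifies above comes down to ordering: the exclusion list has to be consulted before any capitalisation is applied, including capitalisation of the first letter. A minimal sketch of that ordering follows. The function name and the small-word list are hypothetical, and this is not the bot's actual code or the change made in r140.

```python
# Words left in lower case when they are not the first word (illustrative list).
SMALL_WORDS = {"of", "the", "and", "in", "for", "on", "a", "an"}


def capitalise_journal(title, exclusions):
    """Title-case a journal name unless it is on the exclusion list."""
    # Check exclusions first, so titles such as "z/Journal" are returned
    # untouched rather than having their first letter capitalised.
    if title in exclusions:
        return title

    words = []
    for index, word in enumerate(title.split()):
        if index > 0 and word.lower() in SMALL_WORDS:
            words.append(word.lower())
        else:
            # Upper-case only the first character; leave the rest as typed.
            words.append(word[:1].upper() + word[1:])
    return " ".join(words)
```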

ISBN error

In this edit, the bot added the isbn for the wrong volume of Schuster's Hepaticae and Anthocerotae. As far as I am aware, the first two volumes did not have assigned ISBNs. --EncycloPetey (talk) 04:06, 11 May 2010 (UTC)

See User:Citation_bot#False_positives. Martin (Smith609 – Talk) 16:58, 11 May 2010 (UTC)

Talk page blanking

Hi. Citation bot 2 (talk · contribs) has just blanked Talk:Global warming controversy for no good reason. It first added a broken DOI note [13], then removed all other talk page content [14]. Please investigate...--Stephan Schulz (talk) 16:46, 11 May 2010 (UTC)

Thanks for the report; I will investigate ASAP. For the meantime I've added a check to the code that should stop the bot from blanking pages, but please let me know if this is ineffectual. Thanks. Martin (Smith609 – Talk) 16:59, 11 May 2010 (UTC)
It just did it again...[15]. --Stephan Schulz (talk) 17:04, 11 May 2010 (UTC)
I can't replicate the bug; however I have added safeguards against this problem that I hope will be effective. Martin (Smith609 – Talk) 21:25, 15 May 2010 (UTC)
I've replaced the bogus {{cite doi}} instance that started the problem. Really they should all be replaced with {{cite xxx}} using |doi=. As I've said before, it seemed at first like a nice idea, but it is fundamentally broken and unnecessary. It begs for vandalism, which it then hides in the history of an unwatched subpage. It imposes one stylistic choice on multiple citing articles. And given that the dois sometimes turn up dead like this one it just plain makes citations stop working, leaving facts effectively uncited. Let's put {{deprecated}} to it once and for all. LeadSongDog come howl 19:08, 11 May 2010 (UTC)
I think that we can safely leave this to editors' individual discretion; there is no mandate to use the {cite doi} format over {cite journal | doi} if that is more appropriate for a given article. Martin (Smith609 – Talk) 21:25, 15 May 2010 (UTC)

Automating the publisher field

Hello, I'm spencer, a freebase programmer from Canada. I've put together a web service that can figure out a reference's 'publisher' unambiguously, given a url, using data in freebase.

the app was originally put together for this ubiquity citation tool but the more i think about it, the more it makes sense to write this information into existing references. I'd like to get some feedback on the idea.

freebase has many (several hundred thousand) 'official websites' mapped to freebase topics, and they are also connected to wikipedia. So if you give it a domain name, it will check to see if it is a publisher (defined as a newspaper, periodical, broadcaster, or blog), and if so, it will return its wikipedia link.

the api is http://referee.freebaseapps.com/?url={url}
Here's an example:

http://referee.freebaseapps.com/?url=http://www.nytimes.com/2010/03/16/arts/television/16cspan.html

and it's not just popular American websites either. [16] [17]

I think it's a cool way to infer information in wikipedia, improving references. Love to hear what you think. I am willing to invest a fair amount of time in this project, if it is supported. Spencerk (talk) 22:50, 23 March 2010 (UTC)

Sounds like an excellent idea. Your API should make it very simple for me to incorporate into the Citation Bot; all I would need would be information on when a publisher should and should not be added to a reference. Is there any instance in which your tool will return a value but it may not be appropriate to include it into the reference? For instance, might there be instances where editors have included the 'publisher' information in an alternate field?
Thanks, Martin (Smith609 – Talk) 12:45, 29 March 2010 (UTC)

hi martin! the answer i think is that it is all totally flexible, and depends more upon wikipedia policy than anything. There seems to be some ambiguity between 'publisher' and 'work' parameters, that I don't fully understand, and am not alone. Currently the api will return 'wordpress' as the publisher to a blog for example. but these things can be changed to accommodate policy. I've pinged a few others, hoping for some clarifications, cheers Spencerk (talk) 19:13, 31 March 2010 (UTC)

Two thoughts. One, I think it is up to the proposer of this potential enhancement to explain exactly how freebase determines who the publisher is. If there is no description from freebase about how they determine this, I don't think we can use them. Ordinarily the identity of the publisher is determined by the person writing the cite, and is more or less aimed at the entity that put up the funds to publish the work and which also has the last word about the content. So, I would be the publisher of my personal web page, because I pay for the subscription, and I control what is in it, even if I decide to put a contribution from some other author on it. If freebase thought my ISP was the publisher of my personal website, freebase would be wrong.
Second, freebase might get any particular entry wrong, even if it is accurate enough to be used in most cases. Therefore a bot should only be used to populate empty publisher fields, not to replace values entered by editors. Jc3s5h (talk) 19:46, 31 March 2010 (UTC)
As a possible cross-check, consider that a WorldCat query for "nytimes.com" "New York Times Co" returns a non-empty list of results. Other well-catalogued serials should have similar behaviour. This should aid in finding the ISSN if still blank. User:LeadSongDog come howl 20:22, 31 March 2010 (UTC)

thanks Jc3s5h! i like your definition of publisher, and i think it's more satisfying than what's in the documentation. Freebase doesn't actually have a publisher type, so i've used newspaper, periodical, news_reporting_organisation, and blog. If it belongs to any of these categories, it is considered a publisher, and i think this is valid. I agree as well that it should not replace user-contributed data. Is there a way to run this as a test? to see its results? I will try to whip something up if this is valuable.
@LeadSongDog, this is also a good idea, though freebase stores ISSNs as well, and it would be easy, once the publisher is known, to grab this piece of information. Using it this way may pose a licensing problem though. I will ask the freebase mailing list about this. Spencerk (talk) 11:05, 4 April 2010 (UTC)

  • i got the go-ahead from the freebase legal guy to use the db in this way. so, im ready to get this folded into the bot when you are Spencerk (talk) 20:45, 25 May 2010 (UTC)
    Super, thank you! So all I need to do is to make the bot query the API you mentioned above with a specified URL (with no entry in the "publisher" field), and retrieve the publisher? I'm using this implementation in the current beta (r150). I know that other bots are in the business of adding "title" fields to sparse citations; are you aware of an API that I could query to populate the title field too? Martin (Smith609 – Talk) 21:26, 25 May 2010 (UTC)
    Update; I've already noted several cases where the publisher is already credited in the |journal= (e.g.) or |periodical= (e.g.) field. For now, I am simply not querying the API for a publisher if a |journal= already exists. Is this the optimal solution? Martin (Smith609 – Talk) 21:50, 25 May 2010 (UTC)
  • hooray! I'm excited to learn it's been adopted! yay! you rule martin. I'm really confident in this strategy. Automating the title field would require scraping the page. there are a few projects that have put together scrapers for common sites, but nothing so far is very well maintained. This is something I would like to write in the future. I agree dropping journal references is safe, to avoid a possible though unlikely discrepancy (I've checked a bunch of journals with the api[18][19]). Please do not ever hesitate to ping me for help, questions, bugs. i 'commit'. Spencerk (talk) 17:17, 7 June 2010 (UTC)
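
The rules agreed in this thread (only fill an empty publisher field, and skip citations that already have a |journal=) can be sketched as a function that decides whether to build a query URL at all. The API address is the one Spencerk gives above. The function name, the choice to percent-encode the URL, and skipping |periodical= as well as |journal= are assumptions, not a description of what r150 does. No network request is made here.

```python
from urllib.parse import urlencode

API_BASE = "http://referee.freebaseapps.com/"


def publisher_query_url(params):
    """Return the lookup URL for a citation, or None if no lookup is wanted."""
    url = params.get("url", "").strip()
    if not url:
        return None
    # Never replace a value an editor has already entered.
    if params.get("publisher", "").strip():
        return None
    # The publication is often already credited in these fields.
    if params.get("journal", "").strip() or params.get("periodical", "").strip():
        return None
    return API_BASE + "?" + urlencode({"url": url})
```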

Error

This edit was wrong in that there should not have been any postscript attribute. Jc3s5h (talk) 17:35, 28 May 2010 (UTC)

The absence of a postscript likely produces an output format that is inconsistent with other citations on the page, hence the bot flags it with a hidden comment that does not change the display, but draws editors' attention to it. An editor can then make an informed decision whether to leave the citation inconsistent with others in the page (they could remove the hidden comment if they chose to, without modifying the output; the bot won't add it back) or to manually bring it in to line with other citations. This isn't a decision that can be automated. Martin (Smith609 – Talk) 18:20, 28 May 2010 (UTC)
It did the same over at the skyscraper article, making the references look even more inconsistent. A pretty pointless action IMHO. Astronaut (talk) 03:24, 2 June 2010 (UTC)
"even more inconsistent"? Please explain how. Martin (Smith609 – Talk) 13:47, 2 June 2010 (UTC)

{{resolved}}

Cite book

How is this edit an improvement? I've seen other edits correctly changing cite web to cite news, but I see no advantage - and possibly a disadvantage - to changing cite book to Citation --JimWae (talk) 20:27, 2 June 2010 (UTC)

See User:DOI bot/bugs#Unifying citation types again and User:Citation bot#Changing citation to cite journal / cite book / etc. Martin (Smith609 – Talk) 20:50, 2 June 2010 (UTC)

{{resolved}}

URL and publisher

Hi. I'd like to know on what basis (= user consensus) the bot adds entries like url and publisher. As yet your bot page misses that vital information, which should be put on top. Frankly, I deem these additions superfluous in articles and would like to stop them permanently from being added in articles I created. Regards Gun Powder Ma (talk) 08:58, 11 June 2010 (UTC)

Leaving incomplete data

On this diff [20], I don't know whether the second author is really an editor, but the edit left the citation data in an obviously broken state. Maybe the bot could test for that sort of thing and leave an error. — Carl (CBM · talk) 00:39, 7 July 2010 (UTC)

That looks like a possible addition to the anticipated typo list, for |author2-last= vice |last2= (and similar). LeadSongDog come howl! 19:10, 9 July 2010 (UTC)
  Done Martin (Smith609 – Talk) 14:37, 15 July 2010 (UTC)
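
An entry in an "anticipated typo list" of the kind LeadSongDog suggests could be as simple as a pattern that rewrites |authorN-last= to |lastN=. The sketch below is illustrative only: the function name is hypothetical, and treating |authorN-first= the same way is an assumption based on the "(and similar)" above, not on the bot's real list.

```python
import re

# Matches e.g. "author2-last" or "author10-first".
AUTHOR_N_PART = re.compile(r"^author(\d+)-(last|first)$")


def normalise_param_name(name):
    """Map an anticipated misspelling of a parameter name to its usual form."""
    name = name.strip()
    match = AUTHOR_N_PART.match(name)
    if match:
        number, part = match.group(1), match.group(2)
        return f"{part}{number}"
    return name
```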

{{resolved}}

Postscript

Hello. Thanks for the good work that Citation bot does. I am wondering what is the purpose of adding a postscript parameter to citation templates in articles such as this [21]? ----Steve Quinn (talk) 14:02, 23 July 2010 (UTC)

Do see User:Citation_bot#Changing_citation_to_cite_journal_.2F_cite_book_.2F_etc. Thanks, Martin (Smith609 – Talk) 14:18, 23 July 2010 (UTC)
Bot is right, {{cite journal}} documentation just needs a little update. already includes an explanation of |postscript=. Rjwilmsi 18:06, 23 July 2010 (UTC)
But, as mentioned above, "Postscript" is not mentioned in the documentation for {{Cite news}}. This is very confusing for editors who wonder what is going on when a reference they've added is amended by the bot. Please add an explanation there. Thanks. PamD (talk) 18:43, 23 July 2010 (UTC)
The two templates treat it the same; documentation updated. Rjwilmsi 18:54, 23 July 2010 (UTC)
Thanks for that, Rjwilmsi: it's a pity that the creator of the bot didn't themself help clear up the confusion by amending the documentation, rather than directing people to the bot's talk page. I wasn't confident enough about my understanding of "postscript" to amend the documentation myself. PamD (talk) 19:31, 23 July 2010 (UTC)
Not quite true, Martin updated the documentation of cite journal to add postscript in April 2009. Rjwilmsi 19:46, 23 July 2010 (UTC)
But that wasn't much help for anyone puzzled by its addition to {{cite news}}. PamD (talk) 22:00, 23 July 2010 (UTC)
In the same vein, please could you explain the purpose of this edit[22] which adds postscript=<!--None-->, thereby making the citation look inconsistent with all of the other citations in the article. —Sladen (talk) 11:36, 29 July 2010 (UTC)
I've updated User:Citation_bot#Postscript_parameter to clarify this. Does anybody have any suggestions to make this less confusing? Instead of |postscript=<!-- None --> the bot could implement something like |postscript=<!-- This parameter stops the citation from ending in a period, which makes it inconsistent with other citations in the article; please remove this comment if this is not the intentional behaviour, or replace the content of this comment if the terminal period really should be omitted -->[[Category:Articles identified by a bot as having inconsistent citation formats that require editor confirmation]] : this would be clearer (improvements to wording welcomed) but more verbose (which would probably have the benefit of attracting editors to fix the problem). Martin (Smith609 – Talk) 14:39, 29 July 2010 (UTC)
Why aren't the periods just added to the citation templates that don't currently have one? --Ħ MIESIANIACAL 15:17, 29 July 2010 (UTC)
Because the templates only implement WP:MOS#CITE and WP:Citing sources, they don't dictate the form to be rendered. The question of whether or not to end the reference with a period is left to editors, with an admonition for consistency within an article. In some disciplines (such as medicine) it is generally accepted that the postscript period is used. In others it is not used. Accordingly, the templates provide the postscript parameter to allow editors to tailor the template behaviour to suit the article's accepted style. Of course, a simpler version of Martin's suggestion might be |postscript=<!-- Bot inserted parameter. Remove or change value to "." for the cite to end in a period if preferred. -->[[Category:Articles with inconsistent citation formats]] (although I suspect that category would include virtually all articles except those in FA class). LeadSongDog come howl! 16:50, 29 July 2010 (UTC)
Thanks for the improved wording. However the instruction "remove" may not work; if the parameter is removed from cite journal, the period will be added; but if it is removed from citation, then no period will be produced. I'm not sure how to convey this ambiguity. Martin (Smith609 – Talk) 19:43, 29 July 2010 (UTC)
  Done in r182. Martin (Smith609 – Talk) 20:22, 5 August 2010 (UTC)

{{resolved}}

!

Hi, could you please remove the "!" (or more) from your invitation in the edit summary? It's getting too buzzy. If so, thx. -DePiep (talk) 22:35, 30 July 2010 (UTC)

  Done in r181. Martin (Smith609 – Talk) 20:02, 5 August 2010 (UTC)
{{Resolved}}

Postscript in Cite news - not in the template documentation

There is no mention of a "postscript" parameter, as either required or optional, in the documentation for {{cite news}}. It's therefore very confusing for editors who have an article such as Pennine Way on their watch list if the bot adds it, even as an empty parameter. I gather from browsing around that this has been discussed elsewhere. Please either get the template documentation updated or stop adding this parameter. PamD (talk) 07:01, 28 June 2010 (UTC)

Like the editor below, I'd like an explanation of this "postscript" field. The bot is adding a field I've never heard of, and which is not in the documentation, to a template I regularly use. I need to know why. Please explain here. Thanks. PamD (talk) 21:58, 3 July 2010 (UTC)
See User:Citation_bot#Changing_citation_to_cite_journal_.2F_cite_book_.2F_etc. Feel free to make any modifications to documentation that you feel necessary. Martin (Smith609 – Talk) 17:49, 6 July 2010 (UTC)
That doesn't really explain why "postscript=<!--None-->" is inserted. --Ħ MIESIANIACAL 16:00, 23 July 2010 (UTC)
{{resolved}} by re-wording comment content. Martin (Smith609 – Talk) 14:54, 17 September 2010 (UTC)

Question about performance

Postscript comments

Why does User:Citation bot 1 add |postscript=<!--None--> to {{Cite news}}? (e.g.) Please respond on my talk. Thanks. —Justin (koavf)TCM☯ 16:44, 3 July 2010 (UTC)

See User:Citation_bot#Changing_citation_to_cite_journal_.2F_cite_book_.2F_etc. Feel free to make any modifications to documentation that you feel necessary. Martin (Smith609 – Talk) 17:49, 6 July 2010 (UTC)
{{resolved}} Martin (Smith609 – Talk) 14:53, 17 September 2010 (UTC)

Can you filter obviously bad metadata

Is it possible (and for that matter, desirable) to filter the citation metadata you're getting for cases when the alleged publisher name has strings of the form "Vol. [0-9IVXLC]+" in its name (e.g. [23]), and not add the "publisher=" parameter in those cases? That would seem to be a pretty clear indication that someone is giving you mixed up data --- I'd be surprised to hear of a real publishing company with a name like that. (Though I'm sure some editor will read this and make it their life goal to start one. WP:BEANS, etc. =)). Cheers, cab (talk) 00:45, 17 June 2010 (UTC)
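The filter proposed above can be sketched in a few lines. This is only an illustration in Python, not the bot's actual code: the function name `usable_publisher` is hypothetical, and the pattern is the one quoted in the comment, applied as-is.

```python
import re

# Pattern quoted in the discussion: "Vol. " followed by Arabic or Roman numerals.
VOLUME_MARKER = re.compile(r"Vol\. [0-9IVXLC]+")


def usable_publisher(candidate):
    """Return the candidate publisher name, or None if it looks like volume data.

    Hypothetical helper: a caller would skip adding |publisher= when this
    returns None, rather than trying to repair the string.
    """
    if candidate is None:
        return None
    if VOLUME_MARKER.search(candidate):
        # Journal or volume information has leaked into the publisher field.
        return None
    cleaned = candidate.strip()
    return cleaned or None
```

Rejecting the whole value, instead of trimming it at "Vol. ", is the more cautious choice when the rest of the string may also be unrelated to the publisher.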

Here's more: [24]. I agree, this is very bad behavior on Citation bot's part. If it can't clean the data as you suggest, it shouldn't use it. I don't see the point of adding publisher information when the journal name is sufficiently uniquely identifying, nor of adding urls that go to the same place as the doi, in any case. —David Eppstein (talk) 04:28, 19 June 2010 (UTC)
Another problem I am also seeing from bad meta-data on occasion is [25] --- John Doe alone writes a review of a book by Jane Smith, but then Jane Smith is mistakenly listed as a co-author of the review --- i.e. the meta-data mistakenly claims she wrote a review of her own book.
This one I'm less sure how to filter, except by rather byzantine schemes which will no doubt have a high failure rate (e.g. if you could somehow detect from the title of the review that it is indeed a review, then extract the book title, publisher, and year, then find that in a book database, then ignore all metadata which claims that the author of a book is also a co-author of a review of that book). Maybe there's a simpler way. cab (talk) 15:41, 19 June 2010 (UTC)

In the last few weeks I've been rolling back dozens of these "fixes". In fact, whenever I see "tweaked publisher" in the edit summary, I can be pretty sure that the edit is wrong and needs to be rolled back. I think it would be a good idea to disable the publisher feature of the bot until this is fixed — that way, at least, the other useful edits it makes can be kept and the bad edits don't creep into unwatched articles. —David Eppstein (talk) 03:58, 29 June 2010 (UTC)

I have also seen these, and I saw more of these this morning. I have blocked the bot and notified Smith609. Once it's fixed, anyone should feel free to unblock it. — Carl (CBM · talk) 11:43, 29 June 2010 (UTC)

{{unblock|I think that the majority of problems involving publisher metadata involve SpencerK's API, which I cannot modify; thus I have stopped the bot collecting data from this source until it is fixed. This should make it safe to re-enable the bot.}}

 

Your request to be unblocked has been granted for the following reason(s):

Problem reported to be fixed

Request handled by:  Sandstein 

Unblocking administrator: Please check for active autoblocks on this user after accepting the unblock request.

Citation bot 1 (talk · contribs) was also blocked for the same rationale and is safe to unblock too. ([also]) Martin (Smith609 – Talk) 08:50, 3 July 2010 (UTC)

It still seems to be having the same problem: [26]. —David Eppstein (talk) 17:40, 3 July 2010 (UTC)

Another edit with the same problem hit my watchlist as well [27]. I blocked Citation bot 1 for 24h. I will check the main citation bot; could the configurations be different? — Carl (CBM · talk) 22:35, 3 July 2010 (UTC)
The main bot has the same problem [28]. I blocked it temporarily as well. — Carl (CBM · talk) 22:38, 3 July 2010 (UTC)

Too temporarily, it seems. —David Eppstein (talk) 23:03, 4 July 2010 (UTC)

I extended the block now. If nothing else, this could be papered over with a regexp. — Carl (CBM · talk) 03:19, 5 July 2010 (UTC)

{{unblock|see following}}

 

Your request to be unblocked has been granted for the following reason(s):

Only one way to find out...

Request handled by: Smashvilletalk

Unblocking administrator: Please check for active autoblocks on this user after accepting the unblock request.

It looks like I'd misdiagnosed the problem; the errant data source was JSTOR. I have now disabled all addition of publisher parameters so that the bot is safe to run. When I have more time I will see about re-implementing them if they are bug-free. Could those who have observed the bugs comment on the following?
  • Did the errant information always appear in concert with a reference listed in JSTOR (or Google Books??)?
  • If the script were to remove any information after "Vol. ", would the correct data be returned in all the observed cases?
Thanks for your input, Martin (Smith609 – Talk) 15:12, 6 July 2010 (UTC)
Ok, I unblocked Citation bot 1. Someone else got to the main Citation bot before I did. —David Eppstein (talk) 16:24, 6 July 2010 (UTC)
Ad the questions: (1) yes, IIRC (2) not at all; all the info was bogus, the stuff before "Vol." had nothing to do with the publisher, it was the name of the journal. Moreover, as far as I can see, adding publisher info to journal references is unhelpful even if the info is correct, as the standard academic practice is to identify journals only by their names with no publisher specification.—Emil J. 18:26, 6 July 2010 (UTC)

Auto New York Times citation

I've made a php tool for creating New York Times citations from urls at Wikipedia:WikiCite Builder. Would you be interested in adding something like this to your bot?Smallman12q (talk) 20:48, 16 May 2010 (UTC)

We are doing similar work: User:Rjwilmsi/CiteCompletion. Rjwilmsi 22:04, 18 May 2010 (UTC)
Both these tools sound great! It'd be wonderful if Citation bot could incorporate their functionality. Do either of you have an API whereby the bot can send a query and retrieve a list of parameters? If you'd rather communicate via e-mail then feel free to drop me a line. Thanks! Martin (Smith609 – Talk) 15:25, 22 May 2010 (UTC)
Sorry...never saw the responses here...if you want the source...I can email it to you (or put it up on sourceforge). It's fairly easy to implement using the Article Search API and the Times Newswire API. I'm also making a more enhanced offline version called AutoWikiCite...see Wikipedia:Village_pump_(technical)#AutoWikiCite.Smallman12q (talk) 21:10, 17 July 2010 (UTC)

Can you run it against a category of articles please

I would like to set the bot loose on a category of articles. Category:Recipients of the Medal of Honor. But I don't see an option to be able to do that. --Kumioko (talk) 21:05, 23 June 2010 (UTC)

Merged into issue 52 and uplisted on my To-Do list! Martin (Smith609 – Talk) 19:45, 15 July 2010 (UTC)

Placing a dead link template in a ref

With this edit why did the bot put the dead link template in a newly added format parameter and not follow the template documentation?--Rockfang (talk) 03:37, 15 July 2010 (UTC)

This does not affect the bot but rev 6861 AWB will now tidy this up in line with {{dead link}} documentation. Rjwilmsi 11:47, 20 July 2010 (UTC)
Thank you for the info. :) Rockfang (talk) 18:27, 20 July 2010 (UTC)
As of r178, the bot will not check for dead links. The documentation is problematic, see Template_talk:Dead_link#Can.27t_follow_placement_instructions. Martin (Smith609 – Talk) 15:26, 24 July 2010 (UTC)

Problem on articles using "et al" Harvard style citations in line

Can someone please block this bot from being run on articles such as E1b1b, E1b1b1a, List of R1a frequency by population, and R1a? Whatever the pros and cons of different styles of referencing, these articles will simply not function if this bot continues to disable their references by changing the listing of authors in the cite templates.--Andrew Lancaster (talk) 11:45, 26 June 2010 (UTC)

Perhaps you could explain the error in more detail? This will allow me to fix the bug. Martin (Smith609 – Talk) 08:51, 3 July 2010 (UTC)
In this edit the bot adds numerous authors when the first author is listed as "Adams et al." but leaves the "author = Adams et al." as is. Presumably this same problem would occur if the editor who wrote the citation decided to use a corporate author (e.g. 9/11 Commission) but the source in which you look up author names lists individual authors.
The Citation template itself is partly at fault. The default behaviour is to list 9 authors followed by "et al." The practice in most fields is to list up to N authors (N depends on the field) and, if the total number of authors exceeds N, to list only the first author followed by et al. However, if the article cites more than one paper by the same first author, additional authors are listed to disambiguate the citations.
Obviously this is a difficult area for a bot to handle. Keep in mind that citations composed by editors (with or without the aid of templates) is the normal practice, and robotic citation has never received an explicit endorsement by the community. Thus you must not expect editors to alter the way they use the templates so they will be compatible with your bot. Your bot must correctly deal with whatever the editors write, and avoid any tasks that it can't do perfectly. Jc3s5h (talk) 13:24, 3 July 2010 (UTC)
I'm not sure what would be more explicit than WP:BAG approval, but in any case, perfect or not, those edits vastly improved grossly defective cites. Yes, author=last1 et al should have been replaced by last1= and first1=, but it was still a big improvement. We should not let the quest for perfection be a roadblock to improvement. LeadSongDog come howl! 16:18, 3 July 2010 (UTC)
Big improvement? I see no big improvement in the edit I pointed out. The only minor improvement I see is changing a cite book to a citation template. The rest is providing data above and beyond what is normal in scholarly citations. Jc3s5h (talk) 17:10, 3 July 2010 (UTC)
Standard practice in biomedical journals certainly does not identify the lead author solely by surname and other authors solely as et al. Naming those other authors has to be regarded as a significant improvement, as does adding a PMID linkage. LeadSongDog come howl! 03:19, 7 July 2010 (UTC)
You mention that the articles do not function, and that the references are disabled. Despite your expansion, I still do not understand what you mean. Would you mind spelling this out for me, as well as the desired behaviour of the bot? Thanks, Martin (Smith609 – Talk) 17:53, 6 July 2010 (UTC)
It appears that the article is using links of the type created by {{harv|Foo et al|YYYY}} to match the poorly-spelled-out author names in the actual citations. When Citation bot fixes the citations, the links in the harv templates stop working. The solution, of course, is to spell out the authors the same way in the harv templates as well. It does a disservice to the other authors of the papers to list only the first one, and there's no good reason for the omission. —David Eppstein (talk) 18:26, 6 July 2010 (UTC)
But if, as stated above, the bot is leaving the |author=Bloggs et al. as it is, then how does the edit lead to a broken harv link? Martin (Smith609 – Talk) 13:31, 7 July 2010 (UTC)
If I read the above example correctly, the bot is writing ...author=Adams et al.|last2=Bosch|first2=E|last3=Balaresque|first3=PL|last4=Ballereau|first4=SJ|last5=Lee|first5=AC|last6=Arroyo|first6=E|last7=López-Parra|first7=AM|last8=Aler|first8=M|last9=Grifo|first9=MS... or something of the sort. No matter how this is approached, it's going to make a mess of the CITEREF. When the bot adds other lastn and the article has such instances of {{harvcoltxt|Adams et al.|2008}} the bot should ensure the harv family linkage is updated in a way that is consistent with the new #CITEREFAdamsBoschBalaresqueBallereauLeeArroyoLópez-ParraAlerGrifo2008 (or whatever). The harv template has to change in a coherent fashion with the citation template in order to not break the linkages. LeadSongDog come howl! 18:25, 26 August 2010 (UTC)
The {{harv}} family of templates are written in such a way that if they are given up to three surnames, all will display; but if given exactly four, they are condensed. So {{harv|Smith|Jones|Brown|2010|p=1}} displays as (Smith, Jones & Brown 2010, p. 1), but {{harv|Smith|Jones|Brown|Doe|2010|p=1}} displays as (Smith et al. 2010, p. 1). The first expects the full citation to have |last1= to |last3= filled in; the second expects anything from four to nine of the |lastn= fields to be filled in. If done this way, the linking works without any "et al" fudging. --Redrose64 (talk) 19:57, 26 August 2010 (UTC)
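The display rule described above can be modelled roughly as follows. This is a Python sketch of the behaviour as stated in the comment, not the template's own code; `harv_display` is a hypothetical name, and the one- and two-surname formats are assumptions, since the comment only gives the three- and four-surname cases.

```python
def harv_display(surnames, year, page=None):
    """Approximate the rendered text of {{harv}} for one to four surnames.

    Per the comment above: up to three surnames are all shown, exactly four
    are condensed to "First et al.". More than four is rejected here because
    the documentation warns against it.
    """
    count = len(surnames)
    if not 1 <= count <= 4:
        raise ValueError("harv expects between one and four surnames")
    if count == 4:
        names = f"{surnames[0]} et al."
    elif count == 3:
        names = f"{surnames[0]}, {surnames[1]} & {surnames[2]}"
    elif count == 2:
        names = f"{surnames[0]} & {surnames[1]}"  # assumed format
    else:
        names = surnames[0]  # assumed format
    text = f"{names} {year}"
    if page is not None:
        text += f", p. {page}"
    return f"({text})"
```

For example, `harv_display(["Smith", "Jones", "Brown"], 2010, page=1)` gives the three-name form quoted in the comment, and adding "Doe" gives the condensed form.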
OK, a few more examples:

{{harv|Smith|Jones|Brown|2010}} displays as (Smith, Jones & Brown 2010, p. 1) linked to CITEREF_SmithJones_Brown2010, but {{harv|Smith|Jones|Brown|Doe|2010}} displays as (Smith et al. 2010, p. 1) linked to CITEREFSmith_Jones_Brown_Doe2010 , and {{harv|Smith|Jones|Brown|Doe|Wong|2010}} displays as (Smith et al. 2010, p. 1) linked to CITEREFSmith_Jones_Brown_DoeWong

Now perhaps it's just me, but that seems like it needs some improvement.LeadSongDog come howl! 21:18, 26 August 2010 (UTC)
Improvement in the documentation, to say more clearly "don't do that", or improvement in the template, to display something more eye-catching when too many arguments are supplied? —Preceding unsigned comment added by David Eppstein (talkcontribs) 21:28, 26 August 2010
It does state "Up to four authors can be given as parameters (see the examples). If there are more than 4 authors only the first 4 should be listed; listing more will cause odd things to happen."; also "The year and author name(s) must not have extra space before and after, else the generated links will not work." {{harvnb}} documentation has similar warnings, although worded differently. --Redrose64 (talk) 21:36, 26 August 2010 (UTC)
The thing is that those spaces are normal usage. Tools like diberri's even generate them automatically, depending on choice of options. The right answer is to either make the templates ignore them or have citationbot strip them out of both the harvx and citx input. LeadSongDog come howl! 16:28, 28 August 2010 (UTC)
Leading and trailing spaces are immaterial for the citation templates, because everything goes through named parameters. With {{harv}} and its family, five of the parameters are unnamed, so leading/trailing spaces are significant, so if a tool is preparing {{harv}} templates with such spaces, the tool is buggy and the tool owner should be notified. However, per Help:Template#Whitespace problems, the templates could be modified so that spaces can be internally stripped. --Redrose64 (talk) 17:02, 28 August 2010 (UTC)
Is there any justification not to make that fix to the templates? I certainly haven't thought of one, other than the transitional issues, which might indeed be ugly unless a new series of template names were used. Of course having a bot strip all such excess spaces wikiwide would avoid the ugliness, but I'm not sure it wouldn't face detractors among the "I want legible wikitext" crowd, with whom I tend to agree. LeadSongDog come howl! 18:00, 28 August 2010 (UTC)

Proper way to handle DOI with an < and >?

I have a DOI with < and > in it and I don't know how to properly include it so that both the wiki can handle it and the citation bot recognizes it on List of Ig Nobel Prize winners. The doi is 10.1002/(SICI)1097-010X(19980215)280:3<260::AID-JEZ7>3.0.CO;2-L I have verified this at http://www.crossref.org/guestquery/ Search on Article Query with name of Fong , and with title of Induction and Potentiation of Parturition in Fingernail Clams (Sphaerium striatinum) by Selective Serotonin Re- Uptake Inhibitors (SSRIs)

Ideas?Naraht (talk) 19:12, 4 October 2010 (UTC)

{{cite journal|doi = 10.1002/(SICI)1097-010X(19980215)280:3<260::AID-JEZ7>3.0.CO;2-L}}
? Martin (Smith609 – Talk) 20:55, 4 October 2010 (UTC)
OK, I did that, will the citation bot come through and replace like it does for the {{cite doi}} template? Naraht (talk) 01:33, 5 October 2010 (UTC)
Eventually; to make it do so immediately, see WP:UCB. Martin (Smith609 – Talk) 02:11, 5 October 2010 (UTC)
I managed to make it work with the cite doi. Just as a question, is finding a pmid for an article better than finding a doi, vice versa, or will it track down one from the other?Naraht (talk) 02:38, 5 October 2010 (UTC)
Either should work. Use whatever's easier for you. Martin (Smith609 – Talk) 03:07, 5 October 2010 (UTC)
Thank you very much. List of Ig Nobel Prize winners has a *very* eclectic list.Naraht (talk) 03:42, 5 October 2010 (UTC)

{{resolved}}

Cite encyclopedia?

Hi there

Could your bot be tweaked to run on {{cite encyclopedia}} references? Many online encyclopedias now have DOIs, and if I were to invoke {{cite journal}}, your bot runs just fine on them [29][30]. It'd be wonderful if we can do it. Even if it requires some hand coding, it'd still be better than having to use the cite journal workaround. Thanks! --Rifleman 82 (talk) 07:18, 6 October 2010 (UTC)

Sure, will do this when I get the chance. Martin (Smith609 – Talk) 21:29, 6 October 2010 (UTC)
Done in r219. Martin (Smith609 – Talk) 02:48, 23 November 2010 (UTC) {{resolved}}

Why does Citation Bot replace names with initials?

I don't know if this is a bug, or Citation Bot is merely over-zealously interpreting some style guideline, but it is quite a pain to keep fixing changes where the bot changes first names to just a letter... see e.g. here. Is there a way to prevent the bot from doing this? Shreevatsa (talk) 13:59, 27 August 2010 (UTC)

See Template:Cite doi#Formatting. Martin (Smith609 – Talk) 18:00, 8 September 2010 (UTC)
Ah I see you have documented the bug, but I still don't understand why it's that way. Anyway, the solution is to subst: it, right? And this can be done only after the bot has got around to creating the template? (I guess the answers to both are "obviously, yes"; just making sure.) It would have been great to keep the wikitext small and have the bot not mess with the formatting once it's been edited by someone else, but well, thanks anyway. Shreevatsa (talk) 18:50, 8 September 2010 (UTC)
The idea is that each cite doi template can be transcluded on multiple pages, to save cluttering pagecode by substing. For this to be possible, a consistent format must be maintained. If you can't use this format, then it's better to enter {{cite journal|doi=10.xxxx/etc}} and run citation bot on the page. Martin (Smith609 – Talk) 19:25, 8 September 2010 (UTC)
Thanks! That seems a good approach. Shreevatsa (talk) 12:33, 13 September 2010 (UTC)

Problem at hexafluorophosphate

In this recent edit, the citation bot changed the final reference in the article (end of the diff). The bot correctly noted that the paper has three authors, and only two were listed. However, when it added the third author it did not recognise that the missing author was actually the first author. So, instead of making the authors Csöregh, Kierkegaard, and Norrestam the bot made the authors Kierkegaard, Norrestam, and Norrestam. The initial for Kierkegaard in the given reference was also incorrect, which the bot did not correct. I'm not sure if these qualify as bugs, but it does seem to me to be an issue if the bot adds in an additional author which is not the author missing from the citation. Sorry if this is the wrong place to bring this up, my knowledge of bot functioning is poor. Regards, EdChem (talk) 06:37, 17 September 2010 (UTC)

Thanks for the report. Unfortunately there's not much that can be done about this; the bot operates on the principle that human WP editors are more likely to be correct than remote databases, so will never modify user-input data (beyond formatting it). Martin (Smith609 – Talk) 14:12, 17 September 2010 (UTC)
Ok, that's a sensible principle. But, could the bot either recognise which author was missing and hence to add, or (failing that) leave a talk page message that there was a problem with the reference (3 authors from doi records, 2 in current ref, and that adding the last author would appear to create a duplicate)? Adding a duplicate author seems unhelpful to me. Thanks. EdChem (talk) 14:45, 17 September 2010 (UTC)
That'll take considerable coding but I'll add it to the list of future enhancements. I wonder whether the problem is widespread enough for me to devote much time to it? Martin (Smith609 – Talk) 14:52, 17 September 2010 (UTC)
Good question, I'm sorry to say I don't have a useful answer. I only noticed the problem because I was surprised to see such a large edit from a reference bot on the hexafluorophosphate page - I did a more than 5x expansion for DYK recently, so I knew the references were fairly complete. I haven't checked if the reference was mine, but if it was I'm embarrassed to have missed an author. I guess the problem to consider is: if a paper has N authors but only n are listed in a ref, is it reasonable to assume that the ones listed are authors 1, 2, ..., n, and it is authors n+1 to N that are missing? In this case, there were 3 authors and authors 2 and 3 were shown, author 1 was missing. I have no idea how common missing authors are, nor whether the assumption being made is valid in the majority of cases. Perhaps a review of the diffs of the bot's edits would be informative? EdChem (talk) 15:18, 17 September 2010 (UTC)
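The check suggested in this thread could look something like the sketch below. It is written in Python purely for illustration and assumes the database record and the existing citation have each been reduced to an ordered list of surnames; the function names are hypothetical and this is not how the bot currently works.

```python
def missing_authors(database_surnames, cited_surnames):
    """List (position, surname) pairs present in the database but not cited.

    Positions are 1-based so they map directly onto |lastN= parameters.
    """
    cited = {name.casefold() for name in cited_surnames}
    return [
        (position, name)
        for position, name in enumerate(database_surnames, start=1)
        if name.casefold() not in cited
    ]


def is_simple_truncation(database_surnames, cited_surnames):
    """True if the cited authors are exactly authors 1..n of the database list."""
    n = len(cited_surnames)
    head = [name.casefold() for name in database_surnames[:n]]
    return head == [name.casefold() for name in cited_surnames]
```

When `is_simple_truncation` is false, appending the last database author would risk the duplicate described above, so a bot following this sketch would either insert at the reported position or leave a note for a human editor.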

Franco-Mongol alliance

How can we stop the bot from "fixing" the citations on this article? We are using a different citation style on this article, and I've had to revert the bot twice now. --Elonka 06:27, 22 September 2010 (UTC)

Can you clarify what error the bot is introducing? Its edits appear to add ISBNs, which is surely a good thing, and consolidate duplicate fields. The problem may be in that you have some book references with two |location=/|year= fields. If you need to specify two locations or dates they must be in different fields, otherwise the template won't display both of them, regardless of the citation bot. Rjwilmsi 11:50, 22 September 2010 (UTC)
The years and ISBNs are already in the References section of the article, so do not need to be duplicated in the individual in-line citations. It is unnecessary information, which also makes the notes section more difficult to read, since it adds blue links which are of limited value. Further, this article, Franco-Mongol alliance, has been the subject of intense dispute over the years, in large part because of citations. After extensive dispute resolution, mediation, and multiple arbitration cases and amendments, the article has become stable. Consensus has been achieved on citation style, so please, to avoid further disruption, can we prevent the bot from continuing to try to change citations to a non-consensus version? --Elonka 22:30, 22 September 2010 (UTC)
As it says on User:Citation bot, you could simply add {{bots|deny=Citation bot}} to the article. LeadSongDog come howl! 00:24, 23 September 2010 (UTC)
Perfect, that's what I needed, thanks! --Elonka 04:18, 23 September 2010 (UTC)

A bad interaction between AWB and Citation Bot 1

Hello Martin, your input would be welcome at the discussion here -- John of Reading (talk) 16:17, 19 September 2010 (UTC)

Assuming that User:Citation bot 1 does need to track certain articles that it has edited, we seem to be agreed that a template call would be much better than the existing category call. The template definition could then place the article into the category. I'd be interested to know who or what is making use of the tracking category, though. Also, given a templated solution there would be no need for the category, since a "what links here" of the template would do the same job. -- John of Reading (talk) 14:12, 23 September 2010 (UTC)
  Done in r188, using the new {{inconsistent citations}}, and categorized the category to Category:Wikipedia cleanup categories. Martin (Smith609 – Talk) 15:43, 23 September 2010 (UTC)
Excellent! Could you add some documentation on the template talk page to warn everyone that this template is used by the citation bot, and to explain what kinds of inconsistency it is flagging? -- John of Reading (talk) 16:01, 23 September 2010 (UTC)
Done, but it's hurried & garbled. Feel free to improve the clarity if you can! Thanks. Martin (Smith609 – Talk) 16:26, 23 September 2010 (UTC)
Thank you. -- John of Reading (talk) 17:23, 23 September 2010 (UTC)

Can we stop running the bot over articles two times?

I've reverted some of the bot's changes to articles, such as Hubert Walter, when it ran back in June - see diff from June, and now it's going through and doing the exact same changes again - see diff from today. There really isn't a need for a jstor url when the doi is there, and on FAs, the url is actually a degradation of the article, since the bot doesn't include the necessary field of "format=fee required". Once the bot's run once, shouldn't it NOT repeat the runs again? Ealdgyth - Talk 14:11, 6 August 2010 (UTC)

We can't stop the bot running on a page just because it ran before – new cites may have been added to the article. If jstor URLs were only added with |format=fee required would that be sufficient? Rjwilmsi 14:20, 6 August 2010 (UTC)
Personally, I really dislike the sea of blue that all those links add. When you get something with a LOT of links, it makes it difficult to read and figure out. Adding the parameter would help, but chances are good I'd just revert the additions of the urls. Any way you could NOT add urls to things that have dois? Ealdgyth - Talk 14:21, 6 August 2010 (UTC)
The URL is valuable if it provides a link to free access to the full version of the text – the DOI link doesn't necessarily do this. I don't know what the Jstor URLs provide (not my bot). Rjwilmsi 14:27, 6 August 2010 (UTC)
JSTOR linkages are, for many users, free (beer) access even when the DOI resolves to a for-fee site. They also have very high quality, well structured bibliographic metadata well suited to use by the bot. If we didn't identify the JSTOR id, the bot couldn't reliably use that data. The stylistic "sea of blue" question seems largely irrelevant to the bot, at most it affects the template and user can always change prefs to display links differently if blue is troublesome to him. LeadSongDog come howl! 17:52, 26 August 2010 (UTC)

Then is there any way to signal the bot not to edit a particular citation, to prevent it from repeatedly making the same incorrection? I don't necessarily want to stop it from editing the entire article like with {{bots|deny=Citation bot}}, but practically all of the JSTOR metadata has the same problem with book reviews (listing the author of a reviewed work as a co-author of the review). Thanks cab (call) 04:39, 29 September 2010 (UTC)

Suggestion

Hi again - I am one of the really lazy editors and would like this feature: add a sensible name= to a ref where it's missing, such as

<ref>{{cite pmid|12716040}}</ref> ===> <ref name=Benjamin_2003>{{cite pmid|12716040}}</ref>

If Benjamin_2003 weren't unique make it Benjamin_2003_pmid. Is it in the scope and doable for this bot? Dreaming even further, if I were to write {{refcite pmid|12716040}} could it save me the hassle of typing the ref tags altogether? Richiez (talk) 20:19, 20 September 2010 (UTC)

It's certainly possible and is something I've thought of in the past. I'd have to file a bot request for this proposal, and it's worth ironing out the best approach before I do that. Should the bot add names to any ref without a name? Should it combine refs with duplicate content under one of the same name? How should it select a name? Would it suffice to go Smith2001, using Smith2001a if Smith2001 is not available, and incrementing the letter until successful?
The refcite option would also be possible; the simplest implementation would be for the bot to replace "refcite pmid" with the ref tags and markup. Any further thoughts on the implementation are welcome; if there is no opposition I'll eventually file a bot request. Martin (Smith609 – Talk) 21:45, 20 September 2010 (UTC)
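
A minimal sketch of the naming scheme floated above (Smith2001, then Smith2001a, Smith2001b, and so on until a free name is found). The function name `pick_ref_name` and the idea of passing in a set of names already in use are assumptions made for illustration; nothing here is taken from the bot's source.

```python
import string


def pick_ref_name(surname, year, taken):
    """Return an unused reference name such as 'Smith2001' or 'Smith2001a'."""
    base = f"{surname}{year}"
    if base not in taken:
        return base
    # Try Smith2001a, Smith2001b, ... until a free name is found.
    for letter in string.ascii_lowercase:
        candidate = base + letter
        if candidate not in taken:
            return candidate
    raise ValueError(f"no free reference name for {base}")


existing = {"Smith2001", "Smith2001a"}
print(pick_ref_name("Smith", 2001, existing))  # Smith2001b
print(pick_ref_name("Jones", 1999, existing))  # Jones1999
```

The sketch gives up after 26 suffixes; whether that limit matters in practice is an open question for the bot request.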
I'd prefer it if reference names were only added to repeated citations. Adding a name when a citation is only used once just adds pointless bulk to articles. — Cheers, JackLee talk 07:13, 21 September 2010 (UTC)
I agree that the data is less helpful in "Cite journal" references; however with "cite pmid" references the extra bulk makes it easier for editors to identify the citation, as the author's name is not otherwise visible in the source code. Did you only have the first scenario in mind with your comment? Martin (Smith609 – Talk) 18:01, 21 September 2010 (UTC)
Any systematic refname would be helpful. A little extra wikitext bulk for ones that as yet have only one usage is worth it to encourage editors to use existing cited references. There's rarely need for hundreds of refs, this would encourage reuse. LeadSongDog come howl! 18:23, 21 September 2010 (UTC)
Any naming scheme like Smith2010c or Smith_2010c would be fine, it should be added to all cite pmid and similar for readability and easy reusability. The simple refcite implementation would be certainly good enough for me. Combining refs automatically would be very nice though may raise a few implementation concerns, like what happens if the same source is defined once as cite pmid and another time as a filled out cite journal template.. which one does the bot choose? Richiez (talk) 22:55, 21 September 2010 (UTC)
Prior experiments have convinced me that the bot should only combine citations if their parameters are identical (with the sole exception of whitespace). I'll propose naming all references to cite pmid, cite doi and family as a new bot request imminently; with a separate request for replacing "{{ref pmid|1234}}" with <ref name=Smith2010>{{cite pmid|1234}}</ref> and post here when the requests are ready for comment. Martin (Smith609 – Talk) 15:49, 23 September 2010 (UTC)
I'd very much prefer to see it replaced with <ref name=Smith2010>{{cite journal|pmid=1234}}</ref> so that we don't encourage the use of {{cite pmid}}, for all the reasons that I've previously elaborated on. LeadSongDog come howl! 16:24, 24 September 2010 (UTC)
As long as cite journal gets expanded to a quarterpage long monster making text editing a pain I am all for encouraging cite pmid. Maybe ref name could hold additional information like journal abbrev to make you happier? <ref name=Smith2010JVirolMethods>{{cite pmid|1234}}</ref> Richiez (talk) 18:05, 24 September 2010 (UTC)
No, short of getting (at least) semi-prot or an anti-vandal watch-bot for all the template space subpages, {{cite pmid}}, {{cite pmc}}, {{cite doi}} and company simply shouldn't be trusted. Now if citation bot were to subst them once the data was all validated, my view might be different, but that would still result in the "quarterpage long monster". Of course, there's no real need to use multiline in the wikitext, it could just as readily be collapsed. LeadSongDog come howl! 19:06, 24 September 2010 (UTC)
  • I've never actually encountered vandalism in the cite xx template space. How prevalent is it? Can you point to any examples? Martin (Smith609 – Talk) 20:06, 24 September 2010 (UTC)
    Not seen yet, but so far there are still less than 2000 instances of cite pmid. But beans, and all.LeadSongDog come howl! 06:31, 25 September 2010 (UTC)
    I would view {{cite pmid}} as fixed format and editable by bot only, if someone wants a different format there might be specialised templates or special arguments to this template but if it can be done I would certainly not complain if the template were protected or regularly cleaned by bots. Richiez (talk) 09:44, 25 September 2010 (UTC)
    But of course that would mean there is no way to correct erroneous data. Perhaps editprotected? LeadSongDog come howl! 19:00, 1 October 2010 (UTC)

Amusingly (?) I was thinking about a new family of cite templates along the lines of {{Ref xxxx|name="bloggs 2009"|group="notes"|all the other stuff}}. Reason? Removing the need to learn mark-up, apart from templates (which you need anyway) to write wiki source. Rich Farmbrough, 11:56, 2 October 2010 (UTC).

While {{cite pmid}}, {{Cite pmc}}, {{Cite doi}} are all protected, anti-vandaling the sub-space fits in with some other ideas I have been having. Most of the sub-space is individually not widely transcluded, and no more of a target than ordinary pages, however. Rich Farmbrough, 12:00, 2 October 2010 (UTC).
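
The rule stated earlier in this thread, that citations should only be combined when their parameters are identical apart from whitespace, can be sketched as below. The helper names `parse_params` and `same_citation` are hypothetical, and the parser assumes a flat template with no nested templates or piped links inside parameter values.

```python
def parse_params(template):
    """Split a flat {{template | key = value}} call into (name, parameters)."""
    inner = template.strip()[2:-2]  # drop the surrounding {{ and }}
    name, *parts = inner.split("|")
    params = {}
    for part in parts:
        key, _, value = part.partition("=")
        # Collapse runs of whitespace so that spacing differences are ignored.
        params[key.strip()] = " ".join(value.split())
    return name.strip().lower(), params


def same_citation(first, second):
    """True if two citations differ only in whitespace or parameter order."""
    return parse_params(first) == parse_params(second)


a = "{{cite journal | last1 = Heller | year = 2006 }}"
b = "{{cite journal|last1=Heller|year=2006}}"
c = "{{cite journal|last1=Heller|year=2007}}"
print(same_citation(a, b))  # True
print(same_citation(a, c))  # False
```

Note that this sketch also treats a different parameter order as identical, which is a slightly looser test than the one described in the thread.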

Best way to specify et al. in preparation for citation bot?

Consider the following reference as I make edit 1 -> 2 then I call the citation bot, which does 2 -> 3. The end result leaves an et al. outstanding that I have to clean up manually. Is there a better way for me to make edit 1 -> 2 (given that I don't know at that time whether the citation bot will be able to add additional authors) so that there's no cleanup task left? Or could citation bot remove et al. when it adds authors?

  1. <ref name="Heller">A.C. Heller ''et al'' 2006 Surgery of the mind and mood: a mosaic of issues in time and evolution. ''Neurosurgery'' '''59'''(4): 720-40</ref>
  2. <ref name="Heller">{{Cite journal | last1 = Heller | first1 = A.C. ''et al.'' | year = 2006 | title = Surgery of the mind and mood: a mosaic of issues in time and evolution | url = | journal = Neurosurgery | volume = 59 | issue = 4| pages = 720–40 }}</ref>
  3. <ref name="Heller">{{Cite journal | doi = 10.1227/01.NEU.0000240227.72514.27 | last1 = Heller | first1 = A.C. ''et al.'' | last2 = Amar | year = 2006 | first2 = Arun P. | last3 = Liu | first3 = Charles Y. | last4 = Apuzzo | first4 = Michael L.J. | title = Surgery of the mind and mood: a mosaic of issues in time and evolution | url = | journal = Neurosurgery | volume = 59 | issue = 4| pages = 720–40 }}</ref>

Thanks Rjwilmsi 15:11, 1 October 2010 (UTC)

There is a better way: Specify |display-authors=1 instead of typing "et al". Then authors added by the bot won't be displayed to viewers, but will be available to user plugins that extract metadata. Martin (Smith609 – Talk) 16:32, 1 October 2010 (UTC)
{{Cite journal | doi = 10.1227/01.NEU.0000240227.72514.27 | last1 = Heller | first1 = A.C. 
| last2 = Amar | year = 2006 | first2 = Arun P. | last3 = Liu | first3 = Charles Y. 
| last4 = Apuzzo | first4 = Michael L.J.
 | title = Surgery of the mind and mood: a mosaic of issues in time and evolution
 | url = | journal = Neurosurgery | volume = 59 | issue = 4| pages = 720–40
 |display-authors=1 }}
I'm not necessarily looking to suppress the display of additional authors, rather to not take away information from the existing unformatted citation: I'm not sure that |display-authors= solves my problem, because at the time of my edit with only one known author I can only set |last1= and |first1= in which case |display-authors=1 won't show et al. unless the citation bot later adds more authors, which I can't be sure about at the time of my edit. Rjwilmsi 03:04, 2 October 2010 (UTC)
Oh, I think I understand. Et al handling has been outstanding on my to-do list for a while so improvements are in the works. However unless you are willing for the et al to just not display for one minute whilst you run the bot, the manual removal afterwards is the only option, I'm afraid. Martin (Smith609 – Talk) 13:15, 2 October 2010 (UTC)
It's the uncertainty whether the bot will add the extra authors that's the problem, so I will have to stick with removal afterwards where necessary. May I ask for your TODO list to include adding logic along the lines of "if there's an et al. and I complete the authors, remove the et al."? Rjwilmsi 14:40, 2 October 2010 (UTC)
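
The requested logic ("if there's an et al. and I complete the authors, remove the et al.") might look roughly like this. It is only a sketch under stated assumptions: the function name `strip_et_al` and the `authors_added` flag are hypothetical, and the citation is represented as a plain dictionary of parameters.

```python
import re

# Matches a trailing "et al", optionally italicised with '' and with full stops.
ET_AL = re.compile(r"\s*,?\s*'*\bet\.?\s+al\.?'*\s*$", re.IGNORECASE)


def strip_et_al(params, authors_added):
    """Remove a trailing 'et al.' from author fields once authors were added."""
    if not authors_added:
        return dict(params)  # nothing was added, so keep the marker
    cleaned = {}
    for key, value in params.items():
        if re.fullmatch(r"(first|last|author)\d*", key):
            value = ET_AL.sub("", value)
        cleaned[key] = value
    return cleaned


before = {"last1": "Heller", "first1": "A.C. ''et al.''",
          "last2": "Amar", "first2": "Arun P."}
print(strip_et_al(before, authors_added=True)["first1"])  # A.C.
```

Keeping the marker when no authors were added addresses the concern above that information should not be taken away from the existing citation.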

I have something to confess while being audacious

Without permission, since about Sep. '09, I had been 'bot denying' CB proper from the cite journal templates I initiated; I am practically OCD over having the authors' names remain seamlessly and neatly presented from the 'author=' parameter, and protecting from the eternally intermittent and inconveniently hellish author's initial bugs that nobody has time to keep having to edit-undo. Now, this affected hundreds of templates I watched and/or initiated; but before blocking, I made sure all had complete dois and pmids, I gave them urls/pmcs, etc., and I watched over them like a mommy. I apologize; and an editor has recently removed those via AWB. Now, here's my audacity: May I put the block back to make my Wikilife improve? I am not taking what I did lightly... I love Wikipedia! Rcej (Robert) - talk 08:02, 3 October 2010 (UTC)

I think you need to work with the citation bot rather than against it. What specifically are the author name formatting problems? Rjwilmsi 08:50, 3 October 2010 (UTC)
Taking as an example Template:Cite doi/10.1055.2Fs-2004-815608 it's currently:
{{Cite journal | author = Kudoh S, Keicho N | month = Oct | title = Diffuse panbronchiolitis | journal = Seminars in respiratory and critical care medicine | volume = 24 | issue = 5 | pages = 607–618 | year = 2003 | pmid = 16088577 | doi = 10.1055/s-2004-815608 }}
Kudoh S, Keicho N (2003). "Diffuse panbronchiolitis". Seminars in respiratory and critical care medicine. 24 (5): 607–618. doi:10.1055/s-2004-815608. PMID 16088577.
If I adjusted it to use |last1= etc. (the {{cite journal}} preferred format): {{Cite journal | last1 = Kudoh | first1= S |last2= Keicho | first2=N | month = Oct | title = Diffuse panbronchiolitis | journal = Seminars in respiratory and critical care medicine | volume = 24 | issue = 5 | pages = 607–618 | year = 2003 | pmid = 16088577 | doi = 10.1055/s-2004-815608 }}
Kudoh, S; Keicho, N (2003). "Diffuse panbronchiolitis". Seminars in respiratory and critical care medicine. 24 (5): 607–618. doi:10.1055/s-2004-815608. PMID 16088577.
would that be okay by you? Rjwilmsi 09:08, 3 October 2010 (UTC)
I feel badly for doing that; it's certainly okay. :) If only the bot was that consistent. The problem of fouling up authors' names has been pretty much a constant for over a year! Even today, I had to fix 6 templates the bot messed up, for the simple reason the block was removed. I sorta liked my arrogance... I had more time to work on articles! Rcej (Robert) - talk 06:00, 4 October 2010 (UTC)

Is there a good reason for using initials rather than spelling out the authors' names as they are listed in the journal paper (Shoji Kudoh and Naoto Keicho in this example)? Spelling the names out avoids ambiguities, helps avoid errors of incorrect abbreviation, and helps others make appropriate cross-links to articles on the authors. Unlike in a printed journal, we don't have any particular space limitation that would justify the abbreviation. —David Eppstein (talk) 06:16, 4 October 2010 (UTC)

Some style manuals, such as APA style, call for using author initials. Originally Wikipedia cite xxx templates were modeled after APA style. The Citation template isn't modeled after anything, people just throw in whatever they want. Jc3s5h (talk) 14:12, 4 October 2010 (UTC)
Sensibly or not, we strive to be consistent where possible. As a general rule on PubMed, authors' full names are not always available, but their initials are. If we use full names in our articles, it will be quite inconsistent from one citation to the next. If we use initials then consistency is achievable. Searching PubMed for names of the form "Adams John Quincy" is less likely to match than a search for "Adams JQ". Whether or not the PubMed database entry has the given names, the initials will match. If there is a stable link to the article (doi, pmid, pii, etc) there is no real ambiguity introduced by abbreviating the name with initials. LeadSongDog come howl! 16:46, 4 October 2010 (UTC)
If Rcej can provide a list of sub templates/pages affected I will run through them to move the authors into |last1= etc. I won't be changing from initials to first names or vice versa, I see that as a separate consideration. Rjwilmsi 17:14, 4 October 2010 (UTC)
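
Moving a combined |author= value into numbered |last1=/|first1= parameters, as proposed above, could be sketched like this. The function name `split_vancouver_authors` is hypothetical, and the sketch assumes the "Surname Initials, Surname Initials" layout used in the Kudoh and Keicho example; it does not expand initials into full names.

```python
def split_vancouver_authors(author_field):
    """Turn 'Kudoh S, Keicho N' into numbered last/first parameters."""
    params = {}
    names = [name.strip() for name in author_field.split(",") if name.strip()]
    for number, name in enumerate(names, start=1):
        # Assume the final space-separated token holds the initials.
        last, _, initials = name.rpartition(" ")
        if not last:
            # Single token: treat it as a surname with no initials.
            last, initials = initials, ""
        params[f"last{number}"] = last
        if initials:
            params[f"first{number}"] = initials
    return params


print(split_vancouver_authors("Kudoh S, Keicho N"))
# {'last1': 'Kudoh', 'first1': 'S', 'last2': 'Keicho', 'first2': 'N'}
```

Names written in other layouts (for example "Shoji Kudoh") would be split incorrectly by this sketch, so any real run would need a manual check.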

Use in other wikis

Hi!

Is it possible to run this bot in other wikis? What would be necessary in order to use it at Portuguese Wikipedia? Helder (talk) 19:14, 12 October 2010 (UTC)

If someone is willing to undertake the necessary bureaucratic steps at the relevant wikis, I'd be happy to set the bot up to operate in different languages. Martin (Smith609 – Talk) 13:37, 12 October 2010 (UTC)

Unintelligible bot message

Any clues as to what this message might mean? It makes no sense at all, so I cannot guess what needs fixing.

*{{Cite book |first=Jane |last=Cumberlidge |title=Inland Waterways of Great Britain 8th Ed. |publisher=Imray Laurie Norie and Wilson |year=2009 |isbn=978-1-84623-010-3 |postscript=<!-- Bot inserted parameter. Either remove it; or change its value to "." for the cite to end in a ".", as necessary. -->[[Category:Articles with inconsistent citation formats]]}}

Bob1960evens (talk) 12:52, 9 October 2010 (UTC)

It means that in the article there are some citations that will display with a period at the end, and some that won't, so the bot is suggesting somebody might like to decide which is appropriate and make them all consistent. Rjwilmsi 13:04, 9 October 2010 (UTC)
If you can think of a better wording for the message, please do let me know! Martin (Smith609 – Talk) 22:27, 9 October 2010 (UTC)
Thanks for the info. The "8th Ed." displayed as "8th Ed.." so I changed it to "(8th Ed)" which displays as "(8th Ed)." but is the message suggesting that adding "postscript=." would also solve the problem? Bob1960evens (talk) 09:24, 13 October 2010 (UTC)

Not always finding available PMIDs

I wanted to raise this as a discussion point instead of simply filing a bug report in case there's more detail than meets the eye. In this edit I was able to find the PMID of the article by Googling the title. When the Citation bot was run on the article before it didn't find this entry though. Is there a good reason, otherwise it would appear the bot is missing some easy wins? Note, this is not an isolated case. Thanks Rjwilmsi 10:00, 18 October 2010 (UTC)

Wrong ISBN

The article Lambley, Northumberland has had the ISBN added but it is the wrong one. I have the 1966 reprint of that book – the British Library has the 1965 first edition (and the 1973 edition). The 1966 version does not have an ISBN number so I left that field blank. The bot has added isbn = 0709135491 which is the 1973 edition, and since that edition has more pages any page numbers are likely to be wrong.

Is the bot meant to be doing that? And if so, what do I do about it? I can only quote from the books that I have or have access to.

The British library catalogue entries are: 1966 and 1973.

Twiceuponatime (talk) 09:27, 7 January 2011 (UTC)

Probably the Google Books search returned the later edition when it should not have. If you set isbn=<!-- no ISBN available--> then the bot won't add it back. In addition, maybe you could check WorldCat and provide |oclc= for the edition you have, so that users at least have that link. Rjwilmsi 12:20, 7 January 2011 (UTC)
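
One plausible reading of why the comment placeholder works is that the bot only fills parameters that are absent or completely empty, so any non-empty value, including an HTML comment, is left alone. The sketch below illustrates that assumption only; the function name `may_fill` is hypothetical and the bot's real test may differ.

```python
def may_fill(params, name):
    """Return True only if the parameter is absent or completely empty."""
    value = params.get(name)
    if value is None:
        return True
    return value.strip() == ""


print(may_fill({"isbn": ""}, "isbn"))                           # True
print(may_fill({"isbn": "<!-- no ISBN available-->"}, "isbn"))  # False
```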

{{resolved}}

Discussion ongoing

Editors here may find Wikipedia_talk:Citing_sources/example_style#why_not_standardize_on_one_format.3F to be of interest. LeadSongDog come howl! 20:57, 13 January 2011 (UTC) {{resolved}}

Question

Why does the bot not recognize a perfectly valid doi in this article? Ruslik_Zero 18:37, 15 November 2010 (UTC)

There are occasional errors in the dois, but at the moment, the bot doesn't seem to be working on any dois (see Special:Contributions/Citation_bot_2). The operator is on a wikibreak, but I've left them a note about it. SmartSE (talk) 00:27, 18 November 2010 (UTC)
All fixed. Sorry about the inconvenience. Incidentally, did all the "broken dois" get fixed? I can't find many. Martin (Smith609 – Talk) 02:42, 20 November 2010 (UTC)

{{resolved}}

vcite journal support?

Can the bot be enhanced to support {{vcite journal}}? I'm not a fan of the reduced metadata from this template, but the Vancouver style is an established format e.g. Wikipedia:Manual_of_Style_(medicine-related_articles)#Citing_medical_sources. Thanks Rjwilmsi 10:33, 6 December 2010 (UTC)

I put substantial effort into unifying the citation templates to use a single input/output format, so that the bot could act on them without cross-compatibility issues. The vcite is a step in the opposite direction and I envisage it as a potential source of maintenance headaches, therefore it's not something that I personally feel to be a good investment of my time.
However, the bot is open source, and I would gladly welcome any source code contributions from other editors who wish to provide this functionality. Martin (Smith609 – Talk) 14:50, 6 December 2010 (UTC)

{{resolved}}

Listing editors as authors

The Citation bot listed two editors as authors, in addition to the previously mentioned problems of omitting the pmid and issue fields on the first run, omitting the month field, and specifying just the first page in the pages field, for the citation Template:Cite_doi/10.1056.2FNEJMra0906948. You can review the changes and history to see the updates.

Whywhenwhohow (talk) 19:04, 21 December 2010 (UTC)

Should be {{resolved}} in r492. Martin (Smith609 – Talk) 15:45, 11 February 2011 (UTC)

Duplicate authors

The Citation bot added the first author twice (as author 1 and author 6) and left out the issue and name of the journal for Template:Cite doi/10.1002.2F14651858.CD005102.pub2. Here is the citation offered with the journal article at [31].

Citation: Nield L, Summerbell CD, Hooper L, Whittaker V, Moore H. Dietary advice for the prevention of type 2 diabetes mellitus in adults. Cochrane Database of Systematic Reviews 2008, Issue 3. Art. No.: CD005102. DOI: 10.1002/14651858.CD005102.pub2.

and here is the generated citation

  • Nield, L.; Summerbell, C. D.; Hooper, L.; Whittaker, V.; Moore, H.; Nield, L. (2008). Nield, Lucie (ed.). "Dietary advice for the prevention of type 2 diabetes mellitus in adults". The Cochrane Library (3): CD005102. doi:10.1002/14651858.CD005102.pub2. PMID 18646120.

Whywhenwhohow (talk) 19:47, 21 December 2010 (UTC)

In Template:Cite doi/10.1002.2F14651858.CD006424.pub2, the Citation bot duplicated author 2 as author 5.

Whywhenwhohow (talk) 20:17, 21 December 2010 (UTC)

{{resolved}}: the duplicate authors were listed twice, the second time as editors. They will now be marked as such. Martin (Smith609 – Talk) 15:34, 11 February 2011 (UTC)

Naming refs (which aren't reused)

What is the point, if any, of this edit? The bot seems to have named several refs that were used only once in the article. They seem to have no particular connection to PMIDs or DOI numbers. I don't see where this activity by the bot is described, and I don't see the usefulness of an edit like this. Thanks for any clues, — JohnFromPinckney (talk) 07:53, 3 January 2011 (UTC)

… and here at Metropolitan Opera (?) -- Michael Bednarek (talk) 09:52, 3 January 2011 (UTC)
This was discussed at Wikipedia:Bots/Requests_for_approval/Citation_bot_6; the current behaviour reflects the behaviour during the trial. If there is consensus to do so (as it seems that there might be), this can be restricted to duplicate references and those that don't list full details (i.e. the cite doi family). Please establish this (or otherwise) below and I'll update the code. Martin (Smith609 – Talk) 13:15, 3 January 2011 (UTC)
Thanks for the explanation. While I find the consolidation of multiple identical references into named refs highly desirable, I think the adding of names to single-use references is a bit unnecessary – especially if those are the only edits in an article – and even confusing: when I edit an article, I expect named refs to be used somewhere else and then I'm baffled when they're not. -- Michael Bednarek (talk) 13:37, 3 January 2011 (UTC)
Yes, thanks for replying. I don't see that the approval at BRFA took much consideration of whether adding names to one-use refs was desirable or disruptive; it looks to me as though exactly one guy commented, and that was mostly about the format. Further, the rationale for the naming is to help clarify refs made using the opaque cite pmid, cite jstor, etc., templates, which seems a fine idea to me, now that I've seen the request and explanation. However, the edits pointed to above suggest the bot is naming any and all cite template refs. For the cite news, cite web, etc. usages, one-use refs don't benefit from naming and are arguably less clear to editors than without naming. So, just to be formally clear, I think that adding names to one-use refs that don't need them is unwelcome and borders on being disruptive; please refine the bot code.— JohnFromPinckney (talk) 02:00, 4 January 2011 (UTC)
Okie dokie, that sounds like consensus to me. I don't believe that the bot is running automatically at the moment, so I'll refine the code according to these comments before I set it going again. Thanks all for taking the time to chime in. Martin (Smith609 – Talk) 02:51, 4 January 2011 (UTC)

Why has the discussion moved back here? People went to the bugs page because the header at the top of the page directs them to go there on threat of being ignored. The discussion has now lost all the previous comments (mostly complaints) without even a link back to it. The argument that this was discussed at Bot Approvals is spurious. There is no discussion of this issue there, it is perfectly possible that the approvers were not aware of this implication. In any case, a bot approval by the small numbers at BAG does not trump the views of the wider Wikipedia community on whether it is actually useful or not. BAG are focused on whether the bot has bugs or will cause widespread disruption. I think you are getting a clear message from editors that unnecessary clutter is not welcome, and singleton references are clutter. SpinningSpark 16:21, 3 January 2011 (UTC)

I'd agree the singleton refnames are at best less than optimal, though in the case of list-defined references I can see that there's some argument to be made for them as a stepping-stone to adding inline cites. I don't see any need for them in mature articles.LeadSongDog come howl! 22:58, 4 January 2011 (UTC)

{{resolved}}

Editing text outwith citation templates

Status: {{resolved}}
Reported by: Martin (Smith609 – Talk) 18:35, 30 January 2011 (UTC)
Type of bug: Deleterious
What happens: Text outside of templates is modified: here, the removal of spaces after "Murray ".
What should happen: No edit to article text
Relevant diffs/links: http://en.wikipedia.org/w/index.php?title=Royal_Exchange,_Manchester&diff=prev&oldid=410999888
Replication instructions: Problematic article copied to User:DOI_bot/Zandbox
We can't proceed until: A specific edit to the bot's code is requested below.
Requested action from maintainer: Don't replace spaces in article text.


Accented characters in citewatch.php

Status: {{resolved}}
Reported by: Martin (Smith609 – Talk) 14:18, 31 January 2011 (UTC)
Type of bug: Cosmetic
What happens: Unicode characters replaced by ?s
What should happen: Accents preserved
Relevant diffs/links: Template:Cite_doi/10.1038.2Fnphys1334; Template:Cite_doi/10.1038.2Fnphys1203
We can't proceed until: A specific edit to the bot's code is requested below.
Requested action from maintainer: UTF-8 support
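
To illustrate how accented characters can turn into "?" or "�", here is a short sketch. The sample name "Müller" is made up for the example and the report does not say which of these two mechanisms affected citewatch.php; the point is only that both symptoms arise when text is not handled as UTF-8 end to end.

```python
name = "Müller"  # hypothetical author name containing an accented character

# Forcing the text into ASCII replaces the accented letter with '?'.
lossy = name.encode("ascii", errors="replace").decode("ascii")

# Reading Latin-1 bytes as if they were UTF-8 yields the replacement character.
garbled = name.encode("latin-1").decode("utf-8", errors="replace")

# Encoding and decoding consistently as UTF-8 preserves the name.
intact = name.encode("utf-8").decode("utf-8")

print(lossy)    # M?ller
print(garbled)  # M�ller
print(intact)   # Müller
```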


Improvement

Status: {{resolved}}
Reported by: Martin (Smith609 – Talk) 15:12, 31 January 2011 (UTC)
Type of bug: Improvement
What happens: Some references are referred to as ref_a, ref_autogenerated, etc.
What should happen: Refer to them as "Barnes 1992", etc.
Relevant diffs/links: http://en.wikipedia.org/w/index.php?title=Pallaviciniites&diff=prev&oldid=411173545
Replication instructions: n/a
We can't proceed until: A specific edit to the bot's code is requested below.
Requested action from maintainer: Implement


Suggestions of any other meaningless reference names that could be replaced with a semantically-meaningful name are very welcome! Martin (Smith609 – Talk) 02:23, 11 February 2011 (UTC)

Category:Science citation templates

Would it be possible to make it so this is more... organized?

Something like a category tree that would resemble

And so on? Headbomb {talk / contribs / physics / books} 11:22, 16 January 2011 (UTC)

Feel free to move this at some other more relevant place if there is one. Headbomb {talk / contribs / physics / books} 11:23, 16 January 2011 (UTC)
Does Special:PrefixIndex/Template:Cite doi/ help? Martin (Smith609 – Talk) 16:54, 21 January 2011 (UTC)
Not quite no. Seems like it would be a trivial thing to modify {{Cite doi/subpage}}, {{Cite jstor/subpage}}, {{Cite pmid/subpage}} and {{Cite pmc/subpage}} to handle this. Would you object if I made the changes?Headbomb {talk / contribs / physics / books} 21:08, 21 January 2011 (UTC)
Do go ahead, although I'm still not quite sure what your objective is here. Martin (Smith609 – Talk) 02:24, 22 January 2011 (UTC)

{{resolved}}

Alright, I made it work as intended. Cleaned up a bunch of templates at weird locations, as well as a few which were used for things they couldn't be used for. Made another BOTREQ to cleanup a few others (see Wikipedia:Bot requests#Cleanup {{cite doi}} templates). Headbomb {talk / contribs / physics / books} 04:45, 9 February 2011 (UTC)
Category:Cite doi templates is now sanely browsable, and the obvious weird crap/typos have been sent to speedy. Should make things much easier to maintain now. Headbomb {talk / contribs / physics / books} 09:24, 9 February 2011 (UTC)

{{resolved}}

Cochrane Database oddity.

Status: {{resolved}} in r249
Reported by: Headbomb {talk / contribs / physics / books} 07:20, 10 February 2011 (UTC)
Type of bug: Inconvenience
What happens: When checking for dois from the Cochrane Database of Systematic Reviews, the bot retrieves...
|last1=Smith |first1=A.B.
|last2=Thomson |first2=C.D.
|last3=Jones |first3=E.F.
...

and very, very often, if not always, it will add a

|lastX=Foobar |firstX=Barfoo

where this last author has already been mentioned.

Relevant diffs/links: See for instance [32] [Candy, B. is duplicated].
Replication instructions: For a more complete list of these problematic citations, check [33] [{{Cite doi/10.1002.2F14651858.CD000007}} until {{Cite doi/10.1002.2F14651858.CD008272.pub2}}]. Occasionally, too, the "duplicate" entry seems to be better than the "first" entry, in that the initials of the author are correct in the "duplicate" but not the "first" [34].
We can't proceed until: Bot operator's feedback on what is feasible


Investigating. Martin (Smith609 – Talk) 02:39, 11 February 2011 (UTC)
{{resolved}}: the final author was listed as an editor of the paper; I didn't know that this information was kept in CrossRef! The script will now list editors in that capacity. Martin (Smith609 – Talk) 02:55, 11 February 2011 (UTC)
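
The fix described here, listing editors in that capacity rather than as extra authors, can be sketched as follows. The record layout (dictionaries with a "role" key) and the function names are assumptions for illustration, not CrossRef's actual response format or the bot's code; the names come from the Nield et al. example in the "Duplicate authors" report above.

```python
def split_contributors(contributors):
    """Separate a mixed contributor list into authors and editors."""
    authors, editors = [], []
    for person in contributors:
        target = editors if person.get("role") == "editor" else authors
        target.append((person["surname"], person["given"]))
    return authors, editors


def to_params(authors, editors):
    """Build citation parameters, keeping editors out of the author list."""
    params = {}
    for n, (surname, given) in enumerate(authors, start=1):
        params[f"last{n}"] = surname
        params[f"first{n}"] = given
    for n, (surname, given) in enumerate(editors, start=1):
        params[f"editor{n}-last"] = surname
        params[f"editor{n}-first"] = given
    return params


records = [
    {"surname": "Nield", "given": "L.", "role": "author"},
    {"surname": "Summerbell", "given": "C. D.", "role": "author"},
    {"surname": "Hooper", "given": "L.", "role": "author"},
    {"surname": "Whittaker", "given": "V.", "role": "author"},
    {"surname": "Moore", "given": "H.", "role": "author"},
    {"surname": "Nield", "given": "Lucie", "role": "editor"},
]
params = to_params(*split_contributors(records))
print("last6" in params, params["editor1-last"])  # False Nield
```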

Withdrawn papers

That raises an interesting point. Pubmed marks the first of that list, at PMID 19588315, as having been withdrawn. Shouldn't citationbot find and mark these somehow? Anyhow, looking at the fairly long revision history for the Candy et al citation shows that subsequent edits by various revisions of the bot have variously added the author, last, and first parameters. Clearly the bot should not be mixing these. If the human input uses last and first then the bot should follow that, and vice versa. Unfortunately the implication of this is that articles using {{cite doi}} will not have consistent citation format, but will mix them based on whatever was first used for each of the dois cited. Perhaps at GAR or FAR the templates should be subst'd after vetting? LeadSongDog come howl! 14:57, 10 February 2011 (UTC)
The Cite Doi templates all use a consistent reference format: Template:Cite doi#Formatting. The bot won't edit titles that already exist, in case an editor has manually checked or amended them. The prefix "Withdrawn" only exists in the PubMed database, not in CrossRef's (which the script checks first); so it would take major re-factoring to accommodate this rare occurrence. If an article relies on a withdrawn source to support its statements, then the bot silently adding the word "withdrawn" to the reference doesn't really address the problem of improper information in the article. I'm not sure that the advantage in the bot doing this is large enough to be worthwhile. Martin (Smith609 – Talk) 02:39, 11 February 2011 (UTC)
No need for it to be added silently. It could just as well add {{update}} to the citation at the same time as marking the title as "Withdrawn: Dewey Wins!". That will draw human attention to fixing the citing statements.LeadSongDog come howl! 14:35, 11 February 2011 (UTC)
It certainly sounds like a worthwhile task, but I'm afraid that it's beyond the scope of this bot. Hopefully you'll find someone interested at Wikipedia:Bot_requests. Martin (Smith609 – Talk) 15:32, 11 February 2011 (UTC)
Thank you, that sounds like a plan. Raised it there. LeadSongDog come howl! 16:35, 11 February 2011 (UTC)

{{resolved}}

Confusing JSTOR with PMID

Status: {{resolved}}
Reported by: Martin (Smith609 – Talk) 15:14, 11 February 2011 (UTC)
Type of bug: Deleterious
What happens: The bot mistakes a JSTOR catalogue number for a PMID, and then expands from that source.
What should happen: JSTOR, not PubMed, used to expand the citation further
Relevant diffs/links: http://en.wikipedia.org/w/index.php?title=Template%3ACite_doi%2F10.2307.2F56388&action=historysubmit&diff=320361535&oldid=308217923
We can't proceed until: Bot operator's feedback on what is feasible


Could not replicate; presume fixed. Martin (Smith609 – Talk) 21:10, 12 February 2011 (UTC)

Incomplete citations, problems with accented characters

Newly generated citations are missing various fields including pmid, issue, and month, have incomplete page ranges, and sometimes invalid characters appear instead of accented characters.

For example, see

This one demonstrates the problem with accented characters where they appear as a question mark inside a diamond – �

Sometimes the Citation bot will add missing fields when it is run a second time. For example, see

And for Template:Cite doi/10.1097.2FAOG.0b013e3181f680c8 the Citation Bot added an invalid last1 field with a value of &Na;

Whywhenwhohow (talk) 16:13, 3 December 2010 (UTC)

Thanks for the report and for listing examples. I'll look into this when I get an opportunity. Martin (Smith609 – Talk) 16:58, 3 December 2010 (UTC)
A perhaps related issue is that a pmid-based lookup does not catch some DOIs that are then found by a second run e.g. diff. Thanks Rjwilmsi 21:09, 3 December 2010 (UTC)
The citations were created using cite pmid and the Citation Bot created redirects to cite doi and then lost the original pmid specified. It would be useful if the process of creating cite doi from cite pmid did not lose the specified data. Whywhenwhohow (talk) 21:57, 4 December 2010 (UTC)
Most of these problems should be fixed, although I'll have to double-check. Let me know if you can replicate any (in r242 onwards). Martin (Smith609 – Talk) 18:38, 30 January 2011 (UTC)
Author accents problem not fixed – example. Rjwilmsi 16:44, 11 February 2011 (UTC)
Thanks for the report! I've merged it into the bug report below. Martin (Smith609 – Talk) 16:58, 11 February 2011 (UTC)
Still malfunctioning: http://en.wikipedia.org/w/index.php?title=Template:Cite_doi/10.1016.2Fj.phytochem.2009.05.001&action=history. Martin (Smith609 – Talk) 19:32, 23 February 2011 (UTC)
Done. {{resolved}}

Bibcodes

In many templates, there are urls of the form "http://adsabs.harvard.edu/abs/2007ApJ...654..373E". These should be converted to bibcodes (|bibcode=2007ApJ...654..373E).

For example

to

Headbomb {ταλκκοντριβς – WP Physics} 17:42, 29 June 2009 (UTC)

How can the bot recognise a BibCode – is it safe to assume that anything after adsabs.harvard.edu/abs/ , and nothing else, should be converted to a BibCode parameter, and the url parameter removed? Martin (Smith609 – Talk) 23:21, 5 September 2009 (UTC)
Yes, just retrieve it from the url. Headbomb {ταλκκοντριβς – WP Physics} 21:32, 9 September 2009 (UTC)
Also, there might be retrievable information from the bibcode page (doi & arxiv preprint). It might be a cheap and efficient way to add doi/arxiv links. Headbomb {ταλκκοντριβς – WP Physics} 03:47, 10 September 2009 (UTC)
Also, urls of the form "http://ads.ari.uni-heidelberg.de/abs/1976Icar...29..255B" should be converted to bibcodes (|bibcode=1976Icar...29..255B). Headbomb {ταλκκοντριβς – WP Physics} 14:46, 22 October 2009 (UTC)

Alright I checked for all mirrors, here's the full list:

The structure will always be FOO/abs/BIBCODE where FOO is one of the mirror urls. Bibcodes are always 19 characters long. Headbomb {talk / contribs / physics / books} 04:50, 27 April 2010 (UTC)

Any update? Headbomb {talk / contribs / physics / books} 21:02, 3 December 2010 (UTC)
Looks feasible; I've a lot of other projects on the go at the moment but I'll take a look at this when I next have a session improving Citation bot. Meanwhile, feel free to delve in to the source code yourself! Martin (Smith609 – Talk) 14:52, 6 December 2010 (UTC)
I can probably do this for AWB. Rjwilmsi 16:54, 6 December 2010 (UTC)
Small note: in the urls, some characters appear in their URL-encoded form instead. For example, A&A will often (but not necessarily always) read as A%26A in the url. The url-encoding should be cleaned up when placed in |bibcode=. Headbomb {talk / contribs / physics / books} 02:02, 7 December 2010 (UTC)

Ping? Headbomb {talk / contribs / physics / books} 01:58, 9 February 2011 (UTC)

User:Rjwilmsi/Bibcodes. Headbomb I will need your help. Rjwilmsi 21:45, 11 February 2011 (UTC)
Quick check: do BibCodes ever contain /s? Martin (Smith609 – Talk) 01:40, 12 February 2011 (UTC)
  Done in r250. Martin (Smith609 – Talk) 01:58, 12 February 2011 (UTC)
It's working... but it produces very annoying formatting. Headbomb {talk / contribs / physics / books}
  Done in r251. Martin (Smith609 – Talk) 21:01, 12 February 2011 (UTC)
It should also remove the accessdates, as they are no longer needed. Headbomb {talk / contribs / physics / books} 04:16, 12 February 2011 (UTC)
  Done in r251. Martin (Smith609 – Talk) 21:01, 12 February 2011 (UTC)
The most common URL format is covered but there are also some other formats to handle. Rjwilmsi 08:10, 12 February 2011 (UTC)
I don't know of every format that the URLs take, so it's difficult to compose a filter that will match them all without false positives.
Will this work:
  • Check whether the URL starts with one of the mirror addresses
  • If it does, any pattern of four digits followed by 15 further characters, some of which may be URL encoded, is a bibcode?
Martin (Smith609 – Talk) 21:01, 12 February 2011 (UTC)

I made some database scans, and four good matches exist within URLs

  • bibcode=BIBCODE
  • query?BIBCODE
  • abs/BIBCODE
  • full/BIBCODE

There are other links to the ads database, but nothing that can/should be cleaned up by this bot. Headbomb {talk / contribs / physics / books} 21:38, 12 February 2011 (UTC)

Yes, that's a better approach. And ~98% of our existing bibcodes match the MySQL regex of '^[12][0-9][0-9][0-9][A-Z0-9.&]{15}'. If the citation bot can do 95% of them I'll sort out the rest. I can provide a list of the articles (~2000) and a list of distinct URLs (3714 to main site). Rjwilmsi 21:50, 12 February 2011 (UTC)
Shouldn't that be ^[12][0-9][0-9][0-9][A-Z0-9\.&]{15} ? Headbomb {talk / contribs / physics / books} 22:20, 12 February 2011 (UTC)
Thanks for the suggestions, guys! I've incorporated them in r254, and the URLs you listed above now work; let me know if there are any more tweaks required. Martin (Smith609 – Talk) 00:06, 13 February 2011 (UTC)

{{resolved}}
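The matching rules discussed in this thread can be sketched roughly as follows. This is only an illustration in Python, not the bot's actual code: the function name `extract_bibcode` and the `ADS_HOSTS` tuple are hypothetical, the host list contains only the two mirrors named above (the full mirror list is not reproduced here), and the character class allows lowercase letters because bibcodes such as 2007ApJ...654..373E contain them (MySQL's REGEXP is case-insensitive by default, Python's is not).

```python
import re
from typing import Optional
from urllib.parse import unquote, urlparse

# Assumption: only the two mirrors mentioned in this thread; extend as needed.
ADS_HOSTS = ("adsabs.harvard.edu", "ads.ari.uni-heidelberg.de")

# One of the four URL markers found in the database scan, followed by a
# 19-character bibcode: a year starting with 1 or 2, then 15 more characters.
BIBCODE_RE = re.compile(
    r"(?:bibcode=|query\?|abs/|full/)([12][0-9]{3}[A-Za-z0-9.&]{15})"
)


def extract_bibcode(url: str) -> Optional[str]:
    """Return the bibcode embedded in an ADS URL, or None if none is found."""
    host = (urlparse(url).hostname or "").lower()
    if not any(host == h or host.endswith("." + h) for h in ADS_HOSTS):
        return None
    # Undo URL encoding first, so that A%26A becomes A&A.
    match = BIBCODE_RE.search(unquote(url))
    return match.group(1) if match else None


print(extract_bibcode("http://adsabs.harvard.edu/abs/2007ApJ...654..373E"))
print(extract_bibcode("http://ads.ari.uni-heidelberg.de/abs/1976Icar...29..255B"))
```

Checking the host before applying the pattern is what keeps false positives down: a 19-character string that merely looks like a bibcode on an unrelated site is left alone.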

Legacy cleanup needed?

  • There are still numerous references to DOI_bot or DOI-bot or User:DOI bot that crop up. Am I right in thinking these should all be converted to the Citation_bot equivalents or is there some ongoing utility to the distinct names?
  • Neither "DOI" nor "Citation" appears at [35], which leads me to think something is amiss. Is it just a documentation problem, or is the bot now re-hosted elsewhere?
  • At User:Smith609/toolbox.js the PortletLinks still go to an apparently non-functional address [36]

Am I missing something basic? LeadSongDog come howl! 18:22, 17 January 2011 (UTC)

Cleanup is almost certainly needed -- you'd be very welcome to dive in! However, there are two versions of the URL: DOI_bot is the latest working version of the code (well, it doesn't work at the moment, but in general it does...) and citation_bot is the last stable version of the code (which may contain minor errors that have since been fixed, etc, but it should always work). Martin (Smith609 – Talk) 22:23, 17 January 2011 (UTC)
That explains some of the confusion for sure. On the sidebar for this page, I see under "Reference formatting", the link Automatic (thorough). Note that URL refers to doibot, which from your description is the production code. But when I click on it, it announces "Welcome to Citation Bot", which at the minimum is confusing. I suppose both versions of the code are intended to run as distinct users for accountability? In any case I'd suggest making the docs explicit about the distinction between production and developmental bot code. On User:Citation bot/use for instance, we probably should not see mention of DOI bot without a (developmental) caveat in front of it. Similarly on User:Citation bot the link [37] starts a script that reports "Getting login details ... done. Initializing MYSQL database ... loaded connect script. Will connect when necessary. Initializing ... ... Establishing connection to Wikipedia servers ... Using account Citation bot. Fetching parameter list ... " Again, the double identity is confusing. LeadSongDog come howl! 18:40, 18 January 2011 (UTC)

{{resolved}}

jstor=

Citation bot is replacing urls with "jstor=" parameters to {{citation}}. As far as I can tell, the citation template has no such parameter. Isn't this a problem? It is causing the jstor links to stop working. E.g. compare the first entry in the references section before and after this diff and note that prior to the diff there were both a jstor link and a doi link; after the diff the jstor link has vanished. Adding a {{JSTOR}} template to the id parameter would work, and might be a better solution. —David Eppstein (talk) 19:02, 14 February 2011 (UTC)

|jstor= is now supported by Template:Citation. Thanks for the report! Martin (Smith609 – Talk) 19:14, 14 February 2011 (UTC)
Thanks for the fix! —David Eppstein (talk) 20:41, 14 February 2011 (UTC)
You're welcome. {{resolved}}

JSTOR URL addition

Status
cannot replicate: {{resolved}}
Reported by
Martin (Smith609 – Talk) 20:27, 15 February 2011 (UTC)
What happens
JSTOR URL added
What should happen
Only jstor parameter should be used; don't include duplicate data
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Template%3ACite_doi%2F10.2307.2F1541497&action=historysubmit&diff=414120853&oldid=414120739
We can't proceed until
maintainer


I also wonder whether it would be better if |jstor= behaved like |pmc= in that it would display as {{JSTOR|123456}} if |url= in use? Rjwilmsi 18:03, 16 February 2011 (UTC)

JSTOR data

Status
{{resolved}}
Reported by
Martin (Smith609 – Talk) 19:38, 23 February 2011 (UTC)
What happens
1: Tidy citation and try to expand
- Populating from JSTOR database: unhandled data: 

JOURNAL___ BioScience

What should happen
Use data?
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Red_algae&diff=prev&oldid=415560231
Replication instructions
Penultimate reference: Kohlmeyer, J. (1975). "New Clues to the Possible Origin of Ascomycetes" (PDF). BioScience (American Institute of Biological Sciences) 25 (2): 86–93. doi:10.2307/1297108 Endnote in Red algae
We can't proceed until
Bot operator's feedback on what is feasible


I note that edit didn't correct the jstor id, when it showed as 10.2307/1297108 rather than just 1297108 (I've since corrected it). Duplicating the full doi as the jstor id is a fairly predictable human error. The bot could check for it. LeadSongDog come howl! 20:02, 23 February 2011 (UTC)
Arrgh! For some reason we now have a bunch of wrong subpages of {{Cite jstor}} at Special:PrefixIndex/Template:Cite_jstor/. The other Cite xxx seem fine. LeadSongDog come howl! 20:30, 23 February 2011 (UTC)
Odd. Might be to do with the conversion of {{ref jstor}}. I'll take a look. Martin (Smith609 – Talk) 20:42, 23 February 2011 (UTC)
Looks like the bot finds the info, but just reports an error. Martin (Smith609 – Talk) 20:42, 23 February 2011 (UTC)
It seems a few pages Martin was working on in the last days of January are at issue. They have jstor subpage numbers that are full doi numbers. It seems {{ref jstor|1304949}} and its like may be at issue, or perhaps Citation bot 7. See also Special:WhatLinksHere/Template:Ref_jstor. LeadSongDog come howl! 20:50, 23 February 2011 (UTC)
Yes, this problem was caused by the bot's wrong handling of {ref jstor}. Both issues fixed in r256. Martin (Smith609 – Talk) 03:36, 25 February 2011 (UTC)
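The check suggested above (catching a full DOI entered as the jstor id) could look something like this. It is a minimal sketch in Python, not the fix that went into r256; the function name `normalize_jstor_id` is made up for illustration, and it assumes the only prefix to strip is the 10.2307/ seen in the example from this thread.

```python
# Assumption: the DOI prefix seen in this thread's example, 10.2307/1297108.
JSTOR_DOI_PREFIX = "10.2307/"


def normalize_jstor_id(value: str) -> str:
    """Reduce a jstor parameter value to the bare JSTOR id."""
    cleaned = value.strip()
    # Tolerate a leading "doi:" label in front of the DOI.
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    # Drop the DOI prefix if the whole DOI was pasted in.
    if cleaned.startswith(JSTOR_DOI_PREFIX):
        cleaned = cleaned[len(JSTOR_DOI_PREFIX):]
    return cleaned


print(normalize_jstor_id("10.2307/1297108"))
```

Values that are already bare ids pass through unchanged, so the check is safe to apply to every jstor parameter.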

Page numbers for online sources

Status
new bug
Reported by
– VisionHolder « talk » 21:10, 23 February 2011 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
What happens
Both on articles and on DOI/PMID/JSTOR template citations the bot misinterprets the issue number for page numbers for online articles that do not have page numbers (e.g. BMC Evolutionary Biology 2009, 9:30 = volume 9, issue 30, and not page 30)
What should happen
Bot should omit page numbers if none are provided
Relevant diffs/links
https://secure.wikimedia.org/wikipedia/en/w/index.php?title=Template%3ACite_doi%2F10.1186.2F1471-2148-9-30&action=historysubmit&diff=415563864&oldid=407410437
Replication instructions
The bot automatically does this to any citation where page numbers are not specified.
We can't proceed until
Maintainer: A specific edit to the bot's code is requested below.
Requested action from maintainer
Please ensure that the bot understands when page numbers are lacking from a citation for online sources.


Other examples of this include: [38], [39], and [40]. – VisionHolder « talk » 21:10, 23 February 2011 (UTC)

In the first case (I haven't checked the others), 30 is the Medline page number, not the issue. It is fairly standard for online-only pubs (like the BMC series) that the colon precedes a paper number, rather than an issue. Citing Medicine has examples. LeadSongDog come howl! 21:25, 23 February 2011 (UTC)
Sounds like this recurring issue is {{resolved}}. Please correct me if I'm wrong. Martin (Smith609 – Talk) 05:30, 25 February 2011 (UTC)
Unfortunately, no, it's not resolved. But I don't know how to avoid this short of using {{nobots}}. If the sources the bots are pulling from are inconsistent, I don't know what to say. Could the bot check the info from multiple sources, and if a page number is not provided by the other sources, then it ignores page numbers from Medline? Basically, if the other sources aren't providing the page number, then we assume the page number retrieved from Medline is actually the issue number. Or more simply, if the issue number equals the page number exactly, then ignore it? – VisionHolder « talk » 05:56, 25 February 2011 (UTC)
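For what it's worth, the two heuristics proposed in the comment above could be sketched as below. This is purely illustrative Python, not anything the bot does: `choose_pages` and its parameters are hypothetical names, and, as noted earlier in the thread, the number reported by Medline may be a legitimate article number, so dropping it is a judgment call rather than a clear fix.

```python
from typing import Optional, Sequence


def choose_pages(
    medline_pages: Optional[str],
    issue: Optional[str],
    other_sources_pages: Sequence[Optional[str]] = (),
) -> Optional[str]:
    """Decide whether to keep the page value reported by Medline."""
    if not medline_pages:
        return None
    # Heuristic 1: a page value identical to the issue number is suspect.
    if issue and medline_pages.strip() == issue.strip():
        return None
    # Heuristic 2: other sources were consulted and none of them gives pages.
    if other_sources_pages and not any(other_sources_pages):
        return None
    return medline_pages


print(choose_pages("30", "30"))
print(choose_pages("86-93", "2", ["86-93"]))
```

When no other source has been consulted, the second heuristic does not fire, so a lone Medline value is kept unless it collides with the issue number.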

Page numbers in Google Book references

Status
  Won't fix
Reported by
Type of bug
Improvement
What happens
They are present within the URL but not written in the reference
What should happen
Should page numbers be referenced when they are included in the Google url?
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Ammonite&action=historysubmit&diff=409188670&oldid=409188404


Pages should refer to the total page count, not a specific page.   Won't fix. {{resolved}}

Edit war: bot vs. bot

Status
new bug
Reported by
R'n'B (call me Russ) 21:36, 20 February 2011 (UTC)
Type of bug
Inconvenience
What happens
Citation Bot 2 is warring with a double-redirect-fixing bot and creating double redirects
What should happen
Citation Bot (a) should not create double redirects and (b) should comply with Template:Bots
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Template:Cite_pmc/344925&action=history
We can't proceed until
Bot operator's feedback on what is feasible


That problem seemed to start with this edit. An instance of {{cite pmid}} with an existing pmid and pmc was newly matched to a doi, and then citation bot 2 replaced the content of that template with a redirect to the corresponding {{cite doi|10.1073.2Fpnas.81.3.801}} on 10 Feb 2011, then all hell broke loose five days later. LeadSongDog come howl! 21:39, 22 February 2011 (UTC)

Fixed in r260. Martin (Smith609 – Talk) 04:52, 25 February 2011 (UTC)

{{resolved}}
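One way to avoid creating double redirects is to follow any existing redirect chain to its end before writing a new redirect. The sketch below shows the idea in Python; it is not the change made in r260. The `redirects` mapping stands in for whatever lookup the bot would actually use, and `resolve_redirect` is a hypothetical name.

```python
from typing import Dict


def resolve_redirect(title: str, redirects: Dict[str, str], max_hops: int = 10) -> str:
    """Follow a chain of redirects and return the final target title."""
    seen = {title}
    current = title
    for _ in range(max_hops + 1):
        target = redirects.get(current)
        if target is None:
            return current  # not a redirect: this is the final page
        if target in seen:
            raise ValueError("redirect loop at " + target)
        seen.add(target)
        current = target
    raise ValueError("more than %d redirect hops from %s" % (max_hops, title))


# Illustrative data only; the pmid subpage title is a placeholder.
redirects = {
    "Template:Cite pmc/344925": "Template:Cite pmid/EXAMPLE",
    "Template:Cite pmid/EXAMPLE": "Template:Cite doi/10.1073.2Fpnas.81.3.801",
}
print(resolve_redirect("Template:Cite pmc/344925", redirects))
```

Pointing every new redirect at the resolved title means a second bot that fixes double redirects has nothing left to correct, which removes the trigger for the edit war.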

Conflicting PMID edit war

Status
new bug
Reported by
– VisionHolder « talk » 15:08, 18 February 2011 (UTC)
Type of bug
Inconvenience
What happens
The bot keeps creating "Duplicate data" that overwrites an existing PMID entry at Template:Cite_pmid/11264397. I've checked the PMID, and the original is the only one that pulls up. I'm not sure where this duplicate is coming from, but it's leading to an edit war between me and the bot.
What should happen
The bot needs to stop, or we need to find a manual fix.
Relevant diffs/links
https://secure.wikimedia.org/wikipedia/en/w/index.php?title=Template%3ACite_pmid%2F11264397&action=historysubmit&diff=414618737&oldid=414613689
We can't proceed until
Bot operator's feedback on what is feasible


A few links that might yield up a clue to the problem:
  • these are the finds for "25530199", the Barelegs jstor id
  • this is the earliest that mentions that jstor id, note the bot never ran against it (perhaps because it's in a userspace subpage?)
  • this edit first referenced doi:10.1113.2Fjphysiol.2002.025049 on 1 Dec 2010
  • this shows the doi created by the bot as a result a few minutes later, then manually revised (though not entirely fixed) by Asantorelli
  • this page was created by the bot five hours after Martin created Template:Taxonomy/Cheungkongella and Cheungkongella on 10 Jan 2011
  • this edit at 14:02 30 Jan 2011 caused the bot to create this cite doi at 09:20 31 Jan 2011 (the first contrib it shows for several hours preceding)

Other edits in Jan and Feb follow similar patterns. LeadSongDog come howl! 19:19, 18 February 2011 (UTC)

This is interesting, but how does this article, with its jstor id and doi, get tied up with this pmid? I'm not seeing it. Also, how do we fix it? – VisionHolder « talk » 19:33, 18 February 2011 (UTC)
And just fyi, this ref template is used on an article that will be featured on the main page on February 22. If we can't resolve this by then, I will need to stop using this template ref and restore the full citation to the article. – VisionHolder « talk » 20:18, 18 February 2011 (UTC)
Ok, I suppose you mean this:
  • this edit by Martin cited jstor 1301748 at 10:54 on 19 March 2011, and
  • this template created by Martin at 15:53 19 Jan 2011
  • Then this edit by the bot at 13:54 on 30 Jan
  • But I note that there is no template:cite jstor/1301748 subpage to template:cite jstor, so perhaps a tweak can fix it. I'll see what can be done. LeadSongDog come howl! 21:29, 18 February 2011 (UTC)
    • Ok, the specific case fix seems to work, using this method as follows:
    • Find the jstor id being erroneously inserted
    • Search for that id as a literal string, as it was in the transclusion of {{cite jstor|1301748}} in Shelbyoceras at here
    • Click on the "jump the queue" link (obviously the queue fell over at some point)
    • Verify the bot fixes that cite
    • Go back to the troublesome cite doi instance
    • Click on edit
    • Remove all parameters except the doi, then save
    • Click to run the bot on that cite doi instance
    • Verify it does not reintroduce the jstor and associated metadata
    • The more general quick fix would seem to be to find all the instances of "cite jstor" that fell out of the queue and deal with them. None of which really explains why the bot was plunking that jstor id into unrelated cite doi pages, but maybe we'll figure that out now. Cheers, LeadSongDog come howl! 21:54, 18 February 2011 (UTC)
And just when you seemed to have it, the bot went back to its old ways and over-wrote itself. – VisionHolder « talk » 23:44, 18 February 2011 (UTC)
OK, I think that one's now fixed. Same as the others. LeadSongDog come howl! 07:01, 19 February 2011 (UTC)
Looks good. Thanks for the fix! – VisionHolder « talk » 14:04, 19 February 2011 (UTC)
You are welcome, though I still don't understand the root cause. Why was the bot pulling data from the queued-but-not-cleared jstor cites and plopping it into seemingly unrelated doi subpages? LeadSongDog come howl! 20:59, 19 February 2011 (UTC)

I'm sorry to say that the war has not ended. The bot is back at it again. – VisionHolder « talk » 19:37, 20 February 2011 (UTC)

I know this may sound a little rash, but can we shut this bot down until we figure out what is causing this behavior? Besides the disruption with this one PMID, who knows how many other unwatched citations are being mangled. – VisionHolder « talk » 20:44, 20 February 2011 (UTC)
Probably not helpful to shut it down, that's how this started. Each of these cases seems to have started with users creating instances of {{cite jstor}} when the bot was shut down. The linked subpage in this instance, from User:Smartse/Platypodium_elegans, had an instance of {{cite jstor|3094763}} that the bot had not run against. Therefore the linked {{cite doi}} subpage was never created until I did so today at this edit. Now I need to fix the {{cite pmid|11264397}} that got corrupted as a result. Hang in there, we'll figure it out. Cheers. LeadSongDog come howl! 16:44, 21 February 2011 (UTC)
@Martin, the above issues seem to relate to bug number 84. Some instances of {{cite jstor}} were not being expanded. These cases seemed mostly, but not all, to be either references within {{automatic taxobox}} or within userspace article subpages. The associated {{cite doi}} subpage was not being created. Still no clear idea why citation bot 2 was putting the content that should have gone into these subpages into unrelated subpages instead. Perhaps some kind of race condition, with two instances of the bot running at the same time? Just a thought. LeadSongDog come howl! 20:35, 21 February 2011 (UTC)
I don't seem to be able to replicate this problem. Can any of you guys figure out how to? It's possible (though unlikely?) that the bug has been fixed in the improved JSTOR handling introduced in r256... Martin (Smith609 – Talk) 04:25, 25 February 2011 (UTC)
There are still some eight bad jstor subpages here from the past two months that will need cleanup. I'll tackle these manually. LeadSongDog come howl! 05:10, 25 February 2011 (UTC)
Ok, five left there that can just be deleted by an admin, there are no links to them. LeadSongDog come howl! 06:32, 25 February 2011 (UTC)

{{resolved}}

Formatting replaced with missing spaces

Status
  Won't fix
Reported by
Martin (Smith609 – Talk) 15:55, 23 February 2011 (UTC)
Type of bug
Cosmetic
What happens
Formatting braces are removed, with spaces stripped too
What should happen
Tags replaced with wikisyntax
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Template%3ACite_doi%2F10.1111.2Fj.1475-4983.2008.00797.x&action=historysubmit&diff=415525031&oldid=415524978
Replication instructions
Expand above doi


Publisher error. Cannot fix. Martin (Smith609 – Talk) 04:01, 25 February 2011 (UTC)

{{resolved}}

Don't leave accents in ref names

Status
{{resolved}}
Reported by
Martin (Smith609 – Talk) 19:40, 14 February 2011 (UTC)
Type of bug
Improvement
What happens
Autogen ref names retain accented chars
What should happen
replace with letters that appear on a keyboard
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Archaeolithophyllaceae&diff=prev&oldid=413924585
Replication instructions
n/a
We can't proceed until
A specific edit to the bot's code is requested below.
Requested action from maintainer
See above


What's wrong with accented characters in ref names? If your keyboard doesn't support easy typing of them (as mine does) it's easy enough to copy and paste them (and a good idea in any case even with unaccented characters in order to make sure that the copied name is character-for-character the same). —David Eppstein (talk) 20:43, 14 February 2011 (UTC)
WP:REFNAME advises the use of simple ref names. On the English wiki I think sticking to non-accented characters supports that. Rjwilmsi 18:17, 16 February 2011 (UTC)
  Done in r264. Martin (Smith609 – Talk) 03:48, 26 February 2011 (UTC)
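For illustration only, the kind of clean-up requested in this report can be sketched as follows. This is a Python sketch rather than the bot's own PHP, and the function name `ascii_ref_name` and the sample ref name are hypothetical; nothing here is taken from r264.

```python
import unicodedata


def ascii_ref_name(name: str) -> str:
    """Reduce accented letters in a ref name to their unaccented base letters."""
    # NFKD decomposition splits an accented letter into its base letter
    # followed by one or more combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    # Keep everything except the combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical ref name, used only to show the effect.
print(ascii_ref_name("Brönn1849"))
```

Letters that have no Unicode decomposition (for example "ø" or "ß") pass through unchanged in this sketch, so a fuller version would presumably need an explicit replacement table for them.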

Keeps creating things at {{cite doi/doi:10....}} rather than {{cite doi/10....}}

In the last week or so, I keep having to send templates like {{Cite_doi/doi:10.1088.2F1748-9326.2F4.2F4.2F045102}} to speedy deletion. Not only are they located at the wrong place (and duplicate existing and proper templates, like {{Cite_doi/10.1088.2F1748-9326.2F4.2F4.2F045102}}), but they are created for seemingly no reason, as nothing links to them! Usually this was due to bad use of {{cite doi}}, such as {{cite doi|doi:10.1234....}} rather than {{cite doi|10.1234....}}, but this is not the case here.

So yeah, that should be fixed. And the input of {{cite doi}} should be checked and stripped of "doi:" before the bot does anything. Headbomb {talk / contribs / physics / books} 12:30, 16 February 2011 (UTC)

There are 28 in the January 2011 db dump. Plasticspork corrected them on 9 February. I can provide the list if needed. Rjwilmsi 18:13, 16 February 2011 (UTC)
  Done per your request in r265. Martin (Smith609 – Talk) 03:59, 26 February 2011 (UTC)
{{resolved}}
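A minimal sketch of the input check requested in this thread, written in Python rather than the bot's PHP. The helper names are hypothetical, and the subpage naming below only reproduces the "/" to ".2F" substitution visible in the template names quoted above; it is not a full implementation of the encoding and is not taken from r265.

```python
import re


def normalise_doi_argument(raw: str) -> str:
    """Strip surrounding whitespace and any leading 'doi:' label."""
    return re.sub(r"^\s*doi:\s*", "", raw, flags=re.IGNORECASE).strip()


def subpage_title(raw: str) -> str:
    """Build the subpage name for a {{cite doi}} argument (hypothetical helper)."""
    doi = normalise_doi_argument(raw)
    # Only the slash is handled here, mirroring the examples in this thread;
    # other special characters would need the complete encoding rule.
    return "Cite_doi/" + doi.replace("/", ".2F")


print(subpage_title("doi:10.1088/1748-9326/4/4/045102"))
```

With the label stripped first, both spellings of the argument resolve to the same subpage, which is the point of the request.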

DOI derivation from URL not quite right

Status
{{resolved}}
Reported by
Rjwilmsi 12:00, 24 February 2011 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
What happens
/abstract left at end of DOI
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Acanthaspis_petax&diff=415676013&oldid=398351444
We can't proceed until
Input from editors


Hmm, I'd noticed that publishers had started doing this in their URLs and wondered whether it would cause a problem. I think that the problems always involve DOIs ending in .x ... does this sound right? Martin (Smith609 – Talk) 02:59, 25 February 2011 (UTC)
If so, this is fixed in r257. Martin (Smith609 – Talk) 04:00, 25 February 2011 (UTC)
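To make the failure mode concrete, here is a hedged Python sketch of pulling a DOI out of a publisher URL and dropping a trailing view name such as "/abstract". The regular expression, the list of suffixes and the example host are all assumptions made for illustration; the actual change in r257 is not shown in this thread.

```python
import re
from typing import Optional

# A DOI starts with "10.", a numeric registrant code, a slash, then a suffix.
DOI_IN_URL = re.compile(r"(10\.\d{4,9}/[^\s?#]+)")

# Assumed list of page-view names that publishers append after the DOI.
TRAILING_VIEWS = ("/abstract", "/full", "/pdf")


def doi_from_url(url: str) -> Optional[str]:
    """Return the DOI embedded in a URL, or None if there is none."""
    match = DOI_IN_URL.search(url)
    if match is None:
        return None
    doi = match.group(1)
    for suffix in TRAILING_VIEWS:
        if doi.endswith(suffix):
            # Remove the view name so that only the DOI itself remains.
            doi = doi[: -len(suffix)]
            break
    return doi


# Placeholder host; the DOI is the one cited in an earlier report on this page.
print(doi_from_url("http://publisher.example/doi/10.1111/j.1475-4983.2008.00797.x/abstract"))
```

This is a heuristic: a DOI whose real suffix happened to end in one of the listed words would be truncated, so checking the result against the DOI resolver would be a sensible safeguard.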

Missing end of page range

Status
{{resolved}}
Reported by
Ucucha 14:26, 6 March 2011 (UTC)
Type of bug
Improvement
What happens
In creating a new {{cite doi}} subpage, the bot only gives the start of a page range (and, slightly less inconveniently, it does not give the issue number).
What should happen
The bot should give the full page range and the issue number. Note that the CrossRef API (http://www.crossref.org/openurl/?pid=<snip>&id=doi:10.1644/09-MAMM-A-192.1&noredirect=true) correctly gives these data.
Relevant diffs/links
Example fix of bot output
Replication instructions
Deleting the cite doi subpage linked to should work; I've seen this problem in some other citations, though.
We can't proceed until
Bot operator's feedback on what is feasible
Requested action from maintainer
Fix the code so that this doesn't happen.


I think you can fix this in DOItools.php around line 394 by using

ifNullSet("issue", $crossRef->issue); if (!is("page")) ifNullSet("pages", $crossRef->first_page . "&ndash;" . $crossRef->last_page);

and by adding "issue" to ifNullSet(). Ucucha 14:36, 6 March 2011 (UTC)

Thanks for the insight!   Done in r273. Martin (Smith609 – Talk) 15:28, 6 March 2011 (UTC)
Thanks; that does indeed fix it. Ucucha 20:16, 6 March 2011 (UTC)
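The "set only if empty" logic suggested above can be sketched as follows. This is a Python illustration, not the code in DOItools.php: the dictionary layout, the function names and the sample values are hypothetical, and only the field names issue, first_page and last_page come from the suggestion in this thread.

```python
def if_null_set(citation: dict, param: str, value) -> bool:
    """Set citation[param] only when it is currently missing or blank."""
    if value is None or str(value).strip() == "":
        return False  # nothing useful to add
    existing = citation.get(param)
    if existing is not None and str(existing).strip() != "":
        return False  # never overwrite what an editor already supplied
    citation[param] = str(value)
    return True


def fill_from_crossref(citation: dict, record: dict) -> dict:
    """Copy the issue and full page range from a CrossRef-style record."""
    if_null_set(citation, "issue", record.get("issue"))
    page = citation.get("page")
    if page is None or str(page).strip() == "":
        first, last = record.get("first_page"), record.get("last_page")
        if first and last and first != last:
            # Join start and end page with an en dash.
            if_null_set(citation, "pages", f"{first}\u2013{last}")
        else:
            if_null_set(citation, "pages", first)
    return citation


# Made-up placeholder values, not real bibliographic data.
print(fill_from_crossref({"volume": "7"}, {"issue": "2", "first_page": "101", "last_page": "110"}))
```

Guarding on an existing "page" value mirrors the `!is("page")` test in the suggestion, so a citation that already carries a single page is left alone.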

Edit summaries

I noticed that User:Citation bot 1 is making edits without supplying an edit summary. I don't think bots should be exceptions to the general rule that it is good practice to provide a summary for every edit. While not a very big deal, it would certainly be helpful to other users if it supplied one. Thank you. Gnome de plume (talk) 14:16, 8 March 2011 (UTC)

That's odd. Thanks for the pointer. I'll investigate. Martin (Smith609 – Talk) 15:05, 8 March 2011 (UTC)

{{resolved}} in r275. Martin (Smith609 – Talk) 13:18, 16 March 2011 (UTC)

Bot edit summaries

It would be nice if your bot left edit summaries for its edits, so that we can tell at a glance what it is doing. See this recent edit. Thanks. —SW— gossip 17:13, 10 March 2011 (UTC)

It's already been reported above. The bot was providing very good edit summaries up until a few days ago, so something must have been accidentally broken just recently. Rjwilmsi 17:31, 10 March 2011 (UTC)

{{resolved}} in r275. Martin (Smith609 – Talk) 13:19, 16 March 2011 (UTC)

Error on the Say's law page

Go here to check it out. The bot messed up a reference.--Dark Charles 07:55, 9 March 2011 (UTC)

Reference was a bit messed up before as well, I fixed. Rjwilmsi 08:19, 9 March 2011 (UTC)
Thanks for the report. Could you provide a link to the documentation for this sort of usage of the reference tag, so that I can be sure that the bot supports all possible permutations? Thanks, Martin (Smith609 – Talk) 14:32, 9 March 2011 (UTC)
(talk page stalker) There isn't any: it's misuse of the <ref>...</ref> tag, caused by misunderstanding of WP:NAMEDREFS and/or WP:REFNAME. Looking at <ref name="Say1803" pp.138–9)> in HTML terms, the attributes of tags are delimited by spaces: name="Say1803" is a valid attribute, but pp.138–9) isn't, so the latter is ignored by the MediaWiki parser. --Redrose64 (talk) 17:54, 10 March 2011 (UTC)

{{resolved}}

issue=-1

Hello. Citation bot is adding "issue=-1" to some references, for a journal which has not used issue numbers for decades; see here. I assume the bot gets the info from the internet somewhere, but negative issue numbers seem unlikely for a journal anyway. I hope you can fix this. Success, and you are doing a great job, Crowsnest (talk) 12:27, 10 March 2011 (UTC)

Perhaps that arose because doi:10.1017/S0022112086002 is not found in the handle system. Ditto for doi:10.1017/S00221120840. LeadSongDog come howl! 20:28, 10 March 2011 (UTC)
Yes, but the links http://dx.doi.org/10.1017%2FS0022112086002999 and http://dx.doi.org/10.1017%2FS0022112084002160 -- as provided in the Morison equation article -- do work. So that seems to be related to the {{DOI}} template. -- Crowsnest (talk) 22:05, 10 March 2011 (UTC)
You left out the trailing "999" and "02160" in the doi's: doi:10.1017/S0022112086002999 and doi:10.1017/S0022112084002160 do work (so the problem is unrelated to the {{doi}} template). -- Crowsnest (talk) 22:21, 10 March 2011 (UTC)
Ah, I was misled by popups, which showed only part of the doi numbers in that diff. Sorry for the red herring. LeadSongDog come howl! 05:04, 11 March 2011 (UTC)
The error is in the CrossRef API, which has <issue>-1</issue>. The bot should probably discard negative issue numbers it gets from that API. Ucucha 00:04, 11 March 2011 (UTC)
Only match from January 2011 db dump for |issue=-1 is on Winston Churchill, which I've just corrected. Rjwilmsi 00:38, 11 March 2011 (UTC)

{{resolved}} in r280. Martin (Smith609 – Talk) 02:12, 17 March 2011 (UTC)
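The filter suggested above (discard negative issue numbers coming back from the API) could look roughly like this. It is a Python sketch with a hypothetical function name, not the change made in r280; treating zero as unusable as well is an added assumption.

```python
from typing import Optional


def usable_issue(raw: Optional[str]) -> Optional[str]:
    """Return an issue value worth adding to a citation, or None to skip it."""
    if raw is None:
        return None
    text = raw.strip()
    if not text:
        return None
    try:
        number = int(text)
    except ValueError:
        # Non-numeric labels (e.g. "Suppl 1") are passed through untouched.
        return text
    # Negative values such as the "-1" reported here are placeholders, not
    # real issue numbers; zero is dropped too (an assumption of this sketch).
    return text if number > 0 else None


print(usable_issue("-1"))
print(usable_issue("3"))
```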

Latex in titles

Status
{{wontfix}}
Reported by
Logan Talk Contributions 23:19, 13 March 2011 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
What happens
See Template:Cite doi/10.1007.2FBF02102090
What should happen
It should display normally
Relevant diffs/links
Template:Cite doi/10.1007.2FBF02102090
Replication instructions
Generate a citation tag for that article
We can't proceed until
Operator: Bot operator's feedback on what is feasible


I don't think it's a bug. It's merely that the journal reports the title using LaTeX formatting and citation bot copies the title as reported by the journal without reformatting it for the wiki. Reformatting of that type is more than should reasonably be expected for a bot. In this case I think this falls under the category of operator error: the user requesting the {{cite doi}} should go back and fix the parts that citation bot's automatic formatting doesn't get right, because it's not to be expected that it gets everything right. In this case, the {{cite doi}} was added by Logan to replace an already-well-formatted citation template, and I also don't see the point of doing that either. —David Eppstein (talk) 23:51, 13 March 2011 (UTC)
I agree: nothing that I can fix here, I'm afraid. Martin (Smith609 – Talk) 03:29, 17 March 2011 (UTC)

{{resolved}}

Bot leaves duplicate refs

Status
{{resolved}}
Reported by
Redrose64 (talk) 13:34, 15 March 2011 (UTC)
Type of bug
Inconvenience
What happens
Bot has taken four refs with identical content, and instead of consolidating into one, it's consolidated into two pairs - I see no difference between the content of "Maggs" and "Maggs_a". In doing so, bot has abandoned existing ref name ("Maggs97", which incorporates the page number - note that elsewhere, article also has ref named "Maggs100") and imposed its own standard ("Maggs" and "Maggs_a")
What should happen
http://en.wikipedia.org/w/index.php?title=Bristol_and_Gloucester_Railway&diff=418950681&oldid=418386742
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Bristol_and_Gloucester_Railway&diff=next&oldid=418386742
We can't proceed until
Bot operator's feedback on what is feasible


This should be fixed in revision 285. Martin (Smith609 – Talk) 04:51, 17 March 2011 (UTC)

Issues and new line

Status
{{wontfix}}
Reported by
Headbomb {talk / contribs / physics / books} 08:13, 16 March 2011 (UTC)
Type of bug
Cosmetic/Inconvenience
What happens
When putting new issue information, the bot places them on a new line when it should place them on the same line
What should happen
if volume and page are on the same line, so the issue should be on that same line as well
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Universal_extra_dimension&curid=4597562&diff=419089069&oldid=369640212


When adding a new parameter, the bot uses the first parameter's formatting as a template for any new parameters. Seems to me like there could be an infinite number of special cases like this. Happy to apply a submitted patch, but won't be opening this can of worms myself. Martin (Smith609 – Talk) 13:24, 16 March 2011 (UTC)

Old bug

Status
{{resolved}}
Reported by
Mark Dominus (talk) 16:16, 16 March 2011 (UTC)
Type of bug
Inconvenience
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Henry_Gordon_Rice&diff=218653201&oldid=207280800


This is pretty old, but I'm reporting it anyway. Enjoy. —Mark Dominus (talk) 16:16, 16 March 2011 (UTC)

This has been resolved, as is easily verified by testing. Martin (Smith609 – Talk) 04:58, 17 March 2011 (UTC)

volume=The

|volume=The Blah...| -> |volume=Blah... |series=The|

example -- I don't see a reason for changing that... please check. --Tabya (talk) 13:02, 17 March 2011 (UTC)   Fixed in r288. Martin (Smith609 – Talk) 03:33, 18 March 2011 (UTC) {{resolved}}

Series (duplicate)

Status
Duplicate
Reported by
Jasonfward (talk) 11:13, 18 March 2011 (UTC)
We can't proceed until
Agreement on the best solution



This edit http://en.wikipedia.org/w/index.php?title=Voyage_of_the_Damned_(Doctor_Who)&curid=12025149&diff=0&oldid=419259871 was just plain wrong. The original version has been restored. Jasonfward (talk) 11:13, 18 March 2011 (UTC)

See #Series above. The bot has been being rather gung-ho when it meets "volume" information that is not just a number. Jheald (talk) 11:21, 18 March 2011 (UTC)
In this case, the special title of that issue probably ought to go into the "issue" field, after what is there already. You don't really want it in the "volume" field, because for a journal the 'volume' field is meant to describe things that apply to more than just one issue (and because you don't want it typeset in bold). Jheald (talk) 11:29, 18 March 2011 (UTC)

{{resolved}}

Page numbers for online sources

Status
Resolved?
Reported by
– VisionHolder « talk » 21:10, 23 February 2011 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
What happens
Both on articles and on DOI/PMID/JSTOR template citations the bot misinterprets the issue number for page numbers for online articles that do not have page numbers (e.g. BMC Evolutionary Biology 2009, 9:30 = volume 9, issue 30, and not page 30)
What should happen
Bot should omit page numbers if none are provided
Relevant diffs/links
https://secure.wikimedia.org/wikipedia/en/w/index.php?title=Template%3ACite_doi%2F10.1186.2F1471-2148-9-30&action=historysubmit&diff=415563864&oldid=407410437
Replication instructions
The bot automatically does this to any citation where page numbers are not specified.
We can't proceed until
Maintainer: A specific edit to the bot's code is requested below.
Requested action from maintainer
Please ensure that the bot understands when page numbers are lacking from a citation for online sources.


Other examples of this include: [44], [45], and [46]. – VisionHolder « talk » 21:10, 23 February 2011 (UTC)

In the first case (I haven't checked the others), 30 is the Medline page number, not the issue. It is fairly standard for online-only pubs (like the BMC series) that the colon precedes a paper number, rather than an issue. Citing Medicine has examples. LeadSongDog come howl! 21:25, 23 February 2011 (UTC)
Sounds like this recurring issue is {{resolved}}? Please correct me if I'm wrong. Martin (Smith609 – Talk) 05:30, 25 February 2011 (UTC)
Unfortunately, no, it's not resolved. But I don't know how to avoid this short of using {{nobots}}. If the sources the bots are pulling from are inconsistent, I don't know what to say. Could the bot check the info from multiple sources, and if a page number is not provided by the other sources, then it ignores page numbers from Medline? Basically, if the other sources aren't providing the page number, then we assume the page number retrieved from Medline is actually the issue number. Or more simply, if the issue number equals the page number exactly, then ignore it? – VisionHolder « talk » 05:56, 25 February 2011 (UTC)
A shame to break the bot in cases where the page number is equal to the issue number. A manual fix that's less drastic than {{nobots}} is to put |issue=<!-- some comment; e.g. "This comment stops the bot adding an issue number that is erroneously in the publisher's database" --> in the citation. It may also be worth contacting the publisher to make them aware of this problem? Martin (Smith609 – Talk) 14:31, 25 February 2011 (UTC)
Have a look at the XML from the pubmed record. These cases use the "PubMedPage" XML tag IIRC. There should be no "Page" or "Issue" tag there. Also, all papers from the specific journal should behave the same way (at least for that volume). LeadSongDog come howl! 15:05, 25 February 2011 (UTC)
I think that this only happens with BMC journals. Probably simplest just to add an exception for them. As of r263, wherever the bot comes across one, it'll remove the issue number. Martin (Smith609 – Talk) 03:41, 26 February 2011 (UTC)

<Pagination> <MedlinePgn>403-14</MedlinePgn> </Pagination>

  • Note that in the interactive "Summary view" there are no round parentheses () found.
  • The relevant para in Citing Medicine 2007 is at Chapter 23 Journals on the Internet. There, the relevant example given is:

25. Journal article on the Internet with volume but no issue or other subdivision Wolfe L. America's fidelity crisis: politics, hypocrisy and family values. Electron J Hum Sex [Internet]. 2006 Oct 25 [cited 2007 Jan 5];9:[about 8 p.]. Available from: http://www.ejhs.org/volume9/Wolfe.htm LeadSongDog come howl! 06:35, 26 February 2011 (UTC)

  • I'm not sure that I understood you. The first two PMIDs have issue and page numbers; the third has a page range. Does the bot mishandle these? Martin (Smith609 – Talk) 06:55, 26 February 2011 (UTC)
Perhaps you're looking at something I'm not. All I see for those three on their pubmed records is volume and pages, but no issue. Where are you finding issue numbers? LeadSongDog come howl! 18:09, 26 February 2011 (UTC)
Ah, I see that the CFP site does list issue numbers (same as the calendar month). LeadSongDog come howl! 18:21, 26 February 2011 (UTC)
Oh, you're right - I misread the volume number. But these aren't problematic, because the page numbers aren't listed in the pubmed database as issue numbers. See for example http://en.wikipedia.org/w/index.php?title=Template:Cite_pmid/10099806&action=history. Martin (Smith609 – Talk) 18:24, 26 February 2011 (UTC)
Perhaps cite/core should have a documented value to use for |issue=, e.g. issue=no when it is known that there is none applicable? LeadSongDog come howl! 19:01, 26 February 2011 (UTC)
Feel free to propose that there, although it shouldn't be necessary on the bot's account. Is it now safe to mark this bug report as resolved? Martin (Smith609 – Talk) 02:20, 17 March 2011 (UTC)

{{resolved}}

bibcode reverted

Hey, in the edit your bot recently made to Providence Bay, Siberia the bibcode pointed to a 250+ page pdf, while the url you replaced pointed directly to the article. I reverted.

If the bot cannot recognize that it is not pointing to the same place, then maybe this sort of edit needs to be done manually. Dankarl (talk) 01:34, 9 March 2011 (UTC)

And, as the previous commenter reported, no edit summary. Dankarl (talk) 01:41, 9 March 2011 (UTC)

I don't think you're quite right. |bibcode= goes to the abstract page and has links for either GIF or PDF versions of the article, giving the reader the choice. Rjwilmsi 08:15, 9 March 2011 (UTC)
What's wrong with the edit? As Rjwilmsi points out, readers can pick their favoured version, the bibcode= makes it explicit that they will be taken to the ADS website, and it reduces the clutter in PDFs. It's superior all around. Headbomb {talk / contribs / physics / books} 12:04, 9 March 2011 (UTC)

If there were in fact an abstract there would be some merit to that argument. There is not an abstract, and using the bibcode merely sends the reader to an intermediary page with links. Not an improvement. Dankarl (talk) 02:43, 10 March 2011 (UTC)

I recommend that you continue this discussion at Template talk:Cite journal. Martin (Smith609 – Talk) 05:05, 17 March 2011 (UTC)

{{resolved}}

Not appreciated.

Martin, I don't appreciate having your bot rearranging citation formats (like splitting parameters on different lines) just because you thought it was a good thing to do. And where it was trying to insert "series=" parameters it was going just plain wrong. The one instance where I have seen it catch an actual error (in what I have done), involving a misnumbered "last" parameter, it failed to fix it properly. In such cases I would strongly suggest that any "fixing" be limited to identifying a possible problem, such as adding a tag. Any actual fixing should be left to the discretion of a human agent, preferably one familiar with the source. - J. Johnson (JJ) (talk) 22:03, 9 March 2011 (UTC)

Following on from the above: this is a misuse of the |series= parameter, which is documented in {{Cite journal}} as follows:
  • series: According to the 14th edition of Chicago Manual of Style p. 576, "As in the case of book series, some journals have attained such longevity that they have begun a new series of volumes or issues. Identification of the series (n.s., 2d ser., 3d ser., ser. b) must be made in citations to these journals."
If "Vol" is redundant in |volume=, it should simply be removed. --Redrose64 (talk) 14:33, 10 March 2011 (UTC)
Is this the same issue as #Series below? Martin (Smith609 – Talk) 03:30, 17 March 2011 (UTC)
I would say that, generally speaking, it is the same issue: a |volume= which does not contain a pure number is not necessarily going to consist of a series prefix followed by the "real" volume number.
Rather than attempt an automatic fix which may well fail, it might be better to add an inline-cleanup template. Unfortunately I don't know of one that's directly suitable: {{volume needed}} seems closest but has the opposite meaning. Possibly {{ambiguous}} --Redrose64 (talk) 09:52, 17 March 2011 (UTC)

{{resolved}}

More than 8 authors...

I note from this diff [47] that where a paper has lots of authors, the bot is listing eight and then "et al".

Why eight ?

If you're going to truncate the list with "et al", most style guides would surely truncate after the first, or at most the second or third author. Jheald (talk) 19:10, 13 March 2011 (UTC)

It's not the bot doing that (you may not have noticed, but "et al." only occurs in the rendered output, it's not in the diff at the top or even in the wikicode), but is a design feature of {{citation/core}}. If there are up to eight authors, all get displayed; if there are nine or more, the first eight are displayed but the others are replaced by "et al.". This can be configured using the |display-authors= parameter; for example, |display-authors=3 will display three then "et al.". Note that whatever the truncation point, the first nine are always put into the COinS metadata, so it's best to specify as many as possible. --Redrose64 (talk) 21:13, 13 March 2011 (UTC)
I think it's best to remove any author after the et al.. It's just clutter in the templates, COinS metadata is not worth the tradeoff in this case. Headbomb {talk / contribs / physics / books} 00:15, 14 March 2011 (UTC)
That would be a question for discussion at Template talk:Citation, but it seems to me we've been over this ground before. LeadSongDog come howl! 03:19, 14 March 2011 (UTC)

{{resolved}}
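The truncation rule described in this thread can be illustrated with a short Python sketch. It models only the behaviour as described here (show everyone up to the limit, otherwise the first N followed by "et al."); the function name and the "; " separator are assumptions, and this is not the template's actual code.

```python
from typing import List


def displayed_authors(authors: List[str], display_authors: int = 8) -> str:
    """Render an author list the way the thread describes the template output."""
    if len(authors) <= display_authors:
        # Up to the limit, every author is shown.
        return "; ".join(authors)
    # Beyond the limit, the first N are shown and the rest become "et al.".
    return "; ".join(authors[:display_authors]) + "; et al."


nine = [f"Author{i}" for i in range(1, 10)]
print(displayed_authors(nine))                     # default limit of eight
print(displayed_authors(nine, display_authors=3))  # as with |display-authors=3
```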

JSTOR detection needs improving

Status
{{resolved}}
Reported by
Martin (Smith609 – Talk) 19:49, 2 March 2011 (UTC)
What happens
JSTOR left as URL
What should happen
http://en.wikipedia.org/w/index.php?title=Template%3ACite_doi%2F10.2307.2F1306131&action=historysubmit&diff=416784063&oldid=410621533
Relevant diffs/links
(Nothing happens: manual edit:) http://en.wikipedia.org/w/index.php?title=Template%3ACite_doi%2F10.2307.2F1306131&action=historysubmit&diff=416784063&oldid=410621533
Replication instructions
n/a
We can't proceed until
Agreement on the best solution


I'm not sure what happened in this case, but it would be good to catch it whether it is in the "10.2307" doi or in the url. For the latter, I'd suggest that you want the string to be able to accept any of the prefixes "http://www.jstor.org", "www.jstor.org", "http://jstor.org" or "jstor.org" followed by "/", "%2F", or ".2F" separators. Not sure if you need to support other protocols such as https.

LeadSongDog come howl! 17:37, 5 March 2011 (UTC)

Something similar: this. The bot shouldn't leave "|doi=10.2307/3755023 |id={{JSTOR|3755023}} |jstor=3755023 " (producing three links to the same page). Ucucha 00:00, 9 March 2011 (UTC)
Done in r282. If the DOI points to JSTOR, the bot won't necessarily remove the JSTOR parameter, in case users don't recognize the DOI as a link to a resource that they can access. This is negotiable, but any consensus should be reflected at the Template:Cite journal documentation Martin (Smith609 – Talk) 02:49, 17 March 2011 (UTC)
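The matching proposed earlier in this thread (optional protocol, optional "www.", and "/", "%2F" or ".2F" as separators) can be sketched with a regular expression. This is a Python illustration, not the code from r282: the "stable/<number>" path layout, the acceptance of https, and the function name are assumptions.

```python
import re
from typing import Optional

# Any of the three separator spellings mentioned in the thread.
SEP = r"(?:/|%2F|\.2F)"

# Optional protocol, optional "www.", then an assumed "stable/<number>" path.
JSTOR_URL = re.compile(
    r"(?<![\w.-])(?:https?://)?(?:www\.)?jstor\.org" + SEP + r"stable" + SEP + r"(\d+)",
    re.IGNORECASE,
)

# DOIs under the 10.2307 prefix carry the JSTOR number as their suffix.
JSTOR_DOI = re.compile(r"^10\.2307" + SEP + r"(\d+)$")


def jstor_id(url: Optional[str] = None, doi: Optional[str] = None) -> Optional[str]:
    """Return the JSTOR number found in a DOI or URL, or None."""
    if doi:
        match = JSTOR_DOI.match(doi.strip())
        if match:
            return match.group(1)
    if url:
        match = JSTOR_URL.search(url)
        if match:
            return match.group(1)
    return None


print(jstor_id(url="http://www.jstor.org/stable/1306131"))
print(jstor_id(doi="10.2307/1306131"))
```

Once the number is known from either source, the bot can decide which of |doi=, |jstor= and |url= to keep, which is the part that remains open to consensus as noted above.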

Multiple sweeps req'd

http://en.wikipedia.org/w/index.php?title=Template%3ACite_doi%2F10.1186.2F1471-2148-4-44&action=historysubmit&diff=418315248&oldid=418206360   Fixed in GitHub Pull 324{{resolved}}

Series

The bot ought to be a lot more conservative before it adds a "series" field, and moves material there.

In addition to the example noted above, [48], see for example: [49] [50]

Or this one (the Annales de Chimie et de Physique ref), where there was a series to be found, but the bot still got it wrong: [51]

-- that is 3/3 just from the last three appearances of this bot on my watchlist

Current parsing for pulling out the series is not fit for purpose, and needs to be turned off.

(Also, re that last diff, an "issue = 0" may be questionable too) Jheald (talk) 18:48, 13 March 2011 (UTC)

Examples in more detail, to save having to follow the links:
  1. Before: volume = vol 30 ... After: series = vol | volume = 30 ... (Ideally would have: suppressed the word "vol")
  2. Before: volume = 32 No. 1 ... After: series = 32 | volume = No. 1 ... (Ideally would have: made volume = 32 | issue = 1 )
  3. Before: volume = 1 & 3 ... After: series = 1 | volume = & 3 ... (Ideally would have: made no change -- intention was to cite the two volumes of a multi-volume work that contained essays in the relevant area).
  4. Before: volume = 1 (3me Série) ... After: series = 1 | volume = (3me Série) ... (Ideally would have: made series = (3me Série) | volume = 1 ).
These indicate some of the difficulties in parsing a strange volume entry, which should probably be noted for manual intervention and passed over, unless it really does match a recognised pattern. Jheald (talk) 10:38, 18 March 2011 (UTC)
Seemed like the best way to handle cases where the series was put in with the volume number, e.g. Proc R Soc B 45. Is it better to include the "B" in the journal name in this instance? Martin (Smith609 – Talk) 13:20, 16 March 2011 (UTC)
NB as of r288, this is what the bot will do. Martin (Smith609 – Talk) 03:34, 18 March 2011 (UTC)
Yes, the letters in something like J Phys A or Phys Rev D are indicating a journal split into a number of titles that are all going to go on being published, rather than a new series that takes over from what has gone before; so in these cases the letters do properly belong with the journal title rather than the series field. But alternatively, I don't think there's any particularly strong objection to leaving them in the volume field, as volume A27 or whatever.
But I think the real problem that comes out of the examples above is that when there is unexpected odd stuff in the volume field, it can be really quite hard to parse. So if what you're suggesting is just to dump everything in the journal name field, that may or may not always be such a good idea. In this area it might be an idea to get the bot to make a list of odd stuff it has found in the volume field that it doesn't know what to do with, then progressively build up a whitelist of case-types where there is a particular fix. Jheald (talk) 10:25, 18 March 2011 (UTC)
  Done: Bot's only editing if the "volume" starts with a single letter in the range A-D, in which case it puts this letter in the "journal" parameter. This also helps the bot to recover data from CrossRef. Martin (Smith609 – Talk) 18:52, 19 March 2011 (UTC)
That range should probably be extended to at least A-H. Journal of Physics goes from A to G, European Physical Journal goes up to H (which is the highest I've seen). Headbomb {talk / contribs / physics / books} 10:14, 20 March 2011 (UTC)
Extended to A-J. Martin (Smith609 – Talk) 13:00, 27 March 2011 (UTC) {{resolved}}
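The conservative rule the thread settles on can be sketched as below, in Python rather than the bot's PHP. The thread says only that the bot acts when the volume starts with a single letter in the range A-J; requiring the rest of the value to be a plain number is a stricter assumption made here so that odd entries are passed over, and the function name is hypothetical.

```python
import re
from typing import Tuple

# One series letter in the range A-J, then nothing but the volume number.
SERIES_VOLUME = re.compile(r"^([A-J])\s*(\d+)$")


def split_series_letter(journal: str, volume: str) -> Tuple[str, str]:
    """Move a leading series letter from the volume into the journal name."""
    match = SERIES_VOLUME.match(volume.strip())
    if match is None:
        # Anything unexpected is left for a human editor.
        return journal, volume
    letter, number = match.groups()
    return f"{journal} {letter}", number


print(split_series_letter("Proc R Soc", "B 45"))
print(split_series_letter("Some Journal", "32 No. 1"))  # left unchanged
```

All four problem cases listed earlier in this section ("vol 30", "32 No. 1", "1 & 3", "1 (3me Série)") fall through unchanged under this pattern.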

Issue number in issue-less journal

Status
{{resolved}}
Reported by
Crowsnest (talk) 12:42, 17 March 2011 (UTC)
Type of bug
Inconvenience
What happens
Citation bot adds an issue number (issue=1) for a journal without issues, only volumes.
What should happen
Citation bot should do nothing
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Stokes_boundary_layer&diff=419262365&oldid=378580574
http://en.wikipedia.org/w/index.php?title=Hamiltonian_fluid_mechanics&diff=prev&oldid=419258704


The CrossRef database records the issue as "1". To avoid this false positive, I'm ignoring any issue numbers reported as 1 in their db. A shame but I doubt that they'll fix the errors... Martin (Smith609 – Talk) 04:24, 18 March 2011 (UTC) {{resolved}}

Misc oddities

{{resolved}}

Avogadro constant

The bot seems to have erroneously deleted a number of authors from a citation in the Avogadro constant article. Not sure why this happened, but possibly the typo in author number 17 set it off. SpinningSpark 10:11, 19 March 2011 (UTC)

The templates only support authors by |lastn= parameters from 1 to 9. What's your objective of listing all the extra authors in invalid template parameters? Rjwilmsi 11:21, 19 March 2011 (UTC)
It was not me who listed all the authors, I just fixed the mess the bot had made of it - notice that the bot left the citation with 19 authors still listed so the excuse that it was trying to limit it to 9 does not really hold. Personally, I never use citation templates, and would probably have resorted to et al. after the first one. I guess the editor was just trying to record all contributors without favouritism. SpinningSpark 17:29, 19 March 2011 (UTC)
Et al considered harmful. Although, that's for a field in which 19-author papers are extremely unusual. —David Eppstein (talk) 20:14, 19 March 2011 (UTC)
Actually, it was 26 authors, the bot cut it down to 19. SpinningSpark 00:07, 20 March 2011 (UTC)
That's just a blog entry relating a personal opinion. Headbomb {talk / contribs / physics / books} 20:17, 19 March 2011 (UTC)
Um, yes? Your point is? —David Eppstein (talk) 20:26, 19 March 2011 (UTC)
et al. is acceptable for inline Harvard style citations but there is no reason (my previous comment notwithstanding) that all authors cannot be listed in the bibliography in Wikipedia. We do not have the space restrictions of a printed journal. SpinningSpark 00:05, 20 March 2011 (UTC)
No, but et al. is an accepted stylistic choice, and our PDFs and other print versions do have size considerations. Plus, listing all the authors of papers like arXiv:0803.0732 (with 21 authors) is just silly. And that's not touching the "worst case" scenarios with papers with 100+ authors, like J Phys G 37 075021 (2010). Headbomb {talk / contribs / physics / books} 12:54, 20 March 2011 (UTC)
Yes, I agree. Some papers have ridiculous numbers of authors, too many to type or list. But I was responding to the original poster's suggestion of resorting to "et al" after only the first author. The practice of ignoring all but the first author in this way goes too far in the other direction — it does a disservice to the others especially in areas such as mathematics where the order of authors is alphabetical rather than by priority. I think the current defaults for the citation templates and the harvard templates set a good balance. —David Eppstein (talk) 16:52, 20 March 2011 (UTC)

{{resolved}}

Strip line break from parameters

Status
{{resolved}}
Reported by
Headbomb {talk / contribs / physics / books} 21:12, 19 March 2011 (UTC)
What should happen
Parameters should not contain any line breaks. AKA
  • |title=Relative incapacitation contributions of pressure wave and wound channel \n in the Marshall and Sanow data set

should be

  • |title=Relative incapacitation contributions of pressure wave and wound channel in the Marshall and Sanow data set
Relevant diffs/links
[52]
We can't proceed until
Agreement on the best solution


The logic should be general since this should never happen for any parameter (except possibly |author=) AKA

  • |journal=\n Journal of \n Foo → |journal=Journal of Foo
  • |doi=\n 10.123456789 → |doi=10.123456789
  • |publisher=American Mathematical \n Society → |publisher=American Mathematical Society
  • etc...

This would not affect "single line" vs "multiline" referencing, just make sure things are not broken up for no reason. Headbomb {talk / contribs / physics / books} 21:12, 19 March 2011 (UTC)

This is especially important in the titles of citations containing |url=, since the formatting breaks when we have line breaks in the title in that case. —David Eppstein (talk) 21:36, 19 March 2011 (UTC)
As an aside there's an AWB genfix for this particular case of |url=. Rjwilmsi 20:56, 20 March 2011 (UTC)
Could you implement this into AWB's genfixes as a general thing for citation templates? Headbomb {talk / contribs / physics / books} 21:32, 24 March 2011 (UTC)
The bot now replaces line breaks with breaking spaces, as of r309. Martin (Smith609 – Talk) 15:08, 27 March 2011 (UTC)
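
The request in this thread can be sketched in a few lines of Python. This is only an illustration, not the bot's actual code: the function name `strip_line_breaks` is hypothetical, and it assumes the caller has already isolated a single parameter value. Trailing whitespace is kept so that "multiline" template layouts are not disturbed.

```python
import re


def strip_line_breaks(value: str) -> str:
    """Replace line breaks inside a parameter value with single spaces.

    Trailing whitespace is preserved, because in a multiline template the
    newline before the next pipe belongs to the layout, not to the value.
    """
    body = value.rstrip()
    trailing = value[len(body):]
    # Collapse each line break, plus the whitespace around it, into one space.
    body = re.sub(r"\s*[\r\n]+\s*", " ", body)
    # A leading line break leaves one space behind; remove it.
    return body.lstrip() + trailing


if __name__ == "__main__":
    print(strip_line_breaks("\n Journal of \n Foo"))
    print(strip_line_breaks("American Mathematical \n Society"))
```

Whether |author= should be exempt, as suggested above, would be a decision for the caller and is not handled here.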

Wrong journal identified

Status
{{resolved}} in r308
Reported by
Gaius Cornelius (talk) 17:04, 20 March 2011 (UTC)
Type of bug
Deleterious
What happens
Bot has taken a manual reference to the Proceedings of the Society for Psychical Research, and added details for a paper from the Physical Review that had the same year, volume and page number.
Relevant diffs/links
see here
We can't proceed until
Agreement on the best solution


Cite doi expansion not working

Status
{{resolved}}
Reported by
Ucucha 13:41, 27 March 2011 (UTC)
Type of bug
Improvement
What happens
Bot does not expand newly cited {{cite doi}} references
What should happen
It should expand them
Relevant diffs/links
Cite dois (warning: huge diff) added yesterday; some still not expanded (I had the bot expand some manually by jumping the queue). See also the subpages of User:Ucucha/List of mammals.
Replication instructions
Add a new {{cite doi}} to a page and wait.
We can't proceed until
Bot operator's feedback on what is feasible


I had the bot on hold until I fixed a couple of bugs: it's just been run. Please check Special:Contributions/citation_bot_2 in case there are any errors! Martin (Smith609 – Talk) 16:20, 27 March 2011 (UTC)

Thanks. Ucucha 18:30, 27 March 2011 (UTC)

Database anomaly

http://en.wikipedia.org/w/index.php?title=Crystallographic_database&diff=prev&oldid=421074857 Merged below. {{resolved}}

En-dashes in "issue" parameter

Enhancement: http://en.wikipedia.org/w/index.php?title=Extinction_event&diff=prev&oldid=421079180   Fixed in GitHub Pull 322 {{resolved}}

Reference name generation

Something more informative than depunctuated URL would be nice: http://en.wikipedia.org/w/index.php?title=Bifidobacterium_animalis&diff=prev&oldid=421085160 {{resolved}}

PMID=1

Status
{{resolved}} in r318
Reported by
LeadSongDog come howl! 18:20, 28 March 2011 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
What happens
when populating cite doi, and redirecting cite pmid, |pmid= is given but mislabeled as |1=
What should happen
label as |pmid=
Relevant diffs/links
Template:Cite doi/10.1016.2Fj.mcp.2011.01.005
We can't proceed until
Bot operator's feedback on what is feasible


Is this ubiquitous or restricted to certain references? Martin (Smith609 – Talk) 19:08, 28 March 2011 (UTC)

Not unique. Looking through contributions it seems the problem is always in an edit to a cite doi subpage that is linked from a cite pmid subpage where the edit comment reads "(Redirecting to DOI citation)". LeadSongDog come howl! 19:34, 28 March 2011 (UTC)
OK, I just spent several hours fixing a bunch of these up, there are still lots to do, I don't know how many. I stopped at this going backwards through the history. See my contribs for examples. It looks like these were created in processing cases of {{cite pmid|123456}} (in lieu of {{cite pmid}}. The bot creates the subpage for cite doi but does not correct the error. Then, because it doesn't always find the right data without knowing the pmid, it either craps out or replaces it with incorrect metadata. Several times this mal-metadata was for a particular paper from Cladistics. It'd be good if the bot could somehow clean up the mess it made by itself. Citation bot 2 should be stopped until this is fixed.LeadSongDog come howl! 22:55, 29 March 2011 (UTC)
Citation bot 2 is now in "manual operation only" and won't run automatically. Hope to fix the bug tonight. Martin (Smith609 – Talk) 12:10, 30 March 2011 (UTC)
Thank you! LeadSongDog come howl! 13:44, 30 March 2011 (UTC)
Done in r318. Martin (Smith609 – Talk) 22:36, 30 March 2011 (UTC)
Ah, that fix reminds me of a long forgotten paper, doi:10.1145/130722.130737, that was written to identify a data structure corollary to Dijkstra's famous "Goto considered harmful" thesis. Thank you. LeadSongDog come howl! 23:06, 30 March 2011 (UTC)

Removal of jstor link by Citation bot

Status
{{resolved}} Not a bug
Reported by
Dr.K. λogosπraxis 13:17, 29 March 2011 (UTC)
Type of bug
Deleterious: Human-input data is deleted or articles are otherwise significantly affected. Bot removed jstor link from citation.
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Rosalind_Picard&diff=prev&oldid=421251875
We can't proceed until
Agreement on the best solution


Not a bug, URLs are produced by |jstor= parameter. In this case, |jstor=10.2307/1576591 is set to the doi (10.2307/1576591) rather than the jstor (1576591). I've fixed this. Headbomb {talk / contribs / physics / books} 14:21, 29 March 2011 (UTC)
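
For illustration, the manual fix described above (a DOI placed in |jstor= instead of the bare JSTOR number) could be automated along these lines. This is a hypothetical sketch, not something the bot is known to do; the function name `normalize_jstor` is made up for the example, and the only case handled is the 10.2307/ prefix seen in this report.

```python
JSTOR_DOI_PREFIX = "10.2307/"


def normalize_jstor(value: str) -> str:
    """Return the bare JSTOR number if the value was entered as a JSTOR DOI."""
    value = value.strip()
    if value.startswith(JSTOR_DOI_PREFIX):
        # Drop the DOI prefix, keeping only the identifier after the slash.
        return value[len(JSTOR_DOI_PREFIX):]
    return value


if __name__ == "__main__":
    print(normalize_jstor("10.2307/1576591"))
```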

Bibcode and arxiv

Status
{{resolved}} in r319
Reported by
David Eppstein (talk) 14:36, 29 March 2011 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
What happens
Bibcode for arxiv eprint is misinterpreted as journal paper, with "eprint xxx" in the journal entry and with part of the arxiv id misinterpreted as a page number. Additionally (possibly a separate bug) {{arxiv}} with two parameters is converted to an arxiv param that erroneously has spaces between the two parts of the arxiv id, causing the outgoing link to break
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Erd%C5%91s%E2%80%93Faber%E2%80%93Lov%C3%A1sz_conjecture&curid=691803&diff=421296535&oldid=384934864
We can't proceed until
A user confirms that the fix worked


It was still doing this in r316 [53] but I'll let you know if I see it again in later versions. Running the fixed bot over Pseudoforest again would probably provide some validation. The diff there also contains another example of the bot misinterpreting bibcode data as being a journal (in this case, for a Ph.D. thesis instead of an arxiv preprint). —David Eppstein (talk) 00:12, 31 March 2011 (UTC)
Just checked [54]. It at least seems to not pretend arxiv bibcodes are journal articles. The Ph.D. thesis => journal problem is still present, and the transformation of a URL for OEIS into an article about the OEIS is still wrong, but I'm not sure they're really bugs, unless one thinks that placing too much trust in the Bibcode database is a buggy thing to do. So I'm marking this resolved, but I'm not entirely convinced that you won't continue to have trouble with this. —David Eppstein (talk) 01:15, 31 March 2011 (UTC)
  Fixed in GitHub Pull 326 I've added a manual exclusion for journals with titles beginning "Thesis". If you're aware of a way to work out the quality of the arxiv journal parameter, that'd be great - otherwise I guess we can see what the false positive rate is and reassess. Martin (Smith609 – Talk) 01:32, 31 March 2011 (UTC)
It's shit. See [55]. Headbomb {talk / contribs / physics / books} 04:52, 31 March 2011 (UTC)
Also wouldn't it be better to exclude anything containing "thesis"? I mean you could have PhD Thesis just as well as "Thesis (PhD)". Headbomb {talk / contribs / physics / books} 00:13, 1 April 2011 (UTC)

Redundant JSTOR

Status
Not a bug
Reported by
A.Cython (talk) 00:17, 31 March 2011 (UTC)
Type of bug
Inconvenience
What happens
deleted info of a citation
What should happen
nothing
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Noemvriana&action=historysubmit&diff=421542252&oldid=414783487
We can't proceed until
Agreement on the best solution


See #Removal of jstor link by Citation bot Headbomb {talk / contribs / physics / books} 04:27, 31 March 2011 (UTC) {{resolved}}

Degrassi: The Next Generation (season 4)

I have reverted Citation bot 1's edit to Degrassi: The Next Generation (season 4); it applied lower case to a title, and part of the edit summary was garbled. 117Avenue (talk) 02:49, 31 March 2011 (UTC) {{resolved}} in r328 Martin (Smith609 – Talk) 04:15, 1 April 2011 (UTC)

URL removed from cite web

Status
  Fixed in GitHub Pull 336 {{resolved}}
Reported by
Martin (Smith609 – Talk) 14:01, 1 April 2011 (UTC)
Type of bug
Deleterious
What happens
URL removed from cite web when "url=" is missing
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Diplura&diff=prev&oldid=421815640
We can't proceed until
A specific edit to the bot's code is requested below.


URL eaten

Status
{{resolved}}:   Fixed in GitHub Pull 336
Reported by
Martin (Smith609 – Talk) 19:03, 6 April 2011 (UTC)
Relevant diffs/links
http://en.wikipedia.org/wiki/Eunicidae
We can't proceed until
A specific edit to the bot's code is requested below.


I don't see the problem there? Citation bot never removed any url... ? Headbomb {talk / contribs / physics / books} 22:03, 6 April 2011 (UTC)

Arxiv from ID

Status
  Fixed in GitHub Pull 341
Reported by
69.171.138.77 (talk) 21:45, 9 April 2011 (UTC)
What happens
"arvix" retained as string
What should happen
"arxiv" string removed
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Redshift_quantization&diff=prev&oldid=423211558
We can't proceed until
A specific edit to the bot's code is requested below.


See also above Headbomb {talk / contribs / physics / books} 23:10, 9 April 2011 (UTC) {{resolved}}

Toolserver: 403: User account expired

Status
{{resolved}}
Reported by
  —Chris Capoccia TC 13:10, 14 May 2011 (UTC)
Type of bug
Catastrophical
What happens
can't access Citation bot at all through any means because the server reports:

403: User account expired

The page you requested is hosted by the Toolserver user verisimilus, whose account has expired. Toolserver user accounts are automatically expired if the user is inactive for over six months. To prevent stale pages remaining accessible, we automatically block requests to expired content.

If you think you are receiving this page in error, or you have a question, please contact the owner of this document: verisimilus [at] toolserver [dot] org. (Please do not contact Toolserver administrators about this problem, as we cannot fix it—only the Toolserver account owner may renew their account.)

HTTP server at toolserver.org - ts-admins [at] toolserver [dot] org

We can't proceed until
Agreement on the best solution


Eek; I've been inactive for a while and for some reason didn't receive an account renewal e-mail. I've contacted the administrators to request reactivation. Martin (Smith609 – Talk) 15:52, 14 May 2011 (UTC)

Details taken from wrong article

Status
new bug
Reported by
Miradre (talk) 16:05, 5 March 2011 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
What happens
When using Cite Pmid template and ID 19707526 this gives the wrong details. Seems to be from this article http://www.ncbi.nlm.nih.gov/pubmed?term=19009023
What should happen
Should be the details from this article http://www.ncbi.nlm.nih.gov/pubmed?term=19707526
We can't proceed until
Maintainer: A specific edit to the bot's code is requested below.



The doi provided in the PubMed data is incorrect. Citationbot used that incorrect doi.LeadSongDog come howl! 16:25, 5 March 2011 (UTC)
I've replaced the redirect at template:cite pmid/19707526 that was targeted at the incorrect doi with an instance of cite journal, and populated that. LeadSongDog come howl! 16:41, 5 March 2011 (UTC)
Thanks! Miradre (talk) 16:46, 5 March 2011 (UTC)
You're welcome. @Martin, would it be feasible for the bot to check at least one of the parameter values (e.g. author or page) given at the doi against the pubmed equivalent before assuming the pubmed doi is correct? That would catch most such errors. LeadSongDog come howl! 16:50, 5 March 2011 (UTC)
How widespread is the error? Is there a field that will always be the same in PMID and CrossRef databases, if the PubMed DOI is correct? (Title, for instance, may encode the punctuation differently.) I'm wary of creating many false negatives. Martin (Smith609 – Talk) 15:10, 7 March 2011 (UTC)
I have no way of knowing how rarely they get it wrong, but I have seen it come up before (it was mentioned here, or one of the tt:cite xxx pages, some months back?) I think it may arise when an epub-before-press is indexed, but that's just a guess.
Looking at the latest schema under <xsd:element name="journal_article"> the "publication_date" and "pages" fields would seem to be eminent candidates for comparison. The "contributors" should have at least a fuzzy match (some nontrivial substring in common?) with pubmed's "authors", similarly for titles.

Of course the journal name, abbreviation, or issn should match well if at all, but that doesn't seem to catch cases like this one, where the doi pointed to the wrong article in the right journal. Given that pubmed lists several dates for some articles, it would seem sensible to accept that any of these dates match the crossref publication_date. If nothing matches but the journal, perhaps a {{vn}} tag could be inserted in lieu of the bot making the edit. LeadSongDog come howl! 18:45, 7 March 2011 (UTC)
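
The cross-check proposed above could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the dictionary keys (`years`, `pages`, `surnames`, `title`, `year`), the 0.8 similarity threshold, and the function names are hypothetical, and neither the PubMed nor the CrossRef response format is modelled. The DOI is accepted if any one of date, first page, author surname, or title agrees; otherwise the caller could tag the citation for human verification instead of editing.

```python
import re
from difflib import SequenceMatcher


def _norm(text: str) -> str:
    """Lower-case and drop punctuation so encodings of a title compare equal."""
    return re.sub(r"[^a-z0-9 ]+", "", text.casefold()).strip()


def _similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Fuzzy comparison; empty strings never count as a match."""
    a, b = _norm(a), _norm(b)
    if not a or not b:
        return False
    return SequenceMatcher(None, a, b).ratio() >= threshold


def _first_page(pages):
    """Extract the first page from a range such as '123-130'."""
    match = re.match(r"\s*(\w+)", pages or "")
    return match.group(1) if match else None


def doi_looks_consistent(pubmed: dict, crossref: dict) -> bool:
    """Return True if at least one field other than the journal agrees."""
    # PubMed may list several dates; any of them may match CrossRef's year.
    date_ok = crossref.get("year") in pubmed.get("years", [])
    first_pm = _first_page(pubmed.get("pages"))
    pages_ok = first_pm is not None and first_pm == _first_page(crossref.get("pages"))
    pm_names = {_norm(n) for n in pubmed.get("surnames", [])}
    cr_names = {_norm(n) for n in crossref.get("surnames", [])}
    authors_ok = bool(pm_names & cr_names)
    title_ok = _similar(pubmed.get("title", ""), crossref.get("title", ""))
    return date_ok or pages_ok or authors_ok or title_ok
```

Requiring only one agreeing field keeps false negatives low, which was the concern raised above; a stricter rule (for example two agreeing fields) would catch more bad DOIs at the cost of rejecting more good ones.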

Broken refs

The bot seems to have broken a few refs here. Possibly due to an earlier edit by User talk:Rjwilmsi that broke a ref here. Thought you should know. Otherwise the bot and user generally do a sterling job! --Paul (talk) 10:50, 11 March 2011 (UTC)

I've fixed the error on the article, and will now look at whether the error was originally from me or the citation bot. Rjwilmsi 11:52, 11 March 2011 (UTC)
Many thanks Rjwilmsi!--Paul (talk) 12:04, 11 March 2011 (UTC)
Okay, problem is with Citation bot, see sandbox example. Problem: when there are named references after a {reflist} it's not okay to condense references if the first occurrence falls before {reflist} and the second after. Rjwilmsi 12:11, 11 March 2011 (UTC)
Great. Thanks for tracking that down. We formatted the page that way to make it easier to incorporate the 'Further reading' refs into the main article. Apologies for causing problems for your otherwise excellent bot! --Paul (talk) 16:13, 11 March 2011 (UTC)

error in changing column to volume

Status
{{resolved}}
Reported by
Nick Levinson (talk) 19:44, 22 May 2011 (UTC)
Type of bug
Inconvenience
What happens
Why is Citation bot replacing "column" with "volume"? They're not synonymous. It replaced in Wife selling (English custom), this was reverted or edited back, and the bot did it again to the same article. It was reverted again, this time by me, because I checked the 1797 Times case and it looks to me like it should be column, not volume. I didn't check the others on that page. Unless there's an unusual reason for this (in which case please let us know), it seems to me the bot should be fixed or its user should reconsider. Thank you.
Relevant diffs/links
Links given in "what happens".
Replication instructions
If in the bot and not by the operator, please ask the operator.
We can't proceed until
Your choice.
Requested action from maintainer
Column and volume should not be confused.


The {{citation}} template does not recognise a |column= parameter. Knowing this, Citation bot is assuming a typo, and that the closest-matching valid parameter (at least in terms of spelling/sound-alike) is |volume=, so it's amending accordingly.

If you need to give the column number in your refs, use the |at= parameter instead of |page=, i.e. instead of |page=3|column=B, put |at=p. 3, col. B and similarly for the other four instances. --Redrose64 (talk) 21:20, 23 May 2011 (UTC)
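
To illustrate why an unrecognised |column= can end up as |volume=, here is a small Python sketch using spelling similarity only. It is not the bot's actual matching logic, which is not documented in this thread; the parameter list and the function name `suggest_parameter` are assumptions for the example.

```python
from difflib import get_close_matches

# A short, illustrative subset of parameter names; not the template's full list.
KNOWN_PARAMS = [
    "author", "title", "journal", "volume", "issue",
    "page", "pages", "at", "year", "publisher", "url", "doi",
]


def suggest_parameter(name, known=KNOWN_PARAMS):
    """Return the name itself if valid, else the closest-spelled known name."""
    if name in known:
        return name
    matches = get_close_matches(name, known, n=1, cutoff=0.6)
    return matches[0] if matches else None


if __name__ == "__main__":
    print(suggest_parameter("column"))
```

"column" and "volume" share the run "olum", so a purely spelling-based matcher treats them as close. Using |at=p. 3, col. B, as recommended above, avoids the unrecognised parameter altogether.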

Pubmed link conversion breaks links to Gene ID

Status
{{resolved}}
Reported by
Rjwilmsi 14:17, 27 April 2011 (UTC)
Type of bug
Deleterious: Human-input data is deleted or articles are otherwise significantly affected. Many bot edits require undoing.
What happens
pubmed URL to |pmid= conversion creates link to unrelated pubmed article
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Melanin-concentrating_hormone_receptor_2&diff=419283163&oldid=411536477
We can't proceed until
Operator: Bot operator's feedback on what is feasible


I think that this was fixed a while ago. Please point me to a recent edit if not. Martin (Smith609 – Talk) 22:09, 11 June 2011 (UTC)

Minkowski addition (FYI, I fixed it)

Status
{{resolved}}
Reported by
 Kiefer.Wolfowitz 21:13, 10 May 2011 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
What happens
Not applicable (NA)
What should happen
NA
Relevant diffs/links
Strange edit that put in an apparently non-existent medical article about kidneys in the middle of a mathematical/economics article.
Replication instructions
Not of interest
We can't proceed until
Nothing
Requested action from maintainer
NONE.


Hi, Robot!

The bot made a weird edit, in the middle of another citation, which I fixed, FYI.

Best regards from a fan of CiteBot1,  Kiefer.Wolfowitz 21:13, 10 May 2011 (UTC)

The inserted data was for PMID 538. This seems to be a combination of errors. First, that pmid is likely wrongly assigned in the pubmed database. If not it was one of the very first ever assigned. The bot seems to have taken the page number 538 from the REPEC data, perhaps from this entry and somehow misapplied it as a pmid. Still that was back in March, the bot's seen some serious revision since then.
@Martin, if the bot doesn't already explicitly handle RePEc data, it might be worth having a look at [56] for clues.LeadSongDog come howl! 13:44, 19 May 2011 (UTC)
Looks like RePEc isn't widely used in WP. Thanks for the pointer though. Martin (Smith609 – Talk) 21:48, 11 June 2011 (UTC)

Citations in WP: space should not be edited by bots

Status
{{resolved}}
Reported by
Jc3s5h (talk) 23:31, 18 May 2011 (UTC)
Type of bug
Inconvenience
What happens
The bot edits in Wikipedia space, which often includes discussion of various citation formats, or examples of what NOT to do.
What should happen
No edits to Wikipedia space
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Wikipedia:Citation_templates&curid=1348162&diff=429799053&oldid=429798079
We can't proceed until
Agreement on the best solution


Pages that should not be edited by the bot should contain {{bots|deny=citation bot}}. Martin (Smith609 – Talk) 13:44, 19 May 2011 (UTC)

Justify why citations in Wikipedia space should be edited by bots. Jc3s5h (talk) 15:14, 19 May 2011 (UTC)
I have complained about this bot behavior at Wikipedia talk:Bots/Requests for approval#Restrict Citation bot to article space.
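
For reference, the opt-out mentioned at the start of this thread can be checked with something like the Python sketch below. It is deliberately simplified and is not the bot's code: it only looks at {{nobots}} and the deny= list of {{bots}}, ignores allow= and other options, and the function name `is_denied` is hypothetical.

```python
import re


def is_denied(wikitext: str, bot_name: str) -> bool:
    """Return True if the page text asks this bot (or all bots) to stay away."""
    if re.search(r"\{\{\s*nobots\s*\}\}", wikitext, re.IGNORECASE):
        return True
    match = re.search(
        r"\{\{\s*bots\s*\|\s*deny\s*=\s*([^}|]*)", wikitext, re.IGNORECASE
    )
    if not match:
        return False
    # The deny value is a comma-separated list of bot names, or "all".
    denied = {name.strip().casefold() for name in match.group(1).split(",")}
    return "all" in denied or bot_name.casefold() in denied


if __name__ == "__main__":
    print(is_denied("{{bots|deny=citation bot}}", "Citation bot"))
```

A namespace check (for example, skipping every page whose title begins with "Wikipedia:") would be a separate rule, which is what the complaint above asks for.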

Replacement of ISBNs

Status
{{resolved}}
Reported by
— Cheers, JackLee talk 16:29, 19 May 2011 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
What happens
The bot changed "{{citation|author=Karen Hearn, ed.|title=Dynasties: Painting in Tudor and Jacobean England 1530–1630|location=London|publisher=[[Tate|Tate Gallery]]|ISBN=978-1-85437-169-0 (hbk.), ISBN 978-1-85437-157-7 (pbk.)}}" to "{{citation|author=Karen Hearn, ed.|title=Dynasties: Painting in Tudor and Jacobean England 1530–1630|location=London|publisher=[[Tate|Tate Gallery]]|id=(hbk.), (pbk.)|isbn=978-1-85437-169-0}}". I thought bots weren't changing the use of |id= to |isbn=?
Relevant diffs/links
[57] (my reversion of the error)
We can't proceed until
Agreement on the best solution


That citation would violate wp:SAYWHEREYOUGOTIT. You should really choose either the hbk or the pbk based on what you used, then either drop the other entirely or else add, e.g., " (also released in pbk as ISBN 978-1-85437-169-0)" after {{citation}} and before the /ref tag, if for no other reason than pagination sometimes varies between the two printings.LeadSongDog come howl! 21:44, 19 May 2011 (UTC)
Fair enough in the example above, but such a citation could also appear in a "Further reading" section where readers are being alerted to the existence of multiple versions of a book and no particular version is being referred to. — Cheers, JackLee talk 09:36, 20 May 2011 (UTC)
Perhaps, but even then, "id=ISBN 978-1-85437-169-0 (hbk.), ISBN 978-1-85437-157-7 (pbk.)" would be erroneous. See Template:Citation/doc and Wikipedia:ISBN. In short, an id should be a unique identifier, not a list. If you want a unique identifier that lists all editions I'd suggest using the OCLC number instead. But we usually give readers some credit. If they can't get the exact ISBN suggested, they can usually figure out enough to click the "other editions" link to find something similar, or perhaps "ask a librarian". Cheers. LeadSongDog come howl! 19:05, 20 May 2011 (UTC)
Hmmm, OK. I asked about this ages ago and was advised to use the magic word "ISBN" as shown in the example above, but perhaps the responder didn't realize there was a policy on this. — Cheers, JackLee talk 12:38, 21 May 2011 (UTC)

Strange page numbers

Status
{{resolved}}
Reported by
Ruud 17:23, 19 May 2011 (UTC)
What happens
strange page numbers, and someone who is not an editor was made the editor
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Haskell_%28programming_language%29&diff=prev&oldid=429909561
We can't proceed until
Agreement on the best solution


The odd page numbering seems to have been picked up from this source or this one or this one which refer to "pages 12–1–12–55" or "pages 12–1 – 12–55", rather than this one which refers to "pages 12:1–12:55" in citing the same source. It looks like an error elsewhere has been propagated forward.LeadSongDog come howl! 21:30, 19 May 2011 (UTC)
And what about incorrectly changing other to editor? —Ruud 14:59, 20 May 2011 (UTC)
Ah, I missed that one. That edit seems to accord with the editor data for ISBN 9780444533876 (if that's the exact book you mean). The cataloguing for that book does seem odd on googlebooks, with the same three names as editors, authors, and then a fourth author compounded from the three. In contrast, the data for ISBN 9780720422085 (Vol 2, 1972) is cleaner and is also partially available in a google limited preview. I note that cataloguing of it is quite rare, appearing on WorldCat as OCLC 185347618 with only one copy shown, in the National Library of Sweden. Compare ISBN 0720422086, OCLC 490011803, and OCLC 552956116. Each has a slightly variant punctuation of the title, and each erroneously includes the volume number in that title. The umbrella record is OCLC 373234 which lists the several variant versions. In particular OCLC 642603650 is widely held, in 445 libraries. This sort of thing is inevitable, but instructive, especially for books preceding the adoption of the ISBN system. LeadSongDog come howl! 18:03, 20 May 2011 (UTC)
OCLC 642603650 seems to give the most correct information, stating "Responsibility: Haskell B. Curry, Robert Feys, with two sections by William Craig. Vol.1.". If this was just a one time miss, I guess there's little to worry about. —Ruud 19:17, 20 May 2011 (UTC)

Automated deletion of human inputted material, page number where a url ends with an & before |page=

Status
{{resolved}}
Reported by
Fifelfoo (talk) 06:14, 8 June 2011 (UTC)
Type of bug
Deleterious
What happens
citation bot removes page number from citation
What should happen
Not removing the page number
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Mass_killings_under_Communist_regimes&diff=433159867&oldid=433159772
We can't proceed until
Agreement on the best solution


It only removed "&lpage=77", which is a redundant part of the Google Books URL. "&PA77" is still there, and that ought to be enough to get you to the right page in Google Books. Perhaps |p=77 should be added, but I don't think this is a bot error. However, in that same edit, why did the bot remove {{fails verification}}? Ucucha 06:33, 8 June 2011 (UTC)

Template:Cite etc..'s |page= parameter is a human readable wikipedia parameter. It is intended to display the contents of |page= to human beings, not as a machine readable hyperlink. Fifelfoo (talk) 07:32, 8 June 2011 (UTC)
As I said, the template did not have |page= set. The text was el-page, not pipe-page. Ucucha 07:35, 8 June 2011 (UTC)
One solution is to use |page=77 and not |url=. That way it is evident to the reader that the link goes to the specific cited page, not to the full work. LeadSongDog come howl! 13:59, 8 June 2011 (UTC)
Thanks Ucucha, it was an el. My eyes are growing older on l1|. Fifelfoo (talk) 04:12, 9 June 2011 (UTC)

Milk snake

Status
{{resolved}}
Reported by
WilliamKF (talk) 23:15, 22 April 2011 (UTC)
Type of bug
Inconvenience
What happens
Template tag author1-first changed to author8-link; template tag author1-last changed to author1-link; template publisher added with redundant information of volume and issue
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Milk_Snake&oldid=380973736 http://en.wikipedia.org/w/index.php?title=Milk_Snake&oldid=368315002
We can't proceed until
Agreement on the best solution


Example edits that were incorrect: Mistakenly changed "| author1-first = Alan H. " to "| author8-link = Alan H. " Mistakenly changed "| author1-last = Savitzky" to "| author1-link = Savitzky" and redundantly added "| publisher = Journal of Herpetology, Vol. 35, No. 4"

There is no such parameter as "author1-last". Nevertheless, I've asked the bot to start correcting this to "last1". Martin (Smith609 – Talk) 22:15, 11 June 2011 (UTC)

Extracting data from templates within ID parameter

Status
Needs tweaking
Reported by
Martin (Smith609 – Talk) 04:42, 18 March 2011 (UTC)
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Hot_Jupiter&diff=prev&oldid=419413162
Replication instructions
Expand: Edwards, Gladys Brown (1973). The Arabian: War Horse to Show Horse (Revised Collector's ed.). Covina, California: Rich Publishing, Inc. LCCN 71-247969.
We can't proceed until
A user confirms that the fix worked


Not sure what is asked to discuss? {{arxiv}} can be declared in 3 main ways. {{arxiv|astro-ph/0610314}}, {{arxiv|astro-ph|0610314}} and {{arxiv|archive=astro-ph|id=0610314}}. In all cases, these should be converted to |arxiv=astro-ph/0610314. Headbomb {talk / contribs / physics / books} 06:18, 18 March 2011 (UTC)
Thanks for the list. I'll get on this asap. Are there other identifiers, e.g. bibcodes, that might have been specified in this way and should now be given their own parameter? Martin (Smith609 – Talk) 13:41, 18 March 2011 (UTC)
Yes. Most of them in fact.
Templates that can be straightforwardly converted (meaning they have |id= redundant with |1= and no other parameters (at least when used in a {{cite xxx|id=}})) include {{asin|0123456789}} (except when |country= is used), {{bibcode|0123456789}} (|label=/|2= can be ignored), {{doi|0123456789}}, {{issn|0123456789}}, {{jfm|0123456789}}, {{jstor|0123456789}} (except when |sici=/|issn= are used), {{mr|0123456798}}, {{oclc|0123456789}} (but not when |2=/|3= are used), {{ol|01234567789}} (only if |author= is not used), {{osti|0123456789}}, {{pmc|0123456789}}, {{pmid|0123456789}}, {{IETF-RFC|0123456789}} (use |rfc=0123456789 when in a {{cite xxx}}), {{ssrn|0123456798}}, and {{zbl|0123456789}}
The two weird-ish ones are {{arxiv}} (detailed above) and {{lccn}} which would need careful examination (and might best be left alone) before converting. Headbomb {talk / contribs / physics / books} 14:13, 18 March 2011 (UTC)
Thanks so much for compiling this list! All the cases you mentioned here now work in r299, although it took a bit of coding at Template:LCCN to incorporate that one! Martin (Smith609 – Talk) 04:33, 19 March 2011 (UTC)
And thanks for incorporating them in the bot. I'll compile a list of URLs associated with these identifiers which could be removed/converted. Headbomb {talk / contribs / physics / books} 15:40, 19 March 2011 (UTC)
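
The three {{arxiv}} forms listed earlier in this thread can be normalised as in the following Python sketch. It is an illustration under assumptions, not the code added in r299: the function name `arxiv_template_to_param` is hypothetical, and nested templates or extra parameters are not handled.

```python
import re


def arxiv_template_to_param(template: str):
    """Convert an {{arxiv|...}} call into an |arxiv= parameter, or None."""
    match = re.fullmatch(
        r"\{\{\s*arxiv\s*\|(.*)\}\}", template.strip(), re.IGNORECASE | re.DOTALL
    )
    if not match:
        return None
    positional, named = [], {}
    for part in match.group(1).split("|"):
        if "=" in part:
            key, value = part.split("=", 1)
            named[key.strip().lower()] = value.strip()
        else:
            positional.append(part.strip())
    if "archive" in named and "id" in named:
        identifier = named["archive"] + "/" + named["id"]
    elif len(positional) == 2:
        # Join without spaces; a space here breaks the outgoing link.
        identifier = "/".join(positional)
    elif len(positional) == 1:
        identifier = positional[0]
    else:
        return None
    return "|arxiv=" + identifier if identifier else None


if __name__ == "__main__":
    for form in (
        "{{arxiv|astro-ph/0610314}}",
        "{{arxiv|astro-ph|0610314}}",
        "{{arxiv|archive=astro-ph|id=0610314}}",
    ):
        print(arxiv_template_to_param(form))
```

Stripping whitespace from each part also covers the two-parameter case reported under "Bibcode and arxiv", where spaces ended up between the two halves of the identifier.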

It seems to miss the LCCNs... see [58]. Headbomb {talk / contribs / physics / books} 13:48, 9 April 2011 (UTC)

  Fixed in GitHub Pull 344 Martin (Smith609 – Talk) 02:24, 12 April 2011 (UTC)

{{resolved}} (?)

Bot replaces book reference by journal's book review

Status
Possibly   Fixed in r307. {{resolved}}
Reported by
Pyrotec (talk) 09:00, 24 March 2011 (UTC)
Type of bug
Deleterious
What happens
(1) Bot has taken a manual reference to a specific volume of a published book, and replaced it with a journal book review of the same book but a different volume. (2) In the same edit it replaced a valid book title with a journal review of a workshop on explosives.
Relevant diffs/links
[59]
We can't proceed until
A user confirms that the fix worked


A similar problem in this edit? Vol III of Explosives encyclopedia translation gets linked to an article in journal Science with title that refers to Vol 1 of the Encyclopedia. My guess is a book review in Science borrowed the book title, bot mismatched using Science title (ignoring author/translator != Dolan), bot mismatched volume (1 != III). Glrx (talk) 19:32, 23 March 2011 (UTC)

Well, in that case, the input OCLC 499857211 given links to a record for a 1967 edition, not one of the numerous 1964 records for Vol 3. The book review here seems to refer to a different 1964 edition and at that, an edition of Vol 1. Simple case of garbage in, garbage out.LeadSongDog come howl! 20:57, 23 March 2011 (UTC)
Well Volume 1 (the version I have) was published in 1964; Volume 2 (the version I have) was published in 1965 and volume 3 (the version I have) was published in 1967. Or, rather than GIGO, vandalism by bot. Pyrotec (talk) 22:22, 23 March 2011 (UTC)
Before the bot edit, that citation was:

*{{Citation |first=Tadeusz |last=Urbański |authorlink= Tadeusz Urbański |title=Chemistry and Technology of Explosives |volume= III |edition=First English |year=1967 |location=Warszawa |publisher=PWN - Polish Scientific Publishers and Pergamon Press |isbn= |oclc=499857211 |doi=}}. Translation by Marian Jurecki, edited by Sylvia Laverton. See also ISBN 9780080104010.

The only linked database for that record was the oclc record, which shows the title as "Chemistry and technology of explosives. Vol.3", the publisher as "Pergamon; PWN, 1967" and the author as "Tadeusz Urbański" There are also very similar OCLC 270730324, OCLC 313039828, OCLC 491753177 all of which appear to describe the same edition and volume (perhaps on different presses?) Meanwhile, the googlebooks entry seems to omit indicating any of these oclc numbers (it shows no "Find in a library" link). And this googlebooks entry links to OCLC 1496279 covering all four volumes, which attributes only volume 2 to PWN. So yes, it was a GIGO scenario. None of the catalogues had records that correspond directly to the data listed in the pre-edit version, but several were near misses. The bot in such circumstances is expected to find the best match it can, which in this case it (incorrectly) decided was doi:10.1126/science.145.3637.1176, a.k.a. Bibcode:1964Sci...145.1176U book review in Science. Having made that decision, it used the results. In unclear cases such as this, where the match is ambiguous, the bot might better tag the citation for human examination, perhaps with {{vn}} or similar. LeadSongDog come howl! 05:40, 24 March 2011 (UTC)
Interestingly, Google Books and WorldCat both indicate that it is a book, NOT a journal with a review of that book. Many journals carry book reviews, often every issue of the journal has multiple book reviews, and the title of the book review is often the full publication details of the book. If the bot is going to substitute the book review details from a journal every time it makes a match between a book and a journal with a review of that book, then the bot is non-compliant with Wikipedia:Bot policy. In this particular case the books (volumes 1 to 3) do not bear ISBNs, otherwise I would have used them. They do have Library of Congress Catalog Card Numbers. It is interesting to speculate whether the misbehaving bot would have removed the ISBN from this book reference if there had been one and replaced it with the ISSN of the journal carrying the book review. Pyrotec (talk) 09:00, 24 March 2011 (UTC)

That was not the only change made by the bot in the RDX article (see [60]). Another book, Cooper, Paul W. (1996), Explosives Engineering, New York: Wiley-VCH, ISBN 0-471-18636-8, which was properly cited (see Amazon listing), was changed to a journal review of a workshop on explosives (see Journal entry). Pyrotec (talk) 09:36, 24 March 2011 (UTC)

Again in this case, an incomplete {{citation}} had an ambiguous identifier. The given ISBN 0471186368 is on OCLC 34409473 and on OCLC 232968272. Both those records show the publisher as VCH. The same author/date/publisher also apply to ISBN 156081926X which seems to be an abridged or introductory version, the title of which includes the title of the other. In any case, adding info on the book review was in error. Clearly when {{citation}} has |isbn= populated this should not happen, though {{cite book}} would not have had an issue. LeadSongDog come howl! 13:41, 24 March 2011 (UTC)
I really fail to see how "Nuclear explosives engineering: Puerto Rico summer workshop", J. A. Cheney and W. K. Talley, in the Journal of Nuclear Engineering, published by Elsevier Science Ltd, which has an ISSN, can be regarded as a match. Pyrotec (talk) 15:41, 24 March 2011 (UTC)
I clicked on both links and there is an image of the book and guess what, the image gives the publisher as Wiley-VCH - try it yourself. I also used the British Library Integrated catalogue (at [61]) and guess who the publisher is - Wiley-VCH. Pyrotec (talk) 16:15, 24 March 2011 (UTC)
Oh well. I'll manually check the last 500 edits by the bot and list all the errors here. Pyrotec (talk) 16:15, 24 March 2011 (UTC)
That would be very helpful. The string "explosives engineering" in the JNE article title is probably what the bot matched. If you want, you could sandbox the citation and experiment with variations. The ISBNs starting 0-471 and those starting 1-560 do appear to distinguish VCH from Wiley-VCH, so that may have been a factor.
@Martin, can the bot check to see if the article is in Category:Books and (if it is not) assume that an ISBN is referring to the work cited (i.e. exclude book reviews in news or in journals)? LeadSongDog come howl! 18:26, 24 March 2011 (UTC)
This looks like a case of the error listed below, where AdsAbs returns "fuzzy" matches. I've tried to replicate this specific edit with the current version of the bot, and the DOI and bibcode are no longer added. Can you verify whether or not this has fixed the bug globally? Martin (Smith609 – Talk) 13:58, 27 March 2011 (UTC)

Wrong information

Status
  Fixed in GitHub Pull 321
Reported by
Ucucha 03:33, 28 March 2011 (UTC)
Type of bug
Deleterious
What happens
Bot expands DOI with completely wrong information
Relevant diffs/links
[62]. Clicking on the DOI gives the right citation, and CrossRef also gives information about the correct article.
We can't proceed until
A user confirms that the fix worked


The bot processed the actual DOI for this paper just before this one: Mikovits, J. A.; Ruscetti, F. W. (2010). "Response to Comments on "Detection of an Infectious Retrovirus, XMRV, in Blood Cells of Patients with Chronic Fatigue Syndrome"". Science. 328 (5980): 825. Bibcode:2010Sci...328..825M. doi:10.1126/science.1184548. See [63]. Might those data have lingered somewhere in the bot? Ucucha 03:47, 28 March 2011 (UTC)

In hindsight, this looks like the PMID=1 bug below. LeadSongDog come howl! 22:41, 30 March 2011 (UTC)

{{resolved}}

Auto-uppercasing of journal titles (?) causes mojibake

Status
  Fixed in GitHub Pull 327 ; {{resolved}}
Reported by
cab (call) 12:40, 31 March 2011 (UTC)
Type of bug
Inconvenience
What happens
See supplied diff. Only the first character of the journal title gets mangled. The mangled character 兵 is E5 85 B5 in UTF-8. It gets mangled to Ņ�. Note that UTF-8 for Ņ is C5 85, and E5 − 0x20 = C5. I suspect the first byte of 兵 is being mistakenly treated as an ASCII character, and something attempts to uppercase it by subtracting 0x20.
Relevant diffs/links
https://secure.wikimedia.org/wikipedia/en/w/index.php?title=Indiaans_in_Japan&diff=prev&oldid=417758819
We can't proceed until
Agreement on the best solution
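The suspicion in the report can be reproduced outside the bot. Below is a minimal Python sketch, not the bot's actual code: `naive_byte_ucfirst` is a hypothetical routine that assumes the faulty step treats any first byte at or above `a` as a lowercase ASCII letter and subtracts 0x20 from it. `safe_ucfirst` shows the character-level alternative.

```python
def naive_byte_ucfirst(raw: bytes) -> bytes:
    """Hypothetical buggy routine: 'uppercase' the first byte as if it were ASCII.

    The missing upper bound (<= ord('z')) is the assumed bug: lead bytes of
    multi-byte UTF-8 sequences (0xC2 and above) also pass the test.
    """
    if raw and raw[0] >= ord("a"):
        return bytes([raw[0] - 0x20]) + raw[1:]
    return raw


def safe_ucfirst(text: str) -> str:
    """Uppercase the first character of the decoded string, never a raw byte."""
    return text[:1].upper() + text[1:]


original = "兵"
encoded = original.encode("utf-8")

# Apply the byte-level "uppercase", then decode the damaged bytes the way a
# browser would, replacing the invalid trailing byte with U+FFFD.
mangled = naive_byte_ucfirst(encoded).decode("utf-8", errors="replace")

print(encoded.hex())          # bytes of the original character
print(repr(mangled))          # the damaged result
print(safe_ucfirst(original)) # unchanged, since the character has no case
```

If the sketch matches what the bot was doing, the first two bytes become C5 85 (Ņ) and the leftover continuation byte can no longer be decoded, which is the Ņ� pattern seen in the diff.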


  • Testing that theory:
    • base character 兵
    • italic
    • ucfirst 兵
    • lcfirst 兵
    • Nocaps
    • Smallcaps
    • Allcaps
    • base character Ņ
    • italic Ņ
    • ucfirst Ņ
    • lcfirst ņ
    • Nocaps Ņ
    • Smallcaps Ņ
    • Allcaps Ņ
      • Doesn't look as if that theory holds up, nor obvious variants. Anyhow, the article has since addressed the issue with the use of {{asiantitle}}, which seems a more elegant answer. I'm a bit surprised they didn't use |language= though. LeadSongDog come howl! 15:20, 31 March 2011 (UTC)

Since we're reporting auto-casing, there should be a tweak for words following a colon. For example, here Citation Bot changes Light-Front Holography and Gauge/Gravity Duality: The Light Meson and Baryon Spectra to Light-Front Holography and Gauge/Gravity Duality: the Light Meson and Baryon Spectra. It is customary, though not universal, to capitalize the first word after a colon, since it's usually the start of a subtitle. Hence Star Wars Episode IV: A New Hope and not the jarringly weird Star Wars Episode IV: a New Hope. Headbomb {talk / contribs / physics / books} 23:56, 31 March 2011 (UTC)
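A rough Python sketch of the requested tweak follows. It is only an illustration: the `SMALL_WORDS` list and the `title_case` function are hypothetical and not taken from the bot's code. The idea is simply that a "small" word is left in lower case unless it opens the title or follows a word ending in a colon.

```python
# Hypothetical list of words that a title-casing routine would lower-case.
SMALL_WORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to"}


def title_case(title: str) -> str:
    """Title-case a string, capitalising the first word after a colon."""
    result = []
    start_of_segment = True  # True at the start of the title and after "word:"
    for word in title.split(" "):
        if not word:
            # Preserve runs of spaces without changing state.
            result.append(word)
            continue
        if word.lower() in SMALL_WORDS and not start_of_segment:
            result.append(word.lower())
        else:
            # Upper-case only the first letter; leave the rest (e.g. "IV:") alone.
            result.append(word[:1].upper() + word[1:])
        start_of_segment = word.endswith(":")
    return " ".join(result)


print(title_case("Star Wars Episode IV: a New Hope"))
print(title_case(
    "Light-Front Holography and Gauge/Gravity Duality: "
    "the Light Meson and Baryon Spectra"
))
```

Both examples from this thread come out with the subtitle's first word capitalised, while "and" inside the title stays in lower case.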

All done, I hope. Martin (Smith609 – Talk) 04:02, 1 April 2011 (UTC)

Bare references renamed

Status
  Fixed in GitHub Pull 337
Reported by
Martin (Smith609 – Talk) 12:24, 1 April 2011 (UTC)
Type of bug
Deleterious
What happens
Bare references renamed
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Operation_Opera&diff=prev&oldid=421808300
We can't proceed until
A specific edit to the bot's code is requested below.


{{resolved}}

Missed a page?

Status
{{resolved}}
Reported by
Headbomb {talk / contribs / physics / books} 17:18, 1 April 2011 (UTC)
Type of bug
Improvement
What happens
Fills the template...
What should happen
... but without the page
We can't proceed until
Input from editors


Pagination for this article is not present in the CrossRef database. Martin (Smith609 – Talk) 13:24, 9 April 2011 (UTC)

Spires and AdsAbs concur with the PhysRevC source: they give an article id (031303) in lieu of a page number, plus a page count (5pp.) but no actual page. This is similar to the situation with Pubmed listing of electronic-only journals discussed recently. LeadSongDog come howl! 13:45, 12 April 2011 (UTC)

conversion of {{arxiv}} to |arxiv= fails when the template has named parameters

Status
  Fixed in GitHub Pull 335
Reported by
David Eppstein (talk) 20:05, 6 April 2011 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
What happens
In v331, {{arxiv|archive=X|id=Y}} becomes arxiv=archive=X/id=Y, should become arxiv=X/Y
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Universal_graph&curid=1452141&diff=422744870&oldid=418687216
We can't proceed until
Operator: Bot operator's feedback on what is feasible
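For illustration, here is one way the conversion described in this report could be handled, sketched in Python. This is not the bot's code. The function name `arxiv_identifier` is hypothetical, only the `archive=` and `id=` parameter names come from the report, and the fallback to unnamed (positional) parameters is an assumption.

```python
import re
from typing import Optional

# Matches {{arxiv|...}} with no nested templates inside.
ARXIV_TEMPLATE = re.compile(r"\{\{\s*arxiv\s*\|([^{}]*)\}\}", re.IGNORECASE)


def arxiv_identifier(template_text: str) -> Optional[str]:
    """Return the value for |arxiv= extracted from an {{arxiv}} template."""
    match = ARXIV_TEMPLATE.search(template_text)
    if match is None:
        return None

    named = {}
    positional = []
    for part in match.group(1).split("|"):
        key, sep, value = part.partition("=")
        if sep:
            # Named parameter: keep the value only, never the "name=" prefix.
            named[key.strip().lower()] = value.strip()
        else:
            positional.append(part.strip())

    # Prefer the named form from the report; fall back to positional
    # parameters (assumed behaviour) when no named ones are present.
    pieces = [p for p in (named.get("archive"), named.get("id")) if p]
    if not pieces:
        pieces = [p for p in positional if p]
    return "/".join(pieces) if pieces else None


print(arxiv_identifier("{{arxiv|archive=X|id=Y}}"))
```

With the input from the report, the sketch yields X/Y rather than archive=X/id=Y.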


Not fixed. See [64] Headbomb {talk / contribs / physics / books} 13:35, 9 April 2011 (UTC)

This was reported as a separate bug below and   Fixed in GitHub Pull 341. Martin (Smith609 – Talk) 02:02, 12 April 2011 (UTC)

{{resolved}}

Wrong ISBN applied to old book

Status
{{resolved}}?
Reported by
Redrose64 (talk) 21:55, 7 April 2011 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
What happens
The citation is that of the first edition, by E.T. MacDermot (pub. Great Western Railway 1931, 654 pages). The ISBN applied with this edit is that of the second edition, "by E.T. MacDermot, revised by C.R. Clinker" (pub. Ian Allan 1964, 480 pages). The two editions are very different, the second edition having significantly fewer pages than the first, so had ISBNs been around prior to 1964, the ISBN of the first edition would have been different.
What should happen
Please don't apply an ISBN unless you're absolutely certain that it's correct for the edition concerned.
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Greenford_Branch_Line&curid=1639029&diff=422914126&oldid=418279295
We can't proceed until
Agreement on the best solution


I've also seen and been annoyed by this bug, although I don't have diffs easily available. —David Eppstein (talk) 21:57, 7 April 2011 (UTC)
The catalogue data seems to show this as a 1927 multivolume set. This may have caused a mismatch on the 1931 cover date which pertains to the specific volume, not the whole set. Added OCLC and given names in article. LeadSongDog come howl! 23:16, 7 April 2011 (UTC)
I found the one that annoyed me after all: here. —David Eppstein (talk) 23:50, 7 April 2011 (UTC)
Given names are fine: but the OCLC page is specific to vol. I, not a set (it states "Contents: v. 1. 1833-1863."). The bug reported above concerns misattribution of vol. II.
I possess a first edition, and can obtain a second edition easily enough tomorrow - Didcot Public Library have a set, and it's 12 mins walk from my house. The first edition is curious in that there are two volumes, but three physical books:
  • Title page: "Vol. I. 1833-1863 Part I" is dated 1927 and pages are i-xvi and 1-456;
  • Title page: "Vol. I. 1833-1863 Part II" also dated 1927, pages i-x, 457-902;
  • Title page: "Vol. II. 1863-1921" dated 1931, pages i-xii, 1-654
The 2nd edition reduced the line spacing and font size, allowing vol. I to be bound as a single book; the pagination changed completely, and a third volume was added for 1923-47 (this one was by O.S. Nock).
Clicking the Edward Terence MacDermot link turns up fifteen entries for these books (both first and second editions), including vol. 3 which MacDermot contributed nothing to - it was entirely the work of O.S. Nock. All seem to have errors (for example, OCLC 55853902 purports to be vol. II, but is in fact vol. I part II - the pages are shown as "458-502", which is certainly a typo for "457-902"), so I do not trust that OCLC 4106652 one bit. The best match seems to be OCLC 55853736. The set itself seems to be OCLC 504222427. --Redrose64 (talk) 00:09, 8 April 2011 (UTC)
Compare OCLC 52543245 which is "2v. in 3." Note the trailing comma in the title, also seen in this Library of Congress record. There are several versions on Google books with snippets and cover views, including this which seems to be your Volume 2. The cover view is distinct from this which is Volume 1 Part 2. Note the last line above the coats of arms on the covers. Google's "Find in a library" links both these to OCLC 4106652. Then this entry at the OpenLibrary shows "3v. in 4:", including the volume by Oswald Stevens Nock (see the MARC record from Binghamton U.). Still, I'm fairly sure you want this OpenLibrary record, derived from this MARC record contributed by Talis. Vol 2, 654pp, 1931. Is that the one you have? LeadSongDog come howl! 03:57, 8 April 2011 (UTC)
Searching the Library of Congress authorities for "MacDermot, E.T." finds seven records, of which the definitive one is "MacDermot, Edward T. (Edward Terence), 1873-1950" I note that the different OCLC entries discussed above are inconsistent as to which of the seven they use, though no doubt these variations will eventually get sorted out. If you're really keen, you could register an account at the OpenLibrary and leave comments to that effect. While at it, you might want to upload a cover scan to OpenLibrary. LeadSongDog come howl! 04:23, 8 April 2011 (UTC)
Sorry, but that link fails for me with "Internal Server Error The server encountered an internal error or misconfiguration and was unable to complete your request. Please contact the server administrator, ils@loc.gov and inform them of the time the error occurred, and anything you might have done that may have caused the error. More information about this error may be available in the server error log."
The "cover" shown at Google Books is actually the title page. The cover proper is plain brown leathercloth, with no text; I don't have the dustjacket, but that would probably have matched those for vol. I, which are pale blue, mostly plain apart from the six words "HISTORY OF THE GREAT WESTERN RAILWAY" in black across four lines. I don't see a parameter in {{cite book}} which is suitable for an OpenLibrary number, so I'm sticking with OCLC 55853736. Thanks for your help. --Redrose64 (talk) 13:47, 8 April 2011 (UTC)

Open Library number is |ol=number.

*{{cite book
 |last=MacDermot |first=E. T.
 |year=1927
 |title=History of the Great Western Railway
 |publisher=[[Great Western Railway Company]]
 |ol=22019452M
 |lccn=28015258
}}

gives

Headbomb {talk / contribs / physics / books} 03:15, 12 April 2011 (UTC)

Thanks, I've added |ol= and |lccn= (plus some others) to the documentation of {{cite book}}, but in a rather scanty form. --Redrose64 (talk) 15:22, 12 April 2011 (UTC)

URL eaten

Status
  Fixed in GitHub Pull 336
Reported by
Peregrine981 (talk) 08:16, 8 April 2011 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
What happens
bot deleted a valid URL for an article
What should happen
bot should leave it alone
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Paul_R._Ehrlich&diff=422938413&oldid=420803595
We can't proceed until
A user confirms that the fix worked
Requested action from maintainer
Just FYI


It's not quite a bug since the URL wasn't used by the template. But some improvement could be made here I think. Headbomb {talk / contribs / physics / books} 08:33, 8 April 2011 (UTC)

URL incorrectly formatted since |url= not present, but that's not a reason to delete the content. Fortunately this should be pretty rare. Rjwilmsi 09:16, 8 April 2011 (UTC)

{{resolved}}

Messing with authors for no real reason

Status
  Fixed in GitHub Pull 342
Reported by
Headbomb {talk / contribs / physics / books} 13:47, 9 April 2011 (UTC)
Type of bug
Inconvenience
What happens
Citation Bot reorders authors parameters very weirdly
What should happen
The bot shouldn't mess with that stuff
We can't proceed until
A user confirms that the fix worked


{{resolved}}

Bad merging of references

Status
{{resolved}}
Reported by
Headbomb {talk / contribs / physics / books} 04:46, 11 April 2011 (UTC)
Type of bug
Catastrophic
What happens
Wrongfully merging and renaming duplicate references
Relevant diffs/links
http://en.wikipedia.org/w/index.php?title=Period_%28gene%29&diff=prev&oldid=423391043
We can't proceed until
A user confirms that the fix worked


Bot paused until I can investigate. Martin (Smith609 – Talk) 13:47, 11 April 2011 (UTC)
Cannot replicate. Martin (Smith609 – Talk) 01:52, 12 April 2011 (UTC)
The problem seemed to be triggered by the empty named ref: <ref name=BioClocks></ref> , which the bot should have simply deleted but instead replaced. LeadSongDog come howl! 19:25, 12 April 2011 (UTC)
Actually, <ref name=BioClocks></ref> is a synonym for <ref name=BioClocks /> . I realized this and updated the bot in a recent edit, presumably after this bug was reported. Martin (Smith609 – Talk) 21:17, 13 April 2011 (UTC)
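To make the equivalence concrete, here is a small Python sketch that rewrites an empty named ref pair into the self-closing form before any duplicate-merging is attempted. It is an illustration only, not the change that was made to the bot; the regular expression and the name `normalise_empty_refs` are hypothetical.

```python
import re

# An opening <ref name=...> immediately followed by </ref>, with only
# whitespace in between. The name may be bare or quoted.
EMPTY_NAMED_REF = re.compile(
    r"<ref\s+name\s*=\s*(\"[^\"]*\"|'[^']*'|[^\s/>]+)\s*>\s*</ref\s*>",
    re.IGNORECASE,
)


def normalise_empty_refs(wikitext: str) -> str:
    """Rewrite <ref name=X></ref> as <ref name=X />, leaving other refs alone."""
    return EMPTY_NAMED_REF.sub(lambda m: f"<ref name={m.group(1)} />", wikitext)


print(normalise_empty_refs("Text<ref name=BioClocks></ref> more text"))
```

Refs that actually contain a citation, and refs that are already self-closing, do not match the pattern and pass through unchanged.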

Suggested small enhancement

Martin (Smith609 – Talk) 17:19, 14 June 2011 (UTC)   Fixed in GitHub Pull 355 && {{resolved}}.

incorrect change in Stalemate

Status
  Fixed in GitHub Pull 348
Reported by
Bubba73 You talkin' to me? 21:32, 2 May 2011 (UTC)
Type of bug
Deleterious
What happens
Dashes in a comment after a date were taken as a date range instead
Relevant diffs/links
[65]
We can't proceed until
Agreement on the best solution


{{resolved}}

Explanation please

Is there a reason why this bot is changing cite book to citation? Did we do away with this? See here [66] Heiro 05:29, 19 May 2011 (UTC)

The article had 6 of the former, 15 of the latter. The bot picked the more common form. LeadSongDog come howl! 12:51, 19 May 2011 (UTC)
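As a rough illustration of "picking the more common form", the Python sketch below counts the two template families in a page's wikitext and reports which dominates. The patterns and the name `dominant_style` are hypothetical; how the bot itself counts templates, and what it does on a tie, is not stated here.

```python
import re
from typing import Optional

# {{cite book, {{cite journal, {{cite_web, ... (the "cite" family)
CITE_FAMILY = re.compile(r"\{\{\s*cite[ _]+\w+", re.IGNORECASE)
# {{citation| or {{citation}} but not, e.g., {{citation needed}}
CITATION = re.compile(r"\{\{\s*citation\s*[|}]", re.IGNORECASE)


def dominant_style(wikitext: str) -> Optional[str]:
    """Return 'cite' or 'citation', whichever is used more often; None on a tie."""
    cite_count = len(CITE_FAMILY.findall(wikitext))
    citation_count = len(CITATION.findall(wikitext))
    if cite_count == citation_count:
        return None
    return "cite" if cite_count > citation_count else "citation"


# The counts mentioned in this thread: 6 of one form, 15 of the other.
page = "{{cite book|title=A}}\n" * 6 + "{{citation|title=B}}\n" * 15
print(dominant_style(page))
```

Returning None on a tie is a design choice in this sketch: with no majority, leaving the page's templates untouched seems the least surprising behaviour.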

{{resolved}}

Overzealous change: Machlin to MacHlin

Status
  Fixed; {{resolved}}
Reported by
Ucucha 14:20, 2 June 2011 (UTC)
Type of bug
Inconvenience
What happens
Diff. The name of author "Machlin" gets changed into "MacHlin". The bot apparently got its data from the PubMed API, which I don't know how to access, but the main PubMed page for this paper correctly shows the author as "Machlin".
Replication instructions
Run bot on a page with text of [67].
We can't proceed until
Bot operator's feedback on what is feasible


I don't know how feasible a fix is; there are probably many cases where an article was written by a true Scotsman and the bot should make this change, because its input data often make errors with case, but there are certain to be more exceptions like this one. Ucucha 14:20, 2 June 2011 (UTC)

With the putative exception of MacHeath, I can't think of any Macs who would have an H in the fourth position of their name, so I've disabled MacCapitalization for MacH*s (in r 346). Apologies to any MacHectors that I may upset with this decision! Martin (Smith609 – Talk) 21:27, 11 June 2011 (UTC)
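For anyone curious what such an exception might look like, here is a minimal Python sketch. It is not the code from r 346: the pattern and the name `capitalise_mac` are hypothetical, and the rule is deliberately crude (it would also turn "Macron" into "MacRon").

```python
import re

# "Mac" followed by a lower-case letter and at least one more lower-case letter.
MAC_NAME = re.compile(r"^Mac([a-z])([a-z]+)$")


def capitalise_mac(surname: str) -> str:
    """Capitalise the letter after 'Mac', except when that letter is 'h'."""
    match = MAC_NAME.match(surname)
    if match is None:
        return surname
    letter, rest = match.groups()
    if letter == "h":
        # MacH* names are left alone, so "Machlin" is not turned into "MacHlin".
        return surname
    return f"Mac{letter.upper()}{rest}"


print(capitalise_mac("Macdonald"))
print(capitalise_mac("Machlin"))
```

Names that already carry a capital after "Mac" do not match the pattern and are returned unchanged.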

Missing data

Status
  Done; {{resolved}}
Reported by
Operator
Type of bug
Improvement
What happens
Some metadata not extracted
Relevant diffs/links
Template:Cite doi/10.1007.2F978-94-007-0680-4_11
We can't proceed until
A specific edit to the bot's code is requested below.
Requested action from maintainer
Is the bot missing available data?


Book chapters now supported. Martin (Smith609 – Talk) 21:22, 11 June 2011 (UTC)

Archiving problem

The archiving bot archived the same 11, 12, 13, 14 or 22 identical threads into Archive_1 over and over again.

See the edit history for Archive_1, especially between 23 June 2011 and 19 August 2011.

The duplication of threads has now been removed from Archive_1. See Archive1 (the 3rd archive) for these threads. The item numbers below refer to those found in the version of Archive_1 dated 19 August 2011 and the version of Archive1 (the 3rd archive) dated 2 October 2011.

  • Items 178 to 188 (11 items) of Archive_1 were duplicate copies of items 1-4, 10, 11, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 189 to 199 (11 items) of Archive_1 were duplicate copies of items 1-4, 10, 11, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 200 to 210 (11 items) of Archive_1 were duplicate copies of items 1-4, 10, 11, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 211 to 222 (12 items) of Archive_1 were duplicate copies of items 1-4, 10-12, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 223 to 234 (12 items) of Archive_1 were duplicate copies of items 1-4, 10-12, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 235 to 246 (12 items) of Archive_1 were duplicate copies of items 1-4, 10-12, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 247 to 258 (12 items) of Archive_1 were duplicate copies of items 1-4, 10-12, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 259 to 270 (12 items) of Archive_1 were duplicate copies of items 1-4, 10-12, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 271 to 283 (13 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 284 to 296 (13 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 297 to 309 (13 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 310 to 322 (13 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 323 to 335 (13 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 336 to 349 (14 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 14, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 350 to 363 (14 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 14, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 364 to 377 (14 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 14, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 378 to 391 (14 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 14, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 392 to 405 (14 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 14, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 406 to 419 (14 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 14, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 420 to 433 (14 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 14, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 434 to 447 (14 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 14, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 448 to 461 (14 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 14, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 462 to 475 (14 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 14, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 476 to 489 (14 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 14, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 490 to 503 (14 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 14, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 504 to 505 (2 items) of Archive_1 were duplicate copies of items 1-2 of Archive1 (the 3rd archive).

The following duplicate threads were manually moved over to Archive_2 in 2012 and have now also been removed.

  • Items 506 to 517 (12 items) of Archive_1 were duplicate copies of items 3-4, 6, 10-12, 14, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 518 to 531 (14 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-12, 14, 17, 18, 22, 25, 29 of Archive1 (the 3rd archive).
  • Items 532 to 553 (22 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-14, 17-19, 21-27, 29, 31 of Archive1 (the 3rd archive).
  • Items 554 to 575 (22 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-14, 17-19, 21-27, 29, 31 of Archive1 (the 3rd archive).
  • Items 576 to 597 (22 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-14, 17-19, 21-27, 29, 31 of Archive1 (the 3rd archive).
  • Items 598 to 619 (22 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-14, 17-19, 21-27, 29, 31 of Archive1 (the 3rd archive).
  • Items 620 to 641 (22 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-14, 17-19, 21-27, 29, 31 of Archive1 (the 3rd archive).
  • Items 642 to 663 (22 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-14, 17-19, 21-27, 29, 31 of Archive1 (the 3rd archive).
  • Items 664 to 685 (22 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-14, 17-19, 21-27, 29, 31 of Archive1 (the 3rd archive).
  • Items 686 to 707 (22 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-14, 17-19, 21-27, 29, 31 of Archive1 (the 3rd archive).
  • Items 708 to 729 (22 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-14, 17-19, 21-27, 29, 31 of Archive1 (the 3rd archive).
  • Items 730 to 751 (22 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-14, 17-19, 21-27, 29, 31 of Archive1 (the 3rd archive).
  • Items 752 to 773 (22 items) of Archive_1 were duplicate copies of items 1-4, 6, 10-14, 17-19, 21-27, 29, 31 of Archive1 (the 3rd archive).