User talk:Citation bot/Archive 10


Citoid finds doi, Citation bot does not

If expanding from raw JSTOR URL (http://www.jstor.org/stable/3363372), the bot does not find DOIs, while Citoid does. Can we somehow call the same resources as Citoid?

Citation bot

"New Lights upon Old Tunes. "The Arethusa"". The Musical Times and Singing Class Circular. 35 (620): 666–668. 1894. JSTOR 3363372.

Citoid (+Citation bot afterwards)

"New Lights upon Old Tunes. "The Arethusa"". The Musical Times and Singing Class Circular. 35 (620): 666–668. 1894. doi:10.2307/3363372. JSTOR 3363372. (tJosve05a (c) 21:18, 30 August 2018 (UTC)

We do not add DOIs that are not in CrossRef at this time. AManWithNoPlan (talk) 23:56, 30 August 2018 (UTC)
Two more comments. It adds nothing since it is just a JSTOR stable ID DOI. Also, we based our JSTOR code on Citoid's, so they have nothing that we don’t. AManWithNoPlan (talk) 00:18, 31 August 2018 (UTC)

{{notabug}}

cite journal -> cite book wtf?

Status
{{notabug}} Cannot reproduce weird
Reported by
Headbomb {t · c · p · b} 18:19, 24 August 2018 (UTC)
What happens
changes cite journal to cite book for unclear reasons
What should happen
not that
Relevant diffs/links
[1]
We can't proceed until
Feedback from maintainers


Looks like a false positive, but I can't reproduce from the citation alone. Did you get any clue from the bot's output as to what was happening here? Can you reproduce from the page? Martin (Smith609 – Talk) 18:25, 24 August 2018 (UTC)

API gives...

Checking AdsAbs database
   > AdsAbs search 3476/50000: title:"Music and Connectionism"
   + Adding bibcode: 1994ASAJ...96.1218T
   + Adding journal: Acoustical Society of America Journal
   - Dropping parameter "publisher"
   + Adding volume: 96
   + Adding issue: 2
   + Adding pages: 1218
   + Adding doi: 10.1121/1.410341

Headbomb {t · c · p · b} 18:30, 24 August 2018 (UTC)

So, it found a review of the book. Probably matching on name alone. AManWithNoPlan (talk) 19:21, 24 August 2018 (UTC)

Whitespace issue

Status
{{fixed}} not sure when, but it is
Reported by
Martin (Smith609 – Talk) 15:07, 21 August 2018 (UTC)
What happens
Replacement of {Cite arxiv with {Cite journal modifies whitespace
What should happen
Retain pre-existing whitespace. There will be blood!
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Black_hole&diff=prev&oldid=855893224
We can't proceed until
Feedback from maintainers


Removal of trailing full stop

Status
{{fixed}}
Reported by
wumbolo ^^^ 10:22, 27 August 2018 (UTC)
What happens
Trailing period is removed (Bot changes "Washington D.C." to "Washington D.C" in |title=)
What should happen
Trailing period should not be removed
Relevant diffs/links
[2]
We can't proceed until
Feedback from maintainers


This should apply to all such abbreviation (unspaced or S.H.I.E.L.D. or spaced R. G.), plus a small list of words like "Inc., Ltd." Headbomb {t · c · p · b} 14:14, 27 August 2018 (UTC)

I do not know why the bot even does this. There are just too many cases when it should be there. AManWithNoPlan (talk) 14:58, 27 August 2018 (UTC)
Actually if you leave the abbreviations alone, there are very very few false positives left. Headbomb {t · c · p · b} 16:32, 27 August 2018 (UTC)

https://github.com/ms609/citation-bot/pull/703/files AManWithNoPlan (talk) 00:27, 1 September 2018 (UTC)
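
The rule proposed in this thread (leave abbreviations and a few words such as "Inc." and "Ltd." alone) can be sketched as follows. This is only an illustration in Python, not the bot's PHP code: the helper name `strip_trailing_period`, the regular expression, and the short word list are all assumptions made for the example.

```python
import re

# Illustrative list only; the thread names "Inc., Ltd." as examples.
KEEP_PERIOD_WORDS = {"inc.", "ltd."}

# A trailing dotted abbreviation: "D.C.", "S.H.I.E.L.D." or the spaced "R. G."
ABBREVIATION_AT_END = re.compile(r"(?:\b[A-Za-z]\.\s?){2,}$")


def strip_trailing_period(title: str) -> str:
    """Drop one trailing full stop unless it belongs to an abbreviation."""
    stripped = title.rstrip()
    # Nothing to do without a final period; leave ellipses ("...") alone too.
    if not stripped.endswith(".") or stripped.endswith(".."):
        return title
    last_word = stripped.split()[-1].lower()
    if last_word in KEEP_PERIOD_WORDS:
        return title
    if ABBREVIATION_AT_END.search(stripped):
        return title
    return stripped[:-1]
```

With this sketch, "Washington D.C." keeps its period while "A plain title." loses it. Single-letter endings such as "Vitamin C." would still be stripped, so a real implementation would probably need more exceptions.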

Caps: I

Status
new bug
Reported by
Headbomb {t · c · p · b} 04:05, 30 August 2018 (UTC)
Relevant diffs/links
[3]
We can't proceed until
Feedback from maintainers


That will take a special case for the journal name. AManWithNoPlan (talk) 13:28, 30 August 2018 (UTC)

https://github.com/ms609/citation-bot/pull/699

Caps: per

Status
new bug
Reported by
Headbomb {t · c · p · b} 02:25, 2 September 2018 (UTC)
Relevant diffs/links
[4]
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/710 AManWithNoPlan (talk) 02:45, 2 September 2018 (UTC)

volume / issue demixupification

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 23:53, 22 August 2018 (UTC)
What happens
Nothing
What should happen
In {{cite journal}} find
(\s*)\|(\s*)volume(\s*)=(\s*)(\d+)\s*\((\d+(-|–|\–|\{\{ndash\}\})?\d*)\)
and replace with
$1|$2volume$3=$4$5$1|$2issue$3=$4$6

However, if |issue= is already set and ≠ $6, skip

We can't proceed until
Feedback from maintainers


See [5] for a small sample of what is screwed up. The regex would catch more cases though. Headbomb {t · c · p · b} 23:53, 22 August 2018 (UTC)

I think if issue is set, then look for and remove (issue).
 If not set then look for ^([A-Z0-9]+)(\([0-9].\))$

Thus volumes are numbers and capitals. Issues start with numbers. AManWithNoPlan (talk) 03:45, 2 September 2018 (UTC)

Except for all the issues that don't (e.g. 'Suppl. 1', 'Fasc. 1', 'Special Issue'). If the issue is set, skip this fix. There's likely a problem with the citation, but it's not something the bot could reliably fix. (e.g. if you have weird stuff in issue volume number/year/pagenumber chances are you'll have weird stuff everywhere in volume/issue/page). My regex has been fairly well tested in User:CitationCleanerBot, and I don't recall running into any issue with it. The only thing I can't make it do with AWB is clean up the volume if |issue= is already set and = $6, because I'm skipping on "if issue is set", which you presumably could do with citation bot (if ≠ $6, should be skipped, per above). Headbomb {t · c · p · b} 11:19, 2 September 2018 (UTC)

https://github.com/ms609/citation-bot/pull/713 AManWithNoPlan (talk) 15:16, 2 September 2018 (UTC)
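
For readers following the regex discussion, here is a rough Python sketch of the behaviour requested in this section: move an issue embedded in |volume= into |issue=, but skip when |issue= is already set to something different. It works on an already-parsed parameter dictionary, which is an assumption of the example (the regex in the report operates on raw wikitext and preserves whitespace), and the function name `split_volume_issue` is hypothetical. The pattern is a simplified form of the one in the report and does not handle {{ndash}}.

```python
import re

# A volume value such as "35 (620)" or "12 (3–4)": digits, then an issue in parentheses.
VOLUME_WITH_ISSUE = re.compile(r"^\s*(\d+)\s*\((\d+(?:[-–]\d+)?)\)\s*$")


def split_volume_issue(params: dict) -> dict:
    """Return a copy of params with an issue embedded in 'volume' moved to 'issue'."""
    match = VOLUME_WITH_ISSUE.match(params.get("volume", ""))
    if not match:
        return dict(params)
    volume, issue = match.groups()
    existing = params.get("issue", "").strip()
    if existing and existing != issue:
        # Conflicting data: per the thread, skip rather than guess.
        return dict(params)
    fixed = dict(params)
    fixed["volume"] = volume
    fixed["issue"] = issue
    return fixed
```

For example, a citation with volume "35 (620)" and no issue would come back with volume "35" and issue "620", while one that already has issue "Suppl. 1" is returned unchanged.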

cite article

I had to do this to get the bot to do this. (tJosve05a (c) 12:57, 23 August 2018 (UTC)

too many citation templates AManWithNoPlan (talk) 23:10, 23 August 2018 (UTC)
https://github.com/ms609/citation-bot/pull/709 AManWithNoPlan (talk) 23:37, 1 September 2018 (UTC)

{{fixed}}

eJournal of...

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 16:03, 23 August 2018 (UTC)
What happens
|journal=eFoobar → |journal=EFoobar
What should happen
If you have |journal=eFoobar as a pattern, keep it |journal=eFoobar
Relevant diffs/links
[6]
We can't proceed until
Feedback from maintainers


Let's see how recurrent this issue is, and re-evaluate if adding individual exceptions becomes unmanageable. Martin (Smith609 – Talk) 18:29, 24 August 2018 (UTC)

Well, eLife has about 375 uses on Wikipedia WP:JCW/E12, and eJournal / e-Journal appear a crap ton. (Note that they will often display as ELife / EJournal /E-Journal due to how JL-Bot presents that information.) So most could probably be handled with an exception for eLife / eJournal / e-Journal. Headbomb {t · c · p · b} 18:36, 24 August 2018 (UTC)

  • My suggestion would be to (somehow) code so that "{lowercase}{Uppercase}bar" (such as iPhone, eLife, aJournal) not be cap/case-adjusted. (tJosve05a (c) 19:54, 24 August 2018 (UTC)
On top of eLife, e-?Journals?, there's also bioRxiv, eNeuro, engrXiv, ePlasty, e?Prints?, eVolo, hprints, mAbs, mBio, mSphere, mSystems. Headbomb {t · c · p · b} 13:37, 25 August 2018 (UTC)

https://github.com/ms609/citation-bot/pull/699 AManWithNoPlan (talk) 18:27, 1 September 2018 (UTC)
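
The "{lowercase}{Uppercase}bar" suggestion above can be illustrated with a small Python sketch. It is not taken from the linked pull request; the function name `capitalize_journal` and the pattern are assumptions for the example. Names that are entirely lowercase, such as hprints, do not fit the pattern and would still need an explicit exception list.

```python
import re

# Lowercase prefix, optional hyphen, then a capital: eLife, e-Journal, bioRxiv, mBio.
LOWER_THEN_UPPER = re.compile(r"^[a-z]+-?[A-Z]")


def capitalize_journal(name: str) -> str:
    """Upper-case the first letter unless the name is styled like eLife or bioRxiv."""
    if not name or LOWER_THEN_UPPER.match(name):
        return name
    return name[0].upper() + name[1:]
```

Here "eLife" and "bioRxiv" are left untouched, whereas "nature" becomes "Nature".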

Arrows and not always quotes

Status
{{fixed}}
Reported by
(tJosve05a (c) 13:52, 27 August 2018 (UTC)
What happens
The bot replaces |title=The Serving Soldier » Home with |title=The Serving Soldier " Home
What should happen
Nothing
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=JISC_Digitisation_Programme&diff=prev&oldid=856780234
https://en.wikipedia.org/w/index.php?title=Rajesh_Khanna&diff=prev&oldid=856850915
We can't proceed until
Feedback from maintainers


Unless there is a pair of « », we should not assume these are quotation marks, they may in fact be arrows, as here. (tJosve05a (c) 13:52, 27 August 2018 (UTC)

You are correct, they seem to be often misused as arrows. « « «  » »  » AManWithNoPlan (talk) 21:11, 1 September 2018 (UTC)
https://github.com/ms609/citation-bot/pull/708 AManWithNoPlan (talk) 19:53, 2 September 2018 (UTC)
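
A minimal Python sketch of the rule agreed on here: treat « and » as quotation marks only when they appear as a pair, and otherwise leave them alone because they may be arrows. The function name `straighten_guillemets` is hypothetical, and restricting the conversion to exactly one ordered pair is a simplifying assumption of the example rather than a description of the linked fix.

```python
def straighten_guillemets(title: str) -> str:
    """Replace « » with straight quotes only when they form one ordered pair."""
    has_one_pair = title.count("«") == 1 and title.count("»") == 1
    if has_one_pair and title.index("«") < title.index("»"):
        return title.replace("«", '"').replace("»", '"')
    # A lone or reversed guillemet is probably an arrow or separator.
    return title
```

The title from the report, "The Serving Soldier » Home", passes through unchanged under this rule.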

Wrongly sets class in generic citation template

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 20:30, 3 September 2018 (UTC)
What happens
Sets |class= in {{citation}} with |journal= set
Relevant diffs/links
[7]
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/720 AManWithNoPlan (talk) 21:58, 3 September 2018 (UTC)

Errors detected in PMID search (SimpleXMLElement Object( [FieldNotFound] => 161:SPASCN )

Not sure if this is a bug or not, but it feels odd to see an error message, so wanted to confirm what it meant....

* Expand citation: ''Solanum perlongistylum'' and ''S. catilliflorum'', New Endemic Peruvian Species of Solanum, Section Basarthrum, Are Close Relatives of the Domesticated Pepino, ''S. muricatum''
 > Extracting information from SICI
 > Found and used SICI [..> rifydoi]
 > Checking that DOI 10.3417/1055-3177(2006)16[161:SPASCN]2.0.CO;2 is operational... DOI ok.
   . Initial authors exist, skipping authorlink in tidy
   . Initial authors exist, skipping authorlink in tidy
   . Initial authors exist, skipping authorlink in tidy
   . Initial authors exist, skipping authorlink in tidy
 > Checking AdsAbs database
   > AdsAbs search 4720/50000:
       doi:"10.3417/1055-3177(2006)16[161:SPASCN]2.0.CO;2"
   > AdsAbs search 4721/50000:
       pub:"Novon: A Journal for Botanical Nomenclature"
       year:2006
       issn:1055-3177
       volume:"16"
       page:"161–167" [..> indpmid]
 > Searching PubMed... 
 - Errors detected in PMID search (SimpleXMLElement Object
(
    [FieldNotFound] => 161:SPASCN
)
); abandoned. nothing found.

(tJosve05a (c) 14:33, 22 August 2018 (UTC)

pubmed does not support URLs with square brackets, except when they do🙄. If someone can figure out how to do this more reliably we are all ears. AManWithNoPlan (talk) 15:23, 22 August 2018 (UTC)
There is a reason square brackets got banned from DOIs. Pretty much killed SICI too. AManWithNoPlan (talk) 15:24, 22 August 2018 (UTC)
I have a trouble ticket submitted to pubmed about how to search for these evil DOIs AManWithNoPlan (talk) 02:20, 2 September 2018 (UTC)
{{wontfix}} sucks AManWithNoPlan (talk) 00:22, 5 September 2018 (UTC)
We will no longer attempt this search though: https://github.com/ms609/citation-bot/pull/724 AManWithNoPlan (talk) 03:48, 5 September 2018 (UTC)

Question to the maintainers

What is preferable? Filing bug reports and feature requests here, or on GitHub (as issues)? (tJosve05a (c) 14:13, 26 August 2018 (UTC)

Personally (not a maintainer), I prefer here, since I don't need to register on another site, we've got access to familiar wikitext, plus it's easier to link / browse / search / track issues, and we have watchlists. It's also where the bot summaries say to report bugs, and if you report something here it lets others know the issue was reported. If you report it on GitHub, then I additionally need to check GitHub to make sure I'm not filing a duplicate bug report. Headbomb {t · c · p · b} 15:28, 26 August 2018 (UTC)
I was just thinking that it is easier to connect code fixes (pulls) with issues, and search for issues on GitHub (and see which are fixed/to be fixed). And other free software coders may be able to find reported issues and help out. Plus all ’contributors’ to an issue get notified when updates to their reported issue are made. I’m not promoting the usage of one or the other, just asking if one was preferred or not from a maintenance POV. (tJosve05a (c) 15:34, 26 August 2018 (UTC)
I prefer here only. Most people cannot post issues on GitHub. AManWithNoPlan (talk) 17:34, 26 August 2018 (UTC)
This page is definitely more accessible for bug reporters, which is our principal aim. But as a maintainer, I see a number of advantages to GitHub issues: firstly, I'm more likely to spot them in a timely fashion; secondly, they are integrated with GitHub edits, so it's less overhead to keep track of what has been fixed, and it's possible to link code edits to the issue that has motivated them; thirdly, it's much easier for me to see which issues would benefit from my attention (particularly on occasions when ClueBot III is down!). So I personally would encourage bug reporters who are comfortable doing so to report bugs on GitHub, so long as it doesn't cost them additional time or inconvenience – but certainly want anyone to feel welcome to submit bug reports in whichever format suits them best. This said, it is a rare thing for me to have much time to contribute to the bot's maintenance, so the preferences of AManWithNoPlan are more pertinent than my own! Martin (Smith609 – Talk) 08:51, 27 August 2018 (UTC)
emergency should go to both. AManWithNoPlan (talk) 22:13, 27 August 2018 (UTC)
{{notabug}} archive AManWithNoPlan (talk) 14:19, 5 September 2018 (UTC)

Vietnam War page fails

Status
new bug
Reported by
(tJosve05a (c) 20:04, 24 August 2018 (UTC)
What happens
I continually get 500 - Internal Server Error when running the bot on Vietnam_War (using https://tools.wmflabs.org/citations/doibot.php?edit=toolbar&slow=1&user=USERNAME&page=Vietnam_War). I've tried for two days now.
We can't proceed until
Feedback from maintainers


I think that it is too big. AManWithNoPlan (talk) 20:18, 24 August 2018 (UTC)

[Edit conflict] Was saying the same thing. I wonder what the easiest fix is here? Increase the server's timeout? Martin (Smith609 – Talk) 20:18, 24 August 2018 (UTC)
Either that, or perhaps process each section of the page separately or something (as a batch). (tJosve05a (c) 20:19, 24 August 2018 (UTC)
fractional pages is something one can do by hand. Annoying but doable. AManWithNoPlan (talk) 22:28, 26 August 2018 (UTC)
It is not a bug in citation bot. https://bugs.php.net/bug.php?id=45735 This is the line that seg faults in Page.php while(preg_match($regexp, $text, $match)) AManWithNoPlan (talk) 14:35, 5 September 2018 (UTC)
{{notabug}} Flagging and moving to GitHub for us to remember and think about. AManWithNoPlan (talk) 15:33, 5 September 2018 (UTC)

parses arxiv data incorrectly when page numbers are huge


https://github.com/ms609/citation-bot/pull/664 AManWithNoPlan (talk) 16:35, 28 August 2018 (UTC)

Maybe live, maybe not: [8] Headbomb {t · c · p · b} 00:45, 2 September 2018 (UTC)

Adds bibcode to cite arxiv (and also an extra eprint to cite arxiv)

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 19:45, 2 September 2018 (UTC)
Relevant diffs/links
[9]
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/716 https://github.com/ms609/citation-bot/pull/715 AManWithNoPlan (talk) 00:22, 3 September 2018 (UTC) https://github.com/ms609/citation-bot/pull/717 AManWithNoPlan (talk) 01:27, 3 September 2018 (UTC)

Duplicate journal name

Status
{{fixed}}
Reported by
wumbolo ^^^ 14:04, 28 August 2018 (UTC)
What happens
Journal's name is duplicated.
What should happen
If {{cite journal}}'s |journal= is added, any of its possible duplicates (like |work= and |website=) should be removed. That's a CS1 error, and the bot should not produce CS1 errors.
Relevant diffs/links
[10]
We can't proceed until
Feedback from maintainers


The decision to include the stupid generic |work= in the citation templates is a bane to bots everywhere. AManWithNoPlan (talk) 14:28, 28 August 2018 (UTC)

Is there a reason not to use |work= for all citation templates? It is a global/generic parameter which works for all citation templates, and |journal=, |website= etc. are just synonyms. (tJosve05a (c) 16:27, 28 August 2018 (UTC)
|work= is vague and unclear to most people. What's a work? It can be a book title, a conference proceeding titles, a journal title, a website, ... |journal= or |website= or whatever is clear and cannot be confused. Headbomb {t · c · p · b} 16:44, 28 August 2018 (UTC)
Exactly! It can be anything, and the user doesn't have to specify the specific type of work, that should be done with the template (such as {{cite journal}} or {{cite book}}). It's much easier to use the parameter |work=. It looks terrible when the bot changes "cite news [...] |work=BBC News" to |website=BBC News instead of changing it to {{cite news}} and keeping work. Now it creates more work for editors, both to change from |website= to |work= or |newspaper=, and to {{cite news}}, instead of just to {{cite news}} which still results in the same output. (tJosve05a (c) 17:08, 28 August 2018 (UTC)
Work is confusing AF for most people. If people are citing a journal, they go "this is the journal's name", not "this is the work's name". Recognizable and understandable parameter names are important. Maybe {{cite web}} should be excluded from renaming 'work', because all sorts of crap gets put in there, but everywhere else 'work' should be purged. Headbomb {t · c · p · b} 17:20, 28 August 2018 (UTC)
work is not the wrong choice all the time, but almost anytime you see it used, it is the wrong choice AManWithNoPlan (talk) 19:05, 29 August 2018 (UTC)

https://github.com/ms609/citation-bot/pull/719 AManWithNoPlan (talk) 03:30, 3 September 2018 (UTC)
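
The behaviour requested in the report (no duplicate periodical name once |journal= is added) might look roughly like the Python sketch below. It is an illustration only: the function name `add_journal` and the alias tuple are assumptions, and the choice to drop an alias only when it repeats the same name, leaving differing values for a human to review, is a deliberately conservative reading of the request.

```python
# Aliases named in the report above; not an exhaustive list of CS1 aliases.
PERIODICAL_ALIASES = ("work", "website")


def add_journal(params: dict, journal: str) -> dict:
    """Set 'journal' and drop aliases that merely repeat the same name."""
    fixed = dict(params)
    fixed["journal"] = journal
    wanted = journal.strip().casefold()
    for alias in PERIODICAL_ALIASES:
        value = fixed.get(alias)
        if value is not None and value.strip().casefold() == wanted:
            del fixed[alias]
    return fixed
```

A citation carrying work "Nature" that gains journal "Nature" ends up with the journal parameter only, which avoids the CS1 redundant-parameter error mentioned above.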

Changing "work" to "website"

In this edit, all the bot does is replace the {{cite web}} parameter work= with website=. The template's documentation says they are aliases. Even if the bot was also doing something useful, these changes clutter up the diff screen for a style that is not in any way preferable. Why is the bot changing these? Bilorv(c)(talk) 16:51, 5 September 2018 (UTC)

Yes, agree completely! Have reported it as a bug... the "work" parameters should be left untouched. —Joeyconnick (talk) 20:19, 5 September 2018 (UTC)
flagging as {{fixed}} to archive discussion since all the real action is in the bug report AManWithNoPlan (talk) 21:25, 12 September 2018 (UTC)

Use on non-Wikipedia wikis?

Is it possible to use this on MediaWiki installations that are not part of wikipedia.org? — Omegatron (talk) 01:49, 7 September 2018 (UTC)

Theoretically Yes, but you would have to run it yourself and remove the en.wikipedia.org stuff and deal with authentication etc. AManWithNoPlan (talk) 16:29, 9 September 2018 (UTC)
I proposed some changes to aid in this. Still a long way off. https://github.com/ms609/citation-bot/pull/743 AManWithNoPlan (talk) 21:21, 9 September 2018 (UTC)
we have done what we easily can for now. come back again later if still interested when we have fewer bugs to deal with {{wontfix}} AManWithNoPlan (talk) 21:26, 12 September 2018 (UTC)

API: Category refinements

Much, much better. However, it could still be a bit better: When you start, you have

*** Processing page '{2018 FFA Cup preliminary rounds}' : 12:13:01
--------------------------------------------------------------------------
[12:13:02] Processing page '[[2018 FFA Cup preliminary rounds]]' — [[edit]]—[[history]]

This should be simplified to

--------------------------------------------------------------------------
[12:13:02] Processing page '[[2018 FFA Cup preliminary rounds]]' – [[edit]] – [[history]]

This eliminates redundancy and, the spaces help + use endashes. When no changes are required, you have

# No changes required.
 # # #

This should be simplified to

# No changes required.

When you have a change, you have

 # Writing to Peace Pledge Union... 
 Written to [[Peace Pledge Union]]{{POV|date=October 2015}}
The '''Peace Pledge Union (PPU)''' is a British [[pacifist]] ...

This should be simplified to (with a line break after "Writing to Peace Pledge Union...")

# Writing to [[Peace Pledge Union]]...
{{POV|date=October 2015}}
The '''Peace Pledge Union (PPU)''' is a British [[pacifist]] ...

And when you end with

[[history]] / [[last edit]]

This could be much simpler/clearer with

[[diff]]

Headbomb {t · c · p · b} 12:16, 21 August 2018 (UTC)

{{wontfix}} not going to pollute code with all sorts of "if ( running a category) then " code. AManWithNoPlan (talk) 21:56, 12 September 2018 (UTC)

Add JSTOR links

Not sure if there is a reason why this isn't done yet, but would it be possible to add JSTOR links in cases where this isn't already added? For example, source 12 in Brachiosaurus has a doi, but I know it is also on JSTOR[11], so shouldn't the bot be able to cross check? FunkMonk (talk) 00:22, 14 September 2018 (UTC)

{{wontfix}} Nope. Not searchable. JSTOR disabled that years ago. AManWithNoPlan (talk)

dont be this guy https://en.wikipedia.org/wiki/Aaron_Swartz AManWithNoPlan (talk) 01:18, 14 September 2018 (UTC)
jstor -> data <-> doi. there is no jstor to doi mapping. AManWithNoPlan (talk) 02:13, 14 September 2018 (UTC)
@AManWithNoPlan: Not true. https://www.jstor.org/openurl?doi=10.2307/455826 > https://www.jstor.org/stable/455826. Some doi mapping exists, just does not work for dois with brackets etc. in them from what I can tell... (tJosve05a (c) 07:41, 14 September 2018 (UTC)
This is a jstor-specific doi, which is mapped JSTOR --> DOI. You still can't query the JSTOR database by DOI, and get a JSTOR match. Headbomb {t · c · p · b} 12:06, 14 September 2018 (UTC)
you do not need to query a database to map that doi. That's jstors doi prefix. AManWithNoPlan (talk) 13:11, 14 September 2018 (UTC)
wrong again. when we plug your doi into that url we find nothing. 

https://www.jstor.org/openurl?doi=10.1671/0272-4634(2003)023[0344:teovpi]2.0.co;2. AManWithNoPlan (talk) 13:19, 14 September 2018 (UTC)

We might be able to find a few things, but we need an ISSN. AManWithNoPlan (talk) 13:24, 14 September 2018 (UTC)

https://support.jstor.org/hc/en-us/articles/115005079047-JSTOR-OpenURL-Linking-
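
The point made in this thread, that a DOI under JSTOR's own prefix maps to a stable ID without any database query while other DOIs do not, can be sketched in Python as follows. The function names are hypothetical. The `10.2307/` prefix and the openurl address pattern are taken from the examples quoted above; whether the openurl endpoint resolves any particular DOI is not something this sketch can promise, and the thread reports that it fails for the bracketed DOI.

```python
from typing import Optional
from urllib.parse import quote

# JSTOR's own DOI prefix, as seen in doi:10.2307/3363372 and JSTOR 3363372 above.
JSTOR_DOI_PREFIX = "10.2307/"


def jstor_id_from_doi(doi: str) -> Optional[str]:
    """Return the JSTOR stable ID for a JSTOR-prefixed DOI, else None."""
    if doi.startswith(JSTOR_DOI_PREFIX):
        suffix = doi[len(JSTOR_DOI_PREFIX):]
        return suffix or None
    return None


def jstor_openurl(doi: str) -> str:
    """Build an openurl lookup address of the form quoted in the thread."""
    return "https://www.jstor.org/openurl?doi=" + quote(doi, safe="/")
```

For the Brachiosaurus example, `jstor_id_from_doi` returns None because the DOI belongs to another publisher's prefix, which is why a lookup service (and, per the last comment, an ISSN) would be needed.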

tiny font bug

{{Fixed}}

FYI, I noticed text-weight: bold, which is not valid CSS. The property is font-weight. --Izno (talk) 22:34, 9 September 2018 (UTC)

https://github.com/ms609/citation-bot/pull/750 AManWithNoPlan (talk)

Misuse of Format

In that same edit: |format=Full text, |format=Accepted manuscript, |format=Submitted manuscript are all inappropriate uses of |format= (not a 'new' parameter); see documentation.
Trappist the monk (talk) 09:41, 24 July 2018 (UTC)

diff The bot continues to add |format= with inappropriate values, in this case Full text. The purpose of the 'format' parameters is to identify for the reader the file format of the linked source, PDF, XLS, DOC, etc. (see the documentation).

This should make you happier. https://github.com/ms609/citation-bot/pull/513 AManWithNoPlan (talk) 14:18, 9 August 2018 (UTC)

But the just-as-inappropriate

case 'submittedVersion': $format = 'Submitted manuscript'; break;
case 'acceptedVersion': $format = 'Accepted manuscript'; break;

remain. Headbomb {t · c · p · b} 14:26, 9 August 2018 (UTC)

Sorry, it doesn't. None of |format=Full text, |format=Accepted manuscript, |format=Submitted manuscript are appropriate in |format= ever. Read the template documentation. The only thing that belongs in |format= is the electronic file format: PDF, XLS, DOC, MP3, etc.
Trappist the monk (talk) 14:30, 9 August 2018 (UTC)
I said happier, not happy. Make a pull request to comment out the other two and discuss with the maintainer. My change was a no brainer. AManWithNoPlan (talk) 18:06, 9 August 2018 (UTC)
Here's a request that someone pulls this. Headbomb {t · c · p · b} 21:39, 9 August 2018 (UTC)
Perhaps a question for the template page rather than here: but how, if not through 'format', ought a link to a pre-print be indicated? If my institution has access to a full text, I want to be using the DOI link rather than scrubbing through an unformatted preprint, but if the title links to a formatted PDF, I'd rather click that and avoid navigating a paywall. So I think it's worth indicating the destination of the URL. Martin (Smith609 – Talk) 07:26, 11 August 2018 (UTC)
We should probably have a |preprint-url= that appends "preprint" at the end of the template. Headbomb {t · c · p · b} 13:50, 11 August 2018 (UTC)
To be honest, I have no idea why adding (PDF) to a link is useful. Only would make sense to me in the case of (Proprietary CAD program file) AManWithNoPlan (talk) 15:23, 11 August 2018 (UTC)

Unless it is a file format, nothing should be added by the bot in |format=. Think what you want about the existence of such a parameter, but don't misuse it. (tJosve05a (c) 20:29, 24 August 2018 (UTC)

https://github.com/ms609/citation-bot/pull/771 by Josve05a
https://github.com/ms609/citation-bot/pull/780 also fixes old esots AManWithNoPlan (talk) 20:53, 14 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:04, 20 September 2018 (UTC)
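
The outcome of this section (only electronic file formats belong in |format=) can be expressed as a small allow-list check. The Python below is a sketch under assumptions: the function name `format_value_or_none` is hypothetical and the set contains only the formats named in the thread, not a complete list.

```python
# File formats named in the discussion above; a real list would be longer.
FILE_FORMATS = {"PDF", "XLS", "DOC", "MP3"}


def format_value_or_none(candidate: str):
    """Return a normalised file format, or None if the value is not a file format."""
    value = candidate.strip().upper()
    return value if value in FILE_FORMATS else None
```

Values such as "Full text" or "Accepted manuscript" yield None, so a caller following this rule would simply not set |format= for them.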

Cite web for Google Books

Should perhaps have been fixed with https://github.com/ms609/citation-bot/pull/652 but in https://en.wikipedia.org/w/index.php?title=Vallejo_%28ferry%29&diff=prev&oldid=856647186 the bot converted raw Google Book URLs to {{cite web}}. (Is this just because I'm using the gadget tool now and not the user script, and that pull fixes are delayed, or is this a new error?) (tJosve05a (c) 18:17, 26 August 2018 (UTC)

All tools use the same code base (unless you specify citations-dev in the URL). AManWithNoPlan (talk) 18:27, 26 August 2018 (UTC)
The Bot does not think that it’s a book because there is no evidence that it is a book: isbn etc. AManWithNoPlan (talk) 20:54, 27 August 2018 (UTC)
Except you know... being on Google Books. Headbomb {t · c · p · b} 01:44, 28 August 2018 (UTC)
All things on Google Books are books. --Izno (talk) 01:56, 28 August 2018 (UTC)
Didn't say it was a guarantee. Not sure what's best in those cases. Cite journal/magazine would be ideal, obviously, but failing that should we have a cite web or cite book. Cite web is really the shittiest of templates for citations, a sort of 'when all else fails' type of thing. Question is it is better to have magazines as books, or books as websites, which of the two are more widespread? Headbomb {t · c · p · b} 02:29, 28 August 2018 (UTC)
I'd rather use {{citation}} for when the bot does not know if a Google Books URL (and only those for now) is a book or other kind of media, since it isn't the webpage itself you reference, but the media it describes. Citoid uses {{citation}} widespread in cases such as this. (tJosve05a (c) 15:56, 3 September 2018 (UTC) (tJosve05a (c) 15:55, 3 September 2018 (UTC)

{{wontfix}} for now. We have some improvements coming. AManWithNoPlan (talk) 17:08, 20 September 2018 (UTC)

I disagree with the Consensus that drives the bot's actions

Women's liberation movement in North America Don't even know where to start. Changing isbn numbers from those given in source viewed, changing publishing information or deleting publisher, removing publishing location, all distort and are incompatible with accuracy of citation. As a historian, this bot failed to improve any of the citations and distorted the accuracy of information about source and material. Improving citations is always welcome, but deleting information which identifies sourcing accurately is not worthwhile. SusunW (talk) 04:16, 8 September 2018 (UTC)

Exactly same comments apply to Women's liberation movement. Inaccurate data, or changes to data, which dilute sourcing is unacceptable. SusunW (talk) 04:22, 8 September 2018 (UTC)
Actually, those are all in line. Citing journals never include location or publisher information in any style guide etc..., and ISBN 13 are prefered over ISBN 10. Headbomb {t · c · p · b} 11:26, 8 September 2018 (UTC)
The second biggest problem with those articles is all the google books links that need deleted. The Reference section reads like a paid shill for google wrote it. The biggest problem is the use of session specific urls that lead to nothing-not even google previews. AManWithNoPlan (talk) 13:28, 8 September 2018 (UTC)
AManWithNoPlan, I have never and will never accept pay for writing an article for Wikipedia. Your comment is unfounded and way off base, as a group of editors, none of whom received any pay, wrote the article(s) using sources with links that were available to them in their areas. As these multiple editors live in locations around the globe, they may well have access to sources you are unable to access. That neither makes the links invalid nor advertisements for google. I am uninterested in discussing your edits further, as your manner is very aggressive and accusatory. Headbomb as a historian, where you got it includes the publisher information and location to facilitate others in finding the material. As someone who does not live in the global north, it is often impossible to find a source shown on a web search without knowing the publisher/location. Going to the publisher's website, one can oft times locate the article of interest or ask for it to be provided. If the source gives 10 digits, it is a revision to make it 13; however, that is a minor issue. SusunW (talk) 16:05, 9 September 2018 (UTC)
The discussion would be facilitated if you provided diffs of specific edits as examples. Otherwise we may end up with everyone looking at different parts of the elephant. ♦ J. Johnson (JJ) (talk) 22:12, 8 September 2018 (UTC)
I never accused you of being a paid shill. I said it read like that (believe me, I have over google books articles and be chastised for it). If you are referencing specific pages in a book, then please list page numbers and link to those pages in google. If you are not linking to specific free pages then you should not include the google links and should let isbn be the link out. As for 13 vs 10 on ISBN, Wikipedia style guides say they should be converted; it is equivalent to adding 1 to a USA phone number to specify country code - no real change. AManWithNoPlan (talk) 16:17, 9 September 2018 (UTC)
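
To make the ISBN point concrete, here is the standard ISBN-10 to ISBN-13 conversion as a short Python sketch (the function name is hypothetical and this is not the bot's code). The nine identifying digits are kept, the prefix 978 is added, and only the check digit is recomputed, which is why the conversion does not change which book is identified.

```python
def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to its ISBN-13 form (978 prefix, new check digit)."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10 or not digits[:9].isdigit():
        raise ValueError("not an ISBN-10: " + isbn10)
    # Drop the old check digit (which may be 'X') and prepend the 978 prefix.
    core = "978" + digits[:9]
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ...
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)
```

For example, `isbn10_to_isbn13("0-306-40615-2")` gives "9780306406157". The sketch does not validate the original ISBN-10 check digit.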
ISSN seems to be the Wikipedia approved method instead of publisher + location. This bot automatically removes issn once a doi is added. AManWithNoPlan (talk) 16:20, 9 September 2018 (UTC)
"Seems to be"? If there is no explicit basis for that then it becomes an unsanctioned and questionable alteration. And I would object to replacing an explicit identifier of a periodical with an identifier where the publisher might adopt a naming scheme for articles that does not expressly identify the periodical. Or does so in their own peculiar way, so that across a range of DOIs the only general way of identifying a periodical is to look at the record for the article. And the ISSN is no longer available if a specific DOI becomes unavailable. I don't know that ISSNs should be required, or even routinely added, but where an editor sees fit to add one it should not be removed. ♦ J. Johnson (JJ) (talk) 21:56, 9 September 2018 (UTC)
my mistake. I only remove ISSN if it adds the doi. AManWithNoPlan (talk) 22:30, 9 September 2018 (UTC)
You missed my point: I'm saying don't remove an ISSN even if you add a DOI. ♦ J. Johnson (JJ) (talk) 21:26, 10 September 2018 (UTC)
Thank you J. Johnson, you seem to grasp the problem. As I stated, I do not live in the global north. What someone in the US/UK, etc may have access to, does not mean that others can access it. Tying a citation to one instance of a document, like the DOI, limits accessibility. There are very often multiple access points to a single document. Giving broader information on where the source can be found improves the ability of both writers and readers to obtain source materials. If I am writing an anchor article, such as WLM, removing links to citations or altering them to links that I may not be able to access makes it far more difficult to write the biographies of the redlinked people in the article. While citing a DOI may allow someone who cannot access the link I used to access the source, it is also likely that it won't, leaving them searching for another point of access. I get that technicians and writers/researchers don't speak the same language, but we can always try to understand each other if our goal is improvement of the encyclopedia. SusunW (talk) 02:51, 10 September 2018 (UTC)
For journals, including the publisher and location is considered over-linking and thus incorrect. This is especially true for things such as J Chem Phys or J Phys Chem, which are super easy to find online. For obscure journals (which seem to be very common in the articles you are discussing; seriously, they are hard to find, if they can be found at all), it might be best to include some extra contact information in |id=, since en.wikipedia.org has decided to remove publisher and location for all journals. AManWithNoPlan (talk) 03:35, 10 September 2018 (UTC)
The consensus, I believe, is over a decade old. I was able to find proof a while back that the function was not new 7 years ago. AManWithNoPlan (talk) 03:56, 10 September 2018 (UTC)
@SusunW: and as a long-time editor and person interested in Scientometrics, academia, technical writing and library science, I can tell you that absolutely no one ever includes journal locations and publishers when citing journals. No style guide out there recommends doing so, and for good reason: The information is pointless and doesn't help anyone locate anything. Whether it's me in Canada, or someone in Djibouti, no one will look up e.g. Signs and ever need the information that it's published in New York or Syracuse or Philadelphia or London or Milan or Chicago to read the articles or access them. Likewise, that Journal of Physics is published by the IOP is not information anyone needs to care about when accessing those articles. If you're at a library, journals aren't catalogued by publisher. If you're online, you use DOIs and websites. The only time the location or publisher might ever be useful is if you have two or more journals named the same way, e.g. Open Medicine, and then you're better off disambiguating them via ISSN. Headbomb {t · c · p · b} 11:37, 10 September 2018 (UTC)
Headbomb for the last decade, I have lived in various places in Latin America and the Caribbean. In no place that I lived was there a public library, so going into a library and "looking it up" is an impossibility. Most publishings from this region are not listed in World Cat (or digitized), as possibly you have seen me post at Women in Red. When repeatedly I have sourcing from RS stating that various persons have published 200-300 books and articles and there are 0 entries in Google Scholar, Scopus, PubMed, World Cat, or any other compilation system, you understand that it is part of the skew of systemic bias toward the global north. While "absolutely no one" might use this information in your location, the publisher and location are the primary way that I find sources from journals. If I can find the publisher, I can often back into the ISSN, if there is one, or sometimes find an accessible link. With an ISSN I can determine if there is a library which holds the work and ask Megalibrarygirl to try to find it. If that doesn't work, I try to find a Wikipedian who lives in that area to photograph the source or I write to the publisher to see if they will copy and send it to me. Yes, it is a lot of work, but it is the reality in much of the world. Open knowledge platforms should help overcome the difficulties of research, not reinforce the problems, but we cannot do that if people refuse to recognize that the diversity in the world doesn't often make single solutions viable. SusunW (talk) 13:57, 10 September 2018 (UTC)
Again, that's what the DOI is for (and occasionally ISSNs when the journal isn't obvious). And if something isn't listed in Scopus/Google Scholar/whatever it's not knowing that something is published in Madrid vs Chicago vs Montreal vs Shanghai that will help you find it. [12] brings things closer to how citations should be presented (possibly with misfires when cite magazine / cite web should be used rather than cite journal), and that's a good thing. Headbomb {t · c · p · b} 15:38, 10 September 2018 (UTC)

The real solution is wiki linking the journal name and making an article about the journal. AManWithNoPlan (talk) 15:47, 10 September 2018 (UTC)

Have you considered that journals so obscure that the publisher's name and location would be useful are most likely non-notable? ♦ J. Johnson (JJ) (talk) 21:29, 10 September 2018 (UTC)
If that is true, then they might fail to meet wikipedia's requirement of verifiability and thus should not be used as reference (this is mostly a joke). Seriously, a single central article will allow people all over the world to find this journal. AManWithNoPlan (talk) 22:25, 10 September 2018 (UTC)
Not at all. You have confused WP:V with WP:N. Verifiability requires reliability, but neither is the same criterion as notability. The latter is about "significant attention by the world at large and over a period of time". So it is quite possible to have a journal (likely very specialized) that is well-known and highly respected in a narrow field of experts, and has published articles relevant to some topic, but which has not gained "significant attention by the world at large". WP:V does not require a WP:N source. ♦ J. Johnson (JJ) (talk) 00:14, 11 September 2018 (UTC)
I know the difference that’s why I said mostly joking. But, links from references to the journal would note notability. Just because a journal is published by Adair county library and ribs joint does not mean it is not notable. 90% of Wikipedia fails the notability test AManWithNoPlan (talk) 01:48, 11 September 2018 (UTC)

{{notabug}} AManWithNoPlan (talk) 14:04, 20 September 2018 (UTC)

Don't forget to add author dots

Status
Feature request
Reported by
Headbomb {t · c · p · b} 23:46, 25 August 2018 (UTC)
What happens
Adds |first=M. M
What should happen
Adds |first=M. M.
Relevant diffs/links
[13]
We can't proceed until
Feedback from maintainers


The meta data does not have them. That’s the problem. AManWithNoPlan (talk) 22:15, 27 August 2018 (UTC)

Bot logic in those cases: If in any of |first#= in a citation you find the pattern (^| )[A-z]\., replace (^| )([A-z])( |$) with $1$2. in all other |first#= found in the citation.
Could be retroactive on existing uses of |first#= too. It's a really really widespread problem. Headbomb {t · c · p · b} 01:49, 28 August 2018 (UTC)
Harry S Truman would not approve, but we can fix this. AManWithNoPlan (talk) 16:25, 28 August 2018 (UTC)
https://github.com/ms609/citation-bot/pull/760 AManWithNoPlan (talk) 03:10, 12 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:05, 20 September 2018 (UTC)
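
For readers following the proposed rule, here is a minimal Python sketch of it. It is not the code from the pull request (the bot itself is written in PHP); the function name add_author_dots and the list-of-strings input are assumptions made for illustration. Two deliberate departures from the pattern as posted: [A-Za-z] is used because [A-z] also matches a few punctuation characters that sit between "Z" and "a" in ASCII, and a lookahead replaces the trailing ( |$) group so that the space after an initial is not swallowed.

```python
import re

# A first name that already contains a dotted initial, e.g. "M." or "J. R."
HAS_DOTTED_INITIAL = re.compile(r"(^| )[A-Za-z]\.")
# A single bare letter standing alone as a word, e.g. the second "M" in "M. M"
BARE_INITIAL = re.compile(r"(^| )([A-Za-z])(?= |$)")


def add_author_dots(first_names):
    """Add missing dots to initials, but only if the citation already
    uses dotted initials somewhere in its |first#= values."""
    if not any(HAS_DOTTED_INITIAL.search(name) for name in first_names):
        # No evidence of the dotted style in this citation: leave it alone.
        return list(first_names)
    return [BARE_INITIAL.sub(r"\1\2.", name) for name in first_names]
```

As the Truman remark above suggests, a rule like this cannot tell a genuine one-letter name from an undotted initial, so it is only applied when the citation already shows the dotted style.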

Edit summary when expanding raw url

In https://en.wikipedia.org/w/index.php?title=Khanate_of_Kazan&diff=prev&oldid=856050147 the bot expands from a URL to a {{cite journal}}, which is amazing! However, the bot should mention this in the edit summary somehow. (tJosve05a (c) 14:38, 22 August 2018 (UTC)

report to user https://github.com/ms609/citation-bot/pull/764 AManWithNoPlan (talk) 22:23, 12 September 2018 (UTC)
report to edit summary https://github.com/ms609/citation-bot/pull/765 AManWithNoPlan (talk) 22:23, 12 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:07, 20 September 2018 (UTC)

When running from a category, mention the category

this edit was triggered via

  • https://tools.wmflabs.org/citations/category.php?edit=toolbar&slow=1&user=Headbomb&cat=Livestock%20stubs

With the edit summary

This should instead be

https://github.com/ms609/citation-bot/pull/763 AManWithNoPlan (talk) 21:37, 12 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:09, 20 September 2018 (UTC)

API: Make output more user friendly

When you run the page, you are presented with

Follow Citation bot’s progress below.
More details | Bot’s recent edits | Report bugs | Source code

Activated by Headbomb.

> Expanding 'Ununennium'; will commit edits.
Reading authentication tokens from tools.wmflabs.org.

[00:35:06] Processing page 'Ununennium' — edit—history
...

This would be much clearer/less intimidating if it was something like

Follow Citation bot’s progress below.
How to Use / Tips and Tricks | Bot’s recent edits | Report bugs | Source code

Citation bot activated by Headbomb. The bot will automatically make edit(s) if it can.
>Bot logging on tools.wmflabs.org.

[00:35:06] Processing page 'Ununennium' — edit—history
...

Headbomb {t · c · p · b} 00:55, 30 August 2018 (UTC)

https://github.com/ms609/citation-bot/pull/772 AManWithNoPlan (talk) 23:25, 13 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:09, 20 September 2018 (UTC)

OCLC url → OCLC parameter

Status
new bug
Reported by
Headbomb {t · c · p · b} 19:02, 28 August 2018 (UTC)
What should happen
|url=https://www.worldcat.org/oclc/873805659 → |oclc=873805659
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/741 AManWithNoPlan (talk) 03:02, 9 September 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 14:10, 20 September 2018 (UTC)
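
A rough Python sketch of the requested conversion, for illustration only. The helper name oclc_from_url and the exact URL pattern are assumptions; the real fix is in the pull request linked above and may recognise more URL shapes.

```python
import re

# Matches plain WorldCat record URLs such as https://www.worldcat.org/oclc/873805659
OCLC_URL = re.compile(r"^https?://(?:www\.)?worldcat\.org/oclc/(\d+)/?$")


def oclc_from_url(url):
    """Return the OCLC number embedded in a WorldCat URL, or None if the
    URL is not a simple WorldCat record link."""
    match = OCLC_URL.match(url.strip())
    return match.group(1) if match else None
```

A caller would then set |oclc= to the returned number and drop the |url= only when the function returns a value.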

Running the bot again results in new changes being made

Running the bot multiple times after each edit on the same page results in new edits being made. All possible edits should be done before saving the article.

(tJosve05a (c) 21:33, 26 August 2018 (UTC)

(In edit 2 the bot performed an edit which is a bug, reported above as User_talk:Citation_bot#Adds_year_even_if_date_is_there_after_getting_arxiv_data.) (tJosve05a (c) 21:35, 26 August 2018 (UTC)
There was a major refactoring of the code in the last few days. That must be why. The code is more efficient now. It used to check things again and again and again. AManWithNoPlan (talk) 02:35, 27 August 2018 (UTC)
https://github.com/ms609/citation-bot/pull/749 AManWithNoPlan (talk) 21:18, 9 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:11, 20 September 2018 (UTC)

Better arxiv url recognition

Status
new bug
Reported by
Headbomb {t · c · p · b} 00:40, 2 September 2018 (UTC)
What happens
Fails to expand <ref>https://arxiv.org/ftp/arxiv/papers/1312/1312.7288.pdf</ref>
What should happen
Recognize this as equivalent to https://arxiv.org/abs/1312.7288
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/711 AManWithNoPlan (talk) 02:54, 2 September 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 14:11, 20 September 2018 (UTC)
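
To illustrate the equivalence being asked for, a small Python sketch follows. It only handles the new-style identifier layout seen in the report (YYMM.NNNNN under /ftp/arxiv/papers/YYMM/); the function name and the regular expression are assumptions, not the bot's actual code.

```python
import re

# e.g. https://arxiv.org/ftp/arxiv/papers/1312/1312.7288.pdf
ARXIV_FTP_URL = re.compile(
    r"^https?://arxiv\.org/ftp/arxiv/papers/(\d{4})/(\d{4}\.\d{4,5})(?:v\d+)?\.pdf$"
)


def arxiv_id_from_ftp_url(url):
    """Extract the arXiv identifier from an /ftp/ PDF link, or return None."""
    match = ARXIV_FTP_URL.match(url)
    # The directory (YYMM) must agree with the first half of the identifier.
    if match and match.group(2).startswith(match.group(1) + "."):
        return match.group(2)
    return None
```

The returned identifier can then be treated exactly like one taken from an https://arxiv.org/abs/ link.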

Bot can't decide which dash to use

Status
new bug
Reported by
Nessie (talk) 01:52, 2 September 2018 (UTC)
What happens
bot makes a mistake with dashes when adding pages. Ran bot once, added pages, second time it had to fix what it did the first time. Seems like this could be streamlined
What should happen
use the right dash the first time
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Phytophthora_megakarya&type=revision&diff=857629281&oldid=857628450
Replication instructions
run bot on article with only one page listed, then scan again
We can't proceed until
Feedback from maintainers


I ran into the same issue. Sometimes I run the bot through an article twice because it appears in multiple reference cleanup required sections and I notice that the bot would add a page number with regular hyphen (-), then clean it up later with an en dash(–). Examples are [14][15] and [16][17]. If the intention of the bot is to have en dashes for page numbers, maybe it could do that when adding it so it does not have to make the subsequent edit again. -- AquaDTRS (talk) 20:07, 6 September 2018 (UTC)

https://github.com/ms609/citation-bot/pull/747 AManWithNoPlan (talk) 21:01, 9 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:12, 20 September 2018 (UTC)
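
The one-pass behaviour requested here amounts to normalising the page range at the moment it is added. A minimal Python sketch, assuming a hypothetical helper called normalize_page_range; it is not taken from the pull request, and it would wrongly alter hyphenated single-page identifiers, which a real implementation has to guard against.

```python
import re

EN_DASH = "\u2013"


def normalize_page_range(pages):
    """Replace a hyphen between two digits with an en dash, so that a
    freshly added page range needs no clean-up on a second run."""
    return re.sub(r"(?<=\d)\s*-\s*(?=\d)", EN_DASH, pages)
```

Running the same function on an already-normalised value changes nothing, which is what makes a second bot run a no-op.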

remove website=Google for books

Status
new bug
Reported by
(tJosve05a (c) 22:22, 2 September 2018 (UTC)
What happens
The bot converted a {{cite web}} with |work=Google.com to a {{cite book}} with |website=Google.com
What should happen
The bot should remove |website=[Gg]oogle.com from Google Books
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Charles_F._Hermann&diff=prev&oldid=857769452
We can't proceed until
Feedback from maintainers


Which is better via= or delete? AManWithNoPlan (talk) 22:32, 2 September 2018 (UTC)

Delete, imo. (tJosve05a (c) 22:35, 2 September 2018 (UTC)
https://github.com/ms609/citation-bot/pull/773 AManWithNoPlan (talk) 23:32, 13 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:12, 20 September 2018 (UTC)
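
A short Python sketch of the "delete" option agreed above. The dictionary of template parameters and the function name are hypothetical stand-ins for whatever structure the bot uses internally.

```python
import re

# Same pattern as in the report: Google.com or google.com
GOOGLE_SITE = re.compile(r"^[Gg]oogle\.com$")


def strip_google_website(params):
    """Return a copy of the citation parameters without a |website= or
    |work= whose value is just Google.com."""
    return {
        name: value
        for name, value in params.items()
        if not (name in ("website", "work") and GOOGLE_SITE.match(value.strip()))
    }
```

This is meant to be applied only after the citation has been identified as a Google Books link and converted to {{cite book}}.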

In CS1 templates (i.e. all but Template:Citation), remove postscript = .

Status
new bug
Reported by
Headbomb {t · c · p · b} 00:21, 5 September 2018 (UTC)
What should happen
[18]
Relevant diffs/links
It does literally nothing. Compare
with |postscript=.
without |postscript=.
We can't proceed until
Feedback from maintainers


What are your thoughts on removing empty |postscript= from {{citation}} as well, since it does nothing? AManWithNoPlan (talk) 21:35, 8 September 2018 (UTC)

with |postscript=
without |postscript=
Code in progress https://github.com/ms609/citation-bot/pull/740 AManWithNoPlan (talk) 21:47, 8 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:12, 20 September 2018 (UTC)

Try and fix broken dois

In case a DOI does not resolve (i.e. is broken/inactive), check if the DOI has more than one forward-slash. If it does, remove the second and all content after it. Real example: 10.1111/ruso.12119/full to 10.1111/ruso.12119. If it resolves and gives matching metadata, replace the |doi= field. (tJosve05a (c) 01:49, 6 September 2018 (UTC)

Alternatively, find and match snippets such as /full and remove them, if at the end of a broken DOI. (tJosve05a (c) 01:50, 6 September 2018 (UTC)
https://github.com/ms609/citation-bot/pull/737 AManWithNoPlan (talk) 21:33, 8 September 2018 (UTC)
Same for /meta and /abstract. Headbomb {t · c · p · b} 13:41, 18 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:13, 20 September 2018 (UTC)
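
The suffix-trimming idea from this thread can be sketched in Python as below. The resolves argument is a hypothetical callable supplied by the caller (for example, a CrossRef lookup); no real network API is assumed here, and the metadata comparison mentioned in the request is left out for brevity.

```python
# Trailing URL fragments that sometimes get pasted into |doi=, per this thread.
TRAILING_JUNK = ("/full", "/meta", "/abstract")


def repair_doi(doi, resolves):
    """Return a working DOI if trimming a known trailing snippet fixes it;
    otherwise return the DOI unchanged."""
    if resolves(doi):
        return doi  # Nothing to repair.
    for suffix in TRAILING_JUNK:
        if doi.lower().endswith(suffix):
            trimmed = doi[: -len(suffix)]
            if resolves(trimmed):
                return trimmed
    return doi
```

Trimming only when the original fails and the shortened form resolves keeps the change from touching DOIs that legitimately end in one of these strings.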

Don't remove PDF URLs simply because it has a DOI in its path

Should we really remove |url= simply because the URL has a known doi in it? I think if the URL is a PDF file, it is worth keeping since it is linking to the journal article directly (as open source). We don't always remove |url= when |doi= is present, only if that specific URL happens to have the DOI in its path. Either we should always delete the URL, or never in my own opinion, but if we should, we shouldn't do so when the URL is a PDF. (tJosve05a (c) 22:28, 27 August 2018 (UTC)

I'm not quite following the argument here; why is a PDF with a DOI in its URL any more likely to be open source than an HTML page with a DOI in its URL? Martin (Smith609 – Talk) 07:11, 28 August 2018 (UTC)
Sorry, my bad. Not open source - open access. If it is a PDF-link it most likely links directly to a freely available version, while the identifiers (such as DOI) might link to a paywall. (tJosve05a (c) 13:13, 28 August 2018 (UTC)
I'd be interested to see an example of a URL that contains a DOI where the DOI resolves to a paywall, but the URL leads to an open-access PDF. Martin (Smith609 – Talk) 06:00, 1 September 2018 (UTC)
I’m sure the Wikipedia:OABOT project has examples. (tJosve05a (c) 10:28, 1 September 2018 (UTC)

https://github.com/ms609/citation-bot/pull/704 AManWithNoPlan (talk) 01:38, 1 September 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 14:14, 20 September 2018 (UTC)

Converts cite book to cite journal erroneously

Status
new bug
Reported by
Headbomb {t · c · p · b} 02:19, 28 August 2018 (UTC)
What happens
Converts

to

Relevant diffs/links
[19]
We can't proceed until
Feedback from maintainers


Gotta love bad meta data. The bibcode has a journal parameter. AManWithNoPlan (talk) 02:46, 28 August 2018 (UTC)

ISBN should have higher precedence than journal, at least on ADSABS. Headbomb {t · c · p · b} 02:53, 28 August 2018 (UTC)
I've seen lots of edits in the last few days where the bot has mangled book references due to thinking they were journal references. The bot should be shut down until this is fixed.--Srleffler (talk) 01:43, 7 September 2018 (UTC)
Can you point to pages people edited with the bot's help where things went wrong? It will help us figure out a heuristic to decide if metadata is bad. I should note that all bot edits are human initiated. AManWithNoPlan (talk) 01:50, 7 September 2018 (UTC)
  • This edit had a bunch of problems, some of which looked like the bot mistaking books for journal articles. Check my subsequent edits for what I reversed.
  • This edit. The bot converted a cite web to a cite journal. The linked document was a book. (It was also the wrong URL, but the correct reference was not a journal article either.)
I think I saw several other bad edits by the bot in the last few days, but I can't find the others right now.--Srleffler (talk) 02:27, 7 September 2018 (UTC)
The citeseerx one is GIGO, so hard to fix. We made it better, just not perfect. AManWithNoPlan (talk) 02:57, 7 September 2018 (UTC)
Thank you for the examples from the bibcode database. The books all look like: 2003hoe..book.....K and such. That's really easy to notice. AManWithNoPlan (talk) 02:57, 7 September 2018 (UTC)
https://github.com/ms609/citation-bot/pull/736 AManWithNoPlan (talk) 03:14, 7 September 2018 (UTC)
Thanks for the quick fix!--Srleffler (talk) 04:45, 7 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 15:19, 20 September 2018 (UTC)
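
The "really easy to notice" pattern can be expressed as a one-line check. This Python sketch assumes the standard 19-character bibcode layout (4-character year, 5-character publication abbreviation, 4-character volume field, and so on) and a hypothetical function name; it illustrates the heuristic and is not the code in the pull request.

```python
def bibcode_looks_like_book(bibcode):
    """True when the volume field of a 19-character bibcode reads 'book',
    as in 2003hoe..book.....K."""
    return len(bibcode) == 19 and bibcode[9:13] == "book"
```

When the check is true, journal-style metadata returned for that bibcode should not be allowed to turn a {{cite book}} into a {{cite journal}}.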

Possible GIGO or bug

  • In this edit the amendment to the citation between the <ref name="Lāmmerzahl"> … </ref> tags is mangled. Although the item cited isn't really an issue of a "journal", the problem here isn't with the bot's substitution of "journal" for "book", but with the choice of "Gyros" for the title of the supposed journal. The item cited is actually one of an aperiodically issued series of "Lecture Notes", and "Gyros" is simply the first word in the title of that issue. There seems to me to be no ideal way for any of the citation templates to cover this situation, but this seems to me to be among the best of a few reasonably acceptable alternatives.
David Wilson (talk · cont) 02:55, 11 September 2018 (UTC)
All but the Gyros GIGO {{fixed}} AManWithNoPlan (talk) 15:22, 20 September 2018 (UTC)
I verified that it is GIGO.
   [numFound] => 1
   [start] => 0
   [docs] => Array
       (
           [0] => stdClass Object
               (
                   [arxiv_class] => Array
                       (
                           [0] => gr-qc
                       )
                   [identifier] => Array
                       (
                           [0] => 2001gcit.conf..195H
                           [1] => 2001gr.qc.....3067H
                           [2] => 2001LNP...562..195H
                           [3] => 10.1007/3-540-40988-2_10
                           [4] => 2001gcit.conf..195H
                           [5] => gr-qc/0103067
                           [6] => 10.1007/3-540-40988-2_10
                           [7] => 2001gr.qc.....3067H
                       )
                   [year] => 2001
                   [page] => Array
                       (
                           [0] => 195
                       )
                   [bibcode] => 2001LNP...562..195H
                   [pubdate] => 2001-00-00
                   [author] => Array
                       (
                           [0] => Haugan, Mark P.
                           [1] => Lämmerzahl, C.
                       )
                   [volume] => 562
                   [doi] => Array
                       (
                           [0] => 10.1007/3-540-40988-2_10
                       )
                   [pub] => Gyros, Clocks, Interferometers ...: Testing Relativistic Gravity in Space
                   [doctype] => inbook
                   [title] => Array
                       (
                           [0] => Principles of Equivalence: Their Role in Gravitation Physics and Experiments That Test Them
                       )
               )
       )

bot is converting "work" parameters to "website" in Cite web

Status
new bug
Reported by
Joeyconnick (talk) 20:17, 5 September 2018 (UTC)
What happens
converts several entries from "work" to "website";
What should happen
bot should leave "work" parameters alone
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=The_End_of_the_F***ing_World&diff=858184130&oldid=857995508&diffmode=source
We can't proceed until
Feedback from maintainers


In {{cite web}}, |work= is an alias for |website=, which is the template's native parameter. Might not be ideal for many things that should actually be {{cite news}}. AManWithNoPlan (talk) 20:22, 5 September 2018 (UTC)

Since it is an alias, then it is making edits for the sake of making edits, which as I understand it is relatively undesirable. Also, most "news" citations these days are from online sources, so {{cite web}} is widely used for these types of citations. —Joeyconnick (talk) 21:21, 5 September 2018 (UTC)
I'm not sure I follow. Looking at the documentation I can't see anything that should be {{cite news}} rather than {{cite web}}, unless it's a news article that's not available online (which would produce an error with {{cite web}}). Bilorv(c)(talk) 01:51, 6 September 2018 (UTC)
Is anyone working on this issue? What's the process for getting it fixed? The bot just made this useless edit. Bilorv(c)(talk) 17:26, 12 September 2018 (UTC)
there have been bigger fish to fry. i will look into it. AManWithNoPlan (talk) 18:14, 12 September 2018 (UTC)
https://github.com/ms609/citation-bot/pull/762 AManWithNoPlan (talk) 18:24, 12 September 2018 (UTC)

@AManWithNoPlan:. This is very disruptive. Might suggest disabling the tool until you are able to find and fix the problem, particularly if this is not even a "biggest fish" bug. It should not be converting |work= -> |website= at the rate of 1,000s or 10's of thousands. -- GreenC 13:40, 16 September 2018 (UTC)

The fix is merged in; it just needs to be deployed. The reason it is a small fish is that it in no way affects what humans see on Wikipedia. AManWithNoPlan (talk) 16:30, 16 September 2018 (UTC)
Is there any plan to perform clean-up and revert the incorrect changes made by the bot? Keith D (talk) 20:30, 16 September 2018 (UTC)
There are no plans to generate a new series of edits that have no effect on what users see. In many cases website is the better choice, so a mass revert would be disruptive. Lastly, no one who is able to do it has volunteered. Also, the bot itself never makes edits that are not under the guidance of a human. AManWithNoPlan (talk) 23:36, 16 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:15, 20 September 2018 (UTC)

Follow-up on removal of URLs for broken DOIs

Status
new bug
Reported by
(tJosve05a (c) 01:46, 6 September 2018 (UTC)
What happens
The bot removes URL for a citation due to it having a |doi=. However, that doi is marked as broken/inactive.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Hobet_Coal_Mine&diff=prev&oldid=858273809
We can't proceed until
Feedback from maintainers


Follow-up from User_talk:Citation_bot/Archive_9#Broken_dois_and_removal_of_URLs, still not fixed. (tJosve05a (c) 01:46, 6 September 2018 (UTC)

Weird. Looks like we need to remove /full from DOIs too. AManWithNoPlan (talk) 02:35, 6 September 2018 (UTC)
Also, see #Try and fix broken dois. (tJosve05a (c) 02:36, 6 September 2018 (UTC)
we do some of that already, but that’s a good addition to the tools. AManWithNoPlan (talk) 02:38, 6 September 2018 (UTC)
https://github.com/ms609/citation-bot/pull/737 once rolled out this will remove bad stuff from new and existing DOIs. It will probably not remove the broken notice unless you run the bot again. AManWithNoPlan (talk) 22:41, 7 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 15:23, 20 September 2018 (UTC)

Caps: voor

Status
new bug
Reported by
Headbomb {t · c · p · b} 02:13, 6 September 2018 (UTC)
Relevant diffs/links
[20]
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/729 AManWithNoPlan (talk) 15:10, 6 September 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 14:15, 20 September 2018 (UTC)

Caps: till, av, och, för, mot, zum, non

Status
new bug
Reported by
(tJosve05a (c) 02:23, 6 September 2018 (UTC)
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=List_of_Pluteus_species&diff=prev&oldid=858265568
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/729 AManWithNoPlan (talk) 15:11, 6 September 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 14:15, 20 September 2018 (UTC)

Invalid dates caused by arXiv data containing page numbers that look like dates

Status
new bug
Reported by
SarekOfVulcan (talk) 13:32, 6 September 2018 (UTC)
What happens
Invalid dates
Relevant diffs/links
Aharonov–Bohm effect, last couple of edits
We can't proceed until
Feedback from maintainers


Thanks for reporting the issue. I believe this might occur for a number of articles which I ran the bot through, although I won't know which ones until the list of articles with invalid dates gets populated again in the next cycle. Also, I was thinking maybe the bot could include a feature to check for an invalid year before it replaces it, just in case it finds a set of numbers that look like dates elsewhere again. -- AquaDTRS (talk) 19:38, 6 September 2018 (UTC)

this bug has been fixed. Just waiting for the new code to get deployed on Wikipedia AManWithNoPlan (talk) 01:53, 7 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:15, 20 September 2018 (UTC)

Fouling causes internal server error

Status
new bug
Reported by
-- AquaDTRS (talk) 19:24, 6 September 2018 (UTC)
What happens
Running the bot through Fouling causes a "500 - Internal Server Error" to appear. No log is provided. Not sure what causes it.
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/733 AManWithNoPlan (talk) 23:53, 6 September 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 14:17, 20 September 2018 (UTC)

Bot stops in Neutrino

Status
new bug
Reported by
-- AquaDTRS (talk) 19:46, 6 September 2018 (UTC)
What happens
When running the bot through Neutrino, it stops at the line
  ! No match for bibcode identifier: 2012PhLB..713...17I; 2014A&A...571A..16P
  + Adding url: http://www.jstor.org/stable/78071"
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/732 AManWithNoPlan (talk) 20:27, 6 September 2018 (UTC)

Pull has been merged, but issue has not been fixed. (tJosve05a (c) 19:59, 16 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:20, 20 September 2018 (UTC)

bug fixed code not propagated to Wikipedia

Status
new bug
Reported by
(tJosve05a (c) 08:34, 10 September 2018 (UTC)
What happens
changed |title=The First Destroyer « to |title=The First Destroyer "
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Pakistan_Navy&diff=prev&oldid=858886004
We can't proceed until
Feedback from maintainers


Regression of User_talk:Citation_bot/Archive_10#Arrows_and_not_always_quotes

{{fixed}} AManWithNoPlan (talk) 14:17, 20 September 2018 (UTC)

Bot marks working DOI as broken

Status
new bug
Reported by
(tJosve05a (c) 09:53, 10 September 2018 (UTC)
What happens
Bot marks |doi=10.1002/(SICI)1097-0134(20000515)39:3<216::AID-PROT40>3.0.CO;2-# as inactive/broken, despite it being alive and working
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=CKMT1B&diff=prev&oldid=858892042
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/751 AManWithNoPlan (talk) 16:24, 11 September 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 14:18, 20 September 2018 (UTC)
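
DOIs of this kind contain characters such as "<", ">", ";" and "#" that have special meaning in URLs. Whether that was the cause fixed in the pull request above is not stated here, so the following Python sketch is offered only as an assumption-labelled illustration of percent-encoding such a DOI before checking it against a resolver. The function name is hypothetical; urllib.parse.quote is the standard-library call.

```python
from urllib.parse import quote


def doi_resolver_url(doi):
    """Build a doi.org URL with everything except '/' percent-encoded,
    so that '#' is not read as a fragment and '<', '>' are sent safely."""
    return "https://doi.org/" + quote(doi, safe="/")
```

Without the encoding, everything from "#" onward would be treated as a URL fragment and never reach the server.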

Did not expand raw url in first edit; required two edits

Status
new bug
Reported by
(tJosve05a (c) 12:50, 10 September 2018 (UTC)
What happens
The bot didn't do https://en.wikipedia.org/w/index.php?title=Luis_Bu%C3%B1uel&diff=858908056&oldid=858905144
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Luis_Bu%C3%B1uel&diff=prev&oldid=858905144 and next edit by me
We can't proceed until
Feedback from maintainers


https://en.wikipedia.org/w/index.php?title=Andragogy&diff=859059054&oldid=859058657

https://github.com/ms609/citation-bot/pull/755 AManWithNoPlan (talk) 18:13, 11 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:18, 20 September 2018 (UTC)

Replaced specific |at= with non-specific |pages=

Status
new bug
Reported by
(tJosve05a (c) 08:18, 11 September 2018 (UTC)
What happens
|at=pp.425–439, see Table 2 p. 426 for tempering temperatures to |pages=425–439
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Chocolate&diff=prev&oldid=859032904
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/756 AManWithNoPlan (talk) 20:36, 11 September 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 14:18, 20 September 2018 (UTC)

CrossRef has � instead of ü, ä and ö in metadata

Status
new bug
Reported by
(tJosve05a (c) 13:39, 11 September 2018 (UTC)
What happens
The bot added � to parameters instead of the proper Unicode character
What should happen
The bot should have added ü, ä and ö
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Nicandra&diff=859064052&oldid=831983248
https://en.wikipedia.org/w/index.php?title=Theory_of_imputation&diff=859065464&oldid=844744066
We can't proceed until
Feedback from maintainers


Sorry, but the data is wrong in CrossRef. We could detect it, but we cannot fix it. AManWithNoPlan (talk) 16:19, 11 September 2018 (UTC)

Where there should be an ü and an ä there is a U+FFFD replacement character: http://www.fileformat.info/info/unicode/char/0fffd/index.htm AManWithNoPlan (talk) 16:32, 11 September 2018 (UTC)
Unfixable AManWithNoPlan (talk) 00:11, 14 September 2018 (UTC)
How about not adding any author names at all if one such character appears, since it is nasty and puts garbage on Wikipedia. Or at least warn users somehow when it has added this. (tJosve05a (c) 07:53, 14 September 2018 (UTC)
You might complain to CrossRef. I think people would rather see The German Lust f�r Science than Error: no title specified in a reference. AManWithNoPlan (talk) 16:00, 14 September 2018 (UTC)
This has come up before and the consensus has always been in favor of the bot's actions. AManWithNoPlan (talk) 16:13, 14 September 2018 (UTC)
I have complained to CrossRef. AManWithNoPlan (talk) 16:14, 14 September 2018 (UTC)
{{wontfix}} AManWithNoPlan (talk) 14:20, 20 September 2018 (UTC)
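
Detection, as opposed to repair, is straightforward, since the damaged metadata contains the literal U+FFFD replacement character. A minimal Python sketch of the "warn users" suggestion, with a hypothetical function name and a plain dictionary standing in for the CrossRef record:

```python
REPLACEMENT_CHAR = "\ufffd"  # what appears where ü, ä or ö should be


def fields_with_replacement_char(metadata):
    """List the metadata fields whose text contains U+FFFD, so the user
    can be warned that the upstream record is damaged."""
    return sorted(
        name
        for name, value in metadata.items()
        if isinstance(value, str) and REPLACEMENT_CHAR in value
    )
```

The original letter cannot be recovered from the replacement character, which is why the thread treats the underlying data problem as unfixable on the bot's side.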

Bot fails on Science

Status
new bug
Reported by
Headbomb {t · c · p · b} 23:49, 13 September 2018 (UTC)
Replication instructions
Run the bot on this version of the page [21] and see it choke. This edit identifies the problematic citation.
We can't proceed until
Feedback from maintainers


I think that's fixed in our GitHub development tree. AManWithNoPlan (talk) 01:05, 14 September 2018 (UTC)

PHP can be a little aggressive with memory usage AManWithNoPlan (talk) 01:08, 14 September 2018 (UTC)
It seems to happen on JSTOR links mostly. Headbomb {t · c · p · b} 17:17, 16 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:21, 20 September 2018 (UTC)

Square brackets with more than one pipe are unsupported

Status
new bug
Reported by
(tJosve05a (c) 11:22, 15 September 2018 (UTC)
What happens
See bottom of diff, regarding File:Wikisource-logo.svg in |title=.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Bra&diff=prev&oldid=859649075
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/783 fix code written AManWithNoPlan (talk) 04:28, 16 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:22, 20 September 2018 (UTC)

Not showing my name

Status
new bug
Reported by
5 albert square (talk) 14:21, 15 September 2018 (UTC)
What happens
In the edit summary, the bot just says "user activated" despite me providing my name. I've entered it as "5 albert square", "User:5 albert square" and "User:5 albert square" and no difference.
What should happen
Instead of saying user activated, it should be showing my username.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Sam_Callis&diff=next&oldid=854551463
We can't proceed until
Feedback from maintainers


Did you try your canonical username, i.e. with underscores instead of spaces? AManWithNoPlan (talk) 19:32, 15 September 2018 (UTC)
Sorted! Thanks!--5 albert square (talk) 20:29, 15 September 2018 (UTC)
We should fix that, since most people use spaces. We should also remove "User:" if people add that. AManWithNoPlan (talk) 21:31, 15 September 2018 (UTC)
It would be helpful. I must admit I didn't think to try it with underscores. Thanks again!--5 albert square (talk) 22:38, 15 September 2018 (UTC)
https://github.com/ms609/citation-bot/pull/782 AManWithNoPlan (talk) 00:14, 16 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:22, 20 September 2018 (UTC)
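
The normalisation discussed in this thread (accept spaces, drop a leading "User:") can be sketched roughly as below. This is only an illustration in Python, not the code from the linked pull request; the function name `canonical_username` is hypothetical.

```python
def canonical_username(raw: str) -> str:
    """Hypothetical helper: normalise a username typed into the bot's form."""
    name = raw.strip()
    # Drop a leading "User:" namespace prefix, case-insensitively.
    if name.lower().startswith("user:"):
        name = name[len("user:"):].strip()
    # Treat underscores and spaces alike, then join the words with underscores.
    return "_".join(name.replace("_", " ").split())


print(canonical_username("User:5 albert square"))
```

With this approach all three spellings tried by the reporter would map to the same underscore form.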

Bot stops in Fluvoxamine

Status
new bug
Reported by
(tJosve05a (c) 19:58, 16 September 2018 (UTC)
What happens
The bot stops running on Fluvoxamine when it comes to the following:
 ! CrossRef server error loading headers for DOI 10.1002/(SICI)1520-6394(1998)8:1 <64::AID-DA10>3.0.CO;2-S: HTTP/1.0 400 Bad request DOI ok.
 ! No CrossRef record found for doi '10.1002/(SICI)1520-6394(1998)8:1 <64::AID-DA10>3.0.CO;2-S'; marking as broken
We can't proceed until
Feedback from maintainers


Assuming this is the PHP memory bug. AManWithNoPlan (talk) 03:19, 18 September 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 14:23, 20 September 2018 (UTC)

Adds redundant doi-broken-date when doi-broken already present

Status
new bug
Reported by
DferDaisy (talk) 00:35, 18 September 2018 (UTC)
What happens
doi-broken parameter is already in use, then alias of doi-broken-date is redundantly added (aliases listed at Module:Citation/CS1/Configuration).
Relevant diffs/links
in external links section
We can't proceed until
Feedback from maintainers


That's what happens when a template is designed by people who do not plan ahead. AManWithNoPlan (talk) 02:33, 18 September 2018 (UTC)
https://github.com/ms609/citation-bot/pull/788 AManWithNoPlan (talk) 20:23, 18 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:24, 20 September 2018 (UTC)

Confusing links for conference papers

Status
new bug
Reported by
Martin Monperrus
What happens
The conference papers (very common in Computer Science) are rewritten as 'cite book'. However, the open-access URL usually points to the article itself (and not to the whole proceedings).

As a result, the link to the paper is on the proceedings name and not in the title. For instance "Proceedings of ISSTA, Demonstration Track" https://hal.archives-ouvertes.fr/hal-01321615/file/astor.pdf points to "ASTOR: A Program Repair Library for Java" (on https://en.wikipedia.org/wiki/Automatic_bug_fixing)

This is confusing both for human readers and for search engines.

What should happen
The conference papers could be "cite article", this is one possible solution.
Relevant diffs/links
https://en.wikipedia.org/wiki/Automatic_bug_fixing
Replication instructions
Run citation_bot on a page with references to conference papers.
We can't proceed until
Feedback from maintainers


This is GIGO. Headbomb {t · c · p · b} 13:37, 18 September 2018 (UTC)

In that case {{notabug}} AManWithNoPlan (talk) 14:38, 20 September 2018 (UTC)

Remove leftover deadurl

Status
new bug
Reported by
Headbomb {t · c · p · b} 00:13, 19 September 2018 (UTC)
What should happen
Remove |deadurl=no/yes/whatever when no url is present.
Relevant diffs/links
[22]
We can't proceed until
Feedback from maintainers


{{fixed}} removes this when removing url now AManWithNoPlan (talk) 14:25, 20 September 2018 (UTC)

Redundant duplication of author name parameters

Status
new bug
Reported by
DferDaisy (talk) 18:18, 9 September 2018 (UTC)
What happens
"author-first" and "author-last" were already present, bot (or bot user?) redundantly added "last1" and "first1".
Relevant diffs/links
see author-last=Malaisé, see ref name=Knops1991
We can't proceed until
Feedback from maintainers


Thank you for changing the bug title. AManWithNoPlan (talk) 18:24, 9 September 2018 (UTC)

https://github.com/ms609/citation-bot/pull/746 AManWithNoPlan (talk) 23:39, 11 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:08, 21 September 2018 (UTC)

Remove access date when there is no URL.

{{cite web}} should be the exception. Leave that one alone, unless it's converted. e.g. [23]. Headbomb {t · c · p · b} 23:18, 18 September 2018 (UTC)

true, because people are clueless. AManWithNoPlan (talk) 23:25, 18 September 2018 (UTC)
Recommend reviewing the criteria for orphan |access-date= removal at User:GreenC_bot/Job_5 that was arrived at by lengthy community input over a 5 month period. -- GreenC 01:48, 19 September 2018 (UTC)
{{notabug}} and another one bites the dust. AManWithNoPlan (talk) 03:42, 22 September 2018 (UTC)

Cleanup chapter-url too

Status
new bug
Reported by
Headbomb {t · c · p · b} 23:10, 20 September 2018 (UTC)
What should happen
[24]
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/810 AManWithNoPlan (talk) 23:15, 20 September 2018 (UTC)

{{fixed}}

Bot converts garbage parameter to another garbage parameter

Status
new bug
Reported by
(tJosve05a (c) 21:16, 17 September 2018 (UTC)
What happens
|authorlinux= to |authorlink#=
Neither of those parameters is valid.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Pomegranate&diff=prev&oldid=860029209
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/791 AManWithNoPlan (talk) 23:08, 20 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 22:06, 21 September 2018 (UTC)

Bad bibcode data for arxiv

Status
new bug
Reported by
(tJosve05a (c) 23:14, 18 September 2018 (UTC)
What happens
Bot changed |pages=8159 to |pages=astro-ph/9508159
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Cosmic_microwave_background&diff=prev&oldid=860190112
We can't proceed until
Feedback from maintainers


Brandenberger, Robert H. (1995). "Formation of Structure in the Universe": 8159. Bibcode:1995astro.ph..8159B. {{cite journal}}: Cite journal requires |journal= (help)

https://github.com/ms609/citation-bot/pull/807 AManWithNoPlan (talk) 16:27, 20 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 22:04, 21 September 2018 (UTC)

Title vs script-title

Status
new bug
Reported by
(tJosve05a (c) 23:03, 18 September 2018 (UTC)
What happens
Bot changed
{{cite book |author={{noitalic|{{lang|zh-hans|国务院人口普查办公室、国家统计局人口和社会科技统计司编}}}} |date=2012 |script-title=zh:中国2010年人口普查分县资料 |location=Beijing |publisher={{noitalic|{{lang|zh-hans|中国统计出版社}}}} [China Statistics Press] |page= |isbn=978-7-5037-6659-6 }}

to

{{cite book |author={{noitalic|{{lang|zh-hans|国务院人口普查办公室、国家统计局人口和社会科技统计司编}}}} |title=中国2010年人口普查分县资料 |date=2012 |script-title=zh:中国2010年人口普查分县资料 |location=Beijing |publisher={{noitalic|{{lang|zh-hans|中国统计出版社}}}} [China Statistics Press] |page= |isbn=978-7-5037-6659-6 }}

Making the title 中国2010年人口普查分县资料 appear twice.

What should happen
Do not add |title= if |script-title= is the same (or includes the title in its string).
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Beijing&diff=860189104&oldid=859222257
We can't proceed until
Feedback from maintainers


Same with История русского автомата in https://en.wikipedia.org/w/index.php?title=7.62×39mm&oldid=860197219 (tJosve05a (c) 00:34, 19 September 2018 (UTC)

https://github.com/ms609/citation-bot/pull/809 AManWithNoPlan (talk) 19:58, 20 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 13:08, 22 September 2018 (UTC)
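
The check requested above (skip |title= when |script-title= already carries the same text) might look something like the following sketch. It is written in Python purely for illustration; the function name and the dict-of-parameters representation are assumptions, not the bot's actual code.

```python
def should_add_title(params: dict, candidate: str) -> bool:
    """Hypothetical check: is it safe to add `candidate` as |title=?"""
    candidate = candidate.strip()
    # Nothing to add, or a title is already present.
    if not candidate or params.get("title", "").strip():
        return False
    # |script-title= usually starts with a language prefix such as "zh:",
    # so test for containment rather than strict equality.
    script_title = params.get("script-title", "").strip()
    return candidate not in script_title


citation = {"script-title": "zh:中国2010年人口普查分县资料", "date": "2012"}
print(should_add_title(citation, "中国2010年人口普查分县资料"))
```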

Weird edit summary

Status
new bug
Reported by
(tJosve05a (c) 23:00, 18 September 2018 (UTC)
What happens
# # # citation_bot_placeholder_comment 0 # # #
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Beijing&diff=860189104&oldid=859222257
We can't proceed until
Feedback from maintainers


{{cite web |<!--trans--->title = Beij}} unfixable without massive effort AManWithNoPlan (talk) 02:49, 22 September 2018 (UTC)

Since we do not see the title, the bot could add one AManWithNoPlan (talk) 02:51, 22 September 2018 (UTC)
{{wontfix}} assuming rare and weird edit summary is a warning. AManWithNoPlan (talk) 19:00, 23 September 2018 (UTC)

Follow-up on the follow-up on removal of URLs for broken DOIs

Status
new bug
Reported by
(tJosve05a (c) 20:54, 21 September 2018 (UTC)
What happens
{{cite journal|last1=Kaye|first1=Steven|last2=Fox|first2=Joseph M.|last3=Hicks|first3=Frederick A.|last4=Buchwald|first4=Stephen L.|title=The Use of Catalytic Amounts of CuCl and Other Improvements in the Benzyne Route to Biphenyl-Based Phosphine Ligands|journal=Advanced Synthesis & Catalysis|date=31 December 2001|volume=343|issue=8|pages=789–794|doi=10.1002/1615-4169(20011231)343:83.0.CO;2-A|url=http://onlinelibrary.wiley.com/doi/10.1002/1615-4169(20011231)343:8%3C789::AID-ADSC789%3E3.0.CO;2-A/full|language=en|issn=1615-4169|doi-broken-date=2017-04-22}}

to

{{cite journal|last1=Kaye|first1=Steven|last2=Fox|first2=Joseph M.|last3=Hicks|first3=Frederick A.|last4=Buchwald|first4=Stephen L.|title=The Use of Catalytic Amounts of CuCl and Other Improvements in the Benzyne Route to Biphenyl-Based Phosphine Ligands|journal=Advanced Synthesis & Catalysis|date=31 December 2001|volume=343|issue=8|pages=789–794|doi=10.1002/1615-4169(20011231)343:83.0.CO;2-A|language=en|issn=1615-4169|doi-broken-date=2018-09-21}}
What should happen
Don't remove URL if doi is broken
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Dialkylbiaryl_phosphine_ligands&diff=prev&oldid=860606933
We can't proceed until
Feedback from maintainers


Perhaps even replace with new one, as in this case the doi was missing a character AManWithNoPlan (talk) 21:06, 21 September 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 13:36, 24 September 2018 (UTC)

Dropping pdf urls

do not drop urls that point to .pdf even if they have doi AManWithNoPlan (talk) 03:14, 23 September 2018 (UTC)

this got flagged as el fixo on accident AManWithNoPlan (talk) 03:15, 23 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:46, 25 September 2018 (UTC)

Change parameters to better choices

I don't know if this is the case (pretty sure it isn't), but the bot should convert

  • |publication-date= → |date=
  • |publication-place= → |location=

If |location= / |date= aren't set / are empty Headbomb {t · c · p · b} 13:48, 12 August 2018 (UTC)

Thoughts on other things that we should upgrade? AManWithNoPlan (talk) 15:18, 12 August 2018 (UTC)

Only other one I can think of is

  • |orig-year=/|origyear= → |year=

Headbomb {t · c · p · b} 16:19, 12 August 2018 (UTC)

I think (though I could be wrong) that |orig-year= should be converted to |year= only if (a) |year= is empty and (b) |orig-year= contains only a valid four-digit year. Both must be true. If |orig-year= contains additional text, it should not be moved to |year=; that will cause an error message to appear. – Jonesey95 (talk) 17:45, 12 August 2018 (UTC)
Yes, I mean doing those conversions only when they don't overwrite existing parameters. Slightly clarified the title of this section to reflect that. Headbomb {t · c · p · b} 02:37, 13 August 2018 (UTC)
https://github.com/ms609/citation-bot/pull/824 AManWithNoPlan (talk) 16:34, 25 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 15:03, 27 September 2018 (UTC)
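
The conversion rules agreed in this thread (never overwrite an existing parameter, and move |orig-year= only when it is a bare four-digit year) can be sketched as follows. This is a Python illustration under the assumption that a citation is held as a plain dict; `upgrade_parameters` is a hypothetical name and this is not the code from the linked pull request.

```python
import re


def upgrade_parameters(params: dict) -> dict:
    """Hypothetical sketch: rename parameters only when the target is unset or empty."""
    out = dict(params)

    def is_blank(key: str) -> bool:
        return not out.get(key, "").strip()

    # Simple renames: only when the source has a value and the target is blank.
    for old, new in (("publication-date", "date"), ("publication-place", "location")):
        if not is_blank(old) and is_blank(new):
            out[new] = out.pop(old)

    # |orig-year= moves to |year= only if it is exactly a four-digit year.
    for old in ("orig-year", "origyear"):
        value = out.get(old, "").strip()
        if re.fullmatch(r"\d{4}", value) and is_blank("year"):
            out["year"] = value
            del out[old]
    return out


print(upgrade_parameters({"publication-date": "2001", "orig-year": "First published 1850"}))
```

In the example, |publication-date= is moved to |date=, while |orig-year= is left alone because it contains extra text.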

wayback.archive.org

Status
new bug
Reported by
(tJosve05a (c) 23:42, 18 September 2018 (UTC)
What should happen
Always remove |website=wayback.archive.org, |publisher=wayback.archive.org etc.
Remove |website=archive.org, |publisher=archive.org etc. if the (main) url has another domain than archive.org.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Timothy_Dalton&diff=860192999&oldid=855632533
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/822 AManWithNoPlan (talk) 15:49, 25 September 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 15:29, 27 September 2018 (UTC)
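
A rough sketch of the two rules requested above, in Python for illustration only. The function name, the dict representation and the choice to check only |website= and |publisher= (the request says "etc.") are assumptions; the linked pull request may differ.

```python
from urllib.parse import urlparse


def drop_archive_hosts(params: dict) -> dict:
    """Hypothetical sketch: remove archive.org placeholders from website/publisher."""
    out = dict(params)
    host = (urlparse(out.get("url", "")).hostname or "").lower()
    on_archive = host == "archive.org" or host.endswith(".archive.org")
    # Rule 2 only applies when a main URL exists and lives on another domain.
    url_elsewhere = bool(host) and not on_archive

    for key in ("website", "publisher"):
        value = out.get(key, "").strip().lower()
        if value == "wayback.archive.org":
            del out[key]  # rule 1: always remove
        elif value == "archive.org" and url_elsewhere:
            del out[key]  # rule 2: remove when the url points elsewhere
    return out


print(drop_archive_hosts({"url": "http://example.com/a", "website": "archive.org"}))
```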

Bot fails to respect comment exclusion in Titles (capitalization and quotes) and others

Status
new bug
Reported by
Headbomb {t · c · p · b} 12:48, 3 September 2018 (UTC)
What should happen
Comment should block bot
Relevant diffs/links
see below
We can't proceed until
Feedback from maintainers


https://en.wikipedia.org/w/index.php?title=CKMT1B&diff=858892445&oldid=858892287

https://en.wikipedia.org/w/index.php?title=%CA%BBOumuamua&diff=prev&oldid=861195731

Working on it https://github.com/ms609/citation-bot/pull/826 AManWithNoPlan (talk) 18:56, 26 September 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 16:04, 27 September 2018 (UTC)

Forget Amazon just the same as Google Books

Status
new bug
Reported by
(tJosve05a (c) 07:15, 24 September 2018 (UTC)
What should happen
Forget |publisher=Amazon.com if removing Amazon URL in favor of ISBN.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Woody_Allen&diff=prev&oldid=860961750
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/823 AManWithNoPlan (talk) 15:54, 25 September 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 16:19, 27 September 2018 (UTC)

Bot adds redundant parameters

Status
new bug
Reported by
Kim Post (talk) 19:19, 24 September 2018 (UTC)
What happens
Citation bot adds a database identifier, even if the citation already has a DOI which resolves to that same identifier.
What should happen
The bot should not add an identifier with the same target as the existing DOI. In general, the bot should not add all possible links to avoid bloating a bibliography. When a stable URL like a DOI is present, an additional specific database identifier is often not useful, see WP:SOURCELINKS. While some editors may want such identifiers, and it's common e.g. in WikiProject Medicine, a bot should be relatively conservative to allow for WP:CITEVAR. Redundancy should not be the default.
Relevant diffs/links
Diff of Manuel Torres (diplomat)
We can't proceed until
Feedback from maintainers


I don't see this as a CITEVAR thing (and would vehemently challenge you to defend it there beyond the support for |vauthors= and the like for name-formatting or the use of CS1 versus CS2).
Anyway, this is a feature IMO, as the DOI may not always resolve to the same named identifier. --Izno (talk) 20:45, 24 September 2018 (UTC)
I'm not sure I understand your response. It's a CITEVAR issue because editors may disagree not only on how to format citations, but on what information to include in them. For example, whether to specify the medium is part of the "style" of a citation; the word is used in policy pages with that meaning. Do you mean that the {{cite}} template should only be used with one specific style? As best I know "Citation Style 1" refers primarily to formatting and is not a complete citation style of its own.
I make the suggestion because this is a specifically avoidable example of the "add as much as I can find" approach the bot uses. Other cases are not as easily detected; multiple identifiers may sometimes be useful. Unless the bot knows what citation style (if any consistent one) is in use then it often won't be clear whether adding an identifier is helpful or harmful to style conformance. Kim Post (talk) 21:33, 24 September 2018 (UTC)
This comes up from time to time and it always gets overwhelmingly resolved that users are more important than editors. AManWithNoPlan (talk) 22:38, 24 September 2018 (UTC)
Why do editors curate the bibliography if not for the benefit of the reader? (I'd concede that redundant identifiers have some potential use to editors.) I searched the archives for "identifier" and the closest parallel seems to be the removal of ASINs that were redundant with ISBNs. Please do link to a relevant consensus though; I'm happy to adapt if there is one. Kim Post (talk) 23:21, 24 September 2018 (UTC)
I do not have the time to look it up at this point. Several other people who watch this board will most likely do it. They tracked down the justification for removing publisher and location from all journal references; so they can find anything. AManWithNoPlan (talk) 23:27, 24 September 2018 (UTC)
This is not a bug, and is desirable. I know if I have access to JSTOR or not, I don't know if I have access to a generic DOI or not. Plus, if in the future the DOI points to a different database than JSTOR, the JSTOR link will still be functional. Headbomb {t · c · p · b} 17:04, 25 September 2018 (UTC)
{{notabug}} AManWithNoPlan (talk) 14:35, 27 September 2018 (UTC)

Better cleanup of date/year, page/pages/at, via

|page= and |pages= are aliases.

  • |page=13-25 should be converted to |pages=13–25
  • If any of |page=/|pages=/|at= is set, remove the others (if they are empty / redundant)

|year= and |date= are aliases.

  • |date=2008 should be converted to |year=2008
  • If either of |year=/|date= is set, remove the other (if it is empty)

|via= online makes sense if a URL is provided, so remove it if there is no url provided.

So a citation like

{{cite journal |last=Smith |first=John |date=2007 |year= |title=Foobar |journal=Barfoo Journal |volume=3 |issue=4 |page=34-44 |pages= |via=}}

cleans up to

{{cite journal |last=Smith |first=John |year=2007 |title=Foobar |journal=Barfoo Journal |volume=3 |issue=4 |pages=34–44}}

Headbomb {t · c · p · b} 17:01, 25 September 2018 (UTC)

I would prefer to see the canonical version, which is |date=. Otherwise these are reasonable suggestions. --Izno (talk) 17:03, 25 September 2018 (UTC)
If you want to just present the year, it makes it clearer that it's not to be expanded to full dates. And if you want to present full dates, it also makes it much easier to find year-only dates. Headbomb {t · c · p · b} 17:07, 25 September 2018 (UTC)
|date= is not a true alias of |year= – true aliases cause the 'more than one of param and param' error message as here with |work= and |journal=:
{{cite journal |title=Title |work=Work |journal=Journal}} → "Title". Journal. {{cite journal}}: More than one of |work= and |journal= specified (help)
Because there are occasions when both |date= and |year= are required (or desired), they cannot be aliases. I agree with Editor Izno that |date= should be preferred over |year= when both are not required.
Trappist the monk (talk) 17:45, 25 September 2018 (UTC)
"there are occasions when both |date= and |year= are required (or desired)" what would those occasions be? Having |date=2008-04-26 and |year=2008 just presents redundant information. Headbomb {t · c · p · b} 18:39, 25 September 2018 (UTC)
The requirement is described in the documentation. There are editors who do not like to have the disambiguator character displayed in the final rendering.
Trappist the monk (talk) 11:14, 26 September 2018 (UTC)
And the game goes to Trappist the monk who scored the winning shot. But, obviously if pages is set, then blank page and at should be removed and such. AManWithNoPlan (talk) 14:38, 26 September 2018 (UTC)

https://github.com/ms609/citation-bot/pull/829 AManWithNoPlan (talk) 23:37, 26 September 2018 (UTC)

{{fixed}}
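
Taking the outcome of the discussion above (keep |date= rather than forcing |year=, but tidy page ranges and drop blank duplicates), the cleanup could be sketched like this. It is a Python illustration with a hypothetical function name and a dict-based citation, not the code from the linked pull request. Note the assumption that a value such as 34-44 is a range; some journals use hyphenated single page numbers, which a real implementation would need to guard against.

```python
import re


def tidy_citation(params: dict) -> dict:
    """Hypothetical sketch of the page/date/via cleanup discussed in this thread."""
    out = dict(params)

    # |page=34-44 -> |pages=34–44, only when |pages= is not already filled in.
    match = re.fullmatch(r"(\d+)\s*[-–]\s*(\d+)", out.get("page", "").strip())
    if match and not out.get("pages", "").strip():
        out.pop("page")
        out["pages"] = f"{match.group(1)}–{match.group(2)}"

    # Within each group, drop blank members once any member has a value.
    for group in (("page", "pages", "at"), ("date", "year")):
        if any(out.get(key, "").strip() for key in group):
            for key in group:
                if key in out and not out[key].strip():
                    del out[key]

    # |via= only makes sense alongside a URL.
    if "via" in out and not out.get("url", "").strip():
        del out["via"]
    return out


example = {"last": "Smith", "first": "John", "date": "2007", "year": "",
           "title": "Foobar", "journal": "Barfoo Journal", "volume": "3",
           "issue": "4", "page": "34-44", "pages": "", "via": ""}
print(tidy_citation(example))
```

Run on the example citation from this thread, the sketch keeps |date=2007, removes the empty |year=, |pages= and |via=, and turns |page=34-44 into |pages=34–44.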

Causing cite date errors

Status
new bug
Reported by
Keith D (talk) 10:45, 26 September 2018 (UTC)
What happens
The BOT removes the full stop from the end of the |year= field when set to "n.d." thus causing there to be a cite date error
What should happen
Make no changes to the |year= in this case.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Olmec_colossal_heads&diff=861226490&oldid=859577463
We can't proceed until
Feedback from maintainers


Add tests and soon code. https://github.com/ms609/citation-bot/pull/825 AManWithNoPlan (talk) 18:27, 26 September 2018 (UTC)

Code added. Just waiting for deployment to wikipedia now. AManWithNoPlan (talk) 18:44, 26 September 2018 (UTC)
{{fixed}}

Wikilinks in journal name removed

Status
new bug
Reported by
Lithopsian (talk) 18:41, 27 September 2018 (UTC)
What happens
Citation bot alters, for example, "journal=The Astrophysical Journal" to "journal=The Astrophysical Journal"
What should happen
Leave it alone, unless possibly there is something blatantly wrong like it is not the right journal name, or perhaps a redlink. Is it because the link was a redirect? An entry "Journal=The Astrophysical Journal" seems to be left alone.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=51_Pegasi&diff=861477452&oldid=861463060
Replication instructions
See the diff link, or run the bot against the original version
We can't proceed until
Feedback from maintainers


Unless the entire journal name is wiki-linked, the data is almost always wrong. Secondly, partial links corrupt the COINS data and should not be done that way. AManWithNoPlan (talk) 20:02, 27 September 2018 (UTC)

That's probably OK. Some of the other links were to things like The Astrophysical Journal Letters, those can always be done with a redirect of the whole title if they're considered important enough. Lithopsian (talk) 20:39, 27 September 2018 (UTC)
just link the whole title and the bot leaves it alone. {{notabug}} AManWithNoPlan (talk) 02:10, 28 September 2018 (UTC)

Bizarre last1 field generated for cite web

Status
new bug
Reported by
Lithopsian (talk) 18:45, 27 September 2018 (UTC)
What happens
The bot adds a very strange last1 entry to an already-populated cite web template
What should happen
Nothing on cite web? Not this text as the name, anyway.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=51_Pegasi&diff=861479640&oldid=861479220
Replication instructions
I generated a diff containing only this change. Note that it has now been reverted and does not appear in the article.
We can't proceed until
Feedback from maintainers


{{fixed}} AManWithNoPlan (talk) 21:02, 27 September 2018 (UTC)

Adds time element to ref date

Status
new bug
Reported by
Keith D (talk) 20:48, 27 September 2018 (UTC)
What happens
The BOT adds a time element to a reference |date= field causing a cite date error to occur such as "| date= 2011-05-10T06:34:00-0400"
What should happen
Do not add time element to date field like "| date= 2011-05-10"
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Gelada&diff=861468095&oldid=848830991
We can't proceed until
Feedback from maintainers


{{fixed}} AManWithNoPlan (talk) 21:01, 27 September 2018 (UTC)
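
One way to avoid the error described above is to trim an ISO 8601 timestamp down to its date part before writing |date=. The following Python sketch is illustrative only; `strip_time` is a hypothetical name and the bot's actual fix is not shown in this thread.

```python
import re

# Date, then "T", a time, and an optional "Z" or numeric UTC offset.
_TIMESTAMP = re.compile(
    r"(\d{4}-\d{2}-\d{2})T\d{2}:\d{2}(?::\d{2})?(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})?"
)


def strip_time(date: str) -> str:
    """Hypothetical helper: keep only the date part of an ISO 8601 timestamp."""
    match = _TIMESTAMP.fullmatch(date.strip())
    # Anything that is not a full timestamp is returned untouched.
    return match.group(1) if match else date


print(strip_time("2011-05-10T06:34:00-0400"))
```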

Can anything be done with ScienceDirect.com urls?

For example

<ref>{{cite web |url=https://www.sciencedirect.com/science/article/pii/S0024379512004405 |title=Geometry of the Welch bounds}}</ref>

or

<ref>https://www.sciencedirect.com/science/article/pii/S0024379512004405</ref>

Those URLs are extremely common and if they can be parsed (similar to DOI urls), that would be fantastic. And then they could be removed since they'll be redundant with DOIs. Headbomb {t · c · p · b} 22:52, 29 August 2018 (UTC)
We would need to grab the HTML, parse it as XML, and read <meta name="citation_doi" content="10.1016/j.laa.2012.05.036" />. Or we could use: https://api.elsevier.com/content/object/pii/S0024379512004405 AManWithNoPlan (talk) 14:31, 30 August 2018 (UTC)

In other words super simple. If DOI is found, then add DOI and forget URL. But, if DOI is already set, then forget URL if DOI is the same. AManWithNoPlan (talk) 14:57, 30 August 2018 (UTC)
Glad to hear it's not a hard thing to do! Let's do it then! Headbomb {t · c · p · b} 15:59, 30 August 2018 (UTC)
More generally, the thing to do will be to use the Citoid API to extract information from any relevant URL. Citoid have an enormous and maintained database of journal web pages, including, I am sure, ScienceDirect, listing how to obtain relevant metadata from them. I'm tapping away at getting something up and running on this basis (though I'm low on time again at the moment). Martin (Smith609 – Talk) 05:51, 1 September 2018 (UTC)
We have code to query Citoid in GitHub (original jstor code). We got throttled by Citoid so we just looked at Citoid jstor code and incorporated it. AManWithNoPlan (talk) 03:14, 9 September 2018 (UTC)
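
The approach outlined in this thread (read the citation_doi meta tag, then add the DOI and forget the URL, or forget the URL if the DOI already matches) can be sketched as below. This is a Python illustration that works on already-downloaded HTML; it does no network access, the names `doi_from_html` and `apply_doi` are hypothetical, and it is not the bot's actual implementation.

```python
from html.parser import HTMLParser


class _DoiMetaParser(HTMLParser):
    """Collects the content of the first <meta name="citation_doi"> tag."""

    def __init__(self):
        super().__init__()
        self.doi = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta" and self.doi is None:
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "citation_doi":
                self.doi = attributes.get("content")


def doi_from_html(html: str):
    parser = _DoiMetaParser()
    parser.feed(html)
    return parser.doi


def apply_doi(params: dict, found_doi) -> dict:
    """Add the DOI and drop the URL, or drop the URL if the DOI already matches."""
    out = dict(params)
    if not found_doi:
        return out
    existing = out.get("doi", "").strip()
    if not existing:
        out["doi"] = found_doi
        out.pop("url", None)
    elif existing.lower() == found_doi.lower():
        out.pop("url", None)
    return out


page = '<head><meta name="citation_doi" content="10.1016/j.laa.2012.05.036" /></head>'
cite = {"url": "https://www.sciencedirect.com/science/article/pii/S0024379512004405",
        "title": "Geometry of the Welch bounds"}
print(apply_doi(cite, doi_from_html(page)))
```

If the citation already carries a different DOI, the sketch leaves both the DOI and the URL untouched, since it cannot tell which one is right.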

API: New feature, reference rebuild

I'd like the option to have a 'rebuild references' when they are so crappy we need to TNT them (for whatever reason), and start anew. Two options would be present

  • &rebuild=multiline (multiline option)
  • &rebuild=inline (inline option)

This would present things in a 'standardized' parameter order with 'standardized' whitespace

{{cite arXiv}}
multiline inline
<ref>
{{cite arXiv
 |last1= |first1=
 |last2= |first2=
 |...
 |date= or |year=
 |title=
 |arxiv=<import>
 |class=
}}</ref>
<ref>{{cite arXiv |last1= |first1= |last2= |first2= |... |date= or |year= |title= |arxiv=<import> |class=}}</ref>
{{cite book}}
multiline inline
{{cite book
 |last1= |first1=
 |last2= |first2=
 |...
 |date= or |year=
 |chapter=
 |chapter-url=<import>
 |chapter-url-access=<import>
 |editor1-last= |editor1-first=
 |editor2-last= |editor2-first=
 |title=
 |trans-title=<import>
 |language=<import non-English>
 |script-title=<import>
 |url=<import>
 |url-access=<import>
 |access-date=<import>
 |format=<import, if valid>
 |archive-url=<import>
 |archive-date=<import>
 |dead-url=<import>
 |series=
 |volume= |pages= (or |page=)
 |location=
 |publisher=
 |type=
 |arxiv=<import>
 |asin=<import> |asin-tld=<import>
 |bibcode=<import> |bibcode-access=<import>
 |biorxiv=<import>
 |citeseerx=<import>
 |doi=<import> |doi-access=<import> |doi-brokendate=
 |hdl=<import> |hdl-access=<import>
 |isbn=<import>
 |ismn=<import>
 |issn=<import>
 |jfm=<import>
 |jstor=<import> |jstor-access=<import>
 |lccn=<import>
 |mr=<import>
 |oclc=<import>
 |ol=<import> |ol-access=<import>
 |osti=<import> |osti-access=<import>
 |pmc=<import> |embargo=<import>
 |pmid=<import>
 |rfc=<import>
 |ssrn=<import>
 |zbl=<import>
 |id=<import>
 |quote=<import>
 |ref=<import>
}}
{{cite book |last1= |first1= |last2= |first2= |... |date= or |year= |chapter= |chapter-url=<import> |chapter-url-access=<import> |editor1-last= |editor1-first= |editor2-last= |editor2-first= |title= |trans-title=<import> |language=<import non-English> |script-title=<import> |url=<import> |url-access=<import> |access-date=<import> |format=<import, if valid> |archive-url=<import> |archive-date=<import> |dead-url=<import> |series= |volume= |pages= (or |page=) |location= |publisher= |type= |arxiv=<import> |asin=<import> |asin-tld=<import> |bibcode=<import> |bibcode-access=<import> |biorxiv=<import> |citeseerx=<import> |doi=<import> |doi-access=<import> |doi-brokendate= |hdl=<import> |hdl-access=<import> |isbn=<import> |ismn=<import> |issn=<import> |jfm=<import> |jstor=<import> |jstor-access=<import> |lccn=<import> |mr=<import> |oclc=<import> |ol=<import> |ol-access=<import> |osti=<import> |osti-access=<import> |pmc=<import> |embargo=<import> |pmid=<import> |rfc=<import> |ssrn=<import> |zbl=<import> |id=<import> |quote=<import> |ref=<import>}}
{{cite journal}}
multiline inline
{{cite journal
 |last1= |first1=
 |last2= |first2=
 |...
 |date= or |year=
 |title=
 |trans-title=<import>
 |language=<import non-English>
 |script-title=<import>
 |url=<import>
 |url-access=<import>
 |access-date=<import>
 |format=<import, if valid>
 |archive-url=<import>
 |archive-date=<import>
 |dead-url=<import>
 |journal=
 |series=
 |volume= |issue= |pages=
 |type=
 |arxiv=<import>
 |asin=<import> |asin-tld=<import>
 |bibcode=<import> |bibcode-access=<import>
 |biorxiv=<import>
 |citeseerx=<import>
 |doi=<import> |doi-access=<import> |doi-brokendate=
 |hdl=<import> |hdl-access=<import>
 |isbn=<import>
 |ismn=<import>
 |issn=<import>
 |jfm=<import>
 |jstor=<import> |jstor-access=<import>
 |lccn=<import>
 |mr=<import>
 |oclc=<import>
 |ol=<import> |ol-access=<import>
 |osti=<import> |osti-access=<import>
 |pmc=<import> |embargo=<import>
 |pmid=<import>
 |rfc=<import>
 |ssrn=<import>
 |zbl=<import>
 |id=<import>
 |quote=<import>
 |ref=<import>
}}
{{cite journal |last1= |first1= |last2= |first2= |... |date= or |year= |title= |trans-title=<import> |language=<import non-English> |script-title=<import> |url=<import> |url-access=<import> |access-date=<import> |format=<import, if valid> |archive-url=<import> |archive-date=<import> |dead-url=<import> |journal= |series= |volume= |issue= |pages= |type= |arxiv=<import> |asin=<import> |asin-tld=<import> |bibcode=<import> |bibcode-access=<import> |biorxiv=<import> |citeseerx=<import> |doi=<import> |doi-access=<import> |doi-brokendate= |hdl=<import> |hdl-access=<import> |isbn=<import> |ismn=<import> |issn=<import> |jfm=<import> |jstor=<import> |jstor-access=<import> |lccn=<import> |mr=<import> |oclc=<import> |ol=<import> |ol-access=<import> |osti=<import> |osti-access=<import> |pmc=<import> |embargo=<import> |pmid=<import> |rfc=<import> |ssrn=<import> |zbl=<import> |id=<import> |quote=<import> |ref=<import>}}
{{cite web}} (you can imagine the inline option)
multiline inline
{{cite web
 |last1= |first1=
 |last2= |first2=
 |...
 |date= or |year=
 |editor1-last= |editor1-first=
 |editor2-last= |editor2-first=
 |title=
 |url=<import>
 |website=
 |series=
 |volume= |pages=
 |location=
 |publisher=
 |type=
 |arxiv=<import>
 |asin=<import> |asin-tld=<import>
 |bibcode=<import> |bibcode-access=<import>
 |biorxiv=<import>
 |citeseerx=<import>
 |doi=<import> |doi-access=<import> |doi-brokendate=
 |hdl=<import> |hdl-access=<import>
 |isbn=<import>
 |ismn=<import>
 |issn=<import>
 |jfm=<import>
 |jstor=<import> |jstor-access=<import>
 |lccn=<import>
 |mr=<import>
 |oclc=<import>
 |ol=<import> |ol-access=<import>
 |osti=<import> |osti-access=<import>
 |pmc=<import> |embargo=<import>
 |pmid=<import>
 |rfc=<import>
 |ssrn=<import>
 |zbl=<import>
 |id=<import>
 |quote=<import>
 |ref=<import>
}}
{{cite web |last1= |first1= |last2= |first2= |... |date= or |year= |editor1-last= |editor1-first= |editor2-last= |editor2-first= |title= |url=<import> |website= |series= |volume= |pages= |location= |publisher= |type= |arxiv=<import> |asin=<import> |asin-tld=<import> |bibcode=<import> |bibcode-access=<import> |biorxiv=<import> |citeseerx=<import> |doi=<import> |doi-access=<import> |doi-brokendate= |hdl=<import> |hdl-access=<import> |isbn=<import> |ismn=<import> |issn=<import> |jfm=<import> |jstor=<import> |jstor-access=<import> |lccn=<import> |mr=<import> |oclc=<import> |ol=<import> |ol-access=<import> |osti=<import> |osti-access=<import> |pmc=<import> |embargo=<import> |pmid=<import> |rfc=<import> |ssrn=<import> |zbl=<import> |id=<import> |quote=<import> |ref=<import>}}

Whatever is marked <import> would be carried over from the old citation, with URLs/Identifiers used to rebuild the rest of the citation. The rest would be present (if the bot can/would fill them), or omitted (if the bot can't/wouldn't fill them). Headbomb {t · c · p · b} 17:59, 1 September 2018 (UTC)

The idea is that this would facilitate this type of cleanup and standardization. Headbomb {t · c · p · b} 18:10, 1 September 2018 (UTC)
If multi-line every argument ideally would have its own line as it's confusing for other bots when there is a combination, they can't determine automatically what kind of template it is supposed to be multi-line or single-line. -- GreenC 13:48, 24 September 2018 (UTC)
Not sure I follow. The point is for one time runs to rebuild stuff in an easily-reviewable way. What happens after that is business as usual. Other bots have nothing to do with this. Headbomb {t · c · p · b} 14:43, 24 September 2018 (UTC)
I think that other bots expect either multi-line or one-big-line and have multiple lines but with last and first on one line makes the other bots confused. AManWithNoPlan (talk) 14:48, 24 September 2018 (UTC)
Well, that's already the case, and not the problem this request is trying to solve. Headbomb {t · c · p · b} 15:00, 24 September 2018 (UTC)
  • From past experience, this type of behaviour has the potential to be unpopular. I'm not sure that it's quite within the remit of the bot, as it stands. If you think it's important, I would suggest opening a new bot request for approval to determine the parameters under which this behaviour would be acceptable. Martin (Smith609 – Talk) 11:58, 28 September 2018 (UTC)

Capitalization of journals

Status
new bug
Reported by
Martin (Smith609 – Talk) 11:51, 27 September 2018 (UTC)
What happens
Despite the inclusion of BMC in the capitalization exclusion file, the bot missed an opportunity to capitalize BMC in Bmc Medical Education
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Physiology&diff=prev&oldid=861432924, line 28

See also bioRvix at https://en.wikipedia.org/w/index.php?title=Homo_sapiens&diff=prev&oldid=861469112

We can't proceed until
Feedback from maintainers


  Fixed in GitHub Pull 855 AManWithNoPlan (talk) 21:37, 27 September 2018 (UTC)
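
For readers following along, here is a minimal sketch of how an exclusion list can keep acronyms such as BMC upper-case when a journal name is title-cased. It is not the bot's actual code: the names `JOURNAL_ACRONYMS`, `SMALL_WORDS` and `title_case_journal` are hypothetical, and the real exclusion file may be structured quite differently.

```python
# Hypothetical exclusion list: lower-case key -> required capitalization.
# The bot's real exclusion file is not shown in this thread.
JOURNAL_ACRONYMS = {"bmc": "BMC", "ilr": "ILR"}

# Assumed set of short words kept lower-case when not in first position.
SMALL_WORDS = {"of", "and", "the", "in", "for", "on"}


def title_case_journal(name: str) -> str:
    """Title-case a journal name, restoring acronyms from the exclusion list."""
    out = []
    for position, word in enumerate(name.split()):
        key = word.lower()
        if key in JOURNAL_ACRONYMS:
            # The exclusion list wins over ordinary title-casing.
            out.append(JOURNAL_ACRONYMS[key])
        elif position > 0 and key in SMALL_WORDS:
            out.append(key)
        else:
            out.append(word[:1].upper() + word[1:].lower())
    return " ".join(out)
```

The lookup is done on the lower-cased word, so the fix applies whether the source metadata says "Bmc", "bmc" or "BMC".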

Don't expand raw URLs using Citoid/Zotero

Status
{{fixed}}
Reported by
(tJosve05a (c) 18:47, 27 September 2018 (UTC)
What happens
Caused bad timestamps, repeats author names, just...no.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=User_talk:Citation_bot&action=edit&section=new&preload=User_talk:Citation_bot/preload&preloadtitle=Untitled_new_bug
We can't proceed until
Feedback from maintainers


The behaviour of the bot just a day ago, where it did not add titles to cite webs and cite news, was perfect in my view. I feel that the current version of the bot is too unstable and is adding a lot of junk. Can an option not to run Zotero expansion be added (or the reverse)? Such as ?edit=toolbar&slow=1&zotero=0 (tJosve05a (c) 21:42, 27 September 2018 (UTC)
That function went a lot overboard initially. No data validation, etc.. I was surprised to see it up and running. AManWithNoPlan (talk) 22:47, 27 September 2018 (UTC)
We've scaled back our use of Zotero, and will continue to monitor until we strike the right balance. Marking as resolved. Martin (Smith609 – Talk) 12:06, 28 September 2018 (UTC)

Adding superfluous date formatting

Status
  Fixed in GitHub Pull 845
Reported by
(tJosve05a (c) 06:13, 28 September 2018 (UTC)
What happens
|date=scheme=dcterms.ISO8601; 2013-10-23
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Bert_Officer&diff=prev&oldid=861543786
We can't proceed until
Feedback from maintainers


Capitalization and punctuation in Washington, D.C.

Status
{{fixed}}
Reported by
(tJosve05a (c) 08:33, 28 September 2018 (UTC)
What should happen
https://en.wikipedia.org/w/index.php?title=United_States_Postal_Service&diff=next&oldid=861553736
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=United_States_Postal_Service&diff=861553736&oldid=860917127
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/858 AManWithNoPlan (talk) 16:08, 28 September 2018 (UTC)

More script-title issues

Status
{{fixed}}
Reported by
(tJosve05a (c) 09:48, 28 September 2018 (UTC)
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Jos%C3%A9_Rizal&diff=prev&oldid=861558796
We can't proceed until
Feedback from maintainers


What do you suggest? The template supports two different title parameters and then shows them both. We added code that prevents duplicates, but in these cases the script title and title are very different (or maybe one is basically printed and the other is a cursive styling of the same words). Perhaps:

if (has script-title and new title is not all western characters) then
  ignore new title
else
  add title
end if

AManWithNoPlan (talk) 16:12, 28 September 2018 (UTC)

Yes, that would be a nice solution. However, a "non-script" title isn't necessary when a script-title is present, and in most cases where script-title is used, there is no "western title" available at all, only a |trans-title=.
if (has script-title) then
  ignore new title
else
  add title
end if

(tJosve05a (c) 16:18, 28 September 2018 (UTC)
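
The two pseudocode proposals above can be compared side by side in a short runnable sketch. Everything here is illustrative: `should_add_title`, the `strict` flag and the `is_western` helper are hypothetical names, and "western characters" is approximated (as an assumption) by code points below U+0250, which covers basic Latin and the Latin Extended-A/B blocks.

```python
def is_western(text: str) -> bool:
    """Assumed approximation: every character is below U+0250 (Latin ranges)."""
    return all(ord(ch) < 0x250 for ch in text)


def should_add_title(params: dict, new_title: str, strict: bool = False) -> bool:
    """Decide whether a newly found title may be added to a citation.

    strict=False follows the first proposal: with a script-title present,
    only an all-western title is added.
    strict=True follows the second proposal: with a script-title present,
    no title is ever added.
    """
    has_script_title = bool(params.get("script-title", "").strip())
    if not has_script_title:
        return True
    if strict:
        return False
    return is_western(new_title)
```

With `strict=True` the check never inspects the new title at all, which matches the point that a "western" title is usually not available when |script-title= is in use.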

Bot hasn't edited in a few hours / fails to edit when triggered

Activating the bot sends it into an endless loop of doing absolutely nothing. Can't really explain more, save that it just fails to run properly on any page you try to run it on. Headbomb {t · c · p · b} 14:39, 29 September 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 20:36, 29 September 2018 (UTC)

Adds invalid date

Status
{{fixed}}
Reported by
Keith D (talk) 11:17, 28 September 2018 (UTC)
What happens
Adds invalid date "|date=-001-11-30T00:00:00+00:00"
another example "| date= 0000 uu"
another example "|date=1"
What should happen
Add a valid date or leave blank
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Draft:Rison_(Singer)&diff=861510456&oldid=861509469
https://en.wikipedia.org/w/index.php?title=Yearbook_on_International_Communist_Affairs&diff=861515747&oldid=861515469
https://en.wikipedia.org/w/index.php?title=Slug&diff=861522041&oldid=860710903
We can't proceed until
Feedback from maintainers


Thanks for the report. Looking into it. Martin (Smith609 – Talk) 12:08, 28 September 2018 (UTC)
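
As an illustration of the "add a valid date or leave blank" behaviour requested above, here is a minimal whitelist-style sketch. It is not the bot's code: `clean_date`, `ACCEPTED_FORMATS` and the year bounds are assumptions chosen for the example, and the month-name formats rely on the default English locale.

```python
from datetime import datetime
from typing import Optional

# Assumed whitelist of date layouts; anything else is treated as junk.
ACCEPTED_FORMATS = ("%Y-%m-%d", "%d %B %Y", "%B %d, %Y", "%B %Y", "%Y")


def clean_date(raw: str, min_year: int = 1000, max_year: int = 2100) -> Optional[str]:
    """Return the date string if it parses in a whitelisted format, else None.

    None means "leave |date= blank" rather than inserting bad metadata.
    """
    candidate = raw.strip()
    for fmt in ACCEPTED_FORMATS:
        try:
            parsed = datetime.strptime(candidate, fmt)
        except ValueError:
            continue  # wrong layout or impossible date; try the next format
        if min_year <= parsed.year <= max_year:
            return candidate
    return None
```

All three reported values ("-001-11-30T00:00:00+00:00", "0000 uu" and "1") fail every whitelisted format, so the sketch returns None for each of them.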

Cite LSA support

Status
new bug
Reported by
(tJosve05a (c) 22:37, 3 October 2018 (UTC)
What should happen
Implement support for {{cite LSA}}
We can't proceed until
Feedback from maintainers


Why???? What do you expect the bot to do? AManWithNoPlan (talk) 22:48, 3 October 2018 (UTC)

{{wontfix}} template is basically a fancy formatting tool. AManWithNoPlan (talk) 22:55, 3 October 2018 (UTC)
Perhaps expand:
{{cite LSA|url=https://www.ncbi.nlm.nih.gov/pubmed/4043876|year=1985|title=Kondous laventicus, a new ceboid primate from the Miocene of the La Venta, Colombia, South America}}
Missing author name1985. Kondous laventicus, a new ceboid primate from the Miocene of the La Venta, Colombia, South America. .
to {{Cite LSA|last=Setoguchi|first=T.|date=1985|title=Kondous laventicus, a new ceboid primate from the Miocene of the La Venta, Colombia, South America|url=https://www.ncbi.nlm.nih.gov/pubmed/4043876|journal=Folia Primatologica; International Journal of Primatology|volume=44|pages=96–101|year=1985}}
Setoguchi, T.. 1985. Kondous laventicus, a new ceboid primate from the Miocene of the La Venta, Colombia, South America. Folia Primatologica; International Journal of Primatology 44. 96–101. .
Or something at least. (tJosve05a (c) 23:09, 3 October 2018 (UTC)
Rarely used and actually not easy at all to implement. Bug us again when bug queue is empty. AManWithNoPlan (talk) 03:03, 4 October 2018 (UTC)

url-access

Status
{{notabug}}
Reported by
(tJosve05a (c) 22:42, 3 October 2018 (UTC)
What happens
The bot adds http://dare.uva.nl/personal/pure/en/publications/functional-reconstruction-of-structurally-complex-epitopes-using-clips-technology(ce45bb5a-7823-4872-a0b1-e5e5a99a79e5).html |type=Submitted manuscrip
What should happen
The bot should also add |url-access=free
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Epitope_mapping&diff=862365170&oldid=861623490
We can't proceed until
Feedback from maintainers


Or, actually, in this case it should have added |hdl=11245/1.309707 instead of the URL, but in general when adding a free URL, it should add |url-access=free. (tJosve05a (c) 22:42, 3 October 2018 (UTC)

No. |url-access=free is not supported by cs1|2 because values in |url= are presumed to be free-to-read.
{{cite book |title=Title |url=//exampl.com |url-access=free}}
Title. {{cite book}}: Invalid |url-access=free (help)
Trappist the monk (talk) 22:45, 3 October 2018 (UTC)
Hmm...sorry! My bad. I thought that was not true with {{cite journal}}. (tJosve05a (c) 22:57, 3 October 2018 (UTC)

Hdl

Status
new bug
Reported by
(tJosve05a (c) 22:57, 3 October 2018 (UTC)
What happens
The bot adds http://dare.uva.nl/personal/pure/en/publications/functional-reconstruction-of-structurally-complex-epitopes-using-clips-technology(ce45bb5a-7823-4872-a0b1-e5e5a99a79e5).html |type=Submitted manuscrip
What should happen
|hdl=11245/1.309707 instead
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Epitope_mapping&diff=862365170&oldid=861623490
We can't proceed until
Feedback from maintainers


{{wontfix}} the meta data is poor quality AManWithNoPlan (talk) 03:05, 4 October 2018 (UTC)

Recognize NCBI bookshelf links?

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 16:10, 21 August 2018 (UTC)
What happens
If |url=https://www.ncbi.nlm.nih.gov/books/NBK24662/, nothing happens
What should happen
It should be possible to extract some useful information from that url.
We can't proceed until
Feedback from maintainers


The treasure trove of URL readers used by Citoid do actually parse this page: https://github.com/zotero/translators AManWithNoPlan (talk) 18:37, 21 August 2018 (UTC)

The books have a lot of stuff like <span itemprop="datePublished">2001</span> AManWithNoPlan (talk) 18:42, 21 August 2018 (UTC)
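
A rough sketch of how such itemprop markup could be read with nothing but the Python standard library follows. This is only an illustration, not what the bot or the Zotero translators do; `ItempropCollector` and `extract_itemprops` are hypothetical names, and the parser deliberately ignores nested tags inside an annotated element.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Collect the text content of elements carrying an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.values = {}
        self._current = None  # itemprop name of the element being read

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("itemprop")
        if prop:
            self._current = prop
            self.values.setdefault(prop, "")

    def handle_data(self, data):
        if self._current is not None:
            self.values[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None


def extract_itemprops(html: str) -> dict:
    collector = ItempropCollector()
    collector.feed(html)
    collector.close()
    return {name: text.strip() for name, text in collector.values.items()}
```

Fed the snippet quoted above, the function yields a dictionary with the single key datePublished and the value 2001.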

Further researchgate support

Status
new bug
Reported by
(tJosve05a (c) 09:43, 11 September 2018 (UTC)
What should happen
The bot should be able to detect the doi on https://www.researchgate.net/publication/23445361
<div class="publication-meta-secondary">DOI: 10.1136/jnnp.2008.144360 [...]
Relevant diffs/links
My edit https://en.wikipedia.org/w/index.php?title=Dysautonomia&diff=next&oldid=859041106 and the bot's edit just before
We can't proceed until
Feedback from maintainers


Would want to not add the researchgate specific dois automatically, but that one would be good. AManWithNoPlan (talk) 14:03, 22 September 2018 (UTC)
10.13140 is the researchgate prefix. It is not a CrossRef DOI, but a DataCite DOI. AManWithNoPlan (talk) 14:49, 25 September 2018 (UTC)
Does the Zotero API now handle this? Martin (Smith609 – Talk) 12:03, 28 September 2018 (UTC)
Strangely, I think it did and now it doesn't. I do not know why. Maybe I remember wrong, or maybe we added data integrity checks that thought this data was too questionable. For Citoid it does, but for us it does not. This test is commented out in the github tests: https://www.researchgate.net/publication/23445361 AManWithNoPlan (talk) 14:51, 28 September 2018 (UTC)

{{wontfix}} they block us and anything that looks like scraping. AManWithNoPlan (talk) 02:01, 5 October 2018 (UTC)
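
Independently of the scraping problem, the rule discussed in this thread (take a publisher DOI found in page text, but skip ResearchGate's own 10.13140 DOIs) can be sketched as below. The function name `find_usable_doi` and the DOI regular expression are assumptions for illustration only; the pattern is a loose approximation, not a full DOI grammar.

```python
import re
from typing import Optional

# Loose, assumed DOI pattern: "10.", a 4-9 digit registrant, "/", then a suffix.
DOI_PATTERN = re.compile(r"\b(10\.\d{4,9}/[^\s\"<>]+)")

# ResearchGate's own (DataCite) DOI prefix, as stated in the thread above.
RESEARCHGATE_PREFIX = "10.13140"


def find_usable_doi(text: str) -> Optional[str]:
    """Return the first DOI in text that is not a ResearchGate-specific DOI."""
    for match in DOI_PATTERN.finditer(text):
        doi = match.group(1).rstrip(".,;")  # drop trailing sentence punctuation
        prefix = doi.split("/", 1)[0]
        if prefix != RESEARCHGATE_PREFIX:
            return doi
    return None
```

Comparing the whole prefix, rather than using a substring test, avoids accidentally rejecting an unrelated DOI that merely contains the digits 13140 somewhere in its suffix.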

Bare references in []s

Status
{{notabug}}
Reported by
Martin (Smith609 – Talk) 06:50, 29 September 2018 (UTC)
What happens
We missed an opportunity to expand some bare URLs; it looks like it should be easy to capture these.
What should happen
https://en.wikipedia.org/w/index.php?title=Sandby_borg&diff=next&oldid=861478626
We can't proceed until
Feedback from maintainers


Those were not bare urls though. Headbomb {t · c · p · b} 13:43, 4 October 2018 (UTC)

Would converting [http....html] to http....html be considered a significant enough improvement to be justified -- even if not expanded into anything? I hate those references that are just another number in square braces. AManWithNoPlan (talk) 14:37, 4 October 2018 (UTC)
Well [http://...html] is a bare url. But in that diff, you have [http://...html FOOBAR], which isn't bare. Headbomb {t · c · p · b} 16:52, 4 October 2018 (UTC)
I realize that, but what are your thoughts on references that are bare and have no title and yet have square brackets around them? AManWithNoPlan (talk) 02:49, 5 October 2018 (UTC)
Aren't they currently expanded? I thought they were? Headbomb {t · c · p · b} 03:01, 5 October 2018 (UTC)
Why yes they are..... AManWithNoPlan (talk) 03:05, 5 October 2018 (UTC)
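
To make the distinction drawn above concrete, here is a small sketch that separates a bracketed link with no label (bare) from one with a label (not bare). The name `classify_bracketed_link` and the regular expression are hypothetical and only cover simple single-link references.

```python
import re
from typing import Optional, Tuple

# [URL] or [URL label]; assumed simple form with exactly one external link.
BRACKETED_LINK = re.compile(r"^\[\s*(https?://[^\s\]]+)(?:\s+([^\]]*))?\s*\]$")


def classify_bracketed_link(ref_text: str) -> Optional[Tuple[str, str]]:
    """Return ("bare", url) or ("titled", url); None if not a bracketed link."""
    match = BRACKETED_LINK.match(ref_text.strip())
    if match is None:
        return None
    url = match.group(1)
    label = (match.group(2) or "").strip()
    return ("titled", url) if label else ("bare", url)
```

Only the "bare" case would be a candidate for expansion; a labelled link already carries a human-chosen title and is left alone in this sketch.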

Adding strange template

Status
{{fixed}}
Reported by
Keith D (talk) 11:21, 29 September 2018 (UTC)
What happens
Adds template {{[[Template:metaTags.other['article:published_time']|metaTags.other['article:published_time']]]}} to article
What should happen
Add correct detail and not template
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Draft:ROL_Cruise&diff=861666818&oldid=861017727
We can't proceed until
Feedback from maintainers


This is a GIGO problem, since the info exists in the headers of the web page(s) in question, but it would be great if the tool could ignore this junk instead of inserting it. See the archives of User talk:Zhaofeng Li/reFill, another tool that editors have been using to semi-automatically insert this junk for years. Gnomes remove it manually if tool-using editors fail to see it in Preview. – Jonesey95 (talk) 15:02, 29 September 2018 (UTC)

https://github.com/ms609/citation-bot/pull/873 AManWithNoPlan (talk) 03:06, 4 October 2018 (UTC)

ILR Review

Status
{{fixed}}
Reported by
(tJosve05a (c) 20:26, 29 September 2018 (UTC)
What happens
|journal=Ilr Review
What should happen
|journal=ILR Review
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Minimum_wage&diff=prev&oldid=861757734
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/874 AManWithNoPlan (talk) 03:07, 4 October 2018 (UTC)

Bot chokes up on Signet ring cell carcinoma

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 22:49, 3 October 2018 (UTC)
Relevant diffs/links
[25]
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/877 AManWithNoPlan (talk) 03:04, 4 October 2018 (UTC)

Adding book title as journal

Status
{{fixed}}
Reported by
Martin (Smith609 – Talk) 07:00, 29 September 2018 (UTC)
What happens
Besides which, it's a book, not a journal

Rasmussen, D. T. (2002). "The origin of Primates". In Hartwig, W. C. (ed.). The Primate Fossil Record. Cambridge: Cambridge University Press. pp. 5–9. Bibcode:2002prfr.book.....H. {{cite book}}: |journal= ignored (help)

Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=List_of_fossil_primates&diff=prev&oldid=861471487
We can't proceed until
Feedback from maintainers


Newer code avoids those book bibcodes AManWithNoPlan (talk) 03:08, 5 October 2018 (UTC)

API: New feature, random edit

I'm currently using https://tools.wmflabs.org/citations/category.php?cat=1980_births&slow=1 which makes one edit and then stops (this seems to be a bug from above discussions). I'd like to make the bot run on random pages and stop once it has made an edit; however, I don't want to specify a category. I'd love to be able to use a link such as https://tools.wmflabs.org/citations/random.php and just have the bot find a page where it will make an edit. (tJosve05a (c) 20:59, 19 August 2018 (UTC)

That would be cool. The bot would grab a random page and then make the edit. I tried https://tools.wmflabs.org/citations/doibot.php?edit=toolbar&slow=1&page=Special:Random and it did not just work, so new code would be needed. AManWithNoPlan (talk) 21:19, 19 August 2018 (UTC)
API:Random. --Izno (talk) 21:50, 19 August 2018 (UTC)
in the mean time just open up new pages by clicking on the Random page link and then clicking the citation link off to the left side. AManWithNoPlan (talk) 22:19, 19 August 2018 (UTC)
Nah, in the meantime I'll just use the (broken) category API and let the bot run on a category with lots of articles, since it will continue to run on articles until it finds an article where an edit will be made, as with running it on individual pages using Special:Random, there is a great chance of no edits being made. (tJosve05a (c) 22:29, 19 August 2018 (UTC)

The bot historically logged each page that it visited to a database, and could be run on the page that had been longest without a visit. The database didn't make the migration to ToolForge, but some of the code still exists. Something like what you suggest would be a good step towards the bot running unsupervised again (which had to be discontinued because I didn't have AManWithNoPlan to keep up with bug reports!) Martin (Smith609 – Talk) 13:56, 21 August 2018 (UTC)

@Josve05a: you still want this? I mean there's so much to work on that a random article seems wasted. Headbomb {t · c · p · b} 22:21, 26 August 2018 (UTC)
It would be a nice feature, but it isn't as if I'm demanding it. Just thought I should ask to see... (tJosve05a (c) 07:13, 27 August 2018 (UTC)

{{wontfix}} too many other things to do AManWithNoPlan (talk) 14:49, 8 October 2018 (UTC)

remove even more google books

Status
new bug
Reported by
(tJosve05a (c) 23:06, 29 September 2018 (UTC)
What happens
The bot converted {{cite web}} to {{cite book}} for a Google Book URL.
What should happen
It should remove |website=Google Books and |website=Books.google.es
Relevant diffs/links
We can't proceed until
Feedback from maintainers


Regression of User talk:Citation bot/Archive 10#remove website=Google for books (tJosve05a (c) 23:06, 29 September 2018 (UTC)

NOT a regression. Just more ways that google books likes to describe itself. https://github.com/ms609/citation-bot/pull/896 AManWithNoPlan (talk) 02:43, 7 October 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 14:32, 8 October 2018 (UTC)
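
For illustration, the "more ways that google books likes to describe itself" problem can be handled with one pattern rather than a list of exact strings. This is a sketch under assumptions, not the code in the pull request: `GOOGLE_BOOKS_WEBSITE` and `drop_google_books_website` are hypothetical names, and the pattern only covers the two forms reported above plus other country domains.

```python
import re

# Matches "Google Books" and host-style values such as "Books.google.es".
GOOGLE_BOOKS_WEBSITE = re.compile(
    r"^(?:google\s*books|books\.google\.[a-z.]+)$", re.IGNORECASE
)


def drop_google_books_website(params: dict) -> dict:
    """Return a copy of the citation parameters without a Google Books |website=."""
    cleaned = dict(params)
    website = cleaned.get("website", "").strip()
    if GOOGLE_BOOKS_WEBSITE.match(website):
        del cleaned["website"]
    return cleaned
```

The function copies the parameters first, so the caller's original dictionary is left untouched.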

cite arxiv --> cite journal misfires

Status
new bug
Reported by
Headbomb {t · c · p · b} 18:46, 30 September 2018 (UTC)
What happens
Adds "journal = Arxiv Mathematics E-Print"; adds |class= to cite journal
What should happen
Never upgrade a cite arxiv to a cite journal which has the string 'eprint' or 'arxiv'; never add |class= to cite journal, remove |class= from cite journal.
Relevant diffs/links
[26]
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/902 AManWithNoPlan (talk) 15:05, 8 October 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 17:02, 8 October 2018 (UTC)
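
The two rules requested in this report can be written down as a short sketch. It is not the merged fix: the function names are hypothetical, and treating the hyphenated spelling "e-print" as equivalent to 'eprint' is an added assumption so that the reported "Arxiv Mathematics E-Print" value is caught.

```python
# Markers that suggest a "journal" value is really an arXiv listing.
# "e-print" is an assumed extra spelling beyond the two strings in the report.
PREPRINT_MARKERS = ("eprint", "e-print", "arxiv")


def may_upgrade_to_cite_journal(journal_name: str) -> bool:
    """False when the proposed journal name looks like an arXiv e-print label."""
    lowered = journal_name.lower()
    return not any(marker in lowered for marker in PREPRINT_MARKERS)


def strip_class_from_cite_journal(template_name: str, params: dict) -> dict:
    """Return params without |class= when the template is cite journal."""
    cleaned = dict(params)
    if template_name.strip().lower() == "cite journal":
        cleaned.pop("class", None)
    return cleaned
```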

Springer support

Status
new bug
Reported by
(tJosve05a (c) 13:30, 2 October 2018 (UTC)
What happens

  • Adds |doi=10.1007/978-3-642-75924-6_15#page-1.
  • Adds |publisher=Springer, Berlin, Heidelberg
What should happen

  • Add |doi=10.1007/978-3-642-75924-6_15
  • Add |publisher=Springer or SpringerLink
    • Possibly add |location=Berlin, Heidelberg
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=User:Josve05a/cite-sandbox&diff=prev&oldid=862145920
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/886 AManWithNoPlan (talk) 04:36, 5 October 2018 (UTC)

https://github.com/ms609/citation-bot/pull/885 AManWithNoPlan (talk) 04:36, 5 October 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 14:31, 8 October 2018 (UTC)
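
A minimal sketch of the two clean-ups requested in this report follows. The helper names are hypothetical and the logic is deliberately naive: it assumes that a "#" never belongs to the DOI itself, and that everything after the first comma in the publisher string is a location, which would be wrong for publishers whose names contain a comma.

```python
from typing import Optional, Tuple


def strip_doi_fragment(doi: str) -> str:
    """Drop a URL-style fragment such as "#page-1" from a DOI string."""
    return doi.split("#", 1)[0]


def split_publisher_location(publisher: str) -> Tuple[str, Optional[str]]:
    """Split "Name, City, City" into (name, location); location may be None."""
    name, _, rest = publisher.partition(",")
    location = rest.strip()
    return name.strip(), (location or None)
```

Applied to the values from the report, the first helper gives 10.1007/978-3-642-75924-6_15 and the second gives Springer with the location Berlin, Heidelberg.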

More date errors

Status
new bug
Reported by
Keith D (talk) 22:40, 4 October 2018 (UTC)
What happens
Bot adds |date=Invalid date
What should happen
Leave field blank
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Ron_Bakir&action=history
We can't proceed until
Feedback from maintainers


{{fixed}} AManWithNoPlan (talk) 14:30, 8 October 2018 (UTC)

Stop converting ... to …

Status
new bug
Reported by
Headbomb {t · c · p · b} 06:13, 5 October 2018 (UTC)
Relevant diffs/links
[27]
We can't proceed until
Feedback from maintainers


This is very annoying. Headbomb {t · c · p · b} 06:13, 5 October 2018 (UTC)

I did not realize the bot did that. Just curious why annoying. AManWithNoPlan (talk) 12:42, 5 October 2018 (UTC)
MOS:ELLIPSIS. --Izno (talk) 13:52, 5 October 2018 (UTC)
I noticed it started doing this around this time. Or maybe it was this time. I can't say if it's related or not though. Headbomb {t · c · p · b} 13:57, 5 October 2018 (UTC)
It was actually requested recently. https://github.com/ms609/citation-bot/pull/889 AManWithNoPlan (talk) 03:56, 6 October 2018 (UTC)

--AManWithNoPlan (talk) 02:08, 7 October 2018 (UTC)

{{fixed}} AManWithNoPlan (talk) 14:30, 8 October 2018 (UTC)

Follow redirects

Status
new bug
Reported by
(tJosve05a (c) 21:34, 7 October 2018 (UTC)
What should happen
Both URLs in this diff link to the same page; one of them merely redirects. So both refs should have the same content.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=User:Josve05a/cite-sandbox&curid=44054197&diff=862967313&oldid=862967303
We can't proceed until
Feedback from maintainers


That website uses invalid SSL certs, so the redirects get stopped by the HTTPS libraries. I really do not want to turn that off. AManWithNoPlan (talk) 04:21, 8 October 2018 (UTC)

{{wontfix}} sadly. AManWithNoPlan (talk) 14:27, 8 October 2018 (UTC)

more non-standard jstor URLS

Status
new bug
Reported by
(tJosve05a (c) 22:19, 2 September 2018 (UTC)
What happens
The bot does not recognize https://www.jstor.org/stable/j.ctt6wp6td.10?seq=9#metadata_info_tab_contents as an acceptable JSTOR URL/ID.
What should happen
The bot should convert the raw URL ref to {{cite journal}} with |jstor=j.ctt6wp6td.10
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Nieuwmarkt_Riots&diff=prev&oldid=857769249
We can't proceed until
Feedback from maintainers


Actually all the j.something are books/book chapters. Headbomb {t · c · p · b} 23:28, 2 September 2018 (UTC)
AManWithNoPlan (talk) 23:36, 9 September 2018 (UTC)
[
  {
    "itemType": "bookSection",
    "notes": [],
    "tags": [],
    "title": "Underground Visions:: Strategies of Resistance along the Amsterdam Metro Lines",
    "abstractNote": "The association between Amsterdam and the underground is rather ambiguous to say the least. On the one hand, the Netherlands, and Amsterdam in particular, are proud to present themselves as hospitable vis-à-vis alternative ‘underground’ cultures – a legacy from the 1960s and 1970s when feminist, gay, hippy, student, and squat movements were dominating the social and cultural scenes. The global tourist reputation of Amsterdam as the capital of sex, drugs, and rock ’n’ roll has largely been built on the legacy of a 1960s underground culture that was leftist and avant-garde.  At the same time, however, the other notion of",
    "publisher": "Amsterdam University Press",
    "ISBN": [
      "9789089645050"
    ],
    "pages": "77–96",
    "bookTitle": "Paris-Amsterdam Underground",
    "series": "Essays on Cultural Resistance, Subversion, and Diversion",
    "url": "http://www.jstor.org/stable/j.ctt6wp6td.10",
    "date": "2013",
    "libraryCatalog": "JSTOR",
    "accessDate": "2018-09-09",
    "shortTitle": "Underground Visions",
    "author": [
      [
        "Ginette",
        "Verstraete"
      ]
    ],
    "seriesEditor": [
      [
        "Christoph",
        "Lindner"
      ],
      [
        "Andrew",
        "Hussey"
      ]
    ],
    "source": [
      "Zotero"
    ]
  }
]

{{fixed}} AManWithNoPlan (talk) 15:15, 9 October 2018 (UTC)
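
As a rough illustration of this report, the sketch below pulls the stable ID out of a JSTOR URL while ignoring the query string and fragment, and applies the observation made above that IDs beginning with "j." are books or book chapters. The function names are hypothetical, only the plain /stable/<id> path form is handled, and the "j." test is a heuristic taken from this discussion rather than a documented JSTOR rule.

```python
import re
from typing import Optional
from urllib.parse import urlsplit


def jstor_id_from_url(url: str) -> Optional[str]:
    """Return the stable ID from a jstor.org/stable/<id> URL, else None."""
    parts = urlsplit(url.strip())
    if parts.netloc.lower() not in ("jstor.org", "www.jstor.org"):
        return None
    # urlsplit has already separated "?seq=9" and "#metadata..." from the path.
    match = re.fullmatch(r"/stable/([^/]+)", parts.path)
    return match.group(1) if match else None


def template_for_jstor_id(jstor_id: str) -> str:
    """Heuristic from the thread: "j." IDs are books or book chapters."""
    return "cite book" if jstor_id.startswith("j.") else "cite journal"
```

For the URL in this report the sketch returns j.ctt6wp6td.10 and suggests cite book, which agrees with the bookSection item type in the Zotero output quoted above.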

hardcoded hdl url

Status
new bug
Reported by
Headbomb {t · c · p · b} 20:32, 30 September 2018 (UTC)
What happens
Adds hardcoded hdl urls
What should happen
use |hdl=
Relevant diffs/links
[28]
We can't proceed until
Feedback from maintainers


We will probably add regexes to catch the more common ones. We do the same with pubmed. AManWithNoPlan (talk) 22:11, 30 September 2018 (UTC)

wrote regex code. should make it easy to add more over time. https://github.com/ms609/citation-bot/pull/903 AManWithNoPlan (talk) 23:49, 8 October 2018 (UTC)
{{fixed}} AManWithNoPlan (talk) 15:16, 9 October 2018 (UTC)
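
For anyone curious what an extensible list of regexes for this might look like, here is a sketch. It is not the code from the pull request: `HDL_URL_PATTERNS` and `hdl_from_url` are hypothetical names, and the single pattern shown covers only the generic hdl.handle.net resolver, with repository-specific patterns left to be appended.

```python
import re
from typing import Optional

# One compiled pattern per known URL shape; more can be appended over time.
# Only the generic handle resolver is covered in this sketch.
HDL_URL_PATTERNS = [
    re.compile(r"^https?://hdl\.handle\.net/(?P<hdl>\d[\d.]*/\S+)$", re.IGNORECASE),
]


def hdl_from_url(url: str) -> Optional[str]:
    """Return the handle for use in |hdl= if the URL matches a known pattern."""
    candidate = url.strip()
    for pattern in HDL_URL_PATTERNS:
        match = pattern.match(candidate)
        if match:
            return match.group("hdl")
    return None
```

Keeping the patterns in a list with a shared named group means that supporting a new repository is a one-line addition rather than a new code path.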

Missing spaces

Status
new bug
Reported by
(tJosve05a (c) 22:02, 7 October 2018 (UTC)
What happens
The bot parses <div class="article-title">Exile Drama: The Translation of Ernst Toller's <i>Pastor Hall</i> (1939)</div> as Exile Drama: The Translation of Ernst Toller's Pastor Hall(1939), stripping the space between </i> and (1939)
What should happen
There should be a space before the parentheses
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Pastor_Hall&diff=862969493&oldid=840491692 and https://en.wikipedia.org/w/index.php?title=Scalesia&diff=862969997&oldid=831795778
We can't proceed until
Feedback from maintainers


Problem with euppublishing.com? (tJosve05a (c) 22:02, 7 October 2018 (UTC)

Complain to the publisher about giving bad meta data to crossref (and you will be ignored most likely). We do not get data from the web information you quote AManWithNoPlan (talk) 04:15, 8 October 2018 (UTC)
{{wontfix}} sadly. AManWithNoPlan (talk) 14:27, 8 October 2018 (UTC)
Can't we make a manual "fix" for this publisher? If the DOI starts with 10.3366/, ensure there is a space before any parenthesis. If not, consider adding one. Or something like that? Or, if there is no space, make the bot try to scrape the landing page and see if the HTML there has a space? (tJosve05a (c) 17:43, 8 October 2018 (UTC)
No way, no how are we going to make a website correct the crossref data. Also, no space before a ( is correct in many contexts. AManWithNoPlan (talk) 20:19, 8 October 2018 (UTC)