User talk:Citation bot/Archive 38


Ultraviolet

The bot repeatedly added a full stop to a DOI parameter in a citation. Edit Achmad Rachmani (talk) 03:15, 27 December 2023 (UTC)

{{notabug}}, that's pubmed-related GIGO. Headbomb {t · c · p · b} 03:34, 27 December 2023 (UTC)

Bot drew the first date it saw on a webpage and added it as the date of publication in a citation

Status
{{fixed}} by adding officialcharts.com to websites without valid dates.
Reported by
Tkbrett (✉) 15:49, 28 December 2023 (UTC)
What happens
The bot drew the first date it saw on a webpage and decided that that was the date of publication. In reality, this website has no date of publication, but instead collects UK singles chart information for a particular band. The date it added (April 20, 1966) is actually the UK chart debut of the single "Daydream" by the Lovin' Spoonful.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=The_Lovin%27_Spoonful_discography&diff=1192282157&oldid=1191615017
We can't proceed until
Feedback from maintainers


Changes nbsp to x000B0

Status
{{fixed}} the HTML decoding code to deal with double-encoded data
Reported by
Headbomb {t · c · p · b} 05:12, 30 December 2023 (UTC)
What happens
[1]
What should happen
nbsp is better
We can't proceed until
Feedback from maintainers


Actually, it just removes the nbsp. AManWithNoPlan (talk) 20:10, 30 December 2023 (UTC)
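For readers following the status note above, double-encoded data means an entity such as &nbsp; arrives as &amp;nbsp;, so a single decoding pass leaves a literal entity behind. The sketch below only illustrates that idea and is not the bot's actual code; the function name `decode_entities` and the `max_passes` limit are assumptions made for the example.

```python
import html


def decode_entities(text: str, max_passes: int = 3) -> str:
    """Unescape HTML entities repeatedly until the text stops changing.

    max_passes is an assumed safety limit, not something from the bot.
    """
    for _ in range(max_passes):
        decoded = html.unescape(text)
        if decoded == text:
            # Nothing left to decode, so stop early.
            break
        text = decoded
    return text
```

With this approach a double-encoded non-breaking space ends up as the real U+00A0 character rather than as leftover entity text.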

Speaking of that diff, I notice Citation bot is modifying a {{cite book}} citation, which contains a |journal= parameter, where one of |title= and |journal= is a strict substring of the other, which both also match /[Pp]roceedings/ or /[Mm]eeting/ or /[Ss]ymposi/. I would think with these characteristics, Citation bot could confidently alter the template type to {{cite conference}}, removing the article from Category:CS1 errors: periodical ignored (25,655) (assuming it is the only erroneous citation of this type present in the article). Folly Mox (talk) 21:36, 30 December 2023 (UTC)
Cite conference is a garbage template that should not be used whenever possible. Headbomb {t · c · p · b} 21:54, 30 December 2023 (UTC)
I disagree, and find that:
  1. Publishing platforms often omit standard citation information (like "paper presented at Some Expensive Conference") in their treatment of conference proceedings
  2. Publishing platforms almost always omit standard bibliographic information (like editorial contributions) in their treatment of conference proceedings
  3. There's no other way to cite something that both has chapters and is a journal issue
  4. Altering {{cite journal}} to {{cite book}} where an ISBN is present has contributed to tens of thousands of template errors
I'm not sure why you'd characterise this particular citation template as garbage (perhaps a link could be dropped if you're not feeling like explaining), but in my efforts to contract the maintenance category Category:CS1 errors: periodical ignored (25,655), I've had cause to use {{cite conference}} in dozens of cases, and there are likely thousands more that could be identified without too much difficulty. Folly Mox (talk) 22:24, 30 December 2023 (UTC)
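The heuristic Folly Mox describes earlier in this thread can be written down as a small check. This is a minimal sketch under the assumptions stated in that comment (a {{cite book}} with both |title= and |journal=, one a strict substring of the other, both matching the quoted patterns); the function name and the way the template is passed in are hypothetical, and nothing here reflects how Citation bot is actually implemented.

```python
import re

# The three patterns quoted in the thread, combined into one expression.
CONFERENCE_HINT = re.compile(r"[Pp]roceedings|[Mm]eeting|[Ss]ymposi")


def looks_like_conference(template: str, title: str, journal: str) -> bool:
    """Return True when a cite book citation matches the proposed heuristic."""
    if template != "cite book" or not title or not journal:
        return False
    # "Strict substring": one value contains the other, but they are not equal.
    strict_substring = title != journal and (title in journal or journal in title)
    both_match = bool(CONFERENCE_HINT.search(title)) and bool(
        CONFERENCE_HINT.search(journal)
    )
    return strict_substring and both_match
```

Whether a match should actually trigger a switch to {{cite conference}} is the point under dispute above; the sketch only identifies candidates.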

add a time component to some free DOIs

DOI prefix 10.1155's registrant is Hindawi, an open access publisher. However, Hindawi became open access in 2007, and some (rare) DOIs from prior to 2007 are not free, e.g.

  • Booker, A. R.; Strombergsson, A.; Venkatesh, A. (2006). "Effective computation of Maass cusp forms". International Mathematics Research Notices. 2006: 71281. doi:10.1155/IMRN/2006/71281.{{cite journal}}: CS1 maint: unflagged free DOI (link)

The bot should only mark 10.1155 DOIs from year 2007 and up as free, and not all of them. Headbomb {t · c · p · b} 01:23, 31 December 2023 (UTC)

This DOI belongs to OUP, not Hindawi. Nemo 14:10, 31 December 2023 (UTC)
No, the DOI is Hindawi's. OUP purchased IMRN in 2007. Headbomb {t · c · p · b} 16:55, 31 December 2023 (UTC)
No, the DOI belongs to OUP. You can read more about DOI transfers in https://www.crossref.org/documentation/register-maintain-records/creating-and-managing-dois/transferring-responsibility-for-dois/ . Nemo 16:42, 1 January 2024 (UTC)
On a related tangent, a substantial fraction of "dead" DOIs on Wikipedia are DOIs that are owned by a different company than the current journal owner. Medknow is another big cause. AManWithNoPlan (talk) 18:21, 1 January 2024 (UTC)
Hindawi is the registrant for 10.1155 DOIs. Who currently legally owns that particular article is irrelevant. Headbomb {t · c · p · b} 23:20, 1 January 2024 (UTC)
{{fixed}} Date component added to the 1155 doi code. AManWithNoPlan (talk) 14:19, 2 January 2024 (UTC)
Headbomb, no. Hindawi controls the 10.1155 prefix, not the 10.1155 DOIs. Nemo 18:46, 2 January 2024 (UTC)
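The requested rule (treat 10.1155 DOIs as free only from 2007 onward) could look roughly like the following. This is a sketch, not the code that was added to the bot: the function name is hypothetical, and the choice to return False when no year is known is an assumption made here to stay on the conservative side.

```python
from typing import Optional

HINDAWI_PREFIX = "10.1155/"
HINDAWI_OA_YEAR = 2007  # year given in the request above


def hindawi_doi_is_free(doi: str, year: Optional[int]) -> bool:
    """Flag a 10.1155 DOI as free only when the citation year is 2007 or later."""
    if not doi.lower().startswith(HINDAWI_PREFIX):
        return False
    if year is None:
        # Assumption: with no date in the citation, do not flag the DOI.
        return False
    return year >= HINDAWI_OA_YEAR
```

For the IMRN example quoted in the request, `hindawi_doi_is_free("10.1155/IMRN/2006/71281", 2006)` returns False.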

Don't remove functioning pdfs.semanticscholar.org links

Status
{{fixed}} URL detection code
Reported by
Nemo 14:08, 31 December 2023 (UTC)
What happens
Citation bot removes functioning links to pdfs.semanticscholar.org from the URL parameter.
What should happen
Only remove PDF links if they're actually duplicative of the identifier, i.e. if they were redirected to the landing page. While most pdfs.semanticscholar.org URLs got broken at some point, some are still working. The presence of the identifier gives no information about the availability of the full text, so it does not convey the same amount of information. (At least until an s2cid-access=free parameter is introduced, but I'm not aware of such a proposal.)
Relevant diffs/links
special:diff/1192041925 (though in this case there's also a doi-access=free link so the title remains linked)
We can't proceed until
Feedback from maintainers


Nemo 14:08, 31 December 2023 (UTC)

ISBN has no hyphens

Status
{{fixed}}
Reported by
Grimes2 (talk) 19:53, 31 December 2023 (UTC)
What happens
isbn has no hyphens
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Linda_Colley&diff=prev&oldid=1192864576
We can't proceed until
Feedback from maintainers


That's the only ISBN in the article. How would Citation bot know the intended style for the article was a hyphenated ISBN? This doesn't seem like a bug to me. Folly Mox (talk) 21:19, 31 December 2023 (UTC)

Determining the correct hyphens is on the to do list, but that requires a massive database. AManWithNoPlan (talk) 01:42, 1 January 2024 (UTC)
How does https://anticompositetools.toolforge.org/hyphenator/ do it? Grimes2 (talk) 04:48, 1 January 2024 (UTC)
Another alternative is to replace the unhyphenated ISBN with {{Format ISBN|<10- or 13-digit ISBN>}}. That template is set to auto-subst so AnomieBOT will take care of the substing. Because Citation bot is a bot, it may be necessary to add {{Format ISBN}} to User:AnomieBOT/TemplateSubster force.
That not being acceptable, Module:Format ISBN/data holds a list of ISBN ranges and the number of digits in each of the three center digit groups of a 13-digit ISBN. Perhaps the bot can read that module.
Trappist the monk (talk) 15:18, 1 January 2024 (UTC)
Thanks for the link. I will look into that. AManWithNoPlan (talk) 16:21, 1 January 2024 (UTC)
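To illustrate the range-table approach Trappist the monk mentions, here is a minimal sketch of hyphenating a 13-digit ISBN from a lookup of registrant ranges. The table below is a deliberately tiny, simplified stand-in and is not the data from Module:Format ISBN/data; the table layout and function name are assumptions for the example.

```python
from typing import Optional

# Toy range table for illustration only; the real data is far larger.
# Maps (prefix, registration group) to (low, high, registrant length) entries.
TOY_RANGES = {
    ("978", "0"): [("00", "19", 2), ("200", "699", 3)],
}


def hyphenate_isbn13(isbn: str) -> Optional[str]:
    """Hyphenate a 13-digit ISBN, or return None if no range is known."""
    digits = isbn.replace("-", "")
    if len(digits) != 13 or not digits.isdigit():
        return None
    for (prefix, group), ranges in TOY_RANGES.items():
        head = prefix + group
        if not digits.startswith(head):
            continue
        rest = digits[len(head):-1]  # registrant + publication digits
        for low, high, length in ranges:
            registrant = rest[:length]
            # Equal-length digit strings compare correctly as text.
            if low <= registrant <= high:
                publication = rest[length:]
                return "-".join([prefix, group, registrant, publication, digits[-1]])
    return None
```

With this toy table, `hyphenate_isbn13("9780306406157")` returns `"978-0-306-40615-7"`, and any ISBN outside the listed ranges is left alone (None), which is the safe behaviour when the data is incomplete.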

10.14256 is open access

10.14256 DOIs should be marked free (http://www.casopis-gradjevinar.hr/about-the-journal/open-access-statement/). Headbomb {t · c · p · b} 23:30, 1 January 2024 (UTC)

{{fixed}} AManWithNoPlan (talk) 14:10, 2 January 2024 (UTC)

Why can't the bot figure out the doi here?

Status
{{fixed}} - sometimes it works, sometimes it doesn't. URL expansion is the least consistent feature.
Reported by
Headbomb {t · c · p · b} 21:07, 2 January 2024 (UTC)
What should happen
[2]
We can't proceed until
Feedback from maintainers


Normally the bot is able to figure out the DOI from the url, but here I had to give the DOI before the bot processed that citation. Headbomb {t · c · p · b} 21:07, 2 January 2024 (UTC)

DPG Media Privacy Gate

Status
{{fixed}}
Reported by
Spinixster (chat!) 03:33, 3 January 2024 (UTC)
What happens
Special:Diff/1193304102
What should happen
Do not use DPG Media Privacy Gate as title, like Special:Diff/1193304718
We can't proceed until
Feedback from maintainers


This one may be hard or maybe even undoable. If it can't be fixed, an option would be to not convert those links? Spinixster (chat!) 03:33, 3 January 2024 (UTC)

10.7759 are free dois (Cureus)

should be flagged as such. Headbomb {t · c · p · b} 01:39, 31 December 2023 (UTC)

I've tried a few and found them closed: phabricator:F41641477. How did you check their status? Nemo 14:26, 31 December 2023 (UTC)
@Nemo bis: Both of those worked for me. DuncanHill (talk) 14:32, 31 December 2023 (UTC)
Thanks for checking. Too bad this publisher is so confusing, there might be some geographical restrictions involved. It's probably safer to only link PubMedCentral when available. Nemo 14:35, 31 December 2023 (UTC)
Indeed, both links are free. They want your email for the PDF download, but the HTML version is displayed by default. There's nothing weird about how Cureus works. Headbomb {t · c · p · b} 16:47, 31 December 2023 (UTC)
The result is the same. Their website doesn't offer anything useful, so there's no point sending people there with false hopes of finding what an open access article would usually provide. Nemo 16:41, 1 January 2024 (UTC)
Their website offers the full article, freely. That is useful. Headbomb {t · c · p · b} 23:17, 1 January 2024 (UTC)
Not for my browser. Nemo 18:44, 2 January 2024 (UTC)
Can you provide a screenshot or something? Your personal browser, based on which you make decisions about what automated edits should be made across wikipedia, seems consistently significantly different than everyone else's browser. –jacobolus (t) 03:36, 5 January 2024 (UTC)
Maybe you can also give more information about your browser settings / location / ...?
Every browser on my machines across 2 operating systems, including when I try to access these via a different IP address, shows the full text of the articles. –jacobolus (t) 03:43, 5 January 2024 (UTC)
Indeed, to get that screen, one must actively request the PDF version of the article. Headbomb {t · c · p · b} 06:08, 5 January 2024 (UTC)
True, if I don't click anything I instead get a blank screen with a banner: phabricator:F41651606. (Granted, that's partly because my browser is configured to reject, by default, the various surveillance systems installed on this website.) Nemo 15:27, 5 January 2024 (UTC)
I have added to the free list, but it seems to require one to accept at least one cookie 🍪. AManWithNoPlan (talk) 20:41, 6 January 2024 (UTC)

Enabled 1-click activation of Category:CS1 errors: extra text: pages and similar

These four cats haven't been implemented apparently

Third time's the charm? Headbomb {t · c · p · b} 03:07, 9 January 2024 (UTC)

{{fixed}} Ironically, it was "extra text" that caused the problem. There was an invisible unicode character in the source code from copying and pasting your lists. AManWithNoPlan (talk) 19:23, 11 January 2024 (UTC)
Stupid invisible characters... Headbomb {t · c · p · b} 21:06, 11 January 2024 (UTC)
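For anyone who runs into the same copy-and-paste problem, a quick way to find invisible characters in a string is to look for Unicode "format" characters (general category Cf, which includes the zero-width space and the directional marks). This is a small sketch with a hypothetical function name; it does not cover every kind of invisible character, only that category.

```python
import unicodedata
from typing import List, Tuple


def find_invisible_chars(text: str) -> List[Tuple[int, str]]:
    """Return (index, code point) pairs for Unicode format (Cf) characters."""
    hits = []
    for index, char in enumerate(text):
        if unicodedata.category(char) == "Cf":
            hits.append((index, f"U+{ord(char):04X}"))
    return hits
```

Running it over a pasted category list before committing it to source would point at the exact position of the stray character.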

Weird author generation

Status
{{fixed}}
Reported by
Susmuffin Talk 05:21, 13 January 2024 (UTC)
What happens
An author parameter was added that contained "(:None)".
What should happen
Not that
Relevant diffs/links
[3]
We can't proceed until
Feedback from maintainers


Cite web changed to cite magazine

Any online source should use cite web. On the jazz project Cleanup Listing, I have fixed many errors due to people using cite book and cite magazine instead of cite web. On the Steve Oliver page here, Citationbot changed the Billboard reference from cite web to cite magazine. Why? Nearly always, the citation is from an online source (an online version of Billboard), not the physical copy of the magazine. I'm not a fan of Citationbot's changes.—Vmavanti (talk) 03:41, 9 January 2024 (UTC)

Because Billboard is a magazine, whether it's online or in print is irrelevant. Headbomb {t · c · p · b} 05:05, 9 January 2024 (UTC)
Agree. Online sources that are books should use cite book. Online sources that are magazine articles should use cite magazine. Online sources that are journal articles should use cite journal. It is simply false that "Any online source should use cite web.". These are not errors and should not be "fixed". —David Eppstein (talk) 06:32, 9 January 2024 (UTC)
I disagree. A web source should use cite web. Take a look sometime at the kinds of errors found on the Jazz Cleanup Listing. I didn't change them BECAUSE they used cite news. I changed them because the cite news usages were creating error messages as found in the Cleanup Listing. Vmavanti (talk) 15:33, 9 January 2024 (UTC)
I think it's useful to distinguish when a reference is for a print magazine, though, since the parameters will be different (presence of page numbers, date of publication, quite often the same article has different titles in print vs online). WP:SAYWHEREYOUGOTIT, if you got it from the website that’s different from a print magazine — especially for older articles which might have digitization errors from OCR or if an online version has (sometimes silently) made emendations.
Billboard is also an online database, and references to that as a magazine I think also confuses things.
This is different of course from digital facsimiles of magazines and books which are identical in all respects to the paper versions including pagination. Umimmak (talk) 06:43, 9 January 2024 (UTC)
(To clarify, if it's an actual news article I think {{cite news}} is still better than {{cite web}} — I don't think all online content should be referenced via {{cite web}}.) Umimmak (talk) 06:53, 9 January 2024 (UTC)
I'm basing my judgment on 1) common sense (a web source uses cite web); 2) it's an easier template for contributors to use, based on the number of errors I have seen over eight years of editing when it comes to using cite news or cite magazine. Ask a member of the public. I have spoken to many of them over the years. I have seen their successes and their mistakes. Plenty. Vmavanti (talk) 15:33, 9 January 2024 (UTC)
Common sense: an online magazine uses cite magazine, exactly like how online journals use cite journal. Headbomb {t · c · p · b} 17:30, 10 January 2024 (UTC)
Disagree. Vmavanti has pointed out the flaw in that reasoning. Templates like cite magazine were never designed to be used for simple web pages. Hence it having parameters like page number. Only if it is a digital copy (e.g. scan) should such a template be used for an online source. Otherwise web pages should use cite web. Tvx1 16:59, 11 January 2024 (UTC)
What if someone is trying to verify the citation who has access to a library with a physical copy of the magazine but not proxy access to the institutional subscription to the magazine? Why not include all necessary parameters to verify a source by either physical or digital means, if the information is available? Folly Mox (talk) 17:16, 11 January 2024 (UTC)
The lack of page numbers is irrelevant. Lots of academic journals no longer refer to articles by page numbers. They are still journals and should be cited using {{cite journal}}. Same for magazines. The type of citation format describes the type of publication that is being cited, its editorial or organizational structure, not the format by which some editor happened to find it, because that is more important for understanding the nature of the source. —David Eppstein (talk) 17:57, 11 January 2024 (UTC)
I agree that online sources should not be cited using {{cite web}} if a more specific template with broader parameter support could apply. Folly Mox (talk) 18:55, 10 January 2024 (UTC)

More Free DOI prefixes

All from the Microbiology Society. Not all its journals are OA though, hence this per-journal DOI thing.

Access Microbiology

  • 10.1099/acmi[...]

Microbiology

  • 10.1099/mic[...]

Journal of General Microbiology

  • 10.1099/00221287[...]

Microbial Genomics

  • 10.1099/mgen[...]

Headbomb {t · c · p · b} 23:44, 14 January 2024 (UTC)


D-Lib Magazine

  • 10.1045

Headbomb {t · c · p · b} 00:25, 15 January 2024 (UTC)

FASEB

  • 10.1096

Headbomb {t · c · p · b} 00:32, 15 January 2024 (UTC)

  Fixed AManWithNoPlan (talk) 14:20, 16 January 2024 (UTC)

Removal of via parameters

Status
  Not a bug
Reported by
Jo-Jo Eumerus (talk) 06:54, 9 January 2024 (UTC)
What happens
It seems like the bot removes |via=Google Books, despite Wikipedia:Citing sources#Say where you read it advising that the citation say how the source was accessed.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Mount_Churchill&diff=1194462097&oldid=1194336920


This is the incorrect usage of Via. Without a URL, there can be no "via". There are also often many copies of the same book on google, and they come and go. Saying "via google books" is no more helpful than "I googled it, so trust me". AManWithNoPlan (talk) 15:39, 9 January 2024 (UTC)

Using via without a URL is literally the opposite of "Say where you read it". AManWithNoPlan (talk) 15:42, 9 January 2024 (UTC)
I don't think removal of |via= where |url= is not present constitutes an error, but I did notice in this diff Citation bot modifying a {{cite report}} (Ewert et al 2018) by reparameterising |title= to |chapter=, then adding a |title= that duplicated |series=, which I just fixed. Folly Mox (talk) 18:53, 10 January 2024 (UTC)
IMO, via=Google Books makes as much sense as via=the Faculty library, i.e., none, it is just noise. Ditto via=Internet Archive and via=JSTOR. Can someone give an example of a sensible via that does not have a url because I can't think of any.

remove publisher = NLM for cite journal

Status
  Won't fix -- too many books, etc. I did most of them by hand
Reported by
Headbomb {t · c · p · b} 06:18, 7 January 2024 (UTC)
What should happen
[4]
We can't proceed until
Feedback from maintainers



The NLM can be a publisher, but it won't be a publisher of any journal. Headbomb {t · c · p · b} 06:18, 7 January 2024 (UTC)

Couldn't you put the NLM as the publisher of the NLM Technical Bulletin? BhamBoi (talk) 00:39, 15 January 2024 (UTC)
I suppose that would be the one exception. Headbomb {t · c · p · b} 00:45, 15 January 2024 (UTC)

Cite journal for Harper's Magazine

Special:Diff/1196306645 altered a web citation of Harpers.org into cite journal. Should the template used instead be cite magazine? Οἶδα (talk) 16:46, 17 January 2024 (UTC)

Likewise Special:Diff/1196532984 changed a citation to an entry on the Dictionary of Irish Biography (www.dib.ie) to cite journal, even though it's not a periodical. Why? --Paul_012 (talk) 10:32, 18 January 2024 (UTC)
Both {{fixed}} AManWithNoPlan (talk) 14:07, 19 January 2024 (UTC)

Untitled_new_bug

Status
new bug
Reported by
46.188.59.163 (talk) 16:14, 23 January 2024 (UTC)
What happens
Table floated away after the last update
We can't proceed until
Feedback from maintainers


Which article? Which edit? — Preceding unsigned comment added by GoingBatty (talkcontribs) 18:10, 23 January 2024 (UTC)

Add volume=volume issue=issue

Status
  Fixed
Reported by
Headbomb {t · c · p · b} 21:25, 22 January 2024 (UTC)
What happens
[5]
What should happen
not that


This is clearly nonsense. Headbomb {t · c · p · b} 21:25, 22 January 2024 (UTC)

Handle Current Topics in Microbiology and Immunology better

Status
{{fixed}} - will have to run bot twice, but it gets better each time.
Reported by
Headbomb {t · c · p · b} 01:46, 7 January 2024 (UTC)
What should happen
[6]
We can't proceed until
Feedback from maintainers


Had to TNT the title/journal for it to properly give the information Headbomb {t · c · p · b} 01:46, 7 January 2024 (UTC)

Date format error

Status
{{fixed}} - no longer requires the bot to run twice.
Reported by
Headbomb {t · c · p · b} 17:39, 10 January 2024 (UTC)
What should happen
[7]
We can't proceed until
Feedback from maintainers


Factiva links

Status
{{notabug}}
Reported by
MartinPoulter (talk) 14:10, 17 January 2024 (UTC)
What happens
Factiva links are being replaced, as in this edit. The new links do not work. The original https://global-factiva-com.bris.idm.oclc.org links are created in Factiva when I use the "Share" function and select "In other accounts". So, despite the URL containing an identifier for my institution, it seems like these links are suitable for wider sharing. The new link format introduced by the bot does not let me access the news articles I've cited.
We can't proceed until
Feedback from maintainers


Your links can only be used by U of Bristol people. They are useless to everyone else. I am not sure if the new links work either, since I do not have a Factiva account. But, the reality is that factiva links are worthless. I suggest looking at https://en.wikipedia.org/wiki/Template:Factiva AManWithNoPlan (talk) 14:39, 17 January 2024 (UTC)

publication-date instead of date blocks the addition of doi-access=free

Status
  Fixed
Reported by
Headbomb {t · c · p · b} 02:27, 24 January 2024 (UTC)
What should happen
[8]


Incorrect values added to a citation

Status
  Fixed - added various things to list of bad data
Reported by
Vgbyp (talk) 11:34, 27 January 2024 (UTC)
What happens
Added incorrect first1, last1, and date values.
What should happen
Shouldn't have done anything with that link
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Euro&diff=prev&oldid=1199254732
We can't proceed until
Feedback from maintainers


Feature request: Replace {{f/}} with f/ in citation titles

Status
{{wontfix}} - too hard to do for something this rare
Reported by
GoingBatty (talk) 14:48, 18 January 2024 (UTC)
What should happen
To remove more pages from Category:CS1 errors: invisible characters, please replace {{f/}} with f/ in citation titles per the instructions at Template:f/
Relevant diffs/links
e.g. this edit
We can't proceed until
Feedback from maintainers


I did them all by hand. I will have to think about this, since f/ is a template. AManWithNoPlan (talk) 19:28, 23 January 2024 (UTC)

weird mojibake issue

Status
{{wontfix}} - meta data is terrible. All caps and weird wrong unicode.
Reported by
Headbomb {t · c · p · b} 00:38, 4 February 2024 (UTC)
What happens
[9]
What should happen
[10]
We can't proceed until
Feedback from maintainers


Can't the bot at least not overrule the current input when it's got crap like "’" in the titles? Headbomb {t · c · p · b} 01:33, 4 February 2024 (UTC)

Wontfix addition of mojibake is an invitation for {{bots|deny}}, or stronger measures if it is more widespread. What do you think we would do to a human editor who did this? —David Eppstein (talk) 02:05, 4 February 2024 (UTC)

Reduce cosmetic edits by Citation bot

Citation bot is a useful tool that adds missing data and fixes formatting errors in citation templates. However, sometimes it makes minor edits that do not affect the appearance or content of the article, such as removing a space between a quotation mark and a reference tag. These edits are considered cosmetic and are discouraged by the Wikipedia policy on bot usage. For example, see this: [11].

Cosmetic edits by bots can clutter the page history, the watchlist, and the recent changes, making it harder for editors to track the actual changes to the article. They can also trigger the abuse filter and lead to the account being blocked, as it happened to me: [12]. Therefore, I propose that Citation bot should avoid making cosmetic edits when checking multiple pages, unless they are accompanied by other significant edits.

To implement this feature, Citation bot could keep track of the number and type of edits it makes to each page before saving it. Then, it could compare the number of edits with a configurable threshold value, which would determine the minimum number of edits required for the bot to save the page. For example, the bot could save the page only if it makes at least 3 "fast" edits (such as adding a URL or an access date) or 1 "slow" edit (such as retrieving a bibcode or a DOI) per page. The default threshold value could be 1, to preserve the current behavior of the bot.

This way, Citation bot could reduce the number of cosmetic edits and comply with the Wikipedia policy, while still improving the quality and consistency of the citations. I think this would benefit both the bot operators and the Wikipedia community. What do you think? Maxim Masiutin (talk) 19:35, 25 January 2024 (UTC)

The bot is pretty good about avoiding minor edits already. Can you point to some specific minor edits? None of the above count as minor. AManWithNoPlan (talk) 14:38, 26 January 2024 (UTC)
{{ping|AManWithNoPlan}} OK, let me find some of those minor edits. I will let you know within a day. Maxim Masiutin (talk) 19:22, 7 February 2024 (UTC)
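The threshold idea in the proposal above can be sketched as a tally of "fast" and "slow" edits per page plus a save decision. This only illustrates the proposal as written and is not an existing Citation bot feature; the class, function, and parameter names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EditTally:
    """Counts of changes made to one page before saving (hypothetical)."""
    fast: int = 0  # e.g. adding a URL or an access date
    slow: int = 0  # e.g. retrieving a bibcode or a DOI


def should_save(tally: EditTally, min_fast: int = 1, min_slow: int = 1) -> bool:
    """Save only if either kind of edit reaches its configured minimum.

    Defaults of 1 mean any single edit triggers a save, matching the
    proposal's suggested default.
    """
    return tally.slow >= min_slow or tally.fast >= min_fast
```

With the example numbers from the proposal, `should_save(EditTally(fast=2), min_fast=3)` is False while `should_save(EditTally(slow=1), min_fast=3)` is True.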

Incorrect conversion of working paper to cite journal

Status
new bug
Reported by
David Eppstein (talk) 07:30, 30 January 2024 (UTC)
What happens
Special:Diff/1200695439
Relevant diffs/links
Not that. "Economics Working Paper Archive" is not a journal.
We can't proceed until
Feedback from maintainers


In attempting to reinstantiate the same bad edit [13], the bot made a different bad edit that mashed together two differently titled versions of the same news story [14] [15]. The version with the doi is the better version to use, but the bot should not have mashed that into the existing citation of the other version. —David Eppstein (talk) 20:43, 30 January 2024 (UTC)
Working paper website and its title added to the database of things that look like journals, but are not. AManWithNoPlan (talk) 14:03, 31 January 2024 (UTC)

adding a volume for journals that don't have volumes

Hi,

This bot is adding a volume number which is the same as the issue number for publications that don't have a volume number. Surely this is a mistake?

https://en.wikipedia.org/w/index.php?title=Gypsy%2C_Roma_and_Traveller_people_%28UK%29&diff=1202613177&oldid=1202374242 Boynamedsue (talk) 08:21, 3 February 2024 (UTC)

The existing data is the mistake. AManWithNoPlan (talk) 00:43, 4 February 2024 (UTC)

More Academia.edu normalization

Status
  Won't fix
Reported by
Headbomb {t · c · p · b} 01:41, 21 January 2024 (UTC)
What should happen
[16]


Are there other url prefixes? AManWithNoPlan (talk) 19:04, 23 January 2024 (UTC)

This is annoying, since the article numbers actually change. AManWithNoPlan (talk) 20:02, 24 January 2024 (UTC)

DOIs - heads up

The bot's DOI code has been improved substantially. It will now flag many more bad DOIs as dead. It used to only detect things like doi:10.22111/jsr.2013.848, but will now also flag things like doi:10.3201/eid1007.040396. Please report any mis-flagged DOIs. AManWithNoPlan (talk) 21:16, 23 January 2024 (UTC)

Could you make it so that the free DOI checking based on prefixes is done before the broken check? It was a bit of a pain in the ass to deal with all the broken Medknow DOIs that should have been flagged as free but weren't because they were broken. Headbomb {t · c · p · b} 02:29, 24 January 2024 (UTC)
Since the template no longer links dead dois, even if free, I have removed that check. AManWithNoPlan (talk) 17:54, 24 January 2024 (UTC)
https://en.wikipedia.org/wiki/Category:CS1_maint:_DOI_inactive_as_of_February_2024 Huge number getting found. AManWithNoPlan (talk) 13:43, 7 February 2024 (UTC)

inappropriate conversion of cite web to cite journal

Status
  Fixed in several ways.
Reported by
Trappist the monk (talk) 14:14, 5 February 2024 (UTC)
What happens
Ignoring the fact that this source took its content from Wikipedia, bot converted more-or-less correct {{cite web}} to a wholly incorrect {{cite journal}}. This error caught because the bot included html numeric entities for [[ and ]] around whatever it was that it thought to be the journal name.
Relevant diffs/links
diff
We can't proceed until
Feedback from maintainers


Figure stuff out based on google scholar link

Status
{{wontfix}} since too hard, and lots of people seem to link to scholar for scholar's sake and not for the article
Reported by
Headbomb {t · c · p · b} 00:02, 27 December 2023 (UTC)
What should happen
https://en.wikipedia.org/w/index.php?title=Akinola_Alada&diff=next&oldid=1191984133
We can't proceed until
Feedback from maintainers


I will have to think about this. AManWithNoPlan (talk) 01:02, 27 December 2023 (UTC)

Idk, can't people click through to the publisher? Seems like potentially a lot of processing to save a click for extra low effort editors.... And does google scholar have reliably complete citations? Or is it more like Citation bot would have to be programmed to follow the link and parse the target? Folly Mox (talk) 03:01, 27 December 2023 (UTC)
This request is to parse the google scholar information since it's given, and fill the template accordingly. IDC what happens to the original link. Headbomb {t · c · p · b} 03:13, 27 December 2023 (UTC)
The bot would have to follow the link and expand based upon that. The other problem is that a lot of the links are intended to be to scholar and not the article itself. AManWithNoPlan (talk) 16:03, 27 December 2023 (UTC)
I think the key part is view_op=view_citation in the url. If that's there, seems to be a metadata page, rather than something more useful. Headbomb {t · c · p · b} 22:12, 27 December 2023 (UTC)
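Checking for the view_op=view_citation marker mentioned above is straightforward with a URL parser. The sketch below assumes the links live on a scholar.google.* host; that host check and the function name are assumptions for the example, and it says nothing about whether the bot should then expand the citation.

```python
from urllib.parse import parse_qs, urlparse


def is_scholar_citation_page(url: str) -> bool:
    """Return True for Google Scholar URLs carrying view_op=view_citation."""
    parsed = urlparse(url)
    host = parsed.hostname
    # Assumption: Scholar is served from scholar.google.<tld>.
    if host is None or not host.startswith("scholar.google."):
        return False
    return parse_qs(parsed.query).get("view_op") == ["view_citation"]
```

A plain search link such as `https://scholar.google.com/scholar?q=example` is not matched, which keeps links that are intentionally to Scholar itself out of scope.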

Adding today's date as publication date for a book

Status
{{fixed}} by adding some code to detect super new dates on google books. Weird. I cannot replicate this.
Reported by
PamD 10:48, 3 February 2024 (UTC)
What happens
current date was added to a "cite book" which had a missing date
What should happen
leave well alone unless sourcing correct date via isbn
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Pulch%C3%A9rie_Abeme_Nkoghe&diff=1202592158&oldid=1202324262
We can't proceed until
Feedback from maintainers
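As a rough illustration of what "detecting super new dates" might mean, the sketch below rejects a candidate publication date that falls within a few days of the day the bot runs, or in the future. The seven-day window and the function name are assumptions for this example; the actual check added to the bot is not described in the report.

```python
from datetime import date, timedelta


def is_suspiciously_recent(candidate: date, today: date, window_days: int = 7) -> bool:
    """Return True if the candidate date is within window_days of today or later.

    window_days is an assumed tolerance, not a value taken from the bot.
    """
    return candidate >= today - timedelta(days=window_days)
```

A date flagged this way would simply be dropped, leaving the citation's |date= empty as the reporter asks.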


created |last= composed solely of punctuation

Status
{{fixed}}
Reported by
Trappist the monk (talk) 13:45, 4 February 2024 (UTC) 13:50, 4 February 2024 (UTC)
What happens
Attempts to create a last/first split name from a non-name: Iowa. (Ter.) (from Google books metadata?) → | last1=) | first1=Iowa. (Ter
What should happen
values assigned to |authorn=, |firstn=, |lastn= should never be composed solely of punctuation and/or digits; this applies to the other namelists as well
Relevant diffs/links
diff; also diff
We can't proceed until
Feedback from maintainers
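The rule requested in this report (name values must never consist solely of punctuation and/or digits) amounts to requiring at least one letter. A minimal sketch, with hypothetical function names and the added assumption that a rejected |last= should discard the whole last/first pair:

```python
def is_plausible_name_part(value: str) -> bool:
    """A name part must contain at least one alphabetic character."""
    return any(char.isalpha() for char in value)


def accept_name(last: str, first: str = "") -> bool:
    """Accept a last/first pair only if every non-empty part has a letter."""
    if not is_plausible_name_part(last):
        return False
    return not first or is_plausible_name_part(first)
```

For the split shown in the report, `accept_name(")", "Iowa. (Ter")` is False, so neither parameter would be written.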


gigabyte78

Status
{{fixed}} - that's some messed up PubMed data
Reported by
Abductive (reasoning) 04:46, 10 February 2024 (UTC)
What happens
diff
We can't proceed until
Feedback from maintainers


What is the bot doing here? It takes out the pages and puts in "gigabyte" and some numbers. I undid it and it repeated the next day, so it's not some transient thing. Abductive (reasoning) 04:46, 10 February 2024 (UTC)

This is totally wrong. The bot is replacing |page(s)= with the suffix from |doi=. The bot should not be doing that. Ever.
Trappist the monk (talk) 13:02, 10 February 2024 (UTC)

bot inappropriately converts invalid parameter |pmcid= to |s2cid=

Status
{{fixed}}
Reported by
Trappist the monk (talk) 14:34, 10 February 2024 (UTC)
What happens
|pmcid=PMC5528981 (pmcid is not a valid parameter; PMC5528981 is not a valid |pmc= value) is wrongly converted to |s2cid=PMC5528981 where PMC5528981 is not a valid |s2cid= value
Relevant diffs/links
diff
We can't proceed until
Feedback from maintainers


Untitled_new_bug

Status
new bug
Reported by
22Anshika (talk) 20:29, 12 February 2024 (UTC)
We can't proceed until
Feedback from maintainers


I can see there are some issues with the links in an article I edited on Wikipedia. The bot might find it unreliable, but the links I have used, which I feel might be the issue, are 100% legitimate. Actually, that is the website we check to confirm the results of chess matches and tournaments. So please help me remove the warning from the top of the page. I am hoping to present it to the person, on his birthday, which is in less than 24 hours now. Thanks

Adds Interstate Commerce Commission as report author

Status
{{fixed}} - added bad author detection to DX.doi.org code.
Reported by
JudeFawley (talk) 07:50, 20 February 2024 (UTC)
What happens
"United States. Interstate Commerce Commission" added as author of NTSB reports
What should happen
no author should be added for reports with no named authors, as long as the report publisher is the entity that actually authored the report. If any author is added, it should be the publisher itself, in this case the NTSB, not the Interstate Commerce Commission, who had no involvement with the report.
Relevant diffs/links
https://en.m.wikipedia.org/w/index.php?title=List_of_boiling_liquid_expanding_vapor_explosions&diff=prev&oldid=1209089420&title=List_of_boiling_liquid_expanding_vapor_explosions&diffonly=1
We can't proceed until
Feedback from maintainers


CrossRef fails

Status
{{fixed}} - it was some odd character encodings.
Reported by
UtherSRG (talk) 19:30, 22 February 2024 (UTC)
What happens
Attempted to run on Caconemobius anahulu....
>Consult APIs to expand templates
  >Checking that DOI 10.1111/een.13011 is operational...
  !CrossRef title did not match existing title: doi:10.1111/een.13011
  >  Possible new title: Lava crickets (Caconemobius spp.) on Hawai'i Island: first colonisers or persisters in extreme habitats?
  >  Existing old title: Lava crickets (''Caconemobius'' spp.) on Hawai'i Island: first colonisers or persisters in extreme habitats?
What should happen
Bot should ignore the italics/bold/etc, or at least attempt a re-match without the formatting... or is this just the destination spitting out bad data and the bot is doing the correct recheck?
We can't proceed until
Feedback from maintainers
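
The re-match suggested in the report (compare titles with the wiki italic/bold markup ignored) could be sketched as follows. This is a hypothetical Python illustration of the reporter's suggestion only; the status line above attributes the actual failure to odd character encodings, which this sketch does not address.

```python
import re


def strip_wiki_emphasis(title: str) -> str:
    """Remove runs of two or more apostrophes (wiki italics/bold markup).

    A single apostrophe, as in "Hawai'i", is left in place.
    """
    return re.sub(r"'{2,}", "", title)


def titles_match(a: str, b: str) -> bool:
    """Compare two titles ignoring emphasis markup, case and extra spaces."""
    def norm(s: str) -> str:
        return " ".join(strip_wiki_emphasis(s).split()).casefold()

    return norm(a) == norm(b)
```

With the two titles from the log above, titles_match returns True, because the only difference between them is the pair of '' markers around Caconemobius.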


Single and the only author expanded from first/last to first1/last1

Status
new bug
Reported by
Maxim Masiutin (talk) 19:07, 24 February 2024 (UTC)
We can't proceed until
Feedback from maintainers


@AManWithNoPlan: I raised an issue in the past, but it is now archived as User_talk:Citation_bot/Archive_37, see there: "Bug? The bot should not replace first/last to first1/last1 when there is just one author"

There was a reference: <ref>{{Cite book |last=Handy |first=E. S. Craighill |url={{google books|plainurl=y|id=PoXQAgAAQBAJ|page=120}}|title=Ancient Hawaiian Civilization: A Series of Lectures Delivered at THE KAMEHAMEHA SCHOOLS |last2=Davis |date=2012-12-21 |publisher=Tuttle Publishing |isbn=978-1-4629-0438-9 |language=en}}</ref>

The bot changed it to the following: <ref>{{Cite book |last1=Handy |first1=E. S. Craighill |url={{google books|plainurl=y|id=PoXQAgAAQBAJ|page=120}}|title=Ancient Hawaiian Civilization: A Series of Lectures Delivered at THE KAMEHAMEHA SCHOOLS |last2=Davis |date=2012-12-21 |publisher=Tuttle Publishing |isbn=978-1-4629-0438-9 |language=en}}</ref>

To reproduce it, copy the example to your sandbox and click "Citations" button (you should have this button present as a gadget enabled in Wikipedia preferences).

@AManWithNoPlan: You wrote {{tl|wontfix}}, since the complexity of going back and changing them will just make the bot's author handling that much more insane, and it is already complicated enough.

How did you reach the conclusion that it is already complicated enough? Issues like this one are very important, because an edit in which the bot only changes first/last to first1/last1 is not only questionable but also a WP:COSMETICBOT violation.

Unnecessary volume parameter

Status
{{fixed}} - added International Journal of the Sociology of Language to list of journals that crossref calls issues volumes
Reported by
Demetrios1993 (talk) 17:38, 24 February 2024 (UTC)
What happens
The bot added |volume= with the value of 134, which actually pertains to the |issue= and is already included there. The actual volume of the source is the year-related value of 1998, which can be confusing; the added parameter is redundant.
What should happen
Not that
Relevant diffs/links
diff
We can't proceed until
Feedback from maintainers


Cite web for blog article converted to cite journal

Status
{{fixed}} - urls with "breakingnews" or "/blog/" in them will not trust Zotero when it says "this is a journal"
Reported by
  — Chris Capoccia 💬 14:15, 26 February 2024 (UTC)
What happens
Blog article on Neurology Today converted to cite journal when it's not a journal article
Relevant diffs/links
diff
We can't proceed until
Feedback from maintainers
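
The rule described in the status line can be illustrated with a small sketch. The two URL fragments come from that status line; everything else (the function name, the Python form) is an assumption for illustration and not the bot's real code.

```python
# URL fragments named in the status line above.
UNTRUSTED_URL_FRAGMENTS = ("breakingnews", "/blog/")


def zotero_journal_claim_is_trustworthy(url: str) -> bool:
    """Return False when the URL suggests a blog or breaking-news page.

    In that case a Zotero result saying "this is a journal" should not be
    used to convert {{cite web}} into {{cite journal}}.
    """
    lowered = url.lower()
    return not any(fragment in lowered for fragment in UNTRUSTED_URL_FRAGMENTS)
```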


URL expansion seems to be broken

No matter what article I run Citation Bot on, it reports !Operation timed out after 20001 milliseconds with 0 bytes received or >Could not resolve URL for every URL it tries to process, before eventually !Giving up on URL expansion for a while. It appears that at least some of the specialized URL expanders (for journals and such) are still working, but not the general-purpose one. :Jay8g [VTE] 01:04, 27 February 2024 (UTC)

Journals etc are not done via URL expansion. The URL expander is a wikipedia provided service, so I cannot fix this problem. AManWithNoPlan (talk) 14:10, 27 February 2024 (UTC)

Please remove claim that editors may use the bot to check a single article, because it consistently fails

Every time I have tried to use the bot to check a single article, it fails with Error: Citations request failed. The bot has always suffered intermittently from this problem, but at least BrownHairedGirl used to challenge those few editors who saturated the service. Since she was blocked, there has been no one with the expertise to call out the abuse, so presumably it has become endemic.

Regardless of the reasons, it is surely time to declare in all honesty that the bot is not a practical option for single shot use. Better still, create another instance of the bot that cannot be used by any editor for more than one article in a 24-hour period. 𝕁𝕄𝔽 (talk) 12:22, 3 March 2024 (UTC)

Example of such a failure? Because the bot hasn't had any availability issues in several years for me. Headbomb {t · c · p · b} 14:36, 3 March 2024 (UTC)
It fails more often than it succeeds for me when checking single articles. --John B123 (talk) 14:43, 3 March 2024 (UTC)
It just failed on Tunisian Constitution of 2022. --John B123 (talk) 15:05, 3 March 2024 (UTC)
It ran just fine for me. Took less than a minute. Headbomb {t · c · p · b} 15:14, 3 March 2024 (UTC)
I always have this issue when using the edit window button that the gadget adds, but using the toolbar button (in read mode) or the Toolforge page almost always works. I've just given up on the edit window button, assuming it was something wrong with my configuration (perhaps Firefox or one of my extensions is blocking it), since the other methods don't give me problems. :Jay8g [VTE] 00:13, 4 March 2024 (UTC)
Thanks. I've been using it from the edit window button and had the problems. Trying it from the toolbar link worked fine on a couple of articles. Looks like the problem is with the edit window button rather than the bot itself. --John B123 (talk) 01:00, 4 March 2024 (UTC)
I use it from the toolbar button on the edit window, I'm not aware of alternative methods. I tried to use it on Robert Hooke four times on three widely [by hours, then days] separated occasions and each time it failed. This is precisely the behaviour that I had come to expect and so had given up on using the tool. But a GA award and an upcoming DYK made me feel I should try it again. The outcome is as I expected: failure. --𝕁𝕄𝔽 (talk) 08:57, 4 March 2024 (UTC)
And just to take the wind out of my sails, I tried again just now and it went through in less than a second (no changes required, which is all I wanted to know). But my concern still stands: is anybody monitoring these failures? --𝕁𝕄𝔽 (talk) 11:17, 4 March 2024 (UTC)
Sometimes when it "fails", it is just the web-browser giving up, and the bot does still run. AManWithNoPlan (talk) 13:29, 4 March 2024 (UTC)
That generates an error something like "no response from server". In this case, the response is Error: Citations request failed, which can only come from the bot, surely? --𝕁𝕄𝔽 (talk) 16:55, 4 March 2024 (UTC)
I feel like there's a bit of confusion about the different options here. There are three different ways to activate the bot:
  • Through https://citations.toolforge.org/ - this almost always works unless the bot is down
  • Through the "expand citations" link in read mode (in the tool area on the right side of the page if you're using the default skin) - this almost always works unless the bot is down
  • Through the "✓ Citations" button in edit mode -- this almost never works and gives the "Error: Citations request failed" message described by users above
The first two sometimes give a Wikimedia Error message but the bot still runs, as AManWithNoPlan describes above. The last option is the one that I believe John B123 and JMF are describing. This is the one that puts the edits into the edit window to be saved under the user's account (rather than under the Citation Bot account), so it's not something that can still work in the background even if the browser gives up. :Jay8g [VTE] 19:05, 4 March 2024 (UTC)
But the browser is not giving up, because the response in that case would be "no response from server". It has to be the widget.
So if the honest appraisal is that the ✓ Citations button is consistently unreliable (which is true), then it is time to remove it and stop advertising it. 𝕁𝕄𝔽 (talk) 08:51, 5 March 2024 (UTC)
The button is still useful and works in the vast majority of cases. That it fails on very large pages and on select citations is not a reason to remove the option. Headbomb {t · c · p · b} 12:34, 5 March 2024 (UTC)
In that case, can the Error: Citations request failed message be improved? Like "try again using Expand citations option in the tools column (left)"? Because in all honesty, I would have to say that the toolbar icon works in the vast minority of cases. I have been complaining here for years and it has taken until now to find the solution. --𝕁𝕄𝔽 (talk) 12:45, 5 March 2024 (UTC)
Is there data to show that it "works in the vast majority of cases"? There are three of us here who can't seem to ever get it to work. It's possible that it doesn't work in some browsers/configurations -- I'm using Firefox. :Jay8g [VTE] 19:09, 5 March 2024 (UTC)
and I'm using Chrome. 𝕁𝕄𝔽 (talk) 20:26, 5 March 2024 (UTC)
I'm using Chrome too. --John B123 (talk) 20:43, 5 March 2024 (UTC)
Firefox here. Headbomb {t · c · p · b} 02:23, 6 March 2024 (UTC)

still adding |chapter= to {{cite journal}}

Status
  Won't fix, since GIGO and I will continue to track. It shows up about once a week
Reported by
Trappist the monk (talk) 13:15, 18 February 2024 (UTC)
What happens
Bot added |chapter= to {{cite journal}}. {{cite journal}} and the other periodical or periodical-like cs1 templates ({{cite magazine}}, {{cite news}}, {{cite periodical}}, {{cite web}}) do not support |chapter= (and aliases |contribution=, |entry=, |article=, |section=)
Relevant diffs/links
diff
We can't proceed until
Feedback from maintainers


This happens every couple days. They are logged, and I go back and manually fix them (if someone else does not get to them first). Almost always, the citation is broken before the chapter is added. AManWithNoPlan (talk) 15:03, 20 February 2024 (UTC)
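
For reference, the constraint stated in the report can be written down as a simple check. The template and parameter names are the ones listed under "What happens"; the function names and the Python form are hypothetical and are not how the bot is implemented.

```python
# Templates that do not support |chapter= or its aliases, per the report.
PERIODICAL_TEMPLATES = {
    "cite journal", "cite magazine", "cite news", "cite periodical", "cite web",
}
CHAPTER_ALIASES = {"chapter", "contribution", "entry", "article", "section"}


def may_add_chapter(template_name: str) -> bool:
    """Return True only if the template is not one of the periodical types."""
    return template_name.strip().lower() not in PERIODICAL_TEMPLATES


def unsupported_chapter_params(template_name: str, params: dict) -> list:
    """List any chapter-like parameters present on a periodical template."""
    if may_add_chapter(template_name):
        return []
    return sorted(name for name in params if name in CHAPTER_ALIASES)
```

A check like this only guards the bot's own additions; it does not repair citations that were already broken before the bot ran.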

Wayback Machine

Status
  Fixed
Reported by
GreenC 23:40, 28 February 2024 (UTC)
Relevant diffs/links
Special:Diff/1209026325/1210917735


The bot is probably trying to address all those broken reFill edits, but in some rare cases it's actually correct |title=Wayback Machine. Not sure how to address. -- GreenC 23:45, 28 February 2024 (UTC)

Should not remove via=The Wikipedia Library on cite encyclopedia

Status
new bug
Reported by
David Eppstein (talk) 07:05, 14 December 2023 (UTC)
What happens
Special:Diff/1189805360
What should happen
Not that. The via link on this citation is a necessary part of the citation, as it describes how a copy of the citation was and can be obtained. Without that information, the citation fails to describe how to find the reference.
We can't proceed until
Feedback from maintainers


I think I agree with Citation bot on this one. I think the parameter value should be |via=EBSCO Literary Reference Center Plus. It wasn't obvious to me, a Wikipedia Library user, that I was supposed to use the default search bar at the top that is powered by EBSCO. They have pretty poor coverage of my usual topic areas. After forgetting to place my search term, "Baker & Taylor Author Biographies", in quotes for literal string matching, I actually went to Taylor & Francis next on a misguided hunch, before just asking google which publishing platform licensed the reference work, after which I was able to verify that TWL does provide access to it.

If that was my experience, what about the experience of a reader without TWL access who tries to verify that citation? What about our experience when someone sets |via=Inaccessible University Undergraduate Library System? Folly Mox (talk) 07:29, 14 December 2023 (UTC)

The fact that you found this specific instantiation of WP:SAYWHEREYOUGOTIT difficult to follow might be a reason for making an easier-to-follow recipe for finding the information. It is not an excuse for blanking that information. Also, although that source happens to be in the EBSCO source, I think the default search bar uses a combination of sources. I have found plenty of non-EBSCO material that way. I agree the search is not good in general, but in this case searching the title as a quoted string found it easily. —David Eppstein (talk) 07:33, 14 December 2023 (UTC)
I agree that had I formatted my initial search properly, I would have found the source without false starts and getting lost. I think what I'm trying to communicate is that |via=(membership in something with an institutional subscription) is never going to be helpful for people outside that membership, and even for the members it's more of a starting point (yes, I should be able to access this content) than a way (via) to access the content. Just some sleepy thoughts. Folly Mox (talk) 08:09, 14 December 2023 (UTC)
For the same reason you think we should remove all paywalled doi links on journal articles because they are never going to be helpful to someone without a subscription? Maybe just remove non-free-to-read references altogether? No. —David Eppstein (talk) 15:23, 14 December 2023 (UTC)
I'm sorry I communicated so poorly. That is not at all what I intended, and after having slept I do agree with you that no |via= parameter is a disimprovement in this case over TWL, but EBSCO would be more helpful (since other institutional subscriptions have access to it). Folly Mox (talk) 17:18, 14 December 2023 (UTC)
The distinction I'm attempting to draw here is between publishing platforms (accessible to many different groups, host the actual content, material of sufficient interest to wealthy outgroup folk can be purchased for an exorbitant sum) and access systems (TWL, SomeUniversity.edu, "I have access to ProQuest because I'm a journalist or whatever"). Access systems are generally entirely closed and invite-only, and typically don't offer a means to help specify which work is cited except by proxy links that only work when logged in to the access system.
In this case, if we take "The Wikipedia Library" to mean "The Wikipedia Library search bar", that does sufficiently identify the source, but it still only works for us. Even if we take it at face value, like I did, it gives us a starting point for verification in the way that Citation bot's removal doesn't. EBSCO would let any reader know which publisher they or their institution needs a subscription with (or to hand over money to) in order to verify. Folly Mox (talk) 19:03, 14 December 2023 (UTC)
I do not know how to make an EBSCO link to EBSCO content obtained through the Wikipedia Library that will remain permanently valid and will allow both Wikipedia Library subscribers and other EBSCO subscribers to access the content. If I did know how to provide such a link, I would have used it instead of just saying that you can find the content through the Wikipedia Library. Maybe you can educate me on how to provide such links instead of continuing to harangue me on how the access method I described was somehow so useless that bot-removal was an improvement. —David Eppstein (talk) 20:03, 14 December 2023 (UTC)
David Eppstein, I'm legitimately deeply sorry I've made you feel harangued. I've been trying to explain myself, because I was feeling misunderstood entirely (which is likely my fault due to poor wording). I did say above that I have come round to the feeling that Citation bot's edit was a disimprovement on your original |via=. As to creating an EBSCO link, that's also not what I intended to mean. My position is that the most useful value of |via= for this citation is "EBSCO Literary Reference Center Plus" as I said in my original comment. That's all.
Sorry again. Folly Mox (talk) 21:51, 14 December 2023 (UTC)
I, for one, would not have any idea how to access "EBSCO Literary Reference Center Plus" (except maybe after seeing this thread), despite regularly using The Wikipedia Library. —David Eppstein (talk) 22:19, 14 December 2023 (UTC)

I also agree with Citation bot. Inclusion of |via=Wikipedia Library is cruft of very low value. The specific library system through which someone accessed a source (or even, gasp, Sci-hub) does not need documentation. Ifly6 (talk) 15:51, 14 December 2023 (UTC)

We need some way of identifying how to find the citation. In this case my judgement as an editor was that the title and name of work alone were inadequate, and that the via= provided that identification. This is not the sort of judgement Citation bot should be automatically reversing. Your opinion as another human agreeing with the removal is not relevant to the question of whether this is the sort of edit a bot should be making. —David Eppstein (talk) 17:56, 14 December 2023 (UTC)
In this particular case, the via= parameter is rather helpful; the citation is bare enough without it that improving it was on my list of things to fix about the article. The only bot edit I could imagine being good here would be to wiki-link all occurrences of The Wikipedia Library in the via= parameter, because it's probably unfamiliar to readers who aren't themselves fairly serious Wikipedia editors. XOR'easter (talk) 18:14, 14 December 2023 (UTC)

From the citation I'm not quite sure what "Baker & Taylor Author Biographies" is. It would help to specify that Baker & Taylor is the publisher and what format the work is in. It seems to be some kind of database, so people would know to search it in the usual places like Worldcat. Given the date, it's most likely based on a previously published book which the publisher has acquired, so the best solution would be to cite the original authors and source. Nemo 20:58, 14 December 2023 (UTC)

I don't know exactly what it is either. It is what The Wikipedia Library told me the citation was from. The suggested AMA-format citation provided by EBSCO / The Wikipedia Library is:
Anne Sigismund Huff. Baker & Taylor Author Biographies. January 2000:1. Accessed December 14, 2023. https://search.ebscohost.com/login.aspx?direct=true&db=lkh&AN=49334395&site=eds-live&scope=site
You will notice the useless login-page url and the total lack of publisher and format information. Given that information, it's not obvious how a human editor could reasonably have been expected to produce anything better. But we are not here to talk about that, we are here to talk about how a bot editor can be prevented from making a not-very-good citation even worse. —David Eppstein (talk) 22:24, 14 December 2023 (UTC)
Incidentally, by some web searching I found a different way to link EBSCO content: if you use the "permalink" function on the right toolbar you will get a link that demands a Wikipedia Library login rather than an EBSCO login. So I guess it can only be read by other Wikipedia library users? How helpful. —David Eppstein (talk) 22:36, 14 December 2023 (UTC)

I'd recommend replacing the "Via" with "Literary Reference Center Plus", since the source is not really the Wikipedia Library per se. I see using the latter for "via" as something akin to putting "via=My local librarian printed it out for me", which is frankly not very useful to anyone who has a different local librarian. –jacobolus (t) 00:45, 15 December 2023 (UTC)

The intended meaning of the "via" was that to access this source, assuming you have Wikipedia Library access, you should go to the Wikipedia Library and type the title into the search bar across the top of the screen. The search bar is not labeled "Literary Reference Center Plus". I do not know what "Literary Reference Center Plus" is. Searching the Wikipedia Library page for the string "Literary Reference Center Plus" finds nothing. Putting via="Literary Reference Center Plus" would, for me, be as useless as leaving it blank. Not everyone has the same local librarian but all established Wikipedia editors (you know, the people who might want to verify a reference, for instance to see what it says in the context of an AfD discussion or to use it to expand the article) have the same Wikipedia Library. It would be better to have a link that readers and not just editors could access, but we don't. And again, you're missing the point: it should not be whether someone else might have come up with a better description of how to access the reference, it should be whether it is appropriate for a bot to be blanking this deliberately-included information. —David Eppstein (talk) 01:57, 15 December 2023 (UTC)
It is indeed unfortunate though that EBSCO and Baker & Taylor are apparently really bad at providing meaningful links or information about their various published documents.
There is at least a little bit more relevant metadata which might help someone locate this document: Baker & Taylor Author Biographies is OCLC 877175691, and apparently at Literary Reference Center Plus (the name of the EBSCO database providing the document, accessible from a wide variety of public and university libraries, which should definitely be mentioned somewhere in this citation), this particular record is apparently Accession Number 49334395.
You're probably right that the bot shouldn't blank the via parameter in this kind of case. I wouldn't be surprised to see a human editor blanking it though. –jacobolus (t) 03:26, 15 December 2023 (UTC)
Finally through some more searching I find that the correct solution (I think?) should be to use {{EBSCOhost}} with the id as a parameter. I say "should be" because it doesn't actually work. The example in the EBSCOhost template documentation leads to a document, but the one in the citation above just sends me to a search page that tells me nothing by that id was found in the "Academic Search Complete" database. To make it work I also have to include the magic incantation dbcode=lkh: "Anne Sigismund Huff". Baker & Taylor Author Biographies. January 2000. EBSCOhost 49334395. Now wouldn't it be nice if a bot could figure all that out instead of just blanking things. —David Eppstein (talk) 06:28, 15 December 2023 (UTC)

Add Internet Archive Scholar links

Status
Moved to short list
Reported by
Nemo 22:07, 28 November 2023 (UTC)
What happens
Nothing
What should happen
Add links to Internet Archive Scholar archived copies, where available and found by DOI, if Unpaywall and PMC have none.
We can't proceed until
Feedback from maintainers


This should be relatively fast with the API; Google Scholar is doing the same and shows those OA links, which were generally archived due to being public domain or CC-licensed. You can see the docs at https://scholar.archive.org/api/redoc but here's an example:

$ curl -sH "Accept: application/json" https://scholar.archive.org/search?q=doi:10.1080/14786449908621245 | jq -r .results[0].fulltext.access_url
https://archive.org/download/crossref-pre-1909-scholarly-works/10.1080%252F14786449608620921.zip/10.1080%252F14786449908621245.pdf

Optionally the metadata can be used to construct the scholar.archive.org URL, which in this case is https://scholar.archive.org/work/heaairhf5fgkvgie4h54rpc4nm/access/ia_file/crossref-pre-1909-scholarly-works/10.1080%252F14786449608620921.zip/10.1080%252F14786449908621245.pdf and for a wayback URL would be something like https://scholar.archive.org/work/rv4lw3nikrfstp7bvvlxapsylu/access/wayback/https://pubs.rsc.org/en/content/articlepdf/2022/dt/d2dt00998f . (This will reduce confusion by bots which think there's utility in converting web.archive.org links into something else.)

Nemo 22:07, 28 November 2023 (UTC)
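
For anyone who prefers it to the curl/jq one-liner, the same lookup can be sketched in Python. The search URL pattern and the results[0].fulltext.access_url path are taken from the example above; the function names are hypothetical, the sketch assumes the response has that shape, and it deliberately leaves the HTTP request itself to the caller.

```python
from typing import Optional

SEARCH_BASE = "https://scholar.archive.org/search?q=doi:"


def search_url(doi: str) -> str:
    """Build the search URL used in the curl example above."""
    return SEARCH_BASE + doi


def first_access_url(search_response: dict) -> Optional[str]:
    """Return results[0].fulltext.access_url, or None if it is absent.

    search_response is the decoded JSON body of the search request
    (sent with the "Accept: application/json" header).
    """
    results = search_response.get("results") or []
    if not results:
        return None
    fulltext = results[0].get("fulltext") or {}
    return fulltext.get("access_url")
```

Returning None when no full text is found lets the caller leave the citation unchanged.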

Seems like a lot of them are just copies of arXiv PDFs. AManWithNoPlan (talk) 17:16, 4 December 2023 (UTC)
If by "a lot" you mean about 2 million out of 25 million: yes, I'd expect the entire arxiv to be archived by IA scholar. There's no need to link these if there's already an arxiv identifier. (Though it's sad that the arxiv identifier doesn't auto-link.) Nemo 22:31, 4 December 2023 (UTC)
I am curious which type of url is best. I am always a bit leery of PDF links that do not end in PDF (option 3). I wonder if the first method would ever provide multiple options. AManWithNoPlan (talk) 22:13, 7 December 2023 (UTC)
Recommend the /download/ link, because it has the .pdf extension, it's more standard than the scholar.archive.org URLs, the URL is shorter and less complex, it's more aligned with where the content is actually located. scholar.archive.org is basically an index, not a repository. The data is hosted at //archive.org (that seems confusing since it's the same site but they are different servers). -- GreenC 01:21, 8 December 2023 (UTC)
As GreenC says, the archive.org/download/ links are usually preferred. In this case I'd prefer the scholar.archive.org resolver because 1) the edits will look more consistent, using the same domain name whether the PDF is under web.archive.org or archive.org, 2) some of these items might be split and relocated in the future, in which case the scholar.archive.org links will probably still work somewhat but the archive.org/download/ links may break. These are just aesthetic or very rare issues though.
I recommend using scholar.archive.org for the works which are linked to web.archive.org though, because bots and the cite templates themselves often complain about web.archive.org being in the url parameter, so you'd be forced to add all of url, archive-url, url-status=unfit and the entire family of parameters. Nemo 09:19, 8 December 2023 (UTC)
For example, doi:10.4103/0973-8398.104830 (which currently citation bot auto-links but ends up on a non-resolvable domain www[.]asiapharmaceutics[.]info) could be linked to https://scholar.archive.org/work/7ss2kx3v75d3jifq2pc4uiucce/access/wayback/http://www.asiapharmaceutics.info:80/index.php/ajp/article/download/52/48 Nemo 08:28, 11 December 2023 (UTC)

From valid reference to unrecognizable junk in three Citation bot edits

Not reporting as a bug, because I think the original bug that started this chain of garbage is long fixed, but: Special:Diff/924930722 (2019): adds the dois for the reviewed items to two references to reviews of the item; Special:Diff/958243206 (2022): expands one of the references with more metadata from the doi; Special:Diff/1196114785 (2024): piles on even more metadata creating a broken citation template because of incompatible parameters (|chapter=, added in this edit, and |journal=, present in the original reference).

This is a phenomenon I have frequently complained about, to little avail: when Citation bot takes a single pass over an article, the results are often (but not always) improvements. But when Citation bot and the other bots take pass after pass after pass over an article, any mistakes are amplified, to the point where eventually they overwhelm the improvements.

Given that the last of these was "suggested by Grimes2": User:Grimes2, you are ultimately responsible for these bad edits. Please take more care in checking that the results of your suggestions are actually improvements. —David Eppstein (talk) 18:06, 16 January 2024 (UTC)

David Eppstein's version:
Citation bot result:
Citer result:
In Eppstein's version the author doesn't match the doi. Grimes2 (talk) 19:07, 16 January 2024 (UTC)
That is not my version. My version is Esposito, Pierpaolo (2010), Mathematical Reviews, MR 2500525{{citation}}: CS1 maint: untitled periodical (link). It is intended as a reference to Esposito's review of Tarantello's book, not as a reference to the book. —David Eppstein (talk) 19:17, 16 January 2024 (UTC)
Where is Esposito's review? Can't find under MR 2500525, nor doi, nor ISBN. Grimes2 (talk) 19:21, 16 January 2024 (UTC)
The problem is that the DOI is wrong. Once a citation script gets ahold of an incorrect DOI, there's really no stopping the compounding errors. Folly Mox (talk) 19:48, 16 January 2024 (UTC)
Seems to be "Garbage In, Garbage Out". Grimes2 (talk) 19:55, 16 January 2024 (UTC)
The problem is that the wrong DOI was added by Citation bot (an old fixed bug). And then over the course of two more edits Citation bot took that wrong DOI as an excuse to add more and more wrong metadata until the citation was totally trashed. The problem is that this sort of bot edit can amplify earlier bot mistakes into bigger mistakes, and over time that noise comes to dominate any signal. —David Eppstein (talk) 20:29, 16 January 2024 (UTC)
The review is subscription-only content under the MR. If you are not a subscriber you will only see the metadata for the book that it reviews. If you are a subscriber you will see the review: a paragraph of text, beginning "The author gives a nice and very clear survey on some planar elliptic problems ..." —David Eppstein (talk) 20:27, 16 January 2024 (UTC)
I'm all for people activating Citation bot taking more care in making sure that Citation bot's edits aren't creating template errors or garbling references. I've been slowly gnoming away errors via Special:RandomInCategory/CS1 errors: periodical ignored, and low key recording which user activated Citation bot where Citation bot is responsible for the error and I bothered to check the history. Results can be seen by searching my recent contributions for "activator" (changed from "suggested by"). Most of the major Citation bot users make appearances there, and the sample size is still really low, but the vibe seems to be that the people who use Citation bot to run over a whole category don't appear to check in on its output after a run, which they should definitely be doing.
That said, I'm not sure how anyone could have detected the problem under discussion in this thread after the most recent run. Even with the stable identifier MR 2500525, I don't see any mention of the reviewer Pierpaolo Esposito, and haven't been able to find this review with a manual search of the AMS website or with google scholar. At least on my device, the linked review has no information that it's a review at all, and just gives the information about the reviewed work. David Eppstein, is there more information there for subscribers? As it stands, I don't know if there's any way a script could have avoided this error. Folly Mox (talk) 20:18, 16 January 2024 (UTC)
Yes, the review is visible to subscribers. —David Eppstein (talk) 20:46, 16 January 2024 (UTC)
I do a spot check of category runs and report malfunction here. Grimes2 (talk) 21:37, 16 January 2024 (UTC)
Grimes2, I haven't seen you leaving Citation bot errors unaddressed during my maintenance category repair work. Thanks for your diligence. The error under discussion here seems it would have been difficult to spot without an AMS subscription. Maybe for citations to this source going forward, the citation template syntax should include a string instructing Citation bot to ignore it. Bypass comments could also be added to existing citations to Mathematical Reviews in an AWB run, to prevent any further garbling. Folly Mox (talk) 22:13, 16 January 2024 (UTC)
The specific error of adding a DOI to an MR review citation was fixed long ago. My concern here is more general: that repeated bot runs tend to amplify earlier bot errors, so that the more times the bot is run on the same citations, the less likely it is to be a good citation. We need some mechanism to cut the bad feedback loop early. —David Eppstein (talk) 22:42, 16 January 2024 (UTC)
It's not so bad to run citation bot again on earlier bot errors, because the error becomes obvious (red error message/error category). Could citation bot detect those red error messages? Grimes2 (talk) 22:57, 16 January 2024 (UTC)
It would be pretty convenient if Citation bot could read the categories of the pages it edits before and after, and if it adds a page to a maintenance category, notify the activator on their talkpage. Folly Mox (talk) 00:19, 17 January 2024 (UTC)
User:Qwerfjkl (bot) (Task 17) is doing this. Grimes2 (talk) 17:35, 18 January 2024 (UTC)
Folly Mox, if there are any categories that my bot doesn't currently monitor that you're thinking of, I am happy to add them. — Qwerfjkltalk 21:23, 25 January 2024 (UTC)
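The before/after category comparison suggested in this thread could look roughly like the following. This is only a minimal sketch: it is not how Citation bot or Qwerfjkl (bot) actually works, and the function name, the `MONITORED` set, and the assumption that the caller already has both category lists are all hypothetical.

```python
# Hypothetical sketch: report maintenance categories that an edit newly added.
# MONITORED is an illustrative assumption, not a list used by any real bot.
MONITORED = {
    "CS1 errors: periodical ignored",
}


def newly_added_error_categories(before, after, monitored=MONITORED):
    """Return monitored categories present after the edit but not before it."""
    return sorted((set(after) - set(before)) & set(monitored))
```

A non-empty result would be the trigger for notifying the activator; how the notification is delivered is left out of the sketch.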

Stripping wikilinks when changing |work to |journal

Status
new bug
Reported by
Levivich (talk) 20:09, 1 February 2024 (UTC)
What happens
|work=[[American Political Science Association]] 2010 Annual Meeting Paper -> |journal=American Political Science Association 2010 Annual Meeting Paper
What should happen
|work=[[American Political Science Association]] 2010 Annual Meeting Paper -> |journal=[[American Political Science Association]] 2010 Annual Meeting Paper (wikilink should be retained)
Relevant diffs/links
Special:Diff/1201974944 line 743
We can't proceed until
Feedback from maintainers


Use {{Cite SSRN}} for citing SSRN papers. Headbomb {t · c · p · b} 10:25, 2 February 2024 (UTC)

Books do not have "issues"

Status
  Fixed
Reported by
Kusma (talk) 13:51, 12 March 2024 (UTC)
What happens
Bot turns volume "2.1" for a book into "volume 2, issue 1". The issue number isn't even displayed for books.
What should happen
Leave the volume number alone. Books can have volume numbers like 2.1 or 2/1 or similar, and rarely have "issue numbers" like journals.
Relevant diffs/links
[17]
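One way to express the requested behaviour is a guard that only splits a dotted volume for journal citations. This is a sketch under assumptions, not the bot's actual code: the function name, the `SPLITTABLE_TEMPLATES` set, and the choice to split journal volumes at all are hypothetical.

```python
import re

# Hypothetical guard: only split a "2.1"-style volume into volume/issue
# for journal citations; leave book volumes untouched.
SPLITTABLE_TEMPLATES = {"cite journal"}  # assumption for illustration


def split_volume(template_name, volume):
    """Return (volume, issue); issue is None when nothing is split."""
    name = template_name.strip().lower()
    match = re.fullmatch(r"(\d+)\.(\d+)", volume.strip())
    if name in SPLITTABLE_TEMPLATES and match:
        return match.group(1), match.group(2)
    return volume, None
```

With this shape, `split_volume("cite book", "2.1")` leaves the volume alone, and forms such as "2/1" are never touched for any template.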


Breaks citation to 2nd ed of book by adding pmid/doi info from 1st ed, labeled as a journal

Status
  Fixed
Reported by
David Eppstein (talk) 17:49, 12 March 2024 (UTC)
What happens
In Special:Diff/1212531849, the bot borked yet another citation forcing manual cleanup. The citation in question goes to the 1969 2nd Dover edition of Neugebauer's book The Exact Sciences in Antiquity. Instead, the bot added metadata from the 1957 first edition (matching the "orig-year" parameter), published as a book in Acta Historica Scientiarum Naturalium et Medicinalium. By adding this using a |journal= parameter (invalid for cite book), instead of the correct |series= parameter, the bot broke the citation, causing it to emit an error message. The pmid that appears to have triggered this bad edit was also previously added by Citation bot, in 2020, in Special:Diff/985257358.
What should happen
Not that. At a bare minimum, the bot should never add |journal= to {{cite book}}.
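The "bare minimum" rule requested here could be sketched as a lookup that is consulted before any parameter is added. The table and function below are hypothetical; the only entry is the one stated in this report (|journal= is invalid for {{cite book}}).

```python
# Hypothetical table of parameters that must never be added to a template.
# The single entry comes from the report above; it is not a complete list.
FORBIDDEN_ADDITIONS = {"cite book": {"journal"}}


def may_add_parameter(template_name, parameter):
    """Return False when adding `parameter` would be invalid for the template."""
    name = template_name.strip().lower()
    return parameter.strip().lower() not in FORBIDDEN_ADDITIONS.get(name, set())
```

Whether the rejected value should instead go into |series=, as suggested for this particular citation, is a separate judgment the sketch does not attempt.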


Semantic scholar links continue to mostly consist of spam

Can Citation bot please stop littering every s2cid it can find wherever it can possibly fit? The vast majority of these links contain zero useful information beyond a (redundant) link to the publisher's website (typically paywalled), and putting them on every citation in Wikipedia is more or less spam. It's a distracting waste of space with no redeeming benefits.

The easiest solution here would be to deprecate the s2cid parameter from the citation templates, hide them from the output, and just be done with it.

Next best, probably my personal recommendation, would be that only humans should ever add s2cid links (and ideally the ones which were added by a bot in the past should be removed), or barring that that a human should manually review any s2cid that gets added by any bot. At the very very least, the bot should try to check them for meaningful content and skip the vast majority of totally useless ones going forward. –jacobolus (t) 18:13, 20 October 2023 (UTC)

  • Agree totally; please stop the s2cid spam. Esculenta (talk) 18:48, 20 October 2023 (UTC)
    Agreed as well! They only got added because of someone who works for Semantic Scholar (Help talk:Citation Style 1/Archive 66#Request to add Semantic Scholar IDs to the citation template). If there is truly a consensus among editors working on a page that it would improve the citation to include an |s2cid= … fine I guess, but a vast majority of the time someone who has never edited a given article runs the prompt and the bot clutters up all the citations with a spammy parameter without any human editors actively wanting it there. Umimmak (talk) 18:59, 20 October 2023 (UTC)
    Although I don't agree that s2cid is spam, still, the point is not whether it is spam or not, but how to tell the bot not to add this attribute.
    One option could have been via a template. For example, in cs1 config we may add an attribute s2cid=disabled (or any other boolean value that means no or false or zero). Another option is to use "bots" template. For example, on my user page I can specify {{bots|optout=cs1-errors}}. We may add an attribute such as {{bots|optout=s2cid}}
    Whichever option you prefer, we need a consensus. With a consensus, I can ask the citation bot developers to accept this feature via my source code pull request. Maxim Masiutin (talk) 00:29, 4 January 2024 (UTC)
    In my opinion the bot should never add this template parameter, and should remove every existing one that was ever added by a bot. In theory, the parameter would be okay in cases where it adds a new unique access to the full text which was not otherwise available. I have literally never seen this happen in practice. –jacobolus (t) 04:47, 4 January 2024 (UTC)
    I saw that. Maxim Masiutin (talk) 04:53, 4 January 2024 (UTC)
I would also be supportive of "deprecate the s2cid parameter from the citation templates, hide them from the output, and just be done with it", along with stopping the bot from adding them. Unlike most of the other codes we use, I cannot remember ever seeing a case where these were useful. Stopping the bot is on-topic here but the other stuff should probably be discussed on Help talk:Citation Style 1, which is the centralized discussion point for all the citation and cite templates. —David Eppstein (talk) 20:22, 20 October 2023 (UTC)
I think this depends on which articles you are reviewing. There are plenty of useful places like S2CID 16831869. Citation bot already avoids adding s2cid where there are no sources.  — Chris Capoccia 💬 19:08, 25 October 2023 (UTC)
The example you cited is a poor example, because the publisher's page is open access; this citation should use doi-access=free and not include an s2cid. Citation bot already avoids adding s2cid where there are no sources – This is nowhere close to accurate. Citation bot adds tons of completely vacuous s2cids that provide no information beyond a link to the publisher page, more or less analogous to blogspam. –jacobolus (t) 19:14, 25 October 2023 (UTC)
You're not paying attention to what I wrote. Yes it adds s2cid where only link is publishers and same as DOI. But it does not add s2cid where there are no sources.   — Chris Capoccia 💬 15:30, 26 October 2023 (UTC)
What do you think the point is of adding an S2CID containing no meaningful content beyond a link to the publisher's website which was also already included in the citation template? From my perspective, such S2CIDs are spam with zero redeeming value. –jacobolus (t) 15:57, 26 October 2023 (UTC)
I don't agree with you. The bot is a tool, and there is nothing wrong in the tool to add s2cid if it is a legitimate attribute in the cs1 module. If you think that s2cid should not be used in Wikipedia, ask for the removal from the cs1 module, not from the bot. Maxim Masiutin (talk) 17:41, 16 February 2024 (UTC)

Came across some more S2CID spam today which led me to this conversation. Is there an actual way to have an RfC or something for this? It's fine if humans want to add it, but for something with a DOI already there, having a bot add something that is pretty useless doesn't help. Why? I Ask (talk) 05:15, 12 November 2023 (UTC)

There was a comment about s2cid being useful when their servers are down, but Portico (https://www.portico.org/why-portico/) might be a good alternative. --SilverMatsu (talk) 03:01, 5 December 2023 (UTC)
Portico only helps when "triggered" (e.g. the publisher goes bankrupt). Internet Archive scholar keeps track of the available archives and is more suitable for such a use case: see #Add Internet Archive Scholar links. Nemo 13:24, 4 January 2024 (UTC)
I personally like Semantic Scholar but I never use s2cid links from English Wikipedia citation templates. It's one of those IDs which are useful sometimes when everything else fails, but should probably be hidden by the citation templates in most cases. I don't know whether it's realistic to get such a change implemented in the citation templates though. Nemo 13:24, 4 January 2024 (UTC)

Citation bot continually adds back particular S2CIDs even when they have been explicitly manually removed by humans for being useless. Can whoever controls this bot please stop such behavior? (Or ideally just get rid of S2CIDs altogether?) Otherwise I am encouraged to ban Citation bot from editing particular article pages altogether. –jacobolus (t) 15:59, 16 February 2024 (UTC)

The bot does not edit the pages by itself; it is asked by people to edit the pages or is used as a gadget. Talk to the people who ask the bot to edit the pages you are referring to. Maxim Masiutin (talk) 16:04, 16 February 2024 (UTC)
People invoking Citation bot never check the edits, and should not be responsible for doing so if you want to have Citation bot continue to operate as it currently does. They certainly aren't going to be manually double checking every bit of metadata spam to check whether it's useful or not. –jacobolus (t) 16:30, 16 February 2024 (UTC)
You raise a good point. People are responsible for their edits whether they are made by tools or by bots. As for consensus on whether Citation bot should add the S2CIDs, there is no consensus against adding them; therefore, you don't have grounds for banning the bot. Maxim Masiutin (talk) 17:39, 16 February 2024 (UTC)
The S2CIDs in question contain literally zero useful information. They typically have a subset of the metadata already included in the Wikipedia article, and are nothing more than a redundant spam wrapper around the DOI, which typically points at a paywalled publisher website. Including them in Wikipedia serves no encyclopedic purpose. Readers do not benefit from these links, and any reader who cares to find a semantic scholar link can trivially find it for themself. The CS1/CS2 templates, Citation bot, and ultimately Wikipedia are abusing readers' attention for strictly marketing purposes, which violates core Wikipedia principles.
If we're going to include these marketing links we might just as well also include links to every other citation index (Google scholar, Microsoft Academic, Scopus, and so on), book selling website (Amazon, etc.), etc. These are typically more valuable than S2CIDs, and if spam marketing is fair game, the more the merrier right?
If Citation Bot insists on adding these to pages even after they have been deliberately manually removed by human editors, it should be blocked from those pages by adding {{bots|deny=Citation bot}} to the pages. But a better solution would be for the Citation bot developers to stop spamming Wikipedia with these links. –jacobolus (t) 18:08, 16 February 2024 (UTC)
Yeah I’ve been so frustrated by this and other changes that I tend to just block the bot… maybe once in a blue moon it’s useful to have a semantic scholar link, but it certainly shouldn’t be part of a standard citation. And it’s frustrating when editors who have never worked on a page before just run citation bot, can’t preview their edits to see if they make sense, and don’t check to see if the edit which was made was an improvement. The bot keeps adding it to more and more pages, creating the impression of a growing consensus, unless editors are on constant vigilance to block/revert the bot. Umimmak (talk) 20:27, 16 February 2024 (UTC)
+1 I've accepted it as a part of my digital existence here to routinely have to play s2cid bot-revert pong, but I dream of a different future. Esculenta (talk) 20:51, 16 February 2024 (UTC)
In the meanwhile, I can make a script that looks for all {{cite}} templates with s2cid parameters and, when it finds one, does one of the following:
  1. deletes the s2cid parameter from that template and puts <!-- Deny Citation Bot--> on the whole template;
  2. removes the value from the s2cid parameter and uses <!-- Deny Citation Bot--> for this value only, to keep it empty.
It will be a script similar to other scripts, such as Wikipedia:AutoEd. Maxim Masiutin (talk) 00:28, 17 February 2024 (UTC)
If there is a consensus that the bot should not add S2CIDs, the developers of the bot will remove it, I hope. Maxim Masiutin (talk) 22:22, 16 February 2024 (UTC)
Bots are required to only perform tasks for which consensus exists. Where was consensus for the bot to add these links established? Nikkimaria (talk) 22:27, 16 February 2024 (UTC)
@Nikkimaria the bot aimed to expand citations. Maxim Masiutin (talk) 23:03, 16 February 2024 (UTC)
Maxim Masiutin, I am aware of what the bot is intended to do. My question is, where was consensus for this specific addition established? Nikkimaria (talk) 23:06, 16 February 2024 (UTC)
Thank you, I understood your question. You can search the requests for approval and look at the contents of the requests. Perhaps your question is whether the bot has been modified after the approval? Maxim Masiutin (talk) 23:22, 16 February 2024 (UTC)
I looked at the requests and did not see specific mention of this addition. Nikkimaria (talk) 23:31, 16 February 2024 (UTC)
@Nikkimaria maybe the intended and approved use is to expand all parameters that exist in cs1/cs2? Have you seen the full list of parameters which were explicitly authorized? Maxim Masiutin (talk) 23:45, 16 February 2024 (UTC)
@Nikkimaria if you find the explicit lists of approved attributes, such as first1/last1, let us know Maxim Masiutin (talk) 23:49, 16 February 2024 (UTC)
The bot's most recent approval predates the implementation of this parameter by almost a decade, so if there was a parameter list associated with its approvals it could not possibly have included this. Nikkimaria (talk) 23:56, 16 February 2024 (UTC)
@Nikkimaria afaik, if you set particular attribute to empty value, it will not expand it, so it may be one of solutions. Maxim Masiutin (talk) 23:58, 16 February 2024 (UTC)
I don't think there was ever explicit consensus for it (which is okay enough; people should try to improve the encyclopedia in the best way they can, and others who disagree can start a conversation about it). The feature was added to the citation templates at the direct request of the creators of Semantic Scholar. See CS1 talk archives § Request to add Semantic Scholar IDs to the citation template. –jacobolus (t) 23:58, 16 February 2024 (UTC)
@Jacobolus I guess the bot just supports all supported attributes that are within its scope. The bot supports excluding a page, a citation, or a particular attribute from expansion. Maxim Masiutin (talk) 00:01, 17 February 2024 (UTC)
It's okay enough for human editors, but not bots - if you are correct then the task should be removed. Nikkimaria (talk) 00:08, 17 February 2024 (UTC)
@Nikkimaria: As far as I can tell this discussion happened entirely on GitHub [18], originating from a request made by an employee of Semantic Scholar. Thanks to Izno for finding that and adding it to past discussion at User talk:Citation bot/Archive 19 § Semantic scholar 2. Umimmak (talk) 05:02, 17 February 2024 (UTC)
Thanks for confirming. @AManWithNoPlan: please disable this. Nikkimaria (talk) 05:15, 17 February 2024 (UTC)
@AManWithNoPlan: it appears the bot is still adding these - could you please fix that? Nikkimaria (talk) 23:57, 12 March 2024 (UTC)
  • Still playing s2cid bot-revert pong on a daily basis. Do we need an RFC to deal with this? Esculenta (talk) 14:58, 3 March 2024 (UTC)
  • @Esculenta we can make a tag for a page to tell bots not to insert this attribute. I can implement this tag and submit a pull request to the developers of the bot, but for now there are tags to tell the bot not to expand citations at all, per individual citation or per page, or not to add the id for a particular citation. Maxim Masiutin (talk) 18:50, 3 March 2024 (UTC)
    We can automate this task with a script; give me the pages that you reverted and I will show you what I mean. Maxim Masiutin (talk) 18:52, 3 March 2024 (UTC)
    That's not a good solution in my opinion. We shouldn't have to litter every page's markup with instructions telling bots not to litter. –jacobolus (t) 18:58, 3 March 2024 (UTC)
    I think a better solution is to disable it completely, and have whoever wants this added as a permanent feature go through the bot approval process and request community feedback on the implementation of new identifiers. Esculenta (talk) 19:34, 3 March 2024 (UTC)
    @Esculenta of course there is always right to start an RFC. Maxim Masiutin (talk) 19:38, 3 March 2024 (UTC)

  Fixed AManWithNoPlan (talk) 14:27, 15 March 2024 (UTC)

@AManWithNoPlan – can you describe what "fixed" means? Citation bot will no longer add S2CIDs? –jacobolus (t) 14:40, 15 March 2024 (UTC)
It will only add the link if there is not already a free link, the s2cid is licensed, and (this is the big one) they have a link to a PDF. That last one will stop almost all of the links, other than ones that actually are useful. AManWithNoPlan (talk) 14:46, 15 March 2024 (UTC)
Awesome, thanks! –jacobolus (t) 15:03, 15 March 2024 (UTC)
I will archive, but note that existing runs will not see the change, so long category runs might take a while. I will now archive. AManWithNoPlan (talk) 15:20, 15 March 2024 (UTC)
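The three conditions described in the fix can be written as a single predicate. This is only an illustrative sketch of the rule as stated in this thread, not the bot's actual code; the function and argument names are hypothetical, and how each fact is obtained from Semantic Scholar is not covered.

```python
def should_add_s2cid(has_free_link, s2cid_is_licensed, s2cid_has_pdf_link):
    """Return True only when all three conditions from the thread hold.

    has_free_link: the citation already has a free full-text link.
    s2cid_is_licensed: the Semantic Scholar record is licensed.
    s2cid_has_pdf_link: the Semantic Scholar record links to a PDF.
    """
    return (not has_free_link) and s2cid_is_licensed and s2cid_has_pdf_link
```

Because the PDF requirement is conjunctive, a record that only wraps the publisher's page never qualifies, which matches the stated intent of stopping almost all of the links.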

Two Citation Bot false positive errors

Status
  Fixed
Reported by
UndercoverClassicist T·C 09:39, 8 March 2024 (UTC)
What happens
Two errors where the bot has added parameters in error:
  • In Beulé Gate, given the citation {{cite journal|last1=Billard | first1=Yves| last2=Chandezon| first2=Christophe| year=2012| title=Ernest Beulé (1826–1874). Archéologie classique, histoire romaine et politique sous Napoléon III| trans-title=Ernest Beulé (1826–1874). Classical Archaeology, Roman History and Politics under Napoleon III| journal=Liame| volume=24| url=http://journals.openedition.org/liame/277| access-date=2024-02-09| doi=10.4000/liame.277| lang=fr|issn=2264-623X| doi-access=free}}, it mistakenly added an additional |issue=24 (diff)
  • In PY Ta 641, given the citation {{cite book|last=Judson|first=Anna P.|year=2020|title=The Undeciphered Signs of Linear B: Interpretation and Scribal Practices|publisher=Cambridge University Press|isbn=9781108859745|doi=10.1017/9781108859745}}, it mistakenly added |url=https://www.repository.cam.ac.uk/handle/1810/265630. This URL links not to the book but to Judson's PhD thesis by the same name. (diff)

In the meantime, I've marked the citations with comments so that the bot doesn't get to them.


I have reported the URL to the open access button as a mistake, and fixed the other. AManWithNoPlan (talk) 17:25, 11 March 2024 (UTC)

Adds URL for specific chapter of book The SAGE Handbook of Domestic Violence when citation should be to the whole book

Status
  Fixed by adding to bad data list
Reported by
  — Chris Capoccia 💬 01:33, 15 March 2024 (UTC)
What happens
expanding DOI 10.4135/9781529742343 correctly creates book citation but incorrectly adds URL for specific chapter
Relevant diffs/links
diff
We can't proceed until
Feedback from maintainers


I have reported this meta-data error to the openaccess button people. AManWithNoPlan (talk) 15:01, 15 March 2024 (UTC)

Bork bork bork

Status
  Won't fix, but I manually fixed the wrong doi/link/etc on the page
Reported by
David Eppstein (talk) 03:03, 5 February 2024 (UTC)
What happens
In two consecutive edits last summer, Citation bot modified a journal citation with an incorrect date and doi, but otherwise correct metadata, by adding an incorrect isbn and s2cid pointing to the 1989 conference version of the same paper (Special:Diff/1169023860), and then relied on the bogus metadata it had just added to convert the citation to a book citation, adding the book title but leaving the journal title in place and creating a borked citation template (Special:Diff/1171957733). This error in a BLP went unfixed until I found it just now.
What should happen
Not that.
We can't proceed until
Feedback from maintainers


ZooKeys

Status
  Won't fix, since rare and GIGO
Reported by
Chidgk1 (talk) 12:33, 16 March 2024 (UTC)
What happens
changes journal to book
What should happen
according to Wikipedia ZooKeys is a journal so either the article should be corrected or the bot should not change journal to book
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Panthera_pardus_tulliana&diff=1214002029&oldid=1214000596
We can't proceed until
Feedback from maintainers


The reference in question was a book style reference to a journal. The bot guessed wrong. I have since cleaned up the reference by hand some. AManWithNoPlan (talk) 14:31, 16 March 2024 (UTC)

fails

Status
  Not a bug of the bot, but general finickiness of the gadget
Reported by
CommonKnowledgeCreator (talk) 14:40, 15 March 2024 (UTC)
What happens
When attempting to run bot, website returns error message that states "en.wikipedia.org says Error: Citations request failed".
What should happen
Bot should run
We can't proceed until
Feedback from maintainers


Often this is a result of the editor having other incompatible options enabled. Please see the help page for the gadget tool https://en.wikipedia.org/wiki/MediaWiki_talk:Gadget-citations.js AManWithNoPlan (talk) 15:01, 15 March 2024 (UTC)

Grove online

Status
  Not a bug
Reported by
86.177.202.175 (talk) 18:50, 27 December 2023 (UTC)
What happens
partially broken citation
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Division_viol&diff=1191904104&oldid=1191729983


Afaik, Grove Online is routinely cited inline using Template:Cite web, which (unlike Template:Cite Grove) allows for inclusion of actual author information. Afaik, this is correct and therefore does not require automated correction. 86.177.202.175 (talk) 18:50, 27 December 2023 (UTC)

The various Grove wrapper templates (see Template:Cite Grove § See also) use {{cite encyclopedia}}. I think that you are mistaken in your claim that {{cite Grove}} does not allow for inclusion of actual author information.
Trappist the monk (talk) 20:45, 27 December 2023 (UTC)
Uhm, yes... I was mistaken:) Thanks, 86.177.202.175 (talk) 21:44, 27 December 2023 (UTC)
I've tried with Cite Grove, like this (though to my eyes it looks a bit 'busy'). Fwiw, in the paaast, I think I've seen refs like this changed to Cite web (perhaps because of it being the 'Online' version?) 86.177.202.175 (talk) 22:01, 27 December 2023 (UTC)

Bot confused by chapter title = book title

Status
  Fixed
Reported by
David Eppstein (talk) 23:18, 5 March 2024 (UTC)
What happens
In Special:Diff/1138690565, the bot removed the |chapter= parameter from a reference to the chapter "Eulerian Numbers" in the book Eulerian Numbers, possibly out of confusion because of the fact that the chapter and the book have the same title. This left the reference in a state where it cited the whole book but didn't name the chapter within it that its doi and page numbers pointed to. Then, in Special:Diff/1211621780, the bot decided to use the doi to fill in the chapter parameter once more, but in doing so it removed the |title= parameter of the reference, again likely out of confusion from the equality of titles. The combination of these two edits left the reference in a broken state without a book title. This had already happened twice before, in Special:Diff/984898465, Special:Diff/1068697840, so the bot has broken the same reference in the same way at least three times, going back at least to 2020.
What should happen
None of those things


That is an interesting problem (I know someone whose first and last name are the same, and it causes similar confusion with people). I will look into ways to detect that. I have added comments to both parts to fix the specific page. AManWithNoPlan (talk) 15:14, 6 March 2024 (UTC)
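One possible safety net for this class of problem is a post-edit check that refuses any change which drops a title parameter that was present before. The sketch below is an assumption about how such a check might look, not the detection that was eventually implemented; the function name and the dict representation of a citation are hypothetical.

```python
def edit_keeps_book_and_chapter(before, after):
    """Check that an edit did not drop |title= or |chapter= from a citation.

    `before` and `after` are dicts mapping parameter name to value.
    """
    for param in ("title", "chapter"):
        had_value = bool(before.get(param, "").strip())
        has_value = bool(after.get(param, "").strip())
        if had_value and not has_value:
            return False
    return True
```

For the "Eulerian Numbers" reference, both of the reported edits would fail this check, regardless of whether the chapter and book titles happen to be equal.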

Removed proxy/dead URL that duplicated identifier

There is a message in the edit summary "Removed proxy/dead URL that duplicated identifier".

I find this message frightening and even misleading, because the word "dead" is frightening in itself; however, in most cases the URL is not dead but just duplicates an identifier such as PMID or DOI, for example doi=10.15347/WJM/2023.003|url=https://doi.org/10.15347/WJM/2023.003 or pmid=35987379|url=https://pubmed.ncbi.nlm.nih.gov/35987379/. These URLs are not dead, and they also cannot be considered "proxy" in a classical sense. Please consider removing "proxy/dead" from the message so it will be just "Removed URL that duplicated identifier", for the following reasons:

  1. The term "proxy/dead" might be confusing for users who are not familiar with the terminology; also, these messages are read not only by the users of the citation bots but by all Wikipedia editors; if the bot modified an article those editors were working on, they would see this frightening message without ever using the bot. A more straightforward message "Removed URL that duplicated identifier" would be easier to understand and more accurate.
  2. The term "dead" is often associated with broken or inaccessible links, which is not the case here. The URLs are functional and simply duplicate the identifier. Therefore, using "dead" might lead to misunderstandings. The word "dead" can have negative connotations and might cause unnecessary alarm. Using neutral language would contribute to a better user experience.
  3. The term "proxy" in a classical sense refers to a server that acts as an intermediary for requests from clients seeking resources from other servers. In this context, it might not be the most appropriate term to use.

Please remember that these messages go to the edit summaries, which are kept forever; these are not just log messages that only one user will see. Therefore, we should be very cautious about the edit summaries that we leave. Maxim Masiutin (talk) 09:19, 27 March 2024 (UTC)

There was a time when that was 100% correct, but I have   Fixed this to match current reality. AManWithNoPlan (talk) 14:08, 27 March 2024 (UTC)
@AManWithNoPlan thank you very much! Maxim Masiutin (talk) 14:16, 27 March 2024 (UTC)
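For readers wondering what "duplicated identifier" means in practice, the two examples given in this thread can be checked with a few lines of code. This is a hypothetical sketch, not the bot's implementation: the function name is invented, only the doi.org and PubMed URL shapes quoted above are handled, and percent-encoded DOIs are not decoded.

```python
from urllib.parse import urlsplit


def url_duplicates_identifier(url, doi=None, pmid=None):
    """Return True when `url` is just the resolver link for `doi` or `pmid`."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    path = parts.path.strip("/")
    # DOIs are case-insensitive, so compare them in lower case.
    if doi and host in ("doi.org", "dx.doi.org"):
        return path.lower() == doi.strip().lower()
    if pmid and host == "pubmed.ncbi.nlm.nih.gov":
        return path == str(pmid).strip()
    return False
```

Neither check says anything about the URL being dead or a proxy, which is the point made in the request: the link works, it is simply redundant with the identifier.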

wrong author and things

Status
{{fixed}} - the DOI meta-data is weird. Bot will now reject that author and titles that end in .pdf
Reported by
Spinixster (trout me!) 07:42, 26 March 2024 (UTC)
What happens
this
What should happen
Not that, I changed the link and tried converting with the bot again. It did not work, so I manually fixed it.
We can't proceed until
Feedback from maintainers


I'm guessing the bug is due to the page being a download page. A way to resolve this, I think, is to convert the download page into the original page. Spinixster (trout me!) 07:42, 26 March 2024 (UTC)

Since when is it ok and not a violation of WP:CITEVAR for the bot to reformat manually-formatted citations to use the citation templates? Despite the edit summary "Changed bare reference to CS1/2", this was not a bare-url reference; it was formatted, but manually formatted. There are many reasons to use manual formatting for references, among the most salient being not wanting the bots to mess with the citations. —David Eppstein (talk) 19:24, 27 March 2024 (UTC)

Error with adding issue param of a letter when none is needed

Hello, I have noticed the bot has made an error on the Kingsman (franchise) page each of the three times it has been used on it (first by me when I was cleaning it up and noticed it, and then by two other editors performing standard use). The edits in question are the same: here and here. It appears the bot is looking for an |issue= use in an archived dead CBR citation which has a quote in it, and the bot is pulling the "C" from the "U.N.C.L.E." bit of the quote as an instance of |issue= when it is not. It removes the "N.C." from the word, thus breaking the link as a result. Trailblazer101 (talk) 01:53, 2 April 2024 (UTC)

  Fixed - that was some very old code. AManWithNoPlan (talk) 13:47, 2 April 2024 (UTC)
Thank you so much! Trailblazer101 (talk) 17:06, 2 April 2024 (UTC)

1992 Sussex Arms bombing

Hi, I get why you removed the cite url, but why does the access date have to be removed as well? Sussex Arms pub bombing 78.152.229.53 (talk) 20:26, 30 March 2024 (UTC)

Amazon URLs should not be included if there is an ISBN. AManWithNoPlan (talk) 22:25, 30 March 2024 (UTC)
If there is no URL, the access date is pointless and will throw off an error. It's often pointless even when there is a url. Headbomb {t · c · p · b} 05:47, 31 March 2024 (UTC)
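The rule described in these replies, that an access date is dropped once the URL is gone, can be sketched as follows. This is an illustrative assumption about the logic rather than the bot's code; the function name is hypothetical, and the sketch only looks at |url=, ignoring other URL-holding parameters.

```python
def drop_orphan_access_date(params):
    """Return a copy of `params` without |access-date= when |url= is empty.

    `params` is a dict mapping citation parameter names to values.
    """
    cleaned = dict(params)
    if not cleaned.get("url", "").strip():
        cleaned.pop("access-date", None)
    return cleaned
```

The input dict is left unmodified, so a caller can compare before and after to describe the change in an edit summary.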

apos/amp/quote in trans-title

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 19:56, 3 April 2024 (UTC)
What should happen
[19] / [20]
We can't proceed until
Feedback from maintainers


gadget

Status
  Won't fix
Reported by
Alon Alush
What should happen
Page shows the bot's changes after it finishes analyzing
Relevant diffs/links
"Error: citations request failed" box shows up
Replication instructions
Enable the citations expander gadget in Wikipedia. Press the Citations button on pages. After waiting for a bit, this error box will show up around 85% of the time. See image


Not fixable. This is generally a result of the web-browser giving up too soon. AManWithNoPlan (talk) 13:12, 26 March 2024 (UTC)

@Alon Alush:, see previous discussion at User talk:Citation bot/Archive 38#Please remove claim that editors may use the bot to check a single article, because it consistently fails. I don't know of any browser that is capable of giving such a site-specific message, but it is fruitless to spend time arguing the whys and wherefores. The fact of the matter is that the Citations button will fail if the article is longer than a couple of screenfuls. Don't waste your time, just use Expand citations from the tools column left of the editing page. --𝕁𝕄𝔽 (talk) 16:20, 26 March 2024 (UTC)
If for some reason, you want to edit just part of a page and the button fails, then copy the section (maybe the whole page???) to a sandbox and use the Expand citations option. AManWithNoPlan (talk) 14:11, 27 March 2024 (UTC)

Edit conflict

Status
{{notabug}}
Reported by
LeapTorchGear (talk) 08:49, 22 April 2024 (UTC)
What happens
[07:39:50] Processing page 'Draft:Scottish mother's day' — edit—history

>Remedial work to prepare citations

>Consult APIs to expand templates

>Expand individual templates by API calls.. nothing found.. no record retrieved.

>Remedial work to clean up templates.

>Writing to Draft:Scottish mother's day...

  !API call failed: Edit conflict..  Will sleep and move on.
  !Unhandled write error.  Please copy this output and report a bug.  There is no need to report the database being locked unless it continues to be a problem. .
  !Possible edit conflict detected. Aborting.
 diff
We can't proceed until
Feedback from maintainers


That's an edit conflict. Just rerun the bot on the page. Headbomb {t · c · p · b} 09:12, 22 April 2024 (UTC)

Date bug

Status
  Fixed
Reported by
Stevie fae Scotland (talk) 20:04, 14 April 2024 (UTC)
What happens
Twice today at 2022 Glasgow City Council election, the bot has added |date=14 April 2022 to a citation which can't possibly have been published then. The election took place in May 2022 so the information in the source can't have been published before that.
Relevant diffs/links
See diffs: [21] [22]


Question on citation bot version on Toolforge and the gadget button

How can I query the citation bot version (e.g. a Github commit tag or date):

  1. that currently runs on Toolforge at https://citations.toolforge.org/;
  2. that is invoked from Wikipedia "Citations" button of the Citation expander gadget enabled in the Wikipedia user preferences?

Maxim Masiutin (talk) 08:23, 20 April 2024 (UTC)

It is usually the latest and greatest. SVN allowed version numbers to be added automatically; I don't think Git allows that. AManWithNoPlan (talk) 12:10, 21 April 2024 (UTC)
In Git, the version number is usually a commit hash. Maxim Masiutin (talk) 13:02, 21 April 2024 (UTC)
The edit summary is already too long. AManWithNoPlan (talk) 00:21, 25 April 2024 (UTC)
I don't mean the edit summary, but maybe you could implement a query that returns the current version, or a URL that shows the version currently in use. Maxim Masiutin (talk) 00:25, 25 April 2024 (UTC)
Just look on github, if you want to know. AManWithNoPlan (talk) 00:36, 25 April 2024 (UTC)
OK, thank you! I didn't know that the version at the "Citations" button and on citations.toolforge.org is the most current; I thought there was a time lag between GitHub and the Wikipedia/Toolforge use. Maxim Masiutin (talk) 01:06, 25 April 2024 (UTC)
The delay is usually in the range of minutes. Although, once a run starts the bot version for that run does not change.   Won't fix. AManWithNoPlan (talk) 01:24, 25 April 2024 (UTC)

Submit a pipe-separated list to a queue instead of immediate processing for particular users

Can you please implement a way to submit a pipe-separated list of articles to expand via https://citations.toolforge.org/ to a queue, instead of immediate processing, for particular users who have a legitimate interest in it, such as me hunting for NULL DOIs? Currently, when I submit a list that cannot be processed quickly, my web browser shows me a timeout error and either no pages or only a few are processed, so I don't know where it stopped. I'd like to ensure that all pages are processed sooner or later if I am authorized to submit such requests.

  Won't fix AManWithNoPlan (talk) 00:36, 25 April 2024 (UTC)
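
For anyone hitting the same timeout, one client-side workaround is to split the list into small batches before submitting, so that a timeout loses at most one batch and it is clear where to resume. A minimal sketch, assuming nothing about the Toolforge interface: the function name and the default batch size of 10 are arbitrary choices for illustration, and the code only prepares the strings; it does not submit anything.

```python
def split_into_batches(pipe_separated, batch_size=10):
    """Split a pipe-separated list of page titles into smaller
    pipe-separated batches (hypothetical helper, illustration only)."""
    # Drop empty entries caused by stray or trailing pipes.
    titles = [t.strip() for t in pipe_separated.split("|") if t.strip()]
    # Rejoin consecutive slices of at most batch_size titles.
    return [
        "|".join(titles[i:i + batch_size])
        for i in range(0, len(titles), batch_size)
    ]
```

Each returned string can then be pasted in turn, noting which batch was last submitted.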

10.1016/j.heliyon is free access

Status
  Fixed
Reported by
Headbomb {t · c · p · b} 23:14, 22 April 2024 (UTC)
What should happen
[23]


invoking unreal template

Status
  Won't fix
Reported by
A876 (talk) 03:48, 23 April 2024 (UTC)
What happens
cite newspaper
What should happen
cite news


{{Cite newspaper}} redirects to {{Cite news}}.

That was implemented, and then people got all mad, saying that the cite newspaper template was radically different from cite news and that they had chosen it on purpose. AManWithNoPlan (talk) 00:37, 25 April 2024 (UTC)

Bot tries to convert extra text to date when it shouldn't

Status
  Fixed - removed that code
Reported by
:Jay8g [VTE] 06:51, 23 April 2024 (UTC)
What happens
This
What should happen
Probably no edit at all
Relevant diffs/links
[24]
We can't proceed until
Feedback from maintainers


It seems like whenever there is a string of numbers that looks sort of like a year in extra text floating around in a citation template, the bot decides that's a year -- even when those numbers are part of a longer string. This seems like a bad idea.:Jay8g [VTE] 06:51, 23 April 2024 (UTC)
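
To illustrate the distinction described above, here is a minimal sketch of matching a year only when it is not part of a longer digit string. This is not Citation bot's code (the bot is written in PHP, and the offending code was removed rather than patched); the helper name and the accepted range of 1500 to 2099 are assumptions made for the example.

```python
import re

# Hypothetical pattern, illustration only. A bare r"\d{4}" would also match
# inside longer digit runs such as "20191234". The lookarounds require that
# the four digits are not directly adjacent to another digit.
STANDALONE_YEAR = re.compile(r"(?<!\d)(1[5-9]\d{2}|20\d{2})(?!\d)")

def find_standalone_year(text):
    """Return the first standalone four-digit year in text, or None."""
    match = STANDALONE_YEAR.search(text)
    return match.group(1) if match else None
```

Even with such a guard, a standalone four-digit number in extra text is not necessarily a year, so making no edit remains the safer choice.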

When I run the bot on a citation template in which I have entered the DOI of a chapter, the bot does not add the author name to the template.

example (diff):

SilverMatsu (talk) 16:03, 24 April 2024 (UTC)

  Not a bug - the publisher has not documented the data in the DOI. AManWithNoPlan (talk) 00:25, 25 April 2024 (UTC)
Thank you for letting me know. I also tried using the book information URL, but it doesn't seem to work. However, the author's name seems to be written in the Citation Tools on the book information page.
example: "Adjoint functors". Handbook of Categorical Algebra. Encyclopedia of Mathematics and its Applications. Vol. 1. Cambridge University Press. 1994. pp. 96–131. doi:10.1017/CBO9780511525858.005. ISBN 978-0-521-44178-0.
--SilverMatsu (talk) 02:23, 25 April 2024 (UTC)

Citation bot is warring with itself on Hepatitis E

Status
{{fixed}} it, I think
Reported by
:Jay8g [VTE] 19:38, 23 April 2024 (UTC)
What happens
[25]
What should happen
Not sure which page number is correct, but the bot should not edit war with itself
We can't proceed until
Feedback from maintainers


I have blocked it on the page. Will investigate. AManWithNoPlan (talk) 00:29, 25 April 2024 (UTC)

Cosmetic edit for p -> page

Status
{{fixed}}
Reported by
Izno (talk) 21:21, 25 April 2024 (UTC)
What happens
|p= was converted to |page= without any other significant changes made. This is a cosmetic edit in the context of the citation templates.
What should happen
No edit should be made.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Xerox_1200&diff=prev&oldid=1220648742
We can't proceed until
Feedback from maintainers


There looks to be a large volume of cosmetic edits being made right now - here is another common type. Nikkimaria (talk) 03:29, 27 April 2024 (UTC)
Let me check. We just upgraded the PHP version. AManWithNoPlan (talk) 11:09, 27 April 2024 (UTC)
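
The kind of check that would catch the |p= to |page= case can be sketched as follows. This is a rough illustration, not Citation bot's implementation (which is PHP): the function names are hypothetical, and the alias table covers only |p= and |pp= as an assumption for the example. The idea is to compare a template's parameters before and after, once alias names have been mapped to their canonical names.

```python
# Illustrative alias table (assumption: only these two aliases are covered).
ALIASES = {"p": "page", "pp": "pages"}

def canonical(params):
    """Return a copy of params with alias names mapped to canonical names.
    Values are left unchanged."""
    return {ALIASES.get(name, name): value for name, value in params.items()}

def is_cosmetic(before, after):
    """True if the parameter dicts differ, but only by alias renaming."""
    return before != after and canonical(before) == canonical(after)
```

For example, a change from {"title": "X", "p": "5"} to {"title": "X", "page": "5"} is reported as cosmetic, while the same change accompanied by a newly added |doi= is not.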