
Wikipedia:Bots/Requests for approval


BAG member instructions

If you want to run a bot on the English Wikipedia, you must first get it approved. To do so, follow the instructions below to add a request. If you are not familiar with programming it may be a good idea to ask someone else to run a bot for you, rather than running your own.

 Instructions for bot operators


Current requests for approval

FRadical Bot

Operator: FR30799386 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 17:58, Thursday, September 20, 2018 (UTC)

Automatic, Supervised, or Manual: Manual

Programming language(s): AutoWikiBrowser

Source code available: AWB

Function overview: This bot task will replace all instances of MiszaBot, MiszaBot I, MiszaBot II, and MiszaBot III in the parameters of the template {{Auto archiving notice}} with Lowercase sigmabot III.

Links to relevant discussions (where appropriate):

Edit period(s): (Irregular) As and when I get time to run the bot. I will try not to exceed an edit rate of 15 edits/minute.

Estimated number of pages affected: ~4294 pages will be affected

Namespace(s): Talk: namespace

Exclusion compliant (Yes/No): No

Function details: On most article talk pages, the manually set |bot= parameter of the template {{Auto archiving notice}} points to the long-inactive set of MiszaBots, namely MiszaBot, MiszaBot I, MiszaBot II, and MiszaBot III. Via this bot account (using AWB) I will make the notice point to the right bot, namely Lowercase sigmabot III. The logic used is outlined below:

  • First, all pages transcluding the template Auto archiving notice are extracted using AWB's Make List function.
  • These pages are then filtered to include only those in the Talk: namespace.
  • The pages are then pre-parsed to remove those already matching \|bot *\= *Lowercase\ sigmabot\ III.
  • Finally, the pages are checked for = *MiszaBot (regex) and the strings MiszaBot I, MiszaBot II, MiszaBot III, which are replaced with =Lowercase sigmabot III for the first and Lowercase sigmabot III for the rest (a simplified sketch follows below).
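A minimal Python sketch of this rule, for illustration only (a simplification of the AWB settings described above: it assumes the |bot= parameter form, and the actual task runs inside AWB with every edit reviewed by the operator):

import re

# Skip pages where the notice already points at the right bot.
ALREADY_FIXED = re.compile(r"\|bot *= *Lowercase sigmabot III")
# Match a |bot= parameter set to any of the four inactive MiszaBots.
MISZABOT = re.compile(r"(\|bot *= *)MiszaBot(?: I{1,3})?(?=\s*[|}])")

def fix_notice(wikitext):
    """Return updated wikitext, or None if the page should be skipped."""
    if ALREADY_FIXED.search(wikitext):
        return None
    return MISZABOT.sub(r"\1Lowercase sigmabot III", wikitext)

# fix_notice("{{Auto archiving notice|bot=MiszaBot III|age=30}}")
# -> "{{Auto archiving notice|bot=Lowercase sigmabot III|age=30}}"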

Additionally, each and every edit will be reviewed by the operator (me) in AWB. Regards — fr+ 17:58, 20 September 2018 (UTC)

Discussion

EranBot 3

Operator: ערן (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 16:07, Saturday, September 15, 2018 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: source on GitHub

Function overview: This bot submits newly added text to the iThenticate API, which determines whether other sources are similar to it. Suspected copyvios (>50% similarity) can then be reviewed manually (copypatrol; top reviewers: Diannaa, Sphilbrick, L3X1). In this BRFA I would like to ask for the bot to join the copyviobot group, to access the pagetriagetagcopyvio API, which will be used by the PageCuration extension aka Special:NewPagesFeed (see phab tasks).

Links to relevant discussions (where appropriate): prev BRFA, copyviobot, Epic task for copyvio in new pages feed (and subtasks)

Edit period(s): Continuous

Estimated number of pages affected: N/A. The bot will tag suspected edits using the API; these may be surfaced by the special page Special:NewPagesFeed.

Namespace(s): main namespace and drafts (the bot does not edit them, but may check them for copying)

Exclusion compliant (Yes/No): N/A

Function details:

  • Any diff (except rollbacks) in the main and draft namespaces that adds a large chunk of text may be subject to a copyvio check.
  • The copyvio check is done using the iThenticate service (WP:Turnitin, who kindly provided us access to their service).
  • Changes that are similar to existing text in an external source are reported (and can be reviewed at https://tools.wmflabs.org/copypatrol/en ) so users can further review them manually.
  • (new) By adding the bot to the copyviobot group, it will later be possible to access suspected diffs more easily from Special:NewPagesFeed (a sketch of the API call follows below).
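For illustration, a minimal sketch of what such a tagging call might look like (an assumption, not the bot's actual code, which is linked above; in particular the "revid" parameter name is assumed, and an authenticated requests.Session with a CSRF token is taken as given):

import requests

API = "https://en.wikipedia.org/w/api.php"

def tag_copyvio(session, rev_id, csrf_token):
    """Tag a suspected revision via the pagetriagetagcopyvio action."""
    resp = session.post(API, data={
        "action": "pagetriagetagcopyvio",  # requires the copyviobot right
        "revid": rev_id,                   # assumed parameter name
        "token": csrf_token,
        "format": "json",
    })
    resp.raise_for_status()
    return resp.json()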

Eran (talk) 16:07, 15 September 2018 (UTC)

Discussion

48% of the edits reported as suspected copyvio required additional follow-up ("page fixed"). On tools.labsdb: select status, count(*) from s51306__copyright_p.copyright_diffs group by status;

The full details of how it is going to be shown in Special:NewPagesFeed would probably need to be discussed with the community and with the Growth team (MMiller, Roan Kattouw) - however, it is already possible to see an example in the beta test wiki (search for "copyvio"). It is important to note that a tagged page just means an edit may contain copied text (such edits may be OK [CC-BY content from government institutions], copyright violations [copy & paste from a commercial news service] or promotional content [may be legally OK sometimes, but violates WP:Promo]). Eran (talk) 16:07, 15 September 2018 (UTC)

It isn't sinking in how this fits in with the CopyPatrol activities. I'd like to discuss this further. Please let me know if this is a good place to have that discussion or if I should open up a discussion on your talk page or elsewhere.--S Philbrick(Talk) 18:03, 15 September 2018 (UTC)
Sphilbrick: I think it is relevant in this discussion, can you please elaborate? thanks, Eran (talk) 19:30, 15 September 2018 (UTC)
I start with a bit of a handicap. While I understand the new pages feed in a very broad sense, I haven't actually worked with it in years and even then had little involvement.
It appears to me that the goal is to give editors who work in the new page feed a heads up that there might be a copyvio issue. I've taken a glance at the beta test wiki — I see a few examples related to copyvios. I see that those entries have a link to CopyPatrol. Does this mean that the new page feed will not be directly testing for copyright issues but will be leaning on the copy patrol feed? I checked the links to copy patrol and found nothing in each case which may make sense because those contrived examples aren't really in that report, but I would be interested to know exactly how it works if there is an entry.
The timing is coincidental. I was literally working on a draft of a proposal to consider whether the copy patrol tools should be directly making reports to the editors. That's not exactly what's going on here but it's definitely related.
What training, if any, is being given to the editors who work on the new pages feed? Many reports are quite straightforward, but there are a few subtleties, and I wonder what steps have been taken to respond to false positives.--S Philbrick(Talk) 19:57, 15 September 2018 (UTC)
CopyPatrol is driven by EranBot, with checks done by iThenticate/Turnitin. This BRFA is to send revision IDs with possible violations to the API, which will cause the CopyPatrol links to be shown in the new pages feed. — JJMC89(T·C) 04:53, 16 September 2018 (UTC)
Sphilbrick: thank you for the good points.
  • Regarding training for handling the new pages feed and copyvios - I was about to suggest documenting it, but actually it is already explained quite well in Wikipedia:New pages patrol#Copyright violations (WP:COPYVIO) (though we may want to update it later)
  • Directly making reports to the editors - This is a good idea, and it was actually already suggested but never fully defined and implemented - phab:T135301. You are more than welcome to suggest how it should work there (or on my talk page, and I will summarize the discussion on phabricator).
Eran (talk) 18:40, 16 September 2018 (UTC)
Thanks for the link to the training material. I have clicked on the link to "school" thinking it would be there, but I now see the material in the tutorial link.
Regarding direct contacts, I'm in a discussion with Diannaa, who has some good reasons why it may be a bad idea. I intend to follow up with that and see if some of the objections can be addressed. Discussion is [[User talk:Diannaa#Copyright and new page Patrol|here]].--S Philbrick(Talk) 18:54, 16 September 2018 (UTC)
@Sphilbrick: thanks for the questions, and I'm sorry it's taken me a few days to respond. It looks like ערן has summarized the situation pretty well, but I'll also take a stab. One of the biggest challenges with both the NPP and AfC process is that there are so many pages that need to be reviewed, and there aren't good ways to prioritize which ones to review first. Adding copyvio detection to the New Pages Feed is one of three parts of this project meant to make it easier to find both the best and worst pages to review soonest. Parts 1 and 2 are to add AfC drafts to the New Pages Feed (being deployed this week), and to add ORES scores on predicted issues and predicted class to the feed for both NPP and AfC (being deployed in two weeks). The third part will add an indicator next to any pages that have a revision that shows up in CopyPatrol, and those will say, "Potential issues: Copyvio". Reviewers will then be able to click through to the CopyPatrol page for those revisions, investigate, and address them. The idea is that this way, reviewers will be able to prioritize pages that may have copyvio issues. Here are the full details on this plan. Xaosflux has brought up questions around using the specific term "copyvio", and I will discuss that with the NPP and AfC communities. Regarding training, yes, I think you are bringing up a good point. The two reviewing communities are good at assembling training material, and I expect that they will modify their material as the New Pages Feed changes. I'll also be continually reminding them about that. Does this help clear things up? -- MMiller (WMF) (talk) 20:32, 20 September 2018 (UTC)
Yes, it does, thanks.--S Philbrick(Talk) 21:37, 20 September 2018 (UTC)
  • User:ערן how will your bot's on-wiki actions be recorded (e.g. will they appear as 'edits', as 'logged actions' (which log?), etc?). Can you point to an example of where this get recorded on a test system? — xaosflux Talk 00:22, 16 September 2018 (UTC)
    Xaosflux: On the bot side it is logged to s51306__copyright_p on tools.labsdb, but that is clearly not an accessible place. It is not logged on wiki AFAIK - if we do want to log it, this should be done on the extension side. Eran (talk) 18:40, 16 September 2018 (UTC)
    phab:T204455 opened for lack of logging. — xaosflux Talk 18:48, 16 September 2018 (UTC)
Thanks, Xaosflux. We're working on this now. -- MMiller (WMF) (talk) 20:33, 20 September 2018 (UTC)
  • I've never commented on a BRFA before, but I think that another bot doing copyvio checks would be great, especially if it had fewer false positives than the current bot. Thanks, L3X1 ◊distænt write◊ 01:12, 16 September 2018 (UTC)
    • L3X1: the Page Curation extension defines infrastructure for copyvio bots - so if there are other bots that can detect copyvios, they may be added to this group later. AFAIK the automated tools for copyvio detection are Earwig's copyvio detector and EranBot/CopyPatrol, and in the past there was also CorenSearchBot. The way they work is technically different (one is based on a general-purpose search using Google search, one is based on the Turnitin copyvio service) and they complement each other, with various pros and cons for each. I think EranBot works pretty well (compare to Wikipedia:Suspected copyright violations/2016-06-07 for example)
    • As for the false positives - it is possible to define different thresholds, trading fewer false positives against missing true positives. I haven't done a full ROC analysis to tune all the parameters, but the arbitrary criterion actually works pretty well somewhere in the middle ground. Eran (talk) 18:40, 16 September 2018 (UTC)
  • Follow up from the BOTN discussion: from what has been reviewed so far, the vendor this bot will get results from can check for "copies" but not necessarily "violations of copyright" (though some copies certainly are also copyvios). As such, I think all labels should be limited to descriptive ones (e.g. "copy detected"), as opposed to accusatory ones (humans should determine whether the legal situation of violating a copyright has occurred). — xaosflux Talk 01:30, 16 September 2018 (UTC)
    That would be part of the new pages feed, which the bot doesn't control. Wikipedia talk:WikiProject Articles for creation/AfC Process Improvement May 2018 or Phabricator would be more appropriate venues for discussing the interface. — JJMC89(T·C) 04:53, 16 September 2018 (UTC)
    @JJMC89: what I'm looking for is a log of what this bot does control. As this is editor-managed, it's not unreasonable to think another editor may want to run a similar or backup bot in the future. — xaosflux Talk 05:14, 16 September 2018 (UTC)
  • Would it be possible to assign a number of bytes to "large chunk of text"? SQLQuery me! 02:25, 16 September 2018 (UTC)
    500 bytes. — JJMC89(T·C) 04:53, 16 September 2018 (UTC)
  • Procedural note: The components for reading changes, sending data to the third party, and making off-wiki reports alone do not require this BRFA; making changes on the English Wikipedia (i.e. submitting new data to our new pages feed, etc.) is all we really need to be reviewing here. Some of this may have overlap (e.g. what namespaces, text size, etc.); however, there is nothing here blocking the first 3 components alone. — xaosflux Talk 18:54, 16 September 2018 (UTC)

Bots in a trial period

Bots that have completed the trial period

PrimeBOT 29

Operator: Primefac (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 01:05, Saturday, August 11, 2018 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AWB

Source code available: WP:AWB

Function overview: Replace invalid parameters in {{infobox person}}

Links to relevant discussions (where appropriate): BOTREQ, RFC 1, RFC 2

Edit period(s): One time run

Estimated number of pages affected: Between 5848 and 16560 (see function details)

Namespace(s): Main

Exclusion compliant (Yes/No): Yes

Function details: Ethnicity (5848 uses) and religion (3890 uses) were removed from {{infobox person}} following the two RFCs listed above. The general consensus is that better infoboxes should be used if religion/ethnicity are relevant to the subject.

In the BOTREQ linked above, an additional valid point was made that this continued proliferation is likely due to copy/pasting of existing infoboxes. In the interest of making this a worthwhile venture, I figured that I'd remove or modify the top ~40 parameters with 40+ uses in addition to the religion/ethnicity params. This bot run will remove (list 1) or replace (list 2) invalid parameters. If each parameter use is on its own unique page (unlikely), a total of 16,560 pages will be edited; at minimum, 5848.

If deemed a "good thing", I can make these non-minor edits so that users who think the religion/ethnicity/term_start/associated_acts/etc. parameters should be valid will be notified and can change infoboxes.
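For illustration, a minimal Python sketch of the removal rule (a simplification, not the actual AWB settings; note that a flat regex like this deliberately fails to match nested templates in parameter values, which is exactly the AWB issue raised in the discussion below):

import re

# Matches a deprecated parameter and its value, up to the next line
# starting with a pipe or closing braces. Values containing pipes or
# braces (e.g. a nested {{flagdeco|...}}) are left for manual handling.
DEPRECATED = re.compile(r"\n *\| *(?:religion|ethnicity) *=[^|{}]*?(?=\n *[|}])")

def strip_deprecated(infobox_wikitext):
    return DEPRECATED.sub("", infobox_wikitext)

# strip_deprecated("{{infobox person\n| name = X\n| religion = Y\n}}")
# -> "{{infobox person\n| name = X\n}}"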

Discussion

  Approved for trial (50 edits). With the same remarks as Wikipedia:Bots/Requests for approval/PrimeBOT 28 concerning WP:GENFIXES and User:Headbomb/sandbox#Proposed logic. Headbomb {t · c · p · b} 01:41, 11 August 2018 (UTC)

@Primefac: can anything be done about the removal of a line break in a nested template? (Or here.) Especially with the new release of AWB which has better nested template logic. Headbomb {t · c · p · b} 12:24, 18 August 2018 (UTC)
This and this and many others add empty lines; this should be tweaked. Headbomb {t · c · p · b} 12:27, 18 August 2018 (UTC)
This should remove all stray '|' in the infobox. Headbomb {t · c · p · b} 12:29, 18 August 2018 (UTC)
{{OperatorAssistanceNeeded}} can you incorporate the fix Headbomb suggested above? — xaosflux Talk 15:17, 18 August 2018 (UTC)
Seen, travelling, will respond more in full later this week. Primefac (talk) 14:25, 21 August 2018 (UTC)
The short answer is "yes, I should be able to fix the above issues". I just saw the improvements to AWB; I think that will likely affect how my code is parsed (since I have a lot of hardcoded-but-not-great exceptions for nested templates) but should also allow me to get a little more... vigorous with my regex to handle some of the above issues. Would definitely recommend/request another trial. Primefac (talk) 20:13, 23 August 2018 (UTC)
@Primefac:   Approved for extended trial (50 edits). proceed when you're ready to. Headbomb {t · c · p · b} 20:56, 23 August 2018 (UTC)
Issue that got solved

Having a slight issue... As you can see at http://rubular.com/r/gE07i70wKT the regex I've got for handling the "last parameter" works fine; it picks up the }} - in the replacement I replace everything with $6 (or $1, if I remember to ?: all of the other parens). However, in AWB it only picks up | allegiance = {{flagdeco|, leaving behind | allegiance = {{flagdeco|. Can't figure out what the issue is. Primefac (talk) 00:49, 27 August 2018 (UTC)

Edit/update - I figured out why this behaviour was happening: AWB was ignoring the }} that closed out the template (as I was searching for "in {{infobox person}}"), which means that a) I have to completely rework my code, and b) yes I am still working on this. PrimeBOT (talk) 01:58, 9 September 2018 (UTC) Good lord, I'm an idiot (and tired), didn't see I had logged into PrimeBOT's account to clear out some junk, please forgive the not-approved edits just made to this page. Primefac (talk) 02:00, 9 September 2018 (UTC)
  Trial complete. Edits. Primefac (talk) 00:18, 10 September 2018 (UTC)

@Primefac: Reviewing. Some question(s): This removed ethnicity, but not religion. Is that intended? What about Denomination/Birth name/Born? The bot doesn't have to address everything, but since it's going to be doing a lot of edits, it would be good to catch what it can catch. Headbomb {t · c · p · b} 14:43, 11 September 2018 (UTC)

Some answers
  • Not intended, but fixed during the Task 28 run - the rule needed to be run multiple times for maximum effectiveness
  • |denomination= doesn't seem to be on the TemplateData list (nor do I see it in a text-based search), and the others are used <25 times (total), so they can likely be dealt with manually.
My intention with this bot run was to remove all of the parameters that are used often enough that it is a hassle to go through and remove them manually (I mean, I certainly wouldn't want to remove 4000 instances of |religion=). I can certainly add in any additional params if you think they would be useful, but hopefully this bot run will get the tracking cat down to a level where manual editing is once again feasible. Primefac (talk) 02:39, 16 September 2018 (UTC)

Galobot

Operator: Galobtter (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 07:12, Friday, August 3, 2018 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python (Pywikibot)

Source code available: Here

Function overview: Fix multiple unclosed formatting tags lint errors

Links to relevant discussions (where appropriate): Wikipedia:Village pump (technical)#Remex: Pages that used to look fine are now broken Wikipedia:Bot requests#HTML errors on discussion pages

Edit period(s): One time run

Estimated number of pages affected: ~10,000, based on 34,000 errors

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: Basically, replaces things like <tt><tt> with <tt></tt> as Jc86035 suggested here. Tags fixed are <tt>, <s>, <u>, <b>, <i>, <code>, and <strike>. Specifically:

  1. for each page that has multiple unclosed formatting tags, finds every multiple unclosed formatting tags error for that page
  2. uses the "location" output of Linter to narrow down where to fix the error in the page text
  3. searches for two instances of start tags of the erroneous tag
  4. if there are no closing tags or templates in between, it replaces the latter instance with a closing tag
    update: the two deprecated tags are now handled a bit differently; <strike> tags are replaced with <s> tags and <tt>...</tt> with {{mono}} if the fix is to the 99%+ case of <tt>reviewer<tt>
  5. only makes an edit if it has fixed all multiple unclosed formatting tags errors (a sketch of steps 3 and 4 follows below).
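For illustration, a minimal Python sketch of steps 3 and 4 (a simplification of the source linked above, which also handles Linter's location quirks and the <strike>/<tt> substitutions from the update in step 4):

import re

def close_second_tag(snippet, tag):
    """Replace the second start <tag> in the Linter-reported snippet
    with a closing </tag>, if it is safe to do so."""
    opens = list(re.finditer(f"<{tag}>", snippet))
    if len(opens) < 2:
        return None
    between = snippet[opens[0].end():opens[1].start()]
    if f"</{tag}>" in between or "{{" in between:
        return None  # a closing tag or a template in between: skip
    start, end = opens[1].span()
    return snippet[:start] + f"</{tag}>" + snippet[end:]

# close_second_tag("<tt>reviewer<tt>", "tt") -> "<tt>reviewer</tt>"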

I know that Ahecht's Ahechtbot has a BRFA open partly for doing the same for just <s>; however, this bot fixes all non-nesting tags with such errors, and as it only edits when all errors are fixed, there shouldn't be any double-watchlist hits from both bots or anything like that. Also, my bot account was blocked by Oshwah for having "bot" in its name; it probably should be unblocked, at least now :)

Discussion

Unblocked as there is a BRFA open on this and it is not editing outside the bot policy. — xaosflux Talk 13:03, 3 August 2018 (UTC)
  • Regarding "only makes an edit if it has fixed all multiple unclosed formatting tags errors" - is this for the entire page, how will you determine this? — xaosflux Talk 13:08, 3 August 2018 (UTC)
    Yes, this is for the entire page. All the multiple unclosed formatting tags errors of the page are retrieved through an API call; if the bot cannot make a fix for any of the errors on a page, it doesn't edit the page (there is no need to make more than one fix per error given by Linter). This filtering decreases the number of pages edited by ~5%. Galobtter (pingó mió) 13:24, 3 August 2018 (UTC)
  • From an "error-handling" perspective, how likely is it that there will be nested instances of these calls? I know it's unlikely that there will be something like <i>This is <i>italics</i> when we do this</i> (which shows up as This is italics when we do this, but it's very possible you could have someone saying "To highlight code, use <code>", which has two <code> calls in it. Primefac (talk) 16:12, 3 August 2018 (UTC)
    I mean, I know that the second example I've given doesn't actually throw any errors, but if there was another error on the page, would it correct it? Primefac (talk) 16:13, 3 August 2018 (UTC)
    Interesting edge case, but no :). Linter gives the location of the error (from the start tag to the incorrect end tag (addendum: or sometimes till the end of the line)). The script only tries fixing the specific tag that Linter says is problematic within that particular location. (see here for example of API output) So another error elsewhere would not cause "fixing" of that.
(Additional thoughts that may not make sense and are of minor import: The only way that would even be close to happening is if the page had two unclosed formatting lint errors, as in here. Linter sometimes gives the location as from the first erroneous tag to the very last one, instead of stopping at the second paired erroneous tag (but not in the case I've made, though), and thus the whole of the text would be in the location of one of the reported errors; the text to be fixed would then include the example you've given, and the program would be looking to fix a <code> error within that. But the program would only fix the first error there and not "fix" the next line; and the location of the second error would only contain <code>Bar<code>) Galobtter (pingó mió) 16:47, 3 August 2018 (UTC)
Cool. I've been a little more out-of-the-loop on the Linter stuff recently, so I wasn't sure how the errors were being handled these days. Primefac (talk) 16:29, 4 August 2018 (UTC)
  Done, thanks. <tt>...</tt> are instead replaced with {{mono}} if there are no pipes in between to muck things up. <strike> are replaced with <s> Galobtter (pingó mió) 09:44, 14 August 2018 (UTC)
@Galobtter: You should also check for curly brackets, just in case someone was trying to type <tt>}}</tt> and accidentally did <tt>}}<tt> instead. I also created a pull request to explicitly call out the first parameter as "1=" and to add "|needs_review=yes". --Ahecht (TALK PAGE) 18:18, 17 August 2018 (UTC)
99%+ of <tt> fixes are of <tt>reviewer<tt> from an old version of {{Pending changes reviewer granted}}, so I'm thinking of maybe only changing to {{mono}} then (to avoid any errors); or at least in that case the fixes won't need review ({{mono}} is what is used now for those notices) Galobtter (pingó mió) 18:28, 17 August 2018 (UTC)
Yeah, code updated to only replace with mono if it is <tt>reviewer<tt> Galobtter (pingó mió) 10:36, 18 August 2018 (UTC)
  • @Xaosflux: tis been nearly a month, I have dealt with any issues brought up; would appreciate if this could move forward. Thanks. Galobtter (pingó mió) 13:43, 1 September 2018 (UTC)
  •   Approved for trial (150 edits). SQLQuery me! 23:56, 7 September 2018 (UTC)
  • SQL, thanks,   Trial complete. Edits. I made the bot skip user talk base pages for the trial per comments by Xaosflux on the Ahechtbot BRFA regarding creating new messages alerts. A large portion of the edits were of changing the <tt> tag; here are sample edits of each of the tags: tt (tt fix that isn't of <tt>reviewer<tt>), s, code, b, i, u, strike. There was one error, but I spotted it and tweaked the bot code (turns out the location given by linter is sometimes 1 off, and code was tweaked to account for that; it now skips that page). Other than that I was checking every edit and there were no issues I could spot. Galobtter (pingó mió) 12:49, 8 September 2018 (UTC)

TokenzeroBot 6

Operator: Tokenzero (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 12:58, Saturday, August 18, 2018 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): python, pywikibot

Source code available: GitHub

Function overview: Handle other predatory journals by creating redirects and hatnotes, exactly as done before for OMICS by TokenzeroBot 5.

Links to relevant discussions (where appropriate): Requested at User talk:Tokenzero#New redirects.

Edit period(s): One time run for each of several publishers.

Estimated number of pages affected: a few thousand created redirects

Namespace(s): mainspace, talk

Exclusion compliant (Yes/No): yes

Function details: Functionality is the same as in the previous Wikipedia:Bots/Requests for approval/TokenzeroBot 5, but for a few new lists of journal titles from different publishers, given by Headbomb (talk · contribs) on the go, on a case-by-case basis (e.g. User:Headbomb/SRP). That is, the bot shall create redirects or hatnotes to point from these titles, and fix categories of previously created redirects.

More precisely, for each title Foobar on the list:

  • If Foobar exists, consider Foobar (journal) instead (unless the title already contains "journal" or is already a redirect, in which case skip it)
  • Consider also variants obtained by replacing "and" with "&" (and vice versa, if the title doesn't contain Latin "Acta")
  • Consider also variants obtained by taking the ISO 4 abbreviation (dotted and undotted, computed using the automatic tool, using multilanguage rules iff the title contains "Acta").
  • If any of the considered variants already exists, skip it, just to be safe.
  • Otherwise, create a redirect from each variant, for example:
#REDIRECT[[OMICS Publishing Group]]

and create a talk page for that redirect, containing {{WikiProject Academic Journals}}. For the main variant also add a category to the redirect, e.g. [[Category:OMICS Publishing Group academic journals]]. For the ISO-4 variant also add {{R from ISO 4}}.
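For illustration, a minimal pywikibot sketch of this per-variant creation step (a simplification of the source linked above, which also builds the title variants themselves; the edit summaries here are placeholders):

import pywikibot

site = pywikibot.Site("en", "wikipedia")

def create_redirect(variant, target, category=None, iso4=False):
    page = pywikibot.Page(site, variant)
    if page.exists():
        return  # skip existing titles, just to be safe
    text = f"#REDIRECT[[{target}]]"
    if category:
        text += f"\n[[Category:{category}]]"  # main variant only
    if iso4:
        text += "\n{{R from ISO 4}}"
    page.text = text
    page.save(summary="Creating redirect per BRFA")  # placeholder summary
    talk = page.toggleTalkPage()
    talk.text = "{{WikiProject Academic Journals}}"
    talk.save(summary="Tagging talk page of new redirect")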

Then, for some publishers, for each title that looks like a misleading extension of another journal's name, like Foobar: Open Access (the exact pattern may depend on the publisher), add a hatnote to the existing journal's article:

{{Confused|text=[[Foobar: Open Access]], published by the [[OMICS Publishing Group]]}}


Discussion

Since this is a general task, not a specific one with a fixed set of edits, I'll recuse myself from approval. I'll point out that past tasks like this are well-oiled and have been well-trialed. One thing I'll point out (I only caught this later, so it still needs fixing, likely by this bot) is that the ISO abbreviations should not be categorized in the corresponding publisher category (e.g. Category:OMICS Publishing Group academic journals as of writing). So the bot in general would have a scope of "doing redirect maintenance for WP Journals, broadly speaking". Headbomb {t · c · p · b} 13:43, 18 August 2018 (UTC)

  Approved for trial (50 edits). SQLQuery me! 15:45, 28 August 2018 (UTC)
@SQL:   Trial complete. See edits. Tokenzero (talk) 16:41, 2 September 2018 (UTC)
Looks all good to me, btw. Headbomb {t · c · p · b} 17:26, 2 September 2018 (UTC)

Texvc2LaTeXBot

Operator:

Time filed: 19:45, Monday, June 18, 2018 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python (pywikibot)

Source code available: Yes

Function overview: The bot will exclusively edit mathematical and chemical formulas to manage upgrading of the LaTeX math engine and the eventual removal of the texvc backend.

Links to relevant discussions (where appropriate): phab:T195861

Edit period(s): one time runs

Estimated number of pages affected: on enwiki max. 1524 (User:Salix_alba/maths2018); initially 204 (User:Texvc2LaTeXBot/enwiki)

Namespace(s): all namespaces

Exclusion compliant (Yes/No): Yes

Function details:

  • As a first step, the bot will perform the replacements listed in the table mw:Extension:Math/Roadmap#Step_1_Part_A:_Remove_problematic_texvc_redefinitions on the 204 pages listed in User:Texvc2LaTeXBot/enwiki (an illustrative sketch follows after this list).
  • The first 204 pages will be checked for correct operation; the run will then be extended to a further 1320 pages whose maths syntax needs updating.
  • After editing those 204 pages, we will apply for bot flags on the remaining 553 projects that have some mathematical equations, perform the same replacements, and incorporate those communities' ideas and concerns.
  • Subsequent steps will only be performed if a consensus is reached. The update process involves either breaking rendering of version histories or replacing all math and chem tags (around 65000 pages on the English Wikipedia).
  • If you have questions, suggestions or concerns regarding the update process, please post them on mw:Extension:Math/Roadmap or join our commission at phab:T195861.
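For illustration, a minimal Python sketch of the replacement step (an illustrative subset only: \and and \or are the replacements seen in the sample edits below, with \land and \lor assumed as their targets from the roadmap table; \C -> \Complex and the skipping of nowiki pages are taken from the notes near the end of this request, and <math> tag attributes are ignored here):

import re

REPLACEMENTS = [(r"\and", r"\land"), (r"\or", r"\lor"), (r"\C", r"\Complex")]

def fix_math_body(body):
    for old, new in REPLACEMENTS:
        # A replacement function sidesteps backslash escaping in re.sub;
        # the lookahead stops \C from matching inside \Complex etc.
        body = re.sub(re.escape(old) + r"(?![A-Za-z])", lambda _m, n=new: n, body)
    return body

def fix_page(wikitext):
    if "<nowiki" in wikitext:
        return wikitext  # pages with nowiki tags are rejected entirely
    return re.sub(r"<math>(.*?)</math>",
                  lambda m: "<math>" + fix_math_body(m.group(1)) + "</math>",
                  wikitext, flags=re.DOTALL)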

@Physikerwelt and Salix alba: Feel free to improve/modify.--Debenben (talk) 19:45, 18 June 2018 (UTC)

Discussion

Could you make a sample edit to see what exactly would be involved here? Headbomb {t · c · p · b} 20:44, 21 June 2018 (UTC)

  •   Note: This bot appears to have edited since this BRFA was filed. Bots may not edit outside their own or their operator's userspace unless approved or approved for trial. AnomieBOT 21:11, 21 June 2018 (UTC)
I ran the bot on the first three pages in the list. They happened to involve only \and and \or replacements, but the others are conceptually the same.--Debenben (talk) 21:17, 21 June 2018 (UTC)
@Debenben: diffs? Headbomb {t · c · p · b} 23:25, 21 June 2018 (UTC)
It was run in error on some main-space articles; here are three diffs.
I've copied some articles to my userspace to test.
--Salix alba (talk): 05:43, 22 June 2018 (UTC)
@Headbomb: I assumed the sample edit should be performed on a regular page and chose three pages, hoping it would also cover some of the replacements Salix alba did in the userspace. I am sorry about the misunderstanding in case you did not want any main space edits yet.--Debenben (talk) 10:39, 22 June 2018 (UTC)

@Debenben: Before proceeding with full trial, there should be (if there isn't one already) a noticed posted at WP:FORMULA, as well as WP:PHYS, WP:CHEM and WP:WPMATH since those are the projects most affected. Also, see WP:BOTMULTIOP, as I understand multiple people will be operating this bot. Headbomb {t · c · p · b} 12:32, 22 June 2018 (UTC)

@Salix alba: Do you want to take care of all English speaking projects? I could do the German and French ones and we could write a custom userpage "unless otherwise identified edits on English speaking projects are done by Salix alba".--Debenben (talk) 13:08, 22 June 2018 (UTC)
@Debenben: Yes, I'm quite happy to be sole operator of the bot on en-wikipedia (and other projects). I've got my head around how the bot runs now.
@Headbomb: Yes, the work of the bot and the associated migration project should be publicised in the places you mention. I'll get to it. --Salix alba (talk): 15:11, 22 June 2018 (UTC)
@Headbomb: I posted notices at the places mentioned above on Friday.--Salix alba (talk): 16:33, 25 June 2018 (UTC)

{{BotOnHold}} There's a security problem with the bot; I've blocked it until it's resolved. Debenben, you've been added to a private bug report. Max Semenik (talk) 05:00, 26 June 2018 (UTC)

Can you add me to the bug report? I think I'm now responsible for the bot's use on the English Wikipedia.--Salix alba (talk): 05:55, 26 June 2018 (UTC)

The issue is resolved and the bot is unblocked; we can continue. Max Semenik (talk) 21:07, 27 June 2018 (UTC)

Alright, well we can move on to trial once the operator has read WP:BOTPOL and specifically WP:BOTACCOUNT/WP:BOTREQUIRE. In particular, {{Bot}} should be added to the bot's user page. Headbomb {t · c · p · b} 20:12, 28 June 2018 (UTC)
Cool. I'll read up on the relevant docs. --Salix alba (talk): 21:02, 28 June 2018 (UTC)
I've now added {{Bot}} and done a couple of test edits. What's a good number of edits for the trial phase? --Salix alba (talk): 17:06, 30 June 2018 (UTC)
  Approved for trial (10 edits for each fix). Link to this BRFA in the edit summary during the trial. Headbomb {t · c · p · b} 18:14, 30 June 2018 (UTC)
{{OperatorAssistanceNeeded|D}} this request will be moved to expired if it is no longer being worked on. — xaosflux Talk 15:24, 18 August 2018 (UTC)

  Trial complete. I've done a number of trials with no problems. --Salix alba (talk): 06:36, 19 August 2018 (UTC)

One change we made to the script was to reject all pages with a nowiki tag. So a page like Talk:Gamma function, which should really have the change \C -> \Complex, is not edited and has to be done manually. --Salix alba (talk): 07:49, 19 August 2018 (UTC)


Approved requests

Bots that have been approved for operations after a successful BRFA will be listed here for informational purposes. No other approval action is required for these bots. Recently approved requests can be found here, while old requests can be found in the archives.


Denied requests

Bots that have been denied for operations will be listed here for informational purposes for at least 7 days before being archived. No other action is required for these bots. Older requests can be found in the Archive.

Expired/withdrawn requests

These requests have either expired, as information required by the operator was not provided, or been withdrawn. These tasks are not authorized to run, but such lack of authorization does not necessarily follow from a finding as to merit. A bot that, having been approved for testing, was not tested by an editor, or one for which the results of testing were not posted, for example, would appear here. Bot requests should not be placed here if there is an active discussion ongoing above. Operators whose requests have expired may reactivate their requests at any time. The following list shows recent requests (if any) that have expired, listed here for informational purposes for at least 7 days before being archived. Older requests can be found in the respective archives: Expired, Withdrawn.