Users eligible for autopatrolledEdit


During a conversation with @WereSpielChequers: and later @IJBall: the subject of identifying users who might be viable candidates for being granted the autopatrolled privilege came up. There is an existing database report which is intended to provide this information, but the report as it stands today isn't very useful. The main issues (as far as I am aware) are that it is not filtered to show editors who have created articles recently (say, in the last 30 days) and is does not exclude bots and a few other minor things.

I wrote a quick Perl script which uses the MediaWiki API to create a report which does take into account these factors. I have run it a few times, and have been manually pasting the output into my userspace page here.

This isn't a long-term solution, because apart from anything else I am both busy and forgetful :)

WereSpielChequers would ideally like the original database report page to be updated with my version of the report, instead of Community Tech Bot's version, but obviously at present the bot and I would be in conflict with each other, and it would be terribly untidy. So there are a few options I can think of for how we might move forward, and I wonder if you might have any suggestions or input.

1. Perhaps the tidiest thing would be if the existing report could be modified to use the same criteria that I use in mine (which are described at the top of the page), and then I would just stop running my report entirely. I don't know how likely that is to happen on a short timescale, but perhaps you would have some insight on this?

2. Alternatively, perhaps Community Tech Bot could be told to stop updating that page, and I could update it manually for now (and probably seek bot approval to have it run automatically each week).

3. I could update some new page elsewhere in the Wikipedia: namespace, and the existing database report page could maybe redirect to it.

4. Some other option? Community Tech Bot running my code maybe?

I'm not very familiar with the database reports service, or Community Tech Bot, so there may be some background I'm missing here which would make everything clearer; but for now we have two things. We have a page where everyone is looking for this information, and we have a report which is regularly being run to provide the information they are looking for. Unfortunately the two things are not connected in any way :)

Would appreciate your thoughts,

Thparkth (talk) 16:40, 29 December 2015 (UTC)

Hi! Firstly my apologies for this late reply. I've been on vacation. Secondly, that's a fantastic idea, thank you. I'd be very happy to modify the bot to run your script. The only thing that I need is the SQL query you used to generate it. I'll update the bot when I get it. Thanks a lot for helping maintain the reports. :) NKohli (WMF) (talk) 17:28, 5 January 2016 (UTC)
Thanks for getting back to me, hope you enjoyed your vacation :)
The current code is not SQL-based but rather uses Perl and MediaWiki::API. Hopefully this week or next I will have time to look at a pure SQL version. I'll get back to you on this :)
Thparkth (talk) 02:47, 6 January 2016 (UTC)
  • OK, so I have a pure-SQL version of this now which I have tested and appears to work successfully. Note that it is an expensive and long-running query - it will probably take five to six hours to run. This is similar to the performance of the Perl version. The basic issue is that calculating how many articles a user created (with all the various conditions in place) is quite an expensive operation. It might be possible to optimize it further than I already have, but I have literally spend at least twelve hours refactoring this in various radical ways, and this is the fastest option I could find. Anyway, this code works, and I believe should be ready to go!

        Identify users who meet the criteria for being granted "autopatrolled"
        on the English Wikipedia but who don't already have it.

        Author: Andrew Crawford (thparkth) <>

        License: Public Domain


        /* "editor" consisting of user_name, wrapped in HTML tags linking to the sigma "created" tool */
                '<a href="',
              ) AS editor,
        /* derived column "created count" returned by this subquery */
        ) AS created_count
        /* This query returns users who have created pages in the last 30 days
           and who are not already members of autoreviewed */

                        /* User created a page within the last thirty days */
                        rc_timestamp>date_format(date_sub(NOW(),INTERVAL 30 DAY),'%Y%m%d%H%i%S')
                        /* It was an article */
                        /* The user was human */
                        /* It was a new page */
                        /* It's not a redirect */
                        /* User doesn't already have autoreviewer */
                        NOT EXISTS
                                SELECT 1 FROM user_groups WHERE ug_user=user_id AND (ug_group='autoreviewer' OR ug_group='sysop')
        ) as InnerQuery
        created_count > 24
        created_count DESC

This can also be found as /data/project/thparkthsql/autopatrolled_eligible.sql on Tool Labs.

By the way, I enjoy this kind of work, and I am gaining some familiarity with the schema and the Tool Labs environment so please do let me know if there is a backlog I could be helping with.


Thparkth (talk) 14:11, 7 January 2016 (UTC)

Hi @Thparkth: I'm extremely sorry it took me this long to get back to you. I just updated the report. Let's hope this works. I'm a little worried since it's an expensive query and often queries running more than a couple of hours are killed abruptly. Aside, would you like to be a co-maintainer for the database-reports labs tool? There are several other reports lying dead which could use some love. :) It's a simple python-based bot, whose code can be seen at Thank you so much! -- NKohli (WMF) (talk) 09:47, 15 April 2016 (UTC)
Thanks NKohli, that's great. Thanks for tolerating my eccentric SQL indentation style ;). Virtually all of the processing time is spent in working out the created count for each user. An index on rev.rev_parent_id would probably help a great deal with that. I will probably go back to this now and try to find away to avoid using that column entirely; finding the lowest rev.rev_id for the page should be logically the same thing and it's a primary key!
Yes, I would love to be a co-maintainer for database-reports, and thank-you for the opportunity.
Thparkth (talk) 10:44, 15 April 2016 (UTC)
I added you as a maintainer for the tool. Are you on Github? Then I can add you as a collaborator on the repo and you can push code easily. :) Also, are you on IRC? -- NKohli (WMF) (talk) 15:34, 15 April 2016 (UTC)
I just created a github user (thparkth). I am very familiar with git but less so with github. I am "thparkth" on IRC also. Thparkth (talk) 16:37, 15 April 2016 (UTC)

Welcome to The Wikipedia Adventure!Edit

Hi NKohli (WMF)! We're so happy you wanted to play to learn, as a friendly and fun way to get into our community and mission. I think these links might be helpful to you as you get started.

-- 15:58, Wednesday, January 13, 2016 (UTC)

API change will break your botEdit

Hi Niharika,

Did you see the announcement that http:// access to the API is going away in a few weeks? It looks like the Community Tech bot is going to break as a result of this change.

BTW, I'm encouraging people who need help updating their code to ask at w:en:Wikipedia:Bot owners' noticeboard or on the mailing list. I don't know whether there will be any requests, but if you can help, it'd probably make someone's day much happier. :-) Whatamidoing (WMF) (talk) 23:50, 19 May 2016 (UTC)

Thank you for pointing this out to me, Whatamidoing (WMF)! I did indeed miss the announcement. I'll fix up the bot over the weekend and try to keep an eye on the mailing list and the noticeboard as well. Really happy to help. :) Thank you so much. NKohli (WMF) (talk) 03:29, 20 May 2016 (UTC)
Hey Whatamidoing (WMF) Can you point me to where I should be making the https fix? I couldn't see where I'm using http in my code instead of https (in a very cursory glance through). Thanks! -- NKohli (WMF) (talk) 06:43, 21 May 2016 (UTC)
I don't really know anything except that your bot was on the list. :-(
Brandon Black in Ops should be able to provide you with more information. Whatamidoing (WMF) (talk) 05:11, 22 May 2016 (UTC)
Okay! Thank you. :) -- NKohli (WMF) (talk) 06:07, 22 May 2016 (UTC)

Database ReportEdit

Hi Niharika. Wikipedia:Database reports/Unused templates has been broken for over two years, and I haven't had much luck finding someone from the community interested in getting it back up and running. Is this something you or others on your team might be able to help with? If it's a technically trivial extension, low transclusion templates (1 or 2 transclusions) would be of interest as well. ~ Rob13Talk 05:27, 27 July 2016 (UTC)

Hi Rob, thanks for bringing this to my attention. I'd try and look into it soon. -- NKohli (WMF) (talk) 16:24, 28 July 2016 (UTC)

Community Tech botEdit

I noticed that Community Tech bot has stopped updating as of the end of August. Can you give the bot a push and get it working again? Or if this is a known problem, do you have an ETA on when it might be back up? Thanks. Pinging Thparkth on this, too. -- Gogo Dodo (talk) 01:58, 7 September 2016 (UTC)

Hey @Gogo Dodo:, thanks for the ping. I'll look into the problem today. -- NKohli (WMF) (talk) 08:30, 7 September 2016 (UTC)
Okay, this should be fixed now. The bot will resume updating pages as usual. Feel free to report back if you find any more problems with this. Thanks! -- NKohli (WMF) (talk) 11:51, 7 September 2016 (UTC)
Thank you for fixing it. It is updating now. While I'm here, I noticed that a few talk pages listed are false positives and have been appearing for awhile now. Can you check it out? The ones that keep reappearing are:
It is a rather bizarre mix and I don't see any obvious connection between them. Thanks again. -- Gogo Dodo (talk) 05:19, 11 September 2016 (UTC)
Hi @Gogo Dodo: which report is this for? -- NKohli (WMF) (talk) 11:15, 12 September 2016 (UTC)
It's Wikipedia:Database reports/Orphaned talk pages. All the above pages except Talk:DDWG were created between 22 April and 3 May 2016. DDWG was moved to that name 6 May 2016. Maybe a database got some bad entries in that period. I recall an incident where some pages were designated as mainspace pages somewhere even though their name specified another namespace, but I think that was longer ago. PrimeHunter (talk) 11:42, 12 September 2016 (UTC)
The latest database reports by the bot have omitted <onlyinclude>...</onlyinclude> around the timestamp. Example: [1]. It's needed to only transclude the timestamp and not the whole report at Wikipedia:Database reports. I have fixed two reports so right now there is no problem at Wikipedia:Database reports. PrimeHunter (talk) 01:00, 10 September 2016 (UTC)
Hi @PrimeHunter: I don't see any difference between [2] and [3]. What's different? -- NKohli (WMF) (talk) 15:24, 10 September 2016 (UTC)
The page itself looks the same. onlyinclude only affects how it looks when it's transcluded on another page like Wikipedia:Database reports where only the part inside <onlyinclude>...</onlyinclude> will be shown. When the bot version was current the entire page was shown in a single cell in the "Data as of" column at the "PRODed articles with deletion logs" row. My edit [4] means only "06:30, 10 September 2016 (UTC)" is currently shown in that cell. For testing, try previewing this section without nowiki around the below code. PrimeHunter (talk) 15:49, 10 September 2016 (UTC)
Understood. I'll fix it by Sunday. Thanks! -- NKohli (WMF) (talk) 18:01, 10 September 2016 (UTC)
I have added a parameter to omit the transclusion for now so there is no rush.[5] PrimeHunter (talk) 11:02, 12 September 2016 (UTC)
Thanks for that. This should be fixed now (in future report updates, that is). I pushed those reports today so they should update soon. I'll try to keep an eye on it and make sure it behaves. -- NKohli (WMF) (talk) 11:12, 12 September 2016 (UTC)
Thanks for adding onlyinclude taqs. The timestamp should be inside the tags like the old revision in [6] in order to display on transclusion at Wikipedia:Database reports. The current revision only displays "..." on transclusion. PrimeHunter (talk) 11:10, 13 September 2016 (UTC)
Oops, sorry. I changed it to <onlyinclude>~~~</onlyinclude>. Is that okay? Thanks. -NKohli (WMF) (talk) 15:43, 15 September 2016 (UTC)
You need five tildes for a timestamp alone. Three tildes make a username without timestamp so this should work: ~~~ <onlyinclude>~~~~~</onlyinclude>. PrimeHunter (talk) 18:47, 15 September 2016 (UTC)
It's working perfectly now. I manually moved the timestamp in some weekly and monthly reports which I guess will automatically do it in future updates. Regarding the above post by Gogo Dodo, Wikipedia talk:Database reports#Zombies in the database shows a report by another bot is also unable to detect that Category:216 BC births and Category:235 BC births exist, so I guess it's a database issue and not a problem in the bot code. PrimeHunter (talk) 11:36, 17 September 2016 (UTC)

[[Wikipedia:Database reports/PRODed articles with deletion logs]]: {{Wikipedia:Database reports/PRODed articles with deletion logs}} [[Wikipedia:Database reports/Articles by size]] (doesn't use onlyinclude but isn't currently transcluded at [[Wikipedia:Database reports]]: {{Wikipedia:Database reports/Articles by size}}

User:NKohli (WMF)/megawatch.jsEdit

To get around the 50 page restriction, you can split the result array into multiple arrays each with 50 pages...

//Split into lists of 50 (max number for api)
result_split = new Array( Math.ceil(result.length/50) );
for (var j=0; j<result.length; j+=50) {
	result_split[j/50] = result.slice(j, j+49);

and then make multiple api calls with a loop...

var count=0;
for (var ii = 0; ii < result_split.length; ii++) {
	new mw.Api().get( {
	// ... (your code) ...
	titles: result_split[ii].join( '|' )
	// ... (your code) ...
	.done( function () { 
		if (count === result_split.length) {
			$( '#megawatch' ).addClass( 'watched' );
	} );

Though you might still want to limit the number of watched pages to a reasonable number, as there are some pretty large categories out there. - Evad37 [talk] 23:48, 27 January 2017 (UTC)

This is useful. Thanks! I'll update the script soon. :) -- NKohli (WMF) (talk) 11:06, 2 February 2017 (UTC)

Your BRFAEdit

Your BRFA, Wikipedia:Bots/Requests for approval/Community Tech bot 3 has been approved. — xaosflux Talk 22:16, 22 February 2017 (UTC)

Thanks! -- NKohli (WMF) (talk) 03:46, 23 February 2017 (UTC)

Your feedback matters: Final reminder to take the global Wikimedia surveyEdit

Popular pages reportEdit

What is the popular pages report and should WP:CHICAGO have one?--TonyTheTiger (T / C / WP:FOUR / WP:CHICAGO / WP:WAWARD) 20:02, 24 March 2017 (UTC)

Hi! The popular pages report shows the top-viewed 500/1000 projects in a Wikiproject in the past month. See Wikipedia:WikiProject Alaska/Popular pages for example. There was a bot, Mr. Z bot which used to generate these reports previously but it died a few months ago and the reports were broken. So we made a new bot to do the same job. If you do not want a report for WikiProject Chicago I can remove it from the bot config. The current report is at Wikipedia:WikiProject Chicago/Popular pages. Thanks. -- NKohli (WMF) (talk) 20:16, 24 March 2017 (UTC)

User:Community Tech bot/Popular pages config.jsonEdit

How do I set up User:Community Tech bot/Popular pages config.json for a project? Looking to do it for Wikipedia:WikiProject Disney and Wikipedia:WikiProject Amusement Parks. Elisfkc (talk) 03:59, 27 April 2017 (UTC)

Hi Elisfkc, I see that both WikiProject Disney and Amusement Parks are listed in that config. So that means that the bot is already working on generating their reports. The bot is running a little behind schedule since this is its second run and the code is getting changed frequently but both of those projects should be updated soon. You can find the reports here: Wikipedia:WikiProject Amusement Parks/Popular pages and Wikipedia:WikiProject Disney/Popular pages. -- NKohli (WMF) (talk) 04:51, 27 April 2017 (UTC)
Oh, I just didn't see them. Thanks. Elisfkc (talk) 13:47, 27 April 2017 (UTC)

Wikipedia:WikiProject New York City Public Transportation/Popular pagesEdit

Could you possibly help set up this page? Thanks.--Kew Gardens 613 (talk) 12:17, 6 August 2017 (UTC)

Done! Sorry for the delay, I was on vacation. Let me know if anything looks out of place. Thank you! -- NKohli (WMF) (talk) 22:41, 9 August 2017 (UTC)

Tech News: 2018-29Edit

16:01, 16 July 2018 (UTC)

Popular pagesEdit

Hello. How I can request WikiProject to popular pages task? I thought it will be updated after I create certain page but it didn't happened. Eurohunter (talk) 13:04, 24 October 2018 (UTC)

Hi Eurohunter. The process for enabling the bot to work on a certain project involves adding the project to this config page. I can add the project to that page if you can tell me the details for which project you want this for, which page you want the report on and how many records would you like (maximum of 1000). Thank you. -- NKohli (WMF) (talk) 19:05, 24 October 2018 (UTC)
Wikipedia:WikiProject Basshunter. It shouldn't exceed 500 records. Eurohunter (talk) 19:27, 24 October 2018 (UTC)

Tech News: 2018-44Edit

20:08, 29 October 2018 (UTC)

Tech News: 2018-45Edit

17:28, 5 November 2018 (UTC)

Tech News: 2018-46Edit

19:22, 12 November 2018 (UTC)

Tech News: 2018-47Edit

23:28, 19 November 2018 (UTC)

Tech News: 2018-48Edit

22:22, 26 November 2018 (UTC)

Tech News: 2018-49Edit

16:12, 3 December 2018 (UTC)

Tech News: 2018-50Edit

17:33, 10 December 2018 (UTC)

Tech News: 2018-51Edit

20:34, 17 December 2018 (UTC)

Tech News: 2019-02Edit

18:29, 7 January 2019 (UTC)

Tech News: 2019-03Edit