Wikipedia talk:Database reports

Requests: Please list any requests for reports below in a new section. Be as specific as possible, including how often you would like the report run.

Shortest biographies of living people

Wikipedia:Database reports/Shortest biographies of living people is updating daily, but appears to be broken. Articles that have changed length are still appearing as their previous byte size and the body of the report does not appear to be changing. Michaelwallace22 (talk) 17:12, 11 June 2022 (UTC)

See this VPT thread. This is expected to be a problem for another day or two. Many database reports will show out-of-date information until the database catches up. – Jonesey95 (talk) 17:27, 11 June 2022 (UTC)

Request: editors by number of unreviewed pages

This would be very useful for prioritisation at new page patrol (related discussion: Wikipedia_talk:New_pages_patrol/Reviewers#Request:_Report_of_number_of_unreviewed_articles,_grouped_by_creator) and WP:PERM/A. I've mocked up an SQL query on quarry that's straightforward enough – can it be converted into a regular database report? – Joe (talk) 10:03, 27 June 2022 (UTC)

Never mind. I've used {{Database report}} to create the report at Wikipedia:New pages patrol/Reports#Unreviewed new articles by creator (top 10). -MPGuy2824 (talk) 09:05, 2 November 2022 (UTC)
That's a really handy template, thanks! – Joe (talk) 09:27, 2 November 2022 (UTC)

Request: Talk page by number of archives

Also: longest talk pages by current discussion (including Wikipedia namespace pages) and longest pages (not only articles), for record and archiving purposes. I made a version in my userpage. Thingofme (talk) 13:36, 27 June 2022 (UTC)

Request: New Page Reviewer activity report

https://quarry.wmcloud.org/query/32574 is an example. Could this be converted into a regular database report run monthly, and for both the prior six AND twelve months? MB 04:59, 7 July 2022 (UTC)

Wouldn't this just be a modification of Wikipedia:Database reports/Top new article reviewers? – Joe (talk) 13:21, 7 July 2022 (UTC)
That report stops at the top 100. It would be unwieldy if extended to over 700 lines in each table. This report is geared for use in removing NPP right if minimal review requirements are not met. A full list is not needed twice a day for this purpose, just monthly. Also, need 6 month interval to send out a warning message. The top 100 report shows 3 and 12 months, but not 6. MB 13:57, 7 July 2022 (UTC)
Maybe go further and directly generate a list of reviewers that don't meet the requirement, then? Though as far as I know there currently isn't a minimum activity requirement for NPP. – Joe (talk) 14:02, 7 July 2022 (UTC)
The current requirement is >0 in 12 months. That is listed in Wikipedia:New_pages_patrol/Reviewers#Guidelines_for_revocation which says 12 months of inactivity (that needs to be clarified - 12 months of NPP inactivity - more about this on my TP). This report could show just those with 0 reviews, but it would need to be changed if we ever were to increase the minimum. MB 15:33, 7 July 2022 (UTC)

Request: Revival of Wikipedia:Database reports/Uncategorized templates

From time to time I see templates, mostly navboxes, that are uncategorized. Since this report has not been updated since June 2014, I was wondering if it can be revived to find what I assume by now is a small number of such templates. And it shouldn't just be for navboxes. It should include sidebars, campaignboxes, infoboxes, and all other templates. WikiCleanerMan (talk) 22:55, 16 July 2022 (UTC)

It would take a lot of work to do it, I simply assume. Gather some friends and perhaps it could work. «2nd|ias» 02:29, 17 July 2022 (UTC)
If the query that produced the old reports is conveniently available anywhere, I don't see it, but this seems pretty straightforward. quarry:query/66056 and quarry:query/66057. Are there any categories that should be ignored for this purpose? —Cryptic 05:09, 17 July 2022 (UTC)
Both queries provide quite a useful report, but I think the uncategorized templates we can ignore for the moment are for projects like Today's featured article or Picture of the Day. --WikiCleanerMan (talk) 16:37, 17 July 2022 (UTC)
Hi Cryptic. Thanks for the Quarry links, those are nice. Is there a way to have the individual rows be hyperlinked? For example "Latest_preview_software_release/Nightingale" as a string isn't super friendly, but a hyperlink to Template:Latest preview software release/Nightingale would be pretty slick. --MZMcBride (talk) 19:23, 17 July 2022 (UTC)
There isn't - phab:T74874. I'll sometimes format results like in quarry:query/61982 so they can be pasted into a sandbox, but most of the time the boilerplate isn't worth the irritation. —Cryptic 03:51, 18 July 2022 (UTC)
Oh right, dang. --MZMcBride (talk) 06:03, 18 July 2022 (UTC)
Hi Cryptic. I had missed your comment about how to find the query. The source code for most database reports is published to the wiki (example) in addition to being hosted in a Git repository on GitHub (cf. <https://github.com/mzmcbride/database-reports>). The newer Rust code is even fancy enough to publish its own source code to a wiki subpage when the report is run. --MZMcBride (talk) 07:43, 18 July 2022 (UTC)
The first query looks pretty good. The second one, which also lists subpages, contains many pages that probably do not need to be categorized, such as subpages of testcases pages. I recommend that if a report is created, just the first (no subpages) report be created, at least at first. 9,000+ pages is enough to work on for a while. – Jonesey95 (talk) 21:15, 17 July 2022 (UTC)
I updated the report and it's now paginated. --MZMcBride (talk) 06:03, 18 July 2022 (UTC)
This is perfect. Thanks, MZMcBride. --WikiCleanerMan (talk) 16:11, 18 July 2022 (UTC)
Conversation's looking pretty good, it's on the right track; progress is going well here. «2nd|ias» 00:12, 21 July 2022 (UTC)

Unused templates report malfunctioning

On the Unused Templates Task Force talk page, User:Jonesey95 stated that the report has gone "haywire since June 23, the report stopped being contained on a single page, and then it expanded to include thousands of additional pages that apparently had been excluded previously, including the notorious DYK pages". The DYK pages have been fixed since the July 21 edit request at Template talk:DYK conditions. But the report is still not working the way it has been for the past several months. Can someone implement a fix to this? WikiCleanerMan (talk) 14:23, 5 August 2022 (UTC)

MZMcBride, are you aware of a fix since the DYK issue has been resolved? --WikiCleanerMan (talk) 13:49, 6 August 2022 (UTC)
Hi WikiCleanerMan. Nothing has gone haywire. If you think there's a bug in a database report, please explain with specific examples. --MZMcBride (talk) 16:18, 8 August 2022 (UTC)
MZMcBride, if you take a look at pages 1, 7, 13, 19, 20, 22, 23, outside of the template grouping of Attached route, Editnotices, TFA titles, and other project related template groups such as User templates, the rest on these pages should be moved to the first page, since these project templates don't represent an issue of unused templates. I'm sure Edit notices and TFA titles should be marked transclusionless to avoid being on the reports. --WikiCleanerMan (talk) 15:39, 9 August 2022 (UTC)
Yep, many templates should be marked as intentionally transclusionless. --MZMcBride (talk) 17:11, 10 August 2022 (UTC)
Now there are 38 pages of the report. There are more DYK and Attached route templates. Clearly, the bug is getting out of hand and impeding the way the report should work. --WikiCleanerMan (talk) 21:51, 11 August 2022 (UTC)
The DYK subpages aren't a bug and you're complaining in the wrong place to the wrong person. They're a direct and deliberate result of this edit.
I can't for the life of me figure out what's wrong with the attached route templates, though; every one I've checked seems to be in use. —Cryptic 22:36, 11 August 2022 (UTC)
I think the attached routes are being embedded within articles, but are not considered transcluded from MediaWiki's perspective at the moment. I think we should adjust the code that embeds the attachments so that MediaWiki considers them transcluded. Maybe by doing something silly such as <div style="display: none;">{{Attached KML/...}}</div>. It seems very reasonable to me to treat these embedded attachments as transclusions.
For the editnotices, it's also kind of silly and wrong that they're embedded in the edit view but aren't considered transclusions by MediaWiki. They're so wonky that we may want to just edit Template:Editnotice to insert Category:Wikipedia transclusionless templates. --MZMcBride (talk) 22:42, 11 August 2022 (UTC)
The attached routes actually aren't in use - I'd misread whatlinkshere. They're not consistently transcluded, as I asserted, but linked (from the little "KML file" near the end of, for example, Wendell H. Ford Expressway); and, like a dolt, I didn't replace the space in 'Attached KML/...' with an underscore when testing the query.
Their use case doesn't really allow for categorizing them. While they really don't belong in template space, and might even be better off over at Wikidata or perhaps Wikisource instead, the path of least resistance is probably to filter them out by title. Similarly, not all edit notices actually transclude {{Editnotice}}. —Cryptic 23:11, 11 August 2022 (UTC)
Thanks, Cryptic. By bug, I meant the wonkiness of the report in its current state, not that the DYKN templates were the bug themselves, just their inclusion. --WikiCleanerMan (talk) 23:27, 11 August 2022 (UTC)
I've also informed the coordinators at DYK and TFA about this. So, they will most likely join in on here. --WikiCleanerMan (talk) 00:07, 12 August 2022 (UTC)

Hi Cryptic. I think you're potentially misunderstanding. The "KML file" link is to an &action=raw URL, which is not a standard wikilink and is not tracked by MediaWiki's WhatLinksHere functionality.

What I mean by embedded is that, as described at Help:Attached KML, the XML from pages such as Template:Attached KML/Wendell H. Ford Expressway is being embedded into the article using JavaScript. So when you click the map in the top-right corner, you can see a line drawn on top of the map. That's the KML, which is stored as XML in template pages. I think because we're using JavaScript to embed this data into articles, that's close enough to being a transclusion that we should track it as such.

If you're curious, the wikilink itself is coming from Module:Attached KML, specifically line 176 of this version, which uses standard wikilink syntax: <div class="kmldata" data-server="%s" title="%s">[[%s%s]]</div>. That's what's populating Special:WhatLinksHere/Template:Attached KML/Wendell H. Ford Expressway. --MZMcBride (talk) 15:53, 12 August 2022 (UTC)

The presence of Category:Articles using KML not from Wikidata on Wendell H. Ford Expressway and its category description text makes me think we should just move these KML attachments to Wikidata entirely, as you say. --MZMcBride (talk) 16:01, 12 August 2022 (UTC)

I'm aware, and was deliberately simplifying. —Cryptic 19:21, 12 August 2022 (UTC)

MZMcBride, it seems these templates were included because of this edit you made, or the previous edit right before this one, as Anomie pointed out at the TFA talk page. The route templates have been showing up since July 20. From page two and onward the rest of these group templates have been added since July 20 and/or August 10/11. --WikiCleanerMan (talk) 16:58, 12 August 2022 (UTC)

So this previously did filter these pages by title. I don't think there's an alternative to returning to that for the Attached KML/ subpages, since putting a category on the pages changes the downloadable file. I agree that, whenever it's at all feasible, that categorization should be used instead of the title - Category:Wikipedia substituted templates and Category:Wikipedia transclusionless templates have more uses than just generating this report, and those shouldn't have to make the same ad-hoc exclusions. (Anomie can't be entirely correct, though, unless the script generating the reports crazily runs whatever's on the wiki page, rather than the wiki page just being a convenient mirror of the script's source - whatever changes Gonnym made there shouldn't have made any difference at all.) —Cryptic 19:54, 12 August 2022 (UTC)
Relatedly, why separately fetch all untranscluded templates and the entire contents of the excluded categories, then filter them in python? It seems to me that it would be simpler overall and likely less load to do that in sql. Same with excluding title patterns. —Cryptic 19:54, 12 August 2022 (UTC)
Hi Cryptic. Again, if we actually transcluded the "Attached KML" pages in addition to simply linking to them, we wouldn't need to categorize them and they would automatically disappear from this report, since they would then have a transclusion and no longer be considered unused. Or if we moved the KML to Wikidata and deleted the local pages, that would also remove them from this report. Both of these options seem much more preferable to me than indefinitely excluding them from the unused templates report. I think we should address these issues, not continue to mask them.
There are almost certainly better uses of my time than maintaining this report, so if you or anyone else would like to volunteer to maintain this database report or any others, please let me know. I'd be happy to help get you set up. --MZMcBride (talk) 15:03, 13 August 2022 (UTC)
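As a rough illustration of Cryptic's suggestion to do the filtering in SQL rather than in Python, here is a minimal sketch. It is not the report's actual code: it runs against a toy in-memory SQLite database, the table layout is trimmed to the columns the query touches (using the tl_target_id/linktarget arrangement for templatelinks, and a cl_to column on categorylinks that the live replicas may no longer carry in this form), and the names build_demo_db and unused_templates are hypothetical. The excluded categories and the Attached KML/ title prefix are the ones mentioned in this thread; note that titles are stored with underscores, not spaces.

```python
import sqlite3

# Toy versions of the MediaWiki tables, reduced to the columns the query touches.
# Assumption: this mirrors the replica layout only loosely, for illustration.
SCHEMA = """
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER,
                   page_title TEXT, page_is_redirect INTEGER DEFAULT 0);
CREATE TABLE linktarget (lt_id INTEGER PRIMARY KEY, lt_namespace INTEGER, lt_title TEXT);
CREATE TABLE templatelinks (tl_from INTEGER, tl_target_id INTEGER);
CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
"""

# One query does all three jobs: find untranscluded templates, skip the
# excluded categories, and skip a title pattern ('!' escapes the underscore).
UNUSED_TEMPLATES_SQL = """
SELECT p.page_title
FROM page AS p
WHERE p.page_namespace = 10
  AND p.page_is_redirect = 0
  AND p.page_title NOT LIKE 'Attached!_KML/%' ESCAPE '!'
  AND NOT EXISTS (
        SELECT 1
        FROM linktarget AS lt
        JOIN templatelinks AS tl ON tl.tl_target_id = lt.lt_id
        WHERE lt.lt_namespace = p.page_namespace
          AND lt.lt_title = p.page_title)
  AND NOT EXISTS (
        SELECT 1
        FROM categorylinks AS cl
        WHERE cl.cl_from = p.page_id
          AND cl.cl_to IN ('Wikipedia_substituted_templates',
                           'Wikipedia_transclusionless_templates'))
ORDER BY p.page_title
"""


def build_demo_db():
    """Create an in-memory database with a handful of made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO page VALUES (?, ?, ?, ?)", [
        (1, 10, "Used_box", 0),          # transcluded by the article below
        (2, 10, "Orphan_box", 0),        # unused and uncategorized
        (3, 10, "Subst_only", 0),        # unused, but in an excluded category
        (4, 10, "Attached_KML/Wendell_H._Ford_Expressway", 0),  # excluded by title
        (100, 0, "Some_article", 0),
    ])
    conn.execute("INSERT INTO linktarget VALUES (1, 10, 'Used_box')")
    conn.execute("INSERT INTO templatelinks VALUES (100, 1)")
    conn.execute("INSERT INTO categorylinks VALUES (3, 'Wikipedia_substituted_templates')")
    return conn


def unused_templates(conn):
    """Return the titles of templates the sketch considers unused."""
    return [row[0] for row in conn.execute(UNUSED_TEMPLATES_SQL)]


if __name__ == "__main__":
    print(unused_templates(build_demo_db()))
```

Whether this is actually lighter on the replicas than fetching and filtering client-side would need to be measured; the sketch only shows that the exclusions can be expressed in a single statement.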

Request: Mainspace pages without talk pages

Ignore the redirects and the disambiguation pages, if possible. -MPGuy2824 (talk) 11:27, 27 August 2022 (UTC)

Hi @MPGuy2824 - it looks like there are around 120K pages that meet those criteria. Any chance you could explain what the purpose of the report would be? For a one-off list have a look at this Quarry. Thparkth (talk) 22:01, 1 September 2022 (UTC)
Thparkth, I was mulling over starting some sort of taskforce to add wikiproject tags to every talk page. Also, this seemed like a good beginner task to add to Wikipedia:Community portal/Open tasks. Thanks for the query, that will be enough for now. If my taskforce gets off the ground, I might get back here and ask for the top X of that query as a periodic report. -MPGuy2824 (talk) 02:34, 2 September 2022 (UTC)
Sounds good. I suspect a list of 120K pages is too long to be useful, but as you say the top X (maybe the top X oldest articles with no talk page?) could work well. Thparkth (talk) 13:10, 2 September 2022 (UTC)
Yes, top X oldest or top X most popular (by article page views) would work well. Let me see (elsewhere) if there is any interest in my taskforce idea. -MPGuy2824 (talk) 02:54, 3 September 2022 (UTC)
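For anyone picking this up later, the request above (mainspace pages with no talk page, ignoring redirects and disambiguation pages, limited to the top X) could be sketched roughly as follows. This is not the Quarry query linked above; it is a hypothetical illustration run against a toy in-memory SQLite database. It assumes that disambiguation pages carry a 'disambiguation' entry in page_props, and it uses ascending page_id as a rough proxy for "oldest", which is an assumption rather than a true creation date. The helper names are made up.

```python
import sqlite3

# Toy versions of the page and page_props tables (only the columns used here).
SCHEMA = """
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER,
                   page_title TEXT, page_is_redirect INTEGER);
CREATE TABLE page_props (pp_page INTEGER, pp_propname TEXT, pp_value TEXT);
"""

# Namespace 0 is mainspace; namespace 1 is its talk namespace.
NO_TALK_PAGE_SQL = """
SELECT a.page_title
FROM page AS a
WHERE a.page_namespace = 0
  AND a.page_is_redirect = 0
  AND NOT EXISTS (
        SELECT 1 FROM page AS t
        WHERE t.page_namespace = 1
          AND t.page_title = a.page_title)
  AND NOT EXISTS (
        SELECT 1 FROM page_props AS pp
        WHERE pp.pp_page = a.page_id
          AND pp.pp_propname = 'disambiguation')
ORDER BY a.page_id      -- assumption: lower page_id roughly means older page
LIMIT ?
"""


def build_demo_db():
    """Create an in-memory database with a few made-up pages."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO page VALUES (?, ?, ?, ?)", [
        (1, 0, "Alpha", 0),    # article with a talk page
        (2, 1, "Alpha", 0),    # Talk:Alpha
        (3, 0, "Beta", 0),     # article, no talk page
        (4, 0, "Gamma", 1),    # redirect, ignored
        (5, 0, "Delta", 0),    # disambiguation page, ignored
        (6, 0, "Epsilon", 0),  # article, no talk page
    ])
    conn.execute("INSERT INTO page_props VALUES (5, 'disambiguation', '')")
    return conn


def pages_without_talk(conn, top_x):
    """Return up to top_x mainspace titles that have no talk page."""
    return [row[0] for row in conn.execute(NO_TALK_PAGE_SQL, (top_x,))]


if __name__ == "__main__":
    print(pages_without_talk(build_demo_db(), 10))
```

Ordering by popularity instead, as also suggested above, would need page view data that is not in these tables.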

New articles proposed to Merge

Category:All articles to be merged currently has about 2,600 articles. Could a report be generated that shows which of these are Unreviewed articles (still listed in Special:NewPagesFeed), by date of proposed merge? MB 20:25, 8 September 2022 (UTC)

I've used {{Database report}} to get this at Wikipedia:New pages patrol/Reports#Unreviewed articles with merge tags. It doesn't have the date of proposed merge, but maybe someone can tweak the sql further to get that. -MPGuy2824 (talk) 09:19, 2 November 2022 (UTC)

Links to userspace

A user link has been added to Template:Cleanup bare URLs, and as a result Wikipedia:Database reports/Articles containing links to the user space is now nearly 30,000 lines, any chance that template can be ignored?--Jac16888 Talk 15:14, 9 September 2022 (UTC)

You could also replace that link with Wikipedia:Citation bot, a redirect. – Jonesey95 (talk) 13:19, 11 September 2022 (UTC)
Good shout, thanks--Jac16888 Talk 15:38, 14 September 2022 (UTC)

Listing maintenance categories

For some reason, tonight's Empty Categories list has maintenance categories listed on it. They are typically omitted as they would overwhelm the content categories and because they don't stay empty for long and they do not get tagged for speedy deletion, CSD C1. I'm not sure if this talk page is monitored so I'll just ping Jonesey95 and see if they know what has happened. Liz Read! Talk! 01:11, 15 September 2022 (UTC)

They've gone and changed the schema on us. See quarry:query/67346; what's now in lt_namespace and lt_title in the new linktarget table used to be in tl_namespace and tl_title (which now seem to always be 0 and ''?), and the database report still assumes they're there. —Cryptic 01:34, 15 September 2022 (UTC)
Apparently it was announced in March. The change to templatelinks is (obviously) live; pagelinks, imagelinks, and categorylinks aren't yet, but will follow. —Cryptic 01:46, 15 September 2022 (UTC)
Liz, you can ping me any time. I noticed that Wikipedia:Database reports/Transclusions of non-existent templates had been blanked by the bot this morning and figured that something screwy was happening with a database or one of the servers, so I restored the previous report and figured I'd give things a day to sort themselves out. The above wikitech-l posting is gibberish to me, but maybe Fastily will know if and what things need to change in that report. If this change affects a bunch of reports, I expect that we'll see a thread on VPT in the next day or two. – Jonesey95 (talk) 02:57, 15 September 2022 (UTC)
This is a pretty succinct statement of how to update queries that read from templatelinks. For the non-existent template report, for example, you'd need to change this to this. —Cryptic 03:57, 15 September 2022 (UTC)
This is all like reading a Greek textbook to me but I have enormous confidence in your abilities to get to the bottom of this, Cryptic and Jonesey95. Thank you for looking into this. There are only one or two of us that utilize this database report but it's one I check daily and helps us keep on top of the category clutter that comes out of deleting articles at AFD and categories at CFD. It also helps us notice if a new editor (they are almost always new editors) goes on a tear, creating dozens of unused categories. And lately a very experienced editor has been working on a major job recategorizing pages that left hundreds of empty categories to tag and delete.
Now that I think about it, when there are problems with this list, I usually go directly to the bot operator, MZMcBride so I will ping him to this discussion in case he can follow all of this. I appreciate your help! Liz Read! Talk! 07:23, 15 September 2022 (UTC)
Thanks for the fix, @Cryptic! -FASTILY 07:50, 15 September 2022 (UTC)
Fastily, Wikipedia:Database reports/Transclusions of non-existent templates appears to be broken for the last couple of days. It should have 100+ entries on it every day (see typical pages in the history from a couple of weeks ago). – Jonesey95 (talk) 01:46, 19 September 2022 (UTC)
Thanks for letting me know, looks like I missed a change in Cryptic's example; this should be fixed now. -FASTILY 04:35, 19 September 2022 (UTC)
All fixed today, back to 270 entries. Thanks. – Jonesey95 (talk) 14:18, 19 September 2022 (UTC)
That was my error, not yours - I was fooled by there still being enough rows in templatelinks with tl_title not the empty string that the results looked right, when I hadn't found even a single non-empty instance before that. —Cryptic 17:40, 19 September 2022 (UTC)
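To make the schema change discussed above concrete, here is a small sketch of how a query that reads from templatelinks changes. It is not the actual before/after diff linked in the thread (those links are not reproduced here); it is a hypothetical "transclusions of non-existent templates" query run against a toy in-memory SQLite database. The toy rows imitate the mid-migration state described above, where tl_namespace and tl_title are 0 and '' and the real target sits behind tl_target_id in linktarget. The function name run is made up.

```python
import sqlite3

# Toy tables holding both the legacy tl_* columns and the new tl_target_id.
SCHEMA = """
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER, page_title TEXT);
CREATE TABLE linktarget (lt_id INTEGER PRIMARY KEY, lt_namespace INTEGER, lt_title TEXT);
CREATE TABLE templatelinks (tl_from INTEGER, tl_namespace INTEGER,
                            tl_title TEXT, tl_target_id INTEGER);
"""

# Old style: reads the target straight from templatelinks.
OLD_SQL = """
SELECT DISTINCT tl.tl_title
FROM templatelinks AS tl
LEFT JOIN page AS p
       ON p.page_namespace = tl.tl_namespace AND p.page_title = tl.tl_title
WHERE tl.tl_namespace = 10 AND p.page_id IS NULL
ORDER BY tl.tl_title
"""

# New style: resolves the target through linktarget via tl_target_id.
NEW_SQL = """
SELECT DISTINCT lt.lt_title
FROM templatelinks AS tl
JOIN linktarget AS lt ON lt.lt_id = tl.tl_target_id
LEFT JOIN page AS p
       ON p.page_namespace = lt.lt_namespace AND p.page_title = lt.lt_title
WHERE lt.lt_namespace = 10 AND p.page_id IS NULL
ORDER BY lt.lt_title
"""


def build_demo_db():
    """Create an in-memory database in the mid-migration state."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO page VALUES (?, ?, ?)", [
        (1, 0, "Some_article"),
        (2, 10, "Existing_template"),
    ])
    conn.executemany("INSERT INTO linktarget VALUES (?, ?, ?)", [
        (1, 10, "Existing_template"),
        (2, 10, "Missing_template"),
    ])
    # Legacy columns are blanked (0 and ''); only tl_target_id is meaningful.
    conn.executemany("INSERT INTO templatelinks VALUES (?, ?, ?, ?)", [
        (1, 0, "", 1),
        (1, 0, "", 2),
    ])
    return conn


def run(conn, sql):
    """Run a single-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql)]


if __name__ == "__main__":
    conn = build_demo_db()
    print("old:", run(conn, OLD_SQL))
    print("new:", run(conn, NEW_SQL))
```

On this toy data the old query finds nothing, which matches the blanked report described above, while the new one finds the missing template. A partially migrated table, where some rows still have a non-empty tl_title, can make the old query look plausibly right, which is the trap mentioned in the comment just above.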
Cryptic and Jonesey95, it happened again on tonight's Wikipedia:Database reports/Empty categories. Looks like they are maintenance categories involving files and Proposed deletions. There are plenty of empty clean-up categories that aren't appearing on this list; it's the daily, not monthly, maintenance. If you tell me that this situation will be lasting a while, then I'll stop pinging you every time it happens. Just thought I'd let you know. Liz Read! Talk! 01:13, 16 September 2022 (UTC)
Wikipedia:Database reports/Empty categories is updated by BernsteinBot, which is maintained by MZMcBride & Legoktm; you'll probably have to ask one of them to fix it. -FASTILY 02:50, 16 September 2022 (UTC)
Yes, I pinged MZMcBride (above) but I'll go to their talk page and ask about this. Liz Read! Talk! 03:11, 16 September 2022 (UTC)
Sorry, I fixed another tool of mine (ours even), forgot about these. I'm traveling tomorrow, so it might not be until Saturday that I have time to fix the reports. Legoktm (talk) 05:51, 16 September 2022 (UTC)
Hello, Legoktm,
You know how to fix this problem? That's great! I look forward to it. Liz Read! Talk! 01:17, 17 September 2022 (UTC)
I think I fixed most of them, hopefully the next runs of the reports are better. If there's a monthly report that's off let me know and I can kick it manually. Legoktm (talk) 23:19, 17 September 2022 (UTC)
Oh, my, Legoktm. Things went back to normal for a few days and then in tonight's report, things went bananas! Even worse than before. Ayiiieeee! Liz Read! Talk! 01:24, 22 September 2022 (UTC)
When I run the query in Wikipedia:Database reports/Empty categories/Configuration, I only get Category:MAX (band) video albums, Category:IIT Roorkee Alumni, and Category:Polish pilots, which looks right. —Cryptic 01:53, 22 September 2022 (UTC)
Ughhh, I have no clue why and I'm mostly offline tomorrow, if it's still wrong after tomorrow's update I'll start poking at it again... Legoktm (talk) 08:38, 22 September 2022 (UTC)
Well, everything is back to normal after last night's chaotic report. I don't know who did what but you all have my thanks! Liz Read! Talk! 01:28, 23 September 2022 (UTC)

Goodbye BernsteinBot, hello HaleBot

If you haven't seen the news yet, BernsteinBot has been disabled. HaleBot will take over most of the tasks that it used to do. There are a lot of scattered reports in various places; if you notice something isn't updating, please leave a note here and ping me.

A big thank you to MZMcBride for starting this project 14(!) years ago. Wikipedia is better because of it. Legoktm (talk) 15:10, 12 October 2022 (UTC)

@Legoktm please document this bot's tasks on its userpage. — xaosflux Talk 15:18, 12 October 2022 (UTC)
Legoktm, thank you for taking on this responsibility. – Jonesey95 (talk) 16:05, 12 October 2022 (UTC)
Hale hath no fury. --MZMcBride (talk) 18:18, 12 October 2022 (UTC)
Legoktm, thanks to you and the bot approval team for your swift action. And thanks to all of our bot operators, like MZMcBride, of past and present. The tools you create make our editing lives so much easier. Liz Read! Talk! 02:00, 13 October 2022 (UTC)

Legoktm, can you please check on Wikipedia:Database reports/Unused templates and Wikipedia:Database reports/Uncategorized templates? The former was updating daily, and the latter was weekly, so it is not overdue yet. MZMcBride was also developing Wikipedia:Database reports/Unused templates (filtered) just before the bot retired (discussion); that would be a useful daily report. Thanks. – Jonesey95 (talk) 12:54, 13 October 2022 (UTC)

I tried to get Unused templates working last night but messed up with the subst:#time calls, will fix that tonight. I found the code for the (filtered) report, I'll set that up tonight too. Uncategorized templates should be set to go on the regular schedule. Legoktm (talk) 15:43, 14 October 2022 (UTC)
I saw the update and figured you were working on it. Did you notice that there were undesirable underscores, and that links with parens in them were not quite right, e.g. 1910s_in_music_ (in code here to make sure that the underscores show)? Maybe that's all tied up in the subst work. – Jonesey95 (talk) 17:00, 14 October 2022 (UTC)
I did not notice that, it was me being lazy by using the pipe trick. Should be fixed now, though the last page of the report is missing because of an edit filter I just fixed. The (filtered) report is running daily now too. Legoktm (talk) 02:48, 15 October 2022 (UTC)
Nice work. It's good to have the reports running again. Please see this discussion for suggestions about how the filtered report could benefit from a few more filters. It should be able to fit on one page pretty easily. – Jonesey95 (talk) 05:32, 15 October 2022 (UTC)

Database report template

{{Database report}} template can now be used to set up one-off or periodically updating reports in userspace or project namespace, given an SQL query. The template doc lists the supported formatting options. Feel free to give it a try and let me know if you face any issues. – SD0001 (talk) 15:43, 28 October 2022 (UTC)

This is nice, thanks. Hopefully no-one kills the DBs with it. -MPGuy2824 (talk) 03:12, 29 October 2022 (UTC)
There are some protections in place to prevent anyone from killing the DBs with it, see phab:T320657 for details. – SD0001 (talk) 10:45, 2 November 2022 (UTC)

Polluted categories

Is there any particular reason why Wikipedia:Database reports/Polluted categories only runs once a month? Given the importance of cleaning polluted categories out, and the fact that running it only once a month means that there are typically hundreds of categories to deal with by the time it actually updates (thus making it an onerous task that people become significantly less likely to bother with at all), once a month isn't often enough. Bearcat (talk) 15:41, 22 November 2022 (UTC)

@Bearcat: how often would you like it to run? Legoktm (talk) 16:01, 22 November 2022 (UTC)
Weekly would be best, if possible, but every two weeks would also be okay if there's a reason why weekly isn't feasible. Bearcat (talk) 16:03, 22 November 2022 (UTC)
Done. Legoktm (talk) 16:19, 22 November 2022 (UTC)