User talk:WP 1.0 bot/Second generation

Feedback on the toolserver assessment stats

Per request by email, I'm offering my initial observations on the toolserver assessment stats. I've only really had a quick(ish) look at the summary tables, but these are the things I've noticed:

  1. ??? in the importance columns links to Category:Unassessed articles rather than specific subcategories of Category:Unknown-importance articles.
  2. An extra ??? quality row in addition to Unassessed in some tables (see http://toolserver.org/~enwp10/bin/table.fcgi?project=Gymnastics and http://toolserver.org/~enwp10/bin/table.fcgi?project=Lancashire+and+Cumbria, for example). The ??? row appears to include articles with assessments that are either unsupported by the project banner or unrecognised by the tool.
  3. Support for some additional classes (Current, Future, Category, Disambig, Portal, Template) but apparently not others (Book, File/Image, Merge, Needed, Project, Redirect, User). No support either for Deferred-Class (as used by WP Firearms) or SL-Class (used by WP Plants). Also no support for Bottom-importance or No-importance, or NA-Class/importance.
  4. What about Bplus-Class? I'm positive I saw this earlier at http://toolserver.org/~enwp10/bin/table.fcgi?project=Mathematics (though not in the table for WP Statistics, another project which uses the class), but it's not there now. Also, IIRC, it was misplaced between A and GA rather than GA and B.
  5. It might be an idea to order classes the same way as in {{cat class}} and other such templates, i.e. FL with List and Current & Future above the various non-article classes rather than below.
  6. "Feature requests" suggests that pages in user space are ignored, and I can appreciate the reasoning behind that. However, it may interfere with projects that do tag pages in that namespace, userboxes for example, or those (few) projects that have User-Class.
  7. Projects without importance assessments still get a redundant importance column in their tables.

Regards. PC78 (talk) 03:49, 6 December 2009 (UTC)

Thanks for the feedback. One thing that I see now is that I need to document several things more clearly. I also see several improvements I can make to the user interface.
I can explain nonstandard ratings in the context of the Gymnastics project, which you pointed out. They use a quality rating (Redirect-Class) which the bot does not include by default. (I am very happy to have a discussion about which classes should be default, but there will always be exceptions.)
To make the bot recognize their extra quality rating, all the Gymnastics project has to do is tweak a template on their assessments category (Category:Gymnastics_articles_by_quality) one time. I added this template just now and updated the project's data (without editing the source code of the bot at all). The result is this table which does include the class as it should. This also gets rid of the ??? quality row.
Really the ??? row is just a catchall for unknown classes; I need to make this more clear in the user interface. The reason there is not a better link is that there really is no appropriate link.
We do need to have a discussion about which quality classes should be included by default. One issue with projects using nonstandard classes is that it is less obvious how some of these relate to release versions.
For the smaller bugs above:
(1) is a user interface bug that I need to fix
(4) is a quirk of the current categorization, where Bplus class articles are both in the Bplus category and the GA category. This is necessary for the old bot, but it makes the new bot confused. At some point I will fix the categorization on the wiki and then it should be fine, just like Redirect-class was fine for Gymnastics
(5) The classes are sorted by a numeric "ranking" which is also used as part of the formula for calculating an article's release version score. Since FLs have a high ranking, they end up higher in the table. Because of the way that the project tables are generated, I do not have an easy way to change the sort order without hard-coding exceptions. So this is an issue worth discussing, but I have to make it a lower priority right now because of the difficulty/benefit ratio.
(6) This will need more discussion.
(7) is a user interface bug.
— Carl (CBM · talk) 13:55, 6 December 2009 (UTC)
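For readers following point (5), here is a minimal sketch of a ranking-based sort. The ranking numbers are invented for illustration and are not the bot's actual values; `RANKING` and `sort_classes` are hypothetical names.

```python
# Hypothetical rankings; the real values used by the bot are not given in this thread.
RANKING = {
    "FA": 500,
    "FL": 480,
    "A": 425,
    "GA": 400,
    "B": 300,
    "C": 225,
    "Start": 150,
    "Stub": 100,
    "List": 80,
}


def sort_classes(classes, ranking=RANKING):
    """Order classes from highest to lowest ranking; unknown classes sort last."""
    return sorted(classes, key=lambda c: ranking.get(c, -1), reverse=True)


# FL sorts next to FA because of its high ranking, not next to List.
print(sort_classes(["Stub", "List", "FL", "B", "FA"]))
```

Under these assumed values, moving FL next to List would mean either changing its ranking (which would also affect any score derived from it) or special-casing the sort, which matches the trade-off described in (5).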
Regarding nonstandard classes, the impression I had both from reading the front page and my own previous enquiries is that all of them (both for quality and importance) would be supported. I gather from your comment above that only some will be supported by default, while others will require the {{ReleaseVersionParameters}} template? For the most part these classes relate to non-article content, so I don't see any big problem with regard to release versions. A wider discussion would probably be a good idea.
What about the two ??? articles in the Lancashire & Cumbria table? Are they showing up because of the Low-importance ratings (both should really be NA-importance)?
In general I'm finding the ??? row a little confusing, in large part because of the "Unassessed link" which really isn't accurate. If there is no appropriate link as you say, then why try and link it at all? A plain text "Other" would be better, IMO.
As for #5, that seems fair enough for FLs, but less so for the others. I don't know very much about these numeric "rankings" that you refer to (you may have to talk me through it a little), but it seems that Current & Future articles should have a higher ranking than non-articles.
One other thing: I assume the summary tables uploaded to wiki will essentially be the same as they are now? I do have a few thoughts on this, but there are a couple of things I'd like to run by Martin first. Regards. PC78 (talk) 14:54, 6 December 2009 (UTC)
My viewpoint here is that I want the system to support all the quality and importance ratings that a project happens to use, but whether a rating is supported "by default" or via ReleaseVersionParameters is just a configuration issue. Either way the nonstandard ratings are supported.
The issue with Lancashire and Cumbria was exactly the same as with Gymnastics. Once I added and configured ReleaseVersionParameters, and updated the project, the table became correct [1]. The one remaining article that is causing a '???' on that table is Talk:Empress_Ballroom, which is in Category:Unknown-importance Lancashire and Cumbria articles. I thought that article should be in Category:Unassessed-importance Lancashire articles; is there some fine distinction between these?
In any case I have hard-coded the table to refer to unrecognized classes as "Other", which I agree is more clear.
Re #5, it is easy to put "Current" and "Future" articles between "List" and "Book" just like they are in {{cat-class}}, by just setting ReleaseVersionParameters correctly. I am working to document that template today.
The summary tables will look just like the tables that are being displayed on the web tool. Actually that web tool generates wikicode for the table first, then uses Wikipedia's servers to parse it into HTML, then displays that HTML to you. This will make it easy for the on-wiki tables to look exactly like the tables on the web tool. — Carl (CBM · talk) 21:01, 6 December 2009 (UTC)
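As a rough illustration of the first step described above (generating wikicode before it is parsed into HTML), here is a minimal sketch. The function name, the data layout, and the sample counts are assumptions made for the example and do not reflect the tool's real code. The parsing step is left out; it could be done, for instance, through the MediaWiki API's `action=parse`, though the thread does not say which mechanism the tool uses.

```python
def build_wikitable(counts, classes, importances):
    """Return wikicode for a quality-by-importance summary table.

    counts maps (quality, importance) pairs to article counts;
    missing pairs are shown as 0.
    """
    lines = ['{| class="wikitable"']
    # Header row: one column per importance level.
    lines.append("! Quality !! " + " !! ".join(importances))
    for cls in classes:
        cells = [str(counts.get((cls, imp), 0)) for imp in importances]
        lines.append("|-")
        lines.append("| " + cls + " || " + " || ".join(cells))
    lines.append("|}")
    return "\n".join(lines)


# Sample data, invented for illustration.
sample = {("FA", "Top"): 3, ("B", "Low"): 12}
print(build_wikitable(sample, ["FA", "B"], ["Top", "Low"]))
```

Because the same wikicode can be saved to a wiki page or parsed for the web tool, both displays would come from one source.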
BTW bugs (1) and (7) above are fixed now. — Carl (CBM · talk) 23:18, 6 December 2009 (UTC)
Thanks for the above, I'll have a proper look at everything later. Category:Unknown-importance foo articles is the de facto naming convention for such categories though. PC78 (talk) 23:39, 6 December 2009 (UTC)
That's not quite what I was asking with regard to the Lancashire & Cumbria table. Previously the table was picking up two specific NA-Class pages, not the whole 40 pages that are in the category, and I was wondering why that was when it seems that NA-Class pages are ordinarily ignored? PC78 (talk) 20:10, 7 December 2009 (UTC)
I think we are looking at this from somewhat different perspectives, so I am not quite catching the point of your feedback; sorry about that. I investigated rebuilding the Lancashire table to see what was going on without the NA class, but now that the class is in the database I am afraid to try to delete it (don't want to break the database).
Here is what I think should have happened. As long as the importance was assessed, the bot would notice all the NA quality articles; as long as the quality was recognized, the bot would also notice all the NA-importance articles. But because it did not recognize the NA rating, it would replace it with "Unknown" (which is now "Other" in the tables). Once the bot was told about the NA quality and importance ratings, so that it recognized them, then it made the table correctly.
If you see another project that has the same problem the Lancashire one did, let me know and I will investigate it thoroughly. I also want to double check if the current table seems correct to you. I will double-check again myself, too. — Carl (CBM · talk) 22:13, 7 December 2009 (UTC)
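The fallback behaviour described in this comment can be sketched as follows. This is only an illustration of the explanation given ("what I think should have happened"), not the bot's actual code; `normalize_rating`, `tally`, and the sample articles are hypothetical.

```python
from collections import Counter


def normalize_rating(rating, recognized, fallback="Other"):
    """Keep a rating the project is configured for; otherwise use the catch-all."""
    return rating if rating in recognized else fallback


def tally(articles, recognized_quality, recognized_importance):
    """Count articles per (quality, importance) cell of the summary table."""
    table = Counter()
    for quality, importance in articles:
        q = normalize_rating(quality, recognized_quality)
        i = normalize_rating(importance, recognized_importance)
        table[(q, i)] += 1
    return table


# Two invented pages: one NA-Class with Low importance, one B-Class with NA importance.
articles = [("NA", "Low"), ("B", "NA")]

# Before NA is configured, both pages land in an "Other" row or column.
print(tally(articles, {"B"}, {"Low"}))

# After NA is added to both rating sets, they appear under NA.
print(tally(articles, {"B", "NA"}, {"Low", "NA"}))
```

In this sketch a page only shows up as "Other" when its other rating is recognized, which would explain why only a couple of NA-Class pages appeared in the earlier table rather than the whole category.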
I have fixed the problem with 'Unknown-importance' not being recognized, as well. The data for the projects is updating right now. — Carl (CBM · talk) 01:10, 8 December 2009 (UTC)
OK, I hear what you're saying and that's pretty much what I assumed was going on with that particular table (it looks fine now, BTW). However, would it not be better to have the bot simply ignore all pages with an NA-Class rating, and therefore make importance a secondary consideration rather than an equal one?
On a related note, I have two other observations related to NA ratings. First, it appears from looking at most tables that NA-importance is included by default, which seems rather illogical when NA-Class is not. Second, the Aesthetics table has an "Other" importance column in place of "NA", and I'm wondering if there's a reason for that.
Tables for projects without importance ratings are looking good. :) PC78 (talk) 01:47, 8 December 2009 (UTC)
We came to the same conclusion regarding NA-Class quality; I added it and restarted the updates. But the new update started where the other was stopped, with "Belgium-related". It will wrap back around to cover the A articles at the end of its run. So Big Brother shows the new settings. Actually, I changed the sort order just now to move "Current" and "Future" higher up in the list; it will take another update for that to be live on all the projects. — Carl (CBM · talk) 01:57, 8 December 2009 (UTC)
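The wrap-around order mentioned here can be pictured with a small sketch. It assumes the update simply walks the project names alphabetically and resumes from a given name; the thread does not describe the real scheduling, and `resume_order` is a hypothetical helper.

```python
import bisect


def resume_order(projects, resume_from):
    """Return projects in alphabetical order, starting at resume_from and wrapping around."""
    ordered = sorted(projects)
    # First position whose name is >= resume_from.
    start = bisect.bisect_left(ordered, resume_from)
    return ordered[start:] + ordered[:start]


# Project names taken from this discussion.
projects = ["Gymnastics", "Aesthetics", "Big Brother", "Belgium-related"]
print(resume_order(projects, "Belgium-related"))
```

With this ordering, a project such as Big Brother is refreshed early in the restarted run, while projects starting with "A" are reached only at the end.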
Re Aesthetics, the category was not set up correctly: [2]. I have updated the data for that project, and now the table is better. — Carl (CBM · talk) 02:06, 8 December 2009 (UTC)
About the tables that are uploaded here: would it be possible for them to be in HTML format? It would be good if you could offer your input to this discussion. Regards. PC78 (talk) 13:34, 9 December 2009 (UTC)

Beta testing 2009-12-16

I think that the new system is ready for beta testing. There are quite a few things that still need to be done, but getting feedback from a wider collection of users will be very helpful in shaping the direction of development. Please feel free to use this page for discussion, and file any bugs at the bug tracker for this project on toolserver. The existing bot will still be running, so this beta test does not affect anything on-wiki right now.

The URL for the new system is http://toolserver.org/~enwp10.

The most helpful comments on the new system are about:

  1. Things that don't work correctly
  2. Things that don't work as expected
  3. Things that don't make sense
  4. Extra features that you wish were implemented

I'll try to answer questions here as they come up, but please be patient if it takes me a while to respond. — Carl (CBM · talk) 01:39, 17 December 2009 (UTC)

Where's the UI for request #15? Titoxd(?!? - cool stuff) 20:05, 18 December 2009 (UTC)
Maybe I misread it; I thought it was talking about the bar graphs on the project index page. I'll check with Headbomb to see if there was something else he had in mind. — Carl (CBM · talk) 20:16, 18 December 2009 (UTC)
Heh, I understood something like this... and I hope it's the former... Titoxd(?!? - cool stuff) 17:44, 19 December 2009 (UTC)
Those graphs are interesting, and I do think people would be interested in them. On the other hand, the easiest way I can see to make them is to look at the history of the assessment tables that are stored to the wiki. Is that how they are actually made? If so, it seems like the sort of thing that we could do semi-dynamically, with another table in the database. There are issues we would have to worry about, like new rating levels being added. I opened a JIRA request about it at ENWPONE-13. Could you comment there to explain how you originally created those graphs? — Carl (CBM · talk) 21:33, 19 December 2009 (UTC)