Wikipedia talk:Wikidata/2017 State of affairs/Archive 14


Status of descriptions

@DannyH (WMF): what is the status of the Wikidata descriptions on enwiki? Are you waiting for us to do anything, or are you progressing with the development and implementation of the magic word? Is there a Phab ticket for this where we can see what is done and what needs to be done, and perhaps some timing? Fram (talk) 08:21, 7 December 2017 (UTC)

Earlier today, the semi-protected Tel Aviv article (45,000 pageviews yesterday!) for more than three hours had the description "The Capital Of Israel"[1], after an IP changed it on Wikidata (they wouldn't have been able to make that edit on enwiki!). That this kind of politically charged editing is being shown to enwiki readers is completely unacceptable, and should have been fixed a long time ago by the WMF. Can you please finally do something about this? This discussion has been going on for months. Fram (talk) 13:45, 7 December 2017 (UTC)

Yannow I was thinking it was phab:T152743 or one of its daughter tasks but apparently not... Jo-Jo Eumerus (talk, contributions) 13:59, 7 December 2017 (UTC)
Thanks. I did a search for this on Phab and couldn't find a ticket for this either. Fram (talk) 14:30, 7 December 2017 (UTC)
Hi Fram, the thread where we were talking about this got archived: Wikipedia talk:Wikidata/2017 State of affairs/Archive 12#November Magic Word proposal from WMF. The team that's going to work on the magic word is planning to start making the changes in January, with estimated finish by the end of February. I don't actually know the tickets right now, but I'll find out, and report back. -- DannyH (WMF) (talk) 16:03, 7 December 2017 (UTC)
Fram, we reached agreement to collaborate on a multi-option RFC. I meant to get a draft started, but haven't been able to focus on it yet. I can't get to it today, my brain is fried on lack of sleep. Alsee (talk) 17:40, 7 December 2017 (UTC)
@DannyH (WMF) and Fram: As with @Alsee, I understood that this was heading to an RfC, *not* immediate implementation of a magic word. Thanks. Mike Peel (talk) 22:01, 7 December 2017 (UTC)
Yeah, I'm happy to collaborate on an RfC whenever it gets started. But we offered to make the magic word in Jan/Feb, and I don't want to go back on that offer just because the conversation quieted down. -- DannyH (WMF) (talk) 00:49, 8 December 2017 (UTC)
Oh, and to answer Fram's question above: there isn't a ticket yet. We're talking about requirements. -- DannyH (WMF) (talk) 00:51, 8 December 2017 (UTC)
Thanks for the answer, but that's terrible. This has been decided by RfC long, long ago, this has been discussed to death for months, and you don't even have a first ticket for this and are talking about requirements (with whom? Not with us, clearly). What you (WMF) need to do is simple: develop a magic word, develop an option to have a blank description (wherever we want it, not on some pre-defined list of article types: this is preferably as simple as "no magic word is no description"), and make it available. The only thing that needs discussion between WMF and enwiki is "Hi enwiki, the magic word is ready, please start using it and tell us when to disable the showing of the Wikidata description on Enwiki completely and everywhere". Nothing else still needs to be decided or discussed by the WMF, or else you should come here and ask it. Fram (talk) 05:38, 8 December 2017 (UTC)

Wouldn't it be more logical to proceed now with the RfC, before anyone starts developing the so-called magic word? Depending on the result, it might not be needed at all, right? And, in any case, the RfC is completely independent of that development.-- Darwin Ahoy! 01:06, 8 December 2017 (UTC)

We need to start an RfC to discuss which description we want to have, and a bot request to populate the magic word then. Fram (talk) 05:38, 8 December 2017 (UTC)

RfC started at Wikipedia:Village pump (proposals)#RfC: Populating article descriptions magic word. I will post this at CENT as well. Fram (talk) 09:59, 8 December 2017 (UTC)

@Fram: That seems to be missing the key question of 'do we want to use a magic word for descriptions?'. Thanks. Mike Peel (talk) 11:55, 8 December 2017 (UTC)
I thought that was the compromise agreed upon with the WMF? What else would you use, a template? Fine by me, but that seems to be harder to implement in the apps and so on for the WMF, so I didn't see the benefit of arguing about that aspect, and neither did apparently anyone else here since the magic word was proposed. Fram (talk) 11:59, 8 December 2017 (UTC)
Please show me the consensus for that. As far as I can see, there are a few options - no descriptions, Wikidata descriptions (with/without extra visibility/editing here), magic word, local template, etc. There was some discussion here about it, but not a wider RfC. And now this has jumped to 'how do we do this in practice' without saying 'do we want to do this?'... Thanks. Mike Peel (talk) 12:13, 8 December 2017 (UTC)
I believe Mike Peel is right. It should first be asked/decided if descriptions are useful/needed at all, and then ask/decide about their implementation. (Personally I believe it will be a pacific win for the "yes, they are useful" but it's important that it gets referended and recorded, instead of sounding like yet another WMF imposition). Another important thing is what to do with newly created articles, as they will be missing the magic word by default (bot adding?). Frankly, this "magic word" thing sounds *a lot* like a bad patch, something that will inevitably have to be resolved/replaced in the middle term. Descriptions are obviously useful, and should be somehow part of the mediawiki environment where we edit, not those artisanal, primitive, and obviously insufficient (already by design) "magic words". IMO the "magic word" path should only be followed if it can be easily transformed in the (near) future into some Mediawiki embedded feature, so that the immense work that is going to be done now can be entirely used to populate that feature. If this is pacific, then it's OK to proceed with the "magic word" option.-- Darwin Ahoy! 12:30, 8 December 2017 (UTC)
If there truly are some (or many) people advocating for "no, we shouldn't have descriptions, period", then an RfC on that question is useful. But an RfC on a question that isn't really disputed in the first place is a waste of time. I have no idea what kind of "part of the mediawiki environment" you envision, so I can't really comment on that. Fram (talk) 12:49, 8 December 2017 (UTC)
@Fram: As descriptions seem to be something that will become kind of mandatory, and linked to app features, lists, and much more, I was thinking on something like a description box, somewhere around the article, kind of what you already have in Wikidata, but local to the Wikipedias. A "magic word" looks a lot like a temporary patch, not something for the long - not even the middle run. At least it has to be designed so that it allows for a future export to such a feature, in case it gets implemented in the future (something very probable, IMO, it should be embedded by design, not used as a template, or even a magic word).-- Darwin Ahoy! 15:33, 8 December 2017 (UTC)
(ec)"Some" discussion? Two months, 7 archives. The WMF have made it clear that a magic word is the solution, not an option. You are free to start an RfC on that of course, but it seems a lot more useful to try to find a workable compromise to finally get this resolved, than to start a fruitless RfC on whether the WMF should use a template instead of a magic word and so on, and then have the current RfC anyway. And "Wikidata descriptions" is the one option we already had an RfC about and which was then clearly rejected. The options now are no descriptions, or descriptions on enwiki through a template or a magic word. On the latter, the WMF clearly prefers a magic word for technical reasons, and I see no good reason to force the use of a template instead in that case. Which leaves us with "no descriptions" vs. "magic word", and just simply disallowing descriptions on enwiki seems counterproductive (and not as far as I know supported by anyone).
So we all want to provide descriptions in some cases, an RfC has decided that we don't want this to be Wikidata-hosted descriptions, and the WMF has made it clear that for them a magic word is then the best solution. Since there was no progress since then and no RfC was started (but is needed), I started this one. If you now want to start a counter-RfC, then be my guest, but first think about how this would benefit both enwiki and the WMF. Fram (talk) 12:39, 8 December 2017 (UTC)
As Alsee had said, we'd agreed to collaborate on a multi-option RfC -- I was just waiting for someone to start writing it, so I could participate. This RfC didn't include me, so I added my piece in the discussion under the "What to do with blanks" section. -- DannyH (WMF) (talk) 15:02, 8 December 2017 (UTC)
Well, everyone seems to disagree on what was actually agreed upon, if anything, so now we at least get a discussion where everyone can participate. Although I do hope that the quality of contributions will in general be better than "I don't know of any examples where a blank description would be better" right below a blatant example of such a case, as that gives the impression of someone whose idea of a "collaborative RfC" is "a place where I can repeat my thoughts without having to read what others have said". Not really the best start you could take there... Fram (talk) 15:57, 8 December 2017 (UTC)
As I wrote there, I agree that the vandalism response rate on Wikidata is too slow -- I cited your example, and said it's disappointing and frustrating. I think the solution to that is to make that response rate better, by making it easier for Wikipedia editors to monitor and fix vandalism of the descriptions. I disagree that the best solution is to pre-emptively blank descriptions because we know that there's a possibility that they'll be vandalized. I'm asking for specific examples where editors would make the choice to not show a description on the article page, because a blank description is better than the majority of good-to-adequate descriptions already on Wikidata. -- DannyH (WMF) (talk) 16:07, 8 December 2017 (UTC)
Since none of the other solutions exist yet, the best solution by far now is to have a blank description if there is no onwiki description. Making it necessary for enwiki editors to edit Wikidata to solve problems visible on enwiki is the big dream of the WMF and some ardent Wikidata proponents, but I (and, as far as I can judge from many reactions over the months, many others) have no intention to start editing two sites to have one good article. The reason we use Commons is that we usually can trust Commons to deal with vandalism swiftly on their own. If Commons files were a continuous source of long-lasting vandalism on major articles, we would soon decide to host most files locally instead. Anyway, this was given in the past as well, but do you really think Spirit (Depeche Mode album) needs the description "Depeche Mode album"? Seems somewhat redundant... Such articles don't need a description on enwiki, but need one like the current one on Wikidata. Fram (talk) 16:16, 8 December 2017 (UTC)
Fram, I just fixed World War II (Q362) after you pointed out the vandalism. The process was exactly the same as fixing it on Wikidata: click history, click undo, write "revert vandalism." Everything was in English and I used my same username. If you aren't willing to do that, I think you're being obtuse, because you'd rather point out a flaw and wage a boundary dispute with Wikidata, than WP:FIXTHEPROBLEM.--Carwil (talk) 16:53, 8 December 2017 (UTC)
Good for you. Why would I revert vandalism on a site which I believe should not be used on enwiki anywhere but where the use of descriptions has been forced upon us by the WMF without adequate tools to deal with them? Just like they did with e.g. Gather, which they were very, very reluctant to abolish even when it became clear that we didn't want it? I am not going to clean up the mess they created, I want to stop the source of the mess, that's all. Fram (talk) 09:04, 9 December 2017 (UTC)
Fram, I agree that there needs to be a real solution to helping Wikipedia editors monitor and edit Wikidata descriptions; I'm anxious to hear Lydia's update on the progress on getting the descriptions into WP watchlists, which is a necessary (but not a final) step. I think the desire for a fix now is absolutely understandable. At the same time, I don't want to build a feature now that could potentially mass-blank thousands of good-to-adequate descriptions, when we could be working on getting the moderation to work properly.
I really believe that the short descriptions are valuable to the readers and editors who see them, and building a magic word that defaults to no description means taking the existing descriptions away by default. It would take away a feature that a lot of people use for reasons that they don't know or care about, and I don't see how that could be a positive step. I'm advocating for the magic word override, here and internally, because it will result in higher-quality descriptions for the users. Blanking the descriptions by default will result in lower-quality descriptions (i.e., none at all). I know this situation is frustrating right now, but the most important thing is that people get the most value out of reading Wikipedia.
Thanks for bringing up the Depeche Mode page -- you're right, I'd forgotten about that example. I think in that case, the best description would be "2017 Depeche Mode album". I think mild redundancy around disambiguators is pretty much inevitable. -- DannyH (WMF) (talk) 01:24, 9 December 2017 (UTC)
No, such mild redundancy is not inevitable at all, unless you are completely unwilling to allow blank descriptions and look for excuses all the time. You want examples where blank would be better, but vandalism doesn't count, redundancy doesn't count, and everything else could probably also be changed or would be acceptable to you. Perhaps you should care more about BLP violations than about keeping your precious descriptions at all costs? Can you tell me what progress has been made in the last 9 months by the WMF? They can't even get the enwiki watchlist to show English labels instead of P-numbers and Q-numbers for Wikidata changes, which means that it is completely unclear what is being changed. This has been asked for years... "At the same time, I don't want to build a feature now that could potentially mass-blank thousands of good-to-adequate descriptions, when we could be working on getting the moderation to work properly." Tough luck, you (WMF) have stalled this for long enough, didn't keep your initial promise, and show very little interest in cooperating now or listening to concerns. As usual (Flow, Gather, MV, ACT, ...). Give us the magic word, and we will populate it and we will decide where we need descriptions (and which ones) and where a blank might be better. These are content decisions, and the WMF should stay the hell out of content decisions. Fram (talk) 09:04, 9 December 2017 (UTC)

The case of Wikipedia titles with parenthetical disambiguators is one of the few places where we might actually want blank descriptors to ease reading, so it's a decent example where a magic word helps both projects. Even there, maybe we should consider ways of incorporating slightly more informative descriptions on both projects. I recently shepherded ORFN, a page about a graffiti artist named Aaron Curry, through AfC. Now there was already an Aaron Curry (artist), so this involved generating some hatnotes and tweaking the Wikidata descriptions. What I learned is that the text between the two parentheses is often an inadequate description. So we can't just shut off short descriptions because of a parenthetical disambiguator; someone needs to make an editorial decision for each. This is even more true at disambiguators that state a domain, like Aaron Curry (American football).--Carwil (talk) 12:55, 9 December 2017 (UTC)

Statistically, what fraction of Wikidata descriptions are bad?

Given that various claims and counter-claims have been made about how bad or good Wikidata descriptions are, but I don't think anyone's yet looked at a reasonable sample, I've written some code that fetches N random articles (using RandomPageGenerator, the pywikibot equivalent of Special:Random) and their Wikidata descriptions, and I've put the result for 1,000 articles at User:Mike Peel/Wikidata descriptions. 385 descriptions were blank, and 5 articles didn't have Wikidata entries. Looking through the rest, I can only spot one (0.1%) that's actually bad - "Tiger Mangweni - Rugbpy player" (typo now fixed) - although there are many that could be improved. Anyone else want to have a look through and see what they think? I can refresh/enlarge the sample easily if needed. Thanks. Mike Peel (talk) 23:53, 11 December 2017 (UTC)
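The sampling approach described above could be sketched roughly as follows. This is only an illustration, not the script that produced the linked list: the function names `fetch_descriptions`, `tally` and `bad_share` are hypothetical, the restriction to the article namespace is an assumption, and the exception name `NoPageError` applies to recent pywikibot releases (older releases used a different name). The pywikibot import is done lazily so that the counting helpers work without it.

```python
from collections import Counter


def fetch_descriptions(n, lang="en"):
    """Yield (title, description_or_None, has_item) for n random articles.

    Requires pywikibot and network access. Assumes namespace 0 (articles).
    """
    import pywikibot
    from pywikibot import pagegenerators

    site = pywikibot.Site(lang, "wikipedia")
    pages = pagegenerators.RandomPageGenerator(total=n, site=site, namespaces=[0])
    for page in pages:
        try:
            item = pywikibot.ItemPage.fromPage(page)
            item.get()  # load labels, descriptions, etc.
        except pywikibot.exceptions.NoPageError:
            # The article has no connected Wikidata item.
            yield page.title(), None, False
            continue
        yield page.title(), item.descriptions.get(lang), True


def tally(rows):
    """Count articles with a description, a blank one, or no item at all."""
    counts = Counter()
    for _title, description, has_item in rows:
        if not has_item:
            counts["no_item"] += 1
        elif not description:
            counts["blank"] += 1
        else:
            counts["described"] += 1
    return counts


def bad_share(bad, total):
    """Fraction of entries judged bad by a human reviewer."""
    return bad / total
```

With the figures quoted above, and assuming the 5 articles without items are counted separately from the 385 blanks, 610 of the 1,000 sampled articles had a description. One bad description is 0.1% of the whole sample, or roughly 0.16% of the 610 articles that actually had a description.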

Thanks, that's useful. A quick scan of the list confirms my (limited) experience, namely that the descriptions are useful for mobile readers. Some descriptions should be improved, but that's true of a lot of things. Johnuniq (talk) 01:47, 12 December 2017 (UTC)
Yeah, thanks for pulling this data, it's really helpful to see a random set. There are a lot that should be improved. I think any of them that are one word long -- book, ship, academic -- need some kind of qualifier. Some of them are too long, and occasionally there's a full sentence. But I don't see any that are actively harmful or wrong, the worst you can say is that the short ones aren't descriptive enough.
I made a change to the examples where you got an error -- that happened when an article topic didn't have an associated item on Wikidata. Those are just displaying as not having a description; there isn't an error message on the page. (I checked.) So I changed the label from ERROR to (no Wikidata item), to express what's actually seen on the page. -- DannyH (WMF) (talk) 02:05, 12 December 2017 (UTC)
I gave my impression of that sample on the RfC page. There are some clear errors in the list (including a Dutch description), many that are superfluous or confusing (the "Wikimedia" ones, or ones that repeat the disambiguation) and many, many that should be improved. The 0.1% error rate is not really correct in any case... But basically, yes, we need descriptions in many cases, there is little dispute of that. Whether this set is better than a set created automatically from the first lines of the articles isn't clear though, other tests on this page (archives) indicate that using information from the first line in general produces better results (though obviously not in all cases). Fram (talk) 08:29, 12 December 2017 (UTC)
Mike Peel, This is a useful list. There are few cases of where a description is not needed, and does not exist, and a few cases of where a description is not needed, but does exist and Wikipedia would be better served by a blank, also cases where the description is good enough as a start, and even some where it would be difficult to improve the description. It probably demonstrates the full range of quality available. Most of the missing descriptions appear to be in places where descriptions are needed. Some of the descriptions that exist but do not look useful for Wikipedia look useful for Wikidata. I did not spot any that are obviously wrong, but then I did not compare most of them to the article content, so if they looked plausible, I would not pick up a problem.
DannyH (WMF), If the worst you can say is that the short ones are not descriptive enough, you probably haven't looked very closely at the quality, and did not see the ones which are obviously redundant. · · · Peter (Southwood) (talk): 16:03, 12 December 2017 (UTC)
Peter, I have looked closely at the list; I'm making a slightly different judgment about the impact of redundancy. I think mild redundancy is acceptable, especially with disambiguation phrases. For example, Shine On (Ralph Stanley album) has "album by Ralph Stanley" -- I think a better description would be "2005 album by Ralph Stanley"; if I was involved in determining the format for record albums that's the one I'd support. But either way, the title is redundant with the first sentence of the article: "Shine On is a 2005 album by American bluegrass artist Ralph Stanley." Seeing this sequence on an article page -- "Shine On (Ralph Stanley album)", "2005 album by Ralph Stanley", "Shine On is a 2005 album by American bluegrass artist Ralph Stanley" -- has some redundant information, but the title and first sentence have redundant information anyway. I don't think that redundancy confuses or annoys anybody; you'd just skim past it and move on with the article. I don't see the harm.
If I was involved in determining the format for short descriptions on album pages, I'd want to have a uniform format that everyone could follow, so people wouldn't have to make case-by-case judgment calls every time there's a disambiguation phrase. Shine On (Ralph Stanley album) would have "2005 album by Ralph Stanley", Making Love (album) would have "1999 album by Atom and His Package", and The Conquest of You would have "1997 album by Kid Creole and the Coconuts". Is that the kind of redundancy that you're concerned about?
Also, I'm curious about the items where you say that a description isn't needed, and Wikipedia would be better served by a blank. Can you share those examples? -- DannyH (WMF) (talk) 17:23, 12 December 2017 (UTC)
And here I was thinking you didn't involve yourself with content decisions. If most editors on enwiki would decide that in cases like these, it is better to have no description than a redundant or too long and detailed one, then who is the WMF to interfere with that? Stick to your role please. Getting the same information twice in a row (title and first sentence) is enough, getting it three times in a row is ridiculous, and not needed for these use cases where you get the title but not the first line. Will anyone think "Oh, it's the 2005 album, I was looking for the 2004 album"? Just as many as will think "Oh, I was looking for the album with song X" or "I was looking for the album where Y featured on a song" or whatever. A description should give enough information to make the general subject clear, especially for those cases where the subject isn't clear from the title. That's it. If we want longer descriptions following some rule, we can easily decide this here and program a bot to fill or change them if wanted. We don't need the WMF holding our hands to guide us to the best description. They have better things to do. Fram (talk) 17:45, 12 December 2017 (UTC)
Yeah, I was just using that as an example. Like I said, once there's a magic word, Wikipedia editors make decisions on how to use it. The question that people are asking in this discussion is whether the existing descriptions are good or bad; that's a judgment call, and I'm expressing my opinion about it. -- DannyH (WMF) (talk) 18:05, 12 December 2017 (UTC)
Some input on the application of the short description from WMF is useful, as they will be using it to distribute our content. What we need from them is how they will use it, and what constraints there may be on text length and that kind of thing, so we can try to produce useful, as well as accurate, descriptions. Is there an optimum length range? Is there a constraint on maximum number of characters for display reasons? Should we try for more information, or just enough to distinguish between likely targets? · · · Peter (Southwood) (talk): 05:59, 13 December 2017 (UTC)
The disambiguation and list article pages (there are 68 listed in the sample) are clearly neither errors nor vandalism. The problem of how to use these labels is worth discussing. I find "Wikipedia disambiguation page" to be a useful subtitle, but others' perspectives may differ. Lists are typically adequately described by their title, and the subtitle should be suppressed. However, in both cases, there are good reasons for this text to be visible in the VisualEditor. If an editor is inserting a link into an article, it's very helpful to know whether "James Montgomery" is an article about a person, or a disambiguation page.--Carwil (talk) 22:55, 12 December 2017 (UTC)
Yes, and that applies even more to someone browsing Wikipedia on a mobile device where the typical approach is to search for a topic or name, then select a page to view from the presented list. Such a reader would quickly learn what "Wikipedia disambiguation page" means and would regard that description as very useful. Johnuniq (talk) 02:18, 13 December 2017 (UTC)
Wikipedia disambiguation page is also better than Wikimedia disambiguation page, as it is more specific and more readers will know what Wikipedia means. It also allows disambiguation between the projects if applicable. If the search only targets Wikipedia, then "Disambiguation page" without the qualifier would be better. This would be easily done on Wikipedia, but would reduce the value of the Wikidata description, where the project name is useful, as Wikidata links to other projects besides Wikipedia where the subject matches, and Wikivoyage, for example, also uses disambiguation pages. This is not so much a harmful thing as a lack of elegance, so it is not urgent, just preferable. · · · Peter (Southwood) (talk): 05:59, 13 December 2017 (UTC)
DannyH (WMF), In any case where the article title describes the subject as fully as a reasonable length short description would, the short description is redundant. It is always possible to write a longer description with more information, but I understand the function of the short description is to give just enough for the reader to recognise which article from a group is most likely to be the one they are looking for. Perhaps you need to define the purpose of the short description more precisely.
A case in point here is those album article titles you mentioned. The existing short descriptions are redundant (when they contain no information that is not already in the title) and look unprofessional, but when you add information such as the date, they become more useful. As they are very short to start with, and not much longer when expanded, they would be acceptable as providing useful information, and not exceeding the limits of a short description. Much depends on whether the added information actually helps the reader find the right article. As explained by Fram above, some added information may not be useful for this purpose. · · · Peter (Southwood) (talk): 06:07, 13 December 2017 (UTC)