Wikipedia talk:Wikidata/2017 State of affairs/Archive 5


Wikidata has no BLP or V policies

Pulling this out of the above:

The biggest problem is the differences in policy. Wikidata has no verify policy and no BLP policy, and has rejected having them. And people can run bots to change many fields with very little oversight. And the SEO problem. All of those are issues internal to Wikidata that make it generally incompatible with en-WP. And we cannot make those changes over there; they will make them if and when they are ready to. We cannot get into a situation where there is data in Wikidata that violates en-WP's BLP policy and is getting imported here, and we have no basis for removing it there or stopping whoever is adding it there from doing so. Jytdog (talk) 20:33, 8 September 2017 (UTC)

to which Andy Mabbett replied above:

"The biggest problem is the differences in policy. Wikidata has ... no BLP policy, and has rejected having [one]" that is absolutely false - what was rejected was a specific draft, which was unfit for purpose. The biggest problem with Wikidata appears to be the FUD that people spread about it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:23, 9 September 2017 (UTC)

And I have just removed an unsubstantiated claim to the contrary from the page. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:13, 10 September 2017 (UTC)

It is fact that Wikidata has neither a BLP policy nor a Verify policy, Andy. I am well aware that proposals for such policies were rejected; that is not the point. The point is that they have no policies. I have no idea what you are reacting so negatively to and calling "FUD" and "absolutely false" and misleading. It is just a simple fact. If I am incorrect, please provide a link here to each of those policies at Wikidata. Thanks Jytdog (talk) 02:43, 11 September 2017 (UTC)

Though it may come as a surprise to you, putting the word fact in bold on your comment adds no additional weight to your argument. As for proof, where is your evidence that the Wikidata community has "rejected having [a BLP policy]"? Hint: there is none, for it has not. Hence, to say that it has, is "FUD" and "absolutely false" and "misleading". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:47, 11 September 2017 (UTC)
They held an RFC, in which you participated. Wikipedia requires the same level or stricter of adherence to our BLP policy in order to use information in a living person's biography. Unless Wikidata is willing to have a policy that contains the same level of restriction on content, saying 'too vague' then claiming 'Wikidata has not rejected a policy' is being disingenuous, as you personally, as have others at Wikidata, made it perfectly clear multiple times you are unwilling to have that level of restriction on content. So one of two things is ultimately going to happen. 1) Wikidata sourced content will be barred in part or in full from use on Wikipedia, from displaying on Wikipedia articles on mobile devices etc. This is within the remit and technical means of the Wikipedia community to do. Or 2) Enough editors on Wikipedia are going to get annoyed with the intransigence of the Wikidata regulars (and the WMF's determination to integrate Wikidata into apps etc) to go and impose Wikipedia sourcing requirements on Wikidata. Personally I am placing bets on the first, because it's easier to do than attempting to herd 50 cats to go and impose some form of order on your pet project. Only in death does duty end (talk) 09:44, 11 September 2017 (UTC)
"you personally... made it perfectly clear... you are unwilling to have that level of restriction on content" False; and still no evidence that the Wikidata community has "rejected having [a BLP policy]". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:27, 11 September 2017 (UTC)
Andy Mabbbet that was one of the silliest things I have read a putatively serious person write in WP. Wikidata has no BLP nor V policy and you are niggling over why? The fact is that there is no BLP and no V policy in Wikidata. You can try to blow smoke up my ass all day and it won't change that a whit. Jytdog (talk) 10:27, 11 September 2017 (UTC)
Who's he? Still no evidence that the Wikidata community has "rejected having [a BLP policy]". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:27, 11 September 2017 (UTC)
  • User:David Eppstein -- about this, the WMF board resolution is not a policy for the WMF nor for Wikidata. It is just the board urging the various communities to take these issues with appropriate care. I clarified here.
For both David and Andy. Please be aware that I am very interested in people being clear that a) en-WP and Wikidata are distinct WMF projects, each with their own governance; and b) there are stark differences in governance between en-WP and Wikidata. I am not out to bash anybody. The Wikidata community is young (as a community) and it has a long way to go in terms of developing the policies and guidelines and procedures through which it governs itself. That is not a good thing or a bad thing; it just is what it is and we open up worlds of trouble if we don't respect the differences. Jytdog (talk) 03:04, 11 September 2017 (UTC)
Thanks, and yes, I agree. It's not a bad thing that Wikidata has different policies than here on en, and even if they did copy our policies wholesale it would necessarily make them different from some of the other Wikipedias. But this state of affairs means that it's problematic to import their content (instead of just using them as a repository of citation metadata via {{cite Q}} or for tracking interwiki links, both of which I think are very good ideas). —David Eppstein (talk) 04:01, 11 September 2017 (UTC)

Wikidata:Verifiability has been a proposed guideline since January 2015, and the latest discussion of it was in July 2015. I think it is very fair to say that the Wikidata community is "disinterested in Verifiability". In any case, the line about BLP and V is a "perceived" disadvantage of Wikidata: as has been stated numerous times, it doesn't have to be an undeniable truth to be included in that list, just a perception with some basis in reality, which this clearly is. I don't agree with all perceived benefits either (is it really easier to edit pages if parts of the content appear "miraculously" without being included in the wikitext in some way? Not for me, but apparently it is so for others), but I don't remove or whitewash them. Fram (talk) 09:57, 11 September 2017 (UTC)

You may hold the perception that "the Wikidata community is 'disinterested in Verifiability'", but it is far from fair - indeed it is ignorant of the facts - to assert that as a truth. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:38, 11 September 2017 (UTC)
Andy, I've been trying to follow this, but it's a confusing conversation. Can you provide a link to whatever policies apply to verifiability and BLPs on Wikidata? Mike Christie (talk - contribs - library) 11:45, 11 September 2017 (UTC)
Seconded. I provided links to support my statement, you just assert that it is "ignorant of the facts" without presenting these facts, which seriously weakens your point. Fram (talk) 11:56, 11 September 2017 (UTC)
Lest I be accused of ignoring this direct question: The issue under contention is "the Wikidata community is 'disinterested in Verifiability'". Please avoid using straw-men. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:12, 11 September 2017 (UTC)
"Lest I be accused of ignoring this direct question" and then you proceed to ignore his direct question. Bye, Pigsonthewing, this page will be better off without your disruption. Fram (talk) 12:16, 11 September 2017 (UTC)
I've never said the Wikidata community is not interested in verifiability, or that it doesn't care about protecting BLPs. I'm just trying to understand the degree to which those things are codified in its policies. Looking at this page of Wikidata policies I don't see anything there that would cover it. Andy, is it the case that the WD community doesn't feel it needs policies, for some reason, perhaps because in practice it believes the issue is taken care of in some other way? Or do you think some form of policy to cover BLPs and verifiability would be beneficial to WD? Mike Christie (talk - contribs - library) 12:20, 11 September 2017 (UTC)
  • Wikidata has one relevant policy: Wikidata:Use common sense. If that is being followed, everything should be fine. —Kusma (t·c) 12:30, 11 September 2017 (UTC)
    • "It is common sense" that I may add "the truth" about politician X on Wikidata, even if I don't and can't provide an enwiki-level reliable source for it. "It is common sense" that tabloid lies may be repeated on Wikidata, as we are not the first to promote this filth. "It is common sense" that we may knowingly provide links to copyright violations on Wikidata, as it is the responsibility of those sites to comply with copyright laws (which, as is common sense, are evil anyway). "It is common sense" that once the name of someone searched in relation with or accused of terrorism has been reported somewhere, we may repeat it, as we only compile information, not censor it. "It is common sense" that a lot of things which aren't allowed at enwiki at all are not a problem at Wikidata at all, but that this shouldn't stop us displaying these things on enwiki. Kusma, I hope you were being sarcastic with your comment, because otherwise it is quite worrying... Fram (talk) 12:39, 11 September 2017 (UTC)
    • By the way, this is agreed upon by others as well, see this. Fram (talk) 12:43, 11 September 2017 (UTC)
      • Common sense ain't common, and can't be legislated. Still, policy pages are only needed once there is disagreement on what the policy is. Ten years ago, our policy pages were mostly descriptions of actual practice. As most Wikidata editors are drawn from the multilingual Wikipedia family, I expect them to bring attitudes from their home projects to Wikidata. As for verifiability: aren't most descriptions taken from various Wikipedia articles that are verifiable using cited reliable sources? In other words, is all they lack in your opinion a direct link to the citations? Is there a way that Wikidata could possibly satisfy you or are you generally opposed to all wikis (you know, wikis are unreliable) other than this one? —Kusma (t·c) 15:46, 11 September 2017 (UTC)
        • I consider this one as unreliable as well, and have never accepted sourcing one article to another either. So please don't give me false dilemmas like "or are you generally opposed to all wikis (you know, wikis are unreliable) other than this one?" Enwiki is an unacceptable source for Wikidata, Wikidata is an unacceptable source for enwiki. However, the former is just my opinion about what Wikidata should do, but in the end is their problem. The latter though is our problem (and policy). So no, it doesn't really matter in the end if Wikidata provides sourced statements or unsourced statements, as it still remains an unreliable site which we shouldn't use in general. As a repository for our interwikilinks, fine, but as far as I am concerned not a single thing beyond that. Fram (talk) 15:58, 11 September 2017 (UTC)

With regard to Verifiability, Wikidata has this page d:Wikidata:Verifiability which is classified as an "informational page". It's only proposed as policy (although linked from d:Wikidata:List of policies and guidelines), but can anyone here see anything wrong with it? In the absence of what we consider firm policy, the "nutshell" summary reading "All statements on Wikidata must be supported by a source. Unsourced statements, and statements not supported by the source provided, will be removed." looks to me like the sort of WP:Verifiability that we would like to see enforced, doesn't it? Of course, until such time as the Wikidata community actually does enforce that, the onus falls on third-party re-users (such as ourselves) to ensure that we filter out challengeable unsourced statements. I believe I've done my best to present Wikipedia editors who choose to import Wikidata the tools to do just that. --RexxS (talk) 15:44, 11 September 2017 (UTC)

"All statements on Wikidata must be supported by a source. Unsourced statements, and statements not supported by the source provided, will be removed." What about reliable sources vs. unreliable ones. At the moment, Wikidata doesn't seem to care (e.g. the widespread importing of wikipedia-sourced items, and the use of things like Quora or Findagrave as identifiers)? Fram (talk) 15:58, 11 September 2017 (UTC)
My response was about the text you gave, I see the full page at least tries to have some guidance on what is reliable and what isn't. It doesn't seem to match actual practice at Wikidata though, so I wonder if it ever will be accepted as a policy. Fram (talk) 16:03, 11 September 2017 (UTC)
  • I think this thread needs a reboot... what is important is that 1) Wikidata currently has no BLP or Verifiability policies... 2) it will never be accepted at WP:en until it does ... 3) if wikidata ever does create BLP and V policies, they will have to be very close to what WP:en has ... if not, WP:en will continue to reject Wikidata... it is that simple. Blueboar (talk) 11:15, 15 September 2017 (UTC)

A light-hearted attempt to explain the current view of the broader WP:en community

Just to put a smile on everyone's face... the current view of the WP:en community towards links to Wikidata can be summarized this way:

We do not want them on our phones
We do not want them in our homes
We do not want them in descriptions
(please don't go into conniptions)
We do not want them used in boxen
(We really think they are a toxin)
We do not want them used in cites
(not even broken into little bytes)
We do not want them on our screens
We do not want them "behind the scenes"
We do not want them here or there
We do not want them anywhere
We do not like Wikidata links
We think that Wikidata stinks

Now... this dislike of Wikidata may change at some point in the future... but the good folks at Wikidata have to address (what we see as) some serious flaws before that will happen. Blueboar (talk) 13:54, 16 September 2017 (UTC)

To be honest, this looks more like your personal (and a pretty biased one) view than the view of the WP:en community.--Ymblanter (talk) 14:23, 16 September 2017 (UTC)
Given the comments at numerous RFCs... I think my poem sums things up accurately. Blueboar (talk) 14:30, 16 September 2017 (UTC)
As the original ends with the narrator trying and then liking green data and ham, maybe there is hope for Wikidata here. —Kusma (t·c) 16:22, 16 September 2017 (UTC)
Yup there is hope. It is a good concept. Doc James (talk · contribs · email) 18:16, 16 September 2017 (UTC)
Of course there is hope... just resolve the various issues raised at the numerous RFCs (OK... not easy... or even simple... but there is hope.) Blueboar (talk) 18:52, 16 September 2017 (UTC)
  • i appreciate the dr seussian effort! it is a little overstated, but only a little. Just as an example, I think template:infobox gene that WP:MCB has created in collaboration with Wikidata is great -- it does valuable stuff like naming genes/proteins and providing links to the zillions of databases out there. That is kind of the ideal en-WP/Wikidata collaboration in my view. But it was very much a welcome collaboration by the relevant people within en-WP, and the collaborators accepted pushback (not happily, but accepted it), from outside their collaborative bubble. In that case, including explicit health content, and content about what chemicals "interact" with genes/proteins (which implicitly is health content) was not OK and needed MEDRS sourcing, and they agreed to take that stuff out.
So in my view Wikidata has some places in en-WP, but it is very, very case-by-case.
Folks at Wikidata, and at WMF, can be way too aggressive in trying to push it in to Wikipedia. Jytdog (talk) 22:52, 16 September 2017 (UTC)

I doubt the other side sees it as very lighthearted or smile-worthy. Anywho, you missed lists. They were really obnoxious. The bot that maintained the lists would blindly overwrite any edits to the page. {{Wikidata_list}} now has NOT IN MAINSPACE slapped on it. Alsee (talk) 00:07, 17 September 2017 (UTC)

That is all true. thx Jytdog (talk) 00:35, 17 September 2017 (UTC)
We do not want them making lists
(That one made us shake our fists)
Blueboar (talk) 00:44, 17 September 2017 (UTC)

An RFC needs to be run

An RFC should be run now, or very soon. The WMF is planning to start work on more tightly integrating Wikidata into Wikipedia. According to the WMF, recent discussions have been supportive of doing so. However I am unaware of any such discussions, and in my experience the WMF has a history of severely misjudging the consensus of the community in the absence of an RFC.

Could someone ping me if there is progress on this? Unless there are strong arguments to the contrary, I would support continuing to use Wikidata for interlanguage links. Alsee (talk) 01:50, 8 September 2017 (UTC)

User:Alsee where have you seen those WMF discussions happening? I would like to check them out. thx Jytdog (talk) 02:07, 8 September 2017 (UTC)
Jytdog: Ouch, I just closed a ton of browser tabs and now I'm having trouble finding the pages again. At the moment the only link I have available is a Wikimania 2017 image of a "prototype for editing Wikidata's data from Wikipedia".
On a related note, ping Fram. I just came across Wikidata:Project_chat#Wikidata_description_editing_in_the_Wikipedia_Android_app, which led me to the discussion you participated in at wp:VPT#Wikidata_description_editing_in_the_Wikipedia_Android_app. I was previously aware of the WMF's statement "based on the raised concerns, we have decided to turn the wikidata descriptions feature off for enwiki for the time being."[1] Before I start an RFC to turn off wikidata descriptions for enwiki, I wanted to double check with you whether there were any developments after your WP:VPT discussions.
And I came across these issues while I was trying to catch up on some old stuff and clear my plate. I needed to clear my plate to open a RFC#2 proposing rollback of NewWikitextEditor from beta features... because the WMF is ignoring RFC#1 that reached consensus against it. Oh, and I was also clearing my plate to open a Commons RFC to uninstall Flow. I've also got a German Editor lined up to run a DeWiki RFC to uninstall Flow. Because the WMF ran a massively canvassed survey on Flow which managed to votestack a whopping 38% support for Flow. And based on that fictional 38% support they are sinking more money into "upgrading" Flow and pursuing further deployment. And eventually I was hoping to get back to the needed multi-wiki RFC's to deal with the WMF's stealth deployment of VE as the default editor. (It was rolled back from Enwiki when one of us wrote a sitewide javascript hack that would override it, but the WMF is still doing a stealth deployment to the rest of the planet.)
Holy friking crap. This is getting stupid. Alsee (talk) 04:43, 8 September 2017 (UTC)
P.S. I found a link explicitly saying they are building the Wikidata integration for Wikipedia: Wikidata:Client_editing_prototype. I still can't find the original discussions I saw, where they said we somehow supported it. I think they only talk to the Wikidata community about this stuff, and obviously that community wants their stuff deployed here. Alsee (talk) 05:17, 8 September 2017 (UTC)
We definitely need an RfC to make it clear that the Wikidata descriptions shouldn't be shown on enwiki anywhere, through any tool they can control (be it apps, search, popups, ...). They have wormed their way out of the previous RfC by saying one thing and doing another, so this time it should be absolutely clear. No English language text should be retrieved from Wikidata, all text needs to be on enwiki. Fram (talk) 06:52, 8 September 2017 (UTC)
As far as I can see, the Wikimania thing and the application at Wikidata are both coming from Wikidata, not from WMF. Yes the Wikidata people are very interested in pressing their stuff into en-WP, and getting en-WP editors to improve Wikidata.
My sense is that the en-WP community is very aware of the differences in policy regimes between the two projects, and until Wikidata is much more mature with regard to controlling bots and sourcing - heck - until it gets a BLP policy - there is not going to be consensus here to integrate.
I just had a discussion with somebody about Wikidata integration the other day at Template_talk:Infobox_company#Idea_for_a_change_to_the_explanation_about_data_being_automatically_included_from_Wikidata. A big "no" and my sense is that most everybody here is skeptical.
We should be watchful for forays, but there is no way to stop folks there from advocating, creating tools, etc. over there in Wikidata, and no point in trying. It is a separate project with its own governance. Jytdog (talk) 08:52, 8 September 2017 (UTC)
If there is an RfC, I will strongly oppose the formulation "no text can be imported from Wikidata without being stored locally". I do think Wikidata has problems with sourcing and vandalism, but, vandalism aside, I said repeatedly on several occasions (and usually people prefer not to hear) that there are many data, some of them text data, which should not be sourced at all (like Commons category) or can be sourced to itself (like official website). Additionally, if the vandalism problem has been solved, I see no reason to not accept data which is properly sourced on Wikidata.--Ymblanter (talk) 14:13, 8 September 2017 (UTC)
Commons category, website, ... are links. Even so, as I have shown in earlier discussions, often the local data is better than the Wikidata one (I have no idea where the mistaken belief comes from that Wikidata is interlanguage, ergo Wikidata is better). Outsourcing something like the website to Wikidata may seem convenient, but is at best lazy and at worst wrong or less useful (e.g. the many websites which have different landing pages for different languages). Using Wikidata for your source is good if you have few editors on your wikipedia language, but many articles (say, Cebuano or Volapuk). Otherwise, not so. But anyway, what I meant with text was sentences, or sentence fragments, written in a specific language (a website url is not in itself written in a specific language), which is no use on other language wikipedias. Every piece of information which is a priori only useful for one or two wikipedias (even if the subject exists on 20 wikipedias) doesn't belong on Wikidata. We can and should have a separate discussion for other types of information, but this is specifically for descriptions (and labels, nicknames, ... if these are used anywhere). Fram (talk) 14:29, 8 September 2017 (UTC)
I can not think of any information which is acceptable on Wikidata but is only useful for one language (except obviously for Wikipedia labels/titles). Possibly dates are language-specific, but they are stored there in a universal form, and I believe can be easily imported to any language (not that I advocate this, since dates are often unsourced on Wikidata - I would only consider supporting import of well-sourced dates).--Ymblanter (talk) 14:40, 8 September 2017 (UTC)
Ymblanter, your suggestion to only import "well-sourced" information is essentially impossible unless an exceptionally-knowledgeable person is looking at the entry and copying it here individually. However we're venturing into details best discussed in the RFC itself. Alsee (talk) 18:21, 8 September 2017 (UTC)
This is incorrect. Every property of every item in Wikidata has the field "sources". If the field is empty or says "Imported from xxx Wikipedia", the sourcing is not reliable (with some exceptions, which we do not need to discuss now). This can be done automatically.--Ymblanter (talk) 18:45, 8 September 2017 (UTC)
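For readers who want to picture the automatic check described in the comment above, here is a minimal sketch in Python. It assumes statements shaped roughly like Wikidata's JSON export (a "references" list whose entries carry a "snaks" mapping keyed by property ID) and treats P143 ("imported from Wikimedia project") as the marker for "Imported from xxx Wikipedia". The helper names and the sample statements are hypothetical. The check only detects that a non-Wikipedia reference is present; it says nothing about whether that source is reliable.

```python
# Minimal sketch: drop statements that have no reference, or whose
# references only say "imported from" a Wikimedia project.
# P143 is the Wikidata property "imported from Wikimedia project";
# the function names and sample data below are hypothetical.
IMPORTED_FROM = "P143"


def has_usable_reference(statement):
    """True if at least one reference is more than an 'imported from' note."""
    for ref in statement.get("references", []):
        props = set(ref.get("snaks", {}))
        if props and IMPORTED_FROM not in props:
            return True
    return False


def filter_statements(statements):
    """Keep only statements that pass the presence check above."""
    return [s for s in statements if has_usable_reference(s)]


# Hypothetical statements, simplified for illustration.
statements = [
    {"id": "s1", "references": []},                                 # unsourced
    {"id": "s2", "references": [{"snaks": {"P143": ["enwiki"]}}]},  # Wikipedia import
    {"id": "s3", "references": [{"snaks": {"P248": ["census"]}}]},  # "stated in"
]

kept = [s["id"] for s in filter_statements(statements)]
print(kept)
```

A filter of this kind can only act on what the reference field contains, so any judgment about the quality of the cited source would still have to happen elsewhere.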
No, Alsee is correct. It is necessary for a person to check that the sourcing actually meets the reliability standards of en.wikipedia. The existence of a nonempty string in that field is far from the same as the reliability of the source it describes. —David Eppstein (talk) 18:50, 8 September 2017 (UTC)
It does not take more than it takes on Wikipedia, and one can easily set filters which would only show "good" sources. Actually, there is a lot of data which is reliably sourced to databases (like population etc), and I do not see any problem with this data. It still does not solve the vandalism problem (what if someone changes population of New York without changing the source), but in my experience Wikipedia suffers much more from this type of vandalism, which can stay here for years.--Ymblanter (talk) 19:37, 8 September 2017 (UTC)
I think this is exactly the sort of thing that needs to be discussed in any RfC that comes up; what counts as "does not take more than it takes on Wikipedia" varies a good deal by editor, since different editors have different comfort levels with Wikidata, but most share the feeling that data that can affect an article they have worked on is data for which they want to understand the risks. Mike Christie (talk - contribs - library) 19:50, 8 September 2017 (UTC)
That is unfortunately untrue. It would be very difficult to construct a filter to not show any possible unreliable source, or all possible reliable ones. Not to mention that some sources are reliable in some cases and not in others (primary, self-published, etc). Nikkimaria (talk) 20:17, 8 September 2017 (UTC)
The biggest problem is the differences in policy. Wikidata has no verify policy and no BLP policy, and has rejected having them. And people can run bots to change many fields with very little oversight. And the SEO problem. All of those are issues internal to Wikidata that make it generally incompatible with en-WP. And we cannot make those changes over there; they will make them if and when they are ready to.
We cannot get into a situation where there is data in Wikidata that violates en-WP's BLP policy and is getting imported here, and we have no basis for removing it there or stopping whoever is adding it there from doing so. Jytdog (talk) 20:33, 8 September 2017 (UTC)
This is a good example of what annoys me most with this type of discussion: it's characterised as "us" vs "them", with the assumption that this means a fight. This is not the case. Both Wikipedia and Wikidata are Wikimedia projects - you have the same username for all of them, you can edit there the same as you can edit here, it's just "us together" figuring out the best way to do things. If you want policies that are aligned between the two, then please work towards making that happen. Thanks. Mike Peel (talk) 20:48, 8 September 2017 (UTC)
My understanding is that we have already requested policies similar to WP:RS and WP:BLP for Wikidata, to make their policies more compatible with ours, and that those proposals were explicitly rejected. Given that the editors on Wikidata do not want to make their data compatible with use on en.Wikipedia, it seems completely reasonable to me both to avoid using it and to speak of those editors with their different policies as being a separate population from the editors here. —David Eppstein (talk) 21:18, 8 September 2017 (UTC)
citation needed, please. I have not seen those policy discussions. Thanks. Mike Peel (talk) 21:22, 8 September 2017 (UTC)
See Wikidata:Wikidata:Requests for comment/Verifiability and living persons where these concerns were directly raised and dismissed with the consensus being "no changes to current practice". —David Eppstein (talk) 21:31, 8 September 2017 (UTC)
Thank you. I need to study this link, but from a first look it seems that @Pigsonthewing: turned the tide in the voting here, so he may want to comment on this discussion. I'm also not sure that the proposed policies actually match enwp's policies - they look like they were a lot stricter. Thanks. Mike Peel (talk) 21:39, 8 September 2017 (UTC)
Thank you. I think my comments there were clear and unambiguous. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:32, 9 September 2017 (UTC)
"The biggest problem is the differences in policy. Wikidata has ... no BLP policy, and has rejected having [one]" that is absolutely false - what was rejected was a specific draft, which was unfit for purpose. The biggest problem with Wikidata appears to be the FUD that people spread about it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:23, 9 September 2017 (UTC)
And I have just removed an unsubstantiated claim to the contrary from the page. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:13, 10 September 2017 (UTC)
"many websites which have different landing pages for different languages" Which Wikidata already caters for, perfectly adequately. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:23, 9 September 2017 (UTC)
  • There are some exceptions. Template:Infobox gene is pure import from Wikidata. I think that that one is fine; i did some work with them to exclude any content about health so it is just data that really only biochemistry people care about. If anybody wants to see all the infoboxes that use Wikidata in one way or another, see Category:Infobox templates using Wikidata. There are more than i thought! Bears some looking at, for forays. Jytdog (talk) 19:59, 8 September 2017 (UTC)
It looks like, similar to the previous zillion times, people do not listen to what I say and misinterpret what I say. I never said it is easy or possible to construct a filter which would only import to Wikipedia all data from Wikidata which are reliably sourced by English Wikipedia standards. What I say is that there are large arrays of data (population, heritage monument numbers, sports statistics, to name some) which on Wikidata are either not sourced, or sourced reliably by any Wikipedia standards. This data can be imported without any problem. One can even make a white list of acceptable sources and not import anything which is not on the list. Furthermore, there is data which do not need to be sourced. One example is Commons category, and it is already imported from Wikidata, but there are more data of this type. My second point is that I generally support holding an RfC, but if the RfC is prepared as black and white as many users here seem to suggest (for example, no import from Wikidata), it is doomed. If we want to have a workable solution, we need to go beyond black and white and at the level of preparation of RfC define what is acceptable and what is not. We obviously can not solve all the problems immediately, but there are many things which would work just fine if users are willing to differentiate them from the others.--Ymblanter (talk) 21:56, 8 September 2017 (UTC)
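The white-list idea in the comment above could look something like the following minimal Python sketch. It assumes the same simplified statement shape as Wikidata's JSON export and uses P248 ("stated in") as the property that names the source. The entries in the allow-list are hypothetical placeholder IDs, not real Wikidata items, and which sources belong on such a list would be for the community to decide.

```python
# Minimal sketch of a source white list: a statement is importable only
# if one of its references is "stated in" (P248) an approved source.
STATED_IN = "P248"  # Wikidata property "stated in"

# Hypothetical placeholder IDs, not real Wikidata items.
ALLOWED_SOURCES = {"Q_CENSUS_DB", "Q_HERITAGE_REGISTER"}


def cited_sources(statement):
    """Collect every 'stated in' value across a statement's references."""
    found = set()
    for ref in statement.get("references", []):
        found.update(ref.get("snaks", {}).get(STATED_IN, []))
    return found


def importable(statement, allowed=ALLOWED_SOURCES):
    """True only if at least one cited source is on the white list."""
    return bool(cited_sources(statement) & allowed)


# Hypothetical statements, simplified for illustration.
from_census = {"value": 1000, "references": [{"snaks": {"P248": ["Q_CENSUS_DB"]}}]}
from_blog = {"value": 2000, "references": [{"snaks": {"P248": ["Q_SOME_BLOG"]}}]}
unsourced = {"value": 3000}

results = [importable(s) for s in (from_census, from_blog, unsourced)]
print(results)
```

Anything not explicitly approved is excluded by default, which is the conservative direction for a list like this; it does not address the case where a value is changed while the reference is left in place.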
What is black and white is that there is extremely unlikely to be any consensus for free importing of Wikidata. It needs to be case by case, and with care. Jytdog (talk) 22:54, 8 September 2017 (UTC)
To be honest, I do not see here a single person advocating free import from Wikidata, but I see many users proposing to fully block the import.--Ymblanter (talk) 08:07, 9 September 2017 (UTC)

Crap. I didn't want to get into a discussion of wikidata sourcing here, this page is for discussing creating an RFC not for debating the issue itself. Unfortunately by trying to avoid that off-topic discussion, I inadvertently spawned the mess above. What I was trying to avoid getting into, was that the templates that supposedly avoid using unsourced/wikipedia-sourced data items DO NOT WORK. The Wikidata community has essentially zero concern for sourcing, other than to consider a source field as one more generic datapoint to blindly suck up. Some of the bot-generated source info are virtually unreadable obfuscated links to wikipedia. Also wikidata engages in WP:circular massive bot-import and export to one or more external databases forming a loop. Unsourced or wikipedia sourced info can then go through a purely bot-edited sequence adding a fictional source. I have a number of other concerns with wikidata. But NONE OF THIS DISCUSSION BELONGS HERE. This page is for organizing the RFC. Then we can discuss all of these topics inside the RFC. Alsee (talk) 10:53, 9 September 2017 (UTC)

Fine. By refusing to discuss issues during the RfC preparation, you secured my oppose vote at RfC. Great example of Wikipedia collaboration.--Ymblanter (talk) 11:48, 9 September 2017 (UTC)

Break

@Alsee: are you considering an RfC to turn off Wikidata descriptions for en-wp, or a broader RfC about the general uses of Wikidata on en-wp? Mike Christie (talk - contribs - library) 13:38, 9 September 2017 (UTC)

Mike Christie, I hadn't considered the (valid) possibility of covering it all in one RFC. I ran into the two issues independently. There was an RFC about wikidata descriptions which was heading to a clear consensus not to use them, and which was withdrawn when a staff member said we have decided to turn the wikidata descriptions feature off for enwiki for the time being. However they only partially did so. It would be quick and easy to toss up an RFC on that. Separately I learned that the WMF is planning to build an entire system to directly integrate wikidata into wikipedia editing. Based on that, I believe the broad RFC here should move forward reasonably soon. I haven't been closely following the work here. We should sort out what we want to do with wikidata here before the WMF builds something random. Alsee (talk) 05:18, 10 September 2017 (UTC)

@Alsee: are you considering an RFC which would fix multiple complexities introduced by wikidata? the nice thing about wikidata is that it stores structured data, to be easily reused, also in multiple wikipedia projects. there might be a mental flaw in the process currently how the data gets into wikidata: people think the only way is that one would edit wikidata, and wikipedia would consume the data. but - interestingly - to consume one needs to store a structure like a template. if one stores a structure to be auto-populated, this structure can of course be auto-read as well. so - it would be straightforward to read the data from wikipedia into wikidata as well. to give an example: the data for a city would be read from its home wiki, de:wp for berlin, en:wp for new york. and reused in other languages. which would have multiple nice properties. first, the data is checked with wikipedia quality. second, the editing software does not know anything about wikidata and is therefore less complex. --ThurnerRupert (talk) 13:50, 16 September 2017 (UTC)

ThurnerRupert, you don't need a template to 'consume' wikidata. As a game, I made an edit that was nothing but a series of Q-number calls to wikidata. The edit was completely unreadable. However it displayed on the page as a paragraph of text, complete with a fabricated name/date signature. There was no signature in the wikitext itself, just Q-number calls. You could create an entire article that is nothing but an unreadable series of Q-number calls to wikidata.
Regarding wikidata consuming from wikipedia, they already have bots sucking up everything they can from wikipedias. Subtle tweaks there won't make any difference. The concern is displaying wikidata on wikipedia, as well as the need to edit a wikidata form in order to alter it. (Having a wikidata form pop up "on wikipedia" is not materially different than having to go to wikidata to edit it.)
Ever since wikidata was introduced, no one has been happy. Wikidata enthusiasts have been trying to use it in various ways, and in many cases wikidata-critics get consensus to revert or delete that work. No one is happy with the situation. The battling back and forth is just wasting people's work on both sides, and pissing off both sides. We need an RFC to sort out how we do or don't want to use wikidata, so the two sides can stop the continual roll-forwards/roll-back. Alsee (talk) 23:04, 16 September 2017 (UTC)
The editorial community should insist on transparency and consultation. The end-point of Wikidata is to minimise the role of human writers and editors, by maximising the machine-readability of WP articles. This has some sinister consequences. Tony (talk) 04:20, 17 September 2017 (UTC)
There are a lot of unfortunate outcomes possible from this discussion unless we insist on, and get, transparency and consultation, as Tony says. Please ping me if there's an RfC. - Dank (push to talk) 15:21, 17 September 2017 (UTC)
  • I agree about the importance of holding an RfC very soon. Is anyone willing to organize that? Alsee and Fram are both in a position to come up with some clearly worded questions. SarahSV (talk) 15:31, 17 September 2017 (UTC) Repinging Alsee because of typo. SarahSV (talk) 15:32, 17 September 2017 (UTC)
    They have both demonstrated that they advocate full rejection of Wikidata (I guess possibly even rejecting interwiki storage). An RfC offering only this option is unlikely to be productive.--Ymblanter (talk) 15:52, 17 September 2017 (UTC)
And yet that is a vital question to ask... perhaps we should ask something along the lines of: "What would you like from Wikidata?", with the understanding that the answer to that question might well be... "nothing". And if that is the answer, it is kind of pointless to ask further questions. Blueboar (talk) 16:38, 17 September 2017 (UTC)
As far as I am aware, there have been exactly zero objections to using wikidata for interlanguage links between matching articles. Beyond that, I will certainly accept any clearly defined use cases that the community finds valuable. Any RFC certainly needs to sort through viable possibilities in an effective manner. However wikidata-enthusiasts have been provoking (and losing) too many conflicts, and local-community-values at wikidata have been.... let's say they have been incompatible. If the only way to end this eternal disruptive battle back&forth is the French Wiki solution of dropping wikidata from articles completely, then so be it. My estimate of consensus is that the community IS prepared to go against wikidata, but probably not quite prepared to ban it. That would leave a thorny question of finding a meaningful bright-line result between "here" and "zero". I haven't pinned down where consensus would be. The WMF's plans to more deeply integrate wikidata have escalated this issue to the point that we MUST sort out this issue before deployment.
Regarding running a broad RFC, I was really hoping that the people who built this page were working on it. I am up to my eyeballs dealing with other issues. Pardon the rant but: The WMF has reversed position and is no longer willing to remove Flow from communities that don't want it. They have also begun a global stealth deployment of VE as the default editor for new users. They also want to get rid of the wikitext editor, expecting us to use a new "wikitext mode" inside of VE, and they are ignoring established consensus against it. I also really need to find time to do normal wiki work. Oh... but I can add a small positive note. The WMF thinks Visual-Everything is The Future, and they have been ignoring/abandoning work on unimportant legacy wikitext tools like DIFF. However YEAY for the Wikimedia Germany affiliate picking up the diff project. We may finally get readable diffs when a paragraph is moved! The prototype indicates moved paragraphs, and it shows what changes (if any) were made during the move. Alsee (talk) 17:59, 17 September 2017 (UTC)

Wikipedia descriptions vs Wikidata descriptions

An experiment with article descriptions. My hypothesis: that we could get better descriptions by taking the first sentence of article text (up to the first full stop) and stripping any parenthetical clauses than we could by looking at Wikidata. After the first few examples, I modified the hypothesis to also include stripping any text before the words "is"/"was" (and the following article if any), if it appears in the first sentence. Test set: articles generated by the "random article" button, without any filtering from me.
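For readers who want to try the same stripping rules, here is a minimal sketch of the procedure described above. It is an assumption-laden illustration, not the method actually used to build the table: the function name `lead_description` and the sample sentences are made up for the example, and parentheticals are removed before the sentence is cut so that a full stop inside brackets does not end it early.

```python
import re


def lead_description(lead_text: str) -> str:
    """Derive a short description from the lead text of an article (sketch)."""
    # Strip parenthetical clauses (non-nested), with any whitespace before them.
    text = re.sub(r"\s*\([^()]*\)", "", lead_text)

    # Keep everything up to the first full stop followed by whitespace or end.
    stop = re.search(r"\.(?:\s|$)", text)
    sentence = text[:stop.start()] if stop else text

    # Strip any text before "is"/"was", plus the following article if present.
    verb = re.search(r"\b(?:is|was)\s+(?:(?:a|an|the)\s+)?", sentence)
    if verb:
        sentence = sentence[verb.end():]

    return sentence.strip()


# Illustrative inputs (invented sentences, not quotations from the articles).
example_1 = lead_description(
    "Foo Bar (born 1970) was an American photographer. Second sentence here."
)
example_2 = lead_description(
    "Isosauris is a genus of moth in the family Geometridae."
)
# Known weakness of the naive rule: an initial ends the "sentence" early.
example_3 = lead_description("Lunsford L. Lewis was an American attorney.")
```

The third example shows why purely syntactic stripping is fragile: initials and abbreviations such as "L." or "U.S." contain a full stop followed by a space, so the naive rule truncates the text there.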

Article | Wikipedia lead | Wikidata description
Sara Lampe | Democratic Party member of the Missouri House of Representatives, representing District 138 since 2004 | American politician
Anogeissus leiocarpa | tall deciduous tree native to savannas of tropical Africa | species of plant
Black Lightning (2009 film) | 2009 Russian action superhero film directed by Alexandr Voitinsky and Dmitry Kiseliov, and produced by Timur Bekmambetov | 2009 film by Aleksandr Voytinskiy, Dmitriy Kiselev
Frank Lammers | Dutch television and film actor | Dutch actor
Katzie First Nation | band government of the Katzie people of the Lower Fraser Valley region of British Columbia, Canada | (none)
Military art | characterized by its subject matter rather than by any specific style or material used | works of art on military themes
Lunsford L. Lewis | American attorney and politician | American politician
Conopomorpha chionosema | moth of the Gracillariidae family | species of insect
Victor Scantlebury | Acting bishop of the Episcopal Diocese of Central Ecuador | (none)
Oļegs Aleksejenko | former Latvia international football midfielder | footballer
Massilia kyonggiensis | Gram-negative and rod-shaped bacterium from the genus Massilia which has been isolated from the surface of a soil sample from a forest in Suwon in Korea | species of prokaryote
Run for Your Wife (1965 film) | 1965 Italian comedy film directed by Gian Luigi Polidoro | 1965 film by Gian Luigi Polidoro
Oliver F. Atkins | American photographer who worked for the Saturday Evening Post and as personal photographer to President Richard Nixon | American photographer
Jones Motor Company | historic U.S. Route 66-era building located on Central Avenue in the Nob Hill neighborhood of Albuquerque, New Mexico | (none)
Isosauris | genus of moth in the family Geometridae | genus of insects
Geoff Valli | former New Zealand rugby union player | New Zealand rugby union player
Albion, Oklahoma | town in Pushmataha County, Oklahoma, United States | town in Pushmataha County, Oklahoma
Chrysanthos of Madytos | Greek poet, chanter, Archimandrite, and Archbishop, born in Madytos | Greek archimandrite, chanter and teacher of music
Cape Wilson (Ross Dependency) | bold, rocky, snow-covered cape, forming the south-east end of the Nash Range and marking the northern entrance point to Shackleton Inlet on the western edge of the Ross Ice Shelf | headland in Antarctica
German submarine U-1019 | Type VIIC/41 U-boat of Nazi Germany's Kriegsmarine during World War II | (none)
Every Little Whisper | song co-written and recorded by American country artist Steve Wariner | (none)
Lepa Lukić | Serbian folk singer with a career spanning more than five decades | singer
Bogenbay Batyr | famous Kazakh warrior from the 18th century | Kazakh warrior
Danish Committee for Aid to Afghan Refugees | non-political, non-governmental, non-profit humanitarian/development organization working to improve the lives of the Afghan people since 1984 | organization

Conclusions: It's hard to say which is better. The Wikipedia text is more reliably present and usually significantly more detailed, while the Wikidata text can be vague to the point of uselessness ("organization"). But I think that the simple syntax-based stripping that I initially hypothesized isn't good enough to get uniformly concise descriptions. Some sort of natural language processing method that recognizes dependent clauses and strips them when the description is too long might work, but at some point any sufficiently advanced hypothetical system becomes indistinguishable from the work of human editors. (E.g. for Jones Motor Company, it would be nice to strip down the Wikipedia text to "building in Albuquerque, New Mexico" but that seems too advanced to expect software to handle automatically.)

Nevertheless, I think that the Wikipedia text is reasonably usable as-is, and avoids all the political issues with importing Wikidata text of unknown provenance and incompatible sourcing policies. If we posit that the WMF has a real problem (how to describe article links in mobile views) and is merely taking the wrong solution to it, this could provide an alternative solution that is more palatable. —David Eppstein (talk) 19:39, 14 September 2017 (UTC)

If we use the lead sentence, we already have a pre-built solution. The relatively recent hovercards/article_preview_popup project has already done the work reviewing various cases of what should be stripped or included. To a rough approximation, it trims out things in parentheses. Alsee (talk) 21:32, 14 September 2017 (UTC)
@David Eppstein: your table is exactly what the solution should look like. Any user should be able to bring up (in a tool, on Wikipedia, whatever) their watchlist with the lead sentence and the (Wikidata description/to-be-created Wikipedia short description to be used in searches, VisualEditor, etc.). And the right hand column should have little edit pencils to fix the description entries. That, plus watchlist updates on visible Wikidata changes (ideally, with a more user-friendly way to fix them) would be the right package. (The watchlist updates is this item on Phabricator).
(Conversely, the Wikidata interface—which needs work, imho—should get a tool that brings up opening paragraphs of linked Wikipedia pages to aid in data entry over there.)
Having short descriptions is good for Wikipedia and good for Wikidata. We're talking about WP search functionality and ease of editing/wikilinking, especially on VisualEditor. Having the Wikidata descriptions automatically defer to those created by Wikipedians is probably good for Wikidata in all or almost all cases.
I support fixing this. I support solutions in which Wikipedia and Wikipedians can improve a still-new-ish project at Wikidata which needs this information. It's neighborly and produces a real synergy for us. However, I oppose "Get off my lawn!"-style solutions that don't enable the functionality that a short description field provides, or pointlessly wall us off. (Acknowledging that more openness can only come when Wikidata has more policy maturity around V, BLP.)--Carwil (talk) 17:25, 15 September 2017 (UTC)
Is there any sort of standard or MoS for these short descriptions? My opinion is that they should be created and maintained on Wikipedia, for Wikipedia and by Wikipedians. If the physical location of the stored material is somewhere else, it makes no difference to me provided I can edit and watch it from Wikipedia via my usual browser on my desktop or laptop. I do not use a mobile for internet as I have difficulty reading it, and this is unlikely to change for the better. Wikidata can house it if they like, no worries. · · · Peter (Southwood) (talk): 12:58, 18 September 2017 (UTC)
See also: wikidata:Help:Description. They function mostly as disambiguators. So, short but just descriptive enough that you can tell two things with the same name easily apart. If more than 6-10 words, it's likely too long. In terms of length, you would normally have, Title, wikidata description, hovercard aka page preview, wikipedia lead, wikipedia article. —TheDJ (talkcontribs) 14:27, 18 September 2017 (UTC)

WMF Annual plan 2017/18: Program 10 Wikidata

 
Employee chain sawing to control exotic invasive melaleuca

Reviewing this link provided above, one finds:

  • Outcome 2: Continue to increase the reach of Wikidata into the Wikimedia projects. We aim to significantly increase the instances of data uses from Wikidata by Wikimedia projects. We will achieve this by removing barriers for editors on Wikipedia and enabling new kinds of data to be collected within Wikidata and Wikimedia Commons — thereby making Wikidata more useful to the other projects.

-- Jytdog (talk) 15:49, 18 September 2017 (UTC)

I have no objection to the goal of increasing use of Wikidata, if it's done by "removing barriers for editors on Wikipedia". Much of the discussion above is about barriers. Not all the barriers (control of vandalism, implementation of BLP and V policies) are under the WMF's control, but the software interfaces are. With the right interface I could see Wikidata being very useful, particularly for stable well-sourced data that comes from external reliable databases. It's the definition of "right interface" that's difficult. Mike Christie (talk - contribs - library) 16:34, 18 September 2017 (UTC)
To clarify, this is not an aim of the Foundation per se, but of the Wikidata team at the German Wikimedia chapter, as noted on top of the linked section ("Team: Wikidata (WMDE)" - I agree it's a bit confusing because the overall page is called "Wikimedia Foundation Annual Plan", IIRC the aim was to provide an integrated overview over technical work).
As explained in Jon Katz' response above where that annual plan page was linked, "Currently, experiments and plans for further Wikidata integration into Wikipedia projects exist, mostly driven by the Wikidata community and Wikimedia Deutschland, not the Foundation.... the Wikimedia Foundation has no concrete plans for integrating other data [besides the descriptions] from Wikidata with Wikipedia". Regards, Tbayer (WMF) (talk) 16:50, 18 September 2017 (UTC)
The page is called "Wikimedia Foundation Annual Plan/2017-2018/Final/Programs/Product". This is stated as an aim of WMF. Jytdog (talk) 16:55, 18 September 2017 (UTC)
I want to understand the roots of this, so we can pull it up by the roots. As I noted above, it is bizarre that a WMF product manager came here advocating for integration of Wikidata. I want to spend my volunteer time working on content, not cutting down instances of invasive species. The WMF doing this, is not like a weed though. It isn't organic; the WMF is clearly committed to pushing Wikidata into en-WP. That is actually manageable. (Individuals doing stuff is an entirely different matter) Jytdog (talk) 16:56, 18 September 2017 (UTC)
Hey Jytdog, I'm back again to try and provide an alternative perspective. But first, let me be sure I understand. Wikidata is the invasive species. The foundation introduced it by adding the descriptions*. Not the first case of species being present outside of its native environment†, but the one that is most egregious (in your view).
The Wikimedia Foundation as a whole (not just a specific person or team) should have known it was an invasive species and not introduced it.
The foundation didn't, and it's grown "everywhere".
Now this is where I think some of the thinking diverges. I personally would say that it's impossible to remove in its entirety. However, we can build controls, learn from our mistake (to avoid other invasive species being introduced), and move forward on the technical, policy, and quality concerns.
I think you have a different opinion. What is your suggestion on what to do next?
As for the root of how this happens? Humans were involved. Different folks at the foundation at different points in time thought, "Hey, short descriptions would be handy here. Let's use Wikidata, it's part of the movement and seems innocuous" Then someone else on a different team said, "Yeah, particularly on mobile where screen space is limited". And so on. Next thing you know, they're everywhere. That's it. The foundation is very bottom-up in many capacities. We're more tightly associated than individual editors, true. However there is much agency in developing our annual plans and the quarterly goals to enact those plans. Not to say we don't have a unified strategy or the traits of a hierarchical organization, just that the bottom-up approach is moored in our history and persists to today. A very different experience than any of my previous employment (higher education, healthcare)!
Which means, no secret plans. It's really hard to have them when there are more folks working on developing them than just a few higher-ups. In fact, including it in our most strategic and important plan - the Wikimedia Foundation Annual Plan - that goes to FDC for review(!) shows that we are trying to be transparent and not sneak anything past anyone at any level.
Jon was telling the truth that there are no immediate plans for the foundation to do this work. Most of it is from Wikidata folks, with the exception of the Structured Data on Commons project. I can only find reference to Wikidata in the coming quarter's plans around that project.
I don't know if that helps explain the roots, but it's the trunk of the situation. I look forward to your response.
*And prior to the descriptions it was introduced in mobile search results, and prior to that, the visual editor link inspector, and prior to that approving Wikidata as a good idea back in 2012.
† Some would argue that Wikidata, like its cousin Vicimedia Commonus, is an example of natural long distance dispersal.
CKoerner (WMF) (talk) 19:17, 18 September 2017 (UTC)
Why view it as an invasive species? I'd argue that it's something that has grown internally in the Wikimedia ecosystem (it's not being forced on us by some other agency, it's something that at least a given set of Wikimedians wanted), and it's doing quite well, so it's spreading - but not invading... Thanks. Mike Peel (talk) 00:02, 19 September 2017 (UTC)
Chris, thanks for that reply. I am not questioning if Wikidata is a good thing. Sure it is. This is about using Wikidata in en-WP. How hard is it to remove the Wikidata description from the head of en-WP articles where WMF has put it? Seems like not that hard...
Mike Peel, a "weed" is a plant growing in the wrong place. (like "dirt" is matter in the wrong place). Wikidata is another movement project and there is nothing intrinsically wrong with it. But it is different from Wikipedia - different policies and different governance govern its content. It keeps poking into en-WP in inappropriate ways. Some of that is done by individuals who are enthusiastic and those instances need to be handled as they come. The stated goal in the WMF goals, of centrally looking for ways to integrate it, is different as it is "top-down" and able to make very wide-ranging changes, like adding the description field to every single en-WP article displayed in VE or the apps, etc. It is that central drive that makes it more like kudzu. That is what I want to uproot - that needs to be done with consideration of the different policy/governance between the projects and done with consensus, case by case. Jytdog (talk) 00:19, 19 September 2017 (UTC)
@Jytdog: If you want a gardening analogy, I'd suggest that Wikidata is more like adding a lawn to a garden - it fills in the blanks between the more interesting borders/features (Wikipedia articles), and does so in a better way than paving slabs (*maybe* dab pages? IDK) or a dirt yard (although I'd view this more as the absence of matter in this analogy). ;-) I think the WMF plan could have been worded better, to talk more about 'enabling' as a goal than 'increase the instances of data usage', but I think this comes through in the methodology (the 'removing barriers' and 'enabling new types of data' bits). Please remember that the goals aren't just about enwp - they are about all of the Wikipedias, and other language Wikipedias approach things quite differently (e.g., see the cebwp case). I don't think that the cases of Wikidata usage in enwp have been inappropriate, but I think they have been prototypes/new growths (and I include my own work here), that need fair evaluation and encouragement to grow in the right ways (sorry, back to gardening here), not just uprooting or cutting off at the base. (For transparency, I should note that I was involved in reviewing this WMF plan, and WMDE's annual plans that included Wikidata, when I was a member of the Funds Dissemination Committee - but my views are my own, as always. ;-) ) Thanks. Mike Peel (talk) 00:45, 19 September 2017 (UTC)
The differences between the cultures of the projects need to be taken seriously. Instances where that is not done are weedy. Jytdog (talk) 01:11, 19 September 2017 (UTC)