Wikipedia talk:Wikipedia Signpost/2014-07-30/Book review

"I've never seen anyone wonder why there's no dedicated noticeboard where one goes for help in figuring out whether questionable information in an article is accurate or not." That's what the article and its talk page are for. If you find a questionable claim, just WP:CHALLENGE it. You can {{cn}} it, delete it, or post on the talk page. Paradoctor (talk) 17:20, 2 August 2014 (UTC)

Of course that's the answer to the question no one's asked. And that works well when an article has a good number of watchers. But hoaxes and bad information are more prevalent on less-watched pages, and a c/n tag or a talkpage hoax there can last unaddressed for months. Newyorkbrad (talk) 17:48, 2 August 2014 (UTC)
He who repeats The Word without The Tag shall be Made Fun Of. The same goes for not following up on threads you start. (I presume you meant "post" rather than "hoax".) If there is any lesson to be gained, I'd say it is "Be bold in deleting stuff you find strange" as well as Wizard's First Rule.
Of course, Wikipedia being the pragmatic beast it is, we have to live with reality. IMO, in light of WP:WIP, the best we can do is to enable a souped-up version of the metadata gadget for all users. And maybe the WMF could fork over some grant money to homeless former Britannica employees for quality-control-for-hire. Paradoctor (talk) 19:09, 2 August 2014 (UTC)
I thought the comment about the accuracy noticeboard (a Wikipedia 'fact check' perhaps) was a really insightful one. It seems like centralising this process could work well to increase visibility for low-visibility articles, for example, where the talk page might not always work? --hfordsa (talk) 15:39, 3 August 2014 (UTC)
  • Well, from my experience I'd say practically all reference works have hoaxes, or errors so blatant as to lead one to suspect they are hoaxes. Harvey Einbinder's The Myth of the Britannica opens with a chapter listing some of the more egregious errors known to have been found in that work that beg to be considered intentional hoaxes, then proceeds to point out the other flaws in the Encyclopedia Britannica. (Have a look at my review for the Signpost for more about Einbinder's book.) Another example might be the article on "Gremlins" in the Funk and Wagnalls Standard Dictionary of Folklore, Legend and Mythology -- although I'd be surprised if anyone mistook that as anything but a joke. And then there's the book I've been using to revise Eponymous archon & provide reliable sources for that article -- Alan E. Samuel's Greek and Roman Chronology, a carefully researched & written book by a tenured academic: over 2 or 3 consecutive pages of this book the word "calendar" is frequently misspelled. Or maybe it is just a sign that the reader has begun to master a subject when she/he starts to catch mistakes in the reliable sources used... -- llywrch (talk) 07:05, 3 August 2014 (UTC)
Taking a hint from software engineering, I'd say expecting stuff made by humans to be perfect is unrealistic. Apply the software defect figures to long mathematical proofs, and note that they, generally, are only checked by a handful of experts, rather than undergoing formal quality testing.
The question is not whether there are problems with our content, but whether the number of problems is acceptable. Paradoctor (talk) 10:53, 3 August 2014 (UTC)
This brings up the old question of how to measure the accuracy of Wikipedia. Of course, it will always be a comparison of the accuracy of two sources, always an A vs. B. Wikipedia vs. Britannica, or perhaps Wikipedia medical articles vs. medical textbooks. Brad's text "how Wikipedia's completeness and fairness and accuracy compare, not only to traditional media sources, but to the other information available on the Internet," suggests to me that the most relevant comparison is Wikipedia vs. the rest of the internet. So for example, we could find say 300 journalists and assign each an article. They would then read the article and compare that to what they learned in an hour on the rest of the internet (TRotI). My guess is that in many subject areas WP will come out on top. Smallbones(smalltalk) 23:25, 3 August 2014 (UTC)
That depends on what parts of the Internet these journalists are allowed to access. Based on my experience researching various topics, if they are limited to the parts where content is free (as in zero cost of access, & no registration needed) Wikipedia would clearly be the winner. If resources accessible through the Internet -- such as LexisNexis & JSTOR -- are included, the comparison would be much, much closer; resources like those will always provide better quality coverage of specific topics, although those specific topics are slowly decreasing in number. -- llywrch (talk) 15:28, 4 August 2014 (UTC)
  • Thanks for the informative reviews and commentary. I have a nit to pick, however, with the following statement:
"Instead, people in the wikiless world would still perform the same Google searches that today bring up their subject's Wikipedia article as a top-ranking hit. They would find the same results, minus Wikipedia, ... "
There is a significant omission from the second sentence: they would find the same results minus Wikipedia and its clones and adaptations and web pages which have mindlessly regurgitated its content. Surprisingly—to me at least—this doesn't seem to have much effect on searches on terms referring to broad general subjects. For searches on the terms "Leibniz", "Vera Lynn" and "mind-body problem", to take three I just came up with off the top of my head, the only obviously Wikipedia-influenced results in the first two pages from Google are Wikipedia itself and Google's own knowledge graph.
The results are altogether different, however, if you do a search on terms designed to find sources for dubious factoids which Wikipedia has got wrong. A Google search on the expression "cadamekela | durkeamynarda", for instance, returns 18 pages of results which, apart from one or two now flagging these as a Wikipedia hoax, have simply reproduced Wikipedia text verbatim, or regurgitated it with some form of paraphrase.
This effect is not limited to hoax material, however. More concerning to me is Wikipedia's power to increase enormously the web impact of cranks. The first page returned by a Google search on the expression ""Jafar al-Sadiq" heliocentric" currently contains links to three web pages reproducing some version of the absurd fiction that an 8th-century Islamic scholar, Ja'far al-Sadiq, had proposed a heliocentric model of the solar system. Web pages peddling this nonsense had certainly already existed before the notorious Jagged 85 added it to Wikipedia's article Heliocentrism, but one result of that addition was a rapid massive increase in the number of such pages.
David Wilson (talk · cont) 05:20, 6 August 2014 (UTC)
(@User:David J Wilson) This point is quite correct and extremely important; it is a point I have made on-wiki several times before, and which I didn't stress in this book review only because the review had already become too long. Errors, questionable assertions, and unfair characterizations contained in Wikipedia articles almost immediately propagate all over the Internet, and may remain on Wikipedia-based mirror and derivative sites for years even after an error is fixed on Wikipedia itself. For the hypothetical "Wikipedia versus" search, you are right that all these sites would need to be assumed away as well. Conversely, in the real world, this adds to my view that in prioritizing our goals for Wikipedia, accuracy and BLP compliance need to be consistently emphasized. Thanks for your input (and thanks to everyone else who has posted here as well). Newyorkbrad (talk) 22:25, 7 August 2014 (UTC)
  • I'd clarify the introduction of the concept of ethnography a bit further to say that it's a qualitative research tradition of embedding oneself in a culture (say that of WP editors) so as to learn how their culture works and, ultimately, to write about it. That would explain his position or intent a little better for new readers. It's usually interesting to hear to what extent he embedded himself (did he interview or mainly use historical talk page data? did he have any offline interaction? what kind of consent did he get for his data collection?), and though it was covered, a bit more about why he did it and what he found (ethnographers constantly need to "gain trust"—how did he view that as he took on more permissions within WP?) One of the other common issues in this type of research is the relationship with the participants—did he run any of his conclusions past his participants so as to have a discussion about their accuracy? Anyway, some thoughts czar  14:36, 22 November 2014 (UTC)