# Wikipedia:Wikipedia Signpost/2018-05-24/Recent research

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, edited jointly with the Wikimedia Research Committee and republished as the Wikimedia Research Newsletter.

### Understanding participation gaps: Why users don’t hear of Wikipedia, don’t visit it, don’t know they can edit, and don’t contribute

reviewed by Miriam Redi
"Schematic drawing of a pipeline of online participation" (Figure 1 from the paper)

Participatory platforms such as Wikipedia offer a unique opportunity to make knowledge production more equitable and inclusive. However, digital inequalities limit the democratic potential of collaborative knowledge repositories, ultimately reducing the number of active contributors in these spaces.

But what are 'digital inequalities', and what are the factors and processes behind such 'participation gaps'? This paper[1] investigates these questions by (a) modeling online participation in knowledge spaces as a sequence of engagement steps, and (b) using a data-driven approach to describe the factors (gender, education, internet skills) generating gaps at each step of online knowledge production in the concrete case of Wikipedia. In 2014, the authors had already published related research that focused more narrowly on Wikipedia’s gender gap; see our earlier review ("Mind the skills gap: the role of Internet know-how and gender in differentiated contributions to Wikipedia").

The authors first theorize a 'pipeline' of knowledge production. The 'pipeline of online participation' is a sequence of engagement stages that internet users must go through to become increasingly involved in participatory websites. To become an active contributor, an internet user must have 1) heard of the participatory site, 2) visited the site, 3) known that anyone can contribute to the site, and 4) effectively engaged with knowledge production. Each step of the pipeline has 'leaks': the number of contributors is lower than the number of people who know anyone can edit, which is lower than the number of people who have visited the site, and so on.
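The pipeline logic described above can be made concrete with a small sketch. The Python below is illustrative only: the stage names, the `pipeline_counts` and `stage_retention` helpers, and the five toy respondents are all invented for this example and are not the authors' code or survey data.

```python
# Pipeline stages in order (hypothetical field names, not from the paper).
STAGES = ["heard", "visited", "knows_editable", "contributed"]


def pipeline_counts(respondents):
    """Count how many respondents reached each stage of the pipeline."""
    return {
        stage: sum(1 for r in respondents if r.get(stage, False))
        for stage in STAGES
    }


def stage_retention(counts):
    """Share of each stage's population that also reaches the next stage.

    A value below 1.0 is a 'leak' between two consecutive stages.
    """
    retention = {}
    for prev, nxt in zip(STAGES, STAGES[1:]):
        key = f"{prev}->{nxt}"
        retention[key] = counts[nxt] / counts[prev] if counts[prev] else 0.0
    return retention


# Invented toy sample: each respondent drops out one stage later than the next.
respondents = [
    {"heard": True, "visited": True, "knows_editable": True, "contributed": True},
    {"heard": True, "visited": True, "knows_editable": True, "contributed": False},
    {"heard": True, "visited": True, "knows_editable": False, "contributed": False},
    {"heard": True, "visited": False, "knows_editable": False, "contributed": False},
    {"heard": False, "visited": False, "knows_editable": False, "contributed": False},
]

counts = pipeline_counts(respondents)
retention = stage_retention(counts)

for transition, share in retention.items():
    print(f"{transition}: {share:.0%} retained")
```

Reporting retention per transition, rather than only the final share of contributors, is what lets a study of this kind say where in the pipeline people drop out.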

The researchers then quantify participation gaps at each step of the pipeline, focusing on Wikipedia as an example of a participatory website for online knowledge production. To do so, they collected survey data from around 1,500 US adults. The survey included questions regarding Wikipedia usage and awareness, encoding the users' position in the knowledge production pipeline (e.g. 'have you ever heard of this site?', 'have you ever visited this site?', 'Have you ever edited a Wikipedia page?'). Other survey questions gathered respondents' attributes, including gender, age, and education level. Results show that engagement does leak at each step of the pipeline: 83% of internet users have visited Wikipedia, while only 68% know that Wikipedia is editable. This suggests that interventions aimed at closing participation gaps need to increase awareness among a broader range of internet users: "Transforming the culture of participation among existing Wikipedians—an area of intervention that receives considerable attention—will not overcome participation gaps."

The authors then identify factors affecting participation, and interventions that could narrow participation gaps. To understand which attributes predict that a user has heard of, has visited, knows that anyone can edit, and has contributed to Wikipedia, the authors use various statistical tools. These include a regression model treating respondents' answers regarding Wikipedia usage as dependent variables, and respondents' attributes as independent variables.
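As a rough illustration of this kind of analysis, the sketch below fits a logistic regression to a yes/no outcome with plain gradient descent. The review does not specify the type of regression, so logistic regression is an assumption made here because the outcomes are binary. The two predictors, the `fit_logistic` helper and the ten data points are likewise invented for illustration; this is not the authors' model or data.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, xi)))
            err = p - yi  # gradient of the log-loss w.r.t. the linear term
            for j, x in enumerate(xi):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Invented toy data. Independent variables: [internet skill score in 0..1, is_female].
X = [[0.1, 1], [0.2, 0], [0.3, 1], [0.4, 0], [0.5, 1],
     [0.6, 0], [0.7, 1], [0.8, 0], [0.9, 1], [1.0, 0]]
# Dependent variable: 1 if the respondent knows Wikipedia is editable, else 0.
y = [0, 0, 1, 1, 0, 1, 0, 1, 1, 1]

weights, bias = fit_logistic(X, y)
print("skill coefficient:", round(weights[0], 3))
print("gender coefficient:", round(weights[1], 3))
```

In this setup a positive coefficient means the attribute is associated with a higher probability of reaching that pipeline stage. A real analysis would fit one such model per stage and report uncertainty alongside each coefficient.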

Results show that higher education, higher internet skills and younger age are associated with increased participation at each step of the pipeline: the authors observe that this gap could be narrowed by promoting interventions that reduce technical and knowledge-based entry barriers. Although income and racial background explain early stages of the pipeline, they are not predictive of whether a user is a contributor to Wikipedia. This suggests the need for interventions addressing early participation gaps among minorities and lower-income groups by reducing obstacles related to internet experience and autonomy.

A gender gap is visible only at the later stages of the pipeline, showing that women tend to contribute less and to be less aware of the possibility of contributing. This supports the need for continued efforts to recruit female editors, but also suggests that campaigns should be put in place to increase awareness among women that Wikipedia is editable. More generally, there are vast education, gender and skill gaps between those who have visited Wikipedia and those who know that it can be edited. This awareness gap in turn affects the probability of being a contributor.

### Edit-a-thon participants are motivated by desire to change the views of society

reviewed by Barbara Page

A study published in Information Research[2] evaluated the motivations and interactions of those who edit and also confirmed the findings of a previous study.[supp 1] Editors who participated in a four-day February 2015 edit-a-thon on the Edinburgh University campus were found to be motivated by their desire to change the views of society. Out of 47 participants in this Scottish study, nine were interviewed afterwards. The authors proposed that their observations apply to editing behavior of Wikipedia editors not attending the event. Wikipedia was described as a 'social media site' and the findings of this study could be applied to other collaborative social media elsewhere. "...[C]ommons-based peer production processes, such as Wikipedia editing, serve as a form of social influence and that volunteers can be motivated to change societal views."

### "Evaluating Wikipedia as a self-learning resource for statistics: You know they'll use it"

reviewed by FULBERT

This recently completed study,[3] still awaiting its volume and issue assignment, began with an acknowledgment that Wikipedia is very widely used by students, often as their first introduction to areas of study about which they know little. As a result, it may be more valuable than ever for academic disciplines not only to know what their students find there, but also to take active steps to improve the content, since it will be accessed and used whether its use is encouraged or discouraged.

The authors identified and utilized a six-step framework for curriculum evaluation to assess five statistical Wikipedia articles that were considered integral to an understanding of that area: arithmetic mean, standard deviation, histogram, confidence interval, and standard error. They were careful to explain that their assessment was done at a specific period in time, and as Wikipedia articles are edited and revised regularly, what they worked with at the time may not be what exists in the articles themselves right now.

The researchers found inconsistencies of quality, presentation, and levels of accuracy across the articles. While that may not be surprising, they determined that most of the articles assessed would not be recommended for readers learning about the concepts for the first time on their own. While the authors point out that Wikipedia attempts to be an encyclopedia and not a student self-learning tool, they found that students would not make this distinction and would likely look up new concepts and learn about them from their Wikipedia articles. The study suggests that stakeholders, especially in education, work on fundamental articles themselves or with their students to improve them. As novice learners in a difficult subject such as statistics may often try to self-learn via Wikipedia, it is suggested that teachers recommend it only for an overview of the topic and not for in-depth understanding. The study likewise calls on educators within disciplinary communities to recognize that students will use Wikipedia as a learning tool regardless of what they are told, and suggests that each community improve the main articles on its subject matter for the benefit of its own students.

### "Wikipedia as a Pedagogical Tool: Complicating Writing in the Technical Writing Classroom"

reviewed by FULBERT

The transparency of information on Wikipedia can be used for many educational purposes within higher education, in part due to the levels of access and agency it provides to students of technical writing. While there are many pedagogical applications of Wikipedia to this student population, the suggestions of this study[4] are readily applicable to educational purposes within other fields and disciplines.

The author conducted a literature review that addressed three areas: wiki technology, and how its technical elements can best be integrated and supported amongst students; Wikipedia within higher education, including how its usage can support democratic involvement of students; and Wikipedia and community, which included elements of communities of editors who support their work in a broad manner. Grounding the pedagogical recommendations in the wiki literature, including both technical and collaborative aspects, is a useful way to frame the subsequent discussion of Wikipedia activities.

The activities discussed were created by the author for an upper-level technical writing elective, though students came from broader disciplinary backgrounds, such as English, psychology, and engineering, amongst others. They were grouped into various categories, starting with the View history, to understand the overall page makeup with elements of the writing process, notion of authorship, and history of how certain articles developed. The Talk page was explored through writing as a process, citations, and the exploration of ideological language usage. The Edit function was explored through writing within community guidelines, writing for readers, and how to write within a wiki environment. Finally, assessment activities were discussed, many of which took the form of student reflective writing on their learning experiences. While the student activities originated within a course on technical writing, they included valuable lessons that involved learning about power and authority and how they manifest through writing. It seems many of these suggestions and experiences may be readily translated into other academic areas within higher education for related benefits.

### "When the World Helps Teach Your Class: Using Wikipedia to Teach Controversial Issues"

reviewed by Piotr Konieczny

"Teaching with Wikipedia" is increasingly becoming a norm – perhaps not as 'a very common activity', but common enough that there are thousands of courses doing it, and dozens of academic papers reviewing the effectiveness of this approach. A paper[5] recently published in PS – Political Science & Politics discusses educational benefits of teaching about controversial issues through the case study of one such assignment, in which students wrote Wikipedia articles on a topic related to inequality for a course taught by the author (a 2015 Kent State University upper-division writing-intensive seminar in political science titled “The Politics of Inequality”). The author, familiar with materials released by the Wiki Education Foundation, followed many recommended 'best practices', such as dedicating class time to teaching students both how to edit Wikipedia and the site's policies related to article quality.

The author found the Wikipedia editing environment conducive to peer review. Students appreciated the collaborative nature of the project, which enabled peer reviews of one another's work, and they understood and were motivated by the fact that their work was intended for the wider world and had long-term impact, extending beyond the immediate duration of the course. Most crucially, Wikipedia's neutrality policy posed an interesting challenge for the students, who had to find reliable sources to back (or challenge) their views. The biggest challenge, unsurprisingly, was "Wikipedia’s clumsy interface and formatting". In the end 85% of the students found the assignment useful. The author likewise found the experience helpful, noting that the assignment "yielded generally positive results". Unfortunately, despite the author's positive conclusions regarding this teaching activity, it seems that this (2015) course has been the first and last in which the author used the 'Teaching with Wikipedia' approach.

On a final note, the paper includes the detailed syllabus and supporting materials used to develop this activity for a course, helpfully facilitating the reuse of this project by other instructors. It is also commendable that the supplementary materials included the course name and Wikipedia course page.

### Briefly

#### Running the numbers

reviewed by Barbara Page

This study is a fascinating description of what data can do when Wikipedia biographies are compared against time and place. The articles of notable people were correlated to time and geodata. A 'center' was determined about which the biographies exist. Currently this 'barycenter' oscillates between Morocco, Algeria and Tunisia. One example of how the data was used was to compare the changes in human lifespan across the centuries, from 60 years in the 1400s to 80 years in the 1900s. The changes in arts, literature and women's biographies relative to sports biographies are not surprising. The ratio of more current biographies of women, artists and sports people affects more recent data.[6]
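To give a sense of what computing such a 'center' can involve, here is a minimal Python sketch of one common approach: averaging locations as 3D unit vectors on a sphere and converting the mean back to latitude and longitude. The review does not say how the paper computes its barycenter, so this method, the `geographic_barycenter` helper and the sample points are assumptions made purely for illustration.

```python
import math


def geographic_barycenter(points):
    """Weighted center of (lat_deg, lon_deg, weight) points on a unit sphere.

    Each point is converted to a 3D unit vector, the vectors are summed
    with their weights, and the sum is projected back to lat/lon.
    """
    x = y = z = 0.0
    for lat_deg, lon_deg, weight in points:
        lat, lon = math.radians(lat_deg), math.radians(lon_deg)
        x += weight * math.cos(lat) * math.cos(lon)
        y += weight * math.cos(lat) * math.sin(lon)
        z += weight * math.sin(lat)
    center_lon = math.atan2(y, x)
    center_lat = math.atan2(z, math.hypot(x, y))
    return math.degrees(center_lat), math.degrees(center_lon)


# Invented example: two equally weighted points on the equator,
# ten degrees west and ten degrees east of the prime meridian.
lat_c, lon_c = geographic_barycenter([(0.0, -10.0, 1.0), (0.0, 10.0, 1.0)])
print(lat_c, lon_c)
```

Working with unit vectors rather than averaging raw latitude and longitude values avoids distortions near the poles and across the 180° meridian.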

#### Rhythms

reviewed by Barbara Page

Scientists are influenced by Wikipedia and Wikipedia in turn influences the literature. The editing histories and 'debates' of two articles, Circadian clock and Circadian rhythm, were examined over a period of ten years. Those conducting the study evaluated the influence that 'ground-breaking studies' had on the development of the topic. The problems that the scientific community has with Wikipedia content and editors were also described.[7]

### Conferences and events

See the community-curated research events page on Meta-wiki for other upcoming conferences and events, including submission deadlines.

### Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions are always welcome for reviewing or summarizing newly published research.

• "Ongoing Events in Wikipedia: A Cross-lingual Case Study"[8] From the abstract (of this extended abstract paper): "In this abstract we present preliminary results of a case study with the goal to better understand how researchers interact with multilingual event-centric information in the context of cross-cultural studies and which methods and features they use."
• "Wikipedia as a gateway to biomedical research: The relative distribution and use of citations in the English Wikipedia"[9] From the abstract: "This study aims to establish benchmarks for the relative distribution and referral (click) rate of citations—as indicated by presence of a Digital Object Identifier (DOI)—from [English] Wikipedia, with a focus on medical citations. [...] all DOIs in Wikipedia were categorized as medical (WP:MED) or non-medical (non-WP:MED). Using this categorization, referred DOIs were classified as WP:MED, non-WP:MED, or BOTH, meaning the DOI may have been referred from either category. Data were analyzed using descriptive and inferential statistics. Out of 5.2 million Wikipedia pages, 4.42% (n = 229,857) included at least one DOI. 68,870 were identified as WP:MED, with 22.14% (n = 15,250) featuring one or more DOIs. WP:MED pages featured on average 8.88 DOI citations per page, whereas non-WP:MED pages had on average 4.28 DOI citations. For DOIs only on WP:MED pages, a DOI was referred every 2,283 pageviews and for non-WP:MED pages every 2,467 pageviews. DOIs from BOTH pages accounted for 12% (n = 58,475)."
• "What are the ten most cited sources on Wikipedia? Let’s ask the data"[10]
• "Leveraging structural-context similarity of Wikipedia links to predict twitter user locations"[11] From the abstract: "...we propose a novel framework for predicting the location of a social media user by leveraging structural-context similarity over Wikipedia links. We measure SimRanks between pages over the Wikipedia dump dataset and build a knowledge base, mapping location information (e.g., cities and states) to related vocabularies along with the likelihood for these mappings. Our results evolve as the users' tweet stream grows."

### References

1. ^ Shaw, Aaron; Hargittai, Eszter (2018-02-01). "The Pipeline of Online Participation Inequalities: The Case of Wikipedia Editing". Journal of Communication. 68 (1): 143–168. doi:10.1093/joc/jqx003. ISSN 0021-9916. (available via archive.org)
2. ^ Littlejohn, Allison; Hood, Nina (2018-03-15). "Becoming an online editor: perceived roles and responsibilities of Wikipedia editors". Information Research. 23 (1), paper 784.
3. ^ Dunn, Peter K.; Marshman, Margaret; McDougall, Robert (2017-10-30). "Evaluating Wikipedia as a self-learning resource for statistics: You know they'll use it". The American Statistician. doi:10.1080/00031305.2017.1392360. ISSN 0003-1305. Retrieved 2018-01-28. (Wikidata information via Scholia)
4. ^ Virtue, Andrew David (2017). "Wikipedia as a Pedagogical Tool: Complicating Writing in the Technical Writing Classroom". Wiki Studies. 1 (1).
5. ^ Cassell, Mark K. (April 2018). "When the World Helps Teach Your Class: Using Wikipedia to Teach Controversial Issues". PS: Political Science & Politics. 51 (2): 427–433. doi:10.1017/S1049096517002293. ISSN 1049-0965.
6. ^ Gergaud, Olivier; Laouenan, Morgane; Wasmer, Etienne (2016). "A Brief History of Human Time: Exploring a database of 'notable people'". Sciences Po Economics Discussion Papers, Sciences Po Departement of Economics.
7. ^ Benjakob, Omer; Aviram, Rona (2018). "A Clockwork Wikipedia: From a Broad Perspective to a Case Study". Journal of Biological Rhythms: 074873041876812. doi:10.1177/0748730418768120. ISSN 0748-7304.
8. ^ Gottschalk, Simon; Demidova, Elena; Bernacchi, Viola; Rogers, Richard (2018-01-22). "Ongoing Events in Wikipedia: A Cross-lingual Case Study". doi:10.1145/3091478.3098879. Retrieved 2018-02-12.
9. ^ Maggio, Lauren A.; Willinsky, John M.; Steinberg, Ryan M.; Mietchen, Daniel; Wass, Joseph L.; Dong, Ting (2017-12-21). "Wikipedia as a gateway to biomedical research: The relative distribution and use of citations in the English Wikipedia". PLOS ONE. 12 (12): e0190046. doi:10.1371/journal.pone.0190046. ISSN 1932-6203. (Scholia entry)
10. ^ Miriam Redi, Jake Orlowitz, Dario Taraborelli, Ben Vershbow: "What are the ten most cited sources on Wikipedia? Let's ask the data". blog.wikimedia.org. Retrieved 2018-04-23.
11. ^ Huang, Chuanqi (2018-01-17). "Leveraging structural-context similarity of Wikipedia links to predict twitter user locations". Colorado State University. Libraries. (MSc thesis)
Supplementary references:
1. ^ Benkler, Y. & Nissenbaum, H. (2006). Commons-based peer production and virtue. Journal of Political Philosophy, 14(4), 394-419.