The gaps in our knowledge of our gaps: What we know we don't know, and why it might matter more than you might think

Results on gender distribution for Wikipedia editors in 2018 according to the Wikimedia Foundation

What does 9% get you these days? Not much, you might think. That 9% is the proportion of editors who are women according to the 2018 Community Engagement Survey by the Wikimedia Foundation, and by all accounts that 9% gets us a distribution of biographies that is heavily slanted toward males.

But that 9% gets Wikipedia more than you think. It gets dozens of local projects across multiple languages working to address the issue. It got a piece in The New York Times and a peer-reviewed publication or two. It gets widespread attention and concerted community effort online and in real life, and it might just get a coherent strategy, because we have a pretty good idea what the problem is, why it needs to be solved, and we have many creative, dedicated people working on solutions. We've got edit-a-thons, outreach, and people from all backgrounds making friends through their computer screens and shaking hands in real life, coming together to address an issue that they all care about.

That's the value of knowing our gaps. Like an uncomfortable seat in a folding chair in the back of a 12-step program, the first step in fixing a problem is realizing that we have one. But we have some gaps in our gaps. In the same 2018 survey, the Foundation measured diversity among our contributors along four dimensions: gender, age, education and geography. That's not a very diverse diversity. I don't know about you, but there are many things about myself and the people I know, what we're interested in, what we read about, and what we write about that aren't captured by gender, age, education and geography.

How many of our articles on a wide range of subjects are being written without input from the wide range of groups they represent? We don't know the answer. We haven't even asked the question yet.

Some people claim to know Wikipedia's ethnic makeup ("mostly white"), but the source of that information is not clear.[a] Even so, "mostly white" doesn't tell us much in a global context, as if a white guy from Georgia is the same as a white guy from the other Georgia. Admittedly, some geographic regions are more homogeneous than others, and asking about the distribution of African-American or First Nations editors may only make sense in one national context, but that doesn't mean that context isn't important. Some groups, like AfroCROWD, are hard at work, reaching out, writing, and teaching others. But without any information on the groups they're trying to reach and represent, in many ways they're working in the dark.

Wikipedia Signpost/2019-04-30/Opinion is located in Earth
Approximate barycenter of Wikipedia biographies for those born after 1970[1], and the approximate global center of population as of 2001[2][b]

Perhaps most worrying is that wherever we look for gaps we tend to find them. Beyond the gender gap there's a geography gap. Antarctica has as many articles written about it as almost any country in Africa, and African content makes up "only about 2.6% of the world’s geotagged Wikipedia articles despite having 14% of the world’s population and 20% of the world’s land".[c] There may be an urban/rural gap as well, at least if our editing population reflects our readership, with one study on Mexico finding that only a quarter of readers lived in rural areas.

There may also be a class gap, at least according to the speculation of Katherine Maher, who told Slate:

We don’t actually know much of the background of Wikipedia editors ... But it is true that we tend to assume that folks editing Wikipedia have what we think of as disposable time, and disposable time tends to correlate with higher socio-economic status.

An uneven distribution of editors across socio-economic status has its own implications, since social class is itself unevenly distributed among ethnic groups, urban and rural populations, and educational levels. For other measures of diversity among editors, for example across religions or sexual orientation, few if any people have even pointed out that we really don't know and aren't looking, and few seem bothered by that.

I reached out to Jonathan Morgan, who unfortunately said that the person who coordinated the Community Engagement Insights Survey is no longer with the Foundation. So it's not clear that we will ever get a definite answer on the rationale behind studying only four-point diversity. I also reached out to Rosie Stephenson-Goodknight, one of the founders of WikiProject Women in Red, and when asked about the importance of editor demographics to WiR, she drew a distinction between gaps in our content and gaps in our community:

[W]e are [a] gender-neutral online editing community which does not focus on or care about the editor's gender. "Just write the articles." ... [T]he "content gender gap" is a "people" issue, not a "woman" issue ... Women in Red focuses on the content gender gap. It doesn't focus on or treat editors differently based on their gender. Everyone is welcome to participate in any way that is comfortable for them.

And I do agree that we don't want some situation where we partition articles with only in-group members writing articles on in-group topics. In fact, quite the opposite, and research has suggested that we get better articles precisely when we have editors with a diversity of perspectives working together on them. It is a people issue, and the more people we have on our issues the better off we are. But if we've got a pretty good idea that diversity is beneficial both to article content and community health, and on that point there seems to be a general agreement, then it seems that the natural next question is, "why aren't we measuring our progress on goals we all fairly well agree we are trying to achieve?"

Well, when I spoke to Isaac Johnson, who is facilitating the current study of reader demographics and looking toward planning a future study on editors, he expressed a sentiment that was echoed by both Rosie and Jonathan: an apprehensiveness about the subject of group identification, a feeling that these types of questions may be overly intrusive or alienating. As Isaac put it, there is a concern that we will "end up excluding the very people we're trying to support with this work".

Perhaps I'm overly optimistic, in feeling that, in much of the world, most people are fairly secure in their identities, and that sharing who they are and what makes them unique in an anonymous survey wouldn't be overtly threatening. Maybe not. At the end of the day, it's difficult to tell because we haven't had a very robust discussion on the matter as a community. Hopefully this can be an opportunity to do so.

So I'm interested to hear what you think, and from what I'm told, at least a few people from the Foundation are as well. What demographics do you feel are important to your identity and your editing interests? Do you feel that there are any gaps you've personally experienced anecdotally in community and content that we might better address with a greater depth of understanding? Do you think that asking such questions would be overly intrusive, and if so, are there any ways to minimize this? Do you think the cost of intrusiveness is worth the potential benefit? Are you comfortable with some topics but not others? Does it make a difference that it's the Foundation gathering this data as opposed to someone else, and how do you feel about the fact that such data might influence decision making in grant funding or staff?

Is there something else here we're missing entirely? Let us know. Now's your chance, before we start the next survey, and we miss something we didn't know we weren't looking for.


  a. ^ Though it may have been inferred through a geographic over-representation of places such as Western Europe.
  b. ^ According to Grasland and Madelin, the demographic center of world population lies "at the crossroads between China, India, Pakistan and Tajikistan".[2] According to Gergaud and colleagues, the global barycenter for Wikipedia biographies for those born after 1970 oscillates between Morocco, Algeria and Tunisia. Their data examined 11,341 biographies across 25 language Wikipedias.[1]
    Gergaud and colleagues also highlighted what may be a "chronology-gap" in coverage of biographies, with a marked slant toward coverage of recent information. They noted that, out of all biographies that existed in 2015, 60% were for people who were still living at the time.
  c. ^ The same study determined that there were as many articles written about Western Europe as the entire rest of the world combined.


  1. ^ a b Gergaud, Olivier; Laouenan, Morgane; Wasmer, Etienne (2016). "A Brief History of Human Time: Exploring a database of 'notable people'". Sciences Po Economics Discussion Papers, Sciences Po Departement of Economics. Retrieved 4 March 2019.
  2. ^ a b Grasland, Claude; Madelin, Malika (May 2001). "The unequal distribution of population and wealth in the world" (PDF). Population et Sociétés. Institut national d'études démographiques. 368: 1–4. ISSN 0184-7783.