Wikipedia Research Project - "Who edits Wikipedia and why?"
In April 2006, I conducted a small research project which explored the question, "Who edits Wikipedia and why?" This research was conducted as part of an undergraduate Social Research Methods course at Memorial University of Newfoundland in St. John's, Newfoundland, Canada.
I'd like to offer sincere thanks to everyone who participated. This project was a great learning experience for me. I'd also like to apologize for the delay in publishing the results. The past several months have been difficult ones, for personal and other reasons, and I was not able to complete this as quickly as I had hoped. I'm sorry to those who have been kept waiting. That said, I hope this information is useful to you.
Please contact me if you have questions or comments, or if you would like to obtain the survey data (raw, non-identifying data in SPSS or Excel format).
- Ben Jackson, Memorial University of Newfoundland
Survey Results
Introduction
The purpose of the study was to make an initial exploration into the question "Who edits Wikipedia and why?" by obtaining a basic profile of Wikipedia editors – their usage patterns, their attitudes, and their social backgrounds. Wikipedia has sometimes been criticized for allegedly exhibiting systematic bias due to the demographic makeup of its community of editors. In particular, it has been alleged that the ‘typical’ contributor to the English Wikipedia is white, male, has a high level of education, works in a white-collar profession, is technically inclined, and comes from an industrialized nation. This is said to shape the encyclopedia's content in critical ways. With this issue in mind, my research question was: who contributes to Wikipedia and why? The quantitative component of my research consisted of an online survey with 20 questions.
Sampling
My sampling frame in this project was the set of Wikipedians who were both registered users and currently active editors. Lacking access to a list of all such Wikipedians, I attempted to obtain a probabilistic sample by the following means. I used Wikipedia's 'Random Article' feature to select an article, examined the article's edit history, chose the last registered (non-bot) user who had made an edit, and invited that user to participate. This way, I could guarantee a population of active editors, improving my response rate, and select respondents from this pool in a near-random way. Notably, however, this method makes my sampling frame essentially a list of edits rather than a list of users. My sample was therefore likely biased towards the users who made the most edits, and my data can be presumed to over-represent heavier users and under-represent lighter ones. This, however, is consistent with the underlying aim of this research, which is to discover what biases, if any, Wikipedia’s content might acquire from its user base. Put another way, my sampling strategy could be said to yield data consistent less with the question "Who are the Wikipedians?" than with the closely related question "Who is responsible for Wikipedia's edits?" I extended my invitation to participate to 80 Wikipedians, and received completed surveys from 65 respondents, a response rate of 81%.
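The over-representation of heavy editors described above can be illustrated with a small simulation. The Python sketch below is not part of the original study. It makes the simplifying assumption that picking the last editor of a random article amounts to drawing one edit uniformly at random from all edits, so each user's chance of selection is proportional to their edit count. The user names and edit counts are invented purely for illustration; only the figures 80 and 65 come from the text.

```python
import random
from statistics import mean


def simulate_last_editor_sample(edit_counts, n_draws, seed=0):
    """Draw users with probability proportional to their edit count.

    Assumption (not from the study): the last editor of a random article
    behaves like the author of a uniformly random edit.
    """
    rng = random.Random(seed)
    users = list(edit_counts)
    weights = [edit_counts[user] for user in users]
    return rng.choices(users, weights=weights, k=n_draws)


# Hypothetical population: three light editors and one heavy editor.
edit_counts = {"light_a": 10, "light_b": 10, "light_c": 10, "heavy": 970}

sample = simulate_last_editor_sample(edit_counts, n_draws=80)

# Average edit count per user in the whole population.
population_mean = mean(edit_counts.values())
# Average edit count per invited user under edit-weighted sampling.
sample_mean = mean(edit_counts[user] for user in sample)

# Response rate reported in the text: 65 completed surveys from 80 invitations.
response_rate = 65 / 80

print(f"population mean edits: {population_mean}")
print(f"sample mean edits:     {sample_mean}")
print(f"response rate:         {response_rate:.0%}")
```

Under this assumption the sample mean sits well above the population mean, which is the sense in which the method answers "Who is responsible for Wikipedia's edits?" rather than "Who are the Wikipedians?"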
Measurement Issues
Please see the full report for a discussion of measurement issues.
Selected Results
Social background
Respondents ranged from 13 to 66 years of age, with a median age of 28.5 years (see A1). The most represented age groups were between 15 and 30 years, with a long, gradually tapering tail of older users. One quite striking feature is how overwhelmingly males outnumbered females: approximately 95% to 5% (see A2). Regarding nationality, the United States was dominant, followed by the United Kingdom and then various continental European countries (including Russia). Only two respondents listed an Asian country for their nationality, and no respondents listed African, Central American, or South American nations (see A3). Respondents overwhelmingly (82.5%) identified themselves as "white" or as some related European national variant. However, the actual terms used varied, and this should be taken as only a very crude measure. Furthermore, eight respondents did not answer this question.
Almost all respondents were either engaged in paid employment or were students (in about equal measure). Only one respondent identified as retired, and three as "Other" (see A5). No respondents listed housework or caring for children as a primary activity. Among the approximately 40 respondents who reported engagement in some kind of paid employment, hours of labour per week ranged from 3 to 52, with a median of 35 hours per week.
Survey respondents identified themselves as generally technically skilled as well as having a high degree of education. Unsurprisingly for an online community, respondents rated their level of general computer experience as high (a mean of 4.65 out of 5, see A7). More interestingly, they also reported having generally at least some amount of programming experience (a mean of 2.97 out of 5). Only about ten percent of respondents reported having "no experience" with computer programming.
Generally speaking, my data supports the notion that a majority of active editors of the English Wikipedia are educated, white males from developed countries with a fair to high degree of technical ability.
Wikipedia use
Respondents' length of registered involvement with Wikipedia ranged from one to 44 months (3 years, 8 months), with a median length of 14 months (see B1). Edit counts ranged from 270 to 39,100 edits, with a mean of 7,657.22 edits and a median of 4,500 edits (see B2). By comparison, the highest edit count of any person is, according to Wikipedia’s tools, just over 72,000. Responses for hours per week spent editing (in the past 6 months) ranged from half an hour to (an incredible) 42 hours, with a median of ten hours per week (see B3). As would be expected based on my sampling strategy, respondents tended to be heavy users.
Attitudes and beliefs
Significantly, about 60% of respondents described themselves as “Not at all religious” and a further 20% placed themselves just shy of this (see C1). Only the remaining 20% or so described themselves as somewhat to “very” religious. On a liberal/conservative scale where one signifies “very liberal” and nine “very conservative,” the median value was three, or somewhat liberal (see C2). However, these figures do not include the 14 respondents who either did not answer or claimed to have no alignment or affiliation along these lines. Support for the Open Source movement and agreement with the statement that “Information ought to be free” both had mean values of just over four (out of five) and were also (as will be discussed) closely correlated (see C3, C4). In short, respondents tended to be quite secular, somewhat liberal, and generally supportive of Open Source programming and sympathetic to its essential values.
Full report
The full report for this project is available here.