Operator: Jtmorgan
Time filed: 14:58, 7 November 2013 (UTC)
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python, uses WikiTools
Source code available: Source code for hostbot is public, source code for this particular task is being developed
Function overview: Invites new good faith editors to play The Wikipedia Adventure
Links to relevant discussions (where appropriate):
Edit period(s): Daily
Estimated number of pages affected: 100 per day during the course of the beta test (2-4 weeks)
Exclusion compliant (Yes/No): Yes
Already has a bot flag (Yes/No): Yes
Function details:
The sampling and data analysis plan for the TWA test will be similar to that used to evaluate the impact of participation in the Wikipedia Teahouse, which is described in the Teahouse metrics report and in this research paper.
We will invite a sample of 100 new editors to play TWA every day. The sample will be drawn from the set of users classified as “good faith” by the Snuggle tool developed by EpochFail. A sample of Snuggle data is available here. The criteria for invitation will be:
- The user created their account within the past 24 hours
- The user has made at least 1 main namespace edit
- The user has a Snuggle desirability score greater than 0.8. This threshold excludes blocked or banned accounts, as well as users who are likely to be editing in bad faith.
- The user has not yet received a Teahouse invitation.
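The four criteria above can be read as a single eligibility check. The following is a minimal sketch only, not HostBot's actual code: the `NewUser` record, its field names, and the `is_eligible` function are hypothetical, and it assumes the Snuggle score and Teahouse invitation status have already been looked up for each user.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class NewUser:
    """Hypothetical record; field names are illustrative, not HostBot's."""
    name: str
    registered: datetime        # account creation time (UTC)
    mainspace_edits: int        # edits to the main (article) namespace
    desirability: float         # Snuggle desirability score, 0.0 to 1.0
    has_teahouse_invite: bool   # True if already invited to the Teahouse


def is_eligible(user: NewUser, now: datetime, min_score: float = 0.8) -> bool:
    """Return True if the user meets all four invitation criteria."""
    account_age = now - user.registered
    return (
        timedelta(0) <= account_age <= timedelta(hours=24)
        and user.mainspace_edits >= 1
        and user.desirability > min_score   # strictly greater than 0.8
        and not user.has_teahouse_invite
    )


now = datetime(2013, 11, 7, 15, 0, tzinfo=timezone.utc)
candidate = NewUser("Example", now - timedelta(hours=3), 1, 0.9, False)
print(is_eligible(candidate, now))
```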
Invitations to play the game will be sent as talk page messages from HostBot. Users who receive an invitation and subsequently complete at least 1 level of the Wikipedia Adventure will serve as the experimental group (Group A).
For every 100 editors invited to play TWA, another 100 new editors who meet the criteria for invitation will not receive one. Of these editors, those who subsequently make at least 1 edit to Wikipedia will serve as a basic experimental control group (Group B). We require at least 1 subsequent edit (after the hour when the user would have been invited to TWA, had they been included in Group A) to ensure that the editors in this group would have had the opportunity to see the invitation, i.e. that they had not already given up or lost interest in editing by the time of invitation.
A second control group (Group C) will consist of editors who received an invitation and did not play TWA at all, but who did make at least 1 edit to Wikipedia after receiving the invitation. This control group will be used to determine whether the invitation itself has any effect on subsequent editing activity or long-term retention, separate from the potential impact of playing TWA.
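The definitions of Groups A, B, and C can be summarized as a small classification function. This is an illustrative sketch under stated assumptions: the function name and its parameters are hypothetical, and `later_edits` is assumed to count edits made after the time of invitation (or, for uninvited users, after the time they would have been invited). Users who match none of the three definitions are left out of the comparison.

```python
from typing import Optional


def assign_group(invited: bool, started_game: bool,
                 levels_completed: int, later_edits: int) -> Optional[str]:
    """Map one eligible user to study group "A", "B", "C", or None."""
    if invited and levels_completed >= 1:
        return "A"   # invited and completed at least 1 level of TWA
    if not invited and later_edits >= 1:
        return "B"   # not invited, but still editing afterwards
    if invited and not started_game and later_edits >= 1:
        return "C"   # invited, never played, but still editing afterwards
    return None      # e.g. no later edits, or started without finishing a level


print(assign_group(invited=True, started_game=True, levels_completed=2, later_edits=5))
print(assign_group(invited=False, started_game=False, levels_completed=0, later_edits=1))
print(assign_group(invited=True, started_game=False, levels_completed=0, later_edits=3))
```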
The subsequent editing activities of the editors in Group A will be compared with those of the editors in Groups B and C. Metrics used to evaluate impact are likely to include number of edits, number of articles edited, change in Snuggle desirability score, and level of activity over time (retention), and may include other metrics.
Editors who start the game will be monitored for signs of increased vandalism to the encyclopedia, and cleanup actions will be taken by those monitoring as needed during the test.
Discussion
Background
There is interest in how principles of game mechanics and playful design can help encourage users to take meaningful actions online. It is not yet known, however, whether that body of research offers lessons for improving Wikipedia. We have therefore designed an experiment to test whether an onboarding game is a useful method for training new Wikipedians, using a fun, interactive game/tour called The Wikipedia Adventure (TWA).
Research questions
Do new editors who complete the Wikipedia Adventure:
- go on to be more active and successful Wikipedians?
- make more positive contributions to the encyclopedia?
- have a better understanding of Wikipedia and experience fewer frustrations?
- remain with the community for longer than editors who are not exposed to it?
Test plan
The sampling and data analysis plan for the TWA test will be similar to that used to evaluate the impact of participation in the Wikipedia Teahouse, which is described in the Teahouse metrics report and in this research paper.
We will invite a sample of 100 new editors to play TWA every day. The sample will be drawn from the set of users classified as “good faith” by the Snuggle tool developed by EpochFail. A sample of Snuggle data is available here. The criteria for invitation will be:
- The user created their account within the past 24 hours
- The user has made at least 1 main namespace edit
- The user has a Snuggle desirability score greater than 0.8. This threshold excludes blocked or banned accounts, as well as users who are likely to be editing in bad faith.
- The user has not yet received a Teahouse invitation.
Invitations to play the game will be sent as talk page messages from HostBot. Users who receive an invitation and subsequently complete at least 1 level of the Wikipedia Adventure will serve as the experimental group (Group A).
For every 100 editors invited to play TWA, another 100 new editors who meet the criteria for invitation will not receive one. Of these editors, those who subsequently make at least 1 edit to Wikipedia will serve as a basic experimental control group (Group B). We require at least 1 subsequent edit (after the hour when the user would have been invited to TWA, had they been included in Group A) to ensure that the editors in this group would have had the opportunity to see the invitation, i.e. that they had not already given up or lost interest in editing by the time of invitation.
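One way to form the invited and uninvited sets each day is to shuffle the eligible pool and take two equal slices. The plan does not state how users are allotted to each set, so random assignment here is an assumption, and `split_daily_sample` and its parameters are hypothetical names used for illustration only.

```python
import random


def split_daily_sample(eligible, per_group=100, seed=None):
    """Shuffle the eligible users and return (invited, held_out) lists.

    Assumes random assignment. If fewer than 2 * per_group users are
    eligible on a given day, the held-out list will be shorter.
    """
    rng = random.Random(seed)   # a fixed seed makes a day's draw reproducible
    pool = list(eligible)
    rng.shuffle(pool)
    invited = pool[:per_group]
    held_out = pool[per_group:2 * per_group]
    return invited, held_out


# Usage with placeholder user names.
users = [f"User{i}" for i in range(250)]
invited, held_out = split_daily_sample(users, seed=2013)
print(len(invited), len(held_out))
```

Drawing both sets from the same shuffled pool keeps them disjoint, so no user can be both invited and counted as an uninvited control.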
A second control group (Group C) will consist of editors who received an invitation and did not play TWA at all, but who did make at least 1 edit to Wikipedia after receiving the invitation. This control group will be used to determine whether the invitation itself has any effect on subsequent editing activity or long-term retention, separate from the potential impact of playing TWA.
The subsequent editing activities of the editors in Group A will be compared with those of the editors in Groups B and C. Metrics used to evaluate impact are likely to include number of edits, number of articles edited, change in Snuggle desirability score, and level of activity over time (retention), and may include other metrics.
Editors who start the game will be monitored for signs of increased vandalism to the encyclopedia, and cleanup actions will be taken by those monitoring as needed during the test.
Analysis
We will be logging the game using Guided Tours so we can evaluate the degree to which the game impacts engagement.
We’re comparing:
- Not invited
- Invited but didn't complete Mission 1
- Completed Mission 1
- Completed Mission 4
- Completed Mission 7
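The comparison cohorts above could be derived from the logged game progress roughly as follows. This sketch assumes that missions are completed in order, that each user is placed in the cohort for the highest milestone reached, and that a per-user count of completed missions is available from the logs; the function and constant names are hypothetical.

```python
MILESTONES = (7, 4, 1)   # checked from highest to lowest


def analysis_cohort(invited: bool, missions_completed: int) -> str:
    """Return the comparison cohort label for one user."""
    if not invited:
        return "Not invited"
    for milestone in MILESTONES:
        if missions_completed >= milestone:
            return f"Completed Mission {milestone}"
    return "Invited but didn't complete Mission 1"


for invited, missions in [(False, 0), (True, 0), (True, 3), (True, 7)]:
    print(invited, missions, "->", analysis_cohort(invited, missions))
```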
Quantitative
We'll run database queries to gather quantitative data including:
- number of edits to articles
- number of talk page edits
- frequency of edits over time
- amount of content added that survives over time
- warnings and blocks
- STiki scores - metadata analysis
- Namespace breakdown
- Mission-specific skills we can evaluate
- userpage edits
- user talk edits
- article space edits
- teahouse edits
- inline citations added
- wikilinks, images, headers added
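As an illustration of the namespace breakdown, the sketch below tallies a user's edits by namespace. It assumes the edits have already been retrieved as (namespace ID, full page title) pairs; the function name and the labels are hypothetical. The namespace IDs are the standard MediaWiki ones (0 for articles, 1 for article talk, 2 for user pages, 3 for user talk, 4 for the project namespace). Teahouse edits are approximated here as project-namespace edits to pages whose titles begin with "Wikipedia:Teahouse".

```python
from collections import Counter

NAMESPACE_LABELS = {0: "article", 1: "talk", 2: "user page", 3: "user talk"}
PROJECT_NAMESPACE = 4
TEAHOUSE_PREFIX = "Wikipedia:Teahouse"


def namespace_breakdown(edits):
    """Count edits per category from (namespace_id, page_title) pairs."""
    counts = Counter()
    for namespace_id, page_title in edits:
        if namespace_id == PROJECT_NAMESPACE and page_title.startswith(TEAHOUSE_PREFIX):
            counts["teahouse"] += 1
        else:
            # Anything without a label of its own is grouped as "other".
            counts[NAMESPACE_LABELS.get(namespace_id, "other")] += 1
    return counts


sample_edits = [
    (0, "Ada Lovelace"),
    (0, "Python (programming language)"),
    (3, "User talk:Example"),
    (4, "Wikipedia:Teahouse/Questions"),
    (4, "Wikipedia:Village pump"),
]
print(namespace_breakdown(sample_edits))
```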
Qualitative
- A survey linked from the last page of the game will be deployed to help assess the editors' satisfaction, experience, and understanding of Wikipedia
- We may also use Snuggle to manually review game participants and categorize their editing activity subjectively
Follow-up
The results of the experiment will be shared with the community by the end of 2013 in order to inform a decision about whether the game should be more widely deployed or discontinued, or whether further testing is needed.