Wikipedia talk:Community health initiative on English Wikipedia/Blocking tools and improvements

Welcome to our discussion!
      When participating in this discussion, please remember the following:
  • Debate ideas, not people.
  • Avoid discussing or linking to specific examples of harassment or user misconduct.
  • To participate privately, contact our team via email. (Your email may be shared internally but never publicly.)

What this discussion is about

The Wikimedia Foundation's Anti-Harassment Tools team is identifying shortcomings in MediaWiki’s current blocking functionality in order to determine which blocking tools we can build for wiki communities to minimize disruption, keep bad actors off their wikis, and mediate situations where entire site blocks are not appropriate.

This discussion will help us prioritize which new blocking tools or improvements to existing tools our software developers will build in early 2018. We need input from users to determine which of the four problems listed below are the most important to address first and which of the proposed solutions hold the most potential. We are also looking for proposals for new blocking tools or improvements to existing tools. SPoore (WMF), Community Advocate, Community health initiative (talk) 22:11, 13 December 2017 (UTC)

Thank you! —

Problem 1. Username or IP address blocks are easy to evade by sophisticated users

Previous discussions
  • It would be ideal if we could somehow technically block specific devices; with the way that most modern internet browsing is done IP addresses for most people end up being ephemeral anyways, and it's not hard to change them. I have NO technical knowledge here, but if there was a way to decide "This device here can no longer be used to edit Wikipedia" it may help some; of course it would still cause problems for people who edit from public terminals that are blocked from the bad action of one person, but I think this would be an improvement over the current situation. --Jayron32 20:12, 14 December 2017 (UTC)
    Hello Jayron, thanks for your post. You might also be interested in the discussion happening about this on the page on Meta, here, and also here where MER-C listed out some possible ideas. SPoore (WMF), Community Advocate, Community health initiative (talk) 15:31, 15 December 2017 (UTC)

Problem 2. Aggressive blocks can accidentally prevent innocent good-faith bystanders from editing

Previous discussions
  • One of the proposed potential solutions is, "Prevent the use (or flag incidents) of blacklisted email addresses from being associated with new user accounts." We should consider blocking the use of disposable email domains. In my professional life, I've made use of block-disposable-email.com (WP:COI: I have no other relationship with that site, broadly construed). A disposable email domain, such as mailinator.com, is meant for temporary, throwaway use. Such email addresses are not secure; they are not protected by a password, so if you start using, say, fakeforwikipedia@mailinator.com, anyone else can read the emails sent there. Not good for Wikipedia account security. Note that block-disposable-email.com is a paid service. --Yamla (talk) 14:42, 14 December 2017 (UTC)
    • Hello Yamla, thank you for sharing your knowledge about disposable email domains. It is a complex topic because there are competing interests. For example, in addition to requests to use email addresses to assist with blocking new account creation, we (Anti-Harassment Tools team) are getting requests to help prevent the disclosure of email addresses. That we disclose them now might be a reason that some people are using disposable email domains now. See "Do more to avoid disclosing the email address of users". We (us and the community) will need to look at all of these ideas in conjunction with each other to consider how they affect the overall experience of users. Your information is helpful. We may want to expand on it as we dig into this topic more. SPoore (WMF), Community Advocate, Community health initiative (talk) 19:49, 14 December 2017 (UTC)
      • Thanks! My goal is for you to consider this, not necessarily to implement it. As you indicate, there may be good reason to avoid a block of these sorts of domains. If you have questions about implementation details, though, please feel free to reach out to me. At work, we took a slightly different approach than would be obvious from the information provided on the website. --Yamla (talk) 19:53, 14 December 2017 (UTC)
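
For readers who want to picture the disposable-domain idea raised in this thread, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the blocklist contents and the function names are hypothetical, and it does not reflect how block-disposable-email.com or MediaWiki work.

```python
# Hypothetical blocklist; a real deployment would load a maintained list.
DISPOSABLE_DOMAINS = {"mailinator.com"}


def email_domain(address: str) -> str:
    """Return the lower-cased domain part of an email address."""
    _, sep, domain = address.rpartition("@")
    if not sep or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return domain.strip().lower()


def is_disposable(address: str, blocklist=DISPOSABLE_DOMAINS) -> bool:
    """True if the address's domain, or any parent domain, is on the blocklist."""
    labels = email_domain(address).split(".")
    # For "a.b.example.com" this checks "a.b.example.com", "b.example.com",
    # then "example.com"; the bare top-level label is never checked.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels) - 1))


print(is_disposable("fakeforwikipedia@mailinator.com"))
print(is_disposable("someone@example.org"))
```

Whether a match should reject the registration or merely flag it is the policy question discussed above; the check itself is the easy part.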

Problem 3. Full-site blocks are not always the appropriate response to some situations

Previous discussions
  • I like the idea of being able to per-page or per-category block someone. If there were a feasible way to block someone from editing certain groups of pages which could be tailored to the specific block (i.e. as granular as a single page or as large as "every page listed in selected categories") then we could enforce things like topic bans without having to throw someone out of Wikipedia. If they literally couldn't edit those pages, it would allow them to be useful in areas outside of those problems. We could still site-block for situations where those tailored blocks don't work. But I like this idea a lot. --Jayron32 20:09, 14 December 2017 (UTC)
  • This is extremely promising and has many practical applications. Full speed ahead, I say. GABgab 21:24, 18 December 2017 (UTC)Reply
  • I support this. Our current blocking tools are used to crack nuts with sledgehammers and cause drama. Ask Eric Corbett, Cassianto, The Rambling Man, SchroCat, Iridescent. Ritchie333 (talk) (cont) 05:19, 19 December 2017 (UTC)Reply
  • Throwing my headband into this ring. The all or nothing nature is probably the reason for half of all blocking drama. Now in my personal opinion the tendency to use expiries as a way to compensate for the broadness often compounds the problem by encouraging overuse, but that's another kettle of sea animals. Jo-Jo Eumerus (talk, contributions) 09:21, 19 December 2017 (UTC)Reply
It's a shame I missed that meta discussion - particularly Natureium who said "Temporarily blocking editors that refuse to follow the rules of an article works fine." Take a look at Wikipedia:Arbitration/Requests/Case/Arbitration enforcement and Wikipedia:Arbitration/Requests/Case/Arbitration enforcement 2, where we saw months of drama with a huge number of users (wasn't there a lengthy discussion on Wikipediocracy thrown into the mix too?) centred around the issue of when it was appropriate to serve a block. No - if we can solve this problem technically, then we must. Ritchie333 (talk) (cont) 10:31, 19 December 2017 (UTC)Reply
  • I like this idea. It would make many things easier. ···日本穣 · 投稿 · Talk to Nihonjoe · Join WP Japan! 23:55, 16 January 2018 (UTC)Reply
  • Category blocking would be very useful and would allow us to control a significant subset of troublesome editors while still letting them edit generally - analogous to the Standard Offer in some ways. I would definitely use this. Guy (Help!) 23:44, 16 February 2018 (UTC)Reply

Problem 4. The tools to set, monitor, and manage blocks have opportunities for productivity improvement

Previous discussions
  • ...

General discussion

  • I'll start with the bad and unworkable ideas. User agent blocking and blocking by device ID would not be useful. Global blocks already prevent account creation by default. Allowing checkusers to "watch" IP addresses would have serious privacy implications and not be useful, as we can already do sleeper checks when necessary. Allowing admins to oversight usernames when blocking is a clear violation of the access to nonpublic data policy, and is already possible through a request to the stewards - with global accounts, this should not be done on a per-wiki basis anymore anyway. Preventing the use of blacklisted emails is already possible, and not very practical.
Now onto the suggestion: the main way to make blocking more effective is to require all new accounts to have an associated email address from a major provider (gmail, yahoo, etc). There should also be a way for checkusers to identify which accounts use the same email, without revealing the email address. The unknown email address should be block-able on the local and global level. Annotating block logs would be of marginal utility, but still possibly worth doing if a dev has some free time. More specific user blocking would be useful, but should be separate from the existing blocking interface due to the vastly different situations in which the two functions would be used. -- Ajraddatz (talk) 01:33, 14 December 2017 (UTC)Reply
  • Blocking by user agent could be useful for range blocks to potentially reduce collateral damage, which would make them a more attractive option. NinjaRobotPirate (talk) 03:23, 14 December 2017 (UTC)Reply
    User agents can be useful in a lot of cases when trying to figure out who someone is. But there is a very high possibility of collateral damage, especially when a common device or browser is used. It is also even easier for a person to change their user agent than their IP. There might be some useful applications of user agent blocking, but user agents tend to be even more fluid than IPs, even when people aren't intentionally trying to change them. -- Ajraddatz (talk) 19:39, 14 December 2017 (UTC)Reply
  • We have a current situation where a globally banned editor persists in editing using dynamic IP addresses from at least two providers. Some means of preventing this disruption other than using multiple rangeblocks would be helpful. --Malcolmxl5 (talk) 16:05, 14 December 2017 (UTC)Reply
    Hello Ajraddatz, NinjaRobotPirate, and Malcolmxl5, thanks for highlighting the pro and cons, strengths and weaknesses of these ideas. The challenge is (and always has been) balancing collateral damage against the need to stop account creation from repeat trolls and vandals. We (AHT team and the community) will need to consider this as we weigh solutions. SPoore (WMF), Community Advocate, Community health initiative (talk) 20:22, 14 December 2017 (UTC)Reply
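
The suggestion earlier in this thread, that checkusers could see which accounts share an email address without seeing the address itself, and that the unknown address could still be blocked, might be sketched with a keyed hash. This is a rough illustration only: the secret key, the function names, and the normalization step (trimming and lower-casing the whole address) are assumptions, not anything the team has proposed.

```python
import hashlib
import hmac
from collections import defaultdict

# Assumption: a server-side secret that checkusers never see.
SECRET_KEY = b"replace-with-a-real-secret"


def email_token(address: str, key: bytes = SECRET_KEY) -> str:
    """Keyed hash of a normalized email; the same address always yields the same token."""
    normalized = address.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def accounts_sharing_email(accounts: dict[str, str]) -> list[list[str]]:
    """Group usernames whose emails hash to the same token (groups of 2+ only)."""
    groups = defaultdict(list)
    for username, address in accounts.items():
        groups[email_token(address)].append(username)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def is_token_blocked(address: str, blocked_tokens: set[str]) -> bool:
    """A block can be stored against the token without storing the address itself."""
    return email_token(address) in blocked_tokens
```

Only tokens would be shown or stored in the block list, so a checkuser comparing two accounts learns that the tokens match but not what the address is.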


  • Email: There is no ethical way Wikipedia/WMF could force editors to use a Wikipedia/WMF approved email provider any more than they can be forced to do their editing from a Wikipedia/WMF specified OS or brand of mobile device or what car they should drive. I for one, have my own TLD and server hardware and would certainly refuse to sign up with Google to appease Wikipedia. That would be the day I retire.
That said, for those who are determined spammers and/or paid editors exploiting our voluntary work, creating throwaway Gmail accounts is as easy as creating throwaway Wikipedia accounts, using cell phone mobile data and a pocketful of cell phones of different brands, OS, and UA. These guys are professionals (names withheld).
But the idea of requiring an email address on registration is worth a try - Wikipedia is probably one of the few sites left in the world now that doesn't require one - the professional sites also require a cell phone number to double check. Kudpung กุดผึ้ง (talk) 15:25, 14 December 2017 (UTC)
Creating throw-away email accounts doesn't take long, but tends to take much longer than just creating a Wikipedia account. At the very least, we could double the time it takes for a sockmaster to make a new account by requiring an email address. This would a) help dissuade people from making more socks in the first place, and b) give us more time when responding to ongoing abuse. Fair point about using a WMF-approved email provider; instead, we could allow all emails by default and then block naughty hosts as required. -- Ajraddatz (talk) 19:45, 14 December 2017 (UTC)
Noting these comments about the requirement for an email as a tool to slow down or in some instances stop blocked users from returning. SPoore (WMF), Community Advocate, Community health initiative (talk) 20:45, 14 December 2017 (UTC)
Requiring a mobile phone code sent to a unique number to sign up has merit for the same reasons. Getting a new mobile phone number requires more effort and costs money, which means it would be a lot more effective in stopping mass creation of socks. It's also an opportunity to ask "would you like to turn on 2FA?". MER-C 19:53, 15 December 2017 (UTC)Reply

Wishlist topics related to blocking

We're watching the 2017 Community Wishlist for proposals related to blocking: Smart Blocking, Per-page user blocking, and Allow further user block options ("can edit XY" etc.). Discussion about these proposals and others on the Community Wishlist is welcome in this consultation. SPoore (WMF), Community Advocate, Community health initiative (talk) 22:11, 13 December 2017 (UTC)

Blocking tools and improvements is an important initiative and I thank SPoore (WMF) and her team for coming up with it. Our current heavy focus on spam and paid editing depends very much on finding new and/or improved solutions. However, as far as developers are concerned, the operators of the Community Wishlist have clearly stated in other places that the development and/or maintenance of essential software or extensions is not within their remit. This would leave these excellent suggestions for our blocking/CU systems in limbo, exactly in the same way that those responsible for the Community Wishlist insisted that we list with them these desperately needed upgrades to a related core en.Wiki function, only to be told later that it is not the responsibility of their department.
Before we advance much further with these discussions therefore, it needs to be established if the WMF hierarchy - from the CEO herself down - is aware of these important suggestions and if development time is really going to be accorded to them in 2018. A quick glance at the Phab tickets demonstrates yet again that some of them are already very old and have been discreetly allowed to lapse. (FYI: TonyBallioni, Doc James). Kudpung กุดผึ้ง (talk) 16:22, 14 December 2017 (UTC)
Hello Kudpung กุดผึ้ง, the Anti-Harassment Tools team is part of Community health initiative. This page describes the general focus of the work that the team will do. Anti-harassment_tools.
Improvement to blocking tools is one of the key areas identified as a focus. This was based on the backlog in Phabricator tickets, as well as other discussions on wikis. So, I feel comfortable saying that the Anti-Harassment Tools team will work on improvements to blocking tools in the first few quarters of 2018.
Trevor and I searched to find the backlog of related ideas on Phabricator, Meta, and English Wikipedia. We know that there are a lot of them. :-) While our focus is primarily about addressing harassment and making the community more welcoming, we know that the blocking tools are used for a variety of reasons. So, we are thinking about how changes in the tools or the creation of the new tools might overall improve the effectiveness of the blocking tool for all users. This is important. We need to prevent our changes from breaking an important existing workflow. So, in that way we are looking broader than uses for mitigating harassment. But development related to anti-harassment is our main focus.
The main point of this consultation is getting feedback about prioritization from the community about which improvements or new features related to the blocking tool will be the most effective. Once a week or so, Trevor or I will summarize the discussions happening here, on Meta, and other wikis. Then we can decide on next steps, and we can share the expected timeline.
I hope this response gives you some reassurance that this consultation is really going to result in work focused on improving the blocking tools, although it is too soon to decide which is the priority. :-) SPoore (WMF), Community Advocate, Community health initiative (talk) 23:56, 14 December 2017 (UTC)
Thank you for this Sydney. I will be taking a detailed look at the suggestions and I'm sure other concerned members of the community will chime in too. (If I remember rightly, I think we met in Italy last year. Correct me if I'm wrong). Kudpung กุดผึ้ง (talk) 02:15, 15 December 2017 (UTC)Reply
Yes, I'm pretty sure that we met at one or more of the Wikimanias. :-) SPoore (WMF), Community Advocate, Community health initiative (talk) 15:34, 15 December 2017 (UTC)

Add a "Redact username from edits and lists" tickbox to Special:Block just like Oversighters have

This would make life SO MUCH EASIER compared to having to individually hide logs of purely disruptive or grossly offending usernames that need redaction, but don't qualify for suppression. It would keep our logs and pages clean, and eliminate the possibility that we miss something and a log that should have been redacted is left for public view. Thoughts? ~Oshwah~(talk) (contribs) 22:55, 18 December 2017 (UTC)Reply

I have a few thoughts on this!
  • Our current policy on username suppression makes no sense. "Ajraddatz is a meanypants" qualifies for suppression under criterion 4 of the OS policy (albeit weakly), but would only qualify for revision deletion if placed on a page.
  • I would support splitting hideuser into two rights - revdeluser which can apply revdel-level hiding of usernames, and suppressuser which can apply oversight-level hiding of usernames. The two would be used in the same circumstances as revision deletion and revision suppression are currently used.
  • Global username management should be done from a global level - local oversighters retaining the hideuser right doesn't make sense from any perspective other than "but muh local project autonomy".
  • However, a username like "Ajraddatz is stinky LOL" hardly needs to be hidden everywhere. A disruptive username such as that could be reasonably hidden on just the wiki it was vandalizing on. Usernames like "Ajraddatz's phone number is 911-911-911" are obviously more serious, and should be suppressed everywhere.
All of this leads to my primary suggestion here: Create revdeluser and suppressuser, add revdeluser to the local admin toolkit (and add its functionality to CentralAuth so stewards can globally hide if needed), and add suppressuser to the steward toolkit only. This sort of change would require some serious community-side policy work, but some of the technical details could be set into motion by the team watching here. Without any policy work, the revdeluser/suppressuser distinction could still be made, with revdeluser given to admins and suppressuser given to local oversighters as well. -- Ajraddatz (talk) 23:53, 18 December 2017 (UTC)Reply
Ajraddatz took the words out of my mouth. This is exactly how I imagine the idea being implemented in MediaWiki - adding a new right to the software ('redactuser' would be the proper name) and implementing it on Special:Block, only showing the suppressuser tickbox if you're an Oversighter. ~Oshwah~(talk) (contribs) 09:45, 28 December 2017 (UTC)
Hello Oshwah, Ajraddatz and Jo-Jo Eumerus, I'll make sure that this discussion gets captured and included with ways that we could improve existing blocking tools. SPoore (WMF), Community Advocate, Community health initiative (talk) 03:36, 23 December 2017 (UTC)Reply

Congratulations

Summary of feedback received to date, December 22, 2017

Hello and happy holidays!

I’ve read over all the feedback and comments we’ve received to date on Meta Wiki and English Wikipedia, as well as by private email, and summarized it in depth on this Meta talk archive page.

Here is an abridged summary of common themes and requests:

  • Anything our team (the Wikimedia Foundation’s Anti-Harassment Tools team) builds will be reviewed by the WMF’s Legal department to ensure that it complies with our privacy policy. We will also use their guidance to decide if certain tools should be restricted to CheckUsers or made available to all admins.
  • User agent and device blocking, if OK’d by Legal, would deter some block evasion but won’t be perfect.
  • There is a lot of energy around using email addresses as a unique identifiable piece of information to either allow good-faith contributors to register and edit inside an IP range, or to cause further hurdles for sockpuppets. Again, it wouldn’t be perfect but could be a minor deterrent.
  • There was support for proactively globally blocking open proxies.
  • Some users expressed interest in improvements to Twinkle or Huggle.
  • There is a lot of support for building per-page blocks and per-category blocks. Many wikis attempt to enforce this socially but the software could do the heavy lifting.
  • There has been lengthy discussion and concern that blocks are often made inconsistently for identical policy infractions. The Special:Block interface could suggest block length for common policy infractions (either based on community-decided policy or on machine-learning recommendations about which block lengths are effective for a combination of the users’ edits and the policy they’ve violated.) This would reduce the workload on admins and standardize block lengths.
  • Any blocking tools we build will only be effective if wiki communities have fair, understandable, enforceable policies to use them. Likewise, what works for one wiki might not work for all wikis. As such, our team will attempt to build any new features as opt-in for different wikis, depending on what is prioritized and how it is built.
  • We will aim to keep our solutions simple and to avoid over-complicating these problems.
  • Full summary can be found here.

The Wikimedia Foundation is on holiday leave from end-of-day today until January 2 so we will not be able to respond immediately but we encourage continual discussion!

Thank you everyone who’s participated so far! — Trevor Bolliger, WMF Product Manager (t) 20:20, 22 December 2017 (UTC)Reply

TBolliger (WMF) I think that "There has been lengthy discussion and concern that blocks are often made inconsistently for identical policy infractions" is a social and not a technical issue; a number of editors commenting on a block discussion and admins un/blocking a given user factor in the history of the user (e.g. a user with a long history of disruption will get a harsher block than one with a history of productive editing) when deciding what to do, and what is one user's "identical policy infraction" may not be another user's "identical policy infraction". Jo-Jo Eumerus (talk, contributions) 20:27, 22 December 2017 (UTC)
I'd also be interested in knowing about "machine-learning recommendations about which block lengths are effective for a combination of the users’ edits and the policy they’ve violated". Jo-Jo Eumerus (talk, contributions) 20:27, 22 December 2017 (UTC)
@Jo-Jo Eumerus: There was a fair amount of discussion on this topic on the meta talk page. Yes, I agree there is policy/politics around block length, neither of which software can solve, but the software could potentially alleviate some of the human judgement burden. For example, rather than selecting a block length, the admin could just select the policy infraction (and maybe severity?) and the system would suggest a recommended block length. This could be amplified with machine learning to know which block lengths work best (in terms of limiting disruption and retaining constructive contributors) for which policy infractions and types of users. There are ethical questions around blindly using AI to make such decisions, but even if the software can produce some guidance for the admin based on the users' edits and similar cases the admin won't have to make a decision from their intuition every time. (Note that I'm using "admin" as a general term here, not specifically just one individual permissioned user for all blocks.) — Trevor Bolliger, WMF Product Manager (t) 17:22, 23 December 2017 (UTC)Reply

Addendum to summary of feedback received to date, January 18, 2018

Hello everyone! Here is an addendum to feedback received on the Meta Wiki talk page from December 22 to today, January 18. The original summary of feedback can be found here.

Problem 1. Username or IP address blocks are easy to evade by sophisticated users
  • More people added support for blocking by email address, page or category blocking (including cascading sub-pages), and User Agent blocks.
  • Use typing patterns (e.g. the rhythm, speed, and other characteristics of how a user inputs text in the editing surface) to identify sock-puppeteers.
  • Use network speed as an identifiable piece of information.
  • Use editing patterns (time of day, edit session length, categories of pages edited) to build improved CheckUser tools
  • Finish building a tool that allows viewing all the contributions within an IP range. (phab:T145912)
  • Extend Nuke to allow one-click reverting of all edits made within an IP range
Problem 2. Aggressive blocks can accidentally prevent innocent good-faith bystanders from editing
  • Perform analysis on IP-hoppers to build a model of how users (intentionally or unintentionally) change their IP address to build more tactical blocking tools.
  • We should limit cookie blocks to 1 year, instead of indefinitely.
Problem 3. Full-site blocks are not always the appropriate response to some situations
  • Build a user masking system to obfuscate or ‘hide’ users from each other on wiki (e.g. User:Apples should not be able to see actions by User:Bananas (edits, talk page messages, etc.) or interact with them on any article or talk page).
  • Create a credit-based system instead of a blocking system.
Problem 4. The tools to set, monitor, and manage blocks have opportunities for productivity improvement
  • Mobile block notices are abysmal (phab:T165535)
  • Allow admins to ‘pause’ a block so the user can participate in on-wiki dispute resolution discussions without being jerked-around and blocked several times. (This could also be addressed by creating a ‘pages allowed to edit while blocked’ whitelist.)
  • Add a date-picker to Special:Block to make it easier to set an end date and time for a block, taking into account time zones.
  • Make it easier for admins to hide harassing usernames during the blocking process
General
  • In the long run, we should devise a system that automatically sets (or strongly recommends to a human administrator) an appropriate block length and block method (e.g. IP, email, User Agent, etc.).
  • Some users raised their concerns about giving admins more tools which could potentially affect the balance of power between non-admins and admins. The discussion also proposed limiting the abilities of administrators, allowing only bureaucrats or a new group of trained users to set blocks, and removing indefinite blocks, range blocks, and cookie blocks from the software.
  • We should keep in mind that perfection is not required. Our new tools don’t need to be watertight, just better than they are today.
Next steps

Some people are still discovering this discussion — please continue to share new ideas or comment on existing topics!

In early February we will want to narrow the list of suggestions to a few of the most promising ideas to pursue. Sydney and I will ask for more feedback on a few ideas our software development team and Legal department think are the most plausible. We’ll be sure to keep everything documented, both on Meta Wiki and on Phabricator, for this initiative and future reference.

Thanks! — Trevor Bolliger, WMF Product Manager (t) 20:10, 18 January 2018 (UTC)Reply

Cookie blocks expire after 24 hours or when the initial block expires: mw:Manual:$wgCookieSetOnAutoblock. NinjaRobotPirate (talk) 01:40, 19 January 2018 (UTC)Reply
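
The rule described in the comment above (a cookie block lasts 24 hours or until the underlying block expires, whichever comes first) can be written down in a few lines. This is a sketch of the rule as stated in that comment, not MediaWiki's own code; the function name and the use of `None` for an indefinite block are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_COOKIE_LIFETIME = timedelta(hours=24)


def cookie_expiry(now: datetime, block_expiry: Optional[datetime]) -> datetime:
    """Whichever comes first: 24 hours from now, or the block's own expiry.

    block_expiry=None stands for an indefinite block.
    """
    cap = now + MAX_COOKIE_LIFETIME
    if block_expiry is None:
        return cap
    return min(cap, block_expiry)


now = datetime(2018, 1, 19, 1, 40, tzinfo=timezone.utc)
week_long_block = now + timedelta(days=7)
print(cookie_expiry(now, week_long_block))  # capped at 24 hours after `now`
```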

Of this shortlist of 6 features, help us pick 2 to build

Hello everybody! Over the past weeks our team took a look at all 58 suggestions that came out of this discussion. We discussed each one and ranked them on four criteria — how much demand is behind this idea? what is the potential impact? how technically complex is this to build? is this in line with Wikimedia’s mission? (We're also going to have the Wikimedia Foundation Legal department weigh in with their thoughts, I'll share more when that's ready.)

You can see our ranking and notes at this table, but here are the top ideas that we think will best address the four problems we want to solve:

Problem: Username or IP address blocks are easy to evade by sophisticated users

  • Project 1 - Block by combination of hashed identifiable information (e.g. user agent, screen resolution, etc.) in addition to IP. With this project, we would create a browser fingerprint with some specific identifiable pieces of data about the user's computer and store it as a hash. Admins could then set an IP range block that also includes a match for this fingerprint, but would not be able to see the hashed information.
  • Project 2 - Block by user agent in addition to IP. This project is similar to the first, but would store the user's user agent which would be visible to CheckUsers.
  • Project 3 - Surface hashed identifiable data as a percentage match to CheckUsers. With this project, we would create the same browser fingerprint as outlined in the first bullet item, but would not set blocks by the hash; instead it would be displayed to CheckUsers as a percentage match in a tool to compare 2+ usernames (e.g. "User:A and User:B have a 93% match for using the same computer.")
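
To make Projects 1 and 3 more concrete, here is a minimal sketch of how a hashed fingerprint might be combined with an IP range check and compared between two users. Everything named here is hypothetical: the attribute names, the secret key, and in particular the choice to hash each attribute separately. The project descriptions above speak of a single stored hash; a single hash supports an exact-match block but not a percentage, so the per-attribute layout is an assumption made to show both ideas side by side.

```python
import hashlib
import hmac
import ipaddress

SECRET_KEY = b"replace-with-a-real-secret"  # assumption: kept server-side


def hash_attribute(name: str, value: str, key: bytes = SECRET_KEY) -> str:
    """Keyed hash of one attribute, so raw values are never stored or shown."""
    message = f"{name}={value}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def fingerprint(attributes: dict[str, str]) -> dict[str, str]:
    """Per-attribute hashes (hypothetical attribute names such as 'user_agent')."""
    return {name: hash_attribute(name, value) for name, value in attributes.items()}


def is_blocked(ip: str, fp: dict[str, str],
               blocked_range: str, blocked_fp: dict[str, str]) -> bool:
    """Project 1: block only when the IP is in the range AND the fingerprint matches."""
    in_range = ipaddress.ip_address(ip) in ipaddress.ip_network(blocked_range)
    return in_range and fp == blocked_fp


def percent_match(fp_a: dict[str, str], fp_b: dict[str, str]) -> float:
    """Project 3: share of attributes whose hashes agree, as a percentage."""
    names = set(fp_a) | set(fp_b)
    if not names:
        return 0.0
    same = sum(1 for n in names if n in fp_a and fp_a.get(n) == fp_b.get(n))
    return 100.0 * same / len(names)
```

With a range block of this kind, someone else editing from the same IP range on a different browser setup would not match the stored fingerprint and so would not be caught, which is the collateral-damage reduction discussed earlier on this page.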

Problem: Aggressive blocks can accidentally prevent innocent good-faith bystanders from editing

  • Project 4 - Drop a 'blocked' cookie on anonymous blocks. This already exists for logged-in users, but we will need to look into the European Union's laws regarding using cookies on logged-out visitors.

Problem: Full-site blocks are not always the appropriate response to some situations

  • Project 5 - Block a user from uploading files and/or creating new pages and/or editing all pages in a namespace and/or editing all pages within a category. With this project, we will provide a few specific ways to block users from specific pages/areas within a wiki, controlled via Special:Block and logged similarly to Special:Log/block.
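
As a rough picture of what Project 5 implies for the software, the sketch below models a partial block as a record of restricted areas plus a check run before each action. The class names, fields, and action kinds are all hypothetical and are not taken from MediaWiki; namespace 10 in the usage example is the Template namespace.

```python
from dataclasses import dataclass, field


@dataclass
class PartialBlock:
    """Hypothetical record of what one user is blocked from doing."""
    no_upload: bool = False
    no_page_creation: bool = False
    namespaces: set[int] = field(default_factory=set)   # namespace IDs
    categories: set[str] = field(default_factory=set)   # category names
    pages: set[str] = field(default_factory=set)        # individual page titles


@dataclass
class Action:
    """Hypothetical description of an attempted action."""
    kind: str                       # "edit", "create", or "upload"
    title: str = ""
    namespace: int = 0
    categories: set[str] = field(default_factory=set)


def is_allowed(block: PartialBlock, action: Action) -> bool:
    """Deny only what the block names; everything else stays open."""
    if action.kind == "upload":
        return not block.no_upload
    if action.kind == "create" and block.no_page_creation:
        return False
    if action.namespace in block.namespaces:
        return False
    if action.title in block.pages:
        return False
    # Blocked if the page sits in any category named by the block.
    return not (action.categories & block.categories)


block = PartialBlock(no_upload=True, namespaces={10}, categories={"Example topic"})
print(is_allowed(block, Action(kind="edit", title="Unrelated article")))
```

A site-wide block remains a separate, stronger tool; this check only narrows what a user can touch.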

Problem: The tools to set, monitor, and manage blocks have opportunities for productivity improvement

  • Project 6 - Special:Block could suggest block length for common policy infractions. With this project, admins would have to first select a block reason on Special:Block which would then automatically select the block length.
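
Project 6 amounts to a lookup from block reason to a pre-selected length. The sketch below shows that lookup with placeholder reasons and durations, which each wiki would define by its own policy. The `override` parameter is an assumption added here to reflect the concern raised on this page that admins need to keep their own judgement; it is not part of the project description.

```python
from datetime import timedelta
from typing import Optional

# Hypothetical reasons and durations; None stands for an indefinite block.
SUGGESTED_LENGTHS: dict[str, Optional[timedelta]] = {
    "vandalism": timedelta(hours=72),
    "edit warring": timedelta(hours=24),
    "spam-only account": None,
}


def select_block_length(reason: str,
                        override: Optional[timedelta] = None) -> Optional[timedelta]:
    """Pre-select a length from the reason; an explicit admin override always wins."""
    if override is not None:
        return override
    if reason not in SUGGESTED_LENGTHS:
        raise KeyError(f"no suggestion for reason: {reason!r}")
    return SUGGESTED_LENGTHS[reason]


print(select_block_length("edit warring"))
print(select_block_length("edit warring", override=timedelta(days=7)))
```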

Our team only has time to build two of these features, so we need your help in determining which of these we should proceed to investigate. Of these six ideas, which holds the most promise and why?

Thank you! — Trevor Bolliger, WMF Product Manager (t) 00:00, 16 February 2018 (UTC)Reply

Projects 1/2/3 and 5 should be prioritized.
  • Resolving the first problem will save many volunteer hours by enabling admins/checkusers to more effectively block those engaged in long-term abuse and spammy behaviour. Under the current technical situation, it is very easy for users to evade blocks. Fixing this problem would also be useful for the rest of the Wikimedia wikis (and for stewards globally).
  • Project 5 will allow some behavioural issues to be addressed with a much finer tool than a site-wide block. This can be useful for editor retention, and keeping some difficult people engaged while removing them from problematic areas. Fixing this problem would also be useful elsewhere on Wikimedia.
  • Project 4 is important, but needs more work than just anon-only cookie blocks. As someone very involved with responding to long-term abuse, I definitely understand the need to prevent collateral damage. But I don't feel that this problem is as critical as the above two.
  • Project 6 is questionable, and would not be transferable to other Wikimedia wikis. Project 5 would help a lot with this already, and most of the other work needed here is community-side - setting clearer policies around blocking. -- Ajraddatz (talk) 00:58, 16 February 2018 (UTC)Reply
@Ajraddatz: Oops! I relabeled the list; I think the discussion should be about the 6 projects, not the 4 problems. Do you mind if I edit your comment to reflect my edit? — Trevor Bolliger, WMF Product Manager (t) 01:13, 16 February 2018 (UTC)
Go for it :-) -- Ajraddatz (talk) 01:14, 16 February 2018 (UTC)
Done. I have responses to your comments, but I'll let others chime in so I don't dominate the discussion. — Trevor Bolliger, WMF Product Manager (t) 02:08, 16 February 2018 (UTC)
I'll expand a bit more on my answer regarding projects 1-3. Out of these, project 1 should be prioritized, followed by project 2. User agent blocks are of questionable overall value, but could be useful sometimes. This would also help to prevent collateral damage (the goal of project 4). Project 3 is not worth doing. From the perspective of using the checkuser tool, seeing a percent match isn't very useful. Each case is different, and the technical data needs to be interpreted in its full context. -- Ajraddatz (talk) 00:06, 17 February 2018 (UTC)
  • I like 1 and/or 3 if they are technically workable; any way to make blocks stick to a person rather than an IP or IP range is preferable. As I noted originally, our blocking system is optimized for 2005, when most people still accessed the internet via a desktop computer tied to a wall with an ethernet cable; IP blocking was essentially person blocking back then. Today, most people access the internet using portable devices which access through a wide array of networks; and those networks are more likely to assign IP addresses dynamically even if the same device is used on the same network. IP addresses mean nothing to someone editing from a cellphone or tablet connected through a cellphone network or using coffee shop wifis. IP address blocking only really works to stop lazy vandals for a few hours; determined trolls and people with axes to grind aren't stopped by this. They don't even need any technical knowledge. The way their device connects to the internet means they aren't on the same IP from day to day. 5 is also good; any way to make blocks granular would keep around good editors who have hot-button areas they can't work with; it would give us a step between soft-bans and permablocking. 6 sounds like a bad idea all around; admins need nuance in choosing block lengths appropriate to infractions, and no automated system would be useful here. It removes the thing that makes good admins good admins, that is the ability to assess a unique situation and devise an appropriate response. --Jayron32 19:45, 16 February 2018 (UTC)
  • 1/2/3 are probably the most beneficial, but they could still be evaded by someone with high technical skills. --Rschen7754 19:49, 16 February 2018 (UTC)
    Anything can be evaded by anyone with the will. People can kick down the back door to your house and take anything they want when you aren't home. It doesn't mean you don't lock the door. Sensible security is still useful, even if it can't keep everyone out. --Jayron32 19:51, 16 February 2018 (UTC)
  • Correct me if I'm wrong, but Project 1 is essentially fingerprinting a la EFF's Panopticlick, right? If so, I think that's the wrong way to go. The message of that project was that it is easy to follow and identify most individuals beyond what the average internet user is familiar or comfortable with (IP, cookies, ads, etc.). The language of the proposal implies that this information would be collected before a block is given, but I suppose it'd have to be if everyone's hash is to be compared against it. Given that en.Wiki is the 5th most visited site in the world, would that not mean collecting and hashing data for a good portion of the English speaking world? I disagree heartily that it's a good Wikimedia ethos fit. Project 2 is a good compromise — balancing the need to strengthen blocking while not becoming what the EFF fears. Project 4 sounds great — well within what folks expect when using the internet, and no adverse collection/treatment for those behaving themselves — so looking into the legality for EU purposes seems worth it. Project 5, particularly for namespaces, would be nice; I can foresee abuse, but given that such things are already implemented ad hoc here (e.g., via ArbCom) it should only help. I agree that Project 6 isn't worth much here; look no further than our devotion to the 31-hour block. Good blocks require adapting to the situation, not rigorous rule-following. ~ Amory (utc) 20:52, 16 February 2018 (UTC)Reply
  • Using the information the browser sends in HTTP headers is unobjectionable; Wikimedia already gets that information automatically and the Privacy Policy states that it is collected. See here. Some of the other information is also already collected by Wikimedia (e.g. whether JavaScript is disabled). I'm of the opinion that if the WMF already extracts and logs information about your browser for some other reason, then it is OK to reuse it for abuse prevention purposes. This data would then be dumped into the checkuser table, which expires after 90 days. If new code has to be written to collect additional data points, then that would be crossing the boundary. MER-C 21:44, 16 February 2018 (UTC)
  • I know WMF receives plenty of this already; the whole point of Panopticlick is that this info is freely given. To me, Project 1 implies using more than what the privacy policy lists. There is a large difference between optimizing based on browser, OS, mobile, and referral and storing, for every user, a hash that could potentially uniquely identify every reader. Just because the WMF is able to do something doesn't mean they should. As I said, Project 2 is sufficient. ~ Amory (utc) 03:02, 17 February 2018 (UTC)
  • 1+2+3 and 5 please. I would expose the full HTTP header information to checkusers with hashes over (secret + IP information), (secret + user agent) and (secret + rest of the HTTP headers) available to admins. Blocking someone based on CU evidence is about IP address + geolocation + user agent, and I've made sock blocks where the CU publicly stated that the geolocation and user agent (was common) was the same (without stating what they were, of course). I would like to be able to make similar blocks under the new system, but one has to be aware of users using the latest version of Chrome and Windows 10 (very common combination) on a busy IP range. I'm not sure whether stuff that requires JavaScript to collect should be available to CUs (but am leaning towards yes, CUs must be technically literate and something like 93% similar is rather meaningless) -- after all, IP (range)s are one of the larger sources of distinctiveness. Either way, JS-only information should be given a separate hash. MER-C 21:52, 16 February 2018 (UTC)
  • Just because the WMF is able to do something doesn't mean they should. — I agree, which is why we're asking for help in our prioritization process. We're planning to build these features within the confines of our existing Privacy Policy. I'm meeting with some members of the WMF's legal team tomorrow to understand where these boundaries lie on these six projects, and I'll share notes immediately after. — Trevor Bolliger, WMF Product Manager (t) 18:44, 21 February 2018 (UTC)Reply
  • Of all of these, 5 is the one I see as most obviously fitting identified use cases; it would also benefit all admins. 3 would provide the much longed-for magic pixie dust, so I would support that too if CUs want it. Guy (Help!) 23:48, 16 February 2018 (UTC)
  • I'd prefer you spent all available effort on 5, but I also quite like number 4. Please help me understand something. For me the biggest problem is users hopping all over dynamic ranges and various proxies - numbers 1-3 are intended to address the fact that "IP address blocks are easy to evade", so I can't see how it would make much sense to include the IP address in the hash. You could include the range but I'm sure we'd just block the range instead, and in most cases that won't help either. You could leave out the IP address I suppose, but that raises other questions. So numbers 1-3 seem less about making blocks effective, and more about reducing collateral. Number 6 seems like not a good idea for various reasons. -- zzuuzz (talk) 10:35, 17 February 2018 (UTC)Reply
  • Great question, and yes these are mostly about minimizing collateral damage of wide IP range blocks. (Problems 1 and 2 are near-identically related.) Without some form of location/geo information in the block, setting a device block is likely too dangerous given the lack of diversity in devices available on the market and the small amount of data available to use within the existing Privacy Policy. — Trevor Bolliger, WMF Product Manager (t) 18:44, 21 February 2018 (UTC)Reply
  • Comment: For those who are interested in what Wikipedia can do to identify different usernames/IPs as having a high probability of being the same computer, see [ https://panopticlick.eff.org/ ]. Technical details are at [ https://panopticlick.eff.org/about ]. --Guy Macon (talk) 23:09, 17 February 2018 (UTC)Reply
  • I've never been a fan of the clamor for page specific blocks (we already have topic bans, and if you need the technical prohibitions from editing one page, you likely need the technical prohibitions from editing the site.) Of these, 1-3 should be the priority with 1 and 3 being top of those, followed closely by 2. TonyBallioni (talk) 05:29, 19 February 2018 (UTC)Reply
    Not "likely" because sometimes page specific bans are used for editors who behave well in places A, B and C and thus "prohibitions on editing the site" including blocks aren't reasonable but can't behave in D and thus need to be restricted from there. Jo-Jo Eumerus (talk, contributions) 10:17, 19 February 2018 (UTC)Reply
    Then we use a topic ban or a page ban that is less gameable than a technical restriction and won't sully a block log. I know a bunch of people want it, I just think it will be an utter mess in implementation. TonyBallioni (talk) 16:29, 19 February 2018 (UTC)Reply
    No, because some of these users won't comply and then we are back to blocking and thus to the first issue. The reason page specific blocking is proposed is because neither of the alternatives (blocking, not blocking, bans) work in some cases, typically with editors who have gained a fair amount of support thanks to their work on A, B and C but their contributions to D aren't acceptable and they don't want to stop. Jo-Jo Eumerus (talk, contributions) 16:43, 19 February 2018 (UTC)Reply
    Which will get to technical concerns about how these blocks are implemented (do we allow category blocks? What if something is miscategorized and someone violates a topic ban or what should have been a topic ban and then WikiLawyers saying it wasn't covered by a technical page specific block when it should have been?) Project 5 will only create more hassle than it is worth and will be a net negative, IMO, but at best will have zero positive impact and be a dud (roughly like Pending Changes has had virtually no impact). It's a nice way to feel good about ourselves for not doing full site blocks for otherwise productive editors, but it won't solve anything, and I think will create more headaches. TonyBallioni (talk) 18:17, 25 February 2018 (UTC)
    If someone is blocked from editing a page within a category, they can't edit pages in that category. You can't violate a block except by socking and that's not the issue here. A miscategorization would need a third party to edit, and we do have mechanisms for dealing with proxying as well. Jo-Jo Eumerus (talk, contributions) 09:27, 27 February 2018 (UTC)Reply
  • Some kind of solution to problem 1 would be my priority, especially if more information were returned to checkusers. Projects 4 and 5 sound nice, but I'm tired of repeatedly blocking the same editors. Twinkle already does a simplistic version of Project 6. NinjaRobotPirate (talk) 06:26, 27 February 2018 (UTC)Reply
  • Proposal (2 or 3) and (5)--~ Winged BladesGodric 15:46, 27 February 2018 (UTC)Reply
  • Project 5 seems the most obviously useful one to me. I don't fully understand the more technical proposals (1-4), as you don't have to be smart to be an admin here, but aren't CUs already able to more or less identify the UA? The CUs I talk to sound like they can. And pile-on opposition to 6. It's not a bad thing that admins have to activate higher brain centers when deciding on block length. Bishonen | talk 16:44, 27 February 2018 (UTC).
    Yes. This is about using additional information (e.g. HTTP headers, screen size) that is already collected by the WMF and used for other purposes and using those to identify sockpuppeteers and/or getting blocks to stick to them and them only. MER-C 21:23, 27 February 2018 (UTC)
    Thank you. In that case I'm in favour of (2 or 3) as well. If I've understood them. Bishonen | talk 21:32, 27 February 2018 (UTC).
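
For readers trying to picture the separate hashes suggested earlier in this thread, over (secret + IP information), (secret + user agent) and (secret + rest of the HTTP headers), here is a minimal sketch. It is an assumption-laden illustration, not anything the WMF has built: the function names are hypothetical, and it uses HMAC-SHA256 as the keyed hash rather than literally concatenating the secret with the value.

```python
import hashlib
import hmac


def keyed_hash(secret: bytes, value: str) -> str:
    """Keyed hash of one value; HMAC-SHA256 is this sketch's choice, not a stated design."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()


def request_hashes(secret: bytes, ip_info: str, headers: dict) -> dict:
    """Return three independent hashes so each signal can be compared on its own."""
    # Header names are case-insensitive in HTTP, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    user_agent = normalized.pop("user-agent", "")
    # Sort the remaining headers so the same set always hashes to the same value.
    rest = "\n".join(f"{name}:{value}" for name, value in sorted(normalized.items()))
    return {
        "ip": keyed_hash(secret, ip_info),
        "user_agent": keyed_hash(secret, user_agent),
        "other_headers": keyed_hash(secret, rest),
    }
```

Two requests with the same user agent produce the same `user_agent` hash without the user agent itself being shown, which is the property the suggestion relies on. It also shows the stated caveat: a very common browser and OS combination on a busy range will match many unrelated people.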

Update after meeting with WMF Legal

This morning a few folks from WMF’s Legal department gave some feedback and guidance on these six projects. In short, there are no insurmountable blockers to accomplishing any of these projects. All six have their own nuanced concerns and will require deeper discussions, but each can be built.

The first three projects will require further conversations about exactly what data is captured, how it is stored, how long it is stored, and the format in which it is displayed to users. We will also want to consider how to provide training and other resources to users to understand how these blocks can affect users. Dropping cookies on anonymous users is possible, but will require careful thought on how to update the cookie policy while avoiding WP:BEANS. There were no privacy concerns for the fifth and sixth projects, so long as each community sets them in accordance with their own policies.

Feedback so far seems to favor a combination of Projects 1/2/3 and 5, with brewing support for 4. Project 6 seems to be DOA. Please keep the comments coming! In the coming weeks our team’s developers will begin technical investigations into the projects that have the most support. — Trevor Bolliger, WMF Product Manager (t) 19:14, 22 February 2018 (UTC)

Status update: Keep adding comments, AHT team is reading posts

Hello everyone, keep on adding your thoughts about the shortlist. Trevor Bolliger and I are monitoring the discussion. This week the Anti-Harassment Tools team developers will begin investigating the top projects to estimate the time involved to complete them and other technical aspects of them. We'll give an update about this when we have more information. SPoore (WMF), Community Advocate, Community health initiative (talk) 19:07, 27 February 2018 (UTC)

What the WMF’s Anti-Harassment Tools team will build in 2018

Hi everybody, thank you for the input over the past months. We think this has been an incredibly worthwhile process. Given all the participation on this talk page, our discussion with our legal department, and our preliminary technical analysis, we have decided to investigate two projects now, build two small changes next, and follow later with a third project.

We will investigate and decide between:

  • Project 1 - Block by combination of hashed identifiable information (e.g. user agent, screen resolution, etc.) in addition to IP range. We are still defining what “hashed identifiable information” means in our technical investigation, which can be tracked at phab:T188160. We will also need to decide how this type of block is set on Special:Block (likely an optional checkbox) and how this type of block is reflected in block logs.
  • Project 4 - Drop a 'blocked' cookie on anonymous blocks. The investigation can be tracked at phab:T188161.
  • If these projects are deemed technically too risky, we will pursue Project 2 - Block by user agent in addition to IP. User agent data is already available to CheckUsers.
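
As a rough illustration of what "block by user agent in addition to IP" could mean in practice, the sketch below checks a request against an IP range and, optionally, a user agent. The names `Block` and `is_blocked` are hypothetical, and exact string matching on the user agent is an assumption made for simplicity; the actual design is still under investigation.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Block:
    """Hypothetical block record: an IP range plus an optional user agent."""
    ip_range: str                      # CIDR notation, e.g. "198.51.100.0/24"
    user_agent: Optional[str] = None   # None means every user agent in the range


def is_blocked(block: Block, ip: str, user_agent: str) -> bool:
    """Return True only if the request falls inside the range and matches the user agent."""
    network = ipaddress.ip_network(block.ip_range, strict=False)
    if ipaddress.ip_address(ip) not in network:
        return False
    # With a user agent set, other devices on the same range are left alone,
    # which is how this kind of block would reduce collateral damage.
    return block.user_agent is None or block.user_agent == user_agent
```

Under these assumptions, a range block with a user agent attached affects fewer uninvolved editors than the same range block without one, at the cost of being evaded by anyone who changes browsers.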

We will also be adding an optional datetime selector to Special:Block (phab:T132220) and will be improving the display of block notices on mobile devices (phab:T165535). In a few months (likely around May 2018) we will pursue some form of Project 5 - Block a user from uploading files and/or creating new pages and/or editing all pages in a namespace and/or editing all pages within a category.

Because we value access to Wikipedia (in high risk regions, for users not comfortable with email, etc.) and because of evolving technical infrastructure of the internet (e.g. IPs, browsers, devices) we will need to continually evolve our blocking tools. This article page and the organized user blocking column on Phabricator will be useful in future discussions and decisions. Please continue to add to them or discuss on this talk page as new ideas or problems emerge.

Again, thank you. We’ll post updates here as our work progresses.

– The WMF’s Anti-Harassment Team (posted by Trevor Bolliger, WMF Product Manager (t) 23:37, 8 March 2018 (UTC))

Thanks for your efforts, they are very much appreciated. That reminded me to file phab:T189391 (notify me when this block is about to expire), but that requires some backend work on Echo first. MER-C 21:09, 10 March 2018 (UTC)
Update, March 26: We are nearly code complete with adding a DateTime selector to Special:Block (phab:T132220) and we hope to release this to all wikis by mid-April. We are also making good progress on IP cookie blocking (phab:T188161) which I've submitted to WMF's Legal department to review any privacy concerns and to update our cookie policy.
We've decided to not proceed with building Project 1 — hashed fingerprint blocking — given it would be too error prone with the data we currently gather and any additional data would likely be too unreliable to justify updating our Privacy Policy. We are now proceeding with giving CheckUsers the ability to block by user agent (phab:T100070.) We have a new project page for this and encourage your input!
Thank you. — Trevor Bolliger, WMF Product Manager (t) 23:37, 26 March 2018 (UTC)
Beautiful, thank you for the update and the quick, concerted efforts! ~ Amory (utc) 00:40, 27 March 2018 (UTC)

Software development update, May 3

Hello everyone! I have another status update:

  • We’ve completed work on the datetime selector for Special:Block. It is live on a handful of wikis and we’ll be releasing it sitewide in the coming weeks.
  • We’re nearly done with improving the display of block warnings on mobile. It’s our team’s first mobile project so we’re getting used to tiny code on the tiny screens.
  • We’re in the final stage of anon cookie blocking. It’s tricky to QA so we’re taking our time and putting in our due diligence. Should be out before June.
  • We’re generating some statistics about blocking usage across Wikimedia. (phab:T190328) I’ll post a summary here when the data comes back, it should be interesting!
  • User Agent + IP range blocking for CheckUsers is next on the queue.
  • We’re working on designs for granular blocks (per page, category, namespace, and/or uploading files.) We need your help to design this feature! 🎨 See more details at Wikipedia talk:Community health initiative on English Wikipedia/Per user page, namespace, category, and upload blocking#Help us design this tool! and join the discussion!

Thanks, and see you on the talk page. — Trevor Bolliger, WMF Product Manager (t) 21:50, 3 May 2018 (UTC)