Per Xaosflux I've preemptively started this BRFA. I haven't done any coding but I think this is a good time to discuss implementation details. Concerns that come to mind:
After deployment, we'd need to update the JS page to clearly say it should not be edited directly. We could do a two-way syncing, but I'd prefer not to, just to keep it simple.
I can confirm MusikBot II's account is secure and 2FA is enabled (with some caveats). The bot solution still puts us on the winning end, as there will be fewer int-admin accounts than if we added it to all who manage geonotices.
@MusikAnimal: for the directional sync concerns, a defined "section" (delineated by comments) should be the only area edited - this section should be marked "do not edit directly" - and the bot should only edit within the section. This way if other changes to the page are needed they won't interfere. — xaosfluxTalk 04:25, 24 September 2018 (UTC)
This should work fine, just like her PERMclerking, right? Would be good if there are rush edits, last-minute-changes, etc. ~ Amory(u • t • c) 16:22, 24 September 2018 (UTC)
Yeah, we can definitely reserve a part of the JS page for free editing, much like we do at WP:AWB/CP. — MusikAnimaltalk 16:41, 24 September 2018 (UTC)
I'd like to see some tests over at testwiki that can be used to demonstrate the edits. — xaosfluxTalk 04:25, 24 September 2018 (UTC)
No problem. Though I don't think we need to test Geonotice itself (could be tedious), rather just that the JS was generated properly. — MusikAnimaltalk 16:41, 24 September 2018 (UTC)
Agree, don't need to actually implement the geonotice, just that things work as expected in the namespaces and content types. — xaosfluxTalk 01:21, 25 September 2018 (UTC)
Perhaps make the mock-up json page to demonstrate? — xaosfluxTalk 04:25, 24 September 2018 (UTC)
JS injection shouldn't be possible, unless there are vulnerabilities in Geonotice itself. I would hope it doesn't use eval on the strings. Arbitrary code (e.g. begin: alert('foo!') isn't valid JSON and hence would fail on the initial validation (and the MediaWiki JSON editor won't let you save it, either). We can still validate it ourselves, to be sure. As I said this would be a nice feature. I don't know that I want to validate things like the country, though. We could validate the 'begin'/'end' date format, in particular, but for everything else I think the bot will just look for permissible keys and the data type of the values ('country' is a string, 'corners' is an array of two arrays, each with two integers). — MusikAnimaltalk 16:41, 24 September 2018 (UTC)
For the JSON page, not the bot: we'll also have to move the normal explanation text into an editnotice or regular notice, since comments are stripped on save for pages with the JSON content model. Enterprisey (talk!) 23:32, 24 September 2018 (UTC)
I agree with the quoting change. You may want to specify the number of edits if it's more than one, but I don't know if that's required for attribution. (And it's displayed on the diff page anyway.) Enterprisey (talk!) 06:16, 25 September 2018 (UTC)
Another option there is to put the whole attribution (source, diff, time, user of diff) into the comments of the .json, and only minimal in the edit summary (revid, sourcepage). --Dirk BeetstraTC 13:59, 2 October 2018 (UTC)
I'm using Ruby internal methods to tell if the date is valid. This works for "Invalid date" or "35 January 2018 00:00 UTC" but not for invalid month names as with "15 Foobar 2018 00:00 UTC". Going by some logic I don't understand it chooses some other valid month. I could use regular expressions to ensure the month names are valid, but I want this bot task to work in other languages where I assume they're able to put in localized month names, if not a different format entirely (which Ruby should still be able to validate). Anyway I think this is fine. There were no validations before, after all :)
Validating the country code actually works! It's going off of the ISO 3166 spec, which is what's advertised as the valid codes Geonotice accepts.
Coordinates are validated by ensuring there are two corners, and each with two values (lat and lng), and that the values are floats and not integers or strings.
The keys of each list item are also validated, ensuring they only include "begin", "end", "country", and either "corners" or "text".
I added code to check if they escaped single quotations (as with \'), since Geonotice admins probably are used to doing this. Amazingly, MediaWiki won't even let you save the JSON page if you try to do this, as indeed it is invalid JSON. So turns out no validation is needed for this, or any other JSON syntax errors for that matter. This should mean we don't need to worry about anyone injecting malicious code.
The comment block at the top of the JS page is retained and can be freely edited.
Back to the issue of attribution in the edit summary, I went with Xaosflux's recommendation and am using a combined diff link, piped with the title of the JSON page. I'm not sure it's worth the hassle of adding in comments in the generated JS code directly, but let me know if there are any strong feelings about that.
Let me know if there's anything else I should do, or if we're ready for a trial! — MusikAnimaltalk 04:35, 11 October 2018 (UTC)
MediaWiki won't let you save the page with invalid JSON even if you turn off JS or use the API, right? Because if it does you may want to validate for that case. Enterprisey (talk!) 04:43, 11 October 2018 (UTC)
Luckily it's server side. It shows the error "Invalid content data", even if you have JS turned off. I haven't tested the API yet, but if it does work it's probably a bug in MediaWiki :) — MusikAnimaltalk 16:38, 11 October 2018 (UTC)
But I should clarify, the bot does validate JSON content, but I haven't tested to see if this works because I am unable to create invalid JSON :) At any rate, we would not end up in a situation where an invalid JS object is written to MediaWiki:Gadget-geonotice-list.js, because the core JSON methods that we're using would error out before this happens. — MusikAnimaltalk 20:15, 11 October 2018 (UTC)
This seems fine to me. My main concern (and it is fairly minor) is making sure the bot's role is clear in documentation/notices, and that people will know how to look for errors if something doesn't get updated because validation failed (because there won't be immediate user feedback as there is with the basic MW-side JSON validation). I'm giving this for a two-week trial, pending completion of the editnotice(s) and related pages and granting of the i-admin flag; based on history, that should allow for at least a handful of edits to test with, but feel free to extend if more time is required. Approved for trial (14 days). — Earwigtalk 05:51, 14 October 2018 (UTC)
@The Earwig: will this be trialing on the actual pages or in userspace? Ping me if you need a flag assigned for trialing. — xaosfluxTalk 00:06, 15 October 2018 (UTC)
@Xaosflux: My intention is for a full trial. I saw there were already reasonable tests done in the userspace, so given that MA feels comfortable working with the actual pages now, I'm fine with that too. As for the IA flag, it's not clear to me from the policy whether we can do that here or a request needs to be explicitly made to BN? I would prefer MA post something to BN to be safe, but I suppose one interpretation of the policy would let you grant it immediately without the waiting period. — Earwigtalk 03:22, 15 October 2018 (UTC)
@MusikAnimal: what authentication options do you have configured for this bot account? (e.g. 2FA, BotPasswords, OAuth) — xaosfluxTalk 11:57, 15 October 2018 (UTC)
@Xaosflux: 2FA is enabled. Historically I have not had a good solution for OAuth, but times have changed. I'll try to look into this today. For the record the current consumer for MusikBot II can only edit protected pages, all other admin rights are not permitted. We will use a different consumer here, and all related edits will be tagged with the application name. We could use "GeonoticeSync 1.0" (what I've dubbed the task, and then a version number), or is there a better name? For permissions, I believe the consumer only needs editsiteconfig.
So no need to grant int-admin just yet -- although it should be safe to do so, because we have 2FA enabled and the current consumer can't edit sitewide or user JS/CSS.
The outstanding to-dos:
Create OAuth consumer and rework the bot to use it.
Ping all the current Geonotice admins to make sure they know about the new system, and the new rules (don't escape single quotes, but do for double, etc.).
Grant int-admin to MusikBot II, and enable the task.
I'll ping you when I'm done with steps 1-3, and once given the final okay we'll do 4-5. Sound like a plan? If we have to rollback or abandon the new system, I'll be sure to let everyone know that they can go back to editing MediaWiki:Gadget-geonotice-list.js directly. — MusikAnimaltalk 16:58, 15 October 2018 (UTC)
Sounds fine, let us know when you are ready. — xaosfluxTalk 18:37, 15 October 2018 (UTC)
@Xaosflux and The Earwig: I spoke with Bawolff, a security expert working for the Foundation. It would seem in this case, bot passwords is no less secure than OAuth. OAuth is more about editing on behalf of users, or authorizing users to use some centralized service. This is fantastic news, because I found the library I was going to use is more for webservices (specifically Ruby on Rails or similar), which doesn't apply here. I would have to implement my own client. To illustrate the complexity, have a look at this "simple" example written in PHP.
So if it's alright, I'd like to move forward with the current bot infrastructure. I have gone ahead and set up the JSON config to reflect the current JS config, and filled in the edit notice. I'm going to be out of town this weekend, so I can start the trial early next week if we're ready to move forward (first doing steps 3-5 above). — MusikAnimaltalk 03:09, 19 October 2018 (UTC)
@MusikAnimal: its not "as good" but it is still much better than using standard passwords. Please let us know what BP grants you are including (you don't have to disclose the allowed IP ranges (that you should also use if you can). — xaosfluxTalk 03:15, 19 October 2018 (UTC)
Just an update that I haven't forgotten about this. I would like to resume work soon. Bawolff had another great idea to use the MediaWiki parser API on the "text". This virtually eliminates security concerns, and above all, gets rid of all that HTML. It'll be easier for Geonotice managers to test what the output looks like, and I assume wikitext is more familiar to them than using <a>...</a>, etc. @Xaosflux and The Earwig: Thoughts? — MusikAnimaltalk 23:41, 4 November 2018 (UTC)
Seems OK, the sooner you can get a mock up running outside of the production page the sooner people can start test cases. — xaosfluxTalk 16:03, 5 November 2018 (UTC)
Coordinates are validated by ensuring there are two corners, and each with two values (lat and lng), and that the values are floats and not integers or strings. - I'm not sure why you want to treat integers and floats differently? Ignore this concern if you've already tested the bot with integer coordinates.
Geonotices are a low-volume process and the team of maintainers is small, especially after the IAdmin right got unbundled. I think we should roll out the bot first and then update the documentation as we go along.
I've imported all current geonotices into Wikipedia:Geonotice/list.json (involving a bit of search and replace, and a bit of human legwork) so we can test the bot. If it runs correctly none of the current geonotices should change. Deryck C. 17:52, 7 November 2018 (UTC)
Good point about the coordinates. I don't know why I decided to enforce floats. This has been fixed.
For the record, I've tested that the Geonotice actually shows on testwiki. At this point I'm confident to move forward. Glad to hear you are ready as well!
@Xaosflux: Shall we make a request at BN for the int-admin flag, or can you handle that? — MusikAnimaltalk 22:23, 7 November 2018 (UTC)
@Xaosflux: Deryck says we can go ahead without updating documentation (which makes sense, in case we have to rollback), so yes, I'm ready :) Just a note that I'm not going to turn the bot on until the int-admin rights have been granted, hence don't expect MediaWiki:Gadget-geonotice-list.js to update immediately. — MusikAnimaltalk 02:21, 8 November 2018 (UTC)
@MusikAnimal: int-admin access added for 14 days to support this trial. Any admin should block this bot immediately if it is malfunctioning related to this access. — xaosfluxTalk 02:37, 8 November 2018 (UTC)
@Xaosflux: Thanks! I apologize, I've changed my mind -- I'm going to start it first thing in the morning (my time). No need to extend the int-admin access, the ~12 hour difference is trivial. I'm going to ping everyone at Wikipedia talk:Geonotice#New bot-assisted_system just after the first edit.
If something does go wrong, do we need to block the bot? I know admin/intadmin bots are more sensitive, and I'm even advertising the block button on the userpage. You definitely should revert the edits to the last stable version, but ideally you'd use the "disable task" link (making it anything other than true), that way the other bot task won't be interrupted. That link works, as evidenced by Special:Diff/867621853 after a vandal cleverly disabled the AWB task yesterday. I've upped the protection level on the run pages to prevent this from happening again. — MusikAnimaltalk 03:24, 8 November 2018 (UTC)
@MusikAnimal: - Clarifying:if it is doing things off-task with this access, malfunctions should be handled in the normal escalating manner. — xaosfluxTalk 04:30, 8 November 2018 (UTC)
@Deryck Chan: Good catch, this has been fixed (see testwiki:Special:Diff/364384). I noticed you added "begincomments" and "endcomments" to the JSON config. Currently the only supported key is comments, you have throw everything in there. Is that satisfactory?
I'm otherwise ready to go! The generated JS of the current JSON can viewed at testwiki:Special:PermaLink/364390. It looks right to me, minus the red links since those pages don't exist on testwiki. — MusikAnimaltalk 19:02, 8 November 2018 (UTC)
@MusikAnimal: Thank you so much! Regarding "begincomments" and "endcomments", I was hoping the bot should ignore JSON fields that aren't in the bot specification, allowing admins to use arbitrary field names to leave additional notes about the JSON without changing the actual geonotice. But it seems that you've built the bot to police all fields and make the bot throw up an error if somebody adds unrecognised field names. That's fine with me too. Deryck C. 23:16, 8 November 2018 (UTC)
It seems that we've done enough testing on testwiki - should we try to set the bot to work on en.wp? Deryck C. 23:16, 8 November 2018 (UTC)
We are live! Let me know of any problems. Thanks for your help. — MusikAnimaltalk 17:35, 9 November 2018 (UTC)
@Cyberpower678: Conveniently you made an edit right after the first sync! It took another 15 minutes or so, but this was because the cron was not set up on Toolforge yet (I always do the first run locally). Special:Diff/868053189 was the first true, fully-automated edit, and it looks good :)
@Xaosflux: Eh, it'd be quite tricky as I'm using a library to generate the JS object, and believe it or not the method is called pretty_generate =P Manipulating the result may be error-prone. I realize it's probably difficult for bot approvers to evaluate the JS, but the page geonotice managers will be using, Wikipedia:Geonotice/list.json, is more readable than ever! Hopefully that's OK? — MusikAnimaltalk 18:27, 9 November 2018 (UTC)
I personally prefer the JS formatting style of keeping the contents of each square bracket within the same line. On the other hand, the JSON viewing mode of MediaWiki does make it easier to review content, and the fact that the bot has standardised the presentation of JS on geonotice-list.js also adds to readability. All in all I think we are okay with this change in code formatting. Deryck C. 21:11, 10 November 2018 (UTC)
@Redrose64: I'd like to solicit your input as well. I think Cyberpower678 was mostly joking above (maybe? :), but do you think it would make sense for the bot auto-remove expired notices, say, 24 hours afterwards? — MusikAnimaltalk 05:59, 19 November 2018 (UTC)
Auto-removing 1 week after expiry sounds very sensible to me.
Actually I would like to propose a related functionality: perhaps the bot should skip notices whose start date is more than 1 week in the future? That way we can queue notices without burdening other editors' browsers. Deryck C. 11:35, 19 November 2018 (UTC)
This can be done as well. I want to hear a few other opinions first, though. — MusikAnimaltalk 17:39, 19 November 2018 (UTC)
Seems reasonable to me. If there's something wrong with it, the same editors might not be around/active to notice/fix things, but that's not a change from the original system so there's nothing lost there. It'd make it harder to check if the bot made any errors, since the two pages won't be in sync all the time, but that's not the end of the world nor too hard to manage. ~ Amory(u • t • c) 12:03, 21 November 2018 (UTC)
@The Earwig and MusikAnimal: the int-admin access and the trial are expiring - is this ready to be "trial complete" and reviewed for closure, will you need more time, etc? — xaosfluxTalk 03:54, 21 November 2018 (UTC)
There's also the #Auto-removal feature. Not sure if you want to extend a trial for that, assuming everyone is OK with the idea. — MusikAnimaltalk 04:05, 21 November 2018 (UTC)
Everything seems to have gone swimmingly AFAICS. ~ Amory(u • t • c) 12:06, 21 November 2018 (UTC)
Agree that the trial has been successful. I think there is value in keeping this discussion page open, so we can continue to discuss the possibility of adding auto-remove functionality, which I think is a "small change to improve the operation of a particular task" (the task brief - to sync valid notices from list.json to geonotice-list.js - is unchanged). Deryck C. 16:47, 21 November 2018 (UTC)
@MusikAnimal: OK - so right now your bot's account is sort of broken. The access has expired, but it hasn't come off, and your user groups are stuck and can't be updated. I was going to give this another few days pending a BAG reviewer here, but technically it's borked up. Expect this to break at anytime, if you think its working again (e.g. try adding 'confirmed' to your bot) let me know with a ping please. I haven't opened a phab case on this yet. — xaosfluxTalk 03:19, 22 November 2018 (UTC)
This is... awkward. There are a few recurrent events that I think would want geonotices in the next week or so. But there aren't any open geonotice update requests right now so we can wait a day or two for a BAG member to close this officially.
There's also a wider procedural point to be made: why do we have a BAG procedure where the bot's user-rights expire only 24 hours after the scheduled end time of trial, but bureaucrats aren't supposed to extend the permissions' expiry date while we await formal BAG decision?Deryck C. 12:08, 22 November 2018 (UTC)
@Xaosflux: Can you confirm whether the problem is: (1) you don't feel comfortable with unilaterally extending MusikBot II's IAdmin flag, or (2) Special:Userrights is buggered so bureaucrats can't extend the IAdmin flag even if someone from BAG approved this now? Deryck C. 12:11, 22 November 2018 (UTC)
@Deryck Chan: the "technical problem" is that Special:Userrights isn't (or at least wasn't) working for any changes on this one particular account only. — xaosfluxTalk 12:38, 22 November 2018 (UTC)
@MusikAnimal: and @Deryck Chan: whatever was stuck on the account is fixed now, I've extended i-admin for 4 days pending closure. Regarding the normal "trial process" - generally with BRFA's when a trial is conducted it is expected to end, then allow for review of the activities during the trial (at which time the task is not undertaken). We also usually allow for a wide range of discretion with BAG if they want it to work other ways. — xaosfluxTalk 12:43, 22 November 2018 (UTC)
Great, thanks for the update. Deryck C. 14:20, 22 November 2018 (UTC)
Approved. Thanks to everyone for their work in getting this bot running smoothly. If you'd like to continue the discussion regarding additional minor features within the scope of the task, please feel free to do so below. — Earwigtalk 04:24, 23 November 2018 (UTC)
The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.