Wikipedia:Bots/Requests for approval/Lightbot
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Revoked. In June 2009, ArbCom restricted Lightmouse from performing any automated tasks. They are now considering lifting this restriction, subject to (re)approval of any tasks by BAG. To avoid any confusion, I am marking Lightbot's old approvals as "Revoked". Anomie⚔ 17:38, 13 July 2010 (UTC)
- The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.
Operator: Lightmouse (talk)
Automatic or Manually Assisted: Manually assisted
Programming Language(s): Monobook or AWB
Function Summary: Janitorial edits mainly to units and dates.
Edit period(s) (e.g. Continuous, daily, one time run): Continuous
Already has a bot flag (Y/N): No.
Function Details: Janitorial edits mainly to units and dates. Examples include:
- Changing '|sqm|', '|cum|' and '|knot|' to '|m2|', '|m3|' and '|kn|' when in the convert template (no visible effect to the reader but rationalises the template). Very low false positive rate.
- Fixing damaged date links that damage autoformatting e.g. [[November 5th]] (should be [[November 5]]). Very low false positive rate.
- Fixing dates that are damaged by autoformatting such as date ranges e.g. [[1 May|1]] to [[4 May]] should be simply '1 to 4 May' (to stop autoformatting converting it to "1 to May 4"). Low false positive rate.
- Unlinking date fragments such as links to solitary months ([[February]]), solitary days of the week ([[Tuesday]]), digits ([[16]]). Some false positives possible but I know some of the common ones and will check by eye when doing these.
I have done thousands of script assisted edits of this kind as Lightmouse. Low error rate tasks will be transferred to Lightbot. See contributions of Lightbot.
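The first two tasks in the list above can be illustrated with a minimal Python sketch. This is not Lightbot's actual script (the request names Monobook or AWB and does not show its regex); the names `UNIT_CODES`, `fix_convert_codes` and `fix_ordinal_date_links` are hypothetical, and the sketch assumes convert calls contain no nested templates.

```python
import re

# Unit codes taken from the examples in the request; the mapping name is hypothetical.
UNIT_CODES = {"sqm": "m2", "cum": "m3", "knot": "kn"}

# A whole {{convert|...}} call, assuming no nested templates inside it.
CONVERT_CALL = re.compile(r"\{\{\s*convert\s*\|[^{}]*\}\}", re.IGNORECASE)

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Month-first date links with an ordinal suffix, e.g. [[November 5th]].
ORDINAL_DATE_LINK = re.compile(
    r"\[\[(" + MONTHS + r") (\d{1,2})(?:st|nd|rd|th)\]\]"
)


def fix_convert_codes(wikitext):
    """Replace old unit codes, but only in parameters of {{convert}}."""
    def fix_call(match):
        parts = match.group(0)[2:-2].split("|")
        # Leave the template name alone; map each parameter if it is an old code.
        parts = [parts[0]] + [UNIT_CODES.get(p.strip(), p) for p in parts[1:]]
        return "{{" + "|".join(parts) + "}}"
    return CONVERT_CALL.sub(fix_call, wikitext)


def fix_ordinal_date_links(wikitext):
    """Drop the ordinal suffix so that autoformatting recognises the link."""
    return ORDINAL_DATE_LINK.sub(r"[[\1 \2]]", wikitext)


print(fix_convert_codes("{{convert|100|sqm|acre}} of 5 sqm plots"))
print(fix_ordinal_date_links("[[November 5th]]"))
```

Restricting the unit replacement to parameters inside the convert template is what keeps the false positive rate low in this sketch: the bare text "5 sqm" outside the template is left untouched.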
Discussion
Seems fine, assuming you check manually for false positives on the last bullet point. --Apoc2400 (talk) 22:00, 25 May 2008 (UTC)
- You say "Very low false positive rate." a few times. Can you give examples of these false positives? dihydrogen monoxide (H2O) 09:35, 26 May 2008 (UTC)
- Actually, I say 'Very low' for the first two bullets. I say plain 'low' for the third bullet. The fourth bullet has different wording. I base my estimates on thousands of edits as Lightmouse. I will use low error rate parts of the same script.
- "Changing '|sqm|'": I have yet to detect or imagine a false positive scenario with the proposed regex. It would be naive to predict zero. So I hypothesised 'very low'.
- "Date links that damage autoformatting": I have yet to detect or imagine a false positive scenario with the proposed regex. It would be naive to predict zero. So I hypothesised 'very low'.
- "Fixing dates that are damaged by autoformatting": This is a more difficult regex problem. With range examples, it is easy to address the first half of a range (or just bad formatting) such as [[1 May|1]]. It is more difficult to correctly address the second half of a range [[4 May]]. That is where the theoretical possibility of false positives exists i.e. I want to delink '4 May' in a date range but not otherwise.
- Unlinking date fragments. For example, false positives for delinking day names occur in references to calendars and gods, i.e. I want to delink 'Thursday' when it is just the day that a TV show airs but not when referring to the god 'Thor'. Of the four bullet points here, this one needs the most care. I have done thousands of these and I know what to check for. I would be happy to do test runs. I hope that helps. Lightmouse (talk) 11:13, 26 May 2008 (UTC)
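The last two tasks discussed above might be sketched in Python as follows. This is an illustration under stated assumptions, not the operator's regex: it handles only ranges joined by the word "to", and the `keep` callback is a hypothetical stand-in for the manual check by eye described in the request. The names `fix_date_ranges`, `delink_fragments` and `mentions_thor` are invented for this example.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"

# [[1 May|1]] to [[4 May]]: the backreferences require the piped label to
# repeat the day number and both links to name the same month, so a lone
# [[4 May]] elsewhere in the text is never matched.
DATE_RANGE = re.compile(
    r"\[\[(\d{1,2}) (" + MONTHS + r")\|\1\]\] to \[\[(\d{1,2}) \2\]\]"
)

# Links whose whole target is a solitary month, day name, or 1-2 digit number.
FRAGMENT_LINK = re.compile(r"\[\[(" + MONTHS + "|" + DAYS + r"|\d{1,2})\]\]")


def fix_date_ranges(wikitext):
    """Turn '[[1 May|1]] to [[4 May]]' into plain '1 to 4 May'."""
    return DATE_RANGE.sub(r"\1 to \3 \2", wikitext)


def delink_fragments(wikitext, keep=lambda match: False):
    """Unlink date fragments unless keep(match) says the link is meaningful."""
    def replace(match):
        return match.group(0) if keep(match) else match.group(1)
    return FRAGMENT_LINK.sub(replace, wikitext)


def mentions_thor(match):
    """Hypothetical reviewer rule: keep links that appear near the word 'Thor'."""
    start = max(0, match.start() - 80)
    return "Thor" in match.string[start:match.end() + 80]


print(fix_date_ranges("held from [[1 May|1]] to [[4 May]]"))
print(delink_fragments("airs every [[Tuesday]] in [[February]]"))
print(delink_fragments("[[Thursday]] is named after Thor", keep=mentions_thor))
```

A fixed keyword rule such as `mentions_thor` cannot replace human review; it only shows where a reviewer's decision would plug in.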
While you're editing units, I wonder whether you would be able to implement a procedure that places a non-breaking space between any digit and an ensuing unit, as per the manual of style. I think that using a regexp as simple as \d\s\w would suffice; even if false positives did occur, the change in space style would cause no user inconvenience. The scope could be extended with a regexp such as (\s\d+)(\w*) (replaced by $1&nbsp;$2), but I wonder whether that would produce more false positives. Cheers, Smith609 Talk 13:53, 26 May 2008 (UTC)
- I am not convinced that the upside/downside balance for non-breaking spaces is a net benefit. So I do not choose to write, check, debug and maintain regex for them. I know that mine is a minority view. However, please note that I use the convert template. That template includes non-breaking spaces as per the MOS. The net effect of any edit that adds the convert template is to give you what you want. Lightmouse (talk) 17:32, 26 May 2008 (UTC)
I think there would be many false positives when a digit is used inside a name or in codes of various kinds. --Apoc2400 (talk) 19:42, 26 May 2008 (UTC)
- Please set Smith609's non-breaking space question and his/her proposed code to one side. It is not part of my request for bot approval. Lightmouse (talk) 19:54, 26 May 2008 (UTC)
Okay, thanks for considering it, and sorry for sidetracking discussion! Smith609 Talk 08:18, 27 May 2008 (UTC)
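For reference, the two regexps proposed in this side thread can be tried out with a short Python sketch. It assumes that the intended replacement inserts the `&nbsp;` entity, and it adds capture groups to the first pattern so that the digit and the following character survive the substitution. The function names are hypothetical, and none of this is part of the bot request.

```python
import re


def space_after_digit(text):
    """First proposal (\\d\\s\\w): swap the whitespace after a digit for &nbsp;."""
    return re.sub(r"(\d)\s(\w)", r"\1&nbsp;\2", text)


def split_digit_unit(text):
    """Second proposal ((\\s\\d+)(\\w*)): insert &nbsp; after a run of digits."""
    return re.sub(r"(\s\d+)(\w*)", r"\1&nbsp;\2", text)


# Intended effect on a quantity followed by a unit.
print(space_after_digit("a 5 km run"))
print(split_digit_unit("ran 100km"))

# The same patterns also fire on numbers that are not quantities.
print(space_after_digit("Route 66 was busy"))
print(split_digit_unit("Windows 7 shipped"))
```

Because `\w*` can match the empty string, the second pattern adds an entity after every number that follows whitespace, which is consistent with the concern about digits inside names and codes.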
- Sounds good to me. A bot to clean up {{convert}} transclusions would be useful. JIMp talk·cont 20:27, 29 May 2008 (UTC)
Any news on this? Lightmouse (talk) 21:06, 4 June 2008 (UTC)
- Let's try a trial to see how it works. Approved for trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. MBisanz talk 21:18, 4 June 2008 (UTC)
Trial edits complete. Lightmouse (talk) 09:55, 5 June 2008 (UTC)
- {{BAGAssistanceNeeded}} I've taken a look at 1/4 of the edits and see no mistakes; I propose approval. BJTalk 10:03, 5 June 2008 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.