Wikipedia:Bots/Requests for approval/Bot0612 4
- The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.
Operator: RichardΩ612 Ɣ ɸ
Automatic or Manually Assisted: Automatic (supervised)
Programming Language(s): AWB custom module (C#)
Function Summary: Update transclusions of {{Infobox Court Case}}, see this botreq.
Edit period(s) (e.g. Continuous, daily, one time run): One time run (until all transclusions are fixed)
Already has a bot flag (Y/N): Y
Function Details: Per the botreq mentioned above, the Court Case infobox has been updated, removing the need to use wikicode when specifying an image. For example, instead of [[Image/File:Foo.jpg|100px]] you would simply put Foo.jpg in the image= parameter. Also, if the court= parameter is specified and matches one of the courts listed here, a relevant image will automatically be added to the infobox.
The bot will:
- Strip any wikicode from the image= parameter.
- If a valid court is specified, remove the image= parameter entirely.
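The two steps listed above could be sketched roughly as follows. This is a Python illustration, not the bot's C# AWB module: the patterns, the function names, and the `KNOWN_COURTS` set are all assumptions made for the example, and the real court list lives on the linked page rather than here.

```python
import re

# Hypothetical stand-in for the linked court list; the single entry is illustrative only.
KNOWN_COURTS = {"High Court of Australia"}

# Assumed pattern: image=[[File:Name|options]] or image=[[Image:Name|options]].
IMAGE_WIKICODE = re.compile(
    r"(\|\s*image\s*=\s*)\[\[(?:File|Image):([^|\]]*)(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)
COURT_PARAM = re.compile(r"\|\s*court\s*=\s*([^|\n}]*)", re.IGNORECASE)
IMAGE_PARAM = re.compile(r"\|\s*image\s*=[^|\n}]*\n?", re.IGNORECASE)


def strip_image_wikicode(wikitext):
    """Step 1: reduce image=[[File:Foo.jpg|100px]] to image=Foo.jpg."""
    return IMAGE_WIKICODE.sub(lambda m: m.group(1) + m.group(2).strip(), wikitext)


def drop_image_if_court_known(wikitext, known_courts=KNOWN_COURTS):
    """Step 2: remove image= entirely when court= names a recognised court."""
    court = COURT_PARAM.search(wikitext)
    if court and court.group(1).strip() in known_courts:
        return IMAGE_PARAM.sub("", wikitext, count=1)
    return wikitext
```

Running step 1 before step 2 keeps the second pattern simple, since by then the image value no longer contains pipes or brackets.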
Discussion
- Feel free to have a look at the source code and make any suggestions for improvements, etc. RichardΩ612 Ɣ ɸ 13:09, 1 January 2009 (UTC)
- There are fewer than 250 transclusions of the template, and I noticed that many are already fixed. I think it can be done semi-automatically, but the code seems fine and I think you can do it. -- Magioladitis (talk) 13:59, 1 January 2009 (UTC)
- Maybe there is a problem when there is an underscore in the image name, but I am not completely sure. -- Magioladitis (talk)
- More comments: Since you are planning to run this, it would be nice if you could check these things as well: diff. "date_decided" has to change to "date decided", the date has to be unlinked, etc. -- Magioladitis (talk) 14:21, 1 January 2009 (UTC)
- With regards to the small number of pages, I'll let this run so that the bot is approved to do this task anyway. That way, if someone requests the same thing in future, I won't have to go through another BRFA. As for the date fixes, I'll code that in. RichardΩ612 Ɣ ɸ 14:23, 1 January 2009 (UTC)
- Date fixing added, see this edit. Should there be a comma+space between the month and the year, or just a space? RichardΩ612 Ɣ ɸ 14:43, 1 January 2009 (UTC)
- Great. I am OK with it now. Since you are a member of the BAG, can you just approve it or not? -- Magioladitis (talk) 17:23, 1 January 2009 (UTC)
- It's usually best if someone else approves it, to be honest. RichardΩ612 Ɣ ɸ 17:49, 1 January 2009 (UTC)
- I'm a little concerned about some of your regular expressions; feel free to correct me if there are any oddities about C# regular expressions that make these not a concern.
- "\|( *|\r\n)?image" won't match "|\r\n image", unless "(...)?" means something different in C#.
- I think your first replacement regex would incorrectly replace any other parameters on the same line if any subsequent parameters contain a wikilink; for example, the foo parameter would be stripped in "|image=[[Image:Example.jpg|200px]]|foo=[[bar]]". Unless C# regexen are non-greedy by default?
- If "date_decided" has a YMD format date or a non-wikilinked date, it will not be replaced with "date decided".
- If any other templates in the page have an "image" parameter, they will also be replaced. I don't know whether this will actually be an issue.
- Not a critical issue: In your regular expressions, you have "( *)?". Unless that has some unusual meaning in C#, " *" would be equivalent.
- Anomie⚔ 01:48, 2 January 2009 (UTC)
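The greedy-match concern can be reproduced with a short sketch. It uses Python's `re` module rather than C#, and the pattern is the fragment Richard quotes in his reply, wrapped in `\[\[` as an assumption for the demo; the bot's full expression is not shown in this discussion. Quantifiers are greedy by default in both Python and .NET, so the behaviour should carry over.

```python
import re

line = "|image=[[Image:Example.jpg|200px]]|foo=[[bar]]"

# Quoted fragment with a leading \[\[ added for the demo; ".*" is greedy.
greedy = re.compile(r"\[\[(File|Image)\:([^|]*).*\]\]")
# Same pattern with a lazy ".*?" for comparison.
lazy = re.compile(r"\[\[(File|Image)\:([^|]*).*?\]\]")

greedy_result = greedy.sub(r"\2", line)
lazy_result = lazy.sub(r"\2", line)

print(greedy_result)  # |image=Example.jpg
print(lazy_result)    # |image=Example.jpg|foo=[[bar]]
```

The greedy form runs on to the last `]]` on the line, which is why the foo parameter disappears.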
- First, thanks for taking the time to check the code!
- As for your first point re: not catching spaces after a newline, I did squash that bug when I did my final checks. I must have forgotten to update the code on the wiki.
- You are quite correct about any subsequent parameters on the same line getting nuked; I hadn't thought of that. I suppose I assumed (wrongly) that each parameter would be on a new line. Thanks for pointing it out, and I've fixed it by changing "(File|Image)\:([^|]*).*\]\]" to "(File|Image)\:([^|]*)[^\]]*\]\]".
- You're right about 'date decided' as well. I coded that quickly as an 'add-in', and obviously didn't think of simple things like that! I've split the regex so that regardless of the date, the underscore will go.
- I thought about other templates with an image= parameter, but after checking a sample of the pages in question, I couldn't find any (few templates other than infoboxes have the image parameter, as far as I know).
- "( *)?" means 'everything inside the brackets is optional'. Without the (...)? it would fail to match if there was no space.
- Thanks again for checking the code. I've updated the version on wiki as well, just in case I missed something else! RichardΩ612 Ɣ ɸ 11:34, 2 January 2009 (UTC)
- No problem!
- Re the newlines: Wouldn't it be simplest to just do "[ \r\n]*" to check for any arbitrary combination of linebreaks and spaces?
- Re the extra stripping: You'll still fail on "|image=[[File:Example.jpg]]|foo=[[bar]]"; try "(File|Image)\:([^|\]]*)(\|[^\]]*)?\]\]". Although that too will fail if the image caption contains a link, hopefully that's another one that turns out to not be an issue in practice.
- Re detecting 0 or more spaces: " *" means "0 or more spaces", so it will match if there are no spaces. OTOH, the redundancy doesn't hurt anything either, besides reducing your code's performance slightly.
- Your date replacer will now remove the links from any date (or any other pair of wikilinks, e.g. "location=[[Sacramento]], [[California]]") that follows an "=", not just in the "date decided" parameter.
- Anomie⚔ 13:40, 2 January 2009 (UTC)
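To see why the first fix still over-matches when the image has no options, the two fragments quoted above can be compared side by side. This is a Python sketch rather than the bot's C#, and the leading `\[\[` is an assumption added for the demo, since only part of each expression is quoted in the discussion.

```python
import re

# Richard's first fix and Anomie's suggested replacement, each prefixed with \[\[.
first_fix = re.compile(r"\[\[(File|Image)\:([^|]*)[^\]]*\]\]")
suggested = re.compile(r"\[\[(File|Image)\:([^|\]]*)(\|[^\]]*)?\]\]")

no_options = "|image=[[File:Example.jpg]]|foo=[[bar]]"
with_size = "|image=[[Image:Example.jpg|200px]]|foo=[[bar]]"

# With no pipe inside the image link, "[^|]*" runs past the closing "]]"
# and the match ends only at the "]]" of the foo parameter.
first_fix_no_options = first_fix.sub(r"\2", no_options)

# Excluding "]" from the file name keeps the match inside the image link.
suggested_no_options = suggested.sub(r"\2", no_options)
suggested_with_size = suggested.sub(r"\2", with_size)

print(first_fix_no_options)   # |image=Example.jpg]]
print(suggested_no_options)   # |image=Example.jpg|foo=[[bar]]
print(suggested_with_size)    # |image=Example.jpg|foo=[[bar]]
```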
- Wow, I never used to make this many stupid mistakes. I suppose I haven't done regex programming in a while so I'm a bit rusty (time to fish out 'Regular Expressions for Dummies' again methinks!). Thanks for correcting my code (again). As for the redundancy of the bracket-question mark combination, that is something completely new to me. I always thought that the * meant 'one or more'. Thanks for clarifying that! RichardΩ612 Ɣ ɸ 16:08, 2 January 2009 (UTC)
- In regex programs that support it, "+" normally means "one or more". I suppose there could be some version of regular expressions that use "*" for "one or more", but that would be quite odd considering that the use of "*" goes back to regular expressions' origins in formal language theory and the Kleene star.
- I see only one major issue left: your latest image-matching regex will match [[File:Example.jpg]] and [[File:Example.jpg|caption]], but not [[File:Example.jpg|20px|caption]]; you need a * rather than ? just before the \]\] to match that.
- One more regex optimization: [ \r\n]* * is redundant to just [ \r\n]*.
- Anomie⚔ 01:22, 3 January 2009 (UTC)
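The quantifier points in this exchange can be checked with a small Python sketch (Python's `re` treats `*`, `+`, and `?` the same way .NET does). The `matches` helper is introduced only for the demo. The last two patterns are also an assumption: the bot's latest expression is not quoted here, so the option group is guessed to exclude pipes, which is one shape that would show the behaviour Anomie describes.

```python
import re


def matches(pattern, text):
    """True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


# "*" means zero or more, so wrapping it in "(...)?" adds nothing; "+" needs at least one.
star_zero_spaces = matches(r"\| *image", "|image")
wrapped_zero_spaces = matches(r"\|( *)?image", "|image")
plus_zero_spaces = matches(r"\| +image", "|image")

# "[ \r\n]*" already accepts everything "[ \r\n]* *" accepts.
both_forms = [
    matches(p, "|\r\n  image") for p in (r"\|[ \r\n]* *image", r"\|[ \r\n]*image")
]

# Assumed shape: each option group stops at the next pipe.
optional_once = r"\[\[(File|Image)\:([^|\]]*)(\|[^|\]]*)?\]\]"
repeated = r"\[\[(File|Image)\:([^|\]]*)(\|[^|\]]*)*\]\]"
two_options = "[[File:Example.jpg|20px|caption]]"

once_matches_two = matches(optional_once, two_options)
repeated_matches_two = matches(repeated, two_options)
```

Under that assumption, `?` allows at most one `|option` group, while `*` allows any number.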
Approved for trial (25 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Once you've fixed the last issue above, go ahead and give it a run. Anomie⚔ 01:22, 3 January 2009 (UTC)
- Trial complete. There was only one error, caused by someone removing the wikicode but not the size specification from the image parameter. I updated the code to compensate (although this is an unlikely scenario). RichardΩ612 Ɣ ɸ 14:05, 3 January 2009 (UTC)
- Check this edit. The underscore must be removed from "full_name", "prior_actions" and "subsequent_actions" as well; otherwise, they won't appear in the infobox. -- Magioladitis (talk) 17:59, 3 January 2009 (UTC)
- Comment: I was wondering if we can use your bot to convert "image=" and "caption=" in {{Infobox soap character}} as well. In this case image=[[Image/File:Foo.jpg|100px]] must be replaced by image1=Foo.jpg. -- Magioladitis (talk) 18:04, 3 January 2009 (UTC)
- I'll code in the removal of underscores in other parameters as well; it shouldn't take long at all. As for doing the soap characters template, that is also possible with minimal modification to the code (I take it image2= should also be modified if necessary?). As the task is virtually identical, I imagine this BRFA would cover both. RichardΩ612 Ɣ ɸ 18:50, 3 January 2009 (UTC)
Sankey v Whitlam wasn't your error, actually; as far as the wikitext goes, the user's edit left an extra unnamed parameter being passed to the template. The bot handled it correctly (OTOH, fixing that user error isn't a bad thing to do). Dow Jones & Co. Inc. v Gutnick is also not an error in the bot, as replacing those other parameters wasn't specified as part of the task; I have no problem with your adding them using the same regex as for the date_decided replacement.
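One way the underscore removal for these parameter names could look is sketched below. It is a Python illustration rather than the bot's C# code. The parameter names come from the discussion; the pattern and the `rename_underscored_params` function are assumptions. Note that the sketch renames matching parameters anywhere in the page text, not only inside the infobox.

```python
import re

# Parameter names mentioned in the discussion.
UNDERSCORED = ("date_decided", "full_name", "prior_actions", "subsequent_actions")

# Match only the parameter name (between "|" and "="), never the value.
PARAM_NAME = re.compile(
    r"(\|\s*)(" + "|".join(UNDERSCORED) + r")(\s*=)", re.IGNORECASE
)


def rename_underscored_params(wikitext):
    """Replace the underscore with a space in the listed parameter names."""
    return PARAM_NAME.sub(
        lambda m: m.group(1) + m.group(2).replace("_", " ") + m.group(3),
        wikitext,
    )
```

Because the name is matched independently of its value, the rename happens whatever date format or text follows the "=".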
I do see one problem with the date stripping: You're changing "[[December 9]], [[1996]]" to "December 9 1996" rather than "December 9, 1996". The easiest fix might be to just use two date-fixing regular expressions:
ArticleText = Regex.Replace(ArticleText, @"date decided *\= *\[\[([a-z]+ [0-9]+)\]\] *\,? *\[\[([^\]]*)\]\]", "date decided=$1, $2",RegexOptions.IgnoreCase);
ArticleText = Regex.Replace(ArticleText, @"date decided *\= *\[\[([0-9]+ [a-z]+)\]\] *\,? *\[\[([^\]]*)\]\]", "date decided=$1 $2",RegexOptions.IgnoreCase);
Alternatively, you could capture the "\,?" and include it in the output, but I'd recommend the two-regex solution because a date of "[[9 December]], [[1996]]" will be output without the comma by the date autoformatting (and vice versa).
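For anyone who wants to try the two-regex approach outside AWB, here is a rough Python equivalent of the two C# calls above. The patterns are carried over unchanged apart from dropping unneeded escapes; the constant and function names are assumptions for the example.

```python
import re

# Month-first dates, e.g. [[December 9]], [[1996]]: emit with a comma.
MONTH_DAY = re.compile(
    r"date decided *= *\[\[([a-z]+ [0-9]+)\]\] *,? *\[\[([^\]]*)\]\]",
    re.IGNORECASE,
)
# Day-first dates, e.g. [[9 December]] [[1996]]: emit without a comma.
DAY_MONTH = re.compile(
    r"date decided *= *\[\[([0-9]+ [a-z]+)\]\] *,? *\[\[([^\]]*)\]\]",
    re.IGNORECASE,
)


def unlink_date_decided(wikitext):
    """Unlink the date in 'date decided' only, leaving other wikilinks alone."""
    wikitext = MONTH_DAY.sub(r"date decided=\1, \2", wikitext)
    return DAY_MONTH.sub(r"date decided=\1 \2", wikitext)
```

Anchoring both patterns on the parameter name also avoids unlinking unrelated pairs such as `location=[[Sacramento]], [[California]]`.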
I see nothing wrong with the trial besides that easy to fix comma issue, so Approved.
Regarding {{Infobox soap character}}, does anything actually need to be done there? It doesn't look like a case of changing parameters, as both "image" and "image1" were present in the very first version. I think that needs a bit of discussion (elsewhere); if it turns out to be actually desired and the same image-replacing regex is used, I'll speedy-approve it. Anomie⚔ 19:07, 3 January 2009 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.