Wikipedia talk:India Education Program/Archive 4


Community work load

 
[Graph: NPP backlog, 4 Nov 2011]

I think this graph clearly demonstrates the massive action by volunteers to clean up someone else's mess.

Now that the word has got round that the IEP has been (partly) closed, the backlog is again on the increase.

Today, there is only 1 (one) patroller on duty...

Heartfelt thanks to everyone who rolled their sleeves up.--Kudpung กุดผึ้ง (talk) 04:20, 5 November 2011 (UTC)

I think I checked more pages than a copyright attorney this past month, though some of it was also existing articles. I got one hell of a crash course in picking out close paraphrasing and copies from multiple websites. The Blade of the Northern Lights (話して下さい) 05:26, 5 November 2011 (UTC)
Ha! me too! Kudpung กุดผึ้ง (talk) 07:00, 5 November 2011 (UTC)

Pilot program?

I am having difficulty understanding how a supposed pilot program was able to grow so large and out of control. The whole idea of a pilot is that it is small, contained, and doesn't risk resources. Continuing to refer to this giant fiasco as a pilot strikes me as minimizing rhetoric rather than the thorough review of a deeply flawed process that we need. Jojalozzo 02:39, 6 November 2011 (UTC)

Roughly 20,000 registered accounts made more than 10 article space edits in September (source). This program was intended to add 1,000 new editors at roughly that level of activity, an increase of 5%. Make of that what you will. Danger High voltage! 03:06, 6 November 2011 (UTC)
- a frightening proposition indeed, and serious enough to firmly insist on a delay before the next phase starts. What's going to happen ultimately is that, with all the other issues with the WMF this year, the regular community is going to be extremely skeptical of wanting to offer any more help. I had a silly comment from another admin yesterday that wrecked my day so I sat next to my pool in the tropical sun for the rest of the afternoon and mused over whether we are shouting into the void, or if what we do here is worth it at all - it's only a drop in the (Indian) ocean anyway, and nobody is listening to the drips (pun). --Kudpung กุดผึ้ง (talk) 04:57, 6 November 2011 (UTC)
That "5%" looks soothingly manageable, until you consider that, unlike most of the 20,000, the thousand new editors (a) are quite new to Wikipedia, (b) in many cases, it seems, are not fluent in English, (c) do not seem to have been given even a rudimentary briefing on things like wiki markup, sandboxes, and copyright, (d) are "supervised" by people with equally little WP experience, and (e) are given deadlines to contribute. JohnCD (talk) 13:15, 6 November 2011 (UTC)
...and, another important difference: the 20,000 are volunteers, here because they want to edit. The 1,000 are pressed men, they have to edit or they fail their course. JohnCD (talk) 22:17, 6 November 2011 (UTC)
Plus, they have been assigned (well, have to choose one, that's not much better) a random topic from a field they're still trying to pass. They are not experts at all. Regular Wikipedia contributors usually write because they know a lot about a subject, not because they randomly picked this article as a "least effort" topic to write about. If you know a lot it's usually much easier to write yourself; if they don't really know enough, people tend to copy and paste. --Chire (talk) 08:50, 7 November 2011 (UTC)

Common format

Ok, let's get this underway. We need to figure out a common format for all the tables at WP:IEPS so that we can keep the machine-readable lists updated.
I suggest these columns, in this order: Rollno, name, username, articles, sandboxes, mentor, approval/sign, instructor, OA, OA comments, other columns. Most of the tables just need to be rearranged, but some of the tables have double columns (see this for an example). I'm all for removing the extraneous columns like approval/sign and instructor (they all have the same content), but I don't want it to inconvenience the profs. Course pages should have their tables replaced with links to the sections of WP:IEPS.
We should also decide places to keep master machine-readable lists (for the moment, these are User:Manishearth/Ambassador/IEPstudents/rcl and User:Manishearth/Ambassador/IEParticles/rcl). Once this is done, I can easily keep them in sync, as well as run redirect checks, etc. ManishEarthTalkStalk 16:45, 6 November 2011 (UTC)

In regard to the table rows with multiple students, it would be possible to create separate rows for each student and still combine their shared data into multi-row table fields for stuff like articles they worked on, etc. This way, we could bring the tables into a uniform format without losing the info that some users formed groups to work on articles. The downside is the more complicated table syntax. (If we want to avoid the more complicated syntax, we could, in a first pass, add separate rows for each student and just fill up the shared columns with something like "see above". This way, the table format could be kept simple and straightforward to edit (and parse on source code level).) Comments?
An alternative might be to create a special {{Template:User IEP|accountname=|realname=|rollno=}}. However, the real name and roll number are mostly irrelevant for our cleanup purposes. Ideally, this kind of information should be part of the user account info or stored in a template on the corresponding user pages and extracted from there by another template, so that it can be centrally maintained as part of the data object "user", not "course". So, while it might be a good idea to have something like this in future programs, it would only increase our work now for no immediate benefit, it seems. Comments?
Should we add a special comment column for our cleanup efforts, or just use a generic comment column as we already do now?
Do we need to have special columns for cleanup status info, such as "user page has been tagged with IEP template", "all discussion pages of articles edited by user have been tagged with IEP assignment template", "all articles have been evaluated, reverted and/or cleaned", always with date stamp, or is this kind of info no longer necessary and will be maintained elsewhere as part of a formal CCI investigation process?
Additional notes:
In order to avoid ambiguity I would like to suggest that we all stick to the well-established ISO 8601 standard international date format in the comments column, that is "yyyy-mm-dd", example: 2011-11-06 for November 6th, 2011. No abbreviated 2-digit year forms, no national date orders, no non-standard separators, only hyphens (as per the standard). Using the ISO format, we no longer have to wonder if something like "11/10/11" means November 10th, 2011 or October 11th, 2011 (and if we didn't already know we are talking about 2011, there would be even more ways to interpret this).
If we don't create a special user template (s.a.), I still think we should frame all accountnames with {{User-c|accountname}}. The added "(t c)" can be easily filtered during table export, but it makes it much easier to check for contributions, if we add it. Existing usage of the {{User|accountname}} template could be changed to {{User-c|accountname}} in a minute.
If we add multiple entries (account names, articles, sandboxes) to a table field, I suggest using a semicolon (;) to separate the entries, not a comma (which might be used as part of an article or user name as well) or no separator at all (no separator makes it difficult to export the data using the HTML rendering, and we would have to parse the data on source code level instead). I don't know if it is necessary, but if we find multiple multi-word entries where separating them by semicolon would prove to be difficult, we could put them in "quotes". These are easy to parse and strip off in the resulting exported data.
"Course pages should have their tables replaced with links to the sections of WP:IEPS". Yes, but only after once more proofing and sync'ing the data into the master table.
--Matthiaspaul (talk) 00:27, 7 November 2011 (UTC)
Something like this:
{| class="wikitable sortable IEPtable"
|-
! ID
! Roll number
! Real name
! Account(s)
! Article(s)
! Sandbox(es)
! Mentor
! Approval/Sign
! Instructor
! Last change link
! Wikiproject review
! Online ambassador
! OA comments
! Cleanup comments
! Cleanup status
|-
| zzz <!-- Group ID, where applicable -->
| zzz <!-- Student roll number -->
| zzz <!-- Student real name -->
| {{User-c|zzz}} <!-- Student account name. No real name. Repeat with ; for more than one account name. -->
| [[zzz]] <!-- Article name without pipes. Repeat with ; for more than one article. -->
| [[User:zzz/sandbox]] <!-- Student sandbox. Repeat with ; for more than one sandbox. -->
| {{User-c|zzz}} <!-- Mentor account name or real name -->
| zzz <!-- Approval/sign -->
| {{User-c|zzz}} <!-- Instructor account name or real name -->
| [[zzz]] <!-- Link of last change made to article, where applicable -->
| zzz <!-- WikiProject Computing/Computer science review, where applicable -->
| {{User-c|zzz}} <!-- Online ambassador account name or real name -->
|
*yyyy-mm-dd: zzz <!-- OA comment. Repeat in new line for more than one comment. -->
|
*yyyy-mm-dd: zzz <!-- Cleanup comment. Repeat in new line for more than one comment. (add new comments on top) -->
|
*yyyy-mm-dd: zzz <!-- Reserved for cleanup status. Repeat in new line for more than one comment (add new statuses on top). -->
|}

for

ID | Roll number | Real name | Account(s) | Article(s) | Sandbox(es) | Mentor | Approval/Sign | Instructor | Last change link | Wikiproject review | Online ambassador | OA comments | Cleanup comments | Cleanup status
zzz | zzz | zzz | zzz (t c) | zzz | User:zzz/sandbox | zzz (t c) | zzz | zzz (t c) | zzz | zzz | zzz (t c) | • yyyy-mm-dd: zzz | • yyyy-mm-dd: zzz | • yyyy-mm-dd: zzz
--Matthiaspaul (talk) 01:53, 7 November 2011 (UTC)
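The semicolon convention suggested above (entries separated by ";", with optional "quotes" around awkward entries) could be parsed roughly as follows. This is only a sketch: the function name `split_cell` is hypothetical, and it assumes double quotes are used purely as delimiters and never appear inside an entry.

```python
def split_cell(cell: str) -> list[str]:
    """Split a table cell on semicolons, keeping quoted entries intact."""
    entries = []
    current = []
    in_quotes = False
    for ch in cell:
        if ch == '"':
            # Quotes only mark the start/end of an entry; they are stripped.
            in_quotes = not in_quotes
        elif ch == ';' and not in_quotes:
            entries.append(''.join(current).strip())
            current = []
        else:
            current.append(ch)
    entries.append(''.join(current).strip())
    # Drop empty entries, e.g. from a trailing semicolon or an empty cell.
    return [entry for entry in entries if entry]


print(split_cell('Foo; Bar, Baz; "Qux; Quux"'))
```

A comma inside an entry survives untouched, which is the reason given above for preferring the semicolon.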
I like it!!! No, we should not use the multi-row fields etc., that makes it confusing for scripts. For extracting data, the usernames/whatever need to be in a specific column number. Separation with semicolons (for multiple usernames/rollnos/articles/etc) is the best ("See above" is OK, too, except that then the comments and all become confusing). Let's not use the Template:User IEP thing. You're absolutely right, it will be useful next time, but right now, it will just be an unnecessary headache.
I think we should have the separate comments column, so that the original OA comments don't get overwritten. The status column is an excellent idea. We can fill it with {{yes}}/{{no}}/{{partial}} templates, with the content being somewhat like "Checked:Copyvio/Blanked", "Checked:Copyvio/Not Blanked", "Checked:OK", "Checking", "Not sure", and "Unchecked" (or empty), along with the username and datestamp. Making new comments on another line is perfect for our needs.
There's no need to do the "User page has been tagged with an IEP template", as it will be done via a bot sooner or later (I'm waiting for the BRFA to get approved). The "All articles have been reverted/cleaned" etc will come into the "cleanup status" column.
Yep, we should use ISO, though a timestamp might also be necessary. Why not just use the ~~~~~ timestamper? There's no ambiguity in that (Example: 13:03, 7 November 2011 (UTC))... We probably won't need to machine read it, and even if we do, JS/Java/etc have libraries that can interpret all types of dates.
User-c is the way to go. It's pretty easy to filter out talk/contrib links, though it's not so easy to do so for sandbox links (because some OAs encourage "playground" pages instead of sandboxes, etc.). Which is why they'll be fine sitting in a separate column.
By the way, there's this tool that makes editing wikitables easy. It's here. Unfortunately, it doesn't work with the new toolbar. I'll try to modify it so that it does. ManishEarthTalkStalk 13:03, 7 November 2011 (UTC)
We could use {{subst:ISO8601}} to implant the current date and time in ISO format, example: "2011-11-07T14:25Z ". (EDIT: Removed distracting extra linefeed and UTC link from template output. Hope nobody else needs it...) --Matthiaspaul (talk) 14:25, 7 November 2011 (UTC)
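For anyone producing such stamps outside the wiki, here is a minimal sketch of the same "yyyy-mm-ddThh:mmZ" form, together with a demonstration of why "11/10/11" is ambiguous. The helper names `iso_stamp` and `readings` are hypothetical and not part of any template or tool mentioned here; `iso_stamp` expects a timezone-aware datetime.

```python
from datetime import datetime, timezone


def iso_stamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as UTC, e.g. '2011-11-07T14:25Z'."""
    return moment.astimezone(timezone.utc).strftime('%Y-%m-%dT%H:%MZ')


def readings(ambiguous: str) -> list[str]:
    """Read 'xx/xx/xx' month-first and day-first; return both as ISO dates."""
    month_first = datetime.strptime(ambiguous, '%m/%d/%y')
    day_first = datetime.strptime(ambiguous, '%d/%m/%y')
    return [month_first.date().isoformat(), day_first.date().isoformat()]


print(iso_stamp(datetime(2011, 11, 7, 14, 25, tzinfo=timezone.utc)))
print(readings('11/10/11'))  # two different dates from the same string
```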
Well, it makes it less human-readable than the ~~~~~, though it adds a bit of machine readability. I'd say we keep the ~~~~~, as the comments aren't going to be required in the machine-parsing. Actually, I don't see the point of timestamping the comments when the commenter's going to sign them anyways...
On a related note, I've started writing an editnotice to put on the page after we do the reformat. The working copy is here. Feel free to edit it. ManishEarthTalkStalk 15:39, 7 November 2011 (UTC)
The table proposed above has too many columns - it's much wider than the screen of my small laptop, and it's going to be much wider when populated. I'd suggest the following changes:
  • ID - remove
  • Roll number / Real name - remove one if not both
  • Mentor - is it necessary ? abbreviate to initials (with a key below table)
  • Approval/sign - remove
  • Instructor - is it necessary ? abbreviate to initials (with a key below table)
  • Last change link - needs clarification, what's the purpose of this ?
  • Wikiproject review - need to clarify which wikiproject referring to
  • Online ambassador - possibly abbreviate to initials (with a key below table)
  • OA comments / Cleanup comments - combine ?
  • Cleanup status - OK
If student edits are continuing then the cleanup status column should be sortable by date. DexDor (talk) 07:53, 8 November 2011 (UTC)
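One reason the ISO form suits a date-sortable status column is that ISO stamps sort chronologically even when compared as plain text. A small sketch, assuming each status entry is a (timestamp, label) pair; the function name `newest_first` and the sample pairing of stamps with labels are illustrative only:

```python
def newest_first(entries: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Sort (ISO timestamp, status) pairs so the most recent comes first."""
    # Plain string comparison is enough because ISO 8601 orders year, month,
    # day, hour and minute from most to least significant, zero-padded.
    return sorted(entries, key=lambda entry: entry[0], reverse=True)


statuses = [
    ('2011-11-07T14:25Z', 'Checking'),
    ('2011-11-08T16:55Z', 'Checked:OK'),
    ('2011-11-06T09:10Z', 'Unchecked'),
]
for stamp, label in newest_first(statuses):
    print(stamp, label)

# Free-text dates do not have this property when sorted as text:
print(sorted(['7 November 2011', '10 November 2011']))
```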

I'm completely fine with it if we remove the extraneous columns, just that I didn't want to tamper with the info already there. It's of no use to us, but it is of use to the profs. Anyways, seeing that all edits are stopped till the cleanup is finished, I guess we can remove the columns (If anyone wants the data we can redirect them to an older revision). You make a good point about sorting the cleanup statuses. So we should use the ISO timestamp along with the signature in that cell (the same goes for the cleanup comments). The WikiProject review column will contain whatever is already there in the current table. We can combine the OA comments and cleanup comments, but it will become a headache to reformat the OA comments.

I have also added class="wikitable sortable IEPtable" to the table above (it won't interfere with anything, but it will ensure that other tables won't interfere when I extract data) ManishEarthTalkStalk 10:09, 8 November 2011 (UTC)
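To illustrate how the IEPtable class lets an extraction script skip unrelated tables, here is a rough sketch of a reader for the one-cell-per-line layout proposed above. It is not the script actually used; the function name `extract_iep_rows` is hypothetical, and the sketch deliberately ignores inline "||" cell separators and nested tables.

```python
import re


def extract_iep_rows(wikitext: str) -> list[list[str]]:
    """Return the rows (lists of cell strings) of tables marked IEPtable."""
    rows, row, in_table = [], None, False
    for line in wikitext.splitlines():
        # Strip single-line HTML comments such as <!-- Student roll number -->.
        line = re.sub(r'<!--.*?-->', '', line).rstrip()
        if line.startswith('{|'):
            in_table = 'IEPtable' in line
            continue
        if not in_table:
            continue
        if line.startswith('|}'):       # end of table
            if row:
                rows.append(row)
            row, in_table = None, False
        elif line.startswith('|-'):     # row separator
            if row:
                rows.append(row)
            row = []
        elif line.startswith('|+'):     # caption, not a cell
            continue
        elif line.startswith(('|', '!')):
            if row is None:
                row = []
            row.append(line[1:].strip())
        elif row:
            # Continuation line (e.g. a "*yyyy-mm-dd: ..." bullet) of the last cell.
            row[-1] = (row[-1] + '\n' + line).strip()
    return rows


sample = '''{| class="wikitable"
|-
| ignored
|}
{| class="wikitable sortable IEPtable"
|-
! Account(s)
! Cleanup status
|-
| {{User-c|zzz}} <!-- account -->
|
*2011-11-08: Checked:OK
|}'''
print(extract_iep_rows(sample))
```

Because each cell sits at a fixed position in its row, a script can then pick out, say, the account column by index.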

IMHO, we cannot simply delete existing columns, if, at the same time, we remove the lists on the course sub pages (as I have already started in some cases after once more checking bit by bit, that this information is reflected in the master list as well). We do this to concentrate the data into one place so that we no longer need to sync with other places. Pointing the locals to an older version of the data in the page history is just the same as forking the data again.
"ID", "Roll number": Some of the courses actively use these group IDs. Removing this information may interfere with their work. "Real name": In some cases, this may help us to identify multiple accounts and it may also be used by the local people in Pune. "Mentor - is it necessary? abbreviate to initials (with a key below table)", "Approval/sign - remove", "Instructor - is it necessary? abbreviate to initials (with a key below table)", "Last change link - needs clarification, what's the purpose of this?", "Wikiproject review - need to clarify which wikiproject referring", "Online ambassador - possibly abbreviate to initials (with a key below table)": Surely, we don't need any of this, but this is information present in some of the course lists, and since we want them to actively use the online lists as their one-and-only data base (instead of using shadow lists in paper form or whatever, which would once again cause synchronisation problems), we cannot just delete their info, because we don't need it. I'm personally not a fan of footnotes at all, however, if this way we can bring the table width down enough, this would be a solution. "OA comments / Cleanup comments - combine?" Possible, I thought about this as well, but decided against it in my proposal because the OA comments sometimes have nothing to do with any cleanup efforts and even if they do, I thought it would be better to have an extra column for the more systematic and organized cleanup efforts which still have to take place. However, I'm not against it, if it really helps.
I'm open to removing columns we don't need, of course, but I think we would need approval from the course instructors and corresponding CAs for this first.
I don't use a wide screen myself and therefore cannot see the whole table either, but I don't see much problem in scrolling or temporarily reducing the browser font size ([CTRL]+[-]) if I need to have a broad view... --Matthiaspaul (talk) 11:27, 8 November 2011 (UTC)
Are there horizontally collapsible tables, so that those with narrow screens could untick the columns they don't want to see? --Matthiaspaul (talk) 11:35, 8 November 2011 (UTC)
No. Reducing the font size to 85% could help, though. —Ruud 11:47, 8 November 2011 (UTC)
I'm not saying we delete their info. Remember, it will all be in the history.
How 'bout this? We put all the columns which are useless for our current purposes at the end. That way, whatever we need is up front, and the rest is off to the side. There isn't any way to make columns collapsible, unfortunately. (Though I can write a script that hides the last few columns if anyone's interested). It won't look too logical, but it serves our purposes. ManishEarthTalkStalk 13:02, 8 November 2011 (UTC)
Rearranging the order of columns and reducing the width of the table would give something like this:
Table. Test
ID | Roll number | Real name | Account(s) | Article(s) | Sandbox(es) | Cleanup status | Cleanup comments | OA comments | Online ambassador | Mentor | Instructor | Approval/Sign | Last change link | Wikiproject review
zzz | zzz | zzz | zzz (t c) | zzz | User:zzz/sandbox | • 2011-11-08T16:55Z: zzz | • 2011-11-08T16:55Z: zzz | • 2011-11-08T16:55Z: zzz | zzz (t c) | zzz (t c) | zzz (t c) | zzz | zzz | zzz
The downside would be that we'd have to swap the order of entries for many tables (instead of just inserting stuff in between). This will make it more complicated (at least for me, as I don't have a good editor at hand right now), but if it helps the others, I would be okay with it.
Question: Adding "sortable" to the wikitable class, the table will again be blown up to full width. Any workaround? --Matthiaspaul (talk) 16:55, 8 November 2011 (UTC)
I think we can remove the Instructor column altogether as it never changes within a course and the instructor is defined in the course tables as well, so no information gets lost here.
I'm not sure about the role of the Mentors, but perhaps we can combine this column with the OAs. If this would be important, we could add prefixes such as MT:{{User-c|accountname}} or OA:{{User-c|accountname}} and still save a column. Since the "link of last change" and "Approval/Sign" columns are never used at the same time, perhaps we can combine them as well. And finally, if we combine the "Wikiproject" and the "OA comments" columns (which have been mixed up anyway in many tables) we get another column. Gives:
Table. Test
ID | Roll number | Real name | Account(s) | Article(s) | Sandbox(es) | Cleanup status | Cleanup comments | OA comments / Wikiproject review | Online ambassador / Mentor | Approval / Sign / Last change link
zzz | zzz | zzz | zzz (t c) | zzz | User:zzz/sandbox | • 2011-11-08T16:55Z: zzz | • 2011-11-08T16:55Z: zzz | • 2011-11-08T16:55Z: zzz | OA:zzz (t c); MT:zzz (t c) | zzz; zzz
Alternative proposal with a column order more in line with the existing table layout (and therefore easier to convert semi-manually):
Table. Test
ID | Roll number | Real name | Account(s) | Article(s) | Sandbox(es) | Approval / Sign / Last change link | Online ambassador / Mentor | OA comments / Wikiproject review | Cleanup comments | Cleanup status
zzz | zzz | zzz | zzz (t c) | zzz | User:zzz/sandbox | zzz; zzz | OA:zzz (t c); MT:zzz (t c) | • 2011-11-08T16:55Z: zzz | • 2011-11-08T16:55Z: zzz | • 2011-11-08T16:55Z: zzz
Would the second proposal be narrow enough to fit on the screens? It doesn't for me, but neither do the existing course tables, but I don't care. If we could get width:50% to work with sortable tables, we won't have any problems at all... --Matthiaspaul (talk) 20:28, 8 November 2011 (UTC)
For test purposes, I have changed the first of the course tables in the master list to this proposed format. It's still wide, but at least not wider than some of the existing tables. (I found it very time-consuming to change the order of columns, which is why I used the second of the proposals. Swapping the column order is an extra step anyway, so we don't waste time by doing it in two passes, should it still be necessary to change the order at a later stage.) Regarding width:70% not working with sortable tables, is this a bug in the implementation or is this "by design"? Should we remove the "sortable" to make width work so that all can see the whole table without scrolling, and re-add it at a later stage, when it becomes more useful? --Matthiaspaul (talk) 04:21, 9 November 2011 (UTC)
In regard to those changes - I've been trying to finish off that course, and the table format has been changing, so I've moved to another course instead. However, now I'm lost as to what I'm supposed to do with the new format. I was never an OA, as far as I am aware - just someone assigned to do CCI. It may be that I am an OA, but now my comments are listed as OA comments, and there are two new fields regarding data I've already added, but which are blank. Do I fill in those fields with the same data I was adding to the OA field? Or is a third person, other than me, to check those cleanup fields? And does this mean that they will be repeating work I've completed? Or perhaps I shouldn't be adding data to the comments field at all, only to the cleanup fields? And, of course, if that's the case, what data is to go in the "cleanup status" column? At this stage I'm going on the assumption that I can't work with that course, as I might be compounding problems with the new formats. - Bilby (talk) 05:04, 9 November 2011 (UTC)

@Matthiaspaul: There's no need to make the table fit in the window (So keep it sortable and remove the 70% width). Only the important stuff should fit right now, and the remaining columns (name/rollno/group no) should be shunted to the end. There's no need to test out the format on WP:IEPS right now (copy the first table into your sandbox and try it out if you want). I'm working out the bugs from the wiki table editor, and this will let you shift columns in a jiffy.

@Bilby: Continue your cleanup efforts on that page. You may use the OA comments section for now, right now we are only testing formats... ManishEarthTalkStalk 05:27, 9 November 2011 (UTC)

Okay. I was not so much "testing", but more trying to seek consensus while moving forward at the same time. ;-) If WTE allows swapping columns at a later time, I think I will continue in the current order and you swap the columns afterwards. I am temporarily without my favourite editors (TSE Pro, Ultraedit), that's why I am a bit "handicapped" when it comes to REGEX and scriptable tasks such as table reordering, and doing this manually is a time-sink. That's why I prefer to keep the columns in basically the same order as they are already, and someone else can change it with proper tools at hand. BTW. I think it would be a good idea to leave at least the ID column as the first one (using the existing group IDs where present, and just counting up elsewhere). It doesn't take much room, but serves as a memorizable index into a row and makes it easy to restore the original list order.
Putting the column order aside for a moment, do we have consensus on the type and combination of columns in general? Should we combine some more? (I like your idea with the yes/no/partial template for the cleanup status, but this is something that can be added later, or should we set it to No now?) --Matthiaspaul (talk) 11:36, 9 November 2011 (UTC)

(edit conflict) I finally got the table editor to work. It's still quite buggy, but it does the trick.

To install, type this code in your skin js page.

importScript('User:Manishearth/WikiTable.js')


Once you install, two new table icons should appear in the editing toolbar. Select part of a table, and click the first icon. It should let you do all sorts of stuff with tables, most importantly adding and moving columns. The save button updates the table in the editbox (it does NOT save the page). Note that the script is still quite buggy (because it was probably meant for simple wikitables, without templates etc.). It has the following bugs:

  • Sometimes the full wikitext of a row gets crammed into a single cell (Just scroll down the table while the Table Editor is open and check for anomalies) These anomalies have to be fixed by hand, unfortunately.
  • Templates with a pipe character mess up the editor. Basically, "OA:{{User-c|Manishearth}}" will show up as "Manishearth}}". It does not affect the actual wikitext of the cell, it only looks like it does. You may move around such half-eaten columns normally, the eaten text will move around with them (even though it won't appear to do so).
  • Also, the editor can't work with the header row (actually it does, but this setting breaks the rest of the script). Header rows can't be seen in the editor (They aren't removed, though, they just don't show up). Moving columns around does not affect the header row. For this reason, I suggest you convert the header row to a normal row (convert the !s to |s, etc), use the table editor to shift things around, press save, and then convert the top row back to a header row. I'm going to try and fix this bug.


ManishEarthTalkStalk 11:52, 9 November 2011 (UTC)
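The "half-eaten" symptom described in the second bug is what one would see if a cell were cut at its first pipe regardless of template nesting. That is an assumption about the cause, not something confirmed from the script's source. The sketch below contrasts such a naive split with a nesting-aware one; `cell_content` is a hypothetical helper, not part of WikiTable.js.

```python
def cell_content(cell: str) -> str:
    """Return a cell's content, dropping a leading 'attributes |' part if any.

    Pipes inside {{templates}} or [[links]] are not treated as separators.
    """
    depth = 0
    i = 0
    while i < len(cell):
        pair = cell[i:i + 2]
        if pair in ('{{', '[['):
            depth += 1
            i += 2
        elif pair in ('}}', ']]') and depth > 0:
            depth -= 1
            i += 2
        elif cell[i] == '|' and depth == 0:
            # First top-level pipe: everything before it is cell attributes.
            return cell[i + 1:].strip()
        else:
            i += 1
    return cell.strip()


cell = 'OA:{{User-c|Manishearth}}'
print(cell.split('|', 1)[1])   # naive split at the first pipe
print(cell_content(cell))      # nesting-aware version keeps the template whole
print(cell_content('style="width:5em" | zzz'))
```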

In reply to the stuff you wrote above, I'd say that the yes/no/partial templates should only be used once an article has been/is being checked. If an article hasn't been glanced at, then we just leave the cell blank. Instructions on formatting will be in the editnotice (or in comments in the cells), which will let them know that they have to use the yesno template. As for agreement on the columns, I'm fine with them, but of course you need input from others. ManishEarthTalkStalk 11:57, 9 November 2011 (UTC)
Alrighty. I've not fixed the header row bug per se, but I found a good enough workaround. All you have to do is make sure that the header row has a "|-" before it. Eg:
Instead of:
{|style=blah etc etc
|+ Caption
!ID
!abc
!def
|-
(data goes here)
|}
Do:
{|style=blah etc etc
|+ Caption
|-
!ID
!abc
!def
|-
(data goes here)
|}


Hope that helps! ManishEarthTalkStalk 12:09, 9 November 2011 (UTC)
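The workaround above can also be applied mechanically. A minimal sketch, assuming the one-cell-per-line layout shown in the two examples; the function name `ensure_row_marker_before_header` is hypothetical, and a trailing newline in the input is not preserved.

```python
def ensure_row_marker_before_header(wikitext: str) -> str:
    """Insert a '|-' line before the first header cell of each table if missing."""
    out = []
    in_table = False
    seen_row_marker = False
    for line in wikitext.splitlines():
        stripped = line.strip()
        if stripped.startswith('{|'):
            in_table, seen_row_marker = True, False
        elif stripped.startswith('|}'):
            in_table = False
        elif in_table and stripped.startswith('|-'):
            seen_row_marker = True
        elif in_table and stripped.startswith('!') and not seen_row_marker:
            # Header cell reached without any preceding row marker: add one.
            out.append('|-')
            seen_row_marker = True
        out.append(line)
    return '\n'.join(out)


before = '{|style=blah\n|+ Caption\n!ID\n!abc\n!def\n|-\n| data\n|}'
print(ensure_row_marker_before_header(before))
```

Running it on a table that already has the marker leaves the text unchanged, so it is safe to apply to every table on the page.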

By the way, the merging of columns required by your later proposals won't be possible via the editor. Let's just go with the original format and shunt all columns irrelevant to cleanup off to the side. ManishEarthTalkStalk 12:12, 9 November 2011 (UTC)

Multiple inheritance

Since I added a {{copyvio}} to Multiple inheritance some days ago, an editor has very politely asked me to "unblock" the article because they need it to complete an assignment. The problems with that article are possibly not only due to the IEP - I suspect it was in bad shape beforehand - but I am quite reluctant to remove the copyvio tag. I still have serious doubts about it. I have no objections if someone else would do so, as long as the sourcing is improved. -84user (talk) 12:05, 7 November 2011 (UTC)

Currently, all IEP student edits are supposed to be stopped. See WT:IEP#Relief. ManishEarthTalkStalk 12:44, 7 November 2011 (UTC)
Apparently this student's deadline is the 10th. There's a temporary workspace available at Talk:Multiple inheritance/Temp which can be used as the foundation of a copyvio free article if the student competently rewrites the content in his own words. This is mentioned in the copyvio template. MER-C 13:00, 7 November 2011 (UTC)
I think that they aren't allowed to even edit sandboxes (and the temporary workspace should fall underneath that category). See the Relief section above. ManishEarthTalkStalk 15:15, 7 November 2011 (UTC)
Yes, IEP students from CoEP have been asked not to edit on Wikipedia (main space or sandbox) anymore. If any student from CoEP edits, he/she will show up on this list: http://toolserver.org/~fschulenburg/student-o-meter.php and the Campus Ambassadors will get in touch with them and ask them personally not to edit for the time being. I'm keeping a close watch on this list and as you can see no student from CoEP has edited in the past 24 hours. Nitika.t (talk) 09:10, 8 November 2011 (UTC)
Nitika, you seem to trust the database way more than we do... I mean, as discussed here on this page, we are still in the process of putting together a complete, accurate and up-to-date master database (Wikipedia:India_Education_Program/Students) from the snippets of incomplete, outdated and faulty info distributed all over the place. In the past days, several missing students and articles have been added to this list, and if you check the students' list of contributions, the list of articles actually edited is still much larger than currently reflected in the master list. We are doing this because we cannot start cleaning up systematically without the data. If you already have more complete data then please, by all means, provide it (merge it into the master table, not any other place), because any serious investigation and cleanup effort must fail if the underlying data is incomplete.
I just had a quick look at the student-o-meter (displaying events back to 1:40, 7 November 2011 right now). It missed for example RAJATPASARI (t c), one of the editors of "Multiple inheritance", who has edited WP on 2011-11-07T08:11:42 (although harmless, please don't penalize him for that).
Also, there is still significant IP activity (which can be traced back to Pune) in various IEP articles. Obviously some students continue to edit articles as IPs now that they have been disallowed to edit under their normal accounts. Does the student-o-meter also trigger on IP addresses?
(On a different note, I'm not sure it was really a good idea to disallow any edits under their student accounts. They should at least have been allowed to continue to edit on talk pages without being penalized, because now they cannot even answer questions raised by the community without risking being penalized. One more thing that IMHO should better have been discussed with the community first.) --Matthiaspaul (talk) 10:15, 8 November 2011 (UTC)
Curiously some of those copyvios date back to 2007. I've reverted back to pre-IEP material and replaced the 2007 copyvio material with an earlier description. See the article's talk page for details.--Salix (talk): 09:29, 8 November 2011 (UTC)

Signpost report

Hello all. I am currently writing a report on the Signpost for publication in eight hours' time, and would welcome any comments and statements from interested parties. The working draft of the story is here; please email if you would like to contribute. Thanks in advance, Skomorokh 14:15, 7 November 2011 (UTC)

I'd like to point out that though Hisham was right about the cultural differences having no impact, (in my opinion) he was wrong about the attitudes not having any bearing on this. I have studied in both countries, so I have observed firsthand the stark differences in the attitudes towards plagiarism. In the US, plagiarism is denounced at an early stage (3rd grade or so), and teachers do penalize offending students. I've even seen teachers paste random chunks of a report in Google to check for plagiarism. On the other hand, in India, I haven't heard a single word about plagiarism yet. Teachers don't bat an eyelid even when the majority of the class submits almost the same content (copied from enwp or the first Google result). Bibliographies usually have "google.com" in them (which we all know is meaningless), and usually no other entries (sometimes you see en.wikipedia.org also). When I first came here and saw this happening, I asked a few students why they plagiarised, and didn't they know that it was illegal? They were genuinely mystified by this.
So I feel that the difference of attitudes has a very big part to play in this whole mess, though we can't blame anyone as such differences aren't obvious unless you have experienced both sides of the coin.
Note: All of the above are my observations and in no way do I mean to generalize for the country as a whole. ManishEarthTalkStalk 15:03, 7 November 2011 (UTC)
I've observed the same thing here in the US; in my area, we have a fairly large Indian, Thai, and Chinese population. None of the aforementioned groups, when they first come over here, have any concept of copyright (although in the case of the Chinese, it's more of a willful, malicious disregard for it; my uncle married a Chinese woman who taught English in China, and she can attest to the problems there). Again, it's anecdotal, but my experience has more or less been the same. Not that it necessarily stops plagiarism from American kids (especially at community colleges, where I tutor now), but it's definitely not so endemic. The Blade of the Northern Lights (話して下さい) 01:37, 8 November 2011 (UTC)
As a teacher and teacher trainer, I have been observing exactly the same phenomena here in Asia for the past 13 years. In universities, plagiarism is often tolerated from the dean down - graduate schools are no exception. Kudpung กุดผึ้ง (talk) 06:24, 8 November 2011 (UTC)

Common format: We need consensus

To be able to proceed in updating WP:IEPS to a common format (for machine parsing etc), we need consensus on the format. The current proposal is these columns (not in this order): ID, Roll number, Real name, Account(s), Article(s), Sandbox(es), Mentor, Approval/Sign, Last change link, Wikiproject review, Online ambassador, OA comments, Cleanup comments, Cleanup status.
The columns irrelevant to the cleanup should be shunted to the end of the table (It's better not to remove them as the columns have some significance to the profs/CAs/etc). For a discussion on what the columns do, please see WT:IEP#Common_format.
A proposed editnotice is under construction at User:Manishearth/Ambassador/IEPnotice. ManishEarthTalkStalk 12:28, 9 November 2011 (UTC)

Hi everyone, I see that there has been a discussion about going through CCI procedures to clean up the articles from the Pune pilot. At the same time, as most of you are probably aware, staff members have already asked various Online Ambassadors - both some India-based ones and some U.S.-based ones - to take on cleaning up these articles (you can see which articles have been assigned to be cleaned by which Ambassadors on this list of students). While not all the Ambassadors have started cleaning up the articles they've been assigned to, many have, and I believe that the cleanup work they've already done has been very valuable in removing poor content from the articles. Yesterday I contacted almost all Ambassadors involved in the cleanup effort to inquire whether they're still available to help if they haven't already started doing cleanup, and whether some of them can take on cleaning up more articles if they've already done some cleanup. For the purposes of cleanup, I would also like to replace the Ambassadors who have little Wikipedia-editing experience with Ambassadors with more editing experience.

Seeing that both a CCI discussion and an Ambassador deployment effort are going on at the same time for the purposes of cleaning up articles, I want to work with you all on coordination to make sure no one is doing repeat work. As I mentioned, some Ambassadors have already put in a lot of time and effort in removing poor content from articles, and I wouldn't want any other community member to have to duplicate this work. The whole reason we assigned Ambassadors to the cleanup effort is to take workload off the larger community's shoulders - the community has had to shoulder too much workload related to the Pune pilot already (I fully acknowledge that and thank you all sincerely for the work you've done), so I would like the cleanup work to be largely shouldered by our Ambassadors rather than by the larger Wikipedia community, to save the community at large this burden. Of course, I think that a cleanup force made up of both Ambassadors and the community at large might be best, since a lot of cleanup work still remains to be done and our Ambassador resources are limited. What I really want to avoid, though, is getting any of us into a situation where we're just duplicating the work that someone else has already done.

So here's my proposal, and I'd like to get everyone's feedback so that we can move forward quickly depending on what we decide together. I say that we use this list of students as the master list that we'll all work off of, and we'll assign all the student articles on that page either to Ambassadors or to community members at large. Whoever is assigned to a student article would be listed in a column next to the article. Many of the student articles on that page have already been assigned to people (in fact many have already been cleaned up by Ambassadors), so we don't have to worry about those. But some articles haven't yet been assigned, and some are currently assigned to people with little Wikipedia-editing experience, so I'd recommend that we divide up these articles among the (experienced) Ambassadors and at-large community members willing to be part of this cleanup effort. This way, someone is "responsible" for cleaning up each article, and it'll be clear who is responsible for which article. Of course, I realize that the list may not be totally complete yet, and may lack some information that is important to the cleanup effort. So what I would recommend there is for us all to work together to update the list to make it more complete, and also change the format of the list (in alignment with some of the earlier discussions on this talk page) to make it more usable. I'd advise against creating completely separate lists/pages because that could lead to confusion and duplicate work.

What does everyone think about this? My basic point here is that I think everyone interested in the cleanup effort - staff, Ambassadors, super nice Wikipedia community members at large - needs to work together and stay in constant communication with each other during the cleanup process; to not do so would be to risk wasting people's time doing repeat work (staff is certainly guilty of this as well, but I want to change that and make the Pune-pilot cleanup effort going forward truly collaborative). Thanks all. Annie Lin (Wikimedia Foundation) (talk) 18:31, 9 November 2011 (UTC)
My experience is that most Ambassadors - even if experienced Wikipedians - are not always knowledgeable about the subject matter of the article they are supposed to review. So I'd really like to see this work being done in duplo. Once by an Ambassador, who can take care of trivial matters like copyright violations, once by an editor who is knowledgeable about the subject and can also take care of any quality issues (i.e. assessing whether the article only needs a small amount of copy editing, or is better off being reverted to its pre-IEP state.) Clearly the latter class of editors are in much shorter supply and less willing to be named as "responsible" for a particular set of articles. —Ruud 19:08, 9 November 2011 (UTC)
  • The IEP CCI hasn't gone live yet. The processes, as far as I can see, are complementary to a certain extent. The full CCI will generate very easy links to all contributions, which will have to be checked. They'll look like this for each of the 800 IEP students. In my view, a final check will still have to be done via the CCI after the dust completely settles regardless of what the Online Ambassadors do. There are several reasons for this:
  1. Students are still editing, even today (see below), even in courses where they have been explicitly told not to.
  2. Students are still adding their user names
  3. Students have gone back and edited after the original copyvio was cleared out and comments made in the status columns. Some of those comments are quite old and/or undated.
  4. Some of the OA/volunteer editor checks are only being made to the listed articles in the tables, which in many cases are inaccurately listed and don't represent all the places the students have actually edited.
When the CCI finally gets underway (and no one has started there to my knowledge), the work the OA's have done and are doing will still have been a huge help. The CCI cleaner-uppers will be able to see from the article's history and cross checking with Wikipedia:India Education Program/Students whether it has already been checked, by whom, and when. They'll see if any copyvio has been removed, whether or not the students have edited it since then, and if the original checker was sufficiently experienced to trust their check. If all the conditions are met, the article can be quickly signed off. If not, then it will need a re-check. It's the only way to do it systematically and thoroughly. At this point there's no use in rushing. It took 3 months to add the copyvio and it may well take another 3 months to get rid of it. But that's OK. Better to do it properly. Those are my thoughts, anyhow. Hopefully MER-C and Moonriddengirl can give us some expert input on this. Voceditenore (talk) 19:56, 9 November 2011 (UTC)
I fully agree with what Ruud and Voceditenore have said already. Any copyvio the OAs and the community can remove now will not be wasted time, but make the more formal and fully recursive CCI approach much easier. And the less time copyvios remain in the articles the better, not only for legal reasons, but also because any other edit on top of copyvio'ed content is at risk of having to be deleted as well - this would be a waste of the precious contributors' energy. Still, I see that a systematic CCI is necessary later on to ensure that we really catch all edits. The master list is still in flux, as are the edits, and even with the help of the student-o-meter and huge watchlists (based on days-old data, though) we will miss at least some activity, and we haven't even started to add all the other articles/sandboxes, which have been edited by the students (and I guess we will see even more once we come around to add IP addresses to the lists as well). --Matthiaspaul (talk) 20:32, 9 November 2011 (UTC)
I may write more later, but just quickly - as far as I am aware, a CCI has been ongoing for close to two weeks. The US Online Ambassadors who were willing to help were asked to conduct a CCI, and speaking for myself that's the way I've approached this. Given that there are a large number of students, but few contributions for each student, checking all of their edits where I have been assigned a student has not been difficult, and I assume that this is the same process that the other "EOAs" have been doing. I'm a bit concerned that there is a lot of duplication of work planned, which is less than ideal. My impression has been that there was a process in place to handle the copyright concerns, but there has been a parallel process being developed here as well, when I'm not yet sure that the parallel process was necessary. - Bilby (talk) 21:07, 9 November 2011 (UTC)
At the risk of reiterating myself - but I'm not sure if I was entirely clear in my first reply - we seem to have at least two problems with the IEP. A copyvio problem and a quality problem. Anything the Ambassadors can do to remedy those quickly is fine, but I think both require the articles to be reviewed more carefully than the Ambassadors can do on their own. —Ruud 21:13, 9 November 2011 (UTC)
@Bilby. This is the CCI. It's an official process. Is that what you've been asked to participate in? It appears that what you've been asked to do is centered on the Student Lists. I'm not sure how clear the WMF is about what an official CCI actually is and may have confused it in their emails with simply "cleaning up copyvio". The CCI isn't due to go live for a while. So any help you give on the Student List page won't be duplicated (per my comments above), unless the students edit their articles after you've checked them. Voceditenore (talk) 22:01, 9 November 2011 (UTC)
I've been involved in quite a few CCIs in the past, and the request from Annie Lin was specifically "look through those students' edits and remove all copyright violations or bad edits in general". There was a lot more detail, but we were being asked to remove copyvio, copyedit if possible, and remove badly written material to the talk page in regard to all of the student's edits on any articles - pretty much the same expectations of a normal CCI, with the addition of the quality requirements. My concern is simply duplication: I get the impression that there is currently a process in place which is doing the job that the CCI is intended to do. But while the people asked over from the OA program have been following that process, a parallel and effectively identical process has also been built. The result is going to be a lot of people doing the same jobs: where copyvio or problems were detected by the OA, it won't be a major issue for someone else to look, see it is fixed, and move on; but where none was detected the second person will have to go through the same process of confirming that it isn't a problem. I suspect, if nothing else, some of that duplication can be reduced if the OA people are asked to replicate what they've done on the CCI process, but doing so will slow things down.
The bit which confuses me is that, in reading the above, it seems that it has been stated that there is a formal process that seems to be being used to fix the problem, but that nevertheless a second formal process has also been developed. - Bilby (talk) 22:56, 9 November 2011 (UTC)
I didn't set up the CCI, but my impression is that it will serve as a final mopping up and check, after the whole thing has well and truly finished, not a duplication of what the OAs are currently doing. I would have thought that CCI investigators will cross check with the comments on the student lists. If the OA says they've checked for copyvio and haven't found any and the student hasn't edited the article since their last check, then the article(s) would be signed off without further checks. Voceditenore (talk) 23:06, 9 November 2011 (UTC)
That sounds like a good idea to me. I have absolute confidence, for instance, in Bilby's work in this area. :) Many of the OAs working on this are familiar to me. I don't know that much about bots and what they can do, but I know for instance that MER-C does. :D Would it be possible for us to cross-check the CCI against the OA cleanup chart? If not, I can do that by hand, since there's not a whiff of content management there. It's clerical stuff, and I stand ready to serve. --Moonriddengirl (talk) 11:22, 10 November 2011 (UTC)
I echo this comment made by Voceditenore. The CCI is absolutely necessary because the OA cleanup shares the same gross organizational incompetence (not the OAs' fault) as the whole project. The WMF did not even attempt to coordinate before throwing the OAs at the student list despite the CCI being brought up in the office hours. The CCI may also serve other purposes when the need arises. MER-C 09:01, 10 November 2011 (UTC)
Bilby, what we are preparing here right now is a solid database on which cleanup work of any kind can be based. Unfortunately, little thought has been put into the data aggregation and representation by those who ran the programme. Perhaps nobody could predict the lack of discipline of the students even to add their valid account and other data to the online course lists, or the lack of interest of the local people to verify, enforce and correct the online data earlier on in the timeline, or it simply was not seen that an accurate database would become important for any maintenance or controlling tasks alongside the IEP; I don't know, and will leave that to a post-mortem analysis. However, the organization of the data was (and still is) fundamentally flawed and that's how we ended up with dozens of half-maintained lists with similar but not identical contents, which need to be carefully merged and synchronized again. Our current efforts don't focus on correcting the structural mistakes already made, because it would require more work. Our focus now is on the cleanup, not the development of a proper technical infrastructure to base future programmes on - nevertheless, some of the stuff reshaped now may also be a good starting point in the future.
Anyway, even in the past few days we still stumbled upon several participating students and lots of articles not listed previously. We found them by reverse lookup from article space to user space and recursive backtracing of "suspicious" edits in the page history of articles, talk, and user pages. So, any cleanup efforts so far would have missed them. That's why we'll need the CCI, which will systematically and fully recursively scan and (re-)check any of the articles touched by any of the students. If they are found to be carefully cleaned up already, no further work is required. If not, the CCI will have to deal with the left-overs of any prior cleanup efforts. So, (almost) no work will be doubled, and if the WMF now wants the OAs to clean up before the formal CCI will come around, this is perfectly fine as well, for as long as it is documented what has been done. It does not interfere if we work on the same online data base (the master list). --Matthiaspaul (talk) 12:14, 10 November 2011 (UTC)
Sorry, I may be giving the wrong impression. I'm not against a CCI - I want this cleaned up too - but I'd like to avoid duplicating effort, mostly because I know of the backlog that is at CCI (through lack of volunteers, more than anything - CCI work is labour intensive, difficult and unrewarding, but essential, hence my immense respect for people like MER-C and Moonriddengirl, along with all involved). So if there's a way of combining efforts it would work out better. The benefit I see from the current Emergency OA model is that it is proactively seeking people to work on the problem, which is a good move in my eyes from those involved in the IEP. Capitalising on that would be both a way of combining efforts and, when this occurs in the future, having the beginnings of a possible response model. (I say that simply because, as an educator who works with hundreds of tertiary students a year, I know the likelihood of copyright problems in any program in any country, even though I happily accept that the rewards outweigh risks). At any rate, that sounds like the plan, so all is good. - Bilby (talk) 12:47, 10 November 2011 (UTC)

@Bilby: The CCI won't be a duplication of effort. Our cleanup will involve checking the IEP edits to a particular page, determining if it is a copyvio (and fixing bad grammar), notifying the student, and recording our actions in the master table. The CCI will look at the resultant article (after our cleanup), do a check for remaining copyvios, and they will also check our actions (Which are recorded with diffs, so it will be quite quick). This level of redundancy (We'd probably go for more redundancy if we weren't short of volunteers) is required to make sure that every single copyvio is removed. The reason that we aren't just doing a CCI (which is rigorous enough to not need any more redundancy), is that (like you mentioned), CCI has a lack of volunteers (And will thus take a very long time to go through ~1000 articles). We need to get rid of these copyvios ASAP (Remember, they're live.. We can't have that at all..). The best thing to do is attempt to clean it up as much as possible in a short time so that the copyvios are no longer live, and the CCI has a much easier job to do. The CCI will basically 'clean up the cleanup', by picking up any stray stuff we've missed (or wrongly blanked).

@Voceditenore: Not all of the OAs are experienced enough for the community to be sure that "if an OA says it's OK, it's OK" (At least, that's what it looks like from older discussions on this page). The CCI investigators are experienced in the field of identifying and addressing copyvios, so it would be better if they checked the OA or other cleanup volunteer actions also. We should provide diffs of all cleanup edits in the master list (comments section) to make life easy for the CCI.
@Moonriddengirl: It shouldn't be too hard to cross-check. In fact, MER-C has written a bot that imports a list of students/articles and generates a CCI. Unfortunately, we don't have a complete list of students/articles yet. Aside from that, it's not machine readable (Because of the various formats). Once that is done, MER-C can bot-create the CCI, and since the CCI will be bot-created, it, too will have a regular format which can be broken down for comparison with the OA cleanup (I doubt we will have to do this, though, if the CCI is started after the OA cleanup is finished). ManishEarthTalkStalk 15:25, 10 November 2011 (UTC)

Oh, by the way, Voceditenore, this isn't exactly the CCI. It's just a page for testing MER-C's CCI creator. (See WT:IEP#Contributor_surveyor_finished) ManishEarthTalkStalk 15:28, 10 November 2011 (UTC)
Yes I know not all of the OAs are experienced enough for the community to be sure that "if an OA says it's OK, it's OK". I think I mentioned even further up that we needed to cross-check how experienced the OA cleaners were, and re-check their work if in doubt. This was especially true of almost all of the "out of process" OAs appointed by the IEP which I had pointed out at Meta a week ago. (I believe the bulk of them were finally removed yesterday.) But even some of the ones from the "normal" US OA program are inexperienced with doing this work, and frankly, the work itself is so mind-numbing that it's almost impossible not to "glaze over" after a while. Voceditenore (talk) 16:13, 10 November 2011 (UTC)
@ManishEarth: I'm getting two stories here, which is confusing. However, redundancy is good, but I don't see it as a plus here: a CCI is going to have a hard time going through all of the relevant articles, and the work involved to recheck something already checked is going to be considerable. Redundancy is great when you can afford it, but if you can't afford to lose time doing things twice, it is normally better to work out a means of focusing on the relevant problems so that duplication is removed. Especially with some big CCIs floating around that will also need to be addressed elsewhere. - Bilby (talk) 21:15, 10 November 2011 (UTC)
Let me make an attempt at summarizing the points made by various people so far on this topic: the CCI process is going to be a separate cleanup process from the Online Ambassador (OA) efforts. The OA efforts have already been underway and are continuing to operate, and are focused on not only removing copyright violations, but also removing unsourced content, very bad English grammar, and other hard-to-fix poor content in general (these are the instructions that the OAs participating in the cleanup have received). The CCI process has not yet started and it is uncertain when exactly it will start; when it does start, it will involve community members with CCI experience in general, and it will focus mostly on getting rid of copyright violations. User:Bilby is expressing concern that the CCI process will duplicate a lot of the work that the OA efforts already did; a few other people on this page are saying that the duplication will be minimal, and others are saying that the duplication might be necessary for adequately removing poor content coming out of the Pune pilot. There appears to be consensus on the point that even though the CCI process will take place sometime down the line, the current OA efforts are still needed and valuable because the copyvios and poor content need to be removed as much as possible immediately.
Is this an accurate summary? Please correct me if I am mistaken.
Building off of this, I have a few points and questions:
  • I want to echo User:Matthiaspaul's point above that it is crucial for everyone involved in cleanup - whether via the CCI process or the OA efforts - to be working off of the same database, namely this master list of students. This will minimize the amount of duplicate work, because OAs are using that student list and leaving comments there after they've cleaned up articles. I understand that the master student list might still be incomplete - some professors in the Pune pilot have still not yet provided their students' usernames and article information, so more information continues to be added to that list. Furthermore, the list could perhaps use some format improvements (like additional columns or beautification in general). It is therefore important that all of us continue to update and improve the student list, but we should all work off this same list instead of creating any separate databases/lists for the duration of this OA and CCI cleanup.
  • I'd like to call on the community at large to join the OA efforts. The OA efforts - despite their (informal) name - are not restricted to OAs (and our OA resources are limited anyway!). I basically see the "OA efforts" as the currently ongoing efforts to remove poor content from the Pune pilot student articles before the CCI process does a second cleanup later on. So if any community member has some time to help out with cleanup this week or next week, I would really, really appreciate it if you could go to the master student list (linked above), put your username in the "Online Ambassador" column next to some student articles that currently have "IN NEED" next to them, and help clean up those articles (remove copyvios, unsourced content, bad grammar, poor content in general). We can use all the help we can get right now!
  • I understand that some folks are concerned about how much the OAs can be trusted with doing cleanup. Following some discussions on this topic on the relevant Meta talk page, I've removed almost all OAs who have little prior Wikipedia-editing experience from the cleanup effort. But there still seem to be concerns about how much to trust the OAs. I would really like for all of us to come together behind the OAs and support their cleanup work rather than cast doubt on it. Like what many at-large community members have (very generously) been doing, the OAs have already been doing highly valuable work in cleaning up those student articles, and I believe all these people deserve applause rather than suspicion for that work. I think it is unhealthy to cast doubt on people who are completely good-faith members of the cleanup efforts - it is unhealthy not only because I think we should build our relationships on trust (and "assume good faith") rather than mutual skepticism, but also because to say that "we can't trust the OAs and therefore in the CCI we'll need to re-check everything the OAs did" would just lead to a lot of unnecessary duplicate work. So my suggestion is for us all to get behind the OAs (and to get behind the CCI folks when they start the CCI process down the line), and for everyone in the cleanup effort - OAs, CCI people, etc. - to operate on mutual trust. Now, if there are particular OAs who you think should not be part of the cleanup effort because you think those OAs for some reason don't have the right qualifications, I encourage you to please indicate (soon) who you think those OAs are who should be removed from the cleanup effort (please also provide good reasons for why you'd like to see those particular OAs taken off the cleanup effort), and then we can talk about it and take off those OAs who actually do not meet the "right qualifications" for doing cleanup.
I certainly think that possibly some OAs aren't experienced enough with copyright issues on Wikipedia to help with cleanup, and in that case we probably should take them off this effort. But for all OAs who remain in the cleanup effort, I'd highly recommend that we assume good faith and assume competency, just as we'd do the same for other Wikipedia community members at-large. Annie Lin (Wikimedia Foundation) (talk) 23:31, 10 November 2011 (UTC)
  • Annie, this has nothing to do with not assuming good faith, and I would appreciate it if you didn't put it in those terms. Furthermore, it has nothing to do with the editors' "trustworthiness" as people and no one has implied that. It has to do with the level of experience in a very specific task. Any conscientious editor would welcome a back-up check when they're operating in a wholly new area. I know I would.
Two days ago I left some tips for a very experienced editor and Online Ambassador for the US program (13,000 edits, 43 articles created) after seeing his message on another editor's talk page:
"I have no previous experience with systematically looking for copyright violations. I am wondering if you could give me a few tips about how I could be helpful in cleaning up the messes."
I'm sure he'll do fine now, but I'm wondering if the email messages sent to the OAs gave them tips and advice on how to do an effective copyvio search? If not, it might help to send one around rather than waiting for them to ask. There may be others in a similar position which is why they've been slow to get started.
On another note, if you're looking for more help in the current process you've set up, I suggest you reach out to the subject-specialised WikiProjects. If you leave a note on the project talk pages, you may be able to recruit not only experienced editors, but ones with specialist knowledge of the subject area and access to offline sources. Specialist knowledge is a big help in "fixing" articles. I've cleaned quite a few from the CoEP, but on several occasions the English was so garbled and my subject area knowledge so poor, that I had no idea what the students were trying to say. Thus, I couldn't adequately repair the article apart from removing any traced copyvio. I suggest you reach out to the following if you haven't already done so:
WikiProject Business, WikiProject Economics, WikiProject Engineering, WikiProject Technology, WikiProject Computer science, WikiProject Computing
Having said that, make sure these editors have a place to make their comments. The following either have no table at all, or tables without places for the OA/reviewer's name and comments: History of Economic Thought Year 2 Group A (section), Agribusiness and Marketing Year 2 Group A (section), History of Economic Thought Year 2 Group B (section), Agribusiness and Marketing Year 2 Group B (section), Research Methodology Year 3 Group B (section).
Best, Voceditenore (talk) 10:05, 11 November 2011 (UTC)
I feel the same way as Voceditenore. It's not a question of 'trustworthiness', it's a question of 'experience'. It's perfectly fine if the OAs do the cleanup (along with the rest of the community), but a CCI would be necessary as a final, reassuring check.
We should also start to update the tables to the format given above. (I'll do some now if I get the time) ManishEarthTalkStalk 12:34, 11 November 2011 (UTC)
I've updated the first table on the list to the new format (More cleanup-centric). Please use the same column order when updating others. ManishEarthTalkStalk 14:02, 11 November 2011 (UTC)
I've done a few more. Please revert if you feel that we need to rethink the new format. ManishEarthTalkStalk 14:29, 11 November 2011 (UTC)
Great! The course sub-pages of the Symbiosis School of Economics and the SNDT Womens University have now all been merged into the master list. I have gone through the change log of the corresponding sub-pages back to 2011-11-02 (and in one case back to September) to check for user name and article changes. The tables on the sub-pages have been deleted afterwards in order to force students to use the master list. So, we'll now have to continue with the more difficult task of converting the COEP tables into the new format as well... I would like to discuss one possible change to the current table order, though. The current order is:
Account(s)
Article(s)
Sandbox(es)
Cleanup comments
Cleanup status
OA comments / Wikiproject review
Online ambassador / Mentor / Campus ambassador
CA comments
ID
Roll number
Real name
Approval / Sign / Last change
And my proposed change would be:
ID
Account(s)
Article(s)
Sandbox(es)
Cleanup status
Cleanup comments
OA comments / Wikiproject review
CA Comments
Online ambassador / Mentor / Campus ambassador
Roll number
Real name
Approval / Sign / Last change
That is, the ID would be in the first column to help identify a particular row. Optionally, the three comments columns would be grouped together, with the cleanup columns coming first so that they still fit on the screen. The SNDT Womens University table is currently in this order. What do you think? --Matthiaspaul (talk) 20:56, 14 November 2011 (UTC)

Good work with the course page cleanup. We should modify the links on the master lists to go from "Courses/Fall 2011/Course#Students" to just "Courses/Fall 2011/Course" (Otherwise we get recursive links). But we can do this after the reformat is over.

@ID column: Hmm, the ID column takes up space, and we don't need it to identify a student. To identify a student, a username is sufficient (with a course name for the students in two courses). The ID column will not be actively used and glanced at. The username/articlename should be visible for quick clicking, and the comments also take precedence.
@CA Comments I'm rather indifferent about that column (doesn't really matter where it goes), because it's empty in almost all the tables. So I'd rather keep it the way it is and spend time in reformatting the newer tables (The ones from COEP are giving a bit of trouble as they have missing cells, etc, which trips up the script and half the time I'm adding workarounds to it). ManishEarthTalkStalk 09:51, 15 November 2011 (UTC)

Perhaps it's a language thing as I don't speak any Indian language, but I find it quite difficult to remember many of the students' account names, so between browsing the rendered page and editing a particular entry, I often use the ID (if available) as a handle to find the corresponding location more easily. --Matthiaspaul (talk) 10:34, 15 November 2011 (UTC)
Hmm... It's then much easier to just look for the article name (or a snippet of it). Or just scroll to the side and see the IDs there. But OK, I'll shift the ID column once I'm done with the rest. ManishEarthTalkStalk 11:19, 15 November 2011 (UTC)

Students are editing

A large number of editors from the Data Structures course are editing again, in article space no less [1]. Why? According to the course page there is some kind of deadline November 10th. —Ruud 18:27, 9 November 2011 (UTC)

But the course page also gives an article edit deadline of 2011-10-25. Really strange. (We also discussed this further up under #Multiple inheritance.) --Matthiaspaul (talk) 19:59, 9 November 2011 (UTC)
The professor in charge of this class is a prior (and existing) Wikipedia editor. He's determined that he wants to carry on the project. He had previously given an extended timeline to his students of November 10 (even prior to our asking all the faculty members to stop the assignment). He's extended it to November 12 now. He's also told them that anyone doing any kind of copyvio (text or image) will lose all 20 marks for the assignment. He has also told his students that anyone whose work has been redirected to Wikibooks should continue on Wikibooks. Lastly, he's told his students that anyone who has page redirects or other such issues can submit assignments to him outside of the wiki. We had spoken to the Director and all the professors (including the one for this class). We will speak to him again tomorrow morning India time and reiterate the reasons for the suspension of the project and the seriousness of the issues around copyvio and other quality concerns. Hisham (talk) 20:21, 9 November 2011 (UTC)
Half a dozen students from Macroeconomics are still editing today, 11 Nov, with at least one still adding copyvio material in the mainspace. Their course page says their deadline is 14 Nov. JohnCD (talk) 22:32, 11 November 2011 (UTC)
Sarangvk (talk · contribs) is still editing heavily. I haven't had a chance to check for copyvios but given the proficiency of the English being used compared to the user's previous text, it needs to be checked. I'll be very busy today and tomorrow and probably won't have time to check but since I thought students weren't supposed to be editing at this point and this student has added mass amounts of content, I thought I should bring it up somewhere. OlYeller21Talktome 15:38, 14 November 2011 (UTC)
At least one article, AK model, is mostly lifted from here as is the caption of an uploaded image, File:Ak model.png. I tagged the article and will check other contributions. Jojalozzo 21:59, 14 November 2011 (UTC)
The edits are continuing: [2] Wikipedia talk:Articles for creation/Interest-free economy, [3]... MER-C 03:41, 16 November 2011 (UTC)

USEP discussion

There's a discussion going on here about the US education programme. It has not had the same problems as the IEP, but it does impact the general community of editors. Those involved here, both as ambassadors and as part of the copyright cleanup effort, may wish to participate in that discussion too. Mike Christie (talk - contribs - library) 01:31, 11 November 2011 (UTC)

Two questions about student contributions

I am working through some student contributions and ran into a couple of things I'd like advice on.

  • Student Mallika.sharma created Wikipedia_talk:Articles_for_creation/Non-banking_Financial_Company; the creation was declined. It was weakly/incompletely sourced. The English is poor enough that I doubt it is a direct copyvio. Is it OK just to leave this alone?
  • Student Abhinav619 created Challenges of inflationary policy in India, and is the author of almost all the text in it. (It should probably be moved to Inflation in India, but that's another issue.) Sources are given but there are no citations, so per Nitika's instructions, if this were not a student-created article I would simply move all the student's text to the talk page with a note that it would have to be cited appropriately. I can't do that in this case; what should be done instead? I'm inclined to move it to Inflation in India and replace all the text with a redirect to Inflation until someone else gets around to writing this article. Any comments on that approach?

Thanks for any help. Mike Christie (talk - contribs - library) 05:47, 12 November 2011 (UTC)

  • Hi Mike. Re Non-banking Financial Company, I've seen a lot worse articles than this passed through AfC. It's a bit of a lottery depending who's reviewing. Having said that, I would just leave this in the editor's user space for them to further improve (if they want) and note on the Clean-up/student list that it has no copyvio.

    Re Challenges of inflationary policy in India, again this is not that bad an article, certainly not so bad as to redirect. I'd leave it in place, move to Inflation in India, add any appropriate maintenance tags, and above all add {{WikiProject Economics}} and {{WikiProject India}} to the talk page as well as {{IEP assignment}}. (There are some specialised banners for individual IEP classes, but this will do in a pinch.) That way we can keep track of the IEP articles after the thing finishes. More importantly, the WikiProject banners give the subject-specialised projects a greater chance of finding it and possibly dealing with it in the perspective of the other articles within their scope. The maintenance tags are useful because most projects prioritise their work via their cleanup lists.

    In general, I think we have to be careful about shooting these IEP articles at dawn, unless they are copyvio (although after a whole day of dealing with them, I get sorely tempted). The only time I remove material (apart from copyvio), redirect, or propose for deletion is if the material is so garbled as to be incomprehensible and requiring a complete rewrite, if it essentially duplicates an existing article without adding anything new or helpful, or if it is dumped into an existing article where it is clearly not an improvement, and in fact a detriment.

    Frankly, I don't agree with Nitika's instructions about putting removed chunks of material on the talk page. It clutters up the talk page and serves no useful purpose. Instead simply note on the talk page that some material was removed and why and then link to the diff, e.g. [4]. Anyone who wants to work on the removed material has access to the history. Best, Voceditenore (talk) 10:04, 12 November 2011 (UTC)

Thanks for that. I left the student a note about it. I see from the student's talk page that their instructor approved the title/subject. Unfortunately this has happened a lot in the IEP. The instructors are approving topics without actually checking themselves that it isn't already covered in an existing article. I wonder if they were told to do this during the training sessions (were the instructors trained at all?) and more importantly, how to search. This has been a particular problem with the IEP because of a shaky grasp of English capitalisation conventions and a seeming lack of acquaintance with WP:TITLEFORMAT and WP:MOS generally. Thus they see Non-banking Financial Company as a red link and simply assume there is no article about it. Voceditenore (talk) 11:04, 12 November 2011 (UTC)
Thanks for the advice above. I have been deleting unsourced material from the students since that was what Nitika's instructions said to do, given the high likelihood of copyvio. If we feel that the upcoming CCI will address those issues I'll leave unsourced material in place from now on -- I don't like to delete material without definitely knowing it's a copyvio, but I could also see why that approach was suggested. If I do keep deleting unsourced material, I agree that a diff link on the talk page is a better approach; I'll do that instead. For Wikipedia_talk:Articles_for_creation/Non-banking_Financial_Company my concern was mostly that it's not in article space -- are these requests typically left sitting in project talk space forever if unsuccessful, or should it be deleted? If it's not to be deleted I will leave a link to it on the talk page of the current article MER-C linked to. Mike Christie (talk - contribs - library) 14:57, 12 November 2011 (UTC)

Note on duplicate students

I've just realized that some students appear in multiple sections of the student page, presumably since they took multiple classes. When I've been investigating a student's contributions I've been going through everything they did and noting all results in the comments column. I'm about to go back through my own updates and copy them to other locations on the page where the student appears. I think it would be sensible for anyone who is checking on the students' work to search for their name on this page to see if it's already been checked by another editor. Mike Christie (talk - contribs - library) 17:23, 13 November 2011 (UTC)

I'm going through and adding my notes in other locations for those students but I'm not filling in the OA comments column if there is an OA name and the cleanup column (not the OA comments column) already has cleanup notes. I assume this is the simplest approach but if I should fill in the OA comments too, let me know. Mike Christie (talk - contribs - library) 17:57, 13 November 2011 (UTC)
If you find duplicate students, just cross-reference them in either comments section (I'm going to later run a check on this and cross-reference all dupes). Currently, we are in the process of bringing that page to a common format. The cleanup comments column is part of the new format, and thus has some rules attached to it. The OA comments column, after the reformat, will be kept for reference, and not edited at all. When the reformat is over, everyone involved in the cleanup will be notified on how to use it. For now, you may just use the OA comments column (add your comments in a bulleted list if there already is something there). If you want to use the cleanup comments column, just check out the proposed guidelines here (not all of it is relevant) ManishEarthTalkStalk 18:47, 13 November 2011 (UTC)
Thanks. I think I'll stick with the OA comments column as I am not experienced at copyvio detection. Mike Christie (talk - contribs - library) 18:48, 13 November 2011 (UTC)

Commons image help

A student uploaded File:Money111.jpg which I am not sure about the copyright status for -- are the images on banknotes copyrighted? Mike Christie (talk - contribs - library) 02:54, 15 November 2011 (UTC)

My understanding is that it depends on the country, but that it is not ok in India [5]. The image itself seems to come from the RBI [6], which claims copyright of their site, so I'd assume the copyright would hold for that image without evidence otherwise. - Bilby (talk) 03:21, 15 November 2011 (UTC)
The Copyright Act in India gives a 60 year copyright to government images, designs, etc. in general. Would need careful reading to ascertain the exact rule position though. AshLin (talk) 03:50, 15 November 2011 (UTC)
I put it up for a deletion discussion at Commons [7]. Even if the banknotes themselves were out of copyright (which they don't seem to be), this is a collage with artistic input in the arrangement and colouring, not a straight-on image of a banknote. Anyhow, the experts there will know what to do about it. Voceditenore (talk) 06:36, 15 November 2011 (UTC)
But wait, there's more! File:Southasia111.jpg and File:SOUTHASIADEV.jpg are suspect imagevios: the gray strip at the top of the latter indicates it's likely to be a poorly cropped screenshot from some internet accessible database. I'm not sure if the data itself may be copyrightable (it depends on how it is compiled) but the presentation probably is.
By the way, doesn't anyone teach these students table markup or how to use an image editor?! This reminds me of a certain type of horrible Youtube video. MER-C 09:13, 15 November 2011 (UTC)

Well, to the newbie, table markup, whether taught or not, is pretty confusing. And I doubt the IEP would put more effort into teaching them table formatting as most of them won't need it anyways. Regarding the images, we'll probably have to look at all the IEP uploads, too, and fix them. Anyways, I replaced the ugly image with a slightly less ugly comp-generated one (I used MS Word.. rather unorthodox, but it's rather useful for quick stuff like this). Ideally, it should be made svg or png, but I'm not going to start doing that. ManishEarthTalkStalk 11:28, 15 November 2011 (UTC)

It's not hard to point out and understand Help:Table or one of those leaflet thingies the WMF are so fond of. The images just keep getting better: File:Bowen's diagram 2.jpg and File:MU and TU of taxation.jpg. Do these students even look at their uploads before (or after) inserting them into the article?! (I won't fix these diagrams as I'm not familiar with the underlying economic models and hence can't tell whether they are correct. The camcorder diagram is just the market for financial capital where D = demand = investment, S = supply = saving and r = the real interest rate. This isn't obvious at all.) MER-C 11:51, 15 November 2011 (UTC)
OK, those are terrible. And not fixable. And shouldn't be on WP in the first place. OK, they are fixable if you know enough about the particular subtopic (I'm not). I really see no reason why someone would draw a diagram (I hate doing that), when their computer can draw the straight lines for them. Regarding the WP:Table, I seriously doubt that the students would read that if it was given in a pile of stuff along with WP:V/WP:NPOV/etc (I don't think the students thoroughly read these, either). ManishEarthTalkStalk 12:42, 15 November 2011 (UTC)

Mailing list discussion

See here. Starts in the message "Death and Post-mortem of Indian Education Program pilot -- #DelayedMail" by Srikanth Lakshmanan. It includes this scathing criticism of the WMF, which was forwarded to foundation-l. Enjoy. MER-C 05:14, 16 November 2011 (UTC)

The worrying thing is that some are thinking of repeating the process [8]. If it is to go forward I'd recommend using Wikipedia:Articles for creation as a way to minimise the damage. That's the on-wiki process for new articles by new editors.--Salix (talk): 11:52, 16 November 2011 (UTC)
The poster of that message is one of two people who are responsible for running this program. I prefer an external sandbox because the IEP project may crowd out other newbies at AFC via backlogs. (AFC is currently backlogged). MER-C 12:00, 16 November 2011 (UTC)
I would have liked to see the IEP next semester planning discussion started on en-wiki at the same time, given the impact it has had. I've left a note for Annie Lin to that effect. Mike Christie (talk - contribs - library) 12:47, 16 November 2011 (UTC)
Just chiming in, I think that it's fine to repeat the IEP, as long as they do it at a very small scale, and keep the community in the loop (and well involved in all the planning processes). There should be at least one community member on-site, who has lots of experience (WikimediaIndia should have quite a few such contributors). ManishEarthTalkStalk 13:21, 16 November 2011 (UTC)

The thing that keeps making me facepalm, and it's been said both in that mailing list thread and in various other places where IEP was discussed, is this concept of "well, we don't have enough qualified mentors to assist the number of students we want to include. I know, let's bring in mentors who have even less experience than the students!" In what universe is that the choice to make, rather than "Ok, we only have enough competent mentors for X students. I guess we only have room for X students this time around. Maybe we'll gain more mentors as time passes"? Why would you supplement the "workforce" with people who patently don't know how to do their jobs, rather than just cutting down the number of students, especially in a pilot program where the goal is to test how things work?

I get that a whole lot of things went wrong, from a whole lot of different causes, in this program. But the desire to slap inexperienced people, many of whom had never even edited Wikipedia, into advisory and leadership roles for this program strikes me as one of the worst, especially when they had, according to those emails, eager local community members available who were being shut out in favor of "ambassadors" who didn't understand Wikipedia. A fluffernutter is a sandwich! (talk) 15:49, 16 November 2011 (UTC)

Hi everyone, I posted this on my user talk page too because Mike left me a message about this topic, but I'll repost what I said here as well:
The local India staff team members (Hisham, Nitika) and some of the San Francisco-based Global Education staff team members (myself included) had a long, in-depth meeting yesterday to talk about the future of the education program in India. One thing we talked about at length is whether to continue any in-class activities next semester (spring 2012), or instead to focus in the spring on doing post-mortem analysis and wait until after spring to start working with any classes. As you said, there are big risks to running in-class activities before we have adequate time to make a thorough analysis of what exactly needs to be changed to make the program in India more successful, and we discussed these risks at length during our meeting yesterday, with many people arguing for waiting until at least June before working with any more classes. We decided that Hisham and Nitika will make the call (soon) on whether or not any in-class activities will take place next semester based on these discussions, since they are the people who run the program in India. So we'll have more exact updates on that afterward, but rest assured that we share your concerns 100%!
Everyone was also in agreement that the post-mortem analysis and the planning process absolutely need to be a dialogue between the Wikimedia Foundation and the community. One of the mistakes we made this past semester was that we did not involve the community sufficiently in the planning, and we definitely want to change that. Various community members have been involved in the Pune pilot (thank you all for your help) and have a lot of knowledge at this point about what the outcomes and challenges of the pilot were and how those affected the larger English Wikipedia, so I think any analysis and planning process in the coming months will be inadequate if these community members are not an active part of the conversation. We'd like to use a variety of communication channels for the analysis/planning since each channel has its pros and cons. So, be on the lookout for that soon as well! Annie Lin (Wikimedia Foundation) (talk) 20:15, 16 November 2011 (UTC)
"One of the mistakes we made this past semester was that we did not involve the community sufficiently in the planning, and we definitely want to change that."
I just wish I had more faith in WMF having recognised the truth of the first part, and more hope that they'd take a useful approach to the second. So far though, nothing about this whole mess gives me any confidence. Andy Dingley (talk) 20:33, 16 November 2011 (UTC)
Hear hear. MER-C 09:12, 18 November 2011 (UTC)

I was one of those participating in that thread and other forks about the IEP. The one thing that consistently stands out is that they carefully avoid acknowledging a) that they need to scale down b) that the campus ambassadors must have more editing experience. After a while the discussion bogged down to the number of edits a campus ambassador has and became ugly. The horde of CAs that descended into the discussion repeatedly claim that the pilot was a success (a view shared by Sue Gardner). The CAs and the WMF IEP team still refuse to acknowledge that they need more article editing experience in en wiki to handle such a program.

Even after all the heat, I am not surprised to hear "Hisham and Nitika will make the call (soon) on whether or not any in-class activities will take place next semester based on these discussions". The very possibility that "in-class activities" may continue next semester with exactly the same setup despite all that has happened shows how wrong WMF's attitude toward the program is. I am reproducing Kudpung's table on CA edits from meta.

CA edits
User | 1st edit | Total edits | Mainspace
en:User:Gsinghglakes | 18 September | 323 | 21
en:User:Ramshankaryadav | May | 696 | 41
w:en:User:Seva.panda | 3 June | 14 | 1
w:en:User:Arnavchaudhary | June | 117 | 23
en:User:Wasimmogal2007 | 12 Sept 2010 | 316 | 6
w:en:User:Pallaviagarwal90 | 28 August | 109 | 10
w:en:User:Mihir.khatwani | June | 240 | 20
w:en:User:Tambeparag | July | 171 | 86
w:en:User:U.raghavendra | June | 39 | 6
w:en:User:AbhiSuryawanshi | May | 343 | 59
w:en:User:Rangilo_Gujarati | February | 1,210 | 206
w:en:User:ALX999 | May | 79 | 8
w:en:User:Mihir_Kelkar | 31 August | 9 | 3
w:en:User:Pratiklahoti8004 | July | 532 | 51
w:en:User:Gunit31 | August | 137 | 28
w:en:User:Devanshi_tripathi | August | 571 | 278
w:en:User:Anurag_acj | 25 July | 128 | 22
w:en:User:Vaibhavchandak | 28 July | 172 | 43
w:en:User:User:Debastein1 | 24 July | 838 | 133 (user 244)
w:en:User:Vedantgupta7890 | 29 July | 117 | 62
w:en:User:Minakshinajardhane | 27 July | 128 | 28
w:en:User:Nikita.agarwal | 21 August | 137 | 51
w:en:User:Shefalinaik | 28 July | 73 | 19
w:en:User:Roshnisaigal | 30 July | 188 | 96
w:en:User:Ishu.aghav | 3 September | 28 | 6
w:en:User:Arjunmangol | 30 July | 302 | 45
w:en:User:Tb0412 | 8 July | 93 | 39
w:en:User:Kumarvikramsingh | 6 September | 29 | 3
w:en:User:RDebashruti | 21 August | 29 | 2

These are the sort of people the WMF offers as a solution to handle an ever expanding education program.

Despite all the issues we have raised and the large number of regular editors who have spoken out, the WMF a) refuses to acknowledge outright that the IEP pilot was a disaster b) is thinking about continuing the same program for the next semester c) still thinks throwing more inexperienced CAs into the program will take care of things.

Unless something drastic happens from the en wiki community side, like an RFAR MER-C mentions or a blanket ban on student articles, I have no faith in the WMF's ability to self-correct. --Sodabottle (talk) 10:11, 18 November 2011 (UTC)

  • The problem wasn't with the CAs themselves. The problem was that the organizers expected them to do what CAs are not meant to do, regardless of how much experience they have. I'm not sure what the thinking was behind that, especially given how inexperienced they were. The real problem was that the IEP organizers (somewhat late in the day) recruited "out of process" Online Ambassadors, the majority of whom were completely non-participative, and worse, even less experienced than the CAs! It was claimed that these IEP OAs had been trained. I'd be curious to know by whom and on what. Three of them had added copyvio to WP, for one thing. Another three of them are listed on the IEP course pages with no link whatsoever to their user account, if they even had one. Then after three months of chaos, when it was plain that the vast majority of these OAs weren't editing on Wikipedia at all, let alone mentoring students, they were assigned to clean up the copyvio. Note that I'm not referring here to the US CAs who were brought in later to help with the copyvio clean-up. Voceditenore (talk) 11:11, 18 November 2011 (UTC)
We (IEP OAs) were given a roughly one hour long lecture focused primarily on the goals/structure of the program and another hour long IRC session in which there was some role playing of assisting students. I'd be happy to forward the lecture notes to anyone interested. I'd also be happy to pass along the emails we received. I think they would be instructive to anyone wishing to analyze how and why this program failed. Danger High voltage! 15:35, 19 November 2011 (UTC)