User:RossO/sandbox/Current Events Portal Archive Cleanup

The purpose of this project is to implement a layout technique for the Portal:Current events archives and constituent components that improves (or creates) a mobile-friendly layout. The pages impacted are the 'Month pages', the individual 'day pages', the Calendar elements and the Sidebar elements.

Survey 1 - Month Pages (September 2017)

There are a variety of inconsistencies in the Current events portal archive pages. Here's a quick list of what I found on my first survey:

Part 1. Unnecessary TOCs

Page showing TOC | Reason | Action to take
Portal:Current events/January 2001 | External References Section | Removed
Portal:Current events/July 2001 | Film release dates Section | Move to individual days
Portal:Current events/October 2001 | Film release dates Section | Move to individual days
Portal:Current events/January 2002 | Topics in the news in January 2002 Section | Move to sidebar
Portal:Current events/October 2002 | See Also Section | Move to sidebar
Portal:Current events/November 2002 | See Also Section | Move to sidebar
Portal:Current events/March 2005 | News collections and sources Section | Removed
Portal:Current events/July 2005 | News collections and sources Section | Removed

Most pages do NOT show a TOC. On the pages listed above, a TOC is triggered by an extra section, which I've noted as the reason. Most of these additional sections could be moved to a Sidebar box, integrated directly into the listings of the individual days (film releases), or removed entirely.

Part 2. Migrating contents to day-specific pages

These were the months targeted for migrating in-page contents to externally included content (day-specific pages):

2003: Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec
2004: Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec

These pages seem to have broken the assumed configuration in which day sub-pages are included into the primary month page. This will likely take a lot of work to complete. The contents of each day for each month will need to be copied into a new page for that day following the strict naming format.
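As a rough illustration of the naming format, the sketch below lists the day-page titles that one month needs. The title pattern "Portal:Current events/YYYY Month D" is an assumption inferred from the day-page URLs that the bookmarklets in the next section parse, and dayPageTitles is a hypothetical helper, not part of any existing tool.

```javascript
// Hypothetical helper: list the day-page titles needed for one month.
// Assumption: day pages are named "Portal:Current events/YYYY Month D".
const MONTH_NAMES = [
  "January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December",
];

function dayPageTitles(year, monthName) {
  const monthIndex = MONTH_NAMES.indexOf(monthName);
  if (monthIndex === -1) {
    throw new Error("Unknown month: " + monthName);
  }
  // Day 0 of the following month is the last day of this month.
  const daysInMonth = new Date(Date.UTC(year, monthIndex + 1, 0)).getUTCDate();
  const titles = [];
  for (let day = 1; day <= daysInMonth; day++) {
    titles.push(`Portal:Current events/${year} ${monthName} ${day}`);
  }
  return titles;
}

// Example: print the titles for June 2004.
console.log(dayPageTitles(2004, "June").join("\n"));
```

Across the 24 months of 2003 and 2004 this comes to 731 day pages, which gives a sense of the work involved.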

Technique: Saved Tabs and Bookmarklets

For each month I follow a process of opening tabs, running two bookmarklets on each tab and then copying and pasting the contents of the month page from a text file. When working on June 2004 I use the following steps:

  1. Launch a pre-saved set of 32 tabs that are saved in my browser's bookmarks. The bookmarks are the January 2003 archive page, plus 31 day-specific tabs. Chrome allows me to open these up in one click.
  2. Edit the first bookmarklet: javascript:location=location.href.replace(/Portal:Current_events\/Edit_instructions/g, "").replace(/2003/g, "2004").replace(/January/g,"June"); and replace the year and month as necessary. It sits on my bookmarks bar for easy access.
  3. Place the mouse cursor over the bookmarklet and rapidly alternate between clicking the mouse and keying control-tab to apply the bookmarklet to each tab.
  4. Edit the second bookmarklet: javascript:(function() {day=('0'+(location.href.split('_')[3].substr(0,2).match(/\d+/)[0])).slice(-2);document.getElementsByTagName('textarea')[0].value = '<!-- All news items below this line -->{{Current events header|2004|06|'+day+'}}\r\n\r\n\r\n<!-- All news items above this line -->{{Current events footer}}';document.getElementById('wpSummary').value='Content migrated from Month page';})(); and replace the year and month as necessary. It sits on my bookmarks bar for easy access. This pre-fills the content of the page with the appropriate template calls and sets an appropriate edit summary.
  5. Place the mouse cursor over the bookmarklet and rapidly alternate between clicking the mouse and keying control-tab to apply the bookmarklet to each tab.
  6. Cut all of the daily contents from the source of the month page and paste them into a text editor. To clean up the bullet points, search for the regex \n\*(\ ?)(\*?)( ?)( ?) and replace it with \n*\2.
  7. Scan the entries for obvious vandalism and sanity checks. Edit the entries as needed.
  8. Cut and paste the daily content into each day's tab's content editing box.
  9. Place the mouse cursor over the 'Save' button and rapidly alternate between clicking the mouse and keying control-tab to save each tab.
  10. Update the month page edit summary to Contents migrated to day pages and click save.

This process usually takes 10-15 minutes depending mostly on how much time is spent in the editing step.
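The text-handling parts of steps 4 and 6 can be sketched as two small functions. This is only an illustration of the same transformations, not the bookmarklets themselves: cleanBullets and buildDayPage are hypothetical names, and placing the day's items between the two HTML comments is an assumption about where the pasted content ends up.

```javascript
// Step 6: normalize bullet markers with the same pattern as the
// search and replace above ("$2" is JavaScript's form of "\2").
// Like the original pattern, it only matches bullets that follow a newline.
function cleanBullets(text) {
  return text.replace(/\n\*( ?)(\*?)( ?)( ?)/g, "\n*$2");
}

// Step 4: wrap one day's items in the header and footer template calls.
// Assumption: the items belong between the two HTML comments.
function buildDayPage(year, month, day, items) {
  const mm = String(month).padStart(2, "0"); // zero-pad, as the bookmarklet does for the day
  const dd = String(day).padStart(2, "0");
  return (
    "<!-- All news items below this line -->" +
    `{{Current events header|${year}|${mm}|${dd}}}\n` +
    items.trim() + "\n" +
    "<!-- All news items above this line -->{{Current events footer}}"
  );
}

// Example with made-up entries in three inconsistent bullet styles.
const rawItems = "\n* First item\n* * Nested item\n**  Another nested item";
console.log(buildDayPage(2004, 6, 1, cleanBullets(rawItems)));
```

The cleanup collapses "* ", "* * " and "**  " down to "*" and "**" with no following space, which is what the regex in step 6 produces.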

Survey 2 - Month Pages Continued (September 2017)

Day 1

Using a curl command, I pulled all of the Month archive pages into a single text file. While I found an amazing amount of consistency, I also found a few items that were worth cleaning up. I will tackle these manually, in advance of replacing the month page contents with a hyper-consistent template or Lua script that will generate the contents.

  1. Most pages have '''[[August]]''' '''[[2016]]''' was the… but some have '''[[December]] [[2011]]''' was the…
  2. February 1999 notes "There were no full moons in this month."
  3. First and last months of each decade are noted.
  4. A minority of months have "(See Holidays and observances, on sidebar at right, below)" or similar which we may want to remove.
  5. The following tags (I don't recognize) are on some pages: [[br:2008#Du]], [[br:2009#Du]], [[br:2010#Du]], [[br:2011#Du]], [[es:2003#Enero]], [[es:2003#Febrero]], [[es:2005#Enero]], [[es:2005#Febrero]], [[ru:2002 год#Февраль]], [[ru:2003 год#Февраль]], [[ru:2005 год#Декабрь]], [[ru:2006 год#Март]], [[sv:2003#Februari]], [[Independence Day]]:. I will identify these and see if they need to be kept or removed.
  6. Pages in 1997-1999 have [[Category:Months in the 1900s|*199#-##]]. This may be taken out or extended to the rest of the months, including Dec 1996.
  7. Some months do not have a call to a sidebar template
  8. {{commons category|April 1997}} was used up through 2013. 2014 and after use {{Commons|April 2014}}. Will investigate proper usage.
  9. {{Events by month|2016|prefix=Portal:Current events/}} seems to be missing for 2017.
  10. Month-specific sidebar templates such as {{Portal:Current events/August 2010/Sidebar}} are used from January 2001 onward.
  11. A variety of earlier months list holidays. These should be moved to sidebars. They can be removed if judged unnecessary at that time.
  12. Standardize the varieties of style="vertical-align:top;", style="vertical-align:top;width:250px", style="vertical-align:top;", style="vertical-align:top;width:250px", valigh="top", valign="top", style="width:250px"
  13. Standardize the varieties of {{col-2}}, {{col-begin}}, {{col-end}}, {{colbegin||25em}}, {{colbegin||27em}}, {{colbegin||30em}}, {{colend}}, {{colend}}<!-- DO NOT INCLUDE MUSIC ALBUMS, MUSIC FESTIVALS, VIDEO GAME RELEASES OR YOUR BIRTHDAY -->, {{reflist}}, {{reflist|30em}}, {{reflist|colwidth=30em}}
  14. Some months have preamble text like "The month was marked by…" which will be moved to a sidebar box.
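Several of these items are mechanical substitutions. As one example, item 1 could be normalized to the majority form with a single regular expression. The sketch below is illustrative only; normalizeOpening is a hypothetical name, and it assumes the minority form always appears exactly as quoted in item 1.

```javascript
// Hypothetical helper for item 1: rewrite '''[[Month]] [[Year]]'''
// into the majority form '''[[Month]]''' '''[[Year]]'''.
function normalizeOpening(wikitext) {
  return wikitext.replace(
    /'''\[\[([A-Z][a-z]+)\]\] \[\[(\d{4})\]\]'''/g,
    "'''[[$1]]''' '''[[$2]]'''"
  );
}

// Lines already in the majority form are left unchanged.
console.log(normalizeOpening("'''[[December]] [[2011]]''' was the"));
console.log(normalizeOpening("'''[[August]]''' '''[[2016]]''' was the"));
```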

Process - curl command used for survey

In order to survey the Month pages, I used the following command in a terminal session to collect all of the contents into a single text file.

curl "https://en.wikipedia.org/wiki/Portal:Current_events/{January,February,March,April,May,June,July,August,September,October,November,December}_[1996-2016]?action=raw" >> archive.txt
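The command relies on curl's URL globbing: the braces expand to each listed month and the bracketed range to each year, so one request is made per month and year pair. The sketch below reproduces that expansion to show how many pages are requested. archiveUrls is a hypothetical helper, and the order in which curl itself issues the requests is not assumed here.

```javascript
// Hypothetical helper: enumerate the URLs covered by the curl glob above.
const MONTH_NAMES = [
  "January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December",
];

function archiveUrls(firstYear, lastYear) {
  const urls = [];
  for (let year = firstYear; year <= lastYear; year++) {
    for (const month of MONTH_NAMES) {
      urls.push(
        `https://en.wikipedia.org/wiki/Portal:Current_events/${month}_${year}?action=raw`
      );
    }
  }
  return urls;
}

// 21 years (1996 through 2016) times 12 months.
console.log(archiveUrls(1996, 2016).length);
```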

I then used a variety of techniques (such as sorting the lines alphabetically, removing duplicate lines, and various regular expressions) to identify non-conforming content.
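One way the sort-and-deduplicate step could look is sketched below: count how often each distinct line occurs across the combined file, then list the rare ones, since lines shared by nearly every month are boilerplate and the rare lines are the candidates for non-conforming content. The function name rareLines and the threshold parameter are hypothetical; the actual tools used for the survey are not specified here.

```javascript
// Hypothetical helper: return [line, count] pairs for lines that occur
// at most maxCount times, sorted alphabetically by line.
function rareLines(text, maxCount) {
  const counts = new Map();
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "") continue; // ignore blank lines
    counts.set(line, (counts.get(line) || 0) + 1);
  }
  return [...counts.entries()]
    .filter(([, count]) => count <= maxCount)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
}

// Example with a made-up sample: the common variant is filtered out.
const sample = "{{reflist}}\n{{reflist}}\n{{reflist|30em}}\n\n{{reflist}}";
console.log(rareLines(sample, 1));
```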

Once I have completed these clean up steps and normalized all of the month pages, I will look at creating a Lua script to generate the contents based on a single month and year variable. This will have the added benefit of creating a single location for the shared code used for the page layout. At this point it will be a simple process to make the layouts work in a mobile-friendly manner.

Day 2

240 months surveyed (1997-2016) and cleaned to some degree. The remaining issues are below.

  • 21 months have "International holidays" sections that need to migrate into sidebars.
  • Many months list holidays for the following month. These should be migrated to the corresponding month.
  • 2 months (Dec 1999 and Jan 2000) have sidebars (Recent Deaths for December and Holidays for January) but this is prior to the Jan 2001 start. These should be evaluated for removal for consistency.
  • 1 month (Feb 2009) has a preamble about the uniqueness of a 28-day February that starts on a Sunday and about Friday the 13th falling in two consecutive months. This should be investigated for removal. (Moved to talk page for further evaluation.)

Day 3

All month pages from 1997-2016 have identical layout except for the following items. (Survey results from previous days have been moved here and updated.)

  • Months from 1997 through December 2000 do NOT have sidebars. From January 2001 onward, all months have sidebars.
    • 2 months (Dec 1999 and Jan 2000) have sidebars (Recent Deaths for December and Holidays for January) but this is prior to the Jan 2001 start. These should be evaluated for removal for consistency.
  • 2 months (Sep 2005 and Dec 2005) have "{{Pp-move-indef}}" for some reason. These should be investigated for removal.
  • Months from the 1990s have [[Category:Months in the 1900s|*1997-01]]
  • Months prior to December 2013 have {{commons category|September 2013}}. January 2014 and after use {{Commons|April 2014}}. These should be reconciled.
  • 13 months have non-English category tags. These can easily remain on the pages.

Notes for later: Parts to add to a Month page generator script:

  • Sidebars begin in January 2001. The template will need to take these into account.
  • See Also sections:
    • 6 months (June 2004-November 2004) have "See Also" links for Sports and 1 month has a See Also link for Science. These should be removed or folded into the Month page template.
    • "[[Wikipedia:News collections and sources]]."
    • "[[Wikipedia:News sources]] – This has much of the same material organized in a hierarchical manner to help encourage [[Wikipedia:NPOV|NPOV]] in our news reporting."


Prototype Month page

The prototypical Month page will have the following code. # will be numbers (mostly years) and * will be letters, often month names.

{{Events by month|####|prefix=Portal:Current events/}}
'''[[********]]''' '''[[####]]''' was the ****th month of that ***** year. The month, which began on a [[****day]], ended on a [[****day]] after ## days. Some additional text here.

== [[Portal:Current events]] ==
''This is an [[Portal:Current events/How to archive the portal|archived version]] of Wikipedia's [[Portal:Current events|Current events Portal]] from ******* ####.''
{| style="background-color:transparent" cellspacing=0 cellpadding=0
| style="vertical-align:top;" |
{{Portal:Current events/Month Inclusion|#### ********}}
|style="vertical-align:top;width:250px"|
{{Portal:Current events/******** ####/Calendar}}
{{Portal:Current events/******** ####/Sidebar}}
|}

==References==
{{reflist}}

{{commons category|******** ####}}
{{Portal:Current events/Events by month}}

[[Category:********|####]]
[[Category:####|*####-##]]
[[Category:Current events archives]]

Once this consistency is in place, we will look at extracting the table-based layout and replacing it with a div-based layout using flexbox attributes.
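A div-based equivalent of the table in the prototype might look roughly like the output of the sketch below. Everything here is an assumption made for illustration: the inline flex values (other than the 250px sidebar width taken from the prototype), the monthLayout name, and the simple rule that the Sidebar template is emitted only from 2001 onward, which ignores the Dec 1999 and Jan 2000 exceptions noted earlier.

```javascript
// Hypothetical sketch: build a flexbox version of the prototype's
// two-column table as a wikitext string.
function monthLayout(year, monthName) {
  // Assumption: sidebars exist from January 2001 onward.
  const sideColumn = [`{{Portal:Current events/${monthName} ${year}/Calendar}}`];
  if (year >= 2001) {
    sideColumn.push(`{{Portal:Current events/${monthName} ${year}/Sidebar}}`);
  }
  return [
    '<div style="display:flex;flex-wrap:wrap;">',
    // Main column grows to fill the row; the 20em basis is an arbitrary choice.
    '<div style="flex:1 1 20em;">',
    `{{Portal:Current events/Month Inclusion|${year} ${monthName}}}`,
    "</div>",
    // Side column keeps the 250px width used in the table prototype.
    '<div style="flex:0 1 250px;">',
    ...sideColumn,
    "</div>",
    "</div>",
  ].join("\n");
}

console.log(monthLayout(2016, "August"));
```

With flex-wrap enabled, the side column drops below the main column when the viewport is too narrow to fit both, which is the mobile-friendly behavior the project is after.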

Module Process

Day 1: Expressing the Opening Paragraph

I have created the module and it comprises the following pages:

Currently this module will only fill in the initial paragraph, but I will add arguments that will allow it to express the parts needed to support the layout of the page contents.
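To show what filling in the initial paragraph involves, here is a minimal JavaScript sketch of the same calculation. It is not the module's code. The openingParagraph name is hypothetical, and reading the "*****" placeholder in the prototype as "common" or "leap" is an assumption.

```javascript
// Illustrative sketch of the opening-paragraph calculation.
const MONTHS = [
  "January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December",
];
const ORDINALS = [
  "first", "second", "third", "fourth", "fifth", "sixth",
  "seventh", "eighth", "ninth", "tenth", "eleventh", "twelfth",
];
const WEEKDAYS = [
  "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday",
];

function isLeapYear(year) {
  return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
}

function openingParagraph(year, monthName) {
  const m = MONTHS.indexOf(monthName);
  if (m === -1) {
    throw new Error("Unknown month: " + monthName);
  }
  const first = new Date(Date.UTC(year, m, 1));
  // Day 0 of the following month is the last day of this month.
  const last = new Date(Date.UTC(year, m + 1, 0));
  // Assumption: the prototype's "*****" stands for "common" or "leap".
  const yearType = isLeapYear(year) ? "leap" : "common";
  return (
    `'''[[${monthName}]]''' '''[[${year}]]''' was the ${ORDINALS[m]} month ` +
    `of that ${yearType} year. The month, which began on a ` +
    `[[${WEEKDAYS[first.getUTCDay()]}]], ended on a ` +
    `[[${WEEKDAYS[last.getUTCDay()]}]] after ${last.getUTCDate()} days.`
  );
}

console.log(openingParagraph(2016, "August"));
```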

Day 2: Expressing the Page Structure

I have moved the Module forward to express very simple HTML strings that can be used to produce the layout of the page. It uses Flexbox styling in the same way that the Portal:Current events page does now. I would like to run this by the people interested before applying this to all Monthly archive pages. I have not updated the documentation or the testcases yet.