Wikipedia:Reference desk/Archives/Miscellaneous/2020 April 1



April 1


Regarding the 2016 spike in Main Page views


I was analyzing the daily pageviews for the Main Page (for an essay I may or may not end up writing) when I noticed a surprisingly large spike in traffic between July 20th and August 16th, 2016, during which the daily views of the Main Page more than tripled at one point. (Graph: https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&redirects=0&start=2016-07-01&end=2016-08-31&pages=Main_Page)

This is entirely inconsequential, but it nevertheless piqued my interest, as I could not remember any major events that occurred during this timeframe. I subsequently searched the article for 2016, but could not find any clear indicators; major events around that time either had already occurred (such as the 2016 Nice truck attack, the 2016 Turkish coup d'état attempt, Typhoon Nepartak, and the end of the Western African Ebola virus epidemic) or occur on a regular basis (such as the 2016 Summer Olympics, which seemed like a potential candidate until I realized that A. it began on August 5th and ended on August 21st, so it doesn't match the timeframe, and B. the 2018 Winter Olympics didn't seem to cause a spike in traffic anywhere near this magnitude). The only two notable deaths on July 19th were Garry Marshall and Anthony D. Smith, both of whom seem unlikely candidates to cause such major interest in Wikipedia, and to my understanding, traffic from notable celebrity deaths (such as the deaths of Steve Jobs and Michael Jackson) usually subsides after a day or two to begin with. Searching the Wikipedia: mainspace with the keywords "July" "19" "2016" and "July" "26" "2016" (when the traffic was greatest), and searching the Current Events portal, proved equally unfruitful; the only event mentioned in the latter for July of 2016 that could conceivably cause this is Donald Trump being named the United States' Republican Party candidate for the 2016 election on July 19th, which I suppose could cause a jump in traffic for a day or two... but for an entire month? And if that was the cause, why would the number of pageviews drop so suddenly between August 16th (which had ~56,000,000) and August 17th (which had ~24,000,000)? That doesn't quite fit as the source of the increase in traffic.

So, for the sake of curiosity itself, does anybody know what occurred that sparked so much interest in Wikipedia during that time period? -- TheHardestAspectOfCreatingAnAccountIsAlwaysTheUsername: posted at 06:24, 1 April 2020 (UTC)

According to The Signpost, the most popular articles during that period were the 2016 Olympics, Michael Phelps and Usain Bolt (Olympians) and the film Suicide Squad. You can see the relevant Signpost articles at Wikipedia:Wikipedia Signpost/2016-08-18/Traffic report and Wikipedia:Wikipedia Signpost/2016-09-06/Traffic report. For more detailed lists of the top articles during the period, see Category:Top 25 Report. -gadfium 08:15, 1 April 2020 (UTC)
I don't have an answer, but I suggest the possibility that Wikipedia itself, for some reason, received unusually prominent news media coverage (which tends to snowball when different news vehicles hitch themselves to the news train to avoid being left behind [to coin a metaphor]), thus prompting more people to check it out. This might explain a spike in Main Page views, rather than in views of specific articles proffered via web searches. {The poster formerly known as 87.81.230.195} 90.197.27.39 (talk) 12:21, 1 April 2020 (UTC)
Therefore, looking at Wikipedia:Press coverage 2016, we find that on 21 July, "Taylor Swift's Wikipedia Page Vandalized After Kim Kardashian, Kanye West Feud: See What the Trolls Did", but that seems unlikely to be a major preoccupation of the Anglosphere. Also on 28 July, "Wikipedia is pivoting into news with its redesigned Android app", so plausibly everyone with an Android phone was downloading the app? Alansplodge (talk) 17:12, 1 April 2020 (UTC)
The fact that it's clean-cut and such a specific length of time (28 days = 4 weeks) suggests to me that it's not something organic; interest in something doesn't just triple for that amount of time and then drop back to exactly where it was before with no slopes; it should be some kind of bell curve. To me, this has the hallmarks of something artificial, like us counting page views differently or Google crawling the main page differently and then reverting back. Matt Deres (talk) 13:45, 2 April 2020 (UTC)
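
The "28 days = 4 weeks" figure can be checked with a little date arithmetic. This is only a sketch: it assumes the spike ran from July 20th to August 16th, 2016, as stated in the original question, with both end days counted; the variable names are made up for illustration.

```python
from datetime import date

# Endpoints of the spike as given in the original question (assumed inclusive).
spike_start = date(2016, 7, 20)
spike_end = date(2016, 8, 16)

# Subtracting two dates gives the gap between them; add 1 to count both endpoints.
spike_days = (spike_end - spike_start).days + 1

# Split the total into whole weeks and leftover days.
whole_weeks, leftover_days = divmod(spike_days, 7)

print(f"{spike_days} days = {whole_weeks} weeks and {leftover_days} days")
```

Under those assumptions the span comes out to exactly four whole weeks with no days left over, which is consistent with the point above about the period being suspiciously tidy.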
The 28-day period suggests to me a free-trial subscription period to something, after which most people didn't sign up to the long-term paid subscription. I don't use a smartphone and know little of these "apps" of which Alansplodge speaks, but on what terms was the Android app made available? {The poster formerly known as 87.81.230.195} 90.197.27.39 (talk) 17:10, 2 April 2020 (UTC)

The main problem with the Android theory is that the release occurred on July 28th, not July 19th/20th, so the trendline doesn't match up. (As a matter of fact, on July 28th the number of daily page views was steadily declining until July 30th, making this even less plausible.)

On closer inspection, the graph appears to document two general peaks: one occurring between July 20th and August 4th, and the other occurring between August 5th and August 16th. The second of these almost perfectly coincides with the start of the 2016 Summer Olympics (August 5th – August 21st), but the daily page views dropped five days before the Games ended, so while I suspect the Olympic Games caused the second bump in traffic, I also suspect they're not what ended said bump. (It's worth pointing out that between August 17th and August 21st (according to 2016 Summer Olympics#Calendar, at least), the competitions for Diving, Artistic Swimming, Water Polo, Basketball, Sprint Canoeing, BMX Cycling, Field Hockey, Football, Golf, Rhythmic Gymnastics, Handball, the Modern Pentathlon, and Indoor Volleyball were all still ongoing, so it isn't as if all of the events had ended when the page views dropped.)

The hypothetical change in pageview counting is also definitely plausible; to that end, I looked through the tool's Phabricator change log between a week before the traffic spike and a week afterwards (results here). On July 19th, they seemed to be discussing a bug allowing cross-site scripting attacks, which is rather unhelpful for our purposes; the activity around August 16th is more interesting, however. An automated script/bot was reported to be "making requests for the same article with the same date range, etc." repeatedly on August 10th, and was reported to have halted by August 22nd. If the bot/script was truly "loading the page more than 10 times per second," and each pageload counted as a view, that bug could be inflating the page count around that time. (Though this theory is admittedly dampened by the fact that the bug reporter explicitly stated that "Anyway [sic] the tool is not broken" in his report.) If someone more experienced with coding than me could check the Phabricator issue to see if the script/bot's requests could even be counted as pageviews, that would be greatly appreciated, but that is my current theory. -- TheHardestAspectOfCreatingAnAccountIsAlwaysTheUsername: posted at 19:05, 2 April 2020 (UTC)
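
As a rough back-of-the-envelope check on the bot theory, the quoted request rate can be compared with the size of the drop between August 16th and 17th. This is a sketch only, using the approximate figures quoted in this thread. It assumes (without confirmation) that the script ran nonstop at exactly 10 requests per second and that every request counted as one Main Page view; "more than 10 times per second" is only a lower bound, so the real rate may have been higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Approximate figures quoted earlier in the thread.
views_aug_16 = 56_000_000
views_aug_17 = 24_000_000

# "More than 10 times per second" is treated here as exactly 10 (a lower bound).
bot_requests_per_second = 10

# Assumption: the script ran all day and each request counted as one view.
bot_views_per_day = bot_requests_per_second * SECONDS_PER_DAY

# Size of the overnight drop, and how much of it the bot could explain.
drop = views_aug_16 - views_aug_17
share_of_drop = bot_views_per_day / drop

# Request rate that would be needed to account for the whole drop.
rate_needed = drop / SECONDS_PER_DAY

print(f"Bot views per day at {bot_requests_per_second} req/s: {bot_views_per_day:,}")
print(f"Drop from Aug 16 to Aug 17: {drop:,}")
print(f"Share of the drop explained: {share_of_drop:.1%}")
print(f"Rate needed to explain all of it: {rate_needed:.0f} req/s")
```

On these assumptions, 10 requests per second comes to 864,000 views a day, under 3% of the roughly 32,000,000-view drop; accounting for the whole drop would take something on the order of 370 requests per second sustained all day. That does not rule the script out, since the reported rate is only a floor, but it suggests a single script at roughly that rate would not be enough on its own.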