What is AffCom? The Foundation's volunteer Affiliations Committee, created by the Board of Trustees 10 years ago, advises the Board on the approval of new WMF affiliates—chapters, thematic organisations, and user groups. AffCom's membership is large: currently there are 22 members, comprising 12 voting members, most of them with strong connections to an affiliate, and ten non-voting "advisers". These advisers enable the WMF to monitor and exercise a degree of control over AffCom; they include two board liaisons, three staff liaisons, and three staff observers.
On 19 August, the WMF's Affiliations Committee ("AffCom") announced that the Board of Trustees had established three additional criteria for new applications by user groups to become chapters and thematic organizations ("thorgs"), instructing AffCom to take these criteria into account.
These new criteria reflect the next development in a process set in motion by a 2014 Board resolution, which required at least two years' prior status as a user group before a group could apply to become a chapter or thorg. Previously, any group of Wikimedians had been able to apply to move straight to chapter/thorg status. The 2014 decision proved highly controversial at the time, as it changed the rules and effectively put new chapter/thorg applications on hold until February 2016. According to AffCom's schedule, some of the user groups since established have now become eligible to apply. AffCom first discussed the criteria by which such applications would be evaluated at its July meeting.
The new criteria are:
Diversity of Activities: Chapters and thematic organisations are expected to plan and conduct a variety of different programs and events; to balance online and offline projects; to strive for continuous activity; and to conduct programs and events at least once every two months.
Planning and Evaluation: Chapters and thematic organisations are expected to set specific goals and targets for programs, projects, and events before executing them; to measure the results of programs, projects, and events against those targets; and to report on those results to the Wikimedia Foundation and the wider Wikimedia movement.
External Partnerships: Chapters and thematic organisations are expected to engage in programmatic partnerships with external groups and organisations (for example, cultural, academic, or government institutions, and so on) to promote the Wikimedia movement and to add and improve content on Wikimedia projects.
Questions on the Wikimedia-L email list have challenged the announcement on several fronts, ranging from the unclear duration of the “trial period” to the suitability of a two-tiered system in which existing chapters and thorgs will be treated differently from new ones. While AffCom’s chair, Carlos M. Colina, has engaged on the list, the Committee has yet to supply responses to many of the issues raised.
The existing 41 chapters and one thorg need only comply with the pre-existing requirements (items 4–9); these are less specific, involving general big-picture expectations for mission alignment, geographical focus, legal incorporation, governance, a minimum of 20 active contributors, and "capacity". In effect, the new criteria will create a two-tiered system of standards and accountability, in which there are substantially lower standards for existing chapters and thorgs than for newly recognised chapters and thorgs.
The number of user groups has grown from nine to 64 since the Board’s 2014 decision. User groups are a simpler, less formal, and more flexible form of affiliation. Despite the official position that user groups are "equal players in the Wikimedia movement", they enjoy fewer privileges than chapters and thorgs do. Unlike user groups, chapters and thorgs are eligible for annual operating grants, which can involve significant amounts of donors' money; this may explain the strong attraction of some affiliates to the relatively expensive model followed by some European chapters, which involves paid staff and "bricks and mortar" city offices. Chapters and thorgs have the privilege of nominating two of the 10 WMF trustees, whereas user groups do not.
questions over whether criteria should be qualitative or quantitative;
a complaint that the bar is set "super high" for those organizations without paid staff;
queries about the vagueness of the new requirements, and perceived inflexibility in timing requirements; and
the issue of creating two tiers of chapters/thorgs: existing and new.
Pine, a user group board member, wrote: “the criteria should also apply to existing chapters” and "existing chapters should be evaluated routinely". He suggested that “if any chapter's status is in doubt as a result of the new criteria, then the chapter can be given 6 months to rise to the occasion. If chapters still do not meet the new criteria after that time, it seems to me that they should be re-classified as user groups until they re-apply for chapter status and are accepted by AffCom as meeting the new criteria." The AffCom chair responded that he found Pine's suggestion of "a common baseline throughout the world" to be “divisive, discriminatory, patronizing, to say the least. Every chapter's situation is different, so being absolutely quantitative would be unfair and damaging to the movement".
Nevertheless, several Wikimedians expanded on Pine’s theme:
Ben Creasy, a former non-voting member of the WMF Audit Committee, asked which chapters fall short of the new criteria, adding: “I think we should at least get a sense for that, and those chapters should be notified and be put on the path to meeting standards or losing their status." Colina suggested that losing status would be a rare last resort: "in those cases the AffCom may reach out to them to help fix the issue, stimulate the organization of activities, fix governance issues, ...".
Chris Keating, formerly of the Wikimedia UK board, endorsed “a method of inactive chapters to be de-recognised – just as it is also useful for User Groups working towards chapter status to know what they are meant to be working towards." Keating pointed to a somewhat tougher approach, without conducting an audit, used by the organisers of the most recent Wikimedia Conference to review existing chapters' eligibility for paid expenses.
Asaf Bartov, a WMF staff liaison to AffCom, pointed to a relatively new process "being followed, right now, to review the status of inactive and non-compliant chapters, at long last." Bartov suggested that perhaps this link should be added to the AffCom navbox.
Delphine Ménard, a non-voting adviser to AffCom, took issue with the proposition that holding existing affiliates to solid expectations would be too harsh:
Experience proves that 'trying to get in touch' [with an apparently dormant chapter] and 'trying to put together a plan' is a very lengthy process, and takes months, if not years. ... You do have to draw the line somewhere though, and at some point get 'harsh' and have hard deadlines. An appeal process would mean having someone at the other end of the line. More often than not, this is not the case. I think it's important that we know to 'terminate', because dormant entities often prevent new people from rekindling motivation and starting anew.
WMF Trustee Alice Wiegand endorsed Ménard's post, while suggesting that "immediate termination [of a chapter/thorg] is for 'serious and urgent cases' only and that there is a more partnering process for less serious cases."
On the other side were claims that the new criteria were "focusing on how to bring down chapters", and a claim that "The only measure should be trust and an assumption of good faith". A related issue for some was "a huge shortage of support for user groups and smaller chapters."
The Signpost's questions to AffCom
The Signpost contacted the chair of AffCom on 29 August, inviting a response to a number of questions raised by the announcement. He declined to comment by copy-deadline, citing a need to confer with his AffCom colleagues. Our questions built on those raised on the list: We asked whether evaluation of applications for chapter/thorg status, which were not open to scrutiny in the past, would be handled transparently in the future. We inquired whether the proposed two-tiered system of new and existing chapters constituted an attempt to avoid objections by existing chapters/thorgs. We asked whether AffCom is sufficiently independent from chapters/thorgs to exercise the types of judgment indicated in its charter, in the Protocol for noncompliant Wikimedia movement affiliates, in WMF’s Organisational best practices, and in the new criteria. The Signpost awaits comment from AffCom on these and other issues that we put to the chair. TS
Editorial note: In keeping with the Signpost's COI practice, Rosiestep—a member of both the Signpost’s editorial board and AffCom—was not involved in preparing or writing this story.
"...lie" is the most common prediction when entering "Why does Wikipedia" into a Google search.
New FDC appointments: The Wikimedia Foundation has appointed four members to the Funds Dissemination Committee: Garfield Byrd, Anne Clin, Bishakha Datta, and Candelaria Laspeñas. See our previous coverage.
New administrators – increasingly a rarity: The English Wikipedia has two new administrators, both veteran editors: Vanamonde93, an aficionado of 20th-century history and politics, biology, and science fiction; and Oshwah, a software engineer and self-confessed computer enthusiast since the age of five. A chart maintained by WereSpielChequers reveals that promotions are increasingly infrequent, with 2016 likely to see 12 new administrators – continuing a steady decline begun in 2008.
Why does Wikipedia lie?: On Twitter, @Faewik observed that the top Google search prediction for "Why does Wikipedia..." is "Why does Wikipedia lie", while other top searches question its financial solicitations.
Danish philosopher and theologian Dorthe Jørgensen's bio was added as part of the Women in Philosophy drive.
Women in Philosophy drive: WikiProject Women in Red announced an article drive on women philosophers, which runs through December 2016, as a tribute to recently deceased Wikipedian Kevin Gorman.
Wiki Loves Monuments begins: The sixth edition of the world’s largest photographic competition has begun. Those looking to join the event can find instructions over at Commons.
New RAW: The French Wikipedia's 20 August RAW has details on CollectArt, an effort to get museum visitors to upload their photos to Wikimedia Commons, and a collaboration between Bibliothèque et Archives nationales du Québec and Wikisourcers to upload and proofread a book a day.
Books and Bytes published: The bimonthly newsletter from the Wikipedia Library, the program that helps connect editors with the sources they need to write articles, is out. The team has five new research partnerships, and editors from around the world can sign up for these accounts now; there are also six open Wikipedia Visiting Scholar positions.
Wikimedia in Education out: The September Wikimedia in Education contains the heartwarming story of Armenian children teaching their parents how to edit. Said one parent, "I was worrying that my son spent hours in front of the laptop. But now, seeing the important work he is doing by creating and sharing free knowledge, I’m more understanding. I'm so proud of him!"
Silesian Wikipedia reaches 5000 articles: On 6 July, the Silesian Wikipedia (site) reached 5000 articles, with its coverage of the American state of Utah. The Silesian language (or a dialect, depending on the source) is spoken by more than 500,000 people and is used mostly in the Silesia region of Poland. The Silesian Wikipedia is the largest encyclopedia created in that language. (note via Natalia Szafran-Kozakowska, Wikimedia Poland)
The past few weeks of the Traffic Report have been dominated by the 2016 Summer Olympics. Since the Olympics are one of the world's biggest international events, you might guess that they dominated the most-viewed articles of other language Wikipedias. And you would be right. But the topics of interest around the world show interesting variations. We love the Olympics, but also love our own Olympics and Olympians.
Using the WMF data available through TopViews*, we compiled charts of the 15 most popular Olympic-related articles for the period of August 5–21, the official period of the Olympics, for seven different language Wikipedias: English, Spanish, German, Portuguese (the language of Brazil, the host country), Russian, French, and Japanese. We considered including the Chinese Wikipedia but declined, because its blockage in China greatly affects its viewership.**
First of all, Michael Phelps really is popular worldwide. His biography was far and away #1 in English, #2 in Russian and Spanish, #3 in Portuguese, #4 in French, and #5 in German. Similarly, Usain Bolt was generally behind Phelps, and solidly the second most popular athlete of the Games. He ranked #3 in English, #4 in Spanish, #5 in Russian, #6 in Portuguese and French, #8 in Japanese, and #11 in German.
But the old saying "big in Japan" did not apply to Phelps: he placed 12th on the Japanese Wikipedia, the only edition where Bolt outranked him, with about 25% more views. To be big in Japan, though, you really had to be Japanese: the top seven Olympic-related articles were filled by Japanese medalists, not even interrupted by general articles like 2016 Summer Olympics (#1 on five lists) or the All-time Olympic Games medal table, which were usually popular across the board. Japan's list was led by Saori Yoshida, who won wrestling silver and had 240% of the views of Phelps. She was followed by many others, presumably now household names in Japan, including gymnast Kōhei Uchimura (#2) and table tennis whiz Ai Fukuhara (#3).
Though the Japanese Wikipedia is the most extreme case, it is not fair to single it out; the data reveals that every language edition tends to favor its own. French judo practitioner and gold medalist Teddy Riner beat Phelps and Bolt on the French Wikipedia. Elsewhere, local favorites were not far behind Phelps and Bolt. In Spanish, Argentine tennis player Juan Martín del Potro, who won silver, was #5, and Spaniard Rafael Nadal was #9. In German, horizontal bar gold medalist Fabian Hambüchen (#8) was the top local hero. And in English, American gymnasts including Simone Biles (#4) and Aly Raisman (#9), and swimmers Katie Ledecky (#8) and Ryan Lochte (#11), were prominent, though India's P.V. Sindhu, who won silver in badminton, drew an impressive #6 showing on the otherwise American-dominated list. Sindhu and the top Americans, other than Phelps, do not appear on the other charts. And vice-versa: English speakers, for instance, were not focused on the three medals won by Russian gymnast Aliya Mustafina (#6 in Russian); she doesn't appear anywhere on the English (or other) charts.
Not popular in English, but rather popular elsewhere, was Football at the 2016 Summer Olympics. Perhaps because the American women's team floundered, no football-related articles are in the English Top 15, but such articles hit #3 in German (Germany won medals in both the men's and women's tournaments), #7 in Spanish, #8 in Portuguese, and #14 in Russian. But if your country is good in a sport, like Germany was in football, or France was in the modern pentathlon (women's silver, #5), that's what you're most likely going to watch.
Our data collection showed that the Olympics were very popular everywhere. Other non-Olympic topics do appear in their general charts (remember the charts below are Olympic-only articles), just as we see on the Traffic Report, and to about the same extent. The lone exception may be Russian, where the popularity of other articles such as the film Suicide Squad seemed a bit higher, perhaps a reflection of the disqualification of many Russian athletes.
So, just like the Ancient Olympic Games brought together all of Greece, the modern Olympics does seem to bring us all together. We may celebrate our own victories a bit more, but that is part of a human nature we all share and treasure.
Indian badminton star P.V. Sindhu, #6, earned her position among a slew of Americans on the English Wikipedia.
*One caveat on TopViews: TopViews compiles data on the 1,000 most viewed articles on a Wikipedia for each day. Charts for longer periods are compiled from those daily charts. Thus, when an article drops out of the top 1,000 on a given day, its views for that day are not included in the compiled data. Generally speaking, we have found that this gap is not a significant problem when looking at the most popular articles. The English Traffic Report and WP:TOP25 are usually derived from the WP:5000, which includes all viewcount data, but there is no similar source for other-language Wikipedias. On the current WP:5000, the 1,000th most viewed article has under 59,000 views for one day. This number should be significantly lower on other language Wikipedias, which receive less traffic.
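To make the caveat concrete, here is a minimal sketch of how compiling from capped daily charts can undercount an article. It is an illustration only, not the actual TopViews code: the function names, the article titles, the view counts, and the cutoff of 2 (standing in for 1,000) are all hypothetical.

```python
from collections import Counter

def compile_from_daily_top(daily_views, cutoff):
    """Sum views per article, counting a day only when the article
    is within that day's top `cutoff` (hypothetical stand-in for 1,000)."""
    totals = Counter()
    for day in daily_views:
        # Rank this day's articles by views, highest first, and keep the top `cutoff`.
        top = sorted(day.items(), key=lambda kv: kv[1], reverse=True)[:cutoff]
        for title, views in top:
            totals[title] += views
    return totals

def compile_full(daily_views):
    """Sum every recorded view for every article, with no daily cutoff."""
    totals = Counter()
    for day in daily_views:
        totals.update(day)  # Counter.update adds counts rather than replacing them
    return totals

# Made-up daily view counts for three articles over three days.
daily_views = [
    {"Article A": 900, "Article B": 500, "Article C": 100},
    {"Article A": 800, "Article B": 90, "Article C": 300},
    {"Article A": 700, "Article B": 400, "Article C": 50},
]

capped = compile_from_daily_top(daily_views, cutoff=2)
full = compile_full(daily_views)

for title in sorted(full):
    print(title, "capped:", capped[title], "full:", full[title])
```

In this toy data the article that stays in the daily chart every day gets the same total either way, while the two that drop out on some days lose those days' views. That matches the observation above that the gap matters little for the most popular articles.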
**We also reviewed statistics for the Bengali Wikipedia (7th on the list of languages by total number of speakers), but traffic and usage there were too low to yield usable information. Though their page on the 2016 Summer Olympics was in their top 10 (#5), many of the more viewed articles on that project are traditional encyclopedic topics, e.g., #1 was Sheikh Mujibur Rahman, the founding leader of Bangladesh. Only 21 articles (on any topic) had more than 5,000 views during the Olympics on that project.
The Arabic Wikipedia was also considered. It has more traffic than the Bengali project (their 2016 Summer Olympics article was #1, showing users go there for topical information, the general Olympic Games article was #2, and Phelps was #10 among all articles), but only about 50 articles on that project broke 50,000 views during the Olympics, and primary encyclopedic articles (like Egypt and Saudi Arabia) were among them. Ultimately, space and time limitations led to the selection of seven languages to sample.
These discussions and initiatives inevitably link back to discussions about Wikipedia's culture and the gender gap. Inside Higher Ed lamented Wikipedia's current culture in the context of greater internet culture, where "highly stylistic lulz-based trolling" infects attempts at reasoned discussion. As has been stated before, a gender gap cannot be bridged where a community is seen as hostile by many female editors. Highlighting a blog post by Andromeda Yelton, who apparently attended the IFLA conference noted above, the article notes that librarians are 80% female and Wikipedians are 90% male, such that many see Wikipedia having an "adversarial, argumentative bent" that is not enjoyable to all.
Yet, the above initiatives evidence Wikipedia receiving more credit as an established institution, and thus becoming the target of more projects from the traditional institutions that curate knowledge. Perhaps Wikipedia got to where it is without as much formal support (and indeed in the face of many detractors), but the old guard eventually incorporating the nouveau riche is human nature. MW
When a request for help brings the opposite: Former Wikimedia Foundation trustee Bishakha Datta explores the connections between online harassment of women, and historic exclusion of women from public spaces, in an essay published by India Today (and a variant published by openDemocracy). When asked about how this dynamic relates to Wikipedia, Datta replied: "Online abuse isn't the only thing behind the gender gap on Wikipedia, but it's something we do have to tackle if we want to increase gender diversity." (Aug. 31)
Russia plans a Wikipedia rival... again: Russian Prime Minister Dmitry Medvedev has signed a document calling for the creation of a domestic analogue of Wikipedia, reports Pravda. (This report echoes a similar announcement from Nov. 2014.) (Aug. 30)
Watch those dead links: An article on Search Engine Optimization (SEO) (intentionally not linked) advises spammers to seek out dead links on Wikipedia and replace them with links to their sites.
Melbourne goes seasteading?: A search for Melbourne, Australia on Bing Maps last week would have landed you in the Pacific Ocean near Japan, according to The Register, Gizmodo and TechEye. The problem was a missing negative sign in Wikipedia's co-ordinates. (Aug. 23, 28, 29)
The Twitter Account Anyone Can Edit? On August 20, Jimmy Wales' Twitter account was hacked by celebrity hacker group OurMine, as reported in Mashable and elsewhere. After taking control the group first tweeted a death announcement for Jimmy, "RIP Jimmy Wales 1966-2016", a hallmark type of childish trolling. About 15 minutes later the hacking group revealed its conquest by tweeting "I confirm that Wikipedia is all lies. OurMine Team is the true (link to OurMine page)." Once Wales regained access to his account he tweeted confirmation that reports of his death had been premature. (Aug. 20–21)
Indigenous language project aims to become a Wikipedia: The Guardian highlighted the "Noongarpedia" project, which would be the first Wikipedia in an Australian aboriginal language, in its own piece on Wikipedia and language preservation. The piece touches on many themes, including the significance of oral tradition, cultural dissonance on the philosophy of free knowledge, and the significance of an academic team driving the project. Unlike Tulu, the Noongar Wikipedia has not been approved as an official Wikipedia. Introducing 'Noongarpedia' – Australia's first Indigenous Wikipedia (Sept. 1)
Keeping Up with the Commons #2: Creative Commons published its second newsletter, with a number of updates of interest to Wikipedians. (Aug. 22)
This week, we’re talking about WikiProject Television with CAWylie, creator of many new articles about television shows on Wikipedia, including Hell on Wheels and The Killing. CAWylie joined Wikipedia about five years ago, when “both basic- and premium-cable television were becoming the ‘go-to-venue’ for some mainstream actors.” CAWylie feels that his work on television-related articles grew out of that trend.
WikiProject Television has a prominent place on the English Wikipedia, and as evidenced by an active talk page, it attracts many editors. It was created in 2003 and started to pick up steam by 2004. It has six project-specific guidelines and a further four writing guidelines, and lists ten related WikiProjects, among them "Actors and filmmakers", "Animation", "Anime and manga", "BBC", and "Screenwriters". We discussed several challenges in writing articles about popular culture, chiefly finding reliable sources and good references. CAWylie describes how “In this modern age of press releases, some media outlets mainly just print those, or either just copy each other or Wikipedia itself. Most times you have to follow the breadcrumbs to find the original source.” CAWylie also advises new editors to check out Good Articles on topics that are similar to ones they are trying to write. He adds that if the subject of the article is very new, “it is best to wait for the media outlets to pick up on it. Then fully ‘vet’ the subject by checking the most reputable sites.” One of the biggest problems, which also leads to gaps in coverage on television articles, is that editors can be “so excited to be FIRST to create anything that they forget Wiki-standards, or they use the unreliable IMDB as a main source.”
As this logo consists of simple geometric shapes and text, it is deemed ineligible for copyright and therefore permissible on Wikimedia Commons.
I asked about contributing to Wikimedia Commons, and CAWylie felt that the commons was “like a separate entity from Wikipedia.” However, uploading screenshots of TV show title screens, or intertitles and crucial scenes from shows, is allowable. CAWylie has even seen fan or user-created logos pass on Wikimedia.
CAWylie tends to edit shows in which he’s familiar with the creative team or likes the show itself. He also edits articles he feels may be of interest to Wikipedia readers. Some of his favorite shows are ones that “change viewers’ perceptions. For example, at first Breaking Bad seemed to me like it would glorify the meth business. I was pleasantly surprised and happily proved wrong.”
One of his favorite articles to work on was the biography of Christopher Chapman, which CAWylie started and expanded. CAWylie says that Chapman was a pioneer in the film industry and influenced the way television was later filmed. CAWylie says that “Biographies are usually more fun to do, as research might reveal info not commonly known,” and he felt honored to create Chapman’s biography.
For anyone interested in getting involved with WikiProject TV, the talk page is active and editors can make requests or ask for help over there. Thanks to CAWylie for sharing his work on Wikipedia!
No. 91 (Composite) Wing (nominated by Ian Rose) was a Royal Australian Air Force wing that operated during the Korean War and its immediate aftermath. It was established in October 1950 to administer RAAF units deployed in the conflict: No. 77 (Fighter) Squadron, flying North American P-51 Mustangs; No. 30 Communications Flight, flying Austers and Douglas C-47 Dakotas; No. 391 (Base) Squadron; and No. 491 (Maintenance) Squadron. The wing was headquartered at Iwakuni, Japan, as were its subordinate units with the exception of No. 77 Squadron.
Lynx (nominated by Casliber) is a constellation in the northern sky that was introduced in the 17th century by Johannes Hevelius. It is a faint constellation with its brightest stars forming a zigzag line. The orange giant Alpha Lyncis is the brightest star in the constellation, while the semiregular variable star Y Lyncis is a target for amateur astronomers. Six star systems have been found to contain planets.
Rare Replay (nominated by Czar) is a 2015 compilation of 30 video games from the 30-year history of developers Rare and its predecessor, Ultimate Play the Game. The emulated games span multiple genres and consoles, and retain the features and errors of their original releases with minimal edits. The compilation adds cheats to make the older games easier and a Snapshots mode of specific challenges culled from parts of the games. Player progress is rewarded with behind-the-scenes footage and interviews about Rare's major and unreleased games.
HMS Emerald (nominated by Ykraps) was a 36-gun Amazon-class frigate that Sir William Rule designed in 1794 for the Royal Navy. She was completed in 1795 and joined John Jervis's fleet in the Mediterranean. Emerald was one of several vessels to hunt down and capture Santisima Trinidad. She was part of John Thomas Duckworth's squadron during the Action of 7 April 1800 off Cadiz. Emerald served in the Caribbean throughout 1803 in Samuel Hood's fleet, then took part in the invasion of St Lucia, and of Surinam. Returning to home waters for repairs in 1806, she served in the western approaches before joining a fleet under James Gambier in 1809, and taking part in the Battle of the Basque Roads. In 1811 she sailed to Portsmouth where she was laid up in ordinary. Fitted out as a receiving ship in 1822, she was eventually broken up in 1836.
Wrestle Kingdom 9 (nominated by Ribbon Salminen and Starship.paint) was a professional wrestling pay-per-view event, produced by the New Japan Pro Wrestling promotion, which took place at the Tokyo Dome in Tokyo, Japan, in 2015. It was the 24th January 4 Tokyo Dome Show and the first event on the 2015 NJPW schedule. The event featured ten professional wrestling matches and one pre-show match, six of which were for championships. The event was attended by 36,000 people, and received universally positive reviews from critics.
The Boat Races 2016 (nominated by The Rambling Man) took place on 27 March. Held annually, The Boat Race is a side-by-side rowing race between crews from the universities of Oxford and Cambridge along a 4.2-mile (6.8 km) tidal stretch of the River Thames in southwest London. For the first time in the history of the event, the men's, women's, and both reserves' races were all held on the Tideway on the same day.
Science-Fiction Plus (nominated by Mike Christie) was a U.S. science fiction magazine published by Hugo Gernsback for seven issues in 1953. It was initially in slick format, meaning that it was large-size and printed on glossy paper. Gernsback had always believed in the educational power of science fiction, and he continued to advocate his views in the new magazine's editorials. Sales were initially good, but soon fell. For the last two issues Gernsback switched the magazine to cheaper pulp paper, but the magazine remained unprofitable. The final issue was dated December 1953.
"No Me Queda Más" (nominated by AJona1992) is a song by American recording artist Selena for her fourth studio album, Amor Prohibido. It was released as the third single from the album in 1994 by EMI Latin. "No Me Queda Más" was written by Ricky Vela, and production was handled by Selena's brother A.B. Quintanilla. A downtempo mariachi and pop ballad, the song portrays the ranchera storyline of a woman in agony after the end of a relationship. Its lyrics express an unrequited love, the singer wishing the best for her former lover and his new partner. Praised by music critics for its emotive nature, "No Me Queda Más" was one of the most successful singles of Selena's career.
The Canadian National Vimy Memorial (nominated by Labattblueboy) is a memorial site in France dedicated to the memory of Canadian Expeditionary Force members killed during the First World War. It also serves as the place of commemoration for First World War Canadian soldiers killed or presumed dead in France who have no known grave. The monument is the centrepiece of a 100-hectare (250-acre) preserved battlefield park that encompasses a portion of the ground over which the Canadian Corps made their assault during the initial Battle of Vimy Ridge offensive of the Battle of Arras.
"Did You Hear What Happened to Charlotte King?" (nominated by Aoba47) is the seventh episode of the fourth season of the American television medical drama, Private Practice, and the show's 61st episode overall. Written by Shonda Rhimes and directed by Allison Liddi-Brown, the episode was originally broadcast on ABC. The episode revolved around KaDee Strickland's character, and was intended to accurately portray a victim's recovery from rape. It earned the series, Rhimes, and Strickland several awards and nominations and was well received by critics, with Strickland's character and performance praised.
State Route 94 (nominated by Rschen7754) is a highway in the U.S. state of California that is 63.324 miles (101.910 km) long. The western portion, known as the Martin Luther King Jr. Freeway, begins at Interstate 5 in downtown San Diego and continues to the end of the freeway portion past State Route 125 in Spring Valley. The non-freeway segment, which continues east through the mountains to Interstate 8 near Boulevard, is known as Campo Road.
Emma Stone(nominated by FrB.TG) (born 1988) is an American actress. Born and raised in Scottsdale, Arizona, Stone was drawn to acting as a child, and her first role was in a theater production of The Wind in the Willows in 2000. As a teenager, she relocated to Los Angeles with her mother, and made her television debut in VH1's In Search of the New Partridge Family (2004), a reality show that produced only an unsold pilot. After a series of small television roles, she won a Young Hollywood Award for her film debut in Superbad (2007), and received positive media attention for her role in Zombieland (2009).
Miami-Dade Transit operates the Metrorail rapid transit system and the Metromover people mover system in Miami and Greater Miami-Dade County, Florida. The network consists of two elevated Metrorail lines and three elevated Metromover lines. Miami-Dade Transit operates 42 metro stations(nominated by Dream out loud), with 23 in the Metrorail system and 21 in the Metromover system (Brickell and Government Center stations serve both systems).
Marilyn Monroe (1926–1962) was an American actress who appeared in 29 films between 1946 and 1961 (nominated by SchroCat). After a brief career in modeling she signed short-term film contracts, and appeared in minor roles for the first few years of her career. Her major breakthrough came in 1953, when she starred in three pictures: the film noir Niagara, and the comedies Gentlemen Prefer Blondes and How to Marry a Millionaire. Monroe won, or was nominated for, several awards during her career. Those she won included the Henrietta Award for Best Young Box Office Personality and World Film Favorite, and a Crystal Star Award and David di Donatello Award. She was inducted to the Hollywood Walk of Fame and a Golden Palm Star was dedicated at the Palm Springs Walk of Stars. She continues to be considered a major icon in American popular culture.
The Adelaide Oval is a cricket ground in Adelaide, Australia. It is the home ground of the South Australia cricket team and both the men's and women's teams of Adelaide Strikers, as well as Australian rules football and soccer teams. Two hundred international cricket centuries have been scored at the stadium (nominated by Yellow Dingo). The first century at the ground was scored by the Australian Percy McDonnell, and Don Bradman's 299 not out is the highest individual score by a batsman at the ground.
The International Olympic Committee recognises the fastest performances in pool-based swimming events at the Olympic Games (nominated by The Rambling Man). Men's swimming has been part of the Summer Olympics since the Games' modern inception in 1896, but it was not until 1912 that women competed against each other. Races are held in four swimming categories: freestyle, backstroke, breaststroke and butterfly, over varying distances and in either individual or relay race events. Medley swimming races are also held, both individually and in relays, in which all four swimming categories are used. Of the 32 pool-based events, swimmers from the United States hold eighteen records, including one tied with a swimmer from Canada; Australia and China hold three each, Hungary two, and the Netherlands, Brazil, Japan, Great Britain, Singapore and Sweden one each. Thirteen of the current Olympic records were set at the 2016 Games.
Selena Quintanilla-Pérez (1971–1995) was an American singer, songwriter, spokesperson, actress, and fashion designer. During her career, she released (nominated by AJona1992) twenty-seven official singles and seven promotional singles, and made five guest vocalist appearances.
S.L. Benfica is a Portuguese professional football team based in São Domingos de Benfica, Lisbon. The club was formed in 1904, and played its first competitive match in 1906. Since then, 247 players have played between 25 and 99 matches (nominated by Threeohsix). Three players fell one short of 100 appearances, and four former players went on to be first-team managers.
Quentin Tarantino (born 1963) is an American director, producer, screenwriter and actor. His filming career (nominated by FrB.TG) began in the late 1980s when he directed, wrote and starred in the black-and-white My Best Friend's Birthday, a partially lost amateur short film which was never officially released. Since then he has appeared in twenty-seven more films, directed ten more films (also guest directing in Sin City), written seventeen more films and produced fourteen films. Tarantino has also appeared in eight television episodes, directed two and written one. He also appears in the game Steven Spielberg's Director's Chair as Jack Cavello.
The Olympics reigned again this week, shifting from swimming to track as the games neared their end. Seven of the Top 10 slots are Olympic-related, as are 15 of the Top 25. But somehow the incomprehensible internet meme Killing of Harambe still crept into the Top 25 at #25.
In technical news, following up on an item from earlier in August, we are happy to report that this report now uses data from a revamped WP:5000 report which uses WMF's newer data feeds, thanks to Chief Traffic Data Guru West.andrew.g (not an official title). All WP:5000 reports have been re-run for 2016 and are available in that page's history. So far we don't expect the changes to have a significant effect on our charts, though they may help us exclude some spider/bot traffic, and may include Wikipedia Zero traffic not captured before. Unfortunately, however, the new WMF data does not keep records of red link hits, so the WP:TOPRED report has been retired.
For the full top-25 lists (and archives back to January 2013), see WP:TOP25. See this section for an explanation of any exclusions. For a list of the most edited articles every week, see WP:MOSTEDITED.
The ten most popular articles for the week of August 14–20, 2016, as determined from the newly revamped WP:5000 report, were:
The rhythm of the Summer Olympics went according to prediction. As swimming and Michael Phelps (#3) finished up, track took over, and Bolt took center stage, winning gold in both the 100 m and 200 m, for the third straight time. And he also won his third straight gold in the 4 × 100 m relay. Being regularly called the "greatest sprinter of all time" is not hyperbole at this point. An impressive 3.1 million views lead the chart, though well shy of the astounding 5.4 million views Phelps got last week.
Last week we noted that although India at the 2016 Summer Olympics was at #23 (#16 this week), the country had won no medals yet. Sindhu became the first Indian woman to win an Olympic silver medal, in badminton. (And to tell you how lame American television coverage is, I had no idea badminton was a sport in the Olympics.) Sindhu was one of only two medalists from India, the second being Sakshi Malik, who won a bronze in women's wrestling. Of course India's small medal haul regularly produces articles asking why. They are just SPORTS, people. Let's celebrate those who compete and shine.
DC Comics' ramshackle crew of press-ganged supervillains, forced to do the will of a shadowy organization or let their heads explode, are the stars of one of the most anticipated films in the nascent DC Cinematic Universe, which was released on August 5 to generally negative reviews. Nonetheless, it grossed $267M worldwide in its opening weekend.
This Netflix science-fiction series is basically an 8-hour homage to early 80s kid-centric flicks like E.T., The Goonies and Explorers, though aimed mostly at adults. It has been a smash hit for Netflix, evidenced by its continuing appearance on this chart – five straight weeks. The Internet has seized on even the most mundane facets of the show, such as turning minor character "Barb" into a celebrity.
Hello again, Reddit. One of the discoveries the Top 25 project has made over the years is that the site Reddit, which bills itself as "the front page of the Internet" because Wikipedia doesn't, has been a major factor in driving traffic here. It has also proven to be a massive justification for every quirky, oddball page that manages to make it through the deletion process, as these are frequently the most popular. In the past I've made impassioned defences of Reddit and its role in aiding Wikipedia, pointing out that our site has done little to draw people's attention to the information it conveys, leaving that job to Reddit and Google Doodles. I still feel that way, at least, for the section of Reddit that nearly always makes it here: TIL, or "Today I Learned". Comments on TIL threads seem to be fairly civil and genuinely inquisitive, but those make up only a tiny fraction of Reddit's user base. But, it is not those threads that best exemplify Reddit; rather it is the river of bile and toxicity that has flowed from the Killing of Harambe that best illustrates what Reddit has become. These days Reddit is mostly famous in the wider media as a den of race hate, misogyny, borderline paedophilia, and every other objectionable but not strictly illegal form of behaviour. The commitment of the site's owners to free speech has meant that many of their topic threads, or subreddits, have become echo chambers of vitriol, as those who disagree are shouted down or chased off. One writer for Time magazine has written Reddit off as unsalvageable. As such, I think Wikipedia would be better off taking on more of the job of spreading word of its content.
The ten most popular articles for the week of August 21 to 27, 2016, as determined from the newly revamped WP:5000 report, were:
Numbers are down by half, but the article is still holding at #2. The closing ceremony was held on August 21, the first day recorded by this list, so interest in the Olympics clearly has faded quickly. It will be interesting to see what will happen when the Paralympics get underway.
This Netflix science-fiction series is basically an eight-hour homage to early-80s kid-centric flicks like E.T., The Goonies and Explorers, though aimed mostly at adults. It has been a smash hit for Netflix, evidenced by its continuing appearance on this chart – six straight weeks. The Internet has seized on even the most mundane facets of the show, such as turning minor character "Barb" into a celebrity. Its numbers have not shifted much since last week, but the low view counts overall have let it rise four slots.
DC Comics' ramshackle crew of press-ganged supervillains, forced to do the will of a shadowy organization or let their heads explode, are the stars of one of the most anticipated films in the nascent DC Cinematic Universe, which was released on August 5 to generally negative reviews. Nonetheless, it grossed $267M worldwide in its opening weekend.
What began as a heartfelt reaction to what some felt was the unnecessary killing of a silverback western lowland gorilla (pictured, though not him specifically) has morphed over the last three months into online trolling and racist abuse, along with the standard targeted misogyny. What the troll army hopes to accomplish is never clear, but whatever it is, it doesn't involve helping gorillas.
As learned on a Reddit thread this week, Tic Tacs are almost pure sugar, but small enough to be considered sugar-free per serving. Interestingly, the two other Reddit threads linked to this article also noticed the same thing.
The views for the annual list of deaths are remarkably consistent on a day-to-day basis. They were consistently higher in the first half of 2016 owing to a string of highly notable deaths, but things seem to be calming down a bit.
Citoid (source) by User:Salix alba – Generates references from URLs and DOIs using the Citoid server – normally only available with Visual Editor, this script allows access from the normal wikitext editor.
How you add text after an edit conflict might work in a different way in the future – you can test the prototype. Improving the edit conflict page was the top request in the German-speaking communities' 2015 wishlist; the prototype shows the solution from Wikimedia Germany's software engineering department. Feedback from the international community, including English Wikipedia users, would be appreciated.
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
The ORES review tool is now available on Special:Contributions as a beta feature. It can make it easier to find contributions that are probably damaging the wikis. The ORES review tool is available on Wikidata and Persian, Polish, Portuguese, Dutch, Turkish and Russian Wikipedia. 
The norm and ccnorm functions have been updated to make it easier to write abuse filters. This also affects the TitleBlacklist extension. You don't have to transform "I" and "L" to "1", "O" to "0" and "S" to "5" anymore. 
The old pageview data in the "pagecounts-raw" and "pagecounts-all-sites" files is no longer being updated. You can find the new pageview data here. This happened on August 5. 
Wikimedia mobile sites now don't load images if the user doesn't see them. This is to save mobile data and make the pages load faster. 
When you edit a table with the visual editor, pressing Tab in the last cell of a row will take you to the first cell in the next row. Pressing Shift and Tab in the first cell of a row will take you to the last cell in the previous row. 
Some big image files could not be thumbnailed. This has now been fixed. 
When you moved a page over a redirect it would delete the redirect without saving it in the logs. This has now been fixed. 
The new version of MediaWiki will be on test wikis and MediaWiki.org from 30 August. It will be on non-Wikipedia wikis and some Wikipedias from 31 August. It will be on all wikis from 1 September (calendar).
Sometimes when you mention another user they don't get a notification. You will be able to get a notification when you have successfully sent a mention to someone, or be told if they did not get a notification. This will be opt-in. You can test this on the test wiki.
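To picture the kind of look-alike character handling that the norm and ccnorm item above refers to, here is a minimal sketch in Python. It is an illustration only: the table `CONFUSABLES` and both function names are hypothetical, and this is not the mapping or the code used by the AbuseFilter or TitleBlacklist extensions.

```python
# Hypothetical look-alike table for illustration; NOT AbuseFilter's real mapping.
CONFUSABLES = {"1": "I", "0": "O", "5": "S"}


def normalize_confusables(text: str) -> str:
    """Upper-case the text and replace each look-alike digit with a letter."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in text.upper())


def contains_blocked_word(text: str, blocked: str) -> bool:
    """Check for a blocked word after normalizing both sides the same way."""
    return normalize_confusables(blocked) in normalize_confusables(text)
```

With a normalization step like this, a filter author can write a pattern in plain letters (for example "pills") and still match a disguised spelling such as "p1lls", without rewriting the pattern in digits by hand.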
The system described in the paper looks for red links in Wikipedia and classifies them based on their context. To find section titles, it then looks for similar existing articles. With these titles, the system searches the web for information, and eventually uses content summarization and a paraphrasing algorithm. The researchers uploaded 50 of these automatically created articles to Wikipedia, and found that 47 of them survived. Some were heavily edited after upload, others not so much.
While I was enthusiastic about the results, I was surprised by the suboptimal quality of the articles I reviewed – three that were mentioned in the paper. After a brief discussion with the authors, a wider discussion was initiated on the Wiki-research mailing list. This was followed by an entry on the English Wikipedia administrators' noticeboard (which includes a list of all accounts used for this particular research paper). The discussion led to the removal of most of the remaining articles.
The discussion concerned the ethical implications of the research, and using Wikipedia for such an experiment without the consent of Wikipedia contributors or readers. The first author of the paper was an active member of the discussion; he showed a lack of awareness of these issues, and appeared to learn a lot from the discussion. He promised to take these lessons to the relevant research community – a positive outcome.
In general, this sets an example for engineers and computer scientists, who often show a lack of awareness of certain ethical issues in their research. Computer scientists are typically trained to think about bits and complexities, and rarely discuss in depth how their work impacts human lives. Whether it's social networks experimenting with the mood of their users, current discussions of biases in machine-learned models, or the experimental upload of automatically created content in Wikipedia without community approval, computer science has generally not reached the level of awareness of some other sciences for the possible effects of its research on human subjects, at least as far as this reviewer can tell.
Even in Wikipedia, there's no clear-cut, succinct Wikipedia policy I could have pointed the researchers to. The use of sockpuppets was a clear violation of policy, but an incidental component of the research. WP:POINT was a stretch to cover the situation at hand. In the end, what we can suggest to researchers is to check back with the Wikimedia Research list. A lot of people there have experience with designing research plans with the community in mind, and it can help to avoid uncomfortable situations.
Ethics researcher: Vandal fighters should not be allowed to see whether an edit was made anonymously
A paper in the journal Ethics and Information Technology examines the "system of surveillance" that the English Wikipedia has built up over the years to deal with vandalism edits. The author, Paul B. de Laat from the University of Groningen, presents an interesting application of a theoretical framework by US law scholar Frederick Schauer that focuses on the concepts of rule enforcement and profiling. While providing justification for the system's efficacy and largely absolving it of some of the objections that are commonly associated with the use of profiling in, for example, law enforcement, de Laat ultimately argues that in its current form, it violates an alleged "social contract" on Wikipedia by not treating anonymous and logged-in edits equally. Although generally well-informed about both the practice and the academic research of vandalism fighting, the paper unfortunately fails to connect to an existing debate about very much the same topic – potential biases of artificial intelligence-based anti-vandalism tools against anonymous edits – that was begun last year by the researchers developing ORES (an edit review tool that was just made available to all English Wikipedia users, see this week's Technology report) and most recently discussed in the August 2016 WMF research showcase.
The paper first gives an overview of the various anti-vandalism tools and bots in use, recapping an earlier paper where de Laat had already asked whether these are "eroding Wikipedia’s moral order" (following an even earlier 2014 paper in which he had argued that new-edit patrolling "raises a number of moral questions that need to be answered urgently"). There, de Laat's concerns included the fact that some stronger tools (rollback, Huggle, and STiki) are available only to trusted users and "cause a loss of the required moral skills in relation to newcomers", and that there is a lack of transparency about how the tools operate (in particular when more sophisticated artificial intelligence/machine learning algorithms such as neural networks are used). The present paper expands on a separate but related concern, about the use of "profiling" to pre-select which recent edits will be subject to closer human review. The author emphasizes that on Wikipedia this usually does not mean person-based offender profiling (building profiles of individuals committing vandalism), citing only one exception in the form of a 2015 academic paper – cf. our review: "Early warning system identifies likely vandals based on their editing behavior". Rather, "the anti-vandalism tools exemplify the broader type of profiling" that focuses on actions. Based on Schauer's work, the author asks the following questions:
"Is this profiling profitable, does it bring the rewards that are usually associated with it?"
"is this profiling approach towards edit selection justified? In particular, do any of the dimensions in use raise moral objections? If so, can these objections be met in a satisfactory fashion, or do such controversial dimensions have to be adapted or eliminated?"
But snakes are much more dangerous! According to Schauer, while general rules are always less fair than case-by-case decisions, their existence can be justified by other arguments.
To answer the first question, the author turns to Schauer's work on rules, in a brief summary that is worth reading for anyone interested in Wikipedia policies and guidelines – although de Laat instead applies the concept to the "procedural rules" implicit in vandalism profiling (such as that anonymous edits are more likely to be worth scrutinizing).
First, Schauer "resolutely pushes aside the argument from fairness: decision-making based on rules can only be less just than deciding each case on a particularistic basis". (For example, a restaurant's "No Dogs Allowed" rule will unfairly exclude some well-behaved dogs, while not prohibiting much more dangerous animals such as snakes.) Instead, the existence of rules has to be justified by other arguments, of which Schauer presents four:
Rules "create reliability/predictability for those affected by the rule: rule-followers as well as rule-enforcers".
Rules "promote more efficient use of resources by rule-enforcers" (as one example, in the case of a speeding car driver, traffic police and judges can apply a simple speed limit instead of having to prove in detail that an instance of driving was dangerous).
Rules, if simple enough, reduce the problem of "risk-aversion" by enforcers, who are much more likely to make mistakes and face repercussions if they have to make case by case decisions.
Rules create stability, which however also presents "an impediment to change; it entrenches the status-quo. If change is on a society’s agenda, the stability argument turns into an argument against having (simple) rules."
The author cautions that these four arguments have to be reinterpreted when applying them to vandalism profiling, because it consists of "procedural rules" (which edits should be selected for inspection) rather than "substantive rules" (which edits should be reverted as vandalism, which animals should be disallowed from the restaurant). While in the case of substantive rules, their absence would mean having to judge everything on a case-by-case basis, the author asserts that procedural rules arise in a situation where the alternative would be to not judge at all in many cases: Because "we have no means at our disposal to check and pass judgment on all of them; a selection of a kind has to be made. So it is here that profiling comes in". With that qualification, Schauer's second argument provides justification for "Wikipedian profiling [because it] turns out to be amazingly effective", starting with the autonomous bots that auto-revert with an (aspired) 1:1000 false-positive rate.
De Laat also interprets "the Schauerian argument of reliability/predictability for those affected by the rule" in favor of vandalism profiling. Here, though, he fails to explain the benefits of vandals being able to predict which kind of edits will be subject to scrutiny. This also calls into question his subsequent remark that "it is unfortunate that the anti-vandalism system in use remains opaque to ordinary users". The remaining two of Schauer's four arguments are judged as less pertinent. But overall the paper concludes that it is possible to justify the existence of vandalism profiling rules as beneficial via Schauer's theoretical framework.
Police traffic stops: A good analogy for anti-vandalism patrol on Wikipedia?
Next, de Laat turns to question 2, on whether vandalism profiling is also morally justified. Here he relies on later work by Schauer, from a 2003 book, "Profiles, Probabilities, and Stereotypes", that studies such matters as profiling by tax officials (selecting which taxpayers have to undergo an audit), airport security (selecting passengers for screening) and by police officers (for example, selecting cars for traffic stops). While profiling of some kind is a necessity for all these officials, the particular characteristics (dimensions) used for profiling can be highly problematic (see Driving While Black). For de Laat's study of Wikipedia profiling, "two types of complications are important: (1) possible ‘overuse’ of dimension(s) (an issue of profile effectiveness) and (2) social sensibilities associated with specific dimension(s) (a social and moral issue)." Overuse can mean relying on stereotypes that have no basis in reality, or over-reliance on some dimensions that, while having a non-spurious correlation with the deviant behavior, are over-emphasized at the expense of other relevant characteristics because they are more visible or salient to the profiler. While Schauer considers that it may be justified for "airport officials looking for explosives [to] single out for inspection the luggage of younger Muslim men of Middle Eastern appearance", it would be an over-use if "officials ask all Muslim men and all men of Middle Eastern origin to step out of line to be searched", thus reducing their effectiveness by neglecting other passenger characteristics. This is also an example of the second type of profiling complication, where the selected dimensions are socially sensitive – indeed, for the specific case of luggage screening in the US, "the factors of race, religion, ethnicity, nationality, and gender have expressly been excluded from profiling" since 1997.
Applying this to the case of Wikipedia's anti-vandalism efforts, de Laat first observes that complication (1) (overuse) is not a concern for fully automated tools like ClueBotNG – obviously their algorithm applies the existing profile directly without a human intervention that could introduce this kind of bias. For Huggle and STiki, however, "I see several possibilities for features to be overused by patrollers, thereby spoiling the optimum efficacy achievable by the profile embedded in those tools." This is because neither tool uses these features only in its automatic pre-selection of edits to be reviewed; both also show the human patroller, in the edit review interface, at least whether an edit was anonymous. (The paper examines this in detail for both tools, also observing that Huggle presents more opportunities for this kind of overuse, while STiki is more restricted. However, there seems to have been no attempt to study empirically whether this overuse actually occurs.)
Regarding complication (2), whether some of the features used for vandalism profiling are socially sensitive, de Laat highlights that they include some amount of discrimination by nationality: IP edits geolocated to the US, Canada, and Australia have been found to contain vandalism more frequently and are thus more likely to be singled out for inspection. However, he does not consider this concern "strong enough to warrant banning the country-dimension and correspondingly sacrifice some profiling efficacy", chiefly because there do not appear to be a lot of nationalistic tensions within the English Wikipedia community that could be stirred up by this.
In contrast, de Laat argues that "the targeting of contributors who choose to remain anonymous ... is fraught with danger since anons already constitute a controversial group within the Wikipedian community." Still, he acknowledges the "undisputed fact" that the ratio of vandalism is much higher among anonymous edits. Also, he rejects the concern that they might be more likely to be the victim of false positives:
normally [IP editors] do not experience any harm when their edits are selected and inspected as a result of anon-powered profiling; they will not even notice that they were surveilled since no digital traces remain of the patrolling. ... The only imaginable harm is that patrollers become over focussed on anons and indulge in what I called above 'overinspection' of such edits and wrongly classify them as vandalism ... As a consequence, they might never contribute to Wikipedia again. ... Nevertheless, I estimate this harm to be small. At any rate, the harm involved would seem to be small in comparison with the harassment of racial profiling—let alone that an 'expressive harm hypothesis' applies.
With this said, de Laat still makes the controversial call "that the anonymous-dimension should be banned from all profiling efforts" – including removing it from the scoring algorithms of Huggle, STiki and ClueBotNG. Instead of concerns about individual harm,
my main argument for the ban is a decidedly moral one. From the very beginning the Wikipedian community has operated on the basis of a 'social contract' that makes no distinction between anons and non-anons – all are citizens of equal stature. ... In sum, the express profiling of anons turns the anonymity dimension from an access condition into a social distinction; the Wikipedian community should refrain from institutionalizing such a line of division. Notice that I argue, in effect, that the Wikipedian community has only two choices: either accept anons as full citizens or not; but there is no morally defensible social contract in between.
Sadly, while the paper is otherwise rich in citations and details, it completely fails to provide evidence for the existence of this alleged social contract. While it is true that "the ability of almost anyone to edit (most) articles without registration" forms part of Wikipedia's founding principles (a principle that this reviewer strongly agrees with), the "equal stature" part seems to be de Laat's own invention – there is a long list of things that, by longstanding community consensus, require the use of an account (which after all is freely available to everyone, without even requiring an email address). Most of these restrictions – say, the inability to create new articles or being prevented from participating in project governance during admin or arbcom votes – seem much more serious than the vandalism profiling that is the topic of de Laat's paper. TB
Conferences and events
Registration is open for WikiConference North America October 7–10. The conference will include a track about academic engagement and Wikipedia in education.
A list of other recent publications that could not be covered in time for this issue—contributions are always welcome for reviewing or summarizing newly published research. This month, the list mainly gathers research about the extraction of specific content from Wikipedia.
"Large SMT Data-sets Extracted from Wikipedia" From the abstract: "The article presents experiments on mining Wikipedia for extracting SMT [ statistical machine translation ] useful sentence pairs in three language pairs. ... The optimized SMT systems were evaluated on unseen test-sets also extracted from Wikipedia. As one of the main goals of our work was to help Wikipedia contributors to translate (with as little post editing as possible) new articles from major languages into less resourced languages and vice-versa, we call this type of translation experiments 'in-genre' translation. As in the case of 'in-domain' translation, our evaluations showed that using only 'in-genre' training data for translating same genre new texts is better than mixing the training data with 'out-of-genre' (even) parallel texts."
"Recognizing Biographical Sections in Wikipedia" From the abstract: "Thanks to its coverage and its availability in machine-readable format, [Wikipedia] has become a primary resource for large scale research in historical and cultural studies. In this work, we focus on the subset of pages describing persons, and we investigate the task of recognizing biographical sections from them: given a person’s page, we identify the list of sections where information about her/his life is present [as opposed to nonbiographical sections, e.g. 'Early Life' but not 'Legacy' or 'Selected writings']."
"Extraction of lethal events from Wikipedia and a semantic repository" From the abstract and conclusion: "This paper describes the extraction of information on lethal events from the Swedish version of Wikipedia. The information searched includes the persons’ cause of death, origin, and profession. [...] We also extracted structured semantic data from the Wikidata store that we combined with the information retrieved from Wikipedia ... [The resulting] data could not support the existence of the Club 27".
"Learning Topic Hierarchies for Wikipedia Categories" (from frequently used section headings in a category, e.g. "eligibility", "endorsements" or "results" for Category:Presidential elections)
"'A Spousal Relation Begins with a Deletion of engage and Ends with an Addition of divorce': Learning State Changing Verbs from Wikipedia Revision History." From the abstract: "We propose to learn state changing verbs [such as 'born', 'died', 'elected', 'married'] from Wikipedia edit history. When a state-changing event, such as a marriage or death, happens to an entity, the infobox on the entity's Wikipedia page usually gets updated. At the same time, the article text may be updated with verbs either being added or deleted to reflect the changes made to the infobox. ... We observe in our experiments that when state-changing verbs are added or deleted from an entity's Wikipedia page text, we can predict the entity's infobox updates with 88% precision and 76% recall."
"Extracting Representative Phrases from Wikipedia Article Sections" From the abstract: "Since [Wikipedia's] long articles are taking time to read, as well as section titles are sometimes too short to capture comprehensive summarization, we aim at extracting informative phrases that readers can refer to."
"Accurate Fact Harvesting from Natural Language Text in Wikipedia with Lector" From the abstract: "Many approaches have been introduced recently to automatically create or augment Knowledge Graphs (KGs) with facts extracted from Wikipedia, particularly its structured components like the infoboxes. Although these structures are valuable, they represent only a fraction of the actual information expressed in the articles. In this work, we quantify the number of highly accurate facts that can be harvested with high precision from the text of Wikipedia articles [...]. Our experimental evaluation, which uses Freebase as reference KG, reveals we can augment several relations in the domain of people by more than 10%, with facts whose accuracy are over 95%. Moreover, the vast majority of these facts are missing from the infoboxes, YAGO and DBpedia."
"Extracting Scientists from Wikipedia" From the abstract: "[We] describe a system that gathers information from Wikipedia articles and existing data from Wikidata, which is then combined and put in a searchable database. This system is dedicated to making the process of finding scientists both quicker and easier."
"LeadMine: Disease identification and concept mapping using Wikipedia" From the abstract: "LeadMine, a dictionary/grammar-based entity recognizer, was used to recognize and normalize both chemicals and diseases to MeSH [ Medical Subject Headings ] IDs. The lexicon was obtained from 3 sources: MeSH, the Disease Ontology and Wikipedia. The Wikipedia dictionary was derived from pages with a disease/symptom box, or those where the page title appeared in the lexicon."
"Finding Member Articles for Wikipedia Lists" From the abstract: "... for a given Wikipedia article and list, we determine whether the article can be added to the list. Its solution can be utilized on automatic generation of lists, as well as generation of categories based on lists, to help self-organization of knowledge structure. In this paper, we discuss building classifiers for judging on whether an article belongs to a list or not, where features are extracted from various components including list titles, leading sections, as well as texts of member articles. ... We report our initial evaluation results based on Bayesian and other classifiers, and also discuss feature selection."
"Study of the content about documentation sciences in the Spanish-language Wikipedia" (in Spanish). From the English abstract: "This study explore how [Wikipedia] addresses the documentation sciences, focusing especially on pages that discuss the discipline, not only the page contents, but the relationships between them, their edit history, Wikipedians who participated and all aspects that can influence on how the image of this discipline is projected" [sic]. TB
Hu, Linmei; Wang, Xuzhong; Zhang, Mengdi; Li, Juanzi; Li, Xiaoli; Shao, Chao; Tang, Jie; Liu, Yongbin (2015-07-26). "Learning Topic Hierarchies for Wikipedia Categories" (PDF). Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Short Papers). Beijing, China. pp. 346–351.
Ekenstierna, Gustaf Harari; Lam, Victor Shu-Ming. "Extracting Scientists from Wikipedia". Digital Humanities 2016. From Digitization to Knowledge 2016: Resources and Methods for Semantic Processing of Digital Works/Texts, Proceedings of the Workshop, July 11, 2016, Krakow, Poland.
The following content has been republished from the Wikimedia Blog. Any views expressed in this piece are not necessarily shared by the Signpost; responses and critical commentary are invited in the comments. For more information on this partnership, see our content guidelines.
134,000 images are being uploaded to Wikimedia Commons, a central repository for free media, from ETH-Bibliothek, Switzerland’s largest public scientific and technical library.
Most of the photographs are drawn from the library's aerial photograph holdings (70,000 in all), with another 40,000 coming from the archives of Swissair, Switzerland's national airline until its bankruptcy in 2002.
The first 18,000 uploads come from Walter Mittelholzer, a Swiss aviation pioneer and entrepreneur. In his travels, which included the first north–south flight across the African continent, he took thousands of aerial photographs from places as varied as Spitsbergen (1923), a Norwegian island in the Arctic Ocean; Persia (1924–25); Kilimanjaro, the dormant volcano in modern-day Tanzania (1929–30); and Ethiopia (1934). You can see examples of his work sprinkled throughout this post.
“Mittelholzer captured sensational aerial images of landscapes, many of which had never been photographed from a bird’s-eye view before,” ETH-Bibliothek project coordinator Michael Gasser said. Mittelholzer used these images in a series of popular books chronicling his trips into what was then the great unknown; today, his work is used in post-colonial research.
"Policeman of the Emir of Kano"
Other images being uploaded are historical photographs of ETH-Bibliothek’s campus in Zurich, along with portraits of professors, students, and scientists at the same location.
Gasser says that while all of these images are already available on the internet, ETH-Bibliothek is “facilitating access to these valuable image sources ... we are trying to bring the material to where the users are.” All are licensed under CC BY-SA or are in the public domain.
The project to upload them to Wikimedia Commons stems from a collaboration between ETH-Bibliothek and Wikimedia CH, an independent organization that works to advance the Wikimedia movement in Switzerland. The collaboration was initiated through mutual contacts at Opendata.ch, the Swiss chapter of the Open Knowledge Foundation.
You can see the images for yourself as they are being uploaded on Commons. EE