Wikipedia's lead sentence problem: Over the years, the lead sentences of Wikipedia articles have become less and less readable. Wikipedian Kaldari argues that it's time for us to address this problem and make Wikipedia great again.
CAMPBELL, John, LL.D. (1708-1775), a miscellaneous author, was born at Edinburgh, March 8, 1708.
This allowed a reader to more easily distinguish between the 100+ notable people named John Campbell (only one of whom was actually lucky enough to get an article in the 9th edition). Although this convention was a bit awkward and redundant, it served a useful purpose (in the absence of disambiguation pages), and was kept in all subsequent editions.
When Wikipedia was created in 2001, it sought to emulate the successful model of the Encyclopædia Britannica, and many editors adopted the convention of including birth and death years in the lead sentence. Here is the lead sentence for Christopher Columbus as it appeared on June 13, 2001:
Christopher Columbus (1451?-1506) was a probably Genovian sailor who crossed the Atlantic in service of Spain.
Little did Thomas Spencer Baynes realize that Wikipedia editors would eventually expand on his convention, including not only birth and death years, but entire birth and death dates, birth and death dates in alternate calendars, birth and death locations, alternate names, maiden names, foreign names, pronunciations, foreign pronunciations, and transliterations. Fifteen years later, here's what Christopher Columbus's lead sentence had become:
What began as a concise, encyclopedic sentence had slowly grown into a sprawling mess of multiplying metadata, a sentence so densely packed as to be unreadable. This isn't just a subjective opinion. If you chart the Flesch Reading Ease score of the sentence over the years, you'll see an almost continuous decline since 2002. Nor is this an isolated example. The metadata virus has spread from biographical articles to other subjects as well, like geography:
The problem has become so noticeable that many reusers of Wikipedia content (including the WMF itself) have started stripping out parenthetical phrases from the lead sentence in certain contexts. If you search for "Christopher Columbus" on Google, you'll see a much more digestible description, both in the Knowledge Graph and under the Wikipedia search result. If you turn on the Page Previews beta feature in your preferences and hover over Christopher Columbus, you'll also see a much shorter version. The Wikipedia apps even experimented with removing parenthetical phrases from the lead sentences in the articles themselves. This has led to heated debates about whether we are potentially removing important information (as some parenthetical phrases consist of content other than metadata). Without a clear way to identify which parenthetical phrases are useful and which are detrimental, I'm sure these issues will remain unresolved. What's really needed is a vigorous debate by the Wikipedia community about how to bring this problem under control and make our articles readable again.
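For anyone who wants to try this at home, here is a rough Python sketch of the two ideas discussed above: stripping parenthetical phrases from a sentence and scoring the result with the Flesch Reading Ease formula (206.835 − 1.015 × words per sentence − 84.6 × syllables per word). It is not the code that Google, Page Previews, or the Wikipedia apps actually use. The function names are my own, the syllable counter is a crude vowel-group heuristic, and the "Jane Example" sentence is invented purely for illustration, so treat the scores as approximate.

```python
import re

# Lead sentence for Christopher Columbus as quoted above (June 13, 2001).
COLUMBUS_2001 = (
    "Christopher Columbus (1451?-1506) was a probably Genovian sailor "
    "who crossed the Atlantic in service of Spain."
)

# Invented sentence (not from any real article) with word-heavy metadata.
CLUTTERED = (
    "Jane Example (born Jane Sample; 1 January 1900 in Exampleville, "
    "died 2 February 1980) was a sailor from Exampleville."
)


def strip_parentheticals(sentence: str) -> str:
    """Remove every (...) group, innermost first, then tidy extra spaces."""
    previous = None
    while previous != sentence:
        previous = sentence
        sentence = re.sub(r"\s*\([^()]*\)", "", sentence)
    return re.sub(r"\s{2,}", " ", sentence).strip()


def count_syllables(word: str) -> int:
    """Approximate syllables as vowel groups, discounting a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease; higher means easier. Numbers are not counted as words."""
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not words:
        raise ValueError("text contains no words")
    # A sentence ends at ., ! or ? followed by whitespace or end of text.
    sentences = max(len(re.findall(r"[.!?]+(?:\s|$)", text)), 1)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / sentences) - 84.6 * (syllables / len(words))


if __name__ == "__main__":
    print(strip_parentheticals(COLUMBUS_2001))
    for label, text in [("cluttered", CLUTTERED),
                        ("stripped", strip_parentheticals(CLUTTERED))]:
        print(f"{label}: {flesch_reading_ease(text):.1f}")
```

Note that blind stripping like this is exactly what the debates are about: the function cannot tell metadata from a parenthetical that carries real content, so it throws away both.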
If we don't take significant steps to address this problem, the metadata disease is only going to keep multiplying and spreading. If left unchecked, I fear this is what our future will look like:
[Excerpt from the Americapedia article about Wikipedia, copyright 2034, used with permission.]
...Like frogs in a pot of boiling water, the proliferation of lead sentence metadata happened so slowly that no one noticed until 2021 when John Seigenthaler's son published a devastating video on ClickNews in which he read aloud the lead sentence of his Wikipedia article, and then wept for 3 minutes.
Seigenthaler's video caught the attention of the recently re-elected Donald Trump, who only weeks before had dissolved The New York Times and Washington Post by executive order. Trump immediately posted a flurry of tweets eviscerating the venerable online encyclopedia. By the next day, Wikipedia was no more.
Let's avoid this sorry fate and make Wikipedia great again!
^German Wikipedia also adopted the convention of preceding all death dates with a dagger (called a "Kreuz" in German), which has led to endless debates about whether or not the symbol is Christian and thus inappropriate to use for non-Christian biographies. Luckily, such a convention doesn't seem to exist in English encyclopedias!