Wikipedia:Assessing articles

A highly important butterfly

This essay discusses the criteria and purpose of article assessments, recorded in talk page templates like {{WikiProject Venezuela}}. Assessments are useful if done right, but are often done wrong. Many articles are given lower quality or importance ratings than they merit based on the criteria. A common mistake is to assess short articles as stub or start class even when there is nothing more to be said about the subject, and longer articles as B (or higher) class even when there is much more to be said.

An unjustified "stub class" assessment with no explanation may cause a potentially productive newbie to give up. However, an author may be blind to defects that a reviewer sees at once. Reviewers are therefore encouraged to give notes on the article talk page that state what they feel needs improvement, preferably relating the notes to the project's assessment criteria, and authors should feel free to ask reviewers for more detailed feedback on what needs improvement.

There may be more leverage in bringing many articles up to C class, where they meet the needs of most casual readers, than in bringing a few up to the very demanding standards of FA class.

By and for project members

Assessments are for project members, not for casual readers. Most Wikipedia readers never see ratings, but some may click on the talk tab by accident and see that the article they were reading has a C rating. That seems like a rather mediocre grade for an article that gave them all they wanted to know. They shrug and move on. They will not click on the quality scale to find out what C class means.

Assessments should ideally be done only by project members, or at least should be reviewed by project members. An article on a species of butterfly may cover all that distinguishes it from others in its genus. The article is well written, well sourced and complete, as anyone who knows about butterflies can see. But the article is just two paragraphs long. A general editor, busily working through a list of new articles, may give it a Stub rating because it is so short. B would be more appropriate. An article on a 19th-century physicist may give an excellent and well-illustrated overview of their life, but skim over the work for which they are known. The same busy editor may assign it a B rating because it is so long and thorough, but if most readers are more interested in the work than the person, it may be a Start.

To assess an article properly the reviewer should understand where the article fits in the spectrum of importance for the project, what information should be included in this type of article and what casual readers would be looking for. The reviewer must understand what could make a butterfly or physicist very important to the project, as opposed to a mundane butterfly or physicist. They must also understand the standard information to be recorded about butterflies and physicists, and know something about the more important subjects, so the presence or absence of the information tells them how complete the article is. The length of the article is irrelevant.

Quality ratings: an awkward compromise

Quality ratings try to give a combined assessment of three quite different aspects of any article:

  1. Prose: Is the article well organized, easy to read and easy to understand, avoiding needless jargon, with no spelling or grammar errors?
  2. Technical style: Does the article cite reliable sources to support what it says? Does it have appropriate formatting, wikilinks, categories, etc.?
  3. Coverage: Does the article give detailed and in-depth coverage of all significant aspects of the subject?

The three are independent. A very readable article may be a hoax about a non-existent subject. A perfect article in terms of technical style may be poorly written and have major gaps. A professor may write the definitive article on a subject, but their English is very poor and they see no need to add citations to their own work.

The table below summarizes the criteria given at Wikipedia:WikiProject assessment. GA and FA are similar to A: fairly complete and well-written.

| Class | Coverage | Prose & style |
|---|---|---|
| Stub | Very little meaningful content | May be incomprehensible |
| Start | Some meaningful content, but most readers will need more | May need improvements to organisation, grammar, spelling, writing style, jargon use and citations |
| C | Still major gaps, but useful to a casual reader | May have problems with clarity, balance, flow, bias or original research |
| B | Mostly complete, may not satisfy a serious student or researcher | Reasonably well-written |
| A | Essentially complete, very useful to readers | Well-written, clear, well referenced |

Coverage is the main criterion in assessing Stub, Start and C class articles, but truly awful prose or severe technical issues can drag a rating down to Stub. With B and GA/A/FA, where coverage is mostly complete, quality of prose and technical style are more important. One approach is to take the lowest rating of the three aspects. If an article's prose, style or coverage is Stub level, the article is Stub level. If it is not a Stub, but prose, style or coverage is Start level, the article is Start level. And so on. However, this may work poorly with Start class, which could describe a well-written article that just needs a bit more information to become C class, or a rambling, confused and unsourced essay that should be rewritten from scratch.
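The "lowest rating of the three aspects" approach can be sketched in a few lines of Python. This is only an illustration of the rule described above, not a tool used by any project: the names Quality and overall_rating are hypothetical, and the scale is simplified to the five classes in the table, with GA and FA treated as equivalent to A.

```python
from enum import IntEnum


class Quality(IntEnum):
    """Simplified quality scale, lowest to highest (assumption: GA/FA folded into A)."""
    STUB = 0
    START = 1
    C = 2
    B = 3
    A = 4


def overall_rating(prose: Quality, style: Quality, coverage: Quality) -> Quality:
    """Return the lowest of the three aspect ratings (hypothetical helper)."""
    return min(prose, style, coverage)


# A well-written, well-sourced article that still needs more information
# is held back by its weakest aspect, in this case coverage.
rating = overall_rating(prose=Quality.A, style=Quality.B, coverage=Quality.START)
print(rating.name)
```

The sketch also shows the weakness noted above: a polished article with thin coverage and a rambling, unsourced essay both come out as Start.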

According to the policy Wikipedia:What Wikipedia is not, "Information should not be included in this encyclopedia solely because it is true or useful. A Wikipedia article should not be a complete exposition of all possible details, but a summary of accepted knowledge regarding its subject." However, the guideline Wikipedia:WikiProject assessment says of the FA grade, "it neglects no major facts or details ... a definitive source for encyclopedic information." Google's list of synonyms of "encyclopedic" includes "comprehensive, complete, thorough, thoroughgoing, full, exhaustive, in-depth, wide-ranging, broad-ranging, broad-based, all-inclusive, all-embracing, all-encompassing ...".

There is room for debate over what constitutes "complete" coverage, but three assertions seem uncontroversial:

  1. A Start class article does not meet the needs of the typical casual reader, our primary audience, but a C class article does, even though it may be quite incomplete. (A "casual reader" is curious enough to have clicked on a link to the article or searched for the title. They may not be the average Joe, but they are not an expert on the subject area.)
  2. If a short article gives all that has been published about the subject, it must be considered complete even if many questions remain unanswered. "Complete" measures how close the article comes to what is possible rather than how close it comes to the ideal. Thus Beornred of Mercia, which was rated Stub as of December 2017, perhaps should be rated A. There is no more to be said.
  3. A long article may still be incomplete if it omits significant available information, and so falls short of what is possible, even if it meets the needs of almost all readers.

Importance ratings: a variety of definitions

The importance scale, also called the priority scale, is specific to a project. An article may be highly important to one project and less important to another. There is no "official" scale, and projects are encouraged to define their own, specialized scales. Different projects may consider different factors to evaluate importance. A project's importance scale typically answers the question, "How important is it to Wikipedia's coverage of this project's subject area that there should be an article for this topic?" Importance is often assigned incorrectly. Thus an article on a minor but notable artist, river or movie may be rated Low importance since the subject is not particularly significant, but should be rated Mid importance since deleting the article would leave a gap in Wikipedia's coverage of the project's subject area.

Some projects refer to the scale documented at {{Importance scheme}}, while others refer to the definitions in the Wikipedia:Version 1.0 Editorial Team/Release Version Criteria. Other projects have customized scales which may consider factors such as notability of article topics, relationship to a "main" article for the project, centrality to understanding the project's subject area, reader interest and expectations and so on. The WikiProject Video game Importance scale takes the interesting approach of breaking the project scope into sub-areas such as video games and series, in-game elements, companies, hardware and so on, and giving different Top/High/Mid importance criteria for each sub-area. {{WikiProject Visual arts}} does not assess importance at all due to the difficulty in comparing such things as 19th century English history paintings, traditional Chinese porcelain and pre-Columbian architecture.

The proposed default scale documented at {{Importance scheme}} is based on the subject's notability within the field of knowledge covered by the project (an estimate of how many sources discuss the subject in some depth) combined with an estimate of whether there is worldwide interest compared to purely local interest.

Importance ratings are used by the Wikipedia:Version 1.0 Editorial Team to decide which articles to include in an offline edition of Wikipedia. The editorial team's article selection bot also looks at factors such as the number of page hits, links from other pages, and a score of how broad the project is.[a] The definitions in the Wikipedia:Version 1.0 Editorial Team/Release Version Criteria are based on how central the subject is to the field, which may roughly correlate with notability, but they ignore the geographical distribution of interest in the subject:

| Need | The article is of priority or importance, regardless of its quality |
|---|---|
| Top | Subject is a must-have for a print encyclopedia |
| High | Subject contributes a depth of knowledge |
| Mid | Subject fills in more minor details |
| Low | Subject is mainly of specialist interest |
| Bottom | (Optional) Subject has no real significance to the project |
| No | (Optional) Subject is a disambiguation or redirect page, residing in article space |

These two scales are somewhat inconsistent, and a given project may have its own scale. The common factor is that an article is assigned importance based on an informed view of how important the article is to the project's subject area, and may be used to prioritise work by project members. Ideally importance is assigned or reviewed by a project member. It should not be assigned based on a vague idea of how important the subject is in the wider scheme of things, or how important it is to readers. In the second quarter of 2017 Darth Vader consistently got more pageviews than United States and World War II combined. This factoid should not affect the importance ratings of these three articles.

Sample article

An example of a mid-importance article that meets the criteria for C class:

 
 
Location in Ruritania

Slatsnovgrad is a hill village in the Tslatzyn province of Ruritania.

Slatsnovgrad is in the northeast of Tslatzyn province at coordinates 65°53′53″N 72°08′49″E, at an elevation of 1,673 metres (5,489 ft) above sea level.[1] It may be reached by a two-hour drive over a rough dirt road from Tslatzyn City to the south.[2] As of 2015 the population was 340, of whom 54% were female and 44.5% were under the age of 18. The economy is based on raising goats for milk, wool and meat.[3] There is a shop in the village that sells ammunition, gasoline, bread, olives and kefir (fermented goat milk).[2]

References

•••

In this example there are no major problems with the prose and technical style, and most readers will not need more. Although there are still major gaps in information, the article tells the typical reader looking up the village on their phone all they want to know about Slatsnovgrad. It gives a C level of coverage for a minor village. An infobox with a picture and a better map would be nice, but these are not required. The article is mid-importance because the citations indicate that the subject has achieved notability, at least locally. It fills in minor details and may be of interest to readers other than social scientists who specialize in villages. Unfortunately, many reviewers would glance at the article, see just one paragraph on a small village, and give it Stub and Low importance ratings.

The example meets the needs of most readers, but there may be more to be said:

There is a ruined stone fort in the north of the village that was the birthplace and power base of Borg the Greedy (d. 1154), regent of Ruritania from 1143 to 1154 during the Second Turkish War.[4] ... The Soviet-era Slatsnovgrad Dam supplies the village with water and hydroelectric power.[5] ... The locally fermented kefir is said to have aphrodisiac properties.[6] ...

The additional details may be of interest to readers, but are not needed to achieve C class, since the casual reader would not know the details were missing. With this additional information, the article may have reached B class, or even A class if no sources give further information on the village. The serious student or researcher may be dissatisfied, but the article is a fairly complete treatment of the subject. There is no more to be said without indulging in original research. This leads to a paradox: the more information is available, the harder it is to get above C class.

To illustrate, suppose the village of Slatsnovgrad was founded shortly after the Second Turkish War to house peasants subject to Slatsnovka Abbey. The monks kept detailed records from the foundation of the village up to the revolution of 1923. The Ruritanian People's Republic published the complete records in 13 volumes between 1935 and 1939. Every birth, marriage and death is recorded, as are monthly weather, livestock numbers, crop yields, prices, building, road and irrigation works, and detailed accounts of plagues, wars and rebellions. Several major academic books have been based almost entirely on the Annals of Slatsnovka, discussing what they reveal of different aspects of central European culture, history and economy. The Wikipedia article can never be more than a superficial overview of this huge trove of information. It cannot be considered "mostly complete" or "essentially complete", so must remain C class for ever.

Article life cycle

Wikipedia:WikiProject assessment describes a smooth progression as an article moves step by step from Stub to Featured Article. The reality is different. The normal life cycle is "create as Stub or Start, then stagnate". The life cycle of a few select articles is "create as C, stagnate, upgrade to GA or FA, stagnate." (Controversial articles have a complex life cycle which is unrelated to quality assessments, so it is not discussed here.)

  • Most articles on uncontroversial subjects are created with a series of edits, sometimes spread over several days.
  • Soon after a bot has put the new article onto project lists, a new article watcher rates it if the creator has not yet done so. Stub and Start are by far the most common quality ratings, and "Low" the most common importance rating. The ratings are often incorrect and often premature.
  • The creator may continue to improve the article after the initial rapid assessment, but it is rarely re-assessed.
  • The article now enters a stagnant phase where various editors tweak spelling, punctuation, categories, links and so on, but add little real content. Editors working on related articles may add a sentence or two of more substantial content, but will usually leave the assessment unchanged.
  • A Stub may be nominated for deletion, prompting a rescue job and an upgrade to C class. This is not what the deletion process is for, but it happens.
  • An editor may take on the challenge of moving a C or B class article up to GA or FA status. There is a flurry of activity as editors add substantial content and make many copy edits, followed by approval of the upgrade. The article then becomes stagnant again.
  • Few articles are upgraded to A status, probably because of the lack of a recognition mechanism.

Ratings are used by bottom-feeding and top-feeding editors.

  • Bottom-feeding editors work through sets of Stub articles making the same enhancements to all of them, such as adding an infobox and basic data from a standard source. Their reward is knowing that they have added useful information to a lot of articles without doing any heavy-duty research. They rarely change the assessment.
  • Top-feeding editors browse among the B or C class articles, bringing them up to GA status, or try to bring GA articles to FA status. Their reward is bragging rights, and perhaps publication of "their" article on the front page.

One may question the value of developing an article to GA or FA status in an attempt to satisfy the serious student or researcher. Would any serious student or researcher use Wikipedia? Perhaps getting more articles up to C class, meeting the needs of most readers, gives greater payback. But most articles stay as Stub class for ever, or move to the Start class garbage can. A Start class article "... is quite incomplete ... might or might not cite adequate reliable sources ... is weak in many areas. Quality of the prose may be distinctly unencyclopedic, and MoS compliance non-existent ... needs substantial improvement in content and organisation. Also improve the grammar, spelling, writing style and improve the jargon use." No sane editor would want to fix up a mess like that.

Statistical analysis

Chart 1: All non-list articles
Chart 2: Top-, high- and mid-importance articles
Chart 3: GA, A and FA

Statistics for the English Wikipedia derived from Wikipedia:Version 1.0 Editorial Team/Statistics as of 2017-05-17 follow. Where an article has been rated for quality and/or importance by more than one project, the highest quality and importance ratings are used. Thus an article counts as high importance if it is high importance for WikiProject Furry even if it is low importance for WikiProject Anime and manga.
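The "highest rating wins" rule can be sketched as follows. This is a minimal illustration of the rule as described here, not the code behind the statistics page. The function names are hypothetical, the ordering of the quality classes is an assumption taken from the order in which this essay lists them (Stub lowest, FA highest), and the quality ratings in the example are made up.

```python
# Assumed orderings, lowest to highest.
QUALITY_ORDER = ["Stub", "Start", "C", "B", "GA", "A", "FA"]
IMPORTANCE_ORDER = ["Low", "Mid", "High", "Top"]


def highest(ratings, order):
    """Return the highest rating found in `order`, or None if nothing is rated."""
    rated = [r for r in ratings if r in order]
    return max(rated, key=order.index) if rated else None


def combine(project_ratings):
    """Combine per-project (quality, importance) pairs into one pair (hypothetical helper)."""
    qualities = [quality for quality, _ in project_ratings.values()]
    importances = [importance for _, importance in project_ratings.values()]
    return highest(qualities, QUALITY_ORDER), highest(importances, IMPORTANCE_ORDER)


# Hypothetical article rated by two projects; None means "not assessed".
article = {
    "WikiProject Furry": ("Start", "High"),
    "WikiProject Anime and manga": ("C", "Low"),
}
print(combine(article))
```

One consequence of this choice is that a single generous project can lift an article's counted rating, which is worth keeping in mind when reading the totals.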

Counts of articles as of 2017-05-17 by quality rating and by importance:

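| Quality rating | Count |
|---|---|
| FA | 10,837 |
| A | 2,296 |
| GA | 44,953 |
| B | 183,764 |
| C | 356,458 |
| Start | 1,910,101 |
| Stub | 3,265,699 |
| Not assessed | 552,983 |

| Importance | Count |
|---|---|
| Top | 47,938 |
| High | 169,008 |
| Mid | 654,866 |
| Low | 2,850,796 |
| Not assessed | 173,301 |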

Chart 1 shows the distribution of articles with different quality ratings across the various levels of importance. Most articles are considered low importance, or have not had their importance assessed, and almost all articles that have had their quality assessed are rated Stub or Start, which by definition means they do not meet the needs of most users. This may not be a serious issue. A Stub class article for an obscure subject may slightly annoy the rare reader who is searching for information on the subject, but otherwise does no harm. If most searches find articles that are C-class or above, Wikipedia is working well.

Chart 2 zooms in to show the distribution of top-, high- and mid-importance articles. Average quality is better for top- and high-importance articles than for mid-importance articles since project members are more likely to focus on improving the more important articles. Of the 51,011 top-importance articles there are only 4,243 Stub articles and 17,346 Start articles. Stub and Start articles still account for most mid-importance subjects, and by definition do not meet the needs of most users. This may not be a serious concern if, as is often the case, importance ratings are unrelated to levels of reader interest.

Chart 3 shows the distribution by importance of articles with quality GA and above. FA articles are most likely to be for top- or high-importance topics, while GA articles include more mid- and low-importance topics. There are relatively few A-class articles, perhaps due to the lack of reward for taking an article to this level. If an editor is going to make the effort to bring an article up to A, they may as well take it all the way to FA.

Educationalists have found students retain most interest in a subject when they score about 70% on tests. If they score much higher, they think the subject is boring. If they score much lower, they think it is too hard, and may give up. Well-designed academic tests aim for a median score around 70%. If we take C-class or above as a success, only 10% of editors succeed. We are desperately short of new editors. Possibly the criteria are too rigorous or the scoring is too harsh.
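The 10% figure can be checked against the quality counts given above for 2017-05-17. The sketch below is only a worked calculation; the function name and the choice to leave unassessed articles out of the denominator by default are assumptions made for illustration.

```python
# Article counts by quality rating as of 2017-05-17, copied from the counts above.
COUNTS = {
    "FA": 10_837,
    "A": 2_296,
    "GA": 44_953,
    "B": 183_764,
    "C": 356_458,
    "Start": 1_910_101,
    "Stub": 3_265_699,
    "Not assessed": 552_983,
}

SUCCESS_CLASSES = ("FA", "A", "GA", "B", "C")


def share_c_or_above(counts, include_unassessed=False):
    """Fraction of articles rated C class or above (hypothetical helper)."""
    good = sum(counts[name] for name in SUCCESS_CLASSES)
    total = sum(
        n for name, n in counts.items()
        if include_unassessed or name != "Not assessed"
    )
    return good / total


print(f"Assessed articles only: {share_c_or_above(COUNTS):.1%}")
print(f"Including unassessed:   {share_c_or_above(COUNTS, include_unassessed=True):.1%}")
```

Either way the share comes out at roughly one article in ten, measured per article rather than per editor.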

Notes

  a. ^ The fans of an obscure singer could set up a wikiproject devoted to that singer, with project-specific importance criteria that ensure the main article about the singer gets Top importance. However, since the project scope is very narrow and there would be few hits on the article, the Version 1.0 Editorial Team article selection bot would give it a relatively low importance score.