User:Facing the Sky/Meta assessment

During February and March 2014 I visited articles at random (via Special:Random), noted their current assessment and whether they carried a stub tag, and (conservatively) re-assessed them. This was done in ten blocks of 50, for a total of 500 articles visited.

Results table

Rows give the assessment and stub-tagging state as found; the Stub through FA columns count articles by their rating after my review.

Initial assessment            Stub tagged?   Stub  Start    C    B   GA    A   FA  Total
unassessed, no wikiprojects   yes               3      2    0    0    0    0    0      5
unassessed, no wikiprojects   no                7     15    0    0    0    0    0     23
unassessed                    yes              16      0    0    0    0    0    0     16
unassessed                    no                6     25    4    0    0    0    0     35
stub                          yes             172      4    0    0    0    0    0    176
stub                          no               47     19    1    0    0    0    0     67
start                         yes               2      3    0    0    0    0    0      5
start                         no                2    100    3    0    0    0    0    105
C                             no                0      1   14    0    0    0    0     15
B                             no                0      0    0    9    0    0    0      9
GA                            no                0      0    0    0    2    0    0      2

Analysis

Of the 500 articles, 457 weren't disambiguations or lists. Of those 457, 430 had wikiprojects assigned, and 379 were already assessed. I updated the assessments of 32 articles that had already been assessed.

202 articles had stub tags. For 5 of those, the stub tag was inconsistent with the article's assessment (i.e., the article had a non-stub rating); of those 5, 2 were over-assessed rather than inappropriately stub-tagged. 6 stub-tagged articles whose assessment was consistent with the tag had their assessments updated.

243 articles were pre-assessed as stubs. 176 of those 243 had stub tags, and I updated the assessments of 4 of those. 67 of those 243 didn't have stub tags, and I updated the assessments of 20 of those.

Overall, I either gave an initial assessment to, or updated the assessment of, 114 articles. 30 of those were simply stale assessments, 5 were inconsistent with their stub tags, and 79 had no initial assessment.

Comparison with other data sources

The Wikipedia 1.0 team runs a bot that tallies article assessments. It reports 3,831,079 pages assessed as stub/start/C/B/GA/A/FA (an 'article' rating). The main page gives a total article count of 4,479,979; a simple ratio of the two gives an 85.5% rate of 'article' assessments. In the sample, however, only 379/500 = 75.8% of pages had an 'article' assessment. The 99% margin of error for a sample of 500 is about 6 percentage points; that is, the survey suggests a 99% chance that the 'true' value lies between 71% and 80%. The disagreement is too large to be explained by sampling error alone.

If we assume that every unassessed page would receive an 'article' assessment, we get 458/500 = 91.6% of sampled pages being articles (99% bounds: 86%–96.8%). This number isn't comparable to the WP1.0 count above, since it includes articles with no assessment, and it does not include list articles of any sort.
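As a rough cross-check, here is a minimal sketch that recomputes these intervals with a plain normal approximation to the binomial (z = 2.576 for 99% confidence). It is illustrative only; the bounds quoted above may have been derived with a different or more conservative method, so the figures agree only approximately.

import math

# Normal-approximation confidence interval for a sampled proportion.
# z = 2.576 corresponds to a 99% confidence level.
def approx_interval(successes, n, z=2.576):
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# 379 of 500 sampled pages already had an 'article' assessment.
print(approx_interval(379, 500))    # roughly (0.709, 0.807)

# 458 of 500, if every unassessed page is assumed to be an article.
print(approx_interval(458, 500))    # roughly (0.884, 0.948)

# The WP1.0 bot's figure for comparison: 3,831,079 / 4,479,979.
print(3831079 / 4479979)            # roughly 0.855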

Disambiguation pages?

I didn't record whether disambiguation pages (including set index pages) had an attached wikiproject and an assessment. Category:All disambiguation pages contains 244,713 pages and Category:All set index articles contains 50,104. If the sampling error is on the larger side and a large majority of disambiguation pages carry an 'article' assessment (i.e., stub/start/C/B/GA/A/FA), that would explain the disagreement.

The category counts imply that 93.4% of pages are neither disambiguation nor set index pages (lists included). This is within the estimated confidence interval for the fraction of total pages being articles, but we should be careful in drawing conclusions because of the inclusion or exclusion of lists (including set indices). This explanation could be sufficient if a large majority of disambiguation pages have 'article' assessments (and so are counted as such by the WP1.0 bot).
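For reference, the 93.4% figure follows directly from the category counts quoted above:

total_pages = 4479979        # article count from the main page
disambigs = 244713           # Category:All disambiguation pages
set_indices = 50104          # Category:All set index articles

# Pages that are neither disambiguation nor set index pages (lists included).
print((total_pages - disambigs - set_indices) / total_pages)   # roughly 0.934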

Redirects?

Another possibility is that redirects are being counted as articles by the 1.0 team's bot, perhaps as post-merge detritus (a redirect left behind when an assessed article is merged away, with its assessed talk page still in place). I doubt that the whole shortfall of roughly 400,000 articles is down to that, but it might be a contributing factor.

Proving it

Both hypotheses could be examined properly with a database dump, and exact figures for the number of redirects and disambiguation pages with 'article' assessments could be obtained. This could be done with just the categorylinks and page tables.
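As a sketch of what that check might look like, the query below counts mainspace redirects whose talk page sits in an assessment category. It assumes a local replica exposing the standard MediaWiki page and categorylinks tables and uses pymysql; the connection details and the 'X-Class ... articles' category-name pattern are assumptions rather than verified specifics. The same join, filtered on membership in Category:All disambiguation pages instead of on page_is_redirect, would give the corresponding disambiguation count.

import pymysql

# Hypothetical sketch: count mainspace redirects whose talk page carries an
# 'article'-class assessment category (Stub/Start/C/B/GA/A/FA-Class ...).
# Table and column names follow the standard MediaWiki schema; everything
# else (connection details, category naming pattern) is assumed.
CLASSES = ["Stub", "Start", "C", "B", "GA", "A", "FA"]

conn = pymysql.connect(host="localhost", user="reader", password="", database="enwiki")
with conn.cursor() as cur:
    likes = " OR ".join("cl_to LIKE %s" for _ in CLASSES)
    patterns = ["{}-Class%articles".format(c) for c in CLASSES]
    cur.execute(
        """
        SELECT COUNT(DISTINCT art.page_id)
        FROM page AS art                  -- the article itself (namespace 0)
        JOIN page AS talk                 -- its talk page (namespace 1)
          ON talk.page_title = art.page_title AND talk.page_namespace = 1
        JOIN categorylinks AS cl
          ON cl.cl_from = talk.page_id    -- assessment categories sit on the talk page
        WHERE art.page_namespace = 0
          AND art.page_is_redirect = 1
          AND (""" + likes + """)
        """,
        patterns,
    )
    print("redirects with an 'article' assessment:", cur.fetchone()[0])
conn.close()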

Rating accuracy

I updated 32/379 = 8.4% of existing assessments. The bulk of these (20) were on articles assessed as stubs but carrying no stub tag; that category contained 67 articles, an assessment error rate of roughly 30%. Another place to look for inaccurate ratings is articles that have a non-stub rating but a stub tag (5 articles in the survey), which will always need resolving one way or the other.

However, both are small compared to the number of unassessed articles, which made up 79/500 = 15.8% of the total sample.