Talk:Usage share of operating systems/Archive 2

This page has got slightly huge, and discussion seems to have settled down for the moment, so I've archived it.--Harumphy (talk) 12:29, 3 February 2010 (UTC)

Who decided that it is median?

What is the logic behind using the median rather than the mean as the average? Please kindly justify it. Pallab1234 (talk) 20:35, 24 April 2010 (UTC)

There's archived discussion about this in various places in Talk:Usage_share_of_operating_systems/Archive_1. Essentially, a median is preferred to a mean because it is less influenced by idiosyncratic highs and lows in some of the cited source figures. --Harumphy (talk) 09:37, 25 April 2010 (UTC)
I think we should use mean instead. Cityscape4 (talk) —Preceding undated comment added 19:18, 5 July 2010 (UTC).
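
For anyone weighing the two options, here is a minimal Python sketch of the robustness argument. It uses the five 'Other' figures from the sandboxed table further down this page. The choice of column and the extra outlier value are assumptions made purely for illustration.

```python
from statistics import mean, median

# 'Other' column (percent) from the sandboxed table later on this page:
# Net Applications, StatCounter, Webmasterpro, W3Counter, Wikimedia.
other_share = [1.06, 0.79, 0.37, 3.12, 2.52]

med = median(other_share)
avg = mean(other_share)
print(f"median = {med:.2f}%, mean = {avg:.2f}%")

# Add one wildly different source (hypothetical value, not from any cited
# source) and compare how far each summary moves.
with_outlier = other_share + [15.0]
med_out = median(with_outlier)
avg_out = mean(with_outlier)
print(f"with outlier: median = {med_out:.2f}%, mean = {avg_out:.2f}%")
```

The mean is pulled much further by the single extreme figure than the median is, which is the behaviour described in the reply above.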

Chart is not correct

If you add up all the numbers it comes up to 96.77. What other systems comprise the 3.23%? There is a category for "other" so at worst the remaining 3.23% should be added in there. 99.149.197.13 (talk) 20:58, 23 March 2010 (UTC)Connie Jo

All data from the chart comes from the median values of the table -- even the other. Unfortunately, this does result in some missing percentage. I'll think about changing it to show a full 100%. But the problem is that adding the 3.23% wouldn't be what's reflected from our sources. Hmmm.... Jdm64 (talk) 00:43, 24 March 2010 (UTC)
The medians do not and should not be expected to add up to 100%. So there's nothing 'missing'. --Harumphy (talk) 11:23, 24 March 2010 (UTC)
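
A short Python sketch of why column medians need not total 100%. It uses the January 2010 figures from the web client table in the "rationalising notes" section of this page, not the March chart discussed here, so the exact shortfall differs from the 3.23% quoted above.

```python
from statistics import median

# Shares in percent: Windows (all versions), Mac, iPhone, Linux, Other.
# None marks a figure the source does not report.
rows = {
    "AT Internet Institute": [92.80, 4.93, 0.75, 1.00, 0.52],
    "Net Applications":      [92.02, 5.13, 0.47, 1.08, 1.30],
    "StatCounter":           [92.31, 5.08, 0.52, 0.76, 1.33],
    "StatOwl":               [88.02, 11.48, None, 0.39, 0.11],
    "W3 Counter":            [86.56, 7.87, 0.72, 1.73, 3.12],
    "Wikimedia":             [88.18, 6.75, 1.00, 1.55, 2.52],
}

# Each source's own figures add up to 100%.
row_totals = {name: round(sum(v for v in vals if v is not None), 2)
              for name, vals in rows.items()}

# Column medians, skipping missing values, and their total.
columns = list(zip(*rows.values()))
medians = [median(v for v in col if v is not None) for col in columns]
median_total = sum(medians)

print(row_totals)
print(f"sum of column medians: {median_total:.2f}%")
```

Every row totals 100%, but the medians are drawn from different sources in each column, so their sum comes out a little under 100% (roughly 99.1% for these figures).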

www.webmasterpro.de

Here's a source of web client stats for German-language pages which looks quite interesting:[1]. Here's a translation page:[2]. It measures over 100,000 commercial and private, predominantly German-language sites.--Harumphy (talk) 13:02, 3 February 2010 (UTC)

Linux, Android and Symbian

I feel a bit uneasy about including Android in the Linux column. Android is of course based on the Linux kernel, but of the four sources we cite, only one (Wikimedia) lumps Android in with Linux. The other three treat it as something distinct. As we're supposed to be summarising the sources I think we should have a separate Android column. The only other mobile OS that is cited by at least three sources and has a greater share than Android is Symbian (median 0.24%) so I think we should add a column for that too.--Harumphy (talk) 14:29, 3 February 2010 (UTC)

I agree that we should put Android in a separate column from the Linux one. Cityscape4 (talk) 20:40, 24 February 2010 (UTC)
Since posting the above comment on Feb 3, the table has been modified to split a single Linux column into 'mainstream' and Android columns, both under a Linux 'umbrella' heading, as it were. Do you mean that we should remove the Linux umbrella from the Android column?--Harumphy (talk) 18:34, 24 February 2010 (UTC)
Android Linux and Chrome Linux are simply Linux distributions. They should remain in the Linux column. —Preceding unsigned comment added by Nwusr123log (talkcontribs) 04:44, 2 March 2010 (UTC)

Share of smartphone OSs in the main section?

The main section claims iPhone OS to have 0.72% market share, yet where are the other smartphone OSs? Even by our own article, lower down, it shows Symbian and Blackberry way ahead of iPhone. There are various explanations:

  • Some surveys were flawed, and for some reason included the iPhone, but ignored other phones (some mention other phone OSs, but others don't).
  • The survey is based only on web user agents, and Symbian/Blackberry are less likely to be identified by this means.
  • The survey is based only on web user agents, and Symbian/Blackberry are far less likely to access the Internet (or at least, they do so less often).

I'm not sure which, but all three of these suggest that the data is flawed as far as smartphones are concerned, in terms of using it to estimate usage share of operating systems. See [3] for example. We do already note that it is flawed, but this seems a rather major discrepancy, and it seems odd to use web user agents as the primary source, given that we have far more reliable sources for the market share of mobile OSs...

Suggestions? Perhaps it would be better to remove mobile OSs from the main section, since we have far more reliable sources than web data to go with.

Also, where does the source [4] give the 0.52% figure for iPhone? It lists mobile OSs separately, not compared with overall. Mdwh (talk) 15:11, 6 February 2010 (UTC)

Iphone should be removed from the main section. —Preceding unsigned comment added by Nwusr123log (talkcontribs) 04:45, 2 March 2010 (UTC)
The table summarises web client usage share, regardless of whether those web clients are fixed or mobile. The article does have health warnings about using web client stats as a proxy for OS share. Symbian and BlackBerry may account for a larger share of the mobile phone market than iPhone, but do not feature anything like as prominently in the web client stats we cite. (Fewer of the sources mention them at all, and the numbers are lower - medians around 0.2%.) As far as [5] goes, the figures are weighted by the OS v. mobile OS shares, i.e. 1.56% of 33.13% = 0.52%.
Mobile web clients have only really become significant in the last year, with mobile share reaching 1% toward the end of 2009. Until then web clients were essentially just desktops and laptops, with tiny figures (<0.1%) from mobiles and games consoles. But as long as we're summarising web client stats, we have to summarise what those stats actually report. So far the iPhone is the only mobile OS with a share comparable to the lowest fixed OS we list - Windows 2000.
We could include Symbian and Blackberry without making the table much wider. I sandboxed this table a couple of days ago to try out some other ideas. (Please ignore the changed sources - that's a whole separate issue.)--Harumphy (talk) 15:50, 6 February 2010 (UTC)
| Source | Date | Windows 7 | Windows Vista | Windows XP | Windows 2000 | Windows (all versions) | Apple Mac | Apple iPhone | Linux (mainstream) | Linux (Android) | Symbian | RIM BlackBerry | Other[1] | Sources |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Net Applications | Jan. 2010 | 7.51% | 17.39% | 66.31% | 0.58% | 92.02% | 5.13% | 0.47% | 1.02% | 0.06% | 0.24% | 0.03% | 1.06% | [6][7] |
| StatCounter | Jan. 2010 | 8.24% | 20.80% | 62.48% | 0.37% | 92.31% | 5.08% | 0.52% | 0.70% | 0.07% | 0.53% | 0.16% | 0.79% | [8] |
| Web master pro | Feb 03, 2010 | 11.1% | 23.1% | 56.3% | 1.6% | 93.3% | 4.5% | 0.5% | 1.2% | 0.04% | 0.09% | | 0.37% | [9] |
| W3 Counter | Jan. 2010 | 9.11% | 20.80% | 55.08% | 0.47% | 86.56% | 7.87% | 0.72% | 1.63% | 0.10% | | | 3.12% | [10][2] |
| Wikimedia | Nov. 2009 | 3.54% | 25.90%[3] | 57.45%[3] | 0.86% | 88.18% | 6.75% | 1.00% | 1.50% | 0.05% | 0.13% | 0.36% | 2.52% | [11][4] |
| Median | Jan. 2010 | 8.24% | 20.80% | 57.45% | 0.58% | 92.02% | 5.13% | 0.52% | 1.20% | 0.06% | 0.19% | 0.16% | 1.06% | --- |
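
The weighting mentioned in the reply before this table ("1.56% of 33.13% = 0.52%") is a product of two percentages. A minimal Python sketch follows. The function name is hypothetical, and which figure is the mobile split and which is the iPhone's share of mobile is an assumption, since the comment does not say.

```python
def overall_share(share_within_category: float, category_share: float) -> float:
    """Convert a share within a category into a share of all web clients.

    Both arguments and the result are percentages.
    """
    return share_within_category * category_share / 100.0

# The two figures quoted in the reply; their roles are assumed, see above.
iphone_overall = overall_share(33.13, 1.56)
print(f"{iphone_overall:.2f}%")
```

Because multiplication is commutative, the result is the same whichever of the two figures is the split.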
Thanks for updating the table. I still think there's a problem in that the web usage is given too much prominence - a reader glancing at the article would see the iPhone as most popular, even though we know that to be false. That we state that web usage can be misleading isn't really good enough - this is "Usage share of operating systems" not "Web usage by operating systems".
My suggestion would be to move "Mobile devices" section above "Web clients" (but still below "Desktop and laptop computers"). In fact, we might as well move the Netbooks, Servers etc sections above "Web clients" too, as these also seem to be based on more reliable data. I also think that the main picture should be based on something more accurate than web data, for this article. Mdwh (talk) 12:00, 12 March 2010 (UTC)
I disagree. The current order (desktop/web client/netbook/mobile/server/mainframe/supercomp) makes a lot of sense - there's a narrative flow to it. Using web stats as an OS share proxy is problematic, but changing the order of the sections isn't going to fix that. The real problem is the shortage of good published source data.--Harumphy (talk) 17:19, 12 March 2010 (UTC)
I wouldn't call it a natural flow - we start with desktop, then go to web clients for all platforms, then jump back to specific sections such as netbooks and mobile. I imagine that this ordering made more sense when the web usage was only relevant for desktop (i.e., before netbooks and mobiles were included), but this is no longer the case. I don't understand why for desktops, we have the section before the web usage table, but not for netbooks and mobiles. So I'd suggest moving netbooks and mobiles straight after desktops, then have web usage which applies to all three sections - and then after that, we can keep servers etc as current.
We do have accurate source data for the mobile clients - this isn't an excuse for starting off with inappropriate data, claiming iPhones as most popular. I also see my move of the picture to the relevant section was reverted - surely accuracy and avoiding bias are more important than formatting? And can't the formatting be fixed? If not, that's a wiki bug, and shouldn't affect how we present the information. Mdwh (talk) 08:47, 20 March 2010 (UTC)
Web client operating systems data is not inappropriate. While there is no common way to look across all uses of operating systems, web usage is the most common one. iPhone happens to have a larger share of web browsing, so there is no wrong information here either. If you want to, you can put the question of where to place the web client chart to a vote on this discussion page, and whatever the majority of editors decide is what we will do. Wikiolap (talk) 22:22, 20 March 2010 (UTC)
Having the web client section after the desktop section makes sense because the desktop section refers to web client stats and, for the time being, desktops account for about 98% of web clients - so there's a strong correlation between the two. The netbook and mobile sections come next quite logically, because they're still clients not servers, but the stats in these sections use sources other than web client stats. I also agree with keeping the web client pie chart at the top of the article - it may be the 'wrong section' but that's a price worth paying to improve the page layout IMHO. --Harumphy (talk) 08:27, 21 March 2010 (UTC)
I also disagree with "The title of the chart is unambiguous about "web client" part" - the title is "Usage share of web client operating systems" (emphasis mine) which implies it's actual usage share of operating systems that can access the web, rather than how commonly they are used to access websites. ("web client operating system" is a rather vague statement anyway - I think this should be made clearer, e.g., "usage share of web clients by operating systems".) Mdwh (talk) 08:54, 20 March 2010 (UTC)
I'm not sure how to edit the picture text, as it's in some kind of template? "Usage share of web client operating systems." isn't correct, it should be "Web usage of operating systems" or perhaps explicitly stating that the data is based on web access, which gives inaccurate results for the mobile OS share, or something, as it's the share of accessing the web, not of operating systems (now that it's moved to the correct section, this should be clearer anyway, but it's better to be accurate throughout). Mdwh (talk) 12:05, 12 March 2010 (UTC)

rationalising notes

At the moment the web client summary table has some notes in the paragraph directly underneath and some amongst the references at the bottom of the page. I propose to put all the notes in a concise, bulleted list within the table, like this. Comments?--Harumphy (talk) 17:08, 7 February 2010 (UTC)

| Source | Date | Windows 7 | Windows Vista | Windows XP | Windows 2000 | Windows (all versions) | Apple Mac | Apple iPhone | Linux | Other | Sources |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AT Internet Institute | Aug. 2009 | 0.80% | 29.13% | 59.83% | | 92.80% | 4.93% | 0.75% | 1.00% | 0.52% | [12] |
| Net Applications | Jan. 2010 | 7.51% | 17.39% | 66.31% | 0.58% | 92.02% | 5.13% | 0.47% | 1.08% | 1.30% | [13][14] |
| StatCounter | Jan. 2010 | 8.24% | 20.80% | 62.48% | 0.37% | 92.31% | 5.08% | 0.52% | 0.76% | 1.33% | [15] |
| StatOwl | Jan. 2010 | 7.89% | 25.69% | 53.83% | 0.61% | 88.02% | 11.48% | | 0.39% | 0.11% | [16] |
| W3 Counter | Jan. 2010 | 9.11% | 20.80% | 55.08% | 0.47% | 86.56% | 7.87% | 0.72% | 1.73% | 3.12% | [17] |
| Wikimedia | Nov. 2009 | 3.54% | 25.90% | 57.45% | 0.86% | 88.18% | 6.75% | 1.00% | 1.55% | 2.52% | [18] |
| Median | Jan. 2010 | 7.70% | 23.25% | 58.64% | 0.58% | 90.10% | 5.94% | 0.72% | 1.04% | 1.32% | --- |

Notes:

  • The 'Other' column is obtained by adding the columns from Windows 'all versions' through Linux and subtracting the sum from 100%.
  • AT Internet Institute measures France, Germany, Spain and the UK. Table shows average of four countries' figures.
  • The StatCounter figure combines data from the source's Operating System, Mobile OS and Mobile v. Desktop tables.
  • StatOwl measures predominantly US web sites with "broad appeal".[19] Stat for XP includes Server 2003. Excludes mobile usage.
  • W3Counter shows only the top ten operating systems and is based on the last 15,000 page views to each of 30,961 web sites tracked.
  • Wikimedia uses 1:1000 sampling of its logs when deriving the usage numbers. Stat for Vista includes Server 2008; XP includes Server 2003.
  • Mac OS X is broken down by three sources and they show that version 10.5 (Leopard) is used by nearly half of Mac OS X users.
  • Wikimedia and StatOwl indicate that about half of Linux users use Ubuntu.
  • Four sources show shares for Android and these are included in the Linux column.
Looks good to me. Wikiolap (talk) 21:27, 7 February 2010 (UTC)
OK, done it.--Harumphy (talk) 17:34, 8 February 2010 (UTC)
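
The first note (how the 'Other' column is derived) can be sketched in a few lines of Python. The function name is hypothetical. The inputs are the Net Applications and W3 Counter rows from the table above.

```python
def other_share(windows_all, mac, iphone, linux):
    """Remainder after the named columns, in percent, rounded to 2 places."""
    return round(100.0 - (windows_all + mac + iphone + linux), 2)

# Net Applications, Jan. 2010
net_apps_other = other_share(92.02, 5.13, 0.47, 1.08)
# W3 Counter, Jan. 2010
w3_other = other_share(86.56, 7.87, 0.72, 1.73)

print(net_apps_other, w3_other)
```

Both results agree with the 'Other' figures shown for those rows (1.30% and 3.12%).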

more web client table tweaks

I will make these changes in a couple of days if nobody objects:

  1. add webmasterpro.de as a source
  2. make minor tweaks to reduce column widths
  3. split the Linux column into 'mainstream' and 'Android' columns
  4. add a column for Symbian
  5. add a column RIM BlackBerry --Harumphy (talk) 17:38, 8 February 2010 (UTC)
From what I saw on webmasterpro.de, it is a Germany-only source, and therefore I believe it is too narrow to include in the general article. If it has numbers for the whole of Europe, then I would agree to include it. Wikiolap (talk) 19:55, 8 February 2010 (UTC)
It covers German-language sites, not just those in Germany. Please see German language in Europe.--Harumphy (talk) 23:02, 8 February 2010 (UTC)
I agree with Wikiolap, Webmasterpro.de is not a good source. It does not cover an extensive enough area. I'm in complete favor of removing it.Cityscape4 (talk) 17:18, 24 February 2010 (UTC)
It covers over 100,000 sites in what is the most widely-used language in Europe and the primary language in Germany, Austria and Switzerland and spoken by the majority of the people in several other European countries. So its area is probably at least as extensive as some of the US-centric stats, which it also helps to balance. If we're going to get rid of sources because their coverage is too small, let's start with StatOwl which is 92% US.--Harumphy (talk) 18:28, 24 February 2010 (UTC)
Okay, I'm all in favor of removing StatOwl. I don't think we should include sources that cover specific areas (example: USA). If we have a source for mainly the US (or mainly German-speaking European countries) then we really should have stats from China, UK, Brazil and many others to balance them. Let's remove StatOwl and maybe Webmasterpro too. Cityscape4 (talk) 20:36, 24 February 2010 (UTC)
Many of the sources have a geographical and/or linguistic bias - the latter maybe unwittingly by only communicating with webmasters in English. The real difficulty is that these are the best sources we've found to date. It would be great to have more data from other areas/languages.--Harumphy (talk) 08:46, 25 February 2010 (UTC)
I'm going to agree with Harumphy. If someone can find better sources then good. But losing sources is not good unless there's a good enough reason. Jdm64 (talk) 19:24, 25 February 2010 (UTC)
The only objection I have is that I think adding Symbian and BlackBerry to the table makes it too large and cumbersome, especially for the pie graph, which I still see as valuable. Maybe the chart could only have OSs that are >1%. Jdm64 (talk) 23:45, 8 February 2010 (UTC)
Fully agree. If the chart is as large as it is now, it is harder for people to read and they are less likely to do so. If everyone agrees, I will remove Symbian, BlackBerry and Android from the chart. They all have less than 0.30%; I think we should only keep OSs with over 0.5%. Cityscape4 (talk) 17:18, 24 February 2010 (UTC)
The chart is fine. When I added the mobile columns I made several other tweaks to reduce existing column widths and ISTM the table still probably fits in most users' screen widths without horizontal scrolling. Certainly it does so on my laptop which is nothing special (1280x800 px screen). --Harumphy (talk) 18:28, 24 February 2010 (UTC)

Wikimedia

Where does Wikimedia get its information from? Cityscape4 (talk) 05:03, 28 February 2010 (UTC)

The links on the Wikimedia stats page tell you, do they not? --Harumphy (talk) 09:43, 28 February 2010 (UTC)

Web Client stat history tables

What happened to those tables that listed all the historical stat data from the various firms? Did it move somewhere? It was similar to what was in this article below, but for the major operating systems. http://en.wikipedia.org/wiki/Usage_share_of_web_browsers —Preceding unsigned comment added by Rasmasyean (talkcontribs) 22:56, 8 March 2010 (UTC)

We did a big rewrite a while back and got rid of the tables of historical web client data. Discussion archived here: Talk:Usage_share_of_operating_systems/Archive_1#Complete_rewrite --Harumphy (talk) 23:45, 8 March 2010 (UTC)

Mobile devices section

This section seems to contain some content about the various mobile OSs that isn't very relevant to usage share and therefore belongs somewhere other than in this article. Could those who have edited this section please take a look and prune stuff that isn't about usage share, sources of usage share data and usage share trends. That's more or less the yardstick for the rest of the article and I think it should apply here too. I'm reluctant to wade in myself because I don't know much about it.--Harumphy (talk) 12:13, 10 March 2010 (UTC)

I will do the requested pruning. Wikiolap (talk) 15:58, 10 March 2010 (UTC)
That's much better. Thanks. --Harumphy (talk) 17:51, 10 March 2010 (UTC)

Pie charts

We have two pie charts that duplicate, badly, information that's already in tables. The web client one is usually out of date, and the mobile one just reflects one of the five sources we have in the table. They have very different graphic styles. They don't fit well on the page - there's loads of white space at the side. I'd like to nuke both of them. What does anyone else think?--Harumphy (talk) 17:32, 12 March 2010 (UTC)

Well, I'm for keeping them.
  • Being a little outdated is ok. And really since the web client chart has the percentages separate from the image, as long as it's in the same month only the template page needs updating. The percentages don't change enough from month to month to really make a difference in the graph.
  • The mobile chart could be changed to the same style as the web client chart.
  • The images add "depth" to the page. Instead of just "dry" statistical tables, there's a visual representation of the data. Why are there images on any wikipedia page? We could describe everything in words. Having an image, I think, improves the look and feel of the page. The page is supposed to be for the reader's benefit. I think we sometimes lose sight of that. Jdm64 (talk) 20:44, 12 March 2010 (UTC)
Agree with all the points of Jdm64. Pie charts are a case of "one picture is worth a thousand words". I wouldn't mind adding more charts, like the ones which show trends over time - something that is currently lacking from the article. Wikiolap (talk) 07:45, 13 March 2010 (UTC)
In the table of the Supercomputers section the text "Windows HPC Server 2008" really sticks out (whilst "Windows HPC Server 2008" accounts only for 1%). This is misleading, it would help to add a pie chart like this one (please feel free to change) --HeWhoMowedTheLawn (talk) 19:30, 13 April 2010 (UTC)
 
[Pie chart: TOP500 supercomputers, operating system family share for 11/2009]
I've added some photos to make the page look and feel less dry. It's really just glorified clip art. The images don't need captions. Hope you like it!--Harumphy (talk) 14:33, 11 June 2010 (UTC)
Surprisingly, that does do a lot to the look of the page. It actually feels like I'm reading a print version of WorldBook or Britannica. Most of the articles on Wikipedia are always dry text -- and the rare pictures are always captioned. I wonder why other pages haven't added "clip art" to the article. It does seem to liven up the page. Jdm64 (talk) 20:09, 11 June 2010 (UTC)

Clicky web stats

Here's another source of stats: [20]. It appears to update daily so we could average them to get figures for a month. --Harumphy (talk) 15:09, 20 April 2010 (UTC)

No objections so far. Good! I intend to add this, averaging the last seven days of the month to make the job a bit easier, and using the Statcounter figures for desktop/mobile split, which seems to be pretty much in the middle of the desktop/mobile splits for the four sources from which such a thing can be derived. Naturally the method would be explained in a footnote. --Harumphy (talk) 11:26, 30 June 2010 (UTC)
[comment moved from section below] I have no objections, although do they have a mobile to desktop correlation so we can relate ~60% iPhone to what the global share is? Jdm64 (talk) 08:58, 1 July 2010 (UTC)
They don't, which is not ideal. We can work around it by using statcounter's desktop/mobile split, which seems to be pretty much in the middle of the range of desktop/mobile splits for the four sources from which such a thing can be derived. We will of course need to explain the method in a footnote.--Harumphy (talk) 14:14, 1 July 2010 (UTC)
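
A rough Python sketch of the method described in this section: average the last seven daily figures, then scale by a desktop/mobile split borrowed from another source. All the numbers below are hypothetical placeholders, loosely based on the "~60% iPhone" remark above, and the function name is made up for illustration.

```python
from statistics import mean

def monthly_overall_share(daily_shares, category_split):
    """Average daily within-category shares, then scale by the category split.

    daily_shares: the OS's share of its category (percent), one value per day
    category_split: that category's share of all web clients (percent)
    """
    return mean(daily_shares) * category_split / 100.0

# Hypothetical figures, for illustration only.
last_seven_days = [60.1, 59.8, 60.4, 59.9, 60.2, 60.0, 59.6]  # iPhone % of mobile
mobile_split = 2.5                                            # mobile % of all clients

iphone_share = monthly_overall_share(last_seven_days, mobile_split)
print(f"{iphone_share:.2f}%")
```

The borrowed split is the weak point of the method: the result is only as good as the assumption that both sources see a similar desktop/mobile mix.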

Windows 2000

The "extended support retired" date for Windows 2000 is 13th July 2010. I suggest we remove its column from the web client OS table on that date, while still including it along with other obsolete Windows versions in the 'all versions' column. --Harumphy (talk) 11:27, 8 June 2010 (UTC)

I'll agree with that. Jdm64 (talk) 19:04, 8 June 2010 (UTC)
I disagree, we should keep Windows 2000 for a while (say 6months-1year) longer. There are tons of small businesses and stuff that use it. It is still a commonly used version of windows and most programs are still being made for "Windows 2000 or higher". Cityscape4 (talk) —Preceding undated comment added 19:25, 5 July 2010 (UTC).

Playstation

Four of the seven sources we use list OS share for the Playstation - median share is 0.05%. I suggest we add a column to the web client table. None of the other odds'n'sods is currently listed by more than two sources. Three sources are splitting out iOS into iPhone/iPod/iPad, but as we're interested in OS share, not hardware share, I suggest we leave it as a single column.--Harumphy (talk) 11:11, 15 June 2010 (UTC)

Not this again! The Playstation is a device, not an OS. All the other listed OSs can be and are installed on several different devices. The OS on the Playstation, if you can even call it that, is inseparable from the device. The article is "Usage share of operating systems" not "Usage share of devices". To be on the list the OS must be an identifiable operating system distinguishable from the device that it's installed on. The operating systems in current game consoles are more like an advanced BIOS/UEFI than an OS. Jdm64 (talk) 20:35, 15 June 2010 (UTC)
Fair enough. I've tweaked a couple of mobile column headings slightly in the light of this, so I don't get myself all confused again. ;-) --Harumphy (talk) 10:11, 17 June 2010 (UTC)

Palm WebOS

Palm WebOS has a tiny share, but it is an OS and it's listed by four of the seven sources currently used (and getclicky.com). I suggest we add a column for it, under the Linux umbrella heading. --Harumphy (talk) 11:54, 30 June 2010 (UTC)

We already have Palm WebOS in the mobile section. Or did you mean to include it in the Web Client section ? Wikiolap (talk) 15:08, 30 June 2010 (UTC)
Unless I'm missing something, I don't see it listed in the sources. You're talking about the web client section? Can you create a test table (with just the WebOS data) here with links back to the pages showing the percentages? I'm not exactly objecting to the inclusion, but I don't think the market share is notable yet. Jdm64 (talk) 16:38, 30 June 2010 (UTC)
Yes I meant the web client section. It's listed in four sources, although one of them is 0.00%. May figures. Same references as already listed on the table:
  • Net Apps: Palm 0.00%
  • Statcounter (not on flash chart - see CSV spreadsheet): WebOS 0.52% of mobile, or 0.012% of all web clients by our method
  • Webmasterpro: shows Palm WebOS at 0.01%
  • Wikimedia: shows WebOS 1.4 at 0.02%
Thus the median would be 0.01%. Not much, granted, but some time ago I suggested a 0.1% threshold and somebody complained that was arbitrary! --Harumphy (talk) 18:02, 30 June 2010 (UTC)
It's not clear whether Net Apps' Palm is WebOS or the older PalmPDA OS. Also, if the source lists the percentage at 0.00%, then I say that's not notable. Similarly, Statcounter deemed it not significant by only having it in the CSV file. Anyway, that would still make it the smallest percentage. Jdm64 (talk) 19:23, 30 June 2010 (UTC)

Criteria For Noteworthy Web Client OS

We need a concrete basis for adding OSs to the web client table.

  1. Must be reported by at least floor(n/2+1) sources (i.e. with 7 sources, 4 must report it).
  2. Of the sources that do list the OS, at least ceil(n/2-1) must show a percentage of at least 0.10% (i.e. with 5 sources, 2 must show at least 0.10%).
  3. For a source to count as listing the OS, it must show a percentage of at least 0.01%.
  4. The OS must be currently supported by the developer. If support, including extended support, has expired, it can still be listed if it is still in widespread use (i.e. Win2k would be removed in July/August, WinXP might be removed in 2014).

All currently listed OSs qualify under these regulations. WebOS would not qualify. Jdm64 (talk) 19:22, 30 June 2010 (UTC)
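
The four criteria above can be written as a small Python check. This is only a sketch: the function and parameter names are hypothetical, and taking n in rule 2 to mean the number of sources that list the OS is one reading of the proposal, not something it states.

```python
import math

def qualifies(reported, total_sources, currently_supported=True,
              widespread=False):
    """Apply the four proposed criteria to one OS.

    reported: percentages from the sources that mention the OS at all
    total_sources: number of sources used in the table
    """
    # Rule 3: a source only counts as listing the OS at 0.01% or more.
    listing = [p for p in reported if p >= 0.01]
    # Rule 1: at least floor(n/2 + 1) of all sources must list it.
    if len(listing) < math.floor(total_sources / 2 + 1):
        return False
    # Rule 2: at least ceil(n/2 - 1) of the listing sources show >= 0.10%.
    # Treating n as the number of listing sources is an assumption.
    needed = math.ceil(len(listing) / 2 - 1)
    if sum(p >= 0.10 for p in listing) < needed:
        return False
    # Rule 4: supported by the developer, or expired but in widespread use.
    return currently_supported or widespread

# WebOS figures quoted in the Palm WebOS section (7 sources in the table):
# Net Apps 0.00%, Statcounter 0.012%, Webmasterpro 0.01%, Wikimedia 0.02%.
webos = qualifies([0.00, 0.012, 0.01, 0.02], total_sources=7)
print(webos)
```

With those figures only three sources count as listing WebOS, one short of the four required by rule 1, which matches the conclusion that WebOS would not qualify.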

I agree we ought to settle this, but I would prefer to make the criteria simpler.
  1. Must be reported by at least half the sources (i.e. 6 sources means 3 must report it)
  2. Of the sources that do list the OS, the median value should be at least 0.05%
  3. If an OS is included in the table, a source showing it as 0.00% is significant - this should be shown and included in the median
  4. As above
My reasoning for #1 is to have a majority which is over 50%, so half+1. I think the median would lose "credibility" if half the sources don't report the OS as noteworthy. For #2 the percentage is arbitrary, but I chose 0.1% as Webmasterpro and AT Internet Institute are at least somewhat rounding to only 1 decimal place. It seemed to be the smallest level of accuracy found in all sources. I still disagree with #3: each source is different and they choose what they feel is the level of noteworthiness/accuracy. I'm sure the sources that don't list an OS still see them in their logs, but throw the data into the "other" category. Hence 0% gives no meaningful data to work with. Jdm64 (talk) 08:58, 1 July 2010 (UTC)

(BTW have you read my comments higher up the page about Clicky web stats?) --Harumphy (talk) 07:39, 1 July 2010 (UTC)

No, I didn't notice that comment. I have no objections, although do they have a mobile to desktop correlation so we can relate ~60% iPhone to what the global share is? Jdm64 (talk) 08:58, 1 July 2010 (UTC) [moved to section above]

Pie chart - other

Shouldn't the pie chart include versions of Windows, other than the three shown in the chart, under "other"? --Harumphy (talk) 10:25, 2 July 2010 (UTC)

Ok, sounds fair. The current percentage for "Other Windows" is 1.34%. I'll add it to the June chart. Jdm64 (talk) 20:01, 2 July 2010 (UTC)
That sounds too low. Windows 7+Vista+XP have medians totalling 85.05%, all versions median is 89.53%, so surely "Other Windows" is 4.48%? (These figures changed a bit because I've just done the StatOwl update, but not that much.) --Harumphy (talk) 16:21, 3 July 2010 (UTC)
You're calculating it wrong. You calculated the sum of the medians. I calculated the median of the sums. If that makes any sense. It's best not to work with medians until the end. So, for each source calculate the sum of XP+Vista+7 then find the median -- not the other way around. The value is currently at 1.29%. Jdm64 (talk) 05:58, 4 July 2010 (UTC)
That makes perfect sense. Thank you. --Harumphy (talk) 08:01, 4 July 2010 (UTC)
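
The difference between "the sum of the medians" and "the median of the sums" can be seen in a short Python sketch. The June figures discussed in this thread are not reproduced on this page, so the sketch uses the January 2010 rows from the "rationalising notes" table instead; the numbers it produces are therefore not the 4.48% or 1.29% quoted above.

```python
from statistics import median

# (Windows 7, Vista, XP) in percent, January 2010 table on this page.
sources = {
    "AT Internet Institute": (0.80, 29.13, 59.83),
    "Net Applications":      (7.51, 17.39, 66.31),
    "StatCounter":           (8.24, 20.80, 62.48),
    "StatOwl":               (7.89, 25.69, 53.83),
    "W3 Counter":            (9.11, 20.80, 55.08),
    "Wikimedia":             (3.54, 25.90, 57.45),
}

# Sum of the medians: take each column's median first, then add them up.
sum_of_medians = sum(median(col) for col in zip(*sources.values()))

# Median of the sums: add up each source's row first, then take the median.
median_of_sums = median(sum(row) for row in sources.values())

print(f"sum of medians: {sum_of_medians:.2f}%")
print(f"median of sums: {median_of_sums:.2f}%")
```

The two results differ by about one percentage point for these rows, so any "Other Windows" figure derived from them depends on which order of operations is used.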

StatOwl

StatOwl currently lists just the main Desktop OSs - Windows, Mac OS and Linux and gives them 99.85% in total. It excludes anything below 0.1% and all mobile stuff. I suggest we reduce the figures in the table in accordance with the StatCounter desktop/mobile split so that we're comparing like with like - and not desktop share from one source with all client OS share from others, which is what we're doing now. It will need explaining in a footnote. --Harumphy (talk) 08:25, 4 July 2010 (UTC)

I'm not exactly sure what you mean. Can you explain further -- or demo the table. Jdm64 (talk) 20:18, 4 July 2010 (UTC)
The table shows usage share for web client OSes, regardless of whether they are desktop or mobile OSes. Most of the sources provide the data in this form, but some of them (StatCounter and Clicky) provide two separate tables. StatCounter provides a desktop/mobile split table and we use this as a fiddle factor (joke!) to combine the separate desktop and mobile data into our table. StatOwl only does desktop share, so I'm suggesting we do the same with its figures as we're already doing for StatCounter and Clicky's desktop shares. The effect would be that StatOwl's figures would be reduced by a factor of 0.9743 and the difference would be added to the 'other' column. It would be mentioned in a footnote. Hope this makes sense. --Harumphy (talk) 22:30, 4 July 2010 (UTC)
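
A minimal Python sketch of the adjustment being proposed. The function name is hypothetical. The row is StatOwl's January 2010 entry from earlier on this page, paired with the 0.9743 factor quoted above purely for illustration, since the two come from different months.

```python
def rescale_desktop_only(shares, desktop_fraction):
    """Scale desktop-only shares to 'all web clients' and top up 'Other'.

    shares: dict of column name -> percent, must include an 'Other' key
    desktop_fraction: desktop share of all web clients, e.g. 0.9743
    """
    scaled = {name: value * desktop_fraction for name, value in shares.items()}
    # Whatever was scaled away is attributed to 'Other'.
    scaled["Other"] += 100.0 - sum(scaled.values())
    return scaled

# Illustrative pairing only: January row, July factor.
statowl = {"Windows": 88.02, "Mac": 11.48, "Linux": 0.39, "Other": 0.11}
adjusted = rescale_desktop_only(statowl, 0.9743)

for name, value in adjusted.items():
    print(f"{name}: {value:.2f}%")
```

The adjusted row still totals 100%, with the share removed from the desktop columns reappearing under 'Other'.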
I am against changing data from the sources. We can provide all the footnotes in the world, but we should report exactly same numbers as our sources reported, otherwise it is WP:OR. Wikiolap (talk) 23:09, 4 July 2010 (UTC)
I'm also against messing with the source data in this way. Furthermore, it seems like StatOwl is collecting data about other/mobile operating systems, but is just not reporting it. It should be included in the 0.15% of misc. Likewise, since Android is Linux and iOS is akin to Mac OS, the data for those OSs might be included in those main categories but just not listed. Because it's not entirely clear how they are doing their statistics, we should not be "fiddling" with the numbers. Jdm64 (talk) 23:25, 4 July 2010 (UTC)
OK, several points in reply to both of you:
  1. While it's fine to question the approach, I don't think it can be considered OR. We've had this discussion before.
  2. The other sources are showing between about 1.5% and 3.5% mobile share. StatOwl is reporting Windows, Mac and Linux totalling 99.85%. From the 'drill-down' reports it's clear that the 99.85% doesn't include mobiles. The remaining 0.15% consists of 'other' at 0.01% (presumably SunOS, Solaris, FreeBSD etc.) and 'undetected' at 0.14%. So it's certain that StatOwl is reporting desktop share, not web client share.
  3. When mobile share was less than 1%, the discrepancy could be overlooked. But mobile share is increasing by about 0.25% a month, according to StatCounter. It appears to be heading towards about 3% in August and 4% at the end of the year.
  4. Five of the sources report web client share without discriminating between desktop and mobile. Good - that's just what we need.
  5. StatCounter does desktop and mobile shares in separate tables, but provides a third table showing the split between the two categories. Multiplying the figures in the first two tables by the figures in the third enables us to get what we need with precision.
  6. Clicky, like StatCounter, provides two separate tables for desktop and mobile, but doesn't provide a figure for the split between the two categories. So we've borrowed StatCounter's figure for this purpose and explained the method in a footnote. I recommended doing this in the Clicky section above, and nobody objected.
  7. StatOwl presents a similar problem to Clicky, except that it only publishes a desktop table, not a mobile one. So it seems to me that, logically, what nobody objected to in relation to Clicky should be equally unobjectionable in relation to StatOwl.
  8. If applying StatCounter's desktop/mobile split figure to the other sources is unacceptable, then surely the only way we can resolve that is to drop Clicky and StatOwl from the table entirely. It is not OK to continue to conflate 'desktop share' with 'web client share' now that mobile share has become such a significant part of the overall picture.
--Harumphy (talk) 06:51, 5 July 2010 (UTC)
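The rescaling described in points 5 to 7 above can be sketched in a few lines of Python. This is only an illustration, not the method actually used for the article's table: the function name `rescale_desktop_shares` and the per-OS breakdown are hypothetical, made up for the example. Only the 0.9743 desktop factor and the 99.85% / 0.15% totals come from the discussion.

```python
def rescale_desktop_shares(desktop_shares, desktop_fraction, other_key="other"):
    """Convert desktop-only shares (in %) into approximate web client shares.

    Every figure is multiplied by the desktop fraction of all web clients,
    and the share removed that way is added to the 'other' column.
    """
    scaled = {name: share * desktop_fraction for name, share in desktop_shares.items()}
    removed = sum(desktop_shares.values()) - sum(scaled.values())
    scaled[other_key] = scaled.get(other_key, 0.0) + removed
    return scaled


# Hypothetical desktop-only breakdown: Windows + Mac + Linux = 99.85%,
# 'other' + 'undetected' = 0.15%, as in the totals quoted above.
statowl_desktop = {"Windows": 90.00, "Mac": 8.00, "Linux": 1.85, "other": 0.15}

# 0.9743 is the desktop factor mentioned earlier in this thread.
web_client = rescale_desktop_shares(statowl_desktop, 0.9743)

for name, share in web_client.items():
    print(f"{name}: {share:.2f}%")
```

The rows still sum to 100% after rescaling, which is the point of moving the difference into 'other'. Whether borrowing the factor from a different source is acceptable is exactly what is being debated here; the sketch takes no position on that.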
  1. It's because the data for mobile simply does not exist. In the other cases where we take the average of the countries (ie. ATII) or other such cases, the data existed within the source. We are just extrapolating data only from information coming from the source. You are proposing to take the desktop/mobile split from other sources and apply it to StatOwl.
  2. The sources are using the user agent strings. For Android it's Mozilla/5.0 (Linux; U; Android 0.5; en-us) AppleWebKit/522+ (KHTML, like Gecko) Safari/419.3. This clearly puts Android in the Linux section, and StatOwl gives no clear indication either way what they did. Similarly, iOS is reported as Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/420+ (KHTML, like Gecko) Version/3.0 Mobile/1A543a Safari/419.3, which contains the string "Mac OS X". Hence, we must only go with what the source reports, not what we wish it would report.
  3. It doesn't matter the size of mobile. We can only use what the source gives us. This is an encyclopedia. We only give the reader a compilation of information, not an interpretation of what we think the sources should be reporting.
  4. So?
  5. This is allowable because the source gives us the split. It's simple percentages to figure out the relative percentages. But, StatOwl doesn't give this information, and we should not use one source to estimate another source.
  6. Actually, I was unaware that's what was happening. If you're using StatCounter's data to calculate Clicky's data then that should be stopped.
  7. See previous points.
  8. Clicky and StatOwl don't necessarily need to be dropped. But we should stop using StatCounter's split to calculate our own numbers. We should only report what the source reports. If they give no method for calculating the desktop/mobile split or don't even give mobile, then we should just leave those sections blank. Jdm64 (talk) 20:46, 5 July 2010 (UTC)
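To illustrate the user agent point (item 2 above), here is a small sketch of how the classification can go either way. Both classifier functions are hypothetical and are not a description of what StatOwl or any other source actually does; only the two user agent strings are taken from the comment.

```python
ANDROID_UA = ("Mozilla/5.0 (Linux; U; Android 0.5; en-us) "
              "AppleWebKit/522+ (KHTML, like Gecko) Safari/419.3")
IPHONE_UA = ("Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) "
             "AppleWebKit/420+ (KHTML, like Gecko) Version/3.0 "
             "Mobile/1A543a Safari/419.3")


def naive_os_family(user_agent: str) -> str:
    """Hypothetical classifier that only looks for desktop OS names."""
    ua = user_agent.lower()
    if "windows" in ua:
        return "Windows"
    if "mac os x" in ua:
        return "Mac"
    if "linux" in ua:
        return "Linux"
    return "other"


def mobile_aware_os_family(user_agent: str) -> str:
    """Hypothetical classifier that checks for mobile platforms first."""
    ua = user_agent.lower()
    if "android" in ua:
        return "Android"
    if "iphone" in ua:
        return "iOS"
    return naive_os_family(user_agent)


for ua in (ANDROID_UA, IPHONE_UA):
    print(naive_os_family(ua), "vs", mobile_aware_os_family(ua))
```

With the naive matcher the Android string lands under Linux and the iPhone string under Mac; with the mobile-aware one they are counted separately. Without a statement from the source there is no way to tell from the outside which approach was used.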
2. StatOwl says "We also currently exclude all mobile data while we work on completing our mobile reporting section."[21] Sorry I didn't make this clear earlier.
8. I don't understand how we could keep Clicky and StatOwl in the chart in this way. If we don't have a way of converting their 'desktop only' and/or 'mobile only' figures to 'all web client' figures, as used by the other six sources, the table will be comparing apples to oranges which would be statistical nonsense. Could you please explain what you have in mind further.--Harumphy (talk) 21:33, 5 July 2010 (UTC)
Ok, if they say they exclude mobile on their site, then I see your point. You should've made that point sooner. But what makes StatCounter's source more valid than any other source? But, picking any one source introduces that source's bias into the other one. And we couldn't use the median either because that would be recursive. So, I guess I'd just remove both Clicky and StatOwl until they provide the desktop/mobile split. Somebody could email them to see if they would provide the data. Anybody else have any ideas on the subject, Wikiolap -- you there? Two person consensus isn't really a consensus. Jdm64 (talk) 00:34, 6 July 2010 (UTC)

[section break]
I looked at the four sources from which a split could be derived, and the mobile share came in at about 1.5% (Webmasterpro), 2.0% (Net Apps), 2.5% (StatCounter), 3.5% (Wikimedia). StatCounter had the advantage of being (a) explicitly published as a split figure, rather than something we'd have to deduce from the list of OS shares, (b) is in the middle of the range of the others, (c) a credible, global source, (d) published regularly on the first of the month. Net Applications fails (a) but passes (b), (c) and (d), so I suppose we could do a median/mean of Net Apps and StatCounter.

StatOwl's comment, which I belatedly posted above, implies they're working on mobile stats. I have already emailed Clicky and got a helpful reply this morning saying they're already planning some updates and the info we need will probably be taken care of in that.

It looks as though we're approaching a consensus, albeit between only two of us, and I too would welcome more voices in this discussion. The key decision we need to make is whether it's better to (a) weight source figures in accordance with data from other sources in order to compare apples with apples, or (b) not use those two sources at all, at least until they make their planned improvements. --Harumphy (talk) 08:00, 6 July 2010 (UTC)

I support Jdm64 proposal to either remove StatOwl and Clicky until they provide a split or to leave things the way they are. I am still very much against changing the numbers. So given the choice above of (a) and (b) - I will vote for (b). Wikiolap (talk) 18:32, 6 July 2010 (UTC)
I'd lean more towards leaving things how they are -- even if the stats for Clicky are currently using the StatCounter split. The main reason for this is that the median would flux by as much as 2.83% (WinXP) or 1.64% (all Win) or 0.96% (Win7). I guess that's not a large fluctuation and might be within normal monthly changes. But if we do leave things how they are we need to make it even more clear what we're doing. Placing an '*' on all Clicky's mobile percentages or something additional to the note at the bottom of the table. Jdm64 (talk) 20:22, 6 July 2010 (UTC)
I am fine with that. Wikiolap (talk) 20:33, 6 July 2010 (UTC)
If we leave Clicky as it is, using StatCounter's split but maybe making it even more clear, what do we do with StatOwl? Shouldn't we be consistent in how we treat these two sources? --Harumphy (talk) 21:24, 6 July 2010 (UTC)
StatOwl data does not seem to improve with respect to mobile stats. Their data sticks out in various aspects but additionally not fitting into the scheme does not help to improve wikipedia consistency. So it might make sense to grey out the line with StatOwl data and while still including them in the table they'd be excluded from further calculation. Would give StatOwl an incentive to react. 95.117.246.16 (talk) 20:39, 5 October 2010 (UTC)

Updates needed

Smartphone has 2010 Q1 figures for smartphone OS share whilst this page only has 2009 Q2 figures. Can someone update?

Some of the numbers here don't seem to add up to me if iphone iOS is 0.9% of the overall OS share and 13% of the smartphone share wouldn't that make smartphone OSs above 10% of the overall OS share? Is the source reliable?

Update of Pie Chart needed. According to the Info under the chart, it's from June 2009. —Preceding unsigned comment added by 84.138.164.207 (talk) 16:03, 19 July 2010 (UTC)

Too many pictures

I think this page definitely has too many pictures. The charts are useful, but I see no benefit in the pictures which show laptops and servers. --one-eyed pirate 18:19, 2 August 2010 (UTC)

Historical data

I think it would be good to see the evolution of these numbers just like in the article Usage share of web browsers. That way the usage share would appear month after month and it would be possible to see the trend. --Marceloml (talk) 15:39, 4 August 2010 (UTC)

I agree with that, a graph with usage as a function of time would be highly appreciated. 87.72.122.147 (talk) 20:02, 7 October 2010 (UTC)
The only sources of data we have for that are the web client stats. The article already gives them as much prominence as they deserve, IMHO.--Harumphy (talk) 06:59, 8 October 2010 (UTC)

Usage share of mobile operating systems

Shouldn't "usage share of mobile OS" section be made into its own article? it's becoming an ever increasingly important topic, just as important as the usage share of web browsers is.--Mark0528 (talk) 05:16, 24 August 2010 (UTC)

Generally speaking, the idea of WP:SPINOUT is that sections get split out of articles as new articles when the parent article is too big, like 60 - 100 KB. This article is currently 20,773 B so there is no need for this on that basis alone. --Nigelj (talk) 10:11, 24 August 2010 (UTC)

Changes to web client table 2nd Sept

Jdm64 - a few comments on your changes:

  • AT and Webmasterpro specify (most of) their stats to one decimal place. If we add trailing zeroes it implies an order of magnitude greater precision in the stat than the source actually contains.
  • StatOwl August figures are now out, FYI.
  • Webmasterpro - did you look at the figures today (2nd)? If so you got the last six days in August and the 1st day in September. It's a rolling 7-day thing that changes daily - it's important to use the figures published on the first of the month, which I did.
  • Wikimedia - you've counted Android twice - under both Linux and Android. You need to subtract the Android figure from the Linux total.

I won't have time to fix these until next week, so I'd be grateful if you could take another look at this. TIA. --Harumphy (talk) 21:38, 2 September 2010 (UTC)

Ok, fixed most of it. I have the extra '0' on AT/Webmaster for alignment purposes -- makes it easier to read. I haven't updated/checked Clicky. I think I fixed the rest. Jdm64 (talk) 04:24, 5 September 2010 (UTC)
Thanks for the fixes. I understand what you're trying to achieve with the trailing zeroes but I think it is wrong, mathematically - it implies the tolerance is ±0.005% instead of ±0.05% as per the source data. I've reverted the StatCounter iOS figure, which they have split between the desktop and mobile tables - although you have to look at the CSV spreadsheet to spot it.--Harumphy (talk) 13:33, 7 September 2010 (UTC)
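The tolerance argument can be checked mechanically. The sketch below uses Python's `decimal` module to work out the rounding tolerance implied by the way a figure is written; the helper name `implied_tolerance` and the sample figures are hypothetical and only serve the illustration.

```python
from decimal import Decimal


def implied_tolerance(figure: str) -> Decimal:
    """Return half a unit in the last written decimal place of `figure`."""
    exponent = Decimal(figure).as_tuple().exponent  # -1 for "1.5", -2 for "1.50"
    return Decimal(5).scaleb(exponent - 1)


print(implied_tolerance("1.5"))   # figure as published to one decimal place
print(implied_tolerance("1.50"))  # same figure with a trailing zero added
```

A figure written as 1.5 implies ±0.05, while 1.50 implies ±0.005, so the trailing zero claims ten times more precision than the source gives.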

Add W3 Schools as a source

I'm in favor of adding W3 Schools as a source (http://www.w3schools.com/browsers/browsers_os.asp). It would be a great addition to the page. —Preceding unsigned comment added by Cityscape4 (talkcontribs) 21:05, 2 November 2010 (UTC)

The eight sources we include in the table at the moment each monitor a large number of sites and a wide audience. W3schools' stats are for its own site only and a narrow audience (web developers), so it is very unrepresentative of web users as a whole. There are many web sites which publish user share stats for their own sites. So why single out this one? W3Schools is of interest (especially to Linux users like me) because of its higher Linux figures, and that is mentioned in the article, but that is not a reason to put it in the table. We should not include stats in the table just because their figures match our own biases. Don't get me wrong - I'm not saying that you are motivated by any such bias - but it would be helpful if you could explain why you favor W3Schools' inclusion. --Harumphy (talk) 07:54, 3 November 2010 (UTC)

Server Market Share Figures are Wrong

I did some research on the numbers that Mary Jane Foley printed, and they aren't right, specifically her numbers do not match the IDC Press Release. I wrote it up on my site, and was looking for information here for another article, and noticed that you are still using her numbers. I'd suggest that someone look into this, since obviously, I'm an interested party, and shouldn't make the changes myself. Link is here (http://madhatter.ca/2010/11/06/server-operating-system-market-share-lies-lies-and-more-lies/). BTW, good work on the entry, I use Wikipedia as a reference all of the time. I do contribute, but not to the areas I write about. -- UrbanTerrorist (talk) 04:54, 8 November 2010 (UTC)

Thanks for the tip. You're right: Mary Jane Foley's figures aren't in the press release she cites, so I've removed this line from the table.Harumphy (talk) 09:13, 8 November 2010 (UTC)
Good catch, and thanks for taking care of it. Wikiolap (talk) 17:28, 8 November 2010 (UTC)

remove StatOwl and AT Internet institute from the statistics

I suggest removing StatOwl and AT Internet Institute from the web client statistics page. Reasons:

StatOwl

1) excludes mobile market

2) heavily US biased

3) small number of tracked websites compared to other sources. See below:

Source                       Websites tracked  Unique visitors
StatOwl                      ???               27m+
Clicky                       270k+             ???
Statcounter                  3m+               ???
w3counter                    40k+              ???
webmasterpro                 100k+             ???
Net applications             40k+              160m+
Gomez (see footnote @ [22])  34k+              150m+

I assume that the ratio of unique visitors per tracked website is somewhat constant. Since Gomez and Net applications have similar ratios, this assumption is somewhat valid. So StatOwl tracks only about 6k-8k websites. This is about five times fewer than the other sources.

AT internet institute

1) does not state its methodology. It's not known where they take the data from.

2) its numbers are very questionable (e.g. 20% Mac share in Switzerland)

3) tracks exclusively European websites

1exec1 (talk) 02:27, 12 November 2010 (UTC)
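The StatOwl site-count estimate above can be reproduced as follows. This is a rough sketch under the same assumption stated in the comment (a roughly constant number of unique visitors per tracked site); the published "+" figures are treated as exact values, which is itself a simplification, and the variable names are hypothetical.

```python
# (unique visitors, websites tracked) for the two sources that publish both
known_sources = {
    "Net applications": (160e6, 40e3),
    "Gomez": (150e6, 34e3),
}
statowl_visitors = 27e6

# Visitors per tracked site for each source with complete data
ratios = {name: visitors / sites for name, (visitors, sites) in known_sources.items()}

# Implied number of StatOwl sites if it had the same ratio as each source
estimates = {name: statowl_visitors / ratio for name, ratio in ratios.items()}

low, high = min(estimates.values()), max(estimates.values())
print(f"visitors per site: { {k: round(v) for k, v in ratios.items()} }")
print(f"estimated StatOwl sites: {low:.0f} to {high:.0f}")
```

Under these assumptions the estimate comes out between roughly 6,100 and 6,800 sites, at the lower end of the 6k-8k range quoted.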

The eight sources we use are the best we've found to date. If you know of others that are at least as good, let us know. You've listed some of the problems you perceive with two of the sources. But once we go down the route of discarding flawed sources we would logically end up discarding all of them, because they all have flaws, and there are many more flaws than the ones you've listed. We use all eight sources and calculate median figures in an attempt to mitigate against the statistical flaws in each source. The article does not claim that the summary table gives a true picture: on the contrary, it repeatedly draws the reader's attention to the inherent difficulties and shortcomings and just summarises what the cited sources report and leaves the reader under no false impressions about their credibility.--Harumphy (talk) 10:47, 12 November 2010 (UTC)
There have recently been similar discussions, starting at Talk:Usage share of web browsers#Remove Statowl from summary and going on for several sections, especially regarding StatOwl. --Nigelj (talk) 17:17, 12 November 2010 (UTC)
I'm in agreement with Harumphy. With regard to the web browser page, the editors there are so consumed with the global accuracy of the summary table even though there are 3+ paragraphs stating that there are countless flaws in any analysis. I would find that a bit humorous, except for the fact that the end result is looking like they're going to start removing sources. Jdm64 (talk) 19:35, 12 November 2010 (UTC)

web client table edits Nov 17, 2010

I reverted two of User:Wikiolap's edits today. Sorry I was a bit terse in the edit summary. The reasons were:

  • The change to the notes in the table really cluttered up its appearance. It looked awful IMHO. It was no doubt a good faith attempt to clarify which bit of the table each footnote related to, but that was pretty obvious anyway. The 'cost' greatly exceeded the 'benefit'. --Harumphy (talk) 20:13, 17 November 2010 (UTC)
I modeled it after similar table in Usage share of web browsers. Since our table has lots of notes, it looked like it would make it more clear.Wikiolap (talk) 20:32, 17 November 2010 (UTC)
Please never model anything after that article. It's awful! Harumphy (talk) 08:42, 18 November 2010 (UTC)
  • Both the cited web sites show higher shares for Linux, not just one. I originally put this paragraph in an attempt to fend off the perennial "Why don't we include W3Schools?" comments but am wishing I hadn't bothered. I suggest we just delete it. --Harumphy (talk) 20:13, 17 November 2010 (UTC)
Agree with deleting W3Schools note completely.Wikiolap (talk) 20:32, 17 November 2010 (UTC)
Done. Harumphy (talk) 08:42, 18 November 2010 (UTC)

Linux share: Caitlyn Martin's blog piece

Is this [23] a credible secondary source? It seems to me to be an exercise in wishful thinking. It seems to be clutching at straws. As a Linux enthusiast myself I've tried to follow her argument but it doesn't stack up IMHO. She says "The best estimate for present sales is around 8%", but she doesn't cite a source for this estimate, and in any case, present sales is a very different thing from the total installed base, bought over several years, that makes up usage share.

I can quite accept that the web client stats under-measure Linux a bit, mainly because Linux users are relatively security and privacy conscious and thus more likely to disable javascript, install adblock etc., all things which reduce counting on the third-party stats sites. It's interesting that Wikimedia's figures, based on server log files and thus immune to this hazard, show a somewhat higher figure (1.57%) than most of the others.--Harumphy (talk) 14:00, 4 December 2010 (UTC)

Agreed, but still not 8%. I tried to work that 8% figure in somewhere too, but the jump from 'installed user-base' to 'current sales' seemed too sharp for a short addition to existing text. The only way would be to devote a whole couple of sentences to it somewhere, and I'm not sure if she is notable enough for that. O'Reilly is a good source, but I'm not sure of her status to be speaking for them. --Nigelj (talk) 15:10, 4 December 2010 (UTC)
The only cited source there is quote from Steve Ballmer where he says that internal Microsoft research showed Linux and MacOS shares comparable. We already have this source covered. The blog doesn't seem notable enough to include in the article.Wikiolap (talk) 20:03, 4 December 2010 (UTC)

Should we remove the Wikimedia web client statistics?

The article currently states: "All of these sources monitor a substantial number of web sites. Statistics that relate to a single web site are excluded." To a large extent, this is not true for Wikimedia, of which Wikipedia alone is by far their most trafficked web site (although one that most English-language Web users have visited).

Also note that the Wikimedia report is based on the total number of HTTP requests rather than the number of unique clients (as determined using cookies). We need to consider the merits of the two approaches and which is more accurate. The Wikimedia report could easily be biased toward those operating systems used by those who access Wikipedia more often (although the others could be influenced by how much of each browser's user base regularly clears cookies). On these two principles, should we exclude the Wikimedia statistics? PleaseStand (talk) 00:48, 14 December 2010 (UTC)

Wikimedia's stats cover 60-odd sites within the Wikimedia family.[24] While this is much less 'substantial' than many of the other sources, it's much greater than 'one', the avoidance of which (specifically w3schools) was the original purpose of that sentence. (From time to time we get people trying to add w3schools' stats to the table, or suggesting that we should on this discussion page. Often they seem to be unaware that that site's stats are for its own site only, and that that site is aimed at web developers - a highly atypical readership with a much more diverse set of web clients than the general web-using population.) Also, the Wikipedias are very high-traffic sites. The English one disproportionately so, granted, but there are similar regional/linguistic skews in many of the other stats sources too. So I wouldn't exclude Wikimedia stats on the grounds that they monitor an insubstantial number of sites.
AFAIK there's no evidence to suggest that certain operating systems are used by those who access WP more often. Just as there's no evidence that certain OS's are used more by those who clear cookies, block scripts, use adblock etc. I imagine many of us have our suspicions in this regard, but no actual evidence. And if we had such evidence, the magnitude of the biases they introduce may be no larger than many of the other biases we already know about and to which all the sources are prone. So I don't think there's a case for excluding Wikimedia stats here, either. Harumphy (talk) 11:05, 14 December 2010 (UTC)

  Northern Ontario Jacob12190 (talk) 11:06, 14 December 2010 (UTC)

Web client table tweaks - January 2011

I propose to make a couple of minor tweaks to the table when the December figures come out in the new year, unless there are objections here first:

  1. For the Clicky desktop/mobile 'in lieu' split, take the mean of the Net Market Share and Statcounter figures instead of just using Statcounter. (This will probably have the effect of reducing Clicky's mobile share from around 4.1% to around 3.6%.) The footnote will explain what has been done.
  2. Android is rising rapidly and within a few months may overtake what we currently call 'mainstream' Linux. I propose to change the "mainstream" sub-heading to "desktop distros.". Harumphy (talk) 14:56, 29 December 2010 (UTC)
2) Oppose. There are several other mobile Linux distributions such as Maemo currently included within Mainstream Linux.1exec1 (talk) 00:24, 31 December 2010 (UTC)
Fair enough. I've done #1 but not #2 in today's update.Harumphy (talk) 11:09, 1 January 2011 (UTC)

Should we remove AT Internet Institute from web client stats?

We're seeing constant changes in the data, month by month (summary of each month here). With lack of frequent updates by ATII, we're ending up with less accurate results... 195.23.92.1 (talk) 18:18, 7 January 2011 (UTC)

Support 1exec1 (talk) 00:55, 8 January 2011 (UTC)
Be consistent - if we are to remove sources which don't update every month - then remove all of them, i.e. including Wikipedia one. If Wikipedia stays then ATII should also stay. Wikiolap (talk) 17:12, 8 January 2011 (UTC)
Remove - ATII is consistently slow at updating their stats. Remove for January's stats, unless they update. As for Wikimedia, we have a little more control over it. I've emailed the person updating the stats in the past, I think we just need to have him set up something more automated, because I think he has to manually run his scripts. Or talk to someone else that has access to the logs and can give them to us. Jdm64 (talk) 18:57, 8 January 2011 (UTC)
Keep. Last time we discussed this (see archive) we decided to keep stats for up to 12 months. It used to say as much in the web client section. (Somebody, unaware of the discussion and seeing that all the stats at the time were more recent, changed the "12 months" to "few months". I've just changed that back.) If 12 months is too long a period, then we should reduce that period. Whatever we do, we should apply the same time limit to all the sources.Harumphy (talk) 19:21, 8 January 2011 (UTC)
I, the guy who raised the issue in the first place, agree with this item, then. I didn't know, nor could I find out anywhere, that the discussion was held in the past and that "12 months" was decided. Since it was, let's just abide by the decision. 195.23.92.1 (talk) 14:09, 10 January 2011 (UTC)
Hmm, I haven't found any discussion in the archives. The problem is that the software market evolves very rapidly. Most of the time initial adoption of some product increases exponentially, by allowing 12 month delay we face data errors of more than 10% [25]. For example now AT Institute data is different from the median/mean data in Windows columns by huge margins (W7 - ~7%, Vista - ~4%, XP - ~11%). If we reduce the allowed delay to, say 6 months or less, we can lower the possible data errors more than two times.1exec1 (talk) 12:10, 21 January 2011 (UTC)
The 12 month precedent came up in Talk:Usage_share_of_operating_systems/Archive_1#web_clients_summary_table, in a brief discussion about what to do with OneStat data from Dec 8, 2008 that was getting on for one year old at the time. Jdm64 suggested "less than one year old" and I agreed - nobody else took part in the discussion. We removed OneStat on Dec 9, 2009. For a long time afterwards the article mentioned 12 months, which I recently reinstated.
As far as the time limit goes, I don't think it matters about AT being 'out of date' because (a) it's an encyclopedia, not a news site, so up-to-the-minute topicality is not essential, and (b) our choice of median rather than mean does a good job of excluding outlier figures. Harumphy (talk) 15:04, 21 January 2011 (UTC)

Median Windows Numbers

I've been playing with the Median numbers, which I had intended to quote in an article I was writing. I'm not quoting them. The numbers do NOT add up. No matter what I've tried, I cannot get those numbers to make any sense, and since there is no explanation of the calculations used to determine the Median, the only conclusion I can draw is that the numbers were either invented, or are in error. So instead of reporting your numbers, I'm reporting them, and my conclusion that this article is in error. If you want to look at the article, which is my prediction for where OS usage shares will be in 2012, it will be at: http://madhatter.ca

One other point - Netbooks should be included as Notebooks, and Tablets should be in a separate category, due to the form factor. Tablets have more in common with phones than they do with notebooks.

UrbanTerrorist (talk) 20:01, 8 January 2011 (UTC)

The medians are calculated just like any other median, surely? In each column, the median is the middle value of the group. Thus the median of 1, 2 and 999 is 2. (Where there is an even number, it's the mean of the middle two, so the median of 1, 2, 3 and 999 would be 2.5.) Are you saying that our table doesn't do this? If not, what precisely do you think does not add up?Harumphy (talk) 20:08, 8 January 2011 (UTC)
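For anyone who wants to check this, the two examples above can be reproduced with Python's standard `statistics.median`. The second half of the sketch uses made-up shares for three imaginary sources (not figures from the article) to show why column medians need not add up to 100% even when every source's row does.

```python
from statistics import median

# The two examples given above
print(median([1, 2, 999]))     # middle value of an odd-sized group
print(median([1, 2, 3, 999]))  # mean of the two middle values

# Hypothetical shares (in %) from three imaginary sources; each row sums to 100
shares = {
    "source A": {"Windows": 90.0, "Mac": 8.0, "other": 2.0},
    "source B": {"Windows": 92.0, "Mac": 5.0, "other": 3.0},
    "source C": {"Windows": 85.0, "Mac": 6.0, "other": 9.0},
}

# Median taken column by column, as in the summary table
column_medians = {
    os_name: median(row[os_name] for row in shares.values())
    for os_name in ("Windows", "Mac", "other")
}
total = sum(column_medians.values())

print(column_medians)
print(f"sum of medians: {total}")
```

In this made-up case the medians are 90, 6 and 3, which sum to 99 rather than 100. Nothing is missing; the medians of different columns simply come from different sources.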
Are they calculated like any other median? Who knows? There is no explanation as to the method(s) used, and no reference to an explanation. In effect we are given numbers, and told to believe them, which is against the policies of Wikipedia. Either provide an explanation, or remove the median figures.
UrbanTerrorist (talk) 14:49, 11 January 2011 (UTC)
The Median label links to Median article which explains how medians are calculated. Do you consider this is not enough ? Do you propose adding a footnote with more details on it ? Wikiolap (talk) 18:51, 11 January 2011 (UTC)
The term 'median' has a precise meaning and there's only one way of calculating it. Anyone who wants to can calculate the median herself and get the same figure. The problem here is not the article, but your inability to click on the link to the median page to find out what it means. Harumphy (talk) 13:58, 17 January 2011 (UTC)
A link to the explanation would make things clearer. UrbanTerrorist (talk) 07:51, 27 January 2011 (UTC)
And there is the very common problem of confusing median with mean, or believing they share some properties that they do not, like adding up to 100% (under certain circumstances). The different properties are not necessarily easy to understand, and the article is quite heavy for somebody not used to mathematical theory. --LPfi (talk) 10:49, 18 January 2011 (UTC)
Agreed. Which is why an explanation, or a link to an explanation is needed. UrbanTerrorist (talk) 07:51, 27 January 2011 (UTC)
I should mention that I'm using the numbers for some articles that I'm writing, and I want to make sure that they are as accurate as possible, thus the questions. And yes, I link back to the source.

What gives with server market share BY REVENUE.

Revenue doesn't measure server market share, it measures how much money the supplier of one type of server rakes in. It also measures how much purchasers have had to pay for servers of that type ... both of these are the same number. Because some types of server cost far more than others, the metric skews the perception of market share towards which servers cost the most. These are perceived as having a greater market share, even though they may have relatively small numbers.

Furthermore, there are far more purchasers of servers than there are suppliers. Ergo, from the overwhelmingly predominant perspective, it is better to label this number as COST rather than revenue. Far more people would see it as a cost as opposed to those who see it as revenue.

So, I will keep trying to change the word "Revenue" within the server market share section to read "Cost" instead, until it sticks, because this wording gives the proper perspective on it from the vast majority viewpoint.

Alternatively, one could remove the "Revenue" metrics entirely, because they simply do not show server market share as they purport to.—Preceding unsigned comment added by 118.210.63.179 (talk) 10:46, 9 January 2011

Please sign your comments - otherwise we won't know who said what. Please see Wikipedia:Signatures.
Before you 'keep trying to change the word "Revenue" etc.' please read Wikipedia:Edit_warring.
As you've said, revenue and cost refer to the same number. It's the same sum of money seen from the two sides of the deal. The sources we're citing measure sales, not purchases. So revenue is the more accurate term in this context. Harumphy (talk) 13:59, 9 January 2011 (UTC)
Why include market share by revenue figures at all? They're useful for stock investors (i.e. which OS is generating the most revenue from a given market), but they will deceive casual readers expecting to learn about the "Usage share of operating systems on servers". Wallers (talk) 14:16, 9 January 2011 (UTC)
IDC and Gartner are well known sources in the server industry. They report in revenue probably because business people are more interested in the money than in the number of units -- it's what they're used to. Also it's easier to measure because you can't just check what OS a server is running like you can with desktop web browsers. A server could be hosting several virtual servers (upwards of 15) and each virtual server can have its own IP address. So without detailed investigation, you'd see 15 servers when in reality there's only one real server. Hence, the real number of servers (as opposed to number of installs of an OS) is more closely correlated to revenue. Although, it gets complicated because the licensing of Linux servers can be free if the company goes with a distro without 24/7 support (like Debian or CentOS) or it could be costly if paying for a full subscription of RHEL. So, even though Linux is low on revenue, it's still really high in actual usage. Jdm64 (talk) 18:20, 9 January 2011 (UTC)
Jdm64 wrote it exactly right. There are different metrics to measure market share, and they indeed measure different things. Market share by units is interesting to know which server OS is most popular. Market share by revenue is interesting to see which server OS vendor is making most money. So there is no contradiction - both are interesting, both are useful, just for different purposes. Wikiolap (talk) 18:11, 10 January 2011 (UTC)
"Making the most money" also means "getting the most money out of people for less cost to yourself". "Revenue" is a word with positive connotations in people's minds, whereas "Cost" has negative connotations. "Revenue" and "cost" are the same number ... what is revenue to sellers of servers is cost to buyers of servers. Since there are far more buyers than sellers, it is in the best interests of more people to show market share from the perspective of buyers rather than sellers (that is, to show it as cost rather than revenue). A casual reader might see the OS with the highest revenue and without thinking about it associate that positive term with "the best choice", when in fact it is the most costly to him/her. From this perspective, citing statistics for sale value (price) of servers, labelling it with the positive-sounding term revenue, and claiming that this shows "market share" is doing a severe disservice to most people. In fact, it comes perilously close to free advertising for one company, which I would have thought goes against Wikipedia policy.—Preceding unsigned comment added by 118.210.63.179 (talk) 14:08, 11 January 2011
I think you are stretching the definition of advertising a bit :) Your thinking about positive vs. negative association is interesting, but it is your opinion only. In an encyclopedia we cite verifiable and reliable sources - and both Gartner and IDC qualify as such. They label the metric Revenue, and we must respect that. Wikiolap (talk) 18:54, 11 January 2011 (UTC)

IDC also report units, so there's no reason to use revenue, and I've changed the numbers to the unit rather than revenue figures. For the Gartner figures, I checked the source, and they appear to be unit figures as well, not revenue figures. Shalineth (talk) 06:36, 10 February 2011 (UTC)

Can you show where in the source the reported percentages are referred to as units? In the table that we cite, the column headers say Revenue. Wikiolap (talk) 20:53, 10 February 2011 (UTC)
The Gartner source is a three-year-old Reuters article. The text in the article reads:
According to research firm Gartner, the Windows share of global server shipments gained a percentage point to 66.8 percent in 2007 from a year earlier. Open-source Linux's share fell by a percentage point to 23.2 percent last year and Unix dropped to 6.8 percent in 2007 from 8.1 percent in 2006.
Note that this refers to the share of global server shipments, i.e. units, not to the share of global server revenue. The figures are also similar to IDC unit figures from about the same time, but very different from IDC revenue figures. It was only in 2005/6 or so that Windows servers overtook Unix servers in terms of revenue, but Windows has been ahead of Unix in unit shipments since the 1990s.
The IDC figures in the table are indeed revenue for server hardware (not revenue for operating systems), but it is a methodological error to use this as an indicator of server OS 'usage share'. What possible sense is there in saying that one server costing €20 000 and running one instance of AIX contributes 40 times as much to the AIX 'usage share' as one server costing €500 and running one instance of Linux or Windows?
I had provided a source for IDC unit shipments and corrected the table to include them, but this was reverted.
In a comparison of server revenue or server profitability, prices matter. If HP, for example, are selling a lot of €50 000 servers and Dell are selling a lot of €1 000 servers, that has a huge impact on their respective results. If each server runs only one copy of an OS, however, the 'usage share' is 1 for each server. The idea that multiplying server operating system units by the cost of the hardware the OS runs on somehow represents 'usage share' is completely nonsensical. Shalineth (talk) 16:28, 12 February 2011 (UTC)
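
The unit-versus-revenue arithmetic discussed in this thread can be sketched in a few lines of Python. This is only an illustration: the two-server "market" below is hypothetical, using the €20 000 and €500 prices mentioned above, and the `shares` helper is made up for this example rather than taken from IDC or Gartner methodology.

```python
# Hypothetical two-server market; prices are the illustrative ones from the thread.
servers = [
    {"os": "AIX", "units": 1, "price_eur": 20_000},
    {"os": "Linux", "units": 1, "price_eur": 500},
]


def shares(rows, weight):
    """Return each OS's fraction of the total, using weight(row) as the measure."""
    totals = {}
    for row in rows:
        totals[row["os"]] = totals.get(row["os"], 0) + weight(row)
    grand_total = sum(totals.values())
    return {os_name: value / grand_total for os_name, value in totals.items()}


# Share by units: every server counts once.
unit_share = shares(servers, lambda r: r["units"])

# Share by hardware revenue: every server counts in proportion to its price.
revenue_share = shares(servers, lambda r: r["units"] * r["price_eur"])

print(unit_share)
print(revenue_share)
```

With these made-up inputs the two operating systems have equal unit share, while the revenue measure weights the AIX machine 40 times as heavily as the Linux one, which is the discrepancy the thread is arguing about.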

Overview section

Recently somebody added the overview table, and various editors including me attempted to knock it into shape. I don't think we've been very successful. I can't see where many of the figures come from. The web client medians don't (and shouldn't be expected to) add up to 100% and so don't constitute 'share' anyway. Should we delete this section, or can it be improved? Harumphy (talk) 08:55, 17 January 2011 (UTC)

I'd lean towards deleting it. Many of the fields are blank because some of the selected OSs come from disjoint usage (i.e. mainframes and smartphones). That said, the one thing going for the table is the quick summary. I'd propose that key stats from the table be included in the opening paragraph. Something to elaborate on the current opening, but with some actual numbers. Jdm64 (talk) 10:32, 17 January 2011 (UTC)
Does anyone want to speak in the overview section's defence? If not I'll delete it in a couple of days from now. As far as putting figures in the opening paragraph goes, ideally that would be done in a way that doesn't need updating every month. Harumphy (talk) 09:12, 18 January 2011 (UTC)
I vote to delete it, the idea of this overview table never appealed to me, and it doesn't look like it is helping the article. Wikiolap (talk) 17:26, 18 January 2011 (UTC)
Now deleted as agreed. Harumphy (talk) 10:14, 21 January 2011 (UTC)
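
On the point raised at the start of this thread, that per-OS medians don't (and shouldn't be expected to) add up to 100%, a small Python sketch may help. The three "sources" below are invented for illustration and are not the figures cited in the article; each one adds up to 100% on its own.

```python
from statistics import median

# Hypothetical web-client shares (per cent) from three made-up sources.
sources = {
    "source_a": {"Windows": 90, "Mac": 6, "Linux": 4},
    "source_b": {"Windows": 88, "Mac": 10, "Linux": 2},
    "source_c": {"Windows": 92, "Mac": 5, "Linux": 3},
}

# Take the median of each OS column independently, as the article's table does.
os_names = ["Windows", "Mac", "Linux"]
medians = {
    name: median(figures[name] for figures in sources.values())
    for name in os_names
}

median_total = sum(medians.values())
print(medians, median_total)
```

Because each column's median can come from a different source, the medians need not sum to 100 even though every individual source does.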

Font sizes

I've reverted 1exec1's changes to a couple of tables, in which the font size was fixed at 85% of normal.

Please, if font sizes look too big on your computer, it doesn't mean they look too big on everyone else's. The web is not a WYSIWYG medium. You can adjust your browser's normal font size to suit your preferences. I have adjusted mine, and I don't want you reducing the font size on my computer (and everyone else's) just because it looks better on yours!

Besides, from a graphic design point of view it looked awful. --Harumphy (talk) 09:37, 9 February 2011 (UTC)

Possibly dubious server share claims based on websites

Are there any authoritative sources suggesting that scanning public websites is a reasonable way to estimate server market share? It seems rather dubious to me. For one thing, Linux is well known as a good OS for running web servers (the LAMP stack), so a sample of web servers may not be representative of servers in general, but rather biased towards Linux.

Another problem is that a single server OS can host a large number of small websites, whereas a large website may require several servers, especially if it makes heavy use of SSL. If website characteristics differ across sites, then an estimate of server market share based on the number of websites would be biased towards the system most favoured by smaller, less complex sites. Netcraft report a 50 per cent Windows share for SSL websites (http://news.netcraft.com/ssl-survey/), compared with a 25 per cent share for non-SSL websites, which suggests that estimates based on websites may indeed be biased towards Linux.

Unless there is an authoritative source suggesting that counting the number of public websites using a particular server OS is a valid way of estimating server OS market share, this looks like original research, and I suggest it be deleted. A separate section on web server OS share might be reasonable. Shalineth (talk) 06:55, 10 February 2011 (UTC)

Netcraft is a reliable and verifiable source, and we properly reference it. They choose to analyze OS share of web servers, and we accurately mention it. Hence this is not original research. Some readers may or may not agree with their methodology, but it is really not up to us to make judgments whether or not we like it. As an encyclopedia we report citing reliable and verifiable sources. Wikiolap (talk) 20:51, 10 February 2011 (UTC)
Netcraft present website statistics, not server OS market share. You're confusing two different things, and that's where the original research lies. It's like looking at valid statistics for flights out of a particular airport and claiming that represents airliner market share. Shalineth (talk) 11:13, 12 February 2011 (UTC)
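
The concern in this thread about counting websites rather than servers can be sketched in Python as follows. Everything here is hypothetical: "OS A" and "OS B", the site counts and the server counts are invented to show the direction of the possible bias, and say nothing about real Netcraft data.

```python
# Hypothetical scenario: one OS A server hosts 100 small websites,
# while a single large website needs four OS B servers.
websites_by_os = {"OS A": 100, "OS B": 1}
servers_by_os = {"OS A": 1, "OS B": 4}


def share(counts):
    """Convert raw counts per OS into fractions of the total."""
    total = sum(counts.values())
    return {os_name: count / total for os_name, count in counts.items()}


website_share = share(websites_by_os)  # what a scan of public websites would see
server_share = share(servers_by_os)    # what a count of actual servers would see

print(website_share)
print(server_share)
```

In this made-up case the website count puts OS A at about 99%, while the server count puts it at 20%, so the two measures can diverge sharply when hosting patterns differ between operating systems.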
  1. ^ Always calculated by subtracting listed columns from 100%.
  2. ^ W3Counter report is based on the last 15,000 page views to each of 30,961 websites tracked
  3. ^ a b Wikimedia stat for Vista includes Server 2008; XP includes Server 2003.
  4. ^ Wikimedia uses 1:1000 sampling of its logs when deriving the usage numbers