Talk:Dot plot (statistics)
This article is rated Start-class on Wikipedia's content assessment scale.
Old discussion
Dot plots can be used for discrete data as well, not only continuous data. VictorBraga 02:23, 18 September 2007 (UTC)
- True enough; it can be used with discrete, non-categorical data (e.g. integers instead of fractions). Perhaps the definition should be clarified by referring to "interval-scale" data rather than "continuous" data? Tom Hopper (talk) 14:23, 13 September 2010 (UTC)
There is much more to dot plots and dot charts... I don't know how to include graphics here; I can create some examples in R and write them out as .png, .jpg, metafile, .pdf, or PostScript. I'll try to figure out how to do this. Plf515 (talk) 01:51, 13 March 2008 (UTC)
Cleveland Dot Plot vs. Wikipedia Dot Plot
I am having a little difficulty with this page's definition of "dot plot" and the corresponding references. I'm also not sure of the best way to fix this.
The Dot Plot, as defined by Cleveland in ref. 1, pairs exactly one continuous value with each level of a categorical variable (although a second categorical variable can be added to plot in two panels or colors). In the R programming language, this type of chart is a "dotchart" or (using the lattice package) "dotplot."
The dot plot defined and illustrated on this page has multiple continuous values for each level of the categorical variable. It is, in essence, a means of comparing the distribution of data across categories (functionally equivalent to stacking multiple histograms). In the R programming language, this is referred to as a "stripchart" or (using the lattice package) "stripplot." Minitab calls this a "Dotplot." I don't know what other software or references call this sort of plot.
The Cleveland Dot Plot would come about from data like:
Red | 10 |
Blue | 9 |
Green | 7 |
The dotplot or stripchart as described on this page would come about from data like:
Red | 9 | 9 | 10 | 12 | 8 |
Blue | 10 | 7 | 7 | 9 | 8 |
Green | 6 | 9 | 9 | 8 | 9 |
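To make the distinction concrete, here is a minimal plain-text sketch in Python (standard library only, no plotting) that renders the two data layouts above. The helper names `cleveland_row` and `stacked_rows` and the axis range of 6 to 12 are assumptions chosen for illustration; they are not taken from R, Minitab, or any of the references.

```python
from collections import Counter

# Hypothetical helpers for illustration only.
# Assumption: integer values on a shared axis running from 6 to 12.
LO, HI = 6, 12

def cleveland_row(value, lo=LO, hi=HI):
    """Cleveland-style row: a single 'o' marks the one value for a category."""
    return "." * (value - lo) + "o" + "." * (hi - value)

def stacked_rows(sample, lo=LO, hi=HI):
    """Stacked rows: one 'o' per observation, piled above each value."""
    counts = Counter(sample)  # a missing value counts as 0
    tallest = max(counts.values())
    return [
        "".join("o" if counts[v] >= level else " " for v in range(lo, hi + 1))
        for level in range(tallest, 0, -1)  # top of the stack first
    ]

# First layout: exactly one value per category.
single = {"Red": 10, "Blue": 9, "Green": 7}

# Second layout: several values per category.
multiple = {
    "Red": [9, 9, 10, 12, 8],
    "Blue": [10, 7, 7, 9, 8],
    "Green": [6, 9, 9, 8, 9],
}

print("One value per category:")
for label, value in single.items():
    print(f"{label:<6}|{cleveland_row(value)}|")

print("Several values per category:")
for label, sample in multiple.items():
    for row in stacked_rows(sample):
        print(f"{label:<6}|{row}|")
```

The first rendering has one mark per category on a common scale; the second piles marks at each observed value, so each category reads like a small histogram.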
It seems that we either need to remove the reference to Cleveland, or describe the two different definitions of "dot plot," with all of the appropriate references.
Personally, I'm in favor of keeping both definitions, but describing them separately on this page.
Any opinions?
Reference to stripchart/strip plot in wrong place
There's a discussion of stripcharts and strip plots in R under Cleveland dot charts.
This is simply wrong, as the most cursory of glances at the plots generated in R would suggest. A Cleveland dot chart is called a `dotchart` in R. Stripcharts/strip plots (`stripchart` in R) are, at least with `method="stack"`, the other kind of dot plot.
I'm going to edit, but I wanted to explain why it was done. Glenbarnett (talk) 02:07, 18 January 2015 (UTC)
Suspect definition of "non-Cleveland" Dot Plot
The article states:
"The algorithm for computing a dot plot is closely related to kernel density estimation. The size chosen for the dots affects the appearance of the plot. Choice of dot size is equivalent to choosing the bandwidth for a kernel density estimate."
but previously stated:
"The first has been used in hand-drawn (pre-computer era) graphs to depict distributions going back to 1884".
It seems unlikely people were carrying out KDE-like calculations by hand in 1884, and the information I can find on this subject outside Wikipedia is patchy and contradictory. It even seems many people use the two types described here interchangeably, or are not even aware of there being two types.
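For anyone trying to judge the quoted claim about dot size, here is a rough Python sketch of one simple way to stack dots by width. It is a simplified greedy binning written for this discussion, not necessarily the algorithm the article has in mind; the function name `dot_stacks` and the `dot_width` parameter are hypothetical.

```python
def dot_stacks(data, dot_width):
    """Greedy left-to-right binning (a simplified sketch, not a reference algorithm).

    Each stack starts at the smallest unassigned point and absorbs every point
    less than dot_width to its right. Returns (center, count) pairs, where the
    center is the midpoint of the stack's smallest and largest members.
    """
    if dot_width <= 0:
        raise ValueError("dot_width must be positive")
    points = sorted(data)
    stacks = []
    i = 0
    while i < len(points):
        start = points[i]
        j = i
        while j < len(points) and points[j] < start + dot_width:
            j += 1
        members = points[i:j]  # never empty, because dot_width > 0
        center = (members[0] + members[-1]) / 2
        stacks.append((center, len(members)))
        i = j
    return stacks

red = [9, 9, 10, 12, 8]  # the "Red" row from the example data on this page
print(dot_stacks(red, 1))    # narrow dots: equal values share a stack
print(dot_stacks(red, 2.5))  # wide dots: neighbouring values merge
```

In this sketch, integer data with a dot width of 1 reduces to counting how often each value occurs, which needs no density calculation at all. A wider dot merges neighbouring values into fewer, taller stacks, which is at least the sense in which dot size behaves like a smoothing parameter.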