Welcome to Wikipedia! Kiefer.Wolfowitz (talk) 20:45, 28 February 2010 (UTC)

Statistical inference

Hi Professor McPastry! I reverted your edit, technically because you removed the start of a footnote and not the end, which promoted part of the footnote's contents to the introduction. I am curious why you removed the statement that statistical models reflecting objective randomization were more reliable than statistical models proposed by a statistical scientist. Best regards, Kiefer.Wolfowitz (talk) 20:48, 28 February 2010 (UTC)

I took them out because they're not justified without a discussion of what you mean by "reliable". If the statistical scientist has a good understanding of the underlying science, then relying on randomization is inefficient. If the statistical scientist has no clue about what's going on, then randomization may be best.
(I introduced the usual indenting. I'm sorry if this is objectionable.)
The emphasis on randomization's good properties has been standard since Peirce, and is endorsed by e.g. ASA guidelines for the first course in statistics for non-statisticians, and books by Freedman and by Moore & McCabe. The emphasis on randomization-based inference is especially dominant in survey sampling (e.g., Ken Brewer, and Swedes Särndal, Swensson, and Wretman, who are all considered friendly to models). Do you disagree with these statements?
Cox's 1958 book on DOE and his more recent Principles both state that statistical models have a record of success in some areas of science, where there is high precision and control, or as you say, good knowledge. This could be cited as a "reliable source", preferably (imho) after the introduction in the article.
Thanks Kiefer.Wolfowitz (talk) 21:04, 28 February 2010 (UTC)
Regarding "Reliable". The (finite-population) randomization-based inference depends on the randomization scheme in survey sampling and (with additional hypotheses of unit-treatment additivity) in design of experiments, and doesn't require any assumption about the distribution. For DOE, this is explained in Hinkelmann and Kempthorne. (I do not say that randomization-based inference suffices for every problem or is always better.) That's the theoretical basis for the claim of greater reliability in general. (The footnotes support these statements.)
Freedman describes some empirical studies comparing the results of randomized experiments and random samples with observational studies. Similar conclusions are made by the psychologists/program evaluation experts Donald Campbell and Thomas Cook in their book on quasi-experiments. That's the empirical basis (which isn't cited here). Thanks Kiefer.Wolfowitz (talk) 21:12, 28 February 2010 (UTC)
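To make the theoretical point above concrete, here is a minimal sketch of a randomization test for a completely randomized two-group experiment. It is only an illustration: the function name `randomization_p_value`, the data, and the choice of the absolute difference in means as the test statistic are all assumptions of this sketch, not taken from Hinkelmann and Kempthorne or any other source cited here. The reference distribution comes entirely from the randomization scheme (every choice of treated units is equally likely), under the sharp null hypothesis that treatment changes no unit's response. No distribution is assumed for the responses.

```python
from itertools import combinations


def randomization_p_value(responses, treated_idx):
    """Exact two-sided p-value under the sharp null of no treatment effect.

    The reference distribution is built by enumerating every assignment the
    randomization scheme could have produced (all equally likely subsets of
    the same size), not by assuming a distribution for the responses.
    """
    n = len(responses)
    k = len(treated_idx)

    def mean_diff(treated):
        treated = set(treated)
        t = [responses[i] for i in treated]
        c = [responses[i] for i in range(n) if i not in treated]
        return sum(t) / len(t) - sum(c) / len(c)

    observed = abs(mean_diff(treated_idx))
    # Under the sharp null the responses are fixed; only the labels vary.
    diffs = [abs(mean_diff(a)) for a in combinations(range(n), k)]
    # Small tolerance guards against floating-point ties.
    extreme = sum(d >= observed - 1e-12 for d in diffs)
    return extreme / len(diffs)


# Hypothetical data: 6 units, of which units 0, 1, 2 were randomly treated.
responses = [12, 14, 15, 7, 8, 9]
p = randomization_p_value(responses, [0, 1, 2])
print(p)
```

Full enumeration is only practical for small experiments. For larger ones, a random sample of assignments is the usual substitute, at the cost of an approximate p-value.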
Randomization is indeed a fine idea. I am aware of its place in introductory and survey work. I also understand distribution-free methods. I just object to the implication that randomization is preferable to other forms of inference; each has a place, and an encyclopaedia should try to cover them all, impartially. McPastry (talk) 21:16, 28 February 2010 (UTC)
"Fine idea" is an irrelevant distraction. Are you denying that randomization is emphasized in official documents of our professional and by leading statisticians? Are you denying that "statistical models" are notoriously unreliable in general---for example in the social science literature or much of epidemiology, at least according to the leading authors? Kiefer.Wolfowitz (talk) 21:27, 28 February 2010 (UTC)Reply
No, I did not deny that randomization is important (please see my previous comment). However, to lead off an encyclopaedia entry, statements about what's preferable or "much more helpful" (when you haven't yet explained what the comparison is to) seem misplaced. McPastry (talk) 21:42, 28 February 2010 (UTC)