Talk:Permutation test


Permutation Tests

I originally wrote the permutation test article. I understand that in the Wikipedia world that doesn't mean much, but it had gotten so convoluted with parenthetical phrases and qualifications and such that it was virtually impenetrable. So I edited it. I appreciate the helpful additions as well as putting the article into the general rubric of resampling which makes sense. - Respectfully, WJF. —Preceding unsigned comment added by 129.171.150.73 (talkcontribs) 15:59, 23 January 2006

I made changes to Bootstrap and Jackknife, wrote a much longer text on Permutation test, and moved the reference list after Bootstrap to the reference list at the end of the chapter. My writing is based on my experience as an applied statistician and a developer of statistical software with emphasis on resampling techniques, except for the text about Jackknife, which borrows heavily from Mooney & Duval (see list of references). I have tried to keep as much as possible of the original text, but in some cases where it clashed with my own writing it was removed. For example, the sentence
"permutation tests usually involve calculation of test statistics from and permutation of the observed data, as opposed to other non-parametric tests which may involve analysis of the ranks of data points"
has been removed because it may confuse readers: rank tests are not 'other non-parametric tests'. Rank tests, for example the Mann-Whitney U test and the Spearman rank correlation test, are permutation tests. - Respectfully, VS. —Preceding unsigned comment added by 130.241.83.174 (talkcontribs) 08:39, 3 February 2006

I made some changes to the Permutation test section to correct some vagueness and misleading comments. I am not an expert in this area, but the current section does not appear to present a balanced perspective on parametric vs. permutation tests. In addition, all sections would benefit mightily from simple examples of each technique. - Ken K 21:15, 1 March 2006 (UTC)


I changed the "approximation" section title to "Monte Carlo Testing" and some of the language therein. Monte Carlo testing is not an approximation, but an exact test (meaning that the true alpha = nominal alpha) and is asymptotically equivalent to the test performed by enumerating all of the possible arrangements. Ken K 19:45, 30 March 2006 (UTC)

Wikipedia wrote "An important consequence of the exchangeability assumption is that tests of difference in location (like a permutation t-test) require equal variance" I'm wondering... requires equal variance to infer what? Do you mean to draw an inference about the population from which the samples are drawn? Okay, maybe so. But there is a radically different way of thinking about permutation tests - as not only distribution free but POPULATION FREE. If the inference is limited to the sample at hand (or to put it a different way, if the entire population is being measured) then I don't see how equal variance is necessary. Why do we need statistical inference if we have the whole population? Because we need to know whether the difference between groups is plausibly attributable to chance (random assignment or simply chance factors).—Preceding unsigned comment added by 128.186.195.132 (talkcontribs) 14:34, 16 June 2006

Answer to the previous post: A test of group difference is not 'POPULATION FREE'. It is a test of whether the observed data belong to one population or to two different populations. This holds regardless of whether the test is parametric or non-parametric, and the requirement of exchangeability is likewise independent of whether we regard the observed sample as a random sample from a larger population or as the population in itself. For a comprehensive explanation of this, read the article by Welch. But it is also easy to understand this requirement if we think about a concrete example of testing that two groups have the same mean. Assume that we have two samples (or two complete populations) with different variances, that we randomly draw one observation from the combined sample, and that this observation happens to have a value in the tail of the combined distribution. A permutation test is a conditional test, which means that the marginal distribution of the combined sample is fixed. So if we observe an extreme value and know (for example) that the first group has larger variance than the second group, then the probability that the observation belongs to the first group is larger than the probability that it belongs to the second group when the null hypothesis is true. This invalidates the basic assumption of a permutation test that all permutations of the observed sample have equal probability when H0 is true. Permutation in this situation is equivalent to the allocation of an observation to the first or second group. This means that if the two groups have very different variances, the significance from the permutation test of group difference in means may be completely misleading. V.S. 28 July 2006—Preceding unsigned comment added by 81.230.140.189 (talkcontribs) 17:08, 27 July 2006
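
To make the unequal-variance argument above concrete, here is a minimal simulation sketch. It is not taken from the article or from Welch; the group sizes (5 and 50), the standard deviations (10 and 1), the number of simulated data sets, and all function names are assumptions chosen only for illustration. Both groups are drawn with the same mean, so every rejection is a false positive.

```python
import random


def perm_pvalue(x, y, n_perm, rng):
    """Two-sided Monte Carlo permutation p-value for a difference in means."""
    pooled = x + y
    n_x, n_y = len(x), len(y)
    total = sum(pooled)

    def abs_diff(sum_x):
        # |mean of group 1 - mean of group 2| given the sum of group 1
        return abs(sum_x / n_x - (total - sum_x) / n_y)

    observed = abs_diff(sum(x))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reallocation of observations to groups
        if abs_diff(sum(pooled[:n_x])) >= observed - 1e-12:
            hits += 1
    # (hits + 1) / (n_perm + 1) keeps the Monte Carlo p-value away from zero
    return (hits + 1) / (n_perm + 1)


def rejection_rate(sd_x, sd_y, n_x=5, n_y=50, n_sim=200, n_perm=199,
                   alpha=0.05, seed=0):
    """Fraction of simulated data sets (equal means) rejected at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        x = [rng.gauss(0.0, sd_x) for _ in range(n_x)]
        y = [rng.gauss(0.0, sd_y) for _ in range(n_y)]
        if perm_pvalue(x, y, n_perm, rng) <= alpha:
            rejections += 1
    return rejections / n_sim


rate_equal = rejection_rate(sd_x=1.0, sd_y=1.0)     # exchangeable under H0
rate_unequal = rejection_rate(sd_x=10.0, sd_y=1.0)  # small group, large variance
print("rejection rate, equal variances:  ", rate_equal)
print("rejection rate, unequal variances:", rate_unequal)
```

Under these assumed settings the equal-variance rate should stay near the nominal 5%, while the unequal-variance rate should come out well above it, which is the sense in which the significance "may be completely misleading". The exact figures depend on the seed and the chosen sizes.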

the external link at the bottom of the page (to some random verizon user's page) is (i) broken and (ii) an advertisement (to a "Statistical Consultants for Clinical Trials, Legal Affairs, and Marketing." company). i suggest deletion. -c.w., nyc, Mon Sep 4 05:37:07 EDT 2006

Permutation test

With the aim of clarifying the permutation test, I added to its section a couple of paragraphs describing how the test is performed. In the near future, I could add an example. Gideon fell 14:09, 9 March 2007 (UTC)

I think you should reword this section a bit. The test you described is a test of the weak null hypothesis, that the means of the two distributions are equal. However, you've used language that implies it is a test of the strong null hypothesis, that the two distributions are in fact equal. It is possible to test the strong null hypothesis by permuting some measure of distributional distance like the Kullback-Leibler divergence, but I think it's more instructive to continue with the weak null hypothesis. I would suggest clarifying that we are testing whether or not the means of the two distributions are equal. --Saffloped (talk) 00:54, 10 August 2010 (UTC)
Would be nice not only for this test but for the others as well. Stevemiller (talk) 04:15, 3 March 2008 (UTC)
Presumably the example is a test not of whether the samples come from the same distribution, but of whether they come from distributions with equal means - should this be corrected? —Preceding unsigned comment added by 128.243.220.21 (talk) 15:46, 3 March 2008 (UTC)


This article is not very clear about what a permutation test actually is. I read the main section on permutation test several times and compared it to other sources and I'm still kind of fuzzy on it. Specifically I'm confused over how you compare the results after the permutations. Is it necessarily implied by permutation test that you order all of the test statistic values, find the number of t values "more extreme" than your t value (I'll call it k), and say that your confidence of the null hypothesis is (k/n!)? Or is that just one way to do it? -Anadverb 16:24, 24 September 2006 (UTC)
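
One way to picture the procedure asked about above is the following minimal sketch of a two-sample test by full enumeration. The data values and the function name are made up for illustration, and the difference in means is only one possible test statistic. The sketch enumerates the C(n, n_a) ways of choosing which observations go to the first group rather than all n! orderings; every such split corresponds to the same number of orderings (n_a! times n_b!), so the proportion k/n! comes out the same. The resulting proportion is the p-value, that is, the probability under the null hypothesis of a statistic at least as extreme as the observed one.

```python
from itertools import combinations


def exact_permutation_test(a, b):
    """Exact permutation test for a difference in means (a minus b).

    Returns the observed statistic, the upper-tail p-value and the
    two-sided p-value, each computed over all possible group splits.
    """
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)
    observed = sum(a) / n_a - sum(b) / n_b
    tol = 1e-12  # guards against floating-point ties being missed

    n_splits = 0
    upper = 0
    two_sided = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        stat = sum_a / n_a - (total - sum_a) / n_b
        n_splits += 1
        if stat >= observed - tol:
            upper += 1
        if abs(stat) >= abs(observed) - tol:
            two_sided += 1
    return observed, upper / n_splits, two_sided / n_splits


# Hypothetical data: two groups of three observations, 20 possible splits.
observed, p_upper, p_two_sided = exact_permutation_test([12, 15, 14], [8, 9, 7])
print(observed, p_upper, p_two_sided)
```

With these made-up values only the observed split reaches the observed difference, so the upper-tail p-value is 1/20, and the mirror-image split adds one more for the two-sided p-value of 2/20.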

Misconceptions

Some misconceptions have crept into this article since I last read it.
Misconception 1: (some authors speak of permutation tests in this last case only, using the term randomization test in the previous situation).

Maybe they do, but then they don't understand what a permutation test is, and it would be best to keep quiet about this. A permutation test is a test that derives the distribution of the test statistic from the permutation distribution defined from the observed data. When we perform the test in a practical situation, it may entail the enumeration of all permutations, or a random selection of them, but that does not mean that we have two different tests, nor two variants of the same test. The test in itself is the same in both cases; the only difference is that in some situations we prefer to take a time-saving shortcut when we calculate the p-value of the test. So even if there is a difference (of no practical importance) on a practical level, they are the same test on the theoretical level.

This misconception is also shown in the sentence: This type of permutation test is known under various names: approximate permutation test, Monte Carlo permutation tests or random permutation tests[2].

There exists only one type of permutation test from this perspective. Even if there are a few alternative ways to calculate the p-value, this is only a matter of computational detail and does not give rise to different tests.

Misconception 2: "The Student's t test is exactly a permutation test under normality and is thus relatively robust. The F-test (z-test) and chi-squared test are far from exact except for in large samples (n > 5, or 20)."
The Student's t test is not a permutation test in any situation.
Valter Sundh 2007-04-14


wha?

What is the meaning of the "(1 - )" in the description of permutation testing? 96.241.2.69 (talk) 04:16, 8 May 2008 (UTC)

It is just 1 minus. Biggerj1 (talk) 23:17, 7 March 2022 (UTC)

Permutation test

In the permutation test section we have:

"the one-sided p-value of the test is calculated as the proportion of sampled permutations where the difference in means was greater than or equal to T(obs). The two-sided p-value of the test is calculated as the proportion of sampled permutations where the absolute difference was greater than or equal to ABS(T(obs))".

I do not see how the one-sided p-value can be calculated in this way without knowing whether T(obs) is positive or negative. For example, if T(obs) = -1, then all permutations whose difference in means is greater than or equal to -1 are counted as at least as extreme, including all the nonnegative differences, when intuitively we want to count from the tail closest to T(obs).

217.44.49.193 (talk) 10:56, 5 April 2012 (UTC)
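
The concern about a negative T(obs) can be seen in a small worked case. The sketch below uses made-up data and a hypothetical function name, and the difference in means as the statistic. It computes both one-sided proportions by full enumeration, so the effect of choosing the wrong tail is visible directly.

```python
from itertools import combinations


def one_sided_pvalues(a, b):
    """Upper-tail and lower-tail exact permutation p-values for mean(a) - mean(b)."""
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)
    observed = sum(a) / n_a - sum(b) / n_b
    tol = 1e-12  # tolerance for floating-point ties

    n_splits = 0
    at_least = 0  # splits with statistic >= observed (upper tail)
    at_most = 0   # splits with statistic <= observed (lower tail)
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        stat = sum_a / n_a - (total - sum_a) / n_b
        n_splits += 1
        if stat >= observed - tol:
            at_least += 1
        if stat <= observed + tol:
            at_most += 1
    return observed, at_least / n_splits, at_most / n_splits


# Hypothetical data in which the first group is clearly the smaller one,
# so the observed difference in means is negative.
t_obs, p_greater, p_less = one_sided_pvalues([7, 8, 9], [12, 14, 15])
print(t_obs, p_greater, p_less)
```

Here every split gives a difference greater than or equal to the negative T(obs), so the "greater than or equal" proportion is 20/20, while the lower-tail proportion is 1/20. The "greater than or equal" rule is therefore a test against the alternative that the first group has the larger mean; the direction has to be fixed by the alternative hypothesis, not read off the sign of T(obs) after the fact.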

Note on calculation of permutation test p-value

It's a relatively minor point, but if someone more qualified wants to add it in, the permutation test p-value is not strictly the proportion of sampled permutations where [the test statistic] is greater than or equal to [T_obs or abs(T_obs)] -- a correction must be applied such that the type I error rate is controlled appropriately, as simply taking such a proportion inflates the type I error rate. See Phipson and Smyth (2010) Statistical Applications in Genetics and Molecular Biology ( https://doi.org/10.2202/1544-6115.1585 ) for details. Best, PFR 173.72.6.161 (talk) 00:34, 12 March 2018 (UTC)

Thanks, I added a sentence about this today. GKSmyth (talk) 01:10, 20 December 2022 (UTC)
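
For readers of this thread, the correction discussed above is usually written as p = (b + 1) / (m + 1), where m is the number of randomly sampled permutations and b is the number of them whose statistic is at least as extreme as the observed one. The sketch below compares it with the plain proportion b / m. The data, the function name and the choice of m = 100 are assumptions made for illustration, and the sketch covers only this simple form of the correction, not the exact calculation also described by Phipson and Smyth.

```python
import random


def mc_permutation_pvalues(x, y, n_perm=100, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means.

    Returns (b, naive, corrected), where b counts sampled permutations at
    least as extreme as the observed data, naive = b / n_perm and
    corrected = (b + 1) / (n_perm + 1).
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x, n_y = len(x), len(y)
    total = sum(pooled)

    def abs_diff(sum_x):
        return abs(sum_x / n_x - (total - sum_x) / n_y)

    observed = abs_diff(sum(x))
    b = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # one random permutation of the group labels
        if abs_diff(sum(pooled[:n_x])) >= observed - 1e-12:
            b += 1
    return b, b / n_perm, (b + 1) / (n_perm + 1)


# Hypothetical, completely separated groups: very few permutations are as extreme.
b, naive, corrected = mc_permutation_pvalues(
    [12, 15, 14, 16, 13], [8, 9, 7, 6, 10], n_perm=100
)
print(b, naive, corrected)
```

The plain proportion can be exactly zero when no sampled permutation happens to be as extreme as the data, which would claim more evidence than m permutations can support. The corrected value can never fall below 1 / (m + 1).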

Randomized test

[Randomized_test] gives the confusing result that one shouldn't confuse things that redirect to one another:

   Permutation test
      (Redirected from Randomized test)
    Not to be confused with Randomized test.

Not sure how to exit that loop. Josce (talk) 18:01, 13 February 2022 (UTC)

Thanks for the note! I changed the redirect. Biggerj1 (talk) 22:26, 5 March 2022 (UTC)

  Biggerj1 (talk) 23:16, 7 March 2022 (UTC)

Merge with surrogate data testing

Several books seem to imply that surrogate data are the resamples generated under H0 in a permutation test, e.g. "permutation tests adjust the test statistic based on ... surrogate data" https://books.google.es/books?id=ixZHDgAAQBAJ&lpg=PA97&dq=surrogate%20data%20permutation%20test&hl=de&pg=PA97#v=onepage&q=surrogate%20data%20permutation%20test&f=false

I admit that the article on surrogate data testing is *currently* focusing on time series data, but is it maybe only describing some special permutation test for time series?

I think it would be good if the differences and the similarities between the methods were clearly outlined in the articles, or if the articles were merged.

Biggerj1 (talk) 22:02, 5 March 2022 (UTC)

The idea of statistical contradiction is present in both, but the surrogate data generation process for time series is more elaborate than simple permutations so that correlations are preserved. Biggerj1 (talk) 22:54, 5 March 2022 (UTC)

Conclusion: do not merge. Biggerj1 (talk) 23:04, 5 March 2022 (UTC)

  Biggerj1 (talk) 23:16, 7 March 2022 (UTC)

issues with JMASM links

I believe that in the "Invited papers" link the year should be 2002 (not 2011). Here's a link to that issue (the current one is to the Wayback machine). I'm not sure why there is a three hundred page range (202-522).

https://jmasm.com/index.php/jmasm/issue/view/2 Bikestats (talk) 00:09, 17 January 2024 (UTC)