Talk:Fisher's exact test

Please note that the minimum expected value for a chi-squared test to be appropriate is 10 not 5, in the particular case where there is only one degree of freedom (see any responsible stats cookbook). This was correct in earlier versions of the page and I have put it back now. seglea 21:30, 11 May 2006 (UTC)

In Bob Moore (2004), On Log-Likelihood-Ratios and the Significance of Rare Events, in Proc. of the ACL 2004, Moore shows that Fisher's exact test is not really prohibitively more expensive to compute than chi-square. In light of this, the introductory paragraph suggesting that its computational complexity is a major consideration may deserve some qualification. —Preceding unsigned comment added by 70.108.245.148 (talk) 21:16, 19 February 2008 (UTC)

I checked the link to http://mathworld.wolfram.com/FishersExactTest.html and found there was no content for general $m\times n$ cases. So I deleted it. Lixiaoxu (talk) 13:33, 9 November 2008 (UTC)

Yes you're quite right. I've also removed the link at the bottom. RupertMillard (Talk) 10:53, 24 March 2009 (UTC)

I don't understand this. I quote from the beginning of the mathworld article, "Let there exist two such variables X and Y, with m and n observed states, respectively...."; and there then follows a paragraph giving the formula and some description of procedures for the m x n case. In what sense can this be described as having no content for the m x n case? I have therefore restored the link and reference. seglea (talk) 23:34, 24 March 2009 (UTC)

Oh yes - you're right. I think I'm going mad! Thank you for putting the link back in. I don't think the article's brilliantly clear, but it's a start - very vague about the other measures of association that are required for the $m\times n$ case. RupertMillard (Talk) 07:10, 25 March 2009 (UTC)

Agreed. I only put it in because at least it states unambiguously that the $m\times n$ case is possible, and so many students (and not a few lecturers) believe that only 2 x 2 can be done. There might be a reference to a better source in some SPSS manual, since SPSS will calculate the $m\times n$ case, but I don't have one to hand. seglea (talk) 21:43, 25 March 2009 (UTC)

In the example the notation switches from girls and boys to men and women. Perhaps it would be less confusing to maintain one label. Australisergosum (talk) 01:41, 16 December 2008 (UTC)

I wonder if the example gender x dieting is well-chosen... It is a requirement of the standard Fisher exact test that both marginals are fixed; it can easily be assumed that a researcher could choose to include an equal number of men and women in his/her sample, but how about dieters versus non-dieters? These particular marginal counts seem to be random to me. —Preceding unsigned comment added by 201.52.149.7 (talk) 23:09, 31 March 2010 (UTC)

The link to http://www.socr.ucla.edu/htmls/ana/FishersExactTest_Analysis.html points to an applet that only calculates P(Cutoff), and not the actual probability of the null hypothesis. http://www.physics.csbsju.edu/stats/exact2.html calculates the interesting probability correctly, and works for NxN matrices. 128.243.21.225 (talk) 21:12, 22 January 2009 (UTC)

I just looked at the link for the Fisher exact test calculator that you gave: Fisher Exact Test Calculators: 2-by-2 and N-by_N, but the HTML was rather mangled, so it is not rendered in Firefox 12 or IE9. Looking at the source, I see that the page has good information. Here are the direct (working) links to the calculators:

- Fisher 2-by-2 Calculator
- Fisher N-by-N Calculator, up to N=6

Everettr2 (talk) 20:37, 8 May 2012 (UTC)

Typo in formula explanation

If the marginal totals (i.e. a+b, a+b, a+c, and b+d) are known

I believe the second a+b should be c+d. 109.65.36.159 (talk) 22:01, 13 January 2019 (UTC)
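
For reference, here is how the four margins enter the usual 2 x 2 formula. This is a sketch of the standard hypergeometric expression, not a quotation from the article; the symbol $n = a+b+c+d$ is notation assumed here. With row totals $a+b$ and $c+d$ and column totals $a+c$ and $b+d$ held fixed, the probability of a given table is

```latex
$$
p \;=\; \frac{\binom{a+b}{a}\binom{c+d}{c}}{\binom{n}{a+c}}
  \;=\; \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!},
\qquad n = a+b+c+d .
$$
```

Each of the four distinct margins appears exactly once in the numerator of the factorial form, which is consistent with reading the second a+b as c+d.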

Reference does not exist - Exact inference in categorical data. Biometrics, 53(1), 112-117.

Mehta, C. R. & Patel, N. R. 1997. Exact inference in categorical data. Biometrics, 53(1), 112-117. definitely does not exist.[1] Is the intention to reference Mehta CR. Exact inference for categorical data. Encyclopedia of Biostatistics 1998; 2:1411–1422 as per [2]? I would probably cite this as Corcoran, Christopher D; Senchaudhuri, Pralay; Mehta, Cyrus R; Patel, Nitin R, Exact Inference for Categorical Data, doi:10.1002/0470011815.b2a10019. Anyway, I have removed the reference for now, as it was superfluous to the 1984 reference. RupertMillard (Talk) 10:47, 24 March 2009 (UTC)

Very odd. That reference was added by an anon in April 2008, presumably relying on a secondary source. seglea (talk) 23:43, 24 March 2009 (UTC)

Question

Can someone spell out how the value from the Fisher exact test is used, please? Is the Fisher exact value the same as the p-value? What is considered to be statistically significant? —Preceding unsigned comment added by Sedoc (talk • contribs) 16:09, 5 June 2009 (UTC)

You should try the mathematics reference desk for a question like that. Baccyak4H (Yak!) 17:41, 5 June 2009 (UTC)

Is there any confirmation on the minimum value of n=5 or 10, or is it still a debated topic? I have seen textbooks (Biostatistics the bare essentials 2nd edition - Geoffrey R Norman/David L Streiner) and statistics professors in the flesh that say otherwise. Any paper/summary would help the layman to understand the debate, if any. Thanks a million. —Preceding unsigned comment added by 155.69.163.224 (talk) 04:51, 30 October 2009 (UTC)

Fisher-Irwin Test

This is the same as the Fisher-Irwin test, correct? If so there should at least be a redirect, and a mention in the article. Esox id^t•contribs 18:13, 12 January 2013 (UTC)

According to Campbell (https://doi.org/10.1002/sim.2832), the answer is yes, they are the same.

"Versions of the Fisher–Irwin test — This test appears in the literature under various names including ‘Fisher’s exact test’. Because the test was developed independently by Fisher [1, 17] and Irwin [18], and because it is controversial whether the P values obtained are ‘exact’ in all 2 x 2 tables, the test will be referred to here as the ‘Fisher–Irwin test’." Cajawe (talk) 11:57, 29 March 2024 (UTC)

Dieting

I changed the example from dieting to studying. Female teenagers are particularly likely to develop eating disorders, and dieting seems to be influenced by societal expectations that it's normal to diet (see eating disorder). Since there's no reason whatsoever that this example must be about dieting, I changed it. "Studiers" is an awkward word, so feel free to change it to "slackers" and "keeners" or whatever you can think of that fits better. But really, it would make most sense to find an example that doesn't involve made up statistics about the habits of people who happen to have penises vs people without penises. For example, it could be two sets of patients taking a new medication.

Example is wrong?

According to online calculators and R, fisher.test(matrix(c(1, 9, 11, 3), 2,2)) results in a p-value of 0.002759 and not 0.0013. — Preceding unsigned comment added by 83.153.126.238 (talk) 19:23, 21 September 2017 (UTC)

Today (2018-03-30) I ran this:

fisher.test(rbind(c(1,9),c(11,3)), alternative="less")$p.value

[1] 0.001379728

and this:

fisher.test(rbind(c(1,9),c(11,3)), alternative="two.sided")$p.value

[1] 0.002759456

with R version 3.2.2 (2015-08-14).

So the example is right: you applied the two-sided test, whereas the example uses the one-sided "less" test. Please read the article again more carefully; after the example it discusses the two-sided case and says that, in the setting of this example, the two-sided p-value is twice the one-sided one. GizTwelve (talk) 15:19, 20 March 2018 (UTC)
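
As an independent cross-check of the two R results quoted above, here is a minimal Python sketch that sums hypergeometric probabilities directly, using only the standard library. The function name `fisher_2x2` and the small relative tolerance used to decide which tables count as "at least as extreme" are choices made for this sketch, not something taken from the article or from R's source.

```python
from math import comb


def fisher_2x2(a, b, c, d):
    """One-sided ("less") and two-sided p-values for the table [[a, b], [c, d]].

    Illustrative helper: the margins are treated as fixed and the top-left
    cell follows a hypergeometric distribution under the null hypothesis.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    lo = max(0, row1 + col1 - total)   # smallest possible top-left cell
    hi = min(row1, col1)               # largest possible top-left cell

    def prob(x):
        return comb(row1, x) * comb(total - row1, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    # "less": tables whose top-left cell is at most the observed one.
    p_less = sum(prob(x) for x in range(lo, a + 1))
    # two-sided: tables no more probable than the observed one
    # (the tolerance guards against floating-point ties).
    p_two = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return p_less, p_two


p_less, p_two = fisher_2x2(1, 9, 11, 3)
print(p_less, p_two)
```

For this table the column totals are equal (12 and 12), so the distribution is symmetric and the two-sided sum is exactly twice the one-sided one, which is why the two figures differ by a factor of two.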

Simplification

The example

	Men	Women	Row total
Studying	1	9	10
Not-studying	11	3	14
Column total	12	12	24

is conveniently analyzed by computing the mean values and variances of the hypergeometric distributions, rather than computing the hypergeometric probabilities themselves.

Based on the sums

	Men	Women	Row total
Studying			10
Not-studying			14
Column total	12	12	24

The mean values are (10)(12)/(24) etc

	Men	Women	Row total
Studying	5	5	10
Not-studying	7	7	14
Column total	12	12	24

All the variances are equal to (10)(14)(12)(12)/[(24)(24)(24-1)] = 35/23.

The squares of the deviations from the mean values are all equal to 16.

As 16/(35/23) ≈ 10.5 the difference of proportion is indeed significant.

This calculation is simpler than that in the text.

Bo Jacoby (talk) 05:28, 10 November 2018 (UTC)
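
The arithmetic above can be reproduced exactly with rational numbers. This is only a sketch of the calculation as described in this comment; the variable names are made up for illustration, and the final comparison with 3.84 (the usual 5% critical value of chi-squared with one degree of freedom) is an assumption about what "significant" refers to here, not something stated in the comment.

```python
from fractions import Fraction

# Margins and observed counts from the example table.
row_totals = (10, 14)      # studying, not studying
col_totals = (12, 12)      # men, women
n = 24
observed = ((1, 9), (11, 3))

# Hypergeometric mean of each cell: (row total)(column total)/n.
mean = [[Fraction(r * c, n) for c in col_totals] for r in row_totals]

# Common hypergeometric variance of every cell in a 2 x 2 table.
variance = Fraction(row_totals[0] * row_totals[1] * col_totals[0] * col_totals[1],
                    n * n * (n - 1))

# Squared deviation of the top-left cell divided by its variance.
statistic = (observed[0][0] - mean[0][0]) ** 2 / variance

print(mean[0][0], mean[1][0], variance, statistic, float(statistic))
# Assumed reference point: 3.84 is the 5% critical value of chi-squared, 1 df.
print(float(statistic) > 3.84)
```

Note that this is a large-sample approximation, so it answers a slightly different question from the exact hypergeometric sum in the article's example.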

On the requirement that the margin of the tables be fixed

In Fisher's original example quoted in the article, the number of cups of tea Bristol judged to have had milk put in first is not fixed a priori. It is of course fixed once the experiment has been carried out, but if we take this as the meaning of 'fixed' then every experiment seems to me to have fixed margins, rendering such a stipulation redundant. Therefore, I am assuming the article currently means to say that both margins must be a priori fixed in order for the test to apply. It appears to me from Fisher's motivating example that this is in fact untrue. I was wondering if there was disagreement on this point. If not, I would propose removing the stipulation that marginal totals be fixed. Marko1973 (talk) 16:24, 5 March 2020 (UTC)

Marko1973 makes a perceptive observation. I would also add that it is not clear whether Bristol was constrained to select exactly 4 and 4. Presumably she was offered the cups one by one and had to make an irrevocable judgement. What if she was up to 4-3 (that is, she has said 4 with milk first and 3 with milk second) and, tasting cup number 8 was sure that this had milk first? Could she say that the final cup was milk first, giving a 5-3 split of her guesses? I've never been sure about this. Best wishes, Robinh (talk) 20:58, 7 March 2020 (UTC)
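
To make the two readings concrete, here is a small enumeration sketch. It takes no position on which design Fisher intended. It assumes 8 cups with 4 truly milk-first, and it assumes that under the null hypothesis every permitted pattern of guesses is equally likely; the cup order in `truth` and all variable names are hypothetical.

```python
from itertools import product

# Hypothetical true preparation: 1 = milk first, 0 = tea first.
truth = (1, 1, 1, 1, 0, 0, 0, 0)

# Every way of labelling 8 cups as milk-first or tea-first.
all_patterns = list(product((0, 1), repeat=8))

# Reading 1: the taster must name exactly four cups as milk-first,
# so both margins of the 2 x 2 table are fixed in advance.
constrained = [g for g in all_patterns if sum(g) == 4]

# Reading 2: the taster may name any number of cups as milk-first
# (a 5-3 split is allowed), so only the true margin is fixed.
unconstrained = all_patterns

# Chance of a perfect score by pure guessing under each reading,
# if all permitted patterns are equally likely.
p_constrained = 1 / len(constrained)
p_unconstrained = 1 / len(unconstrained)

print(len(constrained), len(unconstrained))
print(p_constrained, p_unconstrained)
```

The reference set of possible outcomes changes with the design, which is one way of stating what the fixed-margins stipulation is about.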

Politically incorrect example

I am unhappy with the example given to illustrate Fisher's statistical significance idea. Please change it back to what Fisher originally applied the idea to, i.e., tea with milk, or any other example from biology, of which there are plenty from Fisher's own papers.
