Talk:Student's t-test/Archive 1


Serious problems with page

It must be that the t-test is so popular and widely available that serious problems are neglected on this page. Take for example the S_x1x2, meaning grand standard deviation in one section and then grand variance in another. Not to mention that Sx1 and Sx2 are not defined. Please use your textbook or another website if you really want to learn about the t-test. —Preceding unsigned comment added by 128.32.52.185 (talk) 20:39, 17 July 2008 (UTC)

Corrected the math for S_{X_1X_2}. The notations need to be improved. Haruhiko Okumura (talk) 01:53, 13 August 2008 (UTC)

Is there a reason why grand standard deviation uses capital 'S' instead of lowercase 's', while the standard deviation page uses sigma? 24.211.189.56 (talk) 13:46, 17 April 2012 (UTC)

Calculations

I don't suppose anyone wants to add HOW TO DO a t-test??

That seems to be a deficiency of a fairly large number of statistics pages. The trouble seems to be that they're getting written by people who've gotten good grades in statistics courses in which the topics are covered, but whose ability does not exceed what that would imply. Maybe I'll be back.... Michael Hardy 22:04, 7 June 2006 (UTC)
If I have time to learn TeX, maybe I'll do it. I know the calculations, it's just a matter of getting Wikipedia to display it properly. Chris53516 16:17, 19 September 2006 (UTC)
Those who don't know TeX can present useful changes here on the talk page in ASCII (plain text), and others can translate them into TeX. I can do basic TeX; you can contact me on my talk page to ask for help. (i.e. I can generally translate equations into TeX; I may not be able to help with more advanced TeX questions.) --Coppertwig 11:57, 8 February 2007 (UTC)
I uploaded some crappy images of the calculations. I don't have time to mess with TeX, so someone that's a little more TeX-savvy (*snicker*) can do it. Chris53516 16:42, 19 September 2006 (UTC)
User:Michael Hardy converted two of my crappy graphics to TeX, and I used his conversion to do the last. So there you have it, calculations for the t-test. Chris53516 18:21, 19 September 2006 (UTC)
Great. Now, could someone explicit the formulla? I assume that N is the sample size, s the standard deviation, but what is the df1/dft? ... Ok I found the meaning of df. I find the notation a bit confusing. It looks a lot like the derivative of a function... is dft the degrees of freedom of the global population?
What do you mean: "could someone explicit the formulla (sic)" (emphasis added)? N is the sample size of group 1 or group 2, depending on which number is there; s is the standard deviation; and df is degrees of freedom. There is a degree of freedom for each group and the total. The degrees of freedom for each group is calculated by taking the sample size and subtracting one. The total degrees of freedom is calculated by adding the two groups' degrees of freedom, or by subtracting 2 from the total sample size. I will change the formula to reflect this and remove the degrees of freedom. Chris53516 13:56, 11 October 2006 (UTC)
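The two routes to the total degrees of freedom described above can be checked with a few lines of code (the group sizes below are made up for illustration):

```python
# Degrees of freedom for a two-sample t-test:
# each group contributes (sample size - 1), and the total is their sum,
# which equals the combined sample size minus 2.
n1, n2 = 12, 9  # hypothetical group sizes

df1 = n1 - 1          # degrees of freedom for group 1
df2 = n2 - 1          # degrees of freedom for group 2
df_total = df1 + df2  # sum of the per-group degrees of freedom

# Equivalent formulation: total sample size minus 2
assert df_total == (n1 + n2) - 2
print(df1, df2, df_total)  # -> 11 8 19
```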
Thanks for the help with doing the calculation, I'm feeling comfortable finding a confidence bound on the Mean - but is there any way to also find a confidence bound on the variation? My real goal is to make a confidence statement like "using a student t-test, these measurements offer a 90% confidence that 99% of the POPULATION would be measured below 5000". —Preceding unsigned comment added by 64.122.234.42 (talk) 14:03, 23 October 2007 (UTC)

Welch (or Satterthwaite) approximation?

"As the variance of each group is different, the Welch (or Satterthwaite) approximation to the degrees of freedom is used in the test"...

Huh?

--Dan|(talk) 15:00, 19 September 2006 (UTC)

Table?

This article doesn't mention the t-table which appears to be necessary to make sense of the t value. Also, what's the formula used to compute such tables? —Ben FrantzDale 15:07, 12 October 2006 (UTC)

I'm not sure which table you are referring to or what you mean by "make sense of the t value". Perhaps you mean the table for determining whether t is statistically significant or not. That would be a statistical significance matter, not a matter of just the t-test. Besides, that table is pretty big, and for the basic meaning and calculation of t, it isn't necessary. Chris53516 15:24, 12 October 2006 (UTC)
I forgot. The calculation for such equations is calculus, and would be rather cumbersome here. It would belong at the statistical significance article, anyway. That, and I don't know the calculus behind p. Chris53516 15:26, 12 October 2006 (UTC)
Duah, Student's t-distribution has the answer to my question. —Ben FrantzDale 14:55, 13 October 2006 (UTC)
Glad to be of not-so-much help. :) Chris53516 15:11, 13 October 2006 (UTC)

Are the calculations right?

The article says:

s_{X1X2} = sqrt( ((N1 - 1)·s_{X1}^2 + (N2 - 1)·s_{X2}^2) / (N1 + N2 - 2) )

But if you ignore the -1 and -2, say for the biased estimator or if there are lots of samples, then s simplifies to

s = sqrt( (N1·s_{X1}^2 + N2·s_{X2}^2) / (N1 + N2) )


This seems backwards. The external links all divide the standard deviation by its corresponding sample size, which is what I was expecting. So I'd guess there's a typo and the article should have:

s_{X1X2} = sqrt( ((N1 - 1)·s_{X2}^2 + (N2 - 1)·s_{X1}^2) / (N1 + N2 - 2) )

Can anyone confirm this?

Bleachpuppy 22:14, 17 November 2006 (UTC)

I think it's right as it stands, but I don't have time to check very carefully. When you multiply s_{X1}^2 by N1 − 1, you just get the sum of squares of deviations from the sample mean in the first sample. Similarly with "2" instead of "1". So the sum in the numerator is the sum of squares due to error for the two samples combined. Then you divide that sum of squares by its number of degrees of freedom, which is N1 + N2 − 2. All pretty standard stuff. Michael Hardy 23:23, 17 November 2006 (UTC)
... and I think that just about does it; i.e. I've checked carefully. Michael Hardy 23:29, 17 November 2006 (UTC)
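Michael Hardy's derivation can be verified numerically. This sketch (with made-up samples) computes the pooled variance both ways (weighting each unbiased sample variance by N - 1, and dividing the combined sum of squared deviations by N1 + N2 - 2) and confirms they agree:

```python
import math
import statistics

# Two hypothetical samples
x1 = [2.1, 2.5, 2.8, 3.0, 2.2]
x2 = [1.9, 2.0, 2.4, 2.6, 2.3, 2.1]
n1, n2 = len(x1), len(x2)
m1, m2 = statistics.mean(x1), statistics.mean(x2)

# Route 1: weight the unbiased sample variances by (N - 1)
s2_pooled_a = ((n1 - 1) * statistics.variance(x1) +
               (n2 - 1) * statistics.variance(x2)) / (n1 + n2 - 2)

# Route 2: sum of squared deviations about each sample mean,
# divided by the combined degrees of freedom
ss1 = sum((x - m1) ** 2 for x in x1)
ss2 = sum((x - m2) ** 2 for x in x2)
s2_pooled_b = (ss1 + ss2) / (n1 + n2 - 2)

assert math.isclose(s2_pooled_a, s2_pooled_b)

# The pooled t statistic then scales by sqrt(1/n1 + 1/n2)
t = (m1 - m2) / (math.sqrt(s2_pooled_b) * math.sqrt(1 / n1 + 1 / n2))
print(t)
```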
Please provide a citation or derivation. I think Bleachpuppy is right that the subscripts have been switched. Suppose   and  , a very large number, and   and   are of moderate and comparable size (i.e.   is a very large number in comparison to any of the other numbers involved). In this case, in effect   is known almost perfectly, so the formula should reduce to a close approximation of the t-distribution for the case where the sample 1 is being compared to a fixed null-hypothesis mean   which in this case is closely estimated by  . In other words, it should be approximately equal to:
 
But apparently the formula as written does not reduce to this; instead it reduces to approximately:
 
This is claiming that this statistical test depends critically on  . But since   is a very large number in this example,   should be pretty much irrelevant; we know   with great precision regardless of the value of  , as long as   is not also a very large number. And the test should depend on the value of   but does not. --Coppertwig 12:45, 19 November 2006 (UTC)
All I have with me right now is an intro to stat textbook: Jaccard & Becker, 1997. Statistics for the behavioral sciences. On page 265, it verifies the original formula. I have many more advanced books in my office, but I won't be there until tomorrow. -Nicktalk 21:02, 19 November 2006 (UTC)
P.S. none of the external links really have any useful information on them (they especially lack formulas). Everything that I've come across on the web uses the formula as currently listed in the article. -Nicktalk 21:29, 19 November 2006 (UTC)
The original formula is also confirmed by Hays (1994) Statistics p. 326. -Nicktalk 19:36, 20 November 2006 (UTC)
OK! I see what's wrong!! The formula is a correct formula. However, the article does not state to what problem that formula is a solution! I assumed that the variances of the two populations could differ from each other. Apparently that formula is correct if you're looking at a problem where you know the variance of the two distributions is the same, even though you don't know what the value of the variance is. I'll put that into the article. --Coppertwig 03:33, 21 November 2006 (UTC)

I know these calculations are correct; I simply didn't have my textbook with me for a citation. Keep in mind that much of the time we strive to have an equal sample size between the groups, which makes the calculation of t much easier. I will clarify this in the text. – Chris53516 (Talk) 14:28, 21 November 2006 (UTC)

I'm not certain, but it looks like the calculations don't match the graphic formula; n=6 in the problem, but n=8 in the graphic formula. 24.82.209.151 07:54, 23 January 2007 (UTC)


These are wrong; they do not match each other. In the first you need to divide by 2, and in the second you need to drop the multiplication by (1/n1 + 1/n2). That makes them match. -DC

Extra 2?

Where the text reads, "Where s2 is the grand standard deviation..." I can't tell what that two is referring to. It doesn't appear in the formula above or as a reference. 198.60.114.249 23:29, 14 December 2006 (UTC)

The equation you're looking for can be found at standard deviation. It was not included in this page because it would be redundant. However, I will add a link to it in the text you read. — Chris53516 (Talk) 02:38, 15 December 2006 (UTC)
Thanks Chris! 198.60.114.249 07:23, 15 December 2006 (UTC)

I wanna buy a vowel ...

I may be off my medication or something, but does this make sense to anyone? :

  "In fact, Gosset's identity was unknown not only to fellow statisticians but
   to his employer—the company insisted on the pseudonym so that it could turn
   a blind eye to the breach of its rules."

So Gosset works for Guinness. Gosset uses a pen-name cuz Guinness told him to. But, um ... Guinness doesn't know who he is and doesn't want to know. So they can turn a blind eye.

So they told this person - they know not whom - to use the pen-name.

I know this was a beer factory and all but ... somebody help me out here.

CeilingCrash 05:28, 24 January 2007 (UTC)

I don't know the history, but maybe they promulgated a general regulation: If you publish anything on your research, use a pseudonym and don't tell us about it. Michael Hardy 20:13, 3 May 2007 (UTC)
Maybe it should read "a pseudonym" instead of "the pseudonym". I'm not so sure management did not know his identity, however. My recollection of the history is that management gave him permission to publish this important paper, but only under a pseudonym. Guinness did not allow publications for reasons of secrecy. Can someone research this and clear it up?--141.149.181.4 14:45, 5 May 2007 (UTC)

Unfortunately I have no sources at hand, but the story as I heard it is that Guinness had (/has?) regulations about confidentiality covering all processes used in the factory. Since Gosset used his formulas for grain selection, they fell under the regulations, so he couldn't publish. He then published under the pseudonym, probably with the non-official knowledge and consent of the company, which officially couldn't recognize the work as his, due to the regulations.

Can we just delete that last sentence and keep only that he wrote under a pen name because it was against company rules to publish a paper? —Preceding unsigned comment added by 65.10.25.21 (talk) 12:02, 4 December 2007 (UTC)

a medical editor's clarification

The correct way of expressing this test is "Student t'Italic text test". The word "Student" is not possessive; there is no "apostrophe s" on it. The lowercase "t" is always italicized. And there is no hyphen between the "t" and "test". It's simply "Student t'Italic text' test"

I'm a medical editor, and this is according to the American Medical Association Manual of Style,'Italic text 9th edition. Sorry I don't really know how to change it - I'm more a word person than a technology person. But I just wanted to correct this. Thank you! -- Carlct1 16:40, 7 February 2007 (UTC)

You need to close those comma edits. When you want bold text, close it off like this:'''bold''', and it will appear like this: bold. Please edit your comment above so it makes more sense using this information. — Chris53516 (Talk) 17:00, 7 February 2007 (UTC)
I'm not sure you are correct about the possessive use. As the article notes, "Student" was Gosset's pen name, which would require a possessive s after the name; otherwise, what does the s mean? The italic on t is left off of the article name because it can't be used in the heading. There are other problems like this all over Wikipedia, and it's a technical limitation. By the way, I see both use of "t-test" and "t test" on the web, and I'm not sure that either is correct. — Chris53516 (Talk) 17:05, 7 February 2007 (UTC)
I have no solid source on this, but I have definitely seen it both ways. It is Student's test, in that Student invented it. However with age, it has also been referred to as the Student t-test. And I, too, have seen both "t-test" and "t test" alas. 128.200.46.67 (talk) 19:43, 18 April 2008 (UTC)
I'm looking at my Stats text (Statistics in Criminology and Criminal Justice, by Jeffery T. Walker & Sean Maddan, ISBN 0-7637-3071-8), and on page 369, it says
Gosset wrote under the name "Student," so the t-test is often referred to as Student's t..."
So my text disagrees with Carlct1 on both the hyphen and the apostrophe. Now what?
*Septegram*Talk*Contributions* 06:25, 10 December 2009 (UTC)

Recent edit causing page not to display properly -- needs to be fixed

Re this edit: 10:47, 8 February 2007 by 58.69.201.190 I see useful changes here; but it's not displaying properly, and also I suggest continuing to provide the equation for the unbiased estimate in addition to the link to the definition of it. I.e. I suggest combining parts of the previous version with this edit. I don't have time to fix it at the moment. --Coppertwig 11:53, 8 February 2007 (UTC)

Looking at it again, I'm not sure any useful material was added by that edit, (I had been confused looking at the diff display), so I've simply reverted it. --Coppertwig 13:02, 8 February 2007 (UTC)

Equal sample sizes misses a factor sqrt(n)

The formula with equal sample size should be a special case of the formula with unequal sample size. However, looking at the formula for the t-test with unequal sample size:

t = (X̄_1 - X̄_2) / ( s_{X1X2} · sqrt(1/N_1 + 1/N_2) ),  where  s_{X1X2} = sqrt( ((N_1 - 1)s_{X1}^2 + (N_2 - 1)s_{X2}^2) / (N_1 + N_2 - 2) )

and setting n=N_1=N_2 yields

t = (X̄_1 - X̄_2) / ( s_{X1X2} · sqrt(2/n) ),  where  s_{X1X2} = sqrt( (s_{X1}^2 + s_{X2}^2) / 2 ).

The factor of sqrt(n) should be correct in the limit of large n. However, there might be a problem since one sets N_1 = N_2, which reduces the degrees of freedom by one. Does anyone know the correct answer?

Oliver.duerr 09:13, 20 February 2007 (UTC)

I don't have the answer, but I agree that the two formulas don't match 128.186.38.50 15:37, 10 May 2007 (UTC)
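For what it's worth, the sqrt(n) worry can be settled numerically: with N_1 = N_2 = n the pooled standard deviation reduces to sqrt((s_1^2 + s_2^2)/2) and sqrt(1/n + 1/n) = sqrt(2/n), so the general formula collapses to the equal-size special case with no extra factor. A sketch with made-up summary statistics:

```python
import math

# Hypothetical equal-size summary statistics
n = 10
mean1, mean2 = 5.3, 4.8
s1, s2 = 1.2, 1.5  # sample standard deviations

# General (unequal-size) formula with N1 = N2 = n
sp = math.sqrt(((n - 1) * s1**2 + (n - 1) * s2**2) / (n + n - 2))
t_general = (mean1 - mean2) / (sp * math.sqrt(1 / n + 1 / n))

# Equal-size special case: s pooled = sqrt((s1^2 + s2^2)/2), scale sqrt(2/n)
sp_equal = math.sqrt((s1**2 + s2**2) / 2)
t_equal = (mean1 - mean2) / (sp_equal * math.sqrt(2 / n))

assert math.isclose(t_general, t_equal)  # the two formulas agree
print(t_general)
```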

Explaining a revert

I just reverted "t tests" from singular back to plural. The reason is that it's introducing a list of different tests, so there is more than one kind of t test. I mean to put this in the edit summary but accidentally clicked the wrong button. --Coppertwig 19:55, 25 February 2007 (UTC)

Copied from another source?

Why is this line in the text? [The author erroneously calculates the sample standard deviation by dividing N. Instead, we should divide n-1, so the correct value is 0.0497].

To me, this suggests that portions of the article were copied from another, uncited source. If true, this is copyright infringement and needs to be fixed right away.

I can't find any internet-based source for the text in that part of the article. I think the line might be directed toward the author of the Wikipedia article, as it seems to point out an error. I removed it, and will look into the error. -Nicktalk 00:38, 19 March 2007 (UTC)

By the way, the line should be read carefully. It is correct. As this is an estimate based on an estimate, it should have been divided by n − 1, so the correct value is 0.0497. Can someone please change this?
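The N versus n − 1 point is easy to demonstrate with Python's statistics module (toy data below; the quoted 0.0497 cannot be re-checked here without the example's original numbers):

```python
import math
import statistics

data = [19.8, 20.1, 20.4, 19.9, 20.3, 20.0]  # hypothetical measurements

# Divide by N: the (biased) population standard deviation
sd_pop = statistics.pstdev(data)

# Divide by n - 1: the usual sample standard deviation, appropriate
# when the mean itself was estimated from the same data
sd_sample = statistics.stdev(data)

# Dividing by n - 1 always gives the larger value
assert sd_sample > sd_pop
assert math.isclose(sd_sample, sd_pop * math.sqrt(len(data) / (len(data) - 1)))
print(sd_pop, sd_sample)
```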

Testing Normality

The call: I think it would be appropriate to change the wording "normal distribution of data, tested by ..." as these tests for normality are only good for establishing that the data is not drawn from a normal distribution.

The background: Tests for normality (e.g. the Shapiro-Wilk test) test the null hypothesis that the data is normally distributed against the alternative hypothesis that it is not normally distributed. Normality is not confirmed merely because the null is not rejected, a statement that can only be statistically evaluated by looking at the power of the test.

The evidence: Spiegelhalter (Biometrika, 1980, Table 2) shows that the power of the Shapiro-Wilk test can be very low. There are non-normal distributions such that, with 50 observations, this test correctly rejects the null hypothesis of normality only 8% (!) of the time.

At least two possible solutions: (1) Drop the statement that the assumption of normality can be tested. (2) Indicate that one can test if the data is not normally distributed, pointing out that no rejection of normality does not mean that the data is normally distributed due to the low power of these tests.

Schlag 11:55, 27 June 2007 (UTC)

If you perform any of these tests before doing a t-test, the p-value under the null hypothesis will no longer be uniformly distributed. This entire section is bad statistical advice (although commonly done in practice). Hadleywickham 07:27, 9 July 2007 (UTC)

Bad Reference: The Zimmerman reference cited in the quote below has nothing to do with tests for normality. Zimmerman is cautioning against tests for equal variance. Suggest deleting the reference and the text..."However, testing normality before deciding whether to use a test statistic is not now generally recommended."

"Each of the two populations being compared should follow a normal distribution. This can be tested using a normality test, such as the Shapiro-Wilk or Kolmogorov–Smirnov test, or it can be assessed graphically using a normal quantile plot. However, testing normality before deciding whether to use a test statistic is not now generally recommended.[7]" — Preceding unsigned comment added by 128.218.19.131 (talk) 00:11, 15 September 2011 (UTC)

Dependent t-test

The Dependent t-test section makes very little sense. Where, for example, are the pairs in the first table - or did someone maliciously truncate the table and rename Jon to Jimmy and Jane to Jesse? Why not walk the reader through a paired t-test using the data in the second table? Also, the example in the following section is not very helpful, since a 95% confidence interval is never even calculated. The example isn't related to any of the "uses of t-tests" previously - in part because the construction of a confidence interval isn't really a "test" sensu stricto. Also, the claim that: "With the mean and the first five weights it is possible to calculate the sixth weight. Consequently there are five degrees of freedom," is a classic example of the opaque fog statisticians lead their flailing students into when trying to explain the degrees of freedom concept. Surely somebody in the community can clarify this page, as it is certainly a widely visited one. - Eliezg (talk) 21:41, 20 November 2007 (UTC)
Yes, this is not the place to learn about a t-test and how to use it. I get more confused reading this page. 174.109.203.130 (talk) 03:58, 31 March 2011 (UTC)

Example

I don't get the example at all. It doesn't mention any of the formulas from the article, and computes a quantity (confidence interval of a mean) that's not discussed anywhere else in the article, with no explanation as to how that quantity was derived. Can someone add a real example, and explain it in terms of the rest of the article? (null hypothesis, etc.) --Doradus 17:16, 4 December 2007 (UTC)

Without what?

I commented the following sentence out:

"Modern statistical packages make the test equally easy to do with or without it [to what does "it" refer here?]."

as I also don't know what "it" is. --Slashme (talk) 08:39, 21 February 2008 (UTC)


Additionally the example contains a problem: it says it's calculating the 95th percentile but is really using the 97.5th percentile. From the t-table: 95% with v=5: 2.015; 97.5% with v=5: 2.571.

Not sure if I should fix it or someone wants to redo the example in order for it to actually be useful Efseykho (talk) 17:18, 18 October 2008 (UTC)
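Efseykho's observation reflects the usual two-sided convention: a 95% two-sided interval puts 0.025 in each tail, so it uses the 97.5th percentile. A sketch of the arithmetic (the critical values 2.015 and 2.571 are quoted from the comment above, not recomputed):

```python
# For a two-sided interval at confidence level 1 - alpha, each tail gets
# alpha/2, so the critical value is the (1 - alpha/2) quantile.
alpha = 0.05
upper_tail_prob = 1 - alpha / 2
assert abs(upper_tail_prob - 0.975) < 1e-12  # the 97.5th percentile, not the 95th

# Critical values for 5 degrees of freedom, as quoted in the comment above
# (taken from a t-table, not recomputed here):
t_one_sided = 2.015   # 95th percentile, one-sided critical value
t_two_sided = 2.571   # 97.5th percentile, two-sided critical value
assert t_two_sided > t_one_sided  # the two-sided interval is wider
print(upper_tail_prob)
```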

Null hypothesis with unequal variances

Since all calculations are done subject to the null hypothesis, it may be very difficult to come up with a reasonable null hypothesis that accounts for equal means in the presence of unequal variances...

Or it may be very easy. Consider an example where we're comparing two measurement techniques; we know they have different levels of random error (normally-distributed for each) but we want to determine whether they have different systematic errors. If we use the two techniques to take two sets of measurements, and the systematic errors are equal, we will have equal 'population' means but non-equal variances. --144.53.251.2 (talk) 23:24, 19 June 2008 (UTC)

There seems to be an acceptance of a certain fact about gender differences ... that if you take things that can be objectively measured and which are not related to "strength" (such as eyesight, some aspects of blood chemistry), then the genders differ only in variability between individuals, with the gender-separated distributions having the same central location. It is "known" that men are more variable between individuals than women. Thus this would be another situation where this hypothesis is relevant. Melcombe (talk) 08:51, 18 July 2008 (UTC)

Incorrect formula for T-Test?

I'm a relative novice to stats and editing Wikipedia, so I'll just post this here rather than on the main page. Isn't the formula for Independent one-sample t-test wrong?

It's shown as:

t = (X̄ - μ_0) / ( s / sqrt(n) )

but according to Sokal and Rohlf's [http://www.amazon.com/Biometry-Principles-Practices-Statistics-Biological/dp/0716724111 Biometry] it should be:

 

or at least:

 

Where there is a multiplication in the denominator not a division. —Preceding unsigned comment added by JimmyDingo (talkcontribs)

Sure, as the number of observations increases, the results should be more significant and the "t" lower. But in this formula it's the reverse. I learned: (mean - value)/sd, where this is distributed t (with degrees of freedom that depend on n) —Preceding unsigned comment added by 128.139.226.34 (talk) 08:28, 17 June 2009 (UTC)

The formula is correct as given. As the sample size grows the value of t should get larger, and the p-value smaller. Skbkekas (talk) 16:58, 17 June 2009 (UTC)
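Skbkekas's point can be illustrated directly: holding the sample mean and standard deviation fixed, the one-sample statistic t = (x̄ - μ0)/(s/sqrt(n)) grows like sqrt(n). A sketch with made-up summary numbers:

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Same observed mean and spread, increasing sample size (hypothetical values)
xbar, mu0, s = 10.4, 10.0, 1.2
t_small = one_sample_t(xbar, mu0, s, n=10)
t_large = one_sample_t(xbar, mu0, s, n=40)

# Quadrupling n doubles t, so more observations => larger t, smaller p
assert math.isclose(t_large, 2 * t_small)
print(t_small, t_large)
```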

Terrible edit

This was a really terrible edit. Two say that the t-test is use ONLY when there are two samples is to say that only the tiny bit you learned in a baby-level course exists. To say that the test gives the probability that the null hypothesis is false is one of the standard mistakes that teachers incessantly warn freshman not to make. To little avail, maybe? Michael Hardy (talk) 20:48, 3 September 2008 (UTC)

In that case, can we have some (accurate) baby-level course material in the intro with a disclaimer that there's more to it? 108.208.102.14 (talk) 18:06, 2 April 2012 (UTC)

Independent two sample T test, unequal sample size, equal variance - concerns about accuracy

Thank you, Melcombe, for your edit on the standard deviation estimate not being unbiased. I was concerned about this too and was contemplating a similar edit to follow my correction where the pooled standard deviation had been called pooled variance.

I'm still concerned however, because s_{X1} and s_{X2} are not defined, and unless they are defined correctly the estimate will still be biased. Also the weights would have to be correct for it to be unbiased, and there is more than one commonly used weighting, using either n_1 - 1 and n_2 - 1 or n_1 and n_2. So some questions:

Are you certain what definition leads to an unbiased estimate of the pooled variance? Is it sufficient to use the unbiased estimate of each variance? Unless we are certain about the final estimate of the pooled variance being unbiased we would be better to omit the unreferenced claim to being an unbiased estimator.
A different formula is also often used for the pooled variance (for example by the analysis toolbox in MS Excel):
 
What are the pros and cons of each formula?
Do we necessarily want an unbiased estimate here, or do we want a maximum likelihood estimate?

I would value knowledgeable opinions on this. Thanks.

SciberDoc (talk) 22:37, 1 January 2009 (UTC)


The question is not really whether one wants "an unbiased estimate here, or do we want a maximum likelihood estimate?". If you use any estimate other than the one intended, then the distribution of the test statistic would not be a Student-t distribution. In simple cases, a factor could be introduced to make it a Student-t, but this would be equivalent to a difference in the factor used in forming the estimate for the variance. The particular weighting required between the two sample variances could be said to relate to the necessity that the combined variance should have a distribution proportional to a chi-squared distribution, and this won't happen for any other relative weighting.
I see that the article is presently very non-specific about what the various quantities S actually are, and your quote of a formula used by Excel might actually be using a different definition. (And Excel is notorious for what it does in relation to statistical calculations.) The article might have previously contained proper definitions.
Melcombe (talk) 10:05, 2 January 2009 (UTC)

Hotelling's T-square distribution

Shouldn't Hotelling's T-square distribution, and the references to it in this article, be "Hotelling's t-square distribution"? —DIV (128.250.247.158 (talk) 07:07, 29 July 2009 (UTC))

It is fairly standard notation in published literature to use the upper case T in relation to Hotelling's particular statistics. Melcombe (talk) 08:48, 29 July 2009 (UTC)

Abs

I do not have a stats book on me, but in Excel the Tdist function accepts only positive numbers, and so do modules in Perl. Should the (x - µ)/s be abs(x - µ)/s? --Squidonius (talk) 02:13, 11 August 2010 (UTC)

No. Any decent statistics package or language allows negative arguments for the t-distribution. Qwfp (talk) 12:09, 11 August 2010 (UTC)
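One reason not to take absolute values by default: the sign of t carries the direction of the difference, which one-sided tests need. A sketch (made-up data) showing that swapping the two samples merely flips the sign of the two-sample statistic:

```python
import math
import statistics

def pooled_t(a, b):
    """Two-sample pooled t statistic (equal-variance form)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * statistics.variance(a) +
           (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / (
        math.sqrt(sp2) * math.sqrt(1 / na + 1 / nb))

x = [5.1, 4.9, 5.3, 5.0]
y = [5.6, 5.8, 5.5, 5.9]

t_xy = pooled_t(x, y)
t_yx = pooled_t(y, x)

assert t_xy < 0                   # x has the smaller mean, so t is negative
assert math.isclose(t_xy, -t_yx)  # swapping the groups only flips the sign
print(t_xy)
```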

Assumption of normality

It is frequently mentioned that the assumption of normality can be ignored if the sample size is large enough. Arbitrarily, a sample size of 30 is usually considered large enough.

I think this fact should be reflected in the article; in addition, there should be references citing the first time the cut-point of 30 was proposed, or why this is large enough in most cases to let us ignore the assumption of normality. hujiTALK 00:06, 20 November 2010 (UTC)

Completely non-understandable

This must be the most unclear page in the wiki. I read it and I don't understand anything they say.

I really have a very common and simple question and I'm stumped that reading this page didn't answer it for me: I have two sample sets of different size; what is the chance that the population mean of the first is less than that of the second?

For example: I run a test program 20 times, measuring how long it takes to execute. Then I make a change to the program and run it 10 times. How large is the chance that the program became slower?

Sorry but this question should be answered on the main page, it's just too common imho. Carlo Wood (talk) 15:46, 4 March 2011 (UTC)

You are looking in the wrong place. You are looking for the answer to a question in Bayesian inference, not Statistical hypothesis testing, which is the context of this article. Melcombe (talk) 18:06, 4 March 2011 (UTC)
Let me clarify Melcombe's point. In hypothesis testing you have to (1) choose a statistic, i.e. a formula for turning your observations into a single number; (2) assume that one hypothesis (the null one) is correct; (3) calculate the probability of your statistic being as large/small as it is given that hypothesis. So it only tells you how likely your observations are given the null hypothesis. It does NOT tell you the probability that the hypothesis is correct. For that, you need Bayesian analysis.
But one way to look at this is that you asked the wrong question. Instead of "How large is the chance that the program became slower?" ask "How likely is it that I would get two sets of data this different if the program speed has not changed?"

To answer that, look at "Unequal sample sizes, equal variance". Plug into the formulae the means of your two samples as X̄_1 and X̄_2, and their standard deviations as S_{X1} and S_{X2}. Haruspicator (talk) 02:52, 17 March 2011 (UTC)
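Haruspicator's recipe, applied to the 20-run/10-run question above, might look like the following sketch (the timings are invented; it produces only the t statistic and its degrees of freedom, and turning t into a p-value still requires a t-distribution table or a statistics library):

```python
import math
import statistics

# Hypothetical run times (seconds): 20 runs before the change, 10 after
before = [1.02, 0.98, 1.05, 1.01, 0.99, 1.03, 1.00, 0.97, 1.04, 1.02,
          1.01, 0.99, 1.03, 1.00, 1.02, 0.98, 1.01, 1.04, 0.99, 1.02]
after = [1.06, 1.09, 1.04, 1.08, 1.05, 1.07, 1.10, 1.03, 1.06, 1.08]

n1, n2 = len(before), len(after)

# Pooled variance, "unequal sample sizes, equal variance" case
sp2 = ((n1 - 1) * statistics.variance(before) +
       (n2 - 1) * statistics.variance(after)) / (n1 + n2 - 2)

t = (statistics.mean(before) - statistics.mean(after)) / (
    math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2))
df = n1 + n2 - 2

# Compare |t| against the critical value for df degrees of freedom
print(t, df)
```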

Simplifying the equations in "Slope of a regression line"

Doesn't the formula for t reduce to sqrt( (n - 2)·R^2 / (1 - R^2) ), where R^2 is the square of the sample correlation coefficient? I.e. where R^2 = cov(x,y)^2 / (var(x)·var(y))? Haruspicator (talk) 03:19, 17 March 2011 (UTC)
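That identity does hold: t = b/SE(b) for the fitted slope b equals sqrt((n - 2)·R^2/(1 - R^2)) up to the sign of the correlation. A numerical sketch with made-up points:

```python
import math

# Hypothetical (x, y) data
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 5.8]
n = len(xs)

mx = sum(xs) / n
my = sum(ys) / n
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))

# Slope, residual variance, and the usual t = b / SE(b)
b = sxy / sxx
sse = syy - sxy ** 2 / sxx          # residual sum of squares
se_b = math.sqrt(sse / (n - 2) / sxx)
t_direct = b / se_b

# Equivalent form in terms of the squared correlation R^2
r2 = sxy ** 2 / (sxx * syy)
t_from_r2 = math.sqrt((n - 2) * r2 / (1 - r2))

assert math.isclose(t_direct, t_from_r2)  # both positive here, since b > 0
print(t_direct)
```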

Plugging numbers in the t-score formula at the end of the "Slope of a regression line" section yields different results than plugging those same numbers in the formula at the top of the section. Looks like an error in the formula at the end of the section. Glahaye (talk) 14:08, 19 February 2012 (UTC)

The formulas are mathematically equivalent. Perhaps some confusion of SE and SSE? Melcombe (talk) 20:12, 20 February 2012 (UTC)

Plain Speak?

This article is written to people who already know something about statistics. It would be nice if the intro (at least) explained t-tests to the statistically uninitiated. The current intro says: "A t test is any statistical hypothesis test in which the test statistic has a Student's t distribution if the null hypothesis is true." A lot of people will be thinking, "You what?" Laetoli —Preceding comment was added at 13:44, 3 November 2007 (UTC)

The "history" section is perfectly straightforward, and the "assumptions" and "uses" sections are comprehensible if one reads them closely although they might bear a little expansion, but the "calculations" section should be explained better. What, for example, is meant by a "grand" standard deviation? The explanation "pooled sample standard deviation" might mean something to somebody who remembers what they learned in Statistics, but not all of us remember what we studied in college (:-) I would like to see an article which teaches the average reader:
  1. when to use the tests (the article already explains this, but some textual explanation could supplement some of the wikilinks); and
  2. how to do the tests: although mathematical formulæ are concise, precise and unambiguous, as another Wikipedia article points out, "Not everyone thinks in mathematical symbols," so either a text description or, if a text description would be burdensome, examples would be useful. 69.140.159.215 (talk) 03:57, 10 January 2008 (UTC)
The "assumptions" and "uses" sections are comprehensible to someone who is familiar with the material, but unfortunately that leaves out probably most of the human race. For myself, I'm struggling my way through a class on this stuff, and this article is really no help at all. I'm looking for a basic understanding of what the t-test is and does in, as the original poster of this thread put it, "plain English." No luck so far...
*Septegram*Talk*Contributions* 06:31, 10 December 2009 (UTC) (who is sore tempted to put that "technical" template up on the main page of this article)

How about the explanation I found from Google? "Student's t-test can be used to determine if the averages of two sets of data are significantly different from each other. (reworded slightly)" Is that sufficiently accurate? The introduction seems to be only talking about one set of data although I'm not sure. I'd like to see something on that level of simplicity in the introduction. 108.208.102.14 (talk) 18:02, 2 April 2012 (UTC)

Tried to dumb it down a bit, please review: http://en.wikipedia.org/w/index.php?title=Student%27s_t-test&diff=538218425&oldid=536598429 --Yurik (talk) 12:59, 14 February 2013 (UTC)

What's with Z in "Assumptions" section?

The example in this section makes no sense to me:

Most t-test statistics have the form t = Z/s, where Z and s are functions of the data. Typically, Z is designed to be sensitive to the alternative hypothesis (i.e., its magnitude tends to be larger when the alternative hypothesis is true), whereas s is a scaling parameter that allows the distribution of t to be determined.
As an example, in the one-sample t-test Z = X̄/(σ/√n), where X̄ is the sample mean of the data, n is the sample size, and σ is the population standard deviation of the data; s in the one-sample t-test is σ̂/σ, where σ̂ is the sample standard deviation.

If this is correct, then the statistic for this test is t = Z/s = X̄/(σ̂/√n), or t = √n·X̄/σ̂. That doesn't seem like a t statistic to me. And what is the alternative hypothesis for this "one-sample t-test" (assuming as few typos as possible)? μ ≠ 0? If it is, no one will figure it out from what precedes it here. (A non-zero hypothesized population mean would be a better example.)

Stevekass (talk) 05:30, 1 April 2013 (UTC)
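The Z/s decomposition being questioned above can be checked numerically. A stdlib-only sketch (the data, the hypothesized mean mu0, and the assumed population sd sigma are all hypothetical; the quoted article text effectively takes mu0 = 0):

```python
# Sketch of the t = Z/s decomposition for the one-sample t statistic.
# sigma is the (in practice unknown) population sd -- it cancels in the ratio.
import math
import statistics

data = [2.1, 1.9, 2.4, 2.0, 2.2, 1.8]   # hypothetical sample
n = len(data)
mu0 = 2.0      # hypothesized population mean (hypothetical)
sigma = 0.5    # assumed population sd (cancels below)

xbar = statistics.mean(data)
s_hat = statistics.stdev(data)           # sample standard deviation

Z = (xbar - mu0) / (sigma / math.sqrt(n))
s = s_hat / sigma
t_from_ratio = Z / s
t_direct = (xbar - mu0) / (s_hat / math.sqrt(n))
# The two agree: sigma cancels, so t depends only on the sample sd.
```

Whatever value is chosen for sigma, t_from_ratio equals t_direct, which is the point of writing the statistic as Z/s: Z alone would require knowing σ, but the ratio does not.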

Worked Example: so, do we reject the null hypothesis?

Maybe it would benefit the article if someone clarified this.

"The two-tailed test p-value is approximately 0.091 and the one-tailed p-value is approximately 0.045."

As I understand it (but I'm not confident), if we choose α = 0.05 then we should reject the null hypothesis (that the means of the two sample sets are equal) because 0.045 < 0.05.

04:04, 10 April 2013 (UTC)
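For later readers: the two quoted p-values are consistent with each other, and the rejection decision depends on which tail convention is chosen. A small arithmetic sketch (using the p-values quoted from the worked example; for a symmetric distribution the two-tailed p is twice the one-tailed p, up to rounding):

```python
# One- vs two-tailed decision at alpha = 0.05, using the example's p-values.
p_one = 0.045
p_two = 2 * p_one   # ≈ 0.09, matching the quoted 0.091 up to rounding
alpha = 0.05

reject_one_tailed = p_one < alpha   # reject H0 with a one-tailed test
reject_two_tailed = p_two < alpha   # fail to reject with a two-tailed test
```

So yes: at α = 0.05 the one-tailed test rejects the null hypothesis, while the two-tailed test does not.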

independent samples

Should 'assumptions' include the idea that we assume all samples are independent? This seems like a major omission.

04:04, 10 April 2013 (UTC)

Inconsistency in use of N and n

The section "Unequal sample sizes, unequal variance" seems to use both 'N' and 'n' to mean the sample size. Correct?

04:04, 10 April 2013 (UTC)

Tone?

Is the tone and level of this article appropriate for a general encyclopedia? The information is very good, but it becomes pretty deep pretty quickly. (Speaking as a non-statistician.) — Preceding unsigned comment added by Emergentchaos (talkcontribs) 14:40, 18 October 2013 (UTC)

  • People who understand these things tend not to be very good at explaining them in human terms! (Or more cynically prefer to show off how clever they are.) For instance the basic assumption of the t-test is that "each of the two populations being compared should follow a normal distribution", yet this appears as more of a foot-note to a long-winded mathematical analysis of edge-case variations to the test. Cypherzero0 (talk) 18:06, 12 December 2013 (UTC)
  • Agreed the statistics articles are terrible. I'm teaching statistics and I can't understand all of this! Calculus and other maths articles are much more accessible. — Preceding unsigned comment added by 182.149.92.43 (talk) 07:21, 20 December 2013 (UTC)

history unclear

"but was forced to use a pen name by his employer who regarded the fact that they were using statistics as a trade secret. In fact, Gosset's identity was unknown not only to fellow statisticians but to his employer - the company insisted on the pseudonym so that it could turn a blind eye to the breach of its rules." What breach? Why didn't the company know? If it didn't know, how is it insisting on a pseudonym?

04:04, 10 April 2013 (UTC)

I agree. The history section is a mess! It reads like two (or more) authors are fighting over trying to tell slightly different facts: "Gosset did this..." "Actually!..." — Preceding unsigned comment added by 148.129.71.40 (talk) 14:05, 27 February 2014 (UTC)

Is discussion of "Dependent t-test for paired samples" correct?

Relative stats newbie here so please take this with a grain of salt.

I can see why a paired test is appropriate for the example on the right (pre and post test results) but can't figure out how it would be appropriate for the example on the left (grouped by ages). If there was another variable (eg diseased vs healthy) and the pairing was done on age, I can see how that would work (sort of), but as it stands, there are only two groups (age ~35 and ~22) and there is nothing to link or pair members of the first group to the second.

Can someone with real statistics chops weigh in? — Preceding unsigned comment added by 149.142.243.68 (talk) 16:13, 4 June 2014 (UTC)

-> The names suggest a test that compares men and women; each woman is paired with a man of the same age.

Article Contradicts Itself

When discussing the assumptions of the test it is stated that the population must be normal but when discussing the calculations it is stated that the distribution of the means should be normal. These are very different statements. As far as I know the t-test is dependent on normally distributed populations because of how t is derived but I'm not confident enough to suggest a change. — Preceding unsigned comment added by 12.33.141.36 (talk) 16:35, 16 July 2014 (UTC)

Uses

I was surprised to see no mention of the one thing I have been using Student's t tests for for 40+ years: tossing out "bad" data points. For SMALL sample sizes, Student's t is a better distribution than the normal, and so should be used to calculate whether a particular outlier can reasonably (at a given confidence level) be excluded. This I learned in Analytical Chemistry as an undergraduate back in the early 70's. So, since it is (or was?) in the textbooks, why isn't it mentioned here? It's used extensively, afaik, in quantitative analysis. 216.96.76.198 (talk) 06:01, 8 May 2014 (UTC)

--> The t-test is always more appropriate than the z-test; the z-test is just simpler to compute (and is the limit of the t-test for an infinite sample size). — Preceding unsigned comment added by 12.33.141.36 (talk) 14:58, 29 July 2014 (UTC)
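The "limit of the t-test" point can be illustrated with standard published critical values (these are textbook-table numbers, not computed here): as the degrees of freedom grow, the two-sided 5% critical value of Student's t shrinks toward the normal (z) value 1.960.

```python
# Two-sided 5% critical values of Student's t from standard tables,
# showing convergence to the normal critical value as df increases.
critical_t = {1: 12.706, 5: 2.571, 10: 2.228, 30: 2.042, 100: 1.984}
z_critical = 1.960

for df, t_crit in sorted(critical_t.items()):
    print(f"df={df:>3}: t_crit={t_crit:.3f}, excess over z: {t_crit - z_critical:.3f}")
```

The excess over z is already under 0.025 by df = 100, which is why the z-test was historically used as a computational shortcut for large samples.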

About the attribute of Statistics

Neither Karl Pearson nor William S. Gosset ever called himself a mathematician, and their great works on statistical methodology were rejected by the mathematical journals of their time. This was why Karl Pearson had to establish the journal Biometrika himself. So, we should say their works are "statistical" rather than "mathematical".

In the current domain of Mathematics, Statistics is widely said to be a pure branch of Mathematics. This might be a wrong statement, since not all concepts in Statistics are a subset of the concepts in Mathematics. Therefore, we cannot say that Statistics is a branch of Mathematics. This means that a person working in Statistics is not necessarily a mathematician or from a mathematical background, just as William Sealy Gosset, who came from a chemistry background and worked at a brewery of Arthur Guinness & Son in Dublin, Ireland. Yuanfangdelang (talk) 01:37, 13 March 2015 (UTC)

Why does it say that the statistic must distrubute according to the Student t?

I think this statement is wrong. Indeed, in the hypothesis-testing process you go through the Student-t distribution. For instance, you take the average as the statistic of the sample, and in the process of finding the right threshold for rejection of H0 you standardize it with the estimated standard deviation. The statistic itself is distributed, in this case, according to the normal distribution. — Preceding unsigned comment added by Itskov (talkcontribs) 12:55, 14 June 2015 (UTC)

Assessment comment

The comment(s) below were originally left at Talk:Student's t-test/Comments, and are posted here for posterity. Following several discussions in past years, these subpages are now deprecated. The comments may be irrelevant or outdated; if so, please feel free to remove this section.

Needs to be made more accessible to the non-mathematically oriented novice, e.g. by attending to the later comments below. In particular I agree that the "dependent t-test" section needs work. Makes it to B rather than Start class though I think. This is a much-viewed article but it is a hard topic to explain well. Qwfp (talk) 14:24, 22 February 2008 (UTC)

Last edited at 14:24, 22 February 2008 (UTC). Substituted at 15:53, 1 May 2016 (UTC)

rotted link

Link to reference [9] for proc ttest in SAS is rotten. 131.249.80.207 (talk) 20:01, 11 September 2015 (UTC)EAC

Unclear contribution by Texyalen

The recent contribution by User:Texyalen doesn't seem very useful to me without some explanation or at least one or two sentences of context. On the other hand, it will be very confusing or even deterring to beginners. Also, I can't make any sense out of some of the terms (see also copyedits by later contributors). Note that (1) Texyalen has posted the exact same content on Log-rank test and Logistic regression (there even a second time), and (2) they have refused to discuss any of their contributions [1].

If you want to add this back into the article, please refine it and make it understandable - or, even better, put it in one of the introductory articles on the topic, or create one in its own right. Here is the full contribution of Texyalen (with copyedits by others and demoted heading):

Statistical Comparison

variable type | Statistical Unit               | Comparison Test | regression model
numerical     | mean                           | t-test/ANOVA    | Linear regression
categorical   | percentage                     | Chi-square test | Logistic regression
persontime    | KM estimates (survival curves) | Log-rank test   | Cox regression


Se'taan (talk) 19:47, 7 October 2015 (UTC)

Student t test

What are people's thoughts about changing this to the Student t test, with an italicized t? This removal of the possessive and use of italics is as per AMA style. Removing the hyphen is also according to AMA style, but it is also just logical; the hyphen should be retained only if the term is being used as a modifier (eg, "the t-test results"). Any objections?

Well, no objections have been made. If some of you do object, maybe me being bold and making the aforementioned change will spur you to comment so we can discuss the issue properly.211.23.25.61 (talk) 07:12, 20 April 2016 (UTC)
211.23.25.61 (talk) 07:27, 14 April 2016 (UTC)

Sample size formula

Is there a reason why sample size planning for Student's t-test is not discussed in this article? — Preceding unsigned comment added by 134.76.140.68 (talk) 07:40, 13 September 2016 (UTC)

Worked Examples appear to be wrong and use inconsistent notation

It's possible that I'm doing something wrong in my math, but I cannot reproduce the answers in the worked examples section. For example, under the equal variance section, I get a test statistic of approximately 2.03. I'm not confident enough in my knowledge to fix this section, but I wanted to mark it so that somebody who is comfortable with the topic can go through and double-check all of the math.

I added some clarifying comments and changed some of the notation to match the rest of the article, but fixing the section as a whole is a bit beyond me at the moment.

Deadphysicist (talk) 17:26, 27 September 2016 (UTC)

After more reading, I believe I have fixed the values given in the worked examples section. However, I removed the discussion of p-values. My goal is to add that back next. I would appreciate it if somebody could double check my math.

Deadphysicist (talk) 17:56, 27 September 2016 (UTC)

I have gone through the worked examples section and actually agree with the original values (such as the 1.959 value for the test statistic within the equal variance section). Would be happy to discuss more. Mathandpi (talk) 23:06, 29 September 2016 (UTC)
Sorry for taking so long to get back to you. This is my work in ipython, is there something you see that I'm doing wrong? Thanks!
In [1]: import numpy

In [2]: A1=numpy.array([30.02, 29.99, 30.11, 29.97, 30.01, 29.99])

In [3]: A2=numpy.array([29.89, 29.93, 29.72, 29.98, 30.02, 29.98])

In [4]: S1sq=numpy.sum((A1-A1.mean())**2.)/(len(A1)-1)

In [5]: S2sq=numpy.sum((A2-A2.mean())**2.)/(len(A2)-1)

In [6]: unequal_var_test_statistic=numpy.sqrt(S1sq/len(A1) + S2sq/len(A2))

In [7]: unequal_var_test_statistic
Out[7]: 0.048493985881413022

In [8]: equal_var_test_statistic=numpy.sqrt(((len(A1)-1)*S1sq+(len(A2)-1)*S2sq)/(len(A1)+len(A2)-2))

In [9]: equal_var_test_statistic
Out[9]: 0.083994047408135153

Deadphysicist (talk) 17:04, 16 February 2017 (UTC)

Thank you for the reply. I think these numbers are ok (though what you have called the test statistics I think you mean the estimates of the pooled standard deviation). These numbers I think are different from your previous revision ( https://en.wikipedia.org/w/index.php?title=Student%27s_t-test&type=revision&diff=741466746&oldid=740180390 ). In the meantime, it looks like someone else actually restored the values in the article to those prior to your revision. Do you still get different estimates for the test statistics or p-values? Mathandpi (talk) 18:41, 16 February 2017 (UTC)
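For later readers checking this exchange: the quantities computed in the ipython session above are pooled standard deviations, not t statistics. A stdlib-only sketch (same data as quoted in the thread) that finishes the equal-variance calculation and reproduces the disputed value:

```python
# Completing the thread's computation: divide the mean difference by the
# pooled standard error to get the equal-variance t statistic.
import math
import statistics

A1 = [30.02, 29.99, 30.11, 29.97, 30.01, 29.99]
A2 = [29.89, 29.93, 29.72, 29.98, 30.02, 29.98]
n1, n2 = len(A1), len(A2)

# Pooled variance, as in the equal-variance worked example.
sp2 = ((n1 - 1) * statistics.variance(A1)
       + (n2 - 1) * statistics.variance(A2)) / (n1 + n2 - 2)
t = ((statistics.mean(A1) - statistics.mean(A2))
     / math.sqrt(sp2 * (1 / n1 + 1 / n2)))
print(round(t, 3))  # → 1.959
```

This agrees with the 1.959 value Mathandpi defends above: the ~0.084 number in the session is sqrt(sp2), which still needs the sqrt(1/n1 + 1/n2) factor and the mean difference.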

Going through the worked example, I could be wrong, but I did not get the final values and would disagree with the standard deviation given for A2: it rounds to 0.11, not 0.12. ValentinWFP (talk) 11:52, 9 November 2016 (UTC)

I agree. I calculate standard deviation for A2 as 0.1079 which rounds to 0.11, not 0.12. 24.184.121.149 (talk) 19:06, 20 November 2016 (UTC)

I agree with both of you, I don't know what I was thinking before. Thanks for catching this. Deadphysicist (talk) 17:04, 16 February 2017 (UTC)

One-sample t-test section : Assumption for CLT

The current text reads: if the sampling of the parent population is independent and the first moment of the parent population exists then the sample means will be approximately normal

I think two moments are needed, not one - stable distributions covering the ground in between. https://en.wikipedia.org/wiki/Stable_distribution#A_generalized_central_limit_theorem — Preceding unsigned comment added by 170.148.215.156 (talk) 12:04, 20 July 2017 (UTC)

Alternatives to the t-test for location problems

I continue to think the paragraph on exactness is correct but not most relevant in this section. The first paragraph in this section outlines the basic considerations for when the t-test is most likely to give reasonable results: under normality in small samples (which is hard to confirm), and also asymptotically thanks to the CLT (plus other limit theorems).

Since this section concerns alternatives to the t-test, I suggest we only include information here that is relevant to choosing between the t-test and an alternative. Conditions for exactness are great to go over in, say, the assumptions section. However, I'm not aware of any situation in which exactness (or lack thereof) would guide a decision regarding which test to use; type I error control and reasonable power to test a hypothesis we care about (or coverage and width of intervals if we're inverting a test to form a CI) are the relevant properties for this decision. (In fact, in some situations, exact tests° have poorer behavior in terms of these criteria than do approximate tests.)


° e.g., the Clopper-Pearson interval for a binomial proportion[1]

So many suspicious toenails (talk) 18:09, 6 August 2019 (UTC)

Hi So many suspicious toenails,
In general, I appreciate your thoughts and involvement, regardless of what my thoughts are about your specific arguments.
As for what you wrote, I half agree. I also think that the description of exactness is better positioned under assumptions section (I've now moved it there).
I also agree with you that the t-test should be compared in terms of type I error, power, and general robustness to the assumptions and hypothesis tested.
At the same time, I think (differently than you) that exactness is important (specifically, to make sure we control the type I error). It is true, as your reference mentions, that for discrete cases we could get a conservative test that doesn't make full use of the allotted type I error. But going with, say, asymptotic tests means that for small sample sizes we could easily get an overly inflated error. I happen to come across such cases in my work, and think this is worth mentioning and keeping in mind (and in the article). With regards, Tal Galili (talk) 12:10, 7 August 2019 (UTC)
Hi,
Yes, you’re completely right that the issue in the binomial case is discreteness, and I think your description of exactness makes a lot of sense in the assumptions section.
My guess is that we largely agree here. I think, practically speaking, that type 1 error control, etc., is more important than exactness, and it sounds like you are emphasizing exactness because it guarantees good testing properties in small samples (when it applies at all, of course).
It sounds like we may have differing views of how useful MW is as a test of location, particularly when sample sizes are small, but if you’re happy with the comparison currently in this article, I am too.
So many suspicious toenails (talk) 16:51, 7 August 2019 (UTC)

Test statistic for one-sample t-test

The section “One-sample t-test” says

In testing the null hypothesis that the population mean is equal to a specified value μ0, one uses the statistic
t = (x̄ − μ0) / (s/√n)
where x̄ is the sample mean, s is the sample standard deviation of the sample and n is the sample size. The degrees of freedom used in this test are n − 1.

Should the formula actually use the square root of n–1 rather than the square root of n? Recently an IP changed it to n–1, but another IP changed it back. Loraof (talk) 19:45, 7 July 2018 (UTC)

No, it should definitely be n there; the standard error of x̄ is σ/√n, which we estimate by replacing σ by s, giving s/√n. (There is an n − 1 'hidden' within the definition of s though.) Unfortunately, while it used to be mostly correct, there are many serious errors on this page now because too many people think they have enough expertise on this subject to edit this page when they really, really don't. (There are so many people who write basic stats books for various application areas who don't know what they're doing, and then their students run down here and wreck up the place. It's like trying to push back the tide with a colander.) Glenbarnett (talk) 03:09, 25 October 2020 (UTC)
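The point about the "hidden" n − 1 can be verified numerically. A stdlib-only sketch with hypothetical data: the n − 1 divisor lives inside the sample standard deviation s, while the standard-error denominator divides by √n:

```python
# The n - 1 is inside s (Bessel's correction); the denominator of t uses sqrt(n).
import math
import statistics

data = [4.2, 3.9, 4.5, 4.0, 4.4]   # hypothetical sample
n = len(data)
mu0 = 4.0                           # hypothesized population mean (hypothetical)

s = statistics.stdev(data)          # divides by n - 1 internally
s_manual = math.sqrt(
    sum((x - statistics.mean(data)) ** 2 for x in data) / (n - 1)
)

# One-sample t statistic: sample sd over sqrt(n), NOT sqrt(n - 1).
t = (statistics.mean(data) - mu0) / (s / math.sqrt(n))
```

Using √(n − 1) in the denominator as well would double-count the correction; the degrees of freedom n − 1 enter only through s and through the reference distribution used to look up the p-value.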

Move discussion in progress

There is a move discussion in progress on Talk:Student's t-distribution which affects this page. Please participate on that page and not in this talk page section. Thank you. —RMCD bot 04:01, 24 August 2021 (UTC)