A statistic (singular) or sample statistic is any quantity computed from values in a sample, such as the sample mean. Technically speaking, a statistic can be calculated by applying any mathematical function to the values found in a sample of data.[1]

In statistics, there is an important distinction between a statistic and a parameter. "Parameter" refers to any characteristic of a population under study. When it is not possible or practical to directly measure the value of a population parameter, statistical methods are used to infer the likely value of the parameter on the basis of a statistic computed from a sample taken from the population. When a statistic is used to estimate a population parameter, it is called an estimator. It can be proved that the mean of a sample is an unbiased estimator of the population mean. This means that the average of the sample means from many samples will tend to converge to the true mean of the population.[2]
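The unbiasedness of the sample mean can be illustrated with a small simulation. The sketch below is only an illustration, not a proof: the population (the integers 1 to 1000), the sample size, the number of samples, and the helper name `average_of_sample_means` are all assumptions chosen for the example.

```python
import random
import statistics

def average_of_sample_means(population, sample_size, num_samples, seed=0):
    """Draw repeated random samples and return the average of their means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    sample_means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]
    return statistics.fmean(sample_means)

# Hypothetical population: the integers 1..1000 (true mean 500.5).
population = list(range(1, 1001))
population_mean = statistics.fmean(population)  # the parameter

# Each sample mean is a statistic; their average should lie close to the parameter.
estimate = average_of_sample_means(population, sample_size=25, num_samples=5000)
print(population_mean, estimate)
```

Any single sample mean may be far from the population mean; it is the average over many samples that settles near it.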

Formally, statistical theory defines a statistic as a function of a sample where the function itself is independent of the unknown estimands; that is, the function is strictly a function of the data. The term statistic is used both for the function and for the value of the function on a given sample.
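In symbols, this definition might be written as follows. The notation is an assumption made for illustration: $X_1, \dots, X_n$ denotes the sample, $t$ a known function, and $\mu$ an unknown population mean.

```latex
% A statistic: a function of the sample alone.
T = t(X_1, \dots, X_n), \qquad \text{for example} \qquad
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i .

% Not a statistic when \mu is unknown, because the function depends on \mu:
\frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 .
```

Here $T$ names the function applied to the random sample, while the number obtained from one observed sample $x_1, \dots, x_n$ is the value $t(x_1, \dots, x_n)$; both are called a statistic.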

When a statistic (a function) is being used for a specific purpose, it may be referred to by a name indicating its purpose: in descriptive statistics, a descriptive statistic is used to describe the data; in estimation theory, an estimator is used to estimate a parameter of the distribution (population); in statistical hypothesis testing, a test statistic is used to test a hypothesis. However, a single statistic can be used for multiple purposes – for example the sample mean can be used to describe a data set, to estimate the population mean, or to test a hypothesis.
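The three uses of the sample mean can be sketched in a few lines of Python. The data values and the hypothesized mean `mu0` are invented for illustration, and the one-sample t statistic is just one common choice of test statistic built on the sample mean.

```python
import math
import statistics

# Hypothetical sample (illustrative values only, not real data).
sample = [4.0, 6.5, 5.0, 7.0, 5.5, 6.0, 4.5, 6.5]
n = len(sample)

# 1. Descriptive statistic: summarizes the observed data.
sample_mean = statistics.fmean(sample)

# 2. Estimator: the same number serves as a point estimate of the population mean.
estimate_of_population_mean = sample_mean

# 3. Test statistic: one-sample t statistic for the hypothesis
#    "the population mean equals mu0" (mu0 is assumed for illustration).
mu0 = 5.0
sample_sd = statistics.stdev(sample)  # sample standard deviation, n - 1 denominator
t_statistic = (sample_mean - mu0) / (sample_sd / math.sqrt(n))

print(sample_mean, estimate_of_population_mean, t_statistic)
```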

Examples

Some examples of statistics are:

  • "In a recent survey of Americans, 52% of Republicans say global warming is happening."

    In this case, "52%" is a statistic, namely the percentage of Republicans in the survey sample who believe in global warming. The population is the set of all Republicans in the United States, and the parameter is the percentage of all Republicans, not just those surveyed, who believe in global warming.

  • "The manager of a large hotel located near Disney World indicated that 20 selected guests had a mean length of stay equal to 5.6 days."

    In this example, "5.6 days" is a statistic, namely the mean length of stay for our sample of 20 hotel guests. The population is the set of all guests of this hotel, and the parameter is the mean length of stay for all guests.[3]

A variety of functions are used to calculate statistics; examples include the sample mean, the sample median, and the sum of the smallest and largest values divided by 2.[1]
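As a minimal sketch, the three statistics named in reference 1 (the sample mean, the sample median, and the sum of the smallest and largest values divided by 2) can each be written as a function of the sample. The sample values and the helper name `midrange` are assumptions made for this example.

```python
import statistics

def midrange(values):
    """Sum of the smallest and the largest values divided by 2."""
    return (min(values) + max(values)) / 2

# Hypothetical sample (illustrative values only).
sample = [2, 3, 3, 5, 7, 10]

# Each entry applies a different function to the same sample.
sample_statistics = {
    "mean": statistics.fmean(sample),
    "median": statistics.median(sample),
    "midrange": midrange(sample),
}
print(sample_statistics)
```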

Properties

Observability

A statistic is an observable random variable, which differentiates it both from a parameter, which is a generally unobservable quantity describing a property of a statistical population, and from an unobservable random variable, such as the difference between an observed measurement and a population average. A parameter can only be computed exactly if the entire population can be observed without error; for instance, in a perfect census or for a population of standardized test takers.

Statisticians often contemplate a parameterized family of probability distributions, any member of which could be the distribution of some measurable aspect of each member of a population, from which a sample is drawn randomly. For example, the parameter may be the average height of 25-year-old men in North America. The heights of the members of a sample of 100 such men are measured; the average of those 100 numbers is a statistic. The average of the heights of all members of the population is not a statistic unless that has somehow also been ascertained (such as by measuring every member of the population). The average height that would be calculated using all of the individual heights of all 25-year-old North American men is a parameter, and not a statistic.
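The height example can be mimicked with simulated data. In the sketch below the population is generated artificially; the normal distribution, its mean of 177 cm and standard deviation of 7 cm, and the population size are all assumptions made for illustration, not real measurements.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is reproducible

# Hypothetical population of heights in cm (simulated, not real data).
population = [rng.gauss(177.0, 7.0) for _ in range(100_000)]

# Parameter: requires the height of every member of the population.
population_mean = statistics.fmean(population)

# Statistic: computed from a random sample of 100 members only.
sample = rng.sample(population, 100)
sample_mean = statistics.fmean(sample)

print(population_mean, sample_mean)
```

In practice only `sample_mean` would be observable; `population_mean` can be computed here only because the whole population was simulated.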

Statistical properties

Important potential properties of statistics include completeness, consistency, sufficiency, unbiasedness, minimum mean square error, low variance, robustness, and computational convenience.
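A few of these properties have short standard definitions, sketched below for a statistic $T$ (or a sequence $T_n$ indexed by sample size) used to estimate a parameter $\theta$. The symbols are assumptions introduced for illustration.

```latex
\begin{aligned}
\operatorname{Bias}_\theta(T) &= \mathbb{E}_\theta[T] - \theta,
  && T \text{ is unbiased if } \operatorname{Bias}_\theta(T) = 0 \text{ for all } \theta, \\
\operatorname{MSE}_\theta(T) &= \mathbb{E}_\theta\!\left[(T - \theta)^2\right]
  = \operatorname{Var}_\theta(T) + \operatorname{Bias}_\theta(T)^2, \\
T_n &\xrightarrow{\;p\;} \theta \text{ as } n \to \infty
  && \text{(consistency).}
\end{aligned}
```

The second line shows why low variance and unbiasedness are both listed: the mean square error is the sum of the variance and the squared bias.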

Information of a statistic

The information that a statistic carries about model parameters can be defined in several ways. The most common is the Fisher information, which is defined on the statistical model induced by the statistic. The Kullback information measure can also be used.
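For a scalar parameter, the Fisher information of a statistic might be written as below. This is a sketch under the usual regularity conditions, with assumed notation: $g(t; \theta)$ is the density (or probability mass function) of the statistic $T$, and $f(x; \theta)$ is that of the full sample $X$.

```latex
I_T(\theta) = \mathbb{E}_\theta\!\left[
  \left( \frac{\partial}{\partial \theta} \log g(T; \theta) \right)^{2}
\right],
\qquad
I_X(\theta) = \mathbb{E}_\theta\!\left[
  \left( \frac{\partial}{\partial \theta} \log f(X; \theta) \right)^{2}
\right],
\qquad
I_T(\theta) \le I_X(\theta).
```

Under those conditions, equality holds when $T$ is a sufficient statistic, meaning that reducing the sample to $T$ loses no information about $\theta$.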

See also

References

  • Kokoska, Stephen (2015). Introductory Statistics: A Problem-Solving Approach (2nd ed.). New York: W. H. Freeman and Company. ISBN 978-1-4641-1169-3.
  • Parker, Sybil P (editor in chief). "Statistic". McGraw-Hill Dictionary of Scientific and Technical Terms. Fifth Edition. McGraw-Hill, Inc. 1994. ISBN 0-07-042333-4. Page 1912.
  • DeGroot and Schervish. "Definition of a Statistic". Probability and Statistics. International Edition. Third Edition. Addison Wesley. 2002. ISBN 0-321-20473-5. Pages 370 to 371.
  1. ^ Kokoska 2015, p. 296, "A statistic is any sample quantity. There are infinitely many quantities one could compute using the data in a sample. For example, x̄ and x͂ are statistics, as is the sum of the smallest and the largest values divided by 2."
  2. ^ Kokoska 2015, pp. 296-308.
  3. ^ Kokoska 2015, pp. 296-297.