
In probability theory and statistics, the Bernoulli distribution, named after Swiss mathematician Jacob Bernoulli,[1] is the discrete probability distribution of a random variable which takes the value 1 with probability $p$ and the value 0 with probability $q = 1 - p$. Equivalently, it is the probability distribution of any single experiment that asks a yes–no question; the question results in a boolean-valued outcome, a single bit of information whose value is success/yes/true/one with probability $p$ and failure/no/false/zero with probability $q$. It can be used to represent a (possibly biased) coin toss where 1 and 0 represent "heads" and "tails", respectively, and $p$ is the probability of the coin landing on heads (or vice versa, with 1 representing "tails" and $p$ the probability of tails). In particular, unfair coins would have $p \neq 1/2$.
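As an illustration of the coin-toss reading, the following minimal Python sketch draws Bernoulli outcomes by comparing a uniform random number with $p$. The function name `bernoulli_sample`, the seed, and the value $p = 0.3$ are assumptions chosen for the example, not part of any standard API.

```python
import random


def bernoulli_sample(p, rng):
    """Return 1 with probability p and 0 with probability 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # rng.random() is uniform on [0, 1), so this comparison is true with probability p.
    return 1 if rng.random() < p else 0


rng = random.Random(0)  # fixed seed (an arbitrary choice) so the run is reproducible
p = 0.3                 # an unfair coin, since p != 1/2
tosses = [bernoulli_sample(p, rng) for _ in range(10_000)]

# The relative frequency of 1s should be close to p for a large number of tosses.
print(sum(tosses) / len(tosses))
```

With many tosses the printed relative frequency should settle near $p$, which is the sense in which $p$ is "the probability of heads".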

The Bernoulli distribution is a special case of the binomial distribution where a single trial is conducted (so n would be 1 for such a binomial distribution). It is also a special case of the two-point distribution, for which the possible outcomes need not be 0 and 1.


Properties of the Bernoulli distribution

If $X$ is a random variable with this distribution, then:

$$\Pr(X = 1) = p = 1 - \Pr(X = 0) = 1 - q.$$


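The probability mass function $f$ of this distribution, over possible outcomes $k$, is

$$f(k; p) = \begin{cases} p & \text{if } k = 1, \\ q = 1 - p & \text{if } k = 0. \end{cases}$$

This can also be expressed as

$$f(k; p) = p^k (1 - p)^{1 - k} \quad \text{for } k \in \{0, 1\}$$

or as

$$f(k; p) = pk + (1 - p)(1 - k) \quad \text{for } k \in \{0, 1\}.$$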
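The equivalence of the piecewise, power, and linear forms of the probability mass function can be checked directly. The sketch below uses exact rational arithmetic; the function names `pmf_cases`, `pmf_power`, and `pmf_linear` are labels invented for this example.

```python
from fractions import Fraction


def pmf_cases(k, p):
    """Piecewise form: p for k = 1, 1 - p for k = 0."""
    if k == 1:
        return p
    if k == 0:
        return 1 - p
    raise ValueError("k must be 0 or 1")


def pmf_power(k, p):
    """Power form: p^k (1 - p)^(1 - k)."""
    return p**k * (1 - p) ** (1 - k)


def pmf_linear(k, p):
    """Linear form: p k + (1 - p)(1 - k)."""
    return p * k + (1 - p) * (1 - k)


# Compare the three forms on a few exact values of p, including the endpoints.
for p in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    for k in (0, 1):
        assert pmf_cases(k, p) == pmf_power(k, p) == pmf_linear(k, p)
    # The probabilities of the two outcomes always sum to 1.
    assert pmf_cases(0, p) + pmf_cases(1, p) == 1
```

The power form is convenient for writing likelihoods as products, while the linear form avoids exponentiation altogether.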


The Bernoulli distribution is a special case of the binomial distribution with $n = 1$.[3]
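To make the special case concrete, the following sketch evaluates the standard binomial probability mass function $\binom{n}{k} p^k (1 - p)^{n - k}$ at $n = 1$ and compares it with the Bernoulli probabilities. The helper names are assumptions made for the example.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def bernoulli_pmf(k, p):
    """Probability of outcome k (0 or 1) in a single trial."""
    return p if k == 1 else 1 - p


p = Fraction(3, 10)
for k in (0, 1):
    # With a single trial, both binomial coefficients equal 1,
    # so the binomial pmf reduces to the Bernoulli pmf.
    assert binomial_pmf(k, 1, p) == bernoulli_pmf(k, p)
```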

The kurtosis goes to infinity for high and low values of $p$, but for $p = 1/2$ the two-point distributions, including the Bernoulli distribution, have a lower excess kurtosis than any other probability distribution, namely −2.
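The behaviour of the kurtosis can be seen by computing the excess kurtosis from its definition, the fourth standardized central moment minus 3. The sketch below does this exactly with rational numbers; the function name `excess_kurtosis` is an assumption for the example.

```python
from fractions import Fraction


def excess_kurtosis(p):
    """Excess kurtosis of Bernoulli(p) for 0 < p < 1, from the definition."""
    q = 1 - p
    mean = p
    variance = p * q
    # Fourth central moment: sum over the outcomes 1 and 0.
    fourth_central = p * (1 - mean) ** 4 + q * (0 - mean) ** 4
    return fourth_central / variance**2 - 3


# The value is smallest at p = 1/2 and grows as p approaches 0 (or, by symmetry, 1).
for p in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(p, excess_kurtosis(p))

assert excess_kurtosis(Fraction(1, 2)) == -2
```

Simplifying the same expression gives the closed form $(1 - 6pq)/(pq)$, which makes the divergence as $pq \to 0$ explicit.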

The Bernoulli distributions for $0 \leq p \leq 1$ form an exponential family.
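One common way to exhibit the exponential-family structure is to write the probability mass function as $\exp(k\eta - A(\eta))$ with natural parameter $\eta = \log\frac{p}{1 - p}$ and log-partition function $A(\eta) = \log(1 + e^{\eta})$. The sketch below checks this numerically; it assumes $0 < p < 1$, since $\eta$ is not finite at the endpoints, and the function names are chosen for the example.

```python
import math


def natural_parameter(p):
    """Log-odds eta = log(p / (1 - p)); requires 0 < p < 1."""
    return math.log(p / (1 - p))


def log_partition(eta):
    """A(eta) = log(1 + exp(eta)), which normalizes the distribution."""
    return math.log1p(math.exp(eta))


def pmf_exponential_family(k, p):
    """Bernoulli pmf written as exp(k * eta - A(eta))."""
    eta = natural_parameter(p)
    return math.exp(k * eta - log_partition(eta))


for p in (0.1, 0.5, 0.9):
    # k = 1 should recover p and k = 0 should recover 1 - p.
    assert math.isclose(pmf_exponential_family(1, p), p)
    assert math.isclose(pmf_exponential_family(0, p), 1 - p)
```

In this form the sufficient statistic is $k$ itself, which is why sums of observations carry all the information about $p$.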

The maximum likelihood estimator of $p$ based on a random sample is the sample mean.
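A small numerical check of this statement: the sketch below simulates a sample, computes the sample mean, and compares it with the maximizer of the log-likelihood over a grid of candidate values. The true parameter 0.7, the seed, the sample size, and the grid resolution are all arbitrary assumptions for the illustration.

```python
import math
import random


def log_likelihood(p, sample):
    """Log-likelihood of 0 < p < 1 for a list of 0/1 observations."""
    successes = sum(sample)
    failures = len(sample) - successes
    return successes * math.log(p) + failures * math.log(1 - p)


rng = random.Random(42)  # fixed seed so the run is reproducible
true_p = 0.7             # hypothetical "unknown" parameter
sample = [1 if rng.random() < true_p else 0 for _ in range(1000)]

# Maximum likelihood estimate in closed form: the sample mean.
p_hat = sum(sample) / len(sample)

# Brute-force check: maximize the log-likelihood over the grid 0.001, ..., 0.999.
grid = [i / 1000 for i in range(1, 1000)]
best_on_grid = max(grid, key=lambda p: log_likelihood(p, sample))

print(p_hat, best_on_grid)
```

Because the sample has 1000 observations, the sample mean lies on the grid, so the two printed values are expected to coincide whenever the sample contains both outcomes.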


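The expected value of a Bernoulli random variable $X$ is

$$\operatorname{E}[X] = p.$$

This is due to the fact that for a Bernoulli distributed random variable $X$ with $\Pr(X = 1) = p$ and $\Pr(X = 0) = q$ we find

$$\operatorname{E}[X] = \Pr(X = 1) \cdot 1 + \Pr(X = 0) \cdot 0 = p \cdot 1 + q \cdot 0 = p.$$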



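The variance of a Bernoulli distributed $X$ is

$$\operatorname{Var}[X] = pq = p(1 - p).$$

We first find

$$\operatorname{E}[X^2] = \Pr(X = 1) \cdot 1^2 + \Pr(X = 0) \cdot 0^2 = p \cdot 1^2 + q \cdot 0^2 = p = \operatorname{E}[X].$$

From this follows

$$\operatorname{Var}[X] = \operatorname{E}[X^2] - \operatorname{E}[X]^2 = p - p^2 = p(1 - p) = pq.$$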
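The derivation of the mean and variance can be mirrored in a few lines of exact arithmetic by summing over the two outcomes. The helper `raw_moment` and the value $p = 3/10$ are assumptions made for this sketch.

```python
from fractions import Fraction


def raw_moment(p, r):
    """E[X^r] for X ~ Bernoulli(p) and r >= 1, summing over outcomes 1 and 0."""
    q = 1 - p
    return p * 1**r + q * 0**r


p = Fraction(3, 10)
mean = raw_moment(p, 1)                # E[X]
second_moment = raw_moment(p, 2)       # E[X^2], equal to E[X] because X^2 = X
variance = second_moment - mean**2     # Var[X] = E[X^2] - E[X]^2

assert mean == p
assert second_moment == p
assert variance == p * (1 - p)
print(mean, variance)
```

Since $X$ only takes the values 0 and 1, every raw moment of order $r \geq 1$ equals $p$, which is what makes the variance computation so short.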



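The skewness is $\frac{q - p}{\sqrt{pq}} = \frac{1 - 2p}{\sqrt{pq}}$. When we take the standardized Bernoulli distributed random variable $\frac{X - \operatorname{E}[X]}{\sqrt{\operatorname{Var}[X]}}$ we find that this random variable attains $\frac{q}{\sqrt{pq}}$ with probability $p$ and attains $-\frac{p}{\sqrt{pq}}$ with probability $q$. Thus we get

$$\gamma_1 = \operatorname{E}\left[\left(\frac{X - \operatorname{E}[X]}{\sqrt{\operatorname{Var}[X]}}\right)^3\right] = p \cdot \left(\frac{q}{\sqrt{pq}}\right)^3 + q \cdot \left(-\frac{p}{\sqrt{pq}}\right)^3 = \frac{1}{\sqrt{pq}^3}\left(pq^3 - qp^3\right) = \frac{pq}{\sqrt{pq}^3}\left(q^2 - p^2\right) = \frac{(q - p)(q + p)}{\sqrt{pq}} = \frac{q - p}{\sqrt{pq}}.$$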
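The skewness computation can be checked numerically by taking the third moment of the standardized variable and comparing it with the closed form $(1 - 2p)/\sqrt{pq}$. The function names below are assumptions for the sketch, and floating-point arithmetic is used because of the square root.

```python
import math


def skewness_from_definition(p):
    """Third moment of the standardized Bernoulli(p) variable, 0 < p < 1."""
    q = 1 - p
    sd = math.sqrt(p * q)
    value_if_one = q / sd     # standardized value when X = 1 (probability p)
    value_if_zero = -p / sd   # standardized value when X = 0 (probability q)
    return p * value_if_one**3 + q * value_if_zero**3


def skewness_closed_form(p):
    """Closed form (1 - 2p) / sqrt(p (1 - p))."""
    return (1 - 2 * p) / math.sqrt(p * (1 - p))


for p in (0.1, 0.3, 0.5, 0.8):
    assert math.isclose(
        skewness_from_definition(p), skewness_closed_form(p), abs_tol=1e-12
    )
```

The sign follows $q - p$: the distribution is right-skewed for $p < 1/2$, symmetric at $p = 1/2$, and left-skewed for $p > 1/2$.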


Related distributions

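  • In the notation of the binomial distribution, the Bernoulli distribution is simply $\operatorname{B}(1, p)$, also written as $\operatorname{Bernoulli}(p)$.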
  • The categorical distribution is the generalization of the Bernoulli distribution for variables with any constant number of discrete values.
  • The Beta distribution is the conjugate prior of the Bernoulli distribution.
  • The geometric distribution models the number of independent and identical Bernoulli trials needed to get one success.
  • If $Y \sim \operatorname{Bernoulli}\left(\frac{1}{2}\right)$, then $2Y - 1$ has a Rademacher distribution.
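Two of the relationships listed above lend themselves to a short sketch: the conjugate Beta update, in which a $\operatorname{Beta}(\alpha, \beta)$ prior on $p$ becomes $\operatorname{Beta}(\alpha + s, \beta + f)$ after observing $s$ successes and $f$ failures, and the map $y \mapsto 2y - 1$ that turns outcomes in $\{0, 1\}$ into the values $\{-1, +1\}$ of a Rademacher variable. The function names and the observation list are hypothetical, chosen only for illustration.

```python
def beta_posterior(alpha, beta, observations):
    """Update Beta(alpha, beta) prior parameters with a list of 0/1 outcomes."""
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures


def to_rademacher(y):
    """Map a Bernoulli outcome in {0, 1} to a value in {-1, +1}."""
    return 2 * y - 1


observations = [1, 0, 1, 1, 0]  # made-up data: three successes, two failures

# Starting from the uniform prior Beta(1, 1), the posterior is Beta(4, 3).
posterior = beta_posterior(1, 1, observations)
print(posterior)

# The same outcomes mapped to signs.
print([to_rademacher(y) for y in observations])
```

The mapped values are only Rademacher distributed when the underlying outcomes come from a fair coin, that is, when $p = 1/2$.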

References


  1. Uspensky, James Victor (1937). Introduction to Mathematical Probability. New York: McGraw-Hill. p. 45.
  2. Bertsekas, Dimitri P.; Tsitsiklis, John N. (2002). Introduction to Probability. Belmont, Mass.: Athena Scientific. ISBN 188652940X. OCLC 51441829.
  3. McCullagh, Peter; Nelder, John (1989). Generalized Linear Models, Second Edition. Boca Raton: Chapman and Hall/CRC. Section 4.2.2. ISBN 0-412-31760-5.

Further reading

  • Johnson, N. L.; Kotz, S.; Kemp, A. (1993). Univariate Discrete Distributions (2nd ed.). Wiley. ISBN 0-471-54897-9.
  • Peatman, John G. (1963). Introduction to Applied Statistics. New York: Harper & Row. pp. 162–171.
