# Entropy rate

In the mathematical theory of probability, the entropy rate or source information rate of a stochastic process is, informally, the time density of the average information in a stochastic process. For stochastic processes with a countable index, the entropy rate $H(X)$ is the limit of the joint entropy of $n$ members of the process $X_{k}$ divided by $n$, as $n$ tends to infinity:

$H(X)=\lim _{n\to \infty }{\frac {1}{n}}H(X_{1},X_{2},\dots ,X_{n})$

when the limit exists. An alternative, related quantity is:

$H'(X)=\lim _{n\to \infty }H(X_{n}|X_{n-1},X_{n-2},\dots ,X_{1})$

For strongly stationary stochastic processes, $H(X)=H'(X)$. The entropy rate can be thought of as a general property of stochastic sources; this is the subject of the asymptotic equipartition property.

The entropy rate may be used to estimate the complexity of stochastic processes. It is used in diverse applications ranging from characterizing the complexity of languages and blind source separation to optimizing quantizers and data compression algorithms. For example, a maximum entropy rate criterion may be used for feature selection in machine learning.
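The two limits defined above can be compared numerically on a small example. The following is a minimal sketch, not a general-purpose estimator: it assumes a hypothetical two-state process whose joint probabilities factor as a first-order chain started from its stationary distribution, and it computes the joint entropy by brute-force enumeration. The names `joint_entropy`, `block_rate`, `conditional_entropy`, `TRANS` and `INIT` are illustrative choices, and entropies are measured in bits.

```python
import itertools
import math

# Hypothetical two-state process: TRANS[a][b] is the probability of moving
# from state a to state b, and INIT is chosen so that INIT = INIT * TRANS,
# which makes the process stationary.
TRANS = [[0.9, 0.1], [0.4, 0.6]]
INIT = [0.8, 0.2]


def joint_entropy(n):
    """Entropy in bits of (X_1, ..., X_n), enumerating all 2**n sequences."""
    h = 0.0
    for seq in itertools.product(range(len(INIT)), repeat=n):
        p = INIT[seq[0]]
        for a, b in zip(seq, seq[1:]):
            p *= TRANS[a][b]
        if p > 0:  # convention: 0 log 0 = 0
            h -= p * math.log2(p)
    return h


def block_rate(n):
    """The quantity (1/n) H(X_1, ..., X_n) whose limit defines H(X)."""
    return joint_entropy(n) / n


def conditional_entropy(n):
    """H(X_n | X_{n-1}, ..., X_1), obtained from the chain rule for entropy."""
    if n == 1:
        return joint_entropy(1)
    return joint_entropy(n) - joint_entropy(n - 1)


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 12):
        print(n, block_rate(n), conditional_entropy(n))
```

For this particular process the conditional entropy stops changing from $n=2$ onward, while the per-symbol block entropy decreases toward the same value more slowly, at a rate proportional to $1/n$. Enumeration is exponential in $n$, so the approach is only practical for short blocks.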

## Entropy rates for Markov chains

A Markov chain that is irreducible, aperiodic and positive recurrent has a unique stationary distribution, to which it converges from any starting state, so the entropy rate of the stochastic process it defines is independent of the initial distribution.

For example, for such a Markov chain $Y_{k}$ defined on a countable number of states, given the transition matrix $P_{ij}$, $H(Y)$ is given by:

$\displaystyle H(Y)=-\sum _{ij}\mu _{i}P_{ij}\log P_{ij}$

where $\mu _{i}$ is the asymptotic distribution of the chain.
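The formula above can be evaluated directly once the asymptotic distribution is known. The sketch below makes several assumptions that are not part of the text: the chain has finitely many states, the transition matrix is given as a list of rows, and $\mu$ is approximated by power iteration from the uniform distribution, which is one of several possible methods. The function names `stationary_distribution` and `markov_entropy_rate` and the example matrix are hypothetical.

```python
import math


def stationary_distribution(P, iterations=1000):
    """Approximate mu satisfying mu = mu P by repeatedly applying P.

    Assumes P is a row-stochastic matrix of an irreducible, aperiodic,
    finite chain, so the iteration converges to the unique mu.
    """
    n = len(P)
    mu = [1.0 / n] * n
    for _ in range(iterations):
        mu = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
    return mu


def markov_entropy_rate(P, base=2.0):
    """Evaluate -sum_ij mu_i P_ij log P_ij, with the convention 0 log 0 = 0."""
    mu = stationary_distribution(P)
    return -sum(
        mu[i] * p * math.log(p, base)
        for i, row in enumerate(P)
        for p in row
        if p > 0
    )


if __name__ == "__main__":
    P = [[0.9, 0.1], [0.4, 0.6]]  # illustrative transition matrix
    print(stationary_distribution(P))
    print(markov_entropy_rate(P))  # entropy rate in bits per step
```

With `base=2.0` the result is in bits per step; passing `base=math.e` gives nats. For large or poorly mixing chains, solving the linear system $\mu =\mu P$ directly is likely to be more reliable than a fixed number of iterations.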

A simple consequence of this definition is that an independent and identically distributed (i.i.d.) stochastic process has an entropy rate equal to the entropy of any individual member of the process.
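One way to see this, sketched here using only the fact that the joint entropy of independent variables is the sum of their individual entropies: for an i.i.d. process $X_{k}$ the joint entropy splits into $n$ identical terms, so

$\displaystyle H(X)=\lim _{n\to \infty }{\frac {1}{n}}H(X_{1},X_{2},\dots ,X_{n})=\lim _{n\to \infty }{\frac {1}{n}}\sum _{k=1}^{n}H(X_{k})=\lim _{n\to \infty }{\frac {1}{n}}\,nH(X_{1})=H(X_{1})$

The same result follows from the Markov chain formula in the special case where every row of the transition matrix equals the asymptotic distribution, $P_{ij}=\mu _{j}$:

$\displaystyle H(Y)=-\sum _{ij}\mu _{i}\mu _{j}\log \mu _{j}=-\sum _{j}\mu _{j}\log \mu _{j}$

where the last step uses $\sum _{i}\mu _{i}=1$.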