In probability theory, a Markov kernel (also known as a stochastic kernel or probability kernel) is a map that in the general theory of Markov processes plays the role that the transition matrix does in the theory of Markov processes with a finite state space.[1]

Formal definition

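Let $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ be measurable spaces. A Markov kernel with source $(X, \mathcal{A})$ and target $(Y, \mathcal{B})$ is a map $\kappa : \mathcal{B} \times X \to [0,1]$ with the following properties:

  1. For every (fixed) $B_0 \in \mathcal{B}$, the map $x \mapsto \kappa(B_0 \mid x)$ is $\mathcal{A}$-measurable.
  2. For every (fixed) $x_0 \in X$, the map $B \mapsto \kappa(B \mid x_0)$ is a probability measure on $(Y, \mathcal{B})$.

In other words, it associates to each point $x \in X$ a probability measure $\kappa(dy \mid x) : B \mapsto \kappa(B \mid x)$ on $(Y, \mathcal{B})$ such that, for every measurable set $B \in \mathcal{B}$, the map $x \mapsto \kappa(B \mid x)$ is measurable with respect to the $\sigma$-algebra $\mathcal{A}$.[2]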

Examples

Simple random walk on the integers

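Take $X = Y = \mathbb{Z}$, and $\mathcal{A} = \mathcal{B} = \mathcal{P}(\mathbb{Z})$ (the power set of $\mathbb{Z}$). Then a Markov kernel is fully determined by the probability it assigns to singletons $\{m\}$, $m \in Y = \mathbb{Z}$, for each $n \in X = \mathbb{Z}$:

$$\kappa(B \mid n) = \sum_{m \in B} \kappa(\{m\} \mid n), \qquad \forall n \in \mathbb{Z},\ \forall B \in \mathcal{B}.$$

Now the random walk $\kappa$ that goes to the right with probability $p$ and to the left with probability $1 - p$ is defined by

$$\kappa(\{m\} \mid n) = p\,\delta_{m,n+1} + (1-p)\,\delta_{m,n-1}, \qquad \forall n, m \in \mathbb{Z},$$

where $\delta$ is the Kronecker delta. The transition probabilities $P(m \mid n) = \kappa(\{m\} \mid n)$ for the random walk are equivalent to the Markov kernel.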
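The random-walk kernel can be sketched in a few lines of Python. This is only an illustration: the names `random_walk_kernel`, `point_mass` and `kappa` are hypothetical, a set B is represented by any object that supports the `in` test, and exact fractions are used so that the probabilities add up without rounding.

```python
from fractions import Fraction

def random_walk_kernel(p):
    """Build kappa(B | n) for the simple random walk with right-step probability p."""
    def point_mass(m, n):
        # kappa({m} | n) = p * delta_{m,n+1} + (1 - p) * delta_{m,n-1}
        if m == n + 1:
            return p
        if m == n - 1:
            return 1 - p
        return 0

    def kappa(B, n):
        # Only n - 1 and n + 1 carry mass, so the sum over B reduces to
        # at most two terms. B can be any object supporting `in`.
        return sum(point_mass(m, n) for m in (n - 1, n + 1) if m in B)

    return kappa

kappa = random_walk_kernel(Fraction(1, 3))
right = kappa({1}, 0)       # mass on the right neighbour of 0
total = kappa({-1, 1}, 0)   # the two neighbours together carry all the mass
```

Evaluating the kernel on a set that contains neither neighbour of n returns 0, which matches the sum over singletons in the formula above.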

General Markov processes with countable state space

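More generally, take $X$ and $Y$ both countable and $\mathcal{A} = \mathcal{P}(X)$, $\mathcal{B} = \mathcal{P}(Y)$. Again, a Markov kernel is defined by the probability it assigns to singleton sets for each $i \in X$:

$$\kappa(B \mid i) = \sum_{j \in B} \kappa(\{j\} \mid i), \qquad \forall i \in X,\ \forall B \in \mathcal{B}.$$

We define a Markov process by defining a transition probability $P(j \mid i) = K_{ji}$, where the numbers $K_{ji}$ define a (countable) stochastic matrix $(K_{ji})$, i.e.

$$K_{ji} \geq 0 \quad \forall (j, i) \in Y \times X, \qquad \sum_{j \in Y} K_{ji} = 1 \quad \forall i \in X.$$

We then define

$$\kappa(\{j\} \mid i) = K_{ji} = P(j \mid i), \qquad \forall i \in X,\ \forall j \in Y.$$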

Again the transition probability, the stochastic matrix and the Markov kernel are equivalent reformulations.
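For a finite state space the equivalence can be checked directly. The sketch below assumes a hypothetical 3-state matrix stored as `K[j][i] = P(j | i)`, so that each column (fixed source state i) sums to 1; the helper names are illustrative only.

```python
from fractions import Fraction as F

# Hypothetical 3-state example; K[j][i] = P(j | i), so each COLUMN sums to 1.
K = [
    [F(1, 2), F(1, 4), F(0)],
    [F(1, 2), F(1, 2), F(1, 3)],
    [F(0),    F(1, 4), F(2, 3)],
]

def is_stochastic(K):
    """Check K_ji >= 0 and sum_j K_ji = 1 for every source state i."""
    n_sources = len(K[0])
    nonneg = all(entry >= 0 for row in K for entry in row)
    columns_ok = all(sum(row[i] for row in K) == 1 for i in range(n_sources))
    return nonneg and columns_ok

def kappa(B, i):
    """kappa(B | i) = sum over j in B of K_ji."""
    return sum(K[j][i] for j in B)
```

Here `kappa({0, 1}, 1)` adds the entries 1/4 and 1/2 of column 1, and `kappa` applied to the whole state space returns 1 for every source state.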

Markov kernel defined by a kernel function and a measure

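Let $\nu$ be a measure on $(Y, \mathcal{B})$, and $k : Y \times X \to [0, \infty]$ a measurable function with respect to the product $\sigma$-algebra $\mathcal{B} \otimes \mathcal{A}$ such that

$$\int_Y k(y, x)\,\nu(dy) = 1, \qquad \forall x \in X,$$

then $\kappa(dy \mid x) = k(y, x)\,\nu(dy)$, i.e. the mapping

$$\kappa : \mathcal{B} \times X \to [0,1], \qquad \kappa(B \mid x) = \int_B k(y, x)\,\nu(dy),$$

defines a Markov kernel.[3] This example generalises the countable Markov process example where $\nu$ was the counting measure. Moreover, it encompasses other important examples such as the convolution kernels, in particular the Markov kernels defined by the heat equation. The latter example includes the Gaussian kernel on $X = Y = \mathbb{R}$ with $\nu(dx) = dx$ the standard Lebesgue measure and

$$k_t(y, x) = \frac{1}{\sqrt{2\pi}\,t}\, e^{-(y-x)^2/(2t^2)}.$$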
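As a numerical illustration, the sketch below evaluates the Gaussian kernel on intervals. It assumes the density is the normal density with mean x and standard deviation t, and it compares the closed form (via the error function) with a midpoint Riemann sum of the density. The function names are hypothetical.

```python
import math

def k(y, x, t):
    """Gaussian density k_t(y, x) with mean x and standard deviation t."""
    return math.exp(-(y - x) ** 2 / (2 * t ** 2)) / (math.sqrt(2 * math.pi) * t)

def kappa_interval(a, b, x, t):
    """kappa([a, b] | x) in closed form via the error function."""
    def cdf(y):
        return 0.5 * (1 + math.erf((y - x) / (t * math.sqrt(2))))
    return cdf(b) - cdf(a)

def kappa_midpoint(a, b, x, t, steps=20000):
    """Same quantity by a midpoint Riemann sum of k_t over [a, b]."""
    h = (b - a) / steps
    return h * sum(k(a + (i + 0.5) * h, x, t) for i in range(steps))
```

Integrating over an interval that extends ten standard deviations on either side of x gives a value numerically indistinguishable from 1, which is the normalisation condition required of the kernel function.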

Measurable functions

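Take $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ arbitrary measurable spaces, and let $f : X \to Y$ be a measurable function. Now define $\kappa(dy \mid x) = \delta_{f(x)}(dy)$, i.e.

$$\kappa(B \mid x) = \mathbf{1}_B(f(x)) = \mathbf{1}_{f^{-1}(B)}(x) = \begin{cases} 1 & \text{if } f(x) \in B \\ 0 & \text{otherwise} \end{cases} \qquad \text{for all } B \in \mathcal{B}.$$

Note that the indicator function $\mathbf{1}_{f^{-1}(B)}$ is $\mathcal{A}$-measurable for all $B \in \mathcal{B}$ iff $f$ is measurable.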

This example allows us to think of a Markov kernel as a generalised function with a (in general) random rather than certain value. That is, it is a multivalued function where the values are not equally weighted.
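A deterministic kernel of this kind is easy to write down. In the sketch below, `deterministic_kernel` is a hypothetical helper that turns an ordinary function f into the kernel placing all mass on the single point f(x); sets are again represented by anything that supports `in`.

```python
def deterministic_kernel(f):
    """kappa(B | x) = 1_B(f(x)): all mass sits on the single point f(x)."""
    def kappa(B, x):
        return 1 if f(x) in B else 0
    return kappa

# Hypothetical example: the squaring map on the integers.
square = deterministic_kernel(lambda x: x * x)
```

For instance, `square({4, 9}, -3)` is 1 because (-3)^2 = 9 lies in the set, while `square({5}, 2)` is 0.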

Galton–Watson process

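As a less obvious example, take $X = \mathbb{N}$, $\mathcal{A} = \mathcal{P}(\mathbb{N})$, and $(Y, \mathcal{B})$ the real numbers $\mathbb{R}$ with the standard sigma algebra of Borel sets. Then

$$\kappa(B \mid n) = \begin{cases} \mathbf{1}_B(0) & n = 0 \\ \Pr(\xi_1 + \cdots + \xi_n \in B) & n \neq 0 \end{cases}$$

where the state $n$ records the number of elements, $\xi_i$ are i.i.d. random variables (usually with mean 0) and where $\mathbf{1}_B$ is the indicator function. For the simple case of coin flips this models the different levels of a Galton board.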
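For the coin-flip case the kernel can be computed exactly by repeated convolution. The sketch below assumes steps of -1 or +1 with probability 1/2 each (a hypothetical choice of the i.i.d. variables) and represents a set B by a Python container, which of course covers only a small part of the Borel sets.

```python
from fractions import Fraction

def sum_distribution(n, step):
    """Law of xi_1 + ... + xi_n for i.i.d. steps with point masses `step`."""
    law = {0: Fraction(1)}              # n = 0: point mass at 0
    for _ in range(n):
        nxt = {}
        for s, ps in law.items():
            for v, pv in step.items():
                nxt[s + v] = nxt.get(s + v, 0) + ps * pv
        law = nxt
    return law

def kappa(B, n, step):
    """kappa(B | n) = Pr(xi_1 + ... + xi_n in B)."""
    return sum(p for s, p in sum_distribution(n, step).items() if s in B)

# Hypothetical step law: a fair coin taking the values -1 and +1.
coin = {-1: Fraction(1, 2), +1: Fraction(1, 2)}
```

With this step law, `kappa({0}, 2, coin)` is 1/2 (one step each way), and for n = 0 the kernel reduces to the indicator of 0, as in the first case of the formula.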

Composition of Markov Kernels and the Markov Category

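Given measurable spaces $(X, \mathcal{A})$, $(Y, \mathcal{B})$ we consider a Markov kernel $\kappa : \mathcal{B} \times X \to [0,1]$ as a morphism $\kappa : X \to Y$. Intuitively, rather than assigning to each $x \in X$ a sharply defined point $y \in Y$, the kernel assigns a "fuzzy" point in $Y$ which is only known with some level of uncertainty, much like actual physical measurements. If we have a third measurable space $(Z, \mathcal{C})$, and probability kernels $\kappa : X \to Y$ and $\lambda : Y \to Z$, we can define a composition $\lambda \circ \kappa : X \to Z$ by

$$(\lambda \circ \kappa)(dz \mid x) = \int_Y \lambda(dz \mid y)\,\kappa(dy \mid x).$$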

The composition is associative by the monotone convergence theorem, and the identity function considered as a Markov kernel (i.e. the delta measure $\kappa_1(dx' \mid x) = \delta_x(dx')$) is the unit for this composition.
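On finite spaces the composition integral becomes a finite sum, and associativity and the unit law can be checked by hand. The sketch below stores a kernel as a nested dictionary `kernel[source][target] = probability`; this representation, the helper names and the three small kernels are assumptions made for the illustration.

```python
from fractions import Fraction as F

def compose(lam, kap):
    """(lam o kap)[x][z] = sum over y of lam[y][z] * kap[x][y]."""
    out = {}
    for x, row in kap.items():
        out[x] = {}
        for y, pxy in row.items():
            for z, pyz in lam[y].items():
                out[x][z] = out[x].get(z, 0) + pxy * pyz
    return out

def identity(states):
    """Delta kernel: each state is sent to itself with probability 1."""
    return {x: {x: F(1)} for x in states}

# Hypothetical kernels X -> Y -> Z -> W on small finite sets.
kap = {"a": {0: F(1, 2), 1: F(1, 2)}, "b": {1: F(1)}}
lam = {0: {"u": F(1, 3), "v": F(2, 3)}, 1: {"v": F(1)}}
mu = {"u": {"s": F(1, 4), "t": F(3, 4)}, "v": {"s": F(1)}}

left = compose(mu, compose(lam, kap))    # mu o (lam o kap)
right = compose(compose(mu, lam), kap)   # (mu o lam) o kap
```

Both bracketings give the same kernel, and composing `kap` with `identity` on either side returns `kap` unchanged.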

This composition defines the structure of a category on the measurable spaces with Markov kernels as morphisms, first defined by Lawvere.[4] The category has the empty set as initial object and the one-point set $*$ as the terminal object. From this point of view, a probability space $(\Omega, \mathcal{A}, \mathbb{P})$ is the same thing as a pointed space $* \to \Omega$ in the Markov category.

Probability Space defined by Probability Distribution and a Markov Kernel

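A composition of a probability space $(X, \mathcal{A}, P_X)$ and a probability kernel $\kappa : (X, \mathcal{A}) \to (Y, \mathcal{B})$ defines a probability space $(Y, \mathcal{B}, P_Y = \kappa \circ P_X)$, where the probability measure is given by

$$P_Y(B) = \int_X \int_B \kappa(dy \mid x)\,P_X(dx) = \int_X \kappa(B \mid x)\,P_X(dx) = \mathbb{E}_{P_X}\,\kappa(B \mid \cdot).$$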
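In the finite case the integral over X is a weighted sum of the kernel's rows. The following sketch uses a hypothetical noisy binary channel as the kernel and a hypothetical input distribution; `pushforward` is an illustrative name, not a standard API.

```python
from fractions import Fraction as F

def pushforward(kappa, p_x):
    """P_Y({y}) = sum over x of kappa({y} | x) * P_X({x}) on finite spaces."""
    p_y = {}
    for x, px in p_x.items():
        for y, pyx in kappa[x].items():
            p_y[y] = p_y.get(y, 0) + pyx * px
    return p_y

# Hypothetical example: a bit is flipped with probability 1/10.
p_x = {0: F(2, 3), 1: F(1, 3)}
kappa = {0: {0: F(9, 10), 1: F(1, 10)},
         1: {0: F(1, 10), 1: F(9, 10)}}
p_y = pushforward(kappa, p_x)
```

The resulting `p_y` is again a probability distribution: its values are non-negative and sum to 1.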

Properties

Semidirect product

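Let $(X, \mathcal{A}, P)$ be a probability space and $\kappa$ a Markov kernel from $(X, \mathcal{A})$ to some $(Y, \mathcal{B})$. Then there exists a unique measure $Q$ on $(X \times Y, \mathcal{A} \otimes \mathcal{B})$, such that:

$$Q(A \times B) = \int_A \kappa(B \mid x)\,P(dx), \qquad \forall A \in \mathcal{A},\ \forall B \in \mathcal{B}.$$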
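For finite spaces the measure on the product is determined by its point masses, each being the probability of x times the kernel's mass at y given x. The sketch below builds this joint distribution for a hypothetical two-state example and evaluates it on rectangles; all names and numbers are illustrative.

```python
from fractions import Fraction as F

def joint(p, kappa):
    """Point masses Q({(x, y)}) = P({x}) * kappa({y} | x)."""
    return {(x, y): px * pyx
            for x, px in p.items()
            for y, pyx in kappa[x].items()}

def q_rectangle(p, kappa, A, B):
    """Q(A x B) = sum over x in A of kappa(B | x) * P({x})."""
    return sum(px * sum(pyx for y, pyx in kappa[x].items() if y in B)
               for x, px in p.items() if x in A)

# Hypothetical two-state example.
p = {"rain": F(1, 4), "dry": F(3, 4)}
kappa = {"rain": {"wet": F(9, 10), "ok": F(1, 10)},
         "dry": {"wet": F(1, 5), "ok": F(4, 5)}}
q = joint(p, kappa)
```

Taking B to be the whole target space recovers P(A), so the first marginal of the joint measure is the original distribution.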

Regular conditional distribution

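Let $(S, \mathcal{Y})$ be a Borel space, $X$ an $(S, \mathcal{Y})$-valued random variable on the probability space $(\Omega, \mathcal{F}, P)$ and $\mathcal{G} \subseteq \mathcal{F}$ a sub-$\sigma$-algebra. Then there exists a Markov kernel $\kappa$ from $(\Omega, \mathcal{G})$ to $(S, \mathcal{Y})$, such that $\kappa(B \mid \cdot)$ is a version of the conditional expectation $\mathbb{E}[\mathbf{1}_{\{X \in B\}} \mid \mathcal{G}]$ for every $B \in \mathcal{Y}$, i.e.

$$P(X \in B \mid \mathcal{G}) = \mathbb{E}[\mathbf{1}_{\{X \in B\}} \mid \mathcal{G}] = \kappa(B \mid \cdot), \qquad P\text{-a.s.},\ \forall B \in \mathcal{Y}.$$

It is called a regular conditional distribution of $X$ given $\mathcal{G}$ and is not uniquely defined.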
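A finite example may help make the statement concrete. The sketch below assumes a finite sample space whose sub-sigma-algebra is generated by a partition into blocks of positive probability; in that special case the kernel is just the conditional law of X within the block containing the sample point. The die example and all names are hypothetical.

```python
from fractions import Fraction as F

def conditional_kernel(p, X, partition):
    """kappa(B | omega) = P(X in B | G)(omega), G generated by a finite partition.

    Assumes every block of the partition has positive probability.
    """
    def kappa(B, omega):
        block = next(blk for blk in partition if omega in blk)
        p_block = sum(p[w] for w in block)
        return sum(p[w] for w in block if X(w) in B) / p_block
    return kappa

# Hypothetical example: one fair die, X = face value, G = parity of the face.
p = {w: F(1, 6) for w in range(1, 7)}
partition = [{1, 3, 5}, {2, 4, 6}]
kappa = conditional_kernel(p, lambda w: w, partition)
```

The kernel is constant on each block, which is what measurability with respect to the sub-sigma-algebra means here, and averaging it over a block recovers the probability of the event that X lies in B intersected with that block.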

Generalizations

Transition kernels generalize Markov kernels in the sense that for all $x \in X$, the map

$$B \mapsto \kappa(B \mid x)$$

can be any type of (non-negative) measure, not necessarily a probability measure.

External links

References

  1. Reiss, R. D. (1993). A Course on Point Processes. Springer Series in Statistics. doi:10.1007/978-1-4613-9308-5. ISBN 978-1-4613-9310-8.
  2. Klenke, Achim (2014). Probability Theory: A Comprehensive Course. Universitext (2 ed.). Springer. p. 180. doi:10.1007/978-1-4471-5361-0. ISBN 978-1-4471-5360-3.
  3. Çınlar, Erhan (2011). Probability and Stochastics. New York: Springer. pp. 37–38. ISBN 978-0-387-87858-4.
  4. Lawvere, F. W. (1962). "The Category of Probabilistic Mappings".
§36. Kernels and semigroups of kernels