Riesz representation theorem

This article describes a theorem concerning the dual of a Hilbert space. For the theorems relating linear functionals to measures, see Riesz–Markov–Kakutani representation theorem.

The Riesz representation theorem, sometimes called the Riesz–Fréchet representation theorem after Frigyes Riesz and Maurice René Fréchet, establishes an important connection between a Hilbert space and its continuous dual space. If the underlying field is the real numbers, the two are isometrically isomorphic; if the underlying field is the complex numbers, the two are isometrically anti-isomorphic. The (anti-)isomorphism is a particular natural isomorphism, which is described below.

Preliminaries and notation

Let $H$ be a Hilbert space over a field $\mathbb{F}$, where $\mathbb{F}$ is either the real numbers $\mathbb{R}$ or the complex numbers $\mathbb{C}$. If $\mathbb{F} = \mathbb{C}$ (resp. if $\mathbb{F} = \mathbb{R}$) then $H$ is called a complex Hilbert space (resp. a real Hilbert space). Every real Hilbert space can be extended to a unique (up to bijective isometry) complex Hilbert space, called its complexification, which is why Hilbert spaces are often automatically assumed to be complex. Real and complex Hilbert spaces share many, but by no means all, properties and results.

This article is intended for both mathematicians and physicists and describes the theorem for both. In both mathematics and physics, if a Hilbert space is assumed to be real (that is, if $\mathbb{F} = \mathbb{R}$) then this is usually made clear. In physics especially, unless indicated otherwise, "Hilbert space" usually means "complex Hilbert space." In mathematics, depending on the author, "Hilbert space" usually means either (1) a complex Hilbert space, or (2) a real or complex Hilbert space.

Linear and antilinear maps

By definition, an antilinear map (also called a conjugate-linear map) $f : H \to Y$ is a map between vector spaces that is additive:

$$f(x + y) = f(x) + f(y) \quad \text{for all } x, y \in H,$$
and antilinear (also called conjugate-linear or conjugate-homogeneous):
$$f(c x) = \overline{c} f(x) \quad \text{for all } x \in H \text{ and all scalars } c \in \mathbb{F}.$$

In contrast, a map $f : H \to Y$ is linear if it is additive and homogeneous:

$$f(c x) = c f(x) \quad \text{for all } x \in H \text{ and all scalars } c \in \mathbb{F}.$$

Every constant $0$ map is always both linear and antilinear. If $\mathbb{F} = \mathbb{R}$ then the definitions of linear maps and antilinear maps are completely identical. A linear map from a Hilbert space into a Banach space (or more generally, from any Banach space into any topological vector space) is continuous if and only if it is bounded; the same is true of antilinear maps. The inverse of any antilinear (resp. linear) bijection is again an antilinear (resp. linear) bijection. The composition of two antilinear maps is a linear map.
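The difference between homogeneity and conjugate-homogeneity can be checked numerically. The following is a minimal sketch on the one-dimensional space of complex numbers; the maps `anti_a` and `anti_b` and the sample values are hypothetical choices made only for illustration.

```python
# Two antilinear maps C -> C (illustrative choices, not taken from the article).
def anti_a(z: complex) -> complex:
    return (1 + 2j) * z.conjugate()

def anti_b(z: complex) -> complex:
    return (3 - 1j) * z.conjugate()

def composed(z: complex) -> complex:
    """Composition of two antilinear maps, which should be linear."""
    return anti_a(anti_b(z))

# Sample scalar and vectors with small integer parts, so float arithmetic is exact.
c, x, y = 2 + 1j, 1 - 1j, 3 + 2j

additive = anti_a(x + y) == anti_a(x) + anti_a(y)
conjugate_homogeneous = anti_a(c * x) == c.conjugate() * anti_a(x)
not_homogeneous = anti_a(c * x) != c * anti_a(x)
composition_homogeneous = composed(c * x) == c * composed(x)

print(additive, conjugate_homogeneous, not_homogeneous, composition_homogeneous)
```

Each flag records one of the properties stated above for this particular choice of maps; it is a spot check at a few points, not a proof.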

Continuous dual and anti-dual spaces

A functional on $H$ is a function $H \to \mathbb{F}$ whose codomain is the underlying scalar field $\mathbb{F}$. Denote by $H^*$ (resp. by $\overline{H}^*$) the set of all continuous linear (resp. continuous antilinear) functionals on $H$, which is called the (continuous) dual space (resp. the (continuous) anti-dual space) of $H$.[1] If $\mathbb{F} = \mathbb{R}$ then linear functionals on $H$ are the same as antilinear functionals and consequently, the same is true for such continuous maps: that is, $H^* = \overline{H}^*$.

One-to-one correspondence between linear and antilinear functionals

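Given any functional $f : H \to \mathbb{F}$, the conjugate of $f$ is the functional

$$\overline{f} : H \to \mathbb{F}, \qquad h \mapsto \overline{f(h)}.$$

This assignment is most useful when $\mathbb{F} = \mathbb{C}$ because if $\mathbb{F} = \mathbb{R}$ then $f = \overline{f}$ and the assignment $f \mapsto \overline{f}$ reduces to the identity map.

The assignment $f \mapsto \overline{f}$ defines an antilinear bijective correspondence from the set of

all functionals (resp. all linear functionals, all continuous linear functionals $H^*$) on $H$

onto the set of

all functionals (resp. all antilinear functionals, all continuous antilinear functionals $\overline{H}^*$) on $H$.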

Mathematics vs. physics notations and definitions of inner product

The Hilbert space $H$ has an associated inner product $H \times H \to \mathbb{F}$ valued in $H$'s underlying scalar field $\mathbb{F}$ that is linear in one coordinate and antilinear in the other (as described in detail below). If $H$ is a complex Hilbert space (meaning, if $\mathbb{F} = \mathbb{C}$), which is very often the case, then which coordinate is antilinear and which is linear becomes a very important technicality. However, if $\mathbb{F} = \mathbb{R}$ then the inner product is a symmetric map that is simultaneously linear in each coordinate (that is, bilinear) and antilinear in each coordinate. Consequently, the question of which coordinate is linear and which is antilinear is irrelevant for real Hilbert spaces.

Notation for the inner product

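In mathematics, the inner product on a Hilbert space $H$ is often denoted by $\langle \cdot, \cdot \rangle$ or $\langle \cdot, \cdot \rangle_H$ while in physics, the bra-ket notation $\langle \cdot \mid \cdot \rangle$ or $\langle \cdot \mid \cdot \rangle_H$ is typically used instead. In this article, these two notations will be related by the equality:

$$\langle x, y \rangle := \langle y \mid x \rangle \quad \text{for all } x, y \in H.$$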

Completing definitions of the inner product

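The maps $\langle \cdot, \cdot \rangle$ and $\langle \cdot \mid \cdot \rangle$ are assumed to have the following two properties:

  1. The map $\langle \cdot, \cdot \rangle$ is linear in its first coordinate; equivalently, the map $\langle \cdot \mid \cdot \rangle$ is linear in its second coordinate. Explicitly, this means that for every fixed $y \in H$, the map that is denoted by $\langle y \mid \cdot \rangle = \langle \cdot, y \rangle : H \to \mathbb{F}$ and defined by
     $$h \mapsto \langle y \mid h \rangle = \langle h, y \rangle \quad \text{for all } h \in H$$
    is a linear functional on $H$.
    • In fact, this linear functional is continuous, so $\langle y \mid \cdot \rangle = \langle \cdot, y \rangle \in H^*$.
  2. The map $\langle \cdot, \cdot \rangle$ is antilinear in its second coordinate; equivalently, the map $\langle \cdot \mid \cdot \rangle$ is antilinear in its first coordinate. Explicitly, this means that for every fixed $y \in H$, the map that is denoted by $\langle \cdot \mid y \rangle = \langle y, \cdot \rangle : H \to \mathbb{F}$ and defined by
     $$h \mapsto \langle h \mid y \rangle = \langle y, h \rangle \quad \text{for all } h \in H$$
    is an antilinear functional on $H$.
    • In fact, this antilinear functional is continuous, so $\langle \cdot \mid y \rangle = \langle y, \cdot \rangle \in \overline{H}^*$.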

In mathematics, the prevailing convention (i.e. the definition of an inner product) is that the inner product is linear in the first coordinate and antilinear in the other coordinate. In physics, the convention is the opposite: the inner product is linear in the second coordinate and antilinear in the other coordinate. This article will not choose one definition over the other. Instead, the assumptions made above ensure that the mathematics notation $\langle \cdot, \cdot \rangle$ satisfies the mathematical convention for the inner product (that is, linear in the first coordinate and antilinear in the other), while the physics bra-ket notation $\langle \cdot \mid \cdot \rangle$ satisfies the physics convention for the inner product (that is, linear in the second coordinate and antilinear in the other). Consequently, the above two assumptions make the notation used in each field consistent with that field's convention for which coordinate is linear and which is antilinear.
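The two conventions can be compared side by side on the finite-dimensional Hilbert space of complex coordinate vectors. The sketch below assumes the standard inner product on that space; the function names `inner` and `braket` and the sample vectors are hypothetical.

```python
def inner(x, y):
    """Mathematics convention <x, y>: linear in x, antilinear in y."""
    return sum(s * t.conjugate() for s, t in zip(x, y))

def braket(y, x):
    """Physics convention <y | x>: antilinear in y, linear in x."""
    return inner(x, y)

# Sample data with small integer parts, so float arithmetic is exact.
c = 2j
x = [1 + 1j, 2 + 0j]
y = [3 + 0j, 1 - 1j]
cx = [c * t for t in x]
cy = [c * t for t in y]

linear_in_first = inner(cx, y) == c * inner(x, y)
antilinear_in_second = inner(x, cy) == c.conjugate() * inner(x, y)
braket_linear_in_second = braket(y, cx) == c * braket(y, x)
braket_antilinear_in_first = braket(cy, x) == c.conjugate() * braket(y, x)

print(linear_in_first, antilinear_in_second,
      braket_linear_in_second, braket_antilinear_in_first)
```

Both functions compute the same scalar; only the order of the arguments, and therefore which slot is called linear, differs.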

Canonical norm and inner product on the dual space and anti-dual space

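If $x = y$ then $\langle x \mid x \rangle = \langle x, x \rangle$ is a non-negative real number and the map

$$\|x\| := \sqrt{\langle x, x \rangle} = \sqrt{\langle x \mid x \rangle}$$

defines a canonical norm on $H$ that makes $H$ into a normed space.[1] As with all normed spaces, the (continuous) dual space $H^*$ carries a canonical norm, called the dual norm, that is defined by[1]

$$\|f\|_{H^*} := \sup_{\|x\| \leq 1, \, x \in H} |f(x)| \quad \text{for every } f \in H^*.$$

The canonical norm on the (continuous) anti-dual space $\overline{H}^*$, denoted by $\|f\|_{\overline{H}^*}$, is defined by using this same equation:[1]

$$\|f\|_{\overline{H}^*} := \sup_{\|x\| \leq 1, \, x \in H} |f(x)| \quad \text{for every } f \in \overline{H}^*.$$

This canonical norm on $H^*$ satisfies the parallelogram law, which means that the polarization identity can be used to define a canonical inner product on $H^*$, which this article will denote by the notations

$$\langle f, g \rangle_{H^*} := \langle g \mid f \rangle_{H^*},$$
where this inner product turns $H^*$ into a Hilbert space. There are now two ways of defining a norm on $H^*$: the norm induced by this inner product (that is, the norm defined by $f \mapsto \sqrt{\langle f, f \rangle_{H^*}}$) and the usual dual norm (defined as the supremum over the closed unit ball). These norms are the same; explicitly, this means that the following holds for every $f \in H^*$:
$$\sup_{\|x\| \leq 1, \, x \in H} |f(x)| = \|f\|_{H^*} = \sqrt{\langle f, f \rangle_{H^*}} = \sqrt{\langle f \mid f \rangle_{H^*}}.$$

As will be described later, the Riesz representation theorem can be used to give an equivalent definition of the canonical norm and the canonical inner product on $H^*$.

The same equations that were used above can also be used to define a norm and inner product on $H$'s anti-dual space $\overline{H}^*$.[1]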

Canonical isometry between the dual and antidual

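The complex conjugate $\overline{f}$ of a functional $f$, which was defined above, satisfies

$$\|f\|_{H^*} = \left\|\overline{f}\right\|_{\overline{H}^*} \quad \text{and} \quad \left\|\overline{g}\right\|_{H^*} = \|g\|_{\overline{H}^*}$$
for every $f \in H^*$ and every $g \in \overline{H}^*$. This says exactly that the canonical antilinear bijection defined by
$$H^* \to \overline{H}^*, \qquad f \mapsto \overline{f}$$
as well as its inverse $\overline{H}^* \to H^*$ are antilinear isometries and consequently also homeomorphisms. The inner products on the dual space $H^*$ and the anti-dual space $\overline{H}^*$, denoted respectively by $\langle \cdot, \cdot \rangle_{H^*}$ and $\langle \cdot, \cdot \rangle_{\overline{H}^*}$, are related by
$$\langle \overline{f} \mid \overline{g} \rangle_{\overline{H}^*} = \overline{\langle f \mid g \rangle_{H^*}} = \langle g \mid f \rangle_{H^*} \quad \text{for all } f, g \in H^*$$
and
$$\langle \overline{f} \mid \overline{g} \rangle_{H^*} = \overline{\langle f \mid g \rangle_{\overline{H}^*}} = \langle g \mid f \rangle_{\overline{H}^*} \quad \text{for all } f, g \in \overline{H}^*.$$

If $\mathbb{F} = \mathbb{R}$ then $H^* = \overline{H}^*$ and this canonical map $f \mapsto \overline{f}$ reduces to the identity map.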

Riesz representation theorem

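Two vectors $x$ and $y$ are orthogonal if $\langle x, y \rangle = 0$, which happens if and only if $\|y\| \leq \|y + s x\|$ for all scalars $s$.[2] The orthogonal complement of a subset $X \subseteq H$ is

$$X^{\bot} := \{ y \in H : \langle y, x \rangle = 0 \text{ for all } x \in X \},$$
which is always a closed vector subspace of $H$. The Hilbert projection theorem guarantees that for any nonempty closed convex subset $C$ of a Hilbert space there exists a unique vector $m \in C$ such that $\|m\| = \inf_{c \in C} \|c\|$; that is, $m \in C$ is the (unique) global minimum point of the function $C \to [0, \infty)$ defined by $c \mapsto \|c\|$.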

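Theorem: Let $H$ be a Hilbert space whose inner product $\langle x, y \rangle$ is linear in its first argument and antilinear in its second argument (the notation $\langle y \mid x \rangle := \langle x, y \rangle$ is used in physics). For every continuous linear functional $\varphi \in H^*$, there exists a unique $f_\varphi \in H$ such that

$$\varphi(x) = \langle x, f_\varphi \rangle = \langle f_\varphi \mid x \rangle \quad \text{for all } x \in H.$$
  • Importantly for complex Hilbert spaces, the vector $f_\varphi \in H$, which is called the Riesz representation of $\varphi$, is always located in the antilinear coordinate of the inner product (no matter which notation is used).[note 1]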

Moreover,

 
and   is the unique vector in   satisfying   and   If   is non-zero then   and  

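Furthermore, with regard to the Hilbert projection theorem, $f_\varphi$ is the unique element of minimum norm in $C := \varphi^{-1}\left(\|\varphi\|^2\right)$; explicitly, this means that $f_\varphi$ is the unique element in $C$ that satisfies $\|f_\varphi\| = \inf_{c \in C} \|c\|$.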
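In finite dimensions the theorem can be checked directly. The following sketch assumes the standard inner product on two-dimensional complex coordinate space and a functional given by hypothetical coefficients `a`; under those assumptions the representing vector is the entrywise conjugate of `a`, because it has to sit in the antilinear slot.

```python
import math

def inner(x, y):
    """Standard inner product, linear in x and antilinear in y."""
    return sum(s * t.conjugate() for s, t in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

# Hypothetical functional phi(x) = 3*x1 + 4i*x2 on C^2.
a = [3 + 0j, 4j]

def phi(x):
    return sum(ai * xi for ai, xi in zip(a, x))

# Candidate Riesz vector: conjugate the coefficients of phi.
f_phi = [ai.conjugate() for ai in a]

x = [1 - 2j, 2 + 1j]                      # arbitrary test vector
u = [t / norm(f_phi) for t in f_phi]      # unit vector in the direction of f_phi

print(phi(x), inner(x, f_phi))            # the two values should agree
print(norm(f_phi), abs(phi(u)))           # norm of f_phi, attained by |phi| at u
```

The last line illustrates the norm equality: the Cauchy–Schwarz inequality bounds `abs(phi(x))` by `norm(f_phi) * norm(x)`, and the bound is attained at the unit vector `u`.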

Corollary — The canonical map from   into its dual  [1] is the injective antilinear operator isometry   defined by  [1][note 2] The Riesz representation theorem states that this map is surjective (and thus bijective) when   is complete and that its inverse is the bijective isometric antilinear isomorphism   defined by   Consequently, every continuous linear functional on the Hilbert space   can be written uniquely in the form  [1] where   for every   The assignment   can also be viewed as a bijective linear isometry   into the anti-dual space of  [1] which is the complex conjugate vector space of the continuous dual space  

The inner products on   and   are related by

 
and similarly,
 

The set   satisfies   and   so when   then   can be interpreted as being an affine hyperplane[note 3] that is parallel to the vector subspace  

For   the physics notation for the functional   is the bra   where explicitly this means that   which complements the ket notation   defined by   In the mathematical treatment of quantum mechanics, the theorem can be seen as a justification for the popular bra–ket notation. The theorem says that every bra   has a corresponding ket   and the latter is unique.

Historically, the theorem is often attributed simultaneously to Riesz and Fréchet in 1907 (see references).

Proof[3]

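Let $\mathbb{F}$ denote the underlying scalar field of $H$.

Proof of norm formula:

Fix $y \in H$. Define $\Lambda : H \to \mathbb{F}$ by $\Lambda(z) := \langle y \mid z \rangle$, which is a linear functional on $H$ since $z$ is in the linear argument. By the Cauchy–Schwarz inequality,

$$|\Lambda(z)| = |\langle y \mid z \rangle| \leq \|y\| \|z\|,$$
which shows that $\Lambda$ is bounded (equivalently, continuous) and that $\|\Lambda\| \leq \|y\|$. It remains to show that $\|y\| \leq \|\Lambda\|$. By using $y$ in place of $z$, it follows that
$$\|y\|^2 = \langle y \mid y \rangle = \Lambda(y) = |\Lambda(y)| \leq \|\Lambda\| \|y\|$$
(the equality $\Lambda(y) = |\Lambda(y)|$ holds because $\Lambda(y) = \|y\|^2 \geq 0$ is real and non-negative). Thus $\|\Lambda\| = \|y\|$. $\blacksquare$

The proof above did not use the fact that $H$ is complete, which shows that the formula for the norm $\|\langle y \mid \cdot \rangle\|_{H^*} = \|y\|_H$ holds more generally for all inner product spaces.

Proof that a Riesz representation of $\varphi$ is unique:

Suppose $f, g \in H$ are such that $\varphi(z) = \langle f \mid z \rangle$ and $\varphi(z) = \langle g \mid z \rangle$ for all $z \in H$. Then

$$\langle f - g \mid z \rangle = \langle f \mid z \rangle - \langle g \mid z \rangle = \varphi(z) - \varphi(z) = 0 \quad \text{for all } z \in H,$$
which shows that $\langle f - g \mid \cdot \rangle$ is the constant $0$ linear functional. Consequently $0 = \|\langle f - g \mid \cdot \rangle\| = \|f - g\|$, which implies that $f - g = 0$. $\blacksquare$

Proof that a vector $f_\varphi$ representing $\varphi$ exists:

Let $K := \ker \varphi := \{ m \in H : \varphi(m) = 0 \}$. If $K = H$ (or equivalently, if $\varphi = 0$) then taking $f_\varphi := 0$ completes the proof so assume that $K \neq H$ and $\varphi \neq 0$. The continuity of $\varphi$ implies that $K$ is a closed subspace of $H$ (because $K = \varphi^{-1}(\{0\})$ and $\{0\}$ is a closed subset of $\mathbb{F}$). Let

$$K^{\bot} := \{ v \in H : \langle v \mid k \rangle = 0 \text{ for all } k \in K \}$$
denote the orthogonal complement of $K$ in $H$. Because $K$ is closed and $H$ is a Hilbert space,[note 4] $H$ can be written as the direct sum $H = K \oplus K^{\bot}$[note 5] (a proof of this is given in the article on the Hilbert projection theorem). Because $K \neq H$, there exists some non-zero $p \in K^{\bot}$. For any $h \in H$,
$$\varphi[(\varphi h) p - (\varphi p) h] = \varphi[(\varphi h) p] - \varphi[(\varphi p) h] = (\varphi h)(\varphi p) - (\varphi p)(\varphi h) = 0,$$
which shows that $(\varphi h) p - (\varphi p) h \in \ker \varphi = K$, where now $p \in K^{\bot}$ implies
$$0 = \langle p \mid (\varphi h) p - (\varphi p) h \rangle = (\varphi h) \langle p \mid p \rangle - (\varphi p) \langle p \mid h \rangle.$$
Solving for $\varphi h$ shows that
$$\varphi h = \frac{(\varphi p) \langle p \mid h \rangle}{\|p\|^2} = \left\langle \frac{\overline{\varphi p}}{\|p\|^2} p \,\Big|\, h \right\rangle \quad \text{for every } h \in H,$$
which proves that the vector $f_\varphi := \frac{\overline{\varphi p}}{\|p\|^2} p$ satisfies $\varphi h = \langle f_\varphi \mid h \rangle$ for every $h \in H$.

Applying the norm formula that was proved above with $y := f_\varphi$ shows that $\|\varphi\|_{H^*} = \|\langle f_\varphi \mid \cdot \rangle\|_{H^*} = \|f_\varphi\|_H$. Also, the vector $u := \frac{p}{\|p\|}$ has norm $\|u\| = 1$ and satisfies $f_\varphi = \overline{\varphi(u)} u$. $\blacksquare$

It can now be deduced that $K^{\bot}$ is $1$-dimensional when $\varphi \neq 0$. Let $q \in K^{\bot}$ be any non-zero vector. Replacing $p$ with $q$ in the proof above shows that the vector $g := \frac{\overline{\varphi q}}{\|q\|^2} q$ satisfies $\varphi(h) = \langle g \mid h \rangle$ for every $h \in H$. The uniqueness of the (non-zero) vector $f_\varphi$ representing $\varphi$ implies that $f_\varphi = g$, which in turn implies that $\overline{\varphi q} \neq 0$ and $q = \frac{\|q\|^2}{\overline{\varphi q}} f_\varphi$. Thus every vector in $K^{\bot}$ is a scalar multiple of $f_\varphi$. $\blacksquare$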

The formulas for the inner products follow from the polarization identity.

Observations

If   then

 
So in particular,   is always real and furthermore,   if and only if   if and only if  

Linear functionals as affine hyperplanes

A non-trivial continuous linear functional   is often interpreted geometrically by identifying it with the affine hyperplane   (the kernel   is also often visualized alongside   although knowing   is enough to reconstruct   because if   then   and otherwise  ). In particular, the norm of   should somehow be interpretable as the "norm of the hyperplane  ". When   then the Riesz representation theorem provides such an interpretation of   in terms of the affine hyperplane[note 3]  as follows: using the notation from the theorem's statement, from   it follows that   and so   implies   and thus   This can also be seen by applying the Hilbert projection theorem to   and concluding that the global minimum point of the map   defined by   is   The formulas

 
provide the promised interpretation of the linear functional's norm   entirely in terms of its associated affine hyperplane   (because with this formula, knowing only the set   is enough to describe the norm of its associated linear functional). Defining   the infimum formula
 
will also hold when  

Constructions of the representing vector

Using the notation from the theorem above, several ways of constructing $f_\varphi$ from $\varphi \in H^*$ are now described. If $\varphi = 0$ then $f_\varphi := 0$; in other words,

$$f_0 = 0.$$

This special case of $\varphi = 0$ is henceforth assumed to be known, which is why some of the constructions given below start by assuming $\varphi \neq 0$.

Orthogonal complement of kernel

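If $\varphi \neq 0$ then for any non-zero $q \in (\ker \varphi)^{\bot}$,

$$f_\varphi := \frac{\overline{\varphi(q)}}{\|q\|^2} q.$$

If $u \in (\ker \varphi)^{\bot}$ is a unit vector (meaning $\|u\| = 1$) then

$$f_\varphi := \overline{\varphi(u)} u$$
(this is true even if $\varphi = 0$ because in this case $f_\varphi = \overline{\varphi(u)} u = \overline{0} u = 0$). If $u$ is a unit vector satisfying the above condition then the same is true of $-u$, which is also a unit vector in $(\ker \varphi)^{\bot}$. However, $\overline{\varphi(-u)} (-u) = \overline{\varphi(u)} u = f_\varphi$ so both these vectors result in the same $f_\varphi$.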
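As a numerical illustration of building the representing vector from a non-zero element of the orthogonal complement of the kernel, the sketch below works in two-dimensional complex coordinate space with the standard inner product. The coefficients `a`, the kernel vector `k`, and the starting vector `z` are hypothetical; the vector `q` is obtained by subtracting from `z` its orthogonal projection onto the kernel, and the candidate is then the conjugate of `phi(q)` divided by the squared norm of `q`, times `q`.

```python
def inner(x, y):
    """Standard inner product, linear in x and antilinear in y."""
    return sum(s * t.conjugate() for s, t in zip(x, y))

# Hypothetical functional phi(x) = (1+i)*x1 + 2*x2 on C^2.
a = [1 + 1j, 2 + 0j]

def phi(x):
    return sum(ai * xi for ai, xi in zip(a, x))

# In C^2 the kernel of a non-zero phi is the line spanned by (a2, -a1).
k = [a[1], -a[0]]

# Pick any z with phi(z) != 0 and remove its component along the kernel.
z = [1 + 0j, 0j]
coeff = inner(z, k) / inner(k, k)
z_K = [coeff * t for t in k]                  # orthogonal projection of z onto ker(phi)
q = [zi - pi for zi, pi in zip(z, z_K)]       # non-zero vector orthogonal to ker(phi)

# Candidate representing vector: conj(phi(q)) / ||q||^2 * q.
scale = phi(q).conjugate() / inner(q, q)
f_phi = [scale * t for t in q]

print(f_phi)   # compare with the entrywise conjugate of a
```

Up to rounding, the result should not depend on which admissible `z` is chosen, since the orthogonal complement of the kernel is one-dimensional.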

Orthogonal projection onto kernel

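If $x \in H$ is such that $\varphi(x) \neq 0$ and if $x_K$ is the orthogonal projection of $x$ onto $\ker \varphi$ then[proof 1]

$$f_\varphi = \frac{\|\varphi\|^2}{\varphi(x)} (x - x_K).$$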

Orthonormal basis

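Given an orthonormal basis $\{e_i\}_{i \in I}$ of $H$ and a continuous linear functional $\varphi \in H^*$, the vector $f_\varphi \in H$ can be constructed uniquely by

$$f_\varphi = \sum_{i \in I} \overline{\varphi(e_i)} e_i,$$
where all but at most countably many $\varphi(e_i)$ will be equal to $0$ and where the value of $f_\varphi$ does not actually depend on choice of orthonormal basis (that is, using any other orthonormal basis for $H$ will result in the same vector). If $y \in H$ is written as $y = \sum_{i \in I} a_i e_i$ then
$$\varphi(y) = \sum_{i \in I} \varphi(e_i) a_i = \langle f_\varphi \mid y \rangle$$
and
$$\|f_\varphi\|^2 = \varphi(f_\varphi) = \sum_{i \in I} \varphi(e_i) \overline{\varphi(e_i)} = \sum_{i \in I} |\varphi(e_i)|^2 = \|\varphi\|^2.$$

If the orthonormal basis $\{e_i\}_{i \in I} = \{e_i\}_{i=1}^{\infty}$ is a sequence then this becomes

$$f_\varphi = \overline{\varphi(e_1)} e_1 + \overline{\varphi(e_2)} e_2 + \cdots$$
and if $y \in H$ is written as $y = a_1 e_1 + a_2 e_2 + \cdots$ then
$$\varphi(y) = \varphi(e_1) a_1 + \varphi(e_2) a_2 + \cdots.$$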
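The orthonormal-basis construction, in which the representing vector is the sum of the basis vectors weighted by the conjugated values of the functional on them, can be tried out in two dimensions. This is a minimal sketch assuming the standard inner product on complex coordinate space; the functional coefficients `a` and the two bases are hypothetical choices.

```python
def inner(x, y):
    """Standard inner product, linear in x and antilinear in y."""
    return sum(s * t.conjugate() for s, t in zip(x, y))

# Hypothetical functional phi(x) = 2*x1 + (1+i)*x2 on C^2.
a = [2 + 0j, 1 + 1j]

def phi(x):
    return sum(ai * xi for ai, xi in zip(a, x))

def riesz_vector(basis):
    """Sum of conj(phi(e)) * e over the vectors e of an orthonormal basis."""
    f = [0j] * len(basis[0])
    for e in basis:
        coeff = phi(e).conjugate()
        f = [fi + coeff * ei for fi, ei in zip(f, e)]
    return f

# Two different orthonormal bases of C^2.
standard_basis = [[1 + 0j, 0j], [0j, 1 + 0j]]
rotated_basis = [[0.6 + 0j, 0.8j], [0.8j, 0.6 + 0j]]

f_standard = riesz_vector(standard_basis)
f_rotated = riesz_vector(rotated_basis)

print(f_standard)
print(f_rotated)   # should agree with f_standard up to rounding
```

Using two bases is a small check of the claim that the resulting vector does not depend on the choice of orthonormal basis.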

Relationship with the associated real Hilbert space

Assume that   is a complex Hilbert space with inner product   When the Hilbert space   is reinterpreted as a real Hilbert space then it will be denoted by   where the (real) inner-product on   is the real part of  's inner product; that is:

 

The norm on   induced by   is equal to the original norm on   and the continuous dual space of   is the set of all real-valued bounded  -linear functionals on   (see the article about the polarization identity for additional details about this relationship). Let   and   denote the real and imaginary parts of a linear functional   so that   The formula expressing a linear functional in terms of its real part is

 
where   for all   It follows that   and that   if and only if   It can also be shown that   where   with   defined similarly. In particular, the linear functional   is bounded if and only if its real part   is bounded.
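The standard identity that recovers a complex-linear functional from its real part, namely that the value at a vector equals the real part at that vector minus the imaginary unit times the real part at the vector multiplied by the imaginary unit, can be spot-checked numerically. The sketch below uses a hypothetical functional on two-dimensional complex coordinate space; the names `phi_real` and `rebuilt_from_real_part` are illustrative.

```python
# Hypothetical functional phi(x) = (1+2i)*x1 + (3-i)*x2 on C^2.
a = [1 + 2j, 3 - 1j]

def phi(x):
    return sum(ai * xi for ai, xi in zip(a, x))

def phi_real(x):
    """Real part of phi, a real-linear functional on the underlying real space."""
    return phi(x).real

def rebuilt_from_real_part(x):
    """Rebuild phi(x) as phi_real(x) - i * phi_real(i * x)."""
    ix = [1j * t for t in x]
    return complex(phi_real(x), -phi_real(ix))

h = [2 - 1j, 1 + 1j]
print(phi(h), rebuilt_from_real_part(h))   # the two values should agree
```

The check works because multiplying the argument by the imaginary unit rotates the value of the functional by a quarter turn, which moves the imaginary part into the real part up to sign.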

Representing a functional and its real part

Let   and as usual, let   be such that   for all   Let

 
denote the kernel of the real part   of   If   denotes the unique vector in   such that   for all   then   This follows from the main theorem because if   then
 
and consequently, if   then   which shows that   Moreover, because   is real,   In other words, in the theorem and constructions above, if   is replaced with its real Hilbert space counterpart   and if   is replaced with   then   This means that the vector   obtained by using   and the real linear functional   is equal to the vector obtained by using the original complex Hilbert space   and the original complex linear functional   (with identical norm values as well).

Assume now that   Then   because   and   is a proper subset of   The kernel   has real codimension   in   where   has real codimension   in   and   That is,   is perpendicular to   with respect to  

Properties of canonical injections from a Hilbert space to its dual and anti-dual

Induced linear map into anti-dual

The map defined by placing   into the linear coordinate of the inner product and letting the variable   vary over the antilinear coordinate results in an antilinear functional:

 

This map is an element of   which is the continuous anti-dual space of   The canonical map from   into its anti-dual  [1] is the linear operator

 
which is also an injective isometry.[1] The Fundamental theorem of Hilbert spaces, which is related to Riesz representation theorem, states that this map is surjective (and thus bijective). Consequently, every antilinear functional on   can be written (uniquely) in this form.[1]

If   is the canonical antilinear bijective isometry   that was defined above, then the following equality holds:

 

Extending the bra-ket notation to bras and kets

Let   be a Hilbert space and as before, let   Let   be the bijective antilinear isometry defined by

 
so that by definition
 

Bras

Given a vector   let   denote the continuous linear functional  ; that is,

 
so that this functional   is defined by   This map was denoted by   earlier in this article.

The assignment   is just the isometric antilinear isomorphism   which is why   holds for all   and all scalars   The result of plugging some given   into the functional   is the scalar   which may be denoted by  [note 6]

Bra of a linear functional

Given a continuous linear functional   let   denote the vector  ; that is,

 

The assignment   is just the isometric antilinear isomorphism   which is why   holds for all   and all scalars  

The defining condition of the vector   is the technically correct but unsightly equality

 
which is why the notation   is used in place of   The defining condition becomes
 

Kets

For any given vector   the notation   is used to denote  ; that is,

 

The assignment   is just the identity map   which is why   holds for all   and all scalars  

The notation   and   is used in place of   and   respectively. As expected,   and   really is just the scalar  

Adjoints and transposes

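Let $A : H \to Z$ be a continuous linear operator between Hilbert spaces $(H, \langle \cdot, \cdot \rangle_H)$ and $(Z, \langle \cdot, \cdot \rangle_Z)$. As before, let $\langle y \mid x \rangle_H := \langle x, y \rangle_H$ and $\langle y \mid x \rangle_Z := \langle x, y \rangle_Z$.

Let $\Phi_H : H \to H^*$ and $\Phi_Z : Z \to Z^*$ be the bijective antilinear isometries defined respectively by

$$g \mapsto \langle g \mid \cdot \rangle_H = \langle \cdot, g \rangle_H \quad \text{and} \quad z \mapsto \langle z \mid \cdot \rangle_Z = \langle \cdot, z \rangle_Z$$
so that by definition
$$(\Phi_H g) h = \langle g \mid h \rangle_H \text{ for all } g, h \in H \quad \text{and} \quad (\Phi_Z z) y = \langle z \mid y \rangle_Z \text{ for all } y, z \in Z.$$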

Definition of the adjoint

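For every $z \in Z$, the scalar-valued map $\langle z \mid A(\cdot) \rangle_Z$[note 7] on $H$ defined by

$$h \mapsto \langle z \mid A h \rangle_Z = \langle A h, z \rangle_Z$$

is a continuous linear functional on $H$ and so by the Riesz representation theorem, there exists a unique vector in $H$, denoted by $A^* z$, such that $\langle z \mid A(\cdot) \rangle_Z = \langle A^* z \mid \cdot \rangle_H$, or equivalently, such that

$$\langle z \mid A h \rangle_Z = \langle A^* z \mid h \rangle_H \quad \text{for all } h \in H.$$

The assignment $z \mapsto A^* z$ thus induces a function $A^* : Z \to H$ called the adjoint of $A : H \to Z$ whose defining condition is

$$\langle z \mid A h \rangle_Z = \langle A^* z \mid h \rangle_H \quad \text{for all } h \in H \text{ and all } z \in Z.$$

The adjoint $A^* : Z \to H$ is necessarily a continuous (equivalently, a bounded) linear operator.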
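For operators between finite-dimensional complex coordinate spaces with their standard inner products, the adjoint is given by the conjugate transpose of the matrix. The sketch below checks the defining condition for one hypothetical matrix `A` and one pair of vectors; the helper names `matvec` and `adjoint` are illustrative.

```python
def inner(x, y):
    """Standard inner product, linear in x and antilinear in y."""
    return sum(s * t.conjugate() for s, t in zip(x, y))

def matvec(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [sum(entry * t for entry, t in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of A."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]

# Hypothetical operator from C^2 to C^3, written as a 3x2 matrix.
A = [[1 + 1j, 2 + 0j],
     [0j,     3 - 1j],
     [1j,     1 + 0j]]

h = [1 - 1j, 2 + 1j]          # vector in the domain C^2
z = [1 + 0j, 1j, 2 - 1j]      # vector in the codomain C^3

lhs = inner(matvec(A, h), z)             # <A h, z> computed in C^3
rhs = inner(h, matvec(adjoint(A), z))    # <h, A* z> computed in C^2

print(lhs, rhs)   # the two values should agree
```

In the bra-ket notation used above, `lhs` is the inner product with `z` in the bra and `A h` in the ket, and `rhs` is the one with the adjoint applied to `z` in the bra and `h` in the ket.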

Adjoints are transposes

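It is also possible to define the transpose or algebraic adjoint of $A : H \to Z$, which is the map ${}^{t}A : Z^* \to H^*$ defined by sending a continuous linear functional $\psi \in Z^*$ to

$${}^{t}A(\psi) := \psi \circ A,$$
where $\psi \circ A$ is always a continuous linear functional on $H$. It satisfies $\|A\| = \left\|{}^{t}A\right\|$ (this is true more generally, when $H$ and $Z$ are merely normed spaces).[4]

The adjoint $A^* : Z \to H$ is actually just the transpose ${}^{t}A : Z^* \to H^*$[2] when the Riesz representation theorem is used to identify $Z$ with $Z^*$ and $H$ with $H^*$.

Explicitly, the relationship between the adjoint and transpose can be shown[proof 2] to be:

$${}^{t}A \circ \Phi_Z = \Phi_H \circ A^* \qquad \text{(Adjoint-transpose)}$$

which can be rewritten as:

$$A^* = \Phi_H^{-1} \circ {}^{t}A \circ \Phi_Z \quad \text{and} \quad {}^{t}A = \Phi_H \circ A^* \circ \Phi_Z^{-1}.$$

Given any $z \in Z$, the left and right hand sides of equality (Adjoint-transpose) can be rewritten in terms of the inner products:

$$\left({}^{t}A \circ \Phi_Z\right) z = \langle z \mid A(\cdot) \rangle_Z \quad \text{and} \quad \left(\Phi_H \circ A^*\right) z = \langle A^* z \mid \cdot \rangle_H,$$
where as before, $\langle z \mid A(\cdot) \rangle_Z$ denotes the continuous linear functional on $H$ defined by $h \mapsto \langle z \mid A h \rangle_Z$.[note 7]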

Descriptions of self-adjoint, normal, and unitary operators

Assume   and let   Let   be a continuous (that is, bounded) linear operator.

Whether or not   is self-adjoint, normal, or unitary depends entirely on whether or not   satisfies certain defining conditions related to its adjoint, which was shown by (Adjoint-transpose) to essentially be just the transpose   Because the transpose of   is a map between continuous linear functionals, these defining conditions can consequently be re-expressed entirely in terms of linear functionals, as the remainder of this subsection will now describe in detail. The linear functionals that are involved are the simplest possible continuous linear functionals on   that can be defined entirely in terms of   the inner product   on   and some given vector   These "elementary  -induced" continuous linear functionals are   and  [note 7] where

 

Self-adjoint operators

A continuous linear operator   is called self-adjoint if it is equal to its own adjoint; that is, if   Using (Adjoint-transpose), this happens if and only if:

 
where this equality can be rewritten in the following two equivalent forms:
 

Unraveling notation and definitions produces the following characterization of self-adjoint operators in terms of the aforementioned "elementary  -induced" continuous linear functionals:   is self-adjoint if and only if for all   the linear functional  [note 7] is equal to the linear functional  ; that is, if and only if

 

 

 

 

 

(Self-adjointness functionals)

Normal operators

A continuous linear operator   is called normal if   which happens if and only if for all  

 

Using (Adjoint-transpose) and unraveling notation and definitions produces[proof 3] the following characterization of normal operators in terms of inner products of the "elementary  -induced" continuous linear functionals:   is a normal operator if and only if

 

 

 

 

 

(Normality functionals)

The left hand side of this characterization is also equal to   The continuous linear functionals   and   are defined as above.[note 7]

The fact that every self-adjoint bounded linear operator is normal follows readily by direct substitution of   into either side of   This same fact also follows immediately from the direct substitution of the equalities (Self-adjointness functionals) into either side of (Normality functionals).

Alternatively, for a complex Hilbert space, the continuous linear operator   is a normal operator if and only if   for every  [2] which happens if and only if

 

Unitary operators

An invertible bounded linear operator   is said to be unitary if its inverse is its adjoint:   By using (Adjoint-transpose), this is seen to be equivalent to   Unraveling notation and definitions, it follows that   is unitary if and only if

 

The fact that a bounded invertible linear operator   is unitary if and only if   (or equivalently,  ) produces another (well-known) characterization: an invertible bounded linear map   is unitary if and only if

 

Because   is invertible (and so in particular a bijection), this is also true of the transpose   This fact also allows the vector   in the above characterizations to be replaced with   or   thereby producing many more equalities. Similarly,   can be replaced with   or  
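The well-known characterizations mentioned in this section, that a unitary operator preserves inner products and that a normal operator and its adjoint give every vector the same norm, can be spot-checked on a small matrix. This is a sketch under the assumption of the standard inner product on two-dimensional complex coordinate space; the matrix `U` and the sample vectors are hypothetical.

```python
def inner(x, y):
    """Standard inner product, linear in x and antilinear in y."""
    return sum(s * t.conjugate() for s, t in zip(x, y))

def matvec(A, x):
    return [sum(entry * t for entry, t in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of A."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]

# Hypothetical unitary matrix: swaps the coordinates and multiplies by i.
U = [[0j, 1j],
     [1j, 0j]]

x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]

Ux, Uy = matvec(U, x), matvec(U, y)
U_star_x = matvec(adjoint(U), x)

preserves_inner_product = inner(Ux, Uy) == inner(x, y)
adjoint_inverts = matvec(adjoint(U), Ux) == x
same_norm_as_adjoint = inner(Ux, Ux) == inner(U_star_x, U_star_x)

print(preserves_inner_product, adjoint_inverts, same_norm_as_adjoint)
```

The entries were chosen with integer real and imaginary parts so that the comparisons are exact; for general floating-point matrices a tolerance-based comparison would be needed instead.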

Citations

  1. ^ a b c d e f g h i j k l Trèves 2006, pp. 112–123.
  2. ^ a b c Rudin 1991, pp. 306–312.
  3. ^ Rudin 1991, pp. 307–309.
  4. ^ Rudin 1991, pp. 92–115.

Notes

  1. ^ If   then the inner product will be symmetric so it doesn't matter which coordinate of the inner product the element   is placed into because the same map will result. But if   then except for the constant   map, antilinear functionals on   are completely distinct from linear functionals on   which makes the coordinate that   is placed into very important. For a non-zero   to induce a linear functional (rather than an antilinear functional),   must be placed into the antilinear coordinate of the inner product. If it is incorrectly placed into the linear coordinate instead of the antilinear coordinate then the resulting map will be the antilinear map   which is not a linear functional on   and so it will not be an element of the continuous dual space  
  2. ^ This means that for all vectors   (1)   is injective. (2) The norms of   and   are the same:   (3)   is an additive map, meaning that   for all   (4)   is conjugate homogeneous:   for all scalars   (5)   is real homogeneous:   for all real numbers  
  3. ^ a b This footnote explains how to define - using only  's operations - addition and scalar multiplication of affine hyperplanes so that these operations correspond to addition and scalar multiplication of linear functionals. Let   be any vector space and let   denote its algebraic dual space. Let   and let   and   denote the (unique) vector space operations on   that make the bijection   defined by   into a vector space isomorphism. Note that   if and only if   so   is the additive identity of   (because this is true of   in   and   is a vector space isomorphism). For every   let   if   and let   otherwise; if   then   so this definition is consistent with the usual definition of the kernel of a linear functional. Say that   are parallel if   where if   and   are not empty then this happens if and only if the linear functionals   and   are non-zero scalar multiples of each other. The vector space operations on the vector space of affine hyperplanes   are now described in a way that involves only the vector space operations on  ; this results in an interpretation of the vector space operations on the algebraic dual space   that is entirely in terms of affine hyperplanes. Fix hyperplanes   If   is a scalar then   Describing the operation   in terms of only the sets   and   is more complicated because by definition,   If   (respectively, if  ) then   is equal to   (resp. is equal to  ) so assume   and   The hyperplanes   and   are parallel if and only if there exists some scalar   (necessarily non-0) such that