Cauchy–Schwarz inequality

In mathematics, the Cauchy–Schwarz inequality, also known as the Cauchy–Bunyakovsky–Schwarz inequality, is an inequality encountered in many different settings, such as linear algebra, analysis, probability theory and vector algebra. It is considered to be one of the most important inequalities in all of mathematics.

The inequality for sums was published by Augustin-Louis Cauchy (1821), while the corresponding inequality for integrals was first proved by Viktor Bunyakovsky (1859). The modern proof of the integral inequality was given by Hermann Amandus Schwarz (1888).

Statement of the inequality

The Cauchy–Schwarz inequality states that for all vectors $\mathbf {u}$  and $\mathbf {v}$  of an inner product space it is true that

$|\langle \mathbf {u} ,\mathbf {v} \rangle |^{2}\leq \langle \mathbf {u} ,\mathbf {u} \rangle \cdot \langle \mathbf {v} ,\mathbf {v} \rangle ,$

where $\langle \cdot ,\cdot \rangle$  is the inner product. Examples of inner products include the real and complex dot product; see the examples in inner product. Equivalently, by taking the square root of both sides, and referring to the norms of the vectors, the inequality is written as

$|\langle \mathbf {u} ,\mathbf {v} \rangle |\leq \|\mathbf {u} \|\|\mathbf {v} \|.$

Moreover, the two sides are equal if and only if $\mathbf {u}$  and $\mathbf {v}$  are linearly dependent (meaning they are parallel: one of the vectors is the zero vector, or one is a scalar multiple of the other).

If $u_{1},\ldots ,u_{n}\in \mathbb {C}$  and $v_{1},\ldots ,v_{n}\in \mathbb {C}$ , and the inner product is the standard complex inner product, then the inequality may be restated more explicitly as follows (where the bar notation is used for complex conjugation):

$|u_{1}{\bar {v}}_{1}+\cdots +u_{n}{\bar {v}}_{n}|^{2}\leq (|u_{1}|^{2}+\cdots +|u_{n}|^{2})(|v_{1}|^{2}+\cdots +|v_{n}|^{2})$

or

$\left|\sum _{i=1}^{n}u_{i}{\bar {v}}_{i}\right|^{2}\leq \sum _{j=1}^{n}|u_{j}|^{2}\sum _{k=1}^{n}|v_{k}|^{2}.$
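The explicit complex form can be checked numerically. The following is a minimal sketch in plain Python, not a proof; the helper names `inner` and `cauchy_schwarz_gap` and the sample vectors are illustrative assumptions, not part of the statement.

```python
def inner(u, v):
    """Standard complex inner product: linear in u, conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def cauchy_schwarz_gap(u, v):
    """Return <u,u><v,v> - |<u,v>|^2, which the inequality says is nonnegative."""
    return (inner(u, u) * inner(v, v)).real - abs(inner(u, v)) ** 2

# Two sample vectors in C^3 (arbitrary illustrative values).
u = [1 + 2j, 3 - 1j, 0.5j]
v = [2 - 1j, 1j, 4 + 0j]
gap = cauchy_schwarz_gap(u, v)

# A scalar multiple of u is linearly dependent on u, so the gap should vanish
# up to floating-point rounding.
w = [(2 - 3j) * a for a in u]
gap_dependent = cauchy_schwarz_gap(u, w)
```

For these values the gap is strictly positive for the independent pair and zero, up to rounding, for the dependent pair, in line with the equality condition.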

Proofs

First proof

Let $u$  and $v$  be arbitrary vectors in a vector space over $\mathbb {F}$  with an inner product, where $\mathbb {F}$  is the field of real or complex numbers. We prove the inequality

${\big |}\langle u,v\rangle {\big |}\leq \|u\|\|v\|$

and that equality holds if and only if either $u$  or $v$  is a multiple of the other (which includes the special case that either is the zero vector).

If $v=0$ , it is clear that there is equality, and in this case $u$  and $v$  are also linearly dependent, regardless of $u$ , so the theorem is true. The same holds if $u=0$ . We henceforth assume that $v$  is nonzero.

Let

$z=u-u_{v}=u-{\frac {\langle u,v\rangle }{\langle v,v\rangle }}v.$

Then, by linearity of the inner product in its first argument, one has

$\langle z,v\rangle =\left\langle u-{\frac {\langle u,v\rangle }{\langle v,v\rangle }}v,v\right\rangle =\langle u,v\rangle -{\frac {\langle u,v\rangle }{\langle v,v\rangle }}\langle v,v\rangle =0.$

Therefore, $z$  is a vector orthogonal to the vector $v$ . (Indeed, $z$  is the projection of $u$  onto the subspace orthogonal to $v$ .) We can thus apply the Pythagorean theorem to

$u={\frac {\langle u,v\rangle }{\langle v,v\rangle }}v+z$

which gives

$\|u\|^{2}=\left|{\frac {\langle u,v\rangle }{\langle v,v\rangle }}\right|^{2}\|v\|^{2}+\|z\|^{2}={\frac {|\langle u,v\rangle |^{2}}{(\|v\|^{2})^{2}}}\,\|v\|^{2}+\|z\|^{2}={\frac {|\langle u,v\rangle |^{2}}{\|v\|^{2}}}+\|z\|^{2}\geq {\frac {|\langle u,v\rangle |^{2}}{\|v\|^{2}}}$

and, after multiplying by $\|v\|^{2}$  and taking square roots, we get the Cauchy–Schwarz inequality. Moreover, if the relation $\geq$  in the above expression is actually an equality, then $\|z\|^{2}=0$  and hence $z=0$ ; the definition of $z$  then establishes a relation of linear dependence between $u$  and $v$ . On the other hand, if $u$  and $v$  are linearly dependent, then there exists $c\in \mathbb {F}$  such that $u=c\cdot v$  (since $v\neq 0$ ). Then

$|\langle u,v\rangle |=|\langle c\cdot v,v\rangle |=\left|c\|v\|^{2}\right|=|c|\|v\|^{2}=\|c\cdot v\|\|v\|=\|u\|\|v\|.$

This establishes the theorem.
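The decomposition used in this proof can be traced numerically. The sketch below assumes the standard complex inner product on $\mathbb {C} ^{3}$ ; the function names (`inner`, `norm_sq`, `decompose`) and the sample vectors are hypothetical choices made only for illustration.

```python
def inner(u, v):
    """Standard complex inner product, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm_sq(u):
    """Squared norm ||u||^2 = <u, u>, returned as a real number."""
    return inner(u, u).real

def decompose(u, v):
    """Split u into its component along v and the remainder z (v must be nonzero)."""
    coeff = inner(u, v) / inner(v, v)
    along = [coeff * b for b in v]
    z = [a - p for a, p in zip(u, along)]
    return coeff, along, z

u = [1 + 1j, 2 + 0j, -1j]
v = [1j, 1 + 0j, 1 - 1j]
coeff, along, z = decompose(u, v)

orthogonality = abs(inner(z, v))                           # <z, v>, expected ~0
pythagoras = norm_sq(u) - (norm_sq(along) + norm_sq(z))    # expected ~0
lower_bound = abs(inner(u, v)) ** 2 / norm_sq(v)           # should not exceed ||u||^2
```

Here `orthogonality` and `pythagoras` vanish up to rounding, and `lower_bound` stays below `norm_sq(u)`, which is the inequality before multiplying by $\|v\|^{2}$ .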

Second proof

Let $u$  and $v$  be arbitrary vectors in an inner product space over $\mathbb {C}$ .

In the special case $v=0$  the theorem is trivially true. Now assume that $v\neq 0$ . Let $\lambda \in \mathbb {C}$  be given by $\lambda =\langle u,v\rangle /\|v\|^{2}$ ; then

${\begin{aligned}0&\leq \|u-\lambda \cdot v\|^{2}\\&=\langle u,u\rangle -\langle \lambda \cdot v,u\rangle -\langle u,\lambda \cdot v\rangle +\langle \lambda \cdot v,\lambda \cdot v\rangle \\&=\langle u,u\rangle -\lambda \langle v,u\rangle -{\overline {\lambda }}\langle u,v\rangle +\lambda {\overline {\lambda }}\langle v,v\rangle \\&=\|u\|^{2}-\lambda {\overline {\langle u,v\rangle }}-{\overline {\lambda }}\langle u,v\rangle +\lambda {\overline {\lambda }}\|v\|^{2}\\&=\|u\|^{2}-{\frac {|\langle u,v\rangle |^{2}}{\|v\|^{2}}}-{\frac {|\langle u,v\rangle |^{2}}{\|v\|^{2}}}+{\frac {|\langle u,v\rangle |^{2}}{\|v\|^{2}}}\\&=\|u\|^{2}-{\frac {|\langle u,v\rangle |^{2}}{\|v\|^{2}}}.\end{aligned}}$

Therefore, $0\leq \|u\|^{2}-{\frac {|\langle u,v\rangle |^{2}}{\|v\|^{2}}}$ , or $|\langle u,v\rangle |\leq \|u\|\|v\|$ .

If the inequality holds as an equality, then $\|u-\lambda \cdot v\|=0$ , and so $u-\lambda \cdot v=0$ , thus $u$  and $v$  are linearly dependent. On the other hand, if $u$  and $v$  are linearly dependent, then $|\langle u,v\rangle |=\|u\|\|v\|$ , as shown in the first proof.

More proofs

There are many different proofs of the Cauchy–Schwarz inequality other than the above two examples. When consulting other sources, there are often two sources of confusion. First, some authors define $\langle \cdot ,\cdot \rangle$  to be linear in the second argument rather than the first. Second, some proofs are only valid when the field is $\mathbb {R}$  and not $\mathbb {C}$ .

Special cases

Titu's lemma

Titu's lemma (named after Titu Andreescu, also known as T2 lemma, Engel's form, or Sedrakyan's inequality) states that for real numbers $u_{1},\ldots ,u_{n}$  and positive real numbers $v_{1},\ldots ,v_{n}$ , one has

${\frac {\left(\sum _{i=1}^{n}u_{i}\right)^{2}}{\sum _{i=1}^{n}v_{i}}}\leq \sum _{i=1}^{n}{\frac {u_{i}^{2}}{v_{i}}}.$

It is a direct consequence of the Cauchy–Schwarz inequality, obtained upon substituting $u_{i}'={\frac {u_{i}}{\sqrt {v_{i}}}}$  and $v_{i}'={\sqrt {v_{i}}}.$  This form is especially helpful when the inequality involves fractions where the numerator is a perfect square.
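As a quick numerical illustration of Titu's lemma, the sketch below evaluates both sides for a small example. The function name `titu_sides` and the sample values are assumptions made for illustration only.

```python
def titu_sides(u, v):
    """Return (lhs, rhs) of Titu's lemma for reals u_i and positive reals v_i."""
    if any(x <= 0 for x in v):
        raise ValueError("all v_i must be positive")
    lhs = sum(u) ** 2 / sum(v)                      # (sum u_i)^2 / sum v_i
    rhs = sum(a * a / b for a, b in zip(u, v))      # sum u_i^2 / v_i
    return lhs, rhs

# Generic case: the inequality is strict.
lhs, rhs = titu_sides([1.0, 2.0, 3.0], [2.0, 1.0, 4.0])

# Equality case: u is proportional to v (here u_i = v_i / 2).
lhs_eq, rhs_eq = titu_sides([2.0, 1.0, 4.0], [4.0, 2.0, 8.0])
```

The second call shows the equality case inherited from the Cauchy–Schwarz inequality: after the substitution, the two vectors are proportional exactly when $u_{i}/v_{i}$  is constant.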

$\mathbb {R} ^{2}$  (ordinary two-dimensional space)

In the usual 2-dimensional space with the dot product, let $v=(v_{1},v_{2})$  and $u=(u_{1},u_{2})$ . The Cauchy–Schwarz inequality is that

$\langle u,v\rangle ^{2}=(\|u\|\|v\|\cos \theta )^{2}\leq \|u\|^{2}\|v\|^{2},$

where $\theta$  is the angle between $u$  and $v.$

The form above is perhaps the easiest in which to understand the inequality, since the square of the cosine can be at most 1, which occurs when the vectors are in the same or opposite directions. It can also be restated in terms of the vector coordinates $v_{1},v_{2},u_{1}$  and $u_{2}$  as

$(u_{1}v_{1}+u_{2}v_{2})^{2}\leq (u_{1}^{2}+u_{2}^{2})(v_{1}^{2}+v_{2}^{2}),$

where equality holds if and only if the vector $(u_{1},u_{2})$  is in the same or opposite direction as the vector $(v_{1},v_{2}),$  or if one of them is the zero vector.
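The agreement between the geometric form and the coordinate form can be illustrated with a short computation. In this sketch the angle is obtained from the polar angles of the two vectors via `math.atan2`; the helper names and sample vectors are illustrative assumptions.

```python
import math

def dot(u, v):
    """Dot product of two vectors in the plane."""
    return u[0] * v[0] + u[1] * v[1]

def angle_between(u, v):
    """Angle from u to v for nonzero plane vectors, as a difference of polar angles."""
    return math.atan2(v[1], v[0]) - math.atan2(u[1], u[0])

u, v = (3.0, 4.0), (-1.0, 2.0)
theta = angle_between(u, v)

# ||u|| ||v|| cos(theta), which should match the coordinate expression u1*v1 + u2*v2.
geometric = math.hypot(*u) * math.hypot(*v) * math.cos(theta)
coordinate = dot(u, v)
```

Since $\cos ^{2}\theta \leq 1$ , squaring `geometric` can never exceed `dot(u, u) * dot(v, v)`.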

$\mathbb {R} ^{n}$  (n-dimensional Euclidean space)

In Euclidean space $\mathbb {R} ^{n}$  with the standard inner product, the Cauchy–Schwarz inequality is

$\left(\sum _{i=1}^{n}u_{i}v_{i}\right)^{2}\leq \left(\sum _{i=1}^{n}u_{i}^{2}\right)\left(\sum _{i=1}^{n}v_{i}^{2}\right).$

The Cauchy–Schwarz inequality can be proved using only ideas from elementary algebra in this case. Consider the following quadratic polynomial in $x$ :

$0\leq (u_{1}x+v_{1})^{2}+\cdots +(u_{n}x+v_{n})^{2}=\left(\sum u_{i}^{2}\right)x^{2}+2\left(\sum u_{i}v_{i}\right)x+\sum v_{i}^{2}.$

Since it is nonnegative, it has at most one real root for $x$ , hence its discriminant is less than or equal to zero. That is,

$\left(\sum (u_{i}v_{i})\right)^{2}-\sum {u_{i}^{2}}\sum {v_{i}^{2}}\leq 0,$

which yields the Cauchy–Schwarz inequality.
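The discriminant argument can be followed on a concrete example. The sketch below builds the coefficients of the quadratic for two sample vectors; `quadratic_coefficients` and the chosen values are hypothetical names and data used only for illustration.

```python
def quadratic_coefficients(u, v):
    """Coefficients (a, b, c) of p(x) = sum_i (u_i x + v_i)^2 = a x^2 + b x + c."""
    a = sum(ui * ui for ui in u)
    b = 2 * sum(ui * vi for ui, vi in zip(u, v))
    c = sum(vi * vi for vi in v)
    return a, b, c

u = [1.0, -2.0, 3.0]
v = [4.0, 0.0, -1.0]
a, b, c = quadratic_coefficients(u, v)

# b^2 - 4ac = 4 * [(sum u_i v_i)^2 - (sum u_i^2)(sum v_i^2)], so it is <= 0
# exactly when the Cauchy-Schwarz inequality holds for u and v.
discriminant = b * b - 4 * a * c

# Sanity check that the expansion matches the sum of squares at one point.
x = 2.0
p_direct = sum((ui * x + vi) ** 2 for ui, vi in zip(u, v))
p_expanded = a * x * x + b * x + c
```

Dividing the discriminant by 4 gives the displayed expression, which is why the factor of 4 does not appear there.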

$L^{2}$

For the inner product space of square-integrable complex-valued functions, one has

$\left|\int _{\mathbb {R} ^{n}}f(x){\overline {g(x)}}\,dx\right|^{2}\leq \int _{\mathbb {R} ^{n}}|f(x)|^{2}\,dx\int _{\mathbb {R} ^{n}}|g(x)|^{2}\,dx.$

A generalization of this is the Hölder inequality.
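The integral form can be approximated numerically. The following sketch makes several assumptions that are not in the statement above: the functions are taken to vanish outside $[0,1]$ , the integrals are approximated by a midpoint rule, and the names `l2_inner`, `f` and `g` are illustrative.

```python
import cmath

def l2_inner(f, g, a, b, n=1000):
    """Midpoint-rule approximation of the integral of f(x) * conj(g(x)) over [a, b]."""
    h = (b - a) / n
    total = 0j
    for k in range(n):
        x = a + (k + 0.5) * h
        total += f(x) * complex(g(x)).conjugate()
    return total * h

def f(x):
    """A complex-valued sample function."""
    return cmath.exp(1j * x)

def g(x):
    """A real-valued sample function."""
    return x * x

lhs = abs(l2_inner(f, g, 0.0, 1.0)) ** 2
rhs = l2_inner(f, f, 0.0, 1.0).real * l2_inner(g, g, 0.0, 1.0).real
```

The midpoint sums themselves form a weighted finite-dimensional inner product, so the discrete inequality `lhs <= rhs` holds for any step count, not only in the limit.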

Applications

Analysis

The triangle inequality for the standard norm is often shown as a consequence of the Cauchy–Schwarz inequality, as follows. Given vectors $x$  and $y$ ,

${\begin{aligned}\|x+y\|^{2}&=\langle x+y,x+y\rangle \\&=\|x\|^{2}+\langle x,y\rangle +\langle y,x\rangle +\|y\|^{2}\\&=\|x\|^{2}+2\operatorname {Re} \langle x,y\rangle +\|y\|^{2}\\&\leq \|x\|^{2}+2|\langle x,y\rangle |+\|y\|^{2}\\&\leq \|x\|^{2}+2\|x\|\|y\|+\|y\|^{2}\\&=(\|x\|+\|y\|)^{2}\end{aligned}}$

Taking square roots gives the triangle inequality.
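Each step of this chain can be evaluated for concrete vectors. The sketch below assumes the standard inner product on $\mathbb {C} ^{3}$ ; the helper `norm` and the sample vectors are illustrative assumptions.

```python
import math

def norm(x):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(a) ** 2 for a in x))

x = [1 + 2j, -3j, 2 + 0j]
y = [0.5 - 1j, 4 + 0j, -1 + 1j]
s = [a + b for a, b in zip(x, y)]                           # x + y
inner_xy = sum(a * b.conjugate() for a, b in zip(x, y))     # <x, y>

# ||x||^2 + 2 Re<x,y> + ||y||^2, which equals ||x + y||^2.
expanded = norm(x) ** 2 + 2 * inner_xy.real + norm(y) ** 2
# Replace Re<x,y> by |<x,y>|, then by ||x|| ||y|| (Cauchy-Schwarz).
after_abs = norm(x) ** 2 + 2 * abs(inner_xy) + norm(y) ** 2
bound = (norm(x) + norm(y)) ** 2
```

The values satisfy `expanded <= after_abs <= bound`, mirroring the two inequality steps in the derivation.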

The Cauchy–Schwarz inequality is used to prove that the inner product is a continuous function with respect to the topology induced by the inner product itself.

Geometry

The Cauchy–Schwarz inequality allows one to extend the notion of "angle between two vectors" to any real inner-product space by defining:

$\cos \theta _{xy}={\frac {\langle x,y\rangle }{\|x\|\|y\|}}.$

The Cauchy–Schwarz inequality proves that this definition is sensible, by showing that the right-hand side lies in the interval [−1, 1], and it justifies the notion that (real) Hilbert spaces are simply generalizations of the Euclidean space. It can also be used to define an angle in complex inner-product spaces, by taking the absolute value or the real part of the right-hand side, as is done when extracting a metric from quantum fidelity.
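To illustrate the angle in an inner-product space other than the ordinary dot product, the following sketch uses a weighted inner product on $\mathbb {R} ^{3}$ . The weights, function names and vectors are assumptions chosen only for the example.

```python
import math

def weighted_inner(x, y, w):
    """Inner product <x, y> = sum_i w_i x_i y_i for positive weights w_i."""
    return sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

def angle(x, y, w):
    """Angle between nonzero x and y in the weighted inner-product space."""
    cos_theta = weighted_inner(x, y, w) / math.sqrt(
        weighted_inner(x, x, w) * weighted_inner(y, y, w)
    )
    # Cauchy-Schwarz guarantees |cos_theta| <= 1; the clamp only guards
    # against floating-point rounding before calling acos.
    return math.acos(max(-1.0, min(1.0, cos_theta)))

w = [1.0, 2.0, 0.5]
theta = angle([1.0, 0.0, 2.0], [0.0, 3.0, 1.0], w)

# These two vectors are orthogonal for this weighted inner product
# (1*1*2 + 2*1*(-1) = 0), although not for the ordinary dot product.
right_angle = angle([1.0, 1.0, 0.0], [2.0, -1.0, 0.0], w)
```

The second pair shows that the angle depends on the chosen inner product: the same vectors have dot product 1 and so are not orthogonal in the Euclidean sense.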

Probability theory

Let $X$  and $Y$  be random variables. The covariance inequality is given by

$\operatorname {Var} (Y)\geq {\frac {\operatorname {Cov} (Y,X)\operatorname {Cov} (Y,X)}{\operatorname {Var} (X)}}.$

After defining an inner product on the set of random variables using the expectation of their product,

$\langle X,Y\rangle :=\operatorname {E} (XY),$

the Cauchy–Schwarz inequality becomes

$|\operatorname {E} (XY)|^{2}\leq \operatorname {E} (X^{2})\operatorname {E} (Y^{2}).$

To prove the covariance inequality using the Cauchy–Schwarz inequality, let $\mu =\operatorname {E} (X)$  and $\nu =\operatorname {E} (Y)$ . Then

${\begin{aligned}|\operatorname {Cov} (X,Y)|^{2}&=|\operatorname {E} ((X-\mu )(Y-\nu ))|^{2}\\&=|\langle X-\mu ,Y-\nu \rangle |^{2}\\&\leq \langle X-\mu ,X-\mu \rangle \langle Y-\nu ,Y-\nu \rangle \\&=\operatorname {E} ((X-\mu )^{2})\operatorname {E} ((Y-\nu )^{2})\\&=\operatorname {Var} (X)\operatorname {Var} (Y),\end{aligned}}$

where $\operatorname {Var}$  denotes variance, and $\operatorname {Cov}$  denotes covariance.
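The covariance inequality can be checked on a small finite example. The sketch below treats four paired values as equally likely outcomes, so that expectations become plain averages; this uniform distribution, the function names and the data are assumptions made for illustration.

```python
def mean(xs):
    """Expectation under the uniform distribution on the listed outcomes."""
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Covariance E((X - mu)(Y - nu)) under the same uniform distribution."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

xs = [1.0, 2.0, 4.0, 7.0]
ys = [3.0, 1.0, 5.0, 6.0]

var_x, var_y = cov(xs, xs), cov(ys, ys)   # Var(X) = Cov(X, X)
cov_xy = cov(xs, ys)
```

Here `cov_xy ** 2` is bounded by `var_x * var_y`, and dividing by `var_x` gives the form of the covariance inequality stated at the start of this section.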

Generalizations

Various generalizations of the Cauchy–Schwarz inequality exist in the context of operator theory, e.g. for operator-convex functions and operator algebras, where the domain and/or range are replaced by a C*-algebra or W*-algebra.

An inner product can be used to define a positive linear functional. For example, given a Hilbert space $L^{2}(m)$ , $m$  being a finite measure, the standard inner product gives rise to a positive functional $\varphi$  by $\varphi (g)=\langle g,1\rangle$ . Conversely, every positive linear functional $\varphi$  on $L^{2}(m)$  can be used to define an inner product $\langle f,g\rangle _{\varphi }:=\varphi (g^{*}f)$ , where $g^{*}$  is the pointwise complex conjugate of $g$ . In this language, the Cauchy–Schwarz inequality becomes

$|\varphi (g^{*}f)|^{2}\leq \varphi (f^{*}f)\varphi (g^{*}g),$

which extends verbatim to positive functionals on C*-algebras:

Theorem (Cauchy–Schwarz inequality for positive functionals on C*-algebras): If $\varphi$  is a positive linear functional on a C*-algebra $A,$  then for all $a,b\in A$ , $|\varphi (b^{*}a)|^{2}\leq \varphi (b^{*}b)\varphi (a^{*}a)$ .

The next two theorems are further examples in operator algebra.

Theorem (Kadison–Schwarz inequality, named after Richard Kadison): If $\varphi$  is a unital positive map, then for every normal element $a$  in its domain, we have $\varphi (a^{*}a)\geq \varphi (a^{*})\varphi (a)$  and $\varphi (a^{*}a)\geq \varphi (a)\varphi (a^{*})$ .

This extends the fact $\varphi (a^{*}a)\cdot 1\geq \varphi (a)^{*}\varphi (a)=|\varphi (a)|^{2}$ , when $\varphi$  is a linear functional. The case when $a$  is self-adjoint, i.e. $a=a^{*},$  is sometimes known as Kadison's inequality.

Theorem (Modified Schwarz inequality for 2-positive maps): For a 2-positive map $\varphi$  between C*-algebras, for all $a,b$  in its domain,

$\varphi (a)^{*}\varphi (a)\leq \Vert \varphi (1)\Vert \varphi (a^{*}a),{\text{ and }}$
$\Vert \varphi (a^{*}b)\Vert ^{2}\leq \Vert \varphi (a^{*}a)\Vert \cdot \Vert \varphi (b^{*}b)\Vert .$

Another generalization is a refinement obtained by interpolating between both sides of the Cauchy–Schwarz inequality:

Theorem (Callebaut's inequality): For reals $0\leqslant s\leqslant t\leqslant 1$ ,

${\Bigl (}\sum _{i=1}^{n}a_{i}b_{i}{\Bigr )}^{2}\leqslant \sum _{i=1}^{n}a_{i}^{1+s}b_{i}^{1-s}\sum _{i=1}^{n}a_{i}^{1-s}b_{i}^{1+s}\leqslant \sum _{i=1}^{n}a_{i}^{1+t}b_{i}^{1-t}\sum _{i=1}^{n}a_{i}^{1-t}b_{i}^{1+t}\leqslant \sum _{i=1}^{n}a_{i}^{2}\sum _{i=1}^{n}b_{i}^{2}.$

It can be easily proven using Hölder's inequality. There are also noncommutative versions for operators and tensor products of matrices.
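The interpolation in Callebaut's inequality can be observed numerically. The sketch below assumes positive $a_{i}$  and $b_{i}$ , so that the fractional powers are real; the function name `callebaut_term` and the sample values are illustrative.

```python
def callebaut_term(a, b, s):
    """(sum a_i^(1+s) b_i^(1-s)) * (sum a_i^(1-s) b_i^(1+s)) for positive a_i, b_i."""
    left = sum(x ** (1 + s) * y ** (1 - s) for x, y in zip(a, b))
    right = sum(x ** (1 - s) * y ** (1 + s) for x, y in zip(a, b))
    return left * right

a = [1.0, 2.0, 3.0]
b = [2.0, 0.5, 1.0]

# s = 0 gives (sum a_i b_i)^2 and s = 1 gives (sum a_i^2)(sum b_i^2);
# intermediate exponents should fall in between, in increasing order.
chain = [callebaut_term(a, b, s) for s in (0.0, 0.3, 0.7, 1.0)]
```

The first and last entries of `chain` are the two sides of the Cauchy–Schwarz inequality for these vectors, and the middle entries interpolate between them.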