# Trace inequality


In mathematics, there are many kinds of inequalities involving matrices and linear operators on Hilbert spaces. This article covers some important operator inequalities connected with traces of matrices.

## Basic definitions

Let $\mathbf {H} _{n}$ denote the space of Hermitian $n\times n$ matrices, $\mathbf {H} _{n}^{+}$ the set of positive semi-definite $n\times n$ Hermitian matrices, and $\mathbf {H} _{n}^{++}$ the set of positive definite Hermitian matrices. For operators on an infinite-dimensional Hilbert space we require that they be trace class and self-adjoint, in which case similar definitions apply, but for simplicity we discuss only matrices.

For any real-valued function $f$ on an interval $I\subset \mathbb {R}$, one may define a matrix function $f(A)$ for any operator $A\in \mathbf {H} _{n}$ with eigenvalues $\lambda _{j}$ in $I$ by defining it on the eigenvalues and the corresponding projectors $P_{j}$ as

$f(A)\equiv \sum _{j}f(\lambda _{j})P_{j}~,$  given the spectral decomposition $A=\sum _{j}\lambda _{j}P_{j}.$
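The definition above translates directly into a few lines of numerical code. The following is a minimal sketch in Python with NumPy; the helper name `matrix_function` and the example matrix are illustrative assumptions, not notation from the article.

```python
import numpy as np

def matrix_function(f, A):
    """Return f(A) = sum_j f(lambda_j) P_j for a Hermitian matrix A."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    # Scaling column j of V by f(lambda_j) and multiplying by V^* gives
    # V diag(f(lambda)) V^*, which equals the spectral sum in the definition.
    return (eigenvectors * f(eigenvalues)) @ eigenvectors.conj().T

# Example: the matrix square root of a positive definite matrix
# (this matrix has eigenvalues 1 and 3).
A = np.array([[2.0, 1.0], [1.0, 2.0]])
sqrt_A = matrix_function(np.sqrt, A)
```

Squaring `sqrt_A` should recover `A` up to rounding, which is a quick sanity check on the construction.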

### Operator monotone

A function f: I → ℝ defined on an interval I ⊂ ℝ is said to be operator monotone if, for all $n$ and all $A,B\in \mathbf {H} _{n}$ with eigenvalues in I, the following holds,

$A\geq B\Rightarrow f(A)\geq f(B),$

where the inequality A ≥ B means that the operator A − B is positive semi-definite. One may check that $f(A)=A^{2}$ is, in fact, not operator monotone!
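That the square fails to be operator monotone can be seen on a 2×2 example. The sketch below, in Python with NumPy, uses one hand-picked pair of matrices; the helper name `is_psd` and its tolerance are illustrative assumptions.

```python
import numpy as np

def is_psd(M, tol=1e-12):
    """True if the Hermitian matrix M has no eigenvalue below -tol."""
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

A = np.array([[2.0, 1.0], [1.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 0.0]])

# A - B = [[1, 1], [1, 1]] has eigenvalues 0 and 2, so A >= B.
a_dominates_b = is_psd(A - B)

# A^2 - B^2 = [[4, 3], [3, 2]] has determinant -1, hence a negative eigenvalue.
squares_ordered = is_psd(A @ A - B @ B)
```

Here `a_dominates_b` is true while `squares_ordered` is false, so $A\geq B$ does not imply $A^{2}\geq B^{2}$.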

### Operator convex

A function $f:I\rightarrow \mathbb {R}$  is said to be operator convex if for all $n$  and all $A,B\in \mathbf {H} _{n}$  with eigenvalues in $I$ , and $0<\lambda <1$ , the following holds

$f(\lambda A+(1-\lambda )B)\leq \lambda f(A)+(1-\lambda )f(B).$

Note that the operator $\lambda A+(1-\lambda )B$  has eigenvalues in $I$ , since $A$  and $B$  have eigenvalues in I.

A function $f$  is operator concave if $-f$  is operator convex, i.e. the inequality above for $f$  is reversed.

### Joint convexity

A function $g:I\times J\rightarrow \mathbb {R}$ , defined on intervals $I,J\subset \mathbb {R}$  is said to be jointly convex if for all $n$  and all $A_{1},A_{2}\in \mathbf {H} _{n}$  with eigenvalues in $I$  and all $B_{1},B_{2}\in \mathbf {H} _{n}$  with eigenvalues in $J$ , and any $0\leq \lambda \leq 1$  the following holds

$g(\lambda A_{1}+(1-\lambda )A_{2},\lambda B_{1}+(1-\lambda )B_{2})\leq \lambda g(A_{1},B_{1})+(1-\lambda )g(A_{2},B_{2}).$

A function g is jointly concave if −g is jointly convex, i.e. the inequality above for g is reversed.

### Trace function

Given a function f: ℝ → ℝ, the associated trace function on $\mathbf {H} _{n}$ is given by

$A\mapsto \operatorname {Tr} f(A)=\sum _{j}f(\lambda _{j}),$

where A has eigenvalues $\lambda _{j}$ and Tr stands for the trace of the operator.

## Convexity and monotonicity of the trace function

Let f: ℝ → ℝ be continuous, and let n be any positive integer. Then, if $t\mapsto f(t)$  is monotone increasing, so is $A\mapsto \operatorname {Tr} f(A)$  on $\mathbf {H} _{n}$ .

Likewise, if $t\mapsto f(t)$  is convex, so is $A\mapsto \operatorname {Tr} f(A)$  on $\mathbf {H} _{n}$ , and it is strictly convex if f is strictly convex.

See proof and discussion in, for example.

## Löwner–Heinz theorem

For $-1\leq p\leq 0$ , the function $f(t)=-t^{p}$  is operator monotone and operator concave.

For $0\leq p\leq 1$ , the function $f(t)=t^{p}$  is operator monotone and operator concave.

For $1\leq p\leq 2$ , the function $f(t)=t^{p}$  is operator convex. Furthermore,

$f(t)=\log(t)$  is operator concave and operator monotone, while
$f(t)=t\log(t)$  is operator convex.

The original proof of this theorem is due to K. Löwner who gave a necessary and sufficient condition for f to be operator monotone. An elementary proof of the theorem is discussed in  and a more general version of it in.
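A numerical spot check can illustrate (though of course not prove) the operator monotonicity of $t^{1/2}$ and $\log t$. The sketch below uses Python with NumPy; the helper names, the random seed and the tolerance are illustrative assumptions.

```python
import numpy as np

def matrix_function(f, A):
    """Apply the scalar function f to a Hermitian matrix via its eigenvalues."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    return (eigenvectors * f(eigenvalues)) @ eigenvectors.conj().T

def random_positive_definite(n, rng):
    """A random real symmetric positive definite matrix."""
    G = rng.normal(size=(n, n))
    return G @ G.T + 0.1 * np.eye(n)

def is_psd(M, tol=1e-10):
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

rng = np.random.default_rng(0)
B = random_positive_definite(4, rng)
A = B + random_positive_definite(4, rng)   # A - B is positive definite, so A >= B

# Both differences should be positive semi-definite by the theorem.
sqrt_ordered = is_psd(matrix_function(np.sqrt, A) - matrix_function(np.sqrt, B))
log_ordered = is_psd(matrix_function(np.log, A) - matrix_function(np.log, B))
```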

## Klein's inequality

For all Hermitian n×n matrices A and B and all differentiable convex functions f: ℝ → ℝ with derivative f ' , or for all positive-definite Hermitian n×n matrices A and B, and all differentiable convex functions f:(0,∞) → ℝ, the following inequality holds,

$\operatorname {Tr} [f(A)-f(B)-(A-B)f'(B)]\geq 0~.$

In either case, if f is strictly convex, equality holds if and only if A = B. A popular choice in applications is f(t) = t log t, see below.
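As an illustration, the left-hand side of Klein's inequality can be evaluated numerically for $f(t)=t\log t$, whose derivative is $f'(t)=\log t+1$. This is a sketch in Python with NumPy under the assumption of randomly generated positive definite matrices; the names `matrix_function`, `random_positive_definite` and `klein_gap` are hypothetical helpers.

```python
import numpy as np

def matrix_function(f, A):
    """Apply the scalar function f to a Hermitian matrix via its eigenvalues."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    return (eigenvectors * f(eigenvalues)) @ eigenvectors.conj().T

def random_positive_definite(n, rng):
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return G @ G.conj().T + 0.1 * np.eye(n)

def klein_gap(A, B):
    """Tr[f(A) - f(B) - (A - B) f'(B)] for f(t) = t log t."""
    f = lambda t: t * np.log(t)
    f_prime = lambda t: np.log(t) + 1.0
    M = (matrix_function(f, A) - matrix_function(f, B)
         - (A - B) @ matrix_function(f_prime, B))
    return np.trace(M).real

rng = np.random.default_rng(0)
A = random_positive_definite(4, rng)
B = random_positive_definite(4, rng)
gap = klein_gap(A, B)   # expected to be positive, since f is strictly convex and A != B
```

Calling `klein_gap(A, A)` returns zero, matching the equality case.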

### Proof

Let C = A − B so that, for 0 < t < 1,

$B+tC=(1-t)B+tA.$

Define

$\varphi (t)=\operatorname {Tr} [f(B+tC)]~.$

By convexity and monotonicity of trace functions, φ is convex, and so for all 0 < t < 1,

$\varphi (1)-\varphi (0)\geq {\frac {\varphi (t)-\varphi (0)}{t}},$

and, in fact, the right-hand side is monotone non-decreasing in t, so it decreases as t decreases to 0. In the limit t → 0 the right-hand side tends to $\varphi '(0)=\operatorname {Tr} [(A-B)f'(B)]$ , while the left-hand side equals $\operatorname {Tr} [f(A)-f(B)]$ , which yields Klein's inequality.

Note that if f is strictly convex and C ≠ 0, then φ is strictly convex. The final assertion follows from this and the fact that ${\tfrac {\varphi (t)-\varphi (0)}{t}}$  is then strictly increasing in t.

## Golden–Thompson inequality

In 1965, S. Golden and C. J. Thompson independently discovered that, for any matrices $A,B\in \mathbf {H} _{n}$ ,

$\operatorname {Tr} e^{A+B}\leq \operatorname {Tr} e^{A}e^{B}.$
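The two sides of the inequality are easy to compare numerically. The following Python/NumPy sketch assumes random Hermitian test matrices; `matrix_function`, `random_hermitian` and `golden_thompson_sides` are illustrative names only.

```python
import numpy as np

def matrix_function(f, A):
    """Apply the scalar function f to a Hermitian matrix via its eigenvalues."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    return (eigenvectors * f(eigenvalues)) @ eigenvectors.conj().T

def random_hermitian(n, rng):
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (G + G.conj().T) / 2

def golden_thompson_sides(A, B):
    """Return (Tr e^{A+B}, Tr e^A e^B) for Hermitian A and B."""
    lhs = np.trace(matrix_function(np.exp, A + B)).real
    rhs = np.trace(matrix_function(np.exp, A) @ matrix_function(np.exp, B)).real
    return lhs, rhs

rng = np.random.default_rng(0)
A = random_hermitian(4, rng)
B = random_hermitian(4, rng)
lhs, rhs = golden_thompson_sides(A, B)   # expect lhs <= rhs
```

When the two matrices commute, for example with `B = A`, both sides coincide.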

This inequality can be generalized for three operators: for non-negative operators $A,B,C\in \mathbf {H} _{n}^{+}$ ,

$\operatorname {Tr} e^{\ln A-\ln B+\ln C}\leq \int _{0}^{\infty }dt\,\operatorname {Tr} A(B+t)^{-1}C(B+t)^{-1}.$

## Peierls–Bogoliubov inequality

Let $R,F\in \mathbf {H} _{n}$  be such that $\operatorname {Tr} e^{R}=1$ . Defining $g=\operatorname {Tr} Fe^{R}$ , we have

$\operatorname {Tr} e^{F}e^{R}\geq \operatorname {Tr} e^{F+R}\geq e^{g}.$

The proof of this inequality follows from the Golden–Thompson inequality above combined with Klein's inequality. Take f(x) = exp(x), A = R + F, and B = R + gI.
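The chain of inequalities can be checked on a random instance. In the Python/NumPy sketch below, the normalisation $\operatorname {Tr} e^{R}=1$ is enforced by subtracting a multiple of the identity from a random Hermitian matrix; that construction and all helper names are assumptions made for illustration.

```python
import numpy as np

def matrix_function(f, A):
    """Apply the scalar function f to a Hermitian matrix via its eigenvalues."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    return (eigenvectors * f(eigenvalues)) @ eigenvectors.conj().T

def random_hermitian(n, rng):
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (G + G.conj().T) / 2

def peierls_bogoliubov_chain(F, R):
    """Return (Tr e^F e^R, Tr e^{F+R}, e^g) with g = Tr F e^R."""
    exp_R = matrix_function(np.exp, R)
    upper = np.trace(matrix_function(np.exp, F) @ exp_R).real
    middle = np.trace(matrix_function(np.exp, F + R)).real
    g = np.trace(F @ exp_R).real
    return upper, middle, np.exp(g)

rng = np.random.default_rng(0)
n = 4
R0 = random_hermitian(n, rng)
# Shift R0 by a multiple of the identity so that Tr e^R = 1.
R = R0 - np.log(np.trace(matrix_function(np.exp, R0)).real) * np.eye(n)
F = random_hermitian(n, rng)
upper, middle, lower = peierls_bogoliubov_chain(F, R)   # expect upper >= middle >= lower
```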

## Gibbs variational principle

Let $H$  be a self-adjoint operator such that $e^{-H}$  is trace class. Then for any $\gamma \geq 0$  with $\operatorname {Tr} \gamma =1,$

$\operatorname {Tr} \gamma H+\operatorname {Tr} \gamma \ln \gamma \geq -\ln \operatorname {Tr} e^{-H},$

with equality if and only if $\gamma =\exp(-H)/\operatorname {Tr} \exp(-H).$
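In finite dimensions the variational principle can be illustrated numerically: a random density matrix should give a value no smaller than $-\ln \operatorname {Tr} e^{-H}$, and the Gibbs state should attain it. The sketch below uses Python with NumPy; the function names and the random test data are illustrative assumptions.

```python
import numpy as np

def matrix_function(f, A):
    """Apply the scalar function f to a Hermitian matrix via its eigenvalues."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    return (eigenvectors * f(eigenvalues)) @ eigenvectors.conj().T

def free_energy_functional(gamma, H):
    """Tr(gamma H) + Tr(gamma ln gamma) for a positive definite density matrix."""
    entropy_term = np.trace(matrix_function(lambda t: t * np.log(t), gamma)).real
    return np.trace(gamma @ H).real + entropy_term

def gibbs_bound(H):
    """-ln Tr exp(-H)."""
    return -np.log(np.trace(matrix_function(lambda t: np.exp(-t), H)).real)

def gibbs_state(H):
    E = matrix_function(lambda t: np.exp(-t), H)
    return E / np.trace(E).real

rng = np.random.default_rng(0)
n = 4
G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (G + G.conj().T) / 2                      # random Hermitian "Hamiltonian"
W = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
gamma = W @ W.conj().T + 0.1 * np.eye(n)
gamma = gamma / np.trace(gamma).real          # random density matrix, Tr gamma = 1

value_at_gamma = free_energy_functional(gamma, H)
value_at_gibbs = free_energy_functional(gibbs_state(H), H)
bound = gibbs_bound(H)
```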

## Lieb's concavity theorem

The following theorem was proved by E. H. Lieb. It proves and generalizes a conjecture of E. P. Wigner, M. M. Yanase and F. J. Dyson. Six years later other proofs were given by T. Ando and B. Simon, and several more have been given since then.

For all $m\times n$  matrices $K$ , and all $q$  and $r$  such that $0\leq q\leq 1$  and $0\leq r\leq 1$ , with $q+r\leq 1$  the real valued map on $\mathbf {H} _{m}^{+}\times \mathbf {H} _{n}^{+}$  given by

$F(A,B,K)=\operatorname {Tr} (K^{*}A^{q}KB^{r})$
• is jointly concave in $(A,B)$
• is convex in $K$ .

Here $K^{*}$  stands for the adjoint operator of $K.$
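Both properties can be probed with a midpoint test on random matrices. The Python/NumPy sketch below fixes $q=0.4$ and $r=0.5$ (so that $q+r\leq 1$); these values, the dimensions and the helper names are arbitrary illustrative choices, and a passing check is evidence rather than proof.

```python
import numpy as np

def matrix_function(f, A):
    """Apply the scalar function f to a Hermitian matrix via its eigenvalues."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    return (eigenvectors * f(eigenvalues)) @ eigenvectors.conj().T

def random_positive_definite(n, rng):
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return G @ G.conj().T + 0.1 * np.eye(n)

def lieb_functional(A, B, K, q, r):
    """F(A, B, K) = Tr(K^* A^q K B^r) with A of size m x m, B n x n, K m x n."""
    A_q = matrix_function(lambda t: t ** q, A)
    B_r = matrix_function(lambda t: t ** r, B)
    return np.trace(K.conj().T @ A_q @ K @ B_r).real

rng = np.random.default_rng(0)
m, n, q, r = 3, 4, 0.4, 0.5
A1, A2 = random_positive_definite(m, rng), random_positive_definite(m, rng)
B1, B2 = random_positive_definite(n, rng), random_positive_definite(n, rng)
K1 = rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n))
K2 = rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n))

# Joint concavity in (A, B): value at the midpoint >= average of the values.
at_midpoint = lieb_functional((A1 + A2) / 2, (B1 + B2) / 2, K1, q, r)
average = (lieb_functional(A1, B1, K1, q, r) + lieb_functional(A2, B2, K1, q, r)) / 2

# Convexity in K: value at the midpoint <= average of the values.
k_midpoint = lieb_functional(A1, B1, (K1 + K2) / 2, q, r)
k_average = (lieb_functional(A1, B1, K1, q, r) + lieb_functional(A1, B1, K2, q, r)) / 2
```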

## Lieb's theorem

For a fixed Hermitian matrix $L\in \mathbf {H} _{n}$ , the function

$f(A)=\operatorname {Tr} \exp\{L+\ln A\}$

is concave on $\mathbf {H} _{n}^{++}$ .

The theorem and proof are due to E. H. Lieb, Thm 6, where he obtains this theorem as a corollary of Lieb's concavity Theorem. The most direct proof is due to H. Epstein; see M.B. Ruskai papers, for a review of this argument.

## Ando's convexity theorem

T. Ando's proof  of Lieb's concavity theorem led to the following significant complement to it:

For all $m\times n$  matrices $K$ , and all $1\leq q\leq 2$  and $0\leq r\leq 1$  with $q-r\geq 1$ , the real valued map on $\mathbf {H} _{m}^{++}\times \mathbf {H} _{n}^{++}$  given by

$(A,B)\mapsto \operatorname {Tr} (K^{*}A^{q}KB^{-r})$

is convex.

## Joint convexity of relative entropy

For two operators $A,B\in \mathbf {H} _{n}^{++}$  define the following map

$R(A\parallel B):=\operatorname {Tr} (A\log A)-\operatorname {Tr} (A\log B).$

For density matrices $\rho$  and $\sigma$ , the map $R(\rho \parallel \sigma )=S(\rho \parallel \sigma )$  is Umegaki's quantum relative entropy.

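Note that Klein's inequality with $f(x)=x\log x$  gives $R(A\parallel B)\geq \operatorname {Tr} (A-B)$ , so $R(A\parallel B)$  is non-negative whenever $\operatorname {Tr} A\geq \operatorname {Tr} B$ , in particular for density matrices.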

### Statement

The map $R(A\parallel B):\mathbf {H} _{n}^{++}\times \mathbf {H} _{n}^{++}\rightarrow \mathbb {R}$  is jointly convex.

### Proof

For all $0<p<1$ , $(A,B)\mapsto \operatorname {Tr} (B^{1-p}A^{p})$  is jointly concave, by Lieb's concavity theorem, and thus

$(A,B)\mapsto {\frac {1}{p-1}}(\operatorname {Tr} (B^{1-p}A^{p})-\operatorname {Tr} A)$

is convex. But

$\lim _{p\rightarrow 1}{\frac {1}{p-1}}(\operatorname {Tr} (B^{1-p}A^{p})-\operatorname {Tr} A)=R(A\parallel B),$

and convexity is preserved in the limit.

The proof is due to G. Lindblad.
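Joint convexity, and non-negativity for density matrices, can be illustrated with a short numerical experiment. The sketch below is in Python with NumPy; `relative_entropy`, `random_density_matrix` and the midpoint test are illustrative assumptions, not part of the proof above.

```python
import numpy as np

def matrix_function(f, A):
    """Apply the scalar function f to a Hermitian matrix via its eigenvalues."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    return (eigenvectors * f(eigenvalues)) @ eigenvectors.conj().T

def relative_entropy(A, B):
    """R(A || B) = Tr(A log A) - Tr(A log B) for positive definite A and B."""
    log_difference = matrix_function(np.log, A) - matrix_function(np.log, B)
    return np.trace(A @ log_difference).real

def random_density_matrix(n, rng):
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    P = G @ G.conj().T + 0.1 * np.eye(n)   # positive definite
    return P / np.trace(P).real            # unit trace

rng = np.random.default_rng(0)
rho1, rho2 = random_density_matrix(4, rng), random_density_matrix(4, rng)
sigma1, sigma2 = random_density_matrix(4, rng), random_density_matrix(4, rng)

# Midpoint form of joint convexity: expect at_midpoint <= average.
at_midpoint = relative_entropy((rho1 + rho2) / 2, (sigma1 + sigma2) / 2)
average = (relative_entropy(rho1, sigma1) + relative_entropy(rho2, sigma2)) / 2
```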

## Jensen's operator and trace inequalities

The operator version of Jensen's inequality is due to C. Davis.

A continuous, real function $f$  on an interval $I$  satisfies Jensen's Operator Inequality if the following holds

$f\left(\sum _{k}A_{k}^{*}X_{k}A_{k}\right)\leq \sum _{k}A_{k}^{*}f(X_{k})A_{k},$

for operators $\{A_{k}\}_{k}$  with $\sum _{k}A_{k}^{*}A_{k}=1$  and for self-adjoint operators $\{X_{k}\}_{k}$  with spectrum on $I$ .

See, for the proof of the following two theorems.

### Jensen's trace inequality

Let f be a continuous function defined on an interval I and let m and n be natural numbers. If f is convex, we then have the inequality

$\operatorname {Tr} {\Bigl (}f{\Bigl (}\sum _{k=1}^{n}A_{k}^{*}X_{k}A_{k}{\Bigr )}{\Bigr )}\leq \operatorname {Tr} {\Bigl (}\sum _{k=1}^{n}A_{k}^{*}f(X_{k})A_{k}{\Bigr )},$

for all self-adjoint m × m matrices $(X_{1},\ldots ,X_{n})$  with spectra contained in I and all m × m matrices $(A_{1},\ldots ,A_{n})$  with

$\sum _{k=1}^{n}A_{k}^{*}A_{k}=1.$

Conversely, if the above inequality is satisfied for some n and m, where n > 1, then f is convex.
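To experiment with Jensen's trace inequality one needs matrices $A_{k}$ satisfying $\sum _{k}A_{k}^{*}A_{k}=1$. The Python/NumPy sketch below builds them by normalising random matrices with $S^{-1/2}$, where $S=\sum _{k}G_{k}^{*}G_{k}$; this construction, the convex test function $f=\exp$ and all names are illustrative assumptions.

```python
import numpy as np

def matrix_function(f, A):
    """Apply the scalar function f to a Hermitian matrix via its eigenvalues."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    return (eigenvectors * f(eigenvalues)) @ eigenvectors.conj().T

def random_hermitian(m, rng):
    G = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
    return (G + G.conj().T) / 2

def random_unital_tuple(m, n, rng):
    """Return n matrices A_k of size m x m with sum_k A_k^* A_k = identity."""
    Gs = [rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m)) for _ in range(n)]
    S = sum(G.conj().T @ G for G in Gs)
    S_inv_sqrt = matrix_function(lambda t: t ** (-0.5), S)
    return [G @ S_inv_sqrt for G in Gs]

def jensen_trace_sides(f, Xs, As):
    """Return (Tr f(sum A_k^* X_k A_k), Tr sum A_k^* f(X_k) A_k)."""
    M = sum(A.conj().T @ X @ A for X, A in zip(Xs, As))
    M = (M + M.conj().T) / 2   # remove rounding-level asymmetry
    lhs = np.trace(matrix_function(f, M)).real
    rhs = np.trace(sum(A.conj().T @ matrix_function(f, X) @ A
                       for X, A in zip(Xs, As))).real
    return lhs, rhs

rng = np.random.default_rng(0)
m, n = 4, 3
As = random_unital_tuple(m, n, rng)
Xs = [random_hermitian(m, rng) for _ in range(n)]
lhs, rhs = jensen_trace_sides(np.exp, Xs, As)   # expect lhs <= rhs for convex f
```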

### Jensen's operator inequality

For a continuous function $f$  defined on an interval $I$  the following conditions are equivalent:

• $f$  is operator convex.
• For each natural number $n$  we have the inequality
$f{\Bigl (}\sum _{k=1}^{n}A_{k}^{*}X_{k}A_{k}{\Bigr )}\leq \sum _{k=1}^{n}A_{k}^{*}f(X_{k})A_{k},$

for all $(X_{1},\ldots ,X_{n})$  bounded, self-adjoint operators on an arbitrary Hilbert space ${\mathcal {H}}$  with spectra contained in $I$  and all $(A_{1},\ldots ,A_{n})$  on ${\mathcal {H}}$  with $\sum _{k=1}^{n}A_{k}^{*}A_{k}=1.$

• $f(V^{*}XV)\leq V^{*}f(X)V$  for each isometry $V$  on an infinite-dimensional Hilbert space ${\mathcal {H}}$  and every self-adjoint operator $X$  with spectrum in $I$ .

• $Pf(PXP+\lambda (1-P))P\leq Pf(X)P$  for each projection $P$  on an infinite-dimensional Hilbert space ${\mathcal {H}}$ , every self-adjoint operator $X$  with spectrum in $I$  and every $\lambda$  in $I$ .

## Araki–Lieb–Thirring inequality

E. H. Lieb and W. E. Thirring proved the following inequality in 1976: For any $A\geq 0$ , $B\geq 0$  and $r\geq 1,$

$\operatorname {Tr} (BAB)^{r}\leq \operatorname {Tr} (B^{r}A^{r}B^{r}).$

In 1990, H. Araki generalized the above inequality to the following one: For any $A\geq 0$ , $B\geq 0$  and $q\geq 0,$

$\operatorname {Tr} (BAB)^{rq}\leq \operatorname {Tr} (B^{r}A^{r}B^{r})^{q},$  for $r\geq 1,$

and

$\operatorname {Tr} (B^{r}A^{r}B^{r})^{q}\leq \operatorname {Tr} (BAB)^{rq},$  for $0\leq r\leq 1.$
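Both directions of Araki's inequality can be checked on random positive definite matrices. In the Python/NumPy sketch below, the exponents $r=2$, $r=1/2$ and $q=3/2$ are arbitrary test values and the helper names are illustrative assumptions.

```python
import numpy as np

def matrix_function(f, A):
    """Apply the scalar function f to a Hermitian matrix via its eigenvalues."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    return (eigenvectors * f(eigenvalues)) @ eigenvectors.conj().T

def random_positive_definite(n, rng):
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return G @ G.conj().T + 0.1 * np.eye(n)

def trace_power(M, p):
    """Tr M^p for a positive semi-definite matrix M, computed from eigenvalues."""
    eigenvalues = np.clip(np.linalg.eigvalsh(M), 0.0, None)
    return float(np.sum(eigenvalues ** p))

def araki_sides(A, B, r, q):
    """Return (Tr (BAB)^{rq}, Tr (B^r A^r B^r)^q)."""
    A_r = matrix_function(lambda t: t ** r, A)
    B_r = matrix_function(lambda t: t ** r, B)
    return trace_power(B @ A @ B, r * q), trace_power(B_r @ A_r @ B_r, q)

rng = np.random.default_rng(0)
A = random_positive_definite(3, rng)
B = random_positive_definite(3, rng)

lhs_big_r, rhs_big_r = araki_sides(A, B, r=2.0, q=1.5)       # expect lhs <= rhs
lhs_small_r, rhs_small_r = araki_sides(A, B, r=0.5, q=1.5)   # expect rhs <= lhs
```

For $r=1$ the two sides agree, which gives a quick consistency check of the helper.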

The Lieb–Thirring inequality also enjoys the following generalization: for any $A\geq 0$ , $B\geq 0$  and $\alpha \in [0,1],$

$\operatorname {Tr} (BA^{\alpha }BBA^{1-\alpha }B)\leq \operatorname {Tr} (B^{2}AB^{2}).$

## Effros's theorem and its extension

E. Effros proved the following theorem.

If $f(x)$  is an operator convex function, and $L$  and $R$  are commuting bounded linear operators, i.e. the commutator $[L,R]=LR-RL=0$ , the perspective

$g(L,R):=f(LR^{-1})R$

is jointly convex, i.e. if $L=\lambda L_{1}+(1-\lambda )L_{2}$  and $R=\lambda R_{1}+(1-\lambda )R_{2}$  with $[L_{i},R_{i}]=0$  (i=1,2), $0\leq \lambda \leq 1$ ,

$g(L,R)\leq \lambda g(L_{1},R_{1})+(1-\lambda )g(L_{2},R_{2}).$

Ebadian et al. later extended the inequality to the case where $L$  and $R$  do not commute.

## Von Neumann's trace inequality and related results

Von Neumann's trace inequality, named after its originator John von Neumann, states that for any n × n complex matrices $A$  and $B$  with singular values $\alpha _{1}\geq \alpha _{2}\geq \cdots \geq \alpha _{n}$  and $\beta _{1}\geq \beta _{2}\geq \cdots \geq \beta _{n}$  respectively,

$\left|\operatorname {Tr} (AB)\right|\leq \sum _{i=1}^{n}\alpha _{i}\beta _{i}\,.$

A simple corollary to this is the following result: For Hermitian n × n complex matrices $A$  and $B$  whose eigenvalues are sorted in decreasing order ($a_{1}\geq a_{2}\geq \cdots \geq a_{n}$  and $b_{1}\geq b_{2}\geq \cdots \geq b_{n}$ , respectively),

$\sum _{i=1}^{n}a_{i}b_{n-i+1}\leq \operatorname {Tr} (AB)\leq \sum _{i=1}^{n}a_{i}b_{i}\,.$
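Both statements are straightforward to test numerically, since NumPy returns singular values in decreasing order and eigenvalues of Hermitian matrices in increasing order. The following Python sketch uses random test matrices; the function names are illustrative assumptions.

```python
import numpy as np

def von_neumann_sides(A, B):
    """Return (|Tr(AB)|, sum_i alpha_i beta_i) for square complex matrices."""
    alpha = np.linalg.svd(A, compute_uv=False)   # sorted in decreasing order
    beta = np.linalg.svd(B, compute_uv=False)
    return abs(np.trace(A @ B)), float(alpha @ beta)

def hermitian_trace_bounds(A, B):
    """Return (sum_i a_i b_{n-i+1}, Tr(AB), sum_i a_i b_i) for Hermitian A, B."""
    a = np.linalg.eigvalsh(A)[::-1]   # eigvalsh is ascending; reverse to descending
    b = np.linalg.eigvalsh(B)[::-1]
    lower = float(a @ b[::-1])        # largest a paired with smallest b
    upper = float(a @ b)              # eigenvalues paired in the same order
    return lower, np.trace(A @ B).real, upper

rng = np.random.default_rng(0)
n = 5
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
abs_trace, singular_bound = von_neumann_sides(A, B)

HA = (A + A.conj().T) / 2   # Hermitian parts serve as Hermitian test matrices
HB = (B + B.conj().T) / 2
lower, trace_value, upper = hermitian_trace_bounds(HA, HB)
```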