Log sum inequality

The log sum inequality is used for proving theorems in information theory.

Statement

Let $a_{1},\ldots ,a_{n}$ and $b_{1},\ldots ,b_{n}$ be nonnegative numbers. Denote the sum of all $a_{i}$ by $a$ and the sum of all $b_{i}$ by $b$. The log sum inequality states that

\sum _{i=1}^{n}a_{i}\log {\frac {a_{i}}{b_{i}}}\geq a\log {\frac {a}{b}},

with equality if and only if the ratios ${\frac {a_{i}}{b_{i}}}$ are equal for all $i$, in other words $a_{i}=cb_{i}$ for all $i$ and some constant $c$.^[1]

(Take $a_{i}\log {\frac {a_{i}}{b_{i}}}$ to be $0$ if $a_{i}=0$ and $\infty$ if $a_{i}>0$ and $b_{i}=0$. These are the limiting values obtained as the relevant number tends to $0$.)^[1]
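A quick numerical check of the statement can be sketched in Python. This is a minimal illustration, not part of the cited sources: the function names `xlog_ratio`, `log_sum_lhs` and `log_sum_rhs` are hypothetical, and the natural logarithm is assumed (the inequality holds for any fixed base greater than 1).

```python
import math

def xlog_ratio(a, b):
    """Return a*log(a/b), using the conventions from the statement:
    0 when a == 0, and infinity when a > 0 and b == 0."""
    if a == 0:
        return 0.0
    if b == 0:
        return math.inf
    return a * math.log(a / b)

def log_sum_lhs(a, b):
    """Left-hand side: sum_i a_i * log(a_i / b_i)."""
    return sum(xlog_ratio(ai, bi) for ai, bi in zip(a, b))

def log_sum_rhs(a, b):
    """Right-hand side: a * log(a / b), with a and b the two totals."""
    return xlog_ratio(sum(a), sum(b))

# Unequal ratios a_i / b_i: the inequality is strict.
print(log_sum_lhs([1, 2, 3], [2, 1, 1]) > log_sum_rhs([1, 2, 3], [2, 1, 1]))

# Proportional sequences (a_i = 0.5 * b_i): both sides agree up to rounding.
print(math.isclose(log_sum_lhs([1, 2, 3], [2, 4, 6]),
                   log_sum_rhs([1, 2, 3], [2, 4, 6])))
```

The helper `xlog_ratio` handles the two boundary conventions explicitly so that zero entries do not raise a division or domain error.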

Proof

Notice that after setting $f(x)=x\log x$ we have

{\begin{aligned}\sum _{i=1}^{n}a_{i}\log {\frac {a_{i}}{b_{i}}}&{}=\sum _{i=1}^{n}b_{i}f\left({\frac {a_{i}}{b_{i}}}\right)=b\sum _{i=1}^{n}{\frac {b_{i}}{b}}f\left({\frac {a_{i}}{b_{i}}}\right)\\&{}\geq bf\left(\sum _{i=1}^{n}{\frac {b_{i}}{b}}{\frac {a_{i}}{b_{i}}}\right)=bf\left({\frac {1}{b}}\sum _{i=1}^{n}a_{i}\right)=bf\left({\frac {a}{b}}\right)\\&{}=a\log {\frac {a}{b}},\end{aligned}}

where the inequality follows from Jensen's inequality since ${\frac {b_{i}}{b}}\geq 0$, $\sum _{i=1}^{n}{\frac {b_{i}}{b}}=1$, and $f$ is convex.^[1]
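The Jensen step in this proof can be traced numerically. The sketch below assumes all $b_{i}>0$ so that the weights $b_{i}/b$ and the points $a_{i}/b_{i}$ are well defined; the name `jensen_sides` is illustrative only.

```python
import math

def f(x):
    """f(x) = x*log(x), extended by continuity with f(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)

def jensen_sides(a, b):
    """Return (b * sum_i w_i f(x_i), b * f(sum_i w_i x_i))
    with weights w_i = b_i / b and points x_i = a_i / b_i."""
    total_b = sum(b)
    weights = [bi / total_b for bi in b]
    points = [ai / bi for ai, bi in zip(a, b)]
    avg_of_f = sum(w * f(x) for w, x in zip(weights, points))
    f_of_avg = f(sum(w * x for w, x in zip(weights, points)))
    return total_b * avg_of_f, total_b * f_of_avg

left, right = jensen_sides([1, 2, 3], [2, 1, 1])
print(left >= right)                                  # Jensen's inequality
print(math.isclose(right, 6 * math.log(6 / 4)))       # equals a*log(a/b)
```

The weighted average of the points collapses to $a/b$, which is why the right-hand value matches $a\log {\frac {a}{b}}$.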

Generalizations

The inequality remains valid for $n=\infty$ provided that $a<\infty$ and $b<\infty$.^[citation needed] The proof above holds for any function $g$ such that $f(x)=xg(x)$ is convex. Monotonicity of $g$ alone does not guarantee this: $g(x)=-1/x^{2}$ is continuous and increasing for $x>0$, yet $xg(x)=-1/x$ is concave there. Generalizations to non-decreasing functions other than the logarithm are given in Csiszár & Shields (2004).
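As an illustration of the convexity condition on $f(x)=xg(x)$, the sketch below evaluates both sides of $\sum _{i}a_{i}g(a_{i}/b_{i})\geq a\,g(a/b)$ for a caller-supplied $g$. The function name `generalized_sides` and the choice $g(x)=x$ (so that $f(x)=x^{2}$) are assumptions made for the example, and positive inputs are assumed.

```python
import math

def generalized_sides(a, b, g):
    """Return (sum_i a_i * g(a_i / b_i), a * g(a / b)) for positive a_i, b_i."""
    total_a, total_b = sum(a), sum(b)
    lhs = sum(ai * g(ai / bi) for ai, bi in zip(a, b))
    rhs = total_a * g(total_a / total_b)
    return lhs, rhs

# g(x) = x gives f(x) = x**2, which is convex.
lhs, rhs = generalized_sides([1.0, 2.0, 3.0], [2.0, 1.0, 1.0], lambda x: x)
print(lhs, rhs)  # 13.5 9.0

# g = log recovers the original log sum inequality.
log_lhs, log_rhs = generalized_sides([1.0, 2.0, 3.0], [2.0, 1.0, 1.0], math.log)
print(log_lhs >= log_rhs)
```

With $g(x)=x$ the inequality reads $\sum _{i}a_{i}^{2}/b_{i}\geq a^{2}/b$, a familiar consequence of the Cauchy-Schwarz inequality.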

Another generalization is due to Dannan, Neff and Thiel, who showed that if $a_{1},a_{2},\ldots ,a_{n}$ and $b_{1},b_{2},\ldots ,b_{n}$ are positive real numbers with $a_{1}+a_{2}+\cdots +a_{n}=a$ and $b_{1}+b_{2}+\cdots +b_{n}=b$, and $k\geq 0$, then $\sum _{i=1}^{n}a_{i}\log \left({\frac {a_{i}}{b_{i}}}+k\right)\geq a\log \left({\frac {a}{b}}+k\right)$.^[2]
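The shifted inequality can be checked numerically in the same way. The following is a minimal sketch assuming strictly positive inputs and the natural logarithm; `shifted_log_sum_sides` is a hypothetical helper name, not taken from the cited paper.

```python
import math

def shifted_log_sum_sides(a, b, k):
    """Return (sum_i a_i*log(a_i/b_i + k), a*log(a/b + k)) for positive a_i, b_i and k >= 0."""
    if k < 0:
        raise ValueError("k must be nonnegative")
    if any(x <= 0 for x in list(a) + list(b)):
        raise ValueError("all a_i and b_i must be positive")
    total_a, total_b = sum(a), sum(b)
    lhs = sum(ai * math.log(ai / bi + k) for ai, bi in zip(a, b))
    rhs = total_a * math.log(total_a / total_b + k)
    return lhs, rhs

for k in (0.0, 0.5, 1.0, 10.0):
    lhs, rhs = shifted_log_sum_sides([1, 2, 3], [2, 1, 1], k)
    print(k, lhs >= rhs)
```

Setting `k = 0.0` reproduces the two sides of the original log sum inequality.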

Applications

The log sum inequality can be used to prove inequalities in information theory. Gibbs' inequality states that the Kullback-Leibler divergence is non-negative, and equal to zero precisely if its arguments are equal.^[3] One proof uses the log sum inequality.

Proof^[1]

Let $P=(p_{i})_{i\in \mathbb {N} }$ and $Q=(q_{i})_{i\in \mathbb {N} }$ be probability mass functions (pmfs). In the log sum inequality, substitute $n=\infty$, $a_{i}=p_{i}$ and $b_{i}=q_{i}$ to get

\mathbb {D} _{\mathrm {KL} }(P\|Q)\equiv \sum _{i}p_{i}\log _{2}{\frac {p_{i}}{q_{i}}}\geq 1\log _{2}{\frac {1}{1}}=0

with equality if and only if $p_{i}=q_{i}$ for all $i$ (as both $P$ and $Q$ sum to 1).
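Gibbs' inequality is easy to observe on small finite distributions. The sketch below computes the Kullback-Leibler divergence in bits for two pmfs given as equal-length lists; the function name `kl_divergence_bits` is an assumption for illustration, and the zero conventions follow those of the log sum inequality.

```python
import math

def kl_divergence_bits(p, q):
    """Return sum_i p_i * log2(p_i / q_i), with 0*log(0/q) = 0
    and p*log(p/0) = infinity for p > 0."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue          # term contributes 0 by convention
        if qi == 0:
            return math.inf   # p puts mass where q has none
        total += pi * math.log2(pi / qi)
    return total

p = [0.5, 0.25, 0.25]
q = [0.25, 0.25, 0.5]
print(kl_divergence_bits(p, q))  # 0.25
print(kl_divergence_bits(p, p))  # 0.0
```

The divergence is positive for the two distinct pmfs and zero when both arguments coincide, as the inequality predicts.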

The inequality can also prove convexity of Kullback-Leibler divergence.^[4]
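The key step of that convexity argument can be written out as follows. This is a sketch of the standard approach, with notation chosen here for illustration: $P=(p_{i})$, $P'=(p'_{i})$, $Q=(q_{i})$, $Q'=(q'_{i})$ are pmfs and $0<\lambda <1$.

```latex
% For each fixed i, apply the log sum inequality with n = 2 to
%   a_1 = \lambda p_i,   a_2 = (1-\lambda) p'_i,
%   b_1 = \lambda q_i,   b_2 = (1-\lambda) q'_i:
\begin{aligned}
&\bigl(\lambda p_i + (1-\lambda) p'_i\bigr)
  \log\frac{\lambda p_i + (1-\lambda) p'_i}{\lambda q_i + (1-\lambda) q'_i} \\
&\qquad\leq \lambda p_i \log\frac{\lambda p_i}{\lambda q_i}
  + (1-\lambda) p'_i \log\frac{(1-\lambda) p'_i}{(1-\lambda) q'_i}
  = \lambda p_i \log\frac{p_i}{q_i} + (1-\lambda) p'_i \log\frac{p'_i}{q'_i}.
\end{aligned}

% Summing over i gives joint convexity in the pair (P, Q):
\mathbb{D}_{\mathrm{KL}}\bigl(\lambda P + (1-\lambda) P' \,\big\|\, \lambda Q + (1-\lambda) Q'\bigr)
  \leq \lambda\, \mathbb{D}_{\mathrm{KL}}(P\|Q) + (1-\lambda)\, \mathbb{D}_{\mathrm{KL}}(P'\|Q').
```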

Notes

[1] Cover & Thomas (1991), p. 29.
[2] Dannan, F. M.; Neff, P.; Thiel, C. (2016). "On the sum of squared logarithms inequality and related inequalities" (PDF). Journal of Mathematical Inequalities. 10 (1): 1–17. doi:10.7153/jmi-10-01. S2CID 23953925. Retrieved 12 January 2023.
[3] MacKay (2003), p. 34.
[4] Cover & Thomas (1991), p. 30.

References

Cover, Thomas M.; Thomas, Joy A. (1991). Elements of Information Theory. Hoboken, New Jersey: Wiley. ISBN 978-0-471-24195-9.
Csiszár, I.; Shields, P. (2004). "Information Theory and Statistics: A Tutorial" (PDF). Foundations and Trends in Communications and Information Theory. 1 (4): 417–528. doi:10.1561/0100000004. Retrieved 2009-06-14.
Han, T. S.; Kobayashi, K. (2001). Mathematics of Information and Coding. American Mathematical Society. ISBN 0-8218-0534-7.
Information Theory course materials, Utah State University. Retrieved 2009-06-14.
MacKay, David J.C. (2003). Information Theory, Inference, and Learning Algorithms. Cambridge University Press. ISBN 0-521-64298-1.
