How do I maximize entropy?

I think the proof using Jensen's inequality is nicer, but let's do the proof using the Lagrange multiplier (LM) method for clarity. Consider the discrete case. We maximize $-\displaystyle\sum_{i}p_i\log(p_i)$ subject to $\displaystyle\sum_{i}p_i=1$. The Lagrangian is $-\sum_i p_i\log(p_i)-\lambda\left(\sum_i p_i-1\right)$, and differentiating with respect to $p_i$ gives the first order condition

$$ -1-\log(p_i)-\lambda=0$$

i.e. $\log(p_i)=-1-\lambda$ for every $i$. Moreover, differentiating the objective twice with respect to $p_i$ gives $-\frac{1}{p_i}<0$, so the sufficient (second order) conditions for the LM method hold.

Hence $\log(p_i)$ is the same constant for every $i$, so $p_i=p_j$ for all $i$ and $j$. Since the probabilities are positive and satisfy $\sum_i p_i=1$, it follows that $p_i=\frac{1}{N}$, where $N$ is the number of states.
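
As a quick sanity check (not part of the argument itself), here is a minimal numerical sketch, assuming NumPy and SciPy are available, that maximizes the entropy under the constraint $\sum_i p_i=1$ and recovers the uniform distribution:

```python
# Numerical check: maximize -sum(p_i * log(p_i)) subject to sum(p_i) = 1.
# The solver should return the uniform distribution p_i = 1/N with value log(N).
import numpy as np
from scipy.optimize import minimize

N = 5  # number of states (arbitrary choice for this demonstration)

def neg_entropy(p):
    # scipy minimizes, so we hand it the *negative* entropy: sum(p * log(p))
    return np.sum(p * np.log(p))

constraint = {"type": "eq", "fun": lambda p: np.sum(p) - 1.0}
bounds = [(1e-9, 1.0)] * N             # keep each p_i strictly positive for the log
p0 = np.random.dirichlet(np.ones(N))   # a random feasible starting point

res = minimize(neg_entropy, p0, bounds=bounds, constraints=[constraint])

print(res.x)                # approximately [1/N, ..., 1/N] = [0.2, 0.2, 0.2, 0.2, 0.2]
print(-res.fun, np.log(N))  # the maximal entropy agrees with log(N)
```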


One approach:

  • Say $X$ takes values in $\{1,\cdots,n\}$ with probability mass function $p$.
  • Then $H(X) = \mathbb{E}\left[\log\frac{1}{p(X)}\right] \le \log \mathbb{E} \left[ \frac{1}{p(X)} \right] = \log n$, by Jensen's inequality applied to the concave function $f(x)=\log x$, since $\mathbb{E}\left[\frac{1}{p(X)}\right] = \sum_{x} p(x)\cdot\frac{1}{p(x)} = n$.
  • Equality holds when $p(X)$ is constant, i.e. when $X$ is uniformly distributed (see the numerical sketch below).
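
If it helps, here is a short sketch (assuming NumPy) illustrating the bound: random distributions on $\{1,\cdots,n\}$ never exceed $\log n$ in entropy, and the uniform distribution attains it:

```python
import numpy as np

def entropy(p):
    # H(X) = E[log(1/p(X))] = -sum_x p(x) * log(p(x))
    return -np.sum(p * np.log(p))

n = 6
uniform = np.full(n, 1.0 / n)
print(entropy(uniform), np.log(n))        # equal: both are log(6), roughly 1.7918

for _ in range(3):
    p = np.random.dirichlet(np.ones(n))   # a random distribution on {1, ..., n}
    print(entropy(p) <= np.log(n))        # True every time
```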