Matrix determinant inequality proof without using information theory

Here's a more general result.

From Choi's inequality, we know that for a unital positive linear map $\Phi$ and an operator convex function $f$, we have $\Phi(f(\Lambda)) \ge f (\Phi(\Lambda))$. Let $\Phi(\Lambda)=A\Lambda A^T$ (which is unital, since $\Phi(I)=AA^T=I$), and let $f(t)=-\log t$; then we obtain \begin{align*} -A\log(\Lambda)A^T &\ge -\log(A\Lambda A^T)\\ \Leftrightarrow\qquad\log(A\Lambda A^T) &\ge A\log(\Lambda)A^T. \end{align*} This operator inequality is stronger than what we need. Simply take the trace of both sides: since $\log\det(Z)=\operatorname{tr}\log Z$ and $\Lambda$ is diagonal, this gives $$\log\det(A\Lambda A^T)=\operatorname{tr}\log(A\Lambda A^T) \ge \operatorname{tr}\!\left(A\log(\Lambda)A^T\right)=\sum_j (A^TA)_{jj}\log\lambda_j=\sum_j \alpha_j^2\log\lambda_j,$$ i.e. $F(\Lambda) \ge 0$.
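Here is a quick numerical sanity check of the operator inequality and of its traced consequence $F(\Lambda)=\log\det(A\Lambda A^T)-\sum_j\alpha_j^2\log\lambda_j \ge 0$, with $\alpha_j^2=\|A_j\|^2$ the squared norm of the $j$-th column of $A$ (as worked out in the Cauchy-Binet argument below). The sizes $k$, $n$, the seed, and the random instances are arbitrary choices of mine, so this is only a sketch:

```python
# Numerical check of log(A Lam A^T) >= A log(Lam) A^T (as operators) and F(Lam) >= 0.
import numpy as np

rng = np.random.default_rng(0)
k, n = 3, 6

# A is k x n with orthonormal rows (A A^T = I_k), Lam is a positive diagonal matrix.
A = np.linalg.qr(rng.standard_normal((n, k)))[0].T
lam = rng.uniform(0.1, 5.0, size=n)
Lam = np.diag(lam)

def sym_log(M):
    """Matrix logarithm of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.log(w)) @ V.T

lhs = sym_log(A @ Lam @ A.T)           # log(A Lam A^T)
rhs = A @ np.diag(np.log(lam)) @ A.T   # A log(Lam) A^T

# The difference should be positive semidefinite ...
print(np.linalg.eigvalsh(lhs - rhs).min() >= -1e-10)

# ... and taking traces gives log det(A Lam A^T) >= sum_j alpha_j^2 log(lam_j).
alpha_sq = (A**2).sum(axis=0)          # alpha_j^2 = squared norm of column j
F = np.log(np.linalg.det(A @ Lam @ A.T)) - alpha_sq @ np.log(lam)
print(F >= -1e-10)
```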


Let $B := A \Lambda^{1/2}$, so that $A\Lambda A^T = BB^T$. Using the Cauchy-Binet formula for the determinant of $BB^T$, we obtain

$$|BB^T| = \sum_{1 \leq i_1 < \dots < i_k \leq n} |B_{i_1 i_2 \dots i_k}|| B_{i_1 i_2 \dots i_k}^T|, $$

where $B_{i_1 i_2 \dots i_k}$ consists of $k$ columns of $B$ corresponding to the indices $i_1, \dots, i_k$. The right hand side of the above equality may be written explicitly as

$$ \sum_{1 \leq i_1 < \dots < i_k \leq n} \left({\prod_{j=1}^k \lambda_{i_j}}\right) |A_{i_1i_2\dots i_k}|^2.$$
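For readers who want to verify the bookkeeping, here is a brute-force numerical check of this expansion; it is only a sketch, with the sizes $k$, $n$, the seed, and the random $A$ (orthonormal rows) and $\Lambda$ chosen arbitrarily:

```python
# Brute-force check of the Cauchy-Binet expansion of det(A Lam A^T).
from itertools import combinations
import numpy as np

rng = np.random.default_rng(1)
k, n = 3, 6
A = np.linalg.qr(rng.standard_normal((n, k)))[0].T   # k x n with A A^T = I_k
lam = rng.uniform(0.1, 5.0, size=n)                  # diagonal of Lam

# Sum over all k-subsets S of (prod of lam over S) * det(A_S)^2.
expansion = sum(np.prod(lam[list(S)]) * np.linalg.det(A[:, list(S)]) ** 2
                for S in combinations(range(n), k))
print(np.isclose(expansion, np.linalg.det(A @ np.diag(lam) @ A.T)))  # True
```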

Noting that $\sum_{1 \leq i_1 < \dots < i_k \leq n}|A_{i_1i_2\dots i_k}|^2 = |AA^T| = |I|= 1 $ (again via the Cauchy-Binet formula), we can take logarithms and use Jensen's inequality to obtain

$$ \log |A \Lambda A^T| \geq \sum_{1 \leq i_1 < \dots < i_k \leq n} |A_{i_1i_2\dots i_k}|^2 \log \left({\prod_{j=1}^k \lambda_{i_j}}\right). $$
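Both the normalization $\sum_S |A_{i_1i_2\dots i_k}|^2 = 1$ and the Jensen step can be checked the same way; again this is only a sketch on arbitrary random data:

```python
# Check that the weights |A_S|^2 sum to 1 and that the Jensen bound holds.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(2)
k, n = 3, 6
A = np.linalg.qr(rng.standard_normal((n, k)))[0].T   # orthonormal rows
lam = rng.uniform(0.1, 5.0, size=n)

subsets = list(combinations(range(n), k))
weights = np.array([np.linalg.det(A[:, list(S)]) ** 2 for S in subsets])
print(np.isclose(weights.sum(), 1.0))                 # |A A^T| = |I| = 1

lhs = np.log(np.linalg.det(A @ np.diag(lam) @ A.T))
rhs = sum(w * np.log(np.prod(lam[list(S)])) for w, S in zip(weights, subsets))
print(lhs >= rhs - 1e-12)                             # Jensen's inequality
```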

We now gather the coefficients of $\log {\lambda_j}$ for a fixed $j$. Without loss of generality, let $j=1$. The coefficient of $\log {\lambda_1}$ is given by

$$ \sum_{1 = i_1 < \dots < i_k \leq n} |A_{i_1i_2\dots i_k}|^2= 1 - |A_{2,3,\dots, n}A_{2,3,\dots,n}^T| = 1 - |I - A_1 A_1^T| = \alpha_1^2.$$

The first equality follows by using the Cauchy-Binet formula once more: the subsets not containing the index $1$ contribute $|A_{2,3,\dots,n}A_{2,3,\dots,n}^T|$ in total, where $A_{2,3,\dots,n}$ is $A$ with its first column removed, while all subsets together contribute $1$. The second equality holds because $A_{2,3,\dots,n}A_{2,3,\dots,n}^T = AA^T - A_1A_1^T = I - A_1A_1^T$ by the orthogonality of $A$, with $A_1$ the first column of $A$, and the third equality is true because $|I-uu^T| = 1- \|u\|^2$ for any vector $u$; hence $\alpha_1^2 = \|A_1\|^2$. Gathering these coefficients for every $j$, the right-hand side of the Jensen bound is exactly $\sum_j \alpha_j^2 \log\lambda_j$, which proves $F(\Lambda)\ge 0$.
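Here is a brute-force check of this coefficient computation, namely that the total weight of the $k$-subsets containing a fixed index $j$ equals $\|A_j\|^2=\alpha_j^2$; the data are random and indices are $0$-based, so treat it as a sketch:

```python
# Verify: sum of det(A_S)^2 over k-subsets S containing j equals ||A_j||^2.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(3)
k, n = 3, 6
A = np.linalg.qr(rng.standard_normal((n, k)))[0].T   # A A^T = I_k

for j in range(n):
    coeff = sum(np.linalg.det(A[:, list(S)]) ** 2
                for S in combinations(range(n), k) if j in S)
    assert np.isclose(coeff, np.sum(A[:, j] ** 2))    # alpha_j^2 = ||A_j||^2
print("all coefficients match")
```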


This is too long for a comment, and not (yet) a complete answer.


In other words, we have to show that $e^{F(\Lambda)}=\prod_{j} \lambda_j^{-\alpha_j^2}\det (A\Lambda A^T)\geq 1$, or equivalently $$\prod_{j} \lambda_j^{\alpha_j^2}\leq\det (A\Lambda A^T).$$ Indeed, this is trivially true for $k=n$: in that case $A$ is orthogonal, so $\det (A\Lambda A^T)=\prod_j \lambda_j$ and $\alpha_j=1$ for every $j$.
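For what it's worth, the $k=n$ case can be illustrated numerically with a random orthogonal $A$ (nothing here is part of the argument; the size and data are arbitrary):

```python
# k = n: for square orthogonal A, det(A Lam A^T) = prod(lam) and every alpha_j = 1.
import numpy as np

rng = np.random.default_rng(4)
n = 5
A = np.linalg.qr(rng.standard_normal((n, n)))[0]   # square orthogonal: A A^T = I
lam = rng.uniform(0.1, 5.0, size=n)

print(np.isclose(np.linalg.det(A @ np.diag(lam) @ A.T), np.prod(lam)))
print(np.allclose((A ** 2).sum(axis=0), 1.0))      # alpha_j^2 = 1 for all j
```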

Similarly, for $k=1$ one has $\det (A\Lambda A^T)=\sum_j \lambda_j\alpha_j^2\geq \prod_{j} \lambda_j^{\alpha_j^2}$, which is a consequence of Jensen's inequality for the convex function $\exp$. Indeed, the latter gives $$\exp\left(\frac{\sum_j x_j\alpha_j^2}{\sum_j \alpha_j^2}\right)\leq \frac{\sum_j \exp(x_j)\alpha_j^2}{\sum_j \alpha_j^2},$$ and the claim follows by setting $\lambda_j=\exp(x_j)$ and noting that $\sum_j \alpha_j^2=1$.
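And a similar illustration of the $k=1$ case, where $A$ is a single unit row vector (again purely illustrative, with arbitrary random data):

```python
# k = 1: A is a unit row vector, det(A Lam A^T) = sum_j lam_j alpha_j^2,
# and the weighted AM-GM (Jensen for exp) gives the claimed lower bound.
import numpy as np

rng = np.random.default_rng(5)
n = 6
a = rng.standard_normal(n)
a /= np.linalg.norm(a)                 # unit row vector, so sum_j alpha_j^2 = 1
lam = rng.uniform(0.1, 5.0, size=n)

alpha_sq = a ** 2
lhs = float(lam @ alpha_sq)            # det(A Lam A^T) for k = 1
rhs = float(np.prod(lam ** alpha_sq))  # prod_j lam_j^{alpha_j^2}
print(lhs >= rhs - 1e-12)
```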