The intuition behind the Hilbert projective metric and the Perron-Frobenius theorem

Here's my understanding of the intuition behind the Hilbert metric's utility for Perron-Frobenius. (I don't have Birkhoff's paper handy, so I'm not sure to what extent I'm just duplicating what's there.)

(1) As Suvrit's comment pointed out, since studying eigenvectors of $A$ is really a projective question, it is completely natural to consider a metric on the projectivization of $\mathbb{R}_+^n$, so that we are considering lines $\ell$ through the origin into the positive orthant.

(2) For our metric to be useful, we should be able to compute $d(\ell_1,\ell_2)$ in terms of $x_1,x_2$ for some (any) $x_i\in \ell_i$. This should be independent of the choice of $x_i$; replacing $x_i$ with another point on $\ell_i$ should not change the value of $d$. At the risk of being a little bit vague, this replacement is a kind of projective transformation, and so it is natural to ask our metric to be associated to a projective invariant.

(3) The most fundamental projective invariant is the cross-ratio. Given four collinear points $x_1,x_2,x_3,x_4$, the cross-ratio is $$ (x_1,x_2;x_3,x_4) = \frac{|x_3 - x_1|}{|x_3-x_2|} \cdot \frac{|x_4-x_2|}{|x_4-x_1|}\qquad\qquad (*) $$

(4) A cross-ratio requires four points but a metric only has two points as input. So given $x_1,x_2\in \mathbb{R}_+^n$, we need to choose two other points that are collinear with $x_1,x_2$. Let $L$ be the line through $x_1,x_2$; the only other natural reference points to choose on $L$ are the points where it intersects the boundary of the positive orthant; that is, the two points $x_3,x_4\in L$ where one (or more) of the coordinates vanishes and the rest are positive.

(5) With $x_i$ as above, note that $(x_1,x_2; x_3,x_4)$ is equal to 1 iff $x_1=x_2$ (since we always have $x_3\neq x_4$) and so to produce something with the right scaling for a metric we should put $d(x_1,x_2) = |\log(x_1,x_2;x_3,x_4)|$. This defines the Hilbert metric.
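
For what it's worth, steps (4) and (5) can be sketched in a few lines of Python. This is only an illustration under some assumptions that are not in the text above: both vectors are first rescaled to have coordinate sum 1 (so that the line through them really does hit the boundary of the orthant twice), the function names are made up for this example, and the result is compared against the standard closed form $\log\big(\max_i (x_i/y_i) / \min_i (x_i/y_i)\big)$ for the Hilbert metric.

```python
import math

def hilbert_closed_form(x, y):
    # Standard closed form: log of (largest ratio x_i/y_i) / (smallest ratio).
    ratios = [xi / yi for xi, yi in zip(x, y)]
    return math.log(max(ratios) / min(ratios))

def hilbert_cross_ratio(x, y):
    # Assumption: pick representatives with coordinate sum 1, so the
    # difference q - p has entries of both signs unless p == q.
    sx, sy = sum(x), sum(y)
    p = [xi / sx for xi in x]
    q = [yi / sy for yi in y]
    d = [qi - pi for pi, qi in zip(p, q)]
    if all(abs(di) < 1e-15 for di in d):
        return 0.0  # same line through the origin
    # Parametrize the line as p + t*d, so p is at t = 0 and q is at t = 1.
    # The boundary points x_3, x_4 are where some coordinate first vanishes.
    t_min = max(-pi / di for pi, di in zip(p, d) if di > 0)  # t_min < 0
    t_max = min(-pi / di for pi, di in zip(p, d) if di < 0)  # t_max > 1
    # Cross-ratio (*) with x_3 at t_min and x_4 at t_max.
    cr = (-t_min / (1.0 - t_min)) * ((t_max - 1.0) / t_max)
    return abs(math.log(cr))

x = [1.0, 2.0, 3.0]
y = [3.0, 1.0, 1.0]
print(hilbert_cross_ratio(x, y), hilbert_closed_form(x, y))
```

The two functions should agree up to rounding, and rescaling either input should leave both values unchanged.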

(6) It remains to get some intuition for why the Hilbert metric is a natural choice for something that is contracted by $A$. First note that since $A$ is linear and the quantities $x_i-x_j$ in $(*)$ are all scalar multiples of each other, we have $(Ax_1,Ax_2;Ax_3,Ax_4) = (x_1,x_2;x_3,x_4)$. Let $x_3',x_4'$ be the boundary points for the line through $Ax_1,Ax_2$, so that we have the following picture; note that this is where we use positivity of $A$ to guarantee that boundary points of the positive orthant ($x_3,x_4$) are mapped into the interior.

[Image: the six points $Ax_1, Ax_2, Ax_3, Ax_4, x_3', x_4'$ on the line through $Ax_1$ and $Ax_2$]

Now we have $d(x_1,x_2) = |\log(Ax_1,Ax_2;Ax_3,Ax_4)|$ and $d(Ax_1,Ax_2) = |\log(Ax_1,Ax_2;x_3',x_4')|$. The first cross-ratio is the product of the ratios $$\frac{|Ax_1-Ax_3|}{|Ax_2-Ax_3|} \text{ and } \frac{|Ax_4 - Ax_2|}{|Ax_4 - Ax_1|},$$ while the second is the product of the ratios $$\frac{|Ax_1-x_3'|}{|Ax_2-x_3'|} \text{ and }\frac{|x_4' - Ax_2|}{|x_4' - Ax_1|}.$$ Compare the first members of these pairs of ratios; to go from the first to the second we add $|Ax_3 - x_3'|$ to both the numerator and the denominator, which moves the ratio closer to 1. The same holds for the second ratio in each pair, with $|Ax_4 - x_4'|$ added instead. Thus the cross-ratio involved in the definition of $d(Ax_1,Ax_2)$ is closer to 1 than the cross-ratio involved in $d(x_1,x_2)$, which is equivalent to the statement that $A$ contracts the metric $d$.
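
To spell out the comparison step, here is a short sketch under the assumption (glossed over above) that the six points lie in the order $x_3', Ax_3, Ax_1, Ax_2, Ax_4, x_4'$ along the line. Write $a = |Ax_1 - Ax_3|$, $b = |Ax_2 - Ax_3|$ and $c = |Ax_3 - x_3'|$, so that $0 < a < b$ and $c > 0$ (the symbols $a,b,c$ are introduced here only for illustration). Then $$\frac{|Ax_1-x_3'|}{|Ax_2-x_3'|} = \frac{a+c}{b+c}, \qquad \frac{a+c}{b+c} - \frac{a}{b} = \frac{c(b-a)}{b(b+c)} > 0, \qquad \frac{a+c}{b+c} < 1,$$ so the ratio moves strictly towards 1 without crossing it.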

Of course one has to be a little more careful with this to guarantee that the contraction is uniform (and I've been a little glib regarding the relative orders of $x_1,x_2,x_3,x_4$), but this is the geometric intuition behind the fact that positive matrices contract the Hilbert metric on the positive orthant.
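
As a quick numerical sanity check of the contraction claim, here is a minimal Python sketch. The matrix `A` and the vectors `x`, `y` are arbitrary choices made for this example, the helper names are hypothetical, and the metric is computed with the standard closed form $\log\big(\max_i (x_i/y_i) / \min_i (x_i/y_i)\big)$ rather than directly from the cross-ratio.

```python
import math

def hilbert(x, y):
    # Hilbert metric on the open positive orthant via the closed form
    # log(max_i x_i/y_i) - log(min_i x_i/y_i).
    ratios = [xi / yi for xi, yi in zip(x, y)]
    return math.log(max(ratios) / min(ratios))

def apply(A, x):
    # Matrix-vector product for a matrix stored as a list of rows.
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Example data (assumed for illustration): a matrix with strictly
# positive entries and two non-proportional positive vectors.
A = [[2.0, 1.0],
     [1.0, 3.0]]
x = [1.0, 2.0]
y = [3.0, 1.0]

before = hilbert(x, y)                     # d(x, y)
after = hilbert(apply(A, x), apply(A, y))  # d(Ax, Ay)
print(before, after, after / before)
```

For this particular pair the distance after applying `A` should come out strictly smaller than before; one example is of course not a proof, and it says nothing about uniformity.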


There is actually a paper on this topic by E. Kohlberg and W. Pratt, "The contraction mapping approach to the Perron-Frobenius theory: why Hilbert’s metric?", published in Math. Oper. Res. 7(2) (1982). They explain there why the Hilbert metric is contracting.

They also prove something which is worth noting here. Let $K$ be a convex cone in $\mathbb{R}^d$. Say that $K$ is pointed if $K\cap -K=\{0\}$. A projective metric on a pointed cone $K$ is a map $D:K\times K \to \mathbb{R}_{\geq 0}$ which is symmetric, satisfies the triangle inequality, and satisfies the following separation condition: if $D(x,y)=0$, then $x=\lambda y$ for some $\lambda\geq 0$. For example, the Hilbert metric is a projective metric on the pointed cone of non-negative vectors.

Given a cone $K$, say that a linear transformation $A$ is positive (with respect to $K$) if $Ax\in K$ whenever $x\in K$. Given a projective metric $D$ on $K$, define $$k_D(A)=\sup \left\{ \frac{D(Ax,Ay)}{D(x,y)} : x,y\in K,\ D(x,y)>0 \right\}.$$ Then $A$ induces a contraction on $K$ for the metric $D$ if and only if $k_D(A)<1$.
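
To make the definition of $k_D(A)$ concrete, here is a rough Python sketch that estimates it for the Hilbert metric on the positive orthant by random sampling. Everything specific here is an assumption made for illustration: the function names, the example matrix, the sampling box $[0.1, 10]^n$, and the sample count. Since the supremum is replaced by a maximum over finitely many random pairs, the result is only a lower bound for $k_d(A)$.

```python
import math
import random

def hilbert(x, y):
    # Hilbert metric via the closed form log(max_i x_i/y_i / min_i x_i/y_i).
    ratios = [xi / yi for xi, yi in zip(x, y)]
    return math.log(max(ratios) / min(ratios))

def apply(A, x):
    # Matrix-vector product for a matrix stored as a list of rows.
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def estimate_contraction_coefficient(A, n_samples=10000, seed=0):
    # Largest observed ratio d(Ax, Ay) / d(x, y) over random positive pairs.
    rng = random.Random(seed)
    n = len(A[0])
    best = 0.0
    for _ in range(n_samples):
        x = [rng.uniform(0.1, 10.0) for _ in range(n)]
        y = [rng.uniform(0.1, 10.0) for _ in range(n)]
        dxy = hilbert(x, y)
        if dxy > 1e-6:  # skip (nearly) proportional pairs, where D(x, y) = 0
            best = max(best, hilbert(apply(A, x), apply(A, y)) / dxy)
    return best

# Example matrix (assumed): strictly positive entries.
A = [[2.0, 1.0],
     [1.0, 3.0]]
estimate = estimate_contraction_coefficient(A)
print(estimate)
```

An estimate below 1 is consistent with `A` being a contraction for the Hilbert metric, but because sampling only bounds the supremum from below, it cannot establish that on its own.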

E. Kohlberg and W. Pratt define in general a Hilbert metric on any pointed cone $K$, which is the same as the metric described by @VaughnClimenhaga. They prove that any positive linear transformation $A$ induces a contraction for this Hilbert metric. They also prove the following.

Theorem (Kohlberg, Pratt) Let $K$ be a pointed cone in $\mathbb{R}^d$ and let $d$ be the Hilbert metric on $K$ and let $D$ be another projective metric on $K$. Let $A$ be a positive linear transformation. Then, there exists a function $f:\mathbb{R}\to\mathbb{R}$ such that for any $x,y$, $D(x,y)=f(d(x,y))$. Moreover, $k_d(A)\leq k_D(A)$.

Consequently, not only does the Hilbert metric yield a contraction, it is also the optimal metric for doing so, since it has the best possible contraction coefficient.