Quantifying the noninvertibility of a function

You're right that this is a significant quantity information-theoretically. It's essentially the Rényi entropy of order $2$, as I'll explain.

First let me generalize your setting ever so slightly, because I find it a distraction that you've made the domain and codomain the same. For any function $f: X \to Y$ between finite sets, put $$ \kappa_f = \sum_{y \in Y} |f^{-1}(y)|^2/|X|. $$ This extends your definition, and continues to have the kind of properties you want: $\kappa_f = 1$ iff $f$ is injective, and $\kappa_f = |X|$ iff $f$ is constant. Anyway, you can ignore my generalization if you want and stick with $Y = X$.
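
If you want to experiment, here is a minimal sketch of $\kappa_f$ in Python (the function and variable names are mine, purely for illustration):

```python
from collections import Counter

def kappa(f, X):
    """kappa_f = (sum over y of |f^{-1}(y)|^2) / |X|, for f defined on the finite set X."""
    fibre_sizes = Counter(f(x) for x in X)           # |f^{-1}(y)| for each y in the image
    return sum(n * n for n in fibre_sizes.values()) / len(X)

X = range(6)
print(kappa(lambda x: x, X))        # injective: 1.0
print(kappa(lambda x: 0, X))        # constant: 6.0 = |X|
print(kappa(lambda x: x % 3, X))    # two-to-one: 2.0
```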

The function $f: X \to Y$ gives rise to a probability distribution $\mathbf{p} = (p_y)_{y \in Y}$ on $Y$, defined by $$ p_y = |f^{-1}(y)|/|X|. $$ Like any probability distribution on any finite set, $\mathbf{p}$ has a Rényi entropy of order $q$ for every $q \in [-\infty, \infty]$. When $q \neq 1, \pm\infty$, this is by definition $$ H_q(\mathbf{p}) = \frac{1}{1 - q} \log \sum_y p_y^q, $$ where the sum runs over the support of $\mathbf{p}$. The exceptional cases are obtained by taking limits in $q$; explicitly, $H_1$ is the Shannon entropy $$ H_1(\mathbf{p}) = - \sum_y p_y \log p_y $$ and $$ H_\infty(\mathbf{p}) = -\log\max_y p_y, \qquad H_{-\infty}(\mathbf{p}) = -\log\min_y p_y $$ (where again, the min is over the support of $\mathbf{p}$).
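
Here is a quick computational sketch of the induced distribution and of $H_q$, treating the exceptional values of $q$ separately (again, the names are mine):

```python
import math
from collections import Counter

def induced_distribution(f, X):
    """The distribution p on the image of f, with p_y = |f^{-1}(y)| / |X|."""
    counts = Counter(f(x) for x in X)
    return [c / len(X) for c in counts.values()]

def renyi_entropy(p, q):
    """Renyi entropy H_q of a finite distribution p; sums, max and min run over the support."""
    support = [pi for pi in p if pi > 0]
    if q == 1:                                   # Shannon entropy, the q -> 1 limit
        return -sum(pi * math.log(pi) for pi in support)
    if q == math.inf:
        return -math.log(max(support))
    if q == -math.inf:
        return -math.log(min(support))
    return math.log(sum(pi ** q for pi in support)) / (1 - q)
```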

Many of the good properties of Shannon entropy are shared by the Rényi entropies $H_q$. For example, over all probability distributions $\mathbf{p}$ on an $n$-element set, the maximum value of $H_q(\mathbf{p})$ is $\log n$, which is attained when $\mathbf{p}$ is uniform, and the minimum value is $0$, which is attained when $\mathbf{p} = (0, \ldots, 0, 1, 0, \ldots, 0)$. That's true for every $q \in [-\infty, \infty]$.

Often it's better to work with the exponentials of the Rényi entropies, which I'll write as $D_q = \exp H_q$. For instance, $$ D_2(\mathbf{p}) = 1\Big/\sum_y p_y^2. $$ (D stands for diversity, since ecologists use $D_q$ to measure biodiversity; in ecology, $D_q$ is called the "Hill number" of order $q$.) So the maximum value of $D_q(\mathbf{p})$ over distributions $\mathbf{p}$ on a fixed finite set is the cardinality of that set, not its logarithm.
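
For what it's worth, when $q \neq 1, \pm\infty$ the exponential can be written directly as $D_q(\mathbf{p}) = \bigl(\sum_y p_y^q\bigr)^{1/(1-q)}$. A tiny numerical check of the $q = 2$ formula (sketch only):

```python
def hill_number(p, q):
    """Hill number D_q = exp(H_q) = (sum of p_i^q)^(1/(1-q)), for q != 1, ±inf."""
    support = [pi for pi in p if pi > 0]
    return sum(pi ** q for pi in support) ** (1 / (1 - q))

p = [0.5, 0.25, 0.25]
print(hill_number(p, 2))             # 2.666...
print(1 / sum(pi ** 2 for pi in p))  # same: 1 / sum p_i^2
```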

Returning to your question, we had a function $f: X \to Y$ between finite sets and the induced probability distribution $\mathbf{p}$ on $Y$. It's a trivial manipulation to show that $$ \kappa_f = |X|/D_2(\mathbf{p}). $$ So as I claimed at the start, $\kappa_f$ is essentially (up to a simple transformation) the Rényi entropy of order $2$ (of the distribution $\mathbf{p}$ induced by $f$).
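
Just to spell that manipulation out: $$ \kappa_f \;=\; \sum_{y \in Y} \frac{|f^{-1}(y)|^2}{|X|} \;=\; |X| \sum_{y \in Y} \left(\frac{|f^{-1}(y)|}{|X|}\right)^2 \;=\; |X| \sum_y p_y^2 \;=\; \frac{|X|}{D_2(\mathbf{p})}. $$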

You might also want to consider $$ |X|/D_q(\mathbf{p}) $$ for other values of $q$, especially the Shannon case $q = 1$. Although entropies of order $2$ are the easiest to manipulate (being essentially quadratic forms), it's $q = 1$ that has the really magical properties.
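
Here is a sketch of how the whole family looks on an example, repackaging the snippets above into one self-contained function (names again mine; $q = 2$ recovers $\kappa_f$):

```python
import math
from collections import Counter

def noninvertibility(f, X, q):
    """|X| / D_q(p), where p is the distribution induced by f; q = 1 is the Shannon limit."""
    p = [c / len(X) for c in Counter(f(x) for x in X).values()]
    if q == 1:
        H = -sum(pi * math.log(pi) for pi in p)
    elif q == math.inf:
        H = -math.log(max(p))
    else:
        H = math.log(sum(pi ** q for pi in p)) / (1 - q)
    return len(X) / math.exp(H)

f = lambda x: x % 5                              # an example non-injective map on {0, ..., 11}
for q in (1, 2, math.inf):
    print(q, noninvertibility(f, range(12), q))  # q = 2 gives kappa_f = 2.5
```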

Incidentally, in ecology the quantity $\sum p_i^2 = 1/D_2(\mathbf{p})$ is known as the Simpson index (and $1 - \sum p_i^2$ as the Gini-Simpson index); there $p_1, \ldots, p_n$ are the relative abundances of the $n$ species in some community. Jack Good wrote in 1982 that it should really bear the name of Turing, but also that "any statistician of this century who wanted a measure of homogeneity would have taken about two seconds to suggest $\sum p_i^2$." Thanks, Jack.


$\lambda(f):=\kappa_f-1$ is called "the coefficient of coalescence of $f$" here:

https://msp.org/pjm/1982/103-2/pjm-v103-n2-p03-p.pdf

(note the typo on p. 269; the correct definition appears on p. 272).

Of course, $\lambda(f)/|X|$ and $\lambda(f)\,|X|$ are specific instances of well-known concepts, though to the best of my knowledge without special names: the former is the squared Euclidean distance between the preimage distribution $p(x) = |f^{-1}(x)|/|X|$ and the uniform distribution on $X$, and the latter is the value of the $\chi^2$ test statistic for comparing $f$ with a (uniform) random mapping.
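
For completeness, both identities are quick algebra: with $Y = X$, $n = |X|$ and $p_x = |f^{-1}(x)|/n$, $$ \sum_{x \in X}\Bigl(p_x - \tfrac{1}{n}\Bigr)^2 = \sum_x p_x^2 - \frac{1}{n} = \frac{\kappa_f - 1}{n} = \frac{\lambda(f)}{|X|}, \qquad \sum_{x \in X}\frac{\bigl(|f^{-1}(x)| - 1\bigr)^2}{1} = n^2\sum_x p_x^2 - n = \lambda(f)\,|X|, $$ where the second sum is the $\chi^2$ statistic with expected count $1$ in every cell.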