How many surjections are there from a set of size n?

It seems to be the case that the polynomial $P_n(x) =\sum_{m=1}^n m!S(n,m)x^m$ has only real zeros. (I know it is true that $\sum_{m=1}^n S(n,m)x^m$ has only real zeros.) If this is true, then the value of $m$ maximizing $m!S(n,m)$ is within 1 of $P'_n(1)/P_n(1)$ by a theorem of J. N. Darroch, Ann. Math. Stat. 35 (1964), 1317-1321. See also J. Pitman, J. Combinatorial Theory, Ser. A 77 (1997), 279-303. By standard combinatorics $$ \sum_{n\geq 0} P_n(x) \frac{t^n}{n!} = \frac{1}{1-x(e^t-1)}. $$ Hence $$ \sum_{n\geq 0} P_n(1)\frac{t^n}{n!} = \frac{1}{2-e^t} $$ $$ \sum_{n\geq 0} P'_n(1)\frac{t^n}{n!} = \frac{e^t-1}{(2-e^t)^2}. $$ Since these functions are meromorphic with smallest singularity at $t=\log 2$, it is routine to work out the asymptotics, though I have not bothered to do this.
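As a sanity check on the two generating functions at $x=1$, here is a minimal Python sketch that compares their truncated power series with direct sums of $m!S(n,m)$. The helper names (`sur`, `series_mul`, `series_inverse`) are illustrative and not taken from any reference; $m!S(n,m)$ is computed with the standard inclusion-exclusion formula, and the $m=0$ term is included so that $P_0 = 1$ agrees with the constant term of the series.

```python
from fractions import Fraction
from math import comb, factorial

N = 10  # truncation order: every series is kept modulo t^N


def sur(n, m):
    """m! S(n, m) by inclusion-exclusion: the number of surjections [n] -> [m]."""
    return sum((-1) ** (m - j) * comb(m, j) * j ** n for j in range(m + 1))


def series_mul(a, b):
    """Product of two truncated power series given as coefficient lists."""
    out = [Fraction(0)] * N
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < N:
                out[i + j] += x * y
    return out


def series_inverse(a):
    """Multiplicative inverse of a truncated power series with a[0] != 0."""
    inv = [Fraction(0)] * N
    inv[0] = 1 / a[0]
    for n in range(1, N):
        inv[n] = -sum(a[k] * inv[n - k] for k in range(1, n + 1)) / a[0]
    return inv


# e^t - 1 and 2 - e^t = 1 - (e^t - 1) as exact rational series
exp_minus_1 = [Fraction(0)] + [Fraction(1, factorial(k)) for k in range(1, N)]
two_minus_exp = [Fraction(1)] + [-c for c in exp_minus_1[1:]]

inv = series_inverse(two_minus_exp)                           # 1 / (2 - e^t)
deriv_series = series_mul(exp_minus_1, series_mul(inv, inv))  # (e^t - 1) / (2 - e^t)^2

# n! [t^n] of each series, versus the direct sums over m
p_from_egf = [factorial(n) * inv[n] for n in range(N)]
dp_from_egf = [factorial(n) * deriv_series[n] for n in range(N)]
p_direct = [sum(sur(n, m) for m in range(n + 1)) for n in range(N)]
dp_direct = [sum(m * sur(n, m) for m in range(n + 1)) for n in range(N)]

print(p_from_egf == p_direct, dp_from_egf == dp_direct)
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point tolerance.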

Update. It is indeed true that $P_n(x)$ has only real zeros. This is because $(x-1)^nP_n(1/(x-1))=A_n(x)/x$, where $A_n(x)$ is an Eulerian polynomial. It is known that $A_n(x)$ has only real zeros, and the operation $P_n(x) \to (x-1)^nP_n(1/(x-1))$ preserves the property of having only real zeros.
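The identity with the Eulerian polynomial can be checked for small $n$ with a short Python sketch. It assumes the convention $A_n(x) = \sum_k A(n,k)\,x^{k+1}$, where $A(n,k)$ counts permutations of $n$ elements with $k$ descents, so that $A_n(x)/x$ has coefficient list $A(n,0),\ldots,A(n,n-1)$; the function names are illustrative only.

```python
from math import comb


def sur(n, m):
    """m! S(n, m) by inclusion-exclusion: the number of surjections [n] -> [m]."""
    return sum((-1) ** (m - j) * comb(m, j) * j ** n for j in range(m + 1))


def eulerian_row(n):
    """[A(n,0), ..., A(n,n-1)] via A(n,k) = (k+1) A(n-1,k) + (n-k) A(n-1,k-1)."""
    row = [1]
    for size in range(2, n + 1):
        prev = row + [0]
        row = [
            (k + 1) * prev[k] + (size - k) * (prev[k - 1] if k else 0)
            for k in range(size)
        ]
    return row


def transformed_coeffs(n):
    """Coefficients (constant term first) of (x-1)^n P_n(1/(x-1)),
    that is, of sum_{m=1}^n m! S(n,m) (x-1)^(n-m)."""
    coeffs = [0] * n
    for m in range(1, n + 1):
        d = n - m
        s = sur(n, m)
        for i in range(d + 1):
            # binomial expansion of (x - 1)^d
            coeffs[i] += s * comb(d, i) * (-1) ** (d - i)
    return coeffs


for n in range(1, 8):
    print(n, transformed_coeffs(n) == eulerian_row(n))
```

This only confirms the identity for the values of $n$ tested; it is not a proof.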


Richard's answer is short, slick, and complete, but I wanted to mention here that there is also a "real variable" approach that is consistent with that answer; it gives weaker bounds at the end, but also tells a bit more about the structure of the "typical" surjection. I'll write the argument in a somewhat informal "physicist" style, but I think it can be made rigorous without significant effort.

Tim's function $Sur(n,m) = m! S(n,m)$ obeys the easily verified recurrence $Sur(n,m) = m ( Sur(n-1,m) + Sur(n-1,m-1) )$, which on expansion becomes

$Sur(n,m) = \sum m_1 \cdots m_n = \sum \exp( \sum_{j=1}^n \log m_j )$

where the sum is over all paths $1=m_1 \leq m_2 \leq \ldots \leq m_n = m$ in which each $m_{i+1}$ is equal to either $m_i$ or $m_i+1$; one can interpret $m_i$ as being the size of the image of the first $i$ elements of $\{1,\ldots,n\}$. If we make the ansatz $m_j \approx n f(j/n)$ for some nice function $f: [0,1] \to {\bf R}^+$ with $f(0)=0$ and $0 \leq f'(t) \leq 1$ for all $t$, and use standard entropy calculations (Stirling's formula and Riemann sums, really), we obtain a contribution to $Sur(n,m)$ of the form

$\exp( n \int_0^1 \log(n f(t))\ dt + n \int_0^1 h(f'(t))\ dt + o(n) )$ (*)

where $h$ is the entropy function $h(\theta) := -\theta \log \theta - (1-\theta) \log (1-\theta)$. So, heuristically at least, the optimal profile comes from maximising the functional

$\int_0^1 \log(f(t)) + h(f'(t))\ dt$

subject to the boundary condition $f(0)=0$. (The fact that $h$ is concave will make this maximisation problem nice and elliptic, which makes it very likely that these heuristic arguments can be made rigorous.) The Euler-Lagrange equation for this problem is

$-\frac{f''}{f'(1-f')} = \frac{1}{f}$

while the free boundary at $t=1$ gives us the additional Neumann boundary condition $f'(1)=1/2$. The translation invariance of the Lagrangian gives rise to a conserved quantity; indeed, multiplying the Euler-Lagrange equation by $f'$ and integrating one gets

$\log(1-f') = \log f + C$

which is easily solved as

$f = \frac{1}{A} (1 - B e^{-At} )$

for some constants A, B. The Dirichlet boundary condition $f(0)=0$ gives $B=1$; the Neumann boundary condition $f'(1)=1/2$ gives $A=\log 2$, thus

$f(t) = (1 - 2^{-t}) / \log 2$.

In particular $f(1)=1/(2 \log 2)$, which matches Richard's answer that the maximum occurs when $m/n \approx 1/(2 \log 2)$. To match up with the asymptotic for $Sur(n,m)$ in Richard's answer (up to an error of $\exp(o(n))$), I need to have

$\int_0^1 \log f(t) + h(f'(t))\ dt = - 1 - \log \log 2.$

And happily, this turns out to be the case (after a mildly tedious computation).
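For readers who want to see the computation, here is one way to organise it. This is a sketch: the substitution $u = 2^{-t}$ is just one convenient choice. With $f(t) = (1-2^{-t})/\log 2$ we have $f'(t) = u$ and $f(t) = (1-u)/\log 2$, so

$$ \log f + h(f') = -\log\log 2 + \log(1-u) - u\log u - (1-u)\log(1-u) = -\log\log 2 + u\log(1-u) - u\log u. $$

Since $dt = -du/(u \log 2)$ and $u$ runs from $1$ down to $1/2$ as $t$ runs from $0$ to $1$,

$$ \int_0^1 \bigl(u\log(1-u) - u\log u\bigr)\ dt = \frac{1}{\log 2}\int_{1/2}^1 \bigl(\log(1-u) - \log u\bigr)\ du = \frac{1}{\log 2}\left( -\tfrac{1}{2} - \tfrac{1}{2}\log 2 + \tfrac{1}{2} - \tfrac{1}{2}\log 2 \right) = -1, $$

and adding the constant term $-\log\log 2$ gives $-1-\log\log 2$.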

This calculation reveals more about the structure of a "typical" surjection from n elements to m elements for m free than just the fact that $m/n \approx 1/(2 \log 2)$: it shows that for any $0 < t < 1$, the image of the first $tn$ elements has cardinality about $f(t) n$. If one fixes $m$ rather than letting it be free, then one has a similar description of the surjection, but one needs to adjust the A parameter (it has to solve the transcendental equation $(1-e^{-A})/A = m/n$).
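The predicted profile can be compared with exact values for moderate $n$. The sketch below assumes that "typical" means uniform over all pairs ($m$, surjection onto $\{1,\ldots,m\}$), and uses the path weights $m_1 \cdots m_n$ from the expansion above in a forward/backward dynamic programme to get the exact expected image size of the first $i$ elements. The function names are illustrative, and the choice $n=200$ is arbitrary.

```python
from fractions import Fraction
from math import log


def expected_image_sizes(n):
    """Exact E[m_i], i = 1..n, where m_i is the image size of the first i
    elements, under the uniform measure on all surjections from [n] (m free)."""
    # forward[i][j]: total weight m_1 * ... * m_i over paths ending at m_i = j
    forward = [[0] * (n + 2) for _ in range(n + 1)]
    forward[1][1] = 1
    for i in range(2, n + 1):
        for j in range(1, i + 1):
            forward[i][j] = j * (forward[i - 1][j] + forward[i - 1][j - 1])

    # backward[i][j]: total weight m_{i+1} * ... * m_n over continuations
    # that start from m_i = j and may end at any value of m_n
    backward = [[0] * (n + 2) for _ in range(n + 2)]
    for j in range(1, n + 1):
        backward[n][j] = 1
    for i in range(n - 1, 0, -1):
        for j in range(1, i + 1):
            backward[i][j] = j * backward[i + 1][j] + (j + 1) * backward[i + 1][j + 1]

    total = sum(forward[n][1:n + 1])  # number of all surjections from [n]
    return [
        Fraction(sum(j * forward[i][j] * backward[i][j] for j in range(1, i + 1)), total)
        for i in range(1, n + 1)
    ]


def f(t):
    """Heuristic limiting profile f(t) = (1 - 2^(-t)) / log 2."""
    return (1 - 2 ** (-t)) / log(2)


n = 200
sizes = expected_image_sizes(n)
for t in (0.25, 0.5, 0.75, 1.0):
    i = int(t * n)
    print(t, float(sizes[i - 1]) / n, f(t))
```

Agreement here is only numerical evidence for the heuristic, with an error that should shrink as $n$ grows; it does not replace a rigorous argument.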

With a bit more effort, this type of computation should also reveal the typical distribution of the preimages of the surjection, and suggest a random process that generates something that is within o(n) edits of a random surjection.

It's also interesting to note that the answer $m/n \approx 1/(2\log 2) = 0.72134\ldots$ fits extremely well with Kevin's numerical computation $f(1000)=722$, so we now have several independent confirmations that this is the correct answer...
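The numerical side of this comparison can be reproduced with the recurrence $Sur(n,m) = m(Sur(n-1,m) + Sur(n-1,m-1))$. The following Python sketch (function names are illustrative) computes the maximising $m$ and the exact mean $P'_n(1)/P_n(1)$ for $n=1000$, so that both can be set beside $n/(2\log 2)$.

```python
from fractions import Fraction
from math import log


def surjection_row(n):
    """[Sur(n,0), ..., Sur(n,n)] with Sur(n,m) = m! S(n,m), via the recurrence."""
    row = [1]  # Sur(0,0) = 1
    for _ in range(n):
        prev = row + [0]
        row = [0] + [m * (prev[m] + prev[m - 1]) for m in range(1, len(prev))]
    return row


def mode_and_mean(n):
    """Return (argmax_m Sur(n,m), P'_n(1)/P_n(1)) using exact integer arithmetic."""
    row = surjection_row(n)
    mode = max(range(len(row)), key=row.__getitem__)
    mean = Fraction(sum(m * c for m, c in enumerate(row)), sum(row))
    return mode, mean


mode, mean = mode_and_mean(1000)
print(mode, float(mean), 1000 / (2 * log(2)))
```

Python's arbitrary-precision integers keep the computation exact, at the cost of roughly $n^2/2$ big-integer operations.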


This looks like the Stirling numbers of the second kind (up to the $m!$ factor).

This paper and this one are specifically devoted to the maximal Stirling numbers. It seems that for large $n$ the relevant asymptotic expansion is $$k! S(n,k)= (e^r-1)^k \frac{n!}{r^n}(2\pi k B)^{-1/2}\left(1-\frac{6r^2\theta^2 +6r\theta+1}{12re^r}+O(n^{-2})\right),$$ where $$e^r-1=k+\theta,\quad \theta=O(1),$$ $$B=\frac{re^{2r}-(r^2+r)e^r}{(e^r-1)^2}.$$