Spanning forests of bipartite graphs and distinct row/column sums of binary matrices

This conjectured equivalence (if true) is still amazing to me. However, here is a baby step: The conjecture is indeed true for $m=2$. Perhaps someone else will find this useful as a base for generalization.

Claim: $|F_{2,n}| = |S_{2,n}|$ for all $n$

First, lets consider $F_{2,n}$. In your definition, a spanning forest is any subset $T$ of edges of $K_{2,n}$ which is acyclic. Let $u,v$ be the two nodes on the $m=2$ side, and let $S(u)$ be the neighbors of $u$, i.e. the subset of nodes (on the $n$-node side) which $u$ is connected to in $T$. Similarly for $S(v)$.

Next, note that $T$ contains a cycle iff $S = S(u) \cap S(v)$ contains $2$ nodes (or more). So, $F_{2,n}$ can be counted by counting all cases where $S$ contains $0$ or $1$ node.

  • $|S|=0$ case: Each of $n$ nodes can be connected to $u$, or to $v$, or to neither. Total no. $= 3^n$.

  • $|S|=1$ case: There are $n$ ways to pick the unique $w \in S$. After that, each of the remaining $n-1$ nodes can be connected to $u$, or to $v$, or to neither. Total no. $= n\, 3^{n-1}$

  • Summing up, $|F_{2,n}| = 3^n + n \,3^{n-1} = 3^{n-1} (3+n),$ which matches your table.

Second, lets consider $S_{2,n}$. There are $n$ column sums, each of which can be $0, 1, 2$. Let $x,y,z$ be the number of columns whose sum $=0, 1, 2$ respectively. The columns whose sum $=0$ or $2$ dictate their elements. In the $y$ columns whose sum $=1$, the number of $1$s in the top row can be any integer $\in [0,y]$, and so the top row sum can be any integer $\in [z,z+y]$, and in short there are $(y+1)$ possibilities. Obviously, once the top row sum is known that determines the bottom row sum $= (y+2z) \,-$ top row sum.

So $S_{2,n}$ can be counted by (i) summing over all possible $y$, and for each $y$: (ii) picking the $y$ columns whose sum $=1$, and (iii) for the remaining $n-y=x+z$ columns assign them $0$ or $2$ arbitrarily. I.e.:

$$|S_{2,n}| = \sum_{y=0}^n (y+1) {n \choose y} 2^{n-y}$$

This can be evaluated many ways but my favorite is to recognize a slightly transformed sum as an explicit formula for the expected value of a Binomial random variable:

$$ \begin{align} |S_{2,n}| &= \sum_{y=0}^n (y+1) {n \choose y} 2^{n-y} \\ &= 3^n \times \sum_{y=0}^n (1+y) {n \choose y} (\frac13)^y (\frac23)^{n-y} \\ &= 3^n \times \mathbb{E}[1 + Bin(n, \frac13)] \\ &= 3^n (1 + {n \over 3}) = 3^{n-1} (3 + n) = |F_{2,n}| & QED \end{align} $$


As I said, this is a baby step only. The key step in $F_{2,n}$ is that $T$ has a cycle iff $|S| \ge 2$, and the key step in $S_{2,n}$ is that the $y$ ones can be all in the top row or all in the bottom row or anywhere in between, for $(1+y)$ possibilities. These key steps make the counting easy. However, as far as I can see, neither key step has any easy way to generalize to $m=3$, let alone arbitrary $m$. (E.g. for $m=3$, a cycle in $T$ can involve $2$ or $3$ nodes on the $m=3$ side, and I don't know any good way to count that.)


I can prove the conjecture, although I can't give a bijection. I've left up a previous answer which does the case $m=3$, since it serves as an example of many of the ideas here.

Here is the key observation:

Theorem: There is a unique array $f(m,n)$, for $m$ and $n$ nonnegative integers, satisfying:

  1. $f(m,0) = f(0,n)=1$

  2. If we fix $m \geq 1$, then $f(m,n)$ is of the form $(m+1)^n p_m(n)$ where $p_m(x)$ is a polynomial of degree $\leq m-1$.

  3. If we fix $n \geq 1$, then $f(m,n)$ is of the form $(n+1)^m p_n(m)$ where $p_n(x)$ is a polynomial of degree $\leq n-1$.

Uniqueness: Let's suppose that we have already found unique values of $f(m,n)$ whenever $\min(m,n) < k$. In particular, we know the values of $f(k,x)$ and $f(x,k)$ for $0 \leq x \leq k-1$. This is enough values to uniquely determine a degree $k-1$ polynomial, so the values we already know determine the values of $f(k,x)$ and $f(x,k)$ for all $x$.

Existence: The proof of uniqueness gives an algorithm to compute $f$. The only potential issue is that $f(k,k)$ is determined twice, as a polynomial in its first variable and in its second variable. However, the algorithm is symmetric in $m$ and $n$, so the two interpolating polynomials coincide. $\square$

For example, we have $f(1,0) = 1 = 2^0 \cdot 1$ so we must have $p_1(n)=1$ and $f(1,n) = 2^n$. Similarly, $f(2,0)=1=2^0 \cdot 1$ and $f(2,1) = 4=3^1 \cdot (4/3)$, so we must have $p_2(n) = 1+n/3$ and $f(2,n) = 3^n(1+n/3) = 3^{n-1} (n+3)$. Using this algorithm, I computed $f(m,n)$ for $0 \leq m,n \leq 10$. Here is the data, and some Mathematica code, if you'd like to play with it.

f[m_, n_] := f[m, n] =
  If[m == 0 || n == 0, 1,
   If[m <= n, 
    (m + 1)^n * InterpolatingPolynomial[Table[{k, f[m, k]/(m + 1)^k}, {k, 0, m - 1}], x] /. x -> n,
    (n + 1)^m * InterpolatingPolynomial[Table[{k, f[k, n]/(n + 1)^k}, {k, 0, n - 1}], x] /. x -> m]]

Values of $f(m,n)$ for $0 \leq m,n \leq 10$.

1,1,1,1,1,1,1,1,1,1,1
1,2,4,8,16,32,64,128,256,512,1024
1,4,15,54,189,648,2187,7290,24057,78732,255879
1,8,54,328,1856,9984,51712,260096,1277952,6160384,29229056
1,16,189,1856,16145,129000,968125,6925000,47690625,318500000,2073828125
1,32,648,9984,129000,1475856,15450912,151201728,1403288064,12479546880,107152782336
1,64,2187,51712,968125,15450912,219682183,2862173104,34828543449,401200569280,4418300077219
1,128,7290,260096,6925000,151201728,2862173104,48658878080,760774053888,11122973573120,153936323805184
1,256,24057,1277952,47690625,1403288064,34828543449,760774053888,15047189968833,274908280855680,4707029151392121
1,512,78732,6160384,318500000,12479546880,401200569280,11122973573120,274908280855680,6199170628499200,129708290461760000
1,1024,255879,29229056,2073828125,107152782336,4418300077219,153936323805184,4707029151392121,129708290461760000,3283463201858585471

So, we can prove the conjecture by showing that both $F_{m,n}$ and $S_{m,n}$ obey conditions 1, 2 and 3. For condition 1, we can just define $S_{m,0}$, $S_{0,n}$, $F_{m,0}$, and $F_{0,n}$ to be singletons (and I claim this is actually the most natural definition); then we just need to check that our verification of polynomiality goes all the way down to the zero case.

Since conditions 2 and 3 are symmetric, and the definitions of $F_{m,n}$ and $S_{m,n}$ are symmetric, we just check condition $2$.

Verification of condition 2 for forests: Let $A_{m,b}$ be the set of isomorphism classes of bicolored forests whose white vertices are labeled $\{1,2,\ldots,m \}$ and which have $b$ black vertices, each of degree $\geq 2$. We claim that $$|F_{m,n}| = \sum_b |A_{m,b}| n(n-1)(n-2) \cdots (n-b+1) (m+1)^{n-b} . \qquad (1) $$

Proof: Take a forest in $F_{m,n}$, with the $m$ vertices colored white and the $n$ vertices colored black. Delete all black vertices of degree $0$ and $1$ and forget the numbering of the remaining black vertices. This gives a forest in $A_{m,b}$. To undo this process, we first must take the $b$ black vertices and decide which of the $n$ vertices of $K_{m,n}$ they will be; there are $n(n-1)(n-2) \cdots (n-b+1)$ ways to do this. (We used that trees in $A_{m,b}$ have no automorphisms.) Then we must take the $n-b$ remaining black vertices and either join them to one of the $m$ white vertices or leave them unconnected; there $(m+1)^{n-b}$ ways to do this.

Formula (1) is clearly $(m+1)^n$ times a polynomial, it remains to check that the polynomial has degree at most $m-1$. We must check that $A_{m,b}$ is empty if $b \geq m$. Indeed, since every black vertex has degree at least $2$, a forest in $A_{m,b}$ has at least $2b$ edges. But, since it is a forest, it also has at most $m+b-1$ vertices. So $2b \leq m+b-1$ and $b \leq m-1$. $\square$

Verification of condition 2 for margins of $(0,1)$ matrices: Consider a vector $C = (C_1, \ldots, C_n) \in \{ 0,1,\ldots, m \}^n$ of columns sums, and consider how many row sums $(R_1, \ldots, R_m)$ are compatible with it. Let $c_j = \# \{ k : C_k = j \}$, so $c_0+c_1+\cdots + c_m=n$.

By the Gale-Ryser theorem, $(R_1, \ldots, R_m)$ is compatible with $(c_0, \cdots, c_m)$ if and only if $\sum R_i = \sum j c_j$ and, for all index sets $1 \leq i_1 < i_2 < \cdots < i_k \leq m$, we have $$R_{i_1} + R_{i_2} + \cdots + R_{i_k} \leq \sum_j \min(j,k) c_j . $$ This is the defining list of inequalities of the permutahedron, so this condition can alternately be stated as saying that $(R_1, \ldots, R_m)$ is in the convex hull of the $m!$ permutations of $(c_1+c_2+\cdots+c_m, c_2+\cdots + c_m, \cdots, c_{m-1}+c_m, c_m)$.

By Theorem 11.3 of Postnikov's Permutohedra, associahedra, and beyond, the number of such $(R_1, \ldots, R_m)$ is a polynomial $g_m(c_0,c_1, \ldots, c_m)$ in the $c_j$, of degree $m-1$. So $$|S_{m,n}| = \sum_{c_0+\cdots + c_m=n} \frac{n!}{c_0! c_1! \cdots c_m!} \ g_m(c_0,c_1, \ldots, c_m). \qquad (2)$$ Let $\partial_j$ be $\tfrac{\partial}{\partial x_j}$. There is some polynomial $h_m$ in the $\partial_j$, of degree $m-1$, such that $$\left. h_m(\partial_0, \ldots, \partial_m) \left( x_0^{c_0} \cdots x_m^{c_m} \right) \right|_{(1,1,\ldots,1)} = g_m(c_1, \ldots, c_m). \qquad (3)$$ Combining (2), (3) and the binomial expansion of $(x_0+x_1+\cdots+x_m)^n$, we obtain $$|S_{m,n}| = \left. h_m(\partial_0, \ldots, \partial_m) \left( x_0+x_1+\cdots+x_m \right)^n \right|_{(1,1,\ldots,1)} . \qquad (4) $$ Let $h_m(t,t,\ldots,t) = \sum_{k=0}^{m-1} h_{m,k} t^k$. Then $(4)$ is $\sum_{k=0}^{m-1} h_{m,k} n(n-1)(n-2) \cdots (n-k+1) (m+1)^{n-k}$, which is a polynomial of the desired form. $\square$