Intersection of Random Subsets

Credit goes to Studentmath and Ted Shifrin for discussing this with me in the MSE chat.


One workable approach (even though not exactly elegant) is to apply inclusion-exclusion on the size of the intersection $\displaystyle\bigcap_{i=1}^n e_i$ of the $n$ random $q$-subsets $e_1, \dots, e_n$ of a $p$-element set.

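First fix a set of $i$ elements and ask for the probability that the intersection contains it. Each $q$-subset must contain these $i$ elements, which leaves $q-i$ further elements to choose from the remaining $p-i$, so a single subset qualifies with probability $\binom{p-i}{q-i}\binom{p}{q}^{-1}$ and all $n$ of them do with the $n$-th power of that. Summing over the $\binom p i$ possible $i$-sets yields: $$\binom p i \binom{p-i}{q-i}^n \binom{p}{q}^{-n}$$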

These terms count an intersection once for every $i$-set it contains, so we have to do the familiar inclusion-exclusion correction for over-counting. The probability that the intersection is non-empty is then: $$\sum_{i=1}^q (-1)^{i+1} \binom p i \binom{p-i}{q-i}^n \binom{p}{q}^{-n}$$
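As a sanity check, the sum can be evaluated exactly with rational arithmetic. The following is a minimal sketch, assuming the $n$ subsets are drawn independently and uniformly; the function name `prob_nonempty_intersection` is just an illustrative choice.

```python
from fractions import Fraction
from math import comb


def prob_nonempty_intersection(p: int, q: int, n: int) -> Fraction:
    """Inclusion-exclusion sum for P(intersection of n q-subsets is non-empty)."""
    total = comb(p, q) ** n  # number of ordered n-tuples of q-subsets
    signed_sum = sum(
        (-1) ** (i + 1) * comb(p, i) * comb(p - i, q - i) ** n
        for i in range(1, q + 1)
    )
    return Fraction(signed_sum, total)


# Two random 2-subsets of a 4-set are disjoint only when the second is the
# complement of the first (probability 1/6), so this should print 5/6.
print(prob_nonempty_intersection(4, 2, 2))
```

For $p=4$, $q=2$, $n=2$ the sum is $(4 \cdot 3^2 - 6 \cdot 1^2)/6^2 = 5/6$, which agrees with the direct argument in the comment.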


Update: If you want to compute several values, or the exact distribution of the intersection size, the following recursive approach may be useful:

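Let $N(k, i)$ denote the number of ways $k$ $q$-subsets can have an intersection with $i$ elements. If the first $k-1$ subsets intersect in $j$ elements, the $k$-th subset must contain exactly $i$ of those $j$ and take its other $q-i$ elements from the remaining $p-j$. Thus we can derive $N(k, i)$ from the $N(k-1, *)$ as follows: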

\begin{align*} N(k,i) &= \frac 1n \sum_{j=i}^q N(k-1,j) \binom j i \binom{p-j}{q-i} \\ N(1,i) &= \begin{cases} 0 & :i \ne q \\ \binom p q & :i = q \end{cases} \end{align*}

where the factor $\frac 1n$ corrects for the order in which the $q$-subsets are added to our consideration.
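The recursion might be implemented along these lines. This is a sketch under one stated assumption: it counts ordered sequences of subsets, so it leaves out the $\frac 1n$ factor and normalizes at the end instead. Since that factor is the same for every $i$, it would cancel in the normalization anyway. The function name `intersection_size_distribution` is illustrative.

```python
from fractions import Fraction
from math import comb


def intersection_size_distribution(p: int, q: int, n: int) -> list[Fraction]:
    """Return [P(|intersection| = i) for i = 0..q] for n q-subsets of a p-set."""
    # counts[i]: ordered tuples of q-subsets whose intersection has i elements.
    # Assumption of this sketch: ordered counting, no 1/n correction.
    counts = [0] * (q + 1)
    counts[q] = comb(p, q)  # base case: N(1, q) = C(p, q), all others 0
    for _ in range(2, n + 1):
        counts = [
            sum(
                # keep i of the j common elements, take q - i from outside
                counts[j] * comb(j, i) * comb(p - j, q - i)
                for j in range(i, q + 1)
            )
            for i in range(q + 1)
        ]
    total = sum(counts)
    return [Fraction(c, total) for c in counts]


dist = intersection_size_distribution(4, 2, 2)
print(dist)         # distribution over intersection sizes 0, 1, 2
print(1 - dist[0])  # probability of a non-empty intersection
```

For $p=4$, $q=2$, $n=2$ the counts are $6$, $24$, $6$ out of $36$, so the distribution is $\frac16, \frac23, \frac16$, and $1 - \frac16 = \frac56$ matches the inclusion-exclusion sum for the same parameters.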