Expected Number of Single Socks when Matching Socks

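The expected number can be computed via linearity of expectation. Let $E[n,k]$ denote the expected number of single (unmatched) socks after $k$ of the $2n$ socks have been drawn, and let $\{X_i\}_{i=1}^n$ be indicator variables, one for each pair: $X_i=1$ if exactly one member of the $i^{th}$ pair has been chosen in your $k$ trials, and $X_i=0$ otherwise. A given sock of the pair is among the $k$ chosen with probability $\frac k{2n}$; given that, its partner is among the other $k-1$ chosen socks with probability $\frac{k-1}{2n-1}$; and the factor $2$ accounts for which of the two socks is the chosen one. Hence $$E[X_i]=2\times \frac k{2n}\times \left(1-\frac {k-1}{2n-1}\right)$$ from which it follows that $$E[n,k]=E\left[\sum X_i\right] =\sum E[X_i]= k\times \left(1-\frac {k-1}{2n-1}\right)$$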

Sanity check: $k=1\implies E[n,1]=1$ as it should. Also $k=2n\implies E[n,2n]=0$ as it should.
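The closed form above can also be checked by brute force for small cases. The following is a minimal sketch, assuming socks are labelled $0,\dots,2n-1$ with sock $s$ belonging to pair $\lfloor s/2\rfloor$; the function names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def expected_singles_bruteforce(n, k):
    """Average, over all k-subsets of 2n socks, of the number of pairs
    with exactly one sock in the subset."""
    total = 0
    count = 0
    for chosen in combinations(range(2 * n), k):
        # Assumed labelling: sock s belongs to pair s // 2.
        per_pair = Counter(s // 2 for s in chosen)
        total += sum(1 for c in per_pair.values() if c == 1)
        count += 1
    return Fraction(total, count)


def expected_singles_formula(n, k):
    """Closed form k * (1 - (k-1)/(2n-1)), written as k(2n-k)/(2n-1)."""
    return Fraction(k * (2 * n - k), 2 * n - 1)


# Exhaustive comparison for small n and every k from 0 to 2n.
for n in range(1, 5):
    for k in range(0, 2 * n + 1):
        assert expected_singles_bruteforce(n, k) == expected_singles_formula(n, k)
```

Exact rational arithmetic is used so the comparison is an equality rather than a floating-point approximation.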

Remark: the expression can be written as $$E[n,k]=\frac {k(2n-k)}{2n-1}$$ which is symmetric under the exchange $k \leftrightarrow 2n-k$, in line with your expectations. In this form it is also easily seen that the function is maximized at $k=n$, confirming your intuition.

Remark: more strongly, it is clear that at any time the number of unmatched socks in one pile is the same as the number in the other pile (indeed it's exactly the same pairs of socks which are split between the piles). That provides clear justification for the symmetry.
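This pile-by-pile equality holds for every single shuffle, not just on average, which can be illustrated with a short sketch. It assumes socks are labelled so that socks $2p$ and $2p+1$ form pair $p$; the helper names are hypothetical.

```python
import random


def unmatched(pile):
    """Number of socks in the pile whose partner is not in the same pile."""
    members = set(pile)
    # Assumed labelling: the partner of sock s is s ^ 1 (2p <-> 2p+1).
    return sum(1 for s in pile if (s ^ 1) not in members)


def unmatched_in_each_pile(n, k, rng):
    """Shuffle 2n socks, draw the first k, and count the unmatched socks
    in the drawn pile and in the remaining pile."""
    socks = list(range(2 * n))
    rng.shuffle(socks)
    return unmatched(socks[:k]), unmatched(socks[k:])


rng = random.Random(0)
for _ in range(1000):
    k = rng.randint(0, 12)
    drawn, remaining = unmatched_in_each_pile(6, k, rng)
    assert drawn == remaining  # the same pairs are split between the piles
```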


We can verify the accepted answer using the methodology from this MSE link. The problem is very similar to a coupon collector problem without replacement, with two instances of each of $n$ types of coupons. Suppose more generally that we have $j$ instances of each type. Start by asking about the probability of getting the following distribution of coupons:

$$\prod_{q=1}^n C_q^{\alpha_q}$$

where $\alpha_q \le j$ is the number of instances of type $q$ that were drawn. From first principles, the probability is

$$\frac{(nj-\sum_{q=1}^n \alpha_q)!}{(nj)!} \prod_{q=1}^n \frac{j!}{(j-\alpha_q)!}.$$

Multiplying a probability by the total number of events gives the number of favorable events. Therefore the EGF for a given coupon type is

$$\sum_{k=0}^j \frac{j!}{(j-k)!} \frac{z^k}{k!} = \sum_{k=0}^j {j\choose k} z^k = (1+z)^j.$$

With $j=2$ and $n$ types of coupons, the total count of outcomes after $m$ coupons have been drawn is

$$m! [z^m] (1+z)^{2n}$$

which evaluates to

$$m! \times {2n\choose m}.$$

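Placing a marker $u$ on the singletons (types drawn exactly once), the generating function for a single type becomes $1+2uz+z^2$. Differentiating with respect to $u$ and setting $u=1$ gives the total number of singletons summed over all outcomes: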

$$m! [z^m] \left.\frac{\partial}{\partial u} (1+2uz+z^2)^n\right|_{u=1} \\ = m! [z^m] \; \left. n \times (1+2uz+z^2)^{n-1} \times 2z \right|_{u=1} \\ = m! [z^m ] 2nz (1+z)^{2n-2} \\ = m! \times 2n {2n-2\choose m-1}.$$

Dividing by the total count gives the expectation

$$ {2n\choose m}^{-1} 2n {2n-2\choose m-1} = 2n \frac{m! \times (2n-m)!}{(2n)!} \frac{(2n-2)!}{(m-1)! \times (2n-m-1)!} \\ = 2n \times m \times (2n-m) \frac{1}{(2n)(2n-1)} \\ = \frac{m\times (2n-m)}{2n-1}.$$
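The coefficient extraction can be cross-checked numerically. The sketch below expands $[u^s z^m](1+2uz+z^2)^n$ directly as a multinomial count: choose $s$ types contributing $2uz$, then $(m-s)/2$ of the remaining types contributing $z^2$. The common factor $m!$ cancels in the ratio and is omitted; the function names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def singleton_distribution(n, m):
    """Map s -> [u^s z^m] (1 + 2uz + z^2)^n, i.e. the number of m-subsets
    of the 2n coupons in which exactly s types appear once."""
    dist = {}
    for s in range(0, min(n, m) + 1):
        if (m - s) % 2:
            continue              # the remaining coupons must come in full pairs
        full = (m - s) // 2       # types with both instances drawn
        if s + full > n:
            continue
        dist[s] = comb(n, s) * 2 ** s * comb(n - s, full)
    return dist


def marked_total(n, m):
    """Total number of singletons over all m-subsets (the d/du at u=1 term)."""
    return sum(s * c for s, c in singleton_distribution(n, m).items())


def expected_singletons(n, m):
    dist = singleton_distribution(n, m)
    return Fraction(marked_total(n, m), sum(dist.values()))


for n in range(1, 7):
    for m in range(1, 2 * n + 1):
        assert sum(singleton_distribution(n, m).values()) == comb(2 * n, m)
        assert marked_total(n, m) == 2 * n * comb(2 * n - 2, m - 1)
        assert expected_singletons(n, m) == Fraction(m * (2 * n - m), 2 * n - 1)
```

The loop starts at $m=1$ because `math.comb` rejects the negative lower index that $m=0$ would produce in $\binom{2n-2}{m-1}$.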


If you let your function that gives the expected number of socks in your 'no match yet' pile take $k$ and $2n-k$ as arguments, i.e.

$$f\,=\,f(k,\,2n-k),$$

it will be symmetric. The 'no match yet' pile is just the subset of the $k$ picked socks whose matching sock is among the $2n-k$ socks that have not been picked yet.

Therefore, we can make the following reinterpretation of $f$:

Out of $n$ pairs of socks, $f(a,\,b)$, where $a+b=2n$, is the expected number of pairs of socks for which the two socks in the pair end up in different piles when all $2n$ socks are randomly divided into two piles of sizes $a$ and $b$, respectively.

Since the piles don't have any order, the order of the arguments to $f$, which are just the sizes of the piles, doesn't matter. Hence, $f$ is symmetric.
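The two-pile reinterpretation lends itself to a quick Monte Carlo check. This is only a sketch under stated assumptions: socks $2p$ and $2p+1$ form pair $p$, the function name, trial count and seed are arbitrary choices, and the comparison value $ab/(2n-1)$ is the closed form from the accepted answer with $a=k$, $b=2n-k$.

```python
import random


def split_pairs_estimate(n, a, trials=20000, seed=0):
    """Estimate f(a, 2n - a): the expected number of pairs whose two socks
    land in different piles when 2n socks are split into piles of sizes
    a and 2n - a uniformly at random."""
    rng = random.Random(seed)
    socks = list(range(2 * n))
    total = 0
    for _ in range(trials):
        rng.shuffle(socks)
        first_pile = set(socks[:a])
        # A pair is split when exactly one of its socks is in the first pile.
        total += sum(
            1 for p in range(n)
            if (2 * p in first_pile) != (2 * p + 1 in first_pile)
        )
    return total / trials


n, a, b = 5, 3, 7
exact = a * b / (2 * n - 1)
print(split_pairs_estimate(n, a), split_pairs_estimate(n, b), exact)
```

Both estimates should land close to the same exact value, since swapping the pile sizes only relabels which pile is called the first.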

Tags:

Probability