Expected cardinality of a randomly chosen element of the family of subsets of $\{1,\ldots,n\}$ with at most $k$ elements

As I understand the problem, the expected value is $$\frac{\sum_{i=0}^k i\binom{n}{i}}{\sum_{i=0}^k \binom{n}{i}}$$ which, for $k=n$, reduces by nice identities to $\frac{n}{2}$.

I don't know of nice formulas for partial sums of (weighted) binomial coefficients. Below are the simplified expressions for small values of $k$.

  • $k=1$: $\frac{n}{n+1} \sim 1$,
  • $k=2$: $\frac{2n^2}{n^2+n+2} \sim 2$,
  • $k=3$: $\frac{3n^3-3n^2+6n}{n^3+5n+6} \sim 3$,
  • $k=4$: $\frac{4n^4-12n^3+32n^2}{n^4-2n^3+11n^2+14n+24} \sim 4$.

(The coefficients in the denominator expressions are given by A054651; I could not find an OEIS entry for the analogous numerator coefficients.)
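The closed forms for $k=1,\ldots,4$ can be checked against the defining ratio with exact rational arithmetic. The following is a minimal sketch, assuming Python 3.8+ (for `math.comb`); the names `expected_size` and `closed_forms` are illustrative and not part of the answer.

```python
from fractions import Fraction
from math import comb

def expected_size(n, k):
    """Exact mean size of a uniformly random subset of {1..n} with at most k elements."""
    num = sum(i * comb(n, i) for i in range(k + 1))
    den = sum(comb(n, i) for i in range(k + 1))
    return Fraction(num, den)

# The simplified expressions listed above, keyed by k.
closed_forms = {
    1: lambda n: Fraction(n, n + 1),
    2: lambda n: Fraction(2 * n**2, n**2 + n + 2),
    3: lambda n: Fraction(3 * n**3 - 3 * n**2 + 6 * n, n**3 + 5 * n + 6),
    4: lambda n: Fraction(4 * n**4 - 12 * n**3 + 32 * n**2,
                          n**4 - 2 * n**3 + 11 * n**2 + 14 * n + 24),
}

# Compare the closed forms with the direct sums for a range of n.
for k, formula in closed_forms.items():
    for n in range(1, 30):
        assert expected_size(n, k) == formula(n), (n, k)
```

Using `Fraction` avoids any floating-point doubt: the comparison is an exact equality of rationals.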


$\DeclareMathOperator\E{E}$As already noted by the other answers, $$\E\xi^k_n=\frac{\sum_{i=0}^ki\binom ni}{\sum_{i=0}^k\binom ni}.$$ One can then easily determine the asymptotics of $\E\xi_n^{\lfloor n\delta\rfloor}$ for fixed $0\le\delta\le1$:

Case 1: $0\le\delta<1/2$. Then $$k-\frac{\delta}{1-2\delta}\le\E\xi_n^k\le k,$$ where $k=\lfloor n\delta\rfloor$. This can be shown by approximation of $\binom ni$ by a geometric series. That is, we have $$0<i\le k\implies\binom n{i-1}=\frac i{n-i+1}\binom ni\le\frac i{n-i}\binom ni\le\frac\delta{1-\delta}\binom ni,$$ hence $$0\le j\le i\le k\implies \binom nj\le\left(\frac\delta{1-\delta}\right)^{i-j}\binom ni.$$

Thus, $$\begin{align} k-\E\xi^k_n=\frac{\sum_{i=0}^k(k-i)\binom ni}{\sum_{i=0}^k\binom ni} &=\frac{\sum_{j=1}^k\sum_{i=0}^{k-j}\binom ni}{\sum_{i=0}^k\binom ni}\\ &\le\frac{\sum_{j=1}^k\left(\frac\delta{1-\delta}\right)^j\sum_{i=j}^k\binom ni}{\sum_{i=0}^k\binom ni}\\ &\le\sum_{j=1}^k\left(\frac\delta{1-\delta}\right)^j\le\frac\delta{1-2\delta}. \end{align}$$

One can show that the lower bound is closer to the truth: $\E\xi^k_n=k-\frac\delta{1-2\delta}+O\bigl(\frac{\log n}n\bigr)$.
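The Case 1 bounds can be checked numerically. This is only a sketch under stated assumptions: Python 3.8+, $\delta$ passed as an exact `Fraction` so that $\lfloor n\delta\rfloor$ is computed without rounding issues, and hypothetical helper names `expected_size` and `case1_gap`.

```python
from fractions import Fraction
from math import comb

def expected_size(n, k):
    """Exact mean size of a uniformly random subset of {1..n} with at most k elements."""
    num = sum(i * comb(n, i) for i in range(k + 1))
    den = sum(comb(n, i) for i in range(k + 1))
    return Fraction(num, den)

def case1_gap(n, delta):
    """Return (k - E, delta/(1 - 2*delta)) for k = floor(n*delta), with 0 <= delta < 1/2."""
    k = (n * delta.numerator) // delta.denominator  # exact floor of n*delta
    gap = k - expected_size(n, k)
    bound = delta / (1 - 2 * delta)
    return gap, bound

delta = Fraction(1, 4)
for n in (40, 400, 4000):
    gap, bound = case1_gap(n, delta)
    assert 0 <= gap <= bound
    print(n, float(gap), float(bound))
```

For larger $n$ the printed gap should approach the bound $\frac\delta{1-2\delta}$ from below, in line with the remark that the lower bound is closer to the truth.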

Case 2: $1/2<\delta\le1$. Then $\E\xi^k_n=\frac n2-O(\gamma^n)$ for some $\gamma<1$ (depending on $\delta$).

Indeed, Stirling bounds give $$2^n-\sum_{i=0}^k\binom ni=\sum_{i=k+1}^n\binom ni=O(\alpha^n)$$ for some constant $\alpha<2$, hence $$\E\xi^k_n=\frac{\sum_{i=0}^ni\binom ni+O(n\alpha^n)}{\sum_{i=0}^n\binom ni+O(\alpha^n)}=\frac n2+O\bigl(n(\alpha/2)^n\bigr).$$
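The exponential decay in Case 2 is easy to observe numerically. The sketch below assumes Python 3.8+ and uses the illustrative value $\delta=3/4$; the helper names `expected_size` and `case2_gap` are hypothetical.

```python
from fractions import Fraction
from math import comb

def expected_size(n, k):
    """Exact mean size of a uniformly random subset of {1..n} with at most k elements."""
    num = sum(i * comb(n, i) for i in range(k + 1))
    den = sum(comb(n, i) for i in range(k + 1))
    return Fraction(num, den)

def case2_gap(n, delta):
    """Return n/2 - E for k = floor(n*delta), with 1/2 < delta <= 1."""
    k = (n * delta.numerator) // delta.denominator  # exact floor of n*delta
    return Fraction(n, 2) - expected_size(n, k)

delta = Fraction(3, 4)
gaps = [case2_gap(n, delta) for n in (20, 40, 80, 160)]
for n, g in zip((20, 40, 80, 160), gaps):
    print(n, float(g))  # the gap shrinks rapidly as n grows
```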

Case 3: $\delta=1/2$. Then $$\frac n2-\frac{\sqrt n}2\le\E\xi^k_n\le\frac n2.$$ Indeed, if $Y$ is drawn from the binomial distribution $B(n,1/2)$, then by the symmetry $\binom ni=\binom n{n-i}$ we have $$\frac n2-\E\xi^k_n\le\E\left|Y-\frac n2\right|\le\sqrt{\E\left(Y-\frac n2\right)^2}=\sqrt{\operatorname{Var}Y}=\frac{\sqrt n}2,$$ with equality in the first step when $n$ is odd.

In this case, approximation of the binomial distribution by Gaussian distribution with mean $n/2$ and variance $n/4$ suggests that the true value of $\E\xi^k_n$ should be roughly $\frac n2-\sqrt{\frac{n}{2\pi}}$, but I will not try to make this rigorous.
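As a non-rigorous sanity check of the Gaussian heuristic, one can compare the exact value of $\frac n2-\E\xi^k_n$ with $\sqrt{n/(2\pi)}$. This is a sketch assuming Python 3.8+; the helper names `expected_size` and `case3_gap` are hypothetical, and odd $n$ is used so that $k=\lfloor n/2\rfloor$ splits the binomial coefficients exactly in half.

```python
from fractions import Fraction
from math import comb, pi, sqrt

def expected_size(n, k):
    """Exact mean size of a uniformly random subset of {1..n} with at most k elements."""
    num = sum(i * comb(n, i) for i in range(k + 1))
    den = sum(comb(n, i) for i in range(k + 1))
    return Fraction(num, den)

def case3_gap(n):
    """Return n/2 - E for k = floor(n/2), as a float."""
    return float(Fraction(n, 2) - expected_size(n, n // 2))

for n in (101, 1001):
    gap = case3_gap(n)
    # Columns: n, exact gap, Gaussian heuristic, rigorous upper bound sqrt(n)/2.
    print(n, gap, sqrt(n / (2 * pi)), sqrt(n) / 2)
```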


Let $f(n,k):=\sum_{i=0}^k\binom{n}{i}$. Since $i\binom{n}{i}=n\binom{n-1}{i-1}$ for $i\ge1$, we have $$\sum_{i=0}^k i\binom{n}{i} = \sum_{i=1}^k n\binom{n-1}{i-1} = n\cdot f(n-1,k-1).$$ So it remains to evaluate $$\frac{n\cdot f(n-1,k-1)}{f(n,k)},$$ where sharp bounds for the numerator and denominator are known; see, e.g., the answers to "Lower bound for sum of binomial coefficients?"
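The reduction to $f$ can be verified directly for small parameters. A minimal sketch, assuming Python 3.8+; the function `f` mirrors the definition above, while `expected_size_via_f` is a hypothetical name for the resulting ratio.

```python
from fractions import Fraction
from math import comb

def f(n, k):
    """Partial sum of binomial coefficients: sum_{i=0}^{k} C(n, i); the empty sum (k < 0) is 0."""
    return sum(comb(n, i) for i in range(k + 1))

def expected_size_via_f(n, k):
    """Expected subset size written as n * f(n-1, k-1) / f(n, k)."""
    return Fraction(n * f(n - 1, k - 1), f(n, k))

# Check the identity sum_{i<=k} i*C(n,i) = n * f(n-1, k-1) on a small grid.
for n in range(1, 25):
    for k in range(0, n + 1):
        weighted = sum(i * comb(n, i) for i in range(k + 1))
        assert weighted == n * f(n - 1, k - 1), (n, k)
```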