Expectation calculation of hypergeometric distribution with sampling without replacement

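Let $S_k:=\sum_{i=1}^k X_i$. Then for $k<n$, \begin{align} \mathsf{P}(X_{k+1}=1)&=\sum_{i=0}^k\mathsf{P}(S_k=i)\mathsf{P}(X_{k+1}=1\mid S_k=i) \\ &=\sum_{i=0}^k\frac{\binom{n}{i}\binom{m}{k-i}}{\binom{n+m}{k}}\times\frac{n-i}{n+m-k} \\ &=\frac{n}{n+m}\sum_{i=0}^k\frac{\binom{n-1}{i}\binom{m}{k-i}}{\binom{n+m-1}{k}}=\frac{n}{n+m}. \end{align} The third line uses $(n-i)\binom{n}{i}=n\binom{n-1}{i}$ and $(n+m-k)\binom{n+m}{k}=(n+m)\binom{n+m-1}{k}$, and the final sum equals $1$ by Vandermonde's identity, $\sum_{i=0}^k\binom{n-1}{i}\binom{m}{k-i}=\binom{n+m-1}{k}$.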
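As a sanity check, the sum above can be evaluated exactly for small values. This is only a sketch: the function name `prob_next_black` and the sample values $n=5$, $m=3$, $k=4$ are my own choices, and I am assuming $n$ counts the black balls and $m$ the others.

```python
from fractions import Fraction
from math import comb

def prob_next_black(n, m, k):
    """P(X_{k+1} = 1) via the law of total probability over S_k (needs k < n + m)."""
    total = Fraction(0)
    for i in range(k + 1):
        # P(S_k = i): probability of exactly i black balls in the first k draws
        p_s = Fraction(comb(n, i) * comb(m, k - i), comb(n + m, k))
        # P(X_{k+1} = 1 | S_k = i): n - i black balls remain among n + m - k
        total += p_s * Fraction(n - i, n + m - k)
    return total

print(prob_next_black(5, 3, 4))  # 5/8
```

Exact rational arithmetic is used so that the result can be compared with $\frac{n}{n+m}$ without rounding error.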


The solution follows from the linearity of expectation, which holds even when the random variables are not independent. In more detail:

Let $X_k$ be an indicator random variable that is $1$ if the $k$-th ball drawn is black, and $0$ otherwise.

Now black balls (or those of any other color) have no preference for position, so the $k$-th ball drawn is as likely to be black as the first:

$\Bbb P(X_k = 1) = \Bbb P(X_1 = 1) = \frac {n}{n+m}$

Now the expectation of an indicator random variable is just the probability of the event it indicates, thus

$\Bbb E(X_k ) = \frac{n}{n+m},$

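and, writing $X = X_1 + X_2 + \dots + X_r$ for the number of black balls among the $r$ drawn, linearity gives $\Bbb E(X) = \Bbb E(X_1) + \Bbb E(X_2) + \dots + \Bbb E(X_r) = \frac{rn}{n+m}.$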
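If you want to double-check the value $\frac{rn}{n+m}$ without relying on linearity, you can compute the expectation directly from the hypergeometric probabilities. A minimal sketch, where the name `expected_black` and the sample values are hypothetical:

```python
from fractions import Fraction
from math import comb

def expected_black(n, m, r):
    """E(X) summed directly over the hypergeometric pmf (needs r <= n + m)."""
    return sum(
        # i * P(X = i), with P(X = i) = C(n, i) C(m, r - i) / C(n + m, r)
        i * Fraction(comb(n, i) * comb(m, r - i), comb(n + m, r))
        for i in range(r + 1)
    )

n, m, r = 5, 3, 4
print(expected_black(n, m, r), Fraction(r * n, n + m))  # 5/2 5/2
```

Both numbers agree, as the linearity argument predicts.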


The following seems to be a more rigorous argument.

I think you agree that $P(X_1 = 1) = \frac{n}{n+m}$.

Then \begin{align}P(X_2=1) &= P(X_2=1 \mid X_1=1)P(X_1=1) + P(X_2=1 \mid X_1=0)P(X_1=0) \\ &= \frac{n-1}{n+m-1} \cdot \frac{n}{n+m} + \frac{n}{n+m-1} \cdot \frac{m}{n+m} \\ &= \frac{n(n-1)+nm}{(n+m)(n+m-1)} = \frac{n(n+m-1)}{(n+m)(n+m-1)} \\ &= \frac{n}{n+m}, \end{align}

where the third line simplifies the second using $n(n-1)+nm = n(n+m-1)$. If you feel like it, you can show similarly, by induction, that $P(X_k=1) = \frac{n}{n+m}$ for all $1 \leq k \leq r$ (d.k.o's answer is the induction step).
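The same conditioning can be carried out step by step for every $k$ numerically. The following is a sketch under my own naming (`marginal_black_probs` is hypothetical): it tracks the distribution of the number of black balls drawn so far and applies the law of total probability at each draw.

```python
from fractions import Fraction

def marginal_black_probs(n, m, r):
    """Return [P(X_1 = 1), ..., P(X_r = 1)] by conditioning on earlier draws (needs r <= n + m)."""
    dist = {0: Fraction(1)}  # dist[i] = P(i black balls among the draws so far)
    probs = []
    for k in range(r):
        remaining = n + m - k  # balls left before draw k + 1
        # Law of total probability over the number of black balls already drawn
        probs.append(sum(p * Fraction(n - i, remaining) for i, p in dist.items()))
        # Update the distribution of the black count after this draw
        new_dist = {}
        for i, p in dist.items():
            p_black = Fraction(n - i, remaining)
            if p_black > 0:
                new_dist[i + 1] = new_dist.get(i + 1, 0) + p * p_black
            if p_black < 1:
                new_dist[i] = new_dist.get(i, 0) + p * (1 - p_black)
        dist = new_dist
    return probs

# Every draw has the same marginal probability n / (n + m)
assert all(p == Fraction(5, 8) for p in marginal_black_probs(5, 3, 8))
```

This is not a proof, only a check for particular values of $n$, $m$ and $r$.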