Hypergeometric Random Variable Expectation

We have $N$ balls, of which $r$ are red. We draw balls sequentially, without replacement.

Imagine the balls are distinct: they have ID numbers written on them, if you wish in invisible ink. Imagine also drawing all $N$ balls, one after the other. Then all sequences of ball IDs are equally likely. So the probability that a particular ball is drawn $k$-th is $\frac{1}{N}$, and therefore the probability that the $k$-th drawn ball is red is $\frac{r}{N}$.

Now let us suppose that we draw only $n$ balls. For $i=1$ to $n$, let the random variable $X_i$ be $1$ if the $i$-th drawn ball is red, and $0$ otherwise. Then the number $Y$ of red balls drawn is $Y=X_1+\cdots +X_n$. By the symmetry argument above, each $X_i$ has the same distribution as $X_1$, so by the linearity of expectation $$E(Y)=nE(X_1)=n\Pr(X_1=1)=n\cdot \frac{r}{N}.$$
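As a quick sanity check, here is a minimal Monte Carlo sketch of this result; the parameter values $N=20$, $r=7$, $n=5$ and the variable names are illustrative, not from the original:

```python
import random

N, r, n = 20, 7, 5        # population size, red balls, sample size (illustrative values)
trials = 100_000

balls = [1] * r + [0] * (N - r)                # 1 = red, 0 = not red
total_red = 0
for _ in range(trials):
    total_red += sum(random.sample(balls, n))  # n draws without replacement

print(total_red / trials, n * r / N)           # both should be close to 1.75
```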


You have to rethink your intuition about the binomial RV so that the intuition for the hypergeometric RV makes sense. Let's do it step by step.

CURRENT INTUITION FOR BINOMIAL: First of all, let's question the intuition for the binomial case. Say we have an urn with N balls, of which R are red and N-R are white. Drawing a red ball counts as a success, and we draw n times. The random variable X denotes the number of red balls we draw.

Currently your mental picture of drawing is:

  1. close your eyes
  2. draw (without looking)
  3. open your eyes; look at the ball ("ahaa it's red/white")
  4. put it back (since it's sampling with replacement)
  5. repeat n-1 times

For every draw i in {1, ..., n} we have E[Xi] = R / N. By linearity of expectation we get E[X] = n * R / N.
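A hedged sketch of this with-replacement process (the parameter values N=20, R=7, n=5 are illustrative):

```python
import random

N, R, n = 20, 7, 5        # urn size, red balls, number of draws (illustrative values)
trials = 100_000

total_red = 0
for _ in range(trials):
    # with replacement: the n draws are independent, each red with probability R/N
    total_red += sum(random.randrange(N) < R for _ in range(n))

print(total_red / trials, n * R / N)   # both should be close to 1.75
```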

NEW INTUITION FOR BINOMIAL: So far so good. But now let's change your mental picture of the binomial case. Just skip the "looking at the drawn ball" step:

  1. close your eyes
  2. draw (without looking)
  3. don't open your eyes, but take a picture of the ball (e.g. with your phone) to look at it later
  4. put it back (since it's sampling with replacement)
  5. repeat n-1 times

Here, during the drawing process you have no information about the drawn balls. Only when you look at the pictures later will you know. Again, E[X] = n * R / N. In the binomial case it makes no difference whether you look at the drawn balls or not. BUT in the hypergeometric case it does make a difference. So here is the mental picture:

NEW INTUITION FOR HYPERGEOMETRIC:

  1. close your eyes
  2. draw (without looking)
  3. keep the ball in a box (for you to look at it later; eyes still closed)
  4. repeat n-1 times

So you haven't looked at the balls yet. Now, what's the probability that the first ball is red? That's easy: it's R / N. There are R successes among the N balls, and you draw uniformly.

Now, what about the probability that the second ball (say you have it in your hand, eyes closed) is red? You still haven't looked at the first ball. Aha, it's R/N again. Why? Because you have N unknowns (N-2 in the urn, 1 in the box, and 1 in your hand) and still R successes (most of them in the urn, and maybe one in the box and/or in your hand - you don't know where these R balls are). The same is true for the third draw, and so on. Since you don't know which balls you have already drawn, the probability that the ball in your hand is red does not change.

So again E[Xi] = R/N. By linearity of expectation, we get E[X] = n * R / N (which is what we were looking for).
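A small simulation of this exchangeability argument, estimating the marginal probability that the i-th draw is red for each position (N=20, R=7, n=5 are illustrative values):

```python
import random

N, R, n = 20, 7, 5        # urn size, red balls, number of draws (illustrative values)
trials = 100_000

balls = [1] * R + [0] * (N - R)       # 1 = red, 0 = white
red_at_position = [0] * n
for _ in range(trials):
    random.shuffle(balls)             # a uniformly random drawing order
    for i in range(n):                # the first n positions are the n draws
        red_at_position[i] += balls[i]

print([count / trials for count in red_at_position])  # each entry close to R/N = 0.35
```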

This is what @André Nicolas meant in his comment on the accepted answer. There's a difference between P["second ball red"] and P["second ball red" | "first ball red"] (respectively P["second ball red" | "first ball white"]).
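Concretely, $\Pr(\text{2nd red} \mid \text{1st red}) = \frac{R-1}{N-1} \neq \frac{R}{N}$ in general, yet by the law of total probability the unconditional probability still comes out to

$$\Pr(\text{2nd red}) = \frac{R}{N}\cdot\frac{R-1}{N-1} + \frac{N-R}{N}\cdot\frac{R}{N-1} = \frac{R(R-1)+R(N-R)}{N(N-1)} = \frac{R(N-1)}{N(N-1)} = \frac{R}{N}.$$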

Hope that helps.


We have a population of $N$ balls, of which $R$ are red, and extract a sample of $n$ balls (without replacement).

Suppose we line the balls up in a row as we extract them. Let $X_k$ be the indicator that the $k$-th ball drawn is red. That means $X_k=1$ if it is, and $X_k=0$ if it is not. It is a Bernoulli random variable.

Now the expectation of this random variable is the probability that the $k$-th ball is red. $$\begin{align}\mathsf E(X_k) &= 1\cdot\mathsf P(X_k=1)\color{silver}{\,+\,0\cdot\mathsf P(X_k=0)}\\ &= \tfrac RN \end{align}$$

The count of red balls within the sample is $\sum_{k=1}^n X_k$. Of course the random variables $\{X_k\}_{k\in\{1,\dots,n\}}$ are not independent, but here's a fun fact:

The Linearity of Expectation holds regardless of whether the random variables are independent or not.

Hence we have that:

$$\mathsf E\left(\sum_{k=1}^n X_k\right) ~=~ \sum_{k=1}^n\mathsf E(X_k) ~=~ n\cdot\frac{R}{N}$$
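As a cross-check, a minimal sketch that computes the same expectation directly from the hypergeometric pmf, $\mathsf P(Y=k)=\binom{R}{k}\binom{N-R}{n-k}/\binom{N}{n}$; the parameter values are illustrative:

```python
from math import comb

N, R, n = 20, 7, 5   # population size, red balls, sample size (illustrative values)

# E(sum X_k) computed directly as sum over k of k * P(Y = k)
mean = sum(k * comb(R, k) * comb(N - R, n - k) / comb(N, n)
           for k in range(min(n, R) + 1))

print(mean, n * R / N)   # both equal 1.75
```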