Derivation of the hypergeometric distribution formula and comparison with the Bernoulli formula

If there are $N$ balls total, of which $K$ are "red," and $N-K$ are "white," and $n$ balls are randomly selected without replacement, then the probability that exactly $k$ of the $n$ balls are red is given by $$\Pr[X = k] = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}.$$ To see where this formula comes from, label the balls $$R_1, R_2, \ldots, R_K, W_{K+1}, W_{K+2}, \ldots, W_N.$$ Note that the way we have labeled the balls allows us to uniquely identify them simply by the subscript, since there are no repeated subscripts. Now, our sample space $\frak S$ consists of all $n$-subsets of $S = \{R_1, R_2, \ldots, R_K, W_{K+1}, W_{K+2}, \ldots, W_N\}$: $${\frak S} = \{s \subseteq S : |s| = n\}.$$ What is the size of this sample space; i.e., how many $n$-subsets are there of $S$, or how many ways are there to choose $n$ distinct balls from a group of $N$ numbered balls? This is just $$|{\frak S}| = \binom{N}{n} = \frac{N!}{n! (N-n)!}.$$ This is where the denominator comes from. All that is left now is to enumerate those $n$-subsets in which exactly $k$ of the balls are red. To do this, we note that any such subset would not only have exactly $k$ red balls, but also exactly $n-k$ white balls, since the number of red balls drawn ($k$), plus the number of white balls drawn ($n-k$), must equal the total number of balls drawn ($n$). How many ways are there to choose $k$ red balls from $S$ and $n-k$ white balls from $S$? This follows from the structure of $S$; there are $$\binom{K}{k}$$ ways to choose $k$ red balls from the set $\{R_1, R_2, \ldots, R_K\}$ of red balls, and $$\binom{N-K}{n-k}$$ ways to choose $n-k$ white balls from the set $\{W_{K+1}, W_{K+2}, \ldots, W_N\}$.
Moreover, the choice of which numbered red balls are selected is independent of the choice of which numbered white balls are selected; thus the total number of desired outcomes is simply the product of the two counts, so the total number of ways to have exactly $k$ red balls among the $n$ selected balls is $$\binom{K}{k}\binom{N-K}{n-k},$$ the numerator of our probability.
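
The counting argument above translates directly into a few lines of Python. This is only a minimal sketch: the function name `hypergeom_pmf` and the illustrative values $N = 9$, $K = 4$, $n = 3$ are my own choices, and exact fractions are used so that no rounding obscures the result.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(k, N, K, n):
    """Probability of exactly k red balls among n drawn without
    replacement from N balls, K of which are red.

    Hypothetical helper name; it simply evaluates
    C(K, k) * C(N - K, n - k) / C(N, n) as an exact fraction.
    """
    favorable = comb(K, k) * comb(N - K, n - k)  # numerator: k red and n - k white
    total = comb(N, n)                           # denominator: all n-subsets
    return Fraction(favorable, total)


# Illustrative values (an assumption for this sketch).
N, K, n = 9, 4, 3
probs = [hypergeom_pmf(k, N, K, n) for k in range(n + 1)]
for k, p in enumerate(probs):
    print(k, p)
print("sum:", sum(probs))
```

Since every $n$-subset contains some number of red balls between $0$ and $n$, the probabilities over all $k$ should add up to $1$, which gives a quick sanity check on the formula.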

If this level of abstraction is difficult to grasp, it helps to consider a numeric example. Suppose we have $N = 9$ balls, $K = 4$ of which are red (so $N - K = 9 - 4 = 5$ are white). And suppose we are interested in the probability that, among $n = 3$ balls chosen at random, we obtain exactly $k = 1$ red ball (and implicitly, $n - k = 3-1 = 2$ white balls). There are $\binom{N}{n} = \binom{9}{3} = 84$ ways to select any three balls out of nine without replacement. How many of these outcomes have exactly $1$ red ball? There are only $4$ red balls to choose from, so there are just $\binom{K}{k} = \binom{4}{1} = 4$ ways to get one red ball. But we also have to account for the number of ways to select the two white balls; this is $\binom{N-K}{n-k} = \binom{5}{2} = 10$; thus there are $4(10) = 40$ such outcomes with exactly $1$ red and $2$ white balls, and the resulting probability is $$\Pr[X = 1] = \frac{40}{84} = \frac{10}{21}.$$

Specifically, we can enumerate the desired outcomes as follows:

$$\{R_1 , W_5 , W_6 \}, \{ R_1 , W_5 , W_7 \}, \{ R_1 , W_5 , W_8 \}, \{ R_1 , W_5 , W_9 \}, \{ R_1 , W_6 , W_7 \}, \{ R_1 , W_6 , W_8 \}, \{ R_1 , W_6 , W_9 \}, \{ R_1 , W_7 , W_8 \}, \{ R_1 , W_7 , W_9 \}, \{ R_1 , W_8 , W_9 \}, \{ R_2 , W_5 , W_6 \}, \{ R_2 , W_5 , W_7 \}, \{ R_2 , W_5 , W_8 \}, \{ R_2 , W_5 , W_9 \}, \{ R_2 , W_6 , W_7 \}, \{ R_2 , W_6 , W_8 \}, \{ R_2 , W_6 , W_9 \}, \{ R_2 , W_7 , W_8 \}, \{ R_2 , W_7 , W_9 \}, \{ R_2 , W_8 , W_9 \}, \{ R_3 , W_5 , W_6 \}, \{ R_3 , W_5 , W_7 \}, \{ R_3 , W_5 , W_8 \}, \{ R_3 , W_5 , W_9 \}, \{ R_3 , W_6 , W_7 \}, \{ R_3 , W_6 , W_8 \}, \{ R_3 , W_6 , W_9 \}, \{ R_3 , W_7 , W_8 \}, \{ R_3 , W_7 , W_9 \}, \{ R_3 , W_8 , W_9 \}, \{ R_4 , W_5 , W_6 \}, \{ R_4 , W_5 , W_7 \}, \{ R_4 , W_5 , W_8 \}, \{ R_4 , W_5 , W_9 \}, \{ R_4 , W_6 , W_7 \}, \{ R_4 , W_6 , W_8 \}, \{ R_4 , W_6 , W_9 \}, \{ R_4 , W_7 , W_8 \}, \{ R_4 , W_7 , W_9 \}, \{ R_4 , W_8 , W_9 \} $$
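
The same list can be generated by brute force rather than by hand. The following sketch assumes the labelling $R_1, \ldots, R_4, W_5, \ldots, W_9$ used above, encoded as strings such as `"R1"` and `"W5"` (the variable names are mine), and simply filters all $3$-subsets for those with exactly one red ball.

```python
from itertools import combinations

# Parameters of the numeric example: 9 balls, 4 red, draw 3, want exactly 1 red.
N, K, n, k = 9, 4, 3, 1

# Labels R1..R4 and W5..W9, so that subscripts never repeat.
balls = [f"R{i}" for i in range(1, K + 1)] + [f"W{i}" for i in range(K + 1, N + 1)]

# The sample space: every n-subset of the labelled balls.
samples = list(combinations(balls, n))

# Keep only the subsets containing exactly k red balls.
favorable = [s for s in samples if sum(b.startswith("R") for b in s) == k]

print(len(samples))    # 84
print(len(favorable))  # 40
print(favorable[:3])   # the first few favorable subsets
```

Counting the filtered subsets reproduces the $40$ outcomes listed above out of $84$ total, which agrees with $\binom{4}{1}\binom{5}{2} / \binom{9}{3}$.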


It helps to think in terms of the number of subsets of $[N]=\{1,\dots,N\}$ with various properties.

Imagine the "success" outcomes form the set $[K]$, which is a subset of $[N]$.

The numerator in the hypergeometric distribution formula gives the number of subsets of $[N]$ of size $n$ having exactly $k$ elements from $[K]$.

The denominator gives the total number of subsets of $[N]$ of size $n$.

Since the subsets are all equally likely (an important point), the ratio is the probability you want.


Here's another way:

There are $N$ balls, $K$ are red and $N-K$ are white. We take a sample of $n$ balls without replacement. What is the probability that we have exactly $k$ red balls in our sample?

All of the orders in which we can draw our sample so that it contains $k$ red balls occur with the same probability. For example $$ P(\underbrace{red,\ldots,red}_{k},\underbrace{white,\ldots,white}_{n-k}) = P(red,white,\underbrace{red,\ldots,red}_{k-1},\underbrace{white,\ldots,white}_{n-k-1}). $$ How many different such orders are there? Each one is determined by which $k$ of the $n$ draws are red, so the count is just the binomial coefficient, ${n \choose k}.$

So \begin{align*} &P(k \text{ red balls in a sample of } n) \\ &= {n \choose k}P(\underbrace{red,\ldots,red}_{k},\underbrace{white,\ldots,white}_{n-k})\\ &= {n \choose k} \underbrace{\frac{K}{N}\frac{K-1}{N-1}\cdots\frac{K-(k-1)}{N-(k-1)}}_{k \text{ terms}} \,\,\, \underbrace{\frac{N-K}{N-k}\cdots\frac{N-K-(n-k-1)}{N-(n-1)}}_{n-k \text{ terms}}\\ &= {n \choose k} \frac{\frac{K!}{(K-k)!}\frac{(N-K)!}{(N-K-(n-k))!}}{\frac{N!}{(N-n)!}}\\ &= \frac{n!}{k!(n-k)!} \frac{\frac{K!}{(K-k)!}\frac{(N-K)!}{(N-K-(n-k))!}}{\frac{N!}{(N-n)!}}\\ &= \frac{{K \choose k} {N-K \choose n-k}}{{N \choose n}}. \end{align*}
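
As a numerical check on this chain of equalities, one can multiply out the draw-by-draw probabilities and compare the result with the counting formula. The sketch below uses hypothetical function names (`sequential_pmf`, `counting_pmf`) and exact fractions; it assumes $n \le N$ so that no denominator is zero.

```python
from fractions import Fraction
from math import comb


def sequential_pmf(k, N, K, n):
    """C(n, k) times the probability of drawing k reds and then n - k whites."""
    p = Fraction(1)
    # k red draws: K/N, (K-1)/(N-1), ..., (K-(k-1))/(N-(k-1))
    for i in range(k):
        p *= Fraction(K - i, N - i)
    # n - k white draws: (N-K)/(N-k), ..., (N-K-(n-k-1))/(N-(n-1))
    for j in range(n - k):
        p *= Fraction(N - K - j, N - k - j)
    # Every ordering of the k reds among the n draws has this same probability.
    return comb(n, k) * p


def counting_pmf(k, N, K, n):
    """The subset-counting form C(K, k) * C(N-K, n-k) / C(N, n)."""
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))


# Compare the two forms on illustrative parameters (an assumption for this sketch).
N, K, n = 9, 4, 3
for k in range(n + 1):
    print(k, sequential_pmf(k, N, K, n), counting_pmf(k, N, K, n))
```

For $N = 9$, $K = 4$, $n = 3$, $k = 1$ the sequential product is $3 \cdot \frac{4}{9} \cdot \frac{5}{8} \cdot \frac{4}{7} = \frac{10}{21}$, matching the counting argument.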