formula for relating number of successes to number of tries

I'll assume the jar contains a sufficiently large number of balls, equally distributed among the colors. If we drop the assumption of equal distribution, or if we take into account that we are drawing without replacement, the problem becomes significantly more complicated. Suppose we draw $n$ balls from a jar containing $c$ colors, and assume $\boldsymbol{n\geq c}$. What is the distribution of the random variable $K$, the number of distinct colors in the sample?

This is a stars-and-bars calculation. Given $c$ possible colors and a sample of size $n$, the number of different samples (counted by how many balls of each color appear) is $$\mathrm{C}(n+c-1,c-1)$$ where $\mathrm{C}$ is the binomial coefficient.

Of these possible samples, how many contain exactly $k$ different colors? Fix the $k$ colors first. One ball of each of those colors is forced, so there is no choice for $k$ of the balls. The remaining $n-k$ balls can be any mixture of the same $k$ colors, and the number of ways of selecting $n-k$ balls from $k$ color choices is, by stars and bars again, $$\mathrm{C}((n-k)+k-1,k-1)=\mathrm{C}(n-1,k-1)$$

The number of ways of choosing which $k$ colors out of the pool of $c$ appear in the sample is $$\mathrm{C}(c,k)$$ so the number of samples of size $n$ with exactly $k$ different colors out of a possible $c$ is $$\mathrm{C}(c,k)\mathrm{C}(n-1,k-1)$$

Treating all samples as equally likely, this gives $$\Pr(K=k)=\frac{\mathrm{C}(c,k)\mathrm{C}(n-1,k-1)}{\mathrm{C}(n+c-1,c-1)}$$

As a sanity check, it can be verified that $$\sum_{k=1}^c \frac{\mathrm{C}(c,k)\mathrm{C}(n-1,k-1)}{\mathrm{C}(n+c-1,c-1)}=1$$ as long as $n\geq c$. Here is a plot for the $n=50$, $c=20$ case:

[Plot: probability mass function of $K$ for $n=50$, $c=20$]
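For readers who want to reproduce the numbers behind the plot, here is a minimal sketch that evaluates the formula for $\Pr(K=k)$ exactly and checks that it sums to 1. The function name `color_count_pmf` is my own choice for illustration, and exact rational arithmetic via `fractions.Fraction` is an assumption made to avoid rounding issues, not part of the derivation.

```python
from fractions import Fraction
from math import comb


def color_count_pmf(n, c):
    """Return {k: Pr(K = k)} for k = 1..c, using the stars-and-bars formula.

    n is the sample size, c is the number of colors.
    """
    total = comb(n + c - 1, c - 1)  # number of possible samples
    return {
        k: Fraction(comb(c, k) * comb(n - 1, k - 1), total)
        for k in range(1, c + 1)
    }


pmf = color_count_pmf(50, 20)

# Sanity check from the text: the probabilities sum to exactly 1.
assert sum(pmf.values()) == 1

# Print the distribution as decimals, one value of k per line.
for k, p in pmf.items():
    print(k, float(p))
```

Feeding the printed pairs to any plotting tool should give a bar chart for the $n=50$, $c=20$ case.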

It can also be computed directly that $$\mathrm{E}[K]=\frac{nc}{n+c-1}$$ and $$\operatorname{Var}[K]=\frac{nc(nc-1)}{(n+c-1)(n+c-2)}-\left(\frac{nc}{n+c-1}\right)^2$$ where the first term of the variance is $\mathrm{E}[K^2]$.
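The closed forms for the mean and variance can be cross-checked against the probability mass function by summing $k\Pr(K=k)$ and $k^2\Pr(K=k)$ directly. The sketch below does this in exact arithmetic for a few $(n,c)$ pairs; the helper names `moments_from_pmf` and `moments_closed_form` are hypothetical, and the pairs tested are arbitrary choices of mine.

```python
from fractions import Fraction
from math import comb


def moments_from_pmf(n, c):
    """Mean and variance of K obtained by summing over the pmf."""
    total = comb(n + c - 1, c - 1)
    pmf = {
        k: Fraction(comb(c, k) * comb(n - 1, k - 1), total)
        for k in range(1, c + 1)
    }
    mean = sum(k * p for k, p in pmf.items())
    second_moment = sum(k * k * p for k, p in pmf.items())
    return mean, second_moment - mean ** 2


def moments_closed_form(n, c):
    """Mean and variance of K from the closed-form expressions."""
    mean = Fraction(n * c, n + c - 1)
    second_moment = Fraction(n * c * (n * c - 1), (n + c - 1) * (n + c - 2))
    return mean, second_moment - mean ** 2


# Compare both routes on a few arbitrary cases (all with n + c > 2).
for n, c in [(50, 20), (20, 20), (7, 3)]:
    assert moments_from_pmf(n, c) == moments_closed_form(n, c)
```

For $n=50$, $c=20$ the mean is $1000/69$, a little under 14.5 colors.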

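EDIT: Numerical evidence suggests that my probability mass function gives correct results even for $n<c$. This is consistent with Vandermonde's identity, $$\sum_{k}\mathrm{C}(c,k)\mathrm{C}(n-1,n-k)=\mathrm{C}(n+c-1,n)$$ which does not require $n\geq c$: the terms with $k>n$ simply vanish, so the probabilities still sum to 1.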
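One way to gather this kind of numerical evidence is sketched below. It only checks two things for $n<c$: that the probabilities still sum to 1, and that no mass is placed on $k>n$ (a sample of $n$ balls cannot show more than $n$ colors). The function name `pmf_terms` and the range of $c$ tested are my own assumptions, and this is a spot check rather than a proof.

```python
from fractions import Fraction
from math import comb


def pmf_terms(n, c):
    """List of Pr(K = k) for k = 1..c, with no restriction on n versus c."""
    total = comb(n + c - 1, c - 1)
    return [
        Fraction(comb(c, k) * comb(n - 1, k - 1), total)
        for k in range(1, c + 1)
    ]


# All pairs with 1 <= n < c for a small, arbitrary range of c.
small_samples = [(n, c) for c in range(2, 15) for n in range(1, c)]

for n, c in small_samples:
    terms = pmf_terms(n, c)
    # Probabilities still sum to exactly 1.
    assert sum(terms) == 1
    # terms[k - 1] is Pr(K = k); everything beyond k = n must be zero.
    assert all(p == 0 for p in terms[n:])
```

The zero terms come from `math.comb(n - 1, k - 1)`, which returns 0 whenever `k - 1 > n - 1`.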

Tags:

Probability