The Birthday Paradox with combinatorics

The Scientific American's calculation is not quite accurate: it supposes that every non-match between two people is independent from any other non-match between two people, but that is not the case.

To see this, consider just 3 people: A, B, and C.

What is the chance that they don't share a birthday?

Scientific American's approach is to say: well, we need the following three events:

E1. A and B don't share a birthday

E2. A and C don't share a birthday

E3. B and C don't share a birthday

Now, individually, the probability of each of those events is $\frac{364}{365}$

So, they say: $P(E1 \cap E2 \cap E3) = P(E1) \cdot P(E2) \cdot P(E3) = (\frac{364}{365})^3$

But again, that is assuming these three events are independent ... and they are not! For example, once we know that $A$ does not share a birthday with either $B$ or $C$ (i.e. once we have events $E1$ and $E2$) the probability of $B$ not sharing a birthday with $C$ (i.e. event $E3$) is no longer a simple $\frac{364}{365}$, because it is no longer possible for $B$ or $C$ to have their birthday on the day that $A$ has their birthday.

Indeed, the more people we know do not share a birthday with $B$ and $C$, the fewer options become available for $B$ and $C$ to have their birthdays, and this will in fact increase the chances of $B$ and $C$ to share a birthday, and thus decrease the chance of them not sharing a birthday.

To illustrate how the probability of $B$ and $C$ not sharing their birthday is affected once we know they don't share the same birthday with $A$ (how, indeed, that probability has gone down), consider what happens when we consider only three possible days that each of $A$, $B$, and $C$ can have their birthday.

First of all, the probability of $B$ and $C$ sharing their birthday not knowing anything about $A$ is $\frac{1}{3}$, and hence there is a $\frac{2}{3}$ probability that they don't share their birthday.

OK, but now let's add the information that $B$ nor $C$ share their birthday with $A$. That means that there are now only two days left for $B$ and $C$ to have their birthday, meaning that now there is a $\frac{1}{2}$ probability that they share their birthday, and hence a $\frac{1}{2}$ probability that they don't share their birthday. So, indeed, the probability that $B$ and $C$ don't share their birthday has gone down once we know they don't share their birthday with $A$.

And indeed, note what happens when we have even fewer possible birthdays: if we get to a point where we have more people than possible birthdays, then the probability of two people not matching their birthdays given that no one else shares their birthday with any of these two people simplybecomes $0$.

In sum: the probability that Scientific American assumes for each non-match is too high.

Now, with the actual birthday problem we are not considering $3$ people and $3$ possible birthdays, but $23$ people with $365$ possible birthdays. And with so many more birthdays than people, it turns out that Scientific American's calculations are not going to be that far off. But still: the $\frac{364}{365}$ is a little too high for the remaining non-matches once you already know other non-matches, meaning that the $(\frac{364}{365})^{253}$ is a little too big, and thus when subtracted from $1$, the Scientific American's calculation ends up being a little too low.

But of course, it would be hard to tell this beforehand (note that the $\frac{364}{365}$ does get raised to the $253$-th power!) ... It happened to get 'close enough' for this problem ... but I really don't like to think that it was an a priori justified simplification. You certainly can't use this kind of approximation to establish that $23$ people was the magic 'breaking point' as opposed to $22$ people, or maybe even fewer. Indeed, they we're lucky that with their calculation for $23$ people it just stayed above $0.5$.

Your calculation, however, is completely correct. Good job!!