How does expectation maximization work?

These are the likelihoods of the corresponding set of $10$ coin tosses having been produced by the two coins (using the current estimate for their biases) normalized to add up to $1$. The estimated probability of $k$ out of $10$ tosses of coin $i$ ($i\in\{A,B\}$) yielding heads is

$$p_i(k)=\binom{10}{k} \theta_i^k (1-\theta_i)^{10-k}\;.$$

The binomial coefficient is the same for both coins, so it drops out in the normalization, and only the ratio of the remaining factors determines the result.
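For concreteness, here is a minimal Python sketch of this computation; the function names are my own, not anything from the original figure:

```python
from math import comb

def binomial_likelihood(theta, k, n=10):
    """Probability of k heads in n tosses of a coin with bias theta."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

def responsibilities(theta_A, theta_B, k, n=10):
    """Normalized likelihoods: how strongly each coin 'claims' a set of
    n tosses with k heads, assuming equal priors on the two coins."""
    p_A = binomial_likelihood(theta_A, k, n)
    p_B = binomial_likelihood(theta_B, k, n)
    total = p_A + p_B  # the binomial coefficient cancels in this ratio
    return p_A / total, p_B / total
```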

For instance, in the second row, we have $9$ heads and $1$ tails. Given the current bias estimates $\theta_A=0.6$ and $\theta_B=0.5$, the factors are

$$\theta_A^9 (1-\theta_A)^{10-9}\simeq0.004$$

and

$$\theta_B^9 (1-\theta_B)^{10-9}\simeq0.001\;,$$

resulting in the numbers

$$\frac{0.004}{0.004+0.001}=0.8$$

and

$$\frac{0.001}{0.004+0.001}=0.2$$

in the second row.
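Running the sketch above on this row reproduces these numbers (up to rounding):

```python
w_A, w_B = responsibilities(theta_A=0.6, theta_B=0.5, k=9)
print(round(w_A, 2), round(w_B, 2))  # prints: 0.8 0.2
```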


Consider one of the coin-toss realizations in the figure.

Let $P(H_9T_1|A)$ be the probability of observing 9 heads and 1 tail given that the coin is A.

Let $P(H_9T_1|B)$ be the probability of observing 9 heads and 1 tail given that the coin is B.

Let $P(A|H_9T_1)$ be the probability of the coin being A given that you observe 9 heads and 1 tail.

Let $P(B|H_9T_1)$ be the probability of the coin being B given that you observe 9 heads and 1 tail.

Apply the definition of conditional probability (Bayes' rule):

$P(A|H_9T_1) = \frac{P(A) \cdot P(H_9T_1|A)}{P(H_9T_1)}$

$P(B|H_9T_1) = \frac{P(B) \cdot P(H_9T_1|B)}{P(H_9T_1)}$

Now,

$P(A) = P(B) = 0.5$, since either coin is equally likely to have been chosen a priori.

Estimates of $P(H_9T_1|A)$ and $P(H_9T_1|B)$ are computed using the method described by @joriki (the binomial formula above).

Since the coin is either A or B, $P(A|H_9T_1) + P(B|H_9T_1) = 1$; equivalently, $P(H_9T_1) = P(A)\,P(H_9T_1|A) + P(B)\,P(H_9T_1|B)$ by the law of total probability.

Hence you can calculate the numbers in step 2 of the figure: they are $P(A|H_9T_1)$ and $P(B|H_9T_1)$, respectively.
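To tie this back to the full algorithm, here is a minimal sketch of one complete EM iteration, reusing the `responsibilities` helper from the sketch above; the head counts are hypothetical stand-ins for the five sets of tosses in the figure, which is not reproduced here:

```python
def em_step(theta_A, theta_B, head_counts, n=10):
    """One EM iteration for the two-coin problem.

    E-step: posterior probability that each set of tosses came from A or B.
    M-step: re-estimate each bias as expected heads / expected tosses.
    """
    heads_A = tosses_A = heads_B = tosses_B = 0.0
    for k in head_counts:
        w_A, w_B = responsibilities(theta_A, theta_B, k, n)
        heads_A += w_A * k
        tosses_A += w_A * n
        heads_B += w_B * k
        tosses_B += w_B * n
    return heads_A / tosses_A, heads_B / tosses_B

# Hypothetical head counts for five sets of 10 tosses each.
counts = [5, 9, 8, 4, 7]
theta_A, theta_B = 0.6, 0.5
for _ in range(10):
    theta_A, theta_B = em_step(theta_A, theta_B, counts)
print(theta_A, theta_B)  # the biases settle near the maximum-likelihood fit
```

Iterating updates like this until the estimates stop changing is the EM loop: each iteration provably does not decrease the likelihood of the observed data.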
