The case of the missing ninth of a $2$€ coin

The best intuitive understanding I can offer comes from looking at some pictures.


Picture 1:

[Figure: graph of probabilities]

For a number of coins $x=A+B$ (x-axis) and total $n=A+2B$ (y-axis), I plot the probability (colour, yellower means higher probability, violet means impossible) of getting the total $n$ given the number of coins $x=A+B$. The black dashed line is the line $A=B$.

The fact that, for a given number of coins, $A-B$ has average zero is seen by the fact that the distribution along vertical lines is perfectly symmetric about the black dashed line. I've drawn a vertical dotted blue line to guide the eye.

The fact that, for a given $n$, $A-B$ may have non-zero expectation is seen by the fact that this symmetry is absent along horizontal lines. (Again, there's a green dotted line to guide the eye.) The distribution is still peaked very close to or on the line $A=B$. But we might guess from the picture that the distribution is skewed a little to the right: typically, slightly more coins are drawn than the naive $A=B$ line suggests, and hence more of them are lower-value coins, as you observed.
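
The two kinds of slice can also be checked numerically. The following is a minimal sketch, assuming fair independent draws of $1$€ and $2$€ coins, so that $P(n \mid x) = \binom{x}{n-x}/2^x$; the function names are made up for illustration. It averages $A-B = 3x-2n$ along a vertical line (fixed $x$) and along a horizontal line (fixed $n$, renormalised over the paths that hit $n$ exactly).

```python
from fractions import Fraction
from math import comb


def prob_total_given_coins(n, x):
    """P(total == n after exactly x fair draws worth 1 or 2 each)."""
    b = n - x  # number of 2-euro coins needed to reach n with x coins
    if b < 0 or b > x:
        return Fraction(0)
    return Fraction(comb(x, b), 2**x)


def mean_diff_fixed_coins(x):
    """E[A - B] along a vertical line: the number of coins x is fixed."""
    # With x coins and total n we have B = n - x and A = 2x - n, so A - B = 3x - 2n.
    return sum(prob_total_given_coins(n, x) * (3 * x - 2 * n)
               for n in range(x, 2 * x + 1))


def mean_diff_fixed_total(n):
    """E[A - B | total hits n exactly] along a horizontal line."""
    weights = {x: prob_total_given_coins(n, x)
               for x in range((n + 1) // 2, n + 1)}
    total = sum(weights.values())  # probability of ever hitting n exactly
    return sum(w * (3 * x - 2 * n) for x, w in weights.items()) / total


if __name__ == "__main__":
    print(mean_diff_fixed_coins(10))          # vertical slice
    print(float(mean_diff_fixed_total(60)))   # horizontal slice
```

Under these assumptions the vertical average is exactly zero for every $x$, while the horizontal average oscillates for small $n$ and settles near $1/3$.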


Picture 2:

[Figure: graph of probabilities]

This is the same picture, except shifted by subtracting the average total of $x$ coins off the y-axis. Hence the y-axis $(B-A)/2$ is now proportional to the number of excess large denomination coins, and the black dotted line is horizontal.

This picture makes the symmetry more explicit, so is perhaps better.

You can also more easily imagine following a random walk to the right. The two questions now involve waiting until your random walk hits either the blue (fixed number of draws) or green (fixed total) line, and then looking at whether you expect to be above or below the black dotted line when that happens. Again, this isn't totally obvious, though it is suggestive that you expect to hit it below, as you have shown.


I like these pictures because they convey pictorially the independence of the two quantities you are calculating -- the constraint that the marginal distribution has a nice property along every vertical line says very little in general about its properties along other lines.


I am not sure how rigorous my math is, but here is an argument for:

Claim: Let $F_n$ denote the event of reaching $n$. If $\lim_{n \to \infty} E[A-B \mid F_n] = \ell$ exists, then $\ell = 1/3$.

The proof is a mixture of steady state arguments and martingales.

Suppose while you are drawing coins, happily approaching $n$, I am betting on the coins you draw. To minimize confusion, I am betting in US dollars. Every time you draw, I bet $1$ USD, at even odds, that you would draw a $1$€ coin. At any point in time my profit is exactly $P=A-B$ USD.

Here is the stopping rule: my game ends when you get to $n$ or beyond (i.e. $n+1$). The stopping time is bounded (at most $n$ draws are needed), so Doob's optional stopping theorem applies and we have $E[P] = 0$ when my game stops.

Now, my game can end in one of $3$ ways, and by steady state arguments, each is equally likely for large $n$:

  • (X) The last step was $n-2 \to n$

  • (Y) The last step was $n-1 \to n$

  • (Z) The last step was $n-1 \to n+1$
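
The steady-state claim for these three endings can be checked exactly. This is a small sketch under the assumption of fair independent draws; `hit_probabilities` and `ending_probabilities` are illustrative names. It uses the recurrence $h_k = \tfrac12 h_{k-1} + \tfrac12 h_{k-2}$ for the probability $h_k$ that the running total ever equals $k$, and then multiplies by the probability $\tfrac12$ of the final draw.

```python
from fractions import Fraction


def hit_probabilities(n):
    """h[k] = P(the running total ever equals k), for k = 0..n."""
    h = [Fraction(1), Fraction(1, 2)]  # start at total 0; reach 1 only via a first draw of 1
    for k in range(2, n + 1):
        # total k is reached from k-1 with a 1-euro draw or from k-2 with a 2-euro draw
        h.append((h[k - 1] + h[k - 2]) / 2)
    return h[: n + 1]


def ending_probabilities(n):
    """Probabilities of the endings X, Y, Z for a threshold n >= 2."""
    h = hit_probabilities(n)
    half = Fraction(1, 2)
    return {
        "X": h[n - 2] * half,  # n-2 -> n
        "Y": h[n - 1] * half,  # n-1 -> n
        "Z": h[n - 1] * half,  # n-1 -> n+1
    }


if __name__ == "__main__":
    for name, p in ending_probabilities(40).items():
        print(name, float(p))
```

For small $n$ the three probabilities differ noticeably (for $n=2$ they are $\tfrac12, \tfrac14, \tfrac14$), which is why the argument is only claimed in the limit.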

Curiously, in $2$ of the $3$ cases, I lost $1$ USD on that last bet. By the law of total expectation:

$$ E[P] = \frac13 (E[P \mid X] + E[P \mid Y] + E[P \mid Z])$$

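Each ending consists of reaching $n-2$ or $n-1$ and then making one more independent draw, so conditioning on the ending only adds the last win or loss to my profit at the earlier total, e.g. $E[P \mid X] = E[A-B \mid F_{n-2}] - 1$. Writing $E[P \mid F_k]$ for $E[A-B \mid F_k]$, using $E[P] = 0$ and multiplying through by $3$, this becomes: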

$$0 = (E[P \mid F_{n-2}] - 1) + (E[P \mid F_{n-1}] + 1) + (E[P \mid F_{n-1}] - 1)$$

So if the limit exists, we have:

$$0 = (\ell -1) + (\ell + 1) + (\ell - 1) \implies \ell = \frac13 ~~~~~\square$$
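
As a sanity check on the argument, the finite-$n$ version of this identity can be verified with exact arithmetic. The sketch below assumes fair independent draws and uses illustrative names; it tracks $h_k = P(F_k)$ and $g_k = E[(A-B)\,\mathbf 1_{F_k}]$ by recurrence, weights the three endings by their exact (not yet equal) probabilities, and evaluates $E[A-B \mid F_n] = g_n/h_n$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def hit_stats(n):
    """Return (h, g) with h[k] = P(F_k) and g[k] = E[(A - B) * 1{F_k}]."""
    h = [Fraction(1), HALF]
    g = [Fraction(0), HALF]
    for k in range(2, n + 1):
        h.append(HALF * (h[k - 1] + h[k - 2]))
        # arriving from k-1 adds a 1-euro coin (+1); arriving from k-2 adds a 2-euro coin (-1)
        g.append(HALF * (g[k - 1] + h[k - 1]) + HALF * (g[k - 2] - h[k - 2]))
    return h, g


def conditional_mean(n):
    """E[A - B | F_n]."""
    h, g = hit_stats(n)
    return g[n] / h[n]


def stopped_profit_mean(n):
    """E[P] at the stopping time, split over the endings X, Y, Z (n >= 2)."""
    h, _ = hit_stats(n)
    p_x, p_y, p_z = HALF * h[n - 2], HALF * h[n - 1], HALF * h[n - 1]
    return (p_x * (conditional_mean(n - 2) - 1)
            + p_y * (conditional_mean(n - 1) + 1)
            + p_z * (conditional_mean(n - 1) - 1))


if __name__ == "__main__":
    print(stopped_profit_mean(25))        # optional stopping says this is 0
    print(float(conditional_mean(80)))    # should be close to 1/3
```

This does not prove that the limit exists, but it is consistent with both the optional stopping step and the value $\ell = 1/3$.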

As I mentioned in the very beginning, I am not sure about the rigor of the argument. Critiques, corrections, comments are most welcome.


This feels like a Renewal Theory problem to me. Typically when you run into a 'paradox' in these situations it's because of (a) a convergence issue or (b) some feature of the problem doing things you didn't intend. Depending on preference we can assign blame to (a) or (b) here.

Let $X_k := \mathbb I_k + 1$ be your iid draw on the $k$th selection (i.e. a fair Bernoulli plus 1),

and let $S_k := X_1 + X_2 + ... + X_k$.

We can see that $S_k$ is integer valued and strictly increasing. And so with a valid stopping rule we should be comfortable that there aren't convergence issues.

But depending on how you want to look at it, we could say there is in effect a convergence problem because you are using a defective stopping rule, i.e. under your stopping criterion a meaningful fraction of sample paths never stop or never get 'counted' under your problem. (I'll show that $\approx \frac{2}{3}$ of sample paths properly stop under your rule, i.e. $\approx \frac{1}{3}$ of paths never stop, so the stopping rule is defective. I'll reframe this so that the game stops when $S_k \geq n$, which occurs with probability 1, but there is only a 'reward' when the game stops at $S_k = n$. This can be interpreted as Renewal Rewards.)

That is, for large enough $n$, when you condition on stopping with a score of exactly $n$ you are suppressing / discarding about $\frac{1}{3}$ of all sample paths. And this is why you cannot make a fair-game type of relation like

"for any given number of coins the expectation is 0, but for any given value of the coins it's positive" .

If you had a valid stopping rule you could make such a claim (subject to other convergence subtleties for any given martingale).

The renewal theory bit:
A residual life renewal chain can be very helpful here. The standard form is

$P = \left[\begin{matrix}p_1 & p_2 & p_3 & p_4 & p_5 & \dots \\ 1 & 0 & 0 & 0 & 0 & \dots\\ 0 & 1 & 0 & 0 & 0 & \dots\\ 0 & 0 & 1 & 0 & 0 & \dots\\ \vdots & \vdots & \vdots & \ddots & \ddots & \ddots\\ \end{matrix}\right]$

where $p_i$ stands for probability of a return (renewal) at integer time $i$

for this problem, it is very simple and we have
$P = \left[\begin{matrix}\frac{1}{2} & \frac{1}{2} \\1 & 0\\\end{matrix}\right]$

given a start at state one we have expected time until return of
$\bar{X} = \frac{1}{2}\cdot 1 +\frac{1}{2}\cdot 2 = \frac{3}{2}$

(For reference, Feller volume 1's Markov Chains chapter does a very good job discussing this chain which complements its chapter on Renewal Theory.)

Now fix some large $n$. If you give someone a reward of 1 for hitting exactly $n$, the expected reward may be modelled as $\mathbf e_1^T P^n \mathbf e_1$
(with $\mathbf e_1$ a standard basis vector)

and for large enough n we have
$P^n \approx \mathbf 1 \left[\begin{matrix}\frac{1}{\bar{X}} \\ 1-\frac{1}{\bar{X}} \\\end{matrix}\right]^T = \mathbf 1\left[\begin{matrix}\frac{2}{3} \\ \frac{1}{3} \\\end{matrix}\right]^T$
(this is dealt with as an analytical problem of "repeated averaging" on page 333 of 3rd edition of Feller vol 1)

hence for $n$ large enough we have
$\mathbf e_1^T P^n \mathbf e_1 \approx \frac{1}{\bar{X}} = \frac{2}{3}$
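
The convergence of $P^n$ is easy to see numerically. Here is a minimal sketch with exact fractions for the $2\times 2$ chain above; the helper names are illustrative and the matrix power is done by plain repeated multiplication rather than any library routine.

```python
from fractions import Fraction

# Residual-life chain for this problem: renewal after 1 or 2 euros, each with probability 1/2
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1), Fraction(0)]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matpow(m, n):
    """m raised to the n-th power by repeated multiplication."""
    result = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
    for _ in range(n):
        result = matmul(result, m)
    return result


def prob_hit_exactly(n):
    """e_1^T P^n e_1: probability that the running total hits n exactly."""
    return matpow(P, n)[0][0]


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 50):
        print(n, float(prob_hit_exactly(n)))
```

The entries oscillate around the limit because the second eigenvalue of $P$ is $-\tfrac12$, so the error shrinks by about half with each extra euro.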

interpretation: I'd suggest using associativity and considering this as
$\mathbf e_1^T \big(P^n \mathbf e_1\big) = \mathbf e_1^T \big(P\big(... \big(P\big(P\mathbf e_1\big)\big)...\big)\big)$
which reads as follows. Assume we've stopped (i.e. $S_T \geq n$), which we know occurs with probability 1. Then $P\mathbf e_1$ is the vector of expected rewards one euro prior to stopping, $P^2\mathbf e_1$ is the vector of expected rewards 2 euros prior to stopping, $P^3\mathbf e_1$ the vector of expected rewards 3 euros prior to stopping, and so on up to $P^n\mathbf e_1$. We start in the initial state and thus collect the first component of that vector, given by $\mathbf e_1^T P^n\mathbf e_1$. The key idea under this interpretation is that $r$ in $P^r$ denotes the number of euros we are from our threshold, and via backward induction we can reason about expected rewards all the way back to the starting state, when we have 0 euros, i.e. are $n$ from the stopping threshold.

now to confirm the obvious, if you had the stopping rule in your original problem that stopping occurred when $S_k \geq n$ for some k, we could model this as a renewal rewards problem that gives you a reward of $1$ if you stop at $n$ and a reward of 1 if you stop at $n+1$ (this is the overshoot). So expected rewards are

$\mathbf e_1^T P^n\mathbf 1 = \mathbf e_1^T \big( P^n\mathbf 1\big) = \mathbf e_1^T\mathbf 1 = 1$

because $P$ is row stochastic. So this confirms the obvious: the expected reward is 1. In both cases there is a reward of 1 for 'success' and a reward of 0 for 'failure', so the expected reward gives the probability of a Bernoulli, i.e. the probability of hitting the thing we are rewarding.

That ended up being a bit long. Here's a quick look at your problem with a proper stopping rule in place, i.e. stop when $S_k \geq n$. We now have $E\big[T\big] \lt\infty$ and in fact $T$ is now a bounded random variable.

setting up the Wald Equation
$S_T = X_1 + X_2 + ... + X_T$
and taking expectations (i.e. Wald's Equality) gives us
$n + \delta = E\big[S_T\big] = \bar{X} E\big[T\big] = \frac{3}{2} E\big[T\big]$
for some $\delta \in (0,1)$ (here $\delta$ is the probability of overshooting to $n+1$, about $\frac{1}{3}$ for large $n$), so
$E\big[T\big] = \frac{2}{3}\big(n + \delta\big)$

Since $S_T = A + 2B$ and $T = A + B$, this sets up a system of linear equations
$\left[\begin{matrix}1 &2 \\1 & 1\\\end{matrix}\right]\left[\begin{matrix}E\big[A\big] \\E\big[B\big] \\\end{matrix}\right] = \left[\begin{matrix}n + \delta \\E\big[T\big] \\\end{matrix}\right] = \left[\begin{matrix}1 \\ \frac{2}{3}\\\end{matrix}\right]\big(n + \delta \big) $

The matrix is invertible and you can see at a glance that
$\big(n + \delta \big) \left[\begin{matrix}1 &2 \\1 & 1\\\end{matrix}\right]\left[\begin{matrix}\frac{1}{3} \\ \frac{1}{3}\\ \end{matrix}\right]$ $= \big(n + \delta \big) \left[\begin{matrix}1 \\ \frac{2}{3}\\ \end{matrix}\right]$

which confirms that for a proper stopping rule, via the Wald Equality, we know that

$E\big[A\big]=E\big[B\big] = \frac{1}{3}\big(n +\delta\big) \approx \frac{1}{3}n$
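
The Wald computation can be cross-checked by backward induction over the running total. This is a sketch assuming fair independent draws and the stopping rule $S_k \geq n$; `expected_counts` is an illustrative name. It computes $E[A]$ and $E[B]$ exactly and recovers $\delta = E[S_T] - n$ from them.

```python
from fractions import Fraction


def expected_counts(n):
    """Return (E[A], E[B]) when drawing stops at the first k with S_k >= n."""
    half = Fraction(1, 2)
    # ea[s], eb[s]: expected number of further 1-euro / 2-euro draws from running total s.
    # Totals n and n+1 are stopped states, so their entries stay 0.
    ea = [Fraction(0)] * (n + 2)
    eb = [Fraction(0)] * (n + 2)
    for s in range(n - 1, -1, -1):
        ea[s] = half * (1 + ea[s + 1]) + half * ea[s + 2]
        eb[s] = half * eb[s + 1] + half * (1 + eb[s + 2])
    return ea[0], eb[0]


if __name__ == "__main__":
    n = 30
    e_a, e_b = expected_counts(n)
    delta = e_a + 2 * e_b - n  # E[S_T] - n, the expected overshoot
    print(float(e_a), float(e_b), float(delta))
```

With the proper stopping rule the two expectations agree exactly for every $n$, in contrast to the conditional expectation given that the total hits $n$ exactly.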