When to stop rolling a die in a game where 6 loses everything

Before deciding whether to stop or roll, suppose you have a non-negative integer number of points $n$.

How many more rolls should you commit to in order to maximise the expected gain relative to stopping now (which gains zero)?

Suppose that further number of rolls is another non-negative integer $k$. Now consider the $6^k$ possible sequences of $k$ rolls:

In $5^k$ of those sequences there is no six and you win some points. The sum $D_k$, taken over all such sequences, of the total rolled within each sequence satisfies the recurrence relation $$D_0=0\quad D_{k+1}=5D_k+15\cdot5^k$$ because appending a roll of 1 to 5 to each of the $5^k$ six-free sequences repeats every existing total five times and adds $1+2+3+4+5=15$ per sequence. This has the closed form $$D_k=15k\cdot5^{k-1}=3k\cdot5^k$$
In the remaining $6^k-5^k$ sequences there is at least one six and you lose the $n$ points you had beforehand.
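As a sanity check on $D_k$, here is a minimal Python sketch that compares a brute-force enumeration of six-free sequences with the recurrence and the closed form. The function names are my own and purely illustrative.

```python
from itertools import product

def brute_force_D(k):
    """Sum of roll totals over all k-roll sequences that contain no six."""
    return sum(sum(seq) for seq in product(range(1, 6), repeat=k))

def recurrence_D(k):
    """D_0 = 0, D_{j+1} = 5*D_j + 15*5**j."""
    d = 0
    for j in range(k):
        d = 5 * d + 15 * 5**j
    return d

def closed_form_D(k):
    """Closed form D_k = 3k * 5**k."""
    return 3 * k * 5**k

# All three agree for small k (5**6 = 15625 sequences at most).
for k in range(7):
    assert brute_force_D(k) == recurrence_D(k) == closed_form_D(k)
```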

So the expected gain when you have $n$ points and try to roll $k$ more times before stopping is $$G(n,k)=\frac{D_k-n(6^k-5^k)}{6^k}=\frac{3k\cdot5^k-n(6^k-5^k)}{6^k}$$ For a fixed $n$, the $k$ that maximises $G(n,k)$ is $m(n)=\max(5-\lfloor n/3\rfloor,0)$; if $3\mid n$ and $n\le15$ then $k=m(n)+1$ also attains the maximum.
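The claim about the maximising $k$ can be checked numerically. The sketch below evaluates $G(n,k)$ with exact fractions and lists every $k$ that attains the maximum; `maximisers` and its cut-off `k_max` are hypothetical helpers, and the cut-off is assumed to be large enough because $G(n,k)$ only decreases once $k$ passes the peak.

```python
from fractions import Fraction

def G(n, k):
    """Expected gain from committing to k more rolls with n points in hand."""
    return Fraction(3 * k * 5**k - n * (6**k - 5**k), 6**k)

def m(n):
    """Claimed maximising number of further rolls."""
    return max(5 - n // 3, 0)

def maximisers(n, k_max=30):
    """All k in 0..k_max at which G(n, k) attains its maximum."""
    values = [G(n, k) for k in range(k_max + 1)]
    best = max(values)
    return [k for k, v in enumerate(values) if v == best]

for n in (0, 1, 3, 14, 15, 18):
    print(n, m(n), maximisers(n))
```

For $n$ divisible by 3 up to 15 the list has two entries, $m(n)$ and $m(n)+1$; otherwise it has only $m(n)$.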

Suppose we fix the maximum number of rolls before starting the game. At $n=0$, $k=5$ and $k=6$ maximise $G(n,k)$ and the expected score with this strategy is $$G(0,5)=\frac{15625}{2592}=6.028163\dots$$ But what if we roll once and then fix the maximum number of rolls afterwards? If we roll 1 or 2, we roll at most 5 more times; if 3, 4 or 5, 4 more times. The expected score here is higher: $$\frac16(1+G(1,5)+2+G(2,5)+3+G(3,4)+4+G(4,4)+5+G(5,4))=6.068351\dots$$ We will get an even higher expected score if we roll twice and then set the roll limit. This suggests that the greedy strategy, outlined below, is optimal:

Before the start of each new roll, calculate $m(n)$. Roll if this is positive and stop if this is zero.
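The two expected scores compared above (committing to a roll count at the start versus committing after one roll) can be reproduced with exact arithmetic. This is only a sketch; the variable names are mine, and a six on the first roll is assumed to end the game on 0 points.

```python
from fractions import Fraction

def G(n, k):
    """Expected gain from committing to k more rolls with n points in hand."""
    return Fraction(3 * k * 5**k - n * (6**k - 5**k), 6**k)

def m(n):
    """Maximising number of further rolls for n points."""
    return max(5 - n // 3, 0)

# Fix the number of rolls before the game starts.
commit_at_start = G(0, m(0))

# Roll once first: a face f in 1..5 leaves f points, after which we commit
# to m(f) more rolls; a six contributes 0 to the sum.
commit_after_one = sum(f + G(f, m(f)) for f in range(1, 6)) / 6

print(commit_at_start, float(commit_at_start))
print(commit_after_one, float(commit_after_one))
```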

When $n\ge15$, $m(n)=0$. A naïve calculation that formed the previous version of this answer says that rolling once has zero expected gain when $n=15$ and negative expected gain when $n>15$. Together, these suggest that we should stop if and when we have 15 or more points.

Finding a way to calculate the expected score under this "stop-at-15" strategy took quite a while for me to conceptualise and then program, but I managed it in the end; the program is here. The expected score works out to be $$\frac{2893395172951}{470184984576}=6.1537379284\dots$$ So this is the maximum expected score you can achieve.
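One way to compute the expected score of a "stop at a threshold" strategy is a short recursion over the current score. This is a minimal sketch and not the program linked above; `expected_score` is a hypothetical name, and rolling a six is assumed to end the game on 0 points.

```python
from fractions import Fraction
from functools import lru_cache

def expected_score(threshold):
    """Exact expected final score when rolling until the score reaches threshold."""
    @lru_cache(maxsize=None)
    def value(score):
        if score >= threshold:
            return Fraction(score)  # stop and keep the points
        # A six (probability 1/6) ends the game on 0, so it adds nothing.
        return sum(value(score + face) for face in range(1, 6)) / 6
    return value(0)

result = expected_score(15)
print(result, float(result))
```

For `threshold=15` this should reproduce the fraction quoted above. Thresholds 15 and 16 give the same value, because rolling at exactly 15 points has zero expected gain.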

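Consider the last roll you make: you stand to gain $\frac{1+2+3+4+5}{6}=\frac{15}{6}$ in expectation, or to lose your current score $p$ with probability $\frac16$, an expected loss of $\frac p6$. Whenever the expected loss exceeds the expected gain, you should stop. So once you have scored more than 15 you should stop. If you have scored exactly 15, it doesn't matter whether you continue or stop.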

The question is missing the concept of utility, a function that specifies how much you value each possible outcome. In the game, the utility of ending the game with a certain score would be the value you place on that outcome. Although you could certainly argue it is implied in the question that the utility of a score is simply the score, I would like to add an answer that takes a non-trivial utility into account. If each point translated to 1000 USD, for example, you might have a utility that looks more like $U(x) = \log(x)$ than $U(x) = x$.

Let's say that $U(x)$ is the utility of score $x$ for $x \ge 0$ and assume that $U$ is monotonically non-decreasing. Then we might say that the optimal strategy is that which maximizes $E[U(X)]$, where $X$ is a random variable representing the final score if one plays with the policy where you roll the die if and only if your current score is less than $t \in \mathbb{Z}_{\ge 0}$. (It is clear that the optimal policy must have this form because the utility is non-decreasing.)

Let $Z$ denote current score. Suppose we are at a point in the game where our current score is $z \ge 0$. Then

$$E[U(X)|Z = z] = \frac{1}{6} \left( U(0) + \sum_{i=1}^5 E[U(X)|Z=z+i] \right) \text{ if } z < t$$

$$E[U(X)|Z = z] = U(z) \text{ if } z \ge t$$

Note that for many choices of $U(x)$ the recurrence relation is very difficult to simplify, and that, in the case of choosing to roll the die, we must consider the expected change in utility from that roll and all future rolls. The figures below are examples of what the above recurrence relation gives for $U(x) = x$ and $U(x) = \log_2(x + 1)$. The expression $E[U(X)]$ means $E[U(X)|Z=0]$, because at the start we have $0$ points. The horizontal axis corresponds to different policies, and the vertical axis corresponds to expected utility under each policy.

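[Figure: expected utility $E[U(X)]$ (vertical axis) for each threshold policy $t$ (horizontal axis), for $U(x) = x$ and $U(x) = \log_2(x + 1)$.]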

Gist with Python code
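For readers who want to experiment, the recurrence for $E[U(X)|Z=z]$ can be evaluated directly. The sketch below is independent of the gist; `expected_utility`, `best_threshold` and the search bound `t_max` are names and choices of my own, and floating-point arithmetic is used, so exact ties between thresholds may be broken by rounding.

```python
import math
from functools import lru_cache

def expected_utility(U, t):
    """E[U(X) | Z = 0] for the policy 'roll iff the current score is below t'."""
    @lru_cache(maxsize=None)
    def value(z):
        if z >= t:
            return U(z)  # stop and take the utility of the current score
        # A six ends the game with utility U(0); faces 1..5 add to the score.
        return (U(0) + sum(value(z + i) for i in range(1, 6))) / 6
    return value(0)

def best_threshold(U, t_max=40):
    """Threshold in 0..t_max with the largest expected utility."""
    return max(range(t_max + 1), key=lambda t: expected_utility(U, t))

def linear(x):
    return x

def log_utility(x):
    return math.log2(x + 1)

for name, U in (("linear", linear), ("log2(x+1)", log_utility)):
    t = best_threshold(U)
    print(name, t, expected_utility(U, t))
```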
