Die that never rolls the same number consecutively

An impressive array of heavy machinery has been brought to bear on this question. Here's a slightly more pedestrian solution.

The $6\cdot5^{999}$ admissible sequences of rolls are equiprobable, so we need to count how many of them contain $k$ occurrences of a given number, say, $6$.

Let $a_{jn}$ be the number of admissible sequences of length $n$ with exactly $j$ $6$s that begin and end with a non-$6$. Fix a non-$6$ at the beginning, glue a non-$6$ to the right of each of the $j$ $6$s and choose $j$ slots for the resulting glued blocks out of $n-j-1$ objects ($j$ glued blocks and $n-2j-1$ unglued non-$6$s). There are $5^{j+1}4^{n-2j-1}$ options for the non-$6$s, so that makes

$$ a_{jn}=\frac54\cdot4^n\left(\frac5{16}\right)^j\binom{n-j-1}j $$

sequences.

A sequence with $k$ $6$s can begin and end with non-$6$s, for a contribution of $a_{k,1000}$, begin with a $6$ and end with a non-$6$ or vice versa, for a contribution of $2a_{k-1,999}$, or begin and end with a $6$, for a contribution of $a_{k-2,998}$, for a total of

$$ a_{k,1000}+2a_{k-1,999}+a_{k-2,998}=4^{1000}\left(\frac5{16}\right)^k\left(\frac54\binom{999-k}k+2\binom{999-k}{k-1}+\frac45\binom{999-k}{k-2}\right)\;. $$

Dividing by the total number of admissible sequences yields the probability of rolling $k$ $6$s:

$$ \frac56\left(\frac45\right)^{1000}\left(\frac5{16}\right)^k\left(\frac54\binom{999-k}k+2\binom{999-k}{k-1}+\frac45\binom{999-k}{k-2}\right)\;. $$
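For readers who want to check this numerically, here is a small Python sketch (my own addition, not part of the original argument) that evaluates this distribution with exact rational arithmetic and verifies that it sums to $1$; the helper names `binom` and `prob` are purely illustrative.

```python
from fractions import Fraction
from math import comb

def binom(n, k):
    # Convention: binomial coefficients vanish outside 0 <= k <= n.
    return comb(n, k) if 0 <= k <= n else 0

def prob(k):
    """Probability of exactly k sixes in 1000 rolls, from the formula above (exact)."""
    c = (Fraction(5, 4) * binom(999 - k, k)
         + 2 * binom(999 - k, k - 1)
         + Fraction(4, 5) * binom(999 - k, k - 2))
    return Fraction(5, 6) * Fraction(4, 5) ** 1000 * Fraction(5, 16) ** k * c

# No two sixes can be adjacent, so k ranges from 0 to 500.
p = [prob(k) for k in range(501)]
assert sum(p) == 1                     # the distribution is properly normalised
mean = sum(k * pk for k, pk in enumerate(p))
var = sum(k * k * pk for k, pk in enumerate(p)) - mean ** 2
print(mean)         # 500/3, the expectation computed below
print(float(var))   # about 92.63, the variance computed below
```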

We could use this to calculate the expectation and variance, but this is more easily done using the linearity of expectation.

By symmetry, the expected number of $6$s rolled is $\frac{1000}6=\frac{500}3=166.\overline6$.

With $X_i$ denoting the indicator variable for the $i$-th roll being a $6$, the variance is

\begin{align} \operatorname{Var}\left(\sum_iX_i\right) &= \mathbb E\left(\left(\sum_iX_i\right)^2\right)-\mathbb E\left(\sum_iX_i\right)^2 \\ &= 1000\left(\frac16-\frac1{36}\right)+2\sum_{i\lt j}\left(\textsf{Pr}(X_i=1\land X_j=1)-\textsf{Pr}(X_i=1)\textsf{Pr}(X_j=1)\right) \\ &= 1000\left(\frac16-\frac1{36}\right)+2\sum_{i\lt j}\textsf{Pr}(X_i=1)\left(\textsf{Pr}(X_j=1\mid X_i=1)-\textsf{Pr}(X_j=1)\right)\;. \end{align}

The conditional probability $p_{j-i}=\textsf{Pr}(X_j=1\mid X_i=1)$ depends only on $j-i$; it satisfies $p_{k+1}=\frac15(1-p_k)$ and has initial value $p_1=0$, with solution $p_k=\frac{1-\left(-\frac15\right)^{k-1}}6$. Thus

\begin{align} \operatorname{Var}\left(\sum_iX_i\right) &= 1000\left(\frac16-\frac1{36}\right)-\frac1{18}\sum_{k=1}^{999}(1000-k)\left(-\frac15\right)^{k-1} \\ &= 1000\left(\frac16-\frac1{36}\right)-\frac1{18}\cdot\frac{25}{36}\left(\frac65\cdot1000-1+\left(-\frac15\right)^{1000}\right) \\ &=\frac{60025-5^{-998}}{648} \\ &\approx\frac{60025}{648} \\ &\approx92.63\;, \end{align}

in agreement with mercio's result.
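As an independent check (again my own addition), one can simulate the process directly. This Monte Carlo sketch uses nothing beyond the Python standard library; the sample mean and variance should land near $166.7$ and $92.6$.

```python
import random

def count_sixes(n=1000):
    """One run of n throws of the no-repeat die; returns the number of sixes."""
    prev = random.randint(1, 6)
    sixes = int(prev == 6)
    for _ in range(n - 1):
        # Pick uniformly among the 5 faces different from the previous one.
        roll = random.randint(1, 5)
        if roll >= prev:
            roll += 1
        sixes += (roll == 6)
        prev = roll
    return sixes

random.seed(0)
samples = [count_sixes() for _ in range(10_000)]
m = sum(samples) / len(samples)
v = sum((s - m) ** 2 for s in samples) / (len(samples) - 1)
print(m, v)   # should be close to 166.67 and 92.63
```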


Getting a precise answer is possible, but not very enlightening (it would be very complicated, with a thousand terms to compute), so I'll focus on two approximate results: the law of large numbers, and the central limit theorem. With $1000$ throws on such a simple system, they are bound to be quite accurate.

To begin with, let's say that we throw a standard die a large number $N$ of times.

The standard die

Let $(X_n)_{n \geq 0}$ be the successive results of the die, which take their values in $\{1, \ldots, 6\}$. Then the $X_n$ are independent and identically distributed (each yields any given number with probability $1/6$).

Let $f : \{1, \ldots, 6\} \to \mathbb{R}$ be an observable. Then the sequence $(f(X_n))_{n \geq 0}$ is also a sequence of independent and identically distributed random variables. Put $S_N f := \sum_{k=0}^{N-1} f(X_k)$. For instance, if we take:

  • $f(x) = 1_{x=4}$: then $S_N f$ is the number of times we get the number $4$ in the first $N$ throws.

  • $f(x) = x$: then $S_N f$ is the sum of the first $N$ throws.

Since the expectation is linear,

$$\mathbb{E} (S_N f) = \mathbb{E} \left(\sum_{k=0}^{N-1} f(X_k)\right) = \sum_{k=0}^{N-1} \mathbb{E} \left(f(X_k)\right) = N\mathbb{E} \left(f(X_0)\right).$$

But that only deals with the average value of $S_N f$. Since $f$ is bounded, and thus integrable, the (weak) law of large numbers asserts that, in distribution,

$$\lim_{N \to + \infty} \frac{S_N f}{N} =\mathbb{E} \left(f(X_0)\right).$$

This gives a first-order result, which is not completely satisfactory; for instance, if $\mathbb{E} \left(f(X_0)\right) = 0$, this law only tells us that the sum grows sub-linearly. We may want something more precise, which gives us a non-degenerate limit distribution. Since $f$ is bounded, it is square-integrable, and we may apply the central limit theorem:

$$\lim_{N \to + \infty} \frac{S_N f-N\mathbb{E} \left(f(X_0)\right)}{\sqrt{N}} = \sigma(f) \mathcal{N},$$

where the convergence is in distribution, $\mathcal{N}$ is a standard centered Gaussian, $\overline{f} = f-\mathbb{E} (f)$, and:

$$\sigma(f)^2 = \mathbb{E} (\overline{f}^2) = \frac{1}{6} \sum_{k=1}^6 \overline{f}(k)^2.$$

For instance, if $f(x) = 1_{x=4}$, we get:

$$\lim_{N \to + \infty} \frac{S_N f}{N} = \frac{1}{6},$$

so roughly $1/6$th of the throws yield a $4$, and:

$$\lim_{N \to + \infty} \frac{S_N f-N/6}{\sqrt{N}} = \frac{\sqrt{5}}{6} \mathcal{N},$$

which tells us, for instance, that the number of $4$s will lie, with probability $0.95$, in the interval $[N/6-\sqrt{5N}/3, N/6+\sqrt{5N}/3]$. For $N = 1000$ throws, this is the interval $[143,190]$.
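As a quick numerical illustration (an addition of mine, assuming Python), the endpoints of this roughly-two-standard-deviation band can be evaluated directly:

```python
from math import sqrt

N = 1000
sigma = sqrt(5) / 6                 # sigma(f) for f = 1_{x=4} on the standard die
half = 2 * sigma * sqrt(N)          # ~2 standard deviations, i.e. sqrt(5N)/3
print(N / 6 - half, N / 6 + half)   # about 143.1 and 190.2
```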

The magical die

I'll now show how these results apply to the magical die. However, I won't prove what I assert, because the proofs tend to be quite high-level (I typically do this in a graduate course). I'll assume that the die is fair: for the first throw, the $6$ possible outcomes are equally likely, and for each throw past the first, the $5$ possible outcomes are also equally likely. The first remark is that, for any given throw,

$$\mathbb{P} (X_n = 1) = \ldots = \mathbb{P} (X_n = 6) = \frac{1}{6}.$$

This is because, if you relabel the faces of the die, it doesn't change the law of the sequence of throws, so at each throw, all the outcomes are equally likely.

The sequence $(X_n)_{n \geq 0}$ is then a stationary Markov chain: for all $n$, the distribution of the throw $X_{n+1}$ depends on the past only through $X_n$. I can encode it in the following transition matrix:

$$P:= \left(\begin{matrix} 0 & 1/5 & 1/5 & 1/5 & 1/5 & 1/5 \\ 1/5 & 0 & 1/5 & 1/5 & 1/5 & 1/5 \\ 1/5 & 1/5 & 0 & 1/5 & 1/5 & 1/5 \\ 1/5 & 1/5 & 1/5 & 0 & 1/5 & 1/5 \\ 1/5 & 1/5 & 1/5 & 1/5 & 0 & 1/5 \\ 1/5 & 1/5 & 1/5 & 1/5 & 1/5 & 0 \end{matrix}\right)$$

The coefficient $P_{ij}$ is the probability that, given that the die is currently on $i$, the next throw yields $j$. So $P_{ii} = 0$ for all $i$, and $P_{ij} = 1/5$ otherwise.

This Markov chain is finite-state (there are six possible states), transitive (starting from any state $i$, I can get to any state $j$ in two steps), and aperiodic (given any two states $i$ and $j$, I can go from $i$ to $j$ in $n$ steps, for all $n \geq 2$ - in other words, if you give me two possible outcomes of the die, I can see these possible outcomes $n$ throws apart, as long as $n \geq 2$).

Let $f : \{1, \ldots, 6\} \to \mathbb{R}$ be an observable. Then the sequence $(f(X_n))_{n \geq 0}$ is still identically distributed (since the $X_n$ are identically distributed), but these random variables are no longer independent. Knowing $f(X_n)$ tends to constrain $f(X_{n+1})$ severely: if $f$ is injective, then $f(X_{n+1}) \neq f(X_n)$. However, what will save us is that the dependence decays very fast: knowing $f(X_n)$ gives a lot of information on $f(X_{n+1})$, but very little on, say, $f(X_{n+10})$. And this influence decays exponentially fast. In other words, if the first throw is $1$, you can say reliably that the second throw won't be $1$, but there is not much you can say about the tenth or twentieth throw.

Given an observable $f$, note that we still have:

$$\mathbb{E} (S_N f) = \mathbb{E} \left(\sum_{k=0}^{N-1} f(X_k)\right) = N\mathbb{E} \left(f(X_0)\right),$$

because this argument only uses the stationarity of the process. We also still have a law of large numbers, in distribution,

$$\lim_{N \to + \infty} \frac{S_N f}{N} =\mathbb{E} \left(f(X_0)\right).$$

So the law of large numbers is still as simple, and is a consequence of Birkhoff's theorem and the transitivity of the Markov chain. Now, we also have a central limit theorem, but it is more complicated. Indeed,

$$\lim_{N \to + \infty} \frac{S_N f-N\mathbb{E} \left(f(X_0)\right)}{\sqrt{N}} = \sigma_{GK} (f) \mathcal{N},$$

where the convergence is in distribution, $\mathcal{N}$ is a standard centered Gaussian, $\overline{f} = f-\mathbb{E} (f)$ and:

$$\sigma_{GK} (f)^2 = \mathbb{E} (\overline{f}^2) + 2 \sum_{n=1}^{+ \infty} \mathbb{E} (\overline{f} (X_0) \overline{f}(X_n)).$$

The main change is the formula for the variance, which is given by the Green-Kubo formula.

Now, how do we compute this thing? An observable $f$ is essentially the same thing as an element of $\mathbb{R}^6$; for instance, $1_{x=4} = (0, 0, 0, 1, 0, 0)$. Now, one can prove that, for any functions $f$ and $g$,

$$\mathbb{E} (f (X_0) g(X_n)) = \mathbb{E} \left(f (X_0) (P^n g) (X_0)\right) = \frac{1}{6} \sum_{k=1}^6 f(k) (P^n g) (k),$$

and thankfully, $P^n$ is easy to compute. If we write $A := P+I/5$ for the $6 \times 6$ matrix whose coefficients are all $1/5$, then $A^k = (6/5)^{k-1} A$ for $k \geq 1$, so that:

$$P^n = \left( A-\frac{I}{5} \right)^n = \sum_{k=0}^n \binom{n}{k} A^k \left( - \frac{1}{5} \right)^{n-k} = \frac{5}{6} \sum_{k=0}^n \binom{n}{k} \left(\frac{6}{5} \right)^k \left( - \frac{1}{5} \right)^{n-k} A - \frac{5}{6}\left( - \frac{1}{5} \right)^n A+ \left( - \frac{1}{5} \right)^n I = A - \frac{5}{6}\left( - \frac{1}{5} \right)^n A+ \left( - \frac{1}{5} \right)^n I.$$
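If you want to double-check this closed form, here is a short sketch (my addition, assuming NumPy is available) comparing it with explicit matrix powers:

```python
import numpy as np

# Transition matrix of the magical die: 0 on the diagonal, 1/5 elsewhere.
P = (np.ones((6, 6)) - np.eye(6)) / 5
A = P + np.eye(6) / 5            # the matrix with all coefficients equal to 1/5

for n in range(1, 10):
    closed_form = A - (5 / 6) * (-1 / 5) ** n * A + (-1 / 5) ** n * np.eye(6)
    assert np.allclose(np.linalg.matrix_power(P, n), closed_form)
```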

In addition, if $\mathbb{E} (f (X_0)) = 0$, then $Af = 0$. This is the case for $\overline{f}$. Hence,

$$P^n \overline{f} = \left( - \frac{1}{5} \right)^n \overline {f},$$

so that:

$$\sigma_{GK} (f)^2 = \mathbb{E} (\overline{f}^2) + 2 \sum_{n=1}^{+ \infty} \mathbb{E} (\overline{f} (X_0) (P^n \overline{f})(X_0)) = \mathbb{E} (\overline{f}^2) + 2 \sum_{n=1}^{+ \infty} \left( - \frac{1}{5} \right)^n \mathbb{E} (\overline{f}^2) = \left( 1+2 \sum_{n=1}^{+ \infty} \left( - \frac{1}{5} \right)^n\right) \mathbb{E} (\overline{f}^2) = \frac{2}{3} \sigma (f)^2.$$
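Numerically, the Green-Kubo series can be truncated after a few terms, since the summands decay like $(-1/5)^n$; the following sketch (my addition, assuming NumPy) recovers the factor $2/3$ for $f = 1_{x=4}$:

```python
import numpy as np

P = (np.ones((6, 6)) - np.eye(6)) / 5
f = np.array([0, 0, 0, 1, 0, 0], dtype=float)    # the observable 1_{x=4}
fbar = f - f.mean()                              # centered observable
sigma2 = np.dot(fbar, fbar) / 6                  # E(fbar^2) = 5/36

# Truncated Green-Kubo sum: E(fbar(X_0) fbar(X_n)) = (1/6) fbar . P^n fbar
gk = sigma2 + 2 * sum(np.dot(fbar, np.linalg.matrix_power(P, n) @ fbar) / 6
                      for n in range(1, 30))
print(gk, (2 / 3) * sigma2)   # both are about 0.0926
```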

For instance, if $f(x) = 1_{x=4}$, we get:

$$\lim_{N \to + \infty} \frac{S_N f}{N} = \frac{1}{6},$$

so roughly $1/6$th of the throws yield a $4$, and:

$$\lim_{N \to + \infty} \frac{S_N f-N/6}{\sqrt{N}} = \frac{\sqrt{10}}{6 \sqrt{3}} \mathcal{N},$$

which tells us, for instance, that the number of $4$s will lie, with probability $0.95$, in the interval $[N/6-\sqrt{10N}/(3\sqrt{3}), N/6+\sqrt{10N}/(3\sqrt{3})]$. For $N = 1000$ throws, this is the interval $[147,186]$.
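For comparison with the standard die, the half-width of this band is smaller by the factor $\sqrt{2/3}$; a short check (my addition, in Python):

```python
from math import sqrt

N = 1000
sigma_gk = sqrt(10) / (6 * sqrt(3))     # sigma_GK(f) for f = 1_{x=4} on the magical die
half = 2 * sigma_gk * sqrt(N)           # roughly a 95% half-width, as above
print(N / 6 - half, N / 6 + half)       # about 147.4 and 185.9
print(sigma_gk / (sqrt(5) / 6))         # sqrt(2/3) ~ 0.816: the band shrinks by this factor
```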

What happens is that, if $f(X_n)$ is large, then since $X_{n+1} \neq X_n$, $f(X_{n+1})$ tends to be smaller. Large increments are immediately followed by smaller increments, and small increments are immediately followed by larger increments. This phenomenon tends to reduce the spread of $S_N f$, which explains why the results are slightly more concentrated with the magical die than with a standard die.


Let $P_{k,n}$ be the probability of getting the result "1" exactly $k$ times in $n$ throws, and let $$P_{k,n} = p_{k,n} + q_{k,n}$$ where $p$ and $q$ are the corresponding probabilities with the added constraint that the last throw is a 1 (for $p$) or anything but a 1 (for $q$).

It is easy to get the equations (for $n \ge 1$): $$5p_{k,n} = q_{k-1,n-1}$$ and $$5q_{k,n} = 5p_{k,n-1} + 4 q_{k,n-1}$$ from which you get that for $n \ge 2$, $$5q_{k,n} = q_{k-1,n-2} + 4q_{k,n-1}$$

Obviously, the $(p_{k,n})$ family satisfies the same linear equation, and since $P$ is the sum of $p$ and $q$, we have, for $n \ge 2$, that $$5P_{k,n} = P_{k-1,n-2} + 4P_{k,n-1}$$

Together with the data that $P_{0,0} = 1, P_{0,1} = \frac 56, P_{1,1} = \frac 16$ this is enough to determine all those numbers.
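As a sanity check (my own addition, plain Python with exact rationals; the function name `P` is purely illustrative), the recurrence and these initial values can be evaluated directly:

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def P(k, n):
    """P_{k,n}: probability of exactly k ones in n throws of the magical die."""
    if k < 0 or k > n:
        return Fraction(0)
    if n == 0:
        return Fraction(1)                                    # P_{0,0} = 1
    if n == 1:
        return Fraction(5, 6) if k == 0 else Fraction(1, 6)   # P_{0,1}, P_{1,1}
    # 5 P_{k,n} = P_{k-1,n-2} + 4 P_{k,n-1}   for n >= 2
    return (P(k - 1, n - 2) + 4 * P(k, n - 1)) / 5

for n in range(30):
    assert sum(P(k, n) for k in range(n + 1)) == 1                    # rows sum to 1
    assert sum(k * P(k, n) for k in range(n + 1)) == Fraction(n, 6)   # mean is n/6
```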

Let $f(x,y)$ be the formal power series $$f(x,y)=\sum P_{k,n} x^k y^n$$ From the recurrence on $P$ we immediately have that in $(5-4y-xy^2)f(x,y)$, every coefficient vanishes except the coefficients of $x^0y^0, x^0y^1, x^1y^1$, and so $f(x,y)$ is the power series of the rational fraction $$\frac {30+y+5xy}{6(5-4y-xy^2)}$$

We can check that, doing the partial evaluation $x=1$, we get $$f(1,y) = \frac {30+6y}{6(5-4y-y^2)} = \frac{6(5+y)}{6(5+y)(1-y)} = \frac 1 {1-y} = 1 + y + y^2 + \dots$$ so for each $n$, all the probabilities $P_{k,n}$ sum to $1$.


If $E_n$ is the expectation of the number of occurrences of 1 in $n$ throws, then $$E_n = \sum k P_{k,n}$$ so that $$\sum E_n y^n = \frac {\partial f}{\partial x} (1,y)$$

We compute $$\frac{\partial f}{\partial x} = \frac {(5y)(5-4y-xy^2)-(-y^2)(30+y+5xy)}{6(5-4y-xy^2)^2} = \frac {y(5+y)^2}{6(5-4y-xy^2)^2}$$

Then evaluating this at $x=1$ we get $$\sum E_ny^n = \frac {y(5+y)^2}{6(5+y)^2(1-y)^2} = \frac y {6(1-y)^2} = \sum \frac n6 y^n$$

Thus $E_n = n/6$ which is as we expected.


To compute the variance we need to compute the expectation of the square of the number of occurrences, that is, $F_n = \sum k^2 P_{k,n}$. Then $$\sum F_n y^n = \frac {\partial^2 f}{\partial x^2}(1,y) + \frac {\partial f}{\partial x}(1,y) = \frac{2y^3(5+y)^2}{6(5+y)^3(1-y)^3} + \frac y{6(1-y)^2} = \frac{2y^3+y(5+y)(1-y)}{6(5+y)(1-y)^3} = \frac{5y-4y^2+y^3}{6(5+y)(1-y)^3}$$

$$\sum E_n^2 y^n = \sum \frac 1{36} n^2y^n = \frac {y(1+y)}{36(1-y)^3}$$

Computing their difference gives the variance:

$$\sum V_n y^n = \frac {30y-24y^2+6y^3-y(1+y)(5+y)} {36(5+y)(1-y)^3} = \frac {5y(5-6y+y^2)}{36(5+y)(1-y)^3} = \frac {5y(5-y)}{36(5+y)(1-y)^2} \sim_{y=1} \frac {20}{216(1-y)^2}$$
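The algebra above can be verified symbolically; here is a short SymPy sketch (my addition, assuming SymPy is available):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = (30 + y + 5*x*y) / (6 * (5 - 4*y - x*y**2))

Fgen = (sp.diff(f, x, 2) + sp.diff(f, x)).subs(x, 1)   # sum F_n y^n
E2gen = y * (1 + y) / (36 * (1 - y)**3)                # sum E_n^2 y^n
Vgen = 5*y*(5 - y) / (36 * (5 + y) * (1 - y)**2)       # claimed sum V_n y^n

assert sp.simplify(Fgen - E2gen - Vgen) == 0
```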

Doing the partial fraction decomposition gives a formula $$V_n = \frac {50}{1296} + \frac{20}{216}n - \frac {50}{1296}(-1/5)^n$$
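Evaluating this closed form (my addition, in Python) reproduces the single-throw variance $5/36$ and, for $n = 1000$, the value $\approx 92.63$ found in the other answers:

```python
from fractions import Fraction

def V(n):
    """Closed-form variance of the number of 1s in n throws (partial fractions above)."""
    return (Fraction(50, 1296) + Fraction(20, 216) * n
            - Fraction(50, 1296) * Fraction(-1, 5) ** n)

print(V(1))            # 5/36, the variance of a single throw
print(float(V(1000)))  # about 92.63
```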


I expect that, after renormalization, your random variable (the number of 1s obtained in $n$ throws) converges in law to a Gaussian variable as $n \to \infty$.