When calculating averages, why can we treat exploding dice as if they're independent?

Another way to look at it:

Let $E$ denote the answer. Suppose you toss the die once. One of two things happens: either you get a value below $6$, or you get a $6$ and start over (from which point, of course, you expect to get an additional $E$). Thus we have $$E=\frac 16\times (1+2+3+4+5)+\frac 16\times (6+E)\implies E=\frac {21}5$$ as desired.
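As a sanity check on $E=\frac{21}{5}=4.2$, here is a minimal simulation sketch. The function names, the seed, and the trial count are my own choices for illustration, not part of the original argument.

```python
import random


def roll_exploding_die(rng, sides=6):
    """Roll a die; whenever the top face shows, keep it and roll again."""
    total = 0
    while True:
        face = rng.randint(1, sides)  # uniform on 1..sides, inclusive
        total += face
        if face != sides:
            return total


def estimate_mean(trials=200_000, seed=0):
    """Monte Carlo estimate of the expected total (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(roll_exploding_die(rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print(estimate_mean())  # should land close to 21/5 = 4.2
```

With a couple of hundred thousand trials the estimate should agree with $4.2$ to roughly two decimal places, which is about what the standard error allows.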


This does indeed come from linearity of expectation, but you have to be really careful about what exactly you're applying this theorem to. In particular, let's examine some random variables. In a given trial (i.e. you roll the die until you get something other than $6$), let us define some quantities. First, let $X$ be the total achieved. Let $X_1$ be the portion of this due to the first roll, $X_2$ the portion due to the second roll (which is $0$ if there was no second roll), and so on.

We then have that $X=X_1+X_2+X_3+\ldots$, noting that, almost surely, there are only finitely many non-zero terms in the sum. In case we should later worry about issues of convergence, note also that these are all non-negative quantities, so (by the monotone convergence theorem) we are justified in applying linearity of expectation to the infinite sum to get $$\mathbb E[X]=\mathbb E[X_1]+\mathbb E[X_2]+\ldots$$ Then, we just compute $\mathbb E[X_n]$. This is straightforward: there is a $\frac{1}{6^{n-1}}$ chance that we will roll for an $n^{th}$ time (the previous $n-1$ rolls must all have been $6$) and, given that we do roll, the expected roll is $3.5$, as it is just a typical die roll. So $\mathbb E[X_n]=\frac{1}{6^{n-1}}\cdot 3.5$, as given in the solution, which gives $$\mathbb E[X]=(1+1/6+1/6^2+1/6^3+\ldots)\cdot 3.5=\frac 65\cdot 3.5=\frac{21}5$$ Note that, via this approach, we never consider whether a die actually was $6$ except in determining whether we reach the $n^{th}$ roll. That's because, to compute the expectation, we are splitting into the cases of "I roll this die" and "I don't roll this die", which do not bias the roll of the die at all. Basically, we are allowed to imagine, while computing each expectation, that this is the last roll, regardless of whether we get a $6$, because no further information is relevant to the value of $X_n$.
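The term-by-term computation can be checked with exact arithmetic. The following is only a sketch; `term` and `partial_sum` are hypothetical helper names introduced here, and the code assumes a fair die with `sides` faces that explodes on its top face.

```python
from fractions import Fraction


def term(n, sides=6):
    """E[X_n] = P(the n-th roll happens) * E[one fair roll]."""
    p_reach = Fraction(1, sides) ** (n - 1)  # first n-1 rolls were all the top face
    mean_roll = Fraction(sides + 1, 2)       # 3.5 for a six-sided die
    return p_reach * mean_roll


def partial_sum(n_terms, sides=6):
    """E[X_1] + ... + E[X_{n_terms}], as an exact fraction."""
    return sum(term(n, sides) for n in range(1, n_terms + 1))


if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        print(k, partial_sum(k), float(partial_sum(k)))
```

The partial sums increase towards $\frac{21}{5}$, and the gap after $k$ terms is $\frac{21}{5}\cdot 6^{-k}$, so a handful of terms already gives the answer to several decimal places.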


The series you have there represents:

The expected value of the first die throw, plus the probability that you get a second throw times the expected value of the second throw, plus the probability that you get a third throw times the expected value of the third throw, plus ...

It is basically what you get if you write down what the expectation is straight from the definition, and tidy up a little: $$ \frac16\cdot1+\cdots+\frac16\cdot 5+\frac16\cdot \left(6+ \frac16\cdot1+\cdots+\frac16\cdot 5+\frac16\cdot \left(6+ \cdots \right) \right) $$