Incremental averageing

You need to keep at least two pieces of information: the number of terms so far and the running mean (or something equivalent to it).

Let's suppose the $n$th component of the vector is $a_n$ and the running mean up to this is $m_n$ so $$m_n= \frac{1}{n}\sum_{i=1}^n a_i.$$

Starting with $m_0=0$, you can use the obvious single pass $$m_n = \frac{(n-1)m_{n-1}+a_n}{n}$$ but precision errors are likely to be smaller if you use the equivalent $$m_n = m_{n-1} + \frac{a_{n}-m_{n-1}}{n}.$$


Intuition

It's an easy question and @Henry basically answered it. However, I think it would be nice to add some intuition on the second equation:

$$\mu_k=\mu_{k-1}+\frac{x_k-\mu_{k-1}}{k}$$

The idea is to represent the new value $x_k$ value by a part that is equal to the previous mean $\mu_{k-1}$ plus the remaining part $x_k-\mu_{k-1}$. The new mean $\mu_k$ we then contains $k$ times the previous mean and one time the remaining part of the new value, divided by the total count. Okay, so how to get there?

Derivation

Let's say you already have the mean $\mu_{k-1}$ for the elements $x_0,...,x_{k-1}$. Of course, we can easily incorporate an additional value $x_k$ by undoing the division, adding the new value, and redoing the scaling (but the new count):

$$\mu_k=\frac{(k-1)*\mu_{k-1}+x_k}{k}$$

Now, let's represent the new value by the previous mean and the difference to the previous mean:

$$x_k=\mu_{k-1}+(x_k-\mu_{k-1})$$

Putting that into our first equation:

$$\mu_k=\frac{(k-1)*\mu_{k-1}+\mu_{k-1}+(x_k-\mu_{k-1})}{k}$$

Look how we now got $k$ times the previous mean:

$$\mu_k=\frac{k*\mu_{k-1}+(x_k-\mu_{k-1})}{k}$$

The nice thing is that we don't have to undo and redo the division on the previous mean anymore:

$$\mu_k=\mu_{k-1}+\frac{x_k-\mu_{k-1}}{k}$$

As you can see, we still have to do the division on the difference of the new value to the previous mean. Since the new value comes in "sum-space" already, we don't have to undo anything on it, just divde by the total count.

Application

An application of this form of incremental mean can be found in the field of reinforcement learning where a value function is averaged over many experienced rewards. In this setting, you can think of $x_k-\mu_{k-1}$ as a temporal-difference error term. Since we usually don't know the total number of experiences here, we multiply by a small learning rate rather than dividing by $k$.


yep easy to do - I call it the travelling mean :)

Note you will need to keep track of the 1. 'count' of values, 2. previous 'mean' and 3. the 'new value'. Algorithm is:

in words : ('previous mean' * '(count -1)') + 'new value') / 'count'

benefit you have a running mean can do the same with 'standard deviation' just a little more complex

Tags:

Average