What is the intuition behind the formula for the average?

Suppose all of us gathered here in this room take all the money out of our pockets and put it on the table, and then we divide it among us in such a way that we all have the same amount. The total amount is still the same. Then the amount we each have is the average. That's what averages are.

The total amount is $a+b+c+\cdots$. The number of us gathered here is $n$. So the amount that each of us gets is $\text{total}/n=(a+b+c+\cdots)/n$.
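The pooling picture can be sketched in a few lines of Python. This is only an illustration: the function name `share_equally` and the amounts in `pockets` are made up for the example, and exact fractions are used so that the "total is unchanged" check is not blurred by rounding.

```python
from fractions import Fraction


def share_equally(amounts):
    """Pool all the amounts and return the equal share each person gets."""
    total = sum(amounts)          # everything on the table
    n = len(amounts)              # number of people in the room
    return Fraction(total) / n    # total / n, kept exact


# Hypothetical pocket contents, in dollars.
pockets = [5, 0, 12, 3]
share = share_equally(pockets)

# After redistribution everyone holds `share`, and the total is unchanged.
assert share * len(pockets) == sum(pockets)
print(share)
```

Whatever the individual amounts were, handing `share` to each of the `len(pockets)` people reproduces the original total, which is exactly the property described above.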


In most contexts, what passes for an 'average' can be thought of this way: if you replace every member of a collection with the 'average', the combined result (the sum, the product, or whatever operation you care about) stays the same.

The usual mean comes from thinking this way for addition: if you have numbers $a_1,\ldots,a_n$, their sum is $a_1+\cdots+a_n$. If you replace each of them with their mean $\mu$, the sum of the $n$ copies of $\mu$ should still be $a_1+\cdots+a_n$.

Therefore $\mu$ must satisfy $$ n\mu=a_1+\cdots+a_n, $$ and dividing by $n$ gives the formula you've seen, $\mu=(a_1+\cdots+a_n)/n$.

As another example: doing the same thing for multiplication (with positive numbers) means asking for a number $g$ with $g^n=a_1a_2\cdots a_n$, which leads to the geometric mean $g=\sqrt[n]{a_1a_2\cdots a_n}$.
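A small Python sketch of the multiplicative version, under the assumption that all the numbers are positive. The helper `geometric_mean` and the sample growth factors are invented for illustration.

```python
import math


def geometric_mean(values):
    """Return the n-th root of the product; assumes every value is positive."""
    n = len(values)
    return math.prod(values) ** (1.0 / n)


# Hypothetical growth factors over two periods.
growth = [2.0, 8.0]
g = geometric_mean(growth)

# Replacing every factor by g leaves the overall product unchanged.
assert math.isclose(g ** len(growth), math.prod(growth))
print(g)
```

Here the product is 16, and two copies of `g` multiply back to the same 16, just as $n$ copies of the usual mean add back to the original sum.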


Here is a slightly different perspective on what Nick and Michael have already said: the average of $n$ numbers $x_i$ is the unique number $\mu$ such that the sum of the deviations $x_i-\mu$ is zero.

Starting from this characteristic property it is easy to derive the formula

$$\mu=\frac{1}{n}\sum x_i$$
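
For completeness, here is one way to spell out that derivation, with the sums taken over $i=1,\ldots,n$:

$$\sum_{i=1}^n (x_i-\mu)=0 \iff \sum_{i=1}^n x_i-n\mu=0 \iff \mu=\frac{1}{n}\sum_{i=1}^n x_i.$$

As a quick check with made-up data $2, 3, 7$: the formula gives $\mu=4$, and the deviations $-2, -1, 3$ do sum to zero.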

A closely related characterization comes from statistics. Suppose we want to find the "number of best fit" for our data points $x_i$. To find this number, we first need to say what counts as "best."

One popular choice is to measure the "error" of our best-fit "approximation" using a quadratic "cost function." More formally, finding "the number of best fit" amounts to finding the number $m$ that minimizes the sum of the squared errors

$$SSE=\sum (x_i-m)^2$$

If you know any calculus (or simple multivariable geometry), you can easily prove that this function is minimized precisely when $m$ is the average of the $x_i$. In this sense, the average is the minimizer of the sum of squared errors.
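
A sketch of the calculus argument, with the sums taken over $i=1,\ldots,n$: differentiating with respect to $m$ and setting the derivative to zero gives

$$\frac{d}{dm}\sum_{i=1}^n (x_i-m)^2=-2\sum_{i=1}^n (x_i-m)=0 \iff m=\frac{1}{n}\sum_{i=1}^n x_i,$$

and the second derivative is $2n>0$, so this critical point is a minimum. Note that the middle condition is exactly the statement that the deviations sum to zero. The same conclusion can be reached without calculus: writing $\mu$ for the average, the cross term vanishes because the deviations from $\mu$ sum to zero, so

$$\sum_{i=1}^n (x_i-m)^2=\sum_{i=1}^n (x_i-\mu)^2+n(\mu-m)^2,$$

which is smallest exactly when $m=\mu$.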

If instead of measuring error by the sum of the quadratic deviations $(x_i-m)^2$ we use the sum of absolute deviations $|x_i-m|$, the sum is minimized by the median rather than the average. (With an even number of data points, any value between the two middle ones is a minimizer.)
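The contrast between the two cost functions can be checked numerically. The sketch below uses a made-up data set with one outlier and a coarse grid search in place of a proof. The helper names `sum_sq_dev` and `sum_abs_dev` are invented for the example.

```python
import statistics


def sum_sq_dev(data, m):
    """Sum of squared deviations of the data from the candidate m."""
    return sum((x - m) ** 2 for x in data)


def sum_abs_dev(data, m):
    """Sum of absolute deviations of the data from the candidate m."""
    return sum(abs(x - m) for x in data)


# Hypothetical data with one large outlier.
data = [1, 2, 3, 4, 100]
mean = statistics.mean(data)      # 22
median = statistics.median(data)  # 3

# Search candidates 0.0, 0.1, ..., 100.0 for the best fit under each cost.
candidates = [k / 10 for k in range(0, 1001)]
best_sq = min(candidates, key=lambda m: sum_sq_dev(data, m))
best_abs = min(candidates, key=lambda m: sum_abs_dev(data, m))

assert best_sq == mean      # squared error is minimized at the mean
assert best_abs == median   # absolute error is minimized at the median
```

The outlier drags the squared-error minimizer all the way up to 22, while the absolute-error minimizer stays at 3, in the middle of the bulk of the data.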

In fact, other types of means (the geometric mean, the harmonic mean, etc.) can be understood using this same framework. See the Wikipedia page on Fréchet means.
