Why is the geometric mean less sensitive to outliers than the arithmetic mean?

The geometric mean is the exponential of the arithmetic mean of a log-transformed sample. In particular,

$$\log\left( \biggl(\prod_{i=1}^n x_i\biggr)^{\!1/n}\right) = \frac{1}{n} \sum_{i=1}^n \log x_i,$$ for $x_1, \ldots, x_n > 0$.

This provides some intuition for why the geometric mean is insensitive to large (right) outliers: the logarithm grows very slowly for $x > 1$.

But what about when $0 < x < 1$? Doesn't the steepness of the logarithm on this interval suggest that the geometric mean is sensitive to very small positive values, i.e., left outliers? Indeed it is.

If your sample is $(0.001, 5, 10, 15),$ then your geometric mean is $0.930605$ and your arithmetic mean is $7.50025$. But if you replace $0.001$ with $0.000001$, the arithmetic mean barely changes, while the geometric mean drops to $0.165488$. So the notion that the geometric mean is insensitive to outliers is not entirely precise.
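You can reproduce these numbers with a few lines of Python; the two helper functions below (my own names, not from any particular library) compute the means via sums and logs:

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # Computed via logs, so the product can't overflow or underflow.
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

sample = [0.001, 5, 10, 15]
print(arithmetic_mean(sample))  # 7.50025
print(geometric_mean(sample))   # ~0.930605

shifted = [0.000001, 5, 10, 15]  # replace 0.001 with 0.000001
print(arithmetic_mean(shifted))  # ~7.5000003 -- barely moved
print(geometric_mean(shifted))   # ~0.165488  -- dropped sharply
```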


We can generalize this idea further. Consider the definition of the power mean: $$\mu_p=\left(\frac{1}{n} \sum_{i=1}^n x_i^p \right)^\frac{1}{p}$$ We get the arithmetic mean at $p=1$ and the geometric mean in the limit $p\rightarrow 0$. It turns out that the smaller the value of $p$, the less impact large numbers have and the more impact small numbers have.

Notice, for example, that even if $x_1$ is very close to zero, the arithmetic mean is always at least $\frac{x_2+x_3+\dots+x_n}{n}$, so it cannot be dragged down to zero. This is not the case at the other extreme: the arithmetic mean can be made arbitrarily large by a single element. The same holds for every power mean with $p>0$.

For negative $p$ we get the reverse behaviour. Consider the harmonic mean (the reciprocal of the arithmetic mean of the reciprocals, and also the power mean with $p=-1$): $$\frac{n}{\sum_{i=1}^{n}\frac{1}{x_i}}$$ Even if $x_1$ is huge, its reciprocal is still greater than zero, so the whole mean stays below $$\frac{n}{\sum_{i=2}^{n}\frac{1}{x_i}}$$ But if just one element is very close to zero, its reciprocal is very large, which blows up the denominator and drives the harmonic mean toward zero.

The geometric mean, being the power mean with $p=0$, exhibits both behaviours: it can be pulled far up or far down by a single element. That seems bad at first, but remember that it is less sensitive to large outliers than any power mean with $p>0$ (such as the arithmetic mean) and less sensitive to small outliers than any power mean with $p<0$ (such as the harmonic mean), so in some sense it is a good compromise.
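A small sketch of the power mean (handling $p=0$ as the geometric-mean limit; `power_mean` is my own helper name) makes the compromise concrete on the earlier sample with one tiny element:

```python
import math

def power_mean(xs, p):
    """Power mean of positive numbers; p=0 is the geometric-mean limit."""
    n = len(xs)
    if p == 0:
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x**p for x in xs) / n) ** (1 / p)

data = [0.001, 5, 10, 15]
for p in [-1, 0, 1]:
    print(p, power_mean(data, p))
# The harmonic mean (p=-1) collapses toward zero because of the single
# tiny element, the arithmetic mean (p=1) barely notices it, and the
# geometric mean (p=0) sits in between.
```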

There are also two important special/border cases of the power mean, namely $p \rightarrow \infty$ and $p \rightarrow -\infty$. In the first case we get the maximum of the data, and in the second the minimum. Being extremes, the maximum is completely sensitive to large outliers and completely insensitive to small ones, whereas the minimum exhibits the opposite behaviour. They are obviously a terrible example of a "mean", but they help in understanding the general behaviour.
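You can watch the power mean slide toward these two extremes numerically; using the same `power_mean` helper as above (a sketch, not a library function), large $|p|$ already gets close to the max and min:

```python
import math

def power_mean(xs, p):
    """Power mean of positive numbers; p=0 is the geometric-mean limit."""
    n = len(xs)
    if p == 0:
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x**p for x in xs) / n) ** (1 / p)

data = [0.5, 2.0, 3.0, 7.0]
for p in [1, 5, 20, 100]:
    print(p, power_mean(data, p))    # climbs toward max(data) = 7.0
for p in [-1, -5, -20, -100]:
    print(p, power_mean(data, p))    # descends toward min(data) = 0.5
```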

I generated a random sample of a million uniformly distributed numbers and computed their power means for different values of $p$. For $p=1$ we observe a mean of around $\frac{1}{2}$, which is the true mean of the distribution. Larger values of $p$ give larger means, as always, but as you can see, for $p<1$ the mean drops off quickly. For large $p$ the mean also stops being representative. So one has to choose $p$ depending on the distribution.
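The experiment is easy to rerun; a minimal version (with a fixed seed so the numbers are reproducible, and the same `power_mean` helper as before) looks like this:

```python
import math
import random

def power_mean(xs, p):
    """Power mean of positive numbers; p=0 is the geometric-mean limit."""
    n = len(xs)
    if p == 0:
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x**p for x in xs) / n) ** (1 / p)

random.seed(0)
sample = [random.random() for _ in range(1_000_000)]  # Uniform(0, 1)

for p in [-2, -1, 0, 1, 2, 10]:
    print(f"p = {p:+d}: {power_mean(sample, p):.4f}")
# p = 1 lands near 0.5 (the true mean); p = 0 near exp(-1) ~ 0.37,
# since E[log X] = -1 for Uniform(0, 1); negative p collapses toward 0.
```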

[Figure: power means of the sample plotted against $p$, titled "Sensitivity to p"]

PROOF THAT THE GEOMETRIC MEAN IS THE POWER MEAN FOR $p=0$:
By L'Hôpital's rule we have: $$\log \mu_0=\lim_{p\rightarrow 0}\frac{\log\left(\sum_{i=1}^n x_i^p\right)-\log n}{p}=\lim_{p\rightarrow 0}\frac{\sum_{i=1}^n x_i^p \log x_i}{\sum_{i=1}^n x_i^p}=\frac{1}{n}\sum_{i=1}^n \log x_i$$ So indeed: $$\mu_0=\exp\left(\frac{1}{n}\sum_{i=1}^n \log x_i \right)=\left(\prod_{i=1}^nx_i \right)^\frac{1}{n}$$
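As a sanity check on the limit, one can verify numerically that the power mean converges to the geometric mean as $p \rightarrow 0$; the gap shrinks steadily as $p$ gets smaller (helper names are my own):

```python
import math

def power_mean(xs, p):
    # Raw power mean (p != 0), no special-casing of the limit here.
    n = len(xs)
    return (sum(x**p for x in xs) / n) ** (1 / p)

def geometric_mean(xs):
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

data = [0.001, 5, 10, 15]
g = geometric_mean(data)
for p in [0.1, 0.01, 0.001]:
    # The difference from the geometric mean shrinks with p.
    print(p, power_mean(data, p), abs(power_mean(data, p) - g))
```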