Alternative Sum of Squared Error formula proof

Let $\bar x = \frac{1}{n} \sum_{i=1}^n x_i$ be the sample mean; equivalently, $$n \bar x = \sum_{i=1}^n x_i.$$ Write $\operatorname{SSE} = \sum_{i=1}^n (x_i - \bar x)^2$ for the sum of squared errors about the mean. Then
$$\begin{align*}
\sum_{i=1}^n \sum_{j=1}^n (x_i - x_j)^2 &= \sum_{i=1}^n \sum_{j=1}^n (x_i - \bar x + \bar x - x_j)^2 \\
&= \sum_{i=1}^n \sum_{j=1}^n \left[ (x_i - \bar x)^2 + (x_j - \bar x)^2 - 2(x_i - \bar x)(x_j - \bar x) \right] \\
&= \sum_{i=1}^n (x_i - \bar x)^2 \sum_{j=1}^n 1 + \sum_{i=1}^n \sum_{j=1}^n (x_j - \bar x)^2 - 2 \sum_{i=1}^n (x_i - \bar x) \sum_{j=1}^n (x_j - \bar x) \\
&= n \operatorname{SSE} + \sum_{i=1}^n \operatorname{SSE} - 2 \left( n \bar x - \sum_{i=1}^n \bar x \right) \left( n \bar x - \sum_{j=1}^n \bar x \right) \\
&= 2n \operatorname{SSE} - 2 (n \bar x - n \bar x)(n \bar x - n \bar x) \\
&= 2n \operatorname{SSE} - 0 \\
&= 2n \operatorname{SSE}.
\end{align*}$$
Dividing by $2n$ gives the alternative formula $$\operatorname{SSE} = \frac{1}{2n} \sum_{i=1}^n \sum_{j=1}^n (x_i - x_j)^2.$$
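
As a sanity check on the identity $\sum_i \sum_j (x_i - x_j)^2 = 2n \operatorname{SSE}$, here is a minimal Python sketch that tests it on random integer samples using exact rational arithmetic. The function names `sse` and `pairwise_sum`, the seed, and the sample ranges are illustrative choices, not part of the proof; passing the check is evidence, not a substitute for the derivation.

```python
from fractions import Fraction
import random


def sse(xs):
    """Sum of squared deviations from the sample mean, in exact arithmetic."""
    n = len(xs)
    mean = Fraction(sum(xs), n)
    return sum((x - mean) ** 2 for x in xs)


def pairwise_sum(xs):
    """Double sum of (x_i - x_j)^2 over all ordered pairs (i, j)."""
    return sum((xi - xj) ** 2 for xi in xs for xj in xs)


random.seed(0)  # arbitrary seed, only for reproducibility
for _ in range(100):
    n = random.randint(1, 10)
    xs = [random.randint(-20, 20) for _ in range(n)]
    # The identity from the proof: double sum equals 2 * n * SSE.
    assert pairwise_sum(xs) == 2 * n * sse(xs)
```

Using `Fraction` avoids floating-point rounding, so the equality can be tested exactly rather than up to a tolerance.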


Working with your numerical example (the data $3, 7, 8$, with mean $6$):
$$\begin{align*}
14 = \operatorname{SSE} &= (6-3)^2 + (6-7)^2 + (6-8)^2 \\
&= \left(\frac{3+7+8}{3}-3\right)^2+\left(\frac{3+7+8}{3}-7\right)^2+\left(\frac{3+7+8}{3}-8\right)^2 \\
&= \frac{(7-3+8-3)^2}{3^2}+\frac{(3-7+8-7)^2}{3^2}+\frac{(3-8+7-8)^2}{3^2} \\
&= \frac{2[(7-3)^2+(8-3)^2+(7-8)^2]+2[(7-3)(8-3)+(3-7)(8-7)+(3-8)(7-8)]}{3^2} \\
&= \frac{3[(7-3)^2+(8-3)^2+(7-8)^2]}{3^2}-\frac{[(7-3)+(3-8)+(8-7)]^2}{3^2} \\
&= \frac{2[(7-3)^2+(8-3)^2+(7-8)^2]}{2\cdot 3}-0 \\
&= \frac{\sum_{i}\sum_{j}\operatorname{dist}(x_i,x_j)^2}{2\cdot 3}.
\end{align*}$$
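
The arithmetic in this example can be replayed in a few lines of Python. This is only an illustrative sketch; the variable names (`pair_sq`, `double_sum`, `sse_value`) are made up for the check.

```python
from itertools import combinations

xs = [3, 7, 8]
n = len(xs)
mean = sum(xs) / n  # 18 / 3 = 6.0

# SSE about the mean: 9 + 1 + 4
sse_value = sum((x - mean) ** 2 for x in xs)

# Squared differences over the three unordered pairs: 16 + 25 + 1
pair_sq = sum((a - b) ** 2 for a, b in combinations(xs, 2))

# The double sum over ordered pairs counts each unordered pair twice
# (the i == j terms contribute zero).
double_sum = sum((xi - xj) ** 2 for xi in xs for xj in xs)

assert double_sum == 2 * pair_sq
assert sse_value == double_sum / (2 * n)

print(sse_value, pair_sq, double_sum)  # prints: 14.0 42 84
```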

Tags:

Statistics