Limit Of A Sequence Formal Proof

Let's discuss the intuitive idea behind $(1)$, as it is the intuitive idea behind the whole proof.

Consider a convergent sequence $x_n \to x$. Convergence of $x_n$ to $x$ means that eventually the sequence $x_n$ becomes, and stays, as close as you like to $x$. The behaviour of the sequence may be quite wild in the first finitely many steps, but after a certain point $N$, the sequence settles and stays $\varepsilon$-close to $x$ forever after. The smaller the value of $\varepsilon$, the larger $N$ typically has to be, but for every $\varepsilon$ the sequence eventually settles, and stays well-behaved for the rest of its infinitely many points.

So, any "bad" behaviour is relegated to only a finite number of points. This is basically nothing in the face of the infinitude of well-behaved points that come after it. So, if we take the sequence of averages $\frac{x_1 + x_2 + \ldots + x_n}{n}$, the finitely many badly-behaved points will always contribute something to the average, but they will eventually be far outweighed by the well-behaved points, and as such, the average will be pretty well-behaved as well.

Let's talk in slightly more practical terms. Fix some $\varepsilon > 0$, and take the $N \in \Bbb{N}$ guaranteed by convergence, so that $$n \ge N \implies |x_n - x| < \varepsilon.$$ Our badly-behaved points are $x_1, x_2, \ldots, x_{N-1}$. For $n \ge N$, we have \begin{align*} &\dfrac{x_1 + x_2 + \ldots + x_n}{n} - x \\ = \, &\dfrac{(x_1 - x) + (x_2 - x) + \ldots + (x_n - x)}{n} \\ = \, &\dfrac{\overbrace{(x_1 - x) + (x_2 - x) + \ldots + (x_{N-1} - x)}^{\text{Bad points}} + \overbrace{(x_N - x) + (x_{N+1} - x) + \ldots + (x_n - x)}^{\text{Each of absolute value less than }\varepsilon}}{n}. \end{align*}

We just need to make $n$ much bigger than $N$, so that the bad part of the average $$\dfrac{(x_1 - x) + (x_2 - x) + \ldots + (x_{N-1} - x)}{n}$$ becomes insignificantly tiny. This is possible, since the numerator is a fixed sum (it does not depend on $n$), and we are dividing it by larger and larger numbers $n$. So, we can arrange that $$\dfrac{|(x_1 - x) + (x_2 - x) + \ldots + (x_{N-1} - x)|}{n} < \varepsilon \tag{2}$$ for all $n \ge M$, for some $M$. Explicitly, any $M > |(x_1 - x) + (x_2 - x) + \ldots + (x_{N-1} - x)| / \varepsilon$ works.

On the other hand, the good points are all $\varepsilon$-close to $x$, so \begin{align*} &\dfrac{|(x_N - x) + (x_{N+1} - x) + \ldots + (x_n - x)|}{n} \\ \le \, &\dfrac{|x_N - x| + |x_{N+1} - x| + \ldots + |x_n - x|}{n} \\ < \, &\dfrac{\overbrace{\varepsilon + \varepsilon + \ldots + \varepsilon}^{\text{Exactly $n - N + 1$ terms}}}{n} = \frac{n - N + 1}{n} \varepsilon \le \varepsilon. \tag{3} \end{align*} Thus, by the triangle inequality applied to the bad part and the good part, in total $$\left|\dfrac{x_1 + x_2 + \ldots + x_n}{n} - x\right| < \varepsilon + \varepsilon = 2\varepsilon,$$ whenever $n \ge M$ and $n \ge N$.
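If it helps to see the split numerically, here is a minimal Python sketch. Everything concrete in it is an assumption made up for illustration and is not taken from the proof you quoted: the sequence `x_seq` (ten "bad" terms equal to $50$, followed by $1 + 1/k$), its limit $1$, the tolerance $\varepsilon = 0.1$, and the helper name `split_error`.

```python
# Hypothetical example sequence: wild for the first 10 terms, then 1 + 1/k.
def x_seq(k):
    return 50.0 if k <= 10 else 1.0 + 1.0 / k

LIMIT = 1.0   # the limit x of the example sequence
EPS = 0.1     # the chosen epsilon
N = 11        # |x_k - 1| = 1/k < 0.1 holds exactly for k >= 11

def split_error(n):
    """Return (bad part, good part) of (x_1 + ... + x_n)/n - x, for n >= N."""
    bad = sum(x_seq(k) - LIMIT for k in range(1, N)) / n
    good = sum(x_seq(k) - LIMIT for k in range(N, n + 1)) / n
    return bad, good

if __name__ == "__main__":
    for n in (100, 1000, 5000, 50000):
        bad, good = split_error(n)
        # Columns: n, |bad part|, |good part|, |total error|
        print(n, abs(bad), abs(good), abs(bad + good))
```

In this made-up example the bad sum is $10 \cdot 49 = 490$, so $(2)$ needs $n > 4900$: for smaller $n$ the bad part still dominates, while the good part is below $\varepsilon$ for every $n \ge N$, just as in $(3)$.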


This approach is exactly the same as the proof you've quoted, but I've tried to talk through the thinking behind the steps. Where I have used $\varepsilon$, they have used $\varepsilon/2$, so that they obtain $< \varepsilon$ where I obtained $< 2\varepsilon$.

They also use $n_1$ instead of $N$. Hopefully you can see why they want an $n_2$ such that $(1)$ holds: essentially it's the same step as my $(2)$. What I called $M$, they called $n_2$.

As for your second question, this again corresponds to my step $(3)$. Essentially, they are counting the number of terms in the sum, and replacing each term with the largest of all the terms (which should yield something bigger than or equal to what you started with). I just replaced everything with $\varepsilon$, since I knew it was strictly larger than everything there. This is what they proceed to do: replace the maximum term with $\varepsilon$, which is larger than the maximum, increasing the sum again.
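The "count the terms, then replace each by something at least as large" step can also be checked on a toy example. This is only a sketch: the list `terms` stands in for some values $|x_k - x|$ that are all below $\varepsilon$, and both it and the helper name `bound_chain` are hypothetical.

```python
def bound_chain(terms, eps):
    """Return (sum of terms, count * max term, count * eps)."""
    count = len(terms)
    total = sum(terms)
    by_max = count * max(terms)   # replace every term with the largest term
    by_eps = count * eps          # replace every term with eps instead
    return total, by_max, by_eps

# Hypothetical values of |x_k - x|, each strictly less than eps = 0.1.
terms = [0.05, 0.02, 0.09, 0.01]
total, by_max, by_eps = bound_chain(terms, 0.1)
print(total, by_max, by_eps)
```

Each replacement can only increase the sum, which is the chain sum $\le$ count $\times$ max $<$ count $\times\ \varepsilon$ used in the quoted proof; my version skips the middle step and goes straight to $\varepsilon$.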