Approaching to zero, but not equal to zero, then why do the points get overlapped?

Basically, what you have put your finger on is the original difficulty with infinitesimals, and the basis on which Bishop Berkeley made some of his famous objections (mind you, Berkeley didn't really care about the logical foundations of Calculus; he was engaged in a theological debate at the time¹). If infinitesimals are not zero, then what you have is not a tangent but a secant, because the line meets the curve at two distinct points; but if the two points are the same, then they don't define a tangent, because you cannot determine a line with a single point.

The answer is that when we talk about limits, we are talking about what the quantities are approaching, not what the quantities are. The point $B$ never "gets overlapped" with $A$; it just approaches $A$. Likewise, the line between $A$ and $B$ never "becomes" the tangent (which in your diagram is $C$), but its slope approaches the slope of $C$. These "approaches" have a very precise meaning (made formal by Weierstrass).

First, what would it mean to say that the values of a function $g(x)$ "approach" the value $L$ as $x$ approaches $a$? It means that you can make all the values of $g(x)$ arbitrarily close to $L$ provided that $x$ is "close enough" to $a$ (but not equal to $a$). If you specify a narrow horizontal band around the value $y=L$, then by specifying a narrow enough vertical band around $x=a$ we can guarantee that the graph of $y=g(x)$ for values of $x$ inside the vertical band (other than $x=a$ itself) will necessarily lie inside the horizontal band.

The reason this is sensible is that if the values of $g(x)$ were approaching some other number $M\neq L$, then specifying a band around $L$ that is narrower than the distance from $M$ to $L$ would always run into problems: the graph of $y=g(x)$ would always end up with parts outside this band, because $M$ is outside the band and the values are approaching $M$.

Formally, we say it like this: the limit of $g(x)$ as $x$ approaches $a$ is $L$, which is written $$\lim_{x\to a}g(x) = L,$$ if and only if for every $\epsilon\gt 0$ there exists a $\delta\gt 0$ such that if $0\lt |x-a|\lt \delta$, then $|g(x)-L|\lt \epsilon$. Here $\epsilon$ is how wide the horizontal band around $L$ is, and $\delta$ is how thin the vertical band around $a$ needs to be to make sure the graph is completely inside the rectangle.
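To make the bands concrete, here is a minimal numerical sketch in Python. The function `g(x) = 2x + 1`, the point `a = 1`, the limit `L = 3`, and the helper `stays_in_band` are all assumptions chosen for illustration; none of them come from the question. Checking sample points is not a proof, it only shows what the definition asks for.

```python
def g(x):
    # Hypothetical example function; its limit as x -> 1 is 3.
    return 2 * x + 1


def stays_in_band(func, a, L, eps, delta, samples=1000):
    """Check |func(x) - L| < eps at sample points with 0 < |x - a| < delta."""
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (a - offset, a + offset):  # x = a itself is never tested
            if not abs(func(x) - L) < eps:
                return False
    return True


a, L = 1.0, 3.0
for eps in (0.5, 0.01, 1e-6):
    delta = eps / 2  # works here because |g(x) - 3| = 2|x - 1|
    print(eps, delta, stays_in_band(g, a, L, eps, delta))

# A wrong candidate limit (3.5) fails once the band is narrower than the
# distance between 3.5 and 3, no matter how thin the vertical band is.
print(stays_in_band(g, a, 3.5, 0.25, 1e-9))
```

However small `eps` is chosen, a suitable `delta` exists for the true limit, while for the wrong candidate no `delta` helps.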

What the picture suggests is that as $\Delta x$ approaches $0$, the slope of the line joining $A$ and $B$ should approach the slope of the tangent at $A$; however, the line joining $A$ and $B$ never actually "becomes" the tangent $C$, and the point $B$ never actually "becomes" the point $A$. They are just approaching.

Now, does the line joining $A$ and $B$ have a slope that really approaches the slope of the tangent? The key is that we can characterize the tangent algebraically: the tangent is the unique line that affords the best approximation to $y=f(x)$ near the point $x_0$, where by "best approximation" we mean one in which the relative error approaches zero. That is: if we take a line $y=mx+b$ that goes through $(x_0,f(x_0))$, then we can ask for the "relative error" in using $mx+b$ to approximate $f(x)$: $$\frac{f(x) - (mx+b)}{x-x_0}.$$ The tangent is the unique line through $(x_0,f(x_0))$ for which the relative error approaches $0$ as $x$ approaches $x_0$ (if such a line exists at all).

Since $y=mx+b$ goes through $(x_0,f(x_0))$ if and only if $f(x_0) = mx_0 + b$, we get $b= f(x_0)-mx_0$; so the relative error will be $$\frac{f(x) - mx - f(x_0)+mx_0}{x-x_0} = \frac{f(x)-f(x_0) - m(x-x_0)}{x-x_0} = \frac{f(x)-f(x_0)}{x-x_0} - m.$$ So if we want the relative error to approach $0$, then we need $\frac{f(x)-f(x_0)}{x-x_0}$ to approach $m$; that is, the slope of the tangent must be whatever quantity the numbers $$\frac{f(x)-f(x_0)}{x-x_0}$$ are approaching as $x$ approaches $x_0$ (if they are approaching any particular number).

Setting $\Delta x = x-x_0$, so that $x = x_0+\Delta x$, the above fraction is the same as $$\frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}.$$ And $x$ approaches $x_0$ if and only if $\Delta x$ approaches $0$. So the slope of the tangent needs to be whatever quantity this fraction approaches as $\Delta x$ approaches $0$; that is, the slope of the tangent at $x_0$ is $m$ if and only if $$ \lim_{\Delta x \to 0} \frac{f(x_0+\Delta x)-f(x_0)}{\Delta x} = m.$$ Turning this assertion into a picture gives precisely the picture you have.
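The "unique line whose relative error approaches $0$" claim can be seen numerically. The sketch below assumes a hypothetical curve `f(x) = x**2` and the point `x0 = 1`, and compares three candidate slopes `m`; the helper name `relative_error` is also just an illustration.

```python
def f(x):
    # Hypothetical example curve, not taken from the question.
    return x ** 2


def relative_error(func, x0, m, dx):
    """(func(x) - line(x)) / (x - x0), where line has slope m through (x0, func(x0))."""
    x = x0 + dx
    line = func(x0) + m * (x - x0)
    return (func(x) - line) / (x - x0)


x0 = 1.0
for m in (1.5, 2.0, 2.5):
    errors = [relative_error(f, x0, m, dx) for dx in (1e-1, 1e-2, 1e-3, 1e-4)]
    print(m, errors)
```

For this particular `f`, the relative error works out to $2 + \Delta x - m$, so it shrinks to $0$ only for $m = 2$; for the other two slopes it settles near $\pm 0.5$ instead.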


¹ In case anyone is interested: many people viewed Newtonian mechanics and its consequent "clockwork universe" as a direct attack on the Christian notion of a deity that was directly involved in and modifying his creation, and also on the notion of free will. Some deists were in fact using the new physics as evidence in favor of the deist God, who created the universe and set it in motion but does not actively participate in it, in contrast to the theistic deity. By attacking the mathematical foundation of the new physics, Berkeley was defending the notions of free will and of the active deity. If you read Augustus De Morgan's A Budget of Paradoxes, he reviews many pamphlets and booklets written during those years which attack Newton and Calculus because they view the latter as an attack on religion. The "morality" arguments raised against Calculus are eerily similar (when not downright identical) to those raised against Darwin and the Theory of Evolution in later days.


$A$ and $B$ do not literally become overlapped, and the line $C$ is not obtained literally as a line joining points $A$ and $B$ on the curve. The diagram is designed to motivate the plausibility of the fact that as $B$ gets "sufficiently close" (but not equal) to $A$, the line connecting $A$ and $B$ gets "arbitrarily close" to the tangent line $C$ to the curve at the point $A$. This is consistent with the idea that $\Delta x$ "approaches $0$", but is never equal to $0$.

More precisely, the slope of the line $C$ is the derivative of $f$ at $x_0$, $f'(x_0)$, defined by

$$f'(x_0)=\lim_{\Delta x\to 0}\frac{f(x_0+\Delta x)-f(x_0)}{\Delta x}.$$

More informally again, this limit means that you can get "arbitrarily close" to $f'(x_0)$ by taking $\Delta x$ "sufficiently small" (but not $0$) in the difference quotient $\frac{f(x_0+\Delta x)-f(x_0)}{\Delta x}$, which is the slope of the line connecting $A$ to $B$ in the diagram.
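As a rough illustration of "sufficiently small but not $0$", the following Python sketch computes the slope of the line through $A$ and $B$ for shrinking $\Delta x$. The choice of `math.sin` as the curve, the point `x0 = 0.5`, and the helper name `secant_slope` are assumptions made for the example; the exact derivative `math.cos(x0)` is used only as a reference value to compare against.

```python
import math


def secant_slope(func, x0, dx):
    """Slope of the line through (x0, func(x0)) and (x0 + dx, func(x0 + dx))."""
    if dx == 0:
        # With dx = 0 the two points coincide and no line is determined.
        raise ValueError("dx must be nonzero")
    return (func(x0 + dx) - func(x0)) / dx


x0 = 0.5
exact = math.cos(x0)  # known derivative of sin, used only for comparison
for dx in (0.1, -0.1, 0.001, -0.001, 1e-6):
    slope = secant_slope(math.sin, x0, dx)
    print(dx, slope, abs(slope - exact))
```

The gap between the secant slope and the reference value shrinks as `dx` shrinks, from either side, yet `dx = 0` itself is never a legal input: the quotient is only ever evaluated at two distinct points.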
