Is $\frac{\textrm{d}y}{\textrm{d}x}$ not a ratio?

Historically, when Leibniz conceived of the notation, $\frac{dy}{dx}$ was supposed to be a quotient: the "infinitesimal change in $y$ produced by the change in $x$" divided by the "infinitesimal change in $x$".

However, formulating calculus with infinitesimals in the usual setting of the real numbers leads to a lot of problems. For one thing, infinitesimals can't exist in the usual setting of the real numbers, because the real numbers satisfy an important property called the Archimedean Property: given any positive real number $\epsilon\gt 0$, no matter how small, and any positive real number $M\gt 0$, no matter how big, there exists a natural number $n$ such that $n\epsilon\gt M$. But a positive "infinitesimal" $\xi$ is supposed to be so small that no matter how many times you add it to itself, it never gets to $1$; taking $\epsilon=\xi$ and $M=1$, that contradicts the Archimedean Property. Another problem: Leibniz defined the tangent to the graph of $y=f(x)$ at $x=a$ by saying "Take the point $(a,f(a))$; then add an infinitesimal amount to $a$, $a+dx$, and take the point $(a+dx,f(a+dx))$, and draw the line through those two points." But if they are two different points on the graph, then the line is a secant, not a tangent, and if it's just one point, then you can't define the line because a single point does not determine a line. Those are just two of the problems with infinitesimals. (See below where it says "However...", though.)
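As a small illustration of the Archimedean Property stated above, here is a minimal sketch that produces a witness $n$ with $n\epsilon > M$. It assumes exact rational inputs (Python's `Fraction`) as a stand-in for real numbers, and the helper name `archimedean_witness` is made up for this example.

```python
from fractions import Fraction
import math


def archimedean_witness(eps, M):
    """Return a natural number n with n * eps > M.

    eps and M are assumed to be positive rationals (Fractions or ints),
    used here as an exact stand-in for positive real numbers.
    """
    if eps <= 0 or M <= 0:
        raise ValueError("eps and M must both be positive")
    # floor(M / eps) + 1 is strictly greater than M / eps.
    return math.floor(M / eps) + 1


# Hypothetical values: a very small step and a fairly large target.
eps = Fraction(1, 10**9)
M = Fraction(10**6)
n = archimedean_witness(eps, M)
print(n, n * eps > M)
```

However small `eps` is, the function returns some finite `n`. That is exactly what a positive infinitesimal would have to violate.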

So Calculus was essentially rewritten from the ground up in the following 200 years to avoid these problems, and you are seeing the results of that rewriting (that's where limits came from, for instance). Because of that rewriting, the derivative is no longer a quotient; now it's a limit: $$\lim_{h\to0 }\frac{f(x+h)-f(x)}{h}.$$ And because we cannot express this limit of a quotient as a quotient of limits (both numerator and denominator go to zero, which would give the meaningless expression $\frac{0}{0}$), the derivative is not a quotient.
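To see the limit definition at work numerically, here is a minimal sketch. The choice of $f=\sin$ and $x=1$ is only an illustrative assumption, and `difference_quotient` is a helper name invented for this example.

```python
import math


def difference_quotient(f, x, h):
    """Return (f(x + h) - f(x)) / h, a genuine quotient for nonzero h."""
    return (f(x + h) - f(x)) / h


# Hypothetical example: f = sin at x = 1, whose derivative is cos(1).
x0 = 1.0
exact = math.cos(x0)
steps = [10.0 ** (-k) for k in range(1, 7)]  # h = 1e-1, ..., 1e-6
errors = [abs(difference_quotient(math.sin, x0, h) - exact) for h in steps]

for h, err in zip(steps, errors):
    print(f"h = {h:.0e}   |quotient - cos(1)| = {err:.2e}")
```

Every value printed comes from an honest quotient with $h \neq 0$. The derivative is the number those quotients approach, and plugging in `h = 0.0` directly just raises `ZeroDivisionError`.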

However, Leibniz's notation is very suggestive and very useful; even though derivatives are not really quotients, in many ways they behave as if they were quotients. So we have the Chain Rule: $$\frac{dy}{dx} = \frac{dy}{du}\;\frac{du}{dx}$$ which looks very natural if you think of the derivatives as "fractions". You have the Inverse Function Theorem, which tells you that $$\frac{dx}{dy} = \frac{1}{\quad\frac{dy}{dx}\quad},$$ which is again almost "obvious" if you think of the derivatives as fractions. So, because the notation is so nice and so suggestive, we keep it even though it no longer represents an actual quotient; it now represents a single limit. In fact, Leibniz's notation is so good, so superior to the prime notation and to Newton's notation, that British mathematics and science fell behind the rest of Europe for about a century. Because of the fight between Newton's and Leibniz's camps over who had invented Calculus and who stole it from whom (the consensus is that they each discovered it independently), England's scientific establishment decided to ignore what was being done on the Continent with Leibniz's notation and stuck to Newton's... and got stuck in the mud in large part because of it.
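Both "fraction-like" rules above can be checked numerically. The sketch below uses central differences, with $y=\sin(u)$, $u=x^2$ for the Chain Rule and $y=e^x$, $x=\ln y$ for the inverse rule. These functions, the point $x_0=0.7$, and the helper `central_diff` are all assumptions chosen for illustration.

```python
import math


def central_diff(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def u(x):
    return x * x


def y_of_x(x):
    return math.sin(u(x))


x0 = 0.7

# Chain Rule: dy/dx should match (dy/du) * (du/dx), with dy/du taken at u(x0).
dy_dx = central_diff(y_of_x, x0)
dy_du = central_diff(math.sin, u(x0))
du_dx = central_diff(u, x0)
chain_gap = abs(dy_dx - dy_du * du_dx)

# Inverse rule: for y = exp(x) and x = log(y), dx/dy should match 1 / (dy/dx).
y0 = math.exp(x0)
dx_dy = central_diff(math.log, y0)
inverse_gap = abs(dx_dy - 1.0 / central_diff(math.exp, x0))

print(f"chain rule gap:   {chain_gap:.2e}")
print(f"inverse rule gap: {inverse_gap:.2e}")
```

Both gaps come out at the level of floating-point noise. Note that the rules only work when each derivative is evaluated at the matching point ($u(x_0)$, respectively $y_0$), a detail the fraction notation hides.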

(Differentials are part of this same issue: originally, $dy$ and $dx$ really did mean the same thing as those symbols do in $\frac{dy}{dx}$, but that leads to all sorts of logical problems, so they no longer mean the same thing, even though they behave as if they did.)

So, even though we write $\frac{dy}{dx}$ as if it were a fraction, and many computations look like we are working with it like a fraction, it isn't really a fraction (it just plays one on television).

However... There is a way of getting around the logical difficulties with infinitesimals; this is called nonstandard analysis. It's pretty difficult to explain how one sets it up, but you can think of it as creating two classes of numbers: the real numbers you are familiar with, which satisfy things like the Archimedean Property, the Supremum Property, and so on, and then another, separate class of numbers (the hyperreals) that includes infinitesimals and a bunch of other things. If you do that, then you can, if you are careful, define derivatives exactly like Leibniz did, in terms of infinitesimals and actual quotients, and all the rules of Calculus that make use of $\frac{dy}{dx}$ as if it were a fraction are justified because, in that setting, it is a fraction. Still, one has to be careful to keep infinitesimals and regular real numbers separate and not let them get confused, or you can run into some serious problems.


Just to add some variety to the list of answers, I'm going to go against the grain here and say that you can, albeit in a silly way, interpret $dy/dx$ as a ratio of real numbers.

For every (differentiable) function $f$, we can define a function $df(x; dx)$ of two real variables $x$ and $dx$ via $$df(x; dx) = f'(x)\,dx.$$ Here, $dx$ is just a real number, and no more. (In particular, it is not a differential 1-form, nor an infinitesimal.) So, when $dx \neq 0$, we can write: $$\frac{df(x;dx)}{dx} = f'(x).$$
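Here is a minimal sketch of this construction, with $dx$ treated as an ordinary float. The example function $f(x)=x^3$ and the helper `make_df` are assumptions made for illustration.

```python
import math


def make_df(f_prime):
    """Build the two-variable function df(x, dx) = f'(x) * dx from a known f'."""
    def df(x, dx):
        return f_prime(x) * dx
    return df


# Hypothetical example: f(x) = x**3, so f'(x) = 3 * x**2.
df = make_df(lambda x: 3 * x ** 2)

x0 = 2.0
# dx is just a real number: it can be large, negative, or small, but not zero.
ratios = [df(x0, dx) / dx for dx in (5.0, 1.0, -0.25, 1e-3)]
print(all(math.isclose(r, 3 * x0 ** 2) for r in ratios))
```

Whatever nonzero `dx` is used, the ratio recovers $f'(x_0)$. The sketch also makes the circularity visible: `make_df` needs `f_prime` as an input.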


All of this, however, should come with a few remarks.

It is clear that the notation above does not constitute a definition of the derivative of $f$. Indeed, we needed to know what the derivative $f'$ meant before defining the function $df$. So in some sense, it's just a clever choice of notation.

But if it's just a trick of notation, why do I mention it at all? The reason is that in higher dimensions, the function $df(x;dx)$ actually becomes the focus of study, in part because it contains information about all the partial derivatives.

To be more concrete, for multivariable functions $f\colon R^n \to R$, we can define a function $df(x;dx)$ of two $n$-dimensional variables $x, dx \in R^n$ via $$df(x;dx) = df(x_1,\ldots,x_n; dx_1, \ldots, dx_n) = \frac{\partial f}{\partial x_1}dx_1 + \cdots + \frac{\partial f}{\partial x_n}dx_n.$$

Notice that this map $df$ is linear in the variable $dx$. That is, we can write: $$df(x;dx) = \left(\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right) \begin{pmatrix} dx_1 \\ \vdots \\ dx_n \end{pmatrix} = A(dx),$$ where $A$ is the $1\times n$ row matrix of partial derivatives.

In other words, the function $df(x;dx)$ can be thought of as a linear function of $dx$, whose matrix has variable coefficients (depending on $x$).
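A minimal sketch of the multivariable $df$ as a row matrix acting on $dx$, using the hypothetical function $f(x_1,x_2)=x_1^2x_2$. The names `grad_f` and `df` and all numeric values are assumptions for illustration.

```python
import math


def grad_f(x):
    """Row matrix A(x) of partial derivatives of f(x1, x2) = x1**2 * x2."""
    x1, x2 = x
    return (2 * x1 * x2, x1 ** 2)


def df(x, dx):
    """df(x; dx): the row matrix A(x) applied to the vector dx."""
    return sum(a_i * dx_i for a_i, dx_i in zip(grad_f(x), dx))


x = (1.5, -2.0)
u_vec = (0.3, 0.1)
v_vec = (-1.0, 4.0)
a, b = 2.0, -0.5

# Linearity in dx: df(x; a*u + b*v) should equal a*df(x; u) + b*df(x; v).
combo = tuple(a * ui + b * vi for ui, vi in zip(u_vec, v_vec))
lhs = df(x, combo)
rhs = a * df(x, u_vec) + b * df(x, v_vec)
linear_ok = math.isclose(lhs, rhs)
print(linear_ok)

# Feeding in the standard basis vectors recovers the individual partials.
print(df(x, (1.0, 0.0)), df(x, (0.0, 1.0)))
```

For $n > 1$ there is no way to "divide by `dx`", since `dx` is a vector. What survives is the linear map, and each partial derivative is its value on a basis vector.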

So for the $1$-dimensional case, what is really going on is a trick of dimension. That is, we have the variable $1\times1$ matrix ($f'(x)$) acting on the vector $dx \in R^1$ -- and it just so happens that vectors in $R^1$ can be identified with scalars, and so can be divided.

Finally, I should mention that, as long as we are thinking of $dx$ as a real number, mathematicians multiply and divide by $dx$ all the time -- it's just that they'll usually use another notation. The letter "$h$" is often used in this context, so we usually write $$f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h},$$ rather than, say, $$f'(x) = \lim_{dx \to 0} \frac{f(x+dx) - f(x)}{dx}.$$ My guess is that the main aversion to writing $dx$ is that it conflicts with our notation for differential $1$-forms.

EDIT: Just to be even more technical, and at the risk of being confusing to some, we really shouldn't even be regarding $dx$ as an element of $R^n$, but rather as an element of the tangent space $T_xR^n$. Again, it just so happens that we have a canonical identification between $T_xR^n$ and $R^n$ which makes all of the above okay, but I like the distinction between tangent space and Euclidean space because it highlights the different roles played by $x \in R^n$ and $dx \in T_xR^n$.


My favorite "counterexample" to the derivative acting like a ratio: the implicit differentiation formula for two variables. If $y$ is defined implicitly as a function of $x$ by $F(x,y)=0$ (and $\partial F/\partial y \neq 0$), we have $$\frac{dy}{dx} = -\frac{\partial F/\partial x}{\partial F/\partial y}.$$

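The formula is almost what you would expect, except for that pesky minus sign: if you "cancel" the $\partial F$'s on the right as though these were fractions, you get $\frac{\partial y}{\partial x}$ with no minus sign at all.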

See http://en.wikipedia.org/wiki/Implicit_differentiation#Formula_for_two_variables for a rigorous statement of this formula.
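The minus sign is easy to confirm on a concrete curve. The sketch below assumes the unit circle $F(x,y)=x^2+y^2-1=0$ and the point $(0.6, 0.8)$ on its upper half. Both are illustrative choices, and the helper names are invented for this example.

```python
import math


def F_x(x, y):
    """Partial derivative of F(x, y) = x**2 + y**2 - 1 with respect to x."""
    return 2 * x


def F_y(x, y):
    """Partial derivative of F(x, y) = x**2 + y**2 - 1 with respect to y."""
    return 2 * y


def y_upper(x):
    """Explicit solution of F(x, y) = 0 on the upper half of the circle."""
    return math.sqrt(1 - x * x)


x0 = 0.6
y0 = y_upper(x0)

# Slope of the curve measured directly with a central difference.
h = 1e-6
slope = (y_upper(x0 + h) - y_upper(x0 - h)) / (2 * h)

with_minus = -F_x(x0, y0) / F_y(x0, y0)  # the implicit differentiation formula
naive = F_x(x0, y0) / F_y(x0, y0)        # what "cancelling" would suggest

print(f"measured slope: {slope:.6f}")
print(f"with minus:     {with_minus:.6f}")
print(f"naive ratio:    {naive:.6f}")
```

The measured slope agrees with the formula that has the minus sign, and the naive ratio has the wrong sign. One way to see why: $F$ stays equal to $0$ along the curve, so $\frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy = 0$, and solving for $dy/dx$ moves one term across the equals sign.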