Why can't the Schrödinger equation be derived?

A derivation means a series of logical steps that starts with some assumptions, and ends up at the result you want. Just about anything can be "derived", as long as you vary what the assumptions are. So when people say "X can't be derived", they mean "at your current level of understanding, there's no way to derive X that sheds more light on why X is true, over just assuming it is".

For example, can you "derive" that momentum is $p = mv$? There are several possible answers.

  • You ask this as a student in introductory physics. Some might say yes. For example, you can start from the kinetic energy $K = mv^2/2$, and then assume $K = p^2/2m$. Combining these equations and solving for $p$ gives $p = mv$, so this is a derivation.
  • You ask this as a student in introductory physics. Some might say no. The above derivation is just nonsense. Starting from $K = p^2/2m$ is basically the same thing as assuming the final result, and if you're allowed to do that, it's no better than just taking $p = mv$ by definition. It's like "deriving" $1 + 1 = 2$ by defining $2$ to be $1 + 1$.
  • You ask this as a student in advanced mechanics. Most would say yes. You start from the deeper idea that symmetries are related to conserved quantities, along with the definition that momentum should be the conserved quantity associated with translational symmetry. Putting these together gives the result.
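
To make the contrast concrete, here is a sketch of both routes. The introductory one is pure algebra on the two assumed formulas for $K$:

$$\frac{1}{2} m v^2 = \frac{p^2}{2m} \implies p^2 = m^2 v^2 \implies p = mv$$

(taking the root for which $p$ points along $v$). For the advanced route, assume the standard Lagrangian of a nonrelativistic particle, $L = \frac{1}{2} m \dot{x}^2 - V(x)$, which the list above does not write down. If $L$ is unchanged under translations $x \to x + \epsilon$, then $\partial L / \partial x = 0$, and the Euler-Lagrange equation gives

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{x}} = \frac{\partial L}{\partial x} = 0 \implies p \equiv \frac{\partial L}{\partial \dot{x}} = m \dot{x} \text{ is conserved.}$$

Here $p = mv$ comes out of the symmetry argument instead of being put in by hand, which is why most people would count it as a real derivation.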

The point is, you can make up a derivation for anything -- but you might not be at a stage in your education where such a derivation is useful at all. If the derivation only works by making up ad-hoc assumptions that are basically as unmotivated as what you're trying to prove, then it doesn't aid understanding. Some people feel this is true for the Schrödinger equation, though I personally think its elementary derivations are quite useful. (The classic one is explained in a later answer here.)


There is often confusion here because derivations in physics work very differently than proofs in mathematics.

For example, in physics, you can often run derivations in both directions: you can use X to derive Y, and also Y to derive X. That isn't circular reasoning, because the real support for X (or Y) isn't that it can be derived from Y (or X), but that it is supported by some experimental data D. This two-way derivation then tells you that if you have data D supporting X (or Y), then it also supports Y (or X).

Once you finish putting high school math on a rigorous foundation, undergraduate math generally builds upward. For example, you can't use Stokes' theorem to prove the fundamental theorem of calculus, even though it technically subsumes it as a special case, because its proof depends on the fundamental theorem of calculus in the first place. In other words, as long as your classes are being rigorous at all, it would be very strange to hear "we can't derive this important result now, but we'll derive it next year" -- that would be in danger of logical circularity.
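
As a brief illustration of "subsumes it as a special case", written in the language of differential forms (notation not used elsewhere in this answer), the generalized Stokes' theorem reads

$$\int_M d\omega = \int_{\partial M} \omega$$

Taking $M = [a, b]$ and $\omega = f$ (a function, i.e. a 0-form), we have $d\omega = f'(x)\,dx$, and the oriented boundary is the point $b$ minus the point $a$, so the statement reduces to

$$\int_a^b f'(x)\,dx = f(b) - f(a)$$

which is the fundamental theorem of calculus. The special case is immediate, but it cannot serve as a proof, since the usual proofs of Stokes' theorem already rely on this one-dimensional result.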

This isn't the case in physics: undergraduate physics generally builds downward. Every year, you learn a new theory that subsumes everything you previously learned as a special case, which is completely logically independent of those earlier theories. You don't actually need any results from classical mechanics to completely define quantum mechanics: it is a new layer constructed below classical mechanics rather than above it. That's why definitions now can turn into derived things later, once you learn the lower level. And it means that in practice, physicists have to guess the lower level given only access to the higher level; that's the fundamental reason why science is hard!


Although knzhou's answer makes a good point stressing the possibility that what is taken as a starting point at the introductory level could become a consequence of a more fundamental principle, I think that there is a key point that should be stressed more clearly.

In physics, every conceptual tool we develop has to be rooted in, and motivated by, the need to describe and predict what happens in the real world.

Every theory we have is not just an equation: it is based on some definitions (always conventional; definitions can be useful or not, but never true or false), on some formal apparatus, and on a set of principles that conveniently summarize a large body of experimental work.

An equation like $\vec F = m \vec a$, within classical mechanics, can be taken as a principle (Newton), or it could be "derived" from a more geometric point of view by referring to groups of transformations on symplectic manifolds. But the important thing that shouldn't be forgotten is that it is an equation within a theory describing the dynamical behavior of macroscopic bodies under a certain set of conditions.

Beyond the range of applicability of classical mechanics, some new physics enters the game. New physics means that some experimental findings are no longer described by Newton's equations (regardless of whether these are assumed as principles or derived within a more general approach), and one has to find a new theory.

This change from one theory (or, better, from one set of equivalent theories) to another is the irreducible step that justifies the statement that Schrödinger's equation cannot be derived. To be more precise, Schrödinger's equation can be derived if one assumes an equivalent equation as a starting point. But it cannot be derived from starting points that are not consistent with quantum mechanics. For example, there is no way to deduce Schrödinger's equation from classical mechanics. The best one can do is to recast classical mechanics in the form closest to quantum mechanics. Still, at some point, a key conceptual difference justified by experiments has to appear. Without that, Physics would be a branch of Mathematics.


A bit of a different perspective than other answers:

I was once in a strange physics class as an undergraduate, where a 90-year-old professor would mumble to himself while drawing terribly on a tablet connected to a projector. Everyone got A's by default, so no one paid attention; in fact, some days I was the only one to show up. But this was "Modern Physics", and I wanted to be a physicist, so I paid attention, trying to learn whatever I could.

One thing I'll never forget:

The old professor said that everyone says Schrödinger's equation is an axiom, but you actually can derive it!

Imagine yourself in the shoes of Schrödinger. Experiments are showing that matter has wavelike properties. Are there equations of motion that describe "wavelike behavior"? We know how some waves operate in classical mechanics. Now typically in classical E&M, we throw out the imaginary part of $e^{i (k z - \omega t)}$ to work with $\cos(k z - \omega t)$, but what happens if you simply keep the imaginary part of the plane wave?

If you start off with a plane wave:

$$\Psi = e^{i (k z - \omega t)}$$ and you find its time derivative $$\frac{d\Psi}{dt} = -i \omega e^{i (k z - \omega t)}$$

If you use Einstein's idea that energy is quantized into packets (that is, $E = h f \implies f = E/h \implies \omega = E/\hbar$), this becomes:

$$\frac{d\Psi}{dt} = -i \frac{E}{\hbar} e^{i (k z - \omega t)}$$

this immediately becomes

$$i \hbar \frac{d\Psi}{dt} = E \Psi$$

and since the Hamiltonian represents the total energy operator, we can make this:

$$i \hbar \frac{d\Psi}{dt} = H \Psi$$

which is exactly Schrödinger's equation!
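
As a quick sanity check of the steps above, here is a minimal numerical sketch (not part of the professor's argument). It assumes natural units with $\hbar = 1$, uses arbitrary illustrative values for $k$, $\omega$, $z$, and $t$, and estimates the time derivative with a central finite difference. The function names are made up for this example.

```python
import cmath

HBAR = 1.0  # natural units; an assumption made only for this illustration


def plane_wave(z, t, k, omega):
    """Psi(z, t) = exp(i (k z - omega t))."""
    return cmath.exp(1j * (k * z - omega * t))


def time_derivative(z, t, k, omega, dt=1e-6):
    """Central finite-difference estimate of dPsi/dt."""
    forward = plane_wave(z, t + dt, k, omega)
    backward = plane_wave(z, t - dt, k, omega)
    return (forward - backward) / (2 * dt)


def schrodinger_residual(z, t, k, omega):
    """Return |i hbar dPsi/dt - E Psi|, using E = hbar * omega."""
    energy = HBAR * omega
    lhs = 1j * HBAR * time_derivative(z, t, k, omega)
    rhs = energy * plane_wave(z, t, k, omega)
    return abs(lhs - rhs)


# Arbitrary sample point and wave parameters (hypothetical values).
residual = schrodinger_residual(z=0.3, t=0.7, k=2.0, omega=1.5)
print(residual)  # small, limited only by finite-difference error
```

The residual is tiny for any choice of parameters, but note what this does and does not show: it confirms that a single plane wave with $E = \hbar\omega$ satisfies $i\hbar\, d\Psi/dt = E\Psi$. Replacing the number $E$ with the operator $H$ is the extra step that the argument above takes.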

Now this contradicts what even Feynman says: "Where did we get that (equation) from? Nowhere. It is not possible to derive it from anything you know. It came out of the mind of Schrödinger."

I was curious after class and asked him some questions about this. No matter what, doesn't there always need to be an axiom? He responded that yes, there needs to be a starting point, but this is how he imagines Schrödinger came up with it, since it is a very simple and natural way of obtaining the equation using the knowledge of the time.

To me, what's remarkable about this "derivation" is that you only need to start with two things:

  1. The state you're observing has the form of a plane wave: $\Psi = e^{i (k z - \omega t)}$
  2. Energy is quantized in packets: $E = h f$

And that's it! You don't even need de Broglie's hypothesis!


EDIT: Some people are curious why the Hamiltonian for the Schrödinger equation has such a strange form: $$H = -\frac{\hbar^2}{2m}\nabla^2 + V(x)$$ This is also very simple: you just need to plug the definition of the momentum operator into the equation for the Hamiltonian (which classically is just kinetic energy + potential energy):

$$H = \frac{p^2}{2m} + V(x)$$

$$p = -i \hbar \frac{\partial}{\partial x}$$

$$H = -\frac{\hbar^2}{2m}\nabla^2 + V(x)$$

It's that simple!
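
Spelling out the substitution in one dimension, with the factors of $\hbar$ kept explicit (in three dimensions $\partial^2/\partial x^2$ is replaced by $\nabla^2$):

$$p^2 = \left(-i\hbar \frac{\partial}{\partial x}\right)^2 = (-i)^2 \hbar^2 \frac{\partial^2}{\partial x^2} = -\hbar^2 \frac{\partial^2}{\partial x^2}$$

$$\frac{p^2}{2m} = -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2}$$

The minus sign in the kinetic term comes from $(-i)^2 = -1$, and the $\hbar^2$ from squaring the prefactor of the momentum operator. In units where $\hbar = 1$, the kinetic term is simply $-\nabla^2/2m$.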

Now if you are also curious where $p = -i \hbar \frac{\partial}{\partial x}$ comes from, this is also simple. For waves, the wavenumber $k$ plays the role of the momentum; with the same factor of $\hbar$ that relates $E$ and $\omega$, this reads $p = \hbar k$. So if we do what we did before, but now take the derivative with respect to position instead of time:

$$\frac{d\Psi}{dz} = i k e^{i (k z - \omega t)} = i \frac{p}{\hbar} e^{i (k z - \omega t)}$$

$$\frac{d\Psi}{dz} = i \frac{p}{\hbar} \Psi$$ $$-i\frac{d\Psi}{dz} = \frac{p}{\hbar} \Psi$$

$$p \Psi = (-i\hbar\frac{d}{dz}) \Psi $$

This suggests that any time you use $p \Psi$ you can swap it out with $(-i\hbar\frac{d}{dz}) \Psi$, and this is why people say "The momentum operator is $(-i\hbar\frac{d}{dz}) $ in the position basis."
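
The operator statements can also be checked numerically. The sketch below is only an illustration: it assumes units with $\hbar = m = 1$, a plane wave at $t = 0$, and finite-difference derivatives, and all function names are hypothetical. Acting with $-i\hbar\, d/dz$ on the plane wave should return $\hbar k$ times the wave, and acting with $-(\hbar^2/2m)\, d^2/dz^2$ should return $\hbar^2 k^2 / 2m$ times the wave.

```python
import cmath

HBAR = 1.0  # natural units; an assumption for this illustration
MASS = 1.0  # hypothetical particle mass in the same units


def psi(z, k):
    """Plane wave at t = 0: Psi(z) = exp(i k z)."""
    return cmath.exp(1j * k * z)


def momentum_on_psi(z, k, dz=1e-5):
    """Apply -i hbar d/dz to Psi using a central difference."""
    dpsi = (psi(z + dz, k) - psi(z - dz, k)) / (2 * dz)
    return -1j * HBAR * dpsi


def kinetic_on_psi(z, k, dz=1e-4):
    """Apply -(hbar^2 / 2m) d^2/dz^2 to Psi using a second difference."""
    d2psi = (psi(z + dz, k) - 2 * psi(z, k) + psi(z - dz, k)) / dz**2
    return -(HBAR**2) / (2 * MASS) * d2psi


k, z = 2.0, 0.3  # arbitrary illustrative values

# Dividing by Psi isolates the eigenvalue the operator pulls out.
p_ratio = momentum_on_psi(z, k) / psi(z, k)
ke_ratio = kinetic_on_psi(z, k) / psi(z, k)

print(p_ratio)   # close to hbar * k
print(ke_ratio)  # close to (hbar * k)**2 / (2 * MASS)
```

Both ratios come out independent of $z$ up to finite-difference error, which is the numerical version of saying that the plane wave is an eigenfunction of the momentum operator in the position basis.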