Can Sturm-Liouville theory actually solve ODEs?

Here are two reasons why Sturm-Liouville theory is useful:

Physics reason: I noticed that you tagged this question as "physics". Almost every problem in quantum physics is a Sturm-Liouville eigenfunction problem: namely, solving Schrodinger's equation!

For example:

  • Take the infinite square well. Here, Schrodinger's equation is of the form $ \mathcal L u + \lambda u = 0$, where $\mathcal L = \frac {d^2}{dx^2}$ (up to a constant factor). The boundary conditions $u(0) = u(\pi) = 0$. This is a Sturm-Liouville eigenfunction problem. The eigenfunctions $u_n = \sin(n x)$ are the wavefunctions of the various states, and the eigenvalues $\lambda_n = n^2$ are their energies. (The allowed values of $n$ are $n = 1,2,3,\dots$)

  • For the simple harmonic oscillator, Schrodinger's equation can be converted to the Hermite equation, which is also of Sturm-Liouville form. The eigenfunctions and eigenvalues are (closely related to) the wavefunctions and energies of the simple harmonic oscillator states. (This example is a "singular" Sturm-Liouville system, because the domain extends to infinity.)

  • When you solve Schrodinger's equation for the hydrogen atom by separation of variables, the radial equation can be converted into the associated Laguerre's equation, which is another (singular) Sturm-Liouville system...

Sturm-Liouville theory tells us (with certain caveats relating to dimensionality, infinite domains and singularities, and with subtleties depending on how closely-related the Sturm-Liouville problem is to the original Schrodinger problem) that:

  • There is a discrete sequence of energy levels, labelled by a quantum number $n$. This discreteness is why quantum physics is called "quantum". The energy values of the states form an ascending sequence $\lambda_1 \leq \lambda_2 \leq \lambda_3 \leq \dots$, which tends to infinity. So in particular, there is a well-defined notation of a "ground state" of minimal energy: this is the $n = 1$ state.

  • The wavefunctions of two energy levels with distinct energy values are orthogonal. This is an easy consequence of the self-adjointness property of the Hamiltonian operator. If we rescale the wavefunctions (and choose bases carefully if certain energy levels are degenerate), then we can make the wavefunctions orthonormal: i.e. $\int u_{n_1} u_{n_2} = \delta_{n_1, n_2}$.

  • Any general wavefunction $u$ can be written as a linear combination $ u = \sum_n c_n u_{n} $ of the pure energy-level wavefunctions $u_{n}$. This is the "completeness" statement in Sturm-Liouville theory. In physics language, the state $u$ is then a "superposition" of the pure energy-level states.

Moreover, it is easy to work out the coefficients $c_n$ in the linear decomposition $ u = \sum_n c_n u_{n} $. Since we have an orthogonality relation $\int u_{n_1} u_{n_2} = \delta_{n_1 , n_2}$, the $c_n$ coefficient is simply $c_n = \int u_{n} u$.

Maths reason: Suppose we wish to solve an equation of the form $$ \mathcal L u = f,$$ where $\mathcal L$ is a second-order differential operator of Sturm-Liouville type, and $f$ is some given function. (For example, this could be a dynamical system, where $f$ represents a forcing term.)

A common strategy for solving an equation of this form is to start by finding a set of eigenvalues $\lambda_n$ and eigenfunctions $u_n$ satisfying the equation, $$ \mathcal L u_n + \lambda_n w u_n = 0,$$ You're right in saying that Sturm-Liouville theory doesn't actually help you to find these eigenvalues and eigenfunctions. Usually, people determine them using other methods, such as power series. [Many of these power series solutions to Sturm-Liouville eigenfunction problems are well known - perhaps you can read up on Legendre polynomials, Hermite polynomials, Laguerre polynomials, Chebyshev polynomials, and so on.]

But once we have found the $\lambda_n$'s and $u_n$ satisfying the eigenfunction equation, we can use these eigenvalues and eigenfunctions, together with results from Sturm-Liouville theory, to build solutions to the original equation $\mathcal L u = f$. Here is how we do it:

First, Sturm-Liouville theory tells us that the $u_n$'s are complete. So we can write the solution $u$ of our original equation as a linear combination of the $u_n$'s: $$ u = \sum_n c_n u_n.$$

If we now substitute this ansatz into the equation $\mathcal L u = f$, we get $$- \sum_n c_n \lambda_n w u_n = f$$

Sturm-Liouville theory also tells us that the $u_n$'s are orthonormal (after rescaling): $$\int w u_{n_1} u_{n_2} = \delta_{n_1, n_2}.$$ [By a standard integration-by-parts argument, this orthonormality property follows from the fact that a differential operator of the form $\mathcal L = \frac d {dx} p \frac d {dx} + q$ is self-adjoint.]

So if we multiply both sides by $u_k $ and integrate, we get $$ c_k = - \tfrac 1 {\lambda_k}\int f u_k .$$

We have now solved the equation! The answer is $$ u = \sum_n \left( - \tfrac 1 {\lambda_n}\int f u_n \right) u_n.$$ [If you like, you can think of $\sum_n \tfrac 1 {\lambda_n }u_n(x')u_n(x)$ as the Green's function for the operator $\mathcal L$.]

To summarise, Sturm-Liouville theory doesn't tell us how to find the eigenfunctions and eigenvalues that satisfy $\mathcal L u + \lambda w u = 0$; this is done usually using power series methods. Instead, Sturm-Liouville theory tells us information about these eigenfunctions and eigenvalues that enable us to use them as building blocks for building solutions to general problems of the form $\mathcal L u = f$.

[This technique can be generalised for equations of the form $ \mathcal L u + \mu w u = f,$ where $\mu$ is a constant number. As along as $\mu$ is not equal to any $\lambda_n$, one can use the same method that the solution is $ u = \sum_n \left( \tfrac 1 {\mu - \lambda_n}\int f u_n \right) u_n.$ Since Sturm-Liouville theory tells us that the $\lambda_n$'s form a discrete set (more specifically, a positive, strictly-increasing sequence), it follows that this solution method is valid for "almost" every choice of $\mu$.]


Sturm-Liouville Theory came out of studying Fourier's Separation of Variables technique for solving Partial Differential Equations, and that's still an important application. In this context, solving the ODE is not the ultimate objective, but it provides a way to solve a more complicated equation. As an example, suppose $A$ is a selfadjoint matrix, and you want to solve a vector equation $$ \frac{d x}{dt} = A x,\;\;\; x(0)=x_0. $$ This problem can be reduced by finding an orthonormal basis $\{ e_1,e_2,\cdots,e_n \}$ of eigenvectors for $A$ with eigenvalues $\lambda_1,\lambda_2,\cdots\lambda_n$ because you can then write $$ x(t) = \alpha_1(t)e_1 + \cdots + \alpha_n(t)e_n $$ where $\alpha_k(t)$ are scalar functions of $t$, and plug back into the equation to obtain $$ \sum_{k=1}^{n} \alpha_k'(t)e_k = \sum_{k=1}^{n} \lambda_k \alpha_k(t)e_k \\ \implies \alpha_k(t) = \alpha_k(0)e^{\lambda_k t}. $$ Then you can find $\alpha_k(0)$ knowing that $x(0)=x_0$: $$ x_0 = x(0) = \sum_{k=0}^{n}\alpha_k(0)e_k \\ \implies \langle x_0,e_l\rangle = \alpha_l(0) \\ x(t) = \sum_{k=1}^{n}e^{\lambda_k t}\langle x_0,e_k\rangle e_k $$ This is a full solution of the problem.

Separation of variables works basically the same way for partial differential equations, but on a grander scale. Here Sturm-Liouville operators must be diagonalized, which requires an infinite-dimensional space, and an infinite orthogonal basis. The operators are selafadjoint, and so there is a strong similarity between these cases and the matrix case mentioned above. For example, the wave equation for a thin uniform wire stretched between two fixed points and plucked to an initial displacement $u_0(x)$ initially at rest and then released at $t=0$ is a partial differential equation for the unknown displacement $u(t,x)$ of the string governed by $$ \frac{\partial^2 u}{\partial t^2}=c\frac{\partial^2 u}{\partial x^2},\\ u(t,0)=u(t,L)=0, \\ u(0,x)=f(x), \\ u_{t}(0,x)=0. $$ In this case you're working with an operator $A f = -\frac{\partial^2}{\partial x^2}f$ on the domain of twice differentiable functions $f$ for which $f(0)=f(L)=0$. This operator also has an orthonormal basis of eigenfunctions, just as a selfadjoint matrix does: $$ e_n(x) = \sqrt{\frac{2}{L}}\sin(n\pi x/L). $$ The operator has eigenvalues $\lambda_n = n^2\pi^2/L^2$ and $$ Le_n = \lambda_n e_n. $$ By expanding $u(t,x)$ in these functions, the $x$ variable is removed, and you end up with a time-varying vector equation where the vectors are functions of $x$ in the vector space $L^2[0,L]$. $$ u(t) = \sum_{n=1}^{\infty}\alpha_n(t)e_n $$ The coefficients $\alpha_n(t)$ are scalar coefficient functions that must satisfy $$ \alpha_n''(t) = c \alpha_n(t),\;\;\; \alpha_n(0) = \langle f,e_n\rangle,\;\; \alpha_n'(0)=0. $$ The coefficient function solutions are $$ \alpha_n(t) = \langle f,e_n\rangle\cos(\sqrt{c}n\pi t/L), $$ which leads to the PDE solution $$ u(t,x) = \sum_{n=1}^{\infty}\langle f,e_n\rangle \cos(\sqrt{c}n\pi t/L)e_n(x) \\ = \sum_{n=1}^{\infty}\left[\frac{2}{L}\int_{0}^{L}f(x')\sin(n\pi x'/L)dx'\right]\cos(\sqrt{c}n\pi t/L)\sin(n\pi x/L) $$ This solution is a decomposition of the vibrational discplacement in terms of the "natural modes" of the string, which are harmonic multiples of a fixed ground state frequency, which is how the field came to be known as Harmonic Analysis.

So the goal is not so much how to solve the Sturm-Liouville problem, but how to use the "diagonalization" of the Sturm-Liouville problem in terms of its eigenfunctions to reduce partial differential equation problems to ODE problems for the coefficients in eigenfunction expansions of the solution along a basis that diagonalizes a Sturm-Liouville operator.