Why does separation of variables give the general solution to a PDE?

There are several key ingredients I will briefly describe here. I won't go into too much detail as you've mentioned that you don't have a graduate real analysis background yet. But indeed a full description of the theory is a standard part of a graduate course in linear PDE. So I hope that answers your side question as well.

  1. We start with a strongly elliptic linear operator (such as the Laplacian) and, along with some nice boundary condition, we restrict to some appropriate solution (Hilbert) space.

  2. In that solution space, we can prove under fairly general conditions that the eigenvalues of the operator form a countable set and that the eigenvectors (eigenfunctions) form an orthogonal basis for the solution space. This is the infinite-dimensional generalization of the diagonalizability of symmetric matrices from ordinary matrix theory. The proof relies on the spectral theorem for compact self-adjoint operators. The key here is that, up to a shift, the inverse of a strongly elliptic operator is compact.

  3. This demonstrates that if we can construct all the eigenvectors of the operator, the general solution can be written as an expansion in these eigenvectors.

  4. It remains to find the eigenvectors; in special cases (most famously, the 2D Laplacian on a rectangle) this can be done via separation of variables. So the remaining question is "Why does separation of variables produce all eigenvectors?" To answer it, recall that the eigenvectors form a complete basis. Because of the specific symmetry of the Laplacian on the rectangle, separation of variables reduces the problem to a pair of second-order equations in one dimension. In this process we produce the eigenvectors of these one-dimensional operators, and from the existing theory (in particular, Sturm-Liouville theory) we know that the products of these one-dimensional eigenvectors span the space. As we have produced a basis, no other eigenvectors are needed to form a general solution.
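
As a concrete instance of step 4, here is a sketch for the Dirichlet problem on the rectangle $(0,a)\times(0,b)$. The boundary condition and the symbols $a$, $b$, $X$, $Y$, $\mu$, $\nu$, $m$, $n$, $f$, $c_{mn}$ are assumptions chosen for illustration; they do not appear in the discussion above.

$$
-\Delta u = \lambda u \ \text{ in } (0,a)\times(0,b), \qquad u = 0 \ \text{ on the boundary}, \qquad u(x,y) = X(x)\,Y(y).
$$

Dividing by $XY$ separates the equation into two one-dimensional eigenvalue problems:

$$
-\frac{X''}{X} - \frac{Y''}{Y} = \lambda
\quad\Longrightarrow\quad
-X'' = \mu X,\ \ X(0)=X(a)=0, \qquad -Y'' = \nu Y,\ \ Y(0)=Y(b)=0, \qquad \lambda = \mu + \nu.
$$

Solving each one gives

$$
u_{mn}(x,y) = \sin\frac{m\pi x}{a}\,\sin\frac{n\pi y}{b}, \qquad
\lambda_{mn} = \pi^2\left(\frac{m^2}{a^2} + \frac{n^2}{b^2}\right), \qquad m,n = 1,2,3,\dots
$$

and a square-integrable $f$ on the rectangle is expanded as

$$
f = \sum_{m,n\ge 1} c_{mn}\,u_{mn}, \qquad
c_{mn} = \frac{4}{ab}\int_0^a\!\!\int_0^b f(x,y)\,u_{mn}(x,y)\,dy\,dx.
$$

Each sine family is a complete orthogonal family in its own variable, which is why the products $u_{mn}$ leave no eigenfunction unaccounted for.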


The answer by @Christopher is very complete and definitely better than what this answer will be. But I would like to make some comments on Separation of Variables.

Separation of Variables is a process of splitting a multi-dimensional problem into several one-dimensional problems. However, this relies on an inherent symmetry of the domain, which itself determines the coordinates that allow for separation of variables.

If the question is posed on a rectangle, then it is quite natural that the problem, given in rectangular coordinates, can be broken down into two one-dimensional problems, one in each orthogonal direction. If the problem is posed on a disk, then polar coordinates are required. However, if the problem is given on a completely arbitrary domain, then it is unlikely that you could find a coordinate system that reflects the symmetry of the domain and allows for separation of variables.
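
The rectangle case can be checked numerically in a discrete setting. The following is only a sketch under stated assumptions: a finite-difference grid with Dirichlet (zero) boundary values, and hypothetical helper names `dirichlet_second_difference` and `compare_spectra` that I made up for this illustration. It builds the discrete 2D operator as a Kronecker sum of two 1D operators and compares its full spectrum with all pairwise sums of the 1D eigenvalues.

```python
import numpy as np


def dirichlet_second_difference(n, length):
    """Matrix of -d^2/dx^2 on (0, length): n interior grid points, zero boundary values."""
    h = length / (n + 1)
    off = -np.ones(n - 1)
    return (2.0 * np.eye(n) + np.diag(off, 1) + np.diag(off, -1)) / h**2


def compare_spectra(nx=12, ny=9, a=1.0, b=2.0):
    """Return (eigenvalues of the 2D operator, all sums of 1D eigenvalues), both sorted."""
    ax = dirichlet_second_difference(nx, a)
    ay = dirichlet_second_difference(ny, b)
    # Discrete -Laplacian on the rectangle (0, a) x (0, b), written as a Kronecker sum.
    a2d = np.kron(ax, np.eye(ny)) + np.kron(np.eye(nx), ay)
    direct = np.linalg.eigvalsh(a2d)  # eigvalsh returns eigenvalues in ascending order
    mu = np.linalg.eigvalsh(ax)       # 1D eigenvalues in x
    nu = np.linalg.eigvalsh(ay)       # 1D eigenvalues in y
    separated = np.sort((mu[:, None] + nu[None, :]).ravel())
    return direct, separated


direct, separated = compare_spectra()
print("number of 2D eigenvalues:", direct.size)
print("largest mismatch:", np.max(np.abs(direct - separated)))
print("smallest eigenvalue:", direct[0])
print("continuum value pi^2 (1/a^2 + 1/b^2):", np.pi**2 * (1 / 1.0**2 + 1 / 2.0**2))
```

The two sorted lists agree up to rounding, so in this discrete model the separated products account for every eigenvalue of the 2D operator. This works only because the grid is a product of two 1D grids; on an irregular domain there is no such Kronecker structure to exploit.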

If you go deeper into Lie theory, you will find a group-theoretic method for determining the coordinate systems in which a given equation is separable. However, I don't think I have a deep enough understanding of this to comment further.


Separation of variables relies on being able to choose an orthogonal coordinate system in which the Laplace operator separates. That is a rather strong restriction. For example, the 3D Laplacian separates in only about a dozen orthogonal coordinate systems (the classical list for simple separation has eleven). Moreover, the solid in which you are solving the Laplace equation must be a rectangular box in the curvilinear coordinates, so that each face of the solid is described as a rectangle in two of the curvilinear variables. Under these conditions, the transformed Laplacian permits the use of separation of variables for solving the Laplace equation.
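
A two-dimensional analogue may make the "box in curvilinear coordinates" condition concrete. This is a sketch, assuming a disk of radius $\rho$ with Dirichlet boundary values; the symbols $\rho$, $R$, $\Theta$, $m$, $k$, $\lambda$ and $j_{m,k}$ are introduced here for illustration. In polar coordinates the disk is the rectangle $0 < r < \rho$, $0 \le \theta < 2\pi$, and the eigenvalue problem $-\Delta u = \lambda u$ reads

$$
u_{rr} + \frac{1}{r}\,u_r + \frac{1}{r^2}\,u_{\theta\theta} + \lambda u = 0 .
$$

Substituting $u = R(r)\,\Theta(\theta)$ and multiplying by $r^2/(R\,\Theta)$ gives

$$
\frac{r^2 R'' + r R'}{R} + \lambda r^2 = -\frac{\Theta''}{\Theta} = m^2
\quad\Longrightarrow\quad
\Theta'' + m^2\,\Theta = 0, \qquad r^2 R'' + r R' + (\lambda r^2 - m^2)\,R = 0 .
$$

Periodicity in $\theta$ forces $m = 0, 1, 2, \dots$, and the radial equation is Bessel's equation, whose solutions bounded at $r = 0$ are $R(r) = J_m(\sqrt{\lambda}\,r)$. The boundary condition $R(\rho) = 0$ then selects

$$
\lambda_{m,k} = \left(\frac{j_{m,k}}{\rho}\right)^2, \qquad k = 1, 2, 3, \dots,
$$

where $j_{m,k}$ denotes the $k$-th positive zero of $J_m$.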

The ODEs that result from separation of variables are Sturm-Liouville eigenvalue problems; this is where Sturm-Liouville theory originated. The Sturm-Liouville problems are easier to analyze than the PDE: one can prove that eigenfunction expansions exist for them, and that is enough to solve the Laplace equation using the eigenfunction expansions coming from the Sturm-Liouville ODEs. You do not necessarily end up with discrete sums of eigenfunctions. If the domain is infinite in one or more coordinates, or if the Jacobian of the orthogonal transformation to curvilinear coordinates vanishes somewhere on the outer surface or at an interior point, then the eigenfunction expansions may involve discrete sums and/or integrals of eigenfunctions over the eigenvalue parameter. The theory is not necessarily simple, but it was worked out well before the general theory of elliptic PDEs, and it remains important because it yields explicit solutions in some rather important cases. The method is validated by proving the completeness of the eigenfunction expansions associated with Sturm-Liouville problems.
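
Completeness can be illustrated numerically in the simplest regular case, although this is not a proof. The sketch below assumes the Sturm-Liouville problem $-\varphi'' = \lambda\varphi$ on $(0,1)$ with $\varphi(0)=\varphi(1)=0$, whose normalized eigenfunctions are $\sqrt{2}\sin(n\pi x)$. The helper names `sine_expansion_error` and `sample_function`, and the choice $f(x) = x(1-x)$, are hypothetical and chosen only for this example. It measures the $L^2$ error of the partial sums of the eigenfunction expansion.

```python
import numpy as np


def sine_expansion_error(f, n_terms, n_grid=2001):
    """L2 error on (0, 1) between f and its n_terms-term sine eigenfunction expansion."""
    x = np.linspace(0.0, 1.0, n_grid)
    dx = x[1] - x[0]
    fx = f(x)
    partial_sum = np.zeros_like(x)
    for n in range(1, n_terms + 1):
        phi = np.sqrt(2.0) * np.sin(n * np.pi * x)  # normalized eigenfunction
        coeff = np.sum(fx * phi) * dx               # quadrature estimate of <f, phi_n>
        partial_sum += coeff * phi
    return float(np.sqrt(np.sum((fx - partial_sum) ** 2) * dx))


def sample_function(x):
    """Smooth test function vanishing at both endpoints."""
    return x * (1.0 - x)


errors = [sine_expansion_error(sample_function, n) for n in (1, 5, 25)]
print(errors)
```

The error shrinks as more eigenfunctions are included, which is what completeness predicts. For the singular or infinite-domain cases mentioned above, the sum would be replaced in part by an integral over the eigenvalue parameter, and this simple experiment does not cover them.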

The general theory of elliptic PDEs is far more general than what is required to deal with the problems where separation of variables applies to the Laplace equation. On the other hand, the general theory is not needed when separation of variables applies. Separation of variables is one of the few ways to obtain general, explicit solutions for specific geometries. Even though there are not many cases where explicit solutions are possible, these cases are useful special cases that help reveal the general nature of elliptic PDEs.