Metamathematics of the Banach space consequences of the Baire category theorem

The bolded comments are making a relatively simple point, made complex by the abstractness of the subject matter. Abstracting away from the mathematical complexity, suppose we have some interesting property Y that is hard to verify directly. What we typically want to do in this case is find some easier-to-verify property X that is a sufficient condition for Y (that is, anything with property X has property Y). This gives us a nice method for verifying that something has property Y (hard): first verify that it has property X (easy). This, of course, works in any case where the object does in fact have property X. But the question remains: is there a more powerful method for verifying property Y? If all we've shown is that X is sufficient, but not necessary, there might be some property Z, weaker than X, that is also sufficient for Y; verifying Z would then be a more powerful method for verifying Y. In this situation, finding more powerful methods for verifying that objects have property Y remains an area of active mathematical research.

However, if we can prove that X is not only sufficient but also necessary for Y, then we know we can stop looking for more powerful methods. Essentially, since anything with property Y has property X, and X is easy to verify, Y has become easy to verify.

To take a trivial example: it is useful to know that two points suffice to determine the equation of a line; this (or rather the method for determining the equation) is something people use all the time. That two points are necessary (in that one point is insufficient) is not a fact you will use very often, but it allows you to stop worrying that there might be a fancy method for determining equations of lines that you're missing.
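To make this concrete (the points $(x_1,y_1),$ $(x_2,y_2)$ and the slope $m$ are notation introduced here purely for illustration, and we assume $x_1 \neq x_2$ so the line is not vertical): two points pin down a single equation, while one point leaves the slope completely free.

$$ y - y_1 = \frac{y_2 - y_1}{x_2 - x_1}\,(x - x_1) \qquad \text{versus} \qquad y - y_1 = m\,(x - x_1), \quad m \in \mathbb{R} \ \text{arbitrary}. $$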


I think there are a number of reasons why this approach is useful, like what Hendreyth said in his answer. In my answer I want to address the following question:

Why should we work hard to prove a stronger result, when we can just prove the weaker condition and appeal to the three theorems? In many ways, it doesn't make sense to use the easier direction of the theorems, which are basically trivialities in all three cases. At first glance, there seems to be no apparent benefit to trying to prove the stronger result.

I think the reason is that these precise quantitative estimates are more generally useful for establishing other results about the operator in question. Sometimes the conclusions of these theorems aren't enough to establish what we really want to know, and it turns out that the quantitative estimates allow us to deduce stronger conclusions. Alternatively, in applications you may be studying a very particular system, so having these estimates may help in understanding a different aspect of the system.

Taking the example of Fourier inversion in $L^2,$ note that it suffices to show that the sequence $S_N(f)$ is bounded for each $f \in L^2,$ from which a uniform bound follows by the uniform boundedness principle. This, however, doesn't tell you that the mapping is in fact an isometry, which is a much stronger condition than just invertibility.
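As a sketch of what the explicit estimates look like here, assume (this setup is not spelled out above) that $S_N f = \sum_{|n| \leq N} \hat{f}(n) e^{inx}$ is the $N$-th partial sum of the Fourier series on the circle, with $\hat{f}(n) = \frac{1}{2\pi}\int_0^{2\pi} f(x) e^{-inx}\,dx$ and $\lVert f \rVert_{L^2}^2 = \frac{1}{2\pi}\int_0^{2\pi} |f(x)|^2\,dx.$ Orthonormality of the exponentials then gives the uniform bound directly, with explicit constant $1,$ and Parseval's identity gives the isometry:

$$ \lVert S_N f \rVert_{L^2}^2 = \sum_{|n| \leq N} |\hat{f}(n)|^2 \leq \lVert f \rVert_{L^2}^2, \qquad \lVert f \rVert_{L^2}^2 = \sum_{n \in \mathbb{Z}} |\hat{f}(n)|^2. $$

The first relation is all that the abstract argument needs; the second is the extra information that the abstract argument cannot produce.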

Note that you don't really come across this in an introductory functional analysis course, which takes a more abstract viewpoint in a sense. I find this only really makes sense in applications, where you have a concrete system from which you can establish explicit estimates.


A more involved, but hopefully more illustrative example is the following: Consider the problem of solving a second order linear elliptic PDE of the form $Lu = f$ on some bounded domain $\Omega$ with zero boundary conditions. Here we view $L$ as an operator mapping between Hölder spaces $C^{2+\alpha}(\Omega) \rightarrow C^{\alpha}(\Omega)$ (the details aren't too important here).

One way of establishing invertibility is via the so-called continuity method. We will assume that we can solve the Poisson equation $\Delta u = f,$ so there is a bounded inverse $\Delta^{-1}: C^{\alpha}(\Omega) \rightarrow C^{2+\alpha}(\Omega).$ We then set, $$ L_t = (1-t) \Delta + t L$$

for $0 \leq t \leq 1.$ We know that $L_0 = \Delta$ is invertible and we want to use that to show that $L_1 = L$ is invertible. Now, it should hopefully be clear that we can't do this without knowing a bit more about the operator $L,$ but we can nevertheless try to proceed via general arguments.

Set $S = \{ t \in [0,1] : L_t\ \text{is invertible} \},$ which is non-empty as $0 \in S.$ Now given $s \in S,$ let $t \in [0,1]$ and $f \in C^{\alpha}(\Omega).$ We wish to find a solution to the equation $L_tu=f,$ so we define $T : C^{2+\alpha}(\Omega) \rightarrow C^{2+\alpha}(\Omega)$ by sending,

$$ Tu = L_s^{-1}f + u - L_s^{-1}L_tu = L_s^{-1}f + L_s^{-1}(L_s-L_t)u. $$

Note that $Tu = u$ if and only if $L_tu =f.$ To show that a fixed point exists, we will show that $T$ is a contraction for $|s-t|$ sufficiently small. Indeed we have,

$$ \lVert Tu - Tv \rVert = \lVert L_s^{-1}(L_s-L_t)(u-v) \rVert \leq \lVert L_s^{-1} \rVert \lVert \Delta - L \rVert |s-t| \lVert u - v \rVert.$$

So this estimate tells us that $T$ is a contraction, with constant at most $1/2,$ whenever $|s-t| \leq 1/(2\lVert L_s^{-1} \rVert \lVert \Delta - L \rVert).$ Hence by the contraction mapping theorem, for all $t$ sufficiently close to $s,$ the equation $L_t u = f$ admits a unique solution for every $f \in C^{\alpha}(\Omega).$ Thus $L_t$ is a bounded bijection, so by the open mapping theorem its inverse is bounded and $L_t$ is invertible, which establishes that $S$ is open.
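For completeness, here are the two short computations used in the argument above: the equivalence between fixed points of $T$ and solutions of $L_t u = f$ (using that $L_s^{-1}$ is injective), and the identity behind the factor $|s-t| \lVert \Delta - L \rVert$ in the estimate.

$$ \begin{aligned} Tu = u &\iff L_s^{-1}f + L_s^{-1}(L_s - L_t)u = u \iff L_s^{-1}(f - L_t u) = 0 \iff L_t u = f, \\ L_s - L_t &= \big[(1-s)\Delta + sL\big] - \big[(1-t)\Delta + tL\big] = (t-s)(\Delta - L). \end{aligned} $$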

At this point we may make a few remarks:

  • We have proven that for sufficiently small $s,$ $L_s$ is invertible.

  • In fact this argument is much more general: we proved that a sufficiently small perturbation of an invertible operator is still invertible.

  • This argument gives us a local result, but we can't extend it to show that $S$ contains all of $[0,1].$ The problem is that we have no uniform control over $\lVert L_s^{-1} \rVert,$ which may blow up as $s$ approaches a limit point of $S.$
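The second remark can be made quantitative. As a sketch, with $A,$ $B,$ $X$ and $Y$ being notation introduced here: if $A : X \rightarrow Y$ is a bounded invertible operator between Banach spaces and $B : X \rightarrow Y$ is bounded with $\lVert A^{-1} \rVert \lVert B \rVert < 1,$ then writing $A + B = A(I + A^{-1}B)$ and expanding in a Neumann series gives

$$ (A+B)^{-1} = \sum_{k=0}^{\infty} (-A^{-1}B)^k A^{-1}, \qquad \lVert (A+B)^{-1} \rVert \leq \frac{\lVert A^{-1} \rVert}{1 - \lVert A^{-1} \rVert \lVert B \rVert}. $$

Taking $A = L_s$ and $B = L_t - L_s,$ the admissible size of the perturbation shrinks to zero exactly when $\lVert L_s^{-1} \rVert$ is unbounded, which is the obstruction described in the third remark.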

This is where quantitative estimates become useful. Under appropriate assumptions on $L,$ one can prove so-called Schauder estimates: there exists a constant $C>0,$ independent of $t$ and $u$ such that,

$$ \lVert u \rVert_{C^{2+\alpha}(\Omega)} \leq C\lVert L_t u \rVert_{C^{\alpha}(\Omega)}. $$

Establishing this estimate requires a lot of work and makes full use of the exact structure of the operator $L.$ As a result, however, we get a uniform lower bound on the quantity $1/(2 \lVert L_s^{-1} \rVert \lVert \Delta - L \rVert),$ from which we can establish $S=[0,1]$ by a compactness argument (noting that the bound on $|s-t|$ we needed earlier now holds uniformly in $s$ and $t$).
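To spell out how the estimate enters (a sketch; $\delta$ is notation introduced here, and we assume $L \neq \Delta$ since otherwise there is nothing to prove): for $s \in S$ and $f \in C^{\alpha}(\Omega),$ applying the Schauder estimate to $u = L_s^{-1}f$ gives

$$ \lVert L_s^{-1} f \rVert_{C^{2+\alpha}(\Omega)} \leq C \lVert f \rVert_{C^{\alpha}(\Omega)} \implies \lVert L_s^{-1} \rVert \leq C \implies \frac{1}{2 \lVert L_s^{-1} \rVert \lVert \Delta - L \rVert} \geq \delta := \frac{1}{2C \lVert \Delta - L \rVert}. $$

So whenever $s \in S,$ every $t \in [0,1]$ with $|s-t| \leq \delta$ also lies in $S.$ Starting from $0 \in S$ and moving in steps of length $\delta,$ we reach $t = 1$ after at most $\lceil 1/\delta \rceil$ steps.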

Note that the Schauder estimates replace the use of the open mapping theorem, so this result holds without using Baire category. The conclusion we get, however, is much stronger because of the extra information encoded in our estimates (namely that the bound is independent of $t$).