Why do natural transformations express the fact that a vector space is canonically embedded in its double-dual but not in its dual?

I am answering as somebody who has struggled through a related matter, as you noted in the OP. I do not think I will be able to satisfy every one of your related threads of dissatisfaction and I am not sure I will be able to satisfy any at all. On the flip side, as the question is a year old, you may have resolved it for yourself long ago.

But let's give it a whirl. I love the question.

First of all, separate from the question about how the category language speaks to (or doesn't speak to) matters, it seems to me you are not convinced that there even is a substantive difference between the isomorphism of a finite dimensional vector space to its dual and the isomorphism to its double dual, a propos of your Profound and Fundamental Lesson of Abstract Algebra -- aren't they both isomorphisms? So, before even engaging the category theory, let me speak to this:

(1) I think you will gain useful insight about the situation from studying cases where the substance of the difference between a space and its dual is felt. user254665 mentioned one such instance in her/his answer. In general, the infinite-dimensional topological vector spaces of functional analysis provide an abundant source of examples. While the dual of a finite dimensional vector space is finite-dimensional of the same dimension, and therefore isomorphic, the dual of a Banach space is typically a different Banach space. For example, for $1<p<\infty$ the dual of $L^p$ is $L^q$ with $p^{-1}+q^{-1} = 1$, and these are two different Banach spaces unless $p=2$. The dual of the space of continuous, compactly supported functions on a locally compact Hausdorff space is a space of measures, i.e. it is not even a space of functions!

Even in these situations where the dual is really a different animal, the original space does embed in its double dual, as usual by mapping a vector to the functional on functionals obtained by evaluation at that vector. (I will avoid controversy by not lionizing this embedding as "natural".) In many cases, the embedding is proper, i.e. the double dual is bigger than the original space. Nonetheless, there's often no obvious embedding of the original space in the (single) dual at all.

I am not a functional analyst, but a place I've encountered this substance in my own life is in the difference between a locally compact abelian group and its character group, i.e. its Pontryagin dual. Like vector spaces, this is a situation where finiteness yields a non-canonical isomorphism to the dual, and there is a canonical isomorphism to the double dual. A finite abelian group $A$ is isomorphic to its dual $\hat A$, but an infinite group need not be. For example, the additive group $\mathbb{Z}$ of integers and the circle group $S^1 = \{z\in\mathbb{C}^\times \mid |z| = 1\}$ are Pontryagin duals of each other, and they don't even have the same cardinality. In the finite case, where they are isomorphic, I've still "bumped into" the difference between $A$ and $\hat A$, for example in trying to understand the relationship between an action of a group $G$ of automorphisms on $A$ and the induced action of $G$ on $\hat A$, e.g. see this question.
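To make the finite case concrete, here is a minimal sketch of the character group of the cyclic group $\mathbb{Z}/n$. The helper names (`character`, `is_homomorphism`) and the choice $n=6$ are mine, purely for illustration. A character is determined by its value at $1$, which must be an $n$-th root of unity, so the $n$ maps below are all of them. Note that the labelling $k\mapsto\chi_k$ depends on picking the primitive root of unity $e^{2\pi i/n}$, which is one way to see that the isomorphism $A\cong\hat A$ involves a choice.

```python
import cmath

def character(n, k):
    """Return chi_k : Z/n -> S^1, a |-> exp(2*pi*i*k*a/n)."""
    return lambda a: cmath.exp(2j * cmath.pi * k * a / n)

def is_homomorphism(chi, n, tol=1e-9):
    """Check chi(a + b mod n) == chi(a) * chi(b) for all a, b in Z/n."""
    return all(
        abs(chi((a + b) % n) - chi(a) * chi(b)) < tol
        for a in range(n) for b in range(n)
    )

n = 6  # illustrative choice
chars = [character(n, k) for k in range(n)]
all_homs = all(is_homomorphism(chi, n) for chi in chars)

# The characters are pairwise distinct: chi_k(1) = exp(2*pi*i*k/n) differs for each k.
values_at_1 = [chi(1) for chi in chars]
distinct = all(
    abs(values_at_1[j] - values_at_1[k]) > 1e-9
    for j in range(n) for k in range(j + 1, n)
)
```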

All of this is to say that study of such examples can help convince one that the dual is really not the same as the original object, so that even when they're isomorphic it's worth keeping track of which is which. (More so than it is worth distinguishing the object from its double-dual when they are isomorphic.)

(2) How to make sense of this difference in light of your Profound and Fundamental Lesson (PaFL), that isomorphic objects are to all intents and purposes the same.

This is a question about the scope of the PaFL.

The PaFL is the right way to see things when you view the objects in isolation from their surroundings and each other. Let $A$ and $B$ be isomorphic objects (e.g. vector spaces or groups). Any specific isomorphism $\phi:A\rightarrow B$ gives you a dictionary to translate statements about the isolated object $A$ to statements about the isolated object $B$ and vice versa. For example: if $A,B$ are vector spaces, then $\phi$ carries bases to bases, so there is a perfect bijective correspondence between bases of $A$ and bases of $B$. It carries linear transformations of $A$ to linear transformations of $B$ (via $T\mapsto \phi T\phi^{-1}$) so there is a bijection between such transformations. If we think of $\phi$ as a "renaming", then we can think of $B$ as just $A$ with different names.

From this point of view, $A$ and $B$ are "the same", and any "renaming" $\phi$ works as well as any other to show this. This is the PaFL.

But. If we allow $A$ and $B$ to interact with other objects (even each other!), then distinct isomorphisms start to feel very different! For example:

Let $A = \mathbb{R}^2$, seen as a real vector space. Let $B$ be $A$'s vector space dual, i.e. the space of linear functionals $A\rightarrow \mathbb{R}$, with pointwise addition and scalar multiplication. $B$ is isomorphic to $A$ since it is also a 2-dimensional real vector space. One has a wide choice of isomorphisms: fixing a basis of $A$, one can send it to any basis of $B$. There is a 4-dimensional manifold's worth of choice.

Now along comes a linear transformation $T$ acting on $A$, say by scaling the $x$-axis by a factor of $2$. One can pick some isomorphism $\phi:A\rightarrow B$ and translate $T$ into a transformation of $B$ as above (i.e. $\phi T \phi^{-1}$). But there is another (natural??) way that $T$ acts on $B$, irrespective of any choice of $\phi$, which is to send a functional $f:A\rightarrow\mathbb{R}$ to the functional $f\circ T$. Now one can ask about any given $\phi$: does the transformation of $B$ into which it translates $T$ equal this (natural??) action of $T$ on $B$? I.e. does $\phi T \phi^{-1} (f) = f\circ T$ for all $f\in B$? A priori, some $\phi$'s may be compatible with the action of $T$ on $B$ in this respect, and some may not.
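This compatibility check can be carried out in coordinates. The sketch below assumes an identification that is mine, not the OP's: a functional $f$ is recorded by its coefficient vector $a$ (so $f(w) = a\cdot w$), and $\phi$ is given by an invertible matrix $P$ sending $v$ to the functional with coefficients $Pv$. Then $f\circ T$ has coefficients $T^{\mathsf T}a$, and the condition $\phi T\phi^{-1}(f) = f\circ T$ for all $f$ becomes $PT = T^{\mathsf T}P$. The helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    """Transpose of a 2x2 matrix."""
    return [[X[j][i] for j in range(2)] for i in range(2)]

def compatible(P, T):
    """True if P T = T^T P, i.e. phi T phi^{-1}(f) = f o T for all f."""
    return matmul(P, T) == matmul(transpose(T), P)

T = [[2, 0], [0, 1]]          # scales the x-axis by 2
P_diag = [[3, 0], [0, 5]]     # a diagonal choice of phi
P_swap = [[0, 1], [1, 0]]     # a phi that swaps the two coordinates

print(compatible(P_diag, T))  # True
print(compatible(P_swap, T))  # False
```

In these coordinates, the $\phi$'s compatible with this particular $T$ are exactly the ones with diagonal $P$, so some isomorphisms pass the test and others fail it.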

One could go further. I chose a specific $T$ at the front end of this. But one could ask if there is a $\phi$ such that $\phi T\phi^{-1}(f)$ will equal $f\circ T$ regardless of the choice of $T$. This $\phi$, if it existed, would clearly (?) be "awesome" in some way that other isomorphisms aren't.
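As it happens, for $A=\mathbb{R}^2$ no such $\phi$ exists. Here is a sketch of why, in coordinates of my own choosing: record a functional $f$ by its coefficient vector $a$ (so $f(w)=a^{\mathsf T}w$) and let $\phi$ be given by an invertible matrix $P$. Then $f\circ T$ has coefficients $T^{\mathsf T}a$, and the requirement for every $T$ reads $PTP^{-1} = T^{\mathsf T}$. Applying this to a product $ST$ in two ways gives

$$
\begin{aligned}
(ST)^{\mathsf T} &= P(ST)P^{-1} = (PSP^{-1})(PTP^{-1}) = S^{\mathsf T}T^{\mathsf T},\\
(ST)^{\mathsf T} &= T^{\mathsf T}S^{\mathsf T}.
\end{aligned}
$$

So $S^{\mathsf T}T^{\mathsf T} = T^{\mathsf T}S^{\mathsf T}$ would have to hold for all $S$ and $T$, i.e. all $2\times 2$ matrices would commute, which is false (take $S=\begin{pmatrix}0&1\\0&0\end{pmatrix}$ and $T=\begin{pmatrix}0&0\\1&0\end{pmatrix}$).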

Perhaps you respond by saying, well, why did you bring $T$, and especially its action on $B$ by $f\mapsto f\circ T$, into it? This is a perfectly legitimate question. From the point of view where you only look at $A$ and $B$ as self-contained systems, there's no reason to. But my point is that mathematical objects are often embedded in a network of other mathematical objects (such as $T$, or a wide variety of choices of $T$, and their related actions on $A$ and $B$), and when we bring these other objects and the interactions between them into it, it complicates the (overly?) simplistic picture drawn by the PaFL. Maybe some isomorphisms play better than others with the network of relationships in which $A$ and $B$ are embedded.

(3) This is a segue into the matter of categories. A natural isomorphism between two functors is not an isomorphism between two isolated objects. It is some kind of construction that works simultaneously across an entire category, in such a way that the isomorphisms all interact well with a bunch of other maps.

Thus, the way in which the categorical language translates the word "natural" is, loosely, "working simultaneously across all the objects of a whole category, in such a way that it cooperates with the other relevant maps in the category." The naturality lies in the everywhere-at-once-ness and in the fits-in-with-what-was-already-going-on-ness.

To get specific to the case. Let $\mathscr{V}$ be the category of finite dimensional $\mathbb{R}$-vector spaces.

Let's try to carry out what you proposed in the penultimate paragraph of the OP, i.e. try to reconstruct the dualizing functor as a covariant functor; call it $D$. We are already given the map on objects: it sends $V\in\operatorname{Obj}\mathscr{V}$ to its dual $V^*$. We need to design, for every $T\in \operatorname{Hom}(V,W)$, a map $D(T):V^* \rightarrow W^*$, in such a way that the identity map always gets sent to the identity map, and for any $U\xrightarrow{S} V\xrightarrow{T}W$ occurring in $\mathscr{V}$, we have $D(TS) = D(T)D(S)$.

It seems to me that this is actually possible, modulo some axiom-of-choice-type issues. If we separately chose an isomorphism $\phi_V:V\rightarrow V^*$ for each $V\in \operatorname{Obj}\mathscr{V}$, then we could send $T:V\rightarrow W$ to $D(T) = \phi_W T\phi_V^{-1}$, which maps $V^*$ to $W^*$. Furthermore, it seems to me that the maps $\phi_V:V\rightarrow V^*$ would then constitute a natural isomorphism from the identity functor to our new "dualizing functor" $D$.
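For what it's worth, the required identities can be checked directly from the definition $D(T)=\phi_W T\phi_V^{-1}$, for any $U\xrightarrow{S} V\xrightarrow{T}W$:

$$
\begin{aligned}
D(\mathrm{id}_V) &= \phi_V\,\mathrm{id}_V\,\phi_V^{-1} = \mathrm{id}_{V^*},\\
D(T)\,D(S) &= \phi_W T \phi_V^{-1}\,\phi_V S \phi_U^{-1} = \phi_W (TS)\, \phi_U^{-1} = D(TS),\\
D(T)\,\phi_V &= \phi_W T \phi_V^{-1}\,\phi_V = \phi_W\,T.
\end{aligned}
$$

The first two lines say that $D$ is a covariant functor, and the third is the naturality square for the family $(\phi_V)$.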

I think some readers will be given pause by the fact that this construction needs some form of the axiom of choice to be carried out. (I'm out of my set-theoretic league on what's needed. It seems to me that the category at hand is not a small category; thus we need an even stronger axiom like global choice, right?) But you've indicated that the need to make choices doesn't strike you as a barrier to "naturalness," so I assume that this high degree of nonconstructiveness of the construction won't be a problem. However, I see another issue as well:

This construction loses any information related to the fact that $V^*$ is supposed to be the dual of $V$. It completely ignores the fact that the elements of $V^*$ are supposed to be functionals on $V$. We could replace $V^*$ with any other vector space of the same dimension and carry out the same construction. Thus it seems to me $D$ doesn't really send $V$ to its dual in any meaningful sense. Thus, while it uses a nonconstructive axiom (global choice?) to get past the category-theoretic insistence that a natural transformation happen "all at once across a whole category", it doesn't (honestly anyway, it seems to me) meet the second condition that it "cooperates with what was already going on."

This is where the transpose (also called the adjoint) comes in. You ask, "who ordered that?" I.e. isn't the adjoint map extrinsic to the relationship between $V$ and its dual? I contend it's actually essential. If $T:V\rightarrow W$ is a map between vector spaces, then the adjoint $T^*:W^*\rightarrow V^*$ between their duals is defined as $f\overset{T^*}{\mapsto} f\circ T$. This $T^*$ cooperates with what was already going on! I.e. it transforms the dual space in accordance with what the elements in the dual space are supposed to mean. Without a relationship like that between $T$ and $T^*$ that incorporates the fact that the elements of $V^*$ are supposed to be the contents of $\operatorname{Hom}(V,\mathbb{R})$, a functor sending $V$ to $V^*$ is only meaningfully sending it to some other vector space of the same dimension, not actually its dual.
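The formula $f\mapsto f\circ T$ can be modelled directly with functions. The following is only a sketch, with hypothetical sample maps on $\mathbb{R}^2$ and helper names of my own (`dual`, `compose`). It illustrates that the transpose necessarily runs backwards and reverses composition, $(TS)^* = S^*T^*$, which is why dualizing is contravariant.

```python
def dual(T):
    """Transpose of T: sends a functional f on the target to f o T on the source."""
    return lambda f: (lambda v: f(T(v)))

def compose(g, h):
    """Ordinary composition: first h, then g."""
    return lambda x: g(h(x))

# Hypothetical sample maps on R^2, with vectors as tuples.
S = lambda v: (v[0] + v[1], v[1])        # a shear, U -> V
T = lambda v: (2 * v[0], v[1])           # scale the x-axis by 2, V -> W
f = lambda w: 3 * w[0] - w[1]            # a functional on W

lhs = dual(compose(T, S))(f)             # (TS)^* f
rhs = compose(dual(S), dual(T))(f)       # S^* (T^* f)

samples = [(1, 0), (0, 1), (2, -5)]
print([lhs(v) == rhs(v) for v in samples])  # [True, True, True]
```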

Thus a natural isomorphism to the dual really should somehow respect the adjoint, or something like it. Otherwise, what makes the dual the dual?

Obviously the question was soft and this is a soft answer. So let me know if any of this speaks to any of the issues you outlined.


It seems to me that there are two possible meanings (close to each other).

One is that such an isomorphism $**$ is defined via very simple and "expected" means. Another word commonly used for this is canonical. The definition $(v,w):=w(v)$ for $w\in V^*$ identifies $v$ with an element of $V^{**}$ and this does not depend on any additional structure on $V$, such as a metric. In this sense, it is "natural": you pair elements of $V^*$ with elements of $V$, so you can consider it the other way around as a pairing between elements of $V$ and $V^*$.

Another meaning is, as you say, the categorical. This basically says that not only can you apply $**$ to spaces but also to linear maps and the corresponding diagram commutes. That is, you can identify $f: X\to Y$ with $f^{**}: X^{**}\to Y^{**}$ (again, the identification goes via simple and expected means). As before, the functorial definition of $**$ does not depend on any additional structures such as metrics or scalar products.
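Spelled out, the commuting diagram amounts to a one-line computation. Write $\iota_X: X\to X^{**}$ for the evaluation map $\iota_X(x)(w) = w(x)$ (the symbol $\iota$ is my notation), and for $f:X\to Y$ let $f^*: Y^*\to X^*$ be $g\mapsto g\circ f$ and $f^{**}: X^{**}\to Y^{**}$ be $\xi\mapsto \xi\circ f^*$. Then for every $x\in X$ and $g\in Y^*$,

$$
f^{**}\bigl(\iota_X(x)\bigr)(g) = \iota_X(x)\bigl(f^*g\bigr) = (g\circ f)(x) = g\bigl(f(x)\bigr) = \iota_Y\bigl(f(x)\bigr)(g),
$$

that is, $f^{**}\circ\iota_X = \iota_Y\circ f$. No basis, metric or scalar product enters anywhere.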

These two meanings often go hand-in-hand: if something has a simple and expected definition (or a complicated one, but satisfying simple axioms), usually it can be converted into something categorical. It seems to me that if one wants to highlight the categorical meaning, one uses the word natural; if one wants to highlight the simple and expected thing, one often uses the word canonical.

But I'm not sure if this answers your question because I guess that you are aware of all of this.

Extension + edit: a small attempt to give some intuition of why natural transformations are "natural". Consider a finite-dimensional vector space $V$ and two of its bases $\mathcal{B}_1$ and $\mathcal{B}_2$. Let $\mathcal{V}$ be a category with only one object $V$ and morphisms $Mor(V,V)$ being all linear transformations; let $\mathcal{R}$ be a category with one object $\Bbb R^n$ and morphisms being all linear transformations. You can define two functors $F$ and $G$ from $\mathcal{V}$ to $\mathcal{R}$ that express a linear transformation as a matrix with respect to the coordinates $\mathcal{B}_1$ and $\mathcal{B}_2$, respectively. A natural transformation between $F$ and $G$ assigns to the object $V$ in $Obj(\mathcal{V})$ the morphism $x\mapsto C^{-1}x$ of $\Bbb R^n$ (an element of $Mor(\Bbb R^n, \Bbb R^n)$), where $C$ is the transition matrix from basis $\mathcal{B}_1$ to $\mathcal{B}_2$. This morphism is just the coordinate transformation in $\Bbb R^n$.

The fact that it is a natural transformation just reflects that any linear map $f: V\to V$ (an element of $Mor(V,V)$) gives rise to the commutative diagram

$$\begin{array}{ccc} \Bbb R^n & \stackrel{F(f)}{\to} & \Bbb R^n \\ \downarrow_{C^{-1}} && \downarrow_{C^{-1}} \\ \Bbb R^n & \stackrel{G(f)}{\to} & \Bbb R^n \\ \end{array}$$

or equivalently,

$$\begin{array}{ccc} \Bbb R^n & \stackrel{M}{\to} & \Bbb R^n \\ \downarrow_{C^{-1}} && \downarrow_{C^{-1}} \\ \Bbb R^n & \stackrel{C^{-1}MC}{\to} & \Bbb R^n \\ \end{array}$$

where $M$ is the matrix expression of $F(f)$. In physics, this corresponds to a change of observer: observer $\mathcal{B}_2$ will just "see" a vector $C^{-1}x$ and/or "use" the matrix $C^{-1}MC$ whenever observer $\mathcal{B}_1$ "sees" the vector $x$ and "uses" the matrix $M$. But they both see the same "real object". In this sense, the natural transformation is "natural".
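The commutativity of the square can be checked on a concrete example. The sketch below uses exact rational arithmetic; the particular matrices $C$ and $M$ and the helper names are hypothetical choices of mine.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_2x2(C):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = C
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

C = [[1, 1], [0, 1]]      # hypothetical transition matrix from B_1 to B_2
M = [[2, 0], [0, 1]]      # F(f): the matrix of f in the basis B_1

C_inv = inverse_2x2(C)
G_f = matmul(matmul(C_inv, M), C)   # G(f) = C^{-1} M C

# Naturality square: going right-then-down equals going down-then-right.
top_then_down = matmul(C_inv, M)        # C^{-1} . F(f)
down_then_bottom = matmul(G_f, C_inv)   # G(f) . C^{-1}
print(top_then_down == down_then_bottom)  # True
```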


Consider the space $l_0$ (usually denoted $c_0$) of real sequences $(x_n)_{n\in N}$ that converge to $0,$ with $\|(x_n)_n\|=\sup_n |x_n|,$ and its dual $l_1,$ the space of absolutely summable real sequences $(y_n)_{n\in N}$ with norm $\|(y_n)_n\|=\sum_{n\in N}|y_n|<\infty.$ The space $l_0$ contains many positive sequences that are not summable, e.g. $y_n=1/n$ for each $n.$ We should expect an embedding $E$ from $l_0$ into $l_1$ to preserve the algebraic structure and the topological structure; in other words, $E$ should be a continuous linear bijection onto its image, and $E^{-1},$ acting on the image of $E,$ should also be continuous. Such an $E$ doesn't exist. As a special case of a fairly recent theorem, $l_0$ and $l_1$ are homeomorphic, but by a non-linear mapping $F,$ so the algebraic structure is not preserved by $F.$
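The example $y_n = 1/n$ can be illustrated numerically. This is only a finite truncation (the cutoff `N` is an arbitrary choice of mine), so it suggests rather than proves the divergence: the sup norm stays at $1$ while the partial sums of $\sum |y_n|$ keep growing, roughly like $\log N$.

```python
N = 10_000  # arbitrary truncation point
y = [1 / n for n in range(1, N + 1)]   # first N terms of y_n = 1/n

sup_norm = max(abs(t) for t in y)      # sup norm of the truncated sequence
partial_l1 = sum(abs(t) for t in y)    # partial sum of |y_n| up to N
```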