What does the covariance function really tell us?

First, we must also require that the means of the two processes coincide; otherwise, set $\Omega=\{a,b\}, \mathbb{P}(\{a\})=\frac{1}{2}, \mathbb{P}(\{b\})=\frac{1}{2}$ and define $$\forall n\in \mathbb{N}, \forall \omega\in\Omega, X_n(\omega)=1$$ and $$\forall n\in \mathbb{N}, \forall \omega\in\Omega, Y_n(\omega)=0.$$ Then $$\forall m,n \in \mathbb{N}, \operatorname{cov}(X_m,X_n)=0=\operatorname{cov}(Y_m,Y_n),$$ but $$\forall n \in \mathbb{N}, \|X_n\|_2=1\neq0=\|Y_n\|_2,$$ so there can exist no $\Phi : L^2(\mathbb{P})\rightarrow L^2(\mathbb{P})$ such that $\forall f \in L^2(\mathbb{P}), \|\Phi(f)\|_2=\|f\|_2$ and $\forall n \in \mathbb{N}, \Phi(X_n) = Y_n$; otherwise $$1=\|X_1\|_2=\|\Phi(X_1)\|_2=\|Y_1\|_2=0,$$ a contradiction.
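Here is a quick numerical sanity check of this two-point counterexample (a minimal Python sketch; the sample space is encoded as a length-2 array of outcomes with equal weights, and `mean`, `cov`, `l2_norm` are ad hoc helpers, not library functions):

```python
import numpy as np

# Two-point sample space {a, b} with P({a}) = P({b}) = 1/2,
# encoded as probability weights over the two outcomes.
p = np.array([0.5, 0.5])

X = np.array([1.0, 1.0])  # X_n(omega) = 1 for every n
Y = np.array([0.0, 0.0])  # Y_n(omega) = 0 for every n

def mean(f):
    return np.sum(p * f)

def cov(f, g):
    return mean(f * g) - mean(f) * mean(g)

def l2_norm(f):
    return np.sqrt(mean(f * f))

print(cov(X, X), cov(Y, Y))    # 0.0 0.0 -- identical covariances
print(l2_norm(X), l2_norm(Y))  # 1.0 0.0 -- but different L2 norms
```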

Second, notice that if we interpret the term isometry as a linear map from $L^2$ onto itself such that $\forall f\in L^2, \|\Phi(f)\|_2=\|f\|_2$, then the following provides a counterexample. Define the processes $X$ and $Y$, indexed by $\mathbb{Z}$, on the set $\Omega:=[-\pi,\pi]$, equipped with the normalized Lebesgue measure $\operatorname{d}\mathbb{P}=\frac{\operatorname{d}m}{2\pi}$, by $\forall n\in \mathbb{Z}, \forall \omega \in [-\pi,\pi], X_n(\omega)=e^{in\omega}$, and $\forall n >0, \forall \omega \in [-\pi,\pi], Y_n(\omega)=e^{i(n+1)\omega}$, and $\forall n\le0, \forall \omega \in [-\pi,\pi], Y_n(\omega)=e^{in\omega}$. Both processes have the same mean and the same covariance, because $$ \forall n \in \mathbb{Z} \setminus \{0\}, \mathbb{E}(X_n)=0=\mathbb{E}(Y_n)$$ and $$\mathbb{E}(X_0)=1=\mathbb{E}(Y_0)$$ and $$\forall m,n\in\mathbb{Z}, \operatorname{Cov}(X_m,X_n)=\delta_{m,n}=\operatorname{Cov}(Y_m,Y_n).$$ Now suppose there exists a linear map $\Phi$ of $L^2$ onto itself such that $\forall f\in L^2, \|\Phi(f)\|_2=\|f\|_2$ and $\forall n\in \mathbb{Z}, \Phi(X_n)=Y_n$. By surjectivity, pick $f\in L^2(\mathbb{P})$ such that $\Phi(f)=X_1$. Since $(X_n)_{n\in\mathbb{Z}}$ is an orthonormal basis of $L^2(\mathbb{P})$, we have: $$X_1=\Phi(f)=\Phi\left(\sum_{n\in\mathbb{Z}}\langle f,X_n\rangle X_n\right)=\sum_{n\in\mathbb{Z}}\langle f,X_n\rangle\Phi(X_n)=\sum_{n\in\mathbb{Z}}\langle f,X_n\rangle Y_n,$$ and so: $$1=\langle X_1,X_1\rangle=\left\langle\sum_{n\in\mathbb{Z}}\langle f,X_n\rangle Y_n,X_1\right\rangle=\sum_{n\in\mathbb{Z}}\langle f,X_n\rangle\langle Y_n,X_1\rangle=\sum_{n\in\mathbb{Z}}\langle f,X_n\rangle \cdot 0=0,$$ a contradiction. So we cannot interpret the term isometry in this manner.
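These orthogonality relations can also be checked numerically. A minimal sketch (the grid size and the frequency range $-4,\dots,4$ are arbitrary truncations; the trapezoid rule is essentially exact for these periodic integrands):

```python
import numpy as np

# Grid on [-pi, pi] for the normalized Lebesgue measure dm / (2*pi)
omega = np.linspace(-np.pi, np.pi, 4097)
h = omega[1] - omega[0]

def inner(f, g):
    # <f, g> = E(f * conj(g)), composite trapezoid rule normalized by 2*pi
    v = f * np.conj(g)
    return h * (np.sum(v) - 0.5 * v[0] - 0.5 * v[-1]) / (2 * np.pi)

X = {n: np.exp(1j * n * omega) for n in range(-4, 5)}
Y = {n: np.exp(1j * (n + 1) * omega) if n > 0 else np.exp(1j * n * omega)
     for n in range(-4, 5)}

print(abs(inner(X[2], X[2]) - 1) < 1e-9)              # <X_n, X_n> = 1
print(abs(inner(X[2], X[3])) < 1e-9)                  # <X_m, X_n> = 0 for m != n
print(all(abs(inner(Y[n], X[1])) < 1e-9 for n in Y))  # <Y_n, X_1> = 0 for all n
```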

However, even if we interpret the term isometry dropping the surjectivity condition, i.e. as a linear map from $L^2$ into itself such that $\forall f\in L^2, \|\Phi(f)\|_2=\|f\|_2$, we cannot require that $\forall t, \Phi(X_t) = Y_t$: there are examples where such a map does not exist. Indeed, set $S=Y$ and $T=X$, where $X$ and $Y$ are as in the previous example, and suppose $\Phi$ satisfies these requirements, so that $\forall n \in \mathbb{Z}, \Phi(S_n)=T_n$. Then $$\overline{\operatorname{span}(\{\Phi(S_n)\}_{n\in\mathbb{Z}})}=\overline {\operatorname{span}(\{T_n\}_{n\in\mathbb{Z}})}=L^2(\mathbb{P}),$$ and so $$\Phi(T_1)\in \overline{\operatorname{span}(\{\Phi(S_n)\}_{n\in\mathbb{Z}})}.$$ Since $\{\Phi(S_n)\}_{n\in\mathbb{Z}}$ is orthonormal and $\Phi$ is linear and continuous, $$\Phi(T_1)=\sum_{n\in\mathbb{Z}} \langle \Phi(T_1),\Phi(S_n)\rangle\Phi(S_n)=\Phi\left(\sum_{n\in\mathbb{Z}} \langle \Phi(T_1),\Phi(S_n)\rangle S_n\right),$$ so $$\Phi\left(T_1-\sum_{n\in\mathbb{Z}} \langle \Phi(T_1),\Phi(S_n)\rangle S_n\right)= \Phi(T_1)-\Phi\left(\sum_{n\in\mathbb{Z}} \langle \Phi(T_1),\Phi(S_n)\rangle S_n\right)=0,$$ but $$T_1-\sum_{n\in\mathbb{Z}} \langle \Phi(T_1),\Phi(S_n)\rangle S_n = T_1-\sum_{n\le 0} \langle \Phi(T_1),\Phi(S_n)\rangle S_n - \sum_{n\ge 1} \langle \Phi(T_1),\Phi(S_n)\rangle S_n = T_1-\sum_{n\le 0} \langle \Phi(T_1),\Phi(S_n)\rangle T_n - \sum_{n\ge 1} \langle \Phi(T_1),\Phi(S_n)\rangle T_{n+1} \neq 0,$$ since the coefficient of $T_1$ on the right-hand side is $1$. Then $\Phi$ is not injective, while every isometry is injective, a contradiction.
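The obstruction is that $T_1 = X_1$ lies outside $\overline{\operatorname{span}(\{S_n\}_{n\in\mathbb{Z}})}$. This is easy to see numerically in a truncated model where each Fourier mode is a coordinate vector (a sketch; the truncation level `K` is arbitrary): the least-squares projection of $X_1$ onto the $S_n$ leaves a residual of full norm.

```python
import numpy as np

K = 5
freqs = list(range(-K, K + 2))  # truncated Fourier frequencies -K, ..., K+1

def mode(k):
    # coordinate vector representing the Fourier mode of frequency k
    v = np.zeros(len(freqs))
    v[freqs.index(k)] = 1.0
    return v

# S_n = Y_n: frequency n+1 for n > 0, frequency n for n <= 0
S = np.column_stack([mode(n + 1) if n > 0 else mode(n) for n in range(-K, K + 1)])
T1 = mode(1)  # T_1 = X_1

coeffs, *_ = np.linalg.lstsq(S, T1, rcond=None)
print(np.linalg.norm(T1 - S @ coeffs))  # 1.0: no combination of the S_n reaches T_1
```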

So we have to interpret the result in yet another manner. The correct formulation is: there exists a linear map $\Phi$ of $L^2$ into itself such that $\forall f\in L^2, \|\Phi(f)\|_2=\|f\|_2$ and either:

  • $\forall t, \Phi(X_t) = Y_t$ or
  • $\forall t, \Phi(Y_t) = X_t.$

Now, let's prove the result when $L^2(\Omega,\mathcal{A},\mathbb{P})$ is separable and, say, the index set is $[0,+\infty)$. We can assume that $\exists t\in[0,+\infty), X_t\not\equiv 0$; otherwise the conclusion follows immediately, since both processes are then identically zero. The finite-dimensional case is also straightforward, so we assume that we are in the infinite-dimensional case.

Claim 1: there exists $\{t_n\}_{n\in\mathbb{N}}\subset[0,+\infty)$ such that:

  • $\{X_{t_n}\}_{n\in\mathbb{N}}$ is linearly independent;
  • $\overline{\operatorname{span}(\{X_{t_n}\}_{n\in\mathbb{N}})}=\overline{\operatorname{span}(\{X_t\}_{t\in{[0,+\infty)}})}=:V.$

Proof: since $L^2(\Omega,\mathcal{A},\mathbb{P})$ is separable, so is $V$. Choose a sequence $\{h_n\}_{n\in\mathbb{N}}$ dense in $V$. Then, for each $n\in\mathbb{N}$ there exists a sequence $\{g_{n,k}\}_{k\in\mathbb{N}}\subset {\operatorname{span}(\{X_t\}_{t\in{[0,+\infty)}})}$ such that: $$\forall n\in\mathbb{N}, \|g_{n,k}-h_n\|\rightarrow 0, \quad k\rightarrow \infty.$$ Then, for each $n,k\in\mathbb{N}$ there exist $t_{n,k,1},\dots,t_{n,k,m_{n,k}}\in[0,+\infty)$ such that $$ g_{n,k} \in \operatorname{span}(X_{t_{n,k,1}},\dots,X_{t_{n,k,m_{n,k}}}).$$ Since a countable union of countable sets is countable, the set $$\bigcup _{n\in\mathbb{N}} \bigcup_{k\in\mathbb{N}} \bigcup _{j=1}^{m_{n,k}} \{X_{t_{n,k,j}}\}$$ is countable, and so we can extract from it a maximal (countable) linearly independent subset, say $$\{X_{t_n}\}_{n\in\mathbb{N}},$$ obtaining $$V \supseteq \overline {\operatorname{span}(\{X_{t_n}\}_{n\in\mathbb{N}})} = \overline {\operatorname{span}\left(\bigcup _{n\in\mathbb{N}} \bigcup_{k\in\mathbb{N}} \bigcup _{j=1}^{m_{n,k}} \{X_{t_{n,k,j}}\}\right)}\\ \supseteq \overline {\operatorname{span}\left(\bigcup _{n\in\mathbb{N}} \bigcup_{k\in\mathbb{N}} \{g_{n,k}\}\right)} \supseteq \overline {\bigcup _{n\in\mathbb{N}} \{h_{n}\}}=V,$$ so all these closed subspaces coincide with $V$,

and so the claim is proved.

Claim 2: $\{Y_{t_n}\}_{n\in\mathbb{N}}$ is linearly independent.

Proof: first, notice that since $X$ and $Y$ have the same mean and the same covariance function, we have: $$\forall s,t\in[0,+\infty), \langle X_s,X_t\rangle=\operatorname{cov}(X_s,X_t)+\mathbb{E}(X_s)\overline{\mathbb{E}(X_t)} \\ =\operatorname{cov}(Y_s,Y_t)+\mathbb{E}(Y_s)\overline{\mathbb{E}(Y_t)}=\langle Y_s,Y_t\rangle.$$ Then, for every $n\in\mathbb{N}$, the Gram matrices of $X_{t_1},\dots,X_{t_n}$ and of $Y_{t_1},\dots,Y_{t_n}$ coincide, i.e. $$(\langle X_{t_h}, X_{t_k}\rangle)_{h,k\in\{1,\dots,n\}} = (\langle Y_{t_h}, Y_{t_k}\rangle)_{h,k\in\{1,\dots,n\}}.$$ Since $X_{t_1},\dots,X_{t_n}$ are linearly independent, their Gram matrix is invertible; then so is the Gram matrix of $Y_{t_1},\dots,Y_{t_n}$, hence $Y_{t_1},\dots,Y_{t_n}$ are linearly independent, and the claim is proved.
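Here is a numerical illustration of this Gram-matrix transfer (a minimal sketch in $\mathbb{R}^6$ with four vectors; applying a random orthogonal matrix is just one concrete way to produce a second family with the same Gram matrix):

```python
import numpy as np

rng = np.random.default_rng(0)

# A linearly independent family X_{t_1}, ..., X_{t_4} in R^6
Xv = rng.standard_normal((6, 4))

# A second family with the same Gram matrix: here Y = Q X for an orthogonal Q
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
Yv = Q @ Xv

Gx = Xv.T @ Xv  # Gram matrix of the X family
Gy = Yv.T @ Yv  # Gram matrix of the Y family

print(np.allclose(Gx, Gy))             # True: identical Gram matrices
print(np.linalg.det(Gx))               # nonzero: the Gram matrix is invertible
print(np.linalg.matrix_rank(Yv) == 4)  # so the Y family is independent too
```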

Now, use the Gram–Schmidt orthonormalization process to build an orthonormal basis $\{e_n\}_{n\in\mathbb{N}}$ of $V$ such that $$X_{t_1}\in \operatorname{span}(e_1),\ X_{t_2}\in \operatorname{span}(e_1,e_2),\ X_{t_3}\in \operatorname{span}(e_1,e_2,e_3),\ \dots,\ X_{t_n}\in \operatorname{span}(e_1,\dots,e_n),\ \dots$$ Do the same in $\overline{\operatorname{span}(\{Y_{t_n}\}_{n\in\mathbb{N}})}$, i.e. build an orthonormal basis $\{f_n\}_{n\in\mathbb{N}}$ of $\overline{\operatorname{span}(\{Y_{t_n}\}_{n\in\mathbb{N}})}$ such that $$Y_{t_1}\in \operatorname{span}(f_1),\ Y_{t_2}\in \operatorname{span}(f_1,f_2),\ Y_{t_3}\in \operatorname{span}(f_1,f_2,f_3),\ \dots,\ Y_{t_n}\in \operatorname{span}(f_1,\dots,f_n),\ \dots$$
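The crucial feature of Gram–Schmidt here is that the coefficients are computed from inner products alone, so the two families share them. Continuing the previous sketch, the coefficients can be extracted from the Cholesky factor of the Gram matrix (if $G = LL^{*}$, the columns of $(L^{-1})^{*}$ are exactly the triangular Gram–Schmidt coefficients):

```python
# Continuing the previous sketch: Gram-Schmidt from the Gram matrix alone.
# If Gx = L @ L.T (Cholesky), then A = inv(L).T is upper triangular and
# e_j = sum_{k <= j} A[k, j] X_{t_k} is the Gram-Schmidt orthonormalization.
L = np.linalg.cholesky(Gx)
A = np.linalg.inv(L).T

E = Xv @ A  # orthonormal basis e_1, ..., e_4 adapted to the X family
F = Yv @ A  # the SAME coefficients orthonormalize the Y family

print(np.allclose(E.T @ E, np.eye(4)))  # the e_n are orthonormal
print(np.allclose(F.T @ F, np.eye(4)))  # the f_n are orthonormal
print(np.allclose(Xv.T @ E, Yv.T @ F))  # <X_{t_n}, e_j> = <Y_{t_n}, f_j>
```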

Then there exists an isometry $\Phi$ from $V$ onto $\overline{\operatorname{span}(\{Y_{t_n}\}_{n\in\mathbb{N}})}$ such that: $$\forall n \in \mathbb{N}, \Phi(e_n)=f_n.$$

Claim 3:

  • $\forall t\in[0,+\infty), \forall n\in \mathbb{N}, \langle X_t, e_n \rangle = \langle Y_t, f_n \rangle.$
  • $\overline{\operatorname{span}(\{Y_{t_n}\}_{n\in\mathbb{N}})}=\overline{\operatorname{span}(\{Y_t\}_{t\in[0,+\infty)})}=:W.$

Proof: first, recall that since $X$ and $Y$ have the same mean and the same covariance function, we have: $$\forall s,t\in[0,+\infty), \langle X_s,X_t\rangle=\langle Y_s,Y_t\rangle.$$ Now, notice that if $j\in\mathbb{N}$ and $e_j= \sum_{k=1}^j a_{j,k} X_{t_k}$, then $f_j=\sum_{k=1}^j a_{j,k} Y_{t_k}$: the Gram–Schmidt coefficients (see the determinant formula for $e_j$ and $f_j$) depend only on the Gram matrix, and the two Gram matrices coincide. So $$\forall j,n\in\mathbb{N}, \langle X_{t_n},e_j\rangle = \left\langle X_{t_n},\sum_{k=1}^j a_{j,k} X_{t_k}\right\rangle = \sum_{k=1}^j \overline{a_{j,k}} \langle X_{t_n}, X_{t_k}\rangle \\ = \sum_{k=1}^j \overline{a_{j,k}} \langle Y_{t_n}, Y_{t_k}\rangle = \left\langle Y_{t_n}, \sum_{k=1}^j a_{j,k} Y_{t_k}\right\rangle = \langle Y_{t_n}, f_j\rangle.$$ Then, if $t\in[0,+\infty)$: $$X_t = \sum_{j=1}^{+\infty} \langle X_t , e_j \rangle e_j = \sum_{j=1}^{+\infty} \left\langle X_t , \sum_{k=1}^j a_{j,k} X_{t_k} \right\rangle e_j = \sum_{j=1}^{+\infty} \sum_{k=1}^j \overline{a_{j,k}} \langle X_t , X_{t_k} \rangle e_j,$$

and there exists $Z\in \overline{\operatorname{span}(\{Y_{t_n}\}_{n\in\mathbb{N}})}^{\perp}$ such that:

$$Y_t = Z + \sum_{j=1}^{+\infty} \langle Y_t , f_j \rangle f_j = Z + \sum_{j=1}^{+\infty} \left\langle Y_t , \sum_{k=1}^j a_{j,k} Y_{t_k} \right\rangle f_j = Z + \sum_{j=1}^{+\infty} \sum_{k=1}^j \overline{a_{j,k}} \langle Y_t , Y_{t_k} \rangle f_j,$$ so, if $n\in\mathbb{N}$, then: $$\langle X_t, e_n \rangle = \left\langle \sum_{j=1}^{+\infty} \left\langle X_t , \sum_{k=1}^j a_{j,k} X_{t_k} \right\rangle e_j , e_n \right\rangle = \sum_{j=1}^{+\infty} \sum_{k=1}^j \overline{a_{j,k}} \langle X_t , X_{t_k} \rangle \langle e_j , e_n \rangle = \sum_{j=1}^{+\infty} \sum_{k=1}^j \overline{a_{j,k}} \langle Y_t , Y_{t_k} \rangle \langle f_j , f_n \rangle \\ = \left\langle \sum_{j=1}^{+\infty} \left\langle Y_t , \sum_{k=1}^j a_{j,k} Y_{t_k} \right\rangle f_j , f_n \right\rangle = \left\langle Z+ \sum_{j=1}^{+\infty} \left\langle Y_t , \sum_{k=1}^j a_{j,k} Y_{t_k} \right\rangle f_j , f_n \right\rangle = \langle Y_t, f_n \rangle,$$ where the second-to-last equality uses $Z \perp f_n$, and so the first part of the claim is proved.

For the second, just note that $$\|Y_t\|_2^2=\|X_t\|_2^2=\left\|\sum_{j=1}^{+\infty} \sum_{k=1}^j \overline{a_{j,k}} \langle X_t , X_{t_k} \rangle e_j\right\|_2^2 = \sum_{j=1}^{+\infty} \left| \sum_{k=1}^j \overline{a_{j,k}} \langle X_t , X_{t_k} \rangle\right|^2 \\ = \sum_{j=1}^{+\infty} \left| \sum_{k=1}^j \overline{a_{j,k}} \langle Y_t , Y_{t_k} \rangle\right|^2 = \left\|\sum_{j=1}^{+\infty} \sum_{k=1}^j \overline{a_{j,k}} \langle Y_t , Y_{t_k} \rangle f_j\right\|_2^2 = \left\|\sum_{j=1}^{+\infty} \langle Y_t , f_j \rangle f_j\right\|_2^2 \\ \le \left\|\sum_{j=1}^{+\infty} \langle Y_t , f_j \rangle f_j\right\|_2^2 + \|Z\|_2^2 = \left\|Z+\sum_{j=1}^{+\infty} \langle Y_t , f_j \rangle f_j\right\|_2^2 = \|Y_t\|_2^2,$$ so equality holds throughout and $Z=0$, i.e. $$Y_t \in \overline{\operatorname{span}(\{Y_{t_n}\}_{n\in\mathbb{N}})}.$$ Since $t$ is arbitrary, we obtain $$\overline{\operatorname{span}(\{Y_{t_n}\}_{n\in\mathbb{N}})}=\overline{\operatorname{span}(\{Y_t\}_{t\in[0,+\infty)})},$$ i.e. the last part of the claim is proved.

Notice that: $$\forall t\in[0,+\infty), \Phi(X_t)=\Phi(\sum_{n\in\mathbb{N}} \langle X_t, e_n\rangle e_n)=\sum_{n\in\mathbb{N}} \langle X_t, e_n\rangle \Phi(e_n)\\=\sum_{n\in\mathbb{N}} \langle X_t, e_n\rangle f_n=\sum_{n\in\mathbb{N}} \langle Y_t, f_n\rangle f_n=Y_t.$$
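In the finite-dimensional sketch above, this $\Phi$ is just the matrix sending each $e_n$ to $f_n$, and one can verify directly that it carries the $X$ family to the $Y$ family:

```python
# Continuing the sketch: on V = span{e_n}, Phi acts as v -> sum_n <v, e_n> f_n,
# which in matrix form (E, F having orthonormal columns) is Phi = F @ E.T.
Phi = F @ E.T

print(np.allclose(Phi @ Xv, Yv))  # Phi(X_{t_k}) = Y_{t_k} for every k
print(np.allclose(np.linalg.norm(Phi @ Xv, axis=0),
                  np.linalg.norm(Xv, axis=0)))  # norms are preserved on V
```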

Now either $\dim (V^\perp)\le\dim (W^\perp)$ or $\dim (W^\perp)<\dim (V^\perp)$. Assume that $\dim (V^\perp)\le\dim (W^\perp)$ (if the other case holds, just switch the role of $X$ with the role of $Y$, the role of $V$ with the role of $W$, the role of $V^\perp$ with the role of $W^\perp$ and the role of $\Phi$ with the role of $\Phi^{-1}$ in what follows).

Then there exists an isometry $\Psi$ from $L^2(\Omega,\mathcal{A},\mathbb{P})$ into $L^2(\Omega,\mathcal{A},\mathbb{P})$ such that:

  • $\forall v \in V, \Psi(v)=\Phi(v);$
  • $\Psi(V^{\perp})\subset W^{\perp}$.
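Such a $\Psi$ is obtained by mapping an orthonormal basis of $V^{\perp}$ injectively into an orthonormal basis of $W^{\perp}$, which is possible precisely because $\dim (V^\perp)\le\dim (W^\perp)$. Continuing the finite-dimensional sketch (where both complements happen to have dimension 2):

```python
# Continuing the sketch: extend Phi to an isometry Psi of the whole space.
def orth_complement(B):
    # orthonormal basis of the orthogonal complement of the column span of B
    U, _, _ = np.linalg.svd(B, full_matrices=True)
    return U[:, B.shape[1]:]

Vp = orth_complement(E)  # orthonormal basis of V-perp
Wp = orth_complement(F)  # orthonormal basis of W-perp

# Psi agrees with Phi on V and maps V-perp into W-perp
Psi = F @ E.T + Wp @ Vp.T

print(np.allclose(Psi @ Xv, Yv))            # still Psi(X_t) = Y_t
print(np.allclose(Psi.T @ Psi, np.eye(6)))  # Psi is an isometry of R^6
```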

Now, noticing that $$\forall t\in[0,+\infty), \Psi(X_t) = \Phi(X_t) = Y_t,$$ we finally obtain the conclusion.

Some final remarks: regarding your doubt about how such an isometry could exist knowing nothing about the distributions of the random variables involved, notice that being isometric has nothing to do with having the same distribution. For example, on $\Omega=\{0,1\}$ with $\mathbb{P}(\{0\})=1/2=\mathbb{P}(\{1\})$, the variable $X(0)=1, X(1)=1$ has the same $L^2(\mathbb{P})$ norm as $Y(0)=-1, Y(1)=1$, but the range of the first is $\{1\}$ while the range of the second is $\{-1,1\}$, so they do not have the same distribution. Actually, covariance is a scalar product on $L^2(\mathbb{P})/\mathbb{R}$, so it is a way of talking about lengths and angles in the space of square-integrable random variables, once we have identified random variables that differ by a constant.
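And the same kind of quick check works for this last example (a minimal sketch on the two-point space):

```python
import numpy as np

p = np.array([0.5, 0.5])   # P({0}) = P({1}) = 1/2
X = np.array([1.0, 1.0])   # X(0) = 1, X(1) = 1
Y = np.array([-1.0, 1.0])  # Y(0) = -1, Y(1) = 1

l2_norm = lambda f: np.sqrt(np.sum(p * f**2))
print(l2_norm(X), l2_norm(Y))  # 1.0 1.0: equal L2 norms
print(sorted(set(X.tolist())), sorted(set(Y.tolist())))  # ranges {1} vs {-1, 1}
```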