Asymmetric Hessian matrix

Looks like the problem was solved by G. P. Tolstov in 1949.

  1. If $f\colon\mathbb R^2\to\mathbb R$ has all mixed derivatives of second order everywhere, then $f_{xy}=f_{yx}$ almost everywhere. Reference: G. P. Tolstov, “On partial derivatives”, Izv. Akad. Nauk SSSR Ser. Mat., 13:5 (1949), 425–446 (MR0031544).

    English translation available as Amer. Math. Soc. Translation no. 69 (1952), 30pp. (MR0047758).

  2. There exists a function $f\colon[0,1]^2\to\mathbb R$ s.t. $f_{xy}$ and $f_{yx}$ exist everywhere but $f_{xy}-f_{yx}$ is the characteristic function of a set of positive measure (proposition I); there also exists a function $f$ as above s.t. $f_{xy}\ne f_{yx}$ almost everywhere (proposition II). Reference: G. P. Tolstov, “On the mixed second derivative”, Mat. Sb. (N.S.), 24(66):1 (1949), 27–51 (MR0029971).


A (very rough) summary of the proof of proposition I from the second paper:

  1. Choose some variant of a thick (positive-measure) Cantor set $P=\bigcap\limits_n P_n$ (each $P_n$ is a union of $2^n$ intervals).

  2. Let $f_n$ be a sequence of continuous piecewise linear functions s.t. $f'_n|_{P_n}=1$ and $f'_n\le 0$ on $I\setminus P_n$ (here $I=[0,1]$).

  3. Choose some $C^1$-smoothing $\phi_n$ of $f_n$ s.t. $\phi'_n|_P=1$ and $|\phi_n|<2^{1-n}$ (plus some other bounds from page 31).

  4. Define $\psi_0=x-\phi_0$ and $\psi_n=\phi_{n-1}-\phi_n$ for $n\ge1$. Note that $\psi'_n|_P=0$ (plus the other bounds (11) from page 32).

  5. Define $F(x,y)=\sum \phi_n(x)\psi_n(y)$.

  6. $F_x=\sum\phi'_n(x)\psi_n(y)$ (since convergence is uniform — which relies on the delicate choice of smoothing, AFAICS). So $F_x|_{P\times I}=\sum\psi_n(y)=y$ and $F_{xy}|_{P\times P}=1$ (see the worked computation after this summary).

  7. $F_y=\sum\phi_n(x)\psi'_n(y)$ (likewise). So $F_y|_{I\times P}=0$ and $F_{yx}|_{P\times P}=0$.

  8. $F_{xy}$ and $F_{yx}$ exist on the whole of $I^2$... for some reason.

Something like that.
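
To unpack steps 6 and 7 a little (this is just my reading of the summary above, using only the properties listed in steps 3 and 4): the $\psi_n$ telescope, so for every $y$,

$$\sum_{n=0}^{N}\psi_n(y)=\bigl(y-\phi_0(y)\bigr)+\sum_{n=1}^{N}\bigl(\phi_{n-1}(y)-\phi_n(y)\bigr)=y-\phi_N(y)\to y,$$

since $|\phi_N|<2^{1-N}\to0$. Hence, for $x\in P$ (where $\phi'_n(x)=1$ for every $n$),

$$F_x(x,y)=\sum\phi'_n(x)\psi_n(y)=\sum\psi_n(y)=y,$$

and differentiating in $y$ gives $F_{xy}=1$ on $P\times I$, in particular on $P\times P$. On the other hand, for $y\in P$ (where $\psi'_n(y)=0$ for every $n$),

$$F_y(x,y)=\sum\phi_n(x)\psi'_n(y)=0$$

for all $x$, and differentiating in $x$ gives $F_{yx}=0$ on $I\times P\supseteq P\times P$.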


I can give a proof of the following statement.

Let $U\subseteq \mathbb{R}^2$ be open, and $f\colon U\to\mathbb{R}$ be such that $f_{xx}$, $f_{xy}$, $f_{yx}$ and $f_{yy}$ are well defined on some Lebesgue measurable $A\subseteq U$. Then, $f_{xy}=f_{yx}$ almost everywhere on $A$.

[Note: This is after seeing Grigory's answer. The statement here is a bit stronger than Tolstov's statement 1 quoted in that answer. I haven't, as yet, been able to see the translation of that paper, so I'm not sure whether his argument actually gives the same thing.]

In fact, we can show that $$ f_{xy}=f_{yx}=\lim_{h\to0}\frac{1}{h^2}\left(f(x+h,y+h)+f(x,y)-f(x+h,y)-f(x,y+h)\right)\ \ {\rm(1)} $$ almost everywhere on $A$, where the limit is understood in the sense of local convergence in measure (functions $g_{(h)}$ tend to a limit $g$ locally in measure if the measure of $\{x\in S\colon\vert g_{(h)}(x)-g(x)\vert > \epsilon\}$ tends to zero as $h\to0$, for each $\epsilon > 0$ and $S\subseteq A$ of finite measure).
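
As a quick numerical illustration of the difference quotient appearing in (1) (my own addition, not part of the argument): for a smooth function it converges pointwise to the mixed derivative, which is the classical situation this statement extends. A minimal sketch in Python, with the test function $f(x,y)=\sin(x)e^y$ and the evaluation point chosen arbitrarily:

```python
# Check the second-order difference quotient from (1) on a smooth test
# function.  For f(x, y) = sin(x) * exp(y), the mixed derivative is
# f_xy = f_yx = cos(x) * exp(y).  (Illustration only: the statement above
# concerns the rougher setting where the limit only holds in measure.)
import math

def f(x, y):
    return math.sin(x) * math.exp(y)

def mixed_quotient(x, y, h):
    # (f(x+h, y+h) + f(x, y) - f(x+h, y) - f(x, y+h)) / h^2
    return (f(x + h, y + h) + f(x, y) - f(x + h, y) - f(x, y + h)) / h ** 2

x0, y0 = 0.7, -0.3
exact = math.cos(x0) * math.exp(y0)
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    q = mixed_quotient(x0, y0, h)
    print(f"h = {h:g}: quotient = {q:.8f}, |error| = {abs(q - exact):.2e}")
```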

First, there are some technical issues regarding measurability. However, as $f_x$ and $f_y$ are assumed to exist on $A$, $f$ is continuous along the intersection of $A$ with horizontal and vertical lines, which implies that its restriction to $A$ is Lebesgue measurable. Then, all the partial derivatives must also be measurable when restricted to $A$. By Lusin's theorem, we can reduce to the case where all the partial derivatives are continuous when restricted to $A$. Also, without loss of generality, take $A$ to be bounded.

Fix an $\epsilon > 0$. Then, for any $\delta > 0$, let $A_\delta$ be the set of $(x,y)\in A$ such that

  • $\left\vert f_{yy}(x+h,y)-f_{yy}(x,y)\right\vert\le\epsilon$ for all $\vert h\vert \le\delta$ with $(x+h,y)\in A$.
  • $\left\vert f_y(x+h,y)-f_y(x,y)-f_{yx}(x,y)h\right\vert\le\epsilon\vert h\vert$ for all $\vert h\vert\le\delta$ with $(x+h,y)\in A$.
  • $\left\vert f(x,y+h)-f(x,y)-f_y(x,y)h-\frac12f_{yy}(x,y)h^2\right\vert\le\epsilon h^2$ for all $\vert h\vert\le\delta$ with $(x,y+h)\in A$.

Each $A_\delta$ is Lebesgue measurable, and the existence and continuity of the partial derivatives restricted to $A$ imply that $A_\delta$ increases to $A$ as $\delta$ decreases to zero. By monotone convergence, the measure of $A\setminus A_\delta$ decreases to zero.
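
To spell out the monotonicity for one of the conditions, take the second as a representative case: since $f_{yx}(x,y)$ is the limit of $\bigl(f_y(x+h,y)-f_y(x,y)\bigr)/h$ as $h\to0$, for each $(x,y)\in A$ there is a $\delta_0>0$ (depending on the point and on $\epsilon$) with
$$\left\vert f_y(x+h,y)-f_y(x,y)-f_{yx}(x,y)h\right\vert\le\epsilon\vert h\vert\quad\text{whenever }\vert h\vert\le\delta_0\text{ and }(x+h,y)\in A,$$
so, as far as this condition is concerned, $(x,y)\in A_\delta$ for every $\delta\le\delta_0$. The other two conditions are covered by the same pointwise argument, using the continuity of $f_{yy}$ restricted to $A$ and the second-order expansion in $y$, respectively.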

Now, choose a nonzero $h$ with $\vert h\vert\le\delta$. If $(x,y)$, $(x+h,y)$, $(x,y+h)$, $(x+h,y+h)$ are all in $A_\delta$ then the four quantities

$$f(x+h,y+h)-f(x+h,y)-f_y(x+h,y)h-\frac12f_{yy}(x+h,y)h^2,$$

$$-f(x,y+h)+f(x,y)+f_y(x,y)h+\frac12f_{yy}(x,y)h^2,$$

$$\frac12f_{yy}(x+h,y)h^2-\frac12f_{yy}(x,y)h^2,$$

$$f_y(x+h,y)h-f_y(x,y)h-f_{yx}(x,y)h^2$$

are all bounded in absolute value by $\epsilon h^2$ (using the third condition at $(x+h,y)$, the third condition at $(x,y)$, the first condition, and the second condition, respectively). Adding them together gives $$ \left\vert f(x+h,y+h)+f(x,y)-f(x+h,y)-f(x,y+h)-f_{yx}(x,y)h^2\right\vert\le4\epsilon h^2.\ \ {\rm(2)} $$
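
For the record, the cancellation behind the last step: summing the four quantities, the $f_y$ and $f_{yy}$ terms cancel and we are left with exactly
$$f(x+h,y+h)+f(x,y)-f(x+h,y)-f(x,y+h)-f_{yx}(x,y)h^2,$$
so (2) follows from the triangle inequality.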

Now, choose a sequence of nonzero real numbers $h_n\to0$. It is standard that, for any integrable $g\colon\mathbb{R}^2\to\mathbb{R}$, the translates $g(x+h_n,y)$, $g(x,y+h_n)$ and $g(x+h_n,y+h_n)$ all tend to $g(x,y)$ in $L^1$ (this is easy for continuous functions of compact support, and extends to all integrable functions as continuous functions of compact support are dense in $L^1$). Applying this where $g$ is the indicator of $A_\delta$ shows that the set of $(x,y)\in A_\delta$ for which one of $(x+h_n,y)$, $(x,y+h_n)$ or $(x+h_n,y+h_n)$ is not in $A_\delta$ has measure decreasing to zero.

So, for $\vert h\vert$ chosen arbitrarily small, inequality (2) applies everywhere on $A_\delta$ outside of a set of arbitrarily small measure. Letting $\delta$ decrease to zero, (2) applies everywhere on $A$ outside of a set of arbitrarily small measure, for small $\vert h\vert$. As $\epsilon > 0$ is arbitrary, this is equivalent to the limit in (1) holding in measure and equalling $f_{yx}$ almost everywhere on $A$. Finally, exchanging $x$ and $y$ in the above argument shows that the limit in (1) is also equal to $f_{xy}$.