Deriving Klein-Gordon from Hamilton's equations for fields using functional derivatives

OP's question is not about the Klein-Gordon equation per se, but rather about the very definition of the functional/variational derivative (FD). OP's eq. (4) is just plain wrong.

To complicate matters, OP's eq. (8) refers to a misleading 'same-space' FD notation, which, when applied naively, leads to a mismatch of integral signs in OP's eqs. (3), (7) & (9).

For more details, see e.g. this, this & this related Phys.SE posts.


You are most likely confused because the methods of functional calculus are often obscured by shortcuts that make the calculations easier to carry out but hide a lot of the intuition. So let's start with functionals. A functional is an object that takes in a function and spits out a real (or complex) number. Its argument is written in square brackets to indicate that it is a functional. The simplest functional is probably a plain integral: $$F[\varphi]=\int dx\,\varphi(x)$$

Functional calculus extends the idea of taking derivatives to functionals. This allows you to do things like finding the function that minimizes (or, more generally, extremizes) a functional, possibly under some constraint (like the action!). Let's say you want to do this naively by taking a partial derivative. To make that possible, I will replace the integral in $F$ by a sum: $$F[\varphi]=\sum_i\varphi(x_i)\Delta x$$ Here the $x_i$ are evenly spaced points in some interval $[a,b]$, with spacing $\Delta x$. Now we can take the partial derivative if we treat each $\varphi(x_i)$ as an independent variable: $$\frac{\partial F[\varphi]}{\partial\varphi(x_j)}=\frac{\partial}{\partial\varphi(x_j)}\sum_i\varphi(x_i)\Delta x=\sum_i\delta_{ij}\Delta x=\Delta x$$ You might see the issue here. The result is proportional to $\Delta x$, so in the continuum limit $\Delta x\rightarrow 0$, where the sum turns into the integral, it always vanishes.

If we instead define the functional derivative as $$\frac{\delta F[\varphi]}{\delta\varphi(y)}=\lim_{\epsilon\rightarrow 0}\frac{F[\varphi(x)+\epsilon \delta(x-y)]-F[\varphi(x)]}{\epsilon}$$ then we get a definition that doesn't vanish. In the discretized picture this amounts to dividing the partial derivative by $\Delta x$, since the delta function becomes $\delta_{ij}/\Delta x$ on the grid. You can show for yourself that $$\frac{\delta }{\delta\pi(y)}\int dx\,\pi(x)^2=2\pi(y).$$ In general, when the integrand depends only on the function itself and not on its derivatives, the functional derivative of the integral is the ordinary derivative of the integrand with respect to the function. This is what you try to state in (4).
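As a rough numerical illustration of the point above, here is a minimal Python sketch that discretizes $\int dx\,\pi(x)^2$ on a grid and compares the naive partial derivative with the same quantity divided by $\Delta x$. The interval, grid size, sample function and the helper names `F` and `partial_derivative` are all arbitrary choices made for this example, not anything from OP's post.

```python
import math

def F(values, dx):
    """Riemann-sum version of F[pi] = integral of pi(x)^2 dx."""
    return sum(v * v for v in values) * dx

# Hypothetical grid and sample function, chosen only for illustration.
a, b, n = 0.0, 1.0, 100
dx = (b - a) / n
xs = [a + (i + 0.5) * dx for i in range(n)]
pi_vals = [math.sin(3.0 * x) + 2.0 for x in xs]

def partial_derivative(j, h=1e-6):
    """Central-difference estimate of dF/d pi(x_j), all other values held fixed."""
    up = list(pi_vals)
    up[j] += h
    down = list(pi_vals)
    down[j] -= h
    return (F(up, dx) - F(down, dx)) / (2.0 * h)

j = 37
naive = partial_derivative(j)   # proportional to dx, so it vanishes as dx -> 0
functional = naive / dx         # discrete analogue of the functional derivative

print("naive partial derivative:", naive)
print("divided by dx:           ", functional)
print("2 * pi(x_j):             ", 2.0 * pi_vals[j])
```

Making the grid finer shrinks the naive partial derivative towards zero, while the rescaled value should stay close to $2\pi(x_j)$.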
Now, to get back to your problem, define $$F[\varphi,\nabla\varphi]=\int d^3x\left[(\nabla\varphi(x))^2+m^2\varphi(x)^2\right]$$ We could vary $F$ with respect to its first argument only, but we are interested in the case where the second argument depends on $\varphi$, similar to a total derivative. Since $\nabla\delta(x)$ is kind of a nasty object, I will use a better, more general definition of the functional derivative that doesn't use delta functions: $$\int dx\,\frac{\delta F}{\delta\varphi}(x)\rho(x)=\lim_{\epsilon\rightarrow 0}\frac{F[\varphi+\epsilon \rho]-F[\varphi]}{\epsilon}\tag{1}\label{definition}$$ where $\rho$ is an arbitrary test function.

Let's write $\rho(x)=\delta\varphi(x)$ to make the notation nicer. To calculate $\frac{\delta F}{\delta\varphi}$ we need to evaluate the right-hand side of $$\int d^3x\,\frac{\delta F}{\delta\varphi}(x)\delta\varphi(x)=\lim_{\epsilon\rightarrow 0}\frac{F[\varphi+\epsilon\delta\varphi,\nabla\varphi+\epsilon\nabla\delta\varphi]-F[\varphi,\nabla\varphi]}{\epsilon}$$ Expanding the squares, cancelling $F[\varphi,\nabla\varphi]$ and dividing by $\epsilon$, this becomes $$\begin{aligned}&\lim_{\epsilon\rightarrow 0}\int d^3x\left[2\nabla\varphi\cdot\nabla\delta\varphi+\epsilon(\nabla\delta\varphi)^2+2m^2\varphi\,\delta\varphi+\epsilon m^2(\delta\varphi)^2\right]\\ &=\int d^3x\left[2\nabla\varphi\cdot\nabla\delta\varphi+2m^2\varphi\,\delta\varphi\right]\end{aligned}$$

To match the left-hand side of (1) we must integrate by parts to move the $\nabla$ off $\delta\varphi$. Assuming $\delta\varphi$ vanishes at the boundary (or at infinity), so that the boundary term drops out, this gives us $$\int d^3x\left[-2(\nabla^2\varphi)\delta\varphi+2m^2\varphi\,\delta\varphi\right]$$ Reading off from (1) now gives us $$\frac{\delta F}{\delta\varphi}=-2\nabla^2\varphi+2m^2\varphi$$
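The result $\frac{\delta F}{\delta\varphi}=-2\nabla^2\varphi+2m^2\varphi$ can be sanity-checked numerically. The sketch below is a simplified stand-in, not the full problem: it assumes one spatial dimension, a periodic grid (so the boundary term from the integration by parts vanishes automatically), a forward-difference gradient, the sample field $\varphi(x)=\sin x$ and an arbitrary mass value. The names `F` and `functional_derivative` are made up for this example.

```python
import math

# Assumptions for this sketch: 1D, periodic grid on [0, 2*pi), phi(x) = sin(x).
m = 1.5                       # hypothetical mass value
n = 200
dx = 2.0 * math.pi / n
xs = [i * dx for i in range(n)]
phi = [math.sin(x) for x in xs]

def F(values):
    """Grid version of the integral of (phi')^2 + m^2 phi^2 over one period."""
    total = 0.0
    for i in range(n):
        grad = (values[(i + 1) % n] - values[i]) / dx   # forward difference
        total += (grad * grad + m * m * values[i] ** 2) * dx
    return total

def functional_derivative(j, h=1e-4):
    """Central-difference partial derivative w.r.t. phi(x_j), divided by dx."""
    up = list(phi)
    up[j] += h
    down = list(phi)
    down[j] -= h
    return (F(up) - F(down)) / (2.0 * h * dx)

j = 40
numeric = functional_derivative(j)
# For phi = sin(x): -2 phi'' + 2 m^2 phi = (2 + 2 m^2) sin(x)
analytic = (2.0 + 2.0 * m * m) * math.sin(xs[j])

print("numeric: ", numeric)
print("analytic:", analytic)
```

The two numbers should agree up to the discretization error of the finite-difference Laplacian, which shrinks as the grid is refined.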


Final note: if you plug $\rho(x)=\delta(x-y)$ into (1), the integral on the left-hand side collapses to $\frac{\delta F}{\delta\varphi}(y)$ and you recover the first definition of the functional derivative.