How to calculate second-order variations of an action?

A proper treatment (and how you should usually go about these things if you forget) is to remember the definition of the functional derivative. It is linear, obeys a chain rule and a product rule, and has the fundamental property

$$\frac{\delta\phi(y)}{\delta\phi(x)}=\delta(x-y)$$

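Thus, for the free scalar action $S[\phi]=\frac{1}{2}\int\mathrm{d}^dy\left(\partial\phi\cdot\partial\phi-m^2\phi^2\right)$, we have in painstaking detail (the last step integrates by parts in $y$ and drops the boundary term)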

$$\frac{\delta S[\phi]}{\delta\phi(x)}=\frac{1}{2}\int\mathrm{d}^dy\left[\frac{\delta}{\delta\phi(x)}\left(\partial\phi(y)\cdot\partial\phi(y)\right)-m^2\frac{\delta}{\delta\phi(x)}\phi(y)^2\right]\\ =\int\mathrm{d}^dy\left[\partial_{\mu}\delta(x-y)\partial^{\mu}\phi(y)-m^2\delta(x-y)\phi(y)\right]\\ =-(\square+m^2)\phi(x)$$
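If you want a sanity check of this first variation, here is a minimal numerical sketch. It is a toy and not part of the derivation: it assumes $d=1$ (time only, so $\square=\partial_t^2$), a periodic lattice with spacing `h`, and a forward-difference discretization of $\partial\phi$; all names are made up for illustration. On a lattice the functional derivative becomes $\frac{1}{h}\frac{\partial S}{\partial\phi_i}$, which should agree with $-(\square+m^2)\phi$ once $\square$ is replaced by the three-point second difference.

```python
import math

# Toy setup (assumptions): d = 1, periodic lattice of N sites, spacing h.
N, h, m = 32, 0.1, 1.3

def action(phi):
    """S = h * sum_i 1/2 [ ((phi[i+1]-phi[i])/h)^2 - m^2 phi[i]^2 ]."""
    total = 0.0
    for i in range(N):
        dphi = (phi[(i + 1) % N] - phi[i]) / h
        total += 0.5 * (dphi * dphi - m * m * phi[i] * phi[i])
    return h * total

def functional_derivative(phi, i, eps=1e-3):
    """Lattice version of dS/dphi(x_i): (1/h) dS/dphi_i via a central difference."""
    up = list(phi); up[i] += eps
    dn = list(phi); dn[i] -= eps
    return (action(up) - action(dn)) / (2 * eps * h)

def minus_klein_gordon(phi, i):
    """-(box + m^2) phi at site i, with box -> three-point second difference."""
    box_phi = (phi[(i + 1) % N] - 2 * phi[i] + phi[i - 1]) / (h * h)
    return -(box_phi + m * m * phi[i])

# An arbitrary smooth periodic field configuration.
phi = [math.sin(2 * math.pi * i / N) + 0.3 * math.cos(6 * math.pi * i / N)
       for i in range(N)]
max_error = max(abs(functional_derivative(phi, i) - minus_klein_gordon(phi, i))
                for i in range(N))
print(max_error)
```

Because the action is quadratic, the central difference is exact up to rounding, so `max_error` should be tiny.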

Thus, we can simply differentiate again to obtain

$$\frac{\delta^2S[\phi]}{\delta\phi(x)\delta\phi(y)}=-\frac{\delta}{\delta\phi(y)}\left[(\square_x+m^2)\phi(x)\right]=-(\square_x+m^2)\delta(x-y)$$

This is the desired result! Here $\square_x$ simply means that the derivatives act on $x$ only (sometimes this matters). Note also that the delta function sits to the right of the Klein-Gordon operator, so the operator acts on it.
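The same kind of lattice toy can be used to check the second variation. Again this is only a sketch under assumptions ($d=1$, periodic lattice, hypothetical function names): the lattice delta function is $\delta_{ij}/h$, and the second functional derivative is obtained by varying $\phi_j$ in the lattice first variation and dividing by the cell size $h$.

```python
# Toy setup (assumptions): d = 1, periodic lattice of N sites, spacing h.
N, h, m = 16, 0.2, 0.7

def first_derivative(phi):
    """Lattice dS/dphi(x_i) = -(box + m^2) phi for the free action."""
    return [-((phi[(i + 1) % N] - 2 * phi[i] + phi[i - 1]) / (h * h)
              + m * m * phi[i]) for i in range(N)]

def second_derivative(phi, eps=1e-3):
    """hess[i][j] ~ d^2 S / dphi(x_i) dphi(x_j): vary phi_j, divide by h."""
    hess = [[0.0] * N for _ in range(N)]
    for j in range(N):
        up = list(phi); up[j] += eps
        dn = list(phi); dn[j] -= eps
        g_up, g_dn = first_derivative(up), first_derivative(dn)
        for i in range(N):
            hess[i][j] = (g_up[i] - g_dn[i]) / (2 * eps * h)
    return hess

def expected_kernel():
    """-(box_x + m^2) acting on the lattice delta function delta_ij / h."""
    kernel = [[0.0] * N for _ in range(N)]
    for j in range(N):
        delta = [1.0 / h if i == j else 0.0 for i in range(N)]
        for i in range(N):
            box_delta = (delta[(i + 1) % N] - 2 * delta[i] + delta[i - 1]) / (h * h)
            kernel[i][j] = -(box_delta + m * m * delta[i])
    return kernel

phi = [0.1 * i * (N - i) for i in range(N)]   # arbitrary field configuration
hess, kernel = second_derivative(phi), expected_kernel()
max_error = max(abs(hess[i][j] - kernel[i][j])
                for i in range(N) for j in range(N))
is_symmetric = all(abs(hess[i][j] - hess[j][i]) < 1e-6
                   for i in range(N) for j in range(N))
print(max_error, is_symmetric)
```

The result does not depend on the chosen `phi`, as expected for a free theory, and the kernel comes out symmetric in its two arguments.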

And that's it! No need to expand to second order or pull your hair out deciding whether you have to integrate by parts and when you can.

I hope this helps!

B-B-B-BONUS ROUND

This type of manipulation is actually extremely useful! For instance, in the path integral formulation, we have

$$\langle\mathcal{F}[\phi](x)\rangle=\int\mathcal{D}\phi\,\mathcal{F}[\phi](x)\,e^{iS[\phi]}$$

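With this, we can use the above manipulations to find correlation functions! The key is to note that the path integral of a total functional derivative is zero (assuming, as usual, that the measure $\mathcal{D}\phi$ is invariant under shifts of $\phi$ and that there are no boundary contributions in field space). Thus, we have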

$$\int\mathcal{D}\phi\,\frac{\delta^2}{\delta\phi(x)\delta\phi(y)}e^{iS[\phi]}=i\int\mathcal{D}\phi\left[\frac{\delta^2S}{\delta\phi(x)\delta\phi(y)}+i\frac{\delta S}{\delta\phi(x)}\frac{\delta S}{\delta\phi(y)}\right]e^{iS[\phi]}\\ =i\bigg\langle\frac{\delta^2S}{\delta\phi(x)\delta\phi(y)}+i\frac{\delta S}{\delta\phi(x)}\frac{\delta S}{\delta\phi(y)}\bigg\rangle=0$$

This holds for any action $S[\phi]$. In particular, in your free theory, this gives us

$$\left(\square_y+m^2\right)\left(\square_x+m^2\right)\langle\phi(x)\phi(y)\rangle=-i\left(\square_y+m^2\right)\delta(x-y)$$

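Eliminating $\square_y+m^2$ from each side (formally, i.e. assuming the operator is invertible with your choice of boundary conditions) gives $\left(\square_x+m^2\right)\langle\phi(x)\phi(y)\rangle=-i\delta(x-y)$, which tells you that the two-point function for a free theory is a Green's function of the Klein-Gordon operator. No need for generating functionals or all that messy second quantization.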
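For the skeptical, here is a small lattice sketch of this statement. The assumptions are mine, not part of the argument above: $d=1$, a periodic lattice, expectation values normalized by the path integral without insertions, and the formal Gaussian result $\langle\phi_i\phi_j\rangle=i\,(A^{-1})_{ij}$ for $S=\frac{1}{2}\phi^TA\phi$ (the oscillatory integral is taken at face value, with no $i\epsilon$ bookkeeping). With those inputs, both the identity $\langle S''+iS'S'\rangle=0$ and the Green's function property reduce to linear algebra.

```python
# Toy setup (assumptions): d = 1, periodic lattice, free action S = 1/2 phi^T A phi.
N, h, m = 8, 0.5, 1.0

def klein_gordon_matrix():
    """Matrix of (box + m^2) with box -> three-point second difference."""
    kg = [[0.0] * N for _ in range(N)]
    for i in range(N):
        kg[i][i] += -2.0 / (h * h) + m * m
        kg[i][(i + 1) % N] += 1.0 / (h * h)
        kg[i][(i - 1) % N] += 1.0 / (h * h)
    return kg

def inverse(mat):
    """Gauss-Jordan inverse with partial pivoting (fine for small matrices)."""
    n = len(mat)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(mat)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

kg = klein_gordon_matrix()
A = [[-h * kg[i][j] for j in range(N)] for i in range(N)]     # d^2 S / dphi_i dphi_j
A_inv = inverse(A)
G = [[1j * A_inv[i][j] for j in range(N)] for i in range(N)]  # <phi_i phi_j> = i A^{-1}

# <S''_ij + i S'_i S'_j> = A_ij + i (A G A)_ij should vanish.
AGA = matmul(matmul(A, G), A)
sd_residual = max(abs(A[i][j] + 1j * AGA[i][j])
                  for i in range(N) for j in range(N))

# Green's function property: (box + m^2) G = -i * delta_ij / h.
kgG = matmul(kg, G)
green_residual = max(abs(kgG[i][j] - (-1j / h if i == j else 0.0))
                     for i in range(N) for j in range(N))
print(sd_residual, green_residual)
```

The parameters are chosen so that the lattice operator has no zero mode; for other values of `N`, `h`, `m` the matrix can become singular, which is the lattice shadow of the invertibility caveat.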


In your first attempted calculation, there is an issue in the second line. Your update looks almost ok, up to the point when you say you further integrate by parts: at that point, you are no longer under an integral, so you can't do that (indeed, you have already done the IBPs implicitly in lines two and three, yielding the $\partial _{\mu } (\delta ^4(z-y))$ terms). But this is not needed either, as the correct result is:

$$K(x,y) = - m^2 \delta ^4(x-y) - \eta^{\mu \nu } \partial _{\mu }\partial _{\nu } (\delta ^4(x-y))$$

which is the integral kernel of the operator $-m^2 - \square$ (as can be seen by computing $\int dx dy\; f(x) K(x,y) g(y)$ via IBP).

A different way of arriving at the result (which looks cleaner to me, but that's just my own cosmetic opinion...) is to go back to the definition of $\frac{\delta ^2 S[\phi ]}{\delta [\phi (x)]\delta [\phi (y)]}$, namely:

$$S[\phi + \delta \phi ] - S[\phi ] =: \int dx\; \frac{\delta S[\phi ]}{\delta [\phi (x)]} \delta \phi (x) + \frac{1}{2} \int dx dy\; \frac{\delta ^2 S[\phi ]}{\delta [\phi (x)]\delta [\phi (y)]} \delta \phi (x) \delta \phi (y) + o(\delta \phi ^2) \tag{1}$$

Note that this equation only prescribes $\frac{\delta ^2 S[\phi ]}{\delta [\phi (x)]\delta [\phi (y)]}$ up to antisymmetric terms: it must be completed by the requirement that $\frac{\delta ^2 S[\phi ]}{\delta [\phi (x)]\delta [\phi (y)]}$ be symmetric in $x,y$ (just like $\frac{\partial ^2 f (\vec{v})}{\partial v^j \partial v^i}$ is symmetric in $i,j$ for an ordinary function of $n$ variables).
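The finite-dimensional analogue of this remark is easy to see numerically. In the sketch below (a toy with made-up names, where a random matrix stands in for the kernel), the quadratic form $\sum_{ij} v_i K_{ij} v_j$ only sees the symmetric part of $K$, which is why eq. (1) alone cannot fix the antisymmetric part.

```python
import random

random.seed(0)
n = 5
# A generic (non-symmetric) kernel and an arbitrary variation vector.
kernel = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
v = [random.uniform(-1, 1) for _ in range(n)]

def quadratic_form(k, vec):
    """sum_ij vec[i] * k[i][j] * vec[j], the analogue of the double integral in (1)."""
    return sum(vec[i] * k[i][j] * vec[j] for i in range(n) for j in range(n))

sym = [[0.5 * (kernel[i][j] + kernel[j][i]) for j in range(n)] for i in range(n)]
antisym = [[0.5 * (kernel[i][j] - kernel[j][i]) for j in range(n)] for i in range(n)]

full_value = quadratic_form(kernel, v)
sym_value = quadratic_form(sym, v)          # same as full_value
antisym_value = quadratic_form(antisym, v)  # vanishes identically
print(full_value, sym_value, antisym_value)
```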

So we must calculate the second order variation of $S[\phi ]$:

$$S[\phi ] = \int d^4x\; \mathcal{L}(\phi (x),\partial \phi (x))$$

$$S[\phi +\delta \phi ] = \int d^4x\; \mathcal{L}(\phi (x)+\delta \phi (x),\partial \phi (x)+\partial \delta \phi (x))\\ = \int d^4x\; \Big[ \mathcal{L}(\phi (x),\partial \phi (x)) + (\text{1st order terms = 0 on shell})\\ + \frac{1}{2} \frac{\partial ^2\mathcal{L}}{\partial \phi ^2}(\phi (x),\partial \phi (x)) \,\delta \phi (x) \,\delta \phi (x)\\ + \frac{\partial ^2\mathcal{L}}{\partial \phi \partial (\partial _{\mu }\phi )}(\phi (x),\partial \phi (x)) \,\partial _{\mu }(\delta \phi (x)) \,\delta \phi (x)\\ + \frac{1}{2} \frac{\partial ^2\mathcal{L}}{\partial (\partial _{\mu }\phi )\partial (\partial _{\nu }\phi )}(\phi (x),\partial \phi (x)) \,\partial _{\mu }(\delta \phi (x)) \,\partial _{\nu }(\delta \phi (x)) \Big] + o(\delta \phi ^2)$$

Since the definition of $\frac{\delta ^2 S[\phi ]}{\delta [\phi (x)]\delta [\phi (y)]}$ requires an integral over $x$ and $y$, we introduce it by force now (carefully keeping track of the variable on which the various derivatives act):

$$ = S[\phi ] + \int d^4x \,d^4y \;\delta ^4(x-y) \Big[\\ \frac{1}{2} \frac{\partial ^2\mathcal{L}}{\partial \phi ^2}(\phi (x),\partial \phi (x)) \,\delta \phi (x) \,\delta \phi (y)\\ + \frac{\partial ^2\mathcal{L}}{\partial \phi \partial (\partial _{\mu }\phi )}(\phi (x),\partial \phi (x)) \,\partial ^{(x)}_{\mu }(\delta \phi (x)) \,\delta \phi (y)\\ + \frac{1}{2} \frac{\partial ^2\mathcal{L}}{\partial (\partial _{\mu }\phi )\partial (\partial _{\nu }\phi )}(\phi (x),\partial \phi (x)) \,\partial ^{(x)}_{\mu }(\delta \phi (x)) \,\partial ^{(y)}_{\nu }(\delta \phi (y)) \Big]$$

Now we perform a few IBPs, in the variables $x$ and $y$. The boundary terms will be proportional to either $\delta \phi (x)$ at the $x$ boundary or $\delta \phi (y)$ at the $y$ boundary, and these are typically assumed to be zero when calculating variations. So we get:

$$ = S[\phi ] + \int d^4x \,d^4y \; \Big[\\ \frac{1}{2} \frac{\partial ^2\mathcal{L}}{\partial \phi ^2}(\phi (x),\partial \phi (x)) \,\delta ^4(x-y)\\ - \partial ^{(x)}_{\mu } \left( \frac{\partial ^2\mathcal{L}}{\partial \phi \partial (\partial _{\mu }\phi )}(\phi (x),\partial \phi (x)) \,\delta ^4(x-y) \right)\\ + \frac{1}{2} \partial ^{(x)}_{\mu }\partial ^{(y)}_{\nu } \left( \frac{\partial ^2\mathcal{L}}{\partial (\partial _{\mu }\phi )\partial (\partial _{\nu }\phi )}(\phi (x),\partial \phi (x)) \,\delta ^4(x-y) \right) \Big] \,\delta \phi (x) \,\delta \phi (y) \tag{2}$$

Note that we have used that $\partial ^{(x)}_{\mu }(\delta \phi (y)) = 0$ (that's why we had to somewhat artificially introduce a distinction between the $x$ and $y$ variables before performing the IBPs, otherwise we would be left with $\partial \delta \phi $ terms).

Now, for any function $F$, $F(x)\,\delta ^4(x-y)$ is symmetric with respect to $x \leftrightarrow y$, while the symmetric part of $\partial _{\mu }^{(x)} \left( F(x) \, \delta ^4(x-y) \right)$ is:

$$\frac{1}{2} \big[ \partial _{\mu }^{(x)} \left( F(x) \, \delta ^4(x-y) \right) + \partial _{\mu }^{(y)} \left( F(y) \, \delta ^4(y-x) \right) \big] = \\ = \frac{1}{2} \big[ \left( \partial _{\mu }^{(x)} F(x) \right) \delta ^4(x-y) + F(x) \left( \partial _{\mu }^{(x)} \delta ^4(x-y) \right) + \partial _{\mu }^{(y)} \left( F(y) \, \delta ^4(y-x) \right) \big]\\ = \frac{1}{2} \big[ \left( \partial _{\mu }^{(x)} F(x) \right) \delta ^4(x-y) - F(x) \left( \partial _{\mu }^{(y)} \delta ^4(x-y) \right) + \partial _{\mu }^{(y)} \left( F(y) \, \delta ^4(y-x) \right) \big]\\ = \frac{1}{2} \big[ \left( \partial _{\mu }^{(x)} F(x) \right) \delta ^4(x-y) - \partial _{\mu }^{(y)} \left( F(x) \, \delta ^4(x-y) \right) + \partial _{\mu }^{(y)} \left( F(x) \, \delta ^4(x-y) \right) \big]\\ = \frac{\partial _{\mu }^{(x)} F(x)}{2} \delta ^4(x-y)$$
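Since this is an identity between distributions, one way to test it is to smear both sides with test functions. The sketch below does that numerically under my own simplifying assumptions: one dimension, a periodic grid on $[0,2\pi)$, central differences, and arbitrarily chosen smooth functions `F`, `f`, `g`. It uses $\int dx\,dy\; a(x)\,\partial^{(x)}\!\left(F(x)\delta(x-y)\right) b(y)=\int dx\; a(x)\,\partial_x\!\left(F(x)b(x)\right)$ and compares the symmetrized value with $\int dx\,\frac{1}{2}F'(x)f(x)g(x)$.

```python
import math

# Toy setup (assumptions): one dimension, periodic grid on [0, 2*pi).
N = 1000
h = 2 * math.pi / N
x = [i * h for i in range(N)]

F = [math.sin(t) + 2.0 for t in x]            # arbitrary smooth F
dF = [math.cos(t) for t in x]                 # its exact derivative
f = [math.cos(2 * t) + 0.5 for t in x]        # test function in the x slot
g = [math.sin(t) + math.cos(t) for t in x]    # test function in the y slot

def ddx(u):
    """Central difference on the periodic grid."""
    return [(u[(i + 1) % N] - u[i - 1]) / (2 * h) for i in range(N)]

def smeared(a, b):
    """integral dx dy a(x) K(x,y) b(y) with K(x,y) = d/dx ( F(x) delta(x-y) )."""
    Fb = [F[i] * b[i] for i in range(N)]
    d = ddx(Fb)
    return h * sum(a[i] * d[i] for i in range(N))

# Symmetric part of K smeared with f(x) g(y): average over swapping the slots.
symmetric_part = 0.5 * (smeared(f, g) + smeared(g, f))
# Claimed result: kernel (1/2) F'(x) delta(x-y) smeared with the same functions.
expected = h * sum(0.5 * dF[i] * f[i] * g[i] for i in range(N))
print(symmetric_part, expected)
```

The two numbers agree up to the discretization error of the central difference, while `smeared(f, g)` on its own does not equal `expected`; only the symmetrized combination does.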

and, similarly, the symmetric part of $\partial _{\mu }^{(x)} \partial _{\nu }^{(y)} \left( F(x) \, \delta ^4(x-y) \right)$ is:

$$\frac{1}{2} \big[ \partial _{\mu }^{(x)} \partial _{\nu }^{(y)} \left( F(x) \, \delta ^4(x-y) \right) + \partial _{\mu }^{(y)} \partial _{\nu }^{(x)} \left( F(y) \, \delta ^4(y-x) \right) \big] = \\ = \frac{1}{2} \big[ \partial _{\mu }^{(x)} \left( F(x) \, \partial _{\nu }^{(y)} \delta ^4(x-y) \right) + \partial _{\mu }^{(y)} \partial _{\nu }^{(x)} \left( F(x) \, \delta ^4(x-y) \right) \big]\\ = \frac{1}{2} \big[ - \partial _{\mu }^{(x)} \left( F(x) \, \partial _{\nu }^{(x)} \delta ^4(x-y) \right) - \partial _{\nu }^{(x)} \left( F(x) \, \partial _{\mu }^{(x)} \delta ^4(x-y) \right) \big]\\ = \frac{1}{2} \big[ - \left( \partial _{\mu }^{(x)} F(x) \right) \left( \partial _{\nu }^{(x)} \delta ^4(x-y) \right) - F(x) \left( \partial _{\mu }^{(x)} \partial _{\nu }^{(x)} \delta ^4(x-y) \right) - \left( \partial _{\nu }^{(x)} F(x) \right) \left( \partial _{\mu }^{(x)} \delta ^4(x-y) \right) - F(x) \left( \partial _{\nu }^{(x)} \partial _{\mu }^{(x)} \delta ^4(x-y) \right) \big]\\ = - \frac{1}{2} \big[ \left( \partial _{\mu }^{(x)} F(x) \right) \left( \partial _{\nu }^{(x)} \delta ^4(x-y) \right) + \left( \partial _{\nu }^{(x)} F(x) \right) \left( \partial _{\mu }^{(x)} \delta ^4(x-y) \right) \big] - F(x) \left( \partial _{\mu }^{(x)} \partial _{\nu }^{(x)} \delta ^4(x-y) \right)$$

So, identifying eq. (2) with the definition of $\frac{\delta ^2 S[\phi ]}{\delta [\phi (x)]\delta [\phi (y)]}$ above (eq. (1)), we get:

$$\frac{\delta ^2 S[\phi ]}{\delta [\phi (x)]\delta [\phi (y)]} = \left( \frac{\partial ^2\mathcal{L}}{\partial \phi ^2}(\phi (x),\partial \phi (x)) - \partial ^{(x)}_{\mu } \frac{\partial ^2\mathcal{L}}{\partial \phi \partial (\partial _{\mu }\phi )}(\phi (x),\partial \phi (x)) \right) \delta ^4(x-y)\\ - \left( \partial ^{(x)}_{\mu } \frac{\partial ^2\mathcal{L}}{\partial (\partial _{\mu }\phi )\partial (\partial _{\nu }\phi )}(\phi (x),\partial \phi (x)) \right) \partial ^{(x)}_{\nu } \delta ^4(x-y)\\ - \left( \frac{\partial ^2\mathcal{L}}{\partial (\partial _{\mu }\phi )\partial (\partial _{\nu }\phi )}(\phi (x),\partial \phi (x)) \right) \partial ^{(x)}_{\mu }\partial ^{(x)}_{\nu } \delta ^4(x-y)$$
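To see this general formula at work on something slightly less trivial than the free theory, here is a lattice sketch. Everything specific in it is an assumption made for illustration: $d=1$, a periodic lattice, and a $\phi^4$ term with a hypothetical coupling `lam`, i.e. $\mathcal{L}=\frac{1}{2}\phi'^2-\frac{1}{2}m^2\phi^2-\frac{\lambda}{4}\phi^4$. For this Lagrangian the formula reduces to $-(m^2+3\lambda\phi(x)^2)\,\delta(x-y)-\partial_x^2\delta(x-y)$, since the mixed derivative vanishes and $\partial^2\mathcal{L}/\partial\phi'^2=1$.

```python
import math

# Toy setup (assumptions): d = 1, periodic lattice, and a phi^4 term with
# coupling lam added purely for illustration:
#   L = 1/2 phi'^2 - 1/2 m^2 phi^2 - lam/4 phi^4
N, h, m, lam = 12, 0.25, 0.9, 0.4

def first_derivative(phi):
    """Lattice dS/dphi(x_i) = -(box phi + m^2 phi + lam phi^3)."""
    out = []
    for i in range(N):
        box_phi = (phi[(i + 1) % N] - 2 * phi[i] + phi[i - 1]) / (h * h)
        out.append(-(box_phi + m * m * phi[i] + lam * phi[i] ** 3))
    return out

def numerical_second_derivative(phi, eps=1e-4):
    """Vary phi_j in the first derivative and divide by the cell size h."""
    hess = [[0.0] * N for _ in range(N)]
    for j in range(N):
        up = list(phi); up[j] += eps
        dn = list(phi); dn[j] -= eps
        g_up, g_dn = first_derivative(up), first_derivative(dn)
        for i in range(N):
            hess[i][j] = (g_up[i] - g_dn[i]) / (2 * eps * h)
    return hess

def formula_second_derivative(phi):
    """(d2L/dphi2) delta - (d2L/dphi'2) d_x^2 delta; the mixed term is zero here."""
    kernel = [[0.0] * N for _ in range(N)]
    for j in range(N):
        delta = [1.0 / h if i == j else 0.0 for i in range(N)]  # lattice delta
        for i in range(N):
            d2L_dphi2 = -(m * m + 3 * lam * phi[i] ** 2)
            d2_delta = (delta[(i + 1) % N] - 2 * delta[i] + delta[i - 1]) / (h * h)
            kernel[i][j] = d2L_dphi2 * delta[i] - 1.0 * d2_delta
    return kernel

phi = [math.sin(2 * math.pi * i / N) for i in range(N)]  # arbitrary background
num = numerical_second_derivative(phi)
exact = formula_second_derivative(phi)
max_error = max(abs(num[i][j] - exact[i][j]) for i in range(N) for j in range(N))
print(max_error)
```

Unlike in the free theory, the kernel now depends on the background `phi` through the $3\lambda\phi^2$ term, so this exercises the $\partial^2\mathcal{L}/\partial\phi^2$ coefficient of the formula in a non-constant way.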

Check: We can check that we get the same result (and that we get it much faster...) with Bob Knighton's method:

$$S[\phi ] = \int d^4x\; \mathcal{L}(\phi (x),\partial \phi (x))$$

$$\frac{\delta S[\phi ]}{\delta [\phi (x)]} = \int d^4y\; \Big[ \frac{\partial \mathcal{L}}{\partial \phi }(\phi (y),\partial \phi (y)) \frac{\delta \phi (y)}{\delta [\phi (x)]} + \frac{\partial \mathcal{L}}{\partial (\partial _{\mu } \phi )}(\phi (y),\partial \phi (y)) \,\partial ^{(y)}_{\mu } \frac{\delta \phi (y)}{\delta [\phi (x)]} \Big]\\ = \frac{\partial \mathcal{L}}{\partial \phi }(\phi (x),\partial \phi (x)) - \left( \partial ^{(x)}_{\mu } \frac{\partial \mathcal{L}}{\partial (\partial _{\mu } \phi )}(\phi (x),\partial \phi (x)) \right)$$

$$\frac{\delta ^2 S[\phi ]}{\delta [\phi (x)]\delta [\phi (y)]} = \frac{\delta }{\delta [\phi (y)]} \big[ \frac{\partial \mathcal{L}}{\partial \phi }(\phi (x),\partial \phi (x)) - \left( \partial ^{(x)}_{\mu } \frac{\partial \mathcal{L}}{\partial (\partial _{\mu } \phi )}(\phi (x),\partial \phi (x)) \right) \big]\\ = \frac{\delta }{\delta [\phi (y)]} \big[ \frac{\partial \mathcal{L}}{\partial \phi }(\phi (x),\partial \phi (x)) \big] - \partial ^{(x)}_{\mu } \left( \frac{\delta }{\delta [\phi (y)]} \big[ \frac{\partial \mathcal{L}}{\partial (\partial _{\mu } \phi )}(\phi (x),\partial \phi (x)) \big] \right)\\ = \frac{\partial ^2 \mathcal{L}}{\partial \phi ^2}(\phi (x),\partial \phi (x)) \, \delta ^4(x-y) + \frac{\partial ^2 \mathcal{L}}{\partial \phi \partial (\partial _{\mu } \phi )}(\phi (x),\partial \phi (x)) \left( \partial ^{(x)}_{\mu } \delta ^4(x-y) \right) - \partial ^{(x)}_{\mu } \left( \frac{\partial ^2 \mathcal{L}}{\partial \phi \partial (\partial _{\mu } \phi )}(\phi (x),\partial \phi (x)) \, \delta ^4(x-y) + \frac{\partial ^2 \mathcal{L}}{\partial (\partial _{\mu } \phi ) \partial (\partial _{\nu } \phi )}(\phi (x),\partial \phi (x)) \left( \partial ^{(x)}_{\nu } \delta ^4(x-y) \right) \right)\\ = \left( \frac{\partial ^2 \mathcal{L}}{\partial \phi ^2}(\phi (x),\partial \phi (x)) - \partial ^{(x)}_{\mu } \frac{\partial ^2 \mathcal{L}}{\partial \phi \partial (\partial _{\mu } \phi )}(\phi (x),\partial \phi (x)) \right) \delta ^4(x-y) - \left( \partial ^{(x)}_{\mu } \frac{\partial ^2 \mathcal{L}}{\partial (\partial _{\mu } \phi ) \partial (\partial _{\nu } \phi )}(\phi (x),\partial \phi (x)) \right) \left( \partial ^{(x)}_{\nu } \delta ^4(x-y) \right) - \frac{\partial ^2 \mathcal{L}}{\partial (\partial _{\mu } \phi ) \partial (\partial _{\nu } \phi )}(\phi (x),\partial \phi (x)) \left( \partial ^{(x)}_{\mu } \partial ^{(x)}_{\nu } \delta ^4(x-y) \right)$$