Is it really proper to say Ward identity is a consequence of gauge invariance?

This answer partially disagrees with Motl's. The crucial point is the difference between the abelian and non-abelian cases. I totally agree with Motl's answer in the non-abelian case, where these identities are usually called Slavnov-Taylor identities rather than Ward identities, so I will restrict myself to the abelian case.

First, a few words about terminology: Ward identities are the quantum counterpart to Noether's (first and second) theorems in classical physics. They apply to both global and gauge symmetries. However, the term is often reserved for the $U(1)$ gauge symmetry in QED. In the case of gauge symmetries, Ward identities yield real identities, such as $k^{\mu}\mathcal M_{\mu}=0$ in QED, where $\mathcal M_{\mu}$ is defined by $\mathcal M=\epsilon_{\mu}\,\mathcal M^{\mu}$. This tells us that photon polarizations parallel to the photon's direction of propagation don't contribute to scattering amplitudes. In the case of global symmetries, however, Ward identities reflect properties of the theory. For example, the S-matrix of a Lorentz invariant theory is also Lorentz invariant, and in a theory with a global (independent of the space-time point) $U(1)$ phase invariance the number of particles minus antiparticles in the initial state is the same as in the final state.
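To make the statement about polarizations concrete, here is a short worked example. The choice of frame, the metric signature $(+,-,-,-)$ and the explicit polarization vectors are my assumptions for illustration. Take a massless photon with $k^{\mu}=(k,0,0,k)$ and transverse polarizations $\epsilon_1=(0,1,0,0)$, $\epsilon_2=(0,0,1,0)$:

$$\begin{aligned}
k_{\mu}\mathcal M^{\mu}&=k\,(\mathcal M^{0}-\mathcal M^{3})=0\;\Longrightarrow\;\mathcal M^{0}=\mathcal M^{3},\\
\sum_{i=1,2}|\epsilon_{i\,\mu}\mathcal M^{\mu}|^{2}&=|\mathcal M^{1}|^{2}+|\mathcal M^{2}|^{2}
=|\mathcal M^{1}|^{2}+|\mathcal M^{2}|^{2}+|\mathcal M^{3}|^{2}-|\mathcal M^{0}|^{2}
=-\eta_{\mu\nu}\,\mathcal M^{\mu}\mathcal M^{\nu *}.
\end{aligned}$$

So the identity lets one sum over all four components with $-\eta_{\mu\nu}$: the timelike and longitudinal contributions cancel each other.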

Let's study the case of a massive vector field minimally coupled to a conserved current:

$$\begin{aligned}\mathcal L&=-{1\over 4}\,F^2+{a^2\over 2}A^2+i\,\bar\Psi\displaystyle{\not}D\, \Psi - m\,\bar\Psi\Psi \\ &=-{1\over 4}\,F^2+{a^2\over 2}A^2+i\,\bar\Psi\displaystyle{\not}\partial \, \Psi - m\,\bar\Psi\Psi-e\,A_{\mu}\,j^{\mu}\end{aligned}$$

Note that this theory has a global phase invariance $\Psi\rightarrow e^{-i\theta}\,\Psi$, with a Noether current

$$j^{\mu}={\bar\Psi\, \gamma^{\mu}}\,\Psi$$

such that (classically) $\partial_{\mu}\,j^{\mu}=0$. Apart from this symmetry, it is well known that the Lagrangian above is equivalent to a theory (i) that doesn't have an explicit mass term for the vector field and (ii) that contains a scalar field (a Higgs-like field) with a nonzero vacuum expectation value, which spontaneously breaks a $U(1)$ gauge symmetry (this gauge symmetry is not a gauged version of the global $U(1)$ symmetry mentioned previously). The equivalence holds in the limit where the vacuum expectation value goes to infinity and the coupling between the vector field and the Higgs-like scalar goes to zero. Since one has to take this last limit, the charge cannot be quantized, and therefore the $U(1)$ gauge group must be topologically equivalent to the real numbers under addition rather than the complex numbers of unit modulus under multiplication (a circle). The difference between the two groups is only topological (does this mean then that the difference is irrelevant in the following?). This mechanism is due to Stueckelberg and I will summarize it at the end of this answer.

In a process in which there is a massive vector particle in the initial or final state, the LSZ reduction formula gives:

$$\langle i\,|\,f \rangle\sim \epsilon _{\mu}\int d^4x\,e^{-ik\cdot x}\, \left(\eta^{\mu\nu}(\partial ^2-a^2)-\partial^{\mu}\partial^{\nu}\right)...\langle 0|\mathcal{T}A_{\nu}(x)...|0\rangle$$

From the Lagrangian above, the following classical equations of motion may be obtained

$$\left(\eta^{\mu\nu}(\partial ^2-a^2)-\partial^{\mu}\partial^{\nu}\right)A_{\nu}=ej^{\mu}$$

Then, at the quantum level,

$$\left(\eta^{\mu\nu}(\partial ^2-a^2)-\partial^{\mu}\partial^{\nu}\right)\langle 0|\mathcal{T}A_{\nu}(x)...|0\rangle = e\,\langle 0|\mathcal{T}j^{\mu}(x)...|0\rangle + \text{contact terms, which don't contribute to the S-matrix}$$

And therefore

$$\langle i\,|\,f \rangle\sim \epsilon _{\mu}\int d^4x\,e^{-ik\cdot x}\,...\langle 0|\mathcal{T}j^{\mu}(x)...|0\rangle +\text{contact terms, which don't contribute}\sim \epsilon_{\mu}\mathcal{M}^{\mu}$$

If one replaces $\epsilon_{\mu}$ with $k_{\mu}$, one obtains

$$k_{\mu}\mathcal{M}^{\mu}\sim k _{\mu}\int d^4x\,e^{-ik\cdot x}\,...\langle 0|\mathcal{T}j^{\mu}(x)...|0\rangle$$

Making use of $k_{\mu}\,e^{-ik\cdot x}\sim \partial_{\mu}\,e^{-ik\cdot x}$, integrating by parts, and getting rid of the surface term (the plane wave is an idealization; what one actually has is a wave packet that goes to zero at spatial infinity), one gets

$$k_{\mu}\mathcal{M}^{\mu}\sim \int d^4x\,e^{-ik\cdot x}\,...\, \partial_{\mu}\,\langle 0|\mathcal{T}j^{\mu}(x)...|0\rangle$$
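For readers who want the factors that the $\sim$ hides, the integration by parts can be spelled out as follows. The shorthand $G^{\mu}(x)\equiv\langle 0|\mathcal{T}j^{\mu}(x)...|0\rangle$ is mine, not part of the original notation, and the surface term is dropped as argued above:

$$\begin{aligned}
k_{\mu}\,e^{-ik\cdot x}&=i\,\partial_{\mu}\,e^{-ik\cdot x},\\
k_{\mu}\int d^4x\,e^{-ik\cdot x}\,G^{\mu}(x)&=i\int d^4x\,\big(\partial_{\mu}e^{-ik\cdot x}\big)\,G^{\mu}(x)
=-i\int d^4x\,e^{-ik\cdot x}\,\partial_{\mu}G^{\mu}(x).
\end{aligned}$$

The overall factor $-i$ is irrelevant for the conclusion, since the right-hand side is about to be shown to vanish on shell.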

One can now use the Ward identity for the global $\Psi\rightarrow e^{-i\theta}\,\Psi$ symmetry (classically, $\partial_{\mu}\,j^{\mu}=0$ on solutions of the equations of motion for the matter field $\Psi$)

$$\partial_{\mu}\, \langle 0|\mathcal{T}j^{\mu}(x)...|0\rangle = \text{contact terms, which don't contribute to the S-matrix}$$
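As an illustration of what these contact terms look like, consider the simplest case where the other fields are one $\Psi$ and one $\bar\Psi$. This is a sketch assuming $j^{0}=\Psi^{\dagger}\Psi$ and canonical equal-time anticommutators; overall signs depend on these conventions. The derivative acting on the time ordering produces equal-time commutators:

$$\begin{aligned}
[\,j^{0}(x),\Psi(y)\,]_{x^0=y^0}&=-\delta^{3}(\vec x-\vec y)\,\Psi(y),\qquad
[\,j^{0}(x),\bar\Psi(z)\,]_{x^0=z^0}=+\delta^{3}(\vec x-\vec z)\,\bar\Psi(z),\\
\partial_{\mu}\langle 0|\mathcal{T}j^{\mu}(x)\Psi(y)\bar\Psi(z)|0\rangle
&=\big[\delta^{4}(x-z)-\delta^{4}(x-y)\big]\,\langle 0|\mathcal{T}\Psi(y)\bar\Psi(z)|0\rangle.
\end{aligned}$$

After the Fourier transform, each term on the right carries a propagator pole in only one of the two external fermion momenta, whereas the LSZ formula picks out the residue of the poles in all external legs at once. This is why such terms drop out of the S-matrix.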

And hence

$$k^{\mu}\mathcal M_{\mu}=0$$

same as in the massless case.

Note that in this derivation, it has been crucial that the explicit mass term for the vector field doesn't break the global $U(1)$ symmetry. This is also related to the fact that the explicit mass term for the vector field can be obtained through a Higgs-like mechanism connected with a hidden (the Higgs-like field decouples from the rest of the theory) $U(1)$ gauge symmetry.

A more careful calculation should include counterterms in the interacting theory; however, I think this works the same way as in the massless case. We can think of the fields and parameters in this answer as bare fields and parameters.

Stueckelberg mechanism

Consider the following Lagrangian

$$\mathcal L=-{1\over 4}\,F^2+|d\phi|^2+\mu^2\,|\phi|^2-\lambda\, (\phi^*\phi)^2$$

where $d=\partial - ig\, B$ and $F$ is the field strength (Faraday tensor) for $B$. This Lagrangian is invariant under the gauge transformation

$$B\rightarrow B + (1/g)\partial \alpha (x)$$ $$\phi\rightarrow e^{i\alpha(x)}\phi$$
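The invariance can be checked in one line: the covariant derivative of $\phi$ transforms with the same phase as $\phi$ itself, so $|d\phi|^2$ and any function of $\phi^*\phi$ are unchanged, while $F$ is unchanged because the shift of $B$ is a pure gradient.

$$\begin{aligned}
(\partial_{\mu}-ig\,B'_{\mu})\big(e^{i\alpha}\phi\big)
&=e^{i\alpha}\big(\partial_{\mu}\phi+i\,\partial_{\mu}\alpha\,\phi-ig\,B_{\mu}\phi-i\,\partial_{\mu}\alpha\,\phi\big)
=e^{i\alpha}\,(\partial_{\mu}-ig\,B_{\mu})\phi,
\end{aligned}$$

where $B'_{\mu}=B_{\mu}+(1/g)\partial_{\mu}\alpha$ denotes the transformed field (the prime is my notation).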

Let's take a polar parametrization for the scalar field $\phi$: $\phi\equiv {1\over \sqrt{2}}\rho\,e^{i\chi}$, thus

$$\mathcal L=-{1\over 4}\,F^2+{1\over 2}\rho^2\,(\partial_{\mu}\chi-g\,B_{\mu})^2+{1\over 2}(\partial \rho)^2+{\mu^2\over 2}\,\rho ^2- {\lambda\over 4}\rho^4$$
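The kinetic term in polar variables follows from a short computation, shown here as an intermediate step:

$$\begin{aligned}
d_{\mu}\phi&=(\partial_{\mu}-ig\,B_{\mu})\,{1\over\sqrt 2}\,\rho\,e^{i\chi}
={1\over\sqrt 2}\,e^{i\chi}\big[\partial_{\mu}\rho+i\,\rho\,(\partial_{\mu}\chi-g\,B_{\mu})\big],\\
|d\phi|^{2}&={1\over 2}(\partial\rho)^{2}+{1\over 2}\rho^{2}\,(\partial_{\mu}\chi-g\,B_{\mu})^{2},
\end{aligned}$$

since the real and imaginary parts inside the bracket do not mix in the modulus squared. The potential terms follow from $|\phi|^{2}=\rho^{2}/2$.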

We may now make the following field redefinition $A\equiv B - (1/g)\partial \chi$ and noting that $F_{\mu\nu}=\partial_{\mu}B_{\nu}-\partial_{\nu}B_{\mu}=\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu}$ is also the field strength for $A$

$$\mathcal L=-{1\over 4}\,F^2+{g^2\over 2}\rho^2\,A^2+{1\over 2}(\partial \rho)^2+{\mu^2\over 2}\,\rho ^2-{\lambda\over 4}\, \rho^4$$
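It may help to note why the angular field $\chi$ has disappeared. Under the gauge transformation above, $\chi\rightarrow\chi+\alpha$ and $\rho\rightarrow\rho$, so the new field $A$ is gauge invariant:

$$A_{\mu}\rightarrow B_{\mu}+{1\over g}\,\partial_{\mu}\alpha-{1\over g}\,\partial_{\mu}(\chi+\alpha)=B_{\mu}-{1\over g}\,\partial_{\mu}\chi=A_{\mu}.$$

The Lagrangian written in terms of $A$ and $\rho$ therefore contains only gauge-invariant variables, and the would-be Goldstone mode $\chi$ has been absorbed into the longitudinal component of $A$.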

If $\rho$ has a nonzero vacuum expectation value $\langle 0|\rho |0\rangle = v=\sqrt{\mu^2\over \lambda}$, it is then convenient to write $\rho (x)=v+\omega (x)$. Thus

$$\mathcal L=-{1\over 4}\,F^2+{a^2\over 2}\,A^2+g^2\,v\,\omega\,A^2+{g^2\over 2}\,\omega ^2\,A^2+{1\over 2}(\partial \omega)^2-\mu^2\,\omega ^2-\lambda\,v\,\omega^3-{\lambda\over 4}\, \omega^4+{\lambda\,v^4\over 4}$$

where $a\equiv g\,v$ and $\mu^2=\lambda\,v^2$ has been used to simplify the terms coming from the potential. If we now take the limit $g\rightarrow 0$, $v\rightarrow \infty$, keeping the product $a$ constant, we get

$$\mathcal L=-{1\over 4}\,F^2+{a^2\over 2}\,A^2+{1\over 2}(\partial \omega)^2-\mu^2\,\omega ^2-\lambda\,v\,\omega^3-{\lambda\over 4}\, \omega^4+{\lambda\,v^4\over 4}$$

that is, all the interaction terms between $A$ and $\omega$ disappear, so that $\omega$ becomes a self-interacting field with infinite mass that is decoupled from the rest of the theory, and therefore it doesn't play any role. Thus, we recover the massive vector field with which we started.

$$\mathcal L=-{1\over 4}\,F^2+{a^2\over 2}\,A^2$$
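The scaling behind this decoupling can be made explicit by rewriting the two mixed couplings in terms of the fixed quantity $a=g\,v$. This is a sketch that assumes $\lambda$ is held fixed while the limit is taken:

$$\begin{aligned}
g^{2}\,v\,\omega\,A^{2}&=a\,g\;\omega\,A^{2}\;\xrightarrow{\;g\to 0\;}\;0,\qquad
{g^{2}\over 2}\,\omega^{2}A^{2}\;\xrightarrow{\;g\to 0\;}\;0,\\
m_{\omega}&\propto\mu=\sqrt{\lambda}\,v\;\xrightarrow{\;v\to\infty\;}\;\infty .
\end{aligned}$$

The vector mass $a$ stays finite while both couplings of $\omega$ to $A$ vanish and $\omega$ becomes infinitely heavy.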

Note that in a non-abelian gauge theory there must be non-linear terms such as $\sim g A^2\,\partial A\;$, $\sim g^2 A^4$, which prevent us from taking the limit $g\rightarrow 0$.


Let me try to answer my own question after spending quite some time reading L. Brown's "Quantum Field Theory", though I won't stick to his notation.

Let me clarify the terminology I'll use: "Generalized Ward Identity (GWI)" refers to $(l-k)_\mu\Gamma^\mu(k,l)=iS^{-1}(k)-iS^{-1}(l)$, where $\Gamma^\mu(k,l)$ is an electron-electron-photon vertex function and $S$ is the full electron propagator. I'll come back to this in detail later. "Ward identity (WI)" refers to the special case when one lets $l\to k$ in GWI; "Ward-Takahashi Identity (WTI)" refers to $k_\mu {\mathcal M}^\mu(k) = 0$.
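For concreteness, here is what the limit $l\to k$ gives. Writing the right-hand side of GWI with $S^{-1}(k)$ as its first term, setting $l=k+p$ and expanding to first order in $p$ (the symbol $p$ for the small momentum difference is my choice here):

$$\begin{aligned}
p_\mu\,\Gamma^\mu(k,k+p)&=iS^{-1}(k)-iS^{-1}(k+p)=-i\,p_\mu\,\frac{\partial S^{-1}(k)}{\partial k_\mu}+O(p^2),\\
\Gamma^\mu(k,k)&=-i\,\frac{\partial S^{-1}(k)}{\partial k_\mu}.
\end{aligned}$$

So WI ties the vertex at zero momentum transfer to the derivative of the inverse propagator.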

I should confess that when I asked this question and wrote the words "...claim Ward identity is a consequence of gauge invariance of the theory.", I didn't know which of the three identities they were referring to, but now at least I can say GWI is really a consequence of gauge invariance, not global phase symmetry. In short, if $\Gamma^\mu$ in GWI is taken as an improper vertex (i.e. a 1-particle reducible vertex), then GWI holds for theories which respect current conservation (or global phase symmetry). However, for a theory with a gauge symmetry, we get a stronger GWI, that is, GWI holds not only for the improper vertex, but also for the proper one (i.e. the 1-particle irreducible vertex).

GWI of Improper Vertex

First let's see how to get the GWI for current conservation; here I'll basically copy from Weinberg Vol. I, chap. 10. Consider the vacuum time-ordered product $\langle \mathcal{T}\{J^\mu(x)\Psi_n(y)\bar{\Psi}_m(z)\}\rangle$. Diagrammatically, this is the sum of all the diagrams with 1 external photon propagator and 2 external electron propagators, but with a bare external photon propagator stripped away. Now Weinberg defines $\Gamma^\mu(k,l)$ by

$$\int d^4xd^4yd^4ze^{-ipx}e^{-iky}e^{ilz}\langle \mathcal{T}\{J^\mu(x)\Psi_n(y)\bar{\Psi}_m(z)\}\rangle\equiv-iqS_{nn'}(k)\Gamma^\mu_{n'm'}(k,l)S_{m'm}(l)\delta^4(p+k-l)$$

where $S_{nm}$ is the Fourier transform of $\langle\mathcal{T}\{\Psi_n(y)\bar{\Psi}_m(z)\}\rangle$ (with a delta function omitted), so it is the full electron propagator. Now we can see that $\Gamma^\mu$ is the vertex function after 2 full electron propagators and 1 bare photon propagator get stripped away; thus it is 1-particle reducible along the photon line (i.e. it still contains the photon vacuum-polarization correction), hence improper. Diagrams involved are: [figures not shown]

where a dashed line means that the line has been stripped, and $\Gamma^\mu_P$ denotes the proper vertex, and we can get $\Gamma^\mu_P$ if we can further strip away the photon vacuum polarization part. The rest basically follows from calculating $\frac{\partial}{\partial x^\mu}\langle \mathcal{T}\{J^\mu(x)\Psi_n(y)\bar{\Psi}_m(z)\}\rangle$, applying $\partial_\mu J^\mu=0$ and then a Fourier transform.
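The last algebraic step can be sketched as follows. After the Fourier transform, the calculation described above leads to a relation between full propagators. I am quoting its form as I understand it from Weinberg, up to the normalization conventions of the definition of $\Gamma^\mu$, so treat the overall factors as an assumption:

$$\begin{aligned}
(l-k)_\mu\,S(k)\,\Gamma^\mu(k,l)\,S(l)&=iS(l)-iS(k),\\
(l-k)_\mu\,\Gamma^\mu(k,l)&=S^{-1}(k)\big[iS(l)-iS(k)\big]S^{-1}(l)=iS^{-1}(k)-iS^{-1}(l).
\end{aligned}$$

The second line, obtained by multiplying with $S^{-1}(k)$ from the left and $S^{-1}(l)$ from the right, is the GWI for the improper vertex.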

GWI of Proper Vertex

Now I shall claim that for a theory with local gauge invariance, GWI also holds for the proper vertex $\Gamma^\mu_P$. The idea is to isolate $\Gamma^\mu_P$ from $\Gamma^\mu$. As can be easily seen from the 2nd figure, we can first add back the bare photon propagator (let's denote it by $G_0^{\mu\nu}$), and then remove a full photon propagator $G^{\mu\nu}$, that is,

$$\Gamma^\mu_P(k,l)=G^{-1}(p)^\mu_{\ \ \nu}G_0^{\nu\rho}(p)\Gamma_\rho(k,l),$$ where $p=l-k$.

So to mimic the LHS of GWI, we have $$(l-k)_\mu\Gamma^\mu_P(k,l)=p_\mu G^{-1}(p)^\mu_{\ \ \nu}G_0^{\nu\rho}(p)\Gamma_\rho(k,l)\qquad (*).$$ Now here is where gauge invariance comes into play:

$$\text{Statement: Gauge invariance}\implies p_\mu G^{-1}(p)^\mu_{\ \ \nu}G_0^{\nu\rho}(p)=p^\rho.$$

If the statement is true, we immediately get from equation $(*)$ that $$p_\mu\Gamma^\mu_P(k,l)=p_\rho \Gamma^\rho(k,l),$$ and since GWI holds for $ \Gamma^\rho(k,l)$, from here we can conclude it also holds for $\Gamma^\mu_P(k,l)$.

Here's a sketch of the proof of the above statement: With a gauge parameter $\xi$, we can write the inverse of the bare propagator as $$G^{-1}_0(p)_{\mu\nu}=(g_{\mu\nu}p^2-p_\mu p_\nu)-\frac{1}{\xi}p_\mu p_\nu.$$ Because of gauge invariance, the full propagator differs from the bare one only in the transverse part, and the longitudinal part remains the same, that is, $$G^{-1}(p)_{\mu\nu}=(g_{\mu\nu}p^2-p_\mu p_\nu)F(p^2)-\frac{1}{\xi}p_\mu p_\nu.$$ This theorem itself involves another not-so-short proof, which can be found in Brown's book. The key point is that to prove it one needs the local symmetry; the global symmetry is not enough. Plugging in the general forms of the two propagators, one can easily prove the statement.
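The "plugging in" step can be written out with transverse and longitudinal projectors. The projector notation $P_T$, $P_L$ is mine and is not taken from Brown:

$$\begin{aligned}
P_T^{\mu\nu}&=g^{\mu\nu}-\frac{p^\mu p^\nu}{p^2},\qquad P_L^{\mu\nu}=\frac{p^\mu p^\nu}{p^2},\qquad p_\mu P_T^{\mu\nu}=0,\qquad p_\mu P_L^{\mu\nu}=p^\nu,\\
G^{-1}(p)_{\mu\nu}&=p^2F(p^2)\,P_{T\,\mu\nu}-\frac{p^2}{\xi}\,P_{L\,\mu\nu},\qquad
G_0^{\nu\rho}(p)=\frac{1}{p^2}\,P_T^{\nu\rho}-\frac{\xi}{p^2}\,P_L^{\nu\rho},\\
p_\mu\,G^{-1}(p)^\mu_{\ \ \nu}\,G_0^{\nu\rho}(p)&=-\frac{p^2}{\xi}\,p_\nu\,G_0^{\nu\rho}(p)=\Big(-\frac{p^2}{\xi}\Big)\Big(-\frac{\xi}{p^2}\Big)\,p^\rho=p^\rho.
\end{aligned}$$

Contracting with $p_\mu$ kills the transverse part, so $F(p^2)$ never enters; only the longitudinal coefficients matter, and they cancel exactly because gauge invariance leaves the longitudinal part uncorrected. Any extra factor multiplying the longitudinal part of $G^{-1}$ would simply be carried through to the right-hand side.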

This is in contrast with a theory without gauge invariance (e.g. you can get the propagator of a massive vector field by doing the replacement $\frac{1}{\xi}\to m^2$). There the full propagator also alters the longitudinal part, so it becomes $$G^{-1}(p)_{\mu\nu}=(g_{\mu\nu}p^2-p_\mu p_\nu)F(p^2)-\frac{1}{\xi}H(p^2)p_\mu p_\nu.$$ If you then carry out the calculation in the statement, the factor $H(p^2)$ in the longitudinal part survives the contraction with $p_\mu$ and you get $p_\mu G^{-1}(p)^\mu_{\ \ \nu}G_0^{\nu\rho}(p)=H(p^2)p^\rho$. For the proper vertex the GWI is then modified to $$p_\mu\Gamma^\mu_P(k,l)=H(p^2)[iS^{-1}(k)-iS^{-1}(l)],$$ which is not of much use: e.g., one can't obtain the nice renormalization relation $Z_1=Z_2$ for the proper vertex. Also, this means that in a gauge theory we can consider the (proper) vertex renormalization separately from vacuum polarization, while in, say, a massive vector theory vacuum polarization has to be taken into account.

PS: Brown also gives a second proof of the proper vertex WI using the effective action technique, which is in a way "shorter". However, it needs much more preliminary knowledge about the effective action, and it is also less handy for contrasting the roles of gauge invariance and current conservation in GWI, so I didn't adopt that method here.


The Ward identity follows from the gauge symmetry and it's possible to see these things without mentioning any current whatsoever. The Ward identity says $k_\mu {\mathcal M}^\mu(k) = 0$ which really says that the longitudinal polarization of the gauge boson, one with the pure-gauge polarization vector proportional to the momentum, $\epsilon_\mu\sim k_\mu$, "decouples" i.e. its interactions (scattering amplitudes) with any collection of physical particles vanish.

This vanishing implies a symmetry – now yes, $k_\mu {\mathcal M}^\mu(k)$ may also be interpreted as a correlator including $\partial_\mu J^\mu$, a conserved current – and this symmetry is a gauge symmetry because the gauge field may only have a nonzero $k^\mu$ i.e. dependence on the spacetime if we allow the symmetry parameter to depend on spacetime.

Fields with an extra $m^2 A^\mu A_\mu$ etc. are no longer coupled to a conserved current because the current is modified by an extra $m^2 A_\mu$ – because this appears as a factor multiplying $A_\mu$ in a term you just added – which also means that the Ward identity won't hold if you break the symmetry in this explicit way (the Ward identity will be broken "more controllably" if you break the symmetry spontaneously, not explicitly, because the full Lagrangian still has the gauge symmetry, i.e. the gauge field is coupled to a conserved current).