Why is the correlation function a power law at the critical point?

The exponent is not in general an integer, and therefore the divergence is not really polynomial. That being said, let's agree to call a structure of the form $x^a$ a polynomial, for any (real) $a$.

The general case.

In general terms, you cannot really prove that the divergence is always polynomial. If $f(x)$ has a singularity at $x_0$, you may define $$ k\equiv-\lim_{x\to x_0}\frac{\log|f(x)|}{\log |x-x_0|} $$

If $k$ is finite, then the singularity is of the form $$ f(x)\sim (x-x_0)^{-k} $$ which proves, a posteriori, that the divergence is indeed polynomial.

It may very well happen that $k$ does not exist. Some examples are $$ \begin{aligned} f(x)&=\mathrm e^{-1/(x-x_0)^a}\\ f(x)&=a\log|x-x_0| \end{aligned} $$ where $k=-\infty$ and $k=0$ respectively. In neither of these cases is the divergence polynomial.

Of course, these examples of $f$ are rather unphysical. In general, you have several physical principles that allow you to restrict the possible functions $f$, and you may sometimes even succeed in proving that $k$ is actually finite (cf. a CFT below). In general terms, it is better to say that the critical exponent $k$ has been observed to be finite for a wide class of theories, and this is confirmed by explicit calculation (numerical or otherwise) in many well-studied examples.

The case of a CFT.

If $\phi_1,\phi_2$ is a pair of primary fields of weight $\Delta_1,\Delta_2$, then conformal invariance implies that the corresponding correlation function is given by $$ \langle\phi_1(x)\phi_2(0)\rangle=\frac{a}{|x_1-x_2|^{\Delta_1+\Delta_2}} $$ in which case the divergence is indeed polynomial. For a proof of this formula, see e.g. 1511.04074, §2.6.

A general QFT, at a critical point, is described by a CFT, and therefore this result is valid for essentially every healthy QFT.


The renormalization group flow is defined in terms of rescaled variables, $\hat{x} = k x$, $\hat{\phi}(\hat{x}) = k^{(2-D-\nu)/2} \phi(\hat{x}/k)$ and renormalised couplings $\hat{m} = m/k^2$ and so on. $k$ is the running cut-off scale.

In these variables, the Green function [$G(x) = \langle \phi(x) \phi(0)\rangle$] is

$$\hat{G}(\hat{x};\hat{m},\hat{g},\dots) = k^{2-D-\nu} G(\hat{x}/k) \, . \qquad (*)$$

In principle $\hat{G}$ is a function of $\hat{x}$ and all the other couplings $\hat{m}$, $\hat{g}$, etc. and these couplings are all functions of $k$. At a fixed point of the renormalisation group however, we get $k\partial_k \hat{m} = k\partial_k \hat{g} = \dots = 0$. The couplings stop running. Then writing (*) as [The starred couplings are the coordinates of the fixed point]

$$ G(x) = k^{D+\nu-2} \hat{G}(kx;\hat{m}^*,\hat{g}^*,\dots) $$

and using the fact that $k$ is a silent variable [$G(x)$ does not depend on $k$ by construction] provides a scaling form. Indeed, since $k$ drops out we can choose it to be $k=1/x$ and get

$$ G(x) = x^{-D-\nu+2} \hat{G}(1;\hat{m}^*,\hat{g}^*,\dots) \, .$$

Note that this reasoning also works close to [but not exactly on] the fixed point. In that case, the couplings behave as $\hat{m} - \hat{m} \sim k^{\eta_m}$, $\hat{g} - \hat{g} \sim k^{\eta_g}$ and so on. Then we get

$$ G(x) = k^{D+\nu-2} \hat{G}(kx;\hat{m}^*+\delta m \, k^{\eta_m},\hat{g}^*+\delta g \, k^{\eta_g},\dots) \, .$$

Assuming that $\eta_m<0$ all the other exponents are positive provides

$$ G(x) = k^{D+\nu-2} \hat{G}(kx;\delta m \, k^{\eta_m},\hat{g}^*,\dots) \, , \qquad (**)$$

if $k$ is small enough. Note that if $\delta m$ is very small, then (**) holds for a wide range of values of $k$. Then choosing $k=1/x$ again provides

$$ G(x) = x^{-D-\nu+2} \hat{G}(1;\hat{m}^*,\delta g \, x^{-\eta_g},\dots) \\ \hspace{2.7cm} = x^{-D-\nu+2} \hat{G}(1;\hat{m}^*,[(\delta g)^{-1/\eta_g} \, x]^{-\eta_g},\dots) \\ \hspace{-1.5cm} = x^{-D-\nu+2} F(x/\xi) \, . $$

This tells us that the correlation length diverges as $\xi \sim (\delta g)^{\eta_g}$ as we get close to criticality.

Note that I made a strong assumption on the exponents $\eta_i$. In practice however you do not need to make this assumption. You can compute theses exponents by linearising the renormalization group flow close to the fixed point. Then you know the sign of all of them. It turns out that very often you get a small number of negative exponents.

Renormalization is a huge topic that can be interpreted in different ways and pops up all over physics. My favourite introductory book on the topic is 'Lectures on phase transitions and the renormalization group' by Goldenfeld.