The unreasonable effectiveness of the partition function

The partition function is strongly related to a very useful tool in probability theory called the moment generating function(al) of the probability distribution.

For any probability distribution $p$ of some random variable $X$, the moment generating function $\mathcal{M}(z)$ is defined as the expectation value

\begin{equation} \mathcal{M}(z) \equiv \left\langle e^{zX}\right\rangle \end{equation}

so that we have for instance:

\begin{equation} \left(\frac{\partial \mathcal{M}}{\partial z}\right)_{z = 0} = \langle X \rangle, \end{equation}

\begin{equation} \left(\frac{\partial^2 \mathcal{M}}{\partial z^2}\right)_{z = 0} = \langle X^2 \rangle, \end{equation} and in general \begin{equation} \mathcal{M}^{(n)}(0) = \langle X^n \rangle \end{equation}
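To make the derivative relations above concrete, here is a minimal numerical sketch. It assumes a fair six-sided die as the distribution and approximates the derivatives of $\mathcal{M}$ at $z=0$ by central finite differences. The names `mgf`, `moment` and the step `h` are illustrative choices, not part of the argument.

```python
import math

# Hypothetical example distribution: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

def mgf(z):
    """Moment generating function M(z) = <exp(z X)>."""
    return sum(p * math.exp(z * x) for x, p in zip(values, probs))

def moment(n):
    """n-th moment <X^n> computed directly from the distribution."""
    return sum(p * x**n for x, p in zip(values, probs))

h = 1e-4  # finite-difference step (an arbitrary choice)

# Central differences approximate M'(0) and M''(0).
first_derivative = (mgf(h) - mgf(-h)) / (2 * h)
second_derivative = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2

print(first_derivative, moment(1))   # both approximate <X>
print(second_derivative, moment(2))  # both approximate <X^2>
```

The two numbers on each printed line should agree up to the finite-difference error.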

Now, in statistical mechanics the standard ensembles other than the microcanonical one have an exponential form with respect to their fluctuating thermodynamic variables: the energy $E$ for the canonical ensemble, the energy $E$ and the number of particles $N$ for the grand canonical ensemble, and the energy $E$ and the volume $V$ for the isobaric ensemble, to name a few. The probability distribution itself therefore has a form like this

\begin{equation} p_t(X) = \frac{f(X)e^{tX}}{Z(t)} \end{equation} where $t$ is a real number corresponding to one of the intensive thermodynamic variables and $Z(t)$ is the constant that normalizes the distribution.

The moment generating function for probabilities like these will look like

\begin{equation} \mathcal{M}(z) =\left \langle e^{zX}\right\rangle = \frac{1}{Z(t)}\int \mathrm d\mu(x) \:f(x) e^{tx} e^{zx} \end{equation}

The normalization constant is precisely the partition function,

\begin{equation} Z(t) \equiv \int \mathrm d\mu(x) \: f(x)e^{tx}, \end{equation} so we find that

\begin{equation} \mathcal{M}(z) = \frac{Z(t+z)}{Z(t)} \end{equation} and therefore

\begin{equation} \mathcal{M}^{(n)}(0) = \frac{Z^{(n)}(t)}{Z(t)} \end{equation}

In statistical mechanics we generally prefer the logarithm of the partition function (which, as a function of $z$, is also the logarithm of the moment generating function up to an additive constant) because its successive derivatives generate the cumulants of the distribution instead of the moments.
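As a quick check of the cumulant statement, the sketch below assumes a small discrete system with hypothetical values `xs`, weights `fs` and parameter `t`. It compares finite-difference derivatives of $\ln Z(t)$ with the mean and variance computed directly from the normalized weights $f(x)e^{tx}/Z(t)$.

```python
import math

# Hypothetical discrete system: values x of the fluctuating variable
# and their weights f(x) (for instance, degeneracies).
xs = [0.0, 1.0, 2.0, 3.0]
fs = [1.0, 3.0, 3.0, 1.0]
t = -0.7  # plays the role of the intensive variable, e.g. t = -1/(k_B T)

def log_Z(s):
    """Logarithm of the partition function Z(s) = sum_x f(x) exp(s x)."""
    return math.log(sum(f * math.exp(s * x) for x, f in zip(xs, fs)))

# Mean and variance computed directly from the normalized probabilities.
Z = math.exp(log_Z(t))
ps = [f * math.exp(t * x) / Z for x, f in zip(xs, fs)]
mean = sum(p * x for p, x in zip(ps, xs))
variance = sum(p * (x - mean) ** 2 for p, x in zip(ps, xs))

# First and second derivatives of ln Z by central finite differences.
h = 1e-4  # arbitrary step size
kappa1 = (log_Z(t + h) - log_Z(t - h)) / (2 * h)
kappa2 = (log_Z(t + h) - 2 * log_Z(t) + log_Z(t - h)) / h**2

print(kappa1, mean)      # first cumulant vs. mean
print(kappa2, variance)  # second cumulant vs. variance
```

The second derivative gives the variance directly, whereas the second derivative of $Z$ itself would involve the raw second moment.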


The partition function contains so much information because it is directly related to the free energy, $$F = - k_B T \ln(Z) \, .$$ The physical assumption behind treating $F$ as a thermodynamic potential is that the statistics of the system are described by the canonical ensemble.

In turn, the applicability of the canonical ensemble is a direct consequence of the applicability of the microcanonical ensemble. The microcanonical ensemble assumes that all micro-states with the same energy are visited equally often by the dynamics of the system. This is called the ergodic hypothesis. It works so well because most realistic systems are chaotic.

To summarise, the reasoning goes as follows:

Systems are chaotic $\rightarrow$ The microcanonical ensemble works and the entropy is a thermodynamic potential $\rightarrow$ Legendre-transforming the entropy implies that the free energy is a thermodynamic potential $\rightarrow$ The free energy is given by $F=-k_B T \ln(Z)$ $\rightarrow$ $Z$ contains all the thermodynamic information that one can wish for.

At the level of the thermodynamic potentials, entropy and free energy are related through a Legendre transform, $$S(E) \rightarrow F(T) = E(T) - T\,S(E(T))\, .$$ At the level of statistical physics, this is mirrored in the relation $$Z = \sum_{E} \Omega(E) \text{e}^{-E/k_BT} \, .$$ $\Omega(E)$ is the microcanonical "partition function". It is the number of micro-states with energy $E$ and is related to the entropy through $$ S(E) = k_B\ln(\Omega(E)) \, .$$

Note that in your expression for $Z$, you sum over all the micro-states of the system. Here I sum over the different energies that the system may have and use $\Omega(E)$ as a weight factor.
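The two ways of summing can be compared on a toy model. The sketch below assumes $N$ independent two-level units with level spacing `eps` (a hypothetical system chosen because $\Omega(E)$ is then a binomial coefficient). It evaluates $Z$ once as a sum over micro-states and once as a sum over energies weighted by $\Omega(E)$.

```python
import itertools
import math

N = 10     # number of two-level units (hypothetical toy model)
eps = 1.0  # energy of the excited level, arbitrary units
kT = 0.8   # k_B * T in the same units

# Sum over all 2**N micro-states; each state is a tuple of 0s and 1s
# and its energy is eps times the number of excited units.
Z_micro = sum(
    math.exp(-eps * sum(state) / kT)
    for state in itertools.product((0, 1), repeat=N)
)

# Sum over the energies E = n * eps, weighted by the number of
# micro-states Omega(E) = C(N, n) that share that energy.
Z_energy = sum(
    math.comb(N, n) * math.exp(-n * eps / kT) for n in range(N + 1)
)

# Closed form for this particular model, used as a cross-check.
Z_closed = (1 + math.exp(-eps / kT)) ** N

print(Z_micro, Z_energy, Z_closed)
```

All three numbers should coincide up to rounding, since the sums only group the same Boltzmann factors differently.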


I think one way to understand why this works is that the spectrum of energy levels $E_i$ has undergone a sort of transform (analogous to a Laplace transform), which results in the partition function $Z(T)$. In principle, if you know the function $Z(T)$, you can reverse the process and reconstruct the original spectrum of energy levels.

As such, all information about the $E_i$ spectrum has been encoded into $Z(T)$.
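
As an illustration of reading the spectrum back out of $Z$, take the standard textbook case of a single quantum harmonic oscillator (an example chosen here, not one from the question), and write $\beta = 1/k_B T$. Its partition function is

\begin{equation} Z(\beta) = \frac{e^{-\beta\hbar\omega/2}}{1 - e^{-\beta\hbar\omega}} \end{equation}

Expanding the denominator as a geometric series gives

\begin{equation} Z(\beta) = \sum_{n=0}^{\infty} e^{-\beta\hbar\omega\left(n+\frac{1}{2}\right)} \end{equation}

and comparing term by term with the general form $Z(\beta) = \sum_i g_i\, e^{-\beta E_i}$ recovers the levels $E_n = \hbar\omega\left(n + \tfrac{1}{2}\right)$, each with degeneracy $g_n = 1$. More generally, writing the density of states as $g(E) = \sum_i g_i\,\delta(E - E_i)$ gives

\begin{equation} Z(\beta) = \int \mathrm dE \: g(E)\, e^{-\beta E} \end{equation}

which is the Laplace transform of $g(E)$, so $g(E)$ can in principle be recovered by inverting that transform.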