What is entropy really?

There are two definitions of entropy, which physicists believe to be the same (modulo the dimensional Boltzmann scaling constant), and the postulate of their sameness has so far yielded agreement between what is theoretically foretold and what is experimentally observed. There are theoretical grounds, namely most of the subject of statistical mechanics, for believing them to be the same, but ultimately their sameness is an experimental observation.

  1. (Boltzmann / Shannon): Given a thermodynamic system with a known macrostate, the entropy is the size, in bits, of the document you would need to write down to specify the system's full quantum state. Otherwise put, it is proportional to the logarithm of the number of full quantum states that could prevail and be consistent with the observed macrostate. Yet another version: it is the conditional Shannon entropy (the missing information) of the maximum likelihood probability distribution of the system's microstate, conditioned on knowledge of the prevailing macrostate;

  2. (Clausius / Carnot): Let a quantity $\delta Q$ of heat be input to a system at temperature $T$. Then the system's entropy change is $\frac{\delta Q}{T}$. This definition requires background, not least what we mean by temperature; the well-definedness of entropy (i.e. that it is a function of state alone, so that changes are independent of the path between endpoint states) follows from the definition of temperature, which is made meaningful by the following steps in reasoning (see my answer here for details). (1) Carnot's theorem shows that all reversible heat engines working between the same two hot and cold reservoirs must work at the same efficiency, for an assertion otherwise leads to a contradiction of the postulate that heat cannot flow spontaneously from the cold to the hot reservoir. (2) Given this universality of reversible engines, we have a way to compare reservoirs: we take a "standard reservoir" and call its temperature unity, by definition. If we have a hotter reservoir, such that a reversible heat engine operating between the two yields $T$ units of work for every 1 unit of heat it dumps to the standard reservoir, then we call its temperature $T$. If we have a colder reservoir and do the same (using the standard as the hot reservoir) and find that the engine yields $T$ units of work for every 1 unit dumped, we call its temperature $T^{-1}$. It follows from these definitions alone that the quantity $\frac{\delta Q}{T}$ is an exact differential, because $\int_a^b \frac{\delta Q}{T}$ between positions $a$ and $b$ in phase space must be independent of path (otherwise one could violate the second law). So we have this new function of state, "entropy", defined to increase by the exact differential $\mathrm{d} S = \delta Q / T$ when a system reversibly absorbs heat $\delta Q$. A numerical sketch of this path-independence for a concrete cycle follows just after this list.
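
To make that path-independence concrete, here is a minimal sketch (my own illustration, not part of the argument above) that takes an ideal monatomic gas around a Carnot cycle and checks that the contributions $\delta Q / T$ around the closed loop sum to zero. The reservoir temperatures and volumes are invented for the example:

```python
# Check that the sum of (heat)/(temperature) around a closed reversible cycle
# vanishes, i.e. that delta Q / T behaves as an exact differential.
# Toy system: 1 mol of ideal monatomic gas taken around a Carnot cycle.
import math

R = 8.314           # gas constant, J/(mol K)
gamma = 5.0 / 3.0   # heat capacity ratio of a monatomic ideal gas
n = 1.0             # amount of gas, mol

T_hot, T_cold = 500.0, 300.0   # reservoir temperatures, K (illustrative)
V1, V2 = 0.010, 0.030          # volumes bounding the hot isotherm, m^3 (illustrative)

# The adiabats satisfy T * V**(gamma - 1) = const, which fixes the volumes
# bounding the cold isotherm.
V3 = V2 * (T_hot / T_cold) ** (1.0 / (gamma - 1.0))
V4 = V1 * (T_hot / T_cold) ** (1.0 / (gamma - 1.0))

# Heat is exchanged only on the isotherms, where Q = n R T ln(V_final / V_initial).
Q_hot = n * R * T_hot * math.log(V2 / V1)     # > 0: absorbed from the hot reservoir
Q_cold = n * R * T_cold * math.log(V4 / V3)   # < 0: dumped to the cold reservoir

loop_sum = Q_hot / T_hot + Q_cold / T_cold    # the cycle's total of delta Q / T
print(f"sum of dQ/T around the cycle = {loop_sum:.3e}   (should be ~0)")
print(f"efficiency = {1.0 + Q_cold / Q_hot:.4f}   vs Carnot 1 - Tc/Th = {1.0 - T_cold / T_hot:.4f}")
```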

As stated at the outset, it is an experimental observation that these two definitions are the same; we do need to apply a dimensional scaling constant to the quantity in definition 2 to make the two match, because the quantity in definition 2 depends on which reservoir we take to be the "standard". This scaling constant is the Boltzmann constant $k$.
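
As a purely illustrative instance of definition 1 and of this scaling, the following sketch counts the microstates of a toy system of two-level spins compatible with a given macrostate, and expresses the resulting entropy both in bits and in conventional units (the system and the numbers are invented for the example):

```python
# Toy instance of definition 1: the macrostate "n of N two-level spins are excited"
# is compatible with C(N, n) microstates, so pinning down the exact microstate
# takes log2 C(N, n) bits.  Multiplying the natural log by Boltzmann's constant
# expresses the same quantity in conventional thermodynamic units.
import math

k_B = 1.380649e-23                       # Boltzmann constant, J/K

N, n = 100, 40                           # illustrative: 100 spins, 40 of them excited
omega = math.comb(N, n)                  # microstates consistent with the macrostate
S_bits = math.log2(omega)                # the "document size" in bits
S_conventional = k_B * math.log(omega)   # S = k ln(Omega), in J/K

print(f"Omega = {omega:.3e} compatible microstates")
print(f"S = {S_bits:.1f} bits = {S_conventional:.3e} J/K")
print(f"(1 bit of missing information corresponds to k ln 2 = {k_B * math.log(2):.3e} J/K)")
```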

When one postulates that heat flows and allowable system evolutions are governed by probabilistic mechanisms, and that a system's observed evolution is its maximum likelihood one (i.e. when one studies statistical mechanics), the equations of classical thermodynamics are reproduced, with the right interpretation of statistical parameters in terms of thermodynamic state variables. For instance, by a simple maximum likelihood argument, justified by the issues discussed in my post here, one can demonstrate that an ensemble of particles with allowed energy levels $E_i$ of degeneracy $g_i$ has, at equilibrium (the maximum likelihood distribution), the probability distribution $p_i = \mathcal{Z}^{-1}\, g_i\,\exp(-\beta\,E_i)$, where $\mathcal{Z} = \sum\limits_j g_j\,\exp(-\beta\,E_j)$ and $\beta$ is a Lagrange multiplier. The Shannon entropy of this distribution is then:

$$S = \frac{1}{\mathcal{Z}(\beta)}\,\sum\limits_i \left((\log\mathcal{Z}(\beta) + \beta\,E_i-\log g_i )\,g_i\,\exp(-\beta\,E_i)\right)\tag{1}$$

with heat energy per particle:

$$Q = \frac{1}{\mathcal{Z}(\beta)}\,\sum\limits_i \left(E_i\,g_i\,\exp(-\beta\,E_i)\right)\tag{2}$$

and:

$$\mathcal{Z}(\beta) = \sum\limits_j g_j\,\exp(-\beta\,E_j)\tag{3}$$

Now add a quantity of heat to the system so that the heat per particle rises by $\mathrm{d}Q$ and let the system settle to equilibrium again; from (2) and (3) solve for the change $\mathrm{d}\beta$ in $\beta$ needed to do this and substitute into (1) to find the entropy change arising from this heat addition. It is found that:

$$\mathrm{d} S = \beta\,\mathrm{d} Q\tag{4}$$

and so we match the two definitions of entropy if we postulate that the temperature is given by $T = \beta^{-1}$ (modulo the Boltzmann constant).
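
Here is a minimal numerical check of this chain of reasoning, using an invented toy spectrum of evenly spaced, non-degenerate levels (so $g_i = 1$ and (1) coincides with the microstate Shannon entropy). It evaluates (1)–(3) at two nearby values of $\beta$ and confirms (4) to first order:

```python
# Finite-difference check of equations (1)-(4) for an invented toy spectrum of
# evenly spaced, non-degenerate levels E_i = i (arbitrary units), g_i = 1.
import math

E = [float(i) for i in range(200)]   # enough levels that truncation is negligible
g = [1.0] * len(E)                   # non-degenerate levels

def Z(beta):    # equation (3): partition function
    return sum(gi * math.exp(-beta * Ei) for gi, Ei in zip(g, E))

def Q(beta):    # equation (2): mean (heat) energy per particle
    return sum(Ei * gi * math.exp(-beta * Ei) for gi, Ei in zip(g, E)) / Z(beta)

def S(beta):    # equation (1): Shannon entropy of the distribution p_i
    z = Z(beta)
    return sum((math.log(z) + beta * Ei - math.log(gi)) * gi * math.exp(-beta * Ei)
               for gi, Ei in zip(g, E)) / z

beta, dbeta = 0.7, 1.0e-6            # illustrative beta and a small perturbation
dQ = Q(beta + dbeta) - Q(beta)       # change in heat energy per particle
dS = S(beta + dbeta) - S(beta)       # resulting change in entropy

print(f"dS        = {dS:.9e}")
print(f"beta * dQ = {beta * dQ:.9e}   (equation (4), up to O(dbeta^2))")
```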

Lastly, it is good to note that there is still considerable room for ambiguity in definition 1 above outside simple cases, e.g. an ensemble of quantum harmonic oscillators, where the quantum states are manifestly discrete and easy to count. Often we are forced to continuum approximations, and one then has the freedom to choose the coarse-graining size, i.e. the size of the discretizing volume in continuous phase space that distinguishes truly different microstates; alternatively, one must be content to deal only with relative entropies in truly continuous probability distribution models. Therefore, in statistical mechanical analyses one looks for results that are only weakly dependent on the exact coarse-graining volume used.
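
The following sketch illustrates that weak dependence with an invented stand-in for a continuous phase-space distribution: the binned entropy of a Gaussian grows without bound as the coarse-graining cell shrinks, but the entropy difference between two such distributions barely changes:

```python
# The entropy of a continuous distribution, discretised into cells of size delta,
# grows like -log2(delta) as the cells shrink, but *differences* between two such
# entropies are nearly independent of delta.  The Gaussians below are invented
# stand-ins for continuous phase-space distributions.
import math

def binned_entropy_bits(sigma, delta, span=12.0):
    """Shannon entropy (bits) of a zero-mean Gaussian of width sigma,
    coarse-grained into cells of size delta."""
    n_cells = int(2.0 * span * sigma / delta)
    H = 0.0
    for k in range(n_cells):
        x = -span * sigma + (k + 0.5) * delta          # cell centre
        p = delta * math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        if p > 0.0:
            H -= p * math.log2(p)
    return H

for delta in (0.1, 0.01, 0.001):
    H_narrow = binned_entropy_bits(sigma=1.0, delta=delta)
    H_wide = binned_entropy_bits(sigma=3.0, delta=delta)
    print(f"delta = {delta:5}:  H(sigma=1) = {H_narrow:6.3f} bits,  "
          f"H(sigma=3) = {H_wide:6.3f} bits,  difference = {H_wide - H_narrow:.3f} bits")
# Each entropy diverges as delta -> 0, but the difference stays close to log2(3).
```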


The entropy of a system is the amount of information needed to specify its exact physical state given its incomplete macroscopic specification. So, if a system can be in $\Omega$ possible states with equal probability, then the number of bits needed to specify exactly which one of these $\Omega$ states the system is actually in would be $\log_{2}(\Omega)$. In conventional units we express the entropy as $S = k_\text{B}\log(\Omega)$.


Here's an intentionally more conceptual answer: Entropy is the smoothness of the energy distribution over some given region of space. To make that more precise, you must define the region, the type of energy (or mass-energy) considered sufficiently fluid within that region to be relevant, and the Fourier spectrum and phases of those energy types over that region.

Using relative ratios "factors out" much of this ugly messiness by focusing on differences in smoothness between two very similar regions, e.g. the same region at two points in time. Unfortunately, this also masks the complexity of what is really going on.

Still, smoothness remains the key defining feature of higher entropy in such comparisons. A field with a roaring campfire has lower entropy than a field with cold embers because, with respect to thermal and infrared forms of energy, the live campfire creates a huge and very unsmooth peak in the middle of the field.
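
To put a toy number on that picture (my own crude stand-in, not a full Fourier treatment): model the field as two Einstein solids, a small patch for the fire and a much larger one for the rest of the field, sharing a fixed number of energy quanta. Piling the quanta into the small patch admits far fewer microstates than spreading them smoothly, so the smooth configuration has the higher entropy. All sizes and quanta counts below are invented for illustration.

```python
# Two Einstein solids sharing a fixed number of energy quanta: a small "fire"
# patch and the much larger rest of the field.  The multiplicity of a solid with
# N oscillators holding q quanta is C(q + N - 1, q); entropy here is its log2.
import math

def entropy_bits(n_osc, q):
    """log2 multiplicity of an Einstein solid with n_osc oscillators and q quanta."""
    return math.log2(math.comb(q + n_osc - 1, q))

N_patch, N_rest = 100, 10000     # oscillators in the fire patch / in the rest of the field
q_total = 5000                   # total energy quanta shared by the two regions

# "Roaring campfire": almost all of the energy piled into the small patch.
S_peaked = entropy_bits(N_patch, 4900) + entropy_bits(N_rest, q_total - 4900)
# "Cold embers": the same energy spread in proportion to the region sizes.
S_smooth = entropy_bits(N_patch, 50) + entropy_bits(N_rest, q_total - 50)

print(f"concentrated (campfire): {S_peaked:8.1f} bits")
print(f"smoothed (cold embers):  {S_smooth:8.1f} bits")
```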