Chemistry - Derivation of the Heisenberg uncertainty principle

Solution 1:

The proof I will use is taken from Griffiths, Introduction to Quantum Mechanics, 2nd ed., pp. 110–111.

Defining "uncertainty"

Let's assume that the normalised state $|\psi\rangle$ of a particle can be expanded as a linear combination of energy eigenstates $|n\rangle$, with $\hat{H}|n\rangle = E_n |n\rangle$.

$$| \psi \rangle = \sum_n c_n |n\rangle \tag{1}$$

The expectation value (the "mean") of a quantity, such as energy, is given by

$$\begin{align} \langle E\rangle &= \langle \psi | \hat{H} | \psi \rangle \tag{2} \end{align}$$

and the variance of the energy can be defined analogously to the definition used in statistics, where the variance of a variable $x$ is simply the expectation value of $(x - \bar{x})^2$:

$$\sigma_E^2 = \left\langle (E - \langle E\rangle)^2 \right\rangle \tag{3}$$

The standard deviation is the square root of the variance, and the "uncertainty" refers to this standard deviation. It is more rigorous to use $\sigma$ as the symbol instead of $\Delta$, and this is what you will see in most careful texts.

$$\sigma_E = \sqrt{\left\langle (E - \langle E\rangle)^2 \right\rangle} \tag{4}$$

However, it's much easier to stick to the variance in the proof. Let's generalise this now to any generic observable, $A$, which is necessarily represented by a hermitian operator, $\hat{A}$. The expectation value of $A$ is merely a number, so let's use the small letter $a$ to refer to it. With that, we have

$$\begin{align} \sigma_A^2 &= \left\langle (A - a)^2 \right\rangle \tag{5} \\ &= \left\langle \psi \middle| (\hat{A} - a)^2 \middle| \psi \right\rangle \tag{6} \\ &= \left\langle \psi \middle| (\hat{A} - a) \middle| (\hat{A} - a)\psi \right\rangle \tag{7} \\ &= \left\langle (\hat{A} - a)\psi \middle| (\hat{A} - a) \middle| \psi \right\rangle \tag{8} \\ &= \left\langle (\hat{A} - a)\psi \middle| (\hat{A} - a)\psi \right\rangle \tag{9} \end{align}$$

where, in going from $(7)$ to $(8)$, I have invoked the hermiticity of $(\hat{A} - a)$ (since $\hat{A}$ is hermitian and $a$ is a real constant). Likewise, for a second observable $B$ with $\langle B \rangle = b$,

$$\sigma_B^2 = \left\langle (\hat{B} - b)\psi \middle| (\hat{B} - b)\psi \right\rangle \tag{10}$$
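In a finite-dimensional setting, equation $(9)$ can be sanity-checked numerically: with a random hermitian matrix standing in for $\hat{A}$ and a random normalised state for $|\psi\rangle$ (both arbitrary choices, not part of the proof), the inner-product form of the variance agrees with the familiar $\langle A^2\rangle - \langle A\rangle^2$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# A random hermitian "observable" and a random normalised state
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (M + M.conj().T) / 2
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi /= np.linalg.norm(psi)

a = np.vdot(psi, A @ psi).real               # <A>, real because A is hermitian
f = (A - a * np.eye(n)) @ psi                # (A - a)|psi>
var_eq9 = np.vdot(f, f).real                 # eq. (9): <(A-a)psi | (A-a)psi>
var_stats = np.vdot(psi, A @ A @ psi).real - a**2   # <A^2> - <A>^2
```

Here `np.vdot` conjugates its first argument, so it implements the bra-ket inner product directly.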

The Cauchy-Schwarz inequality

...states that, for all vectors $f$ and $g$ belonging to an inner product space (suffice it to say that functions in quantum mechanics satisfy this condition),

$$\langle f | f \rangle \langle g | g \rangle \geq |\langle f | g \rangle|^2 \tag{11}$$

In general, $\langle f | g \rangle$ is a complex number, which is why we need to take the modulus. By the definition of the inner product,

$$\langle f | g \rangle = \langle g | f \rangle^* \tag{12}$$

For a generic complex number $z = x + \mathrm{i}y$, we have

$$|z|^2 = x^2 + y^2 \geq y^2 \qquad \qquad \text{(since }x^2 \geq 0\text{)} \tag{13}$$

But $z^* = x - \mathrm{i}y$ means that

$$\begin{align} y &= \frac{z - z^*}{2\mathrm{i}} \tag{14} \\ |z|^2 &\geq \left(\frac{z - z^*}{2\mathrm{i}}\right)^2 \tag{15} \end{align}$$

and plugging $z = \langle f | g \rangle$ into equation $(15)$, we get

$$|\langle f | g \rangle|^2 \geq \left[\frac{1}{2\mathrm{i}}(\langle f | g \rangle - \langle g | f \rangle) \right]^2 \tag{16}$$
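Inequalities $(11)$ and $(16)$ can be checked numerically for arbitrary complex vectors (random stand-ins for $|f\rangle$ and $|g\rangle$; a sketch, not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
f = rng.normal(size=n) + 1j * rng.normal(size=n)
g = rng.normal(size=n) + 1j * rng.normal(size=n)

z = np.vdot(f, g)                                  # <f|g>; np.vdot conjugates f
lhs = np.vdot(f, f).real * np.vdot(g, g).real      # <f|f><g|g>
mid = abs(z) ** 2                                  # |<f|g>|^2
rhs = ((z - z.conjugate()) / 2j).real ** 2         # eq. (16): the squared imaginary part
```

The chain `lhs >= mid >= rhs` is exactly the combination of the Cauchy-Schwarz inequality $(11)$ with the bound $(16)$.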

Now, if we let $| f \rangle = | (\hat{A} - a)\psi \rangle$ and $| g \rangle = | (\hat{B} - b)\psi \rangle$, we can combine equations $(9)$, $(10)$, $(11)$, and $(16)$ to get:

$$\begin{align} \sigma_A^2 \sigma_B^2 &= \langle f | f \rangle \langle g | g \rangle \tag{17} \\ &\geq |\langle f | g \rangle|^2 \tag{18} \\ &\geq \left[\frac{1}{2\mathrm{i}}(\langle f | g \rangle - \langle g | f \rangle) \right]^2 \tag{19} \end{align}$$

Expanding the brackets

If you've made it this far, great job; take a breather before you continue, because there's more maths coming.

We have$^{1}$

$$\begin{align} \langle f | g \rangle &= \left\langle (\hat{A} - a)\psi \middle| (\hat{B} - b)\psi \right\rangle \tag{20} \\ &= \langle \hat{A}\psi |\hat{B}\psi \rangle - \langle a\psi |\hat{B}\psi \rangle - \langle \hat{A}\psi | b\psi \rangle + \langle a\psi |b\psi \rangle \tag{21} \\ &= \langle \psi |\hat{A}\hat{B}|\psi \rangle - a\langle \psi |\hat{B}\psi \rangle - b\langle \hat{A}\psi | \psi \rangle + ab\langle \psi |\psi \rangle \tag{22} \\ &= \langle \psi |\hat{A}\hat{B}|\psi \rangle - ab - ab + ab \tag{23} \\ &= \langle \psi |\hat{A}\hat{B}|\psi \rangle - ab \tag{24} \end{align}$$

Likewise,

$$\langle g | f \rangle = \langle \psi |\hat{B}\hat{A}|\psi \rangle - ab \tag{25}$$
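Equations $(24)$ and $(25)$ can likewise be verified numerically, with random hermitian matrices in place of $\hat{A}$ and $\hat{B}$ (arbitrary stand-ins, not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5

def herm(n):
    # Random hermitian matrix standing in for an observable
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

A, B = herm(n), herm(n)
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi /= np.linalg.norm(psi)
I = np.eye(n)

a = np.vdot(psi, A @ psi).real
b = np.vdot(psi, B @ psi).real
f = (A - a * I) @ psi                    # |f> = |(A - a) psi>
g = (B - b * I) @ psi                    # |g> = |(B - b) psi>

fg = np.vdot(f, g)                       # <f|g>, should equal <psi|AB|psi> - ab
gf = np.vdot(g, f)                       # <g|f>, should equal <psi|BA|psi> - ab
```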

So, substituting $(24)$ and $(25)$ into $(19)$,

$$\begin{align} \sigma_A^2 \sigma_B^2 &\geq \left[\frac{1}{2\mathrm{i}}(\langle\psi |\hat{A}\hat{B}|\psi \rangle - \langle \psi |\hat{B}\hat{A}|\psi\rangle) \right]^2 \tag{26} \\ &= \left[\frac{1}{2\mathrm{i}}(\langle\psi |\hat{A}\hat{B} - \hat{B}\hat{A}|\psi \rangle ) \right]^2 \tag{27} \end{align}$$

The commutator of two operators is defined as

$$[\hat{A},\hat{B}] = \hat{A}\hat{B} - \hat{B}\hat{A} \tag{28}$$

So, the term in parentheses in equation $(27)$ is simply the expectation value of the commutator, and we have reached the Robertson uncertainty relation:

$$\sigma_A^2 \sigma_B^2 \geq \left(\frac{1}{2\mathrm{i}}\langle[\hat{A},\hat{B} ]\rangle \right)^2 \tag{29}$$

This inequality can be applied to any pair of observables $A$ and $B$.$^{2}$
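The Robertson relation $(29)$ can itself be checked numerically: for hermitian matrices, $\langle[\hat{A},\hat{B}]\rangle$ is purely imaginary, so the right-hand side is a non-negative real number (a sketch with arbitrary random matrices and state, not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5

def herm(n):
    # Random hermitian matrix standing in for an observable
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

A, B = herm(n), herm(n)
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi /= np.linalg.norm(psi)
I = np.eye(n)

a = np.vdot(psi, A @ psi).real
b = np.vdot(psi, B @ psi).real
var_A = np.vdot(psi, (A - a * I) @ (A - a * I) @ psi).real
var_B = np.vdot(psi, (B - b * I) @ (B - b * I) @ psi).real

comm_exp = np.vdot(psi, (A @ B - B @ A) @ psi)   # <[A,B]>, purely imaginary
bound = (comm_exp / 2j).real ** 2                # ((1/2i) <[A,B]>)^2, eq. (29)
```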

The Heisenberg uncertainty principle

Simply substituting in $A = x$ and $B = p$ gives us

$$\sigma_x^2 \sigma_p^2 \geq \left(\frac{1}{2\mathrm{i}}\langle[\hat{x},\hat{p} ]\rangle \right)^2 \tag{30}$$

The commutator of $\hat{x}$ and $\hat{p}$ is famously $\mathrm{i}\hbar$,$^{3}$ and the expectation value of $\mathrm{i}\hbar$ is of course none other than $\mathrm{i}\hbar$. This completes the proof:

$$\begin{align} \sigma_x^2 \sigma_p^2 &\geq \left(\frac{1}{2\mathrm{i}}\cdot\mathrm{i}\hbar \right)^2 \tag{31} \\ &= \left(\frac{\hbar}{2}\right)^2 \tag{32} \\ \sigma_x \sigma_p &\geq \frac{\hbar}{2} \tag{33} \end{align}$$

where we have simply "removed the square" on both sides, because as standard deviations, $\sigma_x$ and $\sigma_p$ are always non-negative.
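Both ingredients of this final step can be checked on a position grid, with the momentum operator approximated by a finite-difference derivative (a sketch with $\hbar = 1$ and an arbitrary trial state, the first excited harmonic-oscillator state, for which $\sigma_x \sigma_p = 3/2 > \hbar/2$):

```python
import numpy as np

hbar = 1.0
x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]

# Trial state: first excited harmonic-oscillator state, psi ~ x exp(-x^2/2)
psi = x * np.exp(-x**2 / 2)
psi /= np.sqrt(np.sum(psi**2) * dx)          # normalise on the grid

# Finite-difference check that [x, p] psi = i hbar psi, with p = -i hbar d/dx
dpsi = np.gradient(psi, dx)
comm = -1j * hbar * (x * dpsi - np.gradient(x * psi, dx))

# <x> = <p> = 0 by symmetry, so the variances are just <x^2> and <p^2>
var_x = np.sum(x**2 * psi**2) * dx
var_p = hbar**2 * np.sum(dpsi**2) * dx       # <p^2> = hbar^2 * integral of |psi'|^2
product = np.sqrt(var_x * var_p)             # sigma_x * sigma_p, here 3/2
```

The commutator check reproduces $\mathrm{i}\hbar\psi$ up to discretisation error, and the uncertainty product comfortably exceeds $\hbar/2$.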


Notes

$^{1}$ I have skipped some steps. Namely: $\langle \hat{A}\psi |\hat{B}\psi \rangle = \langle \psi |\hat{A}\hat{B}|\psi \rangle$, which is straightforward to prove using the hermiticity of $\hat{A}$; $\langle \psi |\hat{A}|\psi \rangle = a$; $\langle \psi |\hat{B}|\psi \rangle = b$; and $a = a^*$, since it is the expectation value of a physical observable, which must be real.

$^{2}$ This does not apply to, and cannot be used to derive, the energy–time uncertainty principle. There is no time operator in quantum mechanics; time is not a measurable observable, only a parameter.

$^{3}$ Technically, the canonical commutation relation $[\hat{x},\hat{p}] = \mathrm{i}\hbar$ is a postulate of quantum mechanics. (It follows from the position-space representation of the momentum operator, $\hat{p} = -\mathrm{i}\hbar\,\mathrm{d}/\mathrm{d}x$, which is itself postulated.)

Solution 2:

Using the Cauchy-Schwarz inequality, as @orthocresol did in his answer, is certainly the most straightforward way to derive the Heisenberg uncertainty principle. There is, however, another very important inequality (which can, by the way, be used to derive the Cauchy-Schwarz inequality) that such a derivation can be based on, namely

\begin{align} \langle \phi | \phi \rangle \geq 0 \ , \tag{1} \end{align}

where $| \phi \rangle$ can be any vector in Hilbert space. This is a fundamental property of the inner product. In order to arrive at the desired equation we just have to plug the right vector into this inequality. For this I will adopt @orthocresol's notation and lean a bit on the first paragraph of his excellent answer, i.e. I will introduce two hermitian operators, $\hat{A}$ and $\hat{B}$, whose expectation values (which are real numbers) shall be denoted by $a$ and $b$, respectively. Furthermore the variance of the observable $A$, which is represented by $\hat{A}$, is $\sigma_A^2 = \left\langle (A - a)^2 \right\rangle = \langle \psi | (\hat{A} - a)^2 | \psi \rangle$ and analogously $\sigma_B^2 = \left\langle (B - b)^2 \right\rangle = \langle \psi | (\hat{B} - b)^2 | \psi \rangle$.

With that out of the way, let's get started. First we use $\hat{A}$ and $\hat{B}$ to define another operator, $(\hat{A} - a) + \mathrm{i}\lambda (\hat{B} - b)$, where $\mathrm{i} = \sqrt{-1}$ and $\lambda$ is an arbitrary real constant. Since operators map vectors onto vectors, letting our new operator act on some state vector $| \psi \rangle$ will just give another state vector, say $| \tilde{\psi} \rangle$,

\begin{align} \left( (\hat{A} - a) + \mathrm{i}\lambda (\hat{B} - b) \right) | \psi \rangle = | \tilde{\psi} \rangle \ . \end{align}

This new state vector has to satisfy inequality $(1)$:

\begin{align} \langle \tilde{\psi} | \tilde{\psi} \rangle \geq 0 \ . \end{align}

Using the hermiticity of $\hat{A}$ and $\hat{B}$, i.e. $\hat{A}^{\dagger} = \hat{A}$ and $\hat{B}^{\dagger} = \hat{B}$, the bra vector $\langle \tilde{\psi} |$ corresponding to the ket vector $| \tilde{\psi} \rangle$ is

\begin{align} \langle \tilde{\psi} | = \langle \psi | \left( (\hat{A} - a) + \mathrm{i}\lambda (\hat{B} - b) \right)^{\dagger} = \langle \psi | \left( (\hat{A} - a) - \mathrm{i}\lambda (\hat{B} - b) \right) \end{align}

so, substituting this into $\langle \tilde{\psi} | \tilde{\psi} \rangle \geq 0$, we get

\begin{align} \left\langle \psi \middle| \left( (\hat{A} - a) - \mathrm{i}\lambda (\hat{B} - b) \right)\left( (\hat{A} - a) + \mathrm{i}\lambda (\hat{B} - b) \right) \middle| \psi \right\rangle &\geq 0 \\ \biggl\langle \psi \Bigg| (\hat{A} - a)^{2} + \lambda^{2} (\hat{B} - b)^{2} + \mathrm{i} \lambda \underbrace{\left( (\hat{A} - a) (\hat{B} - b) - (\hat{B} - b) (\hat{A} - a)\right) }_{= \left[ (\hat{A} - a) , (\hat{B} - b)\right] } \Bigg| \psi \biggr\rangle &\geq 0 \\ \left\langle \psi \middle| (\hat{A} - a)^{2} \middle| \psi \right\rangle + \lambda^{2} \left\langle \psi \middle| (\hat{B} - b)^{2} \middle| \psi \right\rangle + \mathrm{i} \lambda \left\langle \psi \middle| \left[ (\hat{A} - a) , (\hat{B} - b)\right] \middle| \psi \right\rangle &\geq 0 \tag{2} \end{align}

The commutator $\left[ (\hat{A} - a) , (\hat{B} - b)\right]$ evaluates just to the commutator of $\hat{A}$ and $\hat{B}$,

\begin{align} \left[ (\hat{A} - a) , (\hat{B} - b)\right] &= (\hat{A} - a) (\hat{B} - b) - (\hat{B} - b) (\hat{A} - a) \\ &= \hat{A}\hat{B} - \hat{B} \hat{A} - b\hat{A} + b\hat{A} - a \hat{B} + a\hat{B} \\ &= \hat{A}\hat{B} - \hat{B} \hat{A} = \left[ \hat{A} , \hat{B} \right] \ , \end{align}

so that, together with the definitions of the variances of $A$ and $B$ from above and the expectation value of the commutator $\langle [ \hat{A} , \hat{B} ] \rangle = \langle \psi | [ \hat{A} , \hat{B} ] | \psi \rangle$ , inequality $(2)$ becomes

\begin{align} \left\langle \psi \middle| (\hat{A} - a)^{2} \middle| \psi \right\rangle + \lambda^{2} \left\langle \psi \middle| (\hat{B} - b)^{2} \middle| \psi \right\rangle + \mathrm{i} \lambda \left\langle \psi \middle| \left[ \hat{A} , \hat{B} \right] \middle| \psi \right\rangle &\geq 0 \\ \sigma_A^2 + \lambda^{2} \sigma_B^2 + \mathrm{i} \lambda \big\langle \big[ \hat{A} , \hat{B} \big] \big\rangle &\geq 0 \ . \tag{3} \end{align}

Inequality $(3)$ already contains all the ingredients we need to arrive at the Robertson uncertainty relation; we just need to extract it. To do so, recognize that the left-hand side of inequality $(3)$ depends on the arbitrary real constant $\lambda$, and that the strongest form of the inequality is obtained by choosing the value of $\lambda$ that minimizes the left-hand side. This minimization is done in the usual way: take the first derivative of the left-hand side with respect to $\lambda$ and set it equal to $0$ (I leave out the second-derivative check that confirms this is really a minimum):

\begin{align} \frac{\mathrm{d} \left( \sigma_A^2 + \lambda^{2} \sigma_B^2 + \mathrm{i}\lambda \big\langle \big[ \hat{A} , \hat{B} \big] \big\rangle \right)}{\mathrm{d} \lambda} &= 0 \\ 2 \lambda \sigma_B^2 + \mathrm{i} \big\langle \big[ \hat{A} , \hat{B} \big] \big\rangle &= 0 \\ \Rightarrow \lambda = \frac{1}{2\mathrm{i}} \frac{\big\langle \big[ \hat{A} , \hat{B} \big] \big\rangle }{\sigma_B^2} \ . \end{align}

Plugging this value for $\lambda$ into inequality $(3)$ yields the Robertson uncertainty relation

\begin{align} \sigma_A^2 - \frac{1}{4} \left(\frac{\big\langle \big[ \hat{A} , \hat{B} \big] \big\rangle }{\sigma_B^2} \right)^{2} \sigma_B^2 + \frac{1}{2} \frac{\big\langle \big[ \hat{A} , \hat{B} \big] \big\rangle }{\sigma_B^2} \big\langle \big[ \hat{A} , \hat{B} \big] \big\rangle &\geq 0 \\ \sigma_A^2 \sigma_B^2 \underbrace{- \frac{1}{4} \big\langle \big[ \hat{A} , \hat{B} \big] \big\rangle^{2} + \frac{1}{2} \big\langle \big[ \hat{A} , \hat{B} \big] \big\rangle^{2}}_{= \frac{1}{4} \big\langle \big[ \hat{A} , \hat{B} \big] \big\rangle^{2}} &\geq 0 \\ \Rightarrow \sigma_A^2 \sigma_B^2 \geq - \frac{1}{4} \big\langle \big[ \hat{A} , \hat{B} \big] \big\rangle^{2} = \left(\frac{1}{2\mathrm{i}} \big\langle \big[ \hat{A} , \hat{B} \big] \big\rangle \right)^{2} \ . \tag{4} \end{align}
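Inequality $(4)$ and the role of the minimising $\lambda$ can be checked numerically: with random hermitian matrices for $\hat{A}$ and $\hat{B}$ (arbitrary stand-ins, not part of the proof), the squared norm of $|\tilde{\psi}\rangle$ at the optimal $\lambda$ is non-negative, and its non-negativity is exactly the Robertson bound:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5

def herm(n):
    # Random hermitian matrix standing in for an observable
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

A, B = herm(n), herm(n)
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi /= np.linalg.norm(psi)
I = np.eye(n)

a = np.vdot(psi, A @ psi).real
b = np.vdot(psi, B @ psi).real
var_A = np.vdot(psi, (A - a * I) @ (A - a * I) @ psi).real
var_B = np.vdot(psi, (B - b * I) @ (B - b * I) @ psi).real
comm_exp = np.vdot(psi, (A @ B - B @ A) @ psi)       # <[A,B]>, purely imaginary

lam = (comm_exp / 2j).real / var_B                   # the minimising real lambda
phi = ((A - a * I) + 1j * lam * (B - b * I)) @ psi   # |psi~> at that lambda
norm2 = np.vdot(phi, phi).real                       # <psi~|psi~> >= 0

bound = (comm_exp / 2j).real ** 2                    # ((1/2i) <[A,B]>)^2
```

At the minimising $\lambda$, `norm2` equals $\sigma_A^2 - \big((1/2\mathrm{i})\langle[\hat{A},\hat{B}]\rangle\big)^2 / \sigma_B^2$, so requiring it to be non-negative reproduces inequality $(4)$.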

And how we get from here to Heisenberg's uncertainty relation has already been nicely demonstrated in @orthocresol's answer. Just replace $\hat{A}$ with the position operator $\hat{x}$ and $\hat{B}$ with the momentum operator $\hat{p}$, use the known expectation value of their commutator, $\big\langle \big[ \hat{x} , \hat{p} \big] \big\rangle = \mathrm{i} \hbar$, then take the positive square root of the resulting inequality and you get

\begin{align} \sigma_x \sigma_p &\geq \frac{\hbar}{2} \tag{5} \ . \end{align}