Why is it impossible to measure position and momentum at the same time with arbitrary precision?

The answer to your question is... Well, let me get there step by step. The answer below summarizes what you can find in current articles published in scientific journals and current textbooks – work and results I have experienced myself as a researcher in quantum optics. All references are given throughout the answer and at the end, and I strongly recommend that you go and read them. Also, this answer discusses the uncertainty principle and simultaneous measurement within quantum theory. Maybe in the future we'll all use an alternative theory in which the same experimental facts are given a different meaning; such alternative theories have been proposed, and many researchers are indeed working on them. Finally, this answer tries to avoid terminological debates and to explain the experimental, laboratory side of the matter. Warnings about terminology will be given throughout. (I don't mean that terminology isn't important, though: different terminologies can inspire different research directions.)

We must be careful, because our understanding of the uncertainty principle today is very different from how people saw it in the 1930–50s. The modern understanding is also borne out in modern experimental practice. There are two main points to clarify.

1. What do we exactly mean by "measurement" and by "$\Delta x$"?

The general picture is this:

  1. We can prepare one copy of a physical system according to some specific protocol. We say that the system has been prepared in a specific state (generally represented by a density matrix $\pmb{\rho}$). Then we perform a specific operation that yields an outcome. We say that we have performed one instance of a measurement on the system (generally represented by a so-called positive-operator-valued measure $\{\pmb{O}_i\}$, where $i$ labels the possible outcomes).

  2. We can repeat the procedure above anew – new copy of the system – as many times as we please, according to the same specific protocols. We are thus making many instances of the same kind of measurement, on copies of the system prepared in the same state. We thus obtain a collection of measurement results, from which we can build a frequency distribution and statistics. (Throughout this answer, when I say "repetition of a measurement" I mean it in this specific sense.)

(There's also the question of what happens when we make two or more measurements in a row, on the same system. But I'm not going to discuss that here; see the references at the end.)

This is why the general empirical statements of quantum theory have this form: "If we prepare the system in state $\pmb{\rho}$, and perform the measurement $\{\pmb{O}_i\}$, we have a probability $p_1$ of observing outcome $i=1$, a probability $p_2$ of observing outcome $i=2$, ..." and so on (with appropriate continuous limits for continuous outcomes).

Now, there's a measurement precision/error associated with each single instance of the measurement, and also a variability of the outcomes across repetitions of the measurement. The first kind of error can be made as small as we please. The variability across repetitions, however, generally appears not to be reducible below some nonzero amount which depends on the specific state and the specific measurement. This latter variability is what we call "$\Delta x$".

So when we say "cannot be measured with arbitrary precision", what we mean more exactly is that "its variability across measurement repetitions cannot be made arbitrarily low". The fundamental mystery of quantum mechanics is the lack – in a systematic way – of reproducibility across measurement instances. But the error in the outcome of each single instance has no theoretical lower bound.

Of course this situation affects our predictive abilities: whenever we repeat the same kind of measurement on a system prepared in the same kind of state, we only know what to expect to within $\Delta x$.
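The distinction can be illustrated with a toy numerical sketch (all numbers here are invented for illustration, not taken from any experiment): each repetition yields a sharply read value, yet the values scatter with an irreducible spread $\Delta x$.

```python
import numpy as np

rng = np.random.default_rng(42)

DELTA_X = 1.0         # irreducible spread across repetitions (state-dependent)
READOUT_ERROR = 1e-6  # single-instance instrument error: no theoretical lower bound

# Each repetition: prepare the same state, measure once, read the dial precisely.
outcomes = rng.normal(loc=0.0, scale=DELTA_X, size=100_000)
readings = outcomes + rng.normal(scale=READOUT_ERROR, size=outcomes.size)

# Every single reading is known to within ~READOUT_ERROR,
# but the spread across repetitions stays near DELTA_X:
print(np.std(readings))  # ≈ 1.0
```

Shrinking `READOUT_ERROR` further changes nothing in the printed spread: the variability across repetitions, not the dial resolution, is what $\Delta x$ quantifies.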

This important distinction between single and multiple measurement instances was first pointed out by Ballentine in 1970:

  • Ballentine: The Statistical Interpretation of Quantum Mechanics, Rev. Mod. Phys. 42 (1970) 358 (other copy)

see especially the very explanatory Fig. 2 there. And it's not a matter of "interpretation", as the title might today suggest. It's an experimental fact. Clear experimental examples of this distinction are given for example in

  • Leonhardt: Measuring the Quantum State of Light (Cambridge 1997)

see for example Fig. 2.1 there and its explanation. Also the more advanced

  • Mandel, Wolf: Optical Coherence and Quantum Optics (Cambridge 2008).

See also the textbooks given below.

The distinction between error of one measurement instance and variability across measurement instances is also evident if you think about a Stern-Gerlach experiment. Suppose we prepare a spin in the state $x+$ and we measure it in the direction $y$. The measurement yields only one of two clearly distinct spots, corresponding to either the outcome $+\hbar/2$ or $-\hbar/2$ in the $y$ direction. This outcome has some error, but we can distinguish whether it is $+$ or $-\hbar/2$. But if we prepare a new spin in the state $x+$ and measure $y$ again, we can very well find the opposite outcome – again very precisely measured. Over many measurements we observe these $+$ and $-$ outcomes roughly 50% each. The standard deviation is $\hbar/2$, and that's indeed the "$\Delta S_y$" given by the quantum formulae: they refer to measurement repetitions, not to one single instance in which you send a single electron through the apparatus.
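A minimal simulation of these Stern-Gerlach statistics (working in units where $\hbar = 1$, with the Born-rule 50/50 probabilities put in by hand):

```python
import numpy as np

rng = np.random.default_rng(0)
HBAR = 1.0  # work in units where hbar = 1

# State x+ measured along y: Born rule gives probability 1/2 for each of ±hbar/2.
# Each individual outcome is read sharply; only the sequence of outcomes is random.
outcomes = rng.choice([+HBAR / 2, -HBAR / 2], size=200_000)

print(outcomes.mean())  # ≈ 0: the mean of S_y in the state x+
print(outcomes.std())   # ≈ 0.5 = hbar/2: the "Delta S_y" of the quantum formulae
```

Each entry of `outcomes` is a perfectly distinguishable $+\hbar/2$ or $-\hbar/2$; the standard deviation $\hbar/2$ only appears across the repetitions.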

It must be stressed that some authors (for example Leonhardt above) use the term "measurement result" to mean, not the result of a single experiment, but the average value $\bar{x}$ found over several repetitions of an experiment. Of course this average value has uncertainty $\Delta x$. There's no contradiction here, just a different terminology. You can call "measurement" whatever you please – just be precise in explaining what your experimental protocol is. Some authors use the term "one-shot measurement" to make the distinction clear; as an example, check these titles:

  • Pyshkin et al: Ground-state cooling of quantum systems via a one-shot measurement, Phys. Rev. A 93 (2016) 032120 (arXiv)

  • Yung et al: One-shot detection limits of quantum illumination with discrete signals, npj Quantum Inf. 6 (2020) 75 (arXiv).

The fact that, even though the predictive uncertainty $\Delta x$ is finite, we can have infinite precision in a single (one-shot) measurement, is not worthless, but very important in applications such as quantum key distribution. In many key-distribution protocols the two key-sharing parties compare the precise values $x$ they obtained in single-instance measurements of their entangled states. These values will be correlated to within their single-instance measurement error, which is much smaller than the predictive uncertainty $\Delta x$. The presence of an eavesdropper would destroy this correlation. The two parties can therefore know that there's an eavesdropper, if they see that their measured values only agree to within $\Delta x$ rather than to within the much smaller single-instance measurement error. This scheme wouldn't work if the single-instance measurement error were $\Delta x$. See for example

  • Reid: Quantum cryptography with a predetermined key, using continuous-variable Einstein-Podolsky-Rosen correlations, Phys. Rev. A 62 (2000) 062308 (arXiv)

  • Grosshans et al: Quantum key distribution using gaussian-modulated coherent states, Nature 421 (2003) 238 (arXiv). In Figure 2 one can see very well the difference between single-instance measurement error and the variability $\Delta x$ across measurements.

  • Madsen et al: Continuous variable quantum key distribution with modulated entangled states (free access), Nat. Comm. 3 (2012) 1083. See especially Fig. 4 and its explanation.
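The role of the two error scales in key distribution can be caricatured with a toy statistical model (this is not a simulation of any actual protocol; the perfectly correlated `shared` variable and the noise scales are invented purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
DELTA_X, READOUT = 1.0, 1e-3  # predictive spread vs single-instance error
n = 50_000

# Toy model: each entangled pair yields one strongly correlated value.
shared = rng.normal(scale=DELTA_X, size=n)
alice = shared + rng.normal(scale=READOUT, size=n)
bob = shared + rng.normal(scale=READOUT, size=n)

# No eavesdropper: the parties' values agree to within the tiny
# single-instance error, far below DELTA_X.
print(np.std(alice - bob))      # ≈ 1.4e-3

# An eavesdropper destroys the correlation (toy model): agreement
# now only to within ~DELTA_X, which the parties can detect.
bob_eve = rng.normal(scale=DELTA_X, size=n)
print(np.std(alice - bob_eve))  # ≈ 1.4, i.e. of order DELTA_X
```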

2. What is exactly a "measurement of position" or of "momentum"?

In classical mechanics there's only one measurement (even if it can be realized by different technological means) of any specific quantity $Q$, such as position or spin or momentum. And classical mechanics says that the error in one measurement instance and the variability across instances can both be made as low as we please.

In quantum theory there are many different experimental protocols that we can interpret, for different reasons, as "measurements" of that quantity $Q$. Usually they all yield the same mean value across repetitions (for a given state), but differ in other statistical properties such as variance. Because of this, and of the variability explained above, Bell (of the famous Bell's theorem) protested that we actually shouldn't call these experimental procedures "measurements":

  • Bell: Against "measurement" (other copy), in Miller, ed.: Sixty-Two Years of Uncertainty: Historical, Philosophical, and Physical Inquiries into the Foundations of Quantum Mechanics (Plenum 1990).

In particular, in classical physics there's one joint, simultaneous measurement of position and momentum. In quantum theory there are several measurement protocols that can be interpreted as joint, simultaneous measurements of position and momentum, in the sense that each instance of such measurement yields two values, the one is position, the other is momentum. In the classical limit they become the classical simultaneous measurement of $x$ and $p$. This possibility was first pointed out by Arthurs & Kelly in 1965:

  • Arthurs, Kelly: On the simultaneous measurement of a pair of conjugate observables, Bell Syst. Tech. J. 44 (1965) 725 (other copy).

This simultaneous measurement is not represented by $\hat{x}$ and $\hat{p}$, but by a pair of commuting operators $(\hat{X}, \hat{P})$ satisfying $\hat{X}=\hat{x}+\hat{a}$, $\hat{P}=\hat{p}+\hat{b}$, for specially chosen $\hat{a}, \hat{b}$. The point is that the joint operator $(\hat{X}, \hat{P})$ can rightfully be called a simultaneous measurement of position and momentum, because it reduces to that measurement in the classical limit (and obviously we have $\bar{X}=\bar{x}, \bar{P}=\bar{p}$). In fact, from the equations above we could very well say that $\hat{x},\hat{p}$ are defined in terms of $\hat{X},\hat{P}$, rather than vice versa.

This kind of simultaneous measurement – which is possible for any pairs of conjugate variables, not just position and momentum – is not a theoretical quirk, but is a daily routine measurement in quantum-optics labs. It's used to do quantum tomography for example. You can find detailed theoretical and experimental descriptions of it in Leonhardt's book above, chapter 6, entitled "Simultaneous measurement of position and momentum".

But as I said, there are several different protocols that may be said to be a simultaneous measurement of conjugate observables, corresponding to different choices of $\hat{a},\hat{b}$. What's interesting is the way in which these measurements differ. They can be seen as forming a continuum between two extremes:

– At one extreme, the variability across measurement repetitions of $X$ has a lower bound (which depends on the state of the system), while the variability of $P$ is infinite. Basically it's as if we were measuring $X$ without measuring $P$. This corresponds to the traditional $\hat{x}$.

– At the other extreme, the variability across measurement repetitions of $P$ has a lower bound, while the variability for $X$ is infinite. So it's as if we were measuring $P$ without measuring $X$. This corresponds to the traditional $\hat{p}$.

– In between, there are measurement protocols which have more and more variability for $X$ across measurement instances, and less and less variability for $P$. This "continuum" of measurement protocols interpolates between the two extremes above. There is a "sweet spot" in between at which we have a simultaneous measurement of both quantities with finite variability for each. The product of their variabilities, $\Delta X\ \Delta P$, for this "sweet-spot measurement protocol" satisfies an inequality similar to the well-known one for conjugate variables, but with a lower bound slightly larger than the traditional $\hbar/2$ (exactly twice as much, see eqn (12) in Arthurs & Kelly). So there's a price to pay for the ability to measure them simultaneously.
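The trade-off can be checked numerically in the simple noise model behind the Arthurs-Kelly bound (a sketch, with $\hbar = 1$ and a minimum-uncertainty input state assumed): adding a meter-noise variance $s^2$ to $X$ contributes a back-action $\hbar^2/(4s^2)$ to $P$, and the product $\Delta X\,\Delta P$ never drops below $\hbar$.

```python
import numpy as np

HBAR = 1.0
dx2 = dp2 = HBAR / 2  # minimum-uncertainty state: (dx*dp)^2 = (hbar/2)^2

# Sweep the meter-noise variance s2 added to X; the associated
# back-action hbar^2 / (4*s2) is added to P.
s2 = np.linspace(0.01, 5.0, 100_000)
DX = np.sqrt(dx2 + s2)
DP = np.sqrt(dp2 + HBAR**2 / (4 * s2))

best = (DX * DP).min()
print(best)  # ≈ 1.0 = hbar: twice the usual hbar/2 lower bound
```

The extremes of the sweep reproduce the two bullet points above: $s^2 \to 0$ gives a sharp $X$ with a diverging $\Delta P$, and $s^2 \to \infty$ the reverse.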

This kind of "continuum" of simultaneous measurements is also possible for the famous double-slit experiment. It's realized by using "noisy" detectors at the slits. There are setups in which we can observe a weak interference beyond the two-slit screen, and at the same time have some certainty about the slit at which a photon could be detected. See for example:

  • Wootters, Zurek: Complementarity in the double-slit experiment: Quantum nonseparability and a quantitative statement of Bohr's principle, Phys. Rev. D 19 (1979) 473

  • Banaszek et al: Quantum mechanical which-way experiment with an internal degree of freedom, Nat. Comm. 4 (2013) 2594 (arXiv)

  • Chiao et al: Quantum non-locality in two-photon experiments at Berkeley, Quantum Semiclass. Opt. 7 (1995) 259 (arXiv), for variations of this experiment.

We might be tempted to ask "OK, but what's the real measurement of position and momentum among all these?". But within quantum theory this is a meaningless question, similar to asking "In which frame of reference are these two events really simultaneous?" within relativity theory. The classical notions and quantities of position and momentum simply don't exist in quantum theory. We have several other notions and quantities that have some similarities to the classical ones. Which one to consider depends on the context and application. The situation indeed has some similarities with that of "simultaneity" in relativity: there are "different simultaneities" depending on the frame of reference; which one we choose depends on the problem and application.

In quantum theory we can't really say "the system has these values", or "these are the actual values". All we can say is that when we do such-and-such to the system, then so-and-so happens. For this reason many quantum physicists (check eg Busch et al. below) prefer to speak of "intervention on a system" rather than "measurement of a system" (I personally avoid the term "measurement" too).

Summing up: yes

So the answer to your question is that in a single measurement instance we actually can (and do!) measure position and momentum simultaneously and both with arbitrary precision. This fact is important in applications such as quantum-key distribution, mentioned above.

But we also observe an unavoidable variability upon identical repetitions of such measurement. This variability makes the arbitrary single-measurement precision unimportant in other applications, where consistency through repetitions is required instead.

Moreover, we must specify which of the simultaneous measurements of momentum and position we're performing: there isn't just one, as in classical physics.

To form a picture of this, you can imagine two quantum scientists having this chat:

– "Yesterday I made a simultaneous measurement of position and momentum using the experimental procedure $M$ and preparing the system in state $S$."
– "Which values did you expect to find, before making the measurement?"
– "The probability density of obtaining values $x,p$ was, according to quantum theory, $P(x,p)=\dotso$. Its mean was $(\bar{x},\bar{p}) = (30\cdot 10^{-17}\ \mathrm{m},\ 893\cdot 10^{-17}\ \mathrm{kg\ m/s})$ and its standard deviations were $(\Delta x, \Delta p)=(1\cdot 10^{-17}\ \textrm{m},\ 1\cdot 10^{-17}\ \mathrm{kg\ m/s})$, the quantum limit. So I was expecting the $x$ result to land somewhere between $29 \cdot 10^{-17}\ \mathrm{m}$ and $31 \cdot 10^{-17}\ \mathrm{m}$; and the $p$ result somewhere between $892 \cdot 10^{-17}\ \mathrm{kg\ m/s}$ and $894 \cdot 10^{-17}\ \mathrm{kg\ m/s}$." (Note how the product of the standard deviations is $\hbar\approx 10^{-34}\ \mathrm{J\ s}$.)
– "And which result did the measurement give?"
– "I found $x=(31.029\pm 0.00001)\cdot 10^{-17}\ \textrm{m}$ and $p=(893.476 \pm 0.00005)\cdot 10^{-17}\ \mathrm{kg\ m/s}$, to within the widths of the dials. They agree with the predictive ranges given by the theory."
– "So are you going to use this setup in your application?"
– "No. I need to be able to predict $x$ with some more precision, even if that means that my prediction of $p$ worsens a little. So I'll use a setup that has variances $(\Delta x, \Delta p)=(0.1\cdot 10^{-17}\ \textrm{m},\ 10\cdot 10^{-17}\ \mathrm{kg\ m/s})$ instead."

Even if the answer to your question is positive, we must stress that: (1) Heisenberg's principle is not violated, because it refers to the variability across measurement repetitions, not to the error in a single measurement. (2) It's still true that the operators $\hat{x}$ and $\hat{p}$ cannot be measured simultaneously. What we're measuring is a slightly different operator; but this operator can be rightfully called a joint measurement of position and momentum, because it reduces to that measurement in the classical limit.

Old-fashioned statements about the uncertainty principle must therefore be taken with a grain of salt. When we make more precise what we mean by "uncertainty" and "measurement", they turn out to have new, unexpected, and very exciting faces.

Here are several good books discussing these matters with clarity, precision, and experimental evidence:

  • de Muynck: Foundations of Quantum Mechanics, an Empiricist Approach (Kluwer 2004)

  • Peres: Quantum Theory: Concepts and Methods (Kluwer 2002) (other copy)

  • Holevo: Probabilistic and Statistical Aspects of Quantum Theory (2nd ed. Edizioni della Normale, Pisa, 2011)

  • Busch, Grabowski, Lahti: Operational Quantum Physics (Springer 1995)

  • Nielsen, Chuang: Quantum Computation and Quantum Information (Cambridge 2010) (other copy)

  • Bengtsson, Życzkowski: Geometry of Quantum States: An Introduction to Quantum Entanglement (2nd ed. Cambridge 2017).

You can't measure precise values at the same time because precise values for both don't exist at the same time.

All the properties of, say, an electron can be inferred from the electron's wave function, $\Psi(\vec x)$. The wave function is a mathematical object that covers all of space. It has a complex value at each point.

The electron doesn't have a precise position. Instead, it has a probability of being found at each point, $\vec x$, in space on being measured. That probability is $\Psi(\vec x)^*\Psi(\vec x)$. (That is a little loose. Really the probability of being found in a small region $d \vec x$ is $\int \Psi(\vec x)^*\Psi(\vec x) d \vec x$.)

The probability of being found somewhere is $1$, and so $\int\Psi(\vec x)^*\Psi(\vec x)\,d\vec x = 1$. A function like this must approach $0$ everywhere outside some finite region.

There is a limiting case where it is $0$ everywhere except at one point, where it is infinite. In that case, it has a definite position.

You can also get the momentum from $\Psi(\vec x)$. Again, a definite momentum doesn't exist, except in a limiting case.

In general, $p = h/\lambda$. That means an electron with a definite momentum would have a constant-amplitude sinusoidal wave function with a definite wavelength. Such a wave function would cover all of space. $\Psi(\vec x) = A e^{i \vec p \cdot \vec x}$. This isn't possible, except as a limiting case where the amplitude approaches $0$. But in this limiting case, the wave function has the same (infinitesimal) amplitude everywhere. The electron has no location at all. It is spread over all space.

These limiting cases are at opposite ends of a range of possibilities. Most wave functions are non-zero over some finite region. Or at least, given any small number $\epsilon$, $|\Psi(\vec x)| > \epsilon$ only over a finite region.

The electron will be found in that finite region, but it doesn't have a precise location. Just a region where it will be found.

Likewise it doesn't have a definite momentum. You can use Fourier analysis to break a function up into a sum of functions of the form $A e^{i \vec p \cdot \vec x}$. $\Psi(\vec x) = \sum A(\vec p) e^{i \vec p \cdot \vec x}$. In the case of a non-periodic function like we have here, it is an infinite sum of infinitesimal functions. It is expressed as an integral rather than a sum. $\Psi(\vec x) = \int A(\vec p) e^{i \vec p \cdot \vec x} d \vec p$

You can think of $A(\vec p)$ as another way of expressing the wave function. This is another mathematical function, defined over that set of all possible momenta. It is useful for describing the momentum of the electron.

It can be shown that $A(\vec p)$ has many of the same kinds of properties that $\Psi(\vec x)$ does. For example, the probability of finding the electron with momentum $\vec p$ is (again loosely) $A(\vec p)^*A(\vec p)$.

It can be shown that $\int A(\vec p)^*A(\vec p)\,d\vec p = 1$. That is, the probability of finding the electron with some momentum is $1$. And the function can only be appreciably non-zero over a finite range of $\vec p$'s.

There is a limiting case where $A(\vec p)$ is $0$ everywhere except for one value of $\vec p$. In this limiting case, the electron has a definite $\vec p$.

But the usual case is that the electron has neither a definite $\vec x$, nor a definite $\vec p$. That is, when the wave function is expressed as $\Psi(\vec x)$, it has a finite region where $\Psi(\vec x) > 0$. In this case, it turns out that when the wave function is expressed as $A(\vec p)$, there is a finite range of $\vec p$'s where $A(\vec p) > 0$.

The Uncertainty Principle is an important relation between the sizes of these two finite regions: $\Delta \vec x\,\Delta \vec p \ge \hbar/2$.
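For a concrete check, here is a numerical sketch with a Gaussian wave packet (one dimension, $\hbar = 1$; the width and grid sizes are chosen arbitrarily): the momentum amplitude $A(\vec p)$ is obtained by FFT, and the product of the two spreads comes out at the minimum $\hbar/2$.

```python
import numpy as np

HBAR = 1.0
sigma = 0.7  # width parameter of the position-space Gaussian

# Normalized Gaussian wave packet on a 1D grid
x = np.linspace(-20, 20, 2**14)
dx = x[1] - x[0]
psi = (2 * np.pi * sigma**2) ** -0.25 * np.exp(-x**2 / (4 * sigma**2))

# Position spread from |Psi(x)|^2 (mean is zero by symmetry)
prob_x = np.abs(psi) ** 2 * dx
Dx = np.sqrt(np.sum(prob_x * x**2))

# Momentum amplitude A(p) via FFT; p = hbar * k
k = 2 * np.pi * np.fft.fftfreq(x.size, d=dx)
phi = np.fft.fft(psi)
prob_p = np.abs(phi) ** 2 / np.sum(np.abs(phi) ** 2)
Dp = np.sqrt(np.sum(prob_p * (HBAR * k) ** 2))

print(Dx * Dp)  # ≈ 0.5 = hbar/2: a Gaussian saturates the uncertainty bound
```

Replacing the Gaussian with any other normalized shape only increases the product: narrowing `psi` in $x$ necessarily widens `phi` in $p$, which is the wave-mechanical content of the principle.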

This video from 3blue1brown illustrates the idea. In particular, it shows how the Uncertainty Principle comes from wave properties.

Addendum - I didn't address an area where pglpm's answer really shines. I thought I would add my 2 cents.

Suppose you have an electron prepared in a state given by a particular wave function, $\Psi(\vec x)$. The position and momentum can be calculated to be particular values $\vec x$ and $\vec p$, with uncertainties $\Delta \vec x$ and $\Delta \vec p$. Note that uncertainties are often expressed as standard deviations of expected outcomes. This means position and momentum can be predicted to be $\vec x \pm \Delta \vec x$ and $\vec p \pm \Delta \vec p$.

Suppose the electron is just arriving at a free standing thin film surface containing many atoms.

If $\Delta \vec x$ is large, it is not possible to predict in advance which atom the electron will hit. Nevertheless, the electron will hit a particular atom. It may be that the atom is affected in some permanent way, say by being ejected and leaving a hole. In that case, it is possible to go back afterward and find out very precisely what the position of the electron was.

If $\Delta \vec p$ is large, it is not possible to predict in advance what the electron's momentum will be measured to be. But if it ejects an atom, it may be possible to measure time of flight of the scattered electron and atom to detectors with high spatial resolution and get a very precise value for what the electron's initial momentum turned out to be.

The Uncertainty Principle does not limit how precisely we can determine the outcomes of these measurements. It limits how precisely we can predict them in advance. If you have many electrons in the same state, it limits how repeatable multiple measurements will be.

Immediately after the collision, the electron and atom will be in new states. Both states will have a $\Delta \vec x$ and $\Delta \vec p$. It is not possible to predict in advance when and where either will hit their detectors. But it is possible to say that the combined outcomes of the position and momentum measurements of the scattered electron and atom will add up to a momentum consistent with the electron's initial momentum and uncertainty.

It is possible to measure both the position and the momentum of a particle to arbitrary precision "at the same time", if you take that phrase to mean "within such quick succession that you can be confident that the probability distribution for the first measured quantity has not changed via Schrodinger evolution between the two measurements".

But doing so isn't very useful, because there will always be some infinitesimal delay between the two measurements, and whichever one comes second will effectively erase the information gained from the first measurement. For example, if you measure position and then momentum immediately afterwards, then you can get a very precise value for both measurements, but the process of getting a precise momentum reading will change the wavefunction such that its position now has large uncertainty with respect to a subsequent measurement. So the momentum measurement "nullifies" the information from the prior position measurement, in the sense of rendering it unrepeatable.

So it's better to talk about the inability to "know" the position and momentum at the same time than about the inability to "measure" both (which actually is possible). Fully understanding why requires understanding both the "state collapse" behavior of measurements, and the "wide <-> narrow" relation between non-commuting observables (e.g. via the Fourier transform) that you mention.

That's for measurements in extremely quick succession. You could ask about measurements that take place at exactly the same time, but that gets into philosophical waters as to whether two events ever occur at exactly the same time even in classical physics. In practice, if you try to do both measurements at once, then you'll always find that the particle comes out with either very tightly bounded position or momentum, and with large uncertainty in the other quantity.