If turning a perfectly monochromatic laser on for a finite time gives a frequency spread, where did the other frequency photons come from?

Reviewing the mode expansion

The laser being monomode and always turned on, the quantum state inside its cavity, or of the light it radiates, can be described using coherent states of frequency $\omega_0$. If photons exist at other frequencies, could you write down the interaction that created them?

The first issue here is that you're assuming that modes must have definite frequencies, and hence that photons in those modes have definite frequencies. This is an idealization, which doesn't hold in the real world. Let's review where it comes from in the textbooks:

  1. Consider the classical electromagnetic field, either in complete vacuum, or in a perfectly closed system with time-independent non-dissipative boundary conditions, and without any matter present to absorb energy.
  2. Under these unrealistic assumptions, the field has solutions that oscillate periodically in time forever, with absolutely no decay, which we'll call modes.
  3. Upon quantizing the system, we find that the quantum state of the field is described by a quantum harmonic oscillator for each mode, and we call the excitations within each mode photons.

Thus, modes of the classical field only have definite frequencies under idealized assumptions. In the real world, modes don't need to have definite frequencies, and neither do photons. In fact, in the real world, there are often cases where there are multiple valid sets of modes to use, which correspond to multiple valid definitions of photons; this will resolve your paradox below.

A toy example

Here's a toy model to illustrate subtleties with the mode expansion. (It actually won't be relevant to the final answer, but it might help build intuition.)

In free space, we can describe the evolution of a single degree of freedom of a field by a quantum harmonic oscillator. So more generally, consider a degree of freedom evolving under the Hamiltonian $$H(t) = \frac{p^2}{2m} + \frac12 m \, \omega(t)^2 x^2.$$ The time dependence of $\omega(t)$ could represent, e.g., the effect of fluctuations of the cavity walls. The classical solutions to the equations of motion are not sinusoids, and hence don't have a definite frequency.

The same remains true when we quantize. At every time, we can define instantaneous raising and lowering operators in the usual way, along with an instantaneous vacuum, corresponding to an instantaneous mode which oscillates sinusoidally at the instantaneous frequency. Similarly, at every time, we can define a ladder of instantaneous energy eigenstates, $$|n(t) \rangle = \frac{(a^\dagger(t))^n}{\sqrt{n!}} |0(t) \rangle$$ In the case where $\omega(t)$ changes slowly, the adiabatic theorem applies, so $|n(t) \rangle$ at time $t$ evolves into the state $|n(t') \rangle$ at a later time $t'$. Similarly, you can define instantaneous coherent states, $$|z(t) \rangle \propto e^{z a^\dagger(t)} |0(t) \rangle$$ which in the adiabatic limit evolve into other instantaneous coherent states.

The adiabatic limit demonstrates that coherent states do not necessarily have definite frequency. Recall that for the electromagnetic field, the "position" variable is the vector potential $\mathbf{A}$, and the conjugate momentum is $\mathbf{E}$. A reasonable physical definition of "definite frequency" is that the observed electric field is sinusoidal, i.e. $\langle p(t) \rangle$ is sinusoidal for this coherent state. But it isn't, because Ehrenfest's theorem tells us that $$\frac{d \langle p(t) \rangle}{dt} = - m \, \omega(t)^2 \langle x(t) \rangle$$ or, differentiating again and using $d \langle x(t) \rangle / dt = \langle p(t) \rangle / m$, $$\frac{d^2 \langle p(t) \rangle}{dt^2} = - \omega(t)^2 \, \langle p(t) \rangle + \frac{2 \, \dot{\omega}(t)}{\omega(t)} \frac{d \langle p(t) \rangle}{dt}$$ where the last term is small in the adiabatic limit. Even when that term is dropped, the equation does not have sinusoidal solutions when $\omega(t)$ varies. (This isn't actually related to your paradox, but it illustrates how you can get frequency spread inside a cavity even if only "one mode" is excited.)
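To make the frequency spread concrete, here is a minimal numerical sketch of the Ehrenfest equations above. The units ($m = 1$), the linear ramp of $\omega(t)$ from 1 to 1.5, the leapfrog integrator, and the function names are all illustrative assumptions of mine, not part of the argument. It compares the rms width of the spectrum of $\langle p(t) \rangle$ for a constant and a slowly drifting $\omega(t)$.

```python
import numpy as np

def evolve_mean_momentum(omega_of_t, t_max=200.0, dt=0.005, m=1.0):
    """Leapfrog-integrate d<x>/dt = <p>/m, d<p>/dt = -m w(t)^2 <x>."""
    n = int(round(t_max / dt))
    t = np.arange(n) * dt
    x, p = 1.0, 0.0                      # assumed initial mean values
    p_hist = np.empty(n)
    for i in range(n):
        p_hist[i] = p
        # kick-drift-kick step with the instantaneous frequency
        p -= 0.5 * dt * m * omega_of_t(t[i]) ** 2 * x
        x += dt * p / m
        p -= 0.5 * dt * m * omega_of_t(t[i] + dt) ** 2 * x
    return t, p_hist

def spectral_width(t, signal):
    """Rms width (angular frequency) of the Hann-windowed power spectrum."""
    dt = t[1] - t[0]
    power = np.abs(np.fft.rfft(signal * np.hanning(len(signal)))) ** 2
    omega = 2 * np.pi * np.fft.rfftfreq(len(signal), d=dt)
    mean = np.sum(omega * power) / np.sum(power)
    return np.sqrt(np.sum((omega - mean) ** 2 * power) / np.sum(power))

# Constant frequency versus a slow (adiabatic) ramp from 1.0 to 1.5.
t, p_const = evolve_mean_momentum(lambda time: 1.0)
_, p_drift = evolve_mean_momentum(lambda time: 1.0 + 0.5 * time / 200.0)

width_const = spectral_width(t, p_const)
width_drift = spectral_width(t, p_drift)
print("rms width, constant omega:", width_const)
print("rms width, drifting omega:", width_drift)
```

In the constant case the width is set only by the finite observation window; with the ramp it is several times larger, even though only "one mode" is ever excited.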

In the non-adiabatic case, we can get even weirder behavior. For example, suppose that $\omega(t)$ suddenly changes at $t = 0$, $$\omega(t) = \begin{cases} \omega_< & t < 0, \\ \omega_> & t > 0. \end{cases}$$ We can define two sets of ladder operators before and after $t = 0$ corresponding to frequencies $\omega_<$ and $\omega_>$, and thereby define two independent sets of states, $|n_< \rangle$ and $|n_>\rangle$. In particular, if you start in the state $|0_< \rangle$, you won't end up in $|0_> \rangle$. Instead, you end up with some "$t > 0$" photons, not because there was an explicit source term, but because the natural definition of photons changed at $t = 0$.
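As a rough check of this statement, one can write the post-quench lowering operator in terms of the pre-quench one, $a_> = \alpha \, a_< + \beta \, a_<^\dagger$, and count the "$t > 0$" quanta in $|0_< \rangle$, which is $|\beta|^2$. The sketch below does this with truncated matrices; the truncation dimension, the units $m = \hbar = 1$, and the function names are my own assumptions for illustration.

```python
import numpy as np

def ladder_ops(dim):
    """Truncated lowering/raising matrices in the pre-quench number basis."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
    return a, a.T

def quanta_after_quench(omega_before, omega_after, dim=40, m=1.0, hbar=1.0):
    """<0_<| a_>^dagger a_> |0_<> for a sudden change of frequency."""
    a, adag = ladder_ops(dim)
    # x and p do not jump at t = 0; only their split into a, a^dagger does.
    x = np.sqrt(hbar / (2 * m * omega_before)) * (a + adag)
    p = -1j * np.sqrt(hbar * m * omega_before / 2) * (a - adag)
    a_new = np.sqrt(m * omega_after / (2 * hbar)) * (x + 1j * p / (m * omega_after))
    vac = np.zeros(dim)
    vac[0] = 1.0                          # the old vacuum |0_<>
    state = a_new @ vac
    return float(np.real(np.vdot(state, state)))

def quanta_closed_form(omega_before, omega_after):
    """|beta|^2 from the Bogoliubov coefficients of the quench."""
    return (omega_after - omega_before) ** 2 / (4 * omega_before * omega_after)

print(quanta_after_quench(1.0, 4.0), quanta_closed_form(1.0, 4.0))
```

For $\omega_< = 1$ and $\omega_> = 4$ both routes should give $9/16$, and the count vanishes when the frequency doesn't change, as expected.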

Addressing the paradox

Let me boil down your paradox to the following:

  1. Start with a monochromatic plane wave in free space, containing only photons of frequency $\omega$.
  2. Couple a detector to this plane wave for a finite time $T$.
  3. The detector "sees" photons of frequency $\omega'$ in a width $\sim 1/T$ about $\omega$ (i.e. an energy width $\sim \hbar/T$). In other words, to the detector, it's as if the laser pulse were only a time $T$ long, even though it really is infinite.

There's really no problem here; you just have to be careful about what it means for a detector to "see photons". In your situation, the state of the electromagnetic field is perfectly well-defined. Your detector can't perfectly capture that state, but no detector can see literally everything, nor should we expect any to.

For example, if I were color blind, a red photon and a green photon would look the same to me. That doesn't mean that my eyes are converting red photons to green, or a mixture of red and green, it just means they can't tell the difference. If your detector just measures the electric field for a short time, it's effectively color blind, so that's it.
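Here is a small sketch of what such a "color blind" detector records: it samples a perfectly sinusoidal field for a time $T$ and Fourier transforms what it saw. The carrier frequency, sampling step, zero-padding factor, and the use of the full width at half maximum as the linewidth are assumptions chosen for illustration.

```python
import numpy as np

def observed_linewidth(T, omega0=50.0, dt=0.002, pad_factor=64):
    """FWHM (angular frequency) of the power spectrum of cos(omega0 t), 0 <= t < T."""
    n = int(round(T / dt))
    t = np.arange(n) * dt
    field = np.cos(omega0 * t)            # the field itself never changes
    n_fft = pad_factor * n                # zero-pad for a finer frequency grid
    power = np.abs(np.fft.rfft(field, n=n_fft)) ** 2
    omega = 2 * np.pi * np.fft.rfftfreq(n_fft, d=dt)
    above = omega[power >= 0.5 * power.max()]
    return above.max() - above.min()

for T in (2.0, 4.0, 8.0):
    width = observed_linewidth(T)
    print(f"T = {T}: width = {width:.3f}, width * T = {width * T:.2f}")
```

The product of the width and $T$ should stay roughly constant, i.e. the apparent linewidth scales like $1/T$ even though the underlying field is the same monochromatic wave in every case.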

Refining the paradox

This might not be satisfying, so let's consider an alternative detector which explicitly measures photons, following the question you linked. Suppose the detector works as follows: at a prescribed time, two perfectly conducting metal plates suddenly sweep down. The plates are separated by a distance $L = c T$, so they effectively "cut out" a time $T$ of the pulse. Then, the detector just counts up the photons inside it, along with their frequencies. The paradox is that the detector sees photons of frequency $\omega'$ in a width $\sim 1/T$ about $\omega$.

You can probably now see the trick, given the first section. The detector plates have changed the boundary conditions of the electromagnetic field. That means the photons the detector measures correspond to a different set of modes than the free space photons. The free space modes look like $e^{ik x}$ with no boundary conditions, while the detector modes look like $\sin(k' x)$ with the $k'$ defined by hard wall boundary conditions.

Upon quantizing each set of modes separately, we find that a state of the electromagnetic field corresponding to only photons in one free space mode also generally corresponds to photons in multiple detector modes. The standard mathematical tool used to swap between the equivalent mode descriptions is the Bogoliubov transformation.
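A purely classical toy version of this statement: take a snapshot $\sin(kx)$ of the free space wave between the plates and expand it in the hard wall modes $\sqrt{2/L} \, \sin(n \pi x / L)$. This is only the classical overlap for a scalar field, not the full Bogoliubov transformation, and the box length, grid, and function name are assumptions made for illustration.

```python
import numpy as np

def box_mode_weights(k, L=1.0, n_modes=60, n_grid=20000):
    """Fraction of the snapshot's norm carried by each hard-wall mode."""
    dx = L / n_grid
    x = (np.arange(n_grid) + 0.5) * dx            # midpoint grid on [0, L]
    wave = np.sin(k * x)                          # snapshot of the free-space wave
    n = np.arange(1, n_modes + 1)
    modes = np.sqrt(2.0 / L) * np.sin(np.outer(n, np.pi * x / L))
    coeffs = (modes @ wave) * dx                  # overlap integrals
    return n, coeffs ** 2 / np.sum(wave ** 2 * dx)

# k sits halfway between the allowed box values 20*pi/L and 21*pi/L.
n, weights = box_mode_weights(20.5 * np.pi)
for mode, weight in zip(n[16:25], weights[16:25]):
    print(mode, round(float(weight), 4))
```

When $k$ doesn't match an allowed $k'$, the weight is shared among several neighboring box modes, spread over a range of order $\pi / L$ in wavenumber. When $k$ is exactly $20 \pi / L$, essentially all of the weight lands in the single mode $n = 20$.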

This appeared in a simple form in the previous section, where $|0_< \rangle \neq |0_> \rangle$. It is also the reason behind the Unruh effect, the fact that an accelerating detector sees a thermal bath of photons, even in vacuum: this is due to the mismatch between detector-defined photons, and the plane wave photons defined in inertial frames in free space. Hawking radiation also runs on the same principle.

So in some sense, the resolution to your paradox is quite "exotic". But really, this ambiguity of modes was always built into the formalism of quantum field theory. Most textbooks ignore it only because there is a unique set of modes if you stay in inertial frames in free space, but this breaks down quickly.

The apparent paradox is analogous to the problem of the blind men describing an elephant (https://en.wikipedia.org/wiki/Blind_men_and_an_elephant): it's like a rope, a tree, a tent, a snake... A Fourier transform is only one example of a way to represent a waveform. The same waveform can be represented as a sum of delta functions, Gabor wavepackets, or even square waves. They all describe the same thing, and not one of them is quite “correct”. Each is a blind man's description of something that can't be perfectly described from any one perspective.

A “monochromatic” beam is one whose wave peaks are perfectly in step for all time and for an infinite distance. To describe a laser beam as monochromatic is, of course, not fully meaningful, because we can never know whether it has been shining, and will continue shining, for all time. At best, it can only be “effectively monochromatic”: monochromatic enough for whatever the practical purpose may be.

Internal to the laser, the emission events do not take an infinite time to occur, so in one view the laser beam is composed of a lot of superimposed non-monochromatic pulses whose fundamental frequency components are in phase but whose higher frequency components are randomly out of phase. Add all those up (for a very long time!), and the beam is effectively monochromatic.

So a monochromatic beam can be described in two very different ways (as one long thing or a superposition of short things), but still be the same thing.

Even the idea of “beam” has similar problems. If you have an infinitely wide plane wave, it will propagate as a perfectly collimated beam. But what is a “beam” if it's infinitely wide? If you block the infinite wave so that it has a finite diameter, it will no longer propagate as a collimated beam; it will spread at an angle inversely related to its diameter and directly related to its wavelength. Huygens showed that a plane wave can be represented both as a simple propagating plane wavefront, and as a superposition of an infinite number of spherical wavefronts diverging from points on the wavefront. Neither description is “correct”, but each is useful in different situations.

There is a direct correspondence between these two representations of wave propagation and the two representations of monochromaticity. In each case, both representations are equally valid; and neither is the “correct” representation. We use whichever representation is most useful for analyzing any given beam scenario.

You are seeking an intuitive understanding of the fact that the frequency spectrum of a light beam is altered by shuttering the beam to form a long pulse. Perhaps the easiest way to understand it is the second representation of a “monochromatic beam”: as the superposition of a lot of short pulses whose central frequencies are all in phase. It doesn't matter whether you consider the short pulses to be femtoseconds long or microseconds long; the math works out the same. When the beam is shuttered, it limits the number of such short pulses that can be summed to represent the resulting long pulse, and thus prevents full cancellation of the portions of the pulses that are out of phase (which correspond to the off-center frequencies of the pulses).
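A simplified, deterministic version of this picture can be checked numerically: sum $N$ short Gaussian pulses that all share the same carrier phase, and see how the linewidth of the sum depends on $N$. The Gaussian pulse shape, the even spacing, the carrier frequency, and the FWHM measure are all assumptions of this sketch; real emission events are of course not this regular.

```python
import numpy as np

def pulse_train_linewidth(n_pulses, sigma=1.0, spacing=1.0, omega0=20.0,
                          dt=0.02, n_fft=2**18):
    """FWHM (angular frequency) of the spectrum of n_pulses in-phase short pulses."""
    centers = spacing * np.arange(n_pulses)
    t = np.arange(-8 * sigma, centers[-1] + 8 * sigma, dt)
    # Each pulse is a Gaussian envelope on the same carrier cos(omega0 t).
    envelope = np.exp(-0.5 * ((t[:, None] - centers[None, :]) / sigma) ** 2).sum(axis=1)
    field = envelope * np.cos(omega0 * t)
    power = np.abs(np.fft.rfft(field, n=n_fft)) ** 2
    omega = 2 * np.pi * np.fft.rfftfreq(n_fft, d=dt)
    above = omega[power >= 0.5 * power.max()]
    return above.max() - above.min()

for n_pulses in (1, 16, 64):
    print(n_pulses, pulse_train_linewidth(n_pulses))
```

A single pulse is broadband, and the width shrinks roughly in proportion to $1/N$ as more pulses are summed. Shuttering the beam caps $N$, so the cancellation of the off-center frequencies stays incomplete.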