Is the momentum operator diagonal in position representation?

There is a heuristic way to look at this.

The Dirac delta function corresponds to a spike where its argument is zero. You can view it as the limit of a sequence of Gaussian functions whose areas are all one but whose widths go to zero. The derivative of a Gaussian function looks like this:

(figure: plot of the derivative of a Gaussian)

So in the limit, the derivative of the Dirac delta function is something like an up spike infinitesimally to the left of the origin followed by a down spike infinitesimally to the right. So the matrix elements you're looking at aren't actually diagonal; they're infinitesimally off-diagonal.
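To make the picture concrete, here is a minimal numerical sketch, assuming unit-area Gaussians of width `sigma` centred at the origin. The function names `gaussian_derivative` and `spike_locations` are made up for this illustration. It locates the maximum and minimum of the derivative on a fine grid; they should sit near $x=-\sigma$ and $x=+\sigma$, so both spikes crowd towards the origin as the width shrinks.

```python
import numpy as np

def gaussian_derivative(x, sigma):
    """Derivative of a unit-area Gaussian of width sigma centred at 0."""
    g = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    return -x / sigma**2 * g

def spike_locations(sigma, n=200001, half_width=1.0):
    """Return (x_up, x_down): grid points where the derivative is largest and smallest."""
    x = np.linspace(-half_width, half_width, n)
    d = gaussian_derivative(x, sigma)
    return x[np.argmax(d)], x[np.argmin(d)]

# The up spike sits left of the origin, the down spike right of it,
# and both approach x = 0 as sigma decreases.
for sigma in (0.1, 0.01, 0.001):
    x_up, x_down = spike_locations(sigma)
    print(f"sigma={sigma}: up spike at {x_up:+.4f}, down spike at {x_down:+.4f}")
```

The derivative is exactly zero at the origin for every width, which is the smeared-out version of the statement that nothing sits exactly on the diagonal.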

These kinds of heuristics can be useful, but they can also be dangerous, so don't take what I say too literally.

Update: Another way to look at this is to approach the derivative of the Dirac delta via a discretisation. If the wavefunction is represented by a vector of equally spaced samples, the derivative can be represented by central differences. Assuming periodic boundary conditions we get a matrix like:

$\frac{1}{2}\pmatrix{ 0 & 1 & 0 & 0 & \ldots & -1 \\ -1 & 0 & 1 & 0 & \ldots & 0 \\ 0 & -1 & 0 & 1 & \ldots & 0 \\ 0 & 0 & -1 & 0 & \ldots & 0 \\ & & \vdots \\ 1 & 0 & 0 & 0 & \ldots & 0 \\ }$

We have 1's just above the diagonal and -1's just below. As the discretisation gets finer we get a matrix where the entries are more and more concentrated near the diagonal even though all of the non-zero terms are actually off-diagonal. In the limit you can again imagine something that is infinitesimally off-diagonal.
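A small NumPy sketch of this discretisation, assuming a periodic grid of `n` points with spacing `h` (so the matrix above is divided by `h` to approximate $d/dx$). The helper name `central_difference_matrix` is made up for this illustration.

```python
import numpy as np

def central_difference_matrix(n):
    """n x n periodic central-difference matrix, in units of 1/h (grid spacing h)."""
    d = np.zeros((n, n))
    idx = np.arange(n)
    d[idx, (idx + 1) % n] = 0.5    # +1/2 just above the diagonal, wrapping around
    d[idx, (idx - 1) % n] = -0.5   # -1/2 just below the diagonal, wrapping around
    return d

n = 64
h = 2 * np.pi / n                  # grid spacing on the periodic interval [0, 2*pi)
x = h * np.arange(n)
D = central_difference_matrix(n) / h

# Every diagonal entry is exactly zero ...
print(np.allclose(np.diag(D), 0.0))
# ... yet D still acts as a derivative: D sin(x) is close to cos(x).
print(np.max(np.abs(D @ np.sin(x) - np.cos(x))))
```

The matrix is antisymmetric, so multiplying it by $-i\hbar$ gives a Hermitian matrix, as one would want for a discretised momentum operator.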


OP wrote (v1):

Does this imply that $$\tag{1}\langle x | \hat{p} | x^{\prime} \rangle = 0$$ whenever $x \neq x^{\prime}$?

For two fixed values of $x \neq x^{\prime}$, the answer is Yes $^1$. But don't try to integrate (1) over $x$ or $x^{\prime}$, i.e. don't treat $x$ and $x^{\prime}$ as running parameters.

Is the momentum operator diagonal in position representation?

No. If the position eigenstates $(| x \rangle)_{x\in\mathbb{R}}$ diagonalized both the position operator $\hat{x}$ and the momentum operator $\hat{p}$, this would for instance imply that $\hat{x}$ and $\hat{p}$ commute, which we know they don't, cf. the canonical commutation relation (CCR).
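One way to spell this out is to sandwich the commutator between position eigenstates. This is only a sketch: it assumes the CCR in the form $[\hat{x},\hat{p}]=i\hbar$ and the standard position-space expression $\langle x | \hat{p} | x^{\prime} \rangle = -i\hbar\,\delta^{\prime}(x-x^{\prime})$, neither of which is written out in the question.

$$\langle x | [\hat{x},\hat{p}] | x^{\prime} \rangle = (x-x^{\prime})\,\langle x | \hat{p} | x^{\prime} \rangle = i\hbar\,\delta(x-x^{\prime})$$

If $\langle x | \hat{p} | x^{\prime} \rangle$ were a genuinely diagonal kernel of the form $c(x)\,\delta(x-x^{\prime})$, the middle expression would vanish identically because $(x-x^{\prime})\,\delta(x-x^{\prime})=0$, contradicting the right-hand side. With $\delta^{\prime}$ instead, the distributional identity $u\,\delta^{\prime}(u)=-\delta(u)$ gives $(x-x^{\prime})\,(-i\hbar)\,\delta^{\prime}(x-x^{\prime}) = i\hbar\,\delta(x-x^{\prime})$, as required.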

The above apparent paradoxes are rooted in wrongly thinking of a distribution, say $\delta(x)$, as a function from $\mathbb{R}$ to $[0,\infty]$ (which takes the value $\infty$ at the point $x=0$). This is an insufficient picture. Distributions should either be understood as a suitable limit of ordinary functions, or defined with the help of test functions.

--

$^1$ This is related to the fact that the distribution $\delta^{\prime}(x)$ has support only at $x=0$.


In addition to Qmechanic's answer, this is what happens when you integrate over $x$ with a test function, which is what you need to do for the expression to become meaningful. So let's use $f(x)$ as a test function$^1$ and integrate:

$$-i \hbar \int dx \frac{\partial}{\partial x}f(x)\delta(x-x') = -i\hbar f'(x')$$

So in general this is not zero.
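A quick numerical check of the integral above, offered as a sketch only: the delta function is replaced by a narrow unit-area Gaussian, the test function is taken to be $f(x)=\sin x$ (an arbitrary choice for illustration), and the prefactor $-i\hbar$ is left out. The names `narrow_gaussian` and `smeared_derivative` are made up here.

```python
import numpy as np

def narrow_gaussian(x, sigma):
    """Unit-area Gaussian standing in for delta(x) as sigma -> 0."""
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def smeared_derivative(f_prime, x0, sigma=1e-3, n=20001):
    """Approximate the integral of f'(x) * delta(x - x0) dx with a Riemann sum."""
    x = np.linspace(x0 - 10 * sigma, x0 + 10 * sigma, n)
    dx = x[1] - x[0]
    return np.sum(f_prime(x) * narrow_gaussian(x - x0, sigma)) * dx

# Illustrative test function f(x) = sin(x), so f'(x) = cos(x).
x0 = 0.7
approx = smeared_derivative(np.cos, x0)
print(approx, np.cos(x0))  # the two values should agree closely, and neither is zero
```

Shrinking `sigma` further should bring the two numbers closer together, which is the "suitable limit of ordinary functions" reading of the delta function.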


$^1$ Assume $f$ is differentiable.