How can one motivate the relativistic momentum?

What I would like to know is how one can motivate that the correct choice for $\mathbf p$ is $\gamma(v) m \mathbf v$.

In Newtonian mechanics, the momentum of a particle of mass $m$ is given by

$$\mathbf p = m\frac{d {\mathbf r}}{dt} = m \mathbf v$$

where $\mathbf r$ is the position vector and $t$ is a universal parameter. However, in relativistic mechanics, $t$ is a coordinate, not a parameter, and is thus a component of a four-vector, the four-position $\mathbf R = (ct, \mathbf r)$.

The four-velocity is then defined as

$$\mathbf U = \frac{d \mathbf R}{d \tau} = \frac{d \mathbf R}{d t}\frac{dt}{d\tau} = \frac{d \mathbf R}{d t}\gamma = \gamma (c, \mathbf v) $$

where $\tau$ is the proper time parameter. In analogy with Newtonian mechanics, the four-momentum is then

$$\mathbf P = m \mathbf U = \gamma m(c, \mathbf v)$$

and then we see that the relativistic momentum is simply the spatial part of the four-momentum.
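As a rough numerical illustration of this definition, the sketch below builds $\mathbf P = \gamma m (c, \mathbf v)$ and checks two things: the spatial part reduces to $m\mathbf v$ at everyday speeds, and the Minkowski norm $\mathbf P \cdot \mathbf P$ equals $m^2c^2$. The function names (`gamma`, `four_momentum`, `minkowski_norm_sq`) and the sample values are illustrative choices, not part of the argument above.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(v):
    """Lorentz factor for a speed v in m/s, with |v| < C."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def four_momentum(m, vx, vy, vz):
    """Return P = gamma * m * (c, v) as a tuple (P0, Px, Py, Pz)."""
    speed = math.sqrt(vx**2 + vy**2 + vz**2)
    g = gamma(speed)
    return (g * m * C, g * m * vx, g * m * vy, g * m * vz)

def minkowski_norm_sq(P):
    """P.P with signature (+, -, -, -)."""
    return P[0] ** 2 - P[1] ** 2 - P[2] ** 2 - P[3] ** 2

m = 1.0  # kg, illustrative value
P_fast = four_momentum(m, 0.6 * C, 0.0, 0.0)
P_slow = four_momentum(m, 30.0, 0.0, 0.0)

# Ratio of relativistic to Newtonian momentum: gamma, about 1.25 at v = 0.6 c
print(P_fast[1] / (m * 0.6 * C))
# At 30 m/s the ratio is indistinguishable from 1 in practice
print(P_slow[1] / (m * 30.0))
# The invariant P.P / (m c)^2 should be close to 1
print(minkowski_norm_sq(P_fast) / (m * C) ** 2)
```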


Special relativity is about Minkowski spacetime. A line element is given by

$$ ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2 $$

A free particle will move on a straight line, that is, it will extremize the path length (for timelike paths the straight line in fact maximizes it)

$$ L = \int ds = \int \sqrt{c^2 \left(\frac{dt}{d\lambda}\right)^2 - \left(\frac{dx}{d\lambda}\right)^2 - \left(\frac{dy}{d\lambda}\right)^2 - \left(\frac{dz}{d\lambda}\right)^2} \ d\lambda$$

where $\lambda$ is an arbitrary parametrisation of the path. We set

$$ I(\lambda) := \sqrt{c^2 \left(\frac{dt}{d\lambda}\right)^2 - \left(\frac{dx}{d\lambda}\right)^2 - \left(\frac{dy}{d\lambda}\right)^2 - \left(\frac{dz}{d\lambda}\right)^2} $$

The Euler-Lagrange equations give:

$$ \frac{d}{d\lambda} \left( \frac{\partial I}{\partial \left( \frac{d(ct)}{d\lambda} \right)} \right) - \frac{\partial I}{\partial (ct)} = 0 $$

$$ \frac{d}{d\lambda} \left( \frac{\partial I}{\partial \left( \frac{dx}{d\lambda} \right)} \right) - \frac{\partial I}{\partial x} = 0 $$

etc.

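Since $I$ does not depend on the coordinates themselves, the second term in each equation vanishes. If we evaluate the remaining derivatives, choose the parametrisation such that $I$ is constant along the path (as it is for proper time, where $I = c$), and multiply by $I$, we get:

$$ c \frac{d^2t}{d\lambda^2} = 0 $$ $$ - \frac{d^2x}{d\lambda^2} = 0 $$ $$ - \frac{d^2y}{d\lambda^2} = 0 $$ $$ - \frac{d^2z}{d\lambda^2} = 0 $$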

Now we parametrise by proper time, $d\lambda = d\tau = \frac{1}{c} ds$, introduce $x_\mu = (ct,-x,-y,-z)^T$ and multiply by $m$. Combining all four equations, this leaves us with

$$ m \frac{d^2x_\mu}{d\tau^2} = 0 $$

the covariant equation of motion of a free particle. Using

$$ d\tau = \frac{1}{c} ds = \frac{1}{c} \sqrt{c^2 dt^2 - dx^2 - dy^2 - dz^2} \\ = \frac{1}{c} dt \sqrt{c^2 -\left(\frac{dx}{dt}\right)^2 - \left(\frac{dy}{dt}\right)^2 - \left(\frac{dz}{dt}\right)^2} = dt \frac{1}{\gamma(v)} $$

to express this in terms of the coordinate time $t$, the equation of motion is equivalent to:

$$ \frac{d}{dt} \left( m \cdot \gamma(v) \cdot \frac{dx_\mu}{dt} \right) \hat{=} \frac{d}{dt} \left( \matrix{\gamma(v) \cdot m c \\ - m \cdot \gamma(v) \cdot \frac{dx}{dt} \\ - m \cdot \gamma(v) \cdot \frac{dy}{dt} \\ - m \cdot \gamma(v) \cdot \frac{dz}{dt}} \right) \hat{=} \frac{d}{dt} \left( \matrix{ \gamma(v) \cdot m c \\ - m \cdot \gamma(v) \cdot \vec{v} } \right) = \left( \matrix{0 \\ \vec{0}} \right) $$

The new dynamical quantities are $ \vec{p} = m \cdot \gamma(v) \cdot \vec{v}$, which we may call momentum, and $\frac{E}{c} = \gamma(v) \cdot m c $ where $E$ is energy.
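One way to see that these names are reasonable is to check the low-speed behaviour numerically. The sketch below works with $\beta = v/c$ and the dimensionless quantities $E/(mc^2) = \gamma$ and $p/(mc) = \gamma\beta$; the helper names are hypothetical. It shows that $E - mc^2$ approaches the Newtonian kinetic energy $\frac{1}{2}mv^2$ for small $\beta$, and that $E^2 - (pc)^2 = (mc^2)^2$ at any speed.

```python
import math

def gamma(beta):
    """Lorentz factor as a function of beta = v / c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def energy_over_mc2(beta):
    """E / (m c^2) = gamma."""
    return gamma(beta)

def momentum_over_mc(beta):
    """|p| / (m c) = gamma * beta."""
    return gamma(beta) * beta

def kinetic_ratio(beta):
    """(E - m c^2) divided by the Newtonian kinetic energy m v^2 / 2."""
    return (gamma(beta) - 1.0) / (0.5 * beta * beta)

for beta in (0.01, 0.1, 0.8):
    e = energy_over_mc2(beta)
    p = momentum_over_mc(beta)
    # The ratio tends to 1 as beta -> 0; e^2 - p^2 stays close to 1 throughout
    print(beta, kinetic_ratio(beta), e * e - p * p)
```

The ratio grows noticeably above 1 only once $\beta$ is no longer small, which is where the Newtonian expressions stop being a good approximation.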

One can now try to add forces on the right side of the equation of motion.

In short: if we start from the assumption that a free particle moves on a straight line in Minkowski space, we are led to new dynamical quantities $\vec{p}$ and $E$ that can be used to describe properties of motion in much the same way as their counterparts in Newtonian mechanics.

If one tries to describe nature on the basis of tensors, the quantity $\gamma(v) \cdot m$ is not a "good" quantity, as it does not transform like a tensor (e.g. like a scalar); it is merely one component of a four-vector, up to a factor of $c$. However, the quantities $m$ and $(\frac{E}{c}, \vec{p})^T$ are tensors (a scalar and a contravariant tensor of first rank, respectively). So these are the "better" quantities according to this criterion.
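The difference between the two kinds of quantity can be made concrete with a small sketch, assuming units with $c = 1$ and motion along the $x$ axis only. The helper `boost_x` and the sample values are hypothetical. The combination $\sqrt{E^2 - p^2}$ comes out the same in both frames, while $\gamma m$ (which equals $E$ in these units) does not.

```python
import math

def gamma(v):
    """Lorentz factor in units with c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def boost_x(E, p, u):
    """Components (E', p') seen from a frame moving with velocity u along x."""
    g = gamma(u)
    return g * (E - u * p), g * (p - u * E)

m, v = 2.0, 0.5                        # illustrative mass and velocity
E, p = gamma(v) * m, gamma(v) * m * v  # components in the original frame
E2, p2 = boost_x(E, p, 0.5)            # frame co-moving with the particle

# Invariant mass: close to m = 2 in both frames
print(math.sqrt(E * E - p * p), math.sqrt(E2 * E2 - p2 * p2))
# gamma * m is frame dependent: roughly 2.31 in the first frame, 2.0 in the second
print(E, E2)
```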


The Hamiltonian $H$ generates time translations and the momentum $\mathbf p$ generates space translations. In a relativistic theory time and space can mix, so we should consider the 4-vector (in units where $c = 1$)

$$p^\mu = (H, \mathbf p).$$

Now whatever $\mathbf p$ is in terms of mass, velocity and so on, it is certainly $\mathbf 0$ in a rest frame. The time component cannot vanish there as well: a Lorentz transformation maps the zero 4-vector to zero, so the momentum would then vanish in every frame. Thus in a rest frame

$$p^\mu = (m, \mathbf 0)$$

for some quantity $m$. Boost to a general frame to conclude that

$$p^\mu = (\gamma m, \gamma m \mathbf v).$$

Comparison with the non-relativistic limit, where $\gamma m \mathbf v \to m \mathbf v$, shows that $m$ is the mass.
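The boost step can be sketched explicitly, assuming units with $c = 1$ and the standard form of a rotation-free Lorentz boost. The function names `boost_matrix` and `apply`, and the sample mass and velocity, are illustrative assumptions. Applying the matrix to the rest-frame components $(m, \mathbf 0)$ should reproduce $(\gamma m, \gamma m \mathbf v)$.

```python
import math

def boost_matrix(vx, vy, vz):
    """4x4 boost (c = 1) from the rest frame to a frame where the particle has velocity v."""
    v = (vx, vy, vz)
    v2 = vx * vx + vy * vy + vz * vz
    g = 1.0 / math.sqrt(1.0 - v2)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = g
    for i in range(3):
        L[0][i + 1] = g * v[i]
        L[i + 1][0] = g * v[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            # Spatial block: identity plus (gamma - 1) times the projector onto v
            extra = (g - 1.0) * v[i] * v[j] / v2 if v2 > 0.0 else 0.0
            L[i + 1][j + 1] = delta + extra
    return L

def apply(L, p):
    """Matrix-vector product for a 4x4 matrix and a 4-component vector."""
    return [sum(L[mu][nu] * p[nu] for nu in range(4)) for mu in range(4)]

m = 1.5                       # illustrative rest-frame time component
v = (0.3, 0.4, 0.0)           # speed 0.5
p_rest = [m, 0.0, 0.0, 0.0]
p_lab = apply(boost_matrix(*v), p_rest)

g = 1.0 / math.sqrt(1.0 - 0.25)
print(p_lab)                                   # boosted components
print([g * m, g * m * 0.3, g * m * 0.4, 0.0])  # gamma*m*(1, v) for comparison
```

Only the first column of the matrix matters here, because the rest-frame vector has no spatial components.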