How are the Pauli matrices for the electron spin derived?

That certainly depends on what exactly you mean. I take your question as "how do you see that the (non-relativistic) electron spin (or more generally, Spin-1/2) is described by the Pauli matrices?"

Well, to start, we know that measuring the electron spin can only result in one of two values. From this we see that we need matrices of at least dimension 2. The simplest choice is then of course exactly dimension 2.

Moreover the spin is an angular momentum, and thus described by three operators obeying the angular momentum algebra: $[L_i, L_j] = \mathrm{i} \hbar\epsilon_{ijk} L_k$. This together with matrix dimension 2 basically restricts the choice to sets of three matrices which are equivalent to $\hbar/2$ times the Pauli matrices (the freedom of choice of those matrices corresponds to the freedom to use three arbitrary orthogonal directions as $x$, $y$ and $z$ direction).
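As a quick numerical sanity check of that statement, here is a minimal sketch, assuming NumPy and units with $\hbar = 1$. The helper names (`levi_civita`, `commutator`, `algebra_holds`) are made up for illustration. It checks that $\hbar/2$ times the Pauli matrices obeys the angular momentum algebra, while the bare Pauli matrices do not (they are off by a factor of 2).

```python
import numpy as np

hbar = 1.0  # assumption: natural units, purely for illustration

# The three Pauli matrices, stacked as sigma[0], sigma[1], sigma[2] = x, y, z.
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

# Candidate spin operators: hbar/2 times the Pauli matrices.
S = 0.5 * hbar * sigma


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices taken from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) / 2


def commutator(a, b):
    return a @ b - b @ a


def algebra_holds(ops):
    """Check [ops_i, ops_j] = i hbar eps_ijk ops_k for all index pairs."""
    for i in range(3):
        for j in range(3):
            rhs = sum(1j * hbar * levi_civita(i, j, k) * ops[k] for k in range(3))
            if not np.allclose(commutator(ops[i], ops[j]), rhs):
                return False
    return True


print(algebra_holds(S))      # spin operators hbar/2 * sigma
print(algebra_holds(sigma))  # bare Pauli matrices, wrong normalization
```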

So now why choose from those equivalent choices exactly the Pauli matrices? Well, there's always one measurement direction which is represented by a diagonal matrix; this makes calculations much easier. Of course it makes sense to choose the matrices in a way that this direction is one of the coordinate directions. By convention, the $z$ direction is chosen. This ultimately fixes the matrices.

Spin is an angular momentum, so in the rest frame it is a 3-dimensional vector, or a 4-dimensional vector with zero time component:

$\vec{v} = (v_1,v_2,v_3)$

Each 3D vector can be associated with a 2x2 matrix by the following rule:

$V = \begin{pmatrix} v_3 & v_1-iv_2 \\ v_1+iv_2 & -v_3 \end{pmatrix}$

In particular, if you choose $\vec{v}$ to be a basis vector, $\vec{v} = (1,0,0)$, it is associated with the matrix

$H_1=\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$

Similarly, for $\vec{v} = (0,1,0)$ you will get

$H_2=\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$

and for $\vec{v} = (0,0,1)$ you will get

$H_3=\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$

These are the Pauli matrices. An arbitrary vector now corresponds to a linear combination of $H_1, H_2$ and $H_3$. For a given $\vec{v} = (v_1,v_2,v_3)$ you will get

$V = v_1 H_1 + v_2 H_2 + v_3 H_3$
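The correspondence between vectors and matrices can be sketched numerically. This is only an illustration, assuming NumPy; the function names `vector_to_matrix` and `matrix_to_vector` are hypothetical. It builds $V$ from $\vec{v}$, recovers $\vec{v}$ from $V$ through the traces $\frac{1}{2}\mathrm{Tr}(V H_i)$, and checks that $V^2 = |\vec{v}|^2$ times the identity.

```python
import numpy as np

H1 = np.array([[0, 1], [1, 0]], dtype=complex)
H2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
H3 = np.array([[1, 0], [0, -1]], dtype=complex)


def vector_to_matrix(v):
    """Map a real 3-vector to the 2x2 matrix v1*H1 + v2*H2 + v3*H3."""
    v1, v2, v3 = v
    return v1 * H1 + v2 * H2 + v3 * H3


def matrix_to_vector(V):
    """Recover the components via v_i = Tr(V H_i) / 2."""
    return np.array([0.5 * np.trace(V @ H).real for H in (H1, H2, H3)])


v = np.array([1.0, 2.0, 2.0])  # example vector with |v| = 3
V = vector_to_matrix(v)

print(V)                    # matrix with v3 on the diagonal, v1 -/+ i v2 off it
print(matrix_to_vector(V))  # the original components
print(V @ V)                # |v|^2 times the identity
```

The trace formula works because $\mathrm{Tr}(H_i H_j) = 2\delta_{ij}$, so the three matrices behave like an orthogonal basis.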

Further, you can use these matrices to act on special objects with 2 complex components. These objects are called "spinors". They are used to construct wave functions of fermions with spin 1/2. For instance, we can choose a spinor of the form

$s_3=\begin{pmatrix} 1 \\ 0 \end{pmatrix}$

and act on it with matrix $H_3$. We will obtain:

$H_3 s_3=\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = s_3$

As you can see, $s_3$ is an "eigenspinor" of $H_3$ with eigenvalue +1. There is also another "eigenspinor" of $H_3$ with eigenvalue -1:

$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = -\begin{pmatrix} 0 \\ 1 \end{pmatrix}$

Why do we use spinors instead of scalars, vectors and tensors? Because:

1. We can construct vectors out of spinors. These vectors are always isotropic (they have zero length), but they have a spatial direction.
2. We can use the Pauli matrices and their linear combinations to rotate vectors constructed from spinors. The rotation rule is very simple: if $s$ is the initial spinor, then the rotated spinor is $\bar{s}=\exp(iV)\,s$. A vector constructed from $\bar{s}$ is the vector constructed from $s$ after an "ordinary" rotation around the axis parallel to $\vec{v}$ by the angle $2|\vec{v}|$ (a spinor turns by half the angle of its vector; the sense of the rotation depends on sign conventions).

Now if, for instance, $s$ is an "eigenspinor" of $H_3$, then the direction of the vector constructed from $s$ will be invariant with respect to rotations around the $z$ axis.
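The rotation rule can be tried out numerically. The sketch below assumes NumPy and, for simplicity, uses the real vector $s^\dagger H_i s$ (the spin direction) as a stand-in for the isotropic vector mentioned above; that substitution and the names `rotation_operator` and `spin_vector` are assumptions of this example, not part of the answer. In this sketch the vector turns through $2|\vec{v}|$, since a spinor rotates by half the angle of the vector built from it.

```python
import numpy as np

PAULI = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def rotation_operator(v):
    """exp(iV) for V = v . sigma, using exp(iV) = cos|v| I + i sin|v| V/|v|."""
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v)
    if norm == 0:
        return np.eye(2, dtype=complex)
    V = np.einsum("i,ijk->jk", v, PAULI)
    return np.cos(norm) * np.eye(2) + 1j * np.sin(norm) * V / norm


def spin_vector(s):
    """Real 3-vector with components s^dagger sigma_i s."""
    s = np.asarray(s, dtype=complex)
    return np.array([np.vdot(s, P @ s).real for P in PAULI])


# A spinor whose spin vector points along x.
s = np.array([1, 1]) / np.sqrt(2)
U = rotation_operator([0, 0, np.pi / 4])  # |v| = pi/4 about the z axis

before = spin_vector(s)
after = spin_vector(U @ s)
angle = np.arccos(np.clip(before @ after, -1, 1))
print(before, after, angle)  # the vector turns by 2|v| = pi/2 in the xy plane

# An eigenspinor of H_3 only picks up a phase, so its vector does not move.
s3 = np.array([1, 0])
print(spin_vector(s3), spin_vector(U @ s3))
```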

The Hilbert space for spin 1/2 is two-dimensional - there are two possible values spin can take: $\hbar/2$ or $-\hbar/2$ (this is taken from experiment).

Now, in a two-dimensional Hilbert space the spin operator has to be self-adjoint (this comes from the foundations of QM). Furthermore, the sum of its eigenvalues has to be 0, because the sum of the eigenvalues is just the sum of the possible measurement results, in this case $\hbar/2-\hbar/2 = 0$. Therefore the operator corresponding to the measurement of spin in a given direction has to be a 2x2 complex Hermitian traceless matrix (no commutation relations used so far!). This family of matrices is a three-parameter one (one real constant on the diagonal, two for the real and imaginary parts of the off-diagonal entry; the rest is determined by Hermiticity and tracelessness).
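The parameter count can be made concrete with a small sketch, assuming NumPy; the names `traceless_hermitian` and `pauli_components` are hypothetical. Three real numbers fix the matrix, the same three numbers come back as its components along the Pauli matrices, and the eigenvalues are always $\pm\sqrt{a^2+b^2+c^2}$, which indeed sum to zero.

```python
import numpy as np

PAULI = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def traceless_hermitian(a, b, c):
    """Most general 2x2 traceless Hermitian matrix from three real numbers."""
    return np.array([[a, b - 1j * c], [b + 1j * c, -a]], dtype=complex)


def pauli_components(M):
    """Components along (sigma_x, sigma_y, sigma_z) via Tr(M sigma_i) / 2."""
    return np.array([0.5 * np.trace(M @ P).real for P in PAULI])


M = traceless_hermitian(0.3, -1.2, 0.4)

print(np.allclose(M, M.conj().T))  # Hermitian
print(np.trace(M))                 # zero trace
print(pauli_components(M))         # components (b, c, a)
print(np.linalg.eigvalsh(M))       # a pair of opposite eigenvalues
```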

Furthermore, every such operator has to be diagonal in its eigenbasis, which corresponds to the measurement in a given direction. Let us call this direction the $z$ direction. We see that the only possible traceless Hermitian diagonal matrix is a multiple of the Pauli matrix $\sigma_z$. Now, we write the remaining two operators as $L_x$ and $L_y$.

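Next, we have to use the commutation relations of angular momentum. With all three operators normalized like the Pauli matrices (that is, dropping the common factor $\hbar/2$), they read $[L_y,\sigma_z]=2\mathrm{i}L_x$, $[L_x,\sigma_z]=-2\mathrm{i}L_y$ and $[L_x, L_y] = 2\mathrm{i}\sigma_z$. Remembering that $L_x$ and $L_y$ are determined by 3 real constants each, we have 6 unknowns. The first two relations are linear and force $L_x$ and $L_y$ to be purely off-diagonal, with $L_y$ fixed by $L_x$; the third fixes their normalization. What remains is a single phase of the off-diagonal entries, which corresponds to a rotation of the $x$ and $y$ axes about the $z$ axis. With the conventional choice of that phase, $L_x$ and $L_y$ are exactly the remaining Pauli matrices.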
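To see how far the commutation relations pin down $L_x$ and $L_y$, here is a minimal numerical sketch, assuming NumPy and the conventional normalization $[\sigma_i, \sigma_j] = 2\mathrm{i}\epsilon_{ijk}\sigma_k$. The one-parameter family `candidate_pair(phi)` is a hypothetical helper: it is the usual pair $\sigma_x$, $\sigma_y$ rotated by an angle `phi` about the $z$ axis. Every member satisfies the algebra with $\sigma_z$, and `phi = 0` reproduces the standard Pauli matrices.

```python
import numpy as np

sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)


def candidate_pair(phi):
    """Off-diagonal Hermitian pair; phi = 0 gives the usual sigma_x, sigma_y."""
    Lx = np.array([[0, np.exp(-1j * phi)], [np.exp(1j * phi), 0]])
    Ly = np.array([[0, -1j * np.exp(-1j * phi)], [1j * np.exp(1j * phi), 0]])
    return Lx, Ly


def comm(a, b):
    return a @ b - b @ a


def satisfies_algebra(Lx, Ly):
    """Check the three commutators in Pauli normalization."""
    return bool(
        np.allclose(comm(Lx, Ly), 2j * sigma_z)
        and np.allclose(comm(Ly, sigma_z), 2j * Lx)
        and np.allclose(comm(sigma_z, Lx), 2j * Ly)
    )


Lx0, Ly0 = candidate_pair(0.0)
print(Lx0)  # sigma_x
print(Ly0)  # sigma_y
print(satisfies_algebra(Lx0, Ly0))
print(satisfies_algebra(*candidate_pair(0.7)))  # rotated pair
print(satisfies_algebra(2 * Lx0, Ly0))          # wrong normalization
```

The third check shows that the relation $[L_x, L_y] = 2\mathrm{i}\sigma_z$ is what fixes the overall scale of the off-diagonal entries.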
