Examples of density operators $\rho=\sum\limits_n p_n|\phi_n\rangle\langle\phi_n|$ in which the states $\{|\phi_n\rangle\}$ are not orthogonal

This thread has seen a ton of incorrect statements coming from a number of sides, so it's probably a good idea to set the record straight in a bit more detail, and to provide some more examples of how expressions of this form come up in practice.

So, let's go through a brief rundown of some pertinent points.

  • A density matrix is, by definition, just an operator $\rho:\mathcal H \to \mathcal H$ that is self-adjoint and positive semidefinite (and trace class if $\dim(\mathcal H)=\infty$), and whose trace satisfies $$\mathrm{Tr}(\rho)=1.$$ More importantly, this is all that's required by the definition. Any operator that satisfies those conditions can legitimately be called a density matrix, period.

  • Because of that, all operators that can be expressed in the OP's form, $$ \rho = \sum_n p_n |\phi_n \rangle\langle \phi_n|, \tag{$*$}$$ are valid density matrices so long as the component vectors are normalized to $\langle \phi_n|\phi_n \rangle =1$ and the weights $p_n\geq 0$ add up to $\sum_n p_n = 1$.

  • Those two requirements are the only actual requirements. None of the conditions for density-matrix-ness ($\rho^\dagger=\rho$, $\rho\geq 0$, and $\mathrm{Tr}(\rho)=1$) are impacted if the $|\phi_n\rangle$ are not pairwise orthogonal, or if their number exceeds the state space's dimension. That means that it's perfectly fine to take non-orthogonal states in a representation of the form $(*)$.

  • Explicit examples with non-orthogonal projectors are trivial to construct. Norbert Schuch's answer contains one example, but if you go looking for them you can build them instantly by just taking any collection of unit-normalized vectors weighted by non-negative weights $p_n$ that add up to one.

    To provide one such example explicitly, consider the two-level space $\mathcal H = \mathbb C^2$, and a sequence of $N$ vectors lying equispaced along the equator of its Bloch sphere, giving $$ \rho = \sum_{n=0}^{N-1} p_n |\varphi_n\rangle\langle \varphi_n| \quad \text{for} \quad |\varphi_n\rangle = \frac{1}{\sqrt{2}} \bigg( |0\rangle + e^{i 2\pi n/N} |1\rangle\bigg). \tag{$\star$} $$ Here the weights can be arbitrary so long as $p_n\geq 0$ and $\sum_{n=0}^{N-1} p_n=1$; one obvious choice is $p_n = 1/N$, which (for $N\geq 2$) gives the maximally-mixed state $\rho = \frac12 \mathbb I$, but there are plenty of other possible choices.
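If you'd like to check the example $(\star)$ numerically, here is a minimal NumPy sketch. The helper names `equator_state` and `mixture`, the choice $N=5$, and the skewed weights are all made up for illustration; they are not part of the original argument.

```python
import numpy as np

def equator_state(n, N):
    """Return |varphi_n> = (|0> + e^{2 pi i n/N}|1>)/sqrt(2) as a length-2 vector."""
    return np.array([1.0, np.exp(2j * np.pi * n / N)]) / np.sqrt(2)

def mixture(states, weights):
    """Return sum_n p_n |phi_n><phi_n| for unit vectors and weights summing to 1."""
    return sum(p * np.outer(v, v.conj()) for p, v in zip(weights, states))

N = 5  # illustrative choice; any N >= 2 works for the equal-weight check
states = [equator_state(n, N) for n in range(N)]

# Neighbouring states are not orthogonal: their overlap is nonzero.
overlap = abs(np.vdot(states[0], states[1]))

# Equal weights p_n = 1/N reproduce the maximally mixed state I/2.
rho_equal = mixture(states, np.full(N, 1 / N))

# Unequal (illustrative) weights still give a Hermitian, positive, unit-trace operator.
weights = np.array([0.4, 0.3, 0.15, 0.1, 0.05])
rho_skewed = mixture(states, weights)
```

Comparing `rho_equal` against `np.eye(2) / 2` with `np.allclose`, and inspecting `np.linalg.eigvalsh(rho_skewed)`, is enough to confirm the three defining conditions in this case.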

  • Representations of the form $(*)$ are not unique. Suppose, say, that you have some density matrix $\rho$ that you've managed to represent as a sum of normalized projectors in two different ways, say, $$ \rho = \sum_n p_n |\phi_n \rangle\langle \phi_n| = \sum_m q_m |\chi_m \rangle\langle \chi_m|, \tag{$**$}$$ where $\sum_n p_n = 1 = \sum_m q_m$ and $\langle \phi_n|\phi_n \rangle =1=\langle \chi_m|\chi_m \rangle$. Then there are some loose requirements on the two sets of vectors, starting with the fact that $\mathrm{span}\{|\phi_n\rangle\}$ needs to match $\mathrm{span}\{|\chi_m\rangle\}$, but in general, the layout of the $|\phi_n\rangle$ and the $|\chi_m\rangle$ within that span can be very different. This is evident in the example $(\star)$ above with equal weights, where $\rho$ is independent of the number $N$ of vectors in your collection, and it can also be represented as $\rho = \tfrac12 \left[ |0\rangle\langle 0| + |1\rangle\langle 1| \right]$.

  • Representations of the form $(*)$ are interpretations, and little more. There is some physical content in the statement $$ \rho = \sum_n p_n |\phi_n \rangle\langle \phi_n|, \tag{$*$}$$ namely, that you can produce the system state $\rho$ by producing the pure states $|\phi_n\rangle$ with probabilities $p_n$ and then forgetting which pure state you actually produced. However, the operative word there is "can": the fact that that procedure will produce $\rho$ does not say, at all, that it is the only possible procedure that will produce that state.

  • Representations do not imply that the vectors involved are eigenvectors of the resultant density matrix. That's true if the projectors are pairwise orthogonal, but that's not a requirement at all, so it is perfectly possible to construct $\rho$ as a sum of projectors that have nothing to do with the sum's eigenprojectors.

    It's probably helpful to illustrate this with an explicit example, for clarity. Consider a two-level system that's prepared in a superposition of the form $$ |\theta_\pm\rangle = \cos(\theta/2)|0\rangle \pm \sin(\theta/2)|1\rangle,$$ i.e. an angle $\theta$ down from the north pole of the Bloch sphere, except that each time we flip a fair coin to see which sign of $\theta$ (i.e. which direction on the prime meridian) we take. Then the density matrix reads \begin{align} \rho & = \frac12 \bigg( |\theta_+\rangle\langle\theta_+| +|\theta_-\rangle\langle\theta_-| \bigg) \\ & = \frac12 \bigg( \big(\cos(\theta/2)|0\rangle + \sin(\theta/2)|1\rangle \big) \big(\cos(\theta/2)\langle 0| + \sin(\theta/2)\langle 1| \big) \\ & \qquad + \big(\cos(\theta/2)|0\rangle - \sin(\theta/2)|1\rangle \big) \big(\cos(\theta/2)\langle 0| - \sin(\theta/2)\langle 1| \big) \bigg) \\ & = \cos^2(\theta/2)|0\rangle\langle 0| + \sin^2(\theta/2)|1\rangle\langle 1| \end{align} because the off-diagonal terms cancel out. In this second representation, we do have orthogonal projectors, so here $|0\rangle$ and $|1\rangle$ are indeed the unique eigenvectors of $\rho$ (unless $\theta=\pi/2$ and $\rho$ is maximally mixed). But that doesn't stop our initial representation, $\rho = \frac12 \left( |\theta_+\rangle\langle\theta_+| +|\theta_-\rangle\langle\theta_-| \right)$, with its non-orthogonal, non-eigenvector components, from also being true.
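The same cancellation can be checked numerically. The sketch below assumes an arbitrary illustrative angle $\theta=\pi/3$ and a made-up helper `theta_state`; it confirms that the two representations agree, that $|\theta_\pm\rangle$ are not orthogonal, and that $|\theta_+\rangle$ is not an eigenvector of $\rho$.

```python
import numpy as np

theta = np.pi / 3  # illustrative angle; any theta other than 0, pi/2, pi behaves the same way

def theta_state(theta, sign):
    """Return cos(theta/2)|0> + sign * sin(theta/2)|1> as a real length-2 vector."""
    return np.array([np.cos(theta / 2), sign * np.sin(theta / 2)])

plus = theta_state(theta, +1)
minus = theta_state(theta, -1)

# Fair-coin mixture of the two non-orthogonal states.
rho = 0.5 * (np.outer(plus, plus) + np.outer(minus, minus))

# The orthogonal (eigen)representation derived in the text.
rho_diag = np.diag([np.cos(theta / 2) ** 2, np.sin(theta / 2) ** 2])

# Overlap <theta_+|theta_-> = cos(theta), nonzero unless theta = pi/2.
overlap = plus @ minus

# rho|theta_+> is parallel to |theta_+> only if this 2D cross product vanishes.
image = rho @ plus
cross = image[0] * plus[1] - image[1] * plus[0]
```

Here `np.allclose(rho, rho_diag)` holds while `cross` is clearly nonzero, which is exactly the statement that a valid representation need not be built from eigenvectors.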

  • If a state is built up using non-orthogonal projectors, then it also has a separate representation in terms of orthogonal projectors, and that's perfectly fine. Representations of the form $(*)$ are a dime a dozen if you know where to look. So, you found one that's not the canonical one: great! There are millions where that one came from.

  • Representations of the form $(*)$ really are a dime a dozen. If you want to build one yourself, say, for a two-level system, there's a few points that are particularly relevant to the recipe:

    • The Pauli matrices, together with the identity, form a basis for all valid density matrices, i.e. if $\rho=\rho^\dagger$ has unit trace, then it can be represented as $$ \rho = \tfrac12 \left( \mathbb I + \vec p \cdot \vec \sigma \right),$$ where $\vec p = (p_x,p_y,p_z)\in \mathbb R^3$ and $\vec \sigma =(\sigma_x, \sigma_y, \sigma_z)$ are the Pauli matrices. (Further, that relationship can be inverted via $\vec p = \mathrm{Tr}(\rho\vec\sigma)$.)
    • The positivity condition $\rho\geq 0$ translates into the condition $||\vec p||\leq 1$, i.e. $\vec p$ lives inside the unit ball or on its boundary, generally known as the Bloch ball and the Bloch sphere, respectively, in this context.
    • If $||\vec p||=1$, i.e. $\vec p$ is on the Bloch sphere boundary, then $\rho = |\psi\rangle\langle\psi|$ is a pure state, and if you write $|\psi\rangle = \cos(\theta/2) |0\rangle + e^{i\varphi}\sin(\theta/2)|1\rangle$ (which you always can) then $\theta\in [0,\pi]$ and $\varphi\in[0,2\pi)$ are the polar and azimuthal spherical coordinates for $$ \vec p = (\sin(\theta)\cos(\varphi), \sin(\theta)\sin(\varphi), \cos(\theta)).$$
    • The relationship between $\vec p$ and $\rho$ is linear and bijective.
    • If $\rho_1$ and $\rho_2$ are valid density matrices, then any convex combination $$ \rho = q_1 \rho_1 + q_2 \rho_2$$ of the two, with non-negative weights adding to $q_1+q_2=1$, is also a valid density matrix.
    • Because the relationship between density matrices and Bloch-ball vectors is linear, any convex combination of density matrices translates directly into a convex combination of the corresponding Bloch-ball vectors. Thus, if $ \rho_1 = \tfrac12 \left( \mathbb I + \vec p_1 \cdot \vec \sigma \right),$ $ \rho_2 = \tfrac12 \left( \mathbb I + \vec p_2 \cdot \vec \sigma \right),$ and $ \rho = q_1 \rho_1 + q_2 \rho_2$, then $ \vec p= q_1 \vec p_1 + q_2 \vec p_2 = \vec p_1 + q_2 (\vec p_2 - \vec p_1)$ lies on the line that goes from $\vec p_1$ to $\vec p_2$, a fraction $q_2=1-q_1$ of the way in that direction.

    So, what does this mean for density-matrix representations? If you have a target density matrix $\rho$ that you want to represent, simply take its Bloch-ball vector $\vec p = \mathrm {Tr}(\rho\vec\sigma)$, and then pick $N$ points $\vec p_n$ on the Bloch sphere itself (the boundary) and weights $q_n$ (normalized to $\sum_n q_n=1$) such that their average $\sum_n q_n \vec p_n=\vec p$ gives you your chosen point. That will then naturally give you a representation of your density matrix as a weighted sum of $N$ pure-state projectors, and you can read off the computational-basis components directly from the spherical coordinates of your chosen extremal points.
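As a rough sketch of that recipe for $N=2$: pick any direction through the target's Bloch vector, find where that chord pierces the Bloch sphere, and weight the two endpoints so that they average back to $\vec p$. The code assumes the normalization $\rho=\tfrac12(\mathbb I+\vec p\cdot\vec\sigma)$, for which $\vec p=\mathrm{Tr}(\rho\vec\sigma)$ and pure states sit at $||\vec p||=1$; the function names, the target matrix, and the chord direction are all hypothetical choices made for illustration.

```python
import numpy as np

# Pauli matrices sigma_x, sigma_y, sigma_z stacked along the first axis.
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def bloch_vector(rho):
    """Return p = Tr(rho sigma), assuming rho = (I + p.sigma)/2."""
    return np.real([np.trace(rho @ s) for s in SIGMA])

def ket_from_bloch(p):
    """Return cos(theta/2)|0> + e^{i phi} sin(theta/2)|1> for a unit Bloch vector p."""
    theta = np.arccos(np.clip(p[2], -1.0, 1.0))
    phi = np.arctan2(p[1], p[0])
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

def chord_decomposition(rho, direction):
    """Split a mixed rho into two pure states at the ends of a chord through its Bloch vector."""
    p = bloch_vector(rho)
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    # Solve |p + t d| = 1 for the two intersection parameters t_plus > 0 > t_minus.
    b = p @ d
    root = np.sqrt(b**2 + 1 - p @ p)
    t_plus, t_minus = -b + root, -b - root
    points = [p + t_plus * d, p + t_minus * d]
    # Weights fixed by q1 + q2 = 1 and q1*p1 + q2*p2 = p.
    weights = [-t_minus / (t_plus - t_minus), t_plus / (t_plus - t_minus)]
    return weights, [ket_from_bloch(v) for v in points]

# Illustrative target state (Hermitian, unit trace, positive) and chord direction.
rho_target = np.array([[0.7, 0.1 - 0.2j], [0.1 + 0.2j, 0.3]])
weights, kets = chord_decomposition(rho_target, direction=[1.0, 2.0, -1.0])
rho_rebuilt = sum(q * np.outer(k, k.conj()) for q, k in zip(weights, kets))
```

Every choice of `direction` gives a different two-state representation of the same `rho_target`, and the two kets come out orthogonal only when the chord happens to pass through the centre of the ball (in which case you recover the eigendecomposition).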


Just consider a two-level system and take the three states $|0\rangle$, $|1\rangle$, and $|+\rangle = (|0\rangle+|1\rangle)/\sqrt{2}$. Then, the mixed state $$ \rho = \tfrac13 |0\rangle\langle0| + \tfrac13 |1\rangle\langle1| + \tfrac13 |+\rangle\langle+| $$ is an example of what you are asking for. (Of course, it also has an eigenvalue decomposition where the vectors are orthogonal.)
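For what it's worth, a few lines of NumPy (a sketch, with variable names chosen just for illustration) make both decompositions of this state explicit:

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
ket_plus = (ket0 + ket1) / np.sqrt(2)

# Equal-weight mixture of three pairwise non-orthogonal or orthogonal states.
rho = sum(np.outer(k, k) for k in (ket0, ket1, ket_plus)) / 3

# Orthogonal eigendecomposition of the same operator (eigenvalues in ascending order).
eigenvalues, eigenvectors = np.linalg.eigh(rho)

# Rebuild rho from its two orthogonal eigenprojectors.
rho_spectral = sum(
    lam * np.outer(eigenvectors[:, i], eigenvectors[:, i])
    for i, lam in enumerate(eigenvalues)
)
```

The eigenvalues come out as $1/3$ and $2/3$, with eigenvectors proportional to $|0\rangle-|1\rangle$ and $|0\rangle+|1\rangle$, so the same $\rho$ is a three-term non-orthogonal mixture and a two-term orthogonal one.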

In case you don't want them to form an (over-complete) basis either, just consider the same example in a three-dimensional space.


A widely used example of representations of this sort is the family of so-called quasiprobability distributions.

Consider the harmonic oscillator. You can use the orthonormal coordinate $|x\rangle$, momentum $|p\rangle$ or Fock $|n\rangle$ bases. However, there are also nice states, \begin{equation} |\alpha\rangle=e^{\alpha a^\dagger - \alpha^\ast a}|0\rangle \end{equation} known as coherent states. Those are Gaussian wavepackets that are localized both in coordinate and momentum space, with $\alpha=\langle x\rangle+i\langle p\rangle$ in suitably scaled units (up to a convention-dependent normalization factor). It's important to stress that coherent states with different $\alpha$ are not orthogonal.
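To make the non-orthogonality quantitative, the standard overlap of two coherent states (assuming the usual convention $[a,a^\dagger]=1$, with $|\beta\rangle$ a second coherent state introduced here for illustration) is \begin{equation} \langle\beta|\alpha\rangle = \exp\left(-\tfrac12|\alpha|^2-\tfrac12|\beta|^2+\beta^\ast\alpha\right), \qquad |\langle\beta|\alpha\rangle|^2 = e^{-|\alpha-\beta|^2}, \end{equation} which is never exactly zero, although it becomes very small once $|\alpha-\beta|\gg 1$.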

You can write any density matrix $\rho$ for the harmonic oscillator, at least formally (allowing $P$ to be a generalized function where necessary), as an integral over the phase space, \begin{equation} \rho=\int d^2\alpha\, P(\alpha,\alpha^\ast)|\alpha\rangle\langle\alpha| \end{equation} From $\operatorname{Tr}\rho=1$ it follows that \begin{equation} \int d^2\alpha\, P(\alpha,\alpha^\ast)=1 \end{equation} and you can often treat it as a probability distribution in the phase space.
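As one well-behaved illustration (a standard textbook case, not something taken from the question), a thermal state with mean occupation number $\bar n$ has a genuinely Gaussian weight, \begin{equation} \rho_\mathrm{th}=\int d^2\alpha\, \frac{1}{\pi\bar n}\,e^{-|\alpha|^2/\bar n}\,|\alpha\rangle\langle\alpha|, \end{equation} so in that case the state really is a classical mixture of non-orthogonal coherent states with non-negative weights.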

However, this is where the "quasi" in "quasiprobability" comes in. The function $P(\alpha,\alpha^\ast)$ is allowed to be negative in some regions! The operator $\rho$ can still be positive semidefinite.

So you can of course consider such representations, but remember that some of the weights $p_n$ may then actually be negative, in which case they can no longer be read as probabilities.