Why do we use Hermitian operators in QM?

One problem with the given $3\times 3$ matrix example is that the eigenspaces are not orthogonal.

Thus it doesn't make sense to say that one has measured, with 100% certainty, the system to be in one eigenspace and not in the others, because that eigenspace may have a non-zero overlap with a different eigenspace.

One may prove$^{1}$ that an operator is Hermitian if and only if it is diagonalizable in an orthonormal basis with real eigenvalues. See also this Phys.SE post.


$^{1}$We will ignore subtleties with unbounded operators, domains, selfadjoint extensions, etc., in this answer.
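For finite-dimensional matrices, both directions of this statement, and the overlap problem for a non-Hermitian matrix, can be checked numerically. The following is only a sketch using NumPy. The random matrices and the $2\times 2$ matrix `N` are illustrative assumptions; `N` is not the $3\times 3$ matrix from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Direction 1: a Hermitian matrix has real eigenvalues and an orthonormal eigenbasis.
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = A + A.conj().T                      # Hermitian by construction
evals, V = np.linalg.eigh(H)            # eigh assumes a Hermitian input
basis_is_orthonormal = np.allclose(V.conj().T @ V, np.eye(3))
h_is_reconstructed = np.allclose(V @ np.diag(evals) @ V.conj().T, H)

# Direction 2: real eigenvalues in an orthonormal basis give a Hermitian matrix.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
d = rng.normal(size=3)                  # arbitrary real eigenvalues
M = Q @ np.diag(d) @ Q.conj().T
m_is_hermitian = np.allclose(M, M.conj().T)

# Counterexample: real eigenvalues (1 and 2), but the eigenvectors are not orthogonal.
N = np.array([[1.0, 1.0],
              [0.0, 2.0]])
n_evals, W = np.linalg.eig(N)
overlap = abs(np.vdot(W[:, 0], W[:, 1]))  # non-zero overlap between eigenvectors

print(basis_is_orthonormal, h_is_reconstructed, m_is_hermitian, overlap)
```

The non-zero `overlap` for `N` is the situation described above: a state lying entirely in one eigenspace still has a component along the other eigenvector.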


If you want to see something different, Carl Bender has written several articles developing quantum mechanics with parity-time (PT) symmetric operators. He shows that some Hamiltonians are not Hermitian, yet have real eigenvalues and seem to represent valid physical systems. One can argue that requiring an operator to be PT-symmetric is more physically motivated than requiring Hermiticity. A later article proved that this approach is equivalent to the standard one, in which operators are Hermitian.

If you are interested, you can read http://arxiv.org/abs/quant-ph/0501052
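To make the idea concrete, here is a small numerical sketch with a $2\times 2$ toy Hamiltonian of the kind often used to illustrate PT symmetry. The parameter values, the helper names `pt_hamiltonian` and `is_pt_symmetric`, and the choice of $P$ as the swap matrix with $T$ as complex conjugation are assumptions made for illustration only.

```python
import numpy as np

# Parity is taken to be the swap matrix; time reversal is complex conjugation.
P = np.array([[0, 1],
              [1, 0]])

def pt_hamiltonian(r, s, theta):
    """Toy 2x2 matrix that is PT-symmetric but not Hermitian for sin(theta) != 0."""
    return np.array([[r * np.exp(1j * theta), s],
                     [s, r * np.exp(-1j * theta)]])

def is_pt_symmetric(H):
    """Check (PT) H (PT)^{-1} = P H* P = H."""
    return np.allclose(P @ H.conj() @ P, H)

# Eigenvalues are r*cos(theta) +/- sqrt(s**2 - r**2 * sin(theta)**2).
H_real = pt_hamiltonian(r=1.0, s=1.0, theta=0.5)     # s**2 > r**2 sin^2(theta)
H_complex = pt_hamiltonian(r=1.0, s=0.2, theta=0.5)  # s**2 < r**2 sin^2(theta)

ev_real = np.linalg.eigvals(H_real)
ev_complex = np.linalg.eigvals(H_complex)

print(is_pt_symmetric(H_real), np.allclose(H_real, H_real.conj().T))
print(ev_real)
print(ev_complex)
```

In the first parameter regime the spectrum is real even though the matrix is not Hermitian. In the second regime the eigenvalues form a complex-conjugate pair, so PT symmetry of the matrix alone does not guarantee a real spectrum.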


To give an answer that is a little more general than what you're asking, I can think of three reasons for having hermitian operators in quantum theory:

1) Quantum theory relies on unitary transforms for symmetries, basis changes, and time evolution. Unitary transforms are generated by hermitian operators, as in $U=\exp(iH)$, and unitary Lie group representations come with a Lie algebra of hermitian operators.

2) Outcomes of measurements are taken from a set of orthogonal states with real measurement values. This structure is efficiently represented by a hermitian operator, whose eigenstructure matches these requirements precisely.

3) State representations of subsystems and ensembles lead to hermitian operators. For ensembles this can be seen from the construction as a convex sum of projectors, which are necessarily hermitian. For subsystem states it comes out of tracing a projector over tensor factor spaces. This is related to point 2) because processes like decoherence connect measurement outcomes with density operators.
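Points 1) and 3) can be illustrated numerically in finite dimensions. The sketch below assumes NumPy. The random matrices, the ensemble weights, and the helper names `random_hermitian` and `random_state` are hypothetical choices for illustration, not part of the argument above.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_hermitian(n):
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (A + A.conj().T) / 2

def random_state(n):
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    return v / np.linalg.norm(v)

# Point 1: U = exp(iH) is unitary when H is hermitian.
# The exponential is built from the spectral decomposition H = V diag(e) V^dagger.
H = random_hermitian(4)
evals, V = np.linalg.eigh(H)
U = V @ np.diag(np.exp(1j * evals)) @ V.conj().T
u_is_unitary = np.allclose(U.conj().T @ U, np.eye(4))

# Point 3, ensembles: a convex sum of projectors |psi><psi| is hermitian.
weights = np.array([0.5, 0.3, 0.2])
states = [random_state(4) for _ in weights]
rho_ensemble = sum(p * np.outer(psi, psi.conj()) for p, psi in zip(weights, states))

# Point 3, subsystems: trace the projector of a two-qubit pure state over qubit B.
psi_ab = random_state(4).reshape(2, 2)   # indices (a, b)
rho_a = psi_ab @ psi_ab.conj().T         # rho_a[a, a'] = sum_b psi[a, b] * conj(psi[a', b])

print(u_is_unitary)
print(np.allclose(rho_ensemble, rho_ensemble.conj().T))
print(np.allclose(rho_a, rho_a.conj().T))
```

Both density matrices come out hermitian with unit trace, as the constructions in point 3) require.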