Why are the solutions of polynomial equations so unconstrained over the quaternions?

When I was first learning abstract algebra, the professor gave the usual sequence of results for polynomials over a field: the Division Algorithm, the Remainder Theorem, and the Factor Theorem, followed by the Corollary that if $D$ is an integral domain, and $E$ is any integral domain that contains $D$, then a polynomial of degree $n$ with coefficients in $D$ has at most $n$ distinct roots in $E$.

He then challenged us, as homework, to go over the proof of the Factor Theorem and to point out exactly which of the axioms of a field are used in the proof, and where and how.

Every single one of us missed the fact that commutativity is used.

Here's the issue: the division algorithm (on either side) does hold in $\mathbb{H}[x]$ (in fact, over any ring, commutative or not, in which the leading coefficient of the divisor is a unit). So given a polynomial $p(x)$ with coefficients in $\mathbb{H}$, and a nonzero $a(x)\in\mathbb{H}[x]$, there exist unique $q(x)$ and $r(x)$ in $\mathbb{H}[x]$ such that $p(x) = q(x)a(x) + r(x)$, and $r(x)=0$ or $\deg(r)\lt\deg(a)$. (There also exist unique $q'(x)$ and $s(x)$ such that $p(x) = a(x)q'(x) + s(x)$ and $s(x)=0$ or $\deg(s)\lt\deg(a)$.)
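To make this concrete, here is a minimal sketch of right division in $\mathbb{H}[x]$. The conventions are mine, not any library's: quaternions are `(w, x, y, z)` tuples, a polynomial is the list of its coefficients (constant term first), and `qmul`, `qinv`, `poly_rdiv` are ad-hoc helper names.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

def qsub(p, q):
    return tuple(x - y for x, y in zip(p, q))

def qinv(p):
    """Inverse of a nonzero quaternion: conjugate over squared norm."""
    n = sum(c*c for c in p)
    return (p[0]/n, -p[1]/n, -p[2]/n, -p[3]/n)

ZERO, ONE = (0, 0, 0, 0), (1, 0, 0, 0)

def poly_rdiv(p, a):
    """Right division in H[x]: return (q, r) with p = q*a + r and
    deg r < deg a.  Only invertibility of the leading coefficient of a
    is used, exactly as the text says."""
    r = list(p)
    q = [ZERO] * max(len(p) - len(a) + 1, 0)
    ainv = qinv(a[-1])
    for m in range(len(q) - 1, -1, -1):
        c = qmul(r[m + len(a) - 1], ainv)   # chosen so c * lead(a) = top coeff
        q[m] = c
        for n, an in enumerate(a):          # subtract (c x^m) * a(x) from r
            r[m + n] = qsub(r[m + n], qmul(c, an))
    return q, r[:len(a) - 1]

i = (0, 1, 0, 0)
p = [ONE, ZERO, ONE]        # x^2 + 1, constant term first
a = [(0, -1, 0, 0), ONE]    # x - i
q, r = poly_rdiv(p, a)
print(q)   # [(0, 1, 0, 0), (1, 0, 0, 0)]: the quotient x + i
print(r)   # [(0, 0, 0, 0)]: remainder 0 = p(i), matching the argument below
```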

The usual argument runs as follows: given $a\in\mathbb{H}$ and $p(x)$, divide $p(x)$ by $x-a$ to get $p(x) = q(x)(x-a) + r$, with $r$ constant. Evaluating at $a$ we get $p(a) = q(a)(a-a)+r = r$, so $r=p(a)$. Hence $a$ is a root if and only if $(x-a)$ divides $p(x)$.

If $b$ is a root of $p(x)$, $b\neq a$, then evaluating at $b$ we have $0=p(b) = q(b)(b-a)$; since $b-a\neq 0$, then $q(b)=0$, so $b$ must be a root of $q$; since $\deg(q)=\deg(p)-1$, an inductive hypothesis tells us that $q(x)$ has at most $\deg(p)-1$ distinct roots, so $p$ has at most $\deg(p)$ roots.

And that is where we are using commutativity: to go from $p(x) = q(x)(x-a)$ to $p(b) = q(b)(b-a)$.

Let $R$ be a ring, and let $a\in R$. Then $a$ induces a set-theoretic map from $R[x]$ to $R$, "evaluation at $a$", $\varepsilon_a\colon R[x]\to R$, given by $$\varepsilon_a(b_0+b_1x+\cdots + b_nx^n) = b_0 + b_1a + \cdots + b_na^n.$$ This map is a group homomorphism, and if $a$ is central it is also a ring homomorphism; if $a$ is not central, then it is not a ring homomorphism: given $b\in R$ such that $ab\neq ba$, we have $bx = xb$ in $R[x]$, but $\varepsilon_a(x)\varepsilon_a(b) = ab\neq ba = \varepsilon_a(xb)$.
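A quick numeric check of this counterexample with $a=i$ and $b=j$, reusing the ad-hoc `(w, x, y, z)` tuple convention and `qmul` helper from the sketch above (repeated here so the snippet runs on its own):

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

i, j = (0, 1, 0, 0), (0, 0, 1, 0)

# In H[x] the polynomial xb equals bx, so eps_i(xb) = ji, while
# eps_i(x) * eps_i(b) = ij; the two products disagree.
print(qmul(j, i))   # ji = (0, 0, 0, -1), i.e. -k
print(qmul(i, j))   # ij = (0, 0, 0,  1), i.e.  k
```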

The "evaluation" map also induces a set theoretic map from $R[x]$ to $R^R$, the ring of all $R$-valued functions in $R$, with the pointwise addition and multiplication ($(f+g)(a) = f(a)+g(a)$, $(fg)(a) = f(a)g(a)$); the map sends $p(x)$ to the function $\mathfrak{p}\colon R\to R$ given by $\mathfrak{p}(a) = \varepsilon_a(p(x))$. This map is a group homomorphism, but it is not a ring homomorphism unless $R$ is commutative.

This means that from $p(x) = q(x)(x-a) + r(x)$ we cannot in general conclude that $p(c) = q(c)(c-a) +r(c)$ unless $c$ commutes in $R$ with $a$. So the Remainder Theorem may fail to hold (if the coefficients involved do not commute with $a$ in $R$), which in turn means that the Factor Theorem may fail to hold, so one has to be careful with the statements (see Marc van Leeuwen's answer). And even when both of them hold for the particular $a$ in question, the inductive argument will fail if $b$ does not commute with $a$, because we cannot go from $p(x) = q(x)(x-a)$ to $p(b)=q(b)(b-a)$.

This is exactly what happens with, say, $p(x) = x^2+1$ in $\mathbb{H}[x]$. We are fine as far as showing that, say, $x-i$ is a factor of $p(x)$, because it so happens that when we divide by $x-i$, all coefficients involved centralize $i$ (we just get $(x+i)(x-i)$). But when we try to argue that any root different from $i$ must be a root of $x+i$, we run into the problem that we cannot guarantee that $b^2+1$ equals $(b+i)(b-i)$ unless we know that $b$ centralizes $i$. As it happens, the centralizer of $i$ in $\mathbb{H}$ is $\mathbb{R}[i]$, so we can only conclude that the only other complex root is $-i$. But this leaves open the possibility that there may be roots of $x^2+1$ that do not centralize $i$, and that is exactly what occurs: $j$, $k$, and all numbers of the form $ai+bj+ck$ with $a^2+b^2+c^2=1$ are roots; and if either $b$ or $c$ is nonzero, such a root does not centralize $i$, so we cannot go from $x^2+1 = (x+i)(x-i)$ to "$(ai+bj+ck)^2+1 = (ai+bj+ck+i)(ai+bj+ck-i)$".
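Under the same ad-hoc conventions as before, a short check of these claims: $j$ is a root of $x^2+1$, yet $(j+i)(j-i)$ is not $j^2+1$, and a randomly chosen $ai+bj+ck$ of norm $1$ is again a root.

```python
import math, random

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

one, i, j = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)
neg = lambda q: tuple(-c for c in q)

# j is a root of x^2 + 1 ...
print(qadd(qmul(j, j), one))              # (0, 0, 0, 0)

# ... yet j^2 + 1 is NOT (j + i)(j - i): that product is 2k, not 0.
print(qmul(qadd(j, i), qadd(j, neg(i))))  # (0, 0, 0, 2)

# And a random ai + bj + ck with a^2 + b^2 + c^2 = 1 is also a root.
a, b, c = (random.gauss(0, 1) for _ in range(3))
n = math.sqrt(a*a + b*b + c*c)
u = (0, a/n, b/n, c/n)
print(qadd(qmul(u, u), one))              # ~(0, 0, 0, 0) up to rounding
```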

And that is what goes wrong, and that is where commutativity is hiding.


The finiteness of the number of roots of a polynomial $f(x)\in K[x]$ where $K$ is a field depends on two interlaced facts:

  • $K[x]$ is a Unique Factorization Domain: every polynomial $f(x)$ factors in an essentially unique way as a product of irreducibles;

  • if $f(\alpha)=0$ then $f(x)=(x-\alpha)g(x)$ where $\deg g(x)=(\deg f(x))-1$.

The combination of these two facts (the first one in particular) no longer holds if you think of the polynomial $f(x)$ as a polynomial with coefficients in the ring $\Bbb H$ of Hamilton quaternions. This is because the latter is not commutative.
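The failure of unique factorization can be seen directly on the thread's example: $x^2+1$ factors both as $(x-i)(x+i)$ and as $(x-j)(x+j)$. Here is a small sketch that multiplies both products back out; the conventions (quaternions as `(w, x, y, z)` tuples, polynomials as coefficient lists, helper names) are ad hoc, not a library API.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

ZERO = (0, 0, 0, 0)

def pmul(f, g):
    """Product in H[x]; f, g are coefficient lists, f[m] the coefficient of x^m."""
    h = [ZERO] * (len(f) + len(g) - 1)
    for m, fm in enumerate(f):
        for n, gn in enumerate(g):
            h[m + n] = qadd(h[m + n], qmul(fm, gn))
    return h

one, i, j = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)
neg = lambda q: tuple(-c for c in q)

# Both lines print [(1,0,0,0), (0,0,0,0), (1,0,0,0)], i.e. x^2 + 1:
# two genuinely different factorizations into irreducible linear factors.
print(pmul([neg(i), one], [i, one]))   # (x - i)(x + i)
print(pmul([neg(j), one], [j, one]))   # (x - j)(x + j)
```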

You may also ponder this fact: in a commutative environment the transformation $a\mapsto\phi_h(a)=hah^{-1}$ (conjugation) is always trivial. Not so in $\Bbb H$, again as a side effect of non-commutativity. The point is that if an element $a$ satisfies a certain algebraic relation with real coefficients (such as $a^2=-1$), then so do all its conjugates $\phi_h(a)$.
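A quick sanity check of this point, under an ad-hoc `(w, x, y, z)` tuple encoding of quaternions: conjugate $i$, which satisfies $a^2=-1$, by a random $h$ (nonzero with probability one), and verify that the conjugate is again a square root of $-1$.

```python
import random

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

def qinv(p):
    """Inverse of a nonzero quaternion: conjugate over squared norm."""
    n = sum(c*c for c in p)
    return (p[0]/n, -p[1]/n, -p[2]/n, -p[3]/n)

i = (0, 1, 0, 0)
h = tuple(random.gauss(0, 1) for _ in range(4))  # random, almost surely nonzero

a = qmul(qmul(h, i), qinv(h))  # the conjugate phi_h(i) = h i h^{-1}
print(qmul(a, a))              # ~(-1, 0, 0, 0): a^2 = -1, just like i^2
```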


I would like to emphasize a point which is made in Arturo Magidin's answer but perhaps in different words: if $D$ is a noncommutative division ring, then the ring $D[x]$ of polynomials over $D$ does not do what you want it to do.

If $F$ is a field, then one reason you might care about working with polynomials $F[x]$ is that they describe all the expressions you could potentially get from some unknown $x \in F$ (or perhaps $x \in \bar{F}$ or perhaps something even more general than this) via addition and multiplication.

Why does this break down when you replace $F$ with a noncommutative division ring $D$? The problem is that if you work with some unknown $x \in D$ (or in some ring containing $D$) then $x$, by assumption, doesn't necessarily commute with every element in $D$, so starting from $x$ and adding and multiplying you get not only expressions like $$a_0 + a_1 x + a_2 x^2 + ...$$

but more complicated expressions like $$a_0 + a_{1,0} x + x a_{1,1} + a_{1, 2} x a_{1, 3} + a_{2,0} x^2 + x a_{2,1} x + x^2 a_{2,2} + a_{2, 3} x^2 a_{2,4} + a_{2, 5} x a_{2, 6} x a_{2,7} + ... $$

The resulting algebraic structure is quite a bit more complicated than $D[x]$. Already you can't in general combine expressions of the form $axb$ and $cxd$ into a single such expression, so even to describe the expressions you can get by using $x$ once I should've actually written $$a_0 + a_{1,0} x a_{1,1} + a_{1,2} x a_{1,3} + a_{1,4} x a_{1,5} + ...$$
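To see concretely that such expressions do not collapse, consider the map $x \mapsto ixj + jxi$. It sends $1$ to $ij+ji = 0$, so if it were of the form $x \mapsto exf$ we would need $ef = 0$, hence $e = 0$ or $f = 0$ (as $\mathbb{H}$ is a division ring), making the map identically zero; yet it sends $i$ to $-2j$. A check under an ad-hoc `(w, x, y, z)` tuple encoding of quaternions:

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

one, i, j = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)

def F(x):
    """F(x) = i x j + j x i: a sum of two 'degree-one' expressions."""
    return qadd(qmul(qmul(i, x), j), qmul(qmul(j, x), i))

print(F(one))   # (0, 0, 0, 0):  F(1) = ij + ji = k - k = 0
print(F(i))     # (0, 0, -2, 0): F(i) = -2j != 0, so F is not x -> e x f
```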