Derivative of quaternions

Quaternion derivatives are not that straightforward. Usually a direction is required (see here).

I'm assuming your thought process behind $H_2$ was something like

$$ h_2 = q_1 \otimes q \otimes q_2 = q_1 \otimes (q \otimes q_2) = [q_1]_L\ (q \otimes q_2) = [q_1]_L\ [q_2]_R\ q$$ $$ H_2 = \frac{\partial h_2}{\partial q} = [q_1]_L\ [q_2]_R\ $$

where I assigned the L, R indices as in your reference. That is the opposite of the indices you wrote. This also has other issues, which I will get to in a bit.
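As a quick sanity check of the index assignment, here is a minimal plain-Python sketch. The helper names (`qmul`, `left`, `right`, `matmul`, `matvec`) and the integer test quaternions are my own choices, not from your reference; the matrices follow the $[\cdot]_L$, $[\cdot]_R$ definitions quoted further down.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (a, x, y, z)."""
    a1, x1, y1, z1 = p
    a2, x2, y2, z2 = q
    return (
        a1 * a2 - x1 * x2 - y1 * y2 - z1 * z2,
        a1 * x2 + x1 * a2 + y1 * z2 - z1 * y2,
        a1 * y2 - x1 * z2 + y1 * a2 + z1 * x2,
        a1 * z2 + x1 * y2 - y1 * x2 + z1 * a2,
    )

def left(q):
    """[q]_L, so that [q]_L v is q (x) v."""
    a, x, y, z = q
    return [[a, -x, -y, -z], [x, a, -z, y], [y, z, a, -x], [z, -y, x, a]]

def right(q):
    """[q]_R, so that [q]_R v is v (x) q."""
    a, x, y, z = q
    return [[a, -x, -y, -z], [x, a, z, -y], [y, -z, a, x], [z, y, -x, a]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def matvec(A, v):
    return tuple(sum(A[i][k] * v[k] for k in range(4)) for i in range(4))

# Arbitrary integer quaternions, so every comparison below is exact.
q1, q, q2 = (1, 2, 3, 4), (5, -6, 7, -8), (-1, 0, 2, 3)

h_direct = qmul(qmul(q1, q), q2)                       # q1 (x) q (x) q2
h_matrix = matvec(matmul(left(q1), right(q2)), q)      # [q1]_L [q2]_R q
h_swapped = matvec(matmul(right(q1), left(q2)), q)     # [q1]_R [q2]_L q

assert h_direct == h_matrix
# With the indices swapped, the matrix computes q2 (x) q (x) q1 instead.
assert h_swapped == qmul(qmul(q2, q), q1)
```

The second assertion shows what the swapped indices actually compute, which is a different function of $q$ unless the quaternions happen to commute.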

For $H_1$, you have the direction cosine matrix as a function of $q$ but you don't appear to have taken the derivative of $C(q)$ at all. So I'm not following the logic there.

Looking at the definition of the direction cosine matrix in terms of the components of $q$, I feel this is venturing into the territory of abusing quaternions as a container of 4 variables.

There are a lot of things going on here, so let me try to untangle them a bit and comment on them separately.

Representation of quaternions

If you wish to use linear algebra, there is already a real valued matrix representation of quaternions. I would suggest just using that representation if you want to clarify the content of some equations.

Instead, here quaternions are treated as column vectors (eq 7 of your reference), which then leads you to use two additional representations of quaternions to fit multiplication into a linear algebra setting. This obscures the mathematical content.

Using the definitions in your reference:

$$ [a + x\ i + y\ j + z\ k]_L = \begin{bmatrix} a & -x & -y & -z \\ x & a & -z & y \\ y & z & a & -x \\ z & -y & x & a \\ \end{bmatrix}$$

$$ [a + x\ i + y\ j + z\ k]_R = \begin{bmatrix} a & -x & -y & -z \\ x & a & z & -y \\ y & -z & a & x \\ z & y & -x & a \\ \end{bmatrix}$$

That then means:

$$[zk]_L\ [yj]_R = \begin{bmatrix} 0 & 0 & 0 & -z \\ 0 & 0 & -z & 0 \\ 0 & z & 0 & 0 \\ z & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 & -y & 0 \\ 0 & 0 & 0 & -y \\ y & 0 & 0 & 0 \\ 0 & y & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & -yz & 0 & 0 \\ -yz & 0 & 0 & 0 \\ 0 & 0 & 0 & -yz \\ 0 & 0 & -yz & 0 \end{bmatrix} $$

This doesn't fit the form of either of the previous representations of a quaternion.

Therefore, saying $H_2 = [q_1]_R\ [q_2]_L$ amounts to saying that the notion of derivative used here, applied to a quaternion function, can produce an object outside the set of functions of quaternions. Is this what you intended? How do you intend to interpret this object?
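The same point can be checked mechanically. The sketch below is plain Python, with helper names and the sample values $y=3$, $z=2$ chosen by me. It rebuilds the product above and tests whether it matches either representation. It relies on the fact that the first column of both $[q]_L$ and $[q]_R$ is $(a, x, y, z)^\top$, so the only possible candidate quaternion can be read off from that column.

```python
def left(q):
    """[q]_L for q = (a, x, y, z)."""
    a, x, y, z = q
    return [[a, -x, -y, -z], [x, a, -z, y], [y, z, a, -x], [z, -y, x, a]]

def right(q):
    """[q]_R for q = (a, x, y, z)."""
    a, x, y, z = q
    return [[a, -x, -y, -z], [x, a, z, -y], [y, -z, a, x], [z, y, -x, a]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

y, z = 3, 2  # sample values, chosen only for illustration
P = matmul(left((0, 0, 0, z)), right((0, 0, y, 0)))  # [zk]_L [yj]_R

# If P were [c]_L or [c]_R for some quaternion c, then c would have to be
# the first column of P.
candidate = tuple(row[0] for row in P)

is_left_form = (P == left(candidate))
is_right_form = (P == right(candidate))
print(is_left_form, is_right_form)
```

Both flags come out false for this example: the product is a symmetric matrix, while the off-diagonal part of $[q]_L$ and $[q]_R$ for a pure quaternion is antisymmetric.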

Representing multiple things with the same object

We can use a real value to represent temperature or distance, but these are distinct things so somehow these 'type labels' ('units') must be carried around to remind us of this.

Similarly, trying to represent both rotations and positions with quaternions can confuse things if one isn't careful to carry around these 'type labels'. For while it may make some notation closer to how you'd write the calculation in code, these objects (positions and rotations) are not on the same footing and their components transform differently if we change basis.

Even the underlying choice to represent a rotation with a quaternion (so that one can write $x'=q \otimes x \otimes q^*$ as done on pg 3 of your reference), doesn't actually relate to the underlying mathematical structures as cleanly as one might imagine. (For example, why do we need to use quaternion multiplication twice? Why is the angle off by a factor of 2? See this classic paper on rotations and quaternions)

Functions of quaternions

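Not every "left linear" equation in quaternions $$f(q) = a\ q + b$$ can be rewritten as a "right linear" equation $$f(q) = q\ c + d$$ For example, take $f(q) = i\ q$: setting $q = 0$ forces $d = 0$ and $q = 1$ forces $c = i$, but then $q = j$ would require $i\ j = j\ i$, which fails since $i\ j = k$ while $j\ i = -k$.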

If we consider a linear equation of all combinations of multiplication from each side: $$g(q) = a\ q\ b + c\ q + q \ d + e$$

Then how should one interpret the derivative of this function? Even for linear equations we are forced to be careful with our expectations.
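One possible reading, assuming quaternions are treated as column vectors as in your reference (so this is the "four real variables" view rather than an intrinsically quaternionic derivative), is to turn each product into a matrix acting on $q$:

$$ g(q) = \big([a]_L\ [b]_R + [c]_L + [d]_R\big)\ q + e, \qquad \frac{\partial g}{\partial q} = [a]_L\ [b]_R + [c]_L + [d]_R $$

The result is a real $4 \times 4$ matrix which, in general, is neither of the $[\cdot]_L$ form nor of the $[\cdot]_R$ form, so it cannot be read back as "multiplication by a quaternion" from either side.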

Potential "abuse" of quaternions

Now, it is possible that what you are trying to do has little to do with these complications. Somewhere along the way, ideas from quaternions get used for writing code in engineering, physics, and graphics applications, but the equations eventually carry around so much specialized 'type information' that it really begins to feel more like linear algebra notation used merely to succinctly represent what the code is doing. In such cases quaternions end up being abused as a way to carry around 4 parameters more than anything else (which is how I'd describe many of the things at quaternions.com). These linear algebra shorthands can then just be viewed as defining a calculation for a real-valued function of four real variables, and you can discuss its partial derivatives with respect to any of the variables without the issues above.

So at the end of the day, if you are just using quaternions to calculate some transformation (likely including rotations without being forced to choose a basis for Euler angles), and you'd like to know the partial derivative of some transformation with respect to some parameter, you can always just expand out the transformation. It may not have some nice "quaternion" looking form, but that is just because you were actually just manipulating linear equations and nothing fundamentally "quaternion" in the first place.
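To make "expand out the transformation" concrete, here is a sketch for the rotation of a position $p$, treated as a real 3-vector. I write $q = a + x\ i + y\ j + z\ k$ and use $v = (x, y, z)^\top$, $p_x$ (the first component of $p$) and $e_x$ (the first basis vector) as my own shorthand:

$$ q \otimes p \otimes q^* = (a^2 - v^\top v)\ p + 2\ (v^\top p)\ v + 2a\ (v \times p) $$

$$ \frac{\partial (q \otimes p \otimes q^*)}{\partial a} = 2\ (a\ p + v \times p), \qquad \frac{\partial (q \otimes p \otimes q^*)}{\partial x} = -2x\ p + 2\ p_x\ v + 2\ (v^\top p)\ e_x + 2a\ (e_x \times p) $$

The partial derivatives with respect to $y$ and $z$ follow the same pattern. These are ordinary derivatives of a polynomial in four real variables, with no restriction to unit norm.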


There are different ways to answer your question, but you probably want one of these two:

  1. You want the derivative with respect to the 4 components of the quaternion $q=w+ix+jy+kz$, that is, with respect to a 4-vector $v_q=(w,x,y,z)^\top\in R^4$. Writing $v=(x,y,z)^\top$ for the vector part, the derivative of the rotation with respect to $v_q$ is: $$ \frac{\partial (q\otimes p \otimes q^*)}{\partial v_q} = 2[wp+v\times p , v^\top p I + vp^\top-pv^\top-w[p]_\times] \in R^{3\times4} $$ where:

    • $I$ is the 3x3 identity matrix.
    • $[p]_\times$ is the skew-symmetric matrix formed from $p$.
    • $\times$ is the cross product.
    • $\otimes$ is the quaternion product. This sign can be omitted since it is just a product, so one can simply write $qpq^*$.

    So assuming $A$ is a constant matrix, and knowing that $C(q)p = qpq^*$, your derivative is $$ \frac{\partial (AC(q)p)}{\partial v_q} = 2A[wp+v\times p , v^\top p I + vp^\top-pv^\top-w[p]_\times] \in R^{3\times4} $$ This is the derivative that e.g. Ceres is going to compute should you use automatic differentiation of your function using #include "ceres/jet.h".

  2. You want the derivative with respect to the rotation itself, seen as a 3-vector of the Lie algebra of the rotation group. Lie theory defines two Jacobians for this, right and left, depending on whether you perturb the rotation on the right, $\tilde R=R\exp([\theta]_\times)$, or on the left, $\tilde R=\exp([\theta]_\times)R$.

    The right Jacobian of the rotation is: $$ \frac{\partial C(q)p}{\partial C} = -C(q)[p]_\times \in R^{3\times 3} $$ and so your full Jacobian is $$ \frac{\partial (AC(q)p)}{\partial C} = -AC(q)[p]_\times \in R^{3\times 3} $$

    The left Jacobian of the rotation is: $$ \frac{\partial C(q)p}{\partial C} = -[C(q)p]_\times \in R^{3\times 3} $$ and so your full Jacobian is $$ \frac{\partial (AC(q)p)}{\partial C} = -A[C(q)p]_\times \in R^{3\times 3} $$
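If you want to convince yourself of these formulas, a finite-difference check is easy to write. The following is a minimal plain-Python sketch, not a reference implementation: all function names, the test values and the step size `h` are my own assumptions. It omits $A$, since the full Jacobian is just $A$ times each result. Small rotations are built as unit quaternions, using the fact that $R\exp([\theta]_\times)$ corresponds to $q \otimes \delta q$ and $\exp([\theta]_\times)R$ to $\delta q \otimes q$.

```python
import math

BASIS = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def qmul(p, q):
    """Hamilton product, quaternions stored as (w, x, y, z)."""
    w1, v1, w2, v2 = p[0], p[1:], q[0], q[1:]
    c = cross(v1, v2)
    return (w1 * w2 - dot(v1, v2),) + tuple(
        w1 * v2[i] + w2 * v1[i] + c[i] for i in range(3))

def rotate(q, p):
    """q p q* expanded in real components (q is not normalised here)."""
    w, v = q[0], q[1:]
    c = cross(v, p)
    return tuple((w * w - dot(v, v)) * p[i] + 2 * dot(v, p) * v[i]
                 + 2 * w * c[i] for i in range(3))

def jac_vq(q, p):
    """Columns of 2[wp + v x p, (v.p) I + v p^T - p v^T - w [p]_x]."""
    w, v = q[0], q[1:]
    c = cross(v, p)
    cols = [tuple(2 * (w * p[i] + c[i]) for i in range(3))]
    for j, e in enumerate(BASIS):
        pe = cross(p, e)  # [p]_x e_j
        cols.append(tuple(2 * (dot(v, p) * e[i] + v[i] * p[j]
                               - p[i] * v[j] - w * pe[i]) for i in range(3)))
    return cols

def jac_right(q, p):
    """Columns of -C(q) [p]_x, assuming q has unit norm."""
    return [tuple(-r for r in rotate(q, cross(p, e))) for e in BASIS]

def jac_left(q, p):
    """Columns of -[C(q) p]_x, assuming q has unit norm."""
    return [tuple(-c for c in cross(rotate(q, p), e)) for e in BASIS]

def rot_quat(theta):
    """Unit quaternion of exp([theta]_x) for a nonzero rotation vector."""
    angle = math.sqrt(dot(theta, theta))
    s = math.sin(angle / 2) / angle
    return (math.cos(angle / 2), s * theta[0], s * theta[1], s * theta[2])

def central_diff(f, n, h=1e-6):
    """Jacobian columns of f at zero perturbation, by central differences."""
    cols = []
    for k in range(n):
        plus = [h if i == k else 0.0 for i in range(n)]
        minus = [-h if i == k else 0.0 for i in range(n)]
        cols.append(tuple((a - b) / (2 * h)
                          for a, b in zip(f(plus), f(minus))))
    return cols

def max_diff(A, B):
    return max(abs(a - b) for ca, cb in zip(A, B) for a, b in zip(ca, cb))

n = math.sqrt(30.0)
q = (1 / n, 2 / n, 3 / n, 4 / n)   # arbitrary unit quaternion
p = (0.3, -1.2, 2.0)               # arbitrary point

num_vq = central_diff(
    lambda d: rotate(tuple(qi + di for qi, di in zip(q, d)), p), 4)
num_right = central_diff(lambda t: rotate(qmul(q, rot_quat(t)), p), 3)
num_left = central_diff(lambda t: rotate(qmul(rot_quat(t), q), p), 3)

err_vq = max_diff(jac_vq(q, p), num_vq)
err_right = max_diff(jac_right(q, p), num_right)
err_left = max_diff(jac_left(q, p), num_left)
print(err_vq, err_right, err_left)
```

All three errors should be at the level of finite-difference noise. Note that the first check perturbs the four components independently, so it moves $q$ off the unit sphere, which is exactly what option 1 means. The other two checks stay on the rotation group.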


Eq (163) in your notes has the really amazing equation:

$$ \frac{\partial(\mathbf{q} \otimes \mathbf{a} \otimes \mathbf{q}^\ast)}{\partial \mathbf{q}} = \frac{\partial(\mathbf{R} \mathbf{a})}{\partial \mathbf{q}} = 2 \big[ w \mathbf{a} + \mathbf{v} \times \mathbf{a} \;\big|\; \mathbf{v}^\top \mathbf{a} \mathbf{I} + \mathbf{v}\mathbf{a}^\top - \mathbf{a}\mathbf{v}^\top - w[\mathbf{a}]_\times \big] $$

These engineering symbols for quaternions are quite fascinating. I have not digested all the symbols. In math we simply note that $x \mapsto q x q^{-1}$ is a rotation, where $q^{-1}$ coincides with the conjugate $q^\ast$ when $q$ has unit norm.