What do the elements in a homography matrix mean?

First of all, affine transformations are those that preserve straight lines (and parallelism), and they can be of arbitrary dimensionality.

A homography describes the mapping between two planes, or what happens to the image during pure camera rotation.

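In an affine matrix the last row is fixed at 0 0 1. Shear (that is, when the new x is a function of both x and y) comes from the off-diagonal entries of the upper-left 2x2 block. In a general homography the last row instead holds the projective (perspective) terms.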


It's not that difficult to understand if you have a grasp of matrix multiplication. Assume your point x is

/a\
\b/,

and you want to "rotate" it (more precisely, apply a general linear map, which may also scale and shear) by A:

/3 4\
\5 6/

and "move" it by t

/2\
\2/.

The latter matrices are the components of the affine transformation to get the new point y:

y = A*x + t = <a'; b'>T //(T means transposed).

As you know, to get that, one can construct a 3x3 matrix B and a 3-element vector x' looking like

    /3 4 2\         /a\
B = |5 6 2| ,  x' = |b|
    \0 0 1/         \1/

such that

     /a'\
y' = |b'| = B*x'
     \ 1/ 

from which you can extract y. Let's see how that works. In the original transformation (using addition), the first step would be to carry out the multiplication, i.e. the rotating part y_r:

y_r = A*x = <3a+4b; 5a+6b>T

then you add the "absolute" part:

y = y_r + t = <3a+4b+2; 5a+6b+2>T

Now look at how B works. I'll calculate y' row by row:

1) a' = 3*a + 4*b + 2*1

2) b' = 5*a + 6*b + 2*1

3) the rest: 0*a + 0*b + 1*1 = 1

Just what we expected. First, the rotation part gets calculated (multiplication and addition). Then the x-component of the translation gets added, multiplied by 1, so it stays the same. The same goes for the second row.

In the third row, a and b are dropped (multiplied by 0). The last entry is multiplied by 1, so it stays 1. So all the last row does is "drop" the values of the point and keep the 1.
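The row-by-row calculation above can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`mat_vec`, `affine_direct`, `affine_homogeneous`) and the sample point a=1, b=2 are my own choices for illustration, not part of the original answer.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

A = [[3, 4],
     [5, 6]]          # the linear ("rotating") part
t = [2, 2]            # the translation
B = [[3, 4, 2],
     [5, 6, 2],
     [0, 0, 1]]       # A and t packed into one 3x3 matrix

def affine_direct(x):
    """y = A*x + t, computed in two steps."""
    y_r = mat_vec(A, x)
    return [y_r[0] + t[0], y_r[1] + t[1]]

def affine_homogeneous(x):
    """y' = B*x' with x' = <a; b; 1>; returns (y, last component)."""
    y_h = mat_vec(B, x + [1])
    return y_h[:2], y_h[2]

x = [1, 2]                        # sample point: a=1, b=2
print(affine_direct(x))           # [13, 19]
print(affine_homogeneous(x))      # ([13, 19], 1)
```

Both routes give the same point, and the third component of the homogeneous result is the 1 that the last row preserves.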


It could be argued, then, that a 2x3 matrix would be enough for that. That's partially true, but has one significant disadvantage: you lose composability. Suppose you are basically satisfied with B, but want to mirror one coordinate. Then you can choose another transformation matrix

    /-1 0 0\
C = | 0 1 0|
    \ 0 0 1/

and have a result

y'' = C*B*x' = <-3a-4b-2; 5a+6b+2; 1>T

This simple multiplication could not be done that easily with 2x3 matrices: the product of two 2x3 matrices is not even defined, because the inner dimensions do not match.
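To make the composition concrete, here is a small sketch in plain Python that multiplies C and B once and then applies the combined matrix to a point. The helpers `mat_mul` and `mat_vec` and the sample point a=1, b=2 are illustrative assumptions of mine.

```python
def mat_mul(p, q):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*q))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in p]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

B = [[3, 4, 2],
     [5, 6, 2],
     [0, 0, 1]]
C = [[-1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]       # mirrors the first coordinate

CB = mat_mul(C, B)    # one matrix that does "B, then C"
print(CB)             # [[-3, -4, -2], [5, 6, 2], [0, 0, 1]]

x_h = [1, 2, 1]       # sample point a=1, b=2 in homogeneous form
step_by_step = mat_vec(C, mat_vec(B, x_h))
combined = mat_vec(CB, x_h)
print(step_by_step, combined)   # [-13, 19, 1] [-13, 19, 1]
```

Because the product is again a 3x3 matrix with last row 0 0 1, any number of such transformations can be collapsed into a single matrix up front.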

In principle, in the above, the last row (the 0 0 1) could also be anything else of the form <0;0;x>, since for a single transformation it is only there to drop the point values. It is, however, necessary to have it exactly like this to make composition by multiplication work.
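A quick way to see why the last row has to be exactly 0 0 1 is to compose a matrix with itself. The sketch below assumes that y is read off as the first two components without any division, which is how the answer extracts it. The matrix `B2` with last row 0 0 2 is a hypothetical variant introduced only for this comparison.

```python
def mat_mul(p, q):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*q))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in p]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def apply(m, x):
    """Apply a 3x3 matrix to a 2D point, keeping only the first two components."""
    return mat_vec(m, x + [1])[:2]

B = [[3, 4, 2], [5, 6, 2], [0, 0, 1]]
B2 = [[3, 4, 2], [5, 6, 2], [0, 0, 2]]   # hypothetical: last row <0;0;2>

x = [1, 2]

# For a single transformation the last row makes no difference to y.
print(apply(B, x), apply(B2, x))          # [13, 19] [13, 19]

# Applying the transformation twice, one step at a time.
twice = apply(B, apply(B, x))
print(twice)                              # [117, 181]

# Composing by multiplication only agrees when the last row is 0 0 1.
print(apply(mat_mul(B, B), x))            # [117, 181]
print(apply(mat_mul(B2, B2), x))          # [119, 183]
```

With the last row 0 0 2, the translation column of the product picks up an extra factor, so the pre-multiplied matrix no longer matches the step-by-step result.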

Finally, Wikipedia seems quite informative to me in this case.