Geometrically explained: why do linear transformations take a circle to an ellipse?

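The equation of a circle is $x^2 + y^2 = r^2$, or in terms of vectors $(x,y) \pmatrix{x\cr y} = r^2$. An invertible linear transformation $T$ takes $\pmatrix{x\cr y}$ to $\pmatrix{X\cr Y} = T\pmatrix{x\cr y}$. Thus $\pmatrix{x\cr y} = T^{-1} \pmatrix{X\cr Y}$, and $(x,y) = (X, Y) (T^{-1})^\top$. The equation becomes $$(X, Y) (T^{-1})^\top T^{-1} \pmatrix{X\cr Y} = r^2 $$ Note that $(T^{-1})^\top T^{-1}$ is a real symmetric matrix, so it can be orthogonally diagonalized. Its eigenvalues $\lambda_1, \lambda_2$ are positive, because the quadratic form equals $\left\| T^{-1} \pmatrix{X\cr Y} \right\|^2$, which is positive for every nonzero vector when $T^{-1}$ is invertible. In coordinates $(u, v)$ along the orthonormal eigenvectors, the equation reads $\lambda_1 u^2 + \lambda_2 v^2 = r^2$, which is an ellipse with semi-axes $r/\sqrt{\lambda_1}$ and $r/\sqrt{\lambda_2}$.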
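The derivation above can be checked numerically. The following is a minimal sketch using only the Python standard library; the matrix `T`, the radius `r`, and the helper names are hypothetical choices made for illustration, not part of the original argument. It maps sample points of the circle through `T` and confirms that they satisfy the quadratic equation with matrix $(T^{-1})^\top T^{-1}$, whose eigenvalues come out positive.

```python
import math

# Hypothetical example values: any invertible 2x2 matrix T and radius r will do.
T = [[2.0, 1.0],
     [0.5, 1.5]]
r = 1.0

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(m, v):
    """Matrix-vector product for a 2x2 matrix and a 2-vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def quadratic_form(m, v):
    """Evaluate v^T m v."""
    w = apply(m, v)
    return v[0] * w[0] + v[1] * w[1]

T_inv = inverse_2x2(T)
A = matmul(transpose(T_inv), T_inv)  # the symmetric matrix (T^{-1})^T T^{-1}

# Map sample points of the circle through T and evaluate the quadratic form.
residuals = []
for k in range(12):
    t = 2 * math.pi * k / 12
    image = apply(T, (r * math.cos(t), r * math.sin(t)))
    residuals.append(abs(quadratic_form(A, image) - r ** 2))

# Eigenvalues of the symmetric 2x2 matrix A from its trace and determinant.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
disc = math.sqrt(trace ** 2 - 4 * det)
eigenvalues = ((trace - disc) / 2, (trace + disc) / 2)

print(max(residuals))  # close to zero, up to rounding
print(eigenvalues)     # both positive for an invertible T
```

Replacing `T` with a singular matrix makes `inverse_2x2` raise an error, which matches the invertibility assumption in the argument.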


Every real square matrix has a polar decomposition into the product of an orthogonal matrix $U$ and a positive-semidefinite (symmetric) matrix $P$. If the original matrix is nonsingular, then $P$ is positive-definite. In 2-D, orthogonal matrices represent either rotations or reflections, which are both isometries, so they don’t affect the shape of the transformed circle. As you’ve mentioned, $P$ can be orthogonally diagonalized, so it represents a stretch in some set of perpendicular directions.
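For a concrete $2\times 2$ case, the polar decomposition can be computed by hand. The sketch below is only an illustration under stated assumptions: the example matrix `T` and the helper names are hypothetical, and it uses the closed form $\sqrt{M} = (M + sI)/t$ with $s = \sqrt{\det M}$ and $t = \sqrt{\operatorname{tr} M + 2s}$, which holds for symmetric positive-definite $2\times 2$ matrices by the Cayley-Hamilton theorem. It takes $P = \sqrt{T^\top T}$ and $U = TP^{-1}$, then checks that $U$ is orthogonal and $UP$ reproduces $T$.

```python
import math

# Hypothetical example matrix; any invertible real 2x2 matrix should work.
T = [[2.0, 1.0],
     [0.5, 1.5]]

def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def sqrt_spd_2x2(m):
    """Square root of a symmetric positive-definite 2x2 matrix.

    Closed form: sqrt(M) = (M + s*I) / t, with s = sqrt(det M)
    and t = sqrt(trace M + 2*s).
    """
    s = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    t = math.sqrt(m[0][0] + m[1][1] + 2 * s)
    return [[(m[0][0] + s) / t, m[0][1] / t],
            [m[1][0] / t, (m[1][1] + s) / t]]

P = sqrt_spd_2x2(matmul(transpose(T), T))  # stretch factor, P = sqrt(T^T T)
U = matmul(T, inverse_2x2(P))              # isometry factor, so that T = U P

UtU = matmul(transpose(U), U)  # should be the identity if U is orthogonal
UP = matmul(U, P)              # should reproduce T

print(UtU)
print(UP)
```

Here the determinant of `U` is positive, so `U` is a rotation; a matrix `T` with negative determinant would give a reflection instead.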

The existence of this decomposition is equivalent to the existence of the SVD, but can be shown without relying on the latter. In a similar vein, the SVD decomposes the matrix into the product of a rotation or reflection, a scaling, and another rotation or reflection.
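One consequence of the rotation, scaling, rotation picture is that the semi-axes of the image of the unit circle are the singular values of the matrix. The sketch below checks this numerically for a hypothetical example matrix `T`; it computes the singular values as square roots of the eigenvalues of $T^\top T$ and compares them with the longest and shortest image vectors found by sampling the circle. The sample count `N` is an arbitrary choice, so the agreement is only approximate.

```python
import math

# Hypothetical example matrix; any invertible real 2x2 matrix should work.
T = [[2.0, 1.0],
     [0.5, 1.5]]

# Entries of the symmetric matrix T^T T.
g00 = T[0][0] ** 2 + T[1][0] ** 2
g01 = T[0][0] * T[0][1] + T[1][0] * T[1][1]
g11 = T[0][1] ** 2 + T[1][1] ** 2

# Eigenvalues of T^T T from its trace and determinant.
trace = g00 + g11
det = g00 * g11 - g01 ** 2
disc = math.sqrt(trace ** 2 - 4 * det)
sigma_max = math.sqrt((trace + disc) / 2)  # larger singular value
sigma_min = math.sqrt((trace - disc) / 2)  # smaller singular value

# Lengths of the images of N evenly spaced points on the unit circle.
N = 3600
lengths = []
for k in range(N):
    t = 2 * math.pi * k / N
    x, y = math.cos(t), math.sin(t)
    lengths.append(math.hypot(T[0][0] * x + T[0][1] * y,
                              T[1][0] * x + T[1][1] * y))

print(sigma_max, max(lengths))  # approximately equal
print(sigma_min, min(lengths))  # approximately equal
```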

You might also have a look at the Steiner generation of an ellipse. This uses intersecting line segments drawn between points on the sides of a parallelogram to generate ellipses, including circles. Affine transformations preserve incidence relationships (the image of the intersection of a pair of lines is the intersection of the lines’ images) and map parallelograms to parallelograms, so the image of an ellipse under an affine transformation is another ellipse.