Why do roots of polynomials tend to have absolute value close to 1?

Let me give an informal explanation using what little I know about complex analysis.

Suppose that $p(z)=a_{0}+\cdots+a_{n}z^{n}$ is a polynomial with random complex coefficients, and write $p(z)=a_{n}(z-c_{1})\cdots(z-c_{n})$. Then note that

$$\frac{p'(z)}{p(z)}=\frac{d}{dz}\log(p(z))=\frac{d}{dz}\bigl[\log(a_{n})+\log(z-c_{1})+\cdots+\log(z-c_{n})\bigr]=\frac{1}{z-c_{1}}+\cdots+\frac{1}{z-c_{n}}.$$

Now let $\gamma$ be a circle centered at the origin whose radius is slightly larger than $1$. Then

$$\oint_{\gamma}\frac{p'(z)}{p(z)}\,dz=\oint_{\gamma}\frac{na_{n}z^{n-1}+(n-1)a_{n-1}z^{n-2}+\cdots+a_{1}}{a_{n}z^{n}+\cdots+a_{0}}\,dz\approx\oint_{\gamma}\frac{n}{z}\,dz=2\pi in.$$

However, by the residue theorem,

$$\oint_{\gamma}\frac{p'(z)}{p(z)}\,dz=\oint_{\gamma}\left(\frac{1}{z-c_{1}}+\cdots+\frac{1}{z-c_{n}}\right)dz=2\pi i\,|\{k\in\{1,\ldots,n\}\mid c_{k}\,\,\textrm{lies inside the contour}\,\,\gamma\}|.$$

Combining these two evaluations of the integral, we conclude that $$2\pi i n\approx 2\pi i\,|\{k\in\{1,\ldots,n\}\mid c_{k}\,\,\textrm{lies inside the contour}\,\,\gamma\}|.$$ Therefore approximately $n$ of the zeroes of $p(z)$ lie within $\gamma$, so very few zeroes can have absolute value significantly greater than $1$. By a similar argument applied to the reversed polynomial $z^{n}p(1/z)$, whose zeroes are the reciprocals $1/c_{k}$, very few zeroes can have absolute value significantly less than $1$. We conclude that most zeroes lie near the unit circle.
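Here is a minimal numerical sketch of this conclusion (the setup is my own, not part of the argument above): sample a polynomial with i.i.d. standard complex Gaussian coefficients and check how many of its roots have modulus close to $1$.

```python
# A minimal numerical sketch of the conclusion above. Assumptions of mine (not
# from the post): degree n = 200, i.i.d. standard complex Gaussian coefficients,
# and "close to 1" means within 0.1.
import numpy as np

rng = np.random.default_rng(0)
n = 200
# numpy.roots expects coefficients from the highest degree down: a_n, ..., a_0.
coeffs = rng.standard_normal(n + 1) + 1j * rng.standard_normal(n + 1)

roots = np.roots(coeffs)          # the zeroes c_1, ..., c_n
moduli = np.abs(roots)

eps = 0.1
frac = np.mean((moduli > 1 - eps) & (moduli < 1 + eps))
print(f"fraction of roots with 1 - {eps} < |c_k| < 1 + {eps}: {frac:.3f}")
```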

$\textbf{Added Oct 11, 2014}$

A modified argument can help explain why the zeroes also tend to be uniformly distributed around the circle. Suppose that $\theta\in[0,2\pi]$ and $\gamma_{\theta}$ is the pizza-slice-shaped contour defined by $$\gamma_{\theta}:=\gamma_{1,\theta}+\gamma_{2,\theta}+\gamma_{3,\theta}$$ where

$$\gamma_{1,\theta}=[0,1+\epsilon]\times\{0\},$$

$$\gamma_{2,\theta}=\{re^{i\theta}\mid r\in[0,1+\epsilon]\},$$

$$\gamma_{3,\theta}=\{(1+\epsilon)e^{ix}\mid x\in[0,\theta]\}.$$

Then $$\oint_{\gamma_{\theta}}\frac{p'(z)}{p(z)}\,dz= \int_{\gamma_{1,\theta}}\frac{p'(z)}{p(z)}\,dz+\int_{\gamma_{2,\theta}}\frac{p'(z)}{p(z)}\,dz+\int_{\gamma_{3,\theta}}\frac{p'(z)}{p(z)}\,dz$$

$$\approx O(1)+O(1)+\int_{\gamma_{3,\theta}}\frac{p'(z)}{p(z)}\,dz$$

$$\approx O(1)+O(1)+\int_{\gamma_{3,\theta}}\frac{na_{n}z^{n-1}+(n-1)a_{n-1}z^{n-2}+\cdots+a_{1}}{a_{n}z^{n}+\cdots+a_{0}}\,dz$$

$$\approx O(1)+O(1)+\int_{\gamma_{3,\theta}}\frac{n}{z}\,dz\approx ni\theta.$$

Therefore, dividing by $2\pi i$, there should be approximately $\frac{n\theta}{2\pi}$ zeroes inside the pizza slice $\gamma_{\theta}$.
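The same kind of simulation gives a rough check of this angular claim (again with parameters I chose myself): the number of roots with argument in $[0,\theta]$ should come out close to $n\theta/(2\pi)$.

```python
# A rough numerical check of the pizza-slice count. Assumptions of mine: degree
# n = 400, i.i.d. standard complex Gaussian coefficients, theta = 2 radians.
import numpy as np

rng = np.random.default_rng(1)
n = 400
coeffs = rng.standard_normal(n + 1) + 1j * rng.standard_normal(n + 1)
roots = np.roots(coeffs)

theta = 2.0
args = np.mod(np.angle(roots), 2 * np.pi)     # arguments mapped into [0, 2*pi)
count = int(np.sum(args <= theta))

print(f"roots with argument in [0, {theta}]: {count}")
print(f"prediction n*theta/(2*pi):           {n * theta / (2 * np.pi):.1f}")
```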


A complete derivation can be found in the classical paper of Shepp and Vanderbei:

Larry A. Shepp and Robert J. Vanderbei, The complex zeros of random polynomials, Trans. Amer. Math. Soc. 347 (1995), 4365-4384.

But the heuristic explanation is that for small modulus the higher-order terms contribute very little to the polynomial and so can be thrown away (the polynomial can then be viewed as one of much lower degree, and hence has few roots there), while for large modulus one can apply the same reasoning after the substitution $z\rightarrow 1/z.$
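As a rough illustration of the first half of this heuristic (with a setup chosen by me, not taken from the paper), one can compare the low-degree and high-degree parts of a random polynomial on a circle of small radius; the high-degree tail is negligible there.

```python
# Compare the low-degree "head" and high-degree "tail" of a random polynomial
# on a small circle. Assumptions of mine: i.i.d. standard normal coefficients,
# degree n = 100, truncation degree m = 10, radius r = 0.5.
import numpy as np

rng = np.random.default_rng(2)
n, m, r = 100, 10, 0.5
a = rng.standard_normal(n + 1)                 # a_0, ..., a_n

z = r * np.exp(1j * np.linspace(0, 2 * np.pi, 1000, endpoint=False))
powers = z[:, None] ** np.arange(n + 1)        # z^0, ..., z^n at each sample point

head = powers[:, : m + 1] @ a[: m + 1]         # a_0 + a_1 z + ... + a_m z^m
tail = powers[:, m + 1 :] @ a[m + 1 :]         # a_{m+1} z^{m+1} + ... + a_n z^n

ratio = np.max(np.abs(tail)) / np.max(np.abs(head))
print(f"max |tail| / max |head| on |z| = {r}: {ratio:.2e}")
```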

EDIT

For a general distribution of coefficients, see this (underappreciated, in my opinion) paper: Distribution of roots of random real generalized polynomials


I think the following geometric argument is interesting and maybe sufficient to answer "why" at an intuitive level (?).

When we take the powers of $x$ in the complex plane, the absolute value scales geometrically ($|x^n| =|x|^n$) and the argument (angle with the x-axis) scales linearly ($\arg x^n = n \arg x$). So the powers of $x$ look like this:

[figure: the powers of $x$ in the complex plane]

If $x$ is a root of our random polynomial $$ p(x) = a_nx^n + \dots + a_1 x + a_0, $$ then each of these vectors (including the $x^0$ vector not drawn) is multiplied by a random coefficient, and the sum is equal to the zero vector. I'm just thinking of i.i.d. positive bounded coefficients for this response.

The key point is that this weighted sum of the vectors in any particular direction must cancel out to zero if $x$ is a root of the polynomial, yet each time $x^k$ goes "around the circle" the sizes $|x^k|$ of the vectors are geometrically larger, unless $|x|$ is very close to $1$. Intuitively, some randomness in the coefficients will not be enough to cancel out large growth of $|x^k|$, because the vectors must sum to zero in every direction simultaneously.

For concreteness, choose the direction of the positive $x$-axis. Then the condition that $x$ be a root implies that, letting $\theta = \arg x$ be the angle of $x$ with the $x$-axis, \begin{align*} 0 &= \sum_k a_k \operatorname{Re}(x^k) \\ &= \sum_k a_k |x|^k \cos (k \theta) . \end{align*} Heuristically, since $\cos(k \theta)$ is an oscillating term in $\theta$ and the $a_k$ are independently random, $|x|$ must be very close to one or else the large-$k$ terms "unbalance" the sum. And this condition must hold in all directions, not just the positive $x$-axis.
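A small numerical companion to this heuristic, with a setup of my own choosing (i.i.d. $\mathrm{Uniform}(0,1)$ coefficients): scanning $|p(re^{i\theta})|$ over a grid of angles for several radii suggests that the polynomial only gets close to zero when $r$ is near $1$.

```python
# Scan |p(r e^{i*theta})| over angles for several radii. Assumptions of mine:
# i.i.d. Uniform(0,1) coefficients, degree n = 50, 4000 sample angles. The
# minimum over the circle tends to be far smaller for r near 1, where the
# roots actually live.
import numpy as np

rng = np.random.default_rng(3)
n = 50
a = rng.uniform(0.0, 1.0, n + 1)               # positive bounded coefficients a_0, ..., a_n

thetas = np.linspace(0, 2 * np.pi, 4000, endpoint=False)
for r in (0.7, 0.9, 1.0, 1.1, 1.3):
    z = r * np.exp(1j * thetas)
    p = np.polyval(a[::-1], z)                 # p(z) = a_0 + a_1 z + ... + a_n z^n
    print(f"r = {r:.1f}: min |p(z)| on the circle = {np.min(np.abs(p)):.3e}")
```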

I have drawn the case where $|x| > 1$, but the $|x| < 1$ case is exactly the same.

(Edit: Maybe also interesting, in light of Francois' simulations: this suggests that if the coefficients are all positive, or more likely to be positive, and the degree is relatively small, then we should see few roots with argument (angle to the $x$-axis) close to $0$, since in that case there is not enough oscillation to get cancellation. That is, the powers of $x$ don't go "around the circle", and neither are they cancelled by negative coefficients.)
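A quick simulation of this remark, again with a setup of my own (positive i.i.d. $\mathrm{Uniform}(0,1)$ coefficients, modest degree), comparing the fraction of roots with argument near $0$ against the uniform prediction:

```python
# Count roots with argument near 0 for polynomials with positive coefficients.
# Assumptions of mine: i.i.d. Uniform(0,1) coefficients, degree n = 20,
# 2000 trials, and "near 0" means |arg| < 0.2 radians. In this setup the
# observed fraction tends to come out well below the uniform share.
import numpy as np

rng = np.random.default_rng(4)
n, trials, window = 20, 2000, 0.2

near_zero = 0
total = 0
for _ in range(trials):
    a = rng.uniform(0.0, 1.0, n + 1)     # numpy.roots reads a[0] as the leading coefficient
    roots = np.roots(a)
    args = np.angle(roots)               # principal argument in (-pi, pi]
    near_zero += int(np.sum(np.abs(args) < window))
    total += roots.size

print(f"observed fraction with |arg| < {window}: {near_zero / total:.4f}")
print(f"uniform prediction 2*window/(2*pi):      {window / np.pi:.4f}")
```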