Unbiased Estimator for a Uniform Variable Support

Of course $\hat\theta=\max\{x_i\}$ is biased, simulations or not: since $\hat\theta<\theta$ with probability $1$, one can be sure that $E_\theta(\hat\theta)<\theta$ for every $\theta$.

One usually considers instead $\hat\theta_N=\frac{N+1}N\cdot\max\{x_i\}$; then $E_\theta(\hat\theta_N)=\theta$ for every $\theta$.

To show that $\hat\theta_N$ is unbiased, one must compute the expectation of $M_N=\max\{x_i\}$, and the simplest way to do that might be to note that for every $x$ in $(0,\theta)$, $P_\theta(M_N\le x)=(x/\theta)^N$, hence $$ E_\theta(M_N)=\int\limits_0^{\theta}P_\theta(M_N\ge x)\text{d}x=\int\limits_0^{\theta}(1-(x/\theta)^N)\text{d}x=\theta-\theta/(N+1)=\theta N/(N+1). $$
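
For readers who want to check this numerically, here is a small Monte Carlo sketch in Python; the values $\theta=1$, $N=10$ and the number of replications are arbitrary illustration choices:

```python
import numpy as np

# Monte Carlo sketch: theta = 1.0, N = 10 and the number of replications
# are arbitrary illustration values.
rng = np.random.default_rng(0)
theta, N, reps = 1.0, 10, 100_000

samples = rng.uniform(0.0, theta, size=(reps, N))
m = samples.max(axis=1)                       # M_N, the sample maximum, per replication

print("E[max]           ≈", m.mean())                   # ≈ theta * N/(N+1) ≈ 0.909 (biased low)
print("E[(N+1)/N * max] ≈", ((N + 1) / N * m).mean())   # ≈ theta = 1.0 (unbiased)
```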

Edit (Below is a remark due to @cardinal, which nicely completes this post.)

It may be worth noting that the maximum $M_N=\max\limits_{1\le k\le N}X_k$ of an i.i.d. Uniform$(0,\theta)$ random sample is a sufficient statistic for $\theta$, and that it is one of the few statistics for distributions outside the exponential family that is also complete.

An immediate consequence is that $\hat\theta_N=(N+1)M_N/N$ is the uniformly minimum variance unbiased estimator (UMVUE) for $\theta$; that is, any other unbiased estimator for $\theta$ is no better in the $L^2$ sense.
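
As a concrete illustration (not part of the original remark), one can compare $\hat\theta_N$ with another unbiased estimator, for instance the method-of-moments estimator $2\bar X_N$, which is unbiased since $E_\theta(X)=\theta/2$. A small simulation sketch, with $\theta=1$ and $N=10$ chosen arbitrarily:

```python
import numpy as np

# Sketch comparing the UMVUE (N+1)/N * max with the method-of-moments
# estimator 2 * sample mean, which is also unbiased since E[X] = theta/2.
# theta = 1.0, N = 10 and the number of replications are arbitrary choices.
rng = np.random.default_rng(1)
theta, N, reps = 1.0, 10, 200_000

samples = rng.uniform(0.0, theta, size=(reps, N))
umvue = (N + 1) / N * samples.max(axis=1)
moments = 2.0 * samples.mean(axis=1)

print("UMVUE:   mean ≈", umvue.mean(),   "var ≈", umvue.var())    # var ≈ theta^2/(N(N+2)) ≈ 0.0083
print("moments: mean ≈", moments.mean(), "var ≈", moments.var())  # var ≈ theta^2/(3N)     ≈ 0.0333
```

Both sample means come out close to $\theta$, but the variance of $\hat\theta_N$ is markedly smaller, as the UMVUE property predicts.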


Another way to look at the result derived by Joriki:

It is known that if you order $N$ uniform observations sampled from a given interval, then the $N+1$ spacings they determine (including the gaps to the two endpoints) form a point distributed uniformly on the simplex $\{(s_1,\dots,s_{N+1}) : s_i\ge 0,\ s_1+\cdots+s_{N+1}=L\}$, where $L$ is the length of the interval.

In particular, the remaining difference $\theta-\hat\theta$ (the spacing between the largest observation and $\theta$) has the same distribution as any other component of such a random point. Since by symmetry all components have the same expectation and they sum to $\theta$, each component has expectation $\theta/(N+1)$.

All in all: $$\mathbb{E}[\theta-\hat\theta] = \frac{\theta}{N+1},$$ and indeed

$$\mathbb{E}[\hat\theta] = \frac{N}{N+1}\theta.$$
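
To see the spacings argument in action, here is a short simulation sketch (the values $\theta=1$ and $N=4$ are arbitrary): sort the sample, append the endpoints $0$ and $\theta$, and check that each of the $N+1$ spacings has mean roughly $\theta/(N+1)$.

```python
import numpy as np

# Spacings sketch: sort N uniforms on (0, theta), prepend 0 and append theta,
# and look at the N+1 gaps. theta = 1.0 and N = 4 are arbitrary choices.
rng = np.random.default_rng(2)
theta, N, reps = 1.0, 4, 100_000

ordered = np.sort(rng.uniform(0.0, theta, size=(reps, N)), axis=1)
padded = np.concatenate(
    [np.zeros((reps, 1)), ordered, np.full((reps, 1), theta)], axis=1
)
spacings = np.diff(padded, axis=1)            # shape (reps, N+1)

print("mean of each spacing      ≈", spacings.mean(axis=0))    # each ≈ theta/(N+1) = 0.2
print("E[theta - max] (last gap) ≈", spacings[:, -1].mean())   # ≈ theta/(N+1)
```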

Edit: If the minimum of the interval is unknown as well (say the interval is $[a,b]$), then, writing $\hat a$ for the minimum of the sample and $\hat b$ for the maximum, the same reasoning shows that $$\mathbb{E}[b-\hat b]=\frac{b-a}{N+1},$$ $$\mathbb{E}[\hat b]=\frac{N}{N+1}b+\frac{1}{N+1}a,$$ and, by symmetry, $$\mathbb{E}[\hat a]=\frac{N}{N+1}a+\frac{1}{N+1}b.$$

And so, combining the two displays, to get an unbiased estimator of the maximum $b$ one could use for example $$\frac{N\hat b-\hat a}{N-1}.$$
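
A quick sanity check of this estimator by simulation; the interval $[a,b]=[2,5]$ and $N=10$ are arbitrary illustration values:

```python
import numpy as np

# Sketch for the unknown-interval case: a = 2.0, b = 5.0 and N = 10 are
# arbitrary illustration values; a_hat and b_hat denote sample min and max.
rng = np.random.default_rng(3)
a, b, N, reps = 2.0, 5.0, 10, 200_000

samples = rng.uniform(a, b, size=(reps, N))
a_hat = samples.min(axis=1)
b_hat = samples.max(axis=1)

print("E[b_hat]                   ≈", b_hat.mean())  # ≈ (N*b + a)/(N+1) ≈ 4.727
print("E[(N*b_hat - a_hat)/(N-1)] ≈", ((N * b_hat - a_hat) / (N - 1)).mean())  # ≈ b = 5.0
```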


To derive Didier's result, observe that the cumulative distribution function of the maximum, evaluated at $m$, is the ratio of the volume of $[0,m]^N$ to the volume of $[0,\theta]^N$, which is $(m/\theta)^N$. The density is the derivative of that, so we can calculate the expected value of the maximum as

$$ \begin{eqnarray} \int_0^\theta m\frac{\mathrm d}{\mathrm dm}\left(\frac m\theta\right)^N\mathrm dm &=& \frac N{\theta^N}\int_0^\theta m^N\mathrm dm \\ &=& \frac N{N+1}\theta\;, \end{eqnarray} $$

so we get an unbiased estimator for $\theta$ by multiplying by $(N+1)/N$.
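
One can also verify the integral numerically for a particular case; the sketch below uses $\theta=2$ and $N=7$ (arbitrary choices) and a midpoint Riemann sum rather than a symbolic computation:

```python
import numpy as np

# Numerical check of the density computation: f(m) = N m^(N-1) / theta^N on (0, theta),
# so the integral of m * f(m) should equal N * theta / (N + 1).
# theta = 2.0 and N = 7 are arbitrary illustration values.
theta, N = 2.0, 7
dm = theta / 1_000_000
m = np.arange(dm / 2, theta, dm)              # midpoints of a fine grid on (0, theta)
density = N * m ** (N - 1) / theta ** N       # density of the maximum M_N

print("midpoint-rule integral of m*f(m) ≈", np.sum(m * density) * dm)
print("N*theta/(N+1)                    =", N * theta / (N + 1))
```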
