What is a sample of a random variable?

I think @adjan's answer and @Ian's answer above are also correct interpretations.

Generally, "sample" is more of a concept from statistics then probability.

If you have a "population" which assumes a certain property with probability defined by distribution $P$, then a sample is a vector (of arbitrary length, but ideally specified beforehand) of independent copies of a random variable $X$ distributed according to distribution $P$.

To be technical, two independent copies $X_1, X_2$ of a random variable $X$ are actually two random variables on the product space $\Omega \times \Omega$ endowed with the product measure (so that, in particular, $\mathbb{E}X_1X_2 = \mathbb{E}X_1 \, \mathbb{E}X_2$). If $\pi_1, \pi_2$ are the canonical projections and $i_1, i_2: \Omega \to \Omega \times \Omega$ are any two injections such that $\pi_1 \circ i_1 = \mathrm{id}_{\Omega}$ and $\pi_2 \circ i_2 = \mathrm{id}_{\Omega}$, then $X_1 \circ i_1 \sim X$ and $X_2 \circ i_2 \sim X$.
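One standard way to realize such copies, which the paragraph above leaves implicit (so take this as a sketch rather than the only possible construction), is to compose $X$ with the projections:

$$X_1 := X \circ \pi_1, \qquad X_2 := X \circ \pi_2,$$

so that $X_1 \circ i_1 = X \circ (\pi_1 \circ i_1) = X \circ \mathrm{id}_{\Omega} = X$, and similarly for $X_2$; the product measure then makes $X_1$ and $X_2$ independent.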

Sampling a predetermined, fixed number $n$ of independent "samples" from the total population is really just considering the product space $\Omega^n$, endowed with the product measure, with random variables $X_1, \dots, X_n$ such that $X_1 \circ i_1 \sim \dots \sim X_n \circ i_n \sim X$.
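Concretely, with the same recipe as above one can take the coordinate maps

$$X_k := X \circ \pi_k \quad (k = 1, \dots, n),$$

where $\pi_k : \Omega^n \to \Omega$ is the $k$-th canonical projection.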

If we don't want to fix the number of samples in advance, then sampling is just considering the product space $\Omega^{\mathbb{N}}$, endowed with the product measure, with random variables $X_1, \dots, X_n, \dots$ such that $X_1 \circ i_1 \sim \dots \sim X_n \circ i_n \sim \dots \sim X$. Such a construction is possible in the countably infinite case by, for example, the Ionescu-Tulcea extension theorem, even in many cases where the conditions of Kolmogorov's extension theorem do not apply.

It is also worth noting that we don't necessarily need to assume that the "copies" of $X$ are independent; with an entirely analogous definition, it would still be possible to have $X_1 \circ i_1 \sim \dots \sim X_n \circ i_n \sim \dots \sim X$ without the measure on $\Omega^{\mathbb{N}}$ being the product measure. (Here I am drawing an implicit distinction between "the" product measure on a product of measure spaces and an arbitrary measure on such a space, which need not equal the canonical one.)

Then a "statistic" is any (measurable) function of this sample vector in the finite case or sample sequence in the infinite case.

Because in practice every real-life experiment involves only finitely many observations, references to sample vectors (of an arbitrary, unspecified size $n$) rather than sample sequences are more common in statistics-oriented literature, although the greater generality afforded (in some sense) by the limiting case $n \to \infty$ is used very frequently in the pure probability literature (e.g. when discussing random walks or results like the CLT or the Law of the Iterated Logarithm). However, in the pure probability literature such infinite samples are usually termed "a sequence of independent, identically distributed random variables" rather than a "sample", even though such a sequence is the same type of mathematical object as the unbounded version of the theoretical model of "samples" used by statisticians. Indeed, this correspondence is why the terms "sampling with replacement" and "sampling without replacement" turn up in the pure probability literature. Since the central objects of study in statistics are vectors of random variables, this also explains why the theory of vector-valued random variables is more prominent in the statistical literature than in other fields.

For example, we could have a population of particles with velocities distributed as $N(0,1)$. Then a sample would correspond to a vector $(X_1, \dots, X_n)$ of independent random variables, each $X_i$ having the distribution $N(0,1)$.

"In practice" this is the theoretical model for choosing $n$ particles from the population and noting what their velocity is.

Given this sample, we can calculate statistics, for example the "sample mean" $(X_1 + \dots + X_n)/n$.
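Continuing the hypothetical NumPy sketch from above (again, the specific numbers are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# One sample of n particle velocities, modeled as in the example: i.i.d. N(0, 1).
n = 100
velocities = rng.normal(size=n)

# The "sample mean" statistic: a (measurable) function of the sample vector.
sample_mean = velocities.sum() / n  # equivalently velocities.mean()
print(sample_mean)  # a single realized value of the statistic
```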

However, this is in general different from the "mean of the population" (which is $0$, the expectation of the normal distribution), since the sample mean is a random variable and hence assumes different values depending on which particles we happen to choose. In this case, however, the statistic has the desirable property that its expectation equals the population mean, i.e. $\mathbb{E}[(X_1 + \dots + X_n)/n] = \mathbb{E}X$ (such a statistic is called an unbiased estimator).
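The verification is just linearity of expectation together with $X_i \sim X$:

$$\mathbb{E}\left[\frac{X_1 + \dots + X_n}{n}\right] = \frac{1}{n}\sum_{i=1}^{n}\mathbb{E}X_i = \frac{1}{n}\cdot n\,\mathbb{E}X = \mathbb{E}X.$$

(Note that this step does not even use independence, only that the $X_i$ are identically distributed.)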


When discussing samples, there is a vector-valued random variable $Y$ whose coordinates $Y_i$ are real-valued random variables that are independent and have the same distribution as $X$. Then $\{ x_i \}_{i=1}^n$ is $Y(\omega)$ for some particular $\omega$ in the sample space $\Omega$. Sometimes when we refer to a "sample" we actually mean $Y$ itself rather than a particular value; it depends on the exact context.
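This distinction is also visible in the NumPy sketches above (hypothetical code, of course): the random vector $Y$ corresponds to the sampling *procedure*, while $Y(\omega)$ corresponds to the concrete array produced by one run of it:

```python
import numpy as np

def Y(rng, n=10):
    """The random vector Y, viewed as a procedure: feed it a source of
    randomness (playing the role of omega) and it returns one realization."""
    return rng.normal(size=n)

# One particular omega (here: one seeded random stream) yields one
# particular realization {x_i} = Y(omega); a different omega (seed)
# would yield a different realization of the same random vector.
x = Y(np.random.default_rng(seed=42))
print(x)
```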