Why do we need (the abstract concept of) random variables (in discrete probability models)?

In short, abstract concepts are for blurring out the details, i.e. the abstract concept of a random variable is useful because it allows us to work at a higher level than ordinary atoms.

To give some more motivation:

  • You can have more than one random variable on a single probability space, whereas working with atoms directly you would need to take care of every possible combination.
  • The notion of conditional expected value $\mathbb{E}(X | Y)$ is very useful, but cumbersome to define in terms of atoms.
  • You can nicely characterize a random variable in terms of its characteristic function.
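To make the first two points concrete, here is a minimal sketch in Python. The two-dice space and the names `omega`, `prob`, `X`, `Y`, `expectation` and `conditional_expectation` are assumptions chosen for illustration, not part of any library. Random variables are plain functions on the atoms, and $\mathbb{E}(X | Y)$ is itself a function on the atoms.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite probability space: two fair dice with the uniform measure.
omega = list(product(range(1, 7), repeat=2))
prob = {w: Fraction(1, 36) for w in omega}

# Two random variables on the same space: plain functions of the atom w.
def X(w):
    return w[0] + w[1]  # sum of both dice

def Y(w):
    return w[0]  # value of the first die

def expectation(Z):
    """E(Z): weighted sum of Z over all atoms."""
    return sum(Z(w) * prob[w] for w in omega)

def conditional_expectation(Z, W):
    """Return E(Z | W) as a function on omega.

    On each atom w it averages Z over all atoms where W takes the value W(w).
    """
    def cond(w):
        block = [v for v in omega if W(v) == W(w)]
        mass = sum(prob[v] for v in block)
        return sum(Z(v) * prob[v] for v in block) / mass
    return cond

E_X_given_Y = conditional_expectation(X, Y)
```

Here `E_X_given_Y` depends on the atom only through `Y`, and averaging it over the whole space gives back `expectation(X)`. Written purely in terms of atoms, each of these statements would have to enumerate all 36 combinations by hand.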

Finally, there's one more thing to add: humans have a very strong intuition about randomness, but in mathematics there is no randomness at all. All you have are functions (I will skip other entities for simplicity), and every time you apply a function to the same arguments you get the same result. I will repeat: in mathematics there is no randomness at all.

How does one deal with that? By introducing the set of all possible "states of the world", which we often denote by $\Omega$, the universe. If the world happens to be in some state $\omega_1 \in \Omega$, then the outcome of every experiment is precisely determined: we can perfectly predict the outcome of every action (tossing a coin or whatever it is). However, we don't know which state the world is in, and this is how we hack randomness into math. This interpretation naturally gives rise to the definition of random variables: they are random only because we do not know which state of the world we live in.
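This point of view can be sketched in a few lines of Python. The two-coin space and the names `omega`, `first_toss_is_heads`, `number_of_heads` and `observe` are made up for this illustration. The random variables are ordinary deterministic functions; the only uncertainty sits in which state `w` was picked.

```python
import random

# Hypothetical universe: every possible state of the world for two coin tosses.
omega = ["HH", "HT", "TH", "TT"]

# Random variables: deterministic functions of the state of the world.
def first_toss_is_heads(w):
    return 1 if w[0] == "H" else 0

def number_of_heads(w):
    return w.count("H")

def observe(rng):
    """Pick a hidden state w, then report what the random variables say about it."""
    w = rng.choice(omega)  # the one step we cannot see from the outside
    return first_toss_is_heads(w), number_of_heads(w)

# Usage: the same state always gives the same values; only the choice of w varies.
sample = observe(random.Random())
```

Once `w` is fixed, every quantity is determined, and different random variables stay consistent with each other because they read the same `w`.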

To conclude: it does not matter whether $\Omega$ is finite or not, or whether we prefer to deal with raw sets and atoms or with theorems about abstract concepts. The notion of a random variable is very useful, and I would even say that it is the intuitive connection between pure mathematics and the imperfect (and therefore beautiful) world we live in.

Edit: To answer one of the OP's comments (too long for another comment): random variables aren't theoretically necessary, as we could inline their definitions into our proofs, etc., but any such proof would be unreadable and unmanageable, so in practice random variables are necessary. Consider the real numbers: one could always work with sequences of rational numbers, or continued fractions, or whatever, so the question is: are they essential? I think yes, and there are many similar examples.


If everything were about computing probabilities of events $A_k$, you might be able to get away with avoiding random variables. However, there's a lot more to probability than that. When, for example, you want to calculate means and variances, or talk about the relations between many "parametrized families", it's very useful to have random variables.


In principle, you can get away with joint distribution laws instead of random variables, but things get extremely messy and the formulations ridiculously tangled (try to reformulate the strong law of large numbers this way and you'll see what I mean). In addition, you lose all the intuition coming from classical analysis and Lebesgue integration theory. Even such a simple operation as the truncation of a random variable at some level will become the projection of the distribution measure, etc. So why give yourself a headache expressing everything in a fancy way when a nice and powerful language is available?

Of course, the Kolmogorov formalism may lead to some confusion too. For instance, if you have a random variable on $\Omega$ and want to introduce an independent copy (a standard trick in many proofs), it may be just impossible to do it on the same $\Omega$. So, technically speaking, a random variable is not quite the same as a function on a measure space; you should rather say that it can be realized as a function on a measure space, and check that the properties you define do not depend on the realization and are determined by the (joint) distribution only. Still, after proving it three times or so, it becomes fairly obvious what is legitimate and what is not, and you can safely forget about fine issues like this one. (There are more of them; the next one comes when you need conditional probabilities in the continuous setting, which requires the underlying $\Omega$ to be a Polish space, etc.)
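The independent-copy problem already shows up on the smallest possible example, which can be checked by brute force. The following is only a sketch; the one-coin space and the helper names `is_independent` and `distribution` are assumptions made for this illustration. On $\Omega = \{H, T\}$ with a fair coin, no function has the same distribution as the indicator of heads while being independent of it, but on the product space $\Omega \times \Omega$ the second coordinate does the job.

```python
from fractions import Fraction
from itertools import product

def is_independent(space, prob, A, B):
    """Check P(A=a, B=b) == P(A=a) * P(B=b) for all values a, b."""
    def P(event):
        return sum(prob[w] for w in space if event(w))
    for a in {A(w) for w in space}:
        for b in {B(w) for w in space}:
            joint = P(lambda w: A(w) == a and B(w) == b)
            if joint != P(lambda w: A(w) == a) * P(lambda w: B(w) == b):
                return False
    return True

def distribution(space, prob, A):
    """Law of A: a dict mapping each value to its probability."""
    dist = {}
    for w in space:
        dist[A(w)] = dist.get(A(w), 0) + prob[w]
    return dist

# Small space: one fair coin, X = indicator of heads.
small = ["H", "T"]
p_small = {w: Fraction(1, 2) for w in small}
X = lambda w: 1 if w == "H" else 0

# Enumerate every {0,1}-valued function on the small space (as lookup tables)
# and keep those that are independent copies of X.
candidates = [dict(zip(small, values)) for values in product([0, 1], repeat=2)]
copies_on_small = [
    table for table in candidates
    if distribution(small, p_small, table.get) == distribution(small, p_small, X)
    and is_independent(small, p_small, X, table.get)
]

# Product space: two fair coins, X realized on each coordinate.
big = list(product(small, repeat=2))
p_big = {w: Fraction(1, 4) for w in big}
X1 = lambda w: X(w[0])
X2 = lambda w: X(w[1])
```

In this sketch `copies_on_small` comes out empty, while `X1` and `X2` on the product space have the same law as `X` and are independent. The random variable "indicator of heads" is thus realized on two different spaces, which is why its properties should depend only on the distribution.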