Intuition for random variable being $\sigma$-algebra measurable?

The random variable $X$ is measurable with respect to the $\sigma$-algebra $\mathfrak F$ if and only if $X=\mathbb E(X\mid\mathfrak F)$ (almost surely, assuming $X$ is integrable).

One can understand this in a few steps:

  1. $\mathbb E(X\mid A)$, where $A$ is an event, is the expected value of $X$ given that $A$ occurs;
  2. $\mathbb E(X\mid Y)$, where $Y$ is a random variable, is a random variable whose value at $\omega\in\Omega$ is $\mathbb E(X\mid A)$ where $A$ is the event $\{Y=y\}$ and $y=Y(\omega)$;
  3. $\mathbb E(X\mid \mathbf 1_A)$ is the case $Y=\mathbf 1_A$, and $\mathbf 1_A(\omega)$ is 1 if $\omega\in A$, and 0 otherwise. This is the random variable that returns $\mathbb E(X\mid A)$ if $\omega\in A$, and $\mathbb E(X\mid A^c)$ if $\omega\not\in A$;

  4. $\mathbb E(X\mid \mathfrak F)$, where $\mathfrak F=\{\varnothing, \Omega, A, A^c\}$, is the same as $\mathbb E(X\mid \mathbf 1_A)$;

  5. $\mathbb E(X\mid \mathfrak F)$, where $\mathfrak F=\{\varnothing, \Omega, A, A^c, B, B^c, A\cup B, A\cup B^c,\dots\}$ (at most $2^{2^2}=16$ elements, one for each union of the four atoms $A\cap B$, $A\cap B^c$, $A^c\cap B$, $A^c\cap B^c$), is something we could call $\mathbb E(X\mid \mathbf 1_A, \mathbf 1_B)$; it returns $\mathbb E(X\mid A\cap B)$, $\mathbb E(X\mid A\cap B^c)$, $\mathbb E(X\mid A^c\cap B)$, or $\mathbb E(X\mid A^c\cap B^c)$, according to which of these atoms contains $\omega$. It may seem superfluous to list $A\cup B$ etc. in $\mathfrak F$: a generating set would suffice, but a generating set is not unique, so it is cleanest to list all of $\mathfrak F$;

  6. $\mathbb E(X\mid \mathfrak F)$, where $\mathfrak F=\mathfrak F_t$ is a $\sigma$-algebra encoding what's known at time $t$, is an infinite version of (5). It is a random variable that returns our best estimate of $X$, given the answers to all the questions "$\omega\in A$?" for $A\in\mathfrak F_t$. If that best estimate always coincides with $X$ itself, then $X$ is $\mathfrak F_t$-measurable, i.e. "known at time $t$".
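
The finite case in step 5 can be checked numerically. Here is a minimal sketch (assuming a finite $\Omega$ with uniform probability; the sets and the variable are illustrative, not from the question): the value of $\mathbb E(X\mid \sigma(A,B))$ at $\omega$ is the average of $X$ over whichever of the four atoms contains $\omega$.

```python
# E(X | sigma(A, B)) on a finite probability space, by averaging over atoms.
# Assumes uniform probability on Omega = {0, ..., 7}; names are illustrative.
Omega = range(8)
A = {0, 1, 2, 3}
B = {0, 1, 4, 5}
X = {w: w**2 for w in Omega}  # an arbitrary random variable

def atom(w):
    """The smallest event of sigma(A, B) containing w, e.g. A ∩ B^c."""
    return frozenset(v for v in Omega
                     if (v in A) == (w in A) and (v in B) == (w in B))

def cond_exp(w):
    """Value at w of E(X | sigma(A, B)): the average of X over w's atom."""
    a = atom(w)
    return sum(X[v] for v in a) / len(a)

Y = {w: cond_exp(w) for w in Omega}
# Y is constant on each of the four atoms, and E(Y) = E(X).
```

The same recipe works for step 6 whenever the $\sigma$-algebra has finitely many atoms: conditioning just replaces $X$ by its average over each atom.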


Maybe this can help you understand the concept of conditional expectation behind your question.

Suppose you have a probability space $(\Omega, \mathcal P (\Omega), \mathbb{P})$, where $\mathcal P (\Omega)$ denotes the power set of $\Omega$, i.e. the set of all its subsets (evidently a $\sigma$-algebra), and $\mathbb{P}$ is a probability measure (in this case, a function from $\mathcal P (\Omega)$ to $[0,1]$).

Suppose you have a random variable (measurable function) $X:(\Omega, \mathcal P (\Omega)) \to (\mathbb{R}, \mathcal B (\mathbb R ))$, where $\mathcal B (\mathbb R )$ is the usual Borel $\sigma$-algebra.

Take as a sub-$\sigma$-algebra the trivial one, $\mathcal F = \{\emptyset, \Omega\}$. Suppose we only know the conditional expectation $\mathbb E(X | \mathcal F)$, but not $X$ itself. How much do we know about $X$? Well, $Y = \mathbb E(X | \mathcal F)$ is an $\mathcal F/\mathcal B (\mathbb R )$-measurable random variable. From $Y$ we can determine only ONE thing (think about this!): $$\mathbb E(Y) = \mathbb E(\mathbb E(X | \mathcal F)) = \mathbb E (X).$$ So what is $\mathbb{E}(X | \mathcal F)$? It is the most simplified knowledge we can have: we arrive at it when we know the expectation of the random variable but nothing about its values on particular events (in $\mathcal P (\Omega)$).
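
A tiny sketch of this collapse (finite $\Omega$ with uniform probability; the setup is illustrative): the only atom of the trivial $\sigma$-algebra is $\Omega$ itself, so $Y$ is the constant $\mathbb E(X)$.

```python
# With the trivial sigma-algebra {∅, Ω}, the only atom is Ω itself,
# so E(X | F) is constant and equal to E(X).
# Assumes uniform probability on a finite Omega; names are illustrative.
Omega = range(6)
X = {w: float(w) for w in Omega}

EX = sum(X.values()) / len(Omega)   # E(X)
Y = {w: EX for w in Omega}          # E(X | {∅, Ω}): constant on Ω
```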

(In fact, $Y$ is constant; otherwise it would not be $\mathcal F$-measurable.)

Suppose now that we enlarge this $\sigma$-algebra, say to $\mathcal F' = \{\emptyset, A, A^c, \Omega\}$ for some non-trivial set $A$. Again, suppose that we only know $\mathbb{E}(X | \mathcal F')$, not $X$. Then we can determine three things about the variable: $$\mathbb E(X 1_A), \, \mathbb E(X 1_{A^c}) \text{ and } \mathbb E (X).$$ Conclusion: a bigger $\sigma$-algebra means more knowledge about the random variable $X$ (which is the one we are interested in)!
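
A sketch of this step (again a finite $\Omega$ with uniform probability; all names are illustrative): $Y = \mathbb E(X | \mathcal F')$ takes only two values, $\mathbb E(X|A)$ on $A$ and $\mathbb E(X|A^c)$ on $A^c$, and from those two values alone we recover $\mathbb E(X 1_A)$, $\mathbb E(X 1_{A^c})$ and $\mathbb E(X)$, without knowing $X$ itself.

```python
# Knowing only Y = E(X | F'), with F' = {∅, A, A^c, Ω}, determines
# E(X 1_A), E(X 1_{A^c}) and E(X), but not X itself.
# Assumes uniform probability on a finite Omega; names are illustrative.
Omega = range(6)
A = {0, 1}
Ac = set(Omega) - A
X = {w: (w + 1) ** 2 for w in Omega}

# Y = E(X | F') takes the value E(X | A) on A and E(X | A^c) on A^c.
E_X_given_A = sum(X[w] for w in A) / len(A)
E_X_given_Ac = sum(X[w] for w in Ac) / len(Ac)
Y = {w: E_X_given_A if w in A else E_X_given_Ac for w in Omega}

# Quantities recoverable from Y alone (P(A) and P(A^c) are known):
P_A, P_Ac = len(A) / len(Omega), len(Ac) / len(Omega)
E_X1A = Y[min(A)] * P_A      # E(X 1_A)
E_X1Ac = Y[min(Ac)] * P_Ac   # E(X 1_{A^c})
E_X = E_X1A + E_X1Ac         # E(X)
```

Note that many different variables $X$ would produce the same $Y$, which is exactly why $\mathcal F'$ gives only partial knowledge.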

Check that in the extreme case, when $\mathcal F'' =\mathcal P (\Omega)$, knowledge of $\mathbb E (X|\mathcal F'')$ allows us to determine all the expected values $\mathbb E(X 1_{\{X=x\}})= x\,\mathbb P (X=x)$, because the events $\{X=x\}$ belong to $\mathcal F''$ (like every other subset). If $X$ takes only finitely many values (for instance, when $\Omega$ is finite), these expectations suffice to determine the probabilities of all the events $\{X=x\}$. (When $X$ is continuous this reasoning is not very useful, for the sets $\{X=x\}$ have probability zero and the expectations above vanish. In any case, by the general properties of conditional expectation, $\mathbb E(X|\mathcal F'') = X$, because $X$ is $\mathcal F''$-measurable. In this sense, the variable is recovered from its conditional expectation.)
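
The extreme case can also be sketched on a finite space (uniform probability; the variable is illustrative): with $\mathcal F'' = \mathcal P(\Omega)$ every singleton $\{\omega\}$ is an atom, so averaging over atoms returns $X$ itself, and each $\mathbb E(X 1_{\{X=x\}}) = x\,\mathbb P(X=x)$ is recoverable.

```python
# With F'' = P(Omega), every singleton {w} is an event, hence an atom,
# so E(X | F'') averages X over {w} alone and simply returns X.
# Assumes uniform probability on a finite Omega; names are illustrative.
Omega = range(6)
X = {w: 3 * w - 1 for w in Omega}

Y = {w: sum(X[v] for v in {w}) / len({w}) for w in Omega}  # E(X | P(Omega))

# E(X 1_{X=x}) = x * P(X=x), recoverable for each value x that X takes:
E_X1 = {x: x * sum(1 for w in Omega if X[w] == x) / len(Omega)
        for x in set(X.values())}
```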