Any Simple Example of Lebesgue Integration?

When they both exist, Lebesgue and Riemann integration give the same value. In particular, the fundamental theorem of calculus, substitution theorems, etc., are just as true for the Lebesgue integral as for the Riemann integral. So when you use the usual calculus techniques to compute the expected value of an exponential random variable as $$ E(X) = \int_0^\infty x\lambda e^{-\lambda x}dx = \frac{1}{\lambda}$$ it doesn't matter at all whether it's a Lebesgue integral or a Riemann integral.

Nonetheless, we can certainly do a toy computation of $\int_0^1 x^2 dx$ "the Lebesgue way". The basic idea, as I'm sure you've heard, is to partition the $y$ axis rather than the $x$ axis. So let's take an evenly spaced partition $A^n_i = [\frac{i-1}{n},\frac{i}{n})$ of $[0,1).$ Then, writing $f(x) = x^2,$ we can partition the $x$ axis into $$B_i^n = \{x\in[0,1) \mid f(x)\in A_i^n\}.$$ Here this partition is simple since $f$ is increasing. We have $B_i^n = [ \sqrt{\frac{i-1}{n}},\sqrt{\frac{i}{n}}).$ (But note that for a non-monotonic function the $B_i^n$ could be the union of several intervals, not just a single interval.)

Then, we can write a sequence of simple functions (which here happen to be step functions) $$ s^{lower}_n(x) = \sum_{i=1}^n \frac{i-1}{n} 1_{B_i^n}(x)$$ where $1_E(x)$ is the indicator function that is one if $x\in E$ and zero otherwise. So it's a function that is constant on each of the sets $B_i^n$ but takes different values on different sets... it looks like a staircase here. Notice that $\frac{i-1}{n}$ is the lower endpoint of the $y$-axis partition interval $A_i^n,$ so that $s^{lower}_n \le f.$

Now, we define that for a constant function $f(x)=c$, the integral of the function over some set $E$ is just $c\mu(E)$ where $\mu(E)$ is the measure of $E,$ and we extend it to simple functions in the obvious way: $$ \int \sum_{i=1}^n a_i 1_{E_i} \;d\mu = \sum_{i=1}^n a_i \mu(E_i).$$

So we have $$ \int_{[0,1]} s^{lower}_n \;d\mu = \sum_{i=1}^n \frac{i-1}{n}\mu(B_i^n) = \sum_{i=1}^n \frac{i-1}{n}\left(\sqrt{\frac{i}{n}}-\sqrt{\frac{i-1}{n}} \right).$$

Similarly, we can define a function $s_n^{upper}(x) >f $ by $$s_n^{upper}(x)=\sum_{i=1}^n \frac{i}{n} 1_{B_i^n}(x) $$ and compute $$ \int_{[0,1]} s^{upper}_n \; d\mu= \sum_{i=1}^n \frac{i}{n}\left(\sqrt{\frac{i}{n}}-\sqrt{\frac{i-1}{n}} \right).$$

I'm not sure if these sums can be computed in closed form, but in any event, it is true that $$ \lim_{n\to\infty} \int_{[0,1]} s^{lower}_n \; d\mu = \lim_{n\to\infty} \int_{[0,1]} s^{upper}_n \; d\mu = \frac{1}{3},$$ so since $s_n^{upper} > f \ge s_n^{lower}$ for all $x\in[0,1),$ the integral of $f(x) = x^2$ is squeezed to $1/3.$
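As a rough numerical check of that limit, here is a minimal Python sketch that evaluates both sums for a few values of $n$. The function name `lebesgue_sums` is just something I made up for illustration; it hard-codes $f(x) = x^2$ and the evenly spaced $y$-axis partition used above.

```python
import math

def lebesgue_sums(n):
    """Integrals of the lower and upper simple functions for f(x) = x^2,
    using the evenly spaced y-axis partition A_i = [(i-1)/n, i/n)."""
    lower = 0.0
    upper = 0.0
    for i in range(1, n + 1):
        # B_i = [sqrt((i-1)/n), sqrt(i/n)), so its measure is its length.
        mu_B = math.sqrt(i / n) - math.sqrt((i - 1) / n)
        lower += (i - 1) / n * mu_B  # value (i-1)/n on B_i
        upper += i / n * mu_B        # value i/n on B_i
    return lower, upper

for n in (10, 100, 1000, 10000):
    lower, upper = lebesgue_sums(n)
    # Columns: n, lower integral, upper integral, gap between them
    print(n, lower, upper, upper - lower)
```

The gap in the last column can also be read off by hand: $\int s_n^{upper}d\mu - \int s_n^{lower}d\mu = \frac{1}{n}\sum_{i=1}^n\left(\sqrt{\frac{i}{n}}-\sqrt{\frac{i-1}{n}}\right) = \frac{1}{n},$ since the sum telescopes to $\sqrt{1}-\sqrt{0}.$ So the two integrals are forced together as $n$ grows.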

(Formally, the Lebesgue integral of a non-negative function is defined as $$ \int_{[0,1]} f(x)d\mu = \sup_{s\in\Sigma_{[0,1]}} \left\{\int s d\mu \mid 0\le s \le f\right\}$$ where $\Sigma_{[0,1]}$ is the set of all simple functions on $[0,1].$ We can show that for any simple $s \le f,$ $\int s\;d\mu \le \int s_n^{upper} d\mu,$ so we indeed have $\int f\;d\mu \le \int s_n^{upper} d\mu.$ The other bound, $\int s_n^{lower} d\mu \le \int f\;d\mu,$ is immediate because $s_n^{lower}$ is itself a simple function with $s_n^{lower}\le f.$)

This might be a little unsatisfying for your purpose, because we can't compute the sum in closed form before taking the limit, like you could in the Riemann case (or I can't, anyway). But this is due to the fact that I chose an evenly spaced partition of the $y$ axis for clarity. If I instead let $A^n_i = [\left(\frac{i-1}{n}\right)^2,\left(\frac{i}{n}\right)^2),$ then we get $B^n_i = [\frac{i-1}{n},\frac{i}{n})$ and going through the same motions we get $$\int_{[0,1]} s^{lower}_n \;d\mu = \sum_{i=1}^n \left(\frac{i-1}{n}\right)^2 \mu(B_i^n) = \sum_{i=1}^n \left(\frac{i-1}{n}\right)^2 \frac{1}{n}$$ which is precisely the same thing you'd write down for a Riemann sum computation (and that I presume you know how to compute in closed form). This is no accident, of course... Any function that is (properly) Riemann-integrable is also Lebesgue-integrable, and the two integrals have the same value.
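For completeness, here is how that last sum works out, using the standard formula $\sum_{k=0}^{n-1}k^2 = \frac{(n-1)n(2n-1)}{6}$ with the index shift $k = i-1$:

$$ \sum_{i=1}^n \left(\frac{i-1}{n}\right)^2 \frac{1}{n} = \frac{1}{n^3}\sum_{k=0}^{n-1}k^2 = \frac{(n-1)(2n-1)}{6n^2} \xrightarrow[n\to\infty]{} \frac{1}{3}.$$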

Hopefully my indulgence of your question doesn't obscure what I said in the first paragraph. The "point" of Lebesgue integration is not that it's a way to do standard integrals of calculus by some new method. It's that the definition of the integral is more theoretically powerful: it leads to more elegant formalism and cleaner results (like the dominated convergence theorem) that are very useful in harmonic/functional analysis and probability theory. It's also true, as said in the comments, that it allows you to integrate more functions, like the indicator function of the rationals. This isn't really that compelling an advantage in and of itself (and note that if we extend to improper integrals there are functions that are Riemann-integrable but not Lebesgue-integrable), but the argument that the integral of that indicator function is zero shows how nice an ally measure theory can be... a function that was ugly and too cumbersome for Riemann integration to deal with is tamed quite easily by Lebesgue.
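In case it helps, here is a sketch of that argument on $[0,1]$. The indicator $1_{\mathbb{Q}}$ is already a simple function, so the definition for simple functions applies directly. Enumerating the rationals in $[0,1]$ as $q_1, q_2, \dots$ (the labels $q_k$ are just my notation) and using countable additivity of $\mu$:

$$ \int_{[0,1]} 1_{\mathbb{Q}} \;d\mu = 1\cdot\mu\left(\mathbb{Q}\cap[0,1]\right) + 0\cdot\mu\left([0,1]\setminus\mathbb{Q}\right) = \sum_{k=1}^{\infty}\mu(\{q_k\}) = 0,$$

since each single point has measure zero.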


I think SpaceIsDarkGreen's answer is very good, but I'd like to reformulate one of his examples with a notation that physicists and engineers may relate to more easily. In any case, it made it much easier for me to write it this way so I hope it's useful to others. I'll also argue that the Lebesgue point of view can have a practical use even for integrals of continuous functions.

Consider the (Riemann) integral between $-1$ and $+1$ of the function $f(x) = x^2$:

$$ I = \int_{-1}^{1}dx\; f(x) = \int_{-1}^{1}dx\; x^2 = \left.\frac{x^3}{3}\right|_{-1}^{1} = \frac{2}{3} $$

The Lebesgue integral calculates the same thing with a very special change of integration variables. I'll write the Lebesgue integral this way:

$$ I = \int_{0}^{1} f\;d\mu(f) $$

Basically we're doing what SpaceIsDarkGreen said. For each value of the function $f \in \left[0,1\right]$, we find the size $d\mu(f)$ of the part of the integration domain that maps to the interval $\left[f,f+df\right]$. Then we multiply $f$ by $d\mu(f)$ and add up the horizontal slices under the curve. To give a practical meaning to our Lebesgue integral, we now need to calculate the size $d\mu(f)$. For the integration region above, there are two points on the $x$ axis that map to the same value $f$, one at $+x$ and one at $-x$. For one of these points, the size of the domain that maps to $\left[f,f+df\right]$ is $\sqrt{f+df} - \sqrt{f}$. The function is symmetric so the total size of the domain mapping to $\left[f,f+df\right]$ is twice that:

$$ d\mu(f) = 2\times\left[\sqrt{f+df}-\sqrt{f}\right] \approx \frac{df}{\sqrt{f}} $$

Here, I used the fact that $df$ is small. Putting this back into the Lebesgue integral above, we find that

$$ I = \int_{0}^{1} f\;\frac{df}{\sqrt{f}} = \int_{0}^{1} df\;\sqrt{f} = \left.\frac{2}{3}f^{3/2}\right|_{0}^{1} = \frac{2}{3} $$

Of course, we get the same answer. But note that even though we didn't need the Lebesgue integral, we have a very practical result. We proved the following equality:

$$ I = \int_{-1}^{1}dx\; x^2 = \int_{0}^{1} dx\;\sqrt{x} $$

and we proved that this equality is not accidental. Here it's a bit trivial, but it may sometimes be the case that even for nice continuous functions, the Lebesgue point of view could give integrals that are easier to solve.
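The slicing argument above can also be checked numerically without taking $df \to 0$ by hand. This is only a sketch: the function name `horizontal_slice_sum` is made up, and using the midpoint of each slice as the representative value of $f$ is my own choice (any value inside the slice gives the same limit).

```python
import math

def horizontal_slice_sum(n):
    """Approximate the integral of f d(mu(f)) for f(x) = x^2 on [-1, 1]
    by cutting the range [0, 1] into n slices of height df = 1/n."""
    df = 1.0 / n
    total = 0.0
    for k in range(n):
        f = k * df
        # Size of the set of x in [-1, 1] with f <= x^2 < f + df:
        # two symmetric intervals, each of length sqrt(f + df) - sqrt(f).
        d_mu = 2.0 * (math.sqrt(f + df) - math.sqrt(f))
        # Representative value of f on the slice (midpoint: an assumption).
        total += (f + 0.5 * df) * d_mu
    return total

for n in (10, 100, 1000, 10000):
    print(n, horizontal_slice_sum(n))
```

As $n$ grows the printed values should approach $2/3$, in agreement with both integrals above.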

Edit: I still want to emphasize the point made earlier though: this is not the main reason why Lebesgue integration is interesting. But I thought it was worth pointing out that it can give a new point of view on practical integration.