Jensen's inequality for integrals

First of all, Jensen's inequality requires a domain $X$ of total measure one, that is, $$ \int_X\,\mathrm{d}x=1\tag{1} $$ and a function $f$ that is integrable on $X$.

Next, suppose that $\varphi$ is convex on the convex hull of the range of $f$, $\mathcal{K}(f(X))$; this means that for any $t_0\in \mathcal{K}(f(X))$, the difference quotient $$ \frac{\varphi(t)-\varphi(t_0)}{t-t_0}\tag{2} $$ is non-decreasing in $t$ on $\mathcal{K}(f(X))\setminus\{t_0\}$. Consequently, we can find a $\Phi$ (depending on $t_0$) so that $$ \sup_{t<t_0}\frac{\varphi(t)-\varphi(t_0)}{t-t_0}\le\Phi\le\inf_{t>t_0}\frac{\varphi(t)-\varphi(t_0)}{t-t_0}\tag{3} $$ Multiplying $(3)$ by $t-t_0$, which reverses the inequality when $t<t_0$, shows that for all $t\in\mathcal{K}(f(X))$ we have $$ (t-t_0)\Phi\le\varphi(t)-\varphi(t_0)\tag{4} $$

Now, let $t=f(x)$ and set $$ t_0=\int_Xf(x)\,\mathrm{d}x\tag{5} $$ with $\Phi$ chosen for this $t_0$. Then $(4)$ becomes $$ \left(f(x)-\int_Xf(x)\,\mathrm{d}x\right)\Phi\le\varphi(f(x))-\varphi\left(\int_Xf(x)\,\mathrm{d}x\right)\tag{6} $$ Integrating both sides of $(6)$ over $X$ while remembering $(1)$ yields $$ \left(\int_Xf(x)\,\mathrm{d}x-\int_Xf(x)\,\mathrm{d}x\right)\Phi\le\int_X\varphi(f(x))\,\mathrm{d}x-\varphi\left(\int_Xf(x)\,\mathrm{d}x\right)\tag{7} $$ The left side of $(7)$ is zero, so rearranging gives $$ \varphi\left(\int_Xf(x)\,\mathrm{d}x\right)\le\int_X\varphi(f(x))\,\mathrm{d}x\tag{8} $$
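As a numerical sanity check of $(4)$ and $(8)$, here is a minimal sketch in Python. The choices $X=[0,1]$, $f(x)=x^2$, $\varphi=\exp$, and the midpoint rule are assumptions made only for illustration; they are not part of the argument above. For a differentiable $\varphi$ the slope $\Phi=\varphi'(t_0)$ satisfies $(3)$, which is what the code uses.

```python
import math

def midpoint_integral(g, n=10_000):
    """Approximate the integral of g over [0, 1] with the midpoint rule."""
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))

def f(x):
    # Illustrative integrand on X = [0, 1]; its range is [0, 1].
    return x * x

phi = math.exp  # a convex function on the whole real line

# t_0 as in (5): the average of f over X (exact value is 1/3).
t0 = midpoint_integral(f)

# For differentiable phi, Phi = phi'(t_0) is a valid slope in (3).
Phi = math.exp(t0)

# Check the supporting-line inequality (4) at sample points of the range of f.
samples = [k / 100 for k in range(101)]
supporting_line_ok = all(
    (t - t0) * Phi <= phi(t) - phi(t0) + 1e-12 for t in samples
)

# Both sides of Jensen's inequality (8).
lhs = phi(t0)
rhs = midpoint_integral(lambda x: phi(f(x)))

print("supporting line (4) holds on samples:", supporting_line_ok)
print("phi(mean of f) =", lhs)
print("mean of phi(f) =", rhs)
print("Jensen (8) holds:", lhs <= rhs)
```

The gap between the two printed values is the integral of the non-negative slack in $(6)$, so it is zero only when $\varphi$ is affine on the range of $f$ (up to a null set).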


I like this, maybe it is what you want ...

Let $E$ be a separable Banach space, let $\mu$ be a probability measure defined on $E$, let $f : E \to \mathbb R$ be convex and (lower semi-)continuous. Then $$ f\left(\int_E x d\mu(x)\right) \le \int_E f(x)\,d\mu(x) . $$ Of course we assume $\int_E x d\mu(x)$ exists, say for example $\mu$ has bounded support.

For the proof, use Hahn-Banach. Write $y = \int_E x\,d\mu(x)$. The super-graph (epigraph) $S=\{(x,t) : t \ge f(x)\}$ is closed and convex in $E\times\mathbb R$. (Closed, because $f$ is lower semicontinuous; convex, because $f$ is convex.) So for any $\epsilon > 0$, by Hahn-Banach I can separate $(y,f(y)-\epsilon)$ from $S$. That is, there is a continuous linear functional $\phi$ on $E$ and a scalar $s$ so that $t \ge \phi(x)+s$ for all $(x,t) \in S$ and $\phi(y)+s > f(y)-\epsilon$. In particular, $f(x) \ge \phi(x)+s$ for every $x \in E$. Since $\phi$ is continuous and linear it commutes with the integral, and $\mu(E)=1$, so: $$ f(y) -\epsilon < \phi(y)+s = \phi\left(\int_E x\,d\mu(x)\right)+s = \int_E (\phi(x)+s)\,d\mu(x) \le \int_E f(x)\,d\mu(x) . $$ This is true for all $\epsilon > 0$, so we have the conclusion.
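To see the statement in the simplest possible setting, here is a small Python sketch for the finite-dimensional case $E=\mathbb R^2$ with a discrete probability measure. The points, the weights, and the choice $f(x)=\|x\|_2$ are hypothetical values picked for illustration; the integral $\int_E x\,d\mu(x)$ reduces to a weighted average of the points.

```python
import math

# A discrete probability measure on R^2: atoms and their masses (illustrative values).
points = [(1.0, 0.0), (0.0, 2.0), (-3.0, 1.0)]
weights = [0.5, 0.3, 0.2]
assert abs(sum(weights) - 1.0) < 1e-12  # mu must be a probability measure

def f(p):
    """Euclidean norm: convex and continuous on R^2."""
    return math.hypot(p[0], p[1])

# y = integral of x d(mu), i.e. the barycenter of the measure.
y = tuple(sum(w * p[k] for w, p in zip(weights, points)) for k in range(2))

lhs = f(y)                                          # f(integral of x d(mu))
rhs = sum(w * f(p) for w, p in zip(weights, points))  # integral of f d(mu)

print("barycenter y =", y)
print("f(y) =", lhs)
print("integral of f =", rhs)
print("Jensen holds:", lhs <= rhs)
```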



One way would be to apply the finite Jensen's inequality $$\varphi\left(\frac{\sum a_i x_i}{\sum a_j}\right) \le \frac{\sum a_i \varphi (x_i)}{\sum a_j}$$ with weights $a_i \ge 0$, not all zero, to each Riemann sum, and then pass to the limit as the mesh of the partition goes to zero. The finite inequality is itself easily proved by induction on the number of points, using the definition of convexity.
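The Riemann-sum approach can be sketched numerically as follows. This is only an illustration under assumed choices: the interval $[0,1]$, $f(x)=\sin(\pi x)$, $\varphi(t)=t^2$, and midpoint sample points are not from the text above, and the helper name `finite_jensen_sides` is hypothetical. The weights $a_i$ are the subinterval widths and the points $x_i$ are sampled values of $f$.

```python
import math

def finite_jensen_sides(phi, xs, weights):
    """Return both sides of the finite Jensen inequality for points xs and weights a_i."""
    total = sum(weights)
    mean = sum(a * x for a, x in zip(weights, xs)) / total
    lhs = phi(mean)
    rhs = sum(a * phi(x) for a, x in zip(weights, xs)) / total
    return lhs, rhs

def f(x):
    # Illustrative integrand on [0, 1].
    return math.sin(math.pi * x)

def phi(t):
    # A convex function.
    return t * t

results = []
for n in (4, 16, 64, 256):
    h = 1.0 / n
    xs = [f((i + 0.5) * h) for i in range(n)]  # sampled values of f
    weights = [h] * n                          # a_i = width of the i-th subinterval
    lhs, rhs = finite_jensen_sides(phi, xs, weights)
    results.append((n, lhs, rhs))
    print(n, lhs, rhs, lhs <= rhs)
```

As $n$ grows, the two columns approach $\varphi\left(\int_0^1 f\right)=(2/\pi)^2$ and $\int_0^1 \varphi(f)=1/2$ respectively, and the inequality holds at every stage, so it survives in the limit.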