Extremal distribution of random variable that averages to a given value

Indeed the second claim is true. In order to prove this, we introduce some notation:

1. We will use binary splitting process to establish the desired fact. This will involve splitting the interval $[0, 1]$ into pieces, and we will track this process using binary trees. More precisely, we will consider the binary trees $\mathsf{T}$ such that

  1. Each node of $\mathsf{T}$ is a tagged interval of the form $(I, a)$, where $a \in I$.

  2. Each internal node $(I, a)$ of $\mathsf{T}$ has exactly two childs $(I_0, a_0)$ and $(I_1, a_1)$, where

    • $I_0 = I \cap (-\infty, a)$ and $I_1 = I \cap [a, \infty)$, and
    • $a_0, a_1 \in I$ such that $a_0 < a < a_1$.

    So, the tagged point $a$ is used to split the interval $I$.

Here is an example of a binary tree:

$$ \small \begin{gathered} ([0, 1], 0.7) \\ \swarrow \hspace{4em} \searrow \\ ([0, 0.7), 0.3) \quad ([0.7, 1], 0.8) \\ \hspace{5.5em} \swarrow \hspace{4em} \searrow \\ \hspace{5.5em} ([0.7, 0.8), 0.72) \quad ([0.8, 1], 0.95) \end{gathered} $$

Since knowing the initial interval (the interval in the root) and all the tagged points in $\mathsf{T}$ is enough to reconstruct the entire $\mathsf{T}$, we will usually abbreviate by omitting the interval part whenever no confusion arises. For instance, the above example can be abbreviated as

$$ \small \begin{gathered} 0.7 \\ \swarrow \quad \searrow \\ 0.3 \qquad 0.8 \\ \hspace{3.25em} \swarrow \quad \searrow \\ \hspace{3.5em} 0.72 \qquad 0.95 \end{gathered} $$

2. Now we recursively define $\nu[\mathsf{T}]$ as follows:

  1. $\nu[a] := \delta_a$.

  2. If $\mathsf{T} = \Bigl( {\scriptsize\begin{gathered} a \\[-5pt] \swarrow \ \searrow \\[-5pt] \mathsf{T}_0 \qquad \mathsf{T}_1 \end{gathered}} \Bigr) $ such that $\mathsf{T}_i$ has root $a_i$ for each $i = 0, 1,$ (in particular, $a_0 < a < a_1$), then

    $$ \nu[\mathsf{T}] = \frac{a_1 - a}{a_1 - a_0} \nu[\mathsf{T}_0] + \frac{a - a_0}{a_1 - a_0} \nu[\mathsf{T}_1]. $$

This notation is related to OP's notation in that, if $x < p < y$, then

$$ \nu \Bigl[ {\scriptsize\begin{gathered} p \\[-5pt] \swarrow \ \searrow \\[-5pt] x \qquad y \end{gathered}} \Bigr] = \lambda^{y}_{x}. $$

Then the following lemma holds:

Lemma. For each binary tree $\mathsf{T}$, there exists a probability measure $\kappa$ on $([0,p)\times(p,1])\cup\{p\}$ such that

$$ \nu[\mathsf{T}](\cdot) = \int_{[0,p)\times(p,1]} \lambda_{x}^{y}(\cdot) \, \kappa(\mathrm{d}x\mathrm{d}y) + \delta_p(\cdot) \, \kappa(\{p\}). $$

This follows by the structural induction together with the following reduction formula

$$ \nu \Biggl[ {\scriptsize\begin{gathered} a \\[-5pt] \swarrow \ \searrow \\[-5pt] a_0 \qquad a_1 \\[-5pt] \swarrow \ \searrow \hspace{3em} \\[-5pt] a_{00} \qquad a_{01} \hspace{3em} \end{gathered}} \hspace{-1.8em} \Biggr] = \alpha \nu \Bigl[ {\scriptsize\begin{gathered} a \\[-5pt] \swarrow \ \searrow \\[-5pt] a_{00} \qquad a_1 \end{gathered}} \Bigr] + (1-\alpha) \nu \Bigl[ {\scriptsize\begin{gathered} a \\[-5pt] \swarrow \ \searrow \\[-5pt] a_{01} \qquad a_1 \end{gathered}} \Bigr], $$

where $\alpha$ is given by

$$ \alpha = \frac{(a_1 - a_{00})(a_{01} - a_0)}{(a_1 - a_0)(a_{01} - a_{00})} \in (0, 1). $$

3. The final ingredient is the observation that any probability measure $\mu \in \mathcal{M}$ on $[0, 1]$ that averages to $p$ can be approximated by $\nu[\mathsf{T}]$. This follows from the binary splitting process (see this, for instance), which is often adopted in the proof of the Skorokhod's embedding theorem.

Here, we only outline the key idea. We will construct a sequence of binary trees $(\mathsf{T}_n)_{n=0}^{N}$ by running the following algorithm:

  • We begin with the binary tree $\mathsf{T}_0$ which consists only of the root $([0, 1], p)$.

  • Suppose that $\mathsf{T}_n$ is defined and satisfies $a = \mu[x \mid x \in I] = \frac{1}{\mu(I)} \int_{I} x \, \mu(\mathrm{d}x)$ for each node $(I, a)$ of $\mathsf{T}_n$.

    1. If $\mu(I) = \mu(\{a\})$ holds for all leaves $(I, a)$ of $\mathsf{T}_n$, we set $N = n$ and halt the algorithm.

    2. Otherwise, pick a leaf $(I, a)$ of $\mathsf{T}_n$ and set

      $$ a_0 = \mu[x \mid x \in I \cap [0, a)] \qquad\text{and}\qquad a_1 = \mu[x \mid x \in I \cap [a, 1]]. $$

      The two assumptions, $a = \mu[x \mid x \in I]$ and $\mu(I) \neq \mu(\{a\})$, shows that both $\mu(I \cap [0, a)) > 0$ and $\mu(I \cap [a, 1]) > 0$ hold true, and so, both $a_0$ and $a_1$ are well-defined. Moreover, with this choice, we have

      $$ \nu \Bigl[ {\scriptsize\begin{gathered} a \\[-5pt] \swarrow \ \searrow \\[-5pt] a_{0} \qquad a_{1} \end{gathered}} \Bigr] = \frac{\mu(I \cap [0, a))}{\mu(I)} \delta_{a_0} + \frac{\mu(I \cap [a, 1])}{\mu(I)} \delta_{a_1}. $$

      Then we set $\mathsf{T}_{n+1}$ as the binary tree obtained by inserting two children $(I \cap [0, a), a_0)$ and $(I \cap [a, 1], a_1)$ under the node $(I, a)$.

  • If the algorithm does not halt in finite time, set $N = \infty$.

Then it can be proved that $\nu[\mathsf{T}_n]$ converges to $\mu$ weakly as $n \to N$. If we denote $\kappa_n$ for the probability measure obtained by applying the lemma to $\nu[\mathsf{T}_n]$, then any limit point of $(\kappa_n)_{n=0}^{N}$ as $n \to N$ will provide the desired probability measure associated with $\mu$.