expected error when flipping bits of binary string

Your ideas proposed in the update are correct. First, we can show

$$\displaystyle\mathbb{E}_{x, x' \in M, d(x, x') = k} |{\pi(x) - \pi(x')}| = \mathbb{E}_{x, x' \in M, d(x, x') = k} |x - x'|$$

where $M = \{0, \dots, 2^n - 1\}$ and $\pi$ is the inverse Gray code i.e. decoding function (actually, both work, but I think your claim flips codeword bits, not plaintext bits). I did it by induction, but it would be nice if there was a slick counting argument:

$$\displaystyle\mathbb{E}_{x, x' \in M, d(x, x') = k} |{\pi(x) - \pi(x')}| = p_{00}\mathbb{E}_{x, x' \in M, d(x, x') = k, x_1 = x'_1 = 0} |\pi(x) - \pi(x')| + p_{01}\mathbb{E}_{x, x' \in M, d(x, x') = k, x_1 \neq x'_1} |\pi(x) - \pi(x')| + p_{11}\mathbb{E}_{x, x' \in M, d(x, x') = k, x_1 = x'_1 = 1} |\pi(x) - \pi(x')|$$

where $p_{00}, p_{01}, p_{11}$ are some constants and $x_1, x'_1$ denote the first bits of $x, x'$. A critical property of the Gray code is that $\pi$ restricts to the Gray code on $\{0, \dots, 2^{n-1} - 1\}$. Thus the induction hypothesis covers the $p_{00}$ term (and similarly the $p_{11}$ term).

For the $p_{01}$ case, WLOG we've picked $x$ from $\{0, \dots, 2^{n-1} - 1\}$ and $x'$ from $\{2^{n-1}, \dots, 2^{n} - 1\}$. This guarantees that $\pi(x')$ is greater than $\pi(x)$, and so conditioning on this case

$$\mathbb{E}|\pi(x') - \pi(x)| = \mathbb{E}[\pi(x')] - \mathbb{E}[\pi(x)] = 2^{n-1} = \mathbb{E}|x' - x|$$

This completes the induction. Now we can straightforwardly compute the expectation

$$\mathbb{E}_{x, x' \in M, d(x, x') = k} |x - x'| = \sum_{i=0}^{n-1}2^i \frac{i \choose k-1}{n \choose k} $$

The probabilistic experiment that samples $|x- x'|$ is equivalent to the following: select $k$ (distinct) indices out of $\{0, 1, \dots, n-1\}$, and call them $i_1 < i_2 < \cdots < i_k$. These are the bits on which $x$ and $x'$ differ. Pick a random sign $s_j \in \{\pm 1\}$ for each, and then output $|s_12^{i_1} + s_22^{i_2} + \cdots + s_k 2^{i_k}|$.

The key observation here is that the sign of $s_12^{i_1} + s_22^{i_2} + \cdots + s_k 2^{i_k}$ is determined by $2^{i_k}$, because the other terms cannot overpower it. So
$$|s_12^{i_1} + s_22^{i_2} + \cdots + s_k 2^{i_k}| = 2^{i_k} + s_12^{i_1} + s_22^{i_2} + \cdots + s_{k-1} 2^{i_{k-1}}$$ Now when we take the expectation, we can use linearity. All terms except the first disappear, and we're left with $$\mathbb{E}_{x, x' \in M, d(x, x') = k} |x - x'| = \mathbb{E}[2^{i_k}]$$ Finally, the probability that $i$ is the maximum of $k$ choices from $\{0, \dots, n-1\}$ is $\binom{i}{k-1} \left/\binom{n}{k}\right.$, as we must pick $k-1$ smaller elements from $\{0, 1, \dots, i-1\}$, a set of size $i$. This shows $$\mathbb{E}_{x, x' \in M, d(x, x') = k} |x - x'| = \sum_{i=0}^{n-1}2^i \frac{i \choose k-1}{n \choose k} $$