Non-zero Conditional Differential Entropy between a random variable and a function of it

The paradox can be stated in a simpler form:

We know that $I(X;Y)=h(X)-h(X|Y)\ge 0$ holds for continuous variables as well. Take the particular case $Y=X$; then the second term vanishes ($h(X|X)=0$) and we get

$$ I(X;X)= h(X)-h(X|X)=h(X) \ge 0$$

But this cannot be right: differential entropy can be negative. So what went wrong?
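To make the contradiction concrete, here is a minimal numerical check (my own sketch using `scipy.stats`, not part of the original argument): the differential entropy of a uniform variable on $[0,a]$ is $\log a$, which is negative whenever $a<1$, so the chain above would claim a negative mutual information.

```python
from scipy import stats
import numpy as np

# Differential entropy of Uniform[0, a] is log(a) nats,
# which is negative whenever a < 1.
for a in (2.0, 1.0, 0.5):
    h = float(stats.uniform(loc=0.0, scale=a).entropy())
    print(f"Uniform[0, {a}]: h(X) = {h:+.4f}   (log a = {np.log(a):+.4f})")
```

If $I(X;X)=h(X)$ were correct, the last case would give $I(X;X)=-\log 2<0$.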

When dealing with continuous random variables where one is a function of the other, their conditional differential entropy may be non-zero (whether it is positive or negative does not matter), which is not intuitive at all.

Your problem (and the problem with the above paradox) is the implicit assumption that the idea "zero entropy means no uncertainty" also applies to differential entropies. That's false. It's false that $h(g(X)|X)=0$, it's false that $h(X|X)=0$, and it's false that $h(X)=0$ implies zero uncertainty. The fact that (by a mere change of scale) a differential entropy can become negative suggests by itself that here zero entropy (conditional or not) has no special meaning. In particular, a uniform variable on $[0,1]$ has $h(X)=0$.
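As a concrete check of the scale remark (a standard computation, not spelled out in the answer): if $X$ is uniform on $[0,a]$, then

$$ h(X) = -\int_0^a \frac{1}{a}\log\frac{1}{a}\,dx = \log a, $$

which is negative for $a<1$, zero exactly at $a=1$, and positive for $a>1$. More generally, $h(aX)=h(X)+\log|a|$, so the sign of a differential entropy can be flipped by a mere change of units.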


Differential entropy is only defined for random variables whose distribution is absolutely continuous with respect to Lebesgue measure. The joint distribution of

$$(X,g(X))$$

is not absolutely continuous because its support is

$$\{(x,g(x)):x\in\mathbb{R}\},$$

a measure 0 subset of $\mathbb{R}^2$.

The definition of conditional entropy requires an absolutely continuous joint distribution with density $f(x,y)$:

\begin{align}
h(X|Y) &= -\int f(x,y) \log f(x|y)\ dx\,dy \\
&= h(X,Y)-h(Y).
\end{align}

If $(X,Y)$ is not absolutely continuous with respect to Lebesgue measure on $\mathbb{R}^2$, then there is no function $f(x,y)$ such that the probability measure of $(X,Y)$ can be expressed as an integral against $f$.
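To see why this rules out the case at hand (a detail not spelled out above): for $Y=X$, the conditional law of $X$ given $X=x$ is a point mass at $x$, which has no density with respect to Lebesgue measure, so the conditional density $f(x|y)$ appearing in the integrand does not exist.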

Therefore neither the quantity $h(g(X)|X)$ nor the quantity $I(X;g(X))$ is well defined in terms of differential entropies.
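As a numerical illustration of why no finite value can stand in for $I(X;g(X))$ (my own sketch, not part of the original answers): discretizing $X$ into bins of width $\Delta$ gives $I(X_\Delta;X_\Delta)=H(X_\Delta)\approx h(X)-\log\Delta$, which diverges as $\Delta\to 0$.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200_000)  # X ~ Uniform[0, 1], so h(X) = 0

# Discretize X into bins of width D.  For the discretized variable X_D,
# I(X_D; X_D) = H(X_D), and H(X_D) ~ h(X) - log(D) blows up as D -> 0.
for width in (0.1, 0.01, 0.001):
    counts, _ = np.histogram(x, bins=round(1.0 / width), range=(0.0, 1.0))
    p = counts[counts > 0] / counts.sum()
    H = float(-np.sum(p * np.log(p)))  # discrete entropy in nats
    print(f"bin width {width}: H(X_D) = {H:.3f}   vs  -log(D) = {-np.log(width):.3f}")
```

The estimate grows without bound as the bins shrink, consistent with the fact that the naive identity $I(X;X)=h(X)$ cannot hold.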