Show independence of number of heads and tails

The conditional distribution of $X$ given the event $N=n$ is the particular binomial distribution that you identify. But notice that the support of the marginal (i.e. not conditional on $N$) distribution of the number of heads is $\{0,1,2,3,\ldots\}$. It doesn't just go up to some number $n$ and then stop; it is unbounded, in that there is no integer large enough that the probability of that many heads is $0$. Hence the marginal (unconditional) distribution of $X$ cannot be a binomial distribution, nor can that of $Y$, for the same reason. Notice also that the conditional expectation of $X$ given $N$ is $pN$, so the marginal (unconditional) expectation of $X$ is $p\,\operatorname{E}N = p\lambda$, and that of $Y$ is $(1-p)\lambda$.

We have $$ \Pr(X=x\mid N) = \binom N x p^x(1-p)^{N-x}, $$ so \begin{align} & \Pr(X=x) = \operatorname{E}(\Pr(X=x\mid N)) = \operatorname{E}\left(\binom N x p^x(1-p)^{N-x} \right) \\[10pt] = {} & \sum_{n=x}^\infty \binom n x p^x(1-p)^{n-x}\cdot\Pr(N=n) \\[10pt] = {} & \sum_{n=x}^\infty \frac{n!}{x!(n-x)!} p^x(1-p)^{n-x}\cdot\frac{\lambda^n e^{-\lambda}}{n!} \tag a\\[10pt] = {} & \sum_{n=x}^\infty \underbrace{\frac{(p\lambda)^x e^{-p\lambda}}{x!}}\cdot\frac{((1-p)\lambda )^{n-x} e^{-(1-p)\lambda}}{(n-x)!}. \tag b \end{align} Now a crucial fact is needed: The part over the $\underbrace{\text{underbrace}}$ does not depend on $n$, so it can be pulled out: $$ \frac{(p\lambda)^x e^{-p\lambda}}{x!} \sum_{n=x}^\infty \frac{((1-p)\lambda)^{n-x}e^{-(1-p)\lambda}}{(n-x)!}. $$ This sum is $1$ since it's the sum of probabilities assigned by a certain discrete probability distribution: the Poisson distribution with expected value $(1-p)\lambda$. Thus we have $$ \Pr(X=x) = \frac{(p\lambda)^xe^{-p\lambda}}{x!}\cdot1. $$ In other words $X\sim\mathrm{Poisson}(p\lambda)$. In the same way, one can show that $Y\sim\mathrm{Poisson}((1-p)\lambda)$.
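This is not part of the proof, but if one wants a quick numerical sanity check of the conclusion $X\sim\mathrm{Poisson}(p\lambda)$, a small simulation sketch along the following lines works (the values $\lambda=4$, $p=0.3$ and the number of repetitions are arbitrary choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
lam, p, reps = 4.0, 0.3, 200_000

# Two-stage experiment: draw N ~ Poisson(lam), then X | N ~ Binomial(N, p).
N = rng.poisson(lam, size=reps)
X = rng.binomial(N, p)

# Compare the empirical distribution of X with Poisson(p * lam).
for x in range(8):
    empirical = np.mean(X == x)
    theoretical = stats.poisson.pmf(x, p * lam)
    print(f"P(X={x}): empirical {empirical:.4f}, Poisson(p*lam) {theoretical:.4f}")
```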

Now let us think about $\Pr(X=x\ \&\ Y=y)$ and see if it's the same as $\Pr(X=x)\cdot\Pr(Y=y)$. \begin{align} \Pr(X=x\ \&\ Y=y) & = \operatorname{E}(\Pr(X=x\ \&\ Y=y\mid N)) \\[10pt] & = \sum_{n=0}^\infty \underbrace{\Pr(X=x\ \&\ Y=y\mid N=n)}\cdot\Pr(N=n). \end{align} Now the part over the underbrace is $0$ except when $n=x+y$, so the whole sum is just one of the terms: $$ \Pr(X=x\ \&\ Y=y) = \Pr(X=x\ \&\ Y=y\mid N=x+y)\Pr(N=x+y). $$ Now notice that this is exactly the same as $$ \Pr(X=x\mid N=x+y)\Pr(N=x+y), \tag 1 $$ since, given $N=x+y$, the event $Y=y$ occurs precisely when $X=x$. (In other words, $X$ and $Y$ are very, very far from independent when one conditions on $N=n$ --- so far from independent that knowing $X$ tells you $Y$.)

Since $X\mid (N=x+y)\sim\mathrm{Binomial}(x+y,p)$ and $N\sim\mathrm{Poisson}(\lambda)$, the value of $(1)$ is $$ \binom{x+y}x p^x (1-p)^y \frac{\lambda^{x+y}e^{-\lambda}}{(x+y)!}. $$ The $(x+y)!$ in the numerator and denominator cancels and this whole thing becomes $$ \frac{(p\lambda)^x e^{-p\lambda}}{x!} \cdot \frac{((1-p)\lambda)^y e^{-(1-p)\lambda}}{y!} = \Pr(X=x)\Pr(Y=y). $$
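For readers who like to double-check algebra numerically, here is a small sketch that verifies this factorization on a grid of $(x,y)$ values (with arbitrary $\lambda$ and $p$; it is of course no substitute for the derivation above):

```python
import numpy as np
from scipy import stats

lam, p = 2.5, 0.4

# Joint pmf via (1): Pr(X=x | N=x+y) * Pr(N=x+y).
def joint(x, y):
    return stats.binom.pmf(x, x + y, p) * stats.poisson.pmf(x + y, lam)

# Product of the two Poisson marginals derived above.
def product_of_marginals(x, y):
    return stats.poisson.pmf(x, p * lam) * stats.poisson.pmf(y, (1 - p) * lam)

for x in range(6):
    for y in range(6):
        assert np.isclose(joint(x, y), product_of_marginals(x, y))
print("joint pmf == product of Poisson marginals on the checked grid")
```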


The problem can also be solved without the use of expectations, which at this point in the book (chapter 2 of [1]) have yet to be introduced (they only appear in chapter 3).

As you correctly identified, we are going to find $f_{X}(x)$, $f_{Y}(y)$ and $f_{X, Y}(x, y)$, and show that $$f_{X}(x)f_{Y}(y) = f_{X, Y}(x, y)$$ for all $x$, $y$. Should there be any doubt as to why this is equivalent to showing that $X$ and $Y$ are independent, one can consult Theorem 2.30, p. 35 in [1].

From the definition of the probability mass function, $f_X(x) = \Pr(X = x)$, $f_Y(y) = \Pr(Y = y)$ and $f_{X,Y}(x, y) = \Pr(X=x, Y=y)$, so we can focus our attention on finding these three probabilities, to establish:

$$\Pr(X = x)\Pr(Y = y) = \Pr(X=x, Y=y)$$

We are going to rely on a few key tricks:

  1. The law of total probability (Theorem 1.16, p. 12 [1]): $$\Pr(B)=\sum_{i=1}^{k}\Pr(B \mid A_{i})\Pr(A_{i})$$ where $A_{1}, \ldots, A_{k}$ is a partition of the sample space, and $B$ is any event.
  2. $\sum_{k=0}^{+\infty}\frac{x^k}{k!}=e^{x}$, which is well-known.
  3. $\Pr(X = x, N = n) = \Pr(X = x, Y = n - x)$, which comes from the fact that $X + Y = N$.
  4. Not really a trick, but from the definition of conditional probability, $\Pr(A \mid B)\Pr(B) = \Pr(AB)$.

Using trick 1 we can write down: $$\Pr(X = x) = \Pr(X = x \mid N < x)\Pr(N < x) + \sum_{k=0}^{+\infty}\Pr(X=x \mid N = x + k)\Pr(N = x + k)$$

The first term is $0$, since $x$ heads can never be observed in fewer than $x$ flips, i.e. $\Pr(X = x \mid N < x) = 0$. Let us compute the second part.

Using the definitions of the Binomial and Poisson distributions, we can write the following:

$$\Pr(X=x \mid N = x + k) = {x + k \choose x} p^x(1-p)^k$$

$$\Pr(N=x + k) = e^{-\lambda} \frac {\lambda^{x + k}}{(x + k)!}$$

Multiplying these and noting that ${x + k \choose x}\frac{1}{(x + k)!} = \frac{1}{x!\,k!}$, the sum becomes

$$\sum_{k = 0}^{+\infty}\frac{p^{x}(1-p)^{k}e^{-\lambda}\lambda^{x + k}}{x!\,k!} = e^{-\lambda}\frac{\lambda^x}{x!}p^x\sum_{k = 0}^{+\infty}\frac{(\lambda(1-p))^{k}}{k!}$$

The time has come to apply trick 2, which yields:

$$\Pr(X = x) = p^{x}\frac{\lambda^{x}}{x!}e^{-\lambda p}$$
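(If one wants to double-check this closed form, a truncated version of the sum from trick 1 can be compared against it numerically; the values $\lambda = 2.5$, $p = 0.4$, $x = 3$ below are arbitrary, and 100 terms is far more than enough for convergence.)

```python
import math

lam, p, x = 2.5, 0.4, 3

# Truncated sum over k of Pr(X = x | N = x + k) * Pr(N = x + k).
series = sum(
    math.comb(x + k, x) * p**x * (1 - p)**k * math.exp(-lam) * lam**(x + k) / math.factorial(x + k)
    for k in range(100)
)

# Closed form obtained after applying trick 2.
closed_form = p**x * lam**x / math.factorial(x) * math.exp(-lam * p)

print(series, closed_form)  # both are about 0.0613
```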

By symmetry, we can argue that: $$\Pr(Y = y) = (1-p)^{y}\frac{\lambda^{y}}{y!}e^{-\lambda(1-p)}$$

We're half-way done, but the "crux" of the problem lies in the computation of $\Pr(X = x, Y = y)$. We can apply trick 3.

$$\Pr(X = x, Y = n - x) = \Pr(X = x, N = n)$$

To compute the right side, we can use trick 4. The left side is almost what we desire, except we will have to substitute $y = n - x$ on both sides after we're done with all the computations. So

$$\Pr(X = x, Y = n - x) = \Pr(X = x \mid N = n)\Pr(N = n)$$

$$\Pr(X = x, Y = n - x) = \mathrm{Binomial}(n, p)(x)\cdot\mathrm{Poisson}(\lambda)(n)$$

Expanding the binomial coefficient and cancelling $n!$, we get

$$\Pr(X = x, Y = n - x) = \frac{p^{x}(1-p)^{n-x}e^{-\lambda}\lambda^{x + (n-x)}}{x!(n-x)!}$$

Applying the substitution, we have

$$\Pr(X = x, Y = y) = \frac{p^{x}(1-p)^{y}e^{-\lambda}\lambda^{x + y}}{x!y!}$$

This, combined with our previous results for $\Pr(X = x)$ and $\Pr(Y = y)$, gives $\Pr(X = x)\Pr(Y = y) = \Pr(X=x, Y=y)$, as required.

As other answers have pointed out, this is vaguely amusing: having $N$ fixed makes $X$ and $Y$ as dependent as variables can possibly be, since knowing $X$ determines $Y$ exactly. But somehow, someway, setting $N \sim \mathrm{Poisson}(\lambda)$ adds just enough randomness to $N$ that learning $X$ tells us nothing about $Y$ that we wouldn't have known before we learned $X$ ($X$ and $Y$ are independent).
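One can also see this in a simulation (again just a sketch with arbitrary $\lambda$ and $p$): among the runs where $X = x$, the distribution of $Y$ looks the same for every $x$, and it matches the unconditional distribution of $Y$.

```python
import numpy as np

rng = np.random.default_rng(1)
lam, p, reps = 4.0, 0.3, 500_000

N = rng.poisson(lam, size=reps)
X = rng.binomial(N, p)   # heads
Y = N - X                # tails

# If X and Y are independent, E[Y | X = x] should not depend on x
# (all of these should be roughly (1 - p) * lam = 2.8).
print("unconditional E[Y]:", Y.mean())
for x in range(4):
    print(f"E[Y | X={x}]:", Y[X == x].mean())
```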

The bizarre interplay between $\mathrm{Poisson}$ and $\mathrm{Binomial}$ continues; for additional practice, see for example the very similar exercise 2.16, p. 45 in [1].

[1] Larry Wasserman, All of Statistics, 2004, corrected second printing, ISBN 0-387-40272-1.