Characterization of the Poisson law

We can't expect a completely finite way for the Poisson distribution to arise, since the number $e$ must come from somewhere. On the other hand, it should definitely not be necessary to introduce Stirling's formula.

I think the most natural approach is to define Poisson($\lambda$) as the limit, as $N \to \infty$, of the distribution of the number of heads in a sequence of $N$ independent flips of a biased coin with probability $\lambda/N$ of heads.

This must be accompanied by some derivation showing that there is such a limit, and leading to the formula you state. Such a derivation will use binomial coefficients and the definition of the number $e$, but not much more.
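
A sketch of that derivation, assuming the formula in question is the usual $\lambda^k e^{-\lambda}/k!$, and writing $S_N$ (my notation) for the number of heads in $N$ flips. For fixed $k$,

$$P(S_N = k) = \binom{N}{k}\left(\frac{\lambda}{N}\right)^{k}\left(1-\frac{\lambda}{N}\right)^{N-k} = \frac{\lambda^k}{k!}\cdot\frac{N(N-1)\cdots(N-k+1)}{N^k}\cdot\left(1-\frac{\lambda}{N}\right)^{N}\cdot\left(1-\frac{\lambda}{N}\right)^{-k}.$$

As $N \to \infty$ with $k$ fixed, the second and fourth factors tend to $1$, and the third tends to $e^{-\lambda}$, which is essentially the definition of $e$. Hence

$$\lim_{N\to\infty} P(S_N = k) = \frac{\lambda^k e^{-\lambda}}{k!}.$$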

But even without the derivation, if we just assume that the limit exists, this definition shows why the sum of two independent Poisson variables is again Poisson.
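
One way to spell this out (a sketch, and it assumes slightly more than stated: that the limit also exists when the heads probability is $p_N$ with $N p_N \to \lambda$, not exactly $\lambda/N$). At each of $N$ trials flip two independent coins, with heads probabilities $\lambda/N$ and $\mu/N$. Let $X_N$ and $Y_N$ be the two head counts, and let $Z_N$ be the number of trials on which at least one coin shows heads. Then $Z_N$ is binomial with parameters $N$ and

$$p_N = \frac{\lambda}{N} + \frac{\mu}{N} - \frac{\lambda\mu}{N^2}, \qquad N p_N \to \lambda + \mu,$$

so $Z_N$ converges in law to Poisson($\lambda+\mu$). On the other hand, $X_N + Y_N - Z_N$ is the number of trials on which both coins show heads, and

$$P(X_N + Y_N \ne Z_N) \le E[X_N + Y_N - Z_N] = N\cdot\frac{\lambda\mu}{N^2} = \frac{\lambda\mu}{N} \to 0.$$

Since $X_N$ and $Y_N$ are independent and converge in law to Poisson($\lambda$) and Poisson($\mu$), their sum converges to the sum of two independent Poisson variables, and this must have the same law as the limit of $Z_N$.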


Related to camomille's answer, I also like to think about Poisson processes, but in another way.

The exponential distribution is perhaps not too hard to motivate, due to its memoryless property. If you have "arrivals" that are going to happen "unexpectedly", then it's reasonable to guess that the inter-arrival times should be iid exponential with some rate $\lambda$. If you ask for $N(t)$, the number of arrivals by time $t$, you find that it's Poisson with parameter $\lambda t$.
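
If you want to check the last claim numerically before proving it, here is a minimal simulation sketch. It uses only the Python standard library; the function names and the default values of `trials` and `seed` are my own choices for illustration.

```python
import math
import random


def count_arrivals(rate, t, rng):
    """Count arrivals in (0, t] when inter-arrival times are iid Exp(rate)."""
    count = 0
    clock = rng.expovariate(rate)  # time of the first arrival
    while clock <= t:
        count += 1
        clock += rng.expovariate(rate)  # add the next inter-arrival time
    return count


def empirical_pmf(rate, t, trials=20000, seed=0):
    """Estimate P(N(t) = k) from `trials` independent simulated runs."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(trials):
        k = count_arrivals(rate, t, rng)
        counts[k] = counts.get(k, 0) + 1
    return {k: c / trials for k, c in sorted(counts.items())}


def poisson_pmf(k, mean):
    """Poisson probability mean^k * exp(-mean) / k!."""
    return mean ** k * math.exp(-mean) / math.factorial(k)


if __name__ == "__main__":
    rate, t = 2.0, 1.5
    # Columns: k, simulated frequency, Poisson(rate * t) probability.
    for k, freq in empirical_pmf(rate, t).items():
        print(k, round(freq, 4), round(poisson_pmf(k, rate * t), 4))
```

The two printed columns should agree up to sampling noise, which is of course evidence rather than a proof.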

This also plays nicely into the sum of independent Poissons being Poisson: if you have exponential arrivals of two types occurring independently with rates $\lambda, \mu$, it's easy to see that the time until the next arrival (of either type) is also exponential, with rate $\lambda + \mu$. Thus the number of arrivals of either type by time $1$ (i.e. the superposition of the two Poisson processes) is on the one hand the sum of independent Poissons with parameters $\lambda, \mu$, and on the other hand Poisson with parameter $\lambda + \mu$.
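
The "easy to see" step, written out, with $X$ and $Y$ (my notation) the independent waiting times until the next arrival of each type:

$$P(\min(X,Y) > t) = P(X > t)\,P(Y > t) = e^{-\lambda t}\,e^{-\mu t} = e^{-(\lambda+\mu)t}.$$

To conclude that the superposition is itself a Poisson process, one also needs the successive inter-arrival times of the merged stream to be independent; this follows from the memoryless property, since after each arrival both waiting times start afresh.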


What about the characterization of Poisson point processes?

Let us consider a counting process $(N(t))_{t \ge 0}$. That is, $N(0)=0$, $N(t)$ increases only by jumps of height $1$, and is right-continuous. You can see $N(t)$ as the number of points of a random set in $]0,t]$.

Then $(N(t))_{t \ge 0}$ is a homogeneous Poisson point process if and only if:

1) the increments over disjoint intervals are independent

2) the increments are stationary: $N(t+s)-N(t)$ has the same law as $N(s)$.

(Maybe there is a further regularity assumption).

This implies that the increments are Poisson distributed: there exists $\lambda \ge 0$ such that $N(s)$ is distributed according to Poisson with parameter $s\lambda$ for all $s$. This shows that the Poisson distribution appears under seemingly very general conditions.
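
A sketch of why, not a complete proof. Put $p(s) = P(N(s) = 0)$. By 1) and 2),

$$p(s+t) = P\big(N(s) = 0,\ N(s+t) - N(s) = 0\big) = p(s)\,p(t),$$

and $p$ is non-increasing with $p(s) \to 1$ as $s \downarrow 0$ by right-continuity, so $p(s) = e^{-\lambda s}$ for some $\lambda \ge 0$. Now cut $]0,s]$ into $n$ intervals of length $s/n$ and let $M_n$ (my notation) be the number of those intervals containing at least one jump. By 1) and 2), $M_n$ is binomial with parameters $n$ and $1 - e^{-\lambda s/n}$, and

$$n\left(1 - e^{-\lambda s/n}\right) \to \lambda s \quad (n \to \infty).$$

Since there are finitely many jumps in $]0,s]$, each of height $1$, they eventually fall into distinct intervals, so $M_n \to N(s)$ almost surely. The binomial-to-Poisson limit then gives that $N(s)$ is Poisson with parameter $\lambda s$.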

You can also see this from a more geometrical point of view, by considering point processes on more general spaces than the line.