Why is a wave pulse a superposition of sine waves?

In this case it's probably best to be pragmatic. A pulse can be described as a superposition of sine waves that extend infinitely in space and time. But it's just that: a mathematical description that is useful for your purposes, with no physical meaning necessarily attached to the individual sine waves. The wave description of phenomena has turned out to be incredibly successful, in quantum mechanics for example, but it can often lead to confusion or counter-intuitive results.

If you model a short laser pulse as a superposition of infinitely many sine waves, you also run into problems other than the ones you described. For example, you will discover that the phase velocity (the velocity of the individual sine wave trains) can be higher than the speed of light, in a waveguide or a plasma for instance. This appears to clash with special relativity, which prohibits faster-than-light propagation of information.

Again, the solution to this apparent paradox is that the individual sine waves do not carry physical meaning, and in particular they carry no information. The group velocity (the velocity of the pulse itself, constructed from these sine waves) is what describes how the pulse travels, and in ordinary media it stays at or below the speed of light.
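To make the distinction concrete, here is a minimal sketch. It assumes a waveguide- or plasma-like dispersion relation, omega^2 = omega_c^2 + c^2 k^2, which is not taken from the answer above; the cutoff frequency and wavenumber are arbitrary illustrative values. Under that assumption the phase velocity omega/k comes out above c while the group velocity d(omega)/dk stays below it.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum, m/s

def omega(k, omega_c):
    """Assumed dispersion relation: omega^2 = omega_c^2 + (c k)^2."""
    return np.sqrt(omega_c**2 + (C * k) ** 2)

def phase_velocity(k, omega_c):
    """Velocity of a single sine wave train: omega / k."""
    return omega(k, omega_c) / k

def group_velocity(k, omega_c):
    """Velocity of the pulse envelope: d(omega)/dk = c^2 k / omega."""
    return C**2 * k / omega(k, omega_c)

# Illustrative (hypothetical) numbers: 10 GHz cutoff, k = 300 rad/m.
omega_c = 2 * np.pi * 1e10
k = 300.0

v_p = phase_velocity(k, omega_c)
v_g = group_velocity(k, omega_c)

print(f"phase velocity / c = {v_p / C:.3f}")
print(f"group velocity / c = {v_g / C:.3f}")
```

For this particular dispersion relation the product of the two velocities equals c^2, so a phase velocity above c goes hand in hand with a group velocity below c.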

I just wanted to add to a previous (very accurate) answer: you can think of it as a Fourier expansion of the actual (physical) wave profile. It is not a real-life process; it is a mathematical description. The wave pulse can be thought of as a superposition of plane waves that interfere destructively everywhere in space except in one localized region: the location of the pulse.
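The destructive interference described above can be checked numerically. The following is only a sketch: the Gaussian weighting of the wavenumbers, the centre wavenumber `k0`, the spread `sigma_k`, and the number of components are all assumptions chosen for illustration, and a finite sum stands in for the Fourier integral.

```python
import numpy as np

def pulse_from_plane_waves(x, k0=20.0, sigma_k=2.0, n_waves=401):
    """Sum cosines cos(k x) with Gaussian weights centred on k0.

    Each cosine extends over all of x; only their sum is localized.
    """
    k = np.linspace(k0 - 6 * sigma_k, k0 + 6 * sigma_k, n_waves)
    weights = np.exp(-0.5 * ((k - k0) / sigma_k) ** 2)
    weights /= weights.sum()           # normalize so the peak is about 1
    waves = np.cos(np.outer(k, x))     # one row per plane-wave component
    return weights @ waves

x = np.linspace(-5.0, 5.0, 2001)
pulse = pulse_from_plane_waves(x)

# Largest amplitude near the centre versus far away from it.
peak_centre = np.abs(pulse[np.abs(x) < 0.5]).max()
peak_far = np.abs(pulse[np.abs(x) > 3.0]).max()

print(f"max |pulse| near x = 0: {peak_centre:.3f}")
print(f"max |pulse| for |x| > 3: {peak_far:.2e}")
```

Every component has amplitude of order one everywhere, yet away from the centre the sum is many orders of magnitude smaller than at the peak. Because the sum is finite and evenly spaced in k, the pulse would eventually repeat far outside the plotted window; a true Fourier integral has no such repetition.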

This is a much more subtle problem than any of the comments indicate. The problem is not just how a sum of non-causal signals can approximate a causal one. It is also this: all real-life signals must start and stop at some time, and they must also be band-limited, with nothing beyond some frequency, yet as we know these two requirements are contradictory (Paley-Wiener). The resolution of this paradox, along with the present question, is in Slepian, "On Bandwidth", Proc. IEEE, vol. 64, no. 3, pp. 292-300, 1976. In short, the answer lies in a dual approximation valid in both the time and frequency domains, and in what we may call a legitimate mathematical model of a physical signal. Slepian's article is very readable and can be understood without much mathematical baggage.