How does quantum superposition make calculation faster?

The operations aren't faster, they're more flexible. From that flexibility comes power.

For example: in a classical computer there is no single-bit operation that, when applied twice, flips a bit. That is, there's no boolean function $f$ such that $f(f(x)) = \overline{x}$: of the four candidates, applying the identity or NOT twice gives back $x$, and the two constant functions stay constant. But quantum computers do have such an operation (often called the square root of NOT), represented by the matrix $M = \frac{1}{2} \begin{bmatrix} 1+i & 1-i \\ 1-i & 1+i \end{bmatrix}$. Note that $M^2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ is the matrix form of the NOT operation, which flips a bit.
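If you'd like to check both claims yourself, here is a minimal sketch in plain Python. It brute-forces the four single-bit boolean functions and then squares $M$ by hand. The helper names (`squares_to_not`, `matmul2`) are made up for this illustration and don't come from any library.

```python
from itertools import product

# Every single-bit boolean function, written as a lookup table (f(0), f(1)).
classical_functions = list(product([0, 1], repeat=2))

def squares_to_not(table):
    """Illustrative helper: True if applying the table twice flips both inputs."""
    return all(table[table[x]] == 1 - x for x in (0, 1))

# Classical candidates whose square is NOT; this comes out empty.
classical_roots = [t for t in classical_functions if squares_to_not(t)]

def matmul2(a, b):
    """Illustrative helper: multiply two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# The matrix M from the text.
M = [[(1 + 1j) / 2, (1 - 1j) / 2],
     [(1 - 1j) / 2, (1 + 1j) / 2]]
M_squared = matmul2(M, M)

print("classical square roots of NOT:", classical_roots)
print("M squared:", M_squared)
```

The list of classical roots is empty, while `M_squared` works out to the bit-flip matrix.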

This flexibility extends to systems of many qubits and creates paths from problems to solutions that aren't available to classical computers. For example, Grover's search algorithm performs unstructured search quadratically faster (roughly $\sqrt{N}$ steps instead of roughly $N$ for $N$ candidates) by setting up a gradual rotation from a starting state to the solution state. That rotation doesn't correspond to any classical operation, so you can only do it with a quantum computer.
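To make the "gradual rotation" a bit more concrete, here is a rough classical simulation of Grover's iteration on a tiny search space. It's only a sketch: it tracks all the amplitudes explicitly, which is exactly what a real quantum computer doesn't have to do. The function name, the 16-item search space, and the marked index 5 are arbitrary choices for illustration.

```python
import math

def grover_success_probabilities(n_items, marked, iterations):
    """Illustrative simulation: probability of measuring `marked` after each step."""
    # Start in the uniform superposition over all candidates.
    amplitudes = [1 / math.sqrt(n_items)] * n_items
    probabilities = [amplitudes[marked] ** 2]
    for _ in range(iterations):
        # Oracle: flip the sign of the marked item's amplitude.
        amplitudes[marked] = -amplitudes[marked]
        # Diffusion: reflect every amplitude about the mean amplitude.
        mean = sum(amplitudes) / n_items
        amplitudes = [2 * mean - a for a in amplitudes]
        probabilities.append(amplitudes[marked] ** 2)
    return probabilities

n_items = 16  # assumed toy search space
iterations = math.floor(math.pi / 4 * math.sqrt(n_items))  # about sqrt(N) steps
probs = grover_success_probabilities(n_items, marked=5, iterations=iterations)

for step, p in enumerate(probs):
    print(f"after {step} iterations: P(marked) = {p:.4f}")
```

Each oracle-plus-diffusion step rotates the state by a fixed angle towards the marked item, so the success probability climbs steadily from $1/N$ to close to 1 over those few iterations.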

The other big example of a quantum-only operation is the Quantum Fourier Transform. Imagine plotting the probabilities of the computer being in various states, one after another, arranged into a line and forming a jagged graph jumping up and down. If you interpret that graph as a sound wave, the QFT will tell you the strongest frequency that's present. Not the strongest frequency in an explicitly stored list of values, but the strongest frequency in the implicit probabilities that your computer is in various states. (Warning: I'm over-simplifying a lot here.) That's pretty weird! And, it turns out, useful.
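Here is a small sketch of that idea, again simulated classically in plain Python. One detail the picture above glosses over: the transform acts on the state's amplitudes (whose squared magnitudes are the probabilities), so that's what the code transforms. The `qft` helper, the 16-state system, and the period-4 pattern with offset 1 are assumptions picked for illustration.

```python
import cmath
import math

def qft(amplitudes):
    """Illustrative helper: the QFT written as an explicit unitary DFT."""
    n = len(amplitudes)
    return [
        sum(a * cmath.exp(2j * math.pi * j * k / n)
            for j, a in enumerate(amplitudes)) / math.sqrt(n)
        for k in range(n)
    ]

n_states = 16  # assumed toy system (4 qubits)
period = 4     # hidden repetition we want to detect
offset = 1     # arbitrary shift of the repeating pattern

# Equal amplitude on states 1, 5, 9, 13 and zero everywhere else.
support = range(offset, n_states, period)
amplitudes = [1 / math.sqrt(len(support)) if j in support else 0.0
              for j in range(n_states)]

# Probabilities of each measurement outcome after the transform.
out_probs = [abs(a) ** 2 for a in qft(amplitudes)]
peaks = [k for k, p in enumerate(out_probs) if p > 1e-9]

print("outcomes with non-zero probability:", peaks)
```

After the transform, the only outcomes with non-zero probability are multiples of `n_states / period`, so measuring reveals information about the period even though the pattern was never stored as an explicit list on the quantum computer.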