Why does more bandwidth guarantee a higher bit rate?

The simplest explanation is found in Shannon's equation:

$$C = B\log_2(1+S/N)$$

where:

- C = channel capacity in bits/second
- B = channel bandwidth in hertz
- S = signal power in watts
- N = noise power in watts

This equation gives the maximum channel capacity (C), that is, the maximum data rate, as a function of channel bandwidth (B) and signal-to-noise ratio (S/N). The bandwidth sets the limit on how many symbols per second can be sent; the signal-to-noise ratio sets the limit on how many bits each symbol can carry.

If you picture the signal as a square wave, it is clear that a higher bandwidth allows higher-frequency square waves to be transmitted. Similarly, a higher signal-to-noise ratio allows more bits per symbol because more amplitude levels can be discriminated at the receiver.

You can increase the data rate without increasing bandwidth by increasing transmitter power, because that improves the signal-to-noise ratio which, by Shannon's equation, increases the channel capacity. However, as the equation also shows, capacity grows only logarithmically with S/N but linearly with bandwidth. Thus, for the same transmitter power, the channel with the higher bandwidth will have the higher channel capacity.
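As a rough numeric illustration (the channel numbers here are made up, not from the text above), plugging values into Shannon's equation shows how differently capacity scales with power versus bandwidth:

```python
import math

def shannon_capacity(bandwidth_hz, snr_linear):
    """Maximum error-free data rate (bits/s) of an AWGN channel: C = B*log2(1+S/N)."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# Hypothetical channel: 1 MHz of bandwidth at 30 dB SNR.
snr = 10 ** (30 / 10)                     # 30 dB -> linear ratio of 1000
c1 = shannon_capacity(1e6, snr)           # ~9.97 Mbit/s

# Doubling transmit power (+3 dB SNR) only nudges capacity upward...
c2 = shannon_capacity(1e6, 2 * snr)       # ~10.97 Mbit/s

# ...while doubling bandwidth (at the same SNR) doubles it.
c3 = shannon_capacity(2e6, snr)           # ~19.93 Mbit/s

print(f"1 MHz @ 30 dB: {c1/1e6:.2f} Mbit/s")
print(f"1 MHz @ 33 dB: {c2/1e6:.2f} Mbit/s")
print(f"2 MHz @ 30 dB: {c3/1e6:.2f} Mbit/s")
```

This is why, past a certain point, adding bandwidth beats adding power.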


The fundamental reason can be loosely stated as "more bandwidth means the sooner you can be surprised", and only surprises can carry data. For baseband signals this is fairly obvious: a higher bandwidth means a faster rise time, which means the signal can take on a new value sooner. The same is true of carrier-modulated signals, however. If you have an unmodulated (CW) carrier at 5.6 GHz, the signal is oscillating very quickly, but since the bandwidth is low, you can predict what it is going to be for a long period of time. Anything that deviates from that expected value, whether a change in amplitude, phase, or frequency, increases the bandwidth. The faster it diverges from the "predicted" oscillation, the higher the bandwidth.


Rephrasing what others have answered in formal ways, look at it this way:

Information can only be transmitted via the change of some state ("surprises" in @Evan's terms). A zero-bandwidth (constant amplitude and frequency) sine wave does not convey any information, it is just there.

Now, every time a (sinusoidal) signal of frequency f changes, be it in amplitude, phase, or both, the resulting signal around the moment of the change can no longer be of frequency f; otherwise the signal would not change at all. So any change of a signal away from a continuous sine wave (temporarily) generates a frequency or frequencies somewhat above and/or below the original frequency f.

The difference between the temporary frequency (or frequencies) and the base frequency f, delta-f, determines how fast and how large the change can be (rate of change), and vice versa. A quick change generates/requires greater frequency deviations than a slow one. In theory, if you had a constant sinusoidal signal and switched it off (0% amplitude) instantly, i.e. with zero time spent passing from one state (100% amplitude) to the other (0%), this would create/require infinitely high frequencies. That is why it is impossible to modify a given signal at arbitrarily high speeds.
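You can see this numerically with a small FFT experiment (a sketch with arbitrary sample rate, carrier frequency, and ramp times of my choosing): switch a carrier off with a slow ramp versus a near-instant ramp, and compare how much energy ends up away from the carrier frequency.

```python
import numpy as np

fs = 100_000                     # sample rate in Hz (illustrative)
f0 = 5_000                       # carrier frequency in Hz (illustrative)
t = np.arange(0, 0.1, 1 / fs)    # 0.1 s of signal
carrier = np.sin(2 * np.pi * f0 * t)

def out_of_band_fraction(ramp_time, guard_hz=200):
    """Switch the carrier off around t = 50 ms with a linear ramp of the
    given duration; return the fraction of signal energy that lands more
    than guard_hz away from the carrier frequency f0."""
    envelope = np.clip((0.05 + ramp_time / 2 - t) / ramp_time, 0.0, 1.0)
    spectrum = np.abs(np.fft.rfft(carrier * envelope)) ** 2
    freqs = np.fft.rfftfreq(len(t), 1 / fs)
    return spectrum[np.abs(freqs - f0) > guard_hz].sum() / spectrum.sum()

slow = out_of_band_fraction(10e-3)    # gentle 10 ms ramp-down
fast = out_of_band_fraction(0.1e-3)   # near-instant 0.1 ms ramp-down
# The abrupt switch-off spills far more energy away from the carrier,
# i.e. the faster state change occupies more bandwidth.
print(slow, fast)
```

Pushing the ramp time toward zero pushes ever more energy into ever wider sidebands, which is the discrete version of the "infinitely high frequencies" argument above.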

Picking up from above, each single change of the signal can be used to convey some information, be it one bit, less, or more. To pack more information into a single change (more bits), you need larger changes: for example, you must be able to discern 4 states (2 bits, range 0-3) instead of 2 (1 bit, range 0-1). Larger changes cause/require greater delta-f's. If you instead want to transmit more changes per second, the time allowed for each change to take effect (before the next change is modulated) is reduced. Thus you again get a greater delta-f, because you must make sure the changes become visible more quickly.
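To make the "more bits per change" trade-off concrete, here is a small sketch of my own (evenly spaced PAM-style amplitude levels, not tied to any particular standard): each extra bit per symbol doubles the number of levels and shrinks the spacing between them, which is exactly what eats into the receiver's ability to tell the states apart.

```python
import numpy as np

def pam_levels(bits_per_symbol):
    """Evenly spaced amplitude levels in [-1, 1] for 2**bits states."""
    return np.linspace(-1.0, 1.0, 2 ** bits_per_symbol)

for bits in (1, 2, 3):
    levels = pam_levels(bits)
    spacing = levels[1] - levels[0]
    print(f"{bits} bit(s)/symbol: {len(levels)} levels, spacing {spacing:.3f}")
# 1 bit(s)/symbol: 2 levels, spacing 2.000
# 2 bit(s)/symbol: 4 levels, spacing 0.667
# 3 bit(s)/symbol: 8 levels, spacing 0.286
```

With a fixed amplitude range, closer-spaced levels need a higher S/N to be discriminated, which is the other half of Shannon's equation at work.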

Example: if I were to transmit 1 bit per second, I could limit myself to really low frequencies, because I will probably be fine if each bit sent takes 0.5 seconds to reach the corresponding signal state at the receiving end. A bandwidth of 1-2 Hz may be sufficient. Sending 100 bits per second cannot work if each bit takes 0.5 seconds to become visible at the receiver: during that time, 50 other bits are also modulated onto the signal, so after 0.5 s the receiver would see some kind of average of the 50 bits sent. There would be no way to reconstruct the individual bits. That is why I need more bandwidth: it allows greater delta-f's, which lets the signal at the receiving side change its state more quickly.

So whatever you do to get more information per second transmitted, you will have to provide more bandwidth, because more signal change(s) per second must be visible.

(This is all assuming the same required SNR margin. By reducing the SNR margin one may squeeze some more information onto a signal of a given bandwidth.)

To visualize the relationship between bandwidth and rate of change, you can take or simulate e.g. a simple low-pass filter. Look at what happens at the filter's output when a given (sinusoidal) input signal is "instantly" turned on or off: the output responds only slowly to the quick change. If you modulate the input faster and faster, the modulation is increasingly smoothed out at the output, up to the point where the input modulation cannot be seen on the output signal at all.
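Here is a minimal simulation of that experiment (my own sketch, with arbitrary parameter values; for simplicity it on/off-keys a DC level rather than a sine carrier, which shows the same smoothing effect): a one-pole RC-style low-pass filter tracks slow keying faithfully, but fast keying barely registers at the output.

```python
import numpy as np

fs = 10_000                                 # sample rate in Hz (illustrative)
fc = 10                                     # filter cutoff in Hz (illustrative)
alpha = 1 - np.exp(-2 * np.pi * fc / fs)    # one-pole low-pass coefficient

def lowpass(x):
    """First-order (RC-style) low-pass filter."""
    y = np.zeros_like(x)
    for i in range(1, len(x)):
        y[i] = y[i - 1] + alpha * (x[i] - y[i - 1])
    return y

t = np.arange(0, 1, 1 / fs)

def modulation_depth(keying_hz):
    """On/off-key a unit level at keying_hz and measure how much of the
    swing survives the filter (peak-to-peak after the start-up transient)."""
    x = (np.floor(2 * keying_hz * t) % 2).astype(float)  # 0/1 square wave
    y = lowpass(x)
    settled = y[len(y) // 2:]
    return settled.max() - settled.min()

print(modulation_depth(1))     # slow keying: output swings nearly the full 0..1
print(modulation_depth(200))   # fast keying: output hovers near the average
```

With a 10 Hz cutoff, 1 Hz keying comes through almost at full depth, while 200 Hz keying is averaged away at the output, just as described above: the receiver sees a blur unless the channel bandwidth keeps up with the rate of change.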