The Shannon-Hartley theorem describes the maximum rate at which information can be sent over a bandwidth-limited AWGN channel. This rate is called the channel capacity. If \(B\) is the bandwidth of the channel in Hz, \(S\) is the signal power (in units of W), and \(N_0\) is the noise power spectral density (in units of W/Hz), then the channel capacity \(C\) in units of bits per second is\[C = B \log_2\left(1 + \frac{S}{N_0B}\right).\]
Let us now consider that we make \(n\) “sub-channels”, by selecting \(n\) disjoint bandwidth intervals contained in the total bandwidth of the channel. We denote the bandwidth of these sub-channels by \(B_j\), \(j = 1,\ldots,n\). Clearly, we have the constraint \(\sum_{j=1}^n B_j \leq B\). Likewise, we divide our total transmit power \(S\) into the \(n\) sub-channels, allocating power \(S_j\) to the signal in the sub-channel \(j\). We have \(\sum_{j=1}^n S_j = S\). Under these conditions, each sub-channel will have capacity \(C_j\), given by the formula above with \(B_j\) and \(S_j\) in place of \(B\) and \(S\).
The natural question regards using the \(n\) sub-channels in parallel to transmit data: what is the maximum of the sum \(\sum_{j=1}^n C_j\) under these conditions and how can it be achieved? It is probably clear from the definition of channel capacity that this sum is always smaller or equal than \(C\). After all, by dividing the channel into sub-channels we cannot do any better than by considering it as a whole.
People used to communications theory might find intuitive that we can achieve \(\sum_{j=1}^n C_j = C\), and that this happens if and only if we use all the bandwidth (\(\sum_{j=1}^n B_j = B\)) and the SNRs of the sub-channels, defined by \(S_j/(N_0B_j)\), are all equal, so that \(S_j = SB_j/B\). After all, this is pretty much how OFDM and other channelized communication methods work. In this post I give an easy proof of this result.