Scramblers and their implementation in GNUradio

A scrambler is a function that is applied to a sequence of data before transmitting with the goal of making this data more “random-like”. For instance, scrambling avoids long runs of the bits 0 or 1 only, which may make the receiver loose synchronization or cause spectral concentration of the signal. The receiver applies the inverse function, which is called descrambler, to recover the original data. The documentation for the scrambler blocks in GNUradio is not very good and one may need to take a look at the implementation of these blocks to get their parameters right. Here, I’ll try to explain a bit of the mathematics behind scramblers and the peculiarities of their implementation in GNUradio.

There are two types of scramblers, multiplicative or self-synchronizing and additive or synchronous. Both may be explained in terms of Linear Feedback Shift Registers but I will try to do an alternative exposition without using this concept.

Let \(M\) be a module over a ring \(R\). Usually, \(M\) and \(R\) will be \(\mathbb{Z}/2\mathbb{Z}\) in the applications (this is the ring consisting of the elements \(\{0,1\}\), with addition given by the XOR Boolean operation and multiplication given by the AND Boolean operation). Both types of scramblers are defined by some fixed coefficients \(\alpha_1,\ldots,\alpha_l \in R\).

The multiplicative scrambler transforms the sequence \(\{x_n\}_{n\geq 0} \subset M\) into the sequence \(\{y_n\}_{n\geq 0}\) given by the recurrence\[y_n = x_n + \sum_{k=1}^l \alpha_k y_{n-k}.\]Here, the values \(y_{-1},\ldots,y_{-l}\) have to be fixed beforehand and are known as the seed of the scrambler.

The multiplicative descrambler works as follows: it transforms a sequence \(\{t_n\}_{n\geq 0} \subset M\) into the sequence \(\{z_n\}_{n \geq 0}\) given by\[z_n = t_n – \sum_{k=1}^l \alpha_k t_{n-k}.\]Here, the values \(t_{-1},\ldots,t_{-l}\) are fixed beforehand (these are the seed of the descrambler). To see that this function does, in fact, revert the scrambling process, assume that the receiver starts receiving and descrambling at some point in time, so that \(t_n = y_{n+N}\) for \(n \geq 0\). Then we see that \(z_n = x_{n+N}\) for \(n \geq l\). Thus, the descrambler recovers the stream of data (obviously shifted \(N\) units in time), except for the first \(l\) elements.

We have seen that the multiplicative descrambler can start descrambling at any time, without the need to synchronize with the stream of data. For this reason, the multiplicative scrambler/descrambler is called self-synchronizing. Another remark is that the seeds used for the scrambler and descrambler make no effect in practice and any values can be used as a seed. The descrambler “loses” the first \(l\) elements of the data, but this is not a problem in applications.

The additive scrambler takes the sequence \(\{x_n\}_{n\geq 0}\) and transforms it into the sequence \(\{y_n\}_{n\geq 0}\) given by\[y_n = x_n + w_n,\]where \(w_n\) is defined by the recurrence\[w_n = \sum_{k=1}^l \alpha_k w_{n-k}.\]The values \(w_{-1},\ldots,w_{-l}\) are known as the seed and are fixed beforehand.

Now we see that, in contrast to the multiplicative scrambler, there is no way to descramble this sequence in a self-synchronizing manner. In fact, the only way possible way to descramble it is that the scrambler and descrambler start at the same time (so the descrambler input \(\{t_n\}\) is \(t_n = y_n\)). The descrambler generates the same sequence \(\{w_n\}\) using the same seed, and takes the sequence \(\{t_n\}\) into \(\{z_n\}\) by\[z_n = t_n – w_n.\] Then, clearly \(z_n = x_n\) for all \(n\geq 0\).

Therefore, the additive scrambler and descrambler must start at the same time and use the same seed. For this reason, this scrambler is called synchronous. In applications, an unscrambled synchronization word is sent before the scrambled data to signal the descrambler that it has to start working. Another remark is that when the characteristic of \(R\) is 2 the additive scrambler and descrambler are the same function.

It is usual to give the coefficients \(\alpha_k\) as the coefficients of a polynomial\[p(x) = 1 – \sum_{k=1}^{l} \alpha_k x^k.\] (The reason for this is quite interesting, but it is outside the scope of this post). When \(M = R = \mathbb{Z}/2\mathbb{Z}\), the coefficients of \(p\) are \(0\) or \(1\), so it is usual to encode them as the digits of a binary number. However, there are several possible ways to do so. First, there is the choice of order: whether later bits correspond to higher or lower powers of \(x\). Then, the independent term of \(p\) is always \(1\), so it is possible to omit this coefficient in the binary representation. Also, the leading term of \(p\) is always \(1\), because we can assume that \(l\) is the degree of \(p\). Then, it is possible to omit this coefficient as well.

The choice of the binary representation to use is ultimately tied to the implementation of the the scrambler using a shift register, as there are several possible ways to do so. For instance, this huge list uses the notation \(1\alpha_1\ldots\alpha_{l-1}\). For instance, \(1 + x + x^4\) is is represented as 0xC.

The documentation from GNUradio suggests that the notation used in GNUradio is \(\alpha_l\ldots\alpha_11\). For instance, it says that \(x^4+x^3+1\) is represented as 0x19. However, a careful look at the code reveals that this is not the case. The following method next_bit_scramble() implements the multiplicative scrambler.

unsigned char next_bit_scramble(unsigned char input)
{
  unsigned char output = d_shift_register & 1;

  unsigned char newbit = (popCount( d_shift_register & d_mask )%2)^(input & 1);

  d_shift_register = ((d_shift_register>>1) | (newbit<<d_shift_register_length));

  return output;
}

The function popCount() just counts the number of bits that are 1. We see that d_shift_register is used to store the bits \(0\ldots0y_{n-1}\ldots y_{n-l}\) (big-endian order). The variable d_mask stores the polynomial. Hence, we see that the representation used for \(p\) is \(\alpha_1\ldots\alpha_l\). For instance, \(x^4+x^3+1\) would be represented as 0x3. The polynomial \(x^{17} + x^{12} + 1\), which is used in 9k6 baud FSK AX.25, is represented as 0x21. When using this representation, one should also indicate the degree of \(p\). The way to indicate this in GNUradio is by means of the variable d_shift_register_length. This should be set to \(\operatorname{deg}(p) – 1\).

A similar notation problem happens for the seed value. Although one can use any seed for the multiplicative scrambler/descrambler, it is necessary to use the correct seed for the additive scrambler. Moreover, above we have defined the seed as the values \(w_{-1},\ldots,w_{-l}\). It is also possible to use the values \(w_{l-1},\ldots,w_0\) instead. If \(\alpha_l\) is invertible, then one can obtain \(w_{-1},\ldots,w_{-l}\) in terms of \(w_{l-1},\ldots,w_0\) and vice versa. Thus, the choice of using one definition of the seed or the other one depends more on how the scrambler is implemented.

In GNUradio, the seed is defined as \(w_{l-1},\ldots,w_0\). The binary representation used for the seed is similar to the notation used for polynomials: it is represented as the binary number \(w_{l-1}\ldots w_0\).

4 comments

Pingback: Reverse engineering Outernet: modulation, coding and framing – Daniel Estévez
Pingback: 【技术分享】针对Outernet卫星信号的逆向工程 - 莹莹之色
Pingback: Coding for HIT satellites (and other CCSDS satellites) – Daniel Estévez
Kurtul Mehtap says:

June 10, 2019 at 09:13 UTC

Thank you for this perfect explanation.
I have been looking at the descrambler used in GR-OUTERNET and want to create a GNU RADIO flowgraph which uses the scrambler used in it.
What should be the polynomial and the settings for multiplicative scramber block?

4 comments

Leave a comment Cancel reply