I have been laying some groundwork for Maia SDR, and for this I will need to run the Parks-McClellan algorithm in maia-httpd, the piece of software that runs in the Pluto ARM CPU. To evaluate what implementation of this algorithm to use, I have first gone to the implementations that I normally use: the SciPy remez function, and GNU Radio’s pm_remez function. I read these implementations, but I didn’t like them much.

The SciPy implementation is a direct C translation of the original Fortran implementation by McClellan, Parks and Rabiner from 1973. This C translation was probably written decades ago and never updated. The code is very hard to read. The GNU Radio implementation looks somewhat better. It is a C implementation that was extracted from Octave and dates from the 90s. The code is much easier to follow, but there are some comments saying *“There appear to be some problems with the routine search. See comments therein [search for PAK:]. I haven’t looked closely at the rest of the code—it may also have some problems.”* that have seemingly been left unattended.

Because of this and since I want to keep all the Maia SDR software under permissive open source licenses (the GNU Radio / Octave implementation is GPL), I decided to write from scratch an implementation of the Parks-McClellan algorithm in Rust. The result of this has been the pm-remez crate, which I have released recently. It uses modern coding style and is inspired by recent papers about how to improve the numerical robustness of the Parks-McClellan algorithm. Since I figured that this implementation would also be useful outside of Maia SDR, I have written Python bindings and published a pm-remez Python package. This has a few neat features that SciPy’s remez function doesn’t have. The Python documentation gives a walkthrough of these by showing how to design several types of filters that are commonly used. This documentation is the best place to see what pm-remez is capable of.

The rest of this post has some comments about the implementation and the things I’ve learned while working on this.

One of the things I’ve found while reading up on the topic is that many sources that explain the Parks-McClellan algorithm don’t go into all the details that are needed to implement it. They give a sense of understanding the global picture of what the algorithm does, but not how each step works exactly. An example of this kind of explanation is in Section 3.3 in fred harris’ book. A book which does a much better job at explaining the algorithm is Oppenheim and Schafer, in Section 7.7. The GNU Radio implementation mentions some equations and notation from this book, so it is convenient to have it at hand when studying the code.

But by far the best description that I have found of the Parks-McClellan algorithm is in the paper that presents McClellan, Parks and Rabiner’s original Fortran implementation of the algorithm. The paper includes the Fortran code, but the text of the article has all the necessary explanations, equations and flowcharts, so it is not really necessary to try to read the Fortran code. This paper assumes some familiarity with the Parks-McClellan algorithm, so it should be preceded by reading either the original Parks-McClellan paper (which is a good read on its own, although it is more about the mathematics than about algorithmic implementation details), or one of the textbooks mentioned above.

Another nice paper that I have found is A Robust and Scalable Implementation of the Parks-McClellan Algorithm for Designing FIR Filters, by S.I. Filip. This is a paper from 2016 that describes how to improve the Parks-McClellan algorithm to make it work in situations which are prone to numerical ill-conditioning, such as when designing filters with many coefficients, steep transitions or large attenuation factors. The paper is accompanied by a C++ implementation, which can be consulted when some small details are omitted in the paper.

Section 7.2 of the paper is interesting because it benchmarks the implementation described in the paper against SciPy, GNU Radio and MATLAB. A series of example problems are defined, and the runtimes of each implementation and whether it converges or fails are measured. The takeaway from this section is that the GNU Radio implementation is less robust than SciPy and MATLAB, but it is also much faster.

I have taken mainly two ideas from this paper for my pm-remez implementation. The first is Section 5, which analyses the numerical errors of polynomial interpolation and proposes to use barycentric Lagrange interpolation. The second is Section 6, which proposes to use Chebyshev proxy root finding to find the local extrema of the weighted error function. A key step in the Parks-McClellan algorithm is the Remez exchange step. There, the local extrema of the current weighted error function are found, and these extrema are used as polynomial interpolation nodes for the next iteration of the algorithm.
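To make the barycentric Lagrange formula more concrete, here is a minimal sketch in plain Python. It is not code from pm-remez or from Filip’s implementation. For simplicity it interpolates on Chebyshev points of the second kind, where the barycentric weights have a simple closed form; in the Parks-McClellan algorithm the nodes would instead be the current set of extrema, and the weights would be computed from those nodes. The function names are made up for this example.

```python
import math


def chebyshev_nodes(n):
    """Chebyshev points of the second kind on [-1, 1] (n + 1 nodes)."""
    return [math.cos(math.pi * k / n) for k in range(n + 1)]


def barycentric_weights(n):
    """Barycentric weights for the nodes above: (-1)^k, halved at both ends."""
    w = [(-1.0) ** k for k in range(n + 1)]
    w[0] *= 0.5
    w[n] *= 0.5
    return w


def barycentric_eval(x, nodes, weights, values):
    """Evaluate the interpolating polynomial at x with the barycentric formula."""
    num = 0.0
    den = 0.0
    for xk, wk, fk in zip(nodes, weights, values):
        if x == xk:
            return fk  # x is exactly a node; avoid dividing by zero
        t = wk / (x - xk)
        num += t * fk
        den += t
    return num / den


n = 16
nodes = chebyshev_nodes(n)
weights = barycentric_weights(n)
values = [math.exp(xk) for xk in nodes]
approx = barycentric_eval(0.3, nodes, weights, values)
```

The appeal of this form is that the polynomial is never expanded into monomial coefficients, which is where much of the numerical trouble tends to come from.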

The original Fortran implementation, as well as the SciPy and GNU Radio implementations, evaluate the error function over a finite grid of points to approximate the local extrema. There are better approaches to find the local extrema, such as Chebyshev proxy root finding, a variant of which is also used in the MATLAB implementation of the `remez` function. In this method, the weighted error function is approximated by a Chebyshev interpolant in a small interval, and the zeros of the derivative of this interpolant are found. If the approximation is good, these zeros are close to the local extrema.

Finding the zeros of this polynomial is usually done by solving an equivalent eigenvalue problem. The C++ implementation by Filip uses the Eigen library, which is a modern C++ template library. This allows Filip’s implementation to use MPFR to do benchmarks/debugging and to solve very ill-conditioned problems which are impossible to solve with double-precision arithmetic. Unfortunately, there isn’t a nice equivalent of Eigen in Rust. The only reasonable way to compute eigenvalues seems to be with the ndarray-linalg crate, which uses LAPACK. This is a bit of a let-down, because it makes my project feel like running away from some old piece of Fortran only to end up depending on another old piece of Fortran. I briefly considered writing an implementation of the QR algorithm from scratch, but doing this properly seemed like a whole project on its own, because LAPACK does a lot of numerical computing tricks to make it work well in practice.

I have still taken Filip’s idea of using a higher-precision floating point implementation to solve tricky problems, so my implementation allows using num-bigfloat, although these floats need to be converted to `float64` to compute the eigenvalues, because that’s the only type that LAPACK can handle.

Another important point of the Parks-McClellan algorithm that is often not treated in enough detail is that it only serves to design FIR filters with an odd length and even symmetric taps. The frequency response of this type of FIR filter is a polynomial in the cosine of the frequency, which is what allows turning the filter design problem into a polynomial approximation problem. Nevertheless, the problem of designing other types of FIR filters can be reduced to the case of odd length and even symmetry by scaling the frequency response with a suitable trigonometric function (and as a consequence, scaling the desired function and weight too). The paper by McClellan, Parks and Rabiner explains this very clearly in Section II. It includes the formulas to scale the desired function and weight depending on the type of the filter in a flowchart in Figure 2, and formulas to convert the taps of the odd length even symmetry filter designed by the algorithm into the required filter in equations (7)-(12). Regarding this point, it is worth mentioning that the GNU Radio implementation computes the taps by scaling the frequency response of the filter depending on each case, but given that there are straightforward formulas to transform the taps, I don’t see the point in doing this.
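The claim that the response of an odd-length, even-symmetric FIR is a polynomial in the cosine of the frequency can be checked numerically. The sketch below (plain Python, with made-up taps and function names) computes the zero-phase amplitude in two ways: directly from the taps as a sum of complex exponentials, and as a combination of Chebyshev polynomials \(T_k(\cos\omega) = \cos(k\omega)\).

```python
import cmath
import math


def amplitude_from_dft(taps, omega):
    """Zero-phase amplitude of an odd-length, even-symmetric FIR via its DTFT."""
    m = (len(taps) - 1) // 2  # index of the centre tap
    h = sum(t * cmath.exp(-1j * omega * (n - m)) for n, t in enumerate(taps))
    return h.real  # the imaginary part cancels because of the symmetry


def chebyshev_t(k, x):
    """Chebyshev polynomial T_k(x) via the three-term recurrence."""
    t_prev, t_cur = 1.0, x
    if k == 0:
        return t_prev
    for _ in range(k - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
    return t_cur


def amplitude_from_polynomial(taps, omega):
    """Same amplitude, written as a polynomial in x = cos(omega)."""
    m = (len(taps) - 1) // 2
    x = math.cos(omega)
    return taps[m] + sum(
        2.0 * taps[m - k] * chebyshev_t(k, x) for k in range(1, m + 1)
    )


taps = [0.1, -0.2, 0.5, 1.0, 0.5, -0.2, 0.1]  # arbitrary symmetric example
```

Both functions agree to rounding error for any frequency, which is exactly what lets the filter design problem be posed as polynomial approximation on \([-1, 1]\).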

According to these conversions, there are four cases in which the algorithm can work. These depend on whether the length is odd or even, and on whether the taps have even or odd symmetry. These cases are sometimes referred to as Type I, Type II, Type III and Type IV FIRs, but I find this nomenclature confusing and never remember which is which. The following Figure from Oppenheim and Schafer is a good reminder.

Most filters used in practice have even symmetry (Type I or Type II), but there are special classes of filters that are occasionally required that have odd length and odd symmetry (Type III). These are the Hilbert filter and the differentiator filter. The SciPy and GNU Radio implementations treat the Hilbert and differentiator filters as special cases, making the algorithm less flexible and the code harder to read. In my implementation the user can specify whether the length of the filter is even or odd (indirectly, by specifying the length), and whether the filter taps should have even or odd symmetry. Hilbert and differentiator filters can be designed by setting these parameters appropriately, as shown in the example of the pm-remez Python documentation.

Related to differentiator filters is the ability to set a custom desired response or weight function. The original Fortran implementation has functions `EFF` and `WATE` that define the desired frequency response and the weights. The authors say that if the user needs custom functions, they can replace these functions. This made sense for Fortran code in the 70s, but in the present day with modern languages we can let the user pass an arbitrary closure to define a custom function if they need to. This is what I have done in pm-remez. In the Rust API, the desired response and weight function are defined by closures `Fn(T) -> T` (where `T` is the scalar type, for instance, `T = f64`), so the user can pass in whatever custom functions they want and enjoy compile time optimizations. I reckon that most of the time the user needs a constant or linear response in each band, so there are shorthands to set those. The Python API also has the same flexibility. The response in each band can be defined either as a single scalar, indicating a constant response, a pair of scalars, indicating a linear slope, or a Python callable, which is used to define an arbitrary function.
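As an illustration of this kind of API, the following sketch shows one way to normalize the three forms (scalar, pair, callable) into a plain function of frequency. This is a hypothetical helper written for this post, not code taken from pm-remez, and it assumes that a pair gives the values at the two band edges.

```python
def as_band_function(spec, f_start, f_stop):
    """Turn a band specification into a function of frequency.

    Hypothetical helper: it mimics the three forms described above, but it is
    not code taken from pm-remez.
    """
    if callable(spec):
        return spec  # arbitrary user-supplied function
    if isinstance(spec, (tuple, list)):
        y0, y1 = spec  # assumed: values at the two band edges
        slope = (y1 - y0) / (f_stop - f_start)
        return lambda f: y0 + slope * (f - f_start)
    return lambda f: float(spec)  # a single scalar means a constant


constant = as_band_function(1.0, 0.0, 0.2)
ramp = as_band_function((0.0, 1.0), 0.0, 0.5)
custom = as_band_function(lambda f: 1.0 / (1.0 + f), 0.0, 0.2)
```

Once everything is a callable, the core algorithm only ever needs to evaluate a function, and constant or linear bands stop being special cases.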

There is nothing especially magical or easier about having a constant desired response and weight in each band. Even designing a lowpass filter with an even number of taps involves non-flat desired response and weight functions, due to the scaling done to transform the problem into one involving a FIR with an odd number of taps. Therefore, not letting the user define these functions arbitrarily is just bad API design in the present day.

Having the capability to define the desired response and weights freely makes it possible to handle some interesting filter design problems. The first is lowpass filters with a 1/f response in the stopband. fred harris advocates for the use of these filters instead of equiripple filters in Section 3.3.1 of his book. He explains that these 1/f filters can be designed by setting the weight function to an increasing linear slope in the stopband. He also comments that some Parks-McClellan implementations don’t allow this, in which case a staircase weight function realized by defining several adjacent stopbands with increasing constant weights can be used as an approximation. This is one of the shortcomings I find with SciPy’s remez function. In contrast, the pm-remez Python API allows designing this kind of FIR almost as easily as an equiripple filter, as shown in the documentation examples.

Another FIR filter that requires a linear slope (this time for the desired function) is the differentiator. The SciPy and GNU Radio implementations handle this by treating the differentiator as a special case and overriding its desired function. With pm-remez, designing a differentiator is just treated in the same way as any other filter, by specifying a linear desired response in the passband.

Finally, my favourite example that requires a non-constant desired response is a CIC compensation filter. The passband response of this filter needs to be the inverse of the CIC frequency response, which involves some sine functions. The Python documentation shows how to design a compensation filter that looks like this.
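As a rough idea of what such a desired response looks like, here is a sketch of the inverse CIC response as a Python function. The decimation factor and number of stages are made-up values, the frequency is assumed to be in cycles/sample at the decimated rate, and a differential delay of one is assumed. This is not the code from the pm-remez documentation.

```python
import math


def cic_response(f, decimation, stages):
    """Magnitude response of a CIC decimator, normalized to 1 at DC.

    f is in cycles/sample at the output (decimated) rate.
    """
    if f == 0.0:
        return 1.0
    num = math.sin(math.pi * f)
    den = decimation * math.sin(math.pi * f / decimation)
    return abs(num / den) ** stages


def compensation_desired(f, decimation=8, stages=3):
    """Desired passband response: the inverse of the CIC response."""
    return 1.0 / cic_response(f, decimation, stages)
```

A function like `compensation_desired` is the sort of callable that could be passed as the desired response of the passband.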

Installing the pm-remez Python package is just a `pip install pm-remez` command away. There are pre-built packages for Linux x86_64, Windows x86_64, and macOS x86_64 and arm64. I will certainly be using pm-remez for my filter design tasks in the future, instead of SciPy’s remez function.

In the post I found a transmission where only one codeword was transmitted. It used the precoding matrix \([1, i]^T/\sqrt{2}\). This basically means that a 90° phase offset is applied to the two antenna ports as they simultaneously transmit the same data. I mentioned that this was the reason why I obtained bad results when I tried to equalize this PDSCH transmission using transmit diversity in another previous post, and that in a future post I would show how to equalize this transmission correctly. I have realized that I never wrote this post, so now it is as good a time as any.

Before beginning with TM4, I need to go back to the post where I spoke about transmit diversity. There, I mentioned that there was a \(1/{\sqrt{2}}\) factor in TS 36.211 that I couldn’t account for. The formula for transmit diversity precoding over two antenna ports is shown here (this is taken from Section 6.3.4.3 in TS 36.211).

However, the formula I used for equalization was the same one but without the \(1/\sqrt{2}\) factor. If I included this factor, I obtained a QPSK constellation with amplitude \(\sqrt{2}\) rather than one. I couldn’t find anything in the 3GPP documents that explained why this factor was being cancelled. This is relevant for TM4, because the precoding matrices also have this \(1/\sqrt{2}\) factor.

Doing more research about this, I have realized that I was missing an important piece of the puzzle: downlink power allocation. It turns out that in LTE the power used by each resource element of the PDSCH and other downlink signals can be different from the power of the resource elements used by the CRS (cell-specific reference signal). The power levels are defined relative to the CRS resource elements, and configured by the higher-layers. A UE needs to know these power ratios in order to perform equalization correctly. Knowing the amplitude (or power) relation between a signal such as the PDSCH and the CRS is not terribly important for QPSK, because failure to use the correct ratio only scales the constellation. However, it is critical for QAM constellations.

The details of how this works are in Section 5.2 in TS 36.213. This section is quite confusing to read, because there are many special cases and higher-layer parameters mentioned. Some summaries of this information are the one by ShareTechnote, this one by Smart Telecom Edu, and a post in the Huawei forums. Briefly speaking, for the PDSCH there are two quantities \(\rho_A\) and \(\rho_B\) that define the power ratio between PDSCH resource elements and CRS resource elements. The value \(\rho_A\) is used for resource elements in symbols in which there are no CRS (symbols 1, 2, 3, 5, 6 for two antenna ports and normal cyclic prefix), and \(\rho_B\) is used for resource elements in symbols in which there are CRS (symbols 0 and 4 for two antenna ports and normal cyclic prefix). There is much flexibility to set \(\rho_A\), and it can even be set differently per UE. However, the value of \(\rho_B\) is determined from \(\rho_A\) by the quotient \(\rho_B/\rho_A\), which has a fixed value given by a parameter \(P_B\) transmitted in the SIB2, according to this table in TS 36.213.

The missing \(1/\sqrt{2}\) factor that I was observing can be explained if for this particular recording \(\rho_A = \rho_B = 2\). Most of the examples in these summaries have \(\rho_A < 1\), meaning that less power is allocated to the PDSCH resource elements than to the CRS resource elements. They mention that it is important to allocate higher power to the CRS because the reference signals need to have higher SNR for good equalization, but there are also some examples with \(\rho_A > 1\), such as the following, taken from the video that accompanies the Smart Telecom Edu post.

Something that I haven’t seen mentioned when discussing downlink power allocation is that the precoding matrices for two antenna ports all have the property that if the input is a QPSK constellation with power one (either one or two layers of this), then the total power over both antenna ports has an expected value of one. This happens thanks to the \(1/\sqrt{2}\) and \(1/2\) factors in the matrices in the following table taken from TS 36.211, and also for transmit diversity precoding. Technically speaking, these factors make the Hilbert-Schmidt norm of these matrices equal to one, which is what is required to obtain this kind of power normalization.

However, in some situations it might be more reasonable to consider what happens with the power of each antenna port individually, and normalize things so that the power of each port has expected value one (and hence the total power over both ports has an expected value of 2 for these two-port transmissions). For instance, if each antenna port has a separate power amplifier, this can be a good point of view (although if we are considering the eNB for spectrum management, then total power over all ports is a better metric). This normalization is achieved by setting \(\rho_A = \rho_B = 2\), and in a sense this is a rather natural choice, as it makes all the resource elements in each port have the same power. It is true that, for each port, the resource elements occupied by the CRS of the other port are muted, and hence the symbols containing CRS resource elements have 5/6 of the power of the other symbols. If we want to have the same power in all the symbols by increasing the power of PDSCH resource elements in symbols containing CRS, then that is what the setting \(P_B = 0\), which gives \(\rho_B/\rho_A = 5/4\), is for.
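The 5/6 and 5/4 figures can be checked by counting resource elements in one resource block. The sketch below assumes two antenna ports (so each port has 2 CRS resource elements and 2 muted ones per resource block in a CRS-bearing symbol) and works in units where a CRS resource element and a PDSCH resource element in a symbol without CRS both have per-port power one. The function name is made up for this example.

```python
from fractions import Fraction


def crs_symbol_power(rho_b_over_rho_a):
    """Per-port power of a CRS-bearing symbol relative to a symbol without CRS.

    Counts one resource block (12 subcarriers) with two antenna ports, in
    units where CRS resource elements and PDSCH resource elements in symbols
    without CRS have per-port power one.
    """
    crs = 2      # resource elements carrying this port's own CRS
    muted = 2    # resource elements left empty for the other port's CRS
    pdsch = 12 - crs - muted
    power_with_crs = crs * Fraction(1) + pdsch * Fraction(rho_b_over_rho_a)
    power_without_crs = 12 * Fraction(1)
    return power_with_crs / power_without_crs
```

With `rho_b_over_rho_a = 1` the result is 5/6, and with `Fraction(5, 4)` the two kinds of symbols carry the same power.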

Another interesting aspect of power allocation in the recording I have been using is that the PHICH (physical hybrid-ARQ indicator channel) has twice as much power as the other physical downlink channels, such as the PCFICH, the PDCCH and the PBCH. All these channels use transmit diversity, but when equalized in the same way, the constellations for the PCFICH, PDCCH and PBCH have amplitude 1 but the constellation for the PHICH has amplitude \(\sqrt{2}\). I mentioned this in my post about the PHICH. The higher power allocation for the PHICH is clear even in this plot, which shows the RMS amplitude in each REG in the first symbol of some subframes. We can see that the PHICH REGs that are active have a larger amplitude than the PCFICH REGs and the REGs occupied by the PDCCH.

Looking into how downlink power allocation is defined by the standard, I haven’t found an explanation for why the PHICH can be set to a higher power than the other channels. I think that given that the PHICH uses a BPSK constellation and doesn’t use any FEC that requires knowledge of the amplitude for decoding (which is what happens with the FEC decoders that need LLRs), the UE doesn’t really care about how much power is allocated to the PHICH. Probably, in practice the eNB is free to set the PHICH power level as it desires, without indicating this choice to the UEs.

In the post about the PDCCH, I showed that one of the PDSCH transmissions had the following DCI in the PDCCH. This is a TM4 transmission, but only one transport block is active. This means that there is only one codeword (and one layer) used in the spatial multiplexing precoding.

The precoding information in the DCI is 3. This indicates which precoding matrix is used, but it doesn’t refer to the codebook index of the TS 36.211 Table 6.3.4.2.3-1 shown above. The table that needs to be used to interpret this DCI field is Table 5.3.3.1.5-4 in TS 36.212, reproduced here. The precoding matrix corresponding to the value 3 is \([1, i]^T/\sqrt{2}\).

In general, when a single codeword is transmitted with a precoding matrix \([a_0, a_1]^T\), this means that for each resource element the output of antenna port \(p\) is\[y_p = a_p x,\]where \(x\) is the symbol in the codeword that is mapped to this resource element. Assuming that the receiver has a single antenna and denoting by \(h_p\) the channel response between antenna port \(p\) of the transmitter and the receiver antenna, we see that (ignoring noise) the receiver receives the symbol\[z = h_0 y_0 + h_1 y_1 = (a_0 h_0 + a_1 h_1) x.\]Therefore, it can recover \(x\) as\[x = \frac{z}{a_0 h_0 + a_1 h_1}.\]The values of \(a_p\) are known, and \(h_p\) are estimated using the CRS.
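In code, this equalization is a single complex division per resource element. The following sketch uses made-up channel values and a made-up QPSK symbol; in a real receiver \(h_0\) and \(h_1\) would come from the CRS channel estimates.

```python
import cmath
import math


def tm4_single_codeword_equalize(z, h0, h1, a0, a1):
    """Recover x from z = (a0*h0 + a1*h1) * x, ignoring noise."""
    return z / (a0 * h0 + a1 * h1)


# Precoding vector [1, i]^T / sqrt(2), as in the transmission discussed here.
a0 = 1 / math.sqrt(2)
a1 = 1j / math.sqrt(2)

# Made-up channel estimates and QPSK symbol, for illustration only.
h0 = 0.3 * cmath.exp(1j * 0.4)
h1 = 1.1 * cmath.exp(-1j * 1.2)
x = (1 + 1j) / math.sqrt(2)

z = h0 * (a0 * x) + h1 * (a1 * x)  # what a single receive antenna sees
x_hat = tm4_single_codeword_equalize(z, h0, h1, a0, a1)
```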

If the received symbol \(z\) has noise variance \(\sigma^2\), then the noise variance of \(x\) is \(\sigma^2 / |a_0 h_0 + a_1 h_1|^2\). The noise variance for a transmission over a single antenna port \(p\) would be \(\sigma^2 / |h_p|^2\), and as we saw in the post about transmit diversity, the noise variance for transmit diversity would be \(2 \sigma^2 / (|h_0|^2 + |h_1|^2)\) (here I am assuming that we have the \(1/\sqrt{2}\) factor in the precoding formulas, so that the total transmitted power over all ports is the same in the three cases; for the TM4 precoding matrix this condition is \(|a_0|^2 + |a_1|^2 = 1\)).

Comparing transmit diversity and transmission over a single port under this condition shows that if we know which port \(p\) has maximum channel response \(|h_p|\), then it is advantageous to transmit only over this port. However, if we don’t know which port \(p\) is better, then transmit diversity is a good option because its SNR is at most 3 dB worse than that of single-port transmission over the best port. This explains why transmit diversity is so useful for transmissions in which it is not possible to select the best port \(p\) because either we don’t have CSI (channel state information) from the UE or because the transmission is addressed to several UEs (as is the case with the control channels).

If we now consider single-codeword TM4, assuming that we had perfect CSI and that we could choose any precoding matrix, according to Cauchy-Schwarz the best that can be done is to choose \(a_p = \overline{h_p}/\sqrt{|h_0|^2 + |h_1|^2}\). This gives a noise variance of \(\sigma^2 / (|h_0|^2 + |h_1|^2)\). This is exactly 3 dB better than transmit diversity, and always better than single-port transmission. However, in reality the choice of precoding matrix is limited to one of the form \([1, i^k]^T/\sqrt{2}\) for \(k = 0, 1, 2, 3\). The optimal choice with this limitation depends on the value of \(\alpha = \arg(h_0 \overline{h_1})\). Namely, it is\[k = \left\lfloor \frac{2\alpha}{\pi} + \frac{1}{2}\right\rfloor \mod 4.\]Intuitively speaking, this choice is the one that aligns the phases of \(a_0 h_0\) and \(a_1 h_1\) in the best possible way. In the worst case, the phase of these two numbers will differ by \(\pi/4\), because the choices for \(a_1\) are spaced \(\pi/2\) apart in phase. Therefore, we always have\[|a_0 h_0 + a_1 h_1|^2 \geq \frac{1}{2}\left(|h_0|^2 + |h_1|^2 + \sqrt{2}|h_0||h_1|\right).\]
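The closed-form rule for \(k\) and the lower bound can be checked against a brute-force search over the four precoders. This is a small sketch with made-up function names; it takes \(a_0 = 1/\sqrt{2}\) and \(a_1 = i^k/\sqrt{2}\) as in the text.

```python
import cmath
import math

PHASES = [1, 1j, -1, -1j]  # i^k for k = 0, 1, 2, 3


def best_precoder_index(h0, h1):
    """Index k of [1, i^k]^T / sqrt(2) given by the closed-form rule."""
    alpha = cmath.phase(h0 * h1.conjugate())
    return math.floor(2 * alpha / math.pi + 0.5) % 4


def combined_gain(h0, h1, k):
    """|a0*h0 + a1*h1|^2 for the precoder with index k."""
    return abs(h0 + PHASES[k] * h1) ** 2 / 2


def gain_lower_bound(h0, h1):
    """Worst-case bound when the residual phase error is pi/4."""
    return (abs(h0) ** 2 + abs(h1) ** 2
            + math.sqrt(2) * abs(h0) * abs(h1)) / 2


# Made-up channel values, for illustration only.
h0 = cmath.rect(1.0, 0.3)
h1 = cmath.rect(0.7, -1.9)
k = best_precoder_index(h0, h1)
```

For any pair of channel values, `combined_gain(h0, h1, k)` with the `k` returned by `best_precoder_index` should match the maximum over the four choices and stay above `gain_lower_bound(h0, h1)`.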

We see that single-codeword TM4 with the appropriate precoding matrix always gives more SNR than transmit diversity. In the special case when \(|h_0| = |h_1|\), single-codeword TM4 gives at least a factor of \(1 + \sqrt{2}/2\) more SNR than transmit diversity. This is 2.32 dB, which is not far from the ideal 3 dB difference that can be achieved with an arbitrary precoding matrix. However, when \(|h_0| \ll |h_1|\) or vice versa, the above equation shows that the improvement of single-codeword TM4 over transmit diversity is very small.

It is possible to interpret single-codeword TM4 as a form of transmit beamforming. The effect of the precoding matrix is to transmit the same signal over both ports, but using a phase offset between the ports that must be an integer multiple of \(\pi/2\) (or 90 degrees). Of the 4 possible “beams” to choose from, the one that is used is the one that gives the best SNR, because it best aligns the phases of the signals at the receiver.

Interestingly, it seems that many 2-port eNBs use cross-polarized antennas in an X pattern. One of the antenna ports is the +45 deg polarization, and the other port is the -45 deg polarization. For instance, the following figure from a Rohde & Schwarz whitepaper on LTE TMs shows typical eNB antenna configurations. The 2-port configuration is the one on the left. The antenna elements in red form one port, and the antenna elements in blue form the other port.

With this kind of antenna, single-codeword TM4 actually has an interpretation as “polarization-forming” (for lack of a better word). Rather than beamforming, the result of combining +45 and -45 deg polarizations with a phase offset that is an integer multiple of \(\pi/2\) is vertical, horizontal, RHCP, or LHCP polarization (depending on the choice of the precoding matrix). Therefore, we can think that if the eNB has CSI, it can choose the polarization among these 4 that best matches the receiver antenna polarization.
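This can be checked with Jones vectors. The sketch below assumes ideal slanted elements and a particular choice of which port is +45 deg and of the horizontal/vertical basis, so which index gives horizontal or vertical, and which circular sense is RHCP or LHCP, depends on those conventions. The function names are made up for this example.

```python
import math

PHASES = [1, 1j, -1, -1j]  # i^k for k = 0, 1, 2, 3

# Jones vectors (horizontal, vertical components) of the two slanted ports.
# Which one is called +45 or -45 deg is a convention assumed for this sketch.
PLUS_45 = (1 / math.sqrt(2), 1 / math.sqrt(2))
MINUS_45 = (1 / math.sqrt(2), -1 / math.sqrt(2))


def radiated_polarization(k):
    """Jones vector radiated when the ports are fed with [1, i^k]^T / sqrt(2)."""
    a0 = 1 / math.sqrt(2)
    a1 = PHASES[k] / math.sqrt(2)
    ex = a0 * PLUS_45[0] + a1 * MINUS_45[0]
    ey = a0 * PLUS_45[1] + a1 * MINUS_45[1]
    return ex, ey


def classify(k, tol=1e-12):
    """Label the polarization as horizontal, vertical, circular or elliptical."""
    ex, ey = radiated_polarization(k)
    if abs(ey) < tol:
        return "horizontal"
    if abs(ex) < tol:
        return "vertical"
    ratio = ey / ex
    if abs(abs(ratio) - 1) < tol and abs(ratio.real) < tol:
        return "circular"
    return "elliptical"


labels = [classify(k) for k in range(4)]
```

With these conventions, the two in-phase and anti-phase choices give the linear polarizations, and the two quadrature choices give circular polarizations of opposite sense.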

After all this theory, let us show the result of performing TM4 equalization for the PDSCH transmission corresponding to the DCI shown above. We can see that the constellation of the data symbols looks quite good.

Recall that in this recording the channel amplitude response looks like this. In the higher frequencies of the cell (which is where this transmission takes place), port 1 is received much stronger than port 0. We can also see this in the equalized reference symbols R0 and R1 above. R0 is much noisier because it has less SNR.

If we equalize this TM4 transmission as if it was a single-port transmission on port 1, we get the following constellation, which doesn’t look too bad. Indeed, the effect of doing this wrong equalization is to distort the constellation by a factor of \(h_0/h_1 + i\), which is close to a 90 degree rotation.

If we instead use transmit diversity equalization, the constellation is even worse, but still not too terrible. Intuitively speaking, this is because the channel response of one of the ports is much larger than the other. Even though the transmit diversity equalization formula is wrong in this case, the contribution of the terms that are multiplied by the smaller channel acts as a small inter-symbol interference. However, since here port 1 is the port whose channel is largest, there are some complex conjugates, subcarrier swaps and sign inversions affecting the constellation, so even if it looks right, the bits decoded from it aren’t.

Another example of a single-codeword TM4 transmission in the recording is given by the following DCI, `0x7007ceeb0160`, scrambled with C-RNTI `0xced8`. Here the transmission uses two segments of resource blocks. The precoding matrix is the same as in the previous example.

The equalized symbols for the lower resource blocks and the upper resource blocks are shown here separately. Note that the two segments look somewhat different, since the channel response is quite different.

In the recording that I used, there is one PDSCH transmission that uses TM4 with two codewords. This is its corresponding DCI in the PDCCH. We can see that the two codewords use the same MCS.

The precoding information field uses the same table as for one codeword. In this case, the value 0 indicates that the precoding matrix is\[\frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1\end{pmatrix}.\]Note that as in the case of one codeword, the precoding matrices are normalized so that the sum of the power transmitted over the two antenna ports has an expected value of one if the symbols of each codeword are uncorrelated and have power one.

The following is a short treatment of 2×2 MIMO, as used in LTE TM4. The full theory of 2×2 MIMO is much more nuanced and admits many other techniques. We will be treating the cases of one codeword and two codewords at the same time and with the same notation, in order to compare them. We denote by \(x\) either a symbol from the single codeword, or a pair of symbols from each of the two codewords. Thus, \(x\) is either a complex scalar or a vector in \(\mathbb{C}^2\). We denote by \(W\) the precoding matrix, which is either 2×1 or 2×2. The vector of the symbols transmitted by each antenna port is\[y = Wx.\]We assume that the UE has two antennas, and denote by \(H\) the channel response between the two transmit and two receive antennas. This is a 2×2 matrix with entries \(h_{jk}\) that denote the channel between transmit port \(k\) and receive port \(j\). The UE receives the vector\[z = Hy + n = HWx + n.\]Here \(n\) denotes the receiver noise. We assume that its covariance is \(\mathbb{E}(nn^*) = I\). This can be assumed without loss of generality by multiplying \(z\) and \(H\) on the left by \(\mathbb{E}(nn^*)^{-1/2}\).

Assuming that \(W^*H^*HW\) is invertible, the UE can estimate \(x\) by computing\[\widehat{x} = (W^*H^*HW)^{-1}W^*H^*z = x + (W^*H^*HW)^{-1}W^*H^*n.\]We denote by\[\widehat{n} = (W^*H^*HW)^{-1}W^*H^*n\]the noise vector that affects the recovered symbols. Its covariance \(R\) can be computed as\[R = \mathbb{E}(\widehat{n}\widehat{n}^*) = (W^*H^*HW)^{-1}.\]It is now useful to consider the singular value decomposition of \(H\),\[H = U\Sigma V^*,\]where \(U\) and \(V\) are unitary and\[\Sigma = \begin{pmatrix} \sigma_0 & 0 \\ 0 & \sigma_1\end{pmatrix},\quad \sigma_0 \geq \sigma_1 > 0.\]Using this decomposition we have\[R = (W^*V\Sigma^2V^*W)^{-1}.\] Since\[\Sigma^2 = \sigma_1^2 I + (\sigma_0^2 - \sigma_1^2)e_0e_0^*,\]where \(e_0 = (1, 0)^T\), putting \(v_0 = Ve_0\) we get\[R = (\sigma_1^2 W^*W + (\sigma_0^2 - \sigma_1^2) W^* v_0 v_0^* W)^{-1}.\]

Now, in the single codeword case, \(W\) is a 2×1 vector which we denote by \(w\). The covariance \(R\) is the scalar\[R = (\sigma_1^2 \|w\|^2 + (\sigma_0^2 - \sigma_1^2) |\langle w, v_0\rangle|^2)^{-1}.\]We see that\[\|w\|^{-2}\sigma_0^{-2} \leq R \leq \sigma_1^{-2}\|w\|^{-2}.\]Moreover, if we have knowledge of the channel \(H\) and are free to choose \(w\) with a fixed \(\|w\|\), then we can obtain any value of \(R\) that satisfies the above inequalities by choosing \(w\) appropriately.

In the two codeword case, if we assume that \(W\) is unitary, then\[R = W^*V\Sigma^{-2}V^*W.\]We denote by \(r_j = e_j^* R e_j\) the covariance of the noise on each of the two recovered symbols. We have the condition\[r_0 + r_1 = \operatorname{tr}(R) = \operatorname{tr}(\Sigma^{-2}) = \sigma_0^{-2} + \sigma_1^{-2}.\]Moreover, \(r_j\) satisfy the inequalities\[\sigma_0^{-2} \leq r_j \leq \sigma_1^{-2}.\]If we have knowledge of the channel \(H\) and are free to choose the unitary \(W\), we can obtain any pair of values for \(r_0, r_1\) that satisfy these conditions. Indeed, since \(\Sigma^{-2} = \sigma_0^{-2} I + (\sigma_1^{-2} - \sigma_0^{-2})e_1e_1^*\) with \(e_1 = (0, 1)^T\), putting \(v_1 = Ve_1\), we have\[R = \sigma_0^{-2} I + (\sigma_1^{-2}-\sigma_0^{-2})W^*v_1v_1^*W,\]so\[r_j = \sigma_0^{-2} + (\sigma_1^{-2} - \sigma_0^{-2})|\langle W e_j, v_1\rangle|^2.\]
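The trace condition and the bounds on \(r_j\) can be verified numerically for a concrete channel. The sketch below uses plain Python with hand-written 2×2 matrix helpers (so that it has no dependencies), a made-up channel matrix \(H\), and the unitary matrix \(\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1\end{pmatrix}\) as \(W\). It computes \(R = (W^*H^*HW)^{-1}\) directly from its definition.

```python
import math


def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def adjoint(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def inverse(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def squared_singular_values(h):
    """Eigenvalues of H^* H (largest first), i.e. sigma_0^2 and sigma_1^2."""
    g = matmul(adjoint(h), h)
    tr = (g[0][0] + g[1][1]).real
    det = (g[0][0] * g[1][1] - g[0][1] * g[1][0]).real
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return tr / 2 + disc, tr / 2 - disc


def noise_covariance(h, w):
    """R = (W^* H^* H W)^{-1} for 2x2 H and W."""
    hw = matmul(h, w)
    return inverse(matmul(adjoint(hw), hw))


# Made-up channel and the unitary matrix [[1, 1], [1, -1]] / sqrt(2).
H = [[1.0 + 0.5j, 0.2 - 0.3j], [-0.4 + 0.1j, 0.8 + 0.9j]]
s = 1 / math.sqrt(2)
W = [[s, s], [s, -s]]

R = noise_covariance(H, W)
r0, r1 = R[0][0].real, R[1][1].real
sig0_sq, sig1_sq = squared_singular_values(H)
```

Here `r0 + r1` should equal `1 / sig0_sq + 1 / sig1_sq`, and each of `r0` and `r1` should lie between `1 / sig0_sq` and `1 / sig1_sq`.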

For a fair comparison between the one codeword case and the two codeword case, we assume that in the one codeword case \(\|w\| = \sqrt{2}\). This is so that the total transmitted power over the two ports is the same regardless of whether one or two codewords are transmitted. In fact, a look at the LTE precoding matrices table above shows that for two codewords all the precoding matrices are a 2×2 unitary times \(1/\sqrt{2}\), and for one codeword they all are a 2×1 vector with norm one.

There are different metrics with which the 1×1 and 2×2 covariances \(R\) can be compared. A usual choice is the channel capacity. For the one codeword case, the channel capacity is\[C = \log (1 + R^{-1})\]if we omit the constant factor that gives the bandwidth and the conversion from \(\log\) to \(\log_2\). We see that to maximize the capacity, it is necessary to make \(R\) as small as possible, which is quite intuitive. As seen above, the best we can achieve by choosing freely \(w\) with \(\|w\| = \sqrt{2}\) is \(R = \sigma_0^{-2}/2\). This corresponds to transmitting through the channel eigenmode that has maximum SNR. In this optimal case, the capacity is\[C = \log (1 + 2\sigma_0^2).\]

In the two codeword case, the capacity is the sum of the capacities for each codeword, so\[C = \log(1 + r_0^{-1}) + \log(1 + r_1^{-1}).\]Doing some algebra and using \(r_0 + r_1 = \sigma_0^{-2} + \sigma_1^{-2}\), we see that\[C = \log\left(1 + \frac{\sigma_0^{-2} + \sigma_1^{-2} + 1}{r_0r_1}\right).\]Hence, the maximum capacity is achieved by making the product \(r_0r_1\) as small as possible. Putting \(s = |\langle W e_0, v_1\rangle|^2\) and using \(|\langle W e_1, v_1\rangle|^2 = 1 - s\), we get\[r_0r_1 = \sigma_0^{-2}\sigma_1^{-2} + (\sigma_1^{-2} - \sigma_0^{-2})^2 s(1-s).\]Note that \(0 \leq s \leq 1\), and if we are free to choose a unitary \(W\), we can obtain any value of \(s\) satisfying this condition. The optimal capacity is achieved for \(s = 0\) or \(s = 1\). This choice corresponds to transmitting signals of equal power through the channel eigenmode that has maximum SNR (which corresponds to the singular value \(\sigma_0\)) and the eigenmode that has minimum SNR (which corresponds to the singular value \(\sigma_1\)). In this optimal case the capacity is\[C = \log\left(1 + \frac{\sigma_0^{-2} + \sigma_1^{-2} + 1}{\sigma_0^{-2}\sigma_1^{-2}}\right) = \log(1 + \sigma_0^2 + \sigma_1^2 + \sigma_0^2 \sigma_1^2).\]

Note that this optimal choice is not the same as the water filling algorithm. The water filling algorithm puts different power levels into each channel eigenmode, which cannot be done with a unitary precoding matrix \(W\). With the normalization we are doing, the water filling algorithm would give the solution \(q_0, q_1\) that maximizes\[C = \log((1 + q_0 \sigma_0^2)(1 + q_1 \sigma_1^2))\]subject to the conditions \(q_0 + q_1 = 2\) and \(q_j \geq 0\). The optimal solution for unitary precoding corresponds to the choice \(q_0 = q_1 = 1\), but in general there is a better choice of \(q_0, q_1\) that maximizes the capacity. This choice can be found using Lagrange multipliers. It is\[q_j = 1 + \frac{1}{2\sigma_0^2} + \frac{1}{2\sigma_1^2} - \frac{1}{\sigma_j^2}\]if this formula gives \(q_1 \geq 0\), or \(q_0 = 2\), \(q_1 = 0\) otherwise.
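The water filling formula is easy to evaluate numerically. The following is a minimal sketch (the function names are mine, and the total power is fixed to 2 to match the normalization used here) that computes \(q_0, q_1\) and compares the resulting capacity with the equal power choice \(q_0 = q_1 = 1\).

```python
import numpy as np

def water_filling(sigma0, sigma1):
    """Power allocation (q0, q1) with q0 + q1 = 2, assuming sigma0 >= sigma1."""
    # Common "water level" obtained from the Lagrange multiplier condition
    level = 1 + 0.5 / sigma0**2 + 0.5 / sigma1**2
    q1 = level - 1 / sigma1**2
    if q1 < 0:
        # The weak eigenmode gets no power at all
        return 2.0, 0.0
    return level - 1 / sigma0**2, q1

def capacity(q0, q1, sigma0, sigma1):
    """C = log((1 + q0 sigma0^2)(1 + q1 sigma1^2)), in nats."""
    return np.log((1 + q0 * sigma0**2) * (1 + q1 * sigma1**2))

sigma0, sigma1 = 2.0, 1.0  # example singular values
q0, q1 = water_filling(sigma0, sigma1)
print(q0, q1)
print(capacity(q0, q1, sigma0, sigma1), capacity(1, 1, sigma0, sigma1))
```

For any pair of singular values, the first capacity printed should be at least as large as the second one.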

Let us now compare the capacities of the one codeword and two codeword cases when we can freely choose \(w\) with \(\|w\| = \sqrt{2}\) for the one codeword case and \(W\) a unitary for the two codeword case. Comparing \(\log(1 + 2\sigma_0^2)\) with \(\log(1 + \sigma_0^2 + \sigma_1^2 + \sigma_0^2\sigma_1^2)\), we see that the two codeword case gives greater capacity whenever\[0 \leq \sigma_1^2 - \sigma_0^2 + \sigma_0^2 \sigma_1^2.\]This inequality holds whenever \(\sigma_1 \geq 1\), and for \(\sigma_1 < 1\) it holds whenever\[\sigma_0^2 \leq \frac{\sigma_1^2}{1 - \sigma_1^2}.\]The intuition here is that the two codeword case gives better capacity unless the SNR of the weak eigenmode is low and the SNR of the strong eigenmode is large enough.
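This comparison can be checked numerically. The sketch below (function names are my own, capacities are in nats and omit the bandwidth factor, as in the formulas above) evaluates the optimal capacities and the inequality for a few example pairs of singular values.

```python
import numpy as np

def one_codeword_capacity(sigma0):
    """Best one-codeword capacity with ||w|| = sqrt(2): log(1 + 2 sigma0^2)."""
    return np.log(1 + 2 * sigma0**2)

def two_codeword_capacity(sigma0, sigma1, s=0.0):
    """Two-codeword capacity for a unitary W with s = |<W e_0, v_1>|^2."""
    d = sigma1**-2 - sigma0**-2
    r0 = sigma0**-2 + d * s
    r1 = sigma0**-2 + d * (1 - s)
    return np.log(1 + 1 / r0) + np.log(1 + 1 / r1)

def two_codewords_better(sigma0, sigma1):
    """The condition 0 <= sigma1^2 - sigma0^2 + sigma0^2 sigma1^2."""
    return sigma1**2 - sigma0**2 + sigma0**2 * sigma1**2 >= 0

for sigma0, sigma1 in [(2.0, 1.0), (2.0, 0.5), (1.2, 0.7)]:
    c1 = one_codeword_capacity(sigma0)
    c2 = two_codeword_capacity(sigma0, sigma1)
    print(sigma0, sigma1, c1, c2, two_codewords_better(sigma0, sigma1))
```

In each row, the last column should be `True` exactly when the two-codeword capacity is the larger of the two.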

In LTE, however, the precoding matrix \(W\) cannot be chosen freely. There are only a few possible choices for it. For one codeword, with the normalization \(\|w\| = \sqrt{2}\), the possible choices are \(w = (1, i^k)^T\) for \(k = 0, 1, 2, 3\). Let us now study which of these is the best choice depending on the channel \(H\). The columns of the matrix \(V\) are eigenvectors of \(H^*H\). Since these eigenvectors are uniquely determined only up to multiplication by a complex scalar of modulus one, we can assume that \(V\) is of the form\[V = \begin{pmatrix}\cos \theta & - e^{-i\varphi}\sin\theta \\ e^{i\varphi}\sin\theta & \cos \theta\end{pmatrix}\]for \(\theta, \varphi \in \mathbb{R}\). These two parameters have the following geometric interpretation. If we consider a signal \(y\) transmitted by the two antenna ports of the eNB with total power one, \(\|y\|^2 = 1\), then the total power received at the two antenna ports of the UE is \(\|Hy\|^2\). The transmit vector \(y\) that maximizes \(\|Hy\|^2\) is in fact the first column of \(V\), that is, \(y = v_0 = V e_0\). This follows from \(\|Hv_0\|^2 = \|U\Sigma V^*V e_0\|^2 = \sigma_0^2\). Therefore, the parameter \(\theta\) indicates the power sharing between the two transmit ports in this particular maximal solution \(y = v_0\). A value of \(\theta\) close to something of the form \((2n+1)\pi/4\) gives near equal power sharing, while a value of \(\theta\) close to something of the form \(n\pi/2\) allocates all the power to one of the antennas. The parameter \(\varphi\) gives the phase offset that needs to be applied to the transmit ports in this solution \(y = v_0\).

I don’t know if there are assumptions that imply that in practical situations the value of \(\theta\) is more likely to be close to something of the form \((2n+1)\pi/4\) rather than to something of the form \(n \pi/2\). As we will see next, the precoding matrices in the LTE codebooks are not very good choices when \(|\sin 2\theta|\) is small.

Continuing with the one codeword case, by putting\[t = \frac{|\langle w, v_0\rangle|^2}{\|w\|^2},\]we have\[R = \|w\|^{-2}(\sigma_1^2 + (\sigma_0^2-\sigma_1^2)t)^{-1} = (2\sigma_1^2 + 2(\sigma_0^2-\sigma_1^2)t)^{-1}.\]Now we can compute\[t = \frac{1}{2}|\cos \theta + i^k e^{-i\varphi}\sin\theta|^2 = \frac{1 + \sin 2\theta \cos \left(\varphi - \frac{k\pi}{2}\right)}{2}.\]By an appropriate choice of \(k\) we can assume that\[\sin 2\theta \cos \left(\varphi - \frac{k\pi}{2}\right) \geq |\sin 2\theta| \frac{\sqrt{2}}{2},\]but since it can happen that \(\sin 2\theta = 0\), the only lower bound that we have for \(t\) that holds in all cases (choosing \(w\) appropriately depending on \(V\)) is \(t \geq 1/2\). This implies\[R \leq (\sigma_0^2 + \sigma_1^2)^{-1}.\]Therefore, the channel capacity satisfies\[C \geq \log (1 + \sigma_0^2 + \sigma_1^2).\]
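As an illustration of how the best one-codeword precoding vector could be selected, here is a small sketch that evaluates \(t\) for the four vectors \(w = (1, i^k)^T\) and checks the result against the closed form. The function names and the example values of \(\theta\), \(\varphi\), \(\sigma_0\), \(\sigma_1\) are arbitrary choices of mine.

```python
import numpy as np

def codebook_t(theta, phi):
    """t = |<w, v0>|^2 / ||w||^2 for w = (1, i^k)^T, k = 0, 1, 2, 3."""
    v0 = np.array([np.cos(theta), np.exp(1j * phi) * np.sin(theta)])
    t = np.empty(4)
    for k in range(4):
        w = np.array([1, 1j**k])
        # np.vdot conjugates its first argument; ||w||^2 = 2
        t[k] = abs(np.vdot(v0, w))**2 / 2
    return t

def one_codeword_capacity(sigma0, sigma1, t):
    """C = log(1 + 1/R) with R = (2 sigma1^2 + 2 (sigma0^2 - sigma1^2) t)^-1."""
    return np.log(1 + 2 * sigma1**2 + 2 * (sigma0**2 - sigma1**2) * t)

theta, phi = 0.6, 1.0        # example channel geometry
sigma0, sigma1 = 2.0, 1.0    # example singular values
t = codebook_t(theta, phi)
closed_form = (1 + np.sin(2 * theta) * np.cos(phi - np.arange(4) * np.pi / 2)) / 2
best_k = int(np.argmax(t))
print(t, closed_form)
print(best_k, one_codeword_capacity(sigma0, sigma1, t[best_k]))
```

The vector with the largest \(t\) gives the smallest \(R\), and hence the largest capacity.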

For the two codeword case there are only two possible choices for the unitary precoding matrix \(W\):\[W = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1\end{pmatrix},\]and\[W = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ i & -i\end{pmatrix}.\]Observe that the four columns of these two matrices are the four vectors \(w\) that can be chosen for the one codeword case (scaled by a factor \(1/\sqrt{2}\)). Here we are concerned with computing the product \(s(1-s)\), where \(s = |\langle We_0, v_1\rangle|^2\). The vector \(v_1\) is \(v_1 = (e^{i\varphi}\sin\theta, \cos\theta)^T\). For the first possible choice for \(W\) we have\[s = \frac{1}{2}|e^{i\varphi}\sin\theta + \cos\theta|^2 = \frac{1 + \sin 2\theta \cos \varphi}{2}.\]This gives\[s(1-s) = \frac{1-\sin^2 2\theta \cos^2 \varphi}{4}.\]A similar calculation shows that for the second possible choice for \(W\) we have\[s(1-s) = \frac{1 - \sin^2 2\theta \sin^2 \varphi}{4}.\]Selecting one of these two choices of \(W\) according to whether \(\cos^2 \varphi\) or \(\sin^2\varphi\) is larger, we can achieve\[s(1-s) \leq \frac{1 - \frac{\sin^2 2\theta}{2}}{4}.\]However, as in the one codeword case, it can happen that \(\sin 2\theta = 0\), so the best upper bound that we can give for \(s(1-s)\) that holds in every case by choosing appropriately between the two precoding matrices \(W\) is\[s(1-s) \leq \frac{1}{4}.\]This bound is tight whenever \(\sin 2\theta = 0\). Such a bound also follows from the fact that \(0 \leq s \leq 1\), so this shows that when \(\sin 2\theta = 0\), the two precoding matrices that are available actually give the worst possible choices among all unitary matrices \(W\). The bound \(s (1 - s) \leq 1/4\) implies that the channel capacity satisfies\[C \geq \log\left(\frac{4(1+\sigma_0^2)(1+\sigma_1^2)\sigma_0^2\sigma_1^2}{4\sigma_0^2\sigma_1^2 + \sigma_0^2-\sigma_1^2}\right).\]
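The selection between the two precoding matrices can be sketched in a few lines. The code below (names and example values are my own) computes \(s(1-s)\) for each matrix, compares it with the closed forms in terms of \(\theta\) and \(\varphi\), and evaluates the capacity using the expression for \(r_0r_1\) in terms of \(s(1-s)\).

```python
import numpy as np

W_A = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
W_B = np.array([[1, 1], [1j, -1j]]) / np.sqrt(2)

def s_product(W, theta, phi):
    """s (1 - s) with s = |<W e_0, v_1>|^2 and v_1 = (e^{i phi} sin theta, cos theta)."""
    v1 = np.array([np.exp(1j * phi) * np.sin(theta), np.cos(theta)])
    s = abs(np.vdot(W[:, 0], v1))**2
    return s * (1 - s)

def two_codeword_capacity(sigma0, sigma1, s_prod):
    """Capacity from r0 r1 = sigma0^-2 sigma1^-2 + (sigma1^-2 - sigma0^-2)^2 s(1-s)."""
    r0r1 = sigma0**-2 * sigma1**-2 + (sigma1**-2 - sigma0**-2)**2 * s_prod
    return np.log(1 + (sigma0**-2 + sigma1**-2 + 1) / r0r1)

theta, phi = 0.6, 1.0        # example channel geometry
sigma0, sigma1 = 2.0, 1.0    # example singular values
prod_a = s_product(W_A, theta, phi)
prod_b = s_product(W_B, theta, phi)
best = min(prod_a, prod_b)   # smaller s(1-s) means larger capacity
print(prod_a, (1 - np.sin(2 * theta)**2 * np.cos(phi)**2) / 4)
print(prod_b, (1 - np.sin(2 * theta)**2 * np.sin(phi)**2) / 4)
print(two_codeword_capacity(sigma0, sigma1, best))
```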

Here we have obtained lower bounds for the channel capacity in the one codeword and two codeword cases. These bounds are tight whenever \(\sin 2\theta = 0\). Unfortunately, comparing the two bounds to see which is greater does not give a simple expression in terms of \(\sigma_0\) and \(\sigma_1\) (it involves solving a quadratic equation). We can compare the two bounds numerically. The following plot shows in black the region where the lower bound for two codewords is greater than the lower bound for one codeword. Since \(\sigma_0 \geq \sigma_1\), the upper-left triangle of the plot is not shaded in black.

This plot shows that in practice, if \(\sigma_1 \geq 0.75\), then the two-codeword lower bound is higher, suggesting that in this condition two codewords will achieve higher capacity regardless of the geometry of the unitary \(V\).

As indicated above, a common antenna configuration for two-port eNBs is as cross-polarized antennas with +45 deg and -45 deg polarizations. Let us assume that the UE has two ports that are also in orthogonal polarizations. If the propagation path is polarization independent, then the matrix \(H\) is a positive constant \(\sigma\) times a unitary \(U\) that gives the transformation from the eNB polarization basis to the UE polarization basis. The singular value decomposition for \(H\) is \(H = U \Sigma V^*\) with \(\Sigma = \sigma I\), so that \(\sigma_0 = \sigma_1 = \sigma\), and \(V = I\). Applying the results of the previous section, we see that precoding with any vector \(w\) with \(\|w\| = \sqrt{2}\) in the one codeword case, and any unitary \(W\) in the two codeword case give optimal results regarding the channel capacity, because \(\sigma_0 = \sigma_1\).

In the two codeword case, the two available precoding matrices correspond to either transmitting one codeword in horizontal polarization and another codeword in vertical polarization, or to transmitting one codeword in RHCP and the other in LHCP. The UE can use the CRS (which have +45 deg and -45 deg polarizations) to estimate the polarization basis change that separates both codewords, namely \(H^{-1} = \sigma^{-1} U^*\). Therefore, two-codeword TM4 can be understood in this case as polarization diversity transmission.

In a more realistic situation, the propagation will depend somewhat on the polarization, so the matrix \(H\) will no longer be a multiple of a unitary. However, in good conditions it will still be close to a multiple of a unitary, and this reasoning will apply approximately, because \(\sigma_0/\sigma_1\) will be close to one.

As we have seen, two-codeword TM4 is intended to be received by a UE with two or more antenna ports. If a UE has only one antenna port, then the data received on each subcarrier gives only one equation for two unknowns (one symbol transmitted from each of the two codewords), so it cannot recover the transmitted data in general. It would seem that we cannot do anything to decode the two-codeword TM4 transmissions in the recording that I’m using in these posts, since the recording was done with a single antenna. However, with some ingenuity, I think that it is possible in some cases.

The following figure shows the upper resource blocks of the two-codeword TM4 transmission corresponding to the DCI shown above (recall that this transmission was split into two segments of resource blocks, one occupying the lower part of the cell and the other the higher part). Here it has been equalized using only the channel response for antenna port 1. The constellation we obtain looks quite curious. It is a cross and four QPSK points at amplitude \(\sqrt{2}\).

To better understand this constellation, it helps to simulate what happens if we take a two-codeword TM4 transmission, run it through the channel present in the recording, and equalize it with the channel for antenna port 1. Such a transmission carries 4 bits per subcarrier (2 bits for each of the QPSK constellations of each of the two codewords). Therefore, there are 16 different possible values that can be transmitted. We mark each of the 16 values with a different colour, and simulate what happens for each of the subcarriers in resource blocks 36-49, which are the ones occupied by the transmission we’re examining. The result corresponding to symbol 1 in slot 37 is shown here.

Comparing this with the data that we have demodulated from the recording, we see that it looks the same, plus noise.

To understand better why this constellation appears, first take into account that with the precoding matrix used in this transmission, port 0 is transmitting the sum of the symbols from the two codewords divided by \(\sqrt{2}\), and port 1 is transmitting the difference of the symbols divided by \(\sqrt{2}\) (the precoding matrix actually has a \(1/2\) factor rather than \(1/\sqrt{2}\), but as I have mentioned, in this recording the PDSCH transmissions are scaled with an additional \(\sqrt{2}\) compared to the CRS). Therefore, whenever the symbols in the two codewords are opposite, port 0 is transmitting a zero, and port 1 is transmitting a QPSK symbol with amplitude \(\sqrt{2}\). When we equalize with the CRS for port 1, we simply get this QPSK symbol with amplitude \(\sqrt{2}\).

When the symbols are not opposite, then port 1 can either transmit \(0\) if they’re equal, or one of \(\pm 1\) or \(\pm i\) if they have opposite real parts and equal imaginary parts or vice versa (recall the division by \(\sqrt{2}\)). Now we remember that the channel response for port 0 is smaller than the channel response for port 1, so whatever port 0 transmits acts as a perturbation for the symbol transmitted by port 1, shifting the symbol somewhat in the constellation plot. For example, when port 1 is transmitting \(1\), the possible two symbols that give this value are \((1 + i)/\sqrt{2}\) and \((-1 + i)/\sqrt{2}\), or \((1 - i)/\sqrt{2}\) and \((-1 - i)/\sqrt{2}\). In the first case, port 0 transmits \(i\), and in the second case port 0 transmits \(-i\). When received and equalized with the channel for port 1, the signal transmitted by port 0 will pull the constellation point in opposite directions for each of these options. The angle in which it’s pulled depends on the phase difference between the channels for port 1 and port 0. It turns out that for the upper frequencies of the cell, the phase difference is around -100 deg. So in this example the signal from port 0 pulls the symbol in a direction which is almost parallel to the real axis.

This is precisely what we see in the constellation above. The locations of the constellation symbols are dominated by what is transmitted by port 1, but the signal from port 0 splits apart the two symbol combinations that get mapped to the same signal in port 1, allowing us to distinguish them. The amplitude of the port 0 signal compared to the port 1 signal is large enough to make this splitting easy to detect, but small enough to make the constellation still look like something recognizable.
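The structure of this constellation can be reproduced with a toy model. The sketch below is not the code from the notebook: it assumes a single flat value for the ratio between the port 0 and port 1 channels, with a made-up amplitude of 0.3 and a phase of -100 deg, and it ignores noise. It lists the 16 noiseless points that are obtained after equalizing with the port 1 channel.

```python
import itertools
import numpy as np

# Unit-power QPSK symbols; the index is just a label for each symbol
qpsk = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)

# Assumed ratio h0 / h1 between the channels of port 0 and port 1.
# The amplitude 0.3 is a guess for illustration; the phase follows the text.
h_ratio = 0.3 * np.exp(-1j * np.deg2rad(100))

points = {}
for a, b in itertools.product(range(4), repeat=2):
    x0, x1 = qpsk[a], qpsk[b]
    port0 = (x0 + x1) / np.sqrt(2)   # sum of the two codeword symbols
    port1 = (x0 - x1) / np.sqrt(2)   # difference of the two codeword symbols
    # Received signal after equalizing with the channel of port 1
    points[(a, b)] = port1 + h_ratio * port0

for key, value in points.items():
    print(key, np.round(value, 3))
```

With these assumptions, the pairs of opposite symbols land exactly on QPSK points of amplitude \(\sqrt{2}\), and all the other combinations are displaced by the port 0 contribution.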

A way in which we can exploit this situation is, for each subcarrier, to interpret the received data equalized with port 1 as a weird 16-point constellation, and to compute log-likelihood ratios for each of the 4 bits transmitted by that subcarrier. To do this, I have made an educated guess of 0.15 for the noise standard deviation and computed the LLRs using the max*-safe function. In the plot below, the first panel shows the LLRs obtained from the recording data. The remaining plots show a simulation of how the LLRs would look without any noise for each of the 16 possible combinations of 4 transmitted bits. We see that there are some bit combinations and areas of the spectrum that are particularly bad, with the LLR getting too close to zero, but in general the LLR is large enough that it is possible to decode the bit without errors despite the noise. Looking at the LLRs from the recording, there are some points clustered around the zero line, but most of them are well away from zero and can be decoded. Perhaps the Turbo decoder would be able to decode the two codewords with the LLRs obtained with this method.
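The LLR computation can be sketched as follows. This is an illustration under several assumptions of mine rather than the notebook code: the channel ratio is a single made-up value (amplitude 0.3, phase -100 deg), the value 0.15 is taken as the noise standard deviation per real component, the bits are mapped to QPSK as \(((1 - 2b_0) + i(1 - 2b_1))/\sqrt{2}\), and the max* operation \(\log(e^a + e^b)\) is implemented with `np.logaddexp`.

```python
import itertools
import numpy as np

# All 16 combinations of the 4 bits (2 bits per codeword)
bits = np.array(list(itertools.product([0, 1], repeat=4)))
x0 = ((1 - 2 * bits[:, 0]) + 1j * (1 - 2 * bits[:, 1])) / np.sqrt(2)
x1 = ((1 - 2 * bits[:, 2]) + 1j * (1 - 2 * bits[:, 3])) / np.sqrt(2)

h_ratio = 0.3 * np.exp(-1j * np.deg2rad(100))  # assumed h0 / h1
points = (x0 - x1) / np.sqrt(2) + h_ratio * (x0 + x1) / np.sqrt(2)

def bit_llrs(y, points, bits, noise_std=0.15):
    """LLRs log P(b=0 | y) - log P(b=1 | y) for each of the 4 bits."""
    metric = -np.abs(y - points)**2 / (2 * noise_std**2)
    llrs = np.empty(bits.shape[1])
    for j in range(bits.shape[1]):
        # max* over the points with bit j equal to 0 and to 1
        zero = np.logaddexp.reduce(metric[bits[:, j] == 0])
        one = np.logaddexp.reduce(metric[bits[:, j] == 1])
        llrs[j] = zero - one
    return llrs

rng = np.random.default_rng(1)
sent = 6  # index of the transmitted bit combination
y = points[sent] + 0.15 * (rng.standard_normal() + 1j * rng.standard_normal())
print(bits[sent], bit_llrs(y, points, bits))
```

With this sign convention, a positive LLR favours a 0 bit and a negative LLR favours a 1 bit.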

I have updated the LTE downlink Jupyter notebook with the calculations used in this post. The recording that I have used can be found here.

Queqiao-2 transmits telemetry on S-band, at 2234.5 MHz. In this post I will analyse a short IQ recording that Scott Tilley VE7TIL has shared with me.

The modulation is 4096 baud PCM/PSK/PM with a 65536 Hz subcarrier. This is very similar to Queqiao-1 (about which I spoke in a previous post) and other Chinese spacecraft such as Tianwen-1 and Chang’e-5. The only difference with Queqiao-1 is that Queqiao-1 uses 2048 baud. Probably the baudrate can be changed in powers of 2 according to the link budget conditions, as we saw during the Chang’e 5 mission. The coding is CCSDS concatenated frames with 220 information bytes. This means that a shortened (252, 220) Reed-Solomon code is used. The frames have 2048 bits including the 32-bit ASM syncword, so at 4096 baud it takes exactly one second to send one frame (due to the convolutional encoding). Other Chinese spacecraft also use this coding that gives frame durations that are a power of two (positive or negative) seconds. Unlike Queqiao-1, Queqiao-2 doesn’t have the Reed-Solomon encoder bug.
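The frame duration can be verified with a few lines of arithmetic. This sketch assumes a rate 1/2 convolutional code (two channel symbols per encoded bit), which is what makes the numbers above add up to one second.

```python
rs_codeword_bytes = 252   # shortened (252, 220) Reed-Solomon codeword
asm_bits = 32             # attached sync marker
symbols_per_bit = 2       # assumed rate 1/2 convolutional code
baudrate = 4096           # channel symbols per second

frame_bits = rs_codeword_bytes * 8 + asm_bits
frame_duration = frame_bits * symbols_per_bit / baudrate
print(frame_bits, frame_duration)
```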

To decode the telemetry I have used the following GNU Radio flowgraph. It uses a typical PCM/PSK/PM demodulator followed by the gr-satellites CCSDS Concatenated Deframer block.

The GUI of the flowgraph running on Scott’s recording can be seen here. The SNR is very good, so there are no bit errors.

There are some features of the spectrum of the signal that are worthy of mention. At ~65 kHz we can see the telemetry sidebands, as described above. At twice this frequency, there are CW tones that appear in-phase with the residual carrier (rather than in quadrature, as the telemetry sidebands are). This is also to be expected. It happens because a telemetry baseband waveform that doesn’t have instantaneous zero-crossings amplitude-modulates the residual carrier, since there is more residual carrier power during each zero-crossing.

There are CW tones at 100 kHz, in quadrature with the residual carrier. These are ranging tones. They are perhaps generated on ground and turned-around by the spacecraft transponder, although I’m not sure that the spacecraft is in ground lock here, as there is no telecommand loopback. So perhaps the ranging tones are generated on-board. The second harmonic of these CW tones can be seen at 200 kHz. It is in-phase with the residual carrier, for the same reason as mentioned above for the harmonic of the telemetry subcarrier (or simply as a consequence of the Jacobi-Anger expansion of phase modulation in terms of Bessel functions: odd harmonics are imaginary and even harmonics are real).
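The statement about odd and even harmonics can be checked numerically with a phase-modulated tone. In the sketch below, the sample rate and the modulation index are arbitrary choices of mine; the only value taken from the signal is the 100 kHz tone frequency. The real part of the signal (in-phase with the carrier) should contain only even harmonics, and the imaginary part (in quadrature) only odd harmonics.

```python
import numpy as np

fs = 1_000_000     # sample rate in Hz (arbitrary)
f_tone = 100_000   # ranging tone frequency in Hz
beta = 0.5         # modulation index in radians (assumed)
n = 1000           # 1 ms of signal, an integer number of tone cycles

t = np.arange(n) / fs
x = np.exp(1j * beta * np.sin(2 * np.pi * f_tone * t))

# Amplitude spectra of the in-phase and quadrature components
in_phase = np.abs(np.fft.rfft(x.real)) / n
quadrature = np.abs(np.fft.rfft(x.imag)) / n

bin_1 = f_tone * n // fs       # first harmonic (100 kHz)
bin_2 = 2 * f_tone * n // fs   # second harmonic (200 kHz)
print("100 kHz:", in_phase[bin_1], quadrature[bin_1])
print("200 kHz:", in_phase[bin_2], quadrature[bin_2])
```

The nonzero values printed correspond to the Bessel function values \(J_1(\beta)\) and \(J_2(\beta)\).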

There are some other signals in-phase with the residual carrier at around 35 kHz and 165 kHz. These are intermodulation products of the telemetry subcarrier and the 100 kHz ranging tone (35 kHz and 165 kHz are the difference and sum of 100 kHz and 65 kHz respectively).

The telemetry frames are CCSDS AOS frames. They use spacecraft ID 0xB3. There is no matching entry for this in the SANA registry. Only virtual channel 0 is in use. There is no frame error control field (CRC-16), since it is not necessary when using Reed-Solomon, and there is no operational control field either.

Here is where the adherence to CCSDS standards seems to end. There is no M_PDU header following the AOS Primary Header, and I haven’t found any indication of Space Packets being used. It seems that the encoding of the AOS frames payload is custom. However, it is possible to see a lot of structure in the data. The contents of the frames seem to repeat every 32 frames. The following plot shows in yellow the bits that differ between the first block of 32 frames, and the second block of 32 frames. There are some telemetry fields that change their value over time, but it is clear that the structure of the two blocks of 32 frames is the same.
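A comparison of this kind can be done by XORing the two blocks of frames. The following sketch uses synthetic data in place of the decoded frames; the function name and the 220-byte frame size used to build the example are my own choices. With real data, the frames would be loaded as a `uint8` array with one frame per row.

```python
import numpy as np

def block_differences(frames, block_len=32):
    """Bits that differ between the first and second blocks of frames.

    frames is a uint8 array with one frame per row. Returns an array of
    0s and 1s with one row per frame in the block and one column per bit.
    """
    first = frames[:block_len]
    second = frames[block_len:2 * block_len]
    return np.unpackbits(first ^ second, axis=1)

# Synthetic stand-in: the second block repeats the first, except for
# one byte of frame 5 that changes in two bits.
rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(64, 220), dtype=np.uint8)
frames[32:] = frames[:32]
frames[32 + 5, 10] ^= 0x81

diff = block_differences(frames)
print(diff.shape, np.argwhere(diff))
```

Plotting `diff` as an image gives a picture like the one described above, with the changing fields standing out against a constant background.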

The first 4 bytes of the AOS frame payload seem to be timestamps. The format is big-endian 32-bit integers that give the number of seconds elapsed since the epoch 2021-01-01 00:00 China Standard Time (it is quite common for Chinese spacecraft clocks to run on China Standard Time rather than UTC). Indeed, subtracting the values of the timestamps from 2024-03-20 04:05 UTC, which is approximately when this short recording starts (with an error of perhaps a few minutes, according to Scott), gives timestamps around 2020-12-31 16:00 UTC, which is the same as 2021-01-01 00:00 China Standard Time. Also, as expected, these timestamps increment by one on each frame, since the frame duration is exactly one second.
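Converting these timestamps to UTC is straightforward. This is a minimal sketch using only the Python standard library; the function name is mine, and it assumes that `payload` holds the AOS frame payload as bytes.

```python
import struct
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=8))          # China Standard Time, UTC+8
EPOCH = datetime(2021, 1, 1, tzinfo=CST)    # 2021-01-01 00:00 CST

def parse_timestamp(payload):
    """Decode the big-endian 32-bit seconds counter at the start of the payload."""
    (seconds,) = struct.unpack(">I", payload[:4])
    return (EPOCH + timedelta(seconds=seconds)).astimezone(timezone.utc)

# A counter of zero corresponds to the epoch itself
print(parse_timestamp(bytes(4)))
```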

The next 5 bytes in the payload admit a simple description. The first two are always `0x0044`. The following two bytes alternate between `0x0040` and `0x4e52`. The next byte is usually `0x00`, and occasionally `0x07`. I don’t know what these bytes mean, but maybe they are some kind of header. After these, the bulk of the telemetry data starts.

The GNU Radio flowgraph and Jupyter notebook used in this post, as well as a binary file with the decoded frames, can be found in this repository.

Let’s begin by recalling the signals that are present in the recording. The recording is a short IQ recording of a 5 MHz cell done at 7.68 Msps, and the SNR is very good. The following plot shows the signal power in each OFDM symbol in each frame. This follows the unconventional approach of numbering as symbol 0 the symbol where the SS/PBCH block begins. In reality, the SS/PBCH in this recording begins in symbol 2 of each radio frame, so take this into account when interpreting the plot.

The reference signals occur in frames 2 and 10, and frames 3 and 11. They seem to be transmitted with a periodicity of 8 frames, or 80 ms. Reference signals in NR are highly configurable. The higher layers can announce to the UEs what kinds of reference signals are available. However, we can also figure out these details by doing signal analysis. These reference signals are needed in NR because it doesn’t have the CRS (cell-specific reference signal) of LTE.

Technically, the two different reference signals that are present in this recording are both CSI-RS (channel state information reference signal). Thus, the signals are constructed in the same way, as indicated by Section 7.4.1.5 in TS 38.211. However, they are configured in a different way, as they are intended to be used for different purposes. The srsRAN source code where these signals are configured is available, and this helps to understand the role played by each reference signal.

The first signal we look at is the CSI-RS, which is shown in the right hand side of this waterfall. It is clear that the subcarriers are used sparsely. In fact, only one out of every 12 subcarriers is active.

The way to find if we have correctly understood how a reference signal is generated is to generate the pseudorandom sequence that is used to modulate the reference signal. By multiplying the received signal with the complex conjugate of the correct pseudorandom sequence, we obtain a pure pilot signal. If we have made a mistake, we obtain something else. This process is known as wiping off the pseudorandom sequence.
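To make the wipe-off process concrete, here is a sketch of the pseudorandom sequence generation and the wipe-off itself. It follows my reading of the length-31 Gold sequence in Section 5.2.1 of TS 38.211 and the usual mapping of pairs of bits to QPSK symbols; it is written for clarity rather than speed, and the initialization values and channel gain in the example are arbitrary.

```python
import numpy as np

def gold_sequence(c_init, length, nc=1600):
    """Pseudorandom bits c(n) of the length-31 Gold sequence."""
    total = nc + length
    x1 = np.zeros(total + 31, dtype=np.uint8)
    x2 = np.zeros(total + 31, dtype=np.uint8)
    x1[0] = 1
    for i in range(31):
        x2[i] = (c_init >> i) & 1   # x2 is initialized with the bits of c_init
    for i in range(total):
        x1[i + 31] = x1[i + 3] ^ x1[i]
        x2[i + 31] = x2[i + 3] ^ x2[i + 2] ^ x2[i + 1] ^ x2[i]
    return x1[nc:total] ^ x2[nc:total]

def qpsk_pilots(c_init, n_symbols):
    """QPSK symbols r(m) built from consecutive pairs of bits."""
    c = gold_sequence(c_init, 2 * n_symbols).astype(float)
    return ((1 - 2 * c[0::2]) + 1j * (1 - 2 * c[1::2])) / np.sqrt(2)

def wipe_off(received, c_init):
    """Multiply by the complex conjugate of the pseudorandom sequence."""
    return received * np.conj(qpsk_pilots(c_init, len(received)))

# Synthetic example: 25 pilots through a flat channel with gain h
h = 0.7 * np.exp(1j * 0.4)
received = h * qpsk_pilots(12345, 25)
right = wipe_off(received, 12345)   # correct initialization value
wrong = wipe_off(received, 12346)   # incorrect initialization value
print(np.allclose(right, h), np.allclose(wrong, h))
```

With the correct initialization value all the wiped-off symbols collapse onto the channel gain, while an incorrect value leaves them scattered over four phases.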

The following plots show the wiped-off symbols for the CSI-RS’s of frames 2 and 10, together with the parameters used to construct the signal (which are the same in both cases). Some of these parameters affect the initialization value for the pseudorandom sequence, and others affect the mapping to physical resources.

The \(n_{\text{ID}}\) parameter is used in the generation of the initialization value for the pseudorandom sequence. This is the only one of these parameters that is a priori unknown for us, because the other parameters describe the symbol within the radio frame in which the CSI-RS is transmitted. According to TS 38.211, “\(n_{\text{ID}}\) equals the higher-layer parameter *scramblingID* or *sequenceGenerationConfig*.” We don’t have access to this higher-layer information, but by some trial and error I’ve found that \(n_{\text{ID}} = 1\) in this recording. This is a good guess because the physical cell ID is also 1, and in the generation of other similar pseudorandom sequences, the physical cell ID plays the same role that \(n_{\text{ID}}\) plays here.

The mapping to physical resources of a CSI-RS is mainly defined in terms of Table 7.4.1.5.3-1, which lists all the possible patterns in the time-frequency grid that a CSI-RS can have. In this case, the configuration in row 2 is used. This is the simplest configuration: it only uses a single symbol and one out of every 12 subcarriers (one subcarrier per resource block). The density \(\rho\), and the CDM type are given by the table row. Additionally, the CSI-RS takes an offset \((k_0, l_0)\) in the time-frequency grid. Here \(k_0 = 11\), which means that the uppermost subcarrier of each resource block is used, and \(l_0 = 4\), which means that the signal appears in symbol 4 in the subframe. Additionally, \(n_{s,f}^\mu = 2\), meaning that the signal appears in subframe 2 in the radio frame.
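For reference, the initialization value for this CSI-RS can be computed as follows. The formula is written as I read it in Section 7.4.1.5.2 of TS 38.211, so it is worth checking against the specification. With 15 kHz subcarrier spacing there are 14 symbols per slot and a slot coincides with a subframe.

```python
def csi_rs_c_init(n_id, slot, symbol, symbols_per_slot=14):
    """Initialization value for the CSI-RS pseudorandom sequence."""
    return (2**10 * (symbols_per_slot * slot + symbol + 1) * (2 * n_id + 1)
            + n_id) % 2**31

# Values found for the main CSI-RS: n_ID = 1, slot 2, symbol 4
c_init = csi_rs_c_init(n_id=1, slot=2, symbol=4)
print(c_init)
```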

The code that configures this CSI-RS signal is the `make_channel_measurement_nzp_csi_rs_resource` function in srsRAN. The configuration in this function matches what we have found (this cell has only one port).

This CSI-RS is intended for channel measurement by the UE, so in a sense it is the “main” CSI-RS. The signal allows a UE to estimate the downlink channel. It is transmitted only every 80 ms because it is assumed that the channel doesn’t change significantly over timescales shorter than this. It only uses one in every 12 subcarriers because it is assumed that the channel response for subcarriers that are nearer than this is highly correlated (this is related to the delay spread of the channel). In this way, the CSI-RS spends only a very small amount of downlink resources compared to the LTE CRS.

The second reference signal that appears in the recording is the TRS (tracking reference signal). This is a CSI-RS that is specifically configured to measure the channel coherence over time (to estimate Doppler spread, for instance). Here is a paper that talks about this. The TRS appears in two consecutive subframes, and in each of the subframes it appears in two symbols that are 4 symbols apart. Therefore, it allows the UE to measure the channel correlation over delays of 4 symbols and 1 subframe (1 ms). We can also see in the waterfall that the TRS is more dense than the main CSI-RS: it uses one out of every 4 subcarriers, instead of 1 out of every 12.

The following plots show the wiped-off symbols for the TRS in frames 3 and 11. Since the TRS is a CSI-RS, the parameters that control its configuration are the same as for the main CSI-RS, but have different values.

The TRS uses row 1 in the table, which corresponds to using subcarriers \(k_0, k_0 + 4, k_0 + 8\) in each resource block in order to occupy one out of every 4 subcarriers. For this signal, \(k_0 = 0\), so the subcarriers that are used in each resource block are 0, 4 and 8. The TRS appears at symbols \(l_0 = 4, 8\). The first symbol is the same symbol as the one in which the main CSI-RS appeared, but there is a second symbol four symbols later. As mentioned above, the TRS appears in two adjacent subframes: 2 and 3.
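The subcarriers occupied by the two reference signals can be listed with a short helper. This sketch uses the subcarrier numbering of my OFDM demodulation in this recording, in which the 25 resource blocks of the cell start at subcarrier 116; the function name and its parameters are illustrative.

```python
import numpy as np

def csi_rs_subcarriers(k0, offsets, n_rb=25, first_subcarrier=116):
    """Subcarrier indices k0 + offset in each of the n_rb resource blocks."""
    rb_starts = first_subcarrier + 12 * np.arange(n_rb)
    return (rb_starts[:, None] + k0 + np.asarray(offsets)).ravel()

main_csi_rs = csi_rs_subcarriers(k0=11, offsets=[0])   # row 2: one per resource block
trs = csi_rs_subcarriers(k0=0, offsets=[0, 4, 8])      # row 1: three per resource block
print(main_csi_rs[:3], trs[:6])
```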

The srsRAN source code configures the TRS in the `fill_tracking_nzp_csi_rs_resource` function. The configuration in the code matches what we have found. Something I haven’t understood well is why there is a CSI-RS for channel estimation, and then a TRS, since the UE could also use the TRS for channel estimation. It seems that the srsRAN source code gives a clue, because it always configures the TRS for one antenna port. Maybe this is the main point of the main CSI-RS: it is transmitted over as many ports as the gNB has (srsRAN only considers 1, 2, and 4 ports, but the table in TS 38.211 covers up to 32 ports). This allows the UE to estimate the channel for each port. Therefore, the main CSI-RS must be quite efficient in terms of resource element utilization if there are many ports, and this is where complicated CDM schemes come into play. On the other hand, the TRS is used by the UE to estimate Doppler spread and similar channel statistics, so it can be transmitted over just one of the gNB ports under the assumption that these statistics will be similar for all the ports.

Frame 10 in the recording has a PDSCH transmission. This actually corresponds to the SIB1, which is always transmitted periodically. Since in NR there is no CRS, each downlink channel has its own DM-RS (demodulation reference signal). In the previous post we already saw what the DM-RS of the PBCH looks like, and I already indicated that the PDSCH DM-RS occupies symbols 0, 5, 9 in the PDSCH transmission (counting as symbol 0 the first symbol of the PDSCH transmission), while symbols 1, 2, 3, 4, 6, 7, 8, 10, and 11 are used to transmit the data. The PDSCH DM-RS only uses every other subcarrier, and its constellation has amplitude \(\sqrt{2}\) rather than one so that it has the same power as the data symbols. However, in the previous post I didn’t generate and wipe-off the pseudorandom sequence.

The initialization value for the pseudorandom sequence used in the PDSCH DM-RS is given in Section 7.4.1.1.1 in TS 38.211. The formula depends on several variables whose value depends on higher-layer configuration. However, in this case, they have simple values. The variables \(\overline{\lambda}\) and \(\overline{n}_{\text{SCID}}^{\overline{\lambda}}\) are zero, and \(N_{\text{ID}}^{\overline{n}_{\text{SCID}}^{\overline{\lambda}}}\) is equal to the physical cell ID, which is 1. The remaining variables used for the generation of the pseudorandom sequence indicate the symbol within the radio frame in which the DM-RS is transmitted.

The figure below shows the PDSCH data symbols (blue) and DM-RS symbols (green). The pseudorandom sequence of the DM-RS has been wiped-off. This plot shows that the sequence has been generated correctly.

Immediately before the PDSCH transmission for the SIB1 there is its corresponding PDCCH transmission. This can be seen in the waterfall below. It is the signal that starts at 0.11 seconds. It occupies two symbols in the time domain, and two disjoint parts of the spectrum in the frequency domain. The PDSCH transmission follows immediately afterwards.

Similarly to the PBCH, the PDCCH DM-RS occupies one out of every 4 subcarriers. It uses the subcarriers that are congruent with 1 modulo 4. The pseudorandom sequence is generated with an initialization value whose formula looks like a simplified version of the value for the PDSCH DM-RS. Besides the variables that indicate the symbol within the frame, there is the variable \(N_{\text{ID}}\), which can be overridden by higher-layers, but defaults to the physical cell ID (and this is what is used in this case).
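The relation between the two initialization values can be seen by writing both formulas side by side. These are transcribed from my reading of Sections 7.4.1.1.1 and 7.4.1.3.1 of TS 38.211 and should be checked against the specification; the parameter names are mine, and the slot and symbol in the example are placeholders.

```python
def pdsch_dmrs_c_init(n_id, slot, symbol, n_scid=0, cdm_group=0,
                      symbols_per_slot=14):
    """Initialization value for the PDSCH DM-RS pseudorandom sequence."""
    return (2**17 * (symbols_per_slot * slot + symbol + 1) * (2 * n_id + 1)
            + 2**17 * (cdm_group // 2) + 2 * n_id + n_scid) % 2**31

def pdcch_dmrs_c_init(n_id, slot, symbol, symbols_per_slot=14):
    """Initialization value for the PDCCH DM-RS pseudorandom sequence."""
    return (2**17 * (symbols_per_slot * slot + symbol + 1) * (2 * n_id + 1)
            + 2 * n_id) % 2**31

# With the extra PDSCH variables set to zero both formulas coincide
print(pdsch_dmrs_c_init(n_id=1, slot=0, symbol=0),
      pdcch_dmrs_c_init(n_id=1, slot=0, symbol=0))
```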

The challenge regarding the use of the pseudorandom sequence is the following paragraph in Section 7.4.1.3.2 in TS 38.211:

The reference point for \(k\) is

– subcarrier 0 of the lowest-numbered resource block in the CORESET if the CORESET is configured by the PBCH or by the *controlResourceSetZero* field in the *PDCCH-ConfigCommon* IE,

– subcarrier 0 in common resource block 0 otherwise

What this means is that the first symbol in the scrambling sequence is not used to scramble the first subcarrier where the PDCCH is transmitted. The first symbol would be used to scramble the subcarrier at a reference point for \(k\) if the PDCCH actually started at this point. In general, the PDCCH starts \(N\) subcarriers above this point, so the first \(N\) symbols of the pseudorandom sequence need to be discarded.

In this case, since the PDCCH is for SIB1, the relevant configuration is for CORESET 0, which is configured by the MIB transmitted in the PBCH (see this page for more information). Therefore, we are in the first case of what the paragraph from TS 38.211 mentions. Without decoding and interpreting the MIB, it isn’t obvious how to find this reference point for \(k\), which gives the number \(N\) of symbols at the beginning of the scrambling sequence that need to be discarded. Additionally, \(M\) symbols in the pseudorandom sequence need to be “jumped over” to account for the gap between the two frequency regions occupied by this PDCCH. However, the value of \(M\) is easy to measure on the spectrum.

What I have done to find \(N\) is to generate a long enough pseudorandom sequence and compute its correlation with the PDCCH DM-RS symbols. This tells me how many symbols at the beginning of the pseudorandom sequence I need to throw away to align it to the DM-RS, and therefore, where CORESET 0 starts. It turns out that with the subcarrier numbering that I’m using for OFDM demodulation in this recording, it starts at subcarrier 128.
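The correlation search can be sketched as follows. This is a minimal illustration rather than the code from the notebook: the function names are hypothetical, the sequence generator follows the Gold sequence definition in TS 38.211 Section 5.2.1, and the initialization value is the PDCCH DM-RS formula from Section 7.4.1.3.1. It only handles a contiguous run of DM-RS symbols, so the jump of \(M\) symbols between the two frequency regions would need to be handled separately.

```python
import numpy as np

def gold_sequence(c_init, length):
    """Pseudorandom sequence c(n) as defined in TS 38.211 Section 5.2.1."""
    nc = 1600
    total = nc + length
    x1 = np.zeros(total + 31, dtype=np.uint8)
    x2 = np.zeros(total + 31, dtype=np.uint8)
    x1[0] = 1
    for i in range(31):
        x2[i] = (c_init >> i) & 1
    for n in range(total):
        x1[n + 31] = x1[n + 3] ^ x1[n]
        x2[n + 31] = x2[n + 3] ^ x2[n + 2] ^ x2[n + 1] ^ x2[n]
    return x1[nc:nc + length] ^ x2[nc:nc + length]

def pdcch_dmrs(n_id, slot, symbol, length, symbols_per_slot=14):
    """QPSK DM-RS symbols r(m) for one OFDM symbol (hypothetical helper)."""
    c_init = ((1 << 17) * (symbols_per_slot * slot + symbol + 1)
              * (2 * n_id + 1) + 2 * n_id) % (1 << 31)
    c = gold_sequence(c_init, 2 * length).astype(float)
    return ((1 - 2 * c[0::2]) + 1j * (1 - 2 * c[1::2])) / np.sqrt(2)

def find_offset(received, reference):
    """Number of reference symbols to discard so that it aligns with received.

    The magnitude of the correlation is used, so a constant phase
    rotation of the received symbols does not matter.
    """
    corr = np.abs(np.correlate(reference, received, mode="valid"))
    return int(np.argmax(corr))
```

Here `received` would be the DM-RS symbols extracted from one of the PDCCH symbols, and `reference` a sequence generated with a length comfortably larger than the expected offset.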

With this information it is now possible to wipe off the PDCCH DM-RS symbols. The figure below shows the constellation of the two PDCCH symbols. Data symbols are shown in blue, and wiped-off DM-RS symbols are shown in green.

The CSI-RS and the TRS begin at subcarrier 116, which is one resource block below the start of CORESET 0. I think that these CSI-RS must occupy the whole cell. In fact, they occupy the subcarriers from 116 (included) to 416 (not included). This is a total of 25 resource blocks, which is what a 5 MHz cell uses. Therefore, it seems that CORESET 0 starts at resource block 1 (numbering the first resource block in the cell as 0). Interestingly, the PDSCH transmission for SIB1 also starts at resource block 1.

I have updated the Jupyter notebook with the work that I have done in this post. The recording can be found in the same repository.

]]>In this post I will examine some recordings of the S-band telemetry signal done by AMSAT-DL with the 20 metre antenna in Bochum observatory. These recordings were done while the lander was still in-orbit. When landed on the Moon, IM-1 used the same configuration, but the recordings done at Bochum are probably too weak to decode, due to the orientation of the lander antennas.

I will look at two recordings in this post:

`2024-02-15_17-11-34_288000SPS_2210570000Hz.s16`

`2024-02-22_20-10-59_144000SPS_2210600000Hz.s16`

The spacecraft is using different configurations in the two recordings. When landed, IM-1 used the same configuration as in the recording from February 22.

According to FCC paperwork, IM-1 has two Thales Alenia transceivers with 5 dBic low-gain antennas and 8 W output power, and two Quasonix transmitters with a 15 dBic high-gain antenna and 25 W output power for high-speed data. These recordings are likely from the Thales radios, although IM-1 also used the Quasonix transmitter with the low-gain antennas for low-speed data at the end of the mission to improve its link budget.

In the first recording, the modulation and coding is residual-carrier PCM/PSK/PM with a baudrate of approximately 2511.5 baud and a subcarrier frequency of 12 times the baudrate, which gives 30.138 kHz. In the second recording, the modulation is suppressed-carrier BPSK, also with a baudrate of approximately 2511.5 baud. The coding is the same in both configurations: CCSDS k=7, r=1/2 convolutional coding, with 153-byte AOS frames.

Here is the flowgraph that I’ve used to decode the signal in the first recording. It is a typical demodulator for PCM/PSK/PM, followed by Viterbi decoders and Sync and create PDU blocks to find the ASM and extract the packets. Two Viterbi decoders are used in parallel to test both possible pairings of BPSK symbols, and each of these feeds two Sync and create PDU blocks that test the two possible polarities given by the 180 degree phase ambiguity.

This is what the GUI of the flowgraph looks like when running on the recording. The SNR is excellent, and the constellation has no bit errors.

The following shows the flowgraph for decoding the suppressed-carrier BPSK telemetry. Since there is Doppler drift on the signal, the FLL Band-Edge block is used. There is something interesting about the demodulator, which is that the rectangular pulse filter is using only a 1/2-symbol window, instead of the 1-symbol window which is more common. The reason is that using a 1-symbol window gives a lot of inter-symbol interference. It seems that the transmitter is using some form of triangular pulse shape, instead of the rectangular pulse shape which is normally used in deep space missions.

Here is the GUI of this flowgraph running on the recording from February 22. The SNR is again quite good. I have selected the part of the recording in which there is better SNR, as there are large changes in signal strength in this recording. The bottom panel shows the time-domain waveform. It is clear from this that the transmitter pulse shape is not rectangular.

The frames are 153-byte AOS frames. There is a Frame Error Control Field (CRC-16) which is checked by the GNU Radio flowgraph. Only virtual channel 1 is in use. The spacecraft ID is `0xCE`. This does not appear in the SANA registry.
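For reference, the header fields and the FECF check can be reproduced outside GNU Radio with a few lines of Python. This is a sketch that assumes the standard CCSDS AOS primary header layout (2-bit version, 8-bit spacecraft ID, 6-bit virtual channel ID, 24-bit virtual channel frame count) and the usual CCSDS CRC-16 (polynomial `0x1021`, initial value `0xFFFF`), with the FECF in the last two bytes of the frame. The function names are hypothetical.

```python
def crc16_ccitt(data: bytes) -> int:
    """CRC-16 with polynomial 0x1021 and initial value 0xFFFF (no reflection)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def parse_aos_header(frame: bytes) -> dict:
    """Extract the first fields of the AOS primary header."""
    word = int.from_bytes(frame[:2], "big")
    return {
        "version": word >> 14,
        "spacecraft_id": (word >> 6) & 0xFF,
        "virtual_channel_id": word & 0x3F,
        "vc_frame_count": int.from_bytes(frame[2:5], "big"),
    }

def fecf_ok(frame: bytes) -> bool:
    """Check the FECF, assumed to occupy the last two bytes of the frame."""
    return crc16_ccitt(frame[:-2]) == int.from_bytes(frame[-2:], "big")
```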

The following figure shows a raster map of all the AOS frames decoded from the first recording. The FECF has been removed from these frames already. There are two zones which are always zero. These show as dark purple bands in the plot. The first zone is the space where there would usually be an M_PDU header. All zeros is a valid value for an M_PDU header, which indicates that the first packet starts at the beginning of the packet zone (first header pointer equal to zero). The second zone is the space where there can be an Operational Control Field carrying a CLCW. In this case the CLCW is probably missing, since CLCWs typically have some non-zero bits.

The payload in the frames looks quite random. In fact, the FCC documentation says that the data is encrypted. However, there are some patterns also. These probably correspond to padding data used to fill up the rest of the AOS frame when there isn’t more data to send immediately.

The raster map of the frames decoded in the second recording looks very similar.

The GNU Radio flowgraphs and the Jupyter notebook used in this post, as well as binary files containing the decoded frames, can be found in this repository.

]]>In this post I study the geometry of the lunar reflection and find what causes these bands.

The variable that we have to separate reflections from different points of the Moon’s surface is Doppler. This is only one variable. Since the lunar surface is two-dimensional, the problem of determining where the reflection comes from is underdetermined. At any given instant there is a whole curve of points on the lunar surface that all have the same reflection Doppler. However, not all is lost. Even if at some instant all the points in this curve have the same reflection Doppler, as time passes the points will have different Doppler versus time curves. So by assuming that there is a certain correlation in reflection strength over time for each point, we might be able to identify individual points on the lunar surface from their Doppler trajectories. Looking back at the figure above, the question becomes: “do the higher intensity patterns in the reflection match the Doppler trajectories of points on the lunar surface?”.

Here we are only interested in the frequency difference between a reflected signal and the direct line-of-sight signal. This is what is plotted in the vertical axis of the figure above. We can make some approximations when computing the Doppler of the reflected signal.

An accurate reflection Doppler calculation involves knowledge of the trajectories of the transmitter \(A(t)\) and receiver \(B(t)\) in an inertial frame. Assuming we are able to determine the point \(P(t)\) where the signal that arrives at the receiver at time \(t\) has bounced off (the bounce happens at time \(t - c^{-1}\|B(t) - P(t)\|\)), then the Doppler is\[-\frac{f_c}{c}\frac{d}{dt}\left[\|B(t) - P(t)\| + \|P(t) - A(t - \tau(t))\|\right],\]where \(f_c\) is the carrier frequency and \(\tau(t)\) is the solution to the equation\[\|B(t) - P(t)\| + \|P(t) - A(t - \tau(t))\| = c\tau(t).\]This formulation is complicated to work with, and it also makes the questionable assumption that we are able to meaningfully identify a trajectory \(P(t)\) for the point where the reflection occurs. This makes sense when the reflection happens at a well-defined point, such as the specular reflection point, but not in a more general situation where the reflection is spread over an area.

One approximation that we can make is to ignore the finite propagation speed of light. This amounts to saying that \(A(t - \tau(t))\) is approximately equal to \(A(t)\), which allows us to ignore \(\tau(t)\) completely. In the case of SLIM, \(\tau(t)\) is on the order of one second, and is dominated by the distance between the Moon and the Earth. A more accurate approximation would take a fixed value \(\tau_0\) corresponding to the Moon-Earth distance when the landing happened, and replace \(A(t-\tau(t))\) by \(A(t-\tau_0)\).

Once we ignore the finite propagation speed of light, the reflection Doppler formula involves only points at the same time instant \(t\). Therefore, instead of doing the calculations in an inertial frame, we can do the calculations in a non-inertial frame. In what follows we will ignore the propagation speed of light and assume that \(A(t)\), \(B(t)\) and \(P(t)\) are given in lunar body-fixed coordinates, since this will simplify the calculations.

Since\[\frac{d}{dt}\|v(t)\| = \frac{\langle v'(t), v(t)\rangle}{\|v(t)\|},\]we have that the reflection Doppler is\[-\frac{f_c}{c}\left[\frac{\langle B'(t) - P'(t), B(t) - P(t)\rangle}{\|B(t)-P(t)\|} +\frac{\langle P'(t) - A'(t), P(t) - A(t)\rangle}{\|P(t)-A(t)\|}\right].\]We assume that the vector \(P'(t)\) lies in the plane tangent to the lunar surface at \(P(t)\) (taking into account any surface roughness or inclination), and denote by \(N(t)\) a unit vector normal to this tangent plane. This assumption follows naturally from the condition that \(P(t)\) must lie in the lunar surface for all \(t\). Note that here we are using the fact that \(P(t)\) is given in lunar body-fixed coordinates. Certainly \(P'(t)\) would not lie in the plane tangent to the surface if \(P(t)\) is given in an inertial system centred at the Earth-Moon barycentre. We also assume that the incoming and reflected rays are related by a reflection along this tangent plane:\[\frac{B(t) - P(t)}{\|B(t) - P(t)\|} = \frac{P(t) - A(t)}{\|P(t) - A(t)\|} - 2 N(t) \frac{\langle P(t) - A(t), N(t)\rangle}{\|P(t) - A(t)\|}.\]Since \(\langle P'(t), N(t)\rangle = 0\), this implies\[\frac{\langle P'(t), B(t) - P(t)\rangle}{\|B(t) - P(t)\|} = \frac{\langle P'(t), P(t) - A(t)\rangle}{\|P(t) - A(t)\|}.\]Thus, we can simplify the above formula for the reflection Doppler to\[-\frac{f_c}{c}\left[\frac{\langle B'(t), B(t) - P(t)\rangle}{\|B(t)-P(t)\|} -\frac{\langle A'(t), P(t) - A(t)\rangle}{\|P(t)-A(t)\|}\right].\]

By similar reasoning, we find that the direct line-of-sight Doppler is approximately equal to\[-\frac{f_c}{c}\left[\frac{\langle B'(t), B(t) - A(t)\rangle}{\|B(t)-A(t)\|} -\frac{\langle A'(t), B(t) - A(t)\rangle}{\|B(t)-A(t)\|}\right].\]The vectors \((B(t) - P(t))/\|B(t)-P(t)\|\) and \((B(t) - A(t))/\|B(t)-A(t)\|\) are approximately equal, since the receiver on the Earth \(B(t)\) is much further away from \(P(t)\) and \(A(t)\) than the distance between \(P(t)\) and \(A(t)\), which are points close to the Moon. We are only interested in the difference between the reflection Doppler and the direct Doppler, so we can ignore the first terms in the two formulas above and obtain\[\frac{f_c\langle A'(t), P(t) - A(t)\rangle}{c\|P(t)-A(t)\|}\]for the reflection Doppler and\[\frac{f_c\langle A'(t), B(t) - A(t)\rangle}{c\|B(t)-A(t)\|}\]for the direct Doppler. These formulas have the following interpretation: the reflected Doppler is equal to the projection of the spacecraft velocity vector onto the unit vector that gives the line-of-sight from the spacecraft to the reflection point; the direct Doppler is equal to the projection of the spacecraft velocity vector onto the unit vector that gives the line-of-sight from the spacecraft to the receiver at Earth. For simplicity, we can ignore the location of the receiver on Earth and take as \(B(t)\) the centre of the Earth.
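These two projections are straightforward to evaluate numerically. The following sketch (with a hypothetical function name, and assuming that all vectors are given in the same lunar body-fixed frame in metres and metres per second) returns the difference between the reflection Doppler and the direct Doppler, which is the quantity on the vertical axis of the waterfall.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def unit(v):
    """Normalize a vector to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def doppler_difference(f_c, a, a_dot, p, b):
    """Reflection Doppler minus direct Doppler, in Hz.

    f_c: carrier frequency in Hz
    a, a_dot: spacecraft position and velocity
    p: reflection point on the surface
    b: receiver position (for instance, the centre of the Earth)
    """
    a = np.asarray(a, dtype=float)
    v = np.asarray(a_dot, dtype=float)
    reflected = f_c / C * np.dot(v, unit(np.asarray(p, dtype=float) - a))
    direct = f_c / C * np.dot(v, unit(np.asarray(b, dtype=float) - a))
    return reflected - direct
```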

In a simplified model where the lunar surface is assumed to be flat and the receiver very far away, it is quite easy to compute the specular reflection point. The trajectory of the reflected ray is obtained by reflecting the direct ray along a horizontal plane centred on the transmitter. In particular, the angle that the reflected ray makes with this horizontal plane as it originates from the transmitter is equal to the angle made by the direct ray (but it goes downwards instead of upwards).

In this flat surface model, the difference between the specular reflection Doppler and the direct Doppler has a nice interpretation: it is proportional to twice the vertical velocity of the spacecraft. This is because the vectors from the spacecraft to the reflection point and to the receiver are symmetric with respect to the horizontal plane.
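To make this explicit, here is a short derivation under the flat surface and receiver at infinity assumptions. The notation is introduced only for this sketch: \(\varepsilon\) is the elevation of the receiver above the horizontal plane, \(v_z\) is the vertical component of \(A'(t)\), and the first coordinate axis is taken along the horizontal direction towards the receiver. The unit vectors from the spacecraft towards the receiver and towards the specular point are then\[u_{\text{dir}} = (\cos\varepsilon, 0, \sin\varepsilon),\qquad u_{\text{ref}} = (\cos\varepsilon, 0, -\sin\varepsilon),\]so the difference between the specular reflection Doppler and the direct Doppler is\[\frac{f_c}{c}\langle A'(t), u_{\text{ref}} - u_{\text{dir}}\rangle = -\frac{2 f_c}{c}\, v_z \sin\varepsilon.\]The horizontal velocity cancels out, and a descending spacecraft (\(v_z < 0\)) gives a reflection that is above the direct signal in frequency.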

A more accurate model considers the lunar surface as a sphere and does not assume that the distance to the receiver is infinite. I have already treated in this blog the problem of calculating the reflection point on a sphere for a ray travelling between two given points. In the appendix of this post I computed the reflection point as the point that minimizes the length of the reflection path, and solved the minimization problem by brute force. It turns out that this problem is called Alhazen’s problem and it does not have a closed form solution. It can be solved by finding the roots of a quartic polynomial or by numerically solving an equation involving trigonometric functions. This short paper describes these two ways to solve it.
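As an illustration of the brute force approach, the following sketch finds the specular point by minimizing the path length over a grid of angles, refining the grid a few times. It is not the code from that appendix. It works in the plane that contains the centre of the sphere, the transmitter and the receiver (so all vectors are 2D), and assumes that the polar angles of the two endpoints do not wrap around \(\pm\pi\) and that the path length has a single minimum between them.

```python
import numpy as np

def specular_point(a, b, radius, grid=2001, rounds=5):
    """Specular point on a circle centred at the origin, by brute force.

    a, b: 2D positions of the transmitter and receiver, both outside
    the circle. The search is restricted to the arc between their
    polar angles.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    lo = np.arctan2(a[1], a[0])
    hi = np.arctan2(b[1], b[0])
    lo, hi = min(lo, hi), max(lo, hi)
    for _ in range(rounds):
        theta = np.linspace(lo, hi, grid)
        p = radius * np.stack([np.cos(theta), np.sin(theta)], axis=1)
        # Total path length transmitter -> surface point -> receiver
        length = np.linalg.norm(p - a, axis=1) + np.linalg.norm(p - b, axis=1)
        i = int(np.argmin(length))
        # Narrow the search to the neighbours of the best grid point
        lo, hi = theta[max(i - 1, 0)], theta[min(i + 1, grid - 1)]
    best = 0.5 * (lo + hi)
    return radius * np.array([np.cos(best), np.sin(best)])
```

A quick sanity check of the result is that the incoming and outgoing rays make the same angle with the surface normal at the returned point.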

As a comparison of each of the two models, the following plot gives the distance along the lunar surface from the subsatellite point to the specular reflection point calculated with the flat surface and receiver at infinity model and with Alhazen’s problem (spherical surface model). We can see that there is a noticeable difference between the two models at the beginning of the recording, when SLIM is still at 40 km height, but the difference becomes much smaller as SLIM descends.

The spherical model predicts a specular point that is closer to the spacecraft than in the flat model. This is because, due to the curvature of the surface, the surface normal rotates away from the spacecraft as we go further along the surface. Therefore, compared to a reflection on a flat surface, the ray from the transmitter must be tilted slightly more downwards so that after the reflection on the spherical surface the outgoing ray has the correct angle to reach the receiver.

In the spherical surface model, the difference between the specular reflection Doppler and the direct Doppler is no longer proportional to twice the vertical velocity, but it is reasonably close as long as the spacecraft is not very high. Therefore, this is still a useful intuition to keep in mind.

Here is the distance along the surface from the subsatellite point to the horizon, assuming a spherical surface. This gives the radius of the footprint of the satellite. The HORIZONS ephemerides that I’m using for these calculations are not accurate enough after 15:15 UTC. In fact, they give a landing location which is some 5 km above the surface. This is what causes the relatively large distance to the horizon at the end of the plot.
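For a spherical surface this distance has a simple closed form: the arc from the subsatellite point to the horizon subtends an angle \(\arccos(R/(R+h))\) at the centre of the Moon. Here is a minimal sketch, where the mean lunar radius of 1737.4 km is a value that I am assuming for illustration.

```python
import math

MOON_RADIUS_KM = 1737.4  # assumed mean lunar radius

def horizon_distance_km(height_km, radius_km=MOON_RADIUS_KM):
    """Distance along the surface from the subsatellite point to the horizon."""
    return radius_km * math.acos(radius_km / (radius_km + height_km))
```

For a height of 40 km this gives a footprint radius of roughly 370 km.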

In what follows it is useful to define an LVLH (local vertical, local horizontal) frame for SLIM. The +Z vector of this frame points up, in the direction of the position vector in lunar body-fixed coordinates. The +Y vector is defined as the cross product of +Z and the velocity vector in lunar body-fixed coordinates. This gives a vector that is normal to the orbital plane. The +X vector is defined as the cross product of +Y and +Z. This gives a right-handed system, and moreover +X points in the direction of the horizontal velocity vector.
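This frame is easy to build with NumPy. The sketch below follows the definition just given. The helper that converts a vector to heading and elevation is an assumption of mine about how these angles can be computed: heading is measured in the XY plane from +X towards +Y, and elevation is the angle with the XY plane. Function names are hypothetical.

```python
import numpy as np

def lvlh_frame(position, velocity):
    """Return the unit vectors (+X, +Y, +Z) of the LVLH frame.

    position and velocity are given in lunar body-fixed coordinates.
    """
    r = np.asarray(position, dtype=float)
    v = np.asarray(velocity, dtype=float)
    z = r / np.linalg.norm(r)   # up
    y = np.cross(z, v)          # normal to the orbital plane
    y /= np.linalg.norm(y)
    x = np.cross(y, z)          # along the horizontal velocity
    return x, y, z

def heading_and_elevation(vector, frame):
    """Heading and elevation (in degrees) of a vector in the LVLH frame."""
    x, y, z = frame
    w = np.asarray(vector, dtype=float)
    heading = np.degrees(np.arctan2(np.dot(w, y), np.dot(w, x)))
    elevation = np.degrees(np.arcsin(np.dot(w, z) / np.linalg.norm(w)))
    return heading, elevation
```

Applying `heading_and_elevation` to the velocity vector itself gives a heading of zero and the angle that the velocity makes with the horizontal plane.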

The following plot shows the angles that the velocity vector and a vector joining the spacecraft with a point in the horizon make with the horizontal XY plane. We see that for most of the recording the velocity vector points above the horizon. This is in fact what is needed if the spacecraft “wants to miss the ground”. In the final part of the landing the spacecraft initiates a steep descent and the velocity vector now points significantly below the horizon.

This geometry has implications for the reflection Doppler. As we have seen above, we can approximate the reflection Doppler as the projection of the velocity vector onto the line-of-sight vector to the reflection point. The Doppler is maximal when these two vectors are aligned, but the reflection point needs to be on the surface. When the velocity vector points below the horizon in the steep descent, it points to a spot on the surface that gives the maximum Doppler. When the velocity vector points above the horizon, the spot on the surface that gives maximum Doppler is the point in the horizon ahead of the spacecraft.

The following plot overlays on the waterfall of the signal reflection the Doppler curves corresponding to the specular reflection (computed using the spherical surface model), the maximum Doppler (with the restriction that the reflection point must be on the ground), and the Doppler at the point in the horizon ahead of the spacecraft. The two latter curves coincide for most of the recording, because the velocity vector points above the horizon. Even when they don’t coincide during the steep descent, they are close.

From this plot it is clear that the ephemerides are bad after 15:15 UTC, since the waterfall shows reflections that exceed the maximum Doppler according to the ephemerides. Before 15:15 UTC, the contour of the area in the time-frequency plane where there is a reflected signal matches the maximum Doppler on ground quite well.

Something else is noteworthy. The specular reflection Doppler curve doesn’t have any reasonably obvious counterpart in what we see in the waterfall of the signal (except perhaps during the steep descent some minutes before 15:15 UTC). This means that most of the reflected signal power that we see doesn’t correspond to the specular reflection. This contrasts with the lunar reflections that we saw from the Longjiang-2 mission at UHF, which followed the specular reflection Doppler curve quite well. Some reasons why the specular reflection is not prominent in this case will be given below.

The thing that intrigued me most in the waterfall of the reflected signal is the curved shapes that look like a ‘1’. The clearest of these happens at 14:58 UTC. Maybe these correspond to reflections on some specific parts of the lunar surface. But which parts? And why do these parts produce a stronger reflection?

What I have done to answer these questions is to try to match the shape in the waterfall with the Doppler curve of a particular point on the Moon’s surface. I choose this point by specifying the moment at which the point is on the horizon (this is the moment at which the point first becomes visible from the spacecraft, and so a reflection can start to happen), and the heading to this point at that moment (using the LVLH frame introduced above). By varying these two parameters I try to match the Doppler curve with the shape on the waterfall. Roughly speaking, changing the time at which the point appears on the horizon moves the curve left or right, and changing the heading to the point when it is on the horizon affects the amplitude and steepness of the curve. I also consider points that appear on the horizon at slightly earlier and later times, and at slightly smaller and larger headings, in order to obtain some error bars for the location of the point.

The following figure shows a plot of the strongest of these shapes (the one happening at 14:58 UTC), and the Doppler curve for a point on the surface that matches this shape, as well as the two error bars. The latitude and longitude of the reflection point are given in the figure title.

There is a lot to unpack in the plot below the waterfall, so let’s go curve by curve. The blue curve shows the value that the angle between the normal of the reflection plane and the vertical must have so that the reflection can happen. This is not a specular reflection, so the angle is not zero. The surface causing the reflection must be tilted with respect to a perfect spherical surface in order to allow the reflected ray to reach the Earth. We see that the angle ranges between 10 and 20 degrees when the reflection is strongest.

The orange curve gives the elevation of SLIM as seen from the reflection point. This starts at zero when the point first appears on the horizon of SLIM, increases to a maximum as SLIM passes abeam of this point, and then decreases again as SLIM leaves the point behind and it disappears below the horizon. The green curve is the elevation of Earth as seen at the reflection point. This will become relevant below.

Finally, the red and purple curves give the heading from SLIM to Earth and to the reflection point. The heading is defined with respect to the LVLH frame defined above. Due to how the frame is defined, heading increases counterclockwise, instead of clockwise as is more usual. As a general rule, in order for a strong reflection to happen, the transmitter, reflector and receiver need to be more or less aligned. This means that the two headings must be close. A reflection plane that is very tilted can provide a reflection path in which the heading changes significantly. Also, scatter by a rough surface (which can be modelled as the surface normal varying quickly at a small scale) can change the heading of a reflection. But usually the reflection cannot change the heading much. We see that the reflection is only strong when the difference between the two headings is less than 10 degrees or so. This matches our intuition.

An important remark is that points that are symmetric with respect to SLIM’s groundtrack give the same reflection Doppler curve. This happens because the angles between SLIM’s velocity vector and the line-of-sight vector from SLIM to each of these two symmetric points always coincide. Here the point I have selected is west of SLIM’s groundtrack. The Earth is also west of SLIM, since SLIM is travelling towards the north in the eastern hemisphere of the Moon. Therefore, at some instant this point on the ground and Earth become aligned as seen from SLIM. There is a symmetric point east of SLIM’s groundtrack that has the same Doppler curve. This point has opposite heading to that of the point we’re considering. The heading difference for this point is therefore always large, and so a strong reflection at this point cannot happen.

The figure below shows the reflection points plotted in LROC QuickMap. Six points are shown: the reflection point west of SLIM and the two points used as error bars, and the symmetric point east of SLIM with its two error bar points. When the reflection is strongest, SLIM is south of these points, seeing the points at a heading between 10 and 20 degrees, and travelling north. The Earth is to the northwest, at a heading slightly above 20 degrees.

It is remarkable that the points west of SLIM fall quite close to the northwestern wall of Spallanzani crater. If we imagine the geometry, the Earth is at an elevation of around 45 degrees at this point of the lunar surface. When the reflection is strongest, SLIM is lower in the sky of the reflection point, at an elevation of around 20 degrees. The northwestern wall of the crater, with a slope of around 20 degrees, is exactly what is needed to reflect the SLIM signal upwards towards Earth, instead of to a lower elevation, as a flat surface would do. The SLDEM2015 slope data in QuickMap shows that the slope of this crater wall ranges between 15 and 30 degrees, so it provides many surfaces that have the required geometry for the reflection to happen.

The surface point whose reflection Doppler curve was a good match for the reflection shown above is very close to the northwestern wall of a crater. I have explained why this situation should give a good reflection geometry, but to try to find if this was just a coincidence, I have studied some of the other reflections that are distinctly visible in the waterfall. Their plots are shown here.

To automate the analysis with QuickMap, I’m exporting the reflection points in a GeoJSON, which can then be loaded in QuickMap. The first of these reflection points is at the northwestern wall of the Mutus M crater.

The next reflection is more difficult to explain. The reflection point lies somewhat north of the wall of a crater next to Nicolai E. It might be that this wall is what causes the reflections. There are several smaller craters around the reflection point, and these also provide a surface normal which is as required by the reflection geometry. Interestingly, the symmetric reflection point east of SLIM lies inside a crater, although I don’t think that this crater can turn the heading of the reflection path by around 40 degrees, which is what would be needed to get a reflection there.

Finally, the reflection point for the last reflection matches quite well the northwestern wall of the Lindenau crater.

The following map shows all the reflection points that I have identified. This gives good evidence for the fact that a distinct reflection appears in the waterfall whenever SLIM passes southeast of a crater, as the northwestern crater wall gives a surface that can reflect the signal upwards to Earth.

A natural question is why the specular reflection is missing in the waterfall plot. The relatively flat lunar surface between the craters provides ample opportunities for a specular reflection to happen. Something to keep in mind is that the polarization used for this recording is the one corresponding to the direct signal (nominally RHCP). How a reflection interacts with circular polarization depends on the incidence angle. A normal reflection returns circular polarization of the opposite sense. This is well known when dealing with antenna reflectors. In contrast, a grazing reflection returns circular polarization of the same sense. In general a mixture of RHCP and LHCP will be returned depending on the incidence angle. When the incidence matches Brewster’s angle, the return will be horizontally polarized. This can be understood as 50% RHCP and 50% LHCP.

The report Radar Reflectivity of the Earth’s surface by Katz gives more details about this. The exact behaviour depends on the surface composition and frequency, but qualitatively it is always as described above. The following figure, taken from this report, gives the reflection coefficients for a smooth sea surface at a frequency of 6 GHz. In the case of SLIM we instead have the lunar surface at 2.1 GHz. I don’t know if the S-band reflectivity of the Moon has been studied in this detail.

The reflection coefficient for circular polarization is given by the thin lines. Same-sense polarization is the dashed line, and opposite-sense polarization is the continuous line. We see that there is only a large same-sense reflection for small grazing angles. In this case the Brewster angle is around 7 degrees. There is still some small amount of vertical polarization returned at this angle, probably because the surface does not behave as an ideal model.

In the case of SLIM, the grazing angle for a specular reflection is equal to the elevation of the Earth at the reflection point. This elevation increases as SLIM travels north through the southern lunar hemisphere, but in general it ranges between 35 and 60 degrees. These angles are probably well above the Brewster angle, so only a small fraction of same-sense polarization power is returned. I think this explains why there is no distinct sign of a specular reflection in the waterfall plot. It would have been very interesting to record this event in dual polarization and compare the two polarizations.

Even if we consider non-specular reflections, the grazing angle is bounded below by the geometry. Since SLIM must be at a non-negative elevation at the reflection point, the minimum grazing angle is \(\varepsilon_E/2\), where \(\varepsilon_E\) is the elevation of Earth at the reflection point. More generally, the grazing angle is \(\varepsilon_E/2 + \varepsilon_S/2\), where \(\varepsilon_S\) is the elevation of SLIM at the reflection point. This shows why the same-sense polarization waterfall favours reflections far from SLIM, for which \(\varepsilon_S\) is small. Such reflections need a surface slope of \(\varepsilon_E/2 - \varepsilon_S/2\) in order to reflect the signal upwards towards Earth. Since the heading to the reflection point and the heading to Earth must also be close, this means that the same-sense polarization waterfall favours reflections in a direction which is close to the velocity vector. These have a Doppler that is larger than the specular reflection or the direct signal.
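As a small worked example of these two formulas, the following sketch evaluates the grazing angle and the required surface slope. Like the formulas in the text, it assumes that SLIM, the reflection point and the Earth lie in the same vertical plane. The function names are hypothetical.

```python
def grazing_angle_deg(elev_earth_deg, elev_slim_deg):
    """Grazing angle of the reflection, in degrees."""
    return 0.5 * (elev_earth_deg + elev_slim_deg)

def required_slope_deg(elev_earth_deg, elev_slim_deg):
    """Surface slope needed to send the reflection towards Earth, in degrees."""
    return 0.5 * (elev_earth_deg - elev_slim_deg)
```

Plugging in the approximate elevations quoted above for the strongest reflection (Earth at around 45 degrees and SLIM at around 20 degrees) gives a grazing angle of 32.5 degrees and a required slope of 12.5 degrees, while a specular reflection at the same point (slope zero, SLIM also at 45 degrees) would have a grazing angle of 45 degrees.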

We can also see that the reflection in general gets weaker as time advances, until the moment of the steep descent. This makes sense taking polarization into account. As SLIM travels further north, the Earth is higher up in the sky, so it is more difficult to get reflections with small grazing angle that have a significant amount of same-sense return. There is another effect that is at play here, since SLIM is also descending. This means that the area that can reflect signals gets smaller, as the footprint shrinks. On the other hand, there is less free-space path loss in the radar equation. It turns out that the free-space path loss reduction wins out, so in general a lower height should give more reflected power.

This is not the full story of why, of all the possible non-specular reflection paths, only some are strong in the waterfall. The transmitter antenna pattern would also play a role here, but I don’t have any information about this. At least we know that until the steep descent the attitude of the spacecraft was more or less the same, since it was pointing the engine for a retrograde burn. When the steep descent happens, the attitude changes significantly. Perhaps this explains why a strong reflection appears again, even though the Earth is now very high in the sky and reflections with small grazing angle are not possible.

In this post I have shown how some distinct patterns in the waterfall of the signal reflected off the Moon’s surface during SLIM’s landing can be matched to the Doppler curves of reflections happening on the northwestern walls of craters. As SLIM passes southeast of these craters while travelling north on the southern lunar hemisphere, these walls give the required reflection geometry. Moreover, compared to the specular reflection, these reflections on crater walls have a significantly smaller grazing angle, which causes a higher same-sense polarization return.

The fact that the data was only recorded in same-sense polarization constrains the area where stronger reflections can happen, favouring this kind of crater wall reflections. How the transmitter antenna pattern illuminates the lunar surface also influences which reflections can be strongest, but there is not enough data about the transmitter antenna to use this in the interpretation of the observation.

The Jupyter notebook and GNU Radio flowgraph used in this post can be found in this repository.

]]>AMSAT-DL recorded the S-band signal from SLIM during the landing with the 20-meter antenna in Bochum Observatory. In this post I will analyse a recording done between 14:51:51 and 15:21:54 UTC (the touchdown was at 15:20 UTC). I will study the Doppler of the residual carrier and other radiometric quantities rather than the telemetry.

SLIM is transmitting a signal throughout all the recording, except for 2 seconds shortly after landing, when the transmitter turns off and turns on again. The spacecraft is in ground lock all the time, and an idle 2 kbps telecommand signal with a 16 kHz subcarrier is visible in the transponder.

The first step in the analysis is Doppler correction. HORIZONS has data for SLIM, but it says “Post-launch trajectory from JAXA. Data through January 18, prediction thereafter.” The relevant file is

    File name                                      Begins (TDB)      Ends (TDB)
    ---------------------------------------------- ----------------- -----------
    20240119175154.202401190000.2024020100.burn.v1 2024-Jan-19 00:00 2024-Jan-31

I haven’t found any publicly available SPICE kernels.

After some attempts at Doppler correction, I realized that SLIM was in coherent Doppler mode with a groundstation in Japan transmitting at a constant uplink frequency. I haven’t found any official confirmation for the location of SLIM’s groundstation. There is some word on Twitter that it is Chiba, but for simplicity I have used Usuda, since the location of the deep space antenna is already in HORIZONS. There is not much difference, since Usuda is only 1.8 degrees west of Chiba.

To generate the Doppler files required by the Doppler correction GNU Radio block, I have used a script that I prepared for last year’s BSRC REU GNU Radio tutorials. This fetches the data from HORIZONS using Astroquery and writes the Doppler file in the required format. I have run the script as follows to generate the Doppler files for Bochum and Usuda. Due to the frequency multiplication done by the spacecraft transponder, the Doppler correction needs to be done using the downlink frequency, around 2212 MHz, for both the uplink and downlink legs of the path.

    ./horizons_doppler_correction.py --carrier-frequency 2212 \
        --observatory Bochum --spacecraft SLIM \
        --start-epoch 2024-01-19T14:30:00 --duration 3600 \
        --time-step 1 \
        --output-file SLIM-doppler-Bochum-2024-01-19.txt
    ./horizons_doppler_correction.py --carrier-frequency 2212 \
        --observatory Usuda --spacecraft SLIM \
        --start-epoch 2024-01-19T14:30:00 --duration 3600 \
        --time-step 1 \
        --output-file SLIM-doppler-Usuda-2024-01-19.txt
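The reason why the downlink frequency applies to both legs can be checked with a short calculation. The following is only a sketch under the assumption of an ideal coherent transponder that multiplies the received uplink frequency by the turnaround ratio (downlink frequency divided by uplink frequency). The function names and the range-rate values are illustrative and are not taken from the script or the flowgraph.

```python
C = 299_792_458.0  # speed of light in m/s


def one_way_doppler(frequency_hz, range_rate_m_s):
    """First-order Doppler shift for one leg (positive range rate = receding)."""
    return -frequency_hz * range_rate_m_s / C


def two_way_doppler(downlink_hz, uplink_range_rate, downlink_range_rate):
    """Doppler seen on the ground for an ideal coherent transponder.

    The spacecraft receives f_up * (1 - v_up / c) and multiplies it by
    f_down / f_up, so the uplink Doppler is scaled to the downlink frequency
    before the downlink leg adds its own shift.
    """
    factor = (1 - uplink_range_rate / C) * (1 - downlink_range_rate / C)
    return downlink_hz * (factor - 1)


# Illustrative (made-up) range rates for the uplink and downlink stations
f_down = 2212e6
v_up, v_down = 1000.0, 1200.0
exact = two_way_doppler(f_down, v_up, v_down)
approx = one_way_doppler(f_down, v_up) + one_way_doppler(f_down, v_down)
```

To first order, the total shift is the sum of two one-way shifts, both computed at the downlink frequency. This matches running the script twice with the same carrier frequency and chaining two Doppler correction blocks.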

I have used this GNU Radio flowgraph to perform Doppler correction using two Doppler correction blocks and decimation. There are two output files. One at 38400 sps that is good to see the telecommand subcarrier and reflections of the residual carrier on the Moon (which can be up to 5 kHz away from the residual carrier), and one at 2400 sps, which is good to see the residual carrier in detail.

Here is what the waterfall of the beginning of the 38400 sps file looks like. We can see the residual carrier with a broad lunar reflection that spreads from -1 kHz to 2 kHz, and the idle telecommand subcarrier.

The reflection of the residual carrier on the lunar surface is best seen in the following waterfall, made with the full 38400 sps recording, and only showing the frequency range where the reflection appears. The patterns formed by the reflection look fascinating.

The next waterfall, obtained with the 2400 sps file, shows the residual carrier. All the frequency changes that we see here correspond to velocity changes that are not modelled by the HORIZONS ephemerides that I used for Doppler correction, plus any frequency errors in the transmitter and receiver. I’m assuming that the groundstation transmitter uses a very stable frequency reference (the spacecraft frequency reference doesn’t matter, since it is coherently locked to the uplink), but I don’t know if the receiver used at Bochum was locked to the GPSDO.

The part after 15:15 UTC is quite interesting, because the residual carrier shows a “triangle wave pattern”. This is best seen in the following Inspectrum waterfall. I think that what happens here is that the thrusters were switched on and off in rapid succession in order to achieve the desired average thrust. SLIM approached the landing zone in a northbound polar orbit, and Shioli crater is on the southern hemisphere, so the Earth was in the northern part of the sky. Accelerations opposing the velocity vector thus pointed away from Earth and caused a negative Doppler drift. Therefore, the downwards parts of the triangle wave correspond to moments when the thrusters are on, and the upwards parts correspond to moments in which the thrusters are off.

The next waterfall shows the moment of touchdown. After this, the transmitter switches off for a short period and then resumes with only the lunar Doppler, indicating that it is stationary on the Moon’s surface. Before this, there is a sawtooth pattern that I don’t know how to interpret. The positive jumps of the signal would correspond to very large accelerations. I have wondered if these frequency jumps are of a non-physical nature. Perhaps the spacecraft receiver’s PLL lost lock. However, the exact same sawtooth pattern also appears in the telecommand signal. This strongly suggests that the receiver PLL was in lock. Since the telecommand signal is transponded at baseband, if the PLL had lost lock, the signal would become “doubled”, as the receiver frequency error would cause the two sidebands of the telecommand signal not to overlap at baseband.

The next plot contains the frequency and power measurements of the residual carrier done using the waterfall. The power of the signal doesn’t change by that much during the whole landing sequence.

Finally, here is a series of plots related to the landing trajectory. There is a lot to unpack here, so I will explain each subplot below the figure. All this data has been computed from the HORIZONS ephemerides.

The first subplot shows the height of SLIM above the altitude of the landing zone. Until 15:00 UTC the spacecraft is free falling on an orbit with a periapsis altitude of 10 km. When it passes periapsis, it begins to decelerate and increase its altitude somewhat, before descending again as it continues decelerating. The second plot shows the spacecraft’s total velocity and horizontal velocity. Most of the velocity is in the horizontal component until the final part of the landing, when most of the horizontal velocity has been cancelled out. The third plot shows the vertical velocity.

The fourth plot contains the angle of the velocity vector with respect to the horizon. Until 15:14 the angle is slightly negative, but close to zero, since there is a lot of horizontal velocity. Then the angle becomes increasingly more negative as the spacecraft has cancelled most of the horizontal velocity and is now on a steep powered descent curve. Finally, the angle sweeps up and down due to the vertical velocity alternating between positive and negative some minutes before touchdown.

The fifth plot shows accelerations. The total acceleration (the norm of the acceleration vector) of the spacecraft is shown in blue. The orange curve shows the gravitational acceleration. This depends on the altitude, so it varies slightly as the spacecraft descends. The green curve is the non-gravitational acceleration. This represents the thrust applied by the engines during flight, and also the acceleration exerted by the ground once the spacecraft has touched down. The red curve shows the projection of the non-gravitational acceleration along the velocity vector, and the purple curve is the vertical non-gravitational acceleration.

Looking at the non-gravitational acceleration, we see that there are three distinct phases of thruster firings. We also saw these in the plot of the velocity. The first two phases are mainly intended to cancel a large fraction of the spacecraft’s velocity. Most of the acceleration is along the velocity vector, and opposite to it. The acceleration is about 2 m/s². This is probably the maximum acceleration that the thrusters can achieve. It is enough to counteract the 1.6 m/s² lunar gravity with some margin. In the third phase there starts to be substantial acceleration in the vertical direction, to cancel out the spacecraft’s vertical velocity. The vertical acceleration oscillates around 1.6 m/s², as is needed for a soft landing. It is in this phase when the residual carrier shows a triangle wave pattern that suggests that the thrusters are not fired continuously, but in short bursts.

The sixth plot shows the Doppler that was used in Doppler correction. This resembles somewhat the spacecraft velocity, but it is not the same, since the angle between the velocity vector and the line-of-sight vector changes. The last plot shows a comparison of the Doppler drift and the residual frequency measured on the Doppler corrected residual carrier. Until 15:12 UTC the two curves are quite similar. This suggests that there is a time difference between the Doppler curve used for Doppler correction and what the spacecraft actually did. The difference would be on the order of 5 to 10 seconds, judging by the proportionality of the two curves.
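The proportionality argument can be made explicit. If the Doppler curve f(t) used for correction is shifted by a small time Δt with respect to what the spacecraft actually did, the residual frequency is approximately Δt·f′(t), so Δt can be estimated as the least-squares slope between the residual and the Doppler drift. The sketch below uses a synthetic Doppler curve rather than the data from the notebook. The function names and the sign convention (residual = f(t + Δt) − f(t)) are assumptions.

```python
import numpy as np


def estimate_time_offset(residual_hz, drift_hz_per_s):
    """Least-squares slope of residual = offset * drift (no intercept)."""
    r = np.asarray(residual_hz, dtype=float)
    d = np.asarray(drift_hz_per_s, dtype=float)
    return float(np.dot(r, d) / np.dot(d, d))


def synthetic_doppler(t):
    """Made-up smooth Doppler curve in Hz, only used to exercise the estimator."""
    return 5000.0 * np.sin(2 * np.pi * t / 2000.0)


t = np.arange(0.0, 600.0, 1.0)
true_offset = 7.0  # seconds, chosen arbitrarily
residual = synthetic_doppler(t + true_offset) - synthetic_doppler(t)
drift = np.gradient(synthetic_doppler(t), t)  # numerical Doppler drift in Hz/s
estimated_offset = estimate_time_offset(residual, drift)
```

With real measurements the fit would be restricted to the interval where the two curves look proportional, and frequency errors in the transmitter or receiver would bias the estimate.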

Another thing that we can see is that there are small jumps in the Doppler drift curve. These happen because the HORIZONS ephemerides have discontinuous accelerations. We also saw this in the accelerations plot. Therefore, the Doppler curve that we have used for Doppler correction has some “corners”. We also see these corners in the Doppler corrected signal. Corners in the Doppler correspond to a discontinuous acceleration, which happens when a thruster is suddenly switched on or off. In this case we cannot analyse the Doppler corrected signal qualitatively by looking at the corners that appear, because it is not easy to tell if each of these corners was present in the Doppler correction file, in the original signal, or in both.

It is interesting to compare the data in this post with the telemetry shown in the landing livestream by JAXA, which gives additional context. I hadn’t watched this before, and only watched it after doing the data analysis. I could then notice a few things that I had overlooked in the data. For instance, the non-gravitational acceleration provided by the engine thrust increases somewhat as the mass of the spacecraft decreases due to spent fuel. The livestream also mentions that the thrusters are switched on and off rapidly during the final descent, confirming what we have seen in the Doppler.

The Jupyter notebook and GNU Radio flowgraph used in this post can be found in this repository.

The information about the telemetry signal of LEV-1 is scarce. Its website just says

Telemetry format of LEV-1 stands on CCSDS. The contents of telemetry are under developing.

The IARU coordination sheet contains other clues, such as the mention of PCM/PSK/PM, CW, and bitrates of 31, 31.25 and 32 bps, but not much else. Regardless of the mention of CCSDS, I have found that the signal from LEV-1 is quite peculiar. This post is an account of my attempt to decode the data.

In this post I will only be looking at the file slim_2024-01-19_15_39_58_437.200MHz_1.00Msps_ci16_le.chan0.sigmf-meta. I guess that the `chan0` and `chan1` in the filename refer to the two polarizations at the feed, but the SNR is similar and strong enough to decode in each polarization, so it is enough to look at one of them. There is another recording done 10 minutes later, but it contains signals similar to this one.

The waterfall of the moment when the LEV-1 signal first appears is plotted using Inspectrum and shown here (click on the image to view it in full size). Two things stand out. First, there is noticeable fading. This is the only significant fading in this recording, but the second recording also shows some signal strength changes. This would be difficult to explain if LEV-1 was completely stationary on the Moon’s surface, and suggests that the hopper was moving.

The second unusual thing is that there is amplitude shift keying in the signal. This signal is in fact a typical PCM/PSK/PM signal, but the residual carrier is amplitude shift keyed with what looks like Morse code. Modulating the residual carrier in this way is highly unusual, because the residual carrier is supposed to be a stable phase reference for the demodulation of the signal, so shift-keying its amplitude doesn’t look like a great idea. The low-level amplitude of the residual carrier (which corresponds to the gaps between the dits and dahs) is very low, and the carrier almost disappears in the waterfall.

The telemetry subcarrier has a frequency of 2048 Hz and a symbol rate of 64 baud. This frequency zoom level is too low to see the modulation, so the telemetry subcarrier appears like a CW carrier at either side of the residual carrier. Its power level changes somewhat due to the amplitude shift keying of the residual carrier, but these power changes are much smaller than those of the residual carrier. Additionally, when the residual carrier is in the low-amplitude level, the second harmonic of the telemetry subcarrier is well visible in the waterfall.

Despite the amplitude shift keying on the residual carrier, it turns out a PLL can do a good job at tracking the phase of the residual carrier. I’m using a rather narrow PLL bandwidth of 5 Hz, which is quite reasonable for a UHF signal. I am using the following GNU Radio flowgraph to demodulate the PCM/PSK/PM telemetry signal and also to extract the amplitude data from the residual carrier.

This is what the GUI of the demodulator shows when running with this recording. Since the telemetry is only 64 baud, the constellation plot with this SNR is excellent. Note the second harmonic of the telemetry subcarrier, which appears in the in-phase component. This is typical when the subcarrier zero-crossings are smooth instead of instantaneous, since the smooth zero-crossings modulate the residual carrier amplitude.

After arriving at this point, I tried to determine the CCSDS coding without success. I used my typical approach of correlating the symbols against each of the CCSDS syncwords, and also performing Viterbi decoding for each of the possible variations of the CCSDS code and then correlating against the 32-bit syncword. I also tried to look at the autocorrelation of the symbols, but this didn’t yield any patterns.

Since none of this worked but the clue of 32 bits per second mentioned in the IARU coordination sheet compared with the 64 baud symbol rate strongly suggested the presence of convolutional coding, I did some more work to check if this signal was convolutionally encoded. I used the BCJR decoder that I had done for Voyager I with each of the 4 possible variants of the CCSDS code (two possible orders for the polynomials, plus an optional bit inversion in one of the encoder branches), as well as the two possible pairings of symbols. The nice thing about a BCJR decoder in comparison to a Viterbi decoder is that its output is LLRs (log-likelihood ratios). If the decoder is set for the wrong code, the LLRs will look like a mess, but if the decoder is set for the correct code, the LLRs will form a clean “constellation” that avoids values around zero. Of all the combinations tested, the CCSDS/NASA-GSFC convention with no inversion in one of the branches gave good LLRs, as shown here.
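To make the list of variants concrete, here is a sketch of a rate 1/2, constraint length 7 convolutional encoder using the CCSDS polynomials (171 and 133 in octal), together with an enumeration of the four variants mentioned above. This is not the BCJR decoder from the notebook. The function names are hypothetical, and the convention that the most significant bit of each polynomial multiplies the newest input bit is an assumption of this sketch.

```python
from itertools import product

CCSDS_POLYS = (0o171, 0o133)  # generator polynomials, constraint length 7


def conv_encode(bits, polys=CCSDS_POLYS, invert=(False, False), k=7):
    """Rate 1/2 convolutional encoder; returns the interleaved branch outputs."""
    state = 0
    out = []
    for b in bits:
        # The newest bit enters at the MSB and older bits shift towards the LSB
        state = (state >> 1) | (b << (k - 1))
        for poly, inv in zip(polys, invert):
            parity = bin(state & poly).count("1") & 1
            out.append(parity ^ int(inv))
    return out


def candidate_codes():
    """The 4 variants: two polynomial orders, with or without an inverted branch."""
    for swap, invert_second in product((False, True), repeat=2):
        polys = CCSDS_POLYS[::-1] if swap else CCSDS_POLYS
        yield polys, (False, invert_second)


# The impulse response reads out the bits of each polynomial, MSB first
impulse = conv_encode([1, 0, 0, 0, 0, 0, 0])
variants = list(candidate_codes())
```

Each of these variants would then be tried with the two possible pairings of received symbols (starting at an even or an odd symbol), for eight decoder runs in total.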

The output of the BCJR validated that I had found the correct convolutional code, and provided the decoded stream of bits. Still, I couldn’t find any syncword in this stream, and the autocorrelation didn’t show any patterns. Plotting the bits as a raster map with an arbitrary width gave a random-looking image. This made me suspect that the signal was scrambled with an asynchronous scrambler, so I tried to descramble it using a few common algorithms, and then looked at the raster map again.

The G3RUH descrambler gave something that still looked random, but the Intelsat IESS-308 descrambler gave something that looked promising. The output had long runs of zeros (after correcting for a polarity inversion caused by the BPSK 180 degree phase ambiguity). Moreover, the autocorrelation of the descrambled bits showed very large peaks at a lag of 560 bits and its integer multiples.
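For reference, an asynchronous (self-synchronizing) descrambler simply XORs each received bit with delayed copies of the received stream. The sketch below is a generic version with configurable taps, shown with the G3RUH polynomial 1 + x^12 + x^17. It is not the code used in the notebook, and the function names are hypothetical. The IESS-308 scrambler needs additional logic on top of this structure, which is not reproduced here.

```python
G3RUH_TAPS = (12, 17)  # delays for the polynomial 1 + x^12 + x^17


def scramble(bits, taps):
    """Multiplicative scrambler: feedback is taken from the scrambled output."""
    out = []
    for n, b in enumerate(bits):
        for t in taps:
            if n >= t:  # the register starts filled with zeros
                b ^= out[n - t]
        out.append(b)
    return out


def descramble(bits, taps):
    """Self-synchronizing descrambler: feedforward from the received bits."""
    out = []
    for n, b in enumerate(bits):
        for t in taps:
            if n >= t:
                b ^= bits[n - t]
        out.append(b)
    return out


# Round trip on a short test pattern that contains long runs of zeros
data = [1, 0, 1, 1] + [0] * 40 + [1, 1, 0, 1] + [0] * 20
recovered = descramble(scramble(data, G3RUH_TAPS), G3RUH_TAPS)
```

Because the descrambler only looks at past received bits, it recovers from an unknown starting state after as many bits as the longest delay, which is why trying several candidate descramblers on a stream of decoded bits is cheap.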

Still, no trace of any of the CCSDS syncwords. I plotted the raster map of the bits using a width of 560 bits, and noticed that some of the columns of this raster map looked like what could be a syncword. Writing these columns in hex gave `0xAAAAFAF320`. The `0xAAAA` part is just a series of alternating ones and zeros. The kind of thing that you would use as a preamble to train a receiver clock. But the `0xFAF320` part is a commonly used syncword. I know that it is used in CCSDS Proximity-1 (I mentioned this recently in my post about the MOVE-II cubesat), but apparently it is also used in IRIG 106. A Google search for “FAF320 syncword” reveals many results. Of special interest is this NASA report, which lists all the CCSDS syncwords, as well as the HDLC `0x7e` flag (which I had also tried to search in the stream of bits), and mentions the relation between `0xFAF320` and IRIG.

Indeed, correlating the stream of descrambled bits with the syncword `0xFAF320` reveals peaks every 560 bits. There is an exception to this, which is the first and second peak. They are at a distance of 1120 bits. I think that the explanation is that the first frame was of a different type, which was twice the length of the other frames.
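A syncword search of this kind can be sketched with NumPy as follows. This is an illustration on a synthetic bit stream, not the code from the notebook. The function name and the all-zero frame payload are placeholders.

```python
import numpy as np


def find_syncword(bits, syncword, nbits, max_errors=0):
    """Return the bit offsets where the syncword matches with few enough errors."""
    bits = np.asarray(bits, dtype=np.uint8)
    pattern = np.array(
        [(syncword >> (nbits - 1 - i)) & 1 for i in range(nbits)], dtype=np.uint8
    )
    # One row per candidate offset, each row holding nbits consecutive bits
    windows = np.lib.stride_tricks.sliding_window_view(bits, nbits)
    errors = np.count_nonzero(windows != pattern, axis=1)
    return np.flatnonzero(errors <= max_errors)


# Synthetic stream: three 560-bit frames that start with 0xFAF320 followed by zeros
sync_bits = [(0xFAF320 >> (23 - i)) & 1 for i in range(24)]
frame = sync_bits + [0] * (560 - 24)
stream = frame * 3
positions = find_syncword(stream, 0xFAF320, 24)
spacing = np.diff(positions)
```

On real data the payload is not all zeros, so allowing a small `max_errors` and looking at the spacing between detections helps to separate true syncwords from chance matches.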

The next figure shows the raster map of the bits in each frame. The width of the raster map is 560 bits, so the first two rows should be understood as the first frame, which is twice as long. It is apparent that it has a different format compared to the other frames. Its final part is all zeros. The remaining frames have all the same format, and fields that line up are apparent.

The following figure is the same kind of raster map, but with bytes instead of bits. Here it is clear that there are some telemetry fields that have increasing values. Some of these fields are 16-bit wide, because when a byte overflows, the previous byte increments by one. The last 2 bytes might be a CRC-16, but I haven’t managed to find the algorithm (there are many possible algorithms, and it is not clear if the preamble or syncword should be included in the CRC calculation).

I haven’t reverse engineered the telemetry data beyond this. It is not so easy, since there isn’t much telemetry to try to find patterns, and it doesn’t seem to follow CCSDS protocols. It would be great to have more documentation about the telemetry from the LEV-1 team, especially since this spacecraft is using the amateur satellite service and has gone through an IARU coordination process (in which one of the questions is whether technical documentation about the telemetry will be publicly available).

Now let us turn our attention to the Morse code modulating the residual carrier in amplitude. The next figure shows the amplitude of the residual carrier throughout all the recording. Each 10 second segment is represented as one trace in the plot. The plot should be read left to right and top to bottom as if it was text. To me it is clear that this is intended to be Morse code, as opposed to digital amplitude-shift-keying. All the runs of low levels and high levels appear to have length 1 or 3, which is what happens in Morse. A digital ASK modulation would typically have runs of other lengths. The IARU coordination sheet also mentions “CW morse beacon contains housekeeping data”.
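The run-length observation can be checked numerically once the amplitude has been thresholded into high and low levels. The sketch below assumes that the levels have already been resampled to one sample per Morse unit, which is a simplification. The function name and the example sequence are made up.

```python
import numpy as np


def run_lengths(levels):
    """Return (values, lengths) for each run of consecutive equal samples."""
    levels = np.asarray(levels)
    # Indices where the level differs from the previous sample
    change = np.flatnonzero(levels[1:] != levels[:-1]) + 1
    starts = np.concatenate(([0], change))
    ends = np.concatenate((change, [len(levels)]))
    return levels[starts], ends - starts


# Made-up keyed sequence with one sample per Morse unit
keyed = [1, 1, 1, 0, 1, 0, 0, 0, 1]
values, lengths = run_lengths(keyed)
# values  -> [1, 0, 1, 0, 1]
# lengths -> [3, 1, 1, 3, 1]
```

A histogram of `lengths` that is concentrated at 1 and 3 is what one would expect from Morse keying, whereas a generic ASK bit stream would also show runs of 2, 4 and other lengths.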

Despite being almost sure that this is Morse, it looks like nonsense to me. Many regular characters can be recognized, but there are also some strange sequences that might be rare prosigns (-….-.. and …-…. for instance). I have seen that other satellites transmit binary data in hexadecimal in Morse, but this doesn’t seem to be the case. Maybe some other radio amateurs can find the meaning of this, or perhaps someone from the LEV-1 team can tell us what this means.

In summary, what I have found about the LEV-1 437.41 MHz telemetry is that it is PCM/PSK/PM with a symbol rate of 64 baud and a 2048 Hz subcarrier. The residual carrier is modulated in amplitude with Morse code. The telemetry signal is convolutionally encoded using the CCSDS/NASA-GSFC convention and no inversion in one of the encoder branches (**correction:** the convolutional code has inversion in one of the branches, so it is exactly as the code recommended by CCSDS; see update below). There is IESS-308 scrambling before the convolutional encoding. The frames have a `0xAAAA` preamble followed by the syncword `0xFAF320`. However, the contents of the frames and the meaning of the Morse code data are unknown.

The GNU Radio flowgraphs and Jupyter notebooks used in this post, as well as the intermediate data files can be found in this repository.

**Update:** Norbert DL8LAQ has figured out how to read the Morse code. It turns out that it is inverted: the dahs and dits are represented by a low amplitude on the residual carrier, and the gaps by a high amplitude. I don’t know what was the reasoning behind this, since when one listens to the residual carrier, what is heard would be the opposite of what one expects. Perhaps it was intended to listen to the Morse code off the second harmonic of the telemetry subcarrier, for which the dahs and dits correspond to high amplitude levels.

Here is the same plot as above, but with the sign of the amplitude inverted. Now it is clear what Norbert says: the transmission is “CQ CQ DE JS1YMG” followed by a lot of hexadecimal data. The gaps between words are the same length as the gaps between letters, but otherwise this is perfectly readable Morse.

**Update 2024-01-22:** Scott Chapman K4KDR has found that the last two bytes of the frames are indeed a CRC-16. The CRC algorithm is CRC-16-CCITT-FALSE (see this online calculator), but there is the catch that the data included in the CRC calculation starts 3 bytes after the `0xFAF320` syncword. The three bytes that must be skipped are `0x003e20` in all the frames. This is probably some sort of header. This only applies to the 560 bit frames. I haven’t been able to find a way to validate the CRC of the first frame in the recording, which appears to be a 1120 bit frame (and the three bytes after its syncword are `0x005632` instead). I have updated the Jupyter notebook to include the CRC check, showing that all the frames except this first longer frame have a valid CRC.
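For readers who want to repeat the check, CRC-16-CCITT-FALSE uses the polynomial 0x1021 with an initial value of 0xFFFF, no bit reflection and no final XOR. The following sketch is not the code from the notebook. The helper `frame_crc_ok` is hypothetical, and the assumption that the CRC is stored most significant byte first has not been checked against the recording.

```python
def crc16_ccitt_false(data: bytes) -> int:
    """CRC-16-CCITT-FALSE: poly 0x1021, init 0xFFFF, no reflection, no final XOR."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def frame_crc_ok(after_syncword: bytes, skip: int = 3) -> bool:
    """Check a frame given the bytes that follow the syncword.

    The first `skip` bytes are excluded from the calculation, and the last two
    bytes are taken as the CRC (big-endian byte order is an assumption).
    """
    body, received = after_syncword[skip:-2], after_syncword[-2:]
    return crc16_ccitt_false(body) == int.from_bytes(received, "big")


# Standard check value for this CRC variant
check = crc16_ccitt_false(b"123456789")  # 0x29B1

# Made-up frame: 3 skipped bytes, an arbitrary payload and its CRC
payload = bytes(range(16))
example = bytes.fromhex("003e20") + payload + crc16_ccitt_false(payload).to_bytes(2, "big")
```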

**Update 2024-03-06:** I have realized that the convolutional code does have inversion in one of the branches. The reason why I thought there was no inversion is because my BCJR decoder had the inversion logic incorrectly labelled: it was always doing an inversion (since it was adapted from a Voyager-1 decoder, which needs this), and then it was doing an additional inversion when inversion was enabled. I have realized this when updating the GNU Radio decoder to include a Viterbi decoder. The GNU Radio decoder flowgraph now has two Viterbi decoders (one for each possible pairing of input symbols) followed by IESS-308 descramblers, and the output of these is saved to a file.

The initial release of galileo-osnma that I made in March 2022 was based on the Galileo OSNMA User ICD for Test Phase v1.0, which was the latest ICD at the time and matched the signal-in-space. In December 2022, a new ICD, the Galileo OSNMA SIS ICD v1.0 was published. This contained some breaking changes with respect to the previous ICD. Below I will outline some of the changes. The signal-in-space was updated to follow this ICD in August 2023. Since some of the ICD changes interfered with the operation of galileo-osnma, I did some updates in August to fix the largest problems.

Yet another new ICD, the Galileo OSNMA SIS ICD v1.1 was published in October 2023. This was a smaller change compared to the previous one, but still had some breaking changes. New entries in the MAC lookup-table were defined, and some of these included FLX (flexible) entries, which is a feature that was defined in previous ICDs, but not exercised before. In December 2023 the OSNMA parameters of the signal-in-space were updated to use the MAC lookup-table ID 34, which only appeared in the v1.1 ICD. In January 2024 I have made the updates to galileo-osnma required for the v1.1 ICD and also added several new features.

Besides updates in the ICDs, there have also been changes in the cryptographic material. In August 2023, together with the update of the signal-in-space to the SIS ICD v1.0, there was a full change in the cryptographic material. A new Merkle tree was published, together with a new ECDSA public key with ID 1 belonging to that tree. In December 2023 there was a public key revocation exercise. As a consequence of this exercise, the key was changed to a new ECDSA public key with ID 2, belonging to the same Merkle tree. I recorded the OSNMA data for a few days around this exercise, and I will be showing some results below. In January 2024 all the cryptographic material, including the Merkle tree, has been changed again, coinciding with some updates in documentation about key distribution.

Compared to the previous ICD, the SIS ICD v1.0, which began its applicability in August 2023, had a large number of changes. Some of these changes seem to have been intended to solve some shortcomings and corner cases that were present in OSNMA. I had mentioned some of these problems when speaking about my tests with galileo-osnma in 2022. Here I will outline the main changes.

A new COP field was added to the MACK Header, as shown here.

The COP field stands for Data Cut-Off-Point parameter. Briefly speaking, this parameter indicates how many 30 second intervals have passed since the last time that the data that is authenticated by this tag changed. It is used as a hint for receivers to determine whether they have the most recent version of the navigation data. If a receiver is not decoding INAV pages from some satellite because of low signal quality, it will still have stored the navigation data from that satellite that it managed to receive earlier. This data can be authenticated against the currently transmitted tags as long as the navigation data has not changed (because the tags always authenticate the most recent version of the navigation data).

Before the introduction of the COP field, there was really no way for a receiver to know whether the navigation data it had was the latest version, other than simply trying to authenticate it against the current tags. If the authentication failed, it wasn’t possible to tell if this was because the data was stale or because the data was spoofed.

Using the COP field in galileo-osnma required a partial rewrite of the module that stores the navigation data. The COP field makes it possible to remove all the heuristics that were used to determine when several words of the navigation message can be put together to form a set of navigation data. Before this field existed, words that have an IODnav could be checked for consistency, but the IODnav wasn’t present in all the words. Now that the COP field exists, it is possible to keep track of the “age” of each word separately (the number of 30 second intervals elapsed since a copy of this word was last received). When trying to authenticate the navigation data against a tag, if no word has an age greater than the COP, we know that these words form a matching and current set that can be authenticated against the tag.
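The age rule described above can be illustrated with a few lines of code. This is only a sketch of the idea in Python, not the Rust implementation in galileo-osnma. The function name, the dictionary layout and the example word numbers are hypothetical, and any special COP values defined by the ICD are not handled.

```python
def can_authenticate(word_ages, cop):
    """Decide whether the stored words form a current set for a tag.

    word_ages maps each required word to its age, counted in 30 second
    intervals since a copy of the word was last received (None if the word
    has never been received). The set is usable if no age exceeds the COP.
    """
    return all(age is not None and age <= cop for age in word_ages.values())


# Hypothetical example: five stored words and a tag with COP = 2
ages = {1: 0, 2: 0, 3: 1, 4: 2, 5: 2}
usable = can_authenticate(ages, cop=2)

# One stale word (age 3) is enough to reject the whole set
stale = can_authenticate({**ages, 4: 3}, cop=2)
```

The point of the rule is that a failed tag check on a set that passes this test can no longer be blamed on stale data.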

Another important change was in the definition of the ADKD=4. This was actually the main breaking change of this ICD, since a receiver that ignored the COP field would continue to work. The ADKD=4 data refers to the Galileo constellation timing parameters, which are the GST-UTC and GST-GPST conversion parameters. These parameters are supposed to be global for all the constellation, so the older ICD defined the PRND associated with the parameters as 255. The navigation data for these global parameters could be received from any satellite, since it is assumed that all the satellites are broadcasting the same values for these parameters.

However, in practice this is not the case. It is not rare to see that different satellites are transmitting different values for these parameters simultaneously. In the tests that I did with galileo-osnma, this occasionally caused authentication failures for ADKD=4, especially when using the live data from Galmon, which includes the data for all the satellites in the constellation, instead of only the ones in view for a particular receiver.

Rather than fixing the root cause of the problem, which is having satellites transmitting contradicting information for the timing parameters, what the OSNMA SIS ICD v1.0 has done is to embrace the problem and now make the ADKD=4 data satellite dependent, in the same way that the ephemeris and satellite clock is also satellite dependent. This means that now the constellation timing parameters must be collected and stored separately for each satellite, instead of globally, just in case there are some satellites transmitting different versions of this data.

Implementing this change has increased somewhat the memory footprint of galileo-osnma. Most of the footprint is caused by the storage of the navigation message data. After implementing this change, the application still fits in the 32 KiB RAM microcontroller that I’m using as a demo, but it is a pity that the system has solved the problem in this way (which potentially also requires more time to authenticate the navigation data) instead of solving the root cause of the problem.

The main change of SIS ICD v1.1 with respect to v1.0 is the addition of new entries to the MAC look-up table. The table for v1.0 had only 4 entries.

In v1.1, 8 entries are added, for a total of 12. Many of these new entries have FLX (flexible) slots, which means that the type of the tag transmitted in the slot is not fixed by the table.

Whereas with a 4-entry table it was reasonable to implement the table as a `match` statement that exploited the regularities of the table, with a much larger 12-entry table a much better solution is to encode the table itself in some form as read-only data. This is what I have done.

Besides the need for a change in the implementation approach, which is more of an anecdote than anything else, I would like to take this ICD change as an example of what I believe to be somewhat of a culture problem in the Galileo system engineering. In this and other occasions interacting with the system, I have felt that there is a strong tendency to include too many options for the sake of flexibility. Often, many of these options are too similar to offer real advantages, and since there are many, most of them never get exercised in practice, and receivers don’t get all the testing they should. The designers seem to be willing to take this approach in several aspects of the system instead of making studies to decide which configuration is the best (or at least, good enough). This reminds me of a friend of mine, who says “if you can’t make it perfect, make it adjustable”. This is a great principle in engineering, but it can only go so far.

The MAC look-up-table in the ICD v1.1 is the clearest example of this situation in OSNMA. To me, this table looks a bit ridiculous because I just fail to see the point of having 12 different entries, with many of these entries having FLX slots (which are like joker cards). However, it is important to remember that this is not the only flexible aspect of OSNMA. The ICD specifies two cryptographic algorithms for each of the functions: ECDSA P-256 vs ECDSA P-521, SHA-256 vs SHA3-256, and HMAC-SHA-256 vs CMAC-AES (maybe the thinking was that if one of the algorithms gets broken, the system can immediately switch to the other one), and variable size for the TESLA keys and the MAC tags. If I remember correctly, only one choice of each of these options has been used so far.

In January I have also implemented some features that were missing in the initial release. These are listed here.

Two years ago I didn’t include ECDSA P-521 support because there wasn’t a suitable implementation of this elliptic curve in Rust. Since March 2023 there is the p521 crate, which has made implementing this as easy as ECDSA P-256. However, P-521 support is still marked as experimental, because so far the signal-in-space has only used P-256, and there are no test vectors for P-521 published.

Additionally, P-521 support is gated behind a feature (which is enabled by default). The reason is that including it makes the osnma-longan-nano firmware larger than the 128 KiB of flash in the microcontroller used for this demo. Therefore, since P-521 isn’t used currently (and maybe never will be), it makes sense to make this optional, for the sake of code size in small embedded applications.

The DSM-PKR message is used to transmit a new public key, which is authenticated against the Merkle tree root. I didn’t implement this in my initial release, because the most common way of operating an OSNMA receiver is to load a trusted public key directly. DSM-PKR messages are only transmitted every 6 hours or when the public key changes. Since I wanted to test the December 2023 public key revocation scenario, I needed to have this feature working, so I have implemented it.

Non-nominal scenarios include public key renewal, public key revocation, chain renewal, chain revocation, Merkle tree change, and alert message. I have updated galileo-osnma to handle all these situations correctly. In order to handle renewal scenarios seamlessly, two copies of each cryptographic element need to be stored: a TESLA key for the current chain and a TESLA key for the next chain, as well as the current public key and the next public key. By doing this, when the change takes place, the new cryptographic material is already present in the receiver.
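The idea of keeping both the element in force and its successor can be sketched as a small two-slot container. This is only an illustration in Python with hypothetical names (`Slots`, `store_next`, `advance`); it is not how galileo-osnma is organized internally.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Slots(Generic[T]):
    """Holds the element in force and, optionally, its announced successor."""

    current: Optional[T] = None
    next: Optional[T] = None

    def store_next(self, item: T) -> None:
        # Called when material for the upcoming chain or key is received.
        self.next = item

    def advance(self) -> None:
        # Called when the change takes effect: the successor is already stored.
        if self.next is not None:
            self.current, self.next = self.next, None


# Placeholder strings stand in for real TESLA keys (hypothetical values).
tesla_keys = Slots(current="key for chain 0")
tesla_keys.store_next("key for chain 1")
tesla_keys.advance()
print(tesla_keys.current)
```

The same container could hold public keys, so that a renewal only needs a call to `advance()` when the NMA header announces the change.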

The public key revocation exercise was done on 2023-12-13. I recorded INAV data with a uBlox receiver at home between 2023-12-12 and 2023-12-15 using Galmon’s tools. I have published this data in the dataset “Galileo INAV data for OSNMA public key revocation exercise in December 2023” on Zenodo.

The following diagram from the ICD shows what a public key revocation should look like.

The Merkle tree root corresponding to this exercise was

0E63F552C8021709043C239032EFFE941BF22C8389032F5F2701E0FBC80148B8

The ECDSA public key in force at the beginning of the exercise had ID 1 and was

-----BEGIN PUBLIC KEY-----

MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEdKklz6D/GAXlxaWP26Mb8BRdW1vi

8GLT+Lsu6Y8PbbCBqF1dQK+JP+3Ss8YUnBSSGfDcEp9YCpRYO8lZE3fSqQ==

-----END PUBLIC KEY-----

Running the galmon-osnma tool in galileo-osnma v0.7.0 with `RUST_LOG=info` and this cryptographic material shows that on 2023-12-12 the DSM-KROOT was signed with public key ID 1 and the TESLA chain parameters were

Chain { id: 0, hash_function: Sha256, mac_function: HmacSha256, key_size_bytes: 16, tag_size_bits: 40, maclt: 34, alpha: 250728117155926 }

The DSM-PKR was transmitting public key ID 1 at the appropriate times (00:00, 06:00, 12:00 and 18:00 GST).

On 2023-12-13, at TOW 299131 (11:05:31 GST), a DSM-PKR for public key ID 2 is first transmitted:

verified public key in DSM-PKR: DsmPkr { number_of_blocks: Some(13), message_id: 1, intermediate_tree_node_0: [229, 83, 10, 51, 213, 203, 96, 201, 80, 22, 184, 174, 199, 69, 147, 219, 205, 242, 113, 29, 57, 158, 162, 72, 105, 23, 60, 162, 41, 55, 154, 21], intermediate_tree_node_1: [49, 111, 169, 40, 95, 90, 30, 68, 4, 36, 19, 189, 175, 24, 170, 60, 246, 132, 114, 51, 151, 215, 184, 50, 90, 236, 161, 235, 202, 159, 15, 100], intermediate_tree_node_2: [153, 5, 66, 76, 190, 72, 42, 26, 50, 176, 16, 100, 248, 93, 12, 54, 223, 3, 142, 82, 206, 18, 142, 126, 197, 243, 35, 225, 101, 177, 130, 167], intermediate_tree_node_3: [21, 55, 189, 176, 16, 151, 46, 180, 163, 185, 11, 170, 205, 20, 148, 30, 244, 13, 162, 203, 43, 130, 211, 120, 179, 21, 192, 8, 222, 206, 253, 142], new_public_key_type: EcdsaKey(P256Sha256), new_public_key_id: 2, new_public_key: Some([3, 53, 120, 229, 199, 17, 169, 195, 189, 221, 28, 164, 238, 133, 247, 197, 27, 54, 120, 151, 203, 64, 184, 133, 104, 160, 200, 151, 218, 48, 239, 183, 195]), padding: Some([36, 224, 34, 44, 144, 128]) }

The same DSM-PKR continues to be transmitted periodically.

At TOW 299371 (11:09:31 GST), step 1 begins. The NMA status and CPKS change to “don’t use” and “public key revoked”, indicating that the current public key has been revoked. The chain ID in the NMA header is still 0. At the same moment, the DSM-KROOT starts to be signed by public key ID 2. The chain ID of this DSM-KROOT is 1 (it was 0 previously). The remaining parameters of the chain do not change. The OSNMA ICD indicates that under these circumstances the receiver must discard the TESLA key corresponding to the chain ID in the NMA header. galileo-osnma does this, and from this point on it is not able to authenticate MAC tags.

This situation continues until TOW 306541 (13:09:01 GST), when step 2 begins. The NMA status changes to test. The CPKS is still public key revoked, so this indicates that the past key has been revoked. The chain ID in the NMA header is now 1. In this situation galileo-osnma can authenticate the TESLA key transmitted in this subframe using the KROOT with TOW 305970 (12:59:30 GST) that it had received previously in the DSM-KROOTs for chain ID 1 (signed with public key ID 2). galileo-osnma can now check the MAC tags transmitted in the previous subframe, and from this point on it continues authenticating MAC tags successfully.

The CPKS continues indicating public key revoked until TOW 385291 (2023-12-14 11:01:31 GST), when it changes to nominal. This is the beginning of step 3, and thus the end of the public key revocation process. After this point, OSNMA operates normally.
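The times of day quoted next to each TOW can be reproduced by adding the time of week to the start of the Galileo week. The following is a small sketch, assuming the week in question begins on Sunday 2023-12-10 at 00:00:00 GST; the helper name `tow_to_gst` is mine, and the naive `datetime` values are to be read as GST.

```python
from datetime import datetime, timedelta

# Assumed start of the Galileo week containing the exercise (a Sunday), in GST.
WEEK_START = datetime(2023, 12, 10)


def tow_to_gst(tow: int, week_start: datetime = WEEK_START) -> datetime:
    """Convert a time of week in seconds to a GST date and time."""
    return week_start + timedelta(seconds=tow)


# TOW values mentioned in the description of the revocation exercise.
for tow in (299131, 299371, 305970, 306541, 385291):
    print(tow, tow_to_gst(tow))
```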

The data I collected during this exercise indicates that the signal-in-space followed what the ICD describes during the public key revocation and that galileo-osnma processes the revocation correctly. It discards the public key and TESLA key when the revocation begins, and is not able to authenticate MAC tags during step 1. When step 2 begins, galileo-osnma immediately can begin authenticating MAC tags again using the KROOT for the new chain received during step 1.

In my post presenting the initial release of galileo-osnma I mentioned that I believed that it was important to have an open-source implementation of OSNMA. Even if the cryptographic algorithms are correctly implemented and secure, subtle bugs in the logic of an OSNMA implementation can cause security issues, so having an open-source implementation with many eyes looking at it is a good way of improving the security. Since there was no OSNMA reference implementation provided by Galileo, I implemented galileo-osnma in Rust, which is a relatively safe systems programming language that is suitable for embedded, and released it under a permissive open-source license, so that it could be integrated in any kind of commercial project. My intention was to see if this would attract any interest from the industry and if together we could have a common OSNMA implementation that is better, especially in terms of security.

Nothing like this has happened. I haven’t seen too much interest in my galileo-osnma implementation (though it is always difficult to judge the impact of open-source projects), and haven’t heard any feedback from industry. I guess that each company working on OSNMA is rolling their own implementation, much to the detriment of everyone (as the saying goes, “don’t roll your own cryptography”). Probably OSNMA is still a very niche topic, since it is in a test phase and it only provides a very specialized feature related to the security of just one particular GNSS system. So I still have good hopes for the future of galileo-osnma.

In the post I wrote two years ago I also did an overview of other open-source implementations of OSNMA. I mentioned osnma_core and osnma-receiver by Aleix Galán. Apparently, in April 2022 these projects were refactored into OSNMAlib, which forms part of Aleix’s ongoing PhD thesis at KU Leuven. OSNMAlib looks very capable and well maintained.

OSNMAlib has some additional features to recover cryptographic material from partially missing data. Citing from the README, “OSNMAlib implements several optimizations in the cryptographic material extraction and in the process of linking navigation data to tags. None of these optimizations imply trial-and-error on the verification process, any authentication failure should be assumed as spoofing.” As an example of these techniques, OSNMAlib is able to use the tags in the MACK sections that were received, even if some sections of the MACK message are missing. It is also able to piece together the TESLA key from MACK sections transmitted by different satellites, since all the satellites transmit the same key. These techniques can be very helpful when some INAV words are lost because of bad signal quality. However, it would be interesting to review whether they really have no impact on the security of the OSNMA protocol.

OSNMAlib is licensed under the EUPL-1.2, which is a copyleft license (the former project osnma_core was GPLv3). This limits its applicability in commercial projects. Additionally, it is written in Python, which is not suitable for small embedded systems.

As part of the OSNMAlib project there is a very nice webpage at osnmalib.eu that contains a real-time view of the status of OSNMA. Currently it gives a view of the data from a Septentrio receiver at KU Leuven, and also of the data collected by Galmon. The webpage works by using a logging feature of OSNMAlib that dumps the state of the OSNMA receiver every 30 seconds. This dump is converted to a simple static HTML, which is rendered with some nice CSS. The HTML page is refreshed when this happens by some JavaScript. The approach is simple, but effective. Unfortunately, at the moment it seems that none of the code related to this webpage is publicly available.

I’m interested in this kind of webpage because it is something like Galmon but for OSNMA. When I released galileo-osnma, I had in mind to integrate it with Galmon so that the OSNMA status could show up in Galmon’s webpage. However, I lacked the time and motivation to do this. Nowadays it seems that Galmon is somewhat unmaintained, and even basic PRs are open for many months without merging. Bert Hubert, the project lead, seems to have moved on to some other equally interesting and great projects.

If the code used to produce the osnmalib.eu webpage was released as open-source, this would give me an easy way to display the status of galileo-osnma in a webpage. I could probably adapt the `galmon-osnma` demo application to output the same kind of logging dump as OSNMAlib. At the moment, the galileo-osnma API doesn’t have the features required to get the detailed information about the OSNMA data that is being received, which is what is needed for this webpage. All that the high-level API provides is the most current navigation data that is successfully authenticated. This is really the only thing that a GNSS receiver cares about, unless it intends to do detailed monitoring of OSNMA.

I have thought of maybe adding some callbacks to the galileo-osnma API defined through a Rust trait. This would allow external applications to provide custom behaviour at compile time that would run whenever some piece of data is received or some event happens, and there would be no cost if this feature is not used. With this callback idea it would be simple to output the data required for osnmalib.eu, or to log all the OSNMA data to InfluxDB or Redis.

Another OSNMA implementation in Python is osnmaPython, by Marc Cortés-Fragas. The first commit on this repository was in December 2021 and it was last updated in November 2022, so it doesn’t follow the current signal ICDs and is perhaps abandoned. The project has no license.

Yet another Python implementation is py-osnma-parser, by hmk3r. He says that this was part of his MSc at ETH Zurich, and that he proposed some attack concepts against OSNMA, which is quite interesting. The first commit on this repository was in April 2022, and the last code update was in September 2022. The repository has been updated more recently (September 2023) to include some documents and to update the README to say that the code is currently broken due to the signal-in-space changes in August 2023. Since the project was part of a finished MSc program, it seems that it is now abandoned. Yet again, this project has no license.

Finally, there is fgi-osnma. This is relatively new (first commit in September 2023), but it comes from the Finnish Geospatial Research Institute, which is a well-known institution in the GNSS community. The project was presented at ION GNSS+ 2023 and seems active (last update in November 2023). It is implemented in Python and released under the GPLv3 license.

My conclusion from this quick survey, based on a GitHub repository search for the name “osnma”, is that Python is a prevalent language in open-source implementations of OSNMA, perhaps because these implementations are mainly focused on research. It is also somewhat alarming that two of the implementations have no license. The two projects that do have a license are under copyleft licenses.

Therefore, I think that even though I have spent long periods without updating galileo-osnma, it still holds an important position in the open-source OSNMA scene. First and more importantly, because it is the only implementation done in a systems programming language. An implementation that is not written in a compiled language such as C, C++, Rust, or similar is probably not going to find its way into an embedded GNSS receiver. Second, and also importantly, because it is the only implementation that is released under a permissive open-source license. This makes it possible to use it in all sorts of commercial applications, and might result in an implementation that is better and more secure for everyone, if the project manages to gain some more traction.

In this post I won’t speak about propulsion anomalies, but rather about low-level technical details of the communications system, as I usually do. Peregrine Mission One, or APM1, which is NASA DSN’s code for the mission, uses the DSN groundstations for communications, as many other lunar missions have done. However, it is not technically a deep space mission. In CCSDS terms, it is a Category A mission rather than a Category B mission (see Section 1.5 in this CCSDS book), since it operates within 2 million km of the Earth. Communications recommendations and usual practices are somewhat different between deep space and non-deep space missions, but APM1 is especially interesting in this sense because it differs in several aspects from what typical deep space missions and other lunar missions look like.

For this post I have used some IQ recordings done by the AMSAT-DL team with the 20 metre antenna at Bochum Observatory. To my knowledge, these recordings are not publicly available.

Alan Antoine F4LAU did a preliminary analysis of modulation and coding shortly after launch. The signal is PCM/PSK/PM, which means that telemetry is BPSK modulated on a subcarrier, which is then phase modulated with residual carrier onto the RF carrier. The baudrate is 13 kbaud, and the subcarrier frequency is 1024 kHz. Due to the large subcarrier over baudrate ratio, the signal leaves a huge empty gap between the data subcarrier and the residual carrier, and occupies much more bandwidth than it should. For a baudrate as low as 13 kbaud, typically much lower subcarrier frequencies of around 60 kHz are used. This is a CCSDS recommendation even for Category A missions (see 2.4.14A in the blue book). I wonder if there was a reason behind the choice of such a large subcarrier frequency, such as perhaps using the gap between the data subcarrier and the residual carrier for ranging signals, to somehow use the large separation of the data subcarriers for navigation, or to accommodate much larger symbol rates using the same subcarrier frequency.

The coding is the CCSDS C2 LDPC code. This is a code that is designed for near Earth missions that need to operate at high data rates on the order of 100 Mbps. It focuses on low encoding and decoding complexity to allow these fast rates, rather than on achieving very good Eb/N0 performance. Thus, it is favoured by Earth observation satellites and similar missions, which need to transfer large volumes of data but usually have good link budgets. Section 8 of the TM Synchronization and Coding green book contains a more detailed discussion of this LDPC code and how it compares with the AR4JA family of codes, which are designed for good Eb/N0 performance and are the ones that are typically used in deep space and lunar missions (for instance, the Artemis I Orion vehicle used the r = 1/2 AR4JA code).

The C2 code has a rate of approximately 7/8. More precisely, the rate is 223/255, since the C2 code is an (8160, 7136) code designed so that the coded block and information block sizes are exactly 4 times those of the Reed-Solomon (255, 223) code. This makes it easier to replace Reed-Solomon by the C2 code in existing systems. The DSN has good support of the C2 code, as can be seen at the bottom of Table 3 in this document. Nevertheless, I can’t help but think that using the C2 code instead of an AR4JA code for this mission is an unusual and interesting choice. To me, the C2 code feels like a thing typical of low Earth orbit satellites, not deep space or lunar missions. Maybe Astrobotic based their communications system on existing low Earth orbit technology. This is not necessarily a bad thing. After all, there is a reason why the DSN supports the C2 code in addition to the AR4JA codes.

Since I hadn’t dealt with the C2 code before, I have added a command to ldpc-toolbox to generate the alist for the code. This allows me to use gr-ldpc-toolbox (or any other LDPC decoder that supports alists) to decode the LDPC codewords. However, the C2 code is somewhat peculiar because it is obtained from a basic (8176, 7156) LDPC code by expurgation, shortening and extension. The following figure is a good representation of what these operations involve.

The LDPC decoder works with the unshortened 8176-bit codeword, but what we receive is the shortened 8160-bit codeword. Therefore, to decode, it is necessary to discard the symbols corresponding to the 2 zero fill bits, and add 18 symbols with LLRs that indicate a very high confidence in the bit zero in the place of the 18 virtual fill zeros. We treat the resulting codeword as a codeword of a systematic (8176, 7154) code with the same alist as the basic code (note that the value of \(k\) differs by 2 from that of the basic code), and drop the first 18 bits of the decoded codeword. This gives us the 7136-bit information word. All these operations can be done with stream and vector operations in GNU Radio, so it is possible to adapt the LDPC decoder from gr-ldpc-toolbox to work with the C2 code by adding some external blocks.
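Outside of GNU Radio, the same codeword manipulations can be sketched with NumPy. This is a minimal illustration rather than the flowgraph itself, and it rests on some assumptions of mine: positive LLRs mean bit 0, the 2 zero fill bits sit at the end of the transmitted codeword, the 18 virtual fill zeros sit at the beginning, and the decoder returns the 7154 systematic bits. The function names and the `STRONG_ZERO` value are hypothetical.

```python
import numpy as np

N_DECODER = 8176      # codeword length the LDPC decoder works with
N_TRANSMITTED = 8160  # codeword length that is actually received
VIRTUAL_FILL = 18     # zeros that are never transmitted
ZERO_FILL = 2         # zeros present in the transmitted codeword
K_INFO = 7136         # information word size
STRONG_ZERO = 1e3     # assumed LLR for "certainly a zero" (positive = bit 0)


def c2_decoder_input(llrs):
    """Turn 8160 received LLRs into the 8176 LLRs fed to the decoder."""
    llrs = np.asarray(llrs, dtype=np.float32)
    if llrs.size != N_TRANSMITTED:
        raise ValueError("expected one LLR per transmitted bit")
    received = llrs[: N_TRANSMITTED - ZERO_FILL]  # discard the 2 zero fill bits
    fill = np.full(VIRTUAL_FILL, STRONG_ZERO, dtype=np.float32)
    return np.concatenate([fill, received])  # put the virtual fill zeros back


def c2_information_word(systematic_bits):
    """Drop the 18 virtual fill bits from the decoded systematic part."""
    bits = np.asarray(systematic_bits)
    return bits[VIRTUAL_FILL : VIRTUAL_FILL + K_INFO]


decoder_input = c2_decoder_input(np.zeros(N_TRANSMITTED))
print(decoder_input.size)  # same length as the basic code
```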

The following figure shows the GNU Radio flowgraph that I have used to decode Peregrine Mission One. The first part of the flowgraph is a typical demodulator for PCM/PSK/PM. The second part of the flowgraph includes the LDPC Decoder block from gr-ldpc-toolbox and the auxiliary blocks used for the codeword manipulations.

This is the GUI of the flowgraph running on one of the AMSAT-DL recordings. In this case the SNR is very good and there are no bit errors, but the signal has some periodic fading, and there are times when the SNR becomes much worse. The large empty space between the data subcarrier and the residual carrier can be seen in the spectrum. There are CW tones at what appears to be one half of the subcarrier frequency. I don’t know if these are an artefact of the signal generation, or purposely used for navigation.

For this analysis I have used a recording done on 2024-01-08 22:10:22 UTC. The recording has a duration of 1 hour and 6 minutes.

The frames are CCSDS AOS frames. The spacecraft ID is `0xE0`. This doesn’t appear to be in the SANA registry. Virtual channels 0 and 63 are in use. Virtual channel 0 contains telemetry, and virtual channel 63 is the only-idle-data virtual channel. There are very few frames from virtual channel 63, since most often there is something to send on virtual channel 0. All the frames have a Frame Error Control Field (CRC-16), which is checked by the GNU Radio decoder.
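As a rough sketch of what reading these fields involves, the following Python parses the first bytes of the AOS primary header and checks the FECF. It assumes the standard AOS layout (2-bit version, 8-bit spacecraft ID, 6-bit virtual channel ID, 24-bit virtual channel frame count) and the usual CCSDS CRC-16 (polynomial `0x1021`, initial value `0xFFFF`). The frame built at the end is synthetic and much shorter than a real one; it is not taken from the recording.

```python
def parse_aos_header(frame: bytes) -> dict:
    """Extract the main fields of the AOS transfer frame primary header."""
    word = int.from_bytes(frame[:2], "big")
    return {
        "version": word >> 14,
        "spacecraft_id": (word >> 6) & 0xFF,
        "virtual_channel_id": word & 0x3F,
        "vc_frame_count": int.from_bytes(frame[2:5], "big"),
    }


def crc16_ccsds(data: bytes) -> int:
    """Bitwise CRC-16 with polynomial 0x1021 and initial value 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = (crc << 1) ^ 0x1021 if crc & 0x8000 else crc << 1
            crc &= 0xFFFF
    return crc


def fecf_ok(frame: bytes) -> bool:
    """Compare the CRC of the frame body with the trailing 2-byte FECF."""
    return crc16_ccsds(frame[:-2]) == int.from_bytes(frame[-2:], "big")


# Synthetic idle frame: version field 1, spacecraft ID 0xE0, virtual channel 63.
first_word = (1 << 14) | (0xE0 << 6) | 63
body = first_word.to_bytes(2, "big") + (258).to_bytes(3, "big") + b"\x00"
body += b"\x8d" * 20
frame = body + crc16_ccsds(body).to_bytes(2, "big")
print(parse_aos_header(frame), fecf_ok(frame))
```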

All the bytes in the virtual channel 63 frames, except for the AOS primary header and the FECF, are filled with the value `0x8d`. This is an unusual choice. The values `0xaa` or `0x55` (which have alternating ones and zeros in binary), `0x00`, or an 8-bit counter, are more common choices.

The frames in virtual channel 0 use the M_PDU protocol to carry CCSDS Space Packets. These frames also have an Operational Control Field (OCF) carrying a Communications Link Control Word (CLCW). The contents of the CLCW of the first decoded frame in virtual channel 0 are the following.

    Container:
        control_word_type = False
        clcw_version_number = 0
        status_field = 0
        cop_in_effect = 1
        virtual_channel_identification = 0
        rsvd_spare = 0
        no_rf_avail = False
        no_bit_lock = False
        lock_out = False
        wait = False
        retransmit = False
        farm_b_counter = 1
        rsvd_spare2 = 0
        report_value = 245
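For reference, the CLCW is a 32-bit word, and these fields can be pulled out with plain shifts and masks. The sketch below follows the standard CLCW bit layout and uses the same field names as the listing above. The example word `0x010002F5` is reconstructed by me from the listed field values, not copied from the recording.

```python
def parse_clcw(ocf: bytes) -> dict:
    """Split a 4-byte Communications Link Control Word into its fields."""
    word = int.from_bytes(ocf, "big")

    def field(shift: int, width: int) -> int:
        return (word >> shift) & ((1 << width) - 1)

    return {
        "control_word_type": bool(field(31, 1)),
        "clcw_version_number": field(29, 2),
        "status_field": field(26, 3),
        "cop_in_effect": field(24, 2),
        "virtual_channel_identification": field(18, 6),
        "rsvd_spare": field(16, 2),
        "no_rf_avail": bool(field(15, 1)),
        "no_bit_lock": bool(field(14, 1)),
        "lock_out": bool(field(13, 1)),
        "wait": bool(field(12, 1)),
        "retransmit": bool(field(11, 1)),
        "farm_b_counter": field(9, 2),
        "rsvd_spare2": field(8, 1),
        "report_value": field(0, 8),
    }


# Word rebuilt from the field values shown above (not raw recorded data).
clcw = parse_clcw(bytes.fromhex("010002f5"))
for name, value in clcw.items():
    print(name, "=", value)
```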

The only field that changes value throughout the recording is the report value, which increments as each telecommand is received. The value of this field is plotted here. In total, 14 telecommands were received during the recording.

The next figure shows the frame loss in virtual channel 0, computed using the virtual channel frame count. There are periods when no frames are lost, but there are other periods where the frame loss is much higher. In total, there is a frame loss of around 14%.
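The frame loss computation can be sketched as follows, assuming that the virtual channel frame count is the standard 24-bit counter that wraps around, and that frames are neither duplicated nor reordered. The function names and the example counter values are hypothetical; this is not the code from my notebook.

```python
def count_lost_frames(frame_counts, modulus=1 << 24):
    """Count the frames missing between consecutive received frame counts."""
    lost = 0
    for previous, current in zip(frame_counts, frame_counts[1:]):
        # A step of exactly one means nothing was lost; wrap-around is handled
        # by taking the difference modulo the counter size.
        lost += (current - previous - 1) % modulus
    return lost


def loss_fraction(frame_counts, modulus=1 << 24):
    """Fraction of transmitted frames that were not received."""
    lost = count_lost_frames(frame_counts, modulus)
    return lost / (lost + len(frame_counts))


# Made-up counts with a wrap-around: frames 16777215, 2 and 3 are missing.
example = [16777213, 16777214, 0, 1, 4]
print(count_lost_frames(example), loss_fraction(example))
```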

The periods when there is large frame loss correspond to fading in the signal. I haven’t tried to optimize the decoder for best results, but at times the constellation plot looks like this due to fading. The C2 code needs around 4 dB Eb/N0 to work, and probably the fades are going below this value.

The next figure shows the virtual channel usage, measured in 10 frame averages. Each spike in VC 63 corresponds to a single frame.

There is something quite unusual about the Space Packets transmitted in Virtual Channel 0. Most of them have the value 1 rather than 0 in the packet version number field in the primary header. The CCSDS Space Packet Protocol blue book specifies that the packet version number field should be zero. This is a somewhat confusing aspect of CCSDS: protocol versions are numbered starting at 1 (so the Space Packet Protocol is actually version 1), but the version number field encodes version minus one rather than version, so its value should be 0, not 1. In fact, packet version numbers are registered in SANA, and packet version 2 (which corresponds to the value 1) was formerly used for SCPS-NP CCSDS 713.0-B-1, but is now deprecated.

More interestingly, the packets from APID 128 correctly have a value of 0 in the packet version number field. However, the packets from the rest of the APIDs have the value 1. I wonder what is special about APID 128. Maybe it is generated by a different software. Due to this mistake with the packet version number, I had to patch my Space Packet defragmentation code, since it checks the packet version number as a sanity check.

APID 255 is used for idle Space Packets. This is another deviation from the CCSDS standard, which specifies that the idle APID should be 2047 (all ones in an 11-bit field). Maybe Peregrine Mission One is limited to 8-bit APIDs for some reason, because all the APIDs used are smaller than 256.
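To make the fields involved concrete, here is a small sketch of a Space Packet primary header parser using the standard 6-byte layout (3-bit version, type, secondary header flag, 11-bit APID, sequence flags, 14-bit sequence count, and data length minus one). The example header is made up by me to mimic a packet with version field 1; it is not taken from the recording.

```python
STANDARD_IDLE_APID = 0x7FF  # all ones in the 11-bit APID field


def parse_space_packet_header(header: bytes) -> dict:
    """Parse the 6-byte CCSDS Space Packet primary header."""
    w0 = int.from_bytes(header[0:2], "big")
    w1 = int.from_bytes(header[2:4], "big")
    w2 = int.from_bytes(header[4:6], "big")
    return {
        "version": w0 >> 13,
        "type": (w0 >> 12) & 1,
        "secondary_header_flag": (w0 >> 11) & 1,
        "apid": w0 & 0x7FF,
        "sequence_flags": w1 >> 14,
        "sequence_count": w1 & 0x3FFF,
        # The length field stores the number of data bytes minus one.
        "data_length": w2 + 1,
    }


# Hypothetical header: version field 1, secondary header present, APID 200.
example_header = bytes.fromhex("28c8c0000013")
parsed = parse_space_packet_header(example_header)
print(parsed)
print("standard idle APID:", STANDARD_IDLE_APID)
```

A sanity check such as `parsed["version"] == 0` is what rejects most of the packets in this recording unless it is relaxed.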

The packets in all the APIDs except APID 255 have the secondary header flag enabled. It appears that the length of the secondary header is 10 bytes. The first 2 bytes have a relatively small set of possible values: `0x044f`, `0x082f`, `0x083f`, `0x08ee`, `0x0c3f`, `0x0c4f`. I don’t know what these mean. The next two bytes are always `0x001e`. The remaining 6 bytes are a timestamp, encoded as a 32-bit integer giving the number of seconds elapsed since the J2000 epoch (2000-01-01 12:00:00) and a 16-bit integer giving the fraction of the second.
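A possible way to decode this 10-byte secondary header is sketched below. I am assuming big-endian fields, treating the epoch as 2000-01-01 12:00:00 UTC, and ignoring leap seconds, since the exact timescale is not something I know. The function name and the example bytes are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Epoch treated as UTC here; the actual timescale is an assumption.
J2000 = datetime(2000, 1, 1, 12, 0, 0, tzinfo=timezone.utc)


def parse_secondary_header(sec: bytes):
    """Split the 10-byte secondary header into its three parts."""
    prefix = int.from_bytes(sec[0:2], "big")     # meaning unknown
    marker = int.from_bytes(sec[2:4], "big")     # always 0x001e in the recording
    seconds = int.from_bytes(sec[4:8], "big")    # whole seconds since the epoch
    fraction = int.from_bytes(sec[8:10], "big")  # units of 1/65536 s
    timestamp = J2000 + timedelta(seconds=seconds) + timedelta(
        seconds=fraction / 65536
    )
    return prefix, marker, timestamp


# Made-up header: one day and half a second after the epoch.
prefix, marker, timestamp = parse_secondary_header(
    bytes.fromhex("083f001e000151808000")
)
print(hex(prefix), hex(marker), timestamp.isoformat())
```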

I think that perhaps the `0x1e` preceding the timestamp is intended to be the CCSDS P-field that describes the format of the timestamp. Such a value would indicate correctly the length of the integer seconds and fractional seconds fields, but it also indicates the CCSDS standard epoch of 1958-01-01, rather than an agency-defined epoch (which is what should be used for the J2000 epoch).
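This reading of `0x1e` can be checked by decoding it as the first octet of a CCSDS unsegmented time code (CUC) P-field: one extension bit, a 3-bit time code identification, 2 bits for the number of coarse time octets minus one, and 2 bits for the number of fine time octets. The sketch below is my own helper, not part of any library; `0x2e` is the value that the same layout would give for an agency-defined epoch.

```python
EPOCHS = {
    0b001: "CCSDS epoch 1958-01-01",
    0b010: "agency-defined epoch",
}


def decode_cuc_p_field(p_field: int) -> dict:
    """Decode the first octet of a CUC P-field."""
    time_code_id = (p_field >> 4) & 0b111
    return {
        "extension": bool(p_field >> 7),
        "time_code_id": time_code_id,
        "epoch": EPOCHS.get(time_code_id, "other"),
        "coarse_octets": ((p_field >> 2) & 0b11) + 1,
        "fine_octets": p_field & 0b11,
    }


print(decode_cuc_p_field(0x1E))  # value seen in the packets
print(decode_cuc_p_field(0x2E))  # same lengths with an agency-defined epoch
```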

The payload of APID 255 packets starts with `0x083f001e0000` (unless the packet payload is shorter than 6 bytes, in which case the payload is trimmed accordingly). This seems quite similar to the secondary header of the non-idle packets, even though the secondary header flag is disabled in this APID. After these 6 bytes, there is ASCII text: first `LE`, and then `IDLE` repeating until the end of the packet. Thus, the payload of idle packets is basically filled with repetitions of the ASCII text `IDLE`. This is a fun detail. Peregrine Mission One is the first spacecraft that I have seen doing this. Other spacecraft use as filler either some Easter egg message in ASCII or something more boring such as `0xaa`, `0x55`, a counter or a pseudorandom sequence. Interestingly, the repetitions of `IDLE` start with `LE` rather than with `IDLE`. Maybe this is just because of the fixed 6 bytes at the beginning, which overwrite 1.5 repetitions of the 4-character sequence `IDLE`.
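The overwrite hypothesis is easy to try out. The sketch below fills a payload with repetitions of `IDLE` and then overwrites the start with the fixed 6 bytes, trimming them for short payloads. This is only my guess at how the spacecraft software could produce what is observed, and the function name is made up.

```python
FIXED_PREFIX = bytes.fromhex("083f001e0000")


def idle_payload(length: int) -> bytes:
    """Build an idle packet payload of the given length in bytes."""
    # Fill the whole payload with repetitions of the ASCII text IDLE.
    fill = (b"IDLE" * (length // 4 + 1))[:length]
    # Overwrite the beginning with the fixed bytes, trimmed if needed.
    prefix = FIXED_PREFIX[:length]
    return prefix + fill[len(prefix):]


payload = idle_payload(16)
print(payload[:6].hex(), payload[6:].decode("ascii"))
```

With this construction the text after the fixed bytes always begins with `LE`, which matches what is seen in the packets.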

The next plot shows the timestamps of the Space Packets transmitted in each non-idle APID. It is apparent that there are many APIDs in use, and that each of them has a different periodicity. The gaps in some of the APIDs are most likely due to frame loss rather than the spacecraft actually stopping to send these packets (except for APID 221, which clearly stops near the beginning of the recording).

The Jupyter notebook in which I analysed this data contains raster maps for the packets in each of the APIDs. The packets in the non-idle APIDs line up neatly, with fields having the same positions in all the packets in the same APID. In the raster map many numerical fields can be distinguished by the patterns they form. I haven’t taken a look at any of these, because it would be a lot of work, since there is a lot of data that is readily accessible. Here are some representative examples of raster maps. The notebook contains all of them.

The raster map for the idle APID 255 is also interesting to see. The ASCII text shows up in blue, and we see that each packet has a different length (the purple part on the right does not form part of the packets), since they are used to flush an AOS frame by filling the remaining part of its packet zone.

The GNU Radio flowgraph and Jupyter notebook used in this post can be found in this repository, together with a binary file containing the decoded AOS frames.
