DVB-S2 GRCon23 CTF challenge

This year I submitted a challenge track based around a DVB-S2 signal to the GRCon CTF (see this post for the challenges I sent in 2022). The challenge description was the following.

I was scanning some Astra satellites and found this interesting signal. I tried to receive it with my MiniTiouner, but it didn’t work great. Maybe you can do better?

Note: This challenge has multiple parts, which can be solved in any order. The flag format is flag{...}.

A single SigMF recording with a DVB-S2 signal sampled at 2 Msps at a carrier frequency of 11.723 GHz was given for the 10 flags of the track. The description and frequency were just some irrelevant fake backstory to introduce the challenge, although some people tried to look online for transponders on Astra satellites at this frequency.

The challenge was inspired by last year’s NTSC challenge by Clayton Smith (argilo), which I described in last year’s post. This DVB-S2 challenge was intended as a challenge with many flags that can be solved in any order and that try to showcase many different places where one might sensibly put a flag in a DVB-S2 signal (by sensibly I mean a place that is actually intended to transmit some information; surely it is also possible to inject flags into headers, padding, and the like, but I didn’t want to do that). In this sense, the DVB-S2 challenge was a spiritual successor to the NTSC challenge, using a digital and modern TV signal.

Another source of motivation was one of last year’s Dune challenges by muad’dib, which used a DVB-S2 Blockstream Satellite signal. In that case demodulating the signal was mostly straightforward, and the fun began once you had the transport stream. In my challenge I wanted to have participants deal with the structure of a DVB-S2 signal and maybe need to pull out the ETSI documentation as a help.

I wanted to have flags of progressively increasing difficulty, keeping some of the flags relatively easy and accessible to most people (in any case it’s a DVB-S2 signal, so at the very least you need to know which software tools can be used to decode it and take some time to set them up correctly). This makes a lot of sense for a challenge with many flags, and in hindsight most of the challenges I sent last year were quite difficult, so I wanted to have several easier flags this time. I think I managed well in this respect. The challenge had 10 flags, numbered more or less in order of increasing difficulty. Flags #1 through #8 were solved by between 7 and 11 teams, except for flag #4, which was somewhat more difficult and only got 6 solves. Flags #9 and #10 were significantly more difficult than the rest. Vlad, from the Caliola Engineering LLC team, was the only person who managed to get flag #10, using a couple of hints I gave, and no one solved flag #9.

When I started designing the challenge, I knew I wanted to use most of the DVB-S2 signal bandwidth to send a regular transport stream video. There are plenty of ways of putting flags in video (a steady image, single frames, audio tracks, subtitles…), so this would be close to the way that people normally deal with DVB-S2, and give opportunities for many easier flags. I also wanted to put in some GSE with IP packets to show that DVB-S2 can also be used to transmit network traffic instead of or in addition to video. Finally, I wanted to use some of the more fancy features of DVB-S2 such as adaptive modulation and coding for the harder flags.

To ensure that the DVB-S2 signal that I prepared was not too hard to decode (since I was putting more things in it in addition to a regular transport stream), I kept testing my signal constantly during the design. I mostly used a Minitiouner, since some people would perhaps use hardware decoders. I heard that some people did last year’s NTSC challenge by playing back the signal into their hotel TVs with an SDR, and for DVB-S2 maybe the same would be possible. Hardware decoders tend to be more picky and less flexible. I also tested gr-dvbs2rx and leandvb to have an idea of how the challenge could be solved with the software tools that were more likely to be used.

What I found, as I will explain below in more detail, is that the initial way in which I intended to construct the signal (which was actually the proper way) was unfeasible for the challenge, because the Minitiouner would completely choke with it. I also found important limitations with the software decoders. Basically they would do some weird things when the signal was not 100% a regular transport stream video, because other cases were never considered or tested too thoroughly. Still, the problems didn’t get too much in the way of getting the video with the easier flags, and I found clever ways of getting around the limitations of gr-dvbs2rx and leandvb to get the more difficult flags, so I decided that this was appropriate for the challenge.

In the rest of this post I explain how I put together the signal, the design choices I made, and sketch some possible ways to solve it.

High level view of the DVB-S2 signal and flags

I think it is good to begin by giving a list of where each flag was found. I only had this list when I finished preparing the signal, since at the beginning of the design I only had some rough ideas that kept evolving as I tried different things out. However, giving this list first probably serves as a good summary of what the final DVB-S2 signal looks like.

  • Flag #1. Main TS video. Audio track 1.
  • Flag #2. Main TS video. Text always present in the image.
  • Flag #3. Main TS video. Audio track 2.
  • Flag #4. Main TS video. QR code in single frame.
  • Flag #5. Main TS video. Service provider.
  • Flag #6. Main TS video. Service name.
  • Flag #7. Main TS video. Subtitle track 1 (English subtitles).
  • Flag #8. Main TS video. Subtitle track 2 (Spanish subtitles).
  • Flag #9. Stream 1: GSE encapsulating IPv6 packets of a TFTP download of a file. The file is a .tar.xz of the gr-dvbgse source code in which an additional file flag.txt has been added.
  • Flag #10. Stream 2: different MODCOD with a transport stream. The flag is the PID number of the single non-idle PID in the stream, which contains custom data with ASCII instructions explaining what the flag is and how to format it.

The DVB-S2 signal would therefore consist of three different streams. Stream 0 would carry the main TS video, stream 1 would be used for GSE, and stream 2 would showcase a different MODCOD. Streams 0 and 1 would use the MODCOD QPSK 1/2, because that is arguably the most straightforward or obvious DVB-S2 MODCOD. Since I wanted stream 2 to stand out, I decided to use a different constellation. This might draw people’s attention if they were looking at a constellation plot. In order to simplify demodulating this MODCOD manually, in case someone wanted to go in that direction, I decided to use 8PSK, which can be handled easily with a Costas Loop block set to order 8 (tracking the phase of 16APSK and 32APSK is more tricky). I decided to have a single BBFRAME with this MODCOD, so that manual demodulation wouldn’t be tedious. For this reason, I chose the highest FEC rate, r=9/10, to get the maximum possible amount of data in the BBFRAME.
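
For reference, this is how such a Costas loop could be instantiated in GNU Radio (the loop bandwidth here is just an illustrative value):

from gnuradio import digital

# order-8 Costas loop: tracks the phase of an 8PSK constellation
costas = digital.costas_loop_cc(0.01, 8)  # (loop bandwidth, order)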

Making a transport stream video share the DVB-S2 signal with other data

In contrast to other television systems such as DVB-S, DVB-S2 supports several streams in the same signal. Each BBFRAME carries data from only one stream, but the transmitter is free to mix BBFRAMEs from each of the streams in any fashion in the combined signal. Playback of a transport stream video, especially in hardware, requires the transport stream to arrive at a nearly constant rate, and with a nearly constant delay between the transmitter and receiver. The arrival of the transport stream bits at a steady rate is key to generating a reference clock in the receiver using the PCR to synchronize the video and audio.

When a DVB-S2 signal has a single stream containing a transport stream and uses constant modulation and coding, which is the most common case for normal television signals, then this works in the same straightforward way as in DVB-S. Each BBFRAME carries a fixed and constant amount of bits from the transport stream, and the symbol rate is constant. Therefore, this gives a steady rate of arrival of the transport stream.

DVB-S2 provides two mechanisms to allow a transport stream (or multiple ones) to coexist with other streams in the same signal and to support MODCOD changes. These mechanisms allow the receiver to regenerate the original transport stream at a constant bitrate regardless of the fact that BBFRAMEs from other streams might get inserted in the middle or that MODCOD changes might change the rate at which the bits flow.

The first of these mechanisms is called null packet deletion and is relatively simple. The transmitter can delete null packets in the transport stream in order to throw away useless data and gain “bandwidth” to transmit data from other streams, or dummy PLFRAMEs (which are commonly used with adaptive modulation and coding when there is no useful data to transmit). The only reason why null packets are present in the transport stream in the first place is because the transport stream needs to have a constant bitrate, while the video and audio codecs are often not using 100% of this bitrate. To allow the receiver to restore the deleted null packets, a DNP field is inserted at the end of each transmitted transport stream packet. The field indicates how many null packets were deleted before the packet. This procedure is illustrated by the following diagram taken from the ETSI standard.

Null packet deletion (taken from ETSI EN 302 307-1 V1.4.1)
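
As a minimal sketch of what a receiver does to undo null packet deletion (ignoring the detail that the sync byte of each packet is replaced by a CRC-8), the idea is just to read the DNP byte that follows each 188-byte packet and reinsert that many null packets before it:

NULL_PACKET = bytes([0x47, 0x1f, 0xff, 0x10]) + bytes(184)  # a typical null packet

def undo_npd(data: bytes) -> bytes:
    # each transmitted packet is 188 bytes followed by a 1-byte DNP field
    out = bytearray()
    for k in range(0, len(data), 189):
        dnp = data[k + 188]  # null packets deleted before this packet
        out += NULL_PACKET * dnp
        out += data[k:k + 188]
    return bytes(out)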

Null packet deletion allows the transmitter to “rob” unused bandwidth from input transport streams to allocate it to other streams, while allowing the receiver to regenerate the original transport stream with all its null packets (whose presence is essential for the correct operation of the PCR). However, this mechanism only takes care of reproducing exactly the same bitstream in the receiver. It does not take into account the timing of the receiver. A chunk of bits might arrive too fast if a high rate MODCOD was used, and then no bits might arrive for a while if BBFRAMEs for other streams are being sent. The average data rate of the transport stream must still be the same (after undoing null packet deletion, that is), but the data might arrive in a very bursty manner. Buffer handling to smooth out this bursty rate while achieving relatively low latency and low time for channel zapping would be too complicated. And still we need to keep in mind that a hardware receiver needs to generate a hardware clock with the transport stream bitrate, so it must somehow know what this average bitrate is.

The mechanism that allows the receiver to recover the original timing of the transport stream is called ISSY, which stands for Input Stream Synchronizer. This involves a 22-bit counter in the transmitter that counts symbols (the symbol rate is always constant). Either the full value of this counter or its 15 LSBs are sent in a 3-byte or 2-byte ISSY field after each transport stream packet. The value of the counter is latched at the time that each transport stream packet arrives at the transmitter (at a constant rate). This counter allows the receiver to know the timing of the transport stream packets (since the receiver has also synchronized itself to the symbol clock), even if BBFRAMEs from other streams are inserted in between or there are MODCOD changes. Additionally, it is also possible to replace some of the ISSY fields by values that indicate to the receiver the minimum buffer size that it will need in order to smooth out the rate changes in the received data, and the fill state that this buffer should currently have (which the receiver uses when starting to receive the signal). These give hints to the receiver about how to manage its buffers. Perhaps the following diagram from the standard helps to understand how the transmitter uses the ISSY field.

Input stream synchronization (taken from ETSI EN 302 307-1 V1.4.1)
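
As a sketch of how a receiver can use the ISSY values, assuming that the full 22-bit counter values have already been extracted from the long-format fields, the counter only needs to be unwrapped and divided by the (known and constant) symbol rate:

SYMBOL_RATE = 1e6  # 1 Msym/s, as in this challenge

def issy_to_seconds(issy_values):
    # unwrap the 22-bit counter and convert to packet arrival times
    times = []
    last, offset = None, 0
    for v in issy_values:
        if last is not None and v < last:
            offset += 1 << 22  # counter wrap-around
        last = v
        times.append((v + offset) / SYMBOL_RATE)
    return times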

Null packet deletion and input stream synchronization are optional. Whether they are used or not is indicated by two flags (the NPD flag and the ISSYI flag) in each BBHEADER. After all, the fields take up useful space after each transport stream packet, and there is no point in using them if the DVB-S2 signal contains a single stream with constant modulation and coding (in which case the receiver gets the transport stream data at a constant rate automatically, as discussed above).
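
Since looking at BBHEADERs comes up several times in this challenge, here is a sketch of how the 10-byte BBHEADER can be parsed (field layout from ETSI EN 302 307-1):

def parse_bbheader(h: bytes) -> dict:
    matype1 = h[0]
    return {
        'ts_gs': matype1 >> 6,          # 0b11 = TS, 0b01 = generic continuous (GSE)
        'sis_mis': (matype1 >> 5) & 1,  # 1 = single input stream
        'ccm_acm': (matype1 >> 4) & 1,  # 1 = constant coding and modulation
        'issyi': (matype1 >> 3) & 1,    # input stream synchronization active
        'npd': (matype1 >> 2) & 1,      # null packet deletion active
        'ro': matype1 & 0b11,           # roll-off: 0b00 = 0.35, 0b01 = 0.25, 0b10 = 0.20
        'isi': h[1],                    # stream number (when multiple input streams)
        'upl': int.from_bytes(h[2:4], 'big'),    # user packet length in bits
        'dfl': int.from_bytes(h[4:6], 'big'),    # data field length in bits
        'sync': h[6],                   # user packet sync byte (0x47 for TS)
        'syncd': int.from_bytes(h[7:9], 'big'),  # bits to the start of the first packet
        'crc8': h[9],                   # CRC-8 of the first 9 bytes
    }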

The good news is that null packet deletion and input stream synchronization work great on paper, even if they can be somewhat tricky to implement by hand, which is what I was trying to do for this challenge. As I will explain below, I had generated my transport stream video with ffmpeg at the full data rate supported by QPSK 1/2, assuming no sharing with other streams, and then I would use null packet deletion to rob some unused bandwidth from this stream and make room for the two other streams.

The bad news is that, since a simple television signal with a single stream and constant modulation and coding does not use null packet deletion and input stream synchronization, simple receivers do not support these options. They simply seem to ignore the NPD and ISSYI fields in the BBHEADER and assume that there are no extra fields at the end of each transport stream packet. Clearly this causes the receiver to drop most of the transport stream packets, because their CRC-8 does not match (since the packet boundary alignment is messed up). The receiver can get the first full packet in each BBFRAME, because its beginning is “pointed to” by the SYNCD field in the BBHEADER, but then it fails to decode the following packets. I tested that this is exactly what the Minitiouner does. When I enabled either null packet deletion or input stream synchronization or both, my video would no longer play at all, and I would only see a few transport stream packets come through. I imagined that other simple hardware and software decoders would behave similarly, so this was a no go for the challenge, as it made accessing even the easier flags on the video too hard.

At this point I thought of simplifying the challenge and using a single transport stream and constant modulation and coding. This would work well even with simple receivers. I could use MPE instead of GSE to send the IP data (this is what Blockstream Satellite does, for instance). However, I really wanted to use GSE, which is much cleaner and simpler to process by hand than MPE. Plus I also really wanted to showcase MODCOD changes, because this was the part of the challenge that really involved the physical layer (which I feel is nice to have play a role in GRCon CTF challenges). Additionally, using MPE would have been too similar to last year’s Blockstream Satellite challenge.

Instead of adding other streams to my transport stream video properly, using null packet deletion and input stream synchronization, and then having something that is unplayable in practice, I decided to do it badly, but in a way that would play mostly okay. Starting from the DVB-S2 signal of the single transport stream video in constant modulation and coding, I set out to do some experiments to find out how much introducing some changes would disturb the Minitiouner.

I was not too worried about messing up the transport stream timing by introducing additional BBFRAMEs from other streams, because software video players are quite tolerant regarding how they handle their buffers. Even hardware video players might do okay, given that the CTF IQ recording wouldn’t be too long. Doing things properly is required to keep receiver buffers from underflowing or overflowing even if the signal is being received for an indefinite amount of time (as can happen with broadcast television). But for a short recording there is not much room for underflowing or overflowing buffers, unless the timing is messed up really badly. And even if that was the case, people could receive the transport stream data into a file, and then play it back with software (in which case all these considerations about timing and buffers are no longer a problem).

I found that introducing some BBFRAMEs from another stream in between the video transport stream BBFRAMEs would mess up the video received by the Minitiouner. I think that what happens is that the Minitiouner ignores the fact that these BBFRAMEs belong to a different stream, and tries to process them as if they belonged to the same transport stream. The contents of these BBFRAMEs are not valid transport stream packets (they are GSE), so they are dropped. This is not a problem. The problem is that usually a transport stream packet gets split between the end of a BBFRAME and the beginning of the next one. The Minitiouner would fail to decode this split packet if there was a BBFRAME from the other stream inserted in the middle. Losing even a single transport stream packet would often cause the video and audio codecs to miss a critical part of the data, and so playback would glitch. Occasionally a null packet would be the one hit by this problem, which caused no harm.

I saw that I could control to some extent how glitchy the video playback was by controlling how frequently I inserted BBFRAMEs from the other stream. I realized that this was my way to make the challenge work. I decided that having a low amount of glitching would be fun for the challenge. Something that I didn’t like too much about having many flags in the video was that once you got the video playing correctly you could access all these flags without too much effort. Having a video that is somewhat glitchy makes this a bit harder, because the playback might glitch at the moment that the flag is spoken out, for example. I would need to make sure that none of the flags were rendered unrecoverable by these glitches. Interestingly, it seems that the glitches are not completely deterministic when the IQ recording is played back multiple times in a loop. This seems odd to me, because which transport stream packets get dropped would be deterministic. I don’t know. Maybe it has something to do with the level of buffer fill, which can be non-deterministic. I was testing both mpv and VLC and saw that they glitch somewhat differently (which is to be expected).

So overall, these glitches seemed to me more of a positive than a negative thing for the challenge. They gave a fun character to watching the video, as in “I’m trying hard to get the flags out of this video that keeps glitching” instead of “I’m just watching this video that plays back perfectly to get a bunch of flags”. The downside is that some people might decide that the glitches were a problem in their demodulator (as in Costas loop or symbol synchronization) and spend too much time trying to hunt this problem down and tweak the decoder. Hopefully the very good constellation (the IQ recording was prepared at a high, bit error free, SNR) would give a solid clue that the problem was not with the demodulator.

By doing these tests with the Minitiouner I determined how many BBFRAMEs I could allocate to stream 1 (containing the GSE data) without making the video glitches too bad. The number of frames I could get in was enough to have some meaningful data sent over IP. I also tested that inserting a single BBFRAME with a different MODCOD didn’t produce catastrophic results with the Minitiouner. It just produced another glitch in the video. So this was my way forward to put the challenge together.

Since I am inserting additional BBFRAMEs, the receiver is getting the transport stream at a rate slower than it should. This is bad, because a buffer will underflow in the receiver. To compensate, I delete some null packets by hand. This messes up the PCR, but it is probably better than not deleting the null packets. With this approach I can hope that even though the timing of the video is bad, it isn’t so bad that a receiver cannot tolerate it. Certainly mpv and ffmpeg did quite well, but hardware receivers (which I didn’t test) are more picky.

Transport stream video

Now that I have explained the structure that the DVB-S2 signal ended up having and why, I can move on to how I prepared the transport stream video. This was actually the first thing I did.

I wanted the video to be a static image with the GRCon23 logo most of the time. This was what Clayton did for the NTSC challenge, so it seemed quite fitting. However, instead of Rickrolling people with a QR code, I just put a plainly visible flag in there. This is the main frame that I prepared for the video.

Typical frame of the video

I also prepared a video frame with a QR code with a flag. This would be inserted in a single video frame, again because this is something that Clayton did for NTSC.

Single frame with a QR code

The resolution for these frames was 1920×1080, since doing a 1080p video seemed a pretty standard thing. Another thing that I needed to decide at this point was the length of the video. This is important because, together with the DVB-S2 symbol rate, it determines the size of the IQ file, and we don’t want to have a huge recording. I ran some numbers and saw that 1 Msym/s was a pretty standard and relatively low symbol rate that would still allow me to put in a video of decent quality at QPSK 1/2. Then I could use 2 Msps complex 8-bit for the IQ file. If I made the duration of this file one minute, it would take 240 MB, which seems reasonable. So the length of the video would be exactly one minute. A frame rate of 25 fps seemed a pretty standard choice, so I needed 1500 frames in total. All of them would be the typical frame except for a single one with the QR code.
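
The arithmetic for the recording size is a quick sanity check:

sample_rate = 2e6   # 2 Msps (2 samples per symbol at 1 Msym/s)
seconds = 60
size = sample_rate * seconds * 2  # 8-bit I plus 8-bit Q per sample
# size = 240e6 bytes, i.e., 240 MB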

Knowing the length of the video, I could now make the audio tracks. For the first track I recorded myself saying “Flag number one is satellite” over and over again. I actually said this for a whole minute rather than recording it a single time and then copying and pasting. This seemed less monotonous and could help if sometimes the pronunciation was not too clear (especially for speakers of other languages).

The second audio track should be somewhat more difficult, so it only says “Welcome to the second audio track. Flag number three is video.” once in the middle of the video. The rest is silent, so you must wait to find out that there is actually something of value in this track. I recorded both audio tracks to 44.1 kHz mono 16-bit WAV, as then ffmpeg would take care of all the encoding.

Preparing the subtitles was more fun. First I had to read up on how one can put subtitle tracks on a transport stream, because different video containers support different subtitle formats. Apparently the way to do this is with DVB subtitles, which are based on pre-rendered bitmaps rather than text. ffmpeg supports adding DVB subtitles (.sub file) to a transport stream, but they need to be rendered from an .srt file or similar with another tool. I found that SubtitleEdit can do this, so this is what I used. I first prepared my .srt files by hand, and then converted them to .sub with SubtitleEdit.

This is the .srt file for the first subtitle track:

1
00:00:05,000 --> 00:00:10,000
This is a subtitle track

2
00:00:10,300 --> 00:00:13,300
Maybe it has a flag

3
00:00:14,000 --> 00:00:15,000
wait for it...

4
00:00:17,000 --> 00:00:19,000
not there yet

5
00:00:25,000 --> 00:00:29,000
flag{eeCh7ahzee}

6
00:00:31,000 --> 00:00:32,500
That was flag number seven

For the second subtitle track I decided to do it in Spanish just for fun. It is my first language, it is also very common in the US, and subtitles are usually multilingual (in fact there is metadata to indicate the language of each track, and players such as mpv display this). Pulling this trick with subtitles is reasonable, because people that don’t know the language can just use Google Translate. Here, the only thing you actually need to know or search for is that “ocho” means “eight”, since the flag is given as a random string in the usual format (or you might simply submit this flag as an answer to all the flags that you’re missing). I refrained from doing multilingual audio tracks because audio is unreasonably harder for people who don’t speak the language.

1
00:00:00,000 --> 00:00:10,000
Esta es la pista de subtítulos internacionales

2
00:00:12,000 --> 00:00:17,000
Gracias por participar en el CTF de GRCon 2023

3
00:00:19,000 --> 00:00:23,000
Esperamos que lo estés pasando en grande

4
00:00:27,000 --> 00:00:34,000
La bandera número ocho es

5
00:00:34,500 --> 00:00:43,000
flag{keech4Xeen}

6
00:00:47,500 --> 00:00:55,000
¿Tienes ya todas las banderas de este vídeo?

Now we need to put everything together with ffmpeg. For the video I just told ffmpeg to generate it from PNG files for the individual frames. Since I had the two PNGs I wanted, base-frame.png and qr-frame.png, I simply made symbolic links for each frame:

for n in {0..1499}; do ln -s base-frame.png frame-$(printf "%04d" $n).png; done
ln -sf qr-frame.png frame-0937.png

To encode the video, it’s also necessary to decide on the video and audio codecs. In keeping with somewhat modern choices, I used H.265 (HEVC) for the video and AAC for the audio. This is the ffmpeg snippet that I used. Note that I’m also setting the flags in the service provider and service name of the transport stream here.

ffmpeg -y -framerate 25 -pattern_type glob -i 'frame-*.png' \
  -i ../audio/track1.wav -i ../audio/track2.wav \
  -i ../subtitles/track1.sub -i ../subtitles/track2.sub \
  -map 0 -map 1 -map 2 -map 3 -map 4 \
  -c:v hevc -c:a aac -c:s dvb_subtitle -f mpegts \
  -b:v 800k -maxrate 900k -bufsize 2000k \
  -metadata service_provider="Flag #5: flag{tekioh3vah}" \
  -metadata service_name="Flag #6: flag{einae4eebe}" \
  -metadata:s:a:0 language=eng -metadata:s:a:1 language=eng \
  -metadata:s:s:0 language=eng -metadata:s:s:1 language=esp \
  -max_interleave_delta 0 \
  -mpegts_flags initial_discontinuity \
  -muxrate 988858.11 output.ts

This ffmpeg call is quite similar to what is often used by amateur radio operators to encode video for transmission over the QO-100 WB transponder, with the difference that there are several audio tracks and also subtitle tracks. The muxrate comes from the net bitrate that is available for a transport stream in a 1 Msym/s QPSK 1/2 DVB-S2 signal. I calculated this with some Python (the number I used is rounded to 0.01 bps units):

symbol_rate = 1e6
kbch = 32208 # r = 1/2
dfl = kbch - 80 # 80 bits is BBHEADER size
numslots = 360 # QPSK
numsymbols = numslots * 90 + 90 # PLHEADER is 90 symbols
ts_rate = dfl / (numsymbols / symbol_rate)

I know that there are online calculators for this, but I found two different calculators giving me slightly different results, which made me wary.

I included the -max_interleave_delta 0 because of some error that ffmpeg was giving me when encoding (the error suggested this workaround). I don’t understand what this does.

As I have mentioned, this kind of ffmpeg call is often used by amateurs to encode video for QO-100. However, since I’ve spent some time talking about transport stream synchronization, I might as well also mention that this is not the correct way to do it, at least when getting the video and audio from some real time sources such as a camera and microphone. The problem is that the transport stream bitrate that ffmpeg generates is unrelated to the symbol rate that is used in the transmitter, so eventually a buffer will overflow or underflow somewhere in the transmitter system. This is basically the typical two clocks problem in SDR.

The proper way to do it is much more complicated, which is the reason why no one uses it in amateur radio. It involves either slightly varying the symbol rate of the transmitter to track the variable transport stream bitrate generated by ffmpeg (which is certainly doable by having a polyphase arbitrary resampler that is controlled by buffer fill levels), or using the SDR sample rate timing to control the video and audio acquisition (this seems much harder; I don’t know how I would do it, other than by doing the same arbitrary resampling controlled by buffers trick), or having the transport stream encapsulation use some feedback from the transmitter buffer fill levels to control the generation of the PCR and null packet insertion, so that in effect the transport stream average bitrate gets locked to the symbol rate (also difficult stuff).

The fact that the simplified approach usually works well in practice (especially since most transmissions on QO-100 are only a few minutes long) is the main reason why I’m bringing this up now. This example shows that it is not crucial to get the timing of the transport stream exactly right, since things mostly tend to work despite bad timing. Therefore, all my messing with the transport stream timing to accommodate the other streams doesn’t seem so bad.

GSE

For the GSE part of the challenge I wanted to have something that was relatively simple but that still seemed like a somewhat realistic file transfer, taking into account that only one of the two directions of the IP communication would be visible in the DVB-S2 signal. I decided to use TFTP, because it is a very simple protocol, and even Wireshark can reassemble the file for you. Although the flag would be in a text file, the file sent by TFTP would be compressed to prevent people from grepping the flag in the BBFRAMEs and skipping all the protocol layers (this was actually possible in the Blockstream Satellite challenge). I also used IPv6 instead of IPv4 because I like IPv6.

My idea to get some filler data was to get something meaningful, such as the source code of GNU Radio, put a flag.txt file with the flag in it, and then make a .tar.xz. However, the space I had was too small for the source code of GNU Radio. I could only put on the order of 50 GSE BBFRAMEs without glitching the video too much. At FEC rate 1/2 these can only carry 196 KiB of data. I also tried the source code of gr-dvbs2rx, but that didn’t fit either. I finally found out that the source code of gr-dvbgse was small enough (82 KiB in .tar.xz). This also seemed a nice self-referential concept (although I didn’t use gr-dvbgse to make the challenge).

I prepared my IPv6 packets using Scapy. I used link-local IPv6 addresses, because this is one of the features of IPv6 that I find useful most often (you just bring up a network interface and can communicate with the other end without the need to set up addresses). I used fe80::1/64 for the server and fe80::2 for the client. For the UDP ports I used random high ports because a file transfer is supposed to use ephemeral ports, at least according to some examples I saw.

The part of the file transfer conversation that we can see in the DVB-S2 signal (which goes from server to client) starts with an options acknowledgement packet that indicates the file length (this is essential for Wireshark and any other reassembler to understand when we’ve reached the end of the file) and the block length (which is probably not so important). Then the file is transmitted in block packets (and the block number of the first packet is one, not zero!).
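
The following Scapy sketch shows roughly how this one-way conversation can be built (the ports and block size here are illustrative, not necessarily the values used in the challenge):

from scapy.all import IPv6, UDP, raw

SERVER, CLIENT = 'fe80::1', 'fe80::2'
SPORT, DPORT = 49152, 50000  # illustrative ephemeral ports

data = open('gr-dvbgse.tar.xz', 'rb').read()
blksize = 512  # illustrative block size

# OACK (opcode 6) announcing the blksize and tsize options
oack = b'\x00\x06blksize\x00%d\x00tsize\x00%d\x00' % (blksize, len(data))
packets = [IPv6(src=SERVER, dst=CLIENT) / UDP(sport=SPORT, dport=DPORT) / oack]

# DATA packets (opcode 3); block numbers start at one (a final empty DATA
# packet would be needed if the file size were a multiple of blksize)
for i in range(0, len(data), blksize):
    block = i // blksize + 1
    packets.append(IPv6(src=SERVER, dst=CLIENT) / UDP(sport=SPORT, dport=DPORT)
                   / (b'\x00\x03' + block.to_bytes(2, 'big') + data[i:i + blksize]))

ip_packets = [raw(p) for p in packets]  # raw bytes, ready for GSE encapsulation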

I encapsulated the IPv6 packets into GSE packets and created the corresponding BBFRAMEs by hand. For simplicity, what I did was to avoid fragmentation and put two IPv6 packets per BBFRAME (a third IPv6 packet won’t fit without fragmenting it into two GSE packets that go in different BBFRAMEs). This simplification was useful not only for me to put together the challenge, but also for people who tried to process this by hand (although I was expecting that people would find and use my dvb-gse Rust application).
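
Encapsulating a complete (unfragmented) PDU in GSE is quite simple. This sketch builds a GSE packet for one IPv6 packet using the broadcast label type (which label type the challenge actually used is a detail I’m glossing over here):

def gse_packet(ipv6_packet: bytes) -> bytes:
    S, E, LT = 1, 1, 0b10              # complete PDU, broadcast label (no label field)
    protocol_type = 0x86dd             # EtherType for IPv6
    gse_length = 2 + len(ipv6_packet)  # counts everything after the GSE Length field
    header = (S << 15) | (E << 14) | (LT << 12) | gse_length
    return header.to_bytes(2, 'big') + protocol_type.to_bytes(2, 'big') + ipv6_packet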

8PSK 9/10

For the 8PSK 9/10 MODCOD, initially I didn’t have a clear idea of what to put in as the flag, since the main driver for this flag was just to have a second MODCOD in the signal. I was thinking of also using GSE, and maybe showing another type of file transfer, such as an HTTP transfer (which I could perform and record with Wireshark instead of faking from scratch, because getting all the details of TCP right is tricky). I thought of maybe putting the flag in an image, as a different way to avoid grepping for an ASCII flag. However, in a single 9/10 BBFRAME there is only room for 7 KiB of data, which is too small for a decent image that contains the flag as rendered text.

I spent some time thinking about alternative ideas. I was toying with the idea of making this stream a generic packetized stream, which is the third type of stream allowed by DVB-S2, in addition to transport stream and generic continuous stream (which is what GSE uses). I even thought about embedding CCSDS frames. I couldn’t find a reasonable and appealing way to incorporate these ideas. Then I wondered about also making this stream a transport stream. I realized that so far I hadn’t involved the transport stream itself as a part of the challenge. For the video it was a video player handling the transport stream, not the participant. So I decided to make this flag require people to look at transport stream headers, even if in a simple way.

What I did was to prepare by hand a transport stream that contained only two PIDs: the null PID, with some null packets, and PID 0x0e2b (the PID number was chosen at random, avoiding reserved ones). The payload of the packets in this PID would contain the following repeating text:

Flag #10 is the transport stream PID in which this text is found,
formatted in the usual flag{…} format, in lowercase hex.

(Format example: flag{0x012a})

My hope was that people would find this text somehow (even with hexdump or grep), and learn that they would need to look at the transport stream headers to see the PID, either by hand or with a tool. This is simple enough to do by hand (transport stream packets have a fixed length of 188 bytes and the header format is in Wikipedia), and it also seems suitable for a CTF. An alternative way of doing this is to use dvbsnoop -if file.ts -s ts, which will show information such as

------------------------------------------------------------
TS-Packet: 00000001   PID: (Unkown PID), Length: 188 (0x00bc)
from file: file.ts
------------------------------------------------------------
  0000:  47 0e 2b 10 46 6c 61 67  20 23 31 30 20 69 73 20   G.+.Flag #10 is 
  0010:  74 68 65 20 74 72 61 6e  73 70 6f 72 74 20 73 74   the transport st
  0020:  72 65 61 6d 20 50 49 44  20 69 6e 20 77 68 69 63   ream PID in whic
  0030:  68 20 74 68 69 73 20 74  65 78 74 20 69 73 20 66   h this text is f
  0040:  6f 75 6e 64 2c 0a 66 6f  72 6d 61 74 74 65 64 20   ound,.formatted 
  0050:  69 6e 20 74 68 65 20 75  73 75 61 6c 20 66 6c 61   in the usual fla
  0060:  67 7b 2e 2e 2e 7d 20 66  6f 72 6d 61 74 2c 20 69   g{...} format, i
  0070:  6e 20 6c 6f 77 65 72 63  61 73 65 20 68 65 78 2e   n lowercase hex.
  0080:  0a 0a 28 46 6f 72 6d 61  74 20 65 78 61 6d 70 6c   ..(Format exampl
  0090:  65 3a 20 66 6c 61 67 7b  30 78 30 31 32 61 7d 29   e: flag{0x012a})
  00a0:  0a 46 6c 61 67 20 23 31  30 20 69 73 20 74 68 65   .Flag #10 is the
  00b0:  20 74 72 61 6e 73 70 6f  72 74 20 73                transport s

Sync-Byte 0x47: 71 (0x47)
Transport_error_indicator: 0 (0x00)  [= packet ok]
Payload_unit_start_indicator: 0 (0x00)  [= Packet data continues]
transport_priority: 0 (0x00)
PID: 3627 (0x0e2b)  [= ]
transport_scrambling_control: 0 (0x00)  [= No scrambling of TS packet payload]
adaptation_field_control: 1 (0x01)  [= no adaptation_field, payload only]
continuity_counter: 0 (0x00)  [= (sequence ok)]
    Payload: (len: 184)
    Data-Bytes:
          0000:  46 6c 61 67 20 23 31 30  20 69 73 20 74 68 65 20   Flag #10 is the 
          0010:  74 72 61 6e 73 70 6f 72  74 20 73 74 72 65 61 6d   transport stream
          0020:  20 50 49 44 20 69 6e 20  77 68 69 63 68 20 74 68    PID in which th
          0030:  69 73 20 74 65 78 74 20  69 73 20 66 6f 75 6e 64   is text is found
          0040:  2c 0a 66 6f 72 6d 61 74  74 65 64 20 69 6e 20 74   ,.formatted in t
          0050:  68 65 20 75 73 75 61 6c  20 66 6c 61 67 7b 2e 2e   he usual flag{..
          0060:  2e 7d 20 66 6f 72 6d 61  74 2c 20 69 6e 20 6c 6f   .} format, in lo
          0070:  77 65 72 63 61 73 65 20  68 65 78 2e 0a 0a 28 46   wercase hex...(F
          0080:  6f 72 6d 61 74 20 65 78  61 6d 70 6c 65 3a 20 66   ormat example: f
          0090:  6c 61 67 7b 30 78 30 31  32 61 7d 29 0a 46 6c 61   lag{0x012a}).Fla
          00a0:  67 20 23 31 30 20 69 73  20 74 68 65 20 74 72 61   g #10 is the tra
          00b0:  6e 73 70 6f 72 74 20 73                            nsport s

Putting everything together

To assemble the complete DVB-S2 signal from the elements described in the previous sections, I used a combination of operations by hand in a Jupyter notebook, and some small GNU Radio flowgraphs to do the heavy lifting such as FEC encoding and physical layer modulation in several steps.

To make BBFRAMEs from the transport streams I had my custom Python code from the experiments with null packet deletion and input stream synchronization, so my input at this point was the BBFRAMEs that I wanted in each stream. First, I put together streams 0 (video) and 1 (GSE), since they should be encoded with the same QPSK 1/2 MODCOD. What I did was to insert the 30 BBFRAMEs for the GSE stream into random locations in the list of BBFRAMEs from the video. To avoid problems with people missing TFTP packets from the GSE stream if they were too close to the beginning or end of the IQ recording, I avoided inserting GSE packets in these regions.
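
The insertion can be sketched like this (the guard margin at the edges is an arbitrary illustrative choice):

import random

def interleave(video_bbframes, gse_bbframes, guard=100):
    out = list(video_bbframes)
    # random insertion points away from the edges of the recording
    positions = random.sample(range(guard, len(video_bbframes) - guard),
                              len(gse_bbframes))
    # insert from the highest position down so earlier indices stay valid
    for pos, frame in zip(sorted(positions, reverse=True), gse_bbframes):
        out.insert(pos, frame)
    return out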

Then I ran the BBFRAMEs through this flowgraph, which performed FEC encoding and physical layer framing using QPSK 1/2. The flowgraph is basically extracted from the DVB-S2 GSE transmitter example in gr-dvbgse. I just took out the parts I didn’t need. The output of this flowgraph is a sequence of QPSK symbols. Every 32490 of these symbols makes up a PLFRAME. It is important to take into account that the Physical Layer Framer output is actually at 2 samples per symbol (a zero is inserted between each pair of symbols), so that it can be run directly through an RRC filter to produce a waveform at 2 samples per symbol.

GNU Radio flowgraph for QPSK 1/2 FEC encoding and physical layer framing

I did an analogous thing with the single 8PSK 9/10 BBFRAME, using a different flowgraph set for this MODCOD. The output of this would be a PLFRAME containing 21690 symbols (the difference in PLFRAME length is only due to the change in modulation, not to the change in FEC rate, since all the FECFRAMEs have 64800 bits).

Next I inserted the 8PSK 9/10 PLFRAME in between two QPSK 1/2 PLFRAMEs at the middle of the data. The final step was to take all the symbols, run them through an RRC filter (I used a roll-off of 0.2, which is the smallest allowed by DVB-S2; this needs to be consistent with the roll-off field in the BBHEADER), apply a channel model (it’s always good to put in some carrier frequency offset and sampling frequency offset to avoid making demodulation by hand extremely easy), and write to an IQ file. After the Channel Model block I used a low-pass filter to somewhat emulate the edges of the passband of an SDR receiver and avoid having a perfectly white noise floor. Often a telltale sign of whether a CTF IQ file is a real recording or artificially generated is whether the noise floor is perfectly flat. I kept the SNR reasonably high because I didn’t want low SNR to be part of the difficulties of the challenge.

GNU Radio flowgraph for DVB-S2 modulation and channel emulation
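
In Python, this second flowgraph looks roughly as follows (the parameter values and file names are illustrative, and the conversion to 8-bit samples is omitted):

from gnuradio import blocks, channels, filter, gr
from gnuradio.filter import firdes

class ChannelEmulator(gr.top_block):
    def __init__(self):
        gr.top_block.__init__(self)
        samp_rate = 2e6  # the symbols are already at 2 samples per symbol
        src = blocks.file_source(gr.sizeof_gr_complex, 'plframes.cf32', False)
        rrc = filter.fir_filter_ccf(
            1, firdes.root_raised_cosine(1, samp_rate, 1e6, 0.2, 101))
        chan = channels.channel_model(
            noise_voltage=0.05,     # high SNR
            frequency_offset=1e-4,  # CFO as a fraction of the sample rate
            epsilon=1.0001)         # sampling frequency offset
        lpf = filter.fir_filter_ccf(
            1, firdes.low_pass(1, samp_rate, 0.9e6, 0.1e6))
        sink = blocks.file_sink(gr.sizeof_gr_complex, 'signal.cf32')
        self.connect(src, rrc, chan, lpf, sink)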

Possible solutions

As I have mentioned, during the challenge design I was constantly checking my signal with the Minitiouner to make sure that what I produced was something that people could work with. When I had finished putting everything together, I also tried to get all the flags with gr-dvbs2rx and leandvb to get a better idea of how difficult the challenge was going to be and what difficulties people might encounter. Here I will sketch some possible ways of getting the flags based on these tests.

With the Minitiouner it only seems possible to get flags #1-#8, since it misses the data from the other streams (it drops it or corrupts it). Getting these flags is more or less simple. One problem that I observed is that the subtitles display in a very large size with VLC, so it is impossible to read the flag, as it is off screen. They display with a reasonable size in mpv. I don’t know what causes the difference.

If you configure the typical DVB-S2 receiver flowgraph from gr-dvbs2rx for QPSK 1/2, then you’ll get the first half of the video. The flowgraph seems to choke when it sees the 8PSK 9/10 PLFRAME in the middle of the video, and then it fails to decode any of the following PLFRAMEs. I’ve also seen this choking effect with DVB-S2 signals that contain dummy PLFRAMEs. I suspect that the problem is that the DVB-S2 PL Sync block passes to its output the payload of all the PLFRAMEs, regardless of their MODCOD (even if you’re using the PLS filter options). Since the LDPC Decoder block expects frames of a fixed size (the size for the MODCOD that it has been set to) at its input, once a PLFRAME of the wrong size has been passed to it (because it had a different modulation or it was a dummy PLFRAME), the input of the LDPC Decoder becomes misaligned and it never works again.

The first half of the video can still give you a good number of flags. To get the second half of the video it is possible to use the Offset in the File Source to skip over the problematic 8PSK 9/10 PLFRAME. This seems fair enough for the challenge, and maybe also gives a clue that there is something unusual going on at this point.

leandvb is better designed regarding adaptive modulation and coding. It actually detects that there is a different MODCOD, spawns a new ldpc_tool specific for that MODCOD and continues processing both MODCODs correctly. This means that with a single run of leandvb you get the whole video.

Neither gr-dvbs2rx nor leandvb seem to pay much attention to the stream type field in the BBHEADER, which indicates if the stream contains transport stream packets or if it contains a generic continuous stream that should be treated differently (in the case of this challenge, as a GSE stream). Both of them seem to try to process all the data as if it was a transport stream. This is the same as the Minitiouner does, so it means that we get glitches in the video with all these tools.

In the case of gr-dvbs2rx it is very straightforward to avoid treating the data as a transport stream, since the decoder has been designed in a modular way using GNU Radio blocks. The BBdeheader block gets BBFRAMEs at its input and produces transport stream packets at its output, assuming that all the BBFRAMEs carry a transport stream (and probably ignoring whether there are different streams). If we remove this block, then we get BBFRAMEs, which we can write to a file for analysis or do whatever we want with. If we run the decoder set to QPSK 1/2, we will get both the BBFRAMEs of the video stream and those of the GSE stream (only for the first half of the file, due to the caveat mentioned above). Writing these frames to a file and then looking at the BBHEADERs manually would be quite helpful, as this would show that there is some GSE in the signal.

For processing GSE, I expected that some people would find my experiments with GSE and use them as a guide for setting up dvb-gse. Often the challenges I make are related to things I’ve worked with before, so some of my posts can be a great reference. In this case, to connect a gr-dvbs2rx flowgraph and dvb-gse, it is possible to use the UDP Sink block. In this way, dvb-gse will receive the BBFRAMEs from the GNU Radio flowgraph, decapsulate the GSE data, and send the IP packets to a TUN device. Wireshark or tcpdump can capture from this TUN device to obtain the packets.

It is not completely necessary to use my dvb-gse tool for this part of the challenge. GSE is a rather simple protocol, and I wasn’t using fragmentation, which is more cumbersome to deal with. Essentially the only thing that needs to be done is to read the length of each IP packet from its GSE header. By looking at the GSE standard, it is not difficult to write some code that does this from scratch.

Once we get the UDP packets in Wireshark, we need to understand that they are TFTP. This isn’t immediately obvious, since they all use ephemeral ports rather than port 69, but the first frame contains the ASCII strings blksize and tsize. A Google search for these gets many references to TFTP. When we tell Wireshark to parse these packets as TFTP, it is even able to reassemble the whole file for us, and save it to disk.

Something important about using dvb-gse for this challenge is that it needs to be run with --isi 1, so that it only processes the data from stream 1, which is the one containing GSE. Otherwise it will assume that the signal is SIS (single input stream), and probably complain. The correct ISI to use is not too difficult to figure out by looking at the BBHEADERs.

Another remark is that it is necessary to make the UDP Packet Data Size in the UDP Sink block match the BBFRAME size. In this way, each UDP packet sent to dvb-gse will contain a full BBFRAME. The size to use here depends on the FEC rate and can be found in the DVB-S2 standard. It is the so-called \(K_{\mathrm{BCH}}\) parameter (divided by 8 to get bytes instead of bits). For QPSK 1/2, \(K_{\mathrm{BCH}} = 32208\) bits, so the packet data size should be 4026 bytes. dvb-gse supports receiving BBFRAMEs fragmented into multiple UDP packets, but it was implemented for the specific type of fragmentation that longmynd does, which always has the start of a BBFRAME at the beginning of a new UDP packet. The UDP Sink block will not do this if we set it to the wrong packet data size.

Regarding processing GSE with leandvb, there doesn’t seem to be an option to make this tool spit out BBFRAMEs instead of trying to process everything as a transport stream. Since it is a C application, it is less flexible than a GNU Radio flowgraph. Probably it is not too difficult to modify the source code a little to dump the BBFRAMEs to a file, but this requires going through the code to understand how the decoder is structured.

To get the 8PSK 9/10 PLFRAME with gr-dvbs2rx, what I tried was to configure my flowgraph for this MODCOD, and then play with the Offset in the File Source to make the file playback start just before this PLFRAME. Actually it is necessary to start a little bit before it, because the demodulator loops need some time to lock. However, the moment at which we start the playback is somewhat critical, because as I have mentioned the decoder will break if it sees one of the QPSK 1/2 PLFRAMEs before it tries to decode the 8PSK 9/10 PLFRAME. If we enable all the debugging, the decoder logs the MODCOD of each PLFRAME detected, which makes this process easier than searching blindly. I don’t like this approach too much, because it is a bit cumbersome and tricky. However, I gave a hint to Vlad along these lines, since I had said that it was possible to get all the flags using gr-dvbs2rx and he asked for more details, given that gr-dvbs2rx states that it doesn’t support adaptive modulation and coding. He managed to get the flag with this hint, so I guess the method is doable.

Another way of getting this PLFRAME is to ignore all the DVB-S2 decoding for a moment and write demodulated symbols to a file. Then we need to extract the 8PSK 9/10 PLFRAME from those. This can be done either manually by looking at the constellation, because all the other symbols are QPSK (or technically \(\pi/2\)-BPSK for the PLHEADER), or more thoroughly by correlating with the SOF sequence to find the start of each PLFRAME and then looking at the PLHEADERs to find the MODCOD of each PLFRAME (by doing this we also discover that the rate is 9/10, which is not obvious just by looking at the constellation). When we have extracted the 8PSK symbols, we can feed them into the gr-dvbs2rx LDPC Decoder block and the rest of the DVB-S2 chain.
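
A rough sketch of the SOF correlation, assuming perfect symbol timing and only a small residual frequency offset (the SOF is the 26-bit sequence 0x18D2E82 sent in \(\pi/2\)-BPSK at the start of each PLHEADER):

import numpy as np

# pi/2-BPSK modulate the SOF: even-indexed symbols on the 45 degree
# diagonal, odd-indexed symbols on the 135 degree diagonal
sof_bits = np.array([int(b) for b in format(0x18D2E82, '026b')])
diag = np.where(np.arange(26) % 2 == 0,
                np.exp(1j * np.pi / 4), np.exp(3j * np.pi / 4))
sof_symbols = (1 - 2.0 * sof_bits) * diag

def find_plframe_starts(symbols, threshold=20):
    # np.correlate conjugates its second argument, which is what we want
    corr = np.abs(np.correlate(symbols, sof_symbols))
    return np.flatnonzero(corr > threshold)  # peaks close to 26 mark each SOF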

For this 8PSK 9/10 PLFRAME we will get a properly formatted transport stream if we are using the BBdeheader block, or if not we can just look at the BBHEADER and discover that it contains a transport stream and process it as such (which is probably as simple as throwing away the BBHEADER and replacing the packet CRC-8’s with the 0x47 sync byte, because the first packet starts at the beginning of the BBFRAME data field). The instructions for how to get flag #10 are in ASCII, so we can read them directly with hexdump or similar. Then we can get the PID of the transport stream (which is the flag) using dvbsnoop or manually, as I suggested above.
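
This processing can be sketched as follows (assuming, as is the case here, that the first packet starts at the beginning of the data field):

def bbframe_to_ts(bbframe: bytes) -> bytes:
    dfl = int.from_bytes(bbframe[4:6], 'big')  # data field length in bits
    data = bbframe[10:10 + dfl // 8]           # drop the 10-byte BBHEADER
    # each 188-byte chunk is a TS packet whose sync byte was replaced by a CRC-8
    return b''.join(b'\x47' + data[k + 1:k + 188]
                    for k in range(0, len(data) - 187, 188))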

I have said that leandvb handles MODCOD changes quite well. However, getting flag #10 with leandvb doesn’t work out of the box. To begin with, I think that leandvb ignores the ISI field in the BBHEADER, which indicates to which stream each BBFRAME belongs. To avoid the data of different streams getting mixed up, it is advisable to use the --modcods flag to process only the 8PSK 9/10 BBFRAMEs. However, doing this, I wouldn’t get any data at the output.

After wondering why, and adding some debug printf()‘s, I discovered that the problem is that leandvb tries to batch 32 FECFRAMEs before giving them to the LDPC decoder. The reason is that the LDPC decoder does inter-codeword SIMD on 32 frames at a time using AVX2. Here there is just a single 8PSK 9/10 frame, so it never gets passed to the LDPC decoder. It is possible to loop the file repeatedly, but maybe some resetting happens after a while or every time that the file is looped, or I simply wasn’t patient enough to loop the file 32 times; in any case, I never got any data from the 8PSK 9/10 frame. Then I found that by changing a couple of lines in the code I could change the LDPC decoder batch size to only one frame. For this, I also needed to modify the ldpc_tool to expect a single input frame (and looking at my code changes, it seems that I also wrote the decoded frame 32 times to the output, just in case there was batching there). In any case, if I remember correctly, with these small but rather specific changes I was able to get the 8PSK 9/10 transport stream with leandvb.

Implications for amateur TV transmissions on QO-100

I also wanted to use the preparation of this challenge to gather some information on the support for somewhat exotic DVB-S2 signals by widely available decoders for amateur radio, such as the Minitiouner and software decoders. Since the QO-100 WB transponder was inaugurated, 99.9% of its traffic has been DVB-S2 transmissions with a single transport stream and constant modulation and coding. Some people such as Evariste Courjard F5OEO and myself have also been doing experiments with GSE and IP data.

It would be nice to promote the usage of more complex DVB-S2 signals on the transponder. For example, a low data rate GSE stream with IP data could be added to the usual transport stream video signals by stealing some bandwidth from the video through null packet deletion. The IP data could be used to send some broadcast data about station characteristics, or even some real time monitoring of station performance such as PA temperature and output power. This would fit nicely with the current usage, because I’ve seen that people often put some of this data into their video, but there is only so much detail that you can fit into a video screen.

The tests I did for this challenge show that the common decoders are not yet at a point where it is easy to process a mixed transport stream + GSE signal. A compromise solution such as using a single transport stream with MPE for the IP data could be used instead.

Code and data

All the code and data that I used to prepare the challenge is in the daniestevez/grcon23-ctf Github repository. The final SigMF file used in the challenge can be downloaded from the CTF page. The main thing in the Github repository is the TS.ipynb Jupyter notebook, which ended up being where all the code to put together the DVB-S2 signal is. This notebook goes through the steps explained above, and calls some external GNU Radio flowgraphs (using Python’s subprocess). At the bottom of the notebook there are some remnants of my experiments with input stream synchronization, including some quick analysis of the transport stream PCR.
