What is a good modulation scheme to convey a single bit of information with no more than 4 ms of latency?



This question came up in the comments of a question that was closed a few days ago. I have no connection to the original question, but this aspect of it made me very curious, so it would be great if someone could try to answer it. The context of the question is long-distance transmission over HF.

Apart from amateur radio regulations (HF: maximum power 400 W), there are no particular constraints on location, equipment, or power, but more efficient (lower-cost or lower-power) and more reliable solutions are preferred. The time from the start of transmission to the completion of decoding must not exceed 4 ms.

Assume that no communication channel has been established yet, and that no infrastructure beyond the transmitter and receiver will be used.

Assume the callsign does not need to be decoded.


So, I'll go ahead and include the callsign, if we're sticking to regulation, in what needs to be decodable (though not necessarily within the 4 ms). I'll also assume the 4 ms run from the point in time the signal starts reaching the receiver until it knows what was sent. It makes no sense to include the propagation delay: 4 ms is only 1200 km of distance at the speed of light, and the original question was about 10000 km (and HF travels slower than the vacuum speed of light).

WSPR reserves 28 bits for a callsign, and that seems about right, so I'll go with that.

Together with a 1 bit payload, that makes 29 bits.

We have 400 W in 4 ms, so we can send 4·10² W · 4·10⁻³ s = 1.6 Ws = 1.6 J. Not bad at all!

What matters at the receiver is that we make the right decision about having detected a 0 or a 1. That's an estimation of data from a noisy observation, and that's pretty simple to model: we receive something, quantify it as a number, and mark somewhere that "left of this mark, it's a 0; right of this mark, it's a 1". All we have to find is a sensible method of mapping our received signal to a number, then find the optimal place to set that decision boundary.

Now, we have to deal with noisy reception. That means that the receiver can never be actually sure what was sent, because noise is added to the received signal.

We can, however, put a number on how likely it is that the decision was wrong, a bit error probability, as soon as we understand how the noise amplitudes compare to the signal amplitudes.

Typically, this looks like this:

Say, this is your noise probability density function (PDF). Noise is a bit unnerving, because it's not a deterministic thing: Noise is by definition random. We can't know what its value is, and we thus can't simply subtract it from what we've observed. But we can describe that random variable by its pdf:

Normal PDF (image: Tmennink / CC BY-SA)

You need to read it as such: "The probability that the noise amplitude is between 1 and 2 is the area below the curve between the points 1 and 2 (on the horizontal axis)", or "The probability that the noise takes a value lower than -2 is the area under the curve (that's an integral, by the way) from -infinity to -2", or "The probability that an absolute amplitude no higher than 1 occurred is the area between -1 and +1".
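As a quick numerical sketch (my own, assuming a standard normal noise PDF and using scipy), those three readings are just differences of the cumulative distribution function:

```python
# The three "area under the curve" readings from above, computed
# numerically. Assumes zero-mean, unit-variance Gaussian noise.
from scipy.stats import norm

print(norm.cdf(2) - norm.cdf(1))    # P(1 <= noise <= 2)  ~ 0.136
print(norm.cdf(-2))                 # P(noise < -2)       ~ 0.023
print(norm.cdf(1) - norm.cdf(-1))   # P(|noise| <= 1)     ~ 0.683
```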

Let's do an example; imagine this: you know that your transmitter sends a positive +1 to signal a "0" bit, and a negative -1 to signal a "1" bit.

However, you don't know how much your channel attenuated your signal. So, your +1/-1 might have been shrunk to +0.1/-0.1 or to +0.00000000000001/-0.00000000000001(= +10⁻¹⁴/-10⁻¹⁴) (more realistic for a long range channel...).

You observe a negative value -0.45. What was sent?

  • Either a negative amplitude arrived, and the noise didn't add enough to make it positive, so that you're still seeing a negative value, or
  • a positive amplitude arrived, but the noise added enough negative amplitude to convert your positive value to the negative -0.45.

Both happen in reality, so we can't be sure. In terms of the above plot: if we know that a +amplitude reached us, the value of signal + noise has the same PDF as the noise, but with the horizontal 0 point shifted right to +amplitude, so that the "bell" is now centered around the actually received signal.
The same happens when -amplitude reaches us, but in that case, the noise PDF bell curve is shifted to the left:

Conditional PDFs

We never see that bell, it's just a hidden property of the random variable "receiver output", but that allows us to reason about things:

Of course, we want to make the decision that makes the most sense. In our example, that means we'll assume that +amplitude was sent when we observe an $r$ right of the vertical axis, and -amplitude when we observe an $r$ left of the vertical axis. But note that this is only true if we assume both to be equally likely, and if our PDF is symmetrical! (This is the Maximum-Likelihood estimator, by the way.)
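A minimal simulation of that decision rule (my own sketch, with assumed amplitude and noise values, not numbers from the question):

```python
# Send +1/-1 with equal probability, add Gaussian noise, decide by
# the sign of the observation, and count how often that goes wrong.
import numpy as np

rng = np.random.default_rng(42)
n_bits = 100_000
amplitude = 1.0     # assumed received amplitude
noise_sigma = 0.5   # assumed noise standard deviation

bits = rng.integers(0, 2, n_bits)            # the true bits
tx = amplitude * (1 - 2 * bits)              # bit 0 -> +1, bit 1 -> -1
rx = tx + noise_sigma * rng.normal(size=n_bits)

decided = (rx < 0).astype(int)               # left of the axis -> bit 1
print(np.mean(decided != bits))              # ~0.023, i.e. Q(1/0.5)
```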

This lets us reason what influences the probability that we make an error. It's two things:

  1. the width of these bell curves. If they get wider, more of the area "laps" over the vertical axis, and that's our error probability! In terms of probability theory, the width of the curve is proportional to the square root of the noise variance, and the variance is the same as the noise power.
  2. the distance between +amplitude and -amplitude. The further apart we put them, the less of the bells laps over to the other side, and the lower our error probability gets. Since we can't modify the channel, the attenuation is given, and the only way to influence the received amplitude is to increase the transmitted amplitude proportionally. Sadly, amplitude is the square root of signal power, so this runs into legal and technical bounds quickly.
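Both effects collapse into a single number: the ratio of received amplitude to noise standard deviation. A small sketch (my summary, not from the original answer): for equiprobable ±A in Gaussian noise, the error probability is $Q(A/\sigma)$:

```python
# Error probability of the threshold decision for +A/-A in Gaussian
# noise: Pe = Q(A/sigma), with Q the Gaussian tail probability.
import numpy as np
from scipy.special import erfc

def q_func(x):
    return 0.5 * erfc(x / np.sqrt(2))   # Q(x)

for amplitude, sigma in [(1.0, 1.0), (1.0, 0.5), (2.0, 0.5)]:
    print(f"A={amplitude}, sigma={sigma}: Pe={q_func(amplitude/sigma):.2e}")
# wider bells (larger sigma) or closer amplitudes -> larger Pe
```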

What this shows us is that to design a system for a given error probability (a 99% success requirement, in your case), we need to think about the ratio of the power in the received signal to the power of the received noise.

Now, if we really encode the one payload bit separately from the 28 callsign bits, that's a bit energy $E_b = 1.6\,\text{J}$.

Let's say we have an attenuation $a$ over the channel (in fact, $a$ will very much depend on the time of day, sun activity, weather, mood of your cat, …, so we'll have a probability distribution for a random variable $a$, and we need to pick an $a$ that the actual channel is at least as good as 99% of the time).

That means of the 1.6 J, a·1.6 J reach the receiver.

The receiver has a noise floor density $N_0$, a noise power per bandwidth. Now, bandwidth is an inverse of time, so the physical quantity $N_0$ is indeed power times duration, i.e. it has the same physical unit as the bit energy.

Hence, $E_b/N_0$, energy per bit to noise power spectral density ratio, is dimensionless. Makes sense: if we send more bits per second, we get proportionally more bit energies per unit of time, but we need proportionally more bandwidth, and with noise being white, we get proportionally more noise power.

So, it doesn't really matter how many bits per second we send; the $E_b/N_0$ value is defined by the transmit power and the unit-bandwidth noise variance.


I'll now tell you something that you might have guessed all along: Physics doesn't exactly like us. Physics has it that every device at room temperature sees a noise power density of -174 dBm/Hz = -204 dBW/Hz (that's $10^{-17.4}\,\text{mW/Hz}=10^{-20.4}\,\text{W/Hz}$).
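That figure is just Boltzmann's constant times the conventional reference temperature; a quick check (my sketch, assuming $T = 290\,\text{K}$):

```python
# Thermal noise density N0 = k*T at the conventional reference
# temperature of 290 K.
import numpy as np
from scipy.constants import k   # Boltzmann constant in J/K

N0 = k * 290.0                  # W/Hz
print(10 * np.log10(N0))        # ~ -204 dBW/Hz
print(10 * np.log10(N0 * 1e3))  # ~ -174 dBm/Hz
```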

Also, receivers aren't perfect. So, we can add 2 to 4 dB of noise figure to that, which is roughly a factor of 2.

So, sadly, it's not for us to choose item 1, the noise variance, among the things that determine the probability of error.


That leaves us with choosing item 2, the power of the received value.

Armed with a table for the integrals of the bell curve we've shown above, we can look up the $E_b/N_0$ value that we need for a given maximum error probability.

This table can be put into a plot: The BER curve. If we receive +1/-1, we call that BPSK, and the BER curve looks like this:

BER curve for BPSK (from DSPLog)

We can see that for your acceptable BER of 1/1000 = 10⁻³, we need an $E_b/N_0$ of about 7 dB.
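We can also verify that reading against the closed-form expression (a sketch; for coherent BPSK in AWGN, $\text{BER} = Q(\sqrt{2 E_b/N_0})$):

```python
# BER of coherent BPSK in AWGN as a function of Eb/N0 in dB.
import numpy as np
from scipy.special import erfc

def ber_bpsk(ebn0_db):
    ebn0 = 10 ** (ebn0_db / 10)
    return 0.5 * erfc(np.sqrt(ebn0))   # Q(sqrt(2x)) = erfc(sqrt(x))/2

for ebn0_db in (6, 7, 8):
    print(f"{ebn0_db} dB -> BER = {ber_bpsk(ebn0_db):.1e}")
# 7 dB gives ~8e-4, i.e. right around the targeted 10^-3
```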

That means that, since our $N_0=-204\,\text{dBJ}$, our $E_b$ needs to be at least $-197\,\text{dBJ} = 10^{-19.7}\,\text{J}$.

We can transmit 1.6 J of energy legally for that single bit, which is about +2 dBJ. As long as far-range transmission doesn't impose more than about 196 dB of attenuation (2 dBJ transmitted minus the -197 dBJ needed gives 199 dB; subtract the ~3 dB of noise figure from above), we're fine.
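The whole link budget in one place (my arithmetic; the 3 dB noise figure is an assumption within the 2 to 4 dB range mentioned above):

```python
# Link budget: transmitted bit energy vs. what the receiver needs.
import numpy as np

Eb_tx_dBJ = 10 * np.log10(1.6)   # ~ +2 dBJ per bit, legally transmittable
N0_dBJ = -204 + 3                # thermal floor plus ~3 dB noise figure
ebn0_req_dB = 7                  # from the BER curve, for BER ~ 1e-3

Eb_rx_needed = N0_dBJ + ebn0_req_dB   # ~ -194 dBJ at the receiver
print(Eb_tx_dBJ - Eb_rx_needed)       # ~ 196 dB of allowed path loss
```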


I don't know current channel conditions on, say, the 40 m band. I really don't. If someone has a current table that says "with a probability of x, we see such and such an attenuation", I could tell you what is OK to assume in your 99% of cases. (By the way, that would again take the form of a PDF, not of the noise as a random variable, but of the channel attenuation as a random variable.)

What I do know is that we don't know the channel beforehand. That means we don't know the phase that the channel imposes on us (which simply depends on the exact length, and medium effects, reflections and so on). Since that includes 180° phase shifts, we can't just transmit a single +1 or -1 with a high power, because the receiver simply couldn't tell the sign.

Also, we're far from the only people to use that band. So, just sending a tone will not work out – others do the same, and suddenly our noise isn't only the receiver noise, but also the interference from others.

So, what we need to do is to give our signal

  1. as long a shape as possible, to maximize energy,
  2. as unique a shape as possible, to maximize "identifiability" among interfering signals.

The standard way of doing that is spreading the signal. So, you use a spreading sequence. Say, $+1, -1, +1, -1$. You multiply what you want to send, let's say the $+1$, with each element of that sequence. You then send the result, but with only 1/(length of sequence) of time per resulting number, so that you still send the same amount of "payload" bits per time.

At the receiver, you take the same spreading sequence, multiply every value you receive with it, and sum up. You correlate. For example, you receive $0.2, 0.1, 0.1, -0.5$, and you know the spreading sequence as specified above. Then you calculate the sum: $0.2·(+1)+0.1·(-1)+0.1·(+1)+(-0.5)·(-1)=0.2-0.1+0.1+0.5=0.7$.
Notice how 0.7 is larger than any of the noisy values you received individually? That makes sense: the $+1,-1,+1,-1$ "content" was multiplied with itself ($(+1)·(+1)+(-1)·(-1)+(+1)·(+1)+(-1)·(-1)=1+1+1+1=4$), which gives you 4, i.e. the length of the sequence.
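The same correlation in a few lines of numpy (same numbers as above):

```python
# Despreading: multiply the received values with the spreading
# sequence element-wise and sum up.
import numpy as np

spreading = np.array([+1, -1, +1, -1])
received = np.array([0.2, 0.1, 0.1, -0.5])

print(np.sum(received * spreading))    # 0.7, larger than any single value
print(np.sum(spreading * spreading))   # 4, the length of the sequence
```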

Sadly, you mustn't forget that in order to convert that single bit into four transmitted values ("chips") and transmit them, you had to use only one quarter of the original energy for each transmitted chip. But, since you added up signal which was always the same, your signal gain is quadratic, whereas your noise gain is only linear in the sequence length. You generally gain SNR proportionally to the length of the sequence.

That's exactly how cheap-as-hell GPS receivers can fish GPS signals from far, far below the noise floor.

A big bonus of using a sequence is also that you can use a unique one. Which means that if you correlate with some signal that wasn't shaped with the same or a similar sequence, it, just like noise, doesn't add up constructively. So, you can "isolate" your receiver from interferers. Neat!

Now, it seems intuitive that you'd want to spread as much as possible. Make your sequence a million elements long. Be totally unimpressed by any interferer.

Sadly, you need to send more symbols the longer your sequence gets. Where we had 1 bit that took 4 ms to begin with, equivalent to 250 bits per second, we'd need to transmit 250 million chips per second. Since we can't transmit more than 1 symbol/s/Hz, that would require us to use a bandwidth of at least 250 MHz (assuming binary symbols). You can't get that bandwidth around 7 MHz...

Another problem with bandwidth: you'd also need to make sure that if your signal finds more than one path to your receiver, the "later" copy of the symbol doesn't disturb the next symbol. (We call that inter-symbol-interference, ISI.)

There's three ways of managing that:

  1. Simply don't. Your symbol rate needs to be low enough that all copies land in the same symbol, because the symbol is just so looooong. But that means no or little spreading.
  2. Take your wide channel and say: "hey, if I act as if this wide thing, in which symbols are so short that the echoes overlap into the next one, were actually many narrow channels next to each other, and I divide my input data evenly across these channels", then you've solved the hard ISI problem by reducing it to many easier "how do I deal with this narrow channel?" problems. That's what OFDM does. And that's what Wifi, LTE, DVB-T, DAB+, but also HF modes like DRM+ and FreeDV do.
  3. Use an Equalizer. That's an algorithm that needs to estimate when later copies come in, and then adds them to the first copy and subtracts them from the symbol that they leaked into.

The first option is not an option. This system won't work without identifiability.

The second option requires a separate decision of what was sent on many channels, and requires that you really send enough data to even fill all the subchannels. Also, we can't really do many subchannels: our frequency resolution is 1/duration, duration is limited to 4 ms, so our channel spacing is 250 Hz, at least. We have at most 2700 Hz of overall bandwidth in the 40 m band, so that's at most 10 channels (there's some non-negligible overhead for such methods). 10 is not "many" subchannels. (Really useful would be 128 to 4096 or something like that).

The third option sounds best, but it requires that you estimate the channel impulse response, so that it can be reverted by the equalizer.
Now comes why it's bad not to consider the callsign alongside the 1 bit of info:

If we need to send 29 bits in total, sending 1 bit of known preamble to allow the receiving end to identify the properties of the channel doesn't hurt much – it's less than 4% of the energy we would have invested to send the whole shebang. If we only send 1 bit, then that 1 bit of preamble means 50% of energy lost to channel estimation. Ouch.

Also, remember the BER curve above? Yeah, that's not optimal. If your $E_b/N_0$ isn't outright terrible (which we can avoid, with but a bit of spreading), and you transmit enough bits (1 doesn't allow for much), we can apply channel coding with a forward error-correcting code (FEC) to make that curve better, i.e. needing less bit energy for the same amount of data. That "blows up" the amount of data you need to transmit by a factor of $1/R>1$. The good thing about error-correcting codes is that they are even more efficient than spreading at converting more bandwidth into better $E_b/N_0$, but they can't help you much with telling your signal from noise or other people's signals.
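As a toy illustration of that rate-expansion idea (my own sketch, not the code any real HF mode uses; FT-8, for instance, uses a much stronger LDPC code): a $(7,4)$ Hamming code has rate $R = 4/7$ and can correct any single flipped bit per codeword:

```python
# (7,4) Hamming code: 4 data bits become 7 transmitted bits (R = 4/7),
# and any single bit error per codeword can be corrected.
import numpy as np

G = np.array([[1,0,0,0,1,1,0],      # systematic generator matrix
              [0,1,0,0,1,0,1],
              [0,0,1,0,0,1,1],
              [0,0,0,1,1,1,1]])
H = np.array([[1,1,0,1,1,0,0],      # parity-check matrix
              [1,0,1,1,0,1,0],
              [0,1,1,1,0,0,1]])

data = np.array([1, 0, 1, 1])
codeword = data @ G % 2             # encode

received = codeword.copy()
received[2] ^= 1                    # channel flips one bit

syndrome = H @ received % 2         # nonzero syndrome -> error detected
# the syndrome equals the column of H at the error position
error_pos = np.where((H.T == syndrome).all(axis=1))[0][0]
received[error_pos] ^= 1            # flip it back
print((received[:4] == data).all()) # True: payload recovered
```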


So, all in all, your transmitter would likely look like this:

  1. Add robust error correction coding to all your 29 bits that you needed to send (that's a rate of (1/R)·original rate)
  2. Use a relatively benign spreading code (maybe, length 32) to reduce the likelihood of misidentifying your signal (rate 32·(1/R)·original rate)
  3. Use a modulation that doesn't require you to intensely estimate the channel beforehand. For HF, FSK modes have proven to work well (won't go into the math why, but there's plenty of math to show that's the case). Using FT-8 as a guideline, say, 4-FSK, i.e. every symbol you transmit carries 2 bits ($=\log_2(4)$). (symbol rate = 1/2·32·(1/R)·original rate)
  4. Since we're very time-constrained, we decide to use the full 2700 Hz channel, and divide it into as many parallel channels as possible to achieve the complete transmission in 4 ms = 1/250 s. That means we need the number of channels to be 16/R. R would realistically be something like 1/2, so 32 subchannels (each of which does a 4-FSK) doesn't sound so bad. 2700 Hz / 32 = 84 Hz per subchannel bandwidth. (The arithmetic is spelled out in the sketch after this list.)
  5. Add a bit of overhead everywhere: you're now collecting all the energy of the full band in your receiver. It's no longer a question of whether you will have an interferer with far more power than your transmitter in there, but only of how many.
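Putting the numbers from steps 1 to 4 together (my arithmetic, following the assumptions in the list; the 16/R factor falls out of it directly):

```python
# The rate chain from the steps above, for the single payload bit.
original_rate = 1 / 4e-3      # 1 bit in 4 ms = 250 bit/s
R = 1 / 2                     # assumed FEC code rate (step 1)
spreading_len = 32            # spreading factor (step 2)
bits_per_symbol = 2           # 4-FSK carries log2(4) = 2 bits (step 3)

symbol_rate = original_rate * (1 / R) * spreading_len / bits_per_symbol
n_subchannels = int(symbol_rate * 4e-3)   # one symbol per subchannel in 4 ms
print(n_subchannels)                      # 32 (= 16/R for R = 1/2)
print(2700 / n_subchannels)               # ~84 Hz per subchannel
```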

Notice how the FT-8 bandwidth is 50 Hz, and thus pretty close to that 84 Hz? Makes a lot of sense: FT-8 is meant to be nice for many people playing together at the same time, and doesn't have a (hard to the point of ridiculous) 4 ms constraint.