Abstract

Targeting signal detection for filter bank multicarrier (FBMC) communications over the underwater acoustic (UWA) channel, this paper analyzes the traditional imaginary interference problem and proposes a deep learning-based method. A neural network with feature extraction and automatic learning ability replaces the demodulation modules and recovers the transmitted signals without explicit channel estimation and equalization. Sufficient data sets are generated according to the channel conditions measured in Qingjiang river. The network parameters are optimized by minimizing a cost function in offline training, and signal detection is carried out directly with the well-trained network in online testing. The system performance of various supervised learning models, namely, multilayer perceptron (MLP), convolutional neural network (CNN), and bidirectional long short-term memory (BLSTM) network, is compared under different data sizes, network parameters, and prototype filters. The simulation results show that the bit error rate (BER) performance of the proposed signal detection is better than that of the classic one, which indicates that deep learning is a promising tool in UWA communication systems.

1. Introduction

Compared with other transmission media, the UWA channel is much more complicated due to strict bandwidth limitation, Doppler frequency shift, and background noise. Orthogonal frequency division multiplexing (OFDM) is currently an effective method to realize high-rate UWA communication due to its ability to handle long multipath spread and frequency selectivity [1–4]. However, the orthogonality of the subcarriers in an OFDM system is easily affected by the Doppler effect, which causes difficulties in channel estimation and signal detection [5–7].

As a new force in 5G multicarrier modulation, FBMC introduces a filter bank into OFDM to ensure the independence between subchannels without the cyclic prefix that serves as a guard interval, which greatly improves the spectrum efficiency. The prototype filter bank has excellent time-frequency (TF) localization, which makes FBMC more robust against both intersymbol interference (ISI) and intercarrier interference (ICI) [8–10]. The subcarriers of FBMC are orthogonal only in the real domain, resulting in inherent imaginary interference between adjacent subcarriers and symbols. Consequently, classical signal processing methods cannot be used directly, which makes the signal detection of FBMC systems more challenging. Researchers have proposed many pilot-based signal detection approaches to counter imaginary interference, including the interference approximation method [11, 12] and the interference cancellation method [13–15], so as to maximize the symbol amplitude at the pilot after demodulation. The interference approximation method calculates the interference contributed by the neighboring symbols, and the interference cancellation method exploits the odd symmetry of the filter ambiguity function. However, the performance of these systems still depends on the accuracy of channel estimation, and the pilot overhead is high.

Recently, deep learning has sprung up in speech processing [16], real-time vision [17], and other engineering fields. The concept of applying deep learning to wireless communication systems, especially UWA communication systems, has only recently begun to emerge. In [18], a symbol demodulation and detection model based on a twice-trained network is proposed, whose performance is reported to be far better than that of the maximum likelihood algorithm. In [19], a linear channel coding and decoding algorithm based on a deep neural network is shown to be superior to the classical belief propagation algorithm. MLP is the most basic deep learning model, which consists of multiple fully connected neural layers [20]. Ye et al. introduce MLP into the receiver of an OFDM system for channel estimation and signal detection and reveal that the deep learning method can obtain BER performance comparable to that of the traditional OFDM system [21]. Inspired by the above, Zhang et al. propose a deep learning-based OFDM communication system and analyze its robustness under the UWA channel [22]. Qasem et al. propose a new scheme called deep learning-coded index modulation-spread spectrum to deal with the increasing data rate restriction of limited user number [23].

Stimulated by the potential of neural networks in the UWA communication field, this paper proposes a deep learning-based receiver for the FBMC system. By regarding FBMC signal detection as label prediction by a neural network, several supervised learning models such as feedforward MLP, CNN [24], and BLSTM [25, 26] are adopted to realize implicit channel estimation and equalization. The performance of the proposed method is quantitatively analyzed with a sufficient amount of transmitted data, which is simulated with the channel impulse response (CIR) measured in Qingjiang river. Simulation results demonstrate that, compared to classical channel estimation methods such as least square (LS), the signal detection method based on deep learning is more effective in improving the BER performance of UWA FBMC communication.

The rest of this paper is organized as follows. In Section 2, the model of FBMC and the problem of imaginary interference are introduced. In Section 3, several supervised learning models are reviewed, and then, the deep learning-based signal detection for UWA FBMC systems is presented. In Section 4, the system performance analysis and comparison are provided. The conclusions are made in Section 5.

Notations: $(m, n)$ denotes the $(m, n)$th TF point. $\Re\{\cdot\}$ denotes the real part of a complex number. $(\cdot)^{*}$ denotes the conjugate. $*$ denotes the convolution. $\odot$ denotes the Hadamard product.

2. System Model and Problem Formulation

2.1. UWA FBMC System Model

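Different from OFDM, the FBMC system transmits offset quadrature amplitude modulation (OQAM) symbols; namely, the real and imaginary parts of each complex QAM symbol are extracted separately and then sent with an offset of half a symbol period. Figure 1 shows the block diagram of the FBMC system implemented by a filter bank and IFFT. The output of the transmitted symbols through the synthesis filter bank (SFB) can be expressed as [27]

$$s[k] = \sum_{m=0}^{M-1} \sum_{n} a_{m,n}\, g_{m,n}[k], \qquad g_{m,n}[k] = g\!\left[k - n\tfrac{M}{2}\right] e^{j 2\pi m k / M}\, e^{j \phi_{m,n}}, \tag{1}$$

where $M$ is the subcarrier number, $a_{m,n}$ is the real data on the $m$th subcarrier of the $n$th FBMC symbol, and the phase factor $\phi_{m,n}$ is set to $\frac{\pi}{2}(m+n)$. $g[k]$ denotes the prototype filter with length $L_g = KM$, where $K$ denotes the overlap factor. $g_{m,n}[k]$ represents the synthesis basis obtained from the TF transformation of $g[k]$. After the channel and the analysis filter bank (AFB), the demodulated symbol at TF point $(m_0, n_0)$ is, for a distortion-free channel,

$$y_{m_0,n_0} = \sum_{k} s[k]\, g^{*}_{m_0,n_0}[k] = \sum_{m=0}^{M-1} \sum_{n} a_{m,n}\, \langle g \rangle_{m,n}^{m_0,n_0}, \qquad \langle g \rangle_{m,n}^{m_0,n_0} = \sum_{k} g_{m,n}[k]\, g^{*}_{m_0,n_0}[k], \tag{2}$$

where the orthogonal condition of $g_{m,n}[k]$ for perfect signal reconstruction satisfies $\Re\{\langle g \rangle_{m,n}^{m_0,n_0}\} = \delta_{m,m_0}\,\delta_{n,n_0}$. $\delta_{m,m_0}$ denotes the Kronecker delta function, which equals 1 if $m = m_0$ and equals 0 if $m \neq m_0$. Thus, the transmitted symbol can be accurately recovered at the FBMC receiver after taking the real part of the demodulated symbol.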

2.2. The Problem of Imaginary Interference

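It is worth noting that FBMC systems satisfy orthogonality only in the real field, which implies that even under ideal channel conditions, there will be inherent imaginary interference at the AFB output if any $\langle g \rangle_{m,n}^{m_0,n_0} \neq 0$ for $(m, n) \neq (m_0, n_0)$, where $\langle g \rangle_{m,n}^{m_0,n_0} = \sum_{k} g_{m,n}[k]\, g^{*}_{m_0,n_0}[k]$ is the inner product of two synthesis bases. The distribution of $\langle g \rangle_{m,n}^{m_0,n_0}$ varies according to the filter bank employed.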
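The real-domain orthogonality can be checked numerically. The following is a minimal sketch, assuming the PHYDYAS prototype filter with overlap factor 4 and its commonly published coefficients, a small subcarrier number of 16, and the basis convention $g_{m,n}[k] = g[k - nM/2]\,e^{j2\pi mk/M}\,e^{j\frac{\pi}{2}(m+n)}$; the helper names `phydyas` and `basis` are illustrative only. It computes the inner product between a reference basis and its eight first-order neighbors.

```python
import numpy as np

K = 4  # overlap factor for which the coefficients below are published

def phydyas(M):
    """PHYDYAS prototype filter of length K*M, scaled to unit energy."""
    H = [0.971960, np.sqrt(2) / 2, 0.235147]
    k = np.arange(K * M)
    g = np.ones(K * M)
    for i, h in enumerate(H, start=1):
        g += 2 * (-1) ** i * h * np.cos(2 * np.pi * i * k / (K * M))
    return g / np.linalg.norm(g)

def basis(g, m, n, M, n_max):
    """g_{m,n}[k] = g[k - n*M/2] * exp(j*2*pi*m*k/M) * exp(j*pi/2*(m+n))."""
    k = np.arange(len(g) + n_max * M // 2)   # common time axis for all shifts
    shifted = np.zeros(k.size)
    start = n * M // 2                       # delay of n half-symbols
    shifted[start:start + len(g)] = g
    return shifted * np.exp(2j * np.pi * m * k / M) * np.exp(0.5j * np.pi * (m + n))

M = 16                                       # assumed toy subcarrier number
g = phydyas(M)
m0, n0 = 2, 1                                # reference TF point
ref = basis(g, m0, n0, M, n_max=2)

inner = {}
for dm in (-1, 0, 1):
    for dn in (-1, 0, 1):
        other = basis(g, m0 + dm, n0 + dn, M, n_max=2)
        inner[(dm, dn)] = np.vdot(ref, other)  # sum_k conj(ref[k]) * other[k]

for (dm, dn), v in inner.items():
    print(f"dm={dm:+d} dn={dn:+d}  real={v.real:+.4f}  imag={v.imag:+.4f}")
```

Under these assumptions, the real part vanishes for every neighbor while the imaginary part does not, which is the interference that a pilot-based estimator has to account for.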

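We assume that the channel is frequency flat and unchanged over the duration of the prototype filter, so the output of the AFB at $(m_0, n_0)$ can be written as [12]

$$y_{m_0,n_0} = H_{m_0,n_0}\, a_{m_0,n_0} + I_{m_0,n_0} + \eta_{m_0,n_0}, \qquad I_{m_0,n_0} = \sum_{(m,n) \neq (m_0,n_0)} H_{m,n}\, a_{m,n}\, \langle g \rangle_{m,n}^{m_0,n_0}, \tag{3}$$

where $H_{m_0,n_0}$ is the channel frequency response, $\langle g \rangle_{m,n}^{m_0,n_0}$ is the inner product of the synthesis bases at $(m, n)$ and $(m_0, n_0)$, and $I_{m_0,n_0}$ and $\eta_{m_0,n_0}$ denote the imaginary interference and noise component. Considering that imaginary interference mainly comes from adjacent TF points, the first-order neighborhood of $(m_0, n_0)$ is defined as $\Omega_{m_0,n_0} = \{(m_0 + p, n_0 + q)\}$, where $p, q \in \{-1, 0, 1\}$ and $(p, q) \neq (0, 0)$. Then, Equation (3) can be further expressed as

$$y_{m_0,n_0} \approx H_{m_0,n_0} \left( a_{m_0,n_0} + j u_{m_0,n_0} \right) + \eta_{m_0,n_0} = H_{m_0,n_0}\, c_{m_0,n_0} + \eta_{m_0,n_0}, \qquad j u_{m_0,n_0} = \sum_{(m,n) \in \Omega_{m_0,n_0}} a_{m,n}\, \langle g \rangle_{m,n}^{m_0,n_0}, \tag{4}$$

where $c_{m_0,n_0} = a_{m_0,n_0} + j u_{m_0,n_0}$ is the symbol virtually transmitted. When the data at $(m_0, n_0)$ and $\Omega_{m_0,n_0}$ are known, $c_{m_0,n_0}$ can be treated as a pseudopilot to estimate the channel frequency response by the LS principle, as $\hat{H}_{m_0,n_0} = y_{m_0,n_0} / c_{m_0,n_0}$.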
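The pseudopilot idea can be illustrated with a short numerical sketch. The interference weights, the neighbor data, and the channel value below are placeholders chosen for illustration only (the actual weights depend on the prototype filter); the noise term is omitted so that the two estimates can be compared directly.

```python
import numpy as np

# Placeholder imaginary interference weights for the eight first-order
# neighbours (dm, dn); real values depend on the prototype filter.
weights = {
    (-1, -1): 0.21, (-1, 0): 0.24, (-1, 1): -0.21,
    (0, -1): -0.56,                 (0, 1): 0.56,
    (1, -1): 0.21,  (1, 0): -0.24,  (1, 1): -0.21,
}

# Known real-valued data: the pilot and its neighbours (hypothetical values).
a0 = 1.0
neighbours = {
    (-1, -1): 1.0, (-1, 0): 1.0, (-1, 1): -1.0,
    (0, -1): -1.0,               (0, 1): 1.0,
    (1, -1): 1.0,  (1, 0): -1.0, (1, 1): -1.0,
}

# Imaginary interference collected from the neighbourhood.
u = sum(weights[key] * neighbours[key] for key in weights)
c = a0 + 1j * u                      # virtually transmitted symbol (pseudopilot)

H = 0.8 * np.exp(1j * 0.6)           # assumed flat channel coefficient
y = H * c                            # noise-free AFB output at the pilot

H_ls = y / c                         # LS estimate with the pseudopilot
H_naive = y / a0                     # estimate that ignores the interference

print("true H      :", H)
print("pseudopilot :", H_ls)
print("naive       :", H_naive)
```

In this toy setting the pseudopilot estimate recovers the channel exactly, while dividing by the real pilot alone leaves an error proportional to the interference term.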

3. Supervised Learning Models and Deep Learning-Based Signal Detection

3.1. Multilayer Perceptron

As shown in the dashed box in Figure 2(a), an MLP can be summarized as an artificial neural network with multiple hidden layers between the input and output layers [28]. The output of the $i$th neuron in the $l$th layer can be expressed as

$$o_i^{(l)} = f^{(l)}\!\left( \sum_{j} w_{ij}^{(l)}\, o_j^{(l-1)} + b_i^{(l)} \right),$$

where $w_{ij}^{(l)}$ is the weight between the $j$th neuron in the $(l-1)$th layer and the $i$th neuron in the $l$th layer, $b_i^{(l)}$ is the bias of the $i$th neuron in the $l$th layer, and $f^{(l)}(\cdot)$ is the selected activation function of this layer. An improved variant of the ReLU function is employed for the hidden layers, and the output layer applies the Sigmoid function so that the network output lies in the interval $[0, 1]$. In addition, each hidden layer adopts dropout regularization to prevent the network from favoring certain features with iterative training, so as to guarantee the generalization ability of the system.
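The layer-by-layer computation can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: leaky ReLU stands in for the improved ReLU variant, a sigmoid is assumed at the output, the layer widths are toy values, and the weights are random rather than trained.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    """Assumed hidden activation (one possible ReLU variant)."""
    return np.where(x > 0, x, alpha * x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(x, params, drop_rate=0.0, rng=None):
    """Forward pass: hidden layers with optional dropout, sigmoid output."""
    h = x
    for idx, (W, b) in enumerate(params):
        z = W @ h + b                                   # weighted sum plus bias
        is_output = idx == len(params) - 1
        h = sigmoid(z) if is_output else leaky_relu(z)
        if not is_output and drop_rate > 0 and rng is not None:
            mask = rng.random(h.shape) >= drop_rate     # keep with prob 1 - rate
            h = h * mask / (1.0 - drop_rate)            # inverted dropout scaling
    return h

rng = np.random.default_rng(0)
sizes = [8, 6, 4, 2]                                    # toy layer widths
params = [(rng.normal(0.0, 0.5, (n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

x = rng.normal(size=sizes[0])
out = mlp_forward(x, params)                            # inference: no dropout
out_train = mlp_forward(x, params, drop_rate=0.2, rng=rng)
print(out)
```

Dropout is applied only when a rate and a random generator are supplied, which mirrors the usual distinction between training and inference.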

3.2. Convolutional Neural Network

The CNN in Figure 2(b) uses shared convolution kernels to automatically extract local spatial correlation features of the input data [29]. Weight sharing greatly reduces the number of parameters and makes the whole training process easier. The output of the $l$th convolutional layer can be expressed as [30]

$$X^{(l)} = f\!\left( W^{(l)} * X^{(l-1)} + b^{(l)} \right),$$

where $W^{(l)}$ is the convolution kernel with adjustable weights of the $l$th layer, $b^{(l)}$ is the bias, and $W^{(l)} * X^{(l-1)}$ represents the result of the two-dimensional convolution. Batch normalization, an efficient regularization method with faster convergence speed, is adopted in the convolutional layer to prevent vanishing gradients and overfitting. No pooling layer is used in this paper because the input tensor is not large. The whole convolution process can be regarded as a special feature extraction, in which the feature data is output through a few fully connected layers after flattening.
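A minimal NumPy sketch of this pipeline is given below, assuming a toy input of shape (batch, channels, height, width), 3x3 kernels, ReLU activations, and batch normalization without learned scale and shift; none of these shapes are taken from Table 1. As in most deep learning frameworks, the "convolution" is implemented as a sliding inner product without kernel flipping.

```python
import numpy as np

def conv2d(x, kernels, bias):
    """Valid 2-D convolution. x: (B, C_in, H, W); kernels: (C_out, C_in, kh, kw)."""
    B, _, H, W = x.shape
    c_out, _, kh, kw = kernels.shape
    out = np.empty((B, c_out, H - kh + 1, W - kw + 1))
    for i in range(out.shape[2]):
        for j in range(out.shape[3]):
            patch = x[:, :, i:i + kh, j:j + kw]           # (B, C_in, kh, kw)
            # Shared kernel weights are applied to every spatial position.
            out[:, :, i, j] = np.tensordot(
                patch, kernels, axes=([1, 2, 3], [1, 2, 3])) + bias
    return out

def batch_norm(x, eps=1e-5):
    """Normalise each channel over the batch and spatial axes."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2, 6, 8))          # toy batch: 2 input channels

k1 = rng.normal(0.0, 0.3, (4, 2, 3, 3))    # first layer: 4 output channels
k2 = rng.normal(0.0, 0.3, (8, 4, 3, 3))    # second layer: 8 output channels

bn1 = batch_norm(conv2d(x, k1, np.zeros(4)))
h1 = relu(bn1)
h2 = relu(batch_norm(conv2d(h1, k2, np.zeros(8))))

features = h2.reshape(h2.shape[0], -1)     # flatten before the dense layers
print(h2.shape, features.shape)
```

Without pooling, the spatial size shrinks only through the valid convolutions, so the flattened feature vector stays small enough for a few fully connected layers.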

3.3. Bidirectional Long Short-Term Memory

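A recurrent neural network (RNN) characterizes the time correlation of an input sequence. As shown in Figure 3, LSTM introduces a gate mechanism and storage units into the neurons to address the long-term dependence challenge of sequences. At time $t$, the input gate, forget gate, output gate, LSTM input, LSTM output, cell state, and the candidate are, respectively, represented as $i_t$, $f_t$, $o_t$, $x_t$, $h_t$, $c_t$, and $\tilde{c}_t$; the operational processes are as follows [31]:

$$\begin{aligned}
i_t &= \sigma\!\left( W_i [h_{t-1}, x_t] + b_i \right), \\
f_t &= \sigma\!\left( W_f [h_{t-1}, x_t] + b_f \right), \\
o_t &= \sigma\!\left( W_o [h_{t-1}, x_t] + b_o \right), \\
\tilde{c}_t &= \tanh\!\left( W_c [h_{t-1}, x_t] + b_c \right), \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \\
h_t &= o_t \odot \tanh(c_t),
\end{aligned}$$

where $W$ and $b$ represent weights and biases, $c_{t-1}$ is the state of the cell at the previous moment, $\sigma$ is the Sigmoid function, and $\tanh$ is also an activation function. Figure 2(c) shows that a BLSTM layer consists of two LSTM layers stacked in opposite directions, whose output is calculated jointly from the two hidden states by

$$y_t = W_{\overrightarrow{h}}\, \overrightarrow{h}_t + W_{\overleftarrow{h}}\, \overleftarrow{h}_t + b_y,$$

where $\overrightarrow{h}_t$ is the forward sequence and $\overleftarrow{h}_t$ is the backward sequence. So the final prediction depends not only on the past input but also on the future input.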
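The gate recursion and the bidirectional combination can be sketched as follows. This is a toy NumPy illustration assuming the standard LSTM formulation with concatenated hidden state and input, random untrained weights, and a linear combination of the forward and backward hidden states; the dimensions are arbitrary.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(input_dim, hidden, rng):
    """Random weights and zero biases for the gates i, f, o and candidate c."""
    W = {g: rng.normal(0.0, 0.5, (hidden, hidden + input_dim)) for g in "ifoc"}
    b = {g: np.zeros(hidden) for g in "ifoc"}
    return W, b

def lstm_forward(xs, W, b, hidden):
    """Run one LSTM layer over xs of shape (T, D); return hidden states (T, hidden)."""
    h, c = np.zeros(hidden), np.zeros(hidden)
    hs = []
    for x in xs:
        z = np.concatenate([h, x])              # [h_{t-1}, x_t]
        i = sigmoid(W["i"] @ z + b["i"])        # input gate
        f = sigmoid(W["f"] @ z + b["f"])        # forget gate
        o = sigmoid(W["o"] @ z + b["o"])        # output gate
        c_tilde = np.tanh(W["c"] @ z + b["c"])  # candidate state
        c = f * c + i * c_tilde                 # cell state update
        h = o * np.tanh(c)                      # LSTM output
        hs.append(h)
    return np.stack(hs)

def blstm_forward(xs, fwd, bwd, hidden, Wf, Wb, by):
    """Combine a forward pass and a time-reversed pass at every time step."""
    h_fwd = lstm_forward(xs, *fwd, hidden)
    h_bwd = lstm_forward(xs[::-1], *bwd, hidden)[::-1]
    return h_fwd @ Wf.T + h_bwd @ Wb.T + by, h_fwd

rng = np.random.default_rng(0)
T, D, hidden, out_dim = 5, 3, 4, 2
fwd, bwd = init_lstm(D, hidden, rng), init_lstm(D, hidden, rng)
Wf = rng.normal(0.0, 0.5, (out_dim, hidden))
Wb = rng.normal(0.0, 0.5, (out_dim, hidden))
by = np.zeros(out_dim)

xs = rng.normal(size=(T, D))
y, h_fwd = blstm_forward(xs, fwd, bwd, hidden, Wf, Wb, by)

xs_future = xs.copy()
xs_future[-1] += 1.0                            # perturb only the last time step
y2, h_fwd2 = blstm_forward(xs_future, fwd, bwd, hidden, Wf, Wb, by)
print(np.abs(y2[0] - y[0]).max(), np.abs(h_fwd2[0] - h_fwd[0]).max())
```

Perturbing the last input leaves the first forward hidden state untouched but changes the first BLSTM output, which illustrates the dependence on future input.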

3.4. Neural Network-Driven UWA FBMC Systems

Figure 2 shows the structure of the deep learning-based UWA FBMC system, in which the neural network models replace the channel estimation, equalization, and demapping modules at the receiver of the traditional system, while the transmitter remains unchanged. In each simulation, the frequency domain data received after FFT and the random binary sequence transmitted are recorded as an input and its corresponding label. The models are trained by viewing FBMC demodulation and UWA channels as black boxes [21]. As the network iterates, the weights $w$ (or $W$) and biases $b$ of the neural network are adjusted, and the difference between the output and the label is continuously reduced.

In this paper, we take signal detection as binary label classification and adopt the cross-entropy (CE) cost function to measure the difference

$$\mathrm{CE} = -\frac{1}{N} \sum_{i=1}^{N} \left[ d_i \log \hat{d}_i + (1 - d_i) \log\!\left(1 - \hat{d}_i\right) \right],$$

where $N$ represents the number of neurons in the output layer, and $d_i$ and $\hat{d}_i$ denote the $i$th label bit and the corresponding network output. When the cost function meets the preset threshold condition or the network iteration reaches the maximum epoch limit, the neural network finishes the training process, and $w$ (or $W$) and $b$ stop updating and are saved accordingly. The online neural network directly outputs the predicted binary sequence after loading the newly received frequency domain signal.
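The relation between the cost function and the detected bits can be sketched with a small example. The averaging over the output neurons, the 0.5 decision threshold, and the probability values are assumptions made for illustration.

```python
import numpy as np

def binary_cross_entropy(pred, label, eps=1e-12):
    """Mean binary cross-entropy over the output neurons."""
    pred = np.clip(pred, eps, 1 - eps)          # avoid log(0)
    return -np.mean(label * np.log(pred) + (1 - label) * np.log(1 - pred))

def hard_decision(pred, threshold=0.5):
    """Map network outputs in [0, 1] to detected bits."""
    return (pred >= threshold).astype(int)

def bit_error_rate(bits_hat, bits):
    return np.mean(bits_hat != bits)

label = np.array([1, 0, 1, 1, 0, 0, 1, 0])
good = np.array([0.9, 0.1, 0.8, 0.7, 0.2, 0.1, 0.9, 0.3])   # confident, correct
poor = np.array([0.6, 0.4, 0.4, 0.7, 0.6, 0.1, 0.9, 0.3])   # two wrong decisions

ce_good = binary_cross_entropy(good, label)
ce_poor = binary_cross_entropy(poor, label)
ber_good = bit_error_rate(hard_decision(good), label)
ber_poor = bit_error_rate(hard_decision(poor), label)
print(ce_good, ce_poor, ber_good, ber_poor)
```

A lower cost corresponds here to fewer bit errors after the hard decision, which is why minimizing the cost offline serves as a proxy for the BER measured online.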

4. System Performance

4.1. Simulation Configuration

In order to carry out offline training more realistically and effectively, we use the measured underwater acoustic channel of Qingjiang river (as shown in Figure 4) to generate enough communication data. Figure 5 depicts the layout of this experiment. The river depth at the experimental site is about 100 m, the hanging depth of the transmitting transducer is about 30 m, and the hanging depth of the receiving hydrophone is about 10 m. During the experiment, both the sending ship and the receiving ship are in a free-drifting state, with a distance of about 1.5 km.

The number of network inputs is determined by the real and imaginary parts of 2 FBMC blocks with 512 subcarriers. The networks considered in this paper extract features from the amplitude, space, and time dimensions, respectively; their input tensors, network layers, and hyperparameter settings are shown in Table 1. A rate-1/2 convolutional coder with generator polynomial [5, 7] in octal format and 4-QAM are considered. The PHYDYAS filter [32] is adopted as the prototype filter, and a total of 50000 sets of obtained communication data are divided into a training set and a test set at a ratio of 9 : 1.
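The transmitter-side configuration can be sketched as follows. The encoder below is a plain feedforward rate-1/2 encoder for the generators 5 and 7 (octal) without tail bits, and the 4-QAM mapping is an assumed Gray mapping with unit average power; neither detail is specified in the text. The last lines work out the input dimension and the 9 : 1 split.

```python
import numpy as np

def conv_encode(bits, generators=(0o5, 0o7), constraint_len=3):
    """Rate-1/2 feedforward convolutional encoder (no tail bits)."""
    state = 0                                    # previous constraint_len - 1 bits
    out = []
    for u in bits:
        reg = (int(u) << (constraint_len - 1)) | state   # newest bit is the MSB
        for gen in generators:
            out.append(bin(reg & gen).count("1") % 2)    # parity of tapped bits
        state = reg >> 1                                 # shift the register
    return np.array(out)

def qam4_map(coded):
    """Assumed Gray mapping: bit 0 -> +1, bit 1 -> -1 on each axis."""
    pairs = coded.reshape(-1, 2)
    return ((1 - 2 * pairs[:, 0]) + 1j * (1 - 2 * pairs[:, 1])) / np.sqrt(2)

info_bits = np.array([1, 0, 1, 1])
coded = conv_encode(info_bits)
symbols = qam4_map(coded)
print(coded, symbols)

# Input dimension: real and imaginary parts of 2 blocks with 512 subcarriers.
n_blocks, n_subcarriers = 2, 512
input_dim = n_blocks * n_subcarriers * 2

# 9 : 1 split of the 50000 generated data sets.
n_sets = 50000
n_train = n_sets * 9 // 10
n_test = n_sets - n_train
print(input_dim, n_train, n_test)
```

With these settings the network sees 2048 real-valued inputs per sample, and 45000 sets are used for training and 5000 for testing.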

4.2. BER versus the Data Size

Several supervised learning models are first compared with the LS method for signal detection in Figure 6, where the LS method performs the worst because the accuracy of channel estimation is easily affected by imaginary interference. The MLP method (the number of neurons in each layer is 2048, 512, 128, 32, and 16) significantly improves BER performance through data-driven implicit channel estimation. In addition, the CNN (the number of channels in each kernel is 4 and 8) and BLSTM methods further explore the spatial correlation and temporal correlation among the input data, respectively, and achieve the best signal detection performance among the compared methods.

We also double the communication data and maintain the original proportion to explore the impact of data size on the proposed system. The MLP method appears to achieve a greater gain than CNN and BLSTM do, presumably because it extracts amplitude features only and therefore has more room to improve with additional data, but the latter two still have better BER performance when more data is provided. The results indicate that the characteristics of the UWA channel are efficiently learned by deep learning-based methods and that the BER performance is sensitive to the data size.

4.3. BER versus the Network Parameters

The accuracy of deep learning-based signal detection mainly depends on the complexity of its model. According to the structural characteristics of the different networks, network parameters such as the hidden layer neurons of MLP, the channels in the convolution kernel of CNN, and the direction of propagation of LSTM are adjusted to increase the model complexity. The number of neurons in MLP is reset to 2048, 1024, 512, 64, and 16; the number of kernel channels in CNN is reset to 8 and 16; and BLSTM is compared with unidirectional LSTM. From Figure 7, the BER performance of the more complex model is generally improved. It is noted that the gain of BLSTM over unidirectional LSTM indicates that the transmitted symbols in the future also have an impact on the current signal detection.
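The change in MLP complexity can be quantified by counting trainable parameters. The sketch below assumes that the listed numbers are the widths of consecutive fully connected layers, starting from an input of size 2048; if the list describes hidden layers only, the totals differ.

```python
def mlp_param_count(widths):
    """Weights plus biases of a fully connected network with the given widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(widths[:-1], widths[1:]))

baseline = mlp_param_count([2048, 512, 128, 32, 16])
enlarged = mlp_param_count([2048, 1024, 512, 64, 16])
print(baseline, enlarged, enlarged / baseline)
```

Under this assumption the enlarged configuration has roughly 2.4 times as many parameters as the baseline, most of them in the first layer.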

4.4. BER versus the Prototype Filter

To examine how the BER performance of deep learning-based signal detection and LS-based signal detection is affected by the selection of the prototype filter, we add the EGF and IOTA filters to the simulation. For fair comparison, the settings of the UWA communication system and the network parameters remain fixed. As depicted in Figure 8, the MLP and CNN methods exhibit more stable performance and better robustness than the LS algorithm under different communication scenarios, but the BLSTM method presents an obvious performance difference. That is, BLSTM is sensitive to the degree of matching between the filter bank and the underwater acoustic channel.

5. Conclusion

This paper presents a deep learning-based FBMC signal detection method for UWA communications, which only needs to collect received symbols for implicit channel estimation and equalization in a data-driven way. Furthermore, CNN with spatial correlation and BLSTM with temporal correlation are analyzed for deeper feature extraction. The proposed receiver has been tested with the CIR measured in Qingjiang river at a range of 1.5 km. The comparison results show that the proposed methods outperform classical algorithms in detection accuracy, which points to a flexible design for future UWA communications.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This research was supported in part by the National Natural Science Foundation of China under Grant 52071164 and in part by the Postgraduate Research & Practice Innovation Program of Jiangsu Province under Grant KYCX20_3161.