Abstract

The expectation propagation (EP) detector has recently drawn great attention because it achieves significantly better performance than linear detectors (such as the minimum mean squared error (MMSE) detector) in massive MIMO systems. The EP approximation (EPA) algorithm simplifies the update formula of the EP algorithm by reexpressing the moment matching condition, so that the number of matrix inversions in the EP algorithm is reduced to one. The cost is that the EPA algorithm requires higher accuracy for this inversion; otherwise, the bit-error-rate (BER) performance suffers serious losses. To tackle this issue, the SORI iterative algorithm is introduced to compute this inversion with high precision and thereby preserve the BER performance of the EPA algorithm. First, a new expression of the SORI iterative algorithm is derived for the equivalent real-valued system. Second, the improved EPA-SORI algorithm is obtained by using the SORI algorithm to approximate the initial value of the EPA algorithm in the real-valued system. Finally, by designing the initial solution and the relaxation factor of the EPA-SORI iterative algorithm, the convergence rate is increased without increasing the complexity. Simulation and complexity results show that, in various massive MIMO system configurations, the proposed EPA-SORI algorithm achieves the same BER performance as the Exact EP algorithm with significantly lower complexity. Compared with MMSE and the existing EPA algorithms, the proposed EPA-SORI algorithm also offers a better performance-complexity trade-off, and this advantage is more pronounced in scenarios with a high modulation order and a large number of users.

1. Introduction

Future wireless communication requires high data rates and a tremendous number of connections for emerging applications such as the Internet of things (IoT) [1, 2]. The access of massive numbers of IoT devices to the network will lead to tremendous growth in the data volume of mobile communication services, and wireless network capacity will face unprecedented challenges [3–5]. The main solutions to this challenge include the use of larger bandwidth, higher-order MIMO [6, 7], higher-order modulation, and more effective coding. Among these solutions, massive MIMO technology attains enhanced wireless communication capacity and spectral efficiency through deep utilization of the spatial dimensions and has become a key technology for 5G/B5G wireless communication [8]. With enormous system dimensions and the use of higher-order modulation, signal detection faces challenges in terms of computational burden and hardware implementation [9].

The traditional optimal signal detector, the maximum likelihood (ML) detector, suffers from computational complexity that grows exponentially in massive MIMO systems [10]. In contrast, linear detection algorithms such as minimum mean squared error (MMSE) and zero forcing (ZF) have reduced computational complexity. In particular, when the loading factor (the ratio of the number of single-antenna users to the number of base station antennas) is small, the MMSE and ZF linear detection algorithms can achieve near-optimal system performance. However, as a large number of IoT devices are connected to the cellular network, more users need to be served in a cell (e.g., mobile phones, unmanned aerial vehicles (UAVs) [11, 12], sensors [13], vehicle to vehicle (V2V) [14]). Unfortunately, as the number of users increases, the performance of the linear detection algorithms degrades severely. Besides the ML algorithm, other more complex detectors (e.g., belief propagation (BP) [15, 16], approximate message passing (AMP) [17]) can achieve excellent performance. However, because massive MIMO contains a large number of ring structures when the number of users increases, the iterative update in the BP algorithm converges more slowly, and the update formula during the iteration process becomes more complicated. To address the above issues, an EP algorithm is proposed [18, 19]. For arbitrary loading factors and high-order modulation, the EP algorithm not only shows significantly better performance than other algorithms such as BP, AMP, and MMSE, but also has great flexibility and strong robustness, which has attracted wide attention. However, each iteration of EP requires a full matrix inversion, whose complexity is cubic in the matrix dimension. This huge computational cost also makes EP difficult to implement in hardware.
To address this problem, the EP approximation (EPA) algorithm simplifies the update formula of the EP algorithm by reexpressing the moment matching condition, so that the number of matrix inversions in the EP algorithm is reduced to one [20]. The cost is that the EPA algorithm requires higher accuracy for this inversion; otherwise, the bit-error-rate (BER) performance suffers serious losses. Recently, methods based on matrix polynomial decomposition have been applied to the EPA algorithm (such as EPA-NSA and EPA-wNSA) [9, 20]. When the loading factor is small, the EPA-NSA algorithm shows good performance. As the loading factor increases, however, the EPA-NSA algorithm converges slowly or even fails to converge, which seriously degrades system performance. Although the EPA-wNSA algorithm improves on EPA-NSA, the improvement is very limited. In addition, although algorithms such as EPA-NSA and EPA-wNSA avoid direct matrix inversion and reduce some complexity, they still need to compute the Gram matrix, a costly operation, to obtain high-precision signal detection. Well-known linear iterative methods, such as Gauss-Seidel [21–23] and successive over-relaxation [24, 25], also require the Gram matrix and offer low parallelism. Therefore, they cannot be applied directly to alleviate the computational burden of EP iteration. To solve these issues, an improved algorithm, EPA-SORI, is proposed in this paper. The EPA-SORI algorithm introduces the SORI iterative algorithm into EPA to obtain a high-precision result for this single inversion, which ensures that the EPA algorithm has good BER performance. The contributions of this paper are as follows:
(1) We first derive a new expression of the SORI iterative algorithm in the real-valued system. Then, the SORI algorithm under the real-valued system is applied to the EPA algorithm. Furthermore, an LLR approximation is provided to enhance the accuracy of the EPA-SORI detector.
(2) According to random matrix theory, under the real-valued system, a promising initial solution and relaxation factor are used to further enhance the convergence rate and accuracy and thereby reduce the computational complexity. Furthermore, a theoretical analysis of the convergence speed of the proposed EPA-SORI algorithm is presented. The analysis proves that the convergence speed of the proposed EPA-SORI algorithm is significantly higher than that of the recently reported EPA-wNSA algorithm.
(3) In the iterative process, the proposed EPA-SORI algorithm not only requires no matrix inversion operations but also avoids direct calculation of the Gram matrix, which effectively reduces the complexity of the entire algorithm. At the same time, the proposed EPA-SORI algorithm is highly parallel and hardware-friendly.
(4) Simulation and complexity results show that the bit error rate performance of the EPA-SORI algorithm is much better than that of MMSE, and its complexity is much lower than that of MMSE. Compared with existing EPA algorithms (such as EPA-INSA and EPA-wNSA), the proposed EPA-SORI achieves performance close to Exact EP with lower complexity and a higher convergence rate. Furthermore, with a high modulation order and a large number of users, EPA-SORI shows a more obvious performance-complexity trade-off advantage.

1.1. Notation

Matrices and column vectors are represented by uppercase and lowercase boldface letters, respectively. The element in -th row and -th column of matrix is denoted by . , , , , and denote transpose, conjugate transpose, inversion, and determinant, respectively. Also, the probability distribution of is denoted by . The real part and imaginary part are denoted by and , respectively. represents the Gaussian probability distribution with mean and variance .

2. System Model

We consider a massive MU-MIMO system in which antennas are deployed at the base station to serve users. is a -dimensional transmission signal vector, where and are the symbol set of the QAM constellation (e.g., , etc.). We assume that is a flat Rayleigh fading channel. The received signal at the receiver can be modeled by the following:where is the additive white Gaussian noise (AWGN), and its elements satisfy a complex Gaussian distribution with mean 0 and variance . Here, the received signal can be reexpressed by the equivalent real-valued augmented vectors as follows:where , and represents the set of real and imaginary parts of the point set on the constellation in QAM modulation. , where satisfy a real Gaussian distribution with mean 0 and variance . Thus, (1) can be modeled by , where .
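The real-valued reformulation described above can be checked numerically. The following is a minimal sketch, assuming the usual stacking of real and imaginary parts; the function name `to_real_system`, the dimensions, and the noise level are illustrative assumptions and not taken from the paper.

```python
import numpy as np

def to_real_system(H, y):
    """Stack real and imaginary parts into an equivalent real-valued system."""
    H_r = np.block([[H.real, -H.imag],
                    [H.imag,  H.real]])
    y_r = np.concatenate([y.real, y.imag])
    return H_r, y_r

rng = np.random.default_rng(0)
n_rx, n_users = 8, 4  # hypothetical small dimensions for illustration

# Flat Rayleigh fading channel with unit-variance complex Gaussian entries.
H = (rng.standard_normal((n_rx, n_users))
     + 1j * rng.standard_normal((n_rx, n_users))) / np.sqrt(2)

# 16-QAM symbols: real and imaginary parts drawn from {-3, -1, 1, 3}.
pam = np.array([-3.0, -1.0, 1.0, 3.0])
x = rng.choice(pam, n_users) + 1j * rng.choice(pam, n_users)

noise = 0.1 * (rng.standard_normal(n_rx) + 1j * rng.standard_normal(n_rx))
y = H @ x + noise

H_r, y_r = to_real_system(H, y)
x_r = np.concatenate([x.real, x.imag])
n_r = np.concatenate([noise.real, noise.imag])

# The real-valued model must reproduce the complex model exactly.
residual = np.max(np.abs(y_r - (H_r @ x_r + n_r)))
```

The real-valued system has twice as many rows and columns as the complex one, and each entry of the real symbol vector takes values in the set of real and imaginary parts of the constellation points.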

3. EP and EPA Algorithms

The EP algorithm is an inference method based on Bayesian inference, which is used to estimate the value of under the condition that is known when a joint distribution is given [26–28]. If we use to estimate directly, the complexity will increase exponentially with the dimensions of and , which consumes huge hardware resources in massive MIMO scenarios. Therefore, an approximate distribution is used to estimate . In model , the joint posterior distribution of the and is . To facilitate the analysis, according to [7], when the transmitted symbols are independent of each other, the EP detector uses a nonstandard Gaussian distribution to replace the prior distribution of each transmitting antenna, that is, . And then, a posterior distribution whose distribution satisfies the exponential family approximate. can be expressed as follows:

The cumulative multiplication in (3) can be transformed into the following form:where is a real-valued column vector, and is a diagonal matrix. According to (4), the expression of mean and variance of is as follows:

Having constructed , it is necessary to use the moment matching technique so that and are as similar as possible. However, the complexity of is very high, and it is difficult to directly obtain and . Aiming at this problem, a sequential EP algorithm is proposed in [9, 20]. It is assumed that is the th edge of : and are adopted as the initial solutions of (7), where denotes the mean symbol energy. Then, the alternative distribution expression is expressed as follows:where is expressed as follows:

Then, the first-order moments and the second-order moments of can be obtained by the following:

Assuming that , are the mean and variance of , the values of and for iterations can be obtained as follows:

Next, the updated and can be obtained by matching the first-order and second-order moments of and the cavity marginal distribution . The calculation process of the EP algorithm is cumbersome, and each iteration involves the matrix inversion in equation (5), which makes signal detection difficult in massive MIMO scenarios. To simplify the calculation process and reduce computational complexity, the EP approximation algorithm EPA is proposed in [20]. The EPA algorithm reexpresses the moment matching conditions and no longer uses explicit matrix matching to calculate the results of each iteration. It simplifies the update formula in the iteration process of the EP algorithm and uses a fixed matrix to estimate the matrix before the iteration. Then, the approximate value is used to estimate the value of , which finally yields the approximate algorithm EPA, whose iteration process differs from that of the Exact EP. The EPA algorithm reduces the multiple matrix inversions required by the EP algorithm to a single one, which effectively avoids the impact of repeated inaccurate approximate matrix inversions. The cost is that the EPA algorithm requires a complete and high-precision matrix inversion to obtain the initial value of the iteration. The implementation of EPA is detailed in Algorithm 1.

(1) Input: , , , , , .
(2) Output: .
(3) .
(4) .
(5) .
(6) .
(7) .
(8) for do
(9) ;
(10) end for
(11) repeat
(12) for do
(13) ;
(14) end for
(15) .
(16) .
(17) .
(18) , the damping factor .
(19) ;
(20) until convergence or .
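Two building blocks of the loop above are the moment computation over the constellation and the damped update. The sketch below illustrates both in the generic form used by EP detectors: the moments of a Gaussian cavity distribution restricted to the constellation points, and a convex-combination damping step. It is an illustration under these assumptions, not a reproduction of the exact EPA update formulas of Algorithm 1; the names `tilted_moments` and `damp` and the variance floor are hypothetical.

```python
import numpy as np

def tilted_moments(t, h2, constellation):
    """Mean and variance of p(x) proportional to N(x; t, h2) on the constellation.

    t, h2: cavity means and variances, one entry per real-valued symbol.
    """
    # Log-weights of every constellation point, one row per symbol.
    logw = -(constellation[None, :] - t[:, None]) ** 2 / (2.0 * h2[:, None])
    logw -= logw.max(axis=1, keepdims=True)  # stabilize the exponentials
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)
    mean = w @ constellation
    var = w @ constellation ** 2 - mean ** 2
    # Variance floor: a hypothetical numerical guard, not taken from the paper.
    return mean, np.maximum(var, 1e-8)

def damp(new, old, beta):
    """Damped update: move a fraction beta from the old value to the new one."""
    return beta * new + (1.0 - beta) * old

pam = np.array([-3.0, -1.0, 1.0, 3.0])  # real part of 16-QAM
t = np.array([0.0, 0.9])                 # hypothetical cavity means
h2 = np.array([1.0, 1e-3])               # hypothetical cavity variances
mean, var = tilted_moments(t, h2, pam)
```

With a symmetric cavity (mean 0) the posterior mean is 0, and with a very small cavity variance the posterior mean collapses onto the nearest constellation point, which is the behavior the moment matching step relies on.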

4. The Proposed EPA-SORI Detector

The MMSE solution is usually used as the iterative initial solution of EP detection [20], so we also use the MMSE solution as the iterative initial solution of the proposed EPA-SORI algorithm. Define , and the initial iteration can be obtained by the following:

From (14), the initial iteration solution problem can be treated as a solution of the equation: . Note that the result of this equation is required to be highly accurate; otherwise, the bit error rate performance will suffer serious degradation, so we propose to apply SORI iteration to obtain a high-precision iterative initial solution. In the complex value system, the corresponding SORI algorithm can be constructed as follows:where , denotes the estimated value of , and is the relaxation factor, , and are the maximum and minimum eigenvalues of , respectively, where . Note that the EPA algorithm is implemented in a real-valued system; thus, we need to apply the SORI algorithm to a real-valued system. Here, the SORI method can be reexpressed as follows:where , denotes the estimated value of , and denotes the relaxation factor, , and and are the maximum eigenvalue and minimum eigenvalue of , respectively. The relationship between and satisfies: . In recently reported literature [29], the SORI algorithm is mainly applied to signal detection under complex-valued systems and cannot be directly applied to real-valued systems. Thus, it is necessary to further deduce the maximum and minimum eigenvalues of . As long as the eigenvalues of can be obtained, the eigenvalues of can be obtained.

Lemma 1. In the real-valued system, the maximum and minimum eigenvalues of are as follows:where and denote the number of columns and rows.

Proof. Assuming that one of the eigenvalues of is , and one of the eigenvalues of is . Based on the properties of eigenvalues, we have the following:Here, and can be expressed as follows:Substitute (20) and (21) into (18) and (19), and we have the following:Simplify (23), and we have the following:The comparison between (24) and (22) shows that the eigenvalues of the two are the same, i.e., . In addition, we can get the same conclusion by using random matrix theory [30, 31]. According to random matrix theory [30], in a real-valued system, . Thus, . Hence, we have and . Thus, Lemma 1 is proved.
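The eigenvalue equivalence used in the proof can be verified numerically. The sketch below, written under the assumption of an i.i.d. Rayleigh channel with unit-variance entries, checks that the Gram matrix of the real-valued system has the same eigenvalues as the complex Gram matrix, each appearing twice. The edge estimates at the end are the Marchenko-Pastur-type approximations commonly used for such channels; they are an assumption here, since the lemma's expressions are not reproduced in the text, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n_rx, n_users = 64, 16  # hypothetical dimensions

H = (rng.standard_normal((n_rx, n_users))
     + 1j * rng.standard_normal((n_rx, n_users))) / np.sqrt(2)
H_r = np.block([[H.real, -H.imag],
                [H.imag,  H.real]])

# eigvalsh returns eigenvalues of a symmetric/Hermitian matrix in ascending order.
eig_c = np.linalg.eigvalsh(H.conj().T @ H)  # n_users values
eig_r = np.linalg.eigvalsh(H_r.T @ H_r)     # 2 * n_users values

# Each eigenvalue of the complex Gram matrix appears twice in the real one.
max_gap = np.max(np.abs(eig_r - np.repeat(eig_c, 2)))

# Asymptotic edge estimates (assumed form, for unit-variance channel entries).
ratio = n_users / n_rx
lam_max_est = n_rx * (1.0 + np.sqrt(ratio)) ** 2
lam_min_est = n_rx * (1.0 - np.sqrt(ratio)) ** 2
```

Because the extreme eigenvalues coincide, any relaxation factor computed from them in the complex-valued system carries over unchanged to the real-valued one, provided the same matrix is represented.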
The next step is to derive the relaxation factor in the real-valued system. According to [29], , where denotes the spectral radius of , and . Given that and are the maximum and minimum eigenvalues of , , thus can be obtained by the following formula:SinceThus, we have the following:It can be seen that is only related to and .
According to the typical properties of the massive MIMO channel, satisfies asymptotic orthogonality, and the Gram matrix is diagonally dominant. From (20) and (21), we have , thus, the matrix is also diagonally dominant. Therefore, could satisfy the approximation as follows:It can be seen from (29) that, when , can be used to approximate , so the SORI algorithm can choose the initial iteration solution of , . The SORI algorithm is summarized in Algorithm 2.

(1) Input: , , , ,.
(2) Output: .
(3)
(4) .
(5) .
(6) .
(7) , .
(8) for do
(9) ;
(10) .
(11) end for
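The structure of Algorithm 2 (a relaxation factor derived from extreme eigenvalue estimates, a diagonal-based initial solution, and iterations that use only matrix-vector products) can be illustrated with a simpler stand-in. The sketch below uses a plain first-order Richardson iteration, not the SORI update itself, whose formulas are not reproduced in the text. The eigenvalue estimates, the initial solution, and all names are assumptions made for illustration.

```python
import numpy as np

def richardson_mmse(H_r, y_r, sigma2, omega, n_iter, x0=None):
    """Approximate the solution of (H^T H + sigma2 I) x = H^T y.

    The Gram matrix H^T H is never formed: each iteration applies H and H^T
    to a vector in turn.
    """
    b = H_r.T @ y_r
    x = np.zeros(H_r.shape[1]) if x0 is None else x0.copy()
    for _ in range(n_iter):
        Ax = H_r.T @ (H_r @ x) + sigma2 * x  # two matrix-vector products
        x = x + omega * (b - Ax)
    return x

rng = np.random.default_rng(2)
n_rx, n_users, sigma2 = 128, 16, 0.5  # hypothetical configuration

H = (rng.standard_normal((n_rx, n_users))
     + 1j * rng.standard_normal((n_rx, n_users))) / np.sqrt(2)
H_r = np.block([[H.real, -H.imag], [H.imag, H.real]])
pam = np.array([-3.0, -1.0, 1.0, 3.0])
x_true = rng.choice(pam, 2 * n_users)
y_r = H_r @ x_true + np.sqrt(sigma2) * rng.standard_normal(2 * n_rx)

# Relaxation factor from asymptotic eigenvalue estimates (assumed form).
ratio = n_users / n_rx
lam_max = n_rx * (1.0 + np.sqrt(ratio)) ** 2 + sigma2
lam_min = n_rx * (1.0 - np.sqrt(ratio)) ** 2 + sigma2
omega = 2.0 / (lam_max + lam_min)

# Diagonal-based initial solution: the Gram diagonal is close to n_rx here.
x0 = (H_r.T @ y_r) / n_rx

x_hat = richardson_mmse(H_r, y_r, sigma2, omega, n_iter=60, x0=x0)

# Direct solve, used only to check the iterative result.
A = H_r.T @ H_r + sigma2 * np.eye(2 * n_users)
x_ref = np.linalg.solve(A, H_r.T @ y_r)
rel_err = np.linalg.norm(x_hat - x_ref) / np.linalg.norm(x_ref)
```

The relaxation factor depends only on the system dimensions and the noise variance, so it can be computed once before the iterations start.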
In the proposed EPA-SORI algorithm, and are used as the initial iteration solutions to further improve the convergence speed. Since is the diagonal matrix of the matrix, the complexity will not increase. The EPA-SORI algorithm is summarized in Algorithm 3, and an intuitive diagram of the processing of the EPA-SORI algorithm is presented in Figure 1.
(1) Input: , , , , , , .
(2) Output: .
(3)
(4) .
(5) .
(6) .
(7) .
(8)
(9)
(10) , .
(11) for do
(12) ;
(13) .
(14) end for
(15)
(16) for do
(17)
(18) end for
(19) repeat
(20) for do
(21)
(22) end for
(23)
(24)
(25)
(26) , the damping factor .
(27) .
(28) until convergence or .

5. Convergence Performance Analysis

This section mainly analyzes the convergence performance of the EPA-SORI algorithm and compares it with that of the EPA-wNSA algorithm. Note that the convergence performance of the EPA-SORI and EPA-wNSA algorithms mainly depends on the convergence performance of the initialization part (i.e., SORI and wNSA).

Lemma 2. The iterative spectral radius of the proposed EPA-SORI and the EPA-wNSA satisfy the following relationship:

Proof. First, is denoted as the SORI estimation error after iterations:From (16), the relationship between and can be obtained by the following:Thus, denotes the iterative matrix of the SORI iterative algorithm which can be expressed as follows [29]:Assuming that and are the eigenvalues of and , respectively. and are the maximum and minimum eigenvalues of , respectively. At the same time, and are the maximum and minimum eigenvalues of , respectively. Thus, we have and . Then, after substituting (33) into , we have . Thus, compared with , it can be derived as follows:Next, by substituting into (34), the maximum eigenvalue of can be obtained by , where . And since , can be calculated as follows:Based on the above derivation, we have . And after simplification, the spectrum radius of can be calculated by the following:Assuming that denotes the iterative matrix of the wNSA iterative algorithm in a real-valued system, which is given by the following:where denotes the weight factor of EPA-wNSA, and satisfies the following approximation:Thus, we have the following:(39) can be further simplified by substituting the minimum eigenvalue of . Furthermore, can be neglected since it is much smaller than . Thus, we have the following:Comparing the spectrum radius and , we have the following:where ; hence . Thus, Lemma 2 is proved. The convergence rate of the iterative algorithm is closely related to the spectral radius of the iterative matrix G, i.e., . Hence, the smaller the spectral radius of the iteration matrix, the greater the convergence speed. From (30), has a smaller spectrum radius than ; it means that the proposed EPA-SORI algorithm exhibits favorable convergence performance.
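The link between the spectral radius of an iteration matrix and the convergence speed can be made concrete with a small numerical experiment. The sketch below compares two simple stand-ins: a first-order Richardson iteration with the optimal scalar relaxation factor, and a plain Neumann-series (diagonally preconditioned) iteration. These are not the SORI and wNSA iteration matrices of Lemma 2, whose expressions are not reproduced in the text; the configuration and all names are illustrative assumptions.

```python
import numpy as np

def spectral_radius(G):
    """Largest eigenvalue magnitude of a square matrix."""
    return np.max(np.abs(np.linalg.eigvals(G)))

def iterations_needed(rho, tol):
    """Smallest k such that rho**k <= tol, for 0 < rho < 1."""
    return int(np.ceil(np.log(tol) / np.log(rho)))

rng = np.random.default_rng(3)
n_rx, n_users, sigma2 = 128, 16, 0.5  # hypothetical configuration

H = (rng.standard_normal((n_rx, n_users))
     + 1j * rng.standard_normal((n_rx, n_users))) / np.sqrt(2)
H_r = np.block([[H.real, -H.imag], [H.imag, H.real]])

# The matrix is formed explicitly here only for the purpose of analysis.
A = H_r.T @ H_r + sigma2 * np.eye(2 * n_users)
I = np.eye(A.shape[0])

# Richardson iteration matrix with the optimal scalar relaxation factor.
lam = np.linalg.eigvalsh(A)  # ascending order
omega = 2.0 / (lam[-1] + lam[0])
rho_rich = spectral_radius(I - omega * A)

# Neumann-series iteration matrix I - D^{-1} A, with D the diagonal of A.
rho_nsa = spectral_radius(I - A / np.diag(A)[:, None])
```

A smaller spectral radius translates directly into fewer iterations: for example, `iterations_needed(0.5, 1e-3)` returns 10, while a radius closer to 1 requires many more iterations for the same tolerance.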

6. Computational Complexity Analysis

In this section, the computational complexity is given by the number of real-valued multiplications (RMULs). As shown in Figure 1, the entire calculation process is divided into two parts: the initialization process and the iterative processes of the EPA-SORI algorithm.

First, in the initialization process, the numbers of RMULs required by and are and , respectively. Since is a diagonal matrix, its complexity is . Note that, compared with matrix multiplication, the complexity of , and can be ignored. Thus, the initial solution does not need to be calculated, and the number of RMULs involved in calculating is . The next step is the SORI iteration part of the initialization process. The numbers of RMULs of and for each iteration are approximately and , respectively. Therefore, the RMUL complexity involved in calculating the SORI iteration is , where denotes the number of iterations of SORI. The final step of this process is the initialization of , whose computational complexity is .

Secondly, the RMULs complexity involved in calculating the EPA iteration is , where denotes the number of EPA iterations.

Finally, the overall complexity of the proposed EPA-SORI algorithm is approximately as follows:

Table 1 shows the computational complexity of EPA-SORI algorithm, MMSE [10], EPA-INSA [9], EPA-wNSA [32], and the Exact EP [18] algorithm. From Table 1, note that Gram matrix calculations with a complexity of up to cannot be avoided in many of the reported methods, such as MMSE, EPA-wNSA, and the Exact EP. Additionally, MMSE and the Exact EP also involve the inversion of a matrix with a complexity of up to . In contrast, the proposed EPA-SORI algorithm only involves operations with complexity of about . This is because in the SORI algorithm, direct calculation of the Gram matrix is avoided by splitting calculation. For example, to calculate , we first calculate , and then multiply the result with . Here, we have a much lower calculation complexity, which is . Compared with these methods of calculating Gram matrix first and then multiplying with vector, the complexity, in this case, is greatly reduced. Since and are much smaller than and for massive MIMO systems, the complexity of the EPA-SORI algorithm is much lower than that of other algorithms.
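The saving from the split calculation can be illustrated with a simple multiplication count. The sketch below compares forming the Gram matrix once and then multiplying it by a vector in every iteration against applying the two matrix-vector products in turn. The counts are straightforward textbook counts for dense real matrices, ignoring symmetry, and are not the entries of Table 1; the function names and dimensions are hypothetical.

```python
import numpy as np

def rmuls_gram_then_matvec(m, n, n_iter):
    """Form the n x n Gram matrix of an m x n matrix, then one matvec per iteration."""
    return n * n * m + n_iter * n * n

def rmuls_split_matvec(m, n, n_iter):
    """Per iteration: one m x n matvec followed by one n x m matvec."""
    return n_iter * 2 * m * n

m, n, n_iter = 256, 32, 4  # hypothetical real-valued dimensions and iterations

cost_gram = rmuls_gram_then_matvec(m, n, n_iter)
cost_split = rmuls_split_matvec(m, n, n_iter)

# Both orderings give the same vector up to rounding.
rng = np.random.default_rng(4)
H_r = rng.standard_normal((m, n))
x = rng.standard_normal(n)
same_result = np.allclose((H_r.T @ H_r) @ x, H_r.T @ (H_r @ x))
```

For these dimensions the split form needs 65536 multiplications against 266240 for the Gram-based form. The advantage holds as long as the number of iterations stays small relative to the matrix dimensions, which is the regime the text describes.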

7. Simulation Result

In this section, the BER performance results of the EPA-SORI algorithm are presented and compared with the EPA-wNSA [32], MMSE [10], EPA-INSA [9], and Exact EP [18] algorithms. To fully demonstrate and verify the BER performance of the proposed EPA-SORI, simulations are performed under different modulation methods (i.e., 16/64/256QAM) and different loading factors (i.e., ). For the damping factor , we set in the EPA-wNSA and EPA-SORI algorithms according to [20]. We assume that the base station is able to obtain perfect channel state information (CSI) and to complete signal detection based on the obtained CSI.

To further illustrate the convergence performance of EPA-SORI, the error-vector magnitude (EVM), which is defined as [33], is considered in Figure 2. As presented in Figure 2, EPA-INSA diverges for three different modulation methods when , . The convergence performance of EPA-wNSA can be improved by the optimal choice of the weight factor, but the degree of improvement is very limited [32]. In contrast, the proposed EPA-SORI algorithm converges quickly to the accurate Exact EP result with markedly fewer iterations. As the modulation order increases, the benefit brought by EPA-SORI becomes more obvious. This verifies that the advantage of the proposed algorithm lies in its fast convergence.

As shown in Figures 3-5, when the Massive MIMO system is configured as , , or , the loading factor is 0.5 and 0.25. In each system configuration, we consider three modulation methods: 16QAM, 64QAM, and 256QAM, and different algorithms are compared and analyzed. In Figure 4, at BER with , for 256-QAM, EPA-SORI has a better performance than that of MMSE . And in Figure 3, at BER with , for 256-QAM, EPA-SORI outperforms MMSE . It can be seen that the BER of EPA-SORI algorithm is obviously superior to that of the MMSE, and as the loading factor increases, the advantage will grow further. In Figures 4 and 5, EPA-INSA has a good performance when , but does not converge when . Meanwhile, choosing the weight factor of EPA-wNSA helps improve the convergence performance of EPA-wNSA but makes it difficult for further improvement, especially at a low . In contrast, EPA-SORI uses SORI iteration to solve the only one-time matrix inversion in EPA, which enables it to fast converge to the accurate Exact EP algorithm with low complexity. For example, in Figure 4, at BER with for 64-QAM, EPA-SORI outperforms EPA-wNSA 0.3 dB, and for 256-QAM, EPA-SORI outperforms EPA-wNSA . In Figure 5, at BER with , for 64-QAM, with only 8 iterations used by EPA-SORI , its performance is outperforming that of EPA-wNSA algorithm 2 dB which requires 21 iterations 0.2 dB. And for 256-QAM, the advantage will grow further. In other words, the performance of the EPA-SORI algorithm is always superior to that of EPA-wNSA under the same Massive MIMO system configuration.

Furthermore, a clear overview of the performance-complexity trade-off under different modulation methods and system configurations is provided in Figure 6. From Figure 6, the EPA-SORI algorithm achieves not only significantly better performance than MMSE with a significantly lower computational complexity, but also almost the same performance as the Exact EP. For example, in Figures 6(b) and 6(c), at BER with , for 64-QAM and 256-QAM, EPA-SORI outperforms MMSE and , respectively, which is very close to that of Exact EP. However, EPA-SORI only consumes 67% and 63% of the computational cost of MMSE and Exact EP, respectively. It is also observed from Figure 6 that EPA-wNSA can improve the BER performance by increasing the number of iterations, at the cost of a substantial increase in computational complexity. In contrast, the EPA-SORI algorithm can achieve BER performance close to Exact EP with fewer iterations, and its complexity is much lower than that of the Exact EP and EPA-wNSA. For example, in Figure 6(b), at BER with , for 64-QAM, by increasing the number of iterations , the performance of EPA-wNSA is increased by 0.9 dB compared with EPA-wNSA , but the complexity is increased by compared with EPA-wNSA . However, EPA-SORI can achieve a performance close to the Exact EP when , but it only consumes of complexity of EPA-wNSA . Next, we compare the EPA-SORI, EPA-wNSA, and Exact EP algorithms in Figures 6(a) and 6(d). In Figure 6(a), at BER with , for 256-QAM, EPA-SORI outperforms EPA-wNSA by 2.3 dB, and its complexity is 65% of that of EPA-wNSA and 52% of that of the Exact EP. In addition, in Figure 6(d), at BER with , for 256-QAM, EPA-SORI outperforms EPA-wNSA by 2 dB. At the same time, it only consumes 42% of the computational cost of EPA-wNSA and of that of Exact EP.
In summary, compared with MMSE, Exact EP and the recently reported EPA-wNSA algorithms, the proposed EPA-SORI algorithm has a better performance-complexity trade-off advantage, which is more obvious in scenarios with high modulation order and a large number of users.

8. Conclusion

In this paper, we propose a novel data-detection scheme, EPA-SORI detector, which can achieve the same BER performance as the Exact EP algorithm with significantly lower complexity in various massive MIMO system configurations. At the same time, compared with MMSE and the existing EPA algorithms, the proposed EPA-SORI algorithm has a better performance-complexity trade-off advantage, which is more obvious in scenarios with high modulation order and a large number of users. The proposed algorithm avoids the direct calculation of the Gram matrix. At the same time, several effective techniques (i.e., the iteration initial solution and the optimal relaxation factor) are adopted to further enhance the convergence rate and accuracy.

The proposed design has many potential applications in future work. It can be extended to other more complex scenarios, such as decentralized architectures [34–36]. It can also be combined with deep learning methods to further improve performance [37, 38]. Finally, we will investigate the proposed design under more realistic channel scenarios.

Data Availability

The data used to support the findings of this study are included within the article. No other data were used beyond those in this article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the Natural Science Foundation of Hainan Province under Grants 2019RC130 and 620QN238, in part by the National Natural Science Foundation of China under Grant 61771066, and in part by the Scientific Research Fund Project of Hainan University under Grants KYQD(ZR)-1999, KYQD(ZR)-21007, and KYQD(ZR)-21008.